Modeling Session Length

From CSL Wiki


Exponential and Mixture of Exponentials

In this section, I fit several exponential models to the session lengths of each user.

Single Exponential

Entire data

User 27
rate: 1.019121e-05

Two-sample Kolmogorov-Smirnov test: D = 0.1308, p-value < 2.2e-16
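The single-exponential fit can be sketched as follows. This is a minimal illustration, not the procedure used for the numbers above: it assumes the rate is the maximum-likelihood estimate (one over the sample mean) and, for brevity, computes a one-sample KS distance against the fitted CDF rather than the two-sample test reported here. The `sessions` values are hypothetical.

```python
import math


def fit_exponential_rate(samples):
    """Maximum-likelihood rate of an exponential: 1 / sample mean."""
    return len(samples) / sum(samples)


def ks_statistic_exponential(samples, rate):
    """One-sample KS distance between the empirical CDF and 1 - exp(-rate * x)."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-rate * x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, cdf - i / n, (i + 1) / n - cdf)
    return d


# Hypothetical session lengths, not taken from the data set.
sessions = [1200.0, 5400.0, 30000.0, 98000.0, 250000.0]
rate = fit_exponential_rate(sessions)
d_stat = ks_statistic_exponential(sessions, rate)
print(rate, d_stat)
```

A large D together with a very small p-value, as reported above, indicates that the fitted exponential is rejected as a description of the data.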

Truncated Data

User 27: values less than the mean session length
rate:  2.647844e-05

Two-sample Kolmogorov-Smirnov test: D = 0.1441, p-value < 2.2e-16

Two Exponentials

Entire data

User 27:
lambda1 = 0.8352734
lambda2 = 0.1647266 (1-lambda1)

theta1 = 115838.0
theta2 = 8301.25

Two-sample Kolmogorov-Smirnov test: D = 0.0842, p-value < 2.2e-16
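The reported values are consistent with lambda being a mixing weight and theta a component mean: 0.8352734 * 115838.0 + 0.1647266 * 8301.25 is about 98124, which matches 1 / 1.019121e-05 from the single-exponential fit. The page does not say how the parameters were estimated; the sketch below assumes an EM (expectation-maximization) fit, with hypothetical starting values and synthetic data.

```python
import math
import random


def em_two_exponentials(samples, iters=200):
    """EM for the density lam * Exp(mean=theta1) + (1 - lam) * Exp(mean=theta2)."""
    n = len(samples)
    mean = sum(samples) / n
    # Hypothetical starting point: one slow and one fast component.
    lam, theta1, theta2 = 0.5, 2.0 * mean, 0.5 * mean
    for _ in range(iters):
        # E-step: probability that each sample came from component 1.
        resp = []
        for x in samples:
            p1 = lam * math.exp(-x / theta1) / theta1
            p2 = (1.0 - lam) * math.exp(-x / theta2) / theta2
            resp.append(p1 / (p1 + p2))
        # M-step: weighted proportion and weighted means.
        w1 = sum(resp)
        lam = w1 / n
        theta1 = sum(r * x for r, x in zip(resp, samples)) / w1
        theta2 = sum((1.0 - r) * x for r, x in zip(resp, samples)) / (n - w1)
    return lam, theta1, theta2


# Synthetic data for illustration only (80% long sessions, 20% short ones).
rng = random.Random(0)
data = [rng.expovariate(1.0 / 100000.0) if rng.random() < 0.8
        else rng.expovariate(1.0 / 8000.0) for _ in range(2000)]
lam, theta1, theta2 = em_two_exponentials(data)
print(lam, theta1, theta2)
```

After every M-step the mixture mean lam * theta1 + (1 - lam) * theta2 equals the sample mean, which is a convenient sanity check on any fitted parameters.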

Truncated data

All values larger than the mean session length were filtered out.

User 27:
lambda1 = 9.999998e-01
lambda2 = 2.184364e-07 (1-lambda1)

theta1 = 3.776658e+04
theta2 = 5.883020e+03

Two-sample Kolmogorov-Smirnov test: D = 0.1569, p-value < 2.2e-16

Three Exponentials

Skewed Normal

Does not seem to fit well.

Pareto Distribution

Does not seem to fit well.

Weibull Distribution

Entire Data

User 27

      shape          scale    
  8.031812e-01   8.545603e+04

Two-sample Kolmogorov-Smirnov test: D = 0.0862, p-value < 2.2e-16
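A Weibull fit of this kind can be sketched with a maximum-likelihood estimate. This is an assumption about the method, not a record of how the values above were obtained. The shape k solves sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x) = 0, found here by bisection, and the scale then follows in closed form. All samples must be strictly positive; the bracket `lo`, `hi` and the synthetic data are hypothetical.

```python
import math
import random


def fit_weibull(samples, lo=0.01, hi=50.0, tol=1e-10):
    """Maximum-likelihood Weibull (shape, scale) for strictly positive samples."""
    m = max(samples)
    z = [x / m for x in samples]  # rescaling leaves the shape equation unchanged
    logs = [math.log(v) for v in z]
    mean_log = sum(logs) / len(z)

    def score(k):
        powers = [v ** k for v in z]
        weighted = sum(p * l for p, l in zip(powers, logs)) / sum(powers)
        return weighted - 1.0 / k - mean_log

    # score(k) is increasing in k, so bisection finds its single root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = m * (sum(v ** shape for v in z) / len(z)) ** (1.0 / shape)
    return shape, scale


# Synthetic data for illustration only: scale 85000, shape 0.8.
rng = random.Random(1)
data = [rng.weibullvariate(85000.0, 0.8) for _ in range(5000)]
shape, scale = fit_weibull(data)
print(shape, scale)
```

A shape below 1, as for User 27, means the density is heavier near zero and in the tail than an exponential with the same mean.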

All Users

Zoomed-in versions:


Mixture of two Weibull distributions


Zoomed-in


Mixture of shifted Weibull distributions

The shift value for each user is inferred "visually" from the histogram of session lengths: the location of the second significant peak is used.
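The shifts on this page were read off the histograms by eye. As a rough automated stand-in, the sketch below bins the data, keeps local maxima whose count is at least a fraction of the tallest bin, and returns the left edge of the second such peak. The function name, `bin_width`, and `min_fraction` are all hypothetical choices, and the sample values are made up.

```python
def second_peak_location(samples, bin_width, min_fraction=0.2):
    """Left edge of the second significant histogram peak, or None if absent."""
    counts = {}
    for x in samples:
        b = int(x // bin_width)
        counts[b] = counts.get(b, 0) + 1
    top = max(counts.values())
    peaks = []
    for b in sorted(counts):
        c = counts[b]
        if c < min_fraction * top:
            continue  # too small to count as a significant peak
        # Local maximum: higher than the left bin, not lower than the right bin.
        if c > counts.get(b - 1, 0) and c >= counts.get(b + 1, 0):
            peaks.append(b)
    if len(peaks) < 2:
        return None
    return peaks[1] * bin_width


# Made-up data with one peak near zero and a second peak just above 60000.
sessions = [500.0] * 10 + [12000.0] * 3 + [25000.0] + [61000.0] * 6 + [72000.0] * 2
shift = second_peak_location(sessions, bin_width=10000)
print(shift)
```

The result depends strongly on the bin width and threshold, so it is best treated as a starting guess to confirm against the histogram.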

Here are the inferred parameters:

User   Shift  lambda        1-lambda      shape1        scale1        shape2        scale2
   4   60000  9.363492e-01  6.365080e-02  5.622782e-01  7.590271e+04  1.000001e+00  1.276356e+04
   7   60000  7.692202e-01  2.307798e-01  8.260485e-01  9.728153e+03  6.861826e-01  1.448050e+05
   8   60000  9.193318e-01  8.066816e-02  6.285758e-01  6.406651e+04  9.145829e-01  1.881905e+04
  10   60000  8.440280e-01  1.559720e-01  6.486115e-01  1.292032e+05  7.390162e-01  2.243549e+04
  12   60000  9.430498e-01  5.695024e-02  5.211197e-01  4.489515e+04  1.102138e+00  3.604611e+03
  13   30000  9.468530e-01  5.314696e-02  6.339674e-01  3.261529e+04  2.455253e+00  3.927456e+03
  14  120000  9.518266e-01  4.817343e-02  5.621301e-01  7.212900e+04  8.075629e-01  6.085530e+03
  15   60000  8.531923e-01  1.468077e-01  5.599087e-01  4.751162e+04  7.198283e-01  2.612631e+04
  16   60000  9.713030e-01  2.869700e-02  5.168969e-01  6.593559e+04  5.715084e-01  1.483467e+03
  17  120000  8.788666e-01  1.211334e-01  4.760974e-01  8.307615e+04  1.000000e+00  1.613652e+04
  18   60000  7.370529e-01  2.629471e-01  6.409044e-01  1.718583e+05  7.697831e-01  3.701132e+04
  19   60000  7.960501e-01  2.039499e-01  6.062555e-01  9.332721e+04  8.012496e-01  3.442842e+04
  20   60000  9.928491e-01  7.150866e-03  4.878766e-01  5.428988e+04  1.993046e+00  1.613936e+03
  21   30000  7.390277e-01  2.609723e-01  6.880169e-01  8.210198e+04  9.053757e-01  2.406844e+04
  23   60000  8.951718e-01  1.048282e-01  6.861414e-01  6.507520e+04  8.696657e-01  1.141436e+04
  27   60000  8.703499e-01  1.296501e-01  7.469908e-01  8.286917e+04  8.328785e-01  1.803125e+04
  31   15000  8.931465e-01  1.068535e-01  7.600937e-01  8.524262e+04  6.292988e+00  5.484129e+04
  32   30000  9.037864e-01  9.621363e-02  7.653899e-01  8.558759e+04  7.015634e+00  3.629068e+04


Zoomed version

Without filtering large values

Estimated parameters:

User   Shift  lambda        1-lambda      shape1        scale1        shape2        scale2
   4   60000  9.253915e-01  7.460852e-02  5.320434e-01  7.631602e+04  1.000000e+00  1.574404e+04
   7   60000  9.930031e-01  6.996891e-03  4.332226e-01  3.423495e+04  1.076783e+00  1.278752e+03
   8   60000  8.517060e-01  1.482940e-01  4.340177e-01  7.624585e+04  9.104787e-01  3.079966e+04
  10   60000  8.062710e-01  1.937290e-01  5.650638e-01  1.412031e+05  7.556074e-01  2.887774e+04
  12   60000  9.054261e-01  9.457387e-02  4.145306e-01  4.493060e+04  7.515960e-01  1.118765e+04
  13   30000  8.448721e-01  1.551279e-01  3.687256e-01  4.503278e+04  1.010531e+00  1.591010e+04
  14  120000  9.147960e-01  8.520401e-02  3.428995e-01  1.274641e+05  6.780854e-01  2.077731e+04
  15   60000  8.351915e-01  1.648085e-01  4.733699e-01  5.075704e+04  7.479246e-01  2.682107e+04
  16   60000  9.632318e-01  3.676820e-02  4.420814e-01  7.603740e+04  5.363905e-01  2.939823e+03
  17  120000  8.699364e-01  1.300636e-01  4.155831e-01  9.455886e+04  1.000000e+00  1.835415e+04
  18   60000  5.577954e-01  4.422046e-01  4.034967e-01  2.216034e+05  7.204537e-01  7.790729e+04
  19   60000  7.166221e-01  2.833779e-01  4.481462e-01  1.100888e+05  8.201137e-01  4.604754e+04
  20   60000  9.911703e-01  8.829726e-03  4.143966e-01  6.031742e+04  1.798530e+00  1.764013e+03
  21   30000  6.803063e-01  3.196937e-01  6.031880e-01  8.076950e+04  8.888908e-01  2.966137e+04
  23   60000  8.359453e-01  1.640547e-01  5.173480e-01  6.423631e+04  8.192808e-01  2.053605e+04
  27   60000  8.188268e-01  1.811732e-01  6.431339e-01  8.161202e+04  8.081415e-01  2.755969e+04
  31   15000  9.568729e-01  4.312708e-02  4.812504e-01  7.892411e+04  2.239252e+00  5.951436e+03
  32   30000  9.127021e-01  8.729786e-02  5.542861e-01  2.750267e+04  8.833102e-01  6.074153e+03


Zoomed versions:

Without filtering large values

Sampling from each user's data with prob=0.8
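One way to read "prob=0.8" is that each session is kept independently with probability 0.8. That reading is an assumption (the alternative would be drawing a fixed 80% of the sessions), and the function name, seed, and example data below are hypothetical.

```python
import random


def subsample_by_user(sessions_by_user, prob=0.8, seed=0):
    """Keep each session independently with probability `prob`, per user."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return {user: [x for x in xs if rng.random() < prob]
            for user, xs in sessions_by_user.items()}


# Made-up session lengths keyed by user id.
sessions_by_user = {
    27: [float(i) for i in range(10000)],
    31: [float(i) for i in range(500)],
}
sampled = subsample_by_user(sessions_by_user)
print({user: len(xs) for user, xs in sampled.items()})
```

Refitting on several such subsamples with different seeds gives a rough sense of how stable the estimated parameters are.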


Zoomed plots:


WM data
