Research:Reading time/Distribution fitting plots

Revision as of 00:49, 6 August 2019 by imported>Groceryheist (Created page with "==== Distribution fitting plots ==== To further explore how well these distributions fit the data, we present a series of diagnostic plots that compare the empirical distribu...")

Distribution fitting plots

To further explore how well these distributions fit the data, we present a series of diagnostic plots that compare the empirical distribution of the data with the model-predicted distributions. For each of the four models under consideration (Lomax, log-normal, exponentiated Weibull, Weibull), we present a density plot, a distribution plot, and a quantile-quantile plot (Q-Q plot). The density plots compare the probability density function of the estimated parametric model to the normalized histogram of the data. Similarly, the distribution plots compare the estimated cumulative distribution function to the empirical cumulative distribution function. The Q-Q plots show the values of the quantile function for the data on the x-axis and for the estimated model on the y-axis. These plots can help us diagnose the ways in which the data diverge from each of the models. We present the x-axis of all these plots on a logarithmic scale to improve the visibility of the data.
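As a rough illustration of what goes into each of the three plot types, the sketch below computes the plotted values for a single model. It is not the code used for this analysis. It assumes a log-normal model fit by maximum likelihood, uses synthetic data in place of the reading-time data, and relies only on the Python standard library. The function names (`fit_lognormal`, `density_points`, `distribution_points`, `qq_points`) are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def fit_lognormal(data):
    """Log-normal MLE: the mean and population std. dev. of the log-values."""
    logs = [math.log(x) for x in data]
    return NormalDist(fmean(logs), pstdev(logs))


def density_points(data, log_model, bins=20):
    """Density plot: normalized histogram on log-spaced bins vs. model pdf."""
    lo, hi = math.log(min(data)), math.log(max(data))
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        i = min(int((math.log(x) - lo) / width), bins - 1)
        counts[i] += 1
    points = []
    for i, count in enumerate(counts):
        left = math.exp(lo + i * width)
        right = math.exp(lo + (i + 1) * width)
        center = math.sqrt(left * right)  # geometric bin center
        empirical = count / (len(data) * (right - left))
        # Change of variables: pdf of X equals pdf of log(X) divided by x.
        model = log_model.pdf(math.log(center)) / center
        points.append((center, empirical, model))
    return points


def distribution_points(data, log_model):
    """Distribution plot: empirical CDF vs. model CDF at each observation."""
    xs = sorted(data)
    n = len(xs)
    return [(x, (i + 1) / n, log_model.cdf(math.log(x)))
            for i, x in enumerate(xs)]


def qq_points(data, log_model):
    """Q-Q plot: data quantiles (x-axis) vs. model quantiles (y-axis)."""
    xs = sorted(data)
    n = len(xs)
    return [(x, math.exp(log_model.inv_cdf((i + 0.5) / n)))
            for i, x in enumerate(xs)]


# Synthetic stand-in data (assumption): NOT the reading-time data.
rng = random.Random(0)
sample = [rng.lognormvariate(3.0, 1.0) for _ in range(2000)]

model = fit_lognormal(sample)
density = density_points(sample, model)
distribution = distribution_points(sample, model)
qq = qq_points(sample, model)
```

Each function returns the point pairs that the corresponding plot would draw. Points on a Q-Q plot that fall close to the line y = x indicate quantiles where the model and the data agree. Drawing the x-axis on a logarithmic scale, as described above, is left to the plotting library.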

We show these plots for data from English Wikipedia. For this wiki, the likelihood-based goodness-of-fit measures indicate that the exponentiated Weibull model fits best (BIC = 19321; lower is better), followed in order by the Lomax (BIC = 19351), the log-normal (BIC = 19373), and the Weibull (BIC = 20111). However, the log-normal model is the only model that passes the Kolmogorov-Smirnov (KS) test (p = 0.089).
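For readers unfamiliar with the two kinds of measure, the sketch below shows how a BIC value and a KS statistic can be computed from a fitted model. It is a minimal illustration, not the procedure used to obtain the numbers reported here. It assumes synthetic data, and it compares a log-normal fit with an exponential fit, which is a stand-in chosen only because both have closed-form maximum likelihood estimates. The exponential model is not one of the four models under consideration, and all function names are hypothetical. The sketch stops at the KS statistic and does not compute a p-value, since how the p-values were derived is not stated.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def lognormal_loglik(data, mu, sigma):
    """Log-likelihood of the data under a log-normal(mu, sigma) model."""
    const = -math.log(sigma) - 0.5 * math.log(2 * math.pi)
    return sum(const - math.log(x)
               - (math.log(x) - mu) ** 2 / (2 * sigma ** 2)
               for x in data)


def exponential_loglik(data, rate):
    """Log-likelihood of the data under an exponential(rate) model."""
    return len(data) * math.log(rate) - rate * sum(data)


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * loglik


def ks_statistic(data, cdf):
    """Largest vertical gap between the empirical CDF and the model CDF."""
    xs = sorted(data)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        gap = max(gap, (i + 1) / n - f, f - i / n)
    return gap


# Synthetic stand-in data (assumption): NOT the reading-time data.
rng = random.Random(1)
sample = [rng.lognormvariate(3.0, 0.5) for _ in range(2000)]
n = len(sample)

# Closed-form maximum likelihood estimates for both models.
logs = [math.log(x) for x in sample]
mu_hat, sigma_hat = fmean(logs), pstdev(logs)
rate_hat = 1.0 / fmean(sample)

bic_lognormal = bic(lognormal_loglik(sample, mu_hat, sigma_hat), 2, n)
bic_exponential = bic(exponential_loglik(sample, rate_hat), 1, n)

log_model = NormalDist(mu_hat, sigma_hat)
ks_lognormal = ks_statistic(sample, lambda x: log_model.cdf(math.log(x)))
```

The two measures answer different questions, which is why they can disagree as they do above. BIC ranks models against one another by penalized likelihood, while the KS test asks whether the largest gap between the empirical and model distribution functions is small enough to be consistent with a single model.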

[Four inline figures, arranged in two columns of two]