Selecting input probability distributions
17 important questions on Selecting input probability distributions
What happens when the input distribution is appropriate?
What are the approaches to use data to specify a distribution?
1. Use the data values directly (trace-driven simulation).
2. Use empirical distribution (histogram).
3. Fit theoretical distribution.
What are the (dis)advantages of fitting a theoretical distribution?
+ Scalable.
- May not be valid.
- Difficult (may require a mixture of multiple distributions).
What are the (dis)advantages of using an empirical distribution?
+ Valid, observations representative.
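A minimal sketch of sampling from an empirical distribution, assuming the common piecewise-linear construction through the sorted observations (the summary itself only mentions a histogram). The name `make_empirical_sampler` is hypothetical and used for illustration only.

```python
import random


def make_empirical_sampler(observations):
    """Return a sampler for a continuous, piecewise-linear empirical
    distribution built from the sorted observations (an assumed construction)."""
    xs = sorted(observations)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")

    def sample(rng=random):
        # Inverse transform: pick a position in [0, n-1) and interpolate
        # linearly between the two neighbouring order statistics.
        p = rng.random() * (n - 1)
        i = min(int(p), n - 2)  # guard against floating-point edge cases
        return xs[i] + (p - i) * (xs[i + 1] - xs[i])

    return sample


random.seed(0)
draw = make_empirical_sampler([3.0, 1.0, 2.0, 5.0])
values = [draw() for _ in range(5)]
```

Every generated value lies between the smallest and largest observation, which illustrates a known limitation of this approach: values outside the observed range are never produced.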
What are the (dis)advantages of using trace-driven simulation?
- Seldom enough data to make all desired simulation runs.
+ Valid w.r.t. real world.
By which parameters are theoretical distributions characterized?
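- Location parameter: specifies the position of the distribution on the x-axis (shifts it left or right).
- Scale parameter: compresses/expands the distribution.
- Shape parameter: determines, distinct from location and scale, the basic form of a distribution within the general family of distributions of interest.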
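The effect of location and scale can be illustrated with a small sketch. It assumes a unit exponential as the base distribution and arbitrary values `location = 5.0` and `scale = 2.0`; none of these choices come from the summary.

```python
import random
import statistics

random.seed(1)

# Base sample: exponential with location 0 and scale 1 (illustrative choice).
base = [random.expovariate(1.0) for _ in range(10_000)]

location, scale = 5.0, 2.0  # assumed values, for illustration only
transformed = [location + scale * y for y in base]

# Location shifts the mean; scale multiplies the spread.
mean_base, sd_base = statistics.mean(base), statistics.stdev(base)
mean_new, sd_new = statistics.mean(transformed), statistics.stdev(transformed)
```

Here `mean_new` equals `location + scale * mean_base` and `sd_new` equals `scale * sd_base` up to rounding, while the form of the distribution (for example its skewness) is unchanged.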
What are the steps in the selection of theoretical distributions?
1. Hypothesizing a family of distributions.
2. Estimation of parameters.
3. Goodness-of-fit-tests.
What continuous distributions can we use? When can we use them?
- Triangular: rough model in absence of data.
- Exponential: interarrival times, failure times.
- Gamma: processing, repair times.
- Weibull: processing, repair times.
- Normal: errors, changes in stock price, sums of a large number of quantities.
- Lognormal: processing, repair times, products of a large number of quantities.
- Beta: rough model in absence of data, random fractions (defectives).
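As a rough illustration, the continuous distributions listed above can all be sampled with Python's standard `random` module. The parameter values below are arbitrary assumptions, not recommendations.

```python
import random

rng = random.Random(42)

# One draw from each family; all parameter values are illustrative.
samples = {
    "triangular": rng.triangular(1.0, 5.0, 2.0),   # low, high, mode
    "exponential": rng.expovariate(1 / 3.0),       # argument is the rate = 1/mean
    "gamma": rng.gammavariate(2.0, 1.5),           # shape alpha, scale beta
    "weibull": rng.weibullvariate(1.5, 2.0),       # scale alpha, shape beta
    "normal": rng.normalvariate(0.0, 1.0),         # mean, standard deviation
    "lognormal": rng.lognormvariate(0.0, 0.5),     # mu, sigma of the underlying normal
    "beta": rng.betavariate(2.0, 5.0),             # shape parameters alpha, beta
}

for name, value in samples.items():
    print(f"{name:12s} {value:.4f}")
```

Note that the argument order differs between functions (for example, `weibullvariate` takes the scale first, `gammavariate` the shape first), so it is worth checking the documentation before plugging in fitted parameters.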
What discrete distributions can we use? When can we use them?
- Uniform: quantity with bounds known.
- Binomial: number of defectives in batches, demand/batch sizes.
- Geometric: number of failures before first success, demand/batch sizes.
- Negative binomial: number of failures before nth success, demand/batch sizes.
- Poisson: number of events in a time interval, demand/batch sizes.
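The probability mass functions behind these descriptions can be sketched as follows. The function names are hypothetical helpers, and the parameterization ("failures before the s-th success") follows the wording of the list above.

```python
import math


def binomial_pmf(x, t, p):
    """P(x successes in t independent trials with success probability p)."""
    return math.comb(t, x) * p**x * (1 - p) ** (t - x)


def geometric_pmf(x, p):
    """P(x failures before the first success)."""
    return p * (1 - p) ** x


def negative_binomial_pmf(x, s, p):
    """P(x failures before the s-th success)."""
    return math.comb(s + x - 1, x) * p**s * (1 - p) ** x


def poisson_pmf(x, lam):
    """P(x events in an interval) when events occur at mean rate lam per interval."""
    return math.exp(-lam) * lam**x / math.factorial(x)
```

With `s = 1` the negative binomial reduces to the geometric distribution, which matches the way the two are described in the list.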
What are graphical techniques to check assumption?
- Scatter diagram.
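How can we see from the scatter diagram whether there is independence?
- Independence if the points (Xi, Xi+1) are scattered randomly throughout the first quadrant.
- Dependence if the points lie along a line with positive/negative slope in the first quadrant.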
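A scatter diagram for this check plots each observation against its successor. The sketch below builds those pairs and, as an assumed numerical companion to the visual check, computes the lag-1 sample correlation; the helper names are hypothetical.

```python
import math
import random


def lag1_pairs(xs):
    """Pairs (X_i, X_{i+1}) that would be plotted in the scatter diagram."""
    return list(zip(xs[:-1], xs[1:]))


def lag1_correlation(xs):
    """Sample correlation between X_i and X_{i+1}."""
    pairs = lag1_pairs(xs)
    a = [p[0] for p in pairs]
    b = [p[1] for p in pairs]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in pairs)
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


rng = random.Random(3)
independent = [rng.expovariate(1.0) for _ in range(5000)]
# Overlapping sums share a term, so consecutive values are positively dependent.
dependent = [independent[i] + independent[i + 1] for i in range(len(independent) - 1)]

r_independent = lag1_correlation(independent)
r_dependent = lag1_correlation(dependent)
```

For the independent sample the correlation should be close to zero, while the overlapping sums should show a clearly positive value, corresponding to points clustering along an upward-sloping line.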
In what ways can we hypothesize families of distributions?
- Summary statistics (mean, median, coefficient of variation, skewness, etc.)
- Histograms.
- Quantile summaries.
- Box plots.
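The summary statistics and quantile summaries can be computed with the standard library. The sketch below uses a hypothetical helper `summarize`; the hints in the comments (coefficient of variation near 1 and skewness near 2 for exponential data) are properties of the exponential distribution, not decision rules.

```python
import random
import statistics


def summarize(xs):
    """Summary statistics useful for hypothesizing a distribution family."""
    n = len(xs)
    mean = statistics.fmean(xs)
    sd = statistics.stdev(xs)
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    q1, q2, q3 = statistics.quantiles(xs, n=4)  # quartiles, as used in a box plot
    return {
        "mean": mean,
        "median": statistics.median(xs),
        "cv": sd / mean,             # about 1 for exponential data
        "skewness": m3 / m2**1.5,    # about 2 for exponential, 0 for symmetric data
        "quartiles": (q1, q2, q3),
    }


rng = random.Random(11)
stats = summarize([rng.expovariate(0.5) for _ in range(20_000)])
```

A mean well above the median together with positive skewness points to a right-skewed family (exponential, gamma, Weibull, lognormal), whereas mean close to median and skewness near zero is more in line with a normal distribution.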
What are the advantages of maximum likelihood estimation?
+ Asymptotically unbiased.
+ Invariant under transformation.
+ Asymptotically normally distributed.
+ Strongly consistent.
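As a small worked example, assume exponential data: the maximum likelihood estimator of the mean is then the sample mean. The sketch also uses the invariance property listed above to obtain the MLE of the rate. The true mean of 3.0 and the helper name are assumptions for illustration.

```python
import random
import statistics


def exponential_mle_mean(xs):
    """MLE of the exponential mean: the sample mean."""
    return statistics.fmean(xs)


rng = random.Random(7)
true_mean = 3.0  # assumed value used to generate synthetic data
data = [rng.expovariate(1 / true_mean) for _ in range(5000)]

beta_hat = exponential_mle_mean(data)
# Invariance: the MLE of a function of the parameter is that function of the MLE.
rate_hat = 1 / beta_hat
```

With 5000 observations `beta_hat` should land close to 3.0, which is what consistency suggests for large samples.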
What are the heuristic procedures of goodness-of-fit tests?
- Frequency comparisons.
- Distribution-function-differences plot.
- Quantile-Quantile plot (amplifies differences in tails).
- Probability-probability plot (amplifies differences in the middle).
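The points of a Q-Q and a P-P plot can be computed as in the sketch below. It assumes the plotting positions (i - 0.5)/n, which is one common convention, and a fitted exponential model; the function names are hypothetical.

```python
import math


def qq_pp_points(xs, cdf, inv_cdf):
    """Return (qq, pp) point lists for a sample and a fitted model."""
    ordered = sorted(xs)
    n = len(ordered)
    qq, pp = [], []
    for i, x in enumerate(ordered, start=1):
        q = (i - 0.5) / n              # assumed plotting position
        qq.append((inv_cdf(q), x))     # model quantile vs. sample quantile
        pp.append((cdf(x), q))         # model probability vs. sample probability
    return qq, pp


beta = 2.0  # assumed fitted exponential mean


def exp_cdf(x):
    return 1 - math.exp(-x / beta)


def exp_inv_cdf(q):
    return -beta * math.log(1 - q)


sample = [0.3, 0.9, 1.4, 2.2, 3.1, 5.0]  # made-up observations
qq, pp = qq_pp_points(sample, exp_cdf, exp_inv_cdf)
```

If the fitted distribution is adequate, both sets of points lie close to the line y = x. The Q-Q points work on the scale of the data, where tail values spread out, and the P-P points work on the probability scale [0, 1].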
What are the steps for a chi-square test?
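- Divide the range of the fitted distribution into $k$ adjacent intervals.
- Tally $N_j$: number of $X_i$'s in the $j$th interval.
- Compute the expected proportion $p_j$ of the $X_i$'s that would fall in the $j$th interval if we were sampling from the fitted distribution.
- Statistic: $\chi^2 = \sum_{j=1}^{k} \frac{(N_j - np_j)^2}{np_j}$; reject the fitted distribution if $\chi^2 > \chi^2_{k-1,\,1-\alpha}$.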
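The chi-square steps can be sketched for a fitted exponential distribution as follows. The use of equiprobable intervals (so every p_j = 1/k), the choice k = 10, and the function name are assumptions made for illustration.

```python
import bisect
import math
import random


def chi_square_exponential(xs, k):
    """Chi-square statistic for an exponential fit with k equiprobable intervals."""
    n = len(xs)
    beta_hat = sum(xs) / n  # MLE of the exponential mean
    # Interior interval boundaries a_j with F(a_j) = j/k, for j = 1..k-1.
    edges = [-beta_hat * math.log(1 - j / k) for j in range(1, k)]
    counts = [0] * k
    for x in xs:
        counts[bisect.bisect_right(edges, x)] += 1  # tally N_j
    expected = n / k  # n * p_j with p_j = 1/k
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    return statistic, counts


rng = random.Random(5)
data = [rng.expovariate(1 / 4.0) for _ in range(1000)]
statistic, counts = chi_square_exponential(data, k=10)

# Compare with a tabled critical value, e.g. chi-square with k - 1 = 9 degrees
# of freedom at alpha = 0.05, which is 16.919.
reject = statistic > 16.919
```

The critical value is taken from a chi-square table here because the standard library has no chi-square quantile function; a statistics package could compute it directly.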
For what is the ExpertFit software needed?
What are the modules (steps) in ExpertFit?
- Models: fits distributions (MLE), ranks them on quality of fit, determines whether the best fit is good enough (otherwise, recommends an empirical distribution).
- Comparisons: further investigates quality of fit (plots, tests).
- Applications: computes characteristics of the fitted distribution. Puts the selected distribution into the proper format for the chosen simulation package.