General notes

20 important questions on General notes

What are three important guidelines for presenting statistical results?

  1. Understandability: report statistics in a form that is easily understood by most people.
  2. Interpretability: report the statistics in units of measure or in plain language, or at least in terms that require the least statistical knowledge from your audience.
  3. "Confidence": report a confidence interval (most often a 95% CI) to indicate the confidence you have in the reported statistic.


Confidence interval = a region generated by a procedure that, under repeated sampling, contains the true value of the parameter of interest with a specified probability.

On what does the width of the confidence interval depend?

  • The standard error of the parameter estimate -> the larger the sample, the smaller the standard error and the narrower the interval.
  • The level of confidence -> a 95% confidence interval is wider than a 90% CI. A 100% CI is (-infinity, +infinity): it contains all possible values of the parameter and is not very informative.
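
Both dependencies can be checked numerically. The sketch below is only an illustration: it assumes a Wald (normal-approximation) interval for a single proportion, and the function names and counts are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, level=0.95):
    """Wald (normal-approximation) confidence interval for one proportion."""
    p = successes / n
    se = sqrt(p * (1 - p) / n)                  # standard error shrinks as n grows
    z = NormalDist().inv_cdf(0.5 + level / 2)   # critical value grows with the level
    return p - z * se, p + z * se


def width(ci):
    """Width of an interval given as (lower, upper)."""
    return ci[1] - ci[0]


# Hypothetical data: the same observed proportion (0.30) in a small and a large sample.
small_95 = proportion_ci(30, 100, level=0.95)
large_95 = proportion_ci(300, 1000, level=0.95)
small_90 = proportion_ci(30, 100, level=0.90)

print(width(small_95), width(large_95), width(small_90))
```

With the same proportion, the larger sample gives the narrower interval, and the 90% interval is narrower than the 95% one.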

What is:

1. risk difference
2. relative risk
3. (odds ratio)

  1. The difference between two proportions
  2. The ratio of two proportions
  3. The ratio of two odds


These measures deal with categorical data displayed in a two-by-two contingency table containing counts per category.
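
As a minimal sketch, the measures can be computed directly from the four cell counts of a two-by-two table. The cell labels a, b, c, d and the counts below are assumptions for the example, not taken from the study material; the ratio of two proportions is labelled relative risk here.

```python
def two_by_two_measures(a, b, c, d):
    """Effect measures for a 2x2 table of counts.

    a, b = events / non-events in group 1
    c, d = events / non-events in group 2
    """
    p1 = a / (a + b)   # proportion with the event in group 1
    p2 = c / (c + d)   # proportion with the event in group 2
    return {
        "risk_difference": p1 - p2,        # difference of two proportions
        "relative_risk": p1 / p2,          # ratio of two proportions
        "odds_ratio": (a * d) / (b * c),   # ratio of two odds
    }


# Hypothetical counts: 20 of 100 events in group 1, 10 of 100 in group 2.
measures = two_by_two_measures(20, 80, 10, 90)
print(measures)
```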

Calculation:
  • standard error
  • adjusted confidence interval

Formula 3.1, at the bottom of the top photo = standard error.
Bottom photo = adjusted confidence interval.


The standard confidence interval does not perform well when the sample size is small, so it is better to use an adjusted Wald confidence interval.
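
The formulas in the photos are not reproduced in these notes, so the sketch below rests on assumptions: it uses the usual Wald standard error for a difference of two proportions and, as the adjustment, the Agresti-Caffo rule (add one success and one failure to each group). The book's own formula may differ.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(x1, n1, x2, n2, level=0.95, adjusted=False):
    """Confidence interval for the risk difference p1 - p2.

    x1, x2 = number of events; n1, n2 = group sizes.
    """
    if adjusted:
        # Assumed adjustment (Agresti-Caffo): one extra success and failure per group.
        x1, n1, x2, n2 = x1 + 1, n1 + 2, x2 + 1, n2 + 2
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # Wald standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Hypothetical small samples: 9 of 10 events versus 5 of 10 events.
wald = risk_difference_ci(9, 10, 5, 10)
adjusted_wald = risk_difference_ci(9, 10, 5, 10, adjusted=True)
print(wald, adjusted_wald)
```

In this made-up example the unadjusted interval excludes 0 while the adjusted one includes it, which shows how much the choice can matter with small samples.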

What does a 95% Confidence interval mean?

The range of 'plausible values' for the unknown risk difference in the population.

Cohen's w effect size index (p. 31)

Less intuitive, but used in power analyses and sample size calculations.

It reflects the differences between the observed frequencies in the cells of a contingency table and those expected under the null hypothesis. It can be calculated from the observed and expected proportions (relative frequencies) in each cell or from the test statistic χ².
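
Both routes can be sketched with the standard definitions, w = sqrt(sum((p_observed - p_expected)^2 / p_expected)) and w = sqrt(χ² / N). The function names and the proportions are illustrative assumptions.

```python
from math import sqrt


def cohens_w_from_proportions(observed, expected):
    """Cohen's w from observed and expected cell proportions."""
    return sqrt(sum((o - e) ** 2 / e for o, e in zip(observed, expected)))


def cohens_w_from_chi2(chi2, n):
    """Cohen's w from the chi-square statistic and the total sample size N."""
    return sqrt(chi2 / n)


# Hypothetical 2x2 table expressed as cell proportions (each list sums to 1).
observed = [0.30, 0.20, 0.20, 0.30]
expected = [0.25, 0.25, 0.25, 0.25]

w = cohens_w_from_proportions(observed, expected)
# With N = 200 these proportions correspond to chi2 = N * w**2 = 8.
w_from_chi2 = cohens_w_from_chi2(8.0, 200)
print(w, w_from_chi2)
```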

ANOVA + variance calculation

  • ANOVA analyses the variance in a set of observations.
  • It assigns chunks of the total variance to the independent variables and their interaction in the general linear model (equation 4.1, p. 41).
  • The remaining residual variance, or error, is not explained by the model's factors.
  • To calculate variance -> use the sum of squared errors, SS: the sum of the squared differences between each individual measurement and the overall mean, X̄.
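
The partitioning can be illustrated for the simplest case, a one-way design with a single factor, where the total SS splits into a between-groups part (explained by the factor) and a within-groups part (residual). The group labels and data are made up for the sketch.

```python
def sum_of_squares(values, center):
    """Sum of squared differences between each value and a centre."""
    return sum((v - center) ** 2 for v in values)


def mean(values):
    return sum(values) / len(values)


# Hypothetical measurements for three levels of one factor.
groups = {"a": [4.0, 5.0, 6.0], "b": [7.0, 8.0, 9.0], "c": [1.0, 2.0, 3.0]}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)

ss_total = sum_of_squares(all_values, grand_mean)
# Variance assigned to the factor: group means versus the overall mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Residual variance: each measurement versus its own group mean.
ss_within = sum(sum_of_squares(g, mean(g)) for g in groups.values())

print(ss_total, ss_between, ss_within)
```

Here ss_between and ss_within add up to ss_total, which is the sense in which the model "assigns chunks" of the total variance.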

Why use contrasts? (and not a post hoc test)
Degrees of freedom

  • The main advantage of a contrast is that it is based on all observations in the ANOVA.
  • So you have more degrees of freedom (than in a post hoc test, where you can only compare two samples).
  • More degrees of freedom means a larger effective sample size and hence higher statistical power.
  • So a planned comparison is hypothesis-driven and formulated a priori. Also, unlike a post hoc test, a planned comparison has more power because it uses the entire data set instead of a subset.

Why multiple linear regression? (p. 52)

To model a single dependent variable that is influenced by more than one independent variable (predictor).

e.g.
  • Which factors (predictors/independent variables) explain or account for the variation in the dependent variable? -> could lead to the identification of causal factors
  • What effect does one independent variable have on the dependent variable when you correct for another independent variable?
  • When I have some information on the independent variables, can I then predict the dependent variable?

There are two different ways to construct a multiple linear regression model. (p. 59)

Forced entry -> you enter the predictors manually.
Sequential regression -> the statistical software decides which predictors enter the model.

Sequential regression -> forward selection or backward elimination determines the order in which predictors are added to or removed from a model under construction.

RMSE (root mean square error), p. 61

  • It is a measure of the average deviation or error of the data points (y_i) from the values calculated from the regression model (ŷ_i).
  • The smaller the RMSE, the better -> the smaller the difference between all measured values and their predicted or calculated values, the better the model fits the data.
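
A minimal sketch of the calculation, assuming the common definition that divides the summed squared errors by the number of observations n (some texts divide by the residual degrees of freedom instead). The data are made up.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between measured and model-predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return sqrt(sum(squared_errors) / len(observed))


# Hypothetical measurements and two competing sets of model predictions.
y = [2.0, 4.0, 6.0, 8.0]
close_fit = [2.5, 3.5, 6.5, 7.5]   # every prediction is off by 0.5
poor_fit = [4.0, 2.0, 8.0, 6.0]    # every prediction is off by 2.0

print(rmse(y, close_fit), rmse(y, poor_fit))
```

The model whose predictions sit closer to the measurements has the smaller RMSE.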

What is the parsimony principle? (p. 67)

Aiming for the simplest model that still has adequate explanatory and predictive power.

Statistical power or sensitivity

Is the probability that you will detect an effect given that there really is an effect present in the population from which the sample is taken.
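
To make the definition concrete, the sketch below approximates power for a one-sided, one-sample test using the standard normal distribution instead of the t-distribution. That simplification, the function name and the example numbers are assumptions; the approximation is reasonable only for larger samples.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n, alpha=0.05):
    """Approximate power of a one-sided, one-sample z-test.

    d = standardized effect size, n = sample size, alpha = significance level.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)            # rejection threshold under H0
    return 1 - z.cdf(z_crit - d * sqrt(n))   # P(reject H0 | true effect is d)


print(approx_power(0.5, 16))   # moderate power
print(approx_power(0.5, 64))   # same effect, larger sample: higher power
print(approx_power(0.0, 16))   # no effect: rejection rate equals alpha
```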

Smallest effect size of interest (p. 79)

Has to be set using your own expertise and professional judgement.

Type-2 error - false negative (p. 82)

Missing an effect that is actually present in the population.

Type-1 error - false positive (p. 82)

Getting a statistically significant result when there is actually no effect.

Noncentrality parameter, ncp (p. 82)

  • It is a measure of how far the peak of the t-distribution under Ha has shifted from that under H0.
  • The same effect size d will give a larger ncp when the sample size n increases.
  • When Cohen's effect size d = 0, then ncp = 0 and we have a central t-distribution that is symmetric and centered around 0.
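
The notes do not give the formula, so the sketch below assumes the standard expressions: ncp = d * sqrt(n) for a one-sample t-test, and ncp = d * sqrt(n / 2) for a two-sample t-test with n observations per group. The function names are illustrative.

```python
from math import sqrt


def ncp_one_sample(d, n):
    """Noncentrality parameter for a one-sample t-test (assumed: d * sqrt(n))."""
    return d * sqrt(n)


def ncp_two_sample(d, n_per_group):
    """Noncentrality parameter for a two-sample t-test with equal group sizes."""
    return d * sqrt(n_per_group / 2)


# The same effect size gives a larger ncp as the sample size grows.
print(ncp_one_sample(0.5, 16))
print(ncp_one_sample(0.5, 64))
# With d = 0 the distribution is central (ncp = 0), whatever the sample size.
print(ncp_one_sample(0.0, 100))
```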

How to increase the effect size d

  • By decreasing variability (standard deviation)
  • By increasing the smallest effect size of interest

The two consequences of two-tailed testing are:

  1. There are now two critical t-values: one negative, to the left of the central value t = 0, and one positive, to the right. H0 will be rejected when the significance test gives t > tcrit or t < -tcrit.
  2. The critical t-values are more extreme, as each has to enclose a tail area smaller than 5% (alpha/2 per tail, e.g. 2.5% when alpha = 5%) under the distribution when H0 is true.
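
The shift in critical values can be sketched numerically. As a stated simplification, the code uses the standard normal distribution as a large-sample stand-in for the t-distribution; actual critical t-values are somewhat larger and depend on the degrees of freedom.

```python
from statistics import NormalDist


def one_tailed_critical(alpha=0.05):
    """Critical value with the whole of alpha in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha)


def two_tailed_critical(alpha=0.05):
    """Pair of critical values with alpha / 2 in each tail."""
    upper = NormalDist().inv_cdf(1 - alpha / 2)
    return -upper, upper


one_sided = one_tailed_critical(0.05)
lower, upper = two_tailed_critical(0.05)
print(one_sided, lower, upper)
```

The two-tailed critical values are further from 0 than the one-tailed value because each tail holds only half of alpha.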

What is a post hoc test and when is it used?

A post hoc test is used only after we find a statistically significant result and need to determine where our differences truly came from. The term “post hoc” comes from the Latin for “after the event”. There are many different post hoc tests that have been developed, and most of them will give us similar answers.
