Analyzing association between quantitative variables
34 important questions on Analyzing association between quantitative variables
What do we call a straight-line function that predicts the value of the response variable y from a value of the explanatory variable x?
How do we symbolize the predicted value of the response variable y?
What do we call the point where the regression line crosses the y-axis, a in the formula of the regression line?
What do we call the concept symbolized by b in the formula of the regression line, and what is its function?
What do we call the prediction errors, the differences between the predicted value of y and the actual value of y?
When several candidate regression lines can be drawn through a scatterplot, how do you pick the right one?
What do we call the method that produces the line ŷ = a + bx through a scatterplot with the smallest value for the residual sum of squares?
What are the properties of a regression line?
- it has the smallest residual sum of squares
- the sum and the mean of the residuals equal 0
- it passes through the point (sample mean of the explanatory variable, sample mean of the response variable)
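The three properties above can be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up data set; the function name least_squares and the data are illustrative and do not come from the study material.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (a, b) for the line y_hat = a + b*x with the smallest residual sum of squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b

def rss(a, b, xs, ys):
    """Residual sum of squares for the line y_hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print("sum of residuals:", round(sum(residuals), 10))
print("line at mean of x:", a + b * mean(xs), "mean of y:", mean(ys))
print("RSS of fitted line:", rss(a, b, xs, ys))
print("RSS of a shifted line:", rss(a + 0.5, b, xs, ys))
```

Any other intercept or slope, such as the shifted line in the last print, gives a larger residual sum of squares on the same data.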
What do we call a function that describes the relation between x and y in a population?
What do we call a simple approximation for how variables relate in a population?
When is an outlier influential?
- its x value is low or high compared to the rest of the data
- it does not fall in a straight line with the rest of the data.
What do we call the probability distribution of all y values at a fixed value of x?
What do we call the concept that describes how the population mean of each conditional distribution depends on the value of the explanatory variable?
What do we call a square table that lists the variables in the rows and again in the columns, and shows the correlation between each pair?
What do we call a summary of the sizes of the errors made when predicting y values using the sample mean?
If r² is 0.4, what does that tell us?
- the error when predicting y using ŷ is 40% smaller than the error when using the sample mean to predict y.
- 40% of the variability in y is explained by x.
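The reading of r² as a proportional reduction in prediction error can be computed directly. This is a minimal sketch in plain Python, assuming made-up data and measuring error as a sum of squared errors; for this data r² comes out at about 0.6 rather than 0.4.

```python
from statistics import mean

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Least squares line y_hat = a + b*x
b = sxy / sxx
a = y_bar - b * x_bar

# Error when predicting every y with the sample mean of y
total_ss = syy
# Error when predicting every y with the regression line
residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Proportional reduction in error
r_squared = (total_ss - residual_ss) / total_ss

# The same number is the square of the correlation r
r = sxy / (sxx * syy) ** 0.5

# Both values are about 0.6 for this data
print(round(r_squared, 3), round(r ** 2, 3))
```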
What does r² symbolize?
On which two things does the size of the correlation depend?
- if the subjects are grouped for observations instead of observed individually, the correlation tends to increase
- the correlation is smaller when we use a restricted range of possible x values, compared to when we use the full range.
What do we call making predictions about individuals based on results from groups?
What are the assumptions for testing independence of two variables?
- the population means of y at different values for x have a straight line relationship
- the data was gathered using randomization
- the population values for y at each value of x follow a normal distribution, with the same standard deviation at each x value.
What do we call using a regression line to predict y values for x values outside the observed range of data?
What do we call predictions about the future using time series of data?
What is the key characteristic of a regression outlier?
- it doesn't follow the trend of the rest of the data
What do we call it when an observation has a large effect on the results of a regression analysis?
What are the properties of an influential outlier?
- its x value is relatively low or high compared to the rest of the data
- the observation is a regression outlier
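Why both properties are needed can be seen by refitting the line with one extra point. This is a minimal sketch in plain Python, assuming made-up data that lie exactly on y = 2x; the helper name slope and both added points are hypothetical.

```python
from statistics import mean

def slope(xs, ys):
    """Least squares slope b of the line y_hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data lying exactly on the line y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

base = slope(xs, ys)

# A regression outlier whose x value sits in the middle of the data
slope_mid = slope(xs + [3], ys + [16])

# A regression outlier whose x value is far above the rest of the data
slope_far = slope(xs + [20], ys + [0])

print("no outlier:", base)
print("outlier at a middle x value:", slope_mid)
print("outlier at an extreme x value:", round(slope_far, 3))
```

In this data the off-trend point at a middle x value leaves the slope unchanged, while the off-trend point at an extreme x value turns the slope negative.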
What do we call it when the correlation and the regression line aren't influenced by outliers?
What do we call a usually unobserved variable that influences the association between the variables of interest?
What do we call the procedure in which you add a third variable to an association and analyze the data at separate levels of that variable?
What are the four pitfalls for regression?
- extrapolating
- Simpson's paradox
- influential outliers
- confounding
Which kind of interval do you use if you want to know what value of y you can expect for one person at a specific value of x?
What do we call the estimate that describes the variability of y values at a fixed x value?
Why is a confidence interval narrower than a prediction interval?
- a confidence interval estimates a population mean
- a prediction interval predicts an individual value.
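The difference in width can be made concrete with the standard textbook formulas for simple linear regression. This is a minimal sketch in plain Python, assuming made-up data; the function name interval_half_widths is hypothetical, and the multiplier t_crit=2.0 is only a rough stand-in for the proper t critical value, which depends on the sample size and confidence level.

```python
from math import sqrt
from statistics import mean

def interval_half_widths(xs, ys, x0, t_crit=2.0):
    """Return (confidence, prediction) interval half-widths at x0.

    t_crit is an assumed rough multiplier, not an exact t quantile.
    """
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar

    # Residual standard deviation: variability of y at a fixed x
    residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(residual_ss / (n - 2))

    # Uncertainty about the population mean of y at x0
    se_mean = s * sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    # Uncertainty about one individual y at x0: adds the spread of individuals
    se_pred = s * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

    return t_crit * se_mean, t_crit * se_pred

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

ci_half, pi_half = interval_half_widths(xs, ys, x0=3)
print("confidence interval half-width:", round(ci_half, 3))
print("prediction interval half-width:", round(pi_half, 3))
```

The extra 1 under the square root for the prediction interval is the variability of individual y values around their mean, which is why that interval is always the wider of the two.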
What do we call the estimate that summarizes how much less error there is in predicting y using regression compared to using the mean?
What does the standardized residual show?
The questions on this page originate from the summary of the following study material: