Descriptive Analytics I: Nature of Data, Statistical Modeling, and Visualisation - Regression Modelling for Inferential Statistics

6 important questions on Descriptive Analytics I: Nature of Data, Statistical Modeling, and Visualisation - Regression Modelling for Inferential Statistics

What is regression and what statistical purpose does it serve?

Regression is a relatively simple statistical technique that models the dependence of one variable (the response or output variable) on one or more explanatory (input) variables.
It can be used for:
  1. Hypothesis testing (theory building): investigating potential relationships between variables. It can reveal the strength and direction of the relationships between a number of explanatory variables and the response variable.
  2. Prediction/forecasting: estimating values of a response variable from one or more explanatory variables. The fitted regression equation is used to make the prediction.

What are the commonalities and differences between regression and correlation?

Correlation: makes no assumption about whether one variable depends on the other. It only gives an estimate of the degree of association between the variables.
Regression: attempts to describe the dependence of a response variable on one or more explanatory variables. It implicitly assumes a one-way causal effect from the explanatory variable(s) to the response variable.
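
The contrast can be illustrated with a small sketch. The data (`hours`, `score`) and the helper names are made up for illustration only. The correlation coefficient is the same whichever variable comes first, while the regression slope changes when the roles of response and explanatory variable are swapped.

```python
from math import sqrt

def _centered_sums(x, y):
    """Return (Sxy, Sxx, Syy): sums of products of deviations from the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy, sxx, syy

def pearson_r(x, y):
    """Pearson correlation: symmetric, no response/explanatory distinction."""
    sxy, sxx, syy = _centered_sums(x, y)
    return sxy / sqrt(sxx * syy)

def regression_slope(explanatory, response):
    """Least-squares slope of response on explanatory: direction matters."""
    sxy, sxx, _ = _centered_sums(explanatory, response)
    return sxy / sxx

# Hypothetical data: hours studied and exam score
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

print(pearson_r(hours, score), pearson_r(score, hours))   # same value twice
print(regression_slope(hours, score))                     # score per extra hour
print(regression_slope(score, hours))                     # hours per extra point
```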

What is OLS? How does OLS determine the linear regression line?

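OLS (Ordinary Least Squares) is a method/algorithm for identifying the regression line. It chooses the line that minimizes the sum of squared differences between the observed values and the values on the line (the residuals), which leads to a mathematical expression for the estimated regression line.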
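
As a minimal sketch of the idea for a single explanatory variable, the closed-form least-squares estimates are slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The function names and the sample data below are assumptions made for illustration, not part of the study material.

```python
def fit_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared residuals for a given candidate line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

def predict(intercept, slope, new_x):
    """Use the fitted equation to estimate the response for a new input."""
    return intercept + slope * new_x

# Hypothetical observations
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_ols(x, y)
best = sum_squared_errors(x, y, b0, b1)

# Any other line gives a larger sum of squared errors than the OLS line
worse = sum_squared_errors(x, y, b0 + 0.1, b1)
print(b0, b1, best, worse)
print(predict(b0, b1, 6))
```

Nudging the intercept or the slope away from the fitted values always increases the sum of squared errors, which is what "least squares" refers to.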

List and describe the main steps to follow in developing a linear regression model?

tbd

What are the most commonly pronounced assumptions for linear regression?

Linearity: the relationship between the response variable and the explanatory variables is linear.
Independence (of errors): the errors of the response variable are uncorrelated with each other.
Normality (of errors): the errors of the response variable are normally distributed.
Constant variance (of errors), also called homoscedasticity: the errors of the response variable have the same variance regardless of the values of the explanatory variables. In practice the assumption is violated if the response variable varies over a wide enough range.
No multicollinearity: the explanatory variables are not correlated with each other.
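
One simple way to screen for multicollinearity is to look at the pairwise correlations between the explanatory variables. The sketch below is only an illustration: the variable names, the data, and the 0.8 cut-off are assumptions, not values from the study material, and pairwise correlation will not catch every form of multicollinearity.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def flag_correlated_pairs(explanatory, threshold=0.8):
    """Return (name_a, name_b, r) for pairs whose |r| exceeds the threshold.

    The threshold is an arbitrary illustrative cut-off (assumption).
    """
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(explanatory.items(), 2):
        r = pearson_r(col_a, col_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Hypothetical explanatory variables for a house-price model
explanatory = {
    "size_m2": [50, 60, 70, 80, 90],
    "rooms": [2, 2, 3, 3, 4],
    "age_years": [30, 5, 20, 10, 25],
}

flagged = flag_correlated_pairs(explanatory)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are strongly correlated (r = {r:.2f})")
```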

What is a time series? What are the main forecasting techniques for time series data?

A time series is a sequence of data points of the variable of interest, measured and represented at successive points in time spaced at uniform time intervals.
Naïve forecast: today's forecast is the same as yesterday's actual value.
ARIMA: a more complex method that combines autoregressive and moving average patterns in the data.
Averaging methods: simple average, moving average, weighted moving average, ...
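
The naïve and averaging methods are simple enough to sketch directly. The function names, the `demand` series, the window size, and the weights below are illustrative assumptions. Each function returns a one-step-ahead forecast for the next period.

```python
def naive_forecast(series):
    """Next value is forecast to equal the most recent actual value."""
    return series[-1]

def simple_average(series):
    """Next value is forecast as the mean of all past observations."""
    return sum(series) / len(series)

def moving_average(series, window):
    """Next value is forecast as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)

def weighted_moving_average(series, weights):
    """Weighted mean of the last len(weights) observations.

    Weights are listed oldest to newest, so the last weight applies to the
    most recent observation.
    """
    if len(weights) > len(series):
        raise ValueError("need at least as many observations as weights")
    recent = series[-len(weights):]
    return sum(w * v for w, v in zip(weights, recent)) / sum(weights)

# Hypothetical demand per period, oldest first
demand = [20, 22, 21, 25, 24, 27]

print(naive_forecast(demand))
print(simple_average(demand))
print(moving_average(demand, window=3))
print(weighted_moving_average(demand, weights=[1, 2, 3]))
```

Giving the largest weight to the most recent observation makes the weighted moving average react faster to recent changes than the plain moving average.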
