We introduce several classic machine learning models, in addition to linear regression and logistic regression.

Regression models

As introduced before, regression is predictive modeling of a numerical variable $y$ from one or more variables $x$. In this section, we focus on some classic regression models, including the general linear model, autoregression, and lasso regression. At the end of this section, we introduce a special class of statistical methods, called non-parametric methods, which make no assumption about the population distribution or sample size.

1. General linear model

What is the general linear model?

The General Linear Model (GLM) underlies most of the statistical analyses commonly used in social science research. It is a generalization of multiple linear regression to the case of more than one dependent variable.

<aside> 💡 Tips : Do not confuse the generalized linear model with the general linear model. The first term is more general and refers to a larger class of models popularized by McCullagh and Nelder (1982, 2nd edition 1989). The second term usually refers to conventional linear regression models. This article describes more details.

</aside>

How does the general linear model work?

Essentially, the GLM is similar to the two-variable linear regression model. The difference is that each term in the GLM can represent a set of variables, not just a single one. Therefore, the general linear model can be written as:

$y = b_0 + bx + e$

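where $y$ is a set of outcome variables, $x$ is a set of input variables, $b_0$ is the set of intercepts (the value of each $y$ when every $x$ is 0), $b$ is the set of coefficients, one for each $x$, and $e$ is the error term.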

The GLM allows us to include a large number of variables. A model can have multiple outcomes, represented as a set of $y$-values. If we have multiple inputs, we can include them as a set of $x$-values. For each $x$-value we estimate a $b$-value that represents the relationship between that $x$ and $y$. The estimates of these $b$-values, and the statistical tests of these estimates, are what enable us to test specific research hypotheses about relationships between variables or differences between groups.
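To make the idea of sets of $x$-values and $y$-values concrete, here is a minimal sketch that estimates the $b$-values by ordinary least squares with NumPy. The data are simulated, and every name in it (`X`, `Y`, `B_true`, `b0_true`, `B_hat`, `b0_hat`) is a hypothetical choice made for this illustration, not part of any particular library's GLM interface.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (an assumption for illustration): 3 inputs, 2 outcomes.
n = 200
X = rng.normal(size=(n, 3))                  # the set of x-values
B_true = np.array([[1.0, -2.0],
                   [0.5,  0.0],
                   [0.0,  3.0]])             # one b-value per (x, y) pair
b0_true = np.array([4.0, -1.0])              # one intercept per outcome
E = rng.normal(scale=0.1, size=(n, 2))       # the error term e
Y = b0_true + X @ B_true + E                 # the set of y-values

# Prepend a column of ones so the intercepts b0 are estimated as well.
X_design = np.column_stack([np.ones(n), X])

# Least squares fits all outcome columns of Y in a single call.
coef, *_ = np.linalg.lstsq(X_design, Y, rcond=None)
b0_hat = coef[0]      # estimated intercepts, shape (2,)
B_hat = coef[1:]      # estimated b-values, shape (3, 2)

print("intercepts:", b0_hat)
print("coefficients:\n", B_hat)
```

Each column of `B_hat` holds the coefficients for one outcome, so a single fit covers every $x$, $y$ relationship in the model.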

The GLM encompasses a number of different statistical models: analysis of variance (ANOVA), analysis of covariance (ANCOVA), multivariate analysis of variance (MANOVA), multivariate analysis of covariance (MANCOVA), ordinary linear regression, the t-test, and the F-test.
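As one illustration of how a familiar test fits inside this framework, the sketch below computes a two-sample t statistic twice: once with the classic pooled-variance formula, and once as the coefficient test in the regression $y = b_0 + bx + e$, where $x$ is a 0/1 group indicator. The data are simulated and the variable names (`group_a`, `group_b`, `t_classic`, `t_regression`) are hypothetical, chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated samples for two groups (an assumption for illustration).
group_a = rng.normal(loc=0.0, scale=1.0, size=30)
group_b = rng.normal(loc=0.8, scale=1.0, size=40)
na, nb = len(group_a), len(group_b)

# 1) Classic two-sample t statistic with pooled variance.
pooled_var = ((na - 1) * group_a.var(ddof=1)
              + (nb - 1) * group_b.var(ddof=1)) / (na + nb - 2)
t_classic = (group_b.mean() - group_a.mean()) / np.sqrt(
    pooled_var * (1 / na + 1 / nb))

# 2) The same test as a linear model: x = 0 for group A, 1 for group B.
y = np.concatenate([group_a, group_b])
x = np.concatenate([np.zeros(na), np.ones(nb)])
X = np.column_stack([np.ones_like(x), x])    # intercept column + indicator

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
sigma2 = resid @ resid / (len(y) - 2)        # residual variance estimate
cov = sigma2 * np.linalg.inv(X.T @ X)        # covariance of the estimates
t_regression = coef[1] / np.sqrt(cov[1, 1])  # t statistic for b

print(t_classic, t_regression)
```

The slope `coef[1]` equals the difference between the two group means, and the two t statistics agree up to floating-point error, which is why the t-test can be treated as a special case of the linear model.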

2. Autoregression

What is Autoregression?

<aside> 💡 The term autoregression indicates that it is a regression of the variable against itself.

</aside>
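A minimal sketch of this idea, assuming the simplest case where the series is regressed on its own previous value: $y_t = c + \phi y_{t-1} + e_t$. The series is simulated, and the names `c_true`, `phi_true`, `c_hat`, and `phi_hat` are hypothetical, introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate a series in which each value depends on the previous one
# (the parameter values are assumptions chosen for this example).
n = 2000
c_true, phi_true = 1.0, 0.6
y = np.empty(n)
y[0] = c_true / (1 - phi_true)               # start at the long-run mean
for t in range(1, n):
    y[t] = c_true + phi_true * y[t - 1] + rng.normal(scale=0.5)

# Regress the series on itself: the predictor is the lagged series.
X = np.column_stack([np.ones(n - 1), y[:-1]])   # intercept + y_{t-1}
target = y[1:]                                  # y_t
(c_hat, phi_hat), *_ = np.linalg.lstsq(X, target, rcond=None)

print("estimated c:", c_hat)
print("estimated phi:", phi_hat)
```

The only difference from an ordinary regression is where the predictor comes from: instead of a separate variable $x$, the input column is the same series shifted by one time step.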