Linear Regression Models Main


Notes and Ideas:


Definition:

Linear Regression models the target as a linear combination of the features; see Linear Algebra Basic Guide and General Linear Model.

Function:

$$\hat{y} = b + w_1 x_1 + w_2 x_2 + \dots + w_n x_n$$

  • From the equation above, we have the linear model based on the n features. Considering only a single feature, as you have probably already understood, $w_1$ will be the slope and $b$ will represent the intercept.
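A minimal NumPy sketch of the function above (the variable names and values here are purely illustrative, not from the original note):

```python
import numpy as np

def predict(X, w, b):
    """Linear model: y_hat = b + w_1*x_1 + ... + w_n*x_n."""
    return b + X @ w

# Single-feature case: w holds the slope and b is the intercept
X = np.array([[1.0], [2.0], [3.0]])
print(predict(X, w=np.array([2.0]), b=0.5))  # [2.5 4.5 6.5]
```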

  • Cost Function:

    • We are looking to optimize $w$ and $b$ such that they minimize the cost function

      $$J(w, b) = \sum_{i=1}^{M}\left(y_i - \hat{y}_i\right)^2$$

      • Assume that the dataset has M instances and p features.
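A small sketch of this cost as code, assuming the sum-of-squared-errors form written above (data and names are illustrative):

```python
import numpy as np

def cost(X, y, w, b):
    """Sum of squared errors between the targets y and the predictions b + X @ w."""
    y_hat = b + X @ w
    return np.sum((y - y_hat) ** 2)

# M = 3 instances, p = 1 feature; w holds the slope, b is the intercept
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
print(cost(X, y, w=np.array([2.0]), b=0.0))  # 0.0 for a perfect fit
```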

    Linear Regression Assumptions:

    • Ordinary Least Squares:
  • Linear Relationship between predictors and the target variable, meaning the pattern must be in the form of a straight line (or a hyperplane in the case of multiple linear regression)

    • This assumption is validated if there is no discernible nonlinear pattern in the residual plot. Let’s consider the following example
    • Example:
      • In the above case, the assumption is violated since a U-shape pattern is apparent. In other words, the true relationship is nonlinear.
  • Homoscedasticity, i.e., constant variance of the residuals

    • This assumption is validated if the residuals are scattered evenly (at about the same distance) around the zero horizontal line throughout the x-axis in the residual plot.
    • Example:
      • In the below case, the assumption is violated since the variance gets smaller for larger fitted values.
    • Independent observations, which is equivalent to independent residuals
      • This assumption is validated if there is no discernible pattern between several consecutive residuals in the residual plot.
      • Example:
      • In the below case, the assumption is violated since there are discernible patterns (both are linear with a negative slope) between consecutive residuals.
    • Normality of Residuals, i.e., the residuals follow the normal distribution

    Find the Best Model:
  • see Model Selection

  • Apply Penalty (Ridge and Lasso regression are simple techniques to reduce model complexity and prevent over-fitting, which may result from simple linear regression):

    • Ridge Regression

      Ridge Regression (L2 Penalty)


      Definition and Ideas:

      • In Ridge Regression, the cost function is altered by adding a penalty equivalent to the square of the magnitude of the coefficients:

        $$J_{ridge}(w, b) = \sum_{i=1}^{M}\left(y_i - \hat{y}_i\right)^2 + \lambda\sum_{j=1}^{p} w_j^2$$

        • This is equivalent to minimizing the cost function above under the condition that $\sum_{j=1}^{p} w_j^2 \le c$ for some $c > 0$.
      • Ridge regression puts a constraint on the coefficients ($w_j$). The penalty term ($\lambda \sum_{j=1}^{p} w_j^2$) regularizes the coefficients such that if the coefficients take large values, the optimization function is penalized.

      • We take the correlation matrix and add a constant $e$ to its diagonal:

        $$\begin{bmatrix} 1+e & & & \\ & 1+e & & \\ & & 1+e & \\ & & & 1+e \end{bmatrix}$$
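A minimal NumPy sketch of this idea, assuming standardized features: the closed-form ridge solution adds a constant to the diagonal of $X^\top X$ before solving (all names and data below are illustrative):

```python
import numpy as np

def ridge_coefficients(X, y, lam=1.0):
    """Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y."""
    p = X.shape[1]
    # Adding lam to the diagonal is the "correlation matrix plus a constant" idea above
    # (exact when the columns of X are standardized).
    A = X.T @ X + lam * np.eye(p)
    return np.linalg.solve(A, X.T @ y)

# Tiny illustrative example
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)
print(ridge_coefficients(X, y, lam=10.0))
```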

      Issue with Ridge Regression:

      • While it can have better prediction error than linear regression, it works best when there is a subset of the true coefficients that are small or zero.

      • It never sets coefficients to exactly zero, and therefore cannot perform variable selection in the linear model.


      Apply Ridge Regression:

      R Code:

      Python Code:
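A minimal sketch using scikit-learn's Ridge; the synthetic data and the alpha value are illustrative assumptions, not part of the original note:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Illustrative synthetic data: 5 features, only the first two truly matter
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# alpha is the penalty strength (lambda in the cost function above)
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)

print("Coefficients:", ridge.coef_)       # shrunk toward zero, but not exactly zero
print("Test R^2:", ridge.score(X_test, y_test))
```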


    • Lasso Regression

      Lasso Regression (L1 Penalty)


      Definition and Ideas:

      • In Lasso Regression, the cost function for lasso (least absolute shrinkage and selection operator) regression can be written as:

        $$J_{lasso}(w, b) = \sum_{i=1}^{M}\left(y_i - \hat{y}_i\right)^2 + \lambda\sum_{j=1}^{p} |w_j|$$

        • This is equivalent to minimizing the sum of squared errors under the condition that $\sum_{j=1}^{p} |w_j| \le t$ for some $t > 0$. This type of regularization not only helps in reducing over-fitting, but it can also help us with feature selection.
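A minimal sketch using scikit-learn's Lasso (the synthetic data and alpha are illustrative assumptions), showing how the L1 penalty can drive some coefficients exactly to zero and thereby perform feature selection:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Illustrative synthetic data: only the first two of five features are informative
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# alpha controls the strength of the L1 penalty
lasso = Lasso(alpha=0.1)
lasso.fit(X, y)

# Unlike ridge, lasso can set some coefficients to exactly zero,
# which is why it can be used for feature selection.
print("Coefficients:", lasso.coef_)
```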



Key words:


TAGS