Partial regression plot
In applied statistics, a partial regression plot attempts to show the effect of adding another variable to a model that already has one or more independent variables. Partial regression plots are also referred to as added variable plots, adjusted variable plots, and individual coefficient plots.
Motivation
When performing a linear regression with a single independent variable, a scatter plot of the response variable against the independent variable provides a good indication of the nature of the relationship. With more than one independent variable, things become more complicated, since the independent variables might be (negatively or positively) correlated with one another. Although it can still be useful to generate scatter plots of the response variable against each of the independent variables, such plots do not take into account the effect of the other independent variables in the model. For example, due to omitted variable bias, a simple scatter plot might display a strong positive slope between the response variable Y and an independent variable Xi, even when the true coefficient of Xi in the full multivariate model is negative.
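The sign reversal described above can be reproduced with a small simulation. The following is a minimal sketch using NumPy; the variable names (x1, x2), the coefficients (-1 and 3), and the noise levels are assumptions chosen for illustration and do not come from the sources cited in this article.

```python
import numpy as np

# Hypothetical data: x1 and x2 are positively correlated, and the
# true coefficient of x1 in the full model is negative (-1).
rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x1 = x2 + 0.5 * rng.normal(size=n)
y = -1.0 * x1 + 3.0 * x2 + 0.5 * rng.normal(size=n)

# Slope of the simple regression of y on x1 alone (x2 omitted).
# np.polyfit returns coefficients from highest degree to lowest.
marginal_slope = np.polyfit(x1, y, 1)[0]

# Coefficient of x1 in the full model y ~ 1 + x1 + x2.
A = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
full_coef_x1 = coef[1]

print("simple-regression slope on x1:", marginal_slope)
print("full-model coefficient of x1: ", full_coef_x1)
```

In this construction the simple scatter plot of y against x1 slopes upward because x1 acts as a proxy for the omitted x2, while the coefficient of x1 in the full model stays negative.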
Calculation
Partial regression plots are formed by:
- Computing the residuals from regressing the response variable against all the independent variables except Xi
- Computing the residuals from regressing Xi against the remaining independent variables
- Plotting the residuals from the first step (vertical axis) against the residuals from the second step (horizontal axis).
Velleman and Welsch[1] express this mathematically as:
Y•[i] versus Xi•[i]
where
- Y•[i] = residuals from regressing Y (the response variable) against all the independent variables except Xi
- Xi•[i] = residuals from regressing Xi against the remaining independent variables.
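The calculation above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the procedure of any particular statistics package: the helper names `residuals` and `partial_regression_coords` are hypothetical, the data are simulated, and an intercept is assumed to be included in every regression.

```python
import numpy as np

def residuals(A, b):
    """Residuals from the least squares regression of b on the
    columns of A plus an intercept column."""
    A1 = np.column_stack([np.ones(len(b)), A])
    coef, *_ = np.linalg.lstsq(A1, b, rcond=None)
    return b - A1 @ coef

def partial_regression_coords(X, y, i):
    """Return (Xi•[i], Y•[i]): the horizontal and vertical
    coordinates of the partial regression plot for column i of X."""
    others = np.delete(X, i, axis=1)     # all independent variables except Xi
    y_res = residuals(others, y)         # Y•[i]
    x_res = residuals(others, X[:, i])   # Xi•[i]
    return x_res, y_res

# Simulated example data (assumed values, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

x_res, y_res = partial_regression_coords(X, y, i=0)
```

The two returned arrays can be passed to any scatter plot routine, with `x_res` on the horizontal axis and `y_res` on the vertical axis.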
Properties
Velleman and Welsch[1] list the following useful properties for this plot:
- The least squares linear fit to this plot has an intercept of 0 and a slope of βi, where βi is the estimated regression coefficient for Xi in a regression of Y on all of the covariates.
- The residuals from the least squares linear fit to this plot are identical to the residuals from the least squares fit of the original model (Y against all the independent variables including Xi).
- The influences of individual data values on the estimation of a coefficient are easy to see in this plot.
- It is easy to see many kinds of failures of the model or violations of the underlying assumptions (nonlinearity, heteroscedasticity, unusual patterns).
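The first two properties can be checked numerically. The sketch below uses simulated data and a hypothetical helper `ols`; it assumes the full model contains an intercept, and it compares quantities only up to floating point tolerance.

```python
import numpy as np

def ols(A, b):
    """Least squares fit of b on the columns of A (A must already
    contain any intercept column). Returns (coefficients, residuals)."""
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef, b - A @ coef

# Simulated data with correlated predictors (assumed values).
rng = np.random.default_rng(1)
n, i = 200, 1                      # i is the index of Xi within X
X = rng.normal(size=(n, 3))
X[:, 1] += 0.8 * X[:, 0]
y = 0.5 + X @ np.array([1.0, -2.0, 0.3]) + rng.normal(size=n)

ones = np.ones((n, 1))
full = np.hstack([ones, X])                            # all variables
reduced = np.hstack([ones, np.delete(X, i, axis=1)])   # all except Xi

beta_full, resid_full = ols(full, y)
_, y_res = ols(reduced, y)          # Y residuals, omitting Xi
_, x_res = ols(reduced, X[:, i])    # Xi residuals

# Least squares line through the partial regression plot.
line, resid_plot = ols(np.column_stack([np.ones(n), x_res]), y_res)
intercept, slope = line

print("intercept of fitted line:", intercept)
print("slope vs. full-model coefficient:", slope, beta_full[i + 1])
print("residuals identical:", np.allclose(resid_plot, resid_full))
```

The index `i + 1` is used for the full-model coefficient because column 0 of `full` holds the intercept.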
Partial regression plots are related to, but distinct from, partial residual plots. Partial regression plots are most commonly used to identify data points with high leverage and influential data points that might not have high leverage. Partial residual plots are most commonly used to identify the nature of the relationship between Y and Xi (given the effect of the other independent variables in the model). Because the simple correlation between the two sets of plotted residuals equals the partial correlation between the response variable and Xi, a partial regression plot shows the correct strength of the linear relationship between the response variable and Xi. This is not true of partial residual plots. On the other hand, the x-axis of a partial regression plot is not Xi itself, which limits the plot's usefulness in determining the need for a transformation (the primary purpose of the partial residual plot).
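The equality between the correlation of the plotted residuals and the partial correlation can be illustrated numerically. The sketch below computes the partial correlation by a separate route, from the inverse of the sample covariance matrix of (Y, X1, ..., Xp), and compares it with the correlation of the two residual series. The data are simulated and the helper name `residuals` is hypothetical.

```python
import numpy as np

def residuals(A, b):
    """Residuals from the least squares regression of b on the
    columns of A plus an intercept column."""
    A1 = np.column_stack([np.ones(len(b)), A])
    coef, *_ = np.linalg.lstsq(A1, b, rcond=None)
    return b - A1 @ coef

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(2)
n, i = 300, 0
X = rng.normal(size=(n, 3))
X[:, 0] += 0.7 * X[:, 2]
y = X @ np.array([0.5, 1.0, -1.5]) + rng.normal(size=n)

# Correlation of the two residual series shown in the plot.
others = np.delete(X, i, axis=1)
r_plot = np.corrcoef(residuals(others, X[:, i]),
                     residuals(others, y))[0, 1]

# Partial correlation of Y and Xi given the other variables,
# from the precision matrix P: -P[y, xi] / sqrt(P[y, y] * P[xi, xi]).
P = np.linalg.inv(np.cov(np.column_stack([y, X]), rowvar=False))
r_partial = -P[0, i + 1] / np.sqrt(P[0, 0] * P[i + 1, i + 1])

print("correlation of plotted residuals:", r_plot)
print("partial correlation:             ", r_partial)
```

The two printed values should agree up to rounding error.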
See also
- Partial residual plot
- Partial leverage plot
- Variance inflation factor for a multi-linear fit.
References
Further reading
- Tom Ryan (1997). Modern Regression Methods. John Wiley.
- Neter, Wasserman, and Kutner (1990). Applied Linear Statistical Models (3rd ed.). Irwin.
- Draper, N.R.; Smith, H. (1998). Applied Regression Analysis (3rd ed.). John Wiley. ISBN 0-471-17082-8.
- Cook and Weisberg (1982). Residuals and Influence in Regression. Chapman and Hall. ISBN 0-412-24280-X.
- Belsley, Kuh, and Welsch (1980). Regression Diagnostics. John Wiley. ISBN 0-471-05856-4.
External links
This article incorporates public domain material from the National Institute of Standards and Technology