One of my favorite tricks is adding a constant to each of the independent variables in a regression so as to shift the intercept. Of course just shifting the data will not change R-squared, slopes, F-scores, P-values, etc., so why do it?
Because just about any software package capable of doing regression, even Excel, will report standard errors and confidence intervals for the intercept, but it is much harder to get most packages to produce standard errors and confidence intervals for the predicted value of the dependent variable at other combinations of the independent variables. Shifting the data so that the intercept corresponds to the combination you care about is an easy way to get confidence intervals for arbitrary combinations of the independent variables.
This sort of thing becomes especially important at a time when the statistics community is loudly calling for a move away from p-values and recommending instead that researchers report confidence intervals in clinically meaningful terms.
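Here is a minimal sketch of the trick, assuming Python with statsmodels (the package choice, the simulated data, and the example value x0 = 75 are my own illustrative assumptions, not part of the original post):

```python
# A minimal sketch of the intercept-shifting trick (illustrative data and x0).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(50, 100, size=40)            # independent variable
y = 2.0 + 0.5 * x + rng.normal(0, 3, 40)     # dependent variable

x0 = 75.0                                    # the combination we care about

# Ordinary fit: the intercept's CI refers to x = 0, which is rarely interesting.
plain = sm.OLS(y, sm.add_constant(x)).fit()

# Shifted fit: subtracting x0 makes "x = 0" correspond to the original x = x0.
shifted = sm.OLS(y, sm.add_constant(x - x0)).fit()

# Slope, R-squared, F, and p-values are unchanged; only the intercept moves.
print(np.isclose(plain.rsquared, shifted.rsquared))   # True

# The shifted intercept is the predicted mean of y at x = x0, and its
# confidence interval is the CI for that predicted value.
print(shifted.params[0])        # predicted y at x = 75
print(shifted.conf_int()[0])    # 95% CI for that prediction
```

The same idea works in Excel: add a helper column equal to the original variable minus the value of interest and regress on that column, and the reported intercept and its confidence interval refer to the prediction at that value.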
#data #researchers #statistics #r #excel #regression
✴️ @AI_Python_EN
If you are into statistical analysis, don't miss this paper on variable selection!
#statistical_analysis #regression #variable_selection #model_building #epidemiology
https://onlinelibrary.wiley.com/doi/pdf/10.1002/bimj.201700067
✴️ @AI_Python_EN
Light regression analysis of some Microsoft employees' salary distribution
How a basic knowledge of regression and a couple of graphs can make information look much clearer.
Link: https://onezero.medium.com/leak-of-microsoft-salaries-shows-fight-for-higher-compensation-3010c589b41e
#regression #simple #salary #infographic
Medium: Leak of Microsoft Salaries Shows Fight for Higher Compensation – the numbers range from $40,000 to $320,000 and reveal key details about how pay works at big tech companies.
Wherefore Multivariate Regression?
Multivariate analysis (MVA), in a regression setting, typically implies that a single dependent variable (outcome) is modeled as a function of two or more independent variables (predictors).
There are situations, though, in which we have two or more dependent variables we wish to model simultaneously, multivariate regression being one example. I tend to approach this through a structural equation modeling (SEM) framework but there are several alternatives.
Why not run one #regression for each outcome? There are several reasons, and the excerpt below from Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling (Snijders and Bosker) is a particularly succinct explanation in the context of multilevel models.
"Why analyze multiple dependent variables simultaneously? It is possible to analyze all m dependent variables separately. There are several reasons why it may be sensible to analyze the data jointly, that is, as multivariate data.
1. Conclusions can be drawn about the correlations between the dependent variables – notably, the extent to which the unexplained correlations depend on the individual and on the group level. Such conclusions follow from the partitioning of the covariances between the dependent variables over the levels of analysis.
2. The tests of specific effects for single dependent variables are more powerful in the multivariate analysis. This will be visible in the form of smaller standard errors. The additional power is negligible if the dependent variables are only weakly correlated, but may be considerable if the dependent variables are strongly correlated while at the same time the data are very incomplete, that is, the average number of measurements available per individual is considerably less than m.
3. Testing whether the effect of an explanatory variable on dependent variable Y1 is larger than its effect on Y2, when the data on Y1 and Y2 were observed (totally or partially) on the same individuals, is possible only by means of a multivariate analysis.
4. If one wishes to carry out a single test of the joint effect of an explanatory variable on several dependent variables, then a multivariate analysis is also required. Such a single test can be useful, for example, to avoid the danger of capitalization on chance which is inherent in carrying out a separate test for each dependent variable.
A multivariate analysis is more complicated than separate analyses for each dependent variable. Therefore, when one wishes to analyze several dependent variables, the greater complexity of the multivariate analysis will have to be balanced against the reasons listed above. Often it is advisable to start by analyzing the data for each dependent variable separately."
Source: Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling, Tom Snijders and Roel Bosker
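As a concrete illustration of point 4 in the excerpt above, here is a minimal sketch, assuming Python with statsmodels and pandas (the simulated data, variable names, and effect sizes are my own, not from the book): a multivariate test gives a single joint test of a predictor's effect on several outcomes, which separate per-outcome regressions cannot provide.

```python
# A minimal sketch of a joint (multivariate) test of one predictor on two outcomes.
# Simulated data and variable names are illustrative only.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
# Two correlated outcomes that both depend (weakly) on x.
noise = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
df = pd.DataFrame({
    "x": x,
    "y1": 0.20 * x + noise[:, 0],
    "y2": 0.15 * x + noise[:, 1],
})

# Separate regressions would give one test per outcome; the multivariate test
# assesses the effect of x on (y1, y2) jointly, avoiding capitalization on chance.
mv = MANOVA.from_formula("y1 + y2 ~ x", data=df)
print(mv.mv_test())
```

The coefficients themselves match what separate OLS fits would give; the gain is in the joint inference and in the estimated correlation between the outcomes' residuals.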
✴️ @AI_Python_EN