Preface
1 Introduction
1.1 What Is Regression Analysis?
1.2 Publicly Available Data Sets
1.3 Selected Applications of Regression Analysis
1.3.1 Agricultural Sciences
1.3.2 Industrial and Labor Relations
1.3.3 History
1.3.4 Government
1.3.5 Environmental Sciences
1.4 Steps in Regression Analysis
1.4.1 Statement of the Problem
1.4.2 Selection of Potentially Relevant Variables
1.4.3 Data Collection
1.4.4 Model Specification
1.4.5 Method of Fitting
1.4.6 Model Fitting
1.4.7 Model Criticism and Selection
1.4.8 Objectives of Regression Analysis
1.5 Scope and Organization of the Book
Exercises
2 Simple Linear Regression
2.1 Introduction
2.2 Covariance and Correlation Coefficient
2.3 Example: Computer Repair Data
2.4 The Simple Linear Regression Model
2.5 Parameter Estimation
2.6 Tests of Hypotheses
2.7 Confidence Intervals
2.8 Predictions
2.9 Measuring the Quality of Fit
2.10 Regression Line Through the Origin
2.11 Trivial Regression Models
2.12 Bibliographic Notes
Exercises
3 Multiple Linear Regression
3.1 Introduction
3.2 Description of the Data and Model
3.3 Example: Supervisor Performance Data
3.4 Parameter Estimation
3.5 Interpretations of Regression Coefficients
3.6 Properties of the Least Squares Estimators
3.7 Multiple Correlation Coefficient
3.8 Inference for Individual Regression Coefficients
3.9 Tests of Hypotheses in a Linear Model
3.9.1 Testing All Regression Coefficients Equal to Zero
3.9.2 Testing a Subset of Regression Coefficients Equal to Zero
3.9.3 Testing the Equality of Regression Coefficients
3.9.4 Estimating and Testing of Regression Parameters Under Constraints
3.10 Predictions
3.11 Summary
Exercises
Appendix: Multiple Regression in Matrix Notation
4 Regression Diagnostics: Detection of Model Violations
4.1 Introduction
4.2 The Standard Regression Assumptions
4.3 Various Types of Residuals
4.4 Graphical Methods
4.5 Graphs Before Fitting a Model
4.5.1 One-Dimensional Graphs
4.5.2 Two-Dimensional Graphs
4.5.3 Rotating Plots
4.5.4 Dynamic Graphs
4.6 Graphs After Fitting a Model
4.7 Checking Linearity and Normality Assumptions
4.8 Leverage, Influence, and Outliers
4.8.1 Outliers in the Response Variable
4.8.2 Outliers in the Predictors
4.8.3 Masking and Swamping Problems
4.9 Measures of Influence
4.9.1 Cook’s Distance
4.9.2 Welsch and Kuh Measure
4.9.3 Hadi’s Influence Measure
4.10 The Potential-Residual Plot
4.11 What to Do with the Outliers?
4.12 Role of Variables in a Regression Equation
4.12.1 Added-Variable Plot
4.12.2 Residual Plus Component Plot
4.13 Effects of an Additional Predictor
4.14 Robust Regression
Exercises
5 Qualitative Variables as Predictors
5.1 Introduction
5.2 Salary Survey Data
5.3 Interaction Variables
5.4 Systems of Regression Equations
5.4.1 Models with Different Slopes and Different Intercepts
5.4.2 Models with Same Slope and Different Intercepts
5.4.3 Models with Same Intercept and Different Slopes
5.5 Other Applications of Indicator Variables
5.6 Seasonality
5.7 Stability of Regression Parameters Over Time
Exercises
6 Transformation of Variables
6.1 Introduction
6.2 Transformations to Achieve Linearity
6.3 Bacteria Deaths Due to X-Ray Radiation
6.3.1 Inadequacy of a Linear Model
6.3.2 Logarithmic Transformation for Achieving Linearity
6.4 Transformations to Stabilize Variance
6.5 Detection of Heteroscedastic Errors
6.6 Removal of Heteroscedasticity
6.7 Weighted Least Squares
6.8 Logarithmic Transformation of Data
6.9 Power Transformation
6.10 Summary
Exercises
7 Weighted Least Squares
7.1 Introduction
7.2 Heteroscedastic Models
7.2.1 Supervisors Data
7.2.2 College Expense Data
7.3 Two-Stage Estimation
7.4 Education Expenditure Data
7.5 Fitting a Dose-Response Relationship Curve
Exercises
8 The Problem of Correlated Errors
8.1 Introduction: Autocorrelation
8.2 Consumer Expenditure and Money Stock
8.3 Durbin–Watson Statistic
8.4 Removal of Autocorrelation by Transformation
8.5 Iterative Estimation with Autocorrelated Errors
8.6 Autocorrelation and Missing Variables
8.7 Analysis of Housing Starts
8.8 Limitations of Durbin–Watson Statistic
8.9 Indicator Variables to Remove Seasonality
8.10 Regressing Two Time Series
Exercises
9 Analysis of Collinear Data
9.1 Introduction
9.2 Effects on Inference
9.3 Effects on Forecasting
9.4 Detection of Multicollinearity
9.5 Centering and Scaling
9.5.1 Centering and Scaling in Intercept Models
9.5.2 Scaling in No-Intercept Models
9.6 Principal Components Approach
9.7 Imposing Constraints
9.8 Searching for Linear Functions of the β's
9.9 Computations Using Principal Components
9.10 Bibliographic Notes
Exercises
Appendix: Principal Components
10 Biased Estimation of Regression Coefficients
10.1 Introduction
10.2 Principal Components Regression
10.3 Removing Dependence Among the Predictors
10.4 Constraints on the Regression Coefficients
10.5 Principal Components Regression: A Caution
10.6 Ridge Regression
10.7 Estimation by the Ridge Method
10.8 Ridge Regression: Some Remarks
10.9 Summary
Exercises
Appendix: Ridge Regression
11 Variable Selection Procedures
11.1 Introduction
11.2 Formulation of the Problem
11.3 Consequences of Variables Deletion
11.4 Uses of Regression Equations
11.4.1 Description and Model Building
11.4.2 Estimation and Prediction
11.4.3 Control
11.5 Criteria for Evaluating Equations
11.5.1 Residual Mean Square
11.5.2 Mallows Cp
11.5.3 Information Criteria: Akaike and Other Modified Forms
11.6 Multicollinearity and Variable Selection
11.7 Evaluating All Possible Equations
11.8 Variable Selection Procedures
11.8.1 Forward Selection Procedure
11.8.2 Backward Elimination Procedure
11.8.3 Stepwise Method
11.9 General Remarks on Variable Selection Methods
11.10 A Study of Supervisor Performance
11.11 Variable Selection with Collinear Data
11.12 The Homicide Data
11.13 Variable Selection Using Ridge Regression
11.14 Selection of Variables in an Air Pollution Study
11.15 A Possible Strategy for Fitting Regression Models
11.16 Bibliographic Notes
Exercises
Appendix: Effects of Incorrect Model Specifications
12 Logistic Regression
12.1 Introduction
12.2 Modeling Qualitative Data
12.3 The Logit Model
12.4 Example: Estimating Probability of Bankruptcies
12.5 Logistic Regression Diagnostics
12.6 Determination of Variables to Retain
12.7 Judging the Fit of a Logistic Regression
12.8 Classification Problem: Another Approach
12.8.1 Multinomial Logistic Regression
12.8.2 Example: Determining Chemical Diabetes
12.8.3 Ordered Response Category: Ordinal Logistic Regression
12.8.4 Example: Determining Chemical Diabetes Revisited
Exercises
Appendix A: Statistical Tables
References
Index