Applied Multivariate Statistical Analysis: Pearson New International Edition

Richard A. Johnson | Dean W. Wichern

(2013)

Abstract

For courses in Multivariate Statistics, Marketing Research, Intermediate Business Statistics, Statistics in Education, and graduate-level courses in Experimental Design and Statistics.

Appropriate for experimental scientists in a variety of disciplines, this market-leading text offers a readable introduction to the statistical analysis of multivariate observations. Its primary goal is to impart the knowledge necessary to make proper interpretations and select appropriate techniques for analyzing multivariate data. Ideal for a junior/senior or graduate-level course that explores the statistical methods for describing and analyzing multivariate data, the text assumes two or more statistics courses as a prerequisite.

Table of Contents

Section Title Page
Cover Cover
Table of Contents i
Chapter 1: Aspects of Multivariate Analysis 1
1.1 Introduction 1
1.2 Applications of Multivariate Techniques 3
1.3 The Organization of Data 5
Arrays 5
Descriptive Statistics 6
Graphical Techniques 11
1.4 Data Displays and Pictorial Representations 19
Linking Multiple Two-Dimensional Scatter Plots 20
Graphs of Growth Curves 24
Stars 26
Chernoff Faces 27
1.5 Distance 30
1.6 Final Comments 37
Exercises 37
References 47
Chapter 2: Sample Geometry and Random Sampling 49
2.1 Introduction 49
2.2 The Geometry of the Sample 49
2.3 Random Samples and the Expected Values of the Sample Mean and Covariance Matrix 57
2.4 Generalized Variance 61
Generalized Variance Determined by |R| and its Geometrical Interpretation 72
Another Generalization of Variance 75
2.5 Sample Mean, Covariance, and Correlation as Matrix Operations 75
2.6 Sample Values of Linear Combinations of Variables 78
Exercises 82
References 86
Chapter 3: Matrix Algebra and Random Vectors 87
3.1 Introduction 87
3.2 Some Basics of Matrix and Vector Algebra 87
Vectors 87
3.3 Positive Definite Matrices 98
3.4 A Square-Root Matrix 103
3.5 Random Vectors and Matrices 104
3.6 Mean Vectors and Covariance Matrices 106
Partitioning the Covariance Matrix 111
The Mean Vector and Covariance Matrix for Linear Combinations of Random Variables 113
Partitioning the Sample Mean Vector and Covariance Matrix 115
3.7 Matrix Inequalities and Maximization 116
Supplement 3A: Vectors and Matrices: Basic Concepts 120
Vectors 120
Matrices 125
Exercises 141
References 148
Chapter 4: The Multivariate Normal Distribution 149
4.1 Introduction 149
4.2 The Multivariate Normal Density and its Properties 149
Additional Properties of the Multivariate Normal Distribution 156
4.3 Sampling from a Multivariate Normal Distribution and Maximum Likelihood Estimation 168
The Multivariate Normal Likelihood 168
Maximum Likelihood Estimation of μ and Σ 170
Sufficient Statistics 173
4.4 The Sampling Distribution of X̄ and S 173
Properties of the Wishart Distribution 174
4.5 Large-Sample Behavior of X̄ and S 175
4.6 Assessing the Assumption of Normality 177
Evaluating the Normality of the Univariate Marginal Distributions 177
Evaluating Bivariate Normality 182
4.7 Detecting Outliers and Cleaning Data 187
Steps for Detecting Outliers 189
4.8 Transformations to Near Normality 192
Transforming Multivariate Observations 195
Exercises 200
References 208
Chapter 5: Inferences About a Mean Vector 210
5.1 Introduction 210
5.2 The Plausibility of μ0 as a Value for a Normal Population Mean 210
5.3 Hotelling’s T² and Likelihood Ratio Tests 216
General Likelihood Ratio Method 219
5.4 Confidence Regions and Simultaneous Comparisons of Component Means 220
Simultaneous Confidence Statements 223
A Comparison of Simultaneous Confidence Intervals with One-at-a-Time Intervals 229
The Bonferroni Method of Multiple Comparisons 232
5.5 Large Sample Inferences about a Population Mean Vector 234
5.6 Multivariate Quality Control Charts 239
Charts for Monitoring a Sample of Individual Multivariate Observations for Stability 241
Control Regions for Future Individual Observations 247
Control Ellipse for Future Observations 248
T²-Chart for Future Observations 248
Control Charts Based on Subsample Means 249
Control Regions for Future Subsample Observations 251
5.7 Inferences about Mean Vectors When Some Observations are Missing 251
5.8 Difficulties Due to Time Dependence in Multivariate Observations 256
Supplement 5A: Simultaneous Confidence Intervals and Ellipses as Shadows of the p-Dimensional Ellipsoids 258
Exercises 261
References 272
Chapter 6: Comparisons of Several Multivariate Means 273
6.1 Introduction 273
6.2 Paired Comparisons and a Repeated Measures Design 273
Paired Comparisons 273
A Repeated Measures Design for Comparing Treatments 279
6.3 Comparing Mean Vectors from Two Populations 284
Assumptions Concerning the Structure of the Data 284
Further Assumptions When n1 and n2 are Small 285
Simultaneous Confidence Intervals 288
The Two-Sample Situation When Σ1 ≠ Σ2 291
An Approximation to the Distribution of T² for Normal Populations When Sample Sizes are Not Large 294
6.4 Comparing Several Multivariate Population Means (One-Way Manova) 296
Assumptions about the Structure of the Data for One-Way Manova 296
A Summary of Univariate Anova 297
Multivariate Analysis of Variance (Manova) 301
6.5 Simultaneous Confidence Intervals for Treatment Effects 308
6.6 Testing for Equality of Covariance Matrices 310
6.7 Two-Way Multivariate Analysis of Variance 312
Univariate Two-Way Fixed-Effects Model with Interaction 312
Multivariate Two-Way Fixed-Effects Model with Interaction 315
6.8 Profile Analysis 323
6.9 Repeated Measures Designs and Growth Curves 328
6.10 Perspectives and a Strategy for Analyzing Multivariate Models 332
Exercises 337
References 358
Chapter 7: Multivariate Linear Regression Models 360
7.1 Introduction 360
7.2 The Classical Linear Regression Model 360
7.3 Least Squares Estimation 364
Sum-of-Squares Decomposition 366
Geometry of Least Squares 367
7.4 Inferences About the Regression Model 370
Inferences Concerning the Regression Parameters 370
Likelihood Ratio Tests for the Regression Parameters 374
7.5 Inferences from the Estimated Regression Function 378
Estimating the Regression Function at Z0 378
Forecasting a New Observation at Z0 379
7.6 Model Checking and Other Aspects of Regression 381
Does the Model Fit? 381
Leverage and Influence 384
Additional Problems in Linear Regression 384
7.7 Multivariate Multiple Regression 387
Other Multivariate Test Statistics 398
Predictions from Multivariate Multiple Regressions 399
7.8 The Concept of Linear Regression 401
7.9 Comparing the Two Formulations of the Regression Model 410
Mean Corrected Form of the Regression Model 410
Relating the Formulations 412
7.10 Multiple Regression Models with Time Dependent Errors 413
Supplement 7A: The Distribution of the Likelihood Ratio for the Multivariate Multiple Regression Model 418
Exercises 420
References 428
Chapter 8: Principal Components 430
8.1 Introduction 430
8.2 Population Principal Components 430
Principal Components for Covariance Matrices with Special Structures 439
8.3 Summarizing Sample Variation by Principal Components 441
The Number of Principal Components 444
Interpretation of the Sample Principal Components 448
Standardizing the Sample Principal Components 449
8.4 Graphing the Principal Components 454
8.5 Large Sample Inferences 456
Large Sample Properties of λi and ei 456
Testing for the Equal Correlation Structure 457
8.6 Monitoring Quality with Principal Components 459
Checking a Given Set of Measurements for Stability 459
Controlling Future Values 463
Supplement 8A: The Geometry of the Sample Principal Component Approximation 466
The p-Dimensional Geometrical Interpretation 468
The n-Dimensional Geometrical Interpretation 469
Exercises 470
References 480
Chapter 9: Factor Analysis and Inference for Structured Covariance Matrices 481
9.1 Introduction 481
9.2 The Orthogonal Factor Model 482
9.3 Methods of Estimation 488
The Principal Component (and Principal Factor) Method 488
A Modified Approach—the Principal Factor Solution 494
The Maximum Likelihood Method 495
A Large Sample Test for the Number of Common Factors 501
9.4 Factor Rotation 504
Oblique Rotations 512
9.5 Factor Scores 513
The Weighted Least Squares Method 514
The Regression Method 516
9.6 Perspectives and a Strategy for Factor Analysis 519
Supplement 9A: Some Computational Details for Maximum Likelihood Estimation 527
Recommended Computational Scheme 528
Maximum Likelihood Estimators of ρ = L_z L_z′ + Ψ_z 529
Exercises 530
References 538
Chapter 10: Canonical Correlation Analysis 539
10.1 Introduction 539
10.2 Canonical Variates and Canonical Correlations 539
10.3 Interpreting the Population Canonical Variables 545
Identifying the Canonical Variables 545
Canonical Correlations as Generalizations of Other Correlation Coefficients 547
The First r Canonical Variables as a Summary of Variability 548
A Geometrical Interpretation of the Population Canonical Correlation Analysis 549
10.4 The Sample Canonical Variates and Sample Canonical Correlations 550
10.5 Additional Sample Descriptive Measures 558
Matrices of Errors of Approximations 558
Proportions of Explained Sample Variance 561
10.6 Large Sample Inferences 563
Exercises 567
References 574
Chapter 11: Discrimination and Classification 575
11.1 Introduction 575
11.2 Separation and Classification for Two Populations 576
11.3 Classification with Two Multivariate Normal Populations 584
Classification of Normal Populations When Σ1 = Σ2 = Σ 584
Scaling 589
Fisher’s Approach to Classification with Two Populations 590
Is Classification a Good Idea? 592
Classification of Normal Populations When Σ1 ≠ Σ2 593
11.4 Evaluating Classification Functions 596
11.5 Classification with Several Populations 606
The Minimum Expected Cost of Misclassification Method 606
Classification with Normal Populations 609
11.6 Fisher’s Method for Discriminating among Several Populations 621
Using Fisher’s Discriminants to Classify Objects 628
11.7 Logistic Regression and Classification 634
Introduction 634
The Logit Model 634
Logistic Regression Analysis 636
Classification 638
Logistic Regression with Binomial Responses 640
11.8 Final Comments 644
Including Qualitative Variables 644
Classification Trees 644
Neural Networks 647
Selection of Variables 648
Testing for Group Differences 648
Graphics 649
Practical Considerations Regarding Multivariate Normality 649
Exercises 650
References 669
Chapter 12: Clustering, Distance Methods and Ordination 671
12.1 Introduction 671
12.2 Similarity Measures 673
Distances and Similarity Coefficients for Pairs of Items 673
Similarities and Association Measures for Pairs of Variables 677
Concluding Comments on Similarity 678
12.3 Hierarchical Clustering Methods 680
Single Linkage 682
Complete Linkage 685
Average Linkage 690
Ward’s Hierarchical Clustering Method 692
Final Comments—Hierarchical Procedures 695
12.4 Nonhierarchical Clustering Methods 696
K-means Method 696
Final Comments—Nonhierarchical Procedures 701
12.5 Clustering Based on Statistical Models 703
12.6 Multidimensional Scaling 706
12.7 Correspondence Analysis 716
Algebraic Development of Correspondence Analysis 718
12.8 Biplots for Viewing Sampling Units and Variables 726
Constructing Biplots 727
12.9 Procrustes Analysis: A Method for Comparing Configurations 732
Constructing the Procrustes Measure of Agreement 733
Supplement 12A: Data Mining 740
Introduction 740
The Data Mining Process 741
Model Assessment 742
Exercises 747
References 755
Selected Additional References for Model Based Clustering 756
Appendix 757
Table 1: Standard Normal Probabilities 758
Table 2: Student’s t-Distribution Percentage Points 759
Table 3: χ² Distribution Percentage Points 760
Table 4: F-Distribution Percentage Points (α = .10) 761
Table 5: F-Distribution Percentage Points (α = .05) 762
Table 6: F-Distribution Percentage Points (α = .01) 763
Index 765