# Scree plot

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Scree_plot
> Markdown URL: https://mediated.wiki/source/Scree_plot.md
> Source: https://en.wikipedia.org/wiki/Scree_plot
> Source revision: 1333488221
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)

Diagnostic plot in multivariate statistics

A sample scree plot produced in [R](/source/R_(programming_language)). The [Kaiser criterion](/source/Kaiser_criterion) is shown in red.

In [multivariate statistics](/source/Multivariate_statistics), a **scree plot** is a line plot of the [eigenvalues](/source/Eigenvalue) of [factors](/source/Factor_analysis) or [principal components](/source/Principal_component) in an analysis.[1] The scree plot is used to determine how many factors to retain in an [exploratory factor analysis](/source/Exploratory_factor_analysis) (FA) or how many principal components to keep in a [principal component analysis](/source/Principal_component_analysis) (PCA). The procedure of finding statistically significant factors or components using a scree plot is also known as a **scree test**. [Raymond B. Cattell](/source/Raymond_B._Cattell) introduced the scree plot in 1966.[2]

A scree plot always displays the eigenvalues in a downward curve, ordering them from largest to smallest. In the scree test, one locates the ["elbow" of the graph](/source/Elbow_of_a_curve), the point where the eigenvalues seem to level off, and retains the factors or components to the left of this point as significant.[3]
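The values behind such a plot can be sketched in a few lines. The example below is a minimal illustration, not a prescribed procedure: it assumes a PCA on the correlation matrix, uses synthetic data, and the helper name `scree_eigenvalues` is hypothetical. It also counts the components with eigenvalue greater than 1, which is the Kaiser criterion.

```python
import numpy as np

def scree_eigenvalues(data):
    """Eigenvalues of the correlation matrix of `data`, largest first.

    `data` has one row per observation and one column per variable.
    """
    corr = np.corrcoef(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)  # ascending order for symmetric input
    return eigvals[::-1]                # a scree plot orders them descending

# Synthetic data (an assumption for illustration): 200 observations of
# 6 variables driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
data = latent @ loadings + 0.5 * rng.normal(size=(200, 6))

eigvals = scree_eigenvalues(data)
n_kaiser = int(np.sum(eigvals > 1.0))  # Kaiser criterion: eigenvalue > 1

for k, ev in enumerate(eigvals, start=1):
    print(f"component {k}: eigenvalue {ev:.3f}")
print("components with eigenvalue > 1:", n_kaiser)
```

Plotting `eigvals` against the component number 1, 2, ... (for example with matplotlib's `plt.plot`) gives the scree plot itself. Because the eigenvalues come from a correlation matrix, they sum to the number of variables.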

The "elbow" has been defined as the point of maximum curvature, a definition that led to the creation of the Kneedle algorithm.[4]
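An automated elbow search can be sketched as follows. This is a deliberately simplified heuristic in the spirit of knee detection, not the published Kneedle algorithm (it omits smoothing and thresholding): both axes are rescaled to [0, 1] and the elbow is taken to be the point that falls furthest below the straight line joining the first and last eigenvalues. The function name `elbow_index` and the eigenvalues are hypothetical.

```python
def elbow_index(eigenvalues):
    """0-based index of the elbow of a descending eigenvalue sequence.

    Simplified heuristic: normalize both axes to [0, 1] and return the
    point furthest below the chord from the first to the last point.
    """
    n = len(eigenvalues)
    if n < 3:
        raise ValueError("need at least three eigenvalues")
    top, bottom = eigenvalues[0], eigenvalues[-1]
    span = top - bottom
    if span <= 0:
        raise ValueError("eigenvalues must decrease from first to last")

    gaps = []
    for i, ev in enumerate(eigenvalues):
        x = i / (n - 1)              # normalized component position
        y = (ev - bottom) / span     # normalized eigenvalue
        gaps.append((1.0 - x) - y)   # vertical gap below the chord y = 1 - x
    return max(range(n), key=gaps.__getitem__)

# Illustrative eigenvalues (made up for this example).
eigenvalues = [4.0, 2.0, 0.6, 0.5, 0.4, 0.3, 0.2]
elbow = elbow_index(eigenvalues)
print("elbow at component", elbow + 1)
print("components retained (left of the elbow):", elbow)
```

For these values the elbow falls at the third component, so the scree test would retain the two components to its left.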

## Etymology

A scree at [Mount Yamnuska](/source/Mount_Yamnuska), [Alberta](/source/Alberta), [Canada](/source/Canada)

The scree plot is named after the elbow's resemblance to a [scree](/source/Scree) in nature.

## Criticism

This test is sometimes criticized for its subjectivity. Scree plots can have multiple "elbows" that make it difficult to know the correct number of factors or components to retain, making the test [unreliable](/source/Reliability_(statistics)). There is also no standard for the scaling of the x and y axes, which means that different statistical programs can produce different plots from the same data.[5]
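One way to see why axis scaling matters is to treat the elbow as the point of maximum curvature of the drawn curve and then stretch the vertical axis. The sketch below is an illustration under stated assumptions: curvature is estimated with finite differences on unit-spaced components, the `y_scale` parameter stands in for a different plot aspect ratio, and the eigenvalues are made up.

```python
def max_curvature_index(eigenvalues, y_scale=1.0):
    """0-based index of the interior point with the largest discrete curvature.

    `y_scale` stretches the vertical axis relative to the horizontal one,
    mimicking the same data drawn with a different aspect ratio.
    """
    if len(eigenvalues) < 3:
        raise ValueError("need at least three eigenvalues")
    y = [y_scale * ev for ev in eigenvalues]
    best_i, best_k = None, -1.0
    for i in range(1, len(y) - 1):
        d1 = (y[i + 1] - y[i - 1]) / 2.0        # central first difference
        d2 = y[i + 1] - 2.0 * y[i] + y[i - 1]   # second difference
        k = abs(d2) / (1.0 + d1 * d1) ** 1.5    # curvature of a plane curve
        if k > best_k:
            best_i, best_k = i, k
    return best_i

# Illustrative eigenvalues (made up for this example).
eigenvalues = [5.0, 2.5, 1.5, 1.0, 0.7, 0.5, 0.4, 0.35]
print(max_curvature_index(eigenvalues, y_scale=1.0))   # index 2
print(max_curvature_index(eigenvalues, y_scale=10.0))  # index 6
```

With the vertical axis stretched tenfold, the point of maximum curvature moves from the third component to the seventh, even though the underlying eigenvalues are unchanged.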

The test has also been criticized for retaining too few factors or components.[1]

## See also

Wikimedia Commons has media related to [Scree plot](https://commons.wikimedia.org/wiki/Category:Scree_plot).

- [Biplot](/source/Biplot)

- [Parallel analysis](/source/Parallel_analysis)

- [Elbow method](/source/Elbow_method_(clustering))

- [Determining the number of clusters in a data set](/source/Determining_the_number_of_clusters_in_a_data_set)

## References

1. George Thomas Lewith; Wayne B. Jonas; Harald Walach (23 November 2010). [*Clinical Research in Complementary Therapies: Principles, Problems and Solutions*](https://books.google.com/books?id=CSNw-spnFdkC&pg=PA354). Elsevier Health Sciences. p. 354. [ISBN](/source/ISBN_(identifier)) [978-0-7020-4916-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-7020-4916-3).

2. Cattell, Raymond B. (1966). "The Scree Test For The Number Of Factors". *Multivariate Behavioral Research*. **1** (2): 245–276. [doi](/source/Doi_(identifier)):[10.1207/s15327906mbr0102_10](https://doi.org/10.1207%2Fs15327906mbr0102_10). [PMID](/source/PMID_(identifier)) [26828106](https://pubmed.ncbi.nlm.nih.gov/26828106).

3. Alex Dmitrienko; [Christy Chuang-Stein](/source/Christy_Chuang-Stein); [Ralph B. D'Agostino](/source/Ralph_B._D'Agostino) (2007). [*Pharmaceutical Statistics Using SAS: A Practical Guide*](https://books.google.com/books?id=FlXwIvSHND8C&pg=PA380). SAS Institute. p. 380. [ISBN](/source/ISBN_(identifier)) [978-1-59994-357-2](https://en.wikipedia.org/wiki/Special:BookSources/978-1-59994-357-2).

4. Satopaa, Ville; Albrecht, Jeannie; Irwin, David; Raghavan, Barath (2011-06-20). *Finding a "kneedle" in a haystack: Detecting knee points in system behavior*. 2011 31st International Conference on Distributed Computing Systems Workshops. [Institute of Electrical and Electronics Engineers](/source/Institute_of_Electrical_and_Electronics_Engineers). pp. 166–171. [doi](/source/Doi_(identifier)):[10.1109/ICDCSW.2011.20](https://doi.org/10.1109%2FICDCSW.2011.20).

5. Norman, Geoffrey R.; Streiner, David L. (15 September 2007). [*Biostatistics: The bare essentials*](https://books.google.com/books?id=8rkqWafdpuoC&pg=PA201). PMPH-USA. p. 201. [ISBN](/source/ISBN_(identifier)) [978-1-55009-400-8](https://en.wikipedia.org/wiki/Special:BookSources/978-1-55009-400-8) – via Google Books.


---
Adapted from the Wikipedia article [Scree plot](https://en.wikipedia.org/wiki/Scree_plot) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Scree_plot?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.
