This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Sum of squares explains in more depth that the reason for dividing the sum of squares by n − 1 is to keep the variance estimate from growing linearly as more samples are gathered.
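As a quick illustration of that point (a minimal sketch with made-up samples, not taken from either article): the raw sum of squared deviations grows roughly linearly with the sample size, while dividing by n − 1 gives a stable estimate of the variance.

```python
import random

random.seed(0)

def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

# Draw increasingly large samples from the same N(0, 1) distribution.
# The sum of squares grows with n; the n-1 divisor keeps the
# variance estimate near the true value of 1.
for n in (100, 1000, 10000):
    xs = [random.gauss(0, 1) for _ in range(n)]
    ss = sum_sq_dev(xs)
    print(n, round(ss, 1), round(ss / (n - 1), 3))
```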
You don't believe that Sum of squares overlaps with the Least squares and regression analysis section of Least squares?
In my experience, when people use the term "sums of squares", they use it in reference to ANOVA or to fitting a regression model by the Least squares method; when talking about variation, they instead use terms like "random variable", "standard deviation", "variance", or "range".
Personally, I'd expect an article on sums of squares to discuss the following instead of variance:
Sums of squares due to regression
Sums of squares due to error
Total sum of squares
Sums of squares due to a given ANOVA factor
The relationship between the above and ANOVA and regression models
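In standard notation (assuming observations $y_i$, fitted values $\hat{y}_i$, and sample mean $\bar{y}$), the first three items on that list would be:

```latex
\mathrm{SST} = \sum_{i=1}^n (y_i - \bar{y})^2, \qquad
\mathrm{SSR} = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2, \qquad
\mathrm{SSE} = \sum_{i=1}^n (y_i - \hat{y}_i)^2,
```

with the partition $\mathrm{SST} = \mathrm{SSR} + \mathrm{SSE}$ holding for a least-squares fit of a model that includes a constant term.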
I don't think this should get merged with least squares. The least squares article could reasonably neglect some topics that belong here: how to partition the sum of squares in complicated experimental designs. Michael Hardy 17:49, 3 October 2006 (UTC)[reply]
This page is so narrowly focused as to be an embarrassment. I've added the expert needed tag -- don't have time to rewrite myself (way too many other stat-related pages need rewriting). In the meantime, anyone interested in really learning about sums-of-squares should find a good book -- Scheffe mentioned above, books on experimental design or ANOVA, or to really understand the mathematics, books on the geometry of least squares and linear models.
Is there a name for the root sum of squares? Like "geometric mean" except not that, obviously. Geometric sum or something? Is there a Wikipedia article on it? —Preceding unsigned comment added by 71.167.61.18 (talk) 19:16, 4 November 2009 (UTC)[reply]
This article could usefully be expanded to include a section on the matrix generalization of sums of squares to XᵀX, the sums of squares and cross-products matrix. Or should that be a separate article? --Qwfp (talk) 11:45, 20 February 2011 (UTC)[reply]
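A sketch of what such a section might describe (the design matrix X here is a made-up example): for a data matrix X, the matrix XᵀX collects sums of squares on its diagonal and cross-products off it.

```python
import numpy as np

# Hypothetical small design matrix: 4 observations, 2 variables.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])

# The sums-of-squares-and-cross-products (SSCP) matrix.
S = X.T @ X

# Diagonal entries are the sums of squares of each column;
# off-diagonal entries are the cross-products between columns.
print(S)
print(np.allclose(np.diag(S), (X ** 2).sum(axis=0)))  # True
```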
The following discussion is an archived discussion of the proposal. Please do not modify it. Subsequent comments should be made in a new section on the talk page. No further edits should be made to this section.
Comment if moved, it definitely should be replaced by a disambiguation page. That hatnote is enormous. And there's least squares from the first section of this talk page, that is missing from the hatnote as well. Not to mention the method of integration... 65.93.15.125 (talk) 12:31, 25 February 2011 (UTC)[reply]
Partition of sums of squares or Partition of a sum of squares could also be considered. So could Sum of squares (statistics). And there is the concept of a sum of squares in number theory. A "sum of squares" disambiguation page could link to several statistics articles and several number theory articles. Michael Hardy (talk) 16:11, 25 February 2011 (UTC)[reply]
Support suggested alternative of Partition of sums of squares as this reflects content, while Sum of squares (statistics) is too general and more suited to a disambig page, given the number of stats articles listed in the new Sum of squares (disambiguation). Melcombe (talk) 10:29, 2 March 2011 (UTC)[reply]
The above discussion is preserved as an archive of the proposal. Please do not modify it. Subsequent comments should be made in a new section on this talk page. No further edits should be made to this section.
In the section "Partitioning..." with calculations, the variables
$\hat{y}_i$, $\beta_i$, and $\epsilon_i$ are undefined. I guess you can find them in the
pages this page points to, but this is very inconvenient.
Should there be "Here $\beta_1, \ldots, \beta_p$ are the estimated coefficients of
the regression model, $\hat{y}_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}$ is the predicted
value, and $\epsilon_i = y_i - \hat{y}_i$ is the residual"? I am not completely familiar with the terminology of this field, so I leave making this change to someone who is.
The following parts must also be proven. They are unclear to many, including me.
1. The requirement that the model includes a constant, or equivalently that the design matrix contains a column of 1s, ensures that $\sum_{i=1}^n \hat{\epsilon}_i = 0$.
2. For all $j = 1, \ldots, p$: $\sum_{i=1}^n \hat{\epsilon}_i x_{ij} = 0$. 212.174.38.3 (talk) 07:40, 21 November 2016 (UTC)[reply]
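A numerical check of the two claims, with made-up data (this is not a proof, but it shows what the least-squares fit enforces when the design matrix contains a column of 1s):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n observations, p = 2 predictors, plus noise.
n, p = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # column of 1s
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

# Least-squares fit and residuals.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# Because the design matrix contains a column of 1s, the residuals
# sum to zero; they are also orthogonal to every predictor column.
print(resid.sum())   # ~0
print(X.T @ resid)   # all entries ~0
```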
No: p isn't the number of groups, it's the number of predictors, which is immaterial to the proof. The inclusion of p in the theorem is merely to be explicit about what is meant by "a linear regression model ... including a constant". It could simply link to Linear regression#Introduction for the definition, at the expense of making this article less self-contained. Qwfp (talk) 19:36, 9 September 2018 (UTC)[reply]
Agree with the sentiment, but oppose the merge on the grounds that the scope of the articles is sufficiently different to warrant separate discussion. Squared deviations from the mean focusses on applications in statistics, which are important, whereas Partition of sums of squares covers the topic more broadly, with examples from linear regression. Their readerships are different, and best served by keeping the pages separate. Klbrain (talk) 20:05, 9 March 2023 (UTC)[reply]
It is completely unclear to me why these sums should be 0.
Geometrically I can understand that $\hat{y}$ is the projection of $y$ onto the space spanned by the x's, and hence the residual is perpendicular to each x, but analytically it is not at all clear.
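A sketch of the standard analytic argument (using the notation from the numbered points above): the least-squares coefficients minimize the residual sum of squares, so its partial derivatives vanish at the minimizer, which yields exactly those two sums.

```latex
% Residual sum of squares as a function of the coefficients:
% S(\beta) = \sum_{i=1}^n \bigl(y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip}\bigr)^2.
% Setting its partial derivatives to zero at the minimizer \hat\beta:
\frac{\partial S}{\partial \beta_0}\Big|_{\hat\beta}
  = -2 \sum_{i=1}^n \hat{\epsilon}_i = 0,
\qquad
\frac{\partial S}{\partial \beta_j}\Big|_{\hat\beta}
  = -2 \sum_{i=1}^n \hat{\epsilon}_i x_{ij} = 0
  \quad (j = 1, \ldots, p).
```

The first equation is exactly where the constant term (the column of 1s) enters: without $\beta_0$ in the model there is no derivative forcing $\sum_i \hat{\epsilon}_i = 0$.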