The least squares method is defined as “a method of estimating value from a set of observations by minimizing the sum of the squares of the differences between the observations and values to be found.”

Assumptions of the least squares method:

Two assumptions are made in using the method of least squares. The first is that the measured response and the standard analyte concentration actually have a linear relationship. The mathematical relationship describing this assumption is called the regression model, which may be represented as

$$y = mx + b$$


where b is the y intercept (the value of y when x is zero) and m is the slope of the line. The second assumption is that any deviation of the individual points from the straight line results from error in the measurement.

That is, we assume that there is no error in the x values of the points (the concentrations). Both assumptions are appropriate for many analytical methods, but bear in mind that when there is substantial uncertainty in the x data, simple linear least squares analysis may not give the best straight line.

Finding the least squares line:

As illustrated in figure 1, the vertical deviation of each point from the straight line is called a residual. The line generated by the method of least squares is the one that minimizes the sum of the squares of the residuals for all points.

In addition to providing the best fit between the experimental points and the straight line, the method gives the standard deviations for m and b.
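The idea of a residual can be sketched in a few lines of Python. This is only an illustration: the data set and the function names `residuals` and `sum_sq_residuals` are made up for this example, and the values 0.6 and 2.2 are the least squares slope and intercept for this particular data set. The second line (slope 1, intercept 1) is an arbitrary alternative chosen for comparison.

```python
def residuals(x, y, m, b):
    """Vertical deviation y_i - (m*x_i + b) of each point from the line."""
    return [yi - (m * xi + b) for xi, yi in zip(x, y)]


def sum_sq_residuals(x, y, m, b):
    """Sum of the squared residuals for the line y = m*x + b."""
    return sum(r * r for r in residuals(x, y, m, b))


# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

ss_ls = sum_sq_residuals(x, y, 0.6, 2.2)     # least squares line for these data
ss_other = sum_sq_residuals(x, y, 1.0, 1.0)  # an arbitrary alternative line

print(ss_ls, ss_other)
```

For these points the least squares line gives a sum of squared residuals of about 2.4, while the alternative line gives 4.0, so the alternative fits worse in the least squares sense.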

The least squares method finds the sum of the squares of the residuals, $SS_{resid}$, and minimizes it using the minimization technique of calculus. The value of $SS_{resid}$ is found from

$$SS_{resid} = \sum_{i=1}^{N} \left[ y_i - (b + m x_i) \right]^2$$

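where N is the number of points used. The calculation of the slope and intercept is simplified when three quantities, $S_{xx}$, $S_{yy}$, and $S_{xy}$, are defined as follows:

$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{N}$$

$$S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{N}$$

$$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\sum x_i \sum y_i}{N}$$

where $x_i$ and $y_i$ are individual pairs of data for x and y, N is the number of pairs, and $\bar{x}$ and $\bar{y}$ are the average values of x and y; that is, $\bar{x} = \sum x_i / N$ and $\bar{y} = \sum y_i / N$.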
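As a rough sketch of how these three quantities lead to the line, the Python function below computes them from the deviations about the means and then applies the standard results slope = S_xy / S_xx and intercept = mean(y) - slope * mean(x). Those two formulas are not stated in the text above; they are the usual outcome of setting the partial derivatives of the sum of squared residuals with respect to m and b to zero. The function name `least_squares_line` and the data are assumptions made for this example.

```python
def least_squares_line(x, y):
    """Return (s_xx, s_yy, s_xy, m, b) for paired data x, y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Sums of squared deviations and cross products about the means.
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_yy = sum((yi - y_mean) ** 2 for yi in y)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    m = s_xy / s_xx          # slope (fails if all x values are equal)
    b = y_mean - m * x_mean  # intercept
    return s_xx, s_yy, s_xy, m, b


# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

s_xx, s_yy, s_xy, m, b = least_squares_line(x, y)
print(s_xx, s_yy, s_xy, m, b)
```

For this data set the means are 3 and 4, so S_xx = 10, S_yy = 6, and S_xy = 6, which gives a slope of about 0.6 and an intercept of about 2.2.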


The closer the data points are to the line predicted by a least squares analysis, the smaller the residuals are. The sum of the squares of the residuals, $SS_{resid}$, measures the variation in the observed values of the dependent variable (y values) that is not explained by the presumed linear relationship between x and y.


The total sum of the squares, $SS_{tot} = S_{yy} = \sum (y_i - \bar{y})^2$, is a measure of the total variation in the observed values of y because the deviations are measured from the mean value of y.

An important quantity called the coefficient of determination ($R^2$) measures the fraction of the observed variation in y that is explained by the linear relationship and is given by:

$$R^2 = 1 - \frac{SS_{resid}}{SS_{tot}}$$

The closer $R^2$ is to unity, the better the linear model explains the y variations. The difference between $SS_{tot}$ and $SS_{resid}$ is the sum of the squares due to regression, $SS_{regr}$.

In contrast to $SS_{resid}$, $SS_{regr}$ is a measure of the explained variation.

$$SS_{regr} = SS_{tot} - SS_{resid}$$

$$R^2 = \frac{SS_{regr}}{SS_{tot}}$$
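The relationship between the three sums of squares can be checked numerically. The Python sketch below assumes a line with slope m and intercept b has already been fitted. The function name `sums_of_squares` and the data are hypothetical, and the slope 0.6 and intercept 2.2 are the least squares values for this particular data set.

```python
def sums_of_squares(x, y, m, b):
    """Return (ss_resid, ss_tot, ss_regr, r_squared) for the line y = m*x + b."""
    n = len(y)
    y_mean = sum(y) / n

    # Unexplained variation: squared deviations from the fitted line.
    ss_resid = sum((yi - (b + m * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation: squared deviations from the mean of y.
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    # Explained variation: the difference between the two.
    ss_regr = ss_tot - ss_resid

    r_squared = ss_regr / ss_tot  # equal to 1 - ss_resid / ss_tot
    return ss_resid, ss_tot, ss_regr, r_squared


# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

ss_resid, ss_tot, ss_regr, r_squared = sums_of_squares(x, y, 0.6, 2.2)
print(ss_resid, ss_tot, ss_regr, r_squared)
```

Here the residual sum of squares is about 2.4 out of a total of 6, so roughly 60% of the variation in y is explained by the line. Note that the identity between the two forms of the coefficient of determination holds for any line, but the value only has its usual interpretation when m and b are the least squares estimates.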
