# Linear Regression

Linear regression fits a straight line to the selected data using a method
called the *Sum Of Least Squares*.

## Sum Of Least Squares

The Sum Of Least Squares method provides an objective measure for comparing a number of straight lines to find the one that best fits the selected data.

1. List each data point in a table
2. Calculate the distance between each data point and the proposed straight line (the actual value minus the value given by the line)
3. Square the distances (to remove negative values)
4. Calculate the sum of the squares
5. Repeat steps 2 to 4 for each possible line
6. Select the line with the lowest sum of squares (from step 4)

### Example

The table below demonstrates how the sum of squares is calculated for a line where

Price = 20.50 + 0.11 * n

and *n* is the day number.

| Date | n | Closing Price | Price (20.50 + 0.11 * n) | Distance | Squared |
|---|---|---|---|---|---|
| 13-Feb | 1 | 20.55 | 20.61 | -0.06 | 0.0036 |
| 14-Feb | 2 | 20.80 | 20.72 | 0.08 | 0.0064 |
| 15-Feb | 3 | 20.95 | 20.83 | 0.12 | 0.0144 |
| 16-Feb | 4 | 20.78 | 20.94 | -0.16 | 0.0256 |
| 17-Feb | 5 | 21.10 | 21.05 | 0.05 | 0.0025 |
| Sum | | | | | 0.0525 |

The sum of squares is calculated for each possible line and the line with the lowest sum is selected.
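As a rough illustration, the comparison can be sketched in Python. The first candidate line is the one from the example; the other two candidates, and the names `sum_of_squares` and `candidates`, are assumptions made up for this sketch.

```python
# Day numbers and closing prices taken from the example table.
days = [1, 2, 3, 4, 5]
closes = [20.55, 20.80, 20.95, 20.78, 21.10]


def sum_of_squares(a, b, xs, ys):
    """Sum of squared distances between each point and the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Candidate lines as (a, b) pairs. Only the first comes from the example;
# the other two are hypothetical alternatives chosen for comparison.
candidates = [(20.50, 0.11), (20.40, 0.15), (20.60, 0.08)]

for a, b in candidates:
    print(f"a={a:.2f}, b={b:.2f}: {sum_of_squares(a, b, days, closes):.4f}")

# Select the candidate with the lowest sum of squares.
best = min(candidates, key=lambda line: sum_of_squares(line[0], line[1], days, closes))
print("Best candidate:", best)
```

Of these three candidates, the line from the example has the lowest sum (0.0525). That only makes it the best of the lines tried, not necessarily the best possible line.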

## Mathematical Formula

Manually calculating the sum of squares for each possible line would be enormously time-consuming. Fortunately there is a quicker way.

The formula for a straight line is

y = a + bx

For our purposes:

- **y** is the price
- **x** is the date
- **a** is the constant (the value when x equals zero)
- **b** is the slope of the line

The formula for calculating the line of best fit is

b = ( nΣxy - ΣxΣy ) / ( nΣx² - (Σx)² )

a = ( Σy - bΣx ) / n

Where *n* is the number of data points selected (for the five-day example above, n = 5).
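The two formulas translate directly into a few lines of Python. This is a minimal sketch, not a production implementation: the function name `fit_line` is an assumption, and the data are the five closing prices from the example, with the day number used as x.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        # Happens when all x values are identical (or there is one point).
        raise ValueError("cannot fit a line: x values do not vary")

    b = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    a = (sum_y - b * sum_x) / n                     # constant
    return a, b


# Day numbers and closing prices from the example table.
days = [1, 2, 3, 4, 5]
closes = [20.55, 20.80, 20.95, 20.78, 21.10]

a, b = fit_line(days, closes)
print(f"Price = {a:.3f} + {b:.3f} * n")
```

For this data the result is approximately a = 20.512 and b = 0.108, close to the line used in the example (20.50 + 0.11 * n). Its sum of squares is about 0.0523, slightly lower than the example's 0.0525.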