Hamish Burke | 2025-03-20
Related to: #bigData
PCA
- linear projection of high dimensional data
- reduce dimensions from $D$ to $M$ (where $M < D$)
- orthogonal: the dot product between the two vectors is 0
    - i.e. perpendicular
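A quick numpy check of the orthogonality definition (the example vectors are my own):

```python
import numpy as np

# Two perpendicular vectors in 2-D (hypothetical example)
u = np.array([1.0, 2.0])
v = np.array([-2.0, 1.0])

# orthogonal <=> dot product is 0
print(np.dot(u, v))  # 0.0
```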
- covariance between two variables tells us how strongly they are linearly correlated: $\operatorname{cov}(x, y) = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})$
- As we centred the data, the mean is 0 for both $x$ and $y$ (simplifies the above equation to $\operatorname{cov}(x, y) = \frac{1}{N}\sum_{i=1}^{N} x_i y_i$)
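A minimal numpy sketch of the simplified covariance on centred data (the example values are my own):

```python
import numpy as np

# Two example variables (hypothetical data)
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Centre each variable so its mean is 0
xc = x - x.mean()
yc = y - y.mean()

# With centred data, covariance reduces to the mean of the products
cov_xy = np.mean(xc * yc)
print(cov_xy)  # matches np.cov(x, y, bias=True)[0, 1]
```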
- All eigenvectors of the covariance matrix are perpendicular to each other (it is symmetric, so its eigenvectors are orthogonal)
- Centre data (subtract the mean)
- Subtract the mean of each feature from all the values (to centre it on 0)
- Calculate the covariance matrix
- Calculate the eigenvectors of the covariance matrix (orthogonal)
- Select the $M$ eigenvectors that correspond to the highest eigenvalues to be the new space dimensions
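The steps above can be sketched in numpy (the function name, data, and chosen $M$ are my own; `eigh` is used because the covariance matrix is symmetric):

```python
import numpy as np

def pca(X, M):
    """Project an N x D data matrix X down to M dimensions."""
    # 1. Centre the data: subtract each feature's mean
    Xc = X - X.mean(axis=0)
    # 2. Compute the D x D covariance matrix
    C = np.cov(Xc, rowvar=False)
    # 3. Eigendecomposition; eigh handles symmetric matrices and
    #    returns orthogonal eigenvectors (ascending eigenvalues)
    eigvals, eigvecs = np.linalg.eigh(C)
    # 4. Keep the M eigenvectors with the largest eigenvalues
    order = np.argsort(eigvals)[::-1][:M]
    W = eigvecs[:, order]   # D x M projection matrix
    return Xc @ W           # N x M projected data

# Example: 3-D points that lie near a 2-D plane (hypothetical data)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 3))
Z = pca(X, 2)
print(Z.shape)  # (100, 2)
```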