Should we ue binary columns in pca

broken image

We remind that the covariance matrix is defined as: , correspond to the eigenvectors of the sample covariance matrix of the data. Moreover it can be demonstrated that the columns of the matrix, solution of Eq. Hence, the recover matrix is obtained by transposing the compression matrix. It can be demonstrated that if and are solutions of Eq. The PCA consists in finding the compression matrix and the recover matrix such that the mean squared error computed over the set of instances is minimal, i.e., the following problem is solved: A real matrix of size is used to approximately reconstruct each original vector from the compressed version by performing the transformation hence, is the recovered version of the original feature vector and it has the same dimensionality.

broken image

Given a set of feature vectors in, a real matrix of size, where, is used to reduce the dimensionality of by performing the linear transformation.

broken image