I'm trying to use Python and NumPy to calculate a covariance matrix.
Here is the matrix:
[[0.69, 0.49],
[-1.31, -1.21],
[0.39, 0.99],
[0.09, 0.29],
[1.29, 1.09],
[0.49, 0.79],
[0.19, -0.31],
[-0.81, -0.81],
[-0.31, -0.31],
[-0.71, -1.01]]
Here is the expected result:
[[0.7322, 0.6189],
[0.6189, 0.5956]]
Here is the equation I was given: covariance matrix = (1 / (n - 1)) Z^T Z
where n is the number of entries (10) and Z is the matrix.
I tried np.cov(np_matrix), which didn't return the correct size or values.
I've also tried this:
np.cov(np_matrix, rowvar=False)
>>> array([[0.61655556, 0.61544444],
[0.61544444, 0.71655556]])
I'm also trying to calculate it manually instead of using the cov function, but multiplying by the transpose doesn't even return the correct value.
The correct value is this:
[[6.59, 5.57],
[5.57, 5.36]]
However, this is what my code returns:
np_matrix.T @ np_matrix
>>> array([[5.549, 5.539],
[5.539, 6.449]])
np_matrix.T * np_matrix
>>> ValueError: operands could not be broadcast together with shapes (2,10) (10,2)
I've also tried setting the arrays to floats and doubles.
Asked Mar 13 at 21:17 by marbledcrystals; edited Mar 13 at 21:39 by Ben Grossmann.
1 Answer
The formula you have been given applies only if your matrix Z is mean-centred, i.e. Z = X − μ, where μ is the vector of column means. Here is a demonstration:
import numpy as np
X = np.array([
[0.69, 0.49],
[-1.31, -1.21],
[0.39, 0.99],
[0.09, 0.29],
[1.29, 1.09],
[0.49, 0.79],
[0.19, -0.31],
[-0.81, -0.81],
[-0.31, -0.31],
[-0.71, -1.01]
])
# Compute the mean
mean_X = np.mean(X, axis=0)
# Mean-center the data
Z = X - mean_X
n = X.shape[0] # Number of samples (rows)
cov = (1 / (n - 1)) * (Z.T @ Z)
print(cov)
Result:
[[0.61655556 0.61544444]
[0.61544444 0.71655556]]
The NumPy way:
print(np.cov(X, rowvar=False))
Result:
[[0.61655556 0.61544444]
[0.61544444 0.71655556]]
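As a side check on the two attempts from the question, here is a minimal sketch (same data; the variable names are only illustrative) of why np.cov without rowvar=False returns the wrong size and why `*` raises an error. By default np.cov treats each row as a variable, and `*` is element-wise multiplication rather than the matrix product that `@` computes.

```python
import numpy as np

X = np.array([
    [0.69, 0.49], [-1.31, -1.21], [0.39, 0.99], [0.09, 0.29], [1.29, 1.09],
    [0.49, 0.79], [0.19, -0.31], [-0.81, -0.81], [-0.31, -0.31], [-0.71, -1.01],
])

# Default rowvar=True: each of the 10 rows is treated as a variable
default_shape = np.cov(X).shape
print(default_shape)            # (10, 10)

# rowvar=False: each of the 2 columns is treated as a variable
cov_np = np.cov(X, rowvar=False)
print(cov_np.shape)             # (2, 2)

# Manual formula on mean-centred data, using the matrix product (@)
Z = X - X.mean(axis=0)
cov_manual = Z.T @ Z / (X.shape[0] - 1)
print(np.allclose(cov_manual, cov_np))  # True

# Element-wise * cannot broadcast shapes (2, 10) and (10, 2)
try:
    X.T * X
    star_failed = False
except ValueError:
    star_failed = True
print(star_failed)              # True
```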
np_matrix.T @ np_matrix / (np_matrix.shape[0] - 1)
yields the same result as the np.cov function (with rowvar=False). – Ben Grossmann, commented Mar 13 at 21:45
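The comment's observation can be checked directly. The sketch below (reusing the question's np_matrix name) suggests the reason: the columns of this particular matrix already have mean zero, so skipping the centring step does not change the result here. For data that is not centred, the two would differ.

```python
import numpy as np

np_matrix = np.array([
    [0.69, 0.49], [-1.31, -1.21], [0.39, 0.99], [0.09, 0.29], [1.29, 1.09],
    [0.49, 0.79], [0.19, -0.31], [-0.81, -0.81], [-0.31, -0.31], [-0.71, -1.01],
])

# Column means are zero up to floating-point rounding
col_means = np_matrix.mean(axis=0)
means_are_zero = np.allclose(col_means, 0.0)
print(means_are_zero)  # True

# Without explicit centring, the formula still matches np.cov for this data
uncentred = np_matrix.T @ np_matrix / (np_matrix.shape[0] - 1)
matches_cov = np.allclose(uncentred, np.cov(np_matrix, rowvar=False))
print(matches_cov)     # True
```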