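I want to compute the uncentered Pearson correlation in R. As far as I can tell, the "pearson" method of the cor function refers to the centered Pearson correlation. Am I wrong?
Is there a way to compute the uncentered Pearson correlation?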
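It shouldn't be too hard to compute yourself... Based on http://www.stanford.edu/~maureenh/quals/html/ml/node53.html (the link is now dead):
"The uncentered version of the Pearson correlation assumes that the population mean is zero. This is equivalent to computing the cosine of the angle."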
r_{xy} = \frac{\sum_{i=1}^n x_{i} y_{i}}{(n-1) \, s_{x}^{(0)} s_{y}^{(0)}}
where
s_{x}^{(0)} = \sqrt{\frac{1}{n-1} \sum_{i=1}^n x_{i}^2}
and s_{y}^{(0)} is defined in the same way for y.
set.seed(101)
x <- runif(100)
y <- runif(100)
n <- length(x)
stopifnot(length(y)==n)
sx0 <- sqrt(sum(x^2)/(n-1))
sy0 <- sqrt(sum(y^2)/(n-1))
(c1 <- sum(x*y)/((n-1)*sx0*sy0)) ## 0.7859549
Actually, I followed the formula too closely: the n-1 factors cancel, so it's even easier:
all.equal(c1,sum(x*y)/(sqrt(sum(x^2)*sum(y^2)))) ## TRUE
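To see why the n-1 factors cancel, substitute the definition of s^{(0)} into the formula. This is only a short worked derivation using the same symbols as above, nothing beyond what the quoted formula already implies:

```latex
\begin{aligned}
r_{xy} &= \frac{\sum_{i=1}^n x_i y_i}{(n-1)\, s_x^{(0)} s_y^{(0)}} \\
       &= \frac{\sum_{i=1}^n x_i y_i}
               {(n-1)\sqrt{\frac{1}{n-1}\sum_{i=1}^n x_i^2}\,\sqrt{\frac{1}{n-1}\sum_{i=1}^n y_i^2}} \\
       &= \frac{\sum_{i=1}^n x_i y_i}{\sqrt{\sum_{i=1}^n x_i^2 \,\sum_{i=1}^n y_i^2}}
\end{aligned}
```

The last line is the cosine of the angle between x and y, which is what the all.equal check confirms numerically.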
You could also try library("sos"); findFn("uncentered Pearson correlation")
(but I didn't get any hits...)
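Since I'm currently facing the same problem, I recently went looking for a package that does this and found one called philentropy. It has a function named lin.cor. With method = "pearson2" it should return Pearson's uncentered correlation coefficient. For further details, see: https://www.rdocumentation.org/packages/philentropy/versions/0.5.0/topics/lin.cor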
import numpy as np

def uncentered_corr_coeff(x, y):
    """Return the uncentered Pearson correlation coefficient of x and y."""
    # check that the vectors have the same length
    # (compare with !=, not "is not", which tests object identity)
    if len(x) != len(y):
        raise ValueError('The vectors that you entered are not the same length')
    # accumulate the numerator and the two denominator terms
    xy = 0.0
    xx = 0.0
    yy = 0.0
    for xi, yi in zip(x, y):
        xy += xi * yi
        xx += xi ** 2
        yy += yi ** 2
    # calculate the uncentered Pearson correlation coefficient
    return xy / np.sqrt(xx * yy)
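For comparison, here is a minimal standard-library sketch of how the uncentered and the ordinary (centered) coefficient relate: centering both vectors first and then applying the uncentered formula gives the usual Pearson r. The function names uncentered_corr and centered_corr and the small data vectors are made up for illustration.

```python
import math

def uncentered_corr(x, y):
    """Uncentered Pearson correlation: cosine of the angle between x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    xy = math.fsum(a * b for a, b in zip(x, y))
    xx = math.fsum(a * a for a in x)
    yy = math.fsum(b * b for b in y)
    return xy / math.sqrt(xx * yy)

def centered_corr(x, y):
    """Ordinary Pearson correlation: subtract each mean, then reuse the uncentered form."""
    mx = math.fsum(x) / len(x)
    my = math.fsum(y) / len(y)
    return uncentered_corr([a - mx for a in x], [b - my for b in y])

# illustrative data only
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
print(uncentered_corr(x, y))  # cosine similarity of the raw vectors
print(centered_corr(x, y))    # ordinary Pearson r of the same data
```

For strictly positive data like this, the uncentered value is typically much closer to 1 than the centered one, because the shared non-zero mean dominates the dot product.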
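This was asked a long time ago, but in case anyone comes across this question: I would use the fact that Cov(X,Y) = E(XY) - E(X)E(Y), so E(XY) = Cov(X,Y) + E(X)E(Y). Note that cov() divides by n-1, so it has to be rescaled by (n-1)/n for the identity to hold exactly, and cov2cor() needs a matrix rather than a single number. So you can do,
n <- length(xx)
M <- cbind(xx, yy)
Cov1 <- cov(M) * (n - 1) / n # the centered covariance (divisor n)
UCov1 <- Cov1 + tcrossprod(colMeans(M)) # the uncentered covariance
UCor1 <- cov2cor(UCov1)[1, 2] # the uncentered correlation
For a numeric data frame X, you can do,
n <- nrow(X)
Covs <- cov(X) * (n - 1) / n # the centered covariances (divisor n)
Means <- colMeans(X) # the column means
UCovs <- Covs + Means %*% t(Means) # the uncentered covariances
UCors <- cov2cor(UCovs) # the uncentered correlations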
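As a quick sanity check of that identity, here is a small standard-library sketch that rebuilds the uncentered second moments from covariances and means and compares the result with the direct cosine formula. The helper names (mean, pop_cov, uncentered_corr_from_moments) and the data are hypothetical. It assumes the population covariance with divisor n; a sample covariance with divisor n-1 would first need to be multiplied by (n-1)/n.

```python
import math

def mean(v):
    """Arithmetic mean of a sequence."""
    return math.fsum(v) / len(v)

def pop_cov(x, y):
    """Population covariance (divisor n), i.e. E(XY) - E(X)E(Y) on the sample."""
    mx, my = mean(x), mean(y)
    return math.fsum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def uncentered_corr_from_moments(x, y):
    """Rebuild E(XY), E(X^2), E(Y^2) from covariances and means, then normalise."""
    exy = pop_cov(x, y) + mean(x) * mean(y)
    exx = pop_cov(x, x) + mean(x) ** 2
    eyy = pop_cov(y, y) + mean(y) ** 2
    return exy / math.sqrt(exx * eyy)

# illustrative data only
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]

# direct cosine formula for comparison
direct = math.fsum(a * b for a, b in zip(x, y)) / math.sqrt(
    math.fsum(a * a for a in x) * math.fsum(b * b for b in y)
)
print(uncentered_corr_from_moments(x, y), direct)
```

Both printed values should agree up to floating-point rounding.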