# Correlating two data sets with different scales

How do we correlate two data sets/curves that have different scales? For example, one curve has a y-axis range of (0, 70000) and another has a y-axis range of (0, 150000). If they were on the same scale, the `cor()` function could be used. I want to check whether one curve depends on the other, i.e. whether the two curves are related. Any ideas?

asked Aug 24 '12 at 21:08

## 3 Answers

If you take a look at the definition of Pearson's product-moment correlation (which is what `cor` calculates by default), you will see that it is invariant under linear rescaling. That is, if a > 0 and b are constants, then cor(aX + b, Y) = cor(X, Y); a negative a only flips the sign. So differences in range between X and Y are not important. Keep in mind, though, that this correlation only measures linear dependence: two variables may be "related" but have a low correlation. This can happen if the relationship is non-linear, for example:

```r
set.seed(100)
x <- rnorm(100)
y <- x^2   # y is fully determined by x, but not linearly
cor(x, y)
# [1] 0.1224623
```
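The invariance claim can also be checked numerically. The following is only a minimal sketch: the constants `3` and `10` and the simulated data are arbitrary illustrative choices, not taken from the question. It also shows that for a monotonic but non-linear relationship, a rank-based coefficient (`method = "spearman"`) can be more informative than the default Pearson coefficient.

```r
set.seed(100)
x <- rnorm(100)
y <- x + rnorm(100)   # noisy linear relationship (made-up data)

r <- cor(x, y)

# Rescaling by a positive factor and shifting leaves the correlation unchanged
r_pos <- cor(3 * x + 10, y)

# A negative factor only flips the sign
r_neg <- cor(-3 * x + 10, y)

isTRUE(all.equal(r, r_pos))    # TRUE
isTRUE(all.equal(r, -r_neg))   # TRUE

# Monotonic but non-linear relationship: Pearson understates it,
# while Spearman's rank correlation does not
z <- exp(x)
cor(x, z)                        # below 1
cor(x, z, method = "spearman")   # 1, because x and z have identical ranks
```

Note that Spearman's coefficient would not help with the `y <- x^2` example above, since that relationship is not monotonic.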

answered Aug 24 '12 at 22:08

Can this be used to measure similarity between two curves? Suppose there are two 1-d datasets with very different ranges but similar shape. Would the above correlation be an appropriate measure of similarity between them? - Kanmani

If you're looking for correlation between two sets of data, the amount of correlation is not dependent on differences in the range of the data sets.

For example, we can make a random set of `y` values and then scale them up. The correlation is still 1:

```r
y <- rnorm(100)
y2 <- y * 2 + 20   # rescale and shift
cor(y, y2)
# [1] 1
```

To further show that the amount of correlation is independent of linear changes in scale, look at the case with uncorrelated data:

```r
y3 <- rnorm(100)   # independent of y
cor(y, y3)
# [1] -0.05293818
y4 <- y3 * 2 + 20  # rescale and shift y3
cor(y, y4)
# [1] -0.05293818
```

So, to answer your question: I think the function `cor` should still work fine for you.

answered Aug 24 '12 at 21:08

The correlation shouldn't depend on the absolute ranges of the data, I would think. If in doubt, just multiply one data set by a constant so that it has the same range as the other.
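Rescaling is not required for `cor`, but the suggestion is easy to check. This is a minimal sketch under stated assumptions: the two curves are made-up data with roughly the ranges mentioned in the question, and `rescale_to` is a hypothetical helper written for this illustration, not a base R function.

```r
set.seed(1)
u <- seq(0, 1, length.out = 200)

# Two made-up curves with a similar shape but different y ranges
curve_a <- 70000  * u^2 + rnorm(200, sd = 2000)
curve_b <- 150000 * u^2 + rnorm(200, sd = 5000)

# Hypothetical helper: linearly map v onto the range of `target`
rescale_to <- function(v, target) {
  (v - min(v)) / (max(v) - min(v)) * (max(target) - min(target)) + min(target)
}

curve_a_rescaled <- rescale_to(curve_a, curve_b)

range(curve_a_rescaled)          # now matches range(curve_b)
cor(curve_a, curve_b)            # correlation on the original scales
cor(curve_a_rescaled, curve_b)   # same value after rescaling
```

Because the rescaling is a linear map with a positive slope, the two `cor` calls agree up to floating-point rounding.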

answered Aug 24 '12 at 21:08
