Match a column of one data frame against the columns of another data frame and, if they match, add a new column
I have two big data frames. One (df1) has this structure:
V1 V2 V3
1 Chr1 7507 10944
2 Chr1 10944 13170
3 Chr1 13170 20065
4 Chr1 20065 28273
5 Chr1 28273 29960
6 Chr1 29960 36599
7 Chr1 36599 37513
8 Chr1 37513 40360
9 Chr1 40360 48796
10 Chr1 48796 50661
The other (df2) has this structure:
V1 V2 V3 V4 V5
1 Chr1 7507 7507 1 1
2 Chr1 10944 10944 1 2
3 Chr1 13170 13170 1 22
4 Chr1 20065 20065 1 3
5 Chr1 28273 28273 1 161
6 Chr1 29960 29960 1 10
7 Chr1 36599 36599 1 604
8 Chr1 37513 37513 1 117
9 Chr1 40360 40360 1 8
10 Chr1 48796 48796 1 3
What I'm trying to do is check whether V2 of df2 (V3 is identical to it) is equal to a boundary of, or falls within, the range from V2 to V3 of df1. If it does, I want to write the value of V5 from df2 into a new column in df1; if not, write 0. The result I want would look like this:
Chr1 7507 10944 1
Chr1 10944 13170 2
Chr1 13170 20065 22
Chr1 20065 28273 3
Chr1 28273 29960 161
Chr1 29960 36599 10
Chr1 36599 37513 604
Chr1 37513 40360 117
Chr1 40360 48796 8
.
.
.
Do you know any good way to do this? Thank you very much.
1 Answer
As @beginneR already mentioned in the comments, all V2 and V3 values of df2 have an exact match with V2 of df1. If I interpret your question correctly, this is probably not what you wanted. The following example is what I think you are looking for.
Reading the two dataframes:
df1 <- read.table(header=TRUE, text="rn V1 V2 V3
1 Chr1 7507 10944
2 Chr1 10944 13170
3 Chr1 13170 20065
4 Chr1 20065 28273
5 Chr1 28273 29960
6 Chr1 29960 36599
7 Chr1 36599 37513
8 Chr1 37513 40360
9 Chr1 40360 48796
10 Chr1 48796 50661")
df2 <- read.table(header=TRUE, text="rn V1 V2 V3 V4 V5
1 Chr1 7507 7507 1 1
2 Chr1 10944 10944 1 2
3 Chr1 13170 13170 1 22
4 Chr1 20065 20065 1 3
5 Chr1 28273 28273 1 161
6 Chr1 29960 29960 1 10
7 Chr1 36599 36599 1 604
8 Chr1 37513 37513 1 117
9 Chr1 40360 40360 1 8
10 Chr1 48796 48796 1 3")
Getting rid of V3 in df2 as it is exactly the same as V2:
df2 <- df2[,-4]
Making the values in V2 of df2 higher, so that they no longer coincide exactly with the boundaries in df1:
df2$V2 <- df2$V2 + 2000
With the ifelse function you can assign the values of V5 to a new variable in df1 when they meet the requirements. Note that the comparison is element-wise, so row i of df2 is only checked against row i of df1:
df1$V4 <- ifelse(df2$V2 >= df1$V2 & df2$V2 <= df1$V3, df2$V5, 0)
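To make the effect of that line concrete, here is a minimal self-contained sketch using the sample data from the question, with the row-number column left out and V2 of df2 already shifted by 2000. The intermediate name `inside` is only an illustration and does not appear in the answer.

```r
# Sample data from the question (row-number column omitted)
df1 <- data.frame(
  V1 = "Chr1",
  V2 = c(7507, 10944, 13170, 20065, 28273, 29960, 36599, 37513, 40360, 48796),
  V3 = c(10944, 13170, 20065, 28273, 29960, 36599, 37513, 40360, 48796, 50661),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  V1 = "Chr1",
  # original V2 values shifted by 2000, as in the answer
  V2 = c(7507, 10944, 13170, 20065, 28273, 29960, 36599, 37513, 40360, 48796) + 2000,
  V5 = c(1, 2, 22, 3, 161, 10, 604, 117, 8, 3),
  stringsAsFactors = FALSE
)

# Element-wise test: row i of df2 is compared with row i of df1 only
inside <- df2$V2 >= df1$V2 & df2$V2 <= df1$V3

# Take V5 where the shifted position lies in the range, 0 otherwise
df1$V4 <- ifelse(inside, df2$V5, 0)

df1$V4
# values: 1 2 22 3 0 10 0 117 8 0
```

Rows 5, 7 and 10 get 0 because the shifted positions (30273, 38599, 50796) lie beyond the upper bound of the df1 range in the same row. Since the test is positional, it presumably only suits data where both data frames have the same number of rows in the same order.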
answered May 28 '14 at 19:05
Thank you very much for your answer, Jaap and @beginneR! Based on this I had to modify some elements, and it works. We had to take care of a few more parameters, so now the code looks like this:
upto <- dim(df1)[1]
for (i in 1:upto) {
  # rows of df2 on the same chromosome whose V2 lies in [V2, V3) of df1 row i
  result <- df2$V1 == df1$V1[i] & df1$V2[i] <= df2$V2 & df1$V3[i] > df2$V2
  # sum of V5 over those rows (0 if there are none)
  sum <- sum(df2$V5[result])
  df1$V4[i] <- sum
}
- user3683485
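The loop in this comment can also be written without an explicit index variable. The following is a sketch only, assuming the unshifted sample data from the question; `sum_in_range` is a hypothetical helper name, and the half-open range test (lower bound included, upper bound excluded) is copied from the comment's loop.

```r
# Sample data from the question (row-number column, V3 and V4 of df2 omitted)
df1 <- data.frame(
  V1 = "Chr1",
  V2 = c(7507, 10944, 13170, 20065, 28273, 29960, 36599, 37513, 40360, 48796),
  V3 = c(10944, 13170, 20065, 28273, 29960, 36599, 37513, 40360, 48796, 50661),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  V1 = "Chr1",
  V2 = c(7507, 10944, 13170, 20065, 28273, 29960, 36599, 37513, 40360, 48796),
  V5 = c(1, 2, 22, 3, 161, 10, 604, 117, 8, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sum V5 over all df2 rows that share the chromosome
# of df1 row i and whose V2 lies in [df1$V2[i], df1$V3[i])
sum_in_range <- function(i) {
  hit <- df2$V1 == df1$V1[i] & df2$V2 >= df1$V2[i] & df2$V2 < df1$V3[i]
  sum(df2$V5[hit])  # sum() of an empty selection is 0
}

# seq_len() also behaves correctly when df1 has zero rows
df1$V4 <- vapply(seq_len(nrow(df1)), sum_in_range, numeric(1))

df1$V4
# values: 1 2 22 3 161 10 604 117 8 3
```

Unlike the row-by-row ifelse, this checks every df2 row against each df1 range, so the two data frames need neither the same length nor the same order. Each df1 row still scans all of df2, which may be slow for very large inputs.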
I think you need to use merge. You can check the examples in the R help files (?merge). - dickoa
In your example, all V2 and V3 values of df2 have an exact match in V2 of df1. If this applies to your whole data, then a relatively simple merge is appropriate, as suggested by @dickoa. If your actual data is different (so that you would need to check ranges), it would be better if you could also edit your sample data. - talat
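For the exact-match case described in these comments, a merge-based approach might look like the sketch below. It uses only the first four rows of the sample data, and the df2 row for position 13170 is deliberately left out here (an assumption made for illustration) to show how an unmatched row ends up as 0.

```r
# First four rows of df1 from the question
df1 <- data.frame(
  V1 = "Chr1",
  V2 = c(7507, 10944, 13170, 20065),
  V3 = c(10944, 13170, 20065, 28273),
  stringsAsFactors = FALSE
)
# Matching rows of df2; the row for 13170 is omitted on purpose
df2 <- data.frame(
  V1 = "Chr1",
  V2 = c(7507, 10944, 20065),
  V5 = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Left join on chromosome and start position: all.x = TRUE keeps every
# df1 row and fills V5 with NA where df2 has no exact match
merged <- merge(df1, df2, by = c("V1", "V2"), all.x = TRUE)

# Make the row order explicit, then turn "no match" into 0
merged <- merged[order(merged$V2), ]
merged$V5[is.na(merged$V5)] <- 0

merged$V5
# values: 1 2 0 3
```

This only works when positions coincide exactly. If positions can fall strictly inside a range, a range comparison is needed instead.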