Using dlply with pROC

I'm trying to apply the roc() function from the pROC package to specific variables from the data frame df, split on df$site, which consists of character strings like "01", "02", "03". The function roc() returns a list, so I'm expecting my object roc_site to be a list which in turn contains a list of results for each site.

roc_site <- dlply(
  .data = df, 
  .variables = "site", 
  .fun = roc, 
  .progress = "text",
  response = df$Risk,
  predictor = df$Rating, 
  na.rm = TRUE, plot = TRUE)

This runs successfully, and roc_site is a list that consists of one list for each site, but the results for each site are identical; it hasn't split the dataframe apart. What am I missing?

asked Jul 4 '12 at 02:07

Sometimes when you cannot figure out what is going on, it helps to replace the function inside such a split-apply loop with a simple print(). Then you will see what is getting passed.

That's a great idea; thanks for the tip!

Or browser(), so that you can inspect what you're getting.

1 Answer

The function you pass to .fun in dlply needs to accept the entire chunk of the data frame as its (first) argument.
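As a minimal sketch of what that means, the toy data frame below (the name `toy` and its values are made up for illustration) is split by site, and the anonymous function simply reports how many rows are in the chunk it receives.

```r
library(plyr)

# Hypothetical toy data: three sites with 2, 3 and 4 rows respectively
toy <- data.frame(
  site  = rep(c("01", "02", "03"), times = c(2, 3, 4)),
  value = 1:9,
  stringsAsFactors = FALSE
)

# .fun is called once per site; `chunk` is the subset of rows for that site
rows_per_site <- dlply(toy, "site", function(chunk) nrow(chunk))
```

The result is a list with one element per site, each holding the row count of that site's chunk rather than of the whole data frame.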

So in this case, what you really want is to write your own small function that will take your data frame and calculate what you want. e.g.

# x is one site's chunk of df, so roc() only sees that site's rows
foo <- function(x){
    roc(response = x$Risk, predictor = x$Rating, na.rm = TRUE, plot = TRUE)
}

and then pass that function to .fun.
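Put together, the call might look like the sketch below. The data are synthetic and purely hypothetical, and the wrapper `foo_noplot` is a variant of foo that sets plot = FALSE and fixes `levels` and `direction` explicitly; those choices are assumptions made so the example runs without a graphics device, not part of the original question.

```r
library(plyr)
library(pROC)

# Hypothetical synthetic data: 3 sites, 40 rows each, both Risk classes in every site
set.seed(1)
toy <- data.frame(
  site = rep(c("01", "02", "03"), each = 40),
  Risk = rep(c(0, 1), times = 60),
  stringsAsFactors = FALSE
)
# Shift Rating upward for Risk == 1 by a different amount per site
toy$Rating <- rnorm(120) + toy$Risk * as.numeric(toy$site)

# Same idea as foo, but without plotting and with explicit levels/direction
foo_noplot <- function(x) {
  roc(response = x$Risk, predictor = x$Rating,
      levels = c(0, 1), direction = "<", na.rm = TRUE, plot = FALSE)
}

# One roc object per site, each computed from that site's rows only
roc_site <- dlply(.data = toy, .variables = "site", .fun = foo_noplot)

# One AUC per site
auc_by_site <- sapply(roc_site, function(r) as.numeric(auc(r)))
```

With the real data, the same pattern would be dlply(df, "site", foo), with .progress = "text" added if a progress bar is wanted.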

The reason you're getting identical results is that for each chunk, dlply is calling roc on your chunk, but passing df$Risk and df$Rating each time, and those are the vectors for the entire data set.
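The effect can be reproduced without pROC. In the sketch below, `mean_of_response` is a hypothetical stand-in for roc that ignores its chunk and uses only the extra argument, and `toy` is made-up data.

```r
library(plyr)

# Hypothetical toy data: the mean of Risk differs by site
toy <- data.frame(
  site = rep(c("01", "02", "03"), each = 4),
  Risk = c(0, 0, 0, 1,  0, 0, 1, 1,  0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Stand-in for roc(): ignores the chunk and uses only the extra argument
mean_of_response <- function(chunk, response) mean(response)

# Mirrors the call in the question: toy$Risk is the full-length vector,
# and that same vector is handed to the function for every chunk
same_everywhere <- dlply(toy, "site", mean_of_response, response = toy$Risk)

# Reading the column from the chunk gives a per-site value instead
per_site <- dlply(toy, "site", function(chunk) mean(chunk$Risk))
```

Here every element of same_everywhere is 0.5, the mean over all twelve rows, while per_site holds 0.25, 0.5 and 0.75.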

answered Jul 4 '12 at 03:07

That was amazingly fast and amazingly informative. Thank you very, very much. - ahj
