Interpreting the residual values in the summary of lm() [closed]

I am working with R to fit some linear models (using lm()) on data that I have collected. I am not that good at statistics and am finding it difficult to understand the summary of the linear model that R generates.

I mean the residual values: Min, 1Q, Median, 3Q, Max.

My question is: what do these values mean, and how can I tell from them whether my model is good?

These are some of the residual values that I have:

Min: -4725611  1Q: -2161468  Median: -1352080  3Q: 3007561  Max: 6035077

asked Aug 28 '12 at 10:08

Your question is meaningless as it stands. One wouldn't evaluate goodness of fit from the residuals alone. Post the results of summary(your_model) into your question to make this more meaningful.

This is not really a programming question; any econometrics textbook will help you out. I recommend [this][1]. It's easy to read and has good examples of interpretation. [1]:

Linear regression assumes a normal distribution of residuals. A slight violation of this assumption is not problematic, but the distribution should be at least symmetric; that is, the median should be close to zero and the absolute values of the first and third quartiles should be similar. Of course, there are far better diagnostics available, e.g., several plots of the residuals. Try something like plot(lm(your.y~your.x)) for some diagnostic plots.

@Andrie: I understand that residuals are not enough by themselves to evaluate the fit. I just wanted to know how to interpret the residual values as R reports them, which I have shown above.

@Ben Bolker: Done, but I think this is off-topic on SO and probably a duplicate on CV.

1 Answer

One fundamental assumption of linear regression (and of the associated hypothesis tests in particular) is that the residuals are normally distributed with expected value zero. A slight violation of this assumption is not problematic, as the statistics are fairly robust. However, the distribution should at least be symmetric.

The best way to judge whether the normality assumption is fulfilled is to plot the residuals. There are many different diagnostic plots available; e.g., you can do the following:

fit <- lm(y~x)
plot(fit, which = 1:2)

This will give you a plot of residuals vs. fitted values and a Q-Q plot of standardized residuals. The quantiles given by summary(fit) are useful for a quick check of whether the residuals are symmetric. There, the min and max values are not that important, but the median should be close to zero and the first and third quartiles should have similar absolute values. Of course, this check only makes sense if you have a sufficient number of values.
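As a rough illustration of that quick check, the sketch below computes the same five quantiles on simulated data and compares the median to the interquartile range. The simulated data, the helper name sym_ratio, and the choice of median/IQR as a symmetry measure are assumptions made for this example only, not something summary() reports.

```r
# Simulated data (hypothetical): y depends linearly on x with normal noise
set.seed(42)
x <- runif(100, 0, 10)
y <- 2 + 3 * x + rnorm(100, sd = 2)
fit <- lm(y ~ x)

# Five quantiles of the residuals; for an unweighted fit these should
# match the Min / 1Q / Median / 3Q / Max line printed by summary(fit)
res_q <- quantile(residuals(fit))
print(res_q)

# Illustrative helper (not part of R): median relative to the
# interquartile range; values near 0 suggest a centred distribution
sym_ratio <- function(q1, med, q3) med / (q3 - q1)
sim_ratio <- sym_ratio(res_q[["25%"]], res_q[["50%"]], res_q[["75%"]])

# The same ratio for the quartiles posted in the question
asker_ratio <- sym_ratio(-2161468, -1352080, 3007561)
print(c(simulated = sim_ratio, question = asker_ratio))
```

For the quartiles in the question the ratio comes out at about -0.26, i.e. the median sits roughly a quarter of an interquartile range below zero, which hints at skewed residuals. This is only a hint; the diagnostic plots remain the better tool.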

If the residuals are not normally distributed, there are several ways to deal with that, e.g.,

  • transformations,
  • generalized linear models,
  • or a non-linear model could be more appropriate.
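The first two options might look like the following minimal sketch. It assumes a strictly positive response with multiplicative noise; the simulated data, the log transformation, and the Gamma family with log link are choices made for this illustration, and which remedy fits real data depends on how the data arose.

```r
# Simulated data (hypothetical): positive response with multiplicative noise,
# so residuals of a plain linear fit are right-skewed
set.seed(1)
x <- runif(200, 1, 10)
y <- exp(0.5 + 0.3 * x + rnorm(200, sd = 0.4))

# Plain linear model on the raw scale, for comparison
fit_raw <- lm(y ~ x)

# Option 1: transform the response, then fit a linear model
fit_log <- lm(log(y) ~ x)

# Option 2: generalized linear model (Gamma errors, log link)
fit_glm <- glm(y ~ x, family = Gamma(link = "log"))

# Compare residual quantiles before and after the transformation
print(quantile(residuals(fit_raw)))
print(quantile(residuals(fit_log)))

# Slope estimates on the log scale from both remedies
print(c(log_lm = coef(fit_log)[["x"]], gamma_glm = coef(fit_glm)[["x"]]))
```

In this setup the residual quartiles of fit_log should look far more symmetric around zero than those of fit_raw. Note that the two remedies model slightly different things: the first models the mean of log(y), the second the log of the mean of y.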

There are many good books on linear regression and even some good web tutorials. I suggest reading at least one of them carefully.

answered Aug 28 '12 at 12:08
