
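This may be a basic question, but I can't seem to find a solution anywhere. If we have a data frame with 100 factors (call them a1 to a100), how do we enter them into a linear model in R? I know you can write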

lm(y~ a1*...*a100)

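But if the names are long, writing this out takes a long time. Is there a quicker way, for example by referring to the columns or something similar? Somewhat related: if I get a data table with a column name that contains parentheses (e.g. y-max()), how can I enter it? R reads it as a function call, but it isn't one.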

I apologize if this has already been asked, but I can't seem to find the answer.

Thanks, everyone

--Edit--

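Thanks for your answers. But if I do want higher-order interaction terms, how would I do that? Do I need to write a script, or is there a smarter way?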


2 Answers


If you want to include all the other variables, y ~ . is enough. If you only want some selected variables, say columns 2 to 50 and 52 to 101, you can do something like this:

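vars <- names(df)[c(2:50, 52:101)]   # or whatever columns you need
covs <- paste(vars, collapse = "+")   # "a1+a2+..." as a single string
model <- paste("y~", covs)            # the full formula, still as a string
df.lm <- lm(as.formula(model), data = df)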
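
For reference, here is a minimal sketch of the same idea on a small made-up data frame, which also touches on the higher-order interaction question from the edit. The data, the names a1 to a5, and the choice of numeric predictors are assumptions for illustration only. It uses reformulate() from the stats package to build the formula from a character vector, and the (a + b + ...)^2 formula syntax to request all two-way interactions.

```r
set.seed(1)

# Hypothetical toy data: y plus five predictors a1..a5,
# standing in for the a1..a100 of the question.
df <- data.frame(
  y  = rnorm(20),
  a1 = rnorm(20), a2 = rnorm(20), a3 = rnorm(20),
  a4 = rnorm(20), a5 = rnorm(20)
)

# Select predictors by column position, skipping column 4 (a3).
vars <- names(df)[c(2:3, 5:6)]

# Main effects only: y ~ a1 + a2 + a4 + a5
f <- reformulate(vars, response = "y")
fit <- lm(f, data = df)

# All main effects plus all two-way interactions of the selected
# variables: y ~ (a1 + a2 + a4 + a5)^2
f2 <- as.formula(paste("y ~ (", paste(vars, collapse = " + "), ")^2"))
fit2 <- lm(f2, data = df)
```

Replacing ^2 with ^3 would request interactions up to third order. With 100 predictors the number of interaction terms grows very quickly, so this is probably only practical for a modest subset of variables.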
answered 2013-12-11T03:26:01.980

Much of this can be figured out by reading the manual "An Introduction to R", which ships with R when you download it.

Generally, a factor with many levels is stored as a single variable:

treat <- c("control", "placebo", "placebo", "control", "drugA", "control", 
           "drugB", ...)

If so, you can just use lm(y~treat), and R will create the dummy (contrast) variables for the factor levels for you. On the other hand, if you have a data frame containing only y and a1 through a100, then you can use lm(y~., my.data), and R will use every other column as a predictor.
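
A small sketch of both points, plus the column-name question, on made-up data. The data frames my.data and odd and their contents are assumptions for illustration. It shows y ~ . with a factor column, y ~ .^2 for all two-way interactions among the remaining columns, and backticks for quoting a non-syntactic column name such as y-max() inside a formula.

```r
set.seed(42)

# Hypothetical data: a response, a three-level factor, and a numeric covariate.
my.data <- data.frame(
  y     = rnorm(30),
  treat = factor(rep(c("control", "placebo", "drugA"), times = 10)),
  x     = rnorm(30)
)

# "." expands to every column other than the response: y ~ treat + x
fit_all <- lm(y ~ ., data = my.data)

# ".^2" adds all two-way interactions: y ~ treat + x + treat:x
fit_int <- lm(y ~ .^2, data = my.data)

# A column whose name is not a valid R name (hypothetical example).
odd <- data.frame(rnorm(30), rnorm(30))
names(odd) <- c("y-max()", "x")

# Backticks make R treat the name as a plain symbol, not a function call.
fit_odd <- lm(`y-max()` ~ x, data = odd)
```

Inside code that builds formulas from strings, the same backticks can be pasted around each name before calling as.formula().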

answered 2013-12-11T03:27:28.743