Re: st: R for Stata Users
From: Fred Wolfe <[email protected]>
To: [email protected]
Subject: Re: st: R for Stata Users
Date: Fri, 26 Feb 2010 12:21:01 -0600
I think that I agree with everything that has been said about R. I
have the book on order, and I certainly hope that it does more than
the blurb suggests.
I have used R for its -rpart- ("CART", recursive partitioning) and
-randomForest- packages. Here is an example of the (probably poor) R
code that I used (BTW, I ended up making the publication graphs in Stata!):
setwd("/Users/fwolfe/statdata/fibcrit")
sink("rmd1_r.log", split=TRUE)
load("/Users/fwolfe/statdata/fibcrit/mdbinshort.rdata")
library(rpart)
set.seed(9987)
mdbinshort$md_acrcrit <- as.factor(mdbinshort$md_acrcrit)
cat.rp <- rpart(md_acrcrit ~ .,data=mdbinshort,method="class",xval=100)
jpeg(filename = "mdbinshort1.jpeg")
plot(cat.rp,uniform = TRUE)
text(cat.rp)
print(cat.rp)
summary(cat.rp)
printcp(cat.rp)
dev.off()
library(randomForest)
set.seed(4804)
cat.rf <- randomForest(md_acrcrit ~ ., data=mdbinshort, importance=TRUE, proximity=TRUE)
print(cat.rf)
## look at variable importance
round(importance(cat.rf),2)
## Plot variable importance
jpeg(filename = "mdbinshort2.jpeg")
varImpPlot(cat.rf,main="Predictors of FM Classification")
dev.off()
## close the log file opened with sink() at the top
sink()
To make this work, I built all of the analysis files in Stata and then
converted them to R data files with Stat/Transfer, as in this example:
// md data medium sx but not locations
preserve
keep if mdpure == 1
keep md_acrcrit mdrps mdpain mdfatig mdsleep mdmood mdcog somat mdunfresh *sx
save mdbinmedium, replace
shell "/Applications/StatTransfer9/st.command" mdbinmedium.dta mdbinmedium.rdata -y
restore
// md data medium sx but not locations with rps as cat
preserve
keep if mdpure == 1
keep md_acrcrit rcat mdpain mdfatig mdsleep mdmood mdcog somat mdunfresh *sx
save mdbinmediumcat, replace
shell "/Applications/StatTransfer9/st.command" mdbinmediumcat.dta mdbinmediumcat.rdata -y
restore
The real problem is making all of this work in R if you are not an R
expert. I spent hours working through error messages and books.
What I would like to see, and the reason I am posting, is a Stata
program that writes R-specific program code, so that if I wanted to
run -rpart-, I could run a Stata program that would create the R
code, call R, and run the program.
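A minimal sketch of what such a wrapper might look like, under stated
assumptions: the command name -rpartr-, the file names rpartr_data.csv
and rpartr_script.R, and the default seed are all hypothetical, Rscript
is assumed to be on the system PATH, and the rpart package is assumed to
be installed in R. It passes the data as CSV rather than through
Stat/Transfer, which is a substitution for illustration only.

```stata
* Hypothetical wrapper -rpartr-: writes an R script, then runs it via Rscript.
* Assumptions: Rscript is on the PATH, the rpart package is installed in R,
* and the working directory is writable. File names are illustrative only.
capture program drop rpartr
program define rpartr
    version 11
    syntax varlist(min=2) [, seed(integer 9987)]
    gettoken depvar : varlist          // first variable is the outcome

    * Export only the requested variables as CSV for R to read
    preserve
    keep `varlist'
    outsheet using "rpartr_data.csv", comma nolabel replace
    restore

    * Write the R script one line at a time
    tempname fh
    file open `fh' using "rpartr_script.R", write text replace
    file write `fh' "library(rpart)" _n
    file write `fh' "set.seed(`seed')" _n
    file write `fh' `"d <- read.csv("rpartr_data.csv", na.strings = c("", ".", "NA"))"' _n
    file write `fh' `"d[["`depvar'"]] <- as.factor(d[["`depvar'"]])"' _n
    file write `fh' `"fit <- rpart(`depvar' ~ ., data = d, method = "class")"' _n
    file write `fh' "print(fit)" _n
    file write `fh' "printcp(fit)" _n
    file close `fh'

    * Call R on the generated script
    shell Rscript rpartr_script.R
end
```

With the program defined, a call such as
"rpartr md_acrcrit mdrps mdpain mdfatig, seed(9987)" would treat the
first variable as the outcome and the rest as predictors. Where the R
output appears depends on how -shell- behaves on the operating system
in use.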
I do not think I am expert enough to attempt this myself. Perhaps this
list could compile the 5-10 most useful R programs, and a community
effort to build translation programs could be undertaken.
Fred
On Fri, Feb 26, 2010 at 11:35 AM, Airey, David C
<[email protected]> wrote:
> Agreed. R is documented that way on purpose according to other R docs. It's the recommended style. But it would be helpful if they had a "verbose" option in their dynamically created help files, like with the Stata online help versus PDF manual help! God I love how fast I can get to helpful help in Stata. Yes, sometimes R has good vignettes, but not enough.
>
> All major statistical software environments are getting explicit with links to R functionality it seems. JMP (SAS) is building in R functionality to version 9 and SPSS 18 already does. I tried Roger Newson's SSC package to run R in Stata a while ago, and it worked fine. But linking to R is not the problem as Stas points out. You have to soak yourself in R head to toe until you are positively inebriated or pickled, is my guess.
>
> > Well, as pretty much of R documentation, it gives the minimal amount of
> > information. [...] -Stas
--
Fred Wolfe
National Data Bank for Rheumatic Diseases
Wichita, Kansas
NDB Office +1 316 263 2125 Ext 0
Research Office +1 316 686 9195
[email protected]