st: RE; R for Stata Users
From: [email protected]
To: [email protected]
Subject: st: RE; R for Stata Users
Date: Sun, 28 Feb 2010 10:52:51 -0500
I am not on the Statalist, but do take the Digest, so do not get the 
listings until the following day. Most of the time I try to see what 
has been discussed, but sometimes I just don't have the time. 
Fortunately I looked this morning.
Bob Muenchen of the University of Tennessee wrote a book a couple of 
years ago titled "R for SAS and SPSS Users." The folks at both SPSS and SAS 
have seemed to love it, once they realized that the book was aimed to 
help SAS/SPSS users who also wanted to learn R. It was not written to 
convert anyone from SAS/SPSS to R. Bob is a SAS user and has no 
intention of changing.
The statistics editor at Springer contacted me about working with Bob 
on a book to be titled "R for Stata Users." He knew that I had added R 
code at the end of the chapters of my then recently published "Logistic 
Regression Models" (May 2009, Chapman & Hall/CRC), code which - insofar 
as it was possible - was meant to produce output corresponding to the 
Stata examples I use throughout the text. I initially did this to assist 
members of my classes with Statistics.com. I teach Logistic Regression 
and Advanced Logistic Regression, as well as a couple of other courses 
for them. Nearly all "students" are professors who teach statistics 
courses in some discipline, or active researchers wanting to update 
their knowledge of certain areas of statistics. Many -- perhaps even 
most -- of these students use R, with SAS as the next most common 
software of preference. Very few come to class as Stata users. From the 
feedback I get, however, many of these students are so impressed with 
what Stata can do that they end up as Stata users after the class is 
over. They most definitely end up respecting Stata for its scope of 
capabilities and ease of use. I have a 30 page tutorial on Stata as 
Appendix A to help these students, and provide references to other 
places where they can learn Stata, including the suite of Stata Press 
books. Many times I have to tell them that there simply is no 
corresponding SAS, SPSS, or R function available for some procedure we 
are discussing.
It is clear in Logistic Regression Models that Stata has far more 
modeling capabilities in this area than are available in R. A couple 
of my later chapters have no R examples at the end of the chapter, 
e.g., the chapter on exact logistic regression. But there are areas in 
which someone has posted a library of functions to CRAN that is not 
available in Stata, e.g., wavelets. I needed to write a NB2-NB1 
hurdle model for a project a week ago. Stata does not have a command 
for it, and neither I nor anyone else I know has written one, but it 
is available using the flexmix function in R. This does not happen much, 
but it can happen to any of us.
Now - to address the questions raised. I joined the project with Bob 
because it is clear from email I get, from students and other 
professors I interact with, and from what I see myself, that many - 
perhaps most - textbooks now being published use R for examples. R is 
free and is not a commercial package. Many university stat departments 
are now requiring that their students learn R. And, from what I see as 
a member of the editorial boards of 7 journals, most examples used in 
journals employ R.
What does this mean? Well, as a longtime committed Stata user (some 22 
years now) it means that if I am going to get the most from textbooks 
using R for examples, and if I am to better understand articles using R 
for examples, then I want to understand the basics of R. If I am to 
better help my R-using students to understand the Stata code and 
examples I use in my books, I should know R so that I can use it to 
teach them how to understand Stata and the examples. There are also 
some models that are not yet available in Stata but are available in R.
I didn't write the cover -- but the purpose of the book is to help 
Stata users learn enough R to
1) better understand texts and journal articles employing R for 
examples, and
2) better be able to use R for the estimation of statistical 
procedures that are currently unavailable in Stata. This includes how 
to set up variables/observations, deal with missing values, and so 
forth.
I can't imagine anyone actually switching from Stata to R, unless they 
simply have no money to purchase the software and do not have access to 
a university site license. Nowhere does the book advocate such a 
change. In fact, in the portions of the book that I wrote, I compare 
Stata code with R code for doing some operation or function. Mostly 
Stata is easier - but sometimes not.
I myself find it much easier to use Stata than R for most commands and 
operations. I too had trouble with the R "if" qualifier - because there 
isn't one in the Stata sense. This was difficult for me at first, but 
there are ways to perform the operation that end up not so bad at all. 
However, Stata is 
more direct.
The foremost area of instruction in "R for Stata Users" is perhaps data 
management. This is the area that is most difficult for Stata users 
trying to interpret R code that is presented in a text or article. 
There are two chapters on graphics and one on basic statistical 
commands, but nothing beyond linear regression and ANOVA.
The book is NOT for Stata users who have no reason to learn R. If it 
were not for my having so many students who are R users and having to 
present materials aimed to teach various statistical methods, and if I 
did not want to better understand texts and journal articles that use 
R, I would have no reason at all for learning it. Also, I referee more 
articles than I have time for, in addition to my AE responsibilities, 
and find that the majority of manuscripts I get use R for their 
examples. In order to do a more responsible job as referee I felt that 
I needed to learn R.  But that has no bearing on what my preferred 
statistical package is for my own work. It is clearly Stata - for a 
host of reasons. But I still find it useful to know R as well. And that 
is the point of the book. The book was written for those wanting to 
augment Stata, or to better understand sources that use R for examples.
Joseph Hilbe