Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Re: Your paper on Stata,SAS and SPSS

From	"John F Hall" <[email protected]>
To	"Statalist" <[email protected]>
Subject	Re: st: Re: Your paper on Stata,SAS and SPSS
Date	Wed, 11 Aug 2010 15:55:31 +0200

Alan

Thanks for this: very helpful.

Not sure about Poisson distributions (my course went 11 weeks before touching t-test or chi-sq) but the reversing command looks neat. I have the exact same situation in one of my tutorials constructing simple attitude scales (from a survey of 15- and 16-year-olds) to measure "attachment to status quo" and "sexism" from a list of items, some of which need to be reverse coded, but there's also a lengthy narrative explaining what I'm doing and why, and also warnings of missing data (with COUNT), advice on giving scales a true zero point (with COMPUTE) and the dangers with RECODE if you're not careful, especially if you save the working file without keeping a copy of the original.

[7-hour interlude here as a digger looking for a mains water leak went through the phone cable, but at least France Telecom came the same day to fix it. As soon as I've filled the hole back in, I'll scout round for some examples of Stata output]

You can see the sequence and contents of my SPSS tutorials on http://surveyresearch.weebly.com/spsspasw-18-tutorial-guide.html : all the main menus are displayed in the left pane on the site.

I don't have Stata installed and don't want to download a trial version until I have time to do it justice.


John Hall
http://surveyresearch.weebly.com

----- Original Message -----
From: Alan Acock
To: [email protected]
Cc: [email protected] ; Bruce Weaver
Sent: Wednesday, August 11, 2010 2:41 AM
Subject: Re: st: Re: Your paper on Stata,SAS and SPSS



On Aug 10, 2010, at Tue Aug 5 12:48 , John F Hall wrote:

Alan
I only joined the list two days ago, so I haven't had a chance to find much Stata syntax to set alongside SPSS. Listers have sent one or two one-liners, but with no accompanying output examples.

I'm talking about reading from a raw data matrix, adding variable and value labels, declaring missing values, data transformations, index construction and the like (possibly via correlation) followed by simple analysis like frequency counts, barcharts and contingency tables using %%, not fancy multivariate inferential statistics. Had I still been teaching, that would have come much later in my course, but far too late for the survey report that had to be on the client's desk by yesterday.

You're welcome to download data sets and tutorials from my site and offer Stata examples to set alongside the SPSS syntax and output (no GUI for me: far too cumbersome, complex and tiresome).
John Hall
http://surveyresearch.weebly.com


John,

To read the following you should have a fixed font, e.g., Courier, and may have some problems if your email system wraps lines around.

I sent one-line commands because that is how simple the syntax is. Here is a complete program. The dataset is installed on your PC when you install Stata. It is called auto.dta.


Here is the entire program:
********begin*********
sysuse auto
tab foreign
fre foreign
ttest mpg, by(foreign)
tab rep78 foreign, col chi2 V
pwcorr weight trunk headroom length price, obs sig
regress price weight trunk headroom length, beta
********end***********

Let me elaborate.

The sysuse auto command loads auto.dta, one of the sample datasets that come with the Stata program.

The tab foreign does a frequency distribution--
==========
. tab foreign

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00
==========

I prefer the frequency distribution output that SPSS has. A user wrote a command, fre, that does this. From the Stata command line you can say findit fre and follow the link to install it (with one click). As an SPSS person you will probably also prefer this output. Here is what you get with that command:

===========
. fre foreign

foreign -- Car type
-----------------------------------------------------------------
                    |      Freq.    Percent      Valid       Cum.
--------------------+--------------------------------------------
Valid   0 Domestic  |         52      70.27      70.27      70.27
        1 Foreign   |         22      29.73      29.73     100.00
        Total       |         74     100.00     100.00
-----------------------------------------------------------------
===========

As an example of an independent t-test, you may want to know if mileage (mpg) is significantly different depending on whether the car is domestic (U.S.) or foreign (not U.S.). The ttest command gives you this:

===========
. ttest mpg, by(foreign)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
Domestic |      52    19.82692     .657777    4.743297    18.50638    21.14747
 Foreign |      22    24.77273     1.40951    6.611187    21.84149    27.70396
---------+--------------------------------------------------------------------
combined |      74     21.2973    .6725511    5.785503     19.9569    22.63769
---------+--------------------------------------------------------------------
    diff |           -4.945804    1.362162               -7.661225   -2.230384
------------------------------------------------------------------------------
    diff = mean(Domestic) - mean(Foreign)                         t =  -3.6308
Ho: diff = 0                                     degrees of freedom =       72

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0003         Pr(|T| > |t|) = 0.0005          Pr(T > t) = 0.9997
===========
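
If you want to reuse these numbers rather than read them off the table, ttest leaves them behind as stored results. The following is only a minimal sketch of that idea, using the same auto data; the display lines are illustrative and not part of the original program.

```stata
* Sketch: pull the t-test results out of r() after running ttest
sysuse auto, clear
ttest mpg, by(foreign)

* r(mu_1) and r(mu_2) are the two group means (Domestic, Foreign)
display "difference in means = " r(mu_1) - r(mu_2)

* r(t) is the t statistic, r(df_t) the degrees of freedom, r(p) the two-sided p-value
display "t = " r(t) ", df = " r(df_t) ", two-sided p = " r(p)
```

Typing return list right after ttest shows everything that is available.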

To do what SPSS calls a crosstabulation of two variables and get a chi-square test and Cramer's V, you use the next one-line command:
===========

. tab rep78 foreign, col chi2 V

+-------------------+
| Key               |
|-------------------|
|     frequency     |
| column percentage |
+-------------------+

    Repair |
    Record |       Car type
      1978 |  Domestic    Foreign |     Total
-----------+----------------------+----------
         1 |         2          0 |         2
           |      4.17       0.00 |      2.90
-----------+----------------------+----------
         2 |         8          0 |         8
           |     16.67       0.00 |     11.59
-----------+----------------------+----------
         3 |        27          3 |        30
           |     56.25      14.29 |     43.48
-----------+----------------------+----------
         4 |         9          9 |        18
           |     18.75      42.86 |     26.09
-----------+----------------------+----------
         5 |         2          9 |        11
           |      4.17      42.86 |     15.94
-----------+----------------------+----------
     Total |        48         21 |        69
           |    100.00     100.00 |    100.00

        Pearson chi2(4) =  27.2640   Pr = 0.000
             Cramér's V =   0.6286
==============

If you want a correlation matrix with the pairwise N and the level of significance you use the next line

==============
. pwcorr weight trunk headroom length price, obs sig

             |   weight    trunk headroom   length    price
-------------+---------------------------------------------
      weight |   1.0000
             |
             |       74
             |
       trunk |   0.6722   1.0000
             |   0.0000
             |       74       74
             |
    headroom |   0.4835   0.6620   1.0000
             |   0.0000   0.0000
             |       74       74       74
             |
      length |   0.9460   0.7266   0.5163   1.0000
             |   0.0000   0.0000   0.0000
             |       74       74       74       74
             |
       price |   0.5386   0.3143   0.1145   0.4318   1.0000
             |   0.0000   0.0064   0.3313   0.0001
             |       74       74       74       74       74
==============

If you want to do a simple multiple regression and get R-square, B's, beta's, etc.

==============
. regress price weight trunk headroom length, beta

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  4,    69) =   10.20
       Model |   236016580     4    59004145           Prob > F      =  0.0000
    Residual |   399048816    69  5783316.17           R-squared     =  0.3716
-------------+------------------------------           Adj R-squared =  0.3352
       Total |   635065396    73  8699525.97           Root MSE      =  2404.9

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
------------------------------------------------------------------------------

==============

All of these are very basic commands for a beginning course. Stata has menus where you can point and click, but you can see why many users don't bother with these. In my book on Stata I reproduce most of the sorts of commands you cover in your tutorials. The fact that you make these available at no charge for SPSS people is very nice of you.

There are some areas where SPSS has an advantage. People doing traditional ANOVA find SPSS easier to use, for example. As far as data management goes it is a mixed thing. I work with some complex datasets so the added power of Stata is important for data management. Michael Mitchell has a great book on data management (Stata Press). Stata does use the two step process of labeling variables and some find this awkward. The advantage is that the same value labels, once defined in step one, can be applied broadly to appropriate variables.
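
To make the two-step process concrete, here is a minimal sketch. The variables q1-q3, their values and the label name agree5 are made up for illustration; they do not come from any dataset mentioned in this thread.

```stata
* Sketch: define a set of value labels once, then attach it to several variables
* (q1-q3 and agree5 are hypothetical names used only for this example)
clear
input q1 q2 q3
1 5 3
2 4 4
5 1 2
end

* Step 1: define the value labels once
label define agree5 1 "Strongly disagree" 2 "Disagree" 3 "Neither" 4 "Agree" 5 "Strongly agree"

* Step 2: apply the same labels to every variable measured on that scale
label values q1 q2 q3 agree5

* Variable labels are a separate command from value labels
label variable q1 "Item 1 (hypothetical wording)"

tab q1
```

If the wording of a category changes later, only the single label define needs editing, and every variable that uses agree5 picks up the change.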

The extensibility of Stata by users is remarkable. Some of what you see on Statalist is the code they wrote and this can be complicated even though the command is simple. For example, a user wrote a command revrs. If I say revrs varlist (after installing the command the first time), Stata will reverse code each of the variables and reassign the value labels for them, then generate new variables with rev at the start of their names while keeping the original variables unchanged. Some of these user written commands are extremely powerful. Scott Long, also a sociologist, wrote a one line command that runs a Poisson regression, a negative binomial regression, a zero inflated Poisson regression, and a zero inflated Negative Binomial regression. The output includes the results for each of these and a table helping you decide which model fits the data best. This would not be of much use for a beginning student, but illustrates the power of the extensibility of Stata.
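
For comparison, reverse coding a single item by hand looks roughly like the sketch below. This is not the code inside revrs, only the basic idea, and it uses rep78 from auto.dta (coded 1 to 5) as a stand-in for an attitude item; the new variable name revrep78 is my own choice.

```stata
* Sketch: reverse code one 1-5 item by hand, leaving the original untouched
sysuse auto, clear

* For a scale running from 1 to 5, (min + max) - value = 6 - value flips it;
* missing values stay missing because 6 - . is still .
generate revrep78 = 6 - rep78
label variable revrep78 "Repair Record 1978 (reversed)"

* Check the recode against the original before using it
tab rep78 revrep78, missing
```

The crosstab at the end is the safeguard: every original value should line up with exactly one reversed value, and the original variable is still there if something went wrong.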

Michael mentioned the price difference and it is really dramatic. When you buy (not lease) Stata you get everything. The price is not an annual fee.

Many people still use SPSS and I hope IBM invests enough to make it a more competitive product for social science researchers. I'm concerned that their primary interest may be in the predictive analytics applications for marketing researchers, but I hope this is a mistaken concern.


--alan
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq

* http://www.ats.ucla.edu/stat/stata/



