Dear Joseph,
Thanks for your thoughts on this. Here are some of mine (perhaps a
bit late)...
> Does anyone on the List know of a publicly accessible
> (ideally, uncontrived) example of a complex data file that
> illustrates the advantages of SAS over SPSS, Stata, and other
> packages for reading these? If so, could you please post the
> URL? (Or the URL of a description of what such a data file
> would look like--perhaps something like an anecdote or case
> study illustrating the power of the DATA step with a
> particularly nasty example that some SAS user encountered.)
I have had clients come in with data files that contained, say, visit
data where each subject had a different number of visits. The data were
stored in a "WIDE" format: each record would contain a variable giving
the number of visits for variable A, followed by that many values of A,
then a variable giving the number of visits for variable B, followed by
that many values of B, and so forth. Yes, such a file could be read in
Stata using the "file" command, but this would not be as easy as reading
it in a DATA step in SAS.
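To make that layout concrete, here is a minimal sketch in Python (used
only as neutral illustration, not SAS or Stata code). The exact record
layout is an assumption: one whitespace-delimited line per subject, of
the form "id nA a1 ... nB b1 ...", with only two variables, A and B. The
names parse_wide_record and to_long are hypothetical.

```python
def parse_wide_record(line, variables=("A", "B")):
    """Parse one record of the ASSUMED form: id nA a1..a_nA nB b1..b_nB.

    Each variable is stored as a count followed by that many values,
    so the position of every field depends on the counts before it.
    """
    tokens = line.split()
    subject_id = tokens[0]
    pos = 1
    blocks = {}
    for name in variables:
        count = int(tokens[pos])           # number of visits for this variable
        pos += 1
        values = [float(t) for t in tokens[pos:pos + count]]
        if len(values) != count:
            raise ValueError(f"record for {subject_id} is too short")
        blocks[name] = values
        pos += count                       # skip past the values just read
    if pos != len(tokens):
        raise ValueError(f"record for {subject_id} has leftover fields")
    return subject_id, blocks


def to_long(lines):
    """Turn wide records into (id, variable, visit, value) rows."""
    rows = []
    for line in lines:
        if not line.strip():
            continue                       # ignore blank lines
        subject_id, blocks = parse_wide_record(line)
        for name, values in blocks.items():
            for visit, value in enumerate(values, start=1):
                rows.append((subject_id, name, visit, value))
    return rows
```

The point of the sketch is only that each count must be read before the
reader knows where the next field starts, which is the part that a
fixed-column dictionary cannot express.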
Also, we have had data files with an irregular structure containing
several different kinds of records. The variable that indicated the
type of record was not stored in the same column location each time
because, as in the example above, its location depended on the contents
of the preceding variables. Again, this irregular data structure would
be very hard to read in Stata.
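As a rough illustration of a record-type field whose position moves,
here is a small Python sketch. The layout is invented for the example
and is not one of the actual client files: a 2-column count, then that
many 3-column items, then a 1-column record-type code, then the rest of
the record. The name split_record and the widths are hypothetical.

```python
def split_record(line, count_width=2, item_width=3):
    """Locate the record-type code in an ASSUMED fixed-width layout.

    Layout (hypothetical): count | count * item | type code | payload.
    The type code's column depends on the count read at the start.
    """
    line = line.rstrip("\n")
    n_items = int(line[:count_width])
    items = []
    for i in range(n_items):
        start = count_width + i * item_width
        items.append(line[start:start + item_width].strip())
    type_pos = count_width + n_items * item_width   # computed, not fixed
    if type_pos >= len(line):
        raise ValueError("record ends before the record-type code")
    record_type = line[type_pos]
    payload = line[type_pos + 1:]
    return items, record_type, payload
```

Once the type code has been located this way, the remainder of the
record can be handed to whatever logic applies to that record type.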
> Hierarchical data files
> . . . It is harder to read in such files in SAS, however you
> have additional power while reading the files in SAS. . Stata
> is the weakest program in this respect, being hard to use
> (probably equivalent to SAS in difficulty) but not offering
> the kind of additional power that you get in SAS."
All three packages can read hierarchical files, but when the file does
not have a specific variable indicating the type of record, reading it
is rather tricky and takes quite a bit of extra effort in Stata, while
it takes not much extra effort in SAS.
I hope this clarifies things.
Best regards,
Michael Mitchell
UCLA ATS Statistical Consulting Group
http://www.ats.ucla.edu/stat/
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Joseph Coveney
> Sent: Friday, December 02, 2005 12:53 AM
> To: Statalist
> Subject: st: Reading very complex raw data files
>
> In Michael Mitchell's "Strategically using General Purpose Statistics
> Packages: A Look at Stata, SAS and SPSS," he alludes to the
> superior ability of SAS to read complex data files.
> Excerpting from Page 20 of the technical report,
>
> "Complex raw data files
> Some raw data files are stored in a very complex format,
> perhaps having varying numbers of variables. Without a doubt,
> SAS is the most powerful tool for reading these kinds of
> complex data files and is the very best tool for reading very
> complex raw data files.
> Hierarchical data files
> . . . It is harder to read in such files in SAS, however you
> have additional power while reading the files in SAS. . Stata
> is the weakest program in this respect, being hard to use
> (probably equivalent to SAS in difficulty) but not offering
> the kind of additional power that you get in SAS."
>
> Does anyone on the List know of a publicly accessible
> (ideally, uncontrived) example of a complex data file that
> illustrates the advantages of SAS over SPSS, Stata, and other
> packages for reading these? If so, could you please post the
> URL? (Or the URL of a description of what such a data file
> would look like--perhaps something like an anecdote or case
> study illustrating the power of the DATA step with a
> particularly nasty example that some SAS user encountered.)
>
> I couldn't locate anything pertinent via the customary search
> engines. I'm not referring to EBCDIC, XML (or even SAS 6.04
> dataset files, apparently, for that matter), but rather a
> file with a data organizational complexity that illustrates
> what Michael is talking about. Thank you.
>
> Joseph Coveney
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>