On Thu, 14 Oct 2004, Daniel Egan wrote:
> Hello list,
>
> I have a huge dataset which is in EBCDIC that I need to get into
> Stata. I have been told that I can do this using SAS as an
> intermediary, translating from EBCDIC to ASCII.
We often use SAS for this, especially where the EBCDIC data includes
packed decimal or zoned decimal fields. In that case, SAS is the only
package we have available that can read them (Bill Gould, please take
note). A disadvantage of SAS is the 200-character limit on character
variables, which means you have to divide each record into chunks and
translate each chunk separately.
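
For readers without SAS, the two numeric encodings mentioned above can be decoded by hand. What follows is only a minimal Python sketch, assuming the usual IBM conventions (sign nibble 0xD or 0xB for negative, 0xC, 0xF, 0xA or 0xE for positive); the function names and the `scale` parameter (number of implied decimal places) are hypothetical, and the field offsets and lengths still have to come from your record layout.

```python
from decimal import Decimal

# Sign nibbles under the usual IBM convention (an assumption; check your layout).
NEGATIVE_SIGNS = {0x0B, 0x0D}
VALID_SIGNS = {0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F}


def _to_decimal(digits, sign_nibble, scale):
    """Combine decoded digits and a sign nibble into a Decimal."""
    if sign_nibble not in VALID_SIGNS:
        raise ValueError(f"invalid sign nibble: {sign_nibble:#x}")
    if any(d > 9 for d in digits):
        raise ValueError("non-decimal digit nibble in field")
    value = int("".join(str(d) for d in digits))
    if sign_nibble in NEGATIVE_SIGNS:
        value = -value
    # scale is the number of implied decimal places.
    return Decimal(value).scaleb(-scale)


def unpack_packed_decimal(raw, scale=0):
    """Decode a packed decimal field: two digits per byte, sign in the last nibble."""
    if not raw:
        raise ValueError("empty field")
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign_nibble = nibbles.pop()  # the final low nibble carries the sign
    return _to_decimal(nibbles, sign_nibble, scale)


def unpack_zoned_decimal(raw, scale=0):
    """Decode a zoned decimal field: one digit per byte, sign in the last zone nibble."""
    if not raw:
        raise ValueError("empty field")
    digits = [byte & 0x0F for byte in raw]  # low nibble of each byte is the digit
    sign_nibble = raw[-1] >> 4              # high nibble of the last byte is the sign
    return _to_decimal(digits, sign_nibble, scale)
```

For example, the three bytes 0x12 0x34 0x5C decode as packed decimal to 12345, or to 123.45 with `scale=2`. These fields must be decoded from the raw EBCDIC bytes, before any character translation touches them.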
> Does anyone have any experience with this, or can point me towards a
> generic translator? I just want to see if there is any easier way than
> SAS.
The dd command in Unix can do this conversion in most situations. The
chief problem we have noticed is that non-EBCDIC values in the input data
are dropped, with no placeholder substituted. If your EBCDIC dataset is
fixed-format and (for example) missing fields are padded with nulls (very
common), the records come out shorter and the dataset is unusable.
The command is:
    dd conv=ascii <ebcdic.in >ascii.out
The other conversions, conv=ebcdic and conv=ibm, go in the opposite
direction (ASCII to EBCDIC, using two slightly different tables), so
conv=ascii is the one to use here.
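
If the dropped-bytes problem described above bites, one alternative is to translate record by record and substitute a placeholder, so every record keeps its length. The following is only a sketch in Python, assuming the data uses EBCDIC code page 037 (Python's `cp037` codec; your file may use a different code page) and that the fields being translated are character data, not packed decimal. The function names, the `record_length` argument and the space placeholder are all hypothetical choices.

```python
def ebcdic_record_to_ascii(record, codepage="cp037", placeholder=" "):
    """Translate one EBCDIC record to ASCII text of the same length.

    Any byte that does not map to a printable ASCII character (nulls
    included) becomes the placeholder, so column positions are preserved.
    """
    text = record.decode(codepage)
    return "".join(ch if " " <= ch <= "~" else placeholder for ch in text)


def translate_records(data, record_length, codepage="cp037"):
    """Yield translated records from a buffer of fixed-length EBCDIC records."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length:
        raise ValueError("data is not a whole number of records")
    for start in range(0, len(data), record_length):
        yield ebcdic_record_to_ascii(data[start:start + record_length], codepage)
```

Reading the input with `open("ebcdic.in", "rb").read()` and writing one translated record per line gives a fixed-format ASCII file that Stata can read with a dictionary. Because nulls become spaces, a missing field stays the same width.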
Daniel Feenberg
feenberg isat nber dotte org
>
>
> Cheers, and thanks,
>
>
> Dan Egan