Re: st: binary format type str question
Thanks David. That's what I think I'm doing and it works for data_label
and time_stamp, but it doesn't seem to work for the str types.
Here's an example. There are 6 variables with types 98, 136, 102, 105,
102, and 98. I read that as 6 str types with maximum lengths 98 bytes,
136 bytes, etc. There are 51 observations. But only 1071 bytes remain
for the data. That works out to 3.5 bytes per datum (1071 bytes over
6 x 51 = 306 values). There aren't enough bytes to go around if I assume
fixed lengths! On the other hand, if I try to start another variable as
soon as I hit a zero, I find there are multiple zeros in a row, which
would seem to indicate no data for some variables. Hmm. Clearly I'm
missing something.
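
The arithmetic above can be checked with a short sketch. The numbers are taken directly from the message; the variable names are made up for illustration, and the fixed-width total assumes, as the message does, that each type code is the byte width of a strN field.

```python
# Figures reported in the message above.
type_codes = [98, 136, 102, 105, 102, 98]
n_obs = 51
bytes_remaining = 1071

# Average number of bytes actually available per value.
n_values = len(type_codes) * n_obs
bytes_per_value = bytes_remaining / n_values

# Assumption: every type code is a strN width and every value is stored
# in a fixed-width field of that many bytes.
bytes_per_obs_fixed = sum(type_codes)
bytes_needed_fixed = bytes_per_obs_fixed * n_obs

print(n_values)            # 306
print(bytes_per_value)     # 3.5
print(bytes_needed_fixed)  # 32691
```

Under that assumption the data section would need 32691 bytes, about thirty times the 1071 bytes that are actually there, which is the mismatch described above.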
--Mark.
David Kantor wrote:
At 08:35 PM 3/12/2007, Mark Fisher wrote:
Hi. I'm writing a Mathematica program to read Stata "dta" files. I
have the "Stata help for dta" page, which is quite useful. Everything
seems to work fine as long as the data types are in the range 251 to
255 (byte, int, long, float, or double). But I can't figure out how to
properly read the data when the data types are in the range 1 to 244
(str1, str2, ... str244). BTW, I have no trouble reading the "char"
strings for the data_label and the time_stamp; I just read them in as
a list of bytes, discard the bytes starting with the first zero, and
convert the remaining bytes to ASCII. But the str types don't seem to be
the same sort of beast. Any guidance would be appreciated. Thanks.
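
The approach described for data_label and time_stamp (drop everything from the first zero byte onward, then decode as ASCII) might look like the following in Python. This is only an illustrative sketch; the function name `decode_char_field` and the sample bytes are hypothetical and do not come from any real file.

```python
def decode_char_field(raw: bytes) -> str:
    """Decode a zero-terminated "char" field read as raw bytes."""
    end = raw.find(b"\x00")
    if end != -1:
        # Discard the first zero byte and everything after it.
        raw = raw[:end]
    return raw.decode("ascii")


# Hypothetical 12-byte field: text, a terminator, then leftover bytes.
label = decode_char_field(b"My data\x00\x00abc")
print(label)  # My data
```

A field with no zero byte at all is decoded in full, since `bytes.find` returns -1 in that case.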
--Mark.
I'm not looking at the documentation for this, and I've never done any
work like that, but I do recall reading that the string types are stored
such that...
  - they have a 0-byte terminator if they are shorter than the maximal
    length of the type;
  - they have no terminator otherwise -- if they fill up the maximal
    length.
Thus, you need the type's nominal (maximal) length as a factor in
reading the values.
For example, if the type is str20, then the values have a 0-byte
terminator if they are shorter than 20, and no terminator if they are 20
characters long.
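
A minimal sketch of that rule, assuming (as described above, and not checked against the format documentation) that each strN value occupies exactly N bytes and is cut at the first 0 byte when it is shorter than N. The function name `read_str_values`, its parameters, and the sample buffer are hypothetical.

```python
from typing import List, Tuple


def read_str_values(buf: bytes, offset: int, widths: List[int]) -> Tuple[List[str], int]:
    """Read one fixed-width string value per entry in `widths`.

    Returns the decoded values and the offset just past the last field.
    """
    values = []
    for width in widths:
        field = buf[offset:offset + width]
        if len(field) < width:
            raise ValueError("buffer ends before the field is complete")
        # A value shorter than the nominal width ends at a 0 byte;
        # a value that fills the whole width has no terminator.
        end = field.find(b"\x00")
        if end != -1:
            field = field[:end]
        values.append(field.decode("ascii"))
        offset += width
    return values, offset


# Hypothetical record holding a str5 value followed by a str3 value.
record = b"ab\x00\x00\x00xyz"
values, next_offset = read_str_values(record, 0, [5, 3])
print(values)       # ['ab', 'xyz']
print(next_offset)  # 8
```

The reader always advances by the nominal width, never by the length of the decoded text, so runs of zero bytes after a short value are padding rather than empty variables.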
I hope this is correct and that it helps. Good luck.
--David
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/