
Re: st: how does insheet determine datatypes?

From	Phil Schumm <[email protected]>
To	[email protected]
Subject	Re: st: how does insheet determine datatypes?
Date	Tue, 9 Jan 2007 13:46:56 -0600

On Jan 9, 2007, at 1:21 PM, David Kantor wrote:

-insheet-, at least up through Stata 8, behaves ungracefully if the second line contains variable labels (with the variable names in the first line), which is how some raw datasets are laid out. In that case, every variable is read as a string, usually a very long one. The variable names in the raw data are ignored, so you get the default names v1, v2, etc., and what were supposed to be the variable names and labels end up as data in the first and second observations.

In Stata 9, the -names- option causes -insheet- to handle the variable names properly:

--------- foo.raw ---------
firstvar,secondvar,thirdvar
"label for first var","and the second","and third"
1,2,3
4,5,6
7,8,9
---------------------------

. insheet using foo, names
(3 vars, 4 obs)

. cl

                  firstvar        secondvar    thirdvar
  1.   label for first var   and the second   and third
  2.                     1                2           3
  3.                     4                5           6
  4.                     7                8           9

Of course, the variable labels still need to be dealt with.
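
One way to deal with the labels, as a minimal sketch rather than a tested recipe: copy the first observation of each variable into its variable label, drop that observation, and convert the columns back to numeric. It assumes the foo.raw file shown above and that no label exceeds Stata's variable-label length limit.

```stata
* Read the file; the label row forces every column to be read as a string
insheet using foo, names clear

* Observation 1 holds the labels: attach each one to its variable
foreach v of varlist _all {
    label variable `v' `"`=`v'[1]'"'
}

* Remove the label row, then convert the remaining values back to numeric
drop in 1
destring _all, replace
```

-destring- leaves a variable as a string if any of its remaining values is non-numeric, so genuinely string columns are left as they are.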

If you encounter this situation, you may want to use -convert_top_lines-.

I just took a look at the code, and -convert_top_lines- could also benefit from a function to generate Stata names from strings (i.e., if a variable name in the first row of the data file is not a valid Stata name, -convert_top_lines- currently throws an error). You can of course trap this error as Nick pointed out, but you're then still left with the question of what to name the corresponding variable. It would be nice to do this in a way that was guaranteed to be consistent with the way -insheet- does it (in case you were making use of both on the same project).
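
As an illustration of the kind of function meant here, newer Stata releases provide strtoname(), which turns an arbitrary string into a legal Stata name. This is only a sketch: the header text is made up, the local macro names are hypothetical, and nothing here guarantees that the result matches the names -insheet- itself would generate.

```stata
* Hypothetical header text taken from the first row of a data file
local raw "2nd var (in %)"

* strtoname() replaces characters that are illegal in a name with _
* and prefixes _ when the string starts with a digit
local clean = strtoname("`raw'")
display "`clean'"
```

Two different headers can map to the same name, so a caller would still need to check for duplicates before renaming variables.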

-- Phil

