Re: st: reading data with missing obs
On Feb 8, 2009, at 3:14 AM, Glen Waddell wrote:
> I am trying to read in data from multiple xml files (although this
> issue is not specific to this format, I believe) in which some of
> the sheets within some of the files have variable names in the
> first row and nothing else. That is, all variables are missing in
> some of the sheets within some of the files.
> In reading in these files, Stata only appears to read in the first
> variable... not continuing to subsequent columns. If I had but a
> few files I would brute force this by filling the empty cells with
> some unique character, read them in and then drop the obs
> accordingly. However, I have many of these sheets to be read in,
> so automating it would be quite valuable.
I recently had a similar situation in which columns of data (from an
Excel spreadsheet saved in xml format) started with missing values,
and thus were not read in correctly with the basic -xmluse- command.
After some experimentation, I found I had to read the data as
"allstring" (an option of -xmluse-) and then -destring- the desired
series after Stata imported them as string variables in order for
missing values to be captured correctly. For example:
xmluse [file.xml], doctype(excel) cells([b2:z300]) allstring firstrow missing nocompress
destring [varlist], ignore([stray_text]) replace
Note that you will need to replace the items in [square brackets]
with those appropriate to your problem. Also note that this solution
is specific to xml formatted data, even if the problem (as you
suggest) is not.
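Since you mention wanting to automate this over many files, the same
two steps could be wrapped in a loop along the following lines. This
is only a rough sketch, not something I have run against your data:
it assumes the .xml files sit in the current working directory, that
each file holds a single sheet (the sheet() option of -xmluse- would
be needed otherwise), and that it is acceptable to -destring- every
variable at once. The saved file names are purely illustrative.

```stata
* Sketch only: loop the xmluse/destring steps over every .xml file
* in the current directory (assumed location; adjust as needed).
local files : dir "." files "*.xml"
foreach f of local files {
    * read all columns as strings, using the first row as variable names
    xmluse "`f'", doctype(excel) firstrow allstring missing clear
    * convert variables containing only numbers (or nothing) to numeric;
    * variables with other text are left as strings
    destring, replace
    * illustrative output name: same stub with a .dta extension
    local stub = subinstr("`f'", ".xml", "", .)
    save "`stub'", replace
}
```

Whether this also picks up the sheets that contain only a header row
is something you would need to check on one of your own files first.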
Hope this helps,
Mike