[email protected]
> This is a relatively simple question. I am appending data from the same
> survey for different years. The problem is that some variables appear as
> strings in some datasets and as bytes/numbers in others. Perhaps this is
> due to the translation from dbf to dta with Stat/Transfer in my specific
> case, and it may also be due to typos, e.g. ` instead of 1. When appending,
> only the original data are kept; the new data are lost.
> The following two mock datasets illustrate the situation:
> 1.dta:
> var1 [byte]
> 1
> 2
> 3
> and
> 2.dta:
> var1 [str1]
> `
> 1
> 2
>
> When appending the two (for instance, -append using 2.dta-), Stata warns
> you about this:
> (note: var1 is str1 in using data but will be byte now)
> and the result is:
> var1
> 1
> 2
> 3
> .
> .
> .
>
>
> So far I've managed with a set of -destring, replace force- commands on
> each separate file. But with a large number of files and variables, that
> becomes cumbersome. What I wanted to know is whether there is a way to
> tell Stata to append every variable in the most general format, that is,
> str*, when there is a problem like this.
No, there isn't, as far as I know.
One might ask "Why not?" and the best answer I can
think of grows from what you say here: you had
to use -destring, replace force- to coerce
at least some of the string variables. As you say,
there are problems in some observations in
treating these strings as numeric.
Stata in essence is not in the business of making
strong assumptions about what your data really
mean or really should be. It devolves all
responsibility for such decisions to you.
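
One way to take on that responsibility without -force- is to look at the offending values first and repair them explicitly. The following is only a minimal sketch: it rebuilds the mock 2.dta from the question in memory, and the rule that ` should have been 1 is an assumption taken from the example in the question, not something Stata can know.

```stata
* Rebuild the mock 2.dta from the question: var1 stored as str1
clear
input str1 var1
"x"
"1"
"2"
end
* char(96) is the backtick; it is set this way to avoid macro quoting problems
replace var1 = char(96) in 1

* Show the values that cannot be read as numbers
list var1 if missing(real(var1)) & var1 != ""

* Assumption: the backtick was a typo for 1, as suggested in the question
replace var1 = "1" if var1 == char(96)

* Without -force-, destring converts only if every value is numeric;
* otherwise it leaves var1 as a string and prints a note
destring var1, replace

* Stop with an error if the conversion did not happen
confirm numeric variable var1
```

Nothing is silently set to missing this way: any value not covered by an explicit -replace- keeps the variable a string, and -confirm- then halts the do-file.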
Also, while there is a good case for what you
suggest, Stata is not in the business
of making unilateral changes from numeric
to string or from string to numeric.
There is more discussion about numbers
and strings, together with one person's
attempt to explain Stata philosophy,
in Stata Journal 2(3), 314-329 (2002).
The paper might be of interest, although
I don't think it offers any quick and easy
alternatives to the problem here.
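
Although there is no single option for this, a loop can take some of the tedium out of the per-file work. The sketch below is one possible approach and not a built-in feature: it converts every numeric variable to string in each file before appending, so that nothing is lost to a type mismatch. The two mock files from the question are recreated as tempfiles; the macro names f1, f2 and all are hypothetical. Note that -tostring- writes numeric missing values as ".", and that deciding which variables should go back to numeric afterwards remains up to you.

```stata
* Mock 1.dta from the question: var1 stored as byte
clear
input byte var1
1
2
3
end
tempfile f1
save "`f1'"

* Mock 2.dta from the question: var1 stored as str1, with a backtick typo
clear
input str1 var1
"x"
"1"
"2"
end
replace var1 = char(96) in 1
tempfile f2
save "`f2'"

* Start from an empty dataset and add one file at a time
clear
tempfile all
save "`all'", emptyok

foreach f in "`f1'" "`f2'" {
    use "`f'", clear
    * Find the numeric variables in this file and make them strings
    quietly ds, has(type numeric)
    foreach v in `r(varlist)' {
        tostring `v', replace
    }
    * Everything collected so far is added below the current file,
    * so observations from later files come first in the result
    append using "`all'"
    save "`all'", replace
}

* The combined data: six observations, var1 as a string, typo preserved
list var1
```

Because every variable is a string at the moment of appending, the typo survives into the combined dataset, where it can be inspected and repaired once rather than file by file.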
Nick
[email protected]