
st: RE: Correcting the original data set

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: Correcting the original data set
Date	Mon, 20 Jan 2003 15:45:33 -0000

Chau Tak Wai
> 
> Originally I have a dataset something like this:
> 
> id	x1	x2 	...
> 1	10	20	
> 2	15	14
> 3	17	17
> ..	..	..
> 
> where id is the identifier of that observation and x1, x2,... are
> variables we are going to use for analysis. However, this original set
> of data contain some errors needed to be corrected. Now I have another
> dataset that contains the error-corrections like this:
> 
> id	err_var	 	correct
> 1	x1		14
> 1	x2		18
> 2	x1		6
> ..	..		..
> 
> where err_var contains the variable name that have problems, and correct
> is the corrected value of that variable. I would like to seek advice on
> how I can merge the later data set with the first one and replace the
> wrong values of the first dataset.
> 

This is interesting. My first thought was to -reshape- 
the error-corrections data to -wide- and then -merge, update-, 
but it seems likely that the -reshape- would produce 
missing values for every variable that has no correction, 
and those would then clobber existing valid values. 

So my second thought is to -reshape- the existing data 
to -long- and then -merge, update- and then -reshape- 
back to -wide-. 
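
The logic of that second approach can be sketched outside Stata. 
What follows is plain Python, not Stata syntax, and only an 
illustration under stated assumptions: the data are the three rows 
from the question, and the helper names (to_long, to_wide, 
apply_corrections) are hypothetical. The point it shows is that in 
long form each correction matches exactly one (id, variable) cell, 
so cells without a correction are never touched. 

```python
# Illustrative data copied from the example in the question.
wide = [
    {"id": 1, "x1": 10, "x2": 20},
    {"id": 2, "x1": 15, "x2": 14},
    {"id": 3, "x1": 17, "x2": 17},
]
corrections = [(1, "x1", 14), (1, "x2", 18), (2, "x1", 6)]


def to_long(rows):
    """One entry per (id, variable) pair, analogous to reshaping to long."""
    return {(r["id"], k): v for r in rows for k, v in r.items() if k != "id"}


def to_wide(long_cells):
    """Back to one row per id, analogous to reshaping to wide."""
    out = {}
    for (obs_id, var), value in long_cells.items():
        out.setdefault(obs_id, {"id": obs_id})[var] = value
    return [out[obs_id] for obs_id in sorted(out)]


def apply_corrections(rows, fixes):
    """Overwrite only the cells named in fixes; leave all others alone."""
    long_cells = to_long(rows)
    for obs_id, var, value in fixes:
        if (obs_id, var) not in long_cells:
            # A correction that matches no existing cell is probably a typo.
            raise KeyError(f"no cell for id={obs_id}, variable={var}")
        long_cells[(obs_id, var)] = value
    return to_wide(long_cells)


corrected = apply_corrections(wide, corrections)
for row in corrected:
    print(row)
```

Raising an error on an unmatched correction is a design choice in 
this sketch, not something the original question asked for; it is 
the analogue of checking the merge results for unmatched records. 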

Another thought is to read in a text version of 
the error corrections data and convert it line 
by line into a .do file, so that, for example, 

1 x1 14 

becomes 

replace x1 = 14 if id == 1 

That should be a one-line program in your favourite 
scripting language. 
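
As a minimal sketch of that conversion, here in Python: the function 
name to_do_lines is hypothetical, and the code assumes the 
corrections file is whitespace-separated with exactly three fields 
per line (id, variable name, corrected value) and a numeric id. 
Lines that do not fit, such as a header row, are skipped. 

```python
def to_do_lines(text):
    """Turn 'id var value' lines into Stata -replace- commands."""
    commands = []
    for raw in text.splitlines():
        fields = raw.split()
        if len(fields) != 3:
            continue  # blank or malformed line
        obs_id, var, value = fields
        if not obs_id.isdigit():
            continue  # assumed header row or non-numeric id; skipped
        commands.append(f"replace {var} = {value} if id == {obs_id}")
    return commands


sample = "id err_var correct\n1 x1 14\n1 x2 18\n2 x1 6\n"
print("\n".join(to_do_lines(sample)))
```

The printed lines can be saved to a .do file and run after loading 
the original dataset. 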

Nick 
[email protected] 
