Date sent: Tue, 7 Jun 2005 11:29:28 +0100
Send reply to: [email protected]
> I have variables of the format in one file from dhs
> caseid1
> 1002 9
> 100211
> in a separate file I concatenate two variables and the format becomes
> caseid1 10029 100211 what I would like to do is to eliminate the
> blanks from the variables in each file using trim but cannot get it to
> work, and then to split the variable caseid1 into two variables to
> remove the blank in the middle and then rejoin the two variables so
> that I can merge
>
> the one file has details relating to household, one line per
> household, the second file has details relating to the household
> member so want the household details to be applied to all household
> members can anyone help
>
There's no need to split and concatenate the variables before merging. A very simple
way to remove the spaces is to take your first file, which contains the spaces in
the variable caseid1, and do something like...
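* strip every space from caseid1 (n = . means all occurrences)
gen str temp = subinstr(caseid1, " ","", .)
drop caseid1
rename temp caseid1
* file2 must already be sorted by caseid1 and saved
sort caseid1
merge caseid1 using file2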
This assumes that the storage type of caseid1 in the second file is string (which I
suspect it is, as you say you concatenated two variables together and -egen
caseid1 = concat(x y)- results in a string).
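If you want to check that assumption first, something along these lines should do it. This is only a sketch: file2 is the file name used in the merge above, and I am assuming caseid1 already exists in it.

```stata
* sketch: check and prepare the second file before merging
use file2, clear
describe caseid1
* exits with an error if caseid1 is not a string variable
confirm string variable caseid1
* the old-style -merge- needs the using file sorted by the key
sort caseid1
save file2, replace
```

If -confirm- complains, caseid1 is numeric in that file and would need converting to string before the merge can match.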
The 'trick' is to use subinstr(s1, s2, s3, n) (see -man strfun-), which replaces the
first n occurrences of string s2 in string s1 with string s3; specifying n as . (missing)
replaces all occurrences.
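To see what the last argument of subinstr() does, here is a small made-up example. The variable names s, s_first and s_all are for illustration only and are not in your data, and note that it clears whatever is in memory.

```stata
* illustration only: made-up data, not the DHS files
clear
set obs 1
gen s = "1002 9 1"
* n = 1: only the first space is removed
gen s_first = subinstr(s, " ", "", 1)
* n = . : every space is removed
gen s_all = subinstr(s, " ", "", .)
assert s_first == "10029 1"
assert s_all == "100291"
list
```

Using . for n is what you want here, since it does not matter how many blanks there are or where they fall in caseid1.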
HTH
Neil
Neil Shephard
Genetics Statistician
ARC Epidemiology Unit, University of Manchester
[email protected]
[email protected]
"If your result needs a statistician then you should design a better experiment" -
Ernest Rutherford
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/