Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: making data duplicate in terms of several variables in case of a given variable taking identical values
From
Ekaterina Hertog <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: RE: making data duplicate in terms of several variables in case of a given variable taking identical values
Date
Thu, 08 Jul 2010 18:18:22 +0400
Dear Martin,
Sorry for the late reply. Thank you very much for your advice!
Because the same zipcode sometimes appears 5 or more times, with only one
of the (say) prefecture values missing, I made a slight modification
to your code, and then it worked very well.
When I ran
bys zipcode: gen byte prefvaries=prefecture[1]!=prefecture[_N]
the missing prefecture value was sometimes, for example, the 3rd one, while
the first and (say) 5th values were present and identical, so prefvaries did
not capture this variation.
I modified the code you suggested as follows
bys zipcode (prefecture): gen byte prefvaries=prefecture[1]!=prefecture[_N]
and it fixed the problem.
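To illustrate why the sort matters, here is a minimal sketch on made-up data (five rows for one zipcode, with the gap in the 3rd row). The variables obsno, unsorted_flag and sorted_flag are names assumed for this illustration only. Because the empty string sorts before any non-empty string, sorting by prefecture within zipcode moves the missing value into the first row, where the comparison against the last row picks it up.

```stata
clear
input str10(zipcode prefecture)
"0010029" "hokkaido"
"0010029" "hokkaido"
"0010029" ""
"0010029" "hokkaido"
"0010029" "hokkaido"
end

* obsno (hypothetical helper) pins the original row order within zipcode
gen long obsno = _n

* Original order: first and last rows agree, so the missing 3rd row is overlooked
bysort zipcode (obsno): gen byte unsorted_flag = prefecture[1] != prefecture[_N]

* Sorted by prefecture within zipcode: "" comes first, so first and last differ
bysort zipcode (prefecture): gen byte sorted_flag = prefecture[1] != prefecture[_N]

list, noobs sepby(zipcode)
```

The same trick does not carry over unchanged to the numeric variables, because numeric missing values sort last rather than first; the first-versus-last comparison still detects them after sorting, but the missing value then sits in the last row.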
Thanks a lot,
Katya
Martin Weiss wrote:
"I think that the only cases where prefecture, towncode and areacode vary
while zipcodes are identical are when prefecture, towncode and areacode
are sometimes missing and sometimes not, but I would like to check that
before I do the necessary replacements."
You have to check those conditions one by one:
***********
clear*
input str10(zipcode prefecture) int(towncode areacode)
"0010027" "hokkaido" 100 1100
"0010029" "hokkaido" 100 1100
"0010029" "" . .
"0010030" "hokkaido" 100 1100
"0200822" "iwate" 201 3201
"0200823" "" . .
"0200823" "iwate" 201 3201
"0200831" "iwate" 201 3201
end
compress
li, noo sepby(zipcode)
bys zipcode: gen byte prefvaries=prefecture[1]!=prefecture[_N]
by zipcode: gen byte townvaries=towncode[1]!=towncode[_N]
by zipcode: gen byte areavaries=areacode[1]!=areacode[_N]
by zipcode: egen missings=total(mi(prefecture,towncode, areacode))
by zipcode: gen byte onlysomemiss=missings!=_N & missings!=0
drop missings
//all conditions fulfilled?
gen byte complies=prefvaries+townvaries+areavaries+onlysomemiss==4
li, noo sepby(zipcode) ab(15)
***********
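Once the checks look right, the replacements themselves could be sketched roughly as follows. This is only one possible approach, not something taken from the thread: ismiss is a hypothetical helper variable, and the example data are a cut-down version of the listing above. Sorting on the missing indicator first puts a known value (if the zipcode has one) in the first row of each group for both string and numeric variables, and the assert stops the run if a zipcode carries two different non-missing values.

```stata
clear
input str10(zipcode prefecture) int(towncode areacode)
"0010029" "hokkaido" 100 1100
"0010029" "" . .
"0200823" "" . .
"0200823" "iwate" 201 3201
end

foreach v of varlist prefecture towncode areacode {
    * ismiss (hypothetical helper) flags missing values of the current variable
    gen byte ismiss = missing(`v')
    * Non-missing rows sort first; stop if they disagree within a zipcode
    bysort zipcode (ismiss `v'): assert `v' == `v'[1] if !ismiss
    * Copy the known value of the group into the rows where it is missing
    by zipcode: replace `v' = `v'[1] if ismiss
    drop ismiss
}

list, noobs sepby(zipcode)
```

A zipcode whose values are all missing is left untouched, since there is nothing to copy from.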
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Ekaterina Hertog
Sent: Monday, 5 July 2010 20:59
To: [email protected]
Subject: st: making data duplicate in terms of several variables in case of
a given variable taking identical values
Dear all,
I have some data which looks like this
zipcode   prefecture   towncode   areacode
0010027   hokkaido     100        1100
0010029   hokkaido     100        1100
0010029   .            .          .
0010030   hokkaido     100        1100
0200822   iwate        201        3201
0200823   .            .          .
0200823   iwate        201        3201
0200831   iwate        201        3201
I use Stata 11.
I would like to make my observations identical in terms of prefecture,
towncode and areacode when they are identical in terms of zipcode. I
think that the only cases where prefecture, towncode and areacode vary
while zipcodes are identical are when prefecture, towncode and areacode
are sometimes missing and sometimes not, but I would like to check that
before I do the necessary replacements.
I looked into duplicate commands, but did not seem to find a good
solution. I would be most grateful for any pointers.
Sincerely yours,
katya
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
--
Ekaterina Hertog (née Korobtseva)
Career Development Fellow
Department of Sociology and Nissan Institute of Japanese Studies
University of Oxford
27 Winchester Road
Oxford
OX2 6NA
United Kingdom