Patrick:
Welcome to the list. I would do this as follows: if you know
the lowest and highest number your id variable can take, then
it is pretty simple to create a new file that contains
all integers between these numbers. Then you can merge that
file with your dataset, which will create the new cases, and
the _merge variable that is created by -merge- will tell you
which cases were added. See the example below.
HTH,
Maarten
*------------- begin example -----------
clear
set obs 30
gen mpg = _n + 11 /*I want to fill in all missing integers of mpg*/
list in 1/10
sort mpg
tempfile numbers /*this way the file `numbers' will only be available*/
save `numbers' /*during this do session, see: -help tempfile-*/
sysuse auto, clear
sort mpg
list mpg foreign in 1/10
merge mpg using `numbers'
tab _merge /*a case is added if _merge == 2, see: -help merge-*/
sort mpg
gen var1skippedvalue = _merge==2 /*this uses a logical expression:
var1skippedvalue equals 1 if the case was added and 0 if it was not*/
list mpg foreign var1skippedvalue in 1/10
*----------- end example ---------------
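The same idea can be sketched directly for the var1 described in the question. This is only an illustration under stated assumptions: the data are typed in by hand, the variable x is a hypothetical stand-in for "all other variables", the range 1 to 10 is assumed to be known in advance, and the -merge 1:1- syntax requires Stata 11 or later (the example above uses the older syntax). Because the file with the full range is the master here, added cases are those with _merge == 1 rather than _merge == 2.

```stata
* hypothetical data matching the question: var1 takes 1, 3, 5, 6, 7 and 9
clear
input var1 x
1 10
3 30
5 50
6 60
7 70
9 90
end
tempfile original
save `original'

* build a file with every integer in the assumed range 1 to 10
clear
set obs 10
gen var1 = _n

* master = full range, using = original data
merge 1:1 var1 using `original'

* _merge == 1: the value exists only in the full-range file, so it was added
gen byte flag = _merge == 1
drop _merge
sort var1
list
```

The added observations (var1 equal to 2, 4, 8 and 10) have flag equal to 1 and are missing on x; the original observations have flag equal to 0.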
-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------
-----Original Message-----
From: [email protected] [mailto:[email protected]]On Behalf Of Patrick Woodburn
Sent: woensdag 8 november 2006 10:44
To: [email protected]
Subject: st: Generating blank observations
Dear Statalist,
This is my first post to the list, and I hope it is clear enough. I
just have one question for now:
If I have an id variable called "var1" with a selection of unique values
in a given range of integers (e.g. the values 1, 3, 5, 6, 7, and 9), and I
want to create new observations which contain each missing value in that
range and are blank for all other variables (e.g. new observations
containing 2, 4, 8 and 10) and a new variable to flag that they have
been artificially generated, what do I do? Currently, all I can think
of is the rather roundabout way of doing it below, but I can't help but
think that surely there must be a more efficient method.
Best regards,
Patrick
*Code begins (dataset already open)
preserve
keep var1
drop if var1==.
bysort var1: assert _n==1
gen flag=0
gen id=1
reshape wide flag, i(id) j(var1)
forvalues i=1/10 {
cap gen flag`i'=1
}
reshape long flag, i(id) j(var1)
drop id
keep if flag==1
save var1skippedvalues
restore
append using var1skippedvalues
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/