I have a large unbalanced panel dataset that I collected. The panel variable is userid (~30,000 users) and the time variable is month (22 months). Both userid and month are numeric variables. Not all users have observations for each of the 22 months, as some joined later in the study. My dataset is in .dta format.
I am trying to define lag variables but keep failing.
I tried:
sort userid month
tsset userid month
and alternatively
xtset userid month
Then whenever I try to define a lag variable for the variable totsol, I get missing values:
gen l1totsol = l1.totsol
(100080 missing values generated)
I thought there might be something wrong in my dataset and tried to use
webuse abdata
sort id year
tsset id year
(or xtset id year)
Then whenever I try to generate the lag variable for the variable n, I again get missing values:
gen l1n= l1.n
(140 missing values generated)
I also tried two other approaches, to no avail:
. by id: gen lag1=wage[_n-1]
(140 missing values generated)
. by id: gen lag1 = wage[_n-1] if year==year[_n-1]+1
(140 missing values generated)
This kept me up all day and all night yesterday.
Any help please?