Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Matei Frunzetti <matei.frunzetti@student.uni-tuebingen.de> |
To | statalist@hsphsun2.harvard.edu |
Subject | RE: st: RE: Unbalancing a panel data set with country pairs |
Date | Sat, 22 Jan 2011 13:57:27 +0100 |
Quoting Nick Cox <n.j.cox@durham.ac.uk>:
<sacrifice>

    . search identifier

points you to useful information. In this case,

    egen idnew = group(id1 id2), label

is one of the easiest solutions. There is also a miniature review of possibilities at

    SJ-7-4  dm0034  . .  Stata tip 52: Generating composite categorical variables
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q4/07   SJ 7(4):582--583                                 (no commands)
            tip on how to generate categorical variables using
            tostring and egen, group()

but the help to -egen- -- which I did refer you to earlier -- contains enough to solve this problem.

Nick
n.j.cox@durham.ac.uk

Matei Frunzetti

Thank you for your timely answer, and excuse me for not being clear enough: my main problem is that the country pairs have a single id for each country, so I need to introduce a new id whose values are unique identifiers for the respective country pairs, such that I can use by etc. For example:

    id1    id2    idnew   var1 ...
    cty1   cty2   pair1   234
    cty2   cty1   pair2   456

I am aware that this must seem pretty banal, but I simply cannot find a thread or help file that would address this problem. I probably just don't know where to look.

Quoting Nick Cox <n.j.cox@durham.ac.uk>:

I assume that "country pair" defines a panel identifier. Look at the help for -egen-, which gives you one way to start.

In terms of missings,

    egen missing_in_obs = rowmiss(varlist)
    egen missing_in_panel = total(missing_in_obs), by(panelid)
    drop if missing_in_panel

In terms of not enough years, you don't quite say what the time resolution of your data is. But assuming you have yearly observations, then

    bysort panelid : drop if _N < 17

drops the short panels. If it's half-years, quarters, months, etc., adjust as needed.

You need to plug in your own <varlist> and <panelid>.

The big concept you appear to be missing is that of working -by:-. There is a tutorial at

    SJ-2-1  pr0004  . . . . . . . . . .  Speaking Stata: How to move step by: step
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q1/02   SJ 2(1):86--102                                  (no commands)
            explains the use of the by varlist : construct to tackle
            a variety of problems with group structure, ranging from
            simple calculations for each of several groups to more
            advanced manipulations that use the built-in _n and _N

Nick
n.j.cox@durham.ac.uk

Matei Frunzetti

I'm working on a panel data set over 17 years. It's fairly unbalanced, and I need to drop all observations for country pairs that either lack full length (as in years) or have missing values in one of the independent variables. The problem is that I have to delete all observations of these country pairs for all years if one or more variables have a missing value or if the pair is one or more years short. I ran into a dead end trying to figure out how to include the rest of the observations of a "faulty" country pair in the drop command.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
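For readers landing on this thread, the two pieces of advice above can be combined into one short sequence. This is only a sketch under stated assumptions: id1 and id2 are the country identifiers from the example in the thread, while x1 and x2 are hypothetical stand-ins for the independent variables, and pairid, nmiss_obs and nmiss_pair are made-up names for the new variables. It also assumes yearly data with at most one observation per pair and year.

```stata
* Sketch only: x1 and x2 are hypothetical variable names; substitute your own.

* One identifier per ordered country pair
* (cty1-cty2 and cty2-cty1 get different values, as in the example above).
egen pairid = group(id1 id2), label

* Count missing values in each observation, then total them within each pair.
egen nmiss_obs = rowmiss(x1 x2)
bysort pairid : egen nmiss_pair = total(nmiss_obs)

* Drop every observation of any pair that has a missing value in any year.
drop if nmiss_pair > 0

* Drop every observation of any pair with fewer than 17 yearly observations.
bysort pairid : drop if _N < 17
```

Because both conditions are computed within pairid, a single faulty year removes the whole pair rather than just the offending observation. If the data are not yearly, the threshold of 17 would need to be adjusted, as noted in the thread.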