Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: How to drop low frequency patterns from panel data
From: Kim Peeters <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: How to drop low frequency patterns from panel data
Date: Fri, 3 Feb 2012 05:46:59 -0800 (PST)
Thank you Nick!
----- Original Message -----
From: Nick Cox <[email protected]>
To: [email protected]
Cc:
Sent: Friday, February 3, 2012 12:29 PM
Subject: Re: st: How to drop low frequency patterns from panel data
Sounds more like
egen tag = tag(ID patternvar)
egen IDcount = total(tag), by(patternvar)
drop if IDcount < 20
For the kind of logic here, see if desired
SJ-8-4 dm0042 . . . . . . . . . . . . Speaking Stata: Distinct observations
(help distinct if installed) . . . . . . N. J. Cox and G. M. Longton
Q4/08 SJ 8(4):557--568
shows how to answer questions about distinct observations
from first principles; provides a convenience command
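To see the logic at work, here is a minimal sketch on a toy data set. The data values are invented for illustration, and the threshold is lowered from 20 to 2 so the effect is visible; ID and patternvar are the variable names used in the code above.

```stata
* Toy panel (invented values): persons 1 and 2 share pattern "011",
* person 3 is the only one with pattern "111"
clear
input ID year str3 patternvar
1 2001 "011"
1 2002 "011"
2 2001 "011"
2 2002 "011"
3 2000 "111"
3 2001 "111"
3 2002 "111"
end

* tag() marks exactly one row per distinct (ID, patternvar) combination
egen tag = tag(ID patternvar)

* summing the tags within each pattern counts persons, not rows
egen IDcount = total(tag), by(patternvar)

* illustrative threshold of 2 (the thread uses 20)
drop if IDcount < 2

list, sepby(ID)
```

Pattern "111" covers three rows but only one person, so person 3 is dropped, whereas a row count based on _N would have kept that pattern.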
On Fri, Feb 3, 2012 at 10:37 AM, Kim Peeters <[email protected]> wrote:
> Dear Nick,
>
> Thank you for your fast reply, and my apologies for not mentioning that -xtpatternvar- is a user-written command. Unfortunately, the solution that you suggest does not answer my question. I admit that my question was not clear. :-)
>
> Each person has multiple rows of data (one row for every year). The code that you suggest drops the patterns that occur fewer than twenty times in the entire data set, counting rows rather than persons. However, the solution I'm looking for should drop all persons who share a pattern if that pattern occurs fewer than twenty times (i.e. if fewer than twenty persons have that pattern).
>
> Thank you for your advice.
>
> Best regards,
> Kim
>
>
>
> ________________________________
> From: Nick Cox <[email protected]>
> To: [email protected]
> Sent: Friday, February 3, 2012 10:35 AM
> Subject: Re: st: How to drop low frequency patterns from panel data
>
> -xtpatternvar- is a user-written command from SSC. Please remember to
> explain where user-written programs you refer to come from.
>
> bysort pattern : drop if _N < 20
>
> is I think what you seek.
>
> Nick
>
> On Fri, Feb 3, 2012 at 9:22 AM, Kim Peeters <[email protected]> wrote:
>
>> I have an unbalanced panel data set. The yearly data span a period of almost twenty years. However, most subjects participated only in the last years of the study, as the panel patterns shown by -xtdescribe- confirm. While some patterns have a frequency above 1000, others occur only once. To improve the data quality, I would like to drop all patterns that occur fewer than twenty times.
>>
>> I have not been able to accomplish this. Thus far, I can only reproduce the -xtdescribe- output:
>> xtpatternvar, gen(pattern)
>> egen tag = tag(ID)
>> tabulate pattern if tag, sort
>>
>> Any advice on how to drop low frequency patterns from panel data?
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/