Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: rectangulizing data
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: rectangulizing data
Date: Thu, 26 May 2011 12:58:22 -0400
Dmitriy Krichevskiy <[email protected]>:
Nor is dropping cases harmless; there is some discussion at
http://www.urban.org/publications/411971.html
and slides 12-14 of
http://www-personal.umich.edu/~nicholsa/an_dds.pdf
On Thu, May 26, 2011 at 12:52 PM, Dmitriy Krichevskiy
<[email protected]> wrote:
> Thank you for your responses; I apologize for the confusion.
>
> Clarification then,
>
> The data come from the Survey of Income and Program Participation (SIPP),
> and my particular dataset combines 7 years of data. The data are
> collected quarterly and recorded monthly (via phone interviews). Hence
> time=14 is the second month of the second year. Many people in this
> sample miss interviews often, and income exhibits a lot of volatility
> (I still do not know why). My goal is to analyze income transitions
> from quintile to quintile (via -xttrans-), and for annual income I need
> to aggregate monthly income while distinguishing zero income from
> missing income. Hence, I am trying to drop people who have only a
> few months of income on record, for those years where their information
> is incomplete, while keeping the same people for other years in which
> they have all the income information recorded. Given the very large
> volatility and the many missed interviews, I am not sure imputing
> income is harmless.
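
The aggregation described above might be sketched as follows. This is an illustration under stated assumptions, not the poster's actual code: it assumes the monthly data are in memory with variables ID, Month, and Income as in the example at the bottom of the thread, one row per ID-Month, Month counted from the start of the panel, and Income missing (.) where an interview was missed. The names year, annual_income, nmonths, and quintile are made up for the sketch, and the quintiles are computed on the pooled distribution, which is only one possible choice.

```stata
* Assumption: Month 1-12 is panel year 1, 13-24 is panel year 2, and so on
gen year = ceil(Month/12)

* (sum) treats missing months as zero, so also keep the count of
* nonmissing months to tell a true zero from an incomplete year
collapse (sum) annual_income = Income (count) nmonths = Income, by(ID year)

* Blank out person-years with fewer than 12 reported months
replace annual_income = . if nmonths < 12

* Pooled quintiles of annual income (assumption: not year-specific)
xtile quintile = annual_income, nq(5)

* Declare the panel and tabulate quintile-to-quintile transitions
xtset ID year
xttrans quintile, freq
```

Keeping the incomplete person-years as missing rather than dropping them leaves the panel structure visible to -xtset-, so -xttrans- only counts transitions between adjacent years that both have complete income.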
>
> On 5/26/11, Nick Cox <[email protected]> wrote:
>> I think this might need to be
>>
>> bysort ID year: egen obs = count(month)
>>
>> -- perhaps after some work --
>>
>> but as is agreed the example is unclear.
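
A minimal sketch of the per-person-year count, using the example data from the original post. The year variable is not in that data; deriving it as ceil(Month/12) is an assumption based on the later clarification that time=14 is the second month of the second year. Counting Income rather than Month means a month with a missing income does not count as reported, while a month with zero income does. The name nmonths is illustrative.

```stata
clear
input ID Month Income
1  1 1000
1  2  500
1  3 1000
1 13    0
1 14    0
1 15    0
1 16    0
1 17  600
1 18 1000
1 19 1000
1 20 1000
1 21 1000
1 22 1000
1 23  660
1 24  800
1 25 1200
2  1 2400
2  2 2400
2  5 2600
end

* Stop if any person-month appears more than once
isid ID Month

* Assumption: Month runs from the start of the panel, 12 months per year
gen year = ceil(Month/12)

* Number of months with nonmissing income in each person-year
bysort ID year: egen nmonths = count(Income)

* Keep only complete person-years
keep if nmonths == 12
list ID year Month Income, sepby(ID year)
```

With this example only ID 1 in year 2 (months 13 to 24) is complete, so 12 observations remain; ID 1 is dropped for years 1 and 3 but kept for year 2.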
>>
>> On 26 May 2011, at 16:52, Oliver Jones <[email protected]>
>> wrote:
>>
>>> Hi,
>>> your example data structure is a bit confusing, since you have months
>>> greater than 12... I'll assume you have at most 12 months per person
>>> per year.
>>>
>>> Maybe this can help to drop people who have fewer than 12 observations
>>> for one particular year. Let's assume this year is 2006.
>>>
>>> bysort ID: egen obs = count(Month)
>>> drop if year == 2006 & obs < 12
>>>
>>> Does it work?
>>>
>>> Best
>>> Oliver
>>>
>>> Am 26.05.2011 17:19, schrieb Dmitriy Krichevskiy:
>>>> Dear Listers,
>>>> I am trying to figure out the simplest way to convert a large panel
>>>> dataset from monthly to annual income. The income is only reported
>>>> monthly, and I want to clean the data of anyone missing a month
>>>> in a particular year. I would like to drop observations for that
>>>> person-year only and keep that person if they are fully present in
>>>> some other year. Here is an equivalent data structure. As always,
>>>> thanks a lot for your help.
>>>> Dmitriy
>>>>
>>>> ID Month Income
>>>> 1 1 1000
>>>> 1 2 500
>>>> 1 3 1000
>>>> 1 13 0
>>>> 1 14 0
>>>> 1 15 0
>>>> 1 16 0
>>>> 1 17 600
>>>> 1 18 1000
>>>> 1 19 1000
>>>> 1 20 1000
>>>> 1 21 1000
>>>> 1 22 1000
>>>> 1 23 660
>>>> 1 24 800
>>>> 1 25 1200
>>>> 2 1 2400
>>>> 2 2 2400
>>>> 2 5 2600