Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: How to fill in the missing data
From: Sergiy Radyakin <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: How to fill in the missing data
Date: Mon, 10 Jun 2013 01:27:03 -0400
Alexis, your approach risks carrying the weight of one patient over to
the next whenever the second patient's first measurement is missing
(your last line disregards id). So unless the first weight measurement
is known to be always present (and the example provided shows that it
is not), this method will produce incorrect results.
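A minimal sketch of a carry-forward that respects id, assuming the
variables are named id, age and weight as in the original post (the
-input- block only recreates that example):

```stata
clear
input id age weight
1 21 50.2
1 22 .
1 23 52.9
1 24 51.0
1 25 .
2 22 .
2 23 .
2 25 60.2
3 21 .
end

* copy weight, then carry the last observed value forward within each id
bysort id (age): gen weight2 = weight
* under -by-, weight2[_n-1] is missing for the first observation of each
* id, so nothing is carried over from the previous patient
by id: replace weight2 = weight2[_n-1] if missing(weight2)
list, sepby(id)
```

Because -replace- works through the observations in order, a run of
consecutive missing values is filled from the same earlier value.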
Wong, is each patient id-age combination unique in your data? Or do
you sometimes see the same patient twice within a year? (If so, be
careful even with the -sort- statement, since -sort- orders tied
observations arbitrarily.)
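One way to check this, sketched under the assumption that the data are
already in memory with variables id and age (the variable name dup is
made up for illustration):

```stata
* count how many id-age combinations occur more than once
duplicates report id age
* flag the repeated combinations so they can be inspected
duplicates tag id age, generate(dup)
list if dup > 0, sepby(id)
```

If duplicates exist, adding a further variable that orders them (for
example a visit date, if one is available) to the sort makes the order
reproducible.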
It sounds like interpolation is likely needed here, since the gaps
between observed values differ in length and weight probably changes
smoothly with age. That shouldn't be difficult.
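A possible sketch using -ipolate-, again assuming the example data from
the original post are in memory; the new variable name weight_ip is
made up for illustration, and linear interpolation is only one choice:

```stata
* linear interpolation of weight on age, separately for each patient
bysort id (age): ipolate weight age, generate(weight_ip)
list id age weight weight_ip, sepby(id)
```

For id 1 this gives 51.55 at age 22, halfway between 50.2 and 52.9.
Missing values before the first or after the last observed weight of a
patient stay missing unless the -epolate- option is added.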
Best, Sergiy
On Mon, Jun 10, 2013 at 1:01 AM, Alexis Penot <[email protected]> wrote:
> You can try this
> sort id age
> gen weight2 = weight
> replace weight2 = weight2[_n-1] if missing(weight2)
>
> Alexis
>
> On 10 June 2013 at 06:45, Ching Wong <[email protected]> wrote:
>
>> Hi,
>>
>> I have a dataset as follows:
>>
>> id age weight
>> 1 21 50.2
>> 1 22
>> 1 23 52.9
>> 1 24 51.0
>> 1 25
>> 2 22
>> 2 23
>> 2 25 60.2
>> 3 21
>>
>> And I would like to create a new variable "weight2" and fill in the
>> missing data based on the previous value
>>
>> My expected output value should be as follows:
>>
>> id age weight weight2
>> 1 21 50.2 50.2
>> 1 22 . 50.2
>> 1 23 52.9 52.9
>> 1 24 51.0 51.0
>> 1 25 . 51.0
>> 2 22 . .
>> 2 23 . .
>> 2 25 60.2 60.2
>> 3 21 . .
>>
>> I have tried the command below but that cannot produce what I expected.
>>
>> - bysort id (age): gen weight_hat = weight[_n-1]
>>
>> It is very obvious that command is missing something. So what will be
>> the correct command in this case?
>>
>> Cheers,
>>
>> Wong
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/