Re: st: shifting existing values to subsequent variables in a row
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: shifting existing values to subsequent variables in a row
Date: Sun, 8 May 2011 08:31:10 +0100
I'd -reshape- such data to long. Then use -by:- and subscripting.
Nick
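
A minimal sketch of the reshape-then-subscript idea, on toy data modelled on the example quoted below. It is one possible reading of the advice, not necessarily the code Nick had in mind. Assumptions, none of which come from the original post: an identifier variable id, a hypothetical variable badpos holding the column position of the missing datum (in practice set from the poster's own -if- condition), and all measurement variables already held as strings (real data mixing numeric and string variables would need -tostring- first and -destring- afterwards).

```stata
* Toy data: id 1 is fine; id 2 lacks day1last, so later values sit one column left
clear
input id str8 day1No8 str8 day1last str8 day2first str8 day2second str8 day2third
1 "100" "Hot" "25"  "120" "UP"
2 "100" "25"  "120" "UP"  ""
end

* Hypothetical flag: position (within the measurement block) of the missing datum
gen badpos = 2 if id == 2

* Give the measurement variables a common stub, remembering the original names
local k = 0
foreach v of varlist day1No8-day2third {
    local ++k
    local name`k' `v'
    rename `v' val`k'
}

* One row per (id, position)
reshape long val, i(id) j(pos)

* Build a new variable rather than -replace-ing val in place: -replace- works
* observation by observation, so val[_n-1] would already have been overwritten
bysort id (pos): gen fixed = val
by id (pos): replace fixed = val[_n-1] if pos > badpos & !missing(badpos)
by id (pos): replace fixed = "" if pos == badpos

* Back to wide layout and the original names
drop val
reshape wide fixed, i(id) j(pos)
forvalues j = 1/`k' {
    rename fixed`j' `name`j''
}

* Checks on the toy data
assert day1last == "Hot" & day2first == "25" if id == 1
assert day1last == "" & day2first == "25" & day2third == "UP" if id == 2
```

Because badpos can differ from one observation to the next, the same code handles a shift that starts after any measurement occasion. Note that a value in the very last position of a shifted row has nowhere to go and would be dropped; in the toy data that position is empty.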
On Sat, May 7, 2011 at 8:14 PM, Buzz Burhans <[email protected]> wrote:
> I have been given several datasets with ~50K observations and 197
> variables.
>
> Of the 197 variables, 180 are obtained from 20 measurement occasions where 9
> measurements were made. After the initial 17 variables describing the
> subject (in my case individual cows), there are a series of 9 measurements
> from the first measurement occasion, repeated in the same order for each
> subsequent measurement occasion.
>
> Inspecting the data, it is clear that in a small but substantial proportion
> of the observations/cases the last datum is not present for a given
> measurement occasion, with the consequence that the value provided for all
> the subsequent variables is shifted left by one variable. It is quite clear
> from the content of the remaining variables in a row (a mixture of text
> and numbers) that the values are offset one column, i.e. one variable,
> to the left.
>
> I can fairly easily identify the "bad" observations with an -if- statement.
> Such observations occur randomly and not usually consecutively. The
> problem can occur in multiple subsets of the data, as the final datum
> appears to be missing on different measurement occasions (only in some
> instances, not on every occasion), so whatever procedure I come up with
> will need to be reusable for data that are shifted after a later
> measurement occasion. I am trying to come up with an easy way to shift
> all the remaining values back to the correct variable. The 9 variables on
> each occasion all have different names, though within a measurement
> occasion the prefix is the same for all 9 measurements (variables)
> attached to that occasion (i.e. day1this day1that day1other day1more...)
>
> The data look something like this for 2 observations, where the first is
> ok and the second has a missing value with subsequent values shifted
> left:
>
> var1 - var25 | day1No8 | day1last | day2first | day2second | day2third ...
> _____________|_________|__________|___________|____________|______________
> Ok values    | 100     | Hot      | 25        | 120        | UP
> Bad values   | 100     | 25       | 120       | UP         |
>
> What I need is:
>
> var1 - var25 | day1No8 | day1last | day2first | day2second | day2third ...
> _____________|_________|__________|___________|____________|______________
> Ok values    | 100     | Hot      | 25        | 120        | UP
> Shift values | 100     | .        | 25        | 120        | UP
>
> I would appreciate any ideas on how to accomplish this efficiently when
> applied to multiple observations.
>