Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
st: shifting existing values to subsequent variables in a row
From
"Buzz Burhans" <[email protected]>
To
<[email protected]>
Subject
st: shifting existing values to subsequent variables in a row
Date
Sat, 7 May 2011 13:14:40 -0600
I have been given a several datasets with ~50K observations and 197
variables.
Of the 197 variables, 180 are obtained from 20 measurement occasions where 9
measurements were made. After the initial 17 variables describing the
subject (in my case individual cows), there are a series of 9 measurements
from the first measurement occasion, repeated in the same order for each
subsequent measurement occasion
Inspecting the data, it is clear that in a small but substantial proportion
of the observations/cases the last datum is not present for a given
measurement occasion, with the consequence that the value provided for all
the subsequent variables is shifted left by one variable. It is quite clear
from the content of the data in the remaining variables in a row ( a
mixture of text and
numbers) that the values are offset one column to the left, i.e. one
variable to the left.
I can fairly easily identify the "bad" observations with an -if- statement.
Such observations occur randomly and not usually consecutively. The
problem can occur in
multiple subsets of the data, as it appears to have occurred with the final
datum on different measurement occasions (but not every one, only in some
instances), so whatever procedure I come up with will need to be reusable
for application to data that is shifted after a later measurement occasion.
I am trying to come up with an easy way to shift all the remaining
values back to the correct variable. The 9 variables on each occasion all
have different names, though within a
measurement occasion the prefix is the same for all 9 measurements
(variables) attached to that occasion (i.e. day1this day1that day1other
day1more...)
The data look something like this for 2 observations where the first is
ok, the second with a absent missing value and subsequent values shifted
right::
var1 - var25 day1No8 day1last day2first day2second
day2third.....
___________|________|________|_______|___________|___________....
Ok values....| 100 Hot 25
120 UP
Bad values..| 100 25 120
UP
What I need is:
var1 - var25 day1No8 day1last day2first day2second
day2third.....
___________|________|________|_______|___________|___________....
Ok values....| 100 Hot 25
120 UP
shift values.| 100 . 25
120 UP
I would appreciate any ideas on how to accomplish this efficiently when
applied to multiple observations.
Buzz
Buzz Burhans, Ph.D.
Dairy-Tech Group
So. Albany, VT / Twin Falls ID
Cell: 208-320-0829
ID Fax: 208-735-1289
VT Fax: 802-755-6842
Email: [email protected]
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/