Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: Need help with transforming dataset, please
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: Need help with transforming dataset, please
Date
Tue, 26 Feb 2013 08:12:26 +0000
Here is another way to do it, this time in place.
qui foreach s in Math English Geog French {
gen `s' = .
forval j = 1/4 {
replace `s' = Grade`j' if Subject`j' == upper("`s'")
}
}
drop Subject* Grade*
In a real example there might be subjects such as "Integrated Science"
with spaces or other characters that couldn't appear in variable
names. Replace all such characters by underscores, or remove them.
Nick
On Tue, Feb 26, 2013 at 7:52 AM, Nick Cox <[email protected]> wrote:
> Note that this case is documented, at least in terms of an aphorism or
> slogan, within
>
> http://www.stata.com/support/faqs/data-management/problems-with-reshape/
>
> The slogan is "You may need two reshapes to get to where you want to be".
>
> Nick
>
> On Tue, Feb 26, 2013 at 12:32 AM, Nick Cox <[email protected]> wrote:
>> . reshape long Subject Grade, i(StudId)
>> (note: j = 1 2 3 4)
>>
>> Data wide -> long
>> -----------------------------------------------------------------------------
>> Number of obs. 4 -> 16
>> Number of variables 10 -> 5
>> j variable (4 values) -> _j
>> xij variables:
>> Subject1 Subject2 ... Subject4 -> Subject
>> Grade1 Grade2 ... Grade4 -> Grade
>> -----------------------------------------------------------------------------
>>
>> . drop if missing(Subject)
>> (2 observations deleted)
>>
>> . replace Subject = proper(Subject)
>> (14 real changes made)
>>
>> . drop _j
>>
>> . reshape wide Grade , string i(StudId) j(Subject)
>> (note: j = English French Geog Math)
>>
>> Data long -> wide
>> -----------------------------------------------------------------------------
>> Number of obs. 14 -> 4
>> Number of variables 4 -> 6
>> j variable (4 values) Subject -> (dropped)
>> xij variables:
>> Grade -> GradeEnglish
>> GradeFrench ... GradeMath
>> -----------------------------------------------------------------------------
>>
>> . renpfix Grade
>>
>> . l
>>
>> +--------------------------------------------------+
>> | StudId English French Geog Math Gender |
>> |--------------------------------------------------|
>> 1. | 121 4 5 5 3 F |
>> 2. | 122 4 . 5 5 F |
>> 3. | 123 3 2 3 1 M |
>> 4. | 124 2 . 1 2 M |
>> +--------------------------------------------------+
>>
>> Here's the code in one
>>
>> reshape long Subject Grade, i(StudId)
>> drop if missing(Subject)
>> replace Subject = proper(Subject)
>> drop _j
>> reshape wide Grade , string i(StudId) j(Subject)
>> renpfix Grade
>> l
>>
>> Nick
>>
>> On Tue, Feb 26, 2013 at 12:18 AM, Bidemi Carrol <[email protected]> wrote:
>>
>>> I would appreciate some help with transforming the dataset below.
>>> Reshape does not seem to work, but I may be applying it wrong.
>>>
>>> The original dataset is as follows (with many more grade and subject
>>> combinations):
>>>
>>> StudId Gender Subject1 Grade1 Subject2 Grade2 Subject3 Grade3
>>> Subject4 Grade4
>>> 121 F MATH 3 ENGLISH 4 GEOG 5
>>> FRENCH 5
>>> 122 F ENGLISH 4 GEOG 5 MATH 5
>>> 123 M FRENCH 2 ENGLISH 3 MATH 1
>>> GEOG 3
>>> 124 M GEOG 1 MATH 2 ENGLISH 2
>>>
>>> I would like it to look like this:
>>>
>>> StudID Gender Math English French Geog
>>> 121 F 3 4 5 5
>>> 122 F 5 4 5
>>> 123 M 1 3 2 3
>>> 124 M 2 2 1
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/