Re: st: Need help with transforming dataset, please

From	Nick Cox <[email protected]>
To	[email protected]
Subject	Re: st: Need help with transforming dataset, please
Date	Tue, 26 Feb 2013 08:12:26 +0000

Here is another way to do it, this time in place.

qui foreach s in Math English Geog French {
    gen `s' = .
    forval j = 1/4 {
        replace `s' = Grade`j' if Subject`j' == upper("`s'")
    }
}

drop Subject* Grade*
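
For anyone who wants to run this end to end, here is a minimal sketch that types in the four example students from the original question and then applies the loop. The storage types (str1, str7) and the use of "" and . for the empty fourth slot are my assumptions about how the data are held; the assert lines only check two values against the listing shown further down the thread.

```stata
* Sketch under assumptions: the example data typed in by hand
clear
input StudId str1 Gender str7 Subject1 Grade1 str7 Subject2 Grade2 str7 Subject3 Grade3 str7 Subject4 Grade4
121 F MATH    3 ENGLISH 4 GEOG    5 FRENCH 5
122 F ENGLISH 4 GEOG    5 MATH    5 ""     .
123 M FRENCH  2 ENGLISH 3 MATH    1 GEOG   3
124 M GEOG    1 MATH    2 ENGLISH 2 ""     .
end

* One new variable per subject, filled from whichever slot holds that subject
qui foreach s in Math English Geog French {
    gen `s' = .
    forval j = 1/4 {
        replace `s' = Grade`j' if Subject`j' == upper("`s'")
    }
}

drop Subject* Grade*

* Spot checks against the listing in the thread
assert Math == 3 if StudId == 121
assert missing(French) if StudId == 122
list
```

The loop never moves observations around, which is why it can be described as working in place: each student stays on one row throughout.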

In a real example there might be subjects such as "Integrated Science"
with spaces or other characters that couldn't appear in variable
names. Replace all such characters by underscores, or remove them.
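
One way to do that replacement is the strtoname() function, which maps characters that are not allowed in names to underscores. The sketch below is only an illustration: "Integrated Science" and the two-slot toy data are made up, and the local macro v is just a helper name for this example.

```stata
* Sketch only: toy data with a made-up subject containing a space
clear
input StudId str18 Subject1 Grade1 str18 Subject2 Grade2
1 "INTEGRATED SCIENCE" 4 "MATH" 3
2 "MATH"               5 ""     .
end

qui foreach s in Math "Integrated Science" {
    * strtoname() turns "Integrated Science" into Integrated_Science
    local v = strtoname("`s'")
    gen `v' = .
    forval j = 1/2 {
        * match on the original subject text, store under the legal name
        replace `v' = Grade`j' if Subject`j' == upper("`s'")
    }
}

drop Subject* Grade*
list
```

With the two-reshape approach, the equivalent step would be something like replace Subject = strtoname(proper(Subject)) before the reshape wide, so that the values of Subject are legal as variable name suffixes.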

Nick

On Tue, Feb 26, 2013 at 7:52 AM, Nick Cox <[email protected]> wrote:
> Note that this case is documented, at least in terms of an aphorism or
> slogan, within
>
> http://www.stata.com/support/faqs/data-management/problems-with-reshape/
>
> The slogan is "You may need two reshapes to get to where you want to be".
>
> Nick
>
> On Tue, Feb 26, 2013 at 12:32 AM, Nick Cox <[email protected]> wrote:
>> . reshape long Subject Grade, i(StudId)
>> (note: j = 1 2 3 4)
>>
>> Data                               wide   ->   long
>> -----------------------------------------------------------------------------
>> Number of obs.                        4   ->      16
>> Number of variables                  10   ->       5
>> j variable (4 values)                     ->   _j
>> xij variables:
>>          Subject1 Subject2 ... Subject4   ->   Subject
>>                Grade1 Grade2 ... Grade4   ->   Grade
>> -----------------------------------------------------------------------------
>>
>> . drop if missing(Subject)
>> (2 observations deleted)
>>
>> . replace Subject = proper(Subject)
>> (14 real changes made)
>>
>> . drop _j
>>
>> . reshape wide Grade , string i(StudId) j(Subject)
>> (note: j = English French Geog Math)
>>
>> Data                               long   ->   wide
>> -----------------------------------------------------------------------------
>> Number of obs.                       14   ->       4
>> Number of variables                   4   ->       6
>> j variable (4 values)           Subject   ->   (dropped)
>> xij variables:
>>                                   Grade   ->   GradeEnglish GradeFrench ... GradeMath
>> -----------------------------------------------------------------------------
>>
>> . renpfix Grade
>>
>> . l
>>
>>      +--------------------------------------------------+
>>      | StudId   English   French   Geog   Math   Gender |
>>      |--------------------------------------------------|
>>   1. |    121         4        5      5      3        F |
>>   2. |    122         4        .      5      5        F |
>>   3. |    123         3        2      3      1        M |
>>   4. |    124         2        .      1      2        M |
>>      +--------------------------------------------------+
>>
>> Here's the code in one place:
>>
>> reshape long Subject Grade, i(StudId)
>> drop if missing(Subject)
>> replace Subject = proper(Subject)
>> drop _j
>> reshape wide Grade , string i(StudId) j(Subject)
>> renpfix Grade
>> l
>>
>> Nick
>>
>> On Tue, Feb 26, 2013 at 12:18 AM, Bidemi Carrol <[email protected]> wrote:
>>
>>> I would appreciate some help with transforming the dataset below.
>>> Reshape does not seem to work, but I may be applying it wrong.
>>>
>>> The original dataset is as follows (with many more grade and subject
>>> combinations):
>>>
>>> StudId  Gender  Subject1  Grade1  Subject2  Grade2  Subject3  Grade3  Subject4  Grade4
>>> 121     F       MATH      3       ENGLISH   4       GEOG      5       FRENCH    5
>>> 122     F       ENGLISH   4       GEOG      5       MATH      5
>>> 123     M       FRENCH    2       ENGLISH   3       MATH      1       GEOG      3
>>> 124     M       GEOG      1       MATH      2       ENGLISH   2
>>>
>>> I would like it to look like this:
>>>
>>> StudID  Gender  Math  English  French  Geog
>>> 121     F       3     4        5       5
>>> 122     F       5     4                5
>>> 123     M       1     3        2       3
>>> 124     M       2     2                1