Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: Problems with the reshape command
From: Syed Basher <[email protected]>
To: [email protected]
Subject: Re: st: RE: Problems with the reshape command
Date: Wed, 19 Jan 2011 13:52:19 -0800 (PST)
In my case, the long structure is at least as informative as the wide structure.
In fact, the wide structure produces numerous empty cells, which are visually
distracting. I guess I will leave the choice between the long and wide structures
to the end users in my office by supplying them with both. I appreciate
Nick's meddling on this matter.
Syed
----- Original Message ----
From: Nick Cox <[email protected]>
To: [email protected]
Sent: Wed, January 19, 2011 10:19:58 PM
Subject: Re: st: RE: Problems with the reshape command
Rebecca is clearly right in the sense that if you create a
sufficiently fine identifier, -reshape- will oblige. But what is
useful about the data structure created? It rules out as many analyses
as it allows because -line- is arbitrary and separates things you
might want to compare.
But even if this is what Syed is asking for, there is a deeper
question: With this structure, why -reshape- at all? Almost all
analysis questions are easier to answer with a long structure.
Nick
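To illustrate Nick's point about the long structure, here is a minimal sketch using the hypothetical data from the original post. The variable names n and mean_price are illustrative only and do not come from the thread. It shows that per-item summaries and per-port, per-item aggregates can be produced directly from the long data, with no -reshape- and no arbitrary -line- identifier.

```stata
* Hypothetical data from the original post (long structure)
clear
input port item price
3 4029 27.62
3 4029 15.47
1 1006 37.55
3 2045 15.18
1 2045 16.21
1 4061 92.79
2 8041 12.55
2 2011 89.68
3 7031 68.01
2 2011 13.13
end

* Summary statistics by item, straight from the long data
tabstat price, by(item) statistics(n mean min max)

* Number of prices and mean price for each port-item pair
* (n and mean_price are illustrative names, not from the thread)
collapse (count) n = price (mean) mean_price = price, by(port item)
list, sepby(port) noobs
```

Note that -collapse- replaces the data in memory, so work on a copy (for example with -preserve- and -restore-) if the original observations are still needed.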
On Wed, Jan 19, 2011 at 5:33 PM, POPE, REBECCA <[email protected]> wrote:
> Hi Syed,
> I'm a bit confused by your use of the term "cross-tab", but since you are using
> reshape, I'm going to assume you are just trying to get the prices for the
> different goods to become variables. If so, do you have some other
> identifying variable that you could use in your reshape command? If you have
> multiple prices for the same item at the same port, might the shipments be from
> different suppliers or have arrived on different dates? If so, you could use
> something like the following:
>
> . reshape wide price, i(port date) j(item)
>
> I'm guessing this won't give you exactly what you want because there will still
> be multiple lines per port (at least if your real data looks like the
> hypothetical data), but you'll have gotten around reshape's objections and can
> use other commands to consolidate after that. Other users might have more
> elegant solutions, but I hope this helps.
>
> If you don't have another logical ID variable to add to port, you can generate
> a fake one by doing the following:
>
> . by port, sort: generate line = _n
> . reshape wide price, i(port line) j(item)
>
>  port  line  pri~1006  pri~2011  pri~2045  pri~4029  pri~4061  pri~7031  pri~8041
> ---------------------------------------------------------------------------------
>     1     1         .         .         .         .     92.79         .         .
>     1     2     37.55         .         .         .         .         .         .
>     1     3         .         .     16.21         .         .         .         .
>     2     1         .         .         .         .         .         .     12.55
>     2     2         .     13.13         .         .         .         .         .
> ---------------------------------------------------------------------------------
>     2     3         .     89.68         .         .         .         .         .
>     3     1         .         .         .     27.62         .         .         .
>     3     2         .         .     15.18         .         .         .         .
>     3     3         .         .         .         .         .     68.01         .
>     3     4         .         .         .     15.47         .         .         .
>
>
> Regards,
> Rebecca
>
> Rebecca A. Pope
> Program Manager
> UAMS CCTR Health Services Research
> Fay W. Boozman College of Public Health
> Dept. of Health Policy and Management
>
> -----Original Message-----
> From: [email protected]
>[mailto:[email protected]] On Behalf Of Syed Basher
> Sent: Wednesday, January 19, 2011 10:49 AM
> To: [email protected]
> Subject: st: Problems with the reshape command
>
> Dear Statalist,
>
> I am using Stata 11.1. I have the following hypothetical data:
>
>      +----------------------+
>      | port   item    price |
>      |----------------------|
>   1. |    3   4029    27.62 |
>   2. |    3   4029    15.47 |
>   3. |    1   1006    37.55 |
>   4. |    3   2045    15.18 |
>   5. |    1   2045    16.21 |
>      |----------------------|
>   6. |    1   4061    92.79 |
>   7. |    2   8041    12.55 |
>   8. |    2   2011    89.68 |
>   9. |    3   7031    68.01 |
>  10. |    2   2011    13.13 |
>      +----------------------+
>
> I would like to reshape the data to wide format using:
> . reshape wide price, i(port) j(item)
>
> This is of course problematic in Stata since "item" is not unique within
> "port". Eventually I would like to obtain the following cross-tab (in wide
> format):
>
> port |    1006    2011    2045    4029    4061    7031    8041
> --------------------------------------------------------------
>    1 |   37.55           16.21           92.79
>    2 |           89.68                                   12.55
>    2 |           13.13
>    3 |                   15.18   27.62           68.01
>    3 |                           15.47
>
> I have been consulting Stata's FAQs on this issue
> (http://www.stata.com/support/faqs/data/reshape3.html) without much success.
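
One possible route to the layout requested in the original post is a variant of Rebecca's fake-identifier idea: number the repeated prices within each port-item pair rather than within each port. The sketch below is hedged accordingly. The variable name obs is illustrative and not from the thread, and ordering repeats by price is an assumption made only so that the result is reproducible (the requested table lists 89.68 before 13.13 for port 2).

```stata
* Hypothetical data from the original post
clear
input port item price
3 4029 27.62
3 4029 15.47
1 1006 37.55
3 2045 15.18
1 2045 16.21
1 4061 92.79
2 8041 12.55
2 2011 89.68
3 7031 68.01
2 2011 13.13
end

* Show why i(port) j(item) fails: some port-item pairs occur more than once
duplicates report port item

* Number the repeated prices within each port-item pair, ordered by price
* (obs is an illustrative name, not from the thread)
bysort port item (price): generate obs = _n

* One row per port and repeat number, one price variable per item
reshape wide price, i(port obs) j(item)
list, sepby(port) noobs
```

With these data the result has one row for port 1 and two rows each for ports 2 and 3, which is the shape of the cross-tab requested above. The obs identifier is still arbitrary, so Nick's caution about what such a structure is useful for applies here too.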
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/