Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Syed Basher <syed.basher@yahoo.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Problems with the reshape command
Date: Wed, 19 Jan 2011 13:52:19 -0800 (PST)

In my case, the long structure is as informative as the wide structure, if
not more so. In fact, the wide structure gives me numerous empty cells,
which is visually uncomfortable. I guess I will leave the choice between the
long and wide structures to the end users in my office by supplying them
with both. I appreciate Nick's meddling on this matter.

Syed

----- Original Message ----
From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Sent: Wed, January 19, 2011 10:19:58 PM
Subject: Re: st: RE: Problems with the reshape command

Rebecca is clearly right in the sense that if you create a sufficiently
fine identifier, -reshape- will oblige. But what is useful about the data
structure created? It rules out as many analyses as it allows because
-line- is arbitrary and separates things you might want to compare.

But even if this is what Syed is asking for, there is a deeper question:
With this structure, why -reshape- at all? Almost all analysis questions
are easier to answer with a long structure.

Nick

On Wed, Jan 19, 2011 at 5:33 PM, POPE, REBECCA <RPOPE@uams.edu> wrote:
> Hi Syed,
>
> I'm a bit confused by your use of the term "cross-tab", but since you are
> using reshape, I'm going to assume you are just trying to get the prices
> for the different goods to become variables. If so, do you have some other
> additional identifying variable that you could use in your reshape command?
> If you have multiple prices for the same item at the same port, might the
> shipments be from different suppliers or have arrived on different dates?
> If so, you could use something like the following:
>
> . reshape wide price, i(port date) j(item)
>
> I'm guessing this won't give you exactly what you want because there will
> still be multiple lines per port (at least if your real data looks like the
> hypothetical data), but you'll have gotten around reshape's objections and
> can use other commands to consolidate after that.
> Other users might have more elegant solutions, but I hope this helps.
>
> If you don't have another logical ID variable to add to port, you can
> generate a fake one by doing the following:
>
> . by port, sort: generate line = _n
> . reshape wide price, i(port line) j(item)
>
>   port  line  pri~1006  pri~2011  pri~2045  pri~4029  pri~4061  pri~7031  pri~8041
> ----------------------------------------------------------------------------------
>      1     1         .         .         .         .     92.79         .         .
>      1     2     37.55         .         .         .         .         .         .
>      1     3         .         .     16.21         .         .         .         .
>      2     1         .         .         .         .         .         .     12.55
>      2     2         .     13.13         .         .         .         .         .
> ----------------------------------------------------------------------------------
>      2     3         .     89.68         .         .         .         .         .
>      3     1         .         .         .     27.62         .         .         .
>      3     2         .         .     15.18         .         .         .         .
>      3     3         .         .         .         .         .     68.01         .
>      3     4         .         .         .     15.47         .         .         .
>
> Regards,
> Rebecca
>
> Rebecca A. Pope
> Program Manager
> UAMS CCTR Health Services Research
> Fay W. Boozman College of Public Health
> Dept. of Health Policy and Management
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Syed Basher
> Sent: Wednesday, January 19, 2011 10:49 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Problems with the reshape command
>
> Dear Statalist,
>
> I am using Stata 11.1. I have the following hypothetical data:
>
>      +---------------------+
>      | port   item   price |
>      |---------------------|
>   1. |    3   4029   27.62 |
>   2. |    3   4029   15.47 |
>   3. |    1   1006   37.55 |
>   4. |    3   2045   15.18 |
>   5. |    1   2045   16.21 |
>      |---------------------|
>   6. |    1   4061   92.79 |
>   7. |    2   8041   12.55 |
>   8. |    2   2011   89.68 |
>   9. |    3   7031   68.01 |
>  10. |    2   2011   13.13 |
>      +---------------------+
>
> I would like to reshape the data to wide format using:
>
> . reshape wide price, i(port) j(item)
>
> This is of course problematic in Stata since "item" is not unique within
> "port".
> Eventually I would like to obtain the following cross-tab (in wide
> format):
>
> port |   1006   2011   2045   4029   4061   7031   8041
> -----+-------------------------------------------------
>    1 |  37.55         16.21         92.79
>    2 |         89.68                              12.55
>    2 |         13.13
>    3 |                15.18  27.62         68.01
>    3 |                       15.47
>
> I have been consulting Stata's FAQs on this issue
> (http://www.stata.com/support/faqs/data/reshape3.html) without much success.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
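
The cross-tab requested in the original post has one row per port, plus an extra row only where an item repeats within a port. A minimal sketch of one way to get that layout, not something proposed by any poster in the thread: number the observations within each port-item pair rather than within port alone. The variable name obs is an assumption introduced here to keep the original order; line is borrowed from Rebecca's message.

```stata
* Hypothetical data from the original post
clear
input byte port int item float price
3 4029 27.62
3 4029 15.47
1 1006 37.55
3 2045 15.18
1 2045 16.21
1 4061 92.79
2 8041 12.55
2 2011 89.68
3 7031 68.01
2 2011 13.13
end

* Record the original order so repeated prices are numbered reproducibly
* (obs is an assumed helper name, not part of the original data)
generate long obs = _n

* Number repeats of each (port, item) pair: 1 for the first price, 2 for the next
bysort port item (obs): generate line = _n
drop obs

* item is now unique within (port, line), so reshape no longer objects
reshape wide price, i(port line) j(item)
list, sepby(port) noobs
```

With the ten observations above this should leave five rows: one for port 1 and two each for ports 2 and 3. Nick's caution still applies, since line only records the order in which repeated prices happened to appear and carries no meaning of its own.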