st: RE: reshaping to wide format, and need to create a "j" variable
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: reshaping to wide format, and need to create a "j" variable
Date: Tue, 25 Oct 2011 22:15:15 +0100
I think this is backwards in two senses. For most purposes the long format you already have is the better data structure for later analyses, so your problem is the other way round: to -reshape long- the other files. Also, it is not at all clear that you will want to -merge- these; how would that be done, and on which variables? It sounds much more likely that you will want to -append-.
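A minimal sketch of that alternative, on the assumption that the other files are wide with numbered variables. The second dataset, its variable names (placement1, age1, ...) and the tempfile name otherlong are all hypothetical, since the original post does not show those files. It is meant to be run as a do-file, because it uses a local macro for the temporary file.

```stata
* hypothetical wide file: one row per child, numbered placement and age variables
clear
input id placement1 age1 placement2 age2
6 1  9 2 10
7 1 11 .  .
end

* turn it into one row per child per placement, with j = 1, 2 as the counter
reshape long placement age, i(id) j(j)
drop if missing(placement)    // discard slots that a child never filled

tempfile otherlong
save `otherlong'

* a fragment of the long data from the original question
clear
input id placement age
1 1 14
1 2 15
2 1 12
end
bysort id (placement) : gen j = _n

* stack the two long files; both now hold id, placement, age and j
append using `otherlong'
sort id j
list, sepby(id)
```

Whether -append- really is what is wanted depends on whether the other files hold more children or more variables for the same children, which the question does not say.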
All that said, what you need for a -reshape wide- is
bysort id (placement) : gen j = _n
whereas what you ask for is
bysort id (placement) : gen j = _N
which won't do the job at all. Your anonymous colleagues who think that -foreach- is required are assigned to suffer this tutorial:
SJ-2-1 pr0004 . . . . . . . . . . Speaking Stata: How to move step by: step
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q1/02 SJ 2(1):86--102 (no commands)
explains the use of the by varlist : construct to tackle
a variety of problems with group structure, ranging from
simple calculations for each of several groups to more
advanced manipulations that use the built-in _n and _N
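As a rough sketch of the difference, here is the example dataset from the question with both variables created side by side. The variable names date and age, and the name n for the _N version, are assumptions for illustration; the dates are read as strings because the post does not say whether they are day.month or month.day.

```stata
clear
input id placement str10 date age
1 1 "6.7.2009" 14
1 2 "8.2.2010" 15
1 3 "2.3.2011" 15
1 1 "3.4.2011" 15
2 1 "5.4.2009" 12
3 1 "4.6.2009" 13
3 2 "7.8.2010" 14
4 1 "4.5.2009" 10
4 2 "6.7.2009" 10
4 3 "5.2.2010" 11
4 2 "7.8.2010" 11
4 3 "9.9.2010" 12
4 1 "1.4.2011" 12
5 1 "7.8.2009" 13
5 2 "6.4.2010" 14
end

* _n is the observation number within each id: 1, 2, ... so it is unique within id
bysort id (placement) : gen j = _n

* _N is the number of observations for each id: constant within id,
* so it cannot identify individual rows and cannot serve as j
by id : gen n = _N

list id placement j n, sepby(id)

* n is constant within id, so -reshape- carries it along as a single variable
reshape wide placement date age, i(id) j(j)
```

Note that placement repeats within id here (id 1 has placement 1 twice), so the order among tied rows is not guaranteed. j is still unique within id, which is all -reshape wide- needs, but if chronological order matters, a numeric date variable in place of placement inside the parentheses would be one way to get it.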
But my main point is that I don't follow your diagnosis.
Nick
[email protected]
Kendra Lewis
I have data that tracks foster children's transitions and placements.
The data is in long format, such that children have the same ID
variable over time, but their ID variable appears in several different
rows. Each time their ID variable appears indicates a move or
transition. Here is an example dataset:
id  placement  date of placement  age at placement
1   1          6.7.2009           14
1   2          8.2.2010           15
1   3          2.3.2011           15
1   1          3.4.2011           15
2   1          5.4.2009           12
3   1          4.6.2009           13
3   2          7.8.2010           14
4   1          4.5.2009           10
4   2          6.7.2009           10
4   3          5.2.2010           11
4   2          7.8.2010           11
4   3          9.9.2010           12
4   1          1.4.2011           12
5   1          7.8.2009           13
5   2          6.4.2010           14
So, id #1 has had 4 placements, #2 has only had 1, #3 has had 2, and so
on. The data needs to be put into wide format to merge in other
datasets of a similar format. I know the necessary command will be
reshape wide stub, i(id) j(??)
where stub will be the varlist, as there is no common stub in the
dataset. What I need is a "j" variable that indicates a count of the
number of times the id number appears. For example, the count for id
#1 is 4, for id #2 is 1, for id #3 is 2, for id #4 is 6, and for id #5
is 2. Then I can use this for the reshape command to be my "j"
variable.
Does anyone have any suggestions? I've spoken to a few people and we
think it may be some sort of "foreach" loop command but we are not
sure.