Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Re: Create Variable of Groupings
From: "Joseph Coveney" <[email protected]>
To: <[email protected]>
Subject: st: Re: Create Variable of Groupings
Date: Sat, 2 Nov 2013 17:38:36 +0900
Lisa Wang wrote:
I want to create a new variable that groups my observations so I can
do something like a panel analysis.
I have variables: identifiers date amount id3. id3 is a concatenation
of identifiers and date.
For instance,
identifiers | date      | amount | id3
1007        | 17aug2006 | 10     | 1007 17030
1007        | 17aug2006 | 7      | 1007 17030
1007        | 17aug2006 | 8.5    | 1007 17030
2049        | 26may2009 | 10     | 2049 18043
2049        | 26may2009 | 7      | 2049 18043
2049        | 12mar2007 | 7      | 2049 17237
2049        | 12mar2007 | 7      | 2049 17237
2049        | 12mar2007 | 7      | 2049 17237
I would like it to output event_id = 1 for 1007 17030, 2 for 2049
18043, 3 for 2049 17237, and so on down the page.
But at this point it seems to give me 2681 for 1007 17030, 5130 for
2049 18043 (i.e., it is not sequential).
I tried this:
- bysort id* date : gen event_id = _n - but that gives me numbering
WITHIN groups
and also tried:
- egen event_id = group(id3) - but it was not sequential. Do you think
I need to do a by or sort beforehand?
Thank you in advance for all your helpful suggestions as I am
currently stuck and can't proceed.
--------------------------------------------------------------------------------
See the line of code below, starting at "Begin here".
Joseph Coveney
. input long identifiers str9 date double amount str1 id3
identifiers date amount id3
1. 1007 17aug2006 10 1007 17030
2. 1007 17aug2006 7 1007 17030
3. 1007 17aug2006 8.5 1007 17030
4. 2049 26may2009 10 2049 18043
5. 2049 26may2009 7 2049 18043
6. 2049 12mar2007 7 2049 17237
7. 2049 12mar2007 7 2049 17237
8. 2049 12mar2007 7 2049 17237
9. end
. quietly replace id3 = string(identifiers) + ///
> " " + string(date(date, "DMY"))
.
. *
. * Begin here
. *
. generate long event_id = sum(id3 != id3[_n-1])
.
. list, noobs sepby(event_id)
  +-------------------------------------------------------+
  | identi~s        date   amount          id3   event_id |
  |-------------------------------------------------------|
  |     1007   17aug2006       10   1007 17030          1 |
  |     1007   17aug2006        7   1007 17030          1 |
  |     1007   17aug2006      8.5   1007 17030          1 |
  |-------------------------------------------------------|
  |     2049   26may2009       10   2049 18043          2 |
  |     2049   26may2009        7   2049 18043          2 |
  |-------------------------------------------------------|
  |     2049   12mar2007        7   2049 17237          3 |
  |     2049   12mar2007        7   2049 17237          3 |
  |     2049   12mar2007        7   2049 17237          3 |
  +-------------------------------------------------------+
.
. exit
end of do-file
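
A note on why the running sum works and when it does not. The expression id3 != id3[_n-1] evaluates to 1 on the first row of each run of identical id3 values (and on row 1, where id3[0] is treated as an empty string) and to 0 otherwise, so sum() counts the runs from the top of the data. That numbers events in order of appearance only if the rows belonging to one event are adjacent. The group() function of egen, by contrast, numbers groups by the sorted order of id3, which is why the first event on the page need not receive 1. What follows is a minimal sketch, not part of the original reply, for data in which the rows of one event may be scattered. The names obs_order, first_obs and event_seq are illustrative, and the counter is stored as long because a byte holds values only up to 100.

```stata
* Sketch only: number events by order of first appearance,
* even when rows of the same id3 are not contiguous.
* obs_order, first_obs and event_seq are hypothetical names.

* Remember the current row order so it can be restored afterwards.
generate long obs_order = _n

* Within each id3, record the row number at which it first appears.
bysort id3 (obs_order): generate long first_obs = obs_order[1]

* Bring each event's rows together, ordered by first appearance.
sort first_obs obs_order

* Running sum of "a new event starts on this row".
generate long event_seq = sum(first_obs != first_obs[_n-1])

* Restore the original row order and tidy up.
sort obs_order
drop first_obs obs_order
```

If the data are already arranged so that each event's rows sit together, as in the example listing, the one-line running sum in the log is enough and the extra sorting is unnecessary.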