Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: sxpose -not possible; would exceed present limit on number of variables
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: sxpose -not possible; would exceed present limit on number of variables
Date: Thu, 20 Feb 2014 00:50:54 +0000
Simplified code:
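* Period index: last character of names such as corpid1, begyr2
gen j = substr(v1, -1, 1) if v1 != "EntityID"
* Stub without the index: corpid, begyr, gvkey, endyr
gen which = subinstr(v1, j, "", 1) if v1 != "EntityID"
* Carry each EntityID down to the rows that follow it
gen EntityID = v2 if v1 == "EntityID"
replace EntityID = EntityID[_n-1] if missing(EntityID)
drop if v1 == "EntityID"
drop v1
* One row per EntityID and period
reshape wide v2, i(EntityID j) j(which) string
renpfix v2
* One row per year from begyr to endyr
expand endyr - begyr + 1
rename begyr year
bysort EntityID j : replace year = year[_n-1] + 1 if _n > 1
drop endyr j
list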
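
A hedged aside on the error itself: -sxpose- turns observations into variables, so transposing 13,458 observations would ask for roughly that many variables, which is presumably what trips the limit. If the real data are already wide, as the variable list in the question suggests (one row per EntityID, with numeric corpid1-endyr5), then -reshape long- may apply directly with no transposing at all. A minimal sketch under that assumption, using only the first two periods of the example row; pd is just an illustrative name for the period index:

```stata
* Sketch only: assumes wide data, one row per EntityID, numeric variables
clear
input EntityID corpid1 begyr1 gvkey1 endyr1 corpid2 begyr2 gvkey2 endyr2
100091 8101 1961 1000 1970 8091 1971 1000 1973
end

* One row per EntityID and period; pd holds the trailing index 1, 2, ...
reshape long corpid begyr gvkey endyr, i(EntityID) j(pd)

* Periods that an entity does not use would be missing after the reshape
drop if missing(begyr, endyr)

* One row per year within each period
expand endyr - begyr + 1
bysort EntityID pd : gen year = begyr + _n - 1

drop begyr endyr pd
sort EntityID year
list, sepby(corpid)
```

The result has one observation per EntityID and year, which is the shape needed for a later merge on gvkey and year.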
Nick
[email protected]
On 20 February 2014 00:34, Nick Cox <[email protected]> wrote:
> In your sample data, blocks *3 *4 *5 seem to be the same information repeated.
>
> With the sample data, this is code to play with
>
> gen j = substr(word(v1, 1), -1, 1) if word(v1, 1) != "EntityID"
> gen which = subinstr(v1, j, "", 1) if word(v1, 1) != "EntityID"
> gen EntityID = v2 if word(v1, 1) == "EntityID"
> replace EntityID = EntityID[_n-1] if missing(EntityID)
> drop if word(v1,1) == "EntityID"
> drop v1
> reshape wide v2, i(EntityID j) j(which) string
> renpfix v2
> expand endyr - begyr + 1
> rename begyr year
> bysort EntityID j : replace year = year[_n-1] + 1 if _n > 1
> drop endyr j
> l
> Nick
> [email protected]
>
>
> On 19 February 2014 21:06, R Zhang <[email protected]> wrote:
>> Hi Statalisters,
>>
>>
>> My data has 13,458 observations and 21 variables:
>> EntityID corpid1 begyr1 gvkey1 endyr1 corpid2 begyr2 gvkey2 endyr2
>> corpid3 begyr3 gvkey3 endyr3 corpid4 begyr4 gvkey4 endyr4 corpid5
>> begyr5 gvkey5 endyr5
>> 100091 8101 1961 1000 1970 8091 1971 1000 1973 8011 1974 1001 2000
>> 8012 2000 1001 2002 8012 2003 1001 2005
>>
>>
>> For each unique EntityID, the corresponding gvkey and corpid could
>> vary over time, as indicated by begyr and endyr.
>>
>> What I want is a dataset that gives me the gvkey and corpid for each
>> time period, so I can match it to another dataset that has company-
>> specific financial data. The match variables will be gvkey and year.
>>
>> As of now, I think I should reshape the data. Someone on the forum
>> kindly offered me the following program to reshape my data. The sample
>> code (see below) works for his hypothetical data, but when I ran it with
>> my data (13,458 observations and 21 variables), I got the error "not
>> possible; would exceed present limit on number of variables". Could
>> you shed light on this?
>>
>> *****************
>> input str20 v1 v2
>> EntityID 100091
>> corpid1 8101
>> begyr1 1961
>> gvkey1 1000
>> endyr1 1970
>> corpid2 8091
>> begyr2 1971
>> gvkey2 1000
>> endyr2 1973
>> corpid3 8011
>> begyr3 1974
>> gvkey3 1001
>> endyr3 2000
>> corpid4 8011
>> begyr4 1974
>> gvkey4 1001
>> endyr4 2000
>> corpid5 8011
>> begyr5 1974
>> gvkey5 1001
>> endyr5 2000
>> end
>>
>> compress
>> sxpose, clear firstnames force
>> reshape long corpid begyr gvkey endyr, i(EntityID) j(pd)
>> ***********************
>>
>> What I ultimately want is:
>> EntityID corpid year gvkey
>> 100091 8101 1961 1000
>> 100091 8101 1962 1000
>> 100091 8101 1963 1000
>> 100091 8101 1964 1000
>> 100091 8101 1965 1000
>> 100091 8101 1966 1000
>> ...
>> 100091 8091 1971 1000
>> 100091 8091 1972 1000
>> 100091 8091 1973 1000
>> 100091 8011 1974 1001
>>
>> P.S. If you think there is a better way, please also share.
>>
>> thanks!!!
>>
>> -R
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/