Re: AW: AW: st: RE: Decile sorts: output
I can only come up with an inefficient way of accomplishing this.
Hopefully it works.
Le
Let's assume that we have n variables
=====================================
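* Placeholder: replace 2 with n, the number of decile/meanvar pairs
local n = 2
forvalues i = 1/`n' {
    preserve
    keep decile`i' meanvar`i'
    * keep one row per decile
    bysort decile`i': keep if _n == 1
    rename decile`i' decile
    sort decile
    * one temporary file per variable
    tempfile f`i'
    save `f`i'', replace
    restore
}
use `f1', clear
forvalues i = 2/`n' {
    * old-style match merge; both datasets must be sorted by decile
    merge decile using `f`i''
    drop _merge
    sort decile
}
xpose, clear
list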
========================================
On 11/10/06, Thomas Erdmann <[email protected]> wrote:
To be more precise, the data look like this:
        DecileVar1  MeanVar1  DecileVar2  MeanVar2
...
Obs1        1         0.2         1         0.5
Obs2        1         0.2         8         0.7
Obs3        4         0.6         8         0.7
...
Obsn
whereas it should look as indicated below.
- Tom
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Philipp Rehm
Sent: Friday, 10 November 2006 15:58
To: [email protected]
Subject: Re: AW: AW: st: RE: Decile sorts: output
-reshape- should be useful.
If I understand your data-set correctly, it is long, along these lines:
clear
input str4 var decile mean
var1 1 2
var1 2 5
var1 3 7
var2 1 4
var2 2 8
var2 3 9
end
reshape wide mean, i(var) j(decile)
HTH,
Philipp
Thomas Erdmann wrote:
> Thanks for the further suggestion using -levelsof- ; I will go through it
> tonight.
>
> Based on the output produced I have now two types of variables:
> (1) R* for each variable containing the mean return per decile
> (2) G* for each variable containing the decile number 1 to 10
>
> Basically I would like to produce a table like this (where the figures in
> the table represent the mean returns of the deciles per variable):
>
>          1     2     3    ...   10
> Var1    1.2   1.5   1.6   ...   2.3
> Var2    0.9   0.7   0.6   ...   0.3
> Varx
> ...
> Varn
>
> But somehow I don't arrive at summarizing the data in a convenient way;
> obviously this (below) does not work, as after -collapse- all other
> variables are gone.
>
> foreach X of varlist c1* {
> sort G_`X'
> collapse (mean) RG_`X', by(G_`X')
> }
>
> Please excuse if this is very basic stuff, but I would appreciate a short
> hint. Thanks.
>
> - Tom
>
>
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On behalf of Jeph Herrin
> Sent: Friday, 10 November 2006 14:28
> To: [email protected]
> Subject: Re: AW: st: RE: Decile sorts
>
> So, using -levelsof- per Philipp's suggestion:
>
>
> levelsof yrm, local(l)
> foreach X of varlist c1* {
> gen dec_`X'=.
> foreach YRM in `l' {
> xtile deciles=`X' if yrm==`YRM', n(10)
> replace dec_`X'=deciles if yrm==`YRM'
> drop deciles
> }
> bys dec_`X': egen Rr`X'=mean(c1ds_ri)
> }
>
> maybe?
> jeph
>
>
> Thomas Erdmann wrote:
>> A further note on Jeph's suggestion:
>>
>> It looks very convenient, but I need to adjust for the fact that I do not
>> need the mean of the same item but of a different attribute:
>>
>> foreach X of varlist c1* {
>> xtile deciles_`X'=`X', n(10)
>> bysort deciles_`X': egen Rr`X'=mean(c1ds_ri)
>> }
>>
>> But a problem still remains:
>> the deciles are calculated over all observations, but what I need is to
>> calculate the mean of the deciles by yrm (my time variable representing
>> YearMonth) and afterwards the mean of all decile groups (1-10) over all
>> yrm's. I was not able to integrate this into this short solution as -by-
>> is not allowed for -xtile-.
>>
>> -Tom
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On behalf of Jeph Herrin
>> Sent: Friday, 10 November 2006 01:26
>> To: [email protected]
>> Subject: Re: st: RE: Decile sorts
>>
>> Oops, don't forget to drop -deciles-
>>
>> foreach X of varlist c1* {
>> xtile deciles=`X', n(10)
>> bys deciles: egen R`X'=mean(`X')
>> drop deciles
>> }
>>
>>
>>
>>
>>
>>
>> Jeph Herrin wrote:
>>> Maybe I'm missing something, but why not:
>>>
>>> foreach X of varlist c1* {
>>> xtile deciles=`X', n(10)
>>> bys deciles: egen R`X'=mean(`X')
>>> }
>>>
>>> ?
>>>
>>> hth,
>>> Jeph
>>>
>>>
>>> Nick Cox wrote:
>>>> Various comments sprinkled here and there. You may have strong
>>>> reasons to use these decile bins, but binning strikes me as, usually,
>>>> at best a means towards an end (or perhaps ends towards some means).
>>>> Some nonparametric regression might do more justice to the data.
>>>> Also, you are mixing two naming conventions, 1...10 and 10...90. Just
>>>> use one.
>>>> Nick
>>>> [email protected]
>>>>
>>>> Thomas Erdmann
>>>>
>>>>> I am trying to sort my observations into deciles according to one
>>>>> attribute
>>>>> and afterwards calculating the average of another attribute of those
>>>>> ten groups.
>>>>
>>>>> Please find the code I came up with below [lines with ... are
>>>>> omitted], yrm is the time variable (YearMonth)
>>>>>
>>>>> (1) As far as I can tell it works out, but a) it's a lot of code,
>>>>> b) it produces a lot of variables, and c) generating the output is
>>>>> rather awkward.
>>>>>
>>>>> Could you give me hints on how to implement a smarter solution or if
>>>>> there
>>>>> are any errors in the way the calculation is carried out currently?
>>>>
>>>>> *** Generate Percentiles
>>>>> sort yrm
>>>>> foreach X of varlist c1* {
>>>>> by yrm: egen p10_`X'= pctile(`X'), p(10.0)
>>>>> by yrm: egen p20_`X'= pctile(`X'), p(20.0)
>>>>> by yrm: egen p30_`X'= pctile(`X'), p(30.0)
>>>>> ...
>>>>> by yrm: egen p90_`X'= pctile(`X'), p(90.0)
>>>>> }
>>>> This is two loops rolled out into one.
>>>> sort yrm
>>>> foreach X of varlist c1* {
>>>>     forval i = 10(10)90 {
>>>>         by yrm : egen p`i'_`X' = pctile(`X'), p(`i')
>>>>     }
>>>> }
>>>>
>>>>> *** Sort into Percentile groups
>>>>> foreach X of varlist c1* {
>>>>> gen G_`X'=1 if `X'<p10_`X' & `X'~=.
>>>>> replace G_`X'=2 if `X'>p10_`X' & `X'<p20_`X'
>>>>> ...
>>>>> replace G_`X'=9 if `X'>p80_`X' & `X'<p90_`X'
>>>>> replace G_`X'=10 if `X'>p90_`X' & `X'~=.
>>>>> }
>>>> Similar story with boundary conditions.
>>>> foreach X of varlist c1* {
>>>>     gen byte G_`X' = `X' < p10_`X'
>>>>     forval i = 2/9 {
>>>>         local j = 10 * `i'
>>>>         replace G_`X' = `i' if `X' < p`j'_`X' & G_`X' == 0
>>>>     }
>>>>     replace G_`X' = cond(`X' == ., ., 10) if G_`X' == 0
>>>> }
>>>>
>>>>
>>>>> *** Calculate return mean for each group
>>>>> sort yrm
>>>>> foreach X of varlist G* {
>>>>> by yrm: egen R1`X'= mean(c1ds_ri) if `X'==1
>>>>> by yrm: egen R2`X'= mean(c1ds_ri) if `X'==2
>>>>> ...
>>>>> by yrm: egen R9`X'= mean(c1ds_ri) if `X'==9
>>>>> by yrm: egen R10`X'= mean(c1ds_ri) if `X'==10
>>>>> }
>>>> Why do you need all these variables? The results for the bins are
>>>> disjoint, so they can be put in a single variable.
>>>> foreach X of varlist G* {
>>>>     bysort yrm `X' : egen R`X' = mean(c1ds_ri)
>>>> }
>>>> Having said that, it can probably be done more directly with a series of
>>>> -collapse-s.
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/support/faqs/res/findit.html
>>>> * http://www.stata.com/support/statalist/faq
>>>> * http://www.ats.ucla.edu/stat/stata/
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Le Wang, Ph.D.
Minnesota Population Center
University of Minnesota
(o) 612-624-5818