Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: adding observation of means of variables
From
Abhimanyu Arora <[email protected]>
To
[email protected]
Subject
Re: st: adding observation of means of variables
Date
Thu, 16 Feb 2012 13:43:22 +0100
I see your point if it were the original dataset, true, but while I do
start with it in the -do- file, it had to be modified on the way.
But sure, if you have better suggestions, would love to hear them.
Thanks
Abhimanyu
On Thu, Feb 16, 2012 at 1:34 PM, Nick Cox <[email protected]> wrote:
> OK, but there is no need to add the means to the dataset to do that.
>
> Nick
> [email protected]
>
> Abhimanyu Arora
>
> Thanks very much for your conscientious advice.
>
> Basically, I had created tables for my paper (which had averages in
> the last row). Now that the analysis is complete, I would just like to
> make sure the numbers are replicable from A-Z (in Stata itself; I
> thought bringing them out as datasets would be OK for my purpose), in
> case the referees would like to see where they come from.
>
> On Thu, Feb 16, 2012 at 1:04 PM, Nick Cox <[email protected]> wrote:
>
>> Phil gives accurate advice, and as he said there are other ways to do it.
>>
>> Here's another:
>>
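>> * add one empty observation at the end of the dataset
>> set obs `=_N + 1'
>>
>> * list the numeric variables; their names are left in r(varlist)
>> ds, has(type numeric)
>>
>> * put each variable's mean in the last observation (L)
>> qui foreach v in `r(varlist)' {
>>     su `v', meanonly
>>     replace `v' = r(mean) in L
>> }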
>>
>> That said, I think this is a bad idea for working with Stata. No, let me rephrase that: it's a very bad idea. A rule of thumb, blunt though it will seem, is that if you have to ask how to do this you don't yet understand Stata well enough to use it safely.
>>
>> My advice is not to do this.
>>
>> It's a spreadsheet practice that matches the way spreadsheets are set up. It's not a good idea for working with statistical software like Stata. The problem is that once those extra observation(s) are added, you _must_ always exclude them from further analyses with the same dataset. Otherwise you just get nonsense results. Add to that the fact that if you -sort- your dataset, or some program or command -sort-s your data as a side-effect (now rare but not impossible), those observations with summaries will typically no longer be at the end of your dataset, so you need to invent extra machinery to keep track of where they are.
>>
>> Better advice would depend on knowing quite why you want to do this. Keeping means in variables, although there can be redundancy, can be a reasonable idea for some purposes.
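>>
>> As a minimal sketch of keeping means in variables, assuming a numeric variable named price and a grouping variable named foreign (both names are hypothetical stand-ins for whatever is in the dataset at hand), the means can be held in variables rather than in an extra observation:
>>
>> * overall mean, repeated in every observation
>> egen mean_price = mean(price)
>> * group means, repeated within each group of foreign
>> bysort foreign: egen gmean_price = mean(price)
>>
>> With this layout nothing has to be excluded from later analyses, and sorting the data does no harm.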
>>
>> Nick
>> [email protected]
>>
>> Phil Clayton
>>
>> Not as far as I know, but it's easy to program. Here's one solution:
>>
>> preserve
>> collapse (mean) *
>> tempfile means
>> save `means'
>> restore
>> append using `means'
>>
>> The above assumes that all variables are numeric. If they're not, you could replace:
>> collapse (mean) *
>> with:
>> ds, has(type numeric)
>> collapse (mean) `r(varlist)'
>>
>> On 16/02/2012, at 10:33 PM, Abhimanyu Arora wrote:
>>
>>> Is there a direct command that appends an observation to the dataset,
>>> giving the means of all the numeric variables?
>>> Perhaps I am using -findit- not that efficiently, but if I am not
>>> mistaken there was one...
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/