Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Adding randomness to a variable
From: Owen Gallupe <[email protected]>
To: [email protected]
Subject: Re: st: Adding randomness to a variable
Date: Mon, 21 Oct 2013 13:18:17 -0400
Thank you, Red Owl and Richard!
The -gen- method described by both of you is exactly what I'm looking
for in this instance. But what is described on page 2 of the link
Richard provided will be very helpful in the future when creating more
sophisticated simulated data than I have been working with so far.
Regards,
Owen
On Mon, Oct 21, 2013 at 12:41 PM, Richard Williams
<[email protected]> wrote:
> At 10:04 AM 10/21/2013, Owen Gallupe wrote:
>>
>> Hi,
>>
>> Given the random number generator capabilities of Stata, I suspect
>> there is an easy solution to this which I just haven't managed to
>> track down. Having said that, is there any function that allows you to
>> take an existing variable and add a small degree of randomness to it?
>> I'm thinking along the lines of a jitter option when generating a
>> variable. I know that this exact command doesn't actually exist, but a
>> command of the following form is what I'm looking for:
>>
>> gen varx = jitter(var)
>>
>> My idea is that it would take this:
>> 5
>> 6
>> 7
>> 8
>> 9
>>
>> And turn it into something like this:
>> 4.73
>> 6.11
>> 6.80
>> 8.34
>> 9.09
>>
>> I'm aware that the following two options would produce something
>> similar, but my idea is to manually create a variable that has the
>> exact properties I want for teaching purposes but then add a little
>> "error" to it.
>>
>> a)
>> gen varx = .5*var1 + .8660254*var2
>>
>> b)
>> clear
>> matrix c = (1.00, 0.30, -0.25, -0.10, 0.10, 0.20 \ ///
>> 0.30, 1.00, -0.15, -0.10, 0.12, 0.35 \ ///
>> -0.25, -0.15, 1.00, 0.13, -0.08, -0.16 \ ///
>> -0.10, -0.10, 0.13, 1.00, 0.06, -0.14 \ ///
>> 0.10, 0.12, -0.08, 0.06, 1.00, 0.001 \ ///
>> 0.20, 0.35, -0.16, -0.14, 0.001, 1.00)
>> corr2data var1 var2 var3 var4 var5 var6, n(2000) corr(c)
>
>
> I've used the corr2data approach to create vars like e1 and e2 that were
> uncorrelated with anything else, and then added them to the other vars I had
> created. See (especially page 2)
>
> http://www3.nd.edu/~rwilliam/stats2/l21.pdf
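>
> As a rough sketch of that approach (not the code from the handout): corr2data
> can build x and an error term e whose sample correlation is exactly zero, and
> the two are then added. The names x, e and xnoisy, and the means, sds and n
> below, are made-up values for illustration. Note that -clear- drops whatever
> data are in memory.
>
> clear
> matrix C = (1, 0 \ 0, 1)        // x and e exactly uncorrelated
> matrix M = (7, 0)               // assumed means of x and e
> matrix S = (2, 0.5)             // assumed standard deviations of x and e
> corr2data x e, n(200) corr(C) means(M) sds(S)
> gen xnoisy = x + e              // x plus a small error component
> corr x e xnoisy                 // corr(x, e) is 0 up to rounding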
>
> For existing data you can also do stuff like
>
> gen x2 = x + rnormal()
>
> That will add random noise to x, but corr2data is better if you want EXACT
> properties: by chance alone, the randomness you add above could be (and
> should be expected to be) slightly correlated with the original x.
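>
> A minimal sketch of the same idea with a fixed seed and a smaller noise
> scale, so the jitter is reproducible and stays close to the original values.
> The seed, the 0.25 standard deviation, and the name x_jit are illustrative
> assumptions:
>
> set seed 12345                    // any fixed seed makes the draws repeatable
> gen x_jit = x + rnormal(0, 0.25)  // normal noise with mean 0 and sd 0.25
> corr x x_jit                      // shows how closely x_jit tracks x
>
> A smaller sd in rnormal(m, s) keeps x_jit closer to x; a larger one adds
> more scatter.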
>
> Instead of corr2data, consider using drawnorm if you want to be sampling
> from a population with known properties, rather than creating a population
> with the exact properties.
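>
> A small sketch of the difference, with an assumed 2x2 correlation matrix and
> an arbitrary seed (both chosen only for illustration):
>
> clear
> set seed 2013
> matrix C = (1, 0.3 \ 0.3, 1)
> drawnorm var1 var2, n(2000) corr(C)
> corr var1 var2                  // close to 0.30, but not exactly 0.30
>
> Running the same lines with corr2data in place of drawnorm would give a
> sample correlation of exactly 0.30.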
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/