Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Questions for random data generation and value label
From
Joerg Luedicke <[email protected]>
To
[email protected]
Subject
Re: st: Questions for random data generation and value label
Date
Mon, 11 Mar 2013 16:45:10 -0400
You are still not saying which distribution you would like to sample
from! Any sample must be from _some_ distribution.
Joerg
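
If the answer turns out to be a truncated normal (the option raised
further down in the quoted thread), one way to draw it in Stata is
inverse-CDF sampling. The following is only a sketch under assumptions:
the values of m, s, a, and b and the variable name SynVar1 are made up
for illustration, and m and s are the parameters of the underlying
normal, not the mean and SD of the truncated variable.

```stata
* Sketch only: normal(m, s) truncated to [a, b], drawn by inverse-CDF sampling.
* m, s, a, b and the name SynVar1 are illustrative assumptions.
clear
set seed 12345
set obs 1000

local m = 50
local s = 10
local a = 20
local b = 80

* CDF of the untruncated normal evaluated at the two bounds
local pa = normal((`a' - `m') / `s')
local pb = normal((`b' - `m') / `s')

* Map a uniform draw into (pa, pb), then invert the normal CDF
generate double u = runiform()
generate double SynVar1 = `m' + `s' * invnormal(`pa' + u * (`pb' - `pa'))

* Inspect the resulting mean, SD, min, and max
summarize SynVar1
```

Truncation pulls the SD below s, and it shifts the mean away from m
unless the bounds are symmetric around m. Matching a given mean, SD,
min, and max exactly would therefore require solving for the
parameters rather than plugging the observed values in. The sample min
and max will also only approach a and b, not equal them.
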
On Mon, Mar 11, 2013 at 4:28 PM, Yu Xue <[email protected]> wrote:
> Thanks Maarten, David, Nick, Joerg!
>
> Let me use an example to describe my question more clearly.
>
> There is an actual dataset with three variables: Var1, Var2, Var3.
> Each of them has continuous numeric values. I get the max, min,
> SD, and mean for each of them, save them in several macros, and then
> clear the memory.
>
> Then I want to generate a synthetic dataset that also includes three
> variables: SynVar1, SynVar2, SynVar3. They should keep the same max, min,
> SD, and mean as Var1, Var2, Var3, respectively, in the actual data.
>
> I hope I have described it clearly.
> Thank you very much
>
>
> On Mon, Mar 11, 2013 at 12:48 PM, Joerg Luedicke
> <[email protected]> wrote:
>> The normal distribution has support (-infinity, +infinity), so it is
>> not clear what you mean by 'range' here. Do you want to draw from a
>> truncated normal distribution?
>>
>> Joerg
>>
>> On Mon, Mar 11, 2013 at 12:49 PM, Yu Xue <[email protected]> wrote:
>>> Thanks Maarten!
>>>
>>> What I want is a normal distribution. Is there a way to randomly
>>> generate a variable with a specific mean, SD, and range?
>>>
>>> Thanks!!
>>> Mark
>>>
>>> On Mon, Mar 11, 2013 at 10:35 AM, Maarten Buis <[email protected]> wrote:
>>>> On Mon, Mar 11, 2013 at 4:20 PM, Yu Xue wrote:
>>>>> I already checked "-help random_number_functions-", but I still
>>>>> cannot find the answer to my question.
>>>>>
>>>>> I know that I can use a formula like this:
>>>>> Var=a+int((b-a+1)*runiform()), to keep a specific range in [a,b],
>>>>> and another formula: Var=invnorm(uniform())*SD+mean, to keep a
>>>>> specific standard deviation and mean.
>>>>> But I do not know how to generate a "Var" with a specific range, SD, and mean all at once.
>>>>> Please note that I do not want to draw a sample from the actual data;
>>>>> what I want to generate is synthetic data (totally fake data).
>>>>
>>>> What distribution do you want to draw your new variable from? Do you
>>>> want it to be normally (Gaussian) distributed, gamma distributed, beta
>>>> distributed, Fisk distributed, Laplace distributed, ...? The number of
>>>> choices is huge, but without choosing your distribution you cannot
>>>> draw your random numbers.
>>>>
>>>> -- Maarten
>>>>
>>>>
>>>> ---------------------------------
>>>> Maarten L. Buis
>>>> WZB
>>>> Reichpietschufer 50
>>>> 10785 Berlin
>>>> Germany
>>>>
>>>> http://www.maartenbuis.nl
>>>> ---------------------------------
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/