Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: upper limit on fweights? overflowing into missing values?
From: Richard Williams <[email protected]>
To: [email protected], "[email protected]" <[email protected]>
Subject: Re: st: upper limit on fweights? overflowing into missing values?
Date: Mon, 29 Jul 2013 08:53:39 -0500
Just to sum up my current thinking/guesses on this:
* The maximum number of observations in Stata is 2,147,483,647.
* Nonetheless, fweighted data sets can represent more cases than that.
* However, not all routines will work when the fweighted data has more than
  2,147,483,647 cases. You can do some simple descriptive things, but you
  can't do more complicated things like regressions or correlations.
* As to why that is, I am guessing that some routines have the 2,147,483,647
  limit hardcoded. Or maybe there just isn't enough precision to handle the
  calculations when the N is larger than that.
* Given that most people don't have more than 2,147,483,647 cases (and even
  if they did, their computer memory couldn't handle them), StataCorp
  probably hasn't spent a lot of time worrying about this.
* Still, an added sentence or two in the fweights documentation or
  elsewhere, warning about these limits, might be a good idea.
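On the precision guess: one thing that can be checked is how the weight variable is stored. This is not stated anywhere in this thread, so treat it as an assumption about a possible cause rather than a diagnosis. -input- creates float variables unless a type is given, and a float cannot hold 2147483621 exactly. A minimal sketch, using the two-observation example quoted further down; the variable name fw_exact is made up for illustration:

```stata
* Sketch only: inspect how the example weight is stored under different types.
clear
input y x fw
2 1 2147483621
1 2 2147483621
end
describe fw                           // storage type of fw as created by -input-
generate double fw_exact = 2147483621 // hypothetical helper: the intended value, held as double
format fw fw_exact %12.0f
list fw fw_exact                      // compare the stored float with the intended value
display %12.0f c(maxlong)             // largest value a long can hold
```

If fw lists as 2147483648 rather than 2147483621, the weight was rounded when it was read in. Note also that c(maxlong) is 2,147,483,620, so declaring the variable as long would not hold this particular value either, whereas double would.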
I am curious what the original author is doing that requires analyzing
4 billion+ cases. Some sort of genetic research, maybe? I've certainly never
heard of any kind of survey research having an N that large.
At 06:53 PM 7/28/2013, Nick Cox wrote:
This is interesting, but in principle I don't see that Stata's limit
on # of observations has any bearing on how big frequency weights can
be. I can imagine people wanting to use frequency weights to subvert
the limit on number of observations.
A different point is that if there is a limit on how big weights can
be it should be documented e.g. at -help limits-.
Nick
[email protected]
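To see how individual weights relate to the limit on observations, one can total the weights and compare the total with the 2,147,483,647 figure from -help limits-. A minimal sketch, assuming the two-observation example and the weight of 1073741823 from the message quoted below; the scalar names weighted_n and max_obs are illustrative only:

```stata
* Sketch only: compare the frequency-weighted N with the observation limit.
clear
input y x double fw
2 1 1073741823
1 2 1073741823
end
quietly summarize fw, meanonly
scalar weighted_n = r(sum)        // total of the fweights = implied number of cases
scalar max_obs    = 2147483647    // limit cited in this thread from -help limits-
display %15.0fc scalar(weighted_n)
display cond(scalar(weighted_n) > scalar(max_obs), "above the limit", "within the limit")
```

With 1073741823 per observation the total is 2,147,483,646, one short of the limit; with 1073741824 it would be 2,147,483,648, just over it. That would be consistent with where -pwcorr- stopped running in the quoted test, if that test used the same two observations.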
On 29 July 2013 00:46, Richard Williams <[email protected]> wrote:
> According to -help limits-, the maximum number of observations is
> 2,147,483,647. Your weights give you more than 4 billion cases, well above
> that. Further, the help also says that this is a theoretical maximum;
> memory availability will certainly impose a smaller maximum.
>
> On my computer, I specified [fw = 1073741823] on the pwcorr command and it
> ran. Then I specified [fw = 1073741824] and it did not run. These numbers
> put you just below and just above the maximum number of cases that Stata
> allows.
>
> So in short, it appears that your fweighted cases can't exceed the
> 2 billion+ that Stata allows, and memory restrictions may hold you to even
> less than that.
>
> Also, you probably need to specify that the fweight variable is type long,
> e.g.
>
> input y x long fw
>
> On Jul 27, 2013, at 12:36 PM, László Sándor <[email protected]> wrote:
>
>> Hi,
>> If you care, here is an example that silently produces missing values.
>> I notified Stata Support.
>>
>> input y x fw
>> 2 1 2147483621
>> 1 2 2147483621
>> end
>> de
>> pwcorr y x [fw=fw]
>> exit
>>
>> Thanks,
>>
>> Laszlo
>>
>> On Sun, Jul 21, 2013 at 5:08 PM, Nick Cox <[email protected]> wrote:
>>> I'd suggest documenting your problems with a reproducible example and
>>> sending it to Stata tech support.
>>>
>>>
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 21 July 2013 21:55, László Sándor <[email protected]> wrote:
>>>> Hi,
>>>> in Stata/MP 12.1 I am getting missing values when using -pwcorr- with
>>>> -fweights- though the feature works fine with other data or if I scale
>>>> my weights down. Is it possible to simply have too large fweights,
>>>> e.g. if they cannot be of type -long- anymore?
>>>>
>>>> If so, why doesn't Stata warn me about this?
>>>>
>>>> I vaguely remember some Statalist or Stata blog discussion of this,
>>>> but I could not even Google it up, and Stata still did not warn me?
>>>>
>>>> Actually, why didn't Stata complain that I did not have integer
>>>> fweights if obviously the variable wasn't of type byte, int or long?
>>>>
>>>> Thanks,
>>>>
>>>> Laszlo
>>>>
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> * http://www.ats.ucla.edu/stat/stata/
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam