Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: Re: Stata appears to be eating some string IDs when saving a file
From
"David Radwin" <[email protected]>
To
<[email protected]>
Subject
st: RE: Re: Stata appears to be eating some string IDs when saving a file
Date
Tue, 2 Apr 2013 17:06:41 -0700 (PDT)
Just out of curiosity, approximately how large is the file? Gigabytes?
Hundreds of gigabytes? (I realize that even a small file could exceed the
free space on a small server, but that seems unlikely these days.)
I'm glad you identified the problem, and thank you for reporting back to the
list for posterity.
David
--
David Radwin
Senior Research Associate
MPR Associates, Inc.
2150 Shattuck Ave., Suite 800
Berkeley, CA 94704
Phone: 510-849-4942
Fax: 510-849-0794
www.mprinc.com
> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of Dimitriy V. Masterov
> Sent: Tuesday, April 02, 2013 4:35 PM
> To: Statalist
> Subject: st: Re: Stata appears to be eating some string IDs when saving a
> file
>
> STS has confirmed that I am not a crazy person, at least not in this
> instance. This is a real bug.
>
> The problem is that Stata does not return an error when the file
> system fills up. The developers are now aware of this and they would
> like to have Stata detect this problem in the future and report the
> error correctly. They also plan to add some more error checking to the
> -use- command so that it catches files that have been corrupted.
>
> For now, the best way to detect this type of issue is to use the
> -datasignature- command to verify that the data set was not
> modified/corrupted when saved.
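>
> A minimal sketch of that check, assuming the data in memory are known
> to be good before saving and that the default command delimiter is in
> effect (the original session used ";"). The file name is the one from
> the session quoted below; the -reset- option is included only so the
> sketch also runs if a signature was set earlier.
>
> ```stata
> * Store a signature of the in-memory data in the dataset's characteristics
> datasignature set, reset
>
> * Save, then reload from disk
> save "pt_m2m_cat.dta", replace
> use "pt_m2m_cat.dta", clear
>
> * Recompute the signature and compare it with the stored one;
> * this exits with an error if the reloaded data differ
> datasignature confirm
> ```
>
> Because the signature travels with the saved file, the same
> -datasignature confirm- call can be repeated whenever the file is
> loaded later.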
>
> DVM
>
> On Sun, Mar 31, 2013 at 10:32 PM, Dimitriy V. Masterov
> <[email protected]> wrote:
> > I believe I diagnosed the issue. This seems to happen when I am
> > running low on space in my home directory on the server. When I freed
> > up some space, the problem went away. I wish there was some sort of
> > warning to alert users that this is happening. This has been a very
> > frustrating and terrifying experience.
> >
> > DVM
> >
> > On Sat, Mar 30, 2013 at 2:25 PM, Dimitriy V. Masterov
> > <[email protected]> wrote:
> >> I am having a strange problem with Stata deleting the values for about 80%
> >> of my data when I save a file. It only does it for string variables,
> >> and this only happens some of the time that I run this code.
> >>
> >> Here's the relevant part:
> >>
> >> . des ;
> >>
> >> Contains data
> >> obs: 10,766,127
> >> vars: 4
> >> size: 387,580,572
> >> ----------------------------------------------------------------------
> >>                 storage   display    value
> >> variable name   type      format     label        variable label
> >> ----------------------------------------------------------------------
> >> slr_id          str10     %10s
> >> byr_id          str10     %10s
> >> item_id         str12     %12s
> >> pt_m2m_cat      float     %21.0g     pt_m2m_cat
> >> ----------------------------------------------------------------------
> >> Sorted by:
> >> Note: dataset has changed since last saved
> >>
> >> . assert !missing(slr_id) & !missing(byr_id) & !missing(item_id) &
> >> !missing(pt_m2m_cat);
> >>
> >> . count;
> >> 10766127
> >>
> >> . save "pt_m2m_cat.dta", replace;
> >> file pt_m2m_cat.dta saved
> >>
> >> . use "pt_m2m_cat.dta", clear;
> >>
> >> . assert !missing(slr_id) & !missing(byr_id) & !missing(item_id) &
> >> !missing(pt_m2m_cat);
> >> 3407873 contradictions in 10766127 observations
> >> assertion is false
> >> r(9);
> >>
> >>
> >> My Stata MP is 12.1 (March 20, 2013), on an Ubuntu box. Any ideas how
> >> to diagnose this?
> >>
> >> DVM
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/