st: Re: Stata appears to be eating some string IDs when saving a file
From: "Dimitriy V. Masterov" <[email protected]>
To: Statalist <[email protected]>
Subject: st: Re: Stata appears to be eating some string IDs when saving a file
Date: Tue, 2 Apr 2013 16:34:45 -0700
Stata Technical Support (STS) has confirmed that I am not a crazy
person, at least not in this instance. This is a real bug.
The problem is that Stata does not return an error when the file
system fills up. The developers are now aware of this and they would
like to have Stata detect this problem in the future and report the
error correctly. They also plan to add some more error checking to the
-use- command so that it catches files that have been corrupted.
For now, the best way to detect this type of issue is to use the
-datasignature- command to verify that the data set was not
modified or corrupted when it was saved.
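A minimal sketch of such a check, assuming the file name from the log quoted below and the default line delimiter (the quoted log uses -#delimit ;-). The local macro name sig_before is only illustrative:

```stata
* Record the signature of the data in memory before saving.
datasignature
local sig_before "`r(datasignature)'"

save "pt_m2m_cat.dta", replace

* Reload the file and recompute the signature.
use "pt_m2m_cat.dta", clear
datasignature

* Fails with r(9) if the saved data differ from what was in memory.
assert "`r(datasignature)'" == "`sig_before'"
```

An alternative along the same lines would be to run -datasignature set- before -save- and -datasignature confirm- after -use-, which stores the signature with the data set instead of in a local macro.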
DVM
On Sun, Mar 31, 2013 at 10:32 PM, Dimitriy V. Masterov
<[email protected]> wrote:
> I believe I diagnosed the issue. This seems to happen when I am
> running low on space in my home directory on the server. When I freed
> up some space, the problem went away. I wish there were some sort of
> warning to alert users that this is happening. This has been a very
> frustrating and terrifying experience.
>
> DVM
>
> On Sat, Mar 30, 2013 at 2:25 PM, Dimitriy V. Masterov
> <[email protected]> wrote:
>> I am having a strange problem with Stata deleting the values for about 80%
>> of my data when I save a file. It only does it for string variables,
>> and this only happens some of the time that I run this code.
>>
>> Here's the relevant part:
>>
>> . des ;
>>
>> Contains data
>>   obs:    10,766,127
>>  vars:             4
>>  size:   387,580,572
>> ---------------------------------------------------------------
>>               storage   display    value
>> variable name   type    format     label       variable label
>> ---------------------------------------------------------------
>> slr_id          str10   %10s
>> byr_id          str10   %10s
>> item_id         str12   %12s
>> pt_m2m_cat      float   %21.0g     pt_m2m_cat
>> ---------------------------------------------------------------
>> Sorted by:
>>      Note: dataset has changed since last saved
>>
>> . assert !missing(slr_id) & !missing(byr_id) & !missing(item_id) &
>> !missing(pt_m2m_cat);
>>
>> . count;
>> 10766127
>>
>> . save "pt_m2m_cat.dta", replace;
>> file pt_m2m_cat.dta saved
>>
>> . use "pt_m2m_cat.dta", clear;
>>
>> . assert !missing(slr_id) & !missing(byr_id) & !missing(item_id) &
>> !missing(pt_m2m_cat);
>> 3407873 contradictions in 10766127 observations
>> assertion is false
>> r(9);
>>
>>
>> My Stata MP is 12.1 (March 20, 2013), on an Ubuntu box. Any ideas how
>> to diagnose this?
>>
>> DVM