Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Cannot get insheet to work, data do not load properly
From: Joe Canner <[email protected]>
To: "[email protected]" <[email protected]>
Subject: RE: st: Cannot get insheet to work, data do not load properly
Date: Thu, 29 Aug 2013 17:35:40 +0000
Laura,
I second Nick's suggestion about corrupted files. Text files are particularly susceptible to such things, depending on what you are using to generate them and/or edit them.
I noticed some other things that are awry that might help focus your search:
1. You ran -insheet- several times in a row and the next-to-last time it read a different number of observations. Did you make any changes to the file before or after that step? If not, that in itself is a cause for concern. If you did make changes, perhaps you accidentally made some other fatal changes.
2. Your variables have names v1, v2, etc. You should add the -names- option to -insheet- to fix this, although on my Stata (also 12.1) it figured out automatically that I had names in the first row. This suggests that maybe the problem is near the beginning. Are the variable names separated by tabs as well?
Regards,
Joe Canner
Johns Hopkins University School of Medicine
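
For anyone without -hexdump- to hand, the kind of check Nick suggests can be sketched outside Stata. The Python below is a minimal sketch and not part of the thread: it reads the first bytes of a file and reports any byte-order mark, the number of NUL bytes, and the number of tab bytes. The names inspect_head and write_demo_file and the demo data are hypothetical. One assumption sits behind it: the two characters Laura reports before the first data character are how the bytes FF FE display in the Mac OS Roman code page, which would fit a UTF-16 byte-order mark, but nothing in the thread confirms that reading.

```python
import codecs
import os
import tempfile

# Byte-order marks to test for. The UTF-32 marks come first because the
# UTF-32 LE mark starts with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def inspect_head(path, nbytes=64):
    """Summarize the first nbytes of a file (hypothetical helper)."""
    with open(path, "rb") as fh:
        head = fh.read(nbytes)
    bom = next((name for mark, name in BOMS if head.startswith(mark)), None)
    return {
        "hex": " ".join(f"{b:02x}" for b in head),
        "bom": bom,
        "nul_bytes": head.count(b"\x00"),
        "tab_bytes": head.count(b"\t"),
    }


def write_demo_file():
    """Write a small made-up tab-delimited file encoded as UTF-16 LE."""
    text = "ACCOUNTNUMBER\tDEVICE\n44444444\t99999999\n"
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as fh:
        fh.write(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))
    return path


if __name__ == "__main__":
    demo = write_demo_file()
    try:
        for key, value in inspect_head(demo).items():
            print(f"{key}: {value}")
    finally:
        os.remove(demo)
```

On the real file the call would be inspect_head("Res Usage 2008 to 2010.txt"). A byte-order mark, or many NUL bytes in a file that looks like plain text in an editor, would point to an encoding that -insheet- in Stata 12 may not handle, and re-saving the file as plain ASCII before running -insheet- again would then be worth trying.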
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Laura Grant
Sent: Thursday, August 29, 2013 12:36 PM
To: [email protected]
Subject: Re: st: Cannot get insheet to work, data do not load properly
Thanks Nick I will try that!
On Thu, Aug 29, 2013 at 11:31 AM, Nick Cox <[email protected]> wrote:
> Could be a corruption issue. Somewhere deep in the file there is
> complete garbage. I am guessing wildly because in that case I would
> expect Stata to read in some, then stop. But using -hexdump- to look
> at the file sometimes reveals a problem.
> Nick
> [email protected]
>
>
> On 29 August 2013 17:22, Laura Grant <[email protected]> wrote:
>> Don't think it's a length or size issue -- I have Stata SE and as I
>> mentioned, it loads but with blank entries.
>>
>> The data look like this, tab delimited, 9 variables about 1.7mil observations:
>> ACCOUNTNUMBER ConcatenatedAddress DEVICE BillType PREVIOUSREADDATE
>> PREVIOUSREADING PRESENTREADDATE PRESENTREADING USE
>> 44444444 5555 N GENERAL AV 99999999 Res 9/11/07 0:00 1106 12/11/07
>> 0:00 1131 25
>> 44444443 5553 N GENERAL AV 99999996 Res 12/11/07 0:00 1131 3/11/08
>> 0:00 1158 27
>>
>> I can view them in excel (but the length is too long) or in a text editor.
>> They look fine.
>> I can delete the top lines, save as different types, and the load
>> still looks like the screen capture.
>>
>> Would appreciate any help!
>>
>> On Thu, Aug 29, 2013 at 10:32 AM, Nick Cox <[email protected]> wrote:
>>> I don't know what the limits are for STATA, but Stata can take 2
>>> billion observations.
>>>
>>> Main answer is that Laura's screen capture isn't extra evidence.
>>> Something is not in order for the file. Perhaps you could show us
>>> the first few lines of the file, or get in touch with tech-support.
>>>
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 29 August 2013 16:22, O'Neill, Sinead <[email protected]> wrote:
>>>> Maybe your dataset is too large for STATA.
>>>> SAS can handle extremely large data.
>>>>
>>>> Sinéad O Neill
>>>> PhD Scholar
>>>> NPEC, ANU Research Centre
>>>> Dept of Obstetrics & Gynaecology
>>>> 5th Floor CUMH
>>>> Wilton, Cork.
>>>> (+353-21-492-0656)
>>>> (+353-86-3586895)
>>>>
>>>> On 29 Aug 2013, at 16:21, "Laura Grant" <[email protected]> wrote:
>>>>
>>>>> I have a long dataset (1.7m observations) that I can view
>>>>> partially in excel and fully in text editors.
>>>>>
>>>>> However when I go to insheet it in Stata the data load as all
>>>>> blank entries except the first cell, which always loads as " ˇ˛X "
>>>>> where X is the first character of the first line of the data.
>>>>>
>>>>> See screen capture at
>>>>> goo.gl/zaesv7
>>>>>
>>>>> The number of variables and observations are correct but they are ALL MISSING.
>>>>>
>>>>> The code I am using, as seen in pic link above, are variations on
>>>>>
>>>>> insheet using "Res Usage 2008 to 2010.txt", names tab clear
>>>>>
>>>>> Thoughts? Thanks!
>>>>
>>>
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/