Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: insheet and dropping cases
From: Sergiy Radyakin <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: insheet and dropping cases
Date: Thu, 20 Feb 2014 13:27:36 -0500
Ben,
-- the problem is likely caused by the presence of unprintable characters
in the file, which are tolerated by Stat/Transfer but not by Stata;
-- the character with code 255 (hex FF) is a usual suspect;
-- pasting raw data to Statalist is unlikely to reveal the problem,
since the special characters might not survive the trip through email;
-- isolating the problem in a text editor into a new file could help
(keep the last record that was read in correctly and the one immediately
after it), then make the file available through a link so that it retains
its binary structure; note that not all text editors retain special
characters on save;
-- use -hexdump "file", analyze tabulate- to see the unprintable
characters, then search for them in the file or remove them with -filefilter-;
-- see "zap gremlins" for a relevant tactic.
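
For anyone who wants to check a file outside Stata, the same idea can be sketched in a few lines of Python. This is only an illustration and not what -hexdump- or -filefilter- do internally: the function names, the sample data, and the choice of "allowed" bytes (tab, LF, CR, printable ASCII) are all assumptions made for the example. Files that legitimately contain accented or other non-ASCII characters would be flagged too.

```python
# Bytes treated as harmless here (an assumption): tab, LF, CR, printable ASCII 32-126.
ALLOWED = {9, 10, 13} | set(range(32, 127))


def find_gremlins(data: bytes):
    """Return (line_number, byte_offset, byte_value) for each byte outside ALLOWED."""
    hits = []
    line = 1
    for offset, value in enumerate(data):
        if value not in ALLOWED:
            hits.append((line, offset, value))
        if value == 10:  # LF ends the current line
            line += 1
    return hits


def strip_gremlins(data: bytes) -> bytes:
    """Return a copy of data with every byte outside ALLOWED removed."""
    return bytes(b for b in data if b in ALLOWED)


if __name__ == "__main__":
    # Hypothetical sample: a header, one clean record, then a record containing byte 255.
    sample = b"id,price\n1,100\n2,\xff200\n3,300\n"
    for line, offset, value in find_gremlins(sample):
        print(f"line {line}, offset {offset}: byte {value}")
    print(strip_gremlins(sample))
```

To run it on a real file, read the file in binary mode (open(path, "rb")) so that nothing is decoded or altered on the way in. The sketch holds the whole file in memory, so a very large file would need to be processed in chunks instead.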
On the bright side: you are lucky you have 363 cases. Last time I had
this problem, only 16 GB out of 40 GB were read in. Try opening that
file in Notepad :)
Hope this helps.
Best, Sergiy Radyakin
On Thu, Feb 20, 2014 at 12:34 PM, Radwin, David <[email protected]> wrote:
> One other possibility is to use -inputst-, a Stata program that calls Stat/Transfer (part of -stcmd- by Roger Newson and available at SSC).
>
> This workaround is probably less computationally efficient than the suggestions from others, but since you already know that Stat/Transfer works, this approach might be faster and easier than trying to figure out the problem with your text files and -insheet- or -import delimited-.
>
> David
> --
> David Radwin, Senior Research Associate
> Education and Workforce Development
> RTI International
> 2150 Shattuck Ave. Suite 800, Berkeley, CA 94704
> Phone: 510-665-8274
>
> www.rti.org/education
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:owner-
>> [email protected]] On Behalf Of Phil Schumm
>> Sent: Thursday, February 20, 2014 6:38 AM
>> To: Statalist Statalist
>> Subject: Re: st: insheet and dropping cases
>>
>> On Feb 20, 2014, at 8:28 AM, Ben Hoen <[email protected]> wrote:
>> > Hexdump I had never used. This is what it returned:
>>
>> <snip>
>>
>> > Do you see anything suspicious here? (I replaced all the commas with
>> "_", using filefilter - another great suggestion - wondering if that was
>> causing any issues and insheet still returned 184 observations.)
>>
>>
>> I don't see anything obvious -- you'll need to look at the file directly.
>> Is Stata reading the first 184 observations, or are the 184 observations
>> from different places in the file? Check that first, and if you are
>> getting the first 184 observations, then look at lines 184-6 (depending on
>> whether the file has a header line). Something has to be going on there.
>>
>>
>> -- Phil
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/