Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Ben Hoen" <bhoen@lbl.gov> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: insheet and dropping cases |
Date | Thu, 20 Feb 2014 13:40:19 -0500 |
Thanks Sergiy! LOL. I hear you re big files. I have a few million records I am eventually going to try to read into Stata (hence the pre-planning). I will see if I can find them as you suggest. And thanks to David for the improved work-around using -inputst-. I will try to report back what I find, as this might be an issue for another user sometime.

Best,

Ben

Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Sergiy Radyakin
Sent: Thursday, February 20, 2014 1:28 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: insheet and dropping cases

Ben,

-- the problem is likely caused by the presence of unprintable characters in the file, which are tolerated by Stat/Transfer but not by Stata;
-- the character with ASCII code 255 is a usual suspect;
-- pasting raw data to Statalist is unlikely to reveal the problem, since the special characters might not survive passing through email;
-- isolating the problem in a text editor into a new file could help (keep the last record that was read in correctly and the one immediately after it); then make the file available through a link to retain its binary structure, bearing in mind that not all text editors retain special characters on save;
-- use hexdump "file", analyze tabulate to see unprintable characters, then search for them in the file or use filefilter;
-- see "zap gremlins" for a relevant tactic.

On the bright side: you are lucky you have 363 cases. Last time I had this problem, only 16 GB out of 40 GB were read in. Try to open that file in Notepad :)

Hope this helps.

Best,
Sergiy Radyakin

On Thu, Feb 20, 2014 at 12:34 PM, Radwin, David <dradwin@rti.org> wrote:
> One other possibility is to use -inputst-, a Stata program that calls Stat/Transfer (part of -stcmd- by Roger Newson and available at SSC).
>
> This workaround is probably less computationally efficient than the suggestions from others, but since you already know that Stat/Transfer works, this approach might be faster and easier than trying to figure out the problem with your text files and -insheet- or -import delimited-.
>
> David
> --
> David Radwin, Senior Research Associate
> Education and Workforce Development
> RTI International
> 2150 Shattuck Ave. Suite 800, Berkeley, CA 94704
> Phone: 510-665-8274
>
> www.rti.org/education
>
>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Phil Schumm
>> Sent: Thursday, February 20, 2014 6:38 AM
>> To: Statalist Statalist
>> Subject: Re: st: insheet and dropping cases
>>
>> On Feb 20, 2014, at 8:28 AM, Ben Hoen <bhoen@lbl.gov> wrote:
>> > Hexdump I had never used. This is what it returned:
>>
>> <snip>
>>
>> > Do you see anything suspicious here? (I replaced all the commas with "_", using filefilter - another great suggestion - wondering if that was causing any issues and insheet still returned 184 observations.)
>>
>>
>> I don't see anything obvious -- you'll need to look at the file directly. Is Stata reading the first 184 observations, or are the 184 observations from different places in the file? Check that first, and if you are getting the first 184 observations, then look at lines 184-6 (depending on whether the file has a header line). Something has to be going on there.
>>
>>
>> -- Phil
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
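
The checks suggested in this thread (tabulate the unprintable characters, find the line where reading stops, then filter them out) can also be sketched outside Stata. The Python below is only an illustration, not what any poster actually ran: the function names, the sample data, and the choice of which bytes count as "printable" are all assumptions made for the example.

```python
from collections import Counter

# Bytes treated as "printable" here: tab, LF, CR, and ASCII 32-126.
# This cutoff is an assumption; widen it if the file legitimately
# contains accented (high-bit) characters.
PRINTABLE = frozenset({9, 10, 13}) | frozenset(range(32, 127))


def find_suspect_bytes(data: bytes):
    """Return (counts, first_line): how often each non-printable byte
    occurs and the 1-based line number where it first appears."""
    counts = Counter()
    first_line = {}
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        for byte in line:
            if byte not in PRINTABLE:
                counts[byte] += 1
                first_line.setdefault(byte, lineno)
    return counts, first_line


def strip_suspect_bytes(data: bytes) -> bytes:
    """Return a copy of data with every non-printable byte removed."""
    return bytes(b for b in data if b in PRINTABLE)


if __name__ == "__main__":
    # Hypothetical sample: a header, two clean records, then a record
    # containing byte 255, the "usual suspect" named in the thread.
    sample = b"id,price\n1,100\n2,200\n3,\xff300\n4,400\n"
    counts, first_line = find_suspect_bytes(sample)
    for byte, n in sorted(counts.items()):
        print(f"byte {byte}: {n} occurrence(s), first on line {first_line[byte]}")
    print(strip_suspect_bytes(sample).decode("ascii"))
```

For a real file, the bytes would come from open(path, "rb").read(); the file must be opened in binary mode so the special characters are not altered on the way in. For files of many gigabytes, reading in chunks rather than all at once would be the safer choice.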