Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: insheet and dropping cases
From
"Ben Hoen" <[email protected]>
To
<[email protected]>
Subject
RE: st: insheet and dropping cases
Date
Thu, 20 Feb 2014 09:28:51 -0500
Thanks Phil and Ronnie. Great advice. I AM running 12.1 and therefore the
issue might be solved by updating.
Hexdump I had never used. This is what it returned:
. hexdump IL.txt, analyze
  Line-end characters                     Line length (tab=1)
    \r\n         (Windows)          364     minimum                    912
    \r by itself (Mac)                0     maximum                  3,627
    \n by itself (Unix)               0
  Space/separator characters              Number of lines              364
    [blank]                       9,621     EOL at EOF?                yes
    [tab]                             0
    [comma] (,)                     112   Length of first 5 lines
  Control characters                        Line 1                   3,627
    binary 0                          0     Line 2                   1,159
    CTL excl. \r, \n, \t              0     Line 3                   1,136
    DEL                               0     Line 4                   1,201
    Extended (128-159,255)            0     Line 5                   1,104
  ASCII printable
    A-Z                          60,639
    a-z                           7,271   File format                ASCII
    0-9                         238,495
    Special (!@#$ etc.)          77,759
    Extended (160-254)                0
                        ---------------
  Total                         394,625

  Observed were:
     \n \r blank ! " # & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; @ A B C D
     E F G H I J K L M N O P Q R S T U V W X Y Z _ a b c d e f g h I k l m n
     o p q r s t u v w x y z }
Do you see anything suspicious here? (I replaced all the commas with "_"
using filefilter, another great suggestion, in case the commas were causing
the problem, but insheet still returned 184 observations.)
Thanks all.
Ben
Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Phil Schumm
Sent: Thursday, February 20, 2014 9:05 AM
To: Statalist Statalist
Subject: Re: st: insheet and dropping cases
On Feb 19, 2014, at 9:08 PM, Ben Hoen <[email protected]> wrote:
> I have 70 files to import, and insheet (via a loop with append) is the
> simplest, so I would like to use it, but I cannot figure out why it is not
> importing all of the cases. It does the same thing for all states, always
> importing fewer cases than are transferred via Stat/Transfer, regardless of
> the total file size or number of cases.
>
> Any ideas?
As Ronnie pointed out, -insheet- has been superseded by -import delimited-
in Stata 13. Based on my experience, I believe this was more than a simple
renaming of -insheet- and reflects a substantial change under the hood. So,
if you have access to Stata 13, the first thing to do is try -import
delimited- and see if you get a different result. You might, especially if
Stat/Transfer reads the file properly.
If you don't have Stata 13 or if you get the same result with -import
delimited-, then take one file and find out where it stopped (e.g., count
how many observations were read successfully). Open the file with a text
editor (ideally one capable of showing non-printing characters), go to that
line, and start looking (either on that line or the line after). I'll bet
you'll find either some unexpected character or something different from the
lines above it. This will at least tell you what is causing the problem,
and then perhaps someone here can suggest a fix.
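
If no editor that shows non-printing characters is at hand, the same inspection can be sketched outside Stata. The Python snippet below is only an illustration of the step described above, not something taken from the thread: the function name show_context and its parameters are hypothetical. It reads the file in binary mode and returns the raw lines around a chosen line number, so that control characters, stray quotes, and line endings appear as visible escapes.

```python
def show_context(path, stop_line, radius=2):
    """Return (line_number, repr_of_raw_line) pairs around stop_line (1-based).

    Hypothetical helper: the file is read as bytes so nothing is decoded
    or translated, and repr() renders non-printing bytes as escapes
    such as \\x00 or \\r\\n.
    """
    with open(path, "rb") as fh:
        # keepends=True preserves each line's own terminator for inspection
        raw_lines = fh.read().splitlines(keepends=True)
    first = max(1, stop_line - radius)
    last = min(len(raw_lines), stop_line + radius)
    return [(n, repr(raw_lines[n - 1])) for n in range(first, last + 1)]
```

For the file in this thread one might call show_context("IL.txt", 185) and print each pair, assuming the first line holds variable names so that observation 184 sits on line 185; the line after it is the first one that was not read.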
-- Phil
P.S. You might find
http://www.stata.com/support/faqs/data-management/malformed-end-of-line-sequences/
helpful, which shows how to use -hexdump- to troubleshoot a similar problem.
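
For readers without Stata at hand, the line-end tally that -hexdump, analyze- reports can be approximated in a few lines of Python. This is a rough sketch rather than a reimplementation of -hexdump-: the function names and dictionary keys are made up for illustration, and only the three line-end counts are covered.

```python
def count_line_ends(data: bytes) -> dict:
    """Count Windows, old-Mac, and Unix line ends in raw file bytes."""
    crlf = data.count(b"\r\n")            # \r\n pairs (Windows)
    cr = data.count(b"\r") - crlf         # \r not followed by \n (Mac)
    lf = data.count(b"\n") - crlf         # \n not preceded by \r (Unix)
    return {"crlf": crlf, "cr": cr, "lf": lf}


def count_line_ends_in_file(path) -> dict:
    """Apply count_line_ends to a file, read as bytes without translation."""
    with open(path, "rb") as fh:
        return count_line_ends(fh.read())
```

A file with consistent Windows line ends should show a nonzero "crlf" count and zeros for the other two; nonzero counts in more than one category would point to mixed line ends.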
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/