Re: st: Stripping ASCII characters

From	Ronan Conroy <[email protected]>
To	"<[email protected]>" <[email protected]>
Subject	Re: st: Stripping ASCII characters
Date	Tue, 25 Feb 2014 13:52:57 +0000

On 24 Feb 2014, at 21:03, Thomas, Anthony wrote:

> When insheeting a csv file using Stata 11 - Unix, Stata aborts with the error:
>
> too many variables specified
> error in line 5000000 of file
>
> Output of "hexdump" indicated the file contained control characters
> (^Z), and was in binary format, when it was expected to be ASCII. I
> tried using "filefilter "f1.csv" "f2.csv", from(^Z) to() replace" to
> strip the problem characters, but a hexdump on f2.csv indicates the
> (^Z) are still present. From what I understand ^Z (sub) is used in
> place of a character that cannot be read by Stata, is this the case?
> If so, is there any way to strip these characters from my file prior
> to import?

This is the place where a good text editor comes in handy. Many have a 'strip non-ASCII' command that does what you want.

I ended up with 4,500 text files, of which about 10% were corrupted. BBEdit (its free, lite version is TextWrangler) processed the whole lot in a second or two!
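
If no such editor is to hand, the same clean-up can be scripted. What follows is a minimal Python sketch, not a description of what BBEdit does internally. It assumes that "strip non-ASCII" should mean dropping every byte other than printable ASCII, tab, carriage return and line feed, so it also removes control characters such as ^Z (0x1A). The function names are hypothetical, and f1.csv and f2.csv are simply the file names from the question.

```python
import string

# Bytes to keep: printable ASCII plus tab, LF and CR.
# Vertical tab (0x0B) and form feed (0x0C) are dropped as well (an assumption).
KEEP = frozenset(string.printable.encode("ascii")) - {0x0B, 0x0C}

# Every other byte value (control characters such as 0x1A, and all bytes >= 0x80).
DELETE = bytes(b for b in range(256) if b not in KEEP)


def strip_non_ascii(data: bytes) -> bytes:
    """Return data with every byte listed in DELETE removed."""
    return data.translate(None, DELETE)


def clean_file(src: str, dst: str, chunk_size: int = 1 << 20) -> None:
    """Copy src to dst in chunks, stripping unwanted bytes along the way."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(chunk_size):
            fout.write(strip_non_ascii(chunk))


if __name__ == "__main__":
    import os

    # File names taken from the question; only run if the input exists.
    if os.path.exists("f1.csv"):
        clean_file("f1.csv", "f2.csv")
```

Working on raw bytes in fixed-size chunks keeps memory use flat for a file with millions of lines. Note that any accented characters in the data are deleted rather than transliterated.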

r

Ronán Conroy
[email protected]
Associate Professor
Division of Population Health Sciences
Royal College of Surgeons in Ireland
Beaux Lane House
Dublin 2

