
Re: st: Getting rid of line-breaks in Data

From	"Eric A. Booth" <[email protected]>
To	[email protected]
Subject	Re: st: Getting rid of line-breaks in Data
Date	Thu, 18 Jun 2009 19:49:55 -0500

Elmar:

I had a similar issue with an unknown character (it wasn't a box; it was a symbol that looked like an em-dash with a dot over it) that, similar to your situation, acted like an end-of-line character for some programs. I used -filefilter- with some of its patterns for EOL characters until one of them knocked it out, which solved my issue.

So, you may try all the EOL patterns mentioned in the -filefilter- help file:



filefilter oldfile.txt newfile.txt , from(\n) to(\t)

If "\n" doesn't work, try substituting "\r", "\M", "\W", or "\U", or some ASCII character codes (you might want to try the ASCII code "\254d"; see http://www.theasciicode.com.ar/ascii-table-codes/ascii-codes-254.html for more).
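If trial and error over the patterns gets tedious, it may help to look at the raw bytes first and see which control characters the file actually contains. What follows is only a rough sketch in Python, outside Stata; the function names, the set of "suspect" bytes, and the sample data are my own assumptions, not anything from -filefilter-. It prints each suspect byte in the same three-digit decimal style as the "\254d" code above.

```python
from collections import Counter

# Assumption: treat every control byte except tab as a possible line break,
# plus DEL. LF (10) and CR (13) are included on purpose so they get counted.
SUSPECT = (set(range(0x00, 0x20)) - {0x09}) | {0x7F}


def count_suspect_bytes(data: bytes) -> Counter:
    """Tally the control bytes in `data` that might act as line breaks."""
    return Counter(b for b in data if b in SUSPECT)


def replace_bytes(data: bytes, targets: bytes, replacement: bytes = b" ") -> bytes:
    """Replace every byte value listed in `targets` with `replacement`."""
    out = data
    for b in targets:
        out = out.replace(bytes([b]), replacement)
    return out


if __name__ == "__main__":
    # Made-up sample: a vertical tab (byte 11) hiding inside a text field.
    sample = b"id,comment\n1,first line\x0bsecond line\n2,ok\r\n"
    for byte, n in sorted(count_suspect_bytes(sample).items()):
        print(f"\\{byte:03d}d  (hex {byte:02x})  occurs {n} time(s)")
    cleaned = replace_bytes(sample, b"\x0b")
    print(cleaned)
```

For a real file, read it in binary mode, e.g. open("oldfile.txt", "rb").read(), so that nothing gets translated on the way in. Any byte that shows up besides the expected line endings is a candidate to try in from().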



Eric

__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754
Fax: +979.845.0249





On Jun 18, 2009, at 7:32 PM, Matt Spittal wrote:

Dear Elmar,
Carriage returns can be very difficult to deal with. I don't have any clear
answers, except to say that I have found a good text editor to be invaluable
for cleaning a file. For instance, with my text editor (TextWrangler) I can
change between UNIX, Windows and Mac carriage returns and I can use grep
functions to find and replace symbols like the carriage return. If you can
export your data from Access as a text file (csv) and then clean it within a
text editor, then this might be a good solution.
I am not sure what computer system or text editor you are using at present,
but some very good advice on text editors is given here.

   http://fmwww.bc.edu/repec/bocode/t/textEditors.html

Good luck,

-- Matt
[email protected]




On 18/6/09 5:28 PM, "Elmar Saathoff" <[email protected]> wrote:
Dear list members,
I am frequently using data that were imported from PDAs via MsAccess. In
some cases these data contain some little squares that do not seem to do
much harm in Stata, but that other applications interpret as
linebreaks/carriage returns/paragraph marks, which is quite a hassle. It
seems that these things are inadvertently entered into the PDAs by the
people collecting the data. Unfortunately I cannot show them in this
email, because my email client also interprets them as carriage returns.
Anyway, I have been trying to identify and get rid of these things by
programming (using "subinstr", "egen...split" etc.), but unfortunately,
whatever I do, Stata also interprets them as carriage returns, both in
do files and in the command window, even if I change the delimiter to
";" via the delimit command.

Any advice would be greatly appreciated.

Thanks in advance, Elmar
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
