Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Use Regular Expressions to replace words/strings of characters in a text file
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Use Regular Expressions to replace words/strings of characters in a text file
Date: Thu, 16 Aug 2012 18:55:55 +0100
Incidentally, writing a blank line if you don't want a line looks
unnecessary and indeed a bad idea.
Nick
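
The read loop described in the quoted messages below can be sketched as follows. This is a minimal illustration, not code posted in the thread: the input file tt.xml comes from the original question, while the output file name tt_cells.xml and the handle names src and dst are assumptions. It follows the earlier "myinput"/"myoutput" advice by writing to a separate file rather than opening one file as read write, and it reads the next line at the bottom of the loop so that r(eof) eventually becomes true.

```stata
* Sketch only: tt_cells.xml, src and dst are hypothetical names.
file open src using "tt.xml", read text
file open dst using "tt_cells.xml", write text replace

* Read the first line before entering the loop.
file read src line
while r(eof) == 0 {
    * Copy the line only if it contains "Cell".
    if strpos(`"`macval(line)'"', "Cell") {
        file write dst `"`macval(line)'"' _newline
    }
    * Lines without "Cell" are simply not written, which deletes them.

    * Read the next line; without this, r(eof) never changes.
    file read src line
}

file close src
file close dst
```

No -exit- is needed inside the loop: the loop ends on its own once -file read- reaches the end of the file. To change a line rather than drop it, modify the local macro line (for example with subinstr()) before the -file write-.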
On Thu, Aug 16, 2012 at 6:54 PM, Nick Cox <[email protected]> wrote:
> As I understand it, your program reads in one line from the file and
> then cycles round and round because r(eof) remains false at the bottom
> of each loop. The only exceptions will be when a file is precisely one
> line long.
>
> So, once you have processed each line, you must (try to) read in
> another line. You may be used to scripting languages in which cycling
> over lines of an input file is automatic, but Stata doesn't work like
> that with the -file- commands.
>
> Earlier I pointed you to -log2html- (SSC) which (necessarily) uses
> precisely the same logic. Did you study it?
>
> Nick
>
> On Thu, Aug 16, 2012 at 6:42 PM, tashi lama <[email protected]> wrote:
>> Actually, maybe I don't need them. The problem, though (and I have actually tried it), is that if I take away the two -exit-s, the code runs forever.
>> ----------------------------------------
>>> Date: Thu, 16 Aug 2012 18:14:56 +0100
>>> Subject: Re: st: Use Regular Expressions to replace words/strings of characters in a text file
>>> From: [email protected]
>>> To: [email protected]
>>>
>>> Why the -exit-s?
>>>
>>> On Thu, Aug 16, 2012 at 6:09 PM, tashi lama <[email protected]> wrote:
>>> > I can't figure out where I screwed up, although the code looks fine to my eyes. Could someone please take a look? The code looks for a line containing "Cell" and writes the line. If the line doesn't contain "Cell", it leaves a blank.
>>> >
>>> > file open myfile using tt.xml, read write
>>> > file read myfile line
>>> > while r(eof)==0 {
>>> > if strpos(`"`line'"',"Cell"){
>>> > file write myfile `"`line'"' _newline
>>> > exit
>>> > }
>>> > else {
>>> > file write myfile ""
>>> > exit
>>> > }
>>> > }
>>> > file close myfile
>>> > exit
>>> >
>>> > Thanx...
>>> > -------
>>> >
>>> > --------------------------------
>>> >> Date: Thu, 16 Aug 2012 16:14:07 +0100
>>> >> Subject: Re: st: Use Regular Expressions to replace words/strings of characters in a text file
>>> >> From: [email protected]
>>> >> To: [email protected]
>>> >>
>>> >> Suppose you are reading from a file "myinput" and copying to a file
>>> >> "myoutput", except for changes. That will mean
>>> >>
>>> >> 1. Writing a line from "myinput" to "myoutput" unchanged if it is OK.
>>> >>
>>> >> 2. Writing a line to "myoutput" modified if you need to change it.
>>> >>
>>> >> 3. Doing nothing if you need to delete it.
>>> >>
>>> >> A line not written to a new file is the same as a line deleted. You
>>> >> don't need new machinery here.
>>> >>
>>> >> The program -log2html- from SSC is an example of how to do useful
>>> >> things by repeated application of such ideas using -file- commands.
>>> >>
>>> >> Nick
>>> >>
>>> >> On Thu, Aug 16, 2012 at 4:02 PM, tashi lama <[email protected]> wrote:
>>> >> > I got the general idea of what to execute (read line by line, check, and drop), but I am lacking the machinery. So, I go
>>> >> > file open myfile using tt.xml, read write
>>> >> > file read myfile line
>>> >> > **here I need to say delete the `line' if strpos("`line'","table"). I can't find the way to delete the line. Any idea?
>>> >> > Thanx..
>>> >> >
>>> >> > ----------------------------------------
>>> >> >> From: [email protected]
>>> >> >> To: [email protected]
>>> >> >> Subject: RE: st: Use Regular Expressions to replace words/strings of characters in a text file
>>> >> >> Date: Wed, 15 Aug 2012 17:24:21 +0000
>>> >> >>
>>> >> >> Thanx. I did look into -file- before posting this. I will look into it more carefully. Thanx.
>>> >> >> ----------------------------------------
>>> >> >> > Date: Wed, 15 Aug 2012 18:20:53 +0100
>>> >> >> > Subject: Re: st: Use Regular Expressions to replace words/strings of characters in a text file
>>> >> >> > From: [email protected]
>>> >> >> > To: [email protected]
>>> >> >> >
>>> >> >> > No and no. -filefilter- is only part of the answer. If your problems
>>> >> >> > were mine, I would be processing the file with -file- statements. You
>>> >> >> > would have to read in each line and process it. That could include
>>> >> >> > setting markers for when you are in certain parts of the file and are
>>> >> >> > prepared to make particular changes.
>>> >> >> >
>>> >> >> > There is no objection, naturally, to your doing that with any
>>> >> >> > scripting language you know. But my impression is that your problem is
>>> >> >> > programmable in Stata.
>>> >> >> >
>>> >> >> > Nick
>>> >> >> >
>>> >> >> > On Wed, Aug 15, 2012 at 4:59 PM, tashi lama <[email protected]> wrote:
>>> >> >> > > Thanx. This is great, although it seems that its options are limited. Are there ways to get around the following?
>>> >> >> > > 1. filefilter tt1.xml tt2.xml, from("Worksheet") to("") removes "Worksheet" from the file. What if I have a block of characters that I need to remove? Is there a way to say that I want all the characters removed that occur before the string "table" (the first occurrence of "table", since there might be multiple "table" strings in the text)?
>>> >> >> > > 2. This might be redundant with the last part of the first question. There are no options in filefilter for saying "look for the pattern only from here to here", as in subinstr("fafaa","a","t",1) to mean replace the first a, or subinstr("fafaa","a","t",.) to mean replace all a. Is there a way to do search and replace for only part of the text?
>>> >> >> > >
>>> >> >> > > Thanx tons..
>>> >> >> > > Tashi
>>> >> >> > > -----------------------------
>>> >> >> > >> Date: Wed, 15 Aug 2012 01:07:39 +0100
>>> >> >> > >> Subject: Re: st: Use Regular Expressions to replace words/strings of characters in a text file
>>> >> >> > >> From: [email protected]
>>> >> >> > >> To: [email protected]
>>> >> >> > >>
>>> >> >> > >> -help filefilter-
>>> >> >> > >>
>>> >> >> > >> Nick
>>> >> >> > >>
>>> >> >> > >> On Tue, Aug 14, 2012 at 9:18 PM, tashi lama <[email protected]> wrote:
>>> >> >> > >>
>>> >> >> > >> > I looked for previous posts and read them, and still can't find the solution. I could use Stata's string functions (regexr, subinstr, etc.) to search and replace a string within a string. But if I have a text file (test1.xml) that I need to run a search and replace on, how do I do it? Putting the text in a local macro (putting aside the macro length limitation) and using regexr wouldn't be an option, because I have to generate an XML file from the dataset.
>>> >> >> > >> >
>>> >> >> > >> > view test1.xml
>>> >> >> > >> >
>>> >> >> > >> > <?xml version="1.0" encoding="US-ASCII" standalone="yes"?>
>>> >> >> > >> > <?mso-application progid="Excel.Sheet"?>
>>> >> >> > >> > <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
>>> >>
>>> >> *
>>> >> * For searches and help try:
>>> >> * http://www.stata.com/help.cgi?search
>>> >> * http://www.stata.com/support/statalist/faq
>>> >> * http://www.ats.ucla.edu/stat/stata/