Re: st: Datamanagement: warning when using infile with optional if
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: Datamanagement: warning when using infile with optional if
Date
Tue, 28 Feb 2012 16:27:03 +0000
I would put it differently: What is the problem? It appeared that
the messages irritated you, but now you say that you want to see
them. That's all fine by me.
Please note that I did not recommend editing the data file. I
specifically mentioned working on a copy of the data file.
Nick
On Tue, Feb 28, 2012 at 4:17 PM, <[email protected]> wrote:
> Nick,
>
> Thank you for your reply. Good to know that there is no obvious solution, and that you think pursuing the bail-out route is not worthwhile.
> Editing the raw data file is not really an option for me, since I re-run the file in case cat=="A", more specifically I re-run it 20 times (many different categories).
>
> I might use -qui- once I know that the code is working properly.
>
> Thanks again,
> Arne
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: 28 February 2012 16:08
> To: [email protected]
> Subject: Re: st: Datamanagement: warning when using infile with optional if
>
> It's built into Stata that -if- tests every (potential) observation.
> How else is Stata to know -- at least in this problem -- that your test is satisfied? More to your point, adding extra code to ensure bail-out once a line is known to be invalid would slow -infile- down more frequently than it speeds it up: at least that's my guess.
>
> -quietly- suppresses the little messages.
>
> There are many ways to work with this kind of file, including deleting lines from a copy that don't match a regular expression using any decent text editor or scripting language before you enter Stata.
>
> Nick
>
> On Tue, Feb 28, 2012 at 3:41 PM, <[email protected]> wrote:
>
>> I am reading ASCII data with a dictionary using the command -infile-
>> whilst conditioning on a variable (using -if-) that is read in at the
>> same time. I created a simplified example to show you what is happening:
>>
>> The data looks like:
>> -----------------data.txt--------------
>> 1Ajohn1
>> 1B8724
>> 2Ajane0
>> 2B8625
>> 3Amark1
>> -----------------------------------------
>>
>> With dictionary file
>> -----------------dctB.dct--------------
>> dictionary using data.txt {
>> _column(1) int id %1f "Identifier"
>> _column(2) str1 cat %1s "Category"
>> _column(3) int dob %2f "Date of Birth"
>> _column(5) int age %2f "Age"
>> }
>> -----------------------------------------
>>
>> My aim is to read only those lines where the variable cat is equal to B.
>> I do this by making use of the command infile using dctB if cat=="B"
>>
>> I do end up with the required result. Stata does a great job at
>> conditioning on a variable that it is reading at the same time,
>> however, it returns an error for every line where cat=="A" as it
>> contains non-numeric characters, where Stata expects only integers.
>> Not only does this produce messy .log files (especially with thousands
>> of lines), it indicates that Stata has to read every line completely
>> which is time consuming and somewhat unnecessary.
>>
>> Does anyone have a suggestion to improve on my current method?
>> Preferably one that produces readable .log files?
>>
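The suggestion in the thread to filter a copy of the data file before entering Stata could be sketched roughly as follows. This is only an illustration in Python, not something posted by either participant: the function name filter_category, the output file name data_B.txt, and the assumption that the category code always sits in the second character of each line (as in the sample data.txt) are all hypothetical choices made for the example.

```python
import re


def filter_category(src, dst, category, col=1):
    """Copy to dst only those lines of src whose character at
    zero-based position col equals category. The original file
    is left untouched. Returns the number of lines kept."""
    # Assumption: the category code occupies a single fixed column,
    # as in the sample data (column 2 of the dictionary, index 1 here).
    pattern = re.compile(r"^.{%d}%s" % (col, re.escape(category)))
    kept = 0
    with open(src, encoding="ascii") as fin, \
            open(dst, "w", encoding="ascii") as fout:
        for line in fin:
            if pattern.match(line):
                fout.write(line)
                kept += 1
    return kept


if __name__ == "__main__":
    # Hypothetical file names; data.txt is the sample file from the thread.
    n = filter_category("data.txt", "data_B.txt", "B")
    print(n, "lines written to data_B.txt")
```

With a filtered copy like this, the dictionary would be pointed at the copy (here data_B.txt) rather than data.txt, and the -if- qualifier would no longer be needed, so -infile- would never meet the category "A" lines that trigger the messages. Looping over the 20 or so categories would produce one filtered copy per category.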