Re: st: Problem with -infix- with if qualifiers and strings?
From: Robert Picard <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Problem with -infix- with if qualifiers and strings?
Date: Tue, 12 Nov 2013 18:17:05 -0500
Here's how your code runs on my Mac using Stata 12.1. As you can see,
I don't lose an observation. I'm pretty sure that the last line of
your version of "test.txt" is missing a return character, which causes
Stata to skip that line. As to -infix-'s behavior, it must read in
values for -type- and -number- before it can evaluate -if type==20-.
The lines that start with 10 hold 'ABC' and 'ZYX' in columns 3-5, so
reading them as numbers generates the messages you see, even though
those lines are then excluded by the -if- qualifier. One solution is
to read the second field as a string and convert it to float later
with -real()-; see the example below.
Robert
--------------- test.txt ------------------------
10ABC
20321
10ZYX
20654
-------------------------------------------------
. infix type 1-2 str text 3-5 if type==10 using test.txt
(2 observations read)
. list
     +-------------+
     | type   text |
     |-------------|
  1. |   10    ABC |
  2. |   10    ZYX |
     +-------------+
.
. clear
. infix type 1-2 number 3-5 if type==20 using test.txt
'ABC' cannot be read as a number for number[1]
'ZYX' cannot be read as a number for number[2]
(2 observations read)
. list
     +---------------+
     | type   number |
     |---------------|
  1. |   20      321 |
  2. |   20      654 |
     +---------------+
.
. clear
. infix type 1-2 str snumber 3-5 if type==20 using test.txt
(2 observations read)
. gen number = real(snumber)
. list
     +-------------------------+
     | type   snumber   number |
     |-------------------------|
  1. |   20       321      321 |
  2. |   20       654      654 |
     +-------------------------+
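
If the missing return character is indeed the cause, one way to check
is to rebuild the file from within Stata so that every line, the last
one included, is terminated. What follows is only a sketch and was not
run in this thread: the handle name fh is arbitrary, the commands are
meant to be run from a do-file, and -replace- overwrites any test.txt
in the current directory. With the file rebuilt this way, I would
expect -infix- to report 2 observations for type==20.

```stata
* Sketch only: rebuild test.txt with an end-of-line after every record.
* The handle name fh is arbitrary; -replace- overwrites an existing test.txt.
tempname fh
file open `fh' using "test.txt", write text replace
file write `fh' "10ABC" _n
file write `fh' "20321" _n
file write `fh' "10ZYX" _n
file write `fh' "20654" _n   // the final _n is the point of the exercise
file close `fh'

* Read the second field as a string, keep only the type==20 lines,
* then convert the string to a numeric variable.
clear
infix type 1-2 str snumber 3-5 if type==20 using test.txt
gen number = real(snumber)
list
```

Replacing the -gen- line with -destring snumber, generate(number)-
should give the same result; -real()- simply avoids an extra pass when
there is only one variable to convert.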
On Tue, Nov 12, 2013 at 5:13 PM, Jorge Eduardo Pérez Pérez
<[email protected]> wrote:
> Dear Statalist.
>
> I have noticed weird behavior regarding -infix- and data where the
> variable type may change per line. -infix- is not reading the dataset
> properly.
>
> Here's an example. My dataset is:
>
> 10ABC
> 20321
> 10ZYX
> 20654
>
> If I try to read the lines that start with 10 and the remainder as a
> string, everything works:
>
> . clear
>
> . infix type 1-2 str text 3-5 if type==10 using test.txt
> (2 observations read)
>
> . list
>
>      +-------------+
>      | type   text |
>      |-------------|
>   1. |   10    ABC |
>   2. |   10    ZYX |
>      +-------------+
>
> But if I try to read the lines that start with 20, where the remainder
> is a number, Stata seems to be trying to read the lines that start
> with 10 as well, producing a "cannot be read" message and, worse,
> dropping observations from my data!
>
> . clear
>
> . infix type 1-2 number 3-5 if type==20 using test.txt
> 'ABC' cannot be read as a number for number[1]
> 'ZYX' cannot be read as a number for number[2]
> (1 observations read)
>
> . list
>
>      +---------------+
>      | type   number |
>      |---------------|
>   1. |   20      321 |
>      +---------------+
>
> I have replicated this in both Stata 12.1, Windows 7 and 13.1 on
> Windows 8. Can someone replicate to see if this is a bug, or am I
> missing something? In the meantime I will read my variables as strings
> and destring afterwards.
>
> Thanks!
>
> --------------------------------------------
> Jorge Eduardo Pérez Pérez
> Graduate Student
> Department of Economics
> Brown University
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/