Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Need Help with converting String Variables to Numeric Variables
From
"Impavido, Gregorio" <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: Need Help with converting String Variables to Numeric Variables
Date
Tue, 19 Jun 2012 11:01:52 -0400
In addition to the condition suggested by Nick, you could do the substitution before you -destring-.
First list the entries that need substituting, with
gen var2 = var // not to alter the original data
tab var2 if regexm(var2, "[^0-9 .]") // or
tab var2 if missing(real(var2))
and then replace
replace var2 = "0" if var2=="targetvalues"
This is clearly inconvenient if you have many target values, since each one needs its own -replace- line. However, it spares you the use of -force- in -destring- (an option I personally lack confidence in).
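If there are several target values, a loop saves typing the -replace- lines one by one. What follows is only a sketch on made-up data: the six entries and the three target strings are assumptions for illustration (taken from the examples quoted further down the thread), not the real data set.

```stata
* Toy data (hypothetical values, for illustration only)
clear
input str14 var
"0.5"
"<0.01"
"< 4"
"less than 0.1"
""
"12"
end

gen var2 = var                          // work on a copy, not the original

* Show entries that real() cannot read as numbers (empty strings excluded)
tab var2 if missing(real(var2)) & var2 != ""

* One -replace- per target value, written as a loop
foreach v in "<0.01" "< 4" "less than 0.1" {
    replace var2 = "0" if var2 == "`v'"
}

* No -force- here: if a nonnumeric entry were left over,
* -destring- should refuse to convert and report it
destring var2, replace
list
```

The empty entry stays missing after -destring-, so true missings and substituted zeros remain distinguishable.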
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: Tuesday, June 19, 2012 4:25 AM
To: [email protected]
Subject: Re: st: Need Help with converting String Variables to Numeric Variables
The table I suggested was the wrong way round.
tab numvar if missing(strvar)
should be
tab strvar if missing(numvar)
but the mention of -if missing(strvar)- might have alerted you to the
key trick: making your conversions conditional on the value of a
variable.
For example, empty strings "", one or more spaces " ", "  ", etc.,
periods ".", stray text "foo", inequalities such as "<4" will all map
to numeric missing under -destring, force-. It may be, at the easiest,
that
replace numvar = 0 if strpos(strvar, "<") == 1
is enough to get what you want, but you need to look at your data.
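Put together, the sequence might look like the sketch below. The toy data are an assumption for illustration, and the extra condition for "less than" is my guess at what the entries quoted later in this thread would need; check it against your own data.

```stata
* Toy data (hypothetical values, for illustration only)
clear
input str14 strvar
"0.5"
"<0.01"
"< 4"
"less than 0.1"
""
"12"
end

* Nonnumeric strings become numeric missing under -force-
destring strvar, generate(numvar) force

* Inspect which nonempty strings failed to convert
tab strvar if missing(numvar)

* Conditional conversion: only below-limit entries become zero;
* genuinely empty entries stay missing
replace numvar = 0 if strpos(strvar, "<") == 1
replace numvar = 0 if strpos(lower(strvar), "less than") == 1

list
```

Because each -replace- is conditional on the string variable, the empty entry is left as missing rather than being turned into zero.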
On Tue, Jun 19, 2012 at 3:38 AM, Dudekula, Anwar <[email protected]> wrote:
> Dear Nick and Daniel,
>
> Thanks a lot for the response.
>
> Unfortunately, I have missing values in the original variable.
>
> Hence the new variable generated by the destring command with the force option contains additional missing values, on top of those corresponding to missing values in the original variable.
>
> The practical problem with this issue is that the original variable is a biomarker, and a missing value is being created for an observation like "<0.01" in the original variable.
>
> In Stata a missing value is treated as a huge number, but in fact this value should be equal to zero.
Nick Cox [[email protected]]
> I agree with Daniel. But I would check what is being mapped to zero.
>
> destring numvar, force gen(strvar)
> tab numvar if missing(strvar)
> On Mon, Jun 18, 2012 at 11:58 PM, daniel klein
> <[email protected]> wrote:
>
>> Can't you just -destring- your string variable, specifying the -force-
>> option, and -replace- the created missing values with 0 in the new
>> variable? Or am I missing something here?
>
> Anwar
>
>> I am working on a data set from a hospital in which one of the variables
>> holds its observations as string values.
>>
>> I am able to convert numeric string values to numeric values using
>> destring command.
>>
>> But some of the observations are "< 4", "<0.04", "less than 0.1",
>> which need to be converted to zero.
>>
>> In Stata we have the option to convert nonnumeric values to missing values,
>> but I need to convert them to zero.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/