Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: RE: Change roman to Arabic numerals
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: RE: Change roman to Arabic numerals
Date: Mon, 20 Dec 2010 17:45:21 +0000
I pushed this a bit further. A help file (not included here) spells out the assumptions (and limitations) which can also be inferred from the code. I'll ask Kit Baum to put this up on SSC to complete the loop.
Mata isn't essential for this problem, but it makes the problem more fun.
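*! 1.0.0 NJC 20 December 2010
program romantoarabic
    version 9
    syntax varname(string) [if] [in] , Generate(str)

    quietly {
        marksample touse, strok
        count if `touse'
        if r(N) == 0 error 2000

        confirm new variable `generate'

        // work on a trimmed, upper-case copy of the input
        tempvar work
        gen `work' = upper(trim(itrim(`varlist'))) if `touse'
        gen `generate' = .

        // Mata fills the new variable and leaves unparsed characters in the copy
        mata : roman_to_arabic("`work'", "`generate'", "`touse'")

        // anything left over in the copy signals problematic input;
        // save the count now so that later commands cannot disturb r(N)
        count if `work' != "" & `touse'
        local nbad = r(N)
        replace `generate' = . if `work' != "" & `touse'
    }

    if `nbad' {
        di _n as txt "Problematic input: "
        list `varlist' if `work' != "" & `touse'
    }
end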
mata :

void roman_to_arabic(string scalar varname,
                     string scalar genname,
                     string scalar usename) {

    string colvector work
    real colvector y

    work = st_sdata(., varname, usename)
    y = J(rows(work), 1, 0)

    // subtractive pairs first: credit each pair once, then remove it
    y = y + 900 * (strpos(work, "CM") :> 0)
    work = subinstr(work, "CM", "", .)
    y = y + 400 * (strpos(work, "CD") :> 0)
    work = subinstr(work, "CD", "", .)
    y = y + 90 * (strpos(work, "XC") :> 0)
    work = subinstr(work, "XC", "", .)
    y = y + 40 * (strpos(work, "XL") :> 0)
    work = subinstr(work, "XL", "", .)
    y = y + 9 * (strpos(work, "IX") :> 0)
    work = subinstr(work, "IX", "", .)
    y = y + 4 * (strpos(work, "IV") :> 0)
    work = subinstr(work, "IV", "", .)

    // single symbols: remove one occurrence per pass until none remain
    while (sum(strpos(work, "M"))) {
        y = y + 1000 * (strpos(work, "M") :> 0)
        work = subinstr(work, "M", "", 1)
    }
    while (sum(strpos(work, "D"))) {
        y = y + 500 * (strpos(work, "D") :> 0)
        work = subinstr(work, "D", "", 1)
    }
    while (sum(strpos(work, "C"))) {
        y = y + 100 * (strpos(work, "C") :> 0)
        work = subinstr(work, "C", "", 1)
    }
    while (sum(strpos(work, "L"))) {
        y = y + 50 * (strpos(work, "L") :> 0)
        work = subinstr(work, "L", "", 1)
    }
    while (sum(strpos(work, "X"))) {
        y = y + 10 * (strpos(work, "X") :> 0)
        work = subinstr(work, "X", "", 1)
    }
    while (sum(strpos(work, "V"))) {
        y = y + 5 * (strpos(work, "V") :> 0)
        work = subinstr(work, "V", "", 1)
    }
    while (sum(strpos(work, "I"))) {
        y = y + (strpos(work, "I") :> 0)
        work = subinstr(work, "I", "", 1)
    }

    // store the totals, and write back whatever could not be parsed
    st_store(., genname, usename, y)
    st_sstore(., varname, usename, work)
}

end
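
For anyone wanting to try this, here is a minimal sketch of a call. It assumes the program and the Mata function above have already been defined in the current session; the variable names rom and num and the four test values are made up for illustration.

```stata
* toy data: two valid numerals, one in lower case, one invalid string
clear
input str8 rom
"IV"
"MCMIV"
"xiv"
"ABC"
end

* num is created as a new numeric variable
romantoarabic rom, generate(num)
list rom num
```

Going by the code, lower case should be accepted because the input is passed through upper(), and "ABC" should be listed as problematic and left missing in num, because "AB" remains after the "C" has been removed.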
Nick
[email protected]
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: 17 December 2010 19:39
To: '[email protected]'
Subject: st: RE: Change roman to Arabic numerals
I think anyone tempted to write this would be best advised to extract the subtraction parts of the syntax first, i.e. CM etc.
(Also, from what I recall IIII is sometimes allowed as a non-standard variant of IV.)
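
To see why the subtractive pairs need to come out first, here is a small sketch comparing a purely additive count with a pairs-first count for "XIV". The helper count_sym() is hypothetical and is not part of the code in this thread; it counts occurrences by comparing string lengths, which is just one convenient way to do it.

```stata
mata:
// count_sym() is a hypothetical helper: number of times sym occurs in s
real scalar count_sym(string scalar s, string scalar sym)
{
    return((strlen(s) - strlen(subinstr(s, sym, "", .))) / strlen(sym))
}

s = "XIV"

// naive: add up every symbol, ignoring order (displays 16, which is wrong)
10*count_sym(s, "X") + 5*count_sym(s, "V") + count_sym(s, "I")

// pairs first: credit IV as 4, remove it, then add what is left (displays 14)
rest = subinstr(s, "IV", "", .)
4*count_sym(s, "IV") + 10*count_sym(rest, "X") + 5*count_sym(rest, "V") + count_sym(rest, "I")
end
```

A purely additive form such as IIII contains no subtractive pair, so either calculation gives 4 for it.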
Here is one stab. This is a Mata function that works on a string vector of Roman numerals in upper case.
Example first:
. mata
: stuff = ("IV", "MCMIV")

: roman_to_arabic(stuff)
          1      2
    +---------------+
  1 |     4   1904  |
    +---------------+

: roman_to_arabic(stuff')
          1
    +--------+
  1 |     4  |
  2 |  1904  |
    +--------+

: end
Code second:
mata :

real roman_to_arabic(string vector roman) {

    numeric vector ro
    string vector work

    ro = J(rows(roman), cols(roman), 0)
    work = roman

    // subtractive pairs first: credit each pair once, then remove it
    ro = ro + 900 * (strpos(work, "CM") :> 0)
    work = subinstr(work, "CM", "", .)
    ro = ro + 400 * (strpos(work, "CD") :> 0)
    work = subinstr(work, "CD", "", .)
    ro = ro + 90 * (strpos(work, "XC") :> 0)
    work = subinstr(work, "XC", "", .)
    ro = ro + 40 * (strpos(work, "XL") :> 0)
    work = subinstr(work, "XL", "", .)
    ro = ro + 9 * (strpos(work, "IX") :> 0)
    work = subinstr(work, "IX", "", .)
    ro = ro + 4 * (strpos(work, "IV") :> 0)
    work = subinstr(work, "IV", "", .)

    // single symbols: remove one occurrence per pass until none remain
    while (sum(strpos(work, "M"))) {
        ro = ro + 1000 * (strpos(work, "M") :> 0)
        work = subinstr(work, "M", "", 1)
    }
    while (sum(strpos(work, "D"))) {
        ro = ro + 500 * (strpos(work, "D") :> 0)
        work = subinstr(work, "D", "", 1)
    }
    while (sum(strpos(work, "C"))) {
        ro = ro + 100 * (strpos(work, "C") :> 0)
        work = subinstr(work, "C", "", 1)
    }
    while (sum(strpos(work, "L"))) {
        ro = ro + 50 * (strpos(work, "L") :> 0)
        work = subinstr(work, "L", "", 1)
    }
    while (sum(strpos(work, "X"))) {
        ro = ro + 10 * (strpos(work, "X") :> 0)
        work = subinstr(work, "X", "", 1)
    }
    while (sum(strpos(work, "V"))) {
        ro = ro + 5 * (strpos(work, "V") :> 0)
        work = subinstr(work, "V", "", 1)
    }
    while (sum(strpos(work, "I"))) {
        ro = ro + (strpos(work, "I") :> 0)
        work = subinstr(work, "I", "", 1)
    }

    return(ro)
}

end
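
For comparison only, here is a sketch of a different and widely used technique: scan a single numeral from right to left, subtracting a symbol whenever it is smaller than the symbol to its right and adding it otherwise. This is not the approach taken in the code above; the name roman_scan() is made up, and the function assumes one upper-case string with no spaces.

```stata
mata:
// roman_scan() is a hypothetical name; it handles one numeral at a time
real scalar roman_scan(string scalar s)
{
    string scalar symbols
    real rowvector values
    real scalar i, pos, v, prev, total

    symbols = "IVXLCDM"
    values  = (1, 5, 10, 50, 100, 500, 1000)
    total = 0
    prev  = 0

    // walk from the last character to the first
    for (i = strlen(s); i >= 1; i--) {
        pos = strpos(symbols, substr(s, i, 1))
        if (pos == 0) return(.)        // not a Roman symbol: give up
        v = values[pos]
        if (v < prev) total = total - v
        else          total = total + v
        prev = v
    }
    return(total)
}

roman_scan("MCMIV")     // displays 1904
roman_scan("IIII")      // displays 4
end
```

Like the code above, this does no validation beyond rejecting characters that are not Roman symbols, so malformed numerals still produce a number.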
Nick
[email protected]
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Lachenbruch, Peter
Sent: 17 December 2010 18:50
To: '[email protected]'
Subject: st: Change roman to Arabic numerals
A colleague wants to generate Arabic numbers from Roman numerals, and I was wondering if anyone has written a routine for this. She only has I to X, so I suggested
gen numb = (rom=="I") + 2*(rom=="II") + 3*(rom=="III") + 4*(rom=="IV") etc.
This is OK for this application, but not if we have many numbers. Of course the ordering gets messed up (I, II, III, IV, IX, V, VI, VII, VIII, X), so encode won't work, and gen numb=real(rom) won't do either.
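
For the I-to-X case, a loop over the ten numerals is one way to avoid typing out ten terms. This is only a sketch: the toy data are made up, the variable names rom and numb follow the question, and it should be run from a do-file because it uses a local macro.

```stata
* toy data for illustration
clear
input str4 rom
"III"
"IX"
"V"
"X"
end

* step through the numerals in numeric order, assigning 1, 2, ..., 10
gen numb = .
local i = 0
foreach r in I II III IV V VI VII VIII IX X {
    local ++i
    quietly replace numb = `i' if rom == "`r'"
}
list rom numb
```

Any value of rom outside the list is left missing in numb, which makes unexpected input easy to spot.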
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/