Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Insert spaces into a string variable upto the max length of a variable.
From
Tim Evans <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: Insert spaces into a string variable upto the max length of a variable.
Date
Tue, 11 Feb 2014 13:32:11 +0000
Hi all,
With some tinkering, I think I have come up with a solution to my original problem, which was to include the odds ratios, CIs and p-values on a graph. I originally wanted a table to the side but couldn't make myself happy with the final presentation: I couldn't control the placement of the table very well relative to the axis labels. My workaround was to include the estimates as part of the axis labels, which I achieved by using both y axes. I appreciate that there is probably an element of repetition and that the code could benefit from a tidy-up.
I've pasted a worked example below. Please note that it requires a number of user-written commands in order to run fully,
-labmask-, -parmest- and -revrs- being the most obvious ones I could see.
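For anyone wanting to run the example, here is a minimal sketch of installing the user-written commands from SSC. The package names are assumptions rather than something stated in this message: -labmask- is assumed to ship in the -labutil- package, and -egenmore- is included because it is named further down the thread and is assumed to supply the -axis()- function used with -egen-.

```stata
* Assumed SSC package names; run -ssc describe pkgname- first if unsure.
ssc install parmest   // parmest: estimation results as a dataset
ssc install labutil   // assumed to contain labmask
ssc install egenmore  // extra egen functions, assumed to include axis()
ssc install revrs     // revrs: reverse-coded copy of a variable
```

Typing -which parmest- (and likewise for the others) afterwards shows whether each command is now found.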
*******************BEGIN CODE***************
sysuse nlsw88, clear
replace ttl_exp = ttl_exp / 10
logit union i.race i.south grade ttl_exp, or
foreach v in grade ttl_exp { // continuous variables
local l`v' : variable label `v'
if `"`l`v''"' == "" { // if no variable label
local l`v' "`v'"
}
}
foreach v in south race { // factor variables
levelsof `v'
local `v'levs `r(levels)'
foreach l in ``v'levs' {
local l`v'_`l' : label (`v') `l'
if `"`l`v'_`l''"' == "`l'" { // if no value label
local l`v'_`l' `"`v' == `l'"'
}
}
}
parmest, norestore eform
foreach v in grade ttl_exp { // full name needed: parm holds "ttl_exp"
replace parm = "`l`v''" if parm == "`v'"
}
foreach v in south race {
local i = 1
foreach l in ``v'levs' {
if `i' == 1 {
drop if parm == "`l'b.`v'" // drop reference
}
else {
replace parm = "`l`v'_`l''" if parm == "`l'.`v'"
}
local i = `i' + 1
}
}
replace parm = "baseline odds" if parm == "_cons"
egen axis= axis(z), label(parm)
gen int flag = 1 if p<0.05
replace flag = 0 if p>=0.05
replace flag = 2 if p==. & estimate==1
label define flag 0 "Not significant" 1 "Significant" 2 "Base value"
label values flag flag
g flag2 = flag
replace flag2 = 0 if !inlist(flag2, 0,1)
separate estimate, by(flag2)
g str_est0 = string(estimate0, "%09.2fc")
g str_est1 = string(estimate1, "%09.2fc")
g str_min = string(min95, "%09.2fc")
g str_max = string(max95, "%09.2fc")
g str_P = string( p, "%09.4fc")
g double count = 10*runiform()*100
g str_count = string(count, "%09.0fc")
replace str_P = "" if str_P=="."
replace str_P = "<0.0001" if p<0.0001
g c = ","
g br1 = " ("
g br2 = ") "
gen length = length(trim(str_P))
su length, meanonly
local max = r(max)
local pad : di _dup(`max') " "
replace str_P = substr("`pad'", 1, `max' - length) + trim(str_P)
drop length
gen length = length(trim(str_est0))
su length, meanonly
local max = r(max)
local pad : di _dup(`max') " "
replace str_est0 = trim(str_est0) + substr("`pad'", 1, `max' - length)
drop length
gen length = length(trim(str_est1))
su length, meanonly
local max = r(max)
local pad : di _dup(`max') " "
replace str_est1 = trim(str_est1) + substr("`pad'", 1, `max' - length)
drop length
gen length = length(trim(str_min))
su length, meanonly
local max = r(max)
local pad : di _dup(`max') " "
replace str_min = trim(str_min) + substr("`pad'", 1, `max' - length)
drop length
gen length = length(trim(str_max))
su length, meanonly
local max = r(max)
local pad : di _dup(`max') " "
replace str_max = trim(str_max) + substr("`pad'", 1, `max' - length)
drop length
**GENERATE VARIABLE
*************************************
**For left side
g est5 = parm + "; (n=" + str_count + ")"
**For right side
g est6 = str_est1 + " " + "(" + str_min
*drop length
gen length = length(trim(est6))
su length, meanonly
local max = r(max)
local pad : di _dup(`max') " "
replace est6 = trim(est6) + substr("`pad'", 1, `max' - length)
drop length
replace est6 = est6 + c + str_max + ")"
g est7 = str_est0 + " " + "(" + str_min
*drop length
gen length = length(trim(est7))
su length, meanonly
local max = r(max)
local pad : di _dup(`max') " "
replace est7 = trim(est7) + substr("`pad'", 1, `max' - length)
drop length
replace est7 = est7 + c + str_max + ")"
replace est6 = est7 if estimate0!=.
gen length = length(trim(est6))
su length, meanonly
local max = r(max)
local pad : di _dup(`max') " "
replace est6 = trim(est6) + substr("`pad'", 1, `max' - length)
drop length
replace est6 = est6 + "; " + str_P
drop est7
***************USE THIS VARIABLE AS A LABEL
encode est5, g(est_touse)
g order2 = _n
egen axis2= axis(order2), label(est_touse)
gsort axis2
labmask order2, values(est_touse)
count if inlist(flag,0,1)
local N = r(N)
revrs axis2
*g flag2 = flag
*replace flag2 = 0 if !inlist(flag2, 0,1)
****NOW FOR THE SECOND AXIS
encode est6, g(est_touse1)
g order3 = _n
egen axis3= axis(order3), label(est_touse1)
gsort axis3
labmask order3, values(est_touse1)
count if inlist(flag,0,1)
local N = r(N)
revrs axis3
format %-65.0g revaxis2
**Graph this so that Odds ratio graph is produced with estimates included
twoway bar estimate0 revaxis2, xscale(log) base(1) horizontal barw(0.5) color(gs13) ///
yaxis(1) yla(1/`=_N', noticks valuelabel labsize(vsmall) ang(h) axis(1)) ///
xline(1, lcolor(gs13)) ytitle("", axis(1)) ///
xla(, labsize(vsmall)) || ///
rcap min95 max95 revaxis2, horizontal ///
|| bar estimate1 revaxis3, yaxis(2) xscale(log) base(1) horizontal barw(0.5) color(red) ///
xline(1, lcolor(gs13)) yla(1/`=_N', noticks valuelabel labsize(vsmall) ang(h) axis(2)) ytitle("", axis(2)) || ///
rcap min95 max95 revaxis3, horizontal ///
legend(order(3 "Significant" 1 "Not significant") size(small)) ///
xtitle("Odds ratio", size(small)) ytitle("") r2("{bf:ORs (95% CIs); p-value}", size(vsmall) orientation(horiz) placement(top) margin(t=0 l=-28))
***END CODE*********
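The padding block is repeated for each string variable above, and the suggestion further down the thread to replace each -drop-/-generate- pair with a -replace- makes it easy to fold into a loop. What follows is a minimal sketch under stated assumptions, not a drop-in replacement: the toy variables make and pval stand in for the str_* variables (the values of make come from the original question, those of pval are invented), and len is a hypothetical helper variable.

```stata
* Sketch only: toy data standing in for the str_* variables above.
clear
input str10 make str10 pval
"Buick"   "0.0312"
"AMC"     "<0.0001"
"Lincoln" "0.5"
end

* One helper variable, refilled with -replace- rather than dropped and regenerated.
gen int len = .

* Left-align: pad on the right up to the longest trimmed value.
foreach v of varlist make {
    replace len = length(trim(`v'))
    su len, meanonly
    local max = r(max)
    local pad : di _dup(`max') " "
    replace `v' = trim(`v') + substr("`pad'", 1, `max' - len)
}

* Right-align: the same pad goes in front of the value instead.
foreach v of varlist pval {
    replace len = length(trim(`v'))
    su len, meanonly
    local max = r(max)
    local pad : di _dup(`max') " "
    replace `v' = substr("`pad'", 1, `max' - len) + trim(`v')
}
drop len
```

In the code above, the first loop would take str_est0 str_est1 str_min str_max and the second str_P. As discussed in the thread, equal string lengths only line up on a graph when the font is fixed-width.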
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: 07 February 2014 16:55
To: [email protected]
Subject: Re: st: Insert spaces into a string variable upto the max length of a variable.
Surely, e.g. you could use both y axes.
Nick
[email protected]
On 7 February 2014 16:24, Tim Evans <[email protected]> wrote:
> Nick,
>
> I do mean y axis and yes there are a number of user-written codes within my big jumble of code which have not been declared.
>
> Might there be another way of incorporating the OR, 95% CIs and p-values onto the graph?
>
> Best wishes
> Tim
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: 07 February 2014 16:09
> To: [email protected]
> Subject: Re: st: Insert spaces into a string variable upto the max length of a variable.
>
> When you say x axis, I think you mean y axis.
>
> Your problem seems to hinge on treatment of spaces within string
> labels: I think that's a tough call with most fonts. I don't have tricks for that.
>
> Anyone wanting to try Tim's code should note undeclared use of
> user-written code here, to the extent of -parmest- (SJ etc.),
> -labmask- (SJ etc.), -egenmore- (SSC), -revrs- (SSC), perhaps others.
>
> A trivial detail is that the code can be shortened by replacing each pair of -drop- and -generate- with a -replace- statement. That doesn't affect the problem one bit.
> Nick
> [email protected]
>
>
> On 7 February 2014 15:31, Tim Evans <[email protected]> wrote:
>> Nick,
>>
>> Thanks for your reply. My aim is to combine the estimates of a logistic regression analysis as a table next to a graphical representation of it using the variable labels. I'm after this solution purely for presentational purposes.
>>
>> Here is my code (using nlsw88 data). Apologies for the length of the code.... but it should work. On the finished graph, I would like the x axis labels to be left aligned (if you look at the data at the end of the code, the spacing of the information is how I would like it to appear next to the graph).
>>
>>
>> *****BEGIN CODE****
>>
>> sysuse nlsw88, clear
>>
>> replace ttl_exp = ttl_exp / 10
>>
>> logit union i.race i.south grade ttl_exp, or
>>
>> foreach v in grade ttl_exp { // continuous variables
>> local l`v' : variable label `v'
>> if `"`l`v''"' == "" { // if no variable label
>> local l`v' "`v'"
>> }
>> }
>> foreach v in south race { // factor variables
>> levelsof `v'
>> local `v'levs `r(levels)'
>> foreach l in ``v'levs' {
>> local l`v'_`l' : label (`v') `l'
>> if `"`l`v'_`l''"' == "`l'" { // if no value label
>> local l`v'_`l' `"`v' == `l'"'
>> }
>> }
>> }
>>
>> parmest, norestore eform
>> foreach v in grade ttl_exp {
>> replace parm = "`l`v''" if parm == "`v'"
>> }
>> foreach v in south race {
>> local i = 1
>> foreach l in ``v'levs' {
>> if `i' == 1 {
>> drop if parm == "`l'b.`v'" // drop reference
>> }
>> else {
>> replace parm = "`l`v'_`l''" if parm == "`l'.`v'"
>> }
>> local i = `i' + 1
>> }
>> }
>> replace parm = "baseline odds" if parm == "_cons"
>> egen axis= axis(z), label(parm)
>>
>>
>> gen int flag = 1 if p<0.05
>> replace flag = 0 if p>=0.05
>> replace flag = 2 if p==. & estimate==1
>> label define flag 0 "Not significant" 1 "Significant" 2 "Base value"
>> label values flag flag
>> g flag2 = flag
>> replace flag2 = 0 if !inlist(flag2, 0,1)
>>
>> separate estimate, by(flag2)
>>
>> g str_est0 = string(estimate0, "%09.2fc")
>> g str_est1 = string(estimate1, "%09.2fc")
>> g str_min = string(min95, "%09.2fc")
>> g str_max = string(max95, "%09.2fc")
>> g str_P = string( p, "%09.4fc")
>> replace str_P = "" if str_P=="."
>> replace str_P = "<0.0001" if p<0.0001
>> g c = ","
>> g br1 = " ("
>> g br2 = ") "
>>
>>
>>
>> gen length = length(trim(parm))
>> su length, meanonly
>> local max = r(max)
>> local pad : di _dup(`max') " "
>> replace parm = trim(parm) + substr("`pad'", 1, `max' - length)
>> drop length
>>
>> gen length = length(trim(str_P))
>> su length, meanonly
>> local max = r(max)
>> local pad : di _dup(`max') " "
>> replace str_P = substr("`pad'", 1, `max' - length) + trim(str_P)
>>
>> drop length
>> gen length = length(trim(str_est0))
>> su length, meanonly
>> local max = r(max)
>> local pad : di _dup(`max') " "
>> replace str_est0 = trim(str_est0) + substr("`pad'", 1, `max' - length)
>>
>> drop length
>> gen length = length(trim(str_est1))
>> su length, meanonly
>> local max = r(max)
>> local pad : di _dup(`max') " "
>> replace str_est1 = trim(str_est1) + substr("`pad'", 1, `max' - length)
>>
>> drop length
>> gen length = length(trim(str_min))
>> su length, meanonly
>> local max = r(max)
>> local pad : di _dup(`max') " "
>> replace str_min = trim(str_min) + substr("`pad'", 1, `max' - length)
>>
>> drop length
>> gen length = length(trim(str_max))
>> su length, meanonly
>> local max = r(max)
>> local pad : di _dup(`max') " "
>> replace str_max = trim(str_max) + substr("`pad'", 1, `max' - length)
>> drop length
>>
>>
>> **GENERATE VARIABLE
>>
>>
>> *************************************
>> g est5 = parm + "; " + str_est1 + " " + "(" + str_min
>>
>> *drop length
>> gen length = length(trim(est5))
>> su length, meanonly
>> local max = r(max)
>> local pad : di _dup(`max') " "
>> replace est5 = trim(est5) + substr("`pad'", 1, `max' - length)
>> drop length
>>
>> replace est5 = est5 + c + str_max + ")"
>>
>>
>> ***************USE THIS VARIABLE AS A LABEL
>>
>>
>> encode est5, g(est_touse)
>> g order2 = _n
>>
>> egen axis2= axis(order2), label(est_touse)
>> gsort axis2
>> labmask order, values(est_touse)
>> count if inlist(flag,0,1)
>> local N = r(N)
>> revrs axis2
>> *g flag2 = flag
>> *replace flag2 = 0 if !inlist(flag2, 0,1)
>>
>> format %-65.0g revaxis2
>>
>> twoway bar estimate0 revaxis2, xscale(log) base(1) horizontal barw(0.5) color(gs13) ///
>> xline(1, lcolor(gs13)) yla(1/`=_N', noticks valuelabel labsize(vsmall) ang(h)) xla(, labsize(vsmall)) || ///
>> rcap min95 max95 revaxis2, horizontal ///
>> || bar estimate1 revaxis2, xscale(log) base(1) horizontal barw(0.5) color(red) ///
>> xline(1, lcolor(gs13)) yla(1/`=_N', noticks valuelabel labsize(vsmall) ang(h)) || ///
>> rcap min95 max95 revaxis2, horizontal ///
>> legend(order(3 "Significant" 1 "Not significant") size(small)) ytitle("") ///
>> xtitle("Odds ratio", size(small)) ytitle("") l2("{bf:ORs (95% CIs)}", size(vsmall) orientation(horiz) placement(top) margin(t=0 r=-60))
>>
>> *****END CODE****
>>
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Nick Cox
>> Sent: 07 February 2014 14:54
>> To: [email protected]
>> Subject: Re: st: Insert spaces into a string variable upto the max length of a variable.
>>
>> I have no difficulty in using string variables in -graph bar- or -graph hbar-, meaning e.g.
>>
>> sysuse auto, clear
>> decode foreign, gen(origin)
>> graph hbar (mean) mpg, over(origin)
>> replace origin = origin + " " if origin == "Foreign"
>> graph hbar (mean) mpg, over(origin)
>>
>> This example shows that right alignment can be approximated, if not gained exactly, at least with the settings on Stata 13.1 for Windows.
>>
>> Your problem arises with -twoway bar-, so please either try an equivalent command in -graph bar|hbar|dot- or tell us _exactly_ why that won't work in your case. As requested in the FAQ, showing us exact syntax used helps greatly.
>>
>> Nick
>> [email protected]
>>
>> On 7 February 2014 14:38, Tim Evans <[email protected]> wrote:
>>
>>> Thank you very much for your help.
>>>
>>> I have used this code in order to create a variable that is based upon a number of other variables and I wanted to ensure correct spacing between the variables when I concatenated them together. I now left align the string variable and I notice that the combined variables now align for all records.
>>>
>>> My next thing is that I want to use this variable in a -tw- bar graph. I understand that strings cannot be used in a bar graph, so I encode my newly created string variable. However, I lose the left alignment I previously enjoyed with the string variable and while the contents are the same the alignment is now right aligned (I have tried left aligning in the variable properties). Is this because what is being aligned is the numeric value behind the label? If so, how can I duplicate the exact alignment in the string variable into the numeric labelled variable?
>>
>> Tim Evans
>>
>>> Thanks for this - much appreciated. Am working through this with other variables now.
>>
>> Nick Cox
>>
>>> gen length = length(trim(strvar))
>>> su length, meanonly
>>>
>>> local max = r(max)
>>> local pad : di _dup(`max') " "
>>>
>>> replace strvar = trim(strvar) + substr("`pad'", 1, `max' - length)
>>>
>>> NB _not_ length("strvar") !!!
>>>
>>> The -length()- function finds the length. -summarize- returns the maximum length. The biggest possible "pad" is that many spaces. The length of the part of the pad needed is the difference between the length of the trimmed string and the maximum length.
>>
>>
>> On 6 February 2014 21:34, Tim Evans <[email protected]> wrote:
>>
>>>> I'm using Stata 11.2. I have a string variable called 'make' containing the following:
>>>>
>>>> Buick
>>>> AMC
>>>> Lincoln
>>>>
>>>> The max string length is 7 (for Lincoln) with the smallest being 3 for AMC.
>>>>
>>>> What I want to do is assess the maximum length of the variable 'make' and then insert extra spaces in order to make the length of each value up to the max, i.e. add two spaces to the end of Buick, four to AMC and none to Lincoln.
>>>>
>>>> I need this for when I concatenate the string variable with another string so that the start of the second string component is aligned for all values.
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
>>
>> *********************************************************************
>> The information contained in the EMail and any attachments is
>> confidential and intended solely and for the attention and use of the
>> named addressee(s). It may not be disclosed to any other person
>> without the express authority of Public Health England, or the
>> intended recipient, or both. If you are not the intended recipient,
>> you must not disclose, copy, distribute or retain this message or any
>> part of it. This footnote also confirms that this EMail has been
>> swept for computer viruses by Symantec.Cloud, but please re-sweep any
>> attachments before opening or saving. http://www.gov.uk/PHE
>> *********************************************************************