I agree with Alan. Although graphics with string variables
is not out of the question in Stata, it will prove limited and frustrating,
partly because of an in-built tendency to sort values by alphabetic order.
You are much better off working with numeric variables and
value labels.
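A minimal sketch of that point, assuming a hypothetical string variable conc holding values such as "<2", "2", "4" and ">4" (the names conc and concnum and the toy data are invented for illustration). Tabulating the string lists "2" and "4" before "<2" and ">4", because strings follow sort order; a numeric copy with value labels keeps the intended order.

```stata
* toy data; conc is a hypothetical variable name
clear
input str3 conc
"<2"
"2"
"4"
">4"
"2"
"<2"
end

* strings are tabulated in sort order, not in order of magnitude
tabulate conc

* numeric codes fix the order; value labels keep the original text
generate byte concnum = 1 if conc == "<2"
replace concnum = 2 if conc == "2"
replace concnum = 3 if conc == "4"
replace concnum = 4 if conc == ">4"
label define concnum 1 "<2" 2 "2" 3 "4" 4 ">4"
label values concnum concnum
tabulate concnum
```

Any string value not covered by one of the conditions is left missing in concnum, which makes unexpected values easy to spot.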
As for bar charts showing frequencies, a rather basic graph
for many, official Stata is surprisingly ornery. I usually
miaow and reach for -catplot- from SSC, supposedly, like
Catwoman circa 1965, purrfect for the purrpose.
Here is some technique.
. sysuse auto, clear
(1978 Automobile Data)
. gen newrep78 = cond(rep78 <= 2, 1, cond(rep78 == 3, 2, 3)) if rep78 < .
. tab rep78 newrep78
    Repair |
    Record |             newrep78
      1978 |         1          2          3 |     Total
-----------+---------------------------------+----------
         1 |         2          0          0 |         2
         2 |         8          0          0 |         8
         3 |         0         30          0 |        30
         4 |         0          0         18 |        18
         5 |         0          0         11 |        11
-----------+---------------------------------+----------
     Total |        10         30         29 |        69
. label def newrep78 1 "1 or 2" 2 "3" 3 "4 or 5"
. label val newrep78 newrep78
. catplot bar newrep78
In your case, you are starting with a string; otherwise
that's the basic idea.
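For the string case, one possible route is to define the value labels first, in the order wanted on the axis, and then let -encode- use them. This is only a sketch under assumptions: the variable name conc, the label name concorder and the toy data are hypothetical, and the -catplot- call copies the syntax used in this post, which other releases of -catplot- may not accept (check -help catplot-).

```stata
* toy data standing in for the string variable; conc is hypothetical
clear
input str3 conc
"<2"
"2"
"4"
">4"
"2"
"<2"
end

* define the labels first, in the order wanted on the axis
label define concorder 1 "<2" 2 "2" 3 "4" 4 ">4"

* encode uses the existing codes; with noextend it stops with an
* error if a string value is not already in the label
encode conc, generate(concnum) label(concorder) noextend

* check the ordering before graphing
tabulate concnum

* catplot is user-written (ssc install catplot); syntax as in this post
catplot bar concnum
```

Without noextend, -encode- would add any unlisted string values to the label with new codes, so they would appear after the values defined by hand.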
Nick
Alan Neustadtl
> The encode command may be what you are looking for.
>
> help encode provides the following:
>
> encode varname [if] [in] , generate(newvar) [label(name) noextend]
>
> "encode creates a new variable named newvar based on the string
> variable varname, creating, adding to, or just using (as necessary)
> the value label newvar or, if specified, name. Do not use encode if
> varname contains numbers that merely happen to be stored as strings;
> instead, use generate newvar = real(varname) or destring; see real()
> or [D] destring."
Colleen Murphy
> > I have a dataset with censored observations. Many values of the
> > observations are recorded as for example "<2", ">16", including
> > "greater than or equal to" a value and "less than or equal to" a
> > value. The variables with these observations are string-type
> > variables in STATA. I would like to create a simple bar graph that
> > shows the frequency of observations and have it sorted by numeric
> > value. As in on the x-axis would have the values <2, 2, 4, >4. (in
> > that order) and the y-axis would be the frequency of those values.