Statalist



RE: st: RE: Interval Tabbing?


From   Maarten buis <[email protected]>
To   [email protected]
Subject   RE: st: RE: Interval Tabbing?
Date   Wed, 23 Jan 2008 20:23:47 +0000 (GMT)

Here is an update (bugfix of silly mistake and nicer labels) and a help
file.

-- Maarten

*----------------- begin histtab.ado -----------------------
*! version 1.0.1 MLB 23Jan2008
program define histtab
	version 8.2
	* if/in and fweights are documented in the help file, so accept them
	* here and hand them on to the bin generator
	syntax varname [if] [in] [fweight] [, *]
	tempvar h x x2
	twoway__histogram_gen `varlist' `if' `in' [`weight'`exp'], gen(`h' `x') `options'

	* copy the returned results before another command can replace r()
	local start = r(start)
	local bin = r(bin)
	local width = r(width)

	* one row per bin; int rather than byte so that more than 100 bins fit
	qui gen int `x2' = _n if `x' < .
	label variable `x2' `"`: variable label `x''"'

	* label of the first bin: "lower limit - upper limit"
	local rstart : di %9.2g `start'
	local rstart : list retokenize rstart
	local end = `start' + `width'
	local rend : di %9.2g `end'
	local rend : list retokenize rend

	tempname lab
	label define `lab' 1 "`rstart' - `rend'"

	* each later bin starts where the previous one ended
	forvalues i = 2/`bin' {
		local start = `end'
		local rstart : di %9.2g `start'
		local rstart : list retokenize rstart
		local end = `end' + `width'
		local rend : di %9.2g `end'
		local rend : list retokenize rend
		label define `lab' `i' "`rstart' - `rend'", add
	}
	label values `x2' `lab'
	tabdisp `x2' if `x2' < ., cellvar(`h')
end
*------------------ end histtab.ado ------------------------
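
A minimal usage sketch, assuming histtab.ado has been saved somewhere on
the adopath. The auto dataset shipped with Stata, the file name example.do
and the particular options are chosen here for illustration only; they are
not part of the original post.

*----------------- begin example.do -----------------------
* example.do is a hypothetical file name
sysuse auto, clear

* tabulate the frequencies of mpg in 5 bins
histtab mpg, bin(5) frequency

* same bins, but shown as percentages
histtab mpg, bin(5) percent

* draw the corresponding graph for comparison
histogram mpg, bin(5) frequency
*------------------ end example.do ------------------------

Each row of the table corresponds to one bar of the graph, and the cell
holds the height of that bar.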

*----------------- begin histtab.hlp -----------------------
{smcl}
{* *! version 1.0.0  23Jan2008}{...}
{cmd:help histtab}
{hline}

{title:Title}

{p 4 35 2}
{hi:histtab} {hline 2} Tabulate a histogram


{title:Syntax}

{p 8 12 2}
{cmd:histtab}
        {varname}
        {weight} {ifin} 
        [{cmd:,}
        {c -(}{it:discrete_options}|{it:continuous_options}{c )-}
        {it:common_options}]

{pstd}
where {it:discrete_options} are

        {it:discrete_options}{col 42}description
        {hline 65}
        {cmdab:d:iscrete}{...}
{col 42}specify data are discrete
        {cmd:width(}{it:#}{cmd:)}{...}
{col 42}width of bins in {it:varname} units
        {cmd:start(}{it:#}{cmd:)}{...}
{col 42}theoretical minimum value
        {hline 65}

{pstd}
and where {it:continuous_options} are

        {it:continuous_options}{col 42}description
        {hline 65}
        {cmd:bin(}{it:#}{cmd:)}{...}
{col 42}{it:#} of bins
        {cmd:width(}{it:#}{cmd:)}{...}
{col 42}width of bins in {it:varname} units
        {cmd:start(}{it:#}{cmd:)}{...}
{col 42}lower limit of first bin
        {hline 65}

{pstd}
and where {it:common_options} are

        {it:common_options}{col 42}description
        {hline 65}
        {cmdab:den:sity}{...}
{col 42}tabulate density (default)
        {cmdab:frac:tion}{...}
{col 42}tabulate fractions
        {cmdab:freq:uency}{...}
{col 42}tabulate frequencies
        {cmd:percent}{...}
{col 42}tabulate percentages
        {cmd:display}{...}
{col 42}display (bin) start and width
        {hline 65}

{pstd}
{cmd:fweight}s are allowed; see {help weights}.


{title:Description}

{pstd}
{cmd:histtab} displays as a table what would otherwise be drawn as a 
histogram. 


{title:Options}

{title:Options for use in the continuous case}

{phang}
{opt bin(#)} and {opt width(#)} are alternatives.  They specify how the
data are to be aggregated into bins; {opt bin()} by specifying the 
number of bins (from which the width can be derived) and {opt width()} 
by specifying the bin width (from which the number of bins can be 
derived).

{pmore}
If neither option is specified, results are the same as if {opt bin(k)}
were specified, where 

{phang3}
{it:k} = min{c -(}sqrt({it:N}), 10*ln({it:N})/ln(10){c )-}

{pmore}
    and where {it:N} is the number of observations.

{phang}
{opt start(#)} specifies the theoretical minimum of {it:varname}.  The 
default is {opt start(m)}, where {it:m} is the observed minimum value 
of {it:varname}.

{pmore}
Specify {opt start()} when you are concerned about sparse data, for 
instance, if you know that {it:varname} can have a value of 0, but you 
are concerned that 0 may not be observed.

{pmore}
{opt start(#)}, if specified, must be less than or equal to {it:m}, or 
else an error will be issued.


{title:Options for use in the discrete case}

{phang}
{opt discrete} specifies that {it:varname} is discrete and that you want
each unique value of {it:varname} to have its own bin (bar of histogram).

{phang}
{opt width(#)} is rarely specified in the discrete case; it specifies 
the width of the bins.  The default is {opt width(d)}, where {it:d} is 
the observed minimum difference between the unique values of 
{it:varname}.

{pmore} 
Specify {opt width()} if you are concerned that your data are sparse.
For example, in theory {it:varname} could take on the values, say, 1, 
2, 3, ..., 9, but because of the sparseness, perhaps only the values 2,
4, 6, and 8 are observed.  Here the default width calculation would 
produce {cmd:width(2)} and you would want to specify {cmd:width(1)}.

{phang}
{opt start(#)} is also rarely specified in the discrete case; it 
specifies the theoretical minimum value of {it:varname}.  The default is 
{opt start(m)}, where {it:m} is the observed minimum value.

{pmore}
As with {opt width()}, you specify {opt start(#)} if you are concerned 
that your data are sparse.  In the previous example, you might also 
want to specify {cmd:start(1)}.  {opt start()} does nothing more than 
add white space to the left side of the graph.

{pmore}
The value of {it:#} in {opt start()} must be less than or equal to 
{it:m}, or an error will be issued.


{title:Common options}

{phang}
{opt density},
{opt fraction},
{opt frequency}, and
{opt percent} specify whether you want the histogram scaled to density 
units, fractional units, frequencies, or percentages.  {opt density} 
is the default.

{pmore}
{opt density} scales the height of the bars so that the sum of their
areas equals 1.

{pmore}
{opt fraction} scales the height of the bars so that the sum of their 
heights equals 1.

{pmore}
{opt frequency} scales the height of the bars so that each bar's
height is equal to the number of observations in the category.  Thus 
the sum of the heights is equal to the total number of observations.

{pmore}
{opt percent} scales the height of the bars so that the sum of their 
heights equals 100.


{phang}
{cmd:display} indicates that a short note be displayed indicating 
the number of bins, the lower limit of the first bin, and the width 
of the bins.  The output displayed is determined by whether the 
{cmd:discrete} option was specified.


{title:Author}

{p 4 4}
Maarten L. Buis{break}
Vrije Universiteit Amsterdam{break}
Department of Social Research Methodology{break}
[email protected] 
{p_end}


{title:Acknowledgement}

{phang}
Several programming tricks from Harrison (2005) are incorporated 
in this program.


{title:References}

{phang}
Harrison, David A. (2005), "Stata tip 20: Generating histogram bin 
variables". {it:The Stata Journal}, 5(2), pp. 280-281.


{title:Also see}

{psee}
Online:
{manhelp histogram R},
{manhelp twoway_histogram G:graph twoway histogram}
{p_end}
*------------------ end histtab.hlp ------------------------
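
The default number of bins described in the help file can be checked by
hand. A small sketch, assuming histtab.ado is on the adopath and using the
auto dataset shipped with Stata (74 observations); the dataset and the
file name defaultbins.do are illustrative and not part of the original
post.

*----------------- begin defaultbins.do -----------------------
* defaultbins.do is a hypothetical file name
sysuse auto, clear

* N is the number of nonmissing observations of mpg
quietly count if mpg < .
local N = r(N)

* k = min{sqrt(N), 10*ln(N)/ln(10)}
display min(sqrt(`N'), 10*ln(`N')/ln(10))

* let histtab report the number of bins, start and width it actually used
histtab mpg, display
*------------------ end defaultbins.do ------------------------

For N = 74 the formula gives about 8.6. The note printed by the display
option shows the whole number of bins that was actually used, so the two
can be compared directly.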

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------





