Title | Stata 7: Getting nice date labels on a graph |
Author | Nicholas J. Cox, Durham University, UK |
If you are graphing time-series data in Stata, one of your axes will show time. For all but the crudest exploratory graphics, good labeling of that axis is important. Tastes and conventions may vary, but most people seem to prefer integer values to fractions, and wherever possible, multiples of 10, or of 5, or of 2. These may be called "nice" or "round" labels.
Here we suppose that you have previously tsset the time variable and assigned to it an appropriate date format. If you then use graph, or any command based on it, and ask Stata to show nice number labels on the time axis, the results may disappoint you. We first explain why this happens and then suggest what to do about it. The background on dates in Stata is explained in much more detail in [U] 27 Commands for dealing with dates.
The main trick suggested is to use one of foreach or forvalues to generate the numbers you need for nice labels. A detailed tutorial on those commands is given by Cox (2002).
In what follows, we assume that time goes on the x axis. If you put time on the y axis, you just need to think in terms of the corresponding options; that is, for xlabel, read ylabel, and so forth.
For example, suppose that you have quarterly data extending from 1960q1 to 1990q4, so that, in your dataset, the time variable t is quarterly and has a format %tq. If we draw a graph with t on the x axis and ask for nice labels using xlabel, the labels will be
1960q1 1972q3 1985q1 1997q3
which may not match your idea of nice or round. What is happening is that Stata is using the underlying numeric values for t when it produces nice labels. The origin 0 of that variable is the first quarter of 1960. (More generally, Stata date or time variables represent the first possible value in 1960 by 0.) These dates are spaced 50 quarters, or 12.5 years, apart, so their underlying values are 0, 50, 100, and 150. Typing a display command with a %tq format within a forvalues loop over 0(50)150 reproduces those dates.
As you may know, the entity (here i) controlling the forvalues loop is a local macro; inside the loop, its contents are referenced by `i'. Adding _c to the display command ensures, at least in this example, that all the dates can fit on one line; it specifies that the next output from display should be continued on the same line. Doing the arithmetic may not always be so easy, so you can also go the other way and ask Stata for the number behind a date: displaying q(1972q3), for example, yields 50.
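The two steps just described can be sketched as follows. This is a minimal illustration rather than the original listing: the range 0(50)150 follows from the four labels shown above, and the " " separator is an assumption added so that the dates do not run together on the line.

```stata
* Show the quarterly dates with underlying values 0, 50, 100, 150
* in %tq format, all on one line (_c suppresses the line break)
forvalues i = 0(50)150 {
    display %tq `i' " " _c
}

* Reverse mapping: from a quarterly date to its underlying number
* (1972q3 is 50 quarters after 1960q1)
display q(1972q3)
```

Typing these lines interactively or running them from a do-file should work equally well; the local macro i exists only while the loop runs.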
Those dates might be what you need, but it seems more likely you want something rounder, such as labels showing (say) the first quarter of every 5th year from 1960 to 1990. One way to do it is to loop over the years 1960(5)1990 with forvalues and, for each year `i', display q(`i'q1) followed by " " and _c. This produces in your Results window:
0 20 40 60 80 100 120
Note one small but crucial detail: the space " ", which keeps the numbers apart.
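A loop that yields these numbers can be sketched as below. It uses only elements named in the text (forvalues, display, the q() function, the " " separator, and _c); the exact layout is an assumption.

```stata
* First quarter of every 5th year from 1960 to 1990,
* shown as underlying numbers separated by spaces on one line
forvalues i = 1960(5)1990 {
    display q(`i'q1) " " _c
}
```

Because `i' is substituted before the line is executed, q(`i'q1) becomes q(1960q1), q(1965q1), and so on, each of which evaluates to a multiple of 20.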
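Depending on taste and convenience, you can now copy these numbers into the xlabel() option of your graph command, or have the loop collect them in a local macro and supply that macro to xlabel(). The last of these possibilities is closer to what would belong in a program but is less attractive interactively.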
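As a hedged sketch of putting the numbers to use, the following assumes a hypothetical response variable y, the quarterly time variable t from the example, and that the Stata 7 graph command accepts a space-separated number list in xlabel(). The macro names label and labels are illustrative only.

```stata
* Interactive route: paste the displayed numbers into xlabel()
* (y is a hypothetical variable name, not from the original text)
graph y t, xlabel(0 20 40 60 80 100 120)

* Programmatic route: accumulate the same numbers in a local macro
local labels
forvalues i = 1960(5)1990 {
    local label = q(`i'q1)
    local labels "`labels' `label'"
}
graph y t, xlabel(`labels')
```

The macro route avoids retyping when the range of years changes, at the cost of a few more lines; clearing labels first guards against leftovers from an earlier run in the same do-file.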
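The approach just explained can be extended to other kinds of dates. The common elements are a loop over years, using forvalues or foreach, and a date function that yields the first period in each year `i'. For half-yearly, quarterly, monthly, and weekly dates, respectively, these are
h(`i'h1) q(`i'q1) m(`i'm1) w(`i'w1)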
The odd one out is daily dates; the first day in each year is
d(1jan`i')
With 10 years or so of daily data for the 1990s, we might want every 1 January as a label, which means displaying d(1jan`i') within a loop over the years 1990 to 2000.
In this example, the results are 10958 11323 11688 12054 12419 12784 13149 13515 13880 14245 14610. Copying the results of such a loop will often be easier than doing the arithmetic ourselves, especially with the complication of leap years.
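The daily-date loop can be sketched as follows, assuming the range 1990/2000 implied by the eleven values listed above (10958 is 1 January 1990 and 14610 is 1 January 2000).

```stata
* Underlying daily-date number of every 1 January from 1990 to 2000,
* separated by spaces on one line
forvalues i = 1990/2000 {
    display d(1jan`i') " " _c
}
```

The gaps between successive values are 365 or 366 days, which is why letting Stata evaluate d() is safer than adding 365 by hand.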
Note (Stata 7 only): Users of Stata 7 will find that smarter date labels using this logic are implemented in two community-contributed programs intended for time-series graphics: tsgraph and ofrtplot7 (which is downloadable as part of the modeldiag package). Both commands can be found using findit.