This isn't a question--it's a warning about something that bit me, and a
hope that Stata will change it in the future. (Or perhaps somebody will
explain why it's actually a useful feature.)

The following command:

    gen obs_interval = last_follow_up - first_visit if live_status == "Alive":live_statusw

resulted in obs_interval being set to missing for all observations because
there is no such value label in the dataset as live_statusw. live_statusw
was a typographical error--I meant to type live_status. (The reason the
problem wasn't obvious immediately is that subsequent replace obs_interval
= statements led to most records having a value for obs_interval, so that
the incorrectly missing values didn't make things go clearly wrong until
many analyses later.)

Why doesn't Stata cease execution and complain that there is no such value
label, just as it does if I try to use a non-existent variable name? (OK,
Stata will allow abbreviated variable names, but "live_statusw" is not the
abbreviation of any existing value label in the data set either.) Is
there some purpose to having Stata allow reference to non-existent value
labels? (I'm referring to the non-existence of the value label set
live_statusw here. I understand why a reference to "xxx":live_status
should not lead to an error even if live_status does not include "xxx"
among its possible values.)
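
In the meantime, one possible guard is sketched below. It assumes that
"Alive":live_status evaluates to missing whenever the value label set or the
text "Alive" cannot be found, which is consistent with the behavior described
above. The assert line is an addition for illustration, not part of the
original command.

```stata
* Guard (illustrative): stop here if the lookup "Alive":live_status
* comes back missing, i.e. the label set name is mistyped or "Alive"
* is not among its labels.
assert !missing("Alive":live_status)

* The intended command, with the value label name spelled correctly.
gen obs_interval = last_follow_up - first_visit if live_status == "Alive":live_status
```

With the mistyped name live_statusw in the assert line, the assertion should
fail and halt a do-file at that point rather than many analyses later.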
Clyde Schechter
Dept. of Family & Social Medicine
Albert Einstein College of Medicine
Bronx, NY, USA