Michael Hills <[email protected]> wrote,
> 1. Is it acceptable to users that Stata corp should make major changes
> in functionality to a widely used command, between versions,
> without warning or discussion? [...]
The answer is, "Of course it is not acceptable".
Our official policy is,
We do not change functionality or syntax except under version control.
We do fix bugs and add new functionality.
That is the policy we are supposed to follow. On 08may2002, we updated
-egen-'s -cut(), at()- function, and we violated that policy. It was a
mistake, and we will issue an update tomorrow to fix it. We learned about
our mistake yesterday, 07aug2002.
The situation with -cut()- is indeed confusing. Let me go over it.
1. 01may1999. David Clayton and Michael Hills develop the -cut()-
function in "Recoding variables using grouped values", STB-49.
2. 15dec2000. The -cut()- function is made official in Stata 7.
3. 08may2002. StataCorp introduces an update to "fix" the -cut()-
function after receiving a complaint from a user.
4. 07aug2002. Michael Hills points out that the "fix" has changed
functionality and treats missing values oddly.
5. 07aug2002. Jean Marie Linhart of StataCorp defends the fix.
6. 07aug2002. Others join in to agree with Michael Hills.
7. 07aug2002. Jean Marie Linhart surrenders.
There are actually two problems with the 08may2002 "fix", and I think both
are bothering Michael Hills. The first is the treatment of missing values,
which has received emphasis on the list, and on which we all agree: Missing
should mean missing.
The second is the treatment of nonmissing values larger than the top
cutpoint. In the original article, Clayton and Hills wrote inelegantly
about what the function does, and in what they wrote, one would not suspect
that nonmissing values above the final cutpoint would be mapped to missing.
Later in the article, however, in an example, they make it clear that turning
those nonmissing values into missing is exactly what they intend.
In adopting -cut()-, StataCorp wrote its own inelegant description of the
function, and in that description, one would not suspect that nonmissing
values above the final cutpoint would be mapped to missing.
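
To make the original behavior concrete, here is a small sketch in Python. It
is a model for illustration only: it is not Stata code and not the -egen-
implementation. It assumes the usual reading of -cut(), at()-, namely that
each value is coded with the left-hand end of the interval [a_i, a_{i+1})
that contains it, and that values below the first cutpoint are also set to
missing. The name cut and the use of None for missing are choices made for
this sketch.

```python
import math

def cut(x, at):
    """Model of the original -cut(), at()- coding (illustrative only).

    Returns the left-hand end of the interval [at[i], at[i+1]) that
    contains x, or None (standing in for Stata's missing value).
    """
    # Missing in, missing out.
    if x is None or (isinstance(x, float) and math.isnan(x)):
        return None
    # Assumption: intervals are closed on the left, open on the right.
    for lo, hi in zip(at, at[1:]):
        if lo <= x < hi:
            return lo
    # Nonmissing values below the first cutpoint, or at or above the
    # final cutpoint, fall in no interval and are mapped to missing.
    return None

cutpoints = [0, 10, 20, 30]
values = [5, 15, 25, 35, None]
coded = [cut(v, cutpoints) for v in values]
print(coded)  # [0, 10, 20, None, None]
```

The value 35 is nonmissing but lies above the final cutpoint of 30, so it is
coded as missing. That is the step neither written description would lead
one to suspect.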
Then, much later, StataCorp received a complaint from a user who said, "Look
at what -cut()- is doing. Read your description. It's broken." Technical
Services did exactly that and agreed. It was turned over to Jean Marie to be
fixed.
Jean Marie fixed the problem and, in the process, added her own little bit to
the mix, having to do with the treatment of missing values, on which we
have all focused (and on which we all now agree).
That, however, still leaves the big problem: We did change the behavior of
-cut()- on 08may2002, even though we thought we were just fixing a bug.
Ergo,
1. We will return -cut()- to its original behavior.
2. We will consider whether our retracted change is an improvement.
If we determine that it is, we will either
a. add an option to -cut()- that, if not specified,
maintains the pre-08may2002 behavior, or
b. in the next release of Stata, change -cut()-'s behavior,
but under version control, so that if you set the old
version, you get the old behavior.
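
As a rough picture of what "under version control" means in alternative (b),
here is a minimal Python sketch. It is a model only, not Stata code. The
release number 8.0, the names select_behavior, old_cut, and new_cut, and the
placeholder return values are all hypothetical. No revised coding has been
decided, so the sketch does not attempt to show one.

```python
# Hypothetical release in which the behavior would change.
CHANGED_IN = 8.0

def select_behavior(caller_version, old, new, changed_in=CHANGED_IN):
    """Return `old` if the caller declared a version before the change,
    otherwise `new`."""
    return old if caller_version < changed_in else new

def old_cut(x):
    # Placeholder for the pre-08may2002 coding.
    return "old behavior"

def new_cut(x):
    # Placeholder: the revised coding is undecided.
    return "new behavior"

# A program that sets the old version keeps getting the old behavior.
legacy = select_behavior(7.0, old_cut, new_cut)
# A program written for the newer release gets the new behavior.
current = select_behavior(8.0, old_cut, new_cut)

print(legacy(35), "/", current(35))
```

The point of the design is that existing programs do not have to be edited:
the version they declare selects the behavior they were written against.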
I want to emphasize: it was not our intention to change functionality.
-- Bill
[email protected]