Thanks to Kit Baum, a new version of the -keyby- package, and a new
package -addinby-, are now available for download from SSC. In Stata,
use the -ssc- command to do this.
The -keyby- and -addinby- packages are described below, as on my website.
The -keyby- package is a "cleaner" version of -sort-, which ensures that
the observations of the memory dataset are uniquely identified, as well
as being sorted, by the sort key variables. And the -addinby- package is
a "cleaner" version of -merge-, which ensures that the observations of
the memory dataset remain identified and/or sorted as before, even after
new data have been added in from a -using- dataset on disk. So,
together, -keyby- and -addinby- can be used to enforce the relational
database model, in which a dataset can be viewed as a mathematical
function, whose domain is the set of available combinations of values of
the primary key variables, and whose range is the set of all possible
combinations of values of the non-key variables.
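
To make the "dataset as a function" view concrete, here is a minimal sketch in Python (not Stata, and not code from either package). The function name as_function, the variable names and the example values are all hypothetical. It assumes a dataset is held as a list of dictionaries, one per observation.

```python
def as_function(rows, key_vars):
    """Return a dict mapping each primary-key tuple to its non-key values.

    Hypothetical helper for illustration only: it fails if two
    observations share the same combination of key values, because
    the dataset would then not define a function on its key.
    """
    mapping = {}
    for row in rows:
        key = tuple(row[v] for v in key_vars)
        if key in mapping:
            raise ValueError(f"key {key} does not identify a unique observation")
        # The "value" of the function is the set of non-key variables.
        mapping[key] = {v: x for v, x in row.items() if v not in key_vars}
    return mapping


# Made-up example data: one observation per patient per visit.
visits = [
    {"patient": 1, "visit": 1, "fev1": 2.9},
    {"patient": 1, "visit": 2, "fev1": 3.1},
    {"patient": 2, "visit": 1, "fev1": 2.4},
]
f = as_function(visits, ["patient", "visit"])
print(f[(1, 2)])
```

Looking up a key combination then returns exactly one set of non-key values, which is the property that a primary key is meant to guarantee.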
The -keyby- package has now been updated to Stata Version 10, with some
optimization of internal code. However, users of Stata Version 9 can
still download the old Stata 9 version of -keyby- by typing
net from "http://www.imperial.ac.uk/nhli/r.newson/stata9/"
and selecting the -keyby- package.
Best wishes
Roger
Roger B Newson
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
Opinions expressed are those of the author, not of the institution.
----------------------------------------------------------------------------------
package keyby from http://www.imperial.ac.uk/nhli/r.newson/stata10
----------------------------------------------------------------------------------
TITLE
keyby: Key the dataset by a variable list
DESCRIPTION/AUTHOR(S)
keyby sorts the dataset currently in memory by the variables in a
varlist, checking that the variables in the varlist uniquely
identify the observations. This makes the variables in the varlist
a primary key for the dataset in memory. If the user does not
specify otherwise, then keyby also reorders the variables in the
varlist to the start of the variable order in the dataset, and
checks that all values of these variables are nonmissing. keyby
can be useful if the user combines multiple datasets using merge,
which may cause a dataset in memory to become unsorted.
Author: Roger Newson
Distribution-Date: 13april2008
Stata-Version: 10
INSTALLATION FILES
keyby.ado
keyby.sthlp
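
The checks listed in the keyby description can be sketched as follows. This is a Python illustration of the described behaviour under stated assumptions, not the contents of keyby.ado. The name keyby_sketch and its two parameters are hypothetical, a dataset is assumed to be a list of dictionaries, and None stands in for a Stata missing value.

```python
def keyby_sketch(rows, key_vars, allow_missing=False, reorder=True):
    """Sort rows by key_vars after checking that key_vars form a primary key.

    Hypothetical illustration only. By default it also rejects missing
    key values and moves the key variables to the start of each row,
    mirroring the defaults described for keyby.
    """
    if not allow_missing:
        for row in rows:
            if any(row[v] is None for v in key_vars):
                raise ValueError("missing value in a key variable")

    keys = [tuple(row[v] for v in key_vars) for row in rows]
    if len(set(keys)) != len(keys):
        raise ValueError("key variables do not uniquely identify the observations")

    # Sort by the key; missing values (None) are placed last in this sketch.
    ordered = sorted(
        rows, key=lambda row: tuple((row[v] is None, row[v]) for v in key_vars)
    )

    if reorder:
        # Rebuild each row with the key variables first, others in their old order.
        ordered = [
            {**{v: row[v] for v in key_vars},
             **{v: x for v, x in row.items() if v not in key_vars}}
            for row in ordered
        ]
    return ordered


# Made-up example data, unsorted and with the key variables not first.
unsorted_rows = [
    {"fev1": 2.4, "patient": 2, "visit": 1},
    {"fev1": 3.1, "patient": 1, "visit": 2},
    {"fev1": 2.9, "patient": 1, "visit": 1},
]
keyed = keyby_sketch(unsorted_rows, ["patient", "visit"])
print(keyed[0])
```

The point of doing the checks and the sort in one step is that a dataset which passes is known to be both sorted and uniquely identified by its key, rather than merely sorted.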
----------------------------------------------------------------------------------
package addinby from http://www.imperial.ac.uk/nhli/r.newson/stata10
----------------------------------------------------------------------------------
TITLE
addinby: Add in data from a disk dataset using a foreign key
DESCRIPTION/AUTHOR(S)
addinby is a "cleaner" alternative version of merge, designed to
reduce the lines of code in Stata do-files. It adds variables and/or
values to existing observations in the dataset currently in memory
(the master dataset) from a Stata-format dataset stored in the file
filename (the using dataset), using a foreign key of variables
specified by the keyvarlist to identify observations in the using
dataset. The using dataset must be sorted by the variables in the
keyvarlist, and these variables must identify observations in the
using dataset uniquely. Unlike merge, addinby always preserves the
master dataset in its original sorting order, and does not add any
merge-status variables or additional observations. However, addinby
checks that the foreign key uniquely identifies observations in the
using dataset, and may optionally check that there are no unmatched
observations in the master dataset, and/or check that there are no
missing values in the foreign key variables in the master dataset.
Author: Roger Newson
Distribution-Date: 14april2008
Stata-Version: 10
INSTALLATION FILES
addinby.ado
addinby.sthlp
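
The behaviour in the addinby description can be sketched in the same spirit. Again this is Python rather than Stata, and it is an illustration under assumptions, not the contents of addinby.ado. The name addinby_sketch and the parameters require_match and allow_missing are hypothetical, datasets are lists of dictionaries, None stands in for a missing value, and master values are assumed to take precedence if a variable name occurs in both datasets. The requirement that the using dataset be sorted is not modelled.

```python
def addinby_sketch(master, using, key_vars, require_match=False, allow_missing=True):
    """Add the non-key variables of `using` to each row of `master`.

    Hypothetical illustration only. The master rows keep their original
    order, no rows are added, and no merge-status variable is created.
    """
    # The foreign key must identify the using observations uniquely.
    lookup = {}
    for row in using:
        key = tuple(row[v] for v in key_vars)
        if key in lookup:
            raise ValueError("foreign key does not uniquely identify using observations")
        lookup[key] = row

    # Variables that the using dataset contributes (taken from its first row).
    new_vars = [v for v in using[0] if v not in key_vars] if using else []

    result = []
    for row in master:
        key = tuple(row[v] for v in key_vars)
        if not allow_missing and any(x is None for x in key):
            raise ValueError("missing foreign key value in the master dataset")
        match = lookup.get(key)
        if match is None and require_match:
            raise ValueError(f"unmatched master observation with key {key}")
        merged = dict(row)
        for v in new_vars:
            # Unmatched observations get a missing value for each new variable.
            merged.setdefault(v, None if match is None else match.get(v))
        result.append(merged)
    return result


# Made-up example data: visits in memory, patient-level data "on disk".
master_rows = [
    {"patient": 2, "visit": 1},
    {"patient": 1, "visit": 1},
    {"patient": 3, "visit": 1},
]
using_rows = [
    {"patient": 1, "sex": "F"},
    {"patient": 2, "sex": "M"},
]
combined = addinby_sketch(master_rows, using_rows, ["patient"])
print(combined)
```

In this sketch the output has the same number of rows as the master, in the same order, with patient 3 left unmatched; setting require_match=True would turn that unmatched observation into an error instead.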