Thanks to Kit Baum, a new version of the SQ-Ados is available on SSC.
The programs can be installed by
. ssc install sq
Stata users who have already installed a previous version of the
SQ-Ados are asked to use
. adoupdate sq, update
The SQ-Ados are a collection of programs for analysing sequence data. The
techniques and programs are described in some detail in
Brzinsky-Fay/Kohler/Luniak "Sequence analysis with Stata". The new
release includes several bug fixes as well as new features. The new
features since our last update are:
(1) sqom: The program for the Needleman-Wunsch algorithm has three new
options -meanprobdistance-, -minprobdistance-, and -maxprobdistance- that
derive the substitution cost matrix from the dataset.
Substitution costs are calculated from the transition probabilities (p)
between every two neighboring elements of the sequences. Check out -help
sqom- for details.
(2) sqom: -sqom- now checks that a user-supplied substitution cost
matrix is symmetric and issues an error message if it is not.
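To make items (1) and (2) more concrete, here is a rough sketch in Python
(not Stata, and not the code used by -sqom-). It estimates transition
probabilities from neighboring elements, turns them into substitution
costs, and checks the resulting matrix for symmetry. The rule
cost = 1 - combine(p(a->b), p(b->a)) is only an assumption made for this
illustration, and all function names are hypothetical; the formulas
actually used by the three options are documented in -help sqom-.

```python
from collections import Counter


def transition_probs(sequences):
    """Estimate p(a -> b) from every pair of neighboring elements."""
    pair_counts = Counter()
    origin_counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            origin_counts[a] += 1
    return {(a, b): n / origin_counts[a] for (a, b), n in pair_counts.items()}


def substitution_costs(sequences, combine):
    """Return (states, matrix); `combine` merges p(a->b) and p(b->a).

    ASSUMPTION: cost = 1 - combine(p_ab, p_ba). This is an illustrative
    rule, not necessarily the one implemented in -sqom-.
    """
    p = transition_probs(sequences)
    states = sorted({s for seq in sequences for s in seq})
    matrix = []
    for a in states:
        row = []
        for b in states:
            if a == b:
                row.append(0.0)  # substituting an element by itself is free
            else:
                row.append(1.0 - combine(p.get((a, b), 0.0), p.get((b, a), 0.0)))
        matrix.append(row)
    return states, matrix


def check_symmetric(matrix, tol=1e-12):
    """Raise ValueError unless the matrix is square and symmetric."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("substitution cost matrix must be square")
    for i in range(n):
        for j in range(i + 1, n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                raise ValueError(
                    f"substitution cost matrix is not symmetric at ({i + 1}, {j + 1})"
                )


def mean_prob(p_ab, p_ba):
    """Mean of the two transition probabilities."""
    return (p_ab + p_ba) / 2


# Toy data: three sequences over the elements 1, 2, 3 (made up for illustration).
sequences = [[1, 1, 2, 2], [1, 2, 2, 3], [1, 1, 1, 2]]
states, costs = substitution_costs(sequences, mean_prob)
check_symmetric(costs)  # no error: mean, min and max treat both directions alike
```

Passing Python's built-in min or max instead of mean_prob gives the other
two variants of the assumed rule. A matrix typed in by hand, such as
[[0, 1], [2, 0]], would make check_symmetric() raise an error.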
(3) egen-sqfirstpos(): The new egen function -sqfirstpos()- returns the
position at which a specific pattern within a sequence was first found.
Check out -help sqegen- for details.
(4) egen-sqallpos(): The new egen function -sqallpos()- returns the
number of occurrences of a specific pattern within a sequence.
Check out -help sqegen- for details.
Aside: -sqfirstpos()- and -sqallpos()- use a Mata implementation of the
Boyer-Moore algorithm. The source code of our implementation can be
found in lsqbm.mata.
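For readers unfamiliar with this family of search algorithms, the
following Python sketch shows the idea. Note that it uses the
Boyer-Moore-Horspool simplification (bad-character shifts only), not full
Boyer-Moore, and it is not a translation of lsqbm.mata. The function names
are hypothetical, positions are 1-based, and counting overlapping matches
is an assumption of this sketch.

```python
def pattern_positions(sequence, pattern):
    """Return the 1-based start positions of all occurrences of `pattern`.

    Boyer-Moore-Horspool search; overlapping occurrences are included.
    """
    sequence, pattern = list(sequence), list(pattern)
    n, m = len(sequence), len(pattern)
    if m == 0 or m > n:
        return []
    # Shift table: for each element of the pattern except the last one,
    # the distance from its last occurrence to the end of the pattern.
    shift = {element: m - 1 - i for i, element in enumerate(pattern[:-1])}
    positions = []
    start = 0
    while start <= n - m:
        if sequence[start:start + m] == pattern:
            positions.append(start + 1)
        # Slide the window according to the element aligned with the
        # last pattern position; unknown elements allow a full jump of m.
        start += shift.get(sequence[start + m - 1], m)
    return positions


def first_position(sequence, pattern):
    """Position of the first occurrence, or None if the pattern is absent."""
    found = pattern_positions(sequence, pattern)
    return found[0] if found else None


def count_occurrences(sequence, pattern):
    """Number of occurrences of the pattern in the sequence."""
    return len(pattern_positions(sequence, pattern))


# Toy sequence (made up for illustration).
sequence = [1, 1, 2, 3, 1, 2, 3, 3]
first = first_position(sequence, [1, 2, 3])      # 2
total = count_occurrences(sequence, [1, 2, 3])   # 2
```

The gain over a naive scan is that the window can skip several positions
at once whenever the element under its last position rules out a match.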
(5) sqindexplot: The program for producing sequence index plots now
allows a variable list in the option -order()-, which makes fine-tuning
of the sort order of the graph much easier.
In addition, here is a list of features that were added in the
several updates between the Stata Journal article and this new release:
- All programs now have the option "subsequence(a b)". The option
restricts the analysis to the subsequence from position a to position b.
- egen-sqfreq: New egen-function to generate the frequency of the
respective sequence-type. See -help sqegen-
- egen-sqrank: New egen-function to generate a variable holding the rank
order of the frequency of the respective sequence-type
- sqclusterdat: New option "keep(varlist)" added. See -help
sqclusterdat-
- sqindexplot: Some new default settings
- sqom: New option -idealtype- allows you to specify an ideal-typical
sequence with which all sequences are compared.
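As an illustration of what comparing all sequences with an ideal-typical
sequence means, here is a small Python sketch of a Needleman-Wunsch style
distance. It is not the -sqom- implementation; the function names are
hypothetical, and the constant costs (indel 1, substitution 2) are
assumed values chosen for the example.

```python
def om_distance(seq_a, seq_b, indel=1.0, subcost=2.0):
    """Minimal alignment cost between two sequences (dynamic programming).

    ASSUMPTION: constant indel and substitution costs; a full cost
    matrix could be used in place of `subcost`.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel  # delete the first i elements of seq_a
    for j in range(1, cols):
        dist[0][j] = j * indel  # insert the first j elements of seq_b
    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            substitute = dist[i - 1][j - 1] + (0.0 if same else subcost)
            delete = dist[i - 1][j] + indel
            insert = dist[i][j - 1] + indel
            dist[i][j] = min(substitute, delete, insert)
    return dist[-1][-1]


def distances_to_ideal(sequences, ideal, **costs):
    """Distance of every sequence to one ideal-typical sequence."""
    return [om_distance(seq, ideal, **costs) for seq in sequences]


# Toy data (made up for illustration).
ideal = [1, 1, 2, 2]
sequences = [[1, 1, 2, 2], [1, 2, 2, 2], [3, 3, 3, 3]]
distances = distances_to_ideal(sequences, ideal)  # [0.0, 2.0, 8.0]
```

Comparing against a single reference yields one distance per sequence
rather than a full pairwise distance matrix.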
We would like to thank Mark Kaulisch, Irena Kogan, Chung Ip, Anna Manzoni,
Trent Spaulding and Abhirup Chakrabart for bug reports and comments.
Many regards
Ulrich Kohler
Magdalena Luniak