Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: Undesired drop of sequence data in Stata/IC 12.1 with SQ-Ado

From	Christoph Floethmann <[email protected]>
To	"[email protected]" <[email protected]>
Subject	st: Undesired drop of sequence data in Stata/IC 12.1 with SQ-Ado
Date	Wed, 21 Aug 2013 15:54:48 +0000

Dear Statalist Users,

I have a problem with my optimal matching analysis, which uses the SQ-Ados in Stata/IC 12.1.

I have 311 observations (one sequence each), an element variable with 14 distinct values and an order variable with 29 positions (see output below).

Reshaping, encoding, running the OM with the Needleman-Wunsch algorithm and clustering the output into six clusters with Ward's method all run without errors.

But somehow I end up with only 276 observations.
35 observations, i.e. 35 sequences, are dropped from my sample and I don't know why.

All values fulfill the requirements and are coded identically.

Does anybody have an idea how I can solve this problem? It reduces my sample by more than 10%, which is not acceptable for me.

Thanks in advance for your advice and help.

Best regards,
Christoph 

------------------------------


Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      311   ->    9019
Number of variables                  30   ->       3
j variable (29 values)                    ->   order
xij variables:
                          y1 y2 ... y29   ->   y
-----------------------------------------------------------------------------

. 
. 
. 
. encode y, generate(y2)

. 
. 
. 
. sqset y2 id order, rtrim

Note: dataset has changed due to the use of option -rtrim-

       element variable:  y2, 1 to 14
       identifier variable:  id, 1 to 311
       order variable:  order, 1 to 29

. sqom, indelcost(1.5) subcost(sub) name(om1) full standard(longer)
Perform 37950 Comparisons with Needleman-Wunsch Algorithm
Distance matrix saved as SQdist

. sqclusterdat

. clustermat wardslinkage SQdist, name(wards) add

. cluster generate gten=group(6), name(wards)
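
One way to narrow this down, as a rough sketch in plain Stata: the comparison count in the log already hints at how many sequences entered the matching, and a gap check on the long data shows whether some sequences have missing elements between valid ones. The sketch assumes the long-format variables id, order and y2 from the log, and it is meant to be run after -encode- and before -sqset-. The helper names last_valid, gap, has_gap and tag_id are made up for illustration. Whether gaps are really what removes the 35 sequences is an assumption that would need to be checked against the SQ help files.

```stata
* 276 sequences give 276*275/2 = 37950 pairwise comparisons, as in the log
display comb(276, 2)
* all 311 sequences would give 311*310/2 = 48205 comparisons instead
display comb(311, 2)

* Position of the last non-missing element in each sequence
egen last_valid = max(cond(!missing(y2), order, .)), by(id)

* A gap is a missing element located before the last valid one
generate byte gap = missing(y2) & order < last_valid & !missing(last_valid)

* Flag every sequence that contains at least one gap
egen byte has_gap = max(gap), by(id)
egen byte tag_id = tag(id)

* Number of sequences with gaps, and number without any valid element
count if tag_id & has_gap
count if tag_id & missing(last_valid)
```

If the two counts add up to 35, gaps or empty sequences would be a plausible explanation for the reduced sample; if both are zero, the cause lies elsewhere.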



*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
