
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: identifying highest number of consecutive variables where answer is consistent across observation

From	Nick Cox <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: identifying highest number of consecutive variables where answer is consistent across observation
Date	Thu, 20 Feb 2014 18:34:26 +0000

Joe Canner has developed a good strategy for looking at this. Here is another.

Suppose we -reshape long-, something like this (where -var- stands for
the common stub of the variable names, which is q for q1-q94):

gen id = _n
reshape long var, i(id) j(question)
tsset id question

Then we can treat each block of observations as a panel, install
-tsspell- from SSC, and apply it:

ssc inst tsspell
tsspell var

With this syntax for -tsspell-, a "spell" is automatically a sequence
of identical values. The number of spells of length 15 or more in each
original observation is then given by

egen fifteen_or_more = total((_seq >= 15) / _end), by(id)

where division by the indicator variable -_end- (1 at the end of a spell, 0
otherwise) ensures that we look only at the ends of spells: dividing by 0
yields missing, which -total()- ignores. If needed, we can then -reshape-
back.
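
The original question mentions string variables and asks for the maximum number of identical responses in a row. A minimal sketch of that variation follows, on toy data invented for illustration. It assumes that -tsspell- needs a numeric variable (time-series operators are not allowed with string variables), so the answers are first mapped to numeric codes; the names -answer- and -maxspell- are made up here.

```stata
* toy data: 2 respondents, 5 string answers each (the real data have q1-q94)
clear
input str1 q1 str1 q2 str1 q3 str1 q4 str1 q5
"a" "a" "a" "b" "b"
"a" "b" "a" "b" "a"
end

gen long id = _n
reshape long q, i(id) j(question)
tsset id question

* numeric codes for the string answers;
* -missing- keeps empty answers as a category of their own
egen answer = group(q), missing

* requires -tsspell- from SSC (ssc install tsspell)
tsspell answer

* length of the longest run of identical answers for each respondent
egen maxspell = max(_seq), by(id)
```

A condition such as maxspell >= 15 would then pick out the respondents of interest in the real data.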

On the other hand, it is quite likely that other questions of a similar
kind are more easily answered with this long data structure, so it may
be worth keeping.
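
For comparison, the same run-length idea can be sketched without -tsspell-, using only -by:- and subscripts in the long layout. This is a hand-rolled alternative, not the -tsspell- method above; the toy data, the threshold of 4 (15 in the original question) and the names -run-, -maxrun- and -longrun- are illustrative assumptions. Empty answers are treated like any other answer.

```stata
* toy data: 3 respondents, 6 string answers each (the real data have q1-q94)
clear
input str1 q1 str1 q2 str1 q3 str1 q4 str1 q5 str1 q6
"a" "a" "a" "a" "b" "c"
"a" "b" "a" "b" "a" "b"
"c" "c" "b" "b" "b" "a"
end

* threshold for a "long" run: 4 for the toy data, 15 in the original question
local threshold 4

gen long id = _n
reshape long q, i(id) j(question)

* start a new run at the first question or whenever the answer changes
bysort id (question): gen run = 1 if _n == 1 | q != q[_n-1]

* otherwise extend the current run (replace works down the observations in order)
by id: replace run = run[_n-1] + 1 if missing(run)

* longest run per respondent, and a flag for runs at or above the threshold
egen maxrun = max(run), by(id)
gen byte longrun = maxrun >= `threshold'

* back to one row per respondent; -run- varies within id, so drop it first
drop run
reshape wide q, i(id) j(question)
list id maxrun longrun
```

On the toy data the longest runs are 4, 1 and 3, so only the first respondent is flagged.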

Nick
[email protected]

On 20 February 2014 17:04, Alison El Ayadi <[email protected]> wrote:

> I am doing some data cleaning on survey data and am looking to
> identify observations where there are 15 or more of the same answers
> in a row (across the variables in current order).  All of the
> variables are string.  Does anyone have an easy automated way to do
> this?  I'm thinking that it could be done by generating a variable
> that provided the maximum number of same responses in a row, but have
> no idea how to code this.  Variables are q1 - q94, and all string.
>
> Any suggestions on efficiently writing this code would be greatly appreciated.
