Re: st: xtdescribe and panel data
From: A. Berâ <[email protected]>
To: [email protected]
Date: Thu, 8 Sep 2011 14:40:35 +0300
Dear Dr. Cox,
Thank you very much for your detailed and helpful response.
May I ask one more question, if you don't mind? Could your code be
modified to do the following?
Suppose I want to keep only runs of at least n, say five, consecutive
observations within each firm. For the four firms below, the first
should be kept in full; the second should be dropped entirely; for the
third, the first two years should be dropped and the last 8 years
kept; and for the last one, only the middle 5 observations should be
kept:
......11111111
111...........
11....11111111
11..11111.1111
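
To make the rule concrete, here is a minimal sketch of what I have in mind. It does not use the pattern variable at all, and I am not sure it is the best approach. It assumes the panel variable is called id and the time variable year (both are just illustrative names, as are run and runlen), and it builds a toy dataset from the four patterns above, taking 1996 as the first year.

```stata
* Toy data: four firms with the 14-period patterns shown above
* (id, year, run and runlen are illustrative names, not from the original code)
clear
input byte id str14 pattern
1 "......11111111"
2 "111..........."
3 "11....11111111"
4 "11..11111.1111"
end
expand 14
bysort id : gen year = 1995 + _n
keep if substr(pattern, year - 1995, 1) == "1"
drop pattern
xtset id year

* Number each run of consecutive years within a firm.
* At _n == 1, year[_n-1] is missing, so the comparison is true
* and the first run starts at 1.
bysort id (year) : gen run = sum(year != year[_n-1] + 1)

* Length of each run; keep only runs of at least n years
local n 5
bysort id run : gen runlen = _N
keep if runlen >= `n'
```

On the toy data this should leave 8 observations each for firms 1 and 3, none for firm 2, and the middle five years (2000 to 2004) for firm 4.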
Regards,
a.b.
On Tue, Sep 6, 2011 at 7:25 PM, Nick Cox <[email protected]> wrote:
>
> This is a fiddly calculation, so I packaged it in a more respectable
> program. The main algorithm is simplified a bit too. Example first,
> code later.
>
> . webuse abdata
>
> . xtset
> panel variable: id (unbalanced)
> time variable: year, 1976 to 1984
>
> . xtpatternvar , gen(pattern)
>
> . tab pattern
>
>     pattern |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>   ..1111111 |         14        1.36        1.36
>   .1111111. |        273       26.48       27.84
>   .11111111 |        152       14.74       42.58
>   1111111.. |        434       42.10       84.68
>   11111111. |         32        3.10       87.78
>   111111111 |        126       12.22      100.00
> ------------+-----------------------------------
>       Total |      1,031      100.00
>
>
> *! NJC 1.0.0 6 Sept 2011
> program xtpatternvar, sort
>     version 9.2
>     syntax [if] [in] , GENerate(name)
>
>     confirm new var `generate'
>     local g `generate'
>
>     quietly {
>         xtset
>         local t `r(timevar)'
>         local id `r(panelvar)'
>
>         marksample touse
>         count if `touse'
>         if r(N) == 0 error 2000
>
>         su `t' if `touse', meanonly
>         local max = r(max)
>         local min = r(min)
>         local range = r(max) - r(min) + 1
>
>         if `range' > 244 {
>             di as err "no go; patterns too long for str244"
>             exit 498
>         }
>
>         local miss : di _dup(`range') "."
>
>         bysort `touse' `id' (`t') : ///
>             gen `g' = substr("`miss'", 1, `t'[1]-`min') + "1" if _n == 1
>
>         by `touse' `id' : replace `g' = ///
>             substr("`miss'", 1, `t'- `t'[_n-1] - 1) + "1" if _n > 1
>
>         by `touse' `id': replace `g' = ///
>             `g' + substr("`miss'", 1, `max'-`t'[_N]) if _n == _N
>
>         by `touse' `id' : replace `g' = `g'[_n-1] + `g' if _n > 1
>
>         by `touse' `id' : replace `g' = cond(`touse', `g'[_N], "")
>
>         compress `g'
>     }
> end
>
>
>
> On Tue, Sep 6, 2011 at 10:31 AM, Nick Cox <[email protected]> wrote:
> > On Tue, Sep 6, 2011 at 9:12 AM, A. Berâ <[email protected]> wrote:
> >
> >> I have some panel data as described below. Few questions:
> >>
> >> 1. Can these data be analyzed by panel data methods? I would
> >> appreciate any suggestions about a suitable approach for these data.
> >
> > You have panel data. You let slip that the panels are firms. Do
> > something that makes economic sense.
> > That seems all that can be advised.
> >
> >> 2. How can I delete firms that have a specific pattern? For example
> >> how can I delete these type of firms: 1..........111 ?
> >
> > You can create a pattern variable like this.
> >
> > use http://www.stata-press.com/data/r10/xtdatasmpl.dta, clear
> > xtset idcode year
> > keep if idcode <= 5
> > su year, meanonly
> > local max = r(max)
> > local min = r(min)
> > local range = r(max) - r(min) + 1
> > local miss : di _dup(`range') "."
> > bysort idcode (year) : gen this = ///
> >     substr("`miss'", 1, year[1]-`min') + "1" if _n == 1
> > by idcode : replace this = ///
> >     substr("`miss'", 1, year- year[_n-1] - 1) + "1" if _n > 1
> > by idcode : replace this = ///
> >     this + substr("`miss'", 1, `max'-year[_N]) if _n == _N
> > by idcode : gen pattern = this[1]
> > by idcode : replace pattern = pattern[_n-1] + this if _n > 1
> > by idcode : replace pattern = pattern[_N]
> > tab pattern
> > xtdes
> >
> > After that you can do things conditionally on values of -pattern-.
> >
> >> 3. Is imputation appropriate if "holes" between years is more than one?
> >
> > You could interpolate. People usually don't with this kind of data.
> >
> >> Many thanks for any help.
> >> --
> >> abdullah berâ
> >>
> >>
> >> . xtdescribe, patterns(1000)
> >>
> >> id: 2, 3, ..., 37376 n = 22997
> >> date: 1996, 1997, ..., 2009 T = 14
> >> Delta(date) = 1 unit
> >> Span(date) = 14 periods
> >> (id*date uniquely identifies each observation)
> >>
> >> Distribution of T_i:   min      5%     25%     50%     75%     95%     max
> >>                          1       1       2       4       9      14      14
> >>
> >>      Freq.  Percent    Cum. |  Pattern
> >>  ---------------------------+----------------
> >>      3171     13.79   13.79 |  1.............
> >>      2447     10.64   24.43 |  11111111111111
> >>      1932      8.40   32.83 |  11............
> >>      1471      6.40   39.23 |  ...........111
> >>      1066      4.64   43.86 |  ..........1111
> >
> > <big snip>
> >
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
--
abdullah berâ