(code not tested)
I take the first year in which a firm enters the sample to be
its first year with non-missing -mvalue-. To tag that observation:
gen byte tag = missing(mvalue)
bysort company (tag year) : replace tag = _n == 1
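A quick check on the tagging may be worthwhile before going further. The sketch below assumes the data are still sorted by company, as the -bysort- line leaves them; -ntag- is a hypothetical scratch variable, not part of the suggestion above.

```stata
* Each panel should have exactly one tagged observation
by company : egen ntag = total(tag)
assert ntag == 1
drop ntag

* Tagged observations with missing mvalue flag panels that never
* have a usable entry year
count if tag & missing(mvalue)
```

If the count is not zero, those firms have no non-missing -mvalue- at all and need separate handling.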
The median value of -mvalue- across the entry years
of all the panels is then obtained by
su mvalue if tag, detail
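The assignment to high (1) or low (0) is then as follows. Stata treats
missing as larger than any number, so the -if- qualifier keeps a firm
with no non-missing -mvalue- from being classed as high:
by company : gen byte highorlow = mvalue[1] > r(p50) if !missing(mvalue[1])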
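To see the steps together, here is a sketch on made-up data. The three firms and their values are invented for illustration. Two details are my additions rather than part of the recipe above: the median is copied into a scalar (named -med- here) so that a later r-class command cannot overwrite it, and the -if- qualifier leaves -highorlow- missing for any firm that never has a non-missing -mvalue-.

```stata
* Toy data: three hypothetical firms; firm 2 has a missing mvalue
* in its first listed year, so its entry year is 1949
clear
input company year mvalue
1 1948 10
1 1949 12
2 1948 .
2 1949 30
3 1950 20
3 1951 25
end

* Tag the first year with non-missing mvalue in each panel
gen byte tag = missing(mvalue)
bysort company (tag year) : replace tag = _n == 1

* Median of mvalue over the entry years (the values 10, 30 and 20)
su mvalue if tag, detail
scalar med = r(p50)

* Classify each firm by its entry-year value relative to that median
by company : gen byte highorlow = mvalue[1] > med if !missing(mvalue[1])

* Restore the usual panel order and inspect the result
sort company year
list, sepby(company)
```

Note that the comparison is strict, so a firm whose entry-year value equals the median is classed as low; use >= if ties should go the other way.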
Alessandro Fiaschi
> I have a panel of 1000 firms and I wish to generate two subsamples of
> firms with relatively low and relatively high mvalue during our sample
> period. The sample splits should be achieved as follows: each firm
> should be assigned to a high (resp. low) category according to its
> position in the first year it enters the sample relative to the median
> across all firms in the first year they enter the sample. For example,
> firm 1 was categorised as a high mvalue firm if its mvalue in 1948,
> the first year firm XYZ entered the sample, was above the median
> dividend payout ratio across all firms in the first year they entered
> the sample. I would be very grateful if somebody could write me the
> Stata commands useful to enter this particular sample splitting.