Re: st: How to (almost) randomly reduce the number of observations?
From: Maarten buis <[email protected]>
To: [email protected]
Subject: Re: st: How to (almost) randomly reduce the number of observations?
Date: Tue, 20 Apr 2010 08:13:44 +0000 (GMT)
--- On Mon, 19/4/10, Dimitrije Tišma wrote:
> > I would like to ask how to reduce the number of observations
> > randomly BUT in a way that keeps all observations that
> > are related to a person who is still in the dataset.
--- On Tue, 20/4/10, Maarten buis wrote:
> *---------- begin example -------------
> // create some example data
> clear
> set obs 100
> gen id = _n
> expand 10
> bys id : gen t = _n
> sort id t
> list in 1/22, sepby(id)
>
> // randomly drop about 50% of the ids (each id is kept with probability .5)
> bys id: gen u = runiform() if _n == 1
> bys id: egen uu = total(u)
> keep if uu < .5
> drop u uu
> *----------- end example ----------------
An alternative approach that will sample _with_ replacement:
*---------- begin example -------------
// create some example data
clear
set obs 100
gen id = _n
expand 10
bys id : gen t = _n
sort id t
list in 1/22, sepby(id)
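// draw a sample of 50% of the ids, with replacement
bys id: gen byte mark = _n==1   // flag one record per id
count if mark                   // r(N) = number of distinct ids
local n = round(r(N)/2)         // half the ids, rounded
bsample `n', cluster(id)        // keeps all records of each drawn id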
*----------- end example ----------------
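One thing to watch when sampling with replacement: a person drawn more than once appears several times under the same id, so a later -bys id- treats the copies as one person. A minimal sketch, assuming each draw should count as its own unit, uses the idcluster() option of -bsample-. The variable name newid, the seed, and the fixed count of 50 are arbitrary choices for illustration, not part of the example above:

```stata
*---------- begin example -------------
// create some example data
clear
set seed 12345
set obs 100
gen id = _n
expand 10
bys id : gen t = _n

// draw 50 persons with replacement;
// newid identifies each draw, even when the same id is drawn twice
bsample 50, cluster(id) idcluster(newid)

// every draw still carries all 10 records of that person
bys newid : assert _N == 10
*----------- end example ----------------
```

After this, newid rather than id is the variable to use in any by-group command that should treat repeated draws as separate persons.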
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
Hope this helps,
Maarten
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://www.maartenbuis.nl
--------------------------