
Re: st: k-fold cross validation

From	Richard Goldstein <[email protected]>
To	[email protected]
Subject	Re: st: k-fold cross validation
Date	Fri, 15 Feb 2008 11:32:15 -0500

1. See the jackknife command for the extreme version of this (leaving out one observation at a time).

2. You may prefer to use bootstrap; see that command.

Rich

Nalin Payakachat wrote:

Hi,

I would like to perform k-fold cross validation using Stata. Here is an
explanation of k-fold cross validation (http://www.cs.cmu.edu/~schneide/tut5/node42.html):

K-fold cross validation is one way to improve over the holdout method. The data
set is divided into k subsets, and the holdout method is repeated k times. Each
time, one of the k subsets is used as the test set and the other k-1 subsets are
put together to form a training set. Then the average error across all k trials
is computed. The advantage of this method is that it matters less how the data
gets divided. Every data point gets to be in a test set exactly once, and gets
to be in a training set k-1 times. The variance of the resulting estimate is
reduced as k is increased. The disadvantage of this method is that the training
algorithm has to be rerun from scratch k times, which means it takes k times as
much computation to make an evaluation. A variant of this method is to randomly
divide the data into a test and training set k different times. The advantage of
doing this is that you can independently choose how large each test set is and
how many trials you average over.

If anybody could help, I would deeply appreciate it.
Thank you so much.

Nalin
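
For readers who want to see the procedure quoted above spelled out, here is a minimal sketch. It is written in Python rather than Stata, purely as a language-neutral illustration, and it is not taken from the thread. The function names (k_fold_indices, fit_line, k_fold_mse), the simple one-predictor least-squares model, and the made-up data are all assumptions chosen for the example.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Taking every k-th index gives folds whose sizes differ by at most one.
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (illustrative model only)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def k_fold_mse(xs, ys, k, seed=0):
    """Average the held-out mean squared error over the k folds."""
    folds = k_fold_indices(len(xs), k, seed)
    fold_errors = []
    for test in folds:
        held_out = set(test)
        # Train on the other k-1 folds, test on the fold that was held out.
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        mse = sum((ys[i] - (a + b * xs[i])) ** 2 for i in test) / len(test)
        fold_errors.append(mse)
    return sum(fold_errors) / k


# Made-up data: a noisy straight line.
rng = random.Random(42)
xs = [float(i) for i in range(30)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

print("5-fold CV MSE:", k_fold_mse(xs, ys, k=5))
# Setting k to the number of observations leaves out one point at a time.
print("leave-one-out CV MSE:", k_fold_mse(xs, ys, k=len(xs)))
```

Every observation lands in exactly one test fold and in k-1 training sets, as the description says. The same loop structure should carry over to other models by swapping out fit_line and the error measure.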


Follow-Ups:
- RE: st: k-fold cross validation
  - From: "Lachenbruch, Peter" <[email protected]>

References:
- st: k-fold cross validation
  - From: Nalin Payakachat <[email protected]>
