Re: st: imputing dates into a string date
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: imputing dates into a string date
Date: Fri, 7 Jun 2013 13:11:42 +0100
Just to spell out what will be obvious to Joseph and Tim: sometimes
other dates in the data constrain what the incomplete dates could be.
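
A minimal sketch of that idea, assuming two bounding dates that are not
in Tim's data: lower_s and upper_s are hypothetical complete dates (say,
a referral date and a date of death). The day is imputed as mid-month,
as in Joseph's example, and the result is then pulled back inside the
interval that the other dates allow.

    * Sketch only: lower_s and upper_s are hypothetical bounding dates,
    * not variables from the original thread.
    clear
    input str10 dx str10 lower_s str10 upper_s
    "/01/2012" "20/01/2012" "28/01/2012"
    "/03/2012" "01/02/2012" "10/03/2012"
    end

    generate int lower_dt = date(lower_s, "DMY")
    generate int upper_dt = date(upper_s, "DMY")

    * Impute mid-month where the day is missing
    generate int imputed_dt = date("15" + dx, "DMY") if substr(dx, 1, 1) == "/"

    * Move the imputed date inside the bounds; the extra conditions stop
    * Stata's "missing is larger than any number" rule from misfiring
    replace imputed_dt = lower_dt if imputed_dt < lower_dt & !missing(lower_dt)
    replace imputed_dt = upper_dt if imputed_dt > upper_dt & !missing(imputed_dt)

    format lower_dt upper_dt imputed_dt %tdDD/NN/CCYY
    list dx lower_dt upper_dt imputed_dt, noobs

In this toy data the first date should move from 15 January up to the
lower bound of 20 January, and the second from 15 March down to the
upper bound of 10 March. Which bound to trust when the two conflict is a
substantive decision, not a coding one.
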
Nick
[email protected]
On 7 June 2013 12:51, Joseph Coveney <[email protected]> wrote:
> Tim Evans wrote:
>
> Some time ago I had a problem with imputing dates into a string variable where
> the date took the form:
>
> XX/01/2012
>
> In the thread below a solution was provided which worked well; however, I now
> have data that take the form:
>
> /01/2012
>
> To these, I would like to impute a day of "01". I tried amending the
> original code below
>
> g dx_clean = subinstr(dx, "XX", "01", 1)
>
> to
>
> g dx_clean = subinstr(dx, "", "01", 1)
>
> The result is that the same value is returned, i.e.
> XX/01/2012
>
> Does anyone have a suggestion of how I can handle this please?
>
> --------------------------------------------------------------------------------
>
> If you've got missing elements other than just the day, it might be better to
> use -split-, and impute the days, months and years separately with their
> different defaults. You can then re-assemble the elements with simple string
> concatenation (or convert the imputed dates to a Stata date).
>
> Joseph Coveney
>
> . version 12.1
>
> .
> . clear *
>
> . set more off
>
> .
> . input str10 dx
>
> dx
> 1. "01//2001"
> 2. "/01/2001"
> 3. "01/01/"
> 4. end
>
> .
> . split dx, generate(d_) parse(/)
> variables created as string:
> d_1 d_2 d_3
>
> . replace d_1 = "15" if missing(d_1) // Missing days as approx. midmonth
> (1 real change made)
>
> . replace d_2 = "06" if missing(d_2) // Missing months as approx. midyear
> (1 real change made)
>
> . replace d_3 = "2012" if missing(d_3) // Missing year as most recent full year
> (1 real change made)
>
> .
> . generate int imputed_dt = date(d_3 + d_2 + d_1, "YMD")
>
> . format imputed_dt %tdCCYY-NN-DD
>
> .
> . generate str10 clean_dx = d_1 + "/" + d_2 + "/" + d_3
>
> . list dx clean_dx imputed_dt, noobs abbreviate(20)
>
>   +------------------------------------+
>   |       dx     clean_dx   imputed_dt |
>   |------------------------------------|
>   | 01//2001   01/06/2001   2001-06-01 |
>   | /01/2001   15/01/2001   2001-01-15 |
>   |   01/01/   01/01/2012   2012-01-01 |
>   +------------------------------------+
>
> .
> . exit
>
> end of do-file
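>
> For the single case Tim describes, where only the day is missing, a shorter
> route may be enough. The sketch below assumes, as Tim's result suggests, that
> -subinstr()- leaves the string untouched when asked to find an empty string,
> so it tests for a leading "/" and prepends the day instead. The names
> dx_clean and dx_dt are illustrative, and the day "01" is Tim's stated choice.
>
>     * Sketch only: handles "XX/mm/yyyy" and "/mm/yyyy"; other gaps need -split-
>     clear
>     input str10 dx
>     "XX/01/2012"
>     "/01/2012"
>     "15/01/2012"
>     end
>
>     * Original fix for the "XX" placeholder
>     generate str10 dx_clean = subinstr(dx, "XX", "01", 1)
>
>     * New case: nothing before the first slash, so prepend the day
>     replace dx_clean = "01" + dx_clean if substr(dx_clean, 1, 1) == "/"
>
>     * Convert to a Stata daily date to check that every string now parses
>     generate int dx_dt = date(dx_clean, "DMY")
>     format dx_dt %td
>     list, noobs
>
> Any observation where dx_dt is still missing has a gap somewhere other than
> the day and would need the -split- approach above.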
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/