Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: weird behavior of append
From: Joerg Luedicke <[email protected]>
To: [email protected]
Subject: Re: st: weird behavior of append
Date: Wed, 12 Sep 2012 10:20:52 -0500
I cannot spot a problem here. You have 11165 observations in one file
and 259 observations in the other, so 11165 + 259 = 11424
observations is exactly what you should end up with after appending.
Joerg
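
The arithmetic above can also be checked by Stata itself. This is only a minimal sketch: it assumes the files temp1 and temp2 from the quoted log are in the working directory, and the variable name "source" is made up for illustration.

```stata
* Sketch only: temp1 and temp2 are the files from the quoted log.
use temp1, clear
count
local n1 = r(N)

use temp2, clear
count
local n2 = r(N)

* generate() adds a marker variable (hypothetical name "source"):
* 0 = observations already in memory (temp2), 1 = appended file (temp1).
append using temp1, generate(source)
count

* Stops with an error if the appended total is not the sum of the two files.
assert r(N) == `n1' + `n2'
tabulate source
```

If the assert passes silently, no observations were lost, and the tabulate output shows how many rows came from each file.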
On Wed, Sep 12, 2012 at 9:50 AM, Feiveson, Alan H. (JSC-SK311)
<[email protected]> wrote:
> Hello - In Stata 12 IC, I am trying to append a file of 259 observations to one of 11165 observations. Both files contain only one variable named "id" (see below). After appending, rather than having 259 new observations, it appears that 259 observations have been lost, yet if I reduce the size of the first file to 10000, the append seems to work. Also if the variables have different names, I get even more weird results (see below). Anyone have an explanation?
>
> Thanks,
>
> Al Feiveson
>
> ========================================================================
> . use temp1,clear
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |     11165    5205.994     98.91063      5000       5389
>
> . use temp2,clear
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |       259    5206.846     101.6719      5000       5388
>
> . append using temp1
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |     11424    5206.013      98.9696      5000       5389
>
>
> ========================================================================
> Now cut out some observations
>
> . use temp1,clear
> . keep in 1/10000
> (1165 observations deleted)
>
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |     10000    5189.761     91.46947      5000       5331
>
> . append using temp2
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |     10259    5190.192     91.77468      5000       5388
>
> This appears to be correct.
>
> ========================================================================
> Now rename the variable in one of the files
> . use temp2,clear
> . des
>
> Contains data from temp2.dta
>   obs:           259
>  vars:             1                          12 Sep 2012 09:32
>  size:           518
> ----------------------------------------------------------------------------------------------
>               storage  display     value
> variable name   type   format      label      variable label
> ----------------------------------------------------------------------------------------------
> id              int    %10.0g                 ID
> ----------------------------------------------------------------------------------------------
> Sorted by:  id
>
> . rename id id2
> . append using temp1
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>          id2 |       259    5206.846     101.6719      5000       5388
>           id |     11165    5205.994     98.91063      5000       5389
>
> . count if id==. & id2<.
> 259
>
> . count if id2==. & id<.
> 11165
>
> So it appears that there should be 259 + 11165 observations, since both conditions are exclusive. Yet
>
>
> . des
>
> Contains data from temp2.dta
>   obs:        11,424
>  vars:             2                          12 Sep 2012 09:32
>  size:        45,696
> ----------------------------------------------------------------------------------------------
>               storage  display     value
> variable name   type   format      label      variable label
> ----------------------------------------------------------------------------------------------
> id2             int    %10.0g                 ID
> id              int    %10.0g                 ID
> ----------------------------------------------------------------------------------------------
> Sorted by:
>      Note:  dataset has changed since last saved
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/