Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: weird behavior of append
From: "Feiveson, Alan H. (JSC-SK311)" <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: weird behavior of append
Date: Wed, 12 Sep 2012 09:50:43 -0500
Hello - In Stata 12 IC, I am trying to append a file of 259 observations to one of 11165 observations. Both files contain a single variable named "id" (see below). After appending, instead of gaining 259 observations, it appears that 259 observations have been lost; yet if I first reduce the first file to 10000 observations, the append seems to work. If the variables have different names, I get even stranger results (see below). Does anyone have an explanation?
Thanks,
Al Feiveson
========================================================================
. use temp1,clear
. summ
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |     11165    5205.994    98.91063       5000       5389
. use temp2,clear
. summ
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |       259    5206.846    101.6719       5000       5388
. append using temp1
. summ
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |     11424    5206.013     98.9696       5000       5389
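One way to check whether observations were actually lost is to compare _N after the append with the sum of the two observation counts. The following is only a minimal sketch using synthetic stand-ins for temp1 and temp2: the generated id values and the tempfile are assumptions for illustration, not the original data.

```stata
* Synthetic stand-in for temp1 (11165 observations; id values are made up)
clear
set obs 11165
generate int id = 5000 + mod(_n, 390)
tempfile t1
save `t1'

* Synthetic stand-in for temp2 (259 observations), held in memory as the master data
clear
set obs 259
generate int id = 5000 + mod(_n, 389)
local n_master = _N

* Append the larger file and compare the result with the expected total
append using `t1'
display "observations after append: " _N
assert _N == `n_master' + 11165
```

For the counts shown above, 259 + 11165 = 11424, which is the number of observations that summarize reports after the append.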
========================================================================
Now cut out some observations
. use temp1,clear
. keep in 1/10000
(1165 observations deleted)
. summ
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |     10000    5189.761    91.46947       5000       5331
. append using temp2
. summ
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |     10259    5190.192    91.77468       5000       5388
This appears to be correct.
========================================================================
Now rename the variable in one of the files
. use temp2,clear
. des
Contains data from temp2.dta
  obs:           259
 vars:             1                          12 Sep 2012 09:32
 size:           518
----------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
----------------------------------------------------------------------------------------------
id              int    %10.0g                 ID
----------------------------------------------------------------------------------------------
Sorted by:  id
. rename id id2
. append using temp1
. summ
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         id2 |       259    5206.846    101.6719       5000       5388
          id |     11165    5205.994    98.91063       5000       5389
. count if id==. & id2<.
259
. count if id2==. & id<.
11165
So it appears that there should be 259 + 11165 observations, since the two conditions are mutually exclusive. Yet
. des
Contains data from temp2.dta
  obs:        11,424
 vars:             2                          12 Sep 2012 09:32
 size:        45,696
----------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
----------------------------------------------------------------------------------------------
id2             int    %10.0g                 ID
id              int    %10.0g                 ID
----------------------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
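To see which file each observation came from when the variable names differ, append's generate() option can add a source marker. The sketch below again uses synthetic stand-ins for the two files; the marker name "source" and the generated values are assumptions for illustration only.

```stata
* Synthetic stand-in for temp1 (variable id; values are made up)
clear
set obs 11165
generate int id = 5000 + mod(_n, 390)
tempfile t1
save `t1'

* Synthetic stand-in for temp2 after renaming id to id2
clear
set obs 259
generate int id2 = 5000 + mod(_n, 389)

* generate() creates a marker: 0 = data already in memory, 1 = appended file
append using `t1', generate(source)
tabulate source

* Each variable is missing in the rows that came from the other file
count if missing(id) & !missing(id2)
count if missing(id2) & !missing(id)
assert _N == 259 + 11165
```

With the counts in this thread, 259 + 11165 = 11424, which matches the obs: 11,424 line in the describe output. The reported size of 45,696 bytes is also what 11,424 observations of two 2-byte int variables would occupy.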