I have a question about working with very large data sets (combined
size ~40 GB) and running analyses when only 6 GB of memory is
available.
A second complicating factor is that I need to join some of these data
sets together based on a date range or a similar join rule. In Oracle,
I could query only the columns I need and then join them to other
files using a rule, such as the dates being within "x" days of each
other. I cannot get "merge" in Stata to accept these kinds of date
ranges. Here is an example of two datasets to join:
***subdataset***
date1 var1 extravar extravar1
10/22/2008 3 44 44
02/01/2001 5 44 44
05/24/2005 9 44 44
12/12/2012 99 44 44
12/29/2012 100 44 44
***big dataset***
date1 var2 extravar extravar1
10/20/2008 500 44 44
02/07/2001 500 44 44
05/20/2005 900 44 44
12/12/2015 990 44 44
01/01/1999 1000 44 44
01/01/1970 2000 44 44
01/01/1970 2222 44 44
12/01/2012 7777 44 55
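To make the join rule concrete, here is a rough sketch (in Python rather
than Stata, purely to illustrate the logic) of the result I am after. It
assumes a window of 7 days for "x", which is a made-up value, and it
assumes that "extravar1" should come from the sub dataset. The names
MAX_DAYS, big_rows, and range_join are hypothetical. The small dataset
is held in memory while the big one is streamed one row at a time, with
only its date column read.

```python
from datetime import datetime

MAX_DAYS = 7  # hypothetical value for "x"; the real window is not fixed


def parse(mdy):
    """Parse an MM/DD/YYYY string into a date."""
    return datetime.strptime(mdy, "%m/%d/%Y").date()


# Sub dataset, small enough to hold in memory: (date1, var1, extravar1)
sub = [
    ("10/22/2008", 3, 44),
    ("02/01/2001", 5, 44),
    ("05/24/2005", 9, 44),
    ("12/12/2012", 99, 44),
    ("12/29/2012", 100, 44),
]


def big_rows():
    """Stand-in for streaming the big file row by row, reading only date1."""
    for d in ["10/20/2008", "02/07/2001", "05/20/2005", "12/12/2015",
              "01/01/1999", "01/01/1970", "01/01/1970", "12/01/2012"]:
        yield parse(d)


def range_join(sub_rows, big_dates, max_days):
    """Keep sub rows whose date1 is within max_days of a big-dataset date."""
    parsed = [(parse(d), d, v, e) for d, v, e in sub_rows]
    out = []
    for big_date in big_dates:
        for sub_date, d, v, e in parsed:
            if abs((sub_date - big_date).days) <= max_days:
                out.append((d, v, e))
    return out


result = range_join(sub, big_rows(), MAX_DAYS)
for row in result:
    print(row)
```

With the example data and a 7-day window, only the 2008, 2001, and 2005
rows of the sub dataset find a partner in the big dataset.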
I need to join by "date1" and load a data set for analysis with
ONLY "date1", "var1", and "extravar1". Thanks for helping.
DFS