st: Large data set and post-saving problems
From
Bilge Eris <[email protected]>
To
<[email protected]>
Subject
st: Large data set and post-saving problems
Date
Thu, 12 Apr 2012 16:18:40 +0200
Hi all,
I have a question about working with large datasets and about problems
that occur after "save ..., replace".
I am working with a data set that initially has 1.5 million rows and 20
columns. Because of its size I cannot work on my personal computer, so I
am working on my university's server.
I am creating the dummies I need for my work (around 700 of them).
They are created properly. Before saving the new file, I check some
simple statistics for the variables, and everything seems to be OK.
But after I save the file under a new name (using the replace option in
any case), clear memory, and load the dataset I just saved, some of my
variables are corrupted. I do not know what is actually happening, but
it looks as if values from different variables shift into each other.
For example, when I tabulate gender (a variable that already existed
before I created the dummies), I get the following output:
     gender |      Freq.     Percent        Cum.
------------+-----------------------------------
         -1 |         43        0.00        0.00
          w |    481,424       35.46       35.46
          m |    876,354       64.54      100.00
          4 |          2        0.00      100.00
         94 |          1        0.00      100.00
------------+-----------------------------------
      Total |  1,357,824      100.00
And the result is not always the same. When I redo everything from the
beginning, I end up with totally different variables and tabulations.
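The save-and-reload check described above can be sketched roughly as
follows. This is only a minimal sketch: the file names original_data and
new_data, the variable region, and the dummy stub d_ are hypothetical
placeholders, not the actual names in my do-file.

```stata
* Hypothetical names: original_data, new_data, region, d_
use original_data, clear

* Create one dummy per category of a (hypothetical) variable
tabulate region, generate(d_)

* Check a pre-existing variable before saving
tabulate gender, missing

* Store a checksum of the data in memory with the dataset, then save
datasignature set, reset
save new_data, replace

* Compare the data still in memory with the file just written
cf _all using new_data

* Reload the saved file and confirm that the data are unchanged
use new_data, clear
datasignature confirm
tabulate gender, missing
```

If cf or datasignature confirm reports a difference, the data were
altered somewhere between save and use rather than when the dummies
were created.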
I am using an official copy of Stata 10.1. I have tried update all,
update swap, and all the other recommendations on this topic, but it
still does not work.
Do you have any idea how I can overcome this problem?
Thank you.