Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Richard Goldstein <richgold@ix.netcom.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Beginner Q: 7mil obs - how to add variables
Date: Tue, 08 Oct 2013 15:37:24 -0400

as far as I can understand you, what I sent gives you what you want
(though Scott's answer does also and compresses several steps into 1)

the display command you used just shows the result in the first
observation; did you look at the data? e.g.,
-browse tailnum distance newvar-

to go the rest of the way to your graph:

sort tailnum
gr whatever if tailnum!=tailnum[_n-1]

or any of several other ways of reducing

Rich

On 10/8/13 3:25 PM, Coleman, Greg wrote:
> Thanks.
> That just gives me this:
>
>
> . egen newvar=total(distance), by(tailnum)
>
> . display newvar
> 149899
>
> Distance flown by airplane with tail number XXXXXX
>
> I am trying to eventually graph the top N planes by distance flown.
>
> I am getting close (I think) to understanding some features, I can see this:
>
> . bysort tailnum: sum distance
>
> ----------------------------------------------------------------------
> -> tailnum = 80009E
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>     distance |      1959    452.6514    216.5128         56       1194
>
> ----------------------------------------------------------------------
> -> tailnum = 80019E
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>     distance |      1906    446.2739    222.8244         56       1149
>
>
> But need the arithmetic sum of all the distances for each tailnumber.
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Richard Goldstein
> Sent: Tuesday, October 08, 2013 3:01 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Beginner Q: 7mil obs - how to add variables
>
> not completely clear, but it looks like this will get you what you want:
>
> egen newvar=total(Value), by(String)
>
> Rich
>
> On 10/8/13 2:46 PM, Coleman, Greg wrote:
>> Hi
>> My data has 7 million observations and 29 variables. About 10 of the variables are string, and I am trying to get some patterns of the numerical values which are related to the strings. To clarify,
>>
>> String    Value
>> AB1234    25
>> CDE789    44
>> F9999     126
>> CDE789    10
>> AB1234    3
>> F9999     100
>>
>> I would like to get output that looks like the sum of values, per string:
>> AB1234    28
>> CDE789    54
>> F9999     226
>>
>> Some strings are just numbers, some are just letters, and some are a combination.
>> Can anyone assist?
>> Thanks!
>> Greg
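
The steps discussed in this thread can be put together roughly as follows. This is only a minimal sketch: it uses the six-row example from the original question in place of the 7-million-observation file, with the thread's variable names tailnum and distance standing in for String and Value. The names total_dist and first and the local macro N are made up for illustration, and the use of -graph hbar- is an assumption about the kind of graph wanted, not something stated in the thread.

```stata
* Example data from the original question (stand-in for the real file)
clear
input str6 tailnum distance
"AB1234" 25
"CDE789" 44
"F9999" 126
"CDE789" 10
"AB1234" 3
"F9999" 100
end

* Total distance per tail number; the total is repeated on every
* observation of that tail number, which is why -display- only
* showed the value from the first observation
egen total_dist = total(distance), by(tailnum)

* Reduce to one observation per tail number
* (first is a hypothetical helper variable)
egen byte first = tag(tailnum)
keep if first

* Sort by total distance, largest first, and keep the top N
* (N is an illustrative value, not from the thread)
local N 2
gsort -total_dist
keep in 1/`N'

list tailnum total_dist, noobs

* One possible graph: horizontal bars ordered by total distance
graph hbar (asis) total_dist, over(tailnum, sort(1) descending)
```

With the example data the per-string totals are AB1234 28, CDE789 54 and F9999 226, as in the original question, so with N set to 2 the listing should show F9999 and CDE789. -collapse (sum) distance, by(tailnum)- is another way of reducing to one observation per tail number, at the cost of replacing the data in memory and dropping the other variables.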