|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
st: Re: reshape/collapse question (alas, again)
I sent this post to the list and got a nice reply from Howie Lempel. Here is the post and the reply and the persistent problem that I do not understand:
>>> Eric Uslaner wrote:
Hi,
I have a somewhat complex merge/collapse question that I could not find the answer to in the manuals or with help.
I have two data sets. One is the main data set with data on metropolitan areas (SMSAs) sorted by SMSA.
The second data set is religious membership and attendance by denomination (denomcode and denomname) and by a coding I have used for the orientation of each denomination. This latter data set contains:
Contains data from rcmsmetroupdate.dta
obs: 15,132
vars: 20 13 Mar 2009 17:53
size: 1,225,692 (99.8% of memory free)
----------------------------------------------------------------------------------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
----------------------------------------------------------------------------------------------------------------------------------------------------------------
metropop2000 long %12.0g
denomabbrev str15 %15s
congregations int %8.0g
members long %12.0g
adherents long %12.0g
attendees long %12.0g
adherentpct float %9.0g
attendeepct float %9.0g
metrocode int %10.0g
metroshortname str17 %17s
denomcode int %10.0g
orientation str4 %9s
catholic byte %8.0g
eastern byte %8.0g
cnt byte %8.0g
evangelical byte %8.0g
jewish byte %8.0g
mainline byte %8.0g
orthodox byte %8.0g
orientation1 long %8.0g orientation1
There are several hundred denominations but I don't care about the denominations per se, just the orientations (Catholic, Eastern, CNT--conservative non-traditional, evangelical, Jewish, Mainline, Orthodox) which are in the string variable orientation and the decoded numericl variable orientation1. The data are arrayed with one observation for each denomination for each metropolitan area (e.g., Catholics in Amarillo would be one observation, LDS in Chicago another, etc.). What I want to do is to sum congregations through attendeepct for each orientation by metropolitan area, so that one record would be the number of congregations, members, adherents, etc. for Catholics in Chicago (e.g.), the number of congregations etc. for followers of Eastern religions, ..., the number of congregations... for Orthodox followers in Chicago). There should be one observation per city (here metrocode--which I will rename smsa before merging with the master data set).
First I tried reshape:
. reshape wide metropop metroshortname denomabbrev denomcode congregations members adherents attendees adherentpct attendeepct eastern-orthodox orientation ,i(metrocode) j(orientation1)
and got:
(note: j = 1 2 3 4 5 6 7 8)
orientation1 not unique within metrocode;
there are multiple observations at the same orientation1 within metrocode.
Type "reshape error" for a listing of the problem observations.
Then I tried:
reshape wide metropop denomcode congregations members adherents attendees adherentpct attendeepct eastern-orthodox metrocode,i(orientation) j(metroshortname)
> string
and got:
(note: j = Abilene Akron Albany GA Albany NY Albuquerque Alexandria Allentown Altoona Amarillo Anchorage Ann Arbor Anniston Appleton Asheville Athens Atlanta At
> lantic Auburn AL Augusta Austin Bakersfield Baltimore Bangor Barnstable Baton Rouge Beaumont Bellingham Benton Harbor Bergen Billings Biloxi Binghamton Birmin
> gham Bismarck Bloomington IL Bloomington IN Boise City Boston Boulder Brazoria Bremerton Brownsville Bryan Buffalo Burlington VT Canton Casper Cedar Rapids Ch
> ampaign Charleston SC Charleston WV Charlotte Charlottesville Chattanooga Cheyenne Chicago Chico Cincinnati Clarksville Cleveland Colorado Springs Columbia MO
> Columbia SC Columbus GA Columbus OH Corpus Christi Corvallis Cumberland Dallas Danville Davenport Dayton Daytona Beach Decatur AL Decatur IL Denver Des Moine
> s Detroit Dothan Dover DE Dubuque Duluth Eau Claire El Paso Elkhart Elmira Enid Erie Eugene Evansville Fargo Fayetteville AR Fayetteville NC Flagstaff Flint F
> lorence AL Florence SC Fort Collins Fort Lauderdale Fort Myers Fort Pierce Fort Smith Fort Walton Beach Fort Wayne Fort Worth Fresno Gadsden Gainesville Galve
> ston Gary Glens Falls Goldsboro Grand Forks Grand Junction Grand Rapids Great Falls Greeley Green Bay Greensboro Greenville NC Greenville SC Hagerstown Hamilt
> on Harrisburg Hartford Hattiesburg Hickory Honolulu Houma Houston Huntington Huntsville Indianapolis Iowa City Jackson MI Jackson MS Jackson TN Jacksonville F
> L Jacksonville NC Jamestown Janesville Jersey City Johnson City Johnstown Jonesboro Joplin Kalamazoo Kankakee Kansas City Kenosha Killeen Knoxville Kokomo La
> Crosse Lafayette IN Lafayette LA Lake Charles Lakeland Lancaster Lansing Laredo Las Cruces Las Vegas Lawrence KS Lawton Lewiston Lexington Lima Lincoln Little
> Rock Longview Los Angeles Louisville Lubbock Lynchburg Macon Madison Mansfield McAllen Medford Melbourne Memphis Merced Miami Middlesex Milwaukee Minneapolis
> Missoula Mobile Modesto Monmouth Monroe Montgomery Muncie Myrtle Beach Naples Nashville Nassau New Haven New London New Orleans New York Newark Newburgh Norf
> olk Oakland Ocala Odessa Oklahoma City Olympia Omaha Orange County Orlando Owensboro Panama City Parkersburg Pensacola Peoria Philadelphia Phoenix Pine Bluff
> Pittsburgh Pittsfield Pocatello Portland ME Portland OR Providence Provo Pueblo Punta Gorda Racine Raleigh Rapid City Reading Redding Reno Richland Richmond R
> iverside Roanoke Rochester MN Rochester NY Rockford Rocky Mount Sacramento Saginaw Salem Salinas Salt Lake City San Angelo San Antonio San Diego San Francisco
> San Jose San Luis Obispo Santa Barbara Santa Cruz Santa Fe Santa Rosa Sarasota Savannah Scranton Seattle Sharon Sheboygan Sherman Shreveport Sioux City Sioux
> Falls South Bend Spokane Springfield IL Springfield MA Springfield MO St. Cloud St. Joseph St. Louis State College Steubenville Stockton Sumter Syracuse Taco
> ma Tallahassee Tampa Terre Haute Texarkana Toledo Topeka Trenton Tucson Tulsa Tuscaloosa Tyler Utica Vallejo Ventura Victoria Vineland Visalia Waco Washington
> Waterloo Wausau West Palm Beach Wheeling Wichita Wichita Falls Williamsport Wilmington DE Wilmington NC Yakima Yolo York Youngstown Yuba City Yuma)
metroshortname not unique within orientation;
there are multiple observations at the same metroshortname within orientation.
Type "reshape error" for a listing of the problem observations.
I know that I need to reshape the data to collapse (or sum?) the data by orientation. Would egen rowsum by (orientation1 metrocode) be the way to go? I need to preserve the metrocode otherwise I can't merge these data.
Here is Howie's response:
Try something like this:
collapse (sum) congregations-attendeepct, by(orientation orientation1 metrocode metroshortname metropop2000)
reshape wide congregations members adherents attendees adherentpct attendeepct, i(metrocode metroshortname metropop2000) j(orientation1)
The collapse worked fine, but the reshape gave me once more:
(note: j = . CATH CNT EAST EV JEW ML OR)
orientation1 not constant within metrocode metroshortname metropop2000
Is there a solution to this problem? Thanks much.
Ric
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/