Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: merging same variables within a dataset

From	Eric Booth <[email protected]>
To	"<[email protected]>" <[email protected]>
Subject	Re: st: merging same variables within a dataset
Date	Fri, 19 Aug 2011 21:01:38 +0000

<>

Sorry for the delay -- I've been away from the office.  In case you haven't solved your problem yet:

You could change the "by Name:" parts of the code in my example to "by Name Company", where appropriate, however, this isn't as straightforward after the data are reshaped.  So, I think a faster solution is to tack on the company to the name and -bysort- by that combined variable to avoid making big changes to my example:


**************!
clear
inp str12(Name Position) Salary Bonus ///
Compensation str10(Company)
John  Manager    10  10 20 a
Jack  Manager    20  20 30 a
Brian  Director   10  10 40  c
Amy  Manager    20  20 30   d
John  Director    10  10 20  a
Amy  Director    20  30 20   d
Amy  Other 	10	12	30 f
end

//added//
   replace Name = Name + " - " + Company

**Position and Companies in one String:
g i = _n
bys Name: g j = _n
qui su j
loc max `r(max)'
reshape wide Position, i(i) j(j)

foreach p in Position {
forval n = 1/`max' {
gsort Name -`p'`n'
by Name: carryforward `p'`n', replace
}
**
egen `p'_all = concat(`p'*) , punct(" ")
replace `p'_all = trim(`p'_all)
drop `p'?
**fix duplicates in _all vars:
forval n = 1/`=_N' {
	loc j = `p'_all[`n']
	loc j: list uniq local(j)
	replace `p'_all = "`j'" in `n'
	}
}


**keep totals by Name:
collapse (sum) Salary Bonus Compensation  ///
	(first) Position_all Company, by(Name)


//added//
g len = length(Name)
replace Name = substr(Name, 1, len-3)
drop len
**************!
- Eric


On Aug 15, 2011, at 9:27 AM, Dmitriy Glumov wrote:

> Eric,
> 
> Thank you for your response, it was very helpful. From your example,
> how would I separate data for two different Amys (one from company d,
> the second one from company f) without aggregating it? In other words,
> what should I change in order for Stata to add up the numbers for two
> Amys from company d but keep Amy from f separate? Once again, thank
> you for the consideration and, if anyone can help solve this problem,
> it would be very much appreciated.
> 
> Dmitri
> 
> On Fri, Aug 12, 2011 at 4:10 PM, Eric Booth <[email protected]> wrote:
>> <>
>> 
>> I'd use -reshape- + -collapse-, not -merge-.
>> You'll need to get carryforward for this example (-findit carryforward-):
>> **************!
>> clear
>> inp str12(Name Position) Salary Bonus ///
>> Compensation str10(Company)
>> John  Manager    10  10 20 a
>> Jack  Manager    20  20 30 a
>> Brian  Director   10  10 40  c
>> Amy  Manager    20  20 30   d
>> John  Director    10  10 20  a
>> Amy  Director    20  30 20   d
>> Amy  Other      10      12      30 f
>> end
>> 
>> 
>> **Position and Companies in one String:
>> g i = _n
>> bys Name: g j = _n
>> qui su j
>> loc max `r(max)'
>> reshape wide Position Company, i(i) j(j)
>> 
>> foreach p in Position Company {
>> forval n = 1/`max' {
>> gsort Name -`p'`n'
>> by Name: carryforward `p'`n', replace
>> }
>> **
>> egen `p'_all = concat(`p'*) , punct(" ")
>> replace `p'_all = trim(`p'_all)
>> drop `p'?
>> }
>> 
>> 
>> **keep totals by Name:
>> collapse (sum) Salary Bonus Compensation  ///
>>        (first) Position_all Company_all, by(Name)
>> **************!
>> - Eric
>> __
>> Eric A. Booth
>> Public Policy Research Institute
>> Texas A&M University
>> [email protected]
>> 
>> On Aug 12, 2011, at 1:33 PM, Dmitriy Glumov wrote:
>> 
>>> Hello,
>>> 
>>> I have just started using Stata and have ran into a problem. I would
>>> like Stata to identify observations within a dataset that have the
>>> same name, and add their respective values. To give an example, the
>>> sample of the dataset looks something like this:
>>> 
>>> Name Position Salary Bonus Compensation Company
>>> John  Manager    10       10         20                   a
>>> Jack  Manager    20       20         30                   a
>>> Brian  Director    10       10         40                   c
>>> Amy  Manager    20       20         30                   d
>>> John  Director     10       10         20                   a
>>> Amy  Director     20       30         20                   d
>>> 
>>> 
>>> And I want it to end up looking like this:
>>> 
>>> Name   Position              Salary Bonus Compensation Company
>>> 
>>> John  Manager/Director       20       20         40                   a
>>> Jack  Manager                   20       20         30                   a
>>> Brian  Director                   10       10         40                    c
>>> Amy  Manager/Director      40       50         50                    d
>>> 
>>> In short, I would like Stata to identify same names and then add
>>> salary, bonus, and compensation together, while also keeping the
>>> company name and position. It is best if the company displays only one
>>> label during the merge ("a" instead of "aa" or "a-a", etc) but, as far
>>> as position, either merging or appending or just displaying either
>>> name would be fine (so ManagerDirector or just one of the names would
>>> also work), I just don't want it to create an error during the
>>> transformation process.
>>> 
>>> Thank you for your consideration and any help with this would be
>>> greatly appreciated.
>>> 
>>> Dmitri
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>> 
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Follow-Ups:
- Re: st: merging same variables within a dataset
  - From: Eric Booth <[email protected]>

References:
- st: merging same variables within a dataset
  - From: Dmitriy Glumov <[email protected]>
- Re: st: merging same variables within a dataset
  - From: Eric Booth <[email protected]>
- Re: st: merging same variables within a dataset
  - From: Dmitriy Glumov <[email protected]>

Prev by Date: Re: st: Stata 12 bug report -export excel-
Next by Date: Re: st: merging same variables within a dataset
Previous by thread: Re: st: merging same variables within a dataset
Next by thread: Re: st: merging same variables within a dataset
Index(es):
- Date
- Thread