Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: merging same variables within a dataset
From
Eric Booth <[email protected]>
To
"<[email protected]>" <[email protected]>
Subject
Re: st: merging same variables within a dataset
Date
Fri, 19 Aug 2011 21:08:49 +0000
<>
Quick note:
Just to keep the code example similar to my first example, I kept the
foreach p in Position {
which before was
foreach p in Position Company {
even though we don't need the `p' loop since there is only one element (Position) to consider. This might cause confusion if someone just looks at the latest posting, but I was trying to make minimal changes to the code so that Dmitriy could track the changes he requested.
- Eric
On Aug 19, 2011, at 4:01 PM, Eric Booth wrote:
> <>
>
> Sorry for the delay -- I've been away from the office. In case you haven't solved your problem yet:
>
> You could change the "by Name:" parts of the code in my example to "by Name Company", where appropriate, however, this isn't as straightforward after the data are reshaped. So, I think a faster solution is to tack on the company to the name and -bysort- by that combined variable to avoid making big changes to my example:
>
>
> **************!
> clear
> inp str12(Name Position) Salary Bonus ///
> Compensation str10(Company)
> John Manager 10 10 20 a
> Jack Manager 20 20 30 a
> Brian Director 10 10 40 c
> Amy Manager 20 20 30 d
> John Director 10 10 20 a
> Amy Director 20 30 20 d
> Amy Other 10 12 30 f
> end
>
> //added//
> replace Name = Name + " - " + Company
>
> **Position and Companies in one String:
> g i = _n
> bys Name: g j = _n
> qui su j
> loc max `r(max)'
> reshape wide Position, i(i) j(j)
>
> foreach p in Position {
> forval n = 1/`max' {
> gsort Name -`p'`n'
> by Name: carryforward `p'`n', replace
> }
> **
> egen `p'_all = concat(`p'*) , punct(" ")
> replace `p'_all = trim(`p'_all)
> drop `p'?
> **fix duplicates in _all vars:
> forval n = 1/`=_N' {
> loc j = `p'_all[`n']
> loc j: list uniq local(j)
> replace `p'_all = "`j'" in `n'
> }
> }
>
>
> **keep totals by Name:
> collapse (sum) Salary Bonus Compensation ///
> (first) Position_all Company, by(Name)
>
>
> //added//
> g len = length(Name)
> replace Name = substr(Name, 1, len-3)
> drop len
> **************!
> - Eric
>
>
> On Aug 15, 2011, at 9:27 AM, Dmitriy Glumov wrote:
>
>> Eric,
>>
>> Thank you for your response, it was very helpful. From your example,
>> how would I separate data for two different Amys (one from company d,
>> the second one from company f) without aggregating it? In other words,
>> what should I change in order for Stata to add up the numbers for two
>> Amys from company d but keep Amy from f separate? Once again, thank
>> you for the consideration and, if anyone can help solve this problem,
>> it would be very much appreciated.
>>
>> Dmitri
>>
>> On Fri, Aug 12, 2011 at 4:10 PM, Eric Booth <[email protected]> wrote:
>>> <>
>>>
>>> I'd use -reshape- + -collapse-, not -merge-.
>>> You'll need to get carryforward for this example (-findit carryforward-):
>>> **************!
>>> clear
>>> inp str12(Name Position) Salary Bonus ///
>>> Compensation str10(Company)
>>> John Manager 10 10 20 a
>>> Jack Manager 20 20 30 a
>>> Brian Director 10 10 40 c
>>> Amy Manager 20 20 30 d
>>> John Director 10 10 20 a
>>> Amy Director 20 30 20 d
>>> Amy Other 10 12 30 f
>>> end
>>>
>>>
>>> **Position and Companies in one String:
>>> g i = _n
>>> bys Name: g j = _n
>>> qui su j
>>> loc max `r(max)'
>>> reshape wide Position Company, i(i) j(j)
>>>
>>> foreach p in Position Company {
>>> forval n = 1/`max' {
>>> gsort Name -`p'`n'
>>> by Name: carryforward `p'`n', replace
>>> }
>>> **
>>> egen `p'_all = concat(`p'*) , punct(" ")
>>> replace `p'_all = trim(`p'_all)
>>> drop `p'?
>>> }
>>>
>>>
>>> **keep totals by Name:
>>> collapse (sum) Salary Bonus Compensation ///
>>> (first) Position_all Company_all, by(Name)
>>> **************!
>>> - Eric
>>> __
>>> Eric A. Booth
>>> Public Policy Research Institute
>>> Texas A&M University
>>> [email protected]
>>>
>>> On Aug 12, 2011, at 1:33 PM, Dmitriy Glumov wrote:
>>>
>>>> Hello,
>>>>
>>>> I have just started using Stata and have ran into a problem. I would
>>>> like Stata to identify observations within a dataset that have the
>>>> same name, and add their respective values. To give an example, the
>>>> sample of the dataset looks something like this:
>>>>
>>>> Name Position Salary Bonus Compensation Company
>>>> John Manager 10 10 20 a
>>>> Jack Manager 20 20 30 a
>>>> Brian Director 10 10 40 c
>>>> Amy Manager 20 20 30 d
>>>> John Director 10 10 20 a
>>>> Amy Director 20 30 20 d
>>>>
>>>>
>>>> And I want it to end up looking like this:
>>>>
>>>> Name Position Salary Bonus Compensation Company
>>>>
>>>> John Manager/Director 20 20 40 a
>>>> Jack Manager 20 20 30 a
>>>> Brian Director 10 10 40 c
>>>> Amy Manager/Director 40 50 50 d
>>>>
>>>> In short, I would like Stata to identify same names and then add
>>>> salary, bonus, and compensation together, while also keeping the
>>>> company name and position. It is best if the company displays only one
>>>> label during the merge ("a" instead of "aa" or "a-a", etc) but, as far
>>>> as position, either merging or appending or just displaying either
>>>> name would be fine (so ManagerDirector or just one of the names would
>>>> also work), I just don't want it to create an error during the
>>>> transformation process.
>>>>
>>>> Thank you for your consideration and any help with this would be
>>>> greatly appreciated.
>>>>
>>>> Dmitri
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/statalist/faq
>>>> * http://www.ats.ucla.edu/stat/stata/
>>>
>>>
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/
>>>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/