
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Identify observations that appear in a list

From	R Zhang <[email protected]>
To	[email protected]
Subject	Re: st: Identify observations that appear in a list
Date	Fri, 14 Mar 2014 00:57:56 -0400

Thank you for being so helpful !!!

Warm regards,

Rochelle

On Thu, Mar 13, 2014 at 7:50 AM, Nick Cox <[email protected]> wrote:
> This is an FAQ, at least in the sense that this is frequently asked here.
>
> One approach is just to -merge- the data with a reduced copy of
> itself, with the important twist that you -rename- what you want as an
> identifier.
>
> The slogan I use to remind myself of this trick is
>
> "-merge- is for finding intersections as well as unions"
>
> and you're welcome to pin or write it on a board near you.
>
> http://www.stata.com/support/faqs/data-management/group-characteristics-for-subsets/
> is also relevant.
>
> . clear
>
> . input str5 CustomerIndustry  str5 SupplierIndustry Input
>
>      Custome~y  Supplie~y      Input
>   1. 1000A    4000B    100
>   2. 1000A    3000A    200
>   3. 1000A    3000B    100
>   4. 1000B    4000B    50
>   5. 1000B    2000A    8
>   6. 4000B    3000A    19
>   7. 4000B    2000A    20
>   8. 3000A    3000B    18
>   9. 3000A    3000D    12
>  10. 2000A    1000D    25
>  11. end
>
> . save tostart
> file tostart.dta saved
>
> . bysort SupplierIndustry: keep if _n == 1
> (4 observations deleted)
>
> . keep SupplierIndustry
>
> . rename SupplierIndustry CustomerIndustry
>
> . merge 1:m CustomerIndustry using tostart
>
>     Result                           # of obs.
>     -----------------------------------------
>     not matched                             8
>         from master                         3  (_merge==1)
>         from using                          5  (_merge==2)
>
>     matched                                 5  (_merge==3)
>     -----------------------------------------
>
> . tab _merge
>
>                  _merge |      Freq.     Percent        Cum.
> ------------------------+-----------------------------------
>         master only (1) |          3       23.08       23.08
>          using only (2) |          5       38.46       61.54
>             matched (3) |          5       38.46      100.00
> ------------------------+-----------------------------------
>                   Total |         13      100.00
>
> .
> end of do-file
>
> . list if _merge==3
>
>      +-------------------------------------------+
>      | Custom~y   Suppli~y   Input        _merge |
>      |-------------------------------------------|
>   2. |    2000A      1000D      25   matched (3) |
>   3. |    3000A      3000B      18   matched (3) |
>   6. |    4000B      3000A      19   matched (3) |
>  12. |    3000A      3000D      12   matched (3) |
>  13. |    4000B      2000A      20   matched (3) |
>      +-------------------------------------------+
>
> Nick
> [email protected]
>
>
> On 13 March 2014 02:12, R Zhang <[email protected]> wrote:
>
>> I have the following data set (HAVE); only a few observations are
>> provided as illustration. The Input variable gives the dollar input
>> sold by the supplier to the customer. You will notice that customer
>> industries 4000B and 3000A also appear in SupplierIndustry. This
>> indicates that some industries can be both suppliers and customers.
>>
>> +++++++++++++++++++++++
>>
>> HAVE
>>
>> CustomerIndustry  SupplierIndustry  Input
>> 1000A             4000B             100
>> 1000A             3000A             200
>> 1000A             3000B             100
>> 1000B             4000B             50
>> 1000B             2000A             8
>> 4000B             3000A             19
>> 4000B             2000A             20
>> 3000A             3000B             18
>> 3000A             3000D             12
>> 2000A             1000D             25
>>
>> +++++++++++++++++++++++
>>
>> I want to create a dataset that lists all customer industries that are
>> also supplier industries, i.e., my output should appear as:
>>
>> CustomerIndustry  SupplierIndustry  Input
>> 4000B             3000A             19
>> 4000B             2000A             20
>> 3000A             3000B             18
>> 3000A             3000D             12
>> 2000A             1000D             25
>>
>> I am asking for your help with coding this.
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
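
The -merge- approach in the quoted reply can be condensed into a short do-file. What follows is only a sketch under stated assumptions, not the code from the reply: it assumes Stata 11 or later (for the -merge 1:m- syntax and the keep() and nogenerate options), it stores the starting data in a tempfile (the macro name tostart is borrowed from the reply's file name), and it uses -duplicates drop- where the reply used -bysort SupplierIndustry: keep if _n == 1-. Run it as a whole do-file so that the tempfile macro stays defined.

```stata
* Example data copied from the original question
clear
input str5 CustomerIndustry str5 SupplierIndustry Input
1000A 4000B 100
1000A 3000A 200
1000A 3000B 100
1000B 4000B 50
1000B 2000A 8
4000B 3000A 19
4000B 2000A 20
3000A 3000B 18
3000A 3000D 12
2000A 1000D 25
end

* Keep a copy of the full data to merge back against
* (tempfile is an assumption here; the reply saved tostart.dta)
tempfile tostart
save `tostart'

* Reduce to one observation per distinct supplier industry
keep SupplierIndustry
duplicates drop

* The twist: rename so the supplier codes can serve as the merge key
rename SupplierIndustry CustomerIndustry

* keep(match) retains only the intersection, i.e. customers that
* are also suppliers; nogenerate suppresses the _merge variable
merge 1:m CustomerIndustry using `tostart', keep(match) nogenerate

sort CustomerIndustry SupplierIndustry
list, noobs
```

With the example data this should leave the same five observations that the reply lists under _merge==3. Keeping only matches at the -merge- step saves a later -keep if _merge==3-, at the cost of not seeing the counts of unmatched observations that the reply's tabulation shows.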

Follow-Ups:
- Re: st: Identify observations that appear in a list
  - From: R Zhang <[email protected]>

References:
- st: Identify observations that appear in a list
  - From: R Zhang <[email protected]>
- Re: st: Identify observations that appear in a list
  - From: Nick Cox <[email protected]>
