Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Assigning new values to group variables
From: Robert Picard <[email protected]>
To: [email protected]
Subject: Re: st: Assigning new values to group variables
Date: Mon, 9 May 2011 11:23:59 -0400

There are several issues here, but I assume that you want to preserve the
relationship recorded in each observation. The following example creates
a variable called rel_id that identifies each relationship. Your main
issue, getting consistent Group values, is handled by converting the data
to long form. I then create a new variable called gid that identifies
groups of companies linked by the relationships stated in the initial
dataset. This requires a program of mine called -group_id-, available
from SSC. In case you need it, I also convert back to wide form.

Hope this helps,

Robert

* --------------------- begin example ---------------------
clear
input Group1 str10 Var1 Group2 str10 Var2
1 companyABC 1 companyABD
1 companyABC . .
2 companyABD . .
3 companyABE . .
4 companyABF 2 companyCCC
5 companyACF 3 companyDDD
6 companyACG . .
6 companyACG 4 companyADK
7 companyADK . .
8 companyADL 5 companyCCD
8 companyADL . .
end
* Assign a unique identifier to each observation
* These identify a relationship
gen rel_id = _n
* Reshape to long form; drop obs with no company
reshape long Group Var, i(rel_id) j(j)
drop if Var == "."
* Disregard Group values if they are not Group1
replace Group = . if j > 1
* Each company should have the same Group value
sort Var Group
by Var: replace Group = Group[1]
* Assign new Group values for companies that were
* not part of Group1
by Var: gen first = _n == 1
sum Group, meanonly
replace Group = r(max) + sum(first) if Group == .
drop first
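* Optional check, added as a sketch (not part of the original
* example): every observation should now have a Group value,
* and each company should carry exactly one
assert !missing(Group)
bysort Var (Group): assert Group[1] == Group[_N]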
* Group companies that are part of the same
* relationship. This requires -group_id-, available
* from SSC. To install, type: ssc install group_id
gen gid = Group
group_id gid, matchby(rel_id)
sort gid Var
list, sepby(gid) noobs
* If desired, convert back to wide
sort rel_id
reshape wide Var Group gid, i(rel_id) j(j)
list, noobs sep(0)
* --------------------- end example -----------------------
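
For readers who want to see the kind of grouping that the gid step
performs, here is a minimal sketch in base Stata. It is an illustration
only, not the implementation of -group_id-: it uses a small made-up
dataset in long form (the values and the variable name changed are
assumptions for the illustration) and repeatedly copies the smallest
identifier across each relationship and each company until nothing
changes. Companies that are chained together through relationships then
share one gid.

* --------------------- begin sketch ----------------------
clear
* Toy long-form data: one line per company per relationship
input rel_id Group
1 1
1 2
2 2
2 3
3 4
4 5
4 4
end
gen long gid = Group
local changed = 1
while `changed' {
    local changed = 0
    * Within a relationship, everyone takes the smallest gid
    bysort rel_id (gid): gen long m = gid[1]
    count if m != gid
    local changed = `changed' + r(N)
    replace gid = m
    drop m
    * Within a company, every observation takes the smallest gid
    bysort Group (gid): gen long m = gid[1]
    count if m != gid
    local changed = `changed' + r(N)
    replace gid = m
    drop m
}
* Companies 1, 2, 3 are chained through relationships 1 and 2;
* companies 4 and 5 are linked through relationship 4
assert gid == 1 if rel_id <= 2
assert gid == 4 if rel_id >= 3
sort gid rel_id Group
list, sepby(gid) noobs
* --------------------- end sketch ------------------------

The loop can take several passes on long chains of relationships, so
for a dataset with several thousand observations the SSC program is
likely to be the more convenient choice.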
On Mon, May 9, 2011 at 7:35 AM, Florian Seliger <[email protected]> wrote:
> Dear Statalist,
>
> I have a dataset from a firm survey containing several thousand observations.
>
> There are six variables with company names (Var1-Var6) in which firms indicate the other firms with which they have relationships.
>
> The same companies may occur across Var1-Var6. They are grouped as indicated by the variables group1-group6.
>
> Var2-Var6 contain many missing values because many firms report a relationship with only a single firm.
>
> The variables group1-group6 use different numbers even when the company is the same in var1 and var2 (and var3…); e.g., group1 may take the value 2 whereas group2 takes the value 1 for the same company. A further problem is that var2-var6 may contain companies that do not appear in var1.
>
> Please see the example below for a few companies.
>
>
>
> Group1 Var1       Group2 Var2
> 1      companyABC 1      companyABD
> 1      companyABC .      .
> 2      companyABD .      .
> 3      companyABE .      .
> 4      companyABF 2      companyCCC
> 5      companyACF 3      companyDDD
> 6      companyACG .      .
> 6      companyACG 4      companyADK
> 7      companyADK .      .
> 8      companyADL 5      companyCCD
> 8      companyADL .      .
>
>
>
> In the end, every company appearing anywhere in Var1-Var6 should have the same value that it has in group1. In addition, companies that do not occur in Var1 should be assigned a new number. Please see the example below.
>
>
>
>
>
> Group1 Var1       Group2 Var2
> 1      companyABC 1      .
> 1      companyABC 1      .
> 2      companyABD 2      companyABD
> 3      companyABE 3      .
> 4      companyABF 4      .
> 5      companyACF 5      .
> 6      companyACG 6      .
> 6      companyACG 6      .
> 7      companyADK 7      companyADK
> 8      companyADL 8      .
> 8      companyADL 8      .
> 9      .          9      companyCCC
> 10     .          10     companyDDD
> 11     .          11     companyCCD
>
>
>
> As I have not found the right approach in Stata for assigning new numbers to companies that do not occur in var1, I would like to ask whether you have any ideas.
>
>
>
> Thank you.
>
>
>
> Best,
>
> Florian
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>