From: Maarten Buis <[email protected]>
To: [email protected]
Subject: Re: st: Matrix zeros and ones
Date: Mon, 22 Aug 2011 14:20:34 +0200
On Mon, Aug 22, 2011 at 1:59 PM, Tribin Uribe, Ana wrote:
> I have data about letters that have been signed by a group of people
>
> Letter signatures
> Letter1 Friend1
> Letter2 Friend2 Friend3 Friend4 Friend5
> Letter3 Friend4
>
> I want to create with this information a matrix with zeros and ones,
> like this one using information above
>
> Letter   Friend1  Friend2  Friend3  Friend4  Friend5
> Letter1  1        0        0        0        0
> Letter2  0        1        1        1        1
> Letter3  0        0        0        1        0
*----------------------- begin example ---------------------
clear
* one observation per letter; all signatures in a single string
input ///
Letter str31 signatures
1 "Friend1"
2 "Friend2 Friend3 Friend4 Friend5"
3 "Friend4"
end
* strpos() returns the position of the substring, or 0 if it is absent,
* so "> 0" gives a 0/1 indicator for each friend
gen byte friend1 = strpos(signatures, "Friend1") > 0
gen byte friend2 = strpos(signatures, "Friend2") > 0
gen byte friend3 = strpos(signatures, "Friend3") > 0
gen byte friend4 = strpos(signatures, "Friend4") > 0
gen byte friend5 = strpos(signatures, "Friend5") > 0
list Letter friend*
*--------------------- end example ------------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
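With more than a handful of signers, typing one -generate- line per
person gets tedious. A minimal sketch of the same idea in a loop,
assuming the names really do follow the pattern Friend1, Friend2, ...
(an assumption taken from the toy data, not from the real letters).
Padding both strings with spaces is an addition here: it stops
"Friend1" from also matching a hypothetical "Friend10".
*----------------------- begin example ---------------------
clear
input Letter str31 signatures
1 "Friend1"
2 "Friend2 Friend3 Friend4 Friend5"
3 "Friend4"
end
* loop over the (assumed) friend numbers 1 to 5
forvalues i = 1/5 {
    * surround both strings with spaces so only whole words match
    gen byte friend`i' = strpos(" " + signatures + " ", " Friend`i' ") > 0
}
list Letter friend*
*--------------------- end example ------------------------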
There are, however, two things I would worry about: 1) Your data are
stored as a string variable, and these can contain at most 244 characters
(including spaces). I can easily imagine letters signed by groups of
people large enough to exceed that limit. 2) When signing, people can
easily use variations of their own name (Maarten, Maarten Buis,
M. Buis, dr. M. Buis, Dr. M. Buis, dr. M.L. Buis, dr. M. L. Buis
(extra space between M. and L.), Maarten Leendert Buis, Maarten L.
Buis, etc.). Even if people are consistent in the way they sign
their name, the person who typed the names in could easily make typos (an
incomplete list of variations on my name that have appeared on this
list is: Marteen, Maarteen, Maarrten, Martin). The strategy used above
relies on exact substring matches, so any such variation is silently
coded as 0.
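One way to sidestep the string-length limit is to store the data in
long form, with one observation per letter-signature pair, and let
-reshape- build the indicators. The following is only a sketch under
that assumption; the variable names signer and signed are hypothetical.
Lower-casing and trimming remove differences in case and stray spaces,
but they do nothing about real typos or alternative spellings, which
would still have to be cleaned by hand.
*----------------------- begin example ---------------------
clear
input Letter str10 signer
1 "Friend1"
2 "Friend2"
2 "Friend3"
2 "friend4 "
2 "Friend5"
3 "Friend4"
end
* harmonise case and remove leading/trailing spaces
replace signer = lower(trim(signer))
gen byte signed = 1
* a person signing the same letter twice would break -reshape-
duplicates drop
* one column per distinct signer, e.g. signedfriend1
reshape wide signed, i(Letter) j(signer) string
* combinations that never occurred are missing after -reshape-
foreach v of varlist signed* {
    replace `v' = 0 if missing(`v')
}
list
*--------------------- end example ------------------------
Note that the cleaned names end up as part of the variable names, so
this only works when they are valid in a Stata variable name (no
spaces or periods, for instance).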
Hope this helps,
Maarten
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://www.maartenbuis.nl
--------------------------