Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: tabstatmat question
From: "Alvarez,Sergio" <[email protected]>
To: <[email protected]>
Subject: Re: st: tabstatmat question
Date: Fri, 02 Sep 2011 13:14:40 -0400
Hi Austin,
Thanks for your response. What I'm doing is a site choice model of
recreational fishing. From the dataset I can tell what city/town people
come from and what city/town they went fishing in. I can also tell how
many fish they caught. For the site choice model, I need to create a
series of alternatives, that is, other places where the person could
have fished but decided not to. I also need some indication of the
quality of each site that was not visited. Since the alternative fishing
trips did not take place, I have no indication of how many fish the
person would have caught had they gone to place B rather than to place
A, which is where they actually went.
So as an indication of quality I was going to use the mean number of
fish caught at the site (zone) at that particular time of the year
(wave) in past years. That is why I wanted to create a matrix that
would hold the mean catch by zone and wave, something like this:
        WAVE
ZONE    1            2            ...
1       mean(1,1)    mean(1,2)
2       mean(2,1)    mean(2,2)
...
That was my original question. Once I have created the alternatives
that did not happen, I would use that matrix to fill in their mean
catch. If the matrix looked like the example above, I thought I
could use:
gen meancatch = matrix[zone,wave]
I was hoping that this line of code would look up the wave and zone of
each observation and fill in the corresponding value from the matrix.
So I looked around and found -tabstatmat- from SSC, and tried it, using
the code you gave yesterday:
egen byv=group(zone wave), lab
tabstat num_typ3, stat(mean) by(byv) save
tabstatmat TABLE
This created the matrix with the values, which looks like this:
TABLE[414,1]
            num_typ3
1 1:mean   1.9822335
1 2:mean   2.6614173
1 3:mean   2.7150396
1 4:mean   3.3340782
1 5:mean   2.8161094
1 6:mean   1.1767857
2 1:mean   1.5857143
2 2:mean   2.1863208
2 3:mean    2.542777
2 4:mean   1.8849432
2 5:mean   1.7281553
2 6:mean   1.4927536
3 1:mean       1.875
.....
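For reference, a minimal sketch of looking up values in a one-column matrix of group means like the one above. It does not call -tabstatmat-; it builds a stand-in matrix M (a hypothetical name) from invented toy data, assuming row g holds the mean for group byv == g, and then cross-checks the lookup against -egen, mean()-. The number of zones and waves and the catch counts are made up for illustration.

```stata
* toy data: 3 zones, 2 waves, invented catch counts
clear
set seed 2011
set obs 300
gen zone = ceil(3*runiform())
gen wave = ceil(2*runiform())
gen num_typ3 = rpoisson(2)

* group index 1..G, ordered by zone and then wave
egen byv = group(zone wave)
quietly summarize byv
local G = r(max)

* one-column matrix: row g holds the mean catch for group g
matrix M = J(`G', 1, .)
forvalues g = 1/`G' {
    quietly summarize num_typ3 if byv == `g', meanonly
    matrix M[`g', 1] = r(mean)
}

* the row is the group index; the only valid column is 1
gen meancatch = M[byv, 1]

* cross-check against the group mean computed directly
egen mby = mean(num_typ3), by(zone wave)
assert reldif(meancatch, mby) < 1e-6
```

With a single column, the row index has to be the group number and the column index has to be 1. As far as I know, a subscript outside the matrix dimensions evaluates to missing in an expression, which would be consistent with [zone,wave] returning missing whenever wave is not 1.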
There are 85 sites with 6 waves apiece.
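If zone and wave are both coded as consecutive integers starting at 1, the means could instead be arranged as a zones-by-waves matrix, so that a [zone, wave] lookup has a row and a column to refer to. A rough sketch under that assumption, again with invented toy data and a hypothetical matrix name MEANS:

```stata
* toy data: 3 zones, 2 waves, invented catch counts
clear
set seed 2011
set obs 300
gen zone = ceil(3*runiform())
gen wave = ceil(2*runiform())
gen num_typ3 = rpoisson(2)

* matrix dimensions, assuming codes run 1..Z and 1..W
quietly summarize zone
local Z = r(max)
quietly summarize wave
local W = r(max)

* rows are zones, columns are waves; empty cells stay missing
matrix MEANS = J(`Z', `W', .)
forvalues z = 1/`Z' {
    forvalues w = 1/`W' {
        quietly summarize num_typ3 if zone == `z' & wave == `w', meanonly
        if r(N) > 0 matrix MEANS[`z', `w'] = r(mean)
    }
}
matrix list MEANS

* now a two-way subscript refers to an actual row and column
gen meancatch = MEANS[zone, wave]
```

This only lines up if the zone and wave codes really are 1, 2, 3, ... with no gaps; otherwise the codes would first need to be mapped to consecutive integers, for example with -egen, group()-.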
The original dataset has about 70,000 observations, so after creating
84 alternatives for each I get about 6,000,000 observations. I already
know how to do this using -reshape- and the distance to the alternative
sites, which I have already put in the dataset. What I still need is
the indicator of quality, that is, the mean catch at each alternative
site during the time period in which the person actually went fishing.
Then I will be able to run -clogit- or a similar procedure.
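One way to attach such a quality indicator without going through a matrix at all is to save the zone-by-wave means as a small dataset and merge it onto the expanded data. The sketch below assumes invented toy data, a hypothetical variable altzone identifying each alternative site, and -expand- as a stand-in for however the alternatives are actually created; in the real application the means would presumably come from the past-years data rather than from the same trips.

```stata
* toy data: 3 zones, 2 waves, invented catch counts
clear
set seed 2011
set obs 300
gen zone = ceil(3*runiform())
gen wave = ceil(2*runiform())
gen num_typ3 = rpoisson(2)

* lookup table: one row per zone-wave cell with its mean catch
preserve
collapse (mean) meancatch = num_typ3, by(zone wave)
rename zone altzone
tempfile means
save `means'
restore

* toy choice set: each trip paired with every candidate zone
gen long tripid = _n
expand 3
bysort tripid: gen altzone = _n
gen byte chosen = (altzone == zone)

* attach the mean catch of each alternative zone in the trip's wave
merge m:1 altzone wave using `means', keep(master match) nogenerate
```

After the merge, each row carries the mean catch for its alternative zone in the wave of the actual trip, alongside the indicator for the site that was chosen.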
I hope this makes sense. I'm new both to Stata and to choice models,
so this has been a pretty confusing and slow process for me.
I really appreciate the help.
Sergio
On Fri, 2 Sep 2011 12:45:05 -0400, Austin Nichols wrote:
Sergio <[email protected]>:
Did you read my response?
Look at the matrix; there is one column, so your references to row and
column make no sense.
You could make another matrix with values of byv corresponding to zone
and wave, noting that you must have these be integers counting from 1
up for row and column numbers to correspond to what you seem to want.
But why? What would be the point of this?
On Fri, Sep 2, 2011 at 12:35 PM, Alvarez,Sergio <[email protected]> wrote:
Sorry about ambiguity.
So I used the mean by group code to create the matrix that would store
the mean values for each group, using:
egen byv=group(zone wave), lab
tabstat num_typ3, stat(mean) by(byv) save
tabstatmat TABLE
which gives me a matrix, or rather a vector, with all the values I need.
The first few lines of the matrix in the output screen look like this:
TABLE[414,1]
            num_typ3
1 1:mean   1.9822335
1 2:mean   2.6614173
1 3:mean   2.7150396
1 4:mean   3.3340782
1 5:mean   2.8161094
1 6:mean   1.1767857
2 1:mean   1.5857143
2 2:mean   2.1863208
2 3:mean    2.542777
2 4:mean   1.8849432
Now what I want to do is use -gen- or -egen- to create a variable that
would look up the zone and wave of the corresponding observation from
the matrix and insert the correct value in there. So I tried:
gen meancatch = TABLE[zone,wave]
and this gives the correct values for all observations with wave = 1,
but creates missing values on the rest of the observations. I also
tried:
gen meancatch = TABLE[byv,num_typ3]
and this gives me the correct value in some of the observations, but
mostly missing values in the others.
So I must be doing something wrong, but can't figure out what. I guess
the question is how to call the row and column numbers from the TABLE
matrix?
Thanks again,
Sergio
On Fri, 2 Sep 2011 12:08:31 -0400, Austin Nichols wrote:
Sergio <[email protected]> :
Now I have no idea what you are trying to do. For the mean by group,
egen mby=mean(num_typ3), by(zone wave)
but you are referring to (probably) nonexistent row and column numbers
of a matrix in your example.
On Fri, Sep 2, 2011 at 10:42 AM, Alvarez,Sergio <[email protected]> wrote:
Thanks Austin and Nick for your help. I used what Austin suggested
(which is what Nick also suggested) and it worked. However, when I try
to create the variable that contains the mean by group it works for
some observations, but missing values are created for most of them. I
tried both:
gen meancatch = TABLE[zone,wave]
and
gen meancatch = TABLE[byv,num_typ3]
For the first line of code, it creates the correct value for all
observations where wave = 1, but not for any others. The second line
creates missing values at random (as far as I can tell).
I'd appreciate any tips.
*
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
--
Sergio Alvarez
Food and Resource Economics
University of Florida