Hi Cristian,
I am not sure about your model, partly because it is not entirely clear
to me what your dependent variable is (sales? satisfaction?) and whether
the data are individual or grouped.
As for the apparently odd result when team familiarity is entered as a
continuous rather than a dichotomous variable, I would look for outliers
and influential points in the dataset. Could one observation, or a small
number of observations, strongly influence the estimate in the continuous
analysis while carrying much less weight in the dichotomous analysis?
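To make the point concrete, here is a minimal Python sketch with invented numbers (they are not from the video game data, and the names `familiarity`, `rating` and `ols_slope` are hypothetical). It assumes a simple one-regressor least-squares fit rather than the random-effects models in the original post. One observation with high familiarity and a low rating is enough to make the continuous slope negative while the dummy coefficient stays positive, and dropping that observation flips the sign of the continuous slope.

```python
from statistics import mean

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (simple regression)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

# Invented data for illustration only: half the teams have familiarity 0,
# most of the rest have a low value, and one team has a very high value.
familiarity = [0, 0, 0, 0, 0, 1, 1, 1, 1, 10]
rating = [3.0, 3.2, 2.8, 3.1, 2.9, 4.0, 4.2, 3.8, 4.0, 1.0]

# Dichotomous version: 0 if familiarity is 0, 1 otherwise.
dummy = [1 if f > 0 else 0 for f in familiarity]

slope_continuous = ols_slope(familiarity, rating)  # negative for these data
slope_dummy = ols_slope(dummy, rating)             # positive for these data

# Leave-one-out slopes: refit the continuous model with each observation
# dropped in turn and see how much the estimate moves.
loo_slopes = []
for i in range(len(rating)):
    x = familiarity[:i] + familiarity[i + 1:]
    y = rating[:i] + rating[i + 1:]
    loo_slopes.append(ols_slope(x, y))

print("continuous slope:", slope_continuous)
print("dummy coefficient:", slope_dummy)
for i, s in enumerate(loo_slopes):
    print("slope without observation", i, ":", s)
```

In this toy example the slope stays negative whichever of the first nine observations is dropped, and turns positive only when the last one is dropped. That is the pattern I would look for in the real data.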
Hope this helps
Isabelle
From: [email protected] [mailto:[email protected]] On Behalf Of Cristian Dezso
Sent: Tuesday, October 13, 2009 4:08 AM
To: [email protected]
Subject: st: 3 issues multiple firm obs per year, bounded dependent variable and odd independent variable
Hello,
This is my first post on the listserv, and I will try to keep it short
and informative.
I have a data set on all video games released since the beginning of the
industry. I want to analyze how the user rating of a game depends on
genre-, firm-, and year-specific controls, as well as on a variable that
measures how familiar the team members who worked on the game are with
each other.
I have several issues with the analysis that I would like to run, but
the following three seem to be the most important:
1. The data set is not a panel, in that for some firms I have several
observations per year, since a firm can release more than one game per
year
- To address this, I declared xtset with only the panel variable (the
firm identifier) and, in a first run of analyses, used xtreg and xttobit
with firm-specific random effects and release-year dummies.
Question: is this appropriate for the type of analysis I want to do, and
should I drop all observations for firms that released only a single
game during the entire sample period?
2. The dependent variable - user rating - is continuous and can only
take values between 0 and 5.
- What I have done is use xtreg (probably not appropriate) and also
xttobit (without declaring lower or upper bounds), given that the
variable is bounded.
Question: is tobit appropriate for this kind of analysis, given that the
variable, while bounded, is NOT censored? If not, can I transform the
dependent variable so that it is distributed over (0, infinity)?
3. The variable that measures team familiarity takes the value 0 almost
50% of the time (team members never worked together before) and is
very low in almost all other cases.
- I nevertheless used it as a continuous independent variable in the
sales regression, and the oddest thing happened: the coefficient is
negative and significant, but if I instead use a dummy that takes the
value 0 if team familiarity is 0 and 1 otherwise, the coefficient on the
dummy is positive.
Question: what would be the appropriate solution for such a highly
skewed independent variable - continuous or dummy? And what could
explain this puzzle of a negative coefficient for the continuous
representation, but positive for the dummy?
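On the transformation asked about in question 2: one mapping that sends a rating strictly inside (0, 5) to (0, infinity) is the odds-style ratio rating / (5 - rating). The Python sketch below is only an illustration of that mapping and its inverse. The function names are hypothetical, and it says nothing about whether such a transformation is the right modeling choice. Ratings of exactly 0 or 5 have no image in (0, infinity), so the sketch rejects them.

```python
def rating_to_odds(rating, upper=5.0):
    """Map a rating strictly inside (0, upper) to (0, infinity)."""
    if not 0.0 < rating < upper:
        raise ValueError("rating must lie strictly between 0 and upper")
    return rating / (upper - rating)

def odds_to_rating(odds, upper=5.0):
    """Inverse mapping: take a positive odds value back to (0, upper)."""
    if odds <= 0.0:
        raise ValueError("odds must be positive")
    return upper * odds / (1.0 + odds)

# Round trip on a few hypothetical ratings.
for r in (0.5, 2.5, 4.0, 4.9):
    transformed = rating_to_odds(r)
    print(r, "->", transformed, "->", odds_to_rating(transformed))
```

Any coefficients estimated on the transformed scale would have to be interpreted on that scale, or mapped back with the inverse function.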
I apologize for the long post and thanks in advance for your help,
Cristian