Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | <theodroe@bellsouth.net> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | st: Model design question |
Date | Thu, 20 Dec 2012 11:19:17 -0500 |
Hello,

My name is Ted Kaniuka. I am using Stata 12 (2-core) and I work in the field of Educational Leadership/economics. I don't have the strongest statistics background, but I try. My question has to do with design. I am new to panel models, so forgive me if my questions are rudimentary.

I want to regress student achievement scores, grouped by school, on a set of constructs measured using a survey. The constructs are the differences between how each school's principal and teachers perceive the culture. So I have one set of scores for the principal and one set for each teacher. To establish the difference, I subtracted the principal's scores from those of each teacher in each school. So for one school I can have 20 sets of scores if there are twenty teachers. The problem is that I only have one overall achievement score for the entire school. There is no way to link an individual teacher's survey results to their students' test scores.

Here is the model and a different explanation. The difference variables were developed by subtracting the school principal's domain score from each teacher's domain score. The difference score then becomes the new predictor variable. This process was repeated for each of the five domains in the TWC. A basic regression model could be:

Yj = β0 + β1 X1j + β2 X2j + β3 X3j + β4 X4j + β5 X5j

where Xij = the school-level difference for school j between the principal's and teachers' perceptions of TWC domain i, and Yj = the reading or math score for school j.

So here is the dilemma. If I treat each school as a single unit, I create a mean score for all the teachers on each construct, subtract the lone principal's scores from these group scores, and then match the resulting difference scores with each school's achievement score. I end up with 700 cases. I believe that I will need to weight each case, since some schools have 10 teachers while others have 40, and it seems that I should account for this.
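To make the school-level option concrete, here is a minimal sketch in Python with NumPy. It is only an illustration under stated assumptions: the data are simulated, every name (`teacher`, `principal`, `school_id`, `wls`) is hypothetical, and weighting each school by its number of teachers is just one plausible choice, not necessarily the right one for this design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in data (not real survey results): 50 schools, 5 domains.
n_schools, n_domains = 50, 5
teachers_per_school = rng.integers(10, 41, size=n_schools)  # 10 to 40 teachers

# One row per teacher, tagged with the school the teacher belongs to.
school_id = np.repeat(np.arange(n_schools), teachers_per_school)
principal = rng.normal(size=(n_schools, n_domains))       # one row per school
teacher = rng.normal(size=(school_id.size, n_domains))    # one row per teacher

# Teacher-level difference: teacher score minus that school's principal score.
diff = teacher - principal[school_id]

# Collapse to one row per school: the mean difference within each school.
sums = np.zeros((n_schools, n_domains))
np.add.at(sums, school_id, diff)
X = sums / teachers_per_school[:, None]

# Made-up school-level achievement score (one per school).
y = rng.normal(size=n_schools)

def wls(X, y, w):
    """Weighted least squares with an intercept, via a sqrt(w) rescaling."""
    Xd = np.column_stack([np.ones(len(y)), X])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)
    return beta

# Assumption: weight each school by its number of responding teachers.
beta = wls(X, y, teachers_per_school.astype(float))
print(beta)  # intercept followed by the five domain coefficients
```

Because each school has a single principal, the mean of the teacher-level differences equals the mean teacher score minus the principal's score, so either order of operations gives the same school-level predictor.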
However, I wanted to do a multi-level design, but it does not seem that I should, since I would have the same number of cases at each level. I do want to control for school building characteristics (wealth, size, experience level of staff, tenure of staff at the school).

I could revert to the data set that has 28,000 cases, but then a school of twenty teachers has twenty difference scores and twenty copies of the school's achievement score, none of which is the score for that teacher's own class. For a multi-level model this makes more sense, since I have 28,000 cases at one level and 700 at the other. The large data set has many more cases but seems incorrect to use, since the student achievement scores are for the school, not the individual teacher.

I hope this is clear. While this is not a coding question, I hope that you will have the time to provide some feedback.

Ted

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/