Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Jeph Herrin <stata@spandrel.net> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: svy + aweights |
Date | Thu, 10 Nov 2011 16:44:20 -0500 |
On 11/10/2011 4:28 PM, Joerg Luedicke wrote:
>> I would be downweighting the subjects with fewer days;
> Whether you downweight subjects with fewer days or upweight subjects with more days should be the same thing. Anyway, if your concern is that of differing reliability across sub-groups of your data, then weighting will not solve the problem.
You asked why I was upweighting kids with fewer days; I stated that my concern was the opposite. If you want to understand the problem better, I suggest you use Google to look for information on precision weighting, inverse-variance weighting, and weighting by sample size. It is a common issue in meta-analysis, where each observation represents a separate trial with a known sample size N; when the effects are pooled, it is common to weight either by the sample size N (which is analogous to what I am doing here) or by the inverse variance of the effect.
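As a minimal sketch of the pooling just described (the notation is assumed here for illustration and does not come from the thread): write \hat{\theta}_i for the estimate from unit i, N_i for its sample size, and v_i for the estimated variance of \hat{\theta}_i. The second line adds the further assumption that a subject's n_i daily values are independent with a common variance \sigma^2, which need not hold for real monitoring data.

```latex
% Weighted pooling of unit-level estimates
\hat{\theta}_{\mathrm{pooled}}
  = \frac{\sum_i w_i \,\hat{\theta}_i}{\sum_i w_i},
\qquad
w_i = N_i \quad \text{or} \quad w_i = \frac{1}{v_i}

% Under the equal-variance, independent-days assumption,
% the variance of a subject's daily average over n_i days is
\mathrm{Var}(\bar{y}_i) = \frac{\sigma^2}{n_i}
\quad\Longrightarrow\quad
w_i = \frac{1}{\mathrm{Var}(\bar{y}_i)} \propto n_i
```

Under that assumption, weighting by the number of days monitored and weighting by the inverse variance of the daily average differ only by a constant factor, which is the sense in which the two are analogous.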
>> I wouldn't actually use number of days but the inverse of the variance of the daily average, since I have that; number of days is just a shorthand way of thinking about it.
> I am completely at a loss with this one.
See previous comment.
>> Not all obese kids have fewer days than normal kids, it's just that the distribution is skewed down - higher percent with 1-2 days and lower percent with 6-7 days.
> As I said, consider a multilevel model. That way you can directly account for the uncertainty that is related to the varying numbers of measurement occasions.
If you can explain how to fit a multilevel model using complex survey design data, that would be helpful.

cheers,
Jeph
J.

On 11/10/2011 3:37 PM, Joerg Luedicke wrote:

Then it seems to be a problem of reliability of your measure, i.e., measurement for obese kids is less reliable than measurement for non-obese kids, right? Now, if you upweight the obese kids in your sample, why would that enhance the reliability of their measurement? If I understand the problem correctly, then weighting strikes me as the wrong approach here. Perhaps you could consider not averaging at all and running a multilevel model of some sort.

J.

On Thu, Nov 10, 2011 at 3:23 PM, Jeph Herrin <stata@spandrel.net> wrote:

For each day, I have 1440 minutes (24 hours) of measurements. Each minute has an activity measure, 0-30,000. I want to compare how active the kids (these are all children) are, so I calculate an activity measurement for each day (to keep it simple here I will say it is the median, though actually it is a complicated function of the activity levels over the day).

    id   day1   day2   day3   obese   average   days
     1    500    500    500       Y       500      3
     2   1000                     N      1000      1

Now I want to compare kids who are normal weight to those who are obese. It turns out, I don't have as many measurements on the obese kids because they did not wear their monitor as often. So the active kids have more precise daily averages than the obese kids. To compare average activity, I want to account for the differences in precision. If this was not -svy- data, I would use something like

    ttest average [aw=days], by(obese)

even better -reshape- the data to have one record per day per id and use

    xtset id
    xtreg average

But here I have this complex survey design to deal with.

thanks,
Jeph

On 11/10/2011 3:06 PM, Joerg Luedicke wrote:

I do not quite understand what you are trying to do. Suppose we have two individuals, one measured only once and the other on, say, 3 occasions. Let's further assume that activity is measured in minutes (btw, how is your dependent variable measured?). We could have the following data:

    id   day1   day2   day3
     1     30
     2     10     10     10

If you calculate the minutes per day now (whether or not this being a proper way of handling it), id#1 will end up with 30 and id#2 with 10 minutes. I do not understand why id#2 is supposed to weigh more than id#1?

J.

On Thu, Nov 10, 2011 at 2:34 PM, Jeph Herrin <stata@spandrel.net> wrote:

Thanks for the suggestion, but I specifically need to give more weight to subjects which have more days of observation. For example, I have

    svy : regress activity female BMI

and would like this regression to give more weight to subjects which have more days of observation. Using activity/days as the dependent variable will not do this.

Jeph

On 11/10/2011 1:58 PM, Stas Kolenikov wrote:

Rather than forming the mean activity per day, you might want to analyze this as a ratio:

    svy : ratio activity / day_reported

or whatever would be an appropriate ratio. That way, you will get correct standard errors without messing with the analytical weights.

On Thu, Nov 10, 2011 at 1:46 PM, Jeph Herrin <stata@spandrel.net> wrote:

I am analyzing NHANES data (see manual page for -svyset-) using -svy- commands. My complication is that I am using the subset of subjects for which there is activity monitoring, and the number of days monitored varies from 1 to 8. So - to be clear - for some subjects I have 1 day of monitoring, and for some I have 2 days, some I have 3, etc.

My dependent variable of interest is daily average activity levels, but I would like this to be weighted by the number of days monitored. (This is important because there seems to be a clear relationship between days monitored and age, race, etc). How do I incorporate this additional level of weighting? For instance, if I use

    svy : mean depvar [aw=days]

I get an error that weights are not reported.
thanks, Jeph
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/