Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | László Sándor <sandorl@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: comparing xtdes-like patterns for variables |
Date | Wed, 31 Oct 2012 15:39:18 -0400 |
Thanks, Nick.

The values definitely don't line up that neatly, but that's a worry for
another day. Basically my problem is, if I know I can expect differences
between the variables, is there a neat way to compare their missing
patterns (one always starting early, or one mistakenly having the years
in reverse order)?

On Wed, Oct 31, 2012 at 3:15 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> If # different versions of the same data should be the same, there
> will be # duplicates of everything in a combined dataset.
>
> This applies to missings too.
>
> -duplicates- is therefore something that springs to mind. Panels are
> no problem, as panel identifiers are just other variables
>
> Naturally, if the combined dataset is extremely large, this won't be
> very practical.
>
> Nick
>
> On Wed, Oct 31, 2012 at 7:02 PM, László Sándor <sandorl@gmail.com> wrote:
>
>> I have a panel-data cleaning problem that probably has some neat
>> solution, probably already out there. I am happy to try any solutions
>> for Stata 12.1 MP.
>>
>> Background: I had to try to look up supposedly the same data from
>> multiple sources. (Financial data for the same securities, but
>> different data sources were expected to cover different subsets of my
>> universe, or for different time periods.)
>>
>> But now I have a panel where I would like to cross-check different
>> version of the same data, and most crucially, I would like to verify
>> that I got the years correctly for each version. (FYI: financial data
>> sources can be opaque about how they handle missing data if you ask
>> for "end-of-year prices for the last 15 calendar years", and whether
>> they give years in ascending or descending order). For this, I would
>> like to compare what periods I have non-missing values for a family of
>> variables, say, bloomberg_price and reuters_price.
>>
>> Presumably, if I got the start and the end years right, I could hope
>> -compare- those, (e.g. -compare *_price_first- ). And hope that the
>> patterns will be clear.
>>
>> That said, I'm afraid some more nuanced analysis of missing value
>> patterns might be justified. What are good tools for that? (How can I
>> "xtdes by variable"? Or "misstable pattern in a panel"?)
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
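
One possible way to get at the "xtdes by variable" and "misstable pattern in a panel" questions in this thread is sketched below. It is only an illustration under assumptions, not a solution proposed by either poster: the variable names id and year, and the toy values typed in with -input-, are made up here so the example runs on its own. The idea is to restrict -xtdescribe- to the observations where a given variable is non-missing, to build per-security first and last years for each source so that -compare- can be applied to them, and to use -misstable patterns- for the joint missingness of the two sources.

```stata
* Toy panel for illustration only: id, year and all values are assumptions
clear
input id year bloomberg_price reuters_price
1 2010 10 .
1 2011 11 11
1 2012 12 12
2 2010 .  20
2 2011 21 21
2 2012 22 .
end
xtset id year

* "xtdes by variable": participation pattern of the non-missing values
foreach v of varlist bloomberg_price reuters_price {
    display _newline as text "Non-missing pattern for `v'"
    xtdescribe if !missing(`v')
}

* First and last year with data, per security and per source
foreach v of varlist bloomberg_price reuters_price {
    egen first_`v' = min(cond(!missing(`v'), year, .)), by(id)
    egen last_`v'  = max(cond(!missing(`v'), year, .)), by(id)
}

* Do the two sources start and end in the same years?
compare first_bloomberg_price first_reuters_price
compare last_bloomberg_price  last_reuters_price

* Joint missingness of the two sources, observation by observation
misstable patterns bloomberg_price reuters_price, frequency
```

On real data the -input- block would be replaced by the actual dataset, and the two variable names by the full family of price variables. A source whose years came back in reverse order would plausibly show up as first and last years that are mirrored relative to the other sources, although that would still need checking against the values themselves.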