Hoetker, Glenn
> I have two files. File A has about 5000 unique values of the variable
> PATENT, which is 7 characters long. File B has 16 million observations
> and several million unique values for PATENT. I want to do some
> manipulation involving File B, but only for the observations that
> correspond to the patent values found in File A. I am currently using
> merge on the two files to do this (actually mmerge as a wrapper for
> ease), but wonder if there is an easier/faster way.
>
> I attempted using vallist.ado in File A to generate a long local macro
> (say, _useme) and then doing
>
> use FileB if index(patent, "'useme'")
>
> I get 0 observations in this case (even though I know there are some
> matches). From the manual, it appears that index is limited to strings
> of 80 characters, anyway.
-vallist- is Patrick Joly's program.
Quite apart from the 80-character limit, what it does
will not help with your problem.
Stripping down to a miniature analogue, suppose you have
a string variable -myvar- which takes on distinct values
"a" "b" "c".
-vallist myvar- will return that set of values as a
space-separated list, i.e.
"a b c"
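If you then say
... if index(myvar,"a b c")
then this is true for _none_ of the observations:
index(s1,s2) looks for s2 within s1, and the whole list
"a b c" is not contained in any of the single values
"a", "b" or "c". Naturally, you report the same for your dataset.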
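The miniature case can be tried directly. What follows is only a sketch: the
toy dataset is made up for illustration, and it assumes a Stata version in
which index() is still accepted (recent versions document the same function
as strpos()). The // comments work in a do-file, not at the command line.

```stata
* toy data: one string variable with three distinct values
clear
input str1 myvar
"a"
"b"
"c"
end

* index(s1, s2) is the position of s2 within s1, or 0 if s2 is not found
count if index(myvar, "a b c")    // looks for the whole list inside each value
count if index("a b c", myvar)    // looks for each value inside the list
```

The first -count- finds no observations and the second finds all three.
Reversing the arguments is therefore closer to what was intended, but it is
safe only if no value can occur as part of another value or across a space,
and a list of about 5000 seven-character values may well exceed whatever
string length limit applies in your version of Stata.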
Closer to your problem are approaches detailed in
http://www.stata.com/support/faqs/data/characteristics.html
and
http://www.stata.com/support/faqs/data/selectid.html
which may well be the equivalent of what you are doing
with -mmerge-.
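For concreteness, here is one way of selecting on a list of identifiers with
official -merge-. It is a sketch under assumptions, not necessarily what the
FAQs or -mmerge- do: the file names FileA and FileB and the lower-case
variable name patent are taken from the question, the -merge m:1- syntax
needs Stata 11 or later, and the temporary file means it must be run from a
do-file. In older versions the equivalent is to sort both files on patent,
-merge patent using- the lookup file and then -keep if _merge == 3-.

```stata
* build a lookup file holding each patent value from File A exactly once
use patent using FileA, clear
duplicates drop
tempfile wanted
save `wanted'

* keep only the observations of File B whose patent appears in the lookup
use FileB, clear
merge m:1 patent using `wanted', keep(match) nogenerate
```

Reading only patent from File A keeps the lookup file small, so the cost is
dominated by reading and matching the 16 million observations in File B.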
Nick