Chih-Mao Hsieh
>
> I had been shying away from converting "cited" to
> strings because the numbers are in the millions, i.e.
> strings would be length 7. Many of the "citing" patents
> have more than 35-40 "cited" patents, and so the
> concatenation might surpass the string's length limit.
>
> Of course, the chances are not high that two patents
> would match each other over the first 35 patents, so your
> way does appear to be better.
Another way is to -reshape-, something
like this:
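    * number the cited patents within each citing patent, in sorted order
    bysort citing (cited) : gen j = _n
    * one observation per citing patent, with variables cited1, cited2, ...
    reshape wide cited, i(citing) j(j)
    * counter = number of other citing patents with an identical cited list
    bysort cited* (citing) : gen counter = _N - 1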
At this moment, I think that's a lot better
than my earlier suggestions.
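To see what those three lines do, here is a minimal sketch on a made-up dataset. The patent numbers are invented for illustration only, and I am assuming the data hold just the two variables citing and cited, with one observation per citing-cited pair.

```stata
* Toy data (hypothetical values): patents 1 and 2 cite the same
* set {10, 20}; patent 3 cites only {10}.
clear
input citing cited
1 10
1 20
2 10
2 20
3 10
end

* Within each citing patent, sort by cited and number the rows 1, 2, ...
bysort citing (cited) : gen j = _n

* Go wide: one observation per citing patent, variables cited1, cited2
reshape wide cited, i(citing) j(j)

* Observations with identical values of all cited* variables form a group;
* _N is the group size, so _N - 1 counts the other patents in the group
bysort cited* (citing) : gen counter = _N - 1

list citing cited* counter
```

On this toy data counter should be 1 for patents 1 and 2 and 0 for patent 3. Because the cited patents are sorted within each citing patent before the reshape, two patents with the same set of cited patents end up with identical values of cited1, cited2, and so on, and no string concatenation is needed.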
Nick