Not quite.
-egen, count()- does not count distinct values. It counts non-missing
values, regardless of whether they are the same or different.
See the references in my previous post in this thread.
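
A minimal sketch of the difference, on toy data modelled on the example in this thread. The variable names n_obs, tag and n_gpi are illustrative, and storing gpi as a double is an assumption about the data. Here -egen, count()- reports 3 for ndc 00904161260 because one ndc-gpi pair appears twice, while tagging one observation per distinct pair and totalling the tags gives 2.

```stata
* Toy data: ndc kept as a string so leading zeros survive;
* gpi held as a double (an assumption; 14 digits fit exactly)
clear
input str11 ndc double gpi
"00904161260" 44300040007440
"00904161260" 44300040007440
"00904161260" 44300040007441
"49502007412" 44100010102005
end
format gpi %14.0f

* count() tallies non-missing observations of gpi within ndc
egen n_obs = count(gpi), by(ndc)

* tag() marks exactly one observation per distinct ndc-gpi pair
egen tag = tag(ndc gpi)

* summing the tags within ndc gives the number of distinct gpi values
egen n_gpi = total(tag), by(ndc)

list ndc gpi n_obs n_gpi, sepby(ndc)
```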
Nick
Steven Samuels
I think that Daniel's second example would be:
> Ndc: 00904161260 gpi: 44300040007440
> 00904161260 gpi: 44300040007441
and that he wants to list instances where the same NDC has two or more
GPIs.
If so:
bysort ndc (gpi): gen count=_N // counts number of observations with the same ndc
list ndc old_ndc e49_ gpi g64_ if count>1
OR
egen count= count(gpi), by(ndc)
list ndc old_ndc e49_ gpi g64_ if count>1
If the same ndc gpi combination is listed more than once in the data,
Daniel should first:
bysort gpi ndc: keep if _n==1
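
Put together, the steps above might look like the following sketch. It runs on toy data rather than Daniel's file: the name n_gpi is illustrative, and gpi stored as a double is an assumption. Once the repeated ndc-gpi pairs are dropped, _N within ndc is the number of different gpi values for that ndc.

```stata
* Toy data: the third gpi value is the supposed one from this post
clear
input str11 ndc double gpi
"00904161260" 44300040007440
"00904161260" 44300040007440
"00904161260" 44300040007441
"49502007412" 44100010102005
end
format gpi %14.0f

* Keep a single observation for each ndc-gpi combination
bysort gpi ndc: keep if _n == 1

* With duplicates gone, _N within ndc counts distinct gpi values
bysort ndc (gpi): gen n_gpi = _N

* List only the ndc codes that map to two or more gpi values
list ndc gpi n_gpi if n_gpi > 1, sepby(ndc)
```

Because -keep- discards observations, it may be safer to wrap this in -preserve- and -restore-, or to work on a copy of the dataset.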
On Aug 4, 2008, at 11:51 AM, Daniel Sepulveda-Adams wrote:
> OK here is an example of what happened in the dataset, this dataset is a
> drugs description, therefore codes are associated to names and strength the
> drug and packages size, etc.
>
> Ex:
> 1) ndc (National Drug Code) assigned one gpi (Generic Product Identifier)
>
> ndc: 49502007412 gpi: 44100010102005
>
> 2) ndc that has more than one gpi (9 in this specific case, why because each
> gpi represent a different packages size)
>
> Ndc: 00904161260 gpi: 44300040007440
>
> Please let me know if that clarified my question and thank you for your
> help.