Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: How do you select and describe a single variable of interest from a merged dataset while avoiding duplication (due to the merge)?
From: Gwinyai Masukume <[email protected]>
To: [email protected]
Date: Thu, 4 Apr 2013 18:39:47 +0200
Dear Stata list,
I have a single dataset obtained by merging two related datasets
taken from a relational database. For example, the first dataset
holds patients and the second holds their hospital visits. A single
patient can have multiple hospital visits, so the merged dataset has
many observations for a single patient.
In my merged dataset, I would like to analyze, say, patient age
(assuming it’s fixed for that patient regardless of the number of
visits). Because a patient has the same age on each of their
hospital visits, a command like “sum Age” counts that age once per
visit and so reports too many observations for age (duplication).
Each patient has a unique ID (identification number).
How do I issue a command that counts age only once for each unique
patient ID and then summarizes this information?
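
One way this could be done, as a minimal sketch rather than a definitive answer: flag a single observation per patient with egen's tag() function and restrict the summary to the flagged observations. The variable names patient_id and visit, the flag name first, and the example data are hypothetical; only Age comes from the post. The sketch also checks the stated assumption that Age does not vary within a patient.

```stata
* Hypothetical example data: one observation per hospital visit
clear
input patient_id Age visit
1 34 1
1 34 2
1 34 3
2 51 1
3 28 1
3 28 2
end

* Check the assumption that Age is constant within each patient;
* the do-file stops with an error here if it is not
bysort patient_id (Age): assert Age[1] == Age[_N]

* Flag exactly one observation per patient (first = 1 once per patient_id)
egen byte first = tag(patient_id)

* Summarize Age over the flagged observations only
summarize Age if first
```

With this example data, summarize reports 3 observations (one per patient) instead of 6, and no observations are dropped from the merged dataset.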
I have tried using the duplicates command to drop the other hospital
visits and keep one visit per patient, then taking, say, patient age
from the remaining observations to avoid the duplication mentioned
above.
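
The duplicates approach described here might look like the following minimal sketch. It assumes hypothetical variable names patient_id and visit and made-up example data (only Age comes from the post), and it wraps the deletion in preserve and restore so that the visit-level observations come back afterwards.

```stata
* Hypothetical example data: one observation per hospital visit
clear
input patient_id Age visit
1 34 1
1 34 2
1 34 3
2 51 1
3 28 1
3 28 2
end

* Work on a temporary copy so the visit-level data are not lost
preserve

* Keep the first observation for each patient_id and drop the rest;
* force is required because a varlist is specified
duplicates drop patient_id, force

* Age is now summarized over one observation per patient
summarize Age

* Bring back the full merged dataset
restore
```

This gives a sensible answer only if Age really is identical across a patient's visits, because duplicates drop keeps whichever observation comes first for each patient_id.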
Thanks for your consideration
Kind regards,
Gwinyai