Rodrigo Briceño
>
> I have a database with incomes and a database with socioeconomic
> characteristics of individuals. The income database has 9,000 records
> and the socioeconomic one has 19,000. I merged the two databases using
> some unique identifiers common to both.
>
> With the new database I need to construct a new variable (income
> deciles), and for this I need the total income for the household
> (assigned to each individual of the household). The problem is that
> not every observation in the new database corresponding to the same
> household has income observations in the merged database (these could
> be children, people with no reported income, etc.).
>
> I have already thought of a procedure to get my statistical analysis
> right: add up the incomes for each household, since each household
> could have several sources of income (salaries, transfers, etc.).
> First of all, I don't know how to add those values (into a new
> variable independent of the source, I guess), nor how to assign the
> sum (for each household) to each of the members of the household. I
> need this to make the deciles. Can somebody help me with the commands
> or the steps that I need to follow?
>
> Example:
> HH  Member  Age  Sex  SourceIncome  TotalIncome
>  1       1   37    1            11         1500
>  1       1   42    2            11         3000
>  1       1   42    2            53          400
>  1       2   14    2             .            .
>  1       2   25    2             .            .
>
> You can find identical values in the variable "member" because it is
> possible that each household is composed of several families.
>
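The total income in a household is

. bysort hh : egen hhincome = sum(total_income)

(In more recent versions of Stata this -egen- function is
documented as total(); sum() is the older name.)

This treats all missings as 0, so a household with no
reported income at all gets a total of 0, not missing.
Short of some elaborate imputation exercise, this appears
to be the only thing you can do.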
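
As a minimal sketch of this step, here are the example data from the
question typed in with -input-. The variable names are assumptions
(substitute your own), household 2 is a made-up row added only to show
a household with no reported income, and the sketch uses total(), the
current name of the -egen- function. The second -egen- call is a
variant, not part of the suggestion above: with the missing option,
a household whose incomes are all missing gets a missing total
instead of 0.

```stata
clear
input hh member age sex sourceincome total_income
1 1 37 1 11 1500
1 1 42 2 11 3000
1 1 42 2 53  400
1 2 14 2  .    .
1 2 25 2  .    .
2 1 50 1  .    .
end

* household total, copied to every observation in the household;
* missing incomes contribute 0
bysort hh : egen hhincome = total(total_income)

* variant: leave the total missing when every income in the
* household is missing
bysort hh : egen hhincome2 = total(total_income), missing

list hh member total_income hhincome hhincome2, sepby(hh)
```

Which of the two is appropriate depends on whether "no reported
income" should be read as zero income or as unknown income.
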
When you say you want deciles, I guess that you
want a grouping into 10 groups using -xtile-. One
way to do this is to use just one observation from
each household, and then to spread the result
to all observations in that household.

. egen tag = tag(hh)
. xtile dechhincome = hhincome if tag, nq(10)
. bysort hh (dechhincome) : replace dechhincome = dechhincome[1]

The last line works because missing values sort last:
within each household the one nonmissing decile, from
the tagged observation, comes first.
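
The three commands can be tried on made-up data. The following is only
a sketch under assumptions: 20 hypothetical households with distinct
totals and three observations each, so that each household counts once
when the deciles are computed.

```stata
clear
set obs 20
gen hh = _n
* hypothetical household totals, all distinct
gen hhincome = 1000 * _n
* three observations per household
expand 3
sort hh

* tag exactly one observation per household
egen tag = tag(hh)

* deciles computed over households, not over individuals
xtile dechhincome = hhincome if tag, nq(10)

* copy the decile to the other observations in the household
bysort hh (dechhincome) : replace dechhincome = dechhincome[1]

* check: the decile is constant within household and never missing
by hh : assert dechhincome == dechhincome[1]
assert !missing(dechhincome)

tabulate dechhincome if tag
```

Without the tag, -xtile- would compute the deciles over observations,
so that larger households would count more often than smaller ones.
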
One magic word here is -by:-. For manipulations involving
groups, -by:- is invaluable. Use the manual index
to see various sections on -by:-. Alternatively,
there is an overall tutorial in the Stata Journal:

How to move step by: step. Stata Journal 2(1): 86-102 (2002)
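
As a small illustration of the kind of thing -by:- does, here is a
sketch with made-up data; all variable names are hypothetical. Under
-by:-, _n and _N count within each group, not over the whole dataset.

```stata
clear
input hh member income
1 1 1500
1 2    .
2 1  800
2 2  200
2 3    .
end

* number of observations in each household
bysort hh : gen nobs = _N

* flag the first observation in each household
bysort hh (member) : gen first = _n == 1

* running sum of income within household (missing counts as 0)
bysort hh (member) : gen runinc = sum(income)

list, sepby(hh)
```
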
Nick