Nick Winter
> > I have a dataset which has customers' payment amounts by month and
> > by year. Month ranges from 01 to 12 for years 2000 & 2001 and from
> > 01 to 09 for 2002. But not all customers have data for each month.
> > The dataset looks like the following.
> >
> > customer month year amount
> > x1 01 2001 50.45
> > x1 03 2001 60.00
> > x2 04 2001 70.00
> > x2 06 2001 80.00
> >
> > I would like to create a data set where each customer will have 12
> > observations for years 2000 & 2001 and 9 obs. for 2002, and amount
> > will be zero for the months they don't have any original data. I
> > tried a couple of different ways, but they didn't work. Could
> > anyone please help me?
>
> Check out the -reshape- command. The only catch is that -reshape-
> only seems to allow one variable to specify the j() units (the month
> and year in your case), so you will need to combine them into one
> variable. One option:
>
> gen str7 date = string(year) + "_" + string(month)
> drop month year
> reshape wide amount, i(customer) j(date) string
This needs to be followed by a -reshape long-, after which the
missing amounts can be replaced by zero.
In principle, the solution will not fill in any gaps
for the whole data set, i.e. months in which no
customer made purchases.
In practice, the enterprise is, we hope, not
in such a dire state that there are any such gaps.
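
A minimal sketch of the two steps together, for anyone who wants to
try it on the four example observations. It is a variant of the code
quoted above, not the same code: it assumes month and year are numeric
and combines them into a numeric variable ym (a name made up for this
example) rather than a string. It also assumes amount is never missing
in the original data.

```stata
* Sketch only: assumes numeric month and year, and no missing amounts
* in the original data. ym is a hypothetical helper variable.
clear
input str2 customer month year amount
"x1" 1 2001 50.45
"x1" 3 2001 60.00
"x2" 4 2001 70.00
"x2" 6 2001 80.00
end

* Combine year and month into a single j() variable, e.g. 200103
gen long ym = year*100 + month
drop month year

* Wide: one observation per customer, one amount variable per month
reshape wide amount, i(customer) j(ym)

* Long again: every customer now has every month present in the data
reshape long amount, i(customer) j(ym)

* Months a customer did not have come back as missing; set them to 0
replace amount = 0 if missing(amount)

* Recover separate year and month variables
gen year  = floor(ym/100)
gen month = mod(ym, 100)
drop ym

list, sepby(customer)
```

With these four observations each customer ends up with only the four
months that occur somewhere in the data (01, 03, 04 and 06 of 2001);
months 02 and 05 remain absent, which is the gap described above.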
Nick