Dear Statalisters,
I would appreciate help with the following panel data manipulation
problem. I have data of the form:
seller year product buyer volume
...
100 1990 1111 200 100
100 1991 1111 200 200
100 1992 1111 200 250
100 1990 2222 200 .
100 1991 2222 200 80
100 1992 2222 200 83
100 1990 2222 500 100
100 1991 2222 500 125
100 1992 2222 500 200
...
The dot in volume means sales volume in that year was zero. I would like
summarize the data by answering two questions. First, each seller sells
many products to many buyers. Let "firstyear" be the first year a given
seller sells anything to a given buyer. In the example above, for the
seller-buyer pair 100-200, "firstyear" would be 1990. Then for every other
product 100 sells to 200 in subsequent year, I want to calculate the gap
between year of first sale and "firstyear". I would like to generate a
table of the form:
seller product buyer timegap firstyear
...
100 1111 200 0 1990
100 2222 200 1 1991
100 2222 300 0 1990
...
Second, I would like to ask: for each seller, how much of the increase in
total sales of each product is driven by changes in sales to existing
buyers or sales to new buyers. I would like to generate a table of the
form:
seller product year change same new
...
100 1111 1991 100 100 0
100 2222 1991 105 25 80
...
Note that from 1990 to 1991, the total sales of 2222 increased by 105, of
which 25 came from increasing sales to an existing customer 500, and the
rest 80 came from selling to a new customer 200.
One complication is that sometimes a seller will make sales with a given
buyer in one year, make zero sales the next year, and generate positive
sales another year. In the case of this last year, I would like to count
the sales as coming from "same" customer.
I would very much appreciate your help. Thanks very much.
Jason
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/