Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: could you please verify the correctness of the code?-tsfill function
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: could you please verify the correctness of the code?-tsfill function
Date: Wed, 6 Jun 2012 08:55:45 +0100

The flavour of this is now not about Stata but about whether you are
making the right substantive decision on what to do with your data. I
am not an economist and not familiar with the pertinent literature,
but you cannot be the first person to be faced with this problem that
prices in different places are observed on different dates. So, what
else do people do?
Your solution is Procrustean in forcing dates on to a grid. Daily
dates that round to the same floor(date/28) could be up to 27 days
apart, so with what you do I think you need to calculate the error
(real date - gridded date) and talk about its distribution.
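
A minimal sketch of that error calculation, using a few of the rows quoted below. The names griddate and griderr are illustrative, and taking the first day of each 28-day bin as the "gridded date" is an assumption; the middle of the bin would be an equally defensible reference point.

```stata
clear
input id str8 dates price
1 "23/11/08" 2
1 "28/12/08" 3
1 "25/01/09" 4
2 "26/10/08" 45
2 "23/11/08" 46
2 "21/12/08" 90
end

* daily date, with two-digit years read as 20xx (same mask as the quoted code)
gen edate1 = date(dates, "DM20Y")
format edate1 %td

* 28-day bin index, exactly as in the quoted code
gen edate2 = floor(edate1/28)

* assumed reference point: the first day of each bin
gen griddate = 28 * edate2
format griddate %td

* error = real date - gridded date, between 0 and 27 days by construction
gen griderr = edate1 - griddate

* distribution of the error, overall and by panel
summarize griderr, detail
tabulate griderr id
```

The -summarize- and -tabulate- output is what would go into a report; a wide spread in griderr means observations sharing a bin can be several weeks apart.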
One alternative is to interpolate the prices to a shared set of dates.
Another is to take what you have and calculate monthly average prices
and also report how many prices those averages are based on. You will
still have gaps and may well need to interpolate too.
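
The monthly-average route might look like the sketch below, run on the example data quoted further down. The names mdate, nprice and price_ip are illustrative, and linear interpolation with -ipolate- is only one assumed way of filling gaps, not a recommendation.

```stata
clear
input id str8 dates price
1 "23/11/08" 2
1 "28/12/08" 3
1 "25/01/09" 4
1 "22/02/09" 5
1 "29/03/09" 6
1 "26/04/09" 32
1 "24/05/09" 23
1 "28/06/09" 32
2 "26/10/08" 45
2 "23/11/08" 46
2 "21/12/08" 90
2 "18/01/09" 54
2 "15/02/09" 65
2 "16/03/09" 77
2 "12/04/09" 7
2 "10/05/09" 6
end

* daily date, then the calendar month it falls in
gen edate = date(dates, "DM20Y")
gen mdate = mofd(edate)
format mdate %tm

* one row per country and month: mean price and number of prices behind it
collapse (mean) price (count) nprice = price, by(id mdate)

* declare the panel and add rows for any month missing within a country
tsset id mdate
tsfill
replace nprice = 0 if missing(nprice)

* linear interpolation of the monthly mean across any filled-in months
bysort id (mdate): ipolate price mdate, gen(price_ip)

list, sepby(id)
```

In this small example every month happens to contain exactly one price, so -tsfill- and -ipolate- have nothing to do. With a longer series, a 35-day jump can skip a calendar month and a 28-day spacing can put two prices in one month, which is where nprice and the interpolated values become informative.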
As I've said many times on this list, Statalist may not serve well the
expectations of those list members who want to be told how best to
analyse their data. I keep wondering whether the answer given to thesis
examiners, committee members or paper reviewers who ask "Why did you do
that?" is going to be "Oh, that was what was recommended on an internet
discussion list by one person who answered my question".
Nick
On Wed, Jun 6, 2012 at 2:14 AM, stef salvez <[email protected]> wrote:
> thank you Nick. I really appreciate your help and your patience.
> Let me be more explicit this time
>
>
> I have a panel data set of prices of goods that vary across time and countries.
>
> As you can see from the table below
>
> country   dates        price of good k
> 1         "23/11/08"   2
> 1         "28/12/08"   3
> 1         "25/01/09"   4
> 1         "22/02/09"   5
> 1         "29/03/09"   6
> 1         "26/04/09"   32
> 1         "24/05/09"   23
> 1         "28/06/09"   32
> 2         "26/10/08"   45
> 2         "23/11/08"   46
> 2         "21/12/08"   90
> 2         "18/01/09"   54
> 2         "15/02/09"   65
> 2         "16/03/09"   77
> 2         "12/04/09"   7
> 2         "10/05/09"   6
>
> the start and end dates of the time series for countries 1 and 2 are
> different. For example, for country 1 the time series begins on
> "23/11/08" while for country 2 the time series begins on
> "26/10/08".
>
> My data on prices are available every 28 days (or equivalently every 4
> weeks). But in some cases I have jumps (35 days or 29 days instead of
> 28 days). For example, the above table has such jumps: from
> "23/11/08" to "28/12/08" (35 days), from "22/02/09" to
> "29/03/09" (35 days), from "15/02/09" to "16/03/09" (29 days), etc.
>
> My goal is to have, as far as possible, the same sequence of dates
> across countries, which is a bit difficult because of the two
> "problems" that I mentioned above. I want the same sequence of
> dates across countries because eventually I want to see how
> the difference in prices of, say, good k between two countries
> evolves over time. So I want to set up the following regression
>
> ΔP_{ij,t}^{k} = constant + regressors + error term
>
> where ΔP_{ij,t}^{k} is the difference of prices between countries i
> and j in period t for good k. ΔP_{t}^{k} is the vector of differences
> of prices for all pairs of countries at time t for good k.
> The whole point is to be able to run the above regression.
>
> My initial idea was to use -tsfill- in the code which I display below
> (and which can be easily reproduced by copying and pasting into Stata):
>
> clear all
> cd D:\
> input id str8 (dates) variable
> 1 "23/11/08" 2
> 1 "28/12/08" 3
> 1 "25/01/09" 4
> 1 "22/02/09" 5
> 1 "29/03/09" 6
> 1 "26/04/09" 32
> 1 "24/05/09" 23
> 1 "28/06/09" 32
> 2 "26/10/08" 45
> 2 "23/11/08" 46
> 2 "21/12/08" 90
> 2 "18/01/09" 54
> 2 "15/02/09" 65
> 2 "16/03/09" 77
> 2 "12/04/09" 7
> 2 "10/05/09" 6
> end
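>
> * daily date; two-digit years are read as 20xx
> gen edate1 = date(dates, "DM20Y")
> * index of the 28-day bin that contains each date
> gen edate2 = floor(edate1/28)
> * declare the panel (id) and time (28-day bin) variables
> tsset id edate2
> * add rows for bins with no observation within each id
> tsfill
>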
> But I do not know whether this approach is a correct way to prepare
> the data for the above regression. Apart from -tsfill- I have no other
> idea how to set it up. Any suggestions or code are welcome.
>