I'm dealing with call center data with several overlapping cyclical trends, known movable events, and general trends. My forecasting depends on the previous years, among other factors, and I have this annoying skew in the data because the 2010 and 2009 weeks don't align.
I noticed my local business paper (Dagens Næringsliv) showing the same problem when comparing power-price commodities from last year to this year: the graph looked volatile in the week-by-week comparison, but shifting one of the datasets showed that the two years were positively correlated. The same goes for this example from nve.no showing water reserves, where I think the red 2010 data should be closer to the blue (2009) and the black median, and an ISO week artifact is skewing the data.
Are there any best practices for comparing two years?
I would have liked to have a day-by-day alignment so that I could show the previous year, then my forecast, then the actual result after the fact.
(Also, R seems to fight me all the way on this one, which is often an indicator that I really should do some reading.)
Discovered some interesting tools from the US Census Bureau, which might get me closer to a complete answer.
I will take a stab at this. Go by fiscal week number? I'm sure you've already explored this, and it is ultimately, like a lot of date math, a matter of opinion or some arbitrary demarcation, like, say, the date of the Easter holiday. My suggestion would be to pick a fixed point in the past as Day 0 and number each day consecutively from that date, then compare, say, days 200-300 to days 600-700 (those ranges are obviously not the right ones, but the idea is to factor out the arbitrary nature of weeks, months, and years). This is how dates are calculated in a lot of systems that use epoch notation: the Unix epoch is famously January 1st, 1970, DOS counts from January 1st, 1980, and I think Lotus 1-2-3 had an even weirder date. Also, since about every answer I have comes back to Python somehow, the Python "datetime" module is the best date math paradigm I've seen. That module alone makes it worth at least trying to reformat your data with Python and then finishing in R. Hope this helps!
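To make the Day 0 idea concrete, here is a minimal sketch using only the standard `datetime` module. The epoch of January 1st, 2009 and the helper names `day_number` and `same_weekday_last_year` are my own assumptions for illustration. The 364-day shift is just one possible alignment: 364 is 52 whole weeks, so it keeps weekdays lined up, and you may prefer another offset.

```python
from datetime import date, timedelta

# Assumed Day 0 for this sketch; pick whatever fixed point suits your data.
EPOCH = date(2009, 1, 1)


def day_number(d, epoch=EPOCH):
    """Number of days elapsed since the chosen Day 0."""
    return (d - epoch).days


def same_weekday_last_year(d):
    """Date 52 whole weeks (364 days) earlier, so the weekday is unchanged."""
    return d - timedelta(days=364)


today = date(2010, 7, 8)
last_year = same_weekday_last_year(today)

# Both dates fall on the same weekday, so weekday effects line up.
print(today, day_number(today), today.strftime("%A"))
print(last_year, day_number(last_year), last_year.strftime("%A"))

# The ISO week artifact: January 1st, 2010 belongs to ISO week 53 of 2009.
print(tuple(date(2010, 1, 1).isocalendar()))
```

With day numbers in hand, comparing two years becomes a matter of subtracting a fixed offset instead of matching week labels.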
answered Jul 08 '10 at 16:44
You could try an autocorrelation plot (in R, see
answered Jul 13 '10 at 08:19