AI, Python, Cognitive Neuroscience
What is a Time Series?

Many data sets are cross-sectional and represent a single slice of time. However, we also have data collected over many periods - weekly sales data, for instance. This is an example of time series data. Time series analysis is a specialized branch of statistics used extensively in fields such as Econometrics and Operations Research. Unfortunately, most Marketing Researchers and Data Scientists have had little exposure to it. As we'll see, it has many very important applications for marketers.

Just to get our terms straight, below is a simple illustration of what a time series data file looks like. The column labeled DATE is the date variable and plays a role analogous to the respondent ID in survey research data. WEEK, the sequence number of each week, is included because using this column rather than the actual dates can make graphs less cluttered. The sequence number can also serve as a trend variable in certain kinds of time series models.
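
To make the layout concrete, here is a minimal pandas sketch of what such a file might look like (the column names DATE, WEEK and SALES and the figures are made up for illustration):

```python
import pandas as pd

# Hypothetical weekly sales file: one row per week
dates = pd.date_range(start="2019-01-07", periods=8, freq="W-MON")
df = pd.DataFrame({
    "DATE": dates,                                      # calendar date of each week
    "WEEK": range(1, len(dates) + 1),                   # sequence number; useful for plots and trend terms
    "SALES": [120, 118, 125, 131, 129, 140, 137, 142],  # made-up sales figures
})
print(df)
```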

I should note that the unit of analysis doesn't have to be brands and can include individual consumers or groups of consumers whose behavior is followed over time.

But first, why do we need to distinguish between cross-sectional and time series analysis? For several reasons, one being that our research objectives will usually be different. Another is that most statistical methods we learn in college and make use of in marketing research are intended for cross-sectional data, and if we apply them to time series data the results we obtain may be misleading. Time is a dimension in the data we need to take into account.

Time series analysis is a complex subject but, in short, when we use our usual cross-sectional techniques such as regression on time series data:
1- Standard errors can be far off. More often than not, p-values will be too small and variables can appear "more significant" than they really are;
2- In some cases regression coefficients can be seriously biased; and
3- We are not taking advantage of the information the serial correlation in the data provides.

Univariate Analysis
To return to our example data, one objective might be to forecast sales for our brand. There are many ways to do this, and the most straightforward is univariate analysis, in which we essentially extrapolate future data from past data. Two popular univariate time series methods are Exponential Smoothing (e.g., Holt-Winters) and ARIMA (Autoregressive Integrated Moving Average).
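
As a rough illustration, and assuming the statsmodels package and a simulated weekly sales series, both methods can be fit in a few lines of Python:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.tsa.arima.model import ARIMA

# Simulated weekly sales with trend and yearly seasonality (3 years of data)
rng = np.random.default_rng(0)
t = np.arange(156)
weeks = pd.date_range("2017-01-08", periods=156, freq="W")
sales = pd.Series(200 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 52) + rng.normal(0, 3, 156),
                  index=weeks)

# Holt-Winters exponential smoothing with additive trend and seasonality
hw = ExponentialSmoothing(sales, trend="add", seasonal="add", seasonal_periods=52).fit()
print(hw.forecast(12))      # 12 weeks ahead

# A simple ARIMA(1,1,1) fit to the same series
arima = ARIMA(sales, order=(1, 1, 1)).fit()
print(arima.forecast(12))
```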

Causal Modeling

Obviously, there are risks in assuming the future will be like the past but, fortunately, we can also include "causal" (predictor) variables to help mitigate these risks. Besides improving the accuracy of our forecasts, another objective may be to understand which marketing activities most influence sales.

Causal variables will typically include data such as GRPs and price and also may incorporate data from consumer surveys or exogenous variables such as GDP. These kinds of analyses are called Market Response or Marketing Mix modeling and are a central component of ROMI (Return on Marketing Investment) analysis. They can be thought of as key driver analysis for time series data. The findings are often used in simulations to try to find the "optimal" marketing mix.

Transfer Function Models, ARMAX and Dynamic Regression are terms that refer to specialized regression procedures developed for time series data. There are also more sophisticated methods, and I'll touch on a few of them shortly.
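
One common way to fit a dynamic regression in Python is regression with ARMA errors via statsmodels' SARIMAX. The sketch below uses simulated price and GRP data with hypothetical variable names, so treat it as illustrative rather than a recommended specification:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Simulated weekly sales driven by price and GRPs (variable names are hypothetical)
rng = np.random.default_rng(1)
n = 104
exog = pd.DataFrame({
    "price": 10 + rng.normal(0, 0.5, n),
    "grps": rng.gamma(2.0, 50.0, n),
})
sales = 500 - 20 * exog["price"] + 0.3 * exog["grps"] + rng.normal(0, 5, n)

# Regression on the causal variables with ARMA(1,1) errors
res = SARIMAX(sales, exog=exog, order=(1, 0, 1)).fit(disp=False)
print(res.summary())
```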

Part 1

❇️ @AI_Python_EN
What is a Time Series?

Multiple Time Series

You might need to analyze multiple time series simultaneously, e.g., sales of your brands and key competitors. Figure 2 below is an example and shows weekly sales data for three brands over a one-year period. Since sales movements of brands competing with each other will typically be correlated over time, it often will make sense, and be more statistically rigorous, to include data for all key brands in one model instead of running separate models for each brand.

Vector Autoregression (VAR), the Vector Error Correction Model (VECM) and the more general State Space framework are three frequently-used approaches to multiple time series analysis. Causal data can be included and Market Response/Marketing Mix modeling conducted.
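
Assuming statsmodels and simulated data for three brands, a basic VAR fit might look like this sketch:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Simulated weekly sales for three competing brands (52 weeks)
rng = np.random.default_rng(2)
data = pd.DataFrame(rng.normal(0, 1, size=(52, 3)).cumsum(axis=0) + 100,
                    columns=["brand_a", "brand_b", "brand_c"])

var = VAR(data)
res = var.fit(4)                                          # VAR with 4 lags
print(res.summary())
fcast = res.forecast(data.values[-res.k_ar:], steps=8)    # joint 8-week-ahead forecast
print(fcast)
```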

Other Methods

There are several additional methods relevant to marketing research and data science I'll now briefly describe.

Panel Models include cross sections in a time series analysis. Sales and marketing data for several brands, for instance, can be stacked on top of one another and analyzed simultaneously. Panel modeling permits category-level analysis and also comes in handy when data are infrequent (e.g., monthly or quarterly).
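
For illustration, here is a sketch of a fixed-effects panel regression using the linearmodels package on simulated brand-by-week data; the variable names and specification are hypothetical, not a recommended model:

```python
import numpy as np
import pandas as pd
from linearmodels.panel import PanelOLS

# Simulated stacked data: four brands observed over 104 weeks
rng = np.random.default_rng(3)
idx = pd.MultiIndex.from_product([["A", "B", "C", "D"], range(1, 105)],
                                 names=["brand", "week"])
df = pd.DataFrame({"price": rng.normal(10, 1, len(idx)),
                   "grps": rng.gamma(2.0, 50.0, len(idx))}, index=idx)
df["sales"] = 300 - 15 * df["price"] + 0.2 * df["grps"] + rng.normal(0, 5, len(idx))

# Fixed-effects panel regression: brand-specific intercepts, clustered standard errors
mod = PanelOLS.from_formula("sales ~ 1 + price + grps + EntityEffects", data=df)
print(mod.fit(cov_type="clustered", cluster_entity=True))
```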

Longitudinal Analysis is a generic and sometimes confusingly used term that can refer to Panel modeling with a small number of periods ("short panels"), as well as to Repeated Measures, Growth Curve Analysis or Multilevel Analysis. In a literal sense it subsumes time series analysis, but many authorities reserve that term for analysis of data with many time periods (e.g., >25). Structural Equation Modeling (SEM) is one method widely used in Growth Curve modeling and other longitudinal analyses.

Survival Analysis is a branch of #statistics for analyzing the expected length of time until one or more events happen, such as death in biological organisms and failure in mechanical systems. It's also called Duration Analysis in Economics and Event History Analysis in Sociology. It is often used in customer churn analysis.
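
As a quick, hedged sketch using the lifelines package and simulated churn data (the column names are hypothetical), a Kaplan-Meier curve and a Cox proportional hazards model might be fit like this:

```python
import numpy as np
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter

# Simulated churn data: months as a customer, whether churn was observed, two covariates
rng = np.random.default_rng(4)
n = 500
df = pd.DataFrame({
    "tenure_months": rng.exponential(24, n).round() + 1,
    "churned": rng.integers(0, 2, n),
    "monthly_spend": rng.normal(50, 10, n),
    "on_contract": rng.integers(0, 2, n),
})

# Kaplan-Meier retention curve
kmf = KaplanMeierFitter().fit(df["tenure_months"], event_observed=df["churned"])
print(kmf.median_survival_time_)

# Cox proportional hazards model relating the covariates to the churn hazard
cph = CoxPHFitter().fit(df, duration_col="tenure_months", event_col="churned")
cph.print_summary()
```
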
In some instances one model will not fit an entire series well because of structural changes within the series, and model parameters will vary across time. There are numerous breakpoint tests and models (e.g., State Space, Switching Regression) available for these circumstances.
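
For example, assuming the ruptures package, a simple change-point search on a simulated series with one level shift might look like this sketch:

```python
import numpy as np
import ruptures as rpt

# Simulated series with a shift in level partway through
rng = np.random.default_rng(5)
signal = np.concatenate([rng.normal(100, 5, 60), rng.normal(130, 5, 44)])

# PELT change-point detection; the penalty controls how many breaks are reported
algo = rpt.Pelt(model="rbf").fit(signal)
print(algo.predict(pen=10))   # indices where segments end, e.g. [60, 104]
```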

You may also notice that sales, call center activity or other data series you are tracking exhibit clusters of volatility. That is, there may be periods in which the figures move up and down in much more extreme fashion than other periods.

In these cases, you should consider a class of models with the forbidding name of GARCH (Generalized Autoregressive Conditional Heteroskedasticity). ARCH and GARCH models were originally developed for financial markets but can be used for other kinds of time series data when volatility is of interest. Volatility can fall into many patterns and, accordingly, there are many flavors of GARCH models. Causal variables can be included. There are also multivariate extensions (MGARCH) if you have two or more series you wish to analyze jointly.
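
Assuming the arch package, a minimal GARCH(1,1) sketch looks roughly like this (plain white noise is used only so the example runs; real data would show volatility clustering):

```python
import numpy as np
import pandas as pd
from arch import arch_model

# White-noise "returns" just so the sketch runs
rng = np.random.default_rng(6)
returns = pd.Series(rng.normal(0, 1, 1000))

# AR(1) mean equation with a GARCH(1,1) conditional variance
am = arch_model(returns, mean="AR", lags=1, vol="GARCH", p=1, q=1)
res = am.fit(disp="off")
print(res.summary())
print(res.conditional_volatility[-5:])   # fitted volatility for the last few observations
```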

Non-Parametric Econometrics is a very different approach to studying time series and longitudinal data that is now receiving a lot of attention because of #bigdata and the greater computing power we now enjoy. These methods are increasingly feasible and useful as alternatives to the more familiar methods such as those described in this article.

#MachineLearning (e.g., #ArtificialNeuralNetworks) is also useful in some circumstances, but the results can be hard to interpret - they predict well but may not help us understand the mechanism that generated the data (the Why). To some extent, this drawback also applies to non-parametric techniques.

Most of the methods I've mentioned are Time Domain techniques. Another group of methods, known as Frequency Domain techniques, plays a more limited role in Marketing Research.

❇️ @AI_Python_EN
New tutorial! Traffic Sign Classification with #Keras and #TensorFlow 2.0

- 95% accurate
- Includes pre-trained model
- Full tutorial w/ #Python code

http://pyimg.co/5wzc5

#DeepLearning #MachineLearning #ArtificialIntelligence #DataScience #AI #computervision

❇️ @AI_Python_EN
"Differentiable Convex Optimization Layers"

CVXPY creates powerful new PyTorch and TensorFlow layers

Agrawal et al.: https://locuslab.github.io/2019-10-28-cvxpylayers/

#PyTorch #TensorFlow #NeurIPS2019


❇️ @AI_Python_EN
Streamlit is a Python framework dedicated to deploying Machine Learning and Data Science models. If you are a data scientist who is struggling to showcase your work, I would encourage you to check it out.

The author has a nice article/tutorial on Medium:

#machinelearning #mlops
#datascience

Github link:
https://github.com/streamlit/streamlit
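
As a minimal sketch (the file name and column names are hypothetical), a small Streamlit app for exploring weekly sales might look like this:

```python
# app.py - run with: streamlit run app.py
import pandas as pd
import streamlit as st

st.title("Weekly sales explorer")

uploaded = st.file_uploader("Upload a CSV with a DATE column and one column per brand")
if uploaded is not None:
    df = pd.read_csv(uploaded, parse_dates=["DATE"]).set_index("DATE")
    brand = st.selectbox("Brand", df.columns)
    st.line_chart(df[brand])         # interactive chart of the selected series
    st.write(df[brand].describe())   # quick summary statistics
```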

❇️ @AI_Python_EN
Don't confuse creating a data product with deploying it. Streamlit makes creating a GUI-type data product easy for Python, much like shiny makes it easy for R.

Deployment is the next step: running it on a server, having security and user authentication, scaling it as required, updating the application when necessary, etc.

That's what the ownR platform does for shiny and dash already and will soon offer for Streamlit too. If you need to actually deploy your data products, especially in an enterprise environment, drop me an invite or a DM for a demo or a trial account!

❇️ @AI_Python_EN
Data science is not #MachineLearning .
Data science is not #statistics.
Data science is not analytics.
Data science is not #AI.

#DataScience is a process of:
Obtaining your data
Scrubbing / Cleaning your data
Exploring your data
Modeling your data
iNterpreting your data

Data Science is the science of extracting useful information from data using statistics, skills, experience and domain knowledge.

If you love data, you will like this role....

Solving business problems using data is data science. Machine learning, statistics or analytics may come in as part of the solution to a particular business problem. Sometimes we need all of them to solve a problem, and sometimes even a crosstab may be handy.

➡️ Get free resources at his site:
www.claoudml.com

❇️ @AI_Python_EN
What Are "Panel Models?"​ Part 1

In #statistics, the English is sometimes as hard as the math. Vocabulary is frequently used in confusing ways and often differs by discipline. "Panel" and "longitudinal" are two examples - economists tend to favor the first term, while researchers in most other fields use the second to mean essentially the same thing.

But to what "thing" do they refer? Say, for example, households, individual household members, companies or brands are selected and followed over time. Statisticians working in many fields, such as economics and psychology, have developed numerous techniques which allow us to study how these households, household members, companies or brands change over time, and investigate what might have caused these changes.

Marketing mix modeling conducted at the category level is one example that will be close to home for many marketing researchers. In a typical case, we might have four years of weekly sales and marketing data for 6-8 brands in a product or service category. These brands would comprise the panel. This type of modeling is also known as cross-sectional time-series analysis because there is an explicit time component in the modeling. It is just one kind of panel/longitudinal analysis.

Marketing researchers make extensive use of online panels for consumer surveys. Panelists are usually not surveyed on the same topic on different occasions though they can be, in which case we would have a panel selected from an online panel. Some MROCs (aka insights communities) also are large and can be analyzed with these methods.

The reference manual for the Stata statistical software provides an in-depth look at many of these methods, particularly those widely-used in econometrics. I should note that there is a methodological connection with mixed-effects models, which I have briefly summarized here. Mplus is another statistical package which is popular among researchers in psychology, education and healthcare, and its website is another good resource.
Longitudinal/panel modeling has featured in countless papers and conference presentations over the years and is also the subject of many books. Here are some books I have found helpful:

Analysis of Longitudinal Data (Diggle et al.)
Analysis of Panel Data (Hsiao)
Econometric Analysis of Panel Data (Baltagi)
Longitudinal Structural Equation Modeling (Newsom)
Growth Modeling (Grimm et al.)
Longitudinal Analysis (Hoffman)
Applied Longitudinal Data Analysis for Epidemiology (Twisk)

Many of these methods can also be performed within a Bayesian statistical framework.

❇️ @AI_Python_EN
What Are "Panel Models?" Part 2

Rather than having been displaced by big data, AI and machine learning, these techniques are more valuable than ever because we now have more longitudinal data than ever. Many longitudinal methods are computationally intensive and not suitable for massive data on ordinary computers, but parallel processing and cloud computing will often get around this. I also anticipate that more big data versions of their computational algorithms will be developed over the next few years. Here is one example.

For those new to the subject who would like a quick (if somewhat technical) start, I’ve included some edited entries from the Stata reference manual’s glossary below.

Any copy/paste and editing errors are mine.

Arellano–Bond estimator. The Arellano–Bond estimator is a generalized method of moments (GMM) estimator for linear dynamic panel-data models that uses lagged levels of the endogenous variables as well as first differences of the exogenous variables as instruments. The Arellano–Bond estimator removes the panel-specific heterogeneity by first-differencing the regression equation.

autoregressive process. In autoregressive processes, the current value of a variable is a linear function of its own past values and a white-noise error term.

balanced data. A longitudinal or panel dataset is said to be balanced if each panel has the same number of observations.

between estimator. The between estimator is a panel-data estimator that obtains its estimates by running OLS on the panel-level means of the variables. This estimator uses only the between-panel variation in the data to identify the parameters, ignoring any within-panel variation. For it to be consistent, the between estimator requires that the panel-level means of the regressors be uncorrelated with the panel-specific heterogeneity terms.

correlation structure. A correlation structure is a set of assumptions imposed on the within-panel variance–covariance matrix of the errors in a panel-data model.

cross-sectional data. Cross-sectional data refers to data collected over a set of individuals, such as households, firms, or countries sampled from a population at a given point in time.

cross-sectional time-series data. Cross-sectional time-series data is another name for panel data. The term cross-sectional time-series data is sometimes reserved for datasets in which a relatively small number of panels were observed over many periods. See also panel data.

disturbance term. The disturbance term encompasses any shocks that occur to the dependent variable that cannot be explained by the conditional (or deterministic) portion of the model.

dynamic model. A dynamic model is one in which prior values of the dependent variable or disturbance term affect the current value of the dependent variable.

endogenous variable. An endogenous variable is a regressor that is correlated with the unobservable error term. Equivalently, an endogenous variable is one whose values are determined by the equilibrium or outcome of a structural model.

exogenous variable. An exogenous variable is a regressor that is not correlated with any of the unobservable error terms in the model. Equivalently, an exogenous variable is one whose values change independently of the other variables in a structural model.

fixed-effects model. The fixed-effects model is a model for panel data in which the panel-specific errors are treated as fixed parameters. These parameters are panel-specific intercepts and therefore allow the conditional mean of the dependent variable to vary across panels. The linear fixed effects estimator is consistent, even if the regressors are correlated with the fixed effects.

generalized estimating equations (GEE). The method of generalized estimating equations is used to fit population-averaged panel-data models. GEE extends the GLM method by allowing the user to specify a variety of different within-panel correlation structures.
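
As a hedged illustration in Python rather than Stata, a population-averaged (GEE) model with an exchangeable within-panel correlation structure might be fit with statsmodels roughly like this, using simulated household purchase data with hypothetical variable names:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulated panel: purchase counts for 200 households over 10 waves
rng = np.random.default_rng(7)
n_households, n_waves = 200, 10
df = pd.DataFrame({
    "household_id": np.repeat(np.arange(n_households), n_waves),
    "price": rng.normal(5, 0.5, n_households * n_waves),
    "promo": rng.integers(0, 2, n_households * n_waves),
})
df["purchases"] = rng.poisson(np.exp(1.0 - 0.1 * df["price"] + 0.3 * df["promo"]))

# Population-averaged Poisson model with an exchangeable within-household correlation
gee = smf.gee("purchases ~ price + promo", groups="household_id", data=df,
              family=sm.families.Poisson(), cov_struct=sm.cov_struct.Exchangeable())
print(gee.fit().summary())
```
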
generalized linear model. The generalized linear model is an estimation framework in which the user specifies a distributional family for the dependent variable and a link function that relates the dependent variable to a linear combination of the regressors. The distribution must be a member of the exponential family of distributions. The generalized linear model encompasses many common models, including linear, probit, and Poisson regression.

idiosyncratic error term. In longitudinal or panel-data models, the idiosyncratic error term refers to the observation-specific zero-mean random-error term. It is analogous to the random-error term of cross-sectional regression analysis.

instrumental variables. Instrumental variables are exogenous variables that are correlated with one or more of the endogenous variables in a structural model. The term instrumental variable is often reserved for those exogenous variables that are not included as regressors in the model.

instrumental-variables (IV) estimator. An instrumental variables estimator uses instrumental variables to produce consistent parameter estimates in models that contain endogenous variables. IV estimators can also be used to control for measurement error.

longitudinal data. Longitudinal data is another term for panel data.

overidentifying restrictions. The order condition for model identification requires that the number of exogenous variables excluded from the model be at least as great as the number of endogenous regressors. When the number of excluded exogenous variables exceeds the number of endogenous regressors, the model is overidentified, and the validity of the instruments can then be checked via a test of overidentifying restrictions.

panel data. Panel data are data in which the same units were observed over multiple periods. The units, called panels, are often firms, households, or patients who were observed at several points in time. In a typical panel dataset, the number of panels is large, and the number of observations per panel is relatively small.

panel-corrected standard errors (PCSEs). The term panel-corrected standard errors refers to a class of estimators for the variance–covariance matrix of the OLS estimator when there are relatively few panels with many observations per panel. PCSEs account for heteroskedasticity, autocorrelation, or cross-sectional correlation.

pooled estimator. A pooled estimator ignores the longitudinal or panel aspect of a dataset and treats the observations as if they were cross-sectional.

population-averaged model. A population-averaged model is used for panel data in which the parameters measure the effects of the regressors on the outcome for the average individual in the population. The panel-specific errors are treated as uncorrelated random variables drawn from a population with zero mean and constant variance, and the parameters measure the effects of the regressors on the dependent variable after integrating over the distribution of the random effects.

predetermined variable. A predetermined variable is a regressor that is uncorrelated with the contemporaneous and future values of the unobservable error term but may be correlated with its past values.

prewhiten. To prewhiten is to apply a transformation to a time series so that it becomes white noise.

❇️ @AI_Python_EN
What Are "Panel Models?" Part 3​

random-coefficients model. A random-coefficients model is a panel-data model in which group specific heterogeneity is introduced by assuming that each group has its own parameter vector, which is drawn from a population common to all panels.

random-effects model. A random-effects model for panel data treats the panel-specific errors as uncorrelated random variables drawn from a population with zero mean and constant variance. The regressors must be uncorrelated with the random effects for the estimates to be consistent.

regressand. The regressand is the variable that is being explained or predicted in a regression model. Synonyms include dependent variable, left-hand-side variable, and endogenous variable.

regressor. Regressors are variables in a regression model used to predict the regressand. Synonyms include independent variable, right-hand-side variable, explanatory variable, predictor variable, and exogenous variable.

strongly balanced. A longitudinal or panel dataset is said to be strongly balanced if each panel has the same number of observations and the observations for different panels were all made at the same times.

unbalanced data. A longitudinal or panel dataset is said to be unbalanced if each panel does not have the same number of observations.

weakly balanced. A longitudinal or panel dataset is said to be weakly balanced if each panel has the same number of observations but the observations for different panels were not all made at the same times.

within estimator. The within estimator is a panel-data estimator that removes the panel-specific heterogeneity by subtracting the panel-level means from each variable and then performing ordinary least squares on the demeaned data. The within estimator is used in fitting the linear fixed-effects model.
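
The demeaning step is easy to sketch in pandas; the example below uses simulated brand-by-week data and ordinary statsmodels OLS on the demeaned variables (note that the reported standard errors would still need a degrees-of-freedom correction in practice):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated brand-by-week panel with brand-specific intercepts
rng = np.random.default_rng(8)
df = pd.DataFrame({"brand": np.repeat(list("ABCD"), 52),
                   "price": rng.normal(10, 1, 208)})
df["sales"] = (300 - 15 * df["price"]
               + np.repeat(rng.normal(0, 20, 4), 52)     # panel-specific heterogeneity
               + rng.normal(0, 5, 208))

# Within estimator: subtract panel-level means, then run OLS on the demeaned data
demeaned = df[["sales", "price"]] - df.groupby("brand")[["sales", "price"]].transform("mean")
within = sm.OLS(demeaned["sales"], demeaned[["price"]]).fit()
print(within.params)   # price coefficient should be close to -15
```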

❇️ @AI_Python_EN
What is Cluster Analysis?

Practically anyone working in marketing research or data science has heard of cluster analysis, but there are many misunderstandings about what it is. This is not surprising since cluster analysis originated outside the business world and is frequently applied in ways we may not be familiar with.

#Clusteranalysis is actually not just one thing and is an umbrella term for a very large family of methods which includes familiar approaches such as K-means and hierarchical agglomerative clustering (HAC). For those of you interested in a detailed look at cluster analysis, below are some excellent if technical books on or related to cluster analysis:

* Cluster Analysis (Everitt et al.)
* Data Clustering (Aggarwal and Reddy)
* Handbook of Cluster Analysis (Hennig et al.)
* Applied Biclustering Methods (Kasim et al.)
* Finite Mixture and Markov Switching Models (Frühwirth-Schnatter)
* Latent Class and Latent Transition Analysis (Collins and Lanza)
* Advances in Latent Class Analysis (Hancock et al.)
* Market Segmentation (Wedel and Kamakura)

"Cluster analysis – also known as unsupervised learning – is used in multivariate statistics to uncover latent groups suspected in the data or to discover groups of homogeneous observations. The aim is thus often defined as partitioning the data such that the groups are as dissimilar as possible and that the observations within the same group are as similar as possible. The groups forming the partition are also referred to as clusters.

Cluster analysis can be used for different purposes. It can be employed
(1) as an exploratory tool to detect structure in multivariate data sets such that the results allow the data to be summarized and represented in a simplified and shortened form,
(2) to perform vector quantization and compress the data using suitable prototypes and prototype assignments and
(3) to reveal a latent group structure which corresponds to unobserved heterogeneity.
A standard statistical textbook on cluster analysis is, for example, Everitt et al. (2011).

Clustering is often referred to as an ill-posed problem which aims to reveal interesting structures in the data or to derive a useful grouping of the observations. However, specifying what is interesting or useful in a formal way is challenging. This complicates the specification of suitable criteria for selecting a clustering method or a final clustering solution. Hennig (2015) also emphasizes this point. He argues that the definition of the true clusters depends on the context and on the aim of clustering. Thus there does not exist a unique clustering solution given the data, but different aims of clustering imply different solutions, and analysts should in general be aware of the ambiguity inherent in cluster analysis and thus be transparent about their clustering aims when presenting the solutions obtained.

At the core of cluster analysis is the definition of what a cluster is. This can be achieved by defining the characteristics of the clusters which should emerge as output from the analysis. Often these characteristics can only be informally defined and are not directly useful for selecting a suitable clustering method. In addition, some notion of the total number of clusters suspected or the expected size of clusters might be needed to characterize the cluster problem. Furthermore, domain knowledge is important for deciding on a suitable solution, in the sense that the derived partition consists of interpretable clusters that have practical relevance. However, domain experts are often only able to assess the suitability of a solution once they are confronted with a grouping but are unable to provide clear characteristics of the desired clustering beforehand."
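
As one small, hedged illustration with scikit-learn and simulated respondent data, a K-means solution might be obtained like this; it is a sketch of a single method from the large family described above, not a recommended workflow:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score

# Simulated respondent data: three latent groups on three measures
rng = np.random.default_rng(9)
X = np.vstack([rng.normal([0, 0, 0], 1, size=(100, 3)),
               rng.normal([4, 4, 0], 1, size=(100, 3)),
               rng.normal([0, 4, 4], 1, size=(100, 3))])

Xs = StandardScaler().fit_transform(X)          # put variables on a common scale
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(Xs)
print(np.bincount(km.labels_))                  # cluster sizes
print(silhouette_score(Xs, km.labels_))         # one of many internal quality measures
```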

❇️ @AI_Python_EN
OpenAI announced the final staged release of its 1.5 billion parameter language model GPT-2, along with all associated code and model weights

https://medium.com/syncedreview/openai-releases-1-5-billion-parameter-gpt-2-model-c34e97da56c0

❇️ @AI_Python_EN
What are the three types of error in a #ML model?

👉 1. Bias - error caused by choosing an algorithm that cannot accurately model the signal in the data, i.e. the model is too general or was incorrectly selected. For example, selecting a simple linear regression to model highly non-linear data would result in error due to bias.

👉 2. Variance - error from an estimator being too specific and learning relationships that are specific to the training set but do not generalize to new samples well. Variance can come from fitting too closely to noise in the data, and models with high variance are extremely sensitive to changing inputs. Example: Creating a decision tree that splits the training set until every leaf node only contains 1 sample.

👉 3. Irreducible error - error caused by noise in the data that cannot be removed through modeling. Example: inaccuracy in data collection causes irreducible error.
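
A small scikit-learn sketch with simulated data makes the first two error types visible: a straight line underfits a non-linear signal (bias), while an unpruned decision tree fits the training noise and generalizes poorly (variance):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Non-linear signal plus irreducible noise
rng = np.random.default_rng(10)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(0, 0.3, 400)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lin = LinearRegression().fit(X_tr, y_tr)                       # high bias: a line cannot track the sine
tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)   # high variance: fits training noise

for name, model in [("linear", lin), ("deep tree", tree)]:
    print(name,
          "train MSE:", round(mean_squared_error(y_tr, model.predict(X_tr)), 3),
          "test MSE:", round(mean_squared_error(y_te, model.predict(X_te)), 3))
```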

❇️ @AI_Python_EN
François Chollet (Google, Creator of Keras) just released a paper on defining and measuring intelligence and a GitHub repo that includes a new #AI evaluation dataset, ARC – "Abstraction and Reasoning Corpus".

Paper: https://arxiv.org/abs/1911.01547
ARC: https://github.com/fchollet/ARC

#AI #machinelearning #deeplearning

❇️ @AI_Python_EN
Kick-start your Python Career with 56 amazing Python Open source Projects
#python #programming #technology #project

https://data-flair.training/blogs/python-open-source-projects/

❇️ @AI_Python_EN
Wherefore Multivariate Regression?

In a regression setting, "multivariate analysis" (MVA) is often used loosely to mean that a single dependent variable (outcome) is modeled as a function of two or more independent variables (predictors).

There are situations, though, in which we have two or more dependent variables we wish to model simultaneously, multivariate regression being one example. I tend to approach this through a structural equation modeling (SEM) framework but there are several alternatives.

Why not run one #regression for each outcome? There are several reasons, and the excerpt below from Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling (Snijders and Bosker) is a particularly succinct explanation in the context of multilevel models.

"Why analyze multiple dependent variables simultaneously? It is possible to analyze all m dependent variables separately. There are several reasons why it may be sensible to analyze the data jointly, that is, as multivariate data.

1. Conclusions can be drawn about the correlations between the dependent variables – notably, the extent to which the unexplained correlations depend on the individual and on the group level. Such conclusions follow from the partitioning of the covariances between the dependent variables over the levels of analysis.

2. The tests of specific effects for single dependent variables are more powerful in the multivariate analysis. This will be visible in the form of smaller standard errors. The additional power is negligible if the dependent variables are only weakly correlated, but may be considerable if the dependent variables are strongly correlated while at the same time the data are very incomplete, that is, the average number of measurements available per individual is considerably less than m.

3. Testing whether the effect of an explanatory variable on dependent variable Y1 is larger than its effect on Y2, when the data on Y1 and Y2 were observed (totally or partially) on the same individuals, is possible only by means of a multivariate analysis.

4. If one wishes to carry out a single test of the joint effect of an explanatory variable on several dependent variables, then a multivariate analysis is also required. Such a single test can be useful, for example, to avoid the danger of capitalization on chance which is inherent in carrying out a separate test for each dependent variable.

A multivariate analysis is more complicated than separate analyses for each dependent variable. Therefore, when one wishes to analyze several dependent variables, the greater complexity of the multivariate analysis will have to be balanced against the reasons listed above. Often it is advisable to start by analyzing the data for each dependent variable separately."

Source: Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling (Snijders and Bosker)
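
As one possible illustration, and not the setup used in the excerpt above, an SEM package such as semopy can estimate both regressions jointly along with the residual correlation between the outcomes. The data and variable names below are simulated and hypothetical:

```python
import numpy as np
import pandas as pd
from semopy import Model

# Simulated data: two correlated outcomes driven by the same predictors
rng = np.random.default_rng(11)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
e = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=n)   # correlated errors
df = pd.DataFrame({"x1": x1, "x2": x2,
                   "y1": 0.6 * x1 + 0.2 * x2 + e[:, 0],
                   "y2": 0.1 * x1 + 0.7 * x2 + e[:, 1]})

# Both regressions estimated jointly, plus the residual covariance between y1 and y2
desc = """
y1 ~ x1 + x2
y2 ~ x1 + x2
y1 ~~ y2
"""
model = Model(desc)
model.fit(df)
print(model.inspect())   # coefficients and the estimated y1-y2 residual covariance
```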

❇️ @AI_Python_EN
How to deliver on Machine Learning projects

A guide to the ML Engineering Loop.

By Emmanuel Ameisen and Adam Coates:
https://blog.insightdatascience.com/how-to-deliver-on-machine-learning-projects-c8d82ce642b0

#ArtificialIntelligence #BigData #DataScience #DeepLearning #MachineLearning

❇️ @AI_Python_EN