Introduction to Time Series Analysis#
Introduction#
In this section we will cover the following topics:
What is a time series, and when do we use it.
The main components of a time series.
Time series decomposition.
What is a time series?#
A time series is a sequence of data points organized in time order.
Usually, the time signal is sampled at equally spaced points in time.
The signal can then be represented as the sequence of its sampled values.
We refer to this setting as discrete time.
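As a minimal sketch of discrete-time sampling, the snippet below evaluates a signal at equally spaced time points. The signal (a 1 Hz sine wave), the sampling interval dt, and all variable names are illustrative assumptions, not part of the original material.

```python
import numpy as np

dt = 0.1                        # assumed sampling interval, in seconds
n_samples = 50                  # assumed number of samples
t = np.arange(n_samples) * dt   # equally spaced sampling times

# Hypothetical continuous signal, evaluated only at the sampling times
x = np.sin(2 * np.pi * 1.0 * t)

# The time series is the ordered sequence x[0], x[1], ..., x[n_samples - 1]
print(t[:3], x[:3])
```

Only the ordered values in x and the spacing dt are needed to describe the sampled series.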

What data are represented as time series?#
Time series arise in many domains, such as financial prices, activity measurements, and behavioural data.
Time series analysis#
The main objectives of time series analysis are:
To find a model that adequately explains the dependence observed in a time series.
To predict or forecast future values of the time series based on those we have already observed.
To adjust our observed data into a form more suited for answering the question of interest.
Applications#
Time series analysis is used in many real-world applications, including:
Economic forecasting
Stock market analysis
Demand planning and forecasting
Anomaly detection
… And much more
Economic Forecasting
Time series analysis is used in macroeconomic predictions.
The World Trade Organization uses time series forecasting to predict levels of international trade.
The Federal Reserve uses time series forecasts of the economy to set interest rates [source].
Demand forecasting
Time series analysis is used to predict demand at different levels of granularity.
Amazon and other e-commerce companies use time series modeling to predict demand at the product-geography level [source].
This helps meet customer needs (fast shipping) and reduce inventory waste.
Anomaly detection
Used to detect anomalous behaviors in the underlying system by looking at unusual patterns in the time series.
Widely used in manufacturing to detect defects and target preventive maintenance
With new IoT devices, anomaly detection is being used in machinery-heavy industries, such as petroleum and gas [source].
Time series components#
A time series is often assumed to be composed of three components:
Trend: The trend is the long-term direction of the data.
Seasonality: The seasonal component is the part that repeats itself at regular intervals in time.
Noise: The component that is neither Trend nor Seasonality.
Trend#
Kendall writes that “the essential idea of trend is that it shall be smooth.”
Trend captures the general direction of the time series.
Trend can be increasing, decreasing, or constant.
It can increase/decrease in different ways over time (linearly, exponentially, etc).
When the trend is removed, we obtain a detrended series.
Let’s create a trend from scratch to understand its behaviour.
import numpy as np
time = np.arange(252 * 5)  # 252 trading days per year * 5 years
trend = time * 0.07 + 75  # linear trend: starts at 75 and rises 0.07 per day
Seasonality#
Seasonality refers to periodic fluctuations in time series data that occur at regular intervals due to seasonal factors.
It is characterized by consistent and predictable patterns over a specific period (e.g., daily, monthly, quarterly, yearly).
When we remove a seasonal component we say that we deseasonalize the series.
It can be driven by many factors, and there can be multiple causes of seasonal behaviour.
Naturally occurring events such as weather fluctuations caused by time of year.
Business or administrative procedures, such as start and end of a school year.
Social or cultural behavior, e.g., holidays or religious observances.
seasonal = 0.1 + np.sin( time * 0.024 - 0.44 ) * -4.12
Noise#
Noise consists of the random fluctuations left over after trend and seasonality are removed from the original time series.
One should not see a trend or seasonal pattern in the residuals.
They represent short-term, largely unpredictable fluctuations.
noise = np.random.normal(loc=0.0, scale=1, size=len(time))
Decomposition Models#
Time series can be decomposed with the following models:
Additive decomposition
Multiplicative decomposition
Additive model#
Additive models assume that the observed time series is the sum of its components:
\[X(t) = T(t) + S(t) + N(t)\]
where
\(X(t)\) is the time series
\(T(t)\) is the trend
\(S(t)\) is the seasonality
\(N(t)\) is the noise
Additive models are used when the magnitudes of the seasonal and residual values do not depend on the level of the trend.
additive = trend + seasonal + noise
Multiplicative Model#
Multiplicative models assume that the observed time series is the product of its components:
\[X(t) = T(t) \cdot S(t) \cdot N(t)\]
It is possible to transform a multiplicative model into an additive one by applying a log transformation:
\[\log X(t) = \log T(t) + \log S(t) + \log N(t)\]
Multiplicative models are used when the magnitudes of the seasonal and residual values depend on the level of the trend.
multiplicative = trend * seasonal # no noise to make the trend more pronounced
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1, figsize=(10, 3))
ax.plot(time, multiplicative, 'tab:blue')
ax.set_xlabel("Date")
ax.set_ylabel("Price (USD)")
plt.title("Multiplicative Time Series")
year_ticks = np.arange(0, len(time), 252)
ax.set_xticks(year_ticks)
ax.grid(True)
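The log transformation of a multiplicative model can be checked numerically. The sketch below uses strictly positive, made-up components (the names trend_pos, seasonal_pos, and noise_pos are assumptions for illustration), because the logarithm is only defined for positive values. The seasonal series built earlier takes negative values, so it could not be log-transformed directly.

```python
import numpy as np

rng = np.random.default_rng(0)
time = np.arange(252 * 5)

# Hypothetical, strictly positive components
trend_pos = 75 + 0.07 * time
seasonal_pos = 1 + 0.2 * np.sin(2 * np.pi * time / 252)     # oscillates around 1
noise_pos = np.exp(rng.normal(0.0, 0.02, size=len(time)))   # log-normal, always > 0

x = trend_pos * seasonal_pos * noise_pos                    # multiplicative series

# Taking logs turns the product into a sum of log-components
log_x = np.log(x)
log_sum = np.log(trend_pos) + np.log(seasonal_pos) + np.log(noise_pos)
print(np.allclose(log_x, log_sum))
```

After the transformation, any additive decomposition method can be applied to log_x, and the estimated components can be mapped back with np.exp.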
For this workshop, we’ll be focusing more on the additive model.
Time Series Decomposition#
Now let’s go the other way.
We have additive and multiplicative data.
Let’s decompose them into their three components.
A very simple, yet often useful, approach is to estimate a linear trend.
A detrended time series is obtained by subtracting the linear trend from the data.
The linear trend is computed as a 1st order polynomial.
slope, intercept = np.polyfit(np.arange(len(additive)), additive, 1) # estimate line coefficient
trend = np.arange(len(additive)) * slope + intercept # linear trend
detrended = additive - trend # remove the trend
Next, we will use seasonal_decompose from statsmodels to isolate the main time series components. This is a simple method that requires us to specify the type of model (additive or multiplicative) and the main period.
Additive Decomposition#
We need to specify an integer that represents the main seasonality of the data.
Since the market is open for trading on about 252 business days per year, we’ll use that as our period.
from statsmodels.tsa.seasonal import seasonal_decompose
additive_decomposition = seasonal_decompose(x=additive, model='additive', period=252)
seas_decomp_plots(additive, additive_decomposition)
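To build intuition for what a moving-average decomposition does, here is a simplified NumPy sketch of the classical additive procedure: estimate the trend with a centered moving average over one period, then average the detrended values at each position in the cycle. The function name classical_additive_decompose and the synthetic series are assumptions for illustration. This is written in the spirit of seasonal_decompose, not a copy of the library's implementation.

```python
import numpy as np

def classical_additive_decompose(x, period):
    """Simplified additive decomposition: moving-average trend, then seasonal means."""
    x = np.asarray(x, dtype=float)
    # Centered moving average; for an even period the two end weights are halved
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(len(x), np.nan)              # edges stay NaN (window does not fit)
    trend[half:len(x) - half] = np.convolve(x, weights, mode="valid")
    detrended = x - trend
    # Average the detrended values at each position within the cycle
    pattern = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    pattern -= pattern.mean()                    # seasonal pattern averages to zero
    seasonal = np.tile(pattern, len(x) // period + 1)[:len(x)]
    resid = x - trend - seasonal
    return trend, seasonal, resid

# Synthetic series with a known linear trend and a 252-step cycle (assumed values)
rng = np.random.default_rng(0)
time = np.arange(252 * 5)
series = 75 + 0.07 * time + 4 * np.sin(2 * np.pi * time / 252) + rng.normal(0, 1, len(time))
trend_est, seasonal_est, resid_est = classical_additive_decompose(series, period=252)
```

Because the averaging window spans exactly one period, the seasonal cycle cancels out of the trend estimate. The price is that half a period of trend values is missing at each end of the series.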
Locally estimated scatterplot smoothing (LOESS)#
Next, we try a second method called STL (Seasonal and Trend decomposition using LOESS).
from statsmodels.tsa.seasonal import STL
stl_decomposition = STL(endog=additive, period=252, robust=True).fit()
seas_decomp_plots(additive, stl_decomposition)
Which method to use?#
Use seasonal_decompose when:
Your time series data has a clear and stable seasonal pattern and trend.
You prefer a simpler model with fewer parameters to adjust.
The seasonal amplitude is constant over time (suggesting an additive model).
Use STL when:
Your time series exhibits complex seasonality that may change over time.
You need to handle outliers effectively without them distorting the trend and seasonal components.
You are dealing with non-linear trends and seasonality, and you need more control over the decomposition process.
Identifying the dominant period#
seasonal_decompose expects the dominant period as a parameter. In our example, the seasonal component was easy to figure out based on the behaviour of the market, but we won’t always be so lucky. How can we find it?
You can use one of the following techniques:
Plot the data and try to figure out after how many steps the cycle repeats.
Use the Fast Fourier Transform on a detrended signal.
For now, you can use the following function to compute the dominant period in the data.
period, freqs, magnitudes = fft_analysis(seasonal)
Dominant Frequency: 0.004
Dominant Period: 252.00 time units
hourly_seasonal = 12 + np.sin(2*np.pi*time/24)
fft_analysis(hourly_seasonal);
Dominant Frequency: 0.041
Dominant Period: 24.23 time units
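The fft_analysis helper is not defined in this section. One plausible way to write such a function is sketched below with NumPy's real FFT. The function body, the mean-removal step, and the printed format are assumptions modelled on the output shown above, not the workshop's actual implementation.

```python
import numpy as np

def fft_analysis(signal):
    """Hypothetical helper: return (dominant period, frequencies, magnitudes)."""
    signal = np.asarray(signal, dtype=float)
    centered = signal - signal.mean()             # remove the constant offset
    magnitudes = np.abs(np.fft.rfft(centered))    # magnitude spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0)   # frequencies in cycles per time step
    peak = np.argmax(magnitudes[1:]) + 1          # strongest bin, skipping zero frequency
    dominant_frequency = freqs[peak]
    dominant_period = 1.0 / dominant_frequency
    print(f"Dominant Frequency: {dominant_frequency:.3f}")
    print(f"Dominant Period: {dominant_period:.2f} time units")
    return dominant_period, freqs, magnitudes

# Usage on an assumed signal whose period (24) divides the series length (1200)
t = np.arange(1200)
period, freqs, magnitudes = fft_analysis(3 + np.sin(2 * np.pi * t / 24))
```

For a series of length N, the FFT can only resolve the frequencies k / N, so the reported period is always N / k for some integer k. With N = 1260 samples, a true period of 24 falls between k = 52 (period 24.23) and k = 53 (period 23.77), which is why the output above reports 24.23 rather than exactly 24.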
Summary#
In this section we covered:
The definition of a time series and examples of time series from the real world.
The definition of time series analysis and examples of its application in different fields.
A practical understanding of the three components of time series data.
The additive and multiplicative models.
Standard approaches to decompose a time series into its constituent parts.
Exercises#
Exercise 1:#
lake_huron = sm.datasets.get_rdataset("LakeHuron", "datasets").data["value"]
Perform seasonal decomposition using both STL and seasonal_decompose, and justify your choice of period (using either a rationale or FFT). Comment on the overall trend in the data.
Exercise 2:#
passengers = sm.datasets.get_rdataset("AirPassengers", "datasets").data["value"].values
Use FFT to determine the dominant period in the data, then use it to seasonally decompose the data. Once you extract the seasonality, comment on the seasonal ridership patterns.