<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Time Series - neptune.ai</title>
	<atom:link href="https://neptune.ai/blog/category/time-series-forecasting/feed" rel="self" type="application/rss+xml" />
	<link>https://neptune.ai/blog/category/time-series-forecasting</link>
	<description>The experiment tracker for foundation model training.</description>
	<lastBuildDate>Tue, 06 May 2025 11:38:04 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://i0.wp.com/neptune.ai/wp-content/uploads/2022/11/cropped-Signet-1.png?fit=32%2C32&#038;ssl=1</url>
	<title>Time Series - neptune.ai</title>
	<link>https://neptune.ai/blog/category/time-series-forecasting</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">211928962</site>	<item>
		<title>Time Series Projects: Tools, Packages, and Libraries That Can Help</title>
		<link>https://neptune.ai/blog/time-series-tools-packages-libraries</link>
		
		<dc:creator><![CDATA[Enes Zvorničanin]]></dc:creator>
		<pubDate>Thu, 20 Oct 2022 14:43:45 +0000</pubDate>
				<category><![CDATA[ML Tools]]></category>
		<category><![CDATA[Time Series]]></category>
		<guid isPermaLink="false">https://neptune.test/time-series-tools-packages-libraries/</guid>

					<description><![CDATA[Since you are here, you probably know that time series data is a bit different than static ML data. So when working on time series projects, oftentimes, Data Scientists or ML Engineers use specific tools and libraries. Or they use commonly known tools that have proved to be well adjusted to time series projects. We&#8230;]]></description>
										<content:encoded><![CDATA[
<p>Since you are here, you probably know that <a href="/blog/time-series-prediction-vs-machine-learning" target="_blank" rel="noreferrer noopener">time series data is a bit different from static ML data</a>. So when working on time series projects, data scientists and ML engineers often use specialized tools and libraries, or general-purpose tools that have proved to be well suited to time series work.</p>



<p>We figured it would be useful to have those tools gathered in one place, so here we are. This article is a catalog of time series tools and packages. Some of them are well known and some may be new to you. We hope you&#8217;ll find the whole list useful!</p>



<p>Before we dig into tools, let&#8217;s cover some basics. </p>



<div id="separator-block_76addb181febf0cb05ce8a0ae2d6945a"
         class="block-separator block-separator--15">
</div>



<h2 class="wp-block-heading" id="h-what-is-a-time-series">What is a time series?</h2>



<p>A time series is a sequence of data points indexed in time order. It consists of observations of the same variable at successive points in time. In other words, it’s a set of data that has been observed over a period of time.</p>



<p>The data is often plotted as a line on a graph, with time on the x-axis and the value at each point on the y-axis. A time series has four main components:</p>



<div id="case-study-numbered-list-block_6aab9dd561d5cb0d546c1cf1009a62d2"
         class="block-case-study-numbered-list ">

    

    <ul class="c-list">
                    <li class="c-list__item">
                <span class="c-list__counter">1</span>
                Trend            </li>
                    <li class="c-list__item">
                <span class="c-list__counter">2</span>
                Seasonal variations            </li>
                    <li class="c-list__item">
                <span class="c-list__counter">3</span>
                Cyclic variations            </li>
                    <li class="c-list__item">
                <span class="c-list__counter">4</span>
                Irregular or random variations            </li>
            </ul>
</div>



<p>A <strong>trend</strong> is the general direction of change in the data over many periods, that is, its long-term pattern. A trend usually persists for a certain amount of time, after which it disappears or changes direction. For example, in financial markets, a ‘Bullish Trend’ indicates an upward trend where the prices of financial assets rise in general, while a ‘Bearish Trend’ indicates a decline in prices.</p>



<p>Broadly, a trend in time series can be:</p>



<ul class="wp-block-list">
<li><strong>Upward trend: </strong>a time series increases over an observed period.</li>



<li><strong>Downward trend:</strong> a time series decreases over an observed period.</li>



<li><strong>Constant or horizontal trend: </strong>a time series doesn’t significantly rise or fall over an observed period.</li>
</ul>



<p><strong>Seasonal variations</strong>, or <strong>seasonality</strong>, are an important component to consider when looking at a time series because they indicate what might happen in the future based on past data. Seasonality refers to variation that repeats over a fixed period, such as the seasons of the year, but also a day, a week, or a month. For example, temperature has seasonal behavior because it is higher in summer and lower in winter.</p>



<p>In contrast to seasonal variations, <strong>cyclic variations</strong> don’t have a fixed period and may drift in time. For instance, financial markets tend to cycle between periods of high and low values, but there is no predetermined period of time between them. A time series can also have both seasonal and cyclic variations. The real estate market is a well-known example: the seasonal pattern shows more transactions in the spring than in the summer, while the cyclic pattern reflects people’s purchasing power, so there are fewer sales during a crisis than during times of prosperity.</p>



<p><strong>Irregular or random variations</strong> are what remains after the trend, seasonal, and cyclic components are removed. Because of that, they are also known as the residual component. This is the non-systematic part of a time series, which is random and can’t be predicted.</p>
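<p>To make the components more concrete, here is a minimal sketch that builds a synthetic monthly series from a trend, a seasonal pattern, and random noise, and then recovers a rough trend estimate with a moving average. The additive form, the 12-month period, and all numbers are assumptions chosen for illustration, and the cyclic component is left out to keep the example short.</p>

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Four years of synthetic monthly observations (all values are illustrative).
n_months = 48
t = np.arange(n_months)

trend = 0.5 * t                                   # steady upward trend
seasonal = 10.0 * np.sin(2 * np.pi * t / 12)      # pattern that repeats every 12 months
irregular = rng.normal(scale=2.0, size=n_months)  # random, unpredictable part

# Additive model: the observed series is the sum of its components.
series = trend + seasonal + irregular

# A 12-month moving average cancels the seasonal pattern and damps the noise,
# leaving a rough estimate of the trend.
window = 12
trend_estimate = np.convolve(series, np.ones(window) / window, mode="valid")

print(series[:5])
print(trend_estimate[:5])
```

<p>The moving average window matches the seasonal period on purpose: averaging over one full period is what removes the seasonal swing from the estimate.</p>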



<div id="separator-block_e304094759d0a0c7cf2bcf596b48c813"
         class="block-separator block-separator--10">
</div>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://neptune.ai/time-series-projects-tools-packages-and-libraries-that-can-help_3" target="_blank" rel="noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Projects-Tools-Packages-and-Libraries-That-Can-Help_3.png?ssl=1" alt="Time-Series components" class="wp-image-64576" style="width:896px;height:554px"/></a><figcaption class="wp-element-caption"><em>Time series components | </em><a href="https://itfeature.com/time-series-analysis-and-forecasting/components-of-time-series" target="_blank" rel="noreferrer noopener nofollow"><em>Source</em></a></figcaption></figure>
</div>


<p>Time series are used in many fields, such as economics, mathematics, biology, physics, and meteorology. Some concrete examples of time series data are:</p>



<ul class="wp-block-list">
<li>The Dow Jones Industrial Average index prices</li>



<li>The temperature in New York City</li>



<li>Bitcoin price</li>



<li>ECG signals</li>



<li>Google Trends data for the term MLOps</li>



<li>The unemployment rate in the USA</li>



<li>Website traffic over time</li>
</ul>



<p>In this article, we will take a look at a few of the aforementioned examples.</p>



<h2 class="wp-block-heading" id="h-examples-of-time-series-projects">Examples of time series projects</h2>



<h3 class="wp-block-heading" id="h-stock-market-prediction">Stock market prediction</h3>



<p><a href="/blog/predicting-stock-prices-using-machine-learning" target="_blank" rel="noreferrer noopener">Stock market forecasting</a> is a challenging and attractive topic where the main goal is to develop methods and strategies for predicting future stock prices. There are many different techniques, from classical algorithmic and statistical methods up to complex neural network architectures. What they have in common is that they all rely on time series to produce their forecasts. Stock market forecasting methods are widely used by amateur investors, fintech startups, and big hedge funds.</p>



<p>There are many ways to use stock market forecasting methods in practice, but the most popular is probably trading. Automated trading on stock exchanges is on the rise, and it’s estimated that about 75% of stocks traded on US stock exchanges come from algorithmic systems. There are two main approaches to predicting how stocks will perform in the future: fundamental analysis and technical analysis.</p>



<p><strong>Fundamental analysis</strong> looks at factors such as a company&#8217;s financial statements, management, and industry trends. It also takes into account macroeconomic indicators such as the inflation rate, GDP, and the overall state of the economy. All these indicators change over time, so they can be represented as time series.</p>



<p>In contrast to fundamental analysis, <strong>technical analysis</strong> uses patterns in trading volume, price changes, and other information from the market itself to predict how stocks will perform in the future. It’s important for investors to understand both approaches before making an investment decision.</p>
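<p>As an illustration of technical analysis, the sketch below computes two simple moving averages with Pandas and marks the days where they cross. The prices are a synthetic random walk, and the 10-day and 30-day windows are arbitrary assumptions, so this is not a trading strategy, only a demonstration of how such indicators are derived from a price series.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)

# Synthetic daily closing prices (a random walk), used only for illustration.
dates = pd.date_range("2022-01-03", periods=120, freq="B")
close = pd.Series(100 + rng.normal(0, 1, size=120).cumsum(), index=dates, name="close")

# Two simple moving averages, a common pair of technical indicators.
indicators = pd.DataFrame(
    {
        "close": close,
        "sma_fast": close.rolling(window=10).mean(),
        "sma_slow": close.rolling(window=30).mean(),
    }
).dropna()  # drop the warm-up rows where the slow average is not defined yet

# True on days when the fast average is above the slow one.
fast_above = indicators["sma_fast"] > indicators["sma_slow"]

# A crossover is a day on which that flag differs from the previous day.
crossovers = fast_above.astype(int).diff().fillna(0).ne(0)

print(indicators.loc[crossovers, ["close", "sma_fast", "sma_slow"]])
```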



<div id="separator-block_76addb181febf0cb05ce8a0ae2d6945a"
         class="block-separator block-separator--15">
</div>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://neptune.ai/time-series-projects-tools-packages-and-libraries-that-can-help_2" target="_blank" rel="noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Projects-Tools-Packages-and-Libraries-That-Can-Help_2.png?ssl=1" alt="Technical indicators example" class="wp-image-64577" style="width:808px;height:365px"/></a><figcaption class="wp-element-caption"><em>Technical indicators example | </em><a href="https://tradingstrategyguides.com/best-combination-of-technical-indicators/" target="_blank" rel="noreferrer noopener nofollow"><em>Source</em></a></figcaption></figure>
</div>


<h3 class="wp-block-heading" id="h-bitcoin-price-forecasting">Bitcoin price forecasting</h3>



<p>Bitcoin is a digital currency that has significant fluctuations in price. It’s also one of the most volatile assets in the world. The price of bitcoin is determined by supply and demand. When demand for bitcoins increases, the price increases, and when demand falls, the price falls. As demand has increased in recent years, so has the price. Because of its very volatile nature, it is a very challenging task to forecast bitcoin&#8217;s future prices.</p>



<p>In general, this problem is very similar to stock market prediction, and almost the same methods can be used to solve it. Bitcoin has even been shown to correlate with some indices, such as the S&amp;P 500 and the Dow Jones. This means that the bitcoin price, to some degree, follows the prices of those indices. You can read more about this here:</p>



<section id="note-block_642e045472afb678d05bfbf44c04dbaf"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                
                
                <div class="c-item__content">

                                            <p>  <a href="https://towardsdatascience.com/cryptocurrency-price-prediction-using-lstms-tensorflow-for-hackers-part-iii-264fcdbccd3f" target="_blank" rel="noreferrer noopener nofollow">Cryptocurrency price prediction using LSTMs | TensorFlow for hackers</a></p>
<p>  <a href="https://medium.com/geekculture/lstm-for-bitcoin-prediction-in-python-6e2ea7b1e4e4" target="_blank" rel="noreferrer noopener nofollow">LSTM for bitcoin prediction in Python</a></p>
                                    </div>

            </div>
            </div>


</section>
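<p>The correlation mentioned above can be measured in a few lines. The following sketch uses made-up daily returns that share a common market factor, so the numbers say nothing about real markets. It only shows how an overall and a rolling correlation between two return series could be computed with Pandas. The column names and the 60-day window are assumptions.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
n_days = 250

# Synthetic daily returns: a shared market factor plus asset-specific noise.
# These numbers are made up and do not describe real market behaviour.
market = rng.normal(0, 0.01, size=n_days)
returns = pd.DataFrame(
    {
        "index_return": market + rng.normal(0, 0.005, size=n_days),
        "btc_return": 1.5 * market + rng.normal(0, 0.03, size=n_days),
    },
    index=pd.date_range("2022-01-03", periods=n_days, freq="B"),
)

# Pearson correlation over the whole sample.
overall_corr = returns["btc_return"].corr(returns["index_return"])

# Rolling 60-day correlation shows how the relationship changes over time.
rolling_corr = returns["btc_return"].rolling(60).corr(returns["index_return"])

print(f"overall correlation: {overall_corr:.2f}")
print(rolling_corr.dropna().describe())
```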



<h3 class="wp-block-heading" id="h-ecg-anomaly-detection">ECG anomaly detection</h3>



<p>ECG <a href="/blog/anomaly-detection-in-time-series" target="_blank" rel="noreferrer noopener">anomaly detection</a> is a technique that detects abnormalities in an ECG. An ECG (electrocardiogram) is a test that records the electrical activity of the heart. The result is an electrical signal generated by the heart and represented as a time series.</p>



<p>ECG anomaly detection works by comparing a recording against the normal ECG pattern and flagging the parts that deviate from it. There are many types of anomalies in an ECG, and they can be classified as follows:</p>



<ul class="wp-block-list">
<li><strong>Heart rate anomalies:</strong> any change in heart rate from its normal range. This may be due to a problem with the heart or a problem with how it is being stimulated.</li>



<li><strong>Heart rhythm anomalies:</strong> any change in rhythm from its normal pattern. This may be due to a problem with the way impulses are conducted through the heart or with how quickly they are conducted.</li>
</ul>



<p>A lot of work has been done on this topic, ranging from academic research to commercial ECG machines, and there are some promising results. The main challenge is that such a system needs a very high level of accuracy, with as few false positives and false negatives as possible, because a wrong prediction can have serious consequences.</p>
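<p>To show what false positives and false negatives mean in practice, here is a deliberately simple sketch. It is not one of the LSTM-based methods used in research. It subtracts a running median from a synthetic periodic signal, flags points with a large z-score, and counts the errors against known labels. The signal, the injected spikes, the window width, and the threshold are all assumptions, and a real ECG is far more complex.</p>

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(seed=7)

# Synthetic periodic signal standing in for a heartbeat trace (not real ECG data).
n = 2000
t = np.arange(n)
signal = np.sin(2 * np.pi * t / 100) + rng.normal(0, 0.05, size=n)

# Inject a few artificial spikes and remember where they are.
anomaly_idx = np.array([300, 900, 1500])
labels = np.zeros(n, dtype=bool)
labels[anomaly_idx] = True
signal[anomaly_idx] += 3.0

# Running median over 5 samples as a baseline that ignores isolated spikes.
width = 5
padded = np.pad(signal, width // 2, mode="edge")
baseline = np.median(sliding_window_view(padded, width), axis=1)

# Flag points whose deviation from the baseline is unusually large.
residual = signal - baseline
z_score = (residual - residual.mean()) / residual.std()
predicted = np.abs(z_score) > 6.0

# Compare predictions with the known labels.
true_pos = int(np.sum(np.logical_and(predicted, labels)))
false_pos = int(np.sum(np.logical_and(predicted, np.logical_not(labels))))
false_neg = int(np.sum(np.logical_and(np.logical_not(predicted), labels)))

precision = true_pos / max(true_pos + false_pos, 1)
recall = true_pos / max(true_pos + false_neg, 1)
print(f"TP={true_pos} FP={false_pos} FN={false_neg} "
      f"precision={precision:.2f} recall={recall:.2f}")
```

<p>Lowering the threshold catches more subtle anomalies but raises the number of false positives, which is exactly the trade-off that makes this problem hard on real recordings.</p>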



<div id="separator-block_e304094759d0a0c7cf2bcf596b48c813"
         class="block-separator block-separator--10">
</div>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><a href="https://neptune.ai/time-series-projects-tools-packages-and-libraries-that-can-help_4" target="_blank" rel="noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Projects-Tools-Packages-and-Libraries-That-Can-Help_4.png?ssl=1" alt="ECG anomalies detection" class="wp-image-64575" style="width:654px;height:491px"/></a><figcaption class="wp-element-caption"><em>ECG anomalies detection | </em><a href="https://www.researchgate.net/figure/Two-examples-of-local-anomaly-in-ECG-time-series_fig1_292077185" target="_blank" rel="noreferrer noopener nofollow"><em>Source</em></a></figcaption></figure>
</div>


<section id="note-block_b586b8e2f9c5cb82990f93cdc5afc7b6"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                
                
                <div class="c-item__content">

                                            <p>  <a href="https://curiousily.com/posts/time-series-anomaly-detection-using-lstm-autoencoder-with-pytorch-in-python/" target="_blank" rel="noreferrer noopener nofollow">Time series anomaly detection using LSTM autoencoders with PyTorch in Python</a></p>
<p>  <a href="https://ieeexplore.ieee.org/document/7344872" target="_blank" rel="noreferrer noopener nofollow">Anomaly detection in ECG time signals via deep long short-term memory networks</a></p>
                                    </div>

            </div>
            </div>


</section>



<h2 class="wp-block-heading" id="h-tools-packages-and-libraries-for-time-series-projects">Tools, packages, and libraries for time series projects</h2>



<p>Now that we have some background on the importance of time series in the industry, let’s take a look at some popular tools, packages, and libraries that can be helpful for any time series project. Because the majority of data science and machine learning projects related to time series are done in Python, it makes sense to discuss tools that support Python.</p>



<p>We will discuss tools from four main categories:</p>



<div id="case-study-numbered-list-block_faccc3ca9731936b200307658d68752a"
         class="block-case-study-numbered-list ">

    

    <ul class="c-list">
                    <li class="c-list__item">
                <span class="c-list__counter">1</span>
                Data preparation and feature engineering tools             </li>
                    <li class="c-list__item">
                <span class="c-list__counter">2</span>
                Data analysis and visualization packages             </li>
                    <li class="c-list__item">
                <span class="c-list__counter">3</span>
                Experiment tracking tools             </li>
                    <li class="c-list__item">
                <span class="c-list__counter">4</span>
                Time series forecasting packages            </li>
            </ul>
</div>



<h3 class="wp-block-heading" id="h-data-preparation-and-feature-engineering-tools-for-time-series">Data preparation and feature engineering tools for time series</h3>



<p>Data preparation and feature engineering are two very important steps in the data science pipeline. Data preparation is typically the first step in any data science project. It’s the process of getting data into a form that can be used for analysis and further processing.</p>



<p>Feature engineering is a process of extracting features from raw data to make it more useful for modelling and prediction. Below, we’ll mention some of the most popular tools used for these tasks.</p>


    <a
        href="/blog/time-series-forecasting"
        id="cta-box-related-link-block_c98b61955570fd885baa9e93b981a7ac"
        class="block-cta-box-related-link  l-margin__top--standard l-margin__bottom--standard"
        target="_blank" rel="nofollow noopener noreferrer"    >

    
    <div class="block-cta-box-related-link__description-wrapper block-cta-box-related-link__description-wrapper--full">

        
            <div class="c-eyebrow">

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-related--article.svg"
                    loading="lazy"
                    decoding="async"
                    width="16"
                    height="16"
                    alt=""
                    class="c-eyebrow__icon">

                <div class="c-eyebrow__text">
                    Related                </div>
            </div>

        
                    <h3 class="c-header" class="c-header" id="h-time-series-forecasting-data-analysis-and-practice">                Time Series Forecasting: Data, Analysis, and Practice            </h3>        
                    <div class="c-button c-button--tertiary c-button--small">

                <span class="c-button__text">
                    Read more                </span>

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-button-arrow-right.svg"
                    loading="lazy"
                    decoding="async"
                    width="12"
                    height="12"
                    alt=""
                    class="c-button__arrow">

            </div>
            </div>

    </a>



<h4 class="wp-block-heading">Time series projects with Pandas</h4>



<p>Pandas is a Python library for data manipulation and analysis. It includes data structures and methods for manipulating numerical tables and time series. It also offers extensive functionality for working with time series data from any domain.</p>



<p>It supports data input from a variety of sources, including CSV, JSON, Parquet, SQL database tables and queries, and Microsoft Excel. Pandas also provides data manipulation features such as merging, reshaping, and selecting, as well as data cleaning and wrangling.</p>



<p>Some useful time series features are:&nbsp;</p>



<ul class="wp-block-list">
<li>Date range generation and frequency conversions&nbsp;</li>



<li>Moving window statistics&nbsp;</li>



<li>Resampling and time zone conversion&nbsp;</li>



<li>Date shifting</li>



<li>Lagging and many more</li>
</ul>
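<p>Here is a short sketch of a few of these features in action. The series is a made-up run of 14 daily values. The weekly resampling rule and the 3-day window are arbitrary choices for the example.</p>

```python
import numpy as np
import pandas as pd

# Date range generation: 14 daily timestamps starting on a Monday.
index = pd.date_range(start="2022-10-03", periods=14, freq="D")
values = pd.Series(np.arange(14, dtype=float), index=index, name="value")

# Frequency conversion: aggregate daily values into weekly means.
weekly_mean = values.resample("W").mean()

# Moving window statistics: 3-day rolling mean.
rolling_mean = values.rolling(window=3).mean()

# Date shifting: move the timestamps forward by one day, values unchanged.
shifted_index = values.shift(1, freq="D")

# Lagging: keep the timestamps, move the values down by one step.
lag_1 = values.shift(1)

print(weekly_mean)
print(pd.DataFrame({"value": values, "rolling_mean": rolling_mean, "lag_1": lag_1}).head())
```

<p>Lagged and rolling columns like these are a common starting point for time series feature engineering, because they turn the history of a series into ordinary tabular features.</p>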



<p>More related content for time series can be found below:</p>



<section id="note-block_ec70747ad2ca727ba64b0394f3491971"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                
                
                <div class="c-item__content">

                                            <p>  <a href="https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html" target="_blank" rel="noreferrer noopener nofollow">Pandas documentation</a></p>
<p>  <a href="https://www.w3schools.com/python/pandas/default.asp" target="_blank" rel="noreferrer noopener nofollow">W3Schools: Pandas tutorial</a></p>
                                    </div>

            </div>
            </div>


</section>



<h4 class="wp-block-heading">Time series projects with NumPy</h4>



<p>NumPy is a Python library that adds support for large, multi-dimensional arrays and matrices, along with a vast collection of high-level mathematical functions that operate on these arrays. Its array syntax will feel familiar to MATLAB users, and its core is a high-performance multidimensional array object.</p>



<p>NumPy&#8217;s datetime64 data type and arrays enable an extremely compact representation of dates in time series. NumPy also makes it simple to express many time series operations as vectorized or linear algebra operations.</p>
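<p>A minimal sketch of both ideas follows: dates stored as a datetime64 array, and a moving average written as a matrix-vector product. The dates and values are invented for the example.</p>

```python
import numpy as np

# A week of daily timestamps, each stored as a single 64-bit integer.
days = np.arange("2022-10-03", "2022-10-10", dtype="datetime64[D]")

# Date arithmetic is vectorised: add a timedelta64 to every element at once.
next_week = days + np.timedelta64(7, "D")

# Differences between consecutive dates are timedelta64 values.
gaps = np.diff(days)

# Boolean masks select a date range without an explicit loop.
values = np.array([3.0, 4.0, 2.0, 5.0, 6.0, 1.0, 7.0])
mask = days >= np.datetime64("2022-10-06")
late_week_mean = values[mask].mean()

# A 3-day moving average expressed as a matrix-vector product.
weights = np.ones(3) / 3
windows = np.lib.stride_tricks.sliding_window_view(values, 3)
moving_avg = windows @ weights

print(next_week)
print(late_week_mean, moving_avg)
```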



<p>NumPy documentation and tutorials:</p>



<section id="note-block_8b1fd8d5171591c30ce0758b037938dd"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                
                
                <div class="c-item__content">

                                            <p>  <a href="https://numpy.org/" target="_blank" rel="noreferrer noopener nofollow">NumPy website</a></p>
<p>  <a href="https://www.w3schools.com/python/numpy/numpy_intro.asp" target="_blank" rel="noreferrer noopener nofollow">W3Schools: NumPy introduction</a></p>
                                    </div>

            </div>
            </div>


</section>



<h4 class="wp-block-heading">Time series projects with Datetime</h4>



<p>Datetime is a module in the Python standard library that allows us to work with dates and times. It contains the methods and functions required to handle tasks such as:</p>



<ul class="wp-block-list">
<li>Representation of dates and times</li>



<li>Arithmetic of dates and times</li>



<li>Comparison of dates and times</li>
</ul>



<p>Working with time series is simple using this module. It allows users to turn dates and times into objects and manipulate them. For example, with only a few lines of code, we can convert from one date-time format to another, add a number of days, hours, or weeks to a date, or calculate the difference in seconds between two time objects.</p>
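<p>These operations could look like the following sketch. The timestamp and the format strings are arbitrary examples.</p>

```python
from datetime import datetime, timedelta

# Representation: parse a string into a datetime object.
start = datetime.strptime("2022-10-20 14:43:45", "%Y-%m-%d %H:%M:%S")

# Conversion: render the same moment in a different format.
as_text = start.strftime("%d/%m/%Y %H:%M")

# Arithmetic: timedelta works in fixed units such as days, hours, and weeks
# (it has no month or year unit).
later = start + timedelta(days=10, hours=6)

# Comparison, and the difference in seconds between two datetime objects.
is_later = later > start
elapsed_seconds = (later - start).total_seconds()

print(as_text, later, is_later, elapsed_seconds)
```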



<p>Useful documentation for getting started with this module:</p>



<section id="note-block_8973955b0e76fb24c5b276fd126650dd"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                
                
                <div class="c-item__content">

                                            <p>  <a href="https://www.tutorialspoint.com/python_data_science/python_date_and_time.htm" target="_blank" rel="noreferrer noopener nofollow">Tutorials point: Python, date and time</a></p>
<p>  <a href="https://docs.python.org/3/library/datetime.html#module-datetime" target="_blank" rel="noreferrer noopener nofollow">Python documentation (datetime)</a></p>
                                    </div>

            </div>
            </div>


</section>



<h4 class="wp-block-heading">Time series projects with Tsfresh</h4>



<p>Tsfresh is a Python package that automatically calculates a large number of time series characteristics, known as features. The package combines established algorithms from statistics, time series analysis, signal processing, and non-linear dynamics with a robust feature selection algorithm to provide systematic time series feature extraction.</p>



<p>The Tsfresh package also includes a filtering procedure that discards irrelevant features. This procedure assesses each feature&#8217;s explanatory power and significance for the regression or classification task at hand.</p>



<p>Some examples of advanced time series features are:</p>



<ul class="wp-block-list">
<li>Fourier transform components</li>



<li>Wavelet transform</li>



<li>Partial autocorrelation and others</li>
</ul>
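<p>To give a feeling for what such features are, the sketch below computes a few of them by hand with NumPy. This is not the Tsfresh API. It only illustrates the kind of quantities the package calculates automatically and at much larger scale. The series is synthetic, and the helper function <code>autocorrelation</code> is a hypothetical name introduced for this example.</p>

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Synthetic series: a sine wave with a period of 20 samples plus a little noise.
n = 200
t = np.arange(n)
x = np.sin(2 * np.pi * t / 20) + rng.normal(0, 0.1, size=n)

# Fourier transform components: magnitudes of the real FFT coefficients.
fft_magnitude = np.abs(np.fft.rfft(x - x.mean()))
dominant_bin = int(np.argmax(fft_magnitude))  # index of the strongest frequency
dominant_period = n / dominant_bin            # expressed in samples


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (hypothetical helper)."""
    centered = series - series.mean()
    return float(np.dot(centered[:-lag], centered[lag:]) / np.dot(centered, centered))


# One row of features describing the whole series.
features = {
    "mean": float(x.mean()),
    "std": float(x.std()),
    "dominant_period": dominant_period,
    "autocorr_lag_1": autocorrelation(x, 1),
    "autocorr_lag_10": autocorrelation(x, 10),
}
print(features)
```

<p>Each series is reduced to a single row of numbers, which is what allows a standard regression or classification model to work on time series data.</p>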



<p>More about the Tsfresh package can be found below:</p>



<section id="note-block_6b373b5673398287156657a095eadb90"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://tsfresh.readthedocs.io/en/latest/index.html">Tsfresh documentation </a></p>
                                    </div>

            </div>
            </div>


</section>



<h3 class="wp-block-heading" id="h-data-analysis-and-visualization-packages-for-time-series">Data analysis and visualization packages for time series</h3>



<p>Data analysis and visualization packages are tools that help data analysts to create graphs and charts from their data. Data analysis is defined as the process of cleaning, transforming, and modelling data in order to uncover useful information for business decisions. The goal of data analysis is to extract useful information from data and make decisions based on that information.</p>



<p>The graphical representation of data is known as data visualization. Data visualization tools, which use visual elements such as charts and graphs, provide an easy way to see and understand trends and patterns in data.</p>



<p>There is a wide range of data analysis and visualization packages for time series and we’ll go through a few of them.</p>



<h4 class="wp-block-heading">Time series projects with Matplotlib</h4>



<p>Probably the most popular Python package for data visualization is Matplotlib. It’s used for creating static, animated, and interactive visualizations. With Matplotlib, it’s possible to:</p>



<ul class="wp-block-list">
<li>Produce plots suitable for publication</li>



<li>Create interactive figures that can be zoomed in, panned, and updated</li>



<li>Change the visual style and layout</li>
</ul>



<p>It also provides a variety of options for drawing time series charts. You can find out more at the link below:</p>



<section id="note-block_b42cd80472c6d64bd4a640e6add56f35"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://matplotlib.org/" target="_blank" rel="noreferrer noopener nofollow">Matplotlib website</a></p>
                                    </div>

            </div>
            </div>


</section>
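<p>A basic time series chart in Matplotlib could be produced as in the sketch below. The data is a synthetic random walk, and the output file name <code>time_series.png</code> is an arbitrary choice.</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=5)

# Synthetic daily series for illustration only.
dates = np.arange("2022-01-01", "2022-07-01", dtype="datetime64[D]")
values = np.cumsum(rng.normal(0, 1, size=dates.size))

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(dates, values, label="synthetic series")

# One tick per month, formatted with an abbreviated month name and the year.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))

ax.set_xlabel("Date")
ax.set_ylabel("Value")
ax.legend()
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
fig.savefig("time_series.png", dpi=150)
```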



<div id="separator-block_76addb181febf0cb05ce8a0ae2d6945a"
         class="block-separator block-separator--15">
</div>


<div class="wp-block-image">
<figure class="aligncenter size-large"><a href="https://neptune.ai/time-series-projects-tools-packages-and-libraries-that-can-help_9" target="_blank" rel="noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Projects-Tools-Packages-and-Libraries-That-Can-Help_9.png?ssl=1" alt="Example of the Matplotlib chart with Time-Series" class="wp-image-64570"/></a><figcaption class="wp-element-caption"><em>Example of the Matplotlib chart with time series | Source: Author</em></figcaption></figure>
</div>


<h4 class="wp-block-heading">Time series projects with Plotly</h4>



<p>Plotly is an interactive, open-source, and browser-based graphing library for Python and R. It’s a high-level, declarative charting library with over 30 chart types, including scientific charts, 3D graphs, statistical charts, SVG maps, financial charts, and more.</p>



<p>Besides that, with Plotly it’s possible to draw interactive time series charts such as line charts, Gantt charts, and scatter plots. You can find out more about this package in the documentation:</p>



<section id="note-block_2410bb4d4b6a810773b0ab542225b90c"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://plotly.com/python/time-series/" target="_blank" rel="noreferrer noopener nofollow">Plotly documentation</a></p>
                                    </div>

            </div>
            </div>


</section>



<div id="separator-block_deb55284b8977af1ef3f9bd0734d0115"
         class="block-separator block-separator--5">
</div>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://neptune.ai/time-series-projects-tools-packages-and-libraries-that-can-help_1" target="_blank" rel="noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Projects-Tools-Packages-and-Libraries-That-Can-Help_1.png?ssl=1" alt="Example of the Plotly chart with Time-Series" class="wp-image-64578" style="width:814px;height:595px"/></a><figcaption class="wp-element-caption"><em>Example of the Plotly chart with time series | </em><a href="https://willkoehrsen.github.io/python/data%20visualization/introduction-to-interactive-time-series-visualizations-with-plotly-in-python/" target="_blank" rel="noreferrer noopener nofollow"><em>Source</em></a></figcaption></figure>
</div>


<h4 class="wp-block-heading">Time series projects with Statsmodels</h4>



<p>Statsmodels is a Python package that provides classes and functions for estimating a wide range of statistical models, as well as running statistical tests and statistical data analysis.</p>



<p>We’ll cover this library in more detail in the section about forecasting, but it’s worth mentioning here that it provides a very convenient method for time series decomposition and its visualization. With this package, we can easily decompose a time series and analyze its components: trend, seasonality, and residual (noise). More about that is described in the tutorial:</p>



<section id="note-block_98c0633a671eaecc34898f6d0d22ebc7"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://machinelearningmastery.com/decompose-time-series-data-trend-seasonality/" target="_blank" rel="noreferrer noopener nofollow">How to decompose time series data into trend and seasonality</a></p>
                                    </div>

            </div>
            </div>


</section>



<h3 class="wp-block-heading" id="h-experiment-tracking-tools-for-time-series">Experiment tracking tools for time series</h3>



<p>Experiment tracking tools are usually high-level tools that serve a variety of purposes, such as tracking the results of experiments, comparing runs to see how changing parameters affects the outcome, and managing models.</p>



<p>They are typically more user-friendly than low-level packages and can save a significant amount of time when developing machine learning models. Only two of them are mentioned here, as they are among the most popular ones.</p>



<p>For time series, it&#8217;s especially important to have a convenient environment for tracking metrics and hyperparameters, since we will most likely need to run a lot of different experiments. Time series models are usually small compared to, for example, convolutional neural networks, and take only a few hundred or a few thousand numerical values as input, so they train quickly. On the other hand, they often require quite some time for hyperparameter tuning.</p>



<p>Finally, it is very beneficial to bring models from different packages, as well as visualization tools, together in one place.</p>
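<p>To make the idea concrete, here is a plain-Python sketch of the kind of record an experiment tracker keeps: hyperparameters and metrics for each run. It does not use the API of any tracking tool; the moving-average baseline, the <em>window</em> hyperparameter, and the structure of the <em>runs</em> list are all hypothetical choices made for illustration.</p>

```python
import numpy as np

# Synthetic series with a 7-step cycle (illustrative data only)
rng = np.random.default_rng(1)
t = np.arange(300)
y = np.sin(2 * np.pi * t / 7) + rng.normal(0, 0.2, t.size)
train, test = y[:250], y[250:]

def moving_average_forecast(history, window, horizon):
    """Hypothetical baseline: repeat the mean of the last `window` points."""
    return np.full(horizon, history[-window:].mean())

# Each run stores its hyperparameters and the resulting metric, which is
# the kind of record an experiment tracker keeps for you.
runs = []
for window in (3, 7, 14, 28):
    forecast = moving_average_forecast(train, window, test.size)
    mae = float(np.mean(np.abs(test - forecast)))
    runs.append({"params": {"window": window}, "metrics": {"mae": mae}})

best_run = min(runs, key=lambda run: run["metrics"]["mae"])
print(best_run)
```

<p>With dozens of models and hundreds of hyperparameter combinations, keeping such records by hand quickly becomes impractical, which is where a dedicated tool helps.</p>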



<h4 class="wp-block-heading">Time series projects with neptune.ai</h4>



<p><a href="/" target="_blank" rel="noreferrer noopener">neptune.ai</a> is an experiment tracker designed with a <a href="https://neptune.ai/product/team-collaboration">strong focus on collaboration</a> and scalability. It lets you monitor months-long model training, track massive amounts of data, and compare thousands of metrics in the blink of an eye. The tool is known for its user-friendly interface and flexibility, enabling teams to adopt it into their existing workflows with minimal disruption. Neptune gives users a lot of freedom when defining data structures and tracking metadata.</p>



<p>Data scientists and ML/AI researchers can log, store, organize, display, compare, and query <a href="https://docs.neptune.ai/log_metadata" target="_blank" rel="noreferrer noopener">all their model-building metadata</a> in a single place. Neptune handles data such as model metrics and parameters, model checkpoints, images, videos, audio files, dataset versions, and visualizations. Time series are no exception, and any time series project can be tracked in Neptune.</p>



<div id="app-screenshot-block_1a20a53f8a99abfac88f35d5ee959149"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2024/11/Reporting.png?fit=1020%2C577&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2024/11/Reporting.png?fit=480%2C271&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2024/11/Reporting.png?fit=768%2C434&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2024/11/Reporting.png?fit=1020%2C577&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="577"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://scale.neptune.ai/o/examples/org/LLM-Pretraining/reports/9e6a2cad-77e7-42df-9d64-28f07d37e908"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in the app							</span>
						
					</a>

				
														<button
						class="js-c-image-full-screen-modal c-button c-button--tertiary c-button--small">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-zoom.svg"
							width="16"
							height="17"
							class="c-button__icon"
							alt="zoom"
						/>

						<span class="c-button__text">
							Full screen preview						</span>
						
					</button>
									
			</div>

		</div>

					<figcaption class="block-app-screenshot__caption">
				All metadata in a single place with an experiment tracker (example in neptune.ai)			</figcaption>
			
</div>



<h4 class="wp-block-heading">Time series projects with&nbsp;Weights &amp; Biases</h4>



<p>Weights &amp; Biases (W&amp;B) is a machine learning platform, similar to neptune.ai, aimed at developers to help them build better models faster. It’s intended to support and optimize key MLOps life cycle steps such as model management, experiment tracking, and dataset versioning.</p>



<p>Like neptune.ai, this tool can be useful when working on time series projects, providing features for tracking and managing time series models. More about Weights &amp; Biases is presented in their <a href="https://docs.wandb.ai/" target="_blank" rel="noreferrer noopener nofollow">documentation</a>.</p>



<div id="separator-block_e304094759d0a0c7cf2bcf596b48c813"
         class="block-separator block-separator--10">
</div>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><a href="https://neptune.ai/time-series-projects-tools-packages-and-libraries-that-can-help_6"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Projects-Tools-Packages-and-Libraries-That-Can-Help_6.png?ssl=1" alt="ML experiment tracking with Weights and Biases" class="wp-image-64573" style="width:811px;height:501px"/></a><figcaption class="wp-element-caption"><em>ML experiment tracking with Weights and Biases | </em><a href="https://towardsdatascience.com/a-guide-to-ml-experiment-tracking-with-weights-biases-93a3a2544413" target="_blank" rel="noreferrer noopener nofollow"><em>Source</em></a></figcaption></figure>
</div>

    <a
        href="/vs/wandb"
        id="cta-box-related-link-block_96c855fe1c7e66fea6ee5d26262b9dce"
        class="block-cta-box-related-link  l-margin__top--standard l-margin__bottom--standard"
        target="_blank" rel="nofollow noopener noreferrer"    >

    
    <div class="block-cta-box-related-link__description-wrapper block-cta-box-related-link__description-wrapper--full">

        
            <div class="c-eyebrow">

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-related--resource.svg"
                    loading="lazy"
                    decoding="async"
                    width="16"
                    height="16"
                    alt=""
                    class="c-eyebrow__icon">

                <div class="c-eyebrow__text">
                    Learn more                </div>
            </div>

        
                    <h3 class="c-header" id="h-comparison-between-weights-biases-and-neptune-ai">                 Comparison Between Weights &#038; Biases and neptune.ai            </h3>        
                    <div class="c-button c-button--tertiary c-button--small">

                <span class="c-button__text">
                    Compare                </span>

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-button-arrow-right.svg"
                    loading="lazy"
                    decoding="async"
                    width="12"
                    height="12"
                    alt=""
                    class="c-button__arrow">

            </div>
            </div>

    </a>



<h3 class="wp-block-heading" id="h-time-series-forecasting-packages">Time series forecasting packages</h3>



<p>Forecasting is probably the most important part of a time series project. It is the process of predicting future events based on current and past data. It rests on the assumption that the future can be inferred from the past, that is, that there are patterns in the data that can be used to predict what will happen next.</p>



<p>There are many methods for time series forecasting, ranging from simple ones such as linear regression and ARIMA-based models to complex multilayer neural networks and ensemble models. Here, we’ll present some packages that support different kinds of models.</p>


    <a
        href="/blog/arima-sarima-real-world-time-series-forecasting-guide"
        id="cta-box-related-link-block_c293bd20700fddea7a6f6ce953ff74dc"
        class="block-cta-box-related-link  l-margin__top--standard l-margin__bottom--standard"
        target="_blank" rel="nofollow noopener noreferrer"    >

    
    <div class="block-cta-box-related-link__description-wrapper block-cta-box-related-link__description-wrapper--full">

        
            <div class="c-eyebrow">

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-related--article.svg"
                    loading="lazy"
                    decoding="async"
                    width="16"
                    height="16"
                    alt=""
                    class="c-eyebrow__icon">

                <div class="c-eyebrow__text">
                    Related                </div>
            </div>

        
                    <h3 class="c-header" id="h-arima-sarima-real-world-time-series-forecasting">                ARIMA &#038; SARIMA: Real-World Time Series Forecasting            </h3>        
                    <div class="c-button c-button--tertiary c-button--small">

                <span class="c-button__text">
                    Read more                </span>

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-button-arrow-right.svg"
                    loading="lazy"
                    decoding="async"
                    width="12"
                    height="12"
                    alt=""
                    class="c-button__arrow">

            </div>
            </div>

    </a>



<h4 class="wp-block-heading">Time series forecasting with Statsmodels</h4>



<p>Statsmodels is a package that we’ve already mentioned in the section about data visualization tools. However, it is even more relevant for forecasting, since it provides a range of statistical models and hypothesis tests.</p>



<p>The Statsmodels package also includes model classes and functions for time series analysis. Basic models include autoregressive moving average (ARMA) and vector autoregressive (VAR) models. Non-linear models include Markov switching dynamic regression and autoregression. It also includes descriptive statistics for time series, such as the autocorrelation function, partial autocorrelation function, and periodogram, as well as the theoretical properties of ARMA and related processes.</p>



<p>How to get started with time series using the Statsmodels package is described below:</p>



<section id="note-block_dc6b998b855dfe02192b1d659d1da449"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://www.statsmodels.org/stable/tsa.html" target="_blank" rel="noreferrer noopener nofollow">Statsmodels documentation</a></p>
                                    </div>

            </div>
            </div>


</section>



<h4 class="wp-block-heading">Time series forecasting with Pmdarima</h4>



<p>Pmdarima is a statistical library that facilitates the modelling of time series using ARIMA-based methods. Aside from that, it has other features such as:</p>



<ul class="wp-block-list">
<li>A set of statistical tests for stationarity and seasonality</li>



<li>Various endogenous and exogenous transformers including Box-Cox and Fourier transformations</li>



<li>Decompositions of seasonal time series, cross-validation utilities, and other tools</li>
</ul>



<p>Perhaps the most useful utility of this library is the auto-ARIMA module, which searches over candidate ARIMA models within the constraints provided and returns the best one according to an information criterion such as AIC or BIC.</p>



<p>More about Pmdarima is presented here:</p>



<section id="note-block_d4c107ba22ef1dc54de9ef193736eaca"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://alkaline-ml.com/pmdarima/" target="_blank" rel="noreferrer noopener nofollow">Pmdarima: ARIMA estimators for Python</a></p>
<p>  <a href="https://pypi.org/project/pmdarima/" target="_blank" rel="noreferrer noopener nofollow">Pmdarima PyPI</a></p>
                                    </div>

            </div>
            </div>


</section>



<h4 class="wp-block-heading">Time series forecasting with Sklearn</h4>



<p>Sklearn, or scikit-learn, is one of the most commonly used machine learning packages in Python. It provides various classification, regression, and clustering methods including random forest, support vector machine, k-means, and others. Besides that, it provides utilities for dimensionality reduction, model selection, data preprocessing, and much more.</p>



<p>In addition to various models, it offers functionality that is useful for time series, such as pipelines, time series cross-validation, and a diverse set of metrics for measuring results.</p>



<div id="separator-block_e304094759d0a0c7cf2bcf596b48c813"
         class="block-separator block-separator--10">
</div>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><a href="https://neptune.ai/time-series-projects-tools-packages-and-libraries-that-can-help_5" target="_blank" rel="noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Projects-Tools-Packages-and-Libraries-That-Can-Help_5.png?ssl=1" alt="Time-Series split using Sklearn" class="wp-image-64574" style="width:712px;height:290px"/></a><figcaption class="wp-element-caption"><em>Time series split using Sklearn | </em><a href="https://datascience.stackexchange.com/questions/41378/how-to-apply-stacking-cross-validation-for-time-series-data" target="_blank" rel="noreferrer noopener nofollow"><em>Source</em></a></figcaption></figure>
</div>
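<p>The time series split shown above can be sketched with scikit-learn&#8217;s TimeSeriesSplit. The synthetic series, the three lag features, and the Ridge model are assumptions chosen only to keep the example short.</p>

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import TimeSeriesSplit

# Synthetic series turned into a supervised problem with 3 lag features (illustrative)
rng = np.random.default_rng(0)
y = np.sin(np.arange(203) / 5) + rng.normal(0, 0.1, 203)
n_lags = 3
X = np.column_stack([y[i : len(y) - n_lags + i] for i in range(n_lags)])
target = y[n_lags:]

# Each split trains on the past and validates on the block that follows it
scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = Ridge(alpha=1.0).fit(X[train_idx], target[train_idx])
    preds = model.predict(X[test_idx])
    scores.append(mean_absolute_error(target[test_idx], preds))
print(scores)
```

<p>Unlike a shuffled k-fold split, every validation block here lies strictly after its training data, so the model is never evaluated on observations that precede what it was trained on.</p>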


<p>More about this library can be found below:</p>



<section id="note-block_c5a7e9efa1b0a1d14c4d86e6d09a10ad"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://scikit-learn.org/stable/index.html" target="_blank" rel="noreferrer noopener nofollow">Scikit-learn website</a></p>
<p><a href="https://www.tutorialspoint.com/scikit_learn/index.htm" target="_blank" rel="noreferrer noopener nofollow">Tutorials Point: Scikit-learn</a></p>
                                    </div>

            </div>
            </div>


</section>



<h4 class="wp-block-heading">Time series forecasting with PyTorch</h4>



<p>PyTorch is a Python-based deep learning library for fast and flexible experimentation. It was originally developed by researchers and engineers on Facebook’s AI Research team and then open-sourced. Deep learning software such as Tesla Autopilot, Uber&#8217;s Pyro, and Hugging Face&#8217;s Transformers is built on top of PyTorch.</p>



<p>With PyTorch, it’s possible to build powerful recurrent neural network models such as LSTM and GRU to forecast time series. There is also the PyTorch Forecasting package, which offers state-of-the-art network architectures and a time series dataset class that abstracts away variable transformations, missing values, randomized subsampling, multiple history lengths, and similar issues. More about this is presented below:</p>



<section id="note-block_5374645b5d9b70b187510e992d261ec7"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://github.com/jdb78/pytorch-forecasting" target="_blank" rel="noreferrer noopener nofollow">Github: PyTorch forecasting</a></p>
<p>  <a href="https://pytorch.org/" target="_blank" rel="noreferrer noopener nofollow">PyTorch website</a></p>
                                    </div>

            </div>
            </div>


</section>
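<p>A minimal sketch of an LSTM forecaster in plain PyTorch, trained on sliding windows over a sine wave. The class name <em>LSTMForecaster</em>, the window length, and the hyperparameters are hypothetical choices for illustration, and the example does not use the PyTorch Forecasting package.</p>

```python
import torch
from torch import nn

torch.manual_seed(0)

# Sliding windows over a sine wave: 20 past values predict the next one (illustrative)
series = torch.sin(torch.arange(0, 200, dtype=torch.float32) * 0.1)
window = 20
X = torch.stack([series[i : i + window] for i in range(len(series) - window)]).unsqueeze(-1)
y = series[window:].unsqueeze(-1)

class LSTMForecaster(nn.Module):
    """Hypothetical one-step-ahead forecaster: LSTM encoder plus linear head."""

    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)          # out: (batch, window, hidden)
        return self.head(out[:, -1])   # predict from the last time step

model = LSTMForecaster()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Full-batch training loop; real projects would use mini-batches and a validation set
losses = []
for epoch in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
print(losses[0], losses[-1])
```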



<h4 class="wp-block-heading">Time series forecasting with TensorFlow (Keras)</h4>



<p>TensorFlow is an open-source software library for machine learning, based on data flow graphs. It was originally developed by the Google Brain team for internal use, but later it was released as an open-source project. The software library provides a set of high-level data flow operators that can be combined to express complex computations involving multidimensional data arrays, matrices, and higher-order tensors in a natural way. It also provides some lower-level primitives such as kernels that are used to construct custom operators or to speed up the execution of common operations.</p>



<p>Keras is a high-level API built on top of TensorFlow. With Keras and TensorFlow, it is possible to build neural network models for time series forecasting. One example of a time series project that uses a weather time series dataset is explained in the tutorial below:</p>



<section id="note-block_5f003ccea3fc9aaee191e988e15ee7f9"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://www.tensorflow.org/tutorials/structured_data/time_series" target="_blank" rel="noreferrer noopener nofollow">TensorFlow time series tutorial</a></p>
                                    </div>

            </div>
            </div>


</section>
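<p>As a much smaller sketch than the tutorial above, the snippet below builds a Keras LSTM that predicts the next value of a sine wave from the previous 24 values. The data, window length, and layer sizes are illustrative assumptions.</p>

```python
import numpy as np
from tensorflow import keras

keras.utils.set_random_seed(0)

# Sliding windows over a sine wave (illustrative data only)
series = np.sin(np.arange(300) * 0.1).astype("float32")
window = 24
X = np.stack([series[i : i + window] for i in range(len(series) - window)])[..., None]
y = series[window:]

# One LSTM layer followed by a single-unit regression head
model = keras.Sequential([
    keras.layers.Input(shape=(window, 1)),
    keras.layers.LSTM(16),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
history = model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# One-step-ahead prediction from the most recent window
next_value = model.predict(X[-1:], verbose=0)
print(next_value)
```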



<h4 class="wp-block-heading">Time series forecasting with Sktime</h4>



<p>Sktime is an open-source Python library for time series and machine learning. It includes the algorithms and transformation tools needed to solve time series regression, forecasting, and classification tasks efficiently. Sktime was created to work with scikit-learn and make it easy to adapt algorithms for interrelated time series tasks as well as build composite models.</p>



<p>Overall, this package provides:</p>



<ul class="wp-block-list">
<li>State-of-the-art algorithms for time series forecasting</li>



<li>Transformations for time series, such as detrending and deseasonalization</li>



<li>Pipelines for models and transformations, model tuning utilities, and other useful functionalities</li>
</ul>



<p>How to get started with this library is described here:</p>



<section id="note-block_d6a83582c2671191633eee2ee6dbf820"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://www.sktime.org/en/stable/" target="_blank" rel="noreferrer noopener nofollow">Sktime documentation</a></p>
                                    </div>

            </div>
            </div>


</section>



<h4 class="wp-block-heading">Time series forecasting with Prophet</h4>



<p>Prophet is an open-source library released by Facebook&#8217;s Core Data Science team. Briefly, it implements a procedure for forecasting time series data based on an additive model in which non-linear trends are fit together with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. It is robust to missing data and shifts in the trend, and typically handles outliers well.</p>



<p>More about the Prophet library is presented below:</p>



<section id="note-block_14dff53c1bc2b2e117462c538a3e0cb7"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p><a href="https://github.com/facebook/prophet" target="_blank" rel="noreferrer noopener nofollow">GitHub: Facebook Prophet</a></p>
                                    </div>

            </div>
            </div>


</section>



<h4 class="wp-block-heading">Time series forecasting with PyCaret</h4>



<p>PyCaret is an open-source machine learning library in Python that automates machine learning workflows. With PyCaret it’s possible to build and test several machine learning models with minimal effort and a few lines of code.</p>



<p>Basically, with minimal code and without going deep into the details, it&#8217;s possible to build an end-to-end machine learning project from EDA to deployment.</p>



<p>This library includes some useful time series models, among them:</p>



<ul class="wp-block-list">
<li>Seasonal Naive Forecaster</li>



<li>ARIMA</li>



<li>Polynomial Trend Forecaster</li>



<li>Lasso Net with deseasonalizing and detrending options, and many others</li>
</ul>



<div id="separator-block_76addb181febf0cb05ce8a0ae2d6945a"
         class="block-separator block-separator--15">
</div>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://neptune.ai/time-series-projects-tools-packages-and-libraries-that-can-help_7" target="_blank" rel="noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Projects-Tools-Packages-and-Libraries-That-Can-Help_7.png?ssl=1" alt="Anomaly detection using PyCaret" class="wp-image-64572" style="width:810px;height:407px"/></a><figcaption class="wp-element-caption"><em>Anomaly detection using PyCaret | <a href="https://towardsdatascience.com/time-series-anomaly-detection-with-pycaret-706a6e2b2427" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<p>More about PyCaret can be found here:</p>



<section id="note-block_589824628ba3ce432542f695345f3a1a"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://pycaret.readthedocs.io/en/time_series/api/time_series.html" target="_blank" rel="noreferrer noopener nofollow">PyCaret documentation</a></p>
<p>  <a href="https://towardsdatascience.com/new-time-series-with-pycaret-4e8ce347556a" target="_blank" rel="noreferrer noopener nofollow">New time series with PyCaret</a></p>
                                    </div>

            </div>
            </div>


</section>



<h4 class="wp-block-heading">Time series forecasting with AutoTS</h4>



<p>AutoTS is a time series package for Python, designed to automate time series forecasting. It can be used to find the best forecasting model for both univariate and multivariate time series. AutoTS can also clean the data itself, handling NaN values and outliers.</p>



<p>Nearly 20 predefined models, such as ARIMA, ETS, and VECM, are available. Using genetic algorithms, AutoTS finds the best models, preprocessing, and ensembling for a given dataset.</p>



<p>Some tutorials about this package are:</p>



<section id="note-block_d301343c60adbe0379e307190075ebd5"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://github.com/winedarksea/AutoTS" target="_blank" rel="noreferrer noopener nofollow">Github: AutoTS</a></p>
<p>  <a href="https://analyticsindiamag.com/hands-on-guide-to-autots-effective-model-selection-for-multiple-time-series/" target="_blank" rel="noreferrer noopener nofollow">Hands-on guide to AutoTS: effective model selection for multiple time series</a></p>
                                    </div>

            </div>
            </div>


</section>
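<p>To make the idea of automated model selection more concrete, here is a minimal pure-Python sketch. It is not the AutoTS API and it does not use a genetic search: it simply scores a few baseline forecasters on a holdout window and keeps the one with the lowest mean absolute error. All function names and the toy data are assumptions made for illustration.</p>

```python
from statistics import mean


def naive_forecast(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def mean_forecast(history, horizon):
    """Repeat the mean of the whole history."""
    return [mean(history)] * horizon


def make_seasonal_naive(period):
    """Build a forecaster that repeats the last full season."""
    def seasonal_naive_forecast(history, horizon):
        last_season = history[-period:]
        return [last_season[i % period] for i in range(horizon)]
    return seasonal_naive_forecast


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def select_best_model(series, candidates, holdout):
    """Score each candidate on the last `holdout` points and pick the best."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {
        name: mae(test, model(train, holdout))
        for name, model in candidates.items()
    }
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Toy weekly pattern repeated six times (assumed data, for illustration only).
series = [10, 12, 14, 13, 11, 20, 22] * 6
candidates = {
    "naive": naive_forecast,
    "mean": mean_forecast,
    "seasonal_naive_7": make_seasonal_naive(7),
}
best_name, scores = select_best_model(series, candidates, holdout=7)
print(best_name, scores)
```

<p>On this toy series, the seasonal baseline reproduces the holdout week exactly, so it is selected. A tool like AutoTS searches a much larger space that also includes preprocessing steps and ensembles.</p>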



<h4 class="wp-block-heading">Time series forecasting with Darts</h4>



<p>Darts is a Python library for easy manipulation and forecasting of time series. It includes a wide range of models, from classics like exponential smoothing (ES) and ARIMA to RNNs and transformers. All models are used in the same way, through fit() and predict() functions, similar to scikit-learn.</p>



<p>The library also makes it easy to backtest models, combine predictions from multiple models, and incorporate external data. It supports both univariate and multivariate models. A table of all available models, along with several examples, can be found here:</p>



<section id="note-block_4046279919ea7377bf854d5ff556ad1e"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://unit8co.github.io/darts/" target="_blank" rel="noreferrer noopener nofollow">Darts documentation</a></p>
                                    </div>

            </div>
            </div>


</section>
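<p>The fit/predict interface, backtesting, and combining predictions mentioned above can be sketched in plain Python. This is not Darts code: the classes NaiveLast, NaiveMean, and AverageEnsemble and the backtest function are hypothetical names, and the series is toy data. The sketch only shows how a shared interface makes expanding-window backtesting and simple ensembling straightforward.</p>

```python
from statistics import mean


class NaiveLast:
    """Forecast the last observed value (scikit-learn-style fit/predict)."""

    def fit(self, series):
        self.last_ = series[-1]
        return self

    def predict(self, n):
        return [self.last_] * n


class NaiveMean:
    """Forecast the mean of the training series."""

    def fit(self, series):
        self.mean_ = mean(series)
        return self

    def predict(self, n):
        return [self.mean_] * n


class AverageEnsemble:
    """Combine several models by averaging their forecasts."""

    def __init__(self, models):
        self.models = models

    def fit(self, series):
        for model in self.models:
            model.fit(series)
        return self

    def predict(self, n):
        forecasts = [model.predict(n) for model in self.models]
        return [mean(step) for step in zip(*forecasts)]


def backtest(model, series, start, horizon=1):
    """Expanding-window backtest: refit at each origin, return the mean absolute error."""
    errors = []
    for origin in range(start, len(series) - horizon + 1):
        forecast = model.fit(series[:origin]).predict(horizon)
        actual = series[origin:origin + horizon]
        errors.extend(abs(a - f) for a, f in zip(actual, forecast))
    return mean(errors)


series = [1, 2, 3, 4, 5, 6, 7, 8]  # toy upward trend (assumed data)
results = {
    "last": backtest(NaiveLast(), series, start=4),
    "mean": backtest(NaiveMean(), series, start=4),
    "ensemble": backtest(AverageEnsemble([NaiveLast(), NaiveMean()]), series, start=4),
}
print(results)
```

<p>Because every model exposes the same two methods, the backtest function does not need to know which model it evaluates, and the ensemble can be backtested exactly like a single model.</p>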



<h4 class="wp-block-heading">Time series forecasting with Kats</h4>



<p>Kats is a package released by Facebook&#8217;s Infrastructure Data Science team, intended to perform time series analysis. The goal of this package is to provide everything needed for time series analysis, including detection, forecasting, feature extraction/embedding, multivariate analysis, and so on.</p>



<p>Kats provides a comprehensive set of forecasting tools, such as ensembling, meta-learning models, backtesting, hyperparameter tuning, and empirical prediction intervals. It also includes functionality for detecting seasonality, outliers, change points, and slow trend changes in time series data. With the TsFeatures module, it’s possible to generate 65 features with clear statistical definitions that can be used in most machine learning models.</p>



<p>More about the Kats package can be found here:</p>



<section id="note-block_c3b5ded000b2685ee81baab2a0a255b5"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                                    <img
                        alt=""
                        class="c-item__arrow"
                        src="https://neptune.ai/wp-content/themes/neptune/img/blocks/note/list-arrow.svg"
                        loading="lazy"
                        decoding="async"
                        width="12"
                        height="10"
                    />
                
                <div class="c-item__content">

                                            <p>  <a href="https://github.com/facebookresearch/Kats" target="_blank" rel="noreferrer noopener nofollow">GitHub: Kats</a></p>
<p>  <a href="https://facebookresearch.github.io/Kats/" target="_blank" rel="noreferrer noopener nofollow">Kats website</a></p>
                                    </div>

            </div>
            </div>


</section>
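<p>As a rough illustration of what outlier detection on a series involves, here is a small pure-Python sketch based on the modified z-score (median absolute deviation). It is not one of the detectors shipped with Kats; the function name mad_outliers, the threshold of 3.5, and the data are assumptions chosen for the example.</p>

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds `threshold`.

    Uses the median absolute deviation (MAD), which is robust to the
    outliers themselves; 0.6745 rescales MAD so that the score is
    comparable to a standard z-score for normally distributed data.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread around the median, nothing to score against
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - center) / mad) > threshold
    ]


# Toy series with one injected spike at index 5 (assumed data).
values = [10, 11, 9, 10, 12, 50, 11, 10, 9, 11]
print(mad_outliers(values))
```

<p>This global check ignores trend and seasonality, so on real data it would typically be applied to residuals after those components have been removed.</p>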



<h3 class="wp-block-heading" class="wp-block-heading" id="h-forecasting-libraries-comparison">Forecasting libraries comparison</h3>



<p>To compare the forecasting packages at a glance, the table below lists each library’s year of release, its approximate number of GitHub stars at the time of writing, and the kinds of models it supports (statistics and econometrics, machine learning, and deep learning).</p>



<div id="separator-block_e304094759d0a0c7cf2bcf596b48c813"
         class="block-separator block-separator--10">
</div>



<div id="medium-table-block_83886174f5c0f5ec8c2522c8de8690b4"
     class="block-medium-table c-table__outer-wrapper  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset">

    <table class="c-table">
                    <thead class="c-table__head">
            <tr>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            &nbsp;                        </div>
                    </td>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Year of release                        </div>
                    </td>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            GitHub stars                        </div>
                    </td>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Statistics &#038; econometrics                        </div>
                    </td>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Machine learning                        </div>
                    </td>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Deep learning                        </div>
                    </td>
                            </tr>
            </thead>
        
        <tbody class="c-table__body">

                    
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Statsmodels</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2010</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>7200</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">++</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                                                                                </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Pmdarima</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2018</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>1100</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                                                                                </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Sklearn</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2007</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>50000</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">++</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>PyTorch</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2016</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>55000</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                                                                                </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">++</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>TensorFlow</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2015</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>164000</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                                                                                </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">++</p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Sktime</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2019</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>5000</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                                                                                </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Prophet</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2017</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>14000</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                                                                                </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>PyCaret</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2020</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>5500</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>AutoTS</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2020</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>450</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                                                                                </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Darts</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2021</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>3800</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Kats</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2021</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>3600</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p style="text-align: center;">+</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                                                                                </div>
                        </td>

                    
                </tr>

                    
        </tbody>
    </table>

</div>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-conclusion">Conclusion</h2>



<p>In this post, we described the most commonly used tools, packages, and libraries for time series projects. Together, these tools cover almost any project related to time series. On top of that, we compared the forecasting libraries by year of release, popularity, and the kinds of models each one supports.</p>



<p>If you want to dive deeper into the area of time series, there is a curated collection of packages for processing time series: &#8220;<a href="https://github.com/MaxBenChrist/awesome_time_series_in_python" target="_blank" rel="noreferrer noopener">GitHub: using Python to work with time series data</a>&#8221;.</p>



<p>For those who would like to learn more about time series in general from a theoretical perspective, a great choice is the book &#8220;<a href="https://link.springer.com/book/10.1007/978-3-540-27752-1" target="_blank" rel="noreferrer noopener nofollow">New Introduction to Multiple Time Series Analysis</a>&#8221; by Prof. Dr. Helmut Lütkepohl.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">6741</post-id>	</item>
		<item>
		<title>ARIMA vs Prophet vs LSTM for Time Series Prediction</title>
		<link>https://neptune.ai/blog/arima-vs-prophet-vs-lstm</link>
		
		<dc:creator><![CDATA[Konstantin Kutzkov]]></dc:creator>
		<pubDate>Tue, 13 Sep 2022 14:28:34 +0000</pubDate>
				<category><![CDATA[ML Model Development]]></category>
		<category><![CDATA[Time Series]]></category>
		<guid isPermaLink="false">https://neptune.test/arima-vs-prophet-vs-lstm/</guid>

					<description><![CDATA[Assuming we subscribe to a linear understanding of time and causality, as Dr. Sheldon Cooper says, then representing historical events as a series of values and features observed over time provides the foundations for learning from the past. However, time series are somewhat different from other datasets, including sequential data like text or DNA sequences.&#8230;]]></description>
										<content:encoded><![CDATA[
<p>Assuming we subscribe to a linear understanding of time and causality, as Dr. Sheldon Cooper <a href="https://bigbangtheory.fandom.com/wiki/The_Financial_Permeability" target="_blank" rel="noreferrer noopener nofollow">says</a>, then representing historical events as a series of values and features observed over time provides the foundations for learning from the past. However, <a href="/blog/time-series-prediction-vs-machine-learning" target="_blank" rel="noreferrer noopener">time series are somewhat different from other datasets</a>, including sequential data like text or DNA sequences.</p>



<p>The time component provides additional information that can be useful when predicting the future. Thus, there are many different techniques designed specifically for dealing with time series. Such techniques range from simple visualization tools that show trends evolving or repeating over time to advanced machine learning models that utilize the specific structure of time series.</p>



<section id="blog-intext-cta-block_3374e8319b19995c269ed6f8e127f45f" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-check-also">Check also</h3>
    
            <p><img decoding="async" class="lazyload block-blog-intext-cta__arrow-image" src="https://neptune.ai/wp-content/themes/neptune/img/image-ratio-holder.svg" alt="" width="12" height="12" data-src="https://neptune.ai/wp-content/themes/neptune/img/icon-arrow--right-gray.svg" /> <a href="/blog/arima-sarima-real-world-time-series-forecasting-guide" target="_blank" rel="noopener">ARIMA &amp; SARIMA: Real-World Time Series Forecasting [Advanced Guide]</a></p>
<p><img decoding="async" class="lazyload block-blog-intext-cta__arrow-image" src="https://neptune.ai/wp-content/themes/neptune/img/image-ratio-holder.svg" alt="" width="12" height="12" data-src="https://neptune.ai/wp-content/themes/neptune/img/icon-arrow--right-gray.svg" /> <a href="/blog/select-model-for-time-series-prediction-task" target="_blank" rel="noopener">How to Select a Model For Your Time Series Prediction Task [Guide]</a></p>
    
    </section>



<p>In this post, we will discuss three popular approaches to learning from time-series data:</p>



<div id="case-study-numbered-list-block_05802973a1883a875e71bfb2d4b39c2e"
         class="block-case-study-numbered-list ">

    
    <h2 id="h-"></h2>

    <ul class="c-list">
                    <li class="c-list__item">
                <span class="c-list__counter">1</span>
                The classic ARIMA framework for time series prediction            </li>
                    <li class="c-list__item">
                <span class="c-list__counter">2</span>
                Facebook’s in-house model Prophet, which is specifically designed for learning from business time series            </li>
                    <li class="c-list__item">
                <span class="c-list__counter">3</span>
                The LSTM model, a powerful recurrent neural network approach that has been used to achieve the best-known results for many problems on sequential data<br />
            </li>
            </ul>
</div>



<p>We will then show how to compare the results across the three models using <a href="/" target="_blank" rel="noreferrer noopener">neptune.ai</a> and its powerful features.</p>



<p>Let’s start with a brief overview of the three methods.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-overview-of-the-three-methods-arima-prophet-and-lstm">Overview of the three methods: ARIMA, Prophet, and LSTM</h2>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-arima">ARIMA</h3>



<p>ARIMA is a class of time series prediction models, and the name is an abbreviation for AutoRegressive Integrated Moving Average. The backbone of ARIMA is a mathematical model that represents the time series values using its past values. This model is based on two main features:&nbsp;</p>



<ol class="wp-block-list">
<li><strong>Past Values</strong>: Clearly, past behaviour is a good predictor of the future. The only question is how many past values we should use. The model uses the last p time series values as features. Here p is a hyperparameter that needs to be determined when we design the model.</li>



<li><strong>Past Errors:</strong> The model can use the information on how well it has performed in the past. Thus, we add as features the most recent q errors the model made. Again, q is a hyperparameter.&nbsp;&nbsp;&nbsp;</li>
</ol>



<p>An important aspect here is that the time series needs to be transformed such that the model becomes independent of seasonal or temporary trends. The formal term for this is that we want the model to be trained on a <em>stationary</em> time series. In the most intuitive sense, stationarity means that the statistical properties of a process generating a time series do not change over time. It does not mean that the series does not change over time, just that the way it changes does not itself change over time.</p>



<p>There are several approaches to making a time series stationary, the most popular being differencing. By replacing the n values in the series with the n-1 differences between consecutive values, we remove the level of the series and force the model to learn how the values change. When the model predicts a new difference, we simply add the last observed value to it in order to obtain the final prediction. Stationarity can be somewhat confusing if you encounter the concept for the first time; you can refer to this <a href="https://machinelearningmastery.com/remove-trends-seasonality-difference-transform-python/" target="_blank" rel="noreferrer noopener nofollow">tutorial</a> for more details.</p>
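<p>The differencing step and its inversion can be sketched in a few lines of NumPy. This is only an illustration: the series is invented, and the "model" is a placeholder that averages the last two differences rather than anything ARIMA would fit.</p>

```python
import numpy as np

# Toy series with an upward trend (values invented for illustration).
y = np.array([10.0, 12.0, 15.0, 19.0, 24.0])

# First-order differencing: n values become n - 1 differences.
diffs = np.diff(y)  # [2., 3., 4., 5.]

# Placeholder for a trained model: predict the next difference
# as the mean of the last two observed differences.
predicted_diff = diffs[-2:].mean()  # 4.5

# Undo the differencing: add the last observed value to the prediction.
predicted_value = y[-1] + predicted_diff  # 28.5
print(predicted_value)
```

<p>The model never sees the raw level of the series, only its changes, and the level is restored at prediction time.</p>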



<h4 class="wp-block-heading">Parameters</h4>



<p>Formally, ARIMA is defined by three parameters p, d, and q that describe the three main components of the model.&nbsp;</p>



<ul class="wp-block-list">
<li><strong>Integrated</strong> <strong>(the I in ARIMA): </strong>The number of differences needed to achieve stationarity is given by the parameter d. Let the original features be Y<sub>t</sub> where t is the index in the sequence. We create a stationary time series using the following transformations for different values of d.</li>
</ul>



<h5 class="wp-block-heading">For d=0</h5>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/1-removebg-preview.png?ssl=1" alt="ARIMA parameters" class="wp-image-60428"/></figure>
</div>


<p>In this case the series is already stationary and we have nothing to do.</p>



<h5 class="wp-block-heading">For d=1</h5>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction_2.png?ssl=1" alt="ARIMA parameters" class="wp-image-60435" style="width:213px;height:69px"/></figure>
</div>


<p>This is the most typical transformation.</p>



<h5 class="wp-block-heading">For d=2</h5>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction_1.png?ssl=1" alt="ARIMA parameters" class="wp-image-60434"/></figure>
</div>


<p>Observe that differencing can be seen as a discrete version of differentiation. For d=1, the new features represent how the values change from one step to the next, while for d=2 they represent <em>the rate at which that change itself changes</em>, just like the second derivative in calculus. The above can be generalized to d&gt;2 as well, but this is rarely used in practice.</p>
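<p>The transformations for d=1 and d=2 can be checked numerically. The sketch below uses an invented series of squares, for which the second difference is constant, much like the second derivative of a quadratic function.</p>

```python
import numpy as np

y = np.array([1.0, 4.0, 9.0, 16.0, 25.0])  # toy series: squares of 1..5

d1 = np.diff(y, n=1)  # Y_t - Y_(t-1): [3., 5., 7., 9.]
d2 = np.diff(y, n=2)  # difference of the differences: [2., 2., 2.]

# The second difference written out explicitly: Y_t - 2*Y_(t-1) + Y_(t-2)
d2_explicit = y[2:] - 2 * y[1:-1] + y[:-2]
```

<p>Each round of differencing shortens the series by one value, so a series of length n yields n-d values after d rounds.</p>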



<ul class="wp-block-list">
<li><strong>AutoRegressive (AR): </strong>The parameter p tells us how many past values to consider for the expression of the current value. Essentially, we learn a model that predicts the value at time t as:</li>
</ul>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction_14.png?ssl=1" alt="AutoRegressive (AR)" class="wp-image-60442"/></figure>
</div>


<ul class="wp-block-list">
<li><strong>Moving Average (MA): </strong>The parameter q tells us how many of the past forecast errors should be considered. A new value is computed as:</li>
</ul>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction_13.png?ssl=1" alt="Moving Average (MA)" class="wp-image-60441"/></figure>
</div>


<p>The past prediction errors:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction_12.png?ssl=1" alt="Past prediction errors" class="wp-image-60440"/></figure>
</div>


<p>The combination of the three components gives the ARIMA(p, d, q) model. More precisely, we first difference the time series d times (the "integrated" part), and then we fit the AR and MA terms on the differenced series and learn the corresponding coefficients.</p>
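<p>To make the AR and MA parts concrete, here is a minimal sketch of a one-step prediction on an already differenced series. The function name <em>arma_one_step</em> and all coefficient values are hypothetical; in practice the coefficients are estimated by the library, not chosen by hand.</p>

```python
import numpy as np

def arma_one_step(history, errors, const, phi, theta):
    """One-step ARMA(p, q) prediction on an already differenced series.

    history: past values, most recent last
    errors:  past one-step forecast errors, most recent last
    phi:     p autoregressive coefficients (phi[0] multiplies the latest value)
    theta:   q moving-average coefficients (theta[0] multiplies the latest error)
    Assumes at least one coefficient of each kind.
    """
    p, q = len(phi), len(theta)
    recent_values = np.asarray(history, dtype=float)[-p:][::-1]
    recent_errors = np.asarray(errors, dtype=float)[-q:][::-1]
    return const + np.dot(phi, recent_values) + np.dot(theta, recent_errors)

# Hypothetical coefficients for an ARMA(2, 1) model.
prediction = arma_one_step(
    history=[1.0, 2.0, 3.0],
    errors=[0.5, -1.0],
    const=0.1,
    phi=[0.5, 0.25],
    theta=[0.4],
)
```

<p>Here the prediction combines the last two values (weighted by 0.5 and 0.25) with the most recent forecast error (weighted by 0.4), plus a constant.</p>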



<h3 class="wp-block-heading" class="wp-block-heading" id="h-prophet">Prophet</h3>



<p>Prophet was developed by Facebook as an algorithm for the in-house prediction of time series values for different business applications. Therefore, it is specifically designed for the prediction of business time series.</p>



<p>It is an additive model consisting of four components:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction_11.png?ssl=1" alt="Prophet" class="wp-image-60439"/></figure>
</div>


<p>Let us discuss the meaning of each component:</p>



<ol class="wp-block-list">
<li><strong>g(t):</strong> It represents the <em>trend</em>, and the objective is to capture the general trend of the series. For example, the number of advertisement views on Facebook is likely to increase over time as more people join the network. But what would be the exact function of increase?</li>



<li><strong>s(t):</strong> It is the <em>seasonality</em> component. The number of advertisement views might also depend on the season. For example, in the Northern Hemisphere during the summer months, people are likely to spend more time outdoors and less time in front of their computers. Such seasonal fluctuations can be very different for different business time series. The second component is thus a function that models seasonal trends.</li>



<li><strong>h(t):</strong> The <em>Holidays</em> component. We use the information for holidays which have a clear impact on most business time series. Note that holidays vary between years, countries, etc. and therefore the information needs to be explicitly provided to the model.</li>



<li>The <strong>error term</strong> ε<sub>t</sub> stands for random fluctuations that cannot be explained by the model. As usual, it is assumed that ε<sub>t</sub> follows a normal distribution <em>N</em>(0, σ<sup>2</sup>) with zero mean and unknown variance σ<sup>2</sup> that has to be estimated from the data.</li>
</ol>
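<p>The additive structure can be illustrated with synthetic data. The sketch below builds a series from toy stand-ins for the four components; the shapes and magnitudes are invented, and this is not how Prophet itself parameterizes or fits the trend, seasonality, or holiday effects.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(730)  # two years of daily time steps

g = 100.0 + 0.05 * t                      # trend g(t): slow linear growth
s = 5.0 * np.sin(2 * np.pi * t / 365.25)  # seasonality s(t): a yearly cycle
h = np.where(t % 365 == 358, -20.0, 0.0)  # holidays h(t): a dip on one fixed day per year
eps = rng.normal(0.0, 1.0, size=t.size)   # error term: Gaussian noise with sigma = 1

y = g + s + h + eps  # additive model: y(t) = g(t) + s(t) + h(t) + eps_t
```

<p>Fitting the model amounts to going in the opposite direction: given only y and the holiday dates, recover plausible g, s, and h.</p>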



<h3 class="wp-block-heading" class="wp-block-heading" id="h-lstm-recurrent-neural-networks">LSTM recurrent neural networks</h3>



<p>LSTM stands for Long short-term memory. LSTM cells are used in recurrent neural networks that learn to predict the future from sequences of variable lengths. Note that recurrent neural networks work with any kind of sequential data and, unlike ARIMA and Prophet, are not restricted to time series.&nbsp;</p>



<p>The main idea behind LSTM cells is to learn the important parts of the sequence seen so far and forget the less important ones. This is achieved by the so-called gates, i.e., functions that have different learning objectives such as:&nbsp;</p>



<ol class="wp-block-list">
<li>a compact representation of the time series seen so far</li>



<li>how to combine new input with the past representation of the series</li>



<li>what to forget about the series</li>



<li>what to output as a prediction for the next time step.&nbsp;</li>
</ol>



<p>See Figure 1 and the <a href="https://en.wikipedia.org/wiki/Long_short-term_memory" target="_blank" rel="noreferrer noopener nofollow">Wikipedia article</a> for more details.</p>
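<p>For readers who prefer code to diagrams, the following is a minimal NumPy sketch of a single step of a standard LSTM cell. The weights are random and untrained, and the layout of the weight matrices (four stacked blocks) is a choice made for this example, not the layout of any particular library.</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One time step of a standard LSTM cell.

    W: (4 * units, input_dim), U: (4 * units, units), b: (4 * units,)
    The four row blocks hold the forget, input, candidate and output parameters.
    """
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0 * units:1 * units])        # forget gate: what to drop from memory
    i = sigmoid(z[1 * units:2 * units])        # input gate: how much new information to admit
    c_tilde = np.tanh(z[2 * units:3 * units])  # candidate memory built from the new input
    o = sigmoid(z[3 * units:4 * units])        # output gate: what to expose as h_t
    c_t = f * c_prev + i * c_tilde             # updated compact representation (cell state)
    h_t = o * np.tanh(c_t)                     # hidden state passed to the next step
    return h_t, c_t

# Random, untrained weights purely to show the shapes involved.
rng = np.random.default_rng(0)
units, input_dim = 3, 1
W = rng.normal(size=(4 * units, input_dim))
U = rng.normal(size=(4 * units, units))
b = np.zeros(4 * units)

h, c = np.zeros(units), np.zeros(units)
for value in [0.1, 0.2, 0.3]:
    h, c = lstm_step(np.array([value]), h, c, W, U, b)
```

<p>The cell state c plays the role of the compact representation of the series, while the gates decide what to forget, what to add, and what to output at each step.</p>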



<section id="blog-intext-cta-block_280e275641be925a75cdf1ef0afb6702" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-read-also">Read also</h3>
    
            <p><img decoding="async" class="lazyload block-blog-intext-cta__arrow-image" src="https://neptune.ai/wp-content/themes/neptune/img/image-ratio-holder.svg" alt="" width="12" height="12" data-src="https://neptune.ai/wp-content/themes/neptune/img/icon-arrow--right-gray.svg" />️ <a href="/blog/recurrent-neural-network-guide" target="_blank" rel="noopener">Recurrent Neural Network Guide: a Deep Dive in RNN</a></p>
    
    </section>



<p>Designing an optimal LSTM based model can be a difficult task that requires careful hyperparameter tuning. Here is the list of the most important parameters an LSTM based model needs to consider:</p>



<ul class="wp-block-list">
<li>How many LSTM cells should we use in order to represent the sequence? Note that each LSTM cell will focus on specific aspects of the time series processed so far. Too few LSTM cells are unlikely to capture the structure of the sequence, while too many LSTM cells might lead to overfitting.</li>



<li>Typically, we first convert the input sequence into another sequence, i.e., the values <em>h</em><sub>t</sub>. This yields a new representation, as the <em>h</em><sub>t</sub> states capture the structure of the series processed so far. At some point, however, we no longer need all <em>h</em><sub>t</sub> values but only the last <em>h</em><sub>t</sub>. This allows us to feed the last <em>h</em><sub>t</sub> of the different LSTM cells into a fully connected layer, as each of them corresponds to the final output of an individual LSTM cell. Designing the exact architecture might require careful fine-tuning and many trials.</li>
</ul>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_15.png?ssl=1" alt="The structure of an LSTM cell" class="wp-image-60382" style="width:803px;height:550px"/><figcaption class="wp-element-caption"><em>Figure 1: the structure of an LSTM cell | <a href="https://en.wikipedia.org/wiki/Long_short-term_memory" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<p>Finally, we would like to reiterate that recurrent neural networks are a general class of methods for learning from sequential data and they can work with arbitrary sequences such as natural text or audio.&nbsp;&nbsp;</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-experimental-evaluation-arima-vs-prophet-vs-lstm">Experimental evaluation: ARIMA vs Prophet vs LSTM</h2>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-dataset">Dataset</h3>



<p>We are going to use stock exchange data for Bajaj Finserv Ltd, an Indian financial services company, in order to compare the three models. The dataset spans the period from 2008 until the end of 2021. It contains the daily stock price (mean, low, and high values) as well as the total volume and the turnover of traded stocks. A subsample of the dataset is shown in Figure 2.</p>



<div id="separator-block_750e244b4745b6c4d584b8748b7e3b2b"
         class="block-separator block-separator--10">
</div>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_19.png?ssl=1" alt="The data used for evaluation" class="wp-image-60378" style="width:840px;height:216px"/><figcaption class="wp-element-caption"><em>Figure 2: the data used for evaluation | Source: Author</em></figcaption></figure>
</div>


<div id="separator-block_750e244b4745b6c4d584b8748b7e3b2b"
         class="block-separator block-separator--10">
</div>



<p>We are interested in predicting the Volume Weighted Average Price (VWAP) variable at the end of each day.&nbsp; A graph of the time series VWAP values is presented in Figure 3.</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_23.png?ssl=1" alt="The daily values of the VWAP variable" class="wp-image-60374" style="width:568px;height:350px"/><figcaption class="wp-element-caption"><em>Figure 3: the daily values of the VWAP variable | Source: Author</em></figcaption></figure>
</div>


<p>For the evaluation, we divided the time series into a train and test time series where the training series consists of the data until the end of 2018 (see Figure 4).&nbsp;</p>



<p><strong>Total number of observations:</strong> 3201&nbsp;</p>



<p><strong>Training observations:</strong> 2624</p>



<p><strong>Test observations:</strong> 577</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_18.png?ssl=1" alt="The train and test subsets of the VWAP time series" class="wp-image-60379"/><figcaption class="wp-element-caption"><em>Figure 4: the train and test subsets of the VWAP time series | Source: Author</em></figcaption></figure>
</div>
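<p>A chronological split like this one can be sketched with pandas as follows. The data frame below is a synthetic stand-in (business days in 2018 and 2019 with made-up VWAP values), so the resulting sizes differ from the numbers reported above; only the column names <em>Date</em> and <em>VWAP</em> are taken from the dataset.</p>

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the stock data: business days from 2018 through 2019.
dates = pd.date_range("2018-01-01", "2019-12-31", freq="B")
df = pd.DataFrame({"Date": dates, "VWAP": np.linspace(100.0, 200.0, len(dates))})

# Everything up to the end of 2018 is used for training, the rest for testing.
split_date = pd.Timestamp("2019-01-01")
is_train = df["Date"].lt(split_date)
df_train = df[is_train]
df_test = df[~is_train]
```

<p>Unlike a random split, this keeps all test observations strictly after the training period, so the models are never evaluated on data that precedes what they were trained on.</p>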


<h3 class="wp-block-heading" class="wp-block-heading" id="h-implementation">Implementation&nbsp;</h3>



<p>In order to work properly, machine learning models require good data, so we start with a little <strong>feature engineering</strong>. The objective behind feature engineering is to design more powerful models that exploit different patterns in the data. As the three models learn patterns observed in the past, we create additional features that describe the recent trends of the stock movements.</p>



<p>In particular, we track the moving average for the different trade features over a period of 3, 7, and 30 days. In addition, we consider features such as the month, the week number, and the weekday. Thus, the input to our models is multidimensional. A small example of the feature engineering looks as follows:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">lag_features = [<span class="hljs-string" style="color: rgb(221, 17, 68);">"High"</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">"Low"</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">"Volume"</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">"Turnover"</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">"Trades"</span>]
df_rolled_7d = df[lag_features].rolling(window=<span class="hljs-number" style="color: teal;">7</span>, min_periods=<span class="hljs-number" style="color: teal;">0</span>)
df_mean_7d = df_rolled_7d.mean().shift(<span class="hljs-number" style="color: teal;">1</span>).reset_index().astype(np.float32)</pre>



<p>The above code excerpt shows how to add the running mean over the last week for several features describing the trading of the stock. Overall, we create a set of exogenous features:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_21-1669368620-1641306995772.png?ssl=1" alt="Implementation " class="wp-image-60376"/></figure>
</div>
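<p>The same idea can be reproduced end to end on a tiny example. The sketch below uses an invented frame with a single <em>Volume</em> column; the feature names (<em>Volume_mean_lag3</em>, <em>month</em>, <em>week</em>, <em>day_of_week</em>) are hypothetical and need not match the ones used in our experiments.</p>

```python
import pandas as pd

# Tiny synthetic frame; the values are invented.
df = pd.DataFrame(
    {"Volume": [10.0, 20.0, 30.0, 40.0, 50.0]},
    index=pd.date_range("2021-01-04", periods=5, freq="D"),
)

# 3-day rolling mean, shifted by one day so that the feature for day t
# only uses information available up to day t - 1.
df["Volume_mean_lag3"] = df["Volume"].rolling(window=3, min_periods=1).mean().shift(1)

# Calendar features derived from the date index.
df["month"] = df.index.month
df["week"] = df.index.isocalendar().week.astype(int)
df["day_of_week"] = df.index.dayofweek
```

<p>The shift by one step is the important detail: without it, the rolling mean for a given day would include that day's own value and leak information from the target period into the features.</p>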


<p>Now, let’s get started with our main models:</p>



<h4 class="wp-block-heading">ARIMA</h4>



<p>We use the ARIMA implementation from the publicly available package <a href="http://alkaline-ml.com/pmdarima/" target="_blank" rel="noreferrer noopener nofollow">pmdarima</a>. The function <a href="http://alkaline-ml.com/pmdarima/modules/generated/pmdarima.arima.auto_arima.html#pmdarima.arima.auto_arima" target="_blank" rel="noreferrer noopener nofollow">auto_arima</a> accepts as an additional parameter a list of <em>exogenous</em> features, where we provide the features created in the feature engineering step (newer pmdarima releases name this argument <em>X</em>, so check the version you have installed). The main advantage of auto_arima is that it first performs several tests in order to decide if the time series is stationary or not. Also, it employs a smart search strategy that determines suitable values for the parameters p, d, and q discussed in the previous section.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> pmdarima <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> auto_arima
model = auto_arima(
	df_train[<span class="hljs-string" style="color: rgb(221, 17, 68);">"VWAP"</span>],
	exogenous=df_train[exogenous_features],
	trace=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>,
	error_action=<span class="hljs-string" style="color: rgb(221, 17, 68);">"ignore"</span>,
	suppress_warnings=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>)</pre>



<p>The search over different values of the parameters p, d, and q is shown below. In the end, the model with the smallest <a href="https://en.wikipedia.org/wiki/Akaike_information_criterion" target="_blank" rel="noreferrer noopener nofollow">AIC value</a> is returned. (The AIC is a model selection criterion that rewards goodness of fit and penalizes the number of parameters, so it balances the accuracy and the complexity of a prediction model.)</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_24.png?ssl=1" alt="ARIMA" class="wp-image-60373"/></figure>
</div>
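<p>To see how the AIC trades fit against complexity, consider the following sketch. It assumes Gaussian errors with the variance set to its maximum likelihood estimate; the helper <em>gaussian_aic</em> and the residuals are invented for illustration, and libraries may count parameters slightly differently.</p>

```python
import numpy as np

def gaussian_aic(residuals, n_params):
    """AIC = 2k - 2 ln(L) for a model with Gaussian errors (variance at its ML estimate)."""
    residuals = np.asarray(residuals, dtype=float)
    n = residuals.size
    sigma2 = np.mean(residuals ** 2)
    log_likelihood = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    return 2 * n_params - 2 * log_likelihood

# Invented residuals: the larger model fits slightly better but uses more parameters.
small_model_aic = gaussian_aic([1.0, -1.0, 1.0, -1.0], n_params=2)
large_model_aic = gaussian_aic([0.9, -0.9, 0.9, -0.9], n_params=5)
```

<p>In this toy case the smaller model ends up with the lower AIC: the slightly better fit of the larger model does not compensate for its three extra parameters.</p>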


<p>Predictions on the test set are then obtained by</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">forecast = model.predict(n_periods=len(df_valid),  exogenous=df_valid[exogenous_features])</pre>



<h4 class="wp-block-heading">Prophet&nbsp;&nbsp;</h4>



<p>We use the publicly available <a href="https://facebook.github.io/prophet/docs/quick_start.html" target="_blank" rel="noreferrer noopener nofollow">Python implementation</a> of Prophet. The input data must contain two specific fields:&nbsp;</p>



<ol class="wp-block-list">
<li><strong>ds</strong>: the date (the <em>Date</em> column in our data), which should be a valid calendar date from which the holidays can be computed</li>



<li><strong>y</strong>: the target variable we want to predict (here, VWAP).</li>
</ol>



<p>We instantiate the model as:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> prophet <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> Prophet
model = Prophet()</pre>



<p>The features created during feature engineering have to be explicitly added to the model as follows:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> feature <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> exogenous_features:
	model.add_regressor(feature)</pre>



<p>Finally, we fit the model:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">model.fit(df_train[[<span class="hljs-string" style="color: rgb(221, 17, 68);">"Date"</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">"VWAP"</span>] + exogenous_features].rename(columns={<span class="hljs-string" style="color: rgb(221, 17, 68);">"Date"</span>: <span class="hljs-string" style="color: rgb(221, 17, 68);">"ds"</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">"VWAP"</span>: <span class="hljs-string" style="color: rgb(221, 17, 68);">"y"</span>}))</pre>



<p>And the forecast for the test set is obtained as:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">forecast = model.predict(df_test[[<span class="hljs-string" style="color: rgb(221, 17, 68);">"Date"</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">"VWAP"</span>] + exogenous_features].rename(columns={<span class="hljs-string" style="color: rgb(221, 17, 68);">"Date"</span>: <span class="hljs-string" style="color: rgb(221, 17, 68);">"ds"</span>}))</pre>



<h4 class="wp-block-heading">LSTM</h4>



<p>We used the <a href="https://keras.io/api/layers/recurrent_layers/lstm/" target="_blank" rel="noreferrer noopener nofollow">Keras implementation</a> of LSTMs:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> tensorflow <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> tf
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> tensorflow.keras.layers <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> Dropout
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> tensorflow.keras.layers <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> Dense
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> tensorflow.keras.layers <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> LSTM
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> tensorflow.keras.metrics <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> RootMeanSquaredError, MeanAbsoluteError
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> tensorflow.keras.models <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> Sequential</pre>



<p>The model is defined by the following function.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-function"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">def</span> <span class="hljs-title" style="color: rgb(153, 0, 0); font-weight: 700;">get_model</span><span class="hljs-params">(params, input_shape)</span>:</span>
	model = Sequential()
	model.add(LSTM(units=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"lstm_units"</span>], return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>, input_shape=(input_shape, <span class="hljs-number" style="color: teal;">1</span>)))
	model.add(Dropout(rate=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"dropout"</span>]))

	model.add(LSTM(units=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"lstm_units"</span>], return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>))
	model.add(Dropout(rate=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"dropout"</span>]))

	model.add(LSTM(units=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"lstm_units"</span>], return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>))
	model.add(Dropout(rate=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"dropout"</span>]))

	model.add(LSTM(units=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"lstm_units"</span>], return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">False</span>))
	model.add(Dropout(rate=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"dropout"</span>]))

	model.add(Dense(<span class="hljs-number" style="color: teal;">1</span>))

	model.compile(loss=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"loss"</span>],
              	optimizer=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"optimizer"</span>],
              	metrics=[RootMeanSquaredError(), MeanAbsoluteError()])

	<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">return</span> model</pre>



<p>Then we instantiate a model with a given set of parameters. We use the past 90 observations in the time series as a sequence for the input to the model. The other hyperparameters describe the architecture and the specific choices for training the model.&nbsp;</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">params = {
	<span class="hljs-string" style="color: rgb(221, 17, 68);">"loss"</span>: <span class="hljs-string" style="color: rgb(221, 17, 68);">"mean_squared_error"</span>,
	<span class="hljs-string" style="color: rgb(221, 17, 68);">"optimizer"</span>: <span class="hljs-string" style="color: rgb(221, 17, 68);">"adam"</span>,
	<span class="hljs-string" style="color: rgb(221, 17, 68);">"dropout"</span>: <span class="hljs-number" style="color: teal;">0.2</span>,
	<span class="hljs-string" style="color: rgb(221, 17, 68);">"lstm_units"</span>: <span class="hljs-number" style="color: teal;">90</span>,
	<span class="hljs-string" style="color: rgb(221, 17, 68);">"epochs"</span>: <span class="hljs-number" style="color: teal;">30</span>,
	<span class="hljs-string" style="color: rgb(221, 17, 68);">"batch_size"</span>: <span class="hljs-number" style="color: teal;">128</span>,
	<span class="hljs-string" style="color: rgb(221, 17, 68);">"es_patience"</span> : <span class="hljs-number" style="color: teal;">10</span>
}

model = get_model(params=params, input_shape=x_train.shape[<span class="hljs-number" style="color: teal;">1</span>])</pre>



<p>The above results in the following Keras model (see Figure 5):</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_26.png?ssl=1" alt="A summary of the Keras LSTM model" class="wp-image-60371"/><figcaption class="wp-element-caption"><em>Figure 5: a summary of the Keras LSTM model | Source: Author</em></figcaption></figure>
</div>
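<p>The arrays <em>x_train</em> and <em>y_train</em> are built from sliding windows over the series. One plausible way to construct them is sketched below, assuming a univariate series and next-step targets; the helper <em>make_windows</em> is hypothetical, and any scaling of the values is omitted.</p>

```python
import numpy as np

def make_windows(series, window):
    """Turn a 1-D series into (samples, window, 1) inputs and next-step targets."""
    series = np.asarray(series, dtype=np.float32)
    x = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return x[..., np.newaxis], y

# Toy series of 100 points with the window length of 90 used in the experiments.
x_train, y_train = make_windows(np.arange(100), window=90)
print(x_train.shape, y_train.shape)  # (10, 90, 1) (10,)
```

<p>The second dimension of <em>x_train</em> is the window length, which is exactly what is passed to <em>get_model</em> as <em>input_shape</em>.</p>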


<p>We then create a callback to implement <a href="https://en.wikipedia.org/wiki/Early_stopping" target="_blank" rel="noreferrer noopener nofollow">early stopping</a>, i.e., to stop training the model if it yields no improvement on the validation dataset for a given number of epochs (in our case 10):</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">es_callback = tf.keras.callbacks.EarlyStopping(
	monitor=<span class="hljs-string" style="color: rgb(221, 17, 68);">'val_root_mean_squared_error'</span>,
	mode=<span class="hljs-string" style="color: rgb(221, 17, 68);">'min'</span>,
	patience=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"es_patience"</span>])</pre>



<p>The parameter <em>es_patience</em> is the number of consecutive epochs without improvement after which training is stopped.</p>
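<p>The logic behind the patience parameter can be written out in plain Python. This is a simplified sketch of the idea, not the Keras implementation; the function <em>early_stopping_epoch</em> and the validation losses are invented for illustration.</p>

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if best > loss:
            # New best validation loss: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Invented validation losses: the best epoch is 3, so with patience=2 we stop at epoch 5.
stopped_at = early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.61, 0.5], patience=2)
```

<p>Note that the improvement in epoch 6 is never seen because training has already stopped, which is why the patience should not be set too low.</p>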



<p>Finally, we fit the model using the predefined parameters:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">model.fit(
	x_train,
	y_train,
	validation_data=(x_test, y_test),
	epochs=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"epochs"</span>],
	batch_size=params[<span class="hljs-string" style="color: rgb(221, 17, 68);">"batch_size"</span>],
	verbose=<span class="hljs-number" style="color: teal;">1</span>,
	callbacks=[neptune_callback, es_callback]
)</pre>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-experiment-tracking-and-model-comparison">Experiment tracking and model comparison&nbsp;</h3>



<p>Since in this blog post, we want to answer the simple question of which model yields the most accurate predictions for the test dataset, we will need to see how these three models fare against each other.</p>
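<p>Accuracy can be quantified with the two metrics we attached to the Keras model, the root mean squared error (RMSE) and the mean absolute error (MAE). A minimal NumPy sketch of both, with invented numbers standing in for true and predicted VWAP values:</p>

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more strongly."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error: the average size of the errors."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))

# Invented numbers purely to exercise the functions.
y_true = [100.0, 102.0, 104.0, 106.0]
y_pred = [101.0, 101.0, 107.0, 106.0]
scores = {"rmse": rmse(y_true, y_pred), "mae": mae(y_true, y_pred)}
```

<p>Computing the same metrics on the same test set for all three models is what makes their results directly comparable.</p>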



<p>There are many different approaches for <a href="/blog/how-to-compare-machine-learning-models-and-algorithms" target="_blank" rel="noreferrer noopener">model comparisons</a> such as creating tables and charts that record the evaluation of different metrics, creating graphs that plot the predicted values vs the true values on a test set, etc. However, for this exercise, we will be using <strong><a href="/" target="_blank" rel="noreferrer noopener">neptune.ai</a>.</strong></p>



<section
	id="i-box-block_878c50069eef8a5100866e1c1cb8d2c1"
	class="block-i-box  l-margin__top--large l-margin__bottom--x-large">

			<header class="c-header">
			<img
				src="https://neptune.ai/wp-content/themes/neptune/img/image-ratio-holder.svg"
				data-src="https://neptune.ai/wp-content/themes/neptune/img/blocks/i-box/header-icon.svg"
				width="24"
				height="24"
				class="c-header__icon lazyload"
				alt="">

			
            <h2 class="c-header__text animation " style='max-width: 100%;'   >
                <strong>Disclaimer</strong>
            </h2>		</header>
	
	<div class="block-i-box__inner">
		

<p>Please note that this article references a <strong>deprecated version of Neptune</strong>.</p>



<p>For information on the latest version with improved features and functionality, please <a href="/" target="_blank" rel="noreferrer noopener">visit our website</a>.</p>


	</div>

</section>



<p>It&#8217;s an experiment tracker built for teams that run a lot of experiments.‌ It gives you a single place to log, store, display, organize, compare, and query all your model-building metadata.</p>



<p>We first create a Neptune project and record the API token of our account. You can check a detailed tutorial on how to do it in the <a href="https://docs.neptune.ai/setup/installation/" target="_blank" rel="noreferrer noopener">Neptune </a><a href="https://docs-legacy.neptune.ai/setup/installation/" target="_blank" rel="noreferrer noopener">documentation</a>.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code>import neptune

# Create a Neptune run object
run = neptune.init_run(
    project="your-workspace-name/your-project-name",  
    api_token="YourNeptuneApiToken",  
)</code></pre>
</div>




<p>The variable <em>run</em> can be seen as a folder in which we can create subfolders containing different information. For example, we can create a subfolder called model and record in it the name of the model:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">run[<span class="hljs-string" style="color: rgb(221, 17, 68);">"model/name"</span>] = <span class="hljs-string" style="color: rgb(221, 17, 68);">"Arima"</span></pre>



<p>We will compare the accuracy of these models with respect to two different metrics:&nbsp;</p>



<ol class="wp-block-list">
<li>The root mean square error (RMSE)</li>
</ol>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction2.png?ssl=1" alt="The root mean square error (RMSE)" class="wp-image-60461"/></figure>
</div>


<ol start="2" class="wp-block-list">
<li>The mean absolute error (MAE)</li>
</ol>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction1.png?ssl=1" alt="The mean absolute error (MAE)" class="wp-image-60460" style="width:212px;height:108px"/></figure>
</div>
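<p>As a minimal sketch, both metrics can be computed with NumPy as follows. The function names and the toy values are assumptions made for this illustration and are not taken from the original experiment code.</p>

```python
import numpy as np


def root_mean_squared_error(y_true, y_pred):
    """RMSE: square root of the mean of the squared errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mean_absolute_error(y_true, y_pred):
    """MAE: mean of the absolute errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Toy values, for illustration only
y_true = [10.0, 12.0, 14.0]
y_pred = [11.0, 12.0, 12.0]

rmse = root_mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
print(rmse, mae)
```

<p>Because the errors are squared before averaging, the RMSE penalizes large individual errors more strongly than the MAE does. In the toy example, the single error of 2 pushes the RMSE above the MAE.</p>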


<p>Note that these values can be logged into Neptune by setting the corresponding values, for example, setting:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">run[<span class="hljs-string" style="color: rgb(221, 17, 68);">"test/mae"</span>] = mae
run[<span class="hljs-string" style="color: rgb(221, 17, 68);">"test/rmse"</span>] = rmse</pre>



<p>The mean square error and the mean absolute error for the three models can be seen next to each other in the runs table:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-large is-resized"><a href="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_16.png?ssl=1" target="_blank" rel="noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_16.png?ssl=1" alt="The mean square error and the mean absolute error for the three models can be seen next to each other. (The tags for each project are at the top.)" class="wp-image-60381" style="width:840px;height:270px"/></a><figcaption class="wp-element-caption"><em>Figure 6: the MSE and the MAE for the three models in the Neptune web app<br>(the tags for each project are at the top) | <a href="https://app.neptune.ai/kutzkov/TimeSeries/experiments?compare=IzBMBpnAWI&amp;split=tbl&amp;dash=leaderboard&amp;viewId=standard-view" target="_blank" rel="noreferrer noopener nofollow">See in the Neptune app</a></em></figcaption></figure>
</div>


<p>The three algorithms can then be compared side by side in Neptune, as shown in Figure 7.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><a href="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/ARIMA-vs-Prophet-vs-LSTM.png?ssl=1"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/ARIMA-vs-Prophet-vs-LSTM.png?ssl=1" alt="Side by side comparison ARIMA Prophet LSTM" class="wp-image-60480"/></a><figcaption class="wp-element-caption"><em>Figure 7: the mean square error and the mean absolute error for the three models can be seen next to each other </em><br><em>(the tags for each project are at the top) | <a href="https://app.neptune.ai/kutzkov/TimeSeries/experiments?compare=Iwg03I&amp;split=cmp&amp;dash=leaderboard&amp;viewId=standard-view" target="_blank" rel="noreferrer noopener nofollow">See in the Neptune app</a></em></figcaption></figure>
</div>


<p>We see that ARIMA yields the best performance, i.e., it achieves the smallest mean square error and mean absolute error on the test set. In contrast, the LSTM neural network performs the worst of the three models.&nbsp;</p>



<p>The predictions plotted against the true values are shown in the following figures. All three models capture the overall trend of the time series, but the LSTM appears to lag behind the curve, i.e., it needs more time to adjust to a change in trend. Prophet falls behind ARIMA in the last few months of the test period, where it underestimates the true values.</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_22.png?ssl=1" alt="ARIMA predictions" class="wp-image-60375"/><figcaption class="wp-element-caption"><em>Figure 8: ARIMA predictions | Source: Author</em></figcaption></figure>
</div>

<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_25.png?ssl=1" alt="Prophet predictions" class="wp-image-60372"/><figcaption class="wp-element-caption"><em>Figure 9: Prophet predictions | Source: Author</em></figcaption></figure>
</div>

<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_17.png?ssl=1" alt="LSTM prediction" class="wp-image-60380"/><figcaption class="wp-element-caption"><em>Figure 10: LSTM prediction | Source: Author</em></figcaption></figure>
</div>


<h3 class="wp-block-heading" class="wp-block-heading" id="h-a-deeper-look-into-the-performance-of-the-models">A deeper look into the performance of the models&nbsp;</h3>



<h4 class="wp-block-heading">ARIMA grid-search</h4>



<p>When doing grid-search over different values for p, d, and q in ARIMA, we can plot the individual values for the mean squared error. The colored dots in Figure 11 show the mean square error values for different ARIMA parameters over a validation set.</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-large is-resized"><a href="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_27.png?ssl=1"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_27.png?ssl=1" alt="Grid-search over the ARIMA parameters" class="wp-image-60370" style="width:840px;height:274px"/></a><figcaption class="wp-element-caption"><em>Figure 11: grid-search over the ARIMA parameters | <a href="https://app.neptune.ai/kutzkov/TimeSeries/experiments?compare=OwJgNAjJ1bP3RDlNS9bNA&amp;split=cmp&amp;dash=charts&amp;viewId=standard-view&amp;query=((%60sys%2Ftags%60%3AstringSet%20CONTAINS%20%22grid-search%22)%20OR%20(%60sys%2Ftags%60%3AstringSet%20CONTAINS%20%22arima%22))&amp;sortBy=%5B%22sys%2Fcreation_time%22%5D&amp;sortFieldType=%5B%22datetime%22%5D&amp;sortFieldAggregationMode=%5B%22auto%22%5D&amp;sortDirection=%5B%22descending%22%5D&amp;suggestionsEnabled=true&amp;lbViewUnpacked=true&amp;chartFilter=mse" target="_blank" rel="noreferrer noopener">See in the Neptune app</a></em></figcaption></figure>
</div>
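<p>A grid search of this kind can be sketched roughly as follows. This is a minimal illustration, not the code used for the experiments above: it assumes the <em>ARIMA</em> class from statsmodels, and the names <em>arima_forecast</em>, <em>grid_search</em>, and <em>forecast_fn</em> are hypothetical helpers introduced for this example.</p>

```python
import itertools

import numpy as np


def arima_forecast(train, order, steps):
    """Fit a statsmodels ARIMA with the given (p, d, q) and forecast `steps` ahead."""
    # Imported lazily so that grid_search can also be used with another forecaster
    from statsmodels.tsa.arima.model import ARIMA

    fitted = ARIMA(train, order=order).fit()
    return np.asarray(fitted.forecast(steps=steps))


def grid_search(train, valid, p_values, d_values, q_values, forecast_fn=arima_forecast):
    """Return (best_order, scores); scores maps each (p, d, q) to its validation MSE."""
    valid = np.asarray(valid, dtype=float)
    scores = {}
    for order in itertools.product(p_values, d_values, q_values):
        try:
            forecast = forecast_fn(train, order, len(valid))
        except Exception:
            continue  # skip parameter combinations for which fitting fails
        scores[order] = float(np.mean((valid - forecast) ** 2))
    # Raises ValueError if no combination could be fitted at all
    best_order = min(scores, key=scores.get)
    return best_order, scores
```

<p>With a training and a validation array at hand, a call such as <em>grid_search(train, valid, range(3), range(2), range(3))</em> would evaluate 18 combinations. Each entry in <em>scores</em> corresponds to one dot in a plot like Figure 11, and it could be logged to Neptune in the same way as the test metrics.</p>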


<h4 class="wp-block-heading">Trends in Prophet</h4>



<p>We collect in Neptune the parameters, forecast data frames, residual diagnostic charts, and other metadata while training models with Prophet. This is achieved using a <a href="https://docs-legacy.neptune.ai/integrations/prophet/" target="_blank" rel="noreferrer noopener nofollow">single function that captures Prophet training metadata</a> and logs it automatically to Neptune.</p>



<p>In Figure 12, we show how the different components of the Prophet model change over time. We observe that the trend follows a linear increase while the seasonal components exhibit fluctuations.</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><a href="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_28.png?ssl=1"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_28.png?ssl=1" alt="The change of values of the different components in the Prophet over time" class="wp-image-60369" style="width:829px;height:697px"/></a><figcaption class="wp-element-caption"><em>Figure 12: the change of values of the different components in the Prophet over time | Source: Author</em></figcaption></figure>
</div>


<h4 class="wp-block-heading">Why did LSTM fare the worst?</h4>



<p>We collect in Neptune the mean absolute error while training the LSTM model over several epochs. This is achieved using a <a href="https://docs-legacy.neptune.ai/integrations/keras/" target="_blank" rel="noreferrer noopener">Neptune callback</a> which captures Keras training metadata and logs it automatically to Neptune. The results are shown in Figure 13.&nbsp;</p>



<p>Observe that while the error on the training dataset decreases over subsequent epochs, the error on the validation set does not: it reaches its minimum in the second epoch and then fluctuates. This suggests that the LSTM model is too complex for such a small dataset and is prone to overfitting. Even with regularization such as dropout, we still cannot avoid overfitting.</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><a href="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_20.png?ssl=1"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_20.png?ssl=1" alt="The evolution of train and test error over different epochs of training the LSTM model" class="wp-image-60377" style="width:827px;height:399px"/></a></figure>
</div>

<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><a href="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_29.png?ssl=1" target="_blank" rel="noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Prophet-vs-ARIMA-vs-LSTM-for-Time-Series-Prediction_29.png?ssl=1" alt="The evolution of train and test error over different epochs of training the LSTM model" class="wp-image-60368" style="width:840px;height:396px"/></a><figcaption class="wp-element-caption"><em>Figure 13: the evolution of train and test error over different epochs of training the LSTM model | <a href="https://app.neptune.ai/kutzkov/TimeSeries/e/TIM-117/charts" target="_blank" rel="noreferrer noopener">See in the Neptune app</a></em></figcaption></figure>
</div>


<h2 class="wp-block-heading" class="wp-block-heading" id="h-conclusions">Conclusions</h2>



<p>In this blog post, we presented and compared three different algorithms for time series prediction. As expected, there is no clear winner and each algorithm has its own advantages and limitations. Below we summarize our observations for each algorithm:</p>



<ol class="wp-block-list">
<li><strong>ARIMA</strong> is a powerful model and, as we saw, it achieved the best result for the stock data. A challenge is that it might need careful hyperparameter tuning and a good understanding of the data.&nbsp;</li>



<li><strong>Prophet</strong> is specifically designed for business time series prediction. It achieves very good results for the stock data but, speaking from anecdotes, it can fail spectacularly on time series datasets from other domains. In particular, this holds for time series where the notion of <em>calendar date</em> is not applicable and we cannot learn any seasonal patterns. Prophet’s advantage is that it requires less hyperparameter tuning as it is specifically designed to detect patterns in business time series.</li>



<li><strong>LSTM-based recurrent neural networks</strong> are probably the most powerful approach to learning from sequential data, and time series are only a special case. The potential of LSTM-based models is fully revealed when learning from massive datasets where we can detect complex patterns. Unlike ARIMA or Prophet, they do not rely on specific assumptions about the data such as time series stationarity or the existence of a Date field. A disadvantage is that LSTM-based RNNs are difficult to interpret and it is challenging to gain intuition into their behaviour. Also, careful hyperparameter tuning is required in order to achieve good results.</li>
</ol>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-future-directions">Future directions</h3>



<p>I hope you enjoyed reading this article and now have a better understanding of the time series algorithms discussed here. If you want to dig deeper, here are some useful resources. Happy experimenting!</p>



<ol class="wp-block-list">
<li><a href="http://alkaline-ml.com/pmdarima/index.html" target="_blank" rel="noreferrer noopener nofollow">PMD ARIMA</a>. The documentation for the respective Python package.</li>



<li><a href="https://facebook.github.io/prophet/" target="_blank" rel="noreferrer noopener nofollow">Prophet</a>. Documentation and tutorial for Facebook Prophet.</li>



<li><a href="https://keras.io/api/layers/recurrent_layers/lstm/" target="_blank" rel="noreferrer noopener nofollow">Keras LSTM</a>. Documentation and examples for LSTM RNNs in Keras.</li>



<li><a href="https://neptune.ai/" target="_blank" rel="noreferrer noopener">Neptune</a>. The Neptune website with tutorials and documentation.</li>



<li>A <a href="/blog/ml-experiment-tracking" target="_blank" rel="noreferrer noopener">blog post</a> on ML experiment tracking with neptune.ai.&nbsp;</li>



<li>A <a href="https://otexts.com/fpp2/arima.html" target="_blank" rel="noreferrer noopener nofollow">deeper overview</a> of ARIMA models.</li>



<li>A <a href="https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/" target="_blank" rel="noreferrer noopener nofollow">tutorial</a> on time series prediction with LSTM RNNs.</li>



<li>The <a href="https://peerj.com/preprints/3190.pdf" target="_blank" rel="noreferrer noopener nofollow">original Prophet research</a> paper.</li>
</ol>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">6369</post-id>	</item>
		<item>
		<title>How to Select a Model For Your Time Series Prediction Task [Guide]</title>
		<link>https://neptune.ai/blog/select-model-for-time-series-prediction-task</link>
		
		<dc:creator><![CDATA[Joos Korstanje]]></dc:creator>
		<pubDate>Tue, 13 Sep 2022 14:22:56 +0000</pubDate>
				<category><![CDATA[ML Model Development]]></category>
		<category><![CDATA[Time Series]]></category>
		<guid isPermaLink="false">https://neptune.test/select-model-for-time-series-prediction-task/</guid>

					<description><![CDATA[Are you working with time series data and seeking the most effective models? This guide explains how to select and evaluate time series models based on predictive performance—including classical, supervised, and deep learning-based models.&#160; After comparing models and selecting the right one for our task, we&#8217;ll build models for stock market forecasting, benchmarking each to&#8230;]]></description>
										<content:encoded><![CDATA[
<p>Are you working with time series data and seeking the most effective models? This guide explains how to select and evaluate time series models based on predictive performance—including classical, supervised, and deep learning-based models.&nbsp;</p>



<p>After comparing models and selecting the right one for our task, we&#8217;ll build models for stock market forecasting, benchmarking each to identify the best-performing approach.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-understanding-time-series-datasets-and-forecasting">Understanding time series datasets and forecasting</h2>



<p>Most data sets that practitioners work with are based on independent observations. For example, given a website, you could track each visitor; each data point (e.g. row in a table) would represent an individual observation about each visitor. If we assign each visitor a “User ID,” each ID will be independent of the other visitors.</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_6-2160513845-1639395985798.jpg?ssl=1" alt="Example of a dataset with independent observations " class="wp-image-58967"/><figcaption class="wp-element-caption"><em>Example of a dataset with independent observations | Source: Author</em></figcaption></figure>
</div>


<p>In contrast, time series data are unique because they measure one or more variables as they change over time, creating dependencies between data points. Unlike typical datasets with independent observations, each time point in a time series dataset is related to its predecessors. This impacts the choice of machine learning algorithms we should use.&nbsp;</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_16-80408099-1639395999233.jpg?ssl=1" alt="Example of a dataset with dependent observations taken over time" class="wp-image-58957"/><figcaption class="wp-element-caption"><em>Example of a dataset with dependent observations taken over time | Source: Author</em></figcaption></figure>
</div>

    <a
        href="/blog/time-series-prediction-vs-machine-learning"
        id="cta-box-related-link-block_5c1ce59ccf2902cbc795c50c7730c084"
        class="block-cta-box-related-link  l-margin__top--standard l-margin__bottom--standard"
        target="_blank" rel="nofollow noopener noreferrer"    >

    
    <div class="block-cta-box-related-link__description-wrapper block-cta-box-related-link__description-wrapper--full">

        
            <div class="c-eyebrow">

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-related--article.svg"
                    loading="lazy"
                    decoding="async"
                    width="16"
                    height="16"
                    alt=""
                    class="c-eyebrow__icon">

                <div class="c-eyebrow__text">
                    Related                </div>
            </div>

        
                    <h3 class="c-header" class="c-header" id="h-time-series-prediction-how-is-it-different-from-other-machine-learning-ml-engineer-explains">                Time Series Prediction: How Is It Different From Other Machine Learning? [ML Engineer Explains]            </h3>        
                    <div class="c-button c-button--tertiary c-button--small">

                <span class="c-button__text">
                    Read more                </span>

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-button-arrow-right.svg"
                    loading="lazy"
                    decoding="async"
                    width="12"
                    height="12"
                    alt=""
                    class="c-button__arrow">

            </div>
            </div>

    </a>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-key-aspects-of-time-series-modeling">Key aspects of time series modeling</h2>



<p>Before we dive into the models themselves, let&#8217;s make sure we have a good understanding of the nature of time series data and how modeling them differs from other types of data.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-univariate-versus-multivariate-time-series-models">Univariate versus multivariate time series models</h3>



<p>In time series data, timestamps hold intrinsic meaning. Univariate time series models use only one variable (the target variable) and its variation over time to make future predictions.</p>



<p>In contrast, multivariate time series models include additional variables. For instance, if you want to forecast product demand, you might consider including weather data as an influencing factor. Multivariate models extend univariate models by integrating these additional (or external) variables.</p>



<div id="medium-table-block_e6279d5e7364a905fdde61b57f26c076"
     class="block-medium-table c-table__outer-wrapper  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset">

    <table class="c-table">
                    <thead class="c-table__head">
            <tr>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Univariate time series models                        </div>
                    </td>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Multivariate time series models                        </div>
                    </td>
                            </tr>
            </thead>
        
        <tbody class="c-table__body">

                    
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Use only one variable</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Use multiple variables</p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Cannot use external data</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Can use external data</p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Based only on relationships between past and present</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Based on relationships between past and present, and between variables</p>
                                                            </div>
                        </td>

                    
                </tr>

                    
        </tbody>
    </table>

</div>



<div id="separator-block_67f96a4bc5290f04525fa51bf7bd75d7"
         class="block-separator block-separator--15">
</div>
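<p>To make the distinction concrete, here is a small pandas sketch that builds the inputs for both setups from the same data. The demand and temperature values and the column names are made up for illustration, and using a single lag is just one simple way to frame the problem.</p>

```python
import pandas as pd

# Made-up daily demand and temperature values, for illustration only
df = pd.DataFrame(
    {
        "demand": [20, 22, 25, 24, 30, 33],
        "temperature": [15.0, 16.5, 18.0, 17.0, 21.0, 23.5],
    },
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
)

# Univariate setup: the only input is the target's own past value
univariate = pd.DataFrame(
    {"demand_lag_1": df["demand"].shift(1), "target": df["demand"]}
).dropna()

# Multivariate setup: the target's past value plus an external variable
multivariate = univariate.assign(temperature_lag_1=df["temperature"].shift(1))

print(univariate.columns.tolist())
print(multivariate.columns.tolist())
```

<p>Both frames describe the same five days. The multivariate one simply carries an extra column that a model can use to relate demand to the previous day's temperature.</p>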



<p>Suppose you want to look at the patterns (or changes over time) within your data to understand it better and make predictions. In this case, you need to understand the temporal variations you&#8217;ll encounter: seasonality, trend, and noise. The next topic we&#8217;ll discuss, time series decomposition, is a technique used to separate these components so you can analyze them individually.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-time-series-decomposition">Time series decomposition</h3>



<p>You can decompose the time series to extract different types of variation from your dataset. This will extract three key features from your data:</p>



<ul class="wp-block-list">
<li><strong>Seasonality</strong> is a recurring pattern based on time periods (such as seasons of the year). For example, temperatures typically rise in the summer and fall in the winter. You can use this predictable pattern to help predict future values.</li>



<li><strong>Trends</strong> reflect long-term increases or decreases in your data. Going back to our temperature example, you could observe a gradual upward trend due to global warming, layered on top of seasonal variations.</li>



<li><strong>Noise</strong> is random variability that doesn&#8217;t follow seasonality or trend. It represents unpredictable fluctuations in the data, meaning no model can fully account for it (that&#8217;s why it&#8217;s also often called the “error” or “residual” in the data).</li>
</ul>
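<p>One way to build intuition for these three components is to construct a synthetic series from them. The sketch below assumes an additive model (the components are summed) and uses made-up numbers loosely inspired by the temperature example. Real data may combine the components differently, for instance multiplicatively.</p>

```python
import numpy as np

rng = np.random.default_rng(seed=0)
months = np.arange(120)  # ten years of monthly observations

# Trend: a slow long-term increase
trend = 0.05 * months

# Seasonality: a pattern that repeats every 12 months
seasonality = 8.0 * np.sin(2 * np.pi * months / 12)

# Noise: random fluctuations that follow neither trend nor seasonality
noise = rng.normal(loc=0.0, scale=1.0, size=months.size)

# Additive combination around a base level of 10 degrees
temperature = 10.0 + trend + seasonality + noise
print(temperature[:12].round(2))
```

<p>Decomposition goes in the opposite direction: starting from a series like <em>temperature</em>, it tries to recover estimates of the trend, the seasonal pattern, and the remaining noise.</p>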



<h4 class="wp-block-heading">Time series decomposition in Python</h4>



<p>Here&#8217;s a quick example of how to decompose a time series in Python using <a href="https://www.statsmodels.org/dev/datasets/generated/co2.html" target="_blank" rel="noreferrer noopener nofollow">the CO2 dataset</a> from the <span class="c-code-snippet">statsmodels</span> library.&nbsp;</p>



<p>Before diving into the code, make sure you have the required dependencies installed. Run the following commands to set up your environment:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Install the relevant libraries</span>
!pip install numpy pandas matplotlib statsmodels scikit-learn xgboost pmdarima tensorflow yfinance neptune</pre></code></pre>
</div>




<p>Now, import the dataset:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Import the CO2 dataset</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> statsmodels.datasets.co2 <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> co2

co2_data = co2.load().data
print(co2_data)</pre></code></pre>
</div>




<p>The dataset includes a time index (weekly dates) and CO2 measurements, shown below.</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_20.png?ssl=1" alt="Example measurements and timestamps from the statsmodels CO2 dataset" class="wp-image-58953" style="aspect-ratio:0.794392523364486;width:289px;height:auto"/><figcaption class="wp-element-caption"><em>Example measurements and timestamps from the statsmodels CO2 dataset | Source: Author</em></figcaption></figure>
</div>


<p>There are a few missing (NA) values. To handle these, you can use interpolation like this:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Handle missing values with interpolation</span>
co2_data = co2_data.fillna(co2_data.interpolate())</pre></code></pre>
</div>
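<p>To see what this interpolation step does, consider a toy series with gaps. The values and dates are made up for illustration. By default, pandas <em>interpolate()</em> fills missing values linearly between the neighboring observations.</p>

```python
import numpy as np
import pandas as pd

# Made-up weekly values with three missing observations, for illustration only
toy = pd.Series(
    [315.0, np.nan, 317.0, np.nan, np.nan, 320.0],
    index=pd.date_range("2024-01-06", periods=6, freq="W-SAT"),
)

# Same pattern as above: fill only the missing entries with interpolated values
filled = toy.fillna(toy.interpolate())
print(filled)
```

<p>Each gap is replaced by a value on the straight line between the surrounding measurements, while the observed values stay unchanged.</p>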




<p>Next, plot the CO2 values over time to see the temporal trend:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Plot CO2 values over time</span>
co2_data.plot()</pre></code></pre>
</div>




<p>This will generate the following plot:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_10.png?ssl=1" alt="Plot of the CO2 time series from the statsmodels CO2 dataset" class="wp-image-58963" style="width:576px;height:386px"/><figcaption class="wp-element-caption"><em><em>Plot of the CO2 time series from the statsmodels CO2 dataset | Source: Author</em></em></figcaption></figure>
</div>


<p>To decompose the time series into trend, seasonality, and noise (labeled as &#8220;residual&#8221;), use the <span class="c-code-snippet">seasonal_decompose</span> function from <span class="c-code-snippet">statsmodels</span>:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.seasonal <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> seasonal_decompose
result = seasonal_decompose(co2_data)
result.plot()</pre></code></pre>
</div>



<div class="wp-block-image is-style-default">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_25.png?ssl=1" alt="The CO2 time series decomposed into seasonality, trend, and residual (noise)" class="wp-image-58948" style="aspect-ratio:1.5205128205128204;width:590px;height:auto"/><figcaption class="wp-element-caption"><em><em>The CO2 time series decomposed into seasonality, trend, and residual (noise) | Source: Author</em></em></figcaption></figure>
</div>


<p>In this decomposition, the CO2 data shows an upward trend (visible in the observed series and isolated in the trend panel) and strong seasonality (the repeating pattern in the seasonal panel).</p>
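<p>To build intuition for what an additive decomposition does, here is a minimal sketch in plain Python on a made-up series. It assumes the observed values are simply trend plus seasonality, and it estimates the trend with a centered moving average. The helper <span class="c-code-snippet">centered_moving_average</span> is a hypothetical name used for illustration, and this is a simplification of the idea rather than the exact procedure <span class="c-code-snippet">statsmodels</span> runs.</p>

```python
# Toy additive series: observed = trend + seasonal (no noise, for clarity).
period = 3
seasonal_pattern = [1.0, -1.0, 0.0]  # sums to zero over one period
observed = [float(t) + seasonal_pattern[t % period] for t in range(12)]


def centered_moving_average(values, window):
    """Average each point with its neighbours; window must be odd."""
    half = window // 2
    trend = [None] * len(values)  # the edges have no full window, so they stay undefined
    for t in range(half, len(values) - half):
        trend[t] = sum(values[t - half:t + half + 1]) / window
    return trend


# Averaging over one full period cancels the seasonal pattern and leaves the trend.
trend = centered_moving_average(observed, period)

# Whatever remains after subtracting the trend is seasonality (plus noise, if any).
detrended = [obs - tr for obs, tr in zip(observed, trend) if tr is not None]

print(trend[1:4])     # trend estimates at the first interior points
print(detrended[:3])  # start of the repeating seasonal pattern
```

<p>Because the window spans exactly one seasonal period, the seasonal effects average out and only the trend is left. Real data also contains noise, which ends up in the residual component.</p>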



<h3 class="wp-block-heading" class="wp-block-heading" id="h-autocorrelation">Autocorrelation</h3>



<p>Autocorrelation is another key temporal feature in time series data. It measures how the current value of a time series correlates with past values, allowing for more accurate predictions based on recent trends.</p>



<p>Autocorrelation can be:</p>



<ol class="wp-block-list">
<li><strong>Positive:</strong> High values tend to be followed by other high values, and low values by low values. For example, in the stock market, a rising stock price often attracts more buyers, driving the price up further; when the price falls, many people usually sell, driving the price down.</li>



<li><strong>Negative:</strong> High values are likely to be followed by low values, and vice versa. For example, a high rabbit population in the summer may deplete resources, leading to a lower population in the winter, allowing resources to recover and the population to increase again the following year.</li>
</ol>



<h4 class="wp-block-heading">Detecting autocorrelation</h4>



<p>Two common tools for detecting autocorrelation are:</p>



<ul class="wp-block-list">
<li>The autocorrelation function (ACF) plot</li>



<li>The partial autocorrelation function (PACF) plot</li>
</ul>



<p>You can compute an ACF plot using Python as follows:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.graphics.tsaplots <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> plot_acf

plot_acf(co2_data)</pre></code></pre>
</div>




<p>For our CO2 dataset, this is what we get:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_22.png?ssl=1" alt="Autocorrelation plot for the CO2 dataset" class="wp-image-58951" style="width:580px;height:412px"/><figcaption class="wp-element-caption"><em><em>Autocorrelation plot for the CO2 dataset | Source: Author</em></em></figcaption></figure>
</div>


<p>On the x-axis, you see the time steps (or “lags”) going back in time. On the y-axis, you can see the correlation of each time step with the current time. This plot clearly shows significant autocorrelation.</p>
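<p>The value plotted at each lag can be sketched by hand. The snippet below is a minimal plain-Python version of the usual sample autocorrelation estimate, applied to two made-up series. The function name <span class="c-code-snippet">autocorrelation</span> is our own, and the sketch leaves out the confidence bands that the plot adds.</p>

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    variance_sum = sum(d * d for d in deviations)
    # Pair each value with the value `lag` steps earlier.
    covariance_sum = sum(
        deviations[t] * deviations[t - lag] for t in range(lag, n)
    )
    return covariance_sum / variance_sum


trending = [1, 2, 3, 4, 5, 6, 7, 8]         # high values follow high values
alternating = [1, -1, 1, -1, 1, -1, 1, -1]  # high values follow low values

for name, series in [("trending", trending), ("alternating", alternating)]:
    values = [round(autocorrelation(series, lag), 3) for lag in range(4)]
    print(name, values)
```

<p>The trending series gives a positive lag-1 autocorrelation, while the alternating series gives a negative one, matching the two cases described earlier.</p>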



<p>The PACF (partial autocorrelation function) is an alternative to the ACF. Instead of showing all autocorrelations, it shows only the <em>unique</em> correlation at each time step, filtering out indirect effects. This helps identify the true relationship between each lag and the present time.</p>



<p>For example, if today&#8217;s value is similar to yesterday&#8217;s and the day before, the ACF would show high correlations for both days. The PACF, however, would only show yesterday&#8217;s value as correlated, removing redundant correlations from earlier days.</p>
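<p>This "filtering out" can be made concrete for lag 2. Given the ordinary autocorrelations at lag 1 and lag 2 (called <span class="c-code-snippet">r1</span> and <span class="c-code-snippet">r2</span> here), the partial autocorrelation at lag 2 is <span class="c-code-snippet">(r2 - r1**2) / (1 - r1**2)</span>. The sketch below uses made-up correlation values, and <span class="c-code-snippet">pacf_lag2</span> is a hypothetical helper name.</p>

```python
def pacf_lag2(r1, r2):
    """Partial autocorrelation at lag 2 from the lag-1 and lag-2 autocorrelations."""
    return (r2 - r1 ** 2) / (1 - r1 ** 2)


# Case A: the lag-2 correlation is fully explained by the chain through lag 1
# (today ~ yesterday ~ the day before), so nothing is left over at lag 2.
r1 = 0.8
print(pacf_lag2(r1, r1 ** 2))

# Case B: lag 2 is more correlated than the chain through lag 1 alone explains,
# so the partial autocorrelation at lag 2 is clearly non-zero.
print(pacf_lag2(0.8, 0.9))
```

<p>In case A, the ACF would still show a sizeable bar at lag 2, while the PACF would show none.</p>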



<p>You can compute a PACF plot in Python as follows:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.graphics.tsaplots <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> plot_pacf

plot_pacf(co2_data)</pre></code></pre>
</div>



<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_14.png?ssl=1" alt="Partial autocorrelation plot for the CO2 dataset " class="wp-image-58959" style="aspect-ratio:1.4113300492610839;width:577px;height:auto"/><figcaption class="wp-element-caption"><em><em>Partial autocorrelation plot for the CO2 dataset | Source: Author</em></em></figcaption></figure>
</div>


<p>The PACF plot provides a clearer view of the autocorrelation in the CO2 data. It shows strong positive autocorrelation at lag 1, meaning a high value now likely indicates a high value in the next time step. The PACF only displays direct correlations, avoiding duplicate effects from earlier lags. This results in a cleaner and more straightforward representation.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-stationarity">Stationarity</h3>



<p>Stationarity is another important concept in time series analysis. A stationary series has statistical properties, like mean and variance, that remain constant over time, which implies that it has no trend. Many time series models require stationarity to work effectively.</p>



<p>To check for non-stationarity, you can use the Dickey-Fuller Test.</p>



<h4 class="wp-block-heading">Dickey-Fuller test</h4>



<p>The Dickey-Fuller test is a statistical test that detects non-stationarity in a time series. Here&#8217;s how to apply it to the CO2 data in Python:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.stattools <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> adfuller
adf, pval, usedlag, nobs, crit_vals, icbest = adfuller(co2_data.co2.values)

print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'ADF test statistic:'</span>, adf)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'ADF p-value:'</span>, pval)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Number of lags used:'</span>, usedlag)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Number of observations:'</span>, nobs)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Critical values:'</span>, crit_vals)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Best information criterion:'</span>, icbest)</pre></code></pre>
</div>




<p>The result looks like this:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_17.png?ssl=1" alt="Results of the Dickey-Fuller Test for the CO2 data in Python. The ADF test suggests that the time series is non-stationary, as the test statistic (0.0337) is greater than the critical values, and the p-value (0.9612) is much higher than 0.05, failing to reject the null hypothesis of non-stationarity." class="wp-image-58956" style="width:831px;height:122px"/><figcaption class="wp-element-caption"><em>Results of the Dickey-Fuller Test for the CO2 data in Python. The ADF test suggests that the time series is non-stationary, as the test statistic (0.0337) is greater than the critical values, and the p-value (0.9612) is much higher than 0.05, failing to reject the null hypothesis of non-stationarity.</em></figcaption></figure>
</div>


<p>In the ADF test:</p>



<ul class="wp-block-list">
<li>The <strong>null hypothesis</strong> assumes a unit root is present, meaning the series is non-stationary.</li>



<li>The <strong>alternative hypothesis</strong> suggests that the time series is stationary.</li>
</ul>



<p>If the p-value is below 0.05, you can reject the null hypothesis, suggesting that the data is stationary. We cannot reject the null hypothesis if the p-value is above 0.05 (as we see above), meaning the data is likely non-stationary. This aligns with the trend we saw above in the CO2 data.&nbsp;</p>



<h4 class="wp-block-heading">Differencing</h4>



<p>To make a non-stationary series stationary, you can apply differencing: replace each value with its change from the previous value. This removes the trend and leaves the period-to-period variations, which helps when using models that assume stationarity.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Apply differencing to remove the trend</span>
prev_co2_value = co2_data.co2.shift()
differenced_co2 = co2_data.co2 - prev_co2_value
differenced_co2.plot()</pre></code></pre>
</div>




<p>The differenced CO2 data looks like this:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_5.png?ssl=1" alt="The CO2 time series after applying differencing" class="wp-image-58968" style="aspect-ratio:1.6666666666666667;width:586px;height:auto"/><figcaption class="wp-element-caption"><em><em>The CO2 time series after applying differencing | Source: Author</em></em></figcaption></figure>
</div>


<p>Now, if we do the ADF test again on this differenced data, we can confirm that it has become stationary:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.stattools <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> adfuller
adf, pval, usedlag, nobs, crit_vals, icbest = adfuller(differenced_co2.dropna())

print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'ADF test statistic:'</span>, adf)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'ADF p-value:'</span>, pval)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'ADF number of lags used:'</span>, usedlag)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'ADF number of observations:'</span>, nobs)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'ADF critical values:'</span>, crit_vals)
print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'ADF best information criterion:'</span>, icbest)</pre></code></pre>
</div>



<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_4.png?ssl=1" alt="Results of the Dickey-Fuller Test after applying differencing. The ADF test suggests that the time series is stationary, as the test statistic is smaller than the critical values and the p-value is much smaller than 0.05, so we can reject the null hypothesis of non-stationarity." class="wp-image-58969" style="width:851px;height:122px"/><figcaption class="wp-element-caption"><em>Results of the Dickey-Fuller Test after applying differencing. The ADF test suggests that the time series is stationary, as the test statistic is smaller than the critical values and the p-value is much smaller than 0.05, so we can reject the null hypothesis of non-stationarity.</em></figcaption></figure>
</div>


<p>Now, the p-value is very small, indicating that we can reject the null hypothesis (non-stationarity) and assume that this data is stationary.</p>
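<p>A small made-up example shows why differencing removes a trend and how it can be reversed. The helper names <span class="c-code-snippet">difference</span> and <span class="c-code-snippet">undo_difference</span> are our own. Forecasting libraries typically handle the reverse step internally when a model is fitted on differenced data.</p>

```python
def difference(series, lag=1):
    """Subtract the value `lag` steps earlier from each value."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undo_difference(first_value, differences):
    """Rebuild the original series from its first value and its first differences."""
    rebuilt = [first_value]
    for change in differences:
        rebuilt.append(rebuilt[-1] + change)
    return rebuilt


linear_trend = [10, 13, 16, 19, 22, 25]
changes = difference(linear_trend)

print(changes)  # the trend has become a constant step size
print(undo_difference(linear_trend[0], changes) == linear_trend)
```

<p>A linear trend needs one round of differencing. A curved trend may need a second round, and passing a larger <span class="c-code-snippet">lag</span> (for example, the seasonal period) gives seasonal differencing.</p>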



<h3 class="wp-block-heading" class="wp-block-heading" id="h-one-step-vs-multi-step-time-series-models">One-step vs multi-step time series models</h3>



<p>Before diving into modeling, the final concept we should cover is the difference between one-step and multi-step models.</p>



<ul class="wp-block-list">
<li><strong>One-step models</strong> predict only the next time point in a series. To create multi-step forecasts with these models, you can repeatedly use the previous prediction as input for the next step. However, with this approach, errors compound: any error in one prediction carries over into every later step.</li>



<li><strong>Multi-step models</strong> are designed to predict multiple future points simultaneously. These models are generally better for long-term forecasts and perform well for single-step forecasts.</li>
</ul>



<p>Choosing between one-step and multi-step models depends on how many steps you need to predict for your use case.</p>
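<p>The recursive strategy for one-step models can be sketched in a few lines. Here, <span class="c-code-snippet">predict_next</span> is a deliberately simple, hypothetical stand-in for any fitted one-step model, and the numbers are made up.</p>

```python
def predict_next(history):
    """Hypothetical one-step model: continue the most recent change."""
    return history[-1] + (history[-1] - history[-2])


def recursive_forecast(history, steps):
    """Produce a multi-step forecast by feeding predictions back in as inputs."""
    window = list(history)  # copy so the caller's data stays untouched
    forecasts = []
    for _ in range(steps):
        next_value = predict_next(window)
        forecasts.append(next_value)
        window.append(next_value)  # the prediction becomes the newest "observation"
    return forecasts


observed = [100, 102, 105]
print(recursive_forecast(observed, steps=4))
```

<p>After the first step, every prediction is based on earlier predictions rather than real observations. If the first step is off, the later steps inherit that error.</p>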



<div id="medium-table-block_19ab81deaabc55fddc9fc0d9a27e5020"
     class="block-medium-table c-table__outer-wrapper  l-padding__top--0 l-padding__bottom--0 l-margin__top--0 l-margin__bottom--0">

    <table class="c-table">
                    <thead class="c-table__head">
            <tr>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            One-step forecasts                        </div>
                    </td>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Multi-step forecasts                        </div>
                    </td>
                            </tr>
            </thead>
        
        <tbody class="c-table__body">

                    
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">Designed to forecast only one step ahead</span></p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">Designed to forecast multiple steps ahead</span></p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">Can be extended to multi-step by windowing</span></p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">Direct multi-step capability</span></p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">May be less accurate for multi-step forecasts</span></p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">Ideal for multi-step forecasts</span></p>
                                                            </div>
                        </td>

                    
                </tr>

                    
        </tbody>
    </table>

</div>



<div id="separator-block_67f96a4bc5290f04525fa51bf7bd75d7"
         class="block-separator block-separator--15">
</div>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-types-of-time-series-models">Types of time series models</h2>



<p>Now that we&#8217;ve covered key aspects of time series data, let&#8217;s explore the types of models used for forecasting.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-classical-time-series-models">Classical time series models</h3>



<p>These traditional models, such as ARIMA and Exponential Smoothing, are based on time-based patterns in a time series. While highly effective for forecasting single-variable (univariate) series, some advanced options exist to add external variables as well.&nbsp;</p>



<p>Classical models like these are specific to time series data and generally aren&#8217;t suitable for other types of machine learning.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-supervised-models">Supervised models</h3>



<p>Supervised models are a family of models used for many different machine learning tasks. They use clearly defined input (X) and output (Y) variables.&nbsp;</p>



<p>For time series forecasting, you can create input features from date-based elements (e.g., year, month, day), and the target to be predicted is the value of your time series at that date. You can also include lagged values to add autocorrelation effects.</p>
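<p>As a rough sketch of this framing, the snippet below turns a small made-up weekly series into feature rows (X) and targets (y) using only the standard library. The function <span class="c-code-snippet">build_supervised_rows</span> and the feature names are hypothetical. In practice, you would likely build the same table with a DataFrame library.</p>

```python
from datetime import date, timedelta


def build_supervised_rows(dates, values, n_lags):
    """Turn a series into (features, target) rows using date parts and lagged values."""
    features, targets = [], []
    # The first n_lags points have no complete lag history, so they are skipped.
    for t in range(n_lags, len(values)):
        row = {"year": dates[t].year, "month": dates[t].month, "day": dates[t].day}
        for lag in range(1, n_lags + 1):
            row[f"lag_{lag}"] = values[t - lag]
        features.append(row)
        targets.append(values[t])
    return features, targets


# Made-up weekly series, for illustration only
start = date(2024, 1, 1)
dates = [start + timedelta(weeks=i) for i in range(6)]
values = [10, 12, 11, 13, 14, 16]

X, y = build_supervised_rows(dates, values, n_lags=2)
print(X[0], y[0])
```

<p>Each row only uses information available before its target date, which is what keeps the lagged features free of leakage.</p>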



<h3 class="wp-block-heading" class="wp-block-heading" id="h-deep-learning-models">Deep learning models</h3>



<p>The rise of deep learning has enabled new forecasting methods, especially useful for complex, sequential data. Specific model architectures like LSTMs have been developed and applied for sequence-based forecasting.&nbsp;</p>



<p>Major tech companies like Facebook and Amazon have released open-source forecasting tools, offering powerful new options for practitioners. These can sometimes outperform traditional models.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-classical-time-series-models">Classical time series models</h2>



<p>Now, let&#8217;s dive deeper into classical time series models, starting with the ARIMA family, which combines multiple components to create a robust forecasting model.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-arima-family">ARIMA family</h3>



<p>The ARIMA family of models consists of a set of smaller models that can be used on their own or combined (when all of the individual components are put together, you obtain the SARIMAX model). The main building blocks are:</p>



<p><strong>1. Autoregression (AR):</strong> Uses past values to predict future ones. The order of an AR model, <em>p</em>, indicates the number of previous time steps included; the simplest model is the AR(1) model, which only uses one previous timestep to predict the current value.</p>



<p><strong>2. Moving average (MA):</strong> Predicts future values based on past prediction errors rather than past values. The intuition here is that when a model has external perturbations, there may be a pattern in the error; the MA aims to capture this pattern. The order <em>q</em> represents how many error terms to include: MA(1) uses only the last error.</p>



<p><strong>3. Autoregressive Moving Average (ARMA):</strong> Combines AR and MA, using both past values and past errors for predictions. ARMA can use different lags for either the AR or MA; for example, ARMA(1, 0) has an order of <em>p</em>=1 and <em>q</em>=0, effectively making it a regular AR(1) model. The ARMA model requires a stationary time series.</p>



<p><strong>4. Autoregressive Integrated Moving Average (ARIMA)</strong>: Extends ARMA by adding differencing (indicated by <em>d</em>) to make the series stationary, if necessary. The notation is ARIMA(<em>p</em>, <em>d</em>, <em>q</em>). For example, an ARMA(1, 2) model that needs to be differenced once becomes an ARIMA(1, 1, 2) model: the first 1 is the AR order, the second 1 is the number of differencing steps, and the 2 is the MA order. Likewise, ARIMA(1, 0, 1) is the same as ARMA(1, 1).</p>



<p><strong>5. Seasonal ARIMA (SARIMA)</strong>: Adds seasonality to ARIMA, with seasonal parameters (P, D, Q) on top of the non-seasonal parameters (p, d, q). If seasonality is present in your time series, using it in your forecast is critical. The frequency <em>m</em> specifies the seasonal period (e.g., 12 for monthly data, or 4 for quarterly data). SARIMA notation is SARIMA(<em>p</em>, <em>d</em>, <em>q</em>)(<em>P</em>, <em>D</em>, <em>Q</em>)<em>m</em>.</p>


    <a
        href="/blog/arima-sarima-real-world-time-series-forecasting-guide"
        id="cta-box-related-link-block_27b358494aefeb5d13265882c6cdcef7"
        class="block-cta-box-related-link  l-margin__top--standard l-margin__bottom--standard"
        target="_blank" rel="nofollow noopener noreferrer"    >

    
    <div class="block-cta-box-related-link__description-wrapper block-cta-box-related-link__description-wrapper--full">

        
            <div class="c-eyebrow">

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-related--article.svg"
                    loading="lazy"
                    decoding="async"
                    width="16"
                    height="16"
                    alt=""
                    class="c-eyebrow__icon">

                <div class="c-eyebrow__text">
                    Related                </div>
            </div>

        
                    <h3 class="c-header" class="c-header" id="h-arima-sarima-real-world-time-series-forecasting">                ARIMA &#038; SARIMA: Real-World Time Series Forecasting            </h3>        
                    <div class="c-button c-button--tertiary c-button--small">

                <span class="c-button__text">
                    Read more                </span>

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-button-arrow-right.svg"
                    loading="lazy"
                    decoding="async"
                    width="12"
                    height="12"
                    alt=""
                    class="c-button__arrow">

            </div>
            </div>

    </a>



<p><strong>6. SARIMA with Exogenous Variables (SARIMAX)</strong>: Adds external variables (X) to SARIMA, allowing additional features to improve forecast accuracy. This is the most complex variant, combining AR, MA, differencing, and seasonal effects, along with the addition of external variables.</p>
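<p>To make the simplest building block concrete, here is a minimal sketch of an AR(1) model in plain Python. It assumes a model without an intercept or noise term, estimates the single coefficient by least squares, and uses a made-up series. The function names are our own, and real libraries use more elaborate estimation.</p>

```python
def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_(t-1) (no intercept, by assumption)."""
    numerator = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    denominator = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return numerator / denominator


def predict_ar1(series, phi):
    """One-step-ahead AR(1) prediction from the last observed value."""
    return phi * series[-1]


# Noise-free toy series in which each value is half the previous one
series = [8.0, 4.0, 2.0, 1.0, 0.5]
phi = fit_ar1(series)
print(phi, predict_ar1(series, phi))
```

<p>Higher-order AR models add more lagged values in the same way, and MA terms do the equivalent with past prediction errors instead of past values.</p>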



<h4 class="wp-block-heading">Example: using auto-ARIMA in Python on CO2 Data</h4>



<p>Now that we&#8217;ve reviewed the building blocks of the ARIMA family, let&#8217;s apply them to create a predictive model for CO2 data.</p>



<p>Choosing the right parameters for ARIMA or SARIMAX models can be challenging, as there are many combinations of (<em>p</em>, <em>d</em>, <em>q</em>) or (<em>p</em>, <em>d</em>, <em>q</em>)(<em>P</em>, <em>D</em>, <em>Q</em>). While you can inspect autocorrelation graphs to make educated guesses, the <span class="c-code-snippet">pmdarima</span> library provides an <span class="c-code-snippet">auto_arima</span> function to automatically select optimal parameters.</p>
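<p>To get a feel for how quickly the number of combinations grows, the sketch below enumerates a small grid of candidate orders. The search ranges are assumptions chosen only for illustration, not recommended defaults.</p>

```python
from itertools import product

# Assumed search ranges, chosen only for illustration
non_seasonal_orders = list(product(range(3), range(2), range(3)))  # (p, d, q)
seasonal_orders = list(product(range(2), range(2), range(2)))      # (P, D, Q)

# Every non-seasonal order can be paired with every seasonal order.
candidates = list(product(non_seasonal_orders, seasonal_orders))

print(len(candidates))
print(candidates[:3])
```

<p>Even with these narrow ranges, there are well over a hundred models to fit and compare, which is why automating the search is attractive.</p>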



<p>First, import <span class="c-code-snippet">pmdarima</span> and other necessary libraries:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> pmdarima <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> pm
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> pmdarima.model_selection <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> train_test_split
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> plt</pre></code></pre>
</div>




<p>After importing the libraries, split the data into training and testing sets (we&#8217;ll go into why in more detail later on):</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">train, test = train_test_split(co2_data.co2.values, train_size=<span class="hljs-number" style="color: teal;">2200</span>)</pre></code></pre>
</div>




<p>Next, fit the model using <span class="c-code-snippet">auto_arima</span> on the training data with seasonal parameters, then make predictions with the best-selected model:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">model = pm.auto_arima(train, seasonal=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>, m=<span class="hljs-number" style="color: teal;">52</span>)
preds = model.predict(test.shape[<span class="hljs-number" style="color: teal;">0</span>])</pre></code></pre>
</div>




<p>Finally, visualize the actual vs. forecasted data:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Plot the training data, then the forecast starting where the training data ends</span>
plt.plot(range(len(train)), train, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">'Actual'</span>)
plt.plot(range(len(train), len(train) + len(preds)), preds, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">'Forecast'</span>)
plt.legend()
plt.show()</pre></code></pre>
</div>



<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_13.png?ssl=1" alt="In the plot, the blue line represents the actual data, and the orange line represents the forecast.
" class="wp-image-58960" style="width:570px;height:381px"/><figcaption class="wp-element-caption"><em>In the plot, the blue line represents the actual data, and the orange line represents the forecast. | Source: Author</em></figcaption></figure>
</div>


<p>For more examples and details, check the <a href="https://alkaline-ml.com/pmdarima/" target="_blank" rel="noreferrer noopener nofollow">pmdarima documentation</a>.</p>



<h4 class="wp-block-heading">Vector autoregression (VAR) and its variants: VARMA and VARMAX</h4>



<p>Vector Autoregression (VAR) is a multivariate alternative to ARIMA, designed to predict multiple time series simultaneously. This is especially useful when strong relationships exist between the series, because each variable is modeled from its own past values and the past values of the others. Plain VAR includes only the autoregressive component.</p>



<ul class="wp-block-list">
<li><strong>VARMA</strong>: The multivariate equivalent of ARMA, adding a moving average component to VAR, allowing it to model both past values and errors across multiple series.</li>



<li><strong>VARMAX</strong>: Extends VARMA by adding exogenous (external) variables (X), which can improve forecasting accuracy without needing to be forecasted themselves. The <span class="c-code-snippet">statsmodels</span> <a href="https://www.statsmodels.org/dev/examples/notebooks/generated/statespace_varmax.html" target="_blank" rel="noreferrer noopener nofollow">VARMAX implementation</a> is a good way to get started with multivariate forecasting with external factors.</li>
</ul>



<p>Advanced versions like Seasonal VARMAX (SVARMAX) also exist but can become highly complex, making implementation and interpretation challenging. In practice, simpler models may be preferable.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-smoothing-techniques">Smoothing techniques</h3>



<p>Smoothing is a family of statistical techniques that reduce short-term noise in time series data, making long-term patterns more visible. Time series often combine long-term variability with short-term (noisy) variability, and a smoothed series reveals the trend more clearly for analysis. The main smoothing techniques include:</p>



<p><strong>1. Simple moving average: </strong>Replaces the current value with an average of the current and past values. Increasing the number of past values smooths the series further, but reduces detail. This is the simplest smoothing technique.&nbsp;</p>



<p><strong>2. Simple exponential smoothing (SES):</strong> An adaptation of the moving average that applies weights to past values so that recent values have more influence. This approach smooths the series without losing as much detail as a simple moving average.</p>



<p><strong>3. Double exponential smoothing (DES):</strong> Suitable for data with trends, DES uses two parameters, α (the data smoothing factor) and β (the trend smoothing factor), to adjust for trends in the data. It addresses cases where SES alone would fall short by applying exponential smoothing twice: once to the level of the series and once to its trend.</p>



<p><strong>4. Holt-Winters Exponential Smoothing (HWES)</strong>: Also known as Triple Exponential Smoothing, HWES is ideal for data with seasonality and trend. It smooths three components: the level, the trend, and the seasonal cycle (e.g., weekly or monthly).</p>
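
<p>To make the first two techniques concrete, here is a minimal, dependency-free sketch. The function names and the sample values are illustrative assumptions, not part of any library. It shows the moving-average window and the SES recursion in plain Python:</p>

<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"># Illustrative sketch: hypothetical helper names and made-up data.

def simple_moving_average(values, window):
    """Replace each value with the mean of itself and up to window - 1 past values."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def simple_exponential_smoothing(values, alpha):
    """Recursive SES: s_t = alpha * x_t + (1 - alpha) * s_(t-1), with s_0 = x_0."""
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


series = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]
sma = simple_moving_average(series, window=3)
ses = simple_exponential_smoothing(series, alpha=0.5)
print(sma)
print(ses)</pre></code></pre>
</div>

<p>A larger <span class="c-code-snippet">window</span> or a smaller <span class="c-code-snippet">alpha</span> gives a smoother result, at the cost of reacting more slowly to real changes in the series.</p>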



<h4 class="wp-block-heading">Example: Exponential smoothing in Python</h4>



<p>Here&#8217;s how to apply simple exponential smoothing (SES) to our CO2 data:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.api <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> SimpleExpSmoothing
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> plt

es = SimpleExpSmoothing(co2_data.co2.values).fit(smoothing_level=<span class="hljs-number" style="color: teal;">0.01</span>)

plt.plot(co2_data.co2.values, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Original"</span>)
plt.plot(es.fittedvalues, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Smoothed"</span>)
plt.legend()
plt.show()</pre></code></pre>
</div>




<p>The smoothing level (α) is a value between 0 and 1 that controls how much weight the most recent observation gets. Lower values give a smoother curve. In this example, it&#8217;s set very low, which produces a very smooth curve. Feel free to play around with this parameter and see what less-smooth versions look like.</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_11.png?ssl=1" alt="The blue line shows the original data, and the orange line shows the smoothed series" class="wp-image-58962" style="aspect-ratio:1.541899441340782;width:549px;height:auto"/><figcaption class="wp-element-caption"><em><em>The blue line shows the original data, and the orange line shows the smoothed series | Source: Author</em></em></figcaption></figure>
</div>


<h2 class="wp-block-heading" class="wp-block-heading" id="h-supervised-machine-learning-models">Supervised machine learning models</h2>



<p>Supervised machine learning models categorize variables as either dependent (target) or independent (predictor) variables. While these models aren&#8217;t designed for time series, they can be adapted by treating time-based features (e.g., year, month, day) as independent variables.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-linear-regression">Linear regression</h3>



<p>Linear Regression is the simplest supervised model. It estimates linear relationships: each independent variable has a coefficient that indicates how much it affects the target.</p>



<p><strong>Multiple Linear Regression</strong>: Uses multiple predictors (e.g., temperature and price) to model the target variable.</p>



<p><strong>Simple Linear Regression</strong>: Uses one independent variable. For example, hot chocolate sales can be modeled based on the temperature outside (pictured below).</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_2.png?ssl=1" alt="An example of linear regression fit to data of hot chocolate sales according to outside temperature" class="wp-image-58971" style="aspect-ratio:1.3770833333333334;width:551px;height:auto"/><figcaption class="wp-element-caption"><em><em>An example of linear regression fit to data of hot chocolate sales according to outside temperature | Source: Author</em></em></figcaption></figure>
</div>


<p>This is not a time series dataset yet: no time variable is present. To make it a time series, we can add date-based variables such as year, month, or day instead of only using temperature and price as predictors. For example, tying this back to our CO2 dataset:&nbsp;</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.linear_model <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> LinearRegression
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> plt

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Extract seasonality data</span>
months = [x.month <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> x <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> co2_data.index]
years = [x.year <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> x <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> co2_data.index]
day = [x.day <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> x <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> co2_data.index]
X = np.array([day, months, years]).T

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Fit the Linear Regression model</span>
my_lr = LinearRegression()
my_lr.fit(X, co2_data.co2.values)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Make predictions</span>
preds = my_lr.predict(X)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Plot the results</span>
plt.plot(co2_data.index, co2_data.co2.values, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Actual"</span>)
plt.plot(co2_data.index, preds, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Predicted"</span>)
plt.legend()
plt.show()</pre></code></pre>
</div>




<p>We had to do a little bit of feature engineering to extract seasonality into variables, but the advantage is that adding external variables becomes much easier.</p>



<p>We used the <span class="c-code-snippet">scikit-learn</span> library to build a linear regression model, fit it to our data, and make predictions. Let&#8217;s see what our model learned:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_12.png?ssl=1" alt="The plot shows the fit of our linear regression model (in orange) to the CO2 data (presented in blue)" class="wp-image-58961" style="width:536px;height:369px"/><figcaption class="wp-element-caption"><em><em>The plot shows the fit of our linear regression model (in orange) to the CO2 data (presented in blue) | Source: Author</em></em></figcaption></figure>
</div>


<p>This is a pretty good fit!</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-random-forest">Random forest</h3>



<p>Linear Regression is limited to linear relationships. For more flexibility, Random Forest—a widely used model for nonlinear relationships—can provide a better fit.</p>



<p>The <span class="c-code-snippet">scikit-learn</span> library has the RandomForestRegressor that you can simply use to replace the LinearRegression class in the previous code:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.ensemble <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> RandomForestRegressor

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Fit Random Forest</span>
my_rf = RandomForestRegressor()
my_rf.fit(X, co2_data.co2.values)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Make predictions</span>
preds = my_rf.predict(X)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Plot the results</span>
plt.plot(co2_data.index, co2_data.co2.values, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Actual"</span>)
plt.plot(co2_data.index, preds, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Predicted"</span>)
plt.legend()
plt.show()</pre></code></pre>
</div>




<p>The fit is now even better than before:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_3.png?ssl=1" alt="This plot demonstrates how our random forest model fits our CO2 dataset. In blue, the original data; in orange, the predicted values. As we can see, the fit is better than with linear regression " class="wp-image-58970" style="width:554px;height:370px"/><figcaption class="wp-element-caption"><em><em>This plot demonstrates how our random forest model fits our CO2 dataset. In blue, the original data; in orange, the predicted values. As we can see, the fit is better than with linear regression | Source: Author</em></em></figcaption></figure>
</div>


<p>For now, it&#8217;s enough to understand that this Random Forest model has been able to learn the training data better. Later, we&#8217;ll cover more quantitative methods for model evaluation.</p>





<h3 class="wp-block-heading" class="wp-block-heading" id="h-xgboost">XGBoost</h3>



<p>XGBoost is another essential supervised model, based on gradient boosting. Like random forest, it combines an ensemble of “weak learners” (decision trees), but it builds them in sequence so that each new tree reduces the errors of the trees before it. It can parallelize tree construction for efficiency.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> xgboost <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> xgb

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Fit XGBoost model</span>
my_xgb = xgb.XGBRegressor()
my_xgb.fit(X, co2_data.co2.values)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Make predictions</span>
preds = my_xgb.predict(X)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Plot the results</span>
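plt.plot(co2_data.index, co2_data.co2.values, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Actual"</span>)
plt.plot(co2_data.index, preds, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Predicted"</span>)
plt.legend()
plt.show()</pre></code></pre>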
</div>



<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_18.png?ssl=1" alt="This plot shows in orange the XGBoost's strong fit to the data (represented in blue)." class="wp-image-58955" style="width:543px;height:371px"/><figcaption class="wp-element-caption"><em><em>This plot shows in orange the XGBoost&#8217;s strong fit to the data (represented in blue). | Source: Author</em></em></figcaption></figure>
</div>
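
<p>The sequential idea behind gradient boosting can be sketched without any library. This is a toy illustration under simplifying assumptions, not how XGBoost is implemented: the weak learner is a two-piece constant fit over a time-ordered series (a decision stump on the time index), the loss is squared error, and the data and function names are hypothetical.</p>

<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"># Toy gradient boosting for squared error on a time-ordered series.
# Illustrative only: XGBoost adds regularization and many optimizations.

def sum_squared_error(actual, fitted):
    return sum((a - f) ** 2 for a, f in zip(actual, fitted))


def fit_stump(residuals):
    """Return the best two-piece constant approximation of the residuals."""
    candidates = []
    for split in range(1, len(residuals)):
        left, right = residuals[:split], residuals[split:]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        candidates.append([left_mean] * len(left) + [right_mean] * len(right))
    return min(candidates, key=lambda fitted: sum_squared_error(residuals, fitted))


def boost(values, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    prediction = [sum(values) / len(values)] * len(values)
    for _ in range(n_rounds):
        residuals = [v - p for v, p in zip(values, prediction)]
        stump = fit_stump(residuals)
        prediction = [p + learning_rate * s for p, s in zip(prediction, stump)]
    return prediction


series = [1.0, 2.0, 4.0, 3.0, 6.0, 8.0, 7.0, 9.0]  # hypothetical data
baseline = [sum(series) / len(series)] * len(series)
error_baseline = sum_squared_error(series, baseline)
error_5_rounds = sum_squared_error(series, boost(series, n_rounds=5))
error_50_rounds = sum_squared_error(series, boost(series, n_rounds=50))
print(error_baseline, error_5_rounds, error_50_rounds)</pre></code></pre>
</div>

<p>Each round fits a new weak learner to what the current ensemble still gets wrong, so the training error shrinks as rounds are added. The learning rate keeps each step small.</p>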


<h2 class="wp-block-heading" class="wp-block-heading" id="h-advanced-and-specific-time-series-models">Advanced and specific time series models</h2>



<p>This section covers two advanced models for time series forecasting: <strong>GARCH</strong> and <strong>TBATS</strong>.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-garch">GARCH</h3>



<p><strong>GARCH</strong> (Generalized Autoregressive Conditional Heteroskedasticity) is primarily used to estimate volatility in financial markets.&nbsp;</p>



<p>Rather than predicting actual values, GARCH models the error variance in a time series, assuming an ARMA model for the variance. It&#8217;s ideal for forecasting volatility rather than point values.</p>



<p>The GARCH family has several variants, but all of them model volatility rather than the values themselves, which sets them apart from traditional time series models.</p>
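
<p>As a rough illustration of what modeling the variance means, the sketch below computes the conditional variance path of a GARCH(1,1) model, where each variance is a constant plus a weighted previous squared shock plus a weighted previous variance. The parameter values and returns are made-up assumptions. In practice, the parameters are estimated from data, for example with a dedicated package such as <span class="c-code-snippet">arch</span>.</p>

<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"># GARCH(1,1) variance recursion with fixed (not estimated) parameters:
# variance_t = omega + alpha * shock_(t-1) ** 2 + beta * variance_(t-1)

def garch_variance(returns, omega, alpha, beta):
    """Conditional variance for each time step, given hypothetical parameters."""
    mean_return = sum(returns) / len(returns)
    shocks = [r - mean_return for r in returns]
    # Initialize the recursion with the sample variance of the shocks.
    variance = [sum(e ** 2 for e in shocks) / len(shocks)]
    for e in shocks[:-1]:
        variance.append(omega + alpha * e ** 2 + beta * variance[-1])
    return variance


returns = [0.01, -0.02, 0.015, -0.03, 0.005]  # hypothetical daily returns
variance_path = garch_variance(returns, omega=1e-5, alpha=0.1, beta=0.85)
volatility_path = [v ** 0.5 for v in variance_path]
print(volatility_path)</pre></code></pre>
</div>

<p>A large shock raises the next variance through <span class="c-code-snippet">alpha</span>, and <span class="c-code-snippet">beta</span> controls how slowly that elevated volatility decays.</p>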



<h3 class="wp-block-heading" class="wp-block-heading" id="h-tbats">TBATS</h3>



<p><strong>TBATS</strong> stands for:</p>



<ul class="wp-block-list">
<li><strong>T</strong>rigonometric seasonality</li>



<li><strong>B</strong>ox-Cox transformation</li>



<li><strong>A</strong>RMA errors</li>



<li><strong>T</strong>rend</li>



<li><strong>S</strong>easonal components</li>
</ul>



<p>Introduced in 2011, TBATS is designed to handle time series with multiple seasonal cycles. This model is newer and less commonly used than ARIMA models but is effective for data with complex seasonal patterns.</p>



<p>A Python implementation of TBATS is available in the <a href="https://www.sktime.org/en/latest/api_reference/auto_generated/sktime.forecasting.tbats.TBATS.html" target="_blank" rel="noreferrer noopener nofollow"><span class="c-code-snippet">sktime</span></a> package.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-deep-learning-based-time-series-models">Deep learning-based time series models</h2>



<p>After exploring classical models, which focus on past-present relations, and supervised models, which focus on cause-effect relations, we can now look at more advanced deep learning models. These models, while complex, may offer superior forecasting performance depending on the data and context.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-lstms-long-short-term-memory">LSTMs (Long Short-Term Memory)</h3>



<p><strong>LSTM</strong> networks are a type of Recurrent Neural Network (RNN) specifically designed to handle sequential data. In LSTM models, multiple nodes pass input data through layers, each learning simple tasks that together capture complex, nonlinear relationships.</p>



<p>LSTMs are especially effective for time series forecasting because they can remember long-term dependencies in sequence data. Although they require substantial data and are challenging to train, they can be highly effective for complex time series patterns.&nbsp;</p>



<p>Python&#8217;s <a href="https://keras.io/api/layers/recurrent_layers/lstm/" target="_blank" rel="noreferrer noopener nofollow">Keras library</a> is a popular starting point for building LSTM models.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-prophet">Prophet</h3>



<p><strong>Prophet</strong> is a time series forecasting library open-sourced by Facebook. Prophet can generate forecasts with little user specification, making it easy to use, especially for non-experts in time series analysis.</p>



<p>However, it&#8217;s essential to validate Prophet forecasts carefully, as automated model building may overlook nuances in the data. When properly validated, Prophet can be an effective forecasting tool. More resources are available on <a href="https://facebook.github.io/prophet/" target="_blank" rel="noreferrer noopener nofollow">Facebook&#8217;s GitHub</a>.</p>





<h3 class="wp-block-heading" class="wp-block-heading" id="h-deepar">DeepAR</h3>



<p><strong>DeepAR</strong>, developed by Amazon, is another black-box model designed to simplify time series forecasting. While the underlying mechanics differ from Prophet, its user experience is automated too.</p>



<p>A great and easy-to-use implementation of DeepAR is available in the <a href="https://ts.gluon.ai/api/gluonts/gluonts.model.deepar.html" target="_blank" rel="noreferrer noopener nofollow">GluonTS</a> package.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-time-series-model-selection">Time series model selection</h2>



<p>After exploring various time series models—including classical, supervised, and recent developments like LSTM, Prophet, and DeepAR—the final step is choosing the model that best suits your use case.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-model-evaluation-and-metrics">Model evaluation and metrics</h3>



<h4 class="wp-block-heading">Defining metrics</h4>



<p>To <a href="/blog/ml-model-evaluation-and-selection" target="_blank" rel="noreferrer noopener">select a model</a>, you must first define the right metric(s) to evaluate your model.</p>



<p>Key metrics for time series forecasting include:</p>



<ul class="wp-block-list">
<li><strong>Mean Squared Error (MSE)</strong>: Measures squared error at each time point, then averages.</li>



<li><strong>Root Mean Squared Error (RMSE)</strong>: Square root of MSE, to have the error in its original units.</li>



<li><strong>Mean Absolute Error (MAE)</strong>: Averages the absolute values of the errors, which keeps the result in the original units and makes it easy to interpret.</li>



<li><strong>Mean Absolute Percentage Error (MAPE)</strong>: Expresses absolute errors as percentages of actual values, making results easier to interpret.</li>
</ul>
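
<p>The four metrics above can be written in a few lines of plain Python. This is a minimal sketch with made-up numbers. The function names are illustrative, and the MAPE version assumes that no actual value is zero.</p>

<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">import math


def mse(actual, predicted):
    """Mean of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of MSE, in the original units of the data."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute error in percent of the actual values (assumes no zeros)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 400.0]     # hypothetical observed values
predicted = [110.0, 180.0, 400.0]  # hypothetical forecasts
print(mse(actual, predicted), rmse(actual, predicted))
print(mae(actual, predicted), mape(actual, predicted))</pre></code></pre>
</div>

<p>Note how MSE and RMSE weight the larger error (20) more heavily than MAE does, while MAPE treats the two errors equally because each is 10% of its actual value.</p>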



<h4 class="wp-block-heading">Train-test split and cross-validation</h4>



<p>When evaluating machine learning models, remember that good performance on training data doesn&#8217;t guarantee good results on new, out-of-sample data. To estimate how well a model generalizes, two common approaches are <strong>train-test split</strong> and <strong>cross-validation</strong>.</p>



<p>Doing a <strong>train-test split</strong> involves holding back a portion of the data as a test set. For example, you could reserve the last 3 years of a CO2 dataset as a test set and use the remaining 40 years for training. You&#8217;d then forecast the reserved period and compare the predictions to the actual values using your chosen evaluation metric.</p>



<p>To benchmark multiple models, train each on the same 40-year data, forecast the test period, and select the model with the best performance.</p>



<p>A limitation of the train-test split in time series is that it only evaluates performance at a single point in time. Unlike non-sequential data, where test sets can be randomly selected, time series data relies on sequence order, so it is essential to reserve the final period as the test set. However, this approach may be unreliable if the last period is atypical (e.g., due to events like COVID-19, which disrupted trends and forecasts).</p>
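
<p>A chronological split and a simple model benchmark can be sketched as follows. The series is a made-up stand-in for 43 yearly values (40 for training, 3 for testing), and the two baseline forecasters are hypothetical placeholders for whatever models you want to compare.</p>

<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">def chronological_split(values, test_size):
    """Keep the last test_size observations as the test set; never shuffle."""
    split_point = len(values) - test_size
    return values[:split_point], values[split_point:]


def naive_forecast(train, horizon):
    """Baseline that repeats the last training value."""
    return [train[-1]] * horizon


def mean_forecast(train, horizon):
    """Baseline that repeats the mean of the training data."""
    return [sum(train) / len(train)] * horizon


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


series = [float(v) for v in range(1, 44)]  # hypothetical: 43 yearly values
train, test = chronological_split(series, test_size=3)

# Train every candidate on the same data and score it on the same test period.
scores = {
    "naive": mean_absolute_error(test, naive_forecast(train, len(test))),
    "mean": mean_absolute_error(test, mean_forecast(train, len(test))),
}
best_model = min(scores, key=scores.get)
print(scores, best_model)</pre></code></pre>
</div>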



<p><strong>Cross-validation</strong> provides a more robust approach, repeatedly splitting the data for training and testing. For example, in <strong>3-fold cross-validation</strong>, the data is divided into three parts. Each fold is used as a test set once, with the other two as training sets, producing three evaluation scores. Averaging these scores provides a more reliable measure of model performance.</p>





<p>By doing this, you avoid selecting a model that performs well on the test set by chance: you now have a more reliable measure of its performance.&nbsp;</p>
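
<p>The 3-fold procedure described above can be sketched in plain Python. The data, the mean-based forecaster, and the function names are illustrative assumptions. Note that plain k-fold lets a model train on observations that come after the test fold. For time series, time-ordered variants such as scikit-learn&#8217;s <span class="c-code-snippet">TimeSeriesSplit</span>, which train only on data before each test fold, are often preferred.</p>

<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    sizes = [base + 1] * extra + [base] * (k - extra)
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def mean_forecast(train, horizon):
    """Hypothetical stand-in model: predict the training mean."""
    return [sum(train) / len(train)] * horizon


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def cross_validate(values, k, forecast_fn, metric_fn):
    """Use each fold as the test set once and return one score per fold."""
    scores = []
    for test_idx in k_fold_indices(len(values), k):
        held_out = set(test_idx)
        train = [v for i, v in enumerate(values) if i not in held_out]
        test = [values[i] for i in test_idx]
        scores.append(metric_fn(test, forecast_fn(train, len(test))))
    return scores


series = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]  # hypothetical data
fold_scores = cross_validate(series, 3, mean_forecast, mean_absolute_error)
average_score = sum(fold_scores) / len(fold_scores)
print(fold_scores, average_score)</pre></code></pre>
</div>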



<h3 class="wp-block-heading" class="wp-block-heading" id="h-time-series-model-experiments">Time series model experiments</h3>



<p>To guide your time series model selection, consider the following key questions before starting your experiments:</p>



<ol class="wp-block-list">
<li>Which metric will you use for evaluation?</li>



<li>Which period are you aiming to forecast?</li>



<li>How will you ensure the model performs well in the future using unseen data?</li>
</ol>



<p>Once you&#8217;ve answered these questions, you can begin testing different models and applying your evaluation strategy to select and refine the best-performing model.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-example-use-case-time-series-forecasting-for-sp-500">Example use case: time series forecasting for S&amp;P 500</h2>



<p>In this example, we&#8217;ll build a model to predict the next day&#8217;s direction (up or down) for the S&amp;P 500 index, simulating a scenario where predictions are made nightly for potential trading insights (note: don&#8217;t take this as serious financial advice!).</p>



<h4 class="wp-block-heading">Stock market forecasting data</h4>



<p>To access stock data, we can use the Yahoo Finance (<span class="c-code-snippet">yfinance</span>) package in Python:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">!pip install yfinance
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> yfinance <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> yf

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Download S&amp;P 500 closing prices from Yahoo Finance</span>
sp500_data = yf.download(<span class="hljs-string" style="color: rgb(221, 17, 68);">'^GSPC'</span>, start=<span class="hljs-string" style="color: rgb(221, 17, 68);">"1980-01-01"</span>, end=<span class="hljs-string" style="color: rgb(221, 17, 68);">"2021-11-21"</span>)[[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Close'</span>]]
sp500_data.plot(figsize=(<span class="hljs-number" style="color: teal;">12</span>, <span class="hljs-number" style="color: teal;">12</span>))</pre></code></pre>
</div>




<p>The output:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_1.png?ssl=1" alt="This plot shows the evolution of S&amp;P 500 closing prices since 1980" class="wp-image-58972" style="aspect-ratio:1.136094674556213;width:576px;height:auto"/><figcaption class="wp-element-caption"><em>This plot shows the evolution of S&amp;P 500 closing prices since 1980 | Source: Author</em></figcaption></figure>
</div>


<p>Instead of using absolute prices, traders often focus on daily percentage changes. We can calculate these changes as follows:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Daily percentage change: (today - yesterday) / yesterday</span>
difs = sp500_data.pct_change()
difs = difs.dropna()
difs.plot(figsize=(<span class="hljs-number" style="color: teal;">12</span>, <span class="hljs-number" style="color: teal;">12</span>))</pre></code></pre>
</div>



<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/How-to-Select-a-Model-For-Your-Time-Series-Prediction-Task_15.png?ssl=1" alt="The plot now displays the percentage change in the S&amp;P 500 over time" class="wp-image-58958" style="width:569px;height:509px"/><figcaption class="wp-element-caption"><em>The plot now displays the percentage change in the S&amp;P 500 over time | Source: Author</em></figcaption></figure>
</div>
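<p>To make the calculation concrete, here is a minimal plain-Python sketch on made-up prices. The helper name <span class="c-code-snippet">pct_changes</span> is hypothetical; it follows the usual definition of a daily percentage change, (today &minus; yesterday) / yesterday.</p>

```python
def pct_changes(prices):
    """Return the relative change between each pair of consecutive prices."""
    return [(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]


# Made-up closing prices, for illustration only
toy_prices = [100.0, 110.0, 99.0]
changes = pct_changes(toy_prices)
print(changes)
```

<p>The first price has no predecessor, so the result is one element shorter than the input. That is why the snippet above drops the leading NaN value.</p>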


<h4 class="wp-block-heading">Defining the experimental approach</h4>



<p>The model&#8217;s goal is to predict the next day&#8217;s percentage change accurately. Since our prediction period is just one day, the test set will be small, and multiple test splits are needed to ensure reliability.</p>



<p>We can set up 100 different train-test splits using the train-test split we discussed previously, with each split training on three months of data and testing on the following day. This setup allows consistent evaluation and better selection of the best-performing model.</p>
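<p>The splitting scheme can be sketched in plain Python. The helper <span class="c-code-snippet">rolling_splits</span> below is hypothetical and only illustrates the idea: the last <span class="c-code-snippet">n_splits</span> points each serve once as a test set, and each is preceded by a training window of at most <span class="c-code-snippet">max_train_size</span> points. The toy sizes are chosen for readability, not taken from the experiment.</p>

```python
def rolling_splits(n_samples, n_splits, max_train_size, test_size=1):
    """Yield (train_indices, test_indices) pairs in chronological order."""
    first_test = n_samples - n_splits * test_size
    if first_test < 1:
        raise ValueError("not enough samples for the requested splits")
    for test_start in range(first_test, n_samples, test_size):
        # Training window: the points right before the test point,
        # capped at max_train_size
        train_start = max(0, test_start - max_train_size)
        train = list(range(train_start, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


# Toy example: 10 observations, 3 splits, at most 4 training points each
splits = list(rolling_splits(n_samples=10, n_splits=3, max_train_size=4))
for train, test in splits:
    print(train, "->", test)
```

<p>Every training window ends right before its test point, so the model never sees data from the future of the day it is asked to predict.</p>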



<h3 class="wp-block-heading" id="h-building-a-classical-time-series-model-arima">Building a classical time series model: ARIMA</h3>



<p>To address this forecasting problem, we&#8217;ll start with a classical ARIMA model. The code below sets up ARIMA models with orders ranging from (0, 0, 0) to (4, 4, 4). Each model is evaluated using 100 splits, where each split uses a maximum of three months for training and one day for testing.</p>
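<p>To get a feel for the size of this search, the short sketch below enumerates the orders with <span class="c-code-snippet">itertools.product</span> and counts the resulting model fits. The variable names are illustrative only.</p>

```python
from itertools import product

# Every (p, d, q) order with each component between 0 and 4
orders = list(product(range(5), repeat=3))

n_splits = 100  # splits per order, as described above
total_fits = len(orders) * n_splits

print(len(orders), total_fits)
```

<p>That is 125 candidate orders and 12,500 individual fits, before counting the ones that fail to converge.</p>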



<p>There are a lot of training and test runs, so we will use a tracking tool, <a href="/" target="_blank" rel="noreferrer noopener">neptune.ai</a>, for easy comparison. </p>



<section
	id="i-box-block_0f6a03f830b91a6d96b351b21c4eab2f"
	class="block-i-box  l-margin__top--large l-margin__bottom--x-large">

			<header class="c-header">
			<img
				src="https://neptune.ai/wp-content/themes/neptune/img/image-ratio-holder.svg"
				data-src="https://neptune.ai/wp-content/themes/neptune/img/blocks/i-box/header-icon.svg"
				width="24"
				height="24"
				class="c-header__icon lazyload"
				alt="">

			
            <h2 class="c-header__text animation " style='max-width: 100%;'   >
                <strong>Disclaimer</strong>
            </h2>		</header>
	
	<div class="block-i-box__inner">
		

<p>Please note that this article references a <strong>deprecated version of Neptune</strong>.</p>



<p>For information on the latest version with improved features and functionality, please <a href="/" target="_blank" rel="noreferrer noopener">visit our website</a>.</p>


	</div>

</section>



<p>Before we continue, let&#8217;s first:</p>



<ul class="wp-block-list">
<li>Sign up for a Neptune account and <a href="https://docs-legacy.neptune.ai/setup/creating_project/" target="_blank" rel="noreferrer noopener">create a new project</a></li>



<li><a href="https://docs-legacy.neptune.ai/setup/setting_credentials/" target="_blank" rel="noreferrer noopener">Save your credentials</a> as environment variables</li>
</ul>



<p>Now that we are all set up, let&#8217;s start!</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np 
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.metrics <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> mean_squared_error 
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.model_selection <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> TimeSeriesSplit 
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> neptune 
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> neptune.utils <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> stringify_unsupported 
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> statsmodels.api <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> sm

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># List of ARIMA parameter combinations </span>
param_list = [(p, d, q) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> p <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> range(<span class="hljs-number" style="color: teal;">5</span>) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> d <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> range(<span class="hljs-number" style="color: teal;">5</span>) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> q <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> range(<span class="hljs-number" style="color: teal;">5</span>)] 

<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> order <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> param_list:
    <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Initialize a Neptune run</span>
    run = neptune.init_run(
        project=<span class="hljs-string" style="color: rgb(221, 17, 68);">"YOU/YOUR_PROJECT"</span>,
        api_token=<span class="hljs-string" style="color: rgb(221, 17, 68);">"YOUR_API_TOKEN"</span>,
    )

    run[<span class="hljs-string" style="color: rgb(221, 17, 68);">'parameters/order'</span>] = stringify_unsupported(order)

    mses = []
    tscv = TimeSeriesSplit(n_splits=<span class="hljs-number" style="color: teal;">100</span>, max_train_size=<span class="hljs-number" style="color: teal;">3</span>*<span class="hljs-number" style="color: teal;">31</span>, test_size=<span class="hljs-number" style="color: teal;">1</span>)

    <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> train_index, test_index <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> tscv.split(difs.Close.values):
        <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">try</span>:
            train, test = difs.Close.values[train_index], difs.Close.values[test_index]

            <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Fit ARIMA model</span>
            model = sm.tsa.ARIMA(train, order=order)
            result = model.fit()
            prediction = result.forecast(<span class="hljs-number" style="color: teal;">1</span>)[<span class="hljs-number" style="color: teal;">0</span>]

            <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Calculate Mean Squared Error (MSE)</span>
            mse = mean_squared_error(test, [prediction])
            mses.append(mse)

        <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">except</span> Exception:
            <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Skip splits where the model fails to fit or forecast</span>
            <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">pass</span>

    <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Log results to Neptune</span>
    run[<span class="hljs-string" style="color: rgb(221, 17, 68);">'average_mse'</span>] = np.mean(mses) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">if</span> mses <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">else</span> <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">None</span>
    run[<span class="hljs-string" style="color: rgb(221, 17, 68);">'std_mse'</span>] = np.std(mses) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">if</span> mses <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">else</span> <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">None</span>
    run.stop()</pre></code></pre>
</div>




<p>After running, you can <a href="https://app.neptune.ai/o/community/org/time-series-prediction/runs/table?viewId=9e0f7494-f6fa-463b-bcde-e01f3ee3a6b5" target="_blank" rel="noreferrer noopener nofollow">view the results</a> in a table format in the Neptune dashboard:</p>



<div id="app-screenshot-block_7f32550951bddefe05c3770f3ac6374a"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Dashboard-view-of-our-100-runs-in-the-Neptune-UI.png?fit=1020%2C574&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Dashboard-view-of-our-100-runs-in-the-Neptune-UI.png?fit=480%2C270&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Dashboard-view-of-our-100-runs-in-the-Neptune-UI.png?fit=768%2C432&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Dashboard-view-of-our-100-runs-in-the-Neptune-UI.png?fit=1020%2C574&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="574"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://app.neptune.ai/o/community/org/time-series-prediction/runs/table?viewId=9e0f7494-f6fa-463b-bcde-e01f3ee3a6b5"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in the app							</span>
						
					</a>

				
														<button
						class="js-c-image-full-screen-modal c-button c-button--tertiary c-button--small">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-zoom.svg"
							width="16"
							height="17"
							class="c-button__icon"
							alt="zoom"
						/>

						<span class="c-button__text">
							Full screen preview						</span>
						
					</button>
									
			</div>

		</div>

					<figcaption class="block-app-screenshot__caption">
				Dashboard view of our 100 runs in the Neptune UI			</figcaption>
			
</div>



<div id="separator-block_67f96a4bc5290f04525fa51bf7bd75d7"
         class="block-separator block-separator--15">
</div>



<p>The model with the lowest average MSE is ARIMA(0, 1, 3). However, its standard deviation is exactly 0, which is suspicious: it may mean that only one split produced a valid forecast, in which case the average rests on a single data point. The next best models, ARIMA(1, 0, 3) and ARIMA(1, 0, 2), perform very similarly to each other, which makes their results more credible.</p>



<p>Based on this, ARIMA(1, 0, 3) is the best choice, with an average MSE of 0.00000131908 and a standard deviation of 0.00000197007, suggesting both accuracy and consistency in forecasting performance.</p>
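<p>A standard deviation of exactly 0 is what you get when the statistics are computed over a single value. The sketch below illustrates this with made-up MSE values. It is one possible explanation for the ARIMA(0, 1, 3) result, not something confirmed by the run logs, and the helper <span class="c-code-snippet">summarize</span> is hypothetical.</p>

```python
from statistics import fmean, pstdev


def summarize(mses):
    """Return (mean, population std) of the per-split MSEs, or (None, None)."""
    if not mses:
        return None, None
    return fmean(mses), pstdev(mses)


# Made-up values: one successful split vs. several successful splits
single = summarize([2.5e-7])
several = summarize([1.0e-6, 3.0e-6, 2.0e-6])

print(single)
print(several)
```

<p>Logging the number of successful splits next to the average and standard deviation would make such cases easy to spot in the results table.</p>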



<section id="note-block_aa5eb4ca69dce78da6dc05b0affc6b96"
         class="block-note c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

    
    <div class="block-note__content">
                    <div class="c-item c-item--wysiwyg_editor">

                
                
                <div class="c-item__content">

                                            <p>💡 If you work with Prophet, the <a href="https://docs-legacy.neptune.ai/integrations/prophet/" target="_blank" rel="noopener">Neptune-Prophet integration</a> can help you track parameters, forecast data frames, residual diagnostic charts, and other model-building metadata.</p>
                                    </div>

            </div>
            </div>


</section>



<h3 class="wp-block-heading" id="h-building-a-supervised-machine-learning-model">Building a supervised machine learning model</h3>



<p>Next, we&#8217;ll explore a supervised machine learning model and see how its performance compares to a classical time series model.</p>



<p>As we mentioned earlier, feature engineering is important in supervised machine learning for forecasting. Supervised models need both dependent (target) and independent (predictor) variables. Sometimes, you may have additional future data (like reservation numbers to help predict a restaurant&#8217;s daily customer count). However, for this stock market example, we only have past stock prices.</p>



<p>A supervised model can&#8217;t be trained on just a target variable, so we need to create features to capture seasonality and autocorrelation effects. For this model, we&#8217;ll use the daily percentage changes from the past 30 days as input features to predict the change on the 31st day.</p>



<p>This approach will create a dataset where each entry contains 30 consecutive days as predictors and the 31st day as the target. By sliding this 30-day window across the S&amp;P 500 data, we can generate a large training dataset for model development.</p>
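<p>On a toy series, the windowing works as follows. This is a minimal sketch with a hypothetical helper, <span class="c-code-snippet">make_windows</span>, and a window of 3 instead of 30 to keep the output readable.</p>

```python
def make_windows(series, window):
    """Split a series into (features, target) pairs using a sliding window."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])  # `window` consecutive values
        y.append(series[i + window])    # the value right after them
    return X, y


toy_series = [0, 1, 2, 3, 4, 5]
X_toy, y_toy = make_windows(toy_series, window=3)

for features, target in zip(X_toy, y_toy):
    print(features, "->", target)
```

<p>A series of length n yields n minus the window size rows, so decades of daily data produce several thousand training examples.</p>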






<p>The code below builds this dataset from the daily percentage changes:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> yfinance <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> yf
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Download S&amp;P 500 closing price data</span>
sp500_data = yf.download(<span class="hljs-string" style="color: rgb(221, 17, 68);">'^GSPC'</span>, start=<span class="hljs-string" style="color: rgb(221, 17, 68);">"1980-01-01"</span>, end=<span class="hljs-string" style="color: rgb(221, 17, 68);">"2021-11-21"</span>)
sp500_data = sp500_data[[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Close'</span>]]

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Calculate daily percentage changes</span>
difs = sp500_data.pct_change()
difs = difs.dropna()  <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Remove any NaN values from the dataset</span>

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Extract the 'Close' values as our target variable</span>
y = difs.Close.values

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Generate input windows of 30 days to predict the 31st day</span>
X_data = []
y_data = []
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> i <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> range(len(y) - <span class="hljs-number" style="color: teal;">30</span>):
    X_data.append(y[i:i+<span class="hljs-number" style="color: teal;">30</span>])     <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Last 30 days as input features</span>
    y_data.append(y[i+<span class="hljs-number" style="color: teal;">30</span>])       <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># 31st day as the target</span>

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Convert lists to numpy arrays</span>
X_windows = np.vstack(X_data)
y_data = np.array(y_data)</pre></code></pre>
</div>




<p>With the training dataset prepared, you can now apply regular cross-validation. Each row in this dataset is a separate sequence of 30 input days followed by one target day, so the rows can be used independently by the model.</p>



<p>Using cross-validation will help evaluate model performance more reliably by testing it across different splits of the data, rather than relying on a single train-test split.</p>
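<p>As a rough illustration of how shuffled K-fold cross-validation partitions the rows, here is a plain-Python sketch. The helper <span class="c-code-snippet">kfold_indices</span> is hypothetical, and its fold assignment will not match scikit-learn&#8217;s <span class="c-code-snippet">KFold</span> for the same seed; it only shows the principle that every row is used for testing exactly once.</p>

```python
import random


def kfold_indices(n_samples, n_splits, seed=0):
    """Yield (train_indices, test_indices) for a shuffled K-fold scheme."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_splits folds, like dealing cards
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for k in range(n_splits):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


folds = list(kfold_indices(n_samples=10, n_splits=5))
for train, test in folds:
    print("train:", train, "test:", test)
```

<p>Averaging the error over all folds gives one score per model configuration, which is what the grid search compares.</p>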



<p>The code below performs a grid search with cross-validation using XGBoost on our prepared dataset. It evaluates model performance across multiple hyperparameter combinations and logs the results to Neptune.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> xgboost <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> xgb
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.model_selection <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> KFold
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> neptune
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> neptune.utils <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> stringify_unsupported
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.metrics <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> mean_squared_error

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Define parameter grid for hyperparameter tuning</span>
parameters = {
    <span class="hljs-string" style="color: rgb(221, 17, 68);">'max_depth'</span>: list(range(<span class="hljs-number" style="color: teal;">2</span>, <span class="hljs-number" style="color: teal;">20</span>, <span class="hljs-number" style="color: teal;">4</span>)),
    <span class="hljs-string" style="color: rgb(221, 17, 68);">'gamma'</span>: list(range(<span class="hljs-number" style="color: teal;">0</span>, <span class="hljs-number" style="color: teal;">10</span>, <span class="hljs-number" style="color: teal;">2</span>)),
    <span class="hljs-string" style="color: rgb(221, 17, 68);">'min_child_weight'</span>: list(range(<span class="hljs-number" style="color: teal;">0</span>, <span class="hljs-number" style="color: teal;">10</span>, <span class="hljs-number" style="color: teal;">2</span>)),
    <span class="hljs-string" style="color: rgb(221, 17, 68);">'eta'</span>: [<span class="hljs-number" style="color: teal;">0.01</span>, <span class="hljs-number" style="color: teal;">0.05</span>, <span class="hljs-number" style="color: teal;">0.1</span>, <span class="hljs-number" style="color: teal;">0.15</span>, <span class="hljs-number" style="color: teal;">0.2</span>, <span class="hljs-number" style="color: teal;">0.3</span>, <span class="hljs-number" style="color: teal;">0.5</span>]
}

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Create a list of all possible parameter combinations</span>
param_list = [(x, y, z, a) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> x <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> parameters[<span class="hljs-string" style="color: rgb(221, 17, 68);">'max_depth'</span>] 
                              <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> y <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> parameters[<span class="hljs-string" style="color: rgb(221, 17, 68);">'gamma'</span>] 
                              <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> z <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> parameters[<span class="hljs-string" style="color: rgb(221, 17, 68);">'min_child_weight'</span>] 
                              <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> a <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> parameters[<span class="hljs-string" style="color: rgb(221, 17, 68);">'eta'</span>]]

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Iterate over all parameter combinations</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> params <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> param_list:
    mses = []

    <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Initialize Neptune run for logging</span>
    run = neptune.init_run(
        project=<span class="hljs-string" style="color: rgb(221, 17, 68);">"YOU/YOUR_PROJECT"</span>,
        api_token=<span class="hljs-string" style="color: rgb(221, 17, 68);">"YOUR_API_TOKEN"</span>,
    )
    run[<span class="hljs-string" style="color: rgb(221, 17, 68);">'params'</span>] = stringify_unsupported(params)

    <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Set up KFold cross-validation</span>
    my_kfold = KFold(n_splits=<span class="hljs-number" style="color: teal;">10</span>, shuffle=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>, random_state=<span class="hljs-number" style="color: teal;">0</span>)

    <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> train_index, test_index <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> my_kfold.split(X_windows):
        X_train, X_test = X_windows[train_index], X_windows[test_index]
        y_train, y_test = np.array(y_data)[train_index], np.array(y_data)[test_index]

        <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Create and train the XGBoost model</span>
        xgb_model = xgb.XGBRegressor(
            max_depth=params[<span class="hljs-number" style="color: teal;">0</span>],
            gamma=params[<span class="hljs-number" style="color: teal;">1</span>],
            min_child_weight=params[<span class="hljs-number" style="color: teal;">2</span>],
            eta=params[<span class="hljs-number" style="color: teal;">3</span>]
        )
        xgb_model.fit(X_train, y_train)
        preds = xgb_model.predict(X_test)

        <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Calculate and store Mean Squared Error for the fold</span>
        mses.append(mean_squared_error(y_test, preds))

    <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Log average MSE and standard deviation to Neptune</span>
    average_mse = np.mean(mses)
    std_mse = np.std(mses)
    run[<span class="hljs-string" style="color: rgb(221, 17, 68);">'average_mse'</span>] = average_mse
    run[<span class="hljs-string" style="color: rgb(221, 17, 68);">'std_mse'</span>] = std_mse

    <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Stop the Neptune run</span>
    run.stop()</pre></code></pre>
</div>
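<p>One possible variation, shown here only as a sketch: building the grid with <span class="c-code-snippet">itertools.product</span> and keeping each combination as a dictionary avoids positional lookups such as <span class="c-code-snippet">params[0]</span>. The grid values are copied from the code above; the name <span class="c-code-snippet">param_dicts</span> is illustrative.</p>

```python
from itertools import product

parameters = {
    "max_depth": list(range(2, 20, 4)),
    "gamma": list(range(0, 10, 2)),
    "min_child_weight": list(range(0, 10, 2)),
    "eta": [0.01, 0.05, 0.1, 0.15, 0.2, 0.3, 0.5],
}

# product() walks the value lists in the dict's insertion order,
# so zipping with the keys restores the parameter names
param_dicts = [
    dict(zip(parameters, combo)) for combo in product(*parameters.values())
]

n_folds = 10
print(len(param_dicts), len(param_dicts) * n_folds)
print(param_dicts[0])
```

<p>Each dictionary can then be unpacked into the model constructor as keyword arguments. With this grid, the loop trains 875 configurations with 10 folds each, so 8,750 models in total.</p>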




<p>Some of the scores obtained using this loop are shown in the table below:</p>



<div id="app-screenshot-block_3c7159673403c99597ef8b572a40ff90"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Table-view-of-our-runs-from-the-Neptune.png?fit=1020%2C574&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Table-view-of-our-runs-from-the-Neptune.png?fit=480%2C270&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Table-view-of-our-runs-from-the-Neptune.png?fit=768%2C432&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Table-view-of-our-runs-from-the-Neptune.png?fit=1020%2C574&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="574"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://app.neptune.ai/o/community/org/time-series-prediction/runs/table?viewId=9e0ff933-dc30-451f-8541-8e4d3a6996dc"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in the app							</span>
						
					</a>

				
														<button
						class="js-c-image-full-screen-modal c-button c-button--tertiary c-button--small">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-zoom.svg"
							width="16"
							height="17"
							class="c-button__icon"
							alt="zoom"
						/>

						<span class="c-button__text">
							Full screen preview						</span>
						
					</button>
									
			</div>

		</div>

					<figcaption class="block-app-screenshot__caption">
				Table view of our runs from the Neptune UI			</figcaption>
			
</div>



<div id="separator-block_67f96a4bc5290f04525fa51bf7bd75d7"
         class="block-separator block-separator--15">
</div>



<p>The parameters that were tested in this grid search are reproduced below:</p>



<div id="medium-table-block_b073ca09ed072663838c04fe802934a2"
     class="block-medium-table c-table__outer-wrapper  l-padding__top--0 l-padding__bottom--0 l-margin__top--0 l-margin__bottom--0">

    <table class="c-table">
                    <thead class="c-table__head">
            <tr>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Parameter name                        </div>
                    </td>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Values tested                        </div>
                    </td>
                                    <td class="c-item"
                        style="">
                        <div class="c-item__inner">
                            Description                        </div>
                    </td>
                            </tr>
            </thead>
        
        <tbody class="c-table__body">

                    
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Max Depth</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>2, 4, 6, 8, 10</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">Controls the tree depth. Higher values make the model more complex and increase the risk of overfitting.</span></p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Min Child Weight</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>0, 2, 4</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">Minimum sum of instance weights needed in a child node. Higher values prevent overly complex models by stopping splits that don&#8217;t meet this threshold.</span></p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Eta</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>0.01, 0.1, 0.3</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">Learning rate (or step size). Low values mean slow learning, but they can improve accuracy by preventing overfitting.</span></p>
                                                            </div>
                        </td>

                    
                </tr>

            
                <tr class="c-row">

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>Gamma</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p>0, 2, 4</p>
                                                            </div>
                        </td>

                    
                        <td class="c-ceil">
                            <div class="c-ceil__inner">
                                                                    <p><span style="font-weight: 400;">Minimum loss reduction required to split a node. Higher values make the model more conservative by reducing unnecessary splits.</span></p>
                                                            </div>
                        </td>

                    
                </tr>

                    
        </tbody>
    </table>

</div>
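
<p>As a rough illustration of how large this search space is, the sketch below enumerates every combination of the values in the table above. The dictionary keys follow XGBoost's parameter names (max_depth, min_child_weight, eta, gamma); the variable names param_grid and combinations are made up for this example and are not taken from the original grid search code.</p>

```python
from itertools import product

# Values copied from the table above; variable names are illustrative only.
param_grid = {
    "max_depth": [2, 4, 6, 8, 10],
    "min_child_weight": [0, 2, 4],
    "eta": [0.01, 0.1, 0.3],
    "gamma": [0, 2, 4],
}

# Cartesian product of all value lists: one dict per candidate configuration.
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

print(len(combinations))  # 5 * 3 * 3 * 3 configurations
print(combinations[0])
```

<p>Each of these dictionaries corresponds to one candidate configuration, so the number of runs grows multiplicatively with every parameter added to the grid.</p>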



<div id="separator-block_9daaa2aa5d82c0c2d3ff865de4643139"
         class="block-separator block-separator--20">
</div>



<p>For more information on XGBoost tuning, check out the official <a href="https://xgboost.readthedocs.io/en/latest/tutorials/param_tuning.html">XGBoost documentation</a>.</p>



<p>The best (lowest) MSE this XGBoost model achieves is 0.000129982, with several hyperparameter combinations reaching this score. However, the XGBoost model underperforms in its current setup compared to the classical time series model. To improve XGBoost&#8217;s results, a different approach to organizing the data may be needed.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-lstm-model-for-time-series-forecasting">LSTM model for time series forecasting&nbsp;</h3>



<p>As a third model for the model comparison, let&#8217;s take an LSTM and see whether it can beat our ARIMA model.&nbsp;</p>



<p>The following code sets up an LSTM model using Keras. Instead of cross-validation (which can be time-consuming for LSTMs), a train-test split approach is used here to evaluate the model.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--standard l-margin__bottom--standard block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code><pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> tensorflow <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> tf
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> yfinance <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> yf
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.model_selection <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> train_test_split
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> neptune

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Load and preprocess data</span>
sp500_data = yf.download(<span class="hljs-string" style="color: rgb(221, 17, 68);">'^GSPC'</span>, start=<span class="hljs-string" style="color: rgb(221, 17, 68);">"1980-01-01"</span>, end=<span class="hljs-string" style="color: rgb(221, 17, 68);">"2021-11-21"</span>)
sp500_data = sp500_data[[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Close'</span>]]
difs = (sp500_data.shift() - sp500_data) / sp500_data
difs = difs.dropna()
y = difs.Close.values

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Create windows</span>
X_data = []
y_data = []
window_size = <span class="hljs-number" style="color: teal;">3</span> * <span class="hljs-number" style="color: teal;">31</span>  <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># 3 months of data</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> i <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> range(len(y) - window_size):
    X_data.append(y[i:i+window_size])
    y_data.append(y[i+window_size])

X_windows = np.array(X_data)
y_data = np.array(y_data)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Train/test/validation split</span>
X_train, X_test, y_train, y_test = train_test_split(X_windows, y_data, test_size=<span class="hljs-number" style="color: teal;">0.2</span>, random_state=<span class="hljs-number" style="color: teal;">1</span>)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=<span class="hljs-number" style="color: teal;">0.25</span>, random_state=<span class="hljs-number" style="color: teal;">1</span>)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Reshape input data for LSTM (samples, timesteps, features)</span>
X_train = X_train.reshape((X_train.shape[<span class="hljs-number" style="color: teal;">0</span>], X_train.shape[<span class="hljs-number" style="color: teal;">1</span>], <span class="hljs-number" style="color: teal;">1</span>))
X_val = X_val.reshape((X_val.shape[<span class="hljs-number" style="color: teal;">0</span>], X_val.shape[<span class="hljs-number" style="color: teal;">1</span>], <span class="hljs-number" style="color: teal;">1</span>))
X_test = X_test.reshape((X_test.shape[<span class="hljs-number" style="color: teal;">0</span>], X_test.shape[<span class="hljs-number" style="color: teal;">1</span>], <span class="hljs-number" style="color: teal;">1</span>))

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Define LSTM architectures</span>
archi_list = [
    [tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">32</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>, input_shape=(window_size, <span class="hljs-number" style="color: teal;">1</span>)),
     tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">32</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">False</span>),
     tf.keras.layers.Dense(units=<span class="hljs-number" style="color: teal;">1</span>)],
    [tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">64</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>, input_shape=(window_size, <span class="hljs-number" style="color: teal;">1</span>)),
     tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">64</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">False</span>),
     tf.keras.layers.Dense(units=<span class="hljs-number" style="color: teal;">1</span>)],
    [tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">128</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>, input_shape=(window_size, <span class="hljs-number" style="color: teal;">1</span>)),
     tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">128</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">False</span>),
     tf.keras.layers.Dense(units=<span class="hljs-number" style="color: teal;">1</span>)],
    [tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">32</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>, input_shape=(window_size, <span class="hljs-number" style="color: teal;">1</span>)),
     tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">32</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>),
     tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">32</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">False</span>),
     tf.keras.layers.Dense(units=<span class="hljs-number" style="color: teal;">1</span>)],
    [tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">64</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>, input_shape=(window_size, <span class="hljs-number" style="color: teal;">1</span>)),
     tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">64</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>),
     tf.keras.layers.LSTM(<span class="hljs-number" style="color: teal;">64</span>, return_sequences=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">False</span>),
     tf.keras.layers.Dense(units=<span class="hljs-number" style="color: teal;">1</span>)],
]

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Loop through architectures and log results</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> archi <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> archi_list:
    run = neptune.init_run(
        project=<span class="hljs-string" style="color: rgb(221, 17, 68);">"YOU/YOUR_PROJECT"</span>,
        api_token=<span class="hljs-string" style="color: rgb(221, 17, 68);">"YOUR_API_TOKEN"</span>,
    )

    run[<span class="hljs-string" style="color: rgb(221, 17, 68);">'params'</span>] = f<span class="hljs-string" style="color: rgb(221, 17, 68);">'LSTM Layers: {len(archi) - 1}, Units: {archi[0].units}'</span>
    run[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Tags'</span>] = <span class="hljs-string" style="color: rgb(221, 17, 68);">'lstm_model_comparison'</span>

    <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Build, compile, and train model</span>
    lstm_model = tf.keras.models.Sequential(archi)
    lstm_model.compile(loss=tf.losses.MeanSquaredError(),
                       optimizer=tf.optimizers.Adam(),
                       metrics=[tf.metrics.MeanSquaredError()])
    history = lstm_model.fit(X_train, y_train, epochs=<span class="hljs-number" style="color: teal;">10</span>, validation_data=(X_val, y_val), verbose=<span class="hljs-number" style="color: teal;">1</span>)
    
    <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Log final validation MSE to Neptune</span>
    run[<span class="hljs-string" style="color: rgb(221, 17, 68);">'final_val_mse'</span>] = history.history[<span class="hljs-string" style="color: rgb(221, 17, 68);">'val_mean_squared_error'</span>][<span class="hljs-number" style="color: teal;">-1</span>]
    run.stop()</pre></code></pre>
</div>




<p>Here, we can see the output for the 10 epochs:</p>



<div id="app-screenshot-block_9eff34e4ba0b4a14ebd4f20d7bb0dd91"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Output-values-for-our-LSTM-viewed-in-Neptune.png?fit=1020%2C562&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Output-values-for-our-LSTM-viewed-in-Neptune.png?fit=480%2C265&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Output-values-for-our-LSTM-viewed-in-Neptune.png?fit=768%2C423&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Output-values-for-our-LSTM-viewed-in-Neptune.png?fit=1020%2C562&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="562"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://app.neptune.ai/o/community/org/time-series-prediction/runs/table?viewId=9e0ff933-dc30-451f-8541-8e4d3a6996dc"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in the app							</span>
						
					</a>

				
														<button
						class="js-c-image-full-screen-modal c-button c-button--tertiary c-button--small">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-zoom.svg"
							width="16"
							height="17"
							class="c-button__icon"
							alt="zoom"
						/>

						<span class="c-button__text">
							Full screen preview						</span>
						
					</button>
									
			</div>

		</div>

					<figcaption class="block-app-screenshot__caption">
				Output values for our LSTM viewed in the Neptune UI			</figcaption>
			
</div>



<div id="separator-block_67f96a4bc5290f04525fa51bf7bd75d7"
         class="block-separator block-separator--15">
</div>



<p>The LSTM performed similarly to the XGBoost model. To improve the results, you could experiment with different training period lengths or adjust data standardization methods, which often impact neural network performance.</p>
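
<p>One way to experiment with standardization is sketched below. It is a minimal example under stated assumptions, not the setup used for the results above: the series is synthetic, the helper names chronological_split and standardize are hypothetical, and the split is chronological (no shuffling) so that the scaling statistics come only from the oldest block of data.</p>

```python
import numpy as np

def chronological_split(values, train_frac=0.6, val_frac=0.2):
    """Split a series into train/val/test blocks in time order, without shuffling."""
    n = len(values)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return values[:train_end], values[train_end:val_end], values[val_end:]

def standardize(train, *others):
    """Scale every block with the mean and standard deviation of the training block only."""
    mean, std = train.mean(), train.std()
    return [(block - mean) / std for block in (train, *others)]

# Synthetic stand-in for a series of daily returns (assumption for this sketch).
rng = np.random.default_rng(0)
returns = rng.normal(loc=0.0, scale=0.01, size=1000)

train, val, test = chronological_split(returns)
train_s, val_s, test_s = standardize(train, val, test)

print(len(train), len(val), len(test))
print(train_s.mean(), train_s.std())
```

<p>Fitting the mean and standard deviation on the training block alone keeps information from the validation and test periods out of the preprocessing step.</p>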



<h3 class="wp-block-heading" class="wp-block-heading" id="h-selecting-the-best-model">Selecting the best model</h3>



<p>Out of the three models we tested, the ARIMA model (highlighted in the blue box below) showed the best performance based on a three-month training period and a one-day forecast, with the lowest mean squared error value (the row “average”, with MSE 0.423, compared to 0.46 and 1.07 for the other two models).</p>



<div id="app-screenshot-block_11e9485e51eb199ae303d20899e3cec9"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Comparison-view-of-our-three-models-in-Neptune-1.png?fit=1020%2C574&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Comparison-view-of-our-three-models-in-Neptune-1.png?fit=480%2C270&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Comparison-view-of-our-three-models-in-Neptune-1.png?fit=768%2C432&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2022/09/Comparison-view-of-our-three-models-in-Neptune-1.png?fit=1020%2C574&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="574"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://app.neptune.ai/o/community/org/time-series-prediction/runs/compare?viewId=9e0ff5b5-3c9c-46a2-a67c-3e6e2ea87773&#038;dash=leaderboard&#038;compare=GwBgHANATBCMQ&#038;referenceShortId=TIM-611"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in the app							</span>
						
					</a>

				
														<button
						class="js-c-image-full-screen-modal c-button c-button--tertiary c-button--small">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-zoom.svg"
							width="16"
							height="17"
							class="c-button__icon"
							alt="zoom"
						/>

						<span class="c-button__text">
							Full screen preview						</span>
						
					</button>
									
			</div>

		</div>

					<figcaption class="block-app-screenshot__caption">
				Comparison view of our three models in the Neptune UI			</figcaption>
			
</div>



<div id="separator-block_67f96a4bc5290f04525fa51bf7bd75d7"
         class="block-separator block-separator--15">
</div>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-next-steps">Next steps</h3>



<p>To further improve this model, you could experiment with different training period lengths or add more data, such as seasonal indicators (day of the week, month, etc.) or additional predictors like market sentiment. If adding external variables, consider using a SARIMAX model.</p>



<p>Now you have a solid overview of time series model selection, including model types and tools like windowing and time series splits!</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">6302</post-id>	</item>
		<item>
		<title>Time Series Prediction: How Is It Different From Other Machine Learning? [ML Engineer Explains]</title>
		<link>https://neptune.ai/blog/time-series-prediction-vs-machine-learning</link>
		
		<dc:creator><![CDATA[Aayush Bajaj]]></dc:creator>
		<pubDate>Fri, 22 Jul 2022 06:44:43 +0000</pubDate>
				<category><![CDATA[Time Series]]></category>
		<guid isPermaLink="false">https://neptune.test/time-series-prediction-vs-machine-learning/</guid>

					<description><![CDATA[Time-series is kind of a problem that every Data Scientist/ML Engineer will encounter in the span of their careers, more often than they think. So, it&#8217;s an important concept to understand in-out. You see, time-series is a type of data that is sampled based on a time-based dimension like days, months, years, etc. We term&#8230;]]></description>
										<content:encoded><![CDATA[
<p>Time-series problems are something that every <a href="/blog/ml-engineer-vs-data-scientist" target="_blank" rel="noreferrer noopener">Data Scientist/ML Engineer</a> will encounter over the course of their career, more often than they think. So, it&#8217;s an important concept to understand inside out.</p>



<p>You see, a time series is data that is sampled along a time-based dimension like days, months, or years. We call this data “dynamic” because it is indexed by a DateTime attribute, which gives the data an implicit order. Don’t get me wrong, static data can still have an attribute that&#8217;s a DateTime value, but the data will not be sampled or indexed based on that attribute.</p>



<p>When we apply machine learning algorithms to time-series data, we usually want to make predictions for future DateTime values, e.g., predicting total sales for February given data for the previous 5 years, or predicting the weather for a certain day given weather data from several years. These predictions on time-series data are called <strong>forecasting</strong>. This contrasts with what we deal with when working on <strong>static</strong> <strong>data</strong>.</p>



<p>In this blog we’re going to talk about:</p>



<div id="case-study-numbered-list-block_e971629848c2170c5f2f904dcff00497"
         class="block-case-study-numbered-list ">

    
    <h2 id="h-"></h2>

    <ul class="c-list">
                    <li class="c-list__item">
                <span class="c-list__counter">1</span>
                How is time-series prediction, i.e., forecasting, different from static machine learning predictions?            </li>
                    <li class="c-list__item">
                <span class="c-list__counter">2</span>
                Best practices while working on time series forecasting            </li>
            </ul>
</div>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-time-series-data-vs-static-ml">Time-series data vs static ML</h2>



<p>So far we’ve established a baseline on how we should perceive time-series data as compared to static data. In this section, we are going to talk about the difference in approaching both of these types of data.&nbsp;</p>



<p><strong><em>Note</em></strong><em>: For the sake of simplicity we assume data to be continuous in all cases.</em></p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-imputation-of-missing-data">Imputation of missing data</h3>



<p>Imputation of missing data is a key preprocessing step in any tabular machine learning project. With static data, you can use simple imputation, where you fill missing values with the mean, median, or mode of the attribute depending on its nature, or more sophisticated methods like nearest-neighbour imputation, where a KNN algorithm estimates the missing values from similar rows.</p>
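
<p>For reference, both static strategies can be sketched with scikit-learn. This is a toy illustration: the array X and the choice of two neighbours are assumptions made for the example.</p>

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

# Small static table with one missing entry per column (toy data).
X = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [np.nan, 30.0],
    [4.0, 40.0],
])

# Simple imputation: replace each gap with the mean of its column.
mean_filled = SimpleImputer(strategy="mean").fit_transform(X)

# KNN imputation: replace each gap with the average of that feature over
# the k nearest rows, with distances measured on the features that are present.
knn_filled = KNNImputer(n_neighbors=2).fit_transform(X)

print(mean_filled)
print(knn_filled)
```

<p>Neither strategy looks at the order of the rows, which is fine for static data and is exactly what becomes a problem for time series.</p>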



<p>However, in time-series data, missing data looks something like this:&nbsp;</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" fetchpriority="high" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction-How-Is-It-Different-From-Other-Machine-Learning-ML-Engineer-Explains_2.png?resize=721%2C541&#038;ssl=1" alt="Time-series – missing data" class="wp-image-62231" width="721" height="541"/><figcaption class="wp-element-caption"><em>Time-series – missing data | <a href="https://jagan-singhh.medium.com/missing-data-in-time-series-5dcf19b0f40f" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<div id="separator-block_92bd24339704c53cfb56e027c0951d12"
         class="block-separator block-separator--15">
</div>



<p>You have these visible gaps in the data that can’t be logically filled with any of the imputation strategies that can be used on static data. Let’s discuss some techniques that can be useful:</p>



<ul class="wp-block-list">
<li><strong>Why not fill it with the mean?</strong> A static mean doesn’t do us any good here: it is computed over the whole series, so it fills missing values by taking cues from the future. In the plot above, it&#8217;s quite intuitive that the gaps between 2001 and 2003 can logically be filled only with historical data, i.e., pre-2001 data.<br><br>In time-series data, we instead use a rolling mean (also called a moving average or window mean), which is the mean of the values in a predefined window, e.g., a 7-day or a 1-month window. We can use this moving average to fill gaps in our time-series data.<br><br><strong><em>Note</em></strong><em>: Stationarity plays an important role when working with averages in time-series data.</em></li>
</ul>
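
<p>The moving-average fill described above can be sketched with pandas. The series, the 3-day window, and the variable names are assumptions chosen for illustration; the important detail is the shift by one step, which keeps each window strictly in the past.</p>

```python
import numpy as np
import pandas as pd

# Toy daily series with two gaps (illustrative values).
idx = pd.date_range("2024-01-01", periods=8, freq="D")
s = pd.Series([10.0, 12.0, 11.0, np.nan, 13.0, np.nan, 15.0, 14.0], index=idx)

# Mean over a 3-day window, skipping missing values inside the window.
# shift(1) moves the result one step forward so that the value used at
# time t is computed from observations before t only.
past_mean = s.rolling(window=3, min_periods=1).mean().shift(1)

# Keep observed values and use the trailing mean only where data is missing.
filled = s.fillna(past_mean)

print(filled)
```

<p>Because the window only looks backwards, the very first observation can never be filled this way; a gap at the start of the series needs a different strategy.</p>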



<ul class="wp-block-list">
<li><strong>Interpolations are quite popular</strong>: Because time-series data has an implicit order, interpolation is often the go-to method for estimating its missing parts. In brief, interpolation uses the values before and after the missing point to calculate the missing datum. For example, linear interpolation draws a straight line between the two neighboring points and reads the missing value off that line.<br><br>There are many types of interpolation, such as linear, spline, and Stineman. Implementations are available in most major libraries, like the pandas <a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.interpolate.html" target="_blank" rel="noreferrer noopener nofollow">interpolate()</a> function in Python and the <a href="https://cran.r-project.org/web/packages/imputeTS/vignettes/imputeTS-Time-Series-Missing-Value-Imputation-in-R.pdf" target="_blank" rel="noreferrer noopener nofollow">imputeTS package</a> in R.<br><br>Interpolation can be used on static data as well. However, it isn’t widely used there, since more sophisticated imputation techniques are available for static data (some of which are explained above).</li>
</ul>
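
<p>As a small sketch of interpolation in pandas, the example below fills gaps with the interpolate() function mentioned above. The two toy series are assumptions made for illustration. The first call treats all observations as equally spaced, while the second uses the time method, which weights by the actual distance between timestamps.</p>

```python
import numpy as np
import pandas as pd

# Evenly spaced daily series with gaps (toy data).
idx = pd.date_range("2024-01-01", periods=6, freq="D")
s = pd.Series([1.0, np.nan, 3.0, np.nan, np.nan, 9.0], index=idx)

# Linear interpolation: missing points are placed on the straight line
# between the nearest observed neighbours.
linear = s.interpolate(method="linear")

# Irregularly spaced timestamps: the gap sits one day after the first
# observation and three days before the next one.
irregular_idx = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-05"])
irregular = pd.Series([0.0, np.nan, 8.0], index=irregular_idx)

# method="time" takes the real time distances into account.
time_weighted = irregular.interpolate(method="time")

print(linear.tolist())
print(time_weighted.tolist())
```

<p>Unlike the trailing moving average, interpolation uses the observation after the gap as well, so it suits cleaning historical data rather than filling values in a live forecasting setting.</p>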



<ul class="wp-block-list">
<li><strong>Understanding the business use case: </strong>This is not a technical method for dealing with missing data, but I feel it’s the most underrated technique, and it can give results quickly. It involves understanding the problem at hand and then deciding which method would work best. After all, SOTA might not be SOTA on your use case. For example, sales data should be treated differently than, say, stock data, since each comes with a different set of market metrics.<br>By the way, this technique applies to static as well as time-series data.</li>
</ul>



<section id="blog-intext-cta-block_08097ecdd6ddeb7970ec138738f528de" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-see-also">See also</h3>
    
            <p>  <a href="/blog/time-series-forecasting" target="_blank" rel="noopener">Time Series Forecasting: Data, Analysis, and Practice</a></p>
    
    </section>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-feature-engineering-in-time-series-model">Feature engineering in time-series model&nbsp;</h3>



<p>Working with features is another major step that differentiates time-series data from static. Feature engineering is a broad term that encapsulates a variety of standard techniques and ad-hoc methods. Features are handled differently in time-series data as compared to static data.&nbsp;</p>



<p><em><strong>Note:</strong> One might argue that imputation comes under Feature engineering, which is not wrong but I wanted to explain this under a separate section to give you a better idea.</em></p>



<p>For static data, the right techniques depend heavily on the problem at hand, but a few standard ones include feature transformations, scaling, compression, normalization, encoding, etc.&nbsp;&nbsp;</p>



<p>Time-series data can have other attributes apart from time-based features. If those attributes also vary over time, the result is a multivariate time series; if they are static, the result is a univariate time series with static features. Non-time-based features can be handled with the static techniques, as long as this doesn’t compromise the integrity of the data.</p>



<p>Time-based components follow definite patterns that can be extracted using some standard techniques. Let’s look at some of the techniques that prove useful while working with time-based features.&nbsp;</p>



<h4 class="wp-block-heading">Time-series components: what is the main characteristic of time-series data</h4>



<p>For starters, every time-series data has time-series components. We do an STL decomposition (Seasonal and Trend decomposition using Loess) to extract some of these components. Let&#8217;s take a look at what each of these means.</p>



<div id="separator-block_ef33c7335defd019296f9251e9ea9c29"
         class="block-separator block-separator--10">
</div>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction-How-Is-It-Different-From-Other-Machine-Learning-ML-Engineer-Explains_4.png?resize=738%2C487&#038;ssl=1" alt="Example of an STL decomposition" class="wp-image-62229" width="738" height="487"/><figcaption class="wp-element-caption"><em>Example of an STL decomposition | <a href="/blog/anomaly-detection-in-time-series" target="_blank" rel="noreferrer noopener">Source</a></em></figcaption></figure>
</div>


<div id="separator-block_ef33c7335defd019296f9251e9ea9c29"
         class="block-separator block-separator--10">
</div>



<ul class="wp-block-list">
<li><strong>Trend: </strong>A time series shows a trend when its value changes persistently over time: a rising value indicates a positive trend and a falling value a negative one. In the plot above, you can see a positive (increasing) trend.</li>



<li><strong>Seasonality</strong>: Seasonality refers to periodic patterns that repeat at a constant frequency. In the example above, the seasonal component has a period of 12 months, which means that the pattern repeats every twelve months.</li>



<li><strong>Remainder: </strong>After extracting trend and seasonality from the data, what is left is called the remainder (also the error or residual). This component is useful for anomaly detection in time series.</li>



<li><strong>Cycle:</strong> Time-series data is termed cyclical when it shows rises and falls that do not repeat at a fixed frequency, unlike seasonality.</li>



<li><strong>Stationarity:</strong> Time-series data is stationary when its statistical properties do not change over time, i.e., it has a constant mean and standard deviation, and the covariance between two observations depends only on the lag between them, not on time itself.</li>
</ul>



<p>Once extracted, these components usually form the basis of the next steps of feature engineering for time-series data. To put this in the perspective of static data, STL decomposition is the descriptive part of the time-series world. There are also a few time-series-specific features that depend on the type of data, such as <em>dummy variables</em> when working with stock data.</p>



<p>Time-series components are highly important for analyzing the time-series variable of interest in order to understand its behavior, what patterns it has, and to be able to choose and fit an appropriate time-series model.&nbsp;</p>
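<p>To make the components above concrete, here is a minimal sketch of a <em>classical additive decomposition</em>. Note that this is not STL itself: STL estimates the trend with Loess, whereas this sketch swaps in a simple centered moving average to keep the example dependency-free. The function names (<code>centered_moving_average</code>, <code>decompose_additive</code>) are hypothetical and chosen only for illustration.</p>

```python
def centered_moving_average(values, period):
    """Estimate the trend with a centered moving average.

    Positions where the window does not fit are left as None.
    """
    n = len(values)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        if period % 2 == 0:
            # Even period: use period + 1 points with half weight on both ends
            window = values[i - half : i + half + 1]
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(values[i - half : i + half + 1])
        trend[i] = total / period
    return trend


def decompose_additive(values, period):
    """Split a series into trend + seasonal + remainder (additive model)."""
    trend = centered_moving_average(values, period)

    # Remove the trend wherever it is defined
    detrended = [v - t if t is not None else None for v, t in zip(values, trend)]

    # Average the detrended values for each position in the seasonal cycle
    seasonal_means = []
    for pos in range(period):
        bucket = [d for i, d in enumerate(detrended)
                  if d is not None and i % period == pos]
        seasonal_means.append(sum(bucket) / len(bucket))

    # Center the seasonal pattern so that it sums to zero over one cycle
    offset = sum(seasonal_means) / period
    seasonal = [seasonal_means[i % period] - offset for i in range(len(values))]

    # Whatever is left after removing trend and seasonality is the remainder
    remainder = [v - t - s if t is not None else None
                 for v, t, s in zip(values, trend, seasonal)]
    return trend, seasonal, remainder
```

<p>Calling <code>decompose_additive(series, period=12)</code> on monthly data would return three lists of the same length as the input. The sketch assumes the series covers at least a couple of full seasonal cycles. For real projects, a library implementation of STL is likely the better choice, since Loess is more robust to outliers than a plain moving average.</p>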



<section id="blog-intext-cta-block_3c3f3fb6e7ca5b322a36570223a948ab" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-learn-more">Learn more</h3>
    
            <p>  <a href="/blog/select-model-for-time-series-prediction-task" target="_blank" rel="noopener">How to Select a Model For Your Time Series Prediction Task [Guide]</a></p>
    
    </section>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-analysis-and-visualization-in-time-series-models">Analysis and visualization in time-series models</h3>



<h4 class="wp-block-heading">Analysis</h4>



<p>Time-series data analysis comes with a different blueprint than a static data analysis. As discussed in the previous section, time-series analysis starts with answering questions like:</p>



<ul class="wp-block-list">
<li>Does this data have a trend?</li>



<li>Does this data contain any sort of pattern or seasonality?</li>



<li>Is the data stationary or non-stationary?</li>
</ul>



<p>Ideally, you should answer these questions before proceeding further with the analysis. Similarly, static data analysis follows procedures such as <strong>Descriptive</strong>, <strong>Predictive</strong>, and <strong>Prescriptive</strong> analysis. Although Descriptive analysis is standard in all problem statements, Predictive and Prescriptive analyses depend on the problem. These procedures are common to both time-series and static ML. However, many metrics used within them are applied differently, one of which is <strong>Correlation</strong>.</p>



<p>In time-series data, by contrast, we use <strong>Autocorrelation</strong> and <strong>Partial Autocorrelation</strong>. Both measure the association between current and past values of a series and indicate which past values are most useful in predicting future ones.</p>
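<p>As a rough illustration of what autocorrelation measures, here is a minimal sketch of the sample autocorrelation function (ACF) in plain Python. The function name <code>autocorrelation</code> is hypothetical, and the sketch covers only the ACF; partial autocorrelation needs an extra regression step and is better left to a statistics library.</p>

```python
def autocorrelation(values, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag.

    Assumes the series is not constant (otherwise the variance is zero).
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)

    acf = []
    for lag in range(max_lag + 1):
        # Correlate each value with the value `lag` steps earlier
        num = sum(centered[t] * centered[t - lag] for t in range(lag, n))
        acf.append(num / denom)
    return acf
```

<p>For a series with a seasonal period of 12, you would expect <code>autocorrelation(series, 24)</code> to show peaks around lags 12 and 24, which is the kind of pattern ACF plots make visible.</p>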



<div id="separator-block_ef33c7335defd019296f9251e9ea9c29"
         class="block-separator block-separator--10">
</div>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction-How-Is-It-Different-From-Other-Machine-Learning-ML-Engineer-Explains_1.png?resize=775%2C458&#038;ssl=1" alt="An example ACF and PACF plot in time-series" class="wp-image-62232" width="775" height="458"/><figcaption class="wp-element-caption"><em>An example ACF and PACF plot in time-series | <a href="https://stats.stackexchange.com/questions/139796/autocorrelation-and-partial-correlation-plots-in-arma-models" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<div id="separator-block_ef33c7335defd019296f9251e9ea9c29"
         class="block-separator block-separator--10">
</div>



<p>While the approach to analysis differs somewhat between the two kinds of data, the core idea is the same: it depends largely on the problem statement. For example, stock and weather data are both time series, but you might use stock data to predict future values and weather data to study seasonal patterns. Similarly, you can use loan data to analyze borrower patterns or to check whether a new borrower will default on a loan repayment.</p>



<h4 class="wp-block-heading">Visualization</h4>



<p>Visualization is an integral part of any analysis. The differentiating question isn’t what you should visualize but how you should visualize it.</p>



<p>The time-based features of time-series data should be plotted with time on one axis, while the way you visualize non-time-based features depends on the strategy you use to work on the problem.</p>



<div id="separator-block_ef33c7335defd019296f9251e9ea9c29"
         class="block-separator block-separator--10">
</div>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-Series-Prediction-How-Is-It-Different-From-Other-Machine-Learning-ML-Engineer-Explains_3.png?resize=733%2C390&#038;ssl=1" alt="An example visualization of time-series" class="wp-image-62230" width="733" height="390"/><figcaption class="wp-element-caption"><em>An example visualization of time-series | <a href="https://www.stat.pitt.edu/stoffer/tsa4/tsgraphics.htm" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<div id="separator-block_ef33c7335defd019296f9251e9ea9c29"
         class="block-separator block-separator--10">
</div>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-time-series-forecasting-vs-static-ml-predictions">Time-series forecasting vs static ML predictions</h2>



<p>In the previous section, we saw how the two kinds of data differ in the initial steps and in the approaches used. In this section, we’re going to explore the next step, i.e., <strong>prediction</strong> or, in time-series terms, <strong>forecasting</strong>.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-algorithms">Algorithms</h3>



<p>The choice of algorithms for time-series data is quite different from that for static data. An algorithm that can capture the time-series components and extrapolate patterns outside the domain of the training data can be considered a time-series algorithm.</p>



<p>Most static machine learning algorithms, such as linear regression and SVMs, do not have this capability out of the box, as they generalize within the training space for any new prediction. On their own, they can’t exhibit the behavior we discussed above.</p>



<p>Some common algorithms used for time-series forecasting:</p>



<ul class="wp-block-list">
<li><strong>ARIMA:</strong> It stands for Autoregressive Integrated Moving Average. It combines autoregression, differencing, and moving averages to predict future values. Read more about it <a href="https://otexts.com/fpp2/arima.html" target="_blank" rel="noreferrer noopener nofollow">here</a>.</li>



<li><strong>EWMA/Exponential Smoothing:</strong> Exponentially weighted moving average, or exponential smoothing, serves as an upgrade to moving averages. It reduces the lag effect of moving averages by putting more weight on more recent values. Read more about it <a href="https://otexts.com/fpp2/expsmooth.html" target="_blank" rel="noreferrer noopener nofollow">here</a>.</li>



<li><strong>Dynamic Regression Models:</strong> These models also take other external information into account, such as public holidays or changes in law. Read more about them <a href="https://otexts.com/fpp2/dynamic.html" target="_blank" rel="noreferrer noopener nofollow">here</a>.</li>



<li><strong>Prophet</strong>: <a href="https://facebook.github.io/prophet/" target="_blank" rel="noreferrer noopener nofollow">Prophet</a> is an open-source library released by Facebook’s Core Data Science team and designed for automatic forecasting of univariate time series data.</li>



<li><strong>LSTM</strong>: Long Short-Term Memory (LSTM) is a type of recurrent neural network that can learn the order dependence between items in a sequence. It is often used to solve time series forecasting problems.</li>
</ul>
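<p>To show how the “more weight on recent values” idea from the exponential smoothing entry works in practice, here is a minimal sketch of simple exponential smoothing. The function name <code>simple_exponential_smoothing</code> and the choice to initialize the level with the first observation are assumptions made for this example; libraries typically offer several initialization strategies.</p>

```python
def simple_exponential_smoothing(values, alpha):
    """Return the smoothed level after each observation.

    alpha close to 1 reacts quickly to new values; alpha close to 0
    produces a smoother, slower-moving level.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    level = values[0]  # assumption: start the level at the first observation
    smoothed = [level]
    for v in values[1:]:
        # New level = weighted mix of the latest value and the previous level
        level = alpha * v + (1 - alpha) * level
        smoothed.append(level)
    return smoothed
```

<p>With this basic method, the one-step-ahead forecast is simply the last smoothed level, so it produces a flat forecast. Handling trend and seasonality requires extensions such as Holt’s or Holt-Winters’ methods.</p>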



<section id="blog-intext-cta-block_5a2f5cf04896d73d4f4298fe580f510f" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-recommended-for-you">Recommended for you</h3>
    
            <p>  <a href="/blog/arima-vs-prophet-vs-lstm" target="_blank" rel="noopener">ARIMA vs Prophet vs LSTM for Time Series Prediction</a></p>
    
    </section>



<p>This list is certainly not exhaustive. Many complex models or approaches such as <a href="https://www.investopedia.com/terms/g/garch.asp" target="_blank" rel="noreferrer noopener nofollow">Generalized Autoregressive Conditional Heteroskedasticity</a> (GARCH) and <a href="https://en.wikipedia.org/wiki/Bayesian_structural_time_series" target="_blank" rel="noreferrer noopener nofollow">Bayesian structural time-series</a> (BSTS) may be very useful in some cases. There are also neural network models like <a href="https://otexts.com/fpp2/nnetar.html" target="_blank" rel="noreferrer noopener nofollow">Neural Network Autoregression</a> (NNAR), which use lagged values of the series as predictors and can handle additional features.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-evaluation-metrics-in-time-series-models">Evaluation metrics in time-series models</h3>



<p>Forecast evaluation involves scale-dependent errors such as mean squared error (MSE) and root mean squared error (RMSE), percentage errors such as mean absolute percentage error (MAPE), and scaled errors such as mean absolute scaled error (MASE), to mention a few. These metrics are similar to those used in static ML.</p>
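<p>The metrics mentioned above can be sketched in a few lines of plain Python. The function names are hypothetical. MASE is computed here by scaling the forecast error with the in-sample error of a naive forecast (repeat the value from <code>season</code> steps earlier), which is an assumption about the benchmark you want to compare against.</p>

```python
import math


def mse(actual, forecast):
    """Mean squared error (scale-dependent)."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error (same unit as the data)."""
    return math.sqrt(mse(actual, forecast))


def mape(actual, forecast):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual, forecast, train, season=1):
    """Mean absolute scaled error relative to a naive in-sample forecast."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    # In-sample error of the naive forecast "repeat the value from `season` steps ago"
    naive_errors = [abs(train[t] - train[t - season]) for t in range(season, len(train))]
    return mae / (sum(naive_errors) / len(naive_errors))
```

<p>A MASE below 1 suggests the forecast beats the naive benchmark on average, while a value above 1 suggests it does worse. Unlike MAPE, it stays well-defined when the series contains zeros.</p>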



<p>However, while evaluation metrics help determine how close the fitted values are to the actual ones, they do not evaluate whether the model properly fits the time series. For this, we do something called <strong>Residual Diagnostics</strong>. Read about it in detail <a href="https://otexts.com/fpp2/residuals.html" target="_blank" rel="noreferrer noopener nofollow">here</a>.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-dealing-with-outliers-anomalies">Dealing with outliers/anomalies</h3>



<p>Outliers plague almost every real-world dataset. Time-series and static data take two completely different routes, from the identification to the handling of outliers/anomalies.</p>



<h4 class="wp-block-heading">Identification</h4>



<ul class="wp-block-list">
<li>For identification in static data, we use techniques ranging from Z-scores and box plot analysis to more advanced statistical techniques like hypothesis testing.</li>



<li>In time series, we use a range of techniques and algorithms, from STL analysis to algorithms like Isolation Forests. You can read about it in more detail <a href="/blog/anomaly-detection-in-time-series" target="_blank" rel="noreferrer noopener">here</a>.</li>
</ul>



<h4 class="wp-block-heading">Handling</h4>



<ul class="wp-block-list">
<li>We use methods like Trimming, Quantile based flooring and capping, and Mean/Median Imputation in static data depending on the capacity and problem statement at hand.</li>



<li>In time-series data, there are a number of options, and the right one depends heavily on your use case. A few of them are:
<ul class="wp-block-list">
<li><strong>Using replacement</strong>: We can compute a value that replaces the outlier and fits the data better. The tsclean() function in R fits a robust trend using loess (for non-seasonal series), or robust trend and seasonal components using STL (for seasonal series), to compute the replacement value.</li>



<li><strong>Studying the business</strong>: This is not a technical approach but an ad hoc one. Identifying and studying the business context behind the problem can really help in dealing with the outlier. Whether it is wise to drop or replace it will become clear only after studying it inside out.</li>
</ul>
</li>
</ul>
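<p>The replacement idea above can be sketched in Python as well. To be clear, this is not a port of tsclean(): instead of a Loess or STL fit, the sketch swaps in a much simpler technique, flagging points that sit far from a rolling median and replacing them by linear interpolation between their nearest unflagged neighbors. The function name <code>replace_outliers</code>, the default window of 5, and the threshold of 3 scaled median absolute deviations are all assumptions for illustration.</p>

```python
import statistics


def replace_outliers(values, window=5, threshold=3.0):
    """Flag points far from a rolling median and interpolate over them.

    Returns (cleaned_values, flagged), where flagged[i] is True for outliers.
    """
    n = len(values)
    half = window // 2

    # Residual of each point relative to the median of its neighborhood
    residuals = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        residuals.append(values[i] - statistics.median(values[lo:hi]))

    # Robust spread estimate: scaled median absolute deviation (MAD)
    scale = 1.4826 * statistics.median([abs(r) for r in residuals])
    if scale == 0:
        # Spread cannot be estimated; leave the series untouched
        return list(values), [False] * n

    flagged = [abs(r) > threshold * scale for r in residuals]
    good = [i for i in range(n) if not flagged[i]]

    cleaned = list(values)
    for i in range(n):
        if not flagged[i]:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None and right is None:
            continue  # nothing reliable to interpolate from
        if left is None:
            cleaned[i] = values[right]
        elif right is None:
            cleaned[i] = values[left]
        else:
            # Linear interpolation between the nearest unflagged neighbors
            weight = (i - left) / (right - left)
            cleaned[i] = values[left] + weight * (values[right] - values[left])
    return cleaned, flagged
```

<p>A rolling median ignores seasonality, so for strongly seasonal data it would probably be better to run the check on the remainder of a decomposition rather than on the raw series. And as the second point above says, whether a flagged value should be replaced at all is a question for the business context.</p>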



<section id="blog-intext-cta-block_723ef273d127a58aaf3af129f8609942" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-check-also">Check also</h3>
    
            <p>  <a href="/blog/anomaly-detection-in-time-series" target="_blank" rel="noopener">Anomaly Detection in Time Series</a></p>
    
    </section>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-best-practices-while-working-on-time-series-data-and-forecasting">Best practices while working on time-series data and forecasting</h2>



<p>Although there are no fixed steps to be followed while working on time-series and forecasting, there are still some good practices one can employ to get optimal results.&nbsp;</p>



<ol class="wp-block-list">
<li><strong>No One-size-fits-all:</strong> No forecasting method performs best for all time series. You need to understand the problem statement, type of features, and goals before starting to work on forecasting. Depending on your needs (compute and goals), you can select algorithms from several families:
<ul class="wp-block-list">
<li>statistical models,</li>



<li>machine learning, </li>



<li>and hybrid methods.<br></li>
</ul>
</li>



<li><strong>Feature selection</strong>: The selection of features has an impact on the resulting forecast error, so it has to be done carefully. There are different methods, such as <em>correlation analysis</em> (also known as the <em>filter</em> method), <em>wrapper</em> (i.e., adding or removing features iteratively), and <em>embedded</em> (i.e., the selection is already part of the forecasting method).<br></li>



<li><strong>Countering Overfitting:</strong> During the training of the model, there is a <a href="/blog/overfitting-vs-underfitting-in-machine-learning" target="_blank" rel="noreferrer noopener">risk of overfitting</a>, as the model that best fits the training data does not always produce the best forecast. To counteract overfitting, the historical data can be split into train and test data, and internal validations can be conducted.<br></li>



<li><a href="/blog/data-preprocessing-guide" target="_blank" rel="noreferrer noopener"><strong>Data preprocessing</strong></a>: Data should first be analyzed and preprocessed to make it clean for forecasting. Data can contain missing values, and as most forecasting methods can’t handle them, these values have to be imputed.<br></li>



<li><strong>Keep the Curse of Dimensionality in mind</strong>: When models in training are presented with a lot of dimensions and a lot of potential factors, they can encounter the <em>Curse of Dimensionality</em>, which says that as we have a finite amount of training data and we add more dimensions to that data, we start having diminishing returns in terms of accuracy.<br></li>



<li><strong>Working with Seasonal Data Patterns</strong>: If there is seasonality in time series data, multiple cycles that include that seasonal pattern are required to make a proper forecast. Otherwise, there is no way for the model to learn the pattern.<br></li>



<li><strong>Deal with Anomalies before moving to Forecast</strong>: Anomalies can heavily bias model learning, and more often than not the results will be subpar.<br></li>



<li><strong>Studying the problem statement carefully</strong>: This is probably the most underrated practice especially when you’re just starting to work on a time-series problem. Identify your time-based and non-time-based features, study the data first before moving to any standard techniques.&nbsp;</li>
</ol>
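<p>The train/test split and internal validation mentioned under overfitting need some care with time series: the split should respect time order so that the model is never evaluated on data older than what it was trained on. Here is a minimal sketch of expanding-window (walk-forward) splits. The function name <code>expanding_window_splits</code> and its parameters are hypothetical; the idea is similar in spirit to scikit-learn’s <code>TimeSeriesSplit</code>.</p>

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Return (train_indices, test_indices) pairs in chronological order.

    Each split trains on everything before its test block, and the
    test blocks are consecutive chunks at the end of the series.
    """
    splits = []
    for k in range(n_splits):
        test_end = n_samples - (n_splits - 1 - k) * test_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("not enough samples for the requested splits")
        splits.append((range(0, test_start), range(test_start, test_end)))
    return splits
```

<p>For example, <code>expanding_window_splits(10, 3, 2)</code> yields three splits whose test blocks are the index ranges 4–5, 6–7, and 8–9, each trained on all earlier points. Averaging the forecast error across these splits gives a more honest estimate than a single random split would.</p>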



<h2 class="wp-block-heading" class="wp-block-heading" id="h-youve-reached-the-end">You’ve reached the end!</h2>



<p>We’ve now covered the differences in structure and approach between time-series and static data. The sections listed in this blog are by no means exhaustive; more differences appear as we move to finer granularity in specific data problems. Here are some of my favorite resources you can refer to while studying time series:</p>



<ul class="wp-block-list">
<li><a href="https://otexts.com/fpp2/" target="_blank" rel="noreferrer noopener nofollow">Hyndman, R.J., &amp; Athanasopoulos, G. (2018) Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia.</a>&nbsp;</li>



<li><a href="https://www.kaggle.com/learn/time-series" target="_blank" rel="noreferrer noopener nofollow">https://www.kaggle.com/learn/time-series</a></li>
</ul>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-references">References</h3>



<ol class="wp-block-list">
<li><a href="https://cran.r-project.org/web/packages/imputeTS/vignettes/imputeTS-Time-Series-Missing-Value-Imputation-in-R.pdf" target="_blank" rel="noreferrer noopener nofollow">https://cran.r-project.org/web/packages/imputeTS/vignettes/imputeTS-Time-Series-Missing-Value-Imputation-in-R.pdf</a></li>



<li><a href="/blog/anomaly-detection-in-time-series" target="_blank" rel="noreferrer noopener">https://neptune.ai/blog/anomaly-detection-in-time-series</a></li>



<li><a href="https://machinelearningmastery.com/resample-interpolate-time-series-data-python/" target="_blank" rel="noreferrer noopener nofollow">https://machinelearningmastery.com/resample-interpolate-time-series-data-python/</a></li>



<li><a href="https://otexts.com/fpp2/missing-outliers.html" target="_blank" rel="noreferrer noopener nofollow">https://otexts.com/fpp2/missing-outliers.html</a></li>



<li><a href="https://otexts.com/fpp2/stl.html" target="_blank" rel="noreferrer noopener nofollow">https://otexts.com/fpp2/stl.html</a></li>



<li><a href="https://otexts.com/fpp2/arima.html" target="_blank" rel="noreferrer noopener nofollow">https://otexts.com/fpp2/arima.html</a></li>



<li><a href="https://otexts.com/fpp2/expsmooth.html" target="_blank" rel="noreferrer noopener nofollow">https://otexts.com/fpp2/expsmooth.html</a></li>



<li><a href="https://www.advancinganalytics.co.uk/blog/2021/06/22/10-incredibly-useful-time-series-forecasting-algorithms" target="_blank" rel="noreferrer noopener nofollow">https://www.advancinganalytics.co.uk/blog/2021/06/22/10-incredibly-useful-time-series-forecasting-algorithms</a></li>



<li><a href="https://www.analyticsvidhya.com/blog/2021/05/detecting-and-treating-outliers-treating-the-odd-one-out/" target="_blank" rel="noreferrer noopener nofollow">https://www.analyticsvidhya.com/blog/2021/05/detecting-and-treating-outliers-treating-the-odd-one-out/</a></li>



<li><a href="https://www.rdocumentation.org/packages/forecast/versions/8.3/topics/tsclean" target="_blank" rel="noreferrer noopener nofollow">https://www.rdocumentation.org/packages/forecast/versions/8.3/topics/tsclean</a></li>



<li><a href="https://www.researchgate.net/publication/332079043_Best_Practices_for_Time_Series_Forecasting_Tutorial_Paper" target="_blank" rel="noreferrer noopener nofollow">https://www.researchgate.net/publication/332079043_Best_Practices_for_Time_Series_Forecasting_Tutorial_Paper</a></li>



<li><a href="https://www.anodot.com/blog/time-series-forecasting/" target="_blank" rel="noreferrer noopener nofollow">https://www.anodot.com/blog/time-series-forecasting/</a></li>
</ol>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">6586</post-id>	</item>
		<item>
		<title>Machine Learning for Stock Price Prediction</title>
		<link>https://neptune.ai/blog/predicting-stock-prices-using-machine-learning</link>
		
		<dc:creator><![CDATA[Katherine (Yi) Li]]></dc:creator>
		<pubDate>Fri, 22 Jul 2022 06:15:04 +0000</pubDate>
				<category><![CDATA[ML Model Development]]></category>
		<category><![CDATA[Time Series]]></category>
		<guid isPermaLink="false">https://neptune.test/predicting-stock-prices-using-machine-learning/</guid>

					<description><![CDATA[The stock market is known for being volatile, dynamic, and nonlinear. Accurate stock price prediction is extremely challenging because of multiple (macro and micro) factors, such as politics, global economic conditions, unexpected events, a company’s financial performance, and so on.&#160; But all of this also means that there’s a lot of data to find patterns&#8230;]]></description>
										<content:encoded><![CDATA[
<p>The stock market is known for being volatile, dynamic, and nonlinear. Accurate stock price prediction is extremely challenging because of multiple (macro and micro) factors, such as politics, global economic conditions, unexpected events, a company’s financial performance, and so on.&nbsp;</p>



<p>But all of this also means that there’s a lot of data to find patterns in. So, financial analysts, researchers, and data scientists keep exploring analytics techniques to detect stock market trends. This gave rise to the concept of <a href="https://en.wikipedia.org/wiki/Algorithmic_trading" target="_blank" rel="noreferrer noopener nofollow">algorithmic trading</a>, which uses automated, pre-programmed trading strategies to execute orders.</p>



<p>In this article, we’ll be using both traditional quantitative finance methodology and machine learning algorithms to predict stock movements. We’ll go through the following topics:</p>



<ul class="wp-block-list">
<li>Stock analysis: fundamental vs. technical analysis&nbsp;</li>



<li>Stock prices as time-series data and related concepts</li>



<li>Predicting stock prices with Moving Average techniques</li>



<li>Introduction to LSTMs&nbsp;</li>



<li>Predicting stock prices with an LSTM model</li>



<li>Final thoughts on new methodologies, such as ESN</li>
</ul>



<p><em>Disclaimer: this project/article is not intended to provide financial, trading, and investment advice. No warranties are made regarding the accuracy of the models. Audiences should conduct their due diligence before making any investment decisions using the methods or code presented in this article.</em></p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-stock-analysis-fundamental-analysis-vs-technical-analysis">Stock analysis: fundamental analysis vs. technical analysis</h2>



<p>When it comes to stocks, fundamental and technical analyses are at opposite ends of the market analysis spectrum.</p>



<ul class="wp-block-list">
<li>Fundamental analysis (you can read more about it <a href="https://www.investopedia.com/terms/f/fundamentalanalysis.asp#:~:text=Fundamental%20analysis%20(FA)%20is%20a,related%20economic%20and%20financial%20factors.&amp;text=The%20end%20goal%20is%20to,security%20is%20undervalued%20or%20overvalued." target="_blank" rel="noreferrer noopener nofollow">here</a>):
<ul class="wp-block-list">
<li>Evaluates a company’s stock by examining its intrinsic value, including but not limited to tangible assets, financial statements, management effectiveness, strategic initiatives, and consumer behaviors; essentially all the basics of a company.</li>



<li>As a relevant indicator for long-term investment, fundamental analysis relies on both historical and present data to measure revenues, assets, costs, liabilities, and so on.</li>



<li>Generally speaking, the results from fundamental analysis don’t change with short-term news.&nbsp;</li>
</ul>
</li>



<li>Technical analysis (you can read more about it <a href="https://www.investopedia.com/terms/t/technicalanalysis.asp" target="_blank" rel="noreferrer noopener nofollow">here</a>):
<ul class="wp-block-list">
<li>Analyzes measurable data from stock market activities, such as stock prices, historical returns, and volume of historical trades; i.e. quantitative information that could identify trading signals and capture the movement patterns of the stock market.&nbsp;</li>



<li>Technical analysis focuses on historical data and current data just like fundamental analysis, but it’s mainly used for short-term trading purposes.</li>



<li>Due to its short-term nature, technical analysis results are easily influenced by news.</li>



<li>Popular technical analysis methodologies include moving average (MA), <a href="https://www.investopedia.com/trading/support-and-resistance-basics/" target="_blank" rel="noreferrer noopener nofollow">support and resistance levels</a>, as well as <a href="https://www.investopedia.com/terms/t/trendline.asp" target="_blank" rel="noreferrer noopener nofollow">trend lines and channels</a>.&nbsp;</li>
</ul>
</li>
</ul>



<p>For our exercise, we’ll be looking at technical analysis solely and focusing on the Simple MA and Exponential MA techniques to predict stock prices. Additionally, we’ll utilize LSTM (Long Short-Term Memory), a deep learning architecture for sequence data such as time series, to build a predictive model and compare its performance against our technical analysis.</p>



<p>As stated in the disclaimer, stock trading strategy is not in the scope of this article. I’ll be using trading/investment terms only to help you better understand the analysis, but this is not financial advice. We’ll be using terms like:</p>



<ul class="wp-block-list">
<li><a href="https://www.investopedia.com/articles/active-trading/011815/top-technical-indicators-rookie-traders.asp" target="_blank" rel="noreferrer noopener nofollow">trend indicators</a>: statistics that represent the trend of stock prices,</li>



<li><a href="https://www.investopedia.com/ask/answers/122414/what-are-most-common-periods-used-creating-moving-average-ma-lines.asp" target="_blank" rel="noreferrer noopener nofollow">medium-term movements</a>: the 50-day movement trend of stock prices.</li>
</ul>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-stock-prices-as-time-series-data">Stock prices as time-series data</h2>



<p>Despite the volatility, stock prices aren’t just randomly generated numbers. So, they can be analyzed as a sequence of <a href="https://en.wikipedia.org/wiki/Discrete_time_and_continuous_time" target="_blank" rel="noreferrer noopener nofollow">discrete-time</a> data; in other words, time-series observations taken at successive points in time (usually on a daily basis). <a href="/blog/time-series-prediction-vs-machine-learning" target="_blank" rel="noreferrer noopener">Time series forecasting</a> (predicting future values based on historical values) applies well to stock forecasting.</p>



<p>Because of the sequential nature of time-series data, we need a way to aggregate this sequence of information. From all the potential techniques, the most intuitive one is MA with the ability to smooth out short-term fluctuations. We’ll discuss more details in the next section.</p>


    <a
        href="/blog/select-model-for-time-series-prediction-task"
        id="cta-box-related-link-block_b7232f24ec5af658278bcc1c1a55b2a2"
        class="block-cta-box-related-link  l-margin__top--large l-margin__bottom--x-large"
        target="_blank" rel="nofollow noopener noreferrer"    >

    
    <div class="block-cta-box-related-link__description-wrapper block-cta-box-related-link__description-wrapper--full">

        
            <div class="c-eyebrow">

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-related--article.svg"
                    loading="lazy"
                    decoding="async"
                    width="16"
                    height="16"
                    alt=""
                    class="c-eyebrow__icon">

                <div class="c-eyebrow__text">
                    Related post                </div>
            </div>

        
                    <h3 class="c-header" class="c-header" id="h-how-to-select-a-model-for-your-time-series-prediction-task-guide">                How to Select a Model For Your Time Series Prediction Task [Guide]            </h3>        
                    <div class="c-button c-button--tertiary c-button--small">

                <span class="c-button__text">
                    Read more                </span>

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-button-arrow-right.svg"
                    loading="lazy"
                    decoding="async"
                    width="12"
                    height="12"
                    alt=""
                    class="c-button__arrow">

            </div>
            </div>

    </a>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-dataset-analysis">Dataset analysis</h2>



<p>For this demonstration exercise, we’ll use the closing prices of Apple’s stock (ticker symbol AAPL) from the past 21 years (1999-11-01 to 2021-07-09). Analysis data will be loaded from <a href="https://www.alphavantage.co/documentation/" target="_blank" rel="noreferrer noopener nofollow">Alpha Vantage</a>, which offers a free API for historical and real-time stock market data.&nbsp;</p>



<p>To get data from Alpha Vantage, you need a free API key; a walk-through tutorial can be found&nbsp;<a href="https://www.alphavantage.co/" target="_blank" rel="noreferrer noopener nofollow">here</a>. Don&#8217;t want to create an API key? No worries, the analysis data is available&nbsp;<a href="https://app.neptune.ai/o/showcase/org/StockPrediction/metadata?path=&amp;attribute=data" target="_blank" rel="noreferrer noopener nofollow">here</a>&nbsp;as well. If you feel like exploring other stocks, code to download the data is accessible in this GitHub repo as well. Once you have the API key, all you need is the ticker symbol for the particular stock.</p>



<p>For model training, we’ll use the oldest 80% of the data, and save the most recent 20% as the hold-out testing set.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code># %% Train-Test split for time-series
import pandas as pd

# Load the daily prices; rows are assumed to be sorted from oldest to newest
stockprices = pd.read_csv("stock_market_data-AAPL.csv", index_col="Date")

test_ratio = 0.2
training_ratio = 1 - test_ratio

train_size = int(training_ratio * len(stockprices))
test_size = int(test_ratio * len(stockprices))
print(f"train_size: {train_size}")
print(f"test_size: {test_size}")

# Chronological split: the oldest rows train the model, the most recent rows test it
train = stockprices[:train_size][["Close"]]
test = stockprices[train_size:][["Close"]]</code></pre>
</div>




<h2 class="wp-block-heading" class="wp-block-heading" id="h-creating-a-neptune-ai-project">Creating a neptune.ai project</h2>



<p>With regard to model training and performance comparison, neptune.ai makes it convenient for users to track everything model-related, including hyper-parameter specification and evaluation plots. </p>



<section
	id="i-box-block_b53db938adc91ec6f64e55f336beba03"
	class="block-i-box  l-margin__top--large l-margin__bottom--x-large">

			<header class="c-header">
			<img
				src="https://neptune.ai/wp-content/themes/neptune/img/image-ratio-holder.svg"
				data-src="https://neptune.ai/wp-content/themes/neptune/img/blocks/i-box/header-icon.svg"
				width="24"
				height="24"
				class="c-header__icon lazyload"
				alt="">

			
            <h2 class="c-header__text animation " style='max-width: 100%;'   >
                <strong>Disclaimer</strong>
            </h2>		</header>
	
	<div class="block-i-box__inner">
		

<div
    id="custom-text-block_40ce8db3d4f1109cf23fa85dc5afda39"
    class="block-custom-text  white l-padding__top--0 l-padding__bottom--0"
    style="max-width: 100%; font-size: 1rem; line-height: 1.33; font-weight: 600;"
    >
    
    neptune.ai is NOT a stock prediction software.<br />
<br />
It is an experiment tracker for ML/AI teams that struggle with debugging and reproducing experiments, sharing results, and messy model handover.
    </div>





	</div>

</section>



<p>Now, let’s <a href="https://docs-legacy.neptune.ai/setup/creating_project/" target="_blank" rel="noreferrer noopener">create a project in Neptune</a> for this particular exercise and name it “<strong>StockPrediction</strong>”.</p>



<section
	id="i-box-block_b53db938adc91ec6f64e55f336beba03"
	class="block-i-box  l-margin__top--large l-margin__bottom--x-large">

			<header class="c-header">
			<img
				src="https://neptune.ai/wp-content/themes/neptune/img/image-ratio-holder.svg"
				data-src="https://neptune.ai/wp-content/themes/neptune/img/blocks/i-box/header-icon.svg"
				width="24"
				height="24"
				class="c-header__icon lazyload"
				alt="">

			
            <h2 class="c-header__text animation " style='max-width: 100%;'   >
                <strong>Disclaimer</strong>
            </h2>		</header>
	
	<div class="block-i-box__inner">
		

<p>Please note that this article references a <strong>deprecated version of Neptune</strong>.</p>



<p>For information on the latest version with improved features and functionality, please <a href="/" target="_blank" rel="noreferrer noopener">visit our website</a>.</p>


	</div>

</section>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-evaluation-metrics-and-helper-functions">Evaluation metrics and helper functions&nbsp;</h2>



<p>Since stock price prediction is essentially a regression problem, we’ll use the <a href="https://en.wikipedia.org/wiki/Root-mean-square_deviation">RMSE (Root Mean Squared Error)</a> and <a href="https://www.statisticshowto.com/mean-absolute-percentage-error-mape/" target="_blank" rel="noreferrer noopener nofollow">MAPE (Mean Absolute Percentage Error %)</a> as our model evaluation metrics. Both are useful measures of forecast accuracy.&nbsp;</p>


<div class="wp-block-image">
<figure class="aligncenter"><img decoding="async" src="https://lh4.googleusercontent.com/SEPlzou2lnpmy1T58g8ndsJcmy2Dpp3TQt4he_VH61wnW3gXe_oN7ayaylKpFGdrP7mTQBA0IWDZLjdDJD6h8GslLv8dOYp8V0oIRqixgow1qDNmEhMCKzIJ945Y0XrWuQXta31G" alt=""/></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter"><img decoding="async" src="https://lh5.googleusercontent.com/exhZtdP7rguUdERu0Mte-RbcIUIB1EFwYrtMyXW9TOQj-cR8v0zk-JwOz4ba4tmYhVTpPHDgVW-TkoNIT_jQOYp7byX7DLwVFoPeo2dGYWDIAO7MvUdnE519z6SfvbogM6j5ppRt" alt=""/></figure>
</div>


<p><em>where N = the number of time points, A<sub>t</sub> = the actual (true) stock price, and F<sub>t</sub> = the predicted (forecast) value</em>.</p>
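<p>For readers who cannot see the formula images, the two metrics can be written out with the same symbols as follows. This is a restatement that assumes the images show the usual textbook definitions; it matches what the helper functions later in this section compute.</p>

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(A_t - F_t\right)^2},
\qquad
\mathrm{MAPE} = \frac{100\%}{N}\sum_{t=1}^{N}\left|\frac{A_t - F_t}{A_t}\right|
```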



<p>RMSE measures the typical size of the difference between predicted and true values, in the same units as the prices (dollars here), and penalizes large errors more heavily. MAPE (%) measures this difference relative to the true values. For example, a MAPE value of 12% indicates that, on average, the predicted stock price is off by 12% of the actual stock price.</p>



<p>Next, let’s create several helper functions for the current exercise.&nbsp;</p>



<ul class="wp-block-list">
<li>Split the stock prices data into training sequence X and the next output value Y,&nbsp;</li>
</ul>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code>import numpy as np


## Split the time-series data into training seq X and output value Y
def extract_seqX_outcomeY(data, N, offset):
    """
    Split time-series into training sequence X and outcome value Y
    Args:
        data - dataset
        N - window size, e.g., 50 for 50 days of historical stock prices
        offset - position to start the split (should be at least N,
                 so that every window holds N points)
    """
    X, y = [], []

    for i in range(offset, len(data)):
        X.append(data[i - N : i])  # the N values preceding position i
        y.append(data[i])  # the value to predict

    return np.array(X), np.array(y)
</code></pre>
</div>
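<p>To make the windowing concrete, here is a small sketch on made-up prices. The function <code>make_windows</code> and the toy values are hypothetical; the function simply repeats the logic of the helper above so that the example runs on its own.</p>

```python
import numpy as np


def make_windows(series, window, offset):
    """Repeat of the windowing helper's logic so this sketch is self-contained."""
    X, y = [], []
    for i in range(offset, len(series)):
        X.append(series[i - window : i])  # the `window` values before position i
        y.append(series[i])               # the value that follows them
    return np.array(X), np.array(y)


toy_prices = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])  # made-up values
X_toy, y_toy = make_windows(toy_prices, window=3, offset=3)

print(X_toy)  # each row holds 3 consecutive prices
print(y_toy)  # the price that follows each row
```

<p>With six prices and a window of three, this yields three training rows, each paired with the next price in the series.</p>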




<ul class="wp-block-list">
<li>Calculate the RMSE and MAPE (%),&nbsp;</li>
</ul>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code>#### Calculate the metrics RMSE and MAPE ####
def calculate_rmse(y_true, y_pred):
    """
    Calculate the Root Mean Squared Error (RMSE)
    """
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse


def calculate_mape(y_true, y_pred):
    """
    Calculate the Mean Absolute Percentage Error (MAPE) %
    """
    y_pred, y_true = np.array(y_pred), np.array(y_true)
    mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100
    return mape</code></pre>
</div>
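<p>As a quick sanity check of the two metrics, the sketch below works through them on two made-up price points. The variable names and values are illustrative only and are not part of the AAPL analysis.</p>

```python
import numpy as np

actual = np.array([100.0, 200.0])    # made-up true prices
forecast = np.array([110.0, 180.0])  # made-up predictions

errors = actual - forecast                         # [-10, 20]
rmse_toy = np.sqrt(np.mean(errors ** 2))           # sqrt((100 + 400) / 2)
mape_toy = np.mean(np.abs(errors / actual)) * 100  # mean of 10% and 10%

print(f"RMSE: {rmse_toy:.2f}")
print(f"MAPE: {mape_toy:.2f}%")
```

<p>Note how the RMSE is dominated by the larger dollar miss, while the MAPE treats both misses equally because each one is 10% of its true price.</p>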




<ul class="wp-block-list">
<li>Calculate the evaluation metrics for technical analysis and log to Neptune,</li>
</ul>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code>def calculate_perf_metrics(var):
    ### RMSE
    rmse = calculate_rmse(
        np.array(stockprices[train_size:]["Close"]),
        np.array(stockprices[train_size:][var]),
    )
    ### MAPE
    mape = calculate_mape(
        np.array(stockprices[train_size:]["Close"]),
        np.array(stockprices[train_size:][var]),
    )

    ## Log to Neptune
    run["RMSE"] = rmse
    run["MAPE (%)"] = mape

    return rmse, mape</code></pre>
</div>




<ul class="wp-block-list">
<li>Plot the trend of the stock prices and log the plot to Neptune,</li>
</ul>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code>def plot_stock_trend(var, cur_title, stockprices=stockprices):
    ax = stockprices[["Close", var, "200day"]].plot(figsize=(20, 10))
    plt.grid(False)
    plt.title(cur_title)
    plt.axis("tight")
    plt.ylabel("Stock Price ($)")

    ## Log to Neptune
    run["Plot of Stock Predictions"].upload(
        neptune.types.File.as_image(ax.get_figure())
    )</code></pre>
</div>




<h2 class="wp-block-heading" class="wp-block-heading" id="h-predicting-stock-price-with-moving-average-ma-technique">Predicting stock price with Moving Average (MA) technique</h2>



<p>MA is a popular method to smooth out random movements in the stock market. Similar to a sliding window, an MA is an average that moves along the time scale/periods; older data points get dropped as newer data points are added.&nbsp;</p>



<p>Commonly used periods are <a href="https://www.investopedia.com/articles/active-trading/052014/how-use-moving-average-buy-stocks.asp" target="_blank" rel="noreferrer noopener nofollow">20-day, 50-day, and 200-day MA</a> for short-term, medium-term, and long-term investment, respectively.&nbsp;</p>



<p>Two types of MA are most preferred by financial analysts: Simple MA and Exponential MA.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-simple-ma">Simple MA</h3>



<p>SMA, short for Simple Moving Average, is the unweighted mean of the stock (closing) prices over a specified number of the most recent periods. The formula for SMA is:</p>



<p><img loading="lazy" decoding="async" src="https://lh6.googleusercontent.com/9FotEYAEbCRHcf6BaDkf7A59jv6YE-TNOl0fWX6jTXJ0HHEk0VwLkHy3ziZN3995c0rg6tLuQMNtXmAIly0WO_7q7vNj_K3QlqO7cY2PZUtEvpFFKEL7BV6-W1cqOlSJriVe9Gpz" width="223" height="38">, <em>where P<sub>n</sub> = the stock price at time point n, and N = the number of time points.</em></p>



<p>For this exercise of building an SMA model, we’ll use the Python code below to compute the 50-day SMA. We’ll also add a 200-day SMA for good measure.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code>window_size = 50

# Initialize a Neptune run
run = neptune.init_run(
    project=myProject,
    name="SMA",
    description="stock-prediction-machine-learning",
    tags=["stockprediction", "MA_Simple", "neptune"],
)

window_var = f"{window_size}day"

stockprices[window_var] = stockprices["Close"].rolling(window_size).mean()

### Include a 200-day SMA for reference
stockprices["200day"] = stockprices["Close"].rolling(200).mean()

### Plot and performance metrics for SMA model
plot_stock_trend(var=window_var, cur_title="Simple Moving Averages")
rmse_sma, mape_sma = calculate_perf_metrics(var=window_var)

### Stop the run
run.stop()</code></pre>
</div>




<p>In our Neptune run, we’ll see the performance metrics on the testing set: RMSE = 43.77 and MAPE = 12.53%. In addition, the trend chart shows the 50-day and 200-day SMA predictions compared with the true stock closing values.</p>



<div id="app-screenshot-block_0203329d508e25125e120f7a80db4e16"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image.png?fit=1020%2C390&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image.png?fit=480%2C183&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image.png?fit=768%2C293&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image.png?fit=1020%2C390&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="390"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://app.neptune.ai/o/showcase/org/StockPrediction/runs/details?viewId=99828b80-2046-43fe-a942-c36ab4c725d5&#038;detailsTab=metadata&#038;shortId=STOC-12&#038;type=run&#038;path=&#038;attribute=Plot%20of%20Stock%20Predictions"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in app							</span>
						
					</a>

				
														<button
						class="js-c-image-full-screen-modal c-button c-button--tertiary c-button--small">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-zoom.svg"
							width="16"
							height="17"
							class="c-button__icon"
							alt="zoom"
						/>

						<span class="c-button__text">
							Full screen preview						</span>
						
					</button>
									
			</div>

		</div>

			
</div>



<div id="separator-block_d7c5d01608ee214ada6636eca496f1dc"
         class="block-separator block-separator--15">
</div>






<p>It’s not surprising to see that the 50-day SMA is a better trend indicator than the 200-day SMA for short-to-medium-term movements. Both indicators, nonetheless, seem to under-predict the actual values.</p>
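<p>One plausible reason for this gap is that a trailing average lags behind a rising series, and the lag grows with the window size. The sketch below illustrates the effect on a hypothetical series that rises by exactly 1 per day; it is an illustration of the mechanism, not an analysis of the AAPL data.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical price series that rises by exactly 1 per day
toy = pd.Series(np.arange(300, dtype=float))

sma_50 = toy.rolling(50).mean()
sma_200 = toy.rolling(200).mean()

# Gap between the latest price and each trailing average on the last day
gap_50 = toy.iloc[-1] - sma_50.iloc[-1]
gap_200 = toy.iloc[-1] - sma_200.iloc[-1]

print(gap_50, gap_200)
```

<p>On a steady linear trend, an N-day SMA trails the latest price by (N - 1) / 2 days' worth of price change, so the 200-day average sits much further below the price than the 50-day one.</p>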



<h3 class="wp-block-heading" class="wp-block-heading" id="h-exponential-ma">Exponential MA</h3>



<p>Different from SMA, which assigns equal weights to all historical data points, EMA, short for Exponential Moving Average, applies higher weights to recent prices, i.e., tail data points of the 50-day MA in our example. The magnitude of the weighting factor depends on the number of time periods. The formula to calculate EMA is:</p>



<p><img loading="lazy" decoding="async" src="https://lh6.googleusercontent.com/I1AsNkNRJ7tAQBR4-k_fcxtRzpK16grJz_ae2dqOvC4Zj6gNSpq3lcZsIsmVASneZamfP7j9xUOIeNB8YKIjHk_wrzdLNLgI1oCMhXvewgaHXT_auf1pYqHN2mpYadd6xlflb1i_" width="278" height="19">,&nbsp;</p>



<p><em>where P<sub>t</sub> = the price at time point t, EMA<sub>t-1</sub> = the EMA at time point t-1, N = the number of time points in the EMA, and the weighting factor k = 2/(N+1).</em></p>
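<p>The recursion can be checked by hand on a few made-up prices. The sketch below assumes the formula takes the common recursive form (the current price times k, plus the previous EMA times 1 - k) and seeds the recursion with the first price; both are assumptions of this example. It then compares the result with pandas’ <code>ewm(span=N, adjust=False)</code>, which uses the same weighting factor.</p>

```python
import numpy as np
import pandas as pd

toy_prices = pd.Series([10.0, 11.0, 13.0, 12.0, 15.0, 14.0])  # made-up values
N = 3
k = 2 / (N + 1)  # weighting factor

# Manual recursion, seeded with the first price (an assumption of this sketch)
ema_manual = [toy_prices.iloc[0]]
for price in toy_prices.iloc[1:]:
    ema_manual.append(price * k + ema_manual[-1] * (1 - k))

# pandas equivalent: span=N gives the same k, adjust=False uses the recursion
ema_pandas = toy_prices.ewm(span=N, adjust=False).mean()

print(np.allclose(ema_manual, ema_pandas))
```

<p>A larger N gives a smaller k, so each new price moves the average less and the EMA reacts more slowly.</p>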



<p>One advantage of the EMA over SMA is that EMA is more responsive to price changes, which makes it useful for short-term trading. Here’s a Python implementation of EMA:</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code># Initialize a Neptune run
run = neptune.init_run(
    project=myProject,
    name="EMA",
    description="stock-prediction-machine-learning",
    tags=["stockprediction", "MA_Exponential", "neptune"],
)

###### Exponential MA
window_ema_var = f"{window_var}_EMA"

# Calculate the 50-day exponentially weighted moving average
stockprices[window_ema_var] = (
    stockprices["Close"].ewm(span=window_size, adjust=False).mean()
)
stockprices["200day"] = stockprices["Close"].rolling(200).mean()

### Plot and performance metrics for EMA model
plot_stock_trend(
    var=window_ema_var, cur_title="Exponential Moving Averages")
rmse_ema, mape_ema = calculate_perf_metrics(var=window_ema_var)

### Stop the run
run.stop()</code></pre>
</div>




<p>Examining the performance metrics tracked in Neptune, we have RMSE = 36.68, and MAPE = 10.71%, which is an improvement from SMA’s 43.77 and 12.53% for RMSE and MAPE, respectively. The trend chart generated from this EMA model also implies that it outperforms the SMA.</p>



<div id="app-screenshot-block_6628d61695a04d6cc3fb80787324af58"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-1.png?fit=1020%2C399&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-1.png?fit=480%2C188&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-1.png?fit=768%2C301&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-1.png?fit=1020%2C399&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="399"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://app.neptune.ai/o/showcase/org/StockPrediction/runs/details?viewId=99828b80-2046-43fe-a942-c36ab4c725d5&#038;detailsTab=metadata&#038;shortId=STOC-12&#038;type=run&#038;path=&#038;attribute=Plot%20of%20Stock%20Predictions"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in app							</span>
						
					</a>

				
														<button
						class="js-c-image-full-screen-modal c-button c-button--tertiary c-button--small">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-zoom.svg"
							width="16"
							height="17"
							class="c-button__icon"
							alt="zoom"
						/>

						<span class="c-button__text">
							Full screen preview						</span>
						
					</button>
									
			</div>

		</div>

			
</div>



<div id="separator-block_d7c5d01608ee214ada6636eca496f1dc"
         class="block-separator block-separator--15">
</div>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-comparison-of-the-sma-and-ema-prediction-performance">Comparison of the SMA and EMA prediction performance&nbsp;</h3>



<p>The screenshot below shows a comparison of SMA and EMA side-by-side in Neptune.</p>



<div id="app-screenshot-block_a03a8de5bf85c46a11e9a4e728560453"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-2.png?fit=1020%2C370&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-2.png?fit=480%2C174&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-2.png?fit=768%2C279&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-2.png?fit=1020%2C370&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="370"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://app.neptune.ai/o/showcase/org/StockPrediction/runs/compare?viewId=99828b80-2046-43fe-a942-c36ab4c725d5&#038;detailsTab=metadata&#038;shortId=STOC-12&#038;dash=images&#038;type=run&#038;path=&#038;attribute=Plot%20of%20Stock%20Predictions&#038;compare=IzA0yA"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in app							</span>
						
					</a>

				
														<button
						class="js-c-image-full-screen-modal c-button c-button--tertiary c-button--small">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-zoom.svg"
							width="16"
							height="17"
							class="c-button__icon"
							alt="zoom"
						/>

						<span class="c-button__text">
							Full screen preview						</span>
						
					</button>
									
			</div>

		</div>

			
</div>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-introduction-to-lstms-for-the-time-series-data">Introduction to LSTMs for the time-series data</h2>



<p>Now, let’s move on to the LSTM model. LSTM, short for Long Short-Term Memory, is a recurrent neural network architecture that is well suited to sequential data such as time series. It can capture long-range patterns in historical data and use them to predict future values.&nbsp;</p>


    <a
        href="/blog/arima-vs-prophet-vs-lstm"
        id="cta-box-related-link-block_0a3452e423cbd77e78e6b51895b41db2"
        class="block-cta-box-related-link  l-margin__top--large l-margin__bottom--x-large"
        target="_blank" rel="nofollow noopener noreferrer"    >

    
    <div class="block-cta-box-related-link__description-wrapper block-cta-box-related-link__description-wrapper--full">

        
            <div class="c-eyebrow">

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-related--article.svg"
                    loading="lazy"
                    decoding="async"
                    width="16"
                    height="16"
                    alt=""
                    class="c-eyebrow__icon">

                <div class="c-eyebrow__text">
                    Related post                </div>
            </div>

        
                    <h3 class="c-header" class="c-header" id="h-arima-vs-prophet-vs-lstm-for-time-series-prediction">                ARIMA vs Prophet vs LSTM for Time Series Prediction            </h3>        
                    <div class="c-button c-button--tertiary c-button--small">

                <span class="c-button__text">
                    Read more                </span>

                <img
                    src="https://neptune.ai/wp-content/themes/neptune/img/icon-button-arrow-right.svg"
                    loading="lazy"
                    decoding="async"
                    width="12"
                    height="12"
                    alt=""
                    class="c-button__arrow">

            </div>
            </div>

    </a>



<p>In a nutshell, the key component for understanding an LSTM model is the Cell State (<em>C<sub>t</sub></em>), which represents the internal short-term and long-term memories of a cell.&nbsp;</p>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Stock-prediction-LSTM.png?ssl=1" alt="Stock prediction LSTM" class="wp-image-49787" style="width:523px;height:285px"/><figcaption class="wp-element-caption"><em><a href="http://colah.github.io/posts/2015-08-Understanding-LSTMs/" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<p>To control and manage the cell state, an LSTM model contains three gates/layers. It’s worth mentioning that the “gates” here can be treated as filters to let information in (being remembered) or out (being forgotten).&nbsp;</p>



<ul class="wp-block-list">
<li>Forget gate:&nbsp;</li>
</ul>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Stock-prediction-LSTM-2.png?ssl=1" alt="Stock prediction LSTM" class="wp-image-49788"/><figcaption class="wp-element-caption"><em><a href="http://colah.github.io/posts/2015-08-Understanding-LSTMs/" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<p>As the name implies, the forget gate decides which information to throw away from the cell state. Mathematically, it applies a <a href="https://en.wikipedia.org/wiki/Sigmoid_function" target="_blank" rel="noreferrer noopener nofollow">sigmoid function</a> that returns a value in [0, 1] for each value of the previous cell state (<em>C<sub>t-1</sub></em>); here ‘1’ indicates “completely passing through” whereas ‘0’ indicates “completely filtering out”.</p>



<ul class="wp-block-list">
<li>Input gate:</li>
</ul>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Stock-prediction-LSTM-3.png?ssl=1" alt="Stock prediction LSTM" class="wp-image-49789"/><figcaption class="wp-element-caption"><a href="http://colah.github.io/posts/2015-08-Understanding-LSTMs/"><em>Source</em></a></figcaption></figure>
</div>


<p>The input gate chooses which new information gets added to and stored in the current cell state. In this layer, a sigmoid function decides which values to update and by how much (<em>i<sub>t</sub></em>, with each entry in [0, 1]), while a tanh function produces a vector of candidate values, each squashed into [-1, 1] (<em>C̃<sub>t</sub></em>). The element-wise product of <em>i<sub>t</sub></em> and <em>C̃<sub>t</sub></em> is the new information that gets added to the cell state.&nbsp;</p>



<ul class="wp-block-list">
<li>Output gate:</li>
</ul>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Stock-prediction-LSTM-4.png?ssl=1" alt="Stock prediction LSTM" class="wp-image-49790"/><figcaption class="wp-element-caption"><em><a href="http://colah.github.io/posts/2015-08-Understanding-LSTMs/" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<p>The output gate controls what the cell emits as its hidden state (<em>h<sub>t</sub></em>), which is passed on to the next time step. A sigmoid function decides which parts of the cell state to expose; the cell state is then passed through a tanh function and multiplied by the sigmoid’s output, keeping only what we’ve decided to let through.&nbsp;</p>
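<p>The three gates can be put together in a few lines of NumPy. The sketch below is a minimal, untrained single LSTM step following the standard formulation; the function name <code>lstm_step</code>, the weight layout, the random weights, and the toy inputs are all assumptions made for illustration, and this is not how Keras or any other library implements the layer internally.</p>

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; W maps [h_prev, x_t] to the 4 gate pre-activations."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f_t = sigmoid(z[0 * hidden : 1 * hidden])      # forget gate, values in (0, 1)
    i_t = sigmoid(z[1 * hidden : 2 * hidden])      # input gate, values in (0, 1)
    c_tilde = np.tanh(z[2 * hidden : 3 * hidden])  # candidate values in (-1, 1)
    o_t = sigmoid(z[3 * hidden : 4 * hidden])      # output gate, values in (0, 1)
    c_t = f_t * c_prev + i_t * c_tilde             # forget old info, add new info
    h_t = o_t * np.tanh(c_t)                       # filtered view of the cell state
    return h_t, c_t


rng = np.random.default_rng(0)
hidden_size, input_size = 4, 1
W = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size + input_size))
b = np.zeros(4 * hidden_size)

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for price in [0.1, 0.2, 0.3]:  # made-up scaled prices
    h, c = lstm_step(np.array([price]), h, c, W, b)

print(h.shape, c.shape)
```

<p>The cell state <code>c</code> is only ever rescaled by the forget gate and added to by the input gate, which is what lets information persist across many time steps.</p>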



<p>For a more detailed understanding of LSTM, you can check out <a href="http://colah.github.io/posts/2015-08-Understanding-LSTMs/" target="_blank" rel="noreferrer noopener nofollow">this document</a>.&nbsp;&nbsp;</p>



<p>Knowing the theory of LSTM, you must be wondering how it does at predicting real-world stock prices. We’ll find out in the next section, by building an LSTM model and comparing its performance against the two technical analysis models: SMA and EMA.&nbsp;</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-predicting-stock-prices-with-an-lstm-model">Predicting stock prices with an LSTM model</h3>



<p>First, we need to create a Neptune experiment dedicated to LSTM, which includes the specified hyper-parameters.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code>layer_units = 50
optimizer = "adam"
cur_epochs = 15
cur_batch_size = 20

cur_LSTM_args = {
    "units": layer_units,
    "optimizer": optimizer,
    "batch_size": cur_batch_size,
    "epochs": cur_epochs,
}

# Initialize a Neptune run
run = neptune.init_run(
    project=myProject,
    name="LSTM",
    description="stock-prediction-machine-learning",
    tags=["stockprediction", "LSTM", "neptune"],
)
run["LSTM_args"] = cur_LSTM_args</code></pre>
</div>




<p>Next, we scale the input data to keep LSTM training stable, slice out the training portion, and turn it into input sequences and target values.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code># Scale our dataset
scaler = StandardScaler()
scaled_data = scaler.fit_transform(stockprices[["Close"]])
scaled_data_train = scaled_data[: train.shape[0]]

# We use the past 50 days’ stock prices to predict the 51st day's closing price.
X_train, y_train = extract_seqX_outcomeY(scaled_data_train, window_size, window_size)</code></pre>
</div>




<p>A couple of notes:</p>



<ul class="wp-block-list">
<li>we use the <em>StandardScaler</em>, rather than the <em>MinMaxScaler</em> as you might have seen before. The reason is that stock prices are ever-changing, and there are no true min or max values. It doesn’t make sense to use the <em>MinMaxScaler</em>, although this choice probably won’t lead to disastrous results at the end of the day;</li>



<li>stock price data in its raw format can’t be used in an LSTM model directly; we need to transform it using our pre-defined <em>extract_seqX_outcomeY</em> function. For instance, to predict the 51st price, this function creates an input vector from the 50 preceding data points and uses the 51st price as the outcome value.</li>
</ul>
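<p>The windowing described in the second note can be sketched as follows. The helper <em>make_windows</em> is a hypothetical stand-in written for illustration; it is not the article’s <em>extract_seqX_outcomeY</em> function, whose exact signature may differ, and the input is a made-up series.</p>

```python
import numpy as np


def make_windows(series, window_size):
    """Turn a 1-D series into (samples, window_size, 1) inputs and next-step targets."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for end in range(window_size, len(series)):
        X.append(series[end - window_size:end])   # the window_size values before `end`
        y.append(series[end])                     # the value to predict
    # Keras LSTM layers expect inputs shaped (samples, timesteps, features)
    X = np.array(X).reshape(-1, window_size, 1)
    return X, np.array(y)


prices = np.arange(60, dtype=float)               # stand-in for scaled closing prices
X, y = make_windows(prices, window_size=50)
print(X.shape, y.shape)                           # (10, 50, 1) (10,)
```

<p>With 60 data points and a window of 50, we get 10 training samples: the first uses points 1 to 50 to predict point 51, the second uses points 2 to 51 to predict point 52, and so on.</p>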



<p>Moving on, let’s kick off the LSTM modeling process. Specifically, we’re building an LSTM with two hidden layers, and a ‘linear’ activation function upon the output. We also use Neptune&#8217;s Keras integration to monitor and log model training progress live.</p>



<p>Read more about the integration in the <a href="https://docs-legacy.neptune.ai/integrations/keras/" target="_blank" rel="noreferrer noopener">Neptune docs</a>.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code>### Setup Neptune's Keras integration ###
from neptune.integrations.tensorflow_keras import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)

### Build a LSTM model and log training progress to Neptune ###

def Run_LSTM(X_train, layer_units=50):
    inp = Input(shape=(X_train.shape[1], 1))

    x = LSTM(units=layer_units, return_sequences=True)(inp)
    x = LSTM(units=layer_units)(x)
    out = Dense(1, activation="linear")(x)
    model = Model(inp, out)

    # Compile the LSTM neural net
    model.compile(loss="mean_squared_error", optimizer="adam")

    return model


model = Run_LSTM(X_train, layer_units=layer_units)

history = model.fit(
    X_train,
    y_train,
    epochs=cur_epochs,
    batch_size=cur_batch_size,
    verbose=1,
    validation_split=0.1,
    shuffle=True,
    callbacks=[neptune_callback],
)</code></pre>
</div>




<p>Training progress is visible live on Neptune.</p>



<div id="app-screenshot-block_7ad8ad72e3a3855459995337fd0349b2"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-3.png?fit=1020%2C298&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-3.png?fit=480%2C140&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-3.png?fit=768%2C225&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-3.png?fit=1020%2C298&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="298"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://app.neptune.ai/o/showcase/org/StockPrediction/runs/details?viewId=99828b80-2046-43fe-a942-c36ab4c725d5&#038;detailsTab=charts&#038;shortId=STOC-13&#038;type=run"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in app							</span>
						
					</a>

				
									
			</div>

		</div>

			
</div>



<div id="separator-block_d7c5d01608ee214ada6636eca496f1dc"
         class="block-separator block-separator--15">
</div>



<p>Once the training completes, we’ll test the model against our hold-out set.&nbsp;</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code># predict stock prices using past window_size stock prices
def preprocess_testdat(data=stockprices, scaler=scaler, window_size=window_size, test=test):
    raw = data["Close"][len(data) - len(test) - window_size:].values
    raw = raw.reshape(-1,1)
    raw = scaler.transform(raw)

    X_test = [raw[i-window_size:i, 0] for i in range(window_size, raw.shape[0])]
    X_test = np.array(X_test)

    X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
    return X_test

X_test = preprocess_testdat()

predicted_price_ = model.predict(X_test)
predicted_price = scaler.inverse_transform(predicted_price_)

# Store the predictions next to the actual closing prices for plotting and evaluation
test["Predictions_lstm"] = predicted_price</code></pre>
</div>




<p>Time to calculate the performance metrics and log them to Neptune.</p>




<div
	style="opacity: 0;"
	class="block-code-snippet  l-padding__top--0 l-padding__bottom--0 l-margin__top--unset l-margin__bottom--unset block-code-snippet--regular language-py line-numbers block-code-snippet--show-header"
	data-show-header="show"
	data-header-text=""
>
	<pre style="font-size: .875rem;" data-prismjs-copy="Copy the JavaScript snippet!"><code># Evaluate performance
rmse_lstm = calculate_rmse(np.array(test["Close"]), np.array(test["Predictions_lstm"]))
mape_lstm = calculate_mape(np.array(test["Close"]), np.array(test["Predictions_lstm"]))

### Log to Neptune
run["RMSE"] = rmse_lstm
run["MAPE (%)"] = mape_lstm

### Plot prediction and true trends and log to Neptune
def plot_stock_trend_lstm(train, test):
    fig = plt.figure(figsize = (20,10))
    plt.plot(np.asarray(train.index), np.asarray(train["Close"]), label = "Train Closing Price")
    plt.plot(np.asarray(test.index), np.asarray(test["Close"]), label = "Test Closing Price")
    plt.plot(np.asarray(test.index), np.asarray(test["Predictions_lstm"]), label = "Predicted Closing Price")
    plt.title("LSTM Model")
    plt.xlabel("Date")
    plt.ylabel("Stock Price ($)")
    plt.legend(loc="upper left")

    ## Log image to Neptune
    run["Plot of Stock Predictions"].upload(neptune.types.File.as_image(fig))

plot_stock_trend_lstm(train, test)

### Stop the run after logging
run.stop()</code></pre>
</div>
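<p>For reference, RMSE and MAPE can be computed along the following lines. The functions <em>rmse</em> and <em>mape</em> below are illustrative assumptions and may differ in detail from the <em>calculate_rmse</em> and <em>calculate_mape</em> helpers used above; the sample numbers are made up.</p>

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the prices."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must not contain zeros)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)


actual = [100.0, 110.0, 120.0]       # made-up closing prices
predicted = [102.0, 108.0, 123.0]    # made-up predictions
print(rmse(actual, predicted), mape(actual, predicted))
```

<p>RMSE penalizes large misses more heavily and depends on the price level, while MAPE is scale-free, which makes it easier to compare across stocks.</p>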




<p>In Neptune, it’s amazing to see that our LSTM model achieved an RMSE = 12.58 and MAPE = 2%; a tremendous improvement from the SMA and EMA models! The trend chart shows a near-perfect overlay of the predicted and actual closing price for our testing set.</p>



<div id="app-screenshot-block_99bc96a92f27db66a52837f3c87caede"
	class="block-app-screenshot js-block-with-image-full-screen-modal "
	data-video-url=""
	data-show-controls="false"
	data-unmute="false"
	data-button-icon="https://neptune.ai/wp-content/themes/neptune/img/icon-close.svg"
	data-image-full-screen-modal="https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-4.png?fit=1020%2C380&#038;ssl=1"
>

			<div class="block-app-screenshot__image-wrapper">
			<div class="block-app-screenshot__bar">
				<figure class="block-app-screenshot__bar-buttons-wrapper">
					<img
						src="https://neptune.ai/wp-content/themes/neptune/img/blocks/app-screenshot/bar-buttons.svg"
						width="34"
						height="9"
						class="block-app-screenshot__bar-buttons"
						alt="">
				</figure>
			</div>

			
				<img
					srcset="
					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-4.png?fit=480%2C179&#038;ssl=1 480w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-4.png?fit=768%2C286&#038;ssl=1 768w,					https://i0.wp.com/neptune.ai/wp-content/uploads/2023/07/image-4.png?fit=1020%2C380&#038;ssl=1 1020w"
					alt=""
					style=""
					width="1020"
					height="380"
					class="block-app-screenshot__image"
				>

			
			<div class="block-app-screenshot__overlay">

				
					<a
						href="https://app.neptune.ai/o/showcase/org/StockPrediction/runs/details?viewId=99828b80-2046-43fe-a942-c36ab4c725d5&#038;detailsTab=metadata&#038;shortId=STOC-13&#038;type=run&#038;path=&#038;attribute=Plot%20of%20Stock%20Predictions"
						class="c-button c-button--primary c-button--small c-button--cta">
						<img
							decoding="async"
							loading="lazy"
							src="https://neptune.ai/wp-content/themes/neptune/img/icon-button--test-tube.svg"
							width="16"
							height="19"
							target="_blank" rel="nofollow noopener noreferrer"							class="c-button__icon"
							alt=""
						/>

													<span class="c-button__text">
								See in app							</span>
						
					</a>

				
									
			</div>

		</div>

			
</div>



<div id="separator-block_d7c5d01608ee214ada6636eca496f1dc"
         class="block-separator block-separator--15">
</div>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-final-thoughts-on-new-methodologies">Final thoughts on new methodologies</h2>



<p>We’ve seen the advantage of LSTMs over traditional MA models in the example of predicting Apple stock prices. Be careful about generalizing to other stocks because, unlike many stationary time series, stock market data shows little to no seasonality and is more chaotic.&nbsp;</p>



<p>In our example, Apple, as one of the biggest tech giants, has not only established a mature business model and management, its sales figures also benefit from the release of innovative products or services. Both contribute to the <a href="https://www.investors.com/research/options/apple-stock-volatility-low-bull-call-spread-option-idea/">lower implied volatility</a> of Apple stock, making the predictions relatively easier for the LSTM model in contrast to different, high-volatility stocks.&nbsp;</p>



<p>To account for the chaotic dynamics of the stock market, Echo State Networks (ESNs) have been <a href="https://science.sciencemag.org/content/304/5667/78">proposed</a>. As a member of the RNN (Recurrent Neural Network) family, an ESN uses a hidden layer of loosely interconnected neurons; this hidden layer is referred to as the ‘reservoir’ and is designed to capture the non-linear history of the input data.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Schema-of-echo-state.png?ssl=1" alt="Schema of echo state" class="wp-image-49796"/><figcaption class="wp-element-caption"><em><a href="https://pdfs.semanticscholar.org/8332/5e2222c938aecf8159300ce3f78139bac1ba.pdf" target="_blank" rel="noreferrer noopener nofollow">Schema of an Echo State Network (ESN)</a></em></figcaption></figure>
</div>


<p>At a high level, an ESN takes in a time-series input vector and maps it to a high-dimensional feature space, i.e. the dynamical reservoir (its neurons are connected randomly and sparsely rather than in ordered layers). Then, at the output layer, a linear function is applied to the reservoir states to calculate the final predictions.&nbsp;</p>
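<p>The idea can be sketched in a few lines of NumPy. This is a toy example under stated assumptions, not the setup from the paper: the reservoir size, the 10% connectivity, the spectral radius of 0.9, the ridge penalty, and the sine-wave input are all arbitrary illustrative choices.</p>

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy task: predict the next value of a sine wave from the current one
t = np.arange(0, 60, 0.1)
signal = np.sin(t)
u, target = signal[:-1], signal[1:]

# Fixed random reservoir: these weights are never trained
n_reservoir = 100
W_in = rng.uniform(-0.5, 0.5, size=n_reservoir)
W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
W[rng.random(W.shape) > 0.1] = 0.0                    # keep roughly 10% of the connections
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))       # rescale to a spectral radius of 0.9

# Drive the reservoir with the input and record its state at every step
states = np.zeros((len(u), n_reservoir))
x = np.zeros(n_reservoir)
for k, u_k in enumerate(u):
    x = np.tanh(W_in * u_k + W @ x)
    states[k] = x

# Train only the linear readout (ridge regression), skipping a warm-up period
washout, ridge = 50, 1e-4
S, Y = states[washout:], target[washout:]
W_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_reservoir), S.T @ Y)

prediction = S @ W_out
train_rmse = float(np.sqrt(np.mean((prediction - Y) ** 2)))
print(train_rmse)
```

<p>Because only the output weights are fitted, training reduces to a single linear solve, which is why ESNs are cheap to train compared to an LSTM.</p>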



<p>If you’re interested in learning more about this methodology, check out the <a href="http://www.columbia.edu/cu/biology/courses/w4070/Reading_List_Yuste/haas_04.pdf" target="_blank" rel="noreferrer noopener nofollow">original paper by Jaeger and Haas</a>.</p>



<p>In addition, it would be interesting to incorporate sentiment analysis on news and social media regarding the stock market in general, as well as a given stock of interest. Another promising approach for better stock price predictions is the hybrid model, where we add MA predictions as input vectors to the LSTM model. You might want to explore different methodologies, too.</p>



<p>Hope you enjoyed reading this article as much as I enjoyed writing it! The <a href="https://app.neptune.ai/o/showcase/org/StockPrediction/runs/table?viewId=99828b80-2046-43fe-a942-c36ab4c725d5" target="_blank" rel="noreferrer noopener nofollow">whole Neptune project is available&nbsp;here</a>&nbsp;for your reference.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">5601</post-id>	</item>
		<item>
		<title>ARIMA &#038; SARIMA: Real-World Time Series Forecasting</title>
		<link>https://neptune.ai/blog/arima-sarima-real-world-time-series-forecasting-guide</link>
		
		<dc:creator><![CDATA[Aayush Bajaj]]></dc:creator>
		<pubDate>Thu, 21 Jul 2022 15:10:16 +0000</pubDate>
				<category><![CDATA[ML Model Development]]></category>
		<category><![CDATA[Time Series]]></category>
		<guid isPermaLink="false">https://neptune.test/arima-sarima-real-world-time-series-forecasting-guide/</guid>

					<description><![CDATA[Time series and forecasting have been some of the key problems in statistics and Data Science. A data becomes a time series when it’s sampled on a time-bound attribute like days, months, and years inherently giving it an implicit order. Forecasting is when we take that data and predict future values.&#160; ARIMA and SARIMA are&#8230;]]></description>
										<content:encoded><![CDATA[
<p>Time series analysis and forecasting have been some of the key problems in statistics and Data Science. Data becomes a time series when it’s sampled on a time-bound attribute like days, months, or years, which inherently gives it an implicit order. Forecasting is when we take that data and predict future values.&nbsp;</p>



<p>ARIMA and SARIMA are both <a href="https://towardsdatascience.com/forecasting-with-machine-learning-models-95a6b6579090" target="_blank" rel="noreferrer noopener nofollow">algorithms for forecasting</a>. <strong>ARIMA </strong>takes into account the past values (autoregressive, moving average) and predicts future values based on that. <strong>SARIMA </strong>similarly uses past values but also takes into account any seasonality patterns. Since SARIMA brings in seasonality as a parameter, it’s significantly more powerful than ARIMA in forecasting complex data spaces containing cycles.</p>



<section id="blog-intext-cta-block_2f4b4496b56c5ccfa946d874a3624f75" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-may-interest-you">May interest you</h3>
    
            <p><img loading="lazy" decoding="async" class="lazyload block-blog-intext-cta__arrow-image" src="https://neptune.ai/wp-content/themes/neptune/img/image-ratio-holder.svg" alt="" width="12" height="12" data-src="https://neptune.ai/wp-content/themes/neptune/img/icon-arrow--right-gray.svg" />️ <a href="/blog/select-model-for-time-series-prediction-task" target="_blank" rel="noopener">How to Select a Model For Your Time Series Prediction Task [Guide]</a></p>
    
    </section>



<p>Further in the blog, we’re going to explore:</p>



<ol class="wp-block-list">
<li>ARIMA
<ul class="wp-block-list">
<li>What it is and how it forecasts</li>



<li>Example of predicting GDP of USA using ARIMA</li>
</ul>
</li>



<li>SARIMA
<ul class="wp-block-list">
<li>What it is and how it forecasts</li>



<li>Example of predicting electricity consumption</li>
</ul>
</li>



<li>Pros and cons of both models</li>



<li>Real-world use-cases of ARIMA and SARIMA</li>
</ol>



<p>Before we move on to the algorithms, there’s an important section about data preprocessing that you should be aware of before embarking on your forecasting journey.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-data-preprocessing-for-time-series-forecasting">Data preprocessing for time series forecasting</h2>



<p>Time series data is messy. Forecasting models, from simple rolling averages to LSTMs, require the data to be clean. So here are some techniques you could use before moving on to forecasting.</p>



<p><em>Note: This data preprocessing step is general, and it is highlighted here because real-world projects involve a lot of cleaning and preparation.</em></p>



<ul class="wp-block-list">
<li><strong>Detrending/Stationarity</strong>: Before forecasting, we want our time series variables to be mean-variance stationary. This means that the statistical properties of the series do not vary depending on when the sample was taken. Models built on stationary data are generally more robust. Stationarity can often be achieved by differencing.</li>



<li><strong>Anomaly detection:</strong> Any outlier present in the data might skew the forecasting results so it’s often considered a good practice to identify and normalize outliers before moving on to forecasting. You could follow this <a href="/blog/anomaly-detection-in-time-series" target="_blank" rel="noreferrer noopener">blog here where I have explained anomaly detection algorithms at length</a>.</li>



<li><strong>Check for sampling frequency:</strong> This is an important step to check the regularity of sampling. Irregular data has to be imputed or made uniform before applying any modeling techniques because irregular sampling leads to broken integrity of the time series and doesn’t fit well with the models.</li>



<li><strong>Missing data: </strong>At times, data can be missing for some datetime values, and this needs to be addressed before modeling. For example, time series data with missing values looks like this:</li>
</ul>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series-missing-data.png?ssl=1" alt="Missing data in time series " class="wp-image-65628"/><figcaption class="wp-element-caption">Missing data in time series | <a href="https://jagan-singhh.medium.com/missing-data-in-time-series-5dcf19b0f40f" target="_blank" rel="noreferrer noopener nofollow">Source</a></figcaption></figure>
</div>
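<p>As a rough illustration of the sampling-frequency, missing-data, and differencing steps, the following pandas sketch puts a toy series on a regular daily grid, fills the gaps, and applies first-order differencing. The data and the choice of time-based linear interpolation are assumptions made for the example; the right imputation method depends on your dataset.</p>

```python
import numpy as np
import pandas as pd

# Toy daily series with one missing timestamp (Jan 4) and one missing value (Jan 6)
idx = pd.to_datetime(
    ["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-05", "2022-01-06", "2022-01-07"]
)
ts = pd.Series([10.0, 12.0, 14.0, 18.0, np.nan, 22.0], index=idx)

regular = ts.asfreq("D")                       # enforce a daily grid; absent days become NaN
filled = regular.interpolate(method="time")    # fill gaps by time-weighted linear interpolation
differenced = filled.diff().dropna()           # first difference removes the linear trend

print(filled.tolist())        # [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0]
print(differenced.tolist())   # [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
```

<p>The trending series becomes a constant series after differencing, which is the simplest possible case of a stationary series.</p>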


<p>Now, let&#8217;s move on to the models.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-arima">ARIMA&nbsp;</h2>



<p>ARIMA is a class of linear models that utilizes historical values to forecast future values. ARIMA stands for <strong>Autoregressive Integrated Moving Average</strong>, and each of these techniques contributes to the final forecast. Let’s go through them one by one.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-autoregressive-ar">Autoregressive (AR)</h3>



<p>In an autoregression model, we forecast the variable of interest using a linear combination of past values of that variable. The term autoregression indicates that it is a regression of the variable against itself. That is, we use lagged values of the target variable as our input variables to forecast values for the future. An autoregression model of order p will look like:</p>



<p class="has-text-align-center">m<sub>t</sub> = β<sub>0</sub> + β<sub>1</sub>m<sub>t-1</sub> + β<sub>2</sub>m<sub>t-2</sub> + β<sub>3</sub>m<sub>t-3</sub> + &#8230; + β<sub>p</sub>m<sub>t-p</sub></p>



<p>In the above equation, the currently observed value of <em>m</em> is a linear function of its past <em>p</em> values. β<sub>0</sub>, &#8230;, β<sub>p</sub> are the regression coefficients, which are estimated during training. There are some standard methods to determine the optimal value of <em>p</em>, one of which is analyzing <strong>Autocorrelation</strong> and <strong>Partial Autocorrelation</strong> function plots.&nbsp;</p>
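<p>To see the autoregression equation at work, here is a minimal NumPy sketch that estimates the coefficients of an AR(p) model by ordinary least squares on lagged values. The helpers <em>fit_ar</em> and <em>forecast_next</em> are hypothetical names, and the simulated AR(2) series with its coefficients is an assumption for the example; in practice you would typically rely on a library such as statsmodels.</p>

```python
import numpy as np


def fit_ar(series, p):
    """Estimate AR(p) coefficients by ordinary least squares on lagged values."""
    series = np.asarray(series, dtype=float)
    # Column k holds the lag-k values m_(t-k), aligned with the targets m_t
    lags = [series[p - k:len(series) - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(len(series) - p)] + lags)
    y = series[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef                        # [intercept, lag-1 coefficient, ..., lag-p coefficient]


def forecast_next(series, coef):
    """One-step-ahead forecast from the most recent p values."""
    p = len(coef) - 1
    recent = np.asarray(series, dtype=float)[-1:-p - 1:-1]   # m_(t-1), m_(t-2), ...
    return float(coef[0] + coef[1:] @ recent)


# Simulate an AR(2) process with known coefficients to check the fit
rng = np.random.default_rng(0)
m = np.zeros(2000)
for t in range(2, len(m)):
    m[t] = 1.0 + 0.6 * m[t - 1] - 0.3 * m[t - 2] + rng.normal(scale=0.5)

coef = fit_ar(m, p=2)
print(np.round(coef, 2))               # compare with the true values 1.0, 0.6, -0.3
print(forecast_next(m, coef))
```

<p>The estimated coefficients should land near the values used in the simulation, with some sampling noise.</p>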



<p>The autocorrelation function (ACF) is the correlation between the current and the past values of the same variable. Apart from the direct effect, it also captures the indirect effect that values carry over through the intermediate time steps. For example, the price of oil 2 days ago affects the price 1 day ago and, through it, today’s price. But the price 2 days ago might also affect today’s price directly. ACF measures both effects together.</p>



<p>Partial autocorrelation (PACF), on the other hand, measures only the direct correlation between past values and current values. For example, PACF will only measure the direct effect of the price of oil 2 days ago on today’s price, with the indirect effect removed.</p>



<p>ACF and PACF plots help us determine the dependency on past values, which in turn helps us deduce <em>p</em> in AR. Head over <a href="https://otexts.com/fpp2/non-seasonal-arima.html#acf-and-pacf-plots" target="_blank" rel="noreferrer noopener nofollow">here</a> to understand in depth how to deduce values for p (AR) and q (MA).&nbsp;</p>
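<p>The difference between ACF and PACF is easier to see on a series whose structure we know. The sketch below computes both by hand with NumPy for a simulated AR(1) series; the function names, the coefficient 0.7, and the regression-based PACF estimate are assumptions for the illustration (statsmodels offers ready-made <em>plot_acf</em> and <em>plot_pacf</em> functions).</p>

```python
import numpy as np


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.sum(x ** 2)
    return np.array([np.sum(x[k:] * x[:len(x) - k]) / denom for k in range(max_lag + 1)])


def pacf(series, max_lag):
    """Partial autocorrelation: the lag-k value is the last coefficient of an OLS AR(k) fit."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    out = [1.0]
    for k in range(1, max_lag + 1):
        X = np.column_stack([x[k - j:len(x) - j] for j in range(1, k + 1)])
        coef, *_ = np.linalg.lstsq(X, x[k:], rcond=None)
        out.append(coef[-1])
    return np.array(out)


# AR(1) toy series: today's value depends directly only on yesterday's value
rng = np.random.default_rng(1)
m = np.zeros(3000)
for t in range(1, len(m)):
    m[t] = 0.7 * m[t - 1] + rng.normal()

print(np.round(acf(m, 3), 2))    # decays gradually: lag 2 still correlates through lag 1
print(np.round(pacf(m, 3), 2))   # close to zero beyond lag 1
```

<p>A PACF that cuts off after lag 1 while the ACF tails off slowly is the typical signature that suggests p = 1.</p>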



<h3 class="wp-block-heading" class="wp-block-heading" id="h-integrated-i">Integrated (I)</h3>



<p>Integrated represents any differencing that has to be applied in order to make the data stationary. An augmented Dickey-Fuller test (code below) can be run on the data to check for stationarity, and you can then experiment with different differencing orders. A differencing order of d=1 means taking the first difference, i.e. m<sub>t</sub> - m<sub>t-1</sub>. Let&#8217;s look at a plot of original vs differenced data.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Arima-sarima-catfish-sales-1.png?ssl=1" alt="Arima sarima catfish sales 1" class="wp-image-65629"/><figcaption class="wp-element-caption"><em>Original Data | Source: Author</em></figcaption></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Arima-sarima-catfish-sales-2.png?ssl=1" alt="Arima sarima catfish sales 1" class="wp-image-65630"/><figcaption class="wp-element-caption"><em>After applying d=1 | Source: Author</em></figcaption></figure>
</div>


<p>The difference between them is evident. After differencing, the series is significantly more stationary than the original, and its mean and variance are approximately constant over the years. We can use the code below to conduct an augmented Dickey-Fuller test.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-function"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">def</span> <span class="hljs-title" style="color: rgb(153, 0, 0); font-weight: 700;">check_stationarity</span><span class="hljs-params">(ts)</span>:</span>
    dftest = adfuller(ts)
    adf = dftest[<span class="hljs-number" style="color: teal;">0</span>]
    pvalue = dftest[<span class="hljs-number" style="color: teal;">1</span>]
    critical_value = dftest[<span class="hljs-number" style="color: teal;">4</span>][<span class="hljs-string" style="color: rgb(221, 17, 68);">'5%'</span>]
    <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">if</span> (pvalue &lt; <span class="hljs-number" style="color: teal;">0.05</span>) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">and</span> (adf &lt; critical_value):
        print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'The series is stationary'</span>)
    <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">else</span>:
        print(<span class="hljs-string" style="color: rgb(221, 17, 68);">'The series is NOT stationary'</span>)</pre>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-moving-average-ma">Moving Average (MA)</h3>



<p>Moving average models use past forecast errors rather than past values in a regression-like model to forecast future values. A moving average model can be denoted by the following equation:</p>



<p class="has-text-align-center">m<sub>t</sub> = θ<sub>0</sub> + θ<sub>1</sub>e<sub>t-1</sub> + θ<sub>2</sub>e<sub>t-2</sub> + θ<sub>3</sub>e<sub>t-3</sub> + &#8230; + θ<sub>q</sub>e<sub>t-q</sub></p>



<p>This is referred to as an <strong>MA(q)</strong> model, where θ<sub>0</sub>, &#8230;, θ<sub>q</sub> are the coefficients. In the above equation, <em>e</em> is called the <em>error</em>, and it represents the random residual deviations between the model and the target variable. Since <em>e</em> can only be determined after fitting the model, it is <strong>unobservable</strong> beforehand and has to be estimated together with the coefficients. Hence, to solve the MA equation, iterative techniques like Maximum Likelihood Estimation are used instead of OLS.</p>
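<p>A quick simulation shows what an MA process looks like in data. The sketch below generates an MA(1) series with NumPy and checks its autocorrelation; the intercept 5.0, the coefficient 0.8, and the helper <em>autocorr</em> are assumptions for the example. Following the standard definition, the simulated value also includes the current error <em>e<sub>t</sub></em>, which is unknown at forecast time.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
intercept, ma_coef = 5.0, 0.8            # illustrative intercept and MA(1) coefficient
n = 5000

e = rng.normal(size=n + 1)               # error terms, unobserved in real data
m = intercept + e[1:] + ma_coef * e[:-1] # m_t = intercept + e_t + ma_coef * e_(t-1)


def autocorr(x, lag):
    """Sample autocorrelation of x at a single lag."""
    x = x - x.mean()
    return float(np.sum(x[lag:] * x[:len(x) - lag]) / np.sum(x ** 2))


print(round(autocorr(m, 1), 2))          # theory: ma_coef / (1 + ma_coef**2)
print(round(autocorr(m, 2), 2))          # theory: 0, an MA(1) has no memory beyond one step
```

<p>The autocorrelation that vanishes after lag q is what the ACF plot reveals, and it is the usual way to pick <em>q</em> for the MA part.</p>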



<p>Since we’ve looked at how ARIMA works, let’s dive into an example and see how ARIMA is applied to time series data.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-implementing-arima">Implementing ARIMA</h3>



<p>For the implementation, I’ve chosen <a href="https://www.kaggle.com/datasets/yekahaaagayeham/time-series-toy-data-set" target="_blank" rel="noreferrer noopener nofollow">catfish sales data from 1996 to 2008</a>. We’re going to apply the techniques we learned above to this dataset and see them in action. Although the data doesn’t need a lot of cleaning and is in a ready-to-be-analyzed state, you might have to apply cleaning techniques to your dataset.&nbsp;</p>



<p>Unfortunately, we cannot replicate each and every scenario as cleaning methods are highly subjective and depend on the team’s requirements too. But the techniques learned here can be directly applied to your dataset after cleaning.</p>



<p>Let’s start with importing essential modules.</p>



<h4 class="wp-block-heading">Importing dependencies</h4>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> IPython.display <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> display

<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> pandas <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> pd
pd.set_option(<span class="hljs-string" style="color: rgb(221, 17, 68);">'display.max_rows'</span>, <span class="hljs-number" style="color: teal;">15</span>)
pd.set_option(<span class="hljs-string" style="color: rgb(221, 17, 68);">'display.max_columns'</span>, <span class="hljs-number" style="color: teal;">500</span>)
pd.set_option(<span class="hljs-string" style="color: rgb(221, 17, 68);">'display.width'</span>, <span class="hljs-number" style="color: teal;">1000</span>)

<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> plt
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> datetime <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> datetime
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> datetime <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> timedelta
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> pandas.plotting <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> register_matplotlib_converters

register_matplotlib_converters()

<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.seasonal <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> seasonal_decompose
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.arima_model <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> ARIMA
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.statespace.sarimax <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> SARIMAX
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.stattools <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> adfuller
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.graphics.tsaplots <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> plot_acf, plot_pacf
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> time <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> time
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> seaborn <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> sns
sns.set(style=<span class="hljs-string" style="color: rgb(221, 17, 68);">"whitegrid"</span>)

<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> warnings
warnings.filterwarnings(<span class="hljs-string" style="color: rgb(221, 17, 68);">'ignore'</span>)

RANDOM_SEED = <span class="hljs-number" style="color: teal;">0</span>
np.random.seed(RANDOM_SEED)</pre>



<p>These are pretty self-explanatory modules that every Data Scientist will be familiar with. It’s good practice to set RANDOM_SEED so that the code produces the same results on every run.</p>
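<p>To see why the seed matters, here’s a minimal sketch. It uses NumPy’s default_rng generator, which is an assumption for illustration and not part of the setup code above: two generators built from the same seed produce exactly the same draws.</p>

```python
import numpy as np

SEED = 0  # any fixed integer works; 0 mirrors the value used above

# Two independent generators created from the same seed
rng_a = np.random.default_rng(SEED)
rng_b = np.random.default_rng(SEED)

draws_a = rng_a.normal(size=5)
draws_b = rng_b.normal(size=5)

# Same seed, same sequence: the comparison is exact, not approximate
same = np.array_equal(draws_a, draws_b)
print(same)
```

<p>Anything downstream that depends on random numbers (noise injection, random initialization, shuffling) becomes repeatable once the seed is fixed.</p>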



<p>Next, we’re going to import and plot the time series data.</p>



<h4 class="wp-block-heading">Extract-Transform-Load (ETL)</h4>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#read data; parse_dates handles ISO dates (YYYY-MM-DD) without a custom parser,</span>
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#and the date_parser argument is deprecated in pandas 2.x</span>
catfish_sales = pd.read_csv(<span class="hljs-string" style="color: rgb(221, 17, 68);">'catfish.csv'</span>, parse_dates=[<span class="hljs-number" style="color: teal;">0</span>], index_col=<span class="hljs-number" style="color: teal;">0</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#infer the frequency of the data</span>
catfish_sales = catfish_sales.asfreq(pd.infer_freq(catfish_sales.index))

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#transform</span>
start_date = datetime(<span class="hljs-number" style="color: teal;">1996</span>,<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">1</span>)
end_date = datetime(<span class="hljs-number" style="color: teal;">2008</span>,<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">1</span>)
lim_catfish_sales = catfish_sales[start_date:end_date]

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#plot</span>
plt.figure(figsize=(<span class="hljs-number" style="color: teal;">14</span>,<span class="hljs-number" style="color: teal;">4</span>))
plt.plot(lim_catfish_sales)
plt.title(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Catfish Sales in 1000s of Pounds'</span>, fontsize=<span class="hljs-number" style="color: teal;">20</span>)
plt.ylabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Sales'</span>, fontsize=<span class="hljs-number" style="color: teal;">16</span>)</pre>



<p>For the sake of simplicity, I’ve limited the data to 1996-2008. The plot generated by the code above looks like this:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Arima-sarima-catfish-sales-3.png?ssl=1" alt="Arima sarima catfish sales 3" class="wp-image-65633"/><figcaption class="wp-element-caption"><em>Catfish sales | Source: Author</em></figcaption></figure>
</div>
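<p>The frequency step in the ETL code is easy to overlook, so here’s a small sketch of what it does in isolation. The four-row DataFrame is made up for illustration and only stands in for the catfish data.</p>

```python
import pandas as pd

# Synthetic month-start dates standing in for the catfish index (assumption)
idx = pd.to_datetime(["2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01"])
sales = pd.DataFrame({"Total": [10.0, 12.0, 11.0, 13.0]}, index=idx)

# A DatetimeIndex built from parsed dates carries no frequency yet
freq_before = sales.index.freq

# infer_freq guesses the spacing from the timestamps ("MS" = month start)
inferred = pd.infer_freq(sales.index)

# asfreq attaches that frequency to the index
sales = sales.asfreq(inferred)
print(freq_before, inferred, sales.index.freqstr)
```

<p>With an explicit frequency on the index, time series models can tell how far apart the observations are and can label future periods when forecasting.</p>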


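<p>First impressions say there is a definite trend and seasonality present in the data. Let&#8217;s decompose the series into trend, seasonal, and residual components to get a better understanding. Note that the seasonal_decompose function from statsmodels, used below, performs a classical moving-average decomposition rather than a LOESS-based STL.</p>



<h4 class="wp-block-heading">Seasonal decomposition</h4>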



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">plt.rc(<span class="hljs-string" style="color: rgb(221, 17, 68);">'figure'</span>,figsize=(<span class="hljs-number" style="color: teal;">14</span>,<span class="hljs-number" style="color: teal;">8</span>))
plt.rc(<span class="hljs-string" style="color: rgb(221, 17, 68);">'font'</span>,size=<span class="hljs-number" style="color: teal;">15</span>)

result = seasonal_decompose(lim_catfish_sales,model=<span class="hljs-string" style="color: rgb(221, 17, 68);">'additive'</span>)
fig = result.plot()</pre>



<p>The resulting plot looks like this:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Arima-sarima-catfish-sales-4.png?ssl=1" alt="Arima sarima catfish sales 4" class="wp-image-65636"/></figure>
</div>


<p>Points to ponder:</p>



<ol class="wp-block-list">
<li>A 6-month and 12-month seasonal pattern is visible</li>



<li>An upwards and downwards trend is evident</li>
</ol>



<p>Let’s look at the ACF and PACF plots to get an idea of the p and q values.</p>



<h4 class="wp-block-heading">ACF and PACF plots</h4>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">plot_acf(lim_catfish_sales[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Total'</span>], lags=<span class="hljs-number" style="color: teal;">48</span>);
plot_pacf(lim_catfish_sales[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Total'</span>], lags=<span class="hljs-number" style="color: teal;">30</span>);</pre>



<p>The output of the above code plots ACF and PACF:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Autocorrelation-plot-for-Catfish-data.png?resize=831%2C484&#038;ssl=1" alt="Autocorrelation plot for Catfish data" class="wp-image-65638" style="width:831px;height:484px" width="831" height="484"/><figcaption class="wp-element-caption"><em>Autocorrelation plot for Catfish data</em></figcaption></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter"><img decoding="async" src="https://lh4.googleusercontent.com/2HjZi09y0R9D9uNPdctwArCQn3e7L-ip0UGruPPt8GymA8iaLhOaPFB_yJ9yacM412Vcck6GOWajpyKonAXbGkXa8P5NCX8VltnUx1gEe_6t03AmXlXu9g4rwXQHtsK8_MOAy5SOn8bN51B4tg" alt="Partial autocorrelation plot for catfish data"/><figcaption class="wp-element-caption"><em>Partial autocorrelation plot for Catfish data</em></figcaption></figure>
</div>


<p>Points to ponder:</p>



<ol class="wp-block-list">
<li>There are significant spikes at the 6-month and 12-month lags in the ACF</li>



<li>PACF is nearly sinusoidal</li>
</ol>



<p>The differencing order d should be kept at 1 since there’s a clear trend, which makes the data non-stationary. The autoregressive order p can be tested with values 6 and 12.</p>



<h4 class="wp-block-heading">Fitting ARIMA</h4>



<p>We’re going to use the statsmodels module to implement ARIMA. For this, we’ve imported the ARIMA class from statsmodels. Now, let’s fit with the parameters we discussed in the previous section.</p>



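<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.arima.model <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> ARIMA  <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#tsa.arima_model was removed in newer statsmodels releases</span>
arima = ARIMA(lim_catfish_sales[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Total'</span>], order=(<span class="hljs-number" style="color: teal;">12</span>,<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">1</span>))
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#predict() returns levels; subtract the previous value to get predicted changes for the differenced plot</span>
predictions = arima.fit().predict() - lim_catfish_sales[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Total'</span>].shift(<span class="hljs-number" style="color: teal;">1</span>)</pre>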



<p>As you can see above, I started with (12,1,1) for (p,d,q), based on what we saw in the ACF and PACF plots.</p>



<p><em>Note: It is quite handy to use modules for algorithms (like scikit-learn) and you’ll be glad to know that statsmodels is one of the libraries that gets used a lot.</em></p>



<section id="blog-intext-cta-block_ad04649f4dbc87e23fb9f5e85af89ded" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-check-more-tools">Check more tools</h3>
    
            <p><a href="/blog/time-series-tools-packages-libraries" target="_blank" rel="noopener">Time Series Projects: Tools, Packages, and Libraries That Can Help</a></p>
    
    </section>



<p>Let’s see how our predictions stack up with the original data.</p>



<h4 class="wp-block-heading">Visualizing the result</h4>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">plt.figure(figsize=(<span class="hljs-number" style="color: teal;">16</span>,<span class="hljs-number" style="color: teal;">4</span>))
plt.plot(lim_catfish_sales.diff(), label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Actual"</span>)
plt.plot(predictions, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Predicted"</span>)
plt.title(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Catfish Sales in 1000s of Pounds'</span>, fontsize=<span class="hljs-number" style="color: teal;">20</span>)
plt.ylabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Sales'</span>, fontsize=<span class="hljs-number" style="color: teal;">16</span>)
plt.legend()</pre>



<p>The output of the above code will give you a comparative plot of predictions and the actual data.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Arima-comparative-plot.png?ssl=1" alt="arima comparative plot" class="wp-image-65640"/><figcaption class="wp-element-caption"><em>A comparative plot of predictions and the actual data</em></figcaption></figure>
</div>


<p>You can see here that the model missed some of the peaks but captured the essence of the data well. We can experiment with more p,d,q values to make the model generalize better and to make sure it doesn’t overfit.</p>



<p>Trial and error is one way, but you can also use <strong>Auto-ARIMA</strong>. It essentially does the heavy lifting and tunes the hyperparameters for you. This <a href="https://towardsdatascience.com/time-series-forecasting-using-auto-arima-in-python-bb83e49210cd" target="_blank" rel="noreferrer noopener nofollow">blog</a> is a good starting point for auto-ARIMA.</p>



<p>Keep in mind that you still need to be able to explain the parameters that Auto-ARIMA selects. Don’t let the model turn into a black box, because forecasting models usually have to pass governance review before deployment. So, it’s good practice to be able to explain the parameter values and their contribution.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-sarima">SARIMA</h2>



<p>SARIMA stands for Seasonal ARIMA; it adds a seasonal contribution to the forecast. Seasonality is often important, and plain ARIMA has no explicit way to capture it.</p>



<p>The Autoregressive (AR), Integrated (I), and Moving Average (MA) parts of the model remain the same as in ARIMA. The added seasonal terms make SARIMA more robust on seasonal data. It’s represented as:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Sarima-seasonality.png?ssl=1" alt="Sarima seasonality" class="wp-image-65641"/><figcaption class="wp-element-caption"><a href="https://otexts.com/fpp2/seasonal-arima.html" target="_blank" rel="noreferrer noopener nofollow"><em>Source</em></a></figcaption></figure>
</div>


<p>where m is the number of observations per year. We use the uppercase notation for the seasonal parts of the model, and lowercase notation for the non-seasonal parts of the model.&nbsp;</p>
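<p>To make the role of m and the seasonal differencing order D concrete, here’s a small worked sketch. The toy series (a linear trend plus a repeating 12-value pattern) is an assumption chosen so the arithmetic is exact.</p>

```python
import numpy as np

m = 12                      # observations per seasonal cycle (monthly data)
t = np.arange(5 * m)        # five years of monthly time steps
pattern = np.array([3, 1, 0, 2, 5, 8, 9, 7, 4, 2, 1, 2], dtype=float)

# Toy series (assumption): linear trend plus a repeating 12-month pattern
y = 0.5 * t + pattern[t % m]

# Seasonal difference (D=1): y_t - y_{t-m} cancels the repeating pattern
seasonal_diff = y[m:] - y[:-m]

# Regular difference (d=1) of the result removes what is left of the trend
both_diff = np.diff(seasonal_diff)

print(seasonal_diff[:3], both_diff[:3])
```

<p>The seasonal difference leaves a constant (0.5 × 12 = 6 at every step), and the regular difference then reduces it to zero. On real data you’d be left with noise instead of exact zeros, and that remainder is what the AR and MA terms model.</p>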



<p>Similar to ARIMA, the P,D,Q values for seasonal parts of the model can be deduced from the ACF and PACF plots of the data. Let&#8217;s implement SARIMA for the same Catfish sales model.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-implementing-sarima">Implementing SARIMA</h3>



<p>The ETL and dependencies will remain the same as in ARIMA so we’ll jump straight to the modeling part.</p>



<h4 class="wp-block-heading">Fitting SARIMA</h4>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">sarima = SARIMAX(lim_catfish_sales[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Total'</span>],
                order=(<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">1</span>),
                seasonal_order=(<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">0</span>,<span class="hljs-number" style="color: teal;">12</span>))
predictions = sarima.fit().predict()</pre>



<p>I took (1,1,1) for the non-seasonal part and (1,1,0,12) for the seasonal part, since the ACF showed 6-month and 12-month lagged correlations. Let’s see how it turned out.</p>



<h4 class="wp-block-heading">Visualizing the result</h4>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">plt.figure(figsize=(<span class="hljs-number" style="color: teal;">16</span>,<span class="hljs-number" style="color: teal;">4</span>))
plt.plot(lim_catfish_sales, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Actual"</span>)
plt.plot(predictions, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">"Predicted"</span>)
plt.title(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Catfish Sales in 1000s of Pounds'</span>, fontsize=<span class="hljs-number" style="color: teal;">20</span>)
plt.ylabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Sales'</span>, fontsize=<span class="hljs-number" style="color: teal;">16</span>)
plt.legend()</pre>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Sarima-comparative-plot.png?ssl=1" alt="Sarima comparative plot" class="wp-image-65643"/><figcaption class="wp-element-caption"><em>A comparative plot of predictions and the actual data</em></figcaption></figure>
</div>


<p>As you can see, the model struggled to fit at the start, probably because of poor initialization, but it quickly found the right path. The fit is noticeably better than the ARIMA one, suggesting that SARIMA captures seasonality better. If seasonality is present in your data, it makes sense to try SARIMA.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-pros-and-cons-of-arima-and-sarima-models">Pros and cons of ARIMA and SARIMA models</h2>



<p>Owing to their linear nature, both algorithms are handy and widely used in industry for experimentation, understanding the data, and creating baseline forecasting scores. If the orders (p,d,q) are tuned well, they can perform significantly better. Their simple and explainable nature makes them a top pick for analysts and Data Scientists. There are, however, some pros and cons when working with ARIMA and SARIMA at scale. Let’s discuss both:</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-pros-of-arima-sarima">Pros of ARIMA &amp; SARIMA</h3>



<ul class="wp-block-list">
<li><strong>Easy to understand and interpret</strong>: The one thing that your fellow teammates and colleagues would appreciate is the simplicity and interpretability of the models. Focusing on both of these things while also maintaining the quality of the results will help with presentations with the stakeholders.</li>



<li><strong>Limited variables</strong>: There are fewer hyperparameters so the config file will be easily maintainable if the model goes into production.</li>
</ul>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-cons-of-arima-sarima">Cons of ARIMA &amp; SARIMA</h3>



<ul class="wp-block-list">
<li><strong>Training cost grows with model order:</strong> As p and q increase, there are more coefficients to fit, so training time rises quickly for high orders. This makes both algorithms harder to put into production and pushes Data Scientists to look into Prophet and other algorithms. Then again, it also depends on the complexity of the dataset.</li>



<li><strong>Complex data: </strong>Your data may be too complex for any choice of p and q to work well. It’s unlikely that ARIMA and SARIMA would fail outright, but if that happens, you may have to look elsewhere.</li>



<li><strong>Amount of data needed: </strong>Both algorithms require considerable data to work on, especially if the data is seasonal. For example, three years of historical demand is likely not enough for a good forecast, which is a problem for short life-cycle products.</li>
</ul>



<section id="blog-intext-cta-block_e8db003dd33ba4bda000f5039da4f83c" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-may-interest-you">May interest you</h3>
    
            <p><img loading="lazy" decoding="async" class="lazyload block-blog-intext-cta__arrow-image" src="https://neptune.ai/wp-content/themes/neptune/img/image-ratio-holder.svg" alt="" width="12" height="12" data-src="https://neptune.ai/wp-content/themes/neptune/img/icon-arrow--right-gray.svg" />️ <a href="/blog/arima-vs-prophet-vs-lstm" target="_blank" rel="noopener">ARIMA vs Prophet vs LSTM for Time Series Prediction</a></p>
    
    </section>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-real-world-use-cases-of-arima-and-sarima">Real-world use-cases of ARIMA and SARIMA</h2>



<p>ARIMA/SARIMA are among the most popular econometric models, used for forecasting stock prices, <a href="https://journals.sagepub.com/doi/10.1177/1847979018808673" target="_blank" rel="noreferrer noopener nofollow">demand forecasting</a>, and even the spread of infectious diseases. When the underlying mechanisms are not known or are too complicated, e.g., the stock market, or not fully known, e.g., retail sales, it is usually better to apply ARIMA or a similar statistical model than complex deep algorithms like RNNs.</p>



<p>However, there are cases where applying ARIMA can give you subpar results.</p>



<p>Here are some curated papers that use ARIMA/SARIMA:</p>



<ol class="wp-block-list">
<li><a href="https://journals.sagepub.com/doi/full/10.1177/0972150920988653" target="_blank" rel="noreferrer noopener nofollow">An Application of ARIMA Model to Forecast the Dynamics of COVID-19 Epidemic in India</a>: This research paper used ARIMA to forecast COVID-19 case numbers in India. The shortcoming of ARIMA in this case is that it only uses past values to forecast the future. But the course of COVID-19 changes over time and depends on many behavioral factors beyond past values, which ARIMA can’t capture.</li>



<li><a href="https://www.mdpi.com/2073-8994/11/2/240" target="_blank" rel="noreferrer noopener nofollow">Time Series ARIMA Model for Prediction of Daily and Monthly Average Global Solar Radiation: The Case Study of Seoul, South Korea</a>: This is a study that forecasts solar radiation in South Korea based on the hourly solar radiation data obtained from the Korean Meteorological Administration over 37 years by using SARIMA.</li>



<li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4872983/" target="_blank" rel="noreferrer noopener nofollow">Disease management with ARIMA model in time series</a>: Another example of ARIMA in disease management, showing the wide applicability of ARIMA/SARIMA models. The paper touches on some real-life use cases for ARIMA. For example, a hospital in Singapore accurately predicted the number of beds it would need three days ahead during the SARS epidemic.</li>



<li><a href="https://journals.sagepub.com/doi/full/10.1177/1847979018808673" target="_blank" rel="noreferrer noopener nofollow">Forecasting of demand using the ARIMA model</a>: This use case focuses on modeling and forecasting demand in a food company using ARIMA.</li>
</ol>



<p>When it comes to the industry, here&#8217;s <a href="https://eng.uber.com/forecasting-introduction/" target="_blank" rel="noreferrer noopener nofollow">a nice article about forecasting in Uber</a>. </p>



<p>More often than not, you’ll find ARIMA/SARIMA used when the problem statement is limited to past values, whether it&#8217;s predicting hospital beds, COVID cases, or demand. The shortcoming arises when there are other factors to consider in forecasting, such as static attributes. Look closely at the problem statement you’re working on; if that’s your situation, try other methods like <a href="https://robjhyndman.com/papers/Theta.pdf" target="_blank" rel="noreferrer noopener nofollow">Theta</a>, QRF (quantile regression forests), Prophet, or RNNs.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-conclusion-and-final-notes">Conclusion and final notes</h2>



<p>You’ve reached the end! In this blog, we discussed ARIMA and SARIMA at length pertaining to their utilization and importance for research in the industry. Their simplicity and robustness make them top contenders for modeling and forecasting. There are however some things to keep in mind while working on them in your real-world use case:</p>



<ol class="wp-block-list">
<li>Increasing p and q can sharply increase training time. So, it’s advisable to narrow down their values beforehand (for example, from the ACF and PACF plots) and then experiment.</li>



<li>They are prone to overfitting. So, make sure you set the hyperparameters right and do validation before moving to production.</li>
</ol>



<p>That’s it from my side. Keep learning and stay tuned for more! Adios!</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-references">References</h3>



<ol class="wp-block-list">
<li><a href="/blog/time-series-prediction-vs-machine-learning" target="_blank" rel="noreferrer noopener">https://neptune.ai/blog/time-series-prediction-vs-machine-learning</a></li>



<li><a href="https://otexts.com/fpp2/arima.html" target="_blank" rel="noreferrer noopener nofollow">https://otexts.com/fpp2/arima.html</a></li>



<li><a href="https://towardsdatascience.com/understanding-arima-time-series-modeling-d99cd11be3f8" target="_blank" rel="noreferrer noopener nofollow">https://towardsdatascience.com/understanding-arima-time-series-modeling-d99cd11be3f8</a></li>



<li><a href="https://towardsdatascience.com/understanding-sarima-955fe217bc77" target="_blank" rel="noreferrer noopener nofollow">https://towardsdatascience.com/understanding-sarima-955fe217bc77</a></li>



<li><a href="/blog/anomaly-detection-in-time-series">https://neptune.ai/blog/anomaly-detection-in-time-series</a></li>



<li><a href="https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html" target="_blank" rel="noreferrer noopener nofollow">https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html</a></li>



<li><a href="https://otexts.com/fpp2/seasonal-arima.html" target="_blank" rel="noreferrer noopener nofollow">https://otexts.com/fpp2/seasonal-arima.html</a></li>



<li><a href="https://towardsdatascience.com/time-series-forecasting-using-auto-arima-in-python-bb83e49210cd" target="_blank" rel="noreferrer noopener nofollow">https://towardsdatascience.com/time-series-forecasting-using-auto-arima-in-python-bb83e49210cd</a></li>



<li><a href="https://towardsdatascience.com/time-series-from-scratch-autocorrelation-and-partial-autocorrelation-explained-1dd641e3076f" target="_blank" rel="noreferrer noopener nofollow">https://towardsdatascience.com/time-series-from-scratch-autocorrelation-and-partial-autocorrelation-explained-1dd641e3076f</a></li>



<li><a href="https://eng.uber.com/forecasting-introduction/" target="_blank" rel="noreferrer noopener nofollow">https://eng.uber.com/forecasting-introduction/</a></li>



<li><a href="https://www.capitalone.com/tech/machine-learning/understanding-arima-models/" target="_blank" rel="noreferrer noopener nofollow">https://www.capitalone.com/tech/machine-learning/understanding-arima-models/</a></li>



<li><a href="https://medium.com/analytics-vidhya/why-you-should-not-use-arima-to-forecast-demand-196cc8b8df3d" target="_blank" rel="noreferrer noopener nofollow">https://medium.com/analytics-vidhya/why-you-should-not-use-arima-to-forecast-demand-196cc8b8df3d</a></li>
</ol>



]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">4893</post-id>	</item>
		<item>
		<title>Anomaly Detection in Time Series</title>
		<link>https://neptune.ai/blog/anomaly-detection-in-time-series</link>
		
		<dc:creator><![CDATA[Aayush Bajaj]]></dc:creator>
		<pubDate>Thu, 21 Jul 2022 14:21:40 +0000</pubDate>
				<category><![CDATA[ML Model Development]]></category>
		<category><![CDATA[Time Series]]></category>
		<guid isPermaLink="false">https://neptune.test/anomaly-detection-in-time-series/</guid>

					<description><![CDATA[Time series are everywhere! In user behavior on a website, or stock prices of a Fortune 500 company, or any other time-related example. Time series data is evident in every industry in some shape or form.&#160; Naturally, it’s also one of the most researched types of data. As a rule of thumb, you could say&#8230;]]></description>
										<content:encoded><![CDATA[
<p>Time series are everywhere! In user behavior on a website, or stock prices of a Fortune 500 company, or any other time-related example. Time series data is evident in every industry in some shape or form.&nbsp;</p>



<p>Naturally, it’s also one of the most researched types of data. As a rule of thumb, you could say time series is a type of data that’s sampled based on some kind of time-related dimension like years, months, or seconds.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Time series are observations that have been recorded in an orderly fashion and which are correlated in time.</p>
</blockquote>



<p>While analyzing time series data, we have to watch out for outliers, much as we do in static data. If you’ve worked with data in any capacity, you know how much pain outliers cause for an analyst. These outliers are called “anomalies” in time series jargon.</p>



<section id="blog-intext-cta-block_d1a910ad268e0870113ff7d0b1973d7b" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-read-also">Read also </h3>
    
            <p><a href="/blog/time-series-prediction-vs-machine-learning" target="_blank" rel="noopener">Time Series Prediction: How Is It Different From Other Machine Learning?</a></p>
    
    </section>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-what-are-anomalies-outliers-and-types-of-anomalies-in-time-series-data">What are anomalies/outliers and types of anomalies in time-series data?</h2>



<p>From a traditional point of view, an outlier/anomaly is:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>“An observation which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism.”</p>
</blockquote>



<p>Therefore, you can think of outliers as observations that don’t follow the expected behavior.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series-outliers.png?ssl=1" alt="Time series outliers" class="wp-image-42429"/></figure>
</div>


<p>As the figure above shows, outliers in time series can have two different meanings.&nbsp; The semantic distinction between them is mainly based on your interest as the analyst, or the particular scenario.&nbsp;</p>



<p>In the first sense, outliers are noise, errors, or unwanted data that aren’t interesting to the analyst in themselves. In these cases, outliers should be deleted or corrected to improve data quality and generate a cleaner dataset that can be used by other data mining algorithms. For example, sensor transmission errors are eliminated to obtain more accurate predictions, because the main goal is to make predictions.</p>



<p>Nevertheless, in recent years &#8211; especially in the area of time series data &#8211; many researchers have aimed to detect and analyze unusual, but interesting phenomena. <em>Fraud detection</em> is a good example &#8211; the main objective is to detect and analyze the outlier itself. These observations are often referred to as anomalies.</p>



<p>The anomaly detection problem for time series is usually formulated as <em>identifying outlier data points relative to some norm or usual signal</em>. Take a look at some outlier types:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series-outliers-types.png?ssl=1" alt="Time series outliers types" class="wp-image-42430" style="width:587px;height:377px"/></figure>
</div>


<p>Let’s break this down one by one:</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-point-outlier">Point outlier</h3>



<p>A point outlier is a datum that behaves unusually at a specific time instant when compared either to the other values in the time series (global outlier) or to its neighboring points (local outlier).</p>



<p>Example: are you aware of the GameStop frenzy? A slew of young retail investors bought GME stock to get back at big hedge funds, driving the stock price way up. That sudden, short-lived spike caused by an unlikely event is an <strong>additive (point) outlier</strong>. Any unexpected jump of a time-based value over a short period (it looks like a sudden spike) counts as an additive outlier.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series-point-outlier.png?ssl=1" alt="Time series point outlier" class="wp-image-42431" style="width:640px;height:392px"/><figcaption class="wp-element-caption"><em>Source: Google</em></figcaption></figure>
</div>


<p>Point outliers can be <em>univariate</em> or <em>multivariate,</em> depending on whether they affect one or more time-dependent variables, respectively.&nbsp;</p>



<p>Fig. 1a contains two univariate point outliers, O1 and O2, whereas the multivariate time series in Fig. 1b is composed of three variables and has both univariate (O3) and multivariate (O1 and O2) point outliers.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Point-outliers-in-time-series.png?ssl=1" alt="Point outliers in time series" class="wp-image-42432" style="width:750px;height:301px"/><figcaption class="wp-element-caption"><br><em>Fig: 1 — Point outliers in time series data. | <a href="https://arxiv.org/pdf/2002.04236.pdf" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<p>We will take a deeper look at Univariate Point Outliers in the Anomaly Detection section.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-subsequence-outlier">Subsequence outlier</h3>



<p>A subsequence outlier is a run of consecutive points in time whose joint behavior is unusual, although each observation individually is not necessarily a point outlier. Subsequence outliers can also be global or local, and can affect one (univariate subsequence outlier) or more (multivariate subsequence outlier) time-dependent variables.&nbsp;</p>



<p>Fig. 2 provides an example of univariate (O1 and O2 in Fig. 2a, and O3 in Fig. 2b) and multivariate (O1 and O2 in Fig. 2b) subsequence outliers. Note that the latter does not necessarily affect all the variables (e.g., O2 in Fig. 2b).</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/%E2%80%8ASubsequence-outliers-in-time-series.png?ssl=1" alt=" Subsequence outliers in time series " class="wp-image-42433"/><figcaption class="wp-element-caption"><em>Fig: 2 — Subsequence outliers in time series data. | <a href="https://arxiv.org/pdf/2002.04236.pdf" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>
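<p>A subsequence outlier can be illustrated with a small sketch: cut a periodic signal into windows and measure how far each window is from the typical window shape. The signal, the period of 25, and the distance measure are assumptions made for this example only.</p>

```python
import numpy as np

# Synthetic periodic signal (an assumption for illustration): 20 cycles of 25 points each.
period = 25
t = np.arange(20 * period)
signal = np.sin(2 * np.pi * t / period)
# Planted subsequence outlier: cycle 12 goes flat. Every value stays inside the
# normal range [-1, 1], so no single point stands out on its own.
signal[12 * period:13 * period] = 0.0

# Cut the series into non-overlapping windows of one period each.
windows = signal.reshape(-1, period)
# Typical shape of a cycle: the point-wise median across all windows.
typical = np.median(windows, axis=0)
# Euclidean distance of every window from the typical shape.
distance = np.linalg.norm(windows - typical, axis=1)
most_unusual_window = int(np.argmax(distance))
```

<p>Here the flat cycle is the window with the largest distance, even though none of its individual values would be flagged as a point outlier.</p>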


<h2 class="wp-block-heading" class="wp-block-heading" id="h-anomaly-detection-techniques-in-time-series-data">Anomaly detection techniques in time series data</h2>



<p>There are a few techniques that analysts can employ to identify different anomalies in data. They range from a basic statistical decomposition all the way up to autoencoders. Let’s start with the basic one, and understand how and why it’s useful.</p>



<section id="blog-intext-cta-block_60def3edab13faeffdbc19d665e4d9ea" class="block-blog-intext-cta  c-box c-box--default c-box--dark c-box--no-hover c-box--standard ">

            <h3 class="block-blog-intext-cta__header" class="block-blog-intext-cta__header" id="h-note">Note</h3>
    
            <p><span style="font-weight: 400;">  Here you can find the </span><a href="https://app.neptune.ai/theaayushbajaj/Anomaly-Detection/n/49ba1752-fc3a-4abb-b35f-0e2ea4fd4afa/48dc19d8-3c75-4989-a2c0-67839393a093" target="_blank" rel="noopener"><span style="font-weight: 400;">notebook </span></a><span style="font-weight: 400;">and the </span><a href="https://drive.google.com/drive/folders/1vsLzhpgNbVPsYvBIFNI_20S3LRZuNYhz?usp=sharing" target="_blank" rel="noopener"><span style="font-weight: 400;">data </span></a>used in the article</p>
    
    </section>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-stl-decomposition">STL decomposition</h3>



<p><a href="http://www.wessa.net/download/stl.pdf" target="_blank" rel="noreferrer noopener nofollow">STL</a> stands for seasonal-trend decomposition procedure based on LOESS. This technique gives you the ability to split your time series signal into three parts: <strong>seasonal, trend, and residue</strong>.</p>



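<p>It works for seasonal time series, which are also the most common type of time series data. To generate a decomposition plot, we just use the ever-amazing <em>statsmodels</em> to do the heavy lifting for us. Note that the snippet below calls <em>seasonal_decompose</em>, which performs a classical moving-average decomposition rather than LOESS-based STL; statsmodels also ships an <em>STL</em> class in the same module, and both return the same three components.</p>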



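<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> plt
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.seasonal <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> seasonal_decompose

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># lim_catfish_sales is the date-indexed catfish sales series prepared in the notebook</span>
plt.rc(<span class="hljs-string" style="color: rgb(221, 17, 68);">'figure'</span>,figsize=(<span class="hljs-number" style="color: teal;">12</span>,<span class="hljs-number" style="color: teal;">8</span>))
plt.rc(<span class="hljs-string" style="color: rgb(221, 17, 68);">'font'</span>,size=<span class="hljs-number" style="color: teal;">15</span>)
result = seasonal_decompose(lim_catfish_sales,model=<span class="hljs-string" style="color: rgb(221, 17, 68);">'additive'</span>)
fig = result.plot()</pre>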


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/STL-decomposition.png?ssl=1" alt="STL decomposition" class="wp-image-42436"/><figcaption class="wp-element-caption"><em>This is Catfish sales data from 1996–2000 with an anomaly introduced in Dec-1998</em></figcaption></figure>
</div>


<p>If we analyze the deviation of <strong>residue</strong> and introduce some threshold for it, we’ll get an anomaly detection algorithm. To implement this, we only need the residue data from the decomposition.</p>



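<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.dates <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> mdates
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> plt

plt.rc(<span class="hljs-string" style="color: rgb(221, 17, 68);">'figure'</span>,figsize=(<span class="hljs-number" style="color: teal;">12</span>,<span class="hljs-number" style="color: teal;">6</span>))
plt.rc(<span class="hljs-string" style="color: rgb(221, 17, 68);">'font'</span>,size=<span class="hljs-number" style="color: teal;">15</span>)
fig, ax = plt.subplots()
x = result.resid.index
y = result.resid.values
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># ax.plot handles datetime x-values directly; plot_date is deprecated in recent Matplotlib releases</span>
ax.plot(x, y, marker=<span class="hljs-string" style="color: rgb(221, 17, 68);">'o'</span>, color=<span class="hljs-string" style="color: rgb(221, 17, 68);">'black'</span>,linestyle=<span class="hljs-string" style="color: rgb(221, 17, 68);">'--'</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># position 35 is the planted Dec-1998 anomaly, assuming monthly data starting in Jan-1996</span>
ax.annotate(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Anomaly'</span>, (mdates.date2num(x[<span class="hljs-number" style="color: teal;">35</span>]), y[<span class="hljs-number" style="color: teal;">35</span>]), xytext=(<span class="hljs-number" style="color: teal;">30</span>, <span class="hljs-number" style="color: teal;">20</span>),
          textcoords=<span class="hljs-string" style="color: rgb(221, 17, 68);">'offset points'</span>, color=<span class="hljs-string" style="color: rgb(221, 17, 68);">'red'</span>,arrowprops=dict(facecolor=<span class="hljs-string" style="color: rgb(221, 17, 68);">'red'</span>,arrowstyle=<span class="hljs-string" style="color: rgb(221, 17, 68);">'fancy'</span>))
fig.autofmt_xdate()
plt.show()</pre>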


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Residue-from-STL-decomposition.png?ssl=1" alt="Residue from STL decomposition" class="wp-image-42437"/><figcaption class="wp-element-caption"><em>Residue from the above STL decomposition</em></figcaption></figure>
</div>
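<p>The thresholding step itself can be sketched in a few lines. The helper name <em>flag_residual_anomalies</em>, the median/MAD rule, and the default of 3 are assumptions chosen for illustration; the article does not prescribe a specific threshold.</p>

```python
import numpy as np

def flag_residual_anomalies(resid, n_sigmas=3.0):
    """Return a boolean mask marking residuals far from the typical residual."""
    resid = np.asarray(resid, dtype=float)
    # seasonal_decompose leaves NaN at both ends of the residual, so ignore them.
    center = np.nanmedian(resid)
    # Median absolute deviation, rescaled to be comparable to a standard deviation.
    spread = 1.4826 * np.nanmedian(np.abs(resid - center))
    # Comparisons with NaN are False, so the padded ends are never flagged.
    return np.abs(resid - center) > n_sigmas * spread

# Toy residuals with one obvious spike at position 4.
toy_resid = [np.nan, 0.1, -0.2, 0.05, 5.0, -0.1, 0.15, np.nan]
mask = flag_residual_anomalies(toy_resid)
```

<p>With the decomposition above, the same helper could be called as <em>flag_residual_anomalies(result.resid)</em>. The median and MAD are used instead of the mean and standard deviation so that the anomaly itself does not inflate the threshold.</p>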


<p><strong><em>Pros</em></strong></p>



<p>It’s simple, robust, it can handle a lot of different situations, and all anomalies can still be intuitively interpreted.</p>



<p><strong><em>Cons</em></strong></p>



<p>The biggest downside of this technique is how few tuning options it offers. Apart from the threshold and maybe the confidence interval, there isn’t much you can adjust. For example, suppose you’re tracking users on a website that was closed to the public and then suddenly opened. In this case, you should track anomalies that occur before and after the launch separately.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-classification-and-regression-trees-cart">Classification and Regression Trees (CART)</h3>



<p>We can utilize the power and robustness of Decision Trees to identify outliers/anomalies in time series data.</p>



<ul class="wp-block-list">
<li>First, you can use supervised learning to teach trees to classify anomaly and non-anomaly data points. In order to do that, we’d need to have labeled anomaly data points, which you won’t find often outside of toy datasets.</li>



<li>Unsupervised is what you need! We can use the Isolation Forest algorithm to predict whether a certain point is an outlier or not, without the help of any labeled dataset. Let’s see how.</li>
</ul>



<p>The main idea, which is different from other popular outlier detection methods, is that Isolation Forest explicitly identifies anomalies instead of profiling normal data points. Isolation Forest, like any tree ensemble method, is based on decision trees.</p>



<p>In other words, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html" target="_blank" rel="noreferrer noopener nofollow">Isolation Forest</a> detects anomalies purely based on the fact that anomalies are data points that are few and different. Anomalies are isolated without employing any distance or density measure.</p>
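<p>The "few and different" intuition can be shown with a toy, one-dimensional version of the isolation idea: keep splitting the data at random values and count how many splits it takes until a point is alone. This is a simplified sketch written for this explanation (the helper <em>isolation_depth</em> is hypothetical), not scikit-learn's implementation.</p>

```python
import numpy as np

def isolation_depth(values, target, rng):
    """Count random splits needed until `target` is alone in its partition."""
    depth = 0
    subset = np.asarray(values, dtype=float)
    while len(subset) > 1:
        low, high = subset.min(), subset.max()
        if low == high:  # identical values cannot be separated any further
            break
        split = rng.uniform(low, high)
        # Keep only the side of the split that still contains the target.
        keep = (subset >= split) == (target >= split)
        subset = subset[keep]
        depth += 1
    return depth

rng = np.random.default_rng(0)
normal_points = rng.normal(0.0, 1.0, 100)
data = np.append(normal_points, 8.0)  # one point that is "few and different"
inlier = normal_points[np.argmin(np.abs(normal_points))]  # the most central point

# Average over many random trees, as a forest would.
outlier_depth = np.mean([isolation_depth(data, 8.0, rng) for _ in range(200)])
inlier_depth = np.mean([isolation_depth(data, inlier, rng) for _ in range(200)])
```

<p>The isolated point needs far fewer splits on average than the central one, and that short path length is what the algorithm turns into an anomaly score.</p>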



<ul class="wp-block-list">
<li>When applying an <a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html" target="_blank" rel="noreferrer noopener nofollow">IsolationForest</a> model, we set contamination = outliers_fraction, which tells the model what proportion of outliers is present in the data. This value is tuned by trial and error.</li>



<li>fit(data) followed by predict(data) performs outlier detection on the data and returns 1 for normal points and -1 for anomalies.</li>



<li>Finally, we visualize anomalies with the Time Series view.</li>
</ul>
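<p>Before applying these steps to the catfish data, here is a minimal sketch of the same API on toy data. The data, the contamination value of 0.05, and the fixed <em>random_state</em> are assumptions for illustration.</p>

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Toy data (an assumption for illustration): 199 ordinary values and one extreme value at the end.
X = np.append(rng.normal(0.0, 1.0, 199), 50.0).reshape(-1, 1)

forest = IsolationForest(contamination=0.05, random_state=0)
labels = forest.fit_predict(X)     # 1 = normal, -1 = anomaly
scores = forest.score_samples(X)   # lower score = easier to isolate = more anomalous
```

<p>The labels are just a thresholded view of <em>scores</em>; inspecting the scores directly can help when picking a contamination value.</p>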



<p>Let&#8217;s do it step by step. First, visualize the time series data:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">plt.rc(<span class="hljs-string" style="color: rgb(221, 17, 68);">'figure'</span>,figsize=(<span class="hljs-number" style="color: teal;">12</span>,<span class="hljs-number" style="color: teal;">6</span>))
plt.rc(<span class="hljs-string" style="color: rgb(221, 17, 68);">'font'</span>,size=<span class="hljs-number" style="color: teal;">15</span>)
catfish_sales.plot()</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series-anomalies.png?ssl=1" alt="Time series anomalies" class="wp-image-42438"/><figcaption class="wp-element-caption"><em>The same Catfish Sales data but with different (multiple) anomalies introduced</em></figcaption></figure>
</div>


<p>Next, we need to set some parameters like the outlier fraction, and train our IsolationForest model. We can utilize the super useful scikit-learn to implement the Isolation Forest algorithm. You can find the complete notebook with code and other stuff <a href="https://ui.neptune.ai/theaayushbajaj/Anomaly-Detection/n/Anomaly-Detection-49ba1752-fc3a-4abb-b35f-0e2ea4fd4afa">here</a>.</p>



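<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> pandas <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> pd
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.ensemble <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> IsolationForest
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.preprocessing <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> StandardScaler

outliers_fraction = <span class="hljs-number" style="color: teal;">0.01</span>
scaler = StandardScaler()
np_scaled = scaler.fit_transform(catfish_sales.values.reshape(<span class="hljs-number" style="color: teal;">-1</span>, <span class="hljs-number" style="color: teal;">1</span>))
data = pd.DataFrame(np_scaled)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># train isolation forest</span>
model = IsolationForest(contamination=outliers_fraction)
model.fit(data)</pre>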



<p>Lastly, we need to visualize how the prediction was.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">catfish_sales[<span class="hljs-string" style="color: rgb(221, 17, 68);">'anomaly'</span>] = model.predict(data)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># visualization</span>
fig, ax = plt.subplots(figsize=(<span class="hljs-number" style="color: teal;">10</span>,<span class="hljs-number" style="color: teal;">6</span>))
a = catfish_sales.loc[catfish_sales[<span class="hljs-string" style="color: rgb(221, 17, 68);">'anomaly'</span>] == <span class="hljs-number" style="color: teal;">-1</span>, [<span class="hljs-string" style="color: rgb(221, 17, 68);">'Total'</span>]] <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#anomaly</span>
ax.plot(catfish_sales.index, catfish_sales[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Total'</span>], color=<span class="hljs-string" style="color: rgb(221, 17, 68);">'black'</span>, label = <span class="hljs-string" style="color: rgb(221, 17, 68);">'Normal'</span>)
ax.scatter(a.index,a[<span class="hljs-string" style="color: rgb(221, 17, 68);">'Total'</span>], color=<span class="hljs-string" style="color: rgb(221, 17, 68);">'red'</span>, label = <span class="hljs-string" style="color: rgb(221, 17, 68);">'Anomaly'</span>)
plt.legend()
plt.show();</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Anomaly-Detection-Isolation-Forest.png?ssl=1" alt="Anomaly Detection Isolation Forest" class="wp-image-42439"/><figcaption class="wp-element-caption"><em>Anomaly Detection using Isolation Forest algorithm</em></figcaption></figure>
</div>


<p>As you can see, the algorithm did a pretty good job in identifying our planted anomalies, but it also labeled a few points at the start as “outlier”. This is due to two reasons:</p>



<ul class="wp-block-list">
<li>The model only sees the scaled sales values and has no notion of time. The points at the start of the series are likely among the most extreme values in the whole data set, so they are easy to isolate and get flagged even though they are normal for their period.</li>



<li>If you see many false positives, that means your <strong>contamination</strong> parameter is too high. Conversely, if you don’t see the red dots where they should be, the <strong>contamination</strong> parameter is set too low.</li>
</ul>
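<p>The effect of the <strong>contamination</strong> parameter is easy to check empirically. The sketch below refits the same forest with three values on toy data; the data and the candidate values are assumptions for illustration.</p>

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Toy data (an assumption for illustration): 297 ordinary values plus 3 planted anomalies.
X = np.concatenate([rng.normal(0.0, 1.0, 297), [15.0, -12.0, 20.0]]).reshape(-1, 1)

flagged_counts = {}
for contamination in (0.01, 0.05, 0.10):
    # Same random_state, so only the decision threshold changes between runs.
    forest = IsolationForest(contamination=contamination, random_state=0)
    flagged_counts[contamination] = int((forest.fit_predict(X) == -1).sum())
```

<p>Only about 1% of this data is truly anomalous, so the larger settings necessarily flag ordinary points as well. Comparing the flagged count with how many anomalies you expect is a quick sanity check.</p>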



<p><strong><em>Pros</em></strong></p>



<p>The biggest advantage of this technique is you can introduce as many random variables or features as you like to make more sophisticated models.</p>



<p><strong><em>Cons</em></strong></p>



<p>The weakness is that a growing number of features can start to impact your computational performance fairly quickly. In this case, you should select features carefully.</p>
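<p>As a rough sketch of what "introducing features" can look like for a single series, the block below derives a few extra columns that could be passed to the Isolation Forest instead of the raw value alone. The synthetic data and the specific features (lags, a rolling mean, the calendar month) are assumptions for illustration, not taken from the notebook.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series; the column name 'Total' mirrors the catfish data used above.
index = pd.date_range("1996-01-01", periods=48, freq="MS")
sales = pd.DataFrame({"Total": np.linspace(100.0, 147.0, 48)}, index=index)

features = pd.DataFrame(index=sales.index)
features["value"] = sales["Total"]
features["lag_1"] = sales["Total"].shift(1)     # previous month
features["lag_12"] = sales["Total"].shift(12)   # same month one year earlier
# Recent level, computed from past months only so the current value does not leak in.
features["rolling_mean_3"] = sales["Total"].shift(1).rolling(3).mean()
features["month"] = sales.index.month           # calendar position, a proxy for seasonality
features = features.dropna()                    # the first 12 rows lack a full history
```

<p>Each added column costs rows (through <em>dropna</em>) and computation, which is the trade-off described above.</p>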



<h3 class="wp-block-heading" class="wp-block-heading" id="h-detection-using-forecasting">Detection using Forecasting</h3>



<p>Anomaly detection using forecasting is based on the idea that several points from the past generate a forecast of the next point, plus some random variable, which is usually white noise.&nbsp;</p>



<p>As you can imagine, forecasted points are in turn used to generate further points, and so on. The obvious effect on the forecast horizon is that the signal gets smoother.</p>



<p>The difficult part of using this method is that you have to select the number of differences, the number of autoregressive terms, and the number of forecast-error (moving-average) terms; these are the d, p, and q orders of an ARIMA model.</p>



<p><em>Each time you work with a new signal, you should build a new forecasting model.</em></p>



<p>Another obstacle is that your signal should be stationary after differencing. In simple words, it means the statistical properties of your signal, such as its mean and variance, shouldn’t depend on time, which is a significant constraint.</p>
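<p>A quick way to see what differencing does is to compare the mean of the first and second half of a signal before and after taking first differences. The trending toy signal and the helper <em>half_mean_gap</em> are assumptions made for this sketch; a formal check would use a statistical stationarity test instead.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
# Trending signal (an assumption for illustration): its mean clearly depends on time.
t = np.arange(200)
signal = 0.5 * t + rng.normal(0.0, 1.0, 200)

def half_mean_gap(x):
    """Absolute difference between the mean of the first and second half of x."""
    half = len(x) // 2
    return abs(x[:half].mean() - x[half:].mean())

gap_raw = half_mean_gap(signal)            # large: the level drifts upward
gap_diff = half_mean_gap(np.diff(signal))  # small: first differences hover around the slope 0.5
```

<p>After one round of differencing the mean no longer drifts, which is the kind of behavior ARIMA-style models rely on.</p>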



<p>We can utilize different forecasting methods such as Moving Averages, Autoregressive approach, and ARIMA with its different variants. The procedure for detecting anomalies with ARIMA is:</p>



<ul class="wp-block-list">
<li>Predict each new point from past data and compute the magnitude of the difference between the prediction and the observed value.</li>



<li>Choose a threshold and identify anomalies based on that difference threshold. That&#8217;s it!</li>
</ul>
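<p>The two steps above can be sketched with the simplest forecaster mentioned earlier, a moving average, standing in for ARIMA. The toy series, the window of 5, and the threshold of 4 robust standard deviations are assumptions for illustration.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
# Toy series (an assumption for illustration) with one planted spike.
series = pd.Series(10.0 + rng.normal(0.0, 0.5, 120))
series.iloc[80] += 6.0

# Step 1: one-step-ahead forecast (mean of the previous 5 observations) and its error.
forecast = series.shift(1).rolling(window=5).mean()
error = (series - forecast).abs()

# Step 2: threshold on the forecast error, using a robust estimate of its spread.
scale = 1.4826 * (error - error.median()).abs().median()
threshold = error.median() + 4.0 * scale
anomalies = series.index[error > threshold].tolist()
```

<p>One caveat of this simple forecaster: the spike enters the next five forecasts, so a few points right after it may be flagged as well. Model-based forecasters can suffer from the same contamination to a lesser degree.</p>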



<p>To test this technique, we’re gonna use a popular time series module called <strong><em>fbprophet</em></strong> (published as <em>prophet</em> since version 1.0). This module models trend and seasonality explicitly, and can be tuned with some hyper-parameters.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Multiple-anomailies.png?ssl=1" alt="Multiple anomailies" class="wp-image-42440"/><figcaption class="wp-element-caption"><em>The same Catfish Sales data but with different (multiple) anomalies introduced</em></figcaption></figure>
</div>


<p>We’ll utilize the same data as we did above with the same anomalies. First, let&#8217;s import it and make it ready for the environment:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># the package was called fbprophet before version 1.0</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> prophet <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> Prophet</pre>



<p>Now let&#8217;s define the forecasting function. An important thing to note here is that <em>fbprophet </em>will add some additional metrics as features, in order to help identify anomalies better. For example, the predicted time series variable (by the model), the upper and lower limit of the target time series variable, and the trend metric.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-function"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">def</span> <span class="hljs-title" style="color: rgb(153, 0, 0); font-weight: 700;">fit_predict_model</span><span class="hljs-params">(dataframe, interval_width = <span class="hljs-number" style="color: teal;">0.99</span>, changepoint_range = <span class="hljs-number" style="color: teal;">0.8</span>)</span>:</span>
   m = Prophet(daily_seasonality = <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">False</span>, yearly_seasonality = <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">False</span>, weekly_seasonality = <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">False</span>,
               seasonality_mode = <span class="hljs-string" style="color: rgb(221, 17, 68);">'additive'</span>,
               interval_width = interval_width,
               changepoint_range = changepoint_range)
   m = m.fit(dataframe)
   forecast = m.predict(dataframe)
   forecast[<span class="hljs-string" style="color: rgb(221, 17, 68);">'fact'</span>] = dataframe[<span class="hljs-string" style="color: rgb(221, 17, 68);">'y'</span>].reset_index(drop = <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>)
   <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">return</span> forecast

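<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># t must be a DataFrame with the two columns Prophet expects: 'ds' (dates) and 'y' (values)</span>
pred = fit_predict_model(t)</pre>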



<p><strong>We now have to push the <em>pred </em>variable</strong> to another function, which will detect anomalies based on a threshold of lower and upper limit in the time series variable.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-function"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">def</span> <span class="hljs-title" style="color: rgb(153, 0, 0); font-weight: 700;">detect_anomalies</span><span class="hljs-params">(forecast)</span>:</span>
   forecasted = forecast[[<span class="hljs-string" style="color: rgb(221, 17, 68);">'ds'</span>,<span class="hljs-string" style="color: rgb(221, 17, 68);">'trend'</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'yhat'</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'yhat_lower'</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'yhat_upper'</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'fact'</span>]].copy()
   <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># 1: above the upper bound, -1: below the lower bound, 0: inside the interval</span>
   forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'anomaly'</span>] = <span class="hljs-number" style="color: teal;">0</span>
   forecasted.loc[forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'fact'</span>] &gt; forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'yhat_upper'</span>], <span class="hljs-string" style="color: rgb(221, 17, 68);">'anomaly'</span>] = <span class="hljs-number" style="color: teal;">1</span>
   forecasted.loc[forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'fact'</span>] &lt; forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'yhat_lower'</span>], <span class="hljs-string" style="color: rgb(221, 17, 68);">'anomaly'</span>] = <span class="hljs-number" style="color: teal;">-1</span>
   <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># anomaly importance: how far the observation falls outside the interval, relative to its value</span>
   forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'importance'</span>] = <span class="hljs-number" style="color: teal;">0.0</span>
   forecasted.loc[forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'anomaly'</span>] == <span class="hljs-number" style="color: teal;">1</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'importance'</span>] = \
       (forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'fact'</span>] - forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'yhat_upper'</span>])/forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'fact'</span>]
   forecasted.loc[forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'anomaly'</span>] == <span class="hljs-number" style="color: teal;">-1</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'importance'</span>] = \
       (forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'yhat_lower'</span>] - forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'fact'</span>])/forecasted[<span class="hljs-string" style="color: rgb(221, 17, 68);">'fact'</span>]

   <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">return</span> forecasted

pred = detect_anomalies(pred)</pre>



<p>Finally, we just need to plot the above predictions and visualize the anomalies.</p>
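<p>One possible way to draw such a plot with Matplotlib is sketched below. The helper <em>plot_anomalies</em> and the tiny hand-made frame are assumptions for illustration; the frame only mimics the columns that <em>detect_anomalies</em> returns, and the original figure may have been produced differently.</p>

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_anomalies(pred):
    """Plot observed values, the forecast band, and the flagged points."""
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.plot(pred["ds"], pred["fact"], color="black", label="Observed")
    ax.plot(pred["ds"], pred["yhat"], color="tab:blue", label="Forecast")
    ax.fill_between(pred["ds"], pred["yhat_lower"], pred["yhat_upper"],
                    color="tab:blue", alpha=0.2, label="Uncertainty interval")
    flagged = pred[pred["anomaly"] != 0]  # both +1 and -1 mark anomalies
    ax.scatter(flagged["ds"], flagged["fact"], color="red", zorder=3, label="Anomaly")
    ax.legend()
    fig.autofmt_xdate()
    return fig, ax

# Hand-made stand-in for the detect_anomalies output (illustration only).
toy = pd.DataFrame({
    "ds": pd.date_range("2000-01-01", periods=5, freq="MS"),
    "fact": [10.0, 11.0, 30.0, 12.0, 13.0],
    "yhat": [10.0, 11.0, 12.0, 12.5, 13.0],
    "yhat_lower": [8.0, 9.0, 10.0, 10.5, 11.0],
    "yhat_upper": [12.0, 13.0, 14.0, 14.5, 15.0],
    "anomaly": [0, 0, 1, 0, 0],
})
fig, ax = plot_anomalies(toy)
```

<p>With the real output, the call would simply be <em>plot_anomalies(pred)</em> followed by <em>plt.show()</em>.</p>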


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Anomaly-detection.png?ssl=1" alt="Anomaly detection" class="wp-image-42441"/></figure>
</div>


<p><strong><em>Pros</em></strong></p>



<p>This algorithm nicely handles different seasonality parameters like monthly or yearly, and its output natively includes the forecast, its trend, and the upper and lower bounds for every time step.&nbsp;</p>



<p>If you look closely, this algorithm handles edge cases better than the Isolation Forest algorithm.</p>



<p><strong><em>Cons</em></strong></p>



<p>Since this technique is based on forecasting, it will struggle in limited data scenarios. The quality of prediction in limited data will be lower, and so will the accuracy of anomaly detection.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-clustering-based-anomaly-detection">Clustering-based anomaly detection</h3>



<p>So far, we’ve looked at the IsolationForest algorithm as our unsupervised way of anomaly detection. Now, we’ll look into another unsupervised technique: Clustering!&nbsp;</p>



<p>The approach is pretty straightforward. Data instances that fall outside of defined clusters could potentially be marked as anomalies. We’re gonna use k-means clustering, because why not!</p>



<p>For the sake of visualizations, we’ll use a different dataset that corresponds to a multivariate time series, that is, one with several time-based variables. The dataset will be a subset of the one found <a href="https://www.kaggle.com/c/expedia-personalized-sort/data" target="_blank" rel="noreferrer noopener nofollow">here</a> (columns/features are the same).</p>



<p><em>Dataset Description: Data contains information on shopping and purchase as well as information on price competitiveness.</em></p>



<p>Now in order to process k-means, first we need to know the number of clusters we’re gonna be dealing with. <em>The Elbow Method</em> works pretty efficiently for this.</p>



<p><em>The Elbow method plots the number of clusters against the variance explained (the objective or score); the "elbow" is the point where adding clusters stops improving the score noticeably.</em></p>



<p>To implement this, we’ll use scikit-learn’s implementation of K-means.</p>



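<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.cluster <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> KMeans

data = df[[<span class="hljs-string" style="color: rgb(221, 17, 68);">'price_usd'</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'srch_booking_window'</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'srch_saturday_night_bool'</span>]]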
n_cluster = range(<span class="hljs-number" style="color: teal;">1</span>, <span class="hljs-number" style="color: teal;">20</span>)
kmeans = [KMeans(n_clusters=i).fit(data) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> i <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> n_cluster]
scores = [kmeans[i].score(data) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> i <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> range(len(kmeans))]
fig, ax = plt.subplots(figsize=(<span class="hljs-number" style="color: teal;">10</span>,<span class="hljs-number" style="color: teal;">6</span>))
ax.plot(n_cluster, scores)
plt.xlabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Number of Clusters'</span>)
plt.ylabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Score'</span>)
plt.title(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Elbow Curve'</span>)
plt.show();</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Elbow-method.png?ssl=1" alt="Elbow method" class="wp-image-42442"/></figure>
</div>


<p>From the above elbow curve, we see that the graph levels off after 10 clusters, implying that adding more clusters does not explain much more of the variance in our relevant variable, in this case price_usd.</p>



<p>We set n_clusters=10, and upon generating the k-means output, use the data to plot the 3D clusters.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Anomaly-detection-k-means.png?ssl=1" alt="Anomaly detection k means" class="wp-image-42443"/></figure>
</div>


<p><strong>Now we need to find</strong> out the number of principal components (derived features) to keep.</p>



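<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.preprocessing <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> StandardScaler

data = df[[<span class="hljs-string" style="color: rgb(221, 17, 68);">'price_usd'</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'srch_booking_window'</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'srch_saturday_night_bool'</span>]]
X = data.values
X_std = StandardScaler().fit_transform(X)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Calculating eigenvectors and eigenvalues of the covariance matrix</span>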
mean_vec = np.mean(X_std, axis=<span class="hljs-number" style="color: teal;">0</span>)
cov_mat = np.cov(X_std.T)
eig_vals, eig_vecs = np.linalg.eig(cov_mat)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Create a list of (eigenvalue, eigenvector) tuples</span>
eig_pairs = [ (np.abs(eig_vals[i]),eig_vecs[:,i]) <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> i <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> range(len(eig_vals))]
eig_pairs.sort(key = <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">lambda</span> x: x[<span class="hljs-number" style="color: teal;">0</span>], reverse= <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Calculation of Explained Variance from the eigenvalues</span>
tot = sum(eig_vals)
var_exp = [(i/tot)*<span class="hljs-number" style="color: teal;">100</span> <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> i <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> sorted(eig_vals, reverse=<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">True</span>)] <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Individual explained variance</span>
cum_var_exp = np.cumsum(var_exp) <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Cumulative explained variance</span>
plt.figure(figsize=(<span class="hljs-number" style="color: teal;">10</span>, <span class="hljs-number" style="color: teal;">5</span>))
plt.bar(range(len(var_exp)), var_exp, alpha=<span class="hljs-number" style="color: teal;">0.3</span>, align=<span class="hljs-string" style="color: rgb(221, 17, 68);">'center'</span>, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">'individual explained variance'</span>, color = <span class="hljs-string" style="color: rgb(221, 17, 68);">'y'</span>)
plt.step(range(len(cum_var_exp)), cum_var_exp, where=<span class="hljs-string" style="color: rgb(221, 17, 68);">'mid'</span>,label=<span class="hljs-string" style="color: rgb(221, 17, 68);">'cumulative explained variance'</span>)
plt.ylabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Explained variance ratio'</span>)
plt.xlabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Principal components'</span>)
plt.legend(loc=<span class="hljs-string" style="color: rgb(221, 17, 68);">'best'</span>)
plt.show();</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Anomaly-detection-components.png?ssl=1" alt="Anomaly detection components" class="wp-image-42444"/></figure>
</div>


<p>We see that the first component explains almost 50% of the variance. The second component explains over 30%. However, notice that almost none of the components are really negligible. The first 2 components contain over 80% of the information. So, we will set n_components=2.</p>
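<p>The same reduction can be done with scikit-learn's <em>PCA</em> class instead of the manual eigen-decomposition. The sketch below uses synthetic data as a stand-in for the three columns above, so the numbers it produces are not the article's.</p>

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
# Stand-in for the three columns used above (synthetic, for illustration only).
X = rng.normal(size=(500, 3))
X[:, 1] += 0.8 * X[:, 0]   # make two of the features correlated

X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_std)         # shape: (n_samples, 2)
explained = pca.explained_variance_ratio_    # share of variance kept by each component
```

<p>Here <em>explained.sum()</em> plays the role of the cumulative explained variance read off the chart above.</p>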



<p>The underlying assumption in the clustering-based anomaly detection is that if we cluster the data, normal data will belong to clusters while anomalies will not belong to any clusters, or belong to small clusters.&nbsp;</p>



<p><strong>We use the following steps to find and visualize anomalies:</strong></p>



<ul class="wp-block-list">
<li>Calculate the distance between each point and its nearest centroid. The biggest distances are considered anomalies.</li>



<li>We use outliers_fraction to tell the algorithm what proportion of outliers is present in our data set, similarly to the IsolationForest algorithm. This is largely a hyperparameter that needs trial and error or a grid search to be set right; as a starting figure, let’s estimate outliers_fraction=0.1.</li>



<li>Calculate number_of_outliers using outliers_fraction.</li>



<li>Set the threshold as the minimum distance of these outliers.</li>



<li>Store the result of this cluster-based method in the anomaly1 column (0: normal, 1: anomaly).</li>



<li>Visualize anomalies with cluster view.</li>



<li>Visualize anomalies with Time Series view.</li>
</ul>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Return a Series with the distance between each point and the centroid of its assigned cluster</span>
<span class="hljs-function"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">def</span> <span class="hljs-title" style="color: rgb(153, 0, 0); font-weight: 700;">getDistanceByPoint</span><span class="hljs-params">(data, model)</span>:</span>
   distance = pd.Series(dtype=float)
   <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> i <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> range(<span class="hljs-number" style="color: teal;">0</span>,len(data)):
       Xa = np.array(data.loc[i])
       <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># labels_ already holds the index of the assigned centroid, so no offset is needed</span>
       Xb = model.cluster_centers_[model.labels_[i]]
       distance.at[i]=np.linalg.norm(Xa-Xb)
   <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">return</span> distance
outliers_fraction = <span class="hljs-number" style="color: teal;">0.1</span>
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Get the distance between each point and its nearest centroid. The points with the biggest distances are considered anomalies</span>
distance = getDistanceByPoint(data, kmeans[<span class="hljs-number" style="color: teal;">9</span>])
number_of_outliers = int(outliers_fraction*len(distance))
threshold = distance.nlargest(number_of_outliers).min()
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># anomaly1 holds the result of the method above (0:normal, 1:anomaly)</span>
df[<span class="hljs-string" style="color: rgb(221, 17, 68);">'anomaly1'</span>] = (distance &gt;= threshold).astype(int)
fig, ax = plt.subplots(figsize=(<span class="hljs-number" style="color: teal;">10</span>,<span class="hljs-number" style="color: teal;">6</span>))
colors = {<span class="hljs-number" style="color: teal;">0</span>:<span class="hljs-string" style="color: rgb(221, 17, 68);">'blue'</span>, <span class="hljs-number" style="color: teal;">1</span>:<span class="hljs-string" style="color: rgb(221, 17, 68);">'red'</span>}
ax.scatter(df[<span class="hljs-string" style="color: rgb(221, 17, 68);">'principal_feature1'</span>], df[<span class="hljs-string" style="color: rgb(221, 17, 68);">'principal_feature2'</span>], c=df[<span class="hljs-string" style="color: rgb(221, 17, 68);">"anomaly1"</span>].apply(<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">lambda</span> x: colors[x]))
plt.xlabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'principal feature1'</span>)
plt.ylabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'principal feature2'</span>)
plt.show();</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Visualize-anomalies.png?ssl=1" alt="Visualize anomalies" class="wp-image-42445"/></figure>
</div>


<p><strong>Now, in order to see the anomalies </strong>against real-world features, we process the dataframe we created in the previous step.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">df = df.sort_values(<span class="hljs-string" style="color: rgb(221, 17, 68);">'date_time'</span>)
fig, ax = plt.subplots(figsize=(<span class="hljs-number" style="color: teal;">10</span>,<span class="hljs-number" style="color: teal;">6</span>))
a = df.loc[df[<span class="hljs-string" style="color: rgb(221, 17, 68);">'anomaly1'</span>] == <span class="hljs-number" style="color: teal;">1</span>, [<span class="hljs-string" style="color: rgb(221, 17, 68);">'date_time'</span>, <span class="hljs-string" style="color: rgb(221, 17, 68);">'price_usd'</span>]] <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#anomaly</span>
ax.plot(pd.to_datetime(df[<span class="hljs-string" style="color: rgb(221, 17, 68);">'date_time'</span>]), df[<span class="hljs-string" style="color: rgb(221, 17, 68);">'price_usd'</span>], color=<span class="hljs-string" style="color: rgb(221, 17, 68);">'k'</span>,label=<span class="hljs-string" style="color: rgb(221, 17, 68);">'Normal'</span>)
ax.scatter(pd.to_datetime(a[<span class="hljs-string" style="color: rgb(221, 17, 68);">'date_time'</span>]),a[<span class="hljs-string" style="color: rgb(221, 17, 68);">'price_usd'</span>], color=<span class="hljs-string" style="color: rgb(221, 17, 68);">'red'</span>, label=<span class="hljs-string" style="color: rgb(221, 17, 68);">'Anomaly'</span>)
ax.xaxis_date()
plt.xlabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Date Time'</span>)
plt.ylabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'price in USD'</span>)
plt.legend()
fig.autofmt_xdate()
plt.show()</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Visualize-anomalies-dataframe.png?ssl=1" alt="Visualize anomalies dataframe" class="wp-image-42446"/></figure>
</div>


<p>This method is able to capture the peaks pretty well, with some misses of course. Part of the issue may be that we haven’t experimented with many values of outliers_fraction.</p>
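<p>Since the result depends so much on outliers_fraction, it can help to watch how the threshold and the number of flagged points move as that value changes. The snippet below is a minimal sketch on synthetic two-dimensional data, not on the price data used above. The data, the helper name flag_anomalies, and the candidate fractions are all assumptions made for illustration. It uses KMeans.transform() to get the distance to the nearest centroid without an explicit loop.</p>

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic data (assumption): two tight clusters plus a few scattered points
clustered = np.vstack([rng.normal(0, 0.5, (200, 2)), rng.normal(5, 0.5, (200, 2))])
scattered = rng.uniform(-4, 9, (20, 2))
X = np.vstack([clustered, scattered])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# transform() returns the distance to every centroid; keep the nearest one
distance = model.transform(X).min(axis=1)


def flag_anomalies(distance, outliers_fraction):
    """Return 0/1 flags and the distance threshold for a given outlier fraction."""
    number_of_outliers = max(1, int(outliers_fraction * len(distance)))
    # The threshold is the smallest distance among the k largest distances
    threshold = np.sort(distance)[-number_of_outliers]
    flags = (distance >= threshold).astype(int)
    return flags, threshold


for fraction in (0.01, 0.05, 0.1):
    flags, threshold = flag_anomalies(distance, fraction)
    print(f"fraction={fraction}: threshold={threshold:.3f}, flagged={flags.sum()}")
```

<p>A larger fraction lowers the threshold and flags more points, so plotting the flagged points for a few candidate values is a cheap way to choose one before running a full grid search.</p>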



<p><strong><em>Pros</em></strong></p>



<p>The biggest advantage of this technique, which it shares with other unsupervised techniques, is that you can introduce as many random variables or features as you like to build more sophisticated models.</p>



<p><strong><em>Cons</em></strong></p>



<p>The weakness is that a growing number of features can start to impact your computational performance fairly quickly. In addition to this, there are more hyper-parameters to tune and get right, so there’s always a chance of high model variance in performance.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-autoencoders">Autoencoders</h3>



<p>Can’t talk about data techniques without Deep Learning! So, let&#8217;s discuss Anomaly detection using <strong>Autoencoders.</strong></p>






<p>Autoencoders are an unsupervised technique that learns to recreate the input data after squeezing it through a smaller number of dimensions, extracting its main features along the way. In other words, if we use the latent representation of the data learned by an autoencoder, it corresponds to <em>dimensionality reduction</em>.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Anomaly-detection-autoencoders.png?ssl=1" alt="Anomaly detection autoencoders" class="wp-image-42447"/><figcaption class="wp-element-caption"><em><a href="https://hackernoon.com/hn-images/1*8ixTe1VHLsmKB3AquWdxpQ.png" target="_blank" rel="noreferrer noopener nofollow">Source</a></em></figcaption></figure>
</div>


<h4 class="wp-block-heading">Why do we apply dimensionality reduction to find outliers?</h4>



<p>Don’t we lose some information, including the outliers, if we reduce the dimensionality? The answer is that once the main patterns are identified, the outliers are revealed. Many distance-based techniques (e.g., k-nearest neighbors) suffer from the curse of dimensionality when they compute the distances between data points in the full feature space, so high dimensionality has to be reduced.&nbsp;</p>



<p>Interestingly, during the process of dimensionality reduction outliers are identified. We can say outlier detection is a by-product of dimension reduction.</p>



<p><em>Autoencoders are an unsupervised approach to find anomalies.</em></p>
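<p>To make the “by-product” idea concrete, here is a small sketch that uses PCA as the dimensionality reduction step. PCA is only a linear stand-in here, not an autoencoder, and the synthetic data and variable names are assumptions made for illustration. We compress the data, map it back, and use the reconstruction error as the outlier score.</p>

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)

# Synthetic data (assumption): 300 points lying close to a 2-D plane inside a 10-D space
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 10))

# Push the first five points off that plane so they no longer follow the main pattern
X[:5] += rng.normal(scale=3.0, size=(5, 10))

# Reduce to two dimensions, then map back to the original ten
pca = PCA(n_components=2).fit(X)
X_reconstructed = pca.inverse_transform(pca.transform(X))

# Points that do not fit the main pattern are reconstructed poorly
reconstruction_error = np.linalg.norm(X - X_reconstructed, axis=1)
suspects = np.argsort(reconstruction_error)[-5:]
print(sorted(suspects.tolist()))
```

<p>The rows we pushed off the plane should end up with the largest reconstruction errors. An autoencoder follows the same recipe, with a non-linear encoder and decoder in place of the PCA projection.</p>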



<h4 class="wp-block-heading">Why autoencoders?</h4>



<p>There are many useful tools, such as Principal Component Analysis (PCA), for detecting outliers. Why do we need autoencoders? The reason is that PCA is limited to linear transformations. In contrast, autoencoders can perform non-linear transformations thanks to their non-linear activation functions and multiple layers. It can also be more efficient to train several layers with an autoencoder than to learn one huge transformation with PCA. Autoencoders thus show their merits when the data problems are complex and non-linear in nature.</p>



<h4 class="wp-block-heading">Build the model</h4>



<p>We can implement autoencoders with popular frameworks like TensorFlow or PyTorch, but &#8211; for the sake of simplicity &#8211; we’re going to use a Python module called PyOD, which builds autoencoders internally from just a few user inputs.</p>



<p>For the data part, let’s use the utility function generate_data() of PyOD to generate 25 variables, 500 observations, and ten percent outliers.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> pandas <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> pd
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> pyod.models.auto_encoder <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> AutoEncoder
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> pyod.utils.data <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> generate_data
contamination = <span class="hljs-number" style="color: teal;">0.1</span>  <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># percentage of outliers</span>
n_train = <span class="hljs-number" style="color: teal;">500</span>  <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># number of training points</span>
n_test = <span class="hljs-number" style="color: teal;">500</span>  <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># number of testing points</span>
n_features = <span class="hljs-number" style="color: teal;">25</span> <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Number of features</span>
X_train, y_train, X_test, y_test = generate_data(
   n_train=n_train, n_test=n_test,
   n_features= n_features,
   contamination=contamination,random_state=<span class="hljs-number" style="color: teal;">1234</span>)
X_train = pd.DataFrame(X_train)
X_test = pd.DataFrame(X_test)</pre>



<p>When you do unsupervised learning, it’s usually a safe step to standardize the predictors. Note that we fit the scaler on the training data only and reuse it for the test data, so both sets are scaled in the same way:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.preprocessing <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> StandardScaler
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Fit the scaler on the training data only, then apply the same scaling to the test data</span>
scaler = StandardScaler()
X_train = pd.DataFrame(scaler.fit_transform(X_train))
X_test = pd.DataFrame(scaler.transform(X_test))</pre>



<p>In order to get a good sense of what the data looks like, let’s use PCA to reduce it to two dimensions, and plot accordingly.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> sklearn.decomposition <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> PCA
pca = PCA(<span class="hljs-number" style="color: teal;">2</span>)
x_pca = pca.fit_transform(X_train)
x_pca = pd.DataFrame(x_pca)
x_pca.columns=[<span class="hljs-string" style="color: rgb(221, 17, 68);">'PC1'</span>,<span class="hljs-string" style="color: rgb(221, 17, 68);">'PC2'</span>]
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Plot the two principal components, colored by the true label (0: typical, 1: outlier)</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> plt
plt.scatter(x_pca[<span class="hljs-string" style="color: rgb(221, 17, 68);">'PC1'</span>], x_pca[<span class="hljs-string" style="color: rgb(221, 17, 68);">'PC2'</span>], c=y_train, alpha=<span class="hljs-number" style="color: teal;">1</span>)
plt.title(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Scatter plot'</span>)
plt.xlabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'PC1'</span>)
plt.ylabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'PC2'</span>)
plt.show()</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Anomaly-detection-scatter-plot.png?ssl=1" alt="Anomaly detection scatter plot" class="wp-image-42448"/></figure>
</div>


<p>The dark points clustered together are the typical observations (label 0), and the yellow points are the outliers (label 1).</p>



<h4 class="wp-block-heading">Model specification</h4>



<ul class="wp-block-list">
<li>[25, 2, 2, 25]. The input layer and the output layer have 25 neurons each. There are two hidden layers, each with two neurons.</li>
</ul>



<p><strong>Step 1 — Build your model</strong></p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">clf = AutoEncoder(hidden_neurons =[<span class="hljs-number" style="color: teal;">25</span>, <span class="hljs-number" style="color: teal;">2</span>, <span class="hljs-number" style="color: teal;">2</span>, <span class="hljs-number" style="color: teal;">25</span>])
clf.fit(X_train)</pre>
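<p>PyOD hides the network behind a single class. To show roughly what such a model does, here is a sketch that uses scikit-learn’s MLPRegressor as a stand-in: it is trained to reproduce its own input through two hidden layers of two neurons each, which matches the 25-2-2-25 shape described above. This is not PyOD’s implementation. The random data, the tanh activation, and the training settings are assumptions made for illustration.</p>

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1234)

# Synthetic stand-in data (assumption): 500 observations with 25 features
X = StandardScaler().fit_transform(rng.normal(size=(500, 25)))

# 25 inputs, two hidden layers of 2 neurons each, 25 outputs
autoencoder = MLPRegressor(hidden_layer_sizes=(2, 2), activation="tanh",
                           max_iter=300, random_state=0)

# An autoencoder is trained to reproduce its own input, so X is both input and target
autoencoder.fit(X, X)

# Anomaly score: distance between each observation and its reconstruction
reconstruction = autoencoder.predict(X)
scores = np.linalg.norm(X - reconstruction, axis=1)
print(scores[:5])
```

<p>The narrow middle layers force the network to keep only the dominant patterns, so observations that don’t follow those patterns tend to be reconstructed poorly and receive higher scores.</p>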



<p><strong>Step 2 — Determine the cut point</strong></p>



<p>Let’s apply the trained model <em>clf</em> to predict the anomaly score for each observation in the test data. How do we define an outlier? An outlier is a point that’s distant from other points, so the outlier score is defined by distance. The PyOD function .decision_function() calculates that distance, or anomaly score, for each data point. For an autoencoder, it is based on how far an observation is from its reconstruction.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Get the outlier scores for the train data</span>
y_train_scores = clf.decision_scores_
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Predict the anomaly scores</span>
y_test_scores = clf.decision_function(X_test)  <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># outlier scores</span>
y_test_scores = pd.Series(y_test_scores)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Plot it!</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> plt
plt.hist(y_test_scores, bins=<span class="hljs-string" style="color: rgb(221, 17, 68);">'auto'</span>)
plt.title(<span class="hljs-string" style="color: rgb(221, 17, 68);">"Histogram for Model Clf1 Anomaly Scores"</span>)
plt.show()</pre>



<p>If we use a histogram to count the frequency by the anomaly score, we will see that the high scores correspond to a low frequency, which is evidence of outliers. We choose 4.0 to be the cut point and those &gt;=4.0 to be outliers.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Anomaly-detection-histogram.png?ssl=1" alt="Anomaly detection histogram" class="wp-image-42449"/></figure>
</div>


<p><strong>Step 3 — Get the summary statistics by cluster</strong></p>



<p>Let’s assign the observations with anomaly scores below 4.0 to Cluster 0, and those with scores of 4.0 or above to Cluster 1. Also, let’s calculate the summary statistics by cluster using .groupby(). This model has identified 50 outliers (not shown).</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">df_test = X_test.copy()
df_test[<span class="hljs-string" style="color: rgb(221, 17, 68);">'score'</span>] = y_test_scores
df_test[<span class="hljs-string" style="color: rgb(221, 17, 68);">'cluster'</span>] = np.where(df_test[<span class="hljs-string" style="color: rgb(221, 17, 68);">'score'</span>]&lt;<span class="hljs-number" style="color: teal;">4</span>, <span class="hljs-number" style="color: teal;">0</span>, <span class="hljs-number" style="color: teal;">1</span>)
df_test[<span class="hljs-string" style="color: rgb(221, 17, 68);">'cluster'</span>].value_counts()
df_test.groupby(<span class="hljs-string" style="color: rgb(221, 17, 68);">'cluster'</span>).mean()</pre>



<p>The following output shows the mean variable values in each cluster. The values of Cluster ‘1’ (the abnormal cluster) are quite different from those of Cluster ‘0’ (the normal cluster). The “score” column shows the average anomaly score of the observations in each cluster. A high “score” means that the observations are far away from the norm.</p>



<figure class="wp-block-image size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Anomaly-detection-cluster.png?ssl=1" alt="Anomaly detection cluster" class="wp-image-42450"/></figure>



<p>This way, we can distinguish and label typical data points and anomalies pretty well.</p>
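<p>The cut point of 4.0 was read off the histogram by eye. When that isn’t practical, one common alternative is to derive the cut point from the share of outliers you expect. The sketch below does that on made-up scores. The scores themselves and the 10% share are assumptions made for illustration, not the output of the model above.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical anomaly scores: 450 typical observations and 50 unusual ones
scores = pd.Series(np.concatenate([rng.normal(2.0, 0.5, 450),
                                   rng.normal(8.0, 1.0, 50)]))

# Use the expected share of outliers to place the cut point
contamination = 0.1
cut_point = scores.quantile(1 - contamination)

# Same labeling rule as above: 0 for typical observations, 1 for outliers
cluster = np.where(scores >= cut_point, 1, 0)

summary = (pd.DataFrame({"score": scores, "cluster": cluster})
           .groupby("cluster")["score"]
           .agg(["count", "mean"]))
print(summary)
```

<p>A quantile-based cut point always flags the same share of observations, so it’s worth checking against the histogram that the cut actually falls in a gap between the two groups.</p>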



<p><strong><em>Pros</em></strong></p>



<ul class="wp-block-list">
<li>Autoencoders can handle high-dimensional data with ease.&nbsp;</li>



<li>Thanks to their non-linear behavior, they can find complex patterns within high-dimensional datasets.</li>
</ul>



<p><strong><em>Cons</em></strong></p>



<ul class="wp-block-list">
<li>Since it’s a deep learning-based strategy, it will particularly struggle when little data is available.</li>



<li>Computation costs will skyrocket as the depth of the network increases and when dealing with big data.</li>
</ul>



<p>So far we’ve seen how to detect and identify anomalies. But the real question arises after finding them. Now what? What do we do about it?</p>



<p>Let’s discuss some of the pointers you could apply in your scenario.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-how-to-deal-with-the-anomalies">How to deal with the anomalies?</h2>



<p>After detection, there comes the big question of what to do about the stuff we identified. There are numerous ways to deal with the newly found information. I’ll list some of them based on my experience to give you a head start on how to approach this question.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-understanding-the-business-case">Understanding the business case</h3>



<p>Anomalies almost always provide new information and perspective on your problems. Stock prices going up suddenly? There has to be a reason for this, as we saw with GameStop; a pandemic could be another. So, understanding the reasons behind the spike can help you solve the problem in an efficient manner.</p>



<p>Understanding the business use case can also help you identify the problem better. For instance, you might be working on some sort of fraud detection which means your primary goal is indeed understanding the outliers in the data.</p>



<p>If none of this is your concern, you can move on to removing or smoothing out the outlier.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-statistical-methods-to-adjust-outliers">Statistical methods to adjust outliers</h3>



<p>Statistical methods let you adjust the value of your outlier to match the original distribution. Let’s look at one such method, which uses the mean to smooth out the anomalies.</p>



<h4 class="wp-block-heading">Using mean to smoothen out the outlier</h4>



<p>The idea is to smooth out the anomaly by using data from previous periods. E.g., to even out a sudden spike in electricity usage caused by an event at your house, you could take the average usage in the same month of previous years.</p>



<p>Let’s implement the same to get a clear picture. We’ll employ the same catfish sales data we did earlier. We can adjust with the <strong>mean</strong> using the script below.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">adjusted_data = lim_catfish_sales.copy()
adjusted_data.loc[curr_anomaly] = december_data[(december_data.index != curr_anomaly) &amp; (december_data.index &lt; test_data.index[<span class="hljs-number" style="color: teal;">0</span>])].mean()</pre>



<p><strong>Plotting the adjusted data </strong>and the old data will look something like this:</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">plt.figure(figsize=(<span class="hljs-number" style="color: teal;">10</span>,<span class="hljs-number" style="color: teal;">4</span>))
plt.plot(lim_catfish_sales, color=<span class="hljs-string" style="color: rgb(221, 17, 68);">'firebrick'</span>, alpha=<span class="hljs-number" style="color: teal;">0.4</span>)
plt.plot(adjusted_data)
plt.title(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Catfish Sales in 1000s of Pounds'</span>, fontsize=<span class="hljs-number" style="color: teal;">20</span>)
plt.ylabel(<span class="hljs-string" style="color: rgb(221, 17, 68);">'Sales'</span>, fontsize=<span class="hljs-number" style="color: teal;">16</span>)
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> year <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> range(start_date.year,end_date.year):
   plt.axvline(pd.to_datetime(str(year)+<span class="hljs-string" style="color: rgb(221, 17, 68);">'-01-01'</span>), color=<span class="hljs-string" style="color: rgb(221, 17, 68);">'k'</span>, linestyle=<span class="hljs-string" style="color: rgb(221, 17, 68);">'--'</span>, alpha=<span class="hljs-number" style="color: teal;">0.2</span>)
plt.axvline(curr_anomaly, color=<span class="hljs-string" style="color: rgb(221, 17, 68);">'k'</span>, alpha=<span class="hljs-number" style="color: teal;">0.7</span>)</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Anomaly-detection-plotting.png?ssl=1" alt="Anomaly detection plotting" class="wp-image-42451"/></figure>
</div>


<p>This way, you can proceed to apply forecasting or analysis without worrying much about skewness in your results.&nbsp;</p>
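<p>The snippets above rely on variables defined earlier in the post, so here is a self-contained sketch of the same idea on a made-up monthly series. The series, the injected spike, and the variable names are assumptions made for illustration; only the adjustment step mirrors the approach described above.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical monthly sales with a yearly pattern (assumption, not the catfish data)
index = pd.date_range("1996-01-01", "2000-12-01", freq="MS")
months = index.month.to_numpy()
sales = pd.Series(100 + 10 * np.sin(2 * np.pi * months / 12)
                  + rng.normal(0, 1, len(index)), index=index)

# Inject an artificial spike to play the role of the detected anomaly
anomaly_date = pd.Timestamp("1998-12-01")
sales.loc[anomaly_date] = 250.0

# Collect the same calendar month from the other years, leaving the anomaly out
same_month = sales[sales.index.month == anomaly_date.month]
reference = same_month.drop(anomaly_date)

# Replace the anomalous value with the mean of those reference values
adjusted = sales.copy()
adjusted.loc[anomaly_date] = reference.mean()

print(sales.loc[anomaly_date], adjusted.loc[anomaly_date])
```

<p>Only the anomalous point changes. Every other observation is left untouched, which keeps the seasonal pattern intact for the forecasting step.</p>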



<p>There are numerous methods for dealing with outliers in non-time series data, but unfortunately they cannot be applied directly to time series because of the difference in underlying structure. Many of them are distribution-based and can’t simply be translated to time series data. If you wish to look at some of those, you can head over <a href="https://cxl.com/blog/outliers/" target="_blank" rel="noreferrer noopener nofollow">here</a>.&nbsp;</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-removing-the-outlier">Removing the Outlier</h3>



<p>The last option, if neither of the two approaches above fits your problem, is to get rid of the anomalies. This is not recommended (as you’re basically getting rid of some potentially valuable information) unless it&#8217;s absolutely necessary and it doesn’t harm the analysis in the future.</p>



<p>You can use the .drop() method in pandas once the anomalies are identified. It will do the heavy lifting for you.</p>
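<p>As a minimal sketch, assuming the anomalies have already been flagged in a 0/1 column like the anomaly1 column used earlier, dropping them could look like this. The tiny DataFrame below is made up for illustration.</p>

```python
import pandas as pd

# Hypothetical data with an anomaly flag (0: normal, 1: anomaly)
df = pd.DataFrame(
    {"price_usd": [100.0, 102.0, 950.0, 101.0, 99.0],
     "anomaly1": [0, 0, 1, 0, 0]},
    index=pd.date_range("2022-01-01", periods=5, freq="D"),
)

# Look up the index labels of the flagged rows and drop them
anomaly_index = df.index[df["anomaly1"] == 1]
cleaned = df.drop(index=anomaly_index)

print(cleaned)
```

<p>Keep in mind that dropping rows leaves gaps in the timeline, which some forecasting methods won’t accept without resampling or imputing the missing periods.</p>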



<h2 class="wp-block-heading" class="wp-block-heading" id="h-youve-reached-the-end">You’ve reached the end!</h2>



<p>Congratulations! You now know about anomalies, how to detect them, and what you can do about them. A few endnotes:</p>



<ul class="wp-block-list">
<li>Time series data varies a lot depending on the business case, so it’s better to experiment and find out what works instead of just applying what you find. Experience can do wonders!</li>



<li>There are tons of techniques for anomaly detection apart from what we’ve discussed on this blog. I encourage you to read more in research papers.</li>
</ul>



<p>You can find the complete notebook with code and some bonus stuff <a href="https://ui.neptune.ai/theaayushbajaj/Anomaly-Detection/n/Anomaly-Detection-49ba1752-fc3a-4abb-b35f-0e2ea4fd4afa" target="_blank" rel="noreferrer noopener nofollow">here</a>!</p>



<p>That’s it for now, stay tuned for more! Adios!</p>



<p><em>Note: images are created by the author unless stated otherwise. </em></p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">4228</post-id>	</item>
		<item>
		<title>Time Series Forecasting: Data, Analysis, and Practice</title>
		<link>https://neptune.ai/blog/time-series-forecasting</link>
		
		<dc:creator><![CDATA[Akshay P Jain]]></dc:creator>
		<pubDate>Thu, 21 Jul 2022 14:04:33 +0000</pubDate>
				<category><![CDATA[ML Model Development]]></category>
		<category><![CDATA[Time Series]]></category>
		<guid isPermaLink="false">https://neptune.test/time-series-forecasting/</guid>

					<description><![CDATA[Usually, in the traditional machine learning approach, we randomly split the data into training data, test data, and cross-validation data. Here, each point xi in the dataset has: Instead of random-based splitting, we can use another approach called time-based splitting. When we have a timestamp given in our dataset, we can split the data according&#8230;]]></description>
										<content:encoded><![CDATA[
<p>Usually, in the <a href="/blog/time-series-prediction-vs-machine-learning" target="_blank" rel="noreferrer noopener">traditional machine learning approach</a>, we randomly split the data into training data, test data, and cross-validation data.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Traditional-ML.png?ssl=1" alt="Traditional ML" class="wp-image-41221"/></figure>
</div>


<p>Here, each point <strong>x</strong><strong><sub>i</sub></strong> in the dataset has:</p>



<ul class="wp-block-list">
<li>60% probability of going into D<sub>train</sub>&nbsp;</li>



<li>20% probability of going into D<sub>test</sub>&nbsp;&nbsp;</li>



<li>20% probability of going into Validation</li>
</ul>



<p>Instead of <a href="https://developers.google.com/machine-learning/data-prep/construct/sampling-splitting/example" target="_blank" rel="noreferrer noopener nofollow">random-based splitting, we can use another approach called time-based splitting</a>. When we have a timestamp given in our dataset, we can split the data according to time.&nbsp;</p>



<p>Imagine you’re an ML engineer at Amazon, trying to productionize a model to classify reviews. You randomly split the data into training data and test data, and after obtaining the required accuracy, you deploy the model. With additional reviews being added to new products, over time the model’s accuracy could decrease. Time-based splitting is a way to overcome this issue.&nbsp;</p>



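<p>In time-based splitting, we generally sort the data by timestamp, train the model on the earlier part, and test it on the most recent part. This mimics how the model is used in production, so the accuracy we measure is usually a more reliable guide than the one we get with random-based splitting.</p>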
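<p>Here is a minimal sketch of a time-based split with pandas, reusing the 60/20/20 proportions from the random split above. The review data and column names are made up for illustration.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical reviews arriving in arbitrary order (assumption for illustration)
df = pd.DataFrame({
    "timestamp": pd.date_range("2022-01-01", periods=100, freq="D"),
    "rating": rng.integers(1, 6, size=100),
}).sample(frac=1.0, random_state=0)

# Sort by time first, then cut the timeline into consecutive blocks
df = df.sort_values("timestamp").reset_index(drop=True)
train_end = int(0.6 * len(df))
validation_end = int(0.8 * len(df))

train = df.iloc[:train_end]                    # oldest 60%
validation = df.iloc[train_end:validation_end]  # next 20%
test = df.iloc[validation_end:]                 # most recent 20%

print(len(train), len(validation), len(test))
```

<p>Because the blocks are consecutive, every row in the test set is newer than anything the model saw during training, which is exactly the situation the model faces after deployment.</p>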



<h3 class="wp-block-heading" class="wp-block-heading" id="h-why-do-we-need-a-different-approach">Why do we need a different approach?</h3>



<p>The standard ML approach doesn’t work for time series models:</p>



<ul class="wp-block-list">
<li>Features and target variables are the same: we predict a series from its own past values,</li>



<li>Data is correlated over time,</li>



<li>Data is often non-stationary (hard to model),</li>



<li>We need a lot of data to capture the patterns and trends and model those changes appropriately.</li>
</ul>
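<p>The first two points are easy to see in code. In the sketch below, the features are simply lagged copies of the target, and neighboring values are strongly correlated. The random-walk series and the column names are assumptions made for illustration.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical series: a random walk, so each value builds on the previous one
series = pd.Series(np.cumsum(rng.normal(size=200)), name="y")

# The "features" are just past values of the target itself
frame = pd.DataFrame({
    "y": series,
    "lag_1": series.shift(1),
    "lag_2": series.shift(2),
}).dropna()

# Correlation between the series and itself shifted by one step
lag1_autocorrelation = series.autocorr(lag=1)

print(frame.head())
print(lag1_autocorrelation)
```

<p>With rows this dependent on each other, a random split would put near-copies of the test rows into the training set, which is one more reason to split by time.</p>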



<h2 class="wp-block-heading" class="wp-block-heading" id="h-what-is-a-time-series">What is a time series?</h2>



<p>A time series is a sequence of data points organized in time order.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series.png?resize=553%2C359&#038;ssl=1" alt="Time series" class="wp-image-41222" style="width:553px;height:359px" width="553" height="359"/></figure>
</div>


<h2 class="wp-block-heading" class="wp-block-heading" id="h-types-of-forecasting">Types of forecasting</h2>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Types-of-forecasting.png?resize=768%2C382&#038;ssl=1" alt="" class="wp-image-41226" style="width:768px;height:382px" width="768" height="382"/></figure>
</div>


<h3 class="wp-block-heading" class="wp-block-heading" id="h-time-series-are-everywhere">Time series are everywhere</h3>



<p><strong>Finance</strong>: we&#8217;re trying to predict stock prices, asset prices, and the macroeconomic factors that will have a large effect on our business objectives.</p>



<p><strong>E-commerce</strong>: we’re trying to predict future page views compared to what happened in the past, and whether it’s trending up, down, or if there’s seasonality. Same with new users, how many new users are you getting/losing over time?</p>



<p><strong>Business</strong>: we’re trying to predict the number of transactions, future revenue, and future inventory levels that you will need.</p>



<p>Time series decomposition involves thinking of a series as a combination of level, trend, seasonality, and noise components. Decomposition provides a useful abstract model for thinking about time series generally and for better understanding problems during time series analysis and forecasting.</p>



<p>One of the fundamental topics in time series is time series decomposition:</p>



<ul class="wp-block-list">
<li>Components of time series data</li>



<li>Seasonal patterns and trends</li>



<li>Decomposition of time series data</li>
</ul>



<p>What are the components of time series?</p>



<p><strong>Trend:</strong> the overall direction in which the series changes over a period of time.</p>



<p><strong>Seasonality:</strong> seasonality is about periodic behavior, spikes or drops caused by different factors, for example:&nbsp;</p>



<ul class="wp-block-list">
<li>Naturally occurring events, like weather fluctuations</li>



<li>Business or administrative procedures, like start or end of a fiscal year</li>



<li>Social and cultural behavior, like holidays or religious observances</li>



<li>Calendar events, like the number of Mondays per month or holidays shifting year to year</li>
</ul>



<p><strong>Residual:</strong> irregular fluctuations that we cannot predict using trend or seasonality.</p>



<p>The graphs of trends, seasonality, and residual factors are constructed below using Pandas and NumPy arrays in Python.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-decomposition-models">Decomposition models</h2>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-additive-model">Additive model</h3>



<p>The additive model assumes the observed time series is the sum of components:</p>



<p><strong><em>Observation = trend + seasonality + residual</em></strong></p>



<p>Additive models are used when the magnitudes of the seasonal and residual values are independent of the trend.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Additive-model.png?resize=512%2C360&#038;ssl=1" alt="Additive model" class="wp-image-41229" style="width:512px;height:360px" width="512" height="360"/></figure>
</div>


<p>The graph above was generated with Python; the code is covered later in this article.</p>



<p>In the above example, we can see that the size of the seasonal swings doesn’t increase or decrease as the trend increases; it stays constant all the way. If we subtract the straight line that is the trend, what remains is a seasonal component that stays the same no matter what the trend is.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-multiplicative-model">Multiplicative model</h3>



<p>The multiplicative model assumes the observed time series is a product of its components:</p>



<p><strong><em>Observation&nbsp; = trend * seasonality * residual</em></strong></p>



<p>We can transform the multiplicative model to an additive model by applying a log transformation:</p>



<p><strong><em>log(trend * seasonality * residual) = log(trend) + log(seasonality) + log(residual)</em></strong></p>



<p>Multiplicative models are used if the magnitudes of the seasonal and residual values fluctuate with the trend.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Multiplicative-model.png?resize=512%2C370&#038;ssl=1" alt="Multiplicative model" class="wp-image-41230" style="width:512px;height:370px" width="512" height="370"/></figure>
</div>


<p>The graph above was generated with Python; the code is covered later in this article.</p>



<p>In the above image, we see the trend increases, so we&#8217;re trending up. The seasonal component is also trending up with the trend. This means that it’s likely a multiplicative model, so we should divide out that trend, and then we would end up with more reasonable looking (more consistent) seasonality.</p>
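<p>The log transformation mentioned above can be checked numerically. The following is a minimal sketch in plain Python; the three component values are hypothetical and chosen only for illustration.</p>

```python
import math

# Hypothetical component values at a single time step (illustration only)
trend, seasonality, residual = 50.0, 1.2, 0.95

# Multiplicative model: the observation is the product of the components
observation = trend * seasonality * residual

# After a log transform, the components add up instead of multiplying
log_sum = math.log(trend) + math.log(seasonality) + math.log(residual)

print(math.log(observation), log_sum)
```

<p>Note that the transformation only works when all values are strictly positive, since the logarithm is undefined at zero and for negative numbers.</p>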



<h3 class="wp-block-heading" class="wp-block-heading" id="h-pseudo-additive-models">Pseudo-additive models</h3>



<p>Pseudo-additive models combine the elements of both additive and multiplicative models. They can be useful when:</p>



<ul class="wp-block-list">
<li>Time series values are close to or equal to zero</li>



<li>We expect features related to the multiplicative model</li>



<li>Dividing by values at or near zero would otherwise become a problem in a purely multiplicative decomposition</li>
</ul>
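<p>The article does not give a formula for the pseudo-additive model. One common formulation, assumed here, is <em>observation = trend * (seasonality + residual &#8211; 1)</em>, where the seasonal and residual factors are centered on 1. The helper name below is hypothetical, and the numbers are made up for illustration.</p>

```python
def pseudo_additive(trend, seasonal, residual):
    """Combine components as T * (S + R - 1), with S and R centered on 1 (assumed form)."""
    return [t * (s + r - 1.0) for t, s, r in zip(trend, seasonal, residual)]

trend = [10.0, 12.0, 14.0, 16.0]
seasonal = [0.0, 1.5, 0.0, 1.5]   # a seasonal factor of 0 models periods with no activity
residual = [1.0, 1.1, 1.0, 0.9]

series = pseudo_additive(trend, seasonal, residual)
print(series)
```

<p>Because the components are combined without dividing by the seasonal factor, periods where the series drops to zero do not cause a division by zero when the components are separated again.</p>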



<h2 class="wp-block-heading" class="wp-block-heading" id="h-time-series-decomposition-using-python-pandas">Time series decomposition using Python-Pandas</h2>



<p>We will individually construct fictional trend, seasonality, and residual components. This example shows how a simple time series dataset can be built from NumPy arrays.</p>



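<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np
time = np.arange(<span class="hljs-number" style="color: teal;">1</span>, <span class="hljs-number" style="color: teal;">51</span>)</pre>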



<p>Now we need to create a trend. Let&#8217;s pretend we have a sensor measuring electricity demand. We&#8217;ll ignore units to keep things simple.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">trend = time * <span class="hljs-number" style="color: teal;">2.75</span></pre>



<p>Now let&#8217;s plot the trend as a function of time.</p>



<figure class="wp-block-image size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series-forcasting-plot-trend.png?ssl=1" alt="Time series forecasting plot trend" class="wp-image-41196"/></figure>



<p>Now let&#8217;s generate a seasonal component.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">seasonal = <span class="hljs-number" style="color: teal;">10</span> + np.sin(time) * <span class="hljs-number" style="color: teal;">10</span></pre>



<p>Let’s plot seasonality against time.</p>



<figure class="wp-block-image size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series-forcasting-plot-time.png?ssl=1" alt="Time series forcasting plot against trend" class="wp-image-41198"/></figure>



<p>Now, let’s construct the residual component.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">np.random.seed(<span class="hljs-number" style="color: teal;">10</span>)  <span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># reproducible results</span>
residual = np.random.normal(loc=<span class="hljs-number" style="color: teal;">0.0</span>, scale=<span class="hljs-number" style="color: teal;">1</span>, size=len(time))
</pre>



<p>A quick plot of residuals:</p>



<figure class="wp-block-image size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series-forcasting-plot-residuals.png?ssl=1" alt="Time series forcasting plot residuals" class="wp-image-41200"/></figure>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-aggregate-trend-seasonality-and-residual-components">Aggregate trend, seasonality, and residual components</h2>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-additive-time-series">Additive time series</h3>



<p>Remember the equation for additive time series is simply: <em><strong>O<sub>t </sub>&nbsp;= T<sub>t</sub> + S<sub>t</sub> + R<sub>t</sub></strong>&nbsp;</em></p>



<p><strong>O<sub>t </sub></strong>&nbsp;= output<br><strong>T<sub>t</sub> </strong>= &nbsp;trend<br><strong>S<sub>t</sub> </strong>= seasonality&nbsp;<br><strong>R<sub>t</sub></strong> = residual<br><strong><sub>t</sub></strong>&nbsp; = variable representing a particular point in time</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">additive = trend + seasonal + residual</pre>



<figure class="wp-block-image size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Time-series-forcasting-plot-additive.png?ssl=1" alt="Time series forcasting plot additive" class="wp-image-41202"/></figure>



<p>The same follows for multiplicative time series, except we don’t add, but multiply the values of trend, seasonality, and residual.</p>
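<p>A multiplicative series can be sketched the same way. The snippet below uses plain Python lists so that it runs on its own. The seasonal and residual factors are assumptions made for this illustration: both fluctuate around 1 rather than around 10 and 0, so that multiplying them scales the trend instead of wiping it out.</p>

```python
import math
import random

random.seed(10)  # reproducible illustration

time = list(range(1, 51))
trend = [t * 2.75 for t in time]

# Assumed seasonal factor oscillating around 1
seasonal = [1.0 + 0.3 * math.sin(t) for t in time]

# Assumed residual factor close to 1, so the noise scales with the level of the series
residual = [random.gauss(1.0, 0.05) for _ in time]

# Multiplicative model: multiply the components instead of adding them
multiplicative = [tr * s * r for tr, s, r in zip(trend, seasonal, residual)]
print(multiplicative[:5])
```

<p>With this construction, the seasonal swings grow as the trend grows, which is the behavior the multiplicative model is meant to capture.</p>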



<h3 class="wp-block-heading" class="wp-block-heading" id="h-stationary-and-autocorrelation">Stationarity and autocorrelation</h3>



<h4 class="wp-block-heading">What is stationarity?</h4>



<p>For <a href="https://www.quora.com/What-does-stationary-data-mean-in-machine-learning-and-data-science" target="_blank" rel="noreferrer noopener nofollow">time series data to be stationary</a>, the data must exhibit the following three properties over time:</p>



<p>1. <strong>Constant Mean:</strong></p>



<p>A stationary time series will have a constant mean throughout the entire series.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Constant-Mean.png?ssl=1" alt="Constant Mean" class="wp-image-41231"/></figure>
</div>


<p>For example, if we draw the mean of the series as a horizontal line, that line describes the level of the series equally well at every point in time.&nbsp;</p>



<p>A good example where the mean wouldn&#8217;t be constant is if we had some type of trend. With an upward or downward trend, for example, the mean at the end of our series would be noticeably higher or lower than the mean at the beginning of the series.</p>
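<p>A rough way to see this is to compare the mean of the first half of a series with the mean of the second half. This is only an informal check, not a statistical test, and the helper name and the two toy series below are hypothetical.</p>

```python
from statistics import mean

def half_means(series):
    """Return the mean of the first half and of the second half of a series."""
    mid = len(series) // 2
    return mean(series[:mid]), mean(series[mid:])

flat = [5, 6, 4, 5, 6, 4, 5, 6, 4, 5]        # fluctuates around a fixed level
trending = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # steady upward trend

print(half_means(flat))      # the two halves have similar means
print(half_means(trending))  # the second half has a clearly higher mean
```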



<p>2. <strong>Constant Variance:</strong></p>



<p>A stationary time series will have a constant variance throughout the entire series.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Constant-Variance.png?ssl=1" alt="Constant Variance" class="wp-image-41232"/></figure>
</div>


<p>3. <strong>Constant Autocorrelation Structure:</strong></p>



<p>Autocorrelation simply means that the current time series measurement is correlated with a past measurement. For example, today&#8217;s stock price is often highly correlated with yesterday&#8217;s price.</p>



<p>The time interval between correlated values is called the <strong>lag</strong>. Suppose we wanted to know whether today&#8217;s stock price correlates better with yesterday&#8217;s price or with the price from two days ago. We could test this by computing the correlation between the original time series and the same series delayed by one time interval: the second value of the original series is compared with the first value of the delayed series, the third with the second, and so on. Repeating this for a lag of 2 yields a second correlation value, and comparing the two tells us which lag is more strongly correlated. That is <strong>autocorrelation</strong> in a nutshell.</p>
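<p>The procedure described above can be sketched in a few lines of plain Python. The function name and the price list are hypothetical; the prices are invented purely to have something to compute on.</p>

```python
from statistics import mean

def lag_correlation(series, lag):
    """Pearson correlation between a series and the same series delayed by `lag` steps."""
    current, delayed = series[lag:], series[:-lag]
    mean_c, mean_d = mean(current), mean(delayed)
    cov = sum((c - mean_c) * (d - mean_d) for c, d in zip(current, delayed))
    var_c = sum((c - mean_c) ** 2 for c in current)
    var_d = sum((d - mean_d) ** 2 for d in delayed)
    return cov / (var_c * var_d) ** 0.5

# Hypothetical daily closing prices, for illustration only
prices = [100, 102, 101, 105, 107, 106, 110, 112, 111, 115]

print("lag 1:", lag_correlation(prices, 1))
print("lag 2:", lag_correlation(prices, 2))
```

<p>Whichever lag gives the larger correlation is the one the current value is more strongly related to.</p>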



<h2 class="wp-block-heading" class="wp-block-heading" id="h-time-series-smoothing">Time series smoothing</h2>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-what-is-smoothing">What is Smoothing?</h3>



<p><a href="https://towardsdatascience.com/what-is-label-smoothing-108debd7ef06" target="_blank" rel="noreferrer noopener nofollow">Smoothing</a> is a process that often improves our ability to forecast series by reducing the impact of noise.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-why-is-smoothing-important">Why is smoothing important?</h3>



<p>Smoothing is an important tool that lets us improve forward-looking forecasts.</p>



<p>Consider the data in the below graph. How could we forecast what will happen in one, two, or three steps into the future?</p>



<p>One solution is to calculate the mean of the series and use it as the prediction for every future value.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Smoothing.png?resize=395%2C305&#038;ssl=1" alt="Smoothing" class="wp-image-41233" style="width:395px;height:305px" width="395" height="305"/></figure>
</div>


<p>However, the mean ignores how recent each observation is, so it is unlikely to give accurate predictions. Instead, we employ a technique called exponential smoothing.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-single-exponential-smoothing">Single Exponential Smoothing</h3>



<p><a href="https://otexts.com/fpp2/ses.html" target="_blank" rel="noreferrer noopener nofollow">Single Exponential Smoothing</a>, also called Simple Exponential Smoothing, is a time series forecasting method for univariate data without a trend or seasonality.</p>



<p>It requires a single parameter, called <em>alpha</em> (<em>α</em>), also called the smoothing factor or smoothing coefficient.</p>



<figure class="wp-block-image size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Single-Exponential-Smoothing.png?resize=512%2C222&#038;ssl=1" alt="Single Exponential Smoothing" class="wp-image-41234" style="width:512px;height:222px" width="512" height="222"/></figure>



<p>This parameter controls the rate at which the influence of observations at prior time steps decays exponentially. Alpha is often set to a value between 0 and 1. Large values mean that the model pays attention mainly to the most recent past observations, whereas smaller values mean more of the history is taken into account when making a prediction.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Single-Exponential-Smoothing-2.png?ssl=1" alt="Single Exponential Smoothing" class="wp-image-41235"/></figure>
</div>
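<p>As a minimal sketch, single exponential smoothing can be written as a short recursion: each smoothed value is alpha times the new observation plus (1 &#8211; alpha) times the previous smoothed value. Starting the recursion from the first observation is an assumption made here; other initializations are possible. The function name is hypothetical.</p>

```python
def simple_exp_smoothing(series, alpha):
    """Return the smoothed series; its last value serves as the one-step-ahead forecast."""
    smoothed = [series[0]]  # assumed initialization: start from the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

data = [3.0, 5.0, 4.0, 6.0, 5.0]
print(simple_exp_smoothing(data, alpha=0.5))
print(simple_exp_smoothing(data, alpha=1.0))  # alpha = 1 simply reproduces the data
```

<p>With alpha close to 1 the smoothed series follows the latest observations closely, and with alpha close to 0 it reacts slowly and retains more of the history.</p>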


<h3 class="wp-block-heading" class="wp-block-heading" id="h-double-exponential-smoothing">Double Exponential Smoothing</h3>



<p><a href="https://www.itl.nist.gov/div898/handbook/pmc/section4/pmc434.htm" target="_blank" rel="noreferrer noopener nofollow">Double Exponential Smoothing</a> is an extension to Exponential Smoothing that explicitly adds support for trends in the univariate time series.</p>



<p>In addition to the alpha parameter, which controls the smoothing of the level, a second smoothing factor called beta (β) is added to control how quickly the influence of past changes in the trend decays.</p>



<figure class="wp-block-image size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Double-Exponential-Smoothing.png?resize=512%2C134&#038;ssl=1" alt="Double Exponential Smoothing" class="wp-image-41236" style="width:512px;height:134px" width="512" height="134"/></figure>



<p>The method supports trends that change in two different ways: additive or multiplicative, depending on whether the trend is linear or exponential, respectively.</p>



<p>Double Exponential Smoothing with an additive trend is classically referred to as <strong>Holt’s linear trend model</strong>, named after the developer of the method, <strong>Charles Holt</strong>.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Double-Exponential-Smoothing-2.png?ssl=1" alt="Double Exponential Smoothing" class="wp-image-41237"/></figure>
</div>
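<p>Here is a minimal sketch of double exponential smoothing with an additive trend. The level and trend are updated at every step, and the forecast extends the last level along the last trend. The initialization (level from the first value, trend from the first difference) is a simple assumption, and the function name is hypothetical.</p>

```python
def holt_linear(series, alpha, beta, horizon=1):
    """Additive-trend double exponential smoothing; returns forecasts for `horizon` steps."""
    level, trend = series[0], series[1] - series[0]  # assumed simple initialization
    for x in series[1:]:
        prev_level = level
        # Level: blend the new observation with the previous level carried along the trend
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        # Trend: blend the latest change in level with the previous trend estimate
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]

data = [2.0, 4.0, 6.0, 8.0, 10.0]  # perfectly linear, so the forecast should continue the line
print(holt_linear(data, alpha=0.5, beta=0.5, horizon=3))
```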


<h3 class="wp-block-heading" class="wp-block-heading" id="h-triple-exponential-smoothing">Triple Exponential Smoothing</h3>



<p><a href="https://machinelearningmastery.com/exponential-smoothing-for-time-series-forecasting-in-python/" target="_blank" rel="noreferrer noopener nofollow">Triple Exponential Smoothing</a> is an extension of Exponential Smoothing that explicitly adds support for seasonality to the univariate time series.</p>



<p>This method is sometimes called <strong>Holt-Winters Exponential Smoothing</strong>, named for two contributors to the method: Charles Holt and Peter Winters.</p>



<p>In addition to the alpha and beta smoothing factors, a new parameter called gamma (γ) is added, which controls the smoothing of the seasonal component.</p>



<p>As with the trend, the seasonality may be modeled as either an additive or multiplicative process, for a linear or exponential change in the seasonality.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Triple-exponential-smoothing.png?ssl=1" alt="Triple exponential smoothing" class="wp-image-41238"/></figure>
</div>
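<p>The sketch below shows one way to write additive triple exponential smoothing in plain Python. It keeps a level, a trend, and one seasonal value per position in the season. The initialization from the first two seasons is a crude assumption, the function name is hypothetical, and a tested library implementation is preferable for real work.</p>

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: level + trend + a seasonal component of length `period`."""
    # Assumed simple initialization from the first two seasons
    season1 = series[:period]
    season2 = series[period:2 * period]
    level = sum(season1) / period
    trend = (sum(season2) - sum(season1)) / period ** 2
    seasonal = [x - level for x in season1]

    for t in range(period, len(series)):
        x = series[t]
        s = seasonal[t % period]          # seasonal value from one season ago
        prev_level = level
        level = alpha * (x - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (x - level) + (1 - gamma) * s

    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]

# Hypothetical series with a repeating pattern of length 2 and no trend
data = [10.0, 20.0] * 4
print(holt_winters_additive(data, period=2, alpha=0.5, beta=0.5, gamma=0.5, horizon=4))
```

<p>On this toy series the forecast continues the alternating pattern, because the level stays constant, the trend stays at zero, and the seasonal values carry the whole pattern.</p>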


<h2 class="wp-block-heading" class="wp-block-heading" id="h-autoregressive-models-and-moving-average-arma-models">Autoregressive models and Moving Average (ARMA) models</h2>



<p><a href="http://www-stat.wharton.upenn.edu/~stine/stat910/lectures/08_intro_arma.pdf" target="_blank" rel="noreferrer noopener nofollow">ARMA models</a> combine two models:</p>



<p>The first is an autoregressive (AR) model. Autoregressive models assume that the series depends on its own past values.</p>



<p>The second is the moving average (MA) model. Moving average models assume that the series depends on past forecast errors.</p>



<p>The combination (ARMA) is also known as the Box-Jenkins approach.</p>



<h3 class="wp-block-heading" class="wp-block-heading" id="h-arma-model-auto-regressive-ar-part">ARMA model: Auto regressive (AR) part</h3>



<p>ARMA models are often expressed using the orders <strong>p</strong> and <strong>q</strong> for the <strong>AR</strong> and <strong>MA</strong> components. For a time series variable X that we want to predict at time t, the last few observations are:</p>



<p><strong><em>X<sub>t&#8211;3</sub>, X<sub>t&#8211;2</sub>, X<sub>t&#8211;1</sub></em></strong></p>



<p><strong>AR(p)</strong> models are assumed to depend on the last p values of the time series. Let’s say <strong>p = 2</strong>, the forecast has the form:&nbsp;</p>



<figure class="wp-block-image size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/ARMA-model-1.png?resize=768%2C101&#038;ssl=1" alt="ARMA model" class="wp-image-41241" style="width:768px;height:101px" width="768" height="101"/></figure>



<p><strong>MA(q)</strong> models are assumed to depend on the last q forecast errors of the time series. Let&#8217;s say <strong>q = 2</strong>, the forecast has the form:</p>



<figure class="wp-block-image size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/ARMA-model-2.png?resize=767%2C139&#038;ssl=1" alt="ARMA model" class="wp-image-41244" style="width:767px;height:139px" width="767" height="139"/></figure>



<p>We’ll discuss what exactly these equations mean and how the errors are calculated in a while.</p>



<p>Now we combine the <strong>AR(p)</strong> and <strong>MA(q)</strong> models to yield the <strong>ARMA(p,q)</strong> model. For <strong>p = 2 </strong>and <strong>q = 2</strong>, the ARMA(2,2) forecast will be:&nbsp;</p>



<figure class="wp-block-image size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/ARMA-model-3.png?resize=767%2C137&#038;ssl=1" alt="ARMA model" class="wp-image-41245" style="width:767px;height:137px" width="767" height="137"/></figure>
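<p>To make the structure of such a forecast concrete, here is a minimal sketch that combines the AR part (weighted past values) with the MA part (weighted past forecast errors). The function name, the coefficients, and the data are all hypothetical, and the notation in the figures above may differ. In practice the coefficients are estimated from data.</p>

```python
def arma_forecast(past_values, past_errors, ar_coefs, ma_coefs, const=0.0):
    """One-step forecast: const + sum(phi_i * x[t-i]) + sum(theta_j * e[t-j])."""
    # reversed() puts the most recent value first, matching the order of the coefficients
    ar_part = sum(phi * x for phi, x in zip(ar_coefs, reversed(past_values)))
    ma_part = sum(theta * e for theta, e in zip(ma_coefs, reversed(past_errors)))
    return const + ar_part + ma_part

# Hypothetical numbers for illustration only
values = [1.0, 2.0, 3.0]     # x[t-3], x[t-2], x[t-1]
errors = [0.5, -0.5, 1.0]    # e[t-3], e[t-2], e[t-1]

forecast = arma_forecast(values, errors, ar_coefs=[0.5, 0.25], ma_coefs=[0.5, 0.25])
print(forecast)
```

<p>With two AR coefficients and two MA coefficients, only the two most recent values and the two most recent errors enter the forecast, which is what the orders p = 2 and q = 2 mean.</p>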



<p>Again, we’ll see all of this in practice in the hands-on section.</p>



<p>There are some things to keep in mind while implementing ARMA models:</p>



<ul class="wp-block-list">
<li>First, the time series is assumed to be stationary, and the regression approach will fail if we&#8217;re working with non-stationary data.</li>
</ul>



<ul class="wp-block-list">
<li>A good rule of thumb is to have at least 100 observations when fitting an ARMA model, so that we can adequately demonstrate those past autocorrelations.</li>
</ul>



<p>Now we’ll take a practical approach to understand auto-regressive models, and get a practical understanding of moving averages.</p>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-hands-on-approach">Hands-on approach</h2>



<p>One of the key concepts in the quantitative toolbox is that of mean reversion. This process refers to a time series that displays a tendency to revert to its historical mean value. Mathematically, such a (continuous) time series is referred to as an <strong>Ornstein-Uhlenbeck process.</strong></p>



<p>This is in contrast to a random walk (aka Brownian motion), which has no &#8220;memory&#8221; of where it has been at each particular instance of time.</p>



<p>The mean-reverting property of a time series can be exploited to produce better predictions.</p>



<p>A continuous mean-reverting time series can be represented by an Ornstein-Uhlenbeck stochastic differential equation:</p>



<p><strong>dX<sub>t</sub> = θ(μ − X<sub>t</sub>) dt + σ dW<sub>t</sub></strong></p>



<p>Where:</p>



<ul class="wp-block-list">
<li>θ is the rate of reversion to the mean,</li>



<li>μ is the mean value of the process,</li>



<li>σ is the volatility of the process (it scales the random noise term),</li>



<li>W<sub>t</sub> is a Wiener process, also known as Brownian motion.</li>
</ul>



<p>In a discrete setting, the equation states that the change of the price series in the next time period is proportional to the difference between the mean price and the current price, with the addition of Gaussian noise.</p>



<p>For more details, have a look <a href="https://www.quantstart.com/articles/Basics-of-Statistical-Mean-Reversion-Testing" target="_blank" rel="noreferrer noopener nofollow">here</a>. </p>
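<p>The discrete version of the mean-reverting process can be simulated in a few lines. The sketch below uses a simple Euler step, in which the next value equals the current value plus a pull toward the mean plus Gaussian noise. The function name and the parameter values are hypothetical.</p>

```python
import random

def simulate_ou(theta, mu, sigma, x0, n_steps, dt=1.0, seed=0):
    """Euler discretization: x[t+1] = x[t] + theta*(mu - x[t])*dt + sigma*sqrt(dt)*noise."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        noise = rng.gauss(0.0, 1.0)
        path.append(x + theta * (mu - x) * dt + sigma * dt ** 0.5 * noise)
    return path

# Hypothetical parameters: start far from the mean and watch the path drift back to it
path = simulate_ou(theta=0.2, mu=10.0, sigma=0.5, x0=0.0, n_steps=200)
print(path[0], sum(path[-50:]) / 50)
```

<p>The larger the rate of reversion, the faster the path returns to the mean after being pushed away by noise.</p>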



<h3 class="wp-block-heading" class="wp-block-heading" id="h-section-1-arma">Section 1: ARMA&nbsp;</h3>



<p>Enter <a href="https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average" target="_blank" rel="noreferrer noopener nofollow">Autoregressive Integrated Moving Average (ARIMA)</a> modeling. When there is autocorrelation between outcomes and their predecessors, we will see a pattern or relationship in the plot of the outcomes. This relationship can be modeled in its own right, allowing us to predict the future with a confidence level proportionate to the strength of the relationship and the proximity to known values (predictions weaken the further out we go).</p>



<ul class="wp-block-list">
<li><a href="https://www.otexts.org/fpp/8/5" target="_blank" rel="noreferrer noopener nofollow">ARIMA in R</a></li>



<li><a href="https://people.duke.edu/~rnau/411arim2.htm" target="_blank" rel="noreferrer noopener nofollow">Duke ARIMA Guide</a></li>



<li><a href="http://stats.stackexchange.com/questions/164824/moving-average-ma-process-numerical-intuition" target="_blank" rel="noreferrer noopener nofollow">Great explanation on MA in practice</a></li>
</ul>



<p>For second-order stationary data (both mean and variance are constant: <strong>μ<sub>t</sub> = μ</strong> and <strong>σ<sub>t</sub><sup>2</sup> = σ<sup>2</sup></strong> for all <strong>t</strong>), the autocovariance is expressed as a function only of the time lag <strong>k</strong>:</p>



<p><strong>γ<sub>k</sub> = E[(X<sub>t</sub> − μ)(X<sub>t+k</sub> − μ)]</strong></p>



<p>Therefore, the autocorrelation function is defined as:</p>



<p><strong>ρ<sub>k</sub> = γ<sub>k</sub> / σ<sup>2</sup></strong></p>



<p>We use the plot of these values at different lags to determine optimal ARIMA parameters. Notice how phi changes the process.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/ARIMA-parameters.png?ssl=1" alt="ARIMA parameters" class="wp-image-41247"/></figure>
</div>
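<p>The autocorrelation function described above, that is, the autocovariance at a given lag divided by the variance, can be estimated from a sample as follows. This is a minimal sketch with a hypothetical function name and a toy series.</p>

```python
def sample_acf(series, max_lag):
    """Sample autocorrelation: autocovariance at each lag divided by the lag-0 variance."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / n
    acf = []
    for k in range(max_lag + 1):
        autocov = sum((series[t] - mean) * (series[t + k] - mean)
                      for t in range(n - k)) / n
        acf.append(autocov / variance)
    return acf

# Hypothetical alternating series: negative correlation at lag 1, positive at lag 2
series = [1.0, -1.0] * 10
print(sample_acf(series, max_lag=2))
```

<p>Plotting these values against the lag gives the kind of chart that is used to choose the model orders.</p>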


<h3 class="wp-block-heading" class="wp-block-heading" id="h-section-2-autoregressive-ar-models">Section 2: Autoregressive (AR) Models</h3>



<p><strong>Autocorrelation:</strong> a variable&#8217;s correlation with itself at different lags.</p>



<p>AR models regress on actual past values.</p>



<p>This is the first order or <strong>AR(1)</strong> formula you should know:&nbsp;</p>



<p><strong>X<sub>t</sub> = β<sub>0</sub> + β<sub>1</sub>X<sub>t−1</sub> + ϵ<sub>t</sub></strong></p>



<p>The β&#8217;s are just like those in linear regression and ϵ is an irreducible error.</p>



<p>A second-order or <strong>AR(2)</strong> would look like this:&nbsp;</p>



<p><strong>X<sub>t</sub> = β<sub>0</sub> + β<sub>1</sub>X<sub>t−1</sub> + β<sub>2</sub>X<sub>t−2</sub> + ϵ<sub>t</sub></strong></p>



<p>We&#8217;ll generate our data to gain insight into how AR models work.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> numpy <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> np
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">as</span> plt

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># reproducibility</span>
np.random.seed(<span class="hljs-number" style="color: teal;">123</span>)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># create autocorrelated data</span>
time = np.arange(<span class="hljs-number" style="color: teal;">100</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># assuming 0 mean</span>
ar1_sample = np.zeros(<span class="hljs-number" style="color: teal;">100</span>)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Set our first number to a random value with expected mean of 0 and standard deviation of 2.5</span>
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># (drawing a scalar avoids assigning a length-1 array to a single element)</span>
ar1_sample[<span class="hljs-number" style="color: teal;">0</span>] += np.random.normal(loc=<span class="hljs-number" style="color: teal;">0</span>, scale=<span class="hljs-number" style="color: teal;">2.5</span>)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Set every value thereafter as 0.7 * the last term plus a random error</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> t <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> time[<span class="hljs-number" style="color: teal;">1</span>:]:
    ar1_sample[t] = (<span class="hljs-number" style="color: teal;">0.7</span> * ar1_sample[t<span class="hljs-number" style="color: teal;">-1</span>]) + np.random.normal(loc=<span class="hljs-number" style="color: teal;">0</span>, scale=<span class="hljs-number" style="color: teal;">2.5</span>)

plt.fill_between(time, ar1_sample)</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Autoregressive-Models.png?ssl=1" alt="Autoregressive (AR) Models" class="wp-image-41206"/></figure>
</div>


<p>Here we fit a model to the generated data to show that we recover a process that is approximately AR(1) with phi ≈ 0.7.</p>



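<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># sm.tsa.ARMA was removed in statsmodels 0.13; ARIMA with d=0 fits the same ARMA(1, 0) model</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">from</span> statsmodels.tsa.arima.model <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">import</span> ARIMA

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># trend='n' fits no constant term, like trend='nc' in the old API</span>
model = ARIMA(ar1_sample, order=(<span class="hljs-number" style="color: teal;">1</span>, <span class="hljs-number" style="color: teal;">0</span>, <span class="hljs-number" style="color: teal;">0</span>), trend=<span class="hljs-string" style="color: rgb(221, 17, 68);">'n'</span>).fit()
model.params</pre>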
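<p>As a cross-check that does not depend on any modeling library, the AR(1) coefficient can also be estimated with ordinary least squares by regressing each value on the previous one. The sketch below regenerates an AR(1) series in plain Python. The seed, the series length, and the function name are assumptions for illustration, so the numbers will not match the NumPy sample above.</p>

```python
import random

def estimate_ar1(series):
    """Least-squares slope of x[t] on x[t-1], without an intercept (zero-mean assumption)."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den

# Regenerate an AR(1) series with phi = 0.7 and noise standard deviation 2.5
rng = random.Random(123)
sample = [rng.gauss(0, 2.5)]
for _ in range(4999):
    sample.append(0.7 * sample[-1] + rng.gauss(0, 2.5))

phi_hat = estimate_ar1(sample)
print(phi_hat)  # should land close to 0.7
```

<p>A longer series gives an estimate closer to the true coefficient, which is one reason the rule of thumb of at least 100 observations exists.</p>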



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># create autocorrelated data</span>
np.random.seed(<span class="hljs-number" style="color: teal;">112</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Mean is again 0</span>
ar2_sample = np.zeros(<span class="hljs-number" style="color: teal;">100</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Set first two values to random values with expected mean of 0 and standard deviation of 2.5</span>
ar2_sample[<span class="hljs-number" style="color: teal;">0</span>:<span class="hljs-number" style="color: teal;">2</span>] += np.random.normal(loc=<span class="hljs-number" style="color: teal;">0</span>, scale=<span class="hljs-number" style="color: teal;">2.5</span>, size=<span class="hljs-number" style="color: teal;">2</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># Set future values as 0.3 times the prior value plus 0.3 times the value two prior, plus a random error</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> t <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> time[<span class="hljs-number" style="color: teal;">2</span>:]:
    ar2_sample[t] = (<span class="hljs-number" style="color: teal;">0.3</span> * ar2_sample[t<span class="hljs-number" style="color: teal;">-1</span>]) + (<span class="hljs-number" style="color: teal;">0.3</span> * ar2_sample[t<span class="hljs-number" style="color: teal;">-2</span>]) + np.random.normal(loc=<span class="hljs-number" style="color: teal;">0</span>, scale=<span class="hljs-number" style="color: teal;">2.5</span>)

plt.fill_between(time, ar2_sample)</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Autoregressive-Models-2.png?ssl=1" alt="Autoregressive Models" class="wp-image-41208"/></figure>
</div>


<h3 class="wp-block-heading" class="wp-block-heading" id="h-section-3-moving-averagema-models">Section 3: Moving Average(MA) models</h3>



<h4 class="wp-block-heading">MA Model Specifics</h4>



<p>An MA model is defined by this equation:&nbsp;</p>



<p><strong>X<sub>t</sub> = c + ϵ<sub>t</sub> + θ<sub>1</sub>ϵ<sub>t−1</sub> + θ<sub>2</sub>ϵ<sub>t−2</sub> + ⋯ + θ<sub>q</sub>ϵ<sub>t−q</sub></strong></p>



<p>Where:</p>



<ul class="wp-block-list">
<li>ϵ<sub>t</sub> is the white noise value,</li>



<li>c is a constant value,</li>



<li>θ&#8217;s are coefficients, not unlike those found in linear regression.</li>
</ul>



<h4 class="wp-block-heading">MA Models != Moving Average Smoothing</h4>



<p>An important distinction is that a moving average model is not the same thing as moving average smoothing. Smoothing reduces the noise in a series that has already been observed. Moving average models, however, are a completely different beast.</p>



<p>Moving average smoothing is useful for estimating the trend and seasonality of past data. <a href="https://otexts.com/fpp2/MA.html" target="_blank" rel="noreferrer noopener nofollow">MA models</a>, on the other hand, are forecasting models that regress on past forecast errors to predict future values.&nbsp;</p>



<p>It’s easy to lump the two techniques together, but they serve very different functions. Thus, a moving-average model is conceptually a linear regression of the current value of the series against current and previous (unobserved) white noise error terms or random shocks.&nbsp;</p>



<p>The random shocks at each point are assumed to be mutually independent and to come from the same distribution, typically a normal distribution, with a location at zero and constant scale.</p>



<p>We&#8217;ll generate our data so we know the generative process for an MA series.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># reproducibility</span>
np.random.seed(<span class="hljs-number" style="color: teal;">12</span>)

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># create autocorrelated data</span>
time = np.arange(<span class="hljs-number" style="color: teal;">100</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#mean 0</span>
ma1_sample = np.zeros(<span class="hljs-number" style="color: teal;">100</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#create vector of random normally distributed errors</span>
error = np.random.normal(loc=<span class="hljs-number" style="color: teal;">0</span>, scale=<span class="hljs-number" style="color: teal;">2.5</span>, size=<span class="hljs-number" style="color: teal;">100</span>)
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># set first value to one of the random errors</span>
ma1_sample[<span class="hljs-number" style="color: teal;">0</span>] += error[<span class="hljs-number" style="color: teal;">0</span>]

<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;">#set future values to 0.4 times error of prior value plus the current error term</span>
<span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">for</span> t <span class="hljs-keyword" style="color: rgb(51, 51, 51); font-weight: 700;">in</span> time[<span class="hljs-number" style="color: teal;">1</span>:]:
    ma1_sample[t] = (<span class="hljs-number" style="color: teal;">0.4</span> * error[t<span class="hljs-number" style="color: teal;">-1</span>]) + error[t]

plt.fill_between(time,ma1_sample)</pre>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Moving-Average-models.png?ssl=1" alt="Moving Average models" class="wp-image-41210"/></figure>
</div>


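<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);"><span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># find model params for generated sample</span>
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># note: sm.tsa.ARMA was removed in statsmodels 0.13; on newer versions the closest call is</span>
<span class="hljs-comment" style="color: rgb(153, 153, 136); font-style: italic;"># statsmodels.tsa.arima.model.ARIMA(ma1_sample, order=(0, 0, 1), trend='n').fit()</span>
model = sm.tsa.ARMA(ma1_sample, (<span class="hljs-number" style="color: teal;">0</span>, <span class="hljs-number" style="color: teal;">1</span>)).fit(trend=<span class="hljs-string" style="color: rgb(221, 17, 68);">'nc'</span>, disp=<span class="hljs-number" style="color: teal;">0</span>)
model.params</pre>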



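<p>Out: array([0.34274651])</p>



<p>The estimated coefficient, about 0.34, is reasonably close to the value of 0.4 that we used to generate the series.</p>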
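<p>As a rough cross-check that does not depend on a fitting routine, the MA(1) coefficient can also be recovered from the lag-1 autocorrelation, because an MA(1) process satisfies ρ<sub>1</sub> = θ<sub>1</sub> / (1 + θ<sub>1</sub><sup>2</sup>). The sketch below builds a series with the same recipe as above, but with a longer sample and a different seed (both assumptions made here to keep the estimate stable), and inverts that relation. The helper names are hypothetical.</p>

```python
import numpy as np

def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.sum(x[1:] * x[:-1]) / np.sum(x * x))

def ma1_theta_from_rho(rho1):
    """Invert rho1 = theta / (1 + theta**2), keeping the root with |theta| < 1."""
    if abs(rho1) >= 0.5:
        raise ValueError("an MA(1) process requires |rho1| < 0.5")
    if rho1 == 0:
        return 0.0
    return float((1 - np.sqrt(1 - 4 * rho1**2)) / (2 * rho1))

rng = np.random.default_rng(0)             # assumed seed, illustration only
error = rng.normal(loc=0, scale=2.5, size=5000)
ma1_long = error.copy()
ma1_long[1:] += 0.4 * error[:-1]           # y_t = e_t + 0.4 * e_{t-1}

rho1 = lag1_autocorr(ma1_long)
theta_hat = ma1_theta_from_rho(rho1)
print(round(rho1, 3), round(theta_hat, 3))
```

<p>This moment-based estimate is generally less efficient than a maximum likelihood fit, so treat it as a sanity check rather than a replacement for the fitted model.</p>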



<h3 class="wp-block-heading" class="wp-block-heading" id="h-section-3-the-autocorrelation-function-acf">Section 3: The Autocorrelation Function (ACF)</h3>



<p>There&#8217;s a crucial question we need to answer: how do you choose the orders (p and q) of a time series model?</p>



<p>To answer that question, we need to understand the Autocorrelation Function (ACF). Let&#8217;s start by showing an example ACF plot for our different simulated series.</p>



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">fig = sm.tsa.graphics.plot_acf(ar1_sample, lags=range(<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">30</span>), alpha=<span class="hljs-number" style="color: teal;">0.05</span>,title = <span class="hljs-string" style="color: rgb(221, 17, 68);">'ar1 ACF'</span>)
fig = sm.tsa.graphics.plot_acf(ma1_sample, lags=range(<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">15</span>), alpha=<span class="hljs-number" style="color: teal;">0.05</span>,title = <span class="hljs-string" style="color: rgb(221, 17, 68);">'ma1 ACF'</span>)</pre>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow"><div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Autocorrelation-Function-1.png?ssl=1" alt="Autocorrelation Function" class="wp-image-41213"/></figure>
</div></div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow"><div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Autocorrelation-Function-2.png?ssl=1" alt="Autocorrelation Function" class="wp-image-41215"/></figure>
</div></div>
</div>



<p>An explanation is in order. First, the blue region represents a confidence interval. Alpha, in this case, was set to 0.05 (a 95% confidence interval). It can be set to any float between 0 and 1; see the documentation of the <strong>plot_acf</strong> function for details.</p>



<p>The stems represent lagged correlation values. In other words, a lag of 1 shows the correlation with the prior endogenous value, a lag of 2 shows the correlation with the value two steps back, and so on. Remember that we&#8217;re regressing on past values of the series; that&#8217;s the correlation we&#8217;re inspecting here.</p>



<p>Correlations outside of the confidence interval are statistically significant, whereas the others are not.</p>
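<p>To make the significance rule concrete, here is a minimal NumPy sketch that computes the sample autocorrelations by hand and compares them with the approximate 95% white-noise band of ±1.96/√N. This is a simplification: <strong>plot_acf</strong> may draw a slightly different band, since it supports Bartlett&#8217;s formula, under which the band widens with the lag. The function name <code>sample_acf</code>, the seed, and the simulated series are assumptions for illustration.</p>

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x * x)
    return np.array([np.sum(x[k:] * x[:len(x) - k]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(1)             # assumed seed
noise = rng.normal(size=400)
series = noise.copy()
series[1:] += 0.8 * noise[:-1]             # illustrative MA(1) with theta = 0.8

acf = sample_acf(series, max_lag=10)
band = 1.96 / np.sqrt(len(series))         # approx. 95% band under white noise
significant = [k for k in range(1, 11) if abs(acf[k]) > band]
print("band: +/-%.3f" % band)
print("significant lags:", significant)
```

<p>Keep in mind that with a 95% band, roughly one lag in twenty can cross the line by chance even when no real correlation exists.</p>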



<p>Note that in an AR process, if lag 1 shows strong autocorrelation, lag 2 will show strong autocorrelation as well: each value is correlated with the one before it, which is in turn correlated with the one before that, and so on. That’s why the ACF of the ar1 series decays slowly.</p>



<p>If we look at the equations, we can see why autocorrelation propagates in AR(1) models:</p>



<ul class="wp-block-list">
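<li>y<sub>t</sub> = φ<sub>0</sub> + φ<sub>1</sub>y<sub>t−1</sub> + ε<sub>t</sub></li>



<li>y<sub>t−1</sub> = φ<sub>0</sub> + φ<sub>1</sub>y<sub>t−2</sub> + ε<sub>t−1</sub></li>



<li>Substituting the second into the first: y<sub>t</sub> = φ<sub>0</sub> + φ<sub>1</sub>φ<sub>0</sub> + φ<sub>1</sub><sup>2</sup>y<sub>t−2</sub> + φ<sub>1</sub>ε<sub>t−1</sub> + ε<sub>t</sub></li>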
</ul>



<p>The past errors will propagate into the future, leading to the slowly decaying plot we just mentioned.</p>
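<p>One way to see this propagation is to feed a single unit shock through the AR(1) recursion and watch how long it lingers. The sketch below assumes illustrative coefficients (a constant of 0 and a lag coefficient of 0.8); the variable names are hypothetical.</p>

```python
import numpy as np

phi0, phi1 = 0.0, 0.8                      # assumed illustrative coefficients
n = 10

# Feed one unit shock at t = 0 through y_t = phi0 + phi1 * y_{t-1} + e_t.
shocks = np.zeros(n)
shocks[0] = 1.0
y = np.zeros(n)
y[0] = phi0 + shocks[0]
for t in range(1, n):
    y[t] = phi0 + phi1 * y[t - 1] + shocks[t]

# The shock's footprint k steps later is phi1**k: it fades but never fully vanishes.
expected = phi1 ** np.arange(n)
print(np.round(y, 4))
```

<p>For a stationary AR(1) process, the theoretical autocorrelation at lag k follows the same geometric pattern, which matches the gradual decay in the ar1 plot.</p>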



<p>For MA(1) models:</p>



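<p>y<sub>t</sub> = c + θ<sub>1</sub>ε<sub>t−1</sub> + ε<sub>t</sub></p>



<p>Only the immediately preceding error affects the current value, so the autocorrelation cuts off after lag 1.</p>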



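<p>So an easy way to tell an AR(1) series from an MA(1) series is to check whether the correlation carries over from one lag to the next: the ACF of an AR(1) series decays gradually, while the ACF of an MA(1) series drops to insignificance after lag 1.</p>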
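<p>The contrast shows up clearly in the theoretical autocorrelations of the two processes. The sketch below evaluates the standard closed-form expressions for a stationary AR(1) and for an MA(1); the function names and the coefficient values are assumptions chosen for illustration.</p>

```python
import numpy as np

def ar1_acf(phi1, max_lag):
    """Theoretical ACF of a stationary AR(1): rho_k = phi1**k."""
    return phi1 ** np.arange(max_lag + 1)

def ma1_acf(theta1, max_lag):
    """Theoretical ACF of an MA(1): theta1 / (1 + theta1**2) at lag 1, zero beyond it."""
    acf = np.zeros(max_lag + 1)
    acf[0] = 1.0
    if max_lag >= 1:
        acf[1] = theta1 / (1 + theta1**2)
    return acf

print("AR(1):", np.round(ar1_acf(0.8, 5), 3))
print("MA(1):", np.round(ma1_acf(0.4, 5), 3))
```

<p>Sample ACF plots are noisier than these idealized values, but the same shapes (geometric decay versus a sharp cutoff) are what to look for.</p>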



<pre class="hljs" style="display: block; overflow-x: auto; padding: 0.5em; color: rgb(51, 51, 51); background: rgb(248, 248, 248);">fig = sm.tsa.graphics.plot_acf(ar2_sample, lags=range(<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">15</span>), alpha=<span class="hljs-number" style="color: teal;">0.05</span>,title = <span class="hljs-string" style="color: rgb(221, 17, 68);">'ar2 ACF'</span>)
fig = sm.tsa.graphics.plot_acf(ma2_sample, lags=range(<span class="hljs-number" style="color: teal;">1</span>,<span class="hljs-number" style="color: teal;">15</span>), alpha=<span class="hljs-number" style="color: teal;">0.05</span>,title = <span class="hljs-string" style="color: rgb(221, 17, 68);">'ma2 ACF'</span>)</pre>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow"><div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Autocorrelation-Function-3.png?ssl=1" alt="Autocorrelation Function" class="wp-image-41217"/></figure>
</div></div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow"><div class="wp-block-image">
<figure class="aligncenter size-large"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/neptune.ai/wp-content/uploads/2022/10/Autocorrelation-Function-4.png?ssl=1" alt="Autocorrelation Function" class="wp-image-41219"/></figure>
</div></div>
</div>



<h2 class="wp-block-heading" class="wp-block-heading" id="h-summary">Summary</h2>



<p>In this post, we explored what time series forecasting is and what its important components are, i.e., the constituent parts that a time series can be decomposed into during analysis.</p>



<p>We also went through different types of forecasting, and dove into moving averages, stationary models, and how to plot time series using Python.</p>



<p>In the next article, we’ll focus on how to model time series data using ARIMA, SARIMA, and FB Prophet. Thanks for reading!</p>



<p><strong>Reference:</strong></p>



<ul class="wp-block-list">
<li><a href="https://www.ibm.com/support/knowledgecenter/SSLVMB_23.0.0/spss/tutorials/timeseriesmodeling_gateway.html">https://www.ibm.com/support/knowledgecenter/SSLVMB_23.0.0/spss/tutorials/timeseriesmodeling_gateway.html</a></li>



<li><a href="https://machinelearningmastery.com/time-series-forecasting-methods-in-python-cheat-sheet/">https://machinelearningmastery.com/time-series-forecasting-methods-in-python-cheat-sheet/</a></li>
</ul>



<p><strong>Images reference:</strong></p>



<ul class="wp-block-list">
<li><a href="https://www.ibm.com/support/knowledgecenter/vi/SSLVMB_23.0.0/spss/trends/idh_idd_tab_vars.html" target="_blank" rel="noreferrer noopener">https://www.ibm.com/support/knowledgecenter/vi/SSLVMB_23.0.0/spss/trends/idh_idd_tab_vars.html</a></li>



<li><a href="https://developer.ibm.com/exchanges/models/all/max-weather-forecaster/" target="_blank" rel="noreferrer noopener nofollow">https://developer.ibm.com/exchanges/models/all/max-weather-forecaster/</a> </li>
</ul>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">3928</post-id>	</item>
	</channel>
</rss>
