How To Run a Stock Market Index Prediction Model for the S&P 500 Index on UbiOps

The IT spending of financial institutions all over the world is steadily increasing and is expected to exceed $750 billion by 2025. This is partly due to the significant increase in the development and deployment of AI systems, which can process large volumes of data quickly and at scale. Some examples of applications of AI in finance are:

  • Financial markets predictions (including the use of alternative data sets)
  • Fraud and anti-money laundering detection and prevention
  • Trading execution
  • Customer services (chatbots)
  • Investment advice (robo advisors)

An application of AI that we will focus on in this article is the prediction of financial time series on UbiOps. The focus is not on the actual model itself, as developing a model for predicting financial time series requires significant research, which is outside the scope of this article.

Forecasting financial time series such as stock market prices is complex, but rapid advances in machine learning techniques such as deep neural networks, the availability of large-scale data, and increased computational capabilities create new possibilities for applying sophisticated ML models to predict stock market prices. Machine learning models make few assumptions about the structure of the underlying data and are therefore better able to find complex, non-linear patterns than traditional time series models. Here we will use such a model to predict prices of the S&P 500. These AI systems can be automated for production using good MLOps practices.

Stock Market Index Prediction

The future price of an index depends on a lot of different factors, such as the current state of the financial markets (liquidity, trading activity), fundamental data (revenue, costs, borrowings, etc. of companies), industry-specific data, and macroeconomic data like interest rates, world events, and inflation.

A lot of these factors have a non-linear impact on index prices, and also impact each other. Financial time series data also typically has a low signal-to-noise ratio and is generally non-stationary. Predicting future index price returns is therefore very complex and requires a lot of data, which is where AI excels. 

Example: Applying LSTM for S&P 500 Index Prediction

What is an LSTM model?

The model that we will implement on UbiOps in this article is a Long Short-Term Memory (LSTM) neural network. LSTM models are a type of recurrent neural network (RNN), a class of neural networks in which the connections between nodes form cycles. RNNs are therefore well suited to sequential input data, which is why they are frequently used in natural language processing or, as we will do here, financial time series! They retain information from earlier elements in the sequence to influence the eventual output. The following diagram depicts a cell in a standard RNN as well as a cell in an LSTM network.

Deep RNNs are prone to the exploding/vanishing gradient problem, which is essentially caused by repeatedly multiplying numbers that are larger than 1 (exploding) or smaller than 1 (vanishing). LSTMs address this problem by determining, for each value in the input sequence, whether it is relevant enough for the output. This article by Rian Dolphin explains very well how they do that.
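A quick worked example makes the effect concrete. The factors 0.9 and 1.1 and the sequence length of 100 are arbitrary illustrative values, not numbers from a real network:

```python
# Backpropagating through many timesteps multiplies many similar factors.
steps = 100

shrink = 0.9 ** steps  # factor slightly below 1: the product vanishes (about 2.7e-05)
grow = 1.1 ** steps    # factor slightly above 1: the product explodes (about 1.4e+04)

print(f"0.9^{steps} = {shrink:.2e}")
print(f"1.1^{steps} = {grow:.2e}")
```

Even factors this close to 1 leave almost nothing, or far too much, of the original signal after 100 steps.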

LSTM models fix the exploding/vanishing gradient problem

In short, LSTM networks pass the input data through different gates to determine its relevance. The forget gate determines which part of the long-term memory of the network is still relevant, the input gate determines what new information should be added to the long-term memory, and the output gate determines the output using the short-term and long-term memory. The combination of these three gates is called a cell. The same cell is applied at every timestep of the input sequence, so an unrolled diagram of the network shows one cell per timestep. The final hidden state goes through conventional (dense) layers of neurons before it reaches the output layer. The dimension of the hidden state equals the number of units in the LSTM layer, which is also the input size of the next dense layer.
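The gate logic described above can be sketched in a few lines of plain Python. This is a minimal illustration for a single hidden unit and a single input feature, using the standard LSTM update equations. The weights are made-up values, and a real model such as the 150-unit network used later would use vectors and matrices instead of scalars.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One timestep of a single-unit LSTM cell.

    x: input at this timestep, h_prev: short-term memory (hidden state),
    c_prev: long-term memory (cell state),
    params: hypothetical (input weight, recurrent weight, bias) per gate.
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # share of long-term memory to keep
    i = gate("input", sigmoid)         # share of the candidate to store
    g = gate("candidate", math.tanh)   # candidate new information
    o = gate("output", sigmoid)        # share of the memory to expose

    c = f * c_prev + i * g             # updated long-term memory
    h = o * math.tanh(c)               # updated short-term memory / output
    return h, c


# Made-up weights, identical for every gate, purely for illustration.
params = {name: (0.5, 0.1, 0.0)
          for name in ("forget", "input", "candidate", "output")}

h, c = 0.0, 0.0
for x in [0.1, 0.4, -0.2]:   # the same cell is reused at every timestep
    h, c = lstm_step(x, h, c, params)
```

Because the cell state is updated additively (`f * c_prev + i * g`), a forget gate close to 1 lets information pass through many timesteps almost unchanged, which is what keeps the gradient from vanishing.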

Why Use an LSTM Model for S&P 500 Index Prediction?

Because they can efficiently use long sequences of time series data, LSTM networks have been a popular choice of ML model for financial time series. LSTMs are also easy to implement in Python, and the resulting code is easily uploaded as a Deployment, as described below.

Training and deploying an LSTM model takes a lot of computing power. The parallel processing capabilities of GPUs can accelerate both training and inference, and GPUs are the de facto standard for LSTM deployment. The cost of building and maintaining the computing infrastructure to perform these tasks is high, because GPUs and ML engineers are expensive and scarce! UbiOps can save you effort and money with our production-grade, powerful platform with on-demand, auto-scalable access to GPUs suitable for large training jobs as well as for low-latency inferencing.

LSTM on UbiOps for Stock Market Index Prediction

Now let’s deploy our LSTM model for stock market index prediction to UbiOps! Deploying an ML model to UbiOps is fast and simple, and we offer support in the deployment and maintenance of your model. We will build a simple LSTM based on this article by Hum Nath Bhandari et al.

What datasets will we apply it to?

In the article by Bhandari et al., the authors search for an LSTM structure and hyperparameters that perform optimally for S&P 500 data. The S&P 500 is a popular US stock market index. Bhandari et al. find that for this dataset, the best-performing LSTM model has a single layer with 150 neurons (with the Adagrad optimizer, a learning rate of 0.01, and a batch size of 16). We will use the same setup and apply it to a dataset of daily data from the past three years. For each day, the dataset comprises the open price (Open) and close price of the index, the US effective federal funds rate (EFFR), the US dollar index (USDX), and three technical indicators: the moving average convergence divergence (MACD), average true range (ATR), and relative strength index (RSI).
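As a rough sketch of this setup in TensorFlow Keras: the layer size, optimizer, learning rate, and batch size come from the text above, while the function name, the window length `timesteps`, the MSE loss, and the single dense output are assumptions made for illustration and may differ from the actual training script.

```python
# Hyperparameters as reported above (Bhandari et al.).
HPARAMS = {"units": 150, "learning_rate": 0.01, "batch_size": 16}


def build_lstm(timesteps, n_features):
    """Single-layer LSTM followed by one dense output (next close price)."""
    # Imported here so this file can be loaded without TensorFlow installed.
    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Input(shape=(timesteps, n_features)),
        keras.layers.LSTM(HPARAMS["units"]),
        keras.layers.Dense(1),
    ])
    model.compile(
        optimizer=keras.optimizers.Adagrad(
            learning_rate=HPARAMS["learning_rate"]),
        loss="mse",  # assumed loss function
    )
    return model


# Example usage (requires TensorFlow). timesteps=5 is an arbitrary choice;
# n_features=7 matches the seven inputs listed above.
# model = build_lstm(timesteps=5, n_features=7)
# model.fit(X_train, y_train, batch_size=HPARAMS["batch_size"], epochs=n_epochs)
```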

The first four are publicly available, and the three technical indicators are easily calculated from the open, daily high, daily low, and close prices of the S&P 500 index using the pandas_ta library in Python. Before using the S&P 500 index data, we denoised it with a Haar wavelet transform.
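To give an idea of what Haar wavelet denoising does, here is a minimal single-level sketch in plain Python. It is not the exact procedure used for this article: the number of decomposition levels, the soft-thresholding rule, and the threshold value are all assumptions, and the prices are made up.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_denoise(values, threshold):
    """Single-level Haar wavelet denoising with soft thresholding."""
    if len(values) % 2:
        raise ValueError("expected an even number of samples")
    out = []
    for a, b in zip(values[0::2], values[1::2]):
        approx = (a + b) / SQRT2  # local average: the trend
        detail = (a - b) / SQRT2  # local difference: mostly noise
        # Soft threshold: shrink the detail coefficient toward zero.
        detail = math.copysign(max(abs(detail) - threshold, 0.0), detail)
        # Inverse Haar transform back to the two samples.
        out += [(approx + detail) / SQRT2, (approx - detail) / SQRT2]
    return out


prices = [100.0, 102.0, 101.0, 105.0]  # made-up values
smoothed = haar_denoise(prices, threshold=10.0)
```

With a threshold of 0 the series is reconstructed unchanged. With a threshold larger than every detail coefficient, each pair of samples collapses to its mean.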

LSTM implementation and deploying to UbiOps

This is easily implemented in Python. The script below also trains the model, which is possible on the UbiOps platform as well.

https://gist.github.com/AlexanderNeutel/271dc9de86ef77dbc7ffb51654bf7b08

We import the required libraries, read the data with pandas, and split it into a training set and a test set. Because we use the hyperparameters from Bhandari et al., we can immediately create and train the model with TensorFlow Keras.
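The data preparation step might look roughly like the sketch below. The helper names, the window length, and the toy data are hypothetical. The point is that a time series is split chronologically rather than shuffled, and that each training sample is a window of past days paired with the next day's target.

```python
def chronological_split(rows, test_size):
    """Keep the last `test_size` rows for testing; never shuffle a time series."""
    if test_size <= 0:
        raise ValueError("test_size must be positive")
    return rows[:-test_size], rows[-test_size:]


def make_windows(rows, timesteps, target_col):
    """Pair each window of `timesteps` rows with the next row's target value."""
    X, y = [], []
    for end in range(timesteps, len(rows)):
        X.append(rows[end - timesteps:end])  # the past `timesteps` days
        y.append(rows[end][target_col])      # one-day-ahead target
    return X, y


# Toy data: 10 days with 2 features each (made-up numbers).
rows = [[float(day), float(day) * 2] for day in range(10)]

train, test = chronological_split(rows, test_size=3)
X_train, y_train = make_windows(train, timesteps=3, target_col=1)
```

Any scaling should be fitted on the training rows only and then applied to the test rows, so that no information from the test period leaks into training.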

Finally, we save the trained model to UbiOps so that we can use the model for inferencing. 

Deploying an LSTM Model to UbiOps

Now that we have a pre-trained model for one-period-ahead forecasts, we can upload it to UbiOps and start using it. Deploying a model to UbiOps is easy. All you need to do is create a deployment package that contains a Python file with your model code and a requirements.txt file that specifies the libraries UbiOps needs to install to run your code. UbiOps then creates a microservice of the model with its own API endpoint.

Deployment.py

https://gist.github.com/AlexanderNeutel/acf285b345ea516fb1257893db64b198

The deployment.py file contains a Deployment class that UbiOps uses to handle model initialization and requests. The __init__() function runs when your model is initialized on UbiOps, and the request() function runs when you create a request to use your model. 

The template for this code can be found in the documentation. You only need to add the code that is unique to your use case. In ours, the initialization function loads our model and the trained model parameters (weights and biases) in memory. 

Now that we have a trained model, we want to be able to use it for forecasting on the most recent data by making a request to the API endpoint of our deployed model. The request function finds the relevant data online, formats it, scales it, and then returns the predicted value(s).
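The overall shape of such a deployment.py is sketched below. The `Deployment` class with its `__init__` and `request` methods follows the structure described above. Everything else is a placeholder: the two helper functions, the `BUCKET_NAME` environment variable name, and the `prediction` output field are hypothetical stand-ins for the real model loading, data download, and scaling code.

```python
import os


def load_trained_model(bucket_name):
    """Hypothetical stand-in for fetching and loading model.h5 from a bucket."""
    # A trivial "model": predict the mean of the last column of the window.
    return lambda window: sum(row[-1] for row in window) / len(window)


def fetch_latest_window():
    """Hypothetical stand-in for downloading, formatting and scaling recent data."""
    return [[0.1, 0.2], [0.2, 0.3], [0.3, 0.4]]


class Deployment:
    def __init__(self, base_directory, context):
        # Runs once when the deployment instance starts: load the model in memory.
        bucket_name = os.environ.get("BUCKET_NAME", "")  # assumed variable name
        self.model = load_trained_model(bucket_name)

    def request(self, data):
        # Runs for every request. `data` is empty here because the
        # deployment collects its own input.
        window = fetch_latest_window()
        return {"prediction": float(self.model(window))}


# Local smoke test, outside UbiOps:
deployment = Deployment(base_directory=".", context={})
result = deployment.request({})
```

Keeping the model loading in `__init__` means it happens once per instance rather than on every request, which keeps request latency low.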

Requirements.txt

https://gist.github.com/AlexanderNeutel/42809615ca764cf785663583e6a50c38

The requirements file is a simple text file to specify the libraries required for our deployment to run.

Deploying to UbiOps

To deploy our model, we first zip the files deployment.py and requirements.txt into a deployment package, which we will upload later. We then log in to UbiOps and click Create new project. Let’s call it “lstm-on-ubiops”.

The first step in creating a Deployment is to create an API token for it. We create an API token in the Permissions section on the left-hand side of the screen. We will need this token in a later step, so we copy it somewhere we can easily find it later.

Next, we create a new bucket, via the Storage page, to hold the trained model file. Go to Storage -> Create new bucket, give it a name, and create it. Once it’s created, we can upload the model.h5 file that contains our trained LSTM model weights.

Next, we go to Deployments, click Create, and go through two screens of options. In the first, you give your deployment a name and description, and define the type of input and output the deployment will take and give. In our case, the input is empty, because our request function loads the data from the internet. Here, we also connect the deployment to the bucket we just made. We set it to read only.

In the next screen, we drag our zipped deployment package into the drag-and-drop field and open the optional/advanced settings. We need to add two environment variables: the UbiOps API token and the bucket name.

We now paste the API token we created earlier into the environment variable we created for it. For the bucket name variable, we enter the name of the bucket we created earlier. There are a few more settings we can alter, like scaling and resource settings, but the default ones will work just fine in this case.

And that’s it! We click Create and UbiOps will do the rest of the setup for us. It will install all the packages and save our deployment, so it can be quickly started up whenever you want. If we now navigate to our deployment, we can click Create request to use our model. It is also possible to create requests using the UbiOps API.
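As an illustration of the API route, the sketch below builds (but does not send) an HTTP request with Python's standard library. The base URL, the endpoint path, and the `Token` authorization scheme reflect our understanding of the UbiOps REST API and should be checked against the current API documentation. The deployment name `lstm-deployment` and the token are placeholders.

```python
import json
import urllib.request

API_HOST = "https://api.ubiops.com/v2.1"  # assumed base URL; verify in the docs


def build_request(project, deployment, token, data):
    """Build a POST request that asks a deployment for a prediction."""
    url = f"{API_HOST}/projects/{project}/deployments/{deployment}/requests"
    return urllib.request.Request(
        url,
        data=json.dumps(data).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The input is empty because the deployment fetches its own data.
req = build_request("lstm-on-ubiops", "lstm-deployment", "<YOUR_API_TOKEN>", {})

# To actually send it:
# with urllib.request.urlopen(req) as response:
#     print(json.load(response))
```

UbiOps also provides a Python client library that wraps these calls, which is usually more convenient than building the HTTP requests by hand.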

Results

Now, let’s look at how our model performs. We have not yet exposed our model to the final fifty days of our S&P 500 data. We will create a request on UbiOps with this data and look at the LSTM’s predictions of the one-day ahead S&P 500 closing prices for these days. We’ve graphed the one-day ahead forecasts of the final fifty days, along with the true data, below. You can see it works well!
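To put a number on "works well", forecasts like these are commonly scored with error metrics such as the root mean squared error (RMSE) and the mean absolute percentage error (MAPE). The sketch below uses made-up prices, not the results from this article:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the prices."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 102.0, 101.0]     # made-up true closing prices
predicted = [101.0, 101.0, 103.0]  # made-up one-day-ahead forecasts

print(f"RMSE: {rmse(actual, predicted):.3f}")
print(f"MAPE: {mape(actual, predicted):.3f}%")
```

It is also worth comparing these scores with a naive baseline that predicts tomorrow's close to be today's close, since one-day-ahead index prices are usually close to the previous day's value.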

Conclusion

With UbiOps, you can quickly and reliably run an LSTM model in the cloud, based upon financial time series data from various sources. If you have any questions about what else is possible with UbiOps, or if you’re interested in deploying your model to UbiOps, do not hesitate to reach us via email, Slack, or our website!

Disclaimer:

“How to run a index market prediction model on UbiOps” is subject to the Terms of Use posted at newyorkfed.org. The New York Fed is not responsible for publication of the EFFR data by UbiOps, does not sanction or endorse any particular republication, and has no liability for your use.

Disclaimer:

UbiOps has permission from S&P DJ Indices to Republish Index Data/Information.

Reference

https://www.sciencedirect.com/science/article/pii/S2666827022000378
