Project 3: Full Analysis

Algorithmic Trading with Artificial Intelligence


Uncovering Market Trends in SEC Filings with Machine Learning and NLP.

Introduction

Wouldn't it be great to have a tool that could predict whether the stock market is going up or down? What would you do? Would you start a hedge fund? Publish your research? Go to Vegas and put it all on red? This notion of outsmarting the markets caught the attention of two of my friends at Texas State University, Jack Burt and Andrew Hocher, and me. With our backgrounds in Computer Information Systems, our goal was to use big data and artificial intelligence to see if our dreams could come true.

Our team with our professor, Tahir Ekin, Ph.D., presenting at the TXST Undergraduate Research Conference


Background

The Efficient Market Hypothesis (EMH), introduced by economist Eugene Fama in the 1960s, holds that financial markets are "efficient": all available information is already reflected in stock prices. Under this foundational theory, consistently achieving above-average returns through stock picking or market timing is virtually impossible, because markets immediately incorporate new information into prices.

Building on this understanding, investment strategies split into two distinct approaches: active and passive investing. Active investing seeks to outperform market benchmarks through trading and asset selection based on analysis and forecasts. In contrast, passive investing focuses on long-term growth by replicating market indexes, which typically involves lower trading costs and fewer transactions.

Source: https://www.reuters.com/markets/us/global-markets-funds-passive-2024-02-01/ 


To analyze our data we used natural language processing (NLP) techniques. NLP is a branch of artificial intelligence that enables computers to understand, interpret, and generate human language. The technology has become widely popular in recent years with the rise of large language models such as ChatGPT. NLP's capabilities are crucial for interpreting the textual data in financial documents, allowing for a much deeper analysis.

To test the predictions made by our models, we needed algorithmic trading: computer programs that execute trades according to predefined instructions. The goal is to analyze vast amounts of financial data and act on anticipated market movements with reduced human error and bias.

In our research we used SEC 10-K documents as our data of interest. 10-K filings are submitted every year by publicly traded companies to inform investors about their financials and how the business is progressing. We believed these reports might contain insights that predict whether a stock will trend positive or negative after the report is released. Our goal was to build a systematic, machine-learning approach that predicts these trends without human intervention. Now that we have a base understanding of what's going on, let's get right into it!


Objectives and Hypotheses

Research Question: Can artificial intelligence be used to interpret 10-K financial reports and predict stock market movements?

To structure our investigation, we established a null hypothesis (H₀) and an alternative hypothesis (Hₐ). The null hypothesis states that NLP analysis of 10-K financial reports does not provide insights that can surpass market performance, aligning with the principles of the Efficient Market Hypothesis. Our alternative hypothesis states that NLP can indeed provide predictive insights capable of outperforming the market, challenging the Efficient Market Hypothesis.

The significance of this research extends beyond the application of AI in finance. It sits at the intersection of advanced technology and market analysis, potentially reshaping how investors and analysts interact with market data. By employing NLP to uncover subtle trends and predictive signals in financial reports, our study aims to show the potential of AI to enhance investment strategies and decision-making processes. This research not only has the potential to influence the financial sector, but also contributes to the broader integration of artificial intelligence within traditional industries.


Data & Methodology

Focal Point of Analysis: Item 7 on Form 10-K

Item 7 of Form 10-K, "Management's Discussion and Analysis of Financial Condition and Results of Operations," offers management's perspective on the company's financial health, operational results, and cash flows over the past year. This is the portion of the 10-K we extracted for our analysis.

The image above is example text of an Item 7 from a 10-K document (Microsoft, 2023)

Sourcing the Data

Our primary data source is a collection of 10-K reports from the year 2000 to 2023. This dataset was built by Professor Bill McDonald at Notre Dame, sourced from the SEC's EDGAR website, and made publicly available.

Preparing the Data

First we parsed the data, extracting only Item 7 from each 10-K document. Then we preprocessed the text for classification: removing HTML tags, expanding contractions, removing special characters and numbers, converting to lowercase, tokenizing, removing stop words, and lemmatizing. Finally, we applied vectorization techniques (CountVectorizer and TF-IDF) to convert the text into numerical features.
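As a rough sketch, the cleaning steps above might look like the following. This is a simplified stand-in: the real pipeline uses NLTK for tokenization, its full stop-word list, and lemmatization, and the word lists here are tiny illustrative examples.

```python
import re

# Tiny illustrative stop-word and contraction lists (the actual pipeline
# uses NLTK's full stop-word list and a larger contraction map).
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "was"}
CONTRACTIONS = {"won't": "will not", "can't": "cannot", "n't": " not"}

def preprocess(text: str) -> list[str]:
    text = re.sub(r"<[^>]+>", " ", text)                   # remove HTML tags
    for short, full in CONTRACTIONS.items():               # expand contractions
        text = text.replace(short, full)
    text = re.sub(r"[^A-Za-z\s]", " ", text).lower()       # drop specials/numbers
    return [t for t in text.split() if t not in STOP_WORDS]  # tokenize + filter

tokens = preprocess("<p>Revenue wasn't up in 2023, and margins improved.</p>")
# tokens → ['revenue', 'not', 'up', 'margins', 'improved']
```

Lemmatization is omitted here for brevity; in the real pipeline each token would additionally be reduced to its dictionary form before vectorization.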

Entire Model Development Process

Models

Our models fell into two categories: NLP sentiment models and machine learning models. For the NLP sentiment models our goal was to use a diverse set, making sure some were pretrained on 10-K data and some were not. For the machine learning models we used three different algorithms, pairing each with two vectorization techniques (CountVectorizer and TF-IDF).

NLP Sentiment Models

FinBERT: A natural language processing model trained specifically on financial texts to analyze sentiment and understand context in financial documents. (Source: https://huggingface.co/ProsusAI/finbert)

TextBlob: A traditional model that utilizes NLP techniques to analyze and predict text sentiment by evaluating the polarity and subjectivity of a piece of text.

ChatGPT API: An advanced language model used to classify text sentiment, considering context and language nuances.

Word Sentiment: A frequency analysis comparing a list of positive and negative finance words against the 10-K text. (Source: https://sraf.nd.edu/loughranmcdonald-master-dictionary/)
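The Word Sentiment approach can be sketched as a simple frequency count. The word sets below are tiny illustrative stand-ins for the Loughran-McDonald dictionary, and the scoring function is a hypothetical simplification of a net-sentiment measure.

```python
# Illustrative stand-ins for the Loughran-McDonald positive/negative lists.
POSITIVE = {"gain", "growth", "improved", "strong"}
NEGATIVE = {"loss", "decline", "impairment", "weak"}

def word_sentiment(tokens: list[str]) -> float:
    """Net sentiment: (positive hits - negative hits) / total hits."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    hits = pos + neg
    return (pos - neg) / hits if hits else 0.0

score = word_sentiment(["strong", "growth", "offset", "loss"])
# score → (2 - 1) / 3 ≈ 0.33, i.e. a mildly positive filing
```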

Frequency Distribution of Positive and Negative Words

Machine Learning Models

Naïve Bayes: A classification algorithm based on Bayes' Theorem that assumes independence among predictors to predict the category of a given sample.

Random Forest: A machine learning algorithm that builds multiple decision trees and merges them together to get a more accurate prediction.

XGBoost: A scalable machine learning algorithm that uses gradient boosting frameworks for building fast and accurate predictive models.
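A minimal sketch of one vectorizer/classifier pairing (scikit-learn's CountVectorizer with Multinomial Naïve Bayes) is shown below. The texts and labels are toy stand-ins for preprocessed Item 7 excerpts and post-filing price labels; the actual project also pairs TF-IDF with each algorithm and trains on far more data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-ins for preprocessed Item 7 texts and post-filing price labels.
texts = [
    "revenue growth strong margins improved",
    "impairment loss decline weak demand",
    "record growth strong cash flow",
    "loss widened weak outlook decline",
]
labels = [1, 0, 1, 0]  # 1 = stock trended up after the filing, 0 = down

# Bag-of-words features feeding a Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

prediction = model.predict(["strong growth improved margins"])[0]
# prediction → 1, an "up" call on clearly positive language
```

Swapping `MultinomialNB()` for a Random Forest or XGBoost classifier, or `CountVectorizer()` for `TfidfVectorizer()`, yields the other pairings described above.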

Correlation Matrix of Models

Results

Models
Out of the selected models there were notable observations. First, the traditional machine learning models outperformed the sentiment models. Among them, XGBoost performed the best, followed by Naïve Bayes and then Random Forest. Additionally, neither vectorization technique clearly stood out from the other. Our second observation was that the two models with financial-document context (FinBERT and Word Sentiment) were among the three bottom performers.

Model Performance

Meta Model
After building the initial models, we aggregated them into a "meta-model" to explore potential predictive power beyond individual performance. This involved combining outputs using techniques like majority voting and ensemble optimization. Majority voting considers the outputs from all models, executing a (paper) trade only if a certain number of models agree on the same decision. We tested five thresholds: ≥6, ≥7, ≥8, ≥9, and ≥10 models predicting the same output. Ensemble optimization converted the initial model outputs into a binary matrix. We then fed every combination of columns (1,023 possibilities) into logistic regression and Naïve Bayes models to find the best combination, which was tested over 50 separate train/test seed iterations to determine average model accuracy.
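The majority-voting rule can be sketched with NumPy. The predictions and threshold here are illustrative (+1 = buy, -1 = short, 0 = no trade); the ensemble-optimization path instead enumerates every non-empty subset of the 10 model columns (2¹⁰ − 1 = 1,023 combinations) and refits a classifier on each.

```python
import numpy as np

def majority_vote(preds: np.ndarray, threshold: int = 7) -> np.ndarray:
    """preds: (n_models, n_filings) array of +1 (buy) / -1 (short) calls.
    Trade only when at least `threshold` of the models agree."""
    buys = (preds == 1).sum(axis=0)
    shorts = (preds == -1).sum(axis=0)
    return np.where(buys >= threshold, 1,
                    np.where(shorts >= threshold, -1, 0))

# 10 models voting on 3 filings: a 5-5 split, a unanimous short, a unanimous buy.
preds = np.array([[1, -1, 1]] * 5 + [[-1, -1, 1]] * 5)
signals = majority_vote(preds, threshold=7)
# signals → [0, -1, 1]: no trade on the split vote, then short, then buy
```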

Meta Model Performance
The models' performance was graphed over a 3-year period to illustrate how they perform on out-of-sample data. We ran 1,000 seed iterations of training/testing splits to mitigate "lucky" streaks. The graphed data represents the average portfolio value at a given time, along with a confidence band of ±1 standard deviation (68.2% of all samples). These meta-models were compared against three benchmarks: the total stock market index (ticker: VTI), a random 50/50 chance of buying or shorting the stock, and a ChatGPT API call with the query "What is the sentiment of the following text?" along with some additional context.
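Averaging the simulated portfolio paths and forming the ±1 standard deviation band might be computed like this. The data is a toy stand-in (4 seeds over 5 periods, a hypothetical $10,000 starting balance, and randomly drawn returns), not the study's actual backtest.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for a reproducible toy example

# Simulated portfolio paths: 4 seed iterations, 5 periods of returns each.
returns = rng.normal(loc=0.001, scale=0.01, size=(4, 5))
paths = 10_000 * np.cumprod(1 + returns, axis=1)

mean_path = paths.mean(axis=0)   # average portfolio value per period
band = paths.std(axis=0)         # ±1 standard deviation (~68.2% of paths)
lower, upper = mean_path - band, mean_path + band
```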

Algorithmic Trading Performance Backtest of Meta-Models and Benchmarks

Findings

In our research, we found that, on average, the stock market outperformed all models, with the Naïve Bayes and Logistic Regression ensemble models following closely behind. These ensemble optimization methods, while promising, carried significantly higher investment risk, measured by the standard deviation of portfolio value over time. For the majority-vote methods, we observed that as trade frequency increased, returns decreased and standard deviation increased. The random benchmark showed similar starting and ending portfolio balances. Lastly, the ChatGPT model performed poorly when used alone. Using the Sharpe ratio (a financial measure of risk versus return), we found that the stock market significantly outperformed all of our models thanks to its high return relative to its volatility. Naïve Bayes and Logistic Regression followed behind due to their positive returns. For every other method, you were better off flipping a coin for your trading decisions.

Risk vs Return of Meta-Models and Benchmarks


Meta-Models Portfolio Performance Comparison

Conclusion

Our study evaluated the Efficient Market Hypothesis (EMH) by applying Natural Language Processing (NLP) techniques to 10-K financial reports. As a team we sourced and processed data to develop predictive models with the goal of predicting stock market movements. Select models surpassed the no-information rate, showing slight predictive capability. Despite our efforts, the research did not outperform the Total Stock Market Index (VTI), reinforcing the EMH. This outcome underlines the market's efficiency in incorporating public information into stock prices and challenges the potential of achieving superior returns through 10-K report analysis alone.


Recommendations

Computational Advancements
Future research should leverage more sophisticated computational techniques. Incorporating deep learning algorithms and advanced statistical models may unveil patterns not noticeable through traditional NLP methodologies. I would also like to incorporate another dimension of information, such as sentiment change from report to report over time, which would require further computational power.

Diverse Data Sources
Broadening the dataset to include various unstructured data like social media, news articles, and analyst reports might enrich the data pool for more accurate market predictions. The hardest part would be sourcing unbiased text.

Cross-disciplinary Approaches
I would like to combine the great minds of researchers in statistics and finance to get the most out of this research. On the statistics side, advanced methods for evaluating NLP model performance, such as error reduction and model optimization through cross-validation and ensemble techniques, would enhance prediction accuracy. On the finance side, subject-matter expertise in financial markets and theories, especially the EMH, would sharpen the analysis.


Closing

I wanted to take some time to thank the people this project would not be possible without. To my group mates, Jack and Andrew, thank you for pushing hard and bringing your best work to make this project a reality. Thank you to my professors Dr. Tahir Ekin, Dr. Ha-Chin Yi, and Dr. Rasim Musal for your expert opinion and motivation throughout the project. Finally, I want to thank my family for their support and for allowing me the time to focus on developing my skills.

If you have any questions or would like to reach out, feel free to contact me on LinkedIn: https://www.linkedin.com/in/jamespavlicek/ . Thank you for your time and support!

All the code for this project is free to use and can be found on GitHub here: https://github.com/james-pavlicek/algorithmic-trading-with-artificial-intelligence

Below is an embedded version of the web app I built. The app takes your selection of a stock and grabs its most recent 10-K. From there it applies our models and makes a prediction based on our Naïve Bayes meta-model. For best performance, open the tool in a new tab with the "Fullscreen" button at the bottom of the embed. Once you enter your stock, it may take a minute or two to output your results. Link to tool: https://trading-ai.streamlit.app/