April 24, 2023, Repost from Bedrock AI -- It has been heartening to see all the recent advances in language models, spurred by the launch of ChatGPT and GPT-4 by OpenAI. The phrase ‘language model’ has finally been catapulted into the mainstream and the Bedrock AI team couldn’t be more delighted.
Language models have been the foundation of our technology stack ever since Bedrock AI set up shop in 2020. The team has been working on a financial Q&A assistant since mid-2022, akin to a financial variant of ChatGPT, long before ChatGPT was launched. Check out the company blog for more information about their production models.
No doubt these LLMs (Large Language Models) have vastly expanded the realm of possibilities for novel use cases previously considered impossible. There are already several useful applications built on top of these technologies that are significantly enhancing human productivity. Is the same possible in the world of finance?
Financial use cases have high acceptability requirements. Key requirements include:
Factuality - Output from AI systems should be fact-based and grounded in reality.
Auditability - There should be a means to verify the truthfulness of outputs from AI systems (e.g. citations).
Trustworthiness - The system should not fail in unexpected ways, and behave in a manner expected of it.
Completeness - The system should be able to detect all data points of interest and be able to present the entire picture.
Even state-of-the-art LLMs, while excellent at understanding questions, summarization and other language tasks, suffer from ‘hallucination’ issues. The output they generate is not grounded in the real world and often contains factual inconsistencies. Sometimes it is just a date or a number that is incorrect; at other times the entire generated content is utter nonsense. Would you trust a system like this for your investment research?
It is easy to build a demo or prototype product for the financial domain using GPT-4, but it’s really hard to provide factuality guarantees and verifiable results, which engender trust. This is where Bedrock AI comes in. GPT-4 is excellent for a wide variety of tasks, but Bedrock AI does not rely on it for answering finance-related questions. Instead, the company uses its in-house models adapted to the financial domain for that, while leveraging GPT-4’s impressive language skills to augment its models’ output to make it more structured, formatted, and readable.
Bedrock AI’s language model-driven features
Even though it's still a young company (3 years old), its models have been constantly upgraded as the team makes in-house research advances and adopts the latest tools and techniques being developed in the field.
Bedrock AI's first generation models, launched in 2021, facilitate its ‘core capability’ - separating out inconsequential boilerplate from meaningful information in SEC filings, even when they are linguistically similar, and solving the information overload problem by extracting 20-30 key red flags from a 100-page document like a 10-K (annual report).
A red flag is defined as an event or an outcome at a company that can be an early indicator of earnings manipulation or fraud, increasing the probability of SEC action. Its premium product, the BFI Risk Score, serves as a measure of the likelihood of fraud or malfeasance.
Bedrock AI's second generation models, launched in 2022, significantly improved its information ranking capabilities (the ability to determine relative severity of red flags) and facilitate its real-time AI-generated financial news feed. Its automated news feed is not a news aggregator; it combs through SEC filings and other documents in real time and extracts price-moving information as it is published.
Bedrock AI's upcoming third generation models, augmented by GPT-4, facilitate a whole host of exciting features including SEC filing summaries, a finance-specific question answering assistant, as well as better contextualised red flags.
AI-generated SEC filing summaries & company overviews
Securities filings like 10-K annual reports can run into hundreds of pages. Our new feature generates a comprehensive summary of all the price-moving information contained in a filing. The content in the summary is categorised and can be filtered, so you can focus only on content related to mergers and acquisitions, for example, or only on red flags. The content in the summary comes with citations, and simply clicking on a sentence will take you to the position of the original content in the filing that contains the ground truth. This ensures you are always accessing accurate and complete information.
GPT-4 or Bing Chat cannot provide good summaries because they do not have the embedded financial knowledge to determine what is important or what is not.
The ‘context window’ for GPT-4 is simply not large enough to demonstrate all the nuances of what makes something important enough to be added to a summary. The Bedrock AI advantage is that we have our in-house models that do exactly that!
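To picture what a categorised, citation-backed summary looks like as data, here is a minimal Python sketch. It is an illustration only, not Bedrock AI's implementation: the `SummarySentence` type, the category labels, the helper names and the toy filing text are all hypothetical. Each summary sentence carries a category and the character span of the filing passage that supports it, so filtering and "click through to the source" become simple lookups.

```python
from dataclasses import dataclass

# Toy stand-in for the text of a filing (made-up content, for illustration only).
FILING_TEXT = (
    "The Company completed its acquisition of Example Corp in the second quarter. "
    "The Chief Financial Officer resigned effective March 1."
)


@dataclass(frozen=True)
class SummarySentence:
    """One sentence of a summary plus a citation into the filing (hypothetical type)."""
    text: str
    category: str        # hypothetical label, e.g. "mergers_and_acquisitions"
    source_start: int    # start offset of the supporting passage in the filing
    source_end: int      # end offset (exclusive)


def span_of(passage):
    """Locate a passage in the toy filing and return its (start, end) offsets."""
    start = FILING_TEXT.index(passage)
    return start, start + len(passage)


summary = [
    SummarySentence(
        "Acquisition of Example Corp closed in Q2.",
        "mergers_and_acquisitions",
        *span_of("The Company completed its acquisition of Example Corp in the second quarter."),
    ),
    SummarySentence(
        "The CFO resigned effective March 1.",
        "red_flag",
        *span_of("The Chief Financial Officer resigned effective March 1."),
    ),
]


def filter_by_category(sentences, category):
    """Keep only the summary sentences tagged with the given category."""
    return [s for s in sentences if s.category == category]


def cited_passage(sentence, filing_text):
    """Return the original filing text that a summary sentence cites."""
    return filing_text[sentence.source_start:sentence.source_end]


red_flags = filter_by_category(summary, "red_flag")
for s in red_flags:
    print(s.text, "->", cited_passage(s, FILING_TEXT))
```

Storing the span rather than a copy of the passage means the citation always resolves against the filing itself, which is what makes the summary auditable.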
Financial Question Answering
The chatbot paradigm is currently in vogue, fuelled by Bing Chat and ChatGPT. The conversational paradigm is a more natural model for interaction with computer systems. We are therefore delighted to announce our financial question answering assistant that can answer a wide variety of questions about U.S. public companies.
No, this is not a glorified front-end to ChatGPT/GPT-4. We leverage GPT-4 to produce a grammatical and coherent answer in a structured and readable form, but it is never used to answer any financial question. That is handled by our own more reliable models. The key advantage of our question answering system lies in its ability to answer aggregation-level queries - for example ‘what percentage of mid cap companies have had a CFO resignation’, in addition to simpler queries like ‘list all the risk factors in Apple’s latest 10-K’.
This capability is currently not available anywhere and we predict this will add enormous value to analyst workflows. Questions can be asked at the global level, the market cap group and sector level, company level, filing level, or filing section level (e.g. MD&A only).
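To make the idea of an aggregation-level query concrete, here is a minimal Python sketch of how a question like ‘what percentage of mid cap companies have had a CFO resignation’ reduces to a computation over structured events. This is a toy illustration under stated assumptions, not a description of Bedrock AI's system: the `Company` type, the `percent_with_event` function, the event tags and the tickers are all hypothetical, and the data is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Company:
    """A company with event tags assumed to be extracted from its filings."""
    ticker: str            # made-up tickers, for illustration only
    cap_group: str         # hypothetical labels: "small", "mid", "large"
    events: frozenset      # hypothetical event tags, e.g. "cfo_resignation"


def percent_with_event(companies, cap_group, event):
    """Percentage of companies in a market cap group that have a given event."""
    group = [c for c in companies if c.cap_group == cap_group]
    if not group:
        return 0.0  # no companies in the group: report 0 rather than divide by zero
    hits = sum(1 for c in group if event in c.events)
    return 100.0 * hits / len(group)


companies = [
    Company("AAA", "mid", frozenset({"cfo_resignation"})),
    Company("BBB", "mid", frozenset()),
    Company("CCC", "mid", frozenset({"auditor_change"})),
    Company("DDD", "mid", frozenset({"cfo_resignation", "auditor_change"})),
    Company("EEE", "large", frozenset({"cfo_resignation"})),
]

# Two of the four mid cap companies in the toy data have a CFO resignation.
print(percent_with_event(companies, "mid", "cfo_resignation"))  # 50.0
```

The hard part in practice is the step this sketch takes for granted: reliably extracting the event tags from filings. Once those are trusted, the aggregation itself is ordinary counting, which is why the answer can be verified against the underlying documents.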
Contextualized Red Flags
Currently, Bedrock AI red flags are displayed verbatim as they appear in SEC filings. While this guarantees that they are factual, it is not ideal because…
Financial text is long-winded and contains legalese that is cumbersome to read.
Individual sentences may not necessarily contain the entire context needed to put the red flag into perspective.
Our new feature rephrases red flags so that they contain more context and are readable in plain English. This reduces the time needed to read through them. Clicking on the flag opens a popup that displays the position of the original sentence in the filing, thus ensuring instant factual verification. This feature will be available in our news feed, timeline view, and watchlist email notifications.
Comparison of Bedrock AI models to BloombergGPT
While a direct comparison to BloombergGPT is impossible since the model is not publicly available, here are a few points to note:
BloombergGPT reports training its models on 3k+ SEC filings. Bedrock has completed pre-training on 200k+ SEC filings, roughly 65 times as many as BloombergGPT.
Bedrock models are instruction-tuned, which means they are tuned specifically to respond to the kind of queries that we want them to respond to. Bloomberg’s paper doesn’t mention instruction-tuning yet.
Bloomberg models are probably going to be better at interpreting financial metrics and dealing with numbers (neither of which is Bedrock’s focus).
Overall, it is an exciting time to be in the world of AI and Finance. At Bedrock AI, the small but fast-moving team has been at the forefront of the financial language model space for several years and is now making headway in AI automation in public markets.