Challenges
Our client needed to assess in-house financial news articles every day to track positive and negative sentiment for individual companies. The key challenges were that these long documents were multilingual and that predicting a single overall sentiment score per article was insufficient.
Instead, sentiment scores had to be aggregated for each individual company mentioned in an article. On top of that, only a small annotation workforce was available to label a large volume of news articles.
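To illustrate what company-level aggregation means in practice, here is a minimal sketch. It assumes an upstream step has already produced one sentiment score per sentence and linked it to the company mentioned there; the function name, the example companies, and the choice of a simple average over scores in [-1, 1] are illustrative assumptions, not the method used in the project.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_company(mentions):
    """Average sentence-level sentiment scores per company.

    `mentions` is an iterable of (company, score) pairs, one per sentence
    in which the company is mentioned. Scores are assumed to lie in [-1, 1].
    Averaging is an illustrative choice, not the project's actual method.
    """
    scores = defaultdict(list)
    for company, score in mentions:
        scores[company].append(score)
    return {company: mean(values) for company, values in scores.items()}


# Hypothetical mentions extracted from a single article.
article_mentions = [
    ("Acme Corp", 1.0),
    ("Acme Corp", 0.5),
    ("Globex", -0.5),
]
company_sentiment = aggregate_by_company(article_mentions)
```

A single article can therefore contribute a positive score to one company and a negative score to another, which an article-level score would hide.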
Benefits
- Extraction of financial news sentiment for 100K organizations over a period of 15+ years, making it possible to identify key trends.
- Starting from no annotated data, we built two high-quality ground-truth datasets of 4K samples each, for classification and NER, in just 2 weeks with programmatic pre-annotation support.
- Delivery of a complex multi-step NLP pipeline, from PDF documents to a scalable monitoring solution for organization-level sentiment, in only 6 months.
- Business teams can make quick, relevant, data-driven decisions on investment strategies and financial product recommendations for their private clients.
Approach
- Translating the complex problem into attainable NLP steps by consulting the business teams, analyzing the existing data, and drawing on our experience.
- Supporting tool selection for each step of the project (PDF parsing, machine translation, …) by weighing task performance, latency, and integration into a consistent NLP pipeline.
- Making design choices together with our client’s teams (supported by experiments and iterations on client documents and a subset of the data).
- Introducing a scalable strategy for rapidly annotating a sufficiently sized client dataset (programmatic labeling functions and transfer learning).
- Solving the hard problem of attributing sentiments to individual companies.
- Closely integrating customer feedback to develop intuitive visualizations.
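The programmatic labeling idea mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the keyword rules, function names, and majority-vote combination are hypothetical and only show the general pattern of pre-annotating text with several weak heuristics so that annotators review suggestions instead of labeling from scratch.

```python
import re
from collections import Counter

POSITIVE, NEGATIVE, ABSTAIN = 1, -1, 0

# Hypothetical labeling functions: each encodes one weak heuristic and
# abstains when it does not apply. The keyword lists are illustrative only.


def lf_positive_keywords(text):
    pattern = r"\b(beats?|surged?|record profit|upgraded?)\b"
    return POSITIVE if re.search(pattern, text, re.I) else ABSTAIN


def lf_negative_keywords(text):
    pattern = r"\b(loss(es)?|plunged?|downgraded?|lawsuit)\b"
    return NEGATIVE if re.search(pattern, text, re.I) else ABSTAIN


def lf_profit_warning(text):
    return NEGATIVE if "profit warning" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_positive_keywords, lf_negative_keywords, lf_profit_warning]


def pre_annotate(text):
    """Combine labeling-function votes by simple majority.

    Returns ABSTAIN when no function fires or when the vote is tied,
    leaving the sample for a human annotator to label.
    """
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    (top_label, top_count), *rest = Counter(votes).most_common()
    if rest and rest[0][1] == top_count:
        return ABSTAIN  # tied vote: do not guess
    return top_label
```

Abstaining on ties and on unmatched text keeps the pre-annotations conservative, so human effort goes to the ambiguous samples.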