Concept Networks at Mathlabs Research

Across industries, significant resources are spent every day on capturing insights from text data through internal monitoring and external research. These tasks rely on people digesting information and then producing compact reports that summarise and subjectively interpret it.

As part of Math Labs’ mission to help companies become AI-optimal, we have been continuously working on tools to automate this procedure and bring them as close to the human level as possible.

Drawing nuanced insights from a cohort of text data typically involves multiple tasks, each increasing in complexity: (1) extracting the dominant concepts and core ideas being discussed, (2) understanding the interactions among these concepts, and (3) understanding the dynamic relationships among concepts across time. Whilst much effort has gone into the first via topic modelling and a vast family of embedding approaches, the discovery of interactions among concepts and of their evolution over time has largely been neglected.

Math Labs has been developing its own framework, Concept Networks, to extract interactions between topics and concepts. The framework is designed to help the user identify key themes, how they emerge and evolve, and how they relate to each other.

Causality and NLP

CAUSALITY

An emerging trend in enterprise artificial intelligence is learning causal relationships within an information space. Many in the community feel that going ‘beyond correlation’ will be a big step towards more intelligent, and certainly more useful, AI. This is particularly difficult because a large part of traditional statistics focuses on similarity or co-movement rather than on cause-effect inference.

For a relatively long time, probabilistic models based on directed acyclic graphs (since the 1920s) and then Bayesian networks (BNs, from the 1970s) have increasingly been used to address causality; learning the causal relationships in a BN is typically called structure learning. However, the severe computational intractability of structure learning forced Bayesian networks to take a back seat in the last decade’s AI boom. Recent progress on computationally efficient methods for approximate structure learning is once again leading to a surge of interest in learning causality.

DRAWING CAUSAL INSIGHTS IN ‘TEXT’ DATA

Despite these advances, attempts to infer causality in dynamic text data such as financial news have largely remained rules-based: they use explicit clauses such as ‘I felt ill which caused me to cough’ together with part-of-speech (POS) tagging to infer that sickness caused coughing. However, apart from not being scalable, these methods have not been very successful at identifying causality in a sentence such as ‘I felt ill and I coughed.’

One potential explanation for the adherence to rule-based approaches is that BN structure-learning algorithms, in their current form, are ill-suited to text data.
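To make the limitation concrete, here is a minimal sketch of a rule-based extractor. The single regular expression and the function name `extract_causal_pair` are illustrative assumptions, not part of any Math Labs system; a real rules-based pipeline would combine a much larger cue lexicon with POS tags.

```python
import re

# Hypothetical single rule: "<cause> which/that caused <effect>".
CAUSAL_PATTERN = re.compile(
    r"^(?P<cause>.+?)\s+(?:which|that)\s+caused\s+(?P<effect>.+?)[.!?]?$",
    re.IGNORECASE,
)


def extract_causal_pair(sentence):
    """Return (cause, effect) if the explicit cue is present, else None."""
    match = CAUSAL_PATTERN.match(sentence.strip())
    if match is None:
        return None
    return match.group("cause"), match.group("effect")


# The explicit clause is captured by the rule...
explicit = extract_causal_pair("I felt ill which caused me to cough")
# ...but the implicit version carries no cue phrase, so the rule finds nothing.
implicit = extract_causal_pair("I felt ill and I coughed.")

print(explicit)
print(implicit)
```

The second sentence expresses the same relationship to a human reader, but nothing in it matches the rule, which is the failure mode described above.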

Firstly, there is a domain problem: the modelling assumptions implicit in structure-learning algorithms, such as the ability to represent the data as well-defined feature variables, do not hold up well in the NLP problem of understanding the causal influences of topics and concepts on one another.

Secondly, structure-learning algorithms typically require a sufficient number of observations of events A and B in order to infer a causal relationship. This requirement may be met for, say, ‘sickness’ events and ‘cough’ events, but these frameworks may not work as well for once-in-a-lifetime or low-frequency interactions, such as a one-off outbreak of COVID-19 and its one-off causal effects.

The Math Labs research team has built the Concept Networks framework, which aims to solve these problems and thus provide a powerful tool for extracting causal relationships between concepts.

*Figure 1: left:* BN structure learning algorithm, *right:* The Math Labs Concept Network. A BN might typically struggle to separate concepts and often infers incorrect causal directions, e.g. ‘stock market futures fall’ causes ‘coronavirus outbreak’, whereas the Concept Network picks up the intuitive causal relationships.

Static Concept Networks

In a stationary information scenario, Static Concept Networks work as follows: intuitively, if two concepts are strongly causally related, then they should be connected in the network. One can imagine the edges in the network as pipes along which information or influence can flow.

In Figure 2 below, we can see that the Coronavirus pandemic caused a drop in demand for oil, which had a direct effect on the OPEC+ decision to reduce oil supply. The colour of the edges encodes the ‘strength’ of the relationship or, going back to our analogy, the capacity of the pipe. Further, the direction of the arrow acts like a one-way filter for information flow in the pipe.
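The pipe analogy can be sketched as a small weighted directed graph. This is only an illustration of the data structure, not the Math Labs implementation: the concept names follow the example in the text, while the strength values, the function names, and the choice of the weakest edge as a path's strength are assumptions made for the sketch.

```python
# Edges map (source concept, target concept) -> strength in [0, 1].
# The strengths are made-up placeholder values.
edges = {
    ("coronavirus pandemic", "drop in oil demand"): 0.9,
    ("drop in oil demand", "OPEC+ supply cut"): 0.8,
}


def downstream(edges, start):
    """Concepts reachable from `start` when following edge directions only."""
    reached, frontier = [], [start]
    while frontier:
        node = frontier.pop()
        for src, dst in edges:
            if src == node and dst not in reached:
                reached.append(dst)
                frontier.append(dst)
    return reached


def path_strength(edges, path):
    """Strength of a path, taken here as its weakest edge (narrowest pipe)."""
    return min(edges[(a, b)] for a, b in zip(path, path[1:]))


affected = downstream(edges, "coronavirus pandemic")
upstream_of_nothing = downstream(edges, "OPEC+ supply cut")
bottleneck = path_strength(
    edges,
    ["coronavirus pandemic", "drop in oil demand", "OPEC+ supply cut"],
)
print(affected, upstream_of_nothing, bottleneck)
```

Because edges are one-way, influence flows from the pandemic to the supply cut but not back, which is the ‘one-way filter’ behaviour of the arrows.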

Dynamic Concept Networks

Although Static Concept Networks give a good overview of the dominant concepts and how they are related, some subtlety is lost due to the low temporal resolution. To give more insight, we can build the Concept Network dynamically, at a time resolution set by the user.

In Figure 3, our Dynamic Concept Network presents a more fine-grained picture of how the oil crisis and the subsequent OPEC+ deal evolved, in terms of subtle effects and their timescales.
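One simple way to picture a user-set time resolution is to bucket timestamped edge observations into fixed windows and keep one edge map per window. The sketch below assumes exactly that; the function `build_snapshots`, the rule of keeping the strongest observation per window, and all dates and strengths are hypothetical placeholders rather than the framework's actual method or data.

```python
from collections import defaultdict
from datetime import date


def build_snapshots(observations, start, window_days):
    """Group (day, source, target, strength) tuples into per-window edge maps."""
    snapshots = defaultdict(dict)
    for day, src, dst, strength in observations:
        index = (day - start).days // window_days  # which window the day falls in
        # Assumption: keep the strongest observation of an edge within a window.
        previous = snapshots[index].get((src, dst), 0.0)
        snapshots[index][(src, dst)] = max(previous, strength)
    return dict(snapshots)


# Placeholder observations; dates and strengths are invented for illustration.
observations = [
    (date(2020, 3, 2), "coronavirus pandemic", "drop in oil demand", 0.6),
    (date(2020, 3, 5), "coronavirus pandemic", "drop in oil demand", 0.9),
    (date(2020, 3, 30), "drop in oil demand", "OPEC+ supply cut", 0.8),
]

weekly = build_snapshots(observations, start=date(2020, 3, 2), window_days=7)
for index in sorted(weekly):
    print(index, weekly[index])
```

Changing `window_days` changes the resolution: a larger window merges more observations into each snapshot, moving the picture back towards the static network.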

*Figure 3: (Above)* Dynamic Concept Network over a five week period.

We can additionally use the relationships in the network to automatically generate a briefing-style report on the concepts leading up to an event.

*Figure 4:* Example of an auto-generated report, generated using a Dynamic Concept Network.
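As a rough illustration of how network relationships could drive such a report, the sketch below traces back from an event along its strongest incoming edges and renders each step as a sentence. The tracing rule, the sentence template, the function names, and the edge strengths are all assumptions for this example; the actual report generator is not described in this article.

```python
# Placeholder network; strengths (and the weak third edge) are invented.
edges = {
    ("coronavirus pandemic", "drop in oil demand"): 0.9,
    ("drop in oil demand", "OPEC+ supply cut"): 0.8,
    ("stock market futures fall", "OPEC+ supply cut"): 0.2,
}


def upstream_chain(edges, event):
    """Trace back from `event`, always following the strongest incoming edge."""
    chain = [event]
    while True:
        incoming = [
            (weight, src)
            for (src, dst), weight in edges.items()
            if dst == chain[0] and src not in chain  # skip visited nodes
        ]
        if not incoming:
            return chain
        chain.insert(0, max(incoming)[1])


def briefing(edges, event):
    """Render the chain leading up to `event` as one sentence per edge."""
    chain = upstream_chain(edges, event)
    return [
        f"'{a}' influenced '{b}' (strength {edges[(a, b)]:.1f})."
        for a, b in zip(chain, chain[1:])
    ]


report = briefing(edges, "OPEC+ supply cut")
print("\n".join(report))
```

Following only the strongest edge keeps the briefing short; a fuller report could list weaker contributing concepts as well.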

Conclusion & Impact

As text and other unstructured data grow exponentially, the need to comb finely through such data to extract nuanced insights grows by the day.

For the AI practitioner, the strength of Concept Networks is their ability to move causality from an event or random-variable space into the space of ideas or concepts. Additionally, the framework reduces the reliance on having a large corpus for each interaction: it works well on ‘rare concepts’ or ‘small data’ with sufficiently high intensities.

For the enterprise, the power of this framework is two-fold. Firstly, it saves resources: a report such as the one above can be generated quickly, pointing the user to the dominant themes. Secondly, even when large amounts of text are digested and complex patterns are present, there is a much smaller likelihood of missing an emerging but important risk factor.

Importantly, the framework is designed to be used in the form of a tool that enables a user to quickly understand a corpus, and do further analysis where necessary.

