By Anthony Scriffignano, Chief Data Scientist, Dun & Bradstreet
Artificial Intelligence (AI), once talked about only by science fiction geeks and computer nerds, is quickly becoming the topic of the day with business leaders. The focus is clearly justified: organizations struggle to capture and understand overwhelming amounts of data. There is no doubt AI capabilities will play an increasingly profound role in commerce and industry. However, those who focus singularly on AI risk missing other important emerging technologies or getting distracted by a technology in search of a problem. Clearly, there is potential for efforts such as exploring unstructured data, semantics, and connected relationships, but we must lead with a solid business challenge.
Consider the management of supply chain risk in Consumer Packaged Goods. This complex problem requires good visibility into and understanding of suppliers, which can number in the tens of thousands for larger companies. Business objectives include vetting supplier quality and reliability, ensuring supplier compliance with changing regulations, and preventing reputational scandals involving conflict commodities and human rights. While managing immediate suppliers in developed markets is relatively straightforward, managing the deeper supply chain (e.g. suppliers of suppliers) or entities in remote markets is significantly more challenging. How do you know you are not sharing a critical Tier 3 supplier in East Asia with your largest competitor? How do you ensure that there are no egregious human rights violations within your supply chain? How are the character and quality of relationships in your supply chain changing as disruptive events take place?
Such questions fall in the domain of what can be referred to as “connected space”. Imagine connecting all known entities within your supply chain ecosystem (suppliers, suppliers’ suppliers, their customers, and so on) in a connected graph where companies are linked by various two-sided relationships. The graph now extends from a “supply” chain to an integrated “value” chain.
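To make the connected-graph idea concrete, here is a minimal sketch of how the shared Tier 3 supplier question might be asked of such a graph. Everything in it is an assumption for illustration: the company names, the buyer-to-supplier edges, and the helper functions `suppliers_by_tier` and `shared_suppliers` are invented, and a real value chain graph would carry far more entities and relationship types.

```python
from collections import deque

# Hypothetical buyer -> supplier edges; every company name here is invented.
SUPPLIES_FROM = {
    "OurCo": ["PackCo", "FlavorCo"],
    "RivalCo": ["BottleCo"],
    "PackCo": ["ResinCo"],
    "FlavorCo": ["AromaCo"],
    "BottleCo": ["ResinCo"],
    "ResinCo": ["PetroCo"],
    "AromaCo": ["PetroCo"],
}


def suppliers_by_tier(graph, buyer, max_tier):
    """Breadth-first walk returning {supplier: tier}; tier 1 is a direct supplier."""
    tiers = {}
    queue = deque([(buyer, 0)])
    while queue:
        company, tier = queue.popleft()
        if tier == max_tier:
            continue  # do not look deeper than the requested tier
        for supplier in graph.get(company, []):
            if supplier != buyer and supplier not in tiers:
                tiers[supplier] = tier + 1
                queue.append((supplier, tier + 1))
    return tiers


def shared_suppliers(graph, buyer_a, buyer_b, max_tier=3):
    """Suppliers reachable from both buyers, with the tier seen by each."""
    a = suppliers_by_tier(graph, buyer_a, max_tier)
    b = suppliers_by_tier(graph, buyer_b, max_tier)
    return {name: (a[name], b[name]) for name in a.keys() & b.keys()}


if __name__ == "__main__":
    for name, tiers in sorted(shared_suppliers(SUPPLIES_FROM, "OurCo", "RivalCo").items()):
        print(name, tiers)
```

In this toy data, the two competitors share no direct suppliers, yet the walk surfaces common dependencies deeper in the chain, which is the kind of overlap that is hard to see from a list of immediate vendors.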
AI and other data science techniques allow visual and computational exploration of graph nodes (entities) and edges (relationships between entities). Relationships can be explored using traditional metrics, such as in-degree and out-degree, or more complex observations such as anisotropy (how “lumpy” a graph region is). It is possible to explore how relationships change over time, across geographies of interest, or from other points of reference. Seeing how things change is crucial to knowing where to focus attention in an otherwise overwhelmingly large collection of entities. Using appropriate methods, many of which can employ AI techniques, it is possible to recognize anomalous or otherwise interesting patterns or regions of the connected space. These regions may indicate impending distress, opportunity, or unusual behavior requiring human analysis (e.g. potential fraud).
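As a rough illustration of the traditional metrics mentioned above, the sketch below counts in-degree and out-degree from a list of directed edges and then compares two snapshots to show where dependence is shifting. The edge lists, company names, and function names are hypothetical, and the comparison is a deliberately simple difference rather than any particular anomaly detection method.

```python
from collections import Counter


def degrees(edges):
    """Return (in_degree, out_degree) Counters for (buyer, supplier) edges."""
    in_deg, out_deg = Counter(), Counter()
    for buyer, supplier in edges:
        out_deg[buyer] += 1     # how many suppliers this buyer relies on
        in_deg[supplier] += 1   # how many buyers rely on this supplier
    return in_deg, out_deg


def in_degree_change(edges_before, edges_after):
    """Nodes whose in-degree differs between two snapshots, with the signed change."""
    before, _ = degrees(edges_before)
    after, _ = degrees(edges_after)
    return {
        node: after[node] - before[node]
        for node in before.keys() | after.keys()
        if after[node] != before[node]
    }


# Two invented snapshots of the same small network.
SNAPSHOT_EARLIER = [
    ("OurCo", "PackCo"), ("OurCo", "FlavorCo"),
    ("RivalCo", "BottleCo"), ("PackCo", "ResinCo"),
]
SNAPSHOT_LATER = [
    ("OurCo", "PackCo"), ("RivalCo", "BottleCo"),
    ("PackCo", "ResinCo"), ("BottleCo", "ResinCo"), ("FlavorCo", "ResinCo"),
]

if __name__ == "__main__":
    print(in_degree_change(SNAPSHOT_EARLIER, SNAPSHOT_LATER))
```

A supplier whose in-degree climbs between snapshots is becoming a point of concentration, which is one simple signal of where an analyst might look first.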
While this sort of analysis can greatly improve a company’s understanding of its integrated value chain, other AI techniques can also provide insight. Natural language techniques, such as semantic vector models applied to unstructured text data, can greatly enhance companies’ ability to “see” relationships. Most businesses are acutely aware of the explosion of unstructured text data (e.g. social media posts, news articles, customer feedback), though few companies have clearly articulated the value of that data, particularly as it pertains to complex relationships. Even fewer organizations have a holistic strategy to capture this value in an actionable process. Nonetheless, there is progress. For example, semantic vector techniques help manage supply chain reputational risk by pre-defining terms indicative of supply chain issues (e.g. “conflict minerals,” “child labor”) and discovering those terms in unstructured data. Simple term matching is not enough, however. Consider a statement like “Company A is absolutely committed to not using conflict minerals.” A casually defined algorithm might incorrectly associate the term “conflict minerals” with Company A. If we want to teach machines to make more than casual inference from written words, we need to allow those methods to evolve as language and usage evolve; this is no small task. Techniques such as entity extraction and semantic disambiguation are very useful in the context of these challenges, and thus should be viewed as critical emerging capabilities.
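The “conflict minerals” pitfall can be shown in a few lines. The sketch below contrasts a naive substring match with a crude check for negation cues in the text just before the term. It is not the entity extraction or semantic disambiguation described above, only a toy heuristic; the term list, the cue list, and the 40-character window are all assumptions chosen for illustration.

```python
import re

# Hypothetical monitored terms and negation cues, chosen only for illustration.
RISK_TERMS = ["conflict minerals", "child labor"]
NEGATION_CUES = ["not using", "free of", "prohibits", "bans"]


def naive_flags(text):
    """Flag every monitored term that appears anywhere in the text."""
    lowered = text.lower()
    return [term for term in RISK_TERMS if term in lowered]


def negation_aware_flags(text, window=40):
    """Flag a term only if some mention has no negation cue just before it."""
    lowered = text.lower()
    flags = []
    for term in RISK_TERMS:
        for match in re.finditer(re.escape(term), lowered):
            preceding = lowered[max(0, match.start() - window):match.start()]
            if not any(cue in preceding for cue in NEGATION_CUES):
                flags.append(term)
                break  # one un-negated mention is enough to flag the term
    return flags


if __name__ == "__main__":
    statement = "Company A is absolutely committed to not using conflict minerals"
    print(naive_flags(statement))           # the naive match flags the term
    print(negation_aware_flags(statement))  # the cue check suppresses it
```

Even this small improvement is brittle: a fixed cue list misses new phrasings, which is why approaches that adapt as language and usage evolve matter.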
Other emerging techniques hold huge potential as well. Consider semantic vectors, where words in large unstructured text corpora are transformed into a computable vector model that can be traversed through semantic relationships. Suppose the term “bonded person” starts emerging as a synonym of “slave labor,” overwhelming a prior interpretation of someone who holds a surety bond. The vector space model can detect the semantic relationship, even though “bonded person” may not be an explicitly monitored term. As a recursively self-learning AI application, therefore, the semantic vector learning approach can be tremendously helpful for staying on top of evolving language and emerging events.
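A toy version of the “bonded person” example shows the mechanics. In practice the vectors would be learned from a large text corpus; here they are hand-made three-dimensional values invented purely so the arithmetic is easy to follow, and the `emerging_terms` helper and the 0.9 threshold are likewise assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Invented vectors standing in for embeddings learned from recent text.
MONITORED = {
    "slave labor": [0.9, 0.1, 0.0],
    "child labor": [0.6, 0.5, 0.1],
}
CANDIDATES = {
    "bonded person": [0.85, 0.15, 0.05],
    "surety bond": [0.05, 0.1, 0.95],
    "quarterly earnings": [0.0, 0.9, 0.3],
}


def emerging_terms(monitored, candidates, threshold=0.9):
    """Unmonitored terms whose vector lies close to some monitored term."""
    hits = {}
    for term, vector in candidates.items():
        best_term, best_score = max(
            ((name, cosine(vector, mv)) for name, mv in monitored.items()),
            key=lambda pair: pair[1],
        )
        if best_score >= threshold:
            hits[term] = (best_term, best_score)
    return hits


if __name__ == "__main__":
    for term, (nearest, score) in emerging_terms(MONITORED, CANDIDATES).items():
        print(f"{term!r} is close to monitored term {nearest!r} ({score:.2f})")
```

If the vectors are re-learned as new text arrives, a term whose usage drifts toward a monitored concept moves closer to it in the space and can be surfaced without anyone adding it to a watch list by hand.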
We live in exciting times indeed, and the world is increasingly interconnected. It is no wonder that some of the most promising emerging technologies help us understand how entities and events are related and highlight the true meaning of these relationships.