Sunday, 15 October 2023

On the role of AI in Research and Science: Not Giving Fish, but Teaching to Fish

Ever since Alan Turing's groundbreaking paper on artificial intelligence, we've been on a quest to master what he aptly described as an "imitation game." AI advancement is a continuous endeavour to simulate and mimic human reasoning and behaviour ever more closely, an extension of the medieval ambition to build a "Golem": a mythical incarnation, non-human but anthropomorphic, non-machine but mechanical, that simulates and ultimately surpasses our own capabilities.

While building a passable imitation of human behaviour is commendable, it carries the inherent risk of equally amplifying our built-in limitations and flaws. We want to marvel at the astonishing feats of AI, but do so without the risk of being confronted with equally glaring examples of human shortcomings.

We've witnessed glimpses of AI gone astray due to biased training or unforeseen behaviours. Image recognition AI might hallucinate recognizable patterns in random noise, and online language models have at times produced convincing but utterly false information without even the most basic fact-checking or self-questioning.

Historically, science attributes human failures to a lack of knowledge, and the scientific process establishes a structured dialogue to rectify this. This dialogue, while imperfect, is fine-tuned to cultivate knowledge: a shared understanding of reality that enables us to predict the outcomes of our actions and policies. It has also unveiled our inherent biases: our tendency to disregard unwelcome facts, seek arguments that align with our preconceived notions, and engage in post-rationalization, all while using notoriously ambiguous language.

Today's challenge lies in aligning the statistical prowess of AI with the perennial quest for knowledge. Obtaining quick answers does not equate to fostering understanding, but the two should coexist harmoniously.

(Data) Science has progressively embraced the principles of the Semantic Web and semantically unambiguous knowledge graphs. The deceptively simple RDF grammar, with its concise three-word sentences (subject, predicate, object), has introduced a new kind of computer language aimed at describing facts about the world, in contrast to the directive nature of classic programming languages (Turing-complete instruction sets).
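To make the "three-word sentence" idea concrete, here is a minimal sketch in plain Python. It does not use an RDF library; the `Triple` type, the `match` helper, the `http://example.org/` namespace, and the sample facts are all made up for illustration.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace and facts, for illustration only.
EX = "http://example.org/"

facts = {
    Triple(EX + "sample1", EX + "measuredAt", EX + "stationA"),
    Triple(EX + "sample1", EX + "hasTemperatureCelsius", "12.5"),
    Triple(EX + "sample2", EX + "measuredAt", EX + "stationA"),
}


def match(facts, subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in facts
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


# Which samples were measured somewhere?
for t in match(facts, predicate=EX + "measuredAt"):
    print(t.subject, "->", t.obj)
```

Note that nothing here tells the computer what to do. The triples only state what is the case, and the same statements can be queried from any angle by leaving a different position open.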

This transition to publishing linked open research data brings forth a global effort to define vocabularies, unambiguous reference terms, and navigational property paths that connect disparate parts of a scientific knowledge graph spanning all domains. The consensus that has grown around this effort forms the semantic backbone for fruitful collaboration with AI and its advanced statistical analysis capabilities.

The collaborative cycle unfolds as follows:

  • Humans provide curated real-world data encoded in semantically annotated form (triples).
  • A "feature-detecting AI" can be trained to assist in accurately converting new, high-volume observations into semantically precise statements.
  • The expanding interconnected data becomes accessible for both humans and AI to derive meaning from it.
  • A distinct "association-detecting AI" mines this extensive and domain-spanning dataset, revealing surprising statistical connections that are, crucially, unambiguously understandable due to shared terminology.
  • Some of these discoveries catch our attention and become new hypotheses, for which experts, likely humans initially, devise data collection systems and theories to verify or debunk them and, ideally, ascertain the direction of causality.
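As a rough illustration of the "association-detecting" step in this cycle, the sketch below counts how often two features describe the same subject in a small set of triples. It is a deliberately naive stand-in for real statistical mining, and the `ex:` terms, the sample data, and the `cooccurrence` function are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(triples):
    """Count how often two (predicate, object) features share a subject."""
    features_by_subject = {}
    for s, p, o in triples:
        features_by_subject.setdefault(s, set()).add((p, o))

    counts = Counter()
    for features in features_by_subject.values():
        # Sort so that each pair is always counted under the same key.
        for a, b in combinations(sorted(features), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical observations, for illustration only.
triples = [
    ("ex:site1", "ex:habitat", "ex:Wetland"),
    ("ex:site1", "ex:observed", "ex:SpeciesX"),
    ("ex:site2", "ex:habitat", "ex:Wetland"),
    ("ex:site2", "ex:observed", "ex:SpeciesX"),
    ("ex:site3", "ex:habitat", "ex:Forest"),
    ("ex:site3", "ex:observed", "ex:SpeciesY"),
]

counts = cooccurrence(triples)
pair, n = counts.most_common(1)[0]
print(pair, n)
```

Because every term is a shared, unambiguous identifier, the most frequent pair can be read directly as a candidate hypothesis. It remains only a correlation: verifying it and establishing the direction of causality is the work described in the last step of the cycle.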

In prehistoric times, a hunter would assess various factors like prey movement, distance, and wind before throwing a spear to secure a meal for their tribe. From a modern perspective, this is an exquisite solution to second-degree ballistic equations. Individual experiences and skills are explained as theories, which makes them reusable knowledge. In the "survival game" this approach has unlocked a new level for us humans. We should not nonchalantly hand the keys to it over to a new breed of intelligence. And if, further down the road, AI is to help us discover more new levels, we must establish and govern the principles that allow us to keep evolving alongside it.