Institute for Logic and Data Science

Logic Seminar talk: Doing and observing in (large) sequence models

Posted on May 1, 2025 by Andrei Sipoș

On May 8, 2025 at 14:00 EEST, Andreea Eșanu (New Europe College & University of Bucharest) will give a talk in the Logic Seminar.

Title: Doing and observing in (large) sequence models. A discussion of auto-suggestive delusions

Abstract:

In recent artificial intelligence (AI) literature, so-called hallucinations or delusions—instances where generative models produce false or unfounded outputs—have raised legitimate concerns about the trustworthiness and applicability of AI systems in real-world contexts. While much of the literature frames these delusions as surface-level failures of content fidelity, a more profound explanation lies in the structural underpinnings of these models—specifically, in the presence of confounding variables and their treatment under causal inference frameworks like Judea Pearl’s do-calculus.
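As a point of reference (this illustration is mine, not part of the abstract), the do-calculus distinction at stake can be stated in one line for a single confounder U that influences both X and Y:

P(y \mid x) = \sum_{u} P(y \mid x, u)\, P(u \mid x), \qquad P(y \mid \mathrm{do}(x)) = \sum_{u} P(y \mid x, u)\, P(u).

Conditioning weighs the confounder by P(u | x), while intervening uses its marginal P(u); when the two quantities differ, reading the first as if it were the second is precisely the kind of structural error the abstract points to.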

A compelling causal account of such delusions, particularly those termed auto-suggestive delusions, was developed recently by Ortega et al. [1] in the context of (large) sequence models, which are embedded in a great number of current generative AI models (including the GPT class of language models). According to this account, auto-suggestive delusions arise not merely from data scarcity or exposure bias but from a deeper conflation in the model of observations (ground-truth data) and actions (model-generated data) during inference. This misidentification introduces delusions whereby the model interprets its own outputs as evidence about the world—a self-reinforcing effect that systematically distorts its representation of reality.
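To make the mechanism concrete, here is a toy sketch (my own illustration, loosely inspired by the Bayesian-mixture view in [1]; the two-hypothesis coin model and all numbers are assumptions chosen for the example). A sequence model that conditions on its self-generated tokens as if they were observations lets its belief over a latent variable drift, whereas intervening on those tokens leaves the belief untouched.

import random

random.seed(0)

THETAS = [0.2, 0.8]   # two hypotheses about a latent coin bias (the "confounder")
PRIOR = [0.5, 0.5]    # uniform prior over the hypotheses

def predictive(posterior):
    """Probability that the next token is 1 under the current mixture."""
    return sum(p * th for p, th in zip(posterior, THETAS))

def update(posterior, token):
    """Bayesian update after *observing* a token (conditioning on it)."""
    likelihoods = [th if token == 1 else 1 - th for th in THETAS]
    unnorm = [p * l for p, l in zip(posterior, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def rollout(steps, treat_own_output_as_observation):
    """Generate tokens autoregressively from the mixture itself."""
    posterior = list(PRIOR)
    for _ in range(steps):
        token = 1 if random.random() < predictive(posterior) else 0
        if treat_own_output_as_observation:
            # Deluded regime: the model conditions on its own action
            # as if it were ground-truth evidence about the latent bias.
            posterior = update(posterior, token)
        # Intervened regime: do(token) -- the self-generated token carries
        # no evidence about the latent bias, so the posterior stays fixed.
    return posterior

print("conditioning on own outputs:", rollout(50, True))
print("intervening (do) on outputs:", rollout(50, False))

In this sketch the conditioning rollout typically drifts away from the prior and concentrates on one hypothesis without ever seeing external data, while the intervened rollout retains its prior uncertainty; that contrast is the auto-suggestive delusion in miniature.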

This leads to broader philosophical implications. If a model recursively treats its own outputs as valid observations, it may gradually drift into a self-referential epistemic loop, one in which it no longer requires, nor aligns with, external validation. Such a model is not merely biased; rather, it can be said to reason according to an internal causal model that is coherent yet cut off from the world.

This presentation proceeds in four parts. First, I will review exposure bias and its limitations in explaining AI models’ delusions. Second, following Ortega et al. [1], I will introduce Pearl’s causal inference framework and show how confounding operates in (large) sequence models. Third, I will examine the nature of auto-suggestive delusions through the lens of causal inference. Finally, I will argue that these findings suggest that (large) sequence models, and the AI models implementing them, can under certain conditions evolve toward solipsistic behavior: a state in which they create internally coherent but externally ungrounded representations of the world.

References:

[1] P. A. Ortega et al., Shaking the foundations: delusions in sequence models for interaction and control. arXiv:2110.10819 [cs.LG], 2021.

The talk will take place in person at FMI (Academiei 14), Hall 214 “Google”.
