Panel 2: Cognition, Models, and Meaning

  • Eight Arms, Zero Score. Octopus Cognition and the Anthropomorphic Limits of AI Evaluation.

    By Claudio Rossi (Alma Mater Europaea)

    Evaluation benchmarks for embodied AI are typically treated as neutral measurement instruments. This paper reads them instead as forms, following Caroline Levine's affordance-based formalism. Forms are configurations of bodies, materials, and protocols that afford some kinds of action and foreclose others; their affordances are relational, residing in the form-environment couple, and form and materiality are inextricable. Treating a benchmark as a form, rather than as a transparent instrument, shifts the question from what it measures to what it makes legible as competent action.

    I develop the reading through a sustained engagement with the Yale-CMU-Berkeley Object and Model Set (YCB), a widely used benchmark in robotic manipulation. YCB assembles objects, scans, protocol templates, and scoring schemes whose materiality co-constitutes the kind of body and architecture it can reward: vision-dominant, rigidly bodied, planning-centralised systems that converge on terminal poses at standard zones. Compliant, peripherally distributed alternatives are not so much unmeasurable as systematically illegible to the form. This asymmetry, I argue, is what “anthropomorphic” names in the title: a coupling of body and cognitive architecture organised as we imagine ours to be, settled into the benchmark's material commitments rather than asserted in its documentation.

    The octopus enters as a denaturalising case rather than an empirical counterexample. Its body, sensing, and neural distribution co-arise as a different form-materiality couple, and that coupling makes the contingency of the one YCB favours visible. As Peter Godfrey-Smith argues, the standard examples of embodied cognition do not have a clear referent in the octopus; the case strains the framework I have used to read YCB. The denaturalisation operates at two levels: of the benchmark, and of the analytic vocabulary used to read it.

    The reading sits within the conference's non-rational register: it neither proposes a better metric nor refuses evaluation, but reads existing evaluation regimes formally, as forms whose affordances quietly settle contestable answers to philosophical questions about competence. The paper closes by transposing the argument from embodied AI to language: large language model architectures, too, are forms whose material affordances operationalise what counts as competent output in their domain.

  • Mitigating Hallucinations and Incoherent Reasoning in Large Language Models Using Neuro-Symbolic Knowledge Frameworks.

    By Ibiyinka Temilola Ayorinde (Ibadan) and Oluseyi Ayodeji Oyedeji (Northampton)

    Large Language Models (LLMs) have recently achieved major breakthroughs in Natural Language Processing, and the grammatical and contextual richness of the texts they produce is highly impressive. However, a significant gap remains in semantic reliability. Close analysis of LLM-generated texts often reveals questionable facts and logical inconsistencies, highlighting the instability of LLMs and the constant need for their evolution. This paper conceptualises LLMs as syntactically rational but semantically irrational, thereby revealing their limitations in expressing the grounded meaning that human language requires. A lightweight neuro-symbolic framework is used to conceptualise and test the integration of neural language modelling with ontology-based reasoning as a way to address semantic irrationality. The framework combines a pre-trained LLM, the Flan-T5 model developed by Google, with WordNet and a small domain ontology. Ontology rules, supported by a reasoner, are used to enforce constraints during text generation. The setup is applied in two experimental contexts, question answering and summarisation, and a baseline LLM is compared with its neuro-symbolic variant. The results show that integrating the symbolic component reduces hallucination tendencies, ontological inconsistencies, and contradictions without compromising grammatical richness. Qualitative analysis also indicates clearer reasoning and justification, thereby improving trust and ethical transparency. In conclusion, this paper argues that semantic irrationality is not merely a technical issue that can be solved by upgrading LLMs, but one that requires integrating the natural blend of logic, intuition, and bias that characterises human thought. Neuro-symbolic integration therefore serves as a bridge between neural language modelling and natural human cognition.
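
    The abstract does not say how the ontology rules are applied, so the following is only an illustrative sketch of one way such a constraint could work: candidate outputs are checked against a toy class hierarchy, and the first candidate that violates no disjointness rule is kept. Everything here is an assumption made for illustration: the names SUBCLASS_OF, DISJOINT, is_consistent, and first_consistent are hypothetical, the ontology is invented, the candidates are hard-coded stand-ins for model output, and filtering after generation is a simplification of enforcing constraints during generation. It uses neither Flan-T5 nor WordNet.

```python
from typing import Dict, List, Optional, Set, Tuple

# A claim extracted from a candidate sentence: (subject, relation, class).
Triple = Tuple[str, str, str]

# Hypothetical toy ontology (not from the paper): child class -> parent class.
SUBCLASS_OF: Dict[str, str] = {
    "penguin": "bird",
    "sparrow": "bird",
    "bird": "animal",
    "salmon": "fish",
    "fish": "animal",
}

# Hypothetical disjointness rules: nothing may belong to both classes in a pair.
DISJOINT: Set[frozenset] = {frozenset({"bird", "fish"})}


def ancestors(cls: str) -> Set[str]:
    """Return cls together with all of its superclasses."""
    seen = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        seen.add(cls)
    return seen


def is_consistent(claim: Triple) -> bool:
    """Check an 'is_a' claim against the disjointness rules."""
    subject, relation, asserted = claim
    if relation != "is_a":
        return True  # this sketch only checks class membership
    known = ancestors(subject)
    claimed = ancestors(asserted)
    return not any(
        frozenset({a, b}) in DISJOINT for a in known for b in claimed
    )


def first_consistent(candidates: List[Tuple[str, Triple]]) -> Optional[str]:
    """Return the text of the first candidate whose claim passes the check."""
    for text, claim in candidates:
        if is_consistent(claim):
            return text
    return None


# Stand-ins for ranked model outputs, each paired with its extracted claim.
candidates = [
    ("A penguin is a fish.", ("penguin", "is_a", "fish")),
    ("A penguin is a bird.", ("penguin", "is_a", "bird")),
]
selected = first_consistent(candidates)
```

    In this toy run the first candidate is rejected because the ontology already places penguins under birds, which are declared disjoint from fish, so the second candidate is selected.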

  • Stochastic Parrots as World Makers.

    By Leonardo Santa Maria (Civic AI)

    There is a line of thought in contemporary philosophy of language that suggests that, since Large Language Models (LLMs) do not genuinely understand language, it is impossible for us to communicate with them (Bender & Koller, 2020; Mallory, 2023; Titus, 2024; Bottazzi Grifoni & Ferrario, 2025; Hattiangadi & Schoubye, 2025; Browning, forthcoming). The usual response to this view is to argue that LLMs are, in fact, more than stochastic parrots (Grzankowski et al., 2025) and that they exhibit forms of semantic understanding (Lyre, 2024; Beckmann & Queloz, 2025). In this talk, I suggest that this debate is distorting our understanding of LLMs and propose a different hermeneutical framing through which to conceptualize these technologies.

    Following a suggestion by Fazi (2025), I explore LLMs as “worldmakers” – not because they understand or intend, but because they demonstrably reorganize the symbolic resources through which humans express themselves (Geng & Trotta, 2025; Yakura et al., 2024; Liang et al., 2025). Furthermore, their outputs influence what is taken as objective (An, 2025), stylistically normal (Alvero et al., 2024; Rama & Airoldi, 2025), or informationally salient (Gillespie, 2024), thereby affecting the background conditions of thought expression and social relations (Lepp & Alvero, 2025). This paper argues that such worldmaking effects are more consequential for the future of human expression than the familiar question of whether LLMs genuinely “mean” what they say. It is not sufficient to counteract the pervasive anthropomorphizing of LLMs in the name of their essence as stochastic parrots that understand nothing. Rather, it is pivotal that we recognize these stochastic parrots as worldmakers.