Interesting NeuroAI/CompNeuro/LLM Cognition/Embodied AI/miscellaneous papers

18 minute read

Published:

1.29.2024

Neural tuning and representational geometry, Nature Reviews Neuroscience, 2021, Nikolaus Kriegeskorte & Xue-Xin Wei

1.30.2024

Spatially embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings, Nature Machine Intelligence, 2023, Jascha Achterberg et al.

2.1.2024

Transformer as a hippocampal memory consolidation model based on NMDAR-inspired nonlinearity, NeurIPS, 2023

2.5.2024

Brains and algorithms partially converge in natural language processing, Communications Biology, 2022

2.7.2024

No Coincidence, George: Capacity-Limits as the Curse of Compositionality, PsyArXiv, 2022

2.12.2024

Structural constraints on the emergence of oscillations in multi-population neural networks, eLife, 2024

Oscillatory neural networks, YouTube

2.14

Dynamics of Sparsely Connected Networks of Excitatory and Inhibitory Spiking Neurons

2.16

Using large language models to study human memory for meaningful narratives

Mechanisms of Gamma Oscillations

2.17

A call for embodied AI

2.18

Circular and unified analysis in network neuroscience

2.20-2.27

I was at AAAI 2024 for nearly a week. I learned a lot and will share some papers I came across in talks and posters at the conference.

On the Paradox of Learning to Reason from Data

CRAB: Assessing the Strength of Causal Relationships Between Real-World Events

Passive learning of active causal strategies in agents and language models

SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning

Hallucination is Inevitable: An Innate Limitation of Large Language Models

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

3.1

Three aspects of representation in neuroscience

Redefining "Hallucination" in LLMs: Towards a psychology-informed framework for mitigating misinformation

Distributed representations of words and phrases and their compositionality

3.2

Neural Turing Machines

A Critical Review of Causal Reasoning Benchmarks for Large Language Models

3.3

Recurrent Models of Visual Attention

Massive Activations in Large Language Models

Multiple Object Recognition with Visual Attention

Attention is not all you need anymore

The Annotated Transformer

Attention and Memory in Deep Learning

3.7

Large language models surpass human experts in predicting neuroscience results

3.8

Encoding and decoding in fMRI

My favorite math jokes

3.9

Memory in humans and deep language models: Linking hypotheses for model augmentation

3.11

Are Emergent Abilities of Large Language Models a Mirage?

Mathematical introduction to deep learning

3.12

Memory and attention in deep learning

World Models and Predictive Coding for Cognitive and Developmental Robotics: Frontiers and Challenges

Mastering Memory Tasks with World Models

Mechanism for feature learning in neural networks and backpropagation-free machine learning models

3.13

Brain-inspired intelligent robotics: The intersection of robotics and neuroscience

Papers mentioned in this article

3.14

One model for the learning of language

3.15

The pitfalls of next-token prediction

3.16

Do Llamas Work in English? On the Latent Language of Multilingual Transformers

Using large language models to study human memory for meaningful narratives

3.18

Neuroscience needs behavior

3.23

Traveling waves shape neural population dynamics enabling predictions and internal model updating

Task interference as a neuronal basis for the cost of cognitive flexibility

A Technical Critique of Some Parts of the Free Energy Principle

3.24

Theories of Error Back-Propagation in the Brain

Neurosymbolic AI

3.26

Spatially embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings

Traveling waves shape neural population dynamics enabling predictions and internal model updating

3.27

Reconstructing computational system dynamics from neural data with recurrent neural networks

3.29

A useful guide on how to pronounce common math symbols

3.30

A Review of Neuroscience-Inspired Machine Learning

3.31

Collective intelligence: A unifying concept for integrating biology across scales and substrates

4.3

An Introduction to Model-Based Cognitive Neuroscience

What does it mean to understand a neural network?

What is a GPT? by 3Blue1Brown

4.5

Nonmonotonic Plasticity: How Memory Retrieval Drives Learning

Single Cortical Neurons as Deep Artificial Neural Networks

4.17

The brain's unique take on algorithms

Cognition is an emergent property

4.18

Catalyzing next-generation Artificial Intelligence through NeuroAI

4.19

Toward a formal theory for computing machines made out of whatever physics offers

Natural and Artificial Intelligence: A brief introduction to the interplay between AI and neuroscience research

4.22

Time, Love, Memory

Thinking About Science

Reasoning ability is (little more than) working-memory capacity?!

What Is Life - Wikipedia

How do Large Language Models Handle Multilingualism?

4.24

Empowering Working Memory for Large Language Model Agents

4.26

Context-dependent computation by recurrent dynamics in prefrontal cortex

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

4.29

Concurrent maintenance of both veridical and transformed working memory representations within unique coding schemes

5.1

A formal model of capacity limits in working memory

The Thermodynamics of Mind, Trends in Cognitive Sciences

5.7

Bridging Neuroscience and Robotics: Spiking Neural Networks in Action

Combined Sensing, Cognition, Learning, and Control for Developing Future Neuro-Robotics Systems: A Survey

AI, Robotics & Neuroengineering at Ken Kennedy Institute

Special Issue: Applications of Neural Networks in Robot Control

Embodied AI Workshop

5.8

Efficiently Modeling Long Sequences with Structured State Spaces

A new look at state-space models for neural data, Journal of Computational Neuroscience

Latent state-space models for neural decoding

State Space Modeling of Neural Spike Train and Behavioral Data

Switching state-space modeling of neural signal dynamics

Robotics and artificial intelligence

Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT

5.13

Is it a transition or a continuation? From PhD student to Postdoc. - ECR Community

Ten Simple Rules for Selecting a Postdoctoral Position

Transitioning fields between a Ph.D. and postdoc

5.14

The Computational Lens: from Quantum Physics to Neuroscience

Integration of cognitive tasks into artificial general intelligence test for large models, iScience

Active Predictive Coding: A Unified Neural Framework for Learning Hierarchical World Models for Perception and Planning

From grid cells to place cells: A mathematical model

If deep learning is the answer, what is the question?

5.21

The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers

5.29

Testing theory of mind in large language models and humans

Neuromorphic dreaming: A pathway to efficient learning in artificial agents

6.2

Do Llamas Work in English? On the Latent Language of Multilingual Transformers

6.3

Biocomputing with organoid intelligence

Catalyzing next-generation Artificial Intelligence through NeuroAI (already listed above, but never mind)

Disentangling and Integrating Relational and Sensory Information in Transformer Architectures

6.5

Empirical influence functions to understand the logic of fine-tuning

6.12

Are Emergent Abilities of Large Language Models a Mirage?

6.13

A virtual rodent predicts the structure of neural activity across behaviors

Empirical influence functions to understand the logic of fine-tuning

Activation Sparsity: An Insight into the Interpretability of Trained Transformers

6.14

Inferences on a multidimensional social hierarchy use a grid-like code

Grid-like and distance codes for representing word meaning in the human brain

Relating transformers to models and neural representations of the hippocampal formation

Scaling Laws for Neural Language Models

Emergent Abilities of Large Language Models

Organizing conceptual knowledge in humans with a gridlike code

The coming decade of digital brain research: A vision for neuroscience at the intersection of technology and computing

6.18

Thousand Brains Project

The Thousand Brains Theory of Intelligence: A Roadmap for Creating Machine Intelligence (in Chinese)

6.24

Oxford ML School

Oxford LLMs

Large Language Models for Mathematicians

6.25

Language is primarily a tool for communication rather than thought

Representation learning for neural population activity with Neural Data Transformers

Towards a Foundation Model of the Mouse Visual Cortex

Statistical mechanics of Bayesian inference and learning in neural networks

Jascha Achterberg - NeuroAI

6.26

A Bayesian account of learning and generalising representations in the brain

Detecting hallucinations in large language models using semantic entropy

Fine-tuning can cripple your foundation model; preserving features may be the solution

7.12

Working Memory Load Modulates Neuronal Coupling

In vivo ephaptic coupling allows memory network formation

7.16

Cognitive computational neuroscience

Heavy-tailed neuronal connectivity arises from Hebbian self-organization

Instruction-tuning Aligns LLMs to the Human Brain

The debate over understanding in AI’s large language models

7.18

Shared functional specialization in transformer-based language models and the human brain

On Layer Normalization in the Transformer Architecture

7.19

The expanding horizons of network neuroscience: From description to prediction and control

Modular Brain Networks

7.31

Organic electrochemical neurons and synapses with ion mediated spiking

8.2

Stephen Wolfram: A New Kind of Science

8.3

Do Language Models Have a Critical Period for Language Acquisition?

8.5

Inductive or Deductive? Rethinking the Fundamental Reasoning Abilities of LLMs

8.7

From Analog to Digital Computing: Is Homo sapiens’ Brain on Its Way to Become a Turing Machine?

8.13

The brain and its time: intrinsic neural timescales are key for input processing

8.28

Neural circuits as computational dynamical systems

9.9

Unsupervised neural network models of the ventral visual stream

Emotional Intelligence of Large Language Models

No Free Lunch from Deep Learning in Neuroscience: A Case Study through Models of the Entorhinal-Hippocampal Circuit

CEBRA: Learnable latent embeddings for joint behavioral and neural analysis

DevBench: A multimodal developmental benchmark for language learning

Running cognitive evaluations on large language models: The do's and the don'ts

Induction heads - illustrated — LessWrong

Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks

Abstract representations emerge in human hippocampal neurons during inference

Reconciling Shared versus Context-Specific Information in a Neural Network Model of Latent Causes

Lecture Notes on Infinite-Width Limits of Neural Networks

Scaling and renormalization in high-dimensional regression

Curriculum Learning with Infant Egocentric Videos

In-context Learning and Induction Heads

Natural and Artificial Intelligence: A brief introduction to the interplay between AI and neuroscience research

Reasoning ability is (little more than) working-memory capacity?!

A formal model of capacity limits in working memory

Prefrontal cortex as a meta-reinforcement learning system

Scaffolding cooperation in human groups with deep reinforcement learning

Sequential Memory with Temporal Predictive Coding

Cognitive Modeling of Semantic Fluency Using Transformers

Predictive Coding: a Theoretical and Experimental Review

Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes

Toward the Emergence of Intelligent Control: Episodic Generalization and Optimization

Machine learning Notation

Accelerating generative models and nonconvex optimisation

Representation and computation in visual working memory

Nonlinear difference equations

Dynamical Systems Approaches to Cognition

Attention Mechanisms and Their Applications to Complex Systems

Learning differential equations

Context-dependent computation by recurrent dynamics in prefrontal cortex

Evidence of a predictive coding hierarchy in the human brain listening to speech

Using higher-order Markov models to reveal flow-based communities in networks

The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

The neuron as a direct data-driven controller

Seminar course: Bridging Language in Machines and Language in the Brain

Manifolds: A Gentle Introduction

Dimension Reduction using Isomap

A simple weight decay can…

Investigating Neuron Ablation in Attention Heads: The Case for Peak Activation Centering

Empirical influence functions to understand the logic of fine-tuning

9.14

The Impact of Positional Encoding on Length Generalization in Transformers

9.15

How Attention works in Deep Learning: understanding the attention mechanism in sequence models

Explainable AI: Visualizing Attention in Transformers - Comet

Toy Models of Superposition

A Mathematical Framework for Transformer Circuits

Transformers: a Primer

Code repo: Word-level Language Modeling using RNN and Transformer

Code repo: Transformer as a hippocampal memory consolidation model based on NMDAR-inspired nonlinearity

Code repo: The Tolman-Eichenbaum Machine

Adaptive chunking improves effective working memory capacity in a prefrontal cortex and basal ganglia circuit

Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks

Analogous computations in working memory input, output and motor gating: Electrophysiological and computational modeling evidence

Exposing Attention Glitches with Flip-Flop Language Modeling

Opening the Black Box: Low-Dimensional Dynamics in High-Dimensional Recurrent Neural Networks

Context-dependent computation by recurrent dynamics in prefrontal cortex

A survey of transformers

Gradient-based learning drives robust representations in recurrent neural networks by balancing compression and expansion

Population codes enable learning from few examples by shaping inductive bias

What is In-context Learning, and how does it work: The Beginner’s Guide

The Curse of Dimensionality

A generative model of memory construction and consolidation

The Neurobiology of Semantic Memory

The precision of visual working memory is set by allocation of a shared resource

The capacity of visual working memory for features and conjunctions

Timescales of learning in prefrontal cortex

The Distributed Nature of Working Memory

Geometry of neural computation unifies working memory and planning

Transformer Mechanisms Mimic Frontostriatal Gating Operations When Trained on Human Working Memory Tasks

Wide Attention Is The Way Forward For Transformers?

The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers

Natural constraints explain working memory capacity limitations in sensory-cognitive models

Scaling Laws for Neural Language Models

The Depth-to-Width Interplay in Self-Attention

A mathematical perspective on Transformers

Towards Smaller, Faster Decoder-Only Transformers: Architectural Variants and Their Implications

Upper and lower memory capacity bounds of transformers for next-token prediction

How Powerful are Decoder-Only Transformer Neural Models?

Mastering Decoder-Only Transformer: A Comprehensive Guide

How should the architecture of a transformer be scaled? (r/MachineLearning)

Code repo: Transformer_walkthrough

PsychRNN: An Accessible and Flexible Python Package for Training Recurrent Neural Network Models on Cognitive Tasks

Self-backpropagation of synaptic modifications elevates the efficiency of spiking and artificial neural networks

9.17

STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making

Automated construction of cognitive maps with visual predictive coding

Schrodinger's Memory: Large Language Models

From Cognition to Computation: A Comparative Review of Human Attention and Transformer Architectures

Neuroscience + Artificial Intelligence = NeuroAI

Transformer-based Working Memory for Multiagent Reinforcement Learning with Action Parsing

[2402.12875] Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

[2103.03404] Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth

9.18

InversionView: A General-Purpose Method for Reading Information from Neural Activations

Divergent recruitment of developmentally defined neuronal ensembles supports memory dynamics

Theoretical Limitations of Self-Attention in Neural Sequence Models

TransformerFAM: Feedback attention is working memory

A resource-rational model of human processing of recursive linguistic structure

9.19

Empirical Capacity Model for Self-Attention Neural Networks

Self-attention Does Not Need $O(n^2)$ Memory

9.20

Imitating and exploring the human brain's resting and task-performing states via brain computing: scaling and architecture

Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems

Emulating the Attention Mechanism in Transformer Models with a Fully Convolutional Network

9.21

Human-like systematic generalization through a meta-learning neural network

9.25

The Dimensions of dimensionality

RNNs Implicitly Implement Tensor Product Representations

Tensor product variable binding and the representation of symbolic structures in connectionist systems

9.26

A shared model-based linguistic space for transmitting our thoughts from brain to brain in natural conversations, Neuron

A Quantitative Approach to Predicting Representational Learning and Performance in Neural Networks

Mechanistic Interpretability for AI Safety – A Review

9.28

Flexible control of sequence working memory in the macaque frontal cortex

Mental programming of spatial sequences in working memory in the macaque frontal cortex, Science, 2024

Geometry of sequence working memory in macaque prefrontal cortex, Science, 2022

Nonlinear classification of neural manifolds with contextual information

9.29

A theory of consciousness from a theoretical computer science perspective: Insights from the Conscious Turing Machine

[2012.14601] Emergent Symbols through Binding in External Memory

[2001.11027] The Tensor Brain: Semantic Decoding for Perception and Memory

Network attractors and nonlinear dynamics of neural computation

An attractor network in the hippocampus: Theory and neurophysiology

Learning Attractor Dynamics for Generative Memory

What is remembered? Role of attention on the encoding and retrieval of hippocampal representations

Attractor dynamics with activity-dependent plasticity capture human working memory across time scales

Attractor networks

Acetylcholine-mediated top-down attention improves the response to bottom-up inputs by deformation of the attractor landscape

The Consciousness Prior

Bayesian surprise attracts human attention

[1902.10186] Attention is not Explanation

[1908.04626] Attention is not not Explanation

From Human Attention to Computational Attention: A Multidisciplinary Approach

Disentangling and Integrating Relational and Sensory Information in Transformer Architectures

Accurate Path Integration in Continuous Attractor Network Models of Grid Cells

A map of spatial navigation for neuroscience

Viewpoints: how the hippocampus contributes to memory, navigation and cognition

[1805.09042] Generalisation of structural knowledge in the hippocampal-entorhinal system

Can We Reconcile the Declarative Memory and Spatial Navigation Views on Hippocampal Function?, Neuron

Using Fast Weights to Attend to the Recent Past

Hopfield Networks is All You Need

R-Transformer: Recurrent Neural Network Enhanced Transformer

Abstract representations emerge in human hippocampal neurons during inference

10.2

PhD Thesis: Exploring the role of (self-)attention in cognitive and computer vision architecture

10.6

A rubric for human-like agents and NeuroAI

Memory Networks: Towards Fully Biologically Plausible Learning

Textbook: Introduction to Machine Learning

10.7

Language in Brains, Minds, and Machines

10.8

Statistical Mechanics of Deep Learning

It’s about time: Linking dynamical systems with human neuroimaging to understand the brain

RNNs implicitly implement tensor-product representations

Tensor product decomposition network

Basic Reasoning with Tensor Product Representations

10.9

Were RNNs All We Needed?

Human-level control through deep reinforcement learning

Geometric constraints on human brain function

The brain wave equation: a model for the EEG

Traveling Waves Encode the Recent Past and Enhance Sequence Learning

10.10

Probability theory notes

Tensor Decomposition via Variational Auto-Encoder

10.11

Neural knowledge assembly in humans and neural networks, Neuron

Reproducibility in Computational Neuroscience Models and Simulations

Distributed Representations of Words and Phrases and their Compositionality

GloVe: Global Vectors for Word Representation

Learning by thinking in natural and artificial minds, Trends in Cognitive Sciences

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

Intelligence at the Edge of Chaos

10.13

Why the simplest explanation isn’t always the best

10.14

Meta Movie Gen

Interpretable Recurrent Neural Networks in Continuous-time Control Environments

From Liquid Neural Networks to Liquid Foundation Models

Humans actively reconfigure neural task states

OptPDE: Discovering Novel Integrable Systems via AI-Human Collaboration

Loss of plasticity in deep continual learning

10.15

The relational bottleneck as an inductive bias for efficient abstraction, Trends in Cognitive Sciences

Turning large language models into cognitive models

Communication-Efficient Algorithms for Statistical Optimization

Compositionality Decomposed: How do Neural Networks Generalise?

Visualisation and ‘Diagnostic Classifiers’ Reveal How Recurrent and Recursive Neural Networks Process Hierarchical Structure