Research

NeuroAI

PhD Student, Yale University, 2024

Using neural-network models such as RNNs and Transformers to gain a mechanistic understanding of psychological phenomena related to attention and memory: How does attention facilitate memory encoding? How is working memory transformed into long-term memory? Why is there a fundamental limit on human working memory capacity? Are working memory and long-term memory represented in the same neural architecture? How do the representational formats of working memory and long-term memory differ? How can memory retrieval and false memory be explained?
Using theoretical and mathematical tools to understand the inner workings and limitations of Transformer-based large language models (LLMs): How can formal mathematical tools be used to assess the reasoning ability of LLMs? How can the “emergence” of cognitive abilities in LLMs be defined through the lens of statistical physics? What are the unique mathematical properties of LLMs that make them powerful? What pieces are these models missing for human-level cognitive abilities? How can the amount of intelligence and sentience in deep neural networks like Transformer-based LLMs be defined mathematically? If LLMs are not the final solution to artificial general intelligence, what insights can theoretical neuroscience offer on the path to achieving that goal?

Bayesian Telephone: Memory Consolidation and Recall as Generative Processes

PhD Student, Yale University, 2023

The hippocampus plays a key role in memory acquisition and consolidation, yet it is unknown whether the hippocampus stores raw sensory inputs or merely generative reconstructions of those inputs. In this paper, we examined these competing hypotheses of memory representation in the hippocampus. To do so, we modeled the hippocampus as a modern Hopfield network and the entorhinal cortex as a variational autoencoder (VAE). We used Mitsuba 3 to generate a dataset of Cornell box scenes. In our first model, we passed these scenes directly into our Hopfield network and trained our VAE on the Hopfield network’s output when prompted by a stimulus. In the second model, the VAE first probabilistically inferred latent parameters for the observations and generated reconstructed observations, which were then passed into the Hopfield network to aid in the training of our VAE. We tested the capacity of our models for generative recall of these scenes and found reliable minimization of reconstruction error during recall in both models. We concluded that either representation scheme, or a combination of the two, might be at work in the human brain. Future studies should explore implementing features such as forgetting and recall vulnerability in our base model and comparing model performance to human performance on recall tasks.
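As a rough illustration of the recall step, the sketch below applies the standard softmax update rule of a modern Hopfield network to a partial cue. It is not the code from this project: the toy patterns, the `beta` value, and the function names are assumptions chosen for illustration, and the VAE stage is omitted entirely.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hopfield_retrieve(patterns, query, beta=4.0, steps=1):
    """Modern Hopfield update: state <- sum_i softmax(beta * <p_i, state>)_i * p_i.

    `patterns` is a list of stored vectors; `query` is a (possibly partial) cue.
    `beta` (inverse temperature) is an illustrative value, not one from the paper.
    """
    state = list(query)
    for _ in range(steps):
        # Similarity of the current state to every stored pattern.
        scores = [beta * sum(a * b for a, b in zip(p, state)) for p in patterns]
        weights = softmax(scores)
        # New state is the similarity-weighted average of stored patterns.
        state = [
            sum(w * p[d] for w, p in zip(weights, patterns))
            for d in range(len(state))
        ]
    return state

# Hypothetical toy "memories" (stand-ins for scene encodings).
stored = [
    [1.0, -1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0, 1.0],
    [1.0, 1.0, -1.0, -1.0],
]
cue = [1.0, -1.0, 0.0, 0.0]  # only the first two dimensions are known
recalled = hopfield_retrieve(stored, cue)
```

With a sufficiently large `beta`, the update pulls the partial cue toward the single stored pattern it best matches; smaller values blend several patterns together.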

GitHub page: Bayesian Telephone: Memory Consolidation and Recall as Generative Processes

Guiding Perception by Memories of Multiple Timescales

PhD Student, Brain and Cognition Lab, University of Oxford, 2021

Our brain is extraordinary at matching incoming sensory signals to past experiences, which guides selective attention and allows us to behave adaptively and dynamically based on predictions and expectations. Memories of different timescales are involved in guiding perception and performance. For instance, when cycling to work, you are not only using long-term memories (LTM) of the spatial map and cycling route to guide your direction, but also relying on current working memories (WM) of traffic lights, pedestrians and cars on the road to adjust your speed or re-plan the route.
However, we still lack a clear understanding of how these memories of different timescales work together to guide adaptive behaviour, nor do we know the underlying neural mechanisms.
This study aims to elucidate how the prospective nature of memory operates in a multi-timescale and interactive way, and what brain processes are involved in doing so.
This research is partly funded by UKRI (UK Research and Innovation). An overview of this research on the UKRI website can be found here.

Relationship between Selective Attention and Ensemble Perception

Research Intern, Visual Attention Lab, Harvard University, 2020

Background: Our visual system copes with limited capacity using two different modes of attention: a distributed attention mode that extracts the gist of a scene (i.e., ensemble perception) and a focused attention mode that selects only relevant information (i.e., selective attention). These two modes of processing serve different purposes. Still, it is unclear how they work together, whether they conflict with each other, and how cognitive control might play a role in reconciling their different processing demands.
Designed and programmed experiments, collected data and conducted analyses with MATLAB.
Developed a novel paradigm incorporating the mean orientation discrimination task and target orientation detection task. Introduced a single-task condition (requiring only one mode of processing), a dual-task condition (requiring both modes of processing), and a mixed-task condition (requiring either one or two modes of processing across trials).
Discovered that people’s performance in target selection and ensemble discrimination tasks positively correlated in both single-task and dual-task conditions, indicating some shared neural mechanisms underlying selective attention and ensemble perception. The mixed-task condition significantly impaired ensemble discrimination performance rather than target selection performance, suggesting a cognitive control strategy favoring selective attention when faced with conflicts in processing demands.

Saliency-Specific Mechanism of Distractor Suppression

Visiting Researcher, Department of Experimental and Applied Psychology, VU Amsterdam, 2020

Background: Research has shown that interference caused by a salient distractor in visual search tasks can be reduced by suppressing the high-probability location (HPL) of the distractor through implicit learning, while the underlying neural mechanisms and the impact of distractor saliency on suppression effects remain unclear.
Designed and programmed experiments, collected and analyzed data with MATLAB, and wrote the paper (published in Attention, Perception, & Psychophysics).
Developed a novel paradigm to manipulate the saliency of distractors in the additional singleton task and examined how distractors of different saliency were suppressed at the same HPL.
Inferred the neural mechanisms underlying the saliency-specific suppression: Spatial probability manipulation elicited attentional modulation of V1 cells whose classical receptive fields cover the HPL. The attentional modulation is tuned in accordance with the firing rates of the group of V1 cells representing the distractor, which is ultimately reflected in the saliency-specific reduction of interference when the distractor appears at the HPL.

Reverse Correlating Ensemble Perception

Research Assistant, Perception and Action Lab, University of California, Berkeley, 2019

Background: There has not been enough evidence that people can extract summary statistical information about abstract social traits (e.g., trustworthiness, dominance, submissiveness and the like) because the space of face features that can drive specific judgments is infinitely large.
Employed a data-driven reverse correlation approach to model the ensemble perception of trustworthiness.
Programmed experiments, collected and analyzed data with Reverse-Correlation Image-Classification Toolbox in R.
Participants viewed face crowds of different average trustworthiness levels, and then completed a two-image forced-choice (2IFC) classification task in which they selected whichever of two faces (with noise patterns superimposed on the base image) was more representative of the average trustworthiness of the face crowd previously shown.
The average of all selected noise patterns constitutes the classification image (CI). The CIs for different average trustworthiness levels were then statistically compared to determine how much they differed from each other.
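The averaging step described above can be sketched in a few lines. This is a minimal illustration rather than the toolbox's code: the simulated observer, its internal `template`, the tiny four-pixel "images", and the pairing of each noise pattern with its inverse are all assumptions made for the example.

```python
import random

def classification_image(selected_patterns):
    """Pixel-wise mean of the noise patterns the participant selected."""
    n = len(selected_patterns)
    size = len(selected_patterns[0])
    return [sum(p[i] for p in selected_patterns) / n for i in range(size)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical simulated observer: prefers the stimulus whose noise
# better matches an internal template of the judged trait.
rng = random.Random(0)
template = [1.0, -1.0, 0.5, 0.0]  # assumed internal representation

selected = []
for _ in range(200):  # 200 simulated 2IFC trials
    noise = [rng.gauss(0.0, 1.0) for _ in template]
    inverse = [-x for x in noise]  # second stimulus: inverted noise (assumption)
    # The observer picks whichever noise pattern matches the template better.
    choice = noise if dot(noise, template) >= dot(inverse, template) else inverse
    selected.append(choice)

ci = classification_image(selected)
```

Because every selected pattern is the one that aligns better with the template, the resulting CI points in the template's direction; with real participants, the CI plays the role of an estimate of that unknown internal representation.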

Effects of Distractor Saliency and Spatial Location on Attentional Capture

Project Leader, Cognitive Neuroscience Lab, Tsinghua University, 2018

Background: Previous research found that a salient but task-irrelevant color singleton increases the response time to the target form singleton, a phenomenon known as attentional capture. However, no studies have systematically examined how distractor saliency, varied along a continuous spectrum, affects attentional capture. Additionally, there is evidence of spatial heterogeneity in the perception of various visual features, but it is not yet clear whether attentional capture also exhibits spatial heterogeneity across display locations.
Designed and programmed experiments, collected and analyzed data with MATLAB, and presented posters at the Psychonomic Society 2019 Annual Meeting and VSS 2020 Virtual Meeting.
Examined the effects of distractor saliency defined within multiple attention-guiding dimensions, including color, size and orientation, and established the existence of a certain threshold for distractor saliency to elicit attentional capture.
Found that there existed a spatial pattern of attentional capture susceptibility, which was distinctive and stable for each individual.

Dongyu Gong