About
Learn more about me
I study language because it represents the most sophisticated mapping we have between cognitive states and external information. My goal is to move beyond heuristic-based decoding toward a formal understanding of representational alignment between biological and artificial neural networks.
Right now, I'm an NLP master's student at UC Santa Cruz (Fiat Slug!), where my work centers on how deep learning models learn and represent meaning. In practice, this has meant building span-based systems for semantic role labeling, developing custom pipelines for microelectrode array (MEA) recording systems, and using evaluation as a way to probe what models actually learn about meaning.
Before graduate school, I studied Economic Policy and Language & Mind at New York University (Go Violets!). At NYU, I became fascinated by how language shapes cognition and how computational systems might approximate parts of that process. Since then, I've worked on applied machine learning across several sectors, learning how to turn messy real-world data into usable models.
These experiences pulled my interests toward deeper questions about language, thought, and intelligence. Today, my long-term research direction sits at the intersection of NLP, computational neuroscience, and brain–computer interfaces. My work is focused on building real-time interfaces for neural recording systems and exploring how biological signals can be translated into structured, machine-readable representations.
Ultimately, I'm interested in connecting language models with neural signals of thought, and using that connection to better understand both artificial and biological intelligence.
Language & Learning
I'm drawn to psycholinguistics and computational neuroscience:
- How are linguistic processes represented in the brain?
- How does language shape conscious experience?
- What would it mean for a machine to truly "understand" meaning?
Brain & Signal
I'm building and studying neural interfacing algorithms:
- How can we reliably record and stimulate specific neural populations?
- What are the tradeoffs between invasive and non-invasive recording modalities?
- How much structure is already present in neural signals before modeling?
Systems & Data
I care about pipelines as much as theory:
- How do we model cognitive processes with real data?
- Which architectures best support neural decoding and representation learning?
- How do we design systems that stay interpretable as they scale?
Neural Decoding & Brain-Model Alignment
Tang et al. (2023). Semantic reconstruction of continuous language from non-invasive brain recordings
Nature Neuroscience
Demonstrates the reconstruction of continuous natural language from fMRI signals, using a GPT-1-based language model together with an encoding model that predicts BOLD responses from semantic features; decoding works by inverting that mapping.
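The encoding-model idea can be sketched as a simple ridge regression: predict each voxel's response from stimulus features, then check the fit. Everything below (dimensions, regularization strength, the data itself) is an illustrative stand-in, not the paper's setup:

```python
import numpy as np

# Ridge encoding model: learn a linear map from stimulus features to
# per-voxel BOLD responses on synthetic data.
rng = np.random.default_rng(4)
feats = rng.standard_normal((300, 20))            # stimulus features per TR
W_true = rng.standard_normal((20, 50))            # hidden ground-truth weights
bold = feats @ W_true + 0.1 * rng.standard_normal((300, 50))

lam = 1.0                                         # ridge penalty (illustrative)
W = np.linalg.solve(feats.T @ feats + lam * np.eye(20), feats.T @ bold)
pred = feats @ W                                  # predicted voxel responses
```

Real pipelines add lagged features for the hemodynamic response and cross-validate the penalty per voxel; this sketch only shows the core regression.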
Tang et al. (2025). Semantic language decoding across participants and stimulus modalities
Current Biology
Uses functional alignment (Procrustes transformations) to enable cross-subject semantic decoding, suggesting that high-level linguistic representations are conserved across individuals.
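A minimal sketch of the orthogonal Procrustes alignment this kind of functional alignment relies on: find the rotation that best maps one subject's response space onto another's. The data and dimensions here are synthetic assumptions, not the paper's:

```python
import numpy as np

def procrustes_align(X, Y):
    # Orthogonal Procrustes: find the orthogonal R minimizing ||X @ R - Y||_F,
    # via the SVD of the cross-covariance X^T Y.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
Y = rng.standard_normal((200, 16))                     # "subject B" responses
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))     # hidden rotation
X = Y @ Q.T + 0.01 * rng.standard_normal((200, 16))    # rotated "subject A"

R = procrustes_align(X, Y)
err = np.linalg.norm(X @ R - Y) / np.linalg.norm(Y)    # near-zero after alignment
```

Because R is constrained to be orthogonal, the alignment can rotate but not distort the geometry of the representation, which is what makes cross-subject comparisons meaningful.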
Ye et al. (2025). Generative language reconstruction from brain recordings (BrainLLM)
Communications Biology
Proposes a generative framework that maps neural features directly into the latent hidden state space of a pre-trained LLM for language reconstruction.
Duan et al. (2023). DeWave: Discrete Encoding of EEG Waves for Natural Language Decoding
arXiv
Introduces a method to translate non-invasive EEG waves into text by using a quantized codebook to represent neural signals as discrete tokens for LLM processing.
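The codebook idea reduces to nearest-neighbor vector quantization: each continuous EEG feature vector becomes the index of its closest codebook entry, and those indices can be fed to a language model like ordinary tokens. A toy sketch with made-up shapes (512 entries, 64-dim features), not DeWave's actual architecture:

```python
import numpy as np

def quantize(features, codebook):
    # Map each continuous feature vector to the index of its nearest
    # codebook entry (squared Euclidean distance) -> discrete token IDs.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

rng = np.random.default_rng(1)
codebook = rng.standard_normal((512, 64))     # hypothetical 512-entry codebook
eeg_feats = rng.standard_normal((100, 64))    # stand-in EEG window embeddings
tokens = quantize(eeg_feats, codebook)        # discrete IDs an LLM could consume
```

In the full method the codebook itself is learned (VQ-style, with a commitment loss); here it's fixed random vectors just to show the discretization step.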
arXiv
Applies contrastive learning (InfoNCE) to align EEG time-series with clinical text descriptions in a shared multimodal embedding space.
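As a rough sketch, a one-directional InfoNCE loss over a batch of paired embeddings: each matched (EEG, text) pair is a positive, and every other pairing in the batch serves as a negative. The embeddings and temperature below are synthetic stand-ins:

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.1):
    # One-directional InfoNCE: for each row of z_a, the matching row of z_b
    # is the positive; all other rows in the batch are negatives.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))

rng = np.random.default_rng(2)
text_emb = rng.standard_normal((8, 32))
# Loss is low when "EEG" embeddings sit near their paired text embeddings,
# and high when the pairing is random.
aligned = info_nce(text_emb + 0.01 * rng.standard_normal((8, 32)), text_emb)
unaligned = info_nce(rng.standard_normal((8, 32)), text_emb)
```

Training the two encoders to minimize this loss pulls matched pairs together in the shared space while pushing mismatched pairs apart.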
Multimodal Architectures & Optimization
Alayrac et al. (2022). Flamingo: a Visual Language Model for Few-Shot Learning
NeurIPS
Utilizes a Perceiver Resampler and Gated Cross-Attention to inject non-linguistic inputs into a frozen LLM, providing a blueprint for multimodal alignment.
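The gating trick can be sketched in a few lines: text tokens cross-attend to non-linguistic features, and a tanh gate initialized at zero means the frozen model's behavior starts out unchanged. Shapes and weights here are toy assumptions, not Flamingo's:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_cross_attention(text, other, Wq, Wk, Wv, gate):
    # Text tokens (queries) attend over non-linguistic features (keys/values);
    # tanh(gate) scales the injected signal, so gate=0 leaves text untouched.
    q, k, v = text @ Wq, other @ Wk, other @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return text + np.tanh(gate) * (attn @ v)

rng = np.random.default_rng(3)
d = 16
text = rng.standard_normal((5, d))             # stand-in LM token states
neural = rng.standard_normal((7, d))           # stand-in resampled features
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

out0 = gated_cross_attention(text, neural, Wq, Wk, Wv, gate=0.0)  # == text
```

Starting the gate at zero is the key design choice: the pretrained model is exactly recovered at initialization, and the new modality is blended in gradually as the gate is learned.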
arXiv
Adapts large language models to process and reason over multivariate time-series data using Flamingo-style cross-attention mechanisms.
Datasets, Biology & Theory
Nature Scientific Data
Provides a comprehensive open dataset of aligned EEG, sEMG, and audio signals for vocalized, silent, and imagined speech patterns.
Journal of Neurophysiology (Sharf Lab/UCSC)
Maps micro-circuit bursting dynamics using high-density CMOS arrays, providing foundational biological context for neural signal processing research at the Sharf Lab.
Kukushkin et al. (2024). Non-neural cells can learn and form memories
Nature Communications
Demonstrates learning and memory-like responses in non-neural cells, challenging the assumption that these capacities are exclusive to neurons.
I'm always happy to talk about NLP, neurotechnology, and strange questions about language and mind. Feel free to reach out if you're working on something interesting or just want to compare notes.
Back to home