RECORD ID: D0F77EC6
Peer-Reviewed Manuscript

Decoding the gene regulatory landscape through multimodal learning of protein-DNA interactions

Authors

Tan, J.; Fu, X.; Ling, X.; Mo, S.; Bai, J.; Rabadan, R.; Fenyo, D.; Boeke, J. D.; Tsirigos, A.; Xia, B.

Abstract

The identity of a cell is governed by regulatory proteins that bind to the genome to control gene expression. Mapping these genome-wide binding events across thousands of proteins and cell types is essential for understanding development and disease at scale, yet it has remained a major experimental and computational challenge. Here we present Chromnitron, a multimodal foundation model that learns the rules of protein-DNA binding from protein sequence, DNA sequence, and context-specific chromatin states. Unlike prior single-task and multi-task learning approaches, Chromnitron implements a multimodal learning framework that accurately predicts the binding landscape for proteins and cell types not seen during training. Using Chromnitron, we discovered and experimentally validated new protein regulators of T cell exhaustion. Chromnitron also uncovered previously uncharacterized dynamic shifts in the binding landscape of regulatory proteins during neurogenesis. This work marks a critical step toward a predictive model of interpretable gene regulatory programs across cell types, enabling rapid discovery of regulatory circuits and identification of new therapeutic targets.

Peer Reviews

Peer review in progress...
