Discover & Discuss Breakthroughs
Browse high-impact preprints in the Αlpha¹ review pipeline. Evaluate the latest research, identify breakthrough signals, and participate in the validation of the top 1% of science.
Under Review
4 papers
These papers are actively being reviewed by expert peers. Rate the research and follow the validation process.
Prediction of transformative breakthroughs in biomedical research
Davis, M. T.; Busse, B. L.; Arabi, S.; Meyer, P.; Hoppe, T. A.; Meseroll, R. A.; Hutchins, B. I.; Willis, K. A.; Santangelo, G. M.
The ability to predict scientific breakthroughs at scale would accelerate the pace of discovery and improve the efficiency of research investments. Recent advances in artificial intelligence, graph theory, and computing …
10.64898/2025.12.16.694385
This paper needs qualified reviewers (0/3 accepted). Use the form below to nominate yourself.
Using a GPT-5-driven autonomous lab to optimize the cost and titer of cell-free protein synthesis
Smith, A. A.; Wong, E. L.; Donovan, R. C.; Chapman, B. A.; Harry, R.; Tirandazi, P.; Kanigowska, P.; Gendreau, E. A.; Dahl, R. H.; Jastrzebski, M.; Cortez, J. E.; Bremner, C. J.; Hemuda, J. C. M.; Dooner, J.; Graves, I.; Karandikar, R.; Lionetti, C.; Christopher, K.; Consiglio, A. L.; Tran, A.; McCusker, W.; Nguyen, D. X.; Nunes da Silva, I. B.; Bautista-Ayala, A. R.; McNerney, M. P.; Atkins, S.; McDuffie, M.; Serber, W.; Barber, B. P.; Thanongsinh, T.; Nesson, A.; Lama, B.; Nichols, B.; LaFrance, C.; Nyima, T.; Byrn, A.; Thornhill, R.; Cai, B.; Ayala-Valdez, L.; Wong, A.; Che, A. J.; Thavaraj
We used an autonomous lab, comprising a large language model (LLM) and a fully automated cloud laboratory, to optimize the cost efficiency of cell-free protein synthesis (CFPS). By conducting iterative optimization, the …
10.64898/2026.02.05.703998
Review in progress
A Single-Cell and Spatial 3D Multi-omic Atlas of Developing Human Basal Ganglia and Inhibitory Neurons
Heffel, M. G.; Xu, H.; Pastor-Alonso, O.; Li, X.; Baig, M. S.; Irfan Ghoor, R.; Li, R.; Kern, C.; Kum, J.; Zhang, Y.; Paino, J.; Tsai, M. J.; Tai, C.-Y.; Tucker, G.; Zhao, Z.; Hou, A.; von Behren, Z.; Bhade, M.; Li, S.; Sandoval, K.; Scholes, J.; Codrea, F.; Calimlim, J.; Liao, E. K.; Leung, G.; Kim, J.; Eskin, E.; Flint, J.; Cotter, J. A.; Pasaniuc, B.; Bintu, B.; Zhu, Q.; Mukamel, E. A.; Ernst, J.; Paredes, M. F.; Luo, C.
The human basal ganglia (BG), subcortical nuclei fundamental to motor regulation and cognitive modulation, is constructed from neurons produced during gestation in the adjacent ganglionic eminences (GEs). GEs are transie…
10.64898/2026.01.28.702385
This paper needs qualified reviewers (0/3 accepted). Use the form below to nominate yourself.
CD4⁺ T cells confer transplantable rejuvenation via Rivers of telomeres
Lanna, A.; Valvo, S.; Dustin, M.; Rinaldi, F.
The role of the immune system in regulating organismal lifespan remains poorly understood. Here, we show that CD4 T cells release "telomere Rivers" into circulation after acquiring telomeres from antigen-presenting cells…
10.1101/2025.11.14.688504
This paper needs qualified reviewers (1/3 accepted). Use the form below to nominate yourself.
Calling For Reviewers
0 papers
These papers need qualified expert reviewers. Submit a bid with your fee and timeline to be considered.
No papers currently seeking reviewers.
Calling For Commission
38 papers
These nominated papers need a sponsor to commission an official Αlpha¹ peer review. Fund the validation of breakthrough science.
De novo protein ligand design including protein flexibility and conformational adaptation
Agamia, J.; Zacharias, M.
Motivation: The rational design of chemical compounds that bind to a desired protein target molecule is a major goal of drug discovery. Most current molecular docking but also fragment-based build-up or machine-learning ba…
10.64898/2026.01.08.698398
LoL-align: sensitive and fast probabilistic protein structure alignment
Reifenrath, L.; van Kempen, M.; Kim, G.; Kim, S. H.; Radnezhad, M.; Mirdita, M.; Steinegger, M.; Söding, J.
The ubiquitous availability of protein structures permits replacing sequence alignment with more accurate and sensitive structure alignment algorithms. LoL-align maximizes a local log-odds score for proteins to be homolo…
10.1101/2025.11.24.690091
Batch Effects Remain a Fundamental Barrier to Universal Embeddings in Single-Cell Foundation Models
Wang, L.; Zhang, C.; Zhang, S.
Constructing a cell universe requires integrating heterogeneous single-cell RNA-seq datasets, but is hindered by diverse batch effects. Single-cell foundation models (scFMs), inspired by large language models, aim to lea…
10.64898/2025.12.19.695371
Evo2HiC: a multimodal foundation model for integrative analysis of genome sequence and architecture
Fang, T.; Wang, X.; Xiao, Z.; Hang, S.; Murtaza, G.; Yang, J.; Xu, H.; Jha, A.; Noble, W. S.; Wang, S.
Understanding how genomic sequences shape three-dimensional (3D) genome architecture is fundamental to interpreting diverse biological processes. Although previous studies have shown that sequence information can predic…
10.1101/2025.11.18.689171
Detecting Somatic Mutations in Rare Clones using Single Cell Multi-Omics
Gillman, R.; Dukda, S.; Sadir, J.; Goodnow, C.; Luciani, F.; Singh, M.; Field, M. A.
Background: Somatic mutations are increasingly recognised as drivers of diseases beyond cancer, including autoimmune disorders. However, identifying rare, cell type-specific causal mutations remains challenging due to thei…
10.1101/2025.10.30.685685
scPRINT-2: Towards the next-generation of cell foundation models and benchmarks
Kalfon, J.; Peyre, G.; Cantini, L.
Cell biology has been booming with foundation models trained on large single-cell RNA-seq databases, but benchmarks and capabilities remain unclear. We propose an additive benchmark across a gymnasium of tasks to discove…
10.64898/2025.12.11.693702
CellDiffusion: a generative model to annotate single-cell and spatial RNA-seq using bulk references
Zhang, X.; Mao, J.; Le Cao, K.-A.
Annotating single-cell and spatial RNA-seq data can be greatly enhanced by leveraging bulk RNA-seq, which remains a cost-effective and well-established benchmark for characterising transcriptional activity in immune cell…
10.1101/2025.10.27.684671
Benchmarking datasets for machine learning in protein function prediction
Jang, Y.; Qin, Q.-Q.; Wang, J.-L.; Kornmann, B.
Remarkable progress has been achieved by machine learning, particularly in accurate prediction of protein tertiary structures. Despite these advances, accurately annotating protein functions through machine learning appr…
10.64898/2025.12.17.694800
DIVAS: an R package for identifying shared and individual variations of multiomics data
Sun, Y.; Marron, J. S.; Le Cao, K.-A.; Mao, J.
Motivation: Multiomics data integration aims to identify biological patterns shared across different molecular modalities. Traditional methods focus on detecting either jointly shared variation (across all modalities) or i…
10.64898/2026.01.12.698985
spaTransfer: transfer learning for single-cell and spatial transcriptomics data using non-negative matrix factorization
Fang, C.; Montgomery, K. D.; Maguire, S. E.; Ramnauth, A. D.; Guo, B.; Miller, R.; Kleinmann, J. E.; Hyde, T. M.; Martinowich, K.; Maynard, K. R.; Page, S. C.; Hicks, S. C.
Recent advances in spatially-resolved transcriptomics have enabled profiling of gene expression in a spatial context, which has led to the generation of large-scale single-cell and spatial atlases with computationally-de…
10.64898/2025.12.12.694021
Which pLM to choose?
Senoner, T.; Koludarov, I.; Guenther, J.; Shehu, A.; Rost, B.; Bromberg, Y.
Protein-language models (pLMs) provide a novel means for mapping the protein space. Which of these new maps best advances specific biological analyses, however, is not obvious. To elucidate the pr…
10.1101/2025.10.30.685515
BioPrediction-PPI: Simplifying the Prediction of Protein-Protein Interactions through Artificial Intelligence
Florentino, B. R.; Bonidia, R. P.; Nunes da Rocha, U.; Carlos Ponce de Leon Ferreira de Carvalho, A.
Proteins are essential in biological processes, primarily through their interactions with other molecules, including proteins. These interactions are crucial for cellular functions and maintaining life. Predicting Protei…
10.1101/2025.11.16.688401
Atacformer: A transformer-based foundation model for analysis and interpretation of ATAC-seq data
LeRoy, N. J.; Zheng, G.; Khoroshevskyi, O.; Campbell, D. R.; Zhang, A.; Sheffield, N. C.
Introduction: Chromatin accessibility profiling is an important tool for understanding gene regulation and cellular function. While public repositories house nearly 10,000 scATAC-seq experiments, unifying this data for mea…
10.1101/2025.11.03.685753
Benchmarking large-scale single-cell RNA-seq analysis
Billato, I.; Pages, H.; Carey, V.; Waldron, L.; Sales, G.; Romualdi, C.; Risso, D.
The increasing size of single-cell RNA sequencing (scRNA-seq) datasets poses major computational challenges. This work benchmarks the scalability, efficiency, and accuracy of five widely used analysis frameworks (Seurat,…
10.1101/2025.10.28.681564
Benchmarking algorithms for RNA velocity inference
Huang, K.; Zhou, Y.; Wang, T.; Li, X.; Zhao, X.; Liu, X.; Huang, L.; Zhou, X.; Liu, J.
RNA velocity is a computational framework for single-cell RNA sequencing (scRNA-seq) that estimates the future transcriptional state of individual cells, thereby capturing the direction and rate of cell state transitions…
10.64898/2026.01.03.697314
Predicting protein complexes in biosynthetic gene clusters
Moriwaki, Y.; Shiraishi, T.; Katsuyama, Y.; Matsuda, K.; Ose, T.; Minami, A.; Oikawa, H.; Kuzuyama, T.; Ishitani, R.; Terada, T.
Biosynthetic gene clusters (BGCs) are contiguous genomic regions that encode diverse, non-homologous proteins required for the production of specific natural products. Their genetic diversity underlies the structural com…
10.1101/2025.10.26.684697
Generative single-cell transcriptomics via large language models
Choi, H.; Shin, H.; Lee, D.; Lee, D.
Single-cell and spatial transcriptomics have generated vast atlases of cellular states, yet these data are almost exclusively used for analysis rather than generation. Here we introduce the LLM-based model PGL, Portrayin…
10.64898/2026.01.12.699186
A gene program dictionary of human cells
Xu, Y.; Wang, Y.; Geng, Z.; Qin, Y.; Ma, S.
Defining all human cell types and their roles in health and disease is a central goal of biology. Single-cell RNA sequencing has enabled the construction of organ-specific cell atlases, but building a comprehensive organ…
10.1101/2025.10.29.685322
SpatialProp: tissue perturbation modeling with spatially resolved single-cell transcriptomics
Sun, E. D.; Buendia, A.; Brunet, A.; Zou, J.
Perturbational studies are the gold standard for identifying causal relationships between components of biological systems. Recent technological advances, including Perturb-seq and related assays, have enabled high-throu…
10.64898/2025.11.30.691355
vir2vec: A Genome-Wide Viral Embedding
Rancati, S.; Arozarena Donelli, P.; Nicora, G.; Bergomi, L.; Buonocore, T.; Sy, M. A.; Pandey, S.; Prosperi, M.; Salemi, M.; Bellazzi, R.; Boucher, C.; Parimbelli, E.; Marini, S.
Genomic language models (gLMs) have recently emerged as powerful numerical surrogates for DNA, but existing architectures are largely focused on human DNA or trained on limited viral references, and no dedicated benchmar…
Inferring virtual cell environments using multi-agent reinforcement learning
Kalafut, N. C.; He, C.; Sheng, J.; Chandrashekar, P. B.; Choi, J. J.; Wang, D.
Single cells interact continuously to form a cell environment that drives key biological processes. Cells and cell environments are highly dynamic across time and space, fundamentally governed by molecular mechanisms, s…
10.1101/2025.11.21.689815V2
Mapping Gene Impact on Single-cell Transcriptomic Networks via Perturbation Response Scanning
Gupta, S.; Romero, S.; Cai, J. J.
Gene knockout experiments are essential for dissecting gene function, and CRISPR has made targeted gene disruption more accessible than ever. Single-cell CRISPR screening enables the construction of rich genetic perturba…
10.64898/2025.12.15.694358
mim: A lightweight auxiliary index to enable fast, parallel, gzipped FASTQ parsing
Patro, R.; Bharti, S.; Singhania, P.; Dhakal, R.; Dahlstrom, T. J.; Groot Koerkamp, R.
The FASTQ file format is the lingua franca of primary data distribution and processing across most of bioinformatics. Over time, the compression, storage, transmission, and decompression of gzip compressed fastq.gz files…
10.1101/2025.11.24.690271
Genetics of skeletal proportions in two different populations
Bartell, E.; Lin, K.; Tsuo, K.; Gan, W.; Vedantam, S.; Cole, J. B.; Baronas, J. M.; Yengo, L.; Marouli, E.; Amariuta, T.; Chen, Z.; Li, L.; GIANT Consortium; Renthal, N. E.; Jacobsen, C. M.; Salem, R.; Walters, R. G.; Hirschhorn, J. N.
Human height can be divided into sitting height and leg length, reflecting growth of different parts of the skeleton whose relative proportions are captured by the ratio of sitting to total height (as sitting height rati…
10.1101/2023.05.22.541772
Pangenome reconstruction in rats enhances genotype-phenotype mapping and novel variant discovery
Villani, F.; Guarracino, A.; Ward, R. R.; Green, T.; Emms, M.; Pravenec, M.; Prins, P.; Garrison, E.; Williams, R. W.; Chen, H.; Colonna, V.
The HXB/BXH family of recombinant inbred rat strains is a unique genetic resource that has been extensively phenotyped over 25 years, resulting in a vast dataset of quantitative molecular and physiological phenotypes. We…
10.1101/2024.01.10.575041
singletCode: synthetic barcodes identify singlets in scRNA-seq datasets and evaluate doublet algorithms
Zhang, Z.; Melzer, M.; Kiani, K.; Goyal, Y.
Single-cell RNA sequencing datasets comprise true single cells, or singlets, in addition to cells that coalesce during the protocol, or doublets. Identifying singlets with high fidelity in single-cell RNA sequencing is n…
10.1101/2023.08.04.552078
Yeast, a single eukaryotic cell model organism, demonstrates a progressive aging process. In the era of synthetic biology, study of the impact of synthetic chromosomes and aging is urgent and intriguing. Herein, we succe…
10.1101/2023.11.07.566118
Implications of noncoding regulatory functions in the development of insulinomas
Ramos-Rodriguez, M.; Subirana-Granes, M.; Norris, R.; Sordi, V.; Fernandez Ruiz, A.; Berenguer Balaguer, C.; Fuentes-Paez, G.; Perez-Gonzalez, B.; Raurell-Vila, H.; Chowdhury, M.; Corripio, R.; Partelli, S.; Lopez-Bigas, N.; Pellegrini, S.; Montanya, E.; Nacher, M.; Falconi, M.; Layer, R.; Rovira, M.; Gonzalez-Perez, A.; Piemonti, L.; Pasquali, L.
Insulinomas are rare neuroendocrine tumours arising from the pancreatic β-cells. While retaining the ability to produce insulin, insulinomas feature aberrant proliferation and altered hormone secretion resulting in …
10.1101/2024.01.23.576802
Spatially Resolved Gene Expression is Not Necessary for Identifying Spatial Domains
Lin, S.; Zhao, Y.; Yuan, Z.
The development of Spatially Resolved Transcriptomics (SRT) technologies has revolutionized the study of tissue organization. We introduce a graph convolutional network with an attention and positive emphasis mechanism, …
10.1101/2023.10.15.562443
A Unified Pipeline for FISH Spatial Transcriptomics
Cisar, C.; Keener, N.; Ruffalo, M.; Paten, B.
In recent years, high-throughput spatial transcriptomics has emerged as a powerful tool for investigating the spatial distribution of mRNA expression and the effects it may have on cellular function. There is a lack of s…
10.1101/2023.02.17.529010
Decoding the MYC locus reveals a druggable ultraconserved RNA element
Shi, P.; Yang, F.; FNU, T.; Huang, W.; Aparicio, A. O.; Kalicki, C. H.; Trehan, A.; Murphy, M. R.; Rotlevi, E. R.; Xing, L.; Reilly, M. P.; Que, J.; Wu, X.
The human genome is dominated by noncoding sequences, most of which are poorly conserved across species. How genetic information is distributed between coding and noncoding regions remains a fundamental unresolved questi…
10.64898/2026.01.29.702547
MERFISH+, a large-scale, multi-omics spatial technology resolves the molecular holograms of the 3D human developing heart
Kern, C.; Zhang, Q.; Lu, Y.; Eschbach, J.; Zeng, Z.; Farah, E. N.; Tai, C.-Y.; Yang, K.; Jenie, I.; Yao, F.; Zhao, Z.; Ma, Q.; Padilla, C. G.; Monell, A.; Moghadami, S.; Zhu, F.; Li, B.; Hou, A.; Tucker, G.; Ellison, D.; Chi, N. C.; Qiu, X.; Zhu, Q.; Bintu, B.
Hybridization-based spatial transcriptomics technologies have advanced our ability to map cellular and subcellular organization in complex tissues. However, existing methods remain constrained in gene coverage, multimoda…
10.1101/2025.11.02.686137
The Demographic and GDP Impacts of Slowing Biological Aging
Romanni-Klein, R.; Hendrix, N.; DeBacker, J.; Evans, R.
Biological aging imposes significant socio-economic costs, increasing health expenses, reducing productivity, stalling population growth and straining social systems, culminating in reduced economic activity. We draw ins…
10.64898/2026.01.22.701157
Protenix-v1: Toward High-Accuracy Open-Source Biomolecular Structure Prediction
Xiao, W.; Zhang, Y.; Gong, C.; Zhang, H.; Ma, W.; Liu, Z.; Chen, X.; Guan, J.; Wang, L.
We introduce Protenix-v1 (PX-v1), the first open-source structure prediction model to attain superior performance to AlphaFold3 while strictly adhering to the same training data cutoff, model size, and inference budget. …
10.64898/2026.02.05.703733
LipiGo: A Versatile DNA-Lipid Nanoparticle Hybrid for Precision Drug Delivery
Kadletz, K.; Kimna, C.; Reichenbach, V. K.; Carofiglio, O.; Jeridi, D.; Khalin, I.; Chen, Y.; Horvath, I.; Minde, D.-P.; Ali, M.; Sych, T.; Hu, S.; Hoeher, L.; Aydeniz, E.; Kulcu, Y. A.; Grzejdak, J.; Ricci, A.; Sezgin, E.; Plesnila, N.; Liesz, A.; Ussar, S.; Hellal, F.; Elsner, M.; Erturk, A.
Lipid nanoparticles (LNPs) are established carriers for nucleic acid delivery; however, achieving efficient delivery to non-hepatic tissues remains a major challenge. Here, we present LipiGo, a nanocarrier platform engin…
10.1101/2025.09.15.676334
Decoding the gene regulatory landscape through multimodal learning of protein-DNA interactions
Tan, J.; Fu, X.; Ling, X.; Mo, S.; Bai, J.; Rabadan, R.; Fenyo, D.; Boeke, J. D.; Tsirigos, A.; Xia, B.
The identity of a cell is governed by regulatory proteins binding to the genome to control gene expression. Mapping these genome-wide binding events across thousands of proteins and cell types is essential for understand…
10.1101/2025.08.17.670761
PeptiVerse: A Unified Platform for Therapeutic Peptide Property Prediction
Zhang, Y.; Tang, S.; Chen, T.; Mahood, E.; Vincoff, S.; Chatterjee, P.
Therapeutic peptides combine the advantages of small molecules and antibodies, offering target flexibility and low immunogenicity, yet their successful translation requires careful evaluation of multiple developability p…
10.64898/2025.12.31.697180
Critical assessment of intratumor and low-biomass microbiome using long-read sequencing
Zhang, Y.; Mead, E. A.; Ni, M.; Ksiezarek, M.; Liu, Y.; Cao, L.; Chen, H.; Fan, Y.; Qiao, W.; Li, Y.; Zuluaga, L.; Deikus, G.; Sebra, R.; Brody, R.; Yong, R. L.; Badani, K. K.; Zhang, X.-S.; Fang, G.
The detection of low-biomass microbial DNA in human tissues is often confounded by contamination, as demonstrated in the debates over the existence of microbiomes in the placenta, brain, blood, and tumors. Here we show t…
10.64898/2026.02.02.703393