I am a Director of Data Science and Machine Learning at Relation Therapeutics, where I lead a team developing DNA foundation models to better understand how human genetic variation contributes to disease biology and to support the discovery of new therapeutic opportunities.
My work sits at the intersection of machine learning, statistical modelling, genomics and biomedicine. I am particularly interested in developing models that connect genetic variation to gene regulation, cellular function and disease mechanisms.
Before joining Relation Therapeutics, I was a cross-disciplinary postdoctoral fellow at the University of Edinburgh, working with Catalina Vallejos and collaborating closely with Neil Henderson’s lab. There, I worked on machine learning and statistical modelling of single-cell and multi-modal genomics data to decode molecular mechanisms regulating liver fibrosis and regeneration.
I completed my PhD in Data Science at the School of Informatics, University of Edinburgh, under the supervision of Guido Sanguinetti, developing statistical machine learning methods for modelling epigenomic variability and single-cell genomics data.
Selected publications
ICLR
PatchDNA: A Flexible and Biologically-Informed Alternative to Tokenization for DNA
Del Vecchio, Alice, Kapourani, Chantriolnt-Andreas, Athar, Abdullah M., Dobrowolska, Agnieszka, Anighoro, Andrew, Tenmann, Benjamin, Edwards, Lindsay, and Regep, Cristian
In International Conference on Learning Representations 2026
DNA language models are emerging as powerful tools for representing genomic sequences, with recent progress driven by self-supervised learning. However, performance on downstream tasks is sensitive to tokenization strategies reflecting the complex encodings in DNA, where both regulatory elements and single-nucleotide changes can be functionally significant. Drawing inspiration from the Byte Latent Transformer’s combining of bytes into patches, we propose that patching provides a competitive and more efficient alternative to tokenization for DNA sequences. Patching eliminates the need for a fixed vocabulary, which offers unique advantages to DNA modeling. We introduce conservation-guided patching, where patch boundaries are informed by evolutionary conservation scores, enabling the model to focus on functionally important regions. We further introduce re-patching, enabling flexible downstream adaptation with no retraining from scratch. PatchDNA achieves state-of-the-art performance across diverse genomic benchmarks and received the Best Paper Award at the AI4D3 NeurIPS 2025 workshop.
@inproceedings{delvecchio2026patchdna,author={Del Vecchio, Alice and Kapourani, Chantriolnt-Andreas and Athar, Abdullah M. and Dobrowolska, Agnieszka and Anighoro, Andrew and Tenmann, Benjamin and Edwards, Lindsay and Regep, Cristian},booktitle={International Conference on Learning Representations},title={{PatchDNA: A Flexible and Biologically-Informed Alternative to Tokenization for DNA}},url={https://openreview.net/forum?id=kwa1bcyuTF},year={2026}}
Nature
Multimodal decoding of human liver regeneration
Matchett, Kylie P., Wilson-Kanamori, John R., Kapourani, Chantriolnt-Andreas, Portman, Jordan R., Fercoq, Frederique, May, Sophie, Zajdel, Marta, Beltran, Miguel, Sutherland, Eleanor F., Mackey, John B. G., Brice, Mhairi, Wilson, Gregory C., Wallace, Sarah J., Kitto, Laura, Younger, Nicholas T., Dobie, Ross, Mole, Damian J., Oniscu, Gabriel C., Wigmore, Stephen J., Ramachandran, Prakash, Vallejos, Catalina A., Carragher, Neil O., Saeidinejad, Mohammad Mahdi, Quaglia, Alberto, Jalan, Rajiv, Simpson, Kenneth J., Kendall, Timothy J., Rule, Jody A., Lee, William M., Hoare, Matthew, Weston, Christopher J., Marioni, John C., Teichmann, Sarah A., Bird, Thomas G., Carlin, Leo M., and Henderson, Neil C.
The liver has a unique ability to regenerate; however, in the setting of acute liver failure (ALF), this regenerative capacity is often overwhelmed, leaving emergency liver transplantation as the only curative option. Here, to advance understanding of human liver regeneration, we use paired single-nucleus RNA sequencing combined with spatial profiling of healthy and ALF explant human livers to generate a single-cell, pan-lineage atlas of human liver regeneration. We uncover a novel ANXA2+ migratory hepatocyte subpopulation, which emerges during human liver regeneration, and a corollary subpopulation in a mouse model of acetaminophen (APAP)-induced liver regeneration. Interrogation of necrotic wound closure and hepatocyte proliferation across multiple timepoints following APAP-induced liver injury in mice demonstrates that wound closure precedes hepatocyte proliferation. Four-dimensional intravital imaging of APAP-induced mouse liver injury identifies motile hepatocytes at the edge of the necrotic area, enabling collective migration of the hepatocyte sheet to effect wound closure. Depletion of hepatocyte ANXA2 reduces hepatocyte growth factor-induced human and mouse hepatocyte migration in vitro, and abrogates necrotic wound closure following APAP-induced mouse liver injury. Together, our work dissects unanticipated aspects of liver regeneration, demonstrating an uncoupling of wound closure and hepatocyte proliferation and uncovering a novel migratory hepatocyte subpopulation that mediates wound closure following liver injury. Therapies designed to promote rapid reconstitution of normal hepatic microarchitecture and reparation of the gut–liver barrier may advance new areas of therapeutic discovery in regenerative medicine.
@article{matchett2024multimodal,author={Matchett, Kylie P. and Wilson-Kanamori, John R. and Kapourani, Chantriolnt-Andreas and Portman, Jordan R. and Fercoq, Frederique and May, Sophie and Zajdel, Marta and Beltran, Miguel and Sutherland, Eleanor F. and Mackey, John B. G. and Brice, Mhairi and Wilson, Gregory C. and Wallace, Sarah J. and Kitto, Laura and Younger, Nicholas T. and Dobie, Ross and Mole, Damian J. and Oniscu, Gabriel C. and Wigmore, Stephen J. and Ramachandran, Prakash and Vallejos, Catalina A. and Carragher, Neil O. and Saeidinejad, Mohammad Mahdi and Quaglia, Alberto and Jalan, Rajiv and Simpson, Kenneth J. and Kendall, Timothy J. and Rule, Jody A. and Lee, William M. and Hoare, Matthew and Weston, Christopher J. and Marioni, John C. and Teichmann, Sarah A. and Bird, Thomas G. and Carlin, Leo M. and Henderson, Neil C.},doi={10.1038/s41586-024-07376-2},journal={Nature},pages={158--165},title={{Multimodal decoding of human liver regeneration}},url={https://doi.org/10.1038/s41586-024-07376-2},volume={630},year={2024}}
GBIO
scMET: Bayesian modeling of DNA methylation heterogeneity at single-cell resolution
Kapourani, Chantriolnt-Andreas, Argelaguet, Ricard, Sanguinetti, Guido, and Vallejos, Catalina A.
High-throughput single-cell measurements of DNA methylomes can quantify methylation heterogeneity and uncover its role in gene regulation. However, technical limitations and sparse coverage can preclude this task. scMET is a hierarchical Bayesian model which overcomes sparsity, sharing information across cells and genomic features to robustly quantify genuine biological heterogeneity. scMET can identify highly variable features that drive epigenetic heterogeneity, and perform differential methylation and variability analyses. We illustrate how scMET facilitates the characterization of epigenetically distinct cell populations and how it enables the formulation of novel hypotheses on the epigenetic regulation of gene expression. scMET is available at https://github.com/andreaskapou/scMET.
@article{Kapourani2021,author={Kapourani, Chantriolnt-Andreas and Argelaguet, Ricard and Sanguinetti, Guido and Vallejos, Catalina A.},doi={10.1186/s13059-021-02329-8},issn={1474760X},journal={Genome Biology},keywords={DNA methylation,Epigenetic heterogeneity,Hierarchical Bayes,Single-cell},number={1},pages={1--21},pmid={33879195},publisher={Genome Biology},title={{scMET: Bayesian modeling of DNA methylation heterogeneity at single-cell resolution}},url={https://doi.org/10.1186/s13059-021-02329-8},volume={22},year={2021}}
Bioinformatics
Higher order methylation features for clustering and prediction in epigenomic studies
Kapourani, Chantriolnt-Andreas, and Sanguinetti, Guido
Motivation: DNA methylation is an intensely studied epigenetic mark, yet its functional role is incompletely understood. Attempts to quantitatively associate average DNA methylation to gene expression yield poor correlations outside of the well-understood methylation-switch at CpG islands. Results: Here we use probabilistic machine learning to extract higher order features associated with the methylation profile across a defined region. These features quantitate precisely notions of shape of a methylation profile, capturing spatial correlations in DNA methylation across genomic regions. Using these higher order features across promoter-proximal regions, we are able to construct a powerful machine learning predictor of gene expression, significantly improving upon the predictive power of average DNA methylation levels. Furthermore, we can use higher order features to cluster promoter-proximal regions, showing that five major patterns of methylation occur at promoters across different cell lines, and we provide evidence that methylation beyond CpG islands may be related to regulation of gene expression. Our results support previous reports of a functional role of spatial correlations in methylation patterns, and provide a means to quantitate such features for downstream analyses. Availability: https://github.com/andreaskapou/BPRMeth
@article{Kapourani2016,archiveprefix={arXiv},author={Kapourani, Chantriolnt-Andreas and Sanguinetti, Guido},doi={10.1093/bioinformatics/btw432},eprint={1603.08386},issn={14602059},journal={Bioinformatics},number={17},pages={i405--i412},pmid={27587656},title={{Higher order methylation features for clustering and prediction in epigenomic studies}},url={https://doi.org/10.1093/bioinformatics/btw432},volume={32},year={2016}}
GBIO
Melissa: Bayesian clustering and imputation of single-cell methylomes
Kapourani, Chantriolnt-Andreas, and Sanguinetti, Guido
Measurements of single-cell methylation are revolutionizing our understanding of epigenetic control of gene expression, yet the intrinsic data sparsity limits the scope for quantitative analysis of such data. Here, we introduce Melissa (MEthyLation Inference for Single cell Analysis), a Bayesian hierarchical method to cluster cells based on local methylation patterns, discovering patterns of epigenetic variability between cells. The clustering also acts as an effective regularization for data imputation on unassayed CpG sites, enabling transfer of information between individual cells. We show both on simulated and real data sets that Melissa provides accurate and biologically meaningful clusterings and state-of-the-art imputation performance.
@article{kapourani2019melissa,author={Kapourani, Chantriolnt-Andreas and Sanguinetti, Guido},journal={Genome Biology},number={1},pages={61},publisher={BioMed Central},title={{Melissa: Bayesian clustering and imputation of single-cell methylomes}},url={https://doi.org/10.1186/s13059-019-1665-8},volume={20},year={2019}}