25 February 2025
I am a PhD student at the Swammerdam Institute for Life Sciences, working within the Biosystems Data Analysis (BDA) and Plant Hormone Biology (PHB) groups. My research focuses on developing algorithms and pipelines to analyse high-dimensional data from biological experiments. Specifically, I aim to uncover host-microbe pathways relevant to sustainable agriculture and human and plant health.
I’m particularly proud of our co-expression analysis tool, MASCARA, which has been available as a preprint for almost a year. It’s great to see colleagues from other groups adopting it to discover new genes, metabolites and microbes within their research domains.
It is interesting to see how similar methods are applied across vastly different research fields. Being part of the DSC has broadened my knowledge and sparked new interests beyond my specific area of study.
While it’s tempting to focus on the latest LLMs or neural network architectures, in the BDA group we often rely on algebraic and classical statistical methods. Experimentalists can only grow a limited number of plants, yet need to investigate tens of thousands of genes and chemicals from these few samples. That is why I find all approaches to latent variable estimation fascinating for uncovering hidden patterns in complex datasets.
Having grown up with R, I admittedly find it more comfortable for prototyping a quick analysis. However, I’ve recently encountered optimisation and parallelisation bottlenecks, prompting me to increasingly turn to Python. Though this isn’t the long-term solution either, navigating these challenges keeps the work dynamic and exciting.