Each month, the Spotlight introduces a Data Science Centre Affiliate Member. This month, meet Justin Ho, Postdoctoral Researcher at the Digital Communication Methods Lab, Faculty of Social and Behavioural Sciences, and a Carpentries instructor.

Can you tell us more about your role and how you apply data science to your projects?

I am a postdoctoral researcher at the Amsterdam School of Communication Research, focusing on computational social science. My work involves developing frameworks to integrate data science tools into social science research. Recently, I evaluated linguistic and contextual bias in large language models. I tested the performance of vision-language models for extracting theoretical concepts from social media data. I also study social movements and nationalism through social media analysis using large language models.

Is there a project from this past year that you are most proud of?

Coming from a political science and sociology background where party manifestos and legislative texts are easily accessible, I was surprised to find how challenging it is to access news data in communication science due to copyright restrictions. Many databases, like LexisNexis, are costly and restrict automated batch downloading. Scraping the web for articles from news outlets is also a difficult and time-consuming task. To address this, I initiated a project to curate a global dataset of news articles, using open-access web crawl data from Common Crawl. Our pilot currently covers 16 countries, with plans to expand to 90. We're working on securing funding to scale and enhance the project further.

What do you enjoy most about being a DSC member?

Data science evolves rapidly, so staying up to date is challenging. What I particularly enjoy about the DSC is the opportunity to connect with colleagues and exchange experiences on the newest methods and techniques. The interdisciplinary nature of the community exposes me to ideas and tools I wouldn't encounter otherwise.

What is your favourite data science method?

I have a love-hate relationship with generative AI. It’s an incredibly powerful and versatile tool when used correctly, but I’m cautious about relying on it for everything. I'm currently working on a project that compares major generative AI models, weighing performance, reproducibility, and openness.

Are you camp Python, camp R, or something else?

I first worked with R, but now use Python more frequently. Both have their strengths: R excels in statistical tests and visualization (I swear by ggplot!), while Python offers better scalability and is slightly more stable when building apps. My choice often depends on which tool best fits the task at hand.

Dr. J.C. (Justin) Ho

Faculty of Social and Behavioural Sciences

CW: Political Communication & Journalism