The DSC seeks to accelerate data-driven research within the University of Amsterdam. Part of that mission is to foster interdisciplinary research. Specifically, this programme aims to foster research into new data science methods that help to tackle hard problems in a given domain.
The DSC is providing funding for seven PhD students to carry out research at this intersection of data science methods and domain problems. The collaboration is realized through joint supervision: one supervisor with core expertise in data science methods, the other with core expertise in the domain problem.
The projects are collaborations from across the University and include all seven faculties. The projects are:
The GPU Dentist: Instilling Domain Knowledge in Deep Networks through Hyperbolic Geometry
Deep learning still requires many training examples. Particularly in the medical and dental domains, visual examples and corresponding labels are hard to obtain. Recent efforts to improve dental radiographic diagnostics with the help of AI did not outperform the clinicians’ diagnostic accuracy. On the other hand, domain knowledge is typically available in abundance. The aim of this project is to answer the question: Can such domain knowledge be used to accelerate deep learning in dental radiology?
Building better vision models using pre-cortical inductive biases
State-of-the-art computational models of vision are capable of human-level object recognition. However, they still face many challenges that are solved elegantly by the brain. This project looks at bringing well-understood properties of the human nervous system into deep learning and computational models of vision. A key ingredient in the project is the expansion of the Open Amsterdam Data Set with visual images and real-world video with a resolution that is two orders of magnitude higher than those currently used.
Perceiving through the language lens: an interdisciplinary data-driven approach to language-perception interaction
Does language influence our perception of the world? This question has interested Humanities scholars for many years. Recent experimental work shows that linguistic labels can indeed function as 'cues' to the perceptual system. However, the underlying neural mechanisms through which language exerts this top-down influence are unknown. By combining different data types (anatomical data from structural brain scans (MRI), electrophysiological data (EEG), and behavioural data from psycholinguistic experiments), this project aims to gain deeper insight into language-on-perception effects.
A novel empirical and data-driven hybrid approach for analysing legal risks, their causes and impact in decentralised techno-social systems
Decentralised technological infrastructures (e.g. blockchains) promise a trustworthy technological environment for many applications. However, properties of their design create significant deviations from the societal expectations embodied in institutions, laws, and ethical frameworks. The aim of this project is to develop new data science methods to detect the existence and severity of legal risks in such environments.
Natural Language Processing and Responsible Data Management for Mental Health Research
Linguistic patterns in patients’ utterances have been shown to be predictive of different symptoms related to mental health conditions. However, most studies have been conducted using social media. This project aims to develop novel methods for the early detection of mental health conditions based on analysis of written texts produced both by patients on social media and by healthcare workers in electronic health records. A key technical challenge in this project is to ensure fair representation of demographic groups in the context of heterogeneous data sources.
Innovation genome: Discovering the secret ingredients of successful innovations in the cultural industries and science using geometric deep learning and visual analytics
What are the ingredients needed to produce a successful innovation? To answer this question, this project aims to analyze and map an innovation genome in the cultural industries (visual arts) and science by extracting, modelling, and interactively visualizing complex traits (e.g., inventions) and patterns of influence. Creating such a map requires advancements in multimodal geometric deep learning approaches that are informed by state-of-the-art business theories and that embed actors, innovations, and categories of interest in a joint semantic space.
Optimal CO2 conversion by inverse design
An urgent challenge faced by society is global warming caused by anthropogenic CO2 emissions into the atmosphere. One solution is to use CO2 as a building block for high value-added chemicals and fuels. A challenge in implementing this solution is the development of improved catalytic materials. This project will create novel machine learning approaches (e.g., deep generative modeling, deep probabilistic programming) that are able to infer optimal catalyst materials and process conditions given a set of desired chemical and process properties. It will also explore few-shot learning ideas that start from generic databases and are then refined using sparse but more accurate quantum chemical simulations and experimental data.
The project is a collaboration between Dr. Bernd Ensing, Dr. Shiju Raveendran (Van 't Hoff Institute for Molecular Sciences, Faculty of Science) and Prof. Max Welling (Informatics Institute, Faculty of Science).