Academic story

Speeding Up DNA Analysis With String Algorithms

4 min read · By Academic Positions

Addressing some of the biggest challenges in medical science relies on processing unimaginably huge amounts of data. Theoretical computer scientists like Hilde Verbeek, a second-year PhD student at Centrum Wiskunde & Informatica (CWI) in the Netherlands, are creating algorithms that can sift through this data more efficiently, meaning critical analysis can be performed faster than ever.

PhD student Hilde Verbeek

One of the most important applications for this theoretical research - which is carried out on paper rather than in a computer program - is in DNA analysis. “A single human genome consists of around 3 billion base pairs, and in practice, the amount of data that's worked with is even larger than that,” says Hilde. “So we work on algorithms and data structures that allow this analysis to be done faster and that use less space.” This is pivotal in situations like the Covid pandemic, where rapidly tracking the spread of new variants around the world was key to tackling the virus.

Hilde’s work currently involves developing algorithms to identify what’s known as the shortest unique substring. “Given some sequence like DNA or text, we’re looking for a certain part of this sequence that occurs just once,” Hilde explains. In DNA, finding the shortest part of the sequence that meets this criterion enables certain genes to be more easily identified.
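To make the problem concrete, here is a minimal brute-force sketch in Python. It is not the linear-time method mentioned in the article, nor the faster algorithm from Hilde's research; it simply tries every length in increasing order and returns the first substring that occurs exactly once. The function name `shortest_unique_substring` is a hypothetical choice for this illustration.

```python
from collections import Counter
from typing import Optional


def shortest_unique_substring(text: str) -> Optional[str]:
    """Return the leftmost shortest substring that occurs exactly once.

    Illustrative brute force only: it is far slower than the algorithms
    discussed in the article and is meant for short example inputs.
    """
    n = len(text)
    for length in range(1, n + 1):
        # Count every substring of this length, overlapping occurrences included.
        counts = Counter(text[i:i + length] for i in range(n - length + 1))
        for i in range(n - length + 1):
            candidate = text[i:i + length]
            if counts[candidate] == 1:
                return candidate
    return None  # only reached for the empty string


print(shortest_unique_substring("ACGTACGT"))  # TA
```

In "ACGTACGT" every single letter appears twice, so no substring of length one is unique; "TA" is the only pair of letters that appears once, which makes it the shortest unique substring.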

“There is a very simple algorithm that can perform this task in time proportional to the length of the sequence and that uses fundamental techniques commonly taught in universities. But we’ve found a way to do this faster than would be intuitively possible - which is very interesting because it's a lot more complex, and it uses a lot of different techniques. We do this by basically taking advantage of the fact that in many applications, such as DNA analysis, we are working with alphabets that are very small - DNA has just four different characters.” In practice, that means these abstract algorithms can shorten the time it takes to find genetic disorders and abnormalities.
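The article does not say exactly how the small alphabet is exploited. One widely used idea, shown here purely as an assumption-laden sketch rather than as the method from the research, is that four characters fit in two bits each, so many DNA letters can be packed into a single machine word. The names `pack`, `unpack` and `packed_size_bytes` are hypothetical helpers for this illustration.

```python
# Assumed encoding for this sketch: two bits per DNA base.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> int:
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODES[base]
    return value


def unpack(value: int, length: int) -> str:
    """Recover a DNA string of the given length from its packed form."""
    out = []
    for shift in range(2 * (length - 1), -2, -2):
        out.append(BASES[(value >> shift) & 3])
    return "".join(out)


def packed_size_bytes(num_bases: int) -> int:
    """Bytes needed to store num_bases at two bits each, rounded up."""
    return (2 * num_bases + 7) // 8


print(pack("GATT"))                      # 143
print(unpack(143, 4))                    # GATT
print(packed_size_bytes(3_000_000_000))  # 750000000
```

At two bits per base, the roughly 3 billion base pairs of a human genome fit in about 750 MB, a quarter of what one byte per character would need, and a single 64-bit word holds 32 bases.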

Hilde working on the CWI campus

As a research institute for maths and science, rather than a university, there’s no teaching at CWI. “For a PhD student, this means that a lot more focus can be given to research, and I think it's created a very nice atmosphere here,” says Hilde. She was the first recipient of CWI’s Constance van Eeden fellowship, which offers a female student a PhD position and is named after one of the first women to receive a PhD in statistics in the Netherlands. “It’s given me a lot of freedom to choose what I want to do within my PhD. I got to choose which research group I wanted to join and also choose my supervisor, which is how I ended up in this string algorithms group. But it also means I can work on projects that are outside of this research group, if I want to.”

Part of Hilde’s fellowship also includes mentorship from a CWI academic outside of her area of study. “She guides me through things that are not directly related to research, but are important to know when you're pursuing an academic career. I can go to her with any questions or problems I have,” she says. The Diversity, Equity and Inclusion team Hilde is a part of is pushing for this type of mentorship to be standard for all PhD students. “We are trying to guide the policy-making within the institute to create a more diverse, inclusive, and equitable academic world, and we believe this would really help students who are in some way disadvantaged. 

“Diversity has been a very big focus at CWI over the past few years, which is very good to see. I think it's important not only because it's fairer to people, but by allowing for different perspectives, it will also accelerate research.”

This kind of collaboration is found throughout the institute even outside the labs. “There’s a community atmosphere,” says Hilde. “There are a lot of organised group activities, which help us connect with each other. There's very much an effort made to allow people to let themselves be distracted, to socialise, to take the stress off, and to meet others. And I'm very happy that's done here.” 

Featured employer

Founded in 1946, CWI is the national research institute for mathematics and computer science in the Netherlands and is located at Science Park Amsterdam.

Published 2024-11-06

Featured researcher

Hilde Verbeek

Hilde Verbeek is a second-year PhD student at Centrum Wiskunde & Informatica (CWI). She was the first recipient of CWI’s Constance van Eeden fellowship.
