Structural correctness in metagenomics assembly

Description of the granted funding

The amount of sequencing data has increased enormously in the last decade. To be analysed efficiently, the data needs to be assembled into genomes or represented in a compact manner. However, current tools for assembly and compaction of sequencing data output only sequences, with no estimates of their correctness, which severely hampers accurate assessment of the correctness of downstream analyses. We will develop models to estimate the structural correctness of sequence reconstructions. For each substring of the reconstructed sequences, we will provide the probability that it occurs in the underlying sample. We will consider three settings: assembling a single genome, assembling a mixture of genomes, and compacting sequencing data. Our methods will make it possible to assess the correctness of downstream genomic analyses accurately and to direct validation efforts to uncertain regions of the reconstructed sequences.

Starting year

2025

End year

2029

Granted funding

Leena Salmela
599 999 €

Funder

Research Council of Finland

Funding instrument

Academy projects

Decision maker

Scientific Council for Natural Sciences and Engineering
12.06.2025

Other information

Funding decision number

370538

Fields of science

Biomedicine

Research fields

Systems biology, bioinformatics

Identified topics

genes, genetics