GenoML: Genomic variant interpretation with machine learning
For this project we built an automated pipeline that generates artificial NGS data with known errors using NEAT-genreads and aligns the reads against the human reference genome. Because the injected errors are known, we could train a machine learning model on this data and check its predictions against that ground truth. Finally, we evaluated our method on the well-known HG001 (NA12878) dataset, which allowed us to verify the transferability between synthetic and real data.
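The comparison against the injected errors can be illustrated with a minimal sketch. It assumes that both the known (injected) variants and the model's predictions are available as lists of (chromosome, position, ref, alt) records and that a prediction counts as correct only on an exact match. The `Variant` type and the `score_calls` function are hypothetical names chosen for this illustration and are not taken from the project's code.

```python
from typing import Dict, Iterable, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not the project's actual type)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def score_calls(truth: Iterable[Variant], calls: Iterable[Variant]) -> Dict[str, float]:
    """Compare predicted variants with the injected truth set.

    Assumption: a call is a true positive only if chromosome, position,
    reference allele and alternate allele all match exactly.
    """
    truth_set, call_set = set(truth), set(calls)
    tp = len(truth_set & call_set)   # injected and predicted
    fp = len(call_set - truth_set)   # predicted but never injected
    fn = len(truth_set - call_set)   # injected but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy data, invented purely for illustration.
truth = [
    Variant("chr1", 1000, "A", "G"),
    Variant("chr1", 2000, "C", "T"),
    Variant("chr2", 500, "G", "A"),
]
calls = [
    Variant("chr1", 1000, "A", "G"),
    Variant("chr2", 500, "G", "A"),
    Variant("chr2", 900, "T", "C"),
]
result = score_calls(truth, calls)
print(result)
```

In principle the same scoring could be applied to real data by swapping the injected truth set for a published benchmark call set, though real call sets usually need variant normalization before exact matching is meaningful.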
The project was carried out as part of an Innocheque with Phenosystems SA.