A researcher from the University of Waterloo has developed a new software tool that could help answer some of the world’s most fascinating questions.
The software combines supervised machine learning (ML) with digital signal processing (DSP). The tool could help answer questions such as how many different species exist on Earth and in the oceans, how newly discovered and extinct species are related, what the bacterial origins of human mitochondrial DNA are, and whether a DNA parasite has the same genomic signature as its host.
Researchers also see great potential for the software in recognizing and labeling the precise strain of a virus. That could benefit personalized medicine by allowing drugs to be developed and prescribed for that specific strain.
ML-DSP, an alignment-free software tool, works by converting a DNA sequence into a digital (numerical) signal, which is then processed and classified using digital signal processing methods.
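To make the idea concrete, here is a minimal Python sketch of that kind of pipeline. It is not the ML-DSP code itself. The purine/pyrimidine mapping, the correlation-based distance, the nearest-reference classifier, and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Assumed numerical representation (illustrative only):
# purines (A, G) map to -1, pyrimidines (C, T) map to +1.
PP_MAP = {"A": -1.0, "G": -1.0, "C": 1.0, "T": 1.0}


def dna_to_signal(seq):
    """Convert a DNA string into a numerical signal."""
    return np.array([PP_MAP[base] for base in seq.upper()], dtype=float)


def magnitude_spectrum(seq, length):
    """Magnitude of the discrete Fourier transform of the signal.

    np.fft.fft zero-pads (or truncates) the signal to `length`,
    so sequences of different sizes become comparable.
    """
    return np.abs(np.fft.fft(dna_to_signal(seq), n=length))


def spectral_distance(seq_a, seq_b):
    """Distance in [0, 1] based on the Pearson correlation of the spectra."""
    n = max(len(seq_a), len(seq_b))
    spec_a = magnitude_spectrum(seq_a, n)
    spec_b = magnitude_spectrum(seq_b, n)
    r = np.corrcoef(spec_a, spec_b)[0, 1]
    return (1.0 - r) / 2.0


def classify(query, labelled_refs):
    """Hypothetical supervised step: return the label of the closest reference.

    `labelled_refs` is a list of (sequence, label) pairs.
    """
    closest = min(labelled_refs, key=lambda pair: spectral_distance(query, pair[0]))
    return closest[1]
```

Because sequences are compared through their spectra rather than base by base, no alignment step is needed, which is the property that "alignment-free" refers to. A real classifier would be trained on many labelled genomes. The nearest-reference rule here only stands in for that supervised step.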
Lila Kari, a professor in Waterloo’s Faculty of Mathematics, said the tool can classify DNA sequences even when they are small fragments, and regardless of their origin, whether natural, synthetic, or computer-generated.
In the study, the researchers compared ML-DSP with other state-of-the-art classification software on two small benchmark datasets and one large dataset of 4,322 vertebrate mitochondrial genomes.
The results showed that ML-DSP far outperforms alignment-based software in processing time, with classification accuracy that is comparable on small datasets and superior on large ones.
Compared with other alignment-free software tools, ML-DSP achieved better classification accuracy and was faster overall. The researchers also ran preliminary experiments demonstrating ML-DSP's potential on other datasets, classifying 4,271 complete dengue virus genomes into subtypes with 100 percent accuracy and 4,710 bacterial genomes into categories with 95.5 percent accuracy.
Daniel Kiss is the senior editor for News Lair. Daniel has worked as a writer since finishing high school, first for local papers and then online. Nowadays he likes to write about the latest games and tech innovations.