Scientists Uncover 188 Hidden CRISPR Systems with Breakthrough Algorithm!
Dr. Feng Zhang’s lab is at the forefront of developing innovative gene-editing systems for both research and drug discovery. A crucial aspect of its work involves searching organisms, whether prokaryotic or eukaryotic, for enzymes or systems that can be repurposed for gene editing.
Despite the wealth of genomic data in extensive databases, the challenge lies in the time-consuming process of mining such vast repositories. In a recent publication in the journal Science, Zhang and collaborators from the Broad Institute of MIT and Harvard, the McGovern Institute for Brain Research at MIT, and the National Center for Biotechnology Information (NCBI) introduced a new search algorithm: Fast Locality-Sensitive Hashing-based clustering (FLSHclust).
This algorithm employs clustering techniques to rapidly navigate through massive genetic datasets. The researchers applied FLSHclust to three public databases, including the NCBI’s Whole Genome Shotgun database and that of the Joint Genome Institute, which house genomic information for diverse bacteria found in unconventional environments like dog saliva, coal mines, Antarctic lakes, and breweries.
FLSHclust, grounded in locality-sensitive hashing, a technique from computer science and big data mining, efficiently identifies similarities in genetic data. The team successfully used FLSHclust to sift through the three databases in a matter of weeks, uncovering 188 previously unreported CRISPR-associated systems, including rare variants.
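To give a feel for the general idea behind locality-sensitive hashing, here is a minimal Python sketch. It is not FLSHclust itself: it uses MinHash signatures over short sequence fragments (k-mers) with a banding scheme, a textbook form of the technique, and every function name and parameter in it is a hypothetical choice made for illustration. The point it shows is that similar sequences tend to land in the same hash bucket, so only sequences sharing a bucket need to be compared directly, rather than every possible pair.

```python
import hashlib
from collections import defaultdict


def kmers(seq, k=3):
    """Return the set of overlapping length-k fragments of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token; the seed selects the hash function."""
    data = f"{seed}:{token}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, num_hashes=32, k=3):
    """MinHash signature: for each seeded hash, keep the smallest value over all k-mers.

    Assumes the sequence is at least k characters long.
    """
    tokens = kmers(seq, k)
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(num_hashes))


def lsh_candidate_pairs(seqs, num_hashes=32, band_size=4, k=3):
    """Bucket sequences by bands of their signature; return pairs sharing any bucket."""
    buckets = defaultdict(list)
    for name, seq in seqs.items():
        sig = minhash_signature(seq, num_hashes, k)
        for start in range(0, num_hashes, band_size):
            # Sequences with an identical band of signature values share a bucket.
            buckets[(start, sig[start:start + band_size])].append(name)

    pairs = set()
    for members in buckets.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.add(tuple(sorted((members[i], members[j]))))
    return pairs


# Toy protein-like strings, invented for illustration only.
example = {
    "seq_a": "MKTAYIAKQRQISFVKSHFSRQ",
    "seq_b": "MKTAYIAKQRQISFVKSHFSRQ",   # identical to seq_a
    "seq_c": "GGGGGGGGGGGGGGGG",         # shares no k-mers with seq_a
}
candidates = lsh_candidate_pairs(example)
```

In this toy run, the two identical sequences are guaranteed to share every bucket, while the unrelated one would only be paired with them through a hash collision. A real pipeline would follow this candidate step with a proper alignment or similarity check and a clustering pass; how FLSHclust handles those stages is not described in this article.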
“This new algorithm allows us to parse through data in a time frame that’s short enough that we can actually recover results and make biological hypotheses,” remarked Dr. Soumya Kannan, a co-first author of the study and a post-doctoral researcher at Harvard University.
Dr. Han Altae-Tran, another co-first author, emphasized the excitement of expanding the scale of their search and improving exploration methods. The researchers characterized four of the discovered systems in the laboratory, unveiling their capacity to edit DNA or RNA in human cells, among other functions.
These findings included new variants of Type I CRISPR systems that use a 32-base-pair guide RNA, which could offer more precise gene editing than the commonly used Cas9 with its 20-nucleotide guide. The authors emphasized the untapped potential of these newly discovered CRISPR-linked systems, describing them as a “treasure trove” with diverse biochemical activities.
Dr. Feng Zhang concluded by highlighting the importance of tools like FLSHclust in exploring the vast sequence space of biodiversity, emphasizing the need to uncover “molecular gems” as genome and metagenomic sequencing efforts continue.