Posted 30 November 2021 | By Kari Oakes 

A key microbial genome database is being updated and expanded through a public-private partnership, as the US Food and Drug Administration (FDA) works with a health data company and an academic institution to build out the FDA-ARGOS database.
In partnership with George Washington University and Embleema, which maintains a bioinformatics platform, FDA is conducting a year-long, $2 million project to “further improve the utility of the FDA-ARGOS database as a key tool for medical countermeasure development and validation,” according to FDA’s webpage about the project.
The FDA-ARGOS database is envisioned as a powerful tool “that contains quality controlled and curated genomic sequence data to support research and regulatory decisions,” says FDA’s home page for the database. The database can support in silico modeling for such purposes as performance validation, with the potential to ease the testing burden when industry is meeting regulatory requirements.
The new project’s focus on highly clinically significant microbial species will mesh with a diagnostic testing industry that is rapidly moving toward the use of next-generation and high-throughput genetic sequencing (NGS and HTS) for infectious disease diagnostics. These technologies can allow clinicians to detect pathogens with little or no a priori information to narrow a diagnosis, explains the FDA-ARGOS website.
“NGS technology can potentially reveal the presence of all microorganisms in a patient sample. Using infectious disease NGS (ID-NGS) technology, each microbial pathogen may be identified by its unique genomic fingerprint,” according to the website. “The vision of ID-NGS technology is to further improve patient care by delivering diagnostics which can help identify the microbial makeup in patient samples quickly and accurately.”
The COVID-19 pandemic has already prompted some expansion of the FDA-ARGOS database, with reference-grade sequence data for SARS-CoV-2 being added to the database in April 2020.
Since then, researchers have continued to invite other scientific collaborators to make contributions to the database; project managers have especially sought more SARS-CoV-2 samples as the pandemic wears on and variants continue to emerge.
“The FDA-ARGOS and collaborators are specifically searching for unique, hard to source microbes such as biothreat organisms, emerging pathogens, and clinically significant bacterial, viral, fungal, and parasitic genomes,” according to an information page hosted by another partner, the University of Maryland. “The goal is to collect sequence information for a minimum of 5 isolates per species. These isolates will be sequenced using a combination of long-read and short-read technologies, assembled, annotated, and made publicly available.”
One of the new project’s chief aims is to gather publicly available genomes from clinically important microbial species, and then ascertain which sequences are of sufficient quality to qualify as “regulatory-grade.” A to-be-built annotation data model will set standards for data acquisition and annotation, as well as harmonization of sequence data across species.
To this end, the partners will also work on developing new analytic tools to make the existing database more user-friendly and stable “in the context of sequence representativeness,” according to the project description.
Finally, the project will put together submission packages to the National Center for Biotechnology Information (NCBI) and add the newly acquired regulatory-grade sequences to the FDA-ARGOS database. Training and documentation will also be provided to FDA staff.
A set of instructions and a list of species included in the database are already accessible through NCBI, though currently only about two-thirds of the species included (971 of 1,410, or roughly 69%) have complete genomic data uploaded to FDA-ARGOS.
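For readers who want to check the coverage figure, the arithmetic can be sketched in a few lines of Python. This is only an illustration using the two counts reported above; the function name `coverage_fraction` is hypothetical and is not part of any FDA-ARGOS or NCBI tool.

```python
def coverage_fraction(complete: int, total: int) -> float:
    """Return the share of listed species that have complete genomic data."""
    if total <= 0:
        raise ValueError("total must be positive")
    return complete / total


# Counts as reported in the article: 971 of 1,410 species.
share = coverage_fraction(971, 1410)
print(f"{share:.1%} of listed species have complete genomic data")
```

The result is slightly above two-thirds, which is consistent with the description in the text.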


