NSF Solicitation: Advancing Digitization of Biological Collections

Over the last few years we have followed the rapid emergence of “big data science.” One strand of it is the digitization of the “collections” that underpin new research efforts. When such collections are used in new research, they are now required to be placed in an accessible database and made available to the broader research community.

The National Science Foundation is driving much of this activity, and we find the language in its solicitations for new grants a useful marker of how quickly, and in what direction, the field is moving.

In this regard, NSF released a new solicitation last week – Advancing Digitization of Biological Collections (ADBC) – which provides an excellent example.

Language from the Program Description section provides context:
“Digitizing and mobilizing the Nation’s biological and paleontological collections represents a grand challenge and will require development of both technical and human resources to support the creation of an enduring digital alliance of collections and institutions.

“This program establishes a national resource to integrate the digitization data and make it widely accessible. Collections digitization is defined broadly for the purpose of this solicitation to include the capture of digital images of specimens, transcription into electronic format of various types of data associated with specimens or linking ancillary data already stored in an electronic format apart from the voucher specimens, and the georeferencing of specimen-collection localities.

“…Paleontological collections are included and may be integrated with biological collections if relevant to a research theme, or may be developed around a research theme unique to the past. This program will create an organizational structure and processes inclusive of the broad biological and paleontological collections community, provide open data access, and empower biological and paleobiological researchers.”

Grand…

The NSF does note that “new efforts and approaches to understanding biodiversity and advancing our knowledge are represented by several NSF programs (e.g., Dimensions of Biodiversity, Systematics and Biodiversity Science, Sedimentary Geology and Paleobiology).” It continues: “However, there is a digitization bottleneck that effectively limits access to information residing in the various vouchered collections across the U.S. and the world. It is estimated that U.S. collections contain one billion specimens, but only 10% of these are accessible online.”

This solicitation comes in the broader context of surveys of federally held or supported collections, and it responds to a ten-year strategic plan to digitize, image and mobilize biological collections data, available at http://digbiocol.files.wordpress.com/2010/08/niba_brochure.pdf.

The goal of this digitization effort is “to produce a resource of lasting value for answering major research questions,” with these key objectives:

–  digitize data from all U.S. biological collections, large and small, and integrate these in a web accessible interface using shared standards and formats

–  develop new web interfaces, visualization and analysis tools, data mining, georeferencing processes and make all available for using and improving the collections resource

–  create real-time upgrades of biological data and prevent the future occurrence of non-accessible collection data through the use of tools, training, and infrastructure

Altogether ambitious, and instructive for other disciplines that are still working out how to act on the “collections” integral to their own advancement!

David R Curry
30 October 2011
