It Could Take Lifetimes To Catalog the Harvard Zoology Museum’s Collections Online. AI Tools Might Help.
The Museum of Comparative Zoology holds more than 21 million specimens from its more than 150-year history, which could take lifetimes to manually add to digital catalogs, but some museum researchers hope artificial intelligence tools could speed up the process.
These research collections include entomology — the study of insects within the field of zoology — which holds 7,500,000 specimens and primary types of over 33,000 species. According to recent numbers, only 395,000 of these insect specimens have been added to MCZbase — an online database of the museum’s collections, accessible to researchers across the world — leaving millions of specimens uncatalogued.
“There’s this huge challenge to get the data that’s on all the labels on the other 90 percent of the collection into data sets, give them catalog numbers, so that the data is actually useful,” said Brendan K. Haley, the senior database manager.
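The scale of the backlog follows directly from the two figures reported above. As a quick check, the short sketch below uses only those numbers (7,500,000 insect specimens, 395,000 in MCZbase) to work out the uncataloged remainder; the variable names are illustrative only.

```python
# Figures as reported in the article; variable names are illustrative.
total_insect_specimens = 7_500_000
cataloged_in_mczbase = 395_000

# Specimens still waiting to be entered into the database.
uncataloged = total_insect_specimens - cataloged_in_mczbase
cataloged_share = cataloged_in_mczbase / total_insect_specimens

print(f"Cataloged: {cataloged_share:.1%}")
print(f"Uncataloged: {uncataloged:,} specimens ({1 - cataloged_share:.1%})")
```

By these figures, a little over 5 percent of the insect collection is in the database, which leaves roughly 7.1 million specimens still to be cataloged.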
In the early 2010s, the museum followed a global trend and began using new 2D and 3D imaging technology to enter information on these specimens into the worldwide database.
Jon Woodward, who was appointed the museum’s digitization manager in 2023, has been with the museum for the past 20 years and has witnessed these technological shifts.
“Recently we’ve seen both from the research side and increasingly from a sort of collections management angle, a desire and an effort to digitize specimens in three-dimensions through a variety of different modalities,” he said.
The digital imaging facilities at the museum consist of a digital x-ray system and a micro-CT scanner with a paired analysis workstation, which provide a three-dimensional view and information on the density of a given specimen.
“A lot of the research labs here that are associated with the MCZ have made, at this point, almost constant use of the CT scanner on a variety of different kinds of animal specimens — everything from tick mouth parts up to big, chunky fossils,” Woodward said.
The museum uses photogrammetry to stack dozens of two-dimensional images on top of each other, and has recently begun working with Gaussian splatting — a 3D technique that paints spots of color repeatedly “millions and millions of times” — to document what photogrammetry has difficulty capturing: translucent and reflective objects, Woodward said.
Even with advanced technology that allows for complex 2D and 3D imaging, the museum anticipates a further shift as AI becomes more prevalent.
With recent developments, Woodward said he sees AI use as “kind of a big, amorphous uncertainty blob right now.”
“It seems like there are promising ways that AI — machine learning — will help us to speed up the task of getting our specimens digitized,” he added.
According to Haley, advancements in imaging and AI have the potential to turn individual labels — that aren’t “necessarily illuminating” — into meaningful data sets.
In entomology, for example, AI could cut the time needed to catalog millions of specimens from lifetimes to years.
“As far as data collection goes, there’s a lot of hope that you can use imaging to image large sums of specimens at a single shot, hopefully read the labels that are on the specimens, and then hopefully use some AI that’s been trained by entomology data to figure out what those labels mean and produce data that can be loaded into a database,” Haley said.
“It would still be a couple hundred years from now that they would have actually cataloged the entire collection,” Haley added. “And that’s, of course, fighting the tide of people are also still bringing in new stuff.”
Environmental concerns are also placing more pressure on the collections, adding impetus for AI use to help “preserve what we have and know what we have,” Woodward said.
He said increased access to the collections “would be beneficial to us now, particularly pressing at a time when the climate is changing and we’re seeing habitat loss and loss of biodiversity worldwide.”
But alongside that potential, implementing AI would also bring challenges, according to Woodward.
“The challenge is going to be the learning curve and figuring out how to accommodate data; how to represent data that we’ve acquired or that we’ve captured with the help of machine learning that has some amount of uncertainty built into it,” Woodward said.
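One way to picture the problem Woodward describes is a catalog record that stores, next to each transcribed value, how sure the software was about it. The sketch below is purely hypothetical: the field names, the confidence scores, the 0.9 review threshold and the sample values are all invented for illustration and do not describe MCZbase or any tool the museum uses.

```python
from dataclasses import dataclass


@dataclass
class LabelField:
    """One value read from a specimen label (hypothetical structure)."""
    name: str
    value: str
    confidence: float        # 0.0 to 1.0, an assumed model score
    source: str = "machine"  # "machine" or "human"


def needs_review(fields, threshold=0.9):
    """Return names of machine-read fields below the assumed threshold."""
    return [
        f.name
        for f in fields
        if f.source == "machine" and f.confidence < threshold
    ]


# Made-up example record, not real specimen data.
record = [
    LabelField("collector", "example collector", 0.97),
    LabelField("locality", "partly legible text", 0.41),
    LabelField("date", "1902-06-14", 1.0, source="human"),
]

print(needs_review(record))
```

In this sketch only the low-confidence locality field would be routed to a person for checking, while values entered by staff are left alone.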
Despite technological change, Woodward said the mission to preserve the collections remains the same.
“Overall, our mandate is to preserve animal specimens and their data and make them available to science,” Woodward said. “So in that sense, the new tech is just an extension of that. It’s a way to enrich the data that we have, to make it available to people in new ways.”
—Staff writer Ella F. Niederhelman can be reached at ella.niederhelman@thecrimson.com.