The University of Michigan Herbarium is home to some of the finest botanical collections in the world. Founded in 1837 and growing ever since, their 1.7 million specimens of vascular plants, algae, bryophytes, fungi, and lichens, combined with the expertise of the faculty-curators, students, and staff, provide a world-class facility for teaching and research in systematic biology and biodiversity studies. Digitizing accurate information about that many specimens can be challenging. How could they use new technologies like AI to help speed up and reduce human error in their digitizing process?
Ph.D. student Will Weaver, in collaboration with Research Museum Collection Manager and Assistant Research Scientist Brad Ruhfel, Professor Stephen Smith, and project manager Kyle Lough, all in LSA's Department of Ecology & Evolutionary Biology, developed a Python-based application suite called VoucherVision to help speed up their digital workflow while reducing human error:
Example workflow from scanning the specimen label to
creating the database record.
The VoucherVision suite itself consists of three major components:
VoucherVision uses the U-M GPT Toolkit, but it also includes support for OpenAI API, Google PaLM 2, Google Gemini, local LLMs (such as Mixtral and Meta Llama 2), and private OpenAI via Azure to make it more universal for others to use as an open-source project.
While the U-M Herbarium has not deployed VoucherVision at scale yet, our initial calculations suggest that it reduces overall transcription costs by 40-60%. In addition to immediate cost and time savings, the automation frees up valuable time for researchers to focus on more complex tasks and analyses.
Researchers across more than a dozen campuses and museums are using VoucherVision in collaboration with our Herbarium collection managers. For more information about or to participate in testing VoucherVision, please fill out their Google Form.