Photo credit: Alexander Pyron
A slender loris sitting on a branch against a black background. The small primate is covered in speckled brown fur and has huge red eyes and a small protruding snout.
Dr. Mario Moura
Publications

Imagine a masterpiece of art, such as a book, a film, or a piece of music. Each of these forms of expression contains pieces of information that come together to form an idea, tell a story, or compose a symphony. But what if some crucial pages or sections were missing? The artwork would be incomplete, and our interpretation of it would likely be different. This incompleteness mirrors our knowledge of the planet's biodiversity. Countless species and their basic traits are unknown to us, leaving us unaware of both their existence and their potential relevance. In a recent study published in PLOS Biology, our team assessed these gaps and set out to close some of them by compiling thousands of data sources on bird, mammal, amphibian, and reptile species traits and using machine learning to predict missing information.

Like any great story or symphony, the construction of nature's masterpiece is a cumbersome task. For hundreds of years, scientists and naturalists have traveled continents and oceans to catalog species and document their habitats and behaviors in a field of study known as natural history. Despite this centuries-old effort, significant biodiversity knowledge gaps persist.

Photo of a walking leaf frog sitting on a branch against a black background. The frog is a vivid green color and one large eye stares at the camera.
Data gaps in natural history may arise due to challenges in detecting species with specific traits, such as those inhabiting the canopy or active during the night, like this walking leaf frog (Phyllomedusa burmeisteri). Photo credit: Mario R. Moura.

While new field expeditions may shed light on poorly known species, many of these species are rare. Some were observed only a few times decades ago and have never been found again; others were discovered more recently but are known from a single specimen. The bottom line is that we’re missing key trait information about these understudied species. In our new paper “A phylogeny-informed characterisation of global tetrapod traits addresses data gaps and biases,” we combat these persistent knowledge gaps with a modern ally: artificial intelligence. Building on foundations laid by the NSF-funded VertLife project, we used innovative machine learning methods to uncover biases in natural history data and provide guidance for more effective field research strategies.

First, we compiled existing global data on tetrapods, that is, all bird, mammal, amphibian, and reptile species, which serve as model systems for global ecology and conservation research. This herculean effort involved digitizing more than 3,300 datasets and references, which together represent more than 33,000 documented tetrapod species around the world. Within this global tetrapod database, we include information on a variety of essential species traits, such as body size, activity time, micro- and macrohabitat, ecosystem, threat status, biogeography, insularity, environmental preferences, and level of human influence – the notes that compose the symphony of Earth’s biodiversity.

But some of these notes were conspicuously missing. By analyzing data gaps across different species traits, we uncovered research preferences and practical difficulties in data collection. Only 43% of species had complete recorded values for five key traits – body length, body mass, activity time, microhabitat, and threat status. Many data gaps result from the difficulty of detecting species with certain traits, like small size or nocturnal activity. Other gaps may be linked to species inhabiting hard-to-access habitats, such as the canopy of tropical forests. While these species may be harder for us to study, they are no less important in the tapestry of biodiversity, so filling these gaps is of utmost importance.
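To make the completeness figure concrete, here is a minimal sketch of how such a share could be computed. It is not the code used in the study: the field names and the four toy records are made up for illustration, and the real TetrapodTraits files may be organized differently.

```python
# Hypothetical field names for the five key traits named in the text.
KEY_TRAITS = ["body_length", "body_mass", "activity_time",
              "microhabitat", "threat_status"]


def is_complete(record, traits=KEY_TRAITS):
    """Return True when every key trait has a recorded (non-None) value."""
    return all(record.get(trait) is not None for trait in traits)


def completeness(records, traits=KEY_TRAITS):
    """Fraction of records with no missing key trait."""
    if not records:
        return 0.0
    return sum(is_complete(r, traits) for r in records) / len(records)


def missing_by_trait(records, traits=KEY_TRAITS):
    """Count, for each trait, how many records lack a value."""
    return {t: sum(r.get(t) is None for r in records) for t in traits}


# Made-up records, for illustration only.
toy_species = [
    {"species": "sp_a", "body_length": 12.0, "body_mass": 30.0,
     "activity_time": "diurnal", "microhabitat": "terrestrial",
     "threat_status": "LC"},
    {"species": "sp_b", "body_length": 4.5, "body_mass": None,
     "activity_time": "nocturnal", "microhabitat": "arboreal",
     "threat_status": "VU"},
    {"species": "sp_c", "body_length": 60.0, "body_mass": 2.1,
     "activity_time": None, "microhabitat": "fossorial",
     "threat_status": None},
    {"species": "sp_d", "body_length": 25.0, "body_mass": 110.0,
     "activity_time": "nocturnal", "microhabitat": "arboreal",
     "threat_status": "EN"},
]

share_complete = completeness(toy_species)
gaps = missing_by_trait(toy_species)
print(f"{share_complete:.0%} of toy species are complete")
print(gaps)
```

Counting gaps per trait, as `missing_by_trait` does, is what lets one see which traits are missing most often.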

To address these gaps, we leveraged relationships among species traits to estimate the missing information about biodiversity. It's akin to filling in the absent notes of a musical score to recover its melody. We used a sophisticated artificial intelligence algorithm and a statistical technique called multiple imputation to estimate missing trait values. By incorporating phylogenetics – the evolutionary relatedness of species – into this computation, missing values were estimated with a high degree of accuracy based on species relatedness, trait relationships, and thousands of documented trait values.
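The idea behind multiple imputation can be illustrated with a deliberately simplified sketch. It is not the algorithm used in the study. It assumes a toy table with made-up values, uses a shared `clade` label as a crude stand-in for phylogenetic relatedness, and relies on a single trait relationship (body mass predicted from body length on a log scale). Each missing value is drawn several times rather than once, so the spread of the draws conveys how uncertain the estimate is.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b, residuals)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, residuals


def impute_log_mass(target, records, m=20, seed=0):
    """Draw m plausible log-mass values for a species with unknown mass.

    Only relatives in the same clade (assumed proxy for phylogeny) with a
    known mass inform the regression. Each draw adds a resampled residual.
    A full multiple imputation would also resample the model parameters.
    """
    rng = random.Random(seed)
    relatives = [r for r in records
                 if r["clade"] == target["clade"] and r["mass"] is not None]
    xs = [math.log(r["length"]) for r in relatives]
    ys = [math.log(r["mass"]) for r in relatives]
    a, b, residuals = fit_line(xs, ys)
    prediction = a + b * math.log(target["length"])
    return [prediction + rng.choice(residuals) for _ in range(m)]


# Made-up records: lengths in cm, masses in g.
records = [
    {"species": "frog_a", "clade": "frogs", "length": 3.0, "mass": 2.5},
    {"species": "frog_b", "clade": "frogs", "length": 5.0, "mass": 11.0},
    {"species": "frog_c", "clade": "frogs", "length": 8.0, "mass": 48.0},
    {"species": "frog_d", "clade": "frogs", "length": 6.0, "mass": None},
    {"species": "lizard_a", "clade": "lizards", "length": 6.0, "mass": 4.0},
]

target = records[3]
draws = impute_log_mass(target, records)
pooled_log_mass = statistics.fmean(draws)   # pooled point estimate
spread = statistics.stdev(draws)            # between-draw uncertainty
pooled_mass = math.exp(pooled_log_mass)
print(f"Imputed mass for {target['species']}: about {pooled_mass:.1f} g")
```

Restricting the regression to same-clade relatives is the simplest way to let relatedness inform the estimate. A phylogeny-informed method would instead weight species by their evolutionary distance to the target.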

Photo of a worm lizard on the ground. The animal is legless with brown, wrinkly skin and a slightly pinkish face. It rests on the leaf litter.
Data gaps in natural history may arise due to challenges in detecting species with certain traits, such as subterranean animals like this worm lizard (Amphisbaena alba). Photo credit: Mario R. Moura.

After including the imputed values, the final database showed at least 98% completeness. This novel database, TetrapodTraits (available via Zenodo and VertLife), expands the frontiers of our knowledge of the natural world. Beyond simply summing up available knowledge, TetrapodTraits provides a more representative view of biodiversity that not only helps us better understand the limitations of existing data, but also guides more effective strategies for future field research and data collection. With this wealth of species data at hand, scientists can explore complex species-environment interactions, identify distribution patterns, and better understand threats to biodiversity.

For many species, global changes outpace research knowledge accumulation. While new data allows for more informed conservation planning, it is crucial to intensify sampling efforts and continually improve data collection techniques. Knowledge gaps are not static and may even increase if the discovery of new species is not accompanied by a deep understanding of their ecology.

Just as in artistic masterpieces where each sequence builds upon the other, our knowledge of biodiversity continues to grow. With a more holistic view of natural history, we can better understand how to ensure the conservation of biodiversity and the benefits it provides us.

Mario Moura is a biologist, professor at the Federal University of Paraíba, Brazil, and works with biodiversity, ecology, and conservation.