The blue-gray gnatcatcher, Polioptila caerulea, a North American species with high sampling density in the eBird dataset. Credit: Jeremy Cohen ©
Tamara Rudic

Researchers highlight distinct sampling biases in the eBird global citizen-science database

This one’s for the birders, the ecologists, and all the citizen scientists out there – how well does the global citizen science platform eBird capture a representative sample of the Earth’s avian biodiversity? 

The explosion of citizen science biodiversity data in recent years has certainly been a boon to the field of global biodiversity monitoring, but, as the seasoned birders and researchers out there know, more data alone is not enough to meet the pressing need to understand and monitor species in a rapidly changing world. Where and when the data is gathered matters a lot. Understanding what kinds of trends and biases exist in different data resources is crucial to using those resources effectively in science and conservation. In their recent paper “Data coverage, biases, and trends in a global citizen-science resource for monitoring avian diversity,” published in the journal Diversity and Distributions, La Sorte et al. analyze these trends and biases in global eBird hotspot data and uncover several important and regionally distinct skews.

Citizen science monitoring projects are typically opportunistic and sporadic in nature – people tend to snap photos or grab audio recordings near their homes whenever they have a chunk of free time to go exploring. Long-term monitoring, which is crucial for teasing out changes in species populations, requires consistent data collection at defined sites. That consistency underpins important research and conservation goals such as tracking declines in threatened species or monitoring the spread of introduced or invasive species. eBird addresses this need through hotspots: defined, publicly accessible locations, nominated by users, that people are motivated to visit to collect and submit species checklists.

“Hotspots provide a global collection of fixed locations that are often sampled intensively across the year,” says Dr. Frank La Sorte, lead author on the paper and senior scientist at the BGC Center. “Other large-scale ecological data sources lack this level of spatial consistency and rarely capture data across the full annual-cycle.” By facilitating consistent, year-round data collection at the same locations, eBird hotspots circumvent many of the usual citizen science pitfalls and ensure greater data reliability, accuracy, and consistency. 

But even hotspots are prone to bias, and understanding these biases is crucial to applying the data accurately. For example, if hotspot locations overrepresent urban and disturbed environments – due to the tendency for people to submit data closer to their homes – neglecting to account for this bias could lead to conclusions about bird diversity and change that aren’t representative of the entire ecosystem. “By understanding the sampling biases, we can develop scientific questions and methods that leverage the strengths of the data while avoiding critical weakness or inconsistencies, resulting in more reliable and defensible inferences,” says Dr. La Sorte.

To assess these biases, La Sorte et al. pulled data for over 300,000 eBird hotspots all around the world and compared sampling biases in protection status, temperature, precipitation, and landcover between hotspots, non-hotspots (other eBird sampling locations that aren’t formally recognized as hotspots), and randomly sampled geographic locations. They discovered that these biases are not only real but also show distinct patterns in different regions of the world.

The hotspots were first divided temporally into four seasons and spatially into eight biogeographic realms, which are generally defined by their distinct species communities and separated by natural barriers like oceans, deserts, or mountain ranges. The researchers then used statistical analyses to test the hotspots for sampling biases in protection status, temperature, precipitation, and landcover.

The good news first: in the past couple of decades, the number of hotspots, checklists, and eBird participants has increased consistently all around the world, enough so that nearly 97% of all known bird species have been captured in hotspot checklists. This means that a pretty hefty portion of the world’s birds is being monitored in global hotspot locations.

Now, the more complicated findings: across the biogeographic realms, the researchers found that hotspot locations are more likely to sample protected areas, urban areas, and wetland and water body habitats. Beyond these shared patterns, the biases differed among realms: in North America, Eurasia, and Antarctica, hotspots oversampled warmer and wetter habitats and undersampled the high northern latitudes; in Australia, Africa, and South and Central America, they oversampled cooler habitats; and in the Pacific islands, they oversampled drier habitats.

The hooded merganser, Lophodytes cucullatus, a common North American species. Credit: Jeremy Cohen ©

All this means that while we’re getting a consistently increasing volume of bird monitoring data from hotspots, biases in where exactly that data is being collected could lead to a skewed interpretation if corrections aren’t made, the researchers warn. As they point out in the paper, a lack of data at the high northern latitudes could make it difficult to assess the impacts of Arctic warming on bird species, and biases in what types of landcover are sampled could obscure how bird species respond to land use change. “For example, the observed bias toward sampling in urban areas could suggest that frequent feeder visitors are increasing their numbers even if their populations in undersampled forest habitats are actually declining,” says Dr. Jeremy Cohen, co-author of the study and Associate Research Scientist at the BGC Center. However, the researchers point out that, compared to data from non-hotspot locations, the hotspots still do a good job of mitigating many of the common biases that originate from opportunistic citizen science data collection.

When you think about the birds, don’t forget about the biases – the researchers urge other scientists who use hotspot data in their studies to account for these observed sampling biases in order to ensure they generate accurate and reliable results, especially when it comes to assessing the impacts of land use change and global warming. Such practices will be crucial for ensuring that our understanding of the world’s changing avian diversity is as close as possible to the true picture.