Hackathon For Building the Agricultural Research Data Network

June 8, 2020

During the first week of May, David and I participated in a hackathon on agricultural data formats. This event was part of a project called the Agricultural Research Data Network (ARDN), which was funded by a NIFA FACT grant (2019-67021-29921). The project is unique in that the majority of the work is done in bi-annual week-long hackathons. This was the second of six hackathons scheduled during the project. The project is led by a team from the University of Florida's Department of Agricultural and Biological Engineering, including Cheryl Porter and Gerrit Hoogenboom. Cheryl and Gerrit are leaders in the Agricultural Model Intercomparison and Improvement Project (AgMIP), and have developed a common data format for agricultural models called ACE (AgMIP Crop Experiment data schema; Porter et al., 2014).

The purpose of the ARDN project is to make a handful of agricultural datasets more Findable, Accessible, Interoperable, and Reusable, as part of a broader movement referred to as FAIR. Making these datasets more FAIR should enable them to be used more broadly in crop modeling research. We are making the datasets in this project more Findable and Accessible by putting them on Ag Data Commons, an easy-to-use data repository run by the USDA. The data are becoming more Interoperable and Reusable as we convert them from their diverse original formats into a single compatible format.

Four initial datasets in the ARDN project are being converted to the ACE format and put on Ag Data Commons. Each comes from a different institution. A team at Michigan State is bringing data from the Kellogg Biological Station LTER, Lori Abendroth at Iowa State is contributing the Corn CAP data (and already has some of that data on Ag Data Commons), and the University of Georgia is contributing its variety trials. Our team at the University of Arizona is converting and registering data from the TERRA REF high-throughput phenotyping dataset.

Our group is focusing on converting the TERRA REF data that is accessible through the plant Breeding API (BrAPI; Selby et al., 2019) into the ACE format. BrAPI is a specification for data from breeding trials that has been developed and implemented by dozens of crop breeding databases worldwide. TERRA REF has implemented a BrAPI-compliant interface in order to make these data more accessible and interoperable with data from the larger plant breeding community. For this ARDN project, we are in the process of specifying how to translate the BrAPI version of the TERRA REF data to the ACE format, which will then be put on Ag Data Commons.
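To give a rough sense of what such a translation involves, here is a minimal Python sketch that regroups BrAPI-style observation records into one flat record per observation unit and date. It is an illustration under stated assumptions, not the mapping we are specifying: the input keys follow BrAPI observation field names, but the function name, the example variable names, and the target column names in `VARIABLE_MAP` are hypothetical placeholders and are not actual ACE variable codes.

```python
from collections import defaultdict

# Hypothetical mapping from observation variable names to target column
# names. These are placeholders for illustration, not real ACE variable codes.
VARIABLE_MAP = {
    "canopy_height": "target_canopy_height",
    "leaf_temperature": "target_leaf_temperature",
}


def brapi_to_flat_records(observations, variable_map=VARIABLE_MAP):
    """Group BrAPI-style observations into one record per unit and date."""
    records = defaultdict(dict)
    for obs in observations:
        target = variable_map.get(obs["observationVariableName"])
        if target is None:
            continue  # skip variables that have no agreed translation
        date = obs["observationTimeStamp"][:10]  # keep only YYYY-MM-DD
        key = (obs["observationUnitDbId"], date)
        records[key]["unit_id"] = obs["observationUnitDbId"]
        records[key]["date"] = date
        records[key][target] = float(obs["value"])
    # Return records in a stable order: by unit, then by date.
    return [records[key] for key in sorted(records)]


# Made-up example observations (not real TERRA REF values).
example_observations = [
    {"observationUnitDbId": "plot-1", "observationVariableName": "canopy_height",
     "observationTimeStamp": "2017-06-01T12:00:00Z", "value": "45.0"},
    {"observationUnitDbId": "plot-1", "observationVariableName": "leaf_temperature",
     "observationTimeStamp": "2017-06-01T12:05:00Z", "value": "30.5"},
    {"observationUnitDbId": "plot-2", "observationVariableName": "canopy_height",
     "observationTimeStamp": "2017-06-01T12:10:00Z", "value": "50.0"},
    {"observationUnitDbId": "plot-1", "observationVariableName": "unmapped_trait",
     "observationTimeStamp": "2017-06-01T12:15:00Z", "value": "1.0"},
]

flat_records = brapi_to_flat_records(example_observations)
for record in flat_records:
    print(record)
```

Variables without an entry in the mapping are skipped rather than guessed at, which reflects why the two communities need to agree on the translation explicitly.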

We were lucky to have the BrAPI community coordinator, Peter Selby, join us during the hackathon. One of our goals during the hackathon week was to improve some of our TERRA REF BrAPI endpoints so that they are more easily translatable to the ACE format, which we are continuing to do as we move forward. This communication between the BrAPI and AgMIP communities will help inform future extensions of BrAPI to support the more detailed geospatial, time series, and agronomic management data required by the agricultural modeling and high-throughput phenotyping communities.

Our larger goal is to enable greater interoperability among datasets that adhere to one or more data and metadata format conventions. By the conclusion of the ARDN project, our group intends to make it easy to convert any BrAPI-compliant data source that has the relevant data and the required minimum metadata to the ACE format. Similarly, there should be a clear path for translating data annotated with Ecological Metadata Language (EML) into the ACE format.

The diagram below illustrates our idea:

Image

Having more data in a common format will make it easier for the modeling community to make use of data that are so difficult to collect in the field. It will also enable synthesis across studies, including cross-site and cross-species analyses. To facilitate data harmonization, the UF team is building a tool called VMapper that walks the user through converting tabular data to the ACE format.

Overall, the ARDN hackathon was a success. All of the teams involved made substantial progress converting their data and learning about Ag Data Commons submissions. This event ended up being held entirely virtually; while this was very productive, we look forward to future in-person hackathons, which provide the focus and interaction among teams that benefit this type of work. These types of interdisciplinary projects, involving domain scientists, computer scientists, and modelers, are always inspiring collaborative experiences.

References

Selby, Peter, et al. "BrAPI—an application programming interface for plant breeding applications." Bioinformatics 35.20 (2019): 4147-4155.

Porter, Cheryl H., et al. "Harmonization and translation of crop modeling data to ensure interoperability." Environmental Modelling & Software 62 (2014): 495-508.