Data Access and Curation

We help scientists manage and curate their data. Our services include:

  • Advising on all stages of the data lifecycle, from collection to publication and reuse.
  • Writing and implementing data management plans.
  • Sharing data through CyVerse, the UA Library, DataONE, and other repositories.

Custom Pipelines

We can help users automate computational pipelines to improve the efficiency and consistency of their data-intensive research.

We work closely with CyVerse.

Ag- and Ecosystem-Related Pipelines:

  • High-Throughput Phenomics for Drones, Tractors, and the Field Scanner
  • Curation of environmental data

Crop & Ecosystem Simulation Models

We use crop, ecosystem, and land surface models, and are interested in more closely integrating models and data.

We actively use and develop the PEcAn Project framework for land surface modeling, model-data synthesis, and forecasting.

Some of the models we use include:

  • BioCro: ecophysiological model for tree, grass, and crop simulation; see github.com/ebimodeling/biocro
  • ED2: model for simulating plant community dynamics

Training

We teach best practices in computational and data-intensive science and reproducible research.

Our group is dedicated to training the next generation of scientists to conduct data- and computation-intensive research. The topics we cover range from best practices to domain-specific data analysis and simulation to project management.

  • Best practices in scientific computing: Our group includes three certified Software and Data Carpentry instructors with over ten years of combined experience.
  • High-throughput phenomics: terraref.org/tutorials
  • Crop and ecosystem modeling: We can help researchers use models to increase the utility of their data, generate and test hypotheses, and predict outcomes of alternative scenarios. We seek to reduce the gap between data collectors and modelers (see, e.g., LeBauer et al. 2013 and Dietze et al. 2013). We can help identify which data are needed to improve predictions and quantify the frequently made claim that 'our data will improve model predictions'.