Enhancing data quality and usability

Dr. Jackie Boerman and Dr. Luiz Brito wanted to analyze data produced at dairy farms. Dairy producers use multiple software programs and sensors to help manage their farms. These data sources are not interoperable, which makes the data difficult to integrate and keeps producers and researchers from using it fully.
Along with our IT partners, we have developed a research data ecosystem that allows Dr. Boerman, Dr. Brito, and their graduate students to analyze messy and dissimilar data in a custom-curated environment. Working with this much data on individual computers would be less effective and, in some cases, not possible at all.
Challenges we faced
- On-farm data that needed to be brought to Purdue
- Poor internet connectivity
- Lack of common data structure
- Inconsistent data quality
- Diverse data sets
  - Parlor management
  - Farm management
  - Genetic data
  - Weather
  - Feed management
  - Environmental sensors
  - Automated milking robots
Some of the methods and processes implemented
- Automated data retrieval
- Automated quality control
- Automated data cleaning and transformation
- Data conversion
- Process documentation
- Data description (metadata generation)
- Storage optimization
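
To make the data cleaning and quality control steps above more concrete, here is a minimal sketch in Python. It is an illustration only, not the pipeline used in this project. The source names, field names, common schema, and the plausibility range for milk weight are all hypothetical and were chosen just for this example. The idea is to rename source-specific fields to one shared structure, parse the values, and flag records that fail simple checks instead of silently dropping them.

```python
from datetime import datetime

# Hypothetical mapping from source-specific field names to one common schema.
FIELD_MAP = {
    "parlor": {"CowID": "animal_id", "MilkKg": "milk_kg", "Date": "date"},
    "robot": {"animal": "animal_id", "yield_kg": "milk_kg", "timestamp": "date"},
}

# Hypothetical plausibility range for a single milking, in kilograms.
MILK_KG_RANGE = (0.0, 60.0)


def harmonize(record, source):
    """Rename source-specific fields to the common schema and parse values."""
    mapping = FIELD_MAP[source]
    out = {common: record.get(raw) for raw, common in mapping.items()}
    out["source"] = source
    # Unparseable values become None so the quality check can flag them.
    try:
        out["milk_kg"] = float(out["milk_kg"])
    except (TypeError, ValueError):
        out["milk_kg"] = None
    try:
        out["date"] = datetime.fromisoformat(str(out["date"])).date().isoformat()
    except ValueError:
        out["date"] = None
    return out


def quality_flags(row):
    """Return a list of quality problems found in one harmonized row."""
    flags = []
    if not row["animal_id"]:
        flags.append("missing_animal_id")
    if row["date"] is None:
        flags.append("bad_date")
    if row["milk_kg"] is None:
        flags.append("missing_milk_kg")
    elif not MILK_KG_RANGE[0] <= row["milk_kg"] <= MILK_KG_RANGE[1]:
        flags.append("milk_kg_out_of_range")
    return flags


if __name__ == "__main__":
    raw = {"animal": "", "yield_kg": "250", "timestamp": "not a date"}
    row = harmonize(raw, "robot")
    print(row, quality_flags(row))
```

Keeping the flags next to each record, rather than deleting suspect rows, leaves the decision about what to exclude to the researchers and makes the quality control step easier to document.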