Hamidah Oderinwale
BLUE Fellow | Fall 2024
Streamlining data provenance: documenting data origins and processes to replicate machine learning research

Background

In science, replicating experiments and obtaining consistent results is crucial for maintaining integrity and credibility. This is becoming harder with the rise of machine learning experiments, especially those that rely on computer simulations. Such experiments have built-in randomness, which makes it difficult to repeat them in exactly the same way each time. Reproducibility matters for accountability, but it can also make model training easier and more precise, supporting capabilities in a world where algorithmic creativity will be increasingly important for making breakthroughs. My research focuses on streamlining data provenance: the documentation of data origins and processes.
