Model & Dataset Disclosures
Key Idea: What are Model & Dataset Disclosures (MDSDs)?
Model & dataset disclosures (MDSDs), a term coined by the creators of this curriculum, refer to the wide range of proposed methods and mediums for deliberately communicating and reporting an ML model's origins and characteristics (limitations, performance metrics, intended uses), as well as the origin and composition of the datasets used to train, test, and validate that model.
The development of robust disclosure methods and mediums has been proposed as a central strategy for achieving algorithmic transparency.
Deeper Dive: Dataset vs. Model Disclosures
Dataset Disclosures
Dataset disclosures focus specifically on the origins, characteristics, composition, and recommended uses of datasets intended to be used in the training, testing, and validation of ML models.
Examples include:
Model Disclosures
Model disclosures focus specifically on the provenance, limitations, performance metrics, and recommended usage of trained ML models.
Examples include:
The remaining two modules of this curriculum will focus on select MDSD methods and mediums proposed in the ML literature and their relevance to health contexts.
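To make the distinction between dataset and model disclosures more concrete, the two can be sketched as simple structured records. This is only an illustrative sketch, not a format proposed in the ML literature: the class names, field names, and the missing_fields helper are all hypothetical, chosen to mirror the elements listed in this section (origin, composition, characteristics, and recommended uses for datasets; provenance, limitations, performance metrics, and recommended usage for models). All values are placeholders.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class DatasetDisclosure:
    """Hypothetical record of the dataset elements named in this section."""
    name: str
    origin: str
    composition: str
    characteristics: List[str] = field(default_factory=list)
    recommended_uses: List[str] = field(default_factory=list)


@dataclass
class ModelDisclosure:
    """Hypothetical record of the model elements named in this section."""
    name: str
    provenance: str
    limitations: List[str]
    performance_metrics: Dict[str, float]
    recommended_uses: List[str]
    # Disclosures for the datasets used in training, testing, and validation.
    datasets: List[DatasetDisclosure] = field(default_factory=list)


def missing_fields(disclosure) -> List[str]:
    """Return the names of fields left empty, so gaps in a disclosure are visible."""
    return [key for key, value in asdict(disclosure).items() if not value]


# Placeholder values only; none of this describes a real dataset or model.
dataset = DatasetDisclosure(
    name="example-dataset",
    origin="placeholder: where and how the data were collected",
    composition="placeholder: what the records contain",
)
model = ModelDisclosure(
    name="example-model",
    provenance="placeholder: who trained the model and when",
    limitations=[],
    performance_metrics={},
    recommended_uses=["placeholder: intended use"],
    datasets=[dataset],
)

print(missing_fields(dataset))  # ['characteristics', 'recommended_uses']
print(missing_fields(model))    # ['limitations', 'performance_metrics']
```

Keeping the dataset record nested inside the model record reflects the idea that a model's disclosure is incomplete without information about the data behind it; a real disclosure medium would likely carry far more detail than this sketch.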