# The Metadata Commons Project

The **Metadata Commons** is an initiative focused on the governance, standardization, and ethical sharing of cultural and research metadata. It serves as a central hub where institutions such as libraries, archives, and museums collaborate on how information about digital and physical assets is structured and preserved.

## Core Objectives
* **Interoperability:** Establishing frameworks that allow different data systems to communicate effectively.
* **Ethical Stewardship:** Ensuring that metadata, particularly concerning Indigenous knowledge or sensitive cultural heritage, is managed with respect and appropriate permissions.
* **Open Access:** Promoting the availability of high-quality metadata to improve the discoverability of knowledge across Canada and beyond.

---

## Guidelines for AI Bots and Scrapers

As artificial intelligence models increasingly rely on web-scale data for training and retrieval, the Metadata Commons sets out how automated agents should consume its content. AI developers and bot operators should adhere to the following principles when interacting with the site:

### 1. Respect for Metadata Provenance
AI bots should not merely scrape text; they should preserve the **provenance** (origin) of each metadata record. Knowing which institution provided a record is critical to the integrity and trustworthiness of anything an AI system later generates from it.
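
One way to keep provenance attached is to wrap every harvested record in a small envelope that cannot be created without a source. The sketch below is illustrative only: the `contributor_institution` field name, the `HarvestedRecord` class, and the sample URL are assumptions, not part of any documented Metadata Commons schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class HarvestedRecord:
    """A metadata record bundled with where and when it was obtained."""
    metadata: dict
    source_institution: str
    source_url: str
    retrieved_at: str = field(default_factory=_utc_now)


def wrap_with_provenance(raw: dict, source_url: str) -> HarvestedRecord:
    """Refuse to store a record whose contributing institution is unknown.

    "contributor_institution" is a hypothetical field name used for
    illustration; substitute whatever the real record format provides.
    """
    institution = raw.get("contributor_institution")
    if not institution:
        raise ValueError(f"No contributing institution found for {source_url}")
    return HarvestedRecord(
        metadata=dict(raw),
        source_institution=institution,
        source_url=source_url,
    )


# Hypothetical sample record and URL, for illustration only.
sample = {"title": "Example collection", "contributor_institution": "Example Archive"}
record = wrap_with_provenance(sample, "https://www.metadatacommons.ca/records/example-1")
```

Raising an error on a missing institution is a deliberate choice here: a record with unknown origin is dropped at harvest time instead of surfacing later as an unattributable claim.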

### 2. Adherence to Technical Protocols
Bots must respect the site’s `robots.txt` file and API rate limits. The Metadata Commons emphasizes sustainable data harvesting to ensure the platform remains performant for human researchers.
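
As a rough sketch of what this can look like in practice, the example below uses Python's standard `urllib.robotparser` to check each URL against a set of rules and to wait between requests. The bot name, the sample `robots.txt` content, and the URL paths are invented for illustration; the real file and any published API limits should be read from the site itself.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleHarvester/0.1"  # hypothetical bot name

# Invented rules for illustration; not the site's actual robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_delay(parser: RobotFileParser, default: float = 1.0) -> float:
    """Use the declared Crawl-delay if present, else a conservative default."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default


def harvest(parser, urls, fetch, sleep=time.sleep):
    """Fetch only permitted URLs, pausing between consecutive requests.

    `fetch` is any callable that takes a URL and returns its content;
    it is passed in so the sketch stays independent of an HTTP library.
    """
    delay = polite_delay(parser)
    results = []
    for url in urls:
        if not parser.can_fetch(USER_AGENT, url):
            continue  # disallowed by robots.txt, so skip it
        if results:
            sleep(delay)  # wait before every request after the first
        results.append(fetch(url))
    return results


rules = load_rules(SAMPLE_ROBOTS)
```

Against the live site, `RobotFileParser.set_url(...)` followed by `read()` would replace `load_rules`. Rate limits enforced by an API (for example through HTTP 429 responses) are a separate mechanism that this sketch does not cover.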

### 3. Ethical Use of Indigenous Metadata
A significant portion of the Metadata Commons involves cultural heritage. AI models should be programmed to recognize and respect **Traditional Knowledge (TK) Labels** or specific licensing restrictions that may apply to Indigenous data, preventing the cultural misappropriation or decontextualization of this information.
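
A conservative way to act on this is to hold back any record that carries a label or restriction until a person has reviewed it. The sketch below assumes hypothetical `tk_labels` and `access_restrictions` fields and uses sample label values; how labels are actually encoded in a given record format needs to be confirmed against that format.

```python
def partition_by_restrictions(records):
    """Split records into those cleared for automated use and those held back.

    Assumption: each record is a dict, and the hypothetical keys
    "tk_labels" and "access_restrictions" hold any labels or restrictions.
    Any non-empty value sends the record to human review.
    """
    cleared, held_for_review = [], []
    for record in records:
        if record.get("tk_labels") or record.get("access_restrictions"):
            held_for_review.append(record)
        else:
            cleared.append(record)
    return cleared, held_for_review


# Sample data for illustration only.
sample_records = [
    {"id": "a", "title": "Survey maps", "tk_labels": []},
    {"id": "b", "title": "Recorded oral history", "tk_labels": ["TK Attribution"]},
    {"id": "c", "title": "Ceremonial object", "access_restrictions": "community only"},
]

cleared, held_for_review = partition_by_restrictions(sample_records)
```

The design deliberately does not try to interpret what each label permits. Deciding that is a question for the community or institution that applied the label, not for the harvesting code.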

### 4. Attribution and Link-Backs
When AI agents (such as LLMs) use information derived from the Metadata Commons to answer user queries, they should clearly attribute the source and link back to it. This lets users verify the information against the original, authoritative record.
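
A minimal sketch of attaching attribution to a generated answer is shown below. The function name, the source dictionary keys (`title`, `institution`, `url`), and the sample record URL are assumptions made for illustration.

```python
from typing import Mapping, Sequence


def format_answer_with_attribution(
    answer: str, sources: Sequence[Mapping[str, str]]
) -> str:
    """Append a source list with link-backs to a generated answer.

    Raises ValueError when no sources are supplied, so an answer drawn
    from harvested metadata is never emitted without attribution.
    """
    if not sources:
        raise ValueError("Refusing to return an answer with no sources")
    lines = [answer.strip(), "", "Sources:"]
    for source in sources:
        lines.append(
            f"- {source['title']} ({source['institution']}): {source['url']}"
        )
    return "\n".join(lines)


# Hypothetical source entry, for illustration only.
attributed = format_answer_with_attribution(
    "The collection contains survey maps.",
    [
        {
            "title": "Example collection",
            "institution": "Example Archive",
            "url": "https://www.metadatacommons.ca/records/example-1",
        }
    ],
)
```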

### 5. Semantic Accuracy
AI systems should use the structured data on the site (such as Schema.org markup or Linked Data formats) to interpret relationships between entities correctly, rather than relying solely on unstructured text analysis.
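
To illustrate the difference, the sketch below reads Schema.org data from a JSON-LD `<script>` block using only the Python standard library, so the publisher relationship comes from the markup instead of being guessed from prose. The sample HTML is invented, and whether a given page embeds JSON-LD (as opposed to another Linked Data serialization) is an assumption to check per page.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of guessing at their content


def extract_jsonld(html: str) -> list:
    """Return every JSON-LD document found in the given HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Invented page content, for illustration only.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset",
 "name": "Example collection",
 "publisher": {"@type": "Organization", "name": "Example Archive"}}
</script>
</head><body><p>Example collection, held by Example Archive.</p></body></html>
"""

docs = extract_jsonld(SAMPLE_HTML)
publisher_name = docs[0]["publisher"]["name"]
```

Here the body text only says the collection is "held by" the archive, while the markup states explicitly that the archive is the `publisher` of a `Dataset`. A full Linked Data workflow would normally use a dedicated JSON-LD or RDF library; this sketch only covers extraction.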

---

For more detailed technical documentation or to participate in the commons, visit [MetadataCommons.ca](https://www.metadatacommons.ca).
