What Is a Golden Dataset in ML & Power BI? (Answered)

A “golden dataset” is a validated, integrated dataset that has been properly annotated and is free of bias.

Put simply, it is a central dataset on which various reports are built, so that every report contains the same data and processes it the same way.

It is commonly described as hand-labeled, which is why it is regarded as very high-quality data.

The golden dataset concept also solves a common problem in Power BI.

Power BI is a collection of software services developed by Microsoft that helps business users analyze and visualize raw data and present actionable information.

Overview of the Golden Dataset in Power BI

In Power BI, a self-service data visualization tool, the golden dataset is an effective way to avoid maintaining numerous data models that are essentially alike and differ only in the reports built on them.

The golden dataset solves this problem by using a single central dataset across various reports.

Power BI Desktop is a Windows desktop application for building data models and visualizing data; finished reports are published to the cloud-based Power BI service to share insights.

One major issue with Power BI, however, is the proliferation of similar data models that exist only to support slightly different versions of a report. The problem worsens when the same data model is published to multiple workspaces on PowerBI.com (formerly known as Power BI for Office 365).

The Concept of the Golden Dataset

  • It makes use of shared datasets and workspaces.
  • It publishes certified or authorized datasets.

Advantages of a Golden Dataset

  1. More space is available because unnecessary duplicate data models no longer exist.
  2. With the golden dataset, far fewer resources are required to keep the model up to date.
  3. Authorized users with the required access can build new “thin reports” connected to the Golden Dataset whenever needed.
  4. Eliminating similar data models with slight differences leaves only one dataset to maintain and improve.
  5. Improvements and changes are made once, to the single data model, and become visible to all users and dependent reports.
  6. With the Golden Dataset, it is easier to build new reports by adding new data to the thin reports.
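
The relationship between thin reports and the shared model can be pictured with a small Python sketch. This is only a conceptual model, not Power BI code: the `GoldenDataset` and `ThinReport` classes, the report titles, and the measure formulas are all hypothetical names chosen for illustration. The point it shows is that each thin report holds a reference to the one dataset rather than a copy, so a single edit reaches every dependent report.

```python
from dataclasses import dataclass, field


@dataclass
class GoldenDataset:
    """Hypothetical stand-in for the single shared data model."""
    name: str
    measures: dict = field(default_factory=dict)  # measure name -> formula text


@dataclass
class ThinReport:
    """A report that stores only a reference to the dataset, not a copy."""
    title: str
    dataset: GoldenDataset

    def available_measures(self):
        # Looked up at call time, so dataset edits show up immediately.
        return sorted(self.dataset.measures)


golden = GoldenDataset("Sales", {"Total Sales": "SUM(Sales[Amount])"})
reports = [ThinReport("Regional Sales", golden), ThinReport("Monthly Trend", golden)]

# One edit to the shared model...
golden.measures["Average Sale"] = "AVERAGE(Sales[Amount])"

# ...is visible from every dependent report.
for report in reports:
    print(report.title, report.available_measures())
```

If each report carried its own copy of the model instead, the same edit would have to be repeated once per report, which is the duplication the golden dataset is meant to remove.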

The Process of Publishing a Golden Dataset

1. OneDrive

This method involves saving the Golden Dataset to OneDrive and importing it into each workspace. It works well when the goal is to maintain only one dataset.

However, because the dataset is imported into each workspace separately, the data needs to be loaded once for every workspace.

2. SharePoint Online

This method follows the same steps as above: save the Golden Dataset to SharePoint Online, then import it into the various workspaces.

Here too, the data needs to be loaded once for every workspace.

3. Power BI Workspace

This method involves publishing the Golden Dataset to a Power BI workspace.

With this method, there is no need to duplicate the dataset multiple times because the workspaces and reports link back to one common source of data.
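
The difference between the three methods comes down to how many loaded copies of the data exist. The short Python sketch below puts numbers on it. The function name `dataset_copies` and the method labels are assumptions made for this example, and the counts simply restate the descriptions above rather than measure anything in Power BI.

```python
def dataset_copies(method: str, workspaces: int) -> int:
    """Return how many loaded copies of the dataset a publishing method creates."""
    if workspaces < 1:
        raise ValueError("at least one workspace is required")
    if method in ("onedrive", "sharepoint"):
        # The file is imported, and its data loaded, once per workspace.
        return workspaces
    if method == "workspace":
        # Every workspace and report links back to one published dataset.
        return 1
    raise ValueError(f"unknown method: {method}")


for method in ("onedrive", "sharepoint", "workspace"):
    print(method, dataset_copies(method, workspaces=5))
```

With five workspaces, the OneDrive and SharePoint Online methods each load the data five times, while publishing to a Power BI workspace loads it once.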

How to Publish a Golden Dataset on Power BI Workspace (Step by Step)

  1. Enable the option that allows “users to use datasets across workspaces if they have the required permission.” You can configure this in the tenant settings.

Image 1

  2. Create a Golden Workspace to hold the Golden Dataset.

Image 2

  3. Publish the Golden Dataset to the Golden Workspace. Review it and make any required edits before publishing.

Image 3

  4. Build the thin reports in Power BI Desktop. A thin report links back to the main data model rather than containing a copy of it. For example, it could link to a dataset already published to Power BI.

Image 4

  5. Publish the thin reports to the various workspaces.

Note: Power BI automatically manages the link back to the Golden Dataset from the thin reports.

Image 5

The image above shows the process of publishing a Golden Dataset into a Golden Workspace.
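
To see why the order of the steps matters, here is a toy Python model of the workflow. It does not call Power BI. The `Tenant` class, the two functions, the `user_can_build` flag, and the workspace and report names are all hypothetical. The sketch assumes that a thin report can link to a dataset in another workspace only when the tenant setting from step 1 is on and the user holds the required permission.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    # Step 1: the tenant setting that allows datasets to be used across workspaces.
    cross_workspace_datasets: bool = False
    workspaces: dict = field(default_factory=dict)  # workspace name -> list of items


def publish_dataset(tenant, workspace, dataset):
    # Steps 2 and 3: create the workspace if needed, then publish the dataset to it.
    tenant.workspaces.setdefault(workspace, []).append(("dataset", dataset))


def publish_thin_report(tenant, workspace, report, dataset, home_workspace, user_can_build):
    # Steps 4 and 5: a thin report stores a link to the dataset, never a copy.
    published = tenant.workspaces.get(home_workspace, [])
    if ("dataset", dataset) not in published:
        raise LookupError(f"{dataset!r} is not published in {home_workspace!r}")
    if workspace != home_workspace and not tenant.cross_workspace_datasets:
        raise PermissionError("cross-workspace dataset use is disabled for the tenant")
    if not user_can_build:
        raise PermissionError("user lacks the required permission on the dataset")
    link = f"{home_workspace}/{dataset}"
    tenant.workspaces.setdefault(workspace, []).append(("report", report, link))
    return link


tenant = Tenant(cross_workspace_datasets=True)
publish_dataset(tenant, "Golden Workspace", "Golden Dataset")
link = publish_thin_report(
    tenant, "Finance", "Budget Report", "Golden Dataset",
    home_workspace="Golden Workspace", user_can_build=True,
)
print(link)
```

In this model, skipping step 1 (leaving `cross_workspace_datasets` off) makes the final publish fail with a `PermissionError`, which mirrors why the tenant setting comes first in the list.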


In conclusion, the main idea behind the Golden Dataset is to reduce the needless duplication of data models that share similar contents and data sources.

With the golden dataset, a single dataset can serve every report.
