Overview
Datasets are collections of data objects (records) that share the same schema structure. They represent the actual data stored in your Pretectum master data repository and provide a way to organize records into logical groups. The hierarchy in Pretectum is: business area → schema → dataset → data object (record).
Why Datasets Matter
Datasets help you:

- Organize Data: Group records by region, source, time period, or any logical category
- Filter Searches: Narrow search results to specific subsets of data
- Track Data Quality: Monitor record counts and error rates per dataset
- Manage Data Lifecycle: Handle imports, exports, and deletions at the dataset level
Before You Begin
To use the Datasets API, complete the following steps:

1. Get API Credentials: Obtain your `client_id` and `client_secret` from your Pretectum tenant administrator.
2. Obtain Access Token: Exchange your credentials for an access token using the authentication endpoint.
3. Get Business Area and Schema IDs: Retrieve the business area ID using List Business Areas and the schema ID using List Schemas. Dataset queries require both IDs.
Retrieving Datasets
Datasets are accessed through their parent schema. You need both the business area ID and schema ID to list datasets.
Step 1: Get Business Area ID
Step 2: Get Schema ID
Step 3: List Datasets
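The three steps can be sketched together as below. This is only an illustration: the endpoint paths, the `items` wrapper key, and the `http_get` callable are assumptions, not confirmed Pretectum API details, so substitute the real List Business Areas, List Schemas, and dataset listing endpoints. The sketch also assumes the business area and schema listings expose the same `businessAreaId` / `businessAreaName` and `schemaId` / `schemaName` fields that appear in the dataset response.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: takes a path and returns the decoded JSON body.
# In real code this would wrap an HTTP client and attach the access token.
HttpGet = Callable[[str], Dict[str, Any]]


def find_id(items: List[Dict[str, Any]], name_field: str, id_field: str, wanted: str) -> str:
    """Return the ID of the first item whose name field equals `wanted`."""
    for item in items:
        if item.get(name_field) == wanted:
            return item[id_field]
    raise LookupError(f"no item with {name_field}={wanted!r}")


def list_datasets(http_get: HttpGet, business_area: str, schema: str) -> List[Dict[str, Any]]:
    """Resolve both parent IDs by name, then list the schema's datasets."""
    # Step 1: resolve the business area ID (path is an assumption).
    areas = http_get("/businessareas")["items"]
    area_id = find_id(areas, "businessAreaName", "businessAreaId", business_area)

    # Step 2: resolve the schema ID within that business area.
    schemas = http_get(f"/businessareas/{area_id}/schemas")["items"]
    schema_id = find_id(schemas, "schemaName", "schemaId", schema)

    # Step 3: list the datasets under the schema (first page only).
    return http_get(f"/businessareas/{area_id}/schemas/{schema_id}/datasets")["items"]
```

Passing the transport in as a function keeps the lookup logic independent of any particular HTTP library and makes it easy to exercise with canned responses.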
Understanding the Response
| Field | Description |
|---|---|
| `dataSetId` | Unique identifier for the dataset |
| `dataSetName` | Display name used for filtering in search operations |
| `dataSetDescription` | Explanation of the dataset's purpose |
| `businessAreaId` / `businessAreaName` | Parent business area |
| `schemaId` / `schemaName` | Parent schema defining the data structure |
| `recordCount` | Total number of data objects in the dataset |
| `erroredRecordsCount` | Number of records with validation errors |
| `runningJobsCount` | Number of background jobs currently processing |
| `version` | Version number for tracking changes |
| `deleted` | Whether the dataset has been soft-deleted |
| `nextPageKey` | Pagination token for fetching the next page |
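
As a rough sketch of how these fields might be consumed, the code below maps a few of them onto a typed record and follows `nextPageKey` until no further page is reported. The `items` wrapper key, the placement of `nextPageKey` at the top level of the response, and the `fetch_page` callable are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator, Optional


@dataclass(frozen=True)
class DatasetSummary:
    data_set_id: str
    data_set_name: str
    record_count: int
    errored_records_count: int
    deleted: bool

    @classmethod
    def from_api(cls, raw: Dict[str, Any]) -> "DatasetSummary":
        # Field names come from the table above; the defaults are
        # assumptions for fields a response might omit.
        return cls(
            data_set_id=raw["dataSetId"],
            data_set_name=raw["dataSetName"],
            record_count=int(raw.get("recordCount", 0)),
            errored_records_count=int(raw.get("erroredRecordsCount", 0)),
            deleted=bool(raw.get("deleted", False)),
        )


# Hypothetical page fetcher: receives the previous nextPageKey (None for
# the first page) and returns one decoded response page.
FetchPage = Callable[[Optional[str]], Dict[str, Any]]


def iter_datasets(fetch_page: FetchPage) -> Iterator[DatasetSummary]:
    """Yield every dataset, following nextPageKey until it is absent."""
    page_key: Optional[str] = None
    while True:
        page = fetch_page(page_key)
        for raw in page.get("items", []):  # "items" key is an assumption
            yield DatasetSummary.from_api(raw)
        page_key = page.get("nextPageKey")
        if not page_key:
            break
```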
Using Datasets for Search
Once you have the list of datasets, use the `dataSetName` field to filter your data object searches.
Searching Within a Dataset
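A minimal sketch of restricting a search to one dataset is shown below. The search endpoint is not described here, so the parameter names `query` and `dataSetName` are assumptions; only the idea of filtering by the dataset's name comes from this guide.

```python
from typing import Dict
from urllib.parse import urlencode


def build_search_query(query: str, data_set_name: str) -> str:
    """Build a URL query string for a search limited to one dataset.

    The parameter names "query" and "dataSetName" are assumptions.
    """
    params: Dict[str, str] = {"query": query, "dataSetName": data_set_name}
    # urlencode escapes spaces and other reserved characters in the name.
    return urlencode(params)
```

Encoding the name through `urlencode` matters because dataset display names can contain spaces or punctuation.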
Complete Client Implementation
Here is a complete implementation that handles the full hierarchy from business areas to datasets:
Building a Dataset Selector
Create a cascading filter interface for business area → schema → dataset:
Best Practices
Cache Dataset Metadata
Dataset metadata changes less frequently than actual data. Cache the list and refresh periodically:
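One possible shape for such a cache is a small time-to-live wrapper around whatever function loads the dataset list. The `loader` callable and the five-minute default are assumptions; choose a refresh interval that matches how often your datasets change.

```python
import time
from typing import Any, Callable, Dict, List, Optional

DatasetList = List[Dict[str, Any]]


class DatasetCache:
    """Caches a dataset list and reloads it once `ttl_seconds` have passed."""

    def __init__(
        self,
        loader: Callable[[], DatasetList],  # hypothetical: fetches the list
        ttl_seconds: float = 300.0,         # assumed default, tune as needed
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[DatasetList] = None
        self._loaded_at = 0.0

    def get(self) -> DatasetList:
        now = self._clock()
        # Reload on first use or when the cached copy is older than the TTL.
        if self._value is None or now - self._loaded_at >= self._ttl:
            self._value = self._loader()
            self._loaded_at = now
        return self._value
```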
Filter Active Datasets
Always filter out deleted datasets in user-facing interfaces:
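A short sketch using the `deleted` field from the response. Treating a missing `deleted` field as "not deleted" is an assumption.

```python
from typing import Any, Dict, List


def active_datasets(datasets: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return only datasets that are not soft-deleted."""
    # A dataset without a "deleted" field is assumed to be active.
    return [d for d in datasets if not d.get("deleted", False)]
```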
Monitor Data Quality
Regularly check error counts to identify data quality issues:
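For example, the error rate per dataset can be derived from `erroredRecordsCount` and `recordCount`. The 5% threshold below is an arbitrary illustrative value, not a recommendation from Pretectum.

```python
from typing import Any, Dict, List


def error_rate(dataset: Dict[str, Any]) -> float:
    """Fraction of records with validation errors (0.0 for empty datasets)."""
    total = dataset.get("recordCount", 0)
    if total <= 0:
        return 0.0  # avoid dividing by zero on empty datasets
    return dataset.get("erroredRecordsCount", 0) / total


def datasets_over_threshold(
    datasets: List[Dict[str, Any]], threshold: float = 0.05
) -> List[str]:
    """Return names of datasets whose error rate exceeds `threshold`."""
    return [d["dataSetName"] for d in datasets if error_rate(d) > threshold]
```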
Use Names for Search Filters
When filtering searches, use the `dataSetName`, not the `dataSetId`:
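If your application tracks datasets by ID internally, one option is to translate the ID back to a name just before building the search filter. The sketch below assumes a previously fetched dataset list; the shape of the filter dictionary is hypothetical.

```python
from typing import Any, Dict, List


def data_set_name_for(datasets: List[Dict[str, Any]], data_set_id: str) -> str:
    """Look up the display name for a dataset ID in a fetched dataset list."""
    for d in datasets:
        if d["dataSetId"] == data_set_id:
            return d["dataSetName"]
    raise KeyError(data_set_id)


def search_filter_for(datasets: List[Dict[str, Any]], data_set_id: str) -> Dict[str, str]:
    # The filter is keyed by name, not ID; the dict shape is an assumption.
    return {"dataSetName": data_set_name_for(datasets, data_set_id)}
```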