List Datasets - Pretectum

The List Datasets endpoint returns all datasets defined within a specific schema. Datasets are collections of data objects that share the same schema structure and represent actual data records in your master data repository.

Prerequisites

Valid access token (see Get Access Token)
Permission to access datasets in your tenant
Valid business area ID (see List Business Areas)
Valid schema ID (see List Schemas)

Authentication

Include your access token in the Authorization header.

Pass the token directly without the “Bearer” prefix.

Authorization: your_access_token
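As a minimal sketch of the rule above, the headers could be assembled in Python like this. The helper name `build_headers` is hypothetical, and stripping an accidental "Bearer " prefix is a defensive choice made for this example, not documented API behavior.

```python
def build_headers(access_token: str) -> dict:
    """Return request headers carrying the raw token (hypothetical helper)."""
    token = access_token.strip()
    # Defensive assumption: callers often prepend "Bearer " out of habit,
    # so remove it because this API expects the bare token.
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):].strip()
    if not token:
        raise ValueError("access token must not be empty")
    return {"Authorization": token, "Accept": "application/json"}
```

Calling `build_headers("your_access_token")` yields an `Authorization` value identical to the token itself.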

Request

Path Parameters

businessAreaId

string

required

The unique identifier of the business area. You can obtain this from the List Business Areas endpoint.

schemaId

string

required

The unique identifier of the schema containing the datasets. You can obtain this from the List Schemas endpoint.

Query Parameters

pageKey

string

A pagination token for retrieving the next page of results. This value is returned in the response as nextPageKey when more results are available.

Headers

Authorization

string

required

Your access token obtained from the /v1/oauth2/token endpoint. Pass the token directly without the “Bearer” prefix.

Accept

string

default:"application/json"

The response content type. Currently only application/json is supported.

Example Requests

# List all datasets in a schema
curl -X GET "https://api.pretectum.io/v1/businessareas/20240115103000123a1b2c3d4e5f6789012345678901234/schemas/20240115103000456d1e2f3a4b5c6789012345678901234/datasets" \
  -H "Authorization: your_access_token" \
  -H "Accept: application/json"

# Paginate through datasets
curl -X GET "https://api.pretectum.io/v1/businessareas/20240115103000123a1b2c3d4e5f6789012345678901234/schemas/20240115103000456d1e2f3a4b5c6789012345678901234/datasets?pageKey=eyJMYXN0RXZhbHVhdGVkS2V5Ijp7Li4ufQ" \
  -H "Authorization: your_access_token" \
  -H "Accept: application/json"
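The same two requests can be sketched in Python using only the standard library. The function names `datasets_url` and `get_datasets` and the `PRETECTUM_ACCESS_TOKEN` environment variable are assumptions made for this example; the host, path, headers, and `pageKey` parameter come from the curl commands above.

```python
import json
import os
import urllib.parse
import urllib.request

BASE_URL = "https://api.pretectum.io/v1"  # host and version taken from the curl examples


def datasets_url(business_area_id, schema_id, page_key=None):
    """Build the List Datasets URL, percent-encoding every variable part."""
    url = "{}/businessareas/{}/schemas/{}/datasets".format(
        BASE_URL,
        urllib.parse.quote(business_area_id, safe=""),
        urllib.parse.quote(schema_id, safe=""),
    )
    if page_key:
        url += "?" + urllib.parse.urlencode({"pageKey": page_key})
    return url


def get_datasets(business_area_id, schema_id, page_key=None, access_token=None):
    """Fetch one page of datasets and return the decoded JSON body."""
    # Assumption: the token is stored in an environment variable of this name.
    token = access_token or os.environ["PRETECTUM_ACCESS_TOKEN"]
    request = urllib.request.Request(
        datasets_url(business_area_id, schema_id, page_key),
        headers={"Authorization": token, "Accept": "application/json"},  # no "Bearer" prefix
        method="GET",
    )
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Encoding the page key matters because pagination tokens may contain characters such as `=` or `+` that are not safe in a query string.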

Response

A successful request returns an object containing an array of datasets and pagination information.

items

array

required

An array of dataset objects within the schema.

Dataset object properties

dataSetId

string

The unique identifier for the dataset. Use this ID when filtering data object searches.

dataSetName

string

The display name of the dataset. This is the human-readable name you can use in the dataSet filter parameter when searching data objects.

dataSetDescription

string

A description of the dataset explaining its purpose and the data it contains.

businessAreaId

string

The ID of the business area this dataset belongs to.

businessAreaName

string

The name of the business area this dataset belongs to.

schemaId

string

The ID of the schema this dataset uses.

schemaName

string

The name of the schema this dataset uses.

recordCount

integer

The total number of data objects (records) in this dataset.

erroredRecordsCount

integer

The number of records that have validation errors.

runningJobsCount

integer

The number of background jobs currently running on this dataset (e.g., imports, exports).

version

integer

The version number of the dataset. This increments each time the dataset is modified.

createdBy

string

The identifier of the user who created the dataset.

createdByEmail

string

The email address of the user who created the dataset.

createdByName

string

The full name of the user who created the dataset.

updatedBy

string

The identifier of the user who last modified the dataset.

updatedByEmail

string

The email address of the user who last modified the dataset.

updatedByName

string

The full name of the user who last modified the dataset.

createdDate

string

The ISO 8601 timestamp when the dataset was created.

updatedDate

string

The ISO 8601 timestamp when the dataset was last modified.

deleted

boolean

Indicates whether the dataset has been marked as deleted.

nextPageKey

string

A pagination token for retrieving the next page of results. If this field is present, more datasets are available. Pass this value as the pageKey query parameter in your next request.
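If you prefer typed objects over raw dictionaries, a few of the fields above can be mapped onto a small data class. This is only a sketch: the class `DatasetSummary`, the function `parse_dataset`, and the choice of fields are assumptions for illustration, and missing counts default to zero by choice rather than by any documented guarantee.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DatasetSummary:
    """A subset of the dataset fields (hypothetical type for this example)."""
    dataset_id: str
    name: str
    record_count: int
    errored_records_count: int
    updated: datetime
    deleted: bool


def parse_dataset(item: dict) -> DatasetSummary:
    """Convert one entry of the items array into a DatasetSummary."""
    return DatasetSummary(
        dataset_id=item["dataSetId"],
        name=item["dataSetName"],
        record_count=int(item.get("recordCount", 0)),
        errored_records_count=int(item.get("erroredRecordsCount", 0)),
        # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11,
        # so rewrite it as an explicit UTC offset first.
        updated=datetime.fromisoformat(item["updatedDate"].replace("Z", "+00:00")),
        deleted=bool(item.get("deleted", False)),
    )
```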

Example Response

{
  "items": [
    {
      "dataSetId": "20240925152201042a1b2c3d4e5f6789012345678901234",
      "dataSetName": "US Customers",
      "dataSetDescription": "Customer records for United States region",
      "businessAreaId": "20240115103000123a1b2c3d4e5f6789012345678901234",
      "businessAreaName": "Customer",
      "schemaId": "20240115103000456d1e2f3a4b5c6789012345678901234",
      "schemaName": "Individual Customer",
      "recordCount": 15420,
      "erroredRecordsCount": 12,
      "runningJobsCount": 0,
      "version": 5,
      "createdBy": "9ae5f422-bb62-4c9d-b277-594ddcda6d8d",
      "createdByEmail": "admin@example.com",
      "createdByName": "John Admin",
      "updatedBy": "9ae5f422-bb62-4c9d-b277-594ddcda6d8d",
      "updatedByEmail": "admin@example.com",
      "updatedByName": "John Admin",
      "createdDate": "2024-09-25T15:22:01.042Z",
      "updatedDate": "2024-12-15T10:30:00.000Z",
      "deleted": false
    },
    {
      "dataSetId": "20240926090000123b2c3d4e5f6a7890123456789012345",
      "dataSetName": "European Customers",
      "dataSetDescription": "Customer records for European region",
      "businessAreaId": "20240115103000123a1b2c3d4e5f6789012345678901234",
      "businessAreaName": "Customer",
      "schemaId": "20240115103000456d1e2f3a4b5c6789012345678901234",
      "schemaName": "Individual Customer",
      "recordCount": 8750,
      "erroredRecordsCount": 3,
      "runningJobsCount": 0,
      "version": 2,
      "createdBy": "b5f6g733-cc73-5d0e-c388-605eeda7e9e",
      "createdByEmail": "data_manager@example.com",
      "createdByName": "Jane Manager",
      "updatedBy": "b5f6g733-cc73-5d0e-c388-605eeda7e9e",
      "updatedByEmail": "data_manager@example.com",
      "updatedByName": "Jane Manager",
      "createdDate": "2024-09-26T09:00:00.123Z",
      "updatedDate": "2024-11-20T14:45:00.000Z",
      "deleted": false
    }
  ],
  "nextPageKey": "eyJMYXN0RXZhbHVhdGVkS2V5Ijp7ImRhdGFTZXRJZCI6IjIwMjQwOTI2MDkwMDAw..."
}

Response Without Pagination

When all datasets fit in a single response, no nextPageKey is returned:

{
  "items": [
    {
      "dataSetId": "20240925152201042a1b2c3d4e5f6789012345678901234",
      "dataSetName": "US Customers",
      "dataSetDescription": "Customer records for United States region",
      "businessAreaId": "20240115103000123a1b2c3d4e5f6789012345678901234",
      "businessAreaName": "Customer",
      "schemaId": "20240115103000456d1e2f3a4b5c6789012345678901234",
      "schemaName": "Individual Customer",
      "recordCount": 15420,
      "erroredRecordsCount": 0,
      "runningJobsCount": 0,
      "version": 5,
      "createdBy": "9ae5f422-bb62-4c9d-b277-594ddcda6d8d",
      "createdByEmail": "admin@example.com",
      "createdByName": "John Admin",
      "updatedBy": "9ae5f422-bb62-4c9d-b277-594ddcda6d8d",
      "updatedByEmail": "admin@example.com",
      "updatedByName": "John Admin",
      "createdDate": "2024-09-25T15:22:01.042Z",
      "updatedDate": "2024-12-15T10:30:00.000Z",
      "deleted": false
    }
  ]
}

Empty Response

If the schema has no datasets defined, the response will contain an empty items array:

{
  "items": []
}

Error Responses

Status Code	Description
`401 Unauthorized`	Invalid or expired access token. Obtain a new token from `/v1/oauth2/token` and try again.
`403 Forbidden`	Your application client does not have permission to access datasets. Contact your tenant administrator.
`404 Not Found`	The specified business area or schema does not exist, or you do not have access to it.
`500 Internal Server Error`	An unexpected error occurred on the server. Try again later or contact support.
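One way to act on these status codes in client code is a small lookup that turns each one into a follow-up step. The function name and the returned labels are hypothetical, and treating every 5xx status like the documented 500 is an assumption of this sketch.

```python
def next_action(status_code: int) -> str:
    """Map an HTTP status from List Datasets to a suggested follow-up."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 401:
        return "refresh_token"  # token invalid or expired: request a new one
    if status_code == 403:
        return "contact_admin"  # client lacks permission to access datasets
    if status_code == 404:
        return "check_ids"      # business area or schema missing or not accessible
    if status_code >= 500:
        return "retry_later"    # assumption: all 5xx handled like the documented 500
    return "fail"               # any other status is not covered by the table
```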

Pagination

When a schema contains many datasets, results are paginated. Use the nextPageKey from the response to fetch subsequent pages:

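// Fetch one page. Assumes Node.js 18+ (global fetch, run as an ES module) and a
// token in the PRETECTUM_ACCESS_TOKEN environment variable (name is an assumption).
async function getDatasets(businessAreaId, schemaId, pageKey = null) {
  const url = new URL(`https://api.pretectum.io/v1/businessareas/${businessAreaId}/schemas/${schemaId}/datasets`);
  if (pageKey) url.searchParams.set("pageKey", pageKey);
  const response = await fetch(url, {
    headers: { Authorization: process.env.PRETECTUM_ACCESS_TOKEN, Accept: "application/json" },
  });
  if (!response.ok) throw new Error(`List Datasets failed with HTTP ${response.status}`);
  return response.json();
}

// Collect every page by following nextPageKey until it is absent.
async function getAllDatasets(businessAreaId, schemaId) {
  const allDatasets = [];
  let pageKey = null;

  do {
    const response = await getDatasets(businessAreaId, schemaId, pageKey);
    allDatasets.push(...response.items);
    pageKey = response.nextPageKey;
  } while (pageKey);

  return allDatasets;
}

const allDatasets = await getAllDatasets(businessAreaId, schemaId);
console.log(`Total datasets: ${allDatasets.length}`);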
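The same loop can be sketched in Python as a generator, which lets callers stop early without fetching every page. The name `iter_datasets` and the `fetch_page` callback are assumptions for this example; `fetch_page` stands for any function that takes a page key (or `None` for the first page) and returns the decoded response body.

```python
from typing import Callable, Iterator, Optional


def iter_datasets(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield datasets page by page until the response has no nextPageKey."""
    page_key = None
    while True:
        page = fetch_page(page_key)
        yield from page.get("items", [])
        page_key = page.get("nextPageKey")
        if not page_key:
            break  # last page reached: no further token was returned


# Usage with canned pages instead of real HTTP calls:
pages = {
    None: {"items": [{"dataSetName": "US Customers"}], "nextPageKey": "k1"},
    "k1": {"items": [{"dataSetName": "European Customers"}]},
}
names = [ds["dataSetName"] for ds in iter_datasets(lambda key: pages[key])]
```

Passing the fetch function in as a parameter keeps the pagination logic independent of the HTTP client and easy to test.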

Use Cases

Filtering Search Results by Dataset

Use the dataset names returned by this endpoint to filter your data object searches:

# First, get the list of datasets
curl -X GET "https://api.pretectum.io/v1/businessareas/{businessAreaId}/schemas/{schemaId}/datasets" \
  -H "Authorization: your_access_token"

# Then search within a specific dataset
curl -X GET "https://api.pretectum.io/dataobjects/search?query=John&dataSet=US%20Customers" \
  -H "Authorization: your_access_token"
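When the dataset name comes from an API response rather than being typed by hand, it needs URL encoding before it goes into the `dataSet` parameter. A minimal Python sketch, assuming the search URL and the `query` and `dataSet` parameter names exactly as they appear in the curl example above (the helper name `search_url` is hypothetical):

```python
from urllib.parse import quote, urlencode

# Copied verbatim from the curl example above.
SEARCH_URL = "https://api.pretectum.io/dataobjects/search"


def search_url(query: str, dataset_name: str) -> str:
    """Build a search URL restricted to one dataset, encoding spaces as %20."""
    params = {"query": query, "dataSet": dataset_name}
    # quote_via=quote produces %20 for spaces instead of the default "+".
    return SEARCH_URL + "?" + urlencode(params, quote_via=quote)
```

For example, `search_url("John", "US Customers")` produces the same URL as the second curl command.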

Monitoring Data Quality

Check the erroredRecordsCount to identify datasets with data quality issues:

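// Assumes getDatasets() returns the parsed response for one page; follow nextPageKey to cover every dataset.
const datasets = await getDatasets(businessAreaId, schemaId);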

const datasetsWithErrors = datasets.items.filter(ds => ds.erroredRecordsCount > 0);
datasetsWithErrors.forEach(ds => {
  console.log(`${ds.dataSetName}: ${ds.erroredRecordsCount} errors out of ${ds.recordCount} records`);
});
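Raw error counts can be misleading when datasets differ greatly in size, so it can help to rank by error rate instead. The following Python sketch works on the `items` array of a response; the function name `error_rates` and the tuple layout are assumptions for this example.

```python
def error_rates(items):
    """Return (name, errors, total, rate) for datasets with errors, worst rate first."""
    report = []
    for ds in items:
        total = ds.get("recordCount", 0)
        errors = ds.get("erroredRecordsCount", 0)
        if errors > 0:
            # An empty dataset is reported with a rate of 0.0 to avoid dividing by zero.
            rate = errors / total if total else 0.0
            report.append((ds["dataSetName"], errors, total, rate))
    return sorted(report, key=lambda row: row[3], reverse=True)


# Usage with the counts from the example response:
items = [
    {"dataSetName": "US Customers", "recordCount": 15420, "erroredRecordsCount": 12},
    {"dataSetName": "European Customers", "recordCount": 8750, "erroredRecordsCount": 3},
]
for name, errors, total, rate in error_rates(items):
    print(f"{name}: {errors} of {total} records ({rate:.3%})")
```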

Tracking Record Counts

Monitor the size of your datasets:

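# get_datasets() is assumed to wrap the List Datasets request and return the parsed JSON body.
# Only the page returned by that call is counted; follow nextPageKey for complete totals.
datasets = get_datasets(business_area_id, schema_id)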

total_records = sum(ds['recordCount'] for ds in datasets['items'])
print(f"Total records across all datasets: {total_records}")

for ds in sorted(datasets['items'], key=lambda x: x['recordCount'], reverse=True):
    print(f"  {ds['dataSetName']}: {ds['recordCount']:,} records")

Best Practices

Cache dataset lists: Dataset metadata changes less frequently than record data. Cache the response and refresh periodically.
Filter by active datasets: Exclude datasets where deleted: true in user-facing interfaces.
Use names for search filters: When filtering searches with the dataSet parameter, use the dataSetName field value.
Handle pagination: Always check for nextPageKey in responses and fetch all pages if needed.
Monitor error counts: Regularly check erroredRecordsCount to identify data quality issues early.
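The first two practices can be combined in a small time-based cache. This is a sketch under stated assumptions: the class `DatasetListCache`, the function `active_only`, and the five-minute default lifetime are illustrative choices, and `loader` stands for any function that returns the full list of dataset objects.

```python
import time


class DatasetListCache:
    """Cache the dataset list and reload it once the entry is older than ttl_seconds."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the expiry logic can be tested
        self._items = None
        self._loaded_at = 0.0

    def get(self):
        now = self._clock()
        if self._items is None or now - self._loaded_at >= self._ttl:
            self._items = list(self._loader())
            self._loaded_at = now
        return self._items


def active_only(items):
    """Drop datasets that are marked as deleted."""
    return [ds for ds in items if not ds.get("deleted", False)]
```

A user-facing picker would then read `active_only(cache.get())`, which triggers at most one reload per cache lifetime.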

Related Endpoints

List Schemas: Get schema IDs for dataset queries
List Business Areas: Get business area IDs
Search Data Objects: Search within specific datasets
Get Access Token: Obtain authentication token
