Datasets API - Prime Video Tech Docs

Datasets API

Last updated 2026-01-23

The new Datasets API in Prime Video Slate enables developers to build clients that retrieve event-grain exports (datasets) and any related dimensional datasets.

Important: The new endpoint documented here supports both subscription and playback. Playback datasets are only available through this new endpoint.

Datasets API overview

The Datasets API is part of our new partner data product, Slate Analytics. Unlike other Slate reports, datasets are append-only (each file contains new data), are accessible only through the API (they cannot be downloaded in the Slate UI), and are built explicitly for partner data engineers to consume granular data and perform analytics. This topic helps data engineers set up their pipelines to retrieve datasets, defines the values in the dataset files, and provides sample queries and suggestions for the optimal ways partners can use this data.

Practical use of datasets

We provide datasets to consumers in the form of a changelog. Each event is published only once. However, if any column values for a previously provided row need to be updated, we will publish a new version of the record to reflect the changes in your next available file. The changelog is append-only, to ensure that all data modifications are captured. Data engineers can use this changelog to update their data tables directly.

When you process the changelog, it’s essential to always use the latest record for a given event_id, based on the last_update_time_utc column. This ensures that you always have the most up-to-date version of each record. If a record needs to be deleted, this action is reflected in the is_deleted column. A value of 1 indicates that the record has been deleted, while a value of 0 represents an active record. This changelog approach allows you to effectively manage new and changing data, and ensures that your data tables remain accurate and up to date with the latest information.
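The rule above (keep the latest version of each record, then drop deleted ones) can be sketched in Python as follows. This is a minimal illustration, not an official client: the function name `apply_changelog` is hypothetical, rows are assumed to be already parsed into dictionaries, and timestamps are assumed to be strings in one fixed ISO-8601 UTC format so that they compare correctly as text. The `key` argument is the primary key column of the dataset being processed.

```python
from typing import Dict, Iterable, List


def apply_changelog(rows: Iterable[dict], key: str = "subscription_event_id") -> List[dict]:
    """Return the current table state from append-only changelog rows.

    Assumption: last_update_time_utc values share one ISO-8601 UTC format,
    so plain string comparison orders them chronologically.
    """
    latest: Dict[str, dict] = {}
    for row in rows:
        current = latest.get(row[key])
        # Keep the version with the most recent last_update_time_utc.
        # On a tie, the row that appears later in the changelog wins.
        if current is None or row["last_update_time_utc"] >= current["last_update_time_utc"]:
            latest[row[key]] = row
    # A record whose latest version has is_deleted = 1 is removed.
    return [row for row in latest.values() if int(row["is_deleted"]) == 0]
```

For the playback and catalog datasets, the same function would be called with `key="session_id"` or `key="id"` respectively.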

Datasets API preliminaries

Before you make requests to the Datasets API, it’s important to understand the basic requirements for authentication and pagination. This section covers how to securely access the API and navigate large datasets efficiently.

Onboarding to Analytics API
To retrieve datasets you need to onboard to the Analytics API suite first. More details can be found here.

    The base URI is: https://videocentral.amazon.com/apis/v2. All requests should include a valid LWA authentication token in the request authorization header. For example:
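A minimal sketch of building the authorization header in Python. The helper name `build_headers` is hypothetical, and the use of the `Bearer` scheme is an assumption; confirm the exact header format during onboarding.

```python
BASE_URI = "https://videocentral.amazon.com/apis/v2"  # base URI stated in this topic


def build_headers(access_token: str) -> dict:
    """Build request headers that carry the LWA token.

    Assumption: the token is sent using the 'Bearer' authorization scheme.
    """
    if not access_token:
        raise ValueError("an LWA access token is required")
    return {"Authorization": f"Bearer {access_token}"}
```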

    If the request header doesn’t include the token, or if the token is expired, the Datasets API will return an unauthorized exception.

    Pagination
    All Slate API responses are paginated. Pagination parameters are specified through request parameters.

    | Request parameter | Default value | Description |
    |---|---|---|
    | limit | 10 | The number of documents returned in a single page (the page size). |
    | offset | 0 | The number of pages to skip (the page number). |

    All paginated responses contain the following fields.

    | Field | Description |
    |---|---|
    | total | The total document count in all pages. |
    | next | The URL to the next page. Null if the last page. |
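One way to consume a paginated response is to follow the `next` field until it is null. The sketch below assumes responses are JSON objects; `fetch_json` is a hypothetical callable, supplied by you, that performs an authenticated GET on a URL and returns the decoded body.

```python
from typing import Callable, Iterator, Optional


def iter_pages(first_url: str, fetch_json: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every page of a paginated response, following `next` until it is null."""
    url: Optional[str] = first_url
    while url:
        page = fetch_json(url)
        yield page
        # `next` holds the URL of the following page, or null on the last page.
        url = page.get("next")
```

Because the function is a generator, pages are fetched lazily, which keeps memory use flat when `total` is large.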

    Use the Datasets API

    To programmatically access datasets, clients should follow a series of API calls that enumerate available resources—such as accounts, groups, businesses, and datasets—before retrieving downloadable URLs for the data files. This sequence is designed to support automation and can be integrated into recurring data pipelines or scheduled workflows.
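The resource paths in the following subsections nest one level at a time, so the URLs can be composed with a few small helpers. This is a sketch only: the function names are hypothetical, and it assumes that the `/v2` prefix shown in each resource path is the same `v2` that ends the base URI.

```python
from urllib.parse import quote

BASE_URI = "https://videocentral.amazon.com/apis/v2"  # base URI stated in this topic


def _join(*segments: str) -> str:
    """Append URL-encoded path segments to the accounts collection."""
    return "/".join([f"{BASE_URI}/accounts", *(quote(s, safe="") for s in segments)])


def accounts_url() -> str:
    return _join()


def groups_url(account_id: str) -> str:
    return _join(account_id)


def businesses_url(account_id: str, group_id: str) -> str:
    return _join(account_id, group_id)


def datasets_url(account_id: str, group_id: str, business_id: str) -> str:
    return _join(account_id, group_id, business_id, "datasets")


def dataset_files_url(account_id: str, group_id: str, business_id: str, dataset_id: str) -> str:
    return _join(account_id, group_id, business_id, "datasets", dataset_id)
```

A pipeline would typically call these in order, taking each identifier from the response of the previous call.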

    List accounts
    /v2/accounts

    This resource returns the list of Slate accounts that the user can access. The set of accounts is accessible in Slate through the accounts dropdown list near the top right corner of the portal. You can also use these links to find your account_id or your channel/studio_id.

    Example request

    Example response

    List groups (business lines)
    /v2/accounts/{account_id}

    This resource returns the groups of business lines (such as channels) that the user can access.

    Example request

    Example response

    List businesses
    /v2/accounts/{account_id}/{group_id}

    This resource returns a list of businesses (such as specific channel names) available for this account, depending on the given business line.

    Example request

    Example response

    List available datasets
    /v2/accounts/{account_id}/{group_id}/{business_id}/datasets

    This resource returns the list of datasets available for a given channel or studio. (The list of available datasets and their attributes are included in Dataset definitions, later in this topic.) The datasets currently available to download are:

    • Subscription: Events in the customer lifecycle, such as when a customer subscribed.
    • Playback: Playback session events where customers engaged with content.
    • Catalog: Events where your catalog metadata has changed, such as when a new title was added.

    Example request

    Example response

    Obtain dataset file(s)
    /v2/accounts/{account_id}/{group_id}/{business_id}/datasets/{dataset_id}

    This resource provides a list of dataset files. Depending on the requested time range, the list may include a large number of files. The total field indicates how many files to expect. After completing a full backfill, you can stay up to date by continuing to request files using a startDateTime equal to the last retrieved timestamp and an endDateTime set to the current time.

    New datasets are published approximately every 4 hours, and may contain events that have occurred within the previous 12 hours. We recommend calling our API multiple times per day, approximately every 4-6 hours, to ensure your local data is as complete and up-to-date as possible. If we experience a delay in publishing, we will communicate through email as soon as possible.

    The following table describes the available request parameters for dataset files.

    | Request parameter | Description |
    |---|---|
    | startDateTime | Start of the requested time range, as a UTC timestamp in the format YYYY-MM-DDTHH:MM:SSZ. We recommend using the time of your last pull. |
    | endDateTime | End of the requested time range, as a UTC timestamp in the format YYYY-MM-DDTHH:MM:SSZ. We recommend using the current time (the time of the pull). |
    | limit | The number of file links returned per page. The maximum is 1000. |

    Note: Our maximum data retention is 2 years. Requests for datasets with a timestamp earlier than 2 years prior will not return any results.
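The request parameters above can be assembled as in the following sketch. The helper name `file_query_params` is hypothetical; the parameter names, the timestamp format, and the limit of 1000 come from this topic.

```python
from datetime import datetime, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # YYYY-MM-DDTHH:MM:SSZ, always expressed in UTC
MAX_LIMIT = 1000                  # maximum links per page stated in this topic


def file_query_params(start: datetime, end: datetime, limit: int = MAX_LIMIT) -> dict:
    """Build the query parameters for a dataset files request."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware datetimes")
    if start >= end:
        raise ValueError("start must be earlier than end")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    return {
        # Convert to UTC before formatting so the trailing 'Z' is accurate.
        "startDateTime": start.astimezone(timezone.utc).strftime(TS_FORMAT),
        "endDateTime": end.astimezone(timezone.utc).strftime(TS_FORMAT),
        "limit": limit,
    }
```

Requiring timezone-aware datetimes avoids silently sending local times labeled as UTC.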

    Example request

    Example response

    Notes:

    • Maximum file zip size is 300 MB. Files that exceed that limit will be split into multiple files.
    • Presigned URL time-to-live (TTL) is 60 minutes.
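Because presigned URLs expire, a pipeline that lists many files may want to check whether a link is still usable before starting a download, and request the file list again if it is not. The sketch below is illustrative only: `url_is_usable` is a hypothetical helper, and it assumes the 60-minute TTL starts when the listing response is received.

```python
from datetime import datetime, timedelta, timezone

PRESIGNED_TTL = timedelta(minutes=60)  # TTL stated in this topic


def url_is_usable(listed_at: datetime, now: datetime,
                  margin: timedelta = timedelta(minutes=5)) -> bool:
    """Return True if a URL obtained at `listed_at` should still be valid at `now`.

    `margin` reserves time for the download itself to start and complete.
    """
    return now - listed_at < PRESIGNED_TTL - margin
```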

    Dataset definitions

    The tables in this section list the columns, data types, and definitions for each of the 3 available datasets.

    Subscription dataset

    | Column | Type | Definition |
    |---|---|---|
    | subscription_event_id (pk) | string | The unique ID for each subscription event vended through this log. |
    | subscription_event_type | string | The type of subscription event that occurred. Start: the customer subscribed to a channel they were not subscribed to previously. Renewal: the customer subscribed to a channel and was already active for that channel prior to the subscription event. Cancel: the customer canceled their subscription. Active - AR ON: the customer is active and has turned autorenew on. Active - AR OFF: the customer is active, but has turned autorenew off. |
    | subscription_event_time_utc | timestamp | The time the subscription event occurred, standardized to UTC. |
    | subscription_event_time_zone | string | The time zone of the subscription marketplace. |
    | cid | string | Anonymized customer identifier (CID). This customer identifier persists for all events under a single parent channel to enable inter-tier movement and customer lifecycle tracking. |
    | offer_id | string | The ID of the specific subscription offer that the event relates to. |
    | offer_name | string | The human-readable name of the offer. |
    | offer_type | string | The type of offer. |
    | offer_marketplace | string | The marketplace where the subscription offer was live. |
    | offer_billing_type | string | The type of payment required for the offer. HO: hard offer; payment required. FT: free trial; no payment required. |
    | offer_payment_amount | string | The billing amount of the offer_id. |
    | benefit_id | string | The ID of the Prime Video benefit the offer is configured under. |
    | channel_label | string | The name of the channel the offer is under. Note: If this column shows a null value, and you have concerns, please contact your CAM or PsM. |
    | channel_tier_label | string | The name of the channel tier the offer is under. Note: If this column shows a null value, and you have concerns, please contact your CAM or PsM. |
    | is_promo | int | Indicates whether the offer is on a promotion at the time of the event (0 = no promo, 1 = promo). |
    | create_time_utc | timestamp | The time the subscription event log record was created, standardized to UTC. |
    | last_update_time_utc | timestamp | The time the subscription event log record was last updated, standardized to UTC. |
    | is_deleted | int | Indicates whether a record that was previously created should be deleted (0 = should persist, 1 = should be deleted). |

    Playback dataset

    | Column | Type | Definition |
    |---|---|---|
    | session_id (pk) | string | The unique ID for the playback session. |
    | marketplace_id | int | The unique ID for the playback marketplace. |
    | marketplace_desc | string | A friendly description for the playback marketplace. |
    | cid | string | The user identifier, anonymized with UUID. |
    | benefit_id | string | The benefit associated with content that was streamed. |
    | catalog_id | string | Foreign key (FK) used to join to the catalog table. |
    | subscription_offer_id | string | The subscription offer_id the customer is subscribed to at the time of the stream (Active or ApprovalPending). |
    | subscription_event_id | string | Foreign key (FK) used to join to the subscription event log to get the exact status of the subscriber at the time of playback (Active). |
    | start_segment_utc | timestamp | Start of the playback segment, in UTC. |
    | end_segment_utc | timestamp | End of the playback segment, in UTC. |
    | seconds_viewed | int | The number of seconds the user streamed content during playback. |
    | position_start | double | The second of the stream where the playback session started. |
    | position_end | double | The second of the stream where the playback session ended. |
    | connection_type | string | The connection used by the customer to stream the content. |
    | stream_type | string | Classification between Video-On-Demand, Live, or Just After Broadcast (JAB) streams. |
    | device_class | string | The type of device (such as Living Room, Mobile, Web, or Others). |
    | device_sub_class | string | The granular type of device (such as game console, smart_tv, roku). |
    | geo_dma | string | The 3-digit geographical Designated Market Area (DMA) of the area where the stream was generated. |
    | playback_method | string | Indicates whether playback is Online or Offline. |
    | quality | string | The playback quality (such as 1080p or 4K). |
    | event_type | string | The defining event type (playback_segments). |
    | create_time_utc | timestamp | Timestamp when the record was added to the table, in UTC. |
    | last_update_time_utc | timestamp | Timestamp when the record was last modified, in UTC. |
    | is_deleted | int | Flag that tells partners whether the record should be deleted in their system. |

    Catalog dataset

    | Column | Type | Definition |
    |---|---|---|
    | id (pk) | string | The unique ID for the title. |
    | marketplace_id | int | The unique ID for the offer marketplace. |
    | benefit_id | string | The benefit associated with the content extended. |
    | title | string | The title of the series/movie. |
    | vendor_sku | string | An arbitrary identifier that the vendor generates for each of their movies or episodes. |
    | season | integer | The season number (for episodic content). |
    | episode | integer | The episode number. |
    | episode_name | string | The episode name (optional). |
    | runtime_minutes | integer | The runtime of the content, in minutes. |
    | live_linear_channel_name | string | The channel name for live content. |
    | content_type | string | Either TV or Movie. |
    | content_quality | string | Either HD or SD. |
    | content_group | string | 3P_SUBS |
    | create_time_utc | timestamp | Timestamp when the record was added to the table, in UTC. |
    | last_update_time_utc | timestamp | Timestamp when the record was last modified, in UTC. |
    | is_deleted | int | Flag that tells partners whether the record should be deleted in their system. |

    Sample queries

    The following SQL example demonstrates how the dataset tables connect. You can join playback data to the subscription event log on the subscription_event_id column. This provides the latest subscription status prior to that stream. In this example, the catalog_id column in the playback dataset is joined to the id field in catalog_event_log to provide all catalog metadata.
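A minimal sketch of these joins, run against an in-memory SQLite database from Python so that it is self-contained. The table names `playback_event_log` and `subscription_event_log` are hypothetical (only `catalog_event_log` is named in this topic), the inserted rows are made-up sample data, and the tables are assumed to already hold only the latest, non-deleted version of each record.

```python
import sqlite3

# Join playback to the subscription event log and to the catalog.
JOIN_SQL = """
SELECT p.session_id,
       p.seconds_viewed,
       s.subscription_event_type,
       s.offer_name,
       c.title
FROM playback_event_log AS p
LEFT JOIN subscription_event_log AS s
       ON p.subscription_event_id = s.subscription_event_id
LEFT JOIN catalog_event_log AS c
       ON p.catalog_id = c.id
"""


def demo_join() -> list:
    """Create tiny sample tables (made-up rows) and run the join."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE playback_event_log (
            session_id TEXT, subscription_event_id TEXT,
            catalog_id TEXT, seconds_viewed INTEGER);
        CREATE TABLE subscription_event_log (
            subscription_event_id TEXT, subscription_event_type TEXT, offer_name TEXT);
        CREATE TABLE catalog_event_log (id TEXT, title TEXT);
        INSERT INTO playback_event_log VALUES ('s1', 'e1', 'c1', 120);
        INSERT INTO subscription_event_log VALUES ('e1', 'Start', 'Monthly');
        INSERT INTO catalog_event_log VALUES ('c1', 'Example Title');
    """)
    rows = con.execute(JOIN_SQL).fetchall()
    con.close()
    return rows
```

The left joins keep playback rows even when the matching subscription or catalog record has not yet arrived in a later file.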

    The following SQL example returns the top 10 titles that customers watched first after starting a subscription.
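One possible formulation of such a query is sketched below, again executed through SQLite from Python. It rests on several assumptions: the table names are hypothetical apart from `catalog_event_log`, the `cid` values are assumed to match between the subscription and playback datasets, timestamps are assumed to be ISO-8601 UTC strings that compare correctly as text, and the sample rows are made up.

```python
import sqlite3

TOP_FIRST_WATCHED_SQL = """
WITH starts AS (             -- first 'Start' event per customer
    SELECT cid, MIN(subscription_event_time_utc) AS started_at
    FROM subscription_event_log
    WHERE subscription_event_type = 'Start'
    GROUP BY cid
),
first_play AS (              -- earliest playback at or after that start
    SELECT p.cid, MIN(p.start_segment_utc) AS first_played_at
    FROM playback_event_log AS p
    JOIN starts AS s ON s.cid = p.cid
    WHERE p.start_segment_utc >= s.started_at
    GROUP BY p.cid
)
SELECT c.title, COUNT(DISTINCT p.cid) AS customers
FROM first_play AS f
JOIN playback_event_log AS p
  ON p.cid = f.cid AND p.start_segment_utc = f.first_played_at
JOIN catalog_event_log AS c ON c.id = p.catalog_id
GROUP BY c.title
ORDER BY customers DESC, c.title
LIMIT 10
"""


def demo_top_first_watched() -> list:
    """Run the query against made-up sample rows."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE subscription_event_log (
            subscription_event_id TEXT, cid TEXT,
            subscription_event_type TEXT, subscription_event_time_utc TEXT);
        CREATE TABLE playback_event_log (
            session_id TEXT, cid TEXT, catalog_id TEXT, start_segment_utc TEXT);
        CREATE TABLE catalog_event_log (id TEXT, title TEXT);
        INSERT INTO subscription_event_log VALUES
            ('e1', 'u1', 'Start', '2026-01-01T00:00:00Z'),
            ('e2', 'u2', 'Start', '2026-01-02T00:00:00Z'),
            ('e3', 'u3', 'Start', '2026-01-03T00:00:00Z');
        INSERT INTO playback_event_log VALUES
            ('s0', 'u2', 'c2', '2026-01-01T00:00:00Z'),  -- before u2 subscribed
            ('s1', 'u1', 'c1', '2026-01-01T01:00:00Z'),
            ('s2', 'u1', 'c2', '2026-01-01T02:00:00Z'),
            ('s3', 'u2', 'c1', '2026-01-02T05:00:00Z'),
            ('s4', 'u3', 'c2', '2026-01-03T01:00:00Z');
        INSERT INTO catalog_event_log VALUES ('c1', 'Title A'), ('c2', 'Title B');
    """)
    rows = con.execute(TOP_FIRST_WATCHED_SQL).fetchall()
    con.close()
    return rows
```

With the sample rows, two customers watched Title A first and one watched Title B first; the playback that predates the subscription start is ignored.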

    Sample orchestration

    If you want to automate data extraction from the Datasets API on a recurring schedule, the following sample Python script demonstrates how to make incremental API calls every 6 hours. It tracks the timestamp of the last successful request by persisting it locally, and uses that value—plus one second—as the startDateTime for the next call. The script calculates endDateTime as the current time, builds the appropriate query parameters, and sends a GET request with authentication. This approach ensures continuous, non-overlapping data retrieval across time windows and can be scheduled via cron or another job scheduler.
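The approach described above can be sketched as follows, using only the Python standard library. Treat it as an illustration under stated assumptions rather than a reference implementation: the state file name, the function names, the `Bearer` authorization scheme, the JSON response body, and the 6-hour default window for the very first run are all assumptions. The `send` parameter exists so the HTTP call can be replaced in tests.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional, Tuple

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"
STATE_FILE = Path("last_successful_pull.txt")  # hypothetical local state file


def next_window(state_file: Path, now: datetime,
                default_start: datetime) -> Tuple[datetime, datetime]:
    """Return (start, end), where start is the last saved end plus one second."""
    if state_file.exists():
        saved = state_file.read_text().strip()
        last = datetime.strptime(saved, TS_FORMAT).replace(tzinfo=timezone.utc)
        return last + timedelta(seconds=1), now
    return default_start, now


def pull_once(dataset_url: str, token: str, state_file: Path = STATE_FILE,
              now: Optional[datetime] = None, send=urllib.request.urlopen) -> dict:
    """Request the files published since the last successful pull."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc).replace(microsecond=0)
    # Assumption: with no saved state, start 6 hours back; use an earlier
    # default_start for an initial backfill.
    start, end = next_window(state_file, now, default_start=now - timedelta(hours=6))
    query = urllib.parse.urlencode({
        "startDateTime": start.strftime(TS_FORMAT),
        "endDateTime": end.strftime(TS_FORMAT),
        "limit": 1000,
    })
    request = urllib.request.Request(
        f"{dataset_url}?{query}",
        headers={"Authorization": f"Bearer {token}"},  # assumed scheme
    )
    with send(request) as response:
        body = json.load(response)
    # Persist the window end only after the request succeeded, so a failed
    # run is retried over the same window next time.
    state_file.write_text(end.strftime(TS_FORMAT))
    return body
```

A scheduler such as cron would call `pull_once` every 6 hours. The sketch returns only the first page of results; remaining pages would be retrieved by following the `next` field.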
