You can run two types of update operations on your data transfers in Supermetrics’ data warehouse and cloud storage destinations: scheduled refreshes and historical backfills. Combine them with different update types to find the right refresh behavior and cadence for your data.
This article covers important information about update operations and update types. We've included an overview table that summarizes each option.
When you configure a transfer, you can specify when and how frequently its results should be updated with a scheduled refresh operation.
When you set this operation up, you’ll define a refresh window for each query. The refresh window determines how many days of data are updated every time the data is refreshed. For example, a refresh window of 7 ensures that, each time the transfer runs, the last 7 days of data (including today’s data) are replaced, one day at a time.
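The refresh window arithmetic can be sketched as follows. This is a minimal illustration, not Supermetrics' implementation; the function name `refresh_dates` and its parameters are hypothetical.

```python
from datetime import date, timedelta


def refresh_dates(run_day: date, refresh_window: int) -> list[date]:
    """Return the days a refresh would replace, newest first.

    The window includes the run day itself, so a window of 7
    covers the run day and the 6 days before it.
    """
    if refresh_window < 1:
        raise ValueError("refresh_window must be at least 1")
    return [run_day - timedelta(days=offset) for offset in range(refresh_window)]


days = refresh_dates(date(2024, 3, 10), 7)
print(days[0], days[-1])  # 2024-03-10 2024-03-04
```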
Historical backfills allow you to transfer data from a specific period in the past. Once you specify a range of dates, Supermetrics will pull this data from the source one day at a time.
A query’s maximum historical backfill range is determined by:
- Your Supermetrics license
- The data source's data retention policies
  - This can also apply to retention policies for specific fields. For example, some data sources limit retention for demographic data fields.
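One way to reason about these limits is that the stricter of the two decides how far back a backfill can reach. The sketch below assumes both limits can be expressed as a number of days; the function and parameter names are hypothetical and the day counts are made-up examples.

```python
from datetime import date, timedelta


def earliest_backfill_date(today: date, license_limit_days: int,
                           source_retention_days: int) -> date:
    """Return the oldest date a backfill could request.

    Assumption: the effective limit is the smaller of the license
    limit and the data source's retention period.
    """
    effective_limit = min(license_limit_days, source_retention_days)
    return today - timedelta(days=effective_limit)


# Hypothetical values: a 730-day license limit and a 90-day source retention.
oldest = earliest_backfill_date(date(2024, 3, 10), 730, 90)
print(oldest)
```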
Supermetrics’ update operations will default to rolling updates as long as the “Date” dimension is included in the query. It does this because most of the data sources it connects to supply data that can be partitioned by date (sometimes called “time-series data”).
Rolling updates supply a range of run dates, beginning with the start date and ending with the end date, to the processing queue. The data is processed in reverse chronological order, one day at a time, with each day considered independently. This applies to both scheduled refreshes and historical backfills.
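The queueing behavior described above can be sketched like this. It is an illustration only: `build_run_queue` and `process_day` are hypothetical names, and the real processing step is replaced by a placeholder that records the order of execution.

```python
from datetime import date, timedelta


def build_run_queue(start: date, end: date) -> list[date]:
    """Return every run date from start to end, newest first."""
    if start > end:
        raise ValueError("start must not be after end")
    total_days = (end - start).days + 1
    return [end - timedelta(days=offset) for offset in range(total_days)]


processed: list[date] = []


def process_day(run_date: date) -> None:
    """Placeholder for fetching and writing one day of data."""
    processed.append(run_date)


# Each day is handled independently, in reverse chronological order.
for run_date in build_run_queue(date(2024, 1, 1), date(2024, 1, 3)):
    process_day(run_date)

print(processed[0], processed[-1])  # 2024-01-03 2024-01-01
```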
Different destinations process this update type in these ways:
- BigQuery: Single days of data are stored as individual table shards. Update operations will overwrite the date-partitioned shard, eliminating the chance of introducing duplicates to the dataset.
- Snowflake, Azure Synapse, and Redshift: Queries with rolling updates will store data in the same table. Supermetrics will perform a DELETE and COPY INTO operation based on the date associated with data being updated.
- Cloud storage (including SFTP): Data from each day is stored in a separate file. Update operations will replace files in the destination folder with new data from Supermetrics.
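To show why replacing data by date avoids duplicates, here is a small sketch of the delete-then-load pattern. It uses an in-memory SQLite database as a stand-in for a warehouse, and a plain `INSERT` in place of `COPY INTO`; the table name `ad_stats`, its columns, and the `replace_day` helper are hypothetical.

```python
import sqlite3


def replace_day(conn: sqlite3.Connection, day: str,
                rows: list[tuple[str, int]]) -> None:
    """Delete one day's rows, then load the fresh rows for that day."""
    with conn:  # run both statements in a single transaction
        conn.execute("DELETE FROM ad_stats WHERE date = ?", (day,))
        conn.executemany(
            "INSERT INTO ad_stats (date, campaign, clicks) VALUES (?, ?, ?)",
            [(day, campaign, clicks) for campaign, clicks in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ad_stats (date TEXT, campaign TEXT, clicks INTEGER)")

# The same day is updated twice, as a refresh window would do.
replace_day(conn, "2024-03-09", [("brand", 10), ("generic", 4)])
replace_day(conn, "2024-03-09", [("brand", 12), ("generic", 5)])

row_count = conn.execute("SELECT COUNT(*) FROM ad_stats").fetchone()[0]
print(row_count)  # 2
```

Because the day's rows are removed before the new ones are loaded, running the same day again leaves one copy of each row rather than two.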
Snapshot updates capture data from a data source at a specific point in time. Because there isn’t a unique run day, they’re incompatible with historical backfills.
They’re useful when:
- The API doesn't contain historical data. This is common in organic social APIs.
- The data from the API isn't time-series data. This can happen when pulling data from fields like Contacts, Companies, or Deals from customer relationship management (CRM) sources.
- You want to pull lifetime metrics from a given data source. For example, non-aggregatable metrics like Reach and Frequency benefit from snapshot updates.
Snapshot updates will replace the entire table’s data every day. Tables in data warehouse destinations will be dropped and recreated each day, while files in cloud storage destinations will be completely overwritten.
Snapshot updates will occur when the “Date” dimension is omitted from the schema. For snapshot updates to run successfully, it’s important to have the refresh window set to 2.
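The default choice between rolling and snapshot updates can be summarized in a few lines. This is a simplified sketch of the rule stated in this article, with a hypothetical function name; it does not cover rolling snapshot updates, which are enabled separately by Supermetrics.

```python
def default_update_type(query_fields: list[str]) -> str:
    """Pick the update type based on whether "Date" is in the query."""
    return "rolling" if "Date" in query_fields else "snapshot"


print(default_update_type(["Date", "Campaign", "Clicks"]))  # rolling
print(default_update_type(["Today", "Followers"]))          # snapshot
```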
A snapshot update is triggered when the “Run date” is equal to the prior day. The update then adds a start date and an end date to the underlying query. The start date is the maximum allowable historical limit defined by your Supermetrics license, and the end date is the day before you created the transfer.
Rolling snapshot updates
Rolling snapshot updates allow the results of a snapshot update to be stored for historical use. Subsequent snapshot queries can be run and stored in the same table or date-stamped files and will be partitioned by the day the query was run.
Like snapshot updates, they’re incompatible with historical backfills.
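The difference between the two snapshot types comes down to what happens to earlier results. The sketch below models a destination table as a Python list and date partitions as a dictionary; the helper names and the follower counts are made up for illustration.

```python
from datetime import date


def store_snapshot(table: list[dict], rows: list[dict]) -> None:
    """Snapshot update: replace the whole table with the latest result."""
    table[:] = rows


def store_rolling_snapshot(partitions: dict[date, list[dict]],
                           run_day: date, rows: list[dict]) -> None:
    """Rolling snapshot update: keep each result, keyed by the day it ran."""
    partitions[run_day] = rows


snapshot_table: list[dict] = []
rolling_partitions: dict[date, list[dict]] = {}

for run_day, followers in [(date(2024, 3, 9), 100), (date(2024, 3, 10), 120)]:
    rows = [{"Today": run_day.isoformat(), "followers": followers}]
    store_snapshot(snapshot_table, rows)
    store_rolling_snapshot(rolling_partitions, run_day, rows)

# Only the latest snapshot survives, while both rolling snapshots are kept.
print(len(snapshot_table), len(rolling_partitions))  # 1 2
```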
Rolling snapshot updates need to be configured by someone from Supermetrics. Get in touch with your customer success manager or our support team to get started. We’ll set a special parameter in the backend to accumulate the results day after day.
We recommend setting the refresh window to 2 for rolling snapshot updates when you create your schema. You should include the “Today” field in your query. Once the schema and transfer are in place, let us know, and we’ll work with you to implement rolling snapshots.
Update types overview
| | Rolling updates | Snapshot updates | Rolling snapshot updates |
|---|---|---|---|
| Description | Updates each day of data individually | Captures lifetime values and refreshes the entire table every day | Captures lifetime values and stores the accumulated result |
| Best for | Time-series oriented queries; aggregatable metrics (clicks, cost, impressions) | Stateful, non-time-series data queries; non-aggregatable metrics (reach, frequency, unique users) | Tracking state change over time where historical data is unavailable; stateful metrics (followers, post likes, post comments) |
| Refresh window (days) | 2 to 30 | 2 | 2 |
| Accumulation of data history | Yes | No | Yes |
| Run date | Supplied via refresh window or backfill operation | Start date: Defined by license; End date: Yesterday | Start date: Defined by license; End date: Yesterday |
| Time dimension to include in query | "Date" | "Today" | "Today" |
| Self-service | Yes | Yes | Must be enabled by Supermetrics |