Set up the schedule for ongoing data extractions. We use this schedule to establish the dataset workflows that keep your data up to date.
The Schedule step in data onboarding shows your dataset’s configured schedule, offers monitoring options for late or missed deliveries, and displays the inferred schedule based on source data patterns and file modification timestamps.
Configure dataset schedule
Use the Configured schedule to specify when to check the data source for updates.
You can modify the Configured schedule by clicking the "Edit" button and adjusting the frequency and time of data ingestion to suit your needs.
Monitor late or missed deliveries
Crux offers monitoring of your data delivery to track and alert you when data updates are delayed or missed. While Crux does not control the data’s availability at the source, it can notify you when data is delayed based on your configured thresholds.
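As a rough illustration of how a delay threshold turns a schedule into a delivery deadline, the sketch below adds the threshold to the scheduled time and classifies a delivery against the result. This is not Crux's implementation: the function name, the parameters, and the four status labels are hypothetical and chosen only for this example.

```python
from datetime import datetime, timedelta
from typing import Optional


def delivery_status(
    scheduled_at: datetime,
    delay_threshold: timedelta,
    delivered_at: Optional[datetime],
    now: datetime,
) -> str:
    """Classify one expected delivery against its deadline (hypothetical labels)."""
    # Assumption: the deadline is the scheduled time plus the configured delay.
    deadline = scheduled_at + delay_threshold
    if delivered_at is not None:
        # The data arrived: compare the arrival time with the deadline.
        return "on_time" if delivered_at <= deadline else "delayed"
    # Nothing has arrived yet: still waiting, or past the deadline.
    return "pending" if now <= deadline else "missed"


scheduled = datetime(2024, 1, 15, 6, 0)
threshold = timedelta(hours=2)  # deadline becomes 08:00

# Delivered at 07:30, before the 08:00 deadline.
print(delivery_status(scheduled, threshold, datetime(2024, 1, 15, 7, 30), datetime(2024, 1, 15, 12, 0)))
# Nothing delivered by 09:00, after the deadline.
print(delivery_status(scheduled, threshold, None, datetime(2024, 1, 15, 9, 0)))
```

The first call prints on_time and the second prints missed. A longer threshold reduces alerts for suppliers that deliver irregularly, at the cost of learning about real delays later.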
✨ Recommended delay threshold
Crux AI Technology suggests a delay threshold based on the dataset's historical delivery patterns. At a minimum, ten deliveries are needed to analyze past delivery trends and create a recommendation.
Manual delay threshold
Use this option to set a delivery deadline manually if a recommended threshold is unavailable or you want to override it. Choose between adding a time delay to the configured schedule (Option 2) and configuring the delivery deadline with a custom Cron expression (Option 3). Crux uses the Spring cron format, which has six single-space-separated fields: second, minute, hour, day of month, month, and day of week.
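To make the six-field layout concrete, here is a minimal sketch that splits a Spring-style Cron expression into named fields and rejects anything that does not have exactly six single-space-separated parts. The helper name split_spring_cron and the sample expression are assumptions for illustration. The check covers only the field count and spacing, not whether each field's value is valid.

```python
from typing import Dict

# Field order used by Spring cron expressions.
SPRING_CRON_FIELDS = (
    "second",
    "minute",
    "hour",
    "day_of_month",
    "month",
    "day_of_week",
)


def split_spring_cron(expression: str) -> Dict[str, str]:
    """Split a 6-field cron expression into named fields (shape check only)."""
    # Splitting on a single space means doubled, leading, or trailing
    # spaces produce empty parts, which are rejected below.
    parts = expression.split(" ")
    if len(parts) != len(SPRING_CRON_FIELDS) or "" in parts:
        raise ValueError(
            f"expected 6 single-space-separated fields, got: {expression!r}"
        )
    return dict(zip(SPRING_CRON_FIELDS, parts))


fields = split_spring_cron("0 30 6 * * MON-FRI")
for name, value in fields.items():
    print(f"{name}: {value}")
```

The sample expression reads as 06:30:00 on Monday through Friday. A common mistake is pasting a five-field Unix-style expression such as "30 6 * * 1-5", which this check rejects because the leading seconds field is missing.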
When monitoring is enabled, the Health dashboard will expose delayed or missed deliveries.
✨ Review inferred schedule
Crux analyzes file patterns and timestamps from the data source to determine how often the dataset is updated. By default, the Configured schedule matches the Inferred schedule, but you can customize it as needed. Aim for a balance: ingest data as soon as possible after the data supplier updates it at the source, but not so early that the data is read before it is ready, which causes alert fatigue and wastes processing resources. This section of the interface is read-only and helps you manage the Configured schedule.
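The idea of inferring a cadence from file timestamps can be sketched as follows. This is only an illustration under simple assumptions, not the method Crux uses: it takes a list of file modification times and reports the median gap between consecutive ones, so that a single skipped delivery does not distort the estimate. The function name infer_interval and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List


def infer_interval(modified_at: List[datetime]) -> timedelta:
    """Estimate the update cadence as the median gap between timestamps."""
    if len(modified_at) < 2:
        raise ValueError("need at least two timestamps to measure a gap")
    ordered = sorted(modified_at)
    # Gap, in seconds, between each timestamp and the next one.
    gaps = [
        (later - earlier).total_seconds()
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return timedelta(seconds=median(gaps))


# Made-up modification times: daily files at 06:00, with January 4 skipped.
timestamps = [
    datetime(2024, 1, 1, 6, 0),
    datetime(2024, 1, 2, 6, 0),
    datetime(2024, 1, 3, 6, 0),
    datetime(2024, 1, 5, 6, 0),
]
print(infer_interval(timestamps))
```

With these sample values the gaps are 24, 24, and 48 hours, so the median is one day. An estimate like this gives only the frequency; choosing the time of day for the Configured schedule still depends on when the supplier usually finishes publishing.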
Next steps
After you have set up the Configured schedule and delivery thresholds/monitoring preferences, proceed to the next step, setting up Destinations.
Learn more
Visit the Understanding dataset ingestion schedule page to learn more about the details and benefits of the ingestion schedule.
If setting up the Configured schedule fails or produces unexpected deliveries, refer to the Troubleshooting ingestion schedule page.