Handling Sync Failures
Data sync operations interact with external APIs that can fail for many reasons: expired credentials, rate limits, network timeouts, schema changes, or provider outages. Otesse's sync engine is designed to handle these failures gracefully with automatic retries, circuit breakers, and dead letter queues. This page explains how failures are detected, handled, and resolved.
Types of Failures
Execution-Level Failures
These affect the entire sync run, not just individual records:
| Failure | Cause | Automatic Response |
|---|---|---|
| Credentials expired | OAuth token expired during execution | Attempt automatic refresh. If refresh succeeds, resume. If not, mark execution as "failed" and connection as "expired" |
| Provider API down | 5xx response from provider | Retry the request up to 3 times with exponential backoff (1s, 5s, 15s). If all retries fail, mark execution as "failed" |
| Rate limited | 429 response from provider | Respect the Retry-After header. Pause execution for the specified duration. Resume after cooldown |
| Timeout | Execution exceeds configured timeout | Mark as "timed_out". Keep the progress counts recorded so far and note which batch was in progress |
| Schema mismatch | Provider API returns unexpected field structure | Log the mismatch. Skip affected records. Continue processing others |
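The 5xx and 429 rows above can be sketched as a small request wrapper. This is a minimal illustration, not Otesse's implementation: `call_with_retries`, the `(status, retry_after)` response shape, and the fallback pause when a 429 carries no Retry-After value are all assumptions made for the example.

```python
import time
from typing import Callable, Optional, Tuple

# Delays from the table: up to three retries at 1s, 5s, and 15s.
BACKOFF_SECONDS = (1, 5, 15)

# Hypothetical response shape: (HTTP status, Retry-After seconds or None).
Response = Tuple[int, Optional[float]]


def call_with_retries(
    send: Callable[[], Response],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return the final HTTP status after applying the 5xx and 429 rules."""
    retries_used = 0
    while True:
        status, retry_after = send()
        if status == 429:
            # Rate limited: pause for the provider-specified cooldown, then resume.
            # Falling back to 1s when the header is missing is an assumption.
            sleep(retry_after if retry_after is not None else BACKOFF_SECONDS[0])
        elif 500 <= status < 600 and retries_used < len(BACKOFF_SECONDS):
            # Provider error: wait for the next backoff step and try again.
            sleep(BACKOFF_SECONDS[retries_used])
            retries_used += 1
        else:
            # Success, a non-retryable status, or a 5xx after all retries.
            return status
```

A caller would mark the execution as "failed" when the returned status is still in the 5xx range. Passing `sleep` as a parameter keeps the sketch testable without real waiting.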
Record-Level Failures
These affect individual records while allowing the rest of the batch to continue:
| Failure | Cause | Response |
|---|---|---|
| Validation error | A required field is missing or has an invalid value | Log the error, increment recordsFailed, continue to next record |
| Duplicate conflict | The record already exists at the target | Skip the record if duplicate detection is enabled; otherwise create it |
| Reference not found | A foreign key references an entity that does not exist at the target | Log as failed, continue processing |
| Transform error | A field mapping transformation failed (e.g., invalid date format) | Use default value if configured, otherwise mark as failed |
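As a rough sketch of how record-level handling keeps a batch moving, the loop below applies three of the rows above (validation, duplicate, transform). Only `recordsFailed` comes from this page; `process_batch`, the `email` and `signup_date` fields, and the other counter names are hypothetical, and reference lookups are omitted for brevity.

```python
from datetime import datetime
from typing import Dict, List, Optional, Set, Tuple


def process_batch(
    records: List[dict],
    existing_ids: Set[str],
    skip_duplicates: bool = True,
    default_date: Optional[str] = None,
) -> Tuple[Dict[str, int], List[Tuple[str, str]]]:
    """Process every record, counting failures instead of aborting the batch."""
    counts = {"created": 0, "skipped": 0, "recordsFailed": 0}
    errors: List[Tuple[str, str]] = []
    for rec in records:
        # Validation error: log it, increment recordsFailed, continue.
        if not rec.get("email"):
            errors.append((rec["id"], "ValidationError - Required field 'email' is null"))
            counts["recordsFailed"] += 1
            continue
        # Duplicate conflict: skip when detection is enabled, otherwise create.
        if skip_duplicates and rec["id"] in existing_ids:
            counts["skipped"] += 1
            continue
        # Transform error: use the default if one is configured, else fail.
        try:
            datetime.strptime(rec.get("signup_date") or "", "%Y-%m-%d")
        except ValueError:
            if default_date is None:
                errors.append((rec["id"], "TransformError - invalid date format"))
                counts["recordsFailed"] += 1
                continue
            rec = {**rec, "signup_date": default_date}
        # A real engine would write rec to the target here.
        counts["created"] += 1
    return counts, errors
```

The key design point is that each failure path ends in `continue`, so one bad record never stops the records after it.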
Automatic Retry
When a sync execution fails or times out, the system can automatically retry if retryOnFailure is enabled on the sync job:
| Retry | Delay | Behavior |
|---|---|---|
| 1st | 5 minutes | Creates a new SyncExecution with trigger type "retry" |
| 2nd | 30 minutes | Same |
| 3rd | 2 hours | Same |
| Beyond max retries | N/A | No more automatic retries. Alert created if configured. Admin notified |
Each retry creates a fresh SyncExecution record, preserving the history of every attempt. The retry uses incremental mode if the original job was incremental, processing only records that changed since the last successful run.
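The retry schedule can be expressed as a lookup on the number of retries already made. The sketch below is illustrative only: `next_retry_at` is a hypothetical helper, and measuring each delay from the time of the most recent failure is an assumption this page does not confirm.

```python
from datetime import datetime, timedelta
from typing import Optional

# Delays for the 1st, 2nd, and 3rd automatic retry, per the table above.
RETRY_DELAYS = (timedelta(minutes=5), timedelta(minutes=30), timedelta(hours=2))


def next_retry_at(
    failed_at: datetime, retries_so_far: int, retry_on_failure: bool
) -> Optional[datetime]:
    """Return when the next retry should run, or None if no retry is due."""
    if not retry_on_failure or retries_so_far >= len(RETRY_DELAYS):
        # Retries disabled or exhausted: the alert and notification path applies.
        return None
    return failed_at + RETRY_DELAYS[retries_so_far]
```

For an execution that failed at 14:00 with no retries yet, this returns 14:05; after the third retry it returns `None`.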
Circuit Breaker
The circuit breaker prevents the sync engine from generating excessive errors when something is fundamentally wrong (like an invalid API key or a breaking schema change):
Trigger condition: More than 50% of processed records fail AND at least 10 records have been processed.
When the circuit breaker activates:
- The current batch is halted
- Remaining records are not processed
- The execution status is set to "failed" with a circuit breaker note
- The progress counts reflect what was processed before halting
- The administrator is notified
The circuit breaker resets on the next execution attempt. This means a single bad batch does not permanently disable the sync job — but it prevents a 10,000-record sync from generating 10,000 error entries.
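The trigger condition is simple enough to state as a predicate. In this sketch, `should_trip` and `run_with_breaker` are hypothetical names, the "completed" status is a placeholder, and checking after every record (rather than per batch) is an assumption.

```python
from typing import Dict, Iterable, Union


def should_trip(processed: int, failed: int) -> bool:
    """True when more than 50% of at least 10 processed records have failed."""
    return processed >= 10 and failed * 2 > processed


def run_with_breaker(outcomes: Iterable[bool]) -> Dict[str, Union[str, int]]:
    """Consume per-record outcomes (True = success) until the breaker trips."""
    processed = failed = 0
    for ok in outcomes:
        processed += 1
        if not ok:
            failed += 1
        if should_trip(processed, failed):
            # Halt: remaining records are left unprocessed.
            return {"status": "failed", "note": "circuit breaker",
                    "processed": processed, "failed": failed}
    return {"status": "completed", "note": "",
            "processed": processed, "failed": failed}
```

Because the counters are local to `run_with_breaker`, each execution starts with a fresh breaker, which mirrors the reset behavior described above. A run of 10,000 failing records halts after the tenth.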
Dead Letter Queue
Records that fail repeatedly across multiple execution attempts are moved to a dead letter queue:
How Records Enter the Dead Letter Queue
- The engine tracks failures per record using ExternalReference and error details
- If a specific record fails 3 times across separate executions, it is flagged as "dead letter"
- Dead letter records are excluded from future sync runs to prevent infinite retry loops
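One way to picture the bookkeeping behind these rules is a tracker keyed by external reference. This is a sketch under assumptions: `FailureTracker` and its methods are hypothetical, and storing execution IDs in a set is one possible way to count failures "across separate executions" only once per execution.

```python
from collections import defaultdict
from typing import Dict, List, Set

DEAD_LETTER_THRESHOLD = 3  # failures across separate executions


class FailureTracker:
    def __init__(self) -> None:
        # external reference -> IDs of executions in which the record failed
        self.failed_executions: Dict[str, Set[str]] = defaultdict(set)
        self.last_error: Dict[str, str] = {}

    def record_failure(self, external_ref: str, execution_id: str, error: str) -> None:
        """Note a failure; repeats within one execution count only once."""
        self.failed_executions[external_ref].add(execution_id)
        self.last_error[external_ref] = error

    def is_dead_letter(self, external_ref: str) -> bool:
        return len(self.failed_executions.get(external_ref, ())) >= DEAD_LETTER_THRESHOLD

    def eligible(self, external_refs: List[str]) -> List[str]:
        """Filter out dead letter records before a sync run."""
        return [ref for ref in external_refs if not self.is_dead_letter(ref)]
```

Calling `eligible` at the start of each run is what prevents the infinite retry loop: a record that has failed in three executions is no longer handed to the engine.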
Managing Dead Letter Records
Dead letter records appear in a "Failed Records" tab on the sync job detail view:
| Column | Description |
|---|---|
| Record ID | The local or external record identifier |
| Entity Type | Customer, invoice, product, etc. |
| First Failed | When the record first failed |
| Last Failed | When the record most recently failed |
| Attempts | Number of failed attempts |
| Error | Most recent error message |
| Actions | Retry (individual) or Dismiss |
Retry: Re-processes the single record immediately, removing it from the dead letter queue if successful.
Dismiss: Marks the record as intentionally skipped. It will not be retried automatically and is removed from the dead letter view. A note is recorded in the audit log.
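The two actions differ mainly in when a record leaves the queue. A minimal sketch, assuming a dictionary-backed queue and a list-backed audit log (both hypothetical, as are the function names):

```python
from typing import Callable, Dict, List


def retry_record(
    queue: Dict[str, str], record_id: str, process: Callable[[str], bool]
) -> bool:
    """Re-process one record; remove it from the queue only on success."""
    succeeded = process(record_id)
    if succeeded:
        queue.pop(record_id, None)
    return succeeded


def dismiss_record(
    queue: Dict[str, str], record_id: str, audit_log: List[str], note: str
) -> None:
    """Remove the record from the queue and leave a note in the audit log."""
    queue.pop(record_id, None)
    audit_log.append(f"dismissed {record_id}: {note}")
```

A failed retry leaves the record in the queue, so it stays visible until it succeeds or is dismissed.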
Error Investigation
When a sync execution shows failures, administrators can investigate using the execution detail view:
Error Log
The error log lists every failed record with its identifier, entity type, error message, and timestamp. A sample entry:
Record: customer-uuid-1234
Entity: Customer
Error: ValidationError - Required field 'email' is null
Timestamp: 2026-03-01 14:23:45
Record Breakdown
A visual breakdown shows the proportions:
- Green: Created records
- Blue: Updated records
- Grey: Skipped records
- Red: Failed records
Common Error Patterns
| Pattern | Likely Cause | Resolution |
|---|---|---|
| All records fail with 401 | Expired or invalid credentials | Re-authenticate the connection |
| All records fail with 429 | Rate limit exceeded | Reduce sync frequency or batch size |
| Specific records fail with validation errors | Data quality issues | Fix the source data or update field mappings |
| Intermittent 5xx errors | Provider instability | Wait and retry; the automatic retry policy handles this |
| All records fail with schema error | Provider API changed | Update field mappings to match the new schema |
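The patterns in this table lend themselves to a small triage helper. The sketch below is illustrative only: `suggest_resolution`, the error-kind labels ("401", "429", "5xx", "validation", "schema"), and the fallback messages are assumptions, not part of the product.

```python
from typing import Dict, List

# Resolutions for patterns where every record fails with the same error.
WHOLE_RUN_RESOLUTIONS: Dict[str, str] = {
    "401": "Re-authenticate the connection",
    "429": "Reduce sync frequency or batch size",
    "schema": "Update field mappings to match the new schema",
}


def suggest_resolution(total_records: int, error_kinds: List[str]) -> str:
    """Map the errors from one execution to a resolution from the table."""
    if not error_kinds:
        return "No failures"
    kinds = set(error_kinds)
    all_failed = len(error_kinds) == total_records
    if all_failed and len(kinds) == 1:
        only_kind = next(iter(kinds))
        if only_kind in WHOLE_RUN_RESOLUTIONS:
            return WHOLE_RUN_RESOLUTIONS[only_kind]
    if "5xx" in kinds:
        return "Wait and retry; the automatic retry policy handles this"
    if "validation" in kinds:
        return "Fix the source data or update field mappings"
    return "Investigate manually"  # hypothetical fallback, not from the table
```

Checking the whole-run patterns first matters: a 401 on every record points at the connection, not at the individual records.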
Preventing Failures
- Start with incremental mode: full syncs process more records and are more likely to hit rate limits or timeouts
- Set reasonable timeouts: give long-running syncs enough time to complete, but not so much that a stalled execution goes undetected
- Monitor execution history: check the sync job list regularly for executions with "completed_with_errors" status
- Keep credentials fresh: for OAuth providers, monitor connection health and re-authenticate before tokens expire
- Use field mapping defaults: set default values for optional fields to prevent null validation errors