Concepts & Terminology
Watch
A Watch is a monitoring mechanism designed to ensure that workflows or processes are running as expected. Workflows signal their operation to a Watch via API Check-ins.
There are two types of Watches: Notify Only and Missed Action.
Notify Only Watch
A Notify Only watch functions as a simple notification relay. Upon receiving a check-in, it triggers an alert to the team. Since similar notifications can be directly implemented within a workflow, the core value of Automation Watchdog lies primarily in its Missed Action watch functionality.
Missed Action Watch
A Missed Action watch expects regular Check-ins from the workflow. Each Check-in confirms that the workflow is still running.
Watch States
When check-ins are received within the expected timeframe, the Watch maintains a Relax State, signaling normal operation. However, if a check-in is missed or delayed beyond the designated period, the Watch shifts to an Error State, automatically generating an alert to inform the team of a potential issue.
Watch Activations
Watches are activated either by a Schedule or by an Activation Event. A schedule-activated Watch remains active for the duration defined by the schedule. An event-activated Watch remains active until it is explicitly deactivated by a subsequent event. A Watch can transition from the Relax State to the Error State, which indicates a missed check-in, only while it is active.
An active watch expects check-ins within either Fixed or Cascading time periods.
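The state and activation rules above can be sketched as a small transition function. This is an illustrative model only, not the product's implementation: `WatchState`, `next_state`, and the `deadline` parameter are hypothetical names introduced here.

```python
from datetime import datetime
from enum import Enum


class WatchState(Enum):
    RELAX = "Relax"
    ERROR = "Error"


def next_state(state: WatchState, active: bool,
               now: datetime, deadline: datetime) -> WatchState:
    """Return the Watch state after evaluating the check-in deadline at `now`.

    `deadline` is the latest time by which a check-in was expected
    (hypothetical parameter, used only for this sketch).
    """
    # Only an active Watch in the Relax State can move to the Error State.
    if state is WatchState.RELAX and active and now > deadline:
        return WatchState.ERROR
    # How a Watch leaves the Error State is not described above,
    # so recovery is deliberately not modeled here.
    return state
```

An inactive Watch keeps its current state even when the deadline has passed, which mirrors the rule that a missed check-in counts only while the Watch is active.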
Watch Check-in Strategy
There are two check-in strategies for Watches:
Fixed Period
A Fixed Period of 15 minutes means a check-in is required at least once in every 15-minute window. For example, a check-in may be required between 9:00 and 9:15, between 9:15 and 9:30, between 9:30 and 9:45, and so forth.
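As a rough illustration of the fixed windows, the sketch below computes the window that contains a given time and checks whether any check-in falls inside it. It assumes the windows are aligned to the moment the Watch became active (`anchor`), which the text above does not state; `fixed_window` and `window_satisfied` are hypothetical helper names.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

Window = Tuple[datetime, datetime]


def fixed_window(now: datetime, anchor: datetime, period: timedelta) -> Window:
    """Return (start, end) of the fixed window that contains `now`.

    Assumption: windows are aligned to `anchor`, the activation time.
    """
    index = (now - anchor) // period  # whole periods elapsed since the anchor
    start = anchor + index * period
    return start, start + period


def window_satisfied(check_ins: Iterable[datetime], window: Window) -> bool:
    """True if at least one check-in was received inside the window."""
    start, end = window
    return any(start <= t < end for t in check_ins)


anchor = datetime(2024, 1, 1, 9, 0)
window = fixed_window(datetime(2024, 1, 1, 9, 20), anchor, timedelta(minutes=15))
print(window[0].strftime("%H:%M"), window[1].strftime("%H:%M"))  # 09:15 09:30
```

Under this model a check-in at 9:07 satisfies the 9:00 to 9:15 window but has no effect on the 9:15 to 9:30 window.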
Cascading Period
A Cascading Period of 15 minutes means the next check-in is expected within 15 minutes of the time the last check-in was received. For example, if a check-in is required by 9:15 and one is received at 9:07, the next check-in is required by 9:22.
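The cascading rule comes down to restarting the period from each received check-in. The following is a minimal sketch under the assumption that the first check-in is due one period after the Watch starts; `CascadingDeadline` is a hypothetical class used only for illustration.

```python
from datetime import datetime, timedelta


class CascadingDeadline:
    """Tracks the next expected check-in time for a Cascading Period."""

    def __init__(self, start: datetime, period: timedelta) -> None:
        self.period = period
        # Assumption: the first check-in is due one period after the start.
        self.deadline = start + period

    def check_in(self, at: datetime) -> None:
        # Each check-in restarts the period from the time it was received.
        self.deadline = at + self.period

    def is_overdue(self, now: datetime) -> bool:
        return now > self.deadline


watch = CascadingDeadline(datetime(2024, 1, 1, 9, 0), timedelta(minutes=15))
watch.check_in(datetime(2024, 1, 1, 9, 7))
print(watch.deadline.strftime("%H:%M"))  # 09:22
```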
Advanced Concepts
Multi-Machine Workflow
Automation Watchdog provides robust monitoring for workflows running on any number of machines. Here’s how it handles multi-machine scenarios:
Machine Identification: Machines identify themselves to the Watch through API check-ins. This allows the Watch to track each machine's activity individually.
Watch Activation/Deactivation:
- The Watch is activated either directly or when a machine for the watch is activated.
- The overall Watch remains active as long as at least one machine is active.
- Individual machines handle their own deactivation when they've completed their assigned tasks for the current iteration.
- The Watch deactivates only after all active machines have deactivated.
Dynamic Tracking: The Watch dynamically adapts to situations where only a subset of machines is active. It focuses on tracking only those machines involved in the current iteration, whether they've provided check-ins or were explicitly activated.
Check-in Application:
- The Watch applies check-in requirements to the overall Watch, not to each individual machine independently.
- For example, in a Cascading Watch with a 5-minute period: if Machine 1 checks in, the next expected check-in is 5 minutes later. If Machine 2 checks in shortly after, the next expected check-in moves to 5 minutes after Machine 2's check-in.
- The Watch maintains a single expected check-in time, not a separate one for each machine.
- Note: Maintaining individual expected check-ins per machine ("Condition By Machine") is a planned future feature.
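The multi-machine behavior described above can be modeled as a set of active machines that share a single deadline. This is a simplified sketch, not the actual implementation: `MultiMachineWatch` and its methods are hypothetical, it uses a Cascading Period, and it assumes that the first activation starts the period.

```python
from datetime import datetime, timedelta
from typing import Optional, Set


class MultiMachineWatch:
    """Cascading Watch with one shared deadline for all machines."""

    def __init__(self, period: timedelta) -> None:
        self.period = period
        self.active_machines: Set[str] = set()
        self.deadline: Optional[datetime] = None  # single deadline for the Watch

    def activate(self, machine: str, at: datetime) -> None:
        # Assumption: activating the first machine starts the period.
        if not self.active_machines:
            self.deadline = at + self.period
        self.active_machines.add(machine)

    def check_in(self, machine: str, at: datetime) -> None:
        # A check-in also marks the machine as part of the current iteration.
        self.active_machines.add(machine)
        # Any machine's check-in moves the shared deadline.
        self.deadline = at + self.period

    def deactivate(self, machine: str) -> None:
        self.active_machines.discard(machine)
        # The Watch deactivates only after the last active machine has left.
        if not self.active_machines:
            self.deadline = None

    @property
    def is_active(self) -> bool:
        return bool(self.active_machines)
```

With a 5-minute period, a check-in from one machine at 9:00 followed by a check-in from a second machine at 9:02 leaves the shared deadline at 9:07, and the Watch stays active until both machines have deactivated.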
Condition By Queue
Watches can be configured to monitor a workflow that operates across multiple queues. This is particularly useful when a single workflow is reused for different queues. Instead of creating a separate Watch for each queue, you can use the 'Condition By Queue' setting. Here’s how it works:
Queue Identification: The Queue is identified in the API call. This allows the Watch to track activity for each queue individually.
Individual Tracking: The Watch monitors each queue’s check-ins and tracks its progress. The check-in requirements are applied to each queue individually.
Watch Activation/Deactivation:
- The Watch is activated either directly or when a queue for the watch is activated.
- The overall Watch remains active as long as at least one queue is active.
- Individual queues handle their own deactivation when they've completed their assigned tasks for the current iteration.
- The Watch deactivates only after all active queues have deactivated.
Dynamic Tracking: The Watch dynamically adapts to situations where only a subset of queues is active. It focuses on tracking only those queues involved in the current iteration, whether they've provided check-ins or were explicitly activated.
Check-in Application:
- The Watch applies check-in requirements to each active queue in the Watch.
- For example, in a Cascading Watch with a 5-minute period: if Queue 1 checks in, the next expected check-in is 5 minutes later for Queue 1 only. If Queue 2 does not check in within its own 5-minute period, Queue 2 transitions to an Error State.
- This Watch also supports multi-machine workflows for each Queue.
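In contrast to the shared deadline of a multi-machine Watch, Condition By Queue can be pictured as one deadline per active queue. The sketch below is illustrative only: `QueueConditionedWatch` and its methods are hypothetical names, it uses a Cascading Period, and it leaves out the per-queue multi-machine support mentioned above.

```python
from datetime import datetime, timedelta
from typing import Dict, List


class QueueConditionedWatch:
    """Cascading Watch that keeps a separate deadline for each queue."""

    def __init__(self, period: timedelta) -> None:
        self.period = period
        self.deadlines: Dict[str, datetime] = {}  # one deadline per active queue

    def check_in(self, queue: str, at: datetime) -> None:
        # A check-in only moves the deadline of the queue that sent it.
        self.deadlines[queue] = at + self.period

    def deactivate(self, queue: str) -> None:
        self.deadlines.pop(queue, None)

    @property
    def is_active(self) -> bool:
        # The Watch stays active while at least one queue is active.
        return bool(self.deadlines)

    def overdue_queues(self, now: datetime) -> List[str]:
        """Queues whose own deadline has passed at `now`."""
        return sorted(q for q, d in self.deadlines.items() if now > d)


watch = QueueConditionedWatch(timedelta(minutes=5))
watch.check_in("queue-1", datetime(2024, 1, 1, 9, 0))
watch.check_in("queue-2", datetime(2024, 1, 1, 9, 0))
watch.check_in("queue-1", datetime(2024, 1, 1, 9, 4))
print(watch.overdue_queues(datetime(2024, 1, 1, 9, 6)))  # ['queue-2']
```

Queue 1's second check-in at 9:04 extends only its own deadline to 9:09, so at 9:06 only Queue 2, still due by 9:05, is reported as overdue.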