GenevaDrive provides durable, multi-step workflows for Rails applications. The graph of steps and the code that executes each step live in separate universes — the structure is defined at class load time, the behavior runs later via ActiveJob. GenevaDrive delivers on that architecture.
Imagine you need to onboard a new user. You send a welcome email, wait three days, send a reminder if they haven't completed setup, then schedule a follow-up call. In a traditional Rails application this raises several challenges: where to persist progress between steps, how to schedule the delayed work, and how to avoid repeating a step that already ran.
Instead of pretending that a function can be suspended, serialized, and resumed — which no mainstream runtime actually supports — GenevaDrive models workflows as DAGs (directed acyclic graphs). The code driving the DAG and the code driving each node are strictly separate and run in different domains.
A GenevaDrive workflow is a DAG with a single permitted input and a single permitted output per node. It ditches the illusion of a marshalable VM universe in favor of clarity and cohesion with the host environment — a UNIX system running Ruby running Rails — deliberately choosing explicitness over the pretense of magic.
class OrderFulfillmentWorkflow < GenevaDrive::Workflow  # DAG context — definition time
  cancel_if do                                          # DAG context — definition time
    hero.canceled?                                      # Node context — execution time
  end                                                   # DAG context — definition time

  step :reserve_inventory do                            # DAG context — definition time
    hero.line_items.each do |item|                      # Node context — execution time
      Inventory.decrement!(item.sku, item.quantity)     # Node context — execution time
    end                                                 # Node context — execution time
  end                                                   # DAG context — definition time

  step :charge_payment, wait: 1.hour do                 # DAG context — definition time
    PaymentGateway.capture(hero.payment_intent_id)      # Node context — execution time
  end                                                   # DAG context — definition time

  step :ship_order do                                   # DAG context — definition time
    Fulfillment.create_shipment(hero)                   # Node context — execution time
    OrderMailer.shipped(hero).deliver_later             # Node context — execution time
  end                                                   # DAG context — definition time
end                                                     # DAG context — definition time
This separation is the entire point. The DAG context runs once, when Ruby loads the class. It defines the structure: which steps exist, in what order, with what wait times and preconditions. The node context runs later — possibly much later, possibly on a different machine — when ActiveJob executes the step. You can see at a glance which code is "architecture" and which code is "behavior." There is no ambiguity about what persists and what doesn't, because the boundary is syntactic: everything inside a do...end block is node context; everything outside is DAG context. Instance variables set in node context will not exist in the next node. Database writes in node context will persist. The framework does not pretend otherwise.
GenevaDrive addresses these challenges with a small set of guarantees: workflow state lives in your database, steps run through ActiveJob, and every attempt is recorded for idempotency. To install:
bundle add geneva_drive
bin/rails generate geneva_drive:install
bin/rails db:migrate
The generator creates the migrations required for geneva_drive to operate; run all of them. When updating the gem, rerun the generator to pick up any new migrations, as migrations get added as the gem evolves.
GenevaDrive uses a two-table design:
- `geneva_drive_workflows` — The workflow records. Each row represents one workflow instance with its current state, hero association, and progress tracking.
- `geneva_drive_step_executions` — The idempotency keys. Each row represents one attempt to execute a step, with timing, outcome, and error information.

This separation keeps the workflows table clean while maintaining a complete audit trail in step executions.
If your application uses UUID primary keys, the migrations will detect this and also use UUIDs for the foreign keys and the primary keys of the geneva_drive resources.
Note that you don't want to mix integer IDs and UUIDs in the same application. geneva_drive uses a polymorphic relation for the hero of the workflow, which has a typed hero_id column. Every model used as a hero should therefore share the same primary key type — all UUID or all bigint, not a mix of the two.
[!WARNING] If you are using UUIDs, we also strongly recommend adopting a lexicographically ordered UUID flavour (such as UUIDv7), at least for the geneva_drive tables. You will want your primary keys to be pre-sorted for using the admin effectively, as well as for efficient `find_each` usage with the geneva_drive tables.
A Workflow is an ActiveRecord model representing a durable process. You define a workflow by subclassing GenevaDrive::Workflow and declaring steps. Each workflow instance is tied to a single record — the hero — that the workflow operates on.
class SubscriptionRenewalWorkflow < GenevaDrive::Workflow
step :send_renewal_notice do
RenewalMailer.notice(hero).deliver_later
end
step :charge_payment, wait: 3.days do
PaymentGateway.charge(hero.payment_method, hero.renewal_amount)
end
step :activate_new_period do
hero.subscription.extend!(1.month)
ConfirmationMailer.renewed(hero).deliver_later
end
end
When you create a workflow, GenevaDrive immediately schedules the first step:
SubscriptionRenewalWorkflow.create!(hero: user)
which produces the step execution job in your ActiveJob queue. Your worker will then pick up that job and perform the step on the already-persisted Workflow model. For a complete example showing workflows in action, see the User Onboarding Workflow in the appendix.
Steps are units of work executed sequentially. Each step can optionally specify a wait time before execution. GenevaDrive runs steps one at a time, in order, with the database enforcing that no two steps ever execute simultaneously for the same workflow.
class DocumentProcessingWorkflow < GenevaDrive::Workflow
step :extract_text do
hero.update!(extracted_text: TextExtractor.extract(hero.file))
end
step :analyze_content, wait: 5.minutes do
# Give the extraction time to settle in search indexes
hero.update!(analysis: ContentAnalyzer.analyze(hero.extracted_text))
end
step :generate_summary, wait: 10.minutes do
hero.update!(summary: Summarizer.generate(hero.analysis))
end
end
The hero is the record the workflow operates on. We call it "hero" rather than "target" or "subject" because it emphasizes that the workflow exists to serve this record — it's the protagonist of the story.
PaymentWorkflow.create!(hero: payment)
Inside any step, you access the hero directly:
step :process do
hero.mark_processing!
hero.account.notify_payment_started!
end
The hero can be any ActiveRecord model. Choose the most specific record that represents what the workflow is about. If you're processing an invoice, the hero should be the Invoice, not the User who owns it.
Step executions are the idempotency mechanism. Each attempt to run a step creates a StepExecution record. This record serves multiple purposes:
workflow.execution_history.each do |exec|
puts "#{exec.step_name}: #{exec.state} (#{exec.outcome})"
puts " Started: #{exec.started_at}"
puts " Completed: #{exec.completed_at}"
puts " Error: #{exec.error_message}" if exec.failed?
end
Most steps have explicit names that describe what they do:
class AccountVerificationWorkflow < GenevaDrive::Workflow
step :verify_email do
EmailVerifier.send_code(hero)
end
step :verify_phone do
PhoneVerifier.send_code(hero)
end
step :verify_identity do
IdentityVerifier.request_documents(hero)
end
end
When step names don't add clarity — typically in polling or retry scenarios — you can omit the name:
class StatusPollingWorkflow < GenevaDrive::Workflow
step { check_status! }
step(wait: 1.minute) { check_status! }
step(wait: 5.minutes) { check_status! }
step :mark_timeout do
hero.update!(status: :timed_out)
end
private
def check_status!
finished! if hero.external_status == "complete"
end
end
GenevaDrive assigns auto-generated names like step_1, step_2, etc.
Ruby executes class body code at load time. This means you can use loops to generate steps — particularly useful for polling patterns with staggered intervals:
class WebhookDeliveryWorkflow < GenevaDrive::Workflow
# Immediate first attempt
step { attempt_delivery! }
# Retry every 30 seconds for 2 minutes
4.times do
step(wait: 30.seconds) { attempt_delivery! }
end
# Then every 5 minutes for 30 minutes
6.times do
step(wait: 5.minutes) { attempt_delivery! }
end
# Then hourly for 6 hours
6.times do
step(wait: 1.hour) { attempt_delivery! }
end
# Final step: mark as failed
step :mark_undeliverable do
hero.update!(delivery_status: :failed)
WebhookFailureNotifier.notify(hero)
end
private
def attempt_delivery!
response = HttpClient.post(hero.endpoint_url, hero.payload)
if response.success?
hero.update!(delivery_status: :delivered, delivered_at: Time.current)
finished!
end
# Otherwise, fall through to next step
end
end
The loop bodies execute once when Ruby loads the class. Each iteration adds a new step definition. When a workflow instance runs, it executes those pre-defined steps in order.
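To see why this works, here is a minimal plain-Ruby sketch of a class-body step DSL — an illustration, not GenevaDrive source (`MiniWorkflow` and its internals are hypothetical). Each `step` call at load time appends a definition without running the block, so a loop produces one step per iteration, and unnamed steps receive positional names:

```ruby
# Minimal sketch of a class-body step DSL (illustrative, not GenevaDrive source).
class MiniWorkflow
  def self.steps
    @steps ||= [] # each subclass gets its own list
  end

  # Runs at class load time: records the step, does not execute the block.
  def self.step(name = nil, wait: 0, &block)
    steps << { name: name || "step_#{steps.size + 1}", wait: wait, block: block }
  end
end

class PollingWorkflow < MiniWorkflow
  step { check! }                       # step_1, immediate
  3.times { step(wait: 30) { check! } } # step_2..step_4
  step(:mark_timeout) { give_up! }
end

PollingWorkflow.steps.map { |s| s[:name] }
# => ["step_1", "step_2", "step_3", "step_4", :mark_timeout]
```

The blocks reference methods that are never called at load time, which is exactly the point: structure is recorded immediately, behavior runs later.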
For complex steps, you can define instance methods and reference them with step def:
class DataExportWorkflow < GenevaDrive::Workflow
step def gather_records
hero.update!(export_data: hero.exportable_records.to_json)
end
step def write_to_storage
Storage.write(hero.export_path, hero.export_data)
hero.update!(exported_at: Time.current)
end
step def notify_user
ExportMailer.complete(hero).deliver_later
end
end
`before_step:` and `after_step:`

Steps normally execute in definition order. You can override this by specifying where a step should be inserted relative to another:
class ComplianceWorkflow < GenevaDrive::Workflow
step :collect_data do
hero.update!(data: DataCollector.gather(hero))
end
step :submit_report do
ComplianceApi.submit(hero.data)
end
# Insert audit check before submission
step :audit_check, before_step: :submit_report do
raise "Data incomplete" unless hero.data_complete?
AuditLog.record(hero, :pre_submission)
end
end
The referenced step must already be defined — you can only insert before or after steps that appear earlier in the class body.
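The splice semantics can be pictured with a plain-Ruby sketch — an illustration of the rule, not GenevaDrive internals (`insert_step` is a hypothetical helper). Insertion looks up the referenced step in the list built so far, which is why that step must already exist:

```ruby
# Illustrative only: how before_step:/after_step: splice into a step list.
def insert_step(steps, name, before_step: nil, after_step: nil)
  if before_step
    idx = steps.index(before_step) or raise ArgumentError, "unknown step #{before_step}"
    steps.insert(idx, name)
  elsif after_step
    idx = steps.index(after_step) or raise ArgumentError, "unknown step #{after_step}"
    steps.insert(idx + 1, name)
  else
    steps << name # default: append in definition order
  end
  steps
end

steps = [:collect_data, :submit_report]
insert_step(steps, :audit_check, before_step: :submit_report)
# => [:collect_data, :audit_check, :submit_report]
```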
`skip_if:`

You can declare conditions that cause a step to be skipped without entering the step body:
class NotificationWorkflow < GenevaDrive::Workflow
step :send_email, skip_if: -> { hero.email_unsubscribed? } do
NotificationMailer.notify(hero).deliver_later
end
step :send_sms, skip_if: :sms_disabled? do
SmsService.notify(hero)
end
private
def sms_disabled?
!hero.sms_enabled? || hero.phone.blank?
end
end
The skip_if option accepts a lambda, a symbol (method name), or a boolean. The condition is evaluated before the step executes.
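A sketch of how such a condition might be normalized — an assumption about internals, not GenevaDrive source (`skip_step?` and `FakeWorkflow` are hypothetical names):

```ruby
# Illustrative normalization of a skip_if value against a workflow instance.
def skip_step?(condition, workflow)
  case condition
  when Proc   then workflow.instance_exec(&condition) # lambda runs in workflow context
  when Symbol then workflow.send(condition)           # symbol names a workflow method
  else !!condition                                    # plain boolean (or truthy value)
  end
end

class FakeWorkflow
  def sms_disabled?
    true
  end
end

w = FakeWorkflow.new
skip_step?(-> { sms_disabled? }, w) # => true
skip_step?(:sms_disabled?, w)       # => true
skip_step?(false, w)                # => false
```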
For a complete example showing conditional steps in context, see the User Onboarding Workflow in the appendix.
`cancel_if`

When certain conditions should cancel the entire workflow regardless of which step is running, use cancel_if:
class EngagementWorkflow < GenevaDrive::Workflow
cancel_if { hero.deactivated? }
cancel_if { hero.unsubscribed? }
step :send_week_1_email do
EngagementMailer.week_1(hero).deliver_later
end
step :send_week_2_email, wait: 7.days do
EngagementMailer.week_2(hero).deliver_later
end
step :send_week_4_email, wait: 14.days do
EngagementMailer.week_4(hero).deliver_later
end
end
GenevaDrive evaluates cancel_if conditions before every step. If any condition returns true, the workflow cancels immediately.
GenevaDrive provides five flow control methods that let you change workflow behavior from inside a step:
- `cancel!` — Stop the workflow, mark it canceled
- `pause!` — Stop the workflow, await manual intervention
- `reattempt!(wait:)` — Retry the current step
- `skip!` — Skip the current step, proceed to the next
- `finished!` — Complete the workflow early

These methods use Ruby's throw/catch mechanism to interrupt step execution cleanly. When you call one of these methods, execution immediately stops and GenevaDrive updates the workflow state accordingly.
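The throw/catch mechanics can be sketched in a few lines of plain Ruby (illustrative, not GenevaDrive source — `run_step` and the `:geneva_flow` tag are invented for the example):

```ruby
# A flow-control "method" throws a symbol; the step runner catches it
# outside the block, so execution stops immediately and cleanly.
def skip!
  throw :geneva_flow, :skipped
end

def run_step(&block)
  catch(:geneva_flow) do
    block.call
    :completed # reached only if the block didn't throw
  end
end

result = run_step do
  skip!
  raise "never reached" # execution stopped at skip!
end
result # => :skipped
```

Unlike raising an exception, `throw` carries no error semantics — it is plain non-local control flow, which is why a step interrupted this way is not treated as failed.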
For a complete example showing flow control in a realistic scenario, see the Payment Processing Workflow in the appendix.
Use cancel! when conditions make the workflow irrelevant:
class TrialConversionWorkflow < GenevaDrive::Workflow
step :send_trial_reminder, wait: 7.days do
cancel! if hero.converted_to_paid?
TrialMailer.reminder(hero).deliver_later
end
step :send_final_reminder, wait: 3.days do
cancel! if hero.converted_to_paid?
TrialMailer.final_reminder(hero).deliver_later
end
step :expire_trial do
hero.expire_trial! unless hero.converted_to_paid?
end
end
Use pause! when a human needs to review something before the workflow continues:
class FraudReviewWorkflow < GenevaDrive::Workflow
step :automated_check do
risk_score = FraudDetector.score(hero)
hero.update!(risk_score: risk_score)
if risk_score > 80
hero.flag_for_review!
pause!
end
end
step :process_transaction do
TransactionProcessor.process!(hero)
end
end
When a workflow is paused, it stays paused until you explicitly resume it:
workflow = FraudReviewWorkflow.find(id)
workflow.resume! # Re-enqueues the scheduled step
When you call pause! externally on a workflow that's waiting for a scheduled step, GenevaDrive preserves the scheduled step execution rather than canceling it. This provides better timeline visibility — you can see that a step was scheduled, became overdue during the pause period, and when it eventually ran.
Pause behavior: the workflow moves from `ready` to `paused`; any scheduled step execution stays in the `scheduled` state.

Resume behavior: the workflow moves from `paused` back to `ready`; an overdue scheduled step runs immediately.

# Timeline example:
# T+0h: step_one completes, step_two scheduled for T+2h
# T+1h: pause! called (step_two still scheduled for T+2h)
# T+3h: resume! called (step_two is overdue, runs immediately)
workflow = WaitingWorkflow.create!(hero: user)
perform_next_step(workflow) # step_one runs, step_two scheduled
# Later...
workflow.pause! # step_two stays scheduled
workflow.step_executions.last.state # => "scheduled"
# Much later (after scheduled time passed)...
workflow.resume! # step_two re-enqueued to run now
This behavior means no scheduled work is lost across a pause, and the timeline records exactly when a step became overdue.
Use reattempt! to retry the current step, optionally after a delay:
class ExternalApiWorkflow < GenevaDrive::Workflow
step :sync_to_crm do
result = CrmApi.sync(hero)
if result.rate_limited?
reattempt!(wait: result.retry_after)
end
hero.update!(crm_synced_at: Time.current)
end
end
Each reattempt creates a new step execution record, maintaining the full history of attempts.
Use skip! to bypass a step and move to the next one:
class OnboardingWorkflow < GenevaDrive::Workflow
step :request_phone_verification do
skip! if hero.phone_verified?
SmsService.send_verification_code(hero)
end
step :wait_for_verification, wait: 5.minutes do
skip! if hero.phone_verified?
reattempt!(wait: 1.minute)
end
step :complete_onboarding do
hero.complete_onboarding!
end
end
Use finished! to complete the workflow before reaching the last step:
class OrderFulfillmentWorkflow < GenevaDrive::Workflow
step :check_delivery_status do
if hero.delivered?
hero.complete!
finished!
end
reattempt!(wait: 1.hour)
end
step :escalate_delayed_delivery do
SupportTeam.escalate(hero)
end
end
By default, unhandled exceptions pause the workflow. The step execution records the error message and backtrace, and the workflow waits for manual intervention:
step :risky_operation do
ExternalService.call!(hero) # If this raises, workflow pauses
end
Use on_exception: on a step to control what happens when that step raises:
class ResilientApiWorkflow < GenevaDrive::Workflow
step :call_external_api, on_exception: :reattempt! do
ExternalApi.call(hero)
end
end
Available actions:
- `:pause!` — (default) Pause the workflow for manual review
- `:cancel!` — Cancel the workflow
- `:reattempt!` — Retry the step (creates a new step execution)
- `:skip!` — Skip the step and continue to the next

Declare on_exception at the class level to set a default for all steps. Steps without an explicit on_exception: override inherit this policy:
class ResilientWorkflow < GenevaDrive::Workflow
on_exception :reattempt!, max_reattempts: 5
step :fetch_data do
ExternalApi.fetch(hero) # Inherits reattempt! policy
end
step :send_email, on_exception: :skip! do
Mailer.deliver(hero) # Step-level overrides class-level
end
end
You can declare multiple class-level policies to handle different exception types:
class OAuthWorkflow < GenevaDrive::Workflow
# Specific: only matches OAuth2::Error and its subclasses
on_exception OAuth2::Error, action: :reattempt!, wait: 15.seconds
# Specific: cancel on permanent client errors
on_exception Google::Apis::ClientError, action: :cancel!
# Blanket: everything else gets reattempted
on_exception :reattempt!, max_reattempts: 3
step :sync do
GoogleCalendar.sync(hero)
end
end
When a step raises an exception, GenevaDrive resolves the policy in this order:
1. A step-level `on_exception:` with a single policy (symbol, ExceptionPolicy, or Proc) is used unconditionally.
2. A step-level array of policies is walked as described under Composable Step-Level Policies.
3. Otherwise, class-level policies apply, falling back to the default of `:pause!`.

When using :reattempt!, limit consecutive reattempts with max_reattempts:. This prevents infinite retry loops when an error is persistent:
class ExternalApiWorkflow < GenevaDrive::Workflow
step :sync_to_crm, on_exception: :reattempt!, max_reattempts: 5 do
CrmApi.sync(hero)
end
end
The max_reattempts: option:
- Applies only when `:reattempt!` is used — a safety net against infinite loops
- Can be set to `nil` to disable the limit entirely
- Does not count manual `reattempt!` calls — only automatic exception handling respects this limit

When the limit is exceeded, GenevaDrive logs a warning and pauses the workflow by default. Use terminal_action: to change this behavior:
# Cancel the workflow instead of pausing when reattempts are exhausted
step :flaky_api, on_exception: :reattempt!, max_reattempts: 10, terminal_action: :cancel! do
FlakyService.call(hero)
end
terminal_action: accepts :pause! (default), :cancel!, or :skip!.
For policies shared across multiple workflows, create an ExceptionPolicy object:
# Store in a constant or module for reuse
TRANSIENT_RETRY = GenevaDrive::ExceptionPolicy.new(
:reattempt!,
wait: 30.seconds,
max_reattempts: 5,
terminal_action: :cancel!
)
class OrderWorkflow < GenevaDrive::Workflow
step :charge_card, on_exception: TRANSIENT_RETRY do
PaymentGateway.charge(hero)
end
end
class ShippingWorkflow < GenevaDrive::Workflow
step :book_courier, on_exception: TRANSIENT_RETRY do
CourierApi.book(hero)
end
end
Use the matching: keyword to create policies that target specific exception classes. This is especially useful when composing multiple policies into an array (see Composable Step-Level Policies):
# Targets a specific exception class
TIMEOUT_RETRY = GenevaDrive::ExceptionPolicy.new(
:reattempt!,
matching: Net::OpenTimeout,
wait: 10.seconds,
max_reattempts: 5
)
# Targets multiple exception classes
OAUTH_CANCEL = GenevaDrive::ExceptionPolicy.new(
:cancel!,
matching: [OAuth2::Error, "Faraday::ConnectionFailed"]
)
matching: accepts a class, a string (resolved lazily via safe_constantize — the class doesn't need to be loaded at definition time), or an array of either. A policy without matching: is a blanket policy that matches any exception.
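A sketch of these matching rules in plain Ruby — `matches_exception?` is a hypothetical name, and `Object.const_get` stands in for Rails' `safe_constantize`:

```ruby
# Illustrative matcher resolution: class, lazily resolved string, or array.
def matches_exception?(matcher, error)
  case matcher
  when Array
    matcher.any? { |m| matches_exception?(m, error) }
  when String
    klass = begin
      Object.const_get(matcher) # resolved only when an exception is checked
    rescue NameError
      nil                       # unknown constant: matcher simply never matches
    end
    klass ? error.is_a?(klass) : false
  when Module
    error.is_a?(matcher)        # covers the class and its subclasses
  else
    false
  end
end

matches_exception?("IOError", IOError.new)                     # => true
matches_exception?([ArgumentError, "TypeError"], TypeError.new) # => true
matches_exception?(StandardError, IOError.new)                  # => true (subclass)
```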
You can also pass an ExceptionPolicy to the class-level on_exception. Exception class filters are set on the policy after creation:
class ApiWorkflow < GenevaDrive::Workflow
on_exception Timeout::Error, action: :reattempt!, wait: 10.seconds, max_reattempts: 3
on_exception :cancel! # Everything else cancels
step :call_api do
SlowApi.call(hero)
end
end
Sometimes a single action isn't enough — you want transient errors reattempted, fatal errors to cancel, and everything else to skip. Pass an array of ExceptionPolicy objects to on_exception: to route different exception types to different actions on a single step:
class CalendarSyncWorkflow < GenevaDrive::Workflow
step :sync_events, on_exception: [
GenevaDrive::ExceptionPolicy.new(:reattempt!, matching: Net::OpenTimeout, wait: 10.seconds, max_reattempts: 5),
GenevaDrive::ExceptionPolicy.new(:cancel!, matching: OAuth2::Error),
GenevaDrive::ExceptionPolicy.new(:skip!) # blanket fallback for anything else
] do
GoogleCalendar.sync(hero)
end
step :send_confirmation do
CalendarMailer.synced(hero).deliver_later
end
end
Each policy in the array can use the matching: keyword to target specific exception classes. When an exception is raised, GenevaDrive walks the array in two passes:
1. Specific policies (those with `matching:`) are checked first, in definition order. The first match wins.
2. A blanket policy (one without `matching:`) acts as a catchall fallback.
3. If no policy matches, the class-level policy applies (default `:pause!`).

This means you can combine a step-level array with a class-level policy for layered exception handling:
class RobustApiWorkflow < GenevaDrive::Workflow
on_exception :pause! # class-level fallback
step :call_api, on_exception: [
GenevaDrive::ExceptionPolicy.new(:reattempt!, matching: Timeout::Error, max_reattempts: 3)
# No blanket fallback — unmatched exceptions fall through to class-level :pause!
] do
SlowApi.call(hero)
end
end
When multiple policies in the array have max_reattempts:, GenevaDrive enforces the minimum across all of them as a global cap on consecutive reattempts. This prevents runaway retries when different exception types alternate:
step :sync, on_exception: [
GenevaDrive::ExceptionPolicy.new(:reattempt!, matching: Timeout::Error, max_reattempts: 10),
GenevaDrive::ExceptionPolicy.new(:reattempt!, matching: RateLimitError, max_reattempts: 3, terminal_action: :cancel!)
] do
ExternalApi.sync(hero)
end
Here the global cap is 3. Even if every failure is a Timeout::Error (whose individual policy allows 10), the step stops reattempting after 3 consecutive failures. When the cap is hit, the terminal_action of the matched policy applies — so if the 4th failure is a Timeout::Error, the workflow pauses (the default), but if it's a RateLimitError, the workflow cancels.
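The "minimum wins" rule can be pictured with plain Ruby hashes standing in for policies (a sketch of the described behavior, not the library's code):

```ruby
policies = [
  { matching: "Timeout::Error", max_reattempts: 10 },
  { matching: "RateLimitError", max_reattempts: 3 }
]

# The global cap is the smallest max_reattempts across all policies;
# policies without a limit (nil) are ignored by filter_map.
global_cap = policies.filter_map { |p| p[:max_reattempts] }.min
global_cap # => 3
```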
[!TIP] Store reusable policy arrays as frozen constants so you can share them across workflows:
module ExceptionPolicies
  EXTERNAL_API = [
    GenevaDrive::ExceptionPolicy.new(:reattempt!, matching: Net::OpenTimeout, max_reattempts: 5),
    GenevaDrive::ExceptionPolicy.new(:cancel!, matching: OAuth2::Error),
    GenevaDrive::ExceptionPolicy.new(:pause!) # blanket fallback
  ].freeze
end

class SyncWorkflow < GenevaDrive::Workflow
  step :sync, on_exception: ExceptionPolicies::EXTERNAL_API do
    ExternalApi.sync(hero)
  end
end
For full control, pass a block to on_exception. The block receives the exception and runs in the workflow context — you must call a flow control method (reattempt!, cancel!, pause!, or skip!):
class SmartRetryWorkflow < GenevaDrive::Workflow
on_exception RateLimitError do |error|
reattempt! wait: error.retry_after.seconds
end
on_exception do |error|
if error.message.include?("temporary")
reattempt!
else
cancel!
end
end
step :call_api do
ExternalApi.call(hero)
end
end
If the block returns without calling a flow control method, the workflow pauses.
Imperative handlers cannot be combined with max_reattempts:, wait:, or terminal_action: — manage that logic inside the block.
When a step raises an exception, GenevaDrive reports it to your error tracker via Rails.error.report. This is the right default — you want visibility into failures. But some exceptions are expected and handled: rate limits, transient timeouts, throttling responses. Reporting these on every reattempt floods your error tracker with noise and obscures the errors that actually need attention.
The report: option controls when Rails.error.report is called. It accepts three values:
- `:always` — (default) Report every exception, regardless of what the policy does with it. This is the safest choice and preserves the behavior you're used to.
- `:never` — Never report the exception. Use this for errors that are fully expected and handled — rate limits, throttles, circuit breaker trips. The exception still triggers the policy action (reattempt, skip, etc.), but your error tracker stays clean.
- `:terminal_only` — Suppress reports while the step is being reattempted, but report when reattempts are exhausted and the terminal_action fires. This is the sweet spot for transient errors: you don't care about individual retries, but you want to know when the retries give up.

Use report: on both class-level on_exception and on ExceptionPolicy objects:
class CalendarSyncWorkflow < GenevaDrive::Workflow
# Rate limits are expected — never report them
on_exception Pecorino::Throttle::Throttled, report: :never do |error|
reattempt!(wait: error.retry_after.clamp(10, 600).seconds)
end
# Transient timeouts: only report when we give up
on_exception Net::OpenTimeout,
action: :reattempt!,
wait: 10.seconds,
max_reattempts: 5,
terminal_action: :cancel!,
report: :terminal_only
step :sync_events do
GoogleCalendar.sync(hero)
end
end
The same option works on ExceptionPolicy objects for reusable policies and composable arrays:
RATE_LIMIT_POLICY = GenevaDrive::ExceptionPolicy.new(
:reattempt!,
matching: RateLimitError,
wait: 30.seconds,
max_reattempts: 10,
terminal_action: :pause!,
report: :terminal_only
)
step :call_api, on_exception: [
RATE_LIMIT_POLICY,
GenevaDrive::ExceptionPolicy.new(:reattempt!, matching: Timeout::Error, report: :never, max_reattempts: 3),
GenevaDrive::ExceptionPolicy.new(:pause!) # blanket fallback, reports by default
] do
ExternalApi.call(hero)
end
[!IMPORTANT] The `report:` option only controls `Rails.error.report`. The exception is still re-raised after the executor commits its state transitions — your background job framework (Sidekiq, SolidQueue, etc.) will see it. If your error tracker also hooks into the job framework's error handler, you may need to configure that separately.
For the most granular control, handle exceptions directly within the step using standard Ruby rescue:
class PaymentWorkflow < GenevaDrive::Workflow
step def initiate_payment
PaymentGateway.charge(hero)
rescue PaymentGateway::RateLimited => e
reattempt!(wait: e.retry_after)
rescue PaymentGateway::CardDeclined
hero.notify_card_declined!
cancel!
rescue PaymentGateway::ServiceDown
pause! # Wait for manual intervention
end
end
This bypasses the exception policy entirely — the step catches the error before GenevaDrive sees it.
For a complete example showing granular exception handling, see the Payment Processing Workflow in the appendix.
Find and resume paused workflows from the console:
GenevaDrive::Workflow.paused.each do |workflow|
puts "#{workflow.class.name} ##{workflow.id}: next step is #{workflow.next_step_name}"
# Check if there's a scheduled execution (externally paused)
if (scheduled = workflow.current_execution)
overdue = scheduled.scheduled_for < Time.current
puts " Scheduled step: #{scheduled.step_name} (#{overdue ? 'overdue' : 'future'})"
puts " Originally scheduled for: #{scheduled.scheduled_for}"
end
# Check if there's a failed execution (paused due to exception)
if (failed = workflow.step_executions.failed.last)
puts " Failed step: #{failed.step_name}"
puts " Error: #{failed.error_message}"
end
end
# Resume a specific workflow
workflow = GenevaDrive::Workflow.find(id)
workflow.resume! # Re-enqueues existing scheduled step or creates new one
stateDiagram-v2
[*] --> ready: create
ready --> performing: step starts
performing --> ready: step completes / reattempt
performing --> finished: last step completes / finished!
performing --> canceled: cancel!
performing --> paused: pause! / exception
paused --> ready: resume!
A workflow begins in ready state. When a step starts executing, it transitions to performing. Upon step completion, it returns to ready (unless it was the last step, in which case it transitions to finished).
Step executions have their own state machine:
| State | Meaning |
|---|---|
| `scheduled` | Waiting to run |
| `in_progress` | Currently executing |
| `completed` | Finished successfully |
| `failed` | Exception occurred |
| `canceled` | Canceled before execution |
| `skipped` | Skipped via skip_if or skip! |
A GenevaDrive workflow should be a one-off process with a clear beginning and end — not a long-running loop that retries forever. The best workflows are created, execute their steps over hours or days, reach finished (or canceled), and are eventually cleaned up by the housekeeping job. If you need a recurring process for a hero, create a new workflow on a schedule rather than building an immortal one.
Code changes between deploys. A workflow definition is a Ruby class. When you deploy new code, the class may have different steps, different logic, different wait times. A workflow that was created before the deploy still carries its original step pointer — it will execute the steps as they existed when it was defined. If you keep a single workflow alive for months, it accumulates an increasingly stale relationship with your codebase. A fresh workflow always runs the latest code.
Step executions are append-only history. Each step attempt creates a StepExecution record. This history is the audit trail for what happened and when. A workflow that runs for two weeks and processes three steps has a concise, readable history. A workflow that has been alive for six months and has reattempted the same polling step 4,000 times has an audit trail that is effectively noise. The step execution table becomes a dumping ground rather than a useful record.
Terminal states tell you what succeeded and what didn't. GenevaDrive's state machine is designed around the assumption that workflows end. A finished workflow is one that completed all its steps successfully. A canceled workflow is one that was stopped deliberately. A paused workflow is one that needs attention. These states are meaningful precisely because they are final. A workflow that never finishes — because it is designed to loop forever — can never be finished, which means you lose the ability to distinguish "working as intended" from "stuck." Your monitoring and alerting becomes guesswork.
Instead of building a workflow that loops endlessly, create a cron job that periodically creates a new workflow for each eligible hero:
# app/jobs/create_billing_check_workflows_job.rb
class CreateBillingCheckWorkflowsJob < ApplicationJob
def perform
Account.active.find_each do |account|
# Only create if no ongoing workflow exists for this hero
next if BillingCheckWorkflow.ongoing.exists?(hero: account)
BillingCheckWorkflow.create!(hero: account)
end
end
end
Schedule this job with your cron adapter:
# With Solid Queue recurring tasks
billing_check:
class: CreateBillingCheckWorkflowsJob
schedule: every day at 6am
# With GoodJob cron (config/initializers/good_job.rb)
Rails.application.configure do
  config.good_job.cron = {
    billing_check: {
      cron: "0 6 * * *",
      class: "CreateBillingCheckWorkflowsJob"
    }
  }
end
Each invocation creates a fresh workflow that runs the current code, produces a clean execution history, and terminates with a clear outcome.
The immortal workflow (don't do this):
class BillingCheckWorkflow < GenevaDrive::Workflow
step :check_billing do
if hero.payment_overdue?
BillingMailer.overdue(hero).deliver_later
end
reattempt!(wait: 1.day) # Loop forever
end
end
This workflow never finishes. You cannot tell from its state whether it is working correctly or stuck. Its step execution history grows without bound. When you deploy new billing logic, existing workflows keep running the old code path. And if you need to change the check interval from daily to hourly, you must somehow update every running workflow instance.
The one-off workflow (do this instead):
class BillingCheckWorkflow < GenevaDrive::Workflow
step :check_payment_status do
skip! unless hero.payment_overdue?
hero.update!(last_overdue_notice_at: Time.current)
end
step :send_overdue_notice, skip_if: -> { hero.last_overdue_notice_at.nil? } do
BillingMailer.overdue(hero).deliver_later
end
step :schedule_followup, skip_if: -> { hero.last_overdue_notice_at.nil? } do
hero.update!(followup_needed: true)
end
end
Each day, the cron job creates a new BillingCheckWorkflow for each account. It runs, does its work, finishes. Tomorrow's workflow will use tomorrow's code. The execution history for each workflow is three steps long. You can query BillingCheckWorkflow.where(hero: account).finished to see every successful check, and BillingCheckWorkflow.where(hero: account).paused to find the ones that hit problems.
You might wonder: if a cron job creates a new workflow every day, won't old workflows collide with new ones?
No. GenevaDrive maintains a partial unique index on (type, hero_type, hero_id) that only covers ongoing workflows — those not in finished or canceled state. This means:
- Only one BillingCheckWorkflow for a given account can be ready, performing, or paused at any time.
- Once a workflow reaches finished or canceled, it drops out of the index entirely.

This is the designed-for pattern. Yesterday's finished workflow and today's active workflow coexist in the database. The unique index prevents accidental duplicates (two active workflows for the same hero), while the housekeeping job eventually cleans up old finished workflows after your configured retention period.
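The effect of that partial index can be sketched in plain Ruby. The real constraint lives in the database; the names and the `ONGOING_STATES` list here are illustrative:

```ruby
# Uniqueness is checked only among ongoing workflows — finished and
# canceled rows are invisible to the constraint.
ONGOING_STATES = %w[ready performing paused].freeze

def violates_uniqueness?(existing_rows, candidate)
  return false unless ONGOING_STATES.include?(candidate[:state])
  existing_rows.any? do |row|
    ONGOING_STATES.include?(row[:state]) &&
      row.values_at(:type, :hero_type, :hero_id) ==
        candidate.values_at(:type, :hero_type, :hero_id)
  end
end

rows  = [{ type: "BillingCheckWorkflow", hero_type: "Account", hero_id: 42, state: "finished" }]
today = { type: "BillingCheckWorkflow", hero_type: "Account", hero_id: 42, state: "ready" }

violates_uniqueness?(rows, today)           # => false: yesterday's run finished
violates_uniqueness?(rows + [today], today) # => true: one ongoing workflow max
```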
# This works because yesterday's workflow already finished
account = Account.find(42)
BillingCheckWorkflow.where(hero: account).finished.count # => 30 (last month)
BillingCheckWorkflow.ongoing.where(hero: account).count # => 1 (today's)
The short-lived workflow pattern works with the library's design rather than against it. Workflows start, do their work, and end. The database tells you exactly which ones succeeded and which ones didn't. New code applies immediately. History stays readable. This is the intended way to use GenevaDrive.
GenevaDrive steps execute asynchronously via ActiveJob. This has important implications:
- Steps execute in background jobs, potentially in different processes on different machines, minutes or months apart.
- A `wait: 30.days` step means exactly what it says.

> [!WARNING]
> Instance variables do not persist between steps. Store data in the database.
# WRONG - @data won't exist in next step
step :fetch do
@data = ExternalApi.fetch(hero.external_id)
end
step :process do
process(@data) # @data is nil here!
end
# RIGHT - persist to the hero or another record
step :fetch do
hero.update!(external_data: ExternalApi.fetch(hero.external_id))
end
step :process do
process(hero.external_data)
end
You will almost never have the same self between steps. Treat each step as an independent unit that reads from and writes to the database.
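The point about `self` can be shown in plain Ruby, without any Rails machinery. The `StepRunner` class below is illustrative; GenevaDrive similarly builds a fresh workflow instance for each step execution:

```ruby
# Each step is performed on a freshly instantiated object, often in a
# different process entirely — so instance variables never carry over.
class StepRunner
  def fetch
    @data = { id: 1 } # lives only on this instance, in this process
  end

  def process
    @data # nil on a fresh instance; nothing carried it over
  end
end

same_run = StepRunner.new
same_run.fetch
same_run.process  # => {:id=>1} only because the SAME object ran both steps

later_job = StepRunner.new # how the next step actually runs
later_job.process # => nil
```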
Make the hero the specific business object being processed, not the user who owns it. This keeps workflows focused and allows multiple concurrent workflows for the same user.
# Good: Invoice is the hero
invoice = user.invoices.create!(amount: 100)
InvoiceWorkflow.create!(hero: invoice)
# Less good: User is the hero, but the workflow is really about an invoice
InvoiceWorkflow.create!(hero: user, invoice_id: invoice.id)
If you're processing a subscription renewal, the hero is the Subscription. If you're processing an order, the hero is the Order. The more specific your hero, the easier it is to reason about workflow state.
GenevaDrive uses STI (Single Table Inheritance) combined with a polymorphic hero association. This means a single hero can have multiple different workflow types associated with it, each representing a distinct process the hero is going through.
Define associations on your hero model to access each workflow type directly:
class User < ApplicationRecord
has_one :signup_workflow, as: :hero, class_name: "SignupWorkflow"
has_one :billing_workflow, as: :hero, class_name: "BillingCycleWorkflow"
has_one :referral_workflow, as: :hero, class_name: "ReferralProgramWorkflow"
end
Each association returns only workflows of that specific type. Rails automatically scopes queries by the STI type column when you specify class_name.
This pattern lets you query the state of each process independently:
user = User.find(id)
# Check signup progress
if user.signup_workflow&.finished?
# User completed onboarding
end
# Check billing state
if user.billing_workflow&.paused?
# Billing needs attention
end
# Start a new workflow only if one isn't already running
unless user.referral_workflow&.ongoing?
ReferralProgramWorkflow.create!(hero: user)
end
For workflows where you need access to historical records (finished or canceled), use has_many with a scope:
class User < ApplicationRecord
has_one :current_billing_workflow,
-> { ongoing },
as: :hero,
class_name: "BillingCycleWorkflow"
has_many :billing_workflows,
as: :hero,
class_name: "BillingCycleWorkflow"
end
If the hero is deleted while the workflow is running, GenevaDrive cancels the workflow automatically. This is the right default — in practice, the vast majority of workflows make no sense without their hero. If an admin manually deletes a user, the onboarding workflow for that user should not keep sending emails into the void.
If a step tries to access a nil hero, it raises, and the workflow pauses. This is also correct — it surfaces the problem to an operator rather than silently continuing with broken assumptions.
For the extremely rare case where a workflow must continue after the hero has been deleted from the database, the may_proceed_without_hero! escape hatch exists. You will almost never need it.
When a step completes, GenevaDrive creates a new StepExecution record and enqueues a PerformStepJob. The job runs after the transaction commits (using after_all_transactions_commit), ensuring the step execution record is visible to the job worker.
If a step specifies wait:, GenevaDrive passes the delay to ActiveJob's set(wait_until:). This means your queue adapter handles the scheduling — GenevaDrive doesn't implement its own timer.
We recommend using a queue adapter that co-commits with your database: SolidQueue, GoodJob, or Gouda, all of which store jobs in the same database as your application records.
Co-committing matters because GenevaDrive relies on transactional guarantees. When a step completes and schedules the next step, both the state change and the job enqueue should be atomic. With co-committing adapters, if the transaction rolls back, the job is never enqueued.
With non-transactional adapters (Sidekiq, Resque), there's a small window where the job is enqueued but the transaction hasn't committed. GenevaDrive handles this gracefully — the job will see the step execution in the wrong state and skip it — but you may see occasional log warnings.
When creating many workflows at once, you may want to batch the job inserts for efficiency. Libraries like BulkEnqueue or native adapter bulk methods can significantly reduce database round-trips. However, GenevaDrive's default behavior of deferring job enqueueing to after_all_transactions_commit interferes with bulk enqueueing:
1. `BulkEnqueue.in_bulk { }` starts and its capture buffer is set up.
2. Each workflow creation registers an `after_all_transactions_commit` callback (the job is not enqueued yet).
3. The bulk block ends with an empty buffer; only afterwards do the `after_all_transactions_commit` callbacks fire, enqueueing jobs one by one.

Use `GenevaDrive.with_inline_enqueue` to temporarily disable deferred enqueueing:
BulkEnqueue.in_bulk do
GenevaDrive.with_inline_enqueue do
drafts.each do |draft|
DeliverBriefWorkflow.create!(hero: draft)
end
end
end
Inside the block, jobs are enqueued immediately (inline) rather than being deferred to after_all_transactions_commit. This allows bulk enqueueing libraries to capture and batch the job inserts.
> [!WARNING]
> Only use `with_inline_enqueue` if you have a co-committing, database-backed ActiveJob adapter (SolidQueue, GoodJob, Gouda) running on the same database as your application. With these adapters, the job INSERT and workflow records are written in the same transaction, guaranteeing atomicity.
>
> If you use a non-transactional adapter (Redis-based Sidekiq, Resque, etc.), do not use this in production — jobs may be picked up before their associated records are committed, causing "record not found" errors.
The method is thread-safe and won't affect other concurrent requests.
Override the queue or priority for all steps in a workflow:
class HighPriorityWorkflow < GenevaDrive::Workflow
set_step_job_options queue: :critical, priority: 0
step :urgent_action do
UrgentService.process!(hero)
end
end
The options are passed directly to ActiveJob's set method.
GenevaDrive::HousekeepingJob performs two maintenance tasks:
1. Deleting finished and canceled workflows older than the configured retention period.
2. Detecting step executions that have been in the `in_progress` or `scheduled` state for too long, indicating a process crash or lost job.

Configure housekeeping thresholds in an initializer:
# config/initializers/geneva_drive.rb
GenevaDrive.delete_completed_workflows_after = 30.days
GenevaDrive.stuck_in_progress_threshold = 1.hour
GenevaDrive.stuck_scheduled_threshold = 15.minutes
GenevaDrive.stuck_recovery_action = :reattempt # or :cancel
- `delete_completed_workflows_after` — How long to keep finished/canceled workflows. Set to `nil` to disable cleanup.
- `stuck_in_progress_threshold` — How long a step can be `in_progress` before it's considered stuck.
- `stuck_scheduled_threshold` — How long past `scheduled_for` a step can be while still `scheduled` before it's considered stuck.
- `stuck_recovery_action` — What to do with stuck steps: `:reattempt` reschedules them, `:cancel` cancels the workflow.

Schedule the housekeeping job to run periodically:
# With GoodJob cron
GoodJob::Cron.schedule(
cron: "*/30 * * * *",
class: "GenevaDrive::HousekeepingJob"
)
# Or enqueue manually
GenevaDrive::HousekeepingJob.perform_later
Include GenevaDrive::TestHelpers in your test class:
class SignupWorkflowTest < ActiveSupport::TestCase
include GenevaDrive::TestHelpers
end
speedrun_workflow executes all pending steps synchronously, ignoring wait times:
test "subscription workflow sends emails and activates" do
user = users(:active_subscriber)
workflow = SubscriptionRenewalWorkflow.create!(hero: user)
speedrun_workflow(workflow)
assert workflow.finished?
assert user.subscription.reload.active?
end
perform_step_inline executes a specific step without running through the entire workflow:
test "payment step handles rate limiting" do
payment = payments(:pending)
workflow = PaymentWorkflow.create!(hero: payment)
PaymentGateway.stub(:charge, -> { raise PaymentGateway::RateLimited.new(retry_after: 60) }) do
perform_step_inline(workflow, :initiate_payment)
end
workflow.reload
assert workflow.ready?
assert_equal "initiate_payment", workflow.next_step_name
end
The test helpers provide convenience assertions:
test "skips email if user unsubscribed" do
user = users(:unsubscribed)
workflow = NotificationWorkflow.create!(hero: user)
speedrun_workflow(workflow)
assert_step_executed(workflow, :send_email, state: "skipped")
assert_workflow_state(workflow, :finished)
end
GenevaDrive defers job enqueueing to after_all_transactions_commit so that step execution records are visible to the job worker before it runs. This can cause problems with transactional tests on SQLite — the after_all_transactions_commit callback writes to the database after the inner transaction commits, but with SQLite's single-writer limitation this can conflict with the test transaction wrapper or other connections. Symptoms include SQLite3 errors in workflow tests while the rest of the test suite works fine.
By default, GenevaDrive detects Rails.env.test? and skips the deferral, enqueueing jobs immediately. This means transactional tests work out of the box with no extra setup.
If you need strict after-commit semantics in a specific test (for example, to verify the exact commit-then-enqueue ordering), you can re-enable deferral:
test "job is enqueued only after commit" do
GenevaDrive.enqueue_after_commit = true
# ... test that relies on real after_commit timing ...
ensure
GenevaDrive.enqueue_after_commit = false
end
If you are not using transactional tests at all (for example, you use DatabaseCleaner with truncation), you can set this globally in your initializer:
GenevaDrive.enqueue_after_commit = true
GenevaDrive uses tagged logging. Each log entry includes the workflow class, ID, and hero information:
[SubscriptionRenewalWorkflow id=123 hero_type=User hero_id=456] Scheduling next step "charge_payment" after "send_renewal_notice"
When a step executes, the log entries also include the step execution ID and step name:
[SubscriptionRenewalWorkflow id=123 hero_type=User hero_id=456] [execution_id=789 step_name=charge_payment] Step completed successfully
GenevaDrive emits three instrumentation events:
| Event | When | Payload |
|---|---|---|
| `precondition.geneva_drive` | Before step, during `cancel_if`/`skip_if` evaluation | `execution_id`, `workflow_id`, `workflow_class`, `step_name`, `outcome` |
| `step.geneva_drive` | During step execution | `execution_id`, `workflow_id`, `workflow_class`, `step_name`, `outcome`, `exception` |
| `finalize.geneva_drive` | After step, during state transitions | `execution_id`, `workflow_id`, `workflow_class`, `step_name`, `workflow_state`, `step_state` |
Subscribe to events for custom metrics or logging:
ActiveSupport::Notifications.subscribe("step.geneva_drive") do |event|
StatsD.timing("geneva_drive.step.duration", event.duration)
StatsD.increment("geneva_drive.step.#{event.payload[:outcome]}")
if event.payload[:exception]
Sentry.capture_exception(event.payload[:exception])
end
end
There is a rather popular approach to building durable execution systems based on the concept of "durable functions". Systems like Temporal, absurd, and Vercel Workflows take that concept quite far. It seems neat on the surface, yet it is deeply flawed.
The assumption made with those "durable functions" is that it is possible to pretend that you have a marshalable stack. For example, this section in a workflow function:
let step = 0;
while (step++ < 20) {
const { newMessages, finishReason } = await ctx.step("iteration", async () => {
return await singleStep(messages);
});
messages.push(...newMessages);
if (finishReason !== "tool-calls") {
break;
}
}
can only work if `step` gets marshaled and reinstated when the function gets resumed. Async generators (and Fibers in Ruby, and, in general, any system based on continuations or coroutines) allow suspension and resumption, but none allow proper serialization and revival. If you try to encode a durable function, consisting of multiple steps, as a suspendable and resumable workflow, you essentially have three ways to do it:

1. Replay: re-run the function from the start on every resume, substituting recorded results for the steps that already completed.
2. CPS-transform the code so that every suspension point becomes an explicit continuation you can persist, paying a permanent performance cost.
3. Control the entire runtime, Smalltalk-style, so that activation records are themselves serializable objects.
Most "workflow engines" do their utmost to maintain the guise of resumable functions while not providing them. The fact that you have to wait on a Fiber to receive an HTTP result is not very useful if the only program that can receive that result is the very process which has started that HTTP request.
The fundamental issue is that no mainstream runtime — V8, SpiderMonkey, YARV, the JVM, the CLR — provides the ability to serialize an executing call stack to bytes and revive it later. This is not an oversight. It is a deliberate engineering decision driven by hard constraints.
A call stack contains pointers: return addresses, references to heap objects, handles to file descriptors and sockets, pointers into native libraries. Serializing a pointer is meaningless — the memory address 0x7fff5fbff8c0 on one machine means nothing on another, or even on the same machine after a restart. To serialize a stack, you must either translate every pointer into a stable, location-independent identifier (and back again on revival), or control the entire object memory so that nothing a frame references can escape the serializer.
Modern runtimes chose speed over serializability. JavaScript engines like V8 perform aggressive JIT compilation that inlines functions, eliminates stack frames, and stores values in machine registers. The "stack" you think exists in your async function is a fiction maintained for debugging — the actual execution state is scattered across registers, hidden classes, inline caches, and optimized machine code that has no stable representation.
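Ruby's own `Marshal` makes the handle problem concrete: plain data round-trips through serialization, but an object wrapping an OS-level resource cannot even be dumped.

```ruby
# Values survive serialization; identity and OS-level handles do not.
record = { sku: "ABC-1", qty: 2 }
copy   = Marshal.load(Marshal.dump(record))

copy == record      # => true  (same value)
copy.equal?(record) # => false (a different object after revival)

begin
  Marshal.dump($stdout) # an IO wraps a file descriptor: not serializable
rescue TypeError => e
  e # => TypeError — Marshal refuses outright rather than lie
end
```

And `Marshal` only deals with passive data. A call stack is worse: it references live objects like these at every frame.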
True stack serialization is not impossible. It has been done — just not in environments optimized for raw speed.
Smalltalk images are the canonical example. A Smalltalk system serializes its entire object memory, including all activation records (stack frames), to a single file. You can save an image mid-computation, quit, restart days later, and continue exactly where you left off. This works because Smalltalk controls everything: the object format, the bytecode interpreter, the garbage collector. There are no opaque pointers to external resources — or if there are, the image-saving mechanism explicitly handles them.
Erlang/OTP takes a different approach: processes are so lightweight and isolated that you simply design for crash recovery. A process dies, its supervisor restarts it, and it reconstructs its state from durable storage. There's no pretense that you can freeze and thaw a running computation — you design for restart from the beginning.
Scheme continuations (particularly in implementations like Chez Scheme or Gambit) can capture delimited continuations and, in some implementations, serialize them. But these implementations pay the cost: they maintain a CPS-transformed representation that is inherently slower than direct-style execution.
Seaside deserves special mention. This Smalltalk web framework, developed in the early 2000s, used continuations to model web application control flow. You could write a multi-page wizard as a single method with call: and answer: — the framework would suspend execution while waiting for user input and resume it when the response arrived. Wee, a Ruby port inspired by Seaside's ideas, attempted the same trick using Ruby's callcc. Both frameworks demonstrated genuine continuation-based web development. But they also demonstrated its limits: Seaside required either keeping all session continuations in memory (scaling poorly) or relying on Smalltalk's image persistence (requiring the same VM instance to handle subsequent requests). Wee suffered from Ruby 1.8's notorious continuation memory leaks and remained a curiosity rather than a production tool.
The fundamental problem with continuation-based web frameworks is process affinity. A suspended continuation exists in the memory of a specific process on a specific machine. When the user submits the next form, that exact process must handle the request — no load balancer can route it elsewhere, no autoscaler can spin up a fresh instance to handle the load, no deployment can replace the running code. This is incompatible with modern elastic infrastructure. Kubernetes doesn't care that your user's shopping cart continuation lives in pod web-7f8d9c-xk2p4 — when traffic spikes, it will route requests wherever capacity exists. When you deploy, it will terminate old pods and start new ones. Your continuations die with them.
Even if you could serialize a continuation, you would face the problem of transient resources. A continuation captures the call stack, but the call stack contains references to objects that cannot meaningfully survive process boundaries.
Consider a database transaction. Your step opens a connection, begins a transaction, inserts a row, and then — mid-transaction — suspends to wait for user confirmation:
step :reserve_inventory do
ActiveRecord::Base.transaction do
hero.line_items.each { |item| Inventory.decrement!(item.sku, item.quantity) }
# Suspend here, wait for payment confirmation...
yield # In a hypothetical continuation-based system
hero.update!(reserved_at: Time.current)
end
end
What happens when you try to revive this continuation on a different machine, or even the same machine after a restart? The ActiveRecord::Base.connection object holds a socket to a PostgreSQL server. That socket is gone. You could theoretically reconnect — some systems use lazy connection resolution for exactly this reason — but reconnecting gives you a new connection. The transaction you started? It was rolled back the moment the original connection died. The BEGIN you issued exists only in the logs. The row locks you held have been released. Some other process may have already modified the rows you thought you had locked.
There is no way to "re-enter" a transaction. Transactions are not addressable resources you can resume — they are ephemeral states of a connection that exist only as long as that connection lives. The same applies to file handles, HTTP connections mid-request, mutex locks, and any other resource that represents a relationship with an external system. A serialized continuation that references such resources is not a suspended computation — it is a lie about the state of the world.
The lesson from these systems is clear: serializable execution state requires either total control over the runtime environment or acceptance of significant performance overhead. You cannot bolt it onto V8 or Ruby's YARV after the fact.
Building a marshalable stack VM is a multi-year, multi-million-dollar undertaking. It requires:

- a bytecode format and interpreter designed around serializable activation records;
- an object layout and garbage collector under the implementer's full control;
- a policy for every external resource a frame might reference: sockets, file handles, open transactions;
- a willingness to forgo the JIT optimizations that scatter execution state across registers and inline caches.
The JavaScript ecosystem optimizes for different goals: startup time, peak throughput, memory efficiency on mobile devices. Google, Mozilla, and Apple compete on V8, SpiderMonkey, and JavaScriptCore benchmarks. No one is competing on "ability to serialize a running function to disk."
The developers building "durable function" frameworks are, by and large, application developers — skilled in their domain, but not runtime engineers. They are building atop V8, not replacing it. They cannot make V8 serialize its internal state because V8 was never designed to expose that state. The best they can do is replay: run the function again from the start, skip the steps that already completed, and hope the interleaving of side effects is deterministic. This is not serialization. It is simulation.
When a framework claims to offer "durable functions" without true stack serialization, it is engaging in denial — not about the laws of physics, but about the semantics of their own system.
Consider what happens when a "durable function" resumes. The framework:

1. loads the recorded history of previously completed steps;
2. re-runs the function from the very beginning;
3. when execution reaches a step that already completed, substitutes the recorded result instead of executing the step again;
4. continues until it reaches the first step with no recorded result, and executes that one for real.
This only works if the control flow between steps is perfectly deterministic. But JavaScript (and Ruby, and Python) are not deterministic languages. The order of object keys, the behavior of Math.random(), the resolution of race conditions in Promise.all() — all of these can vary between runs. If your loop counter depends on a hash iteration order that changed between Node versions, your "resumed" function takes a different path than the original.
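The replay contract can be sketched in a few lines of Ruby (names hypothetical, not any framework's API): a `step` helper that records a result on first execution and substitutes it on every subsequent run. It stays correct only while everything between the steps is deterministic.

```ruby
# Minimal replay simulator: the block runs once and its result is
# recorded; on replay the recorded result is returned without executing.
def durable_run(history)
  step = ->(name, &blk) { history.key?(name) ? history[name] : history[name] = blk.call }

  path  = []
  count = step.call(:fetch_count) { 3 } # the "side effect" happens once
  count.times { |i| path << i }         # control flow derived from the result
  path
end

history     = {}
first_path  = durable_run(history) # executes :fetch_count, records 3
replay_path = durable_run(history) # substitutes the recorded 3
first_path == replay_path # => true, but only because the loop is deterministic
```

Swap the loop body for anything that consults `Math.random`-style state, hash ordering, or the clock, and the replayed path silently diverges from the original.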
The frameworks paper over this with restrictions: don't use randomness, don't depend on time, don't read from external systems except through blessed APIs. But these restrictions are invisible until you violate them. You write what looks like normal code, you test it, it works — and then six months later, after a Node upgrade, a resumed workflow takes a wrong turn and corrupts your data.
Honesty requires admitting what the system actually provides. If execution state is not truly serialized, don't pretend it is. Make the boundaries explicit. Make the steps explicit. Make the user acknowledge, at every step boundary, that they are persisting state to a database. GenevaDrive takes this position.
When a step raises an unhandled exception, the default behavior is to pause the workflow. This is not a coincidence — it is the most operationally useful thing we can do. A paused workflow is not broken. It is stopped, intact, waiting for a human to decide what happens next.
Consider the alternatives:

- Reattempting automatically, forever. This repeats side effects against a broken precondition and hides the failure from operators.
- Canceling the workflow. This discards the failure context and releases the hero's slot, so a duplicate workflow can be created before anyone has diagnosed the problem.
- Leaving the workflow in the performing state. This is bad — a performing workflow cannot have a new step execution started, so there is no way to advance it once the bug is fixed or the upstream problem is resolved. The workflow appears to hang indefinitely.

Pausing avoids all of these. The workflow stops advancing, but every piece of state is preserved: which step failed, what exception was raised, the full backtrace, and which step would run next. An operator can inspect the situation, fix the underlying problem (deploy a bug fix, correct bad data, restore an external service), and then decide how to proceed.
A paused workflow is still ongoing. It keeps its slot in the unique index on (type, hero_type, hero_id). This means no competing workflow can be created for the same hero while the paused one exists. If a cron job or a controller action tries to create a duplicate, the unique constraint rejects it.
This is a feature, not a limitation. Imagine a payment processing workflow pauses because the gateway returned an unexpected error. You do not want a second payment workflow starting for the same order while the first one is stopped mid-flight — that risks double-charging. The paused workflow holds the slot, acting as a lock that says "a human needs to look at this before anything else happens for this hero."
You have three options when a workflow is paused:
resume! sets the workflow back to ready and re-enqueues the step that was about to run. If the step was originally scheduled for some time in the future and that time has now passed, it runs immediately — the system recognizes it is overdue. If the scheduled time is still in the future, it waits the remaining duration. This is the most common recovery path: fix the root cause, then resume.
workflow = PaymentProcessingWorkflow.find(id)
workflow.resume! # Re-enqueues the failed step
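The overdue-or-wait decision can be sketched as follows (a hypothetical helper, not GenevaDrive's actual API):

```ruby
# On resume: a step whose scheduled time has already passed runs
# immediately; one still in the future waits out the remaining duration.
def delay_on_resume(scheduled_for, now: Time.now)
  remaining = scheduled_for - now
  remaining.positive? ? remaining : 0
end

now = Time.now
delay_on_resume(now - 3600, now: now) # => 0 (overdue: re-enqueue immediately)
delay_on_resume(now + 120,  now: now) # => 120.0 (seconds still to wait)
```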
skip! marks the current step as skipped and advances to the next one. This is useful when the step is no longer relevant — perhaps you resolved the issue out-of-band, or the step was attempting something that has since been handled manually.
workflow = PaymentProcessingWorkflow.find(id)
workflow.skip! # Advances past the failed step
cancel! terminates the workflow entirely. The slot in the unique index is released, and a new workflow for the same hero can be created.
workflow = PaymentProcessingWorkflow.find(id)
workflow.cancel! # Ends the workflow, frees the hero slot
Steps can also call pause! explicitly to request human intervention before the workflow continues. This is the "emergency stop button" — the step has detected a condition that requires a conscious decision rather than automatic progression.
The payment processing example in this manual demonstrates this pattern. When the gateway flags a transaction as potentially fraudulent, the step pauses the workflow rather than proceeding:
step :capture_payment, wait: 1.hour do
result = PaymentGateway.capture(
authorization_id: hero.authorization_id,
idempotency_key: "capture-#{hero.id}"
)
hero.update!(captured_at: Time.current, transaction_id: result.transaction_id)
rescue PaymentGateway::FraudSuspected
hero.flag_for_fraud_review!
pause! # A human must review before we continue
end
The workflow will sit in paused state — holding the slot, preventing duplicates — until a fraud analyst reviews the case and calls resume! or cancel!.
There is one more scenario where pause happens automatically. If a workflow references a step name that no longer exists in the class definition — typically after a deploy that removed or renamed a step — the executor pauses the workflow rather than silently skipping or crashing. This gives operators a chance to notice the mismatch and decide whether to skip, cancel, or deploy a fix.
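That guard amounts to checking the persisted step name against the steps currently defined on the class. A sketch with hypothetical names:

```ruby
# If a deploy removed or renamed a step, the persisted next-step name no
# longer resolves, and the executor pauses instead of guessing.
CURRENT_STEPS = %i[send_welcome_email verify_email mark_complete].freeze

def action_for(next_step_name)
  CURRENT_STEPS.include?(next_step_name) ? :perform : :pause
end

action_for(:verify_email) # => :perform
action_for(:send_coupon)  # => :pause (the step vanished in a deploy)
```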
This workflow demonstrates named steps with wait times, skip conditions, and blanket cancellation. It guides a new user through account setup with timed reminders.
class UserOnboardingWorkflow < GenevaDrive::Workflow
cancel_if { hero.account_closed? }
cancel_if { hero.onboarding_completed? }
step :send_welcome_email do
OnboardingMailer.welcome(hero).deliver_later
hero.update!(welcome_email_sent_at: Time.current)
end
step :check_profile_completion, wait: 1.day do
skip! if hero.profile_complete?
OnboardingMailer.complete_profile_reminder(hero).deliver_later
end
step :verify_email, wait: 2.days, skip_if: :email_verified? do
OnboardingMailer.verify_email_reminder(hero).deliver_later
end
step :suggest_connections, wait: 3.days, skip_if: -> { hero.connections.any? } do
ConnectionSuggester.generate_for(hero)
OnboardingMailer.connection_suggestions(hero).deliver_later
end
step :schedule_onboarding_call, wait: 5.days do
skip! if hero.onboarding_call_scheduled? || hero.onboarding_completed?
CalendarService.schedule_onboarding(hero)
OnboardingMailer.call_scheduled(hero).deliver_later
end
step :mark_onboarding_complete, wait: 7.days do
hero.update!(onboarding_completed_at: Time.current)
OnboardingMailer.onboarding_complete(hero).deliver_later
end
private
def email_verified?
hero.email_verified?
end
end
This workflow demonstrates manual exception handling, dynamic retry waits, and early termination. It handles the complexity of interacting with an external payment gateway.
class PaymentProcessingWorkflow < GenevaDrive::Workflow
cancel_if { hero.canceled? }
cancel_if { hero.refunded? }
step :validate_payment_method do
unless hero.payment_method&.valid?
hero.mark_invalid_payment_method!
cancel!
end
end
step :authorize_payment do
result = PaymentGateway.authorize(
amount: hero.amount,
payment_method: hero.payment_method,
idempotency_key: "authorize-#{hero.id}"
)
hero.update!(authorization_id: result.authorization_id)
rescue PaymentGateway::CardDeclined => e
hero.update!(failure_reason: e.message)
PaymentMailer.card_declined(hero).deliver_later
cancel!
rescue PaymentGateway::RateLimited => e
reattempt!(wait: e.retry_after || 30.seconds)
rescue PaymentGateway::ServiceUnavailable
reattempt!(wait: 5.minutes)
end
step :capture_payment, wait: 1.hour do
# Allow time for fraud checks
result = PaymentGateway.capture(
authorization_id: hero.authorization_id,
idempotency_key: "capture-#{hero.id}"
)
hero.update!(
captured_at: Time.current,
transaction_id: result.transaction_id
)
rescue PaymentGateway::AuthorizationExpired
# Re-authorize if the hold expired
hero.update!(authorization_id: nil)
reattempt!
rescue PaymentGateway::FraudSuspected
hero.flag_for_fraud_review!
pause!
end
step :send_receipt do
PaymentMailer.receipt(hero).deliver_later
end
step :update_inventory do
hero.line_items.each do |item|
InventoryService.decrement(item.sku, item.quantity)
end
end
step :notify_fulfillment do
FulfillmentService.queue(hero)
hero.update!(fulfillment_queued_at: Time.current)
end
end
This workflow processes a GDPR-style data erasure request. Every step requires the hero to exist — if an admin or another process deletes the user externally before the workflow finishes, the next step will raise on hero access, pausing the workflow for an operator to investigate. This is the correct behavior: silent continuation with a missing user would be worse than stopping.
The final step destroys the hero. Because it is the last step, the workflow transitions to finished and there are no subsequent steps that need the hero.
class UserErasureWorkflow < GenevaDrive::Workflow
step :export_to_archive do
# Every step accesses hero directly — no nil guards.
# If hero was deleted externally, this raises and the
# workflow pauses. An operator can then investigate why
# the user vanished before erasure completed.
archive_data = UserDataExporter.export(hero)
ComplianceArchive.store(
user_id: hero.id,
data: archive_data,
expires_at: 7.years.from_now
)
end
step :notify_third_parties do
hero.oauth_connections.each do |connection|
ThirdPartyDeletionService.request(connection)
end
end
step :delete_user_data, wait: 24.hours do
# Give third parties time to process deletion requests
hero.posts.destroy_all
hero.comments.destroy_all
hero.messages.destroy_all
hero.files.each { |f| f.purge_later }
end
step :send_confirmation do
ErasureMailer.complete(hero.email).deliver_later
end
step :delete_user_record do
# Last step — hero is destroyed, workflow finishes.
# No subsequent steps need the hero.
hero.destroy!
end
end
| Method | Effect |
|---|---|
| `cancel!` | Stop workflow, mark canceled |
| `pause!` | Stop workflow, await manual resume |
| `reattempt!(wait:)` | Retry current step, optionally after delay |
| `skip!` | Skip current step, proceed to next |
| `finished!` | Complete workflow early |
| State | Meaning |
|---|---|
| `ready` | Waiting for next step to execute |
| `performing` | Currently executing a step |
| `finished` | All steps completed successfully |
| `canceled` | Workflow was canceled |
| `paused` | Awaiting manual intervention |
| State | Meaning |
|---|---|
| `scheduled` | Waiting to run |
| `in_progress` | Currently executing |
| `completed` | Finished successfully |
| `failed` | Exception occurred |
| `canceled` | Canceled before execution |
| `skipped` | Skipped via `skip_if` or `skip!` |
| Option | Type | Description |
|---|---|---|
| `wait:` | Duration | Delay before step executes |
| `skip_if:` | Proc, Symbol, Boolean | Condition to skip step |
| `on_exception:` | Symbol | Exception handler (`:pause!`, `:cancel!`, `:reattempt!`, `:skip!`) |
| `max_reattempts:` | Integer, `nil` | Max consecutive reattempts before pausing (default: 100, `nil` = unlimited) |
| `before_step:` | Symbol | Insert before this step |
| `after_step:` | Symbol | Insert after this step |