When Vendors Fail (Part II): Writing Functional Continuity Into Your Contracts


IT Trends Weekly — a curated, citation-first roundup for busy IT leaders.

Last week’s European airport disruption was a painful reminder that your customer experience is only as strong as your vendors’ worst day. A ransomware attack on a third-party check-in platform used by multiple airports forced manual processes, handwritten boarding passes, and cancellations across major hubs. Regulators confirmed ransomware; airports described staged recovery and manual workarounds over several days.[1][2][3]

This follow-up focuses on something most contracts miss: functional continuity—the vendor’s ability to help you keep serving customers (even in a degraded mode) while they restore systems. Backups and RPO/RTO targets matter, but they don’t answer phones or accept orders. You need written, testable commitments for the ugly middle between “incident declared” and “full restore.”

What “functional continuity” means in plain language

Functional continuity is the set of interim measures that let you continue essential functions during a disruption—manual intake, alternate channels, static status, limited transactions—until normal operations resume. NIST’s contingency guidance explicitly calls for interim measures and manual workarounds as part of continuity planning, not as afterthoughts.[4]

You’ve probably baked these into your internal runbooks. The gap is getting vendors to agree, in writing and in tests, to help you deliver them.

Five clauses to add to every critical vendor contract

  • 1) Functional continuity SLOs (not just uptime)
    Define the minimum essential services the vendor will help you keep running if they isolate or rebuild their stack. Examples: read-only status APIs/pages; CSV or secure-form intake; daily batch re-ingest of the backlog; named human contacts on a 24/7 rota; and a public-comms cadence your team can point customers to.
    Why: Airports could still show departure info long before they could check people in—classic “read vs. write” asymmetry. Your contract should reflect that reality.[1]
  • 2) CSV intake by S+4 hours, daily re-ingest
    If transactional endpoints are down, the vendor must accept customer requests via secure upload or a minimal alternate form within four hours of incident start, then re-ingest at least daily until systems are back.
    Why: NIST treats alternate processing and manual workarounds as first-class continuity tactics. Turning customers away is not continuity.[4]
  • 3) Read-first restore pattern
    Contract for a priority order of restoration: read (status/lookups) first, then write (transactions) with staged limits. Document the maximum staleness (e.g., status ≤ 30 minutes old) and what the splash/holding pages must say.
    Why: A functional status plane deflects calls and restores trust faster, as seen in the airport recovery sequence.[1]
  • 4) Vendor participation in tests and tabletops
    Require semiannual joint exercises (one business-led, one technical) using your top customer journeys and the vendor’s continuity playbook. Capture action items and a remediation clock.
    Why: NIST’s testing guidance (SP 800-84) emphasizes realistic TT&E to prove plans actually work—your contracts should too.[5]
  • 5) Identity & access requirements during incidents
    Mandate hardware-key MFA for vendor admins, just-in-time elevated access, and a procedure to provision your staff into their alternate tools/tenants during isolation. Log sharing and incident timelines should be deliverables.
    Why: Supply-chain guidance expects controls to extend through the vendor relationship, not stop at your boundary.[6]
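The "read-first" pattern in clause 3 can be sketched in a few lines: a read-only status payload that carries its own staleness flag, so your monitoring (and the vendor's) can see an SLO breach immediately. The 30-minute limit comes from the clause; the field names and the "degraded-read-only" mode label are illustrative assumptions, not a standard.

```python
from datetime import datetime, timedelta, timezone

MAX_STALENESS = timedelta(minutes=30)  # contractual limit from clause 3

def status_payload(record: dict, now: datetime) -> dict:
    """Build a read-only status response, flagging data older than the limit.

    `record` is assumed to carry a `status` string and a timezone-aware
    `updated_at` datetime (hypothetical field names for illustration).
    """
    age = now - record["updated_at"]
    return {
        "status": record["status"],
        "updated_at": record["updated_at"].isoformat(),
        "stale": age > MAX_STALENESS,   # surfaces an SLO breach to monitoring
        "mode": "degraded-read-only",   # tells holding pages which copy to show
    }

now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = status_payload({"status": "on time", "updated_at": now - timedelta(minutes=10)}, now)
old = status_payload({"status": "on time", "updated_at": now - timedelta(minutes=45)}, now)
```

Exposing the `stale` flag in the payload itself, rather than only in internal metrics, means customers and call-center scripts can trust what the status surface says during the incident.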

Regulatory tailwinds (why your legal team will say yes)

Across sectors, regulators now expect third-party operational resilience, not just security checklists. In the EU’s financial sector, the Digital Operational Resilience Act (DORA) requires contractual controls for ICT providers supporting critical functions, including testing, reporting, and exit/portability. It explicitly folds ICT third-party risk into core risk management.[7][8] The broader NIS2 regime pushes essential/important entities to manage supply-chain cybersecurity risks and demonstrate continuity measures proportionate to impact—ENISA has published practical implementation guidance.[9][10]

In the U.S. financial sector, the updated FFIEC Business Continuity Management handbook expects institutions to evaluate continuity risks stemming from third-party providers and to test accordingly—good cover when you ask vendors for joint exercises and backlog intake options.[11][12]

How to make these clauses stick (without blowing up procurement)

  • Lead with business journeys, not tech. Map the 5 customer/citizen journeys that matter most (e.g., “new claim,” “appointment booking”). For each, write the “manual mode” outcome you need: information available, intake captured, backlog cleared daily. Legal teams respond well to customer impact language backed by recognized frameworks (NIST/ISO).[4][13]
  • Use template language vendors recognize. Many providers already commit to incident comms cadence, status pages, and export formats. Add your specifics to avoid ambiguity (e.g., “CSV with schema X; SFTP with key-based auth; 10:00/16:00 local updates”).
  • Make tests a deliverable, not a favor. Reference NIST SP 800-84 in the contract and attach a one-page annex describing scope, participants, and artifacts (timeline, evidence, action log). Tie a small portion of renewal or a service credit to successful completion.[5]
  • Write the “read-first” pattern in. Ask for a simple status/lookup surface the vendor can keep online or restore first—preferably hosted on separate infrastructure/DNS per resilience guidance. This keeps customers informed even if writes are disabled.[14]
  • Don’t forget exit/portability. If the vendor’s outage turns into termination, you’ll need portable data and a timely exit. DORA and ISO 22301 both support portability and continuity outcomes; echo that language.[7][13]
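The "CSV with schema X" specificity above pays off on day one of an incident: intake validation becomes a short script rather than a negotiation. A minimal sketch, assuming a hypothetical agreed schema (the column names here stand in for whatever "schema X" your annex attaches); rejects go back the same day so the daily re-ingest stays clean:

```python
import csv
import io

# Hypothetical agreed schema — stands in for "schema X" in the contract annex.
SCHEMA = ["request_id", "customer_ref", "request_type", "submitted_at"]

def validate_intake(csv_text: str):
    """Split an uploaded intake CSV into accepted rows and rejects.

    Raises on a header mismatch (wrong schema version); rows with blank
    required fields are returned separately for same-day correction.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != SCHEMA:
        raise ValueError(f"header mismatch: expected {SCHEMA}, got {reader.fieldnames}")
    accepted, rejected = [], []
    for row in reader:
        if all((row.get(col) or "").strip() for col in SCHEMA):
            accepted.append(row)
        else:
            rejected.append(row)
    return accepted, rejected

sample = (
    "request_id,customer_ref,request_type,submitted_at\n"
    "R-001,C-42,new_claim,2025-09-22T10:05:00Z\n"
    "R-002,,appointment,2025-09-22T10:07:00Z\n"
)
accepted, rejected = validate_intake(sample)
```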

Sample “functional continuity” annex you can redline

(Non-legal sample to accelerate negotiations; adapt with counsel.)

  1. Scope. This Annex applies to Critical Functions A, B, C delivered by Provider to Customer. During a Major Incident requiring isolation or rebuild, Provider will support the following continuity measures until normal operations resume.
  2. Read-First Restore. Within 2 hours of incident declaration, Provider will publish/maintain a read-only status and lookup surface (separate hosting/DNS) with data no more than 30 minutes stale.
  3. Alternate Intake. Within 4 hours of incident declaration, Provider will accept new customer requests via (a) secure SFTP CSV (schema attached) or (b) minimal webform hosted in a clean environment, and will provide daily re-ingest of backlogged requests into the primary platform when available.
  4. Cadence & Contacts. Provider will post public updates at 10:00/16:00 local and maintain a 24/7 named technical contact bridge for Customer.
  5. Testing. Twice per year, Provider will participate in joint tabletop exercises covering the above measures and will provide evidence artifacts (timeline, logs, action items) within 5 business days.
  6. Identity & Access. Provider maintains hardware-key MFA for privileged admins and just-in-time elevation. Customer may be provisioned into Provider’s alternate tools during isolation for continuity operations.
  7. Portability & Exit. Upon request, Provider will deliver Customer data in agreed portable formats to support temporary workarounds or exit, consistent with regulatory obligations.
  8. Service Credits/Remedies. Failure to meet items 2–4 above triggers [service credits / escalation / right to terminate for cause], without penalty to Customer.
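Annex item 3's "daily re-ingest of backlogged requests" is easy to get wrong: if the same backlog file is replayed twice, customers get duplicate claims or bookings. A minimal sketch of an idempotent re-ingest, assuming `request_id` is unique per the intake schema (an assumption worth stating in the annex itself):

```python
def reingest(backlog: list, already_ingested: set) -> list:
    """Idempotent daily re-ingest: return only requests the primary
    platform has not yet seen, keyed on request_id (assumed unique).
    Safe to re-run on the same backlog file without creating duplicates.
    """
    seen = set(already_ingested)
    to_push = []
    for req in backlog:
        rid = req["request_id"]
        if rid not in seen:
            to_push.append(req)
            seen.add(rid)  # also dedupes repeats within the same backlog file
    return to_push

backlog = [{"request_id": "R-001"}, {"request_id": "R-002"}, {"request_id": "R-001"}]
to_push = reingest(backlog, already_ingested={"R-002"})
```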

Proving it works: what your team should test this quarter

  • Tabletop #1 (Business-led): Primary portal down; run the status cadence; accept 500 requests via CSV; process the re-ingest; publish “what works/what doesn’t” comms. Capture time to read-only and time to intake.[5]
  • Tabletop #2 (Tech-led): Vendor isolates production; you fail over your status site on separate DNS; enact JIT access into vendor’s clean tenant; verify hardware-key enforcement for vendor admins.[6][14]
  • Regulatory check: If you’re EU-regulated, align evidence to DORA/NIS2 expectations for ICT third-party risk. If you’re in U.S. financial services, tie evidence to FFIEC BCM examiner procedures.[7][8][11]
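The two timings the tabletops capture map directly onto the annex targets (read-only within 2 hours, intake within 4). A small sketch of turning exercise timestamps into evidence artifacts; the metric names and the `breaches` list are illustrative conventions, not part of any framework:

```python
from datetime import datetime, timezone

# Contractual targets from the annex: read-only in 2h, alternate intake in 4h.
TARGETS_MIN = {"time_to_read_only": 120, "time_to_intake": 240}

def exercise_metrics(declared_at: datetime, read_only_at: datetime, intake_at: datetime):
    """Capture tabletop timings (in minutes) and flag any target breaches."""
    def minutes(start, end):
        return (end - start).total_seconds() / 60
    observed = {
        "time_to_read_only": minutes(declared_at, read_only_at),
        "time_to_intake": minutes(declared_at, intake_at),
    }
    breaches = [name for name, value in observed.items() if value > TARGETS_MIN[name]]
    return observed, breaches

t0 = datetime(2025, 9, 22, 9, 0, tzinfo=timezone.utc)
observed, breaches = exercise_metrics(
    t0,
    t0.replace(hour=10, minute=30),  # read-only up at +90 min: within target
    t0.replace(hour=14, minute=0),   # intake up at +300 min: breach
)
```

Recording breaches as data, rather than prose in a post-exercise deck, gives you the "evidence artifacts" the annex's testing clause asks for and the remediation clock something concrete to start from.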

Bottom line

Outages don’t just “happen to” vendors; they happen to you. Write functional continuity into contracts, test it, and measure it the way you measure normal operations. When (not if) a partner isolates their systems, you’ll still be serving customers—in a degraded but deliberate mode—while they restore.

Like this format? Stay a step ahead.

Subscribe to IT Trends Weekly for one concise, citation-first roundup each week.

imperialvalleyinfotech.com/it-trends-weekly/#subscribe

Sources & citations

  1. Reuters — Cyberattack disrupts check-in/boarding systems; airports resort to manual processes.
  2. Reuters — ENISA confirms third-party ransomware behind airport disruptions.
  3. AP — Airport cyberattack disrupted flights; staged restoration and manual check-ins. See also The Guardian — continued delays at Heathrow/Brussels/Berlin.
  4. NIST SP 800-34 Rev.1 — Contingency Planning Guide for Federal Information Systems (interim measures, manual workarounds).
  5. NIST SP 800-84 — Guide to Test, Training, and Exercise (TT&E) Programs for IT Plans.
  6. NIST SP 800-161 Rev.1 — Cybersecurity Supply Chain Risk Management (C-SCRM) Practices.
  7. DORA (EU) — Article 28: Managing ICT third-party risk. See also ESAs RTS summary — policy on ICT services supporting critical functions.
  8. Reuters — European airports race to fix check-in after disruption.
  9. ENISA — Good Practices for Supply Chain Cybersecurity (NIS2 context).
  10. ENISA — Technical implementation guidance for NIS2 risk-management measures.
  11. FFIEC — Business Continuity Management booklet (2019 update).
  12. FDIC/OCC — Announcement of revised FFIEC BCM booklet.
  13. ISO 22301 — Business Continuity Management Systems (overview).
  14. NCSC (UK) — Using SaaS securely (resilience and outage planning principles).