Operational Excellence Case Study: A 120-Day Turnaround
An anonymized case study of how a mid-market services firm improved on-time delivery and reduced rework through operational excellence. Includes metrics, timeline, and decision points.
Cabrillo Club
Editorial Team · February 5, 2026

A mid-market professional services firm in the technology sector had a familiar problem: smart people, strong demand, and inconsistent execution. Projects shipped late, teams worked nights to “catch up,” and leadership couldn’t reliably answer a basic question—what will we deliver, by when, and at what margin?
This anonymized case study documents a 120-day operational excellence engagement led with cross-functional stakeholders (delivery leadership, finance, and a small internal operations team). The work focused on stabilizing delivery, reducing rework, and creating an operating system that could scale—without adding heavy bureaucracy.
The Challenge: Variability, Rework, and Low Forecast Confidence
The firm delivered a mix of implementation and managed services for enterprise clients. Growth had outpaced the operating model: new teams were formed quickly, project managers were stretched thin, and each delivery pod had its own way of scoping, estimating, and reporting.
Symptoms observed in the first two weeks
- On-time delivery averaged 62% across active projects (defined as hitting the committed milestone date within a 5-business-day window).
- Rework was consuming an estimated 18–22% of delivery capacity, based on time entry sampling and defect/issue logs.
- Gross margin volatility was high: leadership targeted forecast accuracy within ±3%, but actual margins deviated from forecast by ±10–12% month to month.
- Cycle time for change requests (from identification to approved scope/budget update) averaged 9 business days, causing teams to keep working “in limbo.”
- Status reporting took significant effort—delivery leads reported spending 4–6 hours/week assembling updates across tools.
Root causes (confirmed through interviews and data)
- No standard definition of “done.” Acceptance criteria varied by team and client.
- Estimation was inconsistent. Teams used different baselines and assumptions; historical data wasn’t leveraged.
- Work-in-progress (WIP) was too high. People were assigned to too many parallel initiatives, driving context switching.
- Hand-offs were brittle. Sales-to-delivery transitions relied on tribal knowledge; critical requirements were frequently missing.
- Governance was reactive. Escalations happened after deadlines slipped, not when leading indicators appeared.
Key decision point #1: Fix governance or fix delivery first?
Leadership initially wanted to “solve reporting,” assuming visibility would drive improvement. The data indicated the opposite: reporting was a symptom. The decision was to stabilize delivery mechanics first (definition of done, WIP control, estimation), then implement lightweight governance that reflected real work.
The Approach: Diagnose, Design the Operating System, Prove It in a Pilot
The engagement used a three-part approach: (1) diagnostic with baseline metrics, (2) operating model design, and (3) pilot-to-scale execution.
1) Diagnostic (Weeks 1–2)
We combined qualitative interviews with quantitative analysis:
- 25 stakeholder interviews across delivery, finance, customer success, and sales.
- Portfolio review of active projects (size, stage, margin, risk signals).
- Process mapping of the end-to-end flow: intake → scoping → delivery → QA → release → handover.
- Data pulls from ticketing and time-tracking to estimate rework and throughput.
Outputs included:
- A baseline KPI dashboard (on-time delivery, WIP, rework, change request cycle time, forecast accuracy).
- A constraint map identifying bottlenecks (QA capacity and unclear acceptance criteria were the top two).
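To make the baseline concrete, here is a minimal sketch of how two of these KPIs (on-time delivery and rework share) could be computed from routine ticketing and time-tracking exports. The file names, column names, and the "rework" work-type tags are illustrative assumptions, not the firm's actual schema.

```python
import pandas as pd

# Illustrative exports; real column names would need mapping per tool.
tickets = pd.read_csv(
    "tickets_export.csv", parse_dates=["committed_date", "delivered_date"]
)
time_entries = pd.read_csv("time_entries_export.csv")  # ticket_id, hours, work_type

# On-time delivery: delivered within 5 business days of the committed milestone date.
grace = pd.offsets.BDay(5)
on_time_rate = (tickets["delivered_date"] <= tickets["committed_date"] + grace).mean()

# Rework share: hours tagged as rework or defect fixing vs. all logged hours.
is_rework = time_entries["work_type"].isin(["rework", "defect_fix"])
rework_rate = time_entries.loc[is_rework, "hours"].sum() / time_entries["hours"].sum()

print(f"On-time delivery: {on_time_rate:.0%}")
print(f"Rework share of capacity: {rework_rate:.0%}")
```

The same pattern extends to WIP, change request aging, and forecast accuracy once field mappings are agreed with each team.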
2) Operating model design (Weeks 3–4)
Rather than importing a heavyweight framework, the team designed a “minimum viable operating system” with:
- Standard work for scoping, estimation, and acceptance criteria.
- WIP limits at team and individual level.
- A tiered cadence (daily team huddles, weekly delivery review, monthly portfolio review).
- A single source of truth for project health (integrating existing tools rather than replacing them).
3) Pilot-first execution (Weeks 5–8)
A pilot was selected with clear criteria:
- Representative project types (implementation + managed services)
- High enough volume to test the system
- A delivery manager willing to enforce WIP limits
Key decision point #2: Which pilot to choose?
There was pressure to choose the most troubled account. We recommended a “typical but visible” portfolio slice instead. A pilot that is too broken can mask whether the operating system works; a typical slice provides cleaner proof and faster adoption.
Implementation: What Changed, What Didn’t, and the Setbacks
Implementation ran in two phases: pilot (Weeks 5–8) and scale (Weeks 9–16). The goal was to improve outcomes without disrupting client commitments.
Timeline of the engagement
- Weeks 1–2: Diagnostic, baseline KPIs, constraint mapping
- Weeks 3–4: Operating model design, templates, tool workflow design
- Weeks 5–8: Pilot implementation, coaching, weekly retrospectives
- Weeks 9–12: Scale to remaining teams, portfolio governance launch
- Weeks 13–16: Stabilization, KPI automation, handover to internal ops
What we actually did
1) Standardized “Definition of Done” and acceptance criteria
- Introduced a one-page acceptance criteria template tied to testable outcomes.
- Required acceptance criteria at intake for new work and for any change request.
- Implemented a “ready for QA” checklist to reduce back-and-forth.
Early setback: Teams initially treated the template as paperwork. Rework didn’t drop in the first two weeks of the pilot.
Correction: We embedded acceptance criteria into the delivery workflow (ticket fields and gating), making it part of execution rather than an external document.
2) Reduced WIP and improved flow
- Implemented WIP limits per team (and a rule of thumb per individual).
- Reassigned some responsibilities to reduce context switching.
- Introduced a weekly “stop starting, start finishing” review focused on blocked work.
Setback: Some stakeholders interpreted WIP limits as reduced capacity.
Correction: We paired WIP limits with a transparent queue and explicit prioritization. Leadership agreed that anything above the WIP limit required a trade-off decision.
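For teams tracking work in a ticketing system, the overload check behind this weekly conversation can be approximated with a few lines of analysis. The limits, file name, and column names below are illustrative assumptions, not the firm's actual configuration.

```python
import pandas as pd

TEAM_WIP_LIMIT = 8        # illustrative limit per delivery pod
INDIVIDUAL_WIP_LIMIT = 2  # illustrative rule-of-thumb per person

tickets = pd.read_csv("tickets_export.csv")  # columns: team, assignee, status
in_progress = tickets[tickets["status"] == "in_progress"]

# Anything over a limit becomes a trade-off decision for leadership,
# not silent overload on the team.
team_wip = in_progress.groupby("team").size()
person_wip = in_progress.groupby("assignee").size()

print("Teams over WIP limit:")
print(team_wip[team_wip > TEAM_WIP_LIMIT])
print("People over WIP limit:")
print(person_wip[person_wip > INDIVIDUAL_WIP_LIMIT])
```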
3) Estimation and forecasting improvements
- Created a lightweight estimation rubric using historical ranges and complexity tiers.
- Implemented a two-stage commit: initial estimate at intake, then a firm commit after discovery.
- Worked with finance to align delivery forecasts with how work was actually executed.
Key decision point #3: Enforce a discovery gate or keep “fast starts”?
Sales teams worried a discovery gate would slow deals. The compromise was a time-boxed discovery (typically 3–5 business days) with a clear output: validated scope, risks, and acceptance criteria. This reduced downstream churn without materially slowing starts.
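As a rough illustration of the estimation rubric, the sketch below derives a quote range per complexity tier from historical actuals. The tier labels, column names, and the 25th–75th percentile range are assumptions for illustration only.

```python
import pandas as pd

# Historical actuals for completed work items; columns are illustrative.
history = pd.read_csv("completed_items.csv")  # columns: complexity_tier, actual_hours

# Rubric: quote a range (25th-75th percentile of past actuals) per tier,
# then firm up the commit after the time-boxed discovery.
rubric = history.groupby("complexity_tier")["actual_hours"].quantile([0.25, 0.75]).unstack()
rubric.columns = ["low_hours", "high_hours"]

def intake_estimate(tier: str) -> tuple:
    """Initial range estimate at intake; the firm commit comes after discovery."""
    row = rubric.loc[tier]
    return (row["low_hours"], row["high_hours"])

print(rubric)
print("Intake estimate for a 'medium' item:", intake_estimate("medium"))
```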
4) Introduced tiered governance with leading indicators
- Daily (15 minutes): team huddle focused on blockers and WIP
- Weekly (45 minutes): delivery review using leading indicators (blocked days, defect trends, change request aging)
- Monthly (60 minutes): portfolio review with finance and leadership (margin, capacity, risk)
To avoid “status theater,” the reviews used standard thresholds (e.g., change requests older than X days, QA re-open rate above Y%) that triggered action.
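A minimal sketch of how such threshold checks can be automated is shown below. The threshold values stand in for the "X days" and "Y%" agreed with leadership, and the file and column names are illustrative assumptions.

```python
import pandas as pd

# Placeholder thresholds; the actual values were agreed with delivery leadership.
CR_AGE_THRESHOLD_DAYS = 5
QA_REOPEN_THRESHOLD = 0.10

change_requests = pd.read_csv("change_requests.csv", parse_dates=["opened_date"])
qa_results = pd.read_csv("qa_results.csv")  # columns: ticket_id, reopened (0/1)

alerts = []

# Leading indicator: change requests aging past the threshold.
age_days = (pd.Timestamp.today() - change_requests["opened_date"]).dt.days
stale = change_requests[age_days > CR_AGE_THRESHOLD_DAYS]
if not stale.empty:
    alerts.append(f"{len(stale)} change requests older than {CR_AGE_THRESHOLD_DAYS} days")

# Leading indicator: QA re-open rate above the threshold.
reopen_rate = qa_results["reopened"].mean()
if reopen_rate > QA_REOPEN_THRESHOLD:
    alerts.append(f"QA re-open rate {reopen_rate:.0%} exceeds {QA_REOPEN_THRESHOLD:.0%}")

# Anything listed here goes on the weekly delivery review agenda.
for alert in alerts:
    print("ACTION:", alert)
```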
5) Tooling: integrate, don’t replace
The firm used multiple systems (ticketing, time tracking, and a project tracker). Replacing tools would have extended timelines and increased resistance. Instead:
- Standardized a small set of required fields
- Automated a weekly KPI export
- Created a single health view for leadership
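As a sketch of what "integrate, don't replace" looked like mechanically, the snippet below rolls weekly exports from the three systems up into one row per project. File names, column names, and the specific roll-ups are illustrative assumptions rather than the firm's actual pipeline.

```python
import pandas as pd

# Weekly exports from the existing systems; names are illustrative.
projects = pd.read_csv("project_tracker_export.csv")    # project_id, name, status
tickets = pd.read_csv("ticketing_export.csv")           # project_id, blocked_days, reopened
time_entries = pd.read_csv("time_tracking_export.csv")  # project_id, hours, work_type

# Roll each source up to one row per project.
ticket_health = tickets.groupby("project_id", as_index=False).agg(
    blocked_days=("blocked_days", "sum"),
    reopen_rate=("reopened", "mean"),
)

rework = time_entries.assign(
    rework_hours=time_entries["hours"].where(time_entries["work_type"] == "rework", 0.0)
).groupby("project_id", as_index=False).agg(
    total_hours=("hours", "sum"),
    rework_hours=("rework_hours", "sum"),
)
rework["rework_share"] = rework["rework_hours"] / rework["total_hours"]

# One health view for leadership, refreshed by the same weekly job.
health = (
    projects
    .merge(ticket_health, on="project_id", how="left")
    .merge(rework[["project_id", "rework_share"]], on="project_id", how="left")
)
health.to_csv("weekly_health_view.csv", index=False)
```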
Results: Measurable Improvements After 120 Days
Results were measured by comparing baseline (Weeks 1–2) to stabilized performance (Weeks 13–16). Metrics were reviewed with delivery and finance to confirm definitions and eliminate “metric drift.”
Outcome metrics
- On-time delivery: improved from 62% → 84% (22-point increase)
- Rework rate: reduced from ~20% → ~12% of delivery capacity (~40% reduction)
- Change request cycle time: reduced from 9 → 4 business days (56% faster)
- Forecast accuracy (margin): improved from ±10–12% → ±4–6% variance
- Status reporting effort: reduced from 4–6 hours/week → 1–2 hours/week per delivery lead (~60% time saved)
Business impact (conservative estimates)
- The reduction in rework and reporting time freed an estimated 6–9% net capacity across the delivery organization (validated through utilization and throughput trends).
- Improved forecasting reduced end-of-month “fire drills” and enabled more reliable staffing decisions, lowering the need for last-minute contractor spend (not eliminated, but reduced).
What didn’t fully improve (yet)
- QA remained a constraint during peak periods. While re-open rates dropped, staffing and automation needed longer-term investment.
- Sales-to-delivery handoff improved, but only after leadership enforced the discovery gate consistently. Inconsistent enforcement caused a brief regression around Week 10.
Lessons Learned: What We’d Repeat and What We’d Change
- Operational excellence is a flow problem before it’s a reporting problem. Visibility helps, but only after the work system is stable.
- WIP limits require executive air cover. Without explicit trade-off decisions, teams will be forced to overload and the system will collapse.
- Templates don’t work unless they’re embedded in workflow. If “Definition of Done” lives in a separate document, it won’t change outcomes.
- Pilot selection matters. Choose a representative slice with leadership support, not the most chaotic edge case.
- Leading indicators beat lagging indicators. Blocked days, defect re-open rates, and change request aging surfaced risks earlier than milestone slips.
Applicability: When This Operational Excellence Approach Fits
This approach is a strong fit when:
- You have multiple delivery teams with inconsistent execution practices.
- Work is project-based or hybrid (projects + managed services) with frequent change.
- Leadership needs better forecast confidence without adding heavy process overhead.
- Rework, context switching, and unclear acceptance criteria are driving missed dates.
It’s a weaker fit when:
- The primary constraint is strategic (product-market fit) rather than execution.
- Work is already highly standardized and the bottleneck is pure capacity (e.g., a single specialist role with no redundancy).
Conclusion: Actionable Takeaways for Operational Excellence
Operational excellence doesn’t require a massive transformation program. In this case, a 120-day sequence—baseline the truth, design a minimal operating system, pilot, then scale—produced measurable improvements in delivery reliability, rework, and forecast accuracy.
If you’re trying to improve operational performance, start with these three moves:
- Baseline 5–7 metrics that reflect flow (on-time delivery, WIP, rework, change request aging).
- Implement WIP limits with explicit trade-offs—make overload a leadership decision, not a team burden.
- Standardize “Definition of Done” and acceptance criteria and embed them directly into the workflow.
If you want a practical operational excellence assessment focused on delivery flow, governance, and measurable outcomes, Cabrillo Club can help you identify the highest-leverage changes and build a 90–120 day execution plan.