Your time-in-status plugin is lying to you

A pattern I've watched play out across many teams and many companies over the years: a deadline slips. A retrospective gets scheduled. Someone asks where the bottleneck was — in good faith, looking for a fix. The default answer, in the absence of trustworthy data, is to blame the team closest to the visible delay. Engineering. QA. UAT. Code review. The team that owned the final stage before the deadline.

Sometimes there's no plugin in place to answer the question with data. More often, there is — and the plugin's number gets quoted in the meeting as authoritative. The action item is assigned. The team works to "speed up" the named stage for the next sprint. The next sprint, the same number comes back unchanged. The retrospective starts again.

This isn't one team's story. It repeats. It repeats because the meetings start from numbers, and the numbers come from plugins, and most plugins are computing the wrong thing — silently, plausibly, without flagging any error. The team's "throughput problem" was never the team's throughput problem. The data the meeting started from was wrong, and nobody knew because the dashboard rendered without complaint.

This is not a hypothetical. It is the predictable consequence of how nearly every Jira time-in-status plugin computes its numbers. Once you see the mechanism, you can't unsee it.

The claim

Most Jira time-in-status, cycle-time, and flow plugins compute their metrics from current_status plus a handful of timestamps: created_at, updated_at, resolved_at, sometimes a "last status change" field. They do not replay the issue changelog. When the workflow stays static and statuses never get renamed, the answer is approximately right. The moment either assumption breaks (and they break constantly), the numbers silently drift, and nothing in the plugin surfaces a warning.

Three things I'll defend in the rest of this post:

The standard time-in-status computation is a shortcut, not a measurement. In common scenarios it can be off by 2–10× per stage.
The drift is silent. The plugin doesn't tell you it's reporting derived approximations instead of replayed truth.
The fix is mechanical. Replay the changelog. There is no engineering excuse for not doing it.

The mechanism — six ways the shortcut breaks

Status renames

A team renames "In Review" to "Code Review." From Jira's API perspective the status ID is unchanged, but many plugins cache their aggregates by status name. After the rename, historical entries under "In Review" go to one of three places, none of them right:

Orphaned, no longer rolling up into the dashboard slice for the new "Code Review."
Attributed to "no status" or "deleted" — invisible in the new view.
Silently re-categorized into "Code Review" only for tickets that touched the new status post-rename, breaking comparison-over-time.

If your dashboard's last-quarter Code Review numbers were 1,200 tickets and this quarter's are 800, your throughput might have dropped 33% — or your plugin might just have lost the old name. You don't know from the dashboard.
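To make the name-versus-ID failure concrete, here is a minimal sketch. The records, the status ID 10010, and both helper functions are hypothetical illustrations, not any particular plugin's code. It counts the same five transitions two ways: keyed by the name in effect at the time, and keyed by the ID that survives the rename.

```python
from collections import Counter

# Hypothetical transition records: (issue_key, status_id, status_name_at_the_time).
# Status ID "10010" was renamed from "In Review" to "Code Review" partway through.
transitions = [
    ("PROJ-1", "10010", "In Review"),
    ("PROJ-2", "10010", "In Review"),
    ("PROJ-3", "10010", "Code Review"),
    ("PROJ-4", "10010", "Code Review"),
    ("PROJ-5", "10010", "Code Review"),
]


def count_by_name(rows):
    """Name-keyed aggregate: a rename splits one status into two buckets."""
    return Counter(name for _, _, name in rows)


def count_by_id(rows):
    """ID-keyed aggregate: the rename does not affect the count."""
    return Counter(status_id for _, status_id, _ in rows)


by_name = count_by_name(transitions)
by_id = count_by_id(transitions)

print(by_name["Code Review"])  # 3: the two "In Review" entries are orphaned
print(by_id["10010"])          # 5: the full history of the status
```

A dashboard slice built from the name-keyed count would show three tickets where the ID-keyed count shows five, with no error raised anywhere.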

60-second check

If you've renamed any status in the last 12 months

Pull the cycle-time or time-in-status report for that status across the rename window. If the historical numbers shrank materially, moved into an "Unknown" bucket, or disappeared entirely, your plugin caches by status name. Every workflow change you've ever made has silently distorted historical comparisons.

Workflow restructures

Mid-quarter, a team adds a "QA Pending" status between "In Progress" and "QA In Progress." Historical tickets — closed before the workflow change — never passed through "QA Pending." How does a plugin reading current_status handle this?

I've watched three behaviors, none correct:

Show zero for historical tickets in the new stage. Artificially inflates the new stage's apparent growth.
Estimate by interpolation. Literally inventing data.
Backfill by re-categorizing historical tickets retroactively. Changes what counts as which stage after the fact, breaking comparison-over-time.

Whichever path the plugin took, the cross-workflow-change comparison is now unreliable. And the plugin reports the comparison anyway, because it doesn't know it's broken.

60-second check

Pick a status that didn't exist 12 months ago in your workflow

Pull your time-in-status report for the last 12 months filtered to that status. If tickets that closed before the status existed show zero time in it, the plugin is silently zeroing historical coverage and your cross-workflow-change comparison is broken. If they show non-zero time, the plugin is inventing data — which is worse.

Manual ticket edits

A ticket bounces from "Done" back to "In Progress" by mistake — someone clicked the wrong transition. An admin reverts it four hours later. A plugin reading only current_status and a "last changed" timestamp sees the ticket as "In Progress" for that four-hour window, and attributes those four hours to the ticket's cycle time. The changelog shows the bounce, the revert, and the correct durations. The plugin doesn't read the changelog.

For a team running 100 tickets a week, a handful of these per month meaningfully skews the P50 and P95.

60-second check

Find any ticket whose history shows a transition reversal

Done → In Progress → Done, or In Review → To Do → In Review, or similar. They exist in every project; the issue's History panel in Jira surfaces them. Compare your plugin's cycle-time for that ticket against manual arithmetic from the changelog. If they differ by more than 5%, the plugin reads "last changed" rather than the full changelog, and your project's cycle-time P50 and P95 are contaminated by mistaken-transition windows.

Multiple transitions in fast succession

A ticket gets transitioned from "To Do" → "In Progress" → "Blocked" → "In Progress" in the same 30-minute window. A plugin reading only current_status and "last changed" sees: started in "To Do" 30 minutes ago, currently in "In Progress." Where's the "Blocked" interval? It existed for nine minutes but never gets counted.

Cumulative effect: status-toggle patterns popular with teams that flag-and-unflag tickets get under-attributed in time-in-status reports by 20–40%.
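The gap between the two approaches can be sketched in a few lines. Everything here is an assumption for illustration: the changelog tuples, the timestamps, and the two functions, with shortcut() standing in for what a current_status-plus-timestamp plugin is able to see.

```python
from collections import defaultdict
from datetime import datetime, timedelta

created_at = datetime(2024, 3, 4, 10, 0)
now = created_at + timedelta(minutes=30)

# Hypothetical changelog for one ticket: (timestamp, from_status, to_status).
changelog = [
    (created_at + timedelta(minutes=5), "To Do", "In Progress"),
    (created_at + timedelta(minutes=12), "In Progress", "Blocked"),
    (created_at + timedelta(minutes=21), "Blocked", "In Progress"),
]


def shortcut(current_status, last_changed, now):
    """What the shortcut can see: one status and one interval."""
    return {current_status: now - last_changed}


def replay(created_at, initial_status, changelog, now):
    """Walk every transition and accumulate time per status."""
    durations = defaultdict(timedelta)
    status, entered = initial_status, created_at
    for ts, _from_status, to_status in sorted(changelog):
        durations[status] += ts - entered
        status, entered = to_status, ts
    durations[status] += now - entered  # the still-open final interval
    return dict(durations)


fast = shortcut("In Progress", changelog[-1][0], now)
full = replay(created_at, "To Do", changelog, now)

print("Blocked" in fast)  # False: the interval never existed for the shortcut
print(full["Blocked"])    # 0:09:00
```

The replay also accounts for the whole 30 minutes (5 in To Do, 16 in In Progress, 9 in Blocked), which the shortcut cannot do from a single timestamp.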

60-second check

Find any ticket that briefly passed through an intermediate status

A "Blocked" interval of less than an hour, a quick flag-and-unflag, a quick reassignment loop. Check your plugin's time-in-"Blocked" (or time-in-whatever-the-brief-status-was) for that ticket against the actual changelog interval. If the plugin shows zero or only the most recent occurrence, your project's time-in-that-status aggregate is under-attributed by 20–40% — and every retrospective debating "how much time do we lose to being Blocked" was working from numbers smaller than reality.

Calendar time treated as working time

A ticket sits in "In Progress" from Friday 17:00 to Monday 09:00 because the assignee took the weekend off. Wall-clock duration: 64 hours. Working-time duration: zero. The plugin reports 64.

This is the most universal drift mechanism and the easiest one to verify against your own data. Pick any ticket whose cycle time spans a weekend or a holiday and check whether the plugin reports a cycle time the team could possibly have worked. Most plugins don't ask the question: they treat the wall clock as the work clock, weekends inclusive, holidays inclusive, off-hours inclusive. Whatever your team's HR system says about "business hours" has no bearing on the numbers your flow-analytics plugin shows.

The inflation isn't symmetric across stages, either. Statuses that tickets sit in passively (Blocked, Waiting, In Review when nobody's looking) accumulate non-working hours disproportionately to statuses that tickets sit in actively (In Progress when someone's typing). The bottleneck card's signal gets distorted toward the passive statuses, and the team gets the action item to "speed up Review" when Review actually had two business hours of time on it that week and forty-eight non-working hours.

The fix is per-tenant work schedule applied at duration-computation time — timezone, working days, work hours, holidays — so every duration goes through one helper that honors the schedule. Cycle time, time-in-status, alert thresholds, the bottleneck card's time signal all use the same math. Most plugins either don't have work-schedule configuration, or have it for one chart but not for the alerts, which is a different and arguably worse failure mode — the alert fires at a different threshold than the chart shows.
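A minimal sketch of such a helper, under stated assumptions: the WorkSchedule class and its field names are hypothetical, and timezone conversion is left out, so both datetimes are assumed to be naive values already expressed in the tenant's local time. The function sums the overlap between the interval and each working day's hours.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class WorkSchedule:
    """Hypothetical per-tenant schedule; timezone handling is omitted here."""
    working_days: frozenset = frozenset({0, 1, 2, 3, 4})  # Monday=0 .. Friday=4
    day_start: time = time(9, 0)
    day_end: time = time(17, 0)
    holidays: frozenset = frozenset()  # datetime.date values


def working_seconds_between(start, end, schedule):
    """Sum the overlap of [start, end) with each working day's hours."""
    if end <= start:
        return 0.0
    total = 0.0
    day = start.date()
    while day <= end.date():
        if day.weekday() in schedule.working_days and day not in schedule.holidays:
            window_start = datetime.combine(day, schedule.day_start)
            window_end = datetime.combine(day, schedule.day_end)
            overlap = min(end, window_end) - max(start, window_start)
            if overlap > timedelta(0):
                total += overlap.total_seconds()
        day += timedelta(days=1)
    return total


friday = datetime(2024, 3, 1, 17, 0)  # a Friday, 17:00
monday = datetime(2024, 3, 4, 9, 0)   # the following Monday, 09:00

print((monday - friday).total_seconds() / 3600)                    # 64.0
print(working_seconds_between(friday, monday, WorkSchedule()))     # 0.0
```

Routing the chart, the alert threshold, and the bottleneck signal through this one function is what keeps them in agreement; a real implementation would also need to convert UTC timestamps into the tenant's timezone first.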

60-second check

Pick a ticket whose status duration spans a weekend or a holiday

They're easy to find with a Jira filter for "resolved on Monday, created Friday or earlier." Compare the plugin's reported duration against what your team could have actually worked (business hours only, weekends and holidays excluded). If the plugin reports the wall-clock duration — weekends and off-hours fully counted — every cycle-time number you've shown a stakeholder has been inflated by non-working hours, and the inflation is asymmetric across statuses (passive statuses accumulate more than active ones).

External waits attributed to the team

A ticket sits in "Waiting for Customer" for eight days. The customer is sitting on a question the team asked, or a contract review, or a procurement approval. The team has nothing they can do.

A plugin that scores every status uniformly will see "Waiting for Customer" as a high-time-in-status outlier and either name it as the bottleneck (technically correct, operationally useless, because the team cannot fix it) or, in plugins that don't surface a single answer, simply attribute those eight days to the team's cycle time. The dashboard reads "median cycle time: 13 days." It does not read "median cycle time: 5 days, plus 8 days waiting on the customer." The team gets the throughput question in the retrospective. The team has nothing to say about it.

This pattern is universal anywhere a team has external dependencies — customer approvals, vendor responses, legal reviews, security reviews, procurement, third-party integrations. It compounds particularly hard on B2B teams whose work is gated by counterparty timelines. The team gets blamed in standup for time waiting on the world.

The fix is to distinguish external-blocking statuses from team-controllable ones, and to model the distinction correctly. Filtering external-blocking statuses out of the time slices entirely is the wrong fix — the data is still legitimate (yes, this ticket spent eight days waiting; the data should reflect that). The right shape is to preserve the slice everywhere it's a fact (per-issue history, charts, CSV export) and only skip it from the attribution step. The question "where is the team's controllable bottleneck?" deserves an honest answer that ignores statuses the team can't act on.
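Here is one way that distinction could look in code. It is a sketch, not a description of any shipping implementation: the slice tuples, the EXTERNAL_BLOCKING set, and the function names are all hypothetical. The point is that the eight days stay in the data and only the attribution step skips them.

```python
from collections import defaultdict

# Hypothetical slices for one ticket: (status, days_in_status).
slices = [
    ("In Progress", 3.0),
    ("Waiting for Customer", 8.0),
    ("In Review", 2.0),
]

# Hypothetical per-project setting naming the statuses the team cannot act on.
EXTERNAL_BLOCKING = {"Waiting for Customer"}


def totals_by_status(slices):
    """Facts: every status keeps its real duration, external or not."""
    totals = defaultdict(float)
    for status, days in slices:
        totals[status] += days
    return dict(totals)


def cycle_time_split(slices, external_blocking):
    """Report team-controllable and external time side by side."""
    external = sum(d for s, d in slices if s in external_blocking)
    team = sum(d for s, d in slices if s not in external_blocking)
    return team, external


def controllable_bottleneck(slices, external_blocking):
    """Attribution: name the longest status the team can actually act on."""
    totals = totals_by_status(slices)
    candidates = {s: d for s, d in totals.items() if s not in external_blocking}
    return max(candidates, key=candidates.get) if candidates else None


print(totals_by_status(slices)["Waiting for Customer"])    # 8.0, still recorded
print(cycle_time_split(slices, EXTERNAL_BLOCKING))         # (5.0, 8.0)
print(controllable_bottleneck(slices, EXTERNAL_BLOCKING))  # In Progress
```

The 13-day total is still recoverable as 5 plus 8, but the bottleneck question is answered from the 5.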

60-second check

Look at your last three months of data and pick the longest-duration status

If it's a status where work is paused on a third party (Waiting for Customer, Blocked: Vendor Response, In External Review, Awaiting Approval), check whether the plugin's bottleneck card and cycle-time aggregate exclude it from the team's attribution. If the plugin has no per-tenant configuration for "external-blocking" or "pause" statuses at all, every retrospective in which throughput was named the action item was working from numbers that mixed team-controllable time with time you couldn't influence. The team has been getting the blame in standup for time waiting on the world.

Each of these is small individually. The combination is not. On any team with a non-trivial workflow history, distributed timezones, or external dependencies, the time-in-status report drifts away from the changelog-and-business-context truth steadily and silently. Six months in, your dashboard is fiction. The team has been triaging fiction for two quarters.

How to check your own data

The audit is mechanical and takes 15 minutes. If your plugin passes, you're fine. If it doesn't, you've been triaging on drifted numbers.

Pick a closed ticket from last quarter. Any closed ticket.
Open it in Jira and click History. Scroll through all the status changes. Write down each status name with its start time, end time, and computed duration.
Pull up the same ticket in your time-in-status plugin's per-ticket view (if it has one), or the aggregate report for its sprint/window.
Compare. Specifically:
- Do the per-status durations match your manual computation?
- Does the sum of durations equal (resolved_at - created_at)?
- Did the plugin lose any status the ticket passed through?
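If you would rather script the comparison than eyeball it, the three checks can be sketched as below. The function name, the dictionary shapes, and the sample numbers are assumptions for illustration; the 5% tolerance is borrowed from the transition-reversal check earlier in this post.

```python
from datetime import datetime, timedelta


def audit_ticket(manual, plugin, created_at, resolved_at, tolerance=0.05):
    """Compare hand-computed per-status durations against the plugin's.

    manual and plugin map status name -> timedelta.
    Returns a list of findings; an empty list means the ticket passes.
    """
    findings = []
    for status, expected in manual.items():
        reported = plugin.get(status)
        if reported is None:
            findings.append(f"missing status: {status}")
        elif abs(reported - expected) > tolerance * expected:
            findings.append(f"duration mismatch: {status}")
    if sum(plugin.values(), timedelta()) != resolved_at - created_at:
        findings.append("sum of durations != resolved_at - created_at")
    return findings


created = datetime(2024, 3, 4, 9, 0)
resolved = created + timedelta(days=4)

# Durations written down from the History panel (hypothetical numbers).
manual = {
    "To Do": timedelta(days=1),
    "In Progress": timedelta(days=2),
    "Blocked": timedelta(hours=3),
    "In Review": timedelta(hours=21),
}
# What a plugin that absorbed "Blocked" into "In Progress" might report.
plugin = {
    "To Do": timedelta(days=1),
    "In Progress": timedelta(days=2, hours=3),
    "In Review": timedelta(hours=21),
}

print(audit_ticket(manual, plugin, created, resolved))
# ['duration mismatch: In Progress', 'missing status: Blocked']
```

Note that the plugin's total still equals the full four days in this example, so a sum check alone would have passed it; the per-status comparison is what catches the absorbed interval.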

What you'll likely find for tickets with any real history:

Per-status durations off by 10–60% for tickets that bounced or had brief "Blocked" intervals.
Older tickets — those from before any workflow change — sometimes show zero time in stages they obviously moved through.
Sum of per-status durations doesn't equal total cycle time for tickets that ever touched a since-renamed status.
Tickets that sat across weekends or holidays report cycle times the team couldn't possibly have worked — the wall clock is treated as the work clock, weekends and off-hours fully counted.
Tickets that passed through statuses like "Waiting for Customer" or "Blocked: Vendor Response" inflate the team's cycle time with periods the team had no agency over — the dashboard scores the team for time spent waiting on the world.

If your plugin passes the audit, great. It's reading the changelog. If it doesn't, the question is what to do about it.

Why this happens

A brief, non-blaming explanation, because the shortcut is rational under early constraints — it's just incomplete.

Jira's REST API exposes /changelog as a separate, paginated, heavier-weight endpoint compared to the cheap /search and /issue calls that return current_status. For an early-stage plugin, the shortcut works on small datasets and gets you to "looks right" demo state fast. The drift only appears at scale, over time, after workflow changes — by which point the plugin has shipped and its computation pattern is locked in.

Rebuilding the computation to be changelog-truthful isn't a refactor. It's a re-architecture. The plugin's storage schema, query layer, sync pipeline, and aggregation logic all change. So most don't.

It's not malice. It's not laziness. It's an architectural commitment made early that the plugin can't walk back without breaking every customer's historical dashboard.

So they don't.

The alternative architecture

What changelog-truth actually looks like in mechanical terms. This is not marketing copy; it's a description of how the computation differs.

On install, on schedule, and on webhook fires, fetch each issue's full /changelog. Replay every transition — from_status, to_status, transition timestamp. Compute, per ticket, an ordered list of (status, entered_at, exited_at) triples covering exactly [created_at, resolved_at_or_now] with no gaps and no overlaps. Aggregate at query time, not at write time.
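The replay step itself is short. The sketch below is one possible shape for it, with hypothetical names throughout (Slice, build_slices, covers_exactly) and one stated assumption: the ticket's initial status is known, for example from the from-side of its first status transition. Aggregation is left to query time, as described above.

```python
from datetime import datetime
from typing import NamedTuple


class Slice(NamedTuple):
    status: str
    entered_at: datetime
    exited_at: datetime


def build_slices(created_at, initial_status, transitions, end_at):
    """Replay (timestamp, from_status, to_status) transitions into ordered slices.

    end_at is resolved_at for closed tickets, or the current time for open ones.
    """
    slices = []
    status, entered = initial_status, created_at
    for ts, from_status, to_status in sorted(transitions, key=lambda t: t[0]):
        if from_status != status:
            raise ValueError(f"changelog gap: expected {status!r}, got {from_status!r}")
        slices.append(Slice(status, entered, ts))
        status, entered = to_status, ts
    slices.append(Slice(status, entered, end_at))
    return slices


def covers_exactly(slices, start, end):
    """True when the slices tile [start, end] with no gaps and no overlaps."""
    if not slices or slices[0].entered_at != start or slices[-1].exited_at != end:
        return False
    return all(a.exited_at == b.entered_at for a, b in zip(slices, slices[1:]))


created = datetime(2024, 3, 4, 9, 0)
now = datetime(2024, 3, 4, 17, 0)
transitions = [
    (datetime(2024, 3, 4, 10, 0), "To Do", "In Progress"),
    (datetime(2024, 3, 4, 14, 0), "In Progress", "In Review"),
    (datetime(2024, 3, 4, 15, 30), "In Review", "In Progress"),
]

slices = build_slices(created, "To Do", transitions, now)
print([s.status for s in slices])
# ['To Do', 'In Progress', 'In Review', 'In Progress']
print(covers_exactly(slices, created, now))  # True
```

Because each slice starts where the previous one ended, the no-gaps and no-overlaps property holds by construction; the explicit check is there to catch inconsistent input rather than to patch it.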

Properties that fall out of this architecture for free:

Status renames don't break aggregates. Each changelog entry records the status ID alongside the name in effect at the time of the transition, and the ID survives a rename. Renames don't rewrite history; they just change the label future transitions carry, so aggregates keyed by ID stay whole.
Workflow restructures don't drift. Historical tickets retain their actual historical path; new tickets follow the new workflow. They never need to be reconciled.
Manual reverts are visible. A "Done" → "In Progress" → "Done" sequence shows up as two transitions and gets attributed correctly. The bounce shows in the data exactly as it happened.
Multiple-in-fast-succession transitions are individually preserved. A nine-minute "Blocked" interval gets nine minutes counted against "Blocked," not absorbed into a neighbor.
Re-running the sync is idempotent. Same data in, same answer out, regardless of when you ran it.

Two more properties don't come "for free" from the changelog replay — they come from layering the right configuration on top of it, but they're available only to architectures that already compute every duration through a single per-slice helper.

Working-time math by construction. Every duration goes through one helper (call it working_seconds_between(start, end, schedule)) that honors a per-tenant work schedule: timezone, working days, work hours, holidays. The bottleneck card's time signal, cycle-time charts, alert thresholds, and trends all use the same math. The Friday 17:00 to Monday 09:00 interval from earlier, 64 hours on the wall clock, yields zero working seconds under a Mon–Fri 9–17 schedule; the alert threshold and the chart agree because they're reading from the same computation.
External-blocking statuses preserved as facts, excluded from attribution. A status marked as external-blocking (Waiting for Customer, Blocked: Vendor Response, In External Review) continues to record its actual duration in the time slice. Per-issue history shows it. CFD bands show it. CSV exports show it. Only the attribution step — the step that names a single bottleneck — skips it. The question "where is the team's controllable bottleneck?" gets the answer it deserves, while the question "how long did this ticket actually spend in each status?" gets answered honestly elsewhere.

The trade-off is real and worth naming. Replay-the-changelog computation is more expensive than current_status plus timestamp arithmetic. The first-time backfill for a 50,000-issue site is hours, not seconds. Subsequent syncs are cheap, but the cold start is not. That's the cost.

A second, subtler trade-off is worth naming because most plugins fail it: when settings change (a work schedule activated, an external-blocking status added, a workflow restructured), the historical data needs to be replayed under the new rules. Otherwise the chart either shows a discontinuity at the activation date or, worse, silently blends two math models. Most plugins don't replay: they leave old slices computed under old rules and new slices computed under new rules in the same chart. The result is a quiet, hard-to-spot variant of the same drift problem this post is about, just induced by configuration changes instead of by workflow churn. The correct architecture is to asynchronously recompute every historical slice whenever a schedule is activated, edited, or disabled, with a visible progress banner during the few minutes it takes ("Recomputing metrics with your new work schedule… 47% complete. Numbers may temporarily blend until this finishes."). The honest engineering is the banner. The lie would be the silent blend.

The benefit of all of this is that the answer is right.

A side note on AI

A growing number of analytics plugins now layer AI on top of their metrics — generated summaries of "what's happening with your flow this sprint," AI-suggested action items, conversational dashboards. These work cleanly on top of correct numbers. They break in subtle ways on top of drifted numbers, because the AI is fluent and the wrong answer reads as confident.

The architectural commitment that matters: AI translates already-correct numbers into language. It does not compute or smooth them. If your plugin's numbers are drifting and the AI is summarizing the drift in plain English, you have a fluent and confident wrong answer instead of an obvious one. The right line is: AI explains; data is computed deterministically from the changelog. Not the other way around.

The reveal

I'll lay my cards down. I built BottleneckIQ because I needed a flow-analytics tool that wasn't lying to me, and the existing options either lied (the time-in-status family) or cost $20–60/user/month (the dedicated value-stream tools). BottleneckIQ derives every metric from the Jira issue changelog. AI never touches numbers — it writes the one English sentence that explains the named bottleneck. Same data, same answer, every run.

Beyond the changelog-replay foundation: a per-tenant work schedule (timezone, working days, work hours, holidays) is applied to every duration so the numbers honor business hours, with an async historical recompute on activation so the charts never blend old calendar-time and new working-time math. External-blocking statuses can be configured per project so the bottleneck card stops naming the team for time spent waiting on customers, vendors, or external reviews. A per-issue panel lives inside the Jira issue view with external-blocking markers on each status slice. The CSV export emits one row per status slice with external_blocking and is_terminal columns, so the math can be audited in Excel.

It's a 30-day free trial. From $10/month for teams up to 10 users, $3/user beyond. The architecture is the one I described in the previous section, made concrete in a Forge plugin that lives on every Jira project page and on every Jira issue view.

I'd rather you audit your existing plugin against the steps in the section above and conclude that you're fine than have you take my word for it. If you do find drift, you know where to find me.

The bigger point

The reason this matters isn't "your plugin is wrong by a factor of two." The reason it matters is that engineering decisions made on drifting data calcify into wrong process, wrong retrospectives, wrong sprint planning. A team that's been triaging on bad cycle-time numbers for two quarters has spent two quarters fixing the wrong stages. The cost of that error compounds in ways the dashboard will never surface — because the dashboard is the thing producing the error.

The minimum bar for a tool that informs engineering process is that its numbers don't lie. Most don't meet it. Audit yours.

If this resonated, BottleneckIQ is on the Atlassian Marketplace with a 30-day free trial. Same architecture as described above. If you ran the audit on your current plugin and found drift, I'd love to hear what you found — francisco@bottleneckiq.com.