Exercise 20.3
Tune the Alarm -- The Staleness Threshold
The alert thresholds (2 days stale, 10%/mo decline, 85% uptime) decide how many trucks roll. Sweep the decline threshold from 5% to 25% and report how many wells it flags at each setting. Find the threshold that flags exactly the wells with a real problem (not normal decline noise), and explain the cost of setting it too low (false alarms) versus too high (a watering-out well missed for a month).
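The decline sweep described above can be sketched on a toy scorecard. This is a minimal illustration, not the chapter's engine: the decline figures are invented, `toy_kpis` and `sweep_decline` are hypothetical names, and only the `decline_30d_pct` column name is borrowed from the scorecard used in this chapter.

```python
import pandas as pd

# Hypothetical scorecard: these decline values are invented for illustration.
toy_kpis = pd.DataFrame({
    "well": ["A", "B", "C", "D"],
    "decline_30d_pct": [2.0, 4.5, 12.0, 31.0],
})

def sweep_decline(kpis, thresholds=(5, 10, 15, 20, 25)):
    """Count wells whose 30-day decline (percent) exceeds each threshold."""
    return {int(t): int((kpis["decline_30d_pct"] > t).sum()) for t in thresholds}

decline_sweep = sweep_decline(toy_kpis)
print(decline_sweep)  # {5: 2, 10: 2, 15: 1, 20: 1, 25: 1}
```

In this toy data the count holds at one well from 15% to 25%. A plateau like that is one sign that the threshold sits in a band separating a single outlier from ordinary decline noise.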
---
The staleness alarm decides how many wells land on the morning triage list. Set it too tight and a one-day telemetry lag pages someone; set it too loose and a genuinely dead feed goes unnoticed. This exercise sweeps the threshold and counts the cost.
The verified engine (make_field, well_kpis) is embedded under a do-not-edit banner. Write one function:
def sweep_stale(kpis, thresholds=(0, 1, 2, 3)):
    """How many wells the staleness alarm flags at each threshold.
    A well is flagged when days_stale > threshold. Returns {threshold: count}."""
Exact procedure: for each t in thresholds, count the wells with days_stale > t; return a dict {int(t): int(count)}.
At module level: FIELD = make_field(), KPIS = well_kpis(FIELD), sweep = sweep_stale(KPIS).
Expose: sweep_stale, sweep.
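The comparison is strict, so a well whose days_stale equals the threshold is not flagged. A small sketch of that boundary behaviour, using an invented three-well frame (`toy` and `count_flagged` are hypothetical names, not part of the embedded engine):

```python
import pandas as pd

# Hypothetical staleness values chosen to sit on the threshold boundaries.
toy = pd.DataFrame({"well": ["W1", "W2", "W3"], "days_stale": [0, 1, 3]})

def count_flagged(kpis, t):
    """Number of wells with days_stale strictly greater than t."""
    return int((kpis["days_stale"] > t).sum())

counts = {t: count_flagged(toy, t) for t in (0, 1, 2, 3)}
print(counts)  # {0: 2, 1: 1, 2: 1, 3: 0}
```

W3 is three days stale, so it is still flagged at a threshold of 2 and drops out at a threshold of 3.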
> Think about it: at a threshold of 0 days the alarm flags 4 of 6 wells, but three of those merely reported a day late, which is normal telemetry lag, not a problem. At 2 days it flags exactly one: OD-001, whose feed has genuinely gone quiet for three days. The jump from 4 to 1 is the cost of a too-tight threshold: three false alarms every morning until the foreman stops trusting the screen. What is the cost of setting it the other way, at 5 days?
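One way to make the trade-off concrete is to split each threshold's flags into false alarms and misses. The sketch below assumes staleness values consistent with the note above (OD-001 three days stale, three wells one day late, two current). Which of the other wells are the late ones is an assumption made for illustration, and `alarm_cost` is a hypothetical helper.

```python
import pandas as pd

# Assumed staleness values: OD-001 at 3 days follows the note above; which of
# the remaining wells are one day late is an arbitrary choice for illustration.
toy = pd.DataFrame({
    "well": ["OD-001", "OD-002", "OD-003", "OD-004", "OD-005", "OD-006"],
    "days_stale": [3, 1, 1, 1, 0, 0],
})
real_problem = {"OD-001"}  # the only feed that has genuinely gone quiet

def alarm_cost(kpis, t, truth):
    """Split the wells flagged at threshold t into false alarms and misses."""
    flagged = set(kpis.loc[kpis["days_stale"] > t, "well"])
    return {
        "flagged": len(flagged),
        "false_alarms": len(flagged - truth),
        "missed": len(truth - flagged),
    }

costs = {t: alarm_cost(toy, t, real_problem) for t in (0, 1, 2, 3, 5)}
for t, c in costs.items():
    print(t, c)
```

Under these assumptions a threshold of 0 produces three false alarms and no misses, while a threshold of 5 produces no alarms at all and misses the one dead feed.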
Reference solution
Try solving it yourself first. The solution below is one valid approach; yours may differ and still be correct.
import numpy as np
import pandas as pd
# ── Verified Chapter 20 field surveillance engine (do not edit) ──────────
# Per-well profiles: (id, qi, annual Di, b, problem). The field is mostly healthy;
# three wells carry the problems a morning surveillance check exists to catch.
WELLS = [
    ("OD-001", 1500, 0.22, 0.6, "stale"),     # feed stopped reporting 3 days ago
    ("OD-002", 2100, 0.30, 0.7, None),
    ("OD-003", 2600, 0.28, 0.8, "decline"),   # recent step drop -> steep 30-day decline
    ("OD-004", 1800, 0.20, 0.5, None),        # the steady earner
    ("OD-005", 1200, 0.35, 0.7, "downtime"),  # frequent outages -> low uptime
    ("OD-006", 2000, 0.26, 0.6, None),
]
DAYS = 730
def make_field(seed=11):
    """A field's DAILY production surveillance feed (long format). Sensor noise,
    real outages, and three planted problems -- a stale feed, a steep decline, and
    a low-uptime well -- the raw stream a monitoring dashboard sits on top of."""
    rng = np.random.default_rng(seed)
    rows = []
    for wid, qi, Di_yr, b, problem in WELLS:
        Di = Di_yr / 365.0
        t = np.arange(DAYS)
        q = qi / np.power(1 + b * Di * t, 1.0 / b)
        q = q * rng.normal(1.0, 0.03, DAYS)
        if problem == "decline":  # a pressure/liquid-loading hit: -45% over last 40 d
            q[-40:] *= np.linspace(1.0, 0.55, 40)
        if problem == "downtime":  # chronic intermittent producer
            q[rng.random(DAYS) < 0.20] = 0.0  # ~20% of days down -> ~80% uptime
        else:
            for _ in range(rng.integers(2, 5)):  # occasional multi-day outage
                s = rng.integers(0, DAYS - 6)
                q[s:s + rng.integers(1, 6)] = 0.0
        last = DAYS - (3 if problem == "stale" else rng.integers(0, 2))
        q = np.maximum(q[:last], 0.0)
        rows.append(pd.DataFrame({"well": wid, "day": np.arange(len(q)), "oil_bopd": q}))
    return pd.concat(rows, ignore_index=True)
def well_kpis(field):
    """The surveillance scorecard: one row per well, the numbers a foreman reads."""
    asof = field.day.max()
    out = []
    for w, g in field.groupby("well"):
        g = g.sort_values("day")
        rate = g.oil_bopd.values
        prod = rate > 0
        last_rate = float(rate[prod][-1]) if prod.any() else 0.0
        recent, prior = rate[-30:], rate[-60:-30]
        a_recent = recent[recent > 0].mean() if (recent > 0).any() else 0.0
        a_prior = prior[prior > 0].mean() if (prior > 0).any() else np.nan
        decl = (a_prior - a_recent) / a_prior * 100 if a_prior and not np.isnan(a_prior) else 0.0
        out.append(dict(well=w, last_rate=round(last_rate, 1), decline_30d_pct=round(float(decl), 1),
                        uptime_pct=round(float(prod.mean() * 100), 1),
                        cum_mbbl=round(rate.sum() / 1000, 1), days_stale=int(asof - g.day.max())))
    return pd.DataFrame(out)
def downsample(day, rate, n_buckets=120):
    """min/max decimation: per bucket keep BOTH the lowest and highest sample, so a
    spike or a zero (outage) is never averaged away. Returns reduced (day, rate)."""
    if len(day) <= 2 * n_buckets:
        return day, rate
    keep = []
    for chunk in np.array_split(np.arange(len(day)), n_buckets):
        r = rate[chunk]
        keep.append(chunk[r.argmin()])
        keep.append(chunk[r.argmax()])
    keep = np.unique(keep)
    return day[keep], rate[keep]
# ── end do-not-edit ───────────────────────────────────────────
def sweep_stale(kpis, thresholds=(0, 1, 2, 3)):
    """How many wells the staleness alarm flags at each threshold (days_stale > t)."""
    return {int(t): int((kpis.days_stale > t).sum()) for t in thresholds}
FIELD = make_field()
KPIS = well_kpis(FIELD)
sweep = sweep_stale(KPIS)
print("wells flagged by staleness threshold:", sweep)