The Cost-Weighted Threshold - Pricing Missed Gas vs False Alarms

Level 2

Chapter 17: Supervised Learning

descriptionProblem

Using the trained forest's gas-sand probabilities, find the threshold that maximises gas recall while keeping precision at or above 0.60. Then suppose a missed gas zone costs \ $2M in bypassed reserves and a false alarm costs \$ 50k for a confirmation log. Write the expected-cost expression as a function of threshold over the test set, find the cost-minimising threshold, and state it. Does it match the recall-maximising one, and why might an asset manager still overrule it?

---

The chapter leaves the threshold as a discussion: "which way to lean is an engineering decision." Here you turn that judgement into a number you can defend in a partner meeting by attaching a dollar cost to each kind of mistake and letting the cost pick the operating point.

On OML 58 the asset team prices the two error types like this:

A missed gas foot (true gas the model fails to flag) bypasses pay worth

COST_MISS = 2_000_000 dollars.

A false gas call (a flagged foot that is not gas) costs one more

confirmation log: COST_FALSE_ALARM = 50_000 dollars.

A missed pay zone is 40× costlier than a false alarm, so the break-even threshold lives well below the default 0.5. The verified make_facies_dataset generator, the FACIES / CENTROIDS constants, and the trained forest are embedded for you. Do not modify them or re-derive the physics. The setup is already run at module scope:

rock = make_facies_dataset()
X = rock[['GR','RHOB','NPHI','RT']].values
y = rock['facies'].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

GAS = 2 is the gas-sand class index.

Your tasks:

Write cost(threshold, gas_proba, is_gas) -> float:

gas_proba is the forest's gas-sand probability for every test foot

(forest.predict_proba(X_test)[:, 2], a 1-D array).

is_gas is the boolean truth (y_test == GAS).
Flag every foot with gas_proba >= threshold.
missed_gas = count of true-gas feet that were not flagged.
false_alarms = count of flagged feet that are not gas.
Return float(COST_MISS missed_gas + COST_FALSE_ALARM false_alarms).

Write best_threshold(gas_proba, is_gas, grid=None) -> (float, float):

If grid is None, use grid = np.round(np.arange(0.02, 0.601, 0.01), 2).
Evaluate cost(t, gas_proba, is_gas) over every t in grid.
Return (best_t, best_cost) for the cost-minimising t. On ties pick

the lowest t (use np.argmin, which returns the first minimum).

At module scope, compute exactly these output variables:

``python gas_proba = forest.predict_proba(X_test)[:, 2] is_gas = (y_test == GAS) best_t, best_cost = best_threshold(gas_proba, is_gas) cost_at_half = cost(0.5, gas_proba, is_gas) ``

> Think about it: the default 0.5 cut-off leaves cost_at_half in the tens > of millions of dollars, almost all of it bypassed pay, because the forest > only catches about half the gas at 0.5. The cost-minimising best_t lands far > below 0.5 (down in the low-precision / high-recall band) and slashes the cost > by roughly 6×. Why do these asymmetric costs push the threshold down, and why > does the optimum stop short of zero instead of flagging every foot as gas?

lightbulbHints (0/3)

Stuck? Reveal hints one at a time — they progress from nudge to near-solution.

codeYour solution

main.py

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# ── Verified facies generator + constants (do not edit) ──────────────────
FACIES = {0: "Shale", 1: "Brine Sand", 2: "Gas Sand", 3: "Siltstone"}

# Each facies' centre in log space (GR, RHOB, NPHI, ln RT) and its scatter (the *_sd
# terms). Resistivity is centred on a LOG scale because it ranges over decades. Defined
# once and shared by the training-set and new-well generators below, which guarantees the
# rock physics is identical in training and inference.
CENTROIDS = {0: (108, 2.51, 0.33, np.log(3.5),  16, 0.080, 0.045, 0.40),
             1: (46,  2.28, 0.27, np.log(9.0),   18, 0.075, 0.045, 0.45),
             2: (43,  2.21, 0.21, np.log(15.0),  18, 0.085, 0.050, 0.55),
             3: (74,  2.40, 0.29, np.log(7.0),   21, 0.090, 0.050, 0.48)}

def make_facies_dataset(n=1500, seed=11):
    """Self-contained labelled facies set for OML 58: four logs -> rock type.

Each facies has a centroid in log space (GR, RHOB, NPHI, log-RT) and realistic
    scatter. The gas sand deliberately overlaps the brine sandstone - same low GR
    and high porosity - separated only by its lower density and neutron and higher
    resistivity. That overlap, plus its rarity (7% of the section), is what makes it
    the hard class, exactly as in the field.
    """
    rng = np.random.default_rng(seed)
    facies = rng.choice([0, 1, 2, 3], size=n, p=[0.42, 0.33, 0.07, 0.18])  # imbalanced
    GR = np.zeros(n); RHOB = np.zeros(n); NPHI = np.zeros(n); RT = np.zeros(n)
    for f, (gr, rhob, nphi, lnrt, grs, rhs, nps, rts) in CENTROIDS.items():
        m = facies == f; k = int(m.sum())
        GR[m]   = gr + rng.normal(0, grs, k)
        RHOB[m] = rhob + rng.normal(0, rhs, k)
        NPHI[m] = np.clip(nphi + rng.normal(0, nps, k), 0, 0.55)
        RT[m]   = np.clip(np.exp(lnrt + rng.normal(0, rts, k)), 0.3, 600)
    return pd.DataFrame({"GR": np.round(GR, 1), "RHOB": np.round(RHOB, 3),
                         "NPHI": np.round(NPHI, 3), "RT": np.round(RT, 2),
                         "facies": facies})

# ── Cost constants + trained forest (do not edit) ────────────────────────
COST_MISS = 2_000_000        # dollars per missed gas foot (bypassed pay)
COST_FALSE_ALARM = 50_000    # dollars per false gas call (one more confirmation log)
GAS = 2                      # gas-sand class index

rock = make_facies_dataset()
X = rock[["GR", "RHOB", "NPHI", "RT"]].values
y = rock["facies"].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
# n_estimators=200 keeps the grader under its time budget; forest is deterministic.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

def cost(threshold, gas_proba, is_gas):
    """Dollar cost of calling gas at `threshold`.

Flag every foot with gas_proba >= threshold, then charge:
      COST_MISS        per true-gas foot NOT flagged (missed pay), and
      COST_FALSE_ALARM per flagged foot that is NOT gas (false alarm).
    Returns the total as a float.
    """
    # TODO: flagged = gas_proba >= threshold
    # TODO: missed_gas = int(np.sum(is_gas & ~flagged))
    # TODO: false_alarms = int(np.sum(flagged & ~is_gas))
    # TODO: return float(COST_MISS * missed_gas + COST_FALSE_ALARM * false_alarms)
    pass

def best_threshold(gas_proba, is_gas, grid=None):
    """Cost-minimising threshold over a grid.

Returns (best_t, best_cost) for the t that MINIMISES cost(t, ...).
    On ties pick the lowest t (np.argmin returns the first minimum).
    """
    if grid is None:
        grid = np.round(np.arange(0.02, 0.601, 0.01), 2)
    # TODO: costs = np.array([cost(t, gas_proba, is_gas) for t in grid])
    # TODO: i = int(np.argmin(costs))
    # TODO: return float(grid[i]), float(costs[i])
    pass

gas_proba = forest.predict_proba(X_test)[:, 2]   # P(Gas Sand) for every test foot
is_gas = (y_test == GAS)

# TODO: best_t, best_cost = best_threshold(gas_proba, is_gas)
# TODO: cost_at_half = cost(0.5, gas_proba, is_gas)
best_t = None
best_cost = None
cost_at_half = None

print("cost at default 0.5 ($):", cost_at_half)
print("best threshold:", best_t, "  best cost ($):", best_cost)

visibilityReveal reference solutionexpand_more

Try solving it yourself first — the hints walk you through it. The solution below is one valid approach; yours may differ and still be correct.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier


# ── Verified facies generator + constants (do not edit) ──────────────────
FACIES = {0: "Shale", 1: "Brine Sand", 2: "Gas Sand", 3: "Siltstone"}

# Each facies' centre in log space (GR, RHOB, NPHI, ln RT) and its scatter (the *_sd
# terms). Resistivity is centred on a LOG scale because it ranges over decades. Defined
# once and shared by the training-set and new-well generators below, which guarantees the
# rock physics is identical in training and inference.
CENTROIDS = {0: (108, 2.51, 0.33, np.log(3.5),  16, 0.080, 0.045, 0.40),
             1: (46,  2.28, 0.27, np.log(9.0),   18, 0.075, 0.045, 0.45),
             2: (43,  2.21, 0.21, np.log(15.0),  18, 0.085, 0.050, 0.55),
             3: (74,  2.40, 0.29, np.log(7.0),   21, 0.090, 0.050, 0.48)}

def make_facies_dataset(n=1500, seed=11):
    """Self-contained labelled facies set for OML 58: four logs -> rock type.

    Each facies has a centroid in log space (GR, RHOB, NPHI, log-RT) and realistic
    scatter. The gas sand deliberately overlaps the brine sandstone - same low GR
    and high porosity - separated only by its lower density and neutron and higher
    resistivity. That overlap, plus its rarity (7% of the section), is what makes it
    the hard class, exactly as in the field.
    """
    rng = np.random.default_rng(seed)
    facies = rng.choice([0, 1, 2, 3], size=n, p=[0.42, 0.33, 0.07, 0.18])  # imbalanced
    GR = np.zeros(n); RHOB = np.zeros(n); NPHI = np.zeros(n); RT = np.zeros(n)
    for f, (gr, rhob, nphi, lnrt, grs, rhs, nps, rts) in CENTROIDS.items():
        m = facies == f; k = int(m.sum())
        GR[m]   = gr + rng.normal(0, grs, k)
        RHOB[m] = rhob + rng.normal(0, rhs, k)
        NPHI[m] = np.clip(nphi + rng.normal(0, nps, k), 0, 0.55)
        RT[m]   = np.clip(np.exp(lnrt + rng.normal(0, rts, k)), 0.3, 600)
    return pd.DataFrame({"GR": np.round(GR, 1), "RHOB": np.round(RHOB, 3),
                         "NPHI": np.round(NPHI, 3), "RT": np.round(RT, 2),
                         "facies": facies})


# ── Cost constants + trained forest (do not edit) ────────────────────────
COST_MISS = 2_000_000        # dollars per missed gas foot (bypassed pay)
COST_FALSE_ALARM = 50_000    # dollars per false gas call (one more confirmation log)
GAS = 2                      # gas-sand class index

rock = make_facies_dataset()
X = rock[["GR", "RHOB", "NPHI", "RT"]].values
y = rock["facies"].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
# n_estimators=200 keeps the grader under its time budget; forest is deterministic.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)


def cost(threshold, gas_proba, is_gas):
    """Dollar cost of calling gas at `threshold`.

    Flag every foot with gas_proba >= threshold, then charge:
      COST_MISS        per true-gas foot NOT flagged (missed pay), and
      COST_FALSE_ALARM per flagged foot that is NOT gas (false alarm).
    Returns the total as a float.
    """
    gas_proba = np.asarray(gas_proba)
    is_gas = np.asarray(is_gas, dtype=bool)
    flagged = gas_proba >= threshold
    missed_gas = int(np.sum(is_gas & ~flagged))
    false_alarms = int(np.sum(flagged & ~is_gas))
    return float(COST_MISS * missed_gas + COST_FALSE_ALARM * false_alarms)


def best_threshold(gas_proba, is_gas, grid=None):
    """Cost-minimising threshold over a grid.

    Returns (best_t, best_cost) for the t that MINIMISES cost(t, ...).
    On ties pick the lowest t (np.argmin returns the first minimum).
    """
    if grid is None:
        grid = np.round(np.arange(0.02, 0.601, 0.01), 2)
    costs = np.array([cost(t, gas_proba, is_gas) for t in grid])
    i = int(np.argmin(costs))
    return float(grid[i]), float(costs[i])


gas_proba = forest.predict_proba(X_test)[:, 2]   # P(Gas Sand) for every test foot
is_gas = (y_test == GAS)

best_t, best_cost = best_threshold(gas_proba, is_gas)
cost_at_half = cost(0.5, gas_proba, is_gas)

print("cost at default 0.5 ($):", cost_at_half)
print("best threshold:", best_t, "  best cost ($):", best_cost)

lockCopying code is a Full Access feature.

arrow_back

16.4 Honest vs. Dishonest Evaluation - Data Leakage, Held-Out Test, and Cross-Validation

17.2 Reading a Confusion Matrix - The Most-Confused Facies Pair

arrow_forward