Reading a Confusion Matrix - The Most-Confused Facies Pair

Level 2

Chapter 17: Supervised Learning

descriptionProblem

Train both the logistic baseline and the random forest, and print each one's confusion matrix and per-class F1. Which facies pair is most often confused by each model, and is it the same pair? Explain, from the density-neutron crossplot, why that confusion is petrophysically reasonable rather than a model bug.

---

The graded task is the heart of that exercise: turn a confusion matrix into a single, actionable fact: which facies does each model most often confuse?

The verified OML-58 facies generator (FACIES, CENTROIDS, make_facies_dataset) is embedded for you under a do-not-edit banner, along with the exact train/test split and the two models the chapter builds:

rock = make_facies_dataset()                    # n=1500, seed=11
X = rock[["GR", "RHOB", "NPHI", "RT"]].values
y = rock["facies"].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000)).fit(X_train, y_train)
forest   = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

(The chapter uses 300 trees; we use 200 here to stay inside the grader's time budget; the most-confused pair is unchanged.)

Your task:

Write most_confused_pair(model, X_test, y_test) -> (str, str):

Compute cm = confusion_matrix(y_test, model.predict(X_test)) over the

four classes 0..3.

Zero out the diagonal (the correct calls) so only mistakes remain:

off = cm - np.diag(np.diag(cm)).

Find the single largest off-diagonal cell with

np.unravel_index(np.argmax(off), off.shape). That gives (true_idx, pred_idx).

Return (FACIES[true_idx], FACIES[pred_idx]), a

(true_name, predicted_name) tuple naming the most-often-confused pair. On ties, argmax picks the first (row-major) cell; that is fine.

Call your helper on both fitted models and bind the outputs the tests read:

baseline_pair = most_confused_pair(baseline, X_test, y_test)
forest_pair   = most_confused_pair(forest,   X_test, y_test)
same_pair     = (baseline_pair == forest_pair)

> Think about it: both the linear baseline and the forest trip on the same > ambiguous pair: the model keeps reading one facies as another. Look at the > density–neutron crossplot in the chapter: the two facies overlap in log space, > so the confusion is petrophysically reasonable, not a coding bug. Why does a > rare, overlapping facies stay hard even when you swap a linear model for a > non-linear ensemble?

lightbulbHints (0/3)

Stuck? Reveal hints one at a time — they progress from nudge to near-solution.

codeYour solution

main.py

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix

# ── Verified OML-58 facies generator (do not edit) ───────────────────────
FACIES = {0: "Shale", 1: "Brine Sand", 2: "Gas Sand", 3: "Siltstone"}

# Each facies' centre in log space (GR, RHOB, NPHI, ln RT) and its scatter (the *_sd
# terms). Resistivity is centred on a LOG scale because it ranges over decades. Defined
# once and shared by the training-set and new-well generators below, which guarantees the
# rock physics is identical in training and inference.
CENTROIDS = {0: (108, 2.51, 0.33, np.log(3.5),  16, 0.080, 0.045, 0.40),
             1: (46,  2.28, 0.27, np.log(9.0),   18, 0.075, 0.045, 0.45),
             2: (43,  2.21, 0.21, np.log(15.0),  18, 0.085, 0.050, 0.55),
             3: (74,  2.40, 0.29, np.log(7.0),   21, 0.090, 0.050, 0.48)}

def make_facies_dataset(n=1500, seed=11):
    """Self-contained labelled facies set for OML 58: four logs -> rock type.

Each facies has a centroid in log space (GR, RHOB, NPHI, log-RT) and realistic
    scatter. The gas sand deliberately overlaps the brine sandstone - same low GR
    and high porosity - separated only by its lower density and neutron and higher
    resistivity. That overlap, plus its rarity (7% of the section), is what makes it
    the hard class, exactly as in the field.
    """
    rng = np.random.default_rng(seed)
    facies = rng.choice([0, 1, 2, 3], size=n, p=[0.42, 0.33, 0.07, 0.18])  # imbalanced
    GR = np.zeros(n); RHOB = np.zeros(n); NPHI = np.zeros(n); RT = np.zeros(n)
    for f, (gr, rhob, nphi, lnrt, grs, rhs, nps, rts) in CENTROIDS.items():
        m = facies == f; k = int(m.sum())
        GR[m]   = gr + rng.normal(0, grs, k)
        RHOB[m] = rhob + rng.normal(0, rhs, k)
        NPHI[m] = np.clip(nphi + rng.normal(0, nps, k), 0, 0.55)
        RT[m]   = np.clip(np.exp(lnrt + rng.normal(0, rts, k)), 0.3, 600)
    return pd.DataFrame({"GR": np.round(GR, 1), "RHOB": np.round(RHOB, 3),
                         "NPHI": np.round(NPHI, 3), "RT": np.round(RT, 2),
                         "facies": facies})

# ── Verified split + the two models the chapter builds (do not edit) ─────
rock = make_facies_dataset()                       # n=1500, seed=11
X = rock[["GR", "RHOB", "NPHI", "RT"]].values
y = rock["facies"].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000)).fit(X_train, y_train)
forest   = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
# ─────────────────────────────────────────────────────────────────────────

def most_confused_pair(model, X_test, y_test):
    """Name the single most-often-confused (true, predicted) facies pair.

Build the 4x4 confusion matrix, zero the diagonal (correct calls), find the
    largest off-diagonal cell, and return (FACIES[true_idx], FACIES[pred_idx]).
    """
    # TODO: cm = confusion_matrix(y_test, model.predict(X_test))
    # TODO: off = cm - np.diag(np.diag(cm))          # drop the correct calls
    # TODO: true_idx, pred_idx = np.unravel_index(np.argmax(off), off.shape)
    # TODO: return (FACIES[true_idx], FACIES[pred_idx])
    pass

# TODO: baseline_pair = most_confused_pair(baseline, X_test, y_test)
# TODO: forest_pair   = most_confused_pair(forest,   X_test, y_test)
# TODO: same_pair     = (baseline_pair == forest_pair)
baseline_pair = None
forest_pair = None
same_pair = None

print("baseline most-confused pair:", baseline_pair)
print("forest   most-confused pair:", forest_pair)
print("same pair?", same_pair)

visibilityReveal reference solutionexpand_more

Try solving it yourself first — the hints walk you through it. The solution below is one valid approach; yours may differ and still be correct.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix


# ── Verified OML-58 facies generator (do not edit) ───────────────────────
FACIES = {0: "Shale", 1: "Brine Sand", 2: "Gas Sand", 3: "Siltstone"}

# Each facies' centre in log space (GR, RHOB, NPHI, ln RT) and its scatter (the *_sd
# terms). Resistivity is centred on a LOG scale because it ranges over decades. Defined
# once and shared by the training-set and new-well generators below, which guarantees the
# rock physics is identical in training and inference.
CENTROIDS = {0: (108, 2.51, 0.33, np.log(3.5),  16, 0.080, 0.045, 0.40),
             1: (46,  2.28, 0.27, np.log(9.0),   18, 0.075, 0.045, 0.45),
             2: (43,  2.21, 0.21, np.log(15.0),  18, 0.085, 0.050, 0.55),
             3: (74,  2.40, 0.29, np.log(7.0),   21, 0.090, 0.050, 0.48)}

def make_facies_dataset(n=1500, seed=11):
    """Self-contained labelled facies set for OML 58: four logs -> rock type.

    Each facies has a centroid in log space (GR, RHOB, NPHI, log-RT) and realistic
    scatter. The gas sand deliberately overlaps the brine sandstone - same low GR
    and high porosity - separated only by its lower density and neutron and higher
    resistivity. That overlap, plus its rarity (7% of the section), is what makes it
    the hard class, exactly as in the field.
    """
    rng = np.random.default_rng(seed)
    facies = rng.choice([0, 1, 2, 3], size=n, p=[0.42, 0.33, 0.07, 0.18])  # imbalanced
    GR = np.zeros(n); RHOB = np.zeros(n); NPHI = np.zeros(n); RT = np.zeros(n)
    for f, (gr, rhob, nphi, lnrt, grs, rhs, nps, rts) in CENTROIDS.items():
        m = facies == f; k = int(m.sum())
        GR[m]   = gr + rng.normal(0, grs, k)
        RHOB[m] = rhob + rng.normal(0, rhs, k)
        NPHI[m] = np.clip(nphi + rng.normal(0, nps, k), 0, 0.55)
        RT[m]   = np.clip(np.exp(lnrt + rng.normal(0, rts, k)), 0.3, 600)
    return pd.DataFrame({"GR": np.round(GR, 1), "RHOB": np.round(RHOB, 3),
                         "NPHI": np.round(NPHI, 3), "RT": np.round(RT, 2),
                         "facies": facies})


# ── Verified split + the two models the chapter builds (do not edit) ─────
rock = make_facies_dataset()                       # n=1500, seed=11
X = rock[["GR", "RHOB", "NPHI", "RT"]].values
y = rock["facies"].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000)).fit(X_train, y_train)
forest   = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
# ─────────────────────────────────────────────────────────────────────────


def most_confused_pair(model, X_test, y_test):
    """Name the single most-often-confused (true, predicted) facies pair.

    Build the 4x4 confusion matrix, zero the diagonal (correct calls), find the
    largest off-diagonal cell, and return (FACIES[true_idx], FACIES[pred_idx]).
    """
    cm = confusion_matrix(y_test, model.predict(X_test))
    off = cm - np.diag(np.diag(cm))                 # drop the correct calls
    true_idx, pred_idx = np.unravel_index(np.argmax(off), off.shape)
    return (FACIES[true_idx], FACIES[pred_idx])


baseline_pair = most_confused_pair(baseline, X_test, y_test)
forest_pair   = most_confused_pair(forest,   X_test, y_test)
same_pair     = (baseline_pair == forest_pair)

print("baseline most-confused pair:", baseline_pair)
print("forest   most-confused pair:", forest_pair)
print("same pair?", same_pair)

lockCopying code is a Full Access feature.

arrow_back

17.1 The Cost-Weighted Threshold - Pricing Missed Gas vs False Alarms

17.3 Does More Data Beat a Better Threshold?

arrow_forward