Blind-Well Honesty - Leave-One-Well-Out vs a Leaky Split

Level 3

Chapter 19: Real-World Projects

Problem

Project 1 scored porosity on a held-out well for a reason. Reproduce both protocols on a field whose logs carry realistic per-well tool drift: an honest leave-one-well-out R² (train on the other wells, predict each held-out well in turn) and a pooled random-row split R² (shuffle every well's samples together, then split). The random split reports a markedly higher R²: the apparent accuracy inflates by about 0.07 even though the model is no better. Explain to a colleague what the pooled split is silently leaking, and why taking its number into a reserves meeting is how a model that has never been tested on an unseen well gets trusted.

---

Project 1 scored porosity on a held-out well, not on a random sample of rows, and called a 0.99 R² a red flag. This exercise shows you the trap it was avoiding.

The embedded field (FIELD) is five wells whose logs carry realistic per-well tool drift: a smooth, depth-varying offset unique to each well, exactly the thing a model cannot learn from one well and apply to the next. You evaluate the same porosity model two ways and watch one of them lie.
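
To make the drift concrete, here is a minimal sketch that reuses the `_drift` recipe from the do-not-edit block (renamed `drift` here). The seeds 1 and 2 and the amplitude of 1.0 are arbitrary choices for illustration, not the values FIELD uses. It compares how much the drift changes between depth-adjacent samples in one well with how much it differs between two wells at the same sample index.

```python
import numpy as np
import pandas as pd

def drift(rng, n, amp, w=21):
    # Same recipe as _drift in the do-not-edit block: a smoothed random walk,
    # centred and rescaled so its standard deviation is roughly `amp`.
    e = np.cumsum(rng.normal(0, 1, n))
    e = pd.Series(e).rolling(w, center=True, min_periods=1).mean().values
    e = e - e.mean()
    return amp * e / (e.std() + 1e-9)

n = 300
well_a = drift(np.random.default_rng(1), n, amp=1.0)  # illustrative seed
well_b = drift(np.random.default_rng(2), n, amp=1.0)  # illustrative seed

# Mean absolute change in drift between depth-adjacent samples of one well.
neighbour_gap = np.abs(np.diff(well_a)).mean()
# Mean absolute difference in drift between two wells at the same sample index.
cross_well_gap = np.abs(well_a - well_b).mean()
# Correlation between each sample's drift and that of the sample just below it.
lag1 = np.corrcoef(well_a[:-1], well_a[1:])[0, 1]

print(f"neighbour gap (same well): {neighbour_gap:.4f}")
print(f"cross-well gap:            {cross_well_gap:.4f}")
print(f"lag-1 autocorrelation:     {lag1:.4f}")
```

A depth-neighbour carries almost the same drift as the sample next to it, while another well's drift is essentially unrelated. That is why a training row from the same well is so informative about a test row.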

The generator, feats, N_ESTIMATORS, and FIELD are in the do-not-edit block. Write two functions and two module-level values:

def leave_one_well_out_r2(field):
    '''HONEST: train on the other wells, predict each held-out well in turn,
    pool the predictions, score one R2.'''

def pooled_random_split_r2(field):
    '''LEAKY: shuffle every well's samples together, then split rows at random.'''

Exact procedure:

In leave_one_well_out_r2, loop over each well index k: train a RandomForestRegressor(n_estimators=N_ESTIMATORS, random_state=0) on pd.concat of the other wells (features via feats, target PHI_true), predict the held-out well k, and collect predictions and truths. Return one r2_score over all pooled predictions.

In pooled_random_split_r2, pd.concat all wells, then call train_test_split(feats(pool), pool.PHI_true.values, test_size=0.25, random_state=0), fit the same forest on the training rows, and return the test r2_score.

At module level, set honest_r2 = leave_one_well_out_r2(FIELD) and leaky_r2 = pooled_random_split_r2(FIELD).

Expose exactly: leave_one_well_out_r2, pooled_random_split_r2, honest_r2, leaky_r2.
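
The procedure asks for one R² over the pooled held-out predictions rather than an average of per-well scores. The two are not the same number. The sketch below uses made-up porosity values for two hypothetical wells (not taken from FIELD) to show how far apart they can be when one well has a narrow porosity range.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical truths and predictions for two held-out wells (illustrative only).
truth_a = np.array([0.10, 0.20, 0.30]); pred_a = np.array([0.10, 0.20, 0.30])
truth_b = np.array([0.20, 0.22, 0.24]); pred_b = np.array([0.24, 0.22, 0.20])

# One R2 per well, then averaged.
per_well = [r2_score(truth_a, pred_a), r2_score(truth_b, pred_b)]
mean_of_wells = float(np.mean(per_well))

# One R2 over all held-out predictions pooled together (what the exercise asks for).
pooled = float(r2_score(np.concatenate([truth_a, truth_b]),
                        np.concatenate([pred_a, pred_b])))

print("per-well R2:", [round(v, 3) for v in per_well])  # 1.0 and -3.0
print(f"mean of per-well R2: {mean_of_wells:.3f}")       # -1.000
print(f"pooled R2:           {pooled:.3f}")              # 0.850
```

Well B's truths span only 0.04 porosity units, so its own R² is dominated by a tiny denominator. Pooling scores every prediction against the field-wide spread of porosity instead.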

> Think about it: the pooled split reports an R² about 0.07 higher than the honest one, not because the model is better, but because each test sample has a depth-neighbour from the same well (same tool drift) sitting in the training set. The model never had to generalize to an unseen well, which is the only thing the next well actually asks of it. Which number would you stake a perforation decision on?
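
The leak can be checked without training any model. The sketch below assumes a pooled table laid out like FIELD (five wells of 300 samples each, stacked in depth order); the integer well labels are placeholders. It counts how many test rows of a random split have a depth-neighbour from the same well in the training set, then uses scikit-learn's `LeaveOneGroupOut` as one way to get splits in which no well appears on both sides.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, train_test_split

# Hypothetical layout mirroring FIELD: 5 wells x 300 depth samples each.
wells = np.repeat(np.arange(5), 300)   # well label for every pooled row
rows = np.arange(wells.size)           # row index in the pooled table

# Leaky protocol: split pooled rows at random.
tr, te = train_test_split(rows, test_size=0.25, random_state=0)
in_train = np.zeros(rows.size, dtype=bool)
in_train[tr] = True

def has_train_neighbour(i):
    # True if the sample directly above or below row i, in the same well,
    # ended up in the training set.
    return any(0 <= j < rows.size and wells[j] == wells[i] and in_train[j]
               for j in (i - 1, i + 1))

leaky_fraction = float(np.mean([has_train_neighbour(i) for i in te]))
wells_in_both = np.intersect1d(wells[tr], wells[te]).size

# Honest protocol: every fold holds out one whole well.
logo = LeaveOneGroupOut()
honest_overlaps = [np.intersect1d(wells[a], wells[b]).size
                   for a, b in logo.split(rows.reshape(-1, 1), groups=wells)]

print(f"random split: {wells_in_both} wells appear in both train and test")
print(f"random split: {leaky_fraction:.0%} of test rows have a training neighbour")
print("leave-one-well-out overlaps per fold:", honest_overlaps)
```

With the real data, the `groups` argument would be the WELL column of the pooled frame. The hand-written loop in the exercise does the same job as `LeaveOneGroupOut`.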


Your solution

main.py

import numpy as np, pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# ── Verified Chapter 19 well generator with per-well tool drift (do not edit) ─
ARCHIE = dict(a=0.81, m=2.0, n=2.0, Rw=0.04)
N_ESTIMATORS = 60
def _drift(rng, n, amp, w=21):
    e = np.cumsum(rng.normal(0, 1, n))
    e = pd.Series(e).rolling(w, center=True, min_periods=1).mean().values
    e = e - e.mean()
    return amp * e / (e.std() + 1e-9)
def make_well(wid, seed, top=9000.0, n=300):
    rng = np.random.default_rng(seed); depth = top + 0.5*np.arange(n)
    edges = np.sort(rng.integers(8, n-8, rng.integers(8, 13)-1)); facies = np.zeros(n, int)
    for seg, f in zip(np.split(np.arange(n), edges), rng.choice([0,1,2,3], len(edges)+1, p=[0.38,0.22,0.20,0.20])): facies[seg] = f
    Vsh = np.clip(np.where(facies==0,0.82,np.where(facies==3,0.55,0.15))+rng.normal(0,0.06,n),0,1)
    base = np.where(facies==0,0.06,np.where(facies==3,0.14,0.24)); bed_q = np.zeros(n)
    for seg in np.split(np.arange(n), edges): bed_q[seg] = rng.uniform(-0.10,0.05)
    phi = np.clip((base+bed_q)*(1-0.4*Vsh)+rng.normal(0,0.018,n),0.02,0.34)
    Sw = np.where(facies==2,rng.uniform(0.25,0.68,n),np.where(facies==1,0.88,0.95)); Sw = np.clip(Sw+rng.normal(0,0.05,n),0.12,1.0)
    gas = (facies==2)&(Sw<0.45)
    GR = 18*(1-Vsh)+135*Vsh+rng.normal(0,7,n)+_drift(rng,n,22.0)
    RHOB = (2.65+0.03*Vsh)*(1-phi)+np.where(gas,0.6,1.0)*phi+rng.normal(0,0.035,n)+_drift(rng,n,0.13)
    NPHI = phi+0.30*Vsh-np.where(gas,0.08,0.0)+rng.normal(0,0.025,n)+_drift(rng,n,0.07)
    a,m,nn,Rw = ARCHIE.values(); RT = np.clip(a*Rw/(np.clip(phi,0.03,1)**m*Sw**nn)*np.exp(rng.normal(0,0.22,n)),0.2,2000)
    return pd.DataFrame({"WELL":wid,"DEPTH":depth,"GR":GR,"RHOB":RHOB,"NPHI":NPHI,"RT":RT,"PHI_true":phi})
def feats(df):
    X = df[["GR","RHOB","NPHI","RT"]].copy(); X["RT"] = np.log10(X["RT"]); return X.values
FIELD = [make_well(f"OD-{i:03d}", seed=i, top=9000+i*30) for i in range(1, 6)]
# ── end do-not-edit ──────────────────────────────────────────────────────

def leave_one_well_out_r2(field):
    """HONEST protocol: train on the other wells, predict each held-out well in
    turn, pool the predictions, and score one porosity R2 over all of them."""
    # TODO: preds, truth = [], []
    # TODO: for k in range(len(field)):
    # TODO:     train = pd.concat([field[j] for j in range(len(field)) if j != k], ignore_index=True)
    # TODO:     model = RandomForestRegressor(n_estimators=N_ESTIMATORS, random_state=0).fit(feats(train), train.PHI_true)
    # TODO:     preds.append(model.predict(feats(field[k]))); truth.append(field[k].PHI_true.values)
    # TODO: return float(r2_score(np.concatenate(truth), np.concatenate(preds)))
    return 0.0

def pooled_random_split_r2(field):
    """LEAKY protocol: shuffle every well's samples together, then split rows at
    random -- so a test sample's depth-neighbour can sit in the training set."""
    # TODO: pool = pd.concat(field, ignore_index=True)
    # TODO: Xtr, Xte, ytr, yte = train_test_split(feats(pool), pool.PHI_true.values, test_size=0.25, random_state=0)
    # TODO: model = RandomForestRegressor(n_estimators=N_ESTIMATORS, random_state=0).fit(Xtr, ytr)
    # TODO: return float(r2_score(yte, model.predict(Xte)))
    return 0.0

# TODO: honest_r2 = leave_one_well_out_r2(FIELD)
# TODO: leaky_r2 = pooled_random_split_r2(FIELD)
honest_r2 = None
leaky_r2 = None

print("honest R2:", honest_r2)
print("leaky  R2:", leaky_r2)

Reference solution

Try solving it yourself first. The solution below is one valid approach; yours may differ and still be correct.

import numpy as np, pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# ── Verified Chapter 19 well generator with per-well tool drift (do not edit) ─
ARCHIE = dict(a=0.81, m=2.0, n=2.0, Rw=0.04)
N_ESTIMATORS = 60
def _drift(rng, n, amp, w=21):
    e = np.cumsum(rng.normal(0, 1, n))
    e = pd.Series(e).rolling(w, center=True, min_periods=1).mean().values
    e = e - e.mean()
    return amp * e / (e.std() + 1e-9)
def make_well(wid, seed, top=9000.0, n=300):
    rng = np.random.default_rng(seed); depth = top + 0.5*np.arange(n)
    edges = np.sort(rng.integers(8, n-8, rng.integers(8, 13)-1)); facies = np.zeros(n, int)
    for seg, f in zip(np.split(np.arange(n), edges), rng.choice([0,1,2,3], len(edges)+1, p=[0.38,0.22,0.20,0.20])): facies[seg] = f
    Vsh = np.clip(np.where(facies==0,0.82,np.where(facies==3,0.55,0.15))+rng.normal(0,0.06,n),0,1)
    base = np.where(facies==0,0.06,np.where(facies==3,0.14,0.24)); bed_q = np.zeros(n)
    for seg in np.split(np.arange(n), edges): bed_q[seg] = rng.uniform(-0.10,0.05)
    phi = np.clip((base+bed_q)*(1-0.4*Vsh)+rng.normal(0,0.018,n),0.02,0.34)
    Sw = np.where(facies==2,rng.uniform(0.25,0.68,n),np.where(facies==1,0.88,0.95)); Sw = np.clip(Sw+rng.normal(0,0.05,n),0.12,1.0)
    gas = (facies==2)&(Sw<0.45)
    GR = 18*(1-Vsh)+135*Vsh+rng.normal(0,7,n)+_drift(rng,n,22.0)
    RHOB = (2.65+0.03*Vsh)*(1-phi)+np.where(gas,0.6,1.0)*phi+rng.normal(0,0.035,n)+_drift(rng,n,0.13)
    NPHI = phi+0.30*Vsh-np.where(gas,0.08,0.0)+rng.normal(0,0.025,n)+_drift(rng,n,0.07)
    a,m,nn,Rw = ARCHIE.values(); RT = np.clip(a*Rw/(np.clip(phi,0.03,1)**m*Sw**nn)*np.exp(rng.normal(0,0.22,n)),0.2,2000)
    return pd.DataFrame({"WELL":wid,"DEPTH":depth,"GR":GR,"RHOB":RHOB,"NPHI":NPHI,"RT":RT,"PHI_true":phi})
def feats(df):
    X = df[["GR","RHOB","NPHI","RT"]].copy(); X["RT"] = np.log10(X["RT"]); return X.values
FIELD = [make_well(f"OD-{i:03d}", seed=i, top=9000+i*30) for i in range(1, 6)]
# ── end do-not-edit ──────────────────────────────────────────────────────


def leave_one_well_out_r2(field):
    """HONEST protocol: train on the other wells, predict each held-out well in
    turn, pool the predictions, and score one porosity R2 over all of them."""
    preds, truth = [], []
    for k in range(len(field)):
        train = pd.concat([field[j] for j in range(len(field)) if j != k], ignore_index=True)
        model = RandomForestRegressor(n_estimators=N_ESTIMATORS, random_state=0).fit(feats(train), train.PHI_true)
        preds.append(model.predict(feats(field[k])))
        truth.append(field[k].PHI_true.values)
    return float(r2_score(np.concatenate(truth), np.concatenate(preds)))


def pooled_random_split_r2(field):
    """LEAKY protocol: shuffle every well's samples together, then split rows at
    random -- so a test sample's depth-neighbour can sit in the training set."""
    pool = pd.concat(field, ignore_index=True)
    Xtr, Xte, ytr, yte = train_test_split(feats(pool), pool.PHI_true.values, test_size=0.25, random_state=0)
    model = RandomForestRegressor(n_estimators=N_ESTIMATORS, random_state=0).fit(Xtr, ytr)
    return float(r2_score(yte, model.predict(Xte)))


honest_r2 = leave_one_well_out_r2(FIELD)
leaky_r2 = pooled_random_split_r2(FIELD)

print(f"honest (leave-one-well-out) porosity R2: {honest_r2:.3f}")
print(f"leaky  (pooled random split) porosity R2: {leaky_r2:.3f}")
print(f"the random split inflates the apparent R2 by {leaky_r2 - honest_r2:+.3f}")
