Exerciseschevron_rightChapter 7chevron_right7.7
fitness_center

Exercise 7.7

Pay Zone Statistics + HPT Ranking

Level 3
Chapter 7: Well Log Analysis
descriptionProblem

Net pay (a footage count) tells you how much sand passes the cutoffs. It does not tell you which interval to perforate first. On OD-001 (OML 58) the petrophysics pipeline has already flagged the net-pay samples; now the team needs each continuous pay interval ranked by hydrocarbon pore thickness:

> HPT = thickness_ft × mean(PHIE) × mean(1 − SW)

HPT is footage discounted by how porous the rock is and how much of that pore space holds hydrocarbon instead of water. A thick, clean, oil-filled sand wins; a thin or wet streak loses. Wireline noise will throw up a few sub-foot slivers that scrape past the Sw cutoff. Those are not pay, so you only book intervals at or above a minimum thickness.

Three functions over the canonical log:

  1. add_pipeline_columns(logs): return a copy of logs with VSHALE,

PHIE, SW, and a boolean NET_PAY column (use the correlations and base cutoffs from the chapter: Vsh < 0.40 and PHIE > 0.08 and SW < 0.60).

  1. pay_zones(logs, step_ft, min_thickness_ft=2.0): run the pipeline, then

return a list of (top_ft, base_ft, thickness_ft) tuples, one per continuous run of NET_PAY samples, sorted by depth. A run that spans samples i..j has top = DEPTH[i], base = DEPTH[j] + step_ft, and thickness_ft = base − top. Drop any interval thinner than min_thickness_ft.

  1. zone_stats(logs, top_ft, base_ft): for the samples with `top ≤ DEPTH <

base, return {"thickness_ft", "phie", "sw", "hpt"} where phie and sw are the means over those samples and hpt = thickness × phie × (1 − sw)`.

  1. rank_zones(logs): return the pay_zones enriched with their stats as a

list of dicts sorted by hpt descending (best zone first). With the canonical log there is exactly one bookable pay zone (~7400–7600 ft).

lightbulbHints (0/3)

Stuck? Reveal hints one at a time — they progress from nudge to near-solution.

codeYour solution
main.py
visibilityReveal reference solutionexpand_more

Try solving it yourself first — the hints walk you through it. The solution below is one valid approach; yours may differ and still be correct.

import numpy as np
import pandas as pd

np.random.seed(7)
depth = np.arange(7000.0, 7850.0, 0.5)
_cond = [depth < 7200, depth < 7400, depth < 7600, depth < 7700, depth >= 7700]
_gr   = np.select(_cond, [105.0, 40.0, 32.0, 110.0, 45.0])
_rt   = np.select(_cond, [4.0,   2.2,  35.0, 4.0,   2.5])
_rhob = np.select(_cond, [2.54,  2.31, 2.26, 2.55,  2.33])
_nphi = np.select(_cond, [0.33,  0.21, 0.17, 0.34,  0.22])
logs = pd.DataFrame({
    "DEPTH": depth,
    "GR":   _gr   + np.random.normal(0, 4.0,  len(depth)),
    "RT":   _rt   * np.exp(np.random.normal(0, 0.10, len(depth))),
    "RHOB": _rhob + np.random.normal(0, 0.02, len(depth)),
    "NPHI": np.clip(_nphi + np.random.normal(0, 0.015, len(depth)), 0.0, 1.0),
})
STEP_FT = 0.5

# Petrophysical constants (chapter base case).
RHO_MATRIX, RHO_FLUID = 2.65, 1.0
A, M, N, RW = 0.81, 2.0, 2.0, 0.04
VSH_CUT, PHIE_CUT, SW_CUT = 0.40, 0.08, 0.60


def add_pipeline_columns(logs):
    """Return a copy of logs with VSHALE, PHIE, SW, and NET_PAY columns."""
    out = logs.copy()

    gr_clean = out["GR"].quantile(0.05)
    gr_shale = out["GR"].quantile(0.95)
    igr = (out["GR"] - gr_clean) / (gr_shale - gr_clean)
    out["VSHALE"] = igr.clip(0.0, 1.0)

    phid = ((RHO_MATRIX - out["RHOB"]) / (RHO_MATRIX - RHO_FLUID)).clip(0.0, 0.45)
    phia = (phid + out["NPHI"]) / 2.0
    out["PHIE"] = (phia * (1.0 - out["VSHALE"])).clip(0.0, 0.40)

    sw = np.where(
        out["PHIE"] > 0.01,
        ((A * RW) / (out["PHIE"] ** M * out["RT"])) ** (1.0 / N),
        1.0,
    )
    out["SW"] = pd.Series(sw, index=out.index).clip(0.0, 1.0)

    out["NET_PAY"] = (
        (out["VSHALE"] < VSH_CUT) & (out["PHIE"] > PHIE_CUT) & (out["SW"] < SW_CUT)
    )
    return out


def pay_zones(logs, step_ft, min_thickness_ft=2.0):
    """List of (top_ft, base_ft, thickness_ft) continuous net-pay intervals."""
    df = add_pipeline_columns(logs)
    mask = df["NET_PAY"].to_numpy()
    d = df["DEPTH"].to_numpy()

    zones = []
    i, n = 0, len(mask)
    while i < n:
        if mask[i]:
            j = i
            while j + 1 < n and mask[j + 1]:
                j += 1
            top = float(d[i])
            base = float(d[j]) + step_ft
            thickness = base - top
            if thickness >= min_thickness_ft:
                zones.append((top, base, thickness))
            i = j + 1
        else:
            i += 1
    return zones


def zone_stats(logs, top_ft, base_ft):
    """Return {thickness_ft, phie, sw, hpt} for top_ft <= DEPTH < base_ft."""
    df = add_pipeline_columns(logs)
    sel = df[(df["DEPTH"] >= top_ft) & (df["DEPTH"] < base_ft)]
    thickness_ft = float(base_ft - top_ft)
    phie = float(sel["PHIE"].mean())
    sw = float(sel["SW"].mean())
    hpt = thickness_ft * phie * (1.0 - sw)
    return {"thickness_ft": thickness_ft, "phie": phie, "sw": sw, "hpt": hpt}


def rank_zones(logs):
    """pay_zones enriched with stats, sorted by hpt descending."""
    enriched = []
    for top, base, _thick in pay_zones(logs, STEP_FT):
        stats = zone_stats(logs, top, base)
        enriched.append({"top_ft": top, "base_ft": base, **stats})
    enriched.sort(key=lambda z: z["hpt"], reverse=True)
    return enriched


zones = pay_zones(logs, STEP_FT)
print(f"pay zones ({len(zones)}):", zones)
ranked = rank_zones(logs)
best = ranked[0]
print(f"best zone {best['top_ft']:.0f}-{best['base_ft']:.0f} ft  "
      f"thickness={best['thickness_ft']:.1f} ft  "
      f"phie={best['phie']:.3f}  sw={best['sw']:.3f}  hpt={best['hpt']:.2f}")

lockCopying code is a Full Access feature.