Exerciseschevron_rightChapter 4chevron_right4.8
fitness_center

Exercise 4.8

Well Ranking System

Level 3
Chapter 4: NumPy & Pandas
descriptionProblem

Which wells deserve attention first? Ranking needs a single score that fairly combines metrics measured in different units. The standard recipe: normalise every metric to a 0–100 scale, then take a weighted average.

The starter gives a metrics DataFrame with one row per well: oil_bopd (higher is better), water_cut (lower is better), gor_slope_abs and fwhp_slope_abs (the size of each trend; smaller means steadier, which is better). Write four functions:

  1. normalize_higher_better(series): min→0, max→100:

(x − min) / (max − min) × 100.

  1. normalize_lower_better(series): the inverse, so the smallest value

scores 100: (max − x) / (max − min) × 100.

  1. composite_score(df, weights=DEFAULT_WEIGHTS): normalise each metric

the right direction, then combine with the weights (oil 0.40, water cut 0.30, GOR trend 0.15, FWHP trend 0.15).

  1. rank_wells(df, weights=DEFAULT_WEIGHTS): return the frame sorted by

score descending, with a 1-based rank column.

Then draw a horizontal bar chart of the composite scores. Normalising is what stops the metric with the biggest raw numbers (oil rate) from silently dominating a score it shouldn't.

lightbulbHints (0/3)

Stuck? Reveal hints one at a time — they progress from nudge to near-solution.

codeYour solution
main.py
visibilityReveal reference solutionexpand_more

Try solving it yourself first — the hints walk you through it. The solution below is one valid approach; yours may differ and still be correct.

import pandas as pd
import matplotlib.pyplot as plt

metrics = pd.DataFrame([
    {"well": "OD-001", "oil_bopd": 1800, "water_cut": 0.25, "gor_slope_abs":  5, "fwhp_slope_abs":  8},
    {"well": "OD-002", "oil_bopd":  900, "water_cut": 0.55, "gor_slope_abs": 20, "fwhp_slope_abs": 16},
    {"well": "OD-005", "oil_bopd": 2400, "water_cut": 0.10, "gor_slope_abs":  3, "fwhp_slope_abs":  6},
    {"well": "OD-007", "oil_bopd": 1300, "water_cut": 0.40, "gor_slope_abs": 12, "fwhp_slope_abs": 12},
    {"well": "OD-009", "oil_bopd": 1700, "water_cut": 0.30, "gor_slope_abs":  8, "fwhp_slope_abs": 10},
])

DEFAULT_WEIGHTS = {"oil_bopd": 0.40, "water_cut": 0.30, "gor_slope_abs": 0.15, "fwhp_slope_abs": 0.15}


def normalize_higher_better(series):
    return (series - series.min()) / (series.max() - series.min()) * 100


def normalize_lower_better(series):
    return (series.max() - series) / (series.max() - series.min()) * 100


def composite_score(df, weights=DEFAULT_WEIGHTS):
    return (
        weights["oil_bopd"]       * normalize_higher_better(df["oil_bopd"])
        + weights["water_cut"]      * normalize_lower_better(df["water_cut"])
        + weights["gor_slope_abs"]  * normalize_lower_better(df["gor_slope_abs"])
        + weights["fwhp_slope_abs"] * normalize_lower_better(df["fwhp_slope_abs"])
    )


def rank_wells(df, weights=DEFAULT_WEIGHTS):
    out = df.copy()
    out["score"] = composite_score(out, weights)
    out = out.sort_values("score", ascending=False).reset_index(drop=True)
    out["rank"] = out.index + 1
    return out


ranked = rank_wells(metrics)
print(ranked[["rank", "well", "score"]].to_string(index=False))

fig, ax = plt.subplots(figsize=(9, 5))
ax.barh(ranked["well"], ranked["score"], color="#4682B4")
ax.invert_yaxis()  # rank 1 at the top
ax.set_xlabel("Composite score (0-100)")
ax.set_title("Well ranking - OML 58")
plt.tight_layout()
plt.show()

lockCopying code is a Full Access feature.