Exercise 21.1
Catch the Unit Mix-up -- A Validating Request Handler
A common silent error is a density curve sent in kg/m³ (≈2350) instead of g/cm³ (≈2.35). It is numeric and not a null, so the range check [1.5, 3.1] already rejects it, but the error message just says "out of range." Extend predict_payload so that when RHOB is between 1500 and 3100 it returns a specific error naming the likely g/cm³↔kg/m³ unit mix-up. Show it fires on RHOB = 2350 and not on a genuine out-of-range value like RHOB = 5.0. Why is a specific error worth more than a generic one to the engineer at 2 a.m.?
---
A deployed model's most valuable line of defence is the validator at the boundary. This exercise hardens it against a silent, common error: a density curve sent in kg/m³ (≈2350) instead of g/cm³ (≈2.35). The plain range check [1.5, 3.1] already rejects it -- but only with a generic "out of range" message. At 2 a.m. the engineer needs to be told it is a unit mix-up.
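To see why the generic message falls short, here is a minimal sketch of a plain range check on RHOB alone. The name `plain_range_check` and the constant `RHOB_RANGE` are hypothetical and used only for illustration; the bounds [1.5, 3.1] are the ones quoted in the exercise.

```python
# Hypothetical helper for illustration only; not part of the exercise contract.
RHOB_RANGE = (1.5, 3.1)  # g/cm3 bounds quoted in the exercise


def plain_range_check(rhob):
    """Return a generic error string, or None if the value is in range."""
    lo, hi = RHOB_RANGE
    if not (lo <= rhob <= hi):
        return f"RHOB={rhob} out of range [{lo}, {hi}]"
    return None


# A kg/m3 value and a genuinely bad value receive the same kind of message,
# so the message alone cannot tell the engineer which failure occurred.
print(plain_range_check(2350))
print(plain_range_check(5.0))
print(plain_range_check(2.35))
```

Both bad inputs are rejected, which is correct, but nothing in the output distinguishes a unit mix-up from a corrupt reading.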
The contract (CURVES, RANGES) is embedded under a do-not-edit banner. Write one function:
```python
def validate_payload(payload):
    """Return a list of error strings for an untrusted log payload (empty if clean)."""
```
Exact procedure: for each curve in CURVES, append an error and move on when:
- the curve is missing,
- its value is non-numeric (a bool counts as non-numeric),
- it is the null sentinel -999 or -999.25,
- the unit-mix-up rule: if the curve is RHOB and its value is between 1500 and 3100, append a message that contains the substring "unit" (the likely g/cm³ ↔ kg/m³ mix-up),
- otherwise it is outside RANGES[curve] (the generic out-of-range case).
Expose: validate_payload.
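The bool clause in the procedure matters because in Python `bool` is a subclass of `int`, so `isinstance(True, int)` is true and a bare numeric test would let `True` through as the number 1. A minimal sketch of a numeric test that excludes it follows; `is_numeric` is a hypothetical helper name, not something the exercise requires.

```python
def is_numeric(value):
    """True for int/float values, False for bools and everything else."""
    # bool is a subclass of int, so it must be excluded explicitly.
    if isinstance(value, bool):
        return False
    return isinstance(value, (int, float))


print(is_numeric(2.35))    # True
print(is_numeric(True))    # False
print(is_numeric("2.35"))  # False
```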
> Think about it: RHOB = 2350 should produce the specific unit message,
> while RHOB = 5.0 should produce only the generic out-of-range one -- both are
> rejected, but only one tells the engineer what actually went wrong. Why is a
> specific diagnosis worth more than a correct rejection?
Reference solution. Try solving it yourself first. The solution below is one valid approach; yours may differ and still be correct.
```python
CURVES = ("GR", "RHOB", "NPHI", "RT")
RANGES = {"GR": (0, 250), "RHOB": (1.5, 3.1), "NPHI": (-0.05, 0.7), "RT": (0.1, 5000)}


def validate_payload(payload):
    """Return a list of error strings for an untrusted log payload (empty if clean)."""
    errors = []
    for c in CURVES:
        if c not in payload:
            errors.append(f"missing curve: {c}")
            continue
        v = payload[c]
        # bool is a subclass of int, so reject it before the numeric test.
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            errors.append(f"{c} is not numeric")
            continue
        if v in (-999, -999.25):
            errors.append(f"{c} is a null sentinel (-999)")
            continue
        # Specific diagnosis first: a RHOB in 1500..3100 is almost certainly kg/m3.
        if c == "RHOB" and 1500 <= v <= 3100:
            errors.append(f"RHOB={v} looks like kg/m3 -- likely a g/cm3 unit mix-up")
            continue
        # Generic fallback for anything else outside the physical range.
        lo, hi = RANGES[c]
        if not (lo <= v <= hi):
            errors.append(f"{c}={v} out of physical range [{lo}, {hi}]")
    return errors


print("RHOB=2350:", validate_payload({"GR": 55, "RHOB": 2350, "NPHI": 0.22, "RT": 18.0}))
print("RHOB=5.0 :", validate_payload({"GR": 55, "RHOB": 5.0, "NPHI": 0.22, "RT": 18.0}))
print("valid    :", validate_payload({"GR": 55, "RHOB": 2.35, "NPHI": 0.22, "RT": 18.0}))
```
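The exercise title mentions a request handler, so here is a sketch of how a handler might consume the error list: reject the payload when the validator returns anything, and call the model only on clean input. The name `handle_request`, the shape of the response dictionary, and the `toy_validate` / `toy_predict` stand-ins are all assumptions made for illustration; they are not part of the exercise contract.

```python
def handle_request(payload, validate, predict):
    """Validate first; only call the model when the payload is clean.

    `validate` returns a list of error strings (empty if clean), in the
    same style as validate_payload; `predict` stands in for the model call.
    """
    errors = validate(payload)
    if errors:
        # Pass every message through so a specific diagnosis reaches the caller.
        return {"ok": False, "errors": errors}
    return {"ok": True, "prediction": predict(payload)}


# Hypothetical stand-ins so the sketch runs on its own.
def toy_validate(payload):
    return [] if 1.5 <= payload.get("RHOB", 0) <= 3.1 else ["RHOB rejected"]


def toy_predict(payload):
    return "stub-prediction"


print(handle_request({"RHOB": 2.35}, toy_validate, toy_predict))
print(handle_request({"RHOB": 2350}, toy_validate, toy_predict))
```

Passing the validator in as an argument keeps the handler independent of any one contract, so a stricter validator can be swapped in without touching the handler.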