Exercise 3.5
LAS File Explorer
Most well logs you'll work with arrive as LAS files (Log ASCII Standard), the petroleum industry's interchange format since 1989. The chapter built one from scratch so you'd see the structure. Now write the inverse: a function that reads a LAS file and returns a structured summary you could put on a one-page well report.
A LAS file has the shape:
~VERSION INFORMATION
VERS. 2.0 : CWLS LOG ASCII STANDARD
WRAP. NO : ONE LINE PER DEPTH STEP
~WELL INFORMATION
STRT.FT 9000.00 : START DEPTH
STOP.FT 9500.00 : STOP DEPTH
STEP.FT 1.00 : STEP
NULL. -999.25 : NULL VALUE
WELL. Oso-Deep 003 : WELL NAME
FLD. OML 58 : FIELD
~CURVE INFORMATION
DEPT.FT : 1 DEPTH
GR.GAPI : 2 GAMMA RAY
RT.OHMM : 3 DEEP RESISTIVITY
~A DEPTH GR RT
9000.00 35.21 44.5
9001.00 36.04 43.9
...

Write las_summary(text) that takes the contents of a LAS file as a string and returns a dict with these exact keys:
- well: the value from the WELL. line (e.g. "Oso-Deep 003")
- field: the value from the FLD. line (e.g. "OML 58")
- start_ft: the start depth as a float
- stop_ft: the stop depth as a float
- step_ft: the step as a float
- null_value: the null-value sentinel as a float (e.g. -999.25)
- curves: a list of curve mnemonics in declared order, **excluding the depth column** (so for the example above, ["GR", "RT"])
- stats: a dict mapping each curve mnemonic to its own dict {"min", "max", "mean", "n_null"} computed across all data rows. Treat any value equal to null_value as null and exclude it from min/max/mean but count it in n_null.
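The null-handling rule for stats can be sketched for a single column as follows. This is a minimal illustration, not the reference solution: column_stats is a hypothetical helper name, and returning None for min, max, and mean when every sample is null is an assumption the exercise statement does not spell out.

```python
def column_stats(values, null_value=-999.25):
    """Summarise one curve, ignoring entries equal to the null sentinel."""
    # Keep only real measurements; the sentinel marks missing samples.
    non_null = [v for v in values if v != null_value]
    has_data = bool(non_null)
    return {
        "min": min(non_null) if has_data else None,
        "max": max(non_null) if has_data else None,
        "mean": sum(non_null) / len(non_null) if has_data else None,
        "n_null": len(values) - len(non_null),
    }


# GR column from a three-row log where the middle sample is missing.
gr_stats = column_stats([35.2, -999.25, 37.0])
print(gr_stats["min"], gr_stats["max"], gr_stats["n_null"])  # prints: 35.2 37.0 1
```

Comparing with != works here because the sentinel and the data values are parsed from the same text; if your nulls come from arithmetic instead, a tolerance-based comparison may be safer.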
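The data section starts at the line beginning with ~A. The column names may sit on that same line, as in the example above, or on a separate header line directly below it; either way they tell you the column order. Every subsequent non-blank line holds one row of data.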
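One way to isolate the data section is sketched below. The function name read_data_section is hypothetical, and the sketch assumes that any words after ~A on the marker line are column names, which holds for the example above but not for files that write the marker as a title such as ~ASCII LOG DATA.

```python
def read_data_section(text):
    """Return (column_names, rows) for the ~A section of a LAS string."""
    names, rows, in_data = [], [], False
    for line in text.splitlines():
        s = line.strip()
        if not s:
            continue
        if s.startswith("~"):
            in_data = s.upper().startswith("~A")
            if in_data:
                # Column names on the marker line itself, if present.
                names = s.split()[1:]
            continue
        if not in_data:
            continue
        tokens = s.split()
        try:
            rows.append([float(t) for t in tokens])
        except ValueError:
            # A non-numeric line before any data is treated as the header.
            if not rows and not names:
                names = tokens
    return names, rows


sample = """~CURVE INFORMATION
DEPT.FT : 1 DEPTH
~A DEPTH GR RT
9000.00 35.21 44.5
9001.00 36.04 43.9
"""
names, rows = read_data_section(sample)
print(names)    # ['DEPTH', 'GR', 'RT']
print(rows[0])  # [9000.0, 35.21, 44.5]
```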
Try solving it yourself first. The solution below is one valid approach; yours may differ and still be correct.
SAMPLE_LAS = """~VERSION INFORMATION
VERS. 2.0 : CWLS LOG ASCII STANDARD - VERSION 2.0
WRAP. NO : ONE LINE PER DEPTH STEP
~WELL INFORMATION
STRT.FT 9000.00 : START DEPTH
STOP.FT 9004.00 : STOP DEPTH
STEP.FT 1.00 : STEP
NULL. -999.25 : NULL VALUE
WELL. Oso-Deep 003 : WELL NAME
FLD. OML 58 : FIELD
~CURVE INFORMATION
DEPT.FT : 1 DEPTH
GR.GAPI : 2 GAMMA RAY
RT.OHMM : 3 DEEP RESISTIVITY
~A DEPTH GR RT
9000.00 35.20 44.50
9001.00 36.10 -999.25
9002.00 37.00 45.20
9003.00 -999.25 46.10
9004.00 38.50 47.00
"""
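
def _parse_kv_line(line):
    """`STRT.FT 9000.00 : START DEPTH` → ('STRT', '9000.00')."""
    # Everything after the first colon is the human-readable description.
    mnemonic_part, _, _description = line.partition(":")
    # The first space separates `MNEM.UNIT` from the value, which may itself contain spaces.
    head, _, value = mnemonic_part.strip().partition(" ")
    # Drop the unit suffix: `STRT.FT` → `STRT`.
    mnemonic = head.split(".")[0].strip()
    return mnemonic, value.strip()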

def las_summary(text):
    section = None
    well_kv = {}
    curves_in_order = []
    data_rows = []
    for line in text.splitlines():
        s = line.strip()
        if not s:
            continue
        if s.startswith("~"):
            # Section marker: only the first letter after "~" matters.
            tag = s.split()[0].upper()
            if tag.startswith("~V"):
                section = "version"
            elif tag.startswith("~W"):
                section = "well"
            elif tag.startswith("~C"):
                section = "curve"
            elif tag.startswith("~A"):
                section = "ascii"
            else:
                section = "other"
            continue
        if section == "well":
            mnemonic, value = _parse_kv_line(line)
            well_kv[mnemonic] = value
        elif section == "curve":
            mnemonic, _ = _parse_kv_line(line)
            curves_in_order.append(mnemonic)
        elif section == "ascii":
            # The first non-blank line under ~A may be a column header
            # (e.g. "DEPTH GR RT"). We detect it because at least one of
            # its tokens fails to parse as a float, and skip it. Any other
            # malformed row is skipped the same way.
            try:
                row = [float(t) for t in s.split()]
            except ValueError:
                continue
            data_rows.append(row)
    null_value = float(well_kv.get("NULL", -999.25))
    start_ft = float(well_kv.get("STRT", "nan"))
    stop_ft = float(well_kv.get("STOP", "nan"))
    step_ft = float(well_kv.get("STEP", "nan"))
    # Report only the log curves, not the depth index.
    curves = [c for c in curves_in_order if c.upper() not in {"DEPT", "DEPTH"}]
    # Compute per-curve statistics, excluding null sentinels.
    # Assumes depth is the first data column, so curve i sits at index i.
    stats = {}
    n_data_cols = len(curves) + 1  # +1 for the depth column
    for col_idx, mnemonic in enumerate(curves, start=1):
        # Ignore short rows that lack a value for every declared column.
        values = [
            row[col_idx]
            for row in data_rows
            if len(row) >= n_data_cols
        ]
        non_null = [v for v in values if v != null_value]
        n_null = len(values) - len(non_null)
        stats[mnemonic] = {
            "min": min(non_null) if non_null else None,
            "max": max(non_null) if non_null else None,
            "mean": sum(non_null) / len(non_null) if non_null else None,
            "n_null": n_null,
        }
    return {
        "well": well_kv.get("WELL", ""),
        "field": well_kv.get("FLD", ""),
        "start_ft": start_ft,
        "stop_ft": stop_ft,
        "step_ft": step_ft,
        "null_value": null_value,
        "curves": curves,
        "stats": stats,
    }

from pprint import pprint
pprint(las_summary(SAMPLE_LAS))
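To check your own implementation against SAMPLE_LAS, a small checker along these lines may help. The name check_sample_summary is hypothetical, and the expected numbers were worked out by hand from the five data rows above: each curve has four real samples and one null.

```python
import math


def check_sample_summary(summary):
    """Assert that a summary dict matches hand-computed values for SAMPLE_LAS."""
    assert summary["well"] == "Oso-Deep 003"
    assert summary["field"] == "OML 58"
    depths = (summary["start_ft"], summary["stop_ft"], summary["step_ft"])
    assert depths == (9000.0, 9004.0, 1.0)
    assert summary["null_value"] == -999.25
    assert summary["curves"] == ["GR", "RT"]
    # GR mean = (35.2 + 36.1 + 37.0 + 38.5) / 4; RT mean = (44.5 + 45.2 + 46.1 + 47.0) / 4.
    expected = {
        "GR": {"min": 35.2, "max": 38.5, "mean": 36.7, "n_null": 1},
        "RT": {"min": 44.5, "max": 47.0, "mean": 45.7, "n_null": 1},
    }
    for mnemonic, want in expected.items():
        got = summary["stats"][mnemonic]
        assert got["n_null"] == want["n_null"], (mnemonic, "n_null")
        for key in ("min", "max", "mean"):
            # Compare floats with a tolerance rather than exact equality.
            assert math.isclose(got[key], want[key], rel_tol=1e-9), (mnemonic, key)
    return True


# With a las_summary implementation and SAMPLE_LAS in scope:
# check_sample_summary(las_summary(SAMPLE_LAS))
```

The checker only covers this one sample; a file with no nulls, or a curve that is entirely null, would make a good second test.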