Exerciseschevron_rightChapter 3chevron_right3.5
fitness_center

Exercise 3.5

LAS File Explorer

Level 3
Chapter 3: Data Structures
descriptionProblem

Every well log you'll ever work with arrives as a LAS file (Log ASCII Standard), the petroleum industry's interchange format since 1989. The chapter built one from scratch so you'd see the structure. Now write the inverse: a function that reads a LAS file and returns a structured summary you could put on a one-page well report.

A LAS file has the shape:

~VERSION INFORMATION
VERS.   2.0 : CWLS LOG ASCII STANDARD
WRAP.    NO : ONE LINE PER DEPTH STEP

~WELL INFORMATION
STRT.FT   9000.00 : START DEPTH
STOP.FT   9500.00 : STOP DEPTH
STEP.FT      1.00 : STEP
NULL.    -999.25  : NULL VALUE
WELL.   Oso-Deep 003 : WELL NAME
FLD.    OML 58       : FIELD

~CURVE INFORMATION
DEPT.FT       : 1  DEPTH
GR.GAPI       : 2  GAMMA RAY
RT.OHMM       : 3  DEEP RESISTIVITY

~A   DEPTH      GR        RT
   9000.00    35.21    44.5
   9001.00    36.04    43.9
   ...

Write las_summary(text) that takes the contents of a LAS file as a string and returns a dict with these exact keys:

  • well: the value from the WELL. line (e.g. "Oso-Deep 003")
  • field: the value from the FLD. line (e.g. "OML 58")
  • start_ft: the start depth as a float
  • stop_ft: the stop depth as a float
  • step_ft: the step as a float
  • null_value: the null-value sentinel as a float (e.g. -999.25)
  • curves: a list of curve mnemonics in declared order, **excluding

the depth column** (so for the example above, ["GR", "RT"])

  • stats: a dict mapping each curve mnemonic to its own dict

{"min", "max", "mean", "n_null"} computed across all data rows. Treat any value equal to null_value as null and exclude it from min/max/mean but count it in n_null.

The data section starts at the line beginning with ~A. The header line that follows tells you the column order; subsequent non-blank lines hold the data.

lightbulbHints (0/5)

Stuck? Reveal hints one at a time — they progress from nudge to near-solution.

codeYour solution
main.py
visibilityReveal reference solutionexpand_more

Try solving it yourself first — the hints walk you through it. The solution below is one valid approach; yours may differ and still be correct.

SAMPLE_LAS = """~VERSION INFORMATION
VERS.   2.0 : CWLS LOG ASCII STANDARD - VERSION 2.0
WRAP.    NO : ONE LINE PER DEPTH STEP

~WELL INFORMATION
STRT.FT   9000.00 : START DEPTH
STOP.FT   9004.00 : STOP DEPTH
STEP.FT      1.00 : STEP
NULL.    -999.25 : NULL VALUE
WELL.   Oso-Deep 003 : WELL NAME
FLD.    OML 58 : FIELD

~CURVE INFORMATION
DEPT.FT       : 1  DEPTH
GR.GAPI       : 2  GAMMA RAY
RT.OHMM       : 3  DEEP RESISTIVITY

~A   DEPTH      GR        RT
   9000.00    35.20    44.50
   9001.00    36.10  -999.25
   9002.00    37.00    45.20
   9003.00  -999.25    46.10
   9004.00    38.50    47.00
"""


def _parse_kv_line(line):
    """`STRT.FT   9000.00 : START DEPTH` → ('STRT', '9000.00')."""
    mnemonic_part, _, _description = line.partition(":")
    head, _, value = mnemonic_part.strip().partition(" ")
    mnemonic = head.split(".")[0].strip()
    return mnemonic, value.strip()


def las_summary(text):
    section = None
    well_kv = {}
    curves_in_order = []
    data_header_seen = False
    data_rows = []

    for line in text.splitlines():
        s = line.strip()
        if not s:
            continue

        if s.startswith("~"):
            tag = s.split()[0].upper()
            if tag.startswith("~V"):  section = "version"
            elif tag.startswith("~W"): section = "well"
            elif tag.startswith("~C"): section = "curve"
            elif tag.startswith("~A"): section = "ascii"; data_header_seen = False
            else:                      section = "other"
            continue

        if section == "well":
            mnemonic, value = _parse_kv_line(line)
            well_kv[mnemonic] = value

        elif section == "curve":
            mnemonic, _ = _parse_kv_line(line)
            curves_in_order.append(mnemonic)

        elif section == "ascii":
            # The first non-blank line under ~A may be the columnar header
            # (e.g. "DEPTH GR RT"). We can detect it because none of its
            # tokens parse as floats.
            tokens = s.split()
            if not data_header_seen:
                try:
                    [float(t) for t in tokens]
                except ValueError:
                    data_header_seen = True
                    continue
                # If they all parsed as numbers, there was no header line -
                # fall through and treat this as data.
            try:
                data_rows.append([float(t) for t in tokens])
            except ValueError:
                continue

    null_value = float(well_kv.get("NULL", -999.25))
    start_ft   = float(well_kv.get("STRT", "nan"))
    stop_ft    = float(well_kv.get("STOP", "nan"))
    step_ft    = float(well_kv.get("STEP", "nan"))

    curves = [c for c in curves_in_order if c.upper() not in {"DEPT", "DEPTH"}]

    # Compute per-curve statistics, excluding null sentinels.
    stats = {}
    n_data_cols = len(curves) + 1   # +1 for the depth column
    for col_idx, mnemonic in enumerate(curves, start=1):
        values = [
            row[col_idx]
            for row in data_rows
            if len(row) >= n_data_cols
        ]
        non_null = [v for v in values if v != null_value]
        n_null   = len(values) - len(non_null)
        stats[mnemonic] = {
            "min":    min(non_null) if non_null else None,
            "max":    max(non_null) if non_null else None,
            "mean":   sum(non_null) / len(non_null) if non_null else None,
            "n_null": n_null,
        }

    return {
        "well":       well_kv.get("WELL", ""),
        "field":      well_kv.get("FLD", ""),
        "start_ft":   start_ft,
        "stop_ft":    stop_ft,
        "step_ft":    step_ft,
        "null_value": null_value,
        "curves":     curves,
        "stats":      stats,
    }


from pprint import pprint
pprint(las_summary(SAMPLE_LAS))

lockCopying code is a Full Access feature.