Part I: Python Fundamentals
Chapter 5
Data Visualization for Petroleum Engineers
Every major decision in petroleum engineering starts with a picture.
A production decline plot determines whether a field gets $200 million in new investment or gets scheduled for abandonment. A well log display tells a completion engineer exactly where to perforate — get it wrong, and you produce water instead of oil. A pressure map tells a reservoir engineer where to inject water to sweep remaining oil. A bubble map shows a field manager where production is coming from and where it is dying.
Tables of numbers do not drive these decisions. Charts do. The reason is simple: the human eye detects patterns in visual data orders of magnitude faster than it processes rows and columns. A table with 25 wells and 36 months of production is 900 numbers. A single decline curve overlay, color-coded by well region, communicates the same information in seconds.
This chapter builds every visualization type that petroleum engineers use in practice. Each one exists because it answers a specific engineering question. Understanding why each chart type exists — what question it answers, what pattern it reveals, what decision it supports — matters more than memorizing the Matplotlib syntax that draws it.
What You Will Learn
- Build publication-quality plots with Matplotlib using the `fig, ax` pattern
- Create the standard petroleum chart types: production time series, decline curves, crossplots, bar charts, bubble maps, and cumulative production plots
- Construct multi-panel production surveillance displays with shared time axes
- Build a well log "triple combo" display and interpret what it reveals
- Use Seaborn for statistical exploration of well and reservoir properties
- Assemble a multi-panel field report suitable for a management committee presentation
Datasets Used in This Chapter
All data is generated in-code using NumPy and Pandas. No external files are required. The synthetic data simulates realistic production histories, well coordinates, reservoir pressures, water cuts, and well log responses so every example runs on any machine.
Why Each Chart Type Exists
Before writing any code, it is worth understanding why the petroleum industry standardized around specific chart types. Each one evolved to answer a question that other formats could not answer efficiently.
Production time series (rate vs. time) exist because production is inherently a time-dependent process. Reservoir pressure depletes, water breaks through, equipment degrades — all of these show up as changes in rate over time. This is the most common plot in any producing operation. If you work in oil and gas, you will see this plot every single day.
Decline curves (rate vs. time on a semi-log scale) exist because an exponential decline becomes a straight line on a semi-log plot, and a straight line is easy to extrapolate. A hyperbolic decline does not plot as a straight line on these axes; it bends upward as the decline rate slows, and that curvature is itself diagnostic. Extrapolating the trend is how the industry estimates remaining reserves and economic well life, which determines whether a well is worth maintaining or should be plugged and abandoned.
Crossplots (one property vs. another) exist because rock and fluid properties are physically related. Porosity and permeability are connected through pore throat geometry — the relationship is logarithmic because flow through porous media follows Darcy's law, and pore throat size distributions create exponential scaling. A crossplot reveals that relationship visually and exposes which rock types are present.
Well log displays (measurements vs. depth) exist because the subsurface is layered. Shale sits on top of sand sits on top of limestone. Each rock type has distinct physical properties — radioactivity, electrical resistivity, density. Plotting these measurements as continuous curves against depth lets the interpreter see formation boundaries, identify pay zones, and decide where to complete the well.
Bubble maps (well locations with sized markers) exist because petroleum reservoirs are three-dimensional geological structures. Where a well sits relative to the structure — on the crest, on the flank, near a fault — fundamentally affects its performance. A bubble map overlays production data on geography so that spatial patterns become visible.
Cumulative production plots (cumulative volume vs. time) exist because they smooth out the noise in rate data. Monthly rates fluctuate due to operational events — workovers, shut-ins, facility constraints. Cumulative production only goes up, and its slope at any point equals the instantaneous rate. Changes in slope reveal changes in well or reservoir behavior more clearly than noisy rate data.
Each of these exists because an engineer needed to make a better decision. That is the standard every visualization in this chapter will meet.
Matplotlib Fundamentals
Matplotlib is the foundational plotting library in Python. Several other visualization tools in the scientific Python ecosystem, including Seaborn, Pandas plotting, and the plotting helpers in scikit-learn, are built on top of it.
The core concept is the Figure and Axes model. A Figure is the entire canvas. An Axes is a single plot within that canvas. One figure can hold one axes or many. This separation is what makes Matplotlib powerful: each element is controlled independently.

Every Matplotlib plot follows this skeleton:
- `fig, ax = plt.subplots()`: creates one figure and one axes. `figsize=(9, 5)` sets the canvas dimensions in inches.
- `ax.plot()`: draws data. First argument is x-values, second is y-values. Everything after is styling.
- `ax.set_xlabel()`, `ax.set_ylabel()`, `ax.set_title()`: labels. An unlabeled plot is useless in engineering; nobody should have to guess what the axes represent.
- `ax.legend()`: displays the legend using the `label` set in `ax.plot()`.
- `ax.grid(True, alpha=0.3)`: subtle grid. The `alpha` controls transparency.
- `plt.tight_layout()`: adjusts spacing to prevent clipping.
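A minimal sketch of that skeleton is shown here. The rate values and the well name are placeholders invented for illustration, not data from the chapter's field.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data: 36 months of a smoothly declining oil rate (illustrative values)
months = np.arange(36)
oil_rate = 1200 * np.exp(-0.04 * months)   # bbl/day

fig, ax = plt.subplots(figsize=(9, 5))     # one figure, one axes, 9 x 5 inches
ax.plot(months, oil_rate, color="tab:green", linewidth=2, label="Well A (synthetic)")
ax.set_xlabel("Months on production")
ax.set_ylabel("Oil rate (bbl/day)")
ax.set_title("Oil Rate vs. Time")
ax.legend()                                # uses the label passed to ax.plot()
ax.grid(True, alpha=0.3)                   # faint grid
plt.tight_layout()                         # avoid clipped labels
```

In a script, finish with `plt.show()` to open a window or `fig.savefig(...)` to write a file; in a notebook the figure renders when the cell completes.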
Always Use `fig, ax = plt.subplots()`
Code examples online often use plt.plot() directly without creating fig and ax. That shortcut works for throwaway plots but breaks the moment you need subplots, shared axes, or precise control. Always use fig, ax = plt.subplots(). One extra line saves hours of debugging.
Production Time Series — Multi-Well Overlay
One well is informative. Twenty-five wells on the same plot reveal which wells are carrying the field and which are dragging it down. When the question is "which wells are declining fastest?", the answer is a multi-well overlay where the eye immediately separates strong performers from weak ones.
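One way such an overlay could be built is sketched below. All decline parameters are random placeholders, and the choice to give OD-007 and OD-011 a steep decline is an assumption made so that two wells stand out; the bulk of the wells are drawn in gray and only the outliers get a color and a legend entry.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
months = np.arange(36)
well_ids = [f"OD-{i:03d}" for i in range(1, 26)]
steep_wells = {"OD-007", "OD-011"}   # assumed fast decliners, for illustration only

fig, ax = plt.subplots(figsize=(10, 5.5))
rates = {}
for well in well_ids:
    qi = rng.uniform(600, 1500)                        # initial rate, bbl/day
    di = 0.09 if well in steep_wells else rng.uniform(0.015, 0.04)  # decline, 1/month
    scatter = rng.normal(1.0, 0.03, months.size)       # about 3% month-to-month noise
    rates[well] = qi * np.exp(-di * months) * scatter
    if well in steep_wells:
        ax.plot(months, rates[well], linewidth=2.2, label=well)
    else:
        ax.plot(months, rates[well], color="gray", linewidth=0.8, alpha=0.5)

ax.set_xlabel("Months on production")
ax.set_ylabel("Oil rate (bbl/day)")
ax.set_title("Oil Rate by Well (synthetic data)")
ax.legend(title="Steepest decliners")
ax.grid(True, alpha=0.3)
plt.tight_layout()
```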

Wells OD-007 and OD-011 separate visually from the pack. The steep decline could indicate several things: higher drawdown, a smaller drainage area, less reservoir support, or water coning. The plot does not diagnose the cause — that requires further analysis (Chapters 9 and 11) — but it immediately tells you where to focus your attention.
Cumulative Production — Smoothing the Noise
Monthly production rates are noisy. Wells shut in for workovers, facilities hit capacity constraints, allocation methods introduce errors. Cumulative production suppresses that noise because it is a running sum that never decreases, so short-lived fluctuations barely move the curve. The slope of the cumulative curve at any point equals the production rate at that time, and the slope between two dates equals the average rate over that period.
This matters for reserves estimation. When rate data is too noisy to fit a reliable decline curve, cumulative production often reveals the underlying trend more clearly.
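The sketch below illustrates the idea with two hypothetical wells. The decline parameters, the 10% noise level, the single shut-in month, and the 30.4-day month are all assumptions chosen for illustration; the cumulative curve is simply a running sum of monthly volumes.

```python
import numpy as np
import matplotlib.pyplot as plt

DAYS_PER_MONTH = 30.4    # average month length, an approximation
rng = np.random.default_rng(11)
months = np.arange(36)

# Hypothetical wells: (initial rate in bbl/day, decline per month)
params = {"OD-005": (900, 0.02), "OD-011": (1400, 0.09)}

fig, (ax_rate, ax_cum) = plt.subplots(1, 2, figsize=(12, 4.5))
cumulative = {}
for well, (qi, di) in params.items():
    rate = qi * np.exp(-di * months) * rng.normal(1.0, 0.10, months.size)
    rate[rng.integers(6, 30)] = 0.0                             # one shut-in month
    cumulative[well] = np.cumsum(rate * DAYS_PER_MONTH) / 1e3   # thousand bbl
    ax_rate.plot(months, rate, marker="o", markersize=3, label=well)
    ax_cum.plot(months, cumulative[well], linewidth=2, label=well)

ax_rate.set_title("Monthly Rate (noisy)")
ax_rate.set_ylabel("Oil rate (bbl/day)")
ax_cum.set_title("Cumulative Production (smooth)")
ax_cum.set_ylabel("Cumulative oil (thousand bbl)")
for ax in (ax_rate, ax_cum):
    ax.set_xlabel("Months on production")
    ax.legend()
    ax.grid(True, alpha=0.3)
plt.tight_layout()
```

The shut-in month appears as a sharp drop to zero in the rate panel but only as a short flat step in the cumulative panel.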

This plot reveals something that a rate-vs-time plot obscures: Well OD-005 — despite starting at a lower rate than OD-011 — ends up producing more total oil because its decline is gentler. In petroleum economics, cumulative production is what generates revenue. A well that starts high but crashes quickly can be less valuable than one that starts moderate but sustains.
Crossplots — Porosity vs. Permeability
A crossplot places one rock property on the x-axis and another on the y-axis. The most important crossplot in reservoir engineering is porosity versus permeability.
The relationship between porosity and permeability is approximately log-linear: permeability changes by roughly a constant factor for each added unit of porosity. This is not arbitrary; it follows from physics. Porosity measures how much void space exists in the rock. Permeability measures how easily fluid flows through that void space. Flow depends not just on how much pore space exists but on how the pores are connected, specifically on the size of the pore throats. Pore throat distributions in natural rocks tend to follow log-normal distributions, which is why permeability spans orders of magnitude (from 0.01 millidarcies in tight rock to 10,000 millidarcies in high-quality sand) while porosity varies within a much narrower band (5% to 35%).
This is why petroleum engineers always plot permeability on a logarithmic y-axis. On a linear scale, any data point below 100 mD would be invisible. On a log scale, the full range is visible and the trend that connects porosity to permeability becomes clear.
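A possible crossplot along these lines is sketched below. The log-linear trend, its slope, the scatter, and the one-decade offset between the two hypothetical zones are all assumed values, not measurements.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)

def make_zone(n, intercept):
    """Return (porosity fraction, permeability in mD) for a synthetic rock type."""
    phi = rng.uniform(0.08, 0.30, n)
    log10_k = intercept + 12.0 * phi + rng.normal(0, 0.25, n)   # assumed log-linear trend
    return phi, 10 ** log10_k

phi_a, k_a = make_zone(40, intercept=0.0)     # Zone A: better-connected pores
phi_b, k_b = make_zone(40, intercept=-1.0)    # Zone B: same slope, one decade lower

fig, ax = plt.subplots(figsize=(8, 5.5))
ax.scatter(phi_a * 100, k_a, label="Zone A", alpha=0.8)
ax.scatter(phi_b * 100, k_b, label="Zone B", marker="s", alpha=0.8)
ax.set_yscale("log")                          # permeability spans orders of magnitude
ax.set_xlabel("Porosity (%)")
ax.set_ylabel("Permeability (mD)")
ax.set_title("Porosity vs. Permeability (synthetic data)")
ax.legend()
ax.grid(True, which="both", alpha=0.3)        # show minor log gridlines too
plt.tight_layout()
```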

At the same porosity, Zone A has roughly ten times the permeability of Zone B. This tells the reservoir engineer that Zone A is the primary flow unit. It is where fluid moves, where production comes from, and where perforations should be placed. Zone B stores hydrocarbons but delivers them slowly. This distinction — between storage capacity (porosity) and deliverability (permeability) — is fundamental to reservoir characterization, and the crossplot makes it immediately visible.
Bar Charts — Ranking and Comparing
Bar charts are the simplest and most direct way to compare wells, fields, or operators. When someone asks "which wells produce the most?" or "how does this field compare to the offset?", the answer is a bar chart.
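A short sketch of a well ranking chart follows. The ten rates are random placeholders; sorting in ascending order before calling `barh` puts the strongest well at the top of the chart.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
current_rate = pd.Series(
    rng.uniform(150, 1100, size=10).round(0),            # bbl/day, synthetic
    index=[f"OD-{i:03d}" for i in range(1, 11)],
    name="oil_rate_bpd",
)
ranked = current_rate.sort_values()    # ascending, so the best well is drawn last (top)

fig, ax = plt.subplots(figsize=(8, 5))
ax.barh(ranked.index, ranked.values, color="tab:green")
ax.set_xlabel("Current oil rate (bbl/day)")
ax.set_title("Well Ranking by Current Oil Rate (synthetic data)")
ax.grid(True, axis="x", alpha=0.3)     # vertical gridlines only, to read bar lengths
plt.tight_layout()
```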

Horizontal vs. Vertical Bars
Use horizontal bars (barh) when comparing named items — well IDs, field names, operators. Names read left-to-right naturally. Use vertical bars (bar) when the x-axis is numeric or time-based. This is a small choice that makes a large readability difference.
Multi-Panel Subplots — Production Surveillance
In production surveillance, engineers do not examine oil rate in isolation. They look at oil rate, water cut, and gas-oil ratio together, stacked vertically with a shared time axis. The reason is physics: these three measurements are linked by reservoir behavior.
If oil rate drops while water cut rises, water is displacing oil. That is either a waterflood response (intentional and good) or water coning (unintentional and problematic). If GOR spikes, gas is breaking through — reservoir pressure may have dropped below the bubble point, liberating dissolved gas.
The multi-panel display with sharex=True locks all panels to the same time axis. When you point at a feature in the oil rate panel, your eye drops naturally to the same month in the water cut and GOR panels. The correlation (or lack of it) between panels is the diagnostic information.
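The three-panel layout might be sketched as follows. The water breakthrough around month 20 and the GOR rise after month 30 are invented behaviors, included only so that the panels have something to correlate.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(21)
months = np.arange(48)

# Synthetic surveillance data (all trends are illustrative assumptions)
oil = 1500 * np.exp(-0.03 * months) * rng.normal(1.0, 0.04, months.size)        # bbl/day
water_cut = 60 / (1 + np.exp(-0.3 * (months - 20))) + rng.normal(0, 1.5, months.size)
water_cut = np.clip(water_cut, 0, 100)                                          # percent
gor = 600 + 35 * np.clip(months - 30, 0, None) + rng.normal(0, 20, months.size) # scf/bbl

fig, axes = plt.subplots(3, 1, figsize=(10, 8), sharex=True)   # one shared time axis
panels = [
    (oil, "Oil rate (bbl/day)", "tab:green"),
    (water_cut, "Water cut (%)", "tab:blue"),
    (gor, "GOR (scf/bbl)", "tab:red"),
]
for ax, (series, ylabel, color) in zip(axes, panels):
    ax.plot(months, series, color=color, linewidth=1.8)
    ax.set_ylabel(ylabel)
    ax.grid(True, alpha=0.3)

axes[0].set_title("Production Surveillance (synthetic data)")
axes[-1].set_xlabel("Months on production")   # only the bottom panel carries the x label
plt.tight_layout()
```

Because the x-axis is shared, zooming or calling `set_xlim` on any one panel moves all three together.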

This three-panel layout is so common that many operating companies have standardized templates for it. Some companies add a fourth panel for reservoir pressure or a fifth for injection rates in waterflooded fields. The design principle remains the same: stack related measurements vertically with a shared time axis so that correlations across measurements are immediately visible.
Well Log Display — The Triple Combo
The well log display is the most iconic visualization in petroleum engineering. No other industry uses anything quite like it. Three measurement tracks — gamma ray, resistivity, and density-neutron porosity — plotted against depth, with depth increasing downward. This format evolved because it directly represents the physical reality: you are looking at a vertical cross-section of the earth, layer by layer.
Each track measures a different physical property of the rock:
- Gamma Ray (GR) measures natural radioactivity. Shales are radioactive because they contain potassium, uranium, and thorium in their clay minerals. Clean sands and carbonates have low radioactivity. The GR log therefore distinguishes between reservoir rock (sand, low GR) and non-reservoir rock (shale, high GR). A cutoff — typically around 60 API units — separates the two.
- Resistivity measures how strongly the rock resists electric current. Saline formation water conducts electricity readily (low resistivity). Oil and gas do not (high resistivity). A sand zone with high resistivity therefore likely contains hydrocarbons. The magnitude of the resistivity increase, compared to a known water-bearing zone, is used in Archie's equation (Chapter 7) to calculate water saturation.
- Density-Neutron porosity uses two independent measurements of porosity. When they agree (overlay), the pore space contains water. When the neutron porosity reads lower than the density porosity (a "crossover"), the pore space contains light hydrocarbons (gas or light oil): their lower hydrogen content pulls the neutron reading down, while their lower density pushes the density porosity up.
Together, these three tracks answer the critical question: Is there a producible hydrocarbon zone, and where exactly is it?
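A triple combo display could be sketched as below. The log responses are blocky synthetic curves with assumed values for shale, a water-bearing sand at 7,080 to 7,150 ft, and a hydrocarbon-bearing sand at 7,200 to 7,350 ft; the 0.03 minimum separation used to flag crossover is also an assumption, chosen to keep noise from triggering the shading.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(8)
depth = np.arange(7000.0, 7400.5, 0.5)             # ft
wet_sand = (depth >= 7080) & (depth <= 7150)       # water-bearing sand
pay_sand = (depth >= 7200) & (depth <= 7350)       # hydrocarbon-bearing sand

def zoned(shale, wet, pay, noise):
    """Blocky log response: one level per zone plus Gaussian noise."""
    base = np.select([wet_sand, pay_sand], [wet, pay], default=shale)
    return base + rng.normal(0, noise, depth.size)

gr = zoned(100, 30, 25, 5)                  # API units
res = 10 ** zoned(0.5, 0.7, 1.9, 0.05)      # ohm-m: roughly 3, 5 and 80
dphi = zoned(0.10, 0.22, 0.26, 0.005)       # density porosity, fraction
nphi = zoned(0.30, 0.23, 0.14, 0.005)       # neutron porosity, fraction

fig, (ax_gr, ax_res, ax_por) = plt.subplots(1, 3, figsize=(9, 10), sharey=True)

ax_gr.plot(gr, depth, color="green", linewidth=0.8)
ax_gr.fill_betweenx(depth, gr, 60, where=gr < 60, color="gold", alpha=0.5)  # sand flag
ax_gr.axvline(60, color="k", linestyle="--", linewidth=0.8)                 # GR cutoff
ax_gr.set_xlim(0, 150)
ax_gr.set_xlabel("GR (API)")
ax_gr.set_ylabel("Depth (ft)")

ax_res.semilogx(res, depth, color="red", linewidth=0.8)
ax_res.set_xlim(0.2, 200)
ax_res.set_xlabel("Resistivity (ohm-m)")

ax_por.plot(dphi, depth, color="red", linewidth=0.8, label="Density porosity")
ax_por.plot(nphi, depth, color="blue", linestyle="--", linewidth=0.8, label="Neutron porosity")
crossover = (dphi - nphi) > 0.03            # neutron reading clearly below density
ax_por.fill_betweenx(depth, dphi, nphi, where=crossover, color="orange", alpha=0.5)
ax_por.set_xlim(0.45, 0.0)                  # porosity conventionally increases to the left
ax_por.set_xlabel("Porosity (fraction)")
ax_por.legend(loc="lower left", fontsize=8)

ax_gr.invert_yaxis()                        # depth increases downward on all shared tracks
for ax in (ax_gr, ax_res, ax_por):
    ax.grid(True, which="both", alpha=0.3)
plt.tight_layout()
```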

Reading this log from left to right: The GR track identifies two sand intervals (yellow shading where GR drops below 60 API). The resistivity track shows that the deeper sand (7,200–7,350 ft) has a dramatic resistivity spike, which is the hydrocarbon signature. The shallower sand (7,080–7,150 ft) has moderate resistivity, indicating water-bearing sand. The density-neutron track confirms this: in the deeper sand, neutron porosity drops below density porosity (crossover), indicating light hydrocarbons in the pore space. The shallower sand shows no crossover, so it is water-bearing.
The interpretation: perforate from 7,200 to 7,350 ft. That is the pay zone. Skip the shallower sand — it will produce water.
Bubble Maps — Spatial Production Display
A bubble map places wells at their geographic coordinates and sizes each marker by a production metric. It answers the question every field manager asks: "Where is the production coming from?"
The reason this matters is geology. Petroleum reservoirs are not uniform. Structural position (how high on the anticline), stratigraphic variation (sand thickness, quality), and proximity to faults all affect individual well performance. A bubble map overlays production data on space, making geological patterns visible without requiring a geological model.
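A bubble map can be sketched with a single `ax.scatter` call, as below. The well coordinates are random, and the trend that makes northwest wells stronger and drier is an assumption built into the synthetic data so that a spatial pattern exists to be seen.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(14)
n_wells = 25
x = rng.uniform(0, 10, n_wells)      # easting, km
y = rng.uniform(0, 10, n_wells)      # northing, km

# Assumed trend: 1 in the far northwest corner, 0 in the far southeast corner
nw_score = (y - x + 10) / 20
oil_rate = np.clip(150 + 900 * nw_score + rng.normal(0, 60, n_wells), 50, None)   # bbl/day
water_cut = np.clip(0.85 - 0.75 * nw_score + rng.normal(0, 0.05, n_wells), 0, 1)  # fraction

fig, ax = plt.subplots(figsize=(8, 7))
bubbles = ax.scatter(
    x, y,
    s=oil_rate * 0.8,                # marker area scales with oil rate
    c=water_cut, cmap="RdYlGn_r",    # green = low water cut, red = high
    vmin=0, vmax=1, edgecolors="k", alpha=0.8,
)
fig.colorbar(bubbles, ax=ax, label="Water cut (fraction)")
ax.set_xlabel("Easting (km)")
ax.set_ylabel("Northing (km)")
ax.set_title("Production Bubble Map (synthetic data)")
ax.set_aspect("equal")               # keep map distances undistorted
ax.grid(True, alpha=0.3)
plt.tight_layout()
```

Note that `s` sets marker area in points squared, not radius, so a well with four times the rate gets a circle with twice the diameter.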

Three dimensions of information in one figure: location (position), production rate (circle size), and water cut (color). The northwest cluster has large green circles (high production, low water cut). The southeast wells have small red circles (low production, high water cut). This pattern suggests water encroachment from the southeast and a structural high or better reservoir quality in the northwest.
That spatial insight guides where to drill the next well.
Decline Curves on a Semi-Log Scale
Decline curve analysis is covered thoroughly in Chapter 9. The visual format is introduced here because it is so fundamental to how the industry thinks about well performance and reserves.
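When production rate is plotted on a logarithmic y-axis against time, an exponential decline becomes a straight line. This happens because the exponential function $q(t) = q_i e^{-D_i t}$ becomes $\ln q(t) = \ln q_i - D_i t$ when you take the natural log, which is a linear equation in $t$ with slope $-D_i$. Here $q_i$ is the initial rate and $D_i$ is the decline rate.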
A straight line is easy to extrapolate. You extend it forward until it hits an economic limit (the minimum rate at which the well covers its operating costs). Where it crosses that limit is the end of the well's economic life. The area under the rate-time curve between today and that date, measured on a linear rate scale, is the remaining reserves.
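The extrapolation can be sketched numerically as follows. This assumes a purely exponential decline, a hypothetical economic limit of 50 bbl/day, and a 30.4-day month; the straight line is fitted to the logarithm of rate with `np.polyfit`, which is one simple option and not a substitute for the fuller treatment in Chapter 9.

```python
import numpy as np
import matplotlib.pyplot as plt

DAYS_PER_MONTH = 30.4                    # approximation
rng = np.random.default_rng(9)
t = np.arange(36)                        # months of history
q = 1000 * np.exp(-0.05 * t) * rng.lognormal(0, 0.05, t.size)   # bbl/day, synthetic

# Straight-line fit in log space: ln q = ln q_i - D_i * t
slope, intercept = np.polyfit(t, np.log(q), 1)
di_fit, qi_fit = -slope, np.exp(intercept)

q_limit = 50.0                                        # assumed economic limit, bbl/day
t_limit = np.log(qi_fit / q_limit) / di_fit           # month at which the fit hits the limit
q_now = qi_fit * np.exp(-di_fit * t[-1])              # fitted rate at the last data month
# Integral of q dt from the last data month to t_limit, converted to barrels
remaining_bbl = (q_now - q_limit) / di_fit * DAYS_PER_MONTH

t_forecast = np.linspace(0, t_limit, 200)
fig, ax = plt.subplots(figsize=(9, 5))
ax.semilogy(t, q, "o", markersize=4, label="Monthly rate (synthetic)")
ax.semilogy(t_forecast, qi_fit * np.exp(-di_fit * t_forecast), "r-",
            label=f"Exponential fit (Di = {di_fit:.3f}/month)")
ax.axhline(q_limit, color="k", linestyle="--", label="Economic limit")
ax.set_xlabel("Months on production")
ax.set_ylabel("Oil rate (bbl/day)")
ax.set_title("Decline Curve on a Semi-Log Scale")
ax.legend()
ax.grid(True, which="both", alpha=0.3)
plt.tight_layout()
```

Printing `t_limit` and `remaining_bbl` gives the estimated end of economic life and the remaining volume implied by the fit.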
Seaborn — Statistical Visualization for Reservoir Data
Matplotlib gives total control. Seaborn gives speed for statistical plots. Built on top of Matplotlib, Seaborn specializes in visualizing distributions, relationships, and categories — the questions you ask when you are exploring a dataset before building a formal analysis.
Distribution of Initial Production Rates
Initial production (IP) rate distributions are important because they characterize what a "typical" well in the field looks like. They also reveal whether a few prolific wells are carrying most of the production — a common pattern that has direct implications for field development economics.
The right-skewed distribution is extremely common in oil and gas. Most wells are moderate producers, a few are excellent, and a handful are marginal. This asymmetry is why the industry uses percentile statistics (P10, P50, P90) rather than averages — the average is pulled upward by a few prolific wells and does not represent the "typical" well.
Correlation Heatmaps
When you have multiple well properties, a correlation heatmap reveals which properties move together. A strong positive correlation between two properties may indicate a physical relationship worth investigating. A surprising absence of correlation may challenge assumptions.
A correlation of +1 means two properties move in lockstep; -1 means they move in opposite directions; 0 means no linear relationship. In this data, expect porosity and permeability to be positively correlated (more pore space generally means better connectivity). Any strong unexpected correlation — or the absence of an expected one — is worth investigating further.
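A heatmap of this kind might be produced as sketched here. The `well_data` columns and the relationships between them are invented for illustration; permeability is stored as its base-10 logarithm because the Pearson coefficient only measures linear association.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(17)
n = 25
porosity = rng.uniform(0.10, 0.30, n)
log10_perm = -1.0 + 12.0 * porosity + rng.normal(0, 0.3, n)     # assumed trend
net_pay = rng.uniform(20, 120, n)                               # ft
ip = 300 + 4 * net_pay + 200 * log10_perm + rng.normal(0, 80, n)  # bbl/day

# Hypothetical well property table
well_data = pd.DataFrame({
    "porosity": porosity,
    "log10_perm_md": log10_perm,
    "net_pay_ft": net_pay,
    "ip_bpd": ip,
})

corr = well_data.corr()              # Pearson correlation matrix

fig, ax = plt.subplots(figsize=(6.5, 5.5))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
            vmin=-1, vmax=1, center=0, square=True, ax=ax)
ax.set_title("Correlation of Well Properties (synthetic data)")
plt.tight_layout()
```

Fixing `vmin=-1` and `vmax=1` keeps the color scale comparable between datasets, so the same shade always means the same correlation strength.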
Pair Plots — All Relationships at Once
When you want to see the actual scatter of every property pair — not just the correlation coefficient — Seaborn's pair plot creates a grid of scatter plots with distributions along the diagonal.
Pair plots are the first tool to reach for when exploring a new dataset. They reveal outliers, clusters, and non-linear relationships that summary statistics miss. In a field with 25 wells, four properties can be visually absorbed in under a minute.
Bringing It Together — The Field Report
Everything in this chapter converges into a single deliverable: a multi-panel field summary that communicates the state of a field to a management audience. Four panels — production trend, water cut trend, well ranking, and spatial map — in one figure.
Four panels. The top-left shows the production decline that triggered the meeting. The top-right shows water cut rising across the field, which explains part of the decline. The bottom-left ranks current well performance so management knows where to focus. The bottom-right shows the spatial distribution, revealing that the best-performing wells cluster in the northwest.
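A compact sketch of such a four-panel report follows. Every number is synthetic: the well coordinates, the assumed northwest quality trend, the decline rates, and the linear rise in water cut are placeholders standing in for real field data.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(58)
n_wells, n_months = 25, 36
months = np.arange(n_months)
well_ids = np.array([f"OD-{i:03d}" for i in range(1, n_wells + 1)])
x, y = rng.uniform(0, 10, n_wells), rng.uniform(0, 10, n_wells)   # km
nw_score = (y - x + 10) / 20                       # assumed trend, best in the northwest

qi = 300 + 1000 * nw_score                         # initial rate, bbl/day
di = rng.uniform(0.02, 0.05, n_wells)              # decline, 1/month
oil = qi[:, None] * np.exp(-di[:, None] * months)  # shape (wells, months)
wc = np.clip(0.6 - 0.5 * nw_score[:, None] + 0.008 * months, 0.02, 0.95)
water = oil * wc / (1 - wc)                        # water rate implied by water cut

field_oil = oil.sum(axis=0)
field_wc = water.sum(axis=0) / (oil + water).sum(axis=0)
top = np.argsort(oil[:, -1])[-10:]                 # ten best wells, ascending order

fig, axes = plt.subplots(2, 2, figsize=(14, 10))
axes[0, 0].plot(months, field_oil, color="tab:green", linewidth=2)
axes[0, 0].set(title="Field Oil Production", xlabel="Month", ylabel="Oil rate (bbl/day)")
axes[0, 1].plot(months, field_wc * 100, color="tab:blue", linewidth=2)
axes[0, 1].set(title="Field Water Cut", xlabel="Month", ylabel="Water cut (%)")
axes[1, 0].barh(well_ids[top], oil[top, -1], color="tab:green")
axes[1, 0].set(title="Top 10 Wells, Current Rate", xlabel="Oil rate (bbl/day)")
bubbles = axes[1, 1].scatter(x, y, s=oil[:, -1] * 1.5, c=wc[:, -1],
                             cmap="RdYlGn_r", vmin=0, vmax=1, edgecolors="k")
fig.colorbar(bubbles, ax=axes[1, 1], label="Water cut (fraction)")
axes[1, 1].set(title="Production Bubble Map", xlabel="Easting (km)", ylabel="Northing (km)")

for ax in axes.flat:
    ax.grid(True, alpha=0.3)
fig.suptitle("Field Summary Report (synthetic data)", fontsize=15)
fig.tight_layout(rect=[0, 0, 1, 0.96])             # leave room for the suptitle
```

Swapping the synthetic arrays for data loaded from a production database would leave the plotting half of the script unchanged.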
If data updates next month, you re-run the script. The entire report regenerates in seconds.
Saving Figures for Reports and Presentations
To save any figure as a high-resolution image:
```python
fig.savefig('oml58_field_report.png', dpi=300, bbox_inches='tight', facecolor='white')
```
Use dpi=300 for print quality and bbox_inches='tight' to prevent clipped labels. For vector output (scalable without pixelation), save as PDF or SVG: fig.savefig('report.pdf', bbox_inches='tight').
Interactive Plots
Libraries like Plotly and Bokeh create interactive, zoomable, hover-enabled visualizations. These are excellent for dashboards and web applications, and Chapter 22 builds interactive dashboards using Streamlit and Plotly. The static, publication-quality figures in this chapter are the foundation — they go into SPE papers, regulatory filings, and board presentations where interactivity is not available or expected.
Exercises
Decline Rate Comparison
Generate production data for three wells with different decline characteristics: gentle decline (Di = 0.02/month), moderate (Di = 0.05/month), and agg...
Waterflood Surveillance Dashboard
Create a 2×2 subplot figure for a waterflood project: (a) Oil rate and water rate on the same plot with dual y-axes (use ax.twinx()). (b) Water cut ve...
Extended Well Log Display
Extend the triple combo well log from this chapter by adding a fourth track: water saturation (Sw). Sw ranges from 0 to 1, where low Sw indicates hydr...
Regional Comparison with Seaborn
Using the well_data DataFrame from the Seaborn section, add a "Region" column by assigning wells to "North" or "South" based on a random coordinate. C...
Summary
This chapter covered the visualization types that petroleum engineers use daily, and the reason each one exists:
- Matplotlib's `fig, ax` pattern is the foundation of all scientific plotting in Python. Always use `plt.subplots()`; it provides full control and scales to any complexity.
- Production time series show rate trends over time. Multi-well overlays identify underperformers instantly.
- Cumulative production plots smooth out noise in rate data and reveal underlying trends more clearly — particularly useful for reserves estimation.
- Crossplots with log scales reveal physical relationships like porosity-permeability trends. The logarithmic relationship arises from pore throat geometry and Darcy's law.
- Bar charts rank and compare with immediate clarity.
- Multi-panel subplots with shared axes are the standard for production surveillance — oil rate, water cut, and GOR stacked vertically reveal correlated reservoir behavior.
- Well log displays (the triple combo) are the most iconic visualization in the industry. Three tracks — gamma ray, resistivity, and density-neutron porosity — identify pay zones and guide completion decisions.
- Bubble maps combine location, magnitude, and a third variable to reveal spatial production patterns driven by underlying geology.
- Decline curves on semi-log scales linearize exponential declines, enabling visual extrapolation for reserves and economic life estimation.
- Seaborn accelerates statistical exploration through distribution plots, correlation heatmaps, and pair plots that reveal patterns across many variables simultaneously.
- Programmatic visualization is a reusable tool, not a one-off effort. When new data arrives, the script regenerates the entire report.
Every chart type in this chapter exists because an engineer needed to make a better decision. The goal is not to make pretty pictures — it is to turn data into understanding that drives action.
In Part II, these tools are applied to real petroleum engineering problems, starting with the data formats and workflows that define the industry in Chapter 6.