Post-processing tools#

The gospl.analyse package and a couple of helpers in gospl.tools provide the post-processing / analysis scripts that turn a finished goSPL run (XDMF + per-partition HDF5) into publication products: regular-grid NetCDFs, per-basin river profiles and outflow fluxes, stratigraphic sections / wells / Wheeler diagrams, a 3-D stratigraphic wedge and source-to-sink provenance.

Each is importable as a function API and runnable as a console command (see Running goSPL for the command-line usage). They need the optional extras pip install gospl[analysis] (numba, geopandas, netCDF4, matplotlib); the heavy dependencies are imported lazily so importing the modules themselves is cheap.

Regular-grid NetCDF + river profiles — `gospl-grid`#

Post-processing: rasterise a goSPL surface to a regular CF-NetCDF grid for PyGMT / ArcGIS, with drainage basins, chi (\(\chi\)) and drainage area computed on the grid — plus per-basin river longitudinal profiles.

goSPL writes an unstructured triangular mesh (XDMF + per-partition HDF5) that PyGMT and ArcGIS do not read directly. This tool reassembles the global mesh, interpolates every surface field of a chosen output step onto a regular grid (honouring the mesh triangulation), runs a standard raster D8 hydrology pass on the gridded elevation, and writes one NetCDF holding every field plus drainage_area, basin (integer outlet id) and chi.

Raster D8 hydrology (self-contained, no heavy deps): a priority-flood + epsilon fill guarantees drainage, then steepest-descent receivers give the drainage area, the basin each cell drains to, the flow distance to the outlet, and \(\chi = \int (A_0/A)^{m/n}\,dl\) integrated upstream from base level. These are raster/D8 quantities (the convention for chi and basins), so they can differ slightly from goSPL’s internal MFD drainage.

The companion API extracts and plots river profiles for a chosen basin:

basin_rivers() — the channel network (cells with drainage area above a threshold) of one basin, split into the main stem and its tributaries, each with along-channel distance, elevation, chi and area.
plot_long_profile() — longitudinal profile (distance or chi vs elevation), main stem highlighted.
plot_basin_map() — the channel network drawn on the surface.

Options#

--h5dir DIR: goSPL output h5 directory (required).
--mesh FILE:vkey:ckey: global mesh .npz and its vertex / cell keys (required), e.g. input/mesh.npz:v:c — supplies the global coordinates the partitions map onto and the triangulation used for interpolation.
--out FILE: output NetCDF path (default surface.nc).
--step N: output step to grid (default: the last one found).
--spacing DX[,DY]: grid spacing in mesh units (default: the median node spacing).
--fields a,b,...: surface fields to grid (default: all present in the step).
--mn FLOAT / --a0 FLOAT: chi concavity m/n (default 0.5) and reference area (default 1.0).
--base-level FLOAT: elevation defining the coast/outlet — catchments drain to it and chi is measured from it (default: the run’s sea level, read from the step’s .xmf; falls back to 0). Cells at/below it are marine and excluded.
--latlim FLOAT: geographic meshes only: crop |latitude| to this limit, dropping the singular polar caps (default 89.9).
--file-base BASE: mesh-output file base name (default gospl).

A global (spherical) mesh is auto-detected and gridded in lon/lat (degrees, CF lon/lat). A near-full-longitude global grid is treated as periodic in longitude, so the D8 flow and the rasterisation wrap across the antimeridian — a continent/river crossing ±180° stays intact, regardless of where the seam falls. Hydrology metrics are latitude-weighted (dx = R·cos(lat)·dλ). The poles are a genuine singularity of any lon/lat grid (meridians converge); they are treated as boundaries and the polar caps are cropped (--latlim). Drainage that genuinely crosses a pole needs a polar projection (out of scope) — regional/continental basins are unaffected.

Runnable as gospl-grid (installed) or python -m gospl.analyse.gridexport:

gospl-grid --h5dir myrun/h5 --mesh input/mesh.npz:v:c --out surface.nc \
    --spacing 1000 --mn 0.5

# then, in a notebook:
from gospl.analyse.gridexport import grid_export, basin_rivers, \
    plot_long_profile, plot_basin_map
g = grid_export("myrun/h5", "input/mesh.npz", step=20, spacing=1000.)
riv = basin_rivers(g, basin_id=g["main_basin"], area_threshold=5e6)
plot_long_profile(riv); plot_basin_map(g, riv)

Functions

analyse.gridexport.grid_export(h5dir, mesh, step=None, vkey='v', ckey='c', spacing=None, nx=None, ny=None, fields=None, mn=0.5, a0=1.0, base_level=None, file_base='gospl', latlim=None)[source]#

Build the regular-grid surface (fields + D8 hydrology) for one step.

Returns:: a dict of 2-D grids (ny, nx) keyed by field name plus drainage_area, basin, chi, flowdist, the 1-D axes x/y, mask, receiver (flat index, -1 at outlets), spacing (dx, dy) and main_basin (largest basin id).

analyse.gridexport.to_netcdf(result, path, time=None)[source]#: Write the gridded surface to a CF-1.x NetCDF (1-D coordinate variables + 2-D fields), readable by PyGMT (grdimage etc.) and ArcGIS. Geographic grids use CF lon/lat (degrees_east/degrees_north) so they are recognised as geographic; planar grids use x/y.

analyse.gridexport.basin_rivers(result, basin_id=None, area_threshold=None)[source]#: Extract the channel network of one basin: cells whose drainage area exceeds area_threshold and that belong to basin_id (default: the largest basin). Returns a dict with the main stem (traced from the outlet up the largest-area donor at each step) and the list of tributaries (each channel head traced down to the main stem), every path carrying x/y/dist/elev/chi/area arrays ordered outlet->source.

analyse.gridexport.plot_long_profile(rivers, xaxis='dist', ax=None, which='elev', figsize=(7, 4))[source]#

Longitudinal profile (xaxis = 'dist' distance-to-outlet, or 'chi') vs elevation: main stem bold, tributaries thin.

Parameters:: which – 'elev' (default) plots the raw gridded elevation — which keeps real depressions/lakes and TIN-interpolation roughness as small peaks/dips; 'filled' plots the priority-flood-filled elevation, which is strictly monotonic upstream (each cell drains to a lower receiver) — the smooth, hydrologically-conditioned profile.

analyse.gridexport.plot_basin_map(result, rivers, background='elev', ax=None, figsize=(7, 5), sea_level=None)[source]#

Map the channel network on the surface: background field as an image, main stem (red) + tributaries (cyan) overlaid, plus the sea-level coastline (the elev == sea_level contour). Returns the axes.

Parameters:: sea_level – elevation of the coastline contour; defaults to the run’s sea level used for the hydrology (result['base_level']). Pass a float to override, or None is used (no line) if neither is available.

Per-basin outflow fluxes — `gospl-catchment`#

Post-processing: extract, per drainage basin, the cell of maximum water discharge and the cell of maximum sediment load from a gridded goSPL output — i.e. each basin’s outflow point (river mouth) and its flux.

This is the natural downstream step after gospl.analyse.gridexport: that tool rasterises a goSPL surface to a CF-NetCDF grid carrying FA (flow accumulation / water discharge), sedLoad (sediment load) and basin (the drainage-basin id of every cell). Here, for every basin, we pick the single cell where FA peaks (the water outlet) and the single cell where sedLoad peaks (the sediment outlet), and write their lon/lat + value. Looped over a time series, this gives the migrating river-mouth fluxes used in flux maps.

The water discharge defaults to FA. Pass flow_var="fillFA" (--flow-var fillFA) to use the depression-filled accumulation instead, which routes the trunk river through lakes / pits so a basin’s outlet carries its full upstream discharge even when the channel crosses a depression (raw FA can drop to zero inside a lake).

Consistent variable names#

The defaults match the names gospl.analyse.gridexport.to_netcdf() writes, so this tool reads a surface*.nc produced by gospl-grid directly — no renaming step:

quantity	gridexport name	legacy fallback (mapOutputs)
water discharge	`FA`	`flowDischarge`
sediment load	`sedLoad`	`sedimentLoad`
basin id	`basin`	`basinID`
longitude	`lon`	`longitude`
latitude	`lat`	`latitude`

Each variable falls back to the legacy name automatically if the gridexport one is absent, so old fsdata*.nc files still work; explicit *_var arguments override both.

Speed#

The per-basin maximum is a grouped arg-max, done once with a single lexsort over the subaerial cells (sort by (basin, value); the last cell of each basin group is its maximum). This is \(O(N \log N)\) for the whole grid, replacing the previous \(O(N_\mathrm{basins} \times N_\mathrm{cells})\) scan (a full where(basin == k) per basin), which dominated runtime on a global 0.1° grid (millions of cells × thousands of basins). It is fast enough to run serially — the old MPI fan-out over basins is no longer needed.

Runnable as gospl-catchment (installed) or python -m gospl.analyse.catchment:

gospl-catchment -i inputSedFlow.csv -o flowsed

where the index CSV has two columns time,netcdf:

time,netcdf
1,results/surface1.nc
5,results/surface5.nc
10,results/surface10.nc

writes flowsed/flow{time}.csv and flowsed/sed{time}.csv (columns basin,lon,lat,val). In a notebook:

from gospl.analyse.catchment import catchment_flux, basin_outflow
out = catchment_flux("inputSedFlow.csv", "flowsed")   # batch -> CSVs
o = basin_outflow("results/surface10.nc")             # one file
flowdf, seddf = o["flow"], o["sed"]

Functions

analyse.catchment.basin_outflow(ncfile, min_cells=10, flow_var=None, sed_var=None, basin_var=None, lon_var=None, lat_var=None)[source]#

Per-basin outflow points for one gridded NetCDF.

Parameters:

ncfile – path (or open netCDF4/xarray dataset) of a gridded surface — typically a gospl-grid surface*.nc.
min_cells – basins with at most this many cells are ignored (default 10).
lat_var (flow_var, sed_var, basin_var, lon_var,) – override the variable names (defaults auto-detect the gridexport names, then the legacy ones).

Returns:

{"flow": DataFrame, "sed": DataFrame} — each basin,lon,lat,val (val in m³/yr), one row per basin, the cell where that basin’s water discharge (flow) / sediment load (sed) is largest.

analyse.catchment.catchment_flux(index, outdir=None, min_cells=10, verbose=True, flow_var=None, sed_var=None, basin_var=None, lon_var=None, lat_var=None)[source]#

Batch basin_outflow() over a time series of gridded NetCDFs.

Parameters:

index – a CSV path with columns time,netcdf (paths relative to the CSV’s directory are resolved), a pandas.DataFrame with those columns, or an iterable of (time, ncfile) pairs.
outdir – if given, write outdir/flow{time}.csv and outdir/sed{time}.csv (columns basin,lon,lat,val).

Returns:

{time: {"flow": DataFrame, "sed": DataFrame}}.

Stratigraphic sections, wells & Wheeler — `gospl-section`#

Post-processing: publication-quality stratigraphic sections, wells and Wheeler diagrams from a goSPL run (matplotlib, vector PDF/SVG).

Reassembles the global stratigraphic pile (stratZ/stratH and, when present, stratHf/phiS/phiF and provenance stratP) plus the current surface elev, drops the bedrock-sentinel layer, and builds the current layer-interface elevations (surface = elev; each interface = elev − Σ thickness above; basement = elev − Σ thickness), so eroded layers pinch out correctly. On that it offers four products:

cross_section() — a vertical section along the x- or y-direction or an arbitrary (x, y) path: stratigraphy drawn as a coloured layer mesh (color_by = deposition elevation / thickness / lithology / porosity / age / provenance) bounded by thick surface and basement lines, sampled at the mesh resolution.
horizontal_slice() — a map at constant elevation z: which layer (and its property) is preserved at that depth (subcrop / structure slice).
synthetic_well() — the full stratigraphic column at a point, as a borehole log coloured by any layer property (well_panel() draws several wells side by side on one figure with a shared colour scale).
wheeler() — a chronostratigraphic (Wheeler) chart along a transect: distance vs deposition time, coloured by preserved thickness / composition, with hiatuses and erosion shown as gaps.

Each returns the matplotlib Figure/Axes for paper tuning. CLI gospl-section (or python -m gospl.analyse.stratasection) with --kind cross|slice|well|wheeler.

Functions

analyse.stratasection.load_strata(h5dir, mesh, step=None, vkey='v', ckey='c', file_base='gospl', strat_base='stratal')[source]#

Reassemble the global stratigraphic pile + current surface for one step.

Returns:: a dict with the mesh (coords/cells/x/y, geographic), elev (n,), per-layer stratZ/stratH and optional stratHf/phiS/phiF/stratP (n, L) with the bedrock sentinel dropped, interfaces (n, L+1) current interface elevations (basement..surface), nlayers and lo (first kept layer).

analyse.stratasection.cross_section(data, kind='x', at=None, path=None, color_by='lithology', npts=None, vexag=1.0, sea_level=None, cmap=None, ax=None, tstart=0.0, dt=None, figsize=None, layer_lines=0, layer_line_kw=None, facies_depths=(0.0, 20.0, 50.0, 75.0), facies_colors=('limegreen', 'darkkhaki', 'sandybrown', 'khaki', 'c', 'teal'), facies_labels=('fluvial / deltaic plain', 'shoreface', 'distal offshore', 'upper slope', 'lower slope'), legend_loc='lower left', title_fontsize=None, xlim=None, ylim=None)[source]#

Vertical stratigraphic section along x / y / a path. Layers are drawn as a coloured pcolormesh (pinching where eroded) between the current interfaces, bounded by thick surface and basement lines, with a light-grey fill below the basement and a solid sea-level datum line drawn at the back (behind the coloured section).

The y-axis shows the true elevation (m). Vertical exaggeration is applied as a real data aspect ratio (vexag), so the tick labels stay true; with vexag == 1 the box is auto-scaled and figsize controls the look.

Parameters:

figsize – figure size (w, h) inches when a new figure is created.
vexag – vertical exaggeration (data aspect); 1 = auto-scaled (labels are always true elevation either way).
layer_lines – if > 0, overlay a thin interface line every N layers (e.g. 2 = every other layer) on top of the surface/basement lines.
layer_line_kw – dict of matplotlib line kwargs for those layer lines (defaults: thin grey).
color_by – any of deposition/thickness/lithology/coarse/ porosity/age/provenance/facies. facies classifies each layer by the water depth at deposition (see facies_field) and draws a discrete colour + legend instead of a colour bar.
facies_labels (facies_depths / facies_colors /) – tunable facies bin edges (m below sea level), colours and legend labels.
sea_level – datum line + facies depth reference; defaults to the value read from the simulation for this step (data["sea_level"]). Omitted if the run did not record one.
ylim (xlim /) – explicit axis ranges (min, max) in data units (m). When either is given the view is clipped to exactly those bounds and the aspect is left auto (so figsize sets the proportions); vexag only locks the data aspect when neither limit is pinned.

analyse.stratasection.horizontal_slice(data, z, color_by='age', cmap=None, ax=None, tstart=0.0, dt=None, figsize=None, title_fontsize=None)[source]#: Map at constant elevation z: colour each location by the property of the layer preserved at that elevation (subcrop / structure slice). Cells where z is above the surface or below the basement are blank.

analyse.stratasection.synthetic_well(data, x, y, color_by='lithology', cmap=None, ax=None, tstart=0.0, dt=None, width=1.0, figsize=None, title_fontsize=None, cbar_orientation='horizontal', vmin=None, vmax=None, colorbar=True, ylim=None)[source]#

Synthetic well at (x, y): the stratigraphic column drawn as stacked intervals (true current elevations) coloured by a layer property.

To draw several wells on one figure with a shared colour scale, pass each an ax= (your own subplots) plus common vmin/vmax and colorbar=False, then add a single colour bar — or just use well_panel(), which does this for a list of locations.

Parameters:

cbar_orientation – 'horizontal' (default, colour bar across the base) or 'vertical'.
vmax (vmin /) – colour-scale limits; default is this well’s own data range (pass shared values to make several wells comparable).
colorbar – draw the per-well colour bar (set False when sharing one).
ylim – elevation range (min, max); default is this well’s own basement-to-surface span.
title_fontsize – font size for the title (matplotlib default if None).

analyse.stratasection.well_panel(data, locations, color_by='lithology', cmap=None, tstart=0.0, dt=None, width=1.0, figsize=None, title_fontsize=None, labels=None, sharey=True, vmin=None, vmax=None, ylim=None)[source]#

Draw several synthetic wells side by side on one figure, with a shared colour scale and a single colour bar at the base.

Parameters:

locations – list of (x, y) well positions.
labels – per-well titles; default "well 1", "well 2", … (pass a list to name them, one entry per location).
color_by – layer property coloured in every well (shared scale).
vmax (vmin /) – colour-scale limits; default is the field’s finite range across all the wells.
ylim – elevation range (min, max) for every well; default is the union of the wells’ elevation bounds (basement of the deepest to surface of the highest), so the panel spans the full range between the defined wells.
sharey – share the elevation axis across wells (default True).

Returns:

(fig, axes).

analyse.stratasection.wheeler(data, kind='x', at=None, path=None, color_by='thickness', npts=None, tstart=0.0, dt=None, sea_level=None, cmap=None, ax=None, figsize=None, facies_depths=(0.0, 20.0, 50.0, 75.0), facies_colors=('limegreen', 'darkkhaki', 'sandybrown', 'khaki', 'c', 'teal'), facies_labels=('fluvial / deltaic plain', 'shoreface', 'distal offshore', 'upper slope', 'lower slope'), legend_loc='upper right', title_fontsize=None, xlim=None, ylim=None)[source]#

Wheeler (chronostratigraphic) diagram along a transect: distance (x) vs deposition time / layer (y), coloured by color_by; hiatuses and erosion are blank (layers with zero preserved thickness along the transect).

The shoreline trajectory is overlaid — the locus where the deposition water depth (sea_level − stratZ) crosses 0 (the subaerial↔marine boundary) — showing transgression/regression through time. sea_level defaults to the value read from the simulation for this step (data["sea_level"]); pass a float, or a per-layer array (length nlayers) for a time-varying paleo sea level, or it is omitted if the run did not record one.

color_by="facies" colours each layer by the depositional facies (water depth at deposition; see facies_field()) with a discrete legend — consistent with the shoreline overlay. figsize and the facies_* lists are tunable.

3-D stratigraphic wedge for ParaView — `gospl-strata-volume`#

Post-processing: build a 3-D wedge volume of the goSPL stratigraphy for one or more output steps, ready to open in ParaView.

goSPL records the stratigraphic pile in the distributed stratal.<step>.p<rank> .h5 files (per-layer stratZ/stratH/phiS and, optionally, stratHf/phiF for dual lithology and stratP for provenance), with the triangular surface mesh in topology.p<rank>.h5. This tool stacks that triangular mesh across the recorded layers into a triangular-prism (wedge) volume: each vertex of the surface mesh is replicated at every layer’s deposition elevation stratZ, and each surface triangle becomes a column of wedge cells between consecutive layers. The layer properties are written as cell data on the wedges, so colouring a slab shows that layer’s composition.

Three field modes (--field):

basic (default) — just the always-recorded per-layer thickness, elevation, layer index and porosity. Works for any run (a plain stratigraphy run needs neither dual lithology nor provenance).
lithology — coarse / fine thickness, fine fraction and the per-fraction porosities (needs the dual-lithology stratHf/phiF fields).
provenance — per-source-class volume fraction, the dominant source and the per-layer porosity (plus phiFine if the run is also dual lithology); needs the provenance stratP field.

(thickness, elevation and layer are attached in every mode.)

The basal bedrock sentinel layer (the ~1e6 m infinite reservoir goSPL keeps under the recorded deposits) is auto-detected and skipped, so the volume spans only the real recorded layers (override with --first-layer).

Output mirrors goSPL’s own layout — one HDF5 per input partition per step (<outdir>/<out>.<step>.p.h5) plus a single self-contained <outdir>/<out>.xdmf temporal collection referencing them. It runs serially (looping over all partitions) or under mpirun (the partitions are split across ranks — independent of the simulation’s processor count); the output is identical either way, so the result is trivially parallel and time-evolving.

Options#

--h5dir DIR: goSPL output h5 directory (required).
--outdir DIR: output directory, created if needed (default strata).
--out PREFIX: output file prefix (default strata); writes <outdir>/<out>.xdmf plus <outdir>/<out>.<step>.p.h5.
--field {basic,lithology,provenance}: cell-field set (default basic). basic = thickness/elevation/layer/ porosity only (works for any run); lithology = coarse/fine thickness, fine fraction and per-fraction porosity (needs the dual-lithology stratHf/phiF fields); provenance = per-source-class volume fraction + dominant source + per-layer porosity (needs stratP).
--steps S0,S1,...: comma-separated output steps to convert (default: all found). Steps with fewer than 2 recorded layers are skipped.
--tout YR (with optional --tstart YR): set the ParaView Time to the simulation year tstart + step*tout instead of the bare step index (default: step index).
--first-layer N: first layer index to include (default: auto-skip the basal bedrock-sentinel layer, detected as thickness >= 1e5 m).
--mesh-base BASE: mesh-output file base providing the current surface elev (default gospl). Layer interfaces are placed at the current surface minus the cumulative compacted thickness, so eroded layers collapse instead of overhanging; falls back to the recorded stratZ if the mesh file is absent.
--file-base BASE: stratal-file base name (default stratal).

Runnable as the installed command gospl-strata-volume or, with no install step, as python -m gospl.analyse.stratamesh. Examples:

# defaults -> writes ./strata/strata.xdmf (+ per-partition .h5)
gospl-strata-volume --h5dir myrun/h5 --field lithology --steps 0,5,10

# provenance source mix, ParaView time in years, run in parallel
mpirun -np 4 gospl-strata-volume --h5dir myrun/h5 --outdir vol \
    --field provenance --tout 100000

Functions

analyse.stratamesh.discover_steps(h5dir, file_base='stratal')[source]#: Return (sorted step list, partition count) found in h5dir.

analyse.stratamesh.detect_first_layer(stratal_path)[source]#: Index of the first non-basement layer: skip leading layers whose thickness exceeds the sentinel cut-off (the ~1e6 m infinite-bedrock reservoir). The layer structure is global, so reading one partition is representative.

analyse.stratamesh.build_partition(stratal_path, topology_path, lo, field, mesh_path=None)[source]#

Build the wedge volume for one partition / step.

Parameters:

stratal_path – stratal.<step>.p.h5 (per-layer fields).
topology_path – topology.p.h5 (coords, cells).
lo – first layer to include (basement layers below are dropped).
field – "lithology" or "provenance".
mesh_path – optional <mesh>.<step>.p.h5 providing the current surface elev. When given, the layer-interface elevations are reconstructed from the current surface minus the cumulative (compacted) thickness above each layer — so eroded layers (stratH = 0) collapse to zero-thickness wedges instead of being drawn at their stale deposition elevation (which produced spurious cells above the eroded surface). Without it, the recorded stratZ is used.

Returns:

dict with vertices (Nv, 3), wedges (Nw, 6), cells cell fields (each (Nw,)), frac (Nw, C) for provenance, and n_classes. Returns None when the step has too few recorded layers to form a wedge.

analyse.stratamesh.write_partition_h5(path, data)[source]#: Write one partition’s wedge volume to HDF5.

analyse.stratamesh.write_xdmf(out_prefix, meta, field, times=None)[source]#

Write a single self-contained <out>.xdmf (temporal collection): one time per step, each a spatial collection of the per-partition wedge grids.

Parameters:

meta – {step: [ {h5, nv, nw, cells:[names], ncl}, ... per partition ]}.
times – optional {step: value} for the ParaView Time (e.g. the simulation year); defaults to the step index.

Sediment provenance — `gospl-provenance`#

Sediment provenance / source-to-sink attribution from goSPL output.

Standalone post-processor (Approach A in docs/DESIGN_PROVENANCE.md): given a per-vertex source class (from user source-rock polygons) and a goSPL run, it tracks the eroded material down the per-step flow network and accumulates, in each sink, the deposited volume attributable to each source rock type — per basin and per pixel (mesh node) — plus the mean transport distance and an optional copper-fertility layer.

The core (ProvenanceTracker) is pure NumPy and operates on arrays, so it runs without GIS / HDF5 dependencies; the optional I/O helpers (classes_from_shapefile(), tracker_from_gospl()) import geopandas / h5py lazily.

Method (deterministic continuum of particle tracking, mass-conserving per step): for each output step the net erodep change gives the eroded volume (sources) and deposited volume (sinks); the eroded material — recycled from the local deposited pile, plus fresh bedrock of the node’s source class — is routed downstream and laid down where the model deposits, carrying its provenance and travelled distance. Routing is multiple-flow-direction by default (flux splits across all lower neighbours by slope, so a source can feed several basins), with single steepest descent available (routing='sfd').

Caveats (see the design doc): it sees the net erodep per output step (not the compute dt), re-derives the routing from the saved elevation, and is a reconstruction — exact volumes need the in-model tracer (Approach B). Provenance is not prospectivity: the Cu layer is a fertility-weighted detritus proxy, upstream of the ore-forming process.

Inputs (all per-vertex arrays are in the global input-mesh ordering — the npdata mesh v; build per-vertex labels with classes_from_shapefile() and reassemble model fields with GosplOutput):

Classes

class analyse.provenance.ProvenanceTracker(coords, source_class, n_classes, area=None, cells=None, neighbours=None, basin_id=None, cu_weight=None, routing='mfd', flow_exp=1.0, method='auto')[source]#

Accumulate source-to-sink sediment provenance over a goSPL run.

Feed cumulative erodep snapshots in time order with step(); read the attribution with basin_percentages(), pixel_fractions(), mean_distance() and cu_fraction().

Methods

`basin_percentages`()	Dict `{basin_id: array(n_classes)}` of the % contribution of each source class to each sink basin.
`cu_fraction`()	Per-node Cu-sourced fraction of the deposit (needs `cu_weight`).
`mean_distance`()	Per-node flux-weighted mean source->deposition transport distance.
`pixel_fractions`()	Per-node deposited provenance fractions (npoints, n_classes); rows that received no deposit are all-zero.
`step`(elev, erodep)	Process one output step.

__init__(coords, source_class, n_classes, area=None, cells=None, neighbours=None, basin_id=None, cu_weight=None, routing='mfd', flow_exp=1.0, method='auto')[source]#

Required

Parameters:

coords – (npoints, 2|3) global-mesh vertex coordinates (mesh v); used for travel distance and to define the vertex ordering all the other per-vertex arrays must follow.
source_class – (npoints,) integer source-rock class per vertex, each value in [0, n_classes). Obtain by point-in-polygon of the source shapefiles (classes_from_shapefile()); give vertices outside every polygon a dedicated “other”/background class — not -1.
n_classes – number of source classes.

One of cells / neighbours (the routing connectivity)

Parameters:

cells – (ntri, 3) triangular mesh cells (mesh c) — adjacency is built from these.
neighbours – pre-built (indptr, indices) CSR adjacency (instead of cells).

Optional

Parameters:

area – (npoints,) Voronoi cell area (m²); defaults to 1 (volume == thickness). Provide it for correct volumes on irregular meshes; percentages/ratios are area-independent and unaffected.
basin_id – (npoints,) sink-basin label per vertex, -1 = not in a sink basin. Optional — needed only for basin_percentages() (the per-basin roll-up). Per-pixel provenance (pixel_fractions()) is produced at every depositing node regardless.
cu_weight – (n_classes,) copper fertility per source class; needed only for cu_fraction().
routing – 'mfd' (default, multiple-flow-direction — flux splits across all lower neighbours by slope, so a source can feed several basins) or 'sfd' (single steepest descent).
flow_exp – MFD flow-partition exponent (slope power for the weights).
method – routing-sweep backend — 'auto' (default: the Numba-compiled sweep when numba is installed, else pure Python), 'numba' (require Numba), or 'python' (reference). All give identical results (same topological sweep, including the min(D, flux) deposition cap). numba removes the Python per-node loop — the dominant cost on large meshes — for a several-fold (and growing with N) per-step speedup; the vectorised edge build / erosion is shared, so the overall gain is modest on small meshes. (A sparse transport-with-loss solve was evaluated and rejected — slower at realistic sizes due to LU fill-in, and only exact when no node is sediment-undersupplied; see docs/DESIGN_PROVENANCE.md.)

basin_percentages()[source]#: Dict {basin_id: array(n_classes)} of the % contribution of each source class to each sink basin. Requires basin_id.

cu_fraction()[source]#: Per-node Cu-sourced fraction of the deposit (needs cu_weight).

mean_distance()[source]#: Per-node flux-weighted mean source->deposition transport distance.

pixel_fractions()[source]#: Per-node deposited provenance fractions (npoints, n_classes); rows that received no deposit are all-zero.

step(elev, erodep)[source]#

Process one output step.

Parameters:

elev – (npoints,) surface elevation at this step (filled elevation preferred — fillFA’s surface — so routing has no spurious pits).
erodep – (npoints,) cumulative erosion-deposition at this step.

class analyse.provenance.GosplOutput(h5dir, global_coords, file_base='gospl')[source]#

Reader that reassembles global per-vertex fields from goSPL’s partitioned HDF5 output.

goSPL writes one file per MPI partition per step (<base>.<step>.p<rank>.h5) holding only that rank’s local nodes, with the partition meshes in topology.p<rank>.h5. Provenance routing is global, so this reader maps each partition’s local nodes onto the global input mesh ordering with a KDTree — the same local↔global map goSPL builds at load time (tree.query(lcoords)) — and scatters each field into a global array. Pass the global mesh coordinates (the npdata mesh v) so the assembled arrays line up with your per-vertex source_class / basin_id.

Methods

field(step, name)

Assemble a global per-vertex array for name at output step.

__init__(h5dir, global_coords, file_base='gospl')[source]#

field(step, name)[source]#: Assemble a global per-vertex array for name at output step.

Functions

analyse.provenance.build_neighbours(cells, npoints)[source]#

Build the vertex adjacency (CSR indptr/indices) from triangular mesh cells — the undirected edge graph used for steepest-descent routing.

Parameters:

cells – (ntri, 3) integer vertex indices.
npoints – number of vertices.

Returns:

(indptr, indices) CSR adjacency arrays.

analyse.provenance.steepest_receivers(elev, indptr, indices)[source]#

Single-flow-direction (SFD) receivers: each node points to its lowest neighbour strictly below it, or to itself (a sink / outlet) if none is lower.

Returns:: integer receiver array (length npoints).

analyse.provenance.downhill_edges(elev, coords, indptr, indices, routing='mfd', exponent=1.0)[source]#

Build the weighted downhill-flow graph as CSR edges (e_indptr, e_idx, e_weight, e_seg): for each node, the receivers it routes to, the fraction of its flux each takes, and the edge length.

routing='sfd': a single edge to the steepest-descent neighbour (weight 1); empty for sinks/outlets.
routing='mfd': an edge to every lower neighbour, weight \(\propto \text{slope}^{exponent}\) and normalised to sum to one (multiple-flow-direction, as goSPL routes water/sediment). exponent is the flow-partition exponent (1 ≈ uniform-to-slope spreading).

analyse.provenance.classes_from_shapefile(coords, shapefile, attribute, background='other', crs=None)[source]#

Assign each vertex an integer class by point-in-polygon against a source-rock (or sink-basin) polygon shapefile. The result is directly usable as the tracker’s source_class: every vertex gets a class in [0, n_classes), with vertices outside every polygon assigned a background class (so there are no -1 sentinels). Lazily imports geopandas/shapely.

Parameters:

coords – (npoints, 2|3) vertex coordinates in the shapefile’s CRS (only the first two columns are used).
shapefile – path to a polygon shapefile.
attribute – polygon attribute holding the class label.
background – label for vertices outside every polygon (appended as its own class). Set to None to instead return -1 for those vertices (e.g. when building a basin_id, where -1 means “not a sink”).
crs – optional CRS to reproject the polygons to before testing.

Returns:

(class_array, {label: int} mapping). For source_class use the array as-is; for basin_id pass background=None.

analyse.provenance.write_xdmf(prefix, npoints, ncells, geom_dim, n_classes, steps, has_cu)[source]#

Write an XDMF temporal-collection wrapper for the provenance HDF5 so the per-step results open as a time series in ParaView (it references the <prefix>.h5 datasets — keep the two files side by side).

Exposes, per step: dominant source (scalar), distance, optional cu_fraction, and each source’s contribution frac_class<c> (a HyperSlab column of the fractions array).

ELA maps from paleo-temperature — `gospl-ela`#

Derive goSPL glacier-geometry maps (hela / hice) from a paleo-climate temperature map, by inverting the atmospheric lapse rate.

The equilibrium-line altitude (ELA) is the elevation at which the air temperature reaches a calibrated threshold T_ELA. With a lapse rate \(\Gamma\) (deg C per metre),

ELA(x) = (T_ref(x) - T_ELA) / Gamma

where T_ref is the temperature reduced to sea level. If your temperature map is given at the local surface elevation z(x) (the usual near-surface output of a climate model, after remapping to the goSPL mesh), reduce it on the fly:

T_ref(x) = T(x) + Gamma * z(x) ==> ELA(x) = z(x) + (T(x) - T_ELA) / Gamma

The ice-cap altitude (top of the accumulation ramp) is set a fixed band above the ELA: hice = hela + band.

The output is a single .npz with per-vertex hela and hice arrays, in the convention goSPL expects for the ice maps. Run it once per paleo-climate snapshot, then list the outputs in the ice.glaciers time series (each entry hela: [<out-without-.npz>, 'hela'] etc.). See docs/user_guide/surfproc.rst and docs/tech_guide/ice.rst.

NOTE#

The temperature and elevation maps must already be per-vertex on the goSPL mesh (same shape/order as the mesh npdata arrays). Remapping a climate model’s native grid onto the mesh is a separate, upstream step.
This derives the ELA position only. goSPL’s ablation magnitude stays precipitation-scaled (the ELA-ramp SMB); it is not a degree-day / energy-balance melt model.

Running it — terminal#

One snapshot (prints a ready-to-paste ice.glaciers entry via --start):

python scripts/ela_from_temperature.py \
    --temperature climate/t2m_21ka.npz --t-key t2m \
    --reference surface --elevation input/mesh.npz --z-key z \
    --lapse 0.0065 --t-ela -2.0 --band 400 \
    --out input/ela_21ka.npz --start -21000

Whole paleo-climate history (one map per snapshot) in a shell loop:

for t in 21000 18000 15000 12000; do
    python scripts/ela_from_temperature.py \
        --temperature climate/t2m_${t}ka.npz --t-key t2m \
        --reference surface --elevation input/mesh.npz --z-key z \
        --lapse 0.0065 --t-ela -2.0 --band 400 \
        --out input/ela_${t}.npz --start -${t}
done

python scripts/ela_from_temperature.py --help lists every option.

Running it — Jupyter notebook#

Either shell out to the script with the ! magic:

!python scripts/ela_from_temperature.py \
    --temperature climate/t2m_21ka.npz --t-key t2m \
    --reference surface --elevation input/mesh.npz --z-key z \
    --lapse 0.0065 --t-ela -2.0 --band 400 --out input/ela_21ka.npz

or import the conversion and loop over snapshots, building the glaciers list programmatically:

import sys, numpy as np
sys.path.append("scripts")              # make the helper importable
from ela_from_temperature import derive_ela

z = np.load("input/mesh.npz")["z"]      # mesh elevation (per vertex)
snapshots = {-21000: "climate/t2m_21ka.npz",
             -18000: "climate/t2m_18ka.npz"}

glaciers = []
for t, fpath in sorted(snapshots.items()):
    T = np.load(fpath)["t2m"]
    hela, hice = derive_ela(T, lapse=0.0065, t_ela=-2.0, band=400.0,
                            reference="surface", elevation=z)
    out = f"input/ela_{int(-t/1000)}ka.npz"
    np.savez(out, hela=hela, hice=hice)
    glaciers.append({"start": float(t),
                     "hela": [out[:-4], "hela"],
                     "hice": [out[:-4], "hice"]})

glaciers   # paste under the YAML `ice:` block as `glaciers:`

Functions

tools.ela_from_temperature.derive_ela(temperature, lapse, t_ela, band, reference='surface', elevation=None)[source]#

Return (hela, hice) per-vertex arrays from a temperature map.

Parameters:

temperature – per-vertex temperature (deg C).
lapse – atmospheric lapse rate (deg C per metre, e.g. 0.0065).
t_ela – temperature at the equilibrium line (deg C).
band – hice - hela accumulation-band width (m).
reference – ‘surface’ (T at the local elevation; needs elevation) or ‘sealevel’ (T already reduced to sea level).
elevation – per-vertex surface elevation (m); required for ‘surface’.

Post-processing tools#

Regular-grid NetCDF + river profiles — gospl-grid#

Options#

Per-basin outflow fluxes — gospl-catchment#

Consistent variable names#

Speed#

Stratigraphic sections, wells & Wheeler — gospl-section#

3-D stratigraphic wedge for ParaView — gospl-strata-volume#

Options#

Sediment provenance — gospl-provenance#

ELA maps from paleo-temperature — gospl-ela#

NOTE#

Running it — terminal#

Running it — Jupyter notebook#

Regular-grid NetCDF + river profiles — `gospl-grid`#

Per-basin outflow fluxes — `gospl-catchment`#

Stratigraphic sections, wells & Wheeler — `gospl-section`#

3-D stratigraphic wedge for ParaView — `gospl-strata-volume`#

Sediment provenance — `gospl-provenance`#

ELA maps from paleo-temperature — `gospl-ela`#