Post-processing tools#
The gospl.analyse package and a couple of helpers in gospl.tools
provide the post-processing / analysis scripts that turn a finished goSPL
run (XDMF + per-partition HDF5) into publication products: regular-grid NetCDFs,
per-basin river profiles and outflow fluxes, stratigraphic sections / wells /
Wheeler diagrams, a 3-D stratigraphic wedge and source-to-sink provenance.
Each is importable as a function API and runnable as a console command (see
Running goSPL for the command-line usage). They need the optional extras
pip install gospl[analysis] (numba, geopandas, netCDF4,
matplotlib); the heavy dependencies are imported lazily so importing the
modules themselves is cheap.
Regular-grid NetCDF + river profiles — gospl-grid#
Post-processing: rasterise a goSPL surface to a regular CF-NetCDF grid for PyGMT / ArcGIS, with drainage basins, chi (\(\chi\)) and drainage area computed on the grid — plus per-basin river longitudinal profiles.
goSPL writes an unstructured triangular mesh (XDMF + per-partition HDF5) that
PyGMT and ArcGIS do not read directly. This tool reassembles the global mesh,
interpolates every surface field of a chosen output step onto a regular grid
(honouring the mesh triangulation), runs a standard raster D8 hydrology pass on
the gridded elevation, and writes one NetCDF holding every field plus
drainage_area, basin (integer outlet id) and chi.
Raster D8 hydrology (self-contained, no heavy deps): a priority-flood + epsilon fill guarantees drainage, then steepest-descent receivers give the drainage area, the basin each cell drains to, the flow distance to the outlet, and \(\chi = \int (A_0/A)^{m/n}\,dl\) integrated upstream from base level. These are raster/D8 quantities (the convention for chi and basins), so they can differ slightly from goSPL’s internal MFD drainage.
The companion API extracts and plots river profiles for a chosen basin:
basin_rivers()— the channel network (cells with drainage area above a threshold) of one basin, split into the main stem and its tributaries, each with along-channel distance, elevation, chi and area.plot_long_profile()— longitudinal profile (distance or chi vs elevation), main stem highlighted.plot_basin_map()— the channel network drawn on the surface.
Options#
--h5dir DIRgoSPL output
h5directory (required).--mesh FILE:vkey:ckeyglobal mesh
.npzand its vertex / cell keys (required), e.g.input/mesh.npz:v:c— supplies the global coordinates the partitions map onto and the triangulation used for interpolation.--out FILEoutput NetCDF path (default
surface.nc).--step Noutput step to grid (default: the last one found).
--spacing DX[,DY]grid spacing in mesh units (default: the median node spacing).
--fields a,b,...surface fields to grid (default: all present in the step).
--mn FLOAT/--a0 FLOATchi concavity
m/n(default 0.5) and reference area (default 1.0).--base-level FLOATelevation defining the coast/outlet — catchments drain to it and chi is measured from it (default: the run’s sea level, read from the step’s
.xmf; falls back to 0). Cells at/below it are marine and excluded.--latlim FLOATgeographic meshes only: crop
|latitude|to this limit, dropping the singular polar caps (default 89.9).--file-base BASEmesh-output file base name (default
gospl).
A global (spherical) mesh is auto-detected and gridded in lon/lat
(degrees, CF lon/lat). A near-full-longitude global grid is treated as
periodic in longitude, so the D8 flow and the rasterisation wrap across the
antimeridian — a continent/river crossing ±180° stays intact, regardless of
where the seam falls. Hydrology metrics are latitude-weighted
(dx = R·cos(lat)·dλ). The poles are a genuine singularity of any lon/lat
grid (meridians converge); they are treated as boundaries and the polar caps are
cropped (--latlim). Drainage that genuinely crosses a pole needs a polar
projection (out of scope) — regional/continental basins are unaffected.
Runnable as gospl-grid (installed) or python -m gospl.analyse.gridexport:
gospl-grid --h5dir myrun/h5 --mesh input/mesh.npz:v:c --out surface.nc \
--spacing 1000 --mn 0.5
# then, in a notebook:
from gospl.analyse.gridexport import grid_export, basin_rivers, \
plot_long_profile, plot_basin_map
g = grid_export("myrun/h5", "input/mesh.npz", step=20, spacing=1000.)
riv = basin_rivers(g, basin_id=g["main_basin"], area_threshold=5e6)
plot_long_profile(riv); plot_basin_map(g, riv)
Functions
- analyse.gridexport.grid_export(h5dir, mesh, step=None, vkey='v', ckey='c', spacing=None, nx=None, ny=None, fields=None, mn=0.5, a0=1.0, base_level=None, file_base='gospl', latlim=None)[source]#
Build the regular-grid surface (fields + D8 hydrology) for one step.
- Returns:
a dict of 2-D grids (
ny, nx) keyed by field name plusdrainage_area,basin,chi,flowdist, the 1-D axesx/y,mask,receiver(flat index, -1 at outlets),spacing(dx, dy)andmain_basin(largest basin id).
- analyse.gridexport.to_netcdf(result, path, time=None)[source]#
Write the gridded surface to a CF-1.x NetCDF (1-D coordinate variables + 2-D fields), readable by PyGMT (
grdimageetc.) and ArcGIS. Geographic grids use CFlon/lat(degrees_east/degrees_north) so they are recognised as geographic; planar grids usex/y.
- analyse.gridexport.basin_rivers(result, basin_id=None, area_threshold=None)[source]#
Extract the channel network of one basin: cells whose drainage area exceeds
area_thresholdand that belong tobasin_id(default: the largest basin). Returns a dict with the main stem (traced from the outlet up the largest-area donor at each step) and the list of tributaries (each channel head traced down to the main stem), every path carryingx/y/dist/elev/chi/areaarrays ordered outlet->source.
- analyse.gridexport.plot_long_profile(rivers, xaxis='dist', ax=None, which='elev', figsize=(7, 4))[source]#
Longitudinal profile (
xaxis='dist'distance-to-outlet, or'chi') vs elevation: main stem bold, tributaries thin.- Parameters:
which –
'elev'(default) plots the raw gridded elevation — which keeps real depressions/lakes and TIN-interpolation roughness as small peaks/dips;'filled'plots the priority-flood-filled elevation, which is strictly monotonic upstream (each cell drains to a lower receiver) — the smooth, hydrologically-conditioned profile.
- analyse.gridexport.plot_basin_map(result, rivers, background='elev', ax=None, figsize=(7, 5), sea_level=None)[source]#
Map the channel network on the surface:
backgroundfield as an image, main stem (red) + tributaries (cyan) overlaid, plus the sea-level coastline (theelev == sea_levelcontour). Returns the axes.- Parameters:
sea_level – elevation of the coastline contour; defaults to the run’s sea level used for the hydrology (
result['base_level']). Pass a float to override, orNoneis used (no line) if neither is available.
Per-basin outflow fluxes — gospl-catchment#
Post-processing: extract, per drainage basin, the cell of maximum water discharge and the cell of maximum sediment load from a gridded goSPL output — i.e. each basin’s outflow point (river mouth) and its flux.
This is the natural downstream step after gospl.analyse.gridexport: that
tool rasterises a goSPL surface to a CF-NetCDF grid carrying FA (flow
accumulation / water discharge), sedLoad (sediment load) and basin (the
drainage-basin id of every cell). Here, for every basin, we pick the single cell
where FA peaks (the water outlet) and the single cell where sedLoad peaks
(the sediment outlet), and write their lon/lat + value. Looped over a
time series, this gives the migrating river-mouth fluxes used in flux maps.
The water discharge defaults to FA. Pass flow_var="fillFA"
(--flow-var fillFA) to use the depression-filled accumulation instead,
which routes the trunk river through lakes / pits so a basin’s outlet carries
its full upstream discharge even when the channel crosses a depression (raw
FA can drop to zero inside a lake).
Consistent variable names#
The defaults match the names gospl.analyse.gridexport.to_netcdf() writes,
so this tool reads a surface*.nc produced by gospl-grid directly — no
renaming step:
quantity |
gridexport name |
legacy fallback (mapOutputs) |
|---|---|---|
water discharge |
|
|
sediment load |
|
|
basin id |
|
|
longitude |
|
|
latitude |
|
|
Each variable falls back to the legacy name automatically if the gridexport one
is absent, so old fsdata*.nc files still work; explicit *_var arguments
override both.
Speed#
The per-basin maximum is a grouped arg-max, done once with a single
lexsort over the subaerial cells (sort by (basin, value); the last cell
of each basin group is its maximum). This is \(O(N \log N)\) for the whole
grid, replacing the previous \(O(N_\mathrm{basins} \times N_\mathrm{cells})\)
scan (a full where(basin == k) per basin), which dominated runtime on a
global 0.1° grid (millions of cells × thousands of basins). It is fast enough to
run serially — the old MPI fan-out over basins is no longer needed.
Runnable as gospl-catchment (installed) or
python -m gospl.analyse.catchment:
gospl-catchment -i inputSedFlow.csv -o flowsed
where the index CSV has two columns time,netcdf:
time,netcdf
1,results/surface1.nc
5,results/surface5.nc
10,results/surface10.nc
writes flowsed/flow{time}.csv and flowsed/sed{time}.csv (columns
basin,lon,lat,val). In a notebook:
from gospl.analyse.catchment import catchment_flux, basin_outflow
out = catchment_flux("inputSedFlow.csv", "flowsed") # batch -> CSVs
o = basin_outflow("results/surface10.nc") # one file
flowdf, seddf = o["flow"], o["sed"]
Functions
- analyse.catchment.basin_outflow(ncfile, min_cells=10, flow_var=None, sed_var=None, basin_var=None, lon_var=None, lat_var=None)[source]#
Per-basin outflow points for one gridded NetCDF.
- Parameters:
ncfile – path (or open
netCDF4/xarraydataset) of a gridded surface — typically agospl-gridsurface*.nc.min_cells – basins with at most this many cells are ignored (default 10).
lat_var (flow_var, sed_var, basin_var, lon_var,) – override the variable names (defaults auto-detect the gridexport names, then the legacy ones).
- Returns:
{"flow": DataFrame, "sed": DataFrame}— eachbasin,lon,lat,val(valin m³/yr), one row per basin, the cell where that basin’s water discharge (flow) / sediment load (sed) is largest.
- analyse.catchment.catchment_flux(index, outdir=None, min_cells=10, verbose=True, flow_var=None, sed_var=None, basin_var=None, lon_var=None, lat_var=None)[source]#
Batch
basin_outflow()over a time series of gridded NetCDFs.- Parameters:
index – a CSV path with columns
time,netcdf(paths relative to the CSV’s directory are resolved), apandas.DataFramewith those columns, or an iterable of(time, ncfile)pairs.outdir – if given, write
outdir/flow{time}.csvandoutdir/sed{time}.csv(columnsbasin,lon,lat,val).
- Returns:
{time: {"flow": DataFrame, "sed": DataFrame}}.
Stratigraphic sections, wells & Wheeler — gospl-section#
Post-processing: publication-quality stratigraphic sections, wells and Wheeler diagrams from a goSPL run (matplotlib, vector PDF/SVG).
Reassembles the global stratigraphic pile (stratZ/stratH and, when
present, stratHf/phiS/phiF and provenance stratP) plus the
current surface elev, drops the bedrock-sentinel layer, and builds the
current layer-interface elevations (surface = elev; each interface =
elev − Σ thickness above; basement = elev − Σ thickness), so eroded
layers pinch out correctly. On that it offers four products:
cross_section()— a vertical section along the x- or y-direction or an arbitrary (x, y) path: stratigraphy drawn as a coloured layer mesh (color_by= deposition elevation / thickness / lithology / porosity / age / provenance) bounded by thick surface and basement lines, sampled at the mesh resolution.horizontal_slice()— a map at constant elevation z: which layer (and its property) is preserved at that depth (subcrop / structure slice).synthetic_well()— the full stratigraphic column at a point, as a borehole log coloured by any layer property (well_panel()draws several wells side by side on one figure with a shared colour scale).wheeler()— a chronostratigraphic (Wheeler) chart along a transect: distance vs deposition time, coloured by preserved thickness / composition, with hiatuses and erosion shown as gaps.
Each returns the matplotlib Figure/Axes for paper tuning. CLI
gospl-section (or python -m gospl.analyse.stratasection) with
--kind cross|slice|well|wheeler.
Functions
- analyse.stratasection.load_strata(h5dir, mesh, step=None, vkey='v', ckey='c', file_base='gospl', strat_base='stratal')[source]#
Reassemble the global stratigraphic pile + current surface for one step.
- Returns:
a dict with the mesh (
coords/cells/x/y,geographic),elev(n,), per-layerstratZ/stratHand optionalstratHf/phiS/phiF/stratP(n, L) with the bedrock sentinel dropped,interfaces(n, L+1) current interface elevations (basement..surface),nlayersandlo(first kept layer).
- analyse.stratasection.cross_section(data, kind='x', at=None, path=None, color_by='lithology', npts=None, vexag=1.0, sea_level=None, cmap=None, ax=None, tstart=0.0, dt=None, figsize=None, layer_lines=0, layer_line_kw=None, facies_depths=(0.0, 20.0, 50.0, 75.0), facies_colors=('limegreen', 'darkkhaki', 'sandybrown', 'khaki', 'c', 'teal'), facies_labels=('fluvial / deltaic plain', 'shoreface', 'distal offshore', 'upper slope', 'lower slope'), legend_loc='lower left', title_fontsize=None, xlim=None, ylim=None)[source]#
Vertical stratigraphic section along x / y / a path. Layers are drawn as a coloured
pcolormesh(pinching where eroded) between the current interfaces, bounded by thick surface and basement lines, with a light-grey fill below the basement and a solid sea-level datum line drawn at the back (behind the coloured section).The y-axis shows the true elevation (m). Vertical exaggeration is applied as a real data aspect ratio (
vexag), so the tick labels stay true; withvexag == 1the box is auto-scaled andfigsizecontrols the look.- Parameters:
figsize – figure size (w, h) inches when a new figure is created.
vexag – vertical exaggeration (data aspect); 1 = auto-scaled (labels are always true elevation either way).
layer_lines – if > 0, overlay a thin interface line every N layers (e.g. 2 = every other layer) on top of the surface/basement lines.
layer_line_kw – dict of matplotlib line kwargs for those layer lines (defaults: thin grey).
color_by – any of
deposition/thickness/lithology/coarse/porosity/age/provenance/facies.faciesclassifies each layer by the water depth at deposition (seefacies_field) and draws a discrete colour + legend instead of a colour bar.facies_labels (facies_depths / facies_colors /) – tunable facies bin edges (m below sea level), colours and legend labels.
sea_level – datum line + facies depth reference; defaults to the value read from the simulation for this step (
data["sea_level"]). Omitted if the run did not record one.ylim (xlim /) – explicit axis ranges
(min, max)in data units (m). When either is given the view is clipped to exactly those bounds and the aspect is left auto (sofigsizesets the proportions);vexagonly locks the data aspect when neither limit is pinned.
- analyse.stratasection.horizontal_slice(data, z, color_by='age', cmap=None, ax=None, tstart=0.0, dt=None, figsize=None, title_fontsize=None)[source]#
Map at constant elevation
z: colour each location by the property of the layer preserved at that elevation (subcrop / structure slice). Cells where z is above the surface or below the basement are blank.
- analyse.stratasection.synthetic_well(data, x, y, color_by='lithology', cmap=None, ax=None, tstart=0.0, dt=None, width=1.0, figsize=None, title_fontsize=None, cbar_orientation='horizontal', vmin=None, vmax=None, colorbar=True, ylim=None)[source]#
Synthetic well at
(x, y): the stratigraphic column drawn as stacked intervals (true current elevations) coloured by a layer property.To draw several wells on one figure with a shared colour scale, pass each an
ax=(your own subplots) plus commonvmin/vmaxandcolorbar=False, then add a single colour bar — or just usewell_panel(), which does this for a list of locations.- Parameters:
cbar_orientation –
'horizontal'(default, colour bar across the base) or'vertical'.vmax (vmin /) – colour-scale limits; default is this well’s own data range (pass shared values to make several wells comparable).
colorbar – draw the per-well colour bar (set
Falsewhen sharing one).ylim – elevation range
(min, max); default is this well’s own basement-to-surface span.title_fontsize – font size for the title (matplotlib default if None).
- analyse.stratasection.well_panel(data, locations, color_by='lithology', cmap=None, tstart=0.0, dt=None, width=1.0, figsize=None, title_fontsize=None, labels=None, sharey=True, vmin=None, vmax=None, ylim=None)[source]#
Draw several synthetic wells side by side on one figure, with a shared colour scale and a single colour bar at the base.
- Parameters:
locations – list of
(x, y)well positions.labels – per-well titles; default
"well 1","well 2", … (pass a list to name them, one entry per location).color_by – layer property coloured in every well (shared scale).
vmax (vmin /) – colour-scale limits; default is the field’s finite range across all the wells.
ylim – elevation range
(min, max)for every well; default is the union of the wells’ elevation bounds (basement of the deepest to surface of the highest), so the panel spans the full range between the defined wells.sharey – share the elevation axis across wells (default True).
- Returns:
(fig, axes).
- analyse.stratasection.wheeler(data, kind='x', at=None, path=None, color_by='thickness', npts=None, tstart=0.0, dt=None, sea_level=None, cmap=None, ax=None, figsize=None, facies_depths=(0.0, 20.0, 50.0, 75.0), facies_colors=('limegreen', 'darkkhaki', 'sandybrown', 'khaki', 'c', 'teal'), facies_labels=('fluvial / deltaic plain', 'shoreface', 'distal offshore', 'upper slope', 'lower slope'), legend_loc='upper right', title_fontsize=None, xlim=None, ylim=None)[source]#
Wheeler (chronostratigraphic) diagram along a transect: distance (x) vs deposition time / layer (y), coloured by
color_by; hiatuses and erosion are blank (layers with zero preserved thickness along the transect).The shoreline trajectory is overlaid — the locus where the deposition water depth (
sea_level − stratZ) crosses 0 (the subaerial↔marine boundary) — showing transgression/regression through time.sea_leveldefaults to the value read from the simulation for this step (data["sea_level"]); pass a float, or a per-layer array (lengthnlayers) for a time-varying paleo sea level, or it is omitted if the run did not record one.color_by="facies"colours each layer by the depositional facies (water depth at deposition; seefacies_field()) with a discrete legend — consistent with the shoreline overlay.figsizeand thefacies_*lists are tunable.
3-D stratigraphic wedge for ParaView — gospl-strata-volume#
Post-processing: build a 3-D wedge volume of the goSPL stratigraphy for one or more output steps, ready to open in ParaView.
goSPL records the stratigraphic pile in the distributed stratal.<step>.p<rank>
.h5 files (per-layer stratZ/stratH/phiS and, optionally,
stratHf/phiF for dual lithology and stratP for provenance), with the
triangular surface mesh in topology.p<rank>.h5. This tool stacks that
triangular mesh across the recorded layers into a triangular-prism (wedge)
volume: each vertex of the surface mesh is replicated at every layer’s
deposition elevation stratZ, and each surface triangle becomes a column of
wedge cells between consecutive layers. The layer properties are written as
cell data on the wedges, so colouring a slab shows that layer’s composition.
Three field modes (--field):
basic(default) — just the always-recorded per-layerthickness,elevation,layerindex andporosity. Works for any run (a plain stratigraphy run needs neither dual lithology nor provenance).lithology— coarse / fine thickness, fine fraction and the per-fraction porosities (needs the dual-lithologystratHf/phiFfields).provenance— per-source-class volume fraction, the dominant source and the per-layerporosity(plusphiFineif the run is also dual lithology); needs the provenancestratPfield.
(thickness, elevation and layer are attached in every mode.)
The basal bedrock sentinel layer (the ~1e6 m infinite reservoir goSPL keeps
under the recorded deposits) is auto-detected and skipped, so the volume spans
only the real recorded layers (override with --first-layer).
Output mirrors goSPL’s own layout — one HDF5 per input partition per step
(<outdir>/<out>.<step>.p<p>.h5) plus a single self-contained
<outdir>/<out>.xdmf temporal collection referencing them. It runs serially
(looping over all partitions) or under mpirun (the partitions are split
across ranks — independent of the simulation’s processor count); the output is
identical either way, so the result is trivially parallel and time-evolving.
Options#
--h5dir DIRgoSPL output
h5directory (required).--outdir DIRoutput directory, created if needed (default
strata).--out PREFIXoutput file prefix (default
strata); writes<outdir>/<out>.xdmfplus<outdir>/<out>.<step>.p<p>.h5.--field {basic,lithology,provenance}cell-field set (default
basic).basic= thickness/elevation/layer/ porosity only (works for any run);lithology= coarse/fine thickness, fine fraction and per-fraction porosity (needs the dual-lithologystratHf/phiFfields);provenance= per-source-class volume fraction + dominant source + per-layerporosity(needsstratP).--steps S0,S1,...comma-separated output steps to convert (default: all found). Steps with fewer than 2 recorded layers are skipped.
--tout YR(with optional--tstart YR)set the ParaView
Timeto the simulation yeartstart + step*toutinstead of the bare step index (default: step index).--first-layer Nfirst layer index to include (default: auto-skip the basal bedrock-sentinel layer, detected as thickness >= 1e5 m).
--mesh-base BASEmesh-output file base providing the current surface
elev(defaultgospl). Layer interfaces are placed at the current surface minus the cumulative compacted thickness, so eroded layers collapse instead of overhanging; falls back to the recordedstratZif the mesh file is absent.--file-base BASEstratal-file base name (default
stratal).
Runnable as the installed command gospl-strata-volume or, with no install
step, as python -m gospl.analyse.stratamesh. Examples:
# defaults -> writes ./strata/strata.xdmf (+ per-partition .h5)
gospl-strata-volume --h5dir myrun/h5 --field lithology --steps 0,5,10
# provenance source mix, ParaView time in years, run in parallel
mpirun -np 4 gospl-strata-volume --h5dir myrun/h5 --outdir vol \
--field provenance --tout 100000
Functions
- analyse.stratamesh.discover_steps(h5dir, file_base='stratal')[source]#
Return
(sorted step list, partition count)found inh5dir.
- analyse.stratamesh.detect_first_layer(stratal_path)[source]#
Index of the first non-basement layer: skip leading layers whose thickness exceeds the sentinel cut-off (the ~1e6 m infinite-bedrock reservoir). The layer structure is global, so reading one partition is representative.
- analyse.stratamesh.build_partition(stratal_path, topology_path, lo, field, mesh_path=None)[source]#
Build the wedge volume for one partition / step.
- Parameters:
stratal_path –
stratal.<step>.p<p>.h5(per-layer fields).topology_path –
topology.p<p>.h5(coords,cells).lo – first layer to include (basement layers below are dropped).
field –
"lithology"or"provenance".mesh_path – optional
<mesh>.<step>.p<p>.h5providing the current surfaceelev. When given, the layer-interface elevations are reconstructed from the current surface minus the cumulative (compacted) thickness above each layer — so eroded layers (stratH= 0) collapse to zero-thickness wedges instead of being drawn at their stale deposition elevation (which produced spurious cells above the eroded surface). Without it, the recordedstratZis used.
- Returns:
dict with
vertices(Nv, 3),wedges(Nw, 6),cellscell fields (each (Nw,)),frac(Nw, C) for provenance, andn_classes. ReturnsNonewhen the step has too few recorded layers to form a wedge.
- analyse.stratamesh.write_partition_h5(path, data)[source]#
Write one partition’s wedge volume to HDF5.
- analyse.stratamesh.write_xdmf(out_prefix, meta, field, times=None)[source]#
Write a single self-contained
<out>.xdmf(temporal collection): one time per step, each a spatial collection of the per-partition wedge grids.- Parameters:
meta –
{step: [ {h5, nv, nw, cells:[names], ncl}, ... per partition ]}.times – optional
{step: value}for the ParaViewTime(e.g. the simulation year); defaults to the step index.
Sediment provenance — gospl-provenance#
Sediment provenance / source-to-sink attribution from goSPL output.
Standalone post-processor (Approach A in docs/DESIGN_PROVENANCE.md): given a
per-vertex source class (from user source-rock polygons) and a goSPL run, it
tracks the eroded material down the per-step flow network and accumulates, in
each sink, the deposited volume attributable to each source rock type — per
basin and per pixel (mesh node) — plus the mean transport distance and an
optional copper-fertility layer.
The core (ProvenanceTracker) is pure NumPy and operates on arrays, so it
runs without GIS / HDF5 dependencies; the optional I/O helpers
(classes_from_shapefile(), tracker_from_gospl()) import geopandas
/ h5py lazily.
Method (deterministic continuum of particle tracking, mass-conserving per step):
for each output step the net erodep change gives the eroded volume (sources)
and deposited volume (sinks); the eroded material — recycled from the local
deposited pile, plus fresh bedrock of the node’s source class — is routed
downstream and laid down where the model deposits, carrying its provenance and
travelled distance. Routing is multiple-flow-direction by default (flux
splits across all lower neighbours by slope, so a source can feed several
basins), with single steepest descent available (routing='sfd').
Caveats (see the design doc): it sees the net erodep per output step (not the compute dt), re-derives the routing from the saved elevation, and is a reconstruction — exact volumes need the in-model tracer (Approach B). Provenance is not prospectivity: the Cu layer is a fertility-weighted detritus proxy, upstream of the ore-forming process.
Inputs (all per-vertex arrays are in the global input-mesh ordering — the
npdata mesh v; build per-vertex labels with
classes_from_shapefile() and reassemble model fields with
GosplOutput):
Classes
- class analyse.provenance.ProvenanceTracker(coords, source_class, n_classes, area=None, cells=None, neighbours=None, basin_id=None, cu_weight=None, routing='mfd', flow_exp=1.0, method='auto')[source]#
Accumulate source-to-sink sediment provenance over a goSPL run.
Feed cumulative
erodepsnapshots in time order withstep(); read the attribution withbasin_percentages(),pixel_fractions(),mean_distance()andcu_fraction().Methods
Dict
{basin_id: array(n_classes)}of the % contribution of each source class to each sink basin.Per-node Cu-sourced fraction of the deposit (needs
cu_weight).Per-node flux-weighted mean source->deposition transport distance.
Per-node deposited provenance fractions (npoints, n_classes); rows that received no deposit are all-zero.
step(elev, erodep)Process one output step.
- __init__(coords, source_class, n_classes, area=None, cells=None, neighbours=None, basin_id=None, cu_weight=None, routing='mfd', flow_exp=1.0, method='auto')[source]#
Required
- Parameters:
coords – (npoints, 2|3) global-mesh vertex coordinates (mesh
v); used for travel distance and to define the vertex ordering all the other per-vertex arrays must follow.source_class – (npoints,) integer source-rock class per vertex, each value in
[0, n_classes). Obtain by point-in-polygon of the source shapefiles (classes_from_shapefile()); give vertices outside every polygon a dedicated “other”/background class — not -1.n_classes – number of source classes.
One of
cells/neighbours(the routing connectivity)- Parameters:
cells – (ntri, 3) triangular mesh cells (mesh
c) — adjacency is built from these.neighbours – pre-built
(indptr, indices)CSR adjacency (instead ofcells).
Optional
- Parameters:
area – (npoints,) Voronoi cell area (m²); defaults to 1 (volume == thickness). Provide it for correct volumes on irregular meshes; percentages/ratios are area-independent and unaffected.
basin_id – (npoints,) sink-basin label per vertex,
-1= not in a sink basin. Optional — needed only forbasin_percentages()(the per-basin roll-up). Per-pixel provenance (pixel_fractions()) is produced at every depositing node regardless.cu_weight – (n_classes,) copper fertility per source class; needed only for
cu_fraction().routing –
'mfd'(default, multiple-flow-direction — flux splits across all lower neighbours by slope, so a source can feed several basins) or'sfd'(single steepest descent).flow_exp – MFD flow-partition exponent (slope power for the weights).
method – routing-sweep backend —
'auto'(default: the Numba-compiled sweep whennumbais installed, else pure Python),'numba'(require Numba), or'python'(reference). All give identical results (same topological sweep, including themin(D, flux)deposition cap).numbaremoves the Python per-node loop — the dominant cost on large meshes — for a several-fold (and growing with N) per-step speedup; the vectorised edge build / erosion is shared, so the overall gain is modest on small meshes. (A sparse transport-with-loss solve was evaluated and rejected — slower at realistic sizes due to LU fill-in, and only exact when no node is sediment-undersupplied; seedocs/DESIGN_PROVENANCE.md.)
- basin_percentages()[source]#
Dict
{basin_id: array(n_classes)}of the % contribution of each source class to each sink basin. Requiresbasin_id.
- class analyse.provenance.GosplOutput(h5dir, global_coords, file_base='gospl')[source]#
Reader that reassembles global per-vertex fields from goSPL’s partitioned HDF5 output.
goSPL writes one file per MPI partition per step (
<base>.<step>.p<rank>.h5) holding only that rank’s local nodes, with the partition meshes intopology.p<rank>.h5. Provenance routing is global, so this reader maps each partition’s local nodes onto the global input mesh ordering with a KDTree — the same local↔global map goSPL builds at load time (tree.query(lcoords)) — and scatters each field into a global array. Pass the global mesh coordinates (thenpdatameshv) so the assembled arrays line up with your per-vertexsource_class/basin_id.Methods
field(step, name)Assemble a global per-vertex array for
nameat outputstep.
Functions
- analyse.provenance.build_neighbours(cells, npoints)[source]#
Build the vertex adjacency (CSR
indptr/indices) from triangular mesh cells — the undirected edge graph used for steepest-descent routing.- Parameters:
cells – (ntri, 3) integer vertex indices.
npoints – number of vertices.
- Returns:
(indptr, indices) CSR adjacency arrays.
- analyse.provenance.steepest_receivers(elev, indptr, indices)[source]#
Single-flow-direction (SFD) receivers: each node points to its lowest neighbour strictly below it, or to itself (a sink / outlet) if none is lower.
- Returns:
integer receiver array (length npoints).
- analyse.provenance.downhill_edges(elev, coords, indptr, indices, routing='mfd', exponent=1.0)[source]#
Build the weighted downhill-flow graph as CSR edges
(e_indptr, e_idx, e_weight, e_seg): for each node, the receivers it routes to, the fraction of its flux each takes, and the edge length.routing='sfd': a single edge to the steepest-descent neighbour (weight 1); empty for sinks/outlets.routing='mfd': an edge to every lower neighbour, weight \(\propto \text{slope}^{exponent}\) and normalised to sum to one (multiple-flow-direction, as goSPL routes water/sediment).exponentis the flow-partition exponent (1 ≈ uniform-to-slope spreading).
- analyse.provenance.classes_from_shapefile(coords, shapefile, attribute, background='other', crs=None)[source]#
Assign each vertex an integer class by point-in-polygon against a source-rock (or sink-basin) polygon shapefile. The result is directly usable as the tracker’s
source_class: every vertex gets a class in[0, n_classes), with vertices outside every polygon assigned a background class (so there are no-1sentinels). Lazily importsgeopandas/shapely.- Parameters:
coords – (npoints, 2|3) vertex coordinates in the shapefile’s CRS (only the first two columns are used).
shapefile – path to a polygon shapefile.
attribute – polygon attribute holding the class label.
background – label for vertices outside every polygon (appended as its own class). Set to
Noneto instead return-1for those vertices (e.g. when building abasin_id, where -1 means “not a sink”).crs – optional CRS to reproject the polygons to before testing.
- Returns:
(class_array, {label: int} mapping). Forsource_classuse the array as-is; forbasin_idpassbackground=None.
- analyse.provenance.write_xdmf(prefix, npoints, ncells, geom_dim, n_classes, steps, has_cu)[source]#
Write an XDMF temporal-collection wrapper for the provenance HDF5 so the per-step results open as a time series in ParaView (it references the
<prefix>.h5datasets — keep the two files side by side).Exposes, per step:
dominantsource (scalar),distance, optionalcu_fraction, and each source’s contributionfrac_class<c>(a HyperSlab column of thefractionsarray).
ELA maps from paleo-temperature — gospl-ela#
Derive goSPL glacier-geometry maps (hela / hice) from a paleo-climate
temperature map, by inverting the atmospheric lapse rate.
The equilibrium-line altitude (ELA) is the elevation at which the air
temperature reaches a calibrated threshold T_ELA. With a lapse rate
\(\Gamma\) (deg C per metre),
ELA(x) = (T_ref(x) - T_ELA) / Gamma
where T_ref is the temperature reduced to sea level. If your temperature map
is given at the local surface elevation z(x) (the usual near-surface output
of a climate model, after remapping to the goSPL mesh), reduce it on the fly:
T_ref(x) = T(x) + Gamma * z(x) ==> ELA(x) = z(x) + (T(x) - T_ELA) / Gamma
The ice-cap altitude (top of the accumulation ramp) is set a fixed band above
the ELA: hice = hela + band.
The output is a single .npz with per-vertex hela and hice arrays, in
the convention goSPL expects for the ice maps. Run it once per paleo-climate
snapshot, then list the outputs in the ice.glaciers time series (each entry
hela: [<out-without-.npz>, 'hela'] etc.). See docs/user_guide/surfproc.rst
and docs/tech_guide/ice.rst.
NOTE#
The temperature and elevation maps must already be per-vertex on the goSPL mesh (same shape/order as the mesh
npdataarrays). Remapping a climate model’s native grid onto the mesh is a separate, upstream step.This derives the ELA position only. goSPL’s ablation magnitude stays precipitation-scaled (the ELA-ramp SMB); it is not a degree-day / energy-balance melt model.
Running it — terminal#
One snapshot (prints a ready-to-paste ice.glaciers entry via --start):
python scripts/ela_from_temperature.py \
--temperature climate/t2m_21ka.npz --t-key t2m \
--reference surface --elevation input/mesh.npz --z-key z \
--lapse 0.0065 --t-ela -2.0 --band 400 \
--out input/ela_21ka.npz --start -21000
Whole paleo-climate history (one map per snapshot) in a shell loop:
for t in 21000 18000 15000 12000; do
python scripts/ela_from_temperature.py \
--temperature climate/t2m_${t}ka.npz --t-key t2m \
--reference surface --elevation input/mesh.npz --z-key z \
--lapse 0.0065 --t-ela -2.0 --band 400 \
--out input/ela_${t}.npz --start -${t}
done
python scripts/ela_from_temperature.py --help lists every option.
Running it — Jupyter notebook#
Either shell out to the script with the ! magic:
!python scripts/ela_from_temperature.py \
--temperature climate/t2m_21ka.npz --t-key t2m \
--reference surface --elevation input/mesh.npz --z-key z \
--lapse 0.0065 --t-ela -2.0 --band 400 --out input/ela_21ka.npz
or import the conversion and loop over snapshots, building the glaciers
list programmatically:
import sys, numpy as np
sys.path.append("scripts") # make the helper importable
from ela_from_temperature import derive_ela
z = np.load("input/mesh.npz")["z"] # mesh elevation (per vertex)
snapshots = {-21000: "climate/t2m_21ka.npz",
-18000: "climate/t2m_18ka.npz"}
glaciers = []
for t, fpath in sorted(snapshots.items()):
T = np.load(fpath)["t2m"]
hela, hice = derive_ela(T, lapse=0.0065, t_ela=-2.0, band=400.0,
reference="surface", elevation=z)
out = f"input/ela_{int(-t/1000)}ka.npz"
np.savez(out, hela=hela, hice=hice)
glaciers.append({"start": float(t),
"hela": [out[:-4], "hela"],
"hice": [out[:-4], "hice"]})
glaciers # paste under the YAML `ice:` block as `glaciers:`
Functions
- tools.ela_from_temperature.derive_ela(temperature, lapse, t_ela, band, reference='surface', elevation=None)[source]#
Return
(hela, hice)per-vertex arrays from a temperature map.- Parameters:
temperature – per-vertex temperature (deg C).
lapse – atmospheric lapse rate (deg C per metre, e.g. 0.0065).
t_ela – temperature at the equilibrium line (deg C).
band – hice - hela accumulation-band width (m).
reference – ‘surface’ (T at the local elevation; needs
elevation) or ‘sealevel’ (T already reduced to sea level).elevation – per-vertex surface elevation (m); required for ‘surface’.