Technical overview

How It Works

Twice a day, the system pulls tide predictions and weather ensembles, compares them to every known Redondo flood of the past five years, and returns a category from 1 to 7.

The data we pull

Every run collects three independent streams:

  • NOAA tide predictions for the Tacoma reference station, datum-corrected to Redondo Beach observations.
  • Five ensemble forecasts from GFS, ECMWF, ICON, GEM, and UKMO — pressure, wind vectors, and surface roughness for the next 72 hours.
  • A local observation snapshot at forecast time: recent pressure trend, wind persistence, and water level anomaly at Tacoma.

The classifier

A deterministic rules layer turns the raw numbers into the four signals that actually drive Redondo floods: astronomical tide height, pressure anomaly, onshore wind stress, and antecedent water level. Each signal contributes to a base score.
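
A minimal sketch of how a rules layer like this could turn the four signals into a base score and a category. Everything numeric here is a placeholder: the names (Signals, base_score, category), the units, the weights, and the cut points are assumptions made for illustration, not the system's actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """The four signals named above. Units are assumptions for this sketch."""
    tide_ft: float                # predicted astronomical tide height
    pressure_anomaly_hpa: float   # pressure minus normal; negative means low
    onshore_wind_stress: float    # hypothetical 0..1 index
    antecedent_anomaly_ft: float  # water level above prediction beforehand


def base_score(s: Signals) -> float:
    """Add fixed, hand-set contributions. Every weight is a placeholder."""
    score = 0.0
    score += max(0.0, s.tide_ft - 11.0) * 1.0         # only high tides count
    score += max(0.0, -s.pressure_anomaly_hpa) * 0.1  # low pressure adds water
    score += s.onshore_wind_stress * 2.0              # wind pushing onshore
    score += max(0.0, s.antecedent_anomaly_ft) * 1.5  # water already elevated
    return score


def category(score: float) -> int:
    """Map a score onto categories 1..7 using hypothetical cut points."""
    cut_points = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    return 1 + sum(score >= c for c in cut_points)
```

Because the function has no randomness and no hidden state, the same inputs always give the same score, which is the property that makes a rules layer auditable.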

The five weather forecasts rarely agree on the details. One might call for slightly lower pressure, another for a stronger wind shift. Rather than pick one and hope, the system runs the classifier hundreds of times — each run using a slightly different combination of what the five forecasts are telling us. The category that comes up most often becomes the final answer. If most runs land in the same category, confidence is high. If they disagree, the report says so (e.g., “Cat 3 most likely, but roughly a 15% chance of Cat 4 if pressure drops further than the midpoint forecast suggests”).
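
The voting step can be sketched roughly as follows. This assumes each forecast is a flat dictionary of numeric fields and that a "combination" means a random weighted average of the members; the real system may combine them differently. The function name ensemble_vote, the field names, and the default of 500 runs are all hypothetical.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Tuple

Forecast = Dict[str, float]  # e.g. {"pressure_hpa": 1002.0, "wind_ms": 9.0}


def ensemble_vote(
    forecasts: List[Forecast],
    classify: Callable[[Forecast], int],
    runs: int = 500,
    seed: int = 0,
) -> Tuple[int, Dict[int, float]]:
    """Classify many random blends of the member forecasts.

    Each run draws random weights, normalizes them to sum to 1, and forms a
    weighted average of every field. Returns the most frequent category and
    the share of runs that landed in each category.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    counts: Counter = Counter()
    for _ in range(runs):
        raw = [rng.random() for _ in forecasts]
        total = sum(raw)
        weights = [w / total for w in raw]
        blend = {
            key: sum(w * f[key] for w, f in zip(weights, forecasts))
            for key in forecasts[0]
        }
        counts[classify(blend)] += 1
    shares = {cat: n / runs for cat, n in sorted(counts.items())}
    modal = counts.most_common(1)[0][0]
    return modal, shares
```

A result such as (3, {3: 0.85, 4: 0.15}) would correspond to the kind of report quoted above: Cat 3 most likely, with a minority of runs landing in Cat 4. On an exact tie, Counter.most_common returns the category that was counted first, so a production version would probably want an explicit tie-break rule.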

The categories

Cat    Meaning
1–2    Ordinary high tides. No action.
3      First level where we notify subscribers. Minor ponding possible.
4      Meaningful shoreline flooding likely.
5      December 2022 territory: street-level flooding.
6–7    Rare, extreme stacking. Last one was in the historical record.

The self-audit

After every event, a separate job compares the prediction against what actually happened — NOAA water-level observations, NWS alerts, and community reports. The classifier’s running hit rate and false-alarm rate are tracked and used to tune future thresholds.
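
As an illustration of the bookkeeping involved, the two rates could be computed from a log of (alert issued, flood observed) pairs. The definitions below are assumptions: hit rate is taken as the share of observed floods that were alerted, and false-alarm rate as the share of issued alerts that were not followed by a flood. The source does not say which definitions the audit job uses, and audit_rates is a hypothetical name.

```python
from typing import Iterable, Tuple


def audit_rates(events: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (hit_rate, false_alarm_rate) from (alerted, flooded) pairs.

    hit_rate         = hits / (hits + misses)
    false_alarm_rate = false alarms / (hits + false alarms)
    A rate with an empty denominator is reported as 0.0.
    """
    hits = misses = false_alarms = 0
    for alerted, flooded in events:
        if alerted and flooded:
            hits += 1
        elif alerted and not flooded:
            false_alarms += 1
        elif flooded:
            misses += 1
        # neither alerted nor flooded: a correct quiet day, not counted here
    floods = hits + misses
    alerts = hits + false_alarms
    hit_rate = hits / floods if floods else 0.0
    false_alarm_rate = false_alarms / alerts if alerts else 0.0
    return hit_rate, false_alarm_rate
```

Tracking both numbers matters because they pull in opposite directions: lowering a threshold tends to raise the hit rate and the false-alarm rate together.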

The LLM’s role

The system uses OpenAI’s o1 reasoning model as its primary language model, with gpt-4-1106-preview for second-opinion reviews. The LLM does not predict the flood — that’s the deterministic classifier described above. The classifier is fully auditable, scored against every known Redondo flood event, and its math can be reproduced by anyone reading the same weather data.

What the LLM does

  • Writes the narrative reasoning in each report. Given the classifier’s inputs (tide, pressure, wind, antecedent water) and its output category, the LLM drafts the plain-language “why” — the case tonight’s conditions make for that category. This is the part of the report a human actually reads.
  • Compares to past events. Given a shortlist of analog flood events from the retrieval layer, the LLM picks the most representative one and writes a one-paragraph comparison — “tonight’s setup most resembles December 14, 2022, but with a lower pressure drop and weaker wind persistence.”
  • Provides a second-opinion review. Before an alert is sent, a second model re-reads the reasoning with fresh context and flags obvious inconsistencies. If the review disagrees strongly enough with the primary output, the run logs an error, and the audit log records the disagreement for later inspection.

Why a reasoning model?

Flood-narrative writing has to thread a chain of causation: high astronomical tide + dropping pressure + persistent onshore wind → water stacks up against the shoreline → ponding and minor structural flooding. Earlier GPT-4 variants would sometimes skip a step or conflate magnitudes. o1 does its chain-of-thought explicitly before committing to text, which has been noticeably more reliable in testing. The trade-off is cost per report — it’s higher than a standard chat model, and it’s budgeted and monitored.

What the LLM is not allowed to do

  • Change the category. The category comes from the deterministic classifier. The LLM can describe it, not move it.
  • Decide who gets alerted. Alert rules are hard-coded: Category 3+ triggers SMS, and subscriber preference determines the channel.
  • Access the subscriber list. The classifier, LLM, and messaging paths are entirely separate. Names and phone numbers never enter the model context.

Put another way: if OpenAI went dark tomorrow, the classifier would still produce a category and the alert pipeline would still send a text — subscribers would just lose the plain-language reasoning and get a shorter “Cat X forecast, peak tide Y ft at Z:ZZ” message. The system degrades gracefully.
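
The fallback described above might look something like this. The threshold of Category 3 and the short message wording come from this page; the function names and the way the narrative is appended are assumptions for the sketch.

```python
from typing import Optional

ALERT_THRESHOLD = 3  # Category 3+ notifies subscribers


def should_alert(category: int) -> bool:
    """Hard-coded rule: the LLM has no say in whether an alert goes out."""
    return category >= ALERT_THRESHOLD


def build_message(
    category: int,
    peak_ft: float,
    peak_time: str,
    narrative: Optional[str] = None,
) -> str:
    """Compose the alert text, with or without the LLM narrative.

    If the language model is unavailable, narrative is None and the
    subscriber still receives the short classifier-only message.
    """
    short = f"Cat {category} forecast, peak tide {peak_ft:.1f} ft at {peak_time}"
    if narrative:
        return f"{short}. {narrative}"
    return short
```

Keeping the narrative as an optional argument means the alert path never has to wait on, or fail with, the language model.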

Areas covered

The forecast targets the Puget Sound shoreline in southwestern King County, Washington, from Burien and Three Tree Point in the north to Dash Point in the south. Tide, pressure, and wind conditions behave similarly across this stretch, so a single upstream data source, the NOAA Tacoma tide station (9446484), serves all of it. Local exposure differences (sheltered bays vs. open beaches) affect wave action but not the underlying tide cycle.

Figure: Coverage area. A schematic map of the mainland coast of King County, Washington, from Burien in the north to Dash Point in the south, with Vashon and Maury Islands across East Passage to the west. Markers are placed by real coordinates; shoreline shapes are stylized. Labeled points: NOAA Tacoma (station 9446484), Burien, Normandy Park, Des Moines Marina, Saltwater State Park, Redondo Beach, Federal Way, Dash Point.