DATAFABRIX THERMAL · Layers 2–3

Real-time thermal intelligence
at every junction.

Datafabrix Thermal is the thermal-intelligence module of the platform. It builds a live map of every thermal junction in your fleet, from PCB substrate to rack exhaust, and closes the loop with cooling and workload in real time. It is built to operate on the ThermalSense telemetry surface of the upcoming Datafabrix Gen6 thermal-aware smart backplane (sampling 2026).

Beta · H2 2026
DATAFABRIX THERMAL · MODULE

Thermal Intelligence — engineered for AI-class workloads.

Per-junction sensing at 1,000 Hz. Predictive throttle alerts. Closed-loop with cooling and workload.

WHY IT MATTERS FOR AI DATA CENTERS

The problem we solve.

Heat is the silent SLA killer of modern AI infrastructure. A 10 °C rise in board temperature can raise bit-error rate by an order of magnitude. A sustained hot spot throttles GPUs invisibly, long before any datasheet alarm fires. And cooling is the largest non-IT line item in a modern data center's energy bill: a 40% reduction shows up directly on the P&L and in the sustainability report.

Most data centers manage thermals at the rack level, through air-handler setpoints, return-air temperature, and end-of-row sensors. That worked when racks drew 5 kW. At 40 kW per rack and rising, you need per-junction visibility and a control plane that can act on it.

Thermal gives you both. Per-slot sensing at 1,000 readings per second, predictive throttle alerts 30 seconds ahead, autonomous workload migration on hot-spot detection, and a single dashboard that connects substrate-level signal all the way up to the rack-level cooling response.

40% · Cooling energy saved
30 s · Predictive throttle warning
±0.5 °C · Per-slot accuracy
0 · Hot-spot SLA escapes

CAPABILITIES

What Datafabrix Thermal does.

  1. Live thermal map

    A real-time, per-junction thermal map of every backplane, board, and rack you operate. Resolution: 1,000 readings per second per sensor. Accuracy: ±0.5 °C.

  2. Predictive throttle alerts

    ML models forecast throttle events 30 seconds before they happen. Workloads migrate or cooling adjusts — before performance degrades, not after.

  3. Closed-loop cooling control

    Thermal speaks back to your cooling system. CRAC setpoints, chilled-water flow rates, and rack-level fans all become controllable variables in a single closed loop.

  4. Hot-spot autonomous response

    Detect a developing hot spot. Migrate the affected workload to a cooler zone. Re-balance the cooling load. Log the incident with full attribution.

  5. PUE optimization

    40% reduction in cooling energy across deployed sites. PUE that moves measurably — and shows up in your annual sustainability disclosures.

  6. Per-tenant thermal accounting

    Multi-tenant fleets get per-tenant thermal budgets. Your largest customers' workloads get the cool zones; smaller tenants get fair allocation; everyone gets transparency.
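
The idea behind capability 2 can be pictured with a deliberately simple stand-in for the ML models: fit a straight line to the most recent readings from one sensor and check whether the extrapolated temperature reaches a throttle threshold inside the 30-second horizon. The function names, the 85 °C threshold, and the least-squares trend are all assumptions made for illustration; this sketch does not describe Datafabrix Thermal's actual models or API.

```python
from typing import Optional, Sequence

SAMPLE_RATE_HZ = 1_000.0   # per-sensor rate quoted on this page
HORIZON_S = 30.0           # warning horizon quoted on this page
THROTTLE_C = 85.0          # hypothetical throttle threshold (assumption)


def seconds_to_throttle(readings_c: Sequence[float],
                        sample_rate_hz: float = SAMPLE_RATE_HZ,
                        throttle_c: float = THROTTLE_C) -> Optional[float]:
    """Estimate seconds until throttle_c is reached from a linear trend.

    Returns 0.0 if the latest reading is already at or above the threshold,
    and None if there is too little data or the trend is flat or cooling.
    """
    n = len(readings_c)
    if n < 2:
        return None
    latest = readings_c[-1]
    if latest >= throttle_c:
        return 0.0

    # Ordinary least-squares slope of temperature against time (°C per second).
    dt = 1.0 / sample_rate_hz
    mean_t = (n - 1) * dt / 2.0
    mean_c = sum(readings_c) / n
    sxx = sum((i * dt - mean_t) ** 2 for i in range(n))
    sxy = sum((i * dt - mean_t) * (c - mean_c)
              for i, c in enumerate(readings_c))
    slope = sxy / sxx
    if slope <= 0.0:
        return None
    return (throttle_c - latest) / slope


def should_alert(readings_c: Sequence[float],
                 horizon_s: float = HORIZON_S) -> bool:
    """True if the linear trend reaches the threshold within horizon_s."""
    eta = seconds_to_throttle(readings_c)
    return eta is not None and eta <= horizon_s
```

A one-second window of readings rising at 0.5 °C per second from 80 °C would trigger an alert here, because the extrapolated crossing is about nine seconds away. A production forecaster would need to handle sensor noise, non-linear thermal dynamics, and workload changes, which a straight-line fit ignores.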

HOW IT HELPS AI DATA CENTERS

Real scenarios. Real outcomes.

Three representative engagements that illustrate the kind of value Datafabrix Thermal delivers in the field.

The Problem

AI training fleet thermal stability

A 512-GPU training run is at hour 96 of 240. Subtle thermal drift on rack 9 will throttle compute in 15 minutes — invisible to standard monitoring.

Our Approach

Thermal predicts the throttle, migrates the affected tenant's workload to a cooler rack, raises chilled-water flow by 3%, and logs the migration with full attribution.

The Outcome

Zero throughput loss. The engineer learns about the migration from the morning report. Training completes on schedule.

The Problem

Storage density unlocked

A storage OEM is leaving 20% of rack density on the table because it can't guarantee the thermal envelope across its full SKU mix.

Our Approach

Thermal models the exact thermal cost per workload class per chassis, then advises which SKUs can be densified into which rack positions while staying inside the envelope.

The Outcome

20% density unlocked. Same cooling envelope. Materially higher revenue per square foot for the operator.

The Problem

PUE move that moves the P&L

A hyperscaler reports a PUE of 1.42 across one of its AI campuses. Goal: 1.25. Cooling alone costs $14M per year.

Our Approach

Thermal closes the loop on cooling: workload-aware setpoint optimization, per-rack airflow tuning, predictive ramping. PUE drops to 1.26 over two quarters.

The Outcome

$4.6M annual energy savings. Sustainability disclosures improve. Site capacity for AI growth extended without new cooling capex.
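
The arithmetic behind a PUE move like this one can be sketched from the standard definition, PUE = total facility energy / IT equipment energy, so the non-IT overhead per unit of IT energy is PUE − 1. The helper names below are hypothetical, and the calculation assumes the IT load stays constant across the two quarters, which the scenario does not state.

```python
def overhead_per_it_unit(pue: float) -> float:
    """Non-IT energy consumed per unit of IT energy (PUE = total / IT)."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return pue - 1.0


def overhead_reduction(pue_before: float, pue_after: float) -> float:
    """Fractional drop in non-IT overhead, assuming constant IT load."""
    before = overhead_per_it_unit(pue_before)
    if before == 0.0:
        raise ValueError("no overhead to reduce at PUE 1.0")
    return (before - overhead_per_it_unit(pue_after)) / before


def facility_energy_reduction(pue_before: float, pue_after: float) -> float:
    """Fractional drop in total facility energy, assuming constant IT load."""
    return (pue_before - pue_after) / pue_before


# Figures taken from the scenario above.
print(f"non-IT overhead reduction: {overhead_reduction(1.42, 1.26):.1%}")
print(f"total facility reduction:  {facility_energy_reduction(1.42, 1.26):.1%}")
print(f"savings vs. cooling spend: {4.6 / 14.0:.1%}")
```

With the scenario's figures this works out to roughly a 38% drop in non-IT overhead and an 11% drop in total facility energy, while the quoted $4.6M is about 33% of the $14M cooling spend. The percentages need not match exactly, since PUE overhead includes non-cooling loads such as power distribution losses, and energy prices vary.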

INTEGRATIONS

Drops cleanly into your existing stack.

Open-standards first. Your existing tooling keeps working — Datafabrix Thermal adds the AI-infrastructure-specific layer you've been missing.

Redfish · OpenBMC · DCIM · CRAC controllers · Liquid-cooling vendors · ASHRAE standards
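
As an illustration of what open-standards telemetry looks like, the sketch below parses a payload shaped like a Redfish Thermal resource and ranks sensors by how close they are to their critical threshold. The property names (`Temperatures`, `Name`, `ReadingCelsius`, `UpperThresholdCritical`) come from the Redfish Thermal schema; the sample values, the helper names, and the 5 °C headroom cutoff are assumptions, and nothing here describes how Datafabrix Thermal itself consumes Redfish data. Newer Redfish schema versions expose similar readings through the ThermalSubsystem and Sensor resources instead.

```python
import json

# Illustrative payload only; values and sensor names are made up.
SAMPLE_THERMAL_JSON = """
{
  "@odata.id": "/redfish/v1/Chassis/1/Thermal",
  "Temperatures": [
    {"Name": "CPU1 Temp", "ReadingCelsius": 61.0, "UpperThresholdCritical": 90.0},
    {"Name": "Inlet Temp", "ReadingCelsius": 24.5, "UpperThresholdCritical": 42.0},
    {"Name": "GPU3 Temp", "ReadingCelsius": 86.0, "UpperThresholdCritical": 90.0},
    {"Name": "Absent Sensor", "ReadingCelsius": null, "UpperThresholdCritical": 90.0}
  ]
}
"""


def headroom_by_sensor(thermal: dict) -> dict:
    """Map sensor name to degrees Celsius remaining below its critical limit.

    Sensors with a missing reading or missing threshold are skipped.
    """
    headroom = {}
    for sensor in thermal.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        limit = sensor.get("UpperThresholdCritical")
        if reading is None or limit is None:
            continue
        headroom[sensor.get("Name", "unnamed")] = limit - reading
    return headroom


def sensors_near_limit(thermal: dict, min_headroom_c: float = 5.0) -> list:
    """Names of sensors with less than min_headroom_c left, tightest first."""
    headroom = headroom_by_sensor(thermal)
    near = [name for name, margin in headroom.items() if margin < min_headroom_c]
    return sorted(near, key=headroom.get)


thermal = json.loads(SAMPLE_THERMAL_JSON)
print(sensors_near_limit(thermal))
```

In a real deployment the payload would be fetched from a BMC over HTTPS with authentication, which is left out here to keep the sketch self-contained.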
EXPLORE THE PLATFORM

Datafabrix Thermal works best with...

Ready to see Thermal in action?

Tell us about your fleet and your top operational pain point. Within five business days, we will map Datafabrix Thermal to a 90-day pilot scope and quantify the expected outcome.