Datafabrix Thermal is the thermal-intelligence module of the platform. It builds a live map of every thermal junction in your fleet — from PCB substrate to rack exhaust — and closes the loop with cooling and workload in real time. Built to operate on the ThermalSense telemetry surface of the upcoming Datafabrix Gen6 thermal-aware smart backplane (sampling 2026).
Per-junction sensing at 1,000 Hz. Predictive throttle alerts. Closed-loop with cooling and workload.
Heat is the silent SLA killer of modern AI infrastructure. A 10 °C rise in board temperature can raise bit-error rate by an order of magnitude. A sustained hot spot throttles GPUs invisibly, long before any datasheet alarm fires. And cooling is the largest non-IT line item in a modern data center's energy bill: a 40% reduction in cooling energy shows up directly on the P&L and in the sustainability report.
Most data centers manage thermal at the rack — through air-handler set-points, return-air temperature, and end-of-row sensors. That worked when racks were 5 kW. At 40 kW per rack and rising, you need per-junction visibility — and a control plane that can do something about it.
Thermal gives you both. Per-junction sensing at 1,000 readings per second, predictive throttle alerts 30 seconds ahead, autonomous workload migration on hot-spot detection, and a single dashboard that connects substrate-level signal all the way up to the rack-level cooling response.
A real-time, per-junction thermal map of every backplane, board, and rack you operate. Resolution: 1,000 readings per second per sensor. Accuracy: ±0.5 °C.
ML models forecast throttle events 30 seconds before they happen. Workloads migrate or cooling adjusts — before performance degrades, not after.
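To make the 30-second look-ahead concrete, here is a minimal sketch of the idea. It is not the Datafabrix model: it swaps in a plain least-squares trend line for the ML forecast, and every name in it (`forecast_temp`, `will_throttle`, the throttle threshold) is a hypothetical chosen for illustration.

```python
from typing import Sequence


def forecast_temp(samples: Sequence[float], sample_hz: float, horizon_s: float) -> float:
    """Fit a least-squares line to recent readings and extrapolate it.

    Stand-in for a real forecasting model: assumes temperature keeps
    following its recent linear trend for the next `horizon_s` seconds.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    dt = 1.0 / sample_hz
    times = [i * dt for i in range(n)]
    t_mean = sum(times) / n
    y_mean = sum(samples) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, samples))
    den = sum((t - t_mean) ** 2 for t in times)
    slope = num / den  # degrees C per second
    # Evaluate the fitted line `horizon_s` seconds after the last sample.
    return y_mean + slope * (times[-1] + horizon_s - t_mean)


def will_throttle(samples: Sequence[float], sample_hz: float,
                  throttle_c: float, horizon_s: float = 30.0) -> bool:
    """True if the extrapolated temperature reaches the (assumed) throttle point."""
    return forecast_temp(samples, sample_hz, horizon_s) >= throttle_c
```

A caller would feed this a sliding window of recent readings for one junction and act on a `True` result, for example by requesting a migration or a cooling adjustment. A production forecaster would need to handle sensor noise and non-linear drift, which a straight line does not.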
Thermal speaks back to your cooling system. CRAC setpoints, chilled-water flow rates, and rack-level fans all become controllable variables in a single closed loop.
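As an illustration of what "a single closed loop" can mean, the sketch below shows one proportional control step for chilled-water flow. It is an assumption-laden toy, not the Thermal control plane: the function name, the gain, and the flow limits are all hypothetical, and a real deployment would go through the cooling vendor's own interface.

```python
def next_flow_pct(current_flow_pct: float, hottest_c: float, target_c: float,
                  gain_pct_per_c: float = 1.5,
                  min_pct: float = 20.0, max_pct: float = 100.0) -> float:
    """One proportional step: raise flow when the hottest junction is above
    target, lower it when below, and clamp to assumed safe flow limits."""
    error_c = hottest_c - target_c
    proposed = current_flow_pct + gain_pct_per_c * error_c
    return max(min_pct, min(max_pct, proposed))
```

Run once per control interval, this nudges flow toward whatever keeps the hottest junction at target. The clamp matters as much as the gain: a controller that can drive flow to zero or past the plant's limit is worse than no controller.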
Detect a developing hot spot. Migrate the affected workload to a cooler zone. Re-balance the cooling load. Log the incident with full attribution.
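The detect-and-migrate step can be sketched as a simple selection rule. This is a hypothetical illustration only: the `Zone` type, its fields, and the hot-spot threshold are assumptions, and the rule shown (pick the coolest zone that has room) is one plausible policy rather than the product's scheduler.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Zone:
    name: str
    peak_c: float      # hottest junction currently observed in the zone
    free_slots: int    # capacity available for migrated workloads


def pick_target(zones: Sequence[Zone], source: Zone, hot_c: float) -> Optional[Zone]:
    """Return the coolest zone with free capacity, or None if the source
    is not hot or no cooler zone can take the workload."""
    if source.peak_c < hot_c:
        return None  # no hot spot, nothing to migrate
    candidates = [
        z for z in zones
        if z.name != source.name and z.free_slots > 0 and z.peak_c < hot_c
    ]
    return min(candidates, key=lambda z: z.peak_c, default=None)
```

Returning `None` when no zone qualifies keeps the decision explicit, so the caller can fall back to a cooling adjustment instead of migrating into another hot zone.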
40% reduction in cooling energy across deployed sites. PUE that moves measurably — and shows up in your annual sustainability disclosures.
Multi-tenant fleets get per-tenant thermal budgets. Your largest customers' workloads get the cool zones; smaller tenants get fair allocation; everyone gets transparency.
Three representative engagements that illustrate the kind of value Datafabrix Thermal delivers in the field.
A 512-GPU training run is at hour 96 of 240. Subtle thermal drift on rack 9 will throttle compute in 15 minutes — invisible to standard monitoring.
Thermal predicts the throttle, migrates the affected tenant's workload to a cooler rack, raises chilled-water flow by 3%, and logs the migration with full attribution.
Zero throughput loss. The engineer learns about the migration from the morning report. Training completes on schedule.
A storage OEM is leaving 20% of rack density on the table because it cannot guarantee the thermal envelope across its full SKU mix.
Thermal models the thermal cost per workload class per chassis, then advises which SKUs can be densified into which rack positions while staying inside the envelope.
20% density unlocked. Same cooling envelope. Materially higher revenue per square foot for the operator.
A hyperscaler reports a PUE of 1.42 across one of its AI campuses. Goal: 1.25. Cooling alone costs $14M per year.
Thermal closes the loop on cooling: workload-aware setpoint optimization, per-rack airflow tuning, predictive ramping. PUE drops to 1.26 over two quarters.
$4.6M annual energy savings. Sustainability disclosures improve. Site capacity for AI growth extended without new cooling capex.
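For readers who want to sanity-check the numbers in this scenario, the arithmetic can be sketched as follows. PUE is total facility energy divided by IT energy, so the non-IT overhead per unit of IT energy is PUE minus 1. The helper names are hypothetical, and the sketch assumes IT load stays constant; it does not try to reproduce the $4.6M figure, which also depends on energy prices and on how much of the overhead is cooling.

```python
def overhead_reduction(pue_before: float, pue_after: float) -> float:
    """Fractional drop in non-IT overhead (PUE - 1) at constant IT load."""
    return (pue_before - pue_after) / (pue_before - 1.0)


def total_energy_reduction(pue_before: float, pue_after: float) -> float:
    """Fractional drop in total facility energy at constant IT load."""
    return (pue_before - pue_after) / pue_before


def savings_share(savings: float, baseline_cost: float) -> float:
    """Savings expressed as a fraction of a baseline annual cost."""
    return savings / baseline_cost
```

With the scenario's figures, moving from 1.42 to 1.26 cuts non-IT overhead by roughly 38% and total facility energy by roughly 11%, and $4.6M against a $14M cooling bill is roughly 33%.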
Tell us about your fleet and your top operational pain. We will map Datafabrix Thermal to a 90-day pilot scope — and quantify the expected outcome — within five business days.