Changelog
New releases, fixes, and improvements to NodeFoundry.
Added
- Rolling upgrade support for major Ceph version transitions (Reef → Squid). NodeFoundry drains each OSD, upgrades the host OS and Ceph packages, and recommissions before moving to the next node.
- IPMI power-cycle and hard reboot actions are now available directly from the node list view in the dashboard.
- New `nf cluster health --verbose` flag shows per-OSD and per-placement-group status inline.
- API endpoint `POST /v1/clusters/{id}/upgrade` with progress streaming via `GET /v1/operations/{id}`.
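Consumed from a client, the new upgrade endpoint pairs naturally with polling `GET /v1/operations/{id}`. A minimal Python sketch, assuming a JSON response with `status` and `progress` fields (the exact response shape is an assumption, not the documented API), with a stubbed `fetch` standing in for a real HTTP client:

```python
import time

def wait_for_operation(op_id, fetch, poll_interval=0.0):
    """Poll GET /v1/operations/{id} until the operation finishes.

    `fetch` is any callable mapping a path to a decoded JSON dict; the
    {"status": ..., "progress": ...} shape is assumed for illustration.
    """
    while True:
        op = fetch(f"/v1/operations/{op_id}")
        if op["status"] in ("succeeded", "failed"):
            return op
        time.sleep(poll_interval)

# Stub fetch that simulates an upgrade operation progressing to completion.
_responses = iter([
    {"status": "running", "progress": 30},
    {"status": "running", "progress": 70},
    {"status": "succeeded", "progress": 100},
])
result = wait_for_operation("op-123", lambda path: next(_responses))
```

In a real client, `fetch` would issue an authenticated HTTP GET; injecting it as a callable keeps the loop testable.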
Changed
- Dashboard cluster overview page redesigned with a 24-hour utilization timeline and IOPS sparklines per node.
- API rate limits for Professional tier raised from 300 to 600 requests per minute.
- The `nf cluster create --wait` flag now streams live progress instead of polling.
Fixed
- S.M.A.R.T. polling failure on certain NVMe devices that report no temperature sensor.
- `nf node list` now correctly shows nodes that are in maintenance mode rather than showing them as offline.
- Race condition in the OSD activation sequence on nodes with more than 24 drives.
Fixed
- Race condition in OSD drain that could stall upgrade operations on clusters with more than 64 OSDs.
- Alert webhook retries were not respecting exponential backoff; on network errors they flooded the endpoint.
- Node registration timing issue on certain Supermicro BIOS versions that send DHCP requests before the NIC link is stable.
- Off-by-one error in the PG count calculation when creating erasure-coded pools with `k=4, m=2`.
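For reference, the retry behavior the webhook fix restores can be sketched as a standard exponential backoff schedule; the `base`, `cap`, and jitter defaults below are illustrative assumptions, not NodeFoundry's actual parameters:

```python
import random

def backoff_delays(attempts, base=1.0, cap=60.0, jitter=False):
    """Exponential backoff schedule: base * 2**n, capped at `cap` seconds.

    With jitter=True each delay is drawn uniformly from [0, capped],
    which spreads retries out instead of flooding the endpoint.
    """
    delays = []
    for n in range(attempts):
        capped = min(cap, base * (2 ** n))
        delays.append(random.uniform(0, capped) if jitter else capped)
    return delays
```

Without the backoff, every network error triggered an immediate retry; with it, `backoff_delays(5)` yields delays of 1, 2, 4, 8, and 16 seconds.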
Added
- S.M.A.R.T. monitoring for HDDs and NVMe devices, with configurable alert thresholds per drive model. Predictive failure events generate dashboard warnings and optional webhook alerts.
- Centralized log aggregation: OSD, monitor, and service logs from all nodes are indexed in Loki and searchable from the dashboard and via `nf log search`.
- Webhook alerting: route cluster health events to Slack, PagerDuty, or any HTTP endpoint. Configure per-severity thresholds and retry policies.
- New `nf log search` CLI command with `--since`, `--until`, `--node`, and `--level` flags.
Changed
- Minimum supported Ceph version raised to Pacific (16.2.x). Luminous and Nautilus clusters are no longer supported for new deployments.
- iPXE bootstrap image now uses a read-only tmpfs overlay to improve boot reliability on nodes with marginal disks.
- DHCP lease timeout reduced from 24 hours to 4 hours to improve re-registration after IP changes.
Added
- NVMe-oF gateway service provisioning. Deploy and manage NVMe-oF targets on top of RBD pools from the dashboard and CLI.
- Multi-cluster selector in the dashboard: switch between clusters without navigating away.
- `nf pool create --erasure` with support for named erasure code profiles and automatic PG count calculation.
- Node maintenance mode: `nf node maintenance <name>` drains OSDs and suppresses alerts while you perform hardware work.
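The automatic PG count calculation likely follows the common Ceph sizing heuristic, shown here as a sketch; the `target_pgs_per_osd=100` default and power-of-two rounding come from upstream Ceph guidance and are not necessarily NodeFoundry's exact formula:

```python
def ec_pg_count(num_osds, k, m, target_pgs_per_osd=100):
    """Rough PG count for an erasure-coded pool using the common Ceph
    heuristic: (OSDs * target PGs per OSD) / pool size, rounded up to
    the next power of two. Pool size for an EC pool is k + m.
    """
    size = k + m
    raw = (num_osds * target_pgs_per_osd) / size
    # Round up to the next power of two.
    pgs = 1
    while pgs < raw:
        pgs *= 2
    return pgs
```

For example, 24 OSDs with a `k=4, m=2` profile (pool size 6) give a raw target of 400 PGs, rounded up to 512.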
Changed
- Cluster create API now returns an `operation_id` and supports streaming progress via SSE at `/v1/operations/{id}/stream`.
- Dashboard metrics retention increased from 7 days to 30 days on Professional tier.
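A client consuming the SSE stream only needs to decode `data:` lines. This minimal parser is a sketch, and the `progress`/`phase` payload fields are assumed for illustration rather than taken from the API docs:

```python
import json

def parse_sse(stream_lines):
    """Yield decoded JSON payloads from the `data:` lines of an SSE stream.

    Events are separated by blank lines; other SSE fields (event:, id:)
    are ignored in this sketch.
    """
    for line in stream_lines:
        line = line.rstrip("\n")
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())

# Simulated stream for a cluster-create operation (field names assumed).
sample = [
    'data: {"progress": 40, "phase": "bootstrapping monitors"}\n',
    '\n',
    'data: {"progress": 100, "phase": "done"}\n',
]
events = list(parse_sse(sample))
```

In practice the lines would come from an open HTTP response to `/v1/operations/{id}/stream` rather than a list.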
Fixed
- CephFS MDS placement on single-rack topologies put the active and standby daemons on the same host.
- CRUSH map generation failed on clusters where node hostnames contained uppercase letters.
Added
- General availability. NodeFoundry is out of beta.
- Object Gateway (RGW) service provisioning with S3-compatible endpoint and configurable zonegroups.
- API key management: create scoped read-only and read-write keys per account.
- Ceph Squid (19.x) support.