Pextra CloudEnvironment®
Pextra CloudEnvironment is an enterprise private cloud platform engineered from first principles for the operational realities of large-scale, multi-tenant, GPU-accelerated infrastructure. Built on a fully distributed control plane backed by CockroachDB and a KVM-based compute fabric, it is designed to deliver consistent performance and governance whether you are operating three racks in a single datacenter or thirty sites across multiple regions.
Where incumbents like VMware were architected when “cloud” meant on-premises virtualization, and where OpenStack carries the operational overhead of its open-source assembly-required heritage, Pextra CloudEnvironment targets a third path: the simplicity and developer experience of a public cloud, delivered entirely on infrastructure you own and control.
Platform Architecture
Control Plane: Distributed-First by Design
The Pextra control plane avoids the single-point-of-failure model that has historically plagued hypervisor management systems (vCenter, Prism Central). Its metadata and state layer is built on CockroachDB — a distributed SQL database that uses Raft-based consensus to maintain availability during node and network failures.
This means:
- No primary controller to lose. State is replicated across all control-plane nodes. Loss of a minority of nodes does not interrupt API availability or ongoing provisioning operations.
- Linearizable transactions. VM state, tenant quota tracking, billing metering, and access control are all managed with full ACID guarantees — no eventual-consistency edge cases in critical operations.
- Horizontal scale. Control-plane throughput scales by adding nodes; there is no vertical “vCenter sizing” exercise.
The REST API layer is stateless and sits in front of CockroachDB; any API node can serve any request, enabling load balancing across all control-plane instances with no session affinity requirement.
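Because any API node can serve any request, a client can simply rotate to the next control-plane endpoint on failure. The sketch below illustrates that failover pattern; the endpoint URLs and injected `send` callable are illustrative assumptions, not part of any official Pextra SDK.

```python
import itertools

class ControlPlaneClient:
    """Minimal sketch of session-affinity-free failover across API nodes."""

    def __init__(self, endpoints, send):
        self._endpoints = list(endpoints)
        self._cycle = itertools.cycle(self._endpoints)
        self._send = send  # callable(url, path) -> response; injected for testing

    def request(self, path):
        last_err = None
        # Try each endpoint at most once per request; any node can answer.
        for _ in range(len(self._endpoints)):
            url = next(self._cycle)
            try:
                return self._send(url, path)
            except ConnectionError as err:
                last_err = err  # node unreachable: rotate to the next API node
        raise RuntimeError("all control-plane endpoints unreachable") from last_err
```

In production this logic usually lives in a load balancer, but the absence of session affinity means even a naive client-side rotation like this is correct.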
Compute Fabric
Each compute host runs a KVM-based hypervisor with QEMU managing individual VM processes. The platform installs a lightweight host agent that handles:
| Agent Responsibility | Detail |
|---|---|
| VM lifecycle | Create, start, stop, live-migrate, snapshot |
| Resource reporting | CPU, memory, IO, network utilization → control plane |
| Storage I/O path | Connects VM disk images to the storage backend via libvirt/librbd |
| Network attachment | Programs OVS/OVN rules for tenant network isolation |
| GPU scheduling | Reports GPU topology, programs SR-IOV VF assignments per VM |
| Health heartbeat | Reports node liveness; drives HA failover decisions |
VirtIO paravirtualized drivers are used for all I/O paths (storage, network, memory balloon, RNG) to minimize hypervisor overhead — typical CPU overhead is 2–4% on Linux workloads.
Storage Architecture
Pextra CloudEnvironment supports multiple storage backends, with the reference architecture using Ceph for all storage tiers:
- VM block storage — Ceph RBD with thin provisioning, snapshots, and clone-on-write for fast VM deployment from templates.
- Object storage — Ceph RGW providing an S3-compatible API for tenant object storage, ISO uploads, and backup staging.
- Shared filesystem — CephFS for workloads requiring POSIX filesystems, NFS-like semantics, or shared read access from multiple VMs.
Ceph is integrated directly into the control plane: storage pools, CRUSH rules, and OSD health are exposed through the same API that governs compute resources. Storage quotas are enforced per tenant at the pool level.
For deployments requiring external SAN or NFS, the platform supports iSCSI and NFS v4 backends as secondary storage targets.
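The per-tenant, pool-level quota enforcement described above can be sketched as an admission check: an allocation is admitted only if it fits under the tenant's pool limit. Class and field names are illustrative; the real control plane performs this check transactionally in CockroachDB.

```python
class PoolQuota:
    """Hedged sketch of tenant quota enforcement at the storage-pool level."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def reserve(self, nbytes):
        # Admit the allocation only if it fits under the tenant's pool quota.
        if self.used_bytes + nbytes > self.limit_bytes:
            raise PermissionError("tenant pool quota exceeded")
        self.used_bytes += nbytes
```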
Networking: OVN-based Overlay
Pextra uses Open Virtual Network (OVN) layered on Open vSwitch (OVS) for tenant network isolation and overlay routing. This provides:
- Per-tenant virtual routers: Each tenant receives an isolated L3 routing domain; east-west traffic between tenant VMs never leaves the hypervisor host fabric.
- Geneve encapsulation: Tenant traffic is encapsulated in Geneve tunnels between hypervisor nodes — no VLAN sprawl, no per-tenant physical VLAN provisioning required.
- Distributed routing: L3 routing decisions are made at the source hypervisor, eliminating “tromboning” through a central router for east-west traffic.
- Security groups: Stateful firewall rules are enforced in OVS kernel datapath on each host — no dedicated firewall appliance in the data path.
- Floating IPs / NAT: The platform manages external IP assignment and DNAT mappings via the OVN logical router gateway port.
- Load Balancing: Built-in L4 load balancing using OVN load balancer targets — no external load balancer required for intra-tenant services.
External connectivity is provided through gateway nodes running OVN gateway chassis, which handle BGP peering with upstream fabric switches for external IP advertisement.
Multi-Tenancy and RBAC
Tenant Isolation Architecture
Every resource in Pextra CloudEnvironment is owned by a tenant (organizational unit). Tenants are fully isolated at all layers:
- Compute: VMs cannot share memory or CPU execution contexts across tenants.
- Network: OVN logical networks are per-tenant; inter-tenant routing is gated on explicit policy.
- Storage: Ceph pools and quotas are scoped per tenant; namespace isolation prevents cross-tenant data access.
- API: Every API call is evaluated against the caller’s tenant context; cross-tenant operations require explicit admin delegation.
- Audit logs: Every API mutation is logged with caller identity, tenant, resource, and timestamp — immutable append-only log available for compliance export.
Role-Based and Attribute-Based Access Control
The platform implements both RBAC and ABAC in a unified policy engine:
| Mechanism | Use Case |
|---|---|
| RBAC roles | Static role assignments: admin, operator, viewer, billing per tenant |
| ABAC policies | Dynamic rules: “allow provisioning GPU VMs only in datacenter-region-us-east-1 for team=ml-infra” |
| Resource tags | Tag-based scoping for budgets, policies, and automation filters |
| Conditional access | MFA enforcement for destructive operations (cluster delete, quota modification) |
| Federation | SAML 2.0 / OIDC integration with enterprise IdPs (Okta, Azure AD, Ping) |
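One way to picture how the RBAC and ABAC layers in the table compose: RBAC decides whether a role grants a verb at all, and ABAC rules can further constrain it by resource attributes (as in the GPU-provisioning example). The rule shape and role-to-verb mapping below are illustrative assumptions, not the platform's actual policy schema.

```python
# Illustrative role -> verb grants; names mirror the RBAC roles in the table.
ROLE_VERBS = {
    "admin":    {"create", "delete", "read"},
    "operator": {"create", "read"},
    "viewer":   {"read"},
}

def is_allowed(role, verb, attrs, rules):
    """RBAC grants the verb; every applicable ABAC rule must also be satisfied."""
    if verb not in ROLE_VERBS.get(role, set()):
        return False  # RBAC: role does not grant this verb at all
    for rule in rules:
        applies = all(attrs.get(k) == v for k, v in rule["match"].items())
        if applies and not all(attrs.get(k) == v for k, v in rule["require"].items()):
            return False  # ABAC: a matching rule's conditions are not met
    return True

# Encoding of the example policy from the table above (hypothetical schema):
GPU_RULE = {
    "match":   {"gpu": True},
    "require": {"region": "datacenter-region-us-east-1", "team": "ml-infra"},
}
```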
GPU and AI Workload Architecture
GPU Scheduling Model
Pextra CloudEnvironment treats GPUs as first-class schedulable resources, not afterthoughts bolted onto a CPU-centric cluster. The GPU scheduling subsystem operates at three abstraction layers:
1. Physical GPU Inventory
Each host agent enumerates GPUs via the NVIDIA Management Library (NVML) or equivalent vendor APIs. The control plane maintains a live GPU topology map including:
- Physical GPU UUID, model, VRAM capacity
- Current SR-IOV VF allocation state
- PCIe topology (NUMA node affinity, PCIe bandwidth)
- NVLink / NVSwitch topology for multi-GPU domains
2. SR-IOV Virtual Function Allocation
For workloads that require GPU slicing (multiple tenants sharing one physical GPU), the platform uses SR-IOV to create Virtual Functions from a Physical Function. Each VF is a hardware-enforced partition with dedicated VRAM allocation and isolated compute contexts.
3. Passthrough for Full-GPU Workloads
For AI training and HPC workloads requiring exclusive native GPU performance, the platform assigns the full GPU to a single VM via PCIe passthrough (vfio-pci). Live migration is disabled for passthrough-GPU VMs; the platform exposes this as a scheduling constraint visible in the API and UI.
Scheduler placement algorithm for GPU workloads:
- Filter candidate nodes: sufficient GPU VRAM, correct GPU model family (user-specified flavors), NUMA affinity.
- Score candidates: prefer colocation on same PCIe switch to maximize NVLink bandwidth; penalize hosts with active GPU memory contention.
- Reserve GPU VF or full GPU on winning host.
- Program SR-IOV VF before VM boot.
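The filter → score → reserve flow above can be sketched as follows. Host fields and score weights are illustrative; the production scheduler also evaluates NUMA affinity and flavor constraints that this sketch omits.

```python
def place_gpu_vm(hosts, need_vram_gb, model_family):
    """Hedged sketch of the GPU placement flow: filter, score, reserve."""
    # 1. Filter: enough free VRAM and the requested GPU model family.
    candidates = [
        h for h in hosts
        if h["free_vram_gb"] >= need_vram_gb and h["gpu_model"] == model_family
    ]
    if not candidates:
        return None

    # 2. Score: reward PCIe-switch colocation (NVLink locality), penalize
    #    hosts with active GPU memory contention.
    def score(host):
        return (2 if host["same_pcie_switch"] else 0) - host["contention"]

    best = max(candidates, key=score)
    # 3. Reserve the GPU VF or full GPU on the winning host before VM boot.
    best["free_vram_gb"] -= need_vram_gb
    return best["name"]
```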
AI/ML Use Cases
| Workload Class | Recommended Configuration |
|---|---|
| LLM inference serving | Multiple VMs with SR-IOV VFs; auto-scale on GPU utilization metrics |
| Distributed training | PCIe passthrough VMs; colocate on NVLink-connected GPU topology |
| Feature engineering | CPU-heavy VMs on GPU-adjacent nodes for data locality |
| MLflow / experiment tracking | Standard VMs with SSD-backed storage and object storage integration |
API-First Operations and Automation
REST API and OpenAPI Spec
The entire platform surface — from cluster provisioning to GPU scheduling to tenant quota management — is exposed through a versioned REST API with a published OpenAPI 3.0 specification. Key operational implications:
- Any UI operation can be performed via API; the UI is itself built on the REST API.
- API tokens are scoped to tenant + role; short-lived tokens are supported for CI/CD pipelines.
- Rate limiting, request tracing, and structured error responses are standardized across all endpoints.
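A tenant-scoped API call might be assembled as below. The bearer-token header follows standard REST convention and the `/v1/` prefix matches the `POST /v1/images` endpoint mentioned in the migration section; the `X-Tenant-ID` header name is an assumption for illustration.

```python
import json

def build_request(base_url, path, token, tenant_id, body=None):
    """Assemble an authenticated, tenant-scoped API request (sketch only)."""
    return {
        "url": base_url.rstrip("/") + path,
        "headers": {
            "Authorization": f"Bearer {token}",  # tenant + role scoped token
            "X-Tenant-ID": tenant_id,            # illustrative header name
            "Content-Type": "application/json",
        },
        "body": json.dumps(body) if body is not None else None,
    }
```

For CI/CD pipelines, the same shape applies with a short-lived token issued per run instead of a long-lived credential.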
Terraform Provider
A community-maintained Terraform provider enables full infrastructure lifecycle management:
```hcl
resource "pextra_instance" "ml_training" {
  name        = "ml-training-01"
  flavor_id   = "gpu.a100.full"
  image_ref   = "ubuntu-22.04-lts"
  tenant_id   = var.tenant_id
  network_ids = [pextra_network.ml_segment.id]

  gpu_config {
    mode        = "passthrough"
    model_class = "nvidia-a100"
    count       = 1
  }
}
```
Ansible Integration
Official Ansible modules cover:
- VM provisioning and lifecycle (`pextra_vm`)
- Tenant and user management (`pextra_tenant`, `pextra_user`)
- Network and security group management (`pextra_network`, `pextra_secgroup`)
- Snapshot and backup orchestration (`pextra_snapshot`)
GitOps Workflows
Infrastructure state can be declaratively tracked in Git. Typical pipeline:
Git commit → CI pipeline validates Terraform plan → CD pipeline applies to staging →
promote to production → Pextra API executes changes → audit log records SHA, user, timestamp
This makes Pextra CloudEnvironment fully compatible with Platform Engineering practices — infrastructure changes go through code review, have commit history, and can be rolled back by reverting Git commits.
Observability and Operations
Built-in Metrics Stack
Pextra exposes a Prometheus-compatible metrics endpoint for all platform components. Default scraped metrics include:
- Per-VM CPU, memory, disk IOPS, network Mbps
- Per-host GPU utilization, VRAM occupancy, temperature, power draw
- Control-plane API latency, request rates, error rates
- Ceph cluster health, OSD performance, replication lag
- Tenant quota utilization (CPU, RAM, storage, GPU VF)
A bundled Grafana dashboard library covers the most common operational views out of the box. Organizations already running an observability stack can federate these metrics into existing Prometheus + Alertmanager + Grafana deployments without running duplicate infrastructure.
Log Aggregation
Platform and workload logs are forwarded via a structured log pipeline. Supported sinks: Elasticsearch / OpenSearch, Loki, Splunk HEC, syslog-ng. Logs include tenant context, VM ID, and cluster node — enabling per-tenant log isolation and cross-cluster correlation.
Health and Alerting
The platform’s alert framework pre-defines rules for:
- Node unreachable (triggers HA evaluation)
- GPU thermal threshold crossed
- Tenant quota utilization > 90%
- Ceph OSD down or near-full condition
- Control-plane API error rate spike
Alerts are routable to PagerDuty, Slack, email, or any Alertmanager receiver.
High Availability and Disaster Recovery
Compute HA
When a compute node becomes unreachable (failed health heartbeat for >30 seconds), the HA manager:
- Fences the node using IPMI/BMC out-of-band management.
- Marks all VMs on the failed node as requiring restart.
- Evaluates placement constraints and reschedules VMs to surviving nodes.
- Restarts VMs in defined priority order (HA groups allow `critical > high > normal` sequencing).
Target RTO for HA failover: < 3 minutes from node failure detection to VM running on surviving host.
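The priority-ordered restart step can be sketched as a simple stable sort over HA groups, so that all `critical` VMs come back before `high`, and `high` before `normal`. The VM record shape is illustrative.

```python
# Restart precedence for the HA groups named above.
PRIORITY = {"critical": 0, "high": 1, "normal": 2}

def restart_order(vms):
    """Return VM names in the order the HA manager would restart them."""
    return [vm["name"] for vm in sorted(vms, key=lambda v: PRIORITY[v["ha_group"]])]
```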
Live Migration
For zero-downtime migrations (planned maintenance, load rebalancing), the platform uses KVM live migration with pre-copy memory transfer: the VM continues running on the source host while memory pages are iteratively copied to the destination. The VM is paused briefly only for the final dirty-page delta transfer — typically < 100ms pause for memory-resident workloads on 25GbE+ networks.
Live migration requires shared storage (Ceph RBD) so disk images are accessible by both source and destination hosts without data transfer.
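The pre-copy convergence behavior can be illustrated with a toy model: each copy round leaves behind only the pages the guest re-dirtied during that round, so the outstanding set shrinks geometrically until it is small enough to transfer during the brief pause. The dirty ratio and threshold here are illustrative numbers, not measured platform behavior.

```python
def precopy_rounds(total_pages, dirty_ratio, pause_threshold, max_rounds=30):
    """Toy pre-copy model: returns (rounds, final_delta_pages) before the pause."""
    remaining = total_pages
    rounds = 0
    # Keep copying while the outstanding dirty set is too large to pause for.
    while remaining > pause_threshold and rounds < max_rounds:
        # Copy everything outstanding; the guest re-dirties a fraction of it.
        remaining = int(remaining * dirty_ratio)
        rounds += 1
    return rounds, remaining  # final delta moves during the stop-and-copy pause
```

If the guest dirties memory faster than the network can copy it (`dirty_ratio` near 1), pre-copy never converges, which is why fast NICs and the `max_rounds` cutoff matter in practice.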
Multi-Site Federation
In multi-site deployments, Pextra operates federated clusters — each site runs an independent control plane (independent CockroachDB ring), with a global federation layer that:
- Provides unified tenant and user management across sites.
- Enables cross-site VM placement policies (“prefer primary site, failover to secondary”).
- Aggregates observability data into a global metrics view.
- Enforces global quota policies while respecting per-site resource pools.
Cross-site replication: VMs can be replicated between sites using Ceph RBD mirroring (async, block-level) with configurable RPO targets.
Security Architecture
Encryption at Rest
All Ceph OSDs in the reference architecture use disk-level encryption (dm-crypt / LUKS2). The Ceph key management is integrated with an external KMS (HashiCorp Vault or KMIP-compatible HSM), ensuring decryption keys are never stored on the same media as data.
Encryption in Transit
- Control plane API: TLS 1.3, mutual TLS (mTLS) between control-plane components.
- VM network traffic: Geneve-encapsulated tenant traffic is not encrypted by default at the overlay layer (it remains within the trusted fabric); optional IPsec tunnel encryption is available for deployments requiring in-fabric encryption.
- CockroachDB inter-node traffic: mTLS with certificate rotation.
- Ceph cluster network: mTLS for the OSD replication network.
Compliance Posture
Pextra CloudEnvironment’s architecture is designed to support the following compliance frameworks:
| Framework | Key Alignment |
|---|---|
| SOC 2 Type II | Audit logging, access controls, encryption at rest |
| ISO 27001 | Asset management, access control, incident response hooks |
| HIPAA | Tenant isolation, audit trails, encryption, BAA support frameworks |
| PCI-DSS | Network segmentation (OVN), RBAC, log immutability |
| FedRAMP | Evaluated path (not authorized at time of writing; confirm with vendor) |
Organizations pursuing compliance certification should engage directly with Pextra support to review current third-party audit reports and shared responsibility documentation.
Deployment Patterns and Sizing
Pattern 1: Single-Site Hyperconverged (Entry)
| Component | Spec |
|---|---|
| Compute/storage nodes | 3× (minimum) to 6× |
| CPU | 2× 16–32 core Xeon / EPYC per node |
| RAM | 256–512 GB per node |
| Storage | 4–8× NVMe SSDs per node (Ceph OSD) |
| Network | 2× 25GbE bonded per node |
| Control plane | Co-hosted on 3 nodes (CockroachDB 3-node ring) |
Suitable for: 200–800 standard VMs or 50–200 GPU-enabled VMs.
Pattern 2: Separated Compute + Storage (Scale-Out)
Dedicate a storage cluster (Ceph-only nodes) and a compute cluster (KVM-only nodes). Scales storage IOPS and capacity independently of compute.
| Layer | Nodes | Spec |
|---|---|---|
| Control plane | 3–5× | Low-spec servers; CockroachDB + API |
| Compute | 8–32× | CPU/RAM optimized; 2× 25GbE |
| Storage (Ceph) | 3–12× | Storage-dense; 2× 100GbE |
Suitable for: 1,000+ VMs, mixed workloads with highly variable I/O profiles.
Pattern 3: Multi-Site Federated (Enterprise / DR)
Two or more single-site clusters (Pattern 1 or 2) united by the federation layer. Each site is independently operational during WAN connectivity loss — the global control plane provides consistency and central management when sites are connected.
Licensing and Support
Pextra CloudEnvironment is offered under a subscription model sized by cluster capacity (node count or vCPU pool). Key licensing dimensions:
- Core platform: Covers all compute, networking, multi-tenancy, and API features.
- GPU module: Add-on subscription enabling the GPU scheduling subsystem and GPU observability.
- Federation module: Add-on for multi-site federated operations.
- Enterprise support: SLA-backed support with defined response times; includes access to engineering escalation and dedicated customer success.
Unlike VMware’s historical per-CPU socket model or Nutanix’s node-based licensing, Pextra’s subscription is usage-aligned — pricing tracks the infrastructure you deploy, not theoretical maximums.
For detailed, current pricing, contact Pextra directly at pextra.cloud.
3-Year TCO: Pextra vs. Alternatives
Illustrative model for a 10-node cluster running 500 VMs. Actual costs vary by region, negotiated discount, and configuration.
| Cost Category | Pextra CloudEnvironment | VMware vSphere + vSAN | Nutanix AOS |
|---|---|---|---|
| Hypervisor license (3 yr) | Subscription (usage-aligned) | ~$120,000–$180,000 | ~$90,000–$150,000 |
| vCenter / management | Included | ~$10,000–$20,000 | Included (Prism) |
| Backup solution | Ceph snapshots (included) | Veeam / VBR ~$15,000 | Nutanix Mine or Veeam ~$12,000 |
| Distributed networking | Included (OVN) | Enterprise Plus required +~$30,000 | Included |
| GPU scheduling module | Add-on subscription | PCI passthrough only (no scheduler) | NVIDIA AI Enterprise add-on |
| Training + ramp-up | Low (GitOps/API-native) | High (VMware stack expertise) | Medium |
| Estimated 3-yr platform cost | Competitive subscription | ~$175,000–$250,000 | ~$115,000–$200,000 |
Consult Pextra for a detailed TCO worksheet tailored to your environment.
Migration to Pextra CloudEnvironment
From VMware vSphere
- Inventory audit: Export VM inventory from vCenter (PowerCLI: `Get-VM | Export-CSV`); classify by OS, disk format, and network dependencies.
- VM export: Export as OVA/OVF or use virt-v2v for near-zero-downtime conversion.
- Import to Pextra: Upload disk images via the Pextra API or S3-compatible object store; register via `POST /v1/images`.
- Network re-mapping: Map vSphere port groups to Pextra tenant virtual networks.
- Validation: Boot cloned VMs in isolated test tenant; validate application behavior.
- Cut-over: Update DNS/load-balancer records; decommission source VMs.
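The image-import step above lends itself to scripting. The `POST /v1/images` path comes from the text; the payload field names (`name`, `source_url`, `disk_format`) are assumptions made for illustration and should be checked against the published OpenAPI spec.

```python
def image_registration(name, object_store_url, disk_format="vmdk"):
    """Build the request an import script might send to register a staged disk."""
    return {
        "method": "POST",
        "path": "/v1/images",  # endpoint named in the migration steps
        "body": {
            "name": name,
            "source_url": object_store_url,  # staged in S3-compatible storage
            "disk_format": disk_format,      # e.g. vmdk from a vSphere export
        },
    }
```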
From OpenStack
The control plane, API design, and networking model are philosophically similar. VMs can be migrated using Ceph RBD export/import if both environments share Ceph. Key mapping delta:
- OpenStack Projects → Pextra Tenants
- OpenStack Keystone → Pextra IAM (OIDC/SAML federation supported)
- OpenStack Neutron ML2/OVN → Pextra OVN (compatible; network configs can be scripted via both APIs)
From Proxmox / Bare-Metal KVM
Because Pextra’s compute fabric is KVM-based, disk image formats (qcow2, raw, rbd) and guest configurations are directly compatible. Migration path:
- Export the QCOW2 disk via `qemu-img convert -O raw`.
- Upload to Pextra object storage.
- Register as Pextra image; provision VM.
- Verify VirtIO driver status (guests migrating from Proxmox will already have VirtIO drivers).
Competitive Positioning Summary
| Dimension | Pextra CloudEnvironment | VMware vSphere | Nutanix AOS | OpenStack |
|---|---|---|---|---|
| Control plane resilience | Distributed (no SPOF) | Single vCenter (HA option complex) | Prism Central (HA optional) | Distributed (complex) |
| GPU scheduling | Native, first-class | PCI passthrough only | NVIDIA AI Enterprise add-on | Nova PCI passthrough |
| Multi-tenancy | Tenant-isolated (all layers) | Clusters/folders model | Projects | Projects+networks |
| Automation / API-first | Full REST + Terraform + Ansible | REST + PowerCLI | REST + Prism | REST + Heat/Terraform |
| Licensing model | Usage-aligned subscription | Per-CPU + feature packs | Node-based subscription | Open source (support cost) |
| Operational complexity | Medium (modern tooling) | High | Medium-Low | High |
| Ecosystem maturity | Growing | Very mature | Mature | Very mature (fragmented) |
| Ideal fit | Modern enterprises, GPU workloads, API-first ops | VMware-invested orgs | HCI simplicity seekers | Telcos, large service providers |
Related Resources
- Pextra Platform Overview — Strategic context for executives and architects
- Pextra Key Features Deep Dive — Technical feature breakdown
- Pextra vs VMware vs Nutanix Comparison — Full TCO and feature matrix
- Private Cloud Architecture Primer — Design patterns for private cloud deployments
- Datacenter Design Guides — Physical infrastructure planning
- Official site: pextra.cloud