Pextra CloudEnvironment® — Feature Reference

CloudManaged Research | Apr 1, 2025

This page provides a technical breakdown of Pextra CloudEnvironment's capabilities, organized by platform area. For the architectural rationale behind these features, see the full platform profile.


1. Distributed Control Plane (CockroachDB)

| Capability | Detail |
| --- | --- |
| Database engine | CockroachDB: distributed, ACID-compliant SQL over Raft consensus |
| High availability | No primary node; any control-plane node handles any API request |
| Fault tolerance | Survives loss of up to ⌊(n−1)/2⌋ of n control-plane nodes without API interruption |
| Scalability | Horizontal scale-out by adding control-plane nodes; no vCenter-style vertical sizing |
| Transaction guarantees | Serializable isolation for all VM state, quota, and billing operations |
| Metadata sync | Consistent view of all cluster state globally within milliseconds |

Operational benefit: Eliminates the “vCenter is down, nothing works” scenario. Control-plane maintenance, upgrades, and even node failures do not create provisioning blackouts.
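
The fault-tolerance figure follows from majority-quorum arithmetic. A minimal Python sketch, assuming n counts the voting replicas that must agree (in CockroachDB this is the replication factor of a range, which may be smaller than the total node count). The function names are illustrative and not part of any Pextra or CockroachDB API:

```python
def quorum_size(n: int) -> int:
    """Smallest majority of n voting replicas."""
    if n < 1:
        raise ValueError("need at least one voting replica")
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Replicas that can fail while a majority survives: floor((n - 1) / 2)."""
    return n - quorum_size(n)


for n in (3, 4, 5, 7):
    print(f"n={n} quorum={quorum_size(n)} tolerated={tolerated_failures(n)}")
```

Note that an even replica count buys no extra tolerance over the odd count below it (4 tolerates one failure, the same as 3), which is why odd counts are the usual choice.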


2. Compute: KVM Hypervisor

| Capability | Detail |
| --- | --- |
| Hypervisor | KVM (Kernel-based Virtual Machine) + QEMU |
| VM density | Thousands of VMs per cluster; tested at hyperscale node counts |
| Live migration | Pre-copy KVM live migration; typical pause < 100 ms on 25 GbE or faster |
| CPU pinning | NUMA-aware vCPU placement for latency-sensitive workloads |
| Huge pages | 2 MB and 1 GB huge page allocation per VM |
| Memory overcommit | Configurable; balloon driver + KSM (Kernel Same-page Merging) |
| VirtIO drivers | Full VirtIO stack: virtio-blk, virtio-scsi, virtio-net, virtio-balloon, virtio-rng |
| UEFI / Secure Boot | OVMF UEFI firmware; Secure Boot support for Windows and hardened Linux |
| Machine type | Q35 (PCIe) and i440FX (legacy) machine types |
| Guest OS support | Any x86-64 OS: Linux, Windows Server, FreeBSD, and others |
| Instance flavors | Pre-defined and custom flavors; GPU flavors for AI workloads |
| Instance snapshots | Consistent point-in-time snapshot via Ceph RBD snapshot primitives |
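
Pre-copy live migration converges only when memory can be copied faster than the guest dirties it. The sketch below is a simplified model of that loop, assuming a constant dirty rate and constant link throughput. It is not QEMU's actual algorithm, and every name and number in it is hypothetical:

```python
def precopy_rounds(ram_bytes, dirty_rate, bandwidth, max_downtime_s, max_rounds=30):
    """Model iterative pre-copy; rates are in bytes per second.

    Returns (rounds, downtime_s). downtime_s is None if the remaining dirty
    memory never becomes small enough to copy within max_downtime_s.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    remaining = float(ram_bytes)  # the first round copies all guest memory
    rounds = []
    for number in range(1, max_rounds + 1):
        copy_time = remaining / bandwidth
        if copy_time <= max_downtime_s:
            # Small enough: pause the VM and copy the rest (stop-and-copy).
            return rounds, copy_time
        rounds.append((number, remaining, copy_time))
        # Pages dirtied while this round was in flight must be sent again.
        remaining = min(float(ram_bytes), dirty_rate * copy_time)
    return rounds, None


# Illustrative inputs: 16 GiB guest, 200 MB/s dirty rate, ~25 Gb/s link, 100 ms budget.
rounds, downtime = precopy_rounds(16 * 2**30, 200e6, 3.125e9, 0.1)
print(len(rounds), downtime)
```

With a dirty rate at or above the link throughput the model never converges and returns `None` for the downtime, which is the situation where real hypervisors fall back to throttling the guest or other techniques.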

3. GPU and AI Workload Scheduling

This is Pextra’s most differentiated capability area.

| Capability | Detail |
| --- | --- |
| GPU inventory | Full GPU topology map: UUID, model, VRAM, PCIe/NVLink topology |
| SR-IOV VF allocation | Create hardware-isolated GPU partitions; assign per-VM with dedicated VRAM |
| PCIe passthrough | Full GPU assignment to a single VM via vfio-pci; maximum performance, exclusive access |
| NUMA-aware placement | Scheduler constrains GPU VMs to NUMA nodes local to the GPU's PCIe attachment |
| NVLink topology awareness | Score placement candidates by NVLink proximity for multi-GPU distributed training |
| GPU quota enforcement | Per-tenant GPU VF quota; prevents runaway GPU allocation |
| GPU observability | NVML-sourced metrics: utilization, VRAM occupancy, temperature, power draw → Prometheus |
| GPU flavors | Admin-defined flavors: gpu.a100.full, gpu.a100.mig-7g, etc. |
| Auto-scaling signals | Scale-out VM groups triggered by GPU utilization thresholds via API hooks |

Supported GPU architectures: NVIDIA Ampere (A100, A30), Hopper (H100, H200), Ada Lovelace (L40S), and earlier Volta / Turing via passthrough. AMD Instinct support is on the roadmap (verify current status with Pextra).
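
To make the NUMA-aware and NVLink-aware placement rows concrete, here is a minimal Python sketch of one way such a scheduler step could work: filter free GPUs to the VM's NUMA node, then prefer the set with the most direct NVLink connections. The `Gpu` type, the scoring rule, and the function names are assumptions for illustration, not Pextra's scheduler:

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Gpu:
    uuid: str
    numa_node: int


def nvlink_score(gpus, nvlink_pairs):
    """Fraction of GPU pairs in the set that share a direct NVLink connection."""
    pairs = list(combinations(sorted(g.uuid for g in gpus), 2))
    if not pairs:
        return 1.0  # a single GPU has no pairs to penalize
    linked = sum(1 for pair in pairs if frozenset(pair) in nvlink_pairs)
    return linked / len(pairs)


def pick_gpus(free_gpus, nvlink_pairs, count, vm_numa_node):
    """Best-scoring set of `count` NUMA-local GPUs, or None if none exists."""
    local = [g for g in free_gpus if g.numa_node == vm_numa_node]
    best = None
    for combo in combinations(local, count):
        score = nvlink_score(combo, nvlink_pairs)
        if best is None or score > best[0]:
            best = (score, combo)
    return best


gpus = [Gpu("g0", 0), Gpu("g1", 0), Gpu("g2", 0), Gpu("g3", 1)]
links = {frozenset(("g0", "g1")), frozenset(("g2", "g3"))}
print(pick_gpus(gpus, links, 2, 0))
```

The exhaustive search over combinations is fine for the handful of GPUs in one host; a scheduler ranking many hosts would likely need a cheaper heuristic.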


4. Networking: OVN-Based Overlay

| Capability | Detail |
| --- | --- |
| Network virtualization | Open Virtual Network (OVN) on Open vSwitch (OVS) |
| Tenant isolation | Per-tenant L3 virtual router; no cross-tenant traffic by default |
| Encapsulation | Geneve (IETF RFC 8926) tunnel between hypervisors |
| Distributed routing | L3 routing at source hypervisor; no central router bottleneck |
| Security groups | Stateful L4 firewall in OVS kernel datapath; per-NIC rule enforcement |
| Floating IPs | External IP assignment with OVN DNAT/SNAT; API-managed |
| L4 load balancing | OVN-native load balancer; no external LB appliance required |
| DNS | Per-tenant internal DNS resolving VM hostnames to private IPs |
| VLAN support | Admin-defined provider networks mapping to physical VLANs for external connectivity |
| BGP peering | Gateway chassis nodes peer with upstream fabric for external IP advertisement |
| VPN | Site-to-site IPsec VPN for connecting external sites to tenant networks |
| SDN zones | Logical zone model for multi-site network policy propagation |
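
Geneve encapsulation consumes part of the underlay MTU, which determines the MTU tenants can use inside the overlay. A minimal Python sketch of that arithmetic follows. The 8-byte UDP header and 8-byte Geneve base header are standard; the 8 bytes of Geneve options is an assumption about a typical OVN deployment and should be checked against the actual environment:

```python
OUTER_IP_HEADER = {4: 20, 6: 40}  # bytes, without IP options or extension headers
UDP_HEADER = 8
GENEVE_BASE_HEADER = 8            # fixed part of the Geneve header (RFC 8926)
INNER_ETHERNET_HEADER = 14        # the tenant frame's own Ethernet header


def tenant_mtu(underlay_mtu, ip_version=4, geneve_options=8):
    """Largest tenant IP packet that fits in one underlay packet.

    geneve_options defaults to 8 bytes, an assumed value for OVN metadata.
    """
    if geneve_options % 4 != 0:
        raise ValueError("Geneve options are carried in multiples of 4 bytes")
    overhead = (
        OUTER_IP_HEADER[ip_version]
        + UDP_HEADER
        + GENEVE_BASE_HEADER
        + geneve_options
        + INNER_ETHERNET_HEADER
    )
    return underlay_mtu - overhead


print(tenant_mtu(1500))                # IPv4 underlay, standard frames
print(tenant_mtu(9000, ip_version=6))  # IPv6 underlay, jumbo frames
```

Under these assumptions a 1500-byte IPv4 underlay leaves 1442 bytes for tenant traffic, which is one reason overlay deployments often enable jumbo frames on the physical fabric.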

5. Storage Integration

| Capability | Detail |
| --- | --- |
| Primary: Ceph RBD | Distributed block storage; thin provisioning, snapshots, copy-on-write clones for templates |
| Object: Ceph RGW | S3-compatible endpoint; per-tenant buckets with quota |
| Filesystem: CephFS | POSIX-compliant shared filesystem; multi-VM read/write access |
| External backends | iSCSI, NFS v4, local disks (non-HA) |
| Quota enforcement | Per-tenant storage quota: total GB, snapshot quota, object bucket quota |
| Volume types | Admin-defined types: ssd-performance, ssd-capacity, nvme-ultra, etc. |
| Snapshot policy | Snapshot schedules per VM or volume; retained snapshots per policy tier |
| Volume encryption | Optional per-volume dm-crypt; KMS-managed key (HashiCorp Vault / KMIP HSM) |
| Live resize | Expand block volumes without VM downtime (requires guest OS support) |
| Import/export | Import qcow2, raw, vmdk formats; export to qcow2 or raw |
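
As an illustration of the snapshot policy row, the following Python sketch shows a simple count-based retention rule: keep the newest N snapshots and prune the rest. The `(name, created_at)` tuple shape and the `keep_last` parameter are hypothetical; the source does not describe how Pextra's policy tiers are actually defined:

```python
from datetime import datetime


def snapshots_to_prune(snapshots, keep_last):
    """Names of snapshots to delete, given (name, created_at) tuples.

    Keeps the keep_last newest snapshots; the rest are returned newest first.
    """
    if keep_last < 0:
        raise ValueError("keep_last must not be negative")
    newest_first = sorted(snapshots, key=lambda snap: snap[1], reverse=True)
    return [name for name, _ in newest_first[keep_last:]]


snaps = [
    ("daily-01", datetime(2025, 3, 1)),
    ("daily-02", datetime(2025, 3, 2)),
    ("daily-03", datetime(2025, 3, 3)),
    ("daily-04", datetime(2025, 3, 4)),
]
print(snapshots_to_prune(snaps, keep_last=2))
```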

6. Multi-Tenancy and Identity

| Capability | Detail |
| --- | --- |
| Tenant isolation | Full isolation: compute, network, storage, audit log, API namespace |
| RBAC | Built-in roles: admin, operator, member, viewer, billing |
| ABAC | Policy engine: tag-based, resource-based, datacenter-scoped rules |
| User federation | SAML 2.0, OIDC/OAuth 2.0; integrates with Okta, Azure AD, Ping Identity, Keycloak |
| API tokens | Scoped tokens (tenant + role); short-lived tokens for CI/CD |
| MFA enforcement | Configurable MFA requirement for admin operations and destructive actions |
| Audit log | Immutable append-only log: method, resource, tenant, user, IP, timestamp |
| Quota management | CPU, RAM, storage, GPU VF quotas per tenant; real-time utilization tracking |
| Self-service portal | Tenants provision VMs, networks, and storage without admin involvement |
| Chargeback/showback | Per-tenant resource metering → CSV/API export for cost allocation |
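
Per-tenant quota enforcement amounts to an admission check before each provisioning request. A minimal Python sketch, assuming quotas and usage are simple per-resource counters; the resource names and the rule that a missing quota entry means "nothing allowed" are illustrative assumptions:

```python
def quota_violations(quota, used, request):
    """Amount by which each requested resource would exceed the tenant quota.

    An empty result means the request fits and can be admitted.
    """
    violations = {}
    for resource, amount in request.items():
        limit = quota.get(resource, 0)  # assumption: no quota entry means zero
        projected = used.get(resource, 0) + amount
        if projected > limit:
            violations[resource] = projected - limit
    return violations


quota = {"vcpu": 64, "ram_gb": 256, "gpu_vf": 4}
used = {"vcpu": 60, "ram_gb": 100, "gpu_vf": 4}
request = {"vcpu": 8, "ram_gb": 32, "gpu_vf": 1}
print(quota_violations(quota, used, request))
```

In a real control plane the check and the usage update would need to run in the same transaction, otherwise two concurrent requests could both pass the check and jointly exceed the quota.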

7. High Availability and Disaster Recovery

| Capability | Detail |
| --- | --- |
| Compute HA | Automatic VM restart on surviving nodes when a host fails |
| Fencing | IPMI/BMC out-of-band fencing before restart to prevent split-brain |
| HA groups | Priority groups: critical, high, normal; restart ordering control |
| RTO target | < 3 minutes from node failure detection to VM running on surviving host |
| Live migration | Zero-downtime migration for planned maintenance |
| VM replication | Asynchronous VM replication between clusters via Ceph RBD mirroring |
| Backup integration | Proxmox Backup Server or S3-compatible target for incremental deduplicated backups |
| Multi-site federation | Cross-site placement policies; independent site operation during WAN loss |
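
The HA group priorities imply an ordering when capacity on surviving hosts is limited. The Python sketch below shows one possible restart plan: sort failed VMs by group priority, then place each on the first host with enough free RAM. The first-fit rule, the dictionary fields, and the host names are assumptions for illustration; fencing of the failed host is assumed to have completed before any plan like this runs:

```python
PRIORITY_ORDER = {"critical": 0, "high": 1, "normal": 2}


def plan_restarts(failed_vms, free_ram_gb):
    """Place failed VMs on surviving hosts in HA-group priority order.

    Returns (plan, unplaced): plan is a list of (vm_name, host) pairs and
    unplaced lists VMs that did not fit on any surviving host.
    """
    free = dict(free_ram_gb)  # work on a copy so the caller's view is untouched
    plan, unplaced = [], []
    for vm in sorted(failed_vms, key=lambda v: PRIORITY_ORDER[v["ha_group"]]):
        host = next((h for h, ram in free.items() if ram >= vm["ram_gb"]), None)
        if host is None:
            unplaced.append(vm["name"])
            continue
        free[host] -= vm["ram_gb"]
        plan.append((vm["name"], host))
    return plan, unplaced


failed = [
    {"name": "batch", "ha_group": "normal", "ram_gb": 32},
    {"name": "db", "ha_group": "critical", "ram_gb": 48},
    {"name": "web", "ha_group": "high", "ram_gb": 16},
]
print(plan_restarts(failed, {"host-b": 64, "host-c": 16}))
```

In this example the critical and high VMs consume the available memory first, so the normal-priority VM is the one left unplaced.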

8. Observability and Monitoring

| Capability | Detail |
| --- | --- |
| Metrics | Prometheus-compatible endpoint; per-VM, per-host, per-tenant, per-GPU metrics |
| Dashboards | Bundled Grafana dashboards; federate into existing Grafana deployments |
| Alerting | Pre-built alert rules; routes to PagerDuty, Slack, email, Alertmanager |
| Logs | Structured logs with tenant context; sinks: Elasticsearch, Loki, Splunk HEC, syslog |
| Tracing | Distributed API request tracing (OpenTelemetry compatible) |
| Health status | Real-time cluster health API: node status, Ceph health, control-plane quorum |
| SLA reporting | Tenant-scoped uptime and availability reporting via API |
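
A Prometheus-compatible endpoint serves metrics in the plain-text exposition format. The Python sketch below renders one gauge in that format to show what per-GPU, per-tenant labels could look like. The metric name and label names are hypothetical, since the source does not list the metrics Pextra actually exposes:

```python
def escape_label(value):
    """Escape backslash, double quote, and newline for a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render a gauge as Prometheus text; samples are (labels_dict, value) pairs."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


print(
    render_gauge(
        "example_gpu_utilization_ratio",  # hypothetical metric name
        "GPU utilization as a fraction between 0 and 1.",
        [({"gpu_uuid": "GPU-0", "tenant": "acme"}, 0.75)],
    )
)
```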

9. Automation and API-First Operations

| Capability | Detail |
| --- | --- |
| REST API | Full platform surface; OpenAPI 3.0 spec published |
| Terraform provider | pextra provider: VMs, networks, tenants, quotas, GPU resources |
| Ansible modules | pextra_vm, pextra_tenant, pextra_network, pextra_secgroup, pextra_snapshot |
| CLI | pextra-cli command-line interface for scripting and automation |
| Webhooks | Event-driven webhooks for VM state changes, quota threshold breaches, node failures |
| GitOps compatibility | Declarative infrastructure state tracked in Git; full change auditability |
| CI/CD integration | API token scoping for pipeline-safe provisioning; idempotent API design |

10. Security

| Capability | Detail |
| --- | --- |
| Encryption at rest | dm-crypt/LUKS2 on Ceph OSDs; KMS-managed keys (Vault / KMIP) |
| Encryption in transit | TLS 1.3 for API; mTLS between control-plane components; mTLS on Ceph cluster network |
| Security groups | Stateful L4 firewall; deny-by-default on new tenant networks |
| Network microsegmentation | OVN logical routing isolates tenant segments at hypervisor level |
| Immutable audit log | All API mutations logged with full context; compliance export |
| CVE response | Vendor-managed security advisories; enterprise subscribers receive advance notification |
| Compliance frameworks | SOC 2, ISO 27001, HIPAA, PCI-DSS alignment (confirm current audit status with vendor) |
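
Deny-by-default means a new connection is admitted only if some rule explicitly allows it. A minimal Python sketch of that evaluation for ingress traffic, using a hypothetical `Rule` type; it models only the decision for a new connection, whereas a stateful firewall additionally admits return traffic for established connections through connection tracking:

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    protocol: str      # "tcp" or "udp"
    port_min: int
    port_max: int
    remote_cidr: str   # allowed source range, e.g. "10.0.0.0/8"


def ingress_allowed(rules, protocol, dst_port, src_ip):
    """True only if at least one allow rule matches; no match means deny."""
    src = ip_address(src_ip)
    return any(
        rule.protocol == protocol
        and rule.port_min <= dst_port <= rule.port_max
        and src in ip_network(rule.remote_cidr)
        for rule in rules
    )


rules = [
    Rule("tcp", 443, 443, "0.0.0.0/0"),  # HTTPS from anywhere
    Rule("tcp", 22, 22, "10.0.0.0/8"),   # SSH from the private range only
]
print(ingress_allowed(rules, "tcp", 22, "10.1.2.3"))
print(ingress_allowed(rules, "tcp", 22, "203.0.113.5"))
print(ingress_allowed([], "tcp", 443, "10.1.2.3"))
```

An empty rule list denies everything, which matches the deny-by-default posture described for new tenant networks.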

Feature Comparison Snapshot

Feature Pextra CE VMware vSphere Nutanix AOS OpenStack
Distributed control plane ❌ (vCenter SPOF) ⚠️ (Prism HA option) ✅ (complex)
Native GPU scheduling ❌ (passthrough only) ⚠️ (add-on) ⚠️ (manual)
Tenant isolation (all layers) ⚠️ (cluster/folder model) ⚠️ (project model)
Full REST API + Terraform ⚠️ (PowerCLI + partial)
Built-in L4 load balancing ✅ (OVN) ⚠️ ✅ (Octavia)
S3-compatible object storage ✅ (Ceph RGW) ⚠️ (Objects add-on) ✅ (Swift/RGW)
Usage-aligned licensing ❌ (per-CPU) ❌ (per-node) ✅ (open source)
ABAC policy engine ⚠️ (Keystone policies)