Omnia Versions

An open-source toolkit for deploying and managing high performance clusters for HPC, AI, and data analytics workloads.

v1.5.1

1 month ago

This patch release makes the following changes:

  • Installation of Kubernetes 1.16 and 1.19 is deprecated.

  • Spark Operator support is deprecated.

  • Omnia now installs Kubernetes 1.26.

v1.4.3.1

6 months ago

This release is focused on supporting the following features:

  • Hardware Support: Intel E810 NIC, ConnectX-5/6 NICs.

  • The Omnia GitHub repository now hosts a “genesis” image with this functionality baked in for initial bootup.

  • Host aliasing for Scheduler and IPA authentication.

  • Login and Manager Node access from both public and private NIC.

  • Validation check enhancements:

    • Rearranged to occur as early as possible.

    • Isolated when running smaller playbooks.

  • Added a Benchmark Install Guide: OneAPI for Intel, MPI AOCC HPL for AMD.

v1.5

7 months ago

This release is focused on supporting the following features:

v1.4.3

9 months ago

This release is focused on supporting the following features:

  • XE9640, R760xa, and R760xd2 are now supported as control planes or target nodes with NVIDIA H100 accelerators.
  • Added ability for split port configuration on NVIDIA Quantum-2-based QM9700 (NVIDIA InfiniBand NDR400 switches).
  • Extended passwordless SSH support to configure multiple users in a single execution.
  • Input mapping files and inventory files now support commented entries for customized playbook execution.
  • NFS share is now available for hosting user home directories within the cluster.
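
As a rough illustration of the commented-entry support mentioned above, the following Python sketch filters commented lines out of a mapping-style CSV before parsing it. The column names (`MAC`, `Hostname`, `IP`), the `#` comment prefix, and the function name `load_mapping_entries` are assumptions made for this example; they are not taken from Omnia's code or documentation.

```python
import csv
import io


def load_mapping_entries(text, comment_prefix="#"):
    """Return mapping rows as dicts, skipping blank and commented lines.

    The comment prefix and column layout are assumptions for illustration.
    """
    lines = [
        line
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith(comment_prefix)
    ]
    return list(csv.DictReader(io.StringIO("\n".join(lines))))


# Hypothetical mapping file: the second node is commented out,
# so it is excluded from this run.
sample = """MAC,Hostname,IP
aa:bb:cc:dd:ee:01,node001,10.5.0.101
#aa:bb:cc:dd:ee:02,node002,10.5.0.102
aa:bb:cc:dd:ee:03,node003,10.5.0.103
"""

entries = load_mapping_entries(sample)
print([row["Hostname"] for row in entries])
```

Commenting out an entry in this way would let a node stay in the file for later use while being left out of a given playbook run.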

v1.4.2

1 year ago

This release is focused on supporting the following features:

  • XE9680, R760, R7625, R6615, and R7615 are now supported as control planes or target nodes.
  • Added ability for switch-based discovery of remote servers and PXE provisioning.
  • An active Red Hat subscription is no longer required on the control plane and the compute nodes. Users can configure and use local RHEL repositories.
  • IP ranges can be defined for assignment to remote nodes when discovered via the switch.
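
To make the idea of range-based IP assignment concrete, here is a minimal Python sketch that hands out addresses from a start/end range to a list of discovered nodes. It uses only the standard `ipaddress` module; the function name `assign_ips`, the hostnames, and the addresses are hypothetical, and this is not Omnia's actual discovery or assignment logic.

```python
import ipaddress


def assign_ips(hostnames, start, end):
    """Assign consecutive IPv4 addresses from [start, end] to hostnames.

    Raises ValueError if the range is inverted or too small.
    """
    first = ipaddress.IPv4Address(start)
    last = ipaddress.IPv4Address(end)
    if last < first:
        raise ValueError("end of range precedes start")
    available = int(last) - int(first) + 1
    if len(hostnames) > available:
        raise ValueError("not enough addresses in range")
    # IPv4Address supports integer addition, which yields the next address.
    return {name: str(first + i) for i, name in enumerate(hostnames)}


# Hypothetical nodes and range, for illustration only.
assignments = assign_ips(
    ["node001", "node002", "node003"], "10.5.0.10", "10.5.0.20"
)
print(assignments)
```

Checking the range size up front means a shortfall is reported before any node receives an address.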

v1.4.1

1 year ago

This release is focused on supporting the following features:

  • R660, R6625 and C6620 platforms are now supported as control planes or target nodes.
  • One touch provisioning now allows for OFED installation, NVIDIA CUDA-toolkit installation along with iDRAC and InfiniBand IP configuration on target nodes.
  • Potential servers can now be discovered via iDRAC.
  • Servers can be provisioned automatically without manual intervention for booting/PXE settings.
  • Target node provisioning status can now be checked on the control plane by viewing the OmniaDB.
  • Omnia clusters can be configured with passwordless SSH for seamless execution of HPC jobs run by non-root users.
  • Accelerator drivers can be installed on Rocky target nodes in addition to RHEL.

v1.4.0.1

1 year ago

Bugfix patch release that addresses the broken Singularity install issue.

v1.4

1 year ago

Omnia has been enhanced to offer:

Control Plane

  • Omnia Prerequisites Installation

  • Provision Tool - xCAT installation

  • Node Discovery using Switch IP Address

  • Provisioning of remote nodes using:

    • Mapping File

    • Auto discovery of nodes from Switch IP
    
  • Database update of remote node information, including:

    • Host or Admin IP

    • iDRAC IP

    • InfiniBand IP

    • Hostname

  • Inventory creation on Control Plane

Cluster

  • iDRAC and InfiniBand IP Assignment on remote nodes (nodes in the cluster)

  • Installation and Configuration of:

    • NVIDIA Accelerator and CUDA Toolkit

    • AMD Accelerator and ROCm

    • OFED

    • LDAP Client

Device Support

  • InfiniBand Switch Configuration with port split functionality

  • Ethernet Z-Series Switch Configuration with port split functionality

v1.3.1

1 year ago

Bugfix patch release that addresses image retries and pod timeout.

v1.3

1 year ago

Omnia has been enhanced to offer:

  • CLI support for all Omnia playbooks (AWX GUI is now optional/deprecated).
  • Automated discovery and configuration of all devices (including PowerVault, InfiniBand, and Ethernet switches) in shared LOM configuration.
  • Job based user access with Slurm.
  • AMD server support (R6415, R7415, R7425, R6515, R6525, R7515, R7525, C6525).
  • PowerVault ME5 series support (ME5012, ME5024, ME5084).
  • PowerVault ME4 and ME5 SAS Controller configuration and NFS server, client configuration.
  • NFS bolt-on support.
  • BeeGFS bolt-on support.
  • Lua and Lmod installation on manager and compute nodes running Red Hat 8.x, Rocky 8.x and Leap 15.3.
  • Automated setup of FreeIPA client on all nodes.
  • Automated configuration of PXE device settings (active NIC) on iDRAC.