Geopm Versions

Global Extensible Open Power Manager

v3.0.1

4 months ago
  • Hotfix for v3.0.0 release.
  • Fix missing systemd dependency on the msr-safe systemd service. This bug could cause MSRs to be unavailable from the GEOPM Service if load order is incorrect.
  • Fix systemd unit definition to maintain same model for GPUs/chip topology when linked against versions of libze_loader.so where "COMPOSITE" is not the default.
  • Fix security issue where UID 0 was being used to indicate privilege; capability checks now use libcap instead.
  • Fix bug in startup that was causing long delays when initializing the batch interface of PlatformIO.
  • Fix potential lock when creating PlatformTopo object as user with CAP_SYS_ADMIN.
  • Fix several build and packaging issues that could cause problems when dependency packages are not installed to standard locations.
  • Fix "make coverage" build target dependency
  • Fix issue with sphinx documentation generation
  • Fix regression in support for client Intel platforms.
  • Fix install failures on some SLES systems by modifying helper install script to prefer the zypper command to the rpm command.
  • Add documentation for non-MPI application integration test for GEOPM Runtime.
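The capability-based privilege check mentioned above can be illustrated with a minimal sketch. This is not the GEOPM implementation, which the notes say uses libcap. It is an assumed Python equivalent that reads the effective capability mask Linux exposes in /proc/self/status. The function names are hypothetical; the bit index 21 for CAP_SYS_ADMIN comes from the Linux capability header.

```python
# Bit index of CAP_SYS_ADMIN as defined in <linux/capability.h>.
CAP_SYS_ADMIN = 21


def has_capability(cap_eff_hex, cap_bit):
    """Return True if bit cap_bit is set in a CapEff-style hex mask."""
    return bool((int(cap_eff_hex, 16) >> cap_bit) & 1)


def effective_caps(status_path="/proc/self/status"):
    """Read the effective capability mask of the current process (Linux only)."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff not found in " + status_path)


# A process with only CAP_SYS_ADMIN set is privileged for this check
# regardless of its UID; an empty mask is not.
print(has_capability("0000000000200000", CAP_SYS_ADMIN))
print(has_capability("0000000000000000", CAP_SYS_ADMIN))
```

On a Linux host, `has_capability(effective_caps(), CAP_SYS_ADMIN)` tests the running process. The point of the design is that a UID of 0 and the held capabilities are independent, so checking the capability directly avoids treating every root process as privileged.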

v3.0.0

6 months ago
  • Official v3.0.0 release tag.
  • GEOPM Runtime support for non-MPI applications.
  • Integration with OpenPBS through plugins and launcher support.
  • Security improvements and bug fixes.
  • Additional GEOPM Service DBus APIs to support application profiling.
  • Communication between controller and application is managed by GEOPM Service.
  • Creation of topo-cache and responsibility for determining system topology is managed by GEOPM Service.
  • Update C++ standard requirement to C++17.
  • Add more signals and controls including GPU and platform features.
  • ConstConfigIOGroup uses JSON file to define constant settings/configurations as signals.
  • Increase the sample period of the monitor agent from 5 ms to 200 ms to reduce default CPU requirements of runtime.
  • Add Sapphire Rapids server (SPR) as a supported platform.
  • Removal of libgeopmpolicy.so, use libgeopm.so instead.
  • Removal of geopmdpy.runtime module: no support for python based agents.
  • GEOPM_PERIOD / --geopm-period sets the sample period for controller in units of seconds.
  • GEOPM_INIT_CONTROL / --geopm-init-control to write a batch of controls at application startup.
  • GEOPM_CTL_LOCAL / --geopm-ctl-local disables controller's use of MPI.
  • GEOPM_PROGRAM_FILTER / --geopm-program-filter to select processes for profiling.
  • GEOPM_NUM_PROC sets number of processes per node for controller process to track.
  • geopmlaunch support for PALS.
  • geopmlaunch --geopm-preload option required for ld preloading libgeopm.so, not on by default.
  • Default for --geopm-ctl is now "application".
  • geopmlaunch does not control application CPU affinity by default (--geopm-affinity-enable now required).
  • Debian / Ubuntu packaging support.
  • Renamed runtime packages for all distros.
  • Improvements for NVML and LevelZero support for GPUs.
  • Documentation improvements including "Quick Start Guide"
  • Improved error and warning messages.
  • ABI so-version for libgeopm and libgeopmd increased to 2.0.0.
  • Added --direct option for geopmaccess.
  • Add GPU-CA agent for beta testing.
  • Add FFNet agent for beta testing.
  • Add CPU-CA agent for beta testing.
  • FrequencyMapAgent can now control GPU frequency.
  • Configuration and plugin directories for GEOPM renamed and combined.
  • Add PBS integration for power capping clusters.
  • Fuzz test integration and support for sanitizer builds.
  • The environment of controller determines output file paths, not the application environment.
  • Support for liburing for batching kernel I/O.
  • Python interface for endpoint in beta.
  • Program name is no longer the default profile name, "default" is used instead.
  • Track time spent in MPI_Init*() by the application.
  • Removed nearly all use of the /tmp directory (topo-cache still created in /tmp if GEOPM Service is not running)
  • More detailed and accurate reporting of GEOPM overhead, MPI overhead, and controller startup time.
  • Generic runner for GEOPM experiment infrastructure.
  • MSR, NVML and LevelZero IOGroups not loaded except when user has CAP_SYS_ADMIN or through the GEOPM Service.
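Several of the items above introduce environment variables for the runtime. As a rough sketch, the following shows how a wrapper script might assemble them for a child process. The helper name, the example program name, and the exact value formats are assumptions for illustration; only the variable names and the fact that GEOPM_PERIOD is expressed in seconds come from the notes above.

```python
import os


def geopm_runtime_env(period_s, program_filter, num_proc, base_env=None):
    """Return a copy of base_env with GEOPM runtime variables set.

    Hypothetical helper: the value formats used here are assumptions.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GEOPM_PERIOD"] = str(period_s)           # sample period in seconds
    env["GEOPM_PROGRAM_FILTER"] = program_filter  # processes selected for profiling
    env["GEOPM_NUM_PROC"] = str(num_proc)         # processes per node to track
    return env


# 200 ms, the new monitor agent sample period, expressed in seconds.
env = geopm_runtime_env(0.2, "my_app", 4, base_env={})
print(env)
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run` when starting the application.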

v2.0.2

1 year ago
  • Hot fix 2 for release 2.0.
  • Add security.md doc for vulnerability reporting.
  • Align behavior of secure_make_dirs() to documentation w.r.t. intermediate directories.
  • Includes bug fixes and documentation improvements.
  • Fix constness of return value from dcgm_device_pool().
  • Fix warning from recent gcc about uninitialized variables.
  • Use PALSLauncher on australis.
  • PALSLauncher: use list option to cpu-bind
  • Fix for suppressed error reporting.
  • Fix for SST kernel driver on SLES 15.3.
  • Fix for issue where missing data can cause Controller crash.
  • Update copyright year to 2023.
  • Fix LevelZero exception location.
  • Fix error when GPUs are supported by service but not client.
  • Swap load order of msr and service iogroups.
  • Resolve service integration test issues.

v2.0.1

1 year ago
  • Hot fix 1 for release 2.0.
  • Includes bug fixes and documentation improvements.
  • Fix install and packaging of plugin directory (#2823).
  • Fixes for IMPI mpiexec launch wrapper (#2822, #2820)
  • Fix issues discovered with recent Clang and in the Ubuntu 22 environment (#2829, #2740).
  • Better error reporting from geopmd signal handler (#2789).
  • Fix for supporting LevelZero when MPI also initializes LevelZero (#2802).
  • Better error reporting when application handshake fails (#2801).
  • Use multi-user.target in systemd unit file rather than default.
  • Fix overwrite of access list with --force option (#2712).
  • Use control access list to generate signal list (#2707).
  • Fix spelling errors in documentation (#2644).
  • Support for recent LevelZero implementations which require user to zero call by reference parameters.
  • Better error reporting with LevelZero topology failures.
  • Update spec file to make LevelZero inclusion parameterized and suggestions from SUSE maintainers.
  • Enable CNLIOGroup by default.
  • Fix potential memory issue with CircularBuffer (not exposed by current implementation).
  • Use more robust method to obtain sticker frequency.
  • Use SKX MSR definitions for newer architectures.

v2.0.0

1 year ago
  • Official v2.0.0 release tag.
  • Provides the GEOPM Systemd Service.
  • Removes Python 2 support, only supporting Python 3.
  • Support for GPUs from Intel and NVIDIA.
  • Support for the isst_interface driver.
  • Support for new server processors including Sky Lake, Cascade Lake and Ice Lake.
  • Support for Cray Linux energy counters.
  • Higher performance / lower latency profile interface.
  • More consistent naming scheme for PlatformIO signals and controls.
  • Extended set of signals and controls provided by PlatformIO.
  • Removed msr-safe requirement through GEOPM Service features.
  • Support for new HPC runtime launchers (pals, impi).
  • Flexible YAML report generation and parsing that may contain arbitrary content.
  • Extended python interface support including Reporter features.
  • Python based agents for prototyping runtime algorithms that do not require application feedback.
  • Removed Energy Efficient Agent (will be replaced in a future release).
  • Documentation and web page improvements.
  • Other improvements and feature additions.

v2.0.0+rc3

1 year ago
  • Release candidate 3 for version 2.0
  • This is a pre-release version of GEOPM that has all features that will be present in the v2.0.0 release.
  • No changes other than documentation and possible bug fixes are expected prior to v2.0.0.
  • This represents a code freeze and version 2.0 is anticipated soon after this release.
  • All feedback about this release candidate is appreciated: https://geopm.github.io/contrib.html

v2.0.0+rc2

1 year ago
  • Release candidate 2 for version 2.0
  • This is a pre-release version of GEOPM that has all features that will be present in the v2.0.0 release.
  • The names of signals and controls provided by the PIO interface have changed for rc2 as described here: https://github.com/geopm/geopm/issues/1671
  • Chapter 7 man page documentation has been added for the PlatformIO interface and supported signals and controls.
  • Other changes required for version 2.0 have also been made.
  • All feedback about this release candidate is appreciated: https://geopm.github.io/contrib.html

v2.0.0+rc1

1 year ago

v1.1.0

4 years ago
  • Tue Nov 5 2019 Diana Guttman v1.1.0
  • Release overview:
    • Support for Python 3.6 has been added.
    • Support for Python 2.7 continues but will be removed in a future release.
    • New features targeting integration with resource managers.
    • Enhancements to EnergyEfficientAgent.
    • Improved support for automatic OpenMP region detection.
    • Support for launching with OpenMPI.
    • Bug fixes, new and updated tests, and updates to documentation.
  • New features:
    • GEOPM environment variables can now be initialized from a JSON file.
    • Add geopm_agent_enforce_policy() function and Agent::enforce_policy() to public interface.
    • Add tracing for the profile table log with GEOPM_TRACE_PROFILE.
    • Add REGION_COUNT signal to get the number of times a region has been seen.
    • Add REGION_COUNT signal to default trace columns.
    • Add python wrappers for geopm_pio_c, geopm_topo_c, geopm_error_c, and geopm_agent_c interfaces.
    • Add format_function() method to IOGroups to get a formatting function from a signal name.
    • Add IOGroup for Compute Node Linux PM counters.
    • Allow the frequency map used by the FrequencyMapAgent to come from the agent's policy rather than the deprecated environment variable.
    • Add launcher for OpenMPI.
  • New beta features:
    • Add geopmconvertreport script to convert report file into yaml and json.
    • Add a new error type for data store errors.
    • Add PolicyStore class to map agents and profiles to policies.
    • Introduce new Endpoint API, which replaces and extends the ManagerIO.
    • Implement geopm_endpoint_c API.
  • Modified implementations and interfaces:
    • Add CSV class to support CSV files created by GEOPM.
    • Modify Tracer and ProfileTracer to use the CSV class.
    • Add trace_formats() method to Agents.
    • Change freq_sweep analysis to use system max frequency for default max.
    • Move geopmpy package to 'production' status.
    • Minimize set of functions in Environment C interface.
    • Change Environment class variable names for better readability.
    • Update FrequencyMapAgent to use Environment class for its environment variable.
    • Add TEMPERATURE_* signals to list shown by geopmread.
    • Change REGION_RUNTIME signal to reflect time of outer region only.
    • Add MSR turbo ratio limit for KNL.
    • Use max turbo ratio limit for platform max frequency.
    • Remove ability to write turbo ratio limit.
    • Add MPI_Barrier before entering all2all model region.
    • Increase problem size of FFT to D class.
    • Add IMPI support to tutorials.
    • Add feature to geopmagent and Agent interface where partial policies will be completed with NANs.
    • Add SLURM -bootstrap option for IMPI.
    • Add geopm_time_to_string() to convert a time structure into a string.
    • Add write_file() helper function.
    • Add value of policy to report, or DYNAMIC when policy comes from an Endpoint.
    • Separate Agent creation time from init() in Controller.
    • Add DebugIOGroup for extending trace with internal Agent values.
    • Add pthread mutex to beginning of SharedMemory regions, with get_scoped_lock() as the only method to lock the mutex.
    • Remove pthread mutex from ManagerIO struct.
    • Use git ls-tree to generate the MANIFEST in any git repo.
    • Remove m_request_s from PlatformIO public interface.
    • Change RPM to build libgeopmpolicy only and remove check step.
    • Add get_hostnames() method to Controller.
    • Add unlink() method to SharedMemory.
    • Update VERSION with each call to autogen.sh.
    • Do not markup anything in geopmbench if all regions are suffixed with '-unmarked'.
    • Update OMPT interface to newest standard.
    • Use libdl and libelf to map instruction address to symbol name.
    • Remove hard requirement for hosts file usage in tutorials.
    • Remove MacOS portability.
    • Remove signal handling logic from Controller.
    • Change board power min/max/tdp to use sum aggregation.
    • Change power cap policy of PowerGovernorAgent and PowerBalancerAgent to POWER_PACKAGE_LIMIT_TOTAL.
    • Change "mpi-time" in report to "network-time" and change time to include all network time.
    • Rename EPOCH_RUNTIME_MPI signal to EPOCH_RUNTIME_NETWORK.
    • Move Environment class definition to header.
    • Split geopm_pmpi.c into C/C++ parts.
    • Clean up build and run scripts for tutorials.
    • Remove region entry and exit lines from the trace by default; they can be added with --enable-bloat.
  • Improved error messages and warnings:
    • Make prefix of runtime warning strings consistently start with "Warning: ".
    • Improve error message when msr driver can't be loaded.
    • Print a proper message on failure to launch lscpu job.
    • Add more verbose geopm plugin load failure warning.
    • Add more detailed description to geopm_error_message() based on last exception thrown.
    • Change throw to warning for PowerBalancerAgent running on a single node.
    • Fix error message when MSR read fails.
  • Extensive changes to EnergyEfficientAgent algorithm:
    • Change EE Agent to learn separately for each control domain.
    • Add max filtering to EnergyEfficientRegion.
    • Use sticker when passing NaN in the policy.
    • Add PERF_MARGIN as a policy for EnergyEfficientAgent.
    • Do not set frequency for regions shorter than 50 ms or unmarked.
    • Have EE Agent always use min frequency for network regions.
    • Update EE agent to use region count to detect adjacent regions with same hash.
    • Add separate max frequency to use for static policy.
    • Bug fixes and refactoring in EnergyEfficientAgent.
  • Updates to integration tests:
    • Increase iterations for EnergyEfficientAgent test.
    • Decrease margin in test for geopm python wrapper measuring time.
    • Add an integration test checking that chosen frequencies increase monotonically with CPU-bound time in regions.
    • Update integration tests to use new trace file format.
    • Add imbalance to power_balancer integration test.
    • Refactor report mock functions in integration tests.
    • Move integration test helpers into util.py.
    • Add integration test for the epoch data in report.
    • Add msr save and restore calls to test launcher.
  • Updates to unit tests:
    • Add unit tests for EnergyEfficientAgent.
    • Cleanup environment variables in unit tests.
    • Add unit tests for the geopmpy.io module.
    • Add unit tests for the geopmpy.launcher module.
    • Make profile tests work with different task sets.
    • Fix TestAffinity to check for OMP_NUM_THREADS in test setup.
    • Fix ExceptionTest to account for extra char in error message.
  • Updates to documentation:
    • Add Daniel Wilson to the AUTHORS file.
    • Change CONTRIBUTING instructions on how to get version.
    • Add version to geopm man pages.
    • Update man pages and README to describe Environment changes and integration with resource managers.
    • Fix PlatformTopo C++ man page to match new interfaces.
    • Add section to README about user environment for non-standard install.
    • Modify frequency_map man page to use floating point frequencies.
    • Rename geopm_pio_c man page to show its section number.
    • Add man page for Endpoint class.
    • Update endpoint_c man page.
    • Remove references to uninstalled man pages from geopm.7.
    • Remove specific list of available launchers from geopm.7.
    • Add documentation to README for Ubuntu support.
    • Add example for systems programmers using PlatformIO.
    • Fix typos in documentation.
  • Bug fixes:
    • Fix paths for building tutorial from module environment.
    • Fix Tracer handling of # signals from environment.
    • Fix Tracer handling of region hash and hint integers.
    • Fix a bug where regions with the same name as the profile did not appear in the report.
    • Fix trace file cache loading print in io.py.
    • Rename and fix analysis for EE and frequency map agents.
    • Fix a bug where LD_PRELOAD was always set.
    • Update geopmplotter to use agents and cosmetic fixes to plots.
    • Fix geopm::string_split() so it works with multi-character delimiters.
    • Fix build when using --disable-openmp.
    • Fix build when using --disable-mpi.
    • Fix a bug where launcher did not use srun reservation for geopmread cache.
    • Fix placement of verbose flag for geopmbench.
    • Fix epoch reporting when there are no regions.
    • Fix generation of report hdf5 cache.
    • Fix date generation in geopm_time.h.
    • Only overwrite roff pages with ronn if the roff page is missing.
    • Avoid a buffer overrun when copying cpusets.
    • Check if MPI has been finalized before freeing the comm.
    • Fix stderr piping in autogen.sh.
    • Fix build errors from gcc8.
    • Fixes to allow installed headers to be used out of source.
    • Fix a bug where tutorial tarball was not built when docs are disabled.
    • Remove DRAM power from PowerGovernorAgent samples.
    • Avoid loss of precision when converting policies to json strings.
    • Do not use GEOPM_REGION_HASH_INVALID in Agent implementations.
    • Remove '0x' from IMPI affinity mask.
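The v1.1.0 notes above mention that partial policies are completed with NaNs, and that the EnergyEfficientAgent falls back to a default (the sticker frequency) when it receives NaN. A minimal sketch of that idea follows. The function names and the example default values are hypothetical and are not taken from the GEOPM sources.

```python
import math


def complete_policy(partial, num_policy):
    """Pad a partial policy with NaN up to num_policy entries."""
    if len(partial) > num_policy:
        raise ValueError("policy has more values than the agent accepts")
    return list(partial) + [math.nan] * (num_policy - len(partial))


def resolve_defaults(policy, defaults):
    """Replace each NaN entry with the default value at the same position."""
    return [d if math.isnan(p) else p for p, d in zip(policy, defaults)]


# Only the first of three policy values is supplied by the user.
policy = complete_policy([1.2e9], 3)
print(policy)
# Hypothetical per-agent defaults fill the positions left as NaN.
print(resolve_defaults(policy, [1.0e9, 2.1e9, 0.1]))
```

Using NaN as the placeholder lets an agent distinguish "not specified" from any legitimate numeric setting, including zero.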

v1.0.0

5 years ago
  • Tue Apr 16 2019 Christopher M. Cantalupo v1.0.0
  • Release overview:
    • The official 1.0 release of the GEOPM software!
    • Primary changes are bug fixes and documentation updates since release candidate 3.
  • Updates to integration tests:
    • Fix test_runtime_regulator integration test which had improper tolerances for sleep() interface.
    • Update some integration tests to print errors when platform read/write fails.
  • Updates to unit tests:
    • Add more unit tests for launcher affinity.
  • Updates to documentation:
    • Clean up geopm_pio_c(3) and geopm_topo_c(3) man pages.
    • Remove references to Comm man pages that are not installed.
    • Add include and linking instructions to geopm_pio.3.ronn.
  • Installed header clean up:
    • Update PlatformTopo singleton to return const reference.
    • Clean up forward declaration in public header.
  • Bug fixes:
    • Fix tprof API calls when Controller is not present to avoid segmentation fault.
    • Fix issue by removing call to EnergyEfficientRegion::update_freq_range().
    • Fix issue where FrequencyGovernor was being used but not created by agents above the leaf.
    • Fix missing hidden header dependencies.
    • Fix OMP_NUM_THREADS calculation when --geopm-hyperthreads-disable option is provided to launcher.
    • Fix IOGroup and Agent tutorials to use new Agent interfaces.
    • Fix domain for frequency signal/control on some x86 platforms.