Wed Apr 3 2019 Christopher M. Cantalupo [email protected] v1.0.0+rc3
Modified implementations and interfaces:
Finalized interfaces for 1.0.0 release.
Changed class naming scheme to drop "I" prefix from interface base classes and add "Imp" suffix to implementation classes.
Replaced ascend() and descend() Agent methods with more fine grained interface.
Modified MSRIOGroup to use JSON to store MSR data.
Updated utility classes for Agent interface changes.
Removed use of raw pointers from MSRIOGroup.
Added Helper function to list files in a directory.
Renamed split_string() to string_split().
Removed sort call from table dump since no longer needed.
Removed samples sent up tree from MonitorAgent.
Moved "PlatformTopo::m_domain_e" to a C enum "geopm_domain_e" in geopm_topo.h.
Changed GEOPM_DOMAIN_INVALID to -1 and shifted all other domain values by one.
Renamed all references to the PlatformTopo::m_domain_e enum to use geopm_domain_e.
Removed PlatformIO::num_signal() and PlatformIO::num_control() from public interface.
Renamed PlatformIO method is_domain_within() to is_nested_domain().
Moved geopm_region_info_s to geopm.h.
Renamed Agent::report_node() to report_host().
Removed ProfileIOGroup from installed headers.
Renamed CircularBufferImp to CircularBuffer.
Moved MSRSignal and MSRControl into their own files.
Moved Imp classes for installed classes to own non-installed header.
Moved SharedMemory and SharedMemoryUser classes into separate headers.
Introduced FrequencyGovernor that holds common code for setting frequency.
Updated EnergyEfficientAgent and FrequencyMapAgent to use FrequencyGovernor.
Replaced ascend() and descend() methods in all built in agents to use new APIs.
Removed num_signal_pushed() and num_control_pushed() from public PlatformIO APIs.
Made tutorial shell scripts compatible with more shell variants.
Updated features:
Implemented and documented C wrappers for the PlatformIO class: geopm_pio_c(3).
Implemented and documented C wrappers for the PlatformTopo class: geopm_topo_c(3).
Changed implementation to stop sending messages about MPI regions nested inside of network hint regions.
Added command line option to geopmread(1) and geopmwrite(1) to create topology cache file.
Added make_unique and make_shared factory methods to all installed C++ header classes.
Added check for RAPL lock bit when using power controls.
Added UNCORE_RATIO_LIMIT MSR support for HSX, BDX, and SKX.
Added per-region power to Report.
Enabled MSRIOGroup to extend MSRs through JSON file at runtime located in GEOPM_PLUGIN_PATH.
Added MSR methods for parsing function and units strings.
Introduced FrequencyMapAgent which runs regions at specified frequencies.
Added --enable-beta configure flag which installs beta features with make install target.
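The class naming scheme and the factory methods described in this release can be pictured with a minimal sketch. The Counter and CounterImp classes below are hypothetical and are not part of GEOPM; they only illustrate an interface class named without the "I" prefix, an implementation class carrying the "Imp" suffix, and static make_unique and make_shared methods that hand back a pointer to the interface.

```cpp
#include <memory>

// Hypothetical interface class, named without an "I" prefix.
class Counter
{
    public:
        virtual ~Counter() = default;
        virtual void increment(void) = 0;
        virtual int value(void) const = 0;
        // Factory methods so callers never name the implementation class.
        static std::unique_ptr<Counter> make_unique(void);
        static std::shared_ptr<Counter> make_shared(void);
};

// Hypothetical implementation class with the "Imp" suffix; in a layout like
// the one described above it would live in a non-installed header.
class CounterImp : public Counter
{
    public:
        void increment(void) override { ++m_value; }
        int value(void) const override { return m_value; }
    private:
        int m_value = 0;
};

std::unique_ptr<Counter> Counter::make_unique(void)
{
    // C++11 has no std::make_unique, so construct the pointer directly.
    return std::unique_ptr<Counter>(new CounterImp);
}

std::shared_ptr<Counter> Counter::make_shared(void)
{
    return std::make_shared<CounterImp>();
}
```

A caller would write auto counter = Counter::make_unique(); and use only the Counter interface, which keeps the Imp header out of application builds.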
Updated and extended integration tests:
Ignore failures for missing python packages.
Added feature to save/restore power limit and frequency between each integration test.
Updated unit tests:
Added more unit tests for Helper.
Fixed AgentFactoryTest.
Updates to documentation:
Added documentation on MPI requirements for geopm_prof_c(3) APIs.
Removed references to endpoint in documentation since this is still a beta feature.
Added documentation about Agent report/trace extension name conventions.
Added man pages for geopm_pio_c(3) and geopm_topo_c(3).
Added man page for geopm_agent_frequency_map(7).
Bug fixes:
Fixed EnergyEfficientAgent so it actually functions properly.
Fixed issue with using temporary script in launcher to execute lscpu.
Fixed missing input parameter checks in PlatformTopo and PlatformIO.
Fixed Fortran build and missing dependency that could break parallel builds.
Fri Feb 22 2019 Christopher M. Cantalupo [email protected] v1.0.0+rc2
Modified implementations and interfaces:
Rename GEOPM_PROFILE_TIMEOUT environment variable to GEOPM_TIMEOUT.
Modify default behavior when using geopmlaunch: --geopm-ctl=process --geopm-report=geopm.report.
Introduce --geopm-disable-ctl CLI option for geopmlaunch to preserve passthrough behavior.
Remove geopm_prof_init() interface from installed header.
Fix geopmhash example command line tool.
Update plugin loading implementation to use C++.
Refactor IOGroup lookup in PlatformIO.
Modify analysis power sweep to consider multiple packages.
Support lscpu versions that omit 0x from hex values.
Do not install Comm.hpp or MPIComm.hpp.
Modify time signal to be scoped to the CPU.
Rename M_UNITS_HZ to M_UNITS_HERTZ
Add tables module to Python requirements.
Change MSR names to match names in Intel (R) Software Developers Manual.
Make end bit of MSR bitfield inclusive.
Add descriptions for built-in signals and controls.
Align launcher names and programmatically generate list of supported launchers.
Modified Agent::validate_policy() interface.
Add stricter domain checks in TimeIOGroup and CpuinfoIOGroup
Fix configuration and build issues with ompt.
Disable python unit testing in RPM check target.
Remove uninstalled files from spec file.
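The change that makes the end bit of an MSR bitfield inclusive affects how the field width is computed. The sketch below is a hypothetical helper, not the GEOPM MSR class interface; it assumes bit positions are counted from 0 at the least significant bit and shows that a field spanning bits begin_bit through end_bit has end_bit - begin_bit + 1 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a GEOPM API): extract the bits
// [begin_bit, end_bit] of a 64-bit register value, both ends inclusive.
uint64_t extract_field(uint64_t raw, int begin_bit, int end_bit)
{
    assert(0 <= begin_bit && begin_bit <= end_bit && end_bit <= 63);
    // Inclusive end bit: the width counts both the first and last bit.
    int width = end_bit - begin_bit + 1;
    // Shifting a 64-bit value by 64 is undefined, so treat the
    // full-width field as a special case.
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    return (raw >> begin_bit) & mask;
}
```

With this convention a single-bit field is written with equal begin and end bits, for example extract_field(raw, 63, 63) for the top bit of a register.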
Updated features:
Update tracer to enable user specified column signals to also specify domain.
Update reporter to enable user specified signals and domains.
Add REGION_HASH and REGION_HINT signals.
Remove all references to the region_id from public interfaces.
Add domain aggregation for read_signal and write_control.
Add TEMPERATURE as default trace column.
Add split_string() helper function.
Install geopm_hash.h and add man page.
Add helper function to replace gethostname().
Improve trace column header names for PowerBalancerAgent.
Modify how epoch totals are calculated.
Updated and extended integration tests:
Fix fence-post problem in test_trace_runtimes.
Skip EnergyEfficientAgent integration test on non-BDX platforms.
Updated unit tests:
Fix timing issue with PowerGovernorAgentTest.wait test.
Fix geopmagent CLI test.
Clean up PlatformIOTest.
Update to googletest v1.8.1.
Optimize Travis CI build.
Updates to documentation:
Update man pages to reflect environment extension of report and trace.
Update man pages for Agg, CircularBuffer, IOGroup, Exception, Helper, RegionAggregator, SharedMemory, PluginFactory, MSR, MSRIO, and MSRIOGroup classes.
Update geopm_region_id_c.3 man page.
Update geopm_sched.3.ronn.
Clean up geopmlaunch man page.
Update man pages for IOGroups
Add tutorial about plugin loading order.
Add missing links to geopm(7) man page.
Update copyright date to 2019.
Use BLURB in geopm.7 man page.
Sync spec file for OpenHPC with the one published with OpenHPC.
Change die.net links to man7.org
Bug fixes:
Fix all timeouts for usages of SharedMemoryUser to reflect geopm_env_profile_timeout().
Fix energy status units for DRAM on Haswell and Broadwell.
Fix energy reporting on multi-socket systems.
Fix issue when application calls MPI_Init_thread() to increase thread level to match GEOPM requirements.
Fix broken build when configured with --enable-overhead.
Fix issues detected with clang.
Fix launcher args for IMPI.
Fix throw in Tracer when reading hash and hint which are allowed to be zero.
Fri Dec 21 2018 Christopher M. Cantalupo [email protected] v1.0.0-rc1
Release overview:
This is the first candidate for the v1.0.0 release of the GEOPM package.
Version 1.0 is significant in that semantic versioning (https://semver.org/) is intended for all subsequent releases.
The APIs defined by all installed header files and the documented behavior of those interfaces shall remain compatible with linking applications until version 2.0.
The documented definition for all built in signals and controls supported by PlatformIO is not intended to change prior to version 2.0.
Expected changes prior to v1.0.0 release:
The documentation included in this release candidate will be improved upon prior to the actual v1.0.0 release.
Man pages which currently link to doxygen will be filled in.
The definition of the high order bits in the REGION_ID# signal supported by PlatformIO may be changed in the way documented in the PlatformIO(3) man page to split it into two signals (REGION_ID and REGION_HINT).
It is possible that interface classes currently prefixed with "I" may be renamed to exclude the "I" (e.g. IPlatformIO -> PlatformIO).
In this case the concrete implementation would be appended with "Imp" (e.g. PlatformIO -> PlatformIOImp).
The appearance of the epoch signal in the REGION_ID column of the trace will be removed.
The EPOCH_COUNT signal will be added to the default set of traced signals to enable tracking of epoch calls.
High level summary of changes since v0.6.1:
With this release we have removed all references to the Policy, Decider, Platform and PlatformImp objects.
These have been replaced by the PlatformIO / IOGroup / Agent class interactions.
The Kontroller object which was supporting the new code path has been renamed Controller.
The legacy Controller implementation has been removed.
GEOPM no longer depends on the hwloc library, and relies on running lscpu on the compute node instead.
Modified implementations and interfaces:
Rename launcher to geopmlaunch.
Do not install geopmanalysis and geopmplotter command line utilities.
The command line interfaces for these tools will be changing.
Once they are committed, we will begin installing them again.
Remove unused error codes from geopm_error.h.
Remove some deprecated interfaces and files.
Remove legacy artifacts from Reporter and Tracer.
Remove legacy structures from geopm_message.h.
Remove deprecated API headers.
Remove CtlConf Python object.
Remove region ID memory from derivative for power signals, this is a feature for agent to implement.
Remove unused arguments from the geopmctl_main.
Remove push_combined_signal() from PlatformIO interface.
Remove NAN check for policy in Controller. Agents are responsible for handling NAN.
Remove IPlatformTopo::define_cpu_group(). This method is not implemented and not used.
Remove MPI bit from region ID in report.
Remove install of geopm_message.h and geopm_plugin.h.
Remove environment variables for min/max frequency used by EnergyEfficientAgent: this functionality is provided through the policy as documented.
Fixes for online mode of EnergyEfficientAgent: ignore 0.0 when sampling runtime, fix min/max frequency range in analysis.py, fix final requested frequency printed in report.
EnergyEfficientAgent no longer considers DRAM energy in its optimization.
Change default frequency for hints from min to max in EnergyEfficientAgent.
Implement EnergyEfficientAgent analysis using hints only.
Change meaning of EPOCH_RUNTIME signal: MPI and ignore time are reported explicitly and separately.
Install many C++ headers into /usr/include/geopm.
Move geopmbench source files from tutorial directory into src.
Don't copy any files from src into tutorials.
Update tutorials to use Agent code path.
Throw if multiple hints given to geopm_prof_region.
Allow writing controls for containing domains: the same value will be written to every subdomain.
Update EpochRuntimeRegulator accounting: PKG and DRAM energy dissociated from rank.
Updated to report pre-epoch MPI and ignore runtime.
Make TreeComm fan out configurable with environment variable.
Per thread progress is supported by the 'REGION_THREAD_PROGRESS' signal.
Align command line options to the launcher and the environment variables used by the controller.
Merge tutorial Makefiles into one and remove duplicate scripts.
Rename runtime related APIs.
Merge ProfileIO into ProfileIOSample.
Refactor analysis.py command line parsing to use argparse, etc.
Move some header includes from headers into source files when possible.
Change "POWER_PACKAGE" control name to "POWER_PACKAGE_LIMIT".
Expose MSR PKG_POWER_LIMIT fields as signals.
Reorder directory search in plugin load: load plugins from right to left so that the leftmost plugin wins when IOGroups load the same name for controls and signals.
Use accumulator member in EpochRuntimeRegulator for MPI runtime.
Changes to the launcher for use of mpiexec with hydra.
Move set_policy_defaults to Agent interface.
Aggregation functions have been moved out of PlatformIO and into their own class: Agg.
Implement agg_function for IOGroups, including tutorial.
Do not stop integration test in looper if one test fails.
Increase shmem table size to 2MB per rank to reduce risk of overflow.
Remove hash table structure in ProfileTable; all regions now use the same table entry.
Change CpuinfoIOGroup to throw in constructor if cpuinfo could not be parsed.
In python analysis do not parse traces if total size is more than half of memory.
Remove redundant HDF5 cache from analysis.py.
Remove TURBO_RATIO_LIMIT2 control for platforms where it is not in whitelist.
Read multiple samples for a short time in geopmread to support POWER signals.
Narrow scope of warning message about cpufreq governor: only print warning when an attempt is made to write to a control that begins with POWER or FREQUENCY.
Prevent MSRIOGroup from throwing when saving MSRs.
Implement and use AgentConf in python code to create agent policies.
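The right-to-left plugin search order noted in this list can be modeled with a short sketch. Everything here is hypothetical: the resolve_providers function and the DirContents type are illustrative stand-ins, not the GEOPM plugin loader, and the sketch assumes that a plugin loaded later replaces an earlier provider of the same signal or control name.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical model of plugin discovery: each directory on the search
// path offers a set of signal or control names.
using DirContents = std::map<std::string, std::vector<std::string> >;

// Returns, for every offered name, the directory whose plugin ends up
// providing it.
std::map<std::string, std::string> resolve_providers(const std::string &search_path,
                                                     const DirContents &contents)
{
    // Split the colon-separated path into directories, left to right.
    std::vector<std::string> dirs;
    std::istringstream stream(search_path);
    std::string dir;
    while (std::getline(stream, dir, ':')) {
        if (!dir.empty()) {
            dirs.push_back(dir);
        }
    }
    // Load from right to left. A later load replaces an earlier one
    // (assumed behavior), so the leftmost directory has the final say.
    std::map<std::string, std::string> provider;
    for (auto it = dirs.rbegin(); it != dirs.rend(); ++it) {
        auto found = contents.find(*it);
        if (found != contents.end()) {
            for (const auto &name : found->second) {
                provider[name] = *it;
            }
        }
    }
    return provider;
}
```

Under these assumptions, placing a site-specific directory first on the path lets it override a name that a system directory also provides.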
Updated features:
Add timestamp counter to available signals.
Add --info option to geopmread and geopmwrite.
Add check for invalid GEOPM_CTL values.
Add temperature signals.
Add Imbalancer interface to libgeopm and libgeopmpolicy: Imbalancer_*() -> geopm_imbalancer_*().
Add some placeholder descriptions to MSRIOGroup and TimeIOGroup to support integration tests.
Add methods to RegionAggregator to get region IDs and signals.
Add methods to PlatformIO to provide signal/control descriptions: this will be used to augment geopmread/write with descriptions.
Add description APIs for IOGroup: allows IOGroups to provide a user-friendly description of signals/controls.
Add GEOPM_TIME_REF constant for use with geopm_time_*() APIs.
Add INSTRUCTIONS_RETIRED alias signal.
Add TIMESTAMP_COUNTER alias for MSRIOGroup.
Add signal to enable reading of the RAPL lock bit.
Add PKG_POWER_LIMIT MSR fields as a signal.
Add expect_same aggregation function that returns NAN if any elements of the vector differ.
Add average node frequency to EnergyEfficientAgent tree samples.
Add support for POWER_* as signals that give meaningful results without runtime.
Add module conflict of darshan to theta module file.
Add psutils python dependency.
Add warnings for system misconfiguration.
Add read_file() to Helper.hpp.
Add job start in Trace and Report headers.
Add outlier detector script.
Add handling of NAN for default policy values to all agents.
Add parsing for overhead fields to io.py.
Add reading of the thread table through PlatformIO.
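The expect_same aggregation function listed above returns NAN if any elements of the vector differ. A minimal free-function sketch of that behavior follows; the signature is an assumption rather than the declaration from the Agg class, and returning NAN for an empty vector is a choice made only for this sketch.

```cpp
#include <cmath>
#include <vector>

// Sketch of an "expect same" aggregation; the signature is assumed and
// may differ from the one GEOPM installs.
double expect_same(const std::vector<double> &operand)
{
    if (operand.empty()) {
        return NAN;  // assumption: nothing to agree on
    }
    const double first = operand.front();
    for (double value : operand) {
        // Any mismatch (including a NAN element) yields NAN.
        if (value != first) {
            return NAN;
        }
    }
    return first;
}
```

This kind of aggregation suits signals that should be identical across subdomains, where a disagreement is better surfaced as NAN than averaged away.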
Updated and extended integration tests:
Ignore misconfigured system warnings in integration test.
Remove ignore of multiple plugin load warnings that stopped occurring after removal of legacy code.
Do not test epoch runtime in test_region_runtimes.
Add all2all to power_balancer integration test.
Adjust power_balancer test logic to compare Governor and Balancer relatively.
Fix EnergyEfficientAgent integration test.
Test decorators implemented to use launcher. This forces the checks to be run on the compute nodes.
Update integration tests to reflect removal of legacy code path.
Update test_power_consumption to use PowerGovernor.
Fix integration test to exclude MPI and model-init regions from tests using traces.
Fix integration test to use assertNear to account for new MPI region markup.
Move GEOPM_EXEC_WRAPPER functionality into integration test.
Updated unit tests:
Add tests of domain aggregation for pushed signals.
Add test for geopmread signal aggregation.
Stop the unit tests from littering files.
Fixed signed / unsigned comparison issue in PlatformIO test.
Update unit tests to reflect removal of legacy code path.
Add test of IOGroup factory that checks that an IOGroup's list of signal/control names are all valid.
Updates to documentation:
Update GEOPM main README.
Add doxygen target for public interface files.
Add man pages for all C++ headers that are now installed to support plugin development.
Full man pages have been added for PluginFactory, PlatformIO, PlatformTopo, Agent, and IOGroup.
Add documentation about aliasing signals and controls.
Update launcher ronn to include references to env vars.
Add README for outlier_detection.
Update the tutorial README.md to reference geopmbench and point out the agent and iogroup subdirectories.
Document how to build GEOPM with Intel Toolchain.
Fix example source code in geopm_prof_c.3 man page.
Add man pages for geopm_time.h and geopm_imbalancer.h.
Update Doxygen to reflect removal of legacy code path.
Remove alpha and beta labels from documentation.
Bug fixes:
Fix how starting energy counters are recorded in EpochRuntimeRegulator.
Fix timestamp issue with Tracer.
Fix region handling in Reporter hints.
Fix OMPT enabled pthread launch with Controller/Agent.
Fix for invalid function for some MSR signals.
Fix for EnergyEfficientAgent policy: initialize min and max frequency to NAN.
Fix EnergyEfficientAgent offline analysis parsing.
Fix geopmbench stream benchmark which was using too little memory.
Fix python tests to print better warnings and avoid print command.
Fix for MPI region entry: MPI regions used in GEOPM startup were given a region ID of 0.
Fix initialization of per rank ignore and mpi runtime.
Fix default policy generated by geopmagent to properly represent NAN.
Fix reporting of MPI and ignore runtime prior to first epoch for report totals.
Contributing instructions updated with details of gerrit review process.
Modified implementations and interfaces:
Major refactor of the controller and plugin architecture is provided as an optional new code path.
Most of the changes made to the implementation for this release modify the new code path.
The old code path is still available for users as long as the controller is run without the GEOPM_AGENT environment variable set.
The new code path will be active if the user selects an agent by name with the GEOPM_AGENT environment variable when launching the controller.
The old code path is maintained in the current Controller object along with the Decider / Platform / PlatformImp plugins.
The new code path is maintained in a replacement for the Controller which has been temporarily named the Kontroller.
The Kontroller will be renamed the Controller after this release, and the old code path will no longer be available.
Similar to the Kontroller/Controller replacement, the KprofileIOGroup, KprofileIOSample, and KruntimeRegulator are temporary replacements for their non-K counterparts and will be renamed.
The beta release enables a new set of plugin interfaces named the IOGroup, Agent, and Comm.
It is through the IOGroup, Agent and Comm plugins that the GEOPM runtime can be extended.
The Decider / Platform / PlatformImp plugin extensions are deprecated and will be removed after this release.
The IOGroup plugin enables a user to add new signal and control mechanisms for an Agent to read and write.
The Agent plugin enables a user to add new monitor and control algorithms to the GEOPM runtime.
MPI use by the GEOPM runtime which is not linked by application has been completely encapsulated in the Comm object.
The tutorial has been extended with two new directories: tutorial/agent and tutorial/iogroup.
The tutorial/iogroup directory documents how to write an IOGroup plugin.
The tutorial/agent directory documents how to write an Agent plugin.
The interface to the resource manager has been made much more flexible for supporting the new Agent interfaces.
The resource manager interface is documented in the geopm_agent_c(3) and geopm_endpoint_c(3) man pages.
Additionally command line tools have been proposed and partially implemented to support the interfaces documented in those man pages.
The geopm_agent_c(3) APIs and geopmagent(1) CLI have software support.
The endpoint interfaces are a work in progress that has not yet been integrated into the mainline source.
The PlatformIO object provides the interface to the IOGroups.
The PlatformIO C++ object will soon have an associated C interface documented as geopm_platformio_c(3).
The geopmread and geopmwrite tools provide a CLI to the PlatformIO features.
Introducing the MSRIOGroup which provides an implementation of the IOGroup for MSRs.
Introducing the TimeIOGroup which provides an IOGroup for the time signal.
Introducing the CpuinfoIOGroup which provides data from /proc/cpuinfo as signals.
Introducing the ProfileIOGroup which provides profile data collected from the main compute application through the geopm_prof_c(3) APIs.
The release includes three new installed binaries: geopmread, geopmwrite, and geopmagent.
Each of these command line interfaces is documented with a man page and there is a man page for a future command line tool called geopmendpoint.
Deprecated the geopm_policy_*() interfaces, which have been replaced with the geopm_agent_*() and geopm_endpoint_*() APIs.
Introducing the first three Agent implementations: MonitorAgent, PowerBalancerAgent, and EnergyEfficientAgent.
Introducing PlatformTopo, replacement for PlatformTopology.
Introducing DefaultProfile singleton which supports geopm_prof_c(3) APIs for profiling.
Added documentation for monitor, energy_efficient, and power_balancer Agents, but the implementation is not currently aligned.
The monitor agent is implemented and fully featured.
The energy_efficient agent will soon be extended to match the man page, and currently use of the network is not enabled.
The existing implementation of the energy_efficient agent does currently provide similar functionality to the efficient_freq Decider.
The power_balancer agent is a work in progress that is not well aligned with the man page, but will be feature complete soon.
Reports and traces generated by Agent code path are designed to be backward compatible with reports and traces generated with the Decider code path.
New environment variables documented in geopm(7): GEOPM_ENDPOINT, GEOPM_AGENT, GEOPM_TRACE_SIGNALS, and GEOPM_DISABLE_HYPERTHREADS.
Remove GEOPM_ERROR_AFFINITY_IGNORE environment variable, no longer required for testing.
New plugin registration mechanism has been put in place and new factory has been implemented.
Replace independent factories with a single templated class, the PluginFactory.
No longer register a plugin using a half instantiated object.
Removed call to dlsym, and plugins now use __attribute__((constructor)) to specify a callback target used when plugin is loaded.
In this callback the plugin should register with its respective factory.
Each plugin type has a make_plugin() static method that creates the plugin object and returns a pointer to the base class.
The make_plugin() function pointer is what is registered with the factory.
Extend the PluginFactory to require the registration of a dictionary (map<string,string>) to enable queries of plugin capabilities.
Use stricter criterion for selecting plugin files to load, name must be of the form libgeopmpi*.so.0.0.0 where 0.0.0 is the GEOPM ABI version.
Moved geopm_plugin_description_s definition to geopm.h.
Add a configure option to enable use of the msr-safe ioctl interface for writing with PlatformIO.
Added APIs for manipulating hint bits in region id hash.
Many changes were made to modernize the use of C++.
Change protected members of all classes to private where possible.
Replace all raw pointer usage with C++11 smart pointers if possible.
Use default keyword for constructors and destructors where appropriate.
Use delete keyword rather than throw to avoid copy constructor.
Add override keyword to derived classes.
Use forward declaration of classes rather than include one header inside of another.
Add and integrate make_unique implementation for C++11.
Confirmed const correctness for all class methods.
Add public interface to register IOGroups with PlatformIO which enables IOGroups to be created at runtime.
Standardize the IOGroup signal and control names so that they are prefixed by the IOGroup name and two colons.
Agents should generally use high level aliases rather than these low level signals and controls.
Introduce functions for converting between signals and bit-fields to allow for PlatformIO to provide full 64 bit integer signals like the region ID.
Add overflow function type to MSR class.
Change frequency APIs to use Hz to enforce uniform use of SI units.
Use instruction offset in OMPT derived region name; this resolves a name ambiguity when more than one OpenMP region is discovered within the same function.
Use gmock archive uploaded to the geopm organization on github.
PlatformTopo is built on top of lscpu and does not require hwloc.
Throw on GlobalPolicy misconfiguration earlier in the runtime execution.
Rename SimpleFreqDecider to EfficientFreqDecider which will be replaced by EnergyEfficientAgent.
Update to efficient Decider and Agent related environment variables according to above name changes.
The json-c library is no longer a dependency, all references have been removed.
Now using the json11 library which is distributed in the "contrib" sub-directory.
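The plugin registration mechanism described in this list (a load-time callback marked with the GCC/Clang constructor attribute, which registers a static make_plugin() method with a factory) can be sketched as follows. All names here are hypothetical: Plugin, plugin_registry, and ExamplePlugin are illustrative and are not the GEOPM PluginFactory interface. The sketch also returns a std::unique_ptr rather than a raw base class pointer, which is a choice made for the example.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical plugin base class (not a GEOPM type).
class Plugin
{
    public:
        virtual ~Plugin() = default;
        virtual std::string name(void) const = 0;
};

using MakePlugin = std::function<std::unique_ptr<Plugin>()>;

// Hypothetical factory storage. A function-local static guarantees the
// map is constructed before the first load-time callback uses it.
std::map<std::string, MakePlugin> &plugin_registry(void)
{
    static std::map<std::string, MakePlugin> instance;
    return instance;
}

class ExamplePlugin : public Plugin
{
    public:
        std::string name(void) const override { return "example"; }
        // Creates the plugin object and returns it through the base class.
        static std::unique_ptr<Plugin> make_plugin(void)
        {
            return std::unique_ptr<Plugin>(new ExamplePlugin);
        }
};

// Compiler extension (GCC and Clang): this function runs when the shared
// object, or the program itself, is loaded, with no dlsym lookup needed.
static void __attribute__((constructor)) example_plugin_load(void)
{
    // Register the creation function, not a half-built object.
    plugin_registry().emplace("example", ExamplePlugin::make_plugin);
}
```

Because only a function is registered, no plugin object exists until a caller asks the registry to create one by name.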
Updated features:
Enable Agent to augment report and trace.
Enable user to augment trace through environment variable GEOPM_TRACE_SIGNALS in new code path.
Changes to PlatformIO to support non-CPU domains.
Added MSR save/restore functionality to PlatformIO save/reset interfaces.
Allow loading PlatformIO when some IOGroups fail to load.
Add aggregation functions to PlatformIO to encode how to combine signals.
Add PlatformTopo methods for converting domain to string and vice-versa.
Add signal_names() and control_names() to PlatformIO and IOGroup.
Add Skylake server (SKX) as a supported platform.
Add Haswell and SandyBridge MSRs to PlatformIO interface.
OMPT report region names include instruction offset, now two OpenMP regions within the same function can be distinguished.
Add region runtime as default trace column.
Simpler column names in trace; print some columns using old names.
Change region ID to hex in report and trace.
Order regions in report by runtime.
Add application total ignore time to report.
Replace tabs with spaces for report formatting.
Enable PlatformIO to support Epoch based signals.
Add power signals to PlatformIO using derivative calculation previously done in Region object.
Add PlatformIO aliases for region ID, progress, frequency and energy.
Add CombinedSignal class which is used to combine signals from different IOGroups.
Allow for a user provided number of experiment iterations (loops) to perform for each geopmanalysis type.
Enable geopmanalysis to provide more detailed information about the results.
Allow turbo to be skipped by geopmanalysis when determining the best per-region frequencies.
Updates to geopmanalysis python script to bypass trace parsing if requested and in debug plot ignore check for multiple profile names.
Use hyphen instead of underscore in geopmanalysis options for consistency with other interfaces.
Don't require -n and -N with geopmanalysis when skipping launch.
Pass output_dir through to plotter when using geopmanalysis.
Changes to analysis.py for SC17 data: multiply energy percent by 100, have frequency sweep plots use frequencies from profile name.
Add geopmanalysis option to specify controller launch method.
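The power signals mentioned above are obtained by taking a time derivative of an energy signal. One common way to do that is a least-squares slope over a short window of samples, sketched below. The power_from_energy function is hypothetical, and the least-squares fit is an assumption about how such a derivative could be computed rather than a description of the PlatformIO implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: least-squares slope of energy (joules) against
// time (seconds), which is the average power in watts over the window.
double power_from_energy(const std::vector<double> &time,
                         const std::vector<double> &energy)
{
    const std::size_t num = time.size();
    if (num < 2 || energy.size() != num) {
        return NAN;  // not enough history to form a derivative
    }
    double mean_t = 0.0;
    double mean_e = 0.0;
    for (std::size_t idx = 0; idx != num; ++idx) {
        mean_t += time[idx];
        mean_e += energy[idx];
    }
    mean_t /= num;
    mean_e /= num;
    double covariance = 0.0;
    double variance = 0.0;
    for (std::size_t idx = 0; idx != num; ++idx) {
        covariance += (time[idx] - mean_t) * (energy[idx] - mean_e);
        variance += (time[idx] - mean_t) * (time[idx] - mean_t);
    }
    // All samples at the same time stamp give no usable slope.
    return variance > 0.0 ? covariance / variance : NAN;
}
```

Returning NAN until at least two samples are available matches the general idea that a derivative needs history before it is meaningful.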
Updated and extended integration tests:
Integration tests validated with the GEOPM_AGENT set to test new code path.
A few problems with the new code path exposed by integration tests have been added to github issues.
A few changes to support integration tests with new code path have been integrated.
Change io.py and integration tests: Allow hex numbers for region ID in report, skip extra lines in report.
Remove Platform plugin registration.
Update EfficientFreqDecider to use new runtime metric for performance.
Update EfficientFreqDecider to use PlatformIO directly and remove method from Policy object for adjusting frequency.
Updated unit tests:
Many unit tests have been added to accompany the new code path which has many new classes.
The new classes were specifically designed to enable unit testing of the poorly covered code that they refactor.
Refactor Profile constructor into testable functions.
Add unit tests for Profile class.
Simple profile class in test directory for testing and debug: enables profiling of the GEOPM runtime itself.
More detailed checks of messages in unit tests when exceptions are thrown.
Fix test-license to assert that files in MANIFEST.EXEMPT exist.
Remove TestPlugin code that is not used by tests.
Add make check target to tutorial build.
Bug fixes:
Update GEOPM runtime C APIs to print to standard error instead of having the controller suppress error messages.
Handle exceptions that occur during app/controller handshake.
Enable timeout rather than hang if Controller or application fail during execution.
Fix for package-scoped MSRs that will write to all CPUs in a package rather than just one.
Fix HSX and SKX frequency control MSRs to core domain.
Fix issue when running on systems with offline CPUs.
Do not report a completed send if policy or sample contains a NAN.
Fix lscpu parsing for offline CPUs.
Exclude regions with 0 count from report, except unmarked region, which is always 0.
Add verbose error message when PluginFactory::dictionary() is called with plugin name that has not been registered.
Fix get_alloc_nodes for slurm in geopmpy launcher.
Fix for test_power_consumption to check the current platform cpuid to decide power budget.
Fix geopmpy.launcher for Intel's mpiexec: does not accept -- as a separator for positional arguments.
Fix for when GEOPM_PLUGIN_PATH contains multiple paths.
Fix tutorial tarball so that it will build out of place.
Fix shared memory issues during start-up when launching the Controller as a separate application.
Remove erroneous double split of the Controller's comm; the ppn1 comm is already passed into the constructor.
Fix test to use in-memory file system to avoid adding missing msync() calls.
Fix resource leak in TreeCommunicator constructor.
Fix tracing capability with geopmanalysis.
Leave -- separator in list of arguments to avoid parsing command line arguments intended for application as launcher arguments.
Updated algorithm for choosing CPU affinity in the launcher: fill application CPUs from back to front, and never share physical cores between MPI ranks.
Created new abstraction for interfacing with MSRs and more broadly for abstracting hardware IO (PlatformIO, MSRIO, and MSR classes).
Application region hints are now properly exposed to the decider.
Added geopmanalysis executable to the geopmpy package; this executable runs applications and performs analysis of power and performance based on GEOPM report and trace data.
Added geopmbench to the installed binaries; this is simply an installed version of the tutorial_6 executable.
Added GEOPM_RM environment variable and --geopm-rm command line option to select geopmpy.launcher's back end resource manager.
Updated man pages to include geopmanalysis and geopmbench.
Removed handling of SIGCHLD signal in GEOPM runtime (commonly raised in non-error conditions when using popen(3)).
Launcher will guess correct number of OpenMP threads if user has not specified.
Added warning message at start up if report and trace files will not be created due to permissions issues.
Added better error handling to tutorial sources.
Added support for geopmctl to be run as a different user than application.
Added support for user provided shmkey's that do not begin with '/'.
Added error checking in launcher for when the user requests more ranks per node than there are cores per node.
Added more robust error checking for command line issues in launcher.
Added command line option to launcher to exclude use of hyperthreads: --geopm-disable-hyperthreads.
If a plugin fails at registration time, do not bring down the controller; a warning is printed if debug is enabled.
Remove -s parameter from geopmctl CLI (was being ignored).
Encapsulated use of MPI by GEOPM inside of a class abstraction (IComm), but controller has not been modified to use the new class due to deadlock bug.
Encapsulated in a class the handshake interface between the controller and the application across shared memory.
General clean up of the geopmpy.plotter implementation.
Added more error checking in Controller.
Some fixes for issues exposed by static analysis.
Updated features:
Added new decider called "simple_freq" that adjusts CPU frequency to save energy with a small impact to performance; name will likely change to "efficient_freq" in the future.
Added region runtime reporting to traces and Region objects based on the average execution time of a region by all of the ranks on a node.
Added a method to the Region object to give access to the telemetry time stamps to the decider.
Added online learning approach to energy efficient frequency decider.
Added support to geopmpy.launcher for launching with Intel(R) MPI's mpiexec.
Added option to plotter to use all samples or just epoch samples.
Modified the tutorials to enable use of the geopmpy launcher.
Improved tutorial Makefile to allow user override of GNU Make standard variables.
Added an RPM spec file for use with the OpenHPC distribution.
Updated and extended integration tests:
Moved Controller death test from the unit tests to the integration tests.
Added integration tests for pthread and application launch of the controller.
Added an isolated hardware test for RAPL power limit functionality.
Updated documentation: both man pages and doxygen have been reviewed and cleaned up.
Updated unit tests:
Added unit test for SubsetOptionParser.
Reduced dependence of unit tests on MPI runtime.
Removed MPIProfileTest unit test which is covered by integration tests, and not really a unit test.
Removed unused MPIControllerTest.
Removed MVAPICH2 Fortran tests.
Bug fixes:
Fixed broken build in tutorials (tutorial_region.c).
Fixed faulty argument parsing by the geopmpy launcher.
Fixed error reporting when using geopmpy with python 3.x.
Fixed issues with affinity when launching the controller as a pthread.
Fixed issue in passing power budgets down a multi-level tree.
Fixed issue in platform choice when head node architecture differs from the compute nodes.
Fixed broken build if --disable-doc configuration option is passed.
Fixed decider setup code to correctly propagate power bounds down tree.
Fixed the way RAPL time window is set.
Fixed the use of cached data by geopmpy.plotter.
Fixed integration test issues related to systems with multiple cluster node partitions.
Fixed process CPU affinity implementation (don't use hwloc) and added unit tests for this.
Fixed potential overflow issue with error messages in PlatformImp.cpp.
Fixed race in SharedMemory test.
Fixed markup patch for MiniFE.
Fixed launcher when user explicitly requests OMP_NUM_THREADS=1.
Fixed MPIInterfaceTests so it uses only mocked MPI interfaces, and does not explicitly require MPI.
Fixed memory leaks in GlobalPolicy.
Fixed linking order of libgeopm and libmpi.
Fixed non-performance mode integration test launcher.
Fixed issue where libgeopmpolicy had false dependence on OMPT.cpp.
Fixed rpm Makefile target to avoid the rpmbuild -t option, which would try to use the OpenHPC spec file.
Fixed issue where platform topology could be determined from nodes other than the ones that run the job.
Fixed Intel(R) MPI launcher's use of host files and the --ppn CLI.
Fixed incompatibility between MVAPICH2 affinity and srun affinity.
Fixed test_progress_exit integration test to account for extrapolation error.
Fixed integration test for MPI time accounting.
Fixed launcher problem when node is listed in multiple queues by sinfo.
Fixed and improved affinity assignment in corner cases.