Supvisors: A Control System for Distributed Applications
Refactoring of the Supvisors internal communications.
The internal_port of the Supvisors section in the Supervisor configuration file is no longer needed.
As a consequence, the supvisors_list option is simplified as follows: <identifier>host_name:http_port.
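The simplified item format can be illustrated with a small parser. This is only a sketch: the regular expression, the InstanceItem type and the parse_item helper are hypothetical names, and it assumes that both the <identifier> prefix and the :http_port suffix are optional. It is not the parser used by Supvisors.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for one supvisors_list item: <identifier>host_name:http_port
# Assumption: the <identifier> prefix and the :http_port suffix are both optional.
ITEM_PATTERN = re.compile(
    r'^(?:<(?P<identifier>[\w\-.]+)>)?(?P<host>[\w\-.]+)(?::(?P<port>\d+))?$')


class InstanceItem(NamedTuple):
    identifier: Optional[str]
    host_name: str
    http_port: Optional[int]


def parse_item(item: str) -> InstanceItem:
    """Split one supvisors_list item into its parts (illustrative only)."""
    match = ITEM_PATTERN.match(item.strip())
    if not match:
        raise ValueError(f'invalid supvisors_list item: {item!r}')
    port = match.group('port')
    return InstanceItem(match.group('identifier'), match.group('host'),
                        int(port) if port else None)
```

For instance, parse_item('<sup01>rocky51:60000') would yield the identifier sup01, the host name rocky51 and the HTTP port 60000 (all three values are made up for the example).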
The transitional SupvisorsInstanceStates.ISOLATING state has been removed.
The remote Supvisors instance becomes SILENT as soon as the publication of events fails due to a transport issue.
Implement Issue #50.
A new tag operational_status in the Application rules allows declaring the formula used to evaluate the
application operational status.
status_formula is added to the result of the XML-RPC get_application_rules.
Implement Issue #15.
A StarterModel has been added to Supvisors to give a prediction of the application distribution when started.
The command is available through the new XML-RPCs test_start_application and test_start_process, which have also been
added to supervisorctl.
The Supvisors core_identifiers option and the Supvisors rules now accept either Supervisor
identifiers or keys in the host:http_port format.
Update the get_instance_info XML-RPC so that the function accepts a stereotype as parameter.
As a consequence, it now returns a list of dictionaries.
Add a lazy attribute to the update_numprocs XML-RPC. When it is set and numprocs decreases,
Supvisors defers the deletion of the obsolete processes from the Supervisor configuration until the processes stop
(exit, crash or later user request) instead of stopping them immediately.
Add monotonic time in the internal model and exchanges to cope with system time updates while Supvisors is running.
This impacts the XML-RPCs get_instance_info and get_process_info, and the event interface for instance status
and process events.
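The reason for carrying a monotonic time next to the wall-clock time can be sketched as follows. The field names now and now_monotonic and the two helpers are hypothetical, chosen for the illustration; only time.time and time.monotonic are standard Python calls.

```python
import time


def stamp() -> dict:
    """Return both clocks: wall-clock for display, monotonic for durations."""
    return {'now': time.time(), 'now_monotonic': time.monotonic()}


def elapsed(start: dict, end: dict) -> float:
    """Duration between two stamps, unaffected by system clock updates."""
    return end['now_monotonic'] - start['now_monotonic']


first = stamp()
time.sleep(0.05)
second = stamp()
duration = elapsed(first, second)
```

A duration computed from the wall-clock values could become negative or jump if the system time were adjusted between the two stamps, whereas the monotonic difference cannot go backwards.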
Add new get_statistics_status, enable_host_statistics, enable_process_statistics and update_collecting_period
XML-RPCs to get and update the collection of host and process statistics.
The corresponding commands stats_status, enable_stats and stats_period have been added to supervisorctl.
The JAVA client includes the new XML-RPCs.
Add new get_all_inner_process_info and get_inner_process_info XML-RPCs to support debug investigation.
They return internal information on the processes declared on a Supvisors instance.
Move the host statistics collector to the statistics collector process.
The option stats_collecting_period is now applicable to the host statistics collector.
Re-apply any process extra_args when restarting the application.
In the Supervisors navigation menu of the Web UI, add a red light to Supervisor instances having raised a failure.
Allow the display of a software name and icon at the top of the Supvisors Web UI.
The options software_name and software_icon have been added to the Supvisors section of the Supervisor
configuration file.
All internal identifiers are now based on the host:http_port format.
Rename the DEPLOYMENT state to DISTRIBUTION to remove ambiguity ("deployment" carries a specific connotation
in the orchestration domain).
Rework Supvisors RPCInterface exceptions.
Rework the Web UI.
Fix a bug that was randomly blocking Supvisors on restart or shutdown, due to a stdout flush hanging in the multiprocessing internals. The statistics Process is now started before any other thread.
guest time removed from the CPU calculation because it is already accounted for in user time on Linux.
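The effect of this fix can be sketched with a simplified CPU load computation. The CpuTimes type and the busy_percent helper are hypothetical and only keep a subset of the counters; the point is that guest is left out of the sum because user already contains it on Linux.

```python
from typing import NamedTuple


class CpuTimes(NamedTuple):
    """Subset of the cumulative CPU counters, as found in /proc/stat."""
    user: float    # on Linux, already includes guest time
    system: float
    idle: float
    guest: float   # kept for information, never added to the busy time


def busy_percent(prev: CpuTimes, curr: CpuTimes) -> float:
    """CPU load between two samples, without counting guest time twice."""
    busy = (curr.user - prev.user) + (curr.system - prev.system)
    total = busy + (curr.idle - prev.idle)
    return 100.0 * busy / total if total > 0 else 0.0
```

Adding the guest delta to busy would overestimate the load on hosts running virtual machines, since those ticks would be counted once in user and once in guest.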
Fix process CPU so that it corresponds to the Linux top result.
Use the latest versions of Sphinx-related modules for documentation, as sphinx-5.0 is now the minimal dependency.
Fix rare I/O exception by joining the SupervisorsProxy thread before exiting the SupvisorsMainLoop.
Fix rare exception raised when host network statistics are prepared for display in the Supvisors Web UI and the network interfaces have different history sizes.
Fix the Supvisors identifier possibilities when using the distribution rule SINGLE_INSTANCE.
Update the process statistics collector thread so that it exits by itself when supervisord is killed.
Improve the node selection when using the distribution rule SINGLE_NODE.
Use an asynchronous server in the Supvisors internal communications.
The refactoring fixes an issue with the TCP server that sometimes wouldn't bind despite SO_REUSEADDR being set.
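As an illustration of an asynchronous TCP server that sets SO_REUSEADDR, here is a minimal sketch based on the standard asyncio module. It is not the Supvisors implementation: the handle and round_trip names, the echo behaviour and the loopback address are assumptions made for the example.

```python
import asyncio


async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Echo one line back to the peer, then close the connection."""
    data = await reader.readline()
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def round_trip(message: bytes) -> bytes:
    # reuse_address=True asks asyncio to set SO_REUSEADDR on the listening socket;
    # port 0 lets the OS pick a free port for this demonstration.
    server = await asyncio.start_server(handle, '127.0.0.1', 0, reuse_address=True)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection('127.0.0.1', port)
        writer.write(message)
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
    return reply


reply = asyncio.run(round_trip(b'ping\n'))
```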
Restore the action class in the HTML of the Supvisors Web UI.
CI targets added for Python 3.11 and 3.12.
Fix Issue #112.
Write the disabilities file even if the disable and enable XML-RPCs have never been called.
Try to create the folder at startup if it does not exist.
Fix a case where the Starter would block if the process reaches the expected state without reception
of the corresponding event.
Fix typo for zmq requirement when installing Supvisors from pypi.
Fix flask-restx dependency in setup according to Python version.
Fix uncaught exception when the request to start a process is rejected due to a lack of resources. The exception depended on the Python version (absent in 3.6 but raised in 3.9).
Monkeypatch fix of Supervisor Issue #1596. Shutdown of the asyncore socket before it is closed.
Improve robustness against network failures. All Supervisor events are applied to the local Supvisors instance
before they are published, so that it remains functional despite a network failure.
The internal TCP sockets are rebound when a network interface comes up (requires psutil).
Provide a discovery mode where the Supvisors instances are added on-the-fly without declaring them in
the supvisors_list option. The function relies on a Multicast Group definition (options multicast_group,
multicast_interface and multicast_ttl added for that purpose).
The attribute discovery_mode is added to the get_state and get_instance_info XML-RPCs.
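To show what a multicast group, interface and TTL typically translate to at the socket level, here is a generic sketch using the standard socket module. The helper names are hypothetical and nothing here is taken from the Supvisors code; it only illustrates the usual IP_ADD_MEMBERSHIP and IP_MULTICAST_TTL socket options.

```python
import socket
import struct


def membership_request(group: str, interface: str = '0.0.0.0') -> bytes:
    """Build the ip_mreq structure expected by IP_ADD_MEMBERSHIP."""
    return struct.pack('4s4s', socket.inet_aton(group), socket.inet_aton(interface))


def open_discovery_socket(group: str, port: int, interface: str = '0.0.0.0',
                          ttl: int = 1) -> socket.socket:
    """Create a UDP socket joined to the multicast group (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(('', port))
    # join the group on the chosen interface ('0.0.0.0' lets the OS choose)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, interface))
    # limit how many router hops the outgoing datagrams may cross
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

A TTL of 1 keeps the datagrams on the local network segment, which is a common default for discovery traffic.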
Add a new option stereotypes to support the discovery mode. The identifiers of the Application and Program rules
can now reference a Supvisors stereotype in addition to identifiers and aliases.
By extension, it is made available to the non-discovery mode.
Add a new option synchro_options to enable the user to choose the conditions putting an end to the Supvisors
synchronization phase.
In particular, when using the new USER condition, the Supvisors Web UI provides a means to end the
INITIALIZATION state, with optional Master selection. The command is also available as an XML-RPC end_synchro
and has been added to supervisorctl.
The new item @ in the identifiers of the Program rules takes the behavior of the item # as it was
before Supvisors version 0.13, i.e. the assignment is strictly limited by the length of the identifiers list,
without roll-over.
NOTE: This is not available for Application rules.
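The difference between the two items can be sketched as follows. The assign helper is hypothetical; it assumes that the Nth process of a homogeneous group is mapped to the Nth entry of the identifiers list, and that a process left without an identifier is simply not assigned (represented by None).

```python
from typing import Dict, List, Optional


def assign(process_names: List[str], identifiers: List[str],
           roll_over: bool) -> Dict[str, Optional[str]]:
    """Map the Nth process of a homogeneous group to the Nth identifier.

    roll_over=True models the '#' item, roll_over=False models the '@' item.
    """
    result: Dict[str, Optional[str]] = {}
    for index, name in enumerate(process_names):
        if not identifiers:
            result[name] = None  # nothing to assign from
        elif roll_over:
            # wrap around when there are more processes than identifiers
            result[name] = identifiers[index % len(identifiers)]
        elif index < len(identifiers):
            result[name] = identifiers[index]
        else:
            result[name] = None  # assignment limited by the list length
    return result
```

With three processes and two identifiers, the roll-over variant gives the third process the first identifier again, while the variant without roll-over leaves it unassigned.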
Use host aliases when looking for the local Supvisors instance.
Use IP address rather than host identification when dealing with SINGLE_NODE starting strategy.
To prevent the situation that led the Starter to block, a new state CHECKED, which is actually a pre-RUNNING state,
is added to SupvisorsInstanceStates.
Such a Supvisors instance is considered active and is updated with received events but cannot be part of any
starting sequence until all starting jobs in progress are completed.
Take the process forced state into account only for the display in the Application page of the Supvisors Web UI, so that it does not interfere with the real process state.
Add master_identifier to the output of the XML-RPCs get_supvisors_state and get_instances_info.
The supervisorctl commands sstate and instance_status have also been updated.
Monkeypatch Supervisor on-the-fly so that its logger is thread-safe and add log traces in Supvisors threads.
Simplify the Supvisors state machine and replace the states RESTART and SHUTDOWN with a single state FINAL.
Highlight the process line hovered by the cursor in the Supvisors Web UI.
Remove the figures from the Supvisors Web UI when matplotlib is not installed.
Add RPC changeLogLevel to the JAVA client.
Do not catch XmlRpc exceptions in the JAVA client.
Refactoring of the Supvisors internal communications.
Add websockets as an option to the Supvisors event listener (Python 3.7+ only).
Re-design the PyZMQ event listener using the zmq.asyncio support for better commonalities
with the websockets solution.
Re-design the statistics collection and compilation.
The option stats_enabled takes additional values to control host and process statistics independently.
The option stats_collecting_period has been added to set the minimum time between process statistics collection.
The option stats_periods accepts float values, not necessarily multiples of 5.
Fix Issue #54. Add host and process statistics to the Supvisors event interface.
Fix children process CPU times in statistics.
Fix Solaris mode not taken into account for the process mean CPU value in the Supvisors Web UI.
Only one Supvisors instance is running when both unix_http_server and inet_http_server sections are defined
in the Supervisor configuration file.
Fix Flask start_args to pass the extra arguments in the URL attributes rather than in the route.
The local Supvisors instance is identified as the item having the same fully qualified domain name
(as returned by socket.gethostbyaddr and socket.getfqdn) among the items of the supvisors_list option.
Use the HTTP server port to help the identification of the local Supvisors instance when multiple items
of the supvisors_list option fit and identifier is not set.
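The identification logic described above can be sketched as follows. This is a simplified illustration under assumptions: the Item representation and the find_local helper are hypothetical, only socket.getfqdn is used for the name resolution, and the resolver is injectable so that the logic can be exercised without a network.

```python
import socket
from typing import Callable, List, Optional, Tuple

Item = Tuple[str, int]  # (host_name, http_port), hypothetical representation


def find_local(items: List[Item], local_port: int,
               fqdn: Callable[[str], str] = socket.getfqdn) -> Optional[Item]:
    """Pick the item that designates the local host (illustrative only)."""
    local_fqdn = fqdn('')  # socket.getfqdn('') resolves the local host name
    matches = [item for item in items if fqdn(item[0]) == local_fqdn]
    if len(matches) > 1:
        # several items fit the local host: use the HTTP port to decide
        matches = [item for item in matches if item[1] == local_port]
    return matches[0] if len(matches) == 1 else None
```

Returning None when zero or several candidates remain leaves it to the caller to report the configuration as ambiguous.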
The attribute process_failure is added to the get_instance_info XML-RPC to indicate whether there is a process failure
in the Supvisors instance. The attribute is also provided in the event interface and in the instance_status
command of supervisorctl.
Raise an exception when the matching Supvisors instance in the supvisors_list option is inconsistent
with the local configuration.
Add a Supvisors logo.