A suite of utilities simplifying Linux networking stack performance troubleshooting and tuning.
The rss-ladder tool now supports PCI-slot-based queue naming. I mean this:
30: 127355089 0 0 0 PCI-MSI-edge mlx5_comp0@pci:0000:01:00.0
31: 120112828 5482507 0 0 PCI-MSI-edge mlx5_comp1@pci:0000:01:00.0
32: 121978940 0 5524729 0 PCI-MSI-edge mlx5_comp2@pci:0000:01:00.0
33: 122736116 0 0 5465612 PCI-MSI-edge mlx5_comp3@pci:0000:01:00.0
How to tune it?
$ rss-ladder pci:0000:01:00.0
- distribute interrupts of pci:0000:01:00.0 (mlx5_async_eq) on socket 0
- distribute interrupts of pci:0000:01:00.0 (mlx5_cmd_eq) on socket 0
- distribute interrupts of pci:0000:01:00.0 (mlx5_comp) on socket 0
- pci:0000:01:00.0: queue mlx5_comp0@pci:0000:01:00.0 (irq 30) bound to CPU0
- pci:0000:01:00.0: queue mlx5_comp1@pci:0000:01:00.0 (irq 31) bound to CPU1
- pci:0000:01:00.0: queue mlx5_comp2@pci:0000:01:00.0 (irq 32) bound to CPU2
- pci:0000:01:00.0: queue mlx5_comp3@pci:0000:01:00.0 (irq 33) bound to CPU3
- distribute interrupts of pci:0000:01:00.0 (mlx5_pages_eq) on socket 0
It may not be perfect, but at least it works. Well, at least for the mlx5 driver.
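A minimal sketch of the "ladder" idea shown above, not the actual rss-ladder code: find the completion queues of one PCI slot in /proc/interrupts-style text and map each IRQ to the next CPU. The function names `ladder` and `bind` are made up for illustration; the `/proc/irq/<irq>/smp_affinity_list` file is the usual kernel interface for IRQ affinity.

```python
import re


def ladder(interrupts_text, pci_slot, queue_prefix, cpus):
    """Return {irq: cpu}, spreading queues over cpus in queue order."""
    pattern = re.compile(r'^\s*(\d+):.*\s{0}(\d+)@{1}\s*$'.format(
        re.escape(queue_prefix), re.escape(pci_slot)))
    queues = []
    for line in interrupts_text.splitlines():
        match = pattern.match(line)
        if match:
            # (queue index, irq number)
            queues.append((int(match.group(2)), int(match.group(1))))
    queues.sort()
    # Wrap around if there are more queues than CPUs.
    return dict((irq, cpus[i % len(cpus)])
                for i, (_, irq) in enumerate(queues))


def bind(mapping, proc_root='/proc'):
    """Write the mapping to smp_affinity_list files (needs root)."""
    for irq, cpu in sorted(mapping.items()):
        path = '{0}/irq/{1}/smp_affinity_list'.format(proc_root, irq)
        with open(path, 'w') as fd:
            fd.write(str(cpu))
```

With the four mlx5_comp lines above and CPUs 0-3 this gives IRQ 30 on CPU0 through IRQ 33 on CPU3, the same layout the tool printed.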
The autorps tool no longer yells at you with a dreadful exception if you try to tune a multiqueue NIC. It just says that this may be a bad idea and that you should use the -f flag to really change RPS settings.
Also, some processors have inverted CPU masks in the rps_cpus file, so you could end up putting all processing on a foreign NUMA node. I don't know how these masks work and don't want to know, so the default behaviour now is to copy the mask from /sys/class/net/$dev/device/local_cpus instead of calculating it.
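Roughly, copying the mask instead of calculating it could look like the sketch below. It is an illustration under assumptions, not the real autorps code: the function name `copy_local_cpus` and its `sysfs_root` and `dry_run` parameters are hypothetical (the root parameter is only there so it can be tried on a fake directory tree).

```python
import glob
import os


def copy_local_cpus(dev, sysfs_root='/sys', dry_run=True):
    """Copy device/local_cpus into every rx queue's rps_cpus file."""
    net_dir = os.path.join(sysfs_root, 'class', 'net', dev)
    with open(os.path.join(net_dir, 'device', 'local_cpus')) as fd:
        mask = fd.read().strip()
    applied = []
    pattern = os.path.join(net_dir, 'queues', 'rx-*', 'rps_cpus')
    for rps_file in sorted(glob.glob(pattern)):
        if not dry_run:
            # The kernel-provided mask is written back unchanged.
            with open(rps_file, 'w') as fd:
                fd.write(mask)
        applied.append((rps_file, mask))
    return applied
```

The point of this design is that the mask is never interpreted: whatever format the kernel reports in local_cpus is passed through as-is.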
There is a new class structure in the server-info utility and the netutils_linux_hardware package:
Server class manages five subsystems - CPU, Disk, Net, Memory and System.
Server (server-info) can collect (--collect), read (--show) and rate (--rate) data.
Before the refactoring there were 3 big classes: Reader, Parser (--show) and Assessor (--rate). They had duplicated data about subsystems. Well, there were no "subsystems", there were just a lot of functions with prefixes in those classes. Now all those functions live in their own subsystems, and all subsystems have a standardised API (that's very cool for the Server class, it can just iterate over subsystems).
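To illustrate why a standardised API makes the Server class simple, here is a toy sketch. Everything in it is an assumption: the class and method names only mirror the description above, and the rating rules are invented, so it is not the real netutils_linux_hardware code.

```python
class Subsystem(object):
    """Hypothetical common interface: each subsystem can show and rate."""
    name = None

    def __init__(self, data):
        self.data = data

    def show(self):
        return self.data

    def rate(self):
        raise NotImplementedError


class CPU(Subsystem):
    name = 'cpu'

    def rate(self):
        # Invented rule: one point per CPU, capped at 10.
        return {'CPU(s)': min(10, self.data.get('CPU(s)', 0))}


class Memory(Subsystem):
    name = 'memory'

    def rate(self):
        # Invented rule: one point per GiB (MemTotal is in KiB), capped at 10.
        return {'MemTotal': min(10, self.data.get('MemTotal', 0) // (1024 * 1024))}


class Server(object):
    """Knows nothing about subsystem internals, it just iterates."""

    def __init__(self, subsystems):
        self.subsystems = subsystems

    def show(self):
        return dict((s.name, s.show()) for s in self.subsystems)

    def rate(self):
        return dict((s.name, s.rate()) for s in self.subsystems)
```

Adding a sixth subsystem would then mean writing one more class, with no changes to Server.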
There is a new Folding class with all the folding logic/constants. Also there are no more -f, -ff, -fff args, use --device, --subsystem, --server instead.
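A rough sketch of what folding at the three levels could look like. It assumes that folding means averaging the ratings one level down and that ratings are nested as subsystem, device, parameter; both are assumptions, and the constant names are hypothetical rather than the real ones from the Folding class.

```python
class Folding(object):
    """Hypothetical folding levels, from no folding to the whole server."""
    NO, DEVICE, SUBSYSTEM, SERVER = range(4)


def mean(values):
    values = list(values)
    return float(sum(values)) / len(values)


def fold(rates, level):
    """Fold {subsystem: {device: {param: score}}} up to the given level."""
    if level >= Folding.DEVICE:
        # One number per device.
        rates = dict(
            (sub, dict((dev, mean(params.values()))
                       for dev, params in devs.items()))
            for sub, devs in rates.items())
    if level >= Folding.SUBSYSTEM:
        # One number per subsystem.
        rates = dict((sub, mean(devs.values())) for sub, devs in rates.items())
    if level >= Folding.SERVER:
        # One number for the whole server.
        rates = mean(rates.values())
    return rates
```

In this sketch --device, --subsystem and --server would map to Folding.DEVICE, Folding.SUBSYSTEM and Folding.SERVER.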
Some code was simplified and I restored the tests for server-info --rate. Also I got rid of the six.iteritems dependency in a few places and just use .items(). There is no big data there, so I don't think that 2-3 kbits of RAM matter more than code simplicity.
All new options available:
--cpu Show information about CPU
--memory Show information about RAM
--net Show information about network devices
--disk Show information about disks
--system Show information about system overall (rate only)
Example:
# server-info --rate --device --net --cpu
cpu:
  BogoMIPS: 5
  CPU MHz: 5
  CPU(s): 5
  Core(s) per socket: 10
  L3 cache: 7
  Socket(s): 10
  Thread(s) per core: 10
  Vendor ID: 10
net:
  eth1: 3.6666666666666665
  eth6: 9.666666666666666
  eth7: 9.666666666666666
It also works with --show:
# server-info --show --memory --disk
disk:
  vda:
    model: null
    size: 21474836480
    type: HDD
memory:
  devices:
    '0x1100':
      size: '512'
      speed: 0
      type: RAM
  size:
    MemFree: 78272
    MemTotal: 500196
    SwapFree: 0
    SwapTotal: 0
Also, if you run server-info without the necessary parameters, it shows a more human-oriented error and the --help output instead of a traceback with AssertionError.
Well, I failed the challenge to release it before the new year, 2018. But better late than never!
Details are boring, you'd better look at the examples in the README!
server-info-rate and server-info-show utils deleted.
server-info-collect is called via a wrapper until it is rewritten in Python. That's a separate issue.
./tests/server-info-show.test/
server-info --rate instead of server-info-rate
server-info --directory <path-to-directory> --gzip. It will make <path-to-directory>.tar.gz with all the data that you can take from the server for later analysis.
You can now skip details of your server's rating this way:
server-info-rate -f - shows entire device rate
server-info-rate -ff - shows entire subsystem's rate
server-info-rate -fff - shows entire server's rate
$ server-info-rate -fff
WARNING: why do you use 20 years old hardware, dude?
Just a joke. For example:
➜ vscale-vm git:(folding) ✗ server-info-rate -ff
cpu: 4.5
disk: 1.0
memory: 1.0
net: 1.3333333333333333
system: 1.0
It can't be used directly via a server-info rate call, you have to go to /root/server/ before running server-info-rate. I know it's shitty usability, I'll fix it this week in 2.7.2.
There is optional dmidecode support, so now you can see how good your RAM is.
Python 2.6 is deprecated so hard that it's probably impossible to use pytest to run tests in Travis CI anymore. I tried to dig into the details, and the problem is probably not in pytest or Python but in pip: pip 8.0.1, installed by yum in CentOS 6, installed the latest version of pytest without any problems, while pip 9.0.1 refused. I don't know how to set the pip version for the Python 2.6 environment in Travis, so I just removed that environment from .travis.yml. What does it mean? A higher probability of introducing bugs specific to Python 2.6: some modern syntax usage, etc. But is it a problem?
The main py2.6 users are CentOS 6 users. There is python3.4 in EPEL for CentOS 6, so it's still possible to use. Also, there are Carbon Reductor 7 users. Well, most of them already have old versions installed and everything works. In Carbon Reductor 8 I will move netutils-linux from the py2.6 env to 3.4 in January, when I upgrade to a new version (from... 2.0?).
We (Carbon Soft) also offer help with migrating from Carbon Reductor 7 to Carbon Reductor 8. So probably no one will suffer from bugs (I hope), but feel free to create issues.
Fixed:
rss-ladder
And WOW, A FEATURE OF THE YEAR:
lscpu -p output to network-top for debugging purposes.
Basic /proc/net/snmp file watcher:
Finished in just one day!
There is much more to do: https://github.com/strizhechenko/netutils-linux/issues/144
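For readers who have not looked at /proc/net/snmp: it comes as pairs of lines per protocol, a header line with counter names and a line with values. A basic watcher can be sketched as below; this is an illustration written for these notes, not the code from the release, and the names `parse_snmp`, `diff` and `watch` are made up.

```python
import time


def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {(proto, counter): value}."""
    lines = text.strip().splitlines()
    counters = {}
    # Lines come in pairs: "Proto: names..." then "Proto: values...".
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(':', 1)
        _, numbers = values.split(':', 1)
        for name, number in zip(names.split(), numbers.split()):
            counters[(proto, name)] = int(number)
    return counters


def diff(previous, current):
    """Return only the counters that changed, with their deltas."""
    return dict((key, current[key] - previous.get(key, 0))
                for key in current
                if current[key] != previous.get(key, 0))


def watch(path='/proc/net/snmp', interval=1):
    """Print changed counters every `interval` seconds until interrupted."""
    with open(path) as fd:
        previous = parse_snmp(fd.read())
    while True:
        time.sleep(interval)
        with open(path) as fd:
            current = parse_snmp(fd.read())
        for (proto, name), delta in sorted(diff(previous, current).items()):
            print('{0}.{1}: {2:+d}'.format(proto, name, delta))
        previous = current
```

Calling `watch()` on a Linux box prints per-second deltas; `parse_snmp` and `diff` can be tried on any saved copy of the file.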
Well, the only difference between rps and xps tuning is the queue prefix (rx and tx)... So 25 lines changed and now you are able to distribute packet transmission between CPUs even with a single-queue NIC!
Example:
# autoxps eth0
Using mask 'ff' for eth0-tx-0
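The "only the prefix differs" point can be sketched like this. The sysfs file names (queues/rx-N/rps_cpus and queues/tx-N/xps_cpus) are the kernel's; the helper names `queue_files` and `dry_run` and the `sysfs_root` parameter are assumptions made for this example, not the tool's real internals.

```python
import glob
import os

# The whole difference between the two tools in this sketch.
QUEUE_PREFIX = {'rps': 'rx', 'xps': 'tx'}


def queue_files(dev, kind, sysfs_root='/sys'):
    """List rps_cpus or xps_cpus files of a device, depending on kind."""
    pattern = os.path.join(sysfs_root, 'class', 'net', dev, 'queues',
                           QUEUE_PREFIX[kind] + '-*', kind + '_cpus')
    return sorted(glob.glob(pattern))


def dry_run(dev, kind, mask, sysfs_root='/sys'):
    """Return the messages that would be printed, without writing anything."""
    messages = []
    for path in queue_files(dev, kind, sysfs_root):
        queue = os.path.basename(os.path.dirname(path))  # e.g. tx-0
        messages.append("Using mask '{0}' for {1}-{2}".format(mask, dev, queue))
    return messages
```

Everything else (choosing the mask, writing it to each file) can be shared between the two tools.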
autorps didn't work on systems with multiple NUMA nodes or CPU sockets before, because it calculated the CPU mask from the total CPU count and wasn't aware of the CPU/NUMA topology.
It has been rewritten in Python. Now you can:
/sys/class/net/$NIC/device/numa_node (fallback - 0).
--force it to work with multiqueue NIC.
--cpu-mask=fe
--cpus 0 2 4 6 (at the end of options).
--dry-run: it will print something like Using mask 'fc0' for eth0-rx-0.
--socket=1. Why would you ever need it? Because you may find out that moving this NIC to an external NUMA node gives you better performance than putting all your NICs on the device's local NUMA node (and you can't put the NIC in this NUMA node's PCI slot right now).
Also I accidentally dropped .pylintrc into the repo and fixed all the small PEP 8 violations and other code smells that landscape.io was hiding with default settings.
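To make the autorps options above less abstract, here is a small sketch of how a CPU list relates to a hex mask and how the NUMA node lookup with a fallback to 0 might work. The function names are hypothetical, treating a negative node as 0 is an assumption about the fallback, and the mask helper ignores the comma-separated format the kernel uses for more than 32 CPUs.

```python
import os


def cpus_to_mask(cpus):
    """Turn a list of CPU numbers into a hex bitmask string."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return '{0:x}'.format(mask)


def read_numa_node(nic, sysfs_root='/sys'):
    """Read the NIC's NUMA node, falling back to 0 if it is unavailable."""
    path = os.path.join(sysfs_root, 'class', 'net', nic, 'device', 'numa_node')
    try:
        with open(path) as fd:
            node = int(fd.read().strip())
    except (IOError, OSError, ValueError):
        return 0
    # The kernel reports -1 when the node is unknown (assumed to mean 0 here).
    return max(node, 0)
```

So `--cpus 0 2 4 6` corresponds to mask 55, `--cpu-mask=fe` covers CPUs 1-7, and a mask like fc0 covers CPUs 6-11.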