Kb Layout Evaluation Save

Evaluate ergonomic keyboard layouts over multiple languages

Project README

Keyboard layout evaluation

An evaluation of existing keyboard layouts over multiple languages, focused on ergonomic keyboards.

Many keyboard layouts are designed (by hand or generated by an algorithm) to improve the ergonomics of Qwerty. However, they are typically assessed for typing in a single language, and on a standard keyboard. This analysis evaluates those layouts over several languages, and for an ergonomic keyboard.

The method uses statistics of bigram use (sets of 2 letters) for each language, and grades them according to subjective "weights" (depending on the keys positions), to calculate a comparative difficulty between layouts.

It is focused on writing text, not code. Programming requires special characters more suited to an additional layer; this script focuses on the base layer. Also, no attempt was made to generate a new "optimum" layout, as there would be a lot of variation depending on subjective parameters. Existing layouts already perform very well here, including some generated ones.

The results show that any alternative layout gives a significant ergonomic advantage over Qwerty. Several options give good results, in particular Colemak DH which also brings good familiarity, accessible shortcuts, and positive user feedback.

results

This project has also been modified and used here.

Keyboard layout evaluation
Table of contents
Character statistics
Layout evaluation
Conclusion

Character statistics

Contained in folder character_stats.

The layout evaluation needs bigram frequencies (sets of 2 letters) for each language.

The frequencies come from Practical Cryptography for English, French, Spanish, German, and Swedish; and from Norvig for English (to compare).

For comparison, my own corpus is also analyzed (for English and French); made of my emails, some texts from free books, and some internet articles.

count.py

Requirements: Python 3, Pandas.

The script count.py takes the text files in the data folder and outputs the character counts in chars.csv, and the bigram counts in bigrams.csv. Upper case is converted to lower case.

The list of characters to take into account is configurable in the code, in the list chars. Currently it takes the basic alphabet, plus éèêàçâîô.,-'/.

The provided chars.csv and bigrams.csv files were generated with a personal corpus of emails (mails_en and mails_fr, 300~400kB of raw text each) and various free books and articles (vrac_en and vrac_fr, 200~400kB each).

Spreadsheet analysis

This analysis is done in the Libreoffice spreadsheet stats.ods.

Character counts

The character frequencies for both English and French are quite consistent between the sources and my own corpus.

chars_en

Here is the same chart for French.

Bigram counts

The bigram counts show more discrepancies. The charts below show the top 80 bigrams (sorted by the average of both sources for English).

bigram_en

Here is the same chart for French.

Punctuation

The "theory" numbers from Practical Cryptography and Norvig do not contain data on punctuation characters such as .,-'/. However while the use of . and , depend a lot on each author's style, taking into account a non-zero frequency is essential due to their frequency.

The relevant bigram frequencies are not present in any existing statistics, but for French the Bépo layout project gives character frequencies from a Wikipedia 2008 dump.

For English, Vivian Cook's analysis gives frequencies by word. This University of Maryland paper (Table 1) gives a frequency for . of 1.151% (from Google Ngram data).

The frequencies from the personal corpus contain those characters so it can be used as a control. However, it is useful to have more realistic "theory" results. The approach is to "fix" the missing frequencies by copying the relevant ones from the personal corpus and normalizing the column.

To check that those inputs make sense, they can be compared to the few data available.

For English, we can estimate by dividing the frequencies from Vivian Cook by the average word length of 4.79 letters (from Norvig).

For English	Personal corpus	Frequency / Word length	Google Ngram
Period .	1.6 %	1.36 %	1.151 %
Comma ,	1.2 %	1.29 %

For French	Personal corpus	Wikipedia dump
Period .	1.2 %	0.83 %
Comma ,	1.5 %	1.02 %

Some variation can be observed, but the numbers from the personal corpus pass that sanity check. It seems a bit more punctuation is used compared to the average literature, which isn't bad to consider.

Takeaway

For the evaluation, the "theory" numbers will be the average from both sources for English, and the only source I have for other languages. But the differences with my own corpus show the sensitivity of those inputs, therefore the results should be taken with some tolerance.

As the "theory" numbers do not contain characters such as .,-'/, those will be copied from the personal corpus data then normalized.

Layout evaluation

Contained in folder layout_evaluation.

Focus definition

The influence of the physical keyboard is on the weights and penalties. The algorithm is more or less the same otherwise.

The chosen weights are for an ergonomic keyboard, so they are symmetrical. It would be similar for any non-staggered keyboard (no horizontal shift between rows).

Only the keys on the main 3×12 matrix are taken into account, which are reachable by a finger easily. The "numbers" row is ignored as we focus on the alphabetical layout.

Thumb keys available on ergonomic keyboards are ignored as I arbitrarily prefer not to place any alphanumeric character on them.

layout_matrix

The keys are designated by a code (hand, row, column). The numbering includes space for some currently-unused keys, in case of evolution (like for an Ergodox-like keyboard).

Evaluation principle

For each language, the bigram frequencies are imported from the character statistics, as a percentage of use.

The principle is to assign a difficulty (weight) to a bigram (two keys typed consecutively). The bigram weight is multiplied by its frequency, and all the results are summed up to get a general difficulty value of the layout.

Weight_layout = sum( Weight_bigram × Probability_bigram )

The bigram weight is composed of:

The weights assigned to the two keys, representing the relative difficulty to push them individually
A penalty, representing the added difficulty of pushing those 2 keys one after the other

Weight_bigram = Weight_key1 + Weight_key2 + Penalty_{key1 & key2}

The results for all layouts and languages are finally normalized compared to Qwerty in English (at 100%).

Key base weights

The base weights are shown below. The home row is identified by a red border.

They represent the relative difficulty to hit a single key. The proposed values are for an ergonomic keyboard, with vertical columns and a comfortable home row position.

weights

Penalties

The penalties represent the additional difficulty of hitting 2 keys consecutively. They come on top of the base weight. They are only taken into account if the 2 keys are hit by the same hand (and are not the same key, like "aa"). By default, they are given a slightly positive value in order to favor hand alternation.

Generally, the penalties are higher if the same finger is used, and the more rows separate the 2 keys. They can be negative if the relative position makes the motion easy, such as a close "inward roll" (like "sd" on Qwerty).

First finger	Second finger	Same row	1 row jump	2 rows jump	Comment
Index	Index	2.5	3.5	4.5	Same finger
Index	Middle	0.5	1.0	2.0
Index	Ring	0.5	0.8	1.5
Index	Pinky	0.5	0.8	1.1
Middle	Index	-1.5	-0.5	1.5	Inward roll
Middle	Middle	N/A	3.5	4.5	Same finger
Middle	Ring	0.5	1.0	2.0
Middle	Pinky	0.5	0.8	1.5
Ring	Index	-1.5	-0.5	1.5	Inward roll
Ring	Middle	-2.0	-0.5	1.2	Inward roll
Ring	Ring	N/A	3.5	4.5	Same finger
Ring	Pinky	1.0	1.5	2.5
Pinky	Index	-1.0	0.0	1.0	Inward roll
Pinky	Middle	-1.0	0.0	1.5	Inward roll
Pinky	Ring	-1.0	0.0	1.5	Inward roll
Pinky	Pinky	3.0	4.0	5.5	Same finger

Limits

Bigram frequencies variations

The results are approximate as the bigram frequencies aren't a precise and objective number for everyone.

However, the results for English and French show very little difference between the "theory" values and my personal corpus. Therefore it seems the variation in bigram use doesn't affect the final grade very significantly.

Accented characters

The results for languages outside English are slightly off because most accented characters are not taken into account.

Currently, the ignored characters are êàçâîôñäöüß/å, mainly because those characters are absent from most considered layouts. The characters é and è were added manually to the layouts (on unused keys, on the vowel side if there's one) because I particularly care about French, and due to their high frequency (2.85%).

The characters ' and - were also added when missing, on unused keys.

The issue mainly affects German (äöüß, 1.56% of the characters), Swedish (äöå, 4.45%), but also French (êàçâîô, 0.75%) and Spanish (ñ, 0.22%).

To mitigate this, the bigram frequencies are normalized after removing the ignored characters, so the summed grade is still calculated over 100%.

script.py

Requirements: Python 3, Pandas, Matplotlib.

script.py uses the bigram statistics from stats.csv, and config.txt (key weights, penalties, and layouts definitions) to generate the results (table and plot).

To customize the script, edit config.txt and have a look at the main() function.

The code isn't very efficient as it iterates through dataframes to generate the results. In practice it executes in ~10s so it doesn't really matter.

Results

Here are the full results, for all languages and including my personal corpus for English and French.

results

In addition, here are the results for only English and only French. For comparison this includes the results before adding the punctuation as described earlier. The necessity of taking into account the punctuation is clear. Both the "theory" and personal corpus give very similar grades, so results are quite consistent.

Here is the complete results list. The layouts can be seen in config.txt.

Layout	English	English perso	French	French perso	Spanish	German	Swedish
MTGAP	64.48	63.17	69.73	69.20	66.14	67.15	67.38
BEAKL 19bis	64.51	63.47	66.87	65.81	64.45	67.92	69.00
Colemak DH mod	64.54	64.18	67.32	66.75	64.18	64.70	68.16
Engram 2.0	64.59	63.32	71.17	70.28	66.52	69.11	69.47
Colemak DH	64.66	64.31	67.88	67.21	64.18	64.70	68.16
The-1	64.80	63.20	72.64	71.75	68.97	67.54	70.26
APT	64.81	63.69	70.29	70.02	66.56	64.68	65.86
MTGAP 2.0	64.91	64.44	67.21	65.77	64.01	64.98	66.77
MTGAP "ergonomic"	64.99	64.96	69.18	68.63	64.81	66.23	66.72
White	65.10	64.00	73.60	73.76	68.09	66.50	69.55
Kaehi	65.56	64.18	70.35	69.09	65.92	67.83	67.94
x1	65.63	64.87	69.76	69.23	66.28	68.03	67.91
Workman	65.83	65.51	71.42	70.57	66.85	66.93	71.31
MTGAP "standard"	65.84	65.24	68.35	67.27	64.43	66.78	66.74
Soul mod	65.89	65.53	68.96	68.00	64.71	64.38	68.37
BEAKL 19	65.98	65.00	70.36	68.28	66.12	66.99	68.85
MTGAP "shortcuts" (ROTS)	66.02	65.59	68.24	67.30	62.72	65.44	66.34
BEAKL 19 Opt French	66.32	65.23	67.12	65.96	65.85	65.31	69.32
Uciea	66.43	66.14	69.36	68.61	65.01	65.99	68.32
Oneproduct	66.44	66.07	73.48	72.44	68.07	68.45	69.22
Boo	66.54	65.87	70.78	69.42	66.04	67.03	68.68
Hands down	66.64	66.20	68.97	67.21	66.10	63.14	66.66
MTGAP "Easy"	66.78	66.44	68.63	67.15	64.55	64.97	65.35
Colemak	67.15	67.08	68.40	67.00	65.37	67.77	68.82
Niro mod	67.41	67.39	70.47	69.30	66.58	67.71	70.41
BEAKL 15	67.43	66.98	71.86	69.85	66.64	69.37	68.59
Halmak	67.94	67.45	71.84	70.78	66.87	69.60	70.68
Three	68.23	67.51	73.43	72.53	69.46	71.00	70.95
Norman	68.34	67.74	74.01	72.48	71.24	69.86	72.10
Semimak	68.85	68.87	72.62	72.63	68.21	64.74	66.94
ASSET	68.88	68.29	69.42	67.53	66.69	70.35	71.03
Notarize	69.45	68.83	70.76	68.81	67.68	69.21	67.94
Optimal digram	70.64	70.21	72.36	71.53	68.94	70.21	70.11
qgmlwyfub	70.83	70.39	75.87	76.11	70.24	72.16	73.71
Optimot	70.98	70.79	65.24	63.64	65.56	70.85	72.55
Carpalx	71.02	70.70	76.25	76.62	70.86	74.00	74.53
Qwpr	71.69	71.30	73.36	72.93	69.44	73.07	71.69
Coeur	72.29	72.46	67.35	65.66	67.07	71.04	73.48
Bépo keyberon	72.79	72.71	68.15	66.78	68.53	72.07	73.49
Minimak-8key	72.94	72.37	74.81	73.48	71.94	75.20	73.55
Bépo 40%	73.20	73.19	67.97	66.61	68.54	73.10	73.56
Dvorak	73.74	72.79	78.15	76.07	75.34	74.59	78.53
Neo	76.31	75.50	76.37	75.37	74.56	71.62	74.16
Qwertz	98.56	97.63	98.12	97.59	93.66	98.31	95.30
Qwerty	100.00	99.17	98.90	98.64	92.45	99.74	96.08
Azerty	105.44	104.64	104.18	103.88	102.40	102.81	102.72

As the "Colemak DH" layout gave the most interesting results, a personal modified version was added to replace some characters like ; (that can be on a layer like Shift + ,) by the French é.

Conclusion

All the alternative layouts perform very significantly better than the traditional ones (Qwerty and similar). The biggest interest for ergonomics is using an alternative layout at all, the selected choice matters a lot less.

results

The options that perform the best are some Colemak variants like DH (image below), the obscure BEAKL 19bis, the Engram, as well as the layouts generated by Mathematical multicore (MTGAP).

Colemak DH

In conclusion, the Colemak DH layout is particularly recommended for its good results and better shortcuts access, familiarity with Qwerty, and larger positive user feedback.

Open Source Agenda is not affiliated with "Kb Layout Evaluation" Project. README Source: bclnr/kb-layout-evaluation

Stars

Open Issues

Last Commit

2 years ago

Repository

bclnr/kb-layout-evaluation

License

MIT

Open Source Agenda Badge

<a href="https://www.opensourceagenda.com/projects/kb-layout-evaluation"><img src="https://www.opensourceagenda.com/projects/kb-layout-evaluation/reviews/badge.svg" alt="Open Source Agenda"></a>

Submit Review Review Your Favorite Project

Submit Resource Articles, Courses, Videos

Submit Article Submit a post to our blog