A lightweight and efficient Python Morton encoder with support for geo-hashing
Ordinal hashing of multidimensonal data and geographic coordinates via Morton coding / Z-ordering.
In mathematical analysis and computer science, Z-order, Morton-order, or a Morton-code is a function which maps multidimensional data to one dimension while preserving locality of the data points. It was introduced in 1966 by IBM researcher, G. M. Morton. The z-value of a point in multidimensions is calculated by interleaving the binary representations of its coordinate values. Once the data are sorted into this ordering, any one-dimensional data structure can be used, such as binary search trees, B-trees, skip lists, or hash tables. The resulting ordering can equivalently be described as the order one would achieve from a depth-first traversal of a quadtree,
where {x, y, ..., K}
are combined into a single ordinal value that is easily compared, searched, and indexed against other Morton numbers.
At the highest level, pymorton is split into two logical functions:
(de)interleave: encodes/decodes hashes representing two or three dimensionsal integer sets. {x, y, z ∈ Z}
or {x, y ∈ Z}
, where Z
represents all integer values.
(de)interleave_latlng: encodes and decodes hashes representing latitude and longitude.
Given a directory of images, sort the images by color (average RGB):
from statistics import mean
from glob import glob
from PIL import Image
import pymorton
imgs = [(fname, Image.open(fname)) for fname in glob('imgpath/*.jpg')[:100]]
# for each image, generate a tuple of len==3, representing the image's average RGB value
avg_rgb_values = [
[int(mean(img.getdata(band))) for band in range(3)] for _, img in imgs]
# using the average RGB values, compute the Z-order of each image
hashed_imgs = list(zip([fname for fname, _ in imgs],
[pymorton.interleave(*avg_rgb) for avg_rgb in avg_rgb_values]))
# returns a sorted-by-color list of photos found within the directory
return sorted(hashed_imgs, key=lambda img_tuple: img_tuple[1])
While the above use-case is fairly uncommon in the context of Morton-coding, I believe it illustrates the utility of the algorithm quite well. Morton-coding is most commonly used within the realm of geospatial indexing, but its potential applications are infinite!
via pip:
pip install pymorton
via source:
git clone https://github.com/trevorprater/pymorton.git
cd pymorton
python setup.py install
import pymorton as pm
mortoncode = pm.interleave(100, 200, 50) # 5162080
mortoncode = pm.interleave3(100, 200, 50) # 5162080
pm.deinterleave3(mortoncode) # (100, 200, 50)
import pymorton as pm
mortoncode = pm.interleave(100, 200) # 46224
mortoncode = pm.interleave2(100, 200) # 46224
pm.deinterleave2(mortoncode) # (100, 200)
import pymorton as pm
geohash = pm.interleave_latlng(40.723471, -73.985361) # '03023211233202130332202203002303'
pm.deinterleave_latlng(geohash) # (40.723470943048596, -73.98536103777587)
pymorton.interleave(*args)
x, y
or x, y, z
into a single value.
This function wraps interleave2() and interleave3() by supporting variable-length args.pymorton.interleave2(x, y)
x, y
.pymorton.interleave3(x, y, z)
x, y, z
.pymorton.interleave_latlng(lat, lng)
lat, lng
.pymorton.deinterleave2(hash)
pymorton.deinterleave3(hash)
pymorton.deinterleave_latlng(hash)
From the project's root directory, execute nosetests
.
Please feel free to contact [email protected] regarding any questions/comments/issues.
MIT