Kornia Releases

Geometric Computer Vision Library for Spatial AI

v0.7.2

1 month ago

What's Changed

New Contributors

Full Changelog: https://github.com/kornia/kornia/commits/v0.7.2

v0.7.1

4 months ago

What's Changed

New Contributors

Full Changelog: https://github.com/kornia/kornia/compare/v0.7.0...v0.7.1

v0.7.0

9 months ago

Highlights

Image API

In this release we have added a new Image API as a placeholder to support a more generic multi-backend API. You can export/import from files, NumPy, and DLPack.

>>> import torch
>>> from kornia.image import (ChannelsOrder, ColorSpace, Image, ImageLayout,
...                           ImageSize, PixelFormat)
>>> # construct from a torch.Tensor
>>> data = torch.randint(0, 255, (3, 4, 5), dtype=torch.uint8)  # CxHxW
>>> pixel_format = PixelFormat(
...     color_space=ColorSpace.RGB,
...     bit_depth=8,
... )
>>> layout = ImageLayout(
...     image_size=ImageSize(4, 5),
...     channels=3,
...     channels_order=ChannelsOrder.CHANNELS_FIRST,
... )
>>> img = Image(data, pixel_format, layout)
>>> assert img.channels == 3

Object Detection API

We have added the ObjectDetector, which includes the RT-DETR model by default. The detection pipeline is fully configurable by supplying a pre-processor, a model, and a post-processor. Example usage is shown below.

from io import BytesIO

import cv2
import numpy as np
import requests
import torch
from PIL import Image
import matplotlib.pyplot as plt

from kornia.contrib.models.rt_detr import RTDETR, DETRPostProcessor, RTDETRConfig
from kornia.contrib.object_detection import ObjectDetector, ResizePreProcessor

model_type = "hgnetv2_x"  # also available: resnet18d, resnet34d, resnet50d, resnet101d, hgnetv2_l
checkpoint = f"https://github.com/kornia/kornia/releases/download/v0.7.0/rtdetr_{model_type}.ckpt"
config = RTDETRConfig(model_type, 80, checkpoint=checkpoint)
model = RTDETR.from_config(config).eval()

detector = ObjectDetector(model, ResizePreProcessor(640), DETRPostProcessor(0.3))

url = "https://github.com/kornia/data/raw/main/soccer.jpg"
img = Image.open(BytesIO(requests.get(url).content))
img = np.asarray(img, dtype=np.float32) / 255
img_pt = torch.from_numpy(img).permute(2, 0, 1)
detection = detector.predict([img_pt])

for cls_score_xywh in detection[0].numpy():  # each row: [class_id, score, x, y, w, h]
    class_id = int(cls_score_xywh[0])
    score = cls_score_xywh[1]
    x, y, w, h = cls_score_xywh[2:].round().astype(int)
    cv2.rectangle(img, (x, y, w, h), (255, 0, 0), 3)

    text = f"{class_id}, {score:.2f}"
    font = cv2.FONT_HERSHEY_SIMPLEX
    (text_width, text_height), _ = cv2.getTextSize(text, font, 1, 2)
    cv2.rectangle(img, (x, y - text_height, text_width, text_height), (255, 0, 0), cv2.FILLED)
    cv2.putText(img, text, (x, y), font, 1, (255, 255, 255), 2)

plt.imshow(img)
plt.show()


Deep Models

As part of the kornia.contrib module, we started building a models module where deep learning models for computer vision (semantic segmentation, object detection, etc.) will live.

From an abstract base class ModelBase, we will implement and make available these deep learning models (e.g., Segment Anything). Similarly, we provide standard structures to be used with the results of these models, such as SegmentationResults.

The idea is to abstract and standardize how these models behave within our high-level APIs, for example when interacting with the VisualPrompter backend (today Segment Anything is available).

ModelBase provides methods for loading checkpoints (load_checkpoint) and compiling itself via the torch.compile API. We plan to extend it according to the needs of the community.

Within this release, we also make other models available, such as RT-DETR and TinyViT.

Example of using these abstractions to implement a model:

# Each model should be a submodule inside `kornia.contrib.models`, and the model class
# itself will be exposed under the `models` module.

from __future__ import annotations

from dataclasses import dataclass
from enum import Enum

from kornia.contrib.models.base import ModelBase
from kornia.contrib.models.structures import SegmentationResults

class MyModelType(Enum):
    """Map the model types."""
    a = 0
    ...

@dataclass
class MyModelConfig:
    model_type: str | int | MyModelType | None = None
    checkpoint: str | None = None
    ...

class MyModel(ModelBase[MyModelConfig]):
    def __init__(self) -> None:
        ...

    @staticmethod
    def from_config(config: MyModelConfig) -> MyModel:
        """Build the model based on the config."""
        ...

    def forward(self, images) -> SegmentationResults:
        ...

RT-DETR

In most object detection models, non-maximum suppression (NMS) is necessary to remove overlapping and near-duplicate bounding boxes. This post-processing algorithm has high latency, preventing object detectors from reaching real-time speed. DETR is a new class of detectors that eliminates the NMS step by using a transformer decoder to directly predict bounding boxes. RT-DETR enhances Deformable DETR to achieve real-time speed on server-class GPUs by using an efficient backbone. More details can be seen here

TinyViT

TinyViT is an efficient and high-performing transformer model for images. It achieves a top-1 accuracy of 84.8% on ImageNet-1k with only 21M parameters. See TinyViT for more information.
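
A minimal, hedged sketch of running the classifier; the entrypoint path, from_config signature, and the "5m" variant name are assumptions to verify against the kornia.contrib.models docs:

import torch
from kornia.contrib.models.tiny_vit import TinyViT

# assumed entrypoint and variant name; check the kornia docs before relying on this
model = TinyViT.from_config("5m").eval()

img = torch.rand(1, 3, 224, 224)  # NxCxHxW, ImageNet-sized input
with torch.inference_mode():
    logits = model(img)  # classification logits, one row per image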

MobileSAM

MobileSAM replaces the heavy ViT-H backbone in the original SAM with TinyViT, which is more than 100 times smaller in terms of parameters and around 40 times faster in terms of inference speed. See MobileSAM for more details.

To use MobileSAM, simply specify "mobile_sam" in the SamConfig:

from kornia.contrib.visual_prompter import VisualPrompter
from kornia.contrib.models.sam import SamConfig

prompter = VisualPrompter(SamConfig("mobile_sam", pretrained=True))

LightGlue matcher

Added the LightGlue-based matcher to the kornia API, based on the original code from the paper “LightGlue: Local Feature Matching at Light Speed”. See [LSP23] for more details.

The LightGlue algorithm won prize money in the Image Matching Challenge 2023 @ CVPR23: https://www.kaggle.com/competitions/image-matching-challenge-2023/overview

See a working example integrating with COLMAP: https://github.com/kornia/kornia/discussions/2469
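
A hedged sketch of pairing LightGlueMatcher with DISK features; img1 and img2 are assumed to be 1x3xHxW float tensors in [0, 1], and the LAF-helper and matcher signatures should be checked against the kornia.feature docs:

import torch
import kornia.feature as KF

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

disk = KF.DISK.from_pretrained("depth").to(device).eval()
lg_matcher = KF.LightGlueMatcher("disk").to(device).eval()

with torch.inference_mode():
    feats1 = disk(img1, n=2048, pad_if_not_divisible=True)[0]
    feats2 = disk(img2, n=2048, pad_if_not_divisible=True)[0]
    # wrap the keypoints into local affine frames (LAFs), as the matcher expects
    lafs1 = KF.laf_from_center_scale_ori(
        feats1.keypoints[None], torch.ones(1, len(feats1.keypoints), 1, 1, device=device))
    lafs2 = KF.laf_from_center_scale_ori(
        feats2.keypoints[None], torch.ones(1, len(feats2.keypoints), 1, 1, device=device))
    dists, idxs = lg_matcher(feats1.descriptors, feats2.descriptors, lafs1, lafs2)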

New Sensors API

New kornia.sensors module to interface with sensors such as camera, IMU, GNSS, etc.

We added CameraModel, PinholeModel, and CameraModelBase for now.

Usage example:

Define a CameraModel

>>> import torch
>>> from kornia.image import ImageSize
>>> from kornia.sensors.camera import CameraModel, CameraModelType
>>> # Pinhole Camera Model
>>> cam = CameraModel(ImageSize(480, 640), CameraModelType.PINHOLE, torch.Tensor([328., 328., 320., 240.]))
>>> # Brown Conrady Camera Model
>>> cam = CameraModel(ImageSize(480, 640), CameraModelType.BROWN_CONRADY, torch.Tensor([1.0, 1.0, 1.0, 1.0,
... 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]))
>>> # Kannala Brandt K3 Camera Model
>>> cam = CameraModel(ImageSize(480, 640), CameraModelType.KANNALA_BRANDT_K3, torch.Tensor([1.0, 1.0, 1.0,
... 1.0, 1.0, 1.0, 1.0, 1.0]))
>>> # Orthographic Camera Model
>>> cam = CameraModel(ImageSize(480, 640), CameraModelType.ORTHOGRAPHIC, torch.Tensor([328., 328., 320., 240.]))
>>> cam.params
tensor([328., 328., 320., 240.])

Added kornia.geometry.solvers submodule

New module for geometric vision solvers. This is part of an upgrade of find_fundamental to support the 7POINT algorithm.
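
For illustration, a hedged sketch of the 7-point path through find_fundamental; random points stand in for real correspondences, and the method string follows the kornia.geometry.epipolar docs:

import torch
from kornia.geometry.epipolar import find_fundamental

# the 7-point algorithm needs exactly 7 correspondences per batch element
points1 = torch.rand(1, 7, 2)
points2 = torch.rand(1, 7, 2)

# may return several stacked candidate matrices, since the 7-point
# problem has up to three real solutions
F = find_fundamental(points1, points2, method="7POINT")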

Image terminal printing

Added kornia.utils.print_image API for printing any given image tensors or image path to terminal.

>>> kornia.utils.print_image("panda.jpg")
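
Per the description above it also accepts an image tensor; a minimal hedged sketch, assuming a CxHxW uint8 tensor is a valid input:

>>> import torch
>>> img = torch.randint(0, 255, (3, 16, 16), dtype=torch.uint8)
>>> kornia.utils.print_image(img)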


What's Changed

New Contributors

Full Changelog: https://github.com/kornia/kornia/compare/v0.6.12...v0.7.0

v0.6.12

1 year ago

Highlights

ImagePrompter API

In this release we have added a new ImagePrompter API that lays the groundwork for a foundational API for querying geometric information from images, inspired by prompting in LLMs. We expose the ImagePrompter API via Segment Anything (SAM), making the model more accessible, well packaged, and maintained to industry standards.

Check the full tutorial: https://github.com/kornia/tutorials/blob/master/nbs/image_prompter.ipynb

import torch
from torch import Tensor

import kornia as K
from kornia.contrib.image_prompter import ImagePrompter
from kornia.contrib.models.sam import SamConfig
from kornia.geometry.boxes import Boxes
from kornia.geometry.keypoints import Keypoints
from kornia.io import ImageLoadType

image: Tensor = K.io.load_image("soccer.jpg", ImageLoadType.RGB32, "cuda")

# Load the prompter (SamConfig here is an assumed example configuration)
config = SamConfig("vit_h", pretrained=True)
prompter = ImagePrompter(config, device="cuda")

# set the image: this preprocesses the image and computes its embeddings
prompter.set_image(image)

# Generate the prompts
keypoints = Keypoints(torch.tensor([[[500, 375]]], device="cuda")) # BxNx2
# For the keypoints label: 1 indicates a foreground point; 0 indicates a background point
keypoints_labels = torch.tensor([[1]], device="cuda") # BxN
boxes = Boxes(
    torch.tensor([[[[425, 600], [425, 875], [700, 600], [700, 875]]]], device="cuda"), mode='xyxy'
)

# Runs the prediction with all prompts
prediction = prompter.predict(
    keypoints=keypoints,
    keypoints_labels=keypoints_labels,
    boxes=boxes,
    multimask_output=True,
)


Guided Blurring

Blur images while preserving edges via bilateral and guided blur: https://kornia.readthedocs.io/en/latest/filters.html#kornia.filters.guided_blur
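
A minimal sketch of both filters, assuming the signatures documented in kornia.filters (guided_blur takes a guidance image, the input, a kernel size, and a regularization eps):

import torch
import kornia

x = torch.rand(1, 3, 64, 64)  # image to smooth, BxCxHxW in [0, 1]
guide = x                     # self-guidance: classic edge-preserving smoothing

smoothed = kornia.filters.guided_blur(guide, x, kernel_size=(5, 5), eps=0.01)
bilateral = kornia.filters.bilateral_blur(x, (5, 5), sigma_color=0.1, sigma_space=(1.5, 1.5))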


What's Changed

New Contributors

Full Changelog: https://github.com/kornia/kornia/compare/v0.6.11...v0.6.12

v0.6.11

1 year ago

Highlights

In this release we have added DISK, the best free local feature for 3D reconstruction (part of the winning solutions in IMC 2021, together with SuperGlue). Thanks to @jatentaki for the great work and for relicensing DISK under Apache 2!

import torch
import kornia.feature as KF

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# assumes img1, img2 are 1x3xHxW image tensors on the target device
disk = KF.DISK.from_pretrained('depth').to(device)
with torch.inference_mode():
    inp = torch.cat([img1, img2], dim=0)
    features1, features2 = disk(inp, 2048, pad_if_not_divisible=True)
    kps1, descs1 = features1.keypoints, features1.descriptors
    kps2, descs2 = features2.keypoints, features2.descriptors
    dists, idxs = KF.match_smnn(descs1, descs2, 0.98)

What's Changed

New Contributors

Full Changelog: https://github.com/kornia/kornia/compare/v0.6.10...v0.6.11

v0.6.10

1 year ago

What's Changed

New Contributors

Full Changelog: https://github.com/kornia/kornia/compare/v0.6.9...v0.6.10

v0.6.9

1 year ago

What's Changed

New Contributors

Full Changelog: https://github.com/kornia/kornia/compare/v0.6.8...v0.6.9

v0.6.8

1 year ago

Highlights

NeRF API

In this release we include an experimental kornia.nerf submodule with a high-level API that implements a vanilla Neural Radiance Field (NeRF). Read more about the roadmap of this project: https://github.com/kornia/kornia/issues/1936 (contribution by @YanivHollander)

from kornia.nerf import NerfSolver
from kornia.geometry.camera import PinholeCamera

# create_one_camera and create_red_images_for_cameras are helper functions
# (as used in the kornia test suite) that produce a camera and dummy images
camera: PinholeCamera = create_one_camera(5, 9, device, dtype)
img = create_red_images_for_cameras(camera, device)

nerf_obj = NerfSolver(device=device, dtype=dtype)
num_img_rays = 15
nerf_obj.init_training(camera, 1.0, 3.0, False, img, num_img_rays, batch_size=5, num_ray_points=10, lr=1e-2)
nerf_obj.run(num_epochs=10)

img_rendered = nerf_obj.render_views(camera)[0].permute(2, 0, 1)


Improvements, docs and tutorials soon!

Edge Detection

Added the kornia.contrib.EdgeDetector API that implements DexiNed: https://github.com/xavysp/DexiNed

import torch

import kornia as K
from kornia.contrib import EdgeDetector

edge_detection = EdgeDetector().to(device)

# preprocess (frame is an HxWxC uint8 image, e.g. a frame from OpenCV)
img = K.image_to_tensor(frame, keepdim=False).to(device)
img = K.color.bgr_to_rgb(img.float())

# detect!
with torch.no_grad():
    edges = edge_detection(img)

img_vis = K.tensor_to_image(edges.byte())


Image matching bugfixes

After testing kornia LoFTR and AdaLAM under heavy load, our users and we have experienced some bugs in corner cases, such as big images or no input correspondences, which caused the pipeline to crash. Not anymore!

Various kornia demos in Gradio by the community

See demos in our HuggingFace space: https://huggingface.co/kornia

RANSAC improvements

We have added a homography-from-line-segments solver, as well as various speed-ups. We are not yet at OpenCV's RANSAC quality level, and more improvements are to come :) But the line solver is pretty unique! We also have an example in our tutorials: https://kornia-tutorials.readthedocs.io/en/latest/line_detection_and_matching_sold2.html
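
A hedged sketch of the line-segment solver through the RANSAC API; the model_type string and the segment layout (N pairs of endpoints) are assumptions to verify against kornia.geometry.ransac:

import torch
from kornia.geometry.ransac import RANSAC

# assumed model_type string; line segments as (N, 2, 2) endpoint pairs in pixels
ransac = RANSAC(model_type="homography_from_linesegments", inl_th=3.0)

ls1 = torch.rand(50, 2, 2) * 224  # random stand-ins for detected segments
ls2 = torch.rand(50, 2, 2) * 224

H, inliers = ransac(ls1, ls2)  # 3x3 homography and inlier mask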


Apple Silicon M1 support is closer, CI improvements

We are slowly working on being able to run kornia on M1. So far we have added the possibility to test locally on M1, and we mostly report PyTorch MPS backend crashes in various use cases. Once this work is finished, we may provide some workarounds to get kornia on M1.

Quaternion improvements

Implemented Quaternion.slerp to interpolate between quaternions using quaternion arithmetic (contributed by @cjpurackal).

import torch
from kornia.geometry.quaternion import Quaternion

q0 = Quaternion.identity(batch_size=1)
q1 = Quaternion(torch.tensor([[1., .5, 0., 0.]]))
q2 = q0.slerp(q1, .3)

More augmentations!

What's Changed

New Contributors

Full Changelog: https://github.com/kornia/kornia/compare/v0.6.7...v0.6.8

v0.6.7

1 year ago

Highlights

SOLD2 line segment detector & descriptor

Contributed by the original SOLD2 authors

Geometry-aware matchers: AdaLAM & FGINN

The good old Lowe ratio test works well for descriptor matching (implemented as match_snn and match_smnn in kornia), but it is often not enough: it does not take keypoint positions into account. With this version we start adding geometry-aware descriptor matchers, beginning with FGINN (https://arxiv.org/abs/1503.02619) and AdaLAM (https://arxiv.org/abs/2006.04250). Later we plan to add something like SuperGlue (but a free version, of course).

AdaLAM works particularly well with kornia.feature.KeyNetAffNetHardNet. AdaLAM is adapted from the original authors' implementation.

import matplotlib.pyplot as plt
import cv2
import kornia as K
import kornia.feature as KF
import numpy as np
import torch
from kornia_moons.feature import *

def load_torch_image(fname):
    img = K.image_to_tensor(cv2.imread(fname), False).float() / 255.
    img = K.color.bgr_to_rgb(img)
    return img

device = K.utils.get_cuda_device_if_available()

fname1 = 'kn_church-2.jpg'
fname2 = 'kn_church-8.jpg'

img1 = load_torch_image(fname1)
img2 = load_torch_image(fname2)

feature = KF.KeyNetAffNetHardNet(5000, True).eval().to(device)

hw1 = torch.tensor(img1.shape[2:])
hw2 = torch.tensor(img2.shape[2:])

adalam_config = {"device": device}

with torch.inference_mode():
    lafs1, resps1, descs1 = feature(K.color.rgb_to_grayscale(img1))
    lafs2, resps2, descs2 = feature(K.color.rgb_to_grayscale(img2))
    dists, idxs = KF.match_adalam(descs1.squeeze(0), descs2.squeeze(0),
                                  lafs1, lafs2,  # AdaLAM also takes geometric information into account
                                  config=adalam_config,
                                  hw1=hw1, hw2=hw2)  # AdaLAM also benefits from knowing the image size

More examples in our Tutorials section

Geometry conversions

Converting a camera pose from (R, t) to an actual pose in world coordinates can be a pain. We are relieving you of it by implementing various conversion functions, such as camtoworld_to_worldtocam_Rt, worldtocam_to_camtoworld_Rt, camtoworld_graphics_to_vision_4x4, etc. The conversions come in two variants: for an (R, t) tensor tuple, or for a single 4x4 extrinsics matrix.
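
A short sketch of the (R, t) variant, a round trip assuming Bx3x3 rotations and Bx3x1 translations as in kornia.geometry.conversions:

import torch
from kornia.geometry.conversions import (
    camtoworld_to_worldtocam_Rt,
    worldtocam_to_camtoworld_Rt,
)

R = torch.eye(3)[None]   # Bx3x3 camera-to-world rotation
t = torch.rand(1, 3, 1)  # Bx3x1 camera-to-world translation

R_w2c, t_w2c = camtoworld_to_worldtocam_Rt(R, t)            # invert the pose
R_back, t_back = worldtocam_to_camtoworld_Rt(R_w2c, t_w2c)  # and back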

Quaternion API

More geometry-related stuff! We have added a Quaternion API to make working with rotation representations easy. Check out the PR

>>> from kornia.geometry.quaternion import Quaternion
>>> q = Quaternion.identity(batch_size=4)
>>> q.data
Parameter containing:
tensor([[1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.]], requires_grad=True)
>>> q.real
tensor([[1.],
        [1.],
        [1.],
        [1.]], grad_fn=<SliceBackward0>)
>>> q.vec
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]], grad_fn=<SliceBackward0>)

Mosaic Augmentation

We recently included RandomMosaic, which applies mosaic transforms to a batch of images and combines them into one output image. The output image is composed of parts from each sub-image.

The mosaic transform steps are as follows:

  • Concatenate the selected images into a super-image.
  • Crop out the outcome image according to the top-left corner and crop size.
>>> import torch
>>> from kornia.augmentation import RandomMosaic
>>> mosaic = RandomMosaic((300, 300), data_keys=["input", "bbox_xyxy"])
>>> boxes = torch.tensor([[
...     [70, 5, 150, 100],
...     [60, 180, 175, 220],
... ]]).repeat(8, 1, 1)
>>> input = torch.randn(8, 3, 224, 224)
>>> out = mosaic(input, boxes)
>>> out[0].shape, out[1].shape
(torch.Size([8, 3, 300, 300]), torch.Size([8, 8, 4]))

Edge-aware blurring

Thanks to @nitaifingerhut

!wget https://github.com/kornia/data/raw/main/drslump.jpg

import cv2
import kornia
import matplotlib.pyplot as plt
import numpy as np
import torch

# read the image with OpenCV
img: np.ndarray = cv2.imread('./drslump.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# convert to torch tensor
data: torch.Tensor = kornia.image_to_tensor(img, keepdim=False) / 255.  # BxCxHxW
data -= 0.2 * torch.rand_like(data).abs()

plt.figure(figsize=(12, 8))
edge_blurred = kornia.filters.edge_aware_blur_pool2d(data, 19)
plt.imshow(kornia.tensor_to_image(torch.cat([data, edge_blurred], axis=3)))


What's Changed

Full Changelog: https://github.com/kornia/kornia/compare/v0.6.6...v0.6.7

v0.6.6

1 year ago

Highlights

ParametrizedLine API

First of the integrations to revamp kornia.geometry to align with Eigen and Sophus. Docs: https://kornia.readthedocs.io/en/latest/geometry.line.html?#kornia.geometry.line.ParametrizedLine Example: https://github.com/kornia/kornia/blob/master/examples/geometry/fit_line2.py
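
A minimal sketch following the docs linked above (ParametrizedLine.through builds a line from two points; point_at evaluates it at parameter t):

import torch
from kornia.geometry.line import ParametrizedLine

p0 = torch.tensor([0.0, 0.0])
p1 = torch.tensor([1.0, 1.0])

line = ParametrizedLine.through(p0, p1)  # origin p0, unit direction towards p1
point = line.point_at(0.5)              # point on the line at t = 0.5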


Support for macOS and Windows in load_image

Automated the packaging infra in kornia_rs to handle multi-architecture builds. Arm64 soon :) See: https://github.com/kornia/kornia-rs

import kornia as K
from torch import Tensor

# load the image using the rust backend (file_name: path to an image on disk)
img: Tensor = K.io.load_image(file_name, K.io.ImageLoadType.RGB32)
img = img[None]  # 1xCxHxW / fp32 / [0, 1]

HuggingFace integration

Created the Kornia AI org under the HuggingFace platform. Starting to port the tutorials to the HuggingFace kornia org to rapidly show live docs and build community. Link: https://huggingface.co/kornia

Demos:

What's Changed

New Contributors

Full Changelog: https://github.com/kornia/kornia/compare/v0.6.5...v0.6.6