Mediapipe Versions

Cross-platform, customizable ML solutions for live and streaming media.

v0.9.3.0

1 year ago

Bazel changes

Bazel version upgrade to v6.1.1
Update Halide build rules for MediaPipe to use Halide v15.0.1
Use "x86_32" instead of "i386" for Bazel CPU ID

Framework and core calculator improvements

Added MPPImageClassifierOptionsHelpers, TensorsToSegmentationCalculatorOptionsProto.java into tasks core's maven package, MPPObjectDetectorOptions, MPPObjectDetectorOptionsHelpers, MPPClassifierOptions, MPPGestureRecognizerOptions, MPPGestureRecognizerOptions.m, support for more standard scaling options in GlSurfaceViewRenderer
Updated cosine similarity utility
Added method to send packet map to C++ task runner
Added methods to MPPVisionTaskRunner
Added methods to MPPVisionPacketCreator
Updated build targets of vision packet creator and task runner
Added MPPImageClassifierResultHelpers, MPPImageClassifierOptionsHelpers
Added MPPImageClassifier
Updated method signature in MPPTaskRunner
Added Face Detector implementation and tests
Added the AudioRecord API
Update audio_record_test.py
Add FaceLandmarker C++ API
Updated models
Add the dataset module for face stylizer in model maker
Update Node version to 16.19.0
Add metadata writer for image segmentation
Add Interactive Segmenter MediaPipe Task
Add label_map filtering into filter_detection drishti calculator
Add the source code TensorsToSegmentationCalculatorOptionsProto.java into tasks core's maven package
Add ImageData output to GraphRunner
Added MPPImage Utils for tests
Added stream info for some modes in MPPImageClassifier
Added flow limiting for live stream mode in MPPImageClassifier
Add WebGLTexture output for ImageSegmenter
Add face_landmarker to vision types
Add a function to convert CoreAudio buffers into a MediaPipe time series matrix
Add the model configuration and training hyperparameters for BlazeFaceStylizer
Add landmarks smoothing filter when requested face num is 1
Added MPPDetection
Added MPPObjectDetectionResult
Added MPPObjectDetectorOptions
Added MPPObjectDetectorOptionsHelpers, MPPObjectDetectionResultHelpers, MPPDetectionHelpers
Added MPPObjectDetector
Add FrameBuffer view on ImageFrame
Add EDGETPU_NNAPI delegate option in MediaPipe tasks API
Added MPPLandmark
Added MPPLandmarkHelpers
Added MPPGestureRecognizerResult
Added MPPGestureRecognizerOptions, MPPClassifierOptions
Added EndLoopImageCalculator and FaceToRectCalculator
Updated FaceStylizer API to align with the new Base Vision Task API changes
Added some face landmarks constants
Added pose landmarker C++ API
Update TF version to 2023-04-12
Added CoreAudio and MediaToolbox to BUILD file
Update Flatbuffers to 23.1.21
Updated error with info about unsupported mirrored orientations in MPPVisionTaskRunner
Add VEC32F4 support to ImageFrame
Add shaders that support better landscape rendering with GlSurfaceViewRenderer
Update TensorsToFaceLandmarksGraph to support face mesh v2 model
Add support for more standard scaling options in GlSurfaceViewRenderer

MediaPipe solutions update

This section should highlight the changes that are done specifically for any platform and don't propagate to other platforms.

Android

Add FaceDetector, Pose Landmarker, FaceLandmarker and FaceStylizer Java API
Add getLabels to ImageSegmenter Java API
Fix the vision tasks aar build rule to solve the "cannot find symbol" error
Add LabelMapProto.java source code to MediaPipe AAR
Add interactive segmenter java API
Add face landmarker and face geometry java lite proto source code into mediapipe tasks AAR
Switch to the isPresent() API, since isEmpty() is only available from Java 11 onward: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Optional.html#isEmpty()
Update java image segmenter to always output confidence masks and optionally output category mask
Adds a LanguageDetector Java API
Update Java interactive segmenter to output both confidence masks and category mask optionally

iOS

Updated method calls to process packet map in iOS text tasks
Solve iOS build error for gpu_buffer.cc
Fixed iOS running mode display strings
Linked in the OpenCV iOS framework with vision tasks
Added flow limiter calculator to iOS vision tasks

JavaScript

Add FaceLandmarker Web API
Add the FaceStylizer Web API
Add InteractiveSegmenter Web API

Python

Python support for M1
Added Interactive Segmenter Python API and some tests
Expose face detector, face landmarker, face stylizer and interactive segmenter as MediaPipe Tasks Python API
Enable TextClassifier and TextEmbedder on Windows Python
Gracefully fail resource path lookup for Python on Windows
Expose as mediapipe python API
Make AudioTools compile when built from python:framework_bindings

Bug fixes

Upgrades and fixes for image segmentation category mask on GPU

MediaPipe Dependencies

Added dependency for image format
Disable OpenCL dependency for OpenCV
Add missing dependency library targets to mediapipe_task_aar

v0.9.2.1

1 year ago

Bazel changes

Add @ to all references to files in WORKSPACE.bazel

Framework and core calculator improvements

Added MPPTextEmbedderOptions, MPPTextEmbedderOptionsHelpers, MPPImageClassifierOptions
Added volume_gain_db option into AudioToTensorCalculator
Added MPPEmbedding, MPPEmbeddingResult, MPPTextEmbedderResult
Added iOS text embedder result files
Update test to reflect the recommended graph construction style
Add FrameBuffer format
Updated documentation of embedding containers
Add YuvImage as a GpuBuffer storage backend
Updated the types of float and quantized embeddings
Add Text Embedder tests for text with different themes
Added MPPEmbeddingHelpers, MPPEmbeddingResultHelpers, MPPTextEmbedderOptionsHelpers, MPPTextEmbedderResultHelpers, MPPTextEmbedder
Add "noasan" to MPPTextClassifierObjcTest
Added MPPCosineSimilarity and cosine similarity to MPPTextEmbedder
Added text embedder Objective-C tests
Add ViewProvider<FrameBuffer> to YuvImage storage backend
Update MP Tasks to observe timestamp bounds
Updated Swift name for ImageSource Type
Updated list of designated initializers
Update TensorFlow to latest
Add more filtering methods to detection filter calculator
Update WASM files for 0.1.0-alpha-4 release
Updated the Begin/EndLoopCalculator to be able to handle mediapipe::Tensor
Add location info in registry (debug mode only)
Added vision task runner
Added designated initializer in vision task runner
Updated MPPImageUtils with methods to create image frame
Updated MPPVisionTaskRunner
Add mediapipe tasks face blendshapes graph
Add "java_package" and "java_outer_classname" to ImageTransformationCalculatorOptions
Updated method name in MPPVisionPacketCreator
Update MediaPipe TFLite code to use generic "shim" symbols and headers
Update detection result to include optional keypoints
Update face detector graph for downstream face landmarks graph
Add Bitmap image capture capability to GlSurfaceViewRenderer
Update ImageSegmenter API for image/video mode to have both callback API and returned result API
Small fixes to TensorsToImageCalculator
Add optional face blendshapes to face landmarks detector graph
Add a CHECK for the cases when null service is accessed unconditionally
Add FaceLandmarkerResult for FaceLandmarker API
Add ViewProvider for ImageFrame in GpuBufferStorageYuvImage
Add GetInputImageTensorSpecs into BaseVisionTaskApi for tasks api users to get input image tensor specifications
Add custom metadata in metadata_schema
Add FaceDetectorResult
Add volume_gain_db option to TensorsToAudioCalculator
Add build system for Halide and expose FrameBufferUtils
Add requiredInputBufferSize as an input argument of createAudioRecord
Update ImageFrameToGpuBufferCalculator to use api2 and GpuBuffer conversions
Add Empty Packet support to GraphRunner
Add support for [xmin, ymin, xmax, ymax] style of bbox output
Add TensorsToFaceLandmarksGraph to support two types of face mesh models
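
The [xmin, ymin, xmax, ymax] bounding-box style listed above differs from a corner-plus-size layout only in how the second corner is encoded. The following is a minimal Python sketch of converting between the two, assuming the alternative layout is (xmin, ymin, width, height). The function names are illustrative and are not part of the MediaPipe API.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]


def corners_to_xywh(box: Sequence[float]) -> Box:
    """Convert [xmin, ymin, xmax, ymax] to (xmin, ymin, width, height)."""
    xmin, ymin, xmax, ymax = box
    if xmax < xmin or ymax < ymin:
        raise ValueError(f"invalid corner box: {list(box)}")
    return (xmin, ymin, xmax - xmin, ymax - ymin)


def xywh_to_corners(box: Sequence[float]) -> Box:
    """Convert (xmin, ymin, width, height) to (xmin, ymin, xmax, ymax)."""
    xmin, ymin, width, height = box
    if width < 0 or height < 0:
        raise ValueError(f"invalid size box: {list(box)}")
    return (xmin, ymin, xmin + width, ymin + height)
```

The same arithmetic applies whether the coordinates are pixels or values normalized to the image size, as long as both corners use the same units.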

MediaPipe solutions update

This section should highlight the changes that are done specifically for any platform and don't propagate to other platforms.

Android

Remove usage of var for ImageSegmenter.java
When "--define=MEDIAPIPE_NO_JNI=1" is used in compilation, no implementation from libandroid.so is used

iOS

Added iOS text embedder result files
Added iOS test for different themes in text embedder
Added iOS test for quantized embedding
Added a note about Swift test coverage in iOS text embedder tests
Added MPPTaskImage for iOS vision tasks
Open visibility of iOS TextClassifier & TextEmbedder
Solve Linking error for Hello World iOS example
Added Swift tests for text embedder

JavaScript

Fix incorrect uint8 -> int8 conversion in JS cosine similarity
Add MediaPipe Image Segmenter task for Web
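
The uint8 -> int8 fix listed above matters because a quantized embedding stored as raw bytes gives a different similarity when its components are read as unsigned instead of signed. The following is a minimal Python sketch of the idea, assuming each quantized component is one byte in two's complement. The helper names are hypothetical, and this is not the MediaPipe implementation.

```python
import math
from typing import Sequence


def to_int8(byte_value: int) -> int:
    """Reinterpret an unsigned byte (0..255) as a signed two's-complement int8."""
    if not 0 <= byte_value <= 255:
        raise ValueError(f"not a byte: {byte_value}")
    return byte_value - 256 if byte_value > 127 else byte_value


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def quantized_cosine_similarity(u_bytes: Sequence[int], v_bytes: Sequence[int]) -> float:
    """Cosine similarity of two byte-encoded embeddings, read as signed int8."""
    return cosine_similarity(
        [to_int8(b) for b in u_bytes],
        [to_int8(b) for b in v_bytes],
    )
```

For example, the byte 255 represents -1 once reinterpreted, so the vectors [255, 0] and [1, 0] point in opposite directions. Treating 255 as an unsigned value would instead make them appear identical in direction.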

Python

Enable Python Audio Classifier & Embedder on Windows

Bug fixes

Bug fixes in MPPImage
Add fixed anchors to the SSD anchors calculator

MediaPipe Dependencies

Bump Halide version from 14.0.0 to 15.0.0 and add macOS Halide dependency

v0.9.1

1 year ago

Build changes

Allow split_vector_calculator to be built with iOS and MEDIAPIPE_DISABLE_GPU
Update mediapipe_aar.bzl to put more mediapipe framework java proto classes into AARs.

Bazel changes

Update Bazel dependencies for Apple

Framework and core calculator improvements

Add HandLandmarkerGraph, which connects HandDetectorGraph and HandLandmarkerSubgraph with landmarks tracking.
Updated image classifier to use a region of interest parameter
Add support for input image rotation in ImageClassifier and ObjectDetector C++ API
Adding BypassCalculator for use with SwitchContainer.
Add MergeDetectionsToVectorCalculator, CombinedPredictionCalculator, EndLoopMatrixCalculator, ConcatenateClassificationListCalculator, RegexPreprocessingCalculator, BERTPreprocessorCalculator, TextToTensorCalculator, and UniversalSentenceEncoderPreprocessorCalculator
Added the TextClassifier C++ API, the TextPreprocessingSubgraph.
Rename "Bound" struct to "Rect" and remove unused "Landmark" struct.
Add tensor_index and tensor_name fields to ClassificationList
Replace numpy.float with the builtin float type as numpy removes its own float type in v1.24.
Add BGR -> RGB color conversion to ColorConvertCalculator.
Add SQRT_HANN window type to both SpectrogramCalculator and InverseSpectrogramCalculator.
Allow conversion of GlTextureBuffer to CVPixelBufferRef. This means that, if an iOS application sends in a GlTextureBuffer but expects a CVPixelBufferRef as output, everything will work even if the graph just forwards the same input. Access by Metal calculators will also work transparently.
Allowing BypassCalculator to accept InputSidePackets.
Enable unsigned quantized inference using XNNPACK.
Adds a preprocessor for Universal Sentence Encoder models.
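
A square-root Hann window, as named by the SQRT_HANN entry above, is commonly applied once at analysis and once at synthesis: the two applications multiply to a plain Hann window, which sums to a constant at 50% overlap. The following is a minimal Python sketch of that property, assuming the periodic Hann definition. These notes do not state the exact window formula used by the calculators, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def hann(length: int) -> List[float]:
    """Periodic Hann window: w[n] = 0.5 - 0.5 * cos(2*pi*n / length)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def sqrt_hann(length: int) -> List[float]:
    """Element-wise square root of the periodic Hann window."""
    return [math.sqrt(w) for w in hann(length)]


def overlap_add_gain(window: Sequence[float], hop: int) -> List[float]:
    """Sum of the squared window over all frames shifted by `hop`.

    The window is squared because it is applied twice (analysis and
    synthesis). Requires len(window) to be a multiple of hop. Returns one
    gain value per sample position within a hop.
    """
    if hop <= 0 or len(window) % hop != 0:
        raise ValueError("window length must be a positive multiple of hop")
    frames = len(window) // hop
    return [sum(window[n + k * hop] ** 2 for k in range(frames)) for n in range(hop)]
```

With a window length of 8 and a hop of 4 (50% overlap), every entry of `overlap_add_gain(sqrt_hann(8), 4)` equals 1 up to floating-point rounding, so overlap-added frames reconstruct the signal without amplitude ripple.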

MediaPipe solutions update

Android

Enable creating MediaPipe Image c++ packet directly from an Android media image object when its format is RGBA_8888.
Add Java ImageEmbedder API and TextEmbedder API.
Fix aar breakage caused by missing "//mediapipe/tasks/java/com/google/mediapipe/tasks/components/containers:normalized_landmark".
Fix aar breakage caused by missing "//mediapipe/tasks/cc/vision/image_segmenter/proto:segmenter_options_java_proto_lite".

Web

Hand Landmarker Web API
Allow Web developers to opt into CPU or GPU processing
Add support for browsers without SIMD
Add pre-compiled WASM files to NPM packages

Bug fixes

Fix RGBA vs RGB selection when creating GLTexture.
Fix accidental suppressions of GLSL linker error reporting
Fix for CHECK failure due to pointer description sometimes being larger than allocated string space
ClassificationAggregationCalculator and EmbeddingAggregationCalculator now fill in the timestamp_ms field of the classification results in the stream mode.
Fix ObjectDetector C++ flow limiter and improve documentation.
Better handling of empty packets in vector calculators.

MediaPipe Dependencies

Bump up the dependency library pybind11's version to 2.10.1.

v0.8.11

1 year ago

Build changes

We are no longer adding *.tflite model files and other large binaries to our GitHub repository. Instead, these models are downloaded from Google Cloud Storage. This should speed up your getting started experience with MediaPipe (especially if you can work off a shallow clone of the repository) and allow us to expand our feature set without significantly increasing the size of the repository. Please update your Python binaries if they are fetching models from GitHub (see download_utils.py).
We have made the build targets //mediapipe/objc:mediapipe_framework_ios, //mediapipe/objc:mediapipe_input_sources_ios, //mediapipe/objc:mediapipe_layer_renderer publicly visible. These targets can now be used in external iOS applications.
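
For Python binaries that previously read models from the GitHub checkout, the model-hosting change above amounts to fetching each model once and reusing the local copy. The following is a minimal sketch using only the standard library. `fetch_model_if_missing` and its parameters are hypothetical, the caller supplies the URL, and the actual helper in download_utils.py may differ.

```python
import os
import shutil
import urllib.request


def fetch_model_if_missing(url: str, dest_path: str) -> str:
    """Download `url` to `dest_path` unless the file already exists.

    Returns dest_path. The download is written to a temporary ".part" file
    first, so an interrupted transfer does not leave a truncated model at
    the final location.
    """
    if os.path.exists(dest_path):
        return dest_path
    os.makedirs(os.path.dirname(os.path.abspath(dest_path)), exist_ok=True)
    tmp_path = dest_path + ".part"
    with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as out:
        shutil.copyfileobj(response, out)
    os.replace(tmp_path, dest_path)
    return dest_path
```

Because an existing file is never re-downloaded, a stale model has to be deleted by hand. A real tool would likely add a checksum or version check.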

v0.8.10.2

1 year ago

Build changes

Fixed a duplicate symbol conflict in the Windows build

v0.8.10.1

1 year ago

Bazel update

Updated Bazel to 5.2 to fix a build incompatibility issue

Framework and core calculator improvements

Various updates to the underlying frameworks.

v0.8.10

1 year ago

Apple Silicon support

Support building MediaPipe on Mac computers with Apple Silicon.

Framework and core calculator improvements

The required minimum iOS version is now 11.0.
New calculators: GetVectorItemCalculator, VectorSizeCalculator, and DetectionTransformationCalculator.
MediaPipe Tensor now supports uint8 and int8 data type.
New features in TensorsToDetectionsCalculator.

v0.8.9

2 years ago

MediaPipe Android Solutions

MediaPipe Hands, Face Detection, and Face Mesh Android Solutions are now available in Google's Maven Repository.
The Android Studio example project is available in mediapipe/examples/android/solutions.

MediaPipe Hands

MediaPipe Hands models are updated.
MediaPipe Hands now supports outputting landmarks in world coordinates.

MediaPipe Dependencies

MediaPipe Python wheels now support Python 3.10.
The MediaPipe dependency libraries protobuf, tensorflow, Ceres Solver, pybind, and apple support are updated.
The recommended Bazel version is updated to 4.2.1.
The recommended Android SDK and NDK versions are updated.

v0.8.8

2 years ago

MediaPipe Face Mesh

Added a refine_landmarks option in the Solution APIs to further improve landmarks around eyes and lips, and output additional landmarks around the irises.
Released the Attention Mesh ML model that enables refine_landmarks.

MediaPipe Holistic

Added the enable_segmentation and smooth_segmentation options in the Solution APIs, previously only available in MediaPipe Pose.

v0.8.7.1

2 years ago

A huge thank you to all contributors: @Abduttayyeb @brettkoonce @chr0nikler @daniel13520cs @gabrielsanchez @GantMan @GzuPark @homuler @magamig @PeterPocsi @TomHsiao1260 @yuripourre