MediaPipe Versions

Cross-platform, customizable ML solutions for live and streaming media.

v0.9.3.0

1 year ago

Bazel changes

  • Bazel version upgrade to v6.1.1
  • Update Halide build rules for MediaPipe to use Halide v15.0.1
  • Use "x86_32" instead of "i386" for Bazel CPU ID

Framework and core calculator improvements

  • Added MPPImageClassifierOptionsHelpers, TensorsToSegmentationCalculatorOptionsProto.java into tasks core's maven package, MPPObjectDetectorOptions, MPPObjectDetectorOptionsHelpers, MPPClassifierOptions, MPPGestureRecognizerOptions, MPPGestureRecognizerOptions.m, support for more standard scaling options in GlSurfaceViewRenderer
  • Updated cosine similarity utility
  • Added method to send packet map to C++ task runner
  • Added methods to MPPVisionTaskRunner
  • Added methods to MPPVisionPacketCreator
  • Updated build targets of vision packet creator and task runner
  • Added MPPImageClassifierResultHelpers, MPPImageClassifierOptionsHelpers
  • Added MPPImageClassifier
  • Updated method signature in MPPTaskRunner
  • Added Face Detector implementation and tests
  • Added the AudioRecord API
  • Update audio_record_test.py
  • Add FaceLandmarker C++ API
  • Updated models
  • Add the dataset module for face stylizer in model maker
  • Update Node version to 16.19.0
  • Add metadata writer for image segmentation
  • Add Interactive Segmenter MediaPipe Task
  • Add label_map filtering into filter_detection drishti calculator
  • Add the source code TensorsToSegmentationCalculatorOptionsProto.java into tasks core's maven package
  • Add ImageData output to GraphRunner
  • Added MPPImage Utils for tests
  • Added stream info for some modes in MPPImageClassifier
  • Added flow limiting for live stream mode in MPPImageClassifier
  • Add WebGLTexture output for ImageSegmenter
  • Add face_landmarker to vision types
  • Add a function to convert CoreAudio buffers into a MediaPipe time series matrix
  • Add the model configuration and training hyperparameters for BlazeFaceStylizer
  • Add landmarks smoothing filter when requested face num is 1
  • Added MPPDetection
  • Added MPPObjectDetectionResult
  • Added MPPObjectDetectorOptions
  • Added MPPObjectDetectorOptionsHelpers, MPPObjectDetectionResultHelpers, MPPDetectionHelpers
  • Added MPPObjectDetector
  • Add FrameBuffer view on ImageFrame
  • Add EDGETPU_NNAPI delegate option in MediaPipe tasks API
  • Added MPPLandmark
  • Added MPPLandmarkHelpers
  • Added MPPGestureRecognizerResult
  • Added MPPGestureRecognizerOptions, MPPClassifierOptions
  • Added EndLoopImageCalculator and FaceToRectCalculator
  • Updated FaceStylizer API to align with the new Base Vision Task API changes
  • Added some face landmarks constants
  • Added pose landmarker C++ API
  • Update TF version to 2023-04-12
  • Added CoreAudio and MediaToolbox to BUILD file
  • Update Flatbuffers to 23.1.21
  • Updated error with info about unsupported mirrored orientations in MPPVisionTaskRunner
  • Add VEC32F4 support to ImageFrame
  • Add shaders that support better landscape rendering with GlSurfaceViewRenderer
  • Update TensorsToFaceLandmarksGraph to support face mesh v2 model
  • Add support for more standard scaling options in GlSurfaceViewRenderer
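
Several entries above mention flow limiting for live stream mode. The general idea can be sketched as follows: while a frame is still being processed, newly arriving frames are dropped rather than queued, so latency stays bounded. This is an illustrative sketch only. The `FlowLimiter` class, its method names, and the `max_in_flight` parameter are hypothetical and are not MediaPipe's calculator or API.

```python
class FlowLimiter:
    """Admits a frame only while fewer than max_in_flight frames are in progress.

    Hypothetical helper for illustration; not part of MediaPipe.
    """

    def __init__(self, max_in_flight: int = 1) -> None:
        if max_in_flight < 1:
            raise ValueError("max_in_flight must be at least 1")
        self.max_in_flight = max_in_flight
        self.in_flight = 0   # frames admitted but not yet finished
        self.dropped = 0     # frames rejected because the limit was reached

    def try_admit(self) -> bool:
        """Return True if the frame may be processed, False if it is dropped."""
        if self.in_flight >= self.max_in_flight:
            self.dropped += 1
            return False
        self.in_flight += 1
        return True

    def release(self) -> None:
        """Signal that one admitted frame has finished processing."""
        if self.in_flight == 0:
            raise RuntimeError("release() called with no frame in flight")
        self.in_flight -= 1


limiter = FlowLimiter(max_in_flight=1)
first = limiter.try_admit()    # admitted: nothing is in flight yet
second = limiter.try_admit()   # dropped: the first frame is still in flight
limiter.release()              # the first frame's result has arrived
third = limiter.try_admit()    # admitted again
```

Dropping instead of queueing is the usual choice for camera input, where a stale frame is less useful than the next fresh one.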

MediaPipe solutions update

Changes in this section are specific to a single platform and do not propagate to other platforms.

Android

  • Add FaceDetector, Pose Landmarker, FaceLandmarker and FaceStylizer Java API
  • Add getLabels to ImageSegmenter Java API
  • Fix the vision tasks aar build rule to solve the "cannot find symbol" error
  • Add LabelMapProto.java source code to MediaPipe AAR
  • Add interactive segmenter java API
  • Add face landmarker and face geometry java lite proto source code into mediapipe tasks AAR
  • Switch to the isPresent() API, since isEmpty() is only available from Java 11 onward: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Optional.html#isEmpty()
  • Update java image segmenter to always output confidence masks and optionally output category mask
  • Add a LanguageDetector Java API
  • Update Java interactive segmenter to output both confidence masks and category mask optionally

iOS

  • Updated method calls to process packet map in iOS text tasks
  • Solve iOS build error for gpu_buffer.cc
  • Fixed iOS running mode display strings
  • Linked in OpenCV iOS framework with vision tasks
  • Added flow limiter calculator to iOS vision tasks

JavaScript

  • Add FaceLandmarker Web API
  • Add the FaceStylizer Web API
  • Add InteractiveSegmenter Web API

Python

  • Python support for M1
  • Added Interactive Segmenter Python API and some tests
  • Expose face detector, face landmarker, face stylizer and interactive segmenter as MediaPipe Tasks Python API
  • Enable TextClassifier and TextEmbedder on Windows Python
  • Gracefully fail resource path lookup for Python on Windows
  • Expose as mediapipe python API
  • Make AudioTools compile when built from python:framework_bindings

Bug fixes

  • Upgrades and fixes for image segmentation category mask on GPU

MediaPipe Dependencies

  • Added dependency for image format
  • Disable OpenCL dependency for OpenCV
  • Add missing dependency library targets to mediapipe_task_aar

v0.9.2.1

1 year ago

Bazel changes

  • Add @ to all references to files in WORKSPACE.bazel

Framework and core calculator improvements

  • Added MPPTextEmbedderOptions, MPPTextEmbedderOptionsHelpers, MPPImageClassifierOptions
  • Added volume_gain_db option into AudioToTensorCalculator
  • Added MPPEmbedding, MPPEmbeddingResult, MPPTextEmbedderResult
  • Added iOS text embedder result files
  • Update test to reflect the recommended graph construction style
  • Add FrameBuffer format
  • Updated documentation of embedding containers
  • Add YuvImage as a GpuBuffer storage backend
  • Updated types of float and quantized embedding
  • Add Text Embedder tests for text with different themes
  • Added MPPEmbeddingHelpers, MPPEmbeddingResultHelpers, MPPTextEmbedderOptionsHelpers, MPPTextEmbedderResultHelpers, MPPTextEmbedder
  • Add "noasan" to MPPTextClassifierObjcTest
  • Added MPPCosineSimilarity and cosine similarity to MPPTextEmbedder
  • Added text embedder objective c tests
  • Add ViewProvider<FrameBuffer> to YuvImage storage backend
  • Update MP Tasks to observe timestamp bounds
  • Updated swift name for ImageSource Type
  • Updated list of designated initializers
  • Update TensorFlow to latest
  • Add more filtering methods to detection filter calculator
  • Update WASM files for 0.1.0-alpha-4 release
  • Updated the Begin/EndLoopCalculator to be able to handle mediapipe::Tensor
  • Add location info in registry (debug mode only)
  • Added vision task runner
  • Added designated initializer in vision task runner
  • Updated MPPImageUtils with methods to create image frame
  • Updated MPPVisionTaskRunner
  • Add mediapipe tasks face blendshapes graph
  • Add "java_package" and "java_outer_classname" to ImageTransformationCalculatorOptions
  • Updated method name in MPPVisionPacketCreator
  • Update MediaPipe TFLite code to use generic "shim" symbols and headers
  • Update detection result to include optional keypoints
  • Update face detector graph for downstream face landmarks graph
  • Add Bitmap image capture capability to GlSurfaceViewRenderer
  • Update ImageSegmenter API for image/video mode to have both callback API and returned result API
  • Small fixes to TensorsToImageCalculator
  • Add optional face blendshapes to face landmarks detector graph
  • Add a CHECK for the cases when null service is accessed unconditionally
  • Add FaceLandmarkerResult for FaceLandmarker API
  • Add ViewProvider for ImageFrame in GpuBufferStorageYuvImage
  • Add GetInputImageTensorSpecs into BaseVisionTaskApi for tasks api users to get input image tensor specifications
  • Add custom metadata in metadata_schema
  • Add FaceDetectorResult
  • Add volume_gain_db option to TensorsToAudioCalculator
  • Add build system for Halide and expose FrameBufferUtils
  • Add requiredInputBufferSize as an input argument of createAudioRecord
  • Update ImageFrameToGpuBufferCalculator to use api2 and GpuBuffer conversions
  • Add Empty Packet support to GraphRunner
  • Add support for [xmin, ymin, xmax, ymax] style of bbox output
  • Add TensorsToFaceLandmarksGraph to support two types of face mesh models
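
The volume_gain_db option added to AudioToTensorCalculator and TensorsToAudioCalculator expresses a gain in decibels. As a minimal sketch, assuming the gain is applied to sample amplitude using the conventional factor 10^(dB/20) (the release notes do not state the formula), the conversion looks like this. The function names are hypothetical and are not MediaPipe code.

```python
def db_to_amplitude_scale(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude multiplier.

    Assumes the amplitude convention: scale = 10 ** (gain_db / 20).
    """
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db: float):
    """Return a new list with every sample scaled by the given gain."""
    scale = db_to_amplitude_scale(gain_db)
    return [sample * scale for sample in samples]


unity = db_to_amplitude_scale(0.0)          # 0 dB leaves the signal unchanged
louder = db_to_amplitude_scale(20.0)        # +20 dB is a tenfold amplitude increase
quieter = apply_gain([0.5, -0.25], -20.0)   # -20 dB scales amplitude by one tenth
```

Under this convention every 20 dB step multiplies or divides the amplitude by ten, and the sign of each sample is preserved.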

MediaPipe solutions update

Changes in this section are specific to a single platform and do not propagate to other platforms.

Android

  • Remove usage of var for ImageSegmenter.java
  • When "--define=MEDIAPIPE_NO_JNI=1" is used in compilation, no implementation in libandroid.so is used

iOS

  • Added iOS text embedder result files
  • Added iOS test for different themes in text embedder
  • Added iOS test for quantized embedding
  • Added a note about swift test coverage in iOS text embedder tests
  • Added MPPTaskImage for iOS vision tasks
  • Open visibility of iOS TextClassifier & TextEmbedder
  • Solve linking error for Hello World iOS example
  • Added swift tests for text embedder

JavaScript

  • Fix incorrect uint8 -> int8 conversion in JS cosine similarity
  • Add MediaPipe Image Segmenter task for Web

Python

  • Enable Python Audio Classifier & Embedder on Windows

Bug fixes

  • Bug fixes in MPPImage
  • Add fixed anchors to SSD anchors calculator

MediaPipe Dependencies

  • Bump Halide version from 14.0.0 to 15.0.0 and add macOS Halide dependency

v0.9.1

1 year ago

Build changes

  • Allow split_vector_calculator to be built with iOS and MEDIAPIPE_DISABLE_GPU
  • Update mediapipe_aar.bzl to put more mediapipe framework java proto classes into AARs.

Bazel changes

  • Update Bazel dependencies for Apple

Framework and core calculator improvements

  • Add HandLandmarkerGraph, which connects HandDetectorGraph and HandLandmarkerSubgraph with landmarks tracking.
  • Updated image classifier to use a region of interest parameter
  • Add support for input image rotation in ImageClassifier and ObjectDetector C++ API
  • Adding BypassCalculator for use with SwitchContainer.
  • Add MergeDetectionsToVectorCalculator, CombinedPredictionCalculator, EndLoopMatrixCalculator, ConcatenateClassificationListCalculator, RegexPreprocessingCalculator, BERTPreprocessorCalculator, TextToTensorCalculator and UniversalSentenceEncoderPreprocessorCalculator
  • Added the TextClassifier C++ API and the TextPreprocessingSubgraph.
  • Rename "Bound" struct to "Rect" and remove unused "Landmark" struct.
  • Add tensor_index and tensor_name fields to ClassificationList
  • Replace numpy.float with the builtin float type as numpy removes its own float type in v1.24.
  • Add BGR -> RGB color conversion to ColorConvertCalculator.
  • Add SQRT_HANN window type to both SpectrogramCalculator and InverseSpectrogramCalculator.
  • Allow conversion of GlTextureBuffer to CVPixelBufferRef. This means that, if an iOS application sends in a GlTextureBuffer but expects a CVPixelBufferRef as output, everything will work even if the graph just forwards the same input. Access by Metal calculators will also work transparently.
  • Allowing BypassCalculator to accept InputSidePackets.
  • Enable unsigned quantized inference using XNNPACK.
  • Adds a preprocessor for Universal Sentence Encoder models.
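
The SQRT_HANN window type above is added to both the forward and the inverse spectrogram calculators. One common reason for using a square-root Hann window on both sides is that, with 50% overlap, the product of the analysis and synthesis windows sums to one across overlapping frames. The sketch below checks that property using the periodic Hann definition. Whether MediaPipe uses the periodic or symmetric form is not stated in the notes, so this is an assumption, and `sqrt_hann` is a hypothetical helper.

```python
import math


def sqrt_hann(length: int) -> list:
    """Square root of a periodic Hann window of the given length.

    Assumes hann[n] = 0.5 - 0.5 * cos(2 * pi * n / length).
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return [
        math.sqrt(0.5 - 0.5 * math.cos(2.0 * math.pi * n / length))
        for n in range(length)
    ]


window = sqrt_hann(8)
hop = 4  # 50% overlap for a window of length 8

# Applying the window once at analysis and once at synthesis squares it.
# For each position, add the contributions of the two overlapping frames.
overlap_sums = [window[n] ** 2 + window[n + hop] ** 2 for n in range(hop)]
```

Each entry of `overlap_sums` equals 1 up to floating-point rounding, which is the condition for overlapping frames to add back to the original signal level.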

MediaPipe solutions update

Android

  • Enable creating MediaPipe Image c++ packet directly from an Android media image object when its format is RGBA_8888.
  • Add Java ImageEmbedder API and TextEmbedder API.
  • Fix aar breakage caused by missing "//mediapipe/tasks/java/com/google/mediapipe/tasks/components/containers:normalized_landmark".
  • Fix aar breakage caused by missing "//mediapipe/tasks/cc/vision/image_segmenter/proto:segmenter_options_java_proto_lite".

Web

  • Hand Landmarker Web API
  • Allow Web developers to opt into CPU or GPU processing
  • Add support for browsers without SIMD
  • Add pre-compiled WASM files to NPM packages

Bug fixes

  • Fix RGBA vs RGB selection when creating GLTexture.
  • Fix accidental suppressions of GLSL linker error reporting
  • Fix for CHECK failure due to pointer description sometimes being larger than allocated string space
  • ClassificationAggregationCalculator and EmbeddingAggregationCalculator now fill in the timestamp_ms field of the classification results in the stream mode.
  • Fix ObjectDetector C++ flow limiter and improve documentation.
  • Better handling of empty packets in vector calculators.

MediaPipe Dependencies

  • Bump up the dependency library pybind11's version to 2.10.1.

v0.8.11

1 year ago

Build changes

  • We are no longer adding *.tflite model files and other large binaries to our GitHub repository. Instead, these models are downloaded from Google Cloud Storage. This should speed up your getting started experience with MediaPipe (especially if you can work off a shallow clone of the repository) and allows us to expand our feature set without significantly increasing the size of the repository. Please update your Python binaries if they are fetching models from GitHub (see download_utils.py).
  • We have made the build targets //mediapipe/objc:mediapipe_framework_ios, //mediapipe/objc:mediapipe_input_sources_ios, //mediapipe/objc:mediapipe_layer_renderer publicly visible. These targets can now be used in external iOS applications.

v0.8.10.2

1 year ago

Build changes

  • Fixed a duplicate symbol conflict in the Windows build

v0.8.10.1

1 year ago

Bazel update

  • Updated Bazel to 5.2 to fix a build incompatibility issue

Framework and core calculator improvements

  • Various updates to the underlying frameworks.

v0.8.10

1 year ago

Apple Silicon support

  • Support building MediaPipe on Mac computers with Apple Silicon.

Framework and core calculator improvements

v0.8.9

2 years ago

MediaPipe Android Solutions

MediaPipe Hands

  • MediaPipe Hands models are updated.
  • MediaPipe Hands now supports outputting world landmarks in world coordinates.

MediaPipe Dependencies

  • MediaPipe Python wheels now support Python 3.10.
  • The MediaPipe dependency libraries protobuf, TensorFlow, Ceres Solver, pybind, and Apple support are updated.
  • The recommended Bazel version is updated to 4.2.1.
  • The recommended Android SDK and NDK versions are updated.

v0.8.8

2 years ago