Combining Segment Anything (SAM) with Grounding DINO for zero-shot object detection and CLIPSeg for zero-shot segmentation
This is a proof of concept for zero-shot panoptic segmentation using the Segment Anything Model (SAM).
SAM cannot achieve panoptic segmentation out of the box due to two limitations:

- SAM is promptable: it needs input prompts such as points or boxes to generate masks, so it cannot segment a full image into labeled regions on its own.
- SAM's masks are class-agnostic: it predicts no semantic labels, while panoptic segmentation requires a class for every pixel, covering both "thing" and "stuff" categories.
To solve these challenges, we use the following additional models:

- Grounding DINO, a zero-shot object detector that turns text prompts into bounding boxes for "thing" categories; these boxes serve as prompts for SAM.
- CLIPSeg, a zero-shot segmentation model that produces coarse masks for "stuff" categories from text prompts.
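To give an idea of the final combination step, here is a minimal sketch in plain NumPy. It assumes you already have SAM-style boolean instance masks (with class ids from the detector) and a CLIPSeg-style per-pixel stuff segmentation; the `merge_panoptic` helper and its `class_id * offset + instance_id` encoding are illustrative choices, not part of any of these models' APIs:

```python
import numpy as np

def merge_panoptic(stuff_seg, instance_masks, instance_labels, offset=1000):
    """Merge stuff and thing predictions into one panoptic label map.

    stuff_seg: (H, W) int array of stuff class ids (e.g. from CLIPSeg).
    instance_masks: list of (H, W) boolean masks (e.g. from SAM).
    instance_labels: one class id per mask (e.g. from Grounding DINO).
    Each pixel is encoded as class_id * offset + instance_id.
    """
    panoptic = stuff_seg.astype(np.int64) * offset  # stuff gets instance id 0
    for inst_id, (mask, cls) in enumerate(zip(instance_masks, instance_labels), start=1):
        panoptic[mask] = cls * offset + inst_id  # thing masks overwrite stuff
    return panoptic

# Toy example: a 4x4 image that is all "sky" (class 0),
# with one detected "cat" (class 5) in the bottom-right corner.
stuff = np.zeros((4, 4), dtype=np.int64)
cat_mask = np.zeros((4, 4), dtype=bool)
cat_mask[2:, 2:] = True
panoptic = merge_panoptic(stuff, [cat_mask], [5])
```

Later instances overwrite earlier ones where masks overlap; in practice you may want to order masks by detector confidence before merging.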
You can try out the pipeline by running the notebook in Colab or via the Gradio demo on Hugging Face Spaces.
The notebook also shows how the predictions from this pipeline can be uploaded to Segments.ai as pre-labels, where you can adjust them to obtain perfect labels for fine-tuning your segmentation model.
Our Frankenstein-ish pipeline looks as follows: