[2311.06242] Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
https://arxiv.org/abs/2311.06242
We introduce Florence-2, a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks. While existing large vision models...
[2408.00714] SAM 2: Segment Anything in Images and Videos
https://arxiv.org/abs/2408.00714
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving promptable visual segmentation in images and videos. We build a data engine, which improves model and data via user...
COCO test-dev Benchmark (Object Detection) | Papers With Code
https://paperswithcode.com/sota/object-detection-on-coco?p=grounding-dino-marrying-dino-with-grounded
The current state-of-the-art on COCO test-dev is Co-DETR. See a full comparison of 257 papers with code.
Remember Box
Grounding DINO + FastSAM
GitHub
GitHub - IDEA-Research/Grounded-Segment-Anything: Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion...
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything - IDEA-Research/Grounded-S...
Florence-2 (~50M), Grounding DINO (~40M), and SAM-2 (~30M) total ~120M parameters, much larger than Grounding DINO + FastSAM (~10M).
Remember Box
Grounding DINO + FastSAM
Grounding DINO + FastSAM + LLaVA
Accuracy (IoU for semantic segmentation, mAP for instance detection, area error)
Speed (inference time for a 256x256 image)
Model Size (parameters, disk space, GPU memory)
Building Code Compliance (ability to exclude non-habitable spaces and apply code rules)
Label Processing (text recognition accuracy and integration)
Complexity (ease of implementation, number of models, etc.)
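Two of the accuracy criteria above, IoU and area error, can be illustrated with a small sketch. It assumes masks are same-shaped nested lists of 0/1 and that "area error" means the relative difference in pixel area between prediction and ground truth; the notes do not define it, so that definition is an assumption. The function names `mask_iou` and `relative_area_error` are hypothetical, and mAP is left out because it needs ranked detections with confidence scores.

```python
def mask_iou(pred, truth):
    """Intersection over union of two same-shaped binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly, so treat that case as IoU = 1.0.
    return inter / union if union else 1.0


def relative_area_error(pred, truth):
    """Assumed definition: |predicted area - true area| / true area, in pixels."""
    pred_area = sum(sum(row) for row in pred)
    truth_area = sum(sum(row) for row in truth)
    if truth_area == 0:
        raise ValueError("ground-truth mask is empty; relative error is undefined")
    return abs(pred_area - truth_area) / truth_area


# Toy 4x4 example: the ground truth is a 2x2 block, the prediction is one column too wide.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

print(round(mask_iou(pred, truth), 3))   # intersection 4, union 6
print(relative_area_error(pred, truth))  # |6 - 4| / 4
```

In this toy case the prediction covers the whole room but overshoots by two pixels, so IoU drops to 4/6 while the area is overestimated by 50%. The two numbers capture different failure modes, which is presumably why both appear in the criteria.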
Segmenting satellite images using SAM and Grounding DINO | Echo Blog
https://www.echo-analytics.com/blog/segmenting-satellite-images-using-sam-and-grounding-dino
Read about segmenting satellite images using SAM and Grounding DINO for our data product, Shapes.