Model card for Pix2Struct - Fine-tuned on AI2D (scientific diagram VQA)
Table of Contents
TL;DR
Using the model
Contribution
Citation
TL;DR
Pix2Struct is an…
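A minimal inference sketch with the transformers library. The checkpoint name `google/pix2struct-ai2d-base` is assumed from the published Pix2Struct AI2D fine-tunes, and the blank image is a stand-in for a real diagram:

```python
from PIL import Image
from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

# Checkpoint name assumed from the published Pix2Struct AI2D fine-tunes.
ckpt = "google/pix2struct-ai2d-base"
processor = Pix2StructProcessor.from_pretrained(ckpt)
model = Pix2StructForConditionalGeneration.from_pretrained(ckpt)

# Stand-in for a real scientific diagram; load your own image here.
image = Image.new("RGB", (640, 480), "white")
question = "What does the arrow labeled A point to?"

# For VQA-style Pix2Struct checkpoints the question is rendered into the
# image as a text header, so it is passed via the processor's `text` arg.
inputs = processor(images=image, text=question, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
answer = processor.decode(outputs[0], skip_special_tokens=True)
print(answer)
```

On the blank stand-in image the decoded answer is meaningless; the point is the call pattern, not the output.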
Model Details: DPT-Hybrid (also known as MiDaS 3.0)
A Dense Prediction Transformer (DPT) model trained on 1.4 million images for monocular depth estimation.
It was introduced in the paper Vision…
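A short usage sketch via the transformers depth-estimation pipeline, assuming the `Intel/dpt-hybrid-midas` Hub checkpoint for this model:

```python
from PIL import Image
from transformers import pipeline

# "Intel/dpt-hybrid-midas" is the DPT-Hybrid (MiDaS 3.0) checkpoint on the Hub.
depth = pipeline("depth-estimation", model="Intel/dpt-hybrid-midas")

# Stand-in image; use a real photo for a meaningful depth map.
image = Image.new("RGB", (384, 384), "gray")
result = depth(image)

# The pipeline returns the raw tensor plus the predicted depth map
# rendered as a PIL image.
print(result["depth"].size)
```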
OpenVoice
Features
How to Use
Links
OpenVoice
OpenVoice is a versatile instant voice-cloning approach that requires only a short audio clip from the reference speaker…
In loving memory of Simon Mark Hughes...
Introduction
HHEM is an open-source model created by Vectara for detecting hallucinations in LLM outputs. It is particularly useful…
LSTP-Chat: Language-guided Spatial-Temporal Prompt Learning for Video Chat
Available Models:
LSTP-Chat-7B (Vicuna-7b)
For more details, please refer to our official repository
Vision-and-Language Transformer (ViLT), fine-tuned on VQAv2
Vision-and-Language Transformer (ViLT) model fine-tuned on VQAv2. It was introduced in the paper ViLT: Vision-and-Language Transformer
Without Convolution or Region Supervision by Kim et…
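A minimal VQA sketch using the authors' `dandelin/vilt-b32-finetuned-vqa` checkpoint; the blank image is a placeholder for a real photo:

```python
from PIL import Image
from transformers import ViltProcessor, ViltForQuestionAnswering

# VQAv2 fine-tuned ViLT checkpoint from the paper's authors.
ckpt = "dandelin/vilt-b32-finetuned-vqa"
processor = ViltProcessor.from_pretrained(ckpt)
model = ViltForQuestionAnswering.from_pretrained(ckpt)

image = Image.new("RGB", (384, 384), "white")  # stand-in for a real photo
question = "How many cats are there?"

inputs = processor(image, question, return_tensors="pt")
logits = model(**inputs).logits

# ViLT treats VQAv2 as classification over a fixed set of frequent answers,
# so the prediction is a label lookup rather than generated text.
answer = model.config.id2label[logits.argmax(-1).item()]
print(answer)
```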
Depth Anything (small-sized model, Transformers version)
Depth Anything model. It was introduced in the paper Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data by Lihe Yang et al.…
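An inference sketch with the Transformers-compatible small checkpoint (id `LiheYoung/depth-anything-small-hf` assumed from the Hub conversion); the upsampling step restores the raw prediction to the input resolution:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForDepthEstimation

# Transformers-compatible small Depth Anything checkpoint (id assumed).
ckpt = "LiheYoung/depth-anything-small-hf"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModelForDepthEstimation.from_pretrained(ckpt)

image = Image.new("RGB", (518, 518), "gray")  # stand-in for a real photo

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    predicted_depth = model(**inputs).predicted_depth  # shape (batch, H, W)

# Upsample the raw prediction back to the input resolution.
depth = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],  # PIL size is (W, H); interpolate wants (H, W)
    mode="bicubic",
    align_corners=False,
).squeeze()
print(depth.shape)
```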
WhisperSpeech
Progress update [2024-01-18]
Progress update [2024-01-10]
Progress update [2023-12-10]
Downloads
Roadmap
Architecture
Whisper for modeling semantic tokens
EnCodec for modeling acoustic tokens
Appreciation
Consulting
Citations
…
