Discover AI Models

Browse and compare the latest AI language models. Find pricing, context limits, and key capabilities at a glance.

Cosmos 3

Cosmos 3 is NVIDIA’s next-generation family of open omnimodal world foundation models for Physical AI.

Context: -
Input /1M: -
Output /1M: -
Downloads: 2,830
Released: Jun 1, 2026

Stable Audio 3.0

Stable Audio 3.0 is a model family trained on fully licensed data, designed to be the foundation for what the audio community builds next.

Context: -
Input /1M: -
Output /1M: -
Downloads: -

SANA-WM

SANA is an efficiency-oriented codebase for high-resolution image and video generation, providing complete training and inference pipelines.

Context: -
Input /1M: -
Output /1M: -
Downloads: -
Released: May 17, 2026

Gemini Omni Flash

Gemini Omni Flash is Google DeepMind's native multimodal video generation model, announced at Google I/O 2026.

Context: -
Input /1M: -
Output /1M: -
Downloads: -
Released: May 21, 2026

Pixal3D

Pixal3D is an image-to-3D AI model that converts a single 2D image into a high-fidelity 3D asset with detailed geometry and textures.

Context: -
Input /1M: -
Output /1M: -
Downloads: -
Released: May 1, 2026

Qwen3.7 Max

Qwen3.7-Max is a new-generation flagship model designed for the era of intelligent agents.

Context: -
Input /1M: -
Output /1M: -
Downloads: -
Released: May 19, 2026

Lance

Lance is a lightweight native unified multimodal model that supports image and video understanding, generation, and editing within a single framework.

Context: -
Input /1M: -
Output /1M: -
Downloads: 438

Hy3 Preview

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use.

Context: 262K
Input /1M: $0.066
Output /1M: $0.26
Downloads: 63,435
Released: Apr 23, 2026

Sulphur 2 Base GGUF

An uncensored video generation model based on LTX 2.3 that natively supports both text-to-video (t2v) and image-to-video (i2v), as well as all other LTX 2.3 formats.

Context: -
Input /1M: -
Output /1M: -
Downloads: 52,477
Released: Apr 9, 2026

GLM-5.1

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

Context: 202,752
Input /1M: $0.98
Output /1M: $3.08
Downloads: 241,258
Released: Apr 7, 2026

Scenema Audio

Scenema Audio is an audio diffusion model extracted from LTX 2.3. It generates speech with emotional acting, pacing, breath control, and sound effects from a text prompt. It is not speech synthesis. It is performative audio generation.

Context: -
Input /1M: -
Output /1M: -
Downloads: 99
Released: May 15, 2026

Sulphur 2 base

Sulphur 2 is a censorship-free video generation model based on the LTX 2.3 architecture. Its main function is to support text-to-video and image-to-video creation. It also works with all other LTX 2.3 formats.

Context: -
Input /1M: -
Output /1M: -
Downloads: 627K
Released: May 3, 2026

OpenAI: GPT-5.5

GPT-5.5 is our newest frontier model for the most complex professional work. The reasoning.effort setting supports none, low, medium (default), high, and xhigh.

Context: 1.1M
Input /1M: $5.00
Output /1M: $30.00
Downloads: -
Released: Apr 24, 2026
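
The Input /1M and Output /1M figures are prices in US dollars per one million tokens, so a request's cost can be estimated by scaling each price by the token count. The sketch below is illustrative only: the `PRICES` table and the `estimate_cost` helper are hypothetical names, and the prices are copied from the Hy3 Preview and GPT-5.5 entries in this listing. Actual billing may differ (for example, through caching or tiered pricing).

```python
from decimal import Decimal

# Per-1M-token prices in USD (input, output), copied from this listing.
# The table and helper below are hypothetical, not part of any provider API.
PRICES = {
    "Hy3 Preview": (Decimal("0.066"), Decimal("0.26")),
    "OpenAI: GPT-5.5": (Decimal("5.00"), Decimal("30.00")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from per-1M-token prices."""
    input_price, output_price = PRICES[model]
    million = Decimal(1_000_000)
    # Scale each price by the number of tokens, then convert from per-1M units.
    return (input_price * input_tokens + output_price * output_tokens) / million


if __name__ == "__main__":
    # Compare the same workload (200K input tokens, 50K output tokens) on both models.
    for name in PRICES:
        print(name, estimate_cost(name, 200_000, 50_000))
```

Decimal is used instead of float so that small per-token prices do not pick up binary rounding noise when summed.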
