◈ Homepage — https://vast.ai/This website utilizes technologies such as cookies to enable essential site functionality, as well as for analytics, personalization, and targeted advertising. To learn more, view the following link:
Privacy Policy
Developers
Pricing
Products
Hosting
Use Cases
Company
Contact Sales
Console
SOC 2 Certified
Agent-Ready AI
Infrastructure.
Vast is the infrastructure layer where AI agents autonomously design, procure, and optimize their own compute. API-native provisioning. Real-time pricing. Per-second billing.
Get Started
CLI Docs
700K+ transactions/mo
20,000+ GPUs
40+ data centers
68+ GPU types
Trusted by developers and AI teams worldwide
Real-time GPU infrastructure
Prices set by supply and demand across 20,000+ GPUs. Transparent. Programmatically queryable.
View All GPUs
How it works
From sign-up to running GPU workloads in under five minutes.
1
Add credit & get your API key
Start with as little as $5. Grab your API key from the console — no contracts, no sales calls.
2
Search GPUs
Filter by model, VRAM, price, and availability — via console or API.
3
Deploy
Launch instances in seconds. Scale up or down programmatically.
Get Started for $5
Compare. Launch. Exit. Repeat.
Every GPU on Vast.ai is provisioned through code. The same API that developers use to deploy in seconds is the interface agents use to procure and optimize at scale.
PYTHON SDK & CLI
pip install vastai
One install gives you both the CLI and Python SDK.
SDK — Programmatic compute provisioning in five lines of code.
Docs →
CLI — Search, filter, and deploy from your terminal.
Docs →
REST API
Docs →
The interface agents call to provision infrastructure.
curl -H "Authorization: Bearer $VAST_API_KEY" https://cloud.vast.ai/api/v1/bundles/
Explore Developer Tools
deploy.py
from vastai import VastAI
vast = VastAI(api_key="...")
offers = vast.search_offers(
query="gpu_name=H100_SXM num_gpus=8"
)
result = vast.launch_instance(
id=offers[0]["id"],
image="vllm/vllm-openai:latest"
)
One platform. Three ways to deploy.
GPU Cloud for full control. Serverless for zero-ops inference. Clusters for large-scale training.
GPU Cloud
On-demand instances across 40+ data centers and 20,000+ GPUs. Deploy in seconds via CLI, SDK, or API.
Explore GPU Cloud
Serverless
Deploy models as endpoints with automatic benchmarking and optimization across GPU types. Autoscale to zero, pay only for compute time.
Try Serverless
Clusters
Dedicated multi-node GPU clusters with InfiniBand networking for large-scale training.
View Clusters
Built for Every AI Workload
From training to inference, fine-tuning to rendering — run any GPU workload on Vast.
Use Cases
AI/ML Frameworks
AI Text Generation
AI Image + Video Generation
AI Agents
Batch Data Processing
Audio-to-Text Transcription
AI Fine Tuning
Virtual Computing
GPU Programming
Graphics Rendering
Popular Models, Ready to Deploy
Launch pre-configured templates for the most popular open-source models.
Kimi K2.6
Kimi K2.6 is an open-source, native multimodal agentic MoE model from Moonshot AI with 1T total parameters, 32B activated, advancing long-horizon coding, coding-driven design, and swarm-based task orchestration
Deploy
Qwen3.6 35B A3B
Agentic coding MoE with hybrid Gated DeltaNet and vision support
Deploy
Gemma 4 31B IT
Gemma 4 31B dense vision-language model by Google with 256K context and thinking mode
Deploy
Qwen3.5 27B
Dense 27B vision-language model with unified multimodal reasoning
Deploy
Browse Model Library
“Vast.ai reduced our GPU costs by over 60% while giving us the flexibility to scale training jobs on demand. We serve 200K daily users without breaking the bank.”
Giang, Creatix Technology
How teams build on Vast.ai
See how teams use Vast.ai to scale AI infrastructure and accelerate production workloads.
Creatix Technology
Creatix Technology Scales to 200K Daily Users with Vast.ai's GPU Cloud
How a fast-growing AI app company cut infrastructure costs by over 60% and powered millions of new users with Vast.ai.
Tech
View Case Study
PAICON
PAICON Accelerates Global, Data-Centric Cancer Diagnostics with Vast.ai
How a global oncology data platform used Vast.ai’s GPU cloud to rapidly iterate on Athena—validating that diversity can matter more than scale—while significantly reducing research-phase training costs.
Medical AI
View Case Study
Start with $5. Scale to 20,000 GPUs.
No humans required.
Get Started
Contact Sales
Subscribe for our product updates.
→
© 2026 Vast.ai. All rights reserved.
Products
GPU Cloud
Clusters
Hosting
Developers
CLI
Python SDK
API Reference
Documentation
Resources
Enterprise
Startup Program
Pricing
Use Cases
Docs
FAQs
Press Kit
Community
Discord
GitHub
Twitter
YouTube
Contact
Get in Touch
Contact Sales
Investor Inquiries
Legal
Terms of Service
Privacy Policy
Compliance
Vulnerability Disclosure
Data Processing
◈ Interior Pages — 53 pages crawledPython SDK — Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases PYTHON SDK GPU Compute in 3 Lines of Python Search, provision, and manage GPU instances programmatically. Build agents that scale their own compute. Get API Key PyPI Package Install pip install vastai from vastai import VastAI vast = VastAI(api_key="YOUR_API_KEY") One package, two interfaces: pip install vastai installs both the Python SDK and the CLI . What you can build Search & Launch Monitor & Scale Autoscale Endpoints File Transfer from vastai import VastAI vast = VastAI(api_key="...") # Find cheapest 8x H100 offers = vast.search_offers( query='gpu_name=H100_SXM num_gpus=8', order='dph', limit=3 ) # Launch with vLLM result = vast.launch_instance( id=offers[0]["id"], image="vllm/vllm-openai:latest", disk=100, ssh=True ) print(f"Instance {result['new_contract']} created") Agent-Ready Build AI agents that provision their own GPU compute. No human in the loop. Type Hints Full IDE support — autocomplete, type checking, inline docs. CLI Parity Every CLI command has an SDK equivalent. Same query syntax, same filters. Pip Install One dependency. Python 3.9+. No compiled extensions. pip install vastai Get your API key and start building GPU-powered applications. Get API Key Full SDK Docs Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. REST API — Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases REST API REST API Direct HTTP access to every Vast.ai operation. Search offers, create instances, manage resources — from any language. API Reference Get API Key Quick example # Search for available H100s curl -s -H "Authorization: Bearer $VAST_API_KEY" \ "https://cloud.vast.ai/api/v0/bundles/?q=%7B%22gpu_name%22%3A%22H100_SXM%22%7D" \ | jq '.offers[:3] | .[] | {id, gpu_name, num_gpus, dph_total}' Key endpoints Method Endpoint Description GET /api/v0/bundles/ Search available GPU offers PUT /api/v0/asks/{id}/ Create instance from offer GET /api/v0/instances/ List your instances PUT /api/v0/instances/{id}/ Update instance DELETE /api/v0/instances/{id}/ Destroy instance Authentication All requests require an API key. Pass it as a Bearer token in the Authorization header or as a query parameter. # Bearer token (recommended) curl -H "Authorization: Bearer YOUR_API_KEY" https://cloud.vast.ai/api/v0/instances/ # Query parameter curl "https://cloud.vast.ai/api/v0/instances/?api_key=YOUR_API_KEY" Explore the full API Complete endpoint reference with request/response examples. API Reference Get API Key Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Scalable GPU Audio Transcription with Whisper & ASR Models at Scale | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Audio-to-Text Transcription Rapidly convert audio to accurate text with GPU-powered transcription. Built for This Convert audio files to accurate transcripts using open-source models. Handle large-scale transcription workloads with scalable GPU access. Support multiple languages and any common audio format inside a locked-down container. Launch ready-to-use speech-to-text environments in one click or via CLI. Models audio ACE Step V1 3.5B ACE-Step is a novel open-source foundation model for music generation that overcomes key limitations of existing approaches through a holistic architectural design audio Dia 1.6B Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control Related Blogs Transcribing Audio with Whisper Large V3 on Vast.ai Implementing Speech-to-Text with Speaker Diarization: Comparing Pyannote and Sortformer on VAST.ai Voice Activity Detection (VAD) with Pyannote on VAST Related Guides Whisper ASR Guide Start Building: Audio-to-Text Transcription Templates Whisper ASR Web Service Multitask model capable of multilingual speech recognition, speech translation, and language identification View All Templates Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Deploy GPU-Powered VMs for Development, Gaming & Research | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Virtual Computing Easily provision GPU-enabled virtual machines with flexible, secure access. Built for This Spin up GPU-enabled virtual desktops or remote workstations in seconds. Run Linux or Ubuntu with full access to your preferred tools and workflows. Adjust resources as needed without switching providers or plans. Maintain privacy and isolation in dedicated environments. Related Blogs Announcing Virtual Machine Rental on Vast.ai Related Guides Linux Virtual Desktop Linux Virtual Machines Start Building: Virtual Computing Templates Linux Desktop Container Containerized desktop environment with both low-latency desktop interface by Selkies and VNC support Ubuntu 22.04 CLI (VM Template) Command-line Ubuntu 22.04 virtual machine pre-loaded with common utilities for a smoother terminal workflow Ubuntu Desktop (VM Template) Access a KDE Plasma desktop with hardware-accelerated graphics and audio directly through your web browser View All Templates Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. GPUs for Startups | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Startup Built for Startups That Are Already Building The Vast.ai Startup Program matches your compute spend with credits — so your budget goes twice as far. Apply for the Startup Credit Matching Program Apply below and our team will reach out to finalize your offer. Why Startups Choose Vast.ai Cut costs, not capability H100s, H200s, B200s at up to 80% less than AWS, GCP, or Azure. No long-term contracts, no minimums. Instant access Spin up in minutes. No waitlists, no sales cycles, no approvals just to get started. Transparent pricing Pay only for what you use. Per-second billing means you're never overpaying for idle compute. SOC 2 Type 2 compliant Enterprise-grade security without the enterprise overhead. GDPR and HIPAA-friendly configurations available. Scales with you From a single GPU to full clusters. Vast grows with your workload — not against it. How It Works Apply and get approved for a credit match up to a set amount. Spend on Vast, and we match your spend in credits — dollar for dollar . No upfront giveaways. No free trials. Our team reviews every application and reaches out to finalize your offer based on your company stage and use case. Credits matched on verified compute spend. Pending verification. Credit Matching — Example You spend $500 → You get $500 back You spend $1,000 → You get $1,000 back You spend $2,500 → You get $2,500 back This Program is for Companies That Are Shipping Good Fit Actively training models, running inference, or scaling a production workload. You have a real use case and you're building now. Not the Right Fit Still evaluating whether AI is right for your business, or looking for free compute with no committed spend. Ready to Make Your Compute Budget Go Further? Apply to the startup program. Our team will review your application and reach out to finalize your match. Apply Now Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Kimi K2.6 - AI Model Library | Build on Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Model Library / Kimi K2.6 Kimi K2.6 LLM Reasoning Vision Language Kimi K2.6 is an open-source, native multimodal agentic MoE model from Moonshot AI with 1T total parameters, 32B activated, advancing long-horizon coding, coding-driven design, and swarm-based task orchestration Kimi K2.6 vllm Deploy Now ... On-Demand Dedicated 8 x H200 CLI Details Modalities text, vision Version 2.6 Recommended Hardware 8 x H200 Estimated Price Loading... Provider Moonshot AI Family Kimi K2 Parameters 1000 B Context 256000 tokens License MIT (Modified) Kimi K2.6 Kimi K2.6 is an open-source, native multimodal agentic model from Moonshot AI that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration. It is a Mixture-of-Experts model with 1 trillion total parameters and 32 billion activated per token, built on the Kimi K2.5 architecture. Key Features Long-Horizon Coding — Significant improvements on complex, end-to-end coding tasks, generalizing robustly across programming languages (Rust, Go, Python) and domains spanning front-end, DevOps, and performance optimization. Coding-Driven Design — Transforms simple prompts and visual inputs into production-ready interfaces and lightweight full-stack workflows, generating structured layouts, interactive elements, and rich animations with deliberate aesthetic precision. Elevated Agent Swarm — Scales horizontally to 300 sub-agents executing 4,000 coordinated steps; dynamically decomposes tasks into parallel, domain-specialized subtasks, delivering end-to-end outputs from documents to websites to spreadsheets in a single autonomous run. Proactive & Open Orchestration — Demonstrates strong performance in powering persistent 24/7 background agents that proactively manage schedules, execute code, and orchestrate cross-platform operations without human oversight. Thinking & Instant Modes — Supports reasoning (thinking) mode by default and an instant-response mode; preserve_thinking retains full reasoning content across multi-turn interactions for coding-agent scenarios. Multimodal Input — Accepts text, image, and video input via the MoonViT vision encoder (400M parameters). Model Summary | | | |:---|:---| | Architecture | Mixture-of-Experts (MoE) | | Total Parameters | 1T | | Activated Parameters | 32B | | Number of Layers | 61 (1 dense + 60 MoE) | | Number of Experts | 384 (8 selected per token, 1 shared) | | Attention Hidden Dimension | 7168 | | MoE Hidden Dimension per Expert | 2048 | | Number of Attention Heads | 64 | | Vocabulary Size | 160K | | Context Length | 256K | | Attention Mechanism | MLA | | Activation Function | SwiGLU | | Vision Encoder | MoonViT (400M parameters) | Kimi K2.6 ships with native INT4 quantization, using the same method as Kimi K2 Thinking. Benchmarks Agentic HLE-Full (with tools): 54.0 BrowseComp: 83.2 (86.3 with Agent Swarm) DeepSearchQA (f1-score): 92.5 DeepSearchQA (accuracy): 83.0 WideSearch (item-f1): 80.8 Toolathlon: 50.0 MCPMark: 55.9 Claw Eval (pass^3): 62.3; (pass@3): 80.9 APEX-Agents: 27.9 OSWorld-Verified: 73.1 Coding Terminal-Bench 2.0 (Terminus-2): 66.7 SWE-Bench Pro: 58.6 SWE-Bench Multilingual: 76.7 SWE-Bench Verified: 80.2 SciCode: 52.2 OJBench (python): 60.6 LiveCodeBench (v6): 89.6 Reasoning & Knowledge HLE-Full: 34.7 AIME 2026: 96.4 HMMT 2026 (Feb): 92.7 IMO-AnswerBench: 86.0 GPQA-Diamond: 90.5 Vision MMMU-Pro: 79.4 (80.1 with python) CharXiv (RQ): 80.4 (86.7 with python) MathVision: 87.4 (93.2 with python) BabyVision: 39.8 (68.5 with python) V* (with python): 96.9 Use Cases Autonomous agentic workflows spanning coding, research, and browsing Long-horizon software engineering and multi-step code generation Coding-driven UI/UX design from prompts and visual inputs Document, chart, and image understanding at scale Multi-agent task orchestration with parallel sub-agent coordination Persistent background agents for schedule management and cross-platform operations Quick Start Guide Choose a model and click 'Deploy' above to find available GPUs recommended for this model. Rent your dedicated instance preconfigured with the model you've selected. Start sending requests to your model instance and getting responses right now. Related Models text vision Kimi K2.5 Kimi K2.5 is an open-source, native multimodal agentic model built through continual pretraining on approximately 15 trillion mixed visual and text tokens atop Kimi-K2-Base text Kimi K2 Thinking Open-source trillion-parameter MoE AI model with thinking text Kimi K2 Instruct 0905 Open-source trillion-parameter MoE AI model Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. GPU Financing — Vast Finance | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Finance Your GPUs. Let Your Earnings Make the Payments. We connect Vast hosts with financing partners who specialize in GPU infrastructure. Your platform earnings help you qualify — submit once, let partners compete. Free inquiry No credit impact No obligation How It Works Three steps from inquiry to funded hardware. 1 . Tell Us Your Needs Share your GPU requirements, timeline, and budget. If you're an existing Vast host, we can use your platform earnings history to strengthen your application. 2 . We Match a Partner Our network of financing partners evaluates your profile and competes for your business. Platform earnings data means less paperwork and faster approvals. 3 . Get Funded & Start Earning Receive financing, deploy your GPU hardware on the Vast platform, and start generating revenue. Your earnings can even service your payments. The GPU Financing Flywheel Finance hardware, deploy it, earn revenue, and use those earnings to finance more. Finance Get matched with a financing partner Deploy Hardware goes live on the Vast platform Earn Revenue from 120K+ developers Expand Earnings qualify you for more financing Finance Get matched with a financing partner Deploy Hardware goes live on the Vast platform Earn Revenue from 120K+ developers Expand Earnings qualify you for more financing Financing Structures Our financing partners offer structures tailored to GPU infrastructure investments. Equipment-as-Collateral Leases The GPU hardware itself secures the financing. Lower barrier to entry — your equipment is the collateral, so you don't need extensive business credit history. Revenue-Based Lending Repayments tied to your Vast platform earnings. When you earn more, you pay faster. When demand dips, payments flex with you. Traditional Equipment Financing Fixed-term loans with predictable monthly payments. Best for established businesses with strong credit who want certainty on costs. Sale-Leaseback Already own GPU hardware? Sell it to a financing partner and lease it back — freeing up capital to expand while keeping your machines running. How It Plays Out A host with 6 months of $15K/month Vast earnings submits a financing inquiry for $250K in H100s. Because we can share verified revenue data with financing partners, they receive offers in days — not the weeks a traditional equipment loan would take. Just as Amazon Lending uses seller performance data to offer loans, Vast can share your hosting revenue history with our financing partners to help you qualify. Better earnings history = better terms. Get Matched with a Financing Partner Free inquiry. No credit impact. We'll connect you with GPU infrastructure financing specialists. 1 2 3 Your Project Financing Amount Needed * Select a range $25,000 – $100,000 $100,000 – $250,000 $250,000 – $500,000 $500,000 – $1,000,000 $1,000,000+ GPU Hardware of Interest * B200 H200 H100 SXM H100 PCIe A100 80GB A100 40GB L40S RTX 5090 RTX 4090 RTX 3090 Other Timeline * When do you need financing? Immediately 1–3 months 3–6 months Just exploring options Continue Need Hardware First? Don't just finance GPUs — source them through us too. New and certified refurbished hardware, pre-configured for the Vast platform. Vast Hardware Ready to Scale Your Hosting? Get matched with a financing partner, or source GPU hardware through our vetted supplier network. Contact Sales Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Qwen3.6 35B A3B - AI Model Library | Build on Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Model Library / Qwen3.6 35B A3B Qwen3.6 35B A3B LLM Vision Language MoE Reasoning Coding Agentic coding MoE with hybrid Gated DeltaNet and vision support Qwen3.6 35B A3B vllm Deploy Now ... On-Demand Dedicated 1 x RTX PRO 6000 S CLI Details Modalities text, vision Recommended Hardware 1 x RTX PRO 6000 S Estimated Price Loading... Provider Alibaba Family Qwen3.6 Parameters 35 B Context 262144 tokens License apache-2.0 Qwen3.6 35B A3B: Agentic Coding with Hybrid Gated DeltaNet Qwen3.6 35B A3B is the first open-weight model in the Qwen3.6 series, built on direct community feedback and focused on stability and real-world utility. It combines a hybrid Gated DeltaNet and Gated Attention architecture with sparse Mixture-of-Experts routing and a vision encoder for unified multimodal reasoning. Key Features Agentic Coding - Handles frontend workflows and repository-level reasoning with improved fluency and precision over earlier Qwen generations Thinking Preservation - New option to retain reasoning context from historical messages, streamlining iterative development and reducing redundant token generation Hybrid Architecture - Alternating Gated DeltaNet and Gated Attention blocks combined with sparse MoE, balancing long-context efficiency against attention precision Sparse Mixture-of-Experts - 256 total experts with 8 routed and 1 shared expert active per token, delivering 35B total capacity with only 3B active parameters Multi-Token Prediction - Trained with multi-step MTP, enabling speculative decoding for lower-latency inference Native 262K Context - Handles 262,144 tokens natively, extensible up to 1,010,000 tokens via YaRN RoPE scaling Multimodal Inputs - Unified vision-language model supporting text, image, and video inputs Tool Calling - Native tool-calling support with the qwen3_coder parser for agent workflows Benchmark Performance Coding and Software Engineering: SWE-bench Verified: 73.4 SWE-bench Multilingual: 67.2 SWE-bench Pro: 49.5 Terminal-Bench 2.0: 51.5 LiveCodeBench v6: 80.4 NL2Repo: 29.4 QwenClawBench: 52.6 General Agent and Tool Use: TAU3-Bench: 67.2 DeepPlanning: 25.9 MCPMark: 37.0 MCP-Atlas: 62.8 WideSearch: 60.1 Knowledge: MMLU-Pro: 85.2 MMLU-Redux: 93.3 SuperGPQA: 64.7 C-Eval: 90.0 STEM and Reasoning: GPQA: 86.0 HLE: 21.4 HMMT Feb 25: 90.7 HMMT Nov 25: 89.1 HMMT Feb 26: 83.6 IMOAnswerBench: 78.9 AIME26: 92.6 Use Cases Agentic coding tasks across frontend, backend, and repository-level workflows Multi-turn agent scenarios where preserved reasoning context improves decision consistency Tool-calling and MCP-based automation Competition-level mathematics and STEM reasoning Long-context document analysis up to 262K tokens natively Visual question answering and image-grounded reasoning Video understanding with configurable frame sampling Architecture Qwen3.6 35B A3B uses a 40-layer hybrid architecture organized as ten cycles of three Gated DeltaNet blocks followed by one Gated Attention block, each paired with a sparse Mixture-of-Experts feed-forward layer. Gated DeltaNet provides linear-attention efficiency with a fixed-size recurrent state, keeping long-context compute and memory cost tractable. The interleaved Gated Attention blocks use 16 query heads and 2 key-value heads with a 256-dimensional head and a 64-dimensional rotary position embedding, preserving precise token-level attention where it is most valuable. The Mixture-of-Experts layer routes each token through 8 of 256 available experts plus 1 shared expert, with a 512-dimensional expert intermediate size. The model is trained with Multi-Token Prediction across multiple steps, enabling speculative decoding at inference time. A 2048-dimensional language backbone pairs with a vision encoder to form a unified multimodal model, supporting a 248,320-token padded vocabulary and handling text, image, and video inputs through a shared representation. Deploy Qwen3.6 35B A3B on Vast.ai with vLLM, SGLang, or llama.cpp for efficient agentic coding, long-context reasoning, and multimodal inference on flexible GPU infrastructure. Quick Start Guide Choose a model and click 'Deploy' above to find available GPUs recommended for this model. Rent your dedicated instance preconfigured with the model you've selected. Start sending requests to your model instance and getting responses right now. Related Models text vision Qwen3.5 397B A17B Efficient multimodal reasoning model with hybrid DeltaNet-attention architecture text Qwen3 Coder 480B A35B Instruct Qwen3 coding model text Qwen3 Coder Next Ultra-efficient 80B coding agent with only 3B active parameters Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. GPU Cloud Hosting for AI & ML | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases GPU Cloud Launch Fast, Pay Less Launch a GPU in 60 seconds. 20,000+ GPUs — ready with one-click templates or fully automated via API & CLI. Get Started Explore Templates Trusted by developers worldwide No Pricing Guesswork. Just GPUs. Deploy a 4090 for under $5, or access H100s for as little as $0.90/hour. Save up to 80% compared to AWS, Azure, or GCP. Choose the pricing model that fits your workflow: On-Demand, Interruptible, or Reserved. Get Started On-Demand Pricing Bulk Pricing Your Infrastructure Launchpad Deploy AI models, run intensive compute jobs, and scale your workloads across one of the largest NVIDIA GPU fleets available. Leverage NVIDIA hardware for peak performance — then scale down effortlessly on your terms. Start Building Case Studies Blackwell RTX 5090 Hopper H100 NVL H100 SXM H200 H100 PCIe Ada Lovelace RTX 4090 L-Series RTX Ada Other Ada Ampere RTX 3090 RTX A-Series A100 Other GPUs Volta / Tesla Turing RTX 2 Pascal GTX Maxwell NVIDIA GPU Architectures on Vast On-Demand GPU Deployment Spin up 4090s, A100s, H100s, and more -- on your timeline, with no upfront negotiation or quotas. Flexible, Transparent Pricing Per-second billing with On-Demand, Interruptible, or Reserved pricing and a $5 minimum to get started. Secure Cloud Isolation Run workloads on dedicated infrastructure with full environment control and SOC 2 Type I compliance. Dev-First Interfaces Prefer code? Hit our lightweight CLI or API endpoints to provision fleets without ever opening our GUI dashboard. Up-to-date Templates Use official templates, remix thousands of community-built stacks, or start from scratch -- with DLPerf scores to help you pick the right GPU. Support That Doesn't Sleep Get 24/7 help from real humans. Need more? Premium support tiers include onboarding, architectural consults, and guaranteed response times. Private by Design. Secure by Default. Your Workloads. Your Data. Your Rules. Build without compromise on our Secure Cloud — from idea to deployment, your stack stays yours. Secure Cloud Full Environment Control Launch isolated instances with direct SSH, CLI, and API access -- no container sharing, no noisy neighbors. Compliance-Ready Deploy on SOC 2 Type I-certified environments built for healthcare, finance, and regulated industries. Data Sovereignty Delete models, data, and workloads when you choose -- nothing persists without your command. Enterprise Security Features Enable private VPN access, optional audit trails, and enterprise-grade compliance support for complete operational security. From Zero to Compute in Seconds Skip the quotas, skip the contracts, skip the chaos. Build, deploy, and scale — the way you always knew it could be. Get Started Talk to Sales Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Careers | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Careers Build where the boundaries end. If our mission hits something in you, keep reading. Get in Touch Our mission is to organize, optimize, and orient the world's computation. More About Us What We're Building Vast.ai is the AI compute platform — 20,000+ GPUs, 25,000+ monthly customers, and 8 years of operations data. We're building the infrastructure layer where AI agents and developers programmatically provision and manage GPU compute. We are also developing software to accelerate the training and deployment of complex neural networks on our decentralized infrastructure. Our Locations We have offices in Los Angeles and San Francisco. Vast.ai Los Angeles 1100 Glendon Ave #1840 Los Angeles, CA 90024 Vast.ai San Francisco 100 First Street, #2250 San Francisco, CA 94105 The Work The journey to our destiny will not be easy. Our goal is not the safe harbor of today. You will be taxed, and you will be pushed. We love to work. We can't help it; we are witnessing the birth of AGI. All technical roles report to Jake Cannell, the CEO and founder. Jake is a prolific writer and thinker on the subject of AI. Two examples from Less Wrong: LOVE in a simbox is all you need The Brain as a Universal Learning Machine Join Us We're always looking for exceptional people. Send your resume and tell us what you'd build.
[email protected] Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Contact Us | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Us Do you have questions or need more information about our services? We're here to help! Fill out the form below, and our team will get in touch with you to provide assistance and answer any queries you might have. Full Name Email Address Message Submit Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. PAICON Accelerates Global, Data-Centric Cancer Diagnostics with Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Case Studies PAICON PAICON Accelerates Global, Data-Centric Cancer Diagnostics with Vast.ai How a global oncology data platform used Vast.ai’s GPU cloud to rapidly iterate on Athena—validating that diversity can matter more than scale—while significantly reducing research-phase training costs. 60%+ Research-Phase Training Cost Reduction Agility Parallel experiments at research scale Speed Faster iteration and feedback cycles Hybrid Vast.ai for R&D, hyperscalers for production Overview The Challenge: Enabling Research-Scale Iteration The Solution: Hybrid Compute with Vast.ai Results Beyond Foundation Models: Knowledge Systems for Diagnostics Why PAICON Stands Out Looking Ahead About PAICON Overview Industry: Medical AI Key area: Data-centric oncology and multimodal clinical data PAICON is a Germany-based, data-centric oncology company building the foundations for truly generalizable medical AI. Its core advantage is access to globally distributed, real-world clinical datasets through long-term collaborations with hospitals, labs and research institutions -- spanning geographies, patient populations and technical imaging variability. PAICON unifies and quality-controls multimodal medical data, starting with oncology, to enable partners to develop -- and regulators to trust -- AI systems that work beyond a single region, scanner or protocol. To demonstrate what this approach makes possible, PAICON built Athena, a histopathology foundation model designed to learn from diversity rather than sheer volume -- supporting the principle that variety can outperform scale when the goal is real-world generalization. PAICON supports and actively engages in the development of clinical-grade AI by providing curated, governance-ready datasets and the infrastructure needed to train and validate models across heterogeneous real-world conditions -- helping reduce time-to-research and improving external validity across sites. Led by Dr. Manasi Aichmüller-Ratnaparkhe (CEO & Co-founder), Danny Quick (COO) and Dr. Christian Aichmüller (CTO & Co-founder), PAICON works with a growing network of global collaborators. This gives PAICON access to one of the most geographically and technically diverse pathology datasets -- capturing differences in staining, scanner hardware, lab workflows and population-level variation that often cause medical AI systems to fail when deployed outside the environments they were trained on. The Challenge: Enabling Research-Scale Iteration Building generalizable medical AI is both a data and compute challenge. Robust models require large multimodal datasets, strict governance requirements, and extensive testing across sources of variation. The training strategy behind Athena required extensive experimentation -- not just a single large training run -- because robustness emerges from iterating across diverse subsets, sampling strategies, architectures, augmentation approaches and evaluation setups. These workloads required substantial GPU compute to support: Training and adapting vision models across multi-site, multi-region datasets Running many parallel experiments Validating generalization across technical and population diversity Hyperscale cloud platforms such as AWS are excellent for production deployments --offering reliability, security tooling and global availability. However, Athena’s R&D phase required a high volume of experiments at the frontier of cost-efficiency. In that environment, premium GPU instance pricing and periodic capacity constraints made sustained, large-scale iteration economically challenging. For production workloads, hyperscalers remain a strong fit. For research-scale experimentation, PAICON required a way to iterate rapidly without costs becoming prohibitive. Medical research also demands flexibility around where compute runs and how workflows are controlled -- particularly when operating across regions with varying governance requirements. The Solution: Hybrid Compute with Vast.ai PAICON used Vast.ai’s distributed GPU cloud to access large, multi-GPU configurations with pricing that made research-scale iteration feasible. This enabled the team to run more experiments, explore robustness strategies and accelerate development of Athena -- while retaining hyperscalers for production deployments when required. The hybrid workflow included: Curating and pre-processing datasets in existing environments before exporting training-ready subsets Training and fine-tuning on multi-GPU Vast.ai machines for high-throughput experimentation Using on-demand clusters to run parallel trials and rapidly iterate on model variants Vast.ai’s flexible supply model complemented PAICON’s multinational collaboration footprint, supporting dynamic research workloads without locking experimentation into rigid infrastructure models. “With Vast.ai, we can scale experiments up and down quickly -- moving from 4 to 8 GPUs when needed -- while keeping iteration economically sustainable.” See How Vast.ai can Transform AI for You Discover how Vast.ai powers AI innovation. Our team will guide you through a tailored platform walkthrough and show how we combine enterprise-level capabilities with startup-friendly pricing. Contact Sales Team Results Significant Training Cost Reduction by 60%+ By shifting large training workloads to Vast.ai, PAICON substantially reduced research-phase GPU training costs compared with comparable hyperscale configurations. Greater Experimental Agility On-demand access to large GPU instances enabled parallel experimentation and faster evaluation of robustness strategies across diverse data slices. Faster Research Cycles Multi-GPU training runs across heterogeneous datasets completed reliably, allowing faster feedback loops. Early Athena experiments showed encouraging cross-tissue signals, supporting PAICON’s thesis that diversity-aware training improves generalization. Beyond Foundation Models: Knowledge Systems for Diagnostics In parallel with vision research, PAICON explored how curated pathology knowledge corpora can support education and decision support. Vast.ai’s GPU capacity enabled prototyping and fine-tuning of domain-adapted language models within PAICON’s broader data-centric roadmap. Why PAICON Stands Out PAICON’s differentiator is data diversity with governance: aggregating real-world clinical data across regions, labs and technical pipelines to enable AI systems that hold up outside controlled environments. By representing population-level and laboratory-level variability -- rather than optimizing only for size -- PAICON helps partners build models that are more robust, less biased and more transferable across sites. Athena serves as a proof point of this thesis: a foundation model built to learn from heterogeneity, demonstrating why diversity-aware training is essential for clinical deployment – enabled by and benefiting from Vast.ai’s globally distributed compute infrastructure. Looking Ahea Accelerate Data Processing & ETL with Cloud GPUs | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Batch Data Processing Accelerate large-scale data processing tasks with robust GPU performance. Built for This Automate large-scale data transformation and cleanup tasks. Dynamically allocate compute resources to match batch job needs. Reduce idle costs with per-second billing and interruptible instance options. Integrate easily with existing data pipelines or cloud storage. Start Building: Batch Data Processing Templates NVIDIA RAPIDS Run your entire data-science and analytics workflow natively on GPUs View All Templates Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Deploy LLMs for Inference & Fine-Tuning | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases AI Text Generation Spin up the latest open-source LLMs or your own custom fine-tuned models in minutes. Built for This Launch open-source models like LLaMA 3, DeepSeek, or your own fine-tuned checkpoints. Skip the DevOps; prebuilt images like vLLM, TGI, and Oobabooga do the heavy lifting. Serve models via clean APIs or WebUIs with minimal setup. Own your environment. Your models run on isolated GPUs, and your data disappears when you say so. Models text vision Gemma 4 31B IT Gemma 4 31B dense vision-language model by Google with 256K context and thinking mode text vision Kimi K2.6 Kimi K2.6 is an open-source, native multimodal agentic MoE model from Moonshot AI with 1T total parameters, 32B activated, advancing long-horizon coding, coding-driven design, and swarm-based task orchestration text vision Qwen3.5 27B Dense 27B vision-language model with unified multimodal reasoning View More Models Related Blogs Serving Online Inference with vLLM API on Vast Serving Online Inference with TGI on Vast.ai Serving Rerankers on Vast.ai using vLLM Related Guides vLLM (LLM Inference and Serving) Hugging Face TGI with Llama 3 Oobabooga (LLM WebUI) Quantized GGUF Models Start Building: AI Text Generation Templates Open Webui (Ollama) Extensible, self-hosted AI interface that adapts to your workflow Oobabooga Text Gen UI & API Web interface for Stable Diffusion, implemented using the Gradio library HuggingFace Llama3 TGI API Toolkit for deploying and serving Large Language Models (LLMs) vLLM Fast and easy-to-use library for LLM inference and serving View All Templates Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Access On-Demand GPU Power for Custom CUDA Development | Optimize & Test on Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases GPU Programming Accelerate AI and HPC breakthroughs; unleash blazing-fast GPU computing at scale. Built for This Access raw GPU power for custom CUDA-based application development. Optimize performance by targeting specific architectures like A100 or 4090. Use full admin privileges to configure drivers, memory, and execution environments. Test and iterate across multiple GPU types with minimal setup. Related Guides CUDA Programming on Vast.ai Start Building: GPU Programming Templates NVIDIA CUDA Foundational Docker base image, designed to serve as the starting point for all containerized development View All Templates Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. CLI Hello World - Vast.ai Documentation: Affordable GPU Cloud Marketplace Skip to main content Vast.ai Documentation: Affordable GPU Cloud Marketplace home page Search... ⌘ K Ask AI FAQ Discord Console Console Search... Navigation Getting started CLI Hello World Guides CLI & SDK API Host Examples CLI Getting started CLI Authentication Permissions Rate Limits Templates Accounts Billing Instances Search & templates Serverless Teams Volumes Host SDK Getting started Accounts Billing Instances Search & templates Serverless Serverless client library Teams Volumes Host On this page Prerequisites 1. Install the CLI 2. Set Your API Key 3. Verify Authentication 4. Search for GPUs 5. Register Your SSH Key 6. Create an Instance 7. Wait Until Ready 8. Connect via SSH 9. Copy Data 10. Clean Up Next Steps Getting started CLI Hello World Copy page Copy page Documentation Index Fetch the complete documentation index at: https://docs.vast.ai/llms.txt Use this file to discover all available pages before exploring further. The Vast.ai CLI gives you command-line access to the entire platform, authentication, GPU search, instance lifecycle, templates, volumes, serverless endpoints, and more. Anything you can do in the web console, you can automate from your terminal. This guide walks through the core workflow: install the CLI, authenticate, search for a GPU, rent it, wait for it to boot, connect to it, copy data, and clean up. By the end you’ll understand the commands needed to manage instances without touching the web console. Prerequisites A Vast.ai account with credit (~$0.01-0.05, depending on test instance run time) Python 3 installed 1. Install the CLI Install from PyPI: pip install vastai Or grab the latest version directly from GitHub: wget https://raw.githubusercontent.com/vast-ai/vast-cli/master/vast.py -O vastai && chmod +x vastai Verify the installation: vastai --help 2. Set Your API Key Generate an API key from the Keys page by clicking +New . Copy the key, you’ll only see it once. Save it to the CLI: vastai set api-key YOUR_API_KEY_HERE This stores your key in a config file in your home directory. Do not share your API keys with anyone. The console creates a full-access key by default. You can also create scoped keys with limited permissions using vastai create api-key , useful for CI/CD or shared tooling. See the permissions documentation for details. 3. Verify Authentication Confirm your key works by fetching your account info: vastai show user This returns your user ID, email, balance, and SSH key. If you see an authentication error, double-check your API key. 4. Search for GPUs Find available machines using search offers . This query returns on-demand RTX 4090s on verified machines with direct port access, sorted by deep learning performance per dollar: vastai search offers 'gpu_name=RTX_4090 num_gpus=1 verified=true direct_port_count>=1 rentable=true' -o 'dlperf_usd-' Each parameter in the query controls a different filter: Parameter Meaning gpu_name=RTX_4090 Filter to a specific GPU model num_gpus=1 Exactly 1 GPU per instance verified=true Only machines verified by Vast.ai (identity-checked hosts) direct_port_count>=1 At least 1 directly accessible port (needed for direct SSH) rentable=true Only machines currently available to rent -o 'dlperf_usd-' Sort by DL performance per dollar, best value first Note the ID of the offer you want, you’ll use it in the next step. If no offers are returned, try relaxing your filters (e.g. a different GPU model or removing direct_port_count ). Use vastai search offers --help for the full list of filter fields and options, or see the CLI commands reference . 5. Register Your SSH Key Do this before creating an instance. Your SSH public key must be registered on your account, it is applied at container creation time. vastai create ssh-key ~/.ssh/id_ed25519.pub If you don’t have a key yet, omit the argument and the CLI will generate one: vastai create ssh-key Your key persists on your account, you only need to do this once per key. If you forgot and already created an instance, use the SSH key button on the instance card in the console to add a key without recreating. 6. Create an Instance Rent the machine using create instance with the offer ID from step 4 (search): vastai create instance OFFER_ID --image pytorch/pytorch:2.4.0-cuda12.4-cudnn9-runtime --disk 20 --onstart-cmd "echo hello && nvidia-smi" --ssh --direct Flag Meaning --image Docker image to launch --disk 20 20 GB of disk storage --onstart-cmd Command to run when the instance boots --ssh --direct Direct SSH access (lower latency than proxy SSH) The output includes the new instance ID: { "success" : true , "new_contract" : 12345678 } Save the new_contract value, this is your instance ID. Storage charges begin at creation. GPU charges begin when the instance reaches the running state. --onstart-cmd is limited to 16KB . For longer scripts, gzip and base64 encode them, see the Template Settings page for the workaround. 7. Wait Until Ready The instance needs time to pull the Docker image and boot. Check the status with: vastai show instance INSTANCE_ID The status field progresses through these states: Status Meaning loading Docker image is downloading running Ready to use Check every 10-30 seconds. Boot time is typically 1-5 minutes depending on the Docker image size. Always handle non-happy-path statuses in your poll loop. If status becomes exited (container crashed), unknown (no heartbeat from host), or offline (host disconnected), it will never reach running . Without a timeout or error check, your script will loop forever while the instance continues accruing disk charges. Destroy the instance and retry with a different offer if you see these states. 8. Connect via SSH Once the instance is running, get the SSH connection details: vastai ssh-url INSTANCE_ID Then connect: ssh root@SSH_HOST -p SSH_PORT 9. Copy Data Use vastai copy to transfer files between your local machine and the instance: # Upload to instance vastai copy local:./data/ INSTANCE_ID:/workspace/data/ # Download from instance vastai copy INSTANCE_ID:/workspace/results/ local:./results/ You can also copy between instances or to/from cloud storage: # Instance to instance vastai copy INSTANCE_A:/workspace/ INSTANCE_B:/workspace/ # Cloud storage (requires a configured cloud connection) vastai copy s3.CONNECTION_ID:/bucket/data/ INSTANCE_ID:/workspace/ For cloud storage syncing and instance-to-instance transfers, see the data movement guide . 10. Clean Up When you’re done, destroy the instance to stop all billing. Alternatively, to pause an instance temporarily instead of destroying it, you can stop it. Stopping halts compute billing but disk storage charges continue. Destroy (removes everything): vastai destroy instance INSTANCE_ID Stop (pauses compute, disk charges continue): vastai stop instance INSTANCE_ID Next Steps You’ve now completed the full instance lifecycle through the CLI: installation, authentication, search, creation, polling, data transfer, and teardown. From here: SSH setup , See the SSH guide for key configuration and advanced connection options. Full command reference , Every CLI command is documented under the Reference tab, grouped by domain ( Accounts , Instances , Search , Serverless , and more). Use templates , Avoid repeating image and config parameters on every create call. See the templates guide for creating and managing templates. Suggest edits Raise issue Authentication ⌘ I x github youtube Powered by This documentation is built and hosted on Mintlify, a developer documentation platform Assistant Responses are generated using AI and may contain mistakes. Qwen3.5 27B - AI Model Library | Build on Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Model Library / Qwen3.5 27B Qwen3.5 27B LLM Reasoning Vision Language Dense 27B vision-language model with unified multimodal reasoning Qwen3.5 27B vllm Deploy Now ... On-Demand Dedicated 1 x RTX PRO 6000 S CLI Details Modalities text, vision Version 3.5 27B Recommended Hardware 1 x RTX PRO 6000 S Estimated Price Loading... Provider Alibaba Family Qwen Parameters 27 B Context 262144 tokens License apache-2.0 Qwen3.5 27B: Dense Vision-Language Reasoning Model Qwen3.5 27B is a dense multimodal foundation model from Alibaba's Qwen team, built on a hybrid Gated DeltaNet and Gated Attention architecture. With 27 billion parameters, it pairs strong text reasoning with native vision understanding through early fusion multimodal training, delivering competitive benchmark performance against much larger models while remaining practical to serve on single-node hardware. Key Features Unified Vision-Language Foundation - Early fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks Efficient Hybrid Architecture - Gated Delta Networks combined with Gated Attention deliver high-throughput inference with minimal latency overhead Scalable RL Generalization - Reinforcement learning scaled across million-agent environments with progressively complex task distributions for robust real-world adaptability Global Linguistic Coverage - Expanded support to 201 languages and dialects for inclusive worldwide deployment Long Context - 262,144 tokens natively, extensible up to 1,010,000 tokens with YaRN Architecture Causal Language Model with Vision Encoder 27B dense parameters 64 layers with a 16 × (3 × (Gated DeltaNet → FFN) → 1 × (Gated Attention → FFN)) hybrid layout Gated DeltaNet linear attention (48 V heads, 16 QK heads, head dim 128) Gated Attention (24 Q heads, 4 KV heads, head dim 256) Feed Forward Network intermediate dimension 17408 Multi-token prediction (MTP) trained with multi-steps Native 262K context, extensible to 1M tokens Use Cases Multimodal reasoning and visual question answering Document, chart, and diagram understanding Coding and software engineering agents Tool-using agent workflows across long horizons Multilingual chat and instruction following across 201 languages Long-context analysis and retrieval over large document sets Benchmarks On the Qwen3.5 benchmark suite ( source ), Qwen3.5 27B scores MMLU-Pro 86.1, MMLU-Redux 93.2, C-Eval 90.5, SuperGPQA 65.6, IFEval 95.0, GPQA Diamond 85.5, and LongBench v2 60.6 — outperforming the larger Qwen3-235B-A22B on several of these metrics while activating every parameter densely. Quick Start Guide Choose a model and click 'Deploy' above to find available GPUs recommended for this model. Rent your dedicated instance preconfigured with the model you've selected. Start sending requests to your model instance and getting responses right now. Related Models text vision Qwen3.5 35B A3B Efficient 35B MoE with 3B active params, unified vision-language reasoning text vision Qwen3.5 397B A17B Efficient multimodal reasoning model with hybrid DeltaNet-attention architecture text Qwen3 235B A22B Thinking 2507 Qwen3 thinking model Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. FAQ Overview - Vast.ai Documentation: Affordable GPU Cloud Marketplace Skip to main content Vast.ai Documentation: Affordable GPU Cloud Marketplace home page Search... ⌘ K Ask AI FAQ Discord Console Console Search... Navigation FAQ FAQ Overview Guides CLI & SDK API Host Examples Getting started Welcome Quickstart Concepts API keys Templates Teams Instances Overview Pricing Find & rent Connect Storage & data Manage Serverless Serverless Architecture Deployments Quickstart Scaling Behavior Workergroup Parameters Creating Custom PyWorkers Monitoring and Debug Pricing Pre-built Templates Templates Introduction Creating Templates Managing Templates Template Settings Advanced Setup Teams Overview Managing Your Team Teams Roles Legacy Teams Account & billing Pricing overview Account Settings Billing Keys Two-Factor Authentication Referral Program Troubleshooting FAQ Overview General Instances Rental Types Jupyter & SSH Billing Security Technical Networking FAQ FAQ Overview Copy page Find answers to common questions about Vast.ai Copy page Documentation Index Fetch the complete documentation index at: https://docs.vast.ai/llms.txt Use this file to discover all available pages before exploring further. Browse our frequently asked questions organized by topic. General Platform basics, advantages, and how Vast.ai works Instances Creating, managing, and configuring GPU instances Rental Types On-demand vs interruptible instances and pricing Jupyter & SSH Connecting to instances via Jupyter and SSH Billing Details about billing, pricing, and balance notifications Security Data protection and platform security Technical DLPerf scores, Docker, and advanced topics Networking Configuring port mapping on your templates Suggest edits Raise issue Troubleshooting General ⌘ I x github youtube Powered by This documentation is built and hosted on Mintlify, a developer documentation platform Assistant Responses are generated using AI and may contain mistakes. Enterprise | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Enterprise Solutions AI Infrastructure that Scales with You From hundreds to thousands of GPUs, Vast partners with enterprises to design secure, scalable AI infrastructure—on your terms. Get a Quote Start Building Why Enterprises Build on Vast Vast delivers the performance, control, and reliability that modern enterprises demand—without the complexity of traditional cloud providers. Case Studies Explore real-world use cases and success stories from Vast enterprise clients. Bulk Pricing for Scalable Workloads Custom pricing tiers and volume discounts for large-scale AI deployments. Compliance Ready Infrastructure Fully isolated GPU clusters built for regulated, high-compliance environments. Private Networking & Networked Storage Secure network paths, high-speed bandwidth, and scalable storage solutions. Premium White-Glove Support Hands-on onboarding and optimization, and dedicated 24/7 enterprise level support. Custom Software & Advanced Workflows Tailored environments and workflow support for complex ML/AI projects. Case Studies See How Companies are Building Faster with Vast 200,000+ Scaled from zero to 200K MAU within 12 months—no CapEx 8x4090 GPUs Massive multi-GPU clusters enable large-scale language model training. 80% Savings Multi-GPU ML workloads launched at a fraction of big-cloud pricing. Scaling Creativity with Vast Advancing Low-Resource AI Accelerating R&D with Vast “ Certain experiments wouldn't have been cost-effective anywhere else. Vast.ai truly enabled us to experiment at scale. ” Founder & CEO , AI Consultancy White-Glove Support at Scale From onboarding to production, Vast's enterprise support keeps your team moving fast and confidently. Request a Quote Flexible Support Tiers Start with standard support or upgrade to premium tiers for white-glove service and personalized account management. Onboarding & Optimization Work directly with Vast engineers to configure, launch, and continuously optimize your deployment for peak performance. Priority Response & Escalation Get guaranteed SLA-backed response times, with rapid escalation pathways for production-critical issues. Live Support & Expert Access Access live technical support when you need it, plus AI infrastructure specialists for architectural consulting and troubleshooting. SOC2 Certified Compliance Ready Infrastructure Isolated infrastructure and full data control—built for regulated industries and critical workloads. Search Secure Cloud Trust Center Compliance Single-Tenant Isolation Deploy on dedicated GPU hardware with zero shared compute, ensuring complete environment isolation and preventing cross-tenant risks. Data Sovereignty Fully control your data lifecycle with on-demand workload deletion and guaranteed removal of all residual data across the stack. Custom Security Configurations Enable optional private VPNs, persistent audit logging, and custom security profiles to meet HIPAA, GDPR, or regional compliance requirements. Full Access Environments Work directly in SSH, CLI, API, or Jupyter with root-level control over your GPU instances and compute environments. Bulk Pricing for Scalable Workloads The bigger your deployment, the more you save. Vast offers flexible pricing agreements for teams running large-scale, long-term workloads. Volume Discounts Negotiate custom pricing based on reserved capacity and usage tiers. Reserved GPU Contracts Lock in discounted rates with long-term reservations for critical workloads. Cost Transparency Upfront, clear pricing—no surprises or hidden fees. Flexible Commitments Adjust terms as your projects scale up or down. Talk to Sales See How It Works Watch how enterprises deploy and manage GPU workloads at scale on Vast.ai. Private Networking & Networked Storage For teams with advanced requirements, Vast offers deep infrastructure flexibility—including private networking, dedicated bandwidth, and customizable storage solutions. Private Networking & VPN Set up secure subnets, network paths, and VPN access for isolated deployments. Regional & Hybrid Configurations Choose specific server regions or blend Vast GPUs with in-house infrastructure. Custom Storage & Access Controls Scalable storage solutions with advanced user permissions. Dedicated Bandwidth Reserve high-speed connections to guarantee performance for critical workloads. Custom Software & Advanced Workflows Vast supports complicated use cases—from specialized machine learning frameworks to custom orchestration setups—letting your team build exactly what's needed. Tailor cluster configurations for niche AI/ML workflows Deploy proprietary or custom-built software stacks Integrate third-party tools seamlessly Support for unique orchestration and job scheduling requirements Build Boldly. Scale Confidently. Enterprise-grade security. Custom deployments. World-class support. Vast gives your team the GPU infrastructure you need—without compromise. Get Started Talk to Sales Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Press Releases | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Press Releases The latest press releases from Vast.ai Downloads Vast.AI Expands to San Francisco to Access Top Talent Market June 5, 2025 Vast.ai Achieves Security Milestone with SOC 2 Type I Certification April 10, 2025 Vast.ai Adds Virtual Machine Support, Expanding Access and Flexibility for the GPU Rental Platform December 10, 2024 VAST.AI Becomes First GPU Rental Marketplace To Offer AMD Support May 2, 2024 Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Terms of Service | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Vast.ai Terms of Use Agreement Version Date: November 10, 2025 This Terms of Use Agreement (“Agreement”) constitutes a legally binding agreement made between you, whether personally or on behalf of an entity or Authorized Users (collectively, “user” or “you”) and Vast.ai Inc. and its affiliated companies (collectively, “Company” or “we” or “us” or “our”), concerning your access to and use of the https://vast.ai website as well as any other media form, media channel, mobile website or mobile application related or connected thereto (collectively, the “Website”). “Authorized Users” means user and, to the extent applicable, user’s employees, consultants, contractors, and agents (i) who are authorized by user to access and use the Services under the rights granted to user under this Agreement and (ii) for whom access to the Services has been purchased pursuant to the terms hereof. The Website provides the following service: Rental by users of hosting services (“User”) from independent providers (“Provider”), and software services allowing a user to act as an independent provider and get paid for doing so (“Company Services”). Supplemental terms and conditions or documents that may be posted on the Website from time to time, are hereby expressly incorporated into this Agreement by reference. All Providers are independent contractors with respect to the Company and Users. Accordingly, the Company shall not be held responsible for the acts or omissions of any Provider outside of the scope of the agreement that the Company has entered into with such Provider. All engagement with the Services is through the Website. Generally, the Company does not permit direct engagement between Users and Providers. The Company may offer communication via Discord on or through the Website, but any such use of Discord is subject to its own terms and conditions. We do not monitor these communications and are not responsible for the content or tone of those communications. NOTICE: THIS AGREEMENT CONTAINS A MANDATORY ARBITRATION AGREEMENT. YOU AGREE THAT ANY CLAIMS YOU MAY HAVE AGAINST US RELATING TO THE WEBSITE OR THE COMPANY SERVICES, THIS AGREEMENT OR ANY TERMS AND CONDITIONS CONTAINED HEREIN MUST BE ARBITRATED, AND YOU EXPRESSLY WAIVE THE RIGHT TO (1) ASSERT CLAIMS AGAINST US IN COURT; (2) PARTICIPATE IN A REPRESENTATIVE OR CLASS ACTION; AND (3) HAVE A JURY HEAR YOUR CASE. YOU EXPRESSLY CONSENT TO HAVE ALL OF YOUR CLAIMS ARBITRATED ON AN INDIVIDUAL BASIS ONLY. IF YOU DO NOT AGREE TO BE SO BOUND, YOU MAY NOT ACCESS OR USE THE WEBSITE OR ANY COMPANY SERVICES. Company reserves the right, at its own discretion, to make changes or modifications by updating this Agreement from time to time. We will alert you regarding any changes by updating the “Last updated” date of this Agreement, and you waive any right to receive specific notice of each such change. You will be subject to, and will be deemed to have been made aware of, and to have accepted the changes to the Agreement by your continued use of the Website after the date such revised Agreement has been posted. Company makes no representation that the Website is appropriate or available in other locations other than where it is operated by Company. The information provided on the Website is not intended for distribution to or use by any person or entity in any jurisdiction or country where such distribution or use would be contrary to law or regulation or which would subject Company to any registration requirement within such jurisdiction or country. Accordingly, those persons who choose to access the Website from other locations do so on their own initiative and are solely responsible for compliance with local laws, if and to the extent local laws are applicable. All users who are minors in the jurisdiction in which they reside (generally under the age of 18) must have the permission of, and be directly supervised by, their parent or guardian to use the Website. If you are a minor, you must have your parent or guardian read and agree to this Agreement prior to you using the Website. Persons under the age of 13 are not permitted to register for the Website or use the Company Services. YOU ACCEPT AND AGREE TO BE BOUND BY THIS AGREEMENT BY ACKNOWLEDGING SUCH ACCEPTANCE DURING THE REGISTRATION PROCESS (IF APPLICABLE) AND ALSO BY CONTINUING TO USE THE WEBSITE. IF YOU DO NOT AGREE TO ABIDE BY THIS AGREEMENT, OR TO MODIFICATIONS THAT COMPANY MAY MAKE TO THIS AGREEMENT IN THE FUTURE, DO NOT USE OR ACCESS OR CONTINUE TO USE OR ACCESS THE COMPANY SERVICES OR THE WEBSITE. ACCOUNT; PURCHASES; PAYMENT Account Registration. If you create an account, you must provide us with complete and accurate information. You must promptly update such information to keep it complete and accurate. You are entirely responsible for maintaining the confidentiality of your password and account. You are entirely responsible for any and all activities that occur under your account, including, without limitation, in connection with all use by Authorized Users. You may not use anyone else’s account at any time. We may remove or reclaim your username if we believe it is appropriate (such as in response to a trademark claim). Security of Your Account. You agree to notify the Company immediately of any unauthorized use of your account or any other breach of security. We will not be liable for any loss, damages, liability, expenses or attorneys’ fees that you may incur as a result of someone else using your password or account, either with or without your knowledge, to the fullest extent permitted by applicable law. You will be liable for losses, damages, liability, expenses and attorneys’ fees incurred by Company or a third party due to someone else using your account. No Obligation to Retain a Record of Your Account. Company has no obligation to retain a record of your account or any data or information that you may have stored for your convenience by means of your account or the Company Services. Payment. Company bills you through an online billing account for purchases of products and/or services. You agree to pay Company all charges at the prices then in effect for the products you or other persons using your billing account may purchase, and you authorize Company to charge your chosen payment provider for any such purchases. You agree to make payment using that selected payment method. If you have ordered a product or service that is subject to recurring charges then you consent to our charging your payment method on a recurring basis, without requiring your prior approval from you for each recurring charge until such time as you cancel the applicable product or service. Company reserves the right to correct any errors or mistakes in pricing that it makes even if it has already requested or received payment. Sales tax will be added to the sales price of purchases as deemed required by Company. Company may change prices at any time. All payments shall be in U.S. dollars. Purchases. The Website sets Investor Inquiries | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Investor Inquiries If you are an investor and interested in learning more about Vast.ai, submit this form and we will get back to you with more information. Get in Touch Fill out the form below and our team will reach out. Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Compliance | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Our Commitment to Security and Compliance Vast.ai maintains rigorous security controls and compliance certifications to protect customer data. As a GPU compute marketplace serving AI startups, research universities, and Fortune 500 enterprises, we hold ourselves to the same standards our customers require. To discuss your compliance requirements, contact our sales team . Certifications & Standards SOC 2 Type 3 Our SOC 2 Type 3 report is available immediately upon request. Contact sales to obtain a copy. SOC 2 Type 2 Vast.ai has completed SOC 2 Type 2 certification. This audit, conducted by an independent third party, verified that our security, availability, and confidentiality controls meet AICPA Trust Services Criteria over a sustained observation period. The SOC 2 Type 2 report is available under a signed NDA — contact sales to request access. HIPAA Vast.ai supports HIPAA-covered workloads on our Secure Cloud tier. Technical safeguards — including data isolation, access controls, and audit logging — align with HIPAA requirements. Business Associate Agreements (BAAs) are available for qualifying customers. GDPR We comply with the General Data Protection Regulation for all European users. Our Data Processing Agreement details data handling, sub-processor disclosures, and data subject rights. EU-region compute is available on request. US Data Privacy Vast.ai complies with applicable US state privacy laws, including CCPA/CPRA. Our Privacy Policy outlines data collection, use, retention, and deletion practices. Platform Security Client Data Isolation Every workload runs in an unprivileged Docker container, isolated from other tenants Clients access only their own data — no shared filesystems between tenants Data is destroyed immediately when a client deletes an instance Network & Access Controls All API and console traffic is encrypted in transit via TLS 1.2+ Role-based access controls govern internal systems API key authentication for all programmatic access Monitoring & Incident Response Continuous monitoring for anomalous activity across the platform Documented incident response procedures with defined escalation paths Regular internal and third-party security audits Employee Security Background checks for all employees Security and compliance training at onboarding and annually thereafter Principle of least privilege applied to all internal access Security Tiers Vast.ai offers two security tiers to match your requirements: Verified Hosts Suitable for general-purpose AI and HPC workloads. Manually tested for reliability and performance Docker-level tenant isolation Cost-effective option for non-regulated workloads Secure Cloud (Trusted Datacenters) For regulated industries and enterprise security requirements. Filter for these offers on cloud.vast.ai by selecting "Secure Cloud (Only Trusted Datacenters)." Datacenter partner requirements: Equipment housed in professionally managed data center facilities Minimum 5 GPU servers with flagship-class hardware Signed Data Processing Agreements with Vast.ai Due diligence on facility security, ownership, and business identity Certifications held by datacenter partners may include: ISO 27001, ISO 20000-1, ISO 22301, ISO 14001 SOC 1 Type 2, SOC 2 Type 2, SOC 3 HIPAA, HITRUST, PCI DSS NIST frameworks GDPR compliance Security certifications such as ISO 27001 or SOC 2 are encouraged and strengthen a partner's application, but are not strictly required for certification. Physical & environmental security: Restricted facility access with biometric or badge authentication Video surveillance with 90+ day retention Fire detection and suppression systems Redundant power and climate control Annual testing of all environmental control systems Auditing & oversight: Vast.ai audits datacenter partners on ownership structure, identity, and source of funds Ongoing verification that partners maintain facility standards and follow best practices Legal & Contractual Protections Data Processing Agreement governs all data handling Privacy Policy details collection, use, and retention practices Terms of Service define platform obligations and customer rights Secure Cloud datacenter hosts sign expanded hosting agreements with additional DPA coverage Track Record Vast.ai has maintained a clean security record since launch in 2018 with no major incidents. Contact For compliance documentation, audit reports, or to discuss your security requirements: Email :
[email protected] Sales : Contact our sales team Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Lowest Cost, Autoscaling GPU Cloud | Serverless | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Serverless Lowest Cost, Autoscaling GPU Cloud on the Market Predictive optimization automatically and proactively identifies the best-performing hardware within Vast's industry-leading cloud infrastructure. Get Started Documentation Where GPU Cloud Meets Serverless Serverless access to Vast.ai's entire portfolio of GPUs, from consumer GPUs to high-performance clusters. Easy to Use SDK takes all management out of worker scaling. No tiers, no limits, no hidden surcharges. Transparent Pricing No tiers, no limits. Fully transparent with no surcharge for serverless. Access All Hardware Pick from consumer and enterprise GPUs and Vast.ai matches the right fleet for each workload. Flexible Regions Deploy to the ideal region to minimize latency and meet compliance. Serverless Key Features Automate the provisioning of GPU workers to match the dynamic computational needs of your workloads. This system ensures efficient and cost-effective scaling for AI inference and other GPU computing tasks. Get Started Documentation Dynamic Scaling Automatically scale your AI inference up or down based on customizable performance metrics. Global GPU Fleet Leverage Vast's global fleet of powerful, affordable GPUs for your computational needs. Fast Cold-Start Times Minimize cold-start times with a reserve pool of workers that can spin up in seconds. Metrics and Debugging Access ample metrics and debugging tools for your serverless usage, including logs and Jupyter/SSH access. Deploy from Python, Not the Dashboard Define your Docker image, packages, and autoscaling config entirely in code. The Vast SDK handles endpoint creation and management, no GUI required. Custom Worker Types Define custom worker types through CLI search filters and create commands, supporting multiple worker types per endpoint. What Does Vast.ai Stack Up? Features Vast.ai Typical Provider Pricing Tiers Vast.ai: One low price across all GPUs Typical: Expensive pro tiers & hidden fees Autoscaling Vast.ai: Predictive spin-up based on demand Typical: Laggy cold starts or manual scaling GPU Variety Vast.ai: 68+ types, 50+ filters Typical: Limited presets, low flexibility Global Reach Vast.ai: 500+ locations across all regions Typical: Mostly US-based, low international spread Latency & Compliance Vast.ai: Deploy close to users or meet regulations Typical: Few region choices Fault Tolerance Vast.ai: Distributed fleet reduces single-point risk Typical: Centralized infrastructure Debugging Tools Vast.ai: Logs, Jupyter, SSH included Typical: Limited or restricted access Cold Start Speed Vast.ai: Reserve workers minimize wait time Typical: Delays on every new job Private by Design. Secure by Default. Your Workloads. Your Data. Your Rules. Build without compromise on our Secure Cloud — from idea to deployment, your stack stays yours. Full Environment Control Launch isolated instances with direct SSH, CLI, and API access — no container sharing, no noisy neighbors. Compliance-Ready Deploy on SOC 2 Type I-certified environments built for healthcare, finance, and regulated industries. Data Sovereignty Delete models, data, and workloads when you choose — nothing persists without your command. Enterprise Security Features Enable private VPN access, optional audit trails, and enterprise-grade compliance support for complete operational security. Predictive Optimization Predicts load based on history and market benchmarking. Optimizes for cost and latency. Automatically orchestrates provisioning of GPU workers to match dynamic workloads. On-Demand GPU Deployment Spin up 4090s, A100s, H100s, and more — on your timeline, with no upfront negotiation or quotas. Flexible, Transparent Pricing Per-second billing with On-Demand, Interruptible, or Reserved pricing and a $5 minimum to get started. Secure Cloud Isolation Run workloads on dedicated infrastructure with full environment control and SOC 2 Type I compliance. Dev-First Interfaces Prefer code? Hit our lightweight CLI or API endpoints to provision fleets without ever opening our GUI dashboard. Up-to-date Templates Use official templates, remix thousands of community-built stacks, or start from scratch — with DLPerf scores helping you pick the right GPU. Support That Doesn't Sleep Get 24/7 help from real humans. Need more? Premium tiers include onboarding, architectural consults, and guaranteed response times. Unrestricted Selection & Control Bring your own model. Choose the exact machine specs you need. Automatically pull from a globally distributed fleet and wide spectrum of hardware types. Get Started Documentation “ We needed to enrich 100,000 documents every two hours using LLMs — something that was prohibitively expensive on other clouds. With Vast Serverless, we scaled up to 46 H100 servers on demand and completed the job in just 38 minutes, at 1/4th the cost. It enabled us to move to production with confidence. ” Anna Bosch — VP of Data Intelligence , Launchmetrics Frequently Asked Questions What is Vast.ai Serverless? How does Vast.ai Serverless pricing work? What GPUs are available on Vast.ai Serverless? How do I deploy a serverless endpoint on Vast.ai? Is my data secure on Vast.ai Serverless? From Zero to Compute in Seconds Skip the quotas, skip the contracts, skip the chaos. Leverage Vast's fleet when and where you need it. Get Started Documentation Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Fine Tune AI Models with On-Demand GPU Rentals | AI Training at Scale | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases AI Fine Tuning Improve AI performance through efficient, on-demand fine-tuning. Built for This Train and refine pre-trained models on your own datasets for better task-specific results. Use powerful GPUs to reduce training time and cost. Customize storage, RAM, and compute to fit your model size. Seamlessly deploy fine-tuned models for inference once training is complete. Start Building: AI Fine Tuning Templates Axolotl Streamlines fine-tuning of diverse AI models with flexible configuration and architecture support View All Templates Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Contact Sales | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Bulk Pricing Unlock Unparalleled GPU Power at a Fraction of the Cost Optimize your AI workflows with secure, scalable, and cost-effective solutions tailored for your business needs. Request Your Bulk Pricing Why Leading AI Teams Choose Vast.ai Vast.ai provides the lowest prices for On-Demand GPU among others. Cost Savings Achieve 3-5x savings compared to traditional providers like AWS, Azure, or Google Cloud. No upfront costs, only pay for what you use. Tailored GPU Solutions Access a vast platform with options for high-performance chips and cost-efficient alternatives to support LLM training, render graphics, and access virtual machines. Enterprise-Grade Security Work in a secure, high-quality environment designed to meet rigorous compliance and performance standards. Expert Support Get real-time assistance for technical challenges and complex deployments, ensuring seamless operations. “ Comparing with other vendors like Azure and Lambda, Vast.ai's pricing, server speed, customer service, and network are much better. ” Interaction Designer, Google Top Customer Use Cases AI Model Training and Testing Cost-efficient GPU rental for prototyping and fine-tuning models. Learn More Enterprise-Scale Deployments Scalable infrastructure to support large-scale AI initiatives without breaking the budget. Learn More Real-Time Inference High-performance GPUs for time-sensitive applications like autonomous systems and generative AI. Learn More Data Science Workloads Affordable and secure compute power for data preprocessing, analytics, and advanced modeling. Learn More The Vast.ai Difference Our unique platform model empowers teams to optimize spending by matching workloads with the perfect hardware, from high-performance chips for critical tasks to cost-effective solutions for testing and prototyping. Platform Efficiency Optimize costs with dynamic GPU allocation, ensuring you only pay for what you need. Flexible Scalability Scale workloads seamlessly, from small tests to large enterprise deployments, without compromising performance. Enterprise Focus Solutions built to meet the demands of security, compliance, and business-critical operations. Best Price-to-Performance Ratio Leverage cutting-edge chips with unbeatable pricing to maximize your ROI. See How Vast.ai Can Transform AI for You Our team is here to guide you through how Vast.ai can transform your AI initiatives. Get a tailored walkthrough of our platform and see how we deliver enterprise-grade solutions at startup-friendly costs. Request Bulk Pricing Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Vulnerability Disclosure Policy | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Vulnerability Disclosure Policy Purpose Vast.ai welcomes responsible security research and coordinated disclosure. This policy explains how to report potential vulnerabilities, what testers may expect from us, and the legal protections we extend without promising monetary rewards. Vulnerability Bounty Program While we do not guarantee monetary rewards for submissions, Vast.ai may, at its sole discretion, offer vulnerability bounties—monetary compensation—for high‑value reports. All rewards are discretionary and not guaranteed. Participation should not be driven solely by an expectation of payment. Scope In‑scope (highest priority) Out‑of‑scope vast.ai web console, REST API & billing flows Provider Daemon code (host agent) Match‑making & pricing engine Default Docker & KVM isolation on reference images GPU memory‑isolation / tenant breakout flaws User workloads & third‑party container images Social‑engineering, physical security, or denial‑of‑service (DoS) tests Brute‑force attacks against customer passwords or MFA Any activity that violates applicable law, exports regulations, or provider Terms of Service If you are unsure whether a target is in scope, ask first via the channels in the How to Report Section. Safe Harbor Research conducted in accordance with this policy is considered authorized activity. Vast.ai will not pursue civil or criminal action for accidental, good‑faith violations. We adopt the industry "Gold‑Standard Safe Harbor" language to protect good‑faith researchers. Safe harbor does not apply to actions on third‑party infrastructure (e.g., upstream data‑centers) that we do not control. Rules of Engagement Do no harm. Avoid privacy violations, service disruption, or destruction of data. Test with your own resources. Use your own account. Stop & report immediately if you encounter user data (PII, PHI, payment info, model checkpoints, etc.). No spam or extortion. Coordinate disclosure. Allow Vast.ai 90 days to remediate before public release, unless we agree to an earlier date. Our Commitments Action SLA Initial triage & severity rating within 5 business days Status updates at least every 30 days Resolution target (Critical) ≤ 30 days Resolution target (High) ≤ 60 days We will keep you informed and extend safe harbor. How to Report Email:
[email protected] Please include: summary, service affected, step‑by‑step reproduction, impact assessment, and any PoC code or screenshots. Preferred Report Quality Well‑written English reports with clear reproduction steps and minimal tools output accelerate triage. Proof‑of‑concept code is strongly encouraged. Out‑of‑Scope Findings Vast is not interested in theoretical or highly unlikely vulnerabilities, or findings with no demonstrable security impact. Examples of those include: Click‑jacking with no security impact SPF/DMARC misconfigurations of non‑vast.ai domains Use of outdated TLS ciphers on assets not serving sensitive data Best‑practice advice or recommendations without an exploitable vulnerability Version enumeration, banner disclosure or verbose error messages without proven risk Issues affecting only end-of-life or unsupported browsers/OSes Broken-link hijacking, tabnabbing, content-spoofing/text-injection without further impact Attacks that require physical access "Self-XSS" or "self-DoS" where the researcher can only exploit their own account CSRF on non-critical forms (e.g. logout) Permissive CORS with no exploit path CSV injection; open redirects (unless chained to a real attack) Legal Notice By participating, you acknowledge: You have read and will abide by this policy. Vast.ai may use any submitted information for vulnerability remediation. Vast.ai reserves all rights, including modification or termination of this policy at any time. Changes will be posted at least 30 days before taking effect. Version History Date Change 23 Jul 2025 Initial public release Questions? Email
[email protected] with the subject [VDP Question] and we will respond within two business days. Thank you for helping keep Vast.ai and our community secure. Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Data Center Application — Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Become a Certified Data Center Run 5+ GPU servers in a professionally managed facility? Apply for Certified Data Center status to earn a blue trust label, priority platform placement, and increased customer traffic. Blue trust label Priority placement Dedicated support Apply for Certification Tell us about your facility. We review applications within 2 business days. 1 2 3 Your Facility Facility Type * Select Own/leased data center Colocation Building out / planning Location(s) * Total GPU Servers (current) * Select 5–10 11–25 26–50 51–100 100+ GPU Models Deployed * B200 H200 H100 SXM H100 PCIe A100 80GB A100 40GB L40S RTX 5090 RTX 4090 RTX 3090 Other Additional Capacity Planned (next 6 months) Select None currently 1–10 servers 11–50 servers 50+ servers Continue Why Get Certified? Certified Data Centers receive visible trust signals and algorithmic advantages on the Vast platform. Blue Trust Label Your offers display a verified blue label on the platform, signaling professional-grade infrastructure to renters. Priority Placement Certified machines are auto-sorted higher in search results, so enterprise customers find you first. Increased Traffic We actively direct more user flow to certified hosts because of the quality and reliability of their equipment. Dedicated Support Get direct access to the Vast team for onboarding, troubleshooting, and operational support. What It Takes You provide GPU hardware in a professional environment. Vast.ai lists your machines with a blue trust label and prioritizes them on the platform. Clients rent your GPUs, and you earn based on usage. Professional Environment Servers housed in a professionally managed data center facility with reliable power, cooling, and network infrastructure. Minimum 5 GPU Servers At least 5 servers equipped with flagship-class GPU hardware (e.g. A100, H100, H200, RTX 4090/5090). Verified Business Equipment owned by a registered business. Owners must complete identity verification and sign the Data Center Hosting Agreement. Security Standards Certifications like ISO 27001, SOC 2, or equivalent are encouraged and will strengthen your application, but are not strictly required. Ready to Scale Your Hosting Operation? Explore financing to grow your GPU fleet, or source certified hardware through our vetted supplier network. Explore Financing Source Hardware Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Vast.ai | Console About Vast.ai — The Story Behind the GPU Cloud Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases We Built This So No One Owns the Future Alone Whoever controls compute controls AI. Vast.ai exists to keep that power distributed — a market operating system for the agentic economy, where AI agents autonomously procure and optimize their own compute. The Idea That Started Everything In 2010, Jake Cannell — ML engineer, GPU programmer, and compulsive theorist — began publishing essays on LessWrong arguing an unconventional thesis: intelligence is fundamentally a function of compute. Not clever algorithms. Not hand-engineered modules. Compute. His 2015 essay The Brain as a Universal Learning Machine laid out the complete framework — predicting AlphaGo two years before it happened and forecasting human-level vision and language via scaled deep learning within a window that would prove accurate. Christian Horne — a fellow thinker and builder who also published on LessWrong — shared Jake's view that the compute scaling thesis had profound implications, not just for AI development, but for who would control it. Both saw the same thing: if whoever controlled the most compute controlled the most powerful AI, then the future of artificial general intelligence would be determined by who had the deepest pockets, not who had the best ideas. On June 28, 2016, they incorporated Vast.ai. The founding thesis fit on a napkin: the world was full of underutilized GPU hardware — in gaming rigs, mining farms, research labs, and small data centers — and the people who needed that compute most couldn't afford the hyperscaler rates. But the motivation was never purely commercial. A world where compute flows freely to thousands of independent researchers is a fundamentally different world than one where it is locked behind the pricing walls of a few incumbents. Jake Cannell CEO & Co-founder Watch the founder's story “A world where compute flows freely to thousands of independent researchers is a fundamentally different world than one where it is locked behind the pricing walls of AWS, GCP, and Azure.” — Jake Cannell, 2016 Timeline of a Thesis What Jake predicted. What the team built. How the field caught up. Thesis Validated 2010–2014 The scaling thesis takes shape Jake Cannell publishes a series of essays on LessWrong arguing that intelligence is fundamentally a function of compute — not clever algorithms or hand-engineered modules. Christian Horne (lahwran), a fellow LessWrong contributor, shares the same conviction. The two become collaborators. Thesis Validated 2012 AlexNet validates the hypothesis AlexNet breaks ImageNet benchmarks by scaling a known neural network architecture on GPUs — exactly as the scaling hypothesis predicted. The deep learning revolution begins. Thesis Validated June 2015 The Brain as a Universal Learning Machine Jake publishes his landmark essay arguing that the human brain is a single, general-purpose learning algorithm — not a zoo of specialized circuits. He predicts AlphaGo two years before it happens and forecasts human-level vision (~2024±3) and language via scaled deep learning. June 28, 2016 Vast.ai incorporated Jake Cannell and Christian Horne incorporate Vast.ai as a Delaware C Corporation. The founding thesis: the world is full of underutilized GPU hardware, and the people who need that compute most can’t afford hyperscaler rates. The market needs a two-sided platform. 2016–2017 Building in the dark For two years, Jake and Christian build the platform end-to-end: host onboarding, search interface, pricing engine, Docker-based instance management — engineered to work across heterogeneous hardware and wildly different network conditions. September 2018 Launch day Vast.ai launches — not with a press release, but the way honest products launch: to friends, family, and a post on Reddit. GPU compute 3–5x cheaper than AWS, available in seconds, no enterprise contract required. 2019 First hosts, first traction Early independent hosts join the platform. The platform concept is validated — developers get cheaper GPUs, hosts monetize idle hardware. Growth begins compounding rapidly. Thesis Validated 2021 The predictions land CLIP and GPT-3 arrive at roughly the vision and language capability levels Jake predicted in 2015. The scaling thesis — once unconventional — is validated at industry scale. April 2022 Travis Cannell joins as COO Rapid growth demands operational infrastructure to match. Travis Cannell joins to bring the systems, process, and leadership discipline required to scale a startup into an institution. 2022–2023 Enterprise GPU expansion Enterprise customers discover Vast.ai. The platform launches Secure Cloud — certified data center partners in professionally managed facilities — alongside serverless inference and dedicated cluster products. 2024 310% growth year Vast.ai’s biggest year. The platform scales to a new tier as demand for GPU compute surges across the AI industry. 310% growth · 17,000+ GPUs · 350+ hosts July 2024 Los Angeles headquarters opens Vast.ai opens its first physical office in Los Angeles — formalizing what had been a fully distributed team since inception. 2025 40+ employees, two offices, SOC 2 certified San Francisco office opens, establishing a second hub in the heart of the AI ecosystem. Vast.ai achieves SOC 2 Type I certification. The infrastructure that makes AI open is now an institution. Why It Matters Jake's model of AI risk is not the Hollywood version. He doesn't lose sleep over a rogue superintelligence suddenly seizing control. His concern is subtler and, in many ways, more tractable: that the empirical process by which alignment gets solved — many labs, many approaches, competitive pressure producing diversity — breaks down because compute becomes too expensive or too centralized for anyone outside a handful of incumbents to participate. The cure for that risk is access. When GPUs became accessible to graduate students and small labs — not just Google and Bell Labs — the field exploded. The researchers who will find the alignment approaches that actually work, the architectures that generalize correctly, the training paradigms that produce safe and capable systems — many of them don't work at the big labs. They work at universities, at startups, at independent research organizations, at companies you haven't heard of yet. They need compute. Vast.ai exists to make sure they can get it. Our Vision To make life substrate-independent through Vast Artificial Intelligence. Our Mission To organize, optimize, and orient the world's computation. The Team Jake Cannell CEO & Co-founder Visit Jake Cannell 's LinkedIn profile Travis Cannell COO Visit Travis Cannell 's LinkedIn profile Scott Darden Director of Engineering Aida Atagian Allison Corley Anthony Benjamin Edgar Lin Guthrie Lonergan Senior Product Manager JC Park Liam Weldon Lindsey Longeretta Talent Acquisition Lead Marco Hernandez Ryan Barry Sammy Javed Dangerous workloads. Unstoppable rewards. Feel the pull of Deploy and Scale AI Agents with Vast.ai's Cost-Efficient GPU Compute Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases AI Agents Deploy and scale AI agents with Vast.ai's cost-efficient GPU compute. Built for This Run the frameworks you already use. Deploy agent stacks built with LangChain / Langflow, AutoGen, CrewAI, or your own code on Vast.ai's fully integrated GPU cloud. Scale agent workloads seamlessly from a single node to distributed clusters. Iterate fast without overspend. Pay per second while real-time utilization dashboards surface GPU, CPU, and cost metrics so you can tune performance, not guess it. Preserve your setup as a reusable template. Lock in the exact Docker image, libraries, CUDA, and driver versions your agents need. Run open source equivalents 90%+ cheaper per token than OpenAI or Anthropic API. Switch easily with OpenAI-compatible chat endpoints. Models text vision Gemma 4 31B IT Gemma 4 31B dense vision-language model by Google with 256K context and thinking mode text vision Kimi K2.6 Kimi K2.6 is an open-source, native multimodal agentic MoE model from Moonshot AI with 1T total parameters, 32B activated, advancing long-horizon coding, coding-driven design, and swarm-based task orchestration text vision Qwen3.5 27B Dense 27B vision-language model with unified multimodal reasoning View More Models Start Building: AI Agents Templates Langflow + Ollama Visual programming for LLM workflows with integrated Ollama backend View All Templates Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Hosting Overview - Vast.ai Documentation: Affordable GPU Cloud Marketplace Skip to main content Vast.ai Documentation: Affordable GPU Cloud Marketplace home page Search... ⌘ K Ask AI FAQ Discord Console Console Search... Navigation Concepts Hosting Overview Guides CLI & SDK API Host Examples Concepts Hosting Overview Understanding Verification Earning Guides Verification Stages How to Self-Test Datacenter Status Host Payouts Tax Guide for Hosts VMs CLI cleanup machine delete machine list machine list machines show machine show machines set defjob remove defjob set min-bid schedule maint cancel maint show maints unlist machine defrag machines self-test machine SDK cleanup_machine delete_machine list_machine list_machines show_machine show_machines set_defjob remove_defjob set_min_bid schedule_maint cancel_maint show_maints unlist_machine defrag_machines self_test_machine On this page Account setup and hosting agreement Machine setup General concepts Offers and Rental Contracts The Rental Contract Offer End Date How the offer becomes a rental contract Can I change the offer? Example: Multiple rental contracts from different offers Min GPU On-demand Price Interruptible min price (optional) Reserved Discount Pricing Factor Volume Offers Storage Usage Listing Volumes Out of Sync Rental Contracts Extending Rental Contracts Testing your own machine Setup a separate client account Use the CLI (preferred) Maintenance Uninstalling Common Questions How do I host my machine(s) on Vast? How can I rent my PC? How do I get an invoice? How do I check if my machine is listed? Can you verify my machine? How does verification work? How do I gain datacenter status? How do I uninstall vast from my machine? Help I am getting this error on my machine? Why is my machine not listed? Can I send a message to a client using my machine letting them know that I fixed an issue that they were having? I fear I will decrease my reliability from restarting my machine and potentially lose my verification. How much can I make hosting on Vast? Why did the reliability on my machine decrease? How do I minimize my reliability dropping? If someone has already used an image on my machine does redownload happen or is the system smart? My storage for clients is somehow full. I just have a few jobs stored in my server and most of them are old and didn’t delete once the job finished. A lot of them are really old, can I remove them to free up some space? I can’t find my machine? Why can’t I see my machine on the Search page in the console? Concepts Hosting Overview Copy page Copy page Documentation Index Fetch the complete documentation index at: https://docs.vast.ai/llms.txt Use this file to discover all available pages before exploring further. Vast is a GPU marketplace. Hosts sell GPU resources on the marketplace. Hosts are responsible for: Setup: installing Ubuntu, creating disk partitions, installing NVIDIA drivers, opening network ports on the router and installing the Vast hosting software. Testing and troubleshooting all issues that can arise, such as driver conflicts, errors, bad GPUs, and bad network ports. Vast does not offer support for getting your machine working. Connect your Vast host account to our Discord to access our host-only discord channels to chat or seek help from other hosts on the platform. Managing your offers, including pricing and offer end dates Planning maintenance so that no active rental contracts are disrupted Account setup and hosting agreement You must create a new account for hosting. If you are using Vast.ai as a client, do not use the same account. A single client and hosting account is not supported and you will quickly run into issues. Once your account is created, open the host setup guide . There is a link in the first paragraph to the hosting agreement. Read through the agreement. Once you accept, your account will then be converted to a hosting account. You will notice there is now a link to Machines in the navigation, along with some other changes. Your account can now list machines that are running the daemon software. Machine setup The host setup guide is the official documentation for setting up a machine on Vast.ai. Read through each section closely. Common issues to check: Make sure to test the networking. Clients require open ports to directly connect to the machine for most jobs. Make sure to read the section on IOMMU if you have an AMD EPYC system. Make sure to disable auto-updates so that your machine doesn’t drop a client job to update a driver. Once you are ready to list your machine, come back to this guide to understand pricing and the rental contract lifecycle. General concepts Clients have high expectations coming from AWS or GCP. As a host, plan to offer 100% uptime for your machine during the rental period. Expect that the GPU is going to be used at close to max capacity for the rental period. Ensure that your Internet, power source and heat dissipation systems are all functioning and that you have thought through how hosting will affect each one of those items. Offers and Rental Contracts Hosts can create offers (sometimes called listings) through the CLI command list machine or the machine control panel GUI on the host machines page. The main offer parameters include: the pricing for GPUs,internet,storage the discount schedule param which determines the price difference between on-demand and reserved instances the min bid price for interruptible instances the min_gpu param controlling ‘slicing’ (explained below) the offer end date, which determines how long the offer accepts new rental contracts The offer accepts new rentals until the offer end date. When a client rents an instance on your machine, a rental contract is created from your offer. If your machine has multiple GPUs and you’ve set min_gpu to allow slicing, multiple clients can rent from the same offer, each creating their own independent rental contract. Once clients rent your machine, it is very important to honor the terms of each rental contract until its rental end date. The Rental Contract By listing your machine, you create an offer visible to potential clients. A rental contract is created each time a client accepts your offer by renting an instance. The rental contract locks in all of the offer’s terms at the time of rental, including pricing, the rental end date, and hardware specs, and those terms cannot be changed afterward. Each rental contract is independent, you may have multiple active rental contracts from different clients on the same machine, each with its own rental end date and pricing. As the host, you are committing to provide the services as advertised in your offer: the host must provide the hardware/services according to all the advertised specs the hardware can not be used for any other purposes the client’s data must be isolated and protected according to the data protection policy the advertised services must be provided until each rental contract’s rental end date For full details, see the hosting agreement . Offer End Date The offer end date is your commitment to how long you will keep the machine online and fully functional. The pricing you set is the rate you’ll be paid for that commitment. Together, these form the terms of your offer. You can set the offer end date in the hosting interface by clicking on the date field under expiration, or via the end date field in the CLI list machine command. Make sure to set an offer end date before listing your machine, or the offer will remain open indefinitely. How the offer becomes a rental contract When a client rents an instance from your offer, all of the offer’s terms at that moment, the offer end date, the pricing, and the hardware specs, are locked into a rental contract . The offer end date becomes the contract’s rental end date (shown in the UI as “client end date”), and the pricing becomes the contract’s rate. Once created, a rental contract’s terms cannot be changed, no Developers — Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Build on Vast.ai CLI, Python SDK, and REST API — everything you need to provision GPU compute programmatically. Install CLI Read Docs CLI Search, deploy, and manage GPU instances from your terminal. One pip install, zero dependencies. Learn more Python SDK Programmatic access to the Vast.ai platform. pip install vastai. Learn more REST API Direct HTTP access to every Vast.ai operation. OpenAPI spec available. Learn more Get started in seconds CLI pip install vastai vastai set api-key YOUR_API_KEY vastai search offers 'gpu_name=H100_SXM num_gpus>=4' Python SDK pip install vastai from vastai import VastAI vast = VastAI(api_key="YOUR_API_KEY") vast.search_offers(query='gpu_name=H100_SXM num_gpus>=4') Start building Get your API key and deploy your first GPU in 60 seconds. Get API Key Read Docs Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Press Kit — Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Press Kit Everything journalists, analysts, and partners need to cover Vast.ai. For press inquiries, contact
[email protected] Company Overview Vast.ai is the world's largest GPU compute platform. Founded in 2016, the platform connects AI researchers, startups, and enterprises with affordable GPU infrastructure from a distributed network of 350+ independent hosts across 40+ data centers. Unlike hyperscalers, Vast.ai operates as a two-sided platform where prices are set by supply and demand — delivering compute at a fraction of AWS, GCP, and Azure rates. The company was founded on the thesis that whoever controls compute controls AI — and that power should stay distributed. Vast.ai supports on-demand, interruptible, and reserved GPU instances, as well as serverless inference, dedicated clusters, and Secure Cloud offerings for enterprise customers. Founded June 28, 2016 Headquarters Los Angeles, CA Second Office San Francisco, CA Employees 40+ GPUs on Platform 17,000+ Independent Hosts 350+ Data Centers 40+ Certifications SOC 2 Type I Leadership Jake Cannell CEO & Co-founder ML engineer, GPU programmer, and AI theorist who published influential essays on LessWrong predicting the scaling revolution years before the field caught up. Founded Vast.ai in 2016 to democratize GPU compute and prevent dangerous concentration of AI infrastructure. LinkedIn Travis Cannell COO Joined Vast.ai in 2022 to build the operational infrastructure needed to scale a GPU cloud platform from startup to institution. Oversees business operations, partnerships, and growth. LinkedIn Brand Assets Download official logos and executive photos. Use them as provided — do not modify, recolor, or distort. Logo Package Horizontal, vertical, icon, and wordmark variants in SVG and PNG. Light and dark versions included. ZIP · 15 KB Download Executive Headshots High-resolution headshots of CEO Jake Cannell and COO Travis Cannell for media use. ZIP · 1 MB Download Logo Preview Press Releases View All Vast.AI Expands to San Francisco to Access Top Talent Market Berkeley-Founded AI Cloud Pioneer Opens Prime SOMA Office, Tapping Bay Area's Elite Talent Pool for New Product Initiatives as AI talent wars intensify June 5, 2025 Vast.ai Achieves Security Milestone with SOC 2 Type I Certification Cloud computing marketplace strengthens security posture April 10, 2025 Vast.ai Adds Virtual Machine Support, Expanding Access and Flexibility for the GPU Rental Platform Vast.ai launches Virtual Machine support, giving AI and ML teams greater flexibility, security, and faster workflows on its GPU rental platform. December 10, 2024 VAST.AI Becomes First GPU Rental Marketplace To Offer AMD Support Announces 265% YoY Growth as GPU as a Service (GPUaaS) Market Revenue Soars May 2, 2024 View All Press Releases Media Inquiries For interviews, comments, or additional materials, reach us at:
[email protected] Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. GPU Hardware — Vast Hardware | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Source GPUs That Are Ready to Earn New and certified refurbished GPU servers from our vetted supplier network. Every unit is stress-tested, benchmark-verified, and pre-configured for the Vast platform — so your hardware starts earning from day one. New & refurbished Burn-in tested Platform-ready How It Works Submit once. We source from multiple suppliers and come back with options. 1 . Tell Us What You Need Share your GPU requirements, preferred condition, quantity, and timeline. We'll match you with the right suppliers. 2 . We Source & Test Our team sources options from multiple suppliers — new and refurbished. Every GPU passes our certification pipeline before it reaches you. 3 . Deploy & Earn Receive Vast Certified hardware, install our client, and start hosting. Your GPUs are platform-ready from day one. Why Source Through Vast Your GPU hardware procurement partner — not a reseller, a trusted intermediary with supplier relationships across new and refurbished channels. Volume Pricing Our purchasing volume and supplier relationships translate to pricing you can't get sourcing on your own. The more you buy, the better the rates. Platform-Ready Hardware arrives pre-configured for the Vast platform. Install our client, list your machines, and start earning revenue from day one. Vast Certified Every unit goes through our GPU verification pipeline. You receive a certification report with burn-in, VRAM, benchmark, and thermal results. Financing Integration Pair hardware procurement with Vast Finance to scale faster. Your platform earnings can even help you qualify for better terms. What You Get: Vast Certified Every unit passes our GPU verification pipeline before it ships. You receive a certification report with every order. Burn-in Testing 72-hour stress test under full load VRAM Validation Complete memory check, no defects Benchmark Scoring Performance verified against reference Thermal Profiling Cooling confirmed under sustained load Get a Quote Tell us what you need. We'll source options from multiple suppliers and come back with a quote within 1–2 business days. 1 2 Hardware Requirements GPU Model(s) * B200 H200 High Demand H100 SXM High Demand H100 PCIe A100 80GB A100 40GB L40S RTX 5090 Limited Supply RTX 4090 RTX 3090 Other Quantity * Select 1–4 GPUs 5–16 GPUs 17–64 GPUs 64+ GPUs Condition Preference * New Certified Refurbished Either / Best Value Form Factor Select Complete Server Bare GPUs Either Not sure Timeline * When do you need hardware? ASAP (under 2 weeks) 1–3 months 3–6 months Just exploring Primary Use Case Select AI/ML Training Inference Rendering HPC / Scientific General Compute Other Existing Vast Host? Select Yes, currently hosting Planning to host No Budget Range Select Under $25,000 $25,000 – $100,000 $100,000 – $500,000 $500,000+ Prefer not to say Continue Ready to Build Your GPU Fleet? Source hardware through our vetted supplier network, or explore financing to scale faster. Explore Financing Start Hosting Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. GPU Cloud CLI — Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases DEVELOPER TOOLS GPU Cloud from Your Terminal Search thousands of GPUs, deploy Docker containers, and manage instances — all from the command line. No UI, no waitlist, no contracts. Get API Key View Docs vast-terminal Install in seconds pip install vastai vastai set api-key YOUR_API_KEY Requires Python 3.9+. Also available via pipx. How it works Step 1 Search the Platform Filter by GPU model, VRAM, price, reliability, location, and 20+ fields. Sort by performance-per-dollar. $ vastai search offers 'gpu_name=H100_SXM num_gpus>=4 reliability>0.99' -o 'dph' --limit 5 ID GPU Num VRAM $/hr DLPerf Location 847291 H100_SXM 4 320GB $4.12 98.7 US-East 851003 H100_SXM 4 320GB $4.28 97.2 EU-West 849177 H100_SXM 8 640GB $7.84 99.1 US-West Step 2 Deploy Your Container Launch any Docker image on bare-metal GPUs. Pass environment variables, expose ports, mount volumes. SSH access included. $ vastai create instance 847291 \ --image nvcr.io/nvidia/pytorch:24.01-py3 \ --disk 64 --ssh --direct {"success": true, "new_contract": 9841205} Step 3 Monitor & Manage Check status, stream logs, execute commands, and destroy instances when done. Script your entire workflow. $ vastai show instances ID GPU Status Image Cost/hr 9841205 4x H100_SXM running nvcr.io/nvidia/pytorch:24.01-py3 $4.12 $ vastai logs 9841205 --tail 5 [14:02:45] Model loaded. 70B params, 4-bit quantized. [14:02:46] Server ready on :8080 $ vastai destroy instance 9841205 20+ Filter Fields GPU model, VRAM, price, reliability, location, bandwidth, compute capability, and more. Any Docker Image Public or private registries. PyTorch, TensorFlow, vLLM, custom images — anything. Spot Pricing Bid on interruptible instances. RTX 4090 from $0.14/hr. SSH & Jupyter Direct SSH access or launch Jupyter notebooks with --jupyter. File Transfer vastai copy moves data between instances or to/from local. Scripting-First JSON output with --raw. Pipe to jq. Automate everything. GPUs from $0.14/hr (spot) and $0.50/hr (on-demand) Thousands of GPUs available across 30+ data centers worldwide. View Pricing Deploy your first GPU in 60 seconds No contracts, no minimums. Get your API key and start building. Get API Key Read Full Docs Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Creatix Technology Scales to 200K Daily Users with Vast.ai's GPU Cloud Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Case Studies Creatix Technology Creatix Technology Scales to 200K Daily Users with Vast.ai's GPU Cloud How a fast-growing AI app company cut infrastructure costs by over 60% and powered millions of new users with Vast.ai. 60%+ Ongoing Cost Reduction 5x+ Increased Daily User Support Speed Higher throughput and shorter processing times Flexibility Optimized for performance and bandwidth Overview Challenge Solution Results Why Creatix Stands Out Looking Ahead About Creatix Technology Overview Industry: Tech Key area: Object removal, Background segmentation, and Video enhancement Creatix Technology is an artificial intelligence company developing the next generation of creative tools for everyday users. Founded with the mission to bring advanced technology closer to everyone, Creatix has grown into a global team specializing in AI-driven mobile apps, custom model development, and cloud-based AI services. Their flagship consumer apps — Magic Eraser and AI Video Editor — empower millions of users to remove objects, enhance footage, and create polished content with ease. Together, these apps have achieved over 10 million downloads and serve more than 200,000 daily active users across iOS and Android. Behind the scenes, Creatix runs GPU-accelerated training and inference workloads on Vast.ai, combining on-premise servers with Vast's distributed GPU cloud to scale efficiently and affordably Challenge As Creatix's user base and feature set expanded, so did their compute demands. Originally hosted on Google Cloud Platform, the company faced escalating GPU costs that made it difficult to experiment with and deploy new AI features profitably. Their core features — including object removal, background segmentation, and video enhancement — required continuous access to high-end GPUs. Even consumer-class GPUs were cost-prohibitive on traditional clouds, limiting their ability to scale to new features or handle large inference volumes. To sustain their rapid growth, Creatix needed a way to reduce infrastructure spend without compromising performance or reliability . Solution After evaluating multiple platforms, Creatix turned to Vast.ai, which offered access to high-performance consumer GPUs like the RTX 4090, 5090, and H100 at a fraction of the cost of major clouds. They built a hybrid infrastructure that leverages: Vast.ai's GPU cloud for scalable, high-bandwidth inference workloads On-premise servers for continuous 24/7 operations A microservice-based architecture connected via RESTful APIs to route traffic seamlessly between environments This setup allows Creatix to train, fine-tune, and deploy AI models quickly while keeping costs predictable and performance strong. "By tapping into Vast.ai's cloud of high-performance consumer GPUs, Creatix achieved data-center-level performance at a fraction of the ongoing cost — reducing monthly infrastructure expenses by more than half." See How Vast.ai can Transform AI for You Discover how Vast.ai powers AI innovation. Our team will guide you through a tailored platform walkthrough and show how we combine enterprise-level capabilities with startup-friendly pricing. Contact Sales Team Results 60%+ Ongoing Cost Reduction Migrating workloads to Vast.ai cut compute expenses by over half, saving $5,000–$10,000 every month compared to GCP. These recurring savings compound into tens of thousands annually , freeing up budget for R&D and product innovation. Faster Inference & Lower Latency RTX 4090 GPUs on Vast.ai deliver higher throughput and shorter processing times, improving end-user experience across Magic Eraser and AI Video Editor. Explosive Growth Since adopting Vast.ai, Creatix scaled from roughly 40,000 to over 200,000 daily users , supporting 30,000+ new installs each day . Operational Flexibility Their hybrid model enables workloads to shift dynamically between on-premise servers and Vast.ai instances, optimizing for performance and bandwidth "We save five to ten thousand dollars every month using Vast.ai — and that adds up fast. The performance is better, the costs are lower, and it lets us keep scaling without limits." Giang , Founder of Creatix Technology Why Creatix Stands Out Creatix isn't just an app developer — it's a full-stack AI company combining consumer software with enterprise-grade innovation. Beyond its hit apps, Creatix provides AI consulting, analytics, and custom model development services for partners in Japan, the U.S., and Australia. Their Figma Translate Plugin, powered by Google Gemini AI, exemplifies their engineering range — translating design text automatically while preserving layout. Together, these projects showcase Creatix's ability to deliver both mass-market AI products and bespoke business solutions with equal excellence. By leveraging Vast.ai, Creatix proved that a lean, independent AI team can scale globally and profitably — without enterprise-scale budgets. "Vast.ai gives us stronger GPUs at a cheaper price. We save thousands while getting better performance — which lets us serve more users and keep innovating." Looking Ahead Creatix plans to expand its use of Vast.ai for training new AI models and to adopt Vast's upcoming serverless GPU platform, which will allow programmatic scaling based on live user demand. The company also intends to grow its outsourcing and AI consulting business, helping global clients develop and deploy custom AI services backed by its proven infrastructure and expertise. About Creatix Technology Creatix Technology builds innovative AI-powered products that help millions of users around the world create, edit, and translate faster. The company develops both consumer applications — including Magic Eraser, AI Video Editor, and Figma Translate Plugin — and enterprise services such as custom AI model development, mobile app design , and AI service deployment . Mission: Bringing advanced technology closer to everyone. Website: creatixtechnology.com Start Building AI Without Limits Join thousands of developers and enterprises running AI workloads on Vast. Get Started PAICON PAICON Accelerates Global, Data-Centric Cancer Diagnostics with Vast.ai How a global oncology data platform used Vast.ai’s GPU cloud to rapidly iterate on Athena—validating that diversity can matter more than scale—while significantly reducing research-phase training costs. Medical AI View Case Study Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. GPU Pricing — Live Platform Rates | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases GPU Pricing — Live Platform Rates Prices set by supply and demand across 40+ data centers. On-demand, interruptible, or reserved — find the right GPU at the right price. Browse GPUs Calculate Cost Live GPU Prices Real-time pricing from across the Vast.ai platform. Click any card for detailed specs and history. Availability: High (120+) Medium (40–119) Low (<40) 30 D 90 D 180 D Choose Your Instance Type Three pricing tiers to match your workload and budget. On-Demand Most Popular Guaranteed uptime. Best for production. Per-second billing No interruptions Spin up/down anytime Browse On-Demand Interruptible Best Value 50%+ cheaper. Best for batch training. Preemptible — may be reclaimed Ideal for fault-tolerant workloads Checkpoint and resume easily Browse Interruptible Reserved Up to 50% Off Long-term commitment. Best for steady workloads. 1, 3, or 6 month terms Guaranteed capacity Volume discounts available Contact Sales Pricing Calculator Estimate your GPU costs by model, count, and usage hours. Flagship GPU Pricing Live price distribution across the platform. Updated in real time. Why Vast Pricing Works A GPU platform means real competition, real transparency, and structurally lower prices. Contact Sales Team Supply & Demand Pricing Prices are set by the market, not by Vast. More supply means lower prices — and you always see the real rate. Per-Second Billing Pay only for what you use, down to the second. No minimum hours, no rounding up, no surprise charges. 68+ GPU Types From RTX 3060 to B200 — find exactly the right GPU for your workload and budget across the full spectrum. No Lock-In No long-term contracts required. Scale up, scale down, or switch GPU types anytime without penalties. Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Vast.ai Documentation - Affordable GPU Cloud Marketplace Skip to main content Vast.ai Documentation: Affordable GPU Cloud Marketplace home page Search... ⌘ K Ask AI FAQ Discord Console Console Search... Navigation Getting started Welcome Guides CLI & SDK API Host Examples Getting started Welcome Quickstart Concepts API keys Templates Teams Instances Overview Pricing Find & rent Connect Storage & data Manage Serverless Serverless Architecture Deployments Quickstart Scaling Behavior Workergroup Parameters Creating Custom PyWorkers Monitoring and Debug Pricing Pre-built Templates Templates Introduction Creating Templates Managing Templates Template Settings Advanced Setup Teams Overview Managing Your Team Teams Roles Legacy Teams Account & billing Pricing overview Account Settings Billing Keys Two-Factor Authentication Referral Program Troubleshooting FAQ Overview General Instances Rental Types Jupyter & SSH Billing Security Technical Networking On this page How It Works What are you looking for? Mission Talk to Us Getting started Welcome Copy page Copy page Documentation Index Fetch the complete documentation index at: https://docs.vast.ai/llms.txt Use this file to discover all available pages before exploring further. Vast.ai is a marketplace for affordable GPU cloud computing. We make it easy for anyone to: Spin up GPU instances in seconds at competitive prices . Scale across thousands of GPUs from Secure Cloud datacenters or community providers. Launch prebuilt or custom templates with one click. How It Works Vast.ai connects compute providers, from hobbyists to Tier-4 datacenters, with users who need GPUs. Our search engine lets you filter by GPU type, RAM, CPU, bandwidth, and more, while providers retain full control over pricing and contracts . What are you looking for? Rent your first GPU 5-minute walkthrough: search the marketplace, launch an instance, SSH in. Automate with the CLI or SDK vastai from the shell or Python. Same operations, different syntax. Pricing On-demand, reserved, interruptible, and serverless rates. Connect to instances SSH, Jupyter, or the web portal. Build a template Package your environment so any GPU can run it with one click. See example workloads PyTorch, vLLM, ComfyUI, agents, and more. Real deployments you can copy. Run serverless workloads Deploy auto-scaling endpoints for inference, batch jobs, or APIs. Host GPUs and earn List your machines on the marketplace, set prices, get paid. Mission Vast.ai’s mission is to align and democratize AI. Machine learning is progressing towards powerful AI systems with the potential to radically reshape our future. We believe it is imperative that this awesome power be distributed widely; that its benefits accrue to the many rather than the few; that its secrets are unlocked for the good of all humanity. Towards these ends we work to ensure that the compute powering AI is supplied by the people and for the people. Talk to Us Support Chat → Available 24/7 in the bottom-right corner of our console . Email →
[email protected] Discord → Join our community for help and discussions. Documentation → Browse our instances guide , templates documentation , or troubleshooting tips . Suggest edits Raise issue Quickstart ⌘ I x github youtube Powered by This documentation is built and hosted on Mintlify, a developer documentation platform Assistant Responses are generated using AI and may contain mistakes. Case Studies | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Case Studies Creatix Technology Creatix Technology Scales to 200K Daily Users with Vast.ai's GPU Cloud How a fast-growing AI app company cut infrastructure costs by over 60% and powered millions of new users with Vast.ai. Tech View Case Study PAICON PAICON Accelerates Global, Data-Centric Cancer Diagnostics with Vast.ai How a global oncology data platform used Vast.ai’s GPU cloud to rapidly iterate on Athena—validating that diversity can matter more than scale—while significantly reducing research-phase training costs. Medical AI View Case Study Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Quick 3D Rendering with Cloud GPUs, Real-Time & Batch | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Graphics Rendering Quickly render detailed 3D visuals with powerful GPU acceleration. Built for This Render detailed 3D models quickly using GPU acceleration. Reduce frame times and scene processing for complex animations or visualizations. Access professional-grade GPUs without long-term contracts. Support workflows for VFX, product design, architecture, and more. Related Guides Blender in the Cloud Blender Batch Rendering Introduction Start Building: Graphics Rendering Templates Blender Batch Renderer Automates rendering either full animations or a specified frame across batches of .blend files View All Templates Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. AI Model Library for Training & Deployment | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Model Library Choose from ready-to-go models -- Or bring your own custom model. Deploy Your Own All Image Generation Video Generation Audio Generation Text Generation Computer Vision Featured Models text vision Kimi K2.6 Kimi K2.6 is an open-source, native multimodal agentic MoE model from Moonshot AI with 1T total parameters, 32B activated, advancing long-horizon coding, coding-driven design, and swarm-based task orchestration text vision Qwen3.6 35B A3B Agentic coding MoE with hybrid Gated DeltaNet and vision support text vision Gemma 4 31B IT Gemma 4 31B dense vision-language model by Google with 256K context and thinking mode text vision Qwen3.5 27B Dense 27B vision-language model with unified multimodal reasoning video LTX-2.3 LTX-2.3 is a DiT-based audio-video foundation model with improved quality and prompt adherence for synchronized video and audio generation image FLUX.2 [dev] Rectified flow transformer capable of generating, editing and combining images based on text instructions Deploy Your Own Looking for PyTorch? Vast.ai has thousands of templates to get started immediately, or you can create your own template specifically for your needs. Explore or start building your own template. Deploy Your Own Image Generation Create stunning images from text prompts with state-of-the-art AI models View All FLUX.2 [dev] Rectified flow transformer capable of generating, editing and combining images based on text instructions FLUX.1 [dev] Rectified flow transformer capable of generating images from text descriptions HiDream I1 Full Open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds Juggernaut XI v11 Amazing prompt adherence with massively improved aesthetics and enhanced text generation capability Video Generation Generate and edit videos using advanced AI video models View All LTX-2.3 LTX-2.3 is a DiT-based audio-video foundation model with improved quality and prompt adherence for synchronized video and audio generation LTX-2 LTX-2 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model LTX Video LTX-Video is the first DiT-based video generation model capable of generating high-quality videos in real-time Mochi 1 Preview Mochi 1 preview is an open state-of-the-art video generation model with high-fidelity motion and strong prompt adherence in preliminary evaluation. Audio Generation Synthesize, enhance, and process audio with AI-powered tools ACE Step V1 3.5B ACE-Step is a novel open-source foundation model for music generation that overcomes key limitations of existing approaches through a holistic architectural design Dia 1.6B Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control Text Generation Large language models for text generation, analysis, and understanding View All Kimi K2.6 Kimi K2.6 is an open-source, native multimodal agentic MoE model from Moonshot AI with 1T total parameters, 32B activated, advancing long-horizon coding, coding-driven design, and swarm-based task orchestration Qwen3.6 35B A3B Agentic coding MoE with hybrid Gated DeltaNet and vision support Gemma 4 31B IT Gemma 4 31B dense vision-language model by Google with 256K context and thinking mode Qwen3.5 27B Dense 27B vision-language model with unified multimodal reasoning Computer Vision Advanced vision models for image analysis, recognition, and understanding View All Kimi K2.6 Kimi K2.6 is an open-source, native multimodal agentic MoE model from Moonshot AI with 1T total parameters, 32B activated, advancing long-horizon coding, coding-driven design, and swarm-based task orchestration Qwen3.6 35B A3B Agentic coding MoE with hybrid Gated DeltaNet and vision support Gemma 4 31B IT Gemma 4 31B dense vision-language model by Google with 256K context and thinking mode Qwen3.5 27B Dense 27B vision-language model with unified multimodal reasoning Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. API Reference - Vast.ai Documentation: Affordable GPU Cloud Marketplace Skip to main content Vast.ai Documentation: Affordable GPU Cloud Marketplace home page Search... ⌘ K Ask AI FAQ Discord Console Console Search... Navigation Overview API Reference Guides CLI & SDK API Host Examples Overview API Introduction Authentication Permissions Two-Factor Authentication Endpoints Rate Limits and Errors Common workflows Instances Templates Endpoints Instances Machines Accounts Serverless Team Search Templates Volumes Billing On this page Quickstart Reference sections Base URL Overview API Reference Copy page REST API for managing GPU instances, machines, templates, volumes, serverless endpoints, and billing on Vast.ai. Copy page Documentation Index Fetch the complete documentation index at: https://docs.vast.ai/llms.txt Use this file to discover all available pages before exploring further. The Vast.ai REST API gives you programmatic control over the entire platform. It’s the foundation that the CLI and SDK are built on. The raw REST API is intended for advanced users only. Most users will have a better experience with the CLI or SDK , which handle authentication, retries, and request shape for you. Reach for the API directly when you need maximum flexibility, are integrating from a non-Python language, or are building tooling on top of Vast. Quickstart New to the API? Start with the API Hello World , it walks through the full instance lifecycle (authenticate, search, rent, connect, clean up) using only curl . Reference sections Section What’s covered Authentication Bearer-token auth, API key generation and management Permissions Scoped API keys, role-based access Rate limits & errors Per-endpoint limits, error codes, retry guidance Creating instances Search-and-rent flow, configuration options Templates Template fields, creation, instance launch Endpoints Full OpenAPI reference for every endpoint Base URL https://console.vast.ai/api/v0 All endpoints require Authorization: Bearer $VAST_API_KEY . Get your key from the Keys page . Suggest edits Raise issue API Authentication ⌘ I x github youtube Powered by This documentation is built and hosted on Mintlify, a developer documentation platform Assistant Responses are generated using AI and may contain mistakes. Privacy Policy | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Privacy Policy Data Processing Agreement Vast.ai Version Date: June 18, 2025 Vast.ai Inc. (“Company” or “we” or “us” or “our”) respects the privacy of its users (“user” or “you”) that use our website located at https://vast.ai, including other media forms, media channels, mobile website or mobile application related or connected thereto (collectively, the “Website”). The following Company privacy policy (“Privacy Policy”) is designed to inform you, as a user of the Website, about the types of information that Company may gather about or collect from you in connection with your use of the Website. It also is intended to explain the conditions under which Company uses and discloses that information, and your rights in relation to that information. Changes to this Privacy Policy are discussed at the end of this document. Each time you use the Website, however, the current version of this Privacy Policy will apply. Accordingly, each time you use the Website you should check the date of this Privacy Policy (which appears at the beginning of this document) and review any changes since the last time you used the Website. The Website is hosted in the United States of America and is subject to U.S. state and federal law. If you are accessing our Website from other jurisdictions, please be advised that you are transferring your personal information to us in the United States, and by using our Website, you consent to that transfer and use of your personal information in accordance with this Privacy Policy. You also agree to abide by the applicable laws of applicable states and U.S. federal law concerning your use of the Website and your agreements with us. Any persons accessing our Website from any jurisdiction with laws or regulations governing the use of the Internet, including personal data collection, use and disclosure, different from those of the jurisdictions mentioned above may only use the Website in a manner lawful in their jurisdiction. If your use of the Website would be unlawful in your jurisdiction, please do not use the Website. BY USING OR ACCESSING THE WEBSITE, YOU ARE ACCEPTING THE PRACTICES DESCRIBED IN THIS PRIVACY POLICY. GATHERING, USE AND DISCLOSURE OF NON-PERSONALLY-IDENTIFYING INFORMATION Users of the Website Generally “Non-Personally-Identifying Information” is information that, without the aid of additional information, cannot be directly associated with a specific person. “Personally-Identifying Information,” by contrast, is information such as a name or email address that, without more, can be directly associated with a specific person. Like most website operators, Company gathers from users of the Website Non-Personally- Identifying Information of the sort that Web br/owsers, depending on their settings, may make available. That information includes the user’s Internet Protocol (IP) address, operating system, br/owser type and the locations of the websites the user views right before arriving at, while navigating and immediately after leaving the Website. Although such information is not Personally-Identifying Information, it may be possible for Company to determine from an IP address a user’s Internet service provider and the geographic location of the visitor’s point of connectivity as well as other statistical usage data. Company analyzes Non-Personally-Identifying Information gathered from users of the Website to help Company better understand how the Website is being used. By identifying patterns and trends in usage, Company is able to better design the Website to improve users’ experiences, both in terms of content and ease of use. From time to time, Company may also release the Non-Personally-Identifying Information gathered from Website users in the aggregate, such as by publishing a report on trends in the usage of the Website. Web Cookies A “Web Cookie” is a string of information which assigns you a unique identification that a website stores on a user’s computer, and that the user’s br/owser provides to the website each time the user submits a query to the website. We use cookies on the Website to keep track of services you have used, to record registration information regarding your login name and password, to record your user preferences, to keep you logged into the Website and to facilitate purchase procedures. Company also uses Web Cookies to track the pages that users visit during each Website session, both to help Company improve users’ experiences and to help Company understand how the Website is being used. As with other Non- Personally-Identifying Information gathered from users of the Website, Company analyzes and discloses in aggregated form information gathered using Web Cookies, so as to help Company, its partners and others better understand how the Website is being used. COMPANY USERS WHO DO NOT WISH TO HAVE WEB COOKIES PLACED ON THEIR COMPUTERS SHOULD SET THEIR BR/OWSERS TO REFUSE WEB COOKIES BEFORE ACCESSING THE WEBSITE, WITH THE UNDERSTANDING THAT CERTAIN FEATURES OF THE WEBSITE MAY NOT FUNCTION PROPERLY WITHOUT THE AID OF WEB COOKIES. WEBSITE USERS WHO REFUSE WEB COOKIES ASSUME ALL RESPONSIBILITY FOR ANY RESULTING LOSS OF FUNCTIONALITY. When you visit or log in to our website, cookies and similar technologies may be used by our online data partners or vendors to associate these activities with other personal information they or others have about you, including your email. We may then send communications and marketing to these emails. You may opt out of receiving this advertising by visiting (https://app.retention.com/optout). Web Beacons A “Web Beacon” is an object that is embedded in a web page or email that is usually invisible to the user and allows website operators to check whether a user has viewed a particular web page or an email. Company may use Web Beacons on the Website and in emails to count users who have visited particular pages, viewed emails and to deliver co-br/anded services. Web Beacons are not used to access users’ Personally-Identifying Information. They are a technique Company may use to compile aggregated statistics about Website usage. Web Beacons collect only a limited set of information, including a Web Cookie number, time and date of a page or email view and a description of the page or email on which the Web Beacon resides. You may not decline Web Beacons. However, they can be rendered ineffective by declining all Web Cookies or modifying your br/owser setting to notify you each time a Web Cookie is tendered, permitting you to accept or decline Web Cookies on an individual basis. Analytics We may use third-party vendors, including Google, who use first-party cookies (such as the Google Analytics cookie) and third-party cookies (such as the DoubleClick cookie) together to inform, optimize and serve ads based on your past activity on the Website, including Google Analytics for Display Advertising. The information collected may be used to, among other things, analyze and track data, determine the popularity of certa Run PyTorch, TensorFlow on GPU Rentals | AI/ML Frameworks | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases AI/ML Frameworks Execute leading frameworks rapidly on scalable GPU infrastructure. Built for This Run popular ML frameworks like TensorFlow and PyTorch on hardware you choose and control. Deploy across single or multiple nodes with support for distributed training. Lock in the exact CUDA and NVIDIA-driver versions your code needs. Accelerate GPU performance using hardware counters for advanced tuning. Related Guides PyTorch on Vast.ai Start Building: AI/ML Frameworks Templates PyTorch Deep learning framework TensorFlow End-to-end platform for ML. View All Templates Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Blog | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Blog All Posts GPU Industry NVIDIA AI Vast.ai Named Among Fastest Growing Vendors by Ramp and Brex March 4, 2026 All-in-One App Studio: Your Complete Creative AI Toolkit in a Single Container May 13, 2026 May 2026 Product Update May 12, 2026 TurboQuant Explained: How It Reduces LLM Memory by 5x and Speeds Up Inference May 4, 2026 Deploy LLM Inference Using Vast.ai Serverless April 20, 2026 Mistral Small 4 Just Dropped — Run It on Affordable H200s with Vast.ai April 7, 2026 April 2026 Product Update April 2, 2026 Next Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. Gemma 4 31B IT - AI Model Library | Build on Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Model Library / Gemma 4 31B IT Gemma 4 31B IT LLM Vision Language Chat Reasoning Gemma 4 31B dense vision-language model by Google with 256K context and thinking mode Gemma 4 31B IT vllm Deploy Now ... On-Demand Dedicated 1 x H200 CLI Details Modalities text, vision Recommended Hardware 1 x H200 Estimated Price Loading... Provider google Family gemma Parameters 31 B Context 262144 tokens License apache-2.0 Gemma 4 31B IT: Dense Vision-Language Model Gemma 4 is Google DeepMind's next-generation family of open multimodal models. The 31B variant is the dense flagship of the family, built to deliver frontier-level reasoning, coding, and multimodal understanding on consumer GPUs and workstations. It natively handles text and image input, supports a 256K context window, and covers 140+ languages. Key Features Dense 31B Architecture - 30.7B-parameter dense transformer targeting the highest-quality end of the Gemma 4 family. Hybrid Attention - Interleaves sliding window (local) and full global attention layers, with unified Keys and Values on global layers and Proportional RoPE (p-RoPE) for efficient long-context processing. Reasoning / Thinking Mode - Built-in configurable thinking mode lets the model reason step-by-step before answering. Multimodal - Native text and image understanding with variable aspect ratio and resolution support; video analysis via frame sequences. Function Calling - Native structured tool use with a custom tool-call protocol for agentic workflows. Long Context - 256K token context window for document analysis, long-form reasoning, and agent trajectories. Multilingual - Out-of-the-box support for 35+ languages, pre-trained on 140+. Native System Prompts - First-class support for the system role. Use Cases Document and PDF parsing, OCR (including multilingual and handwriting) Chart, diagram, and screen/UI understanding Long-context reasoning and summarization Code generation, completion, and correction Agentic workflows with structured function calling Visual question answering and image analysis Multilingual chat and translation Architecture Gemma 4 31B IT is a 60-layer dense transformer with a 1024-token sliding window on local attention layers and unified Keys/Values on global layers, paired with a ~550M parameter vision encoder. The final layer is always global, ensuring deep awareness for long-context tasks while local layers keep the memory footprint manageable. Benchmarks Instruction-tuned results reported by Google DeepMind (selected): MMLU Pro: 85.2% AIME 2026 (no tools): 89.2% LiveCodeBench v6: 80.0% Codeforces ELO: 2150 GPQA Diamond: 84.3% Tau2 (average over 3): 76.9% HLE (no tools): 19.5% HLE (with search): 26.5% BigBench Extra Hard: 74.4% MMMLU: 88.4% MMMU Pro (vision): 76.9% MATH-Vision: 85.6% MedXPertQA MM: 61.3% MRCR v2 8-needle 128K: 66.4% For full benchmark tables and model family comparisons, see the model card on HuggingFace . Quick Start Guide Choose a model and click 'Deploy' above to find available GPUs recommended for this model. Rent your dedicated instance preconfigured with the model you've selected. Start sending requests to your model instance and getting responses right now. Related Models text vision Gemma 4 26B A4B IT Gemma 4 26B A4B MoE vision-language model by Google with 256K context and thinking mode text vision Qwen3.5 27B Dense 27B vision-language model with unified multimodal reasoning Subscribe for our product updates. → © 2026 Vast.ai. All rights reserved. Products GPU Cloud Clusters Hosting Developers CLI Python SDK API Reference Documentation Resources Enterprise Startup Program Pricing Use Cases Docs FAQs Press Kit Community Discord GitHub Twitter YouTube Contact Get in Touch Contact Sales Investor Inquiries Legal Terms of Service Privacy Policy Compliance Vulnerability Disclosure Data Processing © 2026 Vast.ai. All rights reserved. GPU Use Cases for AI, ML, Inference & More | Vast.ai Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases AI/ML Frameworks AI Text Generation AI Image + Video Generation AI Agents Batch Data Processing Audio-to-Text Transcription AI Fine Tuning Virtual Computing GPU Programming Graphics Rendering Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Contact Sales Console Contact Sales Console Developers Quickstart CLI Python SDK API Docs Pricing Products GPU Cloud Clusters Serverless Model Library Hosting Hosting Data Centers Financing Hardware Docs Use Cases All Use Cases AI Agents AI Fine Tuning AI Image + Video Generation AI Text Generation AI/ML Frameworks Audio-to-Text Transcription Batch Data Processing GPU Programming Graphics Rendering Virtual Computing Company About Blog Careers Enterprise Case Studies Startup Program FAQ Press Releases Use Cases You Just Need Imagination Whatever AI workload you're running — training, tuning, rendering, or transcribing, Vast helps you spin it up fast, flexibly, and affordably. Pay per second and scale without limits. Get Started Spin up a 4090 for Under $5 Build with the Vast Stack Run open-source models with total control and none of the platform markup. Vast gives you direct access to high-performance GPUs at the lowest market rates, with prebuilt templates that make deployment effortless. Explore Templates Open-Source by Default Use open-source frameworks like vLLM, TGI, or WebUI — no proprietary licenses or usage fees. Templates That Jus
Showing first 200,000 of 228,902 chars · Full corpus: output/vast-ai/full-text.txt