Fal.ai
Developer API platform for running image, video, and audio models (Flux, SDXL, Whisper, and more) at low latency. Popular as a serverless GPU layer for multimodal AI apps, with Python and JavaScript client libraries and pay-per-use pricing.
Replicate
Cloud platform for running thousands of open-source ML models through a hosted API. Supports LLMs as well as image-generation, audio, and video models.