Standalone, small, language-neutral
The ultimate RAG for your monorepo
Tiny vision language model
Easy token price estimates for 400+ LLMs. TokenOps
Build a large language model from scratch with only a Python foundation
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
New family of code large language models (LLMs)
Language-model investigation agent with a terminal UI
A Python Automated Machine Learning tool that optimizes ML
Qwen3-Coder is the code version of Qwen3
A modular graph-based Retrieval-Augmented Generation (RAG) system
GPU-accelerated decision optimization
Model Context Protocol tool support for LangChain
Simple, Pythonic building blocks to evaluate LLM applications
Universal LLM Deployment Engine with ML Compilation
High-performance inference framework for large language models
The agent that grows with you
Chat with your SQL database
Official Repo for ICML 2024 paper
Library for building type-safe natural language interfaces with LLMs
A text-to-speech, speech-to-text and speech-to-speech library
Interact with your documents using the power of GPT
Composable building blocks to build Llama Apps
Chat with your documents using local AI
Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon