Bringing BERT into modernity via both architecture changes and scaling
A curated collection of skills for AI coding agents
Stable Diffusion web UI
Omnilingual ASR: Open-Source Multilingual Speech Recognition
Open source codebase for Scale Agentex
Diffusion Transformer with Fine-Grained Chinese Understanding
Graph Neural Network Library for PyTorch
Models and examples built with TensorFlow
Large Multimodal Models for Video Understanding and Editing
OCR expert VLM powered by Hunyuan's native multimodal architecture
Multimodal Diffusion with Representation Alignment
Generate high-definition story short videos with one click using AI
Tools for merging pretrained large language models
Handwritten Text Recognition (HTR) system implemented with TensorFlow
From nobody to large language model (LLM) hero
Large Language Model Principles and Practice Tutorial from Scratch
Definitions for AI/ML tasks like dataset creation
Collection of reference environments for offline reinforcement learning
LLM training in simple, raw C/CUDA
Implementation of "MobileCLIP" CVPR 2024
LLM-based Reinforcement Learning audio edit model
Developer AI Persona Search Agent
Implementation of 'lightweight' GAN, proposed in ICLR 2021
Learning agent trained in a diffusion world model
Large Audio Language Model built for natural interactions