DB-GPT-Hub is an open-source repository of datasets, models, and training tools for improving large language models on database interaction tasks, particularly Text-to-SQL. As a specialized extension of the broader DB-GPT ecosystem, it focuses on preparing and evaluating models that translate natural language questions into structured database queries. The project provides a modular framework covering data preparation, model fine-tuning, benchmarking, and inference for Text-to-SQL systems. Bundled datasets and experiment configurations let researchers train models on real database schemas and evaluate them against standardized benchmarks, and the design encourages experimentation with different large language models and fine-tuning techniques, including parameter-efficient training approaches.
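As a hedged illustration of the data-preparation step, a Text-to-SQL training record typically pairs a serialized database schema and a natural language question with the target SQL query. The helper names and prompt template below are hypothetical and chosen for illustration; the exact format used by DB-GPT-Hub may differ:

```python
# Illustrative sketch of Text-to-SQL training-data preparation.
# The function names and prompt template here are assumptions for
# illustration, not DB-GPT-Hub's actual format.

def serialize_schema(tables: dict[str, list[str]]) -> str:
    """Flatten {table: [columns]} into a compact schema string."""
    return " | ".join(
        f"{table}({', '.join(columns)})" for table, columns in tables.items()
    )

def build_example(tables: dict[str, list[str]], question: str, sql: str) -> dict:
    """Build one instruction-tuning record: prompt in, SQL out."""
    prompt = (
        "Translate the question into SQL.\n"
        f"Schema: {serialize_schema(tables)}\n"
        f"Question: {question}"
    )
    return {"input": prompt, "output": sql}

example = build_example(
    {"singer": ["singer_id", "name", "age"]},
    "How many singers are there?",
    "SELECT count(*) FROM singer",
)
print(example["input"])
print(example["output"])
```

Serializing the schema into the prompt gives the model the table and column names it needs to ground the generated SQL in the actual database structure.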
Features
- Repository of datasets and models for Text-to-SQL research
- Framework for fine-tuning large language models on database tasks
- Benchmarking environment for evaluating Text-to-SQL performance
- Tools for preparing training data and database schema datasets
- Support for experimentation with multiple LLM architectures
- Integration with the broader DB-GPT ecosystem for data applications
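To give a concrete sense of the benchmarking step listed above, the sketch below computes exact-match accuracy between predicted and gold SQL after light normalization. This is a simplified, assumed metric for illustration; standard Text-to-SQL benchmarks such as Spider also measure execution accuracy, which requires running queries against the database:

```python
# Hedged sketch of a Text-to-SQL evaluation metric (exact match).
# This is a simplified illustration, not DB-GPT-Hub's actual
# evaluation code.
import re

def normalize_sql(sql: str) -> str:
    """Lowercase and collapse whitespace so that pure formatting
    differences are not counted as errors."""
    return re.sub(r"\s+", " ", sql.strip().lower())

def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that match the gold SQL exactly
    after normalization."""
    matches = sum(
        normalize_sql(p) == normalize_sql(r)
        for p, r in zip(predictions, references)
    )
    return matches / len(references)

preds = ["SELECT count(*)  FROM singer", "SELECT name FROM singer"]
golds = ["select count(*) from singer", "SELECT age FROM singer"]
print(exact_match_accuracy(preds, golds))  # 0.5
```

Exact match is a strict metric: semantically equivalent queries written differently (e.g. reordered conditions) count as wrong, which is why execution-based evaluation is usually reported alongside it.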