Showing 4 open source projects for "python ids"

  • 1
    minbpe

    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm

    minbpe is a minimal, clean implementation of byte-level Byte Pair Encoding (BPE), the tokenization approach widely used in modern language models. It operates on UTF-8 encoded bytes rather than Unicode characters, which makes it robust to arbitrary text inputs and avoids needing a language-specific character vocabulary. The repository is structured as a teaching-oriented implementation that shows how to train a tokenizer by learning merge rules, then apply those merges to encode text into...
    Downloads: 0 This Week
  • 2
    Tiktoken

    tiktoken is a fast BPE tokeniser for use with OpenAI's models

    ...Internally, it includes the core tokenizer logic (implemented in Rust), APIs for encoding, decoding, and counting tokens, and Python bindings for easy use.
    Downloads: 7 This Week
  • 3
    MiniOneRec

    Minimal reproduction of OneRec

    MiniOneRec is an open-source framework designed to explore generative approaches to recommendation systems using large language model architectures. Traditional recommender systems typically rely on large embedding tables and ranking models, but MiniOneRec adopts a generative paradigm in which items are represented as sequences of semantic identifiers generated by autoregressive models. The framework provides an end-to-end pipeline for building generative recommender systems, including...
    Downloads: 0 This Week
  • 4
    PRM800K

    800,000 step-level correctness labels on LLM solutions to MATH problems

    PRM800K is a process supervision dataset accompanying the paper Let’s Verify Step by Step, providing 800,000 step-level correctness labels on model-generated solutions to problems from the MATH dataset. The repository releases the raw labels and the labeler instructions used in two project phases, enabling researchers to study how human raters graded intermediate reasoning. Data are stored as newline-delimited JSON (JSONL) files tracked with Git LFS, where each line is a full solution sample that...
    Downloads: 0 This Week
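The minbpe entry above describes training a byte-level BPE tokenizer by learning merge rules and then applying them to encode text. A minimal sketch of that idea follows. It is an illustration written for this listing, not code from the minbpe repository; the function names (`train`, `encode`, `decode`, `merge`, `count_pairs`) are hypothetical, and details such as tie-breaking between equally frequent pairs are assumptions.

```python
from collections import Counter


def count_pairs(ids):
    """Count how often each adjacent pair of token ids occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to `num_merges` merge rules from the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))  # start from raw bytes, ids 0..255
    merges = {}  # (left_id, right_id) -> new_id, kept in training order
    for _ in range(num_merges):
        pairs = count_pairs(ids)
        if not pairs:
            break  # nothing left to merge
        pair = max(pairs, key=pairs.get)  # most frequent adjacent pair
        new_id = 256 + len(merges)  # new ids start after the 256 byte values
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """Apply learned merges, earliest-learned rule first."""
    ids = list(text.encode("utf-8"))
    while len(ids) >= 2:
        pairs = count_pairs(ids)
        # Pick the pair whose merge rule was learned earliest.
        pair = min(pairs, key=lambda p: merges.get(p, float("inf")))
        if pair not in merges:
            break  # no applicable rule remains
        ids = merge(ids, pair, merges[pair])
    return ids


def decode(ids, merges):
    """Map token ids back to bytes, then to text."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():  # training order
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")
```

For example, `merges = train("aaabdaaabac", 3)` learns three rules, and `decode(encode("aaabdaaabac", merges), merges)` returns the original string. Because the base vocabulary is the 256 byte values, text never seen in training still encodes without an unknown-token fallback.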