Showing 209 open source projects for "parallel"

  • 1
    CAMPARI

    Software for molecular simulations and trajectory analysis

    We are proud to introduce version 5 of CAMPARI. We have added a number of new features, most notably a Python interface for interpreting user-supplied code (with the help of ForPy), a novel trajectory storage standard (with the help of libpqxx/PostgreSQL), and a module for performing transition path theory. Naturally, CAMPARI continues to provide the reference implementation of the ABSINTH force field paradigm and implicit solvation model. CAMPARI is a joint package for performing and...
    Downloads: 9 This Week
  • 2
    FFTW++ is a C++ header class for the FFTW Fast Fourier Transform library that automates memory allocation, alignment, planning, wisdom, and communication on both serial and parallel (OpenMP/MPI) architectures. In 2D and 3D, hybrid dealiasing of convolutions substantially reduces memory usage and computation time. Wrappers for C, Python, and Fortran are included.
    Downloads: 8 This Week
  • 3
    TensorHouse

    A collection of reference Jupyter notebooks and demo AI/ML applications

    TensorHouse is a scalable reinforcement learning (RL) platform that focuses on high-throughput experience generation and distributed training. It is designed to efficiently train agents across multiple environments and compute resources. TensorHouse enables flexible experiment management, making it suitable for large-scale RL experiments in both research and applied settings.
    Downloads: 6 This Week
  • 4
    HunyuanVideo-I2V

    A Customizable Image-to-Video Model based on HunyuanVideo

    HunyuanVideo-I2V is a customizable image-to-video generation framework developed by Tencent, extending the capabilities of HunyuanVideo. It creates high-quality video from still images using PyTorch, and provides pre-trained model weights, inference code, and customizable training options. The system includes LoRA training code for adding special effects and enhancing video realism, aiming to offer versatile and scalable solutions for generating videos from static image inputs.
    Downloads: 5 This Week
  • 5
    pgapack, the parallel genetic algorithm library, is a powerful genetic algorithm library by D. Levine, Mathematics and Computer Science Division, Argonne National Laboratory. The library is written in C. PGAPy wraps this library for use with Python.
    Downloads: 0 This Week
  • 6

    GromacsProSuite

    Graphical User Interface for Gromacs

    ...The software automates tasks such as topology generation, solvation, ion addition, minimization, equilibration, and production runs while executing GROMACS commands in the background. Built-in monitoring tracks CPU, RAM, and disk usage to ensure stable performance during parallel processing. Beyond simulation execution, it includes advanced trajectory processing and analysis tools such as RMSD, RMSF, SASA, clustering, PCA, hydrogen-bond analysis, Ramachandran plots, and FEL mapping. With integrated visualization and plotting utilities, it offers a unified platform for researchers, educators, and students to perform complete MD workflows efficiently and reproducibly. ...
    Downloads: 4 This Week
  • 7
    Functionary

    Chat language model that can use tools and interpret the results

    ...Function definitions are typically provided in JSON schema format, allowing the model to generate structured function calls compatible with modern tool-calling interfaces used in AI applications. Functionary can decide whether to execute tools sequentially or in parallel and can analyze the outputs of those tools to produce context-aware responses. This capability allows AI systems to interact with external services, APIs, or computation engines rather than relying solely on knowledge embedded in the model.
    Downloads: 0 This Week
  • 8
    YiVal

    Your Automatic Prompt Engineering Assistant for GenAI Applications

    YiVal is an open-source framework designed to automate prompt engineering and evaluation workflows for generative AI applications, enabling developers to systematically improve the performance of large language models. It focuses on experimentation and optimization by allowing users to test multiple prompt variations, configurations, and model parameters in parallel, then evaluate their outputs using structured metrics and scoring systems. The platform is particularly useful in production environments where prompt quality directly impacts user experience, as it provides a repeatable and data-driven approach to refining prompts rather than relying on manual trial and error. YiVal supports integration with various LLM providers and can orchestrate experiments across different models, making it adaptable to evolving AI ecosystems. ...
    Downloads: 4 This Week
  • 9
    vits_chinese

    Best practice TTS based on BERT and VITS

    vits_chinese is an implementation of the VITS end-to-end text-to-speech (TTS) architecture tailored for Chinese (and possibly multilingual) speech synthesis. VITS is a model combining variational autoencoders (VAEs), normalizing flows, adversarial learning, and a stochastic duration predictor — a design that enables generation of natural, expressive speech, capturing variations in rhythm and prosody. By customizing or porting VITS for Chinese, this project aims to produce high-quality TTS...
    Downloads: 2 This Week
  • 10
    Implicit

    Fast Python collaborative filtering for implicit feedback datasets

    This project provides fast Python implementations of several popular recommendation algorithms for implicit feedback datasets. All models have multi-threaded training routines, using Cython and OpenMP to fit the models in parallel across all available CPU cores. In addition, the ALS and BPR models both have custom CUDA kernels, enabling fitting on compatible GPUs. The library also supports approximate nearest neighbour libraries such as Annoy, NMSLIB and Faiss to speed up making recommendations.
    Downloads: 114 This Week
  • 11
    eCxx

    A C++ library for AVR and NodeMCU

    NOTE: This project is marked with 'Status: Abandoned' on SourceForge because not enough time can be dedicated to it. However, it may still get sporadic commits to the repository. eCxx is a library for AVR and NodeMCU tailored for micro LED displays and lighting effects. eCxx uses a Makefile build system. Java- and Python-based applications/tools are also included to ease development and debugging on the host PC. On one side, eCxx supports the original...
    Downloads: 2 This Week
  • 12
    Medusa

    Framework for Accelerating LLM Generation with Multiple Decoding Heads

    Medusa is a framework aimed at accelerating the generation capabilities of Large Language Models (LLMs) by employing multiple decoding heads. This approach allows for parallel processing during text generation, significantly enhancing throughput and reducing response times. Medusa is designed to be simple to implement and integrates with existing LLM infrastructures, making it a practical solution for scaling LLM applications.
    Downloads: 0 This Week
  • 13
    Graph of Thoughts

    Official Implementation of "Graph of Thoughts"

    ...The framework executes these operations using a large language model as the reasoning engine while evaluating intermediate results to guide the search process. This approach enables models to explore multiple reasoning strategies in parallel and choose the most promising solutions during problem solving.
    Downloads: 4 This Week
  • 14
    Petals

    Run 100B+ language models at home, BitTorrent-style

    ...Run large language models like BLOOM-176B collaboratively — you load a small part of the model, then team up with people serving the other parts to run inference or fine-tuning. Single-batch inference runs at ≈ 1 sec per step (token) — up to 10x faster than offloading, enough for chatbots and other interactive apps. Parallel inference reaches hundreds of tokens/sec. Beyond classic language model APIs — you can employ any fine-tuning and sampling methods, execute custom paths through the model, or see its hidden states. You get the comforts of an API with the flexibility of PyTorch. You can also host BLOOMZ, a version of BLOOM fine-tuned to follow human instructions in the zero-shot regime — just replace bloom-petals with bloomz-petals. ...
    Downloads: 8 This Week
  • 15
    ControlNet

    Let us control diffusion models

    ControlNet is a neural network architecture designed to add conditional control to text-to-image diffusion models. Rather than training from scratch, ControlNet “locks” the weights of a pre-trained diffusion model and introduces a parallel trainable branch that learns additional conditions—like edges, depth maps, segmentation, human pose, scribbles, or other guidance signals. This allows the system to control where and how the model should focus during generation, enabling users to steer layout, structure, and content more precisely than prompt text alone. The project includes many trained model variants that accept different types of conditioning (e.g., canny edge input, normal maps, skeletal pose) and produce improved fidelity in stable diffusion outputs. ...
    Downloads: 1 This Week
  • 16
    text-dedup

    All-in-one text de-duplication

    ...This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible deduplication strategies, making it ideal for cleaning web-scraped data, language model training datasets, or document archives.
    Downloads: 0 This Week
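The Jaccard similarity thresholding mentioned in the text-dedup entry can be illustrated with a minimal pure-Python sketch. This is not text-dedup's API: the names `shingles`, `jaccard`, and `dedup`, the word-trigram shingling, and the 0.7 threshold are all assumptions chosen for illustration, and the pairwise comparison here does not scale the way the project's hashing-based methods are described to.

```python
def shingles(text, n=3):
    """Return the set of word n-grams in text (hypothetical helper)."""
    tokens = text.lower().split()
    if len(tokens) < n:
        return {tuple(tokens)}
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dedup(docs, threshold=0.7, n=3):
    """Keep a document only if it is below the threshold against every kept one."""
    kept, kept_shingles = [], []
    for doc in docs:
        s = shingles(doc, n)
        if all(jaccard(s, k) < threshold for k in kept_shingles):
            kept.append(doc)
            kept_shingles.append(s)
    return kept


docs = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog",
    "parallel computing with python is fun today",
]
unique_docs = dedup(docs)
```

Here the second document is an exact duplicate of the first (similarity 1.0) and is dropped, while the third shares no trigrams with it and is kept.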
  • 17
    GPT-NeoX

    Implementation of model parallel autoregressive transformers on GPUs

    This repository records EleutherAI's library for training large-scale language models on GPUs. Our current framework is based on NVIDIA's Megatron Language Model and has been augmented with techniques from DeepSpeed as well as some novel optimizations. We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and accelerate research into large-scale training. For those looking for a TPU-centric codebase, we...
    Downloads: 0 This Week
  • 18
    ElegantRL

    Massively Parallel Deep Reinforcement Learning

    ElegantRL is an efficient and flexible deep reinforcement learning framework designed for researchers and practitioners. It focuses on simplicity, high performance, and supporting advanced RL algorithms.
    Downloads: 0 This Week
  • 19
    Mars Framework

    Mars is a tensor-based unified framework for large-scale data

    ...The project provides a tensor-based execution model that extends the capabilities of tools such as NumPy, pandas, and scikit-learn so that large datasets can be processed in parallel without rewriting code for distributed environments. Its architecture automatically divides large computational tasks into smaller chunks that can be executed across multiple nodes in a cluster, allowing complex analytics, machine learning workflows, and data transformations to run efficiently at scale. Mars is particularly useful for workloads that exceed the memory capacity of a single machine or require high levels of parallel processing.
    Downloads: 7 This Week
  • 20
    Scrapyd

    A service daemon to run Scrapy spiders

    ...Scrapyd is an application (typically run as a daemon) that listens to requests for spiders to run and spawns a process for each one. Scrapyd also runs multiple processes in parallel, allocating them in a fixed number of slots given by the max_proc and max_proc_per_cpu options, starting as many processes as possible to handle the load.
    Downloads: 0 This Week
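As a rough sketch of how the fixed number of slots in the Scrapyd entry might follow from the two options, the function below assumes the behaviour described in the Scrapyd documentation: a `max_proc` of 0 (or unset) means "number of CPUs multiplied by `max_proc_per_cpu`", and `max_proc_per_cpu` defaults to 4. The function name `process_slots` and its `cpus` parameter are hypothetical and not part of Scrapyd.

```python
import os


def process_slots(max_proc=0, max_proc_per_cpu=4, cpus=None):
    """Estimate the number of concurrent process slots (illustrative only)."""
    if max_proc > 0:
        # An explicit max_proc is used as the limit directly.
        return max_proc
    # Otherwise scale with the CPU count; `cpus` is overridable for testing.
    cpus = cpus or os.cpu_count() or 1
    return cpus * max_proc_per_cpu


slots_explicit = process_slots(max_proc=3)
slots_derived = process_slots(cpus=2)  # 2 CPUs * 4 per CPU
```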
  • 21
    FairScale

    PyTorch extensions for high performance and large scale training

    FairScale is a collection of PyTorch performance and scaling primitives that pioneered many of the ideas now used for large-model training. It introduced Fully Sharded Data Parallel (FSDP) style techniques that shard model parameters, gradients, and optimizer states across ranks to fit bigger models into the same memory budget. The library also provides pipeline parallelism, activation checkpointing, mixed precision, optimizer state sharding (OSS), and auto-wrapping policies that reduce boilerplate in complex distributed setups. ...
    Downloads: 5 This Week
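The sharding idea in the FairScale entry can be pictured with a toy, single-process sketch: a flat list of parameters is split into equal per-rank shards (padded so they divide evenly), and the full list is reassembled when needed. This is not FairScale's API. The names `shard_parameters` and `all_gather`, and the use of plain Python lists in place of tensors and real inter-process communication, are assumptions made for illustration.

```python
def shard_parameters(params, world_size):
    """Split a flat parameter list into world_size equal shards, zero-padded."""
    per_rank = -(-len(params) // world_size)  # ceiling division
    padded = params + [0.0] * (per_rank * world_size - len(params))
    return [padded[r * per_rank:(r + 1) * per_rank] for r in range(world_size)]


def all_gather(shards, numel):
    """Reassemble the full parameter list from all shards, dropping padding."""
    flat = [value for shard in shards for value in shard]
    return flat[:numel]


params = [1.0, 2.0, 3.0, 4.0, 5.0]
shards = shard_parameters(params, world_size=2)
restored = all_gather(shards, numel=len(params))
```

Each rank holds only its own shard between uses, which is where the memory saving in the description comes from; the same split can be applied to gradients and optimizer state.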
  • 22
    TextBox

    A text generation library with pre-trained language models

    ...From a model perspective, we incorporate 47 pre-trained language models/modules covering the categories of general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight models (modules). From a training perspective, we support 4 pre-training objectives and 4 efficient and robust training strategies, such as distributed data parallel and efficient generation. Compared with the previous version of TextBox, this extension mainly focuses on building a unified, flexible, and standardized framework for better supporting PLM-based text generation models.
    Downloads: 0 This Week
  • 23

    dispy

    Distributed and Parallel Computing with/for Python.

    dispy is a generic and comprehensive, yet easy-to-use framework for creating and using compute clusters to execute computations in parallel across multiple processors in a single machine (SMP), or among many machines in a cluster, grid or cloud. dispy is well suited to the data-parallel (SIMD) paradigm, where a computation (a Python function or standalone program) is evaluated with different (large) datasets independently. dispy supports public, private and hybrid cloud computing, as well as fog and edge computing.
    Downloads: 12 This Week
  • 24
    Elephas

    Distributed Deep learning with Keras & Spark

    ...Elephas intends to keep the simplicity and high usability of Keras, thereby allowing for fast prototyping of distributed models, which can be run on massive data sets. Elephas implements a class of data-parallel algorithms on top of Keras, using Spark's RDDs and data frames. Keras models are initialized on the driver, then serialized and shipped to workers, along with data and broadcast model parameters. Spark workers deserialize the model, train on their chunk of data and send their gradients back to the driver. The "master" model on the driver is updated by an optimizer, which takes gradients either synchronously or asynchronously. ...
    Downloads: 0 This Week
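The synchronous variant of the driver/worker loop in the Elephas entry can be sketched in a few lines of plain Python. This is only an illustration of the data-parallel pattern, not Elephas, Keras, or Spark code: threads stand in for Spark workers, a single scalar weight stands in for the model, and the names `gradient` and `train` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def gradient(w, chunk):
    """Mean-squared-error gradient of the model y = w * x on one data chunk."""
    return sum(2 * (w * x - y) * x for x, y in chunk) / len(chunk)


def train(data, workers=2, lr=0.01, steps=200):
    """Synchronous data-parallel SGD: workers send gradients, driver updates w."""
    chunks = [data[i::workers] for i in range(workers)]  # one chunk per worker
    w = 0.0  # the "master" model held by the driver
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(steps):
            # Each worker computes a gradient on its chunk with the current w.
            grads = list(pool.map(lambda chunk: gradient(w, chunk), chunks))
            # The driver averages the gradients and applies one update.
            w -= lr * sum(grads) / len(grads)
    return w


data = [(x, 3.0 * x) for x in range(1, 9)]  # samples of y = 3x
learned_w = train(data)
```

In the asynchronous mode the description mentions, the driver would instead apply each worker's gradient as it arrives rather than waiting for all of them.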
  • 25
    Fairseq

    Facebook AI Research Sequence-to-Sequence Toolkit written in Python

    Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the new FullyShardedDataParallel (FSDP) wrapper provided by fairscale. Fairseq can be extended through user-supplied plug-ins. Models define the neural network architecture and encapsulate all of the learnable parameters. ...
    Downloads: 1 This Week