Showing 116 open source projects for "parallel"

  • 1
    CUDA Core Compute Libraries (CCCL)

    CUDA Core Compute Libraries

    ...It brings together Thrust, CUB, and libcudacxx, which collectively provide high-level abstractions, low-level performance primitives, and a CUDA-compatible standard library for GPU programming. The goal of CCCL is to simplify CUDA development by offering reusable building blocks that enable developers to write efficient and scalable parallel code without starting from scratch. Thrust provides a high-level interface for parallel algorithms, while CUB delivers highly optimized primitives for device-level operations, and libcudacxx ensures compatibility with modern C++ standards. By unifying these components, CCCL reduces duplication and improves developer productivity while maintaining performance across different GPU architectures.
    Downloads: 15 This Week
  • 2
    Soufflé

    Datalog variant for tool designers crafting analyses in Horn clauses

    Rapid prototyping for your analysis problems with logic, enabling deep design-space explorations. Designed for large-scale static analysis, e.g., points-to analysis for Java, taint analysis, and security checks. Futamura projections/partial evaluation for effective translation to parallel C++; optimized staged compilation; specialized data structures for logical relations. Efficient translation of Datalog programs to parallel C++ (CAV'16, CC'16); efficient interpretation using de-specialization techniques (PLDI'21); specialized data structures for relations (PACT'19, PPoPP'19, PMAM'19) with optimal index selection (VLDB'18); extended semantics of Datalog, e.g., permitting unbounded recursions with numbers and terms. ...
    Downloads: 0 This Week
  • 3
    Halide

    A language for fast, portable data-parallel computation

    Halide is a programming language for fast, portable data-parallel computation. It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU compute APIs (CUDA, OpenCL, OpenGL, among others). It isn't a standalone programming language, however; rather, it is embedded in C++, which means that you write C++ code that builds an in-memory representation of a Halide pipeline using Halide's C++ API. ...
    Downloads: 6 This Week
  • 4
    ispc

    Intel SPMD Program Compiler

    ...Under the SPMD model, the programmer writes a program that generally appears to be a regular serial program, though in the actual execution model a number of program instances execute in parallel on the hardware. ispc compiles a C-based SPMD programming language to run on the SIMD units of CPUs and GPUs; it frequently provides a 3x or greater speedup on architectures with 4-wide SSE vector units and 5x-6x on architectures with 8-wide AVX vector units, without any of the difficulty of writing intrinsics code. ispc also supports parallelization across multiple cores, making it possible to write programs whose performance scales with both the number of cores and the vector unit size. ...
    Downloads: 76 This Week
  • 5
    ncnn

    High-performance neural network inference framework for mobile

    ncnn is a high-performance neural network inference framework designed specifically for mobile platforms. It has no third-party dependencies and runs faster than all other known open source frameworks on mobile phone CPUs. ncnn allows developers to easily deploy deep learning models to mobile platforms and create intelligent apps. It is cross-platform and supports most commonly used CNN networks, including...
    Downloads: 92 This Week
  • 6
    TensorStore

    Library for reading and writing large multi-dimensional arrays

    ...It separates the logical view (shape, dtype, chunking) from the physical layout so the same code can target Zarr, N5, TIFF pyramids, or custom backends. Rich indexing, slicing, and broadcasting operations make it feel like a familiar array API, while asynchronous I/O pipelines stream chunks efficiently in parallel. Transactional semantics allow atomic updates and consistent snapshots, which is essential for large, shared datasets used by ML and scientific workflows. The library is engineered for scalability—background caching, chunk sharding, and retryable operations keep throughput high even over unreliable networks. With language bindings, it fits into Python-heavy analysis pipelines while retaining a fast C++ core.
    Downloads: 0 This Week
  • 7
    mold

    A Modern Linker

    Mold is a modern high-performance linker designed as a drop-in replacement for traditional Unix linkers, with a primary goal of dramatically reducing build times for large software projects. In compiled languages like C, C++, and Rust, the linking phase can become a significant bottleneck, especially in large codebases, and mold addresses this by leveraging highly optimized algorithms and extensive parallelism. It is capable of utilizing all available CPU cores efficiently, resulting in...
    Downloads: 17 This Week
  • 8
    XGBoost

    Scalable and Flexible Gradient Boosting

    ...It supports regression, classification, ranking and user defined objectives, and runs on all major operating systems and cloud platforms. XGBoost works by implementing machine learning algorithms under the Gradient Boosting framework. It also offers parallel tree boosting (GBDT, GBRT or GBM) that can quickly and accurately solve many data science problems. XGBoost can be used for Python, Java, Scala, R, C++ and more. It can run on a single machine, Hadoop, Spark, Dask, Flink and most other distributed environments, and is capable of solving problems beyond billions of examples.
    Downloads: 10 This Week
  • 9
    ChrysaLisp

    Parallel OS, with GUI, Terminal, OO Assembler, Class libraries

    ChrysaLisp is a 64-bit, MIMD, multi-CPU, multi-threaded, multi-core, multi-user parallel operating system with features such as a GUI, terminal, OO Assembler, class libraries, C-Script compiler, Lisp interpreter, debugger, profiler, vector font engine, and more. It supports macOS, Windows, and Linux on x64, Riscv64, and Arm64, and will eventually move to bare metal. It also allows the modeling of various network topologies and the use of ChrysaLib hub nodes to join heterogeneous host networks. ...
    Downloads: 0 This Week
  • 10
    SIMD

    C++ wrappers for SIMD intrinsics

    SIMD is a C++ library that provides portable abstractions over SIMD (Single Instruction, Multiple Data) instructions, enabling developers to write high-performance vectorized code without dealing directly with architecture-specific intrinsics. SIMD instructions allow a single operation to be applied to multiple data elements simultaneously, significantly accelerating numerical and data-parallel computations. However, differences across CPU architectures and compilers make direct usage complex, which xsimd addresses by offering a unified API that maps efficiently to underlying hardware capabilities. The library supports a wide range of instruction sets, including SSE, AVX, NEON, and WebAssembly SIMD, ensuring portability across platforms. ...
    Downloads: 8 This Week
  • 11
    TensorRT

    C++ library for high performance inference on NVIDIA GPUs

    ...With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers, embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.
    Downloads: 22 This Week
  • 12
    ArrayFire

    ArrayFire, a general purpose GPU library

    ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if you're interested and able to write top performing tensor functions. ...
    Downloads: 4 This Week
  • 13
    frugally-deep

    A lightweight header-only library for using Keras (TensorFlow) models

    ...Utterly ignores even the most powerful GPU in your system and uses only one CPU core per prediction. Quite fast on one CPU core, and you can run multiple predictions in parallel, thus utilizing as many CPUs as you like to improve the overall prediction throughput of your application/pipeline.
    Downloads: 5 This Week
  • 14
    Google Highway

    Performance-portable, length-agnostic SIMD with runtime dispatch

    Google Highway is a high-performance C++ library designed to provide portable SIMD (Single Instruction, Multiple Data) vectorization across multiple CPU architectures while maintaining predictable and efficient behavior. It abstracts low-level vector intrinsics into a consistent API that maps closely to hardware instructions, allowing developers to write high-performance code without relying heavily on compiler auto-vectorization. Highway enables the same source code to run across different...
    Downloads: 1 This Week
  • 15
    NeoPixelBus

    An Arduino NeoPixel support library

    ...There are multiple competing libraries, FastLED being the biggest and Adafruit NeoPixel being the most common for beginners. On ESP32, both FastLED and NeoPixelBus can provide more than one channel/bus. FastLED primarily uses RMT to support 8 parallel channels. NeoPixelBus now supports the RMT's 8 channels and two more channels using I2S.
    Downloads: 1 This Week
  • 16
    Apache brpc

    Industrial-grade RPC framework used throughout Baidu

    Apache brpc is an industrial-grade RPC framework for building reliable and high-performance services. Apache brpc (incubating) is an effort undergoing Incubation at The Apache Software Foundation (ASF), sponsored by the Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not...
    Downloads: 0 This Week
  • 17
    OneFlow

    OneFlow is a deep learning framework designed to be user-friendly

    OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient. An extension is available for OneFlow to target third-party compilers such as XLA, TensorRT, and OpenVINO. The CUDA runtime is statically linked into OneFlow, so OneFlow will work on the minimum supported driver and any driver beyond it. Distributed performance (efficiency) is the core technical difficulty of a deep learning framework. OneFlow focuses on performance improvement and heterogeneous...
    Downloads: 0 This Week
  • 18
    Octave Forge

    A collection of packages providing extra functionality for GNU Octave

    Octave Forge is a central location for collaborative development of packages for GNU Octave. The Octave Forge packages expand Octave's core functionality by providing field specific features via Octave's package system. See https://octave.sourceforge.io/packages.php for a list of all available packages. GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and...
    Downloads: 1,555 This Week
  • 19
    PANDA

    A comprehensive and flexible quantification tool for proteomics data

    ...On the levels of spectra, peptides and proteins, PANDA works out a few quantitative filters and new scores for quantification confidence. Third, PANDA is designed for processing proteomics big data in parallel.
    Downloads: 20 This Week
  • 20
    BMDFM

    Binary Modular DataFlow Machine (BMDFM)

    ...The BMDFM dynamic scheduling subsystem performs a symmetric multiprocessing (SMP) emulation of a tagged-token dataflow machine to provide the transparent dataflow semantics for the applications. No directives for parallel execution are needed. More info: http://www.bmdfm.com
    Downloads: 0 This Week
  • 21
    Proximus for NUMA

    Proximus is an Electronic System Level (ESL) design environment.

    Proximus-FOSS stands as an innovative platform that fosters the convergence of hardware design and software programming, enabling concurrent development across both disciplines. Its collaborative environment empowers developers to concurrently address hardware and software aspects of a project. The Proximus Open Source version boasts robust support for multi-threaded programming with a C++ implementation. This capability allows developers to harness the full potential of C++ for crafting...
    Downloads: 8 This Week
  • 22
    Evolutionary Computation Framework

    C++ framework for application of any type of evolutionary computation.

    ECF is a framework intended for application of any type of evolutionary computation (GA/GP, DE, Clonalg, ES, PSO, ABC, GAn, local search...). It offers simplicity for the end user (parameterless usage, tutorial) and customization for experienced EC practitioners.
    Downloads: 7 This Week
  • 23
    Classdesc is a system for adding reflection to C++, i.e., the ability to query an object's structure at runtime.
    Downloads: 0 This Week
  • 24
    eCxx

    A C++ library for AVR and NodeMCU

    NOTE: This project is marked with 'Status: Abandoned' on SourceForge because not enough time can be dedicated to it. However, it may still get sporadic commits to the repository. eCxx is a library for AVR and NodeMCU tailored for micro LED displays and lighting effects. eCxx uses a Makefile build system. Java- and Python-based applications/tools are also included to ease development and debugging on the host PC. On one side, eCxx supports the original...
    Downloads: 3 This Week
  • 25
    Thrust

    The C++ parallel algorithms library

    Thrust is the C++ parallel algorithms library which inspired the introduction of parallel algorithms to the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP).
    Downloads: 2 This Week