Open Source Scala Software - Page 3

Scala Software

Browse free open source Scala software and projects below.

  • 1
    Algebird

    Abstract Algebra for Scala

    Algebird is Twitter’s Apache-licensed Scala library of abstract algebra data structures and algorithms, aimed especially at online/streaming aggregation. It includes Monoid, Approximate, HyperLogLog, Count-Min Sketch (CMS), BloomFilter, Min/Max, and averaged-value types, which support efficient distributed aggregation and approximate analytics.
    Downloads: 2 This Week
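
    The idea behind monoid-based distributed aggregation can be sketched in plain Scala. This is not Algebird's API: the `Monoid` trait, `Averaged` case class, and `MonoidSketch` object below are hypothetical names invented for illustration. The sketch shows why an associative `plus` with a `zero` lets each partition be aggregated on its own and the partial results merged afterwards.

```scala
// Hypothetical minimal Monoid trait (an assumption; Algebird's own type classes are richer).
trait Monoid[A] {
  def zero: A
  def plus(x: A, y: A): A
}

// A running average kept as (count, sum) so that partial results can be merged.
final case class Averaged(count: Long, sum: Double) {
  def mean: Double = if (count == 0L) 0.0 else sum / count
}

object MonoidSketch {
  val averagedMonoid: Monoid[Averaged] = new Monoid[Averaged] {
    val zero: Averaged = Averaged(0L, 0.0)
    def plus(x: Averaged, y: Averaged): Averaged =
      Averaged(x.count + y.count, x.sum + y.sum)
  }

  // Folds any collection using the supplied monoid.
  def sumAll[A](xs: Iterable[A])(m: Monoid[A]): A =
    xs.foldLeft(m.zero)(m.plus)

  // Aggregates each partition separately, then merges the partial results.
  def distributedMean(partitions: Seq[Seq[Double]]): Double = {
    val partials = partitions.map(p => sumAll(p.map(Averaged(1L, _)))(averagedMonoid))
    sumAll(partials)(averagedMonoid).mean
  }
}
```

    Because `plus` is associative, the partitions could be processed on different machines and merged in any grouping without changing the result.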
  • 2
    CoolplaySpark

    Spark Cool Play: Spark source code analysis, Spark class library, etc.

    CoolplaySpark is a learning and practice repository designed to help users understand and work with Apache Spark. It serves as a companion resource for the book 深入理解Spark核心思想与源码分析 (In-Depth Understanding of Spark’s Core Concepts and Source Code Analysis). The project contains annotated examples, explanations, and exercises that guide learners through Spark’s architecture, execution model, and source code internals. It is particularly valuable for developers who want to strengthen their understanding of Spark by not only using it as a data processing engine but also exploring how its internals function. Through code analysis and commentary, CoolplaySpark helps readers connect theoretical concepts with practical implementation details. By combining book study with this repository, learners can develop both conceptual clarity and hands-on expertise in Spark’s core components.
    Downloads: 2 This Week
  • 3
    Deequ

    Deequ is a library built on top of Apache Spark

    Deequ is a library built atop Apache Spark that enables defining “unit tests for data” — that is, formal constraints or checks on datasets to ensure data quality along dimensions such as completeness, uniqueness, value ranges, correlations, etc. It can scale to large datasets (billions of rows) by translating those data checks into Spark jobs. Deequ supports advanced features like a metrics repository for storing computed statistics over time, anomaly detection of data quality metrics, and the suggestion of likely constraints automatically for new datasets. It also includes a little domain-specific language called DQDL (Data Quality Definition Language) which allows declarative specification of quality rules. Users typically run Deequ before feeding data downstream (to ML pipelines, analytics, or production systems), enabling early detection and isolation of data errors. There is also a Python wrapper, PyDeequ, for users who prefer working from Python environments.
    Downloads: 2 This Week
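
    To make the "unit tests for data" idea concrete, here is a minimal in-memory sketch of two such checks, completeness and uniqueness. It does not use Deequ or Spark: `DataChecks`, `Row`, `completeness`, and `isUnique` are hypothetical names, and the real library evaluates comparable constraints as Spark jobs over much larger datasets.

```scala
object DataChecks {
  // A row maps column names to optional values; None stands for a null cell.
  type Row = Map[String, Option[String]]

  // Fraction of rows in which the column is present and non-null.
  def completeness(rows: Seq[Row], column: String): Double =
    if (rows.isEmpty) 1.0
    else rows.count(_.get(column).flatten.isDefined).toDouble / rows.size

  // True when no non-null value of the column occurs more than once.
  def isUnique(rows: Seq[Row], column: String): Boolean = {
    val values = rows.flatMap(_.get(column).flatten)
    values.distinct.size == values.size
  }
}
```

    A check such as `completeness(rows, "id") == 1.0` could then gate whether a dataset is passed downstream.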
  • 4
    Eclair

    A Scala implementation of the Lightning Network

    Eclair (French for Lightning) is a Scala implementation of the Lightning Network. This software follows the Lightning Network Specifications (BOLTs). Other implementations include c-lightning, lnd, electrum, and rust-lightning. Eclair offers a feature-rich HTTP API that lets application developers integrate easily. Eclair's JSON API should NOT be accessible from the outside world (as with the Bitcoin Core API). Eclair requires Bitcoin Core 0.20.1 or 0.21.1 (other versions of Bitcoin Core are not actively tested; use at your own risk). If you are upgrading an existing wallet, you may need to create a new address and send all your funds to that address. Eclair needs a synchronized, segwit-ready, zeromq-enabled, wallet-enabled, non-pruning, tx-indexing Bitcoin Core node. You must configure your Bitcoin node to use bech32 (segwit) addresses.
    Downloads: 2 This Week
  • 5
    FlockDB

    A distributed, fault-tolerant graph database

    FlockDB is a specialized graph / adjacency-list storage system designed for high performance in large-scale, low-latency, real-time environments. It was developed at Twitter to store social graph data (followers, following, blocks, etc.) and secondary indexes. FlockDB emphasizes horizontal scalability, replication, and support for high rates of writes and updates, as well as efficient paging through very large result sets. It is not a general graph database in the sense of supporting complex multi-hop traversal queries or sophisticated graph algorithms; instead it focuses on the core problem of storing and querying directed edges with attributes such as sort order, state (normal, archived, removed), and position. Edges are stored both in forward and backward directions to facilitate queries in both directions. The project is now archived and in read-only mode, meaning it's no longer actively maintained by Twitter.
    Downloads: 2 This Week
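
    Storing every edge in both directions can be illustrated with a small in-memory sketch. This is not FlockDB's storage layer or API: `EdgeStoreSketch`, `EdgeStore`, `following`, and `followers` are hypothetical names, and edge attributes such as state and position are left out. It only shows why a second, reversed index makes "who follows X" as cheap as "whom does X follow".

```scala
object EdgeStoreSketch {
  // Immutable store that keeps each directed edge twice: once per direction.
  final case class EdgeStore(
      forward: Map[Long, Set[Long]] = Map.empty,
      backward: Map[Long, Set[Long]] = Map.empty
  ) {
    // Records source -> destination in both indexes.
    def add(source: Long, destination: Long): EdgeStore =
      EdgeStore(
        forward.updated(source, forward.getOrElse(source, Set.empty[Long]) + destination),
        backward.updated(destination, backward.getOrElse(destination, Set.empty[Long]) + source)
      )

    // Forward lookup: the destinations this user points to.
    def following(user: Long): Set[Long] = forward.getOrElse(user, Set.empty[Long])

    // Backward lookup: the sources that point to this user.
    def followers(user: Long): Set[Long] = backward.getOrElse(user, Set.empty[Long])
  }
}
```

    The cost is that every write touches two indexes, which is the trade-off the description above implies.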
  • 6
    GitBucket

    A Git platform powered by Scala

    GitBucket is a Git web platform powered by Scala, offering easy installation, an intuitive UI, high extensibility through plugins, and API compatibility with GitHub. You can also deploy gitbucket.war to a servlet container that supports Servlet 3.0 (such as Jetty, Tomcat, or JBoss). To upgrade GitBucket, stop it and replace gitbucket.war with the new version. All GitBucket data is stored in HOME/.gitbucket by default, so to back up GitBucket's data, copy that directory to the backup location. If you want to try the development version of GitBucket, or want to contribute to the project, please see the Developer's Guide. It provides instructions on building from source and on setting up an IDE for debugging. It also contains documentation of the core concepts used within the project.
    Downloads: 2 This Week
  • 7
    Jawn

    Jawn is for parsing jay-sawn (JSON)

    The term "jawn" comes from the Philadelphia area. It conveys about as much information as "thing" does. I chose the name because I had moved to Montreal so I remembered Philly fondly. Also, there isn't a better way to describe objects encoded in JSON than "things". Finally, we get a catchy slogan. Jawn was designed to parse JSON into an AST as quickly as possible. Currently, Jawn is competitive with the fastest Java JSON libraries (GSON and Jackson) and in the author's benchmarks, it often wins. It seems to be faster than any other Scala parser that exists (as of July 2014).
    Downloads: 2 This Week
  • 8
    ScalaCheck

    Property-based testing for Scala

    ScalaCheck is a library for property-based testing in Scala (and Java), inspired by Haskell’s QuickCheck. It automatically generates test inputs based on specifications, validating that properties hold across randomized scenarios, thereby enabling robust, declarative testing of edge cases and invariants.
    Downloads: 2 This Week
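
    The core loop of property-based testing can be sketched without any library. The code below is not ScalaCheck's API: `PropertySketch`, `forAll`, and `intLists` are hypothetical names, and ScalaCheck's real generators and properties are much richer than this. The sketch generates random inputs from a seeded generator and returns the first input, if any, for which the property fails.

```scala
import scala.util.Random

object PropertySketch {
  // Runs `property` on `trials` random inputs; returns the first counterexample, if any.
  def forAll[A](gen: Random => A, trials: Int = 100, seed: Long = 42L)(
      property: A => Boolean
  ): Option[A] = {
    val rng = new Random(seed)
    Iterator.fill(trials)(gen(rng)).find(a => !property(a))
  }

  // Generator for integer lists of length 0 to 9 with values in [-500, 499].
  val intLists: Random => List[Int] =
    rng => List.fill(rng.nextInt(10))(rng.nextInt(1000) - 500)
}
```

    For example, `PropertySketch.forAll(PropertySketch.intLists)(xs => xs.reverse.reverse == xs)` returns `None`, because reversing a list twice always yields the original list.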
  • 9
    Skunk

    A data access library for Scala + Postgres

    Skunk is a Postgres library for Scala, powered by cats, cats-effect, scodec, and fs2. It is purely functional, non-blocking, and provides a tagless-final API. Skunk gives very good error messages and embraces the Scala Code of Conduct. Skunk is pre-release software; code and documentation are under active development. Skunk is published for Scala 2.12/2.13/3.1 and can be included in your project. Query and Command types are usually inferrable, but specifying a type ensures that the chosen encoders and decoders are consistent with the expected input and output Scala types. Postgres provides a protocol for execution of simple queries, returning all rows at once (Skunk returns them as a list).
    Downloads: 2 This Week
  • 10
    Spark NLP

    State of the Art Natural Language Processing

    Spark NLP is an open source library that brings scalable large language models and Natural Language Processing (NLP) to Apache Spark. The full code base is open under the Apache 2.0 license, including pre-trained models and pipelines. It is described as the only NLP library built natively on Apache Spark and the most widely used NLP library in the enterprise. Spark ML provides a set of machine learning applications that can be built using two main components, estimators and transformers. An estimator has a fit method that trains on a dataset; the result of fitting is generally a transformer, which applies changes to the target dataset. These components are built into Spark NLP as well. Pipelines are a mechanism for combining multiple estimators and transformers in a single workflow, allowing multiple transformations to be chained within a machine-learning task.
    Downloads: 2 This Week
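
    The estimator, transformer, and pipeline roles described above can be sketched in plain Scala. None of this is the Spark ML or Spark NLP API: `PipelineSketch`, `Dataset`, `MeanCenterer`, `Doubler`, and `fitPipeline` are hypothetical names, and a dataset is simplified to a vector of numbers. The sketch shows an estimator producing a transformer when fitted, and a pipeline fitting its stages in order.

```scala
object PipelineSketch {
  // Simplifying assumption: a dataset is just a vector of numbers.
  type Dataset = Vector[Double]

  trait Transformer { def transform(data: Dataset): Dataset }
  trait Estimator { def fit(data: Dataset): Transformer }

  // Estimator: learns the mean of the data; the fitted transformer subtracts it.
  object MeanCenterer extends Estimator {
    def fit(data: Dataset): Transformer = {
      val mean = if (data.isEmpty) 0.0 else data.sum / data.size
      new Transformer { def transform(d: Dataset): Dataset = d.map(_ - mean) }
    }
  }

  // Stateless transformer that needs no fitting.
  object Doubler extends Transformer {
    def transform(data: Dataset): Dataset = data.map(_ * 2)
  }

  // Fits each stage in order on the output of the stages before it,
  // then returns a single transformer that applies all fitted stages.
  def fitPipeline(stages: Seq[Either[Estimator, Transformer]], data: Dataset): Transformer = {
    val (fitted, _) = stages.foldLeft((Vector.empty[Transformer], data)) {
      case ((done, current), stage) =>
        val t = stage match {
          case Left(estimator)     => estimator.fit(current)
          case Right(transformer)  => transformer
        }
        (done :+ t, t.transform(current))
    }
    new Transformer {
      def transform(d: Dataset): Dataset = fitted.foldLeft(d)((acc, t) => t.transform(acc))
    }
  }
}
```

    Fitting the two-stage pipeline on `Vector(1.0, 2.0, 3.0)` and transforming the same data gives `Vector(-2.0, 0.0, 2.0)`: the learned mean 2.0 is subtracted, then each value is doubled.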
  • 11
    XiangShan

    Open-source high-performance RISC-V processor

    XiangShan is an open-source, high-performance RISC-V processor project that implements out-of-order superscalar cores using Chisel for hardware construction. The design targets modern performance goals—deep pipelines, speculative execution, multi-issue decode/execute, and sophisticated branch prediction—while remaining synthesizable for ASIC flows and portable to FPGAs for research. A modular microarchitecture separates frontend, backend, and memory subsystems with coherent caches and scalable interconnects, enabling multi-core configurations. The project invests heavily in verification: differential testing against reference models, extensive random instruction tests, and full software stacks (bootloaders, Linux) to validate correctness under realistic workloads. Tooling around the core (build scripts, simulators, waveform/debug support) lowers the barrier for academics and industry contributors to experiment and extend.
    Downloads: 2 This Week
  • 12
    atto

    Friendly little parsers

    Atto is a compact, pure-functional, incremental text parsing library for Scala. It offers a non-invasive API using familiar abstractions, making it a principled tool for everyday parsing tasks in functional programming.
    Downloads: 2 This Week
  • 13
    node2vec

    Learn continuous vector embeddings for nodes in a graph using biased R

    The node2vec project provides an implementation of the node2vec algorithm, a scalable feature learning method for networks. The algorithm is designed to learn continuous vector representations of nodes in a graph by simulating biased random walks and applying skip-gram models from natural language processing. These embeddings capture community structure as well as structural equivalence, enabling machine learning on graphs for tasks such as classification, clustering, and link prediction. The repository contains reference code accompanying the research paper node2vec: Scalable Feature Learning for Networks (KDD 2016). It allows researchers and practitioners to apply node2vec to various graph datasets and evaluate embedding quality on downstream tasks. By bridging ideas from graph theory and word embedding models, this project demonstrates how graph-based machine learning can be made efficient and flexible.
    Downloads: 2 This Week
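
    A biased random walk of the kind described above can be sketched as follows. This is not the repository's code: `BiasedWalk`, `Graph`, `weight`, and `walk` are hypothetical names, and the graph is an unweighted in-memory adjacency map. The parameters `p` (return) and `q` (in-out) follow the node2vec paper: a step back to the previous node gets weight 1/p, a step to a neighbour of the previous node gets weight 1, and any other step gets weight 1/q.

```scala
import scala.util.Random

object BiasedWalk {
  type Graph = Map[Int, Set[Int]]

  // Unnormalized weight of moving to `next` when the walk arrived from `previous`.
  def weight(g: Graph, previous: Int, next: Int, p: Double, q: Double): Double =
    if (next == previous) 1.0 / p // step back to where we came from
    else if (g.getOrElse(previous, Set.empty[Int]).contains(next)) 1.0 // stay close to previous
    else 1.0 / q // move further away

  def walk(g: Graph, start: Int, length: Int, p: Double, q: Double, rng: Random): List[Int] = {
    // Roulette-wheel selection proportional to the given weights.
    def pick(candidates: Vector[Int], weights: Vector[Double]): Int = {
      var r = rng.nextDouble() * weights.sum
      var i = 0
      while (i < candidates.size - 1 && r >= weights(i)) { r -= weights(i); i += 1 }
      candidates(i)
    }

    // `path` is kept in reverse order: its head is the current node.
    @annotation.tailrec
    def loop(path: List[Int]): List[Int] = {
      val neighbours = g.getOrElse(path.head, Set.empty[Int]).toVector.sorted
      if (path.size >= length || neighbours.isEmpty) path.reverse
      else {
        val ws = path match {
          case _ :: prev :: _ => neighbours.map(weight(g, prev, _, p, q))
          case _              => neighbours.map(_ => 1.0) // first step is uniform
        }
        loop(pick(neighbours, ws) :: path)
      }
    }
    loop(List(start))
  }
}
```

    In the full method, many such walks are collected and fed to a skip-gram model as if they were sentences; that training step is outside this sketch.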
  • 14
    s3_website

    Manage an S3 website: sync, deliver via CloudFront

    s3_website is a Ruby gem that automates the deployment of static websites to AWS S3 and optionally CloudFront. It handles site configuration, uploads, cache control, gzip compression, redirects, and supports Jekyll, Nanoc, and Middleman out of the box. Ideal for static site hosting without manual AWS setup.
    Downloads: 2 This Week
  • 15
    Downloads: 11 This Week
  • 16
    Alpakka Kafka

    Alpakka is a Reactive Enterprise Integration library for Java

    The Alpakka project is an open source initiative to implement stream-aware and reactive integration pipelines for Java and Scala. It is built on top of Akka Streams and has been designed from the ground up to understand streaming natively and provide a DSL for reactive and stream-oriented programming, with built-in support for backpressure. Akka Streams is a Reactive Streams and JDK 9+ java.util.concurrent.Flow-compliant implementation and therefore fully interoperable with other implementations. Because Kafka’s client protocol negotiates the version to use with the Kafka broker, you may use a Kafka client version that differs from the Kafka broker’s version.
    Downloads: 1 This Week
  • 17
    Chronos

    Fault tolerant job scheduler for Mesos to handle dependencies

    Chronos is a replacement for cron. It is a distributed and fault-tolerant scheduler that runs on top of Apache Mesos and can be used for job orchestration. It supports custom Mesos executors as well as the default command executor, so by default Chronos executes sh (on most systems bash) scripts. Chronos can be used to interact with systems such as Hadoop (incl. EMR), even if the Mesos slaves on which execution happens do not have Hadoop installed. Chronos is also natively able to schedule jobs that run inside Docker containers. Chronos has a number of advantages over regular cron. It allows you to schedule your jobs using ISO8601 repeating interval notation, which enables more flexibility in job scheduling. Chronos also supports the definition of jobs triggered by the completion of other jobs, with arbitrarily long dependency chains. The easiest way to use Chronos is to use DC/OS and install Chronos via the Universe.
    Downloads: 1 This Week
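
    To show what an ISO8601 repeating interval expresses, here is a minimal sketch that expands one into concrete start times using `java.time`. It is not Chronos code: `RepeatingInterval` and `fireTimes` are hypothetical names. The sketch assumes the form `R<n>/<start>/<duration>`, reads `<n>` as the total number of runs, and does not handle the unbounded `R/` form or calendar periods such as months.

```scala
import java.time.{Duration, Instant}

object RepeatingInterval {
  // Parses "R<n>/<start>/<duration>", e.g. "R3/2024-01-01T00:00:00Z/PT2H",
  // and returns the n start times, spaced one duration apart.
  def fireTimes(spec: String): List[Instant] = spec.split("/") match {
    case Array(rep, start, period) if rep.startsWith("R") && rep.length > 1 =>
      val n = rep.drop(1).toInt
      val first = Instant.parse(start)
      val step = Duration.parse(period)
      List.tabulate(n)(i => first.plus(step.multipliedBy(i.toLong)))
    case _ =>
      throw new IllegalArgumentException(s"Unsupported interval: $spec")
  }
}
```

    Under these assumptions, `RepeatingInterval.fireTimes("R3/2024-01-01T00:00:00Z/PT2H")` yields runs at 00:00, 02:00, and 04:00 UTC on 1 January 2024.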
  • 18
    FS2

    Compositional, streaming I/O library for Scala

    FS2 (“Functional Streams for Scala”) is a purely functional, effectful abstraction for stream processing on the JVM. Built on Cats Effect, it enables compositional resource-safe streaming workflows with robust error handling, back-pressure, pull/push semantics, and support for concurrent and interruptible pipelines.
    Downloads: 1 This Week
  • 19
    Feathr

    A scalable, unified data and AI engineering platform for enterprise

    Feathr is a data and AI engineering platform that was widely used in production at LinkedIn for many years before being open sourced in 2022. It is currently a project under the LF AI & Data Foundation. Define data and feature transformations based on raw data sources (batch and streaming) using Pythonic APIs. Register transformations by name and get transformed data (features) for various use cases, including AI modeling, compliance, go-to-market, and more. Share transformations and data (features) across teams and the company. Feathr is particularly useful in AI modeling, where it automatically computes your feature transformations and joins them to your training data, using point-in-time-correct semantics to avoid data leakage, and supports materializing and deploying your features for use online in production.
    Downloads: 1 This Week
  • 20
    Finatra

    Fast, testable, Scala services built on TwitterServer and Finagle

    Finatra builds on TwitterServer and uses Finagle, therefore it is highly recommended that you familiarize yourself with those frameworks before getting started. The version of Finatra documented here is version 2.x. Version 2.x is a complete rewrite over v1.x and as such many things are different. Finatra at its core is agnostic to the type of service or application being created. It can be used to build anything based on TwitterUtil: c.t.app.App. For servers, Finatra builds on top of the features of TwitterServer (and Finagle) by allowing you to easily define a Server and controllers (a Service-like abstraction) which define and handle endpoints of the Server. You can also compose Filters either per controller, per route in a controller, or across all controllers. Powerful Feature and Integration test support. Optional JSR-330 Dependency Injection using Google Guice.
    Downloads: 1 This Week
  • 21
    Flix

    The Flix Programming Language

    Flix is a statically typed programming language combining functional, imperative, and logic paradigms, with first‑class Datalog constraints and a polymorphic effect system. Designed to run on the JVM, Flix enforces purity tracking at compile time, supports algebraic data types, tail‑call elimination, and allows entire Datalog programs as values.
    Downloads: 1 This Week
  • 22
    Graphcool Framework

    Graphcool is an open-source backend development framework

    Graphcool was an open-source framework for developing and deploying GraphQL-based backends. It acted as a backend-as-a-service framework: you defined your data model in GraphQL SDL (Schema Definition Language), and it generated a GraphQL CRUD API with support for nested mutations, filtering, pagination, and real-time subscriptions. Graphcool separated the business logic from stateful storage components, allowing the stateful parts (database, subscription engine) to scale independently and giving flexibility in how you composed your system. Users could deploy Graphcool either locally (e.g. via Docker) or on a managed cloud offering. The framework also provided features like schema evolution, migrations, data loaders for performance, and built-in tooling to manage endpoints and deployments.
    Downloads: 1 This Week
  • 23
    Guardian Frontend

    The Guardian DotCom

    This repository hosts a major part of The Guardian’s web application stack, historically the Play Framework–based code that serves the newspaper’s content at scale. It orchestrates rendering of articles, live blogs, and interactive pieces while integrating advertising, analytics, identity, and paywall-adjacent features. The codebase coordinates with upstream content APIs, image services, and media platforms to compose pages dynamically with caching and edge-friendly layouts. Operationally, it’s engineered for high traffic spikes—breaking news or live sports—through aggressive caching strategies, feature switches, and robust fault isolation between services. The project reflects a long-running evolution from a monolith toward a service-oriented architecture, with portions moved to separate rendering services and modern front-end stacks while this repo remains the glue for core routes and legacy paths.
    Downloads: 1 This Week
  • 24
    Kestrel

    Simple, distributed message queue system (inactive)

    Kestrel is a simple, distributed message queue system built originally by Twitter. Its design is relatively lightweight and is engineered for speed and simplicity. Kestrel supports queuing patterns such as enqueue, dequeue, and delayed re-enqueue (for example, when a consumer fails to process a message). It stores messages persistently on disk with a memory-backed cache, allowing recovery in case of failures. Because it is intended for relatively simple use cases, it does not provide the full feature set of some enterprise messaging systems, but is often sufficient for many asynchronous or buffered workloads. Over time, the project became inactive and is now archived. Its minimalism and ease of integration made it appealing for smaller or more controlled message-queueing needs.
    Downloads: 1 This Week
  • 25
    Lagom

    Reactive Microservices for the JVM

    The opinionated microservices framework for moving away from the monolith. Lagom helps you decompose your legacy monolith and build, test, and deploy entire systems of Reactive microservices. Lagom is an open source framework for building systems of Reactive microservices in Java or Scala. Lagom builds on Akka and Play, proven technologies that are in production in some of the most demanding applications today. Lagom's integrated development environment allows you to focus on solving business problems instead of wiring services together. A single command builds the project, starts supporting components and your microservices, as well as the Lagom infrastructure. The build hot-reloads when it detects changes to source code.
    Downloads: 1 This Week