Showing 1533 open source projects for "compiler python linux"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    More flexibility. More control.

    Generate interest, access liquidity without selling, and execute trades seamlessly. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 1
    LocalStack

    LocalStack

    Develop and test your cloud apps offline

    LocalStack is a fully functional local AWS cloud stack that enables you to develop and test your cloud and serverless apps offline. It spins up an easy-to-use testing environment on your local machine that has the same APIs and works the same way as the real AWS cloud environment. It can spin up a number of different core Cloud APIs on your local machine, including API Gateway, Kinesis, DynamoDB, Firehose, Lambda and many others. LocalStack was built on some of today’s best-of-breed...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 2
    Crawl4AI

    Crawl4AI

    Open-source LLM Friendly Web Crawler & Scraper

    Crawl4AI is a high-performance, AI‑ready web crawler tailored for LLM data ingestion and RAG pipelines. It supports adaptive crawling heuristics (stopping when enough info is gathered), structured markdown output, and high-speed parallel execution. Designed to operate at scale with optional Docker deployment and framework integrations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    changedetection.io

    changedetection.io

    The best free open source website change detection and restock service

    Loved by smart shoppers, data journalists, research engineers, data scientists, security researchers, and more. From simply monitoring website pages that have a change (such as watching prices, and restocking notifications), to deep inspection such as PDF text support, JSON and XML monitoring, and extensive text triggers. Monitor out-of-stock products and get alerts when those products are back in stock, get restock alerts via Discord, Slack, email, and many other platforms. Using the...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 4
    Scweet

    Scweet

    Scrape tweets, profiles, followers and following from Twitter/X

    Scweet is a Python-based Twitter/X scraping library and CLI designed to collect tweets, profile timelines, followers, following lists, and user profile data without requiring the official Twitter/X API or a developer account. Instead of depending on deprecated unauthenticated scraping methods, it works by using X’s web GraphQL API together with authenticated browser cookies, which gives it a more current and practical approach for data extraction. The project supports a broad set of...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 5
    news-please

    news-please

    Python tool for crawling and extracting structured data from news site

    news-please is an open source news crawler and information extraction tool designed to collect and structure articles from online news websites. It provides an integrated pipeline that crawls news sites, retrieves article pages, and extracts structured information such as headlines, authors, publication dates, and article text. news-please can recursively follow internal links and read RSS feeds to gather both recent and archived articles from a news outlet when given only the root URL of a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    nginx-proxy

    nginx-proxy

    Automated nginx proxy for Docker containers using docker-gen

    nginx-proxy sets up a container running nginx and docker-gen. docker-gen generates reverse proxy configs for nginx and reloads nginx when containers are started and stopped. The containers being proxied must expose the port to be proxied, either by using the EXPOSE directive in their Dockerfile or by using the --expose flag to docker run or docker create and be in the same network. By default, if you don't pass the --net flag when your nginx-proxy container is created, it will only be...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 7
    FEAPDER

    FEAPDER

    Powerful Python crawler framework for scalable web scraping tasks

    feapder is a Python-based web crawling framework designed to simplify the process of building scalable and efficient web scrapers. It focuses on providing a developer-friendly environment that makes it easier to create, run, and manage crawlers for a variety of data collection tasks. It includes several built-in spider types, such as AirSpider, Spider, TaskSpider, and BatchSpider, which address different crawling scenarios ranging from lightweight scraping to distributed and batch-based...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Trafilatura

    Trafilatura

    Python & command-line tool to gather text on the Web

    Trafilatura is a Python package and command-line tool designed to gather text on the Web. It includes discovery, extraction and text-processing components. Its main applications are web crawling, downloads, scraping, and extraction of main texts, metadata and comments. It aims at staying handy and modular: no database is required, the output can be converted to various commonly used formats. Going from raw HTML to essential parts can alleviate many problems related to text quality, first by...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    ConsoleMe

    ConsoleMe

    A central control plane for AWS permissions and access

    ConsoleMe is a web service that makes AWS IAM permissions and credential management easier for end-users and cloud administrators. ConsoleMe provides numerous ways to log in to the AWS Console. An IAM Self-Service Wizard lets users request IAM permissions in plain English. Cross-account resource policies will be automatically generated and can be applied with a single click for certain resource types. Weep (ConsoleMe’s CLI) supports 5 different ways of serving AWS credentials locally. Cloud...
    Downloads: 6 This Week
    Last Update:
    See Project
  • Go from Code to Production URL in Seconds Icon
    Go from Code to Production URL in Seconds

    Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

    Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
    Try it free
  • 10
    Ansible Role s3cmd

    Ansible Role s3cmd

    Ansible role for s3cmd. Available on Ansible Galaxy

    Role to install (by default) s3cmd on Debian/Ubuntu and EL systems. s3cmd is a popular s3 client.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    crawler

    crawler

    Collection of JS reverse engineering examples for web scraping study

    crawler is a collection of web scraping and JavaScript reverse engineering examples designed for learning how modern websites protect their data and how those protections can be analyzed. It contains many case studies that demonstrate how to analyze and replicate request parameters, cookies, and encryption logic used by real websites. Each directory in the project focuses on a specific target service or scenario, showing how browser network requests and JavaScript code can be studied to...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 12
    SEO Machine

    SEO Machine

    A specialized Claude Code workspace for creating long-form

    SEO Machine is an AI-powered content production system built as a structured workspace for generating long-form, SEO-optimized blog content through automated workflows. It integrates research, writing, analysis, and optimization into a single pipeline, allowing users to produce high-quality articles tailored to search engine performance. The system uses specialized commands and agents to perform tasks such as keyword research, competitor analysis, content drafting, and optimization. It...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    proxy.py

    proxy.py

    Utilize all available CPU cores for accepting new client connections

    proxy.py is made with performance in mind. By default, proxy.py will try to utilize all available CPU cores to it for accepting new client connections. This is achieved by starting AcceptorPool which listens on configured server port. Then, AcceptorPool starts Acceptor processes (--num-acceptors) to accept incoming client connections. Alongside, if --threadless is enabled, ThreadlessPool is setup which starts Threadless processes (--num-workers) to handle the incoming client connections....
    Downloads: 4 This Week
    Last Update:
    See Project
  • 14
    Nitter

    Nitter

    Alternative Twitter front-end

    Nitter is an open-source alternative frontend for Twitter designed to provide a privacy-focused and lightweight way to browse content without interacting directly with the official platform. It acts as a proxy between the user and Twitter, ensuring that requests are handled by the backend server rather than exposing the user’s IP address or browser fingerprint. The interface is intentionally minimalistic and removes elements such as advertisements, tracking scripts, and algorithmic...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 15
    CyberScraper 2077

    CyberScraper 2077

    A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

    CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    Cloud Custodian

    Cloud Custodian

    Rules engine for cloud security, cost optimization, and governance

    Cloud Custodian enables users to be well managed in the cloud. The simple YAML DSL allows you to easily define rules to enable a well-managed cloud infrastructure, that's both secure and cost-optimized. It consolidates many of the ad-hoc scripts organizations have into a lightweight and flexible tool, with unified metrics and reporting. Custodian supports managing AWS, Azure, and GCP public cloud environments. Besides just providing reports of issues, Custodian can actively enforce the...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 17
    SimpDL

    SimpDL

    A tool to scrape images from SimpCity

    SimpDL is an open-source media downloading tool designed to retrieve content from subscription-based or creator platforms, focusing on simplicity and ease of use. It enables users to download images, videos, and other media associated with specific creators or accounts, often through authenticated sessions. The project emphasizes a straightforward workflow where users provide login credentials or tokens, and the tool handles the retrieval and storage of content automatically. It is designed...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    autocrawler

    autocrawler

    Multiprocess Selenium crawler for downloading images by keywords

    AutoCrawler is a Python-based image crawling tool designed to automatically download large numbers of images from search engines using automated browser interaction. It uses Selenium and a Chrome browser driver to navigate image search pages and collect image sources based on keywords provided by the user. AutoCrawler supports multiprocess and multithreaded downloading, which allows it to retrieve images faster by running several tasks simultaneously. Users provide search terms through a...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    watercrawl

    watercrawl

    AI-ready web crawler that extracts and structures website content

    WaterCrawl is an open source web crawling and data extraction platform designed to transform website content into structured data suitable for machine learning and AI workflows. It enables developers and researchers to crawl web pages, extract meaningful information, and convert it into formats that are easier to process and analyze. It provides a modern crawling system that can automatically navigate links, control crawl depth, and collect content from targeted sections of a website....
    Downloads: 5 This Week
    Last Update:
    See Project
  • 20
    SMTP Tunnel Proxy

    SMTP Tunnel Proxy

    A high-speed covert tunnel that disguises TCP traffic as SMTP email

    SMTP Tunnel Proxy is a high-speed covert tunneling proxy that disguises regular TCP traffic as legitimate SMTP email communication to evade deep packet inspection (DPI) firewalls and censorship systems. It implements a SOCKS5 proxy interface on the client that wraps outbound traffic into an SMTP-like handshake (EHLO, STARTTLS, AUTH) and encrypted payload, making the session appear to DPI systems as a normal email exchange. The tool supports modern TLS encryption (STARTTLS) with HMAC-SHA256...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21
    geo-seo-claude

    geo-seo-claude

    GEO-first SEO skill for Claude Code

    geo-seo-claude is an AI-powered tool designed to automate the creation of geographically optimized SEO content using large language models, helping businesses improve their visibility in local search results. It leverages AI to generate location-specific content tailored to different regions, allowing users to scale SEO efforts across multiple cities or markets without manual content creation. The system focuses on producing structured and keyword-optimized pages that align with search...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 22
    Scrapling

    Scrapling

    An adaptive Web Scraping framework

    Scrapling is an adaptive web scraping framework designed to handle everything from a single HTTP request to large-scale, concurrent crawls. Built for modern websites, it intelligently adapts to structural changes by automatically relocating elements when page layouts update. The framework includes advanced fetchers capable of bypassing anti-bot protections such as Cloudflare Turnstile using stealth and browser automation techniques. Its powerful spider system supports multi-session crawling,...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    hosts

    hosts

    Consolidate and extend hosts files from several well-curated sources

    Consolidating and extending hosts files from several well-curated sources. You can optionally pick extensions to block pornography, social media, and other categories. The unified hosts file is optionally extensible. Extensions are used to include domains by category. Currently, we offer the following categories: fakenews, social, gambling, and porn. Extensions are optional, and can be combined in various ways with the base hosts file. The combined products are stored in the alternates...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    Changelog CI

    Changelog CI

    Changelog CI is a GitHub Action that enables a project

    Changelog CI is a GitHub Action that enables a project to automatically generate changelogs. Changelog CI can be triggered on pull_request, workflow_dispatch, and any other events that can provide the required inputs. Changelog CI uses python and GitHub API to generate a changelog for a repository. First, it tries to get the latest release from the repository (If available). Then, it checks all the pull requests/commits merged after the last release using the GitHub API. After that, it...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Grab Framework Project

    Grab Framework Project

    Web Scraping Framework

    Grab is a python framework for building web scrapers. With Grab you can build web scrapers of various complexity, from simple 5-line scripts to complex asynchronous website crawlers processing millions of web pages. Grab provides an API for performing network requests and for handling the received content e.g. interacting with DOM tree of the HTML document. The single request/response API that allows you to build network request, perform it and work with the received content. The API is...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB