ML Libraries

Web

  • ml5.js - Friendly machine learning for the web.
  • ml.js - Machine learning tools in JavaScript.

Embedded

  • NNoM - High-level inference Neural Network library specifically for microcontrollers.

Other

  • SynapseML - Simple and Distributed Machine Learning. (Web) (Article)
  • imgaug - Image augmentation for machine learning experiments.
  • PlaidML - Framework for making deep learning work everywhere.
  • Leaf - Open Machine Intelligence Framework for Hackers. (GPU/CPU).
  • Apache MXNet - Deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity.
  • Sonnet - Library built on top of TensorFlow for building complex neural networks.
  • tvm - Open deep learning compiler stack for CPU, GPU and specialized accelerators.
  • dgl - Python package built to ease deep learning on graphs, on top of existing DL frameworks.
  • PySyft - Library for encrypted, privacy preserving deep learning.
  • numpy-ml - Machine learning, in numpy.
  • cuML - Suite of libraries that implement machine learning algorithms and mathematical primitive functions that share compatible APIs with other RAPIDS projects.
  • ONNX Runtime - Cross-platform, high performance scoring engine for ML models.
  • MLflow - Machine Learning Lifecycle Platform.
  • auto-sklearn - Automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
  • TensorNetwork - Library for easy and efficient manipulation of tensor networks.
  • lambda-ml - Small machine learning library aimed at providing simple, concise implementations of machine learning techniques and utilities.
  • scikit-learn - Python module for machine learning built on top of SciPy. (Tutorials) (Course) (Web) (HN) (Examples)
  • MLBox - Powerful Automated Machine Learning Python library.
  • Mlxtend (machine learning extensions) - Python library of useful tools for the day-to-day data science tasks.
  • CrypTen - Framework for Privacy Preserving Machine Learning built on PyTorch.
  • Faiss - Library for efficient similarity search and clustering of dense vectors. (Tips)
  • pyHSICLasso - Versatile Nonlinear Feature Selection Algorithm for High-dimensional Data.
  • AutoGluon - AutoML Toolkit for Deep Learning.
  • DeepLearning.scala - Simple library for creating complex neural networks from object-oriented and functional programming constructs.
  • Optuna - Hyperparameter optimization framework. (Optuna Dashboard)
  • Vowpal Wabbit - Machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning. (Web) (Article)
  • Brancher - User-centered Python package for differentiable probabilistic inference.
  • Karate Club - General purpose community detection and network embedding library for research built on NetworkX.
  • FlexFlow - Distributed deep learning framework that supports flexible parallelization strategies.
  • DeltaPy - Tabular Data Augmentation & Feature Engineering.
  • TensorStore - Library for reading and writing large multi-dimensional arrays.
  • FATE - Industrial Level Federated Learning Framework.
  • Deepkit - Collaborative and real-time machine learning training suite: Experiment execution, tracking, and debugging.
  • Sls - Stochastic Line Search.
  • PyCaret - Open source low-code machine learning library in Python that aims to reduce the hypothesis to insights cycle time in a ML experiment. (Web)
  • scikit-multilearn - Python module capable of performing multi-label learning tasks.
  • imbalanced-learn - Python package offering a number of re-sampling techniques commonly used in datasets showing strong between-class imbalance.
  • DeepSpeed - Deep learning optimization library that makes distributed training easy, efficient, and effective.
  • HoMM - Library for Homoiconic Meta-mapping.
  • Hummingbird - Library for compiling trained traditional ML models into tensor computations.
  • Ax - Accessible, general-purpose platform for understanding, managing, deploying, and automating adaptive experiments.
  • Neuropod - Uniform interface to run deep learning models from multiple frameworks.
  • aerosolve - Machine learning package built for humans in Scala.
  • Kur - Descriptive Deep Learning.
  • NNI (Neural Network Intelligence) - Lightweight but powerful toolkit to help users automate Feature Engineering, Neural Architecture Search, Hyperparameter Tuning and Model Compression.
  • LMfit-py - Non-Linear Least Squares Minimization, with flexible Parameter settings, based on scipy.optimize.leastsq, and with many additional classes and methods for curve fitting.
  • tslearn - Machine learning toolkit for time series analysis in Python.
  • Libra - Ergonomic machine learning for everyone. (Docs)
  • NGBoost - Natural Gradient Boosting for Probabilistic Prediction.
  • LightGBM - Gradient boosting framework that uses tree based learning algorithms.
  • XGBoost - Optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework.
  • DMLC-Core - Common bricks library for building scalable and portable distributed machine learning.
  • Linear Models - Add linear models including instrumental variable and panel data models that are missing from statsmodels.
  • skift - scikit-learn wrappers for Python fastText.
  • pulearn - Positive-unlabeled learning with Python.
  • pescador - Library for streaming (numerical) data, primarily for use in machine learning applications.
  • TPOT (Tree-based Pipeline Optimization Tool) - Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. (Docs)
  • GraKeL - Library that provides implementations of several well-established graph kernels. scikit-learn compatible.
  • creme - Python library for online machine learning. All the tools in the library can be updated with a single observation at a time, and can therefore be used to learn from streaming data. (Docs)
  • RecBole - Unified, comprehensive and efficient recommendation library.
  • NNFusion - Flexible and efficient DNN compiler that can generate high-performance executables from a DNN model description.
  • ncnn - High-performance neural network inference computing framework optimized for mobile platforms.
  • Scikit-Optimize - Sequential model-based optimization with a scipy.optimize interface.
  • scikit-rebate - Scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
  • Fedlearner - Collaborative machine learning framework that enables joint modeling of data distributed between institutions.
  • SkLearn2PMML - Python library for converting Scikit-Learn pipelines to PMML.
  • vecstack - Python package for stacking (machine learning technique).
  • LightSeq - High Performance Inference Library for Sequence Processing and Generation.
  • modestpy - Facilitates parameter estimation in models compliant with Functional Mock-up Interface.
  • Distiller - Open-source Python package for neural network compression research.
  • modAL - Modular active learning framework for Python.
  • Bambi - BAyesian Model-Building Interface in Python.
  • Bolt - Deep learning library with high performance and heterogeneous flexibility.
  • hypothesis - Python toolkit for (simulation-based) inference and the mechanization of science.
  • MMFeat - Multi-modal features toolkit in Python.
  • Flower - Friendly Federated Learning Framework. (Web) (Flower Summit 2021)
  • brain.js - GPU accelerated Neural networks in JavaScript for Browsers and Node.js. (Web)
  • Buffalo - Fast and scalable production-ready open source project for recommender systems.
  • EvalML - AutoML library that builds, optimizes, and evaluates machine learning pipelines using domain-specific objective functions.
  • MindSpore - New open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
  • Flashlight - Fast, Flexible Machine Learning in C++.
  • raster-deep-learning - ArcGIS built-in Python raster functions for deep learning to get you started fast.
  • CTranslate2 - Fast inference engine for OpenNMT models.
  • Causal Discovery Toolbox - Algorithms for graph structure recovery (including algorithms from the bnlearn, pcalg packages), mainly based on observational data.
  • FedML - Research Library and Benchmark for Federated Machine Learning.
  • Auto_TS - Automatically build multiple Time Series models using a Single Line of Code.
  • AutoGL (Auto Graph Learning) - AutoML framework & toolkit for machine learning on graphs.
  • tsalib - Tensor Shape Annotation Library (numpy, tensorflow, pytorch, ...).
  • MMClassification - Open source image classification toolbox based on PyTorch.
  • Nimble - Lightweight and Parallel GPU Task Scheduling for Deep Learning.
  • Dannjs - Neural Network library for JavaScript. (Web)
  • Shapley - Python library for evaluating binary classifiers in a machine learning ensemble.
  • Orion - Machine learning library built for unsupervised time series anomaly detection.
  • BigDL - Distributed Deep Learning on Apache Spark. (Docs)
  • MNN - Blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba.
  • Haste - CUDA implementation of fused RNN layers with built-in DropConnect and Zoneout regularization.
  • sklearn-xarray - Metadata-aware machine learning.
  • dabnn - Accelerated binary neural networks inference framework for mobile platforms.
  • OneFlow - Performance-centered and open-source deep learning framework.
  • DeepWalk - Deep Learning for Graphs. (Web)
  • sequitur - Autoencoders for sequence data.
  • cleanlab - Machine learning Python package for learning with noisy labels and finding label errors in datasets. (Web) (Lobsters)
  • deeptime - Python library for analysis of time series data including dimensionality reduction, clustering, and Markov model estimation.
  • Jelly Bean World - Framework for experimenting with never-ending learning.
  • Larq - Open-source deep learning library for training neural networks with extremely low precision weights and activations, such as Binarized Neural Networks (BNNs). (Web)
  • tsai - State-of-the-art Deep Learning for Time Series and Sequence Modeling.
  • edbo - Experimental Design via Bayesian Optimization.
  • TensorJS - JS/TS library for accelerated tensor computation intended to be run in the browser.
  • micro-TCN - Efficient neural networks for audio effect modeling. (Web)
  • DESlib - Python library for dynamic classifier and ensemble selection.
  • BytePS - High performance and generic framework for distributed DNN training.
  • Hyperactive - Hyperparameter optimization and meta-learning toolbox for convenient and fast prototyping of machine-learning models.
  • Jittor - Just-in-time (JIT) deep learning framework.
  • autofeat - Linear Prediction Model with Automated Feature Engineering and Selection Capabilities.
  • Distrax - Lightweight library of probability distributions and bijectors. It acts as a JAX-native reimplementation of a subset of TensorFlow Probability (TFP).
  • scikit-learn-extra - Set of useful tools compatible with scikit-learn.
  • GeneticAlgorithmPython - Building Genetic Algorithm in Python.
  • Newt - Gaussian process library in JAX.
  • Hedgehog - Bayesian networks in Python.
  • Backdoors 101 - PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models.
  • Sabertooth - Standalone pre-training recipe with JAX+Flax.
  • ProbFlow - Python package for building Bayesian models with TensorFlow or PyTorch.
  • Mars - Tensor-based unified framework for large-scale data computation which scales NumPy, pandas, scikit-learn and Python functions.
  • DeepMatch - Deep matching model library for recommendations & advertising.
  • Layout Parser - Unified toolkit for Deep Learning Based Document Image Analysis. (Web)
  • scikit-survival - Survival analysis built on top of scikit-learn.
  • PySR - Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing.
  • Snowman Hotword Detection
  • CLU - Contains common functionality for writing ML training loops using JAX.
  • SparseML - Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models.
  • CogDL - Extensive Toolkit for Deep Learning on Graphs. (Web)
  • TensorLy - Tensor Learning in Python. (Web)
  • Cornac - Comparative Framework for Multimodal Recommender Systems.
  • MegEngine - Fast, scalable and easy-to-use deep learning framework, with auto-differentiation.
  • SeqIO - Task-based datasets, preprocessing, and evaluation for sequence models.
  • OpenAI Python - Provides convenient access to the OpenAI API from applications written in Python.
  • Mesh Transformer JAX - Model parallel transformers in JAX and Haiku. (HN)
  • Checking out a 6-Billion parameter GPT model, GPT-J, from Eleuther AI (2021)
  • deepC - Vendor independent deep learning library, compiler and inference framework designed for small form-factor devices.
  • Dlib - Modern C++/Python Toolkit for Machine Learning. (Web) (HN)
  • Continuum - Clean and simple data loading library for Continual Learning.
  • Smile - Statistical Machine Intelligence & Learning Engine.
  • AugLy - Data augmentations library for audio, image, text, and video.
  • Surprise - Python scikit for building and analyzing recommender systems. (Web)
  • TNN - High-performance, lightweight neural network inference framework.
  • Parallax - Immutable Torch Modules for JAX.
  • EvalAI - Open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale. (Web)
  • Avalanche - End-to-End Library for Continual Learning. (Docs)
  • PyKale - Knowledge-Aware machine LEarning (KALE) from multiple sources in Python.
  • mltrace - Coarse-grained lineage and tracing for machine learning pipelines.
  • PPLNN - High-performance deep-learning inference engine for efficient AI inferencing.
  • Petastorm - Enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format.
  • Collie - Library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch. (Docs)
  • voxelmorph - Unsupervised Learning for Image Registration.
  • uTensor - TinyML AI inference library.
  • Tangram - Train a model from a CSV file on the command line. (Web) (HN)
  • AdaptDL - Resource-adaptive cluster scheduler for deep learning training.
  • Triage - General Purpose Risk Modeling and Prediction Toolkit for Policy and Social Good Problems.
  • Gorse - Open source recommender system service written in Go. (Web) (HN)
  • LensKit - Python Tools for Recommender Experiments. (Web)
  • StarSpace - Learning embeddings for classification, retrieval and ranking.
  • ELFI - Engine for Likelihood-Free Inference. (Docs)
  • DaisyRec - Python toolkit for rating prediction and item ranking.
  • AutoTS - Forecasting Model Selection for Multiple Time Series.
  • PyFlux - Open source time series library for Python.
  • trajax - Python library for differentiable optimal control on accelerators.
  • TransmogrifAI - End-to-end AutoML library for structured data written in Scala that runs on top of Apache Spark. (Web)
  • chitra - Multi-functional library for full-stack Deep Learning. It simplifies Model Building, API development, and Model Deployment.
  • DoubleML - Double Machine Learning in Python.
  • jaxfg - Factor graphs and nonlinear optimization in JAX.
  • pyltr - Python learning-to-rank toolkit with ranking models, evaluation metrics, data wrangling helpers, and more.
  • Wrangl - Ray-based parallel data preprocessing for NLP and ML.
  • Treex - Pytree-based Module system for Deep Learning in JAX. (Docs)
  • PhiFlow - Open-source simulation toolkit built for optimization and machine learning applications.
  • OpenVINO Toolkit - Deploy pre-trained deep learning models through a high-level C++ Inference Engine API integrated with application logic.
  • WILDS - Machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.
  • TurboTransformers - Fast and user-friendly runtime for transformer inference on CPU and GPU.
  • DeepOps - Mini Deep Learning framework supporting GPU accelerations written with CUDA.
  • Bayex - Bayesian Optimization Python Library powered by JAX.
  • Merlion - Machine Learning Framework for Time Series Intelligence.
  • Feast - Feature Store for Machine Learning. (Web)
  • nnabla - Neural Network Libraries by Sony. (Web)
  • RevLib - Simple and efficient RevNet-Library with DeepSpeed support.
  • DeepSparse - Neural network inference engine that delivers GPU-class performance for sparsified models on CPUs.
  • NVTabular - Engineering and preprocessing library for tabular data that is designed to easily manipulate terabyte scale datasets and train deep learning (DL) based recommender systems.
  • Treeo - Small library for creating and manipulating custom JAX Pytree classes.
  • FedJAX - JAX-based open source library for Federated Learning simulations that emphasizes ease-of-use in research.
  • oneAPI - OneAPI Deep Neural Network Library (oneDNN).
  • MosaicML Composer - Library of methods, and ways to compose them together for more efficient ML training.
  • deep-significance - Easy and Better Significance Testing for Deep Neural Networks.
  • Finetuner - Finetuning any DNN for better embedding on neural search tasks. (Docs)
  • mlcrate - Python module of handy tools and functions, mainly for ML and Kaggle.
  • mle-hyperopt - Lightweight Hyperparameter Optimization Tool.
  • Feature Engine - Python library with multiple transformers to engineer and select features for use in machine learning models.
  • BaaL - Bayesian active learning library.
  • TorchArrow - torch.Tensor-like DataFrame library supporting multiple execution runtimes and Arrow as a common memory format.
  • Arm NN - Software and tools that enable machine learning workloads on power-efficient devices.
  • OpenRec - Open-source and modular library for neural network-inspired recommendation algorithms.
  • ColossalAI - Unified Deep Learning System for Large-Scale Parallel Training. (Docs) (Examples)
  • XManager - Framework for managing machine learning experiments.
  • T5X - Modular, composable, research-friendly framework for high-performance, configurable, self-service training.
  • mlinspect - Inspect ML Pipelines in Python in the form of a DAG.
  • Privacy Lint - Library that allows you to perform a privacy analysis (Membership Inference) of your model in PyTorch.
  • NVIDIA Object Detection Toolkit (ODTK) - Fast and accurate single stage object detection with end-to-end GPU optimization.
  • DeAI - Decentralized privacy-preserving ML training software framework, using p2p networking.
  • Varuna - Tool for efficient training of large DNN models on commodity GPUs and networking.
  • reXmeX - General purpose recommender metrics library for fair evaluation.
  • Einshape - DSL-based reshaping library for JAX and other frameworks.
  • BlobCity AutoAI - Framework to find the best performing AI/ML model for any AI problem.
  • PyPAL - Multiobjective active learning with tunable accuracy/efficiency tradeoff and clear stopping criterion.
  • RecList - Behavioral "black-box" testing for recommender systems.
  • dcbench - Benchmark of data-centric tasks from across the machine learning lifecycle.
  • Cockpit - Visual and statistical debugger specifically designed for deep learning.
  • CatBoost - Machine learning method based on gradient boosting over decision trees. (Web) (Tutorials)
  • Xplique - Neural Networks Explainability Toolbox.
  • Causal ML - Python Package for Uplift Modeling and Causal Inference with ML.
  • sklearn-onnx - Convert scikit-learn models and pipelines to ONNX.
  • Tools for JAX - Variety of tools for the differentiable programming library JAX.
  • KML - Machine Learning Framework for Operating Systems & Storage Systems. (HN)
  • ENN Incubator - Collection of in-progress libraries for entity neural networks.
  • Syne Tune - Large scale and asynchronous Hyperparameter Optimization at your fingertips.
  • Maggy - Framework for distribution transparent machine learning experiments on Apache Spark.
  • Apache SINGA - Distributed deep learning system. (Web)
  • Tiny CUDA Neural Networks - Lightning fast & tiny C++/CUDA neural network framework.
  • Apache TVM - Open Deep Learning Compiler Stack.
  • imodels - Interpretable ML package for concise, transparent, and accurate predictive modeling (sklearn-compatible).
  • FLSim - Flexible, standalone library written in PyTorch that simulates FL settings with a minimal, easy-to-use API.
  • Human Learn - Machine Learning models should play by the rules, literally.
  • MiniTorch - DIY teaching library for machine learning engineers who wish to learn about the internal concepts underlying deep learning systems.
  • TorchRecipes - Train machine learning models with a couple of lines of code.
  • DABS - Domain-Agnostic Benchmark for Self-Supervised Learning.
  • apricot - Implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly.
  • Theseus - Library for differentiable nonlinear optimization built on PyTorch.
  • MMSelfSup - OpenMMLab Self-Supervised Learning Toolbox and Benchmark.
  • NVFlare - NVIDIA Federated Learning Application Runtime Environment. (Docs)
  • OSLO - Open Source framework for Large-scale transformer Optimization.
  • snntorch - Deep and online learning with spiking neural networks in Python.
  • NVIDIA DALI - GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
  • MIPLearn - Framework for solving discrete optimization problems using a combination of Mixed-Integer Linear Programming (MIP) and Machine Learning (ML).
  • tree-math - Mathematical operations for JAX pytrees.
  • ExplainX - Explainable AI framework for data scientists. Explain & debug any blackbox machine learning model with a single line of code.
  • Contextual AI - Adds explainability to different stages of machine learning pipelines.
  • jax_dataclasses - Pytrees + static analysis.
  • kingly - Zero-cost state-machine library for robust, testable and portable user interfaces (most machines compile ~1-2KB).
  • RTNeural - Lightweight neural network inferencing engine written in C++.
  • JAXopt - Hardware accelerated, batchable and differentiable optimizers in JAX.
  • chop - Optimization library based on PyTorch, with applications to adversarial examples and structured neural network training.
  • WebDNN - Fastest DNN Running Framework on Web Browser.
  • nonconformist - Python implementation of the conformal prediction framework.
  • jaxdf - JAX-based research framework for writing differentiable numerical simulators with arbitrary discretizations.
  • DoWhy - End-to-end library for causal inference.
  • hypopt - Parallelized hyper-param optimization with validation set, not crossval.
  • ML Collections - Library of Python Collections designed for ML use cases.
  • Latte - Cross-framework Python Package for Evaluation of Latent-based Generative Models.
  • Raster Vision - Open source framework for deep learning on satellite and aerial imagery.
  • SPEAR - Semi-Supervised Data Programming for Data Efficient Machine Learning.
  • Ivy - Unified machine learning framework, enabling framework-agnostic functions, layers and libraries. (Web)
  • NeuralForecast - Python library for time series forecasting with deep learning models.
  • pythae - Library for Variational Autoencoder benchmarking.
  • Pyraug - Data Augmentation with Variational Autoencoders.
  • product-quantization - Implementation of vector quantization algorithms, codes for Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search.
  • learned_optimization - Training and evaluating learned optimizers in JAX.
  • OTT - Sturdy, versatile and efficient optimal transport solvers, taking advantage of JAX features, such as JIT, auto-vectorization and implicit differentiation.
  • Marian - Efficient Neural Machine Translation framework written in pure C++ with minimal dependencies. (Web)
  • segmind - MLOps for end-to-end deep learning lifecycle.
  • FLUTE - Federated Learning Utilities and Tools for Experimentation.
  • evosax - JAX-Based Evolution Strategies.
  • Neural Processes - Framework for composing Neural Processes in Python.
  • Anomalib - Library for benchmarking, developing and deploying deep learning anomaly detection algorithms.
  • Fasterai - Library to make smaller and faster models with FastAI.
  • ClearML Server - Auto-Magical Suite of tools to streamline your ML workflow. Experiment Manager, ML-Ops and Data-Management.
  • Human Library - 3D Face Detection & Rotation Tracking, Face Description & more.
  • Towhee - Flexible, application-oriented framework for generating embedding vectors via a pipeline of ML models and other operations.
  • AutoFaiss - Automatically create Faiss knn indices with the most optimal similarity search parameters.
  • Statistical Forecast - Lightning fast forecasting with statistical and econometric models.
  • MLSpec - Standardize the intercomponent schemas for a multi-stage ML Pipeline.
  • Alfred Python - Command line tool for deep-learning usage.
  • Bacon - Framework for orchestrating machine learning experiments on AWS.
  • PyClustering - Python, C++ data mining library.
  • PQk-means - Fast and memory-efficient clustering.
  • LeanTransformer - Memory-efficient transformer.
  • HoloClean - Machine Learning System for Data Enrichment. Built on top of PyTorch and PostgreSQL.
  • OpenDelta - Open-Source Framework for Parameter Efficient Tuning (Delta Tuning).
  • Alpa - Automatically parallelizes tensor computational graphs and runs them on a distributed cluster.
  • GPBoost - Combining Tree-Boosting with Gaussian Process and Mixed Effects Models.
  • CORDS - Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using coresets and data selection.
  • DISTIL - Cut down your labeling cost and time by 3x-5x.
  • OpenFL - Open-Source Framework For Federated Learning.
  • Basenji - Sequential regulatory activity predictions with deep convolutional neural networks.
  • PyDP - Python Differential Privacy Library.
  • veGiantModel - Torch-based, highly efficient training library developed by the Applied Machine Learning team at Bytedance.
  • Flame - Federated learning system for edge with flexibility and scalability at the core of its design.
  • DPU Utilities - Utilities used by the Deep Program Understanding team.
  • XGBoost-Ray - Distributed backend for XGBoost, built on top of distributed computing framework Ray.
  • Easy Parallel Library - General and efficient library for distributed model training.
  • MetricFlow - Allows you to define, build, and maintain metrics in code.
  • HuggingFace Evaluate
  • PADL - Pipeline Abstractions for Deep Learning.
  • Vertex AI SDK for Python - Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.
  • Tempo - MLOps Python Library.
  • LightFM - Python implementation of LightFM, a hybrid recommendation algorithm.
  • fklearn - Functional Machine Learning.
  • Transformer PhysX - Transformers for modeling physical systems.
  • Feathr - Enterprise-Grade, High Performance Feature Store. (Article)
  • To what extent can Rust be used for Machine Learning? (2022)
  • Vectorflow - Minimalist neural network library optimized for sparse data and single machine environments.
  • D2Go - Toolkit for efficient deep learning.
  • Slideflow - Deep learning pipeline for histology image analysis, with both Tensorflow and PyTorch support.
  • Forte - Bring good software engineering to your ML solutions, starting from Data.
  • Machine Learning(-ish) nix packages
  • PaddleSeg - High-Efficient Development Toolkit for Image Segmentation.
  • TorchSparse - High-performance neural network library for point cloud processing.
  • H2O - In-memory platform for distributed, scalable machine learning.
  • Ranger - Synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one code base.
  • Unseal - Mechanistic Interpretability for Transformer Models.
  • ANTsPy - Advanced Normalization Tools in Python.
  • FasterTransformer Backend - Triton backend for the FasterTransformer.
  • Nixtla - Automated time series processing and forecasting.
  • FederatedScope - Easy-to-use federated learning platform.
  • Habitat Lab - Modular high-level library to train embodied AI agents across a variety of tasks, environments, and simulators.
  • Ranger21 - Integrating the latest deep learning components into a single optimizer.
  • Tevatron - Flexible toolkit for dense retrieval research and development.
  • mlrose - Python package for implementing a number of Machine Learning, Randomized Optimization and SEarch algorithms.
  • Scikit-Learn Compiled Trees
  • KotlinDL - High-level Deep Learning Framework written in Kotlin and inspired by Keras.
  • PGBM - Probabilistic Gradient Boosting Machines.
  • Fiddle - Python-first configuration library particularly well suited to ML applications.
  • tpunicorn - Python library and command-line program for managing TPUs.
  • CLAP - Contrastive Language-Audio Pretraining.
  • COMET - Neural Framework for MT Evaluation.
  • Magnitude - Feature-packed Python package and vector storage file format for utilizing vector embeddings in machine learning models.
  • TorchANI - Accurate Neural Network Potential on PyTorch.
  • gap-train - Gaussian Approximation Potential Training.
  • lleaves - LLVM-based compiler for LightGBM decision trees.
  • TensorScript - High-level language for specifying finite-dimensioned tensor computation. (Web)
  • Neural Fluid Fields - Small library for doing fluid simulation with neural fields.
  • OmniXAI - Library for eXplainable AI.
  • mmap.ninja - Library for storing your datasets in memory-mapped files, which leads to a dramatic speedup in the training time. Accelerate the iteration over your machine learning dataset by up to 20 times.
  • geomloss - Geometric loss functions between point clouds, images and volumes.
  • morphsnakes - Implementation of the Morphological Snakes for image segmentation. Supports 2D images and 3D volumes.
  • HyperLib - Common Neural Network components in the hyperbolic space (using the Poincare model).
  • Lite.Ai.ToolKit - C++ toolkit of awesome AI models.
  • RecZilla - Metalearning for algorithm selection on Recommender Systems.
  • EdgeML - Machine learning algorithms for edge devices developed at Microsoft Research India.
  • Quaterion - Framework for fine-tuning similarity learning models.
  • SecretFlow - Unified framework for privacy-preserving data analysis and machine learning.
  • pycox - Python package for survival analysis and time-to-event prediction with PyTorch.
  • AI2 Tango - Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.
  • ADAPT - Awesome Domain Adaptation Python Toolbox.
  • giotto-deep - Deep learning made topological.
  • DeepSpeed-MII - Library from DeepSpeed, designed to make low-latency, low-cost inference of powerful transformer models possible.
  • logreg - Bayesian inference for a logistic regression model in various languages.
  • PINA - Physics-Informed Neural networks for Advanced modeling.
  • PyCave - Traditional Machine Learning Models for Large-Scale Datasets in PyTorch.
  • Draco - Formal framework for representing design knowledge about effective visualization design as a collection of constraints.
  • GRAPE - Rust/Python library for high-performance Graph Processing and Embedding.
  • dp-transformers - Differentially-private transformers using HuggingFace and Opacus.
  • TinyMaix - Tiny inference library for microcontrollers (TinyML).
  • x-unet - Implementation of a U-net complete with efficient attention as well as the latest research findings.