A curated list of Site Reliability and Production Engineering Tools
Updated Mar 18, 2026
Lightweight, self-contained Linux® server monitoring tool
A Simple Monitoring Dashboard for Docker Swarm Cluster
Data Center workload and software optimizations for Intel hardware.
Dashboard for Docker Swarm Cluster
📊 Analyze and monitor Microsoft Intune Management Extension logs on Windows for real-time insights and error detection.
Advanced stealth web data collection framework for security
Utility to test and wipe hard disks and SSDs
Awesome Uptime Monitoring
Identify unused Google Cloud Platform resources through Prometheus metrics
A collection of scripts that extend EventSentry's functionality.
Network-Based Intrusion Detection System - dev/deployment
Command line client for interacting with checkson.io
My Artificial Intelligence Log Sentinel for Postfix and beyond...
Real-time log file monitoring with pattern highlighting and desktop notifications. Cross-platform Rust CLI tool with regex matching and file rotation support.
🌐 Explore VandCloud, a cross-platform app to browse, test, and monitor APIs and services with real-time status updates.
Wazuh integration to send alerts to Keep (open-source alert management and AIOps platform)
🤖 Simplify IT operations with Wuhr AI Ops, an AI-driven platform for real-time monitoring, log analysis, and seamless CI/CD management.
🖥️ Monitor RAM and CPU usage in Proxmox for hosts, LXC, and QEMU/KVM VMs with clear visuals and detailed metrics for better resource management.