The described uptime monitoring system uses FastAPI for the backend, PostgreSQL as its database, Celery with Redis for task queuing and background polling of endpoints, and a separate service for notifications. It monitors endpoint health by periodically checking each endpoint's availability and responsiveness. A check counts as a failure when the response is too slow or returns an HTTP 4xx or 5xx status code, and failures trigger alerts through the notification service. The dashboard reports uptime statistics and latency metrics. The architecture covers the fundamentals of a monitoring system, but the Celery-based polling approach may run into scaling problems with thousands of endpoints: task distribution and execution can become bottlenecks, especially when many targets require high-frequency checks.
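The failure rule described above can be illustrated with a minimal sketch. The names `CheckResult`, `failure_reason`, and the 2-second latency threshold are assumptions made for illustration; the source does not specify a threshold or how a missing response is handled.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: the source does not state a latency limit.
SLOW_RESPONSE_SECONDS = 2.0


@dataclass
class CheckResult:
    status_code: Optional[int]  # None when no HTTP response was received
    latency_seconds: float


def failure_reason(
    result: CheckResult, slow_after: float = SLOW_RESPONSE_SECONDS
) -> Optional[str]:
    """Return a short failure label, or None when the check is healthy."""
    if result.status_code is None:
        # Assumption: a timeout or connection error counts as a failure.
        return "no_response"
    if 400 <= result.status_code <= 599:
        # 4xx and 5xx status codes are failures, as described in the text.
        return f"http_{result.status_code}"
    if result.latency_seconds > slow_after:
        return "slow_response"
    return None
```

A polling task could call `failure_reason` on each result and forward any non-`None` label to the notification service, which keeps the detection rule in one place and easy to test.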
- FastAPI
- PostgreSQL
- Celery + Redis
- Evaluate and implement horizontal scaling for FastAPI, PostgreSQL, and Celery services to handle increased load.
- Consider migrating to a more scalable task queue if Celery proves inefficient under high load.
- Update the notification service configuration to handle multiple alert streams efficiently.
In a typical homelab stack with limited resources, this architecture may run into performance problems as the number of monitored endpoints grows. The FastAPI application and the PostgreSQL database need to be scaled or optimized carefully to handle the resulting volume of request logs.
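To make the scaling concern concrete, the load can be estimated with simple arithmetic. The sketch below is a rough back-of-the-envelope model, not a measurement: `estimate_load` and its parameters are hypothetical, and it assumes one stored row per check and that each worker runs a fixed number of checks concurrently.

```python
import math


def estimate_load(
    endpoints: int,
    interval_seconds: float,
    avg_check_seconds: float,
    worker_concurrency: int,
) -> dict:
    """Rough load estimate for a polling-based monitor."""
    # Every endpoint is checked once per interval.
    checks_per_second = endpoints / interval_seconds
    # Little's law: in-flight checks = arrival rate * average duration.
    in_flight = checks_per_second * avg_check_seconds
    # Workers needed if each one handles `worker_concurrency` checks at once.
    workers_needed = math.ceil(in_flight / worker_concurrency)
    # Assumption: one result row is written per check.
    rows_per_day = endpoints * 86_400 / interval_seconds
    return {
        "checks_per_second": checks_per_second,
        "workers_needed": workers_needed,
        "rows_per_day": rows_per_day,
    }


estimate = estimate_load(
    endpoints=5000,
    interval_seconds=30,
    avg_check_seconds=0.5,
    worker_concurrency=8,
)
```

With these illustrative inputs, 5000 endpoints checked every 30 seconds produce roughly 167 checks per second, need 11 workers, and write 14.4 million rows per day, which shows why both the task queue and the database deserve attention on small hardware.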