Sizing and Requirements
This page is operating guidance based on the current architecture. It is not a matrix of hard product limits.
What drives load in this system
The biggest load multipliers are:
- number of active tenants
- polling frequency for messages, payloads, packages, and artifacts
- amount of historical backfill
- payload download depth
- archive frequency and retention
- number of concurrent users
- number of active Celery worker processes
- whether PostgreSQL and Redis are internal or external
Built-in interval defaults
Current platform metric defaults are:
- host metrics every 600 seconds
- Django metrics every 300 seconds
- storage quick snapshots every 3600 seconds
- storage deep snapshots every 21600 seconds
Current scheduled report dispatch batch limit: 50 jobs per periodic pass.
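As a rough illustration of how these defaults translate into periodic task pressure, the sketch below expresses them as a Celery beat schedule. The task names and the dispatch helper are hypothetical; only the intervals and the 50-job cap come from the defaults above.

```python
# Illustrative only: task names and the dispatch helper are hypothetical;
# the intervals and the 50-job cap are the documented defaults.
from celery import Celery

app = Celery("platform", broker="redis://localhost:6379/0")

app.conf.beat_schedule = {
    "host-metrics":           {"task": "metrics.collect_host",   "schedule": 600.0},
    "django-metrics":         {"task": "metrics.collect_django", "schedule": 300.0},
    "storage-quick-snapshot": {"task": "storage.quick_snapshot", "schedule": 3600.0},
    "storage-deep-snapshot":  {"task": "storage.deep_snapshot",  "schedule": 21600.0},
}

REPORT_DISPATCH_BATCH_LIMIT = 50  # jobs per periodic pass

def dispatch_due_reports(due_jobs):
    """Send at most the batch limit per pass; the remainder waits for
    the next periodic tick."""
    for job in due_jobs[:REPORT_DISPATCH_BATCH_LIMIT]:
        job.send()
```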
Small installation
Use this as a starting point for:
- 1 to 5 users
- low or moderate alert volume
- a handful of tenants
- limited payload and archive depth
Recommended shape:
- SQLite can be acceptable
- internal Redis can be acceptable
- worker profile: safe or balanced
Starting infrastructure guidance:
- CPU: 2 vCPU
- RAM: 4 GB
- disk: 20 to 50 GB
- storage: persistent mount for /app/data
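A short preflight check can confirm a host meets this baseline before first start. This is a minimal sketch, assuming a Linux host; the thresholds mirror the guidance above, and /app/data is the documented persistent mount.

```python
import os
import shutil

def check_small_install(data_path="/app/data"):
    cpus = os.cpu_count() or 0
    disk_gb = shutil.disk_usage(data_path).free / 1e9
    # Linux-only RAM probe: the first line of /proc/meminfo is MemTotal in kB.
    with open("/proc/meminfo") as f:
        ram_gb = int(f.readline().split()[1]) / 1e6

    assert cpus >= 2, f"need >= 2 vCPU, found {cpus}"
    assert ram_gb >= 4, f"need >= 4 GB RAM, found {ram_gb:.1f} GB"
    assert disk_gb >= 20, f"need >= 20 GB free on {data_path}, found {disk_gb:.0f} GB"
```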
Operational notes:
- keep polling conservative
- avoid heavy cold backfills during peak hours
- accept that internal Redis has no persistence
Medium installation
Use this as a starting point for:
- multiple admins
- regular background processing
- higher message volume
- meaningful archive and payload usage
Recommended shape:
- PostgreSQL recommended
- external Redis recommended
- worker profile: balanced
Starting infrastructure guidance:
- CPU: 4 vCPU
- RAM: 8 to 16 GB
- disk: 100 GB and upward, depending on archive retention
- network: stable low-latency path to Redis and PostgreSQL
Operational notes:
- queue monitoring becomes mandatory (see the queue-depth sketch after this list)
- table growth review should be part of routine operations
- use mounted or managed persistent storage for archives
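Queue monitoring can start as a simple depth probe. This sketch assumes Celery's default Redis broker layout, where pending tasks sit in a Redis list named after the queue (celery by default); the host and the alert threshold are placeholders to tune per deployment.

```python
import redis

r = redis.Redis(host="redis.internal", port=6379, db=0)

def queue_depth(queue="celery"):
    # Celery's default Redis broker keeps pending tasks in one list per queue.
    return r.llen(queue)

depth = queue_depth()
print(f"celery queue depth: {depth}")
if depth > 1000:  # illustrative threshold
    print("WARNING: workers are falling behind")
```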
Large installation
Use this as a starting point for:
- many tenants
- constant CPI ingestion
- heavier payload inspection
- broad operational reporting
Recommended shape:
- dedicated PostgreSQL or managed PostgreSQL
- dedicated Redis or managed Redis
- worker profile: fast, or custom only after evidence-based tuning
Starting infrastructure guidance:
- CPU: 8 vCPU or more
- RAM: 16 to 32 GB or more
- disk: sized primarily around DB growth and archive retention
- storage: strong IOPS matter more than raw capacity alone
Operational notes:
- monitor queue depth continuously
- monitor DB write latency and vacuum behavior (see the vacuum check after this list)
- separate infrastructure services are strongly preferred
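For the vacuum side, PostgreSQL's standard statistics views are usually enough to spot tables that autovacuum is not keeping up with. A minimal sketch, with placeholder connection settings:

```python
import psycopg2

QUERY = """
SELECT relname, n_dead_tup, n_live_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
"""

with psycopg2.connect("dbname=app user=app host=db.internal") as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for name, dead, live, last_vacuum in cur.fetchall():
            print(f"{name}: {dead} dead / {live} live, last autovacuum {last_vacuum}")
```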
CPU considerations
CPU demand increases with:
- many Celery workers
- archive export and compression
- PDF export generation
- AI task execution
- message parsing and batch processing
The container image also includes Chromium, Java, PostgreSQL, Redis, Nginx, and the Python runtime, so it is not a minimal single-process image.
RAM considerations
Idle memory footprint is affected by:
- Gunicorn with 4 workers and 2 threads each (see the config sketch after this list)
- many Celery worker processes
- internal PostgreSQL
- internal Redis
- Python process duplication across worker groups
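For reference, the Gunicorn shape above corresponds to a config like the following. Gunicorn reads these module-level names from a Python config file; whether your deployment exposes such a file or passes equivalent flags is deployment-specific.

```python
# gunicorn.conf.py
workers = 4   # documented default
threads = 2   # documented default
# On a constrained host, halving workers is usually the first cut:
# workers = 2
```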
For constrained environments, the first safe levers are:
- reduce Celery concurrency (see the sketch after this list)
- use external PostgreSQL
- use external Redis
- keep archive and cold backfill activity moderate
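The concurrency lever, as a sketch using standard Celery settings (the app name is a placeholder):

```python
from celery import Celery

app = Celery("platform")
app.conf.worker_concurrency = 2            # fewer worker processes
app.conf.worker_prefetch_multiplier = 1    # steadier per-worker memory
app.conf.worker_max_tasks_per_child = 500  # recycle workers to cap slow growth
```

The same value can also be passed on the worker command line via --concurrency.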
Disk and IOPS considerations
Disk demand is shaped by:
- SQLite write amplification if SQLite is used
- PostgreSQL table and index growth (see the size query at the end of this section)
- archive exports
- job and service logs
- payload and attachment retention
IOPS become important when:
- many workers write at once
- archive cleanup triggers heavy DB maintenance
- storage snapshots run during active ingestion windows
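To review table and index growth concretely, PostgreSQL's size functions can be polled periodically and diffed over time. A minimal sketch, with placeholder connection settings:

```python
import psycopg2

QUERY = """
SELECT relname, pg_total_relation_size(relid) AS total_bytes
FROM pg_catalog.pg_statio_user_tables
ORDER BY total_bytes DESC
LIMIT 10;
"""

with psycopg2.connect("dbname=app user=app host=db.internal") as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for name, size in cur.fetchall():
            print(f"{name}: {size / 1e9:.2f} GB (table + indexes + toast)")
```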
Network considerations
Network quality matters for:
- PostgreSQL round trips
- Redis round trips
- CPI API calls
- email delivery
- managed AI backends
If Redis or PostgreSQL are remote, low latency improves queue throughput and lock behavior.
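A quick way to quantify that path is to time round trips from the application host, so the measured latency matches what workers actually see. This sketch uses placeholder hosts and credentials:

```python
import time
import psycopg2
import redis

def timed(label, fn, rounds=20):
    start = time.perf_counter()
    for _ in range(rounds):
        fn()
    print(f"{label}: {(time.perf_counter() - start) / rounds * 1000:.2f} ms avg")

r = redis.Redis(host="redis.internal", port=6379)
conn = psycopg2.connect("dbname=app user=app host=db.internal")
cur = conn.cursor()

timed("Redis PING", r.ping)
timed("PostgreSQL SELECT 1", lambda: (cur.execute("SELECT 1"), cur.fetchone()))
```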
Practical tuning order
When throughput is not sufficient, the safest order is:
- validate queue depth and Redis health
- validate DB latency and write pressure
- review polling frequencies and cold backfill behavior
- only then raise worker concurrency
This order matches how the current architecture actually bottlenecks.