Storage and Logs

IntegraMon stores data in several places at once. Some of these locations are persistent by design; others are explicitly temporary.

Main storage roots

Global runtime root

  • DATA_DIR
  • default container path: /app/data

This is the main base directory for:

  • SQLite database files
  • tenant archive folders
  • tenant job logs
  • storage-related exports and per-tenant runtime files

Global log root

  • LOG_DIR
  • default: ${DATA_DIR}/logs

Supervisor writes service logs here.
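
In code, the two roots resolve roughly like this (a minimal Python sketch of the documented defaults; the helper name resolve_roots is illustrative, not IntegraMon's actual API):

```python
def resolve_roots(env):
    """Resolve the storage roots from an environment mapping.

    Mirrors the documented defaults (DATA_DIR=/app/data,
    LOG_DIR=${DATA_DIR}/logs); the real resolution code may differ.
    """
    data_dir = env.get("DATA_DIR", "/app/data")
    # LOG_DIR defaults relative to DATA_DIR, so overriding DATA_DIR
    # moves the log root with it unless LOG_DIR is set explicitly.
    log_dir = env.get("LOG_DIR", f"{data_dir}/logs")
    return data_dir, log_dir

print(resolve_roots({}))                         # ('/app/data', '/app/data/logs')
print(resolve_roots({"DATA_DIR": "/mnt/data"}))  # ('/mnt/data', '/mnt/data/logs')
```

Passing the mapping in (rather than reading os.environ directly) keeps the resolution testable.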

Service log locations

Supervisor-managed log files typically include:

  • ${LOG_DIR}/postgres.log
  • ${LOG_DIR}/pg-init.log
  • ${LOG_DIR}/redis.log
  • ${LOG_DIR}/migrate.log
  • ${LOG_DIR}/gunicorn.log
  • ${LOG_DIR}/celery-light.log
  • ${LOG_DIR}/celery-light-cold.log
  • ${LOG_DIR}/celery-api-details.log
  • ${LOG_DIR}/celery-api-details-cold.log
  • ${LOG_DIR}/celery-alert-details.log
  • ${LOG_DIR}/celery-alerts.log
  • ${LOG_DIR}/celery-periodic.log
  • ${LOG_DIR}/celery-ai.log
  • ${LOG_DIR}/celery-messagelog-process-batch.log
  • ${LOG_DIR}/celery-messagelog-process-batch-cold.log
  • ${LOG_DIR}/celery-medium.log
  • ${LOG_DIR}/worker.log
  • ${LOG_DIR}/nginx.log
  • ${LOG_DIR}/login_activity.log

These are global platform logs.

Tenant job log locations

Per-tenant job logs are resolved by core/worker_logging.py.

Default pattern:

  • <DATA_DIR>/<config.name>/logs/jobs/<jobname>/<YYYY-MM-DD>.log
  • <DATA_DIR>/<config.name>/logs/runs/<jobname>/<run_id>.log

This can be overridden by tenant configuration through:

  • cConfigExt(name="global").value["data_dir"]
  • cConfigExt(name="jobs.logs").value["data_dir"]

Supported tokens:

  • {data_dir}
  • {config}
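
The default patterns and the two tokens can be sketched as follows (hypothetical helpers; the real resolution lives in core/worker_logging.py and may differ in detail):

```python
from datetime import date

def job_log_path(data_dir, config_name, job_name, day=None, pattern=None):
    """Resolve an aggregated per-tenant job log path.

    Default mirrors the documented layout
    <DATA_DIR>/<config.name>/logs/jobs/<jobname>/<YYYY-MM-DD>.log.
    A tenant override pattern may use the {data_dir} and {config} tokens.
    """
    day = day or date.today()
    base = (pattern or "{data_dir}/{config}/logs").format(
        data_dir=data_dir, config=config_name)
    return f"{base}/jobs/{job_name}/{day:%Y-%m-%d}.log"

def run_log_path(data_dir, config_name, job_name, run_id, pattern=None):
    """Resolve a per-run log path: .../logs/runs/<jobname>/<run_id>.log."""
    base = (pattern or "{data_dir}/{config}/logs").format(
        data_dir=data_dir, config=config_name)
    return f"{base}/runs/{job_name}/{run_id}.log"

# Tenant name, job name, and run id below are made-up example values.
print(job_log_path("/app/data", "acme", "poll_messages", date(2024, 5, 1)))
# /app/data/acme/logs/jobs/poll_messages/2024-05-01.log
print(run_log_path("/app/data", "acme", "poll_messages", "run-42"))
# /app/data/acme/logs/runs/poll_messages/run-42.log
```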

Archive storage

Archive jobs create tenant-specific archive directories under:

  • <tenant data dir>/<config_name>/archive

Current default archive naming patterns:

  • PostgreSQL export: <config>_YYYY_MM_DD.archive
  • SQLite export: <config>_YYYY_MM_DD.archive.sqlite
  • optionally compressed SQLite export: the same name with a .gz suffix
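
A sketch of the naming rules above (archive_filename is a hypothetical helper; the real archive job may build names differently):

```python
from datetime import date

def archive_filename(config_name, day, kind="postgres", compressed=False):
    """Build an archive file name following the documented patterns:

    <config>_YYYY_MM_DD.archive          for PostgreSQL exports
    <config>_YYYY_MM_DD.archive.sqlite   for SQLite exports
    ...plus a .gz suffix for compressed SQLite exports.
    """
    name = f"{config_name}_{day:%Y_%m_%d}.archive"
    if kind == "sqlite":
        name += ".sqlite"
        if compressed:
            name += ".gz"
    return name

print(archive_filename("acme", date(2024, 5, 1)))                  # acme_2024_05_01.archive
print(archive_filename("acme", date(2024, 5, 1), "sqlite", True))  # acme_2024_05_01.archive.sqlite.gz
```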

The archive database model is cpiArchive.

Data that grows quickly

The main growth drivers are:

  • cpiMessageLog
  • cpiMessageLogRuns
  • cpiPayload
  • cpiMessageAttachment
  • cpiCustomHeaderProperties
  • tenant archive files
  • job and service logs
  • metric snapshot tables over long periods
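
Per-table row counts are a cheap way to watch these growth drivers on the SQLite backend (illustrative snippet, not part of IntegraMon; on PostgreSQL, pg_total_relation_size() reports real on-disk sizes instead):

```python
import sqlite3

def table_row_counts(conn):
    """Return {table_name: row_count} for a SQLite connection.

    Row counts are only a growth proxy; they ignore payload blob sizes.
    """
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    return {t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
            for t in tables}

# Demo against an in-memory database standing in for the real DB file:
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE cpiMessageLog (id INTEGER PRIMARY KEY)")
demo.executemany("INSERT INTO cpiMessageLog (id) VALUES (?)", [(1,), (2,), (3,)])
counts = table_row_counts(demo)
print(counts)  # {'cpiMessageLog': 3}
```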

Temporary versus persistent data

Persistent by design

  • SQLite DB file under DATA_DIR
  • tenant archives
  • database rows in PostgreSQL or SQLite
  • report artifacts stored in DB by default
  • metric snapshot tables

Temporary or rebuildable

  • Redis cache content
  • Redis broker queue state when internal Redis is used
  • documentation index cache
  • generated runtime frontend config /var/www/html/app-config.js

Not persisted automatically unless mounted separately

  • /var/log/nginx/access.log
  • /var/log/nginx/error.log
  • in-container PostgreSQL cluster storage when using local DB without a mounted volume

Cleanup behavior

Log cleanup

Archive runs also trigger log cleanup:

  • aggregated job logs are deleted by filename date
  • run logs are deleted by file modification time
  • the environment-variable fallback for retention is LOG_RETENTION_DAYS=3

Tenant-specific jobs.logs config can override that retention at runtime.
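
The filename-date rule for aggregated job logs can be sketched like this (expired_job_logs is a hypothetical helper using the LOG_RETENTION_DAYS=3 fallback; run logs would be filtered by file mtime instead):

```python
import re
from datetime import date, timedelta

def expired_job_logs(filenames, retention_days=3, today=None):
    """Pick aggregated job logs to delete by the date in the filename.

    Filenames follow the <YYYY-MM-DD>.log pattern; anything older than
    the retention window is selected, anything else is left alone.
    """
    today = today or date.today()
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in filenames:
        m = re.fullmatch(r"(\d{4}-\d{2}-\d{2})\.log", name)
        if m and date.fromisoformat(m.group(1)) < cutoff:
            expired.append(name)
    return expired

print(expired_job_logs(
    ["2024-05-01.log", "2024-05-09.log", "notes.txt"],
    retention_days=3, today=date(2024, 5, 10)))
# ['2024-05-01.log']
```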

Archive cleanup

archive_cpi_day supports:

  • days_ago default 10
  • delete_older_than default 30

That means archive processing and archive deletion are built into the same task family.
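
Assuming the two parameters mean "which day to archive" and "how old an archive must be before deletion", the resulting dates look like this (illustrative sketch; the exact semantics inside archive_cpi_day may differ):

```python
from datetime import date, timedelta

def archive_window(today, days_ago=10, delete_older_than=30):
    """Compute the two dates an archive run cares about, using the
    documented archive_cpi_day defaults."""
    archive_day = today - timedelta(days=days_ago)        # day to archive
    delete_cutoff = today - timedelta(days=delete_older_than)  # delete below this
    return archive_day, delete_cutoff

print(archive_window(date(2024, 5, 10)))
# (datetime.date(2024, 4, 30), datetime.date(2024, 4, 10))
```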

Config cleanup

The platform cleanup tasks delete tenant-related records in batches and also purge cache keys and archive references. Relevant models include:

  • cConfig
  • cConfigCleanupHistory
  • cWorkerJobRun
  • archive and CPI data models connected to the tenant

Rotation and hard limits

What exists today:

  • Supervisor logfile max size: 10MB
  • Supervisor logfile backups: 3
  • log cleanup through archive tasks
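
The size and backup limits map to per-program Supervisor settings along these lines (illustrative config fragment; the shipped supervisord.conf may differ in program names and options):

```ini
[program:gunicorn]
command=gunicorn ...
stdout_logfile=%(ENV_LOG_DIR)s/gunicorn.log
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=3
redirect_stderr=true
```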

What does not exist globally:

  • one central storage quota system
  • hard archive size caps
  • automatic DB retention pruning for all large tables

Operators should therefore treat storage review as an active duty, not a fully automated one.

Typical growth patterns

Expected storage growth is mostly driven by:

  • polling frequency
  • count of monitored tenants
  • payload download depth
  • archive retention window
  • log verbosity and worker count

Practical pattern:

  • message tables grow first
  • archives grow next
  • logs grow steadily with worker count and troubleshooting activity

Recommendations:

  • mount DATA_DIR to persistent storage
  • monitor archive folder growth per tenant
  • monitor DB table growth from superadmin storage views
  • keep log retention short unless compliance requires longer storage
  • use PostgreSQL and external storage for medium to large environments