Storage and Logs

IntegraMon stores data in several places at once. Some of these locations are persistent by design; others are explicitly temporary.

Main storage roots

Global runtime root

  • DATA_DIR
  • default container path: /app/data

This is the main base directory for:

  • SQLite database files
  • tenant archive folders
  • tenant job logs
  • storage-related exports and per-tenant runtime files

Global log root

  • LOG_DIR
  • default: ${DATA_DIR}/logs

Supervisor writes service logs here.
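
In code, the two roots resolve roughly like this (a minimal Python sketch of the documented defaults; the helper name resolve_roots is illustrative, not IntegraMon's actual API):

```python
def resolve_roots(env):
    """Resolve the storage roots from an environment mapping.

    Mirrors the documented defaults (DATA_DIR=/app/data,
    LOG_DIR=${DATA_DIR}/logs); the real resolution code may differ.
    """
    data_dir = env.get("DATA_DIR", "/app/data")
    # LOG_DIR defaults relative to DATA_DIR, so overriding DATA_DIR
    # moves the log root with it unless LOG_DIR is set explicitly.
    log_dir = env.get("LOG_DIR", f"{data_dir}/logs")
    return data_dir, log_dir

print(resolve_roots({}))                         # ('/app/data', '/app/data/logs')
print(resolve_roots({"DATA_DIR": "/mnt/data"}))  # ('/mnt/data', '/mnt/data/logs')
```

Passing the mapping in (rather than reading os.environ directly) keeps the resolution testable.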

Service log locations

Supervisor-managed log files typically include:

  • ${LOG_DIR}/postgres.log
  • ${LOG_DIR}/pg-init.log
  • ${LOG_DIR}/redis.log
  • ${LOG_DIR}/migrate.log
  • ${LOG_DIR}/gunicorn.log
  • ${LOG_DIR}/celery-light.log
  • ${LOG_DIR}/celery-light-cold.log
  • ${LOG_DIR}/celery-api-details.log
  • ${LOG_DIR}/celery-api-details-cold.log
  • ${LOG_DIR}/celery-alert-details.log
  • ${LOG_DIR}/celery-alerts.log
  • ${LOG_DIR}/celery-periodic.log
  • ${LOG_DIR}/celery-ai.log
  • ${LOG_DIR}/celery-messagelog-process-batch.log
  • ${LOG_DIR}/celery-messagelog-process-batch-cold.log
  • ${LOG_DIR}/celery-medium.log
  • ${LOG_DIR}/worker.log
  • ${LOG_DIR}/nginx.log
  • ${LOG_DIR}/login_activity.log

These are global platform logs.

Tenant job log locations

Per-tenant job logs are resolved by core/worker_logging.py.

Default pattern:

  • <DATA_DIR>/<config.name>/logs/jobs/<jobname>/<YYYY-MM-DD>.log
  • <DATA_DIR>/<config.name>/logs/runs/<jobname>/<run_id>.log

This can be overridden by tenant configuration through:

  • cConfigExt(name="global").value["data_dir"]
  • cConfigExt(name="jobs.logs").value["data_dir"]

Supported tokens:

  • {data_dir}
  • {config}
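
The default patterns and the two tokens can be sketched as follows (hypothetical helpers; the real resolution lives in core/worker_logging.py and may differ in detail):

```python
from datetime import date

def job_log_path(data_dir, config_name, job_name, day=None, pattern=None):
    """Resolve an aggregated per-tenant job log path.

    Default mirrors the documented layout
    <DATA_DIR>/<config.name>/logs/jobs/<jobname>/<YYYY-MM-DD>.log.
    A tenant override pattern may use the {data_dir} and {config} tokens.
    """
    day = day or date.today()
    base = (pattern or "{data_dir}/{config}/logs").format(
        data_dir=data_dir, config=config_name)
    return f"{base}/jobs/{job_name}/{day:%Y-%m-%d}.log"

def run_log_path(data_dir, config_name, job_name, run_id, pattern=None):
    """Resolve a per-run log path: .../logs/runs/<jobname>/<run_id>.log."""
    base = (pattern or "{data_dir}/{config}/logs").format(
        data_dir=data_dir, config=config_name)
    return f"{base}/runs/{job_name}/{run_id}.log"

# Tenant name, job name, and run id below are made-up example values.
print(job_log_path("/app/data", "acme", "poll_messages", date(2024, 5, 1)))
# /app/data/acme/logs/jobs/poll_messages/2024-05-01.log
print(run_log_path("/app/data", "acme", "poll_messages", "run-42"))
# /app/data/acme/logs/runs/poll_messages/run-42.log
```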

Archive storage

Archive jobs create tenant-specific archive directories under:

  • <tenant data dir>/<config_name>/archive

Current default archive naming patterns:

  • PostgreSQL export: <config>_YYYY_MM_DD.archive
  • SQLite export: <config>_YYYY_MM_DD.archive.sqlite
  • optionally compressed SQLite export: the same name with a .gz suffix
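
A sketch of the naming rules above (archive_filename is a hypothetical helper; the real archive job may build names differently):

```python
from datetime import date

def archive_filename(config_name, day, kind="postgres", compressed=False):
    """Build an archive file name following the documented patterns:

    <config>_YYYY_MM_DD.archive          for PostgreSQL exports
    <config>_YYYY_MM_DD.archive.sqlite   for SQLite exports
    ...plus a .gz suffix for compressed SQLite exports.
    """
    name = f"{config_name}_{day:%Y_%m_%d}.archive"
    if kind == "sqlite":
        name += ".sqlite"
        if compressed:
            name += ".gz"
    return name

print(archive_filename("acme", date(2024, 5, 1)))                  # acme_2024_05_01.archive
print(archive_filename("acme", date(2024, 5, 1), "sqlite", True))  # acme_2024_05_01.archive.sqlite.gz
```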

The archive database model is cpiArchive.

Data that grows quickly

The main growth drivers are:

  • cpiMessageLog
  • cpiMessageLogRuns
  • cpiPayload
  • cpiMessageAttachment
  • cpiCustomHeaderProperties
  • tenant archive files
  • job and service logs
  • metric snapshot tables over long periods
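
Per-table row counts are a cheap way to watch these growth drivers on the SQLite backend (illustrative snippet, not part of IntegraMon; on PostgreSQL, pg_total_relation_size() reports real on-disk sizes instead):

```python
import sqlite3

def table_row_counts(conn):
    """Return {table_name: row_count} for a SQLite connection.

    Row counts are only a growth proxy; they ignore payload blob sizes.
    """
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    return {t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
            for t in tables}

# Demo against an in-memory database standing in for the real DB file:
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE cpiMessageLog (id INTEGER PRIMARY KEY)")
demo.executemany("INSERT INTO cpiMessageLog (id) VALUES (?)", [(1,), (2,), (3,)])
counts = table_row_counts(demo)
print(counts)  # {'cpiMessageLog': 3}
```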

Temporary versus persistent data

Persistent by design

  • SQLite DB file under DATA_DIR
  • tenant archives
  • database rows in PostgreSQL or SQLite
  • report artifacts stored in DB by default
  • metric snapshot tables

Temporary or rebuildable

  • Redis cache content
  • Redis broker queue state when internal Redis is used
  • documentation index cache
  • generated runtime frontend config /var/www/html/app-config.js

Not persisted automatically unless mounted separately

  • /var/log/nginx/access.log
  • /var/log/nginx/error.log
  • in-container PostgreSQL cluster storage when using local DB without a mounted volume

Cleanup behavior

Log cleanup

Archive runs also trigger log cleanup:

  • aggregated job logs are deleted by filename date
  • run logs are deleted by file modification time
  • the environment-variable fallback for retention is LOG_RETENTION_DAYS=3

Tenant-specific jobs.logs config can override that retention at runtime.
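
The filename-date rule for aggregated job logs can be sketched like this (expired_job_logs is a hypothetical helper using the LOG_RETENTION_DAYS=3 fallback; run logs would be filtered by file mtime instead):

```python
import re
from datetime import date, timedelta

def expired_job_logs(filenames, retention_days=3, today=None):
    """Pick aggregated job logs to delete by the date in the filename.

    Filenames follow the <YYYY-MM-DD>.log pattern; anything older than
    the retention window is selected, anything else is left alone.
    """
    today = today or date.today()
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in filenames:
        m = re.fullmatch(r"(\d{4}-\d{2}-\d{2})\.log", name)
        if m and date.fromisoformat(m.group(1)) < cutoff:
            expired.append(name)
    return expired

print(expired_job_logs(
    ["2024-05-01.log", "2024-05-09.log", "notes.txt"],
    retention_days=3, today=date(2024, 5, 10)))
# ['2024-05-01.log']
```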

Archive cleanup

archive_cpi_day supports:

  • days_ago default 10
  • delete_older_than default 30

That means archive processing and archive deletion are built into the same task family.
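
Assuming the two parameters mean "which day to archive" and "how old an archive must be before deletion", the resulting dates look like this (illustrative sketch; the exact semantics inside archive_cpi_day may differ):

```python
from datetime import date, timedelta

def archive_window(today, days_ago=10, delete_older_than=30):
    """Compute the two dates an archive run cares about, using the
    documented archive_cpi_day defaults."""
    archive_day = today - timedelta(days=days_ago)        # day to archive
    delete_cutoff = today - timedelta(days=delete_older_than)  # delete below this
    return archive_day, delete_cutoff

print(archive_window(date(2024, 5, 10)))
# (datetime.date(2024, 4, 30), datetime.date(2024, 4, 10))
```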

Config cleanup

The platform cleanup tasks delete tenant-related records in batches and also purge cache keys and archive references. Relevant models include:

  • cConfig
  • cConfigCleanupHistory
  • cWorkerJobRun
  • archive and CPI data models connected to the tenant

Rotation and hard limits

What exists today:

  • Supervisor logfile max size: 10MB
  • Supervisor logfile backups: 3
  • log cleanup through archive tasks
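
The size and backup limits map to per-program Supervisor settings along these lines (illustrative config fragment; the shipped supervisord.conf may differ in program names and options):

```ini
[program:gunicorn]
command=gunicorn ...
stdout_logfile=%(ENV_LOG_DIR)s/gunicorn.log
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=3
redirect_stderr=true
```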

What does not exist globally:

  • one central storage quota system
  • hard archive size caps
  • automatic DB retention pruning for all large tables

Operators should therefore treat storage review as an active duty, not a fully automated one.

Typical growth patterns

Expected storage growth is mostly driven by:

  • polling frequency
  • count of monitored tenants
  • payload download depth
  • archive retention window
  • log verbosity and worker count

Practical pattern:

  • message tables grow first
  • archives grow next
  • logs grow steadily with worker count and troubleshooting activity

Recommendations:

  • mount DATA_DIR to persistent storage
  • monitor archive folder growth per tenant
  • monitor DB table growth from superadmin storage views
  • keep log retention short unless compliance requires longer storage
  • use PostgreSQL and external storage for medium to large environments