
Resourcing Overview

Onyx Lite

| Resource | Minimum | Preferred |
| --- | --- | --- |
| CPU | 2 vCPU | 4 vCPU |
| RAM | 2 GB | 4 GB |
| Disk | 10 GB | 50 GB |
Onyx Lite uses under 1 GB of memory at baseline. Disk and memory usage scale with the number of files users upload, since PostgreSQL handles file storage in Lite mode.

Onyx Standard

| Resource | Minimum | Preferred |
| --- | --- | --- |
| CPU | 4 vCPU | 8+ vCPU |
| RAM | 10 GB | 16+ GB |
| Disk | 32 GB + ~2.5x indexed data | 500 GB for organizations with <5000 users |
OpenSearch enforces a read-only block on indices when disk usage hits the flood stage watermark (default 95%), which effectively blocks all writes. Monitor disk usage and plan capacity accordingly.
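As a rough capacity-planning aid, you can estimate how much more data a volume can absorb before hitting the flood-stage watermark. This is a sketch; the 95% default below is OpenSearch's documented default, so adjust it to your cluster's configured value:

```python
def flood_stage_headroom_gb(disk_gb: float, used_gb: float,
                            flood_stage: float = 0.95) -> float:
    """Return how many GB can still be written before OpenSearch
    applies its read-only index block (default watermark: 95%)."""
    return max(0.0, disk_gb * flood_stage - used_gb)

# A 500 GB volume with 400 GB already used has 75 GB of headroom left.
print(flood_stage_headroom_gb(500, 400))  # → 75.0
```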

Local Deployment (Docker)

You can control the resources available to Docker in the Resources section of the Docker Desktop settings menu.
Old, unused Docker images often take up sizable disk space. To clean up unused images, run `docker system prune --all`.

Cloud Providers (AWS, GCP, etc.)

For small to mid scale deployments, we recommend deploying Onyx to a single instance in your cloud provider of choice. When evaluating your instance, follow the Preferred resources in the table above.

Onyx Lite

| Provider | Recommended Instance Type |
| --- | --- |
| AWS | t3.medium |
| GCP | e2-medium |
| Azure | B2s |

Onyx Standard

| Provider | Recommended Instance Type |
| --- | --- |
| AWS | m7g.xlarge |
| GCP | e2-standard-4 or e2-standard-8 |
| Azure | D4s_v3 |
| DigitalOcean | Meet the preferred resources in the table above |

Container-Specific Resourcing (Standard)

For more efficient scaling, you can dedicate resources to each Onyx container using Kubernetes or AWS EKS. See the Onyx Helm chart values.yaml for our default requests and limits.
| Component | CPU | Memory |
| --- | --- | --- |
| api_server | 1 | 2 Gi |
| background | 2 | 8 Gi |
| indexing_model_server | 2 | 4 Gi |
| inference_model_server | 2 | 4 Gi |
| postgres | 2 | 2 Gi |
| opensearch | 2 | 4 Gi |
| nginx | 250m (1/4) | 128 Mi |
If you are using cloud-based embedding models (e.g. OpenAI, Cohere, etc.) instead of locally hosted ones, the indexing_model_server and inference_model_server will use significantly less memory.
Altogether, this comes out to a total node size of at least ~12 CPU and ~24 GB of memory.
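The ~12 CPU / ~24 GB figure is just the sum of the default requests above. A quick sketch of that arithmetic, with nginx's 250m and 128 Mi converted to fractional units:

```python
# Default per-container requests from the table above: (CPU cores, memory GiB).
requests = {
    "api_server":             (1.0,  2.0),
    "background":             (2.0,  8.0),
    "indexing_model_server":  (2.0,  4.0),
    "inference_model_server": (2.0,  4.0),
    "postgres":               (2.0,  2.0),
    "opensearch":             (2.0,  4.0),
    "nginx":                  (0.25, 0.125),  # 250m CPU, 128 Mi memory
}

total_cpu = sum(cpu for cpu, _ in requests.values())
total_mem = sum(mem for _, mem in requests.values())
print(f"{total_cpu} CPU, {total_mem} GiB")  # ~12 CPU, ~24 GiB
```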

Container-Specific Resourcing (Lite)

Onyx Lite runs only four services. All storage is consolidated onto PostgreSQL.
| Component | CPU | Memory |
| --- | --- | --- |
| api_server | 1 | 1 Gi |
| web_server | 250m (1/4) | 512 Mi |
| postgres | 1 | 1 Gi |
| nginx | 250m (1/4) | 128 Mi |
Memory usage in Lite mode scales with the number of user-uploaded files, since PostgreSQL handles file storage, caching, and session management.

How Resource Requirements Scale

The main driver of resource requirements for Standard mode is the number of indexed documents. This primarily affects the search index (OpenSearch), which is responsible for storing documents and handling search requests.

OpenSearch Memory

OpenSearch memory is split roughly 50/50 between the JVM heap and the OS file system cache. Both halves are critical — the heap handles indexing and search operations while the file system cache keeps frequently accessed index segments in memory for fast reads. Key rules for JVM heap sizing:
  • Set Xms and Xmx to 50% of available RAM (the other 50% goes to OS/file cache)
  • Never exceed 32 GB heap — beyond this, Java disables compressed ordinary object pointers, causing significant performance degradation
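The two rules above can be expressed as a small helper. This is a sketch; the 31 GB cap is the commonly used safe ceiling for staying under the 32 GB compressed-oops threshold:

```python
def jvm_heap_gb(node_ram_gb: float, max_heap_gb: float = 31.0) -> float:
    """Heap size for Xms/Xmx: half the node's RAM, capped below 32 GB
    so the JVM keeps compressed ordinary object pointers enabled."""
    return min(node_ram_gb / 2, max_heap_gb)

print(jvm_heap_gb(16))   # → 8.0  (half of available RAM)
print(jvm_heap_gb(128))  # → 31.0 (capped below 32 GB)
```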

Storage Overhead

OpenSearch adds overhead on top of the raw source data. The formula for on-disk storage is:
Storage = Source data × (1 + replicas) × 1.45
The 1.45 multiplier accounts for indexing overhead (~10%), Linux reserved space (~5%), and OpenSearch internal overhead (~20%), plus a safety margin.
Onyx defaults to 0 replicas for single-node deployments, so storage is approximately 1.45× the source data size.
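The storage formula above as a one-liner, for capacity planning:

```python
def opensearch_storage_gb(source_gb: float, replicas: int = 0,
                          overhead: float = 1.45) -> float:
    """On-disk storage estimate: source data × (1 + replicas) × 1.45."""
    return source_gb * (1 + replicas) * overhead

print(opensearch_storage_gb(100))              # → 145.0 (single node, 0 replicas)
print(opensearch_storage_gb(100, replicas=1))  # → 290.0
```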

Scaling Guidelines

OpenSearch resource requirements scale linearly with the volume of indexed data. The exact ratio depends on deployment size — large distributed clusters are more efficient per GB than single-node deployments due to fixed per-node overhead (cluster management, garbage collection, segment merging). Industry guidelines for large clusters suggest a memory-to-data ratio of around 1:16 for search-heavy workloads. However, for the single-node deployments typical of self-hosted Onyx, the fixed overhead per node is a much larger fraction of total resources. Based on our experience, we recommend the following for single-node or small-cluster deployments:
| Scale | Memory per 1 GB of source docs | CPU per 1 GB of source docs |
| --- | --- | --- |
| Small (< 5 GB) | ~2 GB | ~0.25 CPU |
| Medium (5–50 GB) | ~1.5 GB | ~0.25 CPU |
| Large (50+ GB) | ~1 GB | ~0.2 CPU |
The per-GB cost decreases at larger scale because the fixed baseline overhead is amortized. At very large scale, consider a dedicated OpenSearch cluster or a managed service. Other factors that may affect resource requirements include:
  • The embedding model and vector dimensions
  • Whether quantization and dimensionality reduction are enabled
  • Query throughput and concurrency
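The tiered ratios above can be sketched as a helper. The tier boundaries come from the table; the fixed baseline of 4 GB memory and 2 CPU is the one used in the resourcing example that follows:

```python
def opensearch_resources(source_gb: float) -> tuple[float, float]:
    """Estimate (memory GB, CPU cores) for the opensearch component:
    a fixed baseline of 4 GB / 2 CPU plus the tiered per-GB ratios."""
    if source_gb < 5:
        mem_per_gb, cpu_per_gb = 2.0, 0.25   # small
    elif source_gb <= 50:
        mem_per_gb, cpu_per_gb = 1.5, 0.25   # medium
    else:
        mem_per_gb, cpu_per_gb = 1.0, 0.2    # large
    return 4 + source_gb * mem_per_gb, 2 + source_gb * cpu_per_gb

print(opensearch_resources(10))  # → (19.0, 4.5)
```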

Resourcing Example

For a deployment with 10 GB of text content, your opensearch component will need:
  • Memory: 4 (base) + 10 × 1.5 = 19 GB
  • CPU: 2 (base) + 10 × 0.25 = 4.5 cores
If deploying on a single instance, this would be in addition to the base requirements. Overall, that would take us to >= 9 CPU and >= 35 GB of memory.
Given these requirements, an m7g.2xlarge or c5.4xlarge EC2 instance would be appropriate. If deploying with Kubernetes or AWS EKS, this would give a per-component resource allocation of:
| Component | CPU | Memory |
| --- | --- | --- |
| api_server | 1 | 2 Gi |
| background | 2 | 8 Gi |
| indexing_model_server | 2 | 4 Gi |
| inference_model_server | 2 | 4 Gi |
| postgres | 2 | 4 Gi |
| opensearch | 5 | 19 Gi |
Total node size: ~14 CPU and ~41 GB of memory.

Next Steps

Guide: Deploy Onyx Locally

Deploy Onyx locally with Docker.

Guide: Deploy on AWS

Deploy Onyx on an EC2 instance.