Resource requirements for R-based AI analytics
This page helps you estimate how much RAM must be available to the R runtime for AI solutions provided by dab.
In this page, dataset means the data that dab Nexus produces for, and passes to, one analytic run. That full dataset must fit into the memory available to the R runtime.
Quick guidance
- Use the matrices below to estimate the RAM required by the R runtime for one analytic run.
- The matrix values are not total SQL Server or Azure SQL Managed Instance RAM.
- If multiple analytics run at the same time, add their R memory requirements together.
- On Azure SQL Managed Instance, R can use only about 20% of instance resources by default.
RAM needed by the R runtime
The tables below show RAM required by the R runtime for one analytic run. They do not show total server or instance RAM.
If a matrix cell says 16 GB, R needs about 16 GB for that run.
It does not mean that a 16 GB SQL Server or Managed Instance is enough.
Pick the row range and column range for the dataset passed from dab Nexus to the analytic. Then use the matching workload matrix.
How to use the tables:
- In dab Nexus, open the analytic run view and check how many rows the analytic's dataset has.
- Check the column count in Content Studio. If you do not know the exact value, use the typical column count from the workload table below.
- In the workload table below, find the analytic you want to run.
- Open the matching matrix and choose the cell where the row range and column range meet.
- Read that value as the RAM that must be available to R for that run.
These values are conservative estimates based on the typical behavior of AI solutions provided by dab in R. Actual peak usage also depends on data types, column cardinality, temporary objects created by the analytic, and concurrent executions.
Workload types and typical column counts
| AI analytic category | Analytics | Typical column count | Use this matrix |
|---|---|---|---|
| DEAN / Outlier Detection | AI_Outliers, *_Outliers | Usually 15-20, but configurable | Standard AI workloads |
| Market Basket Analysis | AI_MarketBasketAnalysis, GL_MarketBasket | Usually 6-15, but configurable | Standard AI workloads |
| Root Cause | *_RootCause | Usually 10-20 | Heavy AI workloads |
| Duplicate Payments / Credit Note AI | AP_DuplicatePaymentsEnhanced | Usually around 40 | Heavy AI workloads |
| Master Data AI | CU_DuplicatesEnhanced, CU_Outliers, VE_DuplicatesEnhanced, VE_Outliers | Usually around 15 | Heavy AI workloads |
Standard AI workloads: RAM needed by the R runtime
Examples: AI_Outliers, *_Outliers, AI_MarketBasketAnalysis, GL_MarketBasket
| Input rows | Up to 10 cols | 11-20 cols | 21-40 cols | 41-60 cols |
|---|---|---|---|---|
| Up to 100k | 4 GB | 4 GB | 8 GB | 12 GB |
| 100k-500k | 4 GB | 8 GB | 12 GB | 16 GB |
| 500k-1M | 8 GB | 12 GB | 24 GB | 32 GB |
| 1M-2M | 12 GB | 24 GB | 48 GB | 64 GB |
| 2M-3M | 16 GB | 32 GB | 64 GB | 96 GB |
Heavy AI workloads: RAM needed by the R runtime
Examples: *_RootCause, AP_DuplicatePaymentsEnhanced, CU_DuplicatesEnhanced, CU_Outliers, VE_DuplicatesEnhanced, VE_Outliers
| Input rows | Up to 10 cols | 11-20 cols | 21-40 cols | 41-60 cols |
|---|---|---|---|---|
| Up to 100k | 4 GB | 8 GB | 8 GB | 12 GB |
| 100k-500k | 8 GB | 12 GB | 16 GB | 24 GB |
| 500k-1M | 12 GB | 16 GB | 32 GB | 48 GB |
| 1M-2M | 16 GB | 32 GB | 64 GB | 96 GB |
| 2M-3M | 24 GB | 48 GB | 96 GB | 128 GB+ |
If your workload is larger than the ranges shown above, treat the matrix values as a lower bound and size the environment with additional headroom.
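For scripted capacity planning, the two matrices above can be encoded as a small lookup helper. This is a sketch based only on the table values; the function and constant names are illustrative and not part of dab Nexus.

```python
# RAM (GB) needed by the R runtime per run, copied from the matrices above.
# Row tuples: (max rows in range, [values for up-to-10/20/40/60 columns]).
STANDARD = [
    (100_000,   [4, 4, 8, 12]),
    (500_000,   [4, 8, 12, 16]),
    (1_000_000, [8, 12, 24, 32]),
    (2_000_000, [12, 24, 48, 64]),
    (3_000_000, [16, 32, 64, 96]),
]
HEAVY = [
    (100_000,   [4, 8, 8, 12]),
    (500_000,   [8, 12, 16, 24]),
    (1_000_000, [12, 16, 32, 48]),
    (2_000_000, [16, 32, 64, 96]),
    (3_000_000, [24, 48, 96, 128]),  # last cell is published as "128 GB+"
]
COL_BOUNDS = [10, 20, 40, 60]

def r_ram_gb(rows, cols, heavy=False):
    """Estimated RAM (GB) the R runtime needs for one analytic run."""
    matrix = HEAVY if heavy else STANDARD
    col_idx = next(i for i, bound in enumerate(COL_BOUNDS) if cols <= bound)
    for max_rows, values in matrix:
        if rows <= max_rows:
            return values[col_idx]
    raise ValueError("workload exceeds the published ranges; add headroom")

# Example: 800k rows, 18 columns, heavy workload (e.g. *_RootCause)
print(r_ram_gb(800_000, 18, heavy=True))  # 16
```

Anything past 3M rows or 60 columns raises, mirroring the guidance to treat the matrices as a lower bound beyond the published ranges.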
SQL Server on-premises or Azure Virtual Machine
Choose this platform if you want the most control over memory allocation.
- The matrices above show how much RAM must be left available to the R runtime.
- The total machine memory must therefore be higher than the matrix value, because Windows and SQL Server also need memory.
- Minimum practical host size: 16 GB total RAM
- Typical starting point: 32 GB total RAM
- Large, wide, or concurrent workloads: 64 GB total RAM or more
- Minimum CPU: 2 cores
- Recommended CPU: modern x64 CPU with strong single-core performance
Important: SQL Server can consume most of the machine's memory unless you cap it correctly. Leave enough memory for the operating system and R, and use Resource Governor if you need to control the memory available to external scripts. Restart SQL Server Launchpad after changing these settings.
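As a back-of-envelope check, total host RAM is roughly the matrix value plus SQL Server's capped memory plus an operating-system allowance. The 4 GB OS figure below is an assumption for illustration, not a dab recommendation:

```python
def total_host_ram_gb(r_ram_gb, sql_max_server_memory_gb, os_overhead_gb=4):
    """Rough total machine RAM: the matrix value for R, plus SQL Server's
    max server memory cap, plus an assumed OS allowance (4 GB here)."""
    return r_ram_gb + sql_max_server_memory_gb + os_overhead_gb

# 16 GB for R plus a 12 GB SQL Server cap lands on the typical
# 32 GB starting point mentioned above.
print(total_host_ram_gb(16, 12))  # 32
```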
Check and configure Resource Governor
Use these commands on self-managed SQL Server if you need to see or change how much memory is available to R.
Use the following queries to check the current configuration:
```sql
SELECT is_enabled FROM sys.resource_governor_configuration;

SELECT name, max_memory_percent, max_cpu_percent
FROM sys.resource_governor_external_resource_pools;
```
- The first query shows whether Resource Governor is enabled.
- The second query shows the current memory and CPU limits for external runtimes such as R.
Increase the default external pool if R needs more memory:
```sql
ALTER EXTERNAL RESOURCE POOL "default"
WITH (
    MAX_CPU_PERCENT = 100,
    MAX_MEMORY_PERCENT = 40
);
ALTER RESOURCE GOVERNOR RECONFIGURE;
```
This example allows external scripts to use up to 40% of the memory available to SQL Server instead of the default 20%.
Use this setting together with max server memory so that the R runtime can actually access the amount of RAM shown in the matrices above.
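To sanity-check a planned setting, you can compute the memory the external pool exposes. This sketch follows the description above, where MAX_MEMORY_PERCENT applies to the memory available to SQL Server; the function is illustrative:

```python
def external_pool_memory_gb(sql_server_memory_gb, max_memory_percent):
    """Memory available to external scripts such as R under Resource
    Governor, per the percentage semantics described above."""
    return sql_server_memory_gb * max_memory_percent / 100

# Default 20% vs. the 40% set in the ALTER statement, with 64 GB
# available to SQL Server:
print(external_pool_memory_gb(64, 20))  # 12.8
print(external_pool_memory_gb(64, 40))  # 25.6
```

Compare the result against the matrix value for your workload; if it falls short, raise MAX_MEMORY_PERCENT or add RAM to the host.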
Azure SQL Managed Instance
Choose this platform if you want a managed Azure SQL service and can work within the platform limits of Machine Learning Services.
- Recommended: Memory optimized premium-series
- Acceptable for smaller datasets: Premium-series
- Use only for small, cost-driven workloads: Standard-series (Gen5)
For these workloads, memory per vCore matters more than raw vCore count.
| Hardware | Memory per vCore | Recommendation |
|---|---|---|
| Standard-series (Gen5) | 5.1 GB | Avoid for serious AI workloads |
| Premium-series | 7 GB | Good for smaller datasets |
| Memory optimized premium-series | 13.6 GB | Best choice for RAM-heavy analytics |
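Because the tiers differ in memory per vCore, the same total-memory target maps to very different vCore counts. A sketch using the figures from the table above (it ignores the discrete vCore sizes Azure actually offers, so treat the result as a lower bound):

```python
import math

# Memory per vCore (GB), from the hardware table above.
MEMORY_PER_VCORE_GB = {
    "Standard-series (Gen5)": 5.1,
    "Premium-series": 7.0,
    "Memory optimized premium-series": 13.6,
}

def vcores_needed(total_memory_gb, hardware):
    """Minimum vCores to reach a total instance memory target."""
    return math.ceil(total_memory_gb / MEMORY_PER_VCORE_GB[hardware])

# vCores needed to reach 80 GB of total instance memory per tier:
for hw in MEMORY_PER_VCORE_GB:
    print(hw, vcores_needed(80, hw))
```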
Important: Azure SQL Managed Instance does not support Resource Governor external resource pools for R. By default, R can use only up to 20% of Managed Instance resources.
The matrix values still refer to R memory only. Total instance memory must be much higher.
Approximate total instance memory needed = required R RAM / 0.20
Example:
- If the matrix says 16 GB, R needs about 16 GB for that run.
- That does not mean a 16 GB Managed Instance is enough.
- Because R can use only about 20% of instance resources, 16 GB for R means about 80 GB of total instance memory.
| RAM needed by R | Approximate total instance memory |
|---|---|
| 4 GB | 20 GB |
| 8 GB | 40 GB |
| 12 GB | 60 GB |
| 16 GB | 80 GB |
| 24 GB | 120 GB |
| 32 GB | 160 GB |
| 48 GB | 240 GB |
| 64 GB | 320 GB |
| 96 GB | 480 GB |
| 128 GB | 640 GB |
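The table above is simply the formula applied to each matrix value. A minimal sketch:

```python
def total_instance_memory_gb(r_ram_gb, extensibility_fraction=0.20):
    """Approximate total Managed Instance memory so that R's ~20% share
    covers the matrix value for the workload."""
    return r_ram_gb / extensibility_fraction

# Reproduce a few rows of the table above:
for r_ram in (4, 16, 96):
    print(r_ram, "->", total_instance_memory_gb(r_ram))  # 20, 80, 480
```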
If you get out-of-memory errors, reduce the dataset size, scale up the instance, or open an Azure support ticket about the extensibility resource limit.
Reduce memory usage
If memory is limited, reduce the amount of data that dab Nexus passes to the analytic:
- Reduce the number of columns in Content Studio. Fewer columns reduce the width of the dataset and therefore the required RAM.
- Use smaller company-code selections and shorter time scopes during task creation in dab Nexus. This reduces the number of rows passed to the analytic.
- Avoid too many concurrently running tasks in Nexus with these workloads. Concurrent tasks increase the total RAM demand on the SQL environment.
These measures are most effective for large datasets, wide tables, and environments with multiple AI tasks running at the same time.