Direkt zum Hauptinhalt

Resource requirements for R-based AI analytics

This page helps you estimate how much RAM must be available to the R runtime for AI solutions provided by dab.

Important

The dataset in this page means the output of one analytic executed by dab Nexus. That full dataset must fit into the memory available to the R runtime.

Quick guidance

  • Use the matrices below to estimate RAM required by the R runtime for one analytic run.
  • The matrix values are not total SQL Server or Azure SQL Managed Instance RAM.
  • If multiple analytics run at the same time, add their R memory requirements together.
  • On Azure SQL Managed Instance, R can use only about 20% of instance resources by default.

RAM needed by the R runtime

Important

The tables below show RAM required by the R runtime for one analytic run. They do not show total server or instance RAM.

If a matrix cell says 16 GB, R needs about 16 GB for that run.

It does not mean that a 16 GB SQL Server or Managed Instance is enough.

Pick the row range and column range for the dataset passed from dab Nexus to the analytic. Then use the matching workload matrix.

How to use the tables:

  1. In dab Nexus, open the analytic run view and check how many rows the analytic has.
  2. Check the column count in Content Studio. If you do not know the exact value, use the typical column count from the workload table below.
  3. In the workload table below, find the analytic you want to run.
  4. Open the matching matrix and choose the cell where the row range and column range meet.
  5. Read that value as the RAM that must be available to R for that run.

These values are conservative estimates based on the typical behavior of AI solutions provided by dab in R. Actual peak usage also depends on data types, column cardinality, temporary objects created by the analytic, and concurrent executions.

Workload types and typical column counts

AI analytic categoryAnalyticsTypical column countUse this matrix
DEAN / Outlier DetectionAI_Outliers, *_OutliersUsually 15-20, but configurableStandard AI workloads
Market Basket AnalysisAI_MarketBasketAnalysis, GL_MarketBasketUsually 6-15, but configurableStandard AI workloads
Root Cause*_RootCauseUsually 10-20Heavy AI workloads
Duplicate Payments / Credit Note AIAP_DuplicatePaymentsEnhancedUsually around 40Heavy AI workloads
Master Data AICU_DuplicatesEnhanced, CU_Outliers, VE_DuplicatesEnhanced, VE_OutliersUsually around 15Heavy AI workloads

Standard AI workloads: RAM needed by the R runtime

Examples: AI_Outliers, *_Outliers, AI_MarketBasketAnalysis, GL_MarketBasket

Input rowsUp to 10 cols11-20 cols21-40 cols41-60 cols
Up to 100k4 GB4 GB8 GB12 GB
100k-500k4 GB8 GB12 GB16 GB
500k-1M8 GB12 GB24 GB32 GB
1M-2M12 GB24 GB48 GB64 GB
2M-3M16 GB32 GB64 GB96 GB

Heavy AI workloads: RAM needed by the R runtime

Examples: *_RootCause, AP_DuplicatePaymentsEnhanced, CU_DuplicatesEnhanced, CU_Outliers, VE_DuplicatesEnhanced, VE_Outliers

Input rowsUp to 10 cols11-20 cols21-40 cols41-60 cols
Up to 100k4 GB8 GB8 GB12 GB
100k-500k8 GB12 GB16 GB24 GB
500k-1M12 GB16 GB32 GB48 GB
1M-2M16 GB32 GB64 GB96 GB
2M-3M24 GB48 GB96 GB128 GB+

If your workload is larger than the ranges shown above, treat the matrix values as a lower bound and size the environment with additional headroom.

SQL Server on-premises or Azure Virtual Machine

Choose this platform if you want the most control over memory allocation.

  • The matrices above show how much RAM must be left available to the R runtime.
  • The total machine memory must therefore be higher than the matrix value because Windows and SQL Server also need memory.
  • Minimum practical host size: 16 GB total RAM
  • Typical starting point: 32 GB total RAM
  • Large, wide, or concurrent workloads: 64 GB total RAM or more
  • Minimum CPU: 2 cores
  • Recommended CPU: modern x64 CPU with strong single-core performance

Important: SQL Server can consume most of the machine's memory unless you cap it correctly. Leave enough memory for the operating system and R, and use Resource Governor if you need to control the memory available to external scripts. Restart SQL Server Launchpad after changing these settings.

Check and configure Resource Governor

Use these commands on self-managed SQL Server if you need to see or change how much memory is available to R.

Use the following queries to check the current configuration:

SELECT is_enabled FROM sys.resource_governor_configuration;

SELECT name, max_memory_percent, max_cpu_percent
FROM sys.resource_governor_external_resource_pools;
  • The first query shows whether Resource Governor is enabled.
  • The second query shows the current memory and CPU limits for external runtimes such as R.

Increase the default external pool if R needs more memory:

ALTER EXTERNAL RESOURCE POOL "default"
WITH (
MAX_CPU_PERCENT = 100,
MAX_MEMORY_PERCENT = 40
);

ALTER RESOURCE GOVERNOR RECONFIGURE;

This example allows external scripts to use up to 40% of the memory available to SQL Server instead of the default 20%.

Use this setting together with max server memory so that the R runtime can actually access the amount of RAM shown in the matrices above.

Azure SQL Managed Instance

Choose this platform if you want a managed Azure SQL service and can work within the platform limits of Machine Learning Services.

  • Recommended: Memory optimized premium-series
  • Acceptable for smaller datasets: Premium-series
  • Use only for small, cost-driven workloads: Standard-series (Gen5)

For these workloads, memory per vCore matters more than raw vCore count.

HardwareMemory per vCoreRecommendation
Standard-series (Gen5)5.1 GBAvoid for serious AI workloads
Premium-series7 GBGood for smaller datasets
Memory optimized premium-series13.6 GBBest choice for RAM-heavy analytics

Important: Azure SQL Managed Instance does not support Resource Governor external resource pools for R. By default, R can use only up to 20% of Managed Instance resources.

The matrix values still refer to R memory only. Total instance memory must be much higher.

Approximate total instance memory needed = required R RAM / 0.20

Example:

  • If the matrix says 16 GB, R needs about 16 GB.
  • It does not mean that a 16 GB Managed Instance is enough.
  • Because R can use only about 20%, 16 GB for R means about 80 GB total instance memory.
RAM needed by RApproximate total instance memory
4 GB20 GB
8 GB40 GB
12 GB60 GB
16 GB80 GB
24 GB120 GB
32 GB160 GB
48 GB240 GB
64 GB320 GB
96 GB480 GB
128 GB640 GB

If you get out-of-memory errors, reduce the dataset size, scale up the instance, or open an Azure support ticket about the extensibility resource limit.

Reduce memory usage

If memory is limited, reduce the amount of data that dab Nexus passes to the analytic:

  • Reduce the number of columns in Content Studio. Fewer columns reduce the width of the dataset and therefore the required RAM.
  • Use smaller company-code selections and shorter time scopes during task creation in dab Nexus. This reduces the number of rows passed to the analytic.
  • Avoid too many concurrently running tasks in Nexus with these workloads. Concurrent tasks increase the total RAM demand on the SQL environment.

These measures are most effective for large datasets, wide tables, and environments with multiple AI tasks running at the same time.