Analytic Step Level
DEAN
What is DEAN?
Our standardised, rule-based analysis steps identify line items in a particular base table that match a particular rule. For example, AP_ManPayments lists all invoices from an AP base table that have not yet been cleared by the automatic payment run. However, there may be line items of interest to our customers that are not identified by our analysis steps.
This is why we have implemented DEAN. DEAN uses a statistical (not rule-based) approach to find outliers in a given base table. Outliers are rows that stand out from the rest of the rows because they have unusual values. DEAN is derived from Detecting Anomalies. Anomalies and outliers are synonymous. DEAN can be used for the following base tables: SD, PU, MM, MD, GL, DL, BL, AR, AP.
Logic
For each base table, there is a separate analysis step that calls the same R script in which DEAN is implemented. Depending on the module, 50-100 predefined and fixed columns of a base table are transferred to R. After transferring a base table to R/DEAN, the following steps are performed:
- Pre-processing
- Columns with exactly one unique field value will be deleted
- If bijective column pairs (1:1 relationship) exist, one column is deleted from each pair
- If there are strongly correlated numerical column pairs, one column of the pair is deleted
- Search for outliers
- Splitting the given base table into chunks to avoid RAM problems when calculating outliers
- Search for outliers in the respective chunk
- 0.1% of the line items, up to a maximum of 100, of the given base table are identified as outliers
- Outliers explained
- For each outlier found, i.e. a conspicuous item, the system now checks which column value or values make the item an outlier
Important Columns
DEAN returns the found outliers. The following columns are created:
Technical Columnname | Descriptive Columname | Description |
Outlier_Score | Outlier_Score | - contains a value between 0 and 1 - he higher the value, the more conspicuous the outlier in question |
Detected_by_DEAN_because | Detected_by_DEAN_because | - contains short text with the column value or column values that make the respective position an outlier |
Market Basket Analysis (for G/L Accounts)
What is a Market Basket Analysis?
A Market Basket Analysis is performed on the documents and the corresponding G/L accounts of a specific GL base table.
Logic
The following columns of a GL base table are transferred to R:
- BKPF_BUKRS
- BKPF_GJAHR
- BKPF_BELNR
- BSEG_BUZEI
- c_GL02_HKONT_SKAT_TXT20
- BSEG_SHKZG
Market Baskets are then created. A Market Basket is an array or list containing all the G/L accounts used in a document. Each basket (and each document) is uniquely identified by the following columns: BKPF_BUKRS, BKPF_GJAHR and BKPF_BELNR.
Example: a Market Basked could contain the following G/L accounts:
- 0000156000 - A/R ICO_S
- 0000874010 - Offsetting of assets held for sale_H
- 0000261100 - Output VAT_H
As you can see in the example, the debit/credit indicator is added to each G/L account. If a G/L account appears more than once in a document, it will only appear once in the relevant shopping basket.
Frequently occurring rules are then searched for in the Market Baskets.
Example rule: If a document contains 0000193800 - ELKO clearing account_S and 000261100 - output VAT_S, it will also contain 0000372100 - discount 3_S 99.95% of the time.
For each rule found, the result will be exactly those documents that do not follow that rule. This means that documents containing the left part of the rule but not the right part will be returned.
Important Columns
The following columns are generated by the Market Basket Analysis:
Technical Columnname | Descriptive Column Name | Description |
Unique_ID | Unique_ID | - clearly identifies a document that does not comply with a particular rule |
Rule_Number | Rule_Number | - the number of the rule that was found |
Exception_Number | Exception_Number | - number of the document or voucher that does not comply with a specific rule |
Rule | Rule | - rule as text |
Number_of_Receipts_following_given_Rule | Number_of_Receipts_following_given_Rule | - number of documents following a particular rule |
If_a_Receipt_contains_A_Probability_that_it_contains_B | If_a_Receipt_contains_A_Probability_that_it_contains_B | - probability that the right part of the given rule is contained in a document if the left part is contained in it |
The following analyses require an R installation
- GL_AI_MBA_Accounts (Market Basket Analysis for G/L accounts)
- AI_MarketBasket (Market Basket Analysis for any table)
- AI_Outliers (DEAN for any table)
- **_AI_Outliers (DEAN for the Data Products/Basetables)
- AP_Dupl_Payments_AI (Duplicate Payment Analysis with AI)
- AR_Dupl_CreditNotes_AI (Duplicate Creditnote Analysis with AI)
- CU_AI_Duplicates and VE_AI_ Duplicates (Duplicates in the Master Data)
- CU_AI_Outliers and VE_AI_Outliers (Outliers in the Master Data)
- Root-Cause-Analysis
Here are the explanations for the columns in the Root-Cause-Analysis:
Columnname | Explanation |
Root_Cause_Count | - Contains the description of a found cause (to account for the number of findings) |
Root_Cause_Volume | - Contains the description of a found cause (for consideration of the amounts of findings) |
Trend_Count | - Contains the value of the trend line for the given month (for analysing the number of findings) |
Trend_Volume | - Contains the value of the trend line for the given month (for viewing the amounts of the findings) |
Benchmark_Count_Mean | - Contains the value of the benchmark for the given month (to analyse the number of findings) |
Benchmark_Volume_Mean | - Contains the value of the benchmark for the given month (to take into account the amounts of the findings) |