🔎Glossary of Terms
Analytics
Analytics, as used here, refers specifically to DaBot analytics.
Analytics are captured at the Bot level based on the Bot's run history. They provide insight into the structure and data quality changes that occurred on the source and target over time. These changes include the number of columns, the position of the columns, and any data drift.
As the product grows, the scope of the analytics will change.
Bot
A bot is an executable program in DaBot. Bots play a central role in the product and use DaBot's I2A, CodeGen, and CodeX for execution.
Users create bots to load their source data to the target. To create a bot, users must provide all general properties: name, description, source, and destination.
Users can access the Column Match and Pattern Match modules by creating or editing an individual Bot.
Bot-Advanced Property
These are additional, optional properties for a Bot execution. They include scheduling, email notification on bot status, and email notification on drift alerts, among others.
More advanced properties will be added as part of the future roadmap.
Bot-General Property
These are the mandatory properties needed for a Bot execution: Bot Name, Bot Description, Destination, and Source.
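The split between general and advanced properties can be sketched as a small data structure. This is an illustrative sketch only: the class names, field names, and the `missing_general_properties` helper are assumptions made for this example, not DaBot's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BotAdvancedProperties:
    # Optional settings; every field name here is hypothetical.
    schedule: Optional[str] = None      # for example, a cron-style expression
    notify_on_status: bool = False      # email on bot status
    notify_on_drift: bool = False       # email on drift alerts


@dataclass
class Bot:
    # The four general properties the glossary lists as mandatory.
    name: str
    description: str
    sources: List[str]
    destination: str
    advanced: BotAdvancedProperties = field(default_factory=BotAdvancedProperties)

    def missing_general_properties(self) -> List[str]:
        """Return the names of mandatory properties that are empty."""
        missing = []
        if not self.name.strip():
            missing.append("name")
        if not self.description.strip():
            missing.append("description")
        if not self.sources:
            missing.append("source")
        if not self.destination.strip():
            missing.append("destination")
        return missing
```

A bot with an empty description and no sources would report both as missing, while the advanced properties fall back to their defaults without blocking execution.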
CodeGen
A backend engineering module that generates code from the I2A output. The CodeX module executes the generated code.
CodeX
A backend engineering module that executes the code generated by the CodeGen module.
Column Match
By default, this screen has the results from the I2A output. Every target column will have a possible source column mapped along with the confidence score.
The Column Match module allows users to pick columns from the sources and map them to the target manually.
It allows users to change the column mapping or set default values for a target column. It also shows a data preview of the source and target columns when a particular target and source column pair is selected.
Confidence Score
A score generated by the I2A algorithm using its internal logic, which is based on various mapping parameters. The confidence score plays a significant role in source-to-target mapping.
At the end of an I2A execution, one target column can have multiple source column matches. Each possible match from the source has its own confidence score based on the mapping parameters. Unmatched columns have no confidence score.
The user cannot directly update the confidence score. Only I2A can generate the score.
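To illustrate how several candidates per target column might be used, the sketch below picks the candidate with the highest confidence score for each target. The input shape, the `best_matches` name, and the "highest score wins" rule are all assumptions for illustration; the glossary does not describe how DaBot selects among candidates.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical shape: target column -> list of (source column, confidence score).
Candidates = Dict[str, List[Tuple[str, float]]]


def best_matches(candidates: Candidates) -> Dict[str, Optional[str]]:
    """For each target column, pick the source candidate with the highest score."""
    result: Dict[str, Optional[str]] = {}
    for target, options in candidates.items():
        if not options:
            # No candidate at all: leave the target unmapped.
            result[target] = None
            continue
        source, _score = max(options, key=lambda pair: pair[1])
        result[target] = source
    return result
```

The function only reads scores and never modifies them, consistent with the rule that only I2A generates confidence scores.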
Connection
A connection module allows users to connect to a database or file system. Once a connection is set up to a DB or file system, users can access the data for their source or target.
DaBot
The name of the company, short for Data Bot. DaBot's bots automate the data integration process.
Destination
Destination is part of the Bot-General Property. Users must pick a destination (file or table) into which the bot loads the data from the source. The destination is also referred to as the target.
Edit Bot
Edit Bot lets a user modify an existing bot. Using the Edit Bot option, the user can make the following changes to a bot:
Add/remove a source
Modify Bot Name, Bot Description
Modify any properties in the advanced properties
I2A
I2A stands for Intelligent Ingestion Algorithm. It is the core algorithm of the product. It reads a source and a destination and produces a JSON output that contains all the mapping information from the source to the target. It generates a confidence score for each source-to-target mapping. The output of this module is used by the downstream CodeGen process.
Login
Registered users can log in to the product using their registered email address and password.
Metadata Repo
Backend database that contains the metadata of the product. It includes details about users, bots, bot attributes, connections, run history, etc. It is built on PostgreSQL.
Metadata Repo doesn't capture the Bot's learnings. SmartHub captures that information.
Preview Bot
The Preview Bot functionality allows users to review the mapping information before or after the execution. Preview Bot takes users to the Column Match or Pattern Match function.
Run History
Run history contains a bot's execution history. Users can review past runs and the run attributes associated with a bot. Run attributes are source details, target details, and run time.
SmartHub
Backend DB that contains the Bot's learnings on the source-to-target mapping. For every destination defined in the product, SmartHub will capture all unique source mappings. SmartHub acts as the critical re-usability component of the product. The I2A algorithm leverages SmartHub as a key element to produce the mappings.
Only references to Bots and destinations are in the SmartHub. The Metadata Repo contains all other information about the Bot, Bot attributes, and run history.
Source
Source is part of the Bot-General Property. Users must pick at least one source (file or table) for a bot to load data to the destination. Users can add multiple sources to a Bot.
Drift
Any change at the metadata level in the source or the target dataset between two bot executions is a drift. It could be due to any of the following:
Change in the column names (Customer_Name to Cust_Name)
Change in the position of the columns
Change in the data pattern of the columns (Customer_Name column populated with Date Values)
It can be a data drift or schema drift.
DriftDetection Logic
A module that detects the changes in the source or target compared to the previous run.
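A comparison against the previous run can be sketched as below. This is a minimal illustration, not DaBot's detection logic: the `detect_drift` name, the per-column metadata shape (`name` and `pattern` keys), and the wording of the findings are assumptions. Because a renamed column cannot be told apart from a dropped and an added column using names alone, the sketch reports both sides of a rename separately.

```python
from typing import Dict, List


def detect_drift(previous: List[Dict[str, str]], current: List[Dict[str, str]]) -> List[str]:
    """Compare ordered column metadata from two runs and describe each change."""
    findings: List[str] = []
    if len(previous) != len(current):
        findings.append(f"column count changed: {len(previous)} -> {len(current)}")

    prev_pos = {col["name"]: i for i, col in enumerate(previous)}
    curr_pos = {col["name"]: i for i, col in enumerate(current)}
    prev_pattern = {col["name"]: col["pattern"] for col in previous}

    # Columns that existed before but are gone now (dropped or renamed).
    for col in previous:
        if col["name"] not in curr_pos:
            findings.append(f"column missing or renamed: {col['name']}")

    for col in current:
        name = col["name"]
        if name not in prev_pos:
            findings.append(f"new or renamed column: {name}")
            continue
        if prev_pos[name] != curr_pos[name]:
            findings.append(f"position changed: {name} {prev_pos[name]} -> {curr_pos[name]}")
        if prev_pattern[name] != col["pattern"]:
            findings.append(f"pattern changed: {name} {prev_pattern[name]} -> {col['pattern']}")
    return findings
```

An empty result means no drift was found between the two runs under these assumptions.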
Possible Match
The JSON output of I2A contains a section called possible match. For every column in the target, it lists all the potential matches from the source and their corresponding confidence scores.
Unmatched
The JSON output of I2A will contain a section called unmatched. Source columns that are not matched to any of the target columns are listed in the unmatched section.
In the column match screen, users can pick any unmatched column and assign it to a target column.
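The sketch below shows one way the possible match and unmatched sections could look, and how a manual assignment might move a column out of the unmatched section. The JSON key names, the sample column names, and the `assign_unmatched` helper are hypothetical; the real I2A output format is not specified in this glossary.

```python
import json

# Hypothetical I2A output; the key names and layout are assumptions.
I2A_OUTPUT = """
{
  "possible_match": {
    "customer_name": [
      {"source_column": "Cust_Name", "confidence_score": 0.91},
      {"source_column": "cust_nm", "confidence_score": 0.62}
    ],
    "signup_date": []
  },
  "unmatched": ["created_on", "legacy_id"]
}
"""


def assign_unmatched(output: dict, source_column: str, target_column: str) -> dict:
    """Manually map an unmatched source column to a target column."""
    if source_column not in output["unmatched"]:
        raise ValueError(f"{source_column} is not in the unmatched section")
    if target_column not in output["possible_match"]:
        raise ValueError(f"{target_column} is not a target column")
    output["unmatched"].remove(source_column)
    # A manual mapping carries no confidence score, since only I2A generates scores.
    output["possible_match"][target_column].append(
        {"source_column": source_column, "confidence_score": None}
    )
    return output


mapping = assign_unmatched(json.loads(I2A_OUTPUT), "created_on", "signup_date")
```

Recording the manual mapping with an empty score is a design choice made for this sketch, so that user-assigned mappings stay distinguishable from I2A-generated ones.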
Account
An Account is the root node in the DaBot hierarchy. Every customer can have one or more accounts. Users must be added to at least one Account to access the DaBot application.
Customer: Acme, Corp
Workspace:
Acme_dev
Acme_Prod
User:
[email protected] assigned to Acme_dev will have access only to the Acme_dev instance
[email protected] assigned to Acme_dev and Acme_Prod will have access to both the Acme_dev and Acme_Prod instances
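The access rule in the example above can be sketched as a simple membership check. The user identifiers, the `ASSIGNMENTS` table, and the `can_access` function are placeholders invented for this illustration, and the case-insensitive comparison is an assumption.

```python
from typing import Dict, Set

# Hypothetical assignments mirroring the example; user names are placeholders.
ASSIGNMENTS: Dict[str, Set[str]] = {
    "user_a": {"acme_dev"},
    "user_b": {"acme_dev", "acme_prod"},
}


def can_access(user: str, account: str) -> bool:
    """A user can reach an account only if they are assigned to it."""
    return account.lower() in ASSIGNMENTS.get(user, set())
```

A user with no assignments at all cannot access any instance, which matches the requirement that users be added to at least one Account.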