Annotation

QUALITY INDICATORS OF THE DATA LINEAGE BASED ON STATIC ANALYSIS OF SQL QUERIES
Annotation: When working with big data, it is important to track where the data in a system originates. This knowledge helps when modifying and extending reporting functionality, and it supports analysis of how efficiently calculations use computing resources. Data lineage is used to analyze the origin of data in reporting. A data lineage can be generated with a variety of tools; one of the best-known approaches is static analysis of SQL queries using the Abstract Syntax Tree (AST) model. Since the data lineage is a directed acyclic graph, it can be analyzed and used to calculate various indicators. In this paper, we propose a list of indicators for evaluating the effectiveness of the generated data lineage, both as a whole and at the level of individual transformations within it. The proposed indicators make the process of optimizing calculations more efficient and easier to visualize.
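As a rough illustration of how indicators can be computed over a lineage graph, the sketch below stores a small lineage as a directed acyclic graph and derives three simple graph measures. The table names, the LINEAGE mapping, and the measures themselves (fan-in, fan-out, depth) are assumptions chosen for illustration; they are not the indicators proposed in the paper, and in practice the edges would be extracted from the SQL queries' ASTs rather than written by hand.

```python
from collections import defaultdict

# Hypothetical lineage: each target table maps to the source tables it reads.
# These edges are hand-written here; a real pipeline would derive them
# from static analysis of the SQL queries.
LINEAGE = {
    "stg_orders": ["raw_orders"],
    "stg_customers": ["raw_customers"],
    "fct_sales": ["stg_orders", "stg_customers"],
    "rpt_revenue": ["fct_sales"],
}


def fan_in(lineage, table):
    """Number of source tables that `table` reads directly."""
    return len(lineage.get(table, []))


def fan_out(lineage):
    """For each table, count how many tables read it directly."""
    counts = defaultdict(int)
    for sources in lineage.values():
        for src in sources:
            counts[src] += 1
    return dict(counts)


def depth(lineage, table, _cache=None):
    """Length of the longest chain of transformations ending at `table`."""
    if _cache is None:
        _cache = {}
    if table not in _cache:
        sources = lineage.get(table, [])
        if not sources:
            _cache[table] = 0  # a raw source has no upstream transformations
        else:
            _cache[table] = 1 + max(depth(lineage, s, _cache) for s in sources)
    return _cache[table]


if __name__ == "__main__":
    for name in LINEAGE:
        print(name, fan_in(LINEAGE, name), depth(LINEAGE, name))
    print(fan_out(LINEAGE))
```

Because the graph is acyclic, the recursive depth calculation always terminates, and the cache keeps it linear in the number of edges.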
Page numbers: 56-64.
For citation: Konakov P.O. Quality indicators of the data lineage based on static analysis of sql queries // Electronic Scientific Journal IT-Standard. – 2024. – No. 2. – pp. 56-64.