olap

olap operators

roll-up

drill-down

slice-and-dice

pivot

drill-across

drill-through

extraction, transformation and loading (etl)

The ETL process extracts data from the sources, improves its overall quality, transforms it according to the schema, and loads it into the data warehouse (DWH).

---
title: ETL
---
flowchart TD
    A[EXTRACTION\nextract data from sources]
    B[CLEANSING\nimprovements to the quality\nremoving duplicates]
    C[TRANSFORMATION\ndata processing according to the schema]
    D[LOADING\nload data in the DWH]
    A --> B
    B --> C
    C --> D
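The four phases can be sketched as a chain of small functions. This is only an illustrative sketch: the source rows, the field names (`customer`, `amount`), and the target schema (`customer_name`, `amount_cents`) are assumptions made up for the example, and the "DWH" is a plain Python list.

```python
# Hypothetical source rows; the field names are illustrative only.
SOURCE_ROWS = [
    {"id": 1, "customer": " Alice ", "amount": "10.50"},
    {"id": 1, "customer": " Alice ", "amount": "10.50"},  # duplicate row
    {"id": 2, "customer": "Bob", "amount": "7.00"},
]


def extract(source):
    """EXTRACTION: copy the rows out of the source."""
    return [dict(row) for row in source]


def cleanse(rows):
    """CLEANSING: trim whitespace and drop rows whose id was already seen."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen:
            continue
        seen.add(row["id"])
        cleaned.append({**row, "customer": row["customer"].strip()})
    return cleaned


def transform(rows):
    """TRANSFORMATION: reshape rows to the (assumed) DWH schema."""
    return [
        {
            "customer_name": row["customer"],
            "amount_cents": round(float(row["amount"]) * 100),
        }
        for row in rows
    ]


def load(rows, dwh):
    """LOADING: append the transformed rows to the DWH table."""
    dwh.extend(rows)
    return dwh


def run_etl(source, dwh):
    """Run the phases in the order shown in the diagram."""
    return load(transform(cleanse(extract(source))), dwh)


dwh_table = run_etl(SOURCE_ROWS, [])
```

Keeping each phase as a separate function mirrors the diagram and makes it possible to test or replace one phase without touching the others.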

extraction

The extraction phase gets data from the data sources. There are two possible approaches: static or incremental.

---
title: EXTRACTION
---
flowchart TD
    A[APPROACHES]
    B[STATIC\nDWH is populated for the first time]
    C[INCREMENTAL\nthe DWH is updated with new data regularly]
    A --> B & C

Each approach is more suitable for certain types of data:

| types of data | types of extraction |
| --- | --- |
| structured data (from databases or formatted files) | static (for the first DWH population operation) |
| unstructured data (from social media) | incremental (for the update operations on the DWH) |
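The difference between the two approaches can be sketched as follows. The example assumes that every source row carries an `updated_at` timestamp and that the time of the previous extraction is known; both are assumptions for illustration, and real sources may detect changes in other ways (logs, triggers, snapshots).

```python
from datetime import datetime

# Hypothetical source rows with an assumed `updated_at` column.
SOURCE = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]


def static_extract(source):
    """Static extraction: take every row (first population of the DWH)."""
    return list(source)


def incremental_extract(source, last_run):
    """Incremental extraction: take only rows changed after the last run."""
    return [row for row in source if row["updated_at"] > last_run]


initial = static_extract(SOURCE)
delta = incremental_extract(SOURCE, last_run=datetime(2024, 1, 4))
```

Here `initial` holds all three rows, while `delta` holds only the rows modified after 4 January 2024.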

cleansing

solution for data inconsistencies

Dictionary-based techniques

Approximate merging
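The notes only name these two techniques, so the following is a rough sketch under one possible reading: a dictionary-based step replaces known variants with a canonical form, and approximate merging treats values as the same entity when their string similarity is high enough. The abbreviation dictionary, the 0.85 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Assumed lookup dictionary mapping known variants to a canonical form.
ABBREVIATIONS = {"st.": "street", "rd.": "road", "ave.": "avenue"}


def normalize(text):
    """Dictionary-based step: lowercase and expand known abbreviations."""
    words = text.lower().split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def similar(a, b, threshold=0.85):
    """True when two strings are at least `threshold` similar."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def approximate_merge(values, threshold=0.85):
    """Keep one representative for each group of near-identical values."""
    merged = []
    for value in values:
        norm = normalize(value)
        if not any(similar(norm, kept, threshold) for kept in merged):
            merged.append(norm)
    return merged


addresses = ["12 Main St.", "12 main street", "12 Main Stret", "5 Oak Rd."]
result = approximate_merge(addresses)
```

In this run the first three addresses collapse into a single entry: the abbreviation is resolved by the dictionary, and the misspelling "Stret" is caught by the similarity check.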

transformation

denormalization

loading

refresh

update
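The two loading modes listed in this section can be contrasted with a small sketch. It assumes the usual reading of the terms: a refresh rewrites the DWH table completely, while an update applies only the new or changed rows. The function names and the dictionary keyed by `id` that stands in for the DWH table are hypothetical.

```python
def refresh_load(dwh, source_rows):
    """Refresh: discard the current contents and rewrite the table in full."""
    dwh.clear()
    dwh.update({row["id"]: row for row in source_rows})


def update_load(dwh, changed_rows):
    """Update: apply only new or changed rows, keeping everything else."""
    for row in changed_rows:
        dwh[row["id"]] = row


dwh = {}
refresh_load(dwh, [{"id": 1, "qty": 5}, {"id": 2, "qty": 3}])
update_load(dwh, [{"id": 2, "qty": 4}, {"id": 3, "qty": 1}])
```

After these two calls, row 1 is unchanged, row 2 carries the new quantity, and row 3 has been added.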