History of development
Getl has been developed by EasyData since 2012 as an open source product based on Groovy under the GPL-3.0 license.
Getl is a cross-platform solution and runs on a Java Runtime Environment.
Getl was originally developed as an ETL framework that allowed you to solve complex tasks of capturing and loading data from a variety of sources into the Vertica data warehouse.
Today it is a full-fledged product that covers the full development cycle of a data warehouse (DWH) solution.
Features
The main advantage of this ETL tool is the ability to work with dynamic structures such as ASN.1, XML, JSON, or non-standard CSV formats with a floating field structure.
Getl is written in Groovy and uses the power of this language's dynamic compilation.
Getl generates code for specific tasks, including field mapping and type casting, which is then compiled into Java bytecode and executed.
This ensures fast process execution without restricting work with dynamic structures and without requiring a strict description of data structures during development.
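The idea of preparing field mapping and type casting once and then reusing it for every row can be illustrated with a minimal sketch. This is not Getl's implementation or API: Getl generates and compiles Groovy code, whereas the sketch below only composes a plain Java function from a list of rules. The names `FieldMappingSketch`, `FieldRule`, `cast`, and `compile` are hypothetical and exist only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a simplified stand-in for generated mapping code, not Getl's API.
final class FieldMappingSketch {

    /** Hypothetical rule: copy one source field to a destination field with a target type. */
    static final class FieldRule {
        final String source;
        final String destination;
        final Class<?> targetType;

        FieldRule(String source, String destination, Class<?> targetType) {
            this.source = source;
            this.destination = destination;
            this.targetType = targetType;
        }
    }

    /** Casts a raw value to the target type; only a few types are handled in this sketch. */
    static Object cast(Object raw, Class<?> targetType) {
        if (raw == null) {
            return null; // a field missing from the row stays null
        }
        String text = raw.toString().trim();
        if (targetType == Long.class) return Long.valueOf(text);
        if (targetType == Integer.class) return Integer.valueOf(text);
        if (targetType == Double.class) return Double.valueOf(text);
        if (targetType == Boolean.class) return Boolean.valueOf(text);
        if (targetType == String.class) return raw.toString();
        throw new IllegalArgumentException("Unsupported target type: " + targetType);
    }

    /** Builds the row converter once from the rules; the result is reused for every row. */
    static Function<Map<String, Object>, Map<String, Object>> compile(List<FieldRule> rules) {
        List<FieldRule> frozen = List.copyOf(rules);
        return row -> {
            Map<String, Object> out = new LinkedHashMap<>();
            for (FieldRule rule : frozen) {
                out.put(rule.destination, cast(row.get(rule.source), rule.targetType));
            }
            return out;
        };
    }

    public static void main(String[] args) {
        List<FieldRule> rules = List.of(
                new FieldRule("id", "customer_id", Long.class),
                new FieldRule("amount", "order_amount", Double.class),
                new FieldRule("vip", "is_vip", Boolean.class));
        Function<Map<String, Object>, Map<String, Object>> mapper = compile(rules);

        // The "vip" field is absent from this row, as can happen with a floating field structure.
        Map<String, Object> row = Map.of("id", "42", "amount", "19.90");
        System.out.println(mapper.apply(row));
    }
}
```

The rules are interpreted at build time rather than for each row, which is the part of the approach this sketch tries to convey. Compiling generated code to bytecode, as the text describes, would remove the remaining per-row dispatch in `cast`.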
Getl supports most existing DBMS and file formats.
Getl also allows you to manipulate local, network, and distributed file systems, and contains its own simple stored procedure language that facilitates the development of analytical data marts for Vertica.
To simplify the development of ETL process logic, Getl has its own DSL, which allows you to develop simple scripts without OOP knowledge.
This language was designed for developing reusable ETL/ELT templates and for implementing the logic of capturing, converting, and controlling data within a planned or existing storage architecture.
Getl allows developing test cases for ETL/ELT processes.
It is a universal tool for preparing reference data for test and development environments with data quality checks.
Getl supports the ability to store configuration files and SQL scripts in resource files for team development.
Built-in multithreading support allows you to develop solutions for capturing and parsing large data volumes.
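As a rough illustration of why multithreading helps with parsing, the sketch below parses several independent sources in parallel using the standard Java `ExecutorService`. It does not use Getl's own multithreading facilities, which are not shown in this document. The class `ParallelParseSketch`, its methods, and the semicolon-separated input format are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: plain java.util.concurrent, not Getl's API.
final class ParallelParseSketch {

    /** Splits one semicolon-separated line into trimmed fields, keeping empty trailing fields. */
    static String[] parseLine(String line) {
        String[] parts = line.split(";", -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    /** Parses every source as a separate task and returns the parsed rows in input order. */
    static List<List<String[]>> parseAll(List<List<String>> sources, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<List<String[]>>> tasks = new ArrayList<>();
            for (List<String> source : sources) {
                tasks.add(() -> {
                    List<String[]> rows = new ArrayList<>();
                    for (String line : source) {
                        rows.add(parseLine(line));
                    }
                    return rows;
                });
            }
            List<List<String[]>> results = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<List<String[]>> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Parsing was interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A parsing task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> sources = List.of(
                List.of("1; Alice; 10.5", "2; Bob; 7"),
                List.of("3; Carol;"));
        List<List<String[]>> parsed = parseAll(sources, 2);
        for (int i = 0; i < parsed.size(); i++) {
            System.out.println("source " + i + ": " + parsed.get(i).size() + " rows");
        }
    }
}
```

Each source is handled by one task, so the sources must be independent of each other. A fixed-size pool keeps the number of worker threads bounded regardless of how many sources are submitted.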
Getl's in-memory RDBMS allows intermediate processing or caching of collected data using the capabilities of SQL.
Getl supports multiple environments to improve the development process.
Deep integration with JetBrains IntelliJ IDEA.
Using JetBrains IntelliJ IDEA together with Getl makes it possible to build a full-fledged project for your DWH with powerful functionality.
The main features are a centralized repository of data sources, reusable templates, data loading monitoring, and other service processes for cleaning and transferring data.
This integration allows step-by-step debugging of Getl script logic, profiling applications with Java profilers, and collaborating through Git or SVN version control systems.