The only tool to simplify your work and accelerate data processing

Forget everything you know about data ingestion: Zwoox is here to change your perspective and make importing and extracting structured data easy.

Turn your data ingestion into an easy process

Zwoox is a highly scalable tool based on Hadoop technologies (Spark, HBase, Kafka, Hive, Impala) that allows you to change and model data as you ingest it.

No more coding individual data pipelines

Zwoox eliminates the need to manually code a separate data pipeline for every data source.

Accelerate your data processing

Zwoox accelerates your data processing by giving you several options for importing data, including replicating any RDBMS DML into Hadoop structures in near real time.

What’s under the hood?

Kafka for CDC ingestion

By using Kafka for DML event storage, Zwoox can access
data change events in a structured and ordered fashion,
guaranteeing fault tolerance and accurate event replication.
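
To see why ordered change events matter for accurate replication, here is a minimal Python sketch that replays DML events into an in-memory copy of a table. It is an illustration only, not Zwoox's implementation: the event layout (`offset`, `op`, `key`, `row`) and the function name `apply_dml_events` are assumptions made for this example, and no Kafka client is involved.

```python
from typing import Any


def apply_dml_events(events: list[dict[str, Any]]) -> dict[Any, dict[str, Any]]:
    """Replay DML change events, in offset order, into a table keyed by primary key.

    The event fields used here are hypothetical, chosen for illustration.
    """
    table: dict[Any, dict[str, Any]] = {}
    # Sorting by offset restores the order in which the source database
    # applied the changes, even if events arrive shuffled.
    for event in sorted(events, key=lambda e: e["offset"]):
        key = event["key"]
        if event["op"] == "INSERT":
            table[key] = dict(event["row"])
        elif event["op"] == "UPDATE":
            # Merge the changed columns into the existing row.
            table.setdefault(key, {}).update(event["row"])
        elif event["op"] == "DELETE":
            table.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {event['op']}")
    return table


# Events deliberately listed out of order to show the effect of sorting.
events = [
    {"offset": 0, "op": "INSERT", "key": 1, "row": {"name": "Ada", "city": "London"}},
    {"offset": 2, "op": "DELETE", "key": 2, "row": {}},
    {"offset": 1, "op": "INSERT", "key": 2, "row": {"name": "Bob", "city": "Paris"}},
    {"offset": 3, "op": "UPDATE", "key": 1, "row": {"city": "Turin"}},
]
replica = apply_dml_events(events)
```

If the delete at offset 2 were applied before the insert at offset 1, the replica would keep a row that no longer exists in the source, which is the kind of drift that ordered replay avoids.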

Sqoop for direct Database access

Zwoox uses Sqoop for a table’s initial load before activating the CDC stream,
or to do a full or incremental import of a table when
the CDC functionality isn’t available.

Spark for all processing

Zwoox fully leverages Spark for its high scalability, resilience and
fast data processing. Combined with YARN, you can configure
the amount of resources to use in your cluster for Zwoox.

HBase for configuration repository

Zwoox uses HBase as its internal database since it allows very quick
access to all relevant metadata even when dealing with
thousands of concurrent ingestions.

What does Zwoox offer?

  • Partitioning automation
  • Data type translation
  • Data availability through atomic bulk changes
  • Full audit history without performance impacts
  • Derive new columns from pre-set functions or “pluggable” code
  • Operational Integration with Cloudera Manager
  • Incremental Sqoop-based imports
  • Structured historical data cleansing
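
Two of the features above, data type translation and derived columns, can be pictured with a short Python sketch. Everything in it is assumed for illustration: the type mapping, the fallback to `STRING`, and the names `translate_schema` and `add_derived_columns` are hypothetical and do not describe how Zwoox is configured or implemented.

```python
from typing import Any, Callable

# Hypothetical mapping from source RDBMS column types to Hive types.
RDBMS_TO_HIVE = {
    "VARCHAR": "STRING",
    "INTEGER": "INT",
    "NUMBER": "DECIMAL",
    "DATE": "TIMESTAMP",
}


def translate_schema(columns: dict[str, str]) -> dict[str, str]:
    """Translate each source column type, falling back to STRING (an assumption)."""
    return {name: RDBMS_TO_HIVE.get(ctype.upper(), "STRING")
            for name, ctype in columns.items()}


def add_derived_columns(
    row: dict[str, Any],
    derivations: dict[str, Callable[[dict[str, Any]], Any]],
) -> dict[str, Any]:
    """Return a copy of the row with one extra column per derivation function."""
    derived = dict(row)
    for name, fn in derivations.items():
        # Each function sees the original row, so derivations stay independent.
        derived[name] = fn(row)
    return derived


schema = translate_schema({"id": "integer", "name": "varchar", "ordered_on": "date"})

# A "pluggable" derivation: extract the year from an ISO date string,
# for example to use as a partition value.
row = add_derived_columns(
    {"id": 7, "name": "Ada", "ordered_on": "2019-03-14"},
    {"order_year": lambda r: int(r["ordered_on"][:4])},
)
```

Passing derivations as plain functions keeps the sketch close to the idea of pluggable code: a new column needs only a new entry in the dictionary.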
