APC Tech Blog

This is the technical blog of AP Communications Co., Ltd.

Intro to Fivetran - Overview

Introduction

I'm Matsuzaki from the Lakehouse department of the GLB Business Department. This article gives an overview of Fivetran, an ELT tool for which our department holds a partner contract.

What is ELT?

Before I get into Fivetran, I want to briefly touch on the ELT process.

ELT stands for Extract, Load, Transform: data is extracted from a source, loaded into a separate storage system, and then transformed there. The main difference from ETL is the order of the load and transform steps. ETL transforms data before loading it, while ELT transforms data after loading it.

Compared with ETL, the ELT process has two main advantages.

  • The lead time from the start of data ingestion until data analysts can actually work with the data is short.
  • Because the raw, untransformed data is kept in the data warehouse (DWH), it is easy to debug and reprocess when a transformation does not go as expected and produces data in an undesired format.

Figure source: The ultimate guide to ELT @ Fivetran Blog
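
The ELT order described above can be sketched in a few lines of Python. This is only a minimal illustration and not how Fivetran itself is implemented: an in-memory SQLite database stands in for the DWH, and the `SOURCE_ROWS` data and the `raw_orders` / `paid_orders` table names are hypothetical.

```python
import sqlite3

# Hypothetical source records; in practice these would come from an API or a database.
SOURCE_ROWS = [
    {"id": 1, "amount": "19.90", "status": "paid"},
    {"id": 2, "amount": "5.00", "status": "refunded"},
    {"id": 3, "amount": "42.10", "status": "paid"},
]


def extract():
    """Extract: read the records from the source without changing them."""
    return list(SOURCE_ROWS)


def load(conn, rows):
    """Load: store the raw records as received (amount is still text)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders (id INTEGER, amount TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:id, :amount, :status)", rows
    )


def transform(conn):
    """Transform: build an analysis table from the raw table, inside the warehouse."""
    conn.execute("DROP TABLE IF EXISTS paid_orders")
    conn.execute(
        "CREATE TABLE paid_orders AS "
        "SELECT id, CAST(amount AS REAL) AS amount "
        "FROM raw_orders WHERE status = 'paid'"
    )


def run_elt():
    """Run the three steps in ELT order and return the connection."""
    conn = sqlite3.connect(":memory:")  # stand-in for a DWH
    load(conn, extract())
    transform(conn)
    return conn


if __name__ == "__main__":
    conn = run_elt()
    print(conn.execute("SELECT id, amount FROM paid_orders ORDER BY id").fetchall())
```

Because `raw_orders` is never modified, `transform` can be corrected and re-run at any time without extracting the data again, which is the debugging and reprocessing advantage mentioned above.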

Fivetran overview

Fivetran is a cloud-based ELT (Extract, Load, Transform) tool. It was developed to make it quick and easy to ingest data from a variety of data sources into DWHs and SaaS-type data stores.

Key features include:

  • 1. Compatibility with various data sources: Plug-ins called connectors semi-automate the data import settings, saving you the trouble of retrieving data from each data source and applying schemas.

  • 2. Automatic schema setting: Along with automatic data synchronization, the data schema can be set automatically. This helps maintain data quality and reduces the data engineering burden. *1

  • 3. Real-time data synchronization: Fivetran supports data extraction via API and real-time data synchronization to keep data up to date.
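
To give a feel for what automatic schema setting involves, here is a small Python sketch of type inference over a type hierarchy. It is a simplified illustration under my own assumptions and does not reproduce Fivetran's actual rules: the three-level `TYPE_ORDER` hierarchy and the function names are hypothetical, so see the documentation referenced in footnote *1 for the real behavior.

```python
# Hypothetical hierarchy, ordered from the narrowest type to the broadest.
TYPE_ORDER = ["INTEGER", "FLOAT", "STRING"]


def infer_value_type(text):
    """Return the narrowest type in TYPE_ORDER that can represent one raw string."""
    try:
        int(text)
        return "INTEGER"
    except ValueError:
        pass
    try:
        float(text)
        return "FLOAT"
    except ValueError:
        return "STRING"


def infer_column_type(values):
    """Return the narrowest type that can hold every non-null value in a column.

    A column containing only nulls falls back to the narrowest type (INTEGER).
    """
    widest = 0
    for value in values:
        if value is None:
            continue  # nulls do not influence the inferred type
        widest = max(widest, TYPE_ORDER.index(infer_value_type(value)))
    return TYPE_ORDER[widest]


def infer_schema(rows):
    """Infer one type per column from a list of dict records."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return {name: infer_column_type(vals) for name, vals in columns.items()}


if __name__ == "__main__":
    sample = [
        {"id": "1", "price": "10", "note": "ok"},
        {"id": "2", "price": "9.5", "note": None},
    ]
    print(infer_schema(sample))
```

In this sketch a column is widened only as far as its values require: `price` becomes `FLOAT` because one value has a decimal part, while `id` stays `INTEGER`.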

Why Fivetran is needed

Data integration is an integral part of how companies and organizations make business decisions and conduct marketing activities. When building a data infrastructure for these purposes, using different tools for each data source, data format, and storage system means that setting up the ELT pipeline takes a lot of time and effort. By solving such problems, Fivetran helps organizations make business decisions faster and more accurately, and greatly improves the speed and efficiency of data analysis.

Fivetran is also easy to implement and cost-effective, so it is used by companies and organizations of all sizes, from small businesses to large corporations.

Data sources compatible with plug-ins

Fivetran's connectors support numerous data sources. Representative examples are shown below. For connectors other than those listed here, please refer to this page.

Databases

  • MariaDB, MySQL
  • PostgreSQL
  • MongoDB
  • Oracle
  • SQL Server

SaaS tools

  • Salesforce
  • HubSpot
  • Zendesk
  • Shopify
  • Google Analytics
  • Intercom
  • Stripe

Storage

  • Amazon S3
  • Google Cloud Storage
  • Microsoft Azure Blob Storage

Conclusion

That's all for this article. I hope you found it useful. In the next article, I plan to explain how to integrate Fivetran with Databricks. Thank you for reading.

We provide a wide range of support, from introducing a data analysis platform built on Databricks to helping you bring its development and operation in-house. If you are interested, please contact us.

https://www.ap-com.co.jp/service/data_ai/

We are also looking for people to work with us! We look forward to hearing from anyone who is interested in APC.

Translated by Johann

www.ap-com.co.jp

*1: For details on the data type assignment and inference features, see the following: core-concepts: datatypehierarchy, core-concepts: typeinference