APC Tech Blog

This is the technical blog of AP Communications Co., Ltd.

Managing Data Encryption in Apache Spark™

Introduction

I'm Chen from the Lakehouse Department of the GLB Business Division. Based on a report by Mr. Nagasato, who is attending Data + AI SUMMIT 2023 (DAIS) on site, this article covers the session "Latest Trends in Data Encryption and Storage Utilizing Apache Spark™". The session focused on preserving the confidentiality and transience of data in order to protect sensitive data. The intended audience is data engineers, data analysts, and data scientists.

Leveraging Apache Spark and Key Manager

Apache Spark is an engine that can work with most of your data, allowing you to prepare it, read it, and send it to a server. It retrieves data from other storage systems and operates on multiple units of data at once. In addition, the keys used for connections and data control are managed by a system called a key manager. The key manager's goal is to preserve confidentiality and transience in order to protect sensitive data.
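To make the role of a key manager more concrete, here is a minimal sketch in plain Python. It is not the system shown in the session and it does not use any Spark API. The class name `ExpiringKeyManager`, its methods, and the idea of modelling "transience" as a key time-to-live are all assumptions made for illustration.

```python
import secrets
import time


class ExpiringKeyManager:
    """Hypothetical in-memory key manager: issues random keys that expire."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be checked without sleeping
        self._keys = {}      # key_id -> (key_bytes, expiry_time)

    def create_key(self, key_id):
        """Generate a fresh 256-bit key and record when it expires."""
        key = secrets.token_bytes(32)
        self._keys[key_id] = (key, self._clock() + self._ttl)
        return key

    def get_key(self, key_id):
        """Return the key while it is valid; drop it and fail once expired."""
        key, expires_at = self._keys[key_id]  # unknown ids raise KeyError
        if self._clock() >= expires_at:
            del self._keys[key_id]
            raise KeyError(f"key {key_id!r} has expired")
        return key
```

In this sketch, a data processing job would ask the manager for a key right before encrypting or decrypting, so the key itself never has to be stored alongside the data, and an expired key can no longer be used to read it.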

Improved performance of analytical data with prepared format configuration

The talk showed how to improve the performance of analytical data using ready-made format configurations. This technique is expected to improve the speed of reading and writing data and greatly improve the efficiency of data analysis.

Technologies for efficient data sharing

The following technologies were introduced to improve the efficiency of data sharing.

  1. Data compression: reduces data volume and improves transfer speed
  2. Data partitioning: enables distributed processing to speed up reading and writing data
  3. Cache utilization: uses a cache to read frequently accessed data quickly

Combining these technologies is expected to greatly increase the efficiency of data sharing and improve the performance of data analysis.
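The three techniques can be sketched in a few lines of plain Python using only the standard library. This is an illustration of the general ideas, not the configuration presented in the session and not Spark code. The function names `compress`, `partition`, and `load_block` are hypothetical.

```python
import zlib
from functools import lru_cache


def compress(payload: bytes) -> bytes:
    """Compression: shrink the payload before it is stored or transferred."""
    return zlib.compress(payload, level=6)


def partition(records, num_partitions):
    """Partitioning: route (key, value) records to buckets by a stable hash.

    Records with the same key always land in the same bucket, so each
    bucket could be processed by a separate worker.
    """
    buckets = [[] for _ in range(num_partitions)]
    for key, value in records:
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        buckets[index].append((key, value))
    return buckets


@lru_cache(maxsize=128)
def load_block(block_id: int) -> bytes:
    """Caching: stand-in for an expensive read; repeat calls hit the cache."""
    return compress(f"block-{block_id};".encode("utf-8") * 1000)
```

Repetitive data compresses well, records sharing a key stay together, and a second call to `load_block` with the same id is served from memory instead of being recomputed.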

Summary

This presentation introduced data encryption and storage using Apache Spark™, including the functions of Apache Spark, the role of the key manager, improving the performance of analytical data with prepared format configurations, and streamlining data sharing. Armed with this knowledge, you can improve the efficiency of data processing while preserving confidentiality and transience to protect sensitive data. Data encryption and storage management will continue to be important themes in data processing using Apache Spark™.

Conclusion

This content is based on reports from members participating on site in DAIS sessions. During the DAIS period, articles related to the sessions will be posted on the special site below, so please take a look.

Translated by Johann

www.ap-com.co.jp

Thank you for your continued support!