APC Tech Blog

This is the technical blog of AP Communications Co., Ltd.


Building Enterprise-Grade GenAI Apps with MLflow and Vector Search

Preface

The session opened with Dennis Dymanski and Kulkichada welcoming participants, introducing themselves, and outlining their professional backgrounds and areas of expertise.

Dennis Dymanski, chief software engineer at Corning since 2010, shared his extensive involvement in machine learning engineering and advanced analytics. His focus at Corning is on delivery and acceleration in natural language processing and information sciences.

Following Dennis, Kulkichada took the stage as a Solutions Architect at Databricks. Having held the position for three years and co-authored "Data Engineering with Databricks," she discussed the crucial roles that data engineers and MLOps professionals play in the current technological landscape.

The session's agenda delved into the architecture and strategies surrounding Mosaic AI, Corning's advances in generative AI, and topics such as prompt engineering and Retrieval Augmented Generation (RAG). These themes are central to understanding the broader implications of adopting and adapting these technologies in an enterprise environment.

Participants were provided with a thorough understanding of developing and implementing enterprise-grade generative AI applications using prominent tools like MLflow and vector search. This educational experience was amplified by practical insights from seasoned experts, proving extremely beneficial for attendees. The session concluded with Dennis distributing raffle gifts, adding an enjoyable twist to the informative event.

This introduction served not only as a gateway to the extensive topics covered, but also established a baseline understanding among participants about the significance and impact of integrating advanced generative AI applications into enterprise systems.

The session focused on practical applications and methodologies related to "Mosaic AI Architecture" and "Retrieval Augmented Generation (RAG)." These key components form an essential foundation for expanding AI capabilities within a corporate setting and for improving decision-making accuracy.

1. Fine-tuning Foundation Models

The initial approach discussed involved refining an existing foundation model using enterprise-specific data. This customization ensures the model is tailored to specific tasks and delivers highly relevant and accurate outputs for given scenarios.

2. Combining Fine-tuning and RAG

Subsequently, a pattern involving the integration of fine-tuning and RAG was discussed. This strategy leverages the specificity achieved through model fine-tuning and the comprehensive search capabilities provided by RAG. This synergy is particularly effective in tackling subtle enterprise challenges and delivers finely-tuned, data-backed responses.
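To make the RAG half of this pattern concrete, here is a minimal sketch of the retrieve-then-prompt flow. The corpus, its hand-made 3-dimensional embeddings, and the prompt template are all invented for illustration; a real system would use an embedding model and a vector search index (for example, Databricks Vector Search) rather than an in-memory list.

```python
import math

# Toy in-memory "vector store": (document text, embedding) pairs.
# The 3-d embeddings are hand-made purely for illustration.
CORPUS = [
    ("Fusion glass is formed by flowing molten glass over a trough.", [0.9, 0.1, 0.0]),
    ("Optical fiber is drawn from a heated glass preform.",           [0.1, 0.9, 0.0]),
    ("MLflow tracks experiments, models, and deployments.",           [0.0, 0.1, 0.9]),
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, k=2):
    """Return the k corpus documents most similar to the query vector."""
    ranked = sorted(CORPUS, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, query_vec):
    """Augment the user question with retrieved context before calling an LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query_vec))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How is fusion glass made?", [1.0, 0.0, 0.0])
print(prompt)
```

In the combined pattern, the model receiving this prompt would itself be fine-tuned on enterprise data, so retrieval supplies fresh facts while fine-tuning supplies domain fluency.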

3. Pretraining

The third approach covered involved pretraining models using extensive datasets. This method enables the creation of robust large language models (LLMs) based on billions to trillions of data points. Such models benefit from a vast foundational knowledge base, achieving unparalleled diversity and adaptability.

Particularly, the session focused on the role of RAG, exploring its application areas and the advantages gained when combined with the fine-tuning process. The presentation highlighted numerous corporate scenarios where this combined approach significantly enhances the accuracy and relevance of AI applications.

By leveraging the capacities of the Mosaic AI Architecture and RAG, corporations can significantly accelerate the AI adoption cycle, ensuring the developed solutions are not only effective but also precisely align with unique operational requirements.

Corning's GenAI Applications

Headquartered in the small town of Corning, New York, Corning Incorporated has nearly 175 years of expertise in materials science, specializing in glass and ceramics manufacturing. The integration of traditional glass-making techniques with cutting-edge AI technologies has led to significant advancements and innovative ways to meet modern needs. If you're in the Northeast USA, a visit to the Corning Museum of Glass is highly recommended: you can explore everything from 3,000-year-old glass artifacts to modern glass products and learn about manufacturing processes in detail, such as the drawing of optical fiber or the production of the fusion glass used in iPhone screens.

This combination of long-standing materials-science expertise and cutting-edge technology allows Corning not only to adopt new technologies but also to create products that meet contemporary demands. It serves as a successful model of how the evolution of AI can have substantial impacts on corporations and shape the future of businesses globally, and it highlights how Corning uses generative AI applications to maintain industry leadership and innovation.

Data Integration and Governance

When constructing an enterprise portal as a conventional interface for organizations, initial discussions with enterprise architects about non-functional requirements are crucial. While business needs might be clearly understood, architects emphasize elements like latency, throughput, observability, maintainability, and operational support.

For companies like Corning, which conduct extensive proprietary research and manage intellectual property, data security and governance are top priorities. Interactive interfaces therefore need to adopt similar security measures, authenticating users and authorizing access on a per-record basis.
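Per-record authorization of this kind can be sketched as a filter applied to retrieval results before any text reaches the LLM prompt. The record fields, group names, and helper below are all hypothetical; in practice this logic would live in the retrieval layer (or be enforced by the vector index itself), backed by the organization's real identity system.

```python
# Hypothetical governance check: each indexed record carries an
# access-control set, and results are filtered per user before prompting.
RECORDS = [
    {"id": "r1", "text": "Public product datasheet.",  "allowed_groups": {"everyone"}},
    {"id": "r2", "text": "Internal research memo.",    "allowed_groups": {"research"}},
    {"id": "r3", "text": "Patent draft (restricted).", "allowed_groups": {"legal", "research"}},
]

def authorized_records(records, user_groups):
    """Keep only records that at least one of the user's groups may see."""
    return [r for r in records if r["allowed_groups"] & user_groups]

# A researcher sees everything their groups entitle them to; a general
# employee sees only public records.
visible = authorized_records(RECORDS, {"everyone", "research"})
print([r["id"] for r in visible])
```

Filtering before prompt construction matters: once restricted text is in the context window, the model's stochastic outputs can leak it in ways that are hard to audit after the fact.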

The integration of Large Language Models (LLMs) adds an innovative yet challenging element to this setting: their stochastic nature makes outputs somewhat unpredictable. This unpredictability complicates data integrity and governance, necessitating sophisticated strategies to ensure the accuracy and security of generated content.

This section detailed essential technical considerations for protecting data and enhancing operational efficiency in a corporate setting. We explored real-world scenarios that highlight potential risks and necessary measures when deploying generative AI technologies. Understanding how to effectively align these technologies with business objectives and established governance frameworks is crucial for corporations looking to responsibly leverage the benefits of AI.

Implementation and Optimization: Enhancing Enterprise-Level GenAI Applications

To build GenAI applications at an enterprise level, efficient data processing and model management are necessary. This section focused on methods to optimize these processes.

Local GPU Utilization for Document Vectorization

Using external APIs to vectorize documents and text data often incurs cost and speed limitations. Leveraging locally hosted Databricks GPU infrastructure for vectorization instead streamlines the process and reduces costs: content is extracted from documents and vectorized directly on local GPUs rather than sent to external services.
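The shape of that pipeline can be sketched as batched, in-cluster embedding. The `embed_batch` function below is a stand-in for a real model served on local GPUs (it just hashes words into a fixed-size vector so the sketch runs anywhere); the batching loop is the part that carries over to a real deployment, where large batches keep the GPU busy and avoid per-request API charges.

```python
DIM = 8  # toy embedding dimension; real models use hundreds or thousands

def embed_batch(texts):
    """Toy batch embedder: bag-of-words hashed into DIM buckets.
    Stand-in for a locally served embedding model."""
    vectors = []
    for text in texts:
        vec = [0.0] * DIM
        for word in text.lower().split():
            vec[hash(word) % DIM] += 1.0
        vectors.append(vec)
    return vectors

def vectorize_corpus(docs, batch_size=2):
    """Embed all documents batch by batch, as one local GPU worker would."""
    all_vecs = []
    for i in range(0, len(docs), batch_size):
        all_vecs.extend(embed_batch(docs[i:i + batch_size]))
    return all_vecs

docs = ["glass fiber optics", "fusion glass process", "mlflow model registry"]
vectors = vectorize_corpus(docs)
print(len(vectors), len(vectors[0]))
```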

Building Concurrent Vectorization Processes

To vectorize massive datasets effectively, such as 25 million documents, parallel processing across multiple GPUs is essential. This setup increases processing speed, improves data-handling efficiency, and significantly reduces the time required for large-scale vectorization.
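The core of such a setup is sharding the corpus across workers and re-assembling results in order. In the sketch below the workers are threads and the "embedding" is a trivial length-based stand-in so it runs anywhere; in the setting described here, each worker would instead be a GPU-backed embedding process (for example, a Spark task pinned to one GPU).

```python
from concurrent.futures import ThreadPoolExecutor

def embed_shard(shard):
    """Stand-in for a GPU worker: one toy 1-d vector per document."""
    return [[float(len(doc))] for doc in shard]

def make_shards(docs, n_workers):
    """Split docs into contiguous, roughly equal shards."""
    size = -(-len(docs) // n_workers)  # ceiling division
    return [docs[i:i + size] for i in range(0, len(docs), size)]

def parallel_vectorize(docs, n_workers=4):
    """Embed shards concurrently; map() preserves shard order,
    so flattening restores the original document order."""
    shards = make_shards(docs, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(embed_shard, shards)
    return [vec for part in results for vec in part]

docs = [f"document {i}" for i in range(10)]
vectors = parallel_vectorize(docs, n_workers=4)
print(len(vectors))
```

Contiguous shards plus ordered `map()` keep vectors aligned with documents, which matters when the results are written back next to their source rows.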

Monitoring the Context Window Size of Embedding Models

When performing embeddings or vectorization, monitoring the context window size of the model in use is crucial. The context window size determines how much surrounding text the model considers for each token, which in turn affects the accuracy and consistency of the resulting vectors. Maintaining appropriate window sizes keeps embeddings effective and preserves downstream application performance.
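One practical consequence is that input text must be chunked to fit the model's window. The sketch below splits text by whitespace tokens with a small overlap so boundary context is not lost; real tokenizers count subword tokens, so this word count is only an approximation, and the 128-token limit is an invented example, not any particular model's window.

```python
def chunk_text(text, max_tokens=128, overlap=16):
    """Split text into chunks of at most max_tokens whitespace tokens,
    overlapping consecutive chunks by `overlap` tokens."""
    if max_tokens <= overlap:
        raise ValueError("max_tokens must exceed overlap")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the final chunk already covers the tail
    return chunks

long_text = " ".join(f"w{i}" for i in range(300))
chunks = chunk_text(long_text, max_tokens=128, overlap=16)
print(len(chunks), [len(c.split()) for c in chunks])
```

Checking chunk sizes against the model's documented window before embedding avoids silent truncation, which is a common cause of inconsistent vectors.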

Conclusion

This section highlighted several key strategies for optimizing the implementation of enterprise-level GenAI applications. By leveraging local GPU infrastructure for cost-effective vectorization, setting up concurrent processing structures for large datasets, and carefully managing embedding model parameters, corporations can achieve significant improvements in scalability and efficiency. These advancements form the foundation for realizing the full potential of GenAI in corporate settings, ensuring that these applications are not only powerful but also practical and prepared to meet modern business demands.

About the special site during DAIS

This year, we have set up a special site to report on session content and the on-site atmosphere at DAIS! We plan to update the blog daily throughout DAIS, so please take a look.

www.ap-com.co.jp