I'm Chen from the Lakehouse Department of the GLB Division. Based on a report by Mr. Nagasato, who is attending Data + AI Summit 2023 (DAIS2023) in San Francisco, here is an overview of the session titled "How to Build LLMs on Your Company's Data While on a Budget."
The talk showed how companies with limited budgets can build large language models (LLMs) on their own data. The presenter, Sean Owen, is a Principal Product Specialist at Databricks. The session is aimed at engineers interested in data and AI, data scientists at companies with limited budgets, and business owners who want to make use of their own data.
How to build and customize large-scale language models
Building large language models is a difficult task for companies with limited budgets. However, the following methods make it possible to build an effective language model while keeping costs down.
Leverage existing models: Effective customization is possible by taking an already published large language model (e.g. GPT-3) and fine-tuning it on your own data.
Preprocess your data: Cleaning the data and removing noise improves training efficiency.
Choose the model size wisely: The larger the model, the more resources training requires, so it is important to choose a model size that fits your budget.
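As a concrete illustration of the preprocessing step above, here is a minimal sketch in Python. The specific cleaning rules (stripping HTML tags, normalizing Unicode, collapsing whitespace, deduplicating documents) are common examples chosen for illustration, not steps taken from the talk:

```python
import re
import unicodedata

def clean_text(text: str) -> str:
    """Normalize a raw training document and strip common noise."""
    # Normalize Unicode so visually identical characters compare equal.
    text = unicodedata.normalize("NFKC", text)
    # Drop HTML tags left over from web scraping.
    text = re.sub(r"<[^>]+>", " ", text)
    # Collapse runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()

def dedupe(docs):
    """Remove exact duplicate documents while preserving order."""
    seen = set()
    out = []
    for doc in docs:
        key = clean_text(doc).lower()
        if key and key not in seen:
            seen.add(key)
            out.append(doc)
    return out
```

For example, `clean_text("<p>Hello   world</p>")` returns `"Hello world"`. Even simple steps like these can noticeably improve training efficiency by removing tokens the model would otherwise waste capacity on.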
Leveraging Selection Models
A selection model is a model specialized in choosing the best candidate from multiple options. Utilizing selection models in the following ways makes it possible to build an effective language model.
Combine multiple models: Combining existing language models with selection models can produce more effective results.
Increase data variation: Selection models can handle a wide variety of data, so increasing the variety of the data can improve the model's performance.
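One way to read "combining language models with a selection model" is: several generator models each propose a candidate answer, and a selection function picks the best one. The sketch below is an assumption-laden toy: the lambda "models" and the length-based scorer are stand-ins for real language models and a real trained selector:

```python
from typing import Callable, Iterable

def generate_candidates(prompt: str, generators) -> list[str]:
    """Ask each generator model for one candidate answer."""
    return [g(prompt) for g in generators]

def select_best(candidates: Iterable[str],
                score: Callable[[str], float]) -> str:
    """Return the candidate the selection model scores highest."""
    return max(candidates, key=score)

# Stub "models" standing in for real language models (illustration only).
model_a = lambda p: p + " - short answer"
model_b = lambda p: p + " - a much more detailed answer"

# A trivial stand-in selection model: prefer longer, more detailed answers.
score_fn = len

candidates = generate_candidates("Q", [model_a, model_b])
best = select_best(candidates, score_fn)
```

In practice the scorer would itself be a trained model (for example a reward or ranking model), but the ensemble structure - generate several candidates, then select - is the same.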
The latest concepts, features, and services
Recently, various concepts, features, and services related to building large language models have emerged. Some of them are introduced below.
Zero-shot learning: A method that can make appropriate predictions even for data never seen during training. This allows quick adaptation to new data.
Transfer learning: Applying a model trained on one task to another task. This saves training time and resources.
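The zero-shot idea can be illustrated with a toy classifier that assigns labels it was never trained on, by comparing the input to natural-language label descriptions. Real systems use pretrained embeddings or entailment models for this comparison; the bag-of-words cosine similarity below is an assumption made only to keep the sketch self-contained:

```python
import math
from collections import Counter

def bow(text: str) -> Counter:
    """Bag-of-words vector (a real system would use learned embeddings)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def zero_shot_classify(text: str, label_descriptions: dict) -> str:
    """Pick the label whose description is most similar to the text,
    with no task-specific training."""
    vec = bow(text)
    return max(label_descriptions,
               key=lambda lbl: cosine(vec, bow(label_descriptions[lbl])))

# Hypothetical labels the "model" has never been trained on.
labels = {
    "sports": "game team score player match",
    "finance": "stock market price invest earnings",
}
result = zero_shot_classify("the team won the match", labels)
```

Here `result` is `"sports"`: the classifier handles new labels simply by being given their descriptions, which is what makes zero-shot approaches attractive when labeled training data is scarce.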
Utilization of cloud services
Cloud services let you flexibly provision the resources needed to build a large language model. They also provide the latest features and services, which helps in building an effective language model. In this way, even budget-constrained companies can leverage the latest concepts, features, and services to build language models on their own data.
In this talk, Sean Owen introduced how companies with limited budgets can build large language models on their own data. An effective language model can be built by leveraging existing models, preprocessing data, choosing the model size appropriately, and so on. We believe we can make the most of our own data while leveraging the latest concepts, features, and services.
This content is based on reports from members participating on site in DAIS sessions. During the DAIS period, articles related to the sessions will be posted on the special site below, so please take a look.
Translated by Johann
Thank you for your continued support!