Intelligent Systems

Compilation and Optimization for Large Language Models

Features
Applying compilation and optimization techniques to large language models improves their operational efficiency and performance.

Description

The high computational demands of large language models make it an industry-wide challenge to reduce computational load and improve efficiency for deployment on edge devices and terminals. Compilation and optimization techniques for large language models, including operator fusion, scheduling, and quantization, can improve operational efficiency and performance. With these techniques, the models can run efficiently in resource-constrained environments, enabling edge computing and reducing the risks of transmitting data to the cloud. This helps lower computational costs and also enhances data privacy.
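As an illustration of quantization, one of the techniques named above, the following minimal sketch shows symmetric per-tensor int8 quantization in plain Python. It is not the laboratory's implementation. The function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, chosen only for this example.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; use 1.0 for an empty or all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the int8 codes."""
    return [c * scale for c in codes]


# Hypothetical example weights (assumption, not data from the source).
weights = [0.42, -1.27, 0.05, 0.90]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
# Worst-case rounding error introduced by quantization.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each int8 code occupies one byte instead of the four bytes of a 32-bit float, so the weights take roughly a quarter of the storage, at the cost of a rounding error of about half the scale per value. Production toolchains typically use finer-grained scales (per channel or per group) than this per-tensor sketch.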
 

Dept: Electronic and Optoelectronic System Research Laboratories
POC: 陳鼎升
Tel: 03-5915499
E-mail: justinchen@itri.org.tw