Scaling GenAI Series
Fine-Tuning Large Language Models
Optimizing with Quantization and LoRA
Check out the first installment of Scaling GenAI: the Builder's Toolkit. As you move from generative AI proofs of concept (PoCs) to production, you must adapt LLMs to your company's use cases.
Fine-tuning has emerged as the standard way to infuse domain knowledge into pre-trained models. Yet fine-tuning can be demanding on infrastructure, slow, and expensive. This on-demand webinar will help you overcome these challenges.
The session recording explores two optimization techniques: quantization and Low-Rank Adaptation (LoRA). You will:
- Review the motivation and theory behind PEFT (parameter-efficient fine-tuning) techniques
- Discover the power of quantization with the Hugging Face Trainer on Domino using Falcon-40b
- Investigate LoRA with the Falcon-7b LLM using PyTorch Lightning
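To give a feel for the two techniques before the session, here is a minimal NumPy-only sketch of the underlying math. All shapes, the rank `r`, and the scaling factor `alpha` are illustrative assumptions, not tied to Falcon, Domino, or any specific library:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Quantization: store weights as int8, dequantize for compute ---
W = rng.normal(size=(16, 16)).astype(np.float32)  # a pre-trained weight matrix
scale = np.abs(W).max() / 127.0                   # symmetric per-tensor scale
W_int8 = np.round(W / scale).astype(np.int8)      # 4x smaller than float32
W_deq = W_int8.astype(np.float32) * scale         # approximate reconstruction

# --- LoRA: learn a low-rank update B @ A instead of updating W itself ---
r, alpha = 4, 8                                                # illustrative values
A = rng.normal(scale=0.01, size=(r, 16)).astype(np.float32)    # trainable
B = np.zeros((16, r), dtype=np.float32)                        # trainable, init 0
W_eff = W_deq + (alpha / r) * (B @ A)  # effective weight at forward time

full_params = W.size              # what full fine-tuning would update
lora_params = A.size + B.size     # what LoRA actually trains
print(f"max quantization error: {np.abs(W - W_deq).max():.4f}")
print(f"trainable params: {lora_params} (LoRA) vs {full_params} (full)")
```

Because `B` starts at zero, the model's output is unchanged at the start of training, and only the small `A` and `B` matrices receive gradients, which is why LoRA cuts memory and compute so sharply.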