Llama 2: Leveling the playing field for LLM-based enterprise AI applications

Josh Poduska | 2023-07-20 | 8 min read


In a groundbreaking move that will undoubtedly reshape the landscape of AI adoption, Meta has released Llama 2—a family of highly performant open source foundation models. This announcement is set to impact businesses seeking to harness the power of artificial intelligence to drive innovation, enhance customer experiences, and stay competitive in an increasingly AI-driven world. Get started with Llama in the Domino platform with our open source template.

Why is the release of Llama 2 a big deal?

In a sentence, Llama 2 is the first collection of open source models that rival the capabilities of closed, proprietary, pay-for-use models.

Llama 2 models come in two flavors: the base model and a chat-ready model. Both flavors are available in three sizes (7B, 13B, and 70B parameters) to suit a variety of deployment designs. The models also have a larger context length than Llama 1: about 4K tokens, double the previous 2K, or roughly 6 to 12 standard pages of content. This allows for longer, more detailed, and elaborate prompts. The models also incorporate techniques to speed up responses, such as grouped-query attention in the largest model. The models, code, and weights are freely available and authorized for commercial use. In their paper, Meta released details on the fine-tuning process for the Llama 2 chat models. They also provide a Responsible Use Guide. These contributions will enable the building of safer and more reliable LLMs, whether those are variations of Llama or entirely new models.
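As a rough check on the page estimate above, here is a minimal sketch of the arithmetic. The words-per-token ratio and the words-per-page figures are common rules of thumb assumed for illustration, not values from Meta's paper; only the 4K context length comes from the model itself.

```python
# Back-of-the-envelope conversion from a token budget to pages of text.
# Every constant except the context length is an assumed rule of thumb.

CONTEXT_TOKENS = 4096         # Llama 2 context length ("about 4K tokens")
WORDS_PER_TOKEN = 0.75        # assumption: typical average for English text
WORDS_PER_PAGE_DENSE = 500    # assumption: single-spaced page
WORDS_PER_PAGE_SPARSE = 250   # assumption: double-spaced page


def pages_for_context(tokens: int, words_per_token: float, words_per_page: int) -> float:
    """Convert a token budget into an approximate page count."""
    return tokens * words_per_token / words_per_page


low = pages_for_context(CONTEXT_TOKENS, WORDS_PER_TOKEN, WORDS_PER_PAGE_DENSE)
high = pages_for_context(CONTEXT_TOKENS, WORDS_PER_TOKEN, WORDS_PER_PAGE_SPARSE)
print(f"~{low:.1f} to ~{high:.1f} pages")  # ~6.1 to ~12.3 pages
```

Actual token counts depend on the tokenizer and the text, so treat the result as an order-of-magnitude guide when sizing prompts.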

* A note on naming conventions: The base model is usually referred to simply as Llama 2. Llama 2 is also used when referring to the entire family of models, so context is important to understand which is being referenced. The chat model is referred to as Llama-2-chat, Llama 2-chat, or Llama 2-Chat. Each is acceptable and used in Meta’s paper. When referring to the specific size of a model, variations of the naming convention, such as Llama-2-13b-chat or Llama 2-Chat (13B), are fine to use.

The base Llama 2 model can be fine-tuned on a company’s internal data to create powerful models specific to unique needs. It generally performs better than existing open-source base models on common benchmarks and is comparable to or outperforms the closed GPT-3 model. There have been previous open-source models comparable to GPT-3. What really sets Llama 2 apart is the release of its chat model.

Llama-2-chat can be plugged directly into commercial conversational applications, with fluidity and safety guardrails that rival closed "product" LLMs such as ChatGPT, Bard, and Claude. Closed product LLMs undergo intensive fine-tuning to align with human preferences, using techniques such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). This significantly enhances model usability and safety. The fine-tuning process is key to reducing hallucinations and harmful responses, and it helps generate dialog that feels natural to humans. It was this fine-tuning that produced the first release of ChatGPT and pushed LLMs over a barrier of usability into a space that has captivated over 100 million users around the world. However, the fine-tuning process comes with substantial computational and human annotation costs and has lacked transparency. Those costs make it impractical for most companies to recreate the process on their own, forcing them to rely on established incumbents with proprietary models. Llama 2 models were trained on 40% more data than Llama 1, incorporated SFT on 100,000+ annotations, and were aligned to human preferences with over a million annotations. By releasing chat models that have already been through this process, Llama 2 levels the playing field.

How exactly does the release of Llama 2 impact businesses seeking to leverage AI?

Reduced Barriers to Entry: One of the most significant hurdles for companies looking to tap into the potential of AI has been the cost and access to cutting-edge models. The open-source nature of Llama 2 removes these barriers, making it easier for startups and small businesses to experiment and innovate without prohibitive costs.
Accelerated AI Development: By providing the weights and model architecture, Meta empowers developers and data scientists to fine-tune Llama 2 for their specific use cases. This enables faster AI development cycles, facilitating rapid prototyping and testing of AI-powered applications better aligned with business objectives.
Enhanced Performance: Llama 2's benchmark results, which are comparable to or better than GPT-3 and ahead of other open-source base models, are a testament to its robustness and potential. Businesses can expect better performance, improved accuracy, and more reliable outcomes from their AI-driven solutions, leading to more satisfied customers and stakeholders.
Innovation and Collaboration: The open-source community around Llama 2 is likely to foster collaboration and innovation among technical experts. This collective effort can lead to continuous improvements and new breakthroughs in AI capabilities, benefiting the entire AI ecosystem.
Ethical AI Advancements: Open-source models like Llama 2 encourage transparency and ethical AI practices. As businesses leverage this model, there is a heightened focus on addressing biases, ensuring fairness, and understanding the implications of AI in decision-making processes.
Increased Control: Most prominent LLMs run on cloud-provider infrastructure, frequently relying on dedicated services, because hosting LLMs requires complex infrastructure and deep DevOps know-how that many businesses lack. Llama 2 will allow organizations to openly develop and share the techniques necessary to host the LLM, easing the path for enterprises to host the model in-house or in their own VPCs. This removes a common barrier, as many enterprises want to retain control over the data used by and with the model.

Wrapping it up

Meta's release of Llama 2 is a pivotal moment for businesses seeking to harness the full potential of AI. These open-source foundation models have the power to greatly impact how AI is adopted, developed, and deployed across industries. By breaking down barriers, improving performance, and encouraging collaboration, Llama 2 paves the way for a more inclusive, innovative, and responsible AI future.

Technical leaders who embrace advancements like Llama 2, and those that will undoubtedly follow, place their organizations at the forefront of AI-driven advancements. Llama 2 will enable them to make informed decisions, drive efficiencies, and deliver exceptional customer experiences. The time to explore Llama 2 is now; with it, the potential for transformative AI solutions awaits.

Are you ready to embrace the power of Llama 2 and unlock the true potential of AI for your business? Let's usher in a new era of AI innovation together. At Domino Data Lab we are enabling the development of generative AI applications at industry-leading enterprises across insurance, finance, pharmaceuticals, life sciences, manufacturing, government, retail, and more. Domino’s platform brings enterprise-grade functionality such as reproducibility, collaboration, and security to all aspects of the LLMOps lifecycle.

Josh Poduska

Josh Poduska is the Chief Field Data Scientist at Domino Data Lab and has 20+ years of experience in analytics. Josh has built data science solutions across domains including manufacturing, public sector, and retail. Josh has also managed teams and led data science strategy at multiple companies, and he currently manages Domino’s Field Data Science team. Josh has a Master's in Applied Statistics from Cornell University. You can connect with Josh at https://www.linkedin.com/in/joshpoduska/

