Llama 2: A significant milestone in the world of AI
With the development of language models showing no signs of letting up, Meta AI has made its contribution to the AI world by introducing the second iteration of its groundbreaking open-source language model, Llama 2. It marks a significant step in the field of natural language processing (and artificial intelligence as a whole), further democratizing the power of LLMs and improving the quality of many LLM-based applications.
In this blog post, we will focus on the widely-discussed Llama 2 model. We will go through the technicalities, safety issues, the tools built around it, and the possibilities of using the model. We’ll also examine how the model compares to others available, and we’ll take a quick look at Meta’s recent partnership with Microsoft.
What is Llama 2, and why is there so much buzz around it?
Chances are, if artificial intelligence sparks your interest, you’ve probably heard about the excitement surrounding Llama 2. Llama stands for Large Language Model Meta AI, an autoregressive language model that relies on a transformer architecture (similar to many of the recently developed alternatives). While the first iteration of Llama (presented in late February 2023) was generously made available for non-commercial use, the second version, Llama 2, takes a leap forward by not only being open to the public but also permitting commercial usage. Essentially, this means that both individual developers and enterprises can now use the Llama 2 model to develop countless commercial applications. This may herald even more new products based on language models, and thus even faster development of AI in this field!
The Llama 2 license permits commercial use of the model with one small exception – if your products or services had more than 700 million monthly active users at the time of the model’s release, you must request a license from Meta. This exception was put in place because Meta AI wanted to prevent its largest current competitors from utilizing the model. Anyone else can make unlimited use of it, and even if applications based on it reach that kind of scale in the future, they will still be license-compliant.
Llama 2 is available to the public in a variety of sizes and flavors. The smallest model has 7 billion parameters, followed by a 13 billion parameter model and a staggering 70 billion parameter one, so you can choose your own trade-off between accuracy and the speed and cost of your system. Interestingly, there is also a model with 34 billion parameters, but according to Meta’s research paper, it has not been released to the public because there was not enough time to bring it up to Meta’s self-imposed safety threshold. Each of the sizes mentioned is quite significant; if you want to learn more about operating such models, it is worth checking out our blog post about LLMOps.
In addition, a few versions of the model are available for different applications. Beyond the foundational version of the model, fine-tuned versions for chat and programming assistance are also available. According to Meta, the model was trained on 40% more data than the previous version, which amounts to around 2 trillion tokens in total, and over 1 million human annotations (more precisely, binary comparisons of model outputs) were used in the RLHF (Reinforcement Learning from Human Feedback) process. This means that, after classic self-supervised training, the model was fine-tuned on human preference labels indicating which responses are more helpful. The above-mentioned improvements allow for building multiple business applications such as specialized chatbots, knowledge and information retrieval search engines, code or text autocompletion, automatic content creators, and many more.
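To give a feel for how the chat-tuned variants are prompted in practice, below is a minimal sketch of the single-turn prompt template documented for the chat models, with the system prompt and user message as placeholders; treat it as illustrative rather than as an exhaustive specification.

```python
# Illustrative sketch of the Llama 2 chat prompt template (single-turn).
# The [INST] / <<SYS>> tags follow the format documented for the chat models;
# the system prompt and user message here are just example placeholders.

def build_llama2_chat_prompt(system_prompt: str, user_message: str) -> str:
    """Wrap a system prompt and a user message in Llama 2 chat tags."""
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_llama2_chat_prompt(
    system_prompt="You are a helpful, concise assistant.",
    user_message="Summarize what Llama 2 is in one sentence.",
)
print(prompt)
```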
Read our post about implementing LLMs in business operations to learn more about possible use cases in detail.
Safety development
In the second iteration, particular attention was paid to the issue of safety associated with the use of the model. During pretraining, efforts were made to ensure that the data used was legally sourced and did not come from users who had not given their consent. In addition, Meta tackled the thorny issue of reducing the model’s biases toward religion, gender, nationality, race, and sexual orientation. Then, during fine-tuning with the RLHF method, the authors attempted to eliminate three types of behavior from the model: illicit and criminal activities, hateful and harmful activities, and unqualified advice.
A more detailed description of the safety measures can be found in Meta’s research paper about the model, where the relevant chapter runs to 12 pages! It is worth mentioning that such safety improvements can prove to be a double-edged sword, because in some situations the model may wrongly interpret a harmless question as harmful or hurtful.
A rapidly growing ecosystem
Several tools have already been built around Llama 2 to make it easier for developers to use. A noteworthy example is the open-source Llama 2-Accessory library, which targets pre-training, fine-tuning, and deployment. It allows fine-tuned models to be easily evaluated on popular benchmarks, optimized for speed and size, or encapsulated in an API. Building such tools not only facilitates the work of developers but also greatly accelerates the creation of new products based on Llama 2.
Another library to check out when planning your work around the second version of Llama is llama-recipes from Meta itself, which provides ready-made scripts for fine-tuning in various hardware configurations. It is also worth remembering that Llama 2 can be used with LangChain and within the Hugging Face ecosystem, which opens up a vast range of applications; a minimal example follows below.
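As a rough illustration of how little code it takes to try the model in the Hugging Face ecosystem, here is a minimal sketch using the transformers library. It assumes you have accepted Meta’s license, been granted access to the gated meta-llama/Llama-2-7b-chat-hf checkpoint, and have a GPU with enough memory (the accelerate package is needed for device_map="auto").

```python
# Minimal sketch: running Llama-2-7b-chat through the Hugging Face transformers library.
# Assumes access to the gated "meta-llama/Llama-2-7b-chat-hf" repo has been granted
# and that a suitable GPU (plus the accelerate package) is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory usage
    device_map="auto",          # place the model on available devices automatically
)

prompt = "[INST] Explain in two sentences why open commercial licensing of LLMs matters. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```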
Best of all, some chatbot-based products on the market, such as Perplexity Labs and Poe, are already using Llama 2. This trend suggests that an increasing number of similar products will adopt the technology in the future.
Benchmarking the performance
While proprietary models like OpenAI’s GPT series are still superior, it’s worth noting that Llama 2 shows tremendous potential. In fact, on certain benchmarks it even surpasses GPT-3.5, the model behind the free version of ChatGPT. Of course, it should be remembered that benchmarks are not the only valid indicators of a model’s effectiveness. Nevertheless, when designing a custom solution, it is a great idea to check how the leading models perform on your downstream task, as sketched below.
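An informal spot check on your own downstream task might look like the sketch below, which runs a handful of task-specific prompts through candidate models and collects the answers for side-by-side comparison. The model IDs and prompts are placeholders; a proper evaluation would use a larger labelled test set and task-specific metrics.

```python
# Informal spot check: run a few task-specific prompts through candidate models
# and collect the answers for side-by-side comparison. Model IDs and prompts are
# placeholders; a real evaluation needs a labelled test set and proper metrics.
from transformers import pipeline

candidates = ["meta-llama/Llama-2-7b-chat-hf", "meta-llama/Llama-2-13b-chat-hf"]
test_prompts = [
    "[INST] Classify the sentiment of this review: 'The delivery was late again.' [/INST]",
    "[INST] Extract the company name from: 'Acme Corp reported record revenue.' [/INST]",
]

for model_id in candidates:
    generator = pipeline("text-generation", model=model_id, device_map="auto")
    for prompt in test_prompts:
        answer = generator(prompt, max_new_tokens=50, do_sample=False)[0]["generated_text"]
        print(f"--- {model_id} ---\n{answer}\n")
```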
Meta & Microsoft
In a stride toward the future, Meta and Microsoft have embarked upon a groundbreaking partnership. United by a shared vision, the two tech giants have pledged their support to the Llama 2 family of language models. Llama 2 is available in the Azure AI model catalog, which means that you can easily build applications on top of the model or simply experiment with it using cloud-native tools. It is also optimized to run locally on Windows. And if you don’t use Azure, don’t worry: AWS, GCP, and many smaller cloud service providers (e.g., Anyscale Endpoints) already offer Llama 2 as well.
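If you consume Llama 2 as a managed endpoint rather than hosting it yourself, the interaction typically boils down to an authenticated HTTP request. The sketch below is purely illustrative: the endpoint URL, API key, and JSON schema are hypothetical placeholders, since each provider (Azure, AWS, Anyscale Endpoints, and others) defines its own API shape.

```python
# Hypothetical sketch of calling a managed Llama 2 endpoint over HTTP.
# The URL, API key, and JSON schema below are placeholders; consult your
# provider's documentation (Azure, AWS, Anyscale Endpoints, ...) for the real API.
import requests

ENDPOINT_URL = "https://example-endpoint.example.com/v1/generate"  # placeholder
API_KEY = "YOUR_API_KEY"  # placeholder

payload = {
    "prompt": "[INST] Suggest three names for a document-search chatbot. [/INST]",
    "max_tokens": 100,
    "temperature": 0.7,
}

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json())  # the response schema depends on the provider
```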
Summary
In an era of artificial intelligence and language models, the introduction of Llama 2 represents an undeniable landmark – a testament to the constant quest to push the boundaries of what is possible. As Llama 2 develops its capabilities, one thing becomes clear: we are at a point in time where natural language and technology are intertwined in unprecedented ways, opening the door to previously unexplored innovations. All of this is now available not only for public use but also for commercial use, which is sure to accelerate development in this field even further and result in many great products applicable to our lives and work.
For those who want a deeper understanding of the nuances contained in the Llama 2 architecture, a treasure trove of insights awaits in Meta’s comprehensive research paper. Dive into the depths of innovation here: Meta’s Llama 2 Paper. If you yearn to embark on a journey of first-hand interaction, you can take the opportunity to download Llama 2 for exploration here: Download Llama 2.