Meta just released its answer to GitHub Copilot, and it’s free

Meta has officially released Code Llama, a new open source LLM for code completion, generation, and testing that can run on local machines and compete with ChatGPT.
The model is freely available for research and commercial use, and comes in several variants to suit different needs. It can generate or complete lines of code in languages such as Python, C++, Java, and Bash.
Code Llama is a specialized version of Meta’s free LLM Llama 2, and was created by further training Llama 2 on 500 billion tokens of code and code-related data.
The model comes with three different parameter sizes: 7 billion (7B), 13 billion (13B), and 34 billion (34B).
Meta states that while the 34B model is the most accurate, the 7B and 13B models run faster and are better suited to low-latency uses such as real-time code completion.
Code Llama 34B scored an accuracy of 48.8% on HumanEval, a benchmark dataset created by OpenAI to run AI models through programming challenges. That is better than the 30.5% achieved by Llama 2’s base model and a slight improvement over the 48.1% scored by OpenAI’s GPT-3.5, the model that underpins ChatGPT.
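HumanEval scores like these are reported using the pass@k metric, which estimates the probability that at least one of k generated samples solves a problem. As a rough illustration (this is not Meta’s evaluation code), the unbiased pass@k estimator introduced alongside HumanEval can be sketched as:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the HumanEval benchmark.

    n: total samples generated per problem
    c: number of those samples that pass the unit tests
    k: budget of samples the user is allowed to try
    """
    if n - c < k:
        # Every possible size-k draw contains at least one correct sample.
        return 1.0
    # 1 minus the probability that all k drawn samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 10 samples per problem, 5 passing: pass@1 = 0.5
print(pass_at_k(10, 5, 1))
```

A score such as 48.8% on HumanEval is typically the average of this quantity (for k=1) across the benchmark’s 164 hand-written programming problems.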
All models still trail OpenAI’s multimodal GPT-4, which can generate code in a wide range of programming languages and is the base model for Microsoft’s advanced AI coding assistant Copilot X.
In addition to the various sizes of Code Llama models, Meta has released two fine-tuned models titled “Code Llama – Python” and “Code Llama – Instruct”.
The former has undergone additional training on an extensive dataset of 100 billion tokens of Python code, making it particularly accurate at generating code in that language.
Meta states that it was created because Python is among the most widely used languages in the AI community and is the foundation of the open source machine learning framework PyTorch.
Code Llama – Instruct has been trained on a further 5 billion tokens to tune it for natural language input, and it is the model Meta recommends for users who want to generate answers or code from plain-text questions, as with a tool like ChatGPT.
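Instruction-tuned Llama models expect plain-text requests to be wrapped in a specific chat template. As a minimal sketch, assuming the Instruct variant follows Llama 2’s `[INST]`/`<<SYS>>` chat format (check the model card before relying on this), a prompt-building helper might look like:

```python
def build_instruct_prompt(user_message: str, system_prompt: str = "") -> str:
    """Wrap a plain-text request in a Llama-2-style chat template.

    The exact tags are an assumption based on Llama 2's published chat
    format; the hypothetical helper just concatenates the expected markers.
    """
    if system_prompt:
        # An optional system instruction is nested inside the user turn.
        user_message = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message}"
    return f"<s>[INST] {user_message} [/INST]"

prompt = build_instruct_prompt("Write a Python function that reverses a string.")
print(prompt)
```

The formatted string would then be passed to the model (e.g. via an inference library) in place of the raw question.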
While the generic Llama 2 can be used in a similar way, it is not as accurate in its code responses, as it has not undergone the same fine-tuning as Code Llama.
The 7B model can also run on a single graphics processing unit (GPU), although Meta has not specified the minimum hardware requirements for this.
Software engineer Anton Bakaj posted a video in which Code Llama 34B generated code at roughly 49ms per token while running on four Nvidia RTX 3090 GPUs.
“Code llama 34B on 4x3090s, ~49ms per token” – August 25, 2023
This can be useful for programmers who want to use the model to build, test, or complete code based on sensitive data or private information.
Although this will require an upfront investment in hardware, small businesses may weigh these costs against subscriptions to services such as ChatGPT Plus or Copilot X.
The cost of keeping data local may also be seen as worthwhile compared with the “black box” of oversight when code is passed to companies such as Google and OpenAI.
Meta did not disclose the origins of some of the data used to train Llama 2, which could expose companies to legal action under legislation such as the European Union’s AI Act if it is later found that they have generated code based on copyrighted data.
LLaMA, the predecessor to Llama 2, was leaked online in March 2023 and is said to be available anonymously over BitTorrent, making it easy to access. Some experts have expressed concerns that, in the wrong hands, LLaMA could be used to promote cybercrime.
Unlike LLaMA, Llama 2 and Code Llama are freely available outside of academia. Meta stated that Code Llama underwent additional testing to eliminate malicious output.
“As with all evolving technologies, Code Llama involves risks. Building AI models responsibly is critical, and we took several safety measures before we launched Code Llama.
“As part of our red teaming efforts, we quantified Code Llama’s risk of generating malicious code. We created prompts that attempted to solicit malicious code with clear intent and scored Code Llama’s responses to those prompts against ChatGPT (GPT-3.5 Turbo). Our results found that Code Llama answered with safer responses.”
Beyond overtly harmful output, Code Llama will also be judged on its day-to-day usefulness for code generation and debugging.
A recent study found that ChatGPT gives incorrect answers to programming questions more than 50% of the time.