Meta AI has launched Llama 4, aiming to build the world’s leading open-source AI and make it accessible to everyone. Two models have been released: Llama 4 Scout, a fast, natively multimodal model with a 10 million token context length designed to run on a single GPU; and Llama 4 Maverick, a powerful 17-billion-active-parameter, 128-expert model that outperforms GPT-4o and Gemini Flash 2, also multimodal and efficient. Two more models, Llama 4 Reasoning and the massive Llama 4 Behemoth with over 2 trillion parameters, are forthcoming. Llama 4 marks a milestone for open-source AI, offering top-performing small to mid-size models, with more releases planned soon.
ChatGPT vs Llama 4
Meta’s Llama 4 and OpenAI’s ChatGPT are both powerful AI models, but they have different strengths. Llama 4, particularly Llama 4 Scout, offers a massive 10 million token context window, letting it handle vast amounts of text in a single prompt and pointing toward what Meta describes as a nearly “infinite” context window. It is also open source and designed for customization, which makes it attractive for developers. ChatGPT, on the other hand, excels at complex reasoning, creative tasks, and multimodal interactions such as image generation, and it offers a polished user experience with advanced features like deep research.
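Because the Llama 4 weights are released as open models, developers can run them outside Meta’s hosted apps with a standard Hugging Face transformers workflow. The sketch below is a minimal example, assuming the Scout checkpoint is published under an identifier like meta-llama/Llama-4-Scout-17B-16E-Instruct (inferred from the “17 billion parameters by 16 experts” naming; check the actual model card) and that you have accepted Meta’s license and have a recent transformers version.

```python
# Minimal sketch of running a Llama 4 checkpoint locally with Hugging Face transformers.
# The model ID below is an assumption based on Meta's naming convention, not a confirmed
# identifier from this announcement; access is gated behind Meta's license.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
    device_map="auto",           # let accelerate place layers on available devices
)

# Build a chat prompt using the model's own chat template.
messages = [{"role": "user", "content": "Summarize the Llama 4 release in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```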
Mark Zuckerberg transcription
Hey everyone, it is Llama 4 day. Our goal is to build the world’s leading AI, open source it, and make it universally accessible so that everyone in the world benefits. And I’ve said for a while that I think that open source AI is going to become the leading models. And with Llama 4, this is starting to happen.
Meta AI is getting a big upgrade today. So if you want to try Llama 4, you can use Meta AI in WhatsApp, Messenger, or Instagram Direct, or you can go to our website at meta.ai.
Today, we are dropping the first two open-source Llama 4 models, and we’ve got two more on the way. The first model is Llama 4 Scout. It is extremely fast, natively multimodal, it has an industry-leading nearly infinite 10 million token context length, and it is designed to run on a single GPU. It is 17 billion parameters by 16 experts, and it is by far the highest performing small model in its class.
The second model is Llama 4 Maverick, the workhorse. It beats GPT-4o and Gemini Flash 2 on all benchmarks. It is smaller and more efficient than DeepSeek v3, but it is still comparable on text. Plus, it is natively multimodal. This one is 17 billion parameters by 128 experts, and it is designed to run on a single host for easy inference. This thing is a beast.
Then we’ve got two more models on the way. One is Llama 4 Reasoning. And we’re going to have more news to share on that in the next month. And the last one we are calling Llama 4 Behemoth.
This thing is massive. More than 2 trillion parameters. I’m not aware of anyone training a larger model out there. It is already the highest performing base model in the world, and it is not even done training yet. We’re going to share more about Llama 4 Behemoth soon.
Overall, Llama 4 is a milestone for Meta AI and for open source. For the first time, the best small, mid-size, and potentially soon frontier models will be open source. There’s a lot more to do, but the trajectory here is clear. We’ve got more model drops coming soon, so stay good out there.
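The transcript’s “17 billion parameters by 16 experts” phrasing refers to a mixture-of-experts design: each token activates only a small subset of expert feed-forward blocks, so the active parameter count (17 billion) is far smaller than the model’s total size. The sketch below is a generic, deliberately simplified top-1 routing layer for illustration only; it is not Meta’s actual architecture.

```python
# Generic, simplified mixture-of-experts feed-forward layer (illustration only,
# not Meta's implementation). A router scores each token and sends it to a single
# expert, so only a fraction of the total parameters is used per token.
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Pick the single best-scoring expert per token (top-1 routing).
        scores = self.router(x)                 # (tokens, num_experts)
        expert_idx = scores.argmax(dim=-1)      # (tokens,)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                out[mask] = expert(x[mask])     # only the routed tokens touch this expert
        return out

layer = TinyMoELayer(d_model=64, d_hidden=256, num_experts=16)
tokens = torch.randn(8, 64)
print(layer(tokens).shape)  # torch.Size([8, 64]) -- same shape, but each token used one expert
```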
FAQ
What is the main goal of Llama 4?
The main goal of Llama 4 is to build the world’s leading AI, open source it, and make it universally accessible so that everyone in the world benefits.
Where can you try Llama 4?
You can try Llama 4 using Meta AI in WhatsApp, Messenger, Instagram Direct, or by visiting the website meta.ai.
What are the first two open-source Llama 4 models released?
The first two open-source Llama 4 models are Llama 4 Scout and Llama 4 Maverick.
What are the key features of Llama 4 Scout?
Llama 4 Scout is extremely fast, natively multimodal, has an industry-leading nearly infinite 10 million token context length, is designed to run on a single GPU, and has 17 billion parameters by 16 experts.
How does Llama 4 Maverick perform compared to GPT-4o and Gemini Flash 2?
Llama 4 Maverick beats GPT-4o and Gemini Flash 2 on all benchmarks.
What are the specifications of Llama 4 Maverick?
Llama 4 Maverick has 17 billion parameters by 128 experts, is natively multimodal, is smaller and more efficient than DeepSeek v3, and is designed to run on a single host for easy inference.
What future Llama 4 models are mentioned?
The future Llama 4 models mentioned are Llama 4 Reasoning and Llama 4 Behemoth.
What is special about Llama 4 Behemoth?
Llama 4 Behemoth is massive with more than 2 trillion parameters, is the highest performing base model in the world, and is still in training.
What significance does Llama 4 have for Meta AI and open source?
Llama 4 is a milestone because for the first time, the best small, mid-size, and potentially soon frontier models will be open source.
By Manesh Ram, Digital Marketing Specialist.