Open source AI is picking up speed, and Deep Cogito just added serious fuel to the fire with its first models, already outperforming major competitors on key benchmarks.
The Growing Appeal of Open Source AI Models
For a long time, the most capable AI felt like a black box, controlled by a handful of major players. But a definite shift is happening. The open source movement, which has fueled innovation in software for decades, is now transforming AI development.
Why is this important? Open source means the code and often the training methodologies are publicly available. This allows researchers, students, and businesses to inspect, modify, and build upon these models without hefty price tags or restrictive licenses common with proprietary systems.
Meet Deep Cogito: Shaking Up the Scene
Recently, a new AI research startup, Deep Cogito, emerged from stealth mode. Based in San Francisco, they did not just announce their existence; they dropped a whole new line of potent open source large language models called Cogito v1.
Founded by Drishan Arora, who previously worked on LLM modeling for Google’s generative search, Deep Cogito is clear about its goals. They aim to push AI beyond current limitations, creating models that can improve their own reasoning. The ultimate, ambitious goal is developing superintelligence, AI that surpasses human intellect across the board.
Cogito v1: What’s Under the Hood?
Deep Cogito did not start small. Their initial release includes five base models fine-tuned from Meta’s Llama 3.2, ranging significantly in size: 3 billion, 8 billion, 14 billion, 32 billion, and 70 billion parameters. This range offers options for different hardware capabilities and various use cases.
Accessibility is a major factor here. You can grab these models right now from popular platforms like the AI code sharing community Hugging Face and Ollama. For developers looking to integrate them via APIs, they are available through Fireworks AI and Together AI.
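Providers like Fireworks AI and Together AI typically expose OpenAI-compatible chat endpoints. The sketch below just builds a request payload for such an endpoint without sending it; the model identifier is a placeholder, so check the provider's catalog for the exact Cogito v1 model name.

```python
# Sketch: building a chat request for an OpenAI-compatible endpoint,
# as exposed by providers like Fireworks AI and Together AI.
# The model identifier below is a placeholder -- check the provider's
# catalog for the exact Cogito v1 model string.
import json

def build_chat_request(prompt: str, model: str = "deepcogito/cogito-v1-preview") -> str:
    """Serialize a minimal chat-completion payload."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return json.dumps(payload)

request_body = build_chat_request("Summarize open source AI in one sentence.")
```

From here, the payload would be POSTed to the provider's chat-completions URL with an API key in the headers.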
The models are released under Llama licensing terms. This is significant because it permits commercial use for many, allowing businesses to incorporate these models into their products and services. There is a threshold, though: usage beyond 700 million monthly users requires obtaining a specific paid license from Meta.
Understanding these licensing terms is important for commercial applications. The allowance for broad commercial use up to a very high user count makes Cogito v1 attractive for startups and established businesses aiming to integrate advanced AI without immediate licensing costs. This approach promotes wider adoption and experimentation within the industry.
Future Scale and Open Infrastructure
Deep Cogito is not stopping there. They have signaled plans to release even larger models, potentially reaching up to a massive 671 billion parameters, in the coming months. This reflects a serious commitment to competing at the highest levels of AI performance.
These future models, particularly the planned Mixture-of-Experts (MoE) variants, represent the next step in scaling AI. MoE architectures allow models to become much larger and more capable by activating only relevant parts of the network for a given task, making them more efficient than traditional dense models of equivalent size. Deep Cogito’s plan to open source these powerful MoE models could significantly influence the high-end AI landscape.
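The efficiency argument for MoE can be seen in a toy routing sketch: a gate scores every expert, but only the top-k actually run, so compute per input scales with k rather than with the total expert count. Everything below is illustrative; real MoE layers route per token through learned gating networks.

```python
# Toy sketch of Mixture-of-Experts routing: a gate scores each expert
# and only the top-k experts execute, so compute grows with k, not
# with the total number of experts. Numbers here are illustrative only.
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_scores, top_k=2):
    """Run only the top_k highest-scoring experts and mix their outputs."""
    weights = softmax(gate_scores)
    ranked = sorted(range(len(experts)), key=lambda i: weights[i], reverse=True)
    active = ranked[:top_k]                      # the only experts we pay for
    norm = sum(weights[i] for i in active)       # renormalize over active set
    return sum(weights[i] / norm * experts[i](x) for i in active)

# Four "experts", each a simple scalar function; only two run per input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x ** 2, lambda x: -x]
out = moe_forward(3.0, experts, gate_scores=[0.1, 2.0, 0.5, -1.0], top_k=2)
```

With top_k=2, a 4-expert layer does half the work of a dense layer of the same total size; at 671B-parameter scale that gap is what makes MoE attractive.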
IDA: The Training Method Behind Cogito’s Strength
How did Deep Cogito achieve such strong performance right out of the gate? Their approach hinges on a training methodology they call Iterated Distillation and Amplification (IDA). This is positioned as an alternative to more common methods like Reinforcement Learning from Human Feedback (RLHF) or simple teacher-student model distillation.
The core idea behind IDA is quite interesting. Instead of relying heavily on human feedback or a static “teacher” model, IDA involves giving the AI model itself more computational power to generate better solutions to problems. Then, the *process* of reaching that improved solution is distilled back into the model’s own parameters.
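Deep Cogito describes IDA only at a high level, but the amplify-then-distill loop can be caricatured in a few lines: spend extra compute to refine the model's fast answer (amplification), then pull the model's parameters toward that refined answer (distillation), and repeat. The toy below is an assumption-laden illustration of that loop, not Deep Cogito's actual algorithm.

```python
# Toy sketch of Iterated Distillation and Amplification (IDA).
# "Amplification" spends extra compute to refine the model's raw answer;
# "distillation" folds the improved answer back into the model.
# This is an illustrative caricature, not Deep Cogito's actual method.

def amplify(fast_answer: float, target: float, steps: int = 5) -> float:
    """Extra compute: iteratively refine the answer toward the target
    (standing in for search, self-reflection, or longer reasoning)."""
    answer = fast_answer
    for _ in range(steps):
        answer += 0.5 * (target - answer)  # each step halves the error
    return answer

def distill(theta: float, amplified: float, lr: float = 0.3) -> float:
    """Nudge the model's parameter toward the amplified answer."""
    return theta + lr * (amplified - theta)

theta = 0.0          # the model's one "parameter": its fast answer
target = 10.0        # ground truth a verifier could check against
for _ in range(20):  # iterate: amplify, then distill
    theta = distill(theta, amplify(theta, target))
```

The point of the loop is that each round's distilled model starts closer to the amplified answer, so the fast policy keeps improving without any fixed teacher.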
Performance Breakdown: Cogito v1 Benchmarks
Claims of strength need proof. Deep Cogito released extensive benchmark results comparing their Cogito v1 models against well-known open source competitors. These tests cover general knowledge (like MMLU), reasoning (like ARC and HellaSwag), math (MATH, GSM8K), and multilingual capabilities (MGSM).
A key feature highlighted is the models’ hybrid reasoning capability. They can operate in a “Standard” mode for quick answers or a “Reasoning” mode. This reasoning mode uses more computation, similar to self-reflection techniques seen in other advanced models, allowing for deeper problem-solving and often boosting performance significantly.
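In practice, hybrid models of this kind usually switch modes through the prompt rather than separate weights. The sketch below assumes a system-prompt trigger; the exact phrase is model-specific and the string here is a placeholder, so consult the Cogito model cards for the real toggle.

```python
# Sketch of toggling a hybrid model between its two modes via the
# system prompt. The trigger phrase below is a placeholder -- the
# model's own documentation defines the real one.
REASONING_SYSTEM_PROMPT = "Enable deep thinking subroutine."  # placeholder

def build_messages(question: str, reasoning: bool = False) -> list:
    """Return a chat message list in Standard or Reasoning mode."""
    messages = []
    if reasoning:
        # Reasoning mode: prepend the system prompt so the model spends
        # extra tokens reflecting before it answers.
        messages.append({"role": "system", "content": REASONING_SYSTEM_PROMPT})
    messages.append({"role": "user", "content": question})
    return messages

fast = build_messages("What is 17 * 23?")
deep = build_messages("What is 17 * 23?", reasoning=True)
```

The same question thus costs more compute in Reasoning mode, which is the trade-off behind the benchmark gains described above.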
These benchmarks suggest the Cogito models, especially the larger ones, are highly competitive. They often exceed the performance of similar-sized open source counterparts in general knowledge and reasoning. The IDA training appears effective, particularly when the “Reasoning” mode is activated.
However, there is a noticeable trade-off in mathematical reasoning compared to models like DeepSeek R1, which seem more specialized for that task. This highlights that different models may excel in different areas, depending on their architecture and training focus. Understanding these nuances helps in selecting the right model for a specific application.
Security and Ethical Considerations
Increased capability brings increased responsibility. The growing power of these models raises questions about safety and potential misuse. Securing these complex systems is critically important.
Companies and researchers are actively working on ways to make AI models more robust against attacks or unintended behavior. This involves rigorous testing and developing methods to better understand and control model outputs. Red teaming, where experts try to find vulnerabilities by intentionally attempting to misuse the model, is becoming standard practice.
Governments are also taking action. Efforts like recent executive orders aim to regulate AI models, addressing national security concerns tied to the most powerful AI systems. These often require safety testing and reporting for frontier models.
Enhanced Tool Calling Capabilities
Deep Cogito also emphasized native tool-calling performance. This refers to the AI’s ability to understand when it needs external tools (like calculators, search engines, or APIs) to answer a question or complete a task. It also involves correctly formatting the request for that tool.
This capability is vital for building effective AI agents and assistants that can interact with the digital world. Cogito models reportedly show strong performance here. For instance, Cogito 3B supports four types of tool calls (simple, parallel, multiple, parallel-multiple) with high accuracy (91-92% in tests), whereas the base Llama 3.2 3B model lacks native tool calling support.
Similarly, Cogito 8B significantly outperforms Llama 3.1 8B on tool-calling tasks, scoring over 89% across types compared to Llama 3.1’s 35-54%. Deep Cogito attributes this advantage partly to the base model improvements from IDA and also to specific post-training focused on tool-use tasks. This practical capability could make Cogito models very attractive for developers building interactive applications and autonomous agents.
Robust tool calling allows AI to go beyond simple text generation. It enables actions like booking appointments, retrieving real-time information, or controlling other software. As these models improve, they become more useful components in complex automated workflows.
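The application-side half of tool calling is a small dispatch loop: the model emits a structured call, the application runs the matching function, and the result is handed back. The calculator tool and call format below are illustrative; a real deployment follows the model's own chat template for tool calls.

```python
# Sketch of the application side of tool calling: the model emits a
# structured call, the application executes the named tool, and the
# result goes back to the model. The calculator tool and call format
# here are illustrative, not Cogito's actual template.
import json

def calculator(expression: str) -> str:
    """A trivially simple 'tool' the model can request."""
    # eval with empty builtins is fine for this illustration;
    # never eval untrusted input in real code.
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

def dispatch(tool_call_json: str) -> str:
    """Parse a model-emitted tool call and run the named tool."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Pretend the model produced this structured call for "What is 12 * 7?":
model_output = '{"name": "calculator", "arguments": {"expression": "12 * 7"}}'
result = dispatch(model_output)
```

The benchmark categories mentioned above (simple, parallel, multiple, parallel-multiple) measure how reliably a model produces well-formed calls like `model_output` across one or several tools at once.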
Looking Ahead: The Trajectory for Deep Cogito
Deep Cogito’s debut is strong, but they stress it is just the beginning. Their roadmap includes releasing much larger Mixture-of-Experts (MoE) models at 109B, 400B, and 671B parameter sizes. They also plan to continually update the existing models with more training.
The team positions IDA not just as a way to train current models but as a scalable path promoting AI self-improvement. The goal is to reduce reliance on static human feedback or fixed teacher models, potentially leading to a faster capability growth curve. This aligns with their long-term vision of pursuing superintelligence through open research.
Arora stressed that while benchmarks are useful indicators, the true measure is real-world utility and adaptability. The partnerships with platforms like Hugging Face, Fireworks AI, Together AI, Ollama, and RunPod are crucial. These collaborations get the models into the hands of developers and researchers who will explore those real-world applications.
Conclusion
For students, researchers, and businesses, this means more potent tools are becoming available to learn from, build with, and innovate upon. While challenges remain, particularly around specialized tasks like advanced math and the constant need for safety and security, the direction is clear. The landscape of open source AI models is rapidly becoming more competitive and capable, broadening access to advanced artificial intelligence.
This continued progress pushes the boundaries of what is possible, offering more choices and flexibility for everyone involved in the AI space. Following developments like those from Deep Cogito provides valuable insight into the future direction of these transformative open source AI models.