Open source AI: Red Hat’s point-of-view

By Chris Wright, Chief Technology Officer and Senior Vice President of Global Engineering at Red Hat

 

More than three decades ago, Red Hat saw the potential of open source development and licensing to create better software and fuel IT innovation. Thirty million lines of code later, Linux grew to become not only the most successful open source software, but the most successful software to date. Our commitment to open source principles continues today, not only in our business model but in our corporate culture as well. We believe these concepts can have the same impact on artificial intelligence (AI) if done the right way, and yet there is a distinct divide within the technology world as to what “the right way” is.

AI, especially the large language models (LLMs) driving generative AI (GenAI), cannot be viewed in quite the same way as open source software. Unlike software, AI models principally consist of model weights, which are numerical parameters that determine how a model processes inputs, as well as the connections it makes between various data points. Trained model weights are the result of an extensive training process involving vast quantities of training data that are carefully prepared, mixed and processed.
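To make the point concrete, here is a deliberately tiny, hypothetical sketch of this idea: a "model" whose entire behavior is captured by stored numerical weights, with inference reduced to arithmetic over them. Real LLMs work the same way in principle, just with billions of parameters; none of the names below refer to any real model.

```python
import numpy as np

# A toy "model": its behavior is fully determined by numerical weights.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 2))  # maps 4 input features to 2 outputs
bias = np.zeros(2)

def model(x: np.ndarray) -> np.ndarray:
    """Inference is just arithmetic over the stored weights."""
    return x @ weights + bias

# Distributing this model means distributing `weights` and `bias` --
# numbers, not source code.
print(model(np.ones(4)).shape)  # (2,)
```

Sharing such a model means sharing its parameter arrays, which is why the licensing question for AI centers on the weights rather than on a conventional codebase.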

While model weights are not software, in some respects they serve a similar function to code. It’s easy to draw the comparison that data is, or is analogous to, the source code of the model. In open source, the source code is commonly defined as the “preferred form” for making modifications to the software. Training data alone does not fit this role, given its typically vast size and the complicated pre-training process, which leaves any one item of training data with only a tenuous, indirect connection to the trained weights and the resulting behavior of the model.

The majority of improvements and enhancements to AI models now taking place in the community do not involve access to or manipulation of the original training data. Rather, they are the result of modifications to model weights or of a fine-tuning process, which can also serve to adjust model performance. Freedom to make those model improvements requires that the weights be released with all the permissions users receive under open source licenses.
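The claim above can be illustrated with a minimal, hypothetical stand-in: "fine-tuning" below is plain gradient descent on a toy least-squares model, not a real LLM workflow. The point it demonstrates is the one in the text: the published weights and a small new dataset are enough to adapt the model; the original training data never appears.

```python
import numpy as np

rng = np.random.default_rng(1)

# Weights as released by the model's creator; their training data is unknown.
pretrained_w = rng.normal(size=3)

def loss(w, x, y):
    """Mean squared error of a linear model -- a toy proxy for model quality."""
    return float(np.mean((x @ w - y) ** 2))

# A small, domain-specific dataset supplied by the fine-tuner.
x_new = rng.normal(size=(16, 3))
y_new = x_new @ np.array([1.0, -2.0, 0.5])  # desired new behavior

# "Fine-tuning": adjust the released weights with gradient descent.
w = pretrained_w.copy()
lr = 0.05
for _ in range(200):
    grad = 2 * x_new.T @ (x_new @ w - y_new) / len(x_new)
    w -= lr * grad

# The adapted weights fit the new task better than the released ones.
print(loss(w, x_new, y_new) < loss(pretrained_w, x_new, y_new))  # True
```

Only the weights and the new data were needed, which is why permissively licensed weights are the practical prerequisite for this kind of community improvement.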

Red Hat’s view of open source AI

Red Hat sees the minimum threshold for open source AI as open source-licensed model weights combined with open source software components. This is a starting point of open source AI, not the final destination. We encourage the open source community, regulatory authorities and industry to continue to strive toward greater transparency and alignment with open source development principles when training and fine-tuning AI models.

This is Red Hat’s view of how we, as an open source software ecosystem, can practically engage in open source AI. It’s not an attempt at a formal definition, like that being undertaken by the Open Source Initiative (OSI) with its Open Source AI Definition (OSAID). Our viewpoint to date is simply our take on what makes open source AI achievable and accessible to the broadest set of communities, organizations, and vendors.

We put this viewpoint into action through our work in open source communities, highlighted by the Red Hat-led InstructLab project and our work with IBM Research on the Granite family of open source-licensed models. InstructLab significantly lowers the barrier to AI model contributions from non-data scientists. With it, domain experts from across industries can contribute their skills and knowledge, both for internal use and to help drive a shared, broadly accessible open source AI model for upstream communities.

The Granite 3.0 model family addresses a wide range of AI use cases, from code generation to natural language processing to extracting insights from vast datasets, all under a permissive open source license. We helped IBM Research bring a family of Granite code models to the open source world, and continue to support the model family, both from an open source point of view and as part of our Red Hat AI offering.

The ripples caused by the recent announcements from DeepSeek show how open source innovation can impact AI, both at the model level and beyond. There are obviously concerns around DeepSeek’s approach, namely that the model’s license doesn’t clarify how it was produced, which further reinforces the need for transparency. That said, this disruption affirms our view of AI’s future: an open one, centered on smaller, optimized and open models that can be customized for specific enterprise data use cases anywhere and everywhere across the hybrid cloud.

Expanding open source AI beyond models

Open source technology and development principles are at the core of our AI offerings, just as they are for the rest of our portfolio. Red Hat OpenShift AI builds on a foundation of Kubernetes, Kubeflow and Open Container Initiative (OCI)-compliant containers, along with a host of other cloud-native open source technologies. Red Hat Enterprise Linux AI (RHEL AI) incorporates the open source-licensed Granite LLM family from IBM and the InstructLab open source project.

Red Hat’s work in the open source AI space expands far beyond InstructLab and the Granite model family to the tools and platforms needed to actually consume and productively use AI. We’re active in an ever-expanding number of upstream projects and communities, and have initiated many more on our own, including (but not limited to):

  • RamaLama – an open source project that aims to make local management and serving of AI models far less painful and complex;
  • TrustyAI – an open source toolkit for building more responsible AI workflows;
  • Climatik – a project centered on helping make AI more sustainable when it comes to energy consumption;
  • Podman AI Lab – a developer toolkit focused on facilitating experimentation with open source LLMs.

Our recent announcement around Neural Magic furthers our AI vision, making it possible for organizations to align smaller and optimized AI models, including open source-licensed models, with their data, wherever it lives across the hybrid cloud. IT organizations can then use the vLLM inference server to power the decisions and output of these models, helping to build an AI stack founded on transparent and supported technologies.

To Red Hat, open source AI lives and breathes on the hybrid cloud. Hybrid cloud provides much-needed flexibility to choose the best environment for each AI workload, optimizing for performance, cost, scaling and security requirements. Our platforms, goals and organization support this effort, and we look forward to collaborating with industry partners, our customers and the broader open source community as we continue to push open source AI innovation forward.

There’s immense potential to broaden this open collaboration in the AI space. We see a future that encompasses transparent work on models, as well as their training. Whether it’s next week, next month or even sooner (given how quickly AI evolves), we will continue to endorse and embrace efforts that push the boundaries of what it means to democratize and open up the world of AI.