ServiceNow, Hugging Face and NVIDIA have teamed up to release a new family of open source LLMs called StarCoder2 designed for developers.
StarCoder2 was trained on 619 programming languages and is intended to give developers capabilities such as code generation, workflow generation, and text summarization, among others. The companies expect the StarCoder2 models to be useful to both software engineers and citizen developers.
It was developed by the BigCode community, a group dedicated to the responsible development of LLMs for code. The project is managed jointly by ServiceNow and Hugging Face.
StarCoder2 comes in three model sizes: ServiceNow trained a 3 billion parameter model, Hugging Face trained a 7 billion parameter model, and NVIDIA trained a 15 billion parameter model.
The smaller models are designed to deliver strong performance while using less computing power. According to the companies, the 3 billion parameter model matches the performance of the 15 billion parameter model from the original StarCoder release.
Users will be able to fine-tune these models to meet their own specific needs, using open source tools such as NVIDIA NeMo or Hugging Face TRL.
“StarCoder2 is a testament to the combined power of open scientific collaboration and responsible AI practices with an ethical data supply chain,” said Harm de Vries, ServiceNow’s StarCoder2 development team leader and co-leader of BigCode. “The state-of-the-art open access model improves on previous performance of generative AI to increase developer productivity and gives developers equal access to the benefits of AI for code generation, which in turn enables organizations of any size to more easily meet their full business potential.”
Leandro von Werra, machine learning engineer at Hugging Face and co-leader of BigCode, added: “The joint effort led by Hugging Face, ServiceNow and NVIDIA enables the release of powerful base models that empower the community to build a wide range of applications efficiently, with full transparency of data and training. StarCoder2 is a testament to the potential of open source and open science as we work to democratize responsible artificial intelligence.”