Google is building on the success of its Gemini launch by releasing a new family of lightweight AI models called Gemma. Gemma models are open and designed to be used by researchers and developers to safely innovate with AI.
“We believe the responsible release of LLMs is critical for improving the safety of frontier models, for ensuring equitable access to this breakthrough technology, for enabling rigorous evaluation and analysis of current techniques, and for enabling the development of the next wave of innovations,” the researchers behind Gemma wrote in a technical report.
Along with Gemma, Google is also releasing a new Responsible Generative AI Toolkit, which includes safety classification and debugging capabilities, as well as Google’s best practices for developing large language models.
Gemma comes in two model sizes: 2B and 7B. They share many of the same technical and infrastructure components as Gemini, which Google says enables Gemma models to “achieve best-in-class performance for their size compared to other open models.”
Gemma also provides integrations with JAX, TensorFlow, and PyTorch, allowing developers to switch between frameworks as needed.
Models can run on different types of devices, including laptops, desktops, IoT, mobile, and the cloud. Google has also partnered with NVIDIA to optimize Gemma for use on NVIDIA GPUs.
It is also optimized for use on Google Cloud, enabling benefits such as one-click deployment and built-in inference optimizations. It is available through Google Cloud’s Vertex AI Model Garden, which now contains over 130 AI models, and through Google Kubernetes Engine (GKE).
According to Google Cloud, through Vertex AI, Gemma could be used to support real-time generative AI tasks that require low latency, or to build applications that handle lightweight AI tasks such as text generation, summarization, and Q&A.
“With Vertex AI, builders can reduce operational overhead and focus on creating bespoke versions of Gemma that are optimized for their use case,” Burak Gokturk, VP and GM of Cloud AI at Google Cloud, wrote in a blog post.
On GKE, potential use cases include deploying custom models in containers alongside applications, customizing model serving and infrastructure configuration without the need to provision nodes, and integrating AI infrastructure quickly and in a scalable manner.
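As a rough illustration of the container-based pattern described above, serving a Gemma model on GKE might look something like the following sketch. This is a hypothetical manifest, not an official Google example: the deployment name, container image path, port, and GPU resource request are all placeholders a real setup would replace.

```yaml
# Hypothetical GKE Deployment running a Gemma inference server in a container.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gemma-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gemma-server
  template:
    metadata:
      labels:
        app: gemma-server
    spec:
      containers:
        - name: inference
          # Placeholder image; substitute your own serving image and project.
          image: us-docker.pkg.dev/YOUR_PROJECT/gemma/serve:latest
          ports:
            - containerPort: 8080
          resources:
            limits:
              nvidia.com/gpu: 1  # schedules the pod onto a GPU node pool
```

A matching Service or Ingress (not shown) would expose the container port to applications running alongside the model.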
Gemma was designed to align with Google’s AI Principles. To that end, Google used automated filtering techniques to remove personal data from training sets, reinforcement learning from human feedback (RLHF) to align the models with responsible behaviors, and manual evaluations that included red teaming, adversarial testing, and assessments of model capabilities for potentially harmful outcomes.
Because the models are designed to promote AI research, Google is offering free credits to developers and researchers who want to use Gemma. It can be accessed for free via Kaggle or Colab, and first-time Google Cloud users receive a $300 credit. Researchers can also apply for up to $500,000 in credits for their projects.
“Beyond state-of-the-art performance on benchmarks, we’re excited to see what new use cases emerge from the community and what new capabilities emerge as we advance the field together. We hope that researchers use Gemma to accelerate a broad array of research, and we hope that developers create beneficial new applications, user experiences, and other functionality,” the researchers wrote.