This week, Groq, a pioneer in AI and ML systems, announced that it has adapted LLaMA, Meta's new large language model (LLM) chatbot technology and a suggested alternative to ChatGPT, to run on its platforms.
On February 24th, Facebook's parent company Meta launched LLaMA, a model that chatbots can use to produce human-sounding text. The Groq team downloaded the model three days later and, within a few days, had it running on eight GroqChip inference processors in a commercial GroqNode™ server. This is the kind of bring-up that often takes a much larger engineering team weeks or months to finish; Groq completed it with only a small group from its compiler team.
Jonathan Ross, CEO and founder of Groq, said, “This speed of development at Groq validates that our generalizable compiler and software-defined hardware approach is keeping up with the accelerating pace of LLM innovation, something traditional kernel-based approaches struggle with.”
While Meta researchers initially built LLaMA for NVIDIA™ processors, Groq's rapid LLaMA bring-up is a significant milestone: Groq engineers ran a cutting-edge model on their own hardware, demonstrating GroqChip as a ready-to-use replacement for existing technologies. As generative AI carves out a niche in the market and transformers accelerate the pace of technology development, customers will require solutions that deliver real time-to-production advantages and lower developer complexity for fast iteration.
Bill Xing, Tech Lead Manager, ML Compiler at Groq, said, “The complexity of computing platforms is permeating into user code and slowing down innovation. Groq is reversing this trend. Since we’re working on models that were trained on Nvidia GPUs, the first step of porting customer workloads to Groq is removing non-portable code targeted at specific vendors and architectures. This might include replacing vendor-specific code calling kernels, removing manual parallelism or memory semantics, etc. The resulting code ends up looking a lot simpler and more elegant. Imagine not having to do all that ‘performance engineering’ in the first place to achieve stellar performance! This also helps by not locking a business down to a specific vendor.”
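To make the cleanup Xing describes concrete, here is a minimal, hypothetical sketch. The function names and the "vendor" dispatch branch below are illustrative only (they are not Groq's actual porting workflow or any real vendor library); the point is that stripping out manual device placement and kernel dispatch leaves plain, portable math for a compiler to optimize.

```python
import numpy as np

# Hypothetical "before": model code cluttered with manual device
# placement and a vendor-specific kernel-dispatch branch.
def matmul_vendor_specific(a, b, device="vendor_gpu"):
    if device == "vendor_gpu":
        # Stand-in for a call like my_vendor_lib.fused_matmul_kernel(a, b);
        # here it just computes the same product.
        return np.matmul(a, b)
    return np.matmul(a, b)

# Hypothetical "after": the same computation with the vendor-specific
# scaffolding removed, leaving parallelism and memory placement to the
# compiler rather than the application code.
def matmul_portable(a, b):
    return np.matmul(a, b)

# Both versions compute the identical result.
a = np.arange(6, dtype=np.float64).reshape(2, 3)
b = np.arange(12, dtype=np.float64).reshape(3, 4)
assert np.array_equal(matmul_vendor_specific(a, b), matmul_portable(a, b))
```

The portable version is also shorter and has no dependency on a particular vendor's kernel library, which is the lock-in concern the quote raises.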