Ever heard of a machine learning model that gets smarter after it has already overfit? Welcome to the fascinating world of "grokking" in Large Language Models (LLMs)!
What's Grokking?
Imagine training a model way past the point where conventional wisdom says "stop!" Instead of becoming less accurate, these models sometimes have an "aha!" moment. They suddenly "get" (grok) the underlying pattern and begin generalizing to data they have never seen.
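To make that concrete, here is a minimal sketch of the kind of toy experiment where grokking is usually observed: a small network trained on modular addition, with training continued long after it has memorized its training set. The modulus, architecture, and hyperparameters below are illustrative assumptions, not the exact recipe from the original grokking experiments (which used small transformers on similar algorithmic tasks).

```python
# A minimal grokking-style sketch (illustrative assumptions throughout):
# train a small MLP on (a + b) mod P and keep training far beyond the
# point where it has memorized the training split, watching validation
# accuracy for a late, sudden jump.
import torch
import torch.nn as nn

P = 97  # modulus for the toy task
torch.manual_seed(0)

# Build every (a, b) pair; hold out most pairs for validation.
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
n_train = int(0.4 * len(pairs))
train_idx, val_idx = perm[:n_train], perm[n_train:]

class ToyNet(nn.Module):
    """Embed the two operands, concatenate, and classify the sum mod P."""
    def __init__(self, d=128):
        super().__init__()
        self.emb = nn.Embedding(P, d)
        self.mlp = nn.Sequential(nn.Linear(2 * d, 256), nn.ReLU(), nn.Linear(256, P))

    def forward(self, x):
        e = self.emb(x)                # (batch, 2, d)
        return self.mlp(e.flatten(1))  # (batch, P) logits

model = ToyNet()
# Weight decay matters: grokking is typically reported with regularization.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

def accuracy(idx):
    with torch.no_grad():
        return (model(pairs[idx]).argmax(-1) == labels[idx]).float().mean().item()

# Train well past the point where the training set is memorized.
for step in range(1, 30001):
    opt.zero_grad()
    loss = loss_fn(model(pairs[train_idx]), labels[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        print(f"step {step:6d}  train acc {accuracy(train_idx):.3f}  "
              f"val acc {accuracy(val_idx):.3f}")
```

The printed log is the whole point: training accuracy saturates early, and if grokking kicks in, validation accuracy climbs from near-chance to near-perfect many thousands of steps later.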