Compute-Optimal

About

I'm Sam, currently an MS student studying Natural Language Processing (NLP) at UC Santa Cruz. Before that, I was a backend software engineer working in regenerative agriculture and electric micromobility at Regrow, Indigo, and Bird.

As I've been learning over the last year, I've kept the habit of building an Obsidian knowledge base to organize concepts, but I never got around to crystallizing my thoughts into formal blog posts. Now that I'm at UCSC, I wanted this site to be somewhere I could point classmates to for more polished explanations of topics we cover in class. I want the site to be both a consistently improving resource on evergreen topics, for myself and others, and a place to share interesting, high-quality content as I come across it. Lastly, I hope that it serves as evidence of passion when interviewing for internships or full-time roles.

Why did I pick this name for a blog? In neural scaling laws, compute-optimality refers to the optimal allocation of a fixed training compute budget (FLOPs) between model size and dataset size. If your language model is too small relative to your compute budget, you're leaving performance on the table; if it's too large, you won't have enough compute to train it properly. If the dataset is too small, you'll overfit, and if it's too large, you waste compute on unnecessary training examples. For a given compute budget, there's a narrow band of model and dataset sizes that maximizes model performance.

I think there's an analogy to be drawn between compute-optimality and the pursuit of self-education: just as training runs often have a fixed compute budget, we have finite time and mental energy for learning. And like compute-optimal training, effective learning requires balancing multiple factors: What should we prioritize learning first? How much time do we spend building intuition? How deeply should we learn a concept? How much can we reasonably have on our plate while still making forward progress? How do we retain what we've learned? It's tricky, because the answers to these questions are a personal and ever-moving target as our knowledge grows and interests shift. I hope that creating this blog will help me on my journey to an optimal learning outcome, and that it can be a useful resource for others on theirs.

I've been helped along my journey so far by the work of too many people to name, but I'd like to highlight a few individuals who've created educational content that I've found particularly helpful. I hope that my own writing can one day produce even a fraction of the value for others that these folks have provided to me: