brid's blog

First Impressions on Deep Learning in 2022

Over the past few years, I've been willfully ignorant of the deep learning world. There are a few reasons for this, but the main one is a hype cycle of epic proportions. The worst hype cycles in the tech world tend to occur when a technology is applied to a specific domain and provides some real value, which is then followed by a huge wave of attempts to apply that technology to many other domains, sometimes without considering whether it actually solves a problem or is economically viable. See: the dot-com bubble, blockchain, and, in my opinion, machine learning.

But a combination of factors has really piqued my interest in deep learning this year. Namely, I've started working at a consulting firm with a keen interest in this area, but even more influential was the release of the DALL-E model from OpenAI. This model generates images from captions describing the image. For example, you can write "A sloth playing a guitar, photograph 35mm lens" and get this!

I have been impressed by ML applications in the past, but this one really blew me away. I think DALL-E impressed me so much because the way it generates images suggests it has some level of understanding of how objects interact in a scene. For example, in scenes with water it will often draw reflections on the water. Sometimes it makes creative decisions that make sense even though the caption doesn't explicitly ask for them (for example, this one). This suggests, to me at least, a deeper understanding of the data, one that seems a little closer to human understanding, than previous models like GPT-3.

So with my mind thoroughly inspired, I decided to dive in. But ML is such a huge field now; where does one start? I remembered a podcast I heard a while back with Jeremy Howard of fast.ai. I don't have a theoretical computer science background, and I prefer to learn with a problem in mind. Jeremy talked about fast.ai's approach to teaching deep learning: first teach the practical skills that get you excited about solving problems with ML, then strip it back to the foundations. This really appealed to me, especially since I'm already comfortable in Python, the main language of the course. So I headed over to fast.ai and jumped in. After completing the first section of the book, I thought I'd share my experience.

First Impressions on Deep Learning and fast.ai

  1. Fast.ai is great. It really is fast. All the lessons are delivered via Jupyter notebooks, which make reading and running the code a breeze. I used Paperspace to run my notebooks with a GPU, and I would highly recommend them.

  2. I had some big misunderstandings about deep learning. Firstly, I thought you needed big data to perform any kind of valuable ML. This is not true. Pre-trained models that are trained on huge datasets, then fine-tuned to your specific use-case, allow you to produce useful models with only a few hundred points of labelled data. That is awesome!

  3. It seems the biggest problem in practical ML is producing labelled data, or producing/finding pre-trained models. That is kind of exciting though, as it seems this would be an area rife with opportunities.

  4. It's a pretty exciting time to get into machine learning. It still feels nascent, with frameworks and tooling kind of all over the place. It feels like the abstractions we've come to appreciate in other parts of computing are not quite there yet with ML. For example, AMD cards weren't even supported by PyTorch and TensorFlow until the end of 2019. It really is in its infancy.

That's all for now! I'm going to continue the book and write more impressions as I go.