Learning Deep Generative Modelling from Stanford

I am a big fan of self-directed learning, but I had never before spent almost two thousand dollars on a single course. Was it worth it? Who would benefit most from it?

I took the Deep Generative Modeling course from Stanford Online for 1750 dollars.

It was my first course of this kind, and before committing to it I had some questions and doubts, as it was rather expensive. I hope sharing my experience will help people in a similar situation.

You can benefit from this post if you:

are interested in learning about AI/ML or specifically generative modelling and you are wondering if XCS236 is a good course to take.
wonder whether it’s worth spending almost 2k USD on a course, versus 50 USD or nothing on other platforms or DIY approaches.
want to know how to take the best advantage of such a course if you were to invest in it.

My background

Coming into the course, I had been working in tech for 10+ years in software engineering positions and had a degree in Applied Physics. This meant that I was not afraid of math (calculus especially) and was very comfortable with coding. My math base was somewhat rusty, but I compensated by taking some Coursera courses to refresh the topics I needed more confidence in.

I also had some knowledge of and experience with ML/AI - I had taken many courses over the years, even though my job experience with it was very limited. I had also done some side projects that required using models.

My knowledge of generative modeling was mostly related to TTS models and the content of Dive into Deep Learning and the Deep Learning lecture series by DeepMind and UCL.

Course content and format

The course was focused on fundamental math and techniques in Generative Modeling:

Autoregressive models
Variational autoencoders
Normalizing flows
GANs
Energy based models
Score based models
Diffusion models

You can see the full syllabus here.

The course gave me a deep understanding of probabilistic modeling across all the approaches and taught me a bunch of math tricks. The assignments helped me to better understand the math, as well as to actually implement all the approaches on toy datasets.

The concepts were explained in a way that built on each other and gradually expanded the student’s arsenal of techniques.

The assignments were rather math heavy, but they really helped me understand the nuances of the theory material, which I later appreciated. The coding parts were well done and focused on the core of the algorithms and math. They provided good scaffolding for the bits that were essential to make things work but were not specific to the content of the course. The code was all in PyTorch, which is commonly used in the industry. The assignments focused on autoregressive models (including transformers), variational autoencoders and GANs. There weren’t any assignments related to diffusion modeling, which was disappointing.

The course material was up to date on the latest diffusion-based approaches. The main professor of the course, Stefano Ermon, is one of the top Generative Modelling researchers (H-index 84 as of 2024) with a specialty in diffusion modeling.

The course was not focused on transformers, LLMs, or language processing in general. If this is what you want to learn about, this is not the right course.

Platform and format

The course was a mixture of pre-recorded lectures, broken into chunks on specific topics, and a set of assignments that contained coding and written parts. It’s 10 weeks long, but as it’s self-paced it could be completed earlier.

Lectures were hosted on the Stanford Online platform (they are also available on YouTube, though not in the same chunked form). The assignments were in private repositories on GitHub, but you would submit the solutions on the Stanford Online platform.

There were 3 assignments and no graded projects. The assignments had pretty generous deadlines.

I believe that about 400 people participated in my course cohort. There was a Slack community for students, and each student also had a course facilitator. Students were encouraged to form groups on their own. There were regular live Zoom meetings for office hours with the professor/TAs, as well as show and tell sessions and guest speakers. Course facilitators were available for 1:1 video chats or chats on Slack.

What I enjoyed in the course

Slack community for students and TAs/Course Facilitators
Study groups - where I could chat with other students and meet with them virtually
Additional resources for students
World class faculty - Stefano Ermon is one of the lead diffusion researchers!
Homeworks were difficult enough to motivate a lot of learning - lots of math proofs
1:1 time with course facilitators
Deep and well organized understanding of the models aided by theoretical homeworks and practical exercises
Well scaled down problems - no need for GPUs
Lectures also available on YouTube (convenient for watching while doing house chores :D)
Alumni Slack and private LinkedIn group - this was a nice surprise at the end of the course. Based on the introductions on Slack, there were very many cool professionals from all sorts of companies attending the course, so this is a goldmine of high quality contacts.

Money on the line helps with motivation

I start a lot of courses and only finish some of them. It takes quite a lot of persistence and motivation to stick with them.

The fact that I spent 1750 dollars on the course made me really want to complete it. Just imagining that I could fail it and waste all this money would make my stomach churn. So I really put in the work. I prioritized time for the assignments, I started early and I studied additional resources. It motivated me to revise my math skills and read additional papers.

The level of math in the first assignment was quite a shock to many of the students (including me!), but as I persisted I got into the groove of writing math proofs. My comfort zone definitely expanded.

That said, there was a window of time in which a student could drop the course and get a refund, or transfer to a different course, if the content didn’t match their expectations.

You get what you put into it

As the course was hard, just completing it provided a lot of value, because it required understanding of the topics. I completed the course with more than 100% of the points (thanks to assignments for extra credit!), but I still think that I could have taken more advantage of it!

What did I do right?

Some of the things I did right:

Learning from additional resources to fill in my gaps and enhance my understanding
Using Spaced Repetition (Anki) to better retain the material - since I put in all this effort, it would be a waste if I promptly forgot it
Choosing to do (most of) the extra credit assignments
Meeting fellow students - I was an instigator of several discussions and video calls, and people were pretty interesting
Contacting the course facilitator when stuck

What should I have done differently?

I would have benefitted more if I had done the following:

Picking a project before the course even started and partnering with other students to get it done - this was the part that was different from the live Stanford course - I think it would have provided a great excuse for intense learning and collaboration, and for taking better advantage of course resources (e.g. questions to professors and TAs)
Taking more advantage of the course facilitators - e.g. asking them to double check my code or my reasoning on a proof. It would have saved me from wasting time.
Collaborating more with other students and taking advantage of networking
Taking more advantage of the live Zoom sessions - I didn’t attend most of them and only rewatched some of the recordings.

That said, sometimes life gets in the way and it’s hard to maximize every opportunity, so I won’t be too hard on myself about it.

How to do a DIY alternative

If you are a sufficiently motivated and well organized individual with lots of time on your hands, you could recreate most of the value of the course by doing the following:

Watch the YouTube videos and read the lecture slides (syllabus)
Read the associated materials
Implement generative modeling using major model approaches on toy datasets such as:
- MNIST
- SVHN
- Fashion MNIST
Play with state of the art models from Hugging Face:
- Transformers
- Diffusion models
Devise and work on a project related to the topic
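
To give a flavour of the "implement it on a toy dataset" step, here is a minimal sketch of an autoregressive model on toy binary sequences. It is my own illustration, not code from the course: the helper names (make_sequence, fit_autoregressive, log_likelihood, sample) and the toy data are made up, and it uses plain Python counting instead of PyTorch so that it runs anywhere. Each bit is conditioned only on the previous bit, which is the simplest case of the chain rule factorization that autoregressive models rely on.

```python
import math
import random


def make_sequence(d, rng, flip=0.1):
    """Toy data (an assumption for this sketch): each bit copies the previous one, flipping with prob. `flip`."""
    x = [rng.randint(0, 1)]
    for _ in range(d - 1):
        x.append(1 - x[-1] if rng.random() < flip else x[-1])
    return x


def fit_autoregressive(data, alpha=1.0):
    """Fit p(x) = p(x_0) * prod_i p(x_i | x_{i-1}) by counting, with Laplace smoothing `alpha`.

    Returns (p_first, p_next), where p_first = P(x_0 = 1) and
    p_next[i][prev] = P(x_i = 1 | x_{i-1} = prev) for i >= 1.
    """
    n, d = len(data), len(data[0])
    p_first = (sum(x[0] for x in data) + alpha) / (n + 2 * alpha)
    p_next = [None]  # index 0 is unused: the first bit has no predecessor
    for i in range(1, d):
        row = []
        for prev in (0, 1):
            total = sum(1 for x in data if x[i - 1] == prev)
            ones = sum(1 for x in data if x[i - 1] == prev and x[i] == 1)
            row.append((ones + alpha) / (total + 2 * alpha))
        p_next.append(row)
    return p_first, p_next


def log_likelihood(model, x):
    """Sum the log of each conditional probability (the chain rule in log space)."""
    p_first, p_next = model
    ll = math.log(p_first if x[0] == 1 else 1 - p_first)
    for i in range(1, len(x)):
        p_one = p_next[i][x[i - 1]]
        ll += math.log(p_one if x[i] == 1 else 1 - p_one)
    return ll


def sample(model, d, rng):
    """Generate a new sequence one bit at a time, each conditioned on the previous bit."""
    p_first, p_next = model
    x = [1 if rng.random() < p_first else 0]
    for i in range(1, d):
        x.append(1 if rng.random() < p_next[i][x[-1]] else 0)
    return x


rng = random.Random(0)
data = [make_sequence(8, rng) for _ in range(500)]
model = fit_autoregressive(data)
new_x = sample(model, 8, rng)
print(new_x, log_likelihood(model, new_x))
```

The same three pieces (fit, evaluate the likelihood, sample) show up again when you replace the count tables with a neural network and the toy bits with MNIST pixels.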

But what about the help of facilitators and the community?

There are ways to get around that too, for example:

Use Discord, e.g. ML/AI communities like the one around Hugging Face - to meet real people and get help.
Use your network to get skilled people to help you. You might actually know people who know a lot about AI/ML. People like to help smart and motivated individuals with the topics they are passionate about. If you don’t know such people, go to some local meetups.

Conclusion

I found the course very valuable for me, but it’s not a course for everyone and you can also learn a lot on your own.

I was interested in the content and happy to put in the time. Additionally, my work sponsored two thirds of the cost of the course. The high cost motivated me to prioritize it, and a Stanford credential was an added bonus.

But I would not immediately generalize the worth of that specific course for me to all such courses, or even to all the Stanford ones.

The factors I would look for to determine if a course is worth it for me:

unique material that can’t be easily learned on other platforms
access to world class faculty
access to the alumni community/fellow students
topics relevant to my interests/future plans and my own availability to take advantage of the course

I hope that helps! Stay curious!