Roadmap to the AWS Machine Learning Certification
Certifications can be really useful for learning. When a certification is well designed, a knowledgeable organization is telling you what they consider to be important, and they are validating your learning.
There is some debate whether a certification will help you get a job. My belief is that certifications are really useful for building credibility with recruiters. Generally, recruiters do not have deep technical knowledge, and it can be difficult for them to know whether you are a good candidate. A certification can make them feel more comfortable that they won't be wasting their client's time if they send them your resume.
For machine learning, the main certifications are from the major cloud providers. Amazon, Google and Microsoft each offer a machine learning certification, and I went with Amazon, because that is what I have experience with.
If you look at the official exam guide you will see that the test is broken into 4 domains:
- Data Engineering 20%
- Exploratory Data Analysis 24%
- Modeling 36%
- MLOps 20%
What may be less obvious is that the questions in domains 2 & 3, Data Analysis and Modeling, are mostly not about AWS specifically. Domain 1 is primarily about Amazon's ETL tools and storage, and Domain 4 is about Amazon SageMaker, but the majority of domains 2 & 3 are about understanding data distributions, preparing data for modeling, training, and evaluating models. This means that to prepare for the exam you need to learn things beyond just how to use AWS.
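To see how much of the exam falls outside AWS-specific material, it helps to add up the weights. Here is a quick sketch in Python using the domain percentages listed above. The 65-question exam length is an assumption that does not come from this article, so treat the per-domain counts as rough estimates and check the current exam guide.

```python
# Domain weights from the exam guide, as listed above (percent of the exam).
DOMAIN_WEIGHTS = {
    "Data Engineering": 20,
    "Exploratory Data Analysis": 24,
    "Modeling": 36,
    "MLOps": 20,
}

# Domains whose questions are mostly not about AWS specifically.
GENERAL_ML_DOMAINS = ("Exploratory Data Analysis", "Modeling")


def general_ml_share(weights=DOMAIN_WEIGHTS, general=GENERAL_ML_DOMAINS):
    """Return the combined percentage of the mostly AWS-agnostic domains."""
    return sum(weights[name] for name in general)


def estimated_questions(total_questions, weights=DOMAIN_WEIGHTS):
    """Estimate questions per domain for a given (assumed) exam length."""
    return {
        name: round(total_questions * pct / 100)
        for name, pct in weights.items()
    }


if __name__ == "__main__":
    print(f"General ML share: {general_ml_share()}%")
    # ASSUMPTION: 65 questions in total; verify against the exam guide.
    for name, count in estimated_questions(65).items():
        print(f"{name}: about {count} questions")
```

Domains 2 and 3 together come to 60% of the exam, which is why a general machine learning course is part of this roadmap.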
What you really need to know for the exam is:
- General AWS Knowledge (including S3 for storage)
- Knowledge of Machine Learning
- AWS Data Engineering tools like EMR, Kinesis and Glue
- Amazon SageMaker
General AWS Knowledge
The best way to get the general AWS knowledge you need is to take Adrian Cantrill's Solutions Architect Associate course. His teaching is very thorough. Go through the course and you will have a solid understanding of how AWS works. The course is $40, and I believe it to be the best on the market.
After you finish the course, take the Solutions Architect exam, because:
- It is a good way to get experience with AWS tests, and
- When you pass the exam you will get a voucher for half off your next exam. So passing the $150 Architect exam gets you half off the $300 Machine Learning exam.
The only other resource you will need to prepare for the Solutions Architect exam is the practice questions from Tutorials Dojo. The practice questions are great, and the explanations of each of the answers are very clear. They also link back to the relevant AWS documentation for more in-depth explanations. The set of practice tests for the exam costs only $15.
Machine Learning Knowledge
The best way to get a strong foundation in Machine Learning is to take Andrew Ng's Machine Learning Specialization on Coursera. The course is on a subscription model. I don't remember if it is $49 or $59 per month. I believe you can watch the course videos for free, but I don't recommend that. The graded exercises will help you learn. If the price is a problem, I believe Coursera has a very generous financial aid program you should look into.
Data Engineering and MLOps
After you have completed Adrian Cantrill's and Andrew Ng's courses, you will need a course specific to the AWS Machine Learning exam. I strongly recommend that you take those two courses first, because the material for the Machine Learning Specialty will make more sense afterwards.
The course I took was this one from Frank Kane and Stephane Maarek on Udemy (don't pay full price for it; Udemy runs frequent sales). There are other courses on Udemy, and also courses on other platforms. They may be better or worse; I don't know. This is the one I took and it was sufficient.
The course covers each of the domains on the exam. At first, the material on S3 feels like a repetition of what you have already learned, but it is worth it. Virtually all of the data you use in SageMaker will be coming from S3, so it is worth the extra attention. Then it gets into Kinesis, Glue, and EMR in much more depth than you had on the Architect exam.
After that, you will cover data exploration and modeling topics that you have already been exposed to by Andrew Ng. This will help solidify your knowledge, and it will also teach you the AWS specifics. Then there is a really long section where Frank describes a number of the built-in algorithms in SageMaker. Watch it and understand it, but if you are the type to make flash cards (I am), don't make flash cards for the instance types and hyperparameters of each algorithm. Watch it all once, and then a few days before the test watch it again.
The final section of the course is on MLOps. This is the one part that you really haven't been exposed to before now. I was less confident in this area going into the test, but the material in the course and the material in the practice tests I took gave me enough preparation.
You should never pay full price for a course on Udemy. As I am writing this, the course is listed at $100, but they have frequent sales. Check back every few days, and within 2 weeks you should be able to buy this course for $15.99 or less.
As I mentioned for the Architect exam, the best practice tests are at Tutorials Dojo. This set is a little more expensive, at $17.99. I also picked up a couple of practice exams on Udemy, but I did not find them to be as good. If you really feel like you need more practice, though, Frank Kane's practice test seemed to be the best of the rest.
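Adding up the prices mentioned so far gives a rough budget. This is only a sketch. The prices are the ones quoted in this article and will change. The Coursera figure uses the higher of the two monthly prices mentioned above, and the number of Coursera months is a placeholder assumption you should replace with your own estimate.

```python
# Prices quoted in this article, in US dollars. They will drift over time.
FIXED_COSTS = {
    "Solutions Architect course": 40.00,
    "Solutions Architect practice tests": 15.00,
    "Solutions Architect exam": 150.00,
    "Machine Learning course (Udemy sale price)": 15.99,
    "Machine Learning practice tests": 17.99,
}

ML_EXAM_LIST_PRICE = 300.00
VOUCHER_DISCOUNT = 0.5    # half off the next exam after passing one
COURSERA_MONTHLY = 59.00  # upper end of the $49 to $59 range mentioned above


def ml_exam_price(use_voucher=True):
    """Price of the Machine Learning exam, with or without the voucher."""
    if use_voucher:
        return ML_EXAM_LIST_PRICE * (1 - VOUCHER_DISCOUNT)
    return ML_EXAM_LIST_PRICE


def total_cost(coursera_months, use_voucher=True):
    """Total for the whole roadmap, rounded to cents."""
    total = sum(FIXED_COSTS.values())
    total += ml_exam_price(use_voucher)
    total += COURSERA_MONTHLY * coursera_months
    return round(total, 2)


if __name__ == "__main__":
    # ASSUMPTION: two months of Coursera; adjust to your own pace.
    months = 2
    print(f"With voucher:    ${total_cost(months):.2f}")
    print(f"Without voucher: ${total_cost(months, use_voucher=False):.2f}")
```

The voucher covers the whole price of the Architect exam: skipping that exam saves $150, but the Machine Learning exam then costs $150 more.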
One thing to be aware of when it comes to practice questions: some sites that sell practice questions are actually selling "test dumps." Instead of handcrafted questions, they are selling copies of actual exam questions. I don't know how widespread this is for Amazon tests, but I know it has been common for other companies in the past. Using test dumps violates the terms of service for the exam. Just learn the material and you will be prepared for the test.
I mentioned above that I make flash cards to prepare for exams. I use a program called Anki that you can get here. The computer version is free, though I believe there is a one-time charge for the iPhone app, if you want to use that. Personally, I make the flash cards on my laptop so I can use the keyboard, but I normally study the flash cards on my phone.
As I am going through a course or a book, whenever there is something I want to be sure to remember, I make a flash card for it. Also, anytime an instructor says "you will want to remember this on the exam," I make a flash card. When I start doing practice tests, virtually every question I get wrong gets a flash card, and so do some of the questions I got right but was unsure about.
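That rule of thumb (a card for every miss, plus anything I was unsure about) can be sketched in Python. Anki can import plain text files with tab-separated fields, so this produces one front/back pair per line. The `PracticeResult` record, its field names, and the sample entries are all hypothetical and made up for illustration. They are not part of Anki or of any practice-test site.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class PracticeResult:
    """One practice question (hypothetical structure for this sketch)."""
    question: str
    answer: str
    correct: bool
    unsure: bool = False


def needs_card(result):
    """A question earns a card if it was missed or answered with doubt."""
    return (not result.correct) or result.unsure


def cards_to_tsv(results):
    """Return tab-separated front/back lines for the results needing cards."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for result in results:
        if needs_card(result):
            writer.writerow([result.question, result.answer])
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder entries; write your own in your own words.
    results = [
        PracticeResult("Where does most SageMaker training data live?",
                       "S3", correct=False),
        PracticeResult("Which course section covers Kinesis, Glue and EMR?",
                       "Data Engineering", correct=True),
        PracticeResult("What does domain 4 of the exam cover?",
                       "MLOps", correct=True, unsure=True),
    ]
    print(cards_to_tsv(results), end="")
```

Saving that output to a text file gives you something to import into Anki, though typing the cards by hand works just as well.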
You can find decks of flash cards that people have made on various subjects, but I find the best way is to create your own. If you can formulate what you want to remember into a question in your own words, that will give you a great head start on retention. And when you see the question again, you will remember some of the larger context around the question that led you to make that card in the first place. If you use someone else's cards, you won't have that context.
This exam is generally considered one of the more difficult ones that Amazon offers, but it is doable. The steps I have laid out will take months to complete, but you will learn the material, and you should pass the exam on the first try.
If, after taking the courses and practice exams, there are still areas you don't feel you understand, you can dig deeper into those. Amazon provides free workshops on a number of topics. I was really confused about what the BlazingText algorithm was, and I found talks about it on YouTube.
Either before your test, or after, you are going to have to build things with these tools to cement your learning. The examples SageMaker provides can be a good place to start.
I mentioned a number of courses and other paid resources here. I am not receiving any sort of promotional fee. These are just the things I found most useful.