Amazon AWS Certified Machine Learning Specialty – Modeling Part 12

January 25, 2023

32. Automatic Model Tuning

Let’s talk about automatic model tuning within SageMaker, which is a very exciting capability of the SageMaker system. Hyperparameter tuning is a really big problem in the world of machine learning. For all the algorithms we’ve covered, we’ve talked about the different hyperparameters they expose, and there are a lot of them, right? How do you find the optimal values for these things? Well, we have some guidance on some of them. I mean, we’ve talked about the effect of different learning rates and batch sizes and depths. Some of these can cause you to converge on local minima that aren’t the right answer.

Some of them can cause you to overfit your model, things like that. But finding the absolute best values is tough. These are very complicated systems, and we really haven’t come up with a better approach than trying different values and seeing which one works best. So often you just have to experiment with different values of these parameters to end up with a model that’s as optimal as it can be. It’s kind of machine learning’s dirty little secret that we don’t fully understand what’s going on inside, and a lot of it is just trial and error to see what works. And this problem blows up very quickly when you have many different parameters that you want to optimize at once.

So if I have ten different values of learning rate that I want to drill in on, that’s fine: I can train and test my model ten times and figure out which learning rate worked best. But if I have ten different learning rates and ten different batch sizes that I want to try out, now I have ten times ten, or a hundred, possibilities to try. If I want to throw in different depths of the network as well, I’ve just blown it up by another order of magnitude. So as you add more and more hyperparameters that you want to tune at once, this problem grows exponentially. You have to try every combination of every possible value, and every combination means training a model and evaluating that model. As you can see, this gets really, really expensive, both in terms of time and money, very quickly. That’s what automatic model tuning in SageMaker tries to help with. Basically, you define the hyperparameters you care about, the ranges of values you want to try for each of them, and the metric you’re optimizing for.
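That multiplication is easy to see with a few lines of plain Python. The candidate values below are purely illustrative, but the counting is exact: a naive grid search trains one model per combination.

```python
from itertools import product

# Illustrative candidate values for three hyperparameters (made up for counting)
learning_rates = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]  # 10
batch_sizes    = [8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]          # 10
depths         = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]                          # 10

# Tuning learning rate alone: 10 training runs
print(len(learning_rates))                                        # 10

# Add batch size: 10 x 10 = 100 runs
print(len(list(product(learning_rates, batch_sizes))))            # 100

# Add depth too: 10 x 10 x 10 = 1,000 runs, each one a full train + evaluate
print(len(list(product(learning_rates, batch_sizes, depths))))    # 1000
```

Each extra hyperparameter multiplies the run count by the number of values you want to try, which is exactly the exponential blow-up described above.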

SageMaker can then spin up what we call a hyperparameter tuning job that will train as many of those combinations as you allow. You can set an upper bound on how many training jobs you want to run to control your costs, and it will work within that bound. As it goes, it will spin up training instances to run as much in parallel as it can, potentially quite a few of them, and try to plow through those different combinations of parameters as quickly as possible. It can involve quite a bit of computing power, but at least we can use the parallel capabilities of SageMaker, and its ability to spin up entirely separate instances, to make that as quick as possible. Once it’s done, the set of hyperparameters that produced the best results can be turned around and deployed as a highly tuned model using the best parameters it could find. But here’s where it gets really cool. The thing that’s special about automatic model tuning in SageMaker is that it learns as it goes, so it doesn’t actually try every possible combination. It can learn over time that moving in this direction on this parameter is having a positive effect and that one is having a negative effect, and it uses that to be more intelligent about which parameter values it tries out next. By doing that, it can save a lot in terms of the resources required for your hyperparameter tuning.
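The idea of “learning as it goes” can be illustrated with a toy sequential search in plain Python. This is not SageMaker’s actual algorithm (SageMaker uses Bayesian optimization under the hood); it’s just a minimal sketch of why evaluating results before picking the next candidates beats blindly trying every value. The fake `train_and_evaluate` function stands in for a full training job and is entirely made up, with its best score at a learning rate of 0.05.

```python
# Toy stand-in for a training job: returns a validation score for a given
# learning rate. In reality each call would be a full SageMaker training job.
def train_and_evaluate(lr):
    return -(lr - 0.05) ** 2   # hypothetical: score peaks at lr = 0.05

# Naive sequential tuner: evaluate a point, then step toward whichever
# neighbor scored better, shrinking the step each round. It "learns as it
# goes" instead of exhaustively sweeping the whole range.
def tune(lo=0.001, hi=0.5, budget=10):
    best_lr = (lo + hi) / 2
    step = (hi - lo) / 4
    best_score = train_and_evaluate(best_lr)
    for _ in range(budget - 1):
        for candidate in (best_lr - step, best_lr + step):
            score = train_and_evaluate(candidate)
            if score > best_score:
                best_lr, best_score = candidate, score
        step /= 2   # narrow the search around the best point so far
    return best_lr

print(round(tune(), 3))   # homes in near 0.05 in only ~20 evaluations
```

A full grid over the same range at comparable resolution would need hundreds of evaluations; the sequential search gets close to the optimum with a small, fixed budget, which is the intuition behind tuning that learns from earlier training jobs.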

Now, there are some best practices you should follow when doing automatic model tuning in SageMaker, and this stuff is important to remember. First of all, don’t try to optimize too many hyperparameters at once. Like we talked about, this explodes very quickly: every additional hyperparameter is another dimension of parameter space that you need to explore, and the search blows up exponentially. So focus on the hyperparameters you think will have the most impact on the accuracy of your model, or whatever metric you’re optimizing for. Start with those first; you can always tune other parameters in a second pass later on. Also, limit your ranges to as small a range as possible. If you have some guidance as to what values might work, don’t explore crazy values outside of that, because that just generates work that doesn’t need to be done. Another key one is using logarithmic scales when appropriate. Whenever you run an automatic model tuning job, you tell it not only the range but also the scale on which to explore that range. A linear scale just steps through the range evenly. But if you have a hyperparameter whose sensible values span several orders of magnitude, something like 0.0001 to 0.1, you probably want a logarithmic scale instead, right? With a linear scale you’d be there all day, but a logarithmic scale explores that range much more efficiently.
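The linear-versus-logarithmic point is easy to see by generating both kinds of candidate values for a hypothetical learning-rate range of 0.0001 to 0.1:

```python
import math

lo, hi, n = 0.0001, 0.1, 7   # hypothetical hyperparameter range, 7 candidates

# Linear scale: evenly spaced values. Nearly all of them land near the top
# of the range, so the small learning rates are barely explored at all.
linear = [lo + i * (hi - lo) / (n - 1) for i in range(n)]

# Logarithmic scale: evenly spaced exponents. Each step multiplies by a
# constant factor, so every order of magnitude gets equal attention.
log_lo, log_hi = math.log10(lo), math.log10(hi)
log = [10 ** (log_lo + i * (log_hi - log_lo) / (n - 1)) for i in range(n)]

print([round(v, 4) for v in linear])   # clusters: ~0.017, 0.033, 0.050, ...
print([round(v, 4) for v in log])      # spreads: 0.0001, ~0.0003, 0.001, ...
```

With the linear spacing, only one of the seven candidates is below 0.001; the logarithmic spacing covers each decade of the range evenly, which is why tuning jobs over wide multiplicative ranges should use a logarithmic scale.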

Also, do not run too many training jobs concurrently. Like we talked about, SageMaker’s hyperparameter tuning learns as it goes, and it can’t do that learning if it’s running everything in parallel. It works much better if you run just one or two training jobs at once, allow SageMaker to learn from those results, and then run the next set of training jobs. So don’t run too many training jobs concurrently with parameter tuning; that limits how well the process can learn, which is really the key to SageMaker’s efficiency in hyperparameter tuning.

Finally, if you have a training job that runs across multiple instances, you have to take care that the correct objective metric is reported at the end as a single result from all of those training instances. If you’re writing your own training job code, that can be a little bit tricky. You want to make sure it plays nicely with hyperparameter tuning by reporting the objective you’re optimizing for once all those instances come back together. The key points to remember here: use a small range if you can; don’t tune too many hyperparameters at once; don’t run too many training jobs concurrently, because the tuning relies on that sequential learning over time; and whenever appropriate, use logarithmic scales for exploring your parameter space.
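For custom training code, SageMaker typically scrapes the objective metric out of your training logs using a regex that you supply in the job’s metric definitions. A minimal sketch of both halves is below; the metric name and log format are hypothetical, but the `Name`/`Regex` dictionary shape matches what SageMaker’s metric definitions expect.

```python
import re

# In your training script: after all instances finish and their results are
# aggregated, emit one final log line carrying the objective metric.
def report_final_metric(validation_auc):
    print(f"final-metric: validation:auc={validation_auc:.4f}")

# In the tuning job definition: a metric definition whose regex pulls that
# number back out of the training logs (name and pattern are illustrative).
metric_definition = {
    "Name": "validation:auc",
    "Regex": r"validation:auc=([0-9.]+)",
}

# Simulate the scrape against a captured log line:
log_line = "final-metric: validation:auc=0.9312"
value = re.search(metric_definition["Regex"], log_line).group(1)
print(value)   # "0.9312"
```

The point is that only one instance (or one aggregation step) should emit the final metric line, so the tuner sees a single, correct objective value per training job rather than one partial value per instance.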

33. Apache Spark with SageMaker

So let’s talk about the intersection of SageMaker and Apache Spark. Apache Spark is a very popular framework for preprocessing data, and it also has a very powerful MLlib library that can perform machine learning at large scale too. So in a lot of ways, Apache Spark does a lot of what SageMaker does, and even more, because it’s really good at preprocessing data. Basically, the way it works is that you load your data into something called a DataFrame within Spark, and you can distribute the processing of that DataFrame, manipulating and massaging that data, across an entire Spark cluster. So wouldn’t it be cool if you could combine SageMaker and Spark together, using the power of AWS as well as the power of Spark? Well, it turns out you can.

AWS provides a SageMaker Spark library that basically lets you use SageMaker within a Spark driver script. So what does that look like? How do you use it? Well, you preprocess your data as normal with Apache Spark. Whatever processing you need to collect that data, map it, reduce it, and so on, you still do with Apache Spark as you normally would. And when you’re done, at least in the world of Python, you end up with a DataFrame object from Spark that contains all of your preprocessed data.

At that point, instead of using Spark’s MLlib, you can use what’s called a SageMaker Estimator, which works the same way. It exposes a few of the more popular SageMaker algorithms as things you can use directly from Spark: for example, K-Means, PCA, and XGBoost. XGBoost is a very popular algorithm these days that’s winning a lot of competitions; PCA does dimensionality reduction; and K-Means does clustering. The estimator then produces a SageMaker Model that you can use to make inferences. So it looks a lot like normal Spark code if you’re familiar with Spark, but instead of using a Spark MLlib implementation, we’re using a SageMaker Estimator and a SageMaker Model instead. For the machine learning portion, we’re handing things off to SageMaker to run within its own framework, as opposed to the Spark cluster itself. We spin up our own ML instances within SageMaker to perform that final stage, while still using Spark for all the preprocessing. The way this works in practice, you can take a SageMaker notebook and connect it to a remote Elastic MapReduce cluster running Spark.

So remember, EMR can run Spark. We just need to connect our SageMaker notebook to that Spark cluster so we can use it, or you can use Zeppelin if you prefer. The training DataFrame that you’re preprocessing and creating in Spark should end up with a features column that’s a vector of doubles (double-precision values) and, if you’re doing supervised learning, an optional labels column of doubles as well. Then you just create a SageMaker Estimator, call fit on it with that DataFrame, and that gives you back a SageMaker Model. You can then call transform on the SageMaker Model to make inferences with that trained model.
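The shape of that training DataFrame can be sketched without a Spark cluster at all. Each row needs a features vector of doubles plus, for supervised learning, a label double; the `features`/`label` column names below follow the usual Spark ML convention, and the raw fields are invented for illustration.

```python
# Plain-Python sketch of the row shape the SageMaker Spark library expects:
# a "features" column holding a vector of doubles and, for supervised
# algorithms, a "label" column of doubles. (In real code these rows would
# live in a Spark DataFrame; field names here are hypothetical.)
raw_records = [
    {"age": 34, "income": 55000, "clicked": 1},
    {"age": 51, "income": 72000, "clicked": 0},
]

def to_training_row(rec):
    return {
        "features": [float(rec["age"]), float(rec["income"])],  # vector of doubles
        "label": float(rec["clicked"]),                         # double label
    }

rows = [to_training_row(r) for r in raw_records]
print(rows[0])   # {'features': [34.0, 55000.0], 'label': 1.0}
```

Getting every feature cast to a double is the part that trips people up: string or integer columns left in the features vector will fail when the estimator’s fit is called, so the preprocessing stage should end by assembling exactly this shape.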

This also works with Spark Pipelines, so there’s pretty good integration between SageMaker and Spark. Why would you bother with all this? Well, it allows you to combine the power of preprocessing big data sets in Spark with training and inference in SageMaker, so it’s kind of the best of both worlds. And yes, Spark can do massive-scale machine learning on its own. But if you have AWS resources you want to use, and you want to take advantage of the special capabilities of SageMaker, such as automatic hyperparameter tuning, you might want to use both together. So it’s good to know that you can.


