Amazon AWS Certified Machine Learning Specialty – Modeling Part 11


30. IP Insights in SageMaker

Now let's cover the IP Insights algorithm in SageMaker. IP Insights is all about finding fishy behavior in your web logs. It's an unsupervised technique that learns the usage patterns of specific IP addresses and automatically identifies suspicious behavior from given IP addresses. So it can identify login attempts from anomalous IP addresses, and it can identify accounts that are creating resources from anomalous IP addresses. Basically, it's used as a security tool, a way of analyzing your web logs for suspicious behavior that might cause you to flag something or maybe shut down a session. It can take in user names and account IDs directly, so you don't really need to preprocess your data a whole lot. It has a training channel, obviously, but since it is unsupervised, the validation channel is optional.

If you want, you can use that validation channel to compute an area under the curve (AUC) score, which we talked about earlier. Remember, the input has to be CSV data, and it's a very simple CSV file: just entity and IP address. That entity can be a username or an account ID or whatever identifier you use, followed by the IP address associated with that entity. That's it. Under the hood, it's using a neural network to learn latent vector representations of entities and IP addresses. So it's doing some pretty fancy modeling there to try to learn what specific IP addresses do. Those entities are hashed and embedded, so it has an embedding layer to try to organize those IP addresses together. You need a sufficiently large hash size for this to work, so that's going to end up being one of our important hyperparameters, and it will automatically generate negative samples during training by randomly pairing entities and IPs.
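
Just to make that input format concrete, here's a tiny hypothetical example of the two-column CSV training data, with made-up usernames and documentation-reserved IP addresses. Each row is simply an entity identifier followed by the IPv4 address it was observed with, no header row:

    user_alice,192.0.2.44
    user_alice,192.0.2.45
    user_bob,198.51.100.7
    account_1234,203.0.113.99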

So that's kind of a neat little twist to the algorithm there. This is a case where we have an unbalanced data set, right? It's kind of like fraud detection, where the vast majority of transactions are not going to be anomalous. So it actually generates anomalous examples by just randomly pairing together entities and IP addresses, and those random pairings probably are anomalous. Kind of a neat idea there. As for the important hyperparameters: num_entity_vectors is the hash size we talked about earlier, and they recommend that you set it to twice the number of unique entity identifiers, so there's a little bit of a manual step there. Also, the size of the embedding vectors is given by vector_dim, which is something else you might want to tune.

Too large a value there could result in overfitting, so that's something to be careful of. And since it is a neural network under the hood, we have the usual suspects for tuning neural networks: the number of training epochs, the learning rate, and the batch size. You can use a CPU or GPU, but since it's a neural network, GPUs are recommended if you can; an ml.p3.2xlarge or higher is recommended, and it can use multiple GPUs on one machine. The size of a CPU instance would depend on the hyperparameters you chose, if you're going to go with CPUs instead. So again, the main thing with IP Insights is that it's used to identify anomalous behavior from IP addresses using a neural network. That's the main takeaway.
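
To see how those pieces fit together end to end, here's a minimal sketch of training IP Insights with the SageMaker Python SDK. Treat it as illustrative rather than definitive: the role ARN, bucket paths, and hyperparameter values are placeholders, so check the current documentation for the exact names and supported values.

    import sagemaker
    from sagemaker import image_uris
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput

    session = sagemaker.Session()
    role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # placeholder role ARN

    # Built-in IP Insights container for the current region
    image = image_uris.retrieve("ipinsights", session.boto_region_name)

    ip_insights = Estimator(
        image_uri=image,
        role=role,
        instance_count=1,
        instance_type="ml.p3.2xlarge",  # GPU recommended since it's a neural net
        output_path="s3://my-bucket/ipinsights/output",  # hypothetical bucket
        sagemaker_session=session,
    )

    # Hyperparameters discussed above (values are illustrative)
    ip_insights.set_hyperparameters(
        num_entity_vectors=20000,  # roughly 2x the number of unique entity identifiers
        vector_dim=128,            # embedding size; too large risks overfitting
        epochs=5,
        learning_rate=0.01,
        mini_batch_size=1000,
    )

    # Training channel is required; validation is optional (used for the AUC score)
    ip_insights.fit({
        "train": TrainingInput("s3://my-bucket/ipinsights/train/", content_type="text/csv"),
        "validation": TrainingInput("s3://my-bucket/ipinsights/validation/", content_type="text/csv"),
    })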

31. Reinforcement Learning in SageMaker

Now let's dive into the world of reinforcement learning in SageMaker. And while I could frame this as just yet another built-in algorithm of SageMaker, it's really its own entirely different beast. So let's go into a little bit more depth on this one. Reinforcement learning isn't like the other algorithms in SageMaker. You don't, like, train a model and then deploy a model to make inferences for classifications or regressions. It's more about learning about some virtual environment and how to navigate that environment in an optimal manner as you encounter different states within that environment. So the example I'm going to use here is an AI-driven Pacman. Hopefully you're familiar with the old game Pacman. If you grew up in the 80s, I'm sure you did. But the idea is that you have some sort of an agent. In this case, the agent is Pacman, and he's exploring some space.

In this case, that space is the game board of Pacman itself. And as it goes through this space, it learns the values, the rewards, associated with different state changes and different conditions. So, for example, if I turn left, what happens? I'm going to hit a wall. That's not good. If I go right a little bit, I might hit a power pill. That's probably a good thing, so there would be a reward associated with that. But if I go down, I'm going to run into a ghost and die. So that would be a very negative reward in that case. So it just learns, for a given position within this environment and a given set of things around me, what's the best thing to do? And it does this by randomly exploring the space over time and building up a model of how to navigate this thing most efficiently.

Once it's been trained, once it's explored this entire space and learned about it, it's very quick for it to be deployed and actually run in practice, because it has a very fast lookup table of: okay, I'm in this spot, this is the state around me, here's what I should do. So the online performance is very fast once you've actually gone through the work of exploring this space and training it. And although you do see this a lot in the world of games, and you hear a lot of press about AI winning different types of games using reinforcement learning, because it's a fun example where you can pit a machine against a man and watch the machine win, it also has some more practical applications. For example, supply chain management, HVAC systems, industrial robotics, dialogue systems, and autonomous vehicles. Even those you can think of as just an agent in a giant environment of the world, if you will. So that's what reinforcement learning is all about.

Let's dive into more of the mathematical notation around it. A very specific implementation of reinforcement learning is called Q-learning, which just formalizes what we talked about a little bit more. So, again, we start with a set of environmental states. We'll call that S, and possible states are the surrounding conditions of the agent. Is there a ghost next to me? Is there a power pill in front of me? Things like that. Those are states, and I have a set of possible actions that I can take in those states. We'll call that set of actions A. In the case of Pacman, those possible actions will be things like move up, down, left, or right. Finally, we'll have a value for each state/action pair, and we'll call that value Q. That's why we call it Q-learning. So for a given state of conditions surrounding Pacman, a given action will have a value Q. Moving up might have a given value of Q; moving down would have a negative Q value if it means encountering a ghost. So we start off with a Q value of zero for every possible state that Pacman can be in.

And as Pacman explores the maze, as bad things happen to Pacman, we reduce the Q value for the state that Pacman was in at that time. So if Pacman ends up getting eaten by a ghost, we penalize whatever he did in that current state. And as good things happen to Pacman, as he eats a power pill or eats a ghost, we'll increase the Q value for that action for the state that he was in. Then we can use those Q values to inform Pacman's future choices, and we've built a little intelligent agent that can perform optimally and make a perfect little Pacman. So, getting back to a concrete example, let's look at some state/action values.
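
Here's a minimal Python sketch of that bookkeeping, just to make it concrete. The state encoding and reward values are made up for illustration; the point is that Q starts at zero for every state/action pair and gets nudged up or down as rewards and penalties come in:

    from collections import defaultdict

    # Q[(state, action)] starts at 0.0 for every pair we haven't seen yet
    Q = defaultdict(float)

    ACTIONS = ["up", "down", "left", "right"]

    def update_q(state, action, reward, learning_rate=0.1):
        """Nudge the Q value for (state, action) toward the observed reward."""
        Q[(state, action)] += learning_rate * (reward - Q[(state, action)])

    # Hypothetical example: Pacman's state is "what's around him"
    state = ("wall_west", "space_north", "space_east", "ghost_south")
    update_q(state, "down", reward=-100.0)   # ran into the ghost: penalize
    update_q(state, "right", reward=10.0)    # grabbed a power pill: reward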

We can define the current state of Pacman by the fact that he has a wall to the west, a space to the north and east, and a ghost to the south. And we can look at the actions he can take: he can't actually move left at all, but he can move up, down, or right. And we can assign a value to all of those actions. By going up or right, nothing really happens at all; there's no power pill or dots to consume. But if he goes down, that's definitely a negative value. So we can say, for the state given by the current conditions that Pacman is surrounded by here, moving down would be a really bad choice. There should be a negative Q value for that. Moving left just can't be done at all; that would have basically an infinitely negative Q value. And moving up or right are just neutral, so the Q value would remain zero for those action choices for that given state. Now, you can also look ahead a little bit more to make an even more intelligent agent. Say I'm actually two steps away from getting a power pill here. If Pacman were to explore this state, and eating that power pill were the outcome of the next state, I could actually factor that into the Q value for the previous state.

And if you just have some sort of a discount factor based on how far away you are in time, how many steps away you are, you can factor that all in together. So that's actually a way of building a little bit of memory into the system. The Q value that I experience when I consume that power pill might actually give a boost to the previous Q values that I encountered along the way. It's a way of propagating that value back in time and giving a little bit of a boost to the actions that led to this positive Q value later on.
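
Continuing the same hypothetical sketch (reusing the Q table and ACTIONS list from above), the standard Q-learning update captures that discounted lookahead: it blends the immediate reward with the discounted best Q value of the state you land in, so value propagates backwards to the actions that led there. The learning rate and discount values here are arbitrary illustrations:

    def update_q_discounted(state, action, reward, next_state,
                            learning_rate=0.1, gamma=0.9):
        """Q-learning update: credit flows back from the next state,
        discounted by gamma, so earlier actions share in later rewards."""
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        target = reward + gamma * best_next
        Q[(state, action)] += learning_rate * (target - Q[(state, action)])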

So one problem that we have in reinforcement learning is the exploration problem. How do I make sure that I efficiently cover all of the different states, and the actions within those states, during the exploration phase, or the training phase, if you will? A sort of naive approach is to always choose the action for a given state with the highest Q value that I've computed so far, and if there's a tie, just choose one at random. So initially all of my Q values might be zero, and I'll just pick actions at random at first. And as I start to gain information about better Q values for given actions in given states, I'll start to use those as I go. But that ends up being pretty inefficient, and I can actually miss a lot of paths that way if I just tie myself into this rigid algorithm of always choosing the best Q value that I've computed so far. So a better way of doing exploration is to introduce a little random variation into my actions as I'm exploring. We call that an epsilon term. So we have some value where I roll the dice and get a random number, and if it ends up being less than this epsilon value, I don't actually follow the highest Q value. I don't do the thing that makes sense.

Instead, I just take a path at random to try it out and see what happens. And that actually lets me explore a much wider range of possibilities; it lets us cover a much wider range of actions and states than we could otherwise. So what we just did can be described in very fancy mathematical terms, but conceptually it's still pretty simple. I explore some set of actions that I can take for a given set of states. I use that to inform the rewards associated with a given action for a given set of states. And after that exploration is done, I can use that information, those Q values, to intelligently navigate through an entirely new maze. But this can also be called a Markov decision process. So again, a lot of data science is just assigning fancy, intimidating names to simple concepts, and there's a ton of that in the world of reinforcement learning.
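
That epsilon-greedy idea is only a few lines of code. Continuing the same hypothetical sketch, the agent usually exploits the best Q value it knows about, but with probability epsilon it explores something random instead:

    import random

    def choose_action(state, epsilon=0.1):
        """Epsilon-greedy: explore at random with probability epsilon,
        otherwise exploit the best Q value seen so far (ties broken randomly)."""
        if random.random() < epsilon:
            return random.choice(ACTIONS)
        best = max(Q[(state, a)] for a in ACTIONS)
        return random.choice([a for a in ACTIONS if Q[(state, a)] == best])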

So if you look up the definition of Markov decision processes, it is a mathematical framework for modeling decision making. Decision making, like what action do we take given a set of possibilities for a given state, in situations where the outcomes are partly random? Well, that kind of sounds like the random exploration that we just talked about. And partly under the control of a decision maker? That decision maker is the Q values that we computed. So MDPs, Markov decision processes, are just a fancy way of describing the exploration algorithm that we just described for reinforcement learning. And the notation is even similar. States are still described as s, and s' is the next state that we encounter. We have state transition functions, which are just defined as P_a for a given pair of states s and s'. And we have our Q values, which are basically represented as a reward function: the R_a value for a given s and s' plays the same role as the reward that feeds our Q values. Moving from one state to another has a given reward associated with it, and moving from one state to another is defined by a state transition function. So again, this is describing what we just did, just with different mathematical notation and a fancier-sounding name, Markov decision processes. And if you want to sound even smarter, you can also call a Markov decision process by another name: a discrete-time stochastic control process. Holy cow, that sounds intelligent. But the concept itself is the same simple thing we just described. So to recap, you can make an intelligent Pacman agent, or anything else, by just having it semi-randomly explore different choices of movement given different conditions, where those choices are actions and those conditions are states. We keep track of the reward or penalty associated with each action in each state as we go, and we can actually propagate those rewards and penalties backwards multiple steps if we want to make it even better.
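
For reference, here is the standard textbook notation for those pieces written out in LaTeX, just to connect the symbols from this lecture back to the Q-learning update we sketched earlier (alpha is the learning rate and gamma is the discount factor):

    P_a(s, s') = \Pr(s_{t+1} = s' \mid s_t = s,\ a_t = a)
    R_a(s, s') = \text{immediate reward for moving from } s \text{ to } s' \text{ via action } a
    Q(s, a) \leftarrow Q(s, a) + \alpha \big[ R_a(s, s') + \gamma \max_{a'} Q(s', a') - Q(s, a) \big]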

And then we store those Q values that we ended up associating with each state, and we can use that to inform its future choices. So we can go into a whole new maze and have a really smart Pacman that can avoid the ghosts and eat them up pretty effectively, all on its own. It's a pretty simple concept, but it's very powerful. And you can also say that you understand a bunch of fancy terms now, because it's all basically the same thing: Q-learning, reinforcement learning, Markov decision processes, dynamic programming, it's all tied up in the same concept. So, I mean, I think it's pretty cool that you can actually make a sort of artificially intelligent Pacman through such a simple technique. And it really does work. Let's tie this back into SageMaker now.

So SageMaker offers an implementation of reinforcement learning that's built on deep learning, and it uses TensorFlow and MXNet to do this. It also supports different toolkits. So when we talk about reinforcement learning in SageMaker, we have frameworks, which are TensorFlow or MXNet; we have toolkits, which include Intel Coach and Ray RLlib; and we have environments, and it supports a wide variety of environments. They can be custom ones, open source, or commercial ones. MATLAB and Simulink, EnergyPlus, Roboschool, PyBullet, Amazon Sumerian, and AWS RoboMaker are examples that they give in the documentation. The other cool thing about SageMaker's reinforcement learning is that it can be distributed, so that exploration, that training stage, can be distributed amongst many machines. And you can also distribute the environment rollout as well. So you can deploy that trained model, where it learned all those Q values for what actions to take in different states, in a distributed manner as well.

It can do both multi-core and multi-instance, so you can take advantage of multiple cores on one machine and an entire fleet of multiple machines as well. So to recap again, here are some of the key terms associated with reinforcement learning. The environment is the layout of the board or the maze or whatever it is you're working within. The state would be where the player or pieces are; like, where exactly is our agent right now? An action would be the things that agent can do, like moving in a given direction. And a reward is the value associated with the action from a given state: we have a given state, so what's the reward associated with a given action from that state? Finally, an observation would be the surroundings in a maze or the state of a chessboard; basically, what's the state of the environment right now? All right, so talking about hyperparameters, it's a little bit weird with reinforcement learning because, again, it's not a traditional machine learning model. We're not doing old-school train/test here. It's a little bit different. So the parameters that you might want to optimize are probably going to be very specific to your particular implementation.

So reinforcement learning in SageMaker just allows you to abstract away whatever hyperparameters you want. There's nothing built in internally, but if there are things that you want to expose to be tuned, you can do that, and then you can use SageMaker's hyperparameter tuning capabilities to optimize them automatically. So there's no set list of hyperparameters with reinforcement learning, but you can make your own if you want to. And again, due to the general nature of reinforcement learning, there's not a lot of specific guidance on what instance types to use for it with SageMaker. But keep in mind, it is built on deep learning frameworks like TensorFlow and MXNet, so a GPU is probably going to be helpful. And we do know that it supports multiple instances and multiple cores, so you can have more than one machine, even if you're going with CPUs.
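
To make that concrete, here's a rough sketch of launching a SageMaker RL training job with the Python SDK, using the Coach toolkit on TensorFlow. The entry-point script, source directory, role ARN, toolkit version, and hyperparameter names are all placeholders; your own training script defines which hyperparameters actually exist and how the environment is wired up.

    from sagemaker.rl import RLEstimator, RLToolkit, RLFramework

    role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # placeholder role ARN

    estimator = RLEstimator(
        entry_point="train-pacman.py",       # hypothetical training script
        source_dir="src",                    # hypothetical source directory
        toolkit=RLToolkit.COACH,             # or RLToolkit.RAY for Ray RLlib
        toolkit_version="1.0.0",             # check the SDK docs for supported versions
        framework=RLFramework.TENSORFLOW,    # or RLFramework.MXNET
        role=role,
        instance_type="ml.p3.2xlarge",       # GPU helps since it's deep-learning based
        instance_count=1,
        hyperparameters={
            # Nothing is built in; these names are whatever your script exposes
            "discount_factor": 0.9,
            "evaluation_episodes": 5,
        },
    )

    # No training channels needed here; the simulated environment is set up
    # inside the entry-point script
    estimator.fit()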
