How to Apply a function Row-wise or Column-wise in Pandas | #19 of 53: The Complete Pandas Course

Course materials Github: https://github.com/machinelearningplu...

The map function iterates through every cell of a given series. Say you have a data frame; any given column of it is a series, right? If you take that column and call map on it, whatever function you write there will be executed on every cell of the series. That's what the map function does.

Now, sometimes, instead of iterating through a single series, you want to iterate through the entire row of a data frame. That is, you have multiple rows and columns, and you want to go through each row and do your computation based on all of the columns, or some of the columns, in that row. You can do that using the apply function. If you clearly understand this function, you will have the confidence to write almost any sort of logic on any given data frame.

Apply does not work only row wise; it works column wise as well. That is, you can take a given column, do some computation on the entire column, and return the output.

So what we are going to do here is use the Hungary chickenpox data set. This data set contains the weekly number of chickenpox cases for various regions in Hungary. These are the different values of this data set.
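To make the difference between map and apply concrete, here is a minimal sketch. The small data frame below is made up for illustration (the column names and values are assumptions, not the real chickenpox data), and the lambda functions are hypothetical examples rather than the ones used in the video.

```python
import pandas as pd

# Toy stand-in for the chickenpox data set; the values are invented.
df = pd.DataFrame({
    "Date": ["03/01/2005", "10/01/2005", "17/01/2005"],
    "BUDAPEST": [10, 20, 30],
    "BARANYA": [5, 25, 15],
})

# map: the function receives one cell of the series at a time.
doubled = df["BUDAPEST"].map(lambda cases: cases * 2)

# apply with axis="columns": the function receives one whole row
# (as a series), so it can combine several columns of that row.
row_total = df.apply(lambda row: row["BUDAPEST"] + row["BARANYA"], axis="columns")

print(doubled.tolist())
print(row_total.tolist())
```

With map the function only ever sees a single value, while with apply it sees the full row and can pick whichever columns it needs.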
What we want to do first is this: for every week in this data frame, we want to find out the maximum number of cases observed in any of the regions within Hungary. That can be easily achieved using the apply function, and we will first see how to write apply based on this fairly straightforward example. Basically, we want to find the maximum value of every row in this data frame. We will do that with apply first, then we will write the same logic with a for loop afterward. Then, after understanding that example, we will see a column wise operation also.

Alright, let's first do this. The first objective is to find the maximum of each and every row in this data frame. So here, what I'm doing first is dropping the Date column, because I'm not concerned about the date. Drop it, call apply, and then within apply pass the function that you want to execute. By default, the values of a given row in the data frame will be passed on to this function, and the output of the function is going to be the maximum value of that row, right? So for the first row I think it's going to be 173 or something, and a value like that is going to be returned for every row. Let's run this and see the output. So for the first row, the maximum value is 178. I think it should be hidden somewhere here. Likewise, for the second row, the maximum is 200, and so on. So that's the output.

To do this with a for loop is quite straightforward as well. We simply iterate the data frame row wise using df.iterrows(). The row variable in the loop is going to contain all the values present in each and every row. From that row, we extract the maximum value and append it to the max values list that we initialized before the for loop.
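The two approaches just described, apply and a for loop over df.iterrows(), can be sketched as follows. The data frame is again a made-up stand-in (invented region values, hypothetical variable names such as regions and max_by_apply), so the numbers will not match the 178 and 200 seen in the video.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the chickenpox data set; the values are invented.
df = pd.DataFrame({
    "Date": ["03/01/2005", "10/01/2005", "17/01/2005"],
    "BUDAPEST": [10, 20, 30],
    "BARANYA": [5, 25, 15],
    "BACS": [12, 8, 40],
})

# Drop the Date column so only the numeric region columns remain.
regions = df.drop(columns="Date")

# Row wise with apply: np.max receives each row as a series.
max_by_apply = regions.apply(np.max, axis="columns")

# The same logic with an explicit for loop over the rows.
max_values = []
for index, row in regions.iterrows():
    max_values.append(row.max())

print(max_by_apply.tolist())
print(max_values)
```

Both versions should produce one maximum per week; the apply version is shorter and returns a series aligned with the original index.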
So let's run this. This output should match the output of apply. The very first five rows here are the same as what we got there. Good.

Second example: what is the median number of chickenpox cases for every region? "For every region" is key here, because we need to find the median number of chickenpox cases column wise; each and every region is a column, right? So for Budapest, what is the median? For Baranya, what is the median? And so on. You can use the same apply function; instead of np.max, we are going to use np.median. The main change is that the axis changes to rows. Earlier, when we iterated row wise, we had set axis equal to columns. To do it for every column, set axis equal to rows. Let's run this. So those are the different median values. Hope that is clear. Quite straightforward, right?

The main thing you need to remember about the apply function is that whatever function you are using is going to receive the entire row or the entire column, based on the value that you pass to axis. That's the main idea.

Let's solidify this information with a mini challenge; two mini challenges, actually. So the challenge for this video is create a column containing the difference between the highest and the lowest.
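Going back to the median example above, the column wise call can be sketched like this. As before, the data frame is an invented stand-in and the variable names are hypothetical. The sketch passes axis=0, which makes apply hand each column to the function; this is assumed to correspond to the "axis equal to rows" setting mentioned in the video.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the chickenpox data set; the values are invented.
df = pd.DataFrame({
    "Date": ["03/01/2005", "10/01/2005", "17/01/2005"],
    "BUDAPEST": [10, 20, 30],
    "BARANYA": [5, 25, 15],
    "BACS": [12, 8, 40],
})

regions = df.drop(columns="Date")

# Column wise: with axis=0 the function receives each column as a series,
# so the result has one median per region.
medians = regions.apply(np.median, axis=0)

print(medians)
```

The result is indexed by region name rather than by week, which is a quick way to check that the function ran per column and not per row.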