Common Excel Tasks Demonstrated in Pandas - Part 2
Posted by Chris Moffitt in articles
Introduction
I have been very excited by the response to the first post in this series. Thank you to all for the positive feedback. I want to keep the series going by highlighting some other tasks that you commonly execute in Excel and show how you can perform similar functions in pandas.
In the first article, I focused on common math tasks in Excel and their pandas counterparts. In this article, I’ll focus on some common selection and filtering tasks and illustrate how to do the same thing in pandas.
Getting Set Up
If you would like to follow along, you can download the Excel file.
Import the pandas and numpy modules.
import pandas as pd
import numpy as np
Load in the Excel data that represents a year’s worth of sales for our sample company.
df = pd.read_excel("sample-salesv3.xlsx")
Take a quick look at the data types to make sure everything came through as expected.
df.dtypes
account number      int64
name               object
sku                object
quantity            int64
unit price        float64
ext price         float64
date               object
dtype: object
You’ll notice that our date column is showing up as a generic object. We are going to convert it to a datetime object to make some future selections a little easier.
df['date'] = pd.to_datetime(df['date'])
df.head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
0 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 | 2014-01-01 07:21:51 |
1 | 714466 | Trantow-Barrows | S2-77896 | -1 | 63.16 | -63.16 | 2014-01-01 10:00:47 |
2 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 | 2014-01-01 13:24:58 |
3 | 307599 | Kassulke, Ondricka and Metz | S1-65481 | 41 | 21.05 | 863.05 | 2014-01-01 15:05:22 |
4 | 412290 | Jerde-Hilpert | S2-34077 | 6 | 83.21 | 499.26 | 2014-01-01 23:26:55 |
df.dtypes
account number             int64
name                      object
sku                       object
quantity                   int64
unit price               float64
ext price                float64
date              datetime64[ns]
dtype: object
The date is now a datetime object which will be useful in future steps.
Filtering the data
I think one of the handiest features in Excel is the filter. I imagine that almost any time someone gets an Excel file of any size and wants to filter the data, they use this function.
Here is an image of using it for this data set:
[Screenshot: the Excel filter applied to this data set]
Similar to the filter function in Excel, you can use pandas to filter and select certain subsets of data.
For instance, if we want to just see a specific account number, we can easily do that with Excel or with pandas.
Here is the Excel filter solution:
[Screenshot: filtering on a specific account number in Excel]
It is relatively straightforward to do in pandas. Note that I am going to use the head function to show the top results. This is purely for the purposes of keeping the article shorter.
df[df["account number"]==307599].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
3 | 307599 | Kassulke, Ondricka and Metz | S1-65481 | 41 | 21.05 | 863.05 | 2014-01-01 15:05:22 |
13 | 307599 | Kassulke, Ondricka and Metz | S2-10342 | 17 | 12.44 | 211.48 | 2014-01-04 07:53:01 |
34 | 307599 | Kassulke, Ondricka and Metz | S2-78676 | 35 | 33.04 | 1156.40 | 2014-01-10 05:26:31 |
58 | 307599 | Kassulke, Ondricka and Metz | B1-20000 | 22 | 37.87 | 833.14 | 2014-01-15 16:22:22 |
70 | 307599 | Kassulke, Ondricka and Metz | S2-10342 | 44 | 96.79 | 4258.76 | 2014-01-18 06:32:31 |
You can also filter based on numeric values. I am not going to show any more Excel-based samples; I am sure you get the idea.
df[df["quantity"] > 22].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
0 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 | 2014-01-01 07:21:51 |
2 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 | 2014-01-01 13:24:58 |
3 | 307599 | Kassulke, Ondricka and Metz | S1-65481 | 41 | 21.05 | 863.05 | 2014-01-01 15:05:22 |
14 | 737550 | Fritsch, Russel and Anderson | B1-53102 | 23 | 71.56 | 1645.88 | 2014-01-04 08:57:48 |
15 | 239344 | Stokes LLC | S1-06532 | 34 | 71.51 | 2431.34 | 2014-01-04 11:34:58 |
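As an aside that goes beyond the Excel comparison: if you want values inside a range rather than above a threshold, pandas also has a between helper on a column. Something like this should work:

# select rows where quantity falls between 22 and 30 (inclusive)
df[df["quantity"].between(22, 30)].head()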
If we want to do more complex filtering, we can use map to filter on various criteria. In this example, let’s look for items with skus that start with B1.
df[df["sku"].map(lambda x: x.startswith('B1'))].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
0 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 | 2014-01-01 07:21:51 |
2 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 | 2014-01-01 13:24:58 |
6 | 218895 | Kulas Inc | B1-65551 | 2 | 31.10 | 62.20 | 2014-01-02 10:57:23 |
14 | 737550 | Fritsch, Russel and Anderson | B1-53102 | 23 | 71.56 | 1645.88 | 2014-01-04 08:57:48 |
17 | 239344 | Stokes LLC | B1-50809 | 14 | 16.23 | 227.22 | 2014-01-04 22:14:32 |
It’s easy to chain two or more statements together using the & operator.
df[df["sku"].map(lambda x: x.startswith('B1')) & (df["quantity"] > 22)].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
0 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 | 2014-01-01 07:21:51 |
2 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 | 2014-01-01 13:24:58 |
14 | 737550 | Fritsch, Russel and Anderson | B1-53102 | 23 | 71.56 | 1645.88 | 2014-01-04 08:57:48 |
26 | 737550 | Fritsch, Russel and Anderson | B1-53636 | 42 | 42.06 | 1766.52 | 2014-01-08 00:02:11 |
31 | 714466 | Trantow-Barrows | B1-33087 | 32 | 19.56 | 625.92 | 2014-01-09 10:16:32 |
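The same approach should work for an OR condition; just swap the & for the | operator. A quick sketch using the criteria from above:

# rows that either have a sku starting with B1 or a quantity greater than 22
df[df["sku"].map(lambda x: x.startswith('B1')) | (df["quantity"] > 22)].head()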
Another useful function that pandas supports is called isin. It allows us to define a list of values we want to look for. In this case, we look for all records that include two specific account numbers.
df[df["account number"].isin([714466,218895])].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
1 | 714466 | Trantow-Barrows | S2-77896 | -1 | 63.16 | -63.16 | 2014-01-01 10:00:47 |
2 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 | 2014-01-01 13:24:58 |
5 | 714466 | Trantow-Barrows | S2-77896 | 17 | 87.63 | 1489.71 | 2014-01-02 10:07:15 |
6 | 218895 | Kulas Inc | B1-65551 | 2 | 31.10 | 62.20 | 2014-01-02 10:57:23 |
8 | 714466 | Trantow-Barrows | S1-50961 | 22 | 84.09 | 1849.98 | 2014-01-03 11:29:02 |
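If you want the inverse (every record that is not one of those accounts), you can negate the condition with the ~ operator. For example:

# exclude the two account numbers instead of selecting them
df[~df["account number"].isin([714466, 218895])].head()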
Pandas supports another function called query, which allows you to efficiently select subsets of data. It does require the numexpr library, so make sure you have it installed (pip install numexpr) before trying this step.
If you would like to get a list of customers by name, you can do that with a query, similar to the python syntax shown above.
df.query('name == ["Kulas Inc","Barton LLC"]').head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
0 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 | 2014-01-01 07:21:51 |
2 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 | 2014-01-01 13:24:58 |
6 | 218895 | Kulas Inc | B1-65551 | 2 | 31.10 | 62.20 | 2014-01-02 10:57:23 |
33 | 218895 | Kulas Inc | S1-06532 | 3 | 22.36 | 67.08 | 2014-01-09 23:58:27 |
36 | 218895 | Kulas Inc | S2-34077 | 16 | 73.04 | 1168.64 | 2014-01-10 12:07:30 |
The query function allows you to do much more than this simple example, but for the purposes of this discussion, I’m showing it so you are aware that it is out there for your needs.
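To give a flavor of what else is possible, query expressions can combine multiple conditions in one string. Here is a small sketch using the same columns from our data set:

# customers named Kulas Inc with a quantity over 30
df.query('quantity > 30 and name == "Kulas Inc"').head()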
Working with Dates
Using pandas, you can do complex filtering on dates. Before doing anything with dates, I encourage you to sort by the date column to make sure the results return what you are expecting.
df = df.sort_values(by=['date'])
df.head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
0 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 | 2014-01-01 07:21:51 |
1 | 714466 | Trantow-Barrows | S2-77896 | -1 | 63.16 | -63.16 | 2014-01-01 10:00:47 |
2 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 | 2014-01-01 13:24:58 |
3 | 307599 | Kassulke, Ondricka and Metz | S1-65481 | 41 | 21.05 | 863.05 | 2014-01-01 15:05:22 |
4 | 412290 | Jerde-Hilpert | S2-34077 | 6 | 83.21 | 499.26 | 2014-01-01 23:26:55 |
The python filtering syntax shown before works with dates.
df[df['date'] >='20140905'].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
1042 | 163416 | Purdy-Kunde | B1-38851 | 41 | 98.69 | 4046.29 | 2014-09-05 01:52:32 |
1043 | 714466 | Trantow-Barrows | S1-30248 | 1 | 37.16 | 37.16 | 2014-09-05 06:17:19 |
1044 | 729833 | Koepp Ltd | S1-65481 | 48 | 16.04 | 769.92 | 2014-09-05 08:54:41 |
1045 | 729833 | Koepp Ltd | S2-11481 | 6 | 26.50 | 159.00 | 2014-09-05 16:33:15 |
1046 | 737550 | Fritsch, Russel and Anderson | B1-33364 | 4 | 76.44 | 305.76 | 2014-09-06 08:59:08 |
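As a side note, if you prefer to be explicit about the type instead of relying on string parsing, comparing against a pandas Timestamp should give the same result:

# equivalent filter using an explicit Timestamp instead of a date string
df[df['date'] >= pd.Timestamp('2014-09-05')].head()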
One of the really nice features of pandas is that it understands dates so it will allow us to do partial filtering. If we want to only look for data more recent than a specific month, we can do so.
df[df['date'] >='2014-03'].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
242 | 163416 | Purdy-Kunde | S1-30248 | 19 | 65.03 | 1235.57 | 2014-03-01 16:07:40 |
243 | 527099 | Sanford and Sons | S2-82423 | 3 | 76.21 | 228.63 | 2014-03-01 17:18:01 |
244 | 527099 | Sanford and Sons | B1-50809 | 8 | 70.78 | 566.24 | 2014-03-01 18:53:09 |
245 | 737550 | Fritsch, Russel and Anderson | B1-50809 | 20 | 50.11 | 1002.20 | 2014-03-01 23:47:17 |
246 | 688981 | Keeling LLC | B1-86481 | -1 | 97.16 | -97.16 | 2014-03-02 01:46:44 |
Of course, you can chain the criteria.
df[(df['date'] >='20140701') & (df['date'] <= '20140715')].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
778 | 737550 | Fritsch, Russel and Anderson | S1-65481 | 35 | 70.51 | 2467.85 | 2014-07-01 00:21:58 |
779 | 218895 | Kulas Inc | S1-30248 | 9 | 16.56 | 149.04 | 2014-07-01 00:52:38 |
780 | 163416 | Purdy-Kunde | S2-82423 | 44 | 68.27 | 3003.88 | 2014-07-01 08:15:52 |
781 | 672390 | Kuhn-Gusikowski | B1-04202 | 48 | 99.39 | 4770.72 | 2014-07-01 11:12:13 |
782 | 642753 | Pollich LLC | S2-23246 | 1 | 51.29 | 51.29 | 2014-07-02 04:02:39 |
Because pandas understands date columns, you can express the date value in multiple formats and it will give you the results you expect.
df[df['date'] >= 'Oct-2014'].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
1168 | 307599 | Kassulke, Ondricka and Metz | S2-23246 | 6 | 88.90 | 533.40 | 2014-10-08 06:19:50 |
1169 | 424914 | White-Trantow | S2-10342 | 25 | 58.54 | 1463.50 | 2014-10-08 07:31:40 |
1170 | 163416 | Purdy-Kunde | S1-27722 | 22 | 34.41 | 757.02 | 2014-10-08 09:01:18 |
1171 | 163416 | Purdy-Kunde | B1-33087 | 7 | 79.29 | 555.03 | 2014-10-08 15:39:13 |
1172 | 672390 | Kuhn-Gusikowski | B1-38851 | 30 | 94.64 | 2839.20 | 2014-10-09 00:22:33 |
df[df['date'] >= '10-10-2014'].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
1174 | 257198 | Cronin, Oberbrunner and Spencer | S2-34077 | 13 | 12.24 | 159.12 | 2014-10-10 02:59:06 |
1175 | 740150 | Barton LLC | S1-65481 | 28 | 53.00 | 1484.00 | 2014-10-10 15:08:53 |
1176 | 146832 | Kiehn-Spinka | S1-27722 | 15 | 64.39 | 965.85 | 2014-10-10 18:24:01 |
1177 | 257198 | Cronin, Oberbrunner and Spencer | S2-16558 | 3 | 35.34 | 106.02 | 2014-10-11 01:48:13 |
1178 | 737550 | Fritsch, Russel and Anderson | B1-53636 | 10 | 56.95 | 569.50 | 2014-10-11 10:25:53 |
When working with time series data, if we convert the data to use the date as the index, we can do some more filtering variations. Set the new index using set_index.
df2 = df.set_index(['date'])
df2.head()
date | account number | name | sku | quantity | unit price | ext price |
---|---|---|---|---|---|---|
2014-01-01 07:21:51 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 |
2014-01-01 10:00:47 | 714466 | Trantow-Barrows | S2-77896 | -1 | 63.16 | -63.16 |
2014-01-01 13:24:58 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 |
2014-01-01 15:05:22 | 307599 | Kassulke, Ondricka and Metz | S1-65481 | 41 | 21.05 | 863.05 |
2014-01-01 23:26:55 | 412290 | Jerde-Hilpert | S2-34077 | 6 | 83.21 | 499.26 |
We can slice the data to get a range.
df2["20140101":"20140201"].head()
date | account number | name | sku | quantity | unit price | ext price |
---|---|---|---|---|---|---|
2014-01-01 07:21:51 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 |
2014-01-01 10:00:47 | 714466 | Trantow-Barrows | S2-77896 | -1 | 63.16 | -63.16 |
2014-01-01 13:24:58 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 |
2014-01-01 15:05:22 | 307599 | Kassulke, Ondricka and Metz | S1-65481 | 41 | 21.05 | 863.05 |
2014-01-01 23:26:55 | 412290 | Jerde-Hilpert | S2-34077 | 6 | 83.21 | 499.26 |
Once again, we can use various date representations to remove any ambiguity around date naming conventions.
df2["2014-Jan-1":"2014-Feb-1"].head()
date | account number | name | sku | quantity | unit price | ext price |
---|---|---|---|---|---|---|
2014-01-01 07:21:51 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 |
2014-01-01 10:00:47 | 714466 | Trantow-Barrows | S2-77896 | -1 | 63.16 | -63.16 |
2014-01-01 13:24:58 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 |
2014-01-01 15:05:22 | 307599 | Kassulke, Ondricka and Metz | S1-65481 | 41 | 21.05 | 863.05 |
2014-01-01 23:26:55 | 412290 | Jerde-Hilpert | S2-34077 | 6 | 83.21 | 499.26 |
df2["2014-Jan-1":"2014-Feb-1"].tail()
date | account number | name | sku | quantity | unit price | ext price |
---|---|---|---|---|---|---|
2014-01-31 22:51:18 | 383080 | Will LLC | B1-05914 | 43 | 80.17 | 3447.31 |
2014-02-01 09:04:59 | 383080 | Will LLC | B1-20000 | 7 | 33.69 | 235.83 |
2014-02-01 11:51:46 | 412290 | Jerde-Hilpert | S1-27722 | 11 | 21.12 | 232.32 |
2014-02-01 17:24:32 | 412290 | Jerde-Hilpert | B1-86481 | 3 | 35.99 | 107.97 |
2014-02-01 19:56:48 | 412290 | Jerde-Hilpert | B1-20000 | 23 | 78.90 | 1814.70 |
df2.loc["2014"].head()
date | account number | name | sku | quantity | unit price | ext price |
---|---|---|---|---|---|---|
2014-01-01 07:21:51 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 |
2014-01-01 10:00:47 | 714466 | Trantow-Barrows | S2-77896 | -1 | 63.16 | -63.16 |
2014-01-01 13:24:58 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 |
2014-01-01 15:05:22 | 307599 | Kassulke, Ondricka and Metz | S1-65481 | 41 | 21.05 | 863.05 |
2014-01-01 23:26:55 | 412290 | Jerde-Hilpert | S2-34077 | 6 | 83.21 | 499.26 |
df2.loc["2014-Dec"].head()
date | account number | name | sku | quantity | unit price | ext price |
---|---|---|---|---|---|---|
2014-12-01 20:15:34 | 714466 | Trantow-Barrows | S1-82801 | 3 | 77.97 | 233.91 |
2014-12-02 20:00:04 | 146832 | Kiehn-Spinka | S2-23246 | 37 | 57.81 | 2138.97 |
2014-12-03 04:43:53 | 218895 | Kulas Inc | S2-77896 | 30 | 77.44 | 2323.20 |
2014-12-03 06:05:43 | 141962 | Herman LLC | B1-53102 | 20 | 26.12 | 522.40 |
2014-12-03 14:17:34 | 642753 | Pollich LLC | B1-53636 | 19 | 71.21 | 1352.99 |
As you can see, there are a lot of options when it comes to sorting and filtering based on dates.
Additional String Functions
Pandas has support for vectorized string functions as well.
If we want to identify all the skus that contain a certain value, we can use str.contains. In this case, we know that the sku is always represented in the same way, so B1 only shows up in the front of the sku. You need to understand your data to make sure you are getting back what you expected.
df[df['sku'].str.contains('B1')].head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
0 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 | 2014-01-01 07:21:51 |
2 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 | 2014-01-01 13:24:58 |
6 | 218895 | Kulas Inc | B1-65551 | 2 | 31.10 | 62.20 | 2014-01-02 10:57:23 |
14 | 737550 | Fritsch, Russel and Anderson | B1-53102 | 23 | 71.56 | 1645.88 | 2014-01-04 08:57:48 |
17 | 239344 | Stokes LLC | B1-50809 | 14 | 16.23 | 227.22 | 2014-01-04 22:14:32 |
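Since we only care about B1 at the front of the sku, you could also use str.startswith, which is a slightly stricter alternative because it will not match a B1 that appears in the middle of a value:

# match only skus that begin with B1
df[df['sku'].str.startswith('B1')].head()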
We can string queries together and use sort_values to control how the data is ordered.
df[(df['sku'].str.contains('B1-531')) & (df['quantity'] > 40)].sort_values(by=['quantity', 'name'], ascending=[False, True])
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
684 | 642753 | Pollich LLC | B1-53102 | 46 | 26.07 | 1199.22 | 2014-06-08 19:33:33 |
792 | 688981 | Keeling LLC | B1-53102 | 45 | 41.19 | 1853.55 | 2014-07-04 21:42:22 |
176 | 383080 | Will LLC | B1-53102 | 45 | 89.22 | 4014.90 | 2014-02-11 04:14:09 |
1213 | 604255 | Halvorson, Crona and Champlin | B1-53102 | 41 | 55.05 | 2257.05 | 2014-10-18 19:27:01 |
1215 | 307599 | Kassulke, Ondricka and Metz | B1-53102 | 41 | 93.70 | 3841.70 | 2014-10-18 23:25:10 |
1128 | 714466 | Trantow-Barrows | B1-53102 | 41 | 55.68 | 2282.88 | 2014-09-27 10:42:48 |
1001 | 424914 | White-Trantow | B1-53102 | 41 | 81.25 | 3331.25 | 2014-08-26 11:44:30 |
Bonus Task
I frequently find myself trying to get a list of unique items in a long list within Excel. It is a multi-step process to do this in Excel but is fairly simple in pandas. Here is one way to do this using the Advanced Filter in Excel.
[Screenshot: using Excel's Advanced Filter to extract the unique values]
In pandas, we use the unique function on a column to get the list.
df["name"].unique()
array([u'Barton LLC', u'Trantow-Barrows', u'Kulas Inc', u'Kassulke, Ondricka and Metz', u'Jerde-Hilpert', u'Koepp Ltd', u'Fritsch, Russel and Anderson', u'Kiehn-Spinka', u'Keeling LLC', u'Frami, Hills and Schmidt', u'Stokes LLC', u'Kuhn-Gusikowski', u'Herman LLC', u'White-Trantow', u'Sanford and Sons', u'Pollich LLC', u'Will LLC', u'Cronin, Oberbrunner and Spencer', u'Halvorson, Crona and Champlin', u'Purdy-Kunde'], dtype=object)
If we wanted to include the account number, we could use drop_duplicates.
df.drop_duplicates(subset=["account number","name"]).head()
|  | account number | name | sku | quantity | unit price | ext price | date |
---|---|---|---|---|---|---|---|
0 | 740150 | Barton LLC | B1-20000 | 39 | 86.69 | 3380.91 | 2014-01-01 07:21:51 |
1 | 714466 | Trantow-Barrows | S2-77896 | -1 | 63.16 | -63.16 | 2014-01-01 10:00:47 |
2 | 218895 | Kulas Inc | B1-69924 | 23 | 90.70 | 2086.10 | 2014-01-01 13:24:58 |
3 | 307599 | Kassulke, Ondricka and Metz | S1-65481 | 41 | 21.05 | 863.05 | 2014-01-01 15:05:22 |
4 | 412290 | Jerde-Hilpert | S2-34077 | 6 | 83.21 | 499.26 | 2014-01-01 23:26:55 |
We are obviously pulling in more data than we need and getting some information that is not useful, so select only the first and second columns using iloc.
df.drop_duplicates(subset=["account number","name"]).iloc[:,[0,1]]
|  | account number | name |
---|---|---|
0 | 740150 | Barton LLC |
1 | 714466 | Trantow-Barrows |
2 | 218895 | Kulas Inc |
3 | 307599 | Kassulke, Ondricka and Metz |
4 | 412290 | Jerde-Hilpert |
7 | 729833 | Koepp Ltd |
9 | 737550 | Fritsch, Russel and Anderson |
10 | 146832 | Kiehn-Spinka |
11 | 688981 | Keeling LLC |
12 | 786968 | Frami, Hills and Schmidt |
15 | 239344 | Stokes LLC |
16 | 672390 | Kuhn-Gusikowski |
18 | 141962 | Herman LLC |
20 | 424914 | White-Trantow |
21 | 527099 | Sanford and Sons |
30 | 642753 | Pollich LLC |
37 | 383080 | Will LLC |
51 | 257198 | Cronin, Oberbrunner and Spencer |
67 | 604255 | Halvorson, Crona and Champlin |
106 | 163416 | Purdy-Kunde |
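If you find positional indexing hard to read, selecting the columns by name should produce the same result as the iloc version:

# same idea as the iloc example, but selecting columns by name
df.drop_duplicates(subset=["account number", "name"])[["account number", "name"]]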
I think this single command is easier to maintain than trying to remember the Excel steps every time.
If you would like to view the notebook, feel free to download it.
Conclusion
After I posted my first article, Dave Proffer retweeted my post and said “Good tips 2 break ur #excel addiction”. I think this is an accurate way to describe how Excel is frequently used today. So many people reach for it right away without realizing how limiting it can be. I hope this series helps people understand that there are alternatives out there and that python+pandas is an extremely powerful combination.
Changes
- 29-Nov-2020: Updated the code to use sort_values and removed the reference to ix