
Various Text Data Types in Python Pandas
There are two ways to store textual data in pandas (as of versions 1.0.0 through 1.2.4, the latest at the time of writing). In other words, pandas text data can have one of two dtypes: object and StringDtype.
In versions of pandas before 1.0, only the object dtype was available. StringDtype was introduced in pandas 1.0 to overcome some disadvantages of the object dtype, and it is the recommended way to store text data in newer versions. Both object and StringDtype can still be used for text data.
Let's start with an example: create a DataFrame from text data and check which dtype pandas assigns to it by default.
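One commonly cited disadvantage of the object dtype is that it does not restrict a column to text. The sketch below is a minimal illustration of that point, not an exhaustive comparison; the variable names `mixed` and `text` are chosen here for illustration, and `dtype=object` is passed explicitly so the result does not depend on how a given pandas version infers dtypes.

```python
import pandas as pd

# An object column can hold any Python objects, not only strings.
mixed = pd.Series(["a", 1, 2.5], dtype=object)
print(mixed.dtype)
print([type(v).__name__ for v in mixed])  # ['str', 'int', 'float']

# A string column is dedicated to text and marks missing values with pd.NA.
text = pd.Series(["a", "b", None], dtype="string")
print(text.dtype)
print(text.isna().tolist())  # [False, False, True]
```

Because an object column can silently contain non-text values, its dtype alone does not tell you that the column is text, whereas the string dtype does.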
Object dtype
Create a pandas DataFrame with text data and verify the dtype of data.
Example
import pandas as pd

dict_ = {'A': ['a', 'Aa'], 'B': ['b', 'Bb']}  # declare a dictionary of strings
df = pd.DataFrame(dict_)  # create a DataFrame from the dictionary
print(df['A'])  # print the values of column A
print()  # blank line between the two outputs
print(df['B'])  # print the values of column B
Explanation
In the above code, we create a dictionary of string data and assign it to the dict_ variable, then use dict_ to create a pandas DataFrame. This DataFrame has 2 columns and 2 rows, and all of its data is string data.
The last three lines of the code print each column. Each printed column includes the dtype of its data. Let's verify the output below.
Output
0     a
1    Aa
Name: A, dtype: object

0     b
1    Bb
Name: B, dtype: object
The output above shows the values of column A and column B from our DataFrame, separated by a blank line. The dtype of each column is object by default. To get StringDtype, we need to request it explicitly.
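One way to request the string dtype explicitly for a DataFrame that already exists is `astype('string')`. The following is a minimal sketch that assumes every column holds only text; `df_str` is a name introduced here for illustration. The DataFrame is built with `dtype=object` so that the starting point matches the output above even if a newer pandas release infers a different default.

```python
import pandas as pd

dict_ = {'A': ['a', 'Aa'], 'B': ['b', 'Bb']}
df = pd.DataFrame(dict_, dtype=object)  # start from object columns

# Convert every column to the string dtype; the original df is not modified.
df_str = df.astype('string')

print(df.dtypes)      # dtypes before conversion
print()
print(df_str.dtypes)  # dtypes after conversion
```

To convert a single column instead, the same call can be applied to it, for example `df['A'].astype('string')`.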
String dtype
To define the string dtype, we can use the dtype parameter and pass either the alias 'string' or pd.StringDtype(). Let's see some examples below.
Example
import pandas as pd

list_ = ['python', 'sample', 'string']  # a list of strings
ds = pd.Series(list_, dtype='string')  # request the string dtype explicitly
print(ds)
Explanation
Here we create a pandas Series from a list of strings using pd.Series. Passing 'string' to the dtype parameter makes the Series use the string dtype instead of the default object dtype.
Output
0    python
1    sample
2    string
dtype: string
The block above is the output for the Series. Here the dtype of the data is string. We can also use pd.StringDtype() to set the dtype to string. Let's take another example.
Example
import pandas as pd

data = ['john', 'dev', 'philip']  # create a list of strings
ds = pd.Series(data, dtype=pd.StringDtype())  # create the Series with StringDtype
print(ds)
In this example we again create a pandas Series from a list of strings, this time passing pd.StringDtype() to the dtype parameter.
Output
0      john
1       dev
2    philip
dtype: string
The block above shows the result of passing pd.StringDtype() to the dtype parameter: the Series again has the string dtype.
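The two spellings used in the examples above, the alias 'string' and pd.StringDtype(), are meant to be interchangeable. The short check below is a sketch of how to confirm that; `by_alias` and `by_class` are illustrative names, not part of the original examples.

```python
import pandas as pd

data = ['john', 'dev', 'philip']

by_alias = pd.Series(data, dtype='string')          # dtype given as an alias
by_class = pd.Series(data, dtype=pd.StringDtype())  # dtype given as an object

# Compare the dtypes of the two Series, then their contents.
print(by_alias.dtype == by_class.dtype)
print(by_alias.equals(by_class))
```

Both comparisons should report that the two Series are the same, so the choice between the two spellings is a matter of style.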