
Python Pandas read_fwf() Method
The read_fwf() method in Python's Pandas library reads data from files whose columns have fixed widths, meaning each field occupies a set range of character positions on every line. It is useful for data files whose columns are aligned by position rather than separated by a delimiter. It can also return an iterator that reads a large file in smaller chunks, which makes such files easier to process.
Fixed-width file formats are commonly used in older systems or specific applications, where each piece of data starts and ends at specific positions. This method allows for reading such data efficiently into DataFrames for analysis. The read_fwf() method is similar to read_csv(), but instead of reading delimited data, it works with files where each column has a fixed width.
Syntax
The syntax of the read_fwf() method is as follows −
pandas.read_fwf(filepath_or_buffer, *, colspecs='infer', widths=None, infer_nrows=100, dtype_backend=<no_default>, iterator=False, chunksize=None, **kwds)
Parameters
The Python Pandas read_fwf() method accepts the below parameters −
filepath_or_buffer: Specifies the file path or file-like object. It can be a string, path object, or URL (e.g., http, ftp, s3, or file).
colspecs: This is a list of tuples (pairs) that specify the start and end positions of each column. For example, (0, 5) means the column starts at position 0 and ends at position 5 (excluding 5). If you don't specify this, the method tries to figure out the column positions from the first 100 rows by default.
widths: Specifies field widths as a list of integers. Use this instead of colspecs for contiguous intervals.
infer_nrows: Number of rows the parser considers when detecting the column positions. It applies only when colspecs is set to 'infer'. Default is 100.
dtype_backend: Backend for the resulting DataFrame's data types, either 'numpy_nullable' or 'pyarrow'. If it is not specified, the regular NumPy-backed data types are used.
iterator: If True, returns an iterator for reading the file in chunks.
chunksize: Number of lines to read per chunk for iteration.
**kwds: Additional optional keyword arguments passed to TextFileReader, such as skiprows, names, or header.
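The relationship between colspecs and widths described above can be sketched in plain Python. This is an illustration only, not how pandas implements it: the helper name widths_to_colspecs and the sample line are assumptions made for this example.

```python
from itertools import accumulate


def widths_to_colspecs(widths):
    """Turn contiguous field widths into half-open (start, end) pairs."""
    ends = list(accumulate(widths))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


# Hypothetical fixed-width record: Name (6), Age (5), City (9), Salary (7)
line = "Tom    28  Toronto    20000"

colspecs = widths_to_colspecs([6, 5, 9, 7])
print(colspecs)

# Each (start, end) pair maps to the slice line[start:end]; end is excluded
fields = [line[start:end].strip() for start, end in colspecs]
print(fields)
```

Because the intervals are half-open, the end of one column can be reused as the start of the next without the two columns overlapping.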
Return Value
The Pandas read_fwf() method returns a DataFrame, or a TextFileReader if iterator=True or chunksize is specified.
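The difference between the two return types can be seen in a short sketch. It reads from an in-memory io.StringIO buffer instead of a file; the sample text and variable names are assumptions made for this illustration.

```python
import io

import pandas as pd

# Hypothetical fixed-width sample (columns separated by blank positions)
text = (
    "Name   Age\n"
    "Tom    28\n"
    "Lee    32\n"
    "Steven 43\n"
)

# Default call: the whole input is parsed into a DataFrame
df = pd.read_fwf(io.StringIO(text))
print(type(df).__name__)

# iterator=True: a TextFileReader is returned and rows are read on demand
with pd.read_fwf(io.StringIO(text), iterator=True) as reader:
    reader_type = type(reader).__name__
    first_two = reader.get_chunk(2)  # read only the first two data rows

print(reader_type)
print(first_two)
```

Using the reader as a context manager closes the underlying handle once the block ends.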
Example: Reading a Fixed-Width File
Here is a basic example demonstrating reading a fixed-width file using the pandas read_fwf() method with the default settings to infer column specifications automatically from the file.
import pandas as pd

# Sample fixed-width data saved to a file
data = """
Name   Age City      Salary
Tom    28  Toronto    20000
Lee    32  HongKong    3000
Steven 43  Bay Area    8300
Ram    38  Hyderabad   3900"""

# Writing data to a file
with open("sample_fwf.txt", "w") as file:
    file.write(data)

# Reading the fixed-width file
df = pd.read_fwf("sample_fwf.txt")

print("DataFrame from fixed-width file:")
print(df)
When we run the above program, it produces the following result −
DataFrame from fixed-width file:
| | Name | Age | City | Salary |
|---|---|---|---|---|
| 0 | Tom | 28 | Toronto | 20000 |
| 1 | Lee | 32 | HongKong | 3000 |
| 2 | Steven | 43 | Bay Area | 8300 |
| 3 | Ram | 38 | Hyderabad | 3900 |
Example: Reading Fixed-Width File with Column Specifications
The following example demonstrates using the read_fwf() method with the colspecs parameter to define custom start and end intervals for columns.
import pandas as pd

# Sample fixed-width data saved to a file
data = """
Name   Age City      Salary
Tom    28  Toronto    20000
Lee    32  HongKong    3000
Steven 43  Bay Area    8300
Ram    38  Hyderabad   3900"""

# Writing data to a file
with open("sample_fwf.txt", "w") as file:
    file.write(data)

# Specifying column intervals as half-open (start, end) pairs
colspecs = [(0, 6), (6, 11), (11, 20), (20, 27)]

# Reading the fixed-width file
df = pd.read_fwf("sample_fwf.txt", colspecs=colspecs)

print("DataFrame from fixed-width file with specified column intervals:")
print(df)
Following is the output of the above code −
DataFrame from fixed-width file with specified column intervals:
| | Name | Age | City | Salary |
|---|---|---|---|---|
| 0 | Tom | 28 | Toronto | 20000 |
| 1 | Lee | 32 | HongKong | 3000 |
| 2 | Steven | 43 | Bay Area | 8300 |
| 3 | Ram | 38 | Hyderabad | 3900 |
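Column intervals do not have to be contiguous. The sketch below is an illustration under assumptions, not part of the original example: it reads only the Name and Salary fields by leaving the other positions out of colspecs, and it uses an in-memory io.StringIO buffer with a hypothetical layout.

```python
import io

import pandas as pd

# Hypothetical layout: Name at 0-5, Age at 7-9, City at 11-19, Salary at 21-26
text = (
    "Name   Age City      Salary\n"
    "Tom    28  Toronto    20000\n"
    "Lee    32  HongKong    3000\n"
    "Steven 43  Bay Area    8300\n"
    "Ram    38  Hyderabad   3900\n"
)

# Only positions 0-5 (Name) and 20-26 (Salary) are read; the rest is ignored
df = pd.read_fwf(io.StringIO(text), colspecs=[(0, 6), (20, 27)])
print(df)
```

Positions that fall outside every interval are simply never parsed, which can be handy when a wide record contains only a few fields of interest.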
Example: Specifying Column Widths While Reading Fixed-Width File
You can also manually specify column widths using the widths parameter of the read_fwf() method instead of using the colspecs parameter. The following example demonstrates the same.
import pandas as pd

# Sample fixed-width data saved to a file
data = """
Name   Age City      Salary
Tom    28  Toronto    20000
Lee    32  HongKong    3000
Steven 43  Bay Area    8300
Ram    38  Hyderabad   3900"""

# Writing data to a file
with open("sample_fwf.txt", "w") as file:
    file.write(data)

# Reading the file with specified widths (they must cover each field fully)
df = pd.read_fwf("sample_fwf.txt", widths=[6, 5, 9, 7])

print("DataFrame from fixed-width file with specified widths:")
print(df)
On executing the above code, we get the following output −
DataFrame from fixed-width file with specified widths:
| | Name | Age | City | Salary |
|---|---|---|---|---|
| 0 | Tom | 28 | Toronto | 20000 |
| 1 | Lee | 32 | HongKong | 3000 |
| 2 | Steven | 43 | Bay Area | 8300 |
| 3 | Ram | 38 | Hyderabad | 3900 |
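A list of widths is shorthand for contiguous column intervals, so the widths have to cover every field completely. The following sketch, built on a hypothetical in-memory sample rather than the file above, checks that widths and the equivalent colspecs give the same DataFrame, and shows how a width that is too small silently truncates a column.

```python
import io

import pandas as pd

# Hypothetical layout: Name at 0-5, Age at 7-9, City at 11-19, Salary at 21-26
text = (
    "Name   Age City      Salary\n"
    "Tom    28  Toronto    20000\n"
    "Lee    32  HongKong    3000\n"
    "Steven 43  Bay Area    8300\n"
    "Ram    38  Hyderabad   3900\n"
)

# Widths 6, 5, 9, 7 correspond to the intervals below
by_widths = pd.read_fwf(io.StringIO(text), widths=[6, 5, 9, 7])
by_colspecs = pd.read_fwf(
    io.StringIO(text), colspecs=[(0, 6), (6, 11), (11, 20), (20, 27)]
)
print(by_widths.equals(by_colspecs))

# A last width of 4 stops at position 23, cutting off header and values
truncated = pd.read_fwf(io.StringIO(text), widths=[6, 5, 9, 4])
print(truncated)
```

No error is raised in the truncated case, so it is worth comparing a few parsed rows against the raw file whenever widths are written by hand.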
Example: Skipping Rows While Reading Fixed-Width Files
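The read_fwf() method allows you to skip rows while reading by using the skiprows parameter. This example skips the first three lines of the fixed-width text file: the leading blank line, the header row, and the first data row. As a result, the next row (Lee) is used as the header.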
import pandas as pd

# Sample fixed-width data saved to a file
data = """
Name   Age City      Salary
Tom    28  Toronto    20000
Lee    32  HongKong    3000
Steven 43  Bay Area    8300
Ram    38  Hyderabad   3900"""

# Writing data to a file
with open("sample_fwf.txt", "w") as file:
    file.write(data)

# Reading the file, skipping the first three rows
df = pd.read_fwf("sample_fwf.txt", skiprows=3)

print("DataFrame from fixed-width file by skipping the first three rows:")
print(df)
Following is the output of the above code −
DataFrame from fixed-width file by skipping the first three rows:
| | Lee | 32 | HongKong | 3000 |
|---|---|---|---|---|
| 0 | Steven | 43 | Bay Area | 8300 |
| 1 | Ram | 38 | Hyderabad | 3900 |
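To drop specific lines while keeping the header, skiprows can also be given as a list of 0-based line numbers. The sketch below is an illustration on a hypothetical in-memory sample with no leading blank line; widths is passed explicitly so the result does not depend on column inference.

```python
import io

import pandas as pd

# Hypothetical layout: Name at 0-5, Age at 7-9, City at 11-19, Salary at 21-26
text = (
    "Name   Age City      Salary\n"
    "Tom    28  Toronto    20000\n"
    "Lee    32  HongKong    3000\n"
    "Steven 43  Bay Area    8300\n"
    "Ram    38  Hyderabad   3900\n"
)

# Line 0 is the header; lines 1 and 2 (Tom and Lee) are skipped
df = pd.read_fwf(io.StringIO(text), widths=[6, 5, 9, 7], skiprows=[1, 2])
print(df)
```

Because line 0 is not in the list, the original header is preserved and only the two named data rows are left out.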
Example: Iterating through Fixed-Width File Chunk by Chunk
By specifying the chunksize parameter of the read_fwf() method, you get an iterable TextFileReader object that yields the data of a fixed-width file chunk by chunk.
import pandas as pd

# Sample fixed-width data saved to a file
data = """
Name   Age City      Salary
Tom    28  Toronto    20000
Lee    32  HongKong    3000
Steven 43  Bay Area    8300
Ram    38  Hyderabad   3900"""

# Writing data to a file
with open("sample_fwf.txt", "w") as file:
    file.write(data)

# Reading the file in chunks
chunk_size = 1
chunks = pd.read_fwf("sample_fwf.txt", chunksize=chunk_size)

print("Iterating through Fixed-Width File Chunk by Chunk:")
for chunk in chunks:
    print(chunk)
When we run the above program, it produces the following result −
Iterating through Fixed-Width File Chunk by Chunk:
| | Name | Age | City | Salary |
|---|---|---|---|---|
| 0 | Tom | 28 | Toronto | 20000 |

| | Name | Age | City | Salary |
|---|---|---|---|---|
| 1 | Lee | 32 | HongKong | 3000 |

| | Name | Age | City | Salary |
|---|---|---|---|---|
| 2 | Steven | 43 | Bay Area | 8300 |

| | Name | Age | City | Salary |
|---|---|---|---|---|
| 3 | Ram | 38 | Hyderabad | 3900 |
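Chunked reading is mainly useful when a result can be accumulated piece by piece. As a minimal sketch, assuming a hypothetical in-memory sample and a chunk size of 2, the loop below totals the Salary column without holding the whole file in one DataFrame.

```python
import io

import pandas as pd

# Hypothetical layout: Name at 0-5, Age at 7-9, City at 11-19, Salary at 21-26
text = (
    "Name   Age City      Salary\n"
    "Tom    28  Toronto    20000\n"
    "Lee    32  HongKong    3000\n"
    "Steven 43  Bay Area    8300\n"
    "Ram    38  Hyderabad   3900\n"
)

total_salary = 0
rows_seen = 0

# chunksize=2 yields DataFrames of at most two rows each
with pd.read_fwf(io.StringIO(text), chunksize=2) as reader:
    for chunk in reader:
        total_salary += int(chunk["Salary"].sum())
        rows_seen += len(chunk)

print(rows_seen, total_salary)
```

Only one chunk is held in memory at a time, so the same pattern scales to files that are too large to load at once.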