How to combine two DataFrames in Pandas?
Last Updated: 05 May, 2025
When working with data, you often need to combine information from multiple sources. For example, one DataFrame may hold customer details while another holds their transaction history. To analyze them together, you need to combine the two DataFrames. The two main ways to do this in Pandas are concat() and merge().
In this article, we will implement and compare both methods to show when each one fits best.
1. Using concat() to Combine DataFrames
The concat() function allows you to stack DataFrames by adding rows on top of each other or columns side by side.
Stacking DataFrames Vertically
Python
import pandas as pd

# Two DataFrames with the same columns
df1 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2 = pd.DataFrame({'Name': ['Charlie', 'David'], 'Age': [35, 40]})

# Stack df2 below df1 (axis=0 is the default)
c_df = pd.concat([df1, df2])
print(c_df)
Output:
Stacking Dataframes Vertically
By default, concat() keeps each DataFrame's original index, so the combined result carries the labels 0, 1, 0, 1. If you want a clean, sequential index, pass ignore_index=True:
Python
c_df = pd.concat([df1, df2], ignore_index=True)
print(c_df)
Output:
Writing Index in Order
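The effect of ignore_index can be checked directly on the index. The following is a minimal sketch that rebuilds the same two DataFrames; the variable names kept and reset are illustrative and not part of the original example.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2 = pd.DataFrame({'Name': ['Charlie', 'David'], 'Age': [35, 40]})

# Default: each frame keeps its own labels, so 0 and 1 appear twice.
kept = pd.concat([df1, df2])
print(kept.index.tolist())            # [0, 1, 0, 1]

# A duplicated label selects every matching row, not just one.
print(kept.loc[0, 'Name'].tolist())   # ['Alice', 'Charlie']

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
reset = pd.concat([df1, df2], ignore_index=True)
print(reset.index.tolist())           # [0, 1, 2, 3]
```

Duplicate labels are legal in Pandas, but they make label-based lookups return several rows, which is usually why a fresh index is preferred after stacking.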
Stacking DataFrames Horizontally
Python
df1 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2 = pd.DataFrame({'City': ['New York', 'Los Angeles'], 'Salary': [70000, 80000]})

# axis=1 places the DataFrames side by side, matching rows by index label
c_df = pd.concat([df1, df2], axis=1)
print(c_df)
Output:
Stacking Dataframes Horizontally
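Horizontal stacking pairs rows by index label, not by position. The sketch below uses two small hypothetical DataFrames (left and right, with deliberately different indexes) to show what happens when the labels do not line up, and one possible way to pair rows by position instead.

```python
import pandas as pd

# Hypothetical frames whose index labels only partly overlap.
left = pd.DataFrame({'Name': ['Alice', 'Bob']}, index=[0, 1])
right = pd.DataFrame({'City': ['New York', 'Los Angeles']}, index=[1, 2])

# axis=1 aligns on index labels; by default the union of labels is kept,
# and cells with no matching label are filled with NaN.
wide = pd.concat([left, right], axis=1)
print(wide)

# Resetting both indexes first pairs the rows purely by position.
by_position = pd.concat(
    [left.reset_index(drop=True), right.reset_index(drop=True)], axis=1
)
print(by_position)
```

If the two DataFrames are meant to line up row for row, checking that their indexes match before concatenating avoids unexpected NaN values.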
2. Using merge() to Combine DataFrames
The merge() function works like a SQL join. It combines DataFrames based on common columns or indexes.
Basic Merge (Inner Join)
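The default join is an "inner join": only rows whose value in the shared column appears in both DataFrames are kept. In the example below, only Alice and Bob appear in both DataFrames, so Charlie and David are dropped: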
Python
df1 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})
df2 = pd.DataFrame({'Name': ['Alice', 'Bob', 'David'], 'Salary': [50000, 60000, 70000]})
m_df = pd.merge(df1, df2, on='Name')
print(m_df)
Output:
Inner Join of Dataframes
Types of Joins in merge()
- Inner Join: Only rows with matching values in both DataFrames.
- Outer Join: Includes all rows from both DataFrames. Where there's no match, it fills in NaN for missing values.
- Left Join: All rows from the left DataFrame and matching rows from the right.
- Right Join: All rows from the right DataFrame and matching rows from the left.
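As a minimal sketch of the left and right joins listed above, the following rebuilds the df1 and df2 from the inner-join example. The names left_df, right_df and flagged are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})
df2 = pd.DataFrame({'Name': ['Alice', 'Bob', 'David'], 'Salary': [50000, 60000, 70000]})

# Left join: every row of df1 survives; Charlie has no match, so Salary is NaN.
left_df = pd.merge(df1, df2, on='Name', how='left')
print(left_df)

# Right join: every row of df2 survives; David has no match, so Age is NaN.
right_df = pd.merge(df1, df2, on='Name', how='right')
print(right_df)

# indicator=True adds a _merge column saying where each row came from.
flagged = pd.merge(df1, df2, on='Name', how='left', indicator=True)
print(flagged[['Name', '_merge']])
```

The _merge column is a convenient way to find unmatched rows, for example customers with no transactions.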
Example of an outer join:
Python
outer_m_df = pd.merge(df1, df2, on='Name', how='outer')
print(outer_m_df)
Output:
Outer Join of DataFrames
When to Use: concat() vs merge()
concat():
- When you want to stack DataFrames (add rows or columns).
- When the DataFrames have similar structures.
merge():
- When you need to join DataFrames based on shared columns or indices.
- When you need different types of joins (inner, outer, etc.).
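Joining on indices, as mentioned above, can be sketched with merge()'s left_index and right_index parameters. The DataFrames ages and salaries and the derived people frame are hypothetical, chosen only to illustrate the call.

```python
import pandas as pd

# Hypothetical frames keyed by name in the index rather than in a column.
ages = pd.DataFrame({'Age': [25, 30, 35]}, index=['Alice', 'Bob', 'Charlie'])
salaries = pd.DataFrame({'Salary': [50000, 60000, 70000]},
                        index=['Alice', 'Bob', 'David'])

# left_index/right_index make merge() match on index labels instead of a column.
by_index = pd.merge(ages, salaries, left_index=True, right_index=True)
print(by_index)

# Mixed case: the key is a column on the left and the index on the right.
people = ages.reset_index().rename(columns={'index': 'Name'})
mixed = pd.merge(people, salaries, left_on='Name', right_index=True)
print(mixed)
```

Both calls use the default inner join, so only Alice and Bob, who appear on both sides, remain in the result.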
Comparison Table - concat() vs merge()
Feature | concat() | merge() |
---|---|---|
Purpose | Stack/concatenate along an axis | Combine DataFrames based on columns or index |
Axis | Can stack along rows or columns | Joins based on common columns or index |
Join Types | Inner and outer only (join parameter), applied to the labels on the other axis | Supports inner, outer, left, and right joins |
Flexibility | Simple stacking | More complex merging with conditions |
Use Case | Stacking DataFrames row-wise or column-wise | Joining datasets based on shared columns or indices |
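One detail worth noting is that concat() also accepts a join parameter, which controls how labels on the other axis are combined when the DataFrames do not share the same columns. The sketch below uses two hypothetical DataFrames with only the Name column in common.

```python
import pandas as pd

# Hypothetical frames that share only the 'Name' column.
df1 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2 = pd.DataFrame({'Name': ['Charlie', 'David'], 'City': ['Boston', 'Austin']})

# Default join='outer': union of columns, NaN where a frame lacks one.
outer = pd.concat([df1, df2], ignore_index=True)
print(outer.columns.tolist())   # ['Name', 'Age', 'City']

# join='inner': only the columns present in every frame are kept.
inner = pd.concat([df1, df2], ignore_index=True, join='inner')
print(inner.columns.tolist())   # ['Name']
```

Unlike merge(), this does not match rows on key values. It only decides which columns (or, with axis=1, which index labels) survive the stacking.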