join Command in Linux

The join command in Linux merges the lines of two files that share a common field. It is particularly useful when related data is stored in separate files.

This guide covers the following topics −

Understanding the join Command
How to Use join Command?
Syntax of join Command
Options of join Command
Examples of join Command in Linux

Understanding the join Command

The join command reads two files and compares one field from each, called the join field. For every pair of lines whose join fields are equal, it prints one output line made up of the join field, the remaining fields of the line from file 1, and the remaining fields of the line from file 2.

Understanding the options available with the join command lets you control which field is compared, which lines are printed, and how the output is formatted.

How to Use join Command?

The default behavior of the join command is to take the first field of each file as the key for joining. However, with the options listed below, you can customize the behavior to suit your specific needs.

For example, if you want to join two files on the second field of each, you would use -1 2 -2 2 (or the shorthand -j 2). If you want to print the unpairable lines from file 1 in addition to the joined lines, you would use -a 1; to print only the unpairable lines from file 1, you would use -v 1.

Syntax of join Command

join [options] file1 file2

Options of join Command

Option	Description
-1 FIELD	Join on FIELD of file 1.
-2 FIELD	Join on FIELD of file 2.
-j FIELD	Join on FIELD in both files. This is equivalent to specifying -1 FIELD -2 FIELD.
-a FILENUM	Also print the unpairable lines from file FILENUM (either 1 or 2), in addition to the joined lines.
-v FILENUM	Like -a FILENUM, but suppress the joined output lines, so that only the unpairable lines from file FILENUM are printed.
-e EMPTY	Replace missing input fields with the specified EMPTY string.
-i or --ignore-case	Ignore differences in case when comparing the fields.
-o FORMAT	Construct each output line according to FORMAT, a comma-separated list of FILENUM.FIELD entries (e.g., 1.2 is the second field of file 1); 0 stands for the join field.
-t CHAR	Use CHAR as the field separator for both input and output. By default, fields are separated by blanks on input and by a single space on output.
--check-order	Check that the input is correctly sorted, even if all input lines are pairable.
--nocheck-order	Do not check that the input is correctly sorted.
--help	Display a help message and exit.
--version	Display version information and exit.

By mastering these options, you can efficiently combine data from multiple sources, making the join command a powerful tool in your Linux toolkit.

The join command requires both input files to be sorted on the join field. If they are not, sort them first, for example with the sort command. The --nocheck-order option only disables the sort-order check; it does not make join work on unsorted input, so use it only if you're certain the files are already sorted correctly.
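The sorting requirement can be sketched as follows. This is a minimal illustration, assuming GNU coreutils; the file names names.txt and ids.txt and their contents are hypothetical. Setting LC_ALL=C makes sort and join use the same collation order, and sort -k 1b,1 sorts on the first field while ignoring leading blanks.

```shell
# Hypothetical sample data, deliberately left unsorted.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '3 HEMANT' '1 AAYUSH' '2 APAAR' > names.txt
printf '%s\n' '2 102' '3 103' '1 101' > ids.txt

# Sort both files on the join field (field 1) before joining.
sort -k 1b,1 names.txt > names.sorted
sort -k 1b,1 ids.txt > ids.sorted

result=$(join names.sorted ids.sorted)
printf '%s\n' "$result"
```

Sorting into separate files keeps the original data untouched, which may be preferable when the input order matters elsewhere.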

Examples of join Command in Linux

Take a look at the following examples to get a clear understanding of how the join command works in Linux −

Basic Usage
Specifying the Join Field
Joining on Different Fields
Including Unpairable Lines
Changing the Output Format
Using a Different Field Separator
Case-Insensitive Joining
Checking for Sorted Input
Redirecting Output to a File

Basic Usage

The simplest form of the join command is when you have two files with a common field, usually the first column. Consider two files, file1.txt and file2.txt, with the following content −

file1.txt −

1 AAYUSH
2 APAAR
3 HEMANT
4 KARTIK

file2.txt −

To join these files, you would use −

join file1.txt file2.txt
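A runnable sketch of the basic join is shown here. The contents of file1.txt come from the listing above; the contents of file2.txt are an assumption made up for illustration (an ID followed by a hypothetical number), not the original data.

```shell
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file1.txt as listed above.
printf '%s\n' '1 AAYUSH' '2 APAAR' '3 HEMANT' '4 KARTIK' > file1.txt
# file2.txt is assumed sample data; note that ID 3 has no entry.
printf '%s\n' '1 101' '2 102' '4 104' > file2.txt

# Join on the first field of each file (the default).
result=$(join file1.txt file2.txt)
printf '%s\n' "$result"
```

With this sample data, the line for ID 3 would not appear in the output, because it has no match in file2.txt.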

Specifying the Join Field

If the common field is not the first column, you can specify the join field using the -1 and -2 options followed by the field number. For example −

join -1 2 -2 1 file1.txt file2.txt
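As a hedged illustration of -1 2 -2 1, the sketch below uses two hypothetical files, ids.txt and cities.txt. The name is the second field in ids.txt and the first field in cities.txt, and each file is already sorted on its own join field.

```shell
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data: ids.txt is sorted on field 2, cities.txt on field 1.
printf '%s\n' '101 AAYUSH' '102 APAAR' '103 HEMANT' > ids.txt
printf '%s\n' 'AAYUSH Delhi' 'HEMANT Pune' > cities.txt

# Compare field 2 of the first file with field 1 of the second file.
result=$(join -1 2 -2 1 ids.txt cities.txt)
printf '%s\n' "$result"
```

The join field is printed first, followed by the remaining fields of the first file and then those of the second file.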

Joining on Different Fields

You can join two files on different fields in each file using -1 for the first file and -2 for the second file. For instance −

join -1 1 -2 2 file1.txt file2.txt

Including Unpairable Lines

By default, join only outputs lines that have a match in both files. To include lines from the first file that don't have a corresponding match in the second file, use the -a option −

join -a 1 file1.txt file2.txt
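The sketch below contrasts -a 1 with the related -v 1 option, which prints only the unpairable lines. The contents of file2.txt are assumed sample data.

```shell
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1 AAYUSH' '2 APAAR' '3 HEMANT' '4 KARTIK' > file1.txt
# Assumed sample data: there is no line for ID 3.
printf '%s\n' '1 101' '2 102' '4 104' > file2.txt

# -a 1: joined lines plus the unpairable lines from file 1.
with_unpaired=$(join -a 1 file1.txt file2.txt)
# -v 1: only the unpairable lines from file 1.
only_unpaired=$(join -v 1 file1.txt file2.txt)

printf '%s\n' "$with_unpaired"
printf '%s\n' "$only_unpaired"
```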

Changing the Output Format

The -o option allows you to customize the output format. Each entry has the form FILENUM.FIELD, so to output only the second field of file1.txt followed by the second field of file2.txt, you would use −

join -o 1.2,2.2 file1.txt file2.txt
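A small sketch of this format follows, again with assumed contents for file2.txt. Because the list names only 1.2 and 2.2, the join field itself is left out of the output.

```shell
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1 AAYUSH' '2 APAAR' '3 HEMANT' '4 KARTIK' > file1.txt
# Assumed sample data for the second file.
printf '%s\n' '1 101' '2 102' '4 104' > file2.txt

# Print field 2 of file 1, then field 2 of file 2, for each matched pair.
result=$(join -o 1.2,2.2 file1.txt file2.txt)
printf '%s\n' "$result"
```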

Using a Different Field Separator

The -t option lets you specify a different field separator if your files don't use whitespace. For example, if your files use a colon −

join -t ':' file1.txt file2.txt
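The following sketch uses hypothetical colon-separated files to show that -t applies to the output as well as the input.

```shell
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical colon-separated data.
printf '%s\n' '1:AAYUSH' '2:APAAR' > file1.txt
printf '%s\n' '1:101' '2:102' > file2.txt

# The colon is used to split input fields and to separate output fields.
result=$(join -t ':' file1.txt file2.txt)
printf '%s\n' "$result"
```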

Case-Insensitive Joining

The -i option allows you to perform a case-insensitive join. This is useful when the case of text in the join field may not match −

join -i file1.txt file2.txt
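To illustrate, the sketch below joins two hypothetical files whose keys differ only in case. Without -i the keys do not match, so nothing is printed; with -i both lines are paired.

```shell
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data: lowercase keys in one file, uppercase in the other.
printf '%s\n' 'a AAYUSH' 'b APAAR' > lower.txt
printf '%s\n' 'A 101' 'B 102' > upper.txt

# Case-sensitive comparison: no keys match.
plain=$(join lower.txt upper.txt)
# Case-insensitive comparison: both lines are paired.
ignore_case=$(join -i lower.txt upper.txt)

printf 'without -i: [%s]\n' "$plain"
printf '%s\n' "$ignore_case"
```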

Checking for Sorted Input

The --check-order option checks that the input is correctly sorted, which is a requirement for the join command to work properly −

join --check-order file1.txt file2.txt
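The sketch below shows one way to use the exit status of join --check-order in a script. It assumes GNU join, which reports a sorting error and exits with a non-zero status when --check-order finds out-of-order input; the file names and contents are hypothetical.

```shell
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data: unsorted.txt is out of order on the join field.
printf '%s\n' '2 APAAR' '1 AAYUSH' '3 HEMANT' > unsorted.txt
printf '%s\n' '1 AAYUSH' '2 APAAR' '3 HEMANT' > sorted.txt
printf '%s\n' '1 101' '2 102' '3 103' > ids.txt

# Record whether join accepts each input as correctly sorted.
if join --check-order unsorted.txt ids.txt > /dev/null 2>&1; then
    unsorted_status=ok
else
    unsorted_status=failed
fi

if join --check-order sorted.txt ids.txt > /dev/null 2>&1; then
    sorted_status=ok
else
    sorted_status=failed
fi

printf 'unsorted input: %s\nsorted input: %s\n' "$unsorted_status" "$sorted_status"
```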

Redirecting Output to a File

To save the output of the join operation to a new file, redirect the output using the > operator −

join file1.txt file2.txt > joined.txt

Print all lines from file1, including those with no match in file2 −

join -a 1 file1.txt file2.txt

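Replace missing fields with "missing". In practice, -e is combined with -o (to name the output fields) and -a (to keep the unpairable lines) −

join -a 1 -e "missing" -o 0,1.2,2.2 file1.txt file2.txt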
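A hedged sketch of how -e is commonly combined with -a and -o is shown below. The contents of file2.txt are assumed sample data, and 0 in the -o list stands for the join field.

```shell
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1 AAYUSH' '2 APAAR' '3 HEMANT' '4 KARTIK' > file1.txt
# Assumed sample data: there is no line for ID 3.
printf '%s\n' '1 101' '2 102' '4 104' > file2.txt

# Keep unpairable lines from file 1 and fill the absent 2.2 field.
result=$(join -a 1 -e "missing" -o 0,1.2,2.2 file1.txt file2.txt)
printf '%s\n' "$result"
```

Naming the fields with -o gives every output line the same number of columns, which is what lets the filler string appear in place of the absent field.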

Output fields in a specific format (here, the first field of file 2 followed by the second field of file 1) −

join -o 2.1,1.2 file1.txt file2.txt

Use a custom field separator −

join -t "," file1.txt file2.txt

Ignore case when comparing fields −

join -i file1.txt file2.txt

Conclusion

The join command is a powerful tool for combining related data sets. With these examples, you should be able to leverage its capabilities to streamline your data processing tasks in Linux.

Remember, the key to effectively using the join command is ensuring that your input files are properly sorted on the join field. With a bit of practice, you'll find the join command to be an indispensable part of your Linux toolkit.
