Python Pandas - LaTeX



The Pandas library allows exporting DataFrame, Series, and Styler objects into LaTeX tabular representations, enabling easy integration of tabular data into LaTeX documents.

LaTeX is a high-quality typesetting system used for creating professional documents, particularly in scientific and technical fields. It can handle complex mathematical formulas and scientific symbols. The LaTeX tabular environment arranges content in rows and columns according to a column specification. Each cell holds text or data, and each row ends with an explicit row terminator.

Below is an example of a simple LaTeX tabular representation, where rows are separated by "\\" and columns by "&". The column specification {lrr} left-aligns the first column and right-aligns the other two −

\begin{tabular}{lrr}
 & Col1 & Col2 \\
Row1 & 1 & 2 \\
Row2 & 3 & 4 \\
\end{tabular}
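
To make this structure concrete, the following minimal sketch builds the same table by hand with plain Python string operations. The helper name rows_to_tabular is hypothetical and is not part of Pandas; it only illustrates how "&" and "\\" delimit cells and rows.

```python
def rows_to_tabular(header, rows, column_format):
    """Build a LaTeX tabular string by hand (illustrative helper, not a Pandas API)."""
    lines = [f"\\begin{{tabular}}{{{column_format}}}"]
    for cells in [header, *rows]:
        # Cells are joined with "&"; every row is terminated by "\\"
        lines.append(" & ".join(str(cell) for cell in cells) + " \\\\")
    lines.append("\\end{tabular}")
    return "\n".join(lines)


table = rows_to_tabular(
    header=["", "Col1", "Col2"],
    rows=[["Row1", 1, 2], ["Row2", 3, 4]],
    column_format="lrr",
)
print(table)
```

In practice, to_latex() generates this markup for you, so a helper like this is useful only for understanding the format.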

In this tutorial, we will learn how to export Pandas objects to LaTeX using the to_latex() method, including advanced customization options provided by the Styler object.

Key Considerations for Working with LaTeX in Pandas

Here are some important points to keep in mind when working with LaTeX in Pandas −

  • Styler Integration: The DataFrame.to_latex() method now internally uses the Styler.to_latex() implementation, which offers greater flexibility and advanced formatting capabilities.

  • Installation Requirements: The jinja2 library is required to use the LaTeX export functionality in Pandas v2.0.0 and later.

  • Export-Only Functionality: Currently, Pandas only supports exporting data to LaTeX. It does not provide the ability to read LaTeX files.

Writing Pandas Objects to LaTeX

You can export a Pandas DataFrame, Series, or Styler object to LaTeX using the to_latex() method. This method converts Pandas data into LaTeX tables and supports environments such as tabular and longtable, as well as nested layouts built with multirow and multicolumn cells.

Example

This example shows how to convert a Pandas DataFrame to a LaTeX tabular representation using the Styler.to_latex() method, accessed through df.style.

import pandas as pd

# Create a simple DataFrame
df = pd.DataFrame([[1, 2], [3, 4]], index=["a", "b"], columns=["c", "d"])

# Display the Input DataFrame
print("Original DataFrame:")
print(df)

# Export to LaTeX using Styler
latex_output = df.style.to_latex()

print("\nGenerated LaTeX Code:")
print(latex_output)

When we run the above program, it produces the following result −

Original DataFrame:
   c  d
a  1  2
b  3  4

Generated LaTeX Code:
\begin{tabular}{lrr}
 & c & d \\
a & 1 & 2 \\
b & 3 & 4 \\
\end{tabular}

Formatting Values Before Exporting to LaTeX

You can format table values before exporting using either the formatters parameter of the to_latex() method or the Styler.format() method.

Example: Formatting Values using the to_latex() Method

This example demonstrates formatting DataFrame values before exporting to a LaTeX table using the DataFrame.to_latex() method and its formatters parameter.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({"Col_1": range(3), "Col_2": ['a', 'b', 'c']})
print("Original DataFrame:")
print(df)

# Convert DataFrame to LaTeX format
latex_output = df.to_latex(index=False, formatters={"Col_1": float, "Col_2": str.upper})

print("\nFormatted LaTeX Output:")
print(latex_output)

Following is the output of the above code −

Original DataFrame:
   Col_1 Col_2
0      0     a
1      1     b
2      2     c

Formatted LaTeX Output:
\begin{tabular}{rl}
\toprule
Col_1 & Col_2 \\
\midrule
0.0 & A \\
1.0 & B \\
2.0 & C \\
\bottomrule
\end{tabular}

Example: Formatting Values using the Styler.format() Method

You can format the values in a DataFrame before exporting using the Styler.format() method. This is especially useful for displaying currency, percentages, or custom string formatting.

This example uses the Styler.format() method to format the values of a Pandas DataFrame before generating LaTeX output.

import pandas as pd

# Create a simple DataFrame
df = pd.DataFrame({"Col_1": range(3), "Col_2": ['a', 'b', 'c']}, index=['r1', 'r2', 'r3'])

# Display the Input DataFrame
print("Original DataFrame:")
print(df)

# Keep Col_1 as-is and convert Col_2 to uppercase before LaTeX export
latex_output = df.style.format({"Col_1": "{}", "Col_2": str.upper}).to_latex()

print("\nFormatted LaTeX Output:")
print(latex_output)

While executing the above code, we get the following output −

Original DataFrame:
    Col_1 Col_2
r1      0     a
r2      1     b
r3      2     c

Formatted LaTeX Output:
\begin{tabular}{lrl}
 & Col_1 & Col_2 \\
r1 & 0 & A \\
r2 & 1 & B \\
r3 & 2 & C \\
\end{tabular}

Exporting Hierarchically Indexed Objects to LaTeX

Pandas supports exporting hierarchically indexed objects (MultiIndex rows and MultiIndex columns) to LaTeX, making it suitable for complex datasets.

Example

This example shows how to export a MultiIndex DataFrame to LaTeX using the multicolumn and multirow parameters of the to_latex() method.

import pandas as pd
import numpy as np

# Create hierarchical indexing for rows and columns
row_index = pd.MultiIndex.from_arrays(
    [["BMW", "BMW", "Lexus", "Lexus", "Audi", "Audi"],
     ["1", "2", "1", "2", "1", "2"]],
    names=["Brand", "Model"]
)

column_index = pd.MultiIndex.from_arrays(
    [["Performance", "Performance", "Price", "Price"],
     ["Speed", "Mileage", "USD", "Discount"]],
    names=["Attribute", "Details"]
)

# Create a MultiIndex DataFrame
data = np.random.rand(6, 4)
df = pd.DataFrame(data, index=row_index, columns=column_index)

print("Original MultiIndexed DataFrame:")
print(df)

# Convert DataFrame to LaTeX with multirow and multicolumn options
latex_output = df.to_latex(multicolumn=True, multirow=True)

print("\nMultiIndex LaTeX Output:")
print(latex_output)

When we run the above program, it produces the following result. Because the data comes from np.random.rand(), the values differ on every run −

Original MultiIndexed DataFrame:
Attribute   Performance               Price
Details           Speed   Mileage       USD  Discount
Brand Model
BMW   1        0.346620  0.315143  0.213465  0.444698
      2        0.588380  0.378466  0.709013  0.836503
Lexus 1        0.586889  0.343801  0.202830  0.737780
      2        0.559294  0.896489  0.501153  0.440781
Audi  1        0.228265  0.654666  0.606819  0.598872
      2        0.520143  0.447575  0.088034  0.140516

MultiIndex LaTeX Output:
\begin{tabular}{llrrrr}
\toprule
 & Attribute & \multicolumn{2}{r}{Performance} & \multicolumn{2}{r}{Price} \\
 & Details & Speed & Mileage & USD & Discount \\
Brand & Model &  &  &  &  \\
\midrule
\multirow[t]{2}{*}{BMW} & 1 & 0.346620 & 0.315143 & 0.213465 & 0.444698 \\
 & 2 & 0.588380 & 0.378466 & 0.709013 & 0.836503 \\
\cline{1-6}
\multirow[t]{2}{*}{Lexus} & 1 & 0.586889 & 0.343801 & 0.202830 & 0.737780 \\
 & 2 & 0.559294 & 0.896489 & 0.501153 & 0.440781 \\
\cline{1-6}
\multirow[t]{2}{*}{Audi} & 1 & 0.228265 & 0.654666 & 0.606819 & 0.598872 \\
 & 2 & 0.520143 & 0.447575 & 0.088034 & 0.140516 \\
\cline{1-6}
\bottomrule
\end{tabular}