Skip to content

TIMESTAMP_NTZ datatype used by DataFrame.to_sql but fails on read_sql_table #295

@cdcadman

Description

@cdcadman

I'm using:
python 3.12
pandas==2.1.3
sqlalchemy==2.0.22
databricks-sql-connector==3.0.0
Windows

If I have a pandas DataFrame with a datetime64[ns] data type and use to_sql to write it into databricks, then read_sql_query to read the same table, it fails to handle the TIMESTAMP_NTZ data type.

Here is example code:

from os import environ

from pandas import DataFrame, Timestamp, read_sql_table
from sqlalchemy import create_engine
from sqlalchemy.event import listen

host = environ["DATABRICKS_HOST"]
http_path = environ["DATABRICKS_HTTP_PATH"]  # SQL Warehouse
catalog = environ["DATABRICKS_CATALOG"]
schema = environ["DATABRICKS_SCHEMA"]


def do_connect(dialect, conn_rec, cargs, cparams):
    cparams["access_token"] = environ["DATABRICKS_ACCESS_TOKEN"]


engine = create_engine(
    f"databricks://{host}:443?http_path={http_path}&catalog={catalog}&schema={schema}"
)
listen(engine, "do_connect", do_connect)
with engine.connect() as conn:
    dtf = DataFrame(
        {
            "timestamp_field": [
                Timestamp("12/31/2023 12:00pm"),
                Timestamp("1/31/2023 2:00pm"),
            ]
        }
    )
    conn.exec_driver_sql("drop table if exists test_timestamp")
    dtf.to_sql("test_timestamp", conn, index=False)
    read_sql_table("test_timestamp", conn)

Here is my output:

C:\Temp>python databricks_timestamp_ntz.py
C:\Temp\databricks_timestamp_ntz.py:17: SADeprecationWarning: The dbapi() classmethod on dialect classes has been renamed to import_dbapi().  Implement an import_dbapi() classmethod directly on class <class 'databricks.sqlalchemy.base.DatabricksDialect'> to remove this warning; the old .dbapi() classmethod may be maintained for backwards compatibility.
  engine = create_engine(
Traceback (most recent call last):
  File "C:\Temp\databricks_timestamp_ntz.py", line 32, in <module>
    read_sql_table("test_timestamp", conn)
  File "C:\Program Files\Python312\Lib\site-packages\pandas\io\sql.py", line 348, in read_sql_table
    table = pandas_sql.read_table(
            ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\pandas\io\sql.py", line 1665, in read_table
    self.meta.reflect(bind=self.con, only=[table_name], views=True)
  File "C:\Program Files\Python312\Lib\site-packages\sqlalchemy\sql\schema.py", line 5788, in reflect
    _reflect_info = insp._get_reflection_info(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\sqlalchemy\engine\reflection.py", line 2006, in _get_reflection_info
    columns=run(
            ^^^^
  File "C:\Program Files\Python312\Lib\site-packages\sqlalchemy\engine\reflection.py", line 1992, in run
    res = meth(filter_names=_fn, **kw)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\sqlalchemy\engine\reflection.py", line 921, in get_multi_columns
    table_col_defs = dict(
                     ^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\sqlalchemy\engine\default.py", line 1099, in _default_multi_reflect
    single_tbl_method(
  File "C:\Program Files\Python312\Lib\site-packages\databricks\sqlalchemy\base.py", line 150, in get_columns
    row_dict = parse_column_info_from_tgetcolumnsresponse(col)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\databricks\sqlalchemy\_parse.py", line 330, in parse_column_info_from_tgetcolumnsresponse
    _col_type = GET_COLUMNS_TYPE_MAP[_raw_col_type]
                ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: 'timestamp_ntz'

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions