
Importers

Base infrastructure

import_data_table() is the shared back end called by every importer. It cleans the column headers, issues CREATE TABLE when requested, and bulk-inserts the rows via db.populate_table(). A format-specific importer only needs to parse its file into headers and rows and then call import_data_table().

If you are adding support for a new file format, start with the Adding Importers guide.
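The division of labour described above can be sketched as follows. This is an illustration only, not the library's code: the import_data_table() defined here is a stand-in that records its arguments so the example runs on its own, and import_pipe_file() and the pipe-delimited format are hypothetical.

```python
import io


def import_data_table(db, schemaname, tablename, is_new, hdrs, data):
    """Stand-in for the real back end: it only records what it was given.

    The real function cleans headers, creates the table if requested,
    and bulk-inserts the rows; none of that is reproduced here.
    """
    db.append((schemaname, tablename, is_new, list(hdrs), [list(r) for r in data]))


def import_pipe_file(db, schemaname, tablename, lines, is_new):
    """Hypothetical importer for a pipe-delimited format.

    Its only job is to turn the input into headers and rows.
    """
    rows = [line.rstrip("\n").split("|") for line in lines if line.strip()]
    if not rows:
        raise ValueError("no header line found")
    hdrs, data = rows[0], rows[1:]
    # Everything database-related is delegated to the shared back end.
    import_data_table(db, schemaname, tablename, is_new, hdrs, data)


# A plain list plays the role of the database object in this sketch.
calls = []
source = io.StringIO("id|name\n1|Ada\n2|Linus\n")
import_pipe_file(calls, "public", "people", source, True)
```

Keeping the parser free of any database logic means a new format needs only a function of this shape.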

__all__ = ['import_data_table'] module-attribute

import_data_table(db, schemaname, tablename, is_new, hdrs, data)

Create (if needed) and populate a database table from in-memory row data.

Format readers

Each module below implements one or more import formats.

__all__ = ['importfile', 'importtable'] module-attribute

importfile(db, schemaname, tablename, column_name, filename)

Import an entire file as a single value into a table column.
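As a rough illustration of storing a whole file as one value, the sketch below uses the standard-library sqlite3 module in place of the library's own db object. The helper import_file_as_value() and the table and column names are hypothetical, and the real importfile() may read, encode, or insert the data differently.

```python
import os
import sqlite3
import tempfile


def import_file_as_value(conn, tablename, column_name, filename):
    """Read the file in full and insert its bytes as a single column value."""
    with open(filename, "rb") as f:
        payload = f.read()
    # Identifiers are interpolated directly, so they must come from a
    # trusted source; the file content itself is passed as a bound parameter.
    conn.execute(
        f'INSERT INTO "{tablename}" ("{column_name}") VALUES (?)', (payload,)
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body BLOB)")

# Write a small binary file to import, then clean it up afterwards.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(b"hello\x00world")
import_file_as_value(conn, "docs", "body", tmp.name)
os.remove(tmp.name)

stored = conn.execute("SELECT body FROM docs").fetchone()[0]
```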

importtable(db, schemaname, tablename, filename, is_new, skip_header_line=True, quotechar=None, delimchar=None, encoding=None, junk_header_lines=0)

Import a delimited text file into a new or existing database table.
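The parsing step might look roughly like the sketch below, which maps delimchar, quotechar, and junk_header_lines onto Python's csv module. The helper parse_delimited() is hypothetical. How importtable() treats None for quotechar and delimchar, and the exact meaning of skip_header_line, are not stated here, so the sketch uses explicit values and leaves those parameters out.

```python
import csv
import io


def parse_delimited(stream, delimchar=",", quotechar='"', junk_header_lines=0):
    """Return (hdrs, data) from a delimited text stream."""
    # Discard leading lines that precede the real header (titles, notes).
    for _ in range(junk_header_lines):
        stream.readline()
    reader = csv.reader(stream, delimiter=delimchar, quotechar=quotechar)
    hdrs = next(reader, None)
    if hdrs is None:
        raise ValueError("no header line found")
    # Drop completely empty lines; keep every other row as a list of strings.
    data = [row for row in reader if row]
    return hdrs, data


sample = io.StringIO(
    "Exported 2024-01-01\n"
    "id;name\n"
    "1;'Lovelace; Ada'\n"
    "2;Linus\n"
)
hdrs, data = parse_delimited(
    sample, delimchar=";", quotechar="'", junk_header_lines=1
)
```

The quoted field keeps its embedded semicolon, which is why a quote character matters whenever the delimiter can appear inside values.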

__all__ = ['import_json'] module-attribute

import_json(db, schemaname, tablename, filename, is_new, encoding=None)

Import a JSON file into a database table.

Objects are flattened so that nested keys become dot-separated column names (e.g. address.city). Arrays within objects are stored as JSON strings.
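The flattening rule can be sketched as below. This is a minimal illustration of the behaviour described above, not the library's implementation; the flatten() helper is hypothetical, and details such as how key collisions or null values are handled are not covered.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level with dot-separated keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: recurse, extending the dotted prefix.
            flat.update(flatten(value, name))
        elif isinstance(value, list):
            # Array: keep it in one column, serialized as a JSON string.
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


record = {
    "name": "Ada",
    "address": {"city": "London", "geo": {"lat": 51.5}},
    "tags": ["math", "code"],
}
flat = flatten(record)
```

Here flat has the keys name, address.city, address.geo.lat, and tags, with tags holding the JSON text of the original array.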

__all__ = ['importods', 'ods_data'] module-attribute

importods(db, schemaname, tablename, is_new, filename, sheetname, junk_header_rows)

Import an ODS worksheet into a new or existing database table.

ods_data(filename, sheetname, junk_header_rows=0)

Return the data from the specified worksheet as a list of column headers and a list of rows, where each row is itself a list of cell values.
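The shape of that result can be illustrated with an in-memory grid standing in for a worksheet. The split_sheet() helper is hypothetical and reads no ODS file. It assumes that junk_header_rows counts the rows to discard before the header row.

```python
def split_sheet(grid, junk_header_rows=0):
    """Split a worksheet-like grid into (hdrs, rows)."""
    # Skip rows above the header, such as a title or an export note.
    body = grid[junk_header_rows:]
    if not body:
        raise ValueError("no header row found")
    hdrs = list(body[0])
    rows = [list(r) for r in body[1:]]
    return hdrs, rows


grid = [
    ["Quarterly report", None, None],  # junk row above the header
    ["region", "units", "revenue"],
    ["north", 12, 340.5],
    ["south", 7, 198.0],
]
hdrs, rows = split_sheet(grid, junk_header_rows=1)
```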

__all__ = ['importxls', 'xls_data'] module-attribute

importxls(db, schemaname, tablename, is_new, filename, sheetname, junk_header_rows, encoding)

Import an XLS or XLSX worksheet into a new or existing database table.

xls_data(filename, sheetname, junk_header_rows, encoding=None)

Return the data from the specified worksheet as a list of column headers and a list of rows, where each row is itself a list of cell values.

__all__ = ['import_feather', 'import_parquet'] module-attribute

import_feather(db, schemaname, tablename, filename, is_new)

Import an Apache Arrow Feather (IPC) file into a database table.

import_parquet(db, schemaname, tablename, filename, is_new)

Import a Parquet file into a database table.