Adding a New Importer
This guide walks through every step required to add support for a new import file format to execsql. The process touches two files: the importer module, where you write the importer function, and the IMPORT handler, where you register the format string.
Background: How Importers Work
Importers are standalone module-level functions. There is no base class to subclass — every importer follows the same pattern: parse the source file into column headers and rows, then hand off to import_data_table() (the shared back-end in src/execsql/importers/base.py) to create the table and load the data.
The shared back-end
import_data_table(db, schemaname, tablename, is_new, hdrs, data) handles everything after parsing:
- Cleans column headers according to config settings (trimming, folding, deduplication).
- Issues CREATE TABLE when is_new is 1 or 2.
- Calls db.populate_table() to bulk-insert the rows.
- Commits the transaction.
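To make the header-cleaning step concrete, here is a minimal sketch of trimming, case folding, and deduplication. The function name clean_headers_sketch, the fold parameter, and the numeric-suffix rule for duplicates are all assumptions made for illustration; the real rules are driven by execsql's config settings and may differ.

```python
from typing import Sequence


def clean_headers_sketch(hdrs: Sequence[str], fold: str = "lower") -> list[str]:
    """Trim, case-fold, and de-duplicate column headers (illustrative rules only)."""
    cleaned: list[str] = []
    seen: dict[str, int] = {}
    for raw in hdrs:
        name = raw.strip()  # trimming
        if fold == "lower":  # folding
            name = name.lower()
        elif fold == "upper":
            name = name.upper()
        count = seen.get(name, 0) + 1  # deduplication
        seen[name] = count
        # Assumed rule: the second and later duplicates get a numeric suffix.
        cleaned.append(name if count == 1 else f"{name}_{count}")
    return cleaned


result = clean_headers_sketch([" ID", "Name ", "name", "NAME"])
print(result)
```

The sketch does not guard against a generated name such as name_2 colliding with a header that already has that name; a production version would need to.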
Your importer just needs to open and parse the source file, then call import_data_table().
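The division of labour can be sketched without execsql installed. In the example below, import_data_table_sketch is a hypothetical stand-in for the shared back-end, written against the standard-library sqlite3 module, and parse_pipe_text is a toy parser for an invented pipe-delimited format. Neither is execsql code; the sketch only mirrors the is_new semantics described in this guide.

```python
import sqlite3
from typing import Any, Iterable, Sequence


def import_data_table_sketch(
    conn: sqlite3.Connection,
    tablename: str,
    is_new: int,
    hdrs: Sequence[str],
    data: Iterable[Sequence[Any]],
) -> None:
    """Hypothetical stand-in for the shared back-end (not the execsql code)."""
    cols = [h.strip() for h in hdrs]  # the real cleaning is config-driven
    col_list = ", ".join(f'"{c}"' for c in cols)
    if is_new == 2:
        # 2 = drop and re-create
        conn.execute(f'DROP TABLE IF EXISTS "{tablename}"')
    if is_new in (1, 2):
        # 1 or 2 = issue CREATE TABLE (untyped columns keep the sketch short)
        conn.execute(f'CREATE TABLE "{tablename}" ({col_list})')
    # 0 = append, so the table must already exist at this point.
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(
        f'INSERT INTO "{tablename}" ({col_list}) VALUES ({placeholders})', data
    )
    conn.commit()


def parse_pipe_text(text: str) -> tuple[list[str], list[list[str]]]:
    """Toy parser: the first line holds headers, the remaining lines hold rows."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    hdrs = lines[0].split("|")
    rows = [ln.split("|") for ln in lines[1:]]
    return hdrs, rows


# The "importer" part: parse, then hand off to the back-end.
conn = sqlite3.connect(":memory:")
hdrs, rows = parse_pipe_text(" ID |Name\n1|Ada\n2|Grace\n")
import_data_table_sketch(conn, "people", 1, hdrs, rows)
loaded = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
print(loaded)
```

The parser knows nothing about tables, and the back-end knows nothing about the file format. That separation is what keeps a new importer small.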
The IMPORT dispatch
Unlike EXPORT, the IMPORT machinery in src/execsql/metacommands/io_import.py is split. The main x_import handler dispatches by file extension (.csv, .txt, .ods, .xls/.xlsx) and calls the matching importer directly. Some formats (e.g. Feather, Parquet) have their own top-level handlers — x_import_feather, x_import_parquet — registered separately in the dispatch table.
When adding a new format, pick the right pattern: extend the extension-dispatch in x_import for a format that fits the standard IMPORT … TO … shape, or write a new x_import_<format> handler if the format needs its own metacommand syntax.
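As a rough illustration of the first pattern, the sketch below dispatches on file extension using a lookup table. Every name in it (dispatch_import, IMPORTERS_BY_EXTENSION, and the stub importers) is hypothetical, and the real x_import handler may well be structured as an if/elif chain instead; the point is only to show where a new extension would plug in.

```python
from pathlib import Path
from typing import Callable


# Hypothetical stand-ins for importer functions; each reports what it would do.
def import_delimited(filename: str) -> str:
    return f"delimited:{filename}"


def import_ods(filename: str) -> str:
    return f"ods:{filename}"


def import_xls(filename: str) -> str:
    return f"xls:{filename}"


def import_myformat(filename: str) -> str:
    return f"myformat:{filename}"


IMPORTERS_BY_EXTENSION: dict[str, Callable[[str], str]] = {
    ".csv": import_delimited,
    ".txt": import_delimited,
    ".ods": import_ods,
    ".xls": import_xls,
    ".xlsx": import_xls,
    ".myfmt": import_myformat,  # the new format plugs in here
}


def dispatch_import(filename: str) -> str:
    """Pick an importer from the file extension, ignoring case."""
    ext = Path(filename).suffix.lower()
    try:
        importer = IMPORTERS_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"No importer registered for extension {ext!r}") from None
    return importer(filename)


print(dispatch_import("data.MYFMT"))
```

A format that cannot be recognised from its extension, or that needs extra arguments in the metacommand, is a better fit for the second pattern, a dedicated handler.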
Step-by-step: Adding an Importer
Step 1 — Write the importer function
Create src/execsql/importers/myformat.py:
# src/execsql/importers/myformat.py
"""
MyFormat import for execsql.

Provides :func:`importtable_myformat`, which reads a `.myfmt` file and
loads it into a database table.
"""
# The module docstring must come first; __future__ imports follow it.
from __future__ import annotations

from pathlib import Path
from typing import Any

from execsql.db.base import Database
from execsql.exceptions import ErrInfo
from execsql.importers.base import import_data_table
import execsql.state as _state


def importtable_myformat(
    db: Database,
    schemaname: str | None,
    tablename: str,
    filename: str,
    is_new: Any,
    encoding: str | None = None,
) -> None:
    """Import *filename* (MyFormat) into *tablename*.

    Args:
        db: Active database connection.
        schemaname: Schema name, or ``None`` for the default schema.
        tablename: Target table name.
        filename: Path to the source file.
        is_new: ``1`` to CREATE the table, ``2`` to DROP and re-CREATE,
            ``0`` to append to an existing table.
        encoding: File encoding override. Falls back to the configured
            import encoding.
    """
    if not Path(filename).is_file():
        raise ErrInfo(
            type="error",
            other_msg=f"Non-existent file ({filename}) used with the IMPORT metacommand",
        )
    enc = encoding if encoding else _state.conf.import_encoding
    try:
        import myformat_lib  # lazy import of optional dependency
    except ImportError as exc:
        # Chain the original error so the traceback shows the root cause.
        raise ErrInfo(
            type="error",
            other_msg="The myformat_lib package is required to import MyFormat files.",
        ) from exc
    # Parse headers and rows from the source file.
    reader = myformat_lib.open(filename, encoding=enc)
    hdrs = reader.column_names()  # list[str]
    rows = reader.iter_rows()  # iterable of list[Any]
    # Hand off to the shared back-end, which creates the table and loads the rows.
    import_data_table(db, schemaname, tablename, is_new, hdrs, rows)
Key points:
- Naming is not standardized. Existing importers use importtable (CSV), importods, importxls, import_feather, import_parquet. Pick a descriptive function name; the example importtable_myformat is one option, not a required convention.
- Call import_data_table(); do not call db.populate_table() directly. The shared back-end handles column header cleaning, CREATE TABLE, and commit.
- Import optional dependencies lazily inside the function body so execsql still runs for users who do not have the library installed.
- Raise ErrInfo for expected failures rather than a bare raise or sys.exit.
- The is_new parameter values are: 0 = append to existing table, 1 = create new table, 2 = drop and re-create.
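The lazy-import advice can be factored into a small helper if several importers need it. The sketch below is one way to do that with the standard-library importlib module. ImporterError is a hypothetical stand-in for ErrInfo so that the example runs on its own, and require_optional is an invented name, not an execsql function.

```python
import importlib
from types import ModuleType


class ImporterError(Exception):
    """Hypothetical stand-in for execsql's ErrInfo, so the sketch runs alone."""


def require_optional(module_name: str, purpose: str) -> ModuleType:
    """Import *module_name* on demand, or raise a readable error."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        raise ImporterError(
            f"The {module_name} package is required to {purpose}."
        ) from None


# An installed module is returned as usual.
json_mod = require_optional("json", "run this demo")

# A missing module produces the friendly error instead of a raw ImportError.
try:
    require_optional("myformat_lib_that_is_not_installed", "import MyFormat files")
except ImporterError as exc:
    message = str(exc)
    print(message)
```

Because the import happens only when the helper is called, users who never import this format are unaffected by the missing dependency.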
Step 2 — Register the format in the IMPORT handler
Open src/execsql/metacommands/io_import.py. At the top, import your function:
from execsql.importers.myformat import importtable_myformat
Then find the IMPORT format dispatch (the block that calls importtable for CSV, importtable_ods for ODS, etc.) and add a new elif branch:
elif filefmt == "myformat":
    importtable_myformat(
        _state.dbs.current(),
        schemaname,
        tablename,
        filename,
        is_new,
        encoding=enc,
    )
Format string naming: the format string is what the user writes after FORMAT in the IMPORT metacommand (FORMAT myformat). Use lowercase, no spaces. If you need aliases, use elif filefmt in ("myformat", "myfmt"):.
Step 3 — Add tests
# tests/importers/test_myformat_importer.py
import pytest
from pathlib import Path
from typer.testing import CliRunner

from execsql.cli import app


@pytest.fixture()
def runner():
    return CliRunner()


class TestMyFormatImporter:
    """IMPORT FORMAT myformat."""

    def test_basic_import(self, runner, tmp_path):
        db = tmp_path / "test.db"
        src = tmp_path / "data.myfmt"
        src.write_text(...)  # write a minimal test file in your format
        script = tmp_path / "test.sql"
        script.write_text(
            f"-- !x! IMPORT {src} TO mytable FORMAT myformat NEW\n"
        )
        result = runner.invoke(app, ["-tl", str(script), str(db), "-n"])
        assert result.exit_code == 0, result.output
Checklist
- Importer function written in src/execsql/importers/myformat.py
- Function imported in src/execsql/metacommands/io_import.py
- elif filefmt == "myformat": branch added in the IMPORT handler
- Test added to tests/importers/
- pytest passes locally
- New format string documented in Metacommands (IMPORT)
- New library dependency (if any) added to pyproject.toml extras and documented in Requirements