Quickstart
Read a file
import openyxdb
# As a PyArrow Table
table = openyxdb.to_pyarrow("data.yxdb")
# As a Pandas DataFrame
df = openyxdb.to_pandas("data.yxdb")
# As a Polars DataFrame
df = openyxdb.to_polars("data.yxdb")
Write a file
import openyxdb
import polars as pl
df = pl.DataFrame({"id": [1, 2, 3], "name": ["Alice", "Bob", "Carol"]})
openyxdb.from_polars(df, "output.yxdb")
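One quick way to check a write is to read the file straight back. The helper below is a minimal sketch that uses only the from_polars and to_polars calls shown above. The name roundtrip_matches is hypothetical and not part of openyxdb, and whether the comparison comes out True may depend on how column dtypes map through the YXDB field types.

```python
def roundtrip_matches(df, path):
    """Write df to path as YXDB, read it back, and report whether the frames match.

    Hypothetical helper for illustration; not part of openyxdb.
    """
    # Imported inside the function so the helper can be defined
    # even in an environment where openyxdb is not installed yet.
    import openyxdb

    # Write the Polars DataFrame, then load the same file again.
    openyxdb.from_polars(df, path)
    restored = openyxdb.to_polars(path)

    # polars.DataFrame.equals compares the two frames for equality.
    return restored.equals(df)
```

Calling roundtrip_matches(df, "roundtrip.yxdb") with the DataFrame from the example above writes the file and returns a boolean. If it returns False, comparing df.schema with the schema of the restored frame is a reasonable first step.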
Lazy scan with Polars
Importing openyxdb monkey-patches Polars automatically, adding pl.scan_yxdb and pl.read_yxdb:
import polars as pl
import openyxdb
# Lazy scan -- only requested columns are decoded from disk
df = pl.scan_yxdb("data.yxdb").select("id", "score").filter(pl.col("score") > 90).collect()
# Eager read
df = pl.read_yxdb("data.yxdb")
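To see what a lazy query will do before running it, Polars can print the query plan with LazyFrame.explain(). The sketch below assumes pl.scan_yxdb returns a regular Polars LazyFrame, which the select/filter/collect chain above suggests. The name describe_scan is hypothetical, and the exact plan text depends on the installed Polars and openyxdb versions.

```python
def describe_scan(path, columns):
    """Return the Polars query plan for selecting `columns` from a YXDB file.

    Hypothetical helper for illustration; not part of openyxdb.
    """
    import polars as pl
    import openyxdb  # noqa: F401  (imported for its side effect of adding pl.scan_yxdb)

    # Build the lazy query without executing it.
    lf = pl.scan_yxdb(path).select(columns)

    # explain() returns the plan as a string; nothing is read eagerly here.
    return lf.explain()
```

Printing describe_scan("data.yxdb", ["id", "score"]) should show whether the column selection reaches the scan node, which is what allows only the requested columns to be decoded.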
Write from a lazy plan
import polars as pl
import openyxdb
lf = pl.scan_yxdb("data.yxdb").filter(pl.col("score") > 50)
lf.yxdb.sink("filtered.yxdb")
DuckDB
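import duckdb
import openyxdb
con = duckdb.connect()
# Register data.yxdb so SQL can query it under the name "yx"
openyxdb.register_duckdb(con, "yx", "data.yxdb")
# fetchone() returns a one-element tuple holding the count
result = con.execute("SELECT COUNT(*) FROM yx WHERE score > 90").fetchone()
print(result)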
Next
- Reading -- all read paths and the high-level API
- Writing -- all write paths
- Polars integration -- lazy scans, streaming sinks, and namespace plugins
- DuckDB integration -- SQL over YXDB files
- Field types -- all 17 YXDB types and their Python/Arrow mappings