Table

Instances of class Table are handles to Pixeltable tables and views/snapshots.

Use this handle to query and update the table and to add and drop columns.

Tables are created by calling pxt.create_table. Views and snapshots are created by calling pxt.create_view (snapshots require is_snapshot=True).

To get a handle to an existing table/view/snapshot, call pxt.get_table.

Overview

Column Operations
`add_column`	Add a column to the table or view
`drop_column`	Remove a column from the table or view
`rename_column`	Rename a column

Data Operations
`insert`	Insert rows into table
`update`	Update rows in table or view
`delete`	Delete rows from table

Indexing Operations
`add_embedding_index`	Add embedding index on column
`drop_embedding_index`	Drop embedding index from column
`drop_index`	Drop index from column

Versioning
`revert`	Revert the last change

pixeltable.Table

Table(id: UUID, dir_id: UUID, name: str, tbl_version_path: TableVersionPath)

Base class for all tabular SchemaObjects.

base `property`

base: Optional['Table']

The base table of this Table. If this table is a view, returns the Table from which it was derived. Otherwise, returns None.

__getattr__

__getattr__(
    name: str,
) -> Union[
    "pixeltable.exprs.ColumnRef", "pixeltable.func.QueryTemplateFunction"
]

Return a ColumnRef or QueryTemplateFunction for the given name.

__getitem__

__getitem__(
    index: object,
) -> Union[
    "pixeltable.func.QueryTemplateFunction",
    "pixeltable.exprs.ColumnRef",
    "pixeltable.dataframe.DataFrame",
]

Return a ColumnRef or QueryTemplateFunction for the given name, or a DataFrame for the given slice.

__setitem__

__setitem__(
    column_name: str, value: Union[ColumnType, Expr, Callable, dict]
) -> None

Adds a column to the table.

Parameters:

column_name (str) –

The name of the new column.
value (Union[ColumnType, Expr, Callable, dict]) –

A column type, a value expression, or a column specification dictionary:
column type: a Pixeltable column type (if the table already contains rows, it must be nullable)
value expression: a Pixeltable expression that computes the column values
column specification: a dictionary with possible keys 'type', 'value', 'stored'

Examples:

Add an int column with None values:

>>> tbl['new_col'] = IntType(nullable=True)

For a table with int column int_col, add a column that is the factorial of int_col. The names of the parameters of the Callable must correspond to existing column names (the column values are then passed as arguments to the Callable). In this case, the return type cannot be inferred and needs to be specified explicitly:

>>> tbl['factorial'] = {'value': lambda int_col: math.factorial(int_col), 'type': IntType()}

For a table with an image column frame, add an image column rotated that rotates the image by 90 degrees. In this case, the column type is inferred from the expression. Also, the column is not stored (by default, computed image columns are not stored but recomputed on demand):

>>> tbl['rotated'] = tbl.frame.rotate(90)

Do the same, but now the column is stored:

>>> tbl['rotated'] = {'value': tbl.frame.rotate(90), 'stored': True}

add_column

add_column(
    *,
    type: Optional[ColumnType] = None,
    stored: Optional[bool] = None,
    print_stats: bool = False,
    **kwargs: Any
) -> UpdateStatus

Adds a column to the table.

Parameters:

kwargs (Any, default: {} ) –

Exactly one keyword argument of the form column-name=type|value-expression.
type (Optional[ColumnType], default: None ) –

The type of the column. Only valid and required if value-expression is a Callable.
stored (Optional[bool], default: None ) –

Whether the column is materialized and stored or computed on demand. Only valid for image columns.
print_stats (bool, default: False ) –

If True, print execution metrics.

Returns:

UpdateStatus –

execution status

Raises:

Error –

If the column name is invalid or already exists.

Examples:

Add an int column with None values:

>>> tbl.add_column(new_col=IntType(nullable=True))

Alternatively, this can also be expressed as:

>>> tbl['new_col'] = IntType(nullable=True)

For a table with int column int_col, add a column that is the factorial of int_col. The names of the parameters of the Callable must correspond to existing column names (the column values are then passed as arguments to the Callable). In this case, the column type needs to be specified explicitly:

>>> tbl.add_column(factorial=lambda int_col: math.factorial(int_col), type=IntType())

Alternatively, this can also be expressed as:

>>> tbl['factorial'] = {'value': lambda int_col: math.factorial(int_col), 'type': IntType()}

For a table with an image column frame, add an image column rotated that rotates the image by 90 degrees. In this case, the column type is inferred from the expression. Also, the column is not stored (by default, computed image columns are not stored but recomputed on demand):

>>> tbl.add_column(rotated=tbl.frame.rotate(90))

Alternatively, this can also be expressed as:

>>> tbl['rotated'] = tbl.frame.rotate(90)

Do the same, but now the column is stored:

>>> tbl.add_column(rotated=tbl.frame.rotate(90), stored=True)

Alternatively, this can also be expressed as:

>>> tbl['rotated'] = {'value': tbl.frame.rotate(90), 'stored': True}
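The rule that the Callable's parameter names must correspond to existing column names can be illustrated in plain Python. The sketch below is not Pixeltable's implementation; apply_callable and the sample rows are hypothetical names used only for illustration. It reads the parameter names with inspect.signature and passes the matching column values as keyword arguments:

```python
import inspect
import math


def apply_callable(rows, fn):
    """Call fn once per row, passing the columns named by fn's parameters.

    Hypothetical helper for illustration only; not part of Pixeltable.
    """
    param_names = list(inspect.signature(fn).parameters)
    return [fn(**{name: row[name] for name in param_names}) for row in rows]


# Each row is a plain dict mapping column names to values.
rows = [{'int_col': 3, 'other': 'x'}, {'int_col': 5, 'other': 'y'}]

# The lambda's only parameter is named int_col, so it receives that column.
factorials = apply_callable(rows, lambda int_col: math.factorial(int_col))
```

A plain Python function carries no reliable return-type information, which is one way to see why the column type has to be given explicitly in this case.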

add_embedding_index

add_embedding_index(
    col_name: str,
    *,
    idx_name: Optional[str] = None,
    text_embed: Optional[Function] = None,
    img_embed: Optional[Function] = None,
    metric: str = "cosine"
) -> None

Add an index to the table.

Parameters:

col_name (str) –

The name of the column to index.
idx_name (Optional[str], default: None ) –

The name of the index, which needs to be unique for the table; if not provided, a name will be generated.
text_embed (Optional[Function], default: None ) –

The function to embed text; required if the column is a text column.
img_embed (Optional[Function], default: None ) –

The function to embed images; required if the column is an image column.
metric (str, default: 'cosine' ) –

The distance metric to use for the index; one of 'cosine', 'ip', 'l2'.

Raises:

Error –

If an index with that name already exists for the table or if the column does not exist.

Examples:

Add an index to the img column:

>>> tbl.add_embedding_index('img', img_embed=...)

Add another index to the img column, using the inner product as the distance metric, and with a specific name; text_embed is also specified in order to search with text:

>>> tbl.add_embedding_index(
    'img', idx_name='clip_idx', img_embed=..., text_embed=...text_embed..., metric='ip')
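For reference, the three metric names usually denote the quantities sketched below. This is plain Python using the standard textbook definitions, assuming 'ip' means inner product (as the example above states) and 'l2' means Euclidean distance. How Pixeltable computes or normalizes these values internally is not specified here, and the function names are hypothetical.

```python
import math


def inner_product(a, b):
    """'ip': sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """'cosine': inner product of the two vectors after length normalization."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


def l2_distance(a, b):
    """'l2': Euclidean distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


u = [1.0, 0.0]
v = [3.0, 4.0]

ip_uv = inner_product(u, v)
cos_uv = cosine_similarity(u, v)
l2_uv = l2_distance(u, v)
```

Cosine similarity ignores vector length, while the inner product and the L2 distance do not; for unit-length embeddings, cosine similarity and inner product coincide.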

batch_update

batch_update(
    rows: Iterable[dict[str, Any]], cascade: bool = True
) -> UpdateStatus

Update rows in this table.

Parameters:

rows (Iterable[dict[str, Any]]) –

an Iterable of dictionaries containing values for the updated columns plus values for the primary key columns.
cascade (bool, default: True ) –

if True, also update all computed columns that transitively depend on the updated columns.

Examples:

Update the 'name' and 'age' columns for the rows with ids 1 and 2 (assuming 'id' is the primary key):

>>> tbl.batch_update([{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 2, 'name': 'Bob', 'age': 40}])
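The shape of the rows argument (primary key values plus new values for the updated columns) can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not Pixeltable's implementation: batch_update_rows and pk_cols are hypothetical names, and silently skipping rows whose key is not found is an assumption made for the sketch.

```python
def batch_update_rows(table, rows, pk_cols):
    """Apply key-addressed updates to a list of dict rows; return the match count.

    Hypothetical helper for illustration only; not part of Pixeltable.
    """
    # Index the existing rows by their primary key tuple.
    index = {tuple(r[c] for c in pk_cols): r for r in table}
    num_updated = 0
    for row in rows:
        key = tuple(row[c] for c in pk_cols)
        target = index.get(key)
        if target is None:
            continue  # assumption: unmatched keys are skipped
        # Every non-key entry in the update row overwrites the stored value.
        for col, value in row.items():
            if col not in pk_cols:
                target[col] = value
        num_updated += 1
    return num_updated


table = [
    {'id': 1, 'name': 'Al', 'age': 29},
    {'id': 2, 'name': 'Bob', 'age': 39},
    {'id': 3, 'name': 'Cy', 'age': 50},
]
num_updated = batch_update_rows(
    table,
    [{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 2, 'name': 'Bob', 'age': 40}],
    pk_cols=('id',),
)
```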

collect

collect() -> 'pixeltable.dataframe.DataFrameResultSet'

Return rows from this table.

column_names

column_names() -> list[str]

Return the names of the columns in this table.

column_types

column_types() -> dict[str, ColumnType]

Return the names and types of the columns in this table.

count

count() -> int

Return the number of rows in this table.

delete `abstractmethod`

delete(where: Optional['pixeltable.exprs.Predicate'] = None) -> UpdateStatus

Delete rows in this table.

Parameters:

where (Optional['pixeltable.exprs.Predicate'], default: None ) –

a Predicate to filter rows to delete.

Examples:

Delete all rows in a table:

>>> tbl.delete()

Delete all rows in a table where column a is greater than 5:

>>> tbl.delete(tbl.a > 5)

describe

describe() -> None

Print the table schema.

df

df() -> 'pixeltable.dataframe.DataFrame'

Return a DataFrame for this table.

display_name `abstractmethod` `classmethod`

display_name() -> str

Return name displayed in error messages.

drop_column

drop_column(name: str) -> None

Drop a column from the table.

Parameters:

name (str) –

The name of the column to drop.

Raises:

Error –

If the column does not exist or if it is referenced by a computed column.

Examples:

Drop column factorial:

>>> tbl.drop_column('factorial')

drop_embedding_index

drop_embedding_index(
    *, column_name: Optional[str] = None, idx_name: Optional[str] = None
) -> None

Drop an embedding index from the table.

Parameters:

column_name (Optional[str], default: None ) –

The name of the column whose embedding index to drop. Invalid if the column has multiple embedding indices.
idx_name (Optional[str], default: None ) –

The name of the index to drop.

Raises:

Error –

If the index does not exist.

Examples:

Drop embedding index on the img column:

>>> tbl.drop_embedding_index(column_name='img')

drop_index

drop_index(
    *, column_name: Optional[str] = None, idx_name: Optional[str] = None
) -> None

Drop an index from the table.

Parameters:

column_name (Optional[str], default: None ) –

The name of the column whose index to drop. Invalid if the column has multiple indices.
idx_name (Optional[str], default: None ) –

The name of the index to drop.

Raises:

Error –

If the index does not exist.

Examples:

Drop index on the img column:

>>> tbl.drop_index(column_name='img')

get_views

get_views(*, recursive: bool = False) -> list['Table']

All views and snapshots of this Table.

group_by

group_by(*items: 'exprs.Expr') -> 'pixeltable.dataframe.DataFrame'

Return a DataFrame for this table, grouped by the given expressions.

head

head(*args, **kwargs) -> 'pixeltable.dataframe.DataFrameResultSet'

Return the first n rows inserted into this table.

insert `abstractmethod`

insert(
    rows: Optional[Iterable[dict[str, Any]]] = None,
    /,
    *,
    print_stats: bool = False,
    fail_on_exception: bool = True,
    **kwargs: Any,
) -> UpdateStatus

Inserts rows into this table. There are two mutually exclusive call patterns:

To insert multiple rows at a time: insert(rows: Iterable[dict[str, Any]], /, *, print_stats: bool = False, fail_on_exception: bool = True)

To insert just a single row, you can use the more convenient syntax: insert(*, print_stats: bool = False, fail_on_exception: bool = True, **kwargs: Any)

Parameters:

rows (Optional[Iterable[dict[str, Any]]], default: None ) –

(if inserting multiple rows) A list of rows to insert, each of which is a dictionary mapping column names to values.
kwargs (Any, default: {} ) –

(if inserting a single row) Keyword-argument pairs representing column names and values.
print_stats (bool, default: False ) –

If True, print statistics about the cost of computed columns.
fail_on_exception (bool, default: True ) –

Determines how exceptions in computed columns and invalid media files (e.g., corrupt images) are handled. If False, store error information (accessible as column properties 'errortype' and 'errormsg') for those cases, but continue inserting rows. If True, raise an exception that aborts the insert.

Returns:

UpdateStatus –

execution status

Raises:

Error –

if a row does not match the table schema or contains values for computed columns

Examples:

Insert two rows into a table with three int columns a, b, and c. Column c is nullable.

>>> tbl.insert([{'a': 1, 'b': 1, 'c': 1}, {'a': 2, 'b': 2}])

Insert a single row into a table with three int columns a, b, and c.

>>> tbl.insert(a=1, b=1, c=1)
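The two call patterns rely on rows being a positional-only parameter (the / in the signature). The sketch below shows one way such a signature could be normalized to a list of row dictionaries. It is an illustration, not Pixeltable's implementation: normalize_insert_args is a hypothetical name, and the specific exceptions raised are assumptions.

```python
from typing import Any, Iterable, Optional


def normalize_insert_args(
    rows: Optional[Iterable[dict[str, Any]]] = None, /, **kwargs: Any
) -> list[dict[str, Any]]:
    """Turn either call pattern into a list of row dicts (illustration only)."""
    if rows is not None and kwargs:
        raise ValueError('pass either a positional iterable of rows or keyword arguments, not both')
    if rows is None:
        if not kwargs:
            raise ValueError('no row data given')
        return [dict(kwargs)]  # single-row pattern
    return [dict(r) for r in rows]  # multi-row pattern


many = normalize_insert_args([{'a': 1, 'b': 1, 'c': 1}, {'a': 2, 'b': 2}])
single = normalize_insert_args(a=1, b=1, c=1)

# Because rows is positional-only, a column that happens to be named
# 'rows' can still be supplied as a keyword argument.
column_named_rows = normalize_insert_args(rows=5)
```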

order_by

order_by(
    *items: "exprs.Expr", asc: bool = True
) -> "pixeltable.dataframe.DataFrame"

Return a DataFrame for this table, ordered by the given expressions (ascending if asc is True, descending otherwise).

query_names

query_names() -> list[str]

Return the names of the registered queries for this table.

rename_column

rename_column(old_name: str, new_name: str) -> None

Rename a column.

Parameters:

old_name (str) –

The current name of the column.
new_name (str) –

The new name of the column.

Raises:

Error –

If the column does not exist or if the new name is invalid or already exists.

Examples:

Rename column factorial to fac:

>>> tbl.rename_column('factorial', 'fac')

revert

revert() -> None

Reverts the table to the previous version.

WARNING: This operation is irreversible.

select

select(*items: Any, **named_items: Any) -> 'pixeltable.dataframe.DataFrame'

Return a DataFrame for this table that selects the given expressions.

show

show(*args, **kwargs) -> 'pixeltable.dataframe.DataFrameResultSet'

Return rows from this table.

sync

sync(
    stores: Optional[str | list[str]] = None,
    *,
    export_data: bool = True,
    import_data: bool = True
) -> "pixeltable.io.SyncStatus"

Synchronizes this table with its linked external stores.

Parameters:

stores (Optional[str | list[str]], default: None ) –

If specified, will synchronize only the specified named store or list of stores. If not specified, will synchronize all of this table's external stores.
export_data (bool, default: True ) –

If True, data from this table will be exported to the external stores during synchronization.
import_data (bool, default: True ) –

If True, data from the external stores will be imported to this table during synchronization.

tail

tail(*args, **kwargs) -> 'pixeltable.dataframe.DataFrameResultSet'

Return the last n rows inserted into this table.

to_coco_dataset

to_coco_dataset() -> Path

Return the path to a COCO JSON file for this table. See DataFrame.to_coco_dataset().

to_pytorch_dataset

to_pytorch_dataset(
    image_format: str = "pt",
) -> "torch.utils.data.IterableDataset"

Return a PyTorch Dataset for this table. See DataFrame.to_pytorch_dataset().

unlink_external_stores

unlink_external_stores(
    stores: Optional[str | list[str]] = None,
    *,
    delete_external_data: bool = False,
    ignore_errors: bool = False
) -> None

Unlinks this table's external stores.

Parameters:

stores (Optional[str | list[str]], default: None ) –

If specified, will unlink only the specified named store or list of stores. If not specified, will unlink all of this table's external stores.
ignore_errors (bool, default: False ) –

If True, no exception will be thrown if a specified store is not linked to this table.
delete_external_data (bool, default: False ) –

If True, then the external data store will also be deleted. WARNING: This is a destructive operation that will delete data outside Pixeltable, and cannot be undone.

update

update(
    value_spec: dict[str, Any],
    where: Optional["pixeltable.exprs.Predicate"] = None,
    cascade: bool = True,
) -> UpdateStatus

Update rows in this table.

Parameters:

value_spec (dict[str, Any]) –

a dictionary mapping column names to literal values or Pixeltable expressions.
where (Optional['pixeltable.exprs.Predicate'], default: None ) –

a Predicate to filter rows to update.
cascade (bool, default: True ) –

if True, also update all computed columns that transitively depend on the updated columns.

Examples:

Set column int_col to 1 for all rows:

>>> tbl.update({'int_col': 1})

Set column int_col to 1 for all rows where int_col is 0:

>>> tbl.update({'int_col': 1}, where=tbl.int_col == 0)

Set int_col to the value of other_int_col + 1:

>>> tbl.update({'int_col': tbl.other_int_col + 1})

Increment int_col by 1 for all rows where int_col is 0:

>>> tbl.update({'int_col': tbl.int_col + 1}, where=tbl.int_col == 0)
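The phrase "transitively depend" in the cascade parameter can be made concrete with a small dependency-graph traversal. The sketch below is plain Python and not Pixeltable's implementation; transitive_dependents, depends_on, and the column names are hypothetical. Given a map from each computed column to the columns it reads, it returns every computed column that a cascading update would need to recompute.

```python
from collections import deque


def transitive_dependents(depends_on, updated):
    """Return all computed columns that depend, directly or indirectly, on updated.

    depends_on maps each computed column to the set of columns it reads.
    Hypothetical helper for illustration only; not part of Pixeltable.
    """
    # Invert the map: column -> computed columns that read it.
    dependents = {}
    for col, inputs in depends_on.items():
        for inp in inputs:
            dependents.setdefault(inp, set()).add(col)

    # Breadth-first traversal starting from the updated columns.
    seen = set()
    queue = deque(updated)
    while queue:
        col = queue.popleft()
        for dep in dependents.get(col, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


depends_on = {
    'doubled': {'int_col'},
    'doubled_plus_one': {'doubled'},
    'label': {'name'},
}
to_recompute = transitive_dependents(depends_on, {'int_col'})
```

Here doubled_plus_one is included even though it never reads int_col directly, while label is untouched because nothing it reads has changed.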

version

version() -> int

Return the version of this table. Used by tests to ascertain version changes.

where

where(pred: 'exprs.Predicate') -> 'pixeltable.dataframe.DataFrame'

Return a DataFrame for this table, restricted to the rows that satisfy pred.
