string
pixeltable.functions.string
Pixeltable UDFs for StringType.
It closely follows the Pandas pandas.Series.str API.
Example:
import pixeltable as pxt
from pixeltable.functions import string as pxt_str
t = pxt.get_table(...)
t.select(pxt_str.capitalize(t.str_col)).collect()
capitalize
capitalize(self: str) -> str
Return string with its first character capitalized and the rest lowercased.
Equivalent to str.capitalize()
.
casefold
casefold(self: str) -> str
Return a casefolded copy of string.
Equivalent to str.casefold()
.
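As a plain-Python sketch of the documented equivalent, str.casefold(), the snippet below contrasts it with str.lower(). It only illustrates the standard-library behavior; the UDF itself would be applied to a column expression as in the module example above.

```python
# Plain-Python illustration of str.casefold(), the documented equivalent.
word = "Straße"

lowered = word.lower()        # "straße": the sharp s is kept
folded = word.casefold()      # "strasse": the sharp s is folded to "ss"

# Casefolding is meant for caseless comparison.
same = "STRASSE".casefold() == word.casefold()
print(lowered, folded, same)
```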
center
center(self: str, width: int, fillchar: str = ' ') -> str
Return a centered string of length width
.
Equivalent to str.center()
.
Parameters:
-
width
(int
) –Total width of the resulting string.
-
fillchar
(str
, default:' '
) –Character used for padding.
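A minimal sketch of the documented equivalent, str.center(), in plain Python. The sample strings are illustrative only.

```python
# Plain-Python illustration of str.center(), the documented equivalent.
s = "cat"

centered = s.center(7, "*")   # padded on both sides up to width 7
unchanged = s.center(2)       # width smaller than len(s): string returned as-is

print(centered)   # **cat**
print(unchanged)  # cat
```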
contains
contains(
self: str,
pattern: str,
case: bool = True,
flags: int = 0,
regex: bool = True,
) -> bool
Test if string contains pattern or regex.
Parameters:
-
pattern
(str
) –string literal or regular expression
-
case
(bool
, default:True
) –if False, ignore case
-
flags
(int
, default:0
) –flags for the
re
module
-
regex
(bool
, default:True
) –if True, treat pattern as a regular expression
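The interaction of the case, flags and regex parameters can be sketched in plain Python as follows. The helper name contains_sketch is hypothetical, and the semantics are an assumption modeled on pandas.Series.str.contains, not the library's actual implementation.

```python
import re

def contains_sketch(s: str, pattern: str, case: bool = True,
                    flags: int = 0, regex: bool = True) -> bool:
    """Hypothetical helper mirroring the documented parameters."""
    if not case:
        flags |= re.IGNORECASE          # case=False: ignore case
    if not regex:
        pattern = re.escape(pattern)    # regex=False: treat pattern literally
    return re.search(pattern, s, flags) is not None

print(contains_sketch("a.b", ".", regex=False))      # True: literal dot present
print(contains_sketch("ab", ".", regex=False))       # False: no literal dot
print(contains_sketch("Hello", "hello", case=False)) # True
print(contains_sketch("Hello", "^h"))                # False: case-sensitive regex
```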
count
count(self: str, pattern: str, flags: int = 0) -> int
Count occurrences of pattern or regex.
Parameters:
-
pattern
(str
) –string literal or regular expression
-
flags
(int
, default:0
) –flags for the
re
module
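One way to picture the counting behavior in plain Python is the number of non-overlapping matches returned by re.findall(). The helper count_sketch is hypothetical; it is an assumption about the semantics, not the library's code.

```python
import re

def count_sketch(s: str, pattern: str, flags: int = 0) -> int:
    """Hypothetical helper: count non-overlapping regex matches."""
    return len(re.findall(pattern, s, flags))

print(count_sketch("banana", "a"))                    # 3
print(count_sketch("banana", "an"))                   # 2
print(count_sketch("a1b22c333", r"\d+"))              # 3
print(count_sketch("Banana", "b", re.IGNORECASE))     # 1
```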
endswith
endswith(self: str, pattern: str) -> bool
Return True
if the string ends with the specified suffix, otherwise return False
.
Equivalent to str.endswith()
.
Parameters:
-
pattern
(str
) –string literal
fill
fill(self: str, width: int, **kwargs: None) -> str
Wraps the single paragraph in string, and returns a single string containing the wrapped paragraph.
Equivalent to textwrap.fill()
.
Parameters:
-
width
(int
) –Maximum line width.
-
kwargs
(None
, default:{}
) –Additional keyword arguments to pass to
textwrap.fill()
.
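A short plain-Python sketch of the documented equivalent, textwrap.fill(). The sample sentence and width are illustrative.

```python
import textwrap

text = "The quick brown fox jumps over the lazy dog"

# fill() returns ONE string with newlines inserted at word boundaries.
filled = textwrap.fill(text, width=15)
print(filled)
# The quick brown
# fox jumps over
# the lazy dog
```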
find
find(
self: str, substr: str, start: Optional[int] = 0, end: Optional[int] = None
) -> int
Return the lowest index in string where substr
is found within the slice [start:end]
.
Equivalent to str.find()
.
Parameters:
-
substr
(str
) –substring to search for
-
start
(Optional[int]
, default:0
) –slice start
-
end
(Optional[int]
, default:None
) –slice end
findall
findall(self: str, pattern: str, flags: int = 0) -> JsonT
Find all occurrences of a regular expression pattern in string.
Equivalent to re.findall()
.
Parameters:
-
pattern
(str
) –regular expression pattern
-
flags
(int
, default:0
) –flags for the
re
module
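A plain-Python sketch of the documented equivalent, re.findall(). Note that with capture groups re.findall() returns tuples; how these are represented in the JSON result of the UDF is not shown here.

```python
import re

# Without groups: a list of matched substrings.
numbers = re.findall(r"\d+", "a1b22c333")
print(numbers)   # ['1', '22', '333']

# With groups: one tuple of group values per match.
pairs = re.findall(r"(\w)=(\d)", "a=1, b=2")
print(pairs)     # [('a', '1'), ('b', '2')]
```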
format
format(self: str, *args: None, **kwargs: None) -> str
Perform string formatting.
Equivalent to str.format()
.
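A minimal plain-Python sketch of the documented equivalent, str.format(), with one positional and one keyword argument. The template and values are illustrative.

```python
# Plain-Python illustration of str.format(), the documented equivalent.
template = "{} scored {pts}"

result = template.format("Ann", pts=7)
print(result)   # Ann scored 7
```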
fullmatch
fullmatch(self: str, pattern: str, case: bool = True, flags: int = 0) -> bool
Determine if string fully matches a regular expression.
Equivalent to re.fullmatch()
.
Parameters:
-
pattern
(str
) –regular expression pattern
-
case
(bool
, default:True
) –if False, ignore case
-
flags
(int
, default:0
) –flags for the
re
module
index
index(
self: str, substr: str, start: Optional[int] = 0, end: Optional[int] = None
) -> int
Return the lowest index in string where substr
is found within the slice [start:end]
.
Raises ValueError if substr
is not found.
Equivalent to str.index()
.
Parameters:
-
substr
(str
) –substring to search for
-
start
(Optional[int]
, default:0
) –slice start
-
end
(Optional[int]
, default:None
) –slice end
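The difference between find and index shows up only when the substring is missing. A plain-Python sketch using the documented equivalents str.find() and str.index():

```python
s = "hello world"

first = s.find("o")        # 4: lowest index of "o"
later = s.find("o", 5)     # 7: search restricted to s[5:]
missing = s.find("z")      # -1: find() signals "not found" with -1

try:
    s.index("z")           # index() raises instead of returning -1
    raised = False
except ValueError:
    raised = True

print(first, later, missing, raised)
```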
isalnum
isalnum(self: str) -> bool
Return True
if all characters in the string are alphanumeric and there is at least one character, False
otherwise.
Equivalent to str.isalnum()
.
isalpha
isalpha(self: str) -> bool
Return True
if all characters in the string are alphabetic and there is at least one character, False
otherwise.
Equivalent to str.isalpha()
.
isascii
isascii(self: str) -> bool
Return True
if the string is empty or all characters in the string are ASCII, False
otherwise.
Equivalent to str.isascii()
.
isdecimal
isdecimal(self: str) -> bool
Return True
if all characters in the string are decimal characters and there is at least one character, False
otherwise.
Equivalent to str.isdecimal()
.
isdigit
isdigit(self: str) -> bool
Return True
if all characters in the string are digits and there is at least one character, False
otherwise.
Equivalent to str.isdigit()
.
isidentifier
isidentifier(self: str) -> bool
Return True
if the string is a valid identifier according to the language definition, False
otherwise.
Equivalent to str.isidentifier()
islower
islower(self: str) -> bool
Return True
if all cased characters in the string are lowercase and there is at least one cased character, False
otherwise.
Equivalent to str.islower()
isnumeric
isnumeric(self: str) -> bool
Return True
if all characters in the string are numeric characters and there is at least one character, False
otherwise.
Equivalent to str.isnumeric()
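The isdecimal, isdigit and isnumeric predicates accept progressively larger character sets. A plain-Python sketch with the documented str equivalents:

```python
# Each entry maps a sample string to (isdecimal, isdigit, isnumeric).
samples = {
    "5": None,        # ordinary decimal digit
    "\u00b2": None,   # SUPERSCRIPT TWO
    "\u00bd": None,   # VULGAR FRACTION ONE HALF
}

for ch in samples:
    samples[ch] = (ch.isdecimal(), ch.isdigit(), ch.isnumeric())

print(samples["5"])        # (True, True, True)
print(samples["\u00b2"])   # (False, True, True)
print(samples["\u00bd"])   # (False, False, True)
```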
isspace
isspace(self: str) -> bool
Return True
if there are only whitespace characters in the string and there is at least one character, False
otherwise.
Equivalent to str.isspace()
istitle
istitle(self: str) -> bool
Return True
if the string is a titlecased string and there is at least one character, False
otherwise.
Equivalent to str.istitle()
isupper
isupper(self: str) -> bool
Return True
if all cased characters in the string are uppercase and there is at least one cased character, False
otherwise.
Equivalent to str.isupper()
ljust
ljust(self: str, width: int, fillchar: str = ' ') -> str
Return the string left-justified in a string of length width
.
Equivalent to str.ljust()
Parameters:
-
width
(int
) –Minimum width of resulting string; additional characters will be filled with character defined in
fillchar
.
-
fillchar
(str
, default:' '
) –Additional character for filling.
lower
lower(self: str) -> str
Return a copy of the string with all the cased characters converted to lowercase.
Equivalent to str.lower()
lstrip
lstrip(self: str, chars: Optional[str] = None) -> str
Return a copy of the string with leading characters removed. The chars
argument is a string specifying the set of
characters to be removed. If omitted or None
, whitespace characters are removed.
Equivalent to str.lstrip()
Parameters:
-
chars
(Optional[str]
, default:None
) –The set of characters to be removed.
match
match(self: str, pattern: str, case: bool = True, flags: int = 0) -> bool
Determine if string starts with a match of a regular expression.
Parameters:
-
pattern
(str
) –regular expression pattern
-
case
(bool
, default:True
) –if False, ignore case
-
flags
(int
, default:0
) –flags for the
re
module
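The distinction between match (anchored at the start) and fullmatch (the whole string must match) can be sketched with the re module. This assumes match behaves like re.match(), which the entry does not state explicitly.

```python
import re

s = "abc123"

starts = re.match(r"[a-z]+", s) is not None           # True: prefix "abc" matches
whole = re.fullmatch(r"[a-z]+", s) is not None        # False: "123" is left over
whole_ok = re.fullmatch(r"[a-z]+\d+", s) is not None  # True: entire string matches

# case=False corresponds to adding re.IGNORECASE to the flags.
ignore_case = re.match(r"ABC", s, re.IGNORECASE) is not None

print(starts, whole, whole_ok, ignore_case)
```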
normalize
normalize(self: str, form: str) -> str
Return the Unicode normal form.
Equivalent to unicodedata.normalize()
Parameters:
-
form
(str
) –Unicode normal form (
‘NFC’
,‘NFKC’
,‘NFD’
,‘NFKD’
)
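A plain-Python sketch of the documented equivalent, unicodedata.normalize(), showing composition, decomposition and compatibility folding:

```python
import unicodedata

decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
composed = "\u00e9"      # precomposed "é"

nfc = unicodedata.normalize("NFC", decomposed)   # composes to one code point
nfd = unicodedata.normalize("NFD", composed)     # decomposes to two code points
nfkc = unicodedata.normalize("NFKC", "\ufb01")   # "fi" ligature becomes "fi"

print(len(nfc), len(nfd), nfkc)
```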
pad
pad(self: str, width: int, side: str = 'left', fillchar: str = ' ') -> str
Pad string up to width
Parameters:
-
width
(int
) –Minimum width of resulting string; additional characters will be filled with character defined in
fillchar
.
-
side
(str
, default:'left'
) –Side from which to fill resulting string (
‘left’
,‘right’
,‘both’
)
-
fillchar
(str
, default:' '
) –Additional character for filling
partition
partition(self: str, sep: str = ' ') -> JsonT
Splits string at the first occurrence of sep
, and returns 3 elements containing the part before the
separator, the separator itself, and the part after the separator. If the separator is not found, return 3 elements
containing string itself, followed by two empty strings.
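The two cases described above can be sketched with Python's str.partition(). That method returns a tuple; the UDF is documented as returning 3 elements in a JSON value, so the container type here is only illustrative.

```python
found = "key=value=x".partition("=")
print(found)      # ('key', '=', 'value=x'): split at the FIRST "="

not_found = "novalue".partition("=")
print(not_found)  # ('novalue', '', ''): string itself plus two empty strings
```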
removeprefix
removeprefix(self: str, prefix: str) -> str
Remove prefix. If the prefix is not present, returns string.
removesuffix
removesuffix(self: str, suffix: str) -> str
Remove suffix. If the suffix is not present, returns string.
repeat
repeat(self: str, n: int) -> str
Repeat string n
times.
replace
replace(
self: str,
pattern: str,
repl: str,
n: int = -1,
case: bool = True,
flags: int = 0,
regex: bool = False,
) -> str
Replace occurrences of pattern
with repl
.
Equivalent to str.replace()
or
re.sub()
, depending on the value of regex.
Parameters:
-
pattern
(str
) –string literal or regular expression
-
repl
(str
) –replacement string
-
n
(int
, default:-1
) –number of replacements to make (-1 for all)
-
case
(bool
, default:True
) –if False, ignore case
-
flags
(int
, default:0
) –flags for the
re
module
-
regex
(bool
, default:False
) –if True, treat pattern as a regular expression
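A hedged plain-Python sketch of how the parameters might map onto str.replace() and re.sub(). The helper replace_sketch is hypothetical and is not the library's implementation. One detail worth noting is that "replace all" is spelled -1 for str.replace() but 0 for re.sub(), so n has to be translated. The sketch assumes case and flags only apply when regex is True.

```python
import re

def replace_sketch(s: str, pattern: str, repl: str, n: int = -1,
                   case: bool = True, flags: int = 0,
                   regex: bool = False) -> str:
    """Hypothetical helper mirroring the documented parameters."""
    if regex:
        if not case:
            flags |= re.IGNORECASE
        # re.sub uses count=0 for "all", so map n=-1 to 0.
        return re.sub(pattern, repl, s, count=max(n, 0), flags=flags)
    # Literal replacement; str.replace uses -1 for "all".
    return s.replace(pattern, repl, n)

print(replace_sketch("a.a.a", ".", "-"))                     # a-a-a
print(replace_sketch("a.a.a", ".", "-", n=1))                # a-a.a
print(replace_sketch("a1b22", r"\d+", "#", regex=True))      # a#b#
print(replace_sketch("Foo foo", "foo", "x", case=False, regex=True))  # x x
```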
rfind
rfind(
self: str, substr: str, start: Optional[int] = 0, end: Optional[int] = None
) -> int
Return the highest index where substr
is found, such that substr
is contained within [start:end]
.
Equivalent to str.rfind()
.
Parameters:
-
substr
(str
) –substring to search for
-
start
(Optional[int]
, default:0
) –slice start
-
end
(Optional[int]
, default:None
) –slice end
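A plain-Python sketch of the documented equivalent, str.rfind(), next to str.find() for contrast:

```python
s = "hello world"

last = s.rfind("o")            # 7: highest index of "o"
bounded = s.rfind("o", 0, 5)   # 4: search restricted to s[0:5]
missing = s.rfind("z")         # -1 when the substring is absent

print(s.find("o"), last, bounded, missing)
```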
rindex
rindex(
self: str, substr: str, start: Optional[int] = 0, end: Optional[int] = None
) -> int
Return the highest index where substr
is found, such that substr
is contained within [start:end]
.
Raises ValueError if substr
is not found.
Equivalent to str.rindex()
.
rjust
rjust(self: str, width: int, fillchar: str = ' ') -> str
Return the string right-justified in a string of length width
.
Equivalent to str.rjust()
.
Parameters:
-
width
(int
) –Minimum width of resulting string.
-
fillchar
(str
, default:' '
) –Additional character for filling.
rpartition
rpartition(self: str, sep: str = ' ') -> JsonT
Splits string at the last occurrence of sep
, and returns a list containing the part before the
separator, the separator itself, and the part after the separator.
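Splitting at the last separator is handy for things like file extensions. A plain-Python sketch with str.rpartition(), assumed here to be the closest standard-library analogue (it returns a tuple rather than a list):

```python
before, sep, after = "archive.tar.gz".rpartition(".")
print(before, sep, after)   # archive.tar . gz

# partition() would split at the FIRST separator instead.
first_split = "archive.tar.gz".partition(".")
print(first_split)          # ('archive', '.', 'tar.gz')
```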
rstrip
rstrip(self: str, chars: Optional[str] = None) -> str
Return a copy of string with trailing characters removed.
Equivalent to str.rstrip()
.
Parameters:
-
chars
(Optional[str]
, default:None
) –The set of characters to be removed. If omitted or
None
, whitespace characters are removed.
slice
slice(
self: str,
start: Optional[int] = None,
stop: Optional[int] = None,
step: Optional[int] = None,
) -> str
Return a slice.
Parameters:
-
start
(Optional[int]
, default:None
) –slice start
-
stop
(Optional[int]
, default:None
) –slice end
-
step
(Optional[int]
, default:None
) –slice step
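The three parameters presumably correspond to Python's slice notation s[start:stop:step]. A plain-Python sketch under that assumption:

```python
s = "abcdefgh"

every_other = s[1:6:2]   # indices 1, 3, 5
tail = s[-3:]            # negative start counts from the end
reversed_s = s[::-1]     # negative step walks backwards

print(every_other, tail, reversed_s)   # bdf fgh hgfedcba
```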
slice_replace
slice_replace(
self: str,
start: Optional[int] = None,
stop: Optional[int] = None,
repl: Optional[str] = None,
) -> str
Replace a positional slice with another value.
Parameters:
-
start
(Optional[int]
, default:None
) –slice start
-
stop
(Optional[int]
, default:None
) –slice end
-
repl
(Optional[str]
, default:None
) –replacement value
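A rough plain-Python sketch of what replacing a positional slice means. The helper slice_replace_sketch is hypothetical, and its handling of None (an omitted start keeps nothing before the slice, an omitted stop keeps nothing after it, and an omitted repl inserts nothing) is an assumption modeled on pandas.Series.str.slice_replace, not confirmed for this UDF.

```python
from typing import Optional

def slice_replace_sketch(s: str, start: Optional[int] = None,
                         stop: Optional[int] = None,
                         repl: Optional[str] = None) -> str:
    """Hypothetical helper: replace s[start:stop] with repl."""
    prefix = s[:start] if start is not None else ""
    suffix = s[stop:] if stop is not None else ""
    return prefix + (repl or "") + suffix

print(slice_replace_sketch("abcdef", 1, 3, "XY"))     # aXYdef
print(slice_replace_sketch("abcdef", 2, None, "Z"))   # abZ
print(slice_replace_sketch("abcdef", None, 2, "Z"))   # Zcdef
```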
startswith
startswith(self: str, pattern: str) -> int
Return True
if string starts with pattern
, otherwise return False
.
Equivalent to str.startswith()
.
Parameters:
-
pattern
(str
) –string literal
strip
strip(self: str, chars: Optional[str] = None) -> str
Return a copy of string with leading and trailing characters removed.
Equivalent to str.strip()
.
Parameters:
-
chars
(Optional[str]
, default:None
) –The set of characters to be removed. If omitted or
None
, whitespace characters are removed.
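The chars argument is a set of characters, not a prefix or suffix. A plain-Python sketch with the documented str equivalents:

```python
default = "  hi  ".strip()               # whitespace removed on both sides
as_set = "abcHELLOcba".strip("abc")      # any of "a", "b", "c" is stripped
left_only = "xxhixx".lstrip("x")         # leading characters only
right_only = "xxhixx".rstrip("x")        # trailing characters only

print(default, as_set, left_only, right_only)   # hi HELLO hixx xxhi
```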
swapcase
swapcase(self: str) -> str
Return a copy of string with uppercase characters converted to lowercase and vice versa.
Equivalent to str.swapcase()
.
title
title(self: str) -> str
Return a titlecased version of string, i.e. words start with uppercase characters, all remaining cased characters are lowercase.
Equivalent to str.title()
.
upper
upper(self: str) -> str
Return a copy of string converted to uppercase.
Equivalent to str.upper()
.
wrap
wrap(self: str, width: int, **kwargs: None) -> JsonT
Wraps the single paragraph in string so every line is at most width
characters long.
Returns a list of output lines, without final newlines.
Equivalent to textwrap.wrap()
.
Parameters:
-
width
(int
) –Maximum line width.
-
kwargs
(None
, default:{}
) –Additional keyword arguments to pass to
textwrap.wrap()
.
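Returning a list of lines is the behavior of the standard library's textwrap.wrap(), whereas textwrap.fill() joins those lines into one string. A plain-Python sketch of the relationship:

```python
import textwrap

text = "The quick brown fox jumps over the lazy dog"

lines = textwrap.wrap(text, width=15)
print(lines)   # ['The quick brown', 'fox jumps over', 'the lazy dog']

# fill() is the same wrapping, joined with newlines.
joined = "\n".join(lines)
print(joined == textwrap.fill(text, width=15))   # True
```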
zfill
zfill(self: str, width: int) -> str
Pad a numeric string with ASCII 0
on the left to a total length of width
.
Equivalent to str.zfill()
.
Parameters:
-
width
(int
) –Minimum width of resulting string.
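A plain-Python sketch of the documented equivalent, str.zfill(), including its handling of a leading sign:

```python
plain = "42".zfill(5)        # zeros added on the left
signed = "-42".zfill(5)      # zeros go AFTER the sign
too_long = "12345".zfill(3)  # already wider than width: unchanged

print(plain, signed, too_long)   # 00042 -0042 12345
```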