SQLite

The core SqliteDatabase handles pragmas, user-defined functions and WAL mode. Because full-text search and JSON fields are specific to SQLite, those features are provided by the playhouse.sqlite_ext extension module.

Implementations

SqliteDatabase

Core SQLite implementation. Provides:

  • Pragma support (including WAL-mode)

  • User-defined functions

  • ATTACH / DETACH databases

  • Full-text search

  • JSON

Full-text search and JSON implementations are available in playhouse.sqlite_ext.

CySqliteDatabase (playhouse.cysqlite_ext)

Extends SqliteDatabase, uses cysqlite driver.

  • All above functionality

  • Table-value functions

  • Commit / Rollback / Update / Progress / Trace hooks

  • BLOB I/O

  • Online backups

  • Can be built with encryption.

APSWDatabase (playhouse.apsw_ext)

Extends SqliteDatabase, uses apsw driver.

APSW is a thin C-level driver that exposes the full range of SQLite functionality.

SqlCipherDatabase (playhouse.sqlcipher_ext)

Extends SqliteDatabase, uses sqlcipher3 driver.

SQLCipher provides transparent full-database encryption using 256-bit AES, ensuring data on-disk is secure.

SqliteQueueDatabase (playhouse.sqliteq)

Extends SqliteDatabase.

Provides a SQLite database implementation with a long-lived background writer thread. All write operations are managed by a single write connection, preventing timeouts and database-locking issues. This implementation is useful when using SQLite in multi-threaded environments with frequent writes.

PRAGMA statements

SQLite allows run-time configuration through PRAGMA statements (SQLite documentation). These statements are typically run when a new database connection is created.

To specify default PRAGMA statements for connections:

db = SqliteDatabase('my_app.db', pragmas={
    'journal_mode': 'wal',
    'cache_size': 10000,  # 10000 pages, or ~40MB
    'foreign_keys': 1,  # Enforce foreign-key constraints
})

PRAGMAs may also be configured dynamically using either the pragma() method or the special properties exposed on the SqliteDatabase object:

# Set cache size to 64MB for *current connection*.
db.pragma('cache_size', -64000)

# Same as above.
db.cache_size = -64000

# Read the value of several pragmas:
print('cache_size:', db.cache_size)
print('foreign_keys:', db.foreign_keys)
print('journal_mode:', db.journal_mode)
print('page_size:', db.page_size)

# Set foreign_keys pragma on current connection *AND* on all
# connections opened subsequently.
db.pragma('foreign_keys', 1, permanent=True)

Attention

Pragmas set using the pragma() method are not re-applied when a new connection opens. To configure a pragma to be run whenever a new connection is opened, specify permanent=True.

db.pragma('foreign_keys', 1, permanent=True)

See also

SQLite PRAGMA documentation: https://sqlite.org/pragma.html

User-Defined Functions

SQLite can be extended with user-defined Python code. The SqliteDatabase class supports a variety of user-defined extensions:

Functions

User-defined functions accept any number of parameters and return a single value.

Aggregates

Aggregate values across multiple rows and return a single value.

Window Functions

Aggregates which support operating on sliding windows of data.

Collations

Control how values are ordered and sorted.

Table Functions

User-defined tables (requires cysqlite).

Shared Libraries

Load an extension from a shared library.

Function example

db = SqliteDatabase('analytics.db')

from urllib.parse import urlparse

@db.func('hostname')
def hostname(url):
    if url is not None:
        return urlparse(url).netloc

# Call this function in our code:
# The following finds the most common hostnames of referrers by count:
query = (PageView
         .select(fn.hostname(PageView.referrer), fn.COUNT(PageView.id))
         .group_by(fn.hostname(PageView.referrer))
         .order_by(fn.COUNT(PageView.id).desc()))
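The same function can be registered on a plain stdlib sqlite3 connection, which is a convenient way to experiment outside of Peewee (a sketch; the table and sample URLs are invented for illustration):

```python
import sqlite3
from urllib.parse import urlparse

def hostname(url):
    # Mirror the function above: return the netloc, or None for NULL input.
    if url is not None:
        return urlparse(url).netloc

conn = sqlite3.connect(':memory:')
conn.create_function('hostname', 1, hostname)

conn.execute('CREATE TABLE pageview (referrer TEXT)')
conn.executemany('INSERT INTO pageview VALUES (?)', [
    ('https://example.com/a',),
    ('https://example.com/b',),
    ('https://other.net/',),
])

# Most common referrer hostnames, by count.
rows = conn.execute(
    'SELECT hostname(referrer) AS host, COUNT(*) AS ct '
    'FROM pageview GROUP BY host ORDER BY ct DESC').fetchall()
print(rows)  # [('example.com', 2), ('other.net', 1)]
```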

Aggregate example

User-defined aggregates must define two methods:

  • step(*values) - called once for each row being aggregated.

  • finalize() - called only once to produce final aggregate value.

from hashlib import md5

@db.aggregate('md5')
class MD5Checksum(object):
    def __init__(self):
        self.checksum = md5()

    def step(self, value):
        self.checksum.update(value.encode('utf-8'))

    def finalize(self):
        return self.checksum.hexdigest()

# Usage:
# The following computes an aggregate MD5 checksum for files broken
# up into chunks and stored in the database.
query = (FileChunk
         .select(FileChunk.filename, fn.MD5(FileChunk.data))
         .group_by(FileChunk.filename)
         .order_by(FileChunk.filename, FileChunk.sequence))
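The class above works unchanged with the stdlib driver's create_aggregate(), which makes it easy to verify the checksum logic (a sketch with invented sample data; rows are aggregated in rowid scan order):

```python
import sqlite3
from hashlib import md5

class MD5Checksum:
    # Same aggregate as above: step() is called per row, finalize() once.
    def __init__(self):
        self.checksum = md5()
    def step(self, value):
        self.checksum.update(value.encode('utf-8'))
    def finalize(self):
        return self.checksum.hexdigest()

conn = sqlite3.connect(':memory:')
conn.create_aggregate('md5', 1, MD5Checksum)

conn.execute('CREATE TABLE filechunk (filename TEXT, sequence INTEGER, data TEXT)')
conn.executemany('INSERT INTO filechunk VALUES (?, ?, ?)',
                 [('a.txt', 1, 'hello '), ('a.txt', 2, 'world')])

(digest,) = conn.execute('SELECT md5(data) FROM filechunk').fetchone()
print(digest == md5(b'hello world').hexdigest())  # True
```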

Window function example

User-defined window functions are aggregates that additionally implement inverse() and value(), allowing the aggregate to be maintained over a sliding window of rows:

  • step(*values) - called for each row entering the window.

  • inverse(*values) - “invert” the effect of a call to step(*values), called for each row leaving the window.

  • value() - return the current value of the aggregate.

  • finalize() - return the final aggregate value.

# Window functions are normal aggregates with two additional methods:
# inverse(value) - Perform the inverse of step(value).
# value() - Report value at current step.
@db.aggregate('mysum')
class MySum(object):
    def __init__(self):
        self._value = 0
    def step(self, value):
        self._value += (value or 0)
    def inverse(self, value):
        self._value -= (value or 0)  # Do opposite of "step()".
    def value(self):
        return self._value
    def finalize(self):
        return self._value

# e.g., aggregate sum of employee salaries over their department.
query = (Employee
         .select(
             Employee.department,
             Employee.salary,
             fn.mysum(Employee.salary).over(
                 partition_by=[Employee.department]))
         .order_by(Employee.id))
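To see how the engine drives these methods, here is a pure-Python simulation of a two-row sliding frame (ROWS 1 PRECEDING AND CURRENT ROW), independent of any database:

```python
# Simulate SQLite driving a window aggregate over a 2-row sliding frame.
class MySum:
    def __init__(self):
        self._value = 0
    def step(self, value):
        self._value += (value or 0)
    def inverse(self, value):
        self._value -= (value or 0)
    def value(self):
        return self._value

values = [10, 20, 30, 40]
agg = MySum()
results = []
for i, v in enumerate(values):
    agg.step(v)                     # Row entering the frame.
    if i >= 2:
        agg.inverse(values[i - 2])  # Row leaving the frame.
    results.append(agg.value())     # Current frame total.

print(results)  # [10, 30, 50, 70]
```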

Collation example

Collations accept two values and return an integer indicating their relative ordering, in the style of Python 2's cmp(lhs, rhs): negative, zero, or positive.

@db.collation('ireverse')
def collate_reverse(s1, s2):
    # Case-insensitive reverse.
    s1, s2 = s1.lower(), s2.lower()
    return (s1 < s2) - (s1 > s2)  # Equivalent to -cmp(s1, s2)

# To use this collation to sort books in reverse order...
Book.select().order_by(collate_reverse.collation(Book.title))

# Or...
Book.select().order_by(Book.title.asc(collation='ireverse'))
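The same comparator can be tried out with the stdlib driver's create_collation() (a sketch with invented sample titles):

```python
import sqlite3

def collate_ireverse(s1, s2):
    # Case-insensitive reverse ordering, same comparator as above.
    s1, s2 = s1.lower(), s2.lower()
    return (s1 < s2) - (s1 > s2)

conn = sqlite3.connect(':memory:')
conn.create_collation('ireverse', collate_ireverse)

conn.execute('CREATE TABLE book (title TEXT)')
conn.executemany('INSERT INTO book VALUES (?)',
                 [('alpha',), ('Gamma',), ('beta',)])

titles = [t for (t,) in conn.execute(
    'SELECT title FROM book ORDER BY title COLLATE ireverse')]
print(titles)  # ['Gamma', 'beta', 'alpha']
```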

Table function example

Example user-defined table-value function (see the cysqlite TableFunction docs for full details).

from cysqlite import TableFunction
from playhouse.cysqlite_ext import CySqliteDatabase

db = CySqliteDatabase('my_app.db')

@db.table_function('series')
class Series(TableFunction):
    columns = ['value']
    params = ['start', 'stop', 'step']

    def initialize(self, start=0, stop=None, step=1):
        """
        Table-functions declare an initialize() method, which is
        called with whatever arguments the user has called the
        function with.
        """
        self.start = self.current = start
        self.stop = stop or float('Inf')
        self.step = step

    def iterate(self, idx):
        """
        Iterate is called repeatedly by the SQLite database engine
        until the required number of rows has been read **or** the
        function raises a `StopIteration` signalling no more rows
        are available.
        """
        if self.current > self.stop:
            raise StopIteration

        ret, self.current = self.current, self.current + self.step
        return (ret,)

# Usage:
cursor = db.execute_sql('SELECT * FROM series(?, ?, ?)', (0, 5, 2))
for value, in cursor:
    print(value)

# Prints:
# 0
# 2
# 4

Shared Libraries

Example:

# Load `closure.so` shared library in the current directory.
db = SqliteDatabase('my_app.db')
db.load_extension('closure')

To support shared libraries, your SQLite3 will need to have been compiled with support for run-time loadable extensions.

Locking Mode for Transactions

SQLite transactions can be opened in three different modes:

  • Deferred (default) - only acquires lock when a read or write is performed. The first read creates a shared lock and the first write creates a reserved lock. Because the acquisition of the lock is deferred until actually needed, it is possible that another thread or process could create a separate transaction and write to the database.

  • Immediate - a reserved lock is acquired immediately. In this mode, no other connection may write to the database or open an immediate or exclusive transaction. Other processes can continue to read from the database, however.

  • Exclusive - acquires an exclusive lock which prevents all connections (except those reading uncommitted data) from accessing the database until the transaction is complete.

Example specifying the locking mode:

db = SqliteDatabase('app.db')

with db.atomic('EXCLUSIVE'):
    read()
    write()


@db.atomic('IMMEDIATE')
def some_other_function():
    # This function is wrapped in an "IMMEDIATE" transaction.
    do_something_else()

For more information, see the SQLite locking documentation. To learn more about transactions in Peewee, see the Transactions documentation.
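The practical effect of an immediate transaction can be observed with two plain sqlite3 connections (a sketch, not Peewee code; isolation_level=None puts the stdlib driver in autocommit, so BEGIN is issued explicitly):

```python
import sqlite3, tempfile, os

path = os.path.join(tempfile.mkdtemp(), 'lock_demo.db')

writer = sqlite3.connect(path, isolation_level=None)
writer.execute('CREATE TABLE kv (k TEXT, v TEXT)')

# timeout=0: fail immediately instead of waiting on a busy lock.
other = sqlite3.connect(path, isolation_level=None, timeout=0)

# Acquire the RESERVED lock up-front.
writer.execute('BEGIN IMMEDIATE')
writer.execute("INSERT INTO kv VALUES ('a', '1')")

# Reads from the second connection still succeed (and see no
# uncommitted data)...
assert other.execute('SELECT COUNT(*) FROM kv').fetchone() == (0,)

# ...but writes fail right away with "database is locked".
locked = False
try:
    other.execute("INSERT INTO kv VALUES ('b', '2')")
except sqlite3.OperationalError:
    locked = True
print(locked)  # True

writer.execute('COMMIT')
```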

Danger

Do not alter the isolation_level property of the sqlite3.Connection object. Peewee requires the sqlite3 driver be in autocommit-mode, which is handled automatically by SqliteDatabase.

CySqlite

CySqliteDatabase uses the cysqlite driver, a high-performance alternative to the standard library sqlite3 module. cysqlite provides additional features and hooks not available in the standard library sqlite3 driver.

Installation:

pip install cysqlite

Usage:

from playhouse.cysqlite_ext import CySqliteDatabase

db = CySqliteDatabase('my_app.db', pragmas={
    'cache_size': -64000,
    'journal_mode': 'wal',
    'foreign_keys': 1,
})
class CySqliteDatabase(database, **kwargs)
Parameters:
  • pragmas (dict or list) – A dict, or list of 2-tuples, of pragma keys and values to set every time a connection is opened.

  • timeout – Set the busy-timeout on the SQLite driver (in seconds).

  • rank_functions (bool) – Make search result ranking functions available. Recommended only when using FTS4.

  • regexp_function (bool) – Make the REGEXP function available.

See also

CySqliteDatabase extends SqliteDatabase and inherits all methods for declaring user-defined functions, aggregates, window functions, collations, pragmas, etc.

Example:

db = CySqliteDatabase('app.db', pragmas={'journal_mode': 'wal'})
table_function(name)

Class-decorator for registering a cysqlite.TableFunction. Table functions are user-defined functions that, rather than returning a single, scalar value, can return any number of rows of tabular data.

See cysqlite docs for details on TableFunction API.

from cysqlite import TableFunction

@db.table_function('series')
class Series(TableFunction):
    columns = ['value']
    params = ['start', 'stop', 'step']

    def initialize(self, start=0, stop=None, step=1):
        """
        Table-functions declare an initialize() method, which is
        called with whatever arguments the user has called the
        function with.
        """
        self.start = self.current = start
        self.stop = stop or float('Inf')
        self.step = step

    def iterate(self, idx):
        """
        Iterate is called repeatedly by the SQLite database engine
        until the required number of rows has been read **or** the
        function raises a `StopIteration` signalling no more rows
        are available.
        """
        if self.current > self.stop:
            raise StopIteration

        ret, self.current = self.current, self.current + self.step
        return (ret,)

cursor = db.execute_sql('SELECT * FROM series(?, ?, ?)', (0, 5, 2))
for (value,) in cursor:
    print(value)

# Prints:
# 0
# 2
# 4
register_table_function(klass, name)
Parameters:
  • klass (TableFunction) – class implementing TableFunction API.

  • name (str) – name for user-defined table function.

Register a cysqlite.TableFunction class with the connection. Table functions are user-defined functions that, rather than returning a single, scalar value, can return any number of rows of tabular data.

unregister_table_function(name)
Parameters:

name – Name of the user-defined table function.

Returns:

True or False, depending on whether the function was removed.

Unregister the user-defined table function.

on_commit(fn)
Parameters:

fn – callable or None to clear the current hook.

Register a callback to be executed whenever a transaction is committed on the current connection. The callback accepts no parameters and the return value is ignored.

However, if the callback raises a ValueError, the transaction will be aborted and rolled-back.

Example:

db = CySqliteDatabase(':memory:')

@db.on_commit
def on_commit():
    logger.info('COMMITing changes')
on_rollback(fn)
Parameters:

fn – callable or None to clear the current hook.

Register a callback to be executed whenever a transaction is rolled back on the current connection. The callback accepts no parameters and the return value is ignored.

Example:

@db.on_rollback
def on_rollback():
    logger.info('Rolling back changes')
on_update(fn)
Parameters:

fn – callable or None to clear the current hook.

Register a callback to be executed whenever the database is written to (via an UPDATE, INSERT or DELETE query). The callback should accept the following parameters:

  • query - the type of query, either INSERT, UPDATE or DELETE.

  • database name - the default database is named main.

  • table name - name of table being modified.

  • rowid - the rowid of the row being modified.

The callback’s return value is ignored.

Example:

db = CySqliteDatabase(':memory:')

@db.on_update
def on_update(query_type, db, table, rowid):
    # e.g. INSERT row 3 into table users.
    logger.info('%s row %s into table %s', query_type, rowid, table)
authorizer(fn)
Parameters:

fn – callable or None to clear the current authorizer.

Register an authorizer callback. Authorizer callbacks must accept 5 parameters, which vary depending on the operation being checked.

  • op: operation code, e.g. cysqlite.SQLITE_INSERT.

  • p1: operation-specific value, e.g. table name for SQLITE_INSERT.

  • p2: operation-specific value.

  • p3: database name, e.g. "main".

  • p4: inner-most trigger or view responsible for the access attempt if applicable, else None.

See sqlite authorizer documentation for description of authorizer codes and values for parameters p1 and p2.

The authorizer callback must return one of:

  • cysqlite.SQLITE_OK: allow operation.

  • cysqlite.SQLITE_IGNORE: allow statement compilation but prevent the operation from occurring.

  • cysqlite.SQLITE_DENY: prevent statement compilation.

More details can be found in the cysqlite docs.
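The stdlib sqlite3 driver exposes an equivalent hook via set_authorizer(), using the same return-value convention; for example, to block DROP TABLE (a sketch):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE secrets (value TEXT)')

def authorizer(op, p1, p2, dbname, trigger):
    # Deny any attempt to drop a table; allow everything else.
    if op == sqlite3.SQLITE_DROP_TABLE:
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK

conn.set_authorizer(authorizer)

denied = False
try:
    conn.execute('DROP TABLE secrets')
except sqlite3.DatabaseError:  # "not authorized"
    denied = True
print(denied)  # True
```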

trace(fn, mask=2, expand_sql=True)
Parameters:
  • fn – callable or None to clear the current trace hook.

  • mask (int) – mask of what types of events to trace. Default value corresponds to SQLITE_TRACE_PROFILE.

  • expand_sql (bool) – Pass callback the sqlite3_expanded_sql() from sqlite3_stmt (expands bound parameters)

Register a trace hook (sqlite3_trace_v2). Trace callback must accept 4 parameters, which vary depending on the operation being traced.

  • event: type of event, e.g. SQLITE_TRACE_PROFILE.

  • sid: memory address of statement (only SQLITE_TRACE_CLOSE), else -1.

  • sql: SQL string. If expand_sql then bound parameters will be expanded (for SQLITE_TRACE_CLOSE, sql=None).

  • ns: estimated number of nanoseconds the statement took to run (only SQLITE_TRACE_PROFILE), else -1.

Any return value from callback is ignored.

More details can be found in the cysqlite docs.

slow_query_log(threshold_ms=50, logger=None, level=logging.WARNING, expand_sql=True)
Parameters:
  • threshold_ms – estimated millisecond threshold to log slow queries.

  • logger – logging namespace, defaults to 'peewee.cysqlite_ext'.

  • level (int) – level for slow query log.

  • expand_sql (bool) – expand bound parameters in SQL query.

Register a sqlite3_trace_v2 callback that will log slow queries to the given logger. Overrides previously-registered trace() callback. Automatically re-registered when new connection is opened.

progress(fn, n=1)
Parameters:
  • fn – callable or None to clear the current progress handler.

  • n (int) – approximate number of VM instructions to execute between calls to the progress handler.

Register a progress handler (sqlite3_progress_handler). Callback takes no arguments and returns 0 to allow progress to continue or any non-zero value to interrupt progress.

More details can be found in the cysqlite docs.
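The stdlib sqlite3 driver offers the same facility via set_progress_handler(); a sketch in which the handler simply counts how often it is invoked:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (n INTEGER)')
conn.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(1000)])

calls = [0]
def handler():
    calls[0] += 1
    return 0  # Returning non-zero would interrupt the query.

# Invoke the handler roughly every 50 VM instructions.
conn.set_progress_handler(handler, 50)
conn.execute('SELECT SUM(n) FROM t').fetchone()
print(calls[0] > 0)  # True
```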

autocommit

Property which returns a boolean indicating if autocommit is enabled. By default, this value will be True except when inside a transaction (or atomic() block).

Example:

>>> db = CySqliteDatabase(':memory:')
>>> db.autocommit
True
>>> with db.atomic():
...     print(db.autocommit)
...
False
>>> db.autocommit
True
backup(destination, pages=None, name=None, progress=None)
Parameters:
  • destination (CySqliteDatabase) – Database object to serve as destination for the backup.

  • pages (int) – Number of pages per iteration. Default value of -1 indicates all pages should be backed-up in a single step.

  • name (str) – Name of source database (may differ if you used ATTACH DATABASE to load multiple databases). Defaults to “main”.

  • progress – Progress callback, called with three parameters: the number of pages remaining, the total page count, and whether the backup is complete.

Example:

master = CySqliteDatabase('master.db')
replica = CySqliteDatabase('replica.db')

# Backup the contents of master to replica.
master.backup(replica)
backup_to_file(filename, pages=None, name=None, progress=None)
Parameters:
  • filename – Filename to store the database backup.

  • pages (int) – Number of pages per iteration. Default value of -1 indicates all pages should be backed-up in a single step.

  • name (str) – Name of source database (may differ if you used ATTACH DATABASE to load multiple databases). Defaults to “main”.

  • progress – Progress callback, called with three parameters: the number of pages remaining, the total page count, and whether the backup is complete.

Backup the current database to a file. The backed-up data is not a database dump, but an actual SQLite database file.

Example:

db = CySqliteDatabase('app.db')

def nightly_backup():
    filename = 'backup-%s.db' % (datetime.date.today())
    db.backup_to_file(filename)
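The stdlib sqlite3 module provides an analogous Connection.backup() (note that its progress callback receives the status first, then remaining and total pages). A sketch copying one in-memory database to another:

```python
import sqlite3

src = sqlite3.connect(':memory:')
src.execute('CREATE TABLE note (content TEXT)')
src.execute("INSERT INTO note VALUES ('hello')")
src.commit()

dst = sqlite3.connect(':memory:')

def progress(status, remaining, total):
    # Called after each iteration of `pages` pages.
    print(remaining, 'of', total, 'pages remain')

src.backup(dst, pages=1, progress=progress)
rows = dst.execute('SELECT content FROM note').fetchall()
print(rows)  # [('hello',)]
```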
blob_open(table, column, rowid, read_only=False, dbname=None)
Parameters:
  • table (str) – Name of table containing data.

  • column (str) – Name of column containing data.

  • rowid (int) – ID of row to retrieve.

  • read_only (bool) – Open the blob for reading only.

  • dbname (str) – Database name (e.g. if multiple databases attached).

Returns:

cysqlite.Blob instance which provides efficient access to the underlying binary data.

Return type:

cysqlite.Blob

See cysqlite documentation for more details.

Example:

class Image(Model):
    filename = TextField()
    data = BlobField()

buf_size = 1024 * 1024 * 8  # Allocate 8MB for storing file.
rowid = Image.insert({
    Image.filename: 'thefile.jpg',
    Image.data: fn.zeroblob(buf_size),
}).execute()

# Open the blob, returning a file-like object.
blob = db.blob_open('image', 'data', rowid)

# Write some data to the blob.
blob.write(image_data)
img_size = blob.tell()

# Read the data back out of the blob.
blob.seek(0)
image_data = blob.read(img_size)
class PooledCySqliteDatabase(database, **kwargs)

Connection-pooling variant of CySqliteDatabase.

APSW

APSW is a thin C wrapper over SQLite’s C API that exposes nearly every SQLite feature including virtual tables, virtual filesystems, and BLOB I/O.

Installation:

pip install apsw

Usage:

from playhouse.apsw_ext import APSWDatabase

db = APSWDatabase('my_app.db')

class BaseModel(Model):
    class Meta:
        database = db

Use the Field subclasses from playhouse.apsw_ext rather than those from peewee to ensure correct type adaptation. For example, use playhouse.apsw_ext.DateTimeField instead of peewee.DateTimeField.

class APSWDatabase(database, **connect_kwargs)

Subclass of SqliteDatabase using the APSW driver.

Parameters:
  • database (string) – filename of sqlite database

  • connect_kwargs – keyword arguments passed to apsw when opening a connection

register_module(mod_name, mod_inst)

Register a virtual table module globally. See the APSW virtual table documentation.

Parameters:
  • mod_name (string) – name to use for module

  • mod_inst (object) – an object implementing the Virtual Table interface

unregister_module(mod_name)

Unregister a previously registered module.

SQLCipher

SQLCipher is an encrypted wrapper around SQLite. Peewee exposes it through SqlCipherDatabase, which is API-identical to SqliteDatabase except for its constructor.

Installation:

pip install sqlcipher3

Usage:

from playhouse.sqlcipher_ext import SqlCipherDatabase

db = SqlCipherDatabase(
    'app.db',
    passphrase=os.environ['PASSPHRASE'],
    pragmas={'cache_size': -64000})

Example usage with deferred initialization and passphrase prompt:

db = SqlCipherDatabase(None)

class BaseModel(Model):
    class Meta:
        database = db

class Secret(BaseModel):
    value = TextField()

# Prompt the user and initialize the database with their passphrase.
while True:
    db.init('my_app.db', passphrase=input('Passphrase: '))
    try:
        db.get_tables()  # Will raise if passphrase is wrong.
        break
    except DatabaseError:
        print('Wrong passphrase.')
        db.init(None)

Pragma configuration (e.g. increasing PBKDF2 iterations):

db = SqlCipherDatabase('my_app.db',
                       passphrase='s3cr3t',
                       pragmas={'kdf_iter': 1_000_000})

SQLCipher can be configured using a number of extension PRAGMAs. The list of PRAGMAs and their descriptions can be found in the SQLCipher documentation.

class SqlCipherDatabase(database, passphrase, **kwargs)
Parameters:
  • database (str) – Path to the encrypted database file.

  • passphrase (str) – Encryption passphrase (should be 8 character minimum; enforce stronger requirements in your application).

If the database file does not exist, it is created and encrypted with a key derived from passphrase. If it does exist, passphrase must match the one used when the file was created.

If the passphrase is incorrect, an error will be raised when first attempting to access the database (typically DatabaseError: file is not a database).

rekey(passphrase)

Change the encryption passphrase for the open database.

SqliteQueueDatabase

SqliteQueueDatabase serializes all write queries through a single long-lived connection on a dedicated background thread. This allows multiple application threads to write to a SQLite database concurrently without conflict or timeout errors.

SqliteQueueDatabase can be used as a drop-in replacement for the regular SqliteDatabase if you want simple read and write access to a SQLite database from multiple threads, and do not need transactions.

from playhouse.sqliteq import SqliteQueueDatabase

db = SqliteQueueDatabase(
    'my_app.db',
    use_gevent=False,    # Use stdlib threading (default).
    autostart=True,      # Start the writer thread immediately.
    queue_max_size=64,   # Max pending writes before blocking.
    results_timeout=5.0, # Seconds to wait for a write to complete.
    pragmas={'journal_mode': 'wal'})

If you set autostart=False, start the writer thread explicitly:

db.start()

Stop the writer thread on application shutdown (waits for pending writes):

import atexit

@atexit.register
def _stop():
    db.stop()

Read queries work as normal. Open and close the connection per-request as you would with any other database. Only writes are funneled through the queue.

Transactions are not supported. Because writes from different threads are interleaved, there is no way to guarantee that the statements in a transaction from one thread execute atomically without statements from another thread appearing between them. The atomic() and transaction() methods raise a ValueError if called.

If you need to temporarily bypass the queue and write directly (for example, during a batch import), use pause() and unpause().
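The underlying pattern, a queue drained by a single writer thread, can be sketched with the stdlib alone (table and data invented for illustration):

```python
import sqlite3, threading, queue, tempfile, os

path = os.path.join(tempfile.mkdtemp(), 'queue_demo.db')
write_queue = queue.Queue()
STOP = object()  # Sentinel telling the writer to shut down.

def writer():
    # The only connection that ever writes to the database.
    conn = sqlite3.connect(path)
    conn.execute('CREATE TABLE log (msg TEXT)')
    conn.commit()
    while True:
        item = write_queue.get()
        if item is STOP:
            break
        sql, params, done = item
        conn.execute(sql, params)
        conn.commit()
        done.set()  # Signal the enqueuing thread that the write landed.
    conn.close()

thread = threading.Thread(target=writer)
thread.start()

# Any number of threads can enqueue writes; each waits on its own event.
done = threading.Event()
write_queue.put(("INSERT INTO log VALUES (?)", ('hello',), done))
done.wait()

write_queue.put(STOP)
thread.join()

rows = sqlite3.connect(path).execute('SELECT msg FROM log').fetchall()
print(rows)  # [('hello',)]
```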

class SqliteQueueDatabase(database, use_gevent=False, autostart=True, queue_max_size=None, results_timeout=None, **kwargs)
Parameters:
  • database (str) – database filename.

  • use_gevent (bool) – use gevent instead of threading.

  • autostart (bool) – automatically start writer background thread.

  • queue_max_size (int) – maximum size of pending writes queue.

  • results_timeout (float) – timeout for waiting for query results from write thread (seconds).

start()

Start the background writer thread.

stop()

Signal the writer thread to stop. Blocks until all pending writes are flushed.

is_stopped()

Return True if the writer thread is not running.

pause()

Block until the writer thread finishes its current work, then disconnect it. The calling thread takes over direct database access. Must be followed by a call to unpause().

unpause()

Resume the writer thread and reconnect the queue.

SQLite-Specific Fields

These field classes live in playhouse.sqlite_ext and can be used with SqliteDatabase and its subclasses:

class RowIDField

Primary-key field mapped to SQLite’s implicit rowid column.

For more information, see the SQLite documentation on rowid tables.

class Note(Model):
    rowid = RowIDField()  # Implied primary_key=True.
    content = TextField()
    timestamp = TimestampField()

RowIDField can be mapped to a different field name, but its underlying column name will always be rowid.

class Note(Model):
    id = RowIDField()
    ...
class AutoIncrementField

Integer primary key that uses SQLite’s AUTOINCREMENT keyword, guaranteeing the primary key is always strictly increasing even after deletions. Has a small performance cost versus the default PrimaryKeyField or RowIDField.

See the SQLite AUTOINCREMENT documentation for details.

class ISODateTimeField

Subclass of DateTimeField that preserves UTC offset information for timezone-aware datetimes when storing to SQLite’s text-based datetime representation.

class TDecimalField(max_digits=10, decimal_places=5, auto_round=False, rounding=None, *args, **kwargs)

Subclass of DecimalField that stores decimal values in a TEXT column to avoid any potential loss of precision that may occur when storing in a REAL (double-precision floating point) column. SQLite does not have a true numeric type, so this field ensures no precision is lost when using Decimals.
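The motivation can be demonstrated with the stdlib driver: a Decimal with more digits than a double can hold survives a TEXT round-trip but not a REAL one:

```python
import sqlite3
from decimal import Decimal

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE amounts (as_real REAL, as_text TEXT)')

value = Decimal('1.0000000000000001')  # More precision than a double holds.
conn.execute('INSERT INTO amounts VALUES (?, ?)', (float(value), str(value)))

as_real, as_text = conn.execute(
    'SELECT as_real, as_text FROM amounts').fetchone()
print(Decimal(as_text) == value)  # True: TEXT round-trips exactly.
print(Decimal(as_real) == value)  # False: REAL collapsed it to 1.0.
```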

SQLite JSON

JSONField enables storing and querying JSON data in SQLite using the SQLite json functions.

class JSONField(json_dumps=None, json_loads=None, **kwargs)
Parameters:
  • json_dumps – Custom JSON serializer. Defaults to json.dumps.

  • json_loads – Custom JSON deserializer. Defaults to json.loads.

Stores and retrieves JSON data transparently and provides efficient implementations for in-place modification and querying. Data is automatically serialized on write, deserialized on read.

Example model:

from peewee import *
from playhouse.sqlite_ext import JSONField

db = SqliteDatabase(':memory:')

class Config(Model):
    data = JSONField()

    class Meta:
        database = db

Config.create_table()

# Create two rows.
Config.create(data={'timeout': 30, 'retry': {'max': 5}})
Config.create(data={'timeout': 10, 'retry': {'max': 10}})

To access or modify specific object keys or array indexes in a JSON structure, you can treat the JSONField as if it were a dictionary/list:

# Select or order by a JSON value:
query = (Config
         .select(Config, Config.data['timeout'].alias('timeout'))
         .order_by(Config.data['timeout'].desc()))

# Aggregate on nested value:
avg = (Config
       .select(fn.SUM(Config.data['timeout']) / fn.COUNT(Config.id))
       .scalar())

# Filter by nested value:
Config.select().where(Config.data['retry']['max'] < 8)
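These queries compile down to SQLite's json_* functions; the equivalent raw SQL can be run with the stdlib driver, assuming a SQLite build with the JSON functions enabled (standard in modern builds):

```python
import sqlite3, json

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE config (data TEXT)')
conn.executemany('INSERT INTO config VALUES (?)', [
    (json.dumps({'timeout': 30, 'retry': {'max': 5}}),),
    (json.dumps({'timeout': 10, 'retry': {'max': 10}}),),
])

# Filter on a nested value, ordered by another path.
rows = conn.execute(
    "SELECT json_extract(data, '$.timeout') FROM config "
    "WHERE json_extract(data, '$.retry.max') < 8 "
    "ORDER BY json_extract(data, '$.timeout') DESC").fetchall()
print(rows)  # [(30,)]
```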

Data can be atomically updated, written and removed in-place:

# In-place update (preserves other keys):
(Config
 .update(data=Config.data.update({'timeout': 60}))
 .where(Config.data['timeout'] >= 30)
 .execute())

# Set a specific path:
(Config
 .update(data=Config.data['timeout'].set(120))
 .where(Config.data['retry']['max'] == 5)
 .execute())

# Update a specific path with an object. Existing field ("max") will be
# preserved in this example.
(Config
 .update(data=Config.data['retry'].update({'backoff': 1}))
 .execute())

# To overwrite a specific path with an object, use set():
(Config
 .update(data=Config.data['retry'].set({'allowed': 10}))
 .execute())

# Remove a key atomically:
(Config
 .update(data=Config.data.update({'retry': None}))
 .where(Config.id == 1)
 .execute())

# Another way to remove atomically:
(Config
 .update(data=Config.data['retry'].remove())
 .where(Config.id == 2)
 .execute())

Helpers for other JSON scenarios:

# Query JSON types:
query = (Config
         .select(Config.data.json_type(), Config.data['timeout'].json_type())
         .tuples())
# [('object', 'integer'), ('object', 'integer')]

# Query length of an array:
cfg1 = Config.create(data={'statuses': [1, 99, 1, 1]})
cfg2 = Config.create(data={'statuses': [1, 1]})

query = (Config
         .select(
             Config.data['statuses'],
             Config.data['statuses'].length())
         .tuples())

# [([1, 99, 1, 1], 4), ([1, 1], 2)]

Let’s add a nested value and then see how to iterate through its contents recursively using the tree() method:

Config.create(data={'x1': {'y1': 'z1', 'y2': 'z2'}, 'x2': [1, 2]})

tree = Config.data.tree().alias('tree')
query = (Config
         .select(Config.id, tree.c.fullkey, tree.c.value)
         .from_(Config, tree))

for row in query.tuples():
    print(row)

(1, '$', {'x1': {'y1': 'z1', 'y2': 'z2'}, 'x2': [1, 2]})
(1, '$.x2', [1, 2])
(1, '$.x2[0]', 1)
(1, '$.x2[1]', 2)
(1, '$.x1', {'y1': 'z1', 'y2': 'z2'})
(1, '$.x1.y1', 'z1')
(1, '$.x1.y2', 'z2')

The tree() and children() methods are powerful. For more information on how to utilize them, see the json1 extension documentation.

__getitem__(item)
Parameters:

item – key or array index to access in the JSON data.

Returns:

a special object exposing access to the JSON data.

Return type:

JSONPath

Access a specific key or array index in the JSON data. Returns a JSONPath object, which exposes convenient methods for reading or modifying a particular part of a JSON object.

Example:

# If metadata contains {"tags": ["list", "of", "tags"]}, we can
# extract the first tag in this way:
Post.select(Post, Post.metadata['tags'][0].alias('first_tag'))

For more examples see the JSONPath API documentation.

extract(*paths)
Parameters:

paths – One or more JSON paths to extract.

Extract one or more JSON path values. Returns a list when multiple paths are given.

extract_json(path)
Parameters:

path (str) – JSON path

Extract the value at the specified path as a JSON data-type. This corresponds to the -> operator added in SQLite 3.38.

extract_text(path)
Parameters:

path (str) – JSON path

Extract the value at the specified path as a SQL data-type. This corresponds to the ->> operator added in SQLite 3.38.

set(value, as_json=None)
Parameters:
  • value – a scalar value, list, or dictionary.

  • as_json (bool) – force the value to be treated as JSON, in which case it will be serialized as JSON in Python beforehand. By default, lists and dictionaries are treated as JSON to be serialized, while strings and integers are passed as-is.

Set the value stored in a JSONField.

Uses the json_set() function from the json1 extension.

replace(value, as_json=None)
Parameters:
  • value – a scalar value, list, or dictionary.

  • as_json (bool) – force the value to be treated as JSON, in which case it will be serialized as JSON in Python beforehand. By default, lists and dictionaries are treated as JSON to be serialized, while strings and integers are passed as-is.

Replace the existing value stored in a JSONField. The value will not be created if it does not already exist.

Uses the json_replace() function from the json1 extension.

insert(value, as_json=None)
Parameters:
  • value – a scalar value, list, or dictionary.

  • as_json (bool) – force the value to be treated as JSON, in which case it will be serialized as JSON in Python beforehand. By default, lists and dictionaries are treated as JSON to be serialized, while strings and integers are passed as-is.

Insert a value into a JSONField. An existing value will not be overwritten.

Uses the json_insert() function from the json1 extension.
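
The three functions wrapped by set(), replace() and insert() differ only in how they treat existing keys. A minimal sketch using the stdlib sqlite3 module (not peewee itself):

```python
import sqlite3

# json_set() creates or overwrites; json_replace() only overwrites keys that
# already exist; json_insert() only creates keys that are missing.
conn = sqlite3.connect(':memory:')
doc = '{"k1": "v1"}'
for fn_name in ('json_set', 'json_replace', 'json_insert'):
    (out,) = conn.execute(
        "SELECT %s(?, '$.k1', 'new', '$.k2', 'v2')" % fn_name,
        (doc,)).fetchone()
    print(fn_name, out)
# json_set     -> {"k1":"new","k2":"v2"}
# json_replace -> {"k1":"new"}
# json_insert  -> {"k1":"v1","k2":"v2"}
```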

append(value, as_json=None)
Parameters:
  • value – a scalar value, list, or dictionary.

  • as_json (bool) – force the value to be treated as JSON, in which case it will be serialized as JSON in Python beforehand. By default, lists and dictionaries are treated as JSON to be serialized, while strings and integers are passed as-is.

Append to the array stored in a JSONField.

Uses the json_set() function from the json1 extension.
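
Appending relies on the "[#]" path element (SQLite 3.31.0+), which json_set() interprets as one-past-the-end of an array. A sketch using the stdlib sqlite3 module:

```python
import sqlite3

# Appending to a JSON array uses json_set() with the '[#]' path element,
# which addresses one-past-the-end of the array (SQLite 3.31.0+).
conn = sqlite3.connect(':memory:')
doc = '{"tags": ["a", "b"]}'
(out,) = conn.execute("SELECT json_set(?, '$.tags[#]', 'c')", (doc,)).fetchone()
print(out)  # {"tags":["a","b","c"]}
```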

update(data)
Parameters:

data – a scalar value, list or dictionary to merge with the data currently stored in a JSONField. To remove a particular key, set that key to None in the updated data.

Merge new data into the JSON value using the RFC-7396 MergePatch algorithm to apply a patch (data parameter) against the column data. MergePatch can add, modify, or delete elements of a JSON object, which means update() is a generalized replacement for both set() and remove(). MergePatch treats JSON array objects as atomic, so update() cannot append to an array, nor modify individual elements of an array.

For more information as well as examples, see the SQLite json_patch() function documentation.
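
The MergePatch semantics can be sketched with the stdlib sqlite3 module by calling json_patch() directly: patch keys merge in recursively, and a null value in the patch deletes the corresponding key.

```python
import sqlite3, json

# json_patch() implements RFC-7396 MergePatch: 'b' is deleted because the
# patch sets it to null, while 'c' is merged recursively.
conn = sqlite3.connect(':memory:')
doc = '{"a": 1, "b": 2, "c": {"d": 3}}'
patch = '{"b": null, "c": {"e": 4}}'
(out,) = conn.execute("SELECT json_patch(?, ?)", (doc, patch)).fetchone()
print(json.loads(out))  # {'a': 1, 'c': {'d': 3, 'e': 4}}
```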

remove()

Remove the data stored in the JSONField.

Uses the json_remove() function from the json1 extension.

json_type()

Return a string identifying the type of value stored in the column.

The type returned will be one of:

  • object

  • array

  • integer

  • real

  • true

  • false

  • text

  • null <-- the string “null” means a JSON null value

  • NULL <-- an actual SQL NULL means the path was not found

Uses the json_type() function from the json1 extension.

length()

Return the length of the array stored in the column.

Uses the json_array_length() function from the json1 extension.
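
Both json_type() and json_array_length() accept an optional path argument; a minimal sketch via the stdlib sqlite3 module:

```python
import sqlite3

# json_type() and json_array_length() report the type and array length
# at a given path within a JSON document.
conn = sqlite3.connect(':memory:')
doc = '{"tags": ["a", "b", "c"], "n": 1.5, "flag": true}'
row = conn.execute(
    "SELECT json_type(?, '$.tags'), json_array_length(?, '$.tags'),"
    "       json_type(?, '$.n'), json_type(?, '$.flag')",
    (doc, doc, doc, doc)).fetchone()
print(row)  # ('array', 3, 'real', 'true')
```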

children()

The children function corresponds to json_each, a table-valued function that walks the JSON value provided and returns the immediate children of the top-level array or object. If a path is specified, then that path is treated as the top-most element.

The rows returned by calls to children() have the following attributes:

  • key: the key of the current element relative to its parent.

  • value: the value of the current element.

  • type: one of the data-types (see json_type()).

  • atom: the scalar value for primitive types, NULL for arrays and objects.

  • id: a unique ID referencing the current node in the tree.

  • parent: the ID of the containing node.

  • fullkey: the full path describing the current element.

  • path: the path to the container of the current row.

Internally this method uses the json_each function from the json1 extension.

Example usage (compare to tree() method):

class KeyData(Model):
    key = TextField()
    data = JSONField()

KeyData.create(key='a', data={'k1': 'v1', 'x1': {'y1': 'z1'}})
KeyData.create(key='b', data={'x1': {'y1': 'z1', 'y2': 'z2'}})

# We will query the KeyData model for the key and all the
# top-level keys and values in its data field.
kd = KeyData.data.children().alias('children')
query = (KeyData
         .select(kd.c.key, kd.c.value, kd.c.fullkey)
         .from_(KeyData, kd)
         .order_by(kd.c.key)
         .tuples())
print(query[:])

# PRINTS:
[('a', 'k1', 'v1',                    '$.k1'),
 ('a', 'x1', '{"y1":"z1"}',           '$.x1'),
 ('b', 'x1', '{"y1":"z1","y2":"z2"}', '$.x1')]

tree()

The tree function corresponds to json_tree, a table-valued function that recursively walks the JSON value provided and returns information about the keys at each level. If a path is specified, then that path is treated as the top-most element.

The rows returned by calls to tree() have the same attributes as rows returned by calls to children():

  • key: the key of the current element relative to its parent.

  • value: the value of the current element.

  • type: one of the data-types (see json_type()).

  • atom: the scalar value for primitive types, NULL for arrays and objects.

  • id: a unique ID referencing the current node in the tree.

  • parent: the ID of the containing node.

  • fullkey: the full path describing the current element.

  • path: the path to the container of the current row.

Internally this method uses the json_tree function from the json1 extension.

Example usage:

class KeyData(Model):
    key = TextField()
    data = JSONField()

KeyData.create(key='a', data={'k1': 'v1', 'x1': {'y1': 'z1'}})
KeyData.create(key='b', data={'x1': {'y1': 'z1', 'y2': 'z2'}})

# We will query the KeyData model for the key and all the
# keys and values in its data field, recursively.
kd = KeyData.data.tree().alias('tree')
query = (KeyData
         .select(kd.c.key, kd.c.value, kd.c.fullkey)
         .from_(KeyData, kd)
         .order_by(kd.c.key)
         .tuples())
print(query[:])

# PRINTS:
[('a',  None,  '{"k1":"v1","x1":{"y1":"z1"}}', '$'),
 ('b',  None,  '{"x1":{"y1":"z1","y2":"z2"}}', '$'),
 ('a',  'k1',  'v1',                           '$.k1'),
 ('a',  'x1',  '{"y1":"z1"}',                  '$.x1'),
 ('b',  'x1',  '{"y1":"z1","y2":"z2"}',        '$.x1'),
 ('a',  'y1',  'z1',                           '$.x1.y1'),
 ('b',  'y1',  'z1',                           '$.x1.y1'),
 ('b',  'y2',  'z2',                           '$.x1.y2')]

class JSONPath(field, path=None)
Parameters:
  • field (JSONField) – the field object we intend to access.

  • path (tuple) – Components comprising the JSON path.

A convenient, Pythonic way of representing JSON paths for use with JSONField. Implements the same methods as JSONField but designed for operating on nested items, e.g.:

Config.create(data={'timeout': 30, 'retries': {'max': 5}})

# Both Config.data['timeout'] and Config.data['retries']['max']
# are instances of JSONPath:
query = (Config
         .select(Config.data['timeout'])
         .where(Config.data['retries']['max'] < 10))

class JSONBField(json_dumps=None, json_loads=None, **kwargs)

Extends JSONField and stores data in the binary jsonb format (SQLite 3.45.0+). When reading raw column values, the data is returned in its encoded binary form; use the json() method to decode it:

# Raw read returns binary:
kv = KV.get(KV.key == 'a')
kv.value   # b"l'k1'v1"

# Use .json() to get a Python object:
kv = KV.select(KV.value.json()).get()
kv.value   # {'k1': 'v1'}

json()

Indicate the JSONB field-data should be deserialized and returned as JSON (as opposed to the SQLite binary format).

User-Defined Function Collection

The playhouse.sqlite_udf module contains a number of functions, aggregates, and table-valued functions grouped into named collections.

from playhouse.sqlite_udf import register_all, register_groups
from playhouse.sqlite_udf import DATE, STRING

db = SqliteDatabase('my_app.db')

register_all(db)                   # Register every function.
register_groups(db, DATE, STRING)  # Register selected groups.

# Register individual functions:
from playhouse.sqlite_udf import gzip, gunzip
db.register_function(gzip, 'gzip')
db.register_function(gunzip, 'gunzip')

Once registered, call functions via Peewee’s fn namespace or raw SQL:

# Find most common URL hostnames.
query = (Link
         .select(fn.hostname(Link.url).alias('host'), fn.COUNT(Link.id))
         .group_by(fn.hostname(Link.url))
         .order_by(fn.COUNT(Link.id).desc())
         .tuples())

Available functions

CONTROL_FLOW

if_then_else(cond, truthy, falsey=None)

Simple ternary-type operator, where, depending on the truthiness of the cond parameter, either the truthy or falsey value will be returned.
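
The behavior can be sketched by registering an equivalent Python function with the stdlib sqlite3 module (a hedged stand-in for the playhouse implementation, not its actual source):

```python
import sqlite3

def if_then_else(cond, truthy, falsey=None):
    # Return truthy when cond is truthy, otherwise falsey.
    return truthy if cond else falsey

conn = sqlite3.connect(':memory:')
conn.create_function('if_then_else', 3, if_then_else)
(label,) = conn.execute("SELECT if_then_else(3 > 2, 'yes', 'no')").fetchone()
print(label)  # yes
```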

DATE

strip_tz(date_str)
Parameters:

date_str – A datetime, encoded as a string.

Returns:

The datetime with any timezone info stripped off.

The time is not adjusted in any way; the timezone is simply removed.
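
A hypothetical stand-in implementation illustrating the intended behavior (not the playhouse source):

```python
from datetime import datetime

def strip_tz(date_str):
    # Drop the timezone offset without adjusting the time itself.
    dt = datetime.fromisoformat(date_str)
    return dt.replace(tzinfo=None).isoformat(sep=' ')

print(strip_tz('2024-01-15 10:30:00+05:00'))  # 2024-01-15 10:30:00
```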

humandelta(nseconds, glue=', ')
Parameters:
  • nseconds (int) – Number of seconds, total, in timedelta.

  • glue (str) – Fragment to join values.

Returns:

Easy-to-read description of timedelta.

Example, 86471 -> “1 day, 1 minute, 11 seconds”
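
A sketch of how such a function might break the total down (a hypothetical stand-in, not the playhouse source):

```python
def humandelta(nseconds, glue=', '):
    # Break total seconds into day/hour/minute/second parts, skipping zeros.
    parts = []
    for label, size in (('day', 86400), ('hour', 3600),
                        ('minute', 60), ('second', 1)):
        n, nseconds = divmod(nseconds, size)
        if n:
            parts.append('%d %s%s' % (n, label, 's' if n != 1 else ''))
    return glue.join(parts) or '0 seconds'

print(humandelta(86471))  # 1 day, 1 minute, 11 seconds
```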

mintdiff(datetime_value)
Parameters:

datetime_value – A date-time.

Returns:

Minimum difference between any two values in list.

Aggregate: minimum difference between any two datetimes.

avgtdiff(datetime_value)
Parameters:

datetime_value – A date-time.

Returns:

Average difference between values in list.

Aggregate: average difference between consecutive values.

duration(datetime_value)
Parameters:

datetime_value – A date-time.

Returns:

Duration from smallest to largest value in list, in seconds.

Aggregate: duration from the smallest to the largest value, in seconds.

date_series(start, stop, step_seconds=86400)
Parameters:
  • start (datetime) – Start datetime

  • stop (datetime) – Stop datetime

  • step_seconds (int) – Number of seconds comprising a step.

Table-value function: returns rows consisting of the date/time values encountered iterating from start to stop, step_seconds at a time.

Additionally, if start does not have a time component and step_seconds is greater-than-or-equal-to one day (86400 seconds), the values returned will be dates. Conversely, if start does not have a date component, values will be returned as times. Otherwise values are returned as datetimes.

Example:

SELECT * FROM date_series('2017-01-28', '2017-02-02');

value
-----
2017-01-28
2017-01-29
2017-01-30
2017-01-31
2017-02-01
2017-02-02

FILE

file_ext(filename)
Parameters:

filename (str) – Filename to extract extension from.

Returns:

Returns the file extension, including the leading “.”.

file_read(filename)
Parameters:

filename (str) – Filename to read.

Returns:

Contents of the file.

HELPER

gzip(data, compression=9)
Parameters:
  • data (bytes) – Data to compress.

  • compression (int) – Compression level (9 is max).

Returns:

Compressed binary data.

gunzip(data)
Parameters:

data (bytes) – Compressed data.

Returns:

Uncompressed binary data.

hostname(url)
Parameters:

url (str) – URL to extract hostname from.

Returns:

hostname portion of URL

toggle(key)
Parameters:

key – Key to toggle.

Toggle a key between True/False state. Example:

>>> toggle('my-key')
True
>>> toggle('my-key')
False
>>> toggle('my-key')
True
setting(key, value=None)
Parameters:
  • key – Key to set/retrieve.

  • value – Value to set.

Returns:

Value associated with key.

Store/retrieve a setting in memory; values persist for the lifetime of the application. To get the current value, specify only the key. To set a new value, call with both key and value.
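
A hypothetical stand-in illustrating the get/set behavior, registered as a variadic UDF with the stdlib sqlite3 module:

```python
import sqlite3

_settings = {}

def setting(key, value=None):
    # With a value, store it; with only a key, return the stored value.
    if value is not None:
        _settings[key] = value
    return _settings.get(key)

conn = sqlite3.connect(':memory:')
# narg=-1 lets the SQL function accept one or two arguments.
conn.create_function('setting', -1, setting)
conn.execute("SELECT setting('mode', 'dark')").fetchone()
(value,) = conn.execute("SELECT setting('mode')").fetchone()
print(value)  # dark
```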

clear_toggles()

Clears all state associated with the toggle() function.

clear_settings()

Clears all state associated with the setting() function.

MATH

randomrange(start, stop=None, step=None)
Parameters:
  • start (int) – Start of range (inclusive)

  • stop (int) – End of range (not inclusive)

  • step (int) – Interval at which to return a value.

Return a random integer in [start, stop).

gauss_distribution(mean, sigma)
Parameters:
  • mean (float) – Mean value

  • sigma (float) – Standard deviation

Return a random value drawn from the Gaussian (normal) distribution with the given mean and standard deviation.

sqrt(n)

Calculate the square root of n.

tonumber(s)
Parameters:

s (str) – String to convert to number.

Returns:

Integer, floating-point or NULL on failure.
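
A hypothetical stand-in illustrating the conversion order (integer first, then float, then NULL):

```python
def tonumber(s):
    # Try int first, then float; None (SQL NULL) on failure.
    try:
        return int(s)
    except ValueError:
        try:
            return float(s)
        except ValueError:
            return None

print(tonumber('42'), tonumber('3.14'), tonumber('abc'))  # 42 3.14 None
```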

mode(val)
Parameters:

val – Numbers in list.

Returns:

The mode, or most-common, number observed.

Aggregate: calculates mode of values.

minrange(val)
Parameters:

val – Value

Returns:

Min difference between two values.

Aggregate: minimum distance between two numbers in the sequence.

avgrange(val)
Parameters:

val – Value

Returns:

Average difference between values.

Aggregate: average distance between consecutive numbers in the sequence.

range(val)
Parameters:

val – Value

Returns:

The range from the smallest to largest value in sequence.

Aggregate: range of values observed.

median(val)
Parameters:

val – Value

Returns:

The median, or middle, value in a sequence.

Aggregate: median value of a sequence.

Note

Only available if you compiled the _sqlite_udf extension.

STRING

substr_count(haystack, needle)

Returns number of times needle appears in haystack.

strip_chars(haystack, chars)

Strips any characters in chars from beginning and end of haystack.

damerau_levenshtein_dist(s1, s2)

Computes the edit distance from s1 to s2 using the damerau variant of the levenshtein algorithm.

Note

Only available if you compiled the _sqlite_udf extension.

levenshtein_dist(s1, s2)

Computes the edit distance from s1 to s2 using the levenshtein algorithm.

Note

Only available if you compiled the _sqlite_udf extension.

str_dist(s1, s2)

Computes the edit distance from s1 to s2 using the standard library SequenceMatcher’s algorithm.

Note

Only available if you compiled the _sqlite_udf extension.

regex_search(regex, search_string)
Parameters:
  • regex (str) – Regular expression

  • search_string (str) – String to search for instances of regex.

Table-value function: searches a string for substrings that match the provided regex. Returns a row for each match found.

Example:

SELECT * FROM regex_search('\w+', 'extract words, ignore! symbols');

value
-----
extract
words
ignore
symbols