moleculeresolver.SqliteMoleculeCache

Classes

SqliteMoleculeCache

A class for caching molecule information using SQLite.

Module Contents

class moleculeresolver.SqliteMoleculeCache.SqliteMoleculeCache(db_path: str | None = ':memory:', expiration_datetime: datetime.datetime | None = None)

A class for caching molecule information using SQLite.

This class provides methods to initialize, manage, and query a SQLite database for storing molecule information. It keeps a separate database connection for each thread and implements the context manager protocol, so that connections are opened and closed reliably.

Attributes:

db_path (str): Path to the SQLite database file. Defaults to “:memory:”.

expiration_datetime (Optional[datetime]): Cutoff datetime for cached entries. Entries added before it are deleted; if None, nothing expires.

_connections (dict): Thread-specific database connections.

_main_thread_id (int): ID of the main thread.

db_path = ':memory:'
expiration_datetime = None
_connections
_main_thread_id
__enter__() SqliteMoleculeCache

Enter the runtime context related to this object.

Creates tables and deletes expired entries.

Returns:

SqliteMoleculeCache: The instance of the class.

close_child_connections() None

Close all child thread database connections.

__exit__(exception_type, exception_value, exception_traceback) None

Exit the runtime context and close all database connections.

Closes all child thread connections and optimizes the main thread’s connection before closing.
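The enter/exit lifecycle described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the library's implementation: the class `TinyCache`, the table layout, and the use of `PRAGMA optimize` to model the "optimizes" step are all hypothetical.

```python
import sqlite3


class TinyCache:
    """Hypothetical stand-in showing the enter/exit lifecycle, not the real class."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.db_path = db_path
        self.connection = None

    def __enter__(self) -> "TinyCache":
        self.connection = sqlite3.connect(self.db_path)
        # Table name and columns are illustrative only.
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS molecules (id INTEGER PRIMARY KEY, smiles TEXT)"
        )
        return self

    def __exit__(self, exception_type, exception_value, exception_traceback) -> None:
        # Assumption: the "optimize" step is modelled with PRAGMA optimize.
        self.connection.execute("PRAGMA optimize")
        self.connection.commit()
        self.connection.close()
        self.connection = None


with TinyCache() as cache:
    cache.connection.execute("INSERT INTO molecules (smiles) VALUES (?)", ("CCO",))
    row_count = cache.connection.execute("SELECT COUNT(*) FROM molecules").fetchone()[0]

connection_after_exit = cache.connection
```

Because `__exit__` runs even when the body raises, the connection is released on every path out of the `with` block.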

get_connection() sqlite3.Connection

Get or create a thread-specific database connection.

Returns:

sqlite3.Connection: A SQLite database connection for the current thread.
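A get-or-create lookup keyed by thread ID is one plausible way to realize this behavior. The sketch below assumes that design; `PerThreadConnections`, its lock, and the temporary database path are hypothetical and are not taken from the library.

```python
import os
import sqlite3
import tempfile
import threading


class PerThreadConnections:
    """Hypothetical helper: one sqlite3 connection per thread, keyed by thread ID."""

    def __init__(self, db_path: str) -> None:
        self.db_path = db_path
        self._connections: dict[int, sqlite3.Connection] = {}
        self._lock = threading.Lock()

    def get_connection(self) -> sqlite3.Connection:
        thread_id = threading.get_ident()
        with self._lock:
            if thread_id not in self._connections:
                # check_same_thread=False lets the owner close the connection
                # from another thread later; each thread still uses only its own.
                self._connections[thread_id] = sqlite3.connect(
                    self.db_path, check_same_thread=False
                )
            return self._connections[thread_id]

    def close_all(self) -> None:
        with self._lock:
            for connection in self._connections.values():
                connection.close()
            self._connections.clear()


db_path = os.path.join(tempfile.mkdtemp(), "cache.db")  # hypothetical location
pool = PerThreadConnections(db_path)
main_connection = pool.get_connection()

worker_connections = []
worker = threading.Thread(target=lambda: worker_connections.append(pool.get_connection()))
worker.start()
worker.join()

same_thread_reuses = pool.get_connection() is main_connection
worker_got_own = worker_connections[0] is not main_connection
pool.close_all()
```

The sketch uses a file rather than ":memory:" because every sqlite3 connection opened on ":memory:" gets its own private database, so separate connections would not see the same data.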

_create_tables() None

Create the necessary tables in the SQLite database if they don’t exist.

save(service: str | list[str], identifier_mode: str | list[str], identifier: str | list[str], molecules: moleculeresolver.molecule.Molecule | list[moleculeresolver.molecule.Molecule]) None

Save molecule information to the database.

Saves one or multiple Molecule objects to the database, along with their associated service, identifier_mode, and identifier.

Args:

service (Union[str, list[str]]): The service(s) associated with the molecule(s).

identifier_mode (Union[str, list[str]]): The identifier mode(s) for the molecule(s).

identifier (Union[str, list[str]]): The identifier(s) for the molecule(s).

molecules (Union[Molecule, list[Molecule]]): The molecule(s) to be saved.

Raises:

ValueError: If a molecule’s synonyms contain a pipe symbol or if molecule properties don’t match the input values.
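A pipe check like this usually means the character serves as a delimiter somewhere. The sketch below assumes, without confirmation from the source, that synonyms are packed into a single pipe-separated string; `join_synonyms` and `split_synonyms` are hypothetical helpers that show why a pipe inside a synonym would have to be rejected.

```python
def join_synonyms(synonyms: list[str]) -> str:
    """Hypothetical helper: pack synonyms into one pipe-delimited string."""
    for synonym in synonyms:
        if "|" in synonym:
            # A pipe inside a synonym would be indistinguishable from a delimiter.
            raise ValueError(f"Synonym contains the pipe delimiter: {synonym!r}")
    return "|".join(synonyms)


def split_synonyms(packed: str) -> list[str]:
    """Inverse of join_synonyms; an empty string means no synonyms."""
    return packed.split("|") if packed else []


packed = join_synonyms(["ethanol", "ethyl alcohol"])
unpacked = split_synonyms(packed)

try:
    join_synonyms(["bad|name"])
    rejected = False
except ValueError:
    rejected = True
```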

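_search(service: str | list[str], identifier_mode: str | list[str], identifier: str | list[str], only_check_for_existence: bool | None = False) list[moleculeresolver.molecule.Molecule] | None | list[list[moleculeresolver.molecule.Molecule] | None] | bool | list[bool]

Search for molecules in the database based on the provided criteria.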

Supports single and multiple molecule searches. It can either return the full molecule information or just check for existence.

Args:

service (Union[str, list[str]]): The service(s) to search in.

identifier_mode (Union[str, list[str]]): The mode(s) of identification (e.g., ‘name’, ‘cas’).

identifier (Union[str, list[str]]): The identifier(s) to search for.

only_check_for_existence (Optional[bool]): If True, only check if the molecule exists. Defaults to False.

Returns:

Union[Optional[list[Molecule]], list[Optional[list[Molecule]]], bool, list[bool]]:

  • If searching for a single molecule:
    • If only_check_for_existence is False: returns Optional[list[Molecule]]

    • If only_check_for_existence is True: returns bool

  • If searching for multiple molecules:
    • If only_check_for_existence is False: returns list[Optional[list[Molecule]]]

    • If only_check_for_existence is True: returns list[bool]

Raises:

ValueError: If the input parameters are inconsistent or invalid for multiple searches.
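The single-versus-multiple return shapes can be illustrated with a small existence lookup over an in-memory set. This is a sketch only: `exists_in`, the sample keys, and the rule that a single string is repeated for every identifier in a multiple search are assumptions, since the source does not say exactly which input combinations count as inconsistent.

```python
from typing import Union


def exists_in(
    known: set[tuple[str, str, str]],
    service: Union[str, list[str]],
    identifier_mode: Union[str, list[str]],
    identifier: Union[str, list[str]],
) -> Union[bool, list[bool]]:
    """Hypothetical lookup mirroring the single-vs-multiple return shapes."""
    if isinstance(identifier, str):
        if not (isinstance(service, str) and isinstance(identifier_mode, str)):
            raise ValueError("A single search needs a single service and mode.")
        return (service, identifier_mode, identifier) in known

    # Multiple search: repeat a scalar service or mode once per identifier.
    count = len(identifier)
    services = [service] * count if isinstance(service, str) else list(service)
    modes = (
        [identifier_mode] * count
        if isinstance(identifier_mode, str)
        else list(identifier_mode)
    )
    if len(services) != count or len(modes) != count:
        raise ValueError("service, identifier_mode and identifier lengths differ.")
    return [key in known for key in zip(services, modes, identifier)]


# Sample data; the service and mode names are placeholders.
known = {("pubchem", "name", "ethanol"), ("pubchem", "cas", "64-17-5")}
single = exists_in(known, "pubchem", "name", "ethanol")
batch = exists_in(
    known, "pubchem", ["name", "cas", "name"], ["ethanol", "64-17-5", "methanol"]
)
```

A single search yields one value and a multiple search yields one value per identifier, in input order.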

exists(service: str | list[str], identifier_mode: str | list[str], identifier: str | list[str]) bool | list[bool]

Check if molecule(s) exist in the database based on the provided criteria.

Supports both single and multiple molecule existence checks.

Args:

service (Union[str, list[str]]): The service(s) to search in. Can be a single string or a sequence of strings for multiple checks.

identifier_mode (Union[str, list[str]]): The mode(s) of identification (e.g., ‘name’, ‘cas’). Can be a single string or a sequence of strings for multiple checks.

identifier (Union[str, list[str]]): The identifier(s) to search for. Can be a single string or a sequence of strings for multiple checks.

Returns:

Union[bool, list[bool]]:

  • For a single check: A boolean indicating whether the molecule exists.

  • For multiple checks: A list of booleans, each indicating whether the corresponding molecule exists.

Note:

This method uses the internal _search method with the ‘only_check_for_existence’ flag set to True.

search(service: str | list[str], identifier_mode: str | list[str], identifier: str | list[str]) list[moleculeresolver.molecule.Molecule] | None | list[list[moleculeresolver.molecule.Molecule] | None]

Search for molecules based on the given parameters.

Searches for molecules using the specified service, identifier mode, and identifier. Supports both single and multiple searches.

Args:

service (Union[str, list[str]]): The service(s) to use for the search. Can be a single string or a sequence of strings.

identifier_mode (Union[str, list[str]]): The identifier mode(s) to use. Can be a single string or a sequence of strings.

identifier (Union[str, list[str]]): The identifier(s) to search for. Can be a single string or a sequence of strings.

Returns:

Union[Optional[list[Molecule]], list[Optional[list[Molecule]]]]:

  • If a single search is performed, returns either None or a list of Molecule objects.

  • If multiple searches are performed, returns a list of results, where each result is either None or a list of Molecule objects.

Note:

This method internally calls the _search method to perform the actual search operation.

delete_expired() None

Delete expired molecules from the cache.

Removes all molecules from the database that were added before the expiration datetime, if set.

Note:

This method only performs the deletion if ‘self.expiration_datetime’ is set.
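The expiry rule can be sketched directly against sqlite3. The schema below is hypothetical (the real table and column names may differ), as is the choice to store timestamps as ISO 8601 text.

```python
import sqlite3
from datetime import datetime, timedelta

connection = sqlite3.connect(":memory:")
# Hypothetical schema: the real table and column names may differ.
connection.execute("CREATE TABLE molecules (identifier TEXT, datetime_added TEXT)")

now = datetime(2024, 6, 1, 12, 0, 0)
rows = [
    ("ethanol", (now - timedelta(days=30)).isoformat()),
    ("water", (now - timedelta(days=1)).isoformat()),
]
connection.executemany("INSERT INTO molecules VALUES (?, ?)", rows)


def delete_expired(conn, expiration_datetime):
    """Delete rows added before expiration_datetime; do nothing if it is None."""
    if expiration_datetime is None:
        return
    # ISO 8601 strings in the same format compare chronologically as text.
    conn.execute(
        "DELETE FROM molecules WHERE datetime_added < ?",
        (expiration_datetime.isoformat(),),
    )
    conn.commit()


delete_expired(connection, None)  # no cutoff set: nothing is removed
count_without_cutoff = connection.execute("SELECT COUNT(*) FROM molecules").fetchone()[0]

delete_expired(connection, now - timedelta(days=7))
remaining = [row[0] for row in connection.execute("SELECT identifier FROM molecules")]
```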

delete_by_service(service: str, mode: str | None = '%') None

Delete all molecules associated with a specific service from the cache.

Args:

service (str): The name of the service whose molecules should be deleted.
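The signature also takes a `mode` argument whose default is '%', the SQL LIKE wildcard. The sketch below assumes, without confirmation from the source, that `mode` is matched against the identifier mode with LIKE, so that the default matches every mode. The schema and the sample service names are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows.
connection.execute(
    "CREATE TABLE molecules (service TEXT, identifier_mode TEXT, identifier TEXT)"
)
connection.executemany(
    "INSERT INTO molecules VALUES (?, ?, ?)",
    [
        ("pubchem", "name", "ethanol"),
        ("pubchem", "cas", "64-17-5"),
        ("cir", "name", "ethanol"),
    ],
)


def delete_by_service(conn, service, mode="%"):
    """Delete rows for one service; with LIKE, '%' matches any identifier mode."""
    conn.execute(
        "DELETE FROM molecules WHERE service = ? AND identifier_mode LIKE ?",
        (service, mode),
    )
    conn.commit()


def remaining_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM molecules").fetchone()[0]


delete_by_service(connection, "pubchem", "cas")  # only the 'cas' row for pubchem
after_mode_delete = remaining_rows(connection)

delete_by_service(connection, "pubchem")  # default '%': every pubchem row
after_service_delete = remaining_rows(connection)
```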

recreate_all_tables() None

Recreate all tables in the database.

Closes any existing connections, deletes the database files, and then recreates the tables. Use with caution, as it will result in data loss.

Raises:

RuntimeError: If called in a multi-threaded environment (more than one connection).

count(service: str | None = None) int

Count the number of molecules in the database, optionally filtered by service.

Args:

service (Optional[str]): The service to filter by. If None, counts all molecules.

Returns:

int: The number of molecules matching the criteria.
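An optional filter of this kind maps naturally onto a conditional WHERE clause. A minimal sketch, assuming a hypothetical `molecules` table with a `service` column and placeholder service names:

```python
import sqlite3
from typing import Optional

connection = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows.
connection.execute("CREATE TABLE molecules (service TEXT, identifier TEXT)")
connection.executemany(
    "INSERT INTO molecules VALUES (?, ?)",
    [("pubchem", "ethanol"), ("pubchem", "water"), ("cir", "ethanol")],
)


def count(conn: sqlite3.Connection, service: Optional[str] = None) -> int:
    """Count all rows, or only the rows for one service when it is given."""
    if service is None:
        cursor = conn.execute("SELECT COUNT(*) FROM molecules")
    else:
        # A bound parameter avoids building SQL from user-supplied text.
        cursor = conn.execute(
            "SELECT COUNT(*) FROM molecules WHERE service = ?", (service,)
        )
    return cursor.fetchone()[0]


total = count(connection)
pubchem_only = count(connection, "pubchem")
unknown_service = count(connection, "no-such-service")
```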