store

class gitblobts.store.Blob(timestamp: float, blob: bytes)

Bases: object

Instances of this class are returned by Store.getblobs().

This class is not meant to be instantiated directly.

Parameters:
  • timestamp – registered timestamp
  • blob – content

class gitblobts.store.Store(path: Union[str, pathlib.Path], *, compression: Optional[str] = None, key: Optional[bytes] = None)

Bases: object

Initialize the interface to a preexisting cloned git repository.

Parameters:
  • path – path to a preexisting cloned git repository. It must have a valid remote.
  • compression – name of a built-in or third-party importable module with compress and decompress functions, e.g. bz2, gzip, lzma. Once established, this must not be changed for a given repository, failing which file corruption can result.
  • key – optional encryption and decryption key as previously generated by generate_key(). Once established, this must not be changed for a given repository, failing which file corruption can result. The key should be stored safely. If it is lost, it will not be possible to decrypt previously encrypted blobs. If anyone else gains access to it, it can be used to decrypt blobs.
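
The compression parameter names a module that exposes compress and decompress functions. The following is a minimal sketch of how such a name could be resolved and used, relying only on the standard library. The helper load_codec is hypothetical and is not part of gitblobts; how the library resolves the name internally is an assumption.

```python
import importlib


def load_codec(name):
    """Hypothetical helper: import a module by name and return its
    (compress, decompress) functions."""
    module = importlib.import_module(name)
    return module.compress, module.decompress


payload = b"example blob " * 100
for name in ("bz2", "gzip", "lzma"):
    compress, decompress = load_codec(name)
    packed = compress(payload)
    # A round trip through the same module must return the original bytes.
    assert decompress(packed) == payload
    print(name, len(payload), "->", len(packed))
```

Data written by one module generally cannot be read back by another, which is consistent with the warning above against changing compression for an existing repository.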
addblob(blob: bytes, timestamp: Union[None, int, float, str, time.struct_time] = None) → None

Add a blob and also push it to the remote repository.

Parameters:
  • blob – bytes representation of text or an image or anything else.
  • timestamp – optional time at which to index the blob, preferably as a Unix timestamp. If a Unix timestamp, it can be a positive or negative number of whole or fractional seconds since the epoch. It doesn’t have to be unique, so a single timestamp can map to multiple blobs. If a string, it is parsed using dateparser.parse. If not specified, the current time is used.
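
The accepted timestamp types can be illustrated by a small normalization routine. This is a sketch under stated assumptions, not the library's code: the helper to_unix_seconds is hypothetical, a struct_time is assumed to be in UTC, and string input is left out because the library handles it with the third-party dateparser.parse.

```python
import calendar
import time


def to_unix_seconds(timestamp=None):
    """Hypothetical helper: normalize a timestamp argument to float
    seconds since the epoch."""
    if timestamp is None:
        return time.time()  # default: the current time
    if isinstance(timestamp, time.struct_time):
        # Assumption: the struct_time is in UTC; time.mktime would be
        # the counterpart for local time.
        return float(calendar.timegm(timestamp))
    if isinstance(timestamp, (int, float)):
        return float(timestamp)  # may be negative or fractional
    raise TypeError(f"unsupported timestamp type: {type(timestamp).__name__}")


print(to_unix_seconds(-1.5))
print(to_unix_seconds(time.gmtime(86400)))
```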

Idempotency, if required, is to be implemented externally.
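
One possible way to implement idempotency externally is to remember a digest of each blob already added and skip repeats. The wrapper make_idempotent below is hypothetical, and a plain list stands in for the store so that the sketch runs without a repository; with the library, the wrapped callable would presumably be the store's addblob method.

```python
import hashlib


def make_idempotent(add):
    """Wrap a one-argument add(blob) callable so repeated blobs are skipped."""
    seen = set()  # in-memory only; a real deployment would persist this

    def add_once(blob):
        digest = hashlib.sha256(blob).hexdigest()
        if digest in seen:
            return False  # duplicate content, nothing added
        add(blob)
        seen.add(digest)  # record only after add() succeeded
        return True

    return add_once


# Stand-in for a store: collected blobs end up in this list.
added = []
add_once = make_idempotent(added.append)
add_once(b"reading 1")
add_once(b"reading 1")  # skipped as a duplicate
add_once(b"reading 2")
```

Keying on content alone treats identical blobs at different timestamps as duplicates. If that is not wanted, the timestamp could be included in the digest.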

addblobs(blobs: Iterable[bytes], timestamps: Optional[Iterable[Union[None, int, float, str, time.struct_time]]] = None) → None

Add multiple blobs and also push them to the remote repository.

For adding multiple blobs, this method is more efficient than multiple calls to addblob(), as the commit and push are batched and done just once.

Parameters:
  • blobs – iterable or sequence.
  • timestamps – optional iterable or sequence of the same length as blobs. If not specified, the current time is used, and this will naturally increment just slightly for each subsequent blob. For further details, refer to the timestamp parameter of addblob().

If blobs and timestamps differ in length, the shorter of the two lengths is used, and the surplus items of the longer one are ignored.
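
Using the shorter of the two lengths matches the behavior of Python's built-in zip(), although whether the library uses zip() internally is an assumption. The sketch below shows the silent truncation, along with a hypothetical guard, checked_pairs, that a caller could use to fail loudly instead.

```python
blobs = [b"a", b"b", b"c"]
timestamps = [1.0, 2.0]

# zip() stops at the shorter input, so b"c" has no partner and is dropped.
pairs = list(zip(blobs, timestamps))


def checked_pairs(blobs, timestamps):
    """Hypothetical guard: raise instead of silently dropping items."""
    blobs, timestamps = list(blobs), list(timestamps)
    if len(blobs) != len(timestamps):
        raise ValueError(f"{len(blobs)} blobs but {len(timestamps)} timestamps")
    return list(zip(blobs, timestamps))
```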

Idempotency, if required, is to be implemented externally.

getblobs(start_time: Union[None, int, float, str, time.struct_time] = -inf, end_time: Union[None, int, float, str, time.struct_time] = inf, *, pull: Optional[bool] = False) → Iterator[gitblobts.store.Blob]

Yield blobs matching the specified time range.

This method currently requires listing and decoding the metadata for all files in the repository directory. Because each call incurs this cost, calls to it should be consolidated where possible.

Parameters:
  • start_time – inclusive start time. Refer to the corresponding type annotation, and also to the timestamp parameter of addblob().
  • end_time – inclusive end time. Refer to the corresponding type annotation, and also to the timestamp parameter of addblob().
  • pull – pull first from remote repository. A pull should be avoided unless necessary.
Yields:

instances of Blob. If start_time ≤ end_time, blobs are yielded in ascending chronological order sorted by their registered timestamp, otherwise in descending order.

To pull without yielding any blobs, one can therefore call getblobs(math.inf, math.inf, pull=True).
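
The range and ordering rules can be sketched in pure Python. This is an illustration, not the library's implementation: the Blob tuple is a stand-in for gitblobts.store.Blob, the function select is hypothetical, and ascending order is assumed to apply when start_time <= end_time.

```python
import math
from typing import Iterator, NamedTuple


class Blob(NamedTuple):  # stand-in for gitblobts.store.Blob
    timestamp: float
    blob: bytes


def select(blobs, start_time=-math.inf, end_time=math.inf) -> Iterator[Blob]:
    """Yield blobs in the inclusive range, ordered by the range's direction."""
    ascending = start_time <= end_time
    low, high = (start_time, end_time) if ascending else (end_time, start_time)
    matches = [b for b in blobs if low <= b.timestamp <= high]
    yield from sorted(matches, key=lambda b: b.timestamp, reverse=not ascending)


data = [Blob(3.0, b"c"), Blob(1.0, b"a"), Blob(2.0, b"b")]
oldest_first = [b.blob for b in select(data, 1, 2)]
newest_first = [b.blob for b in select(data, 3, 2)]
nothing = list(select(data, math.inf, math.inf))
```

In this sketch a range of (math.inf, math.inf) matches no finite timestamp, which is why such a call can serve to trigger a pull without yielding anything.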

gitblobts.store.generate_key() → bytes

Return a random new Fernet key.

The key should be stored safely. If it is lost, it will not be possible to decrypt previously encrypted blobs. If anyone else gains access to it, it can be used to decrypt blobs.

An example of a generated key is b'NrYgSuzXVRWtarWcczyuwFs6vZftN1rnlzZtGDaV7iE='.

Returns: key used for encryption and decryption.
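
A Fernet key is 32 random bytes in URL-safe base64 encoding, which gives the 44-character form shown in the example above. The sketch below reproduces that format with the standard library purely for illustration. The function sketch_fernet_key is hypothetical, and how generate_key() produces its key internally is an assumption. In practice, generate_key() should be used.

```python
import base64
import os


def sketch_fernet_key() -> bytes:
    """Illustrative only: 32 random bytes, URL-safe base64-encoded."""
    return base64.urlsafe_b64encode(os.urandom(32))


key = sketch_fernet_key()
raw = base64.urlsafe_b64decode(key)

# The example key from the documentation decodes to the same raw length.
example = b"NrYgSuzXVRWtarWcczyuwFs6vZftN1rnlzZtGDaV7iE="
example_raw = base64.urlsafe_b64decode(example)
```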