Deterministic Hashing

To detect whether a Step has to be re-run, Tango computes deterministic hashes of the inputs to the Step.

The centerpiece of this module is the det_hash() function, which computes a deterministic hash of an arbitrary Python object. The classes in this module let you customize what goes into that hash.
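
To illustrate why a deterministic hash of the inputs is enough to decide whether work must be repeated, here is a minimal sketch that uses only the standard library. The names toy_det_hash and run_step are hypothetical, and the pickle-plus-SHA-256 combination is a stand-in chosen for illustration, not Tango's actual implementation or caching logic.

```python
import hashlib
import pickle

def toy_det_hash(obj) -> str:
    """Hash the pickled bytes of obj (a stand-in, not Tango's det_hash)."""
    return hashlib.sha256(pickle.dumps(obj, protocol=4)).hexdigest()

_cache = {}      # maps input hash -> previously computed result
run_count = 0    # how many times the "step" body actually executed

def run_step(inputs):
    """Run the toy step only if these inputs have not been seen before."""
    global run_count
    key = toy_det_hash(inputs)
    if key not in _cache:
        run_count += 1
        _cache[key] = sum(inputs["numbers"]) * inputs["scale"]
    return _cache[key]

first = run_step({"numbers": [1, 2, 3], "scale": 2})
second = run_step({"numbers": [1, 2, 3], "scale": 2})  # same inputs: cache hit
third = run_step({"numbers": [1, 2, 3], "scale": 3})   # changed input: re-run
```

In this sketch the step body runs twice for three calls, because the second call hashes to a key that is already cached.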

class tango.common.det_hash.CustomDetHash

By default, det_hash() pickles an object, and returns the hash of the pickled representation. Sometimes you want to take control over what goes into that hash. In that case, derive from this class and implement det_hash_object(). det_hash() will pickle the result of this method instead of the object itself.

If you return None, det_hash() falls back to the original behavior and pickles the object.
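
The hook described above can be sketched roughly as follows. This is a simplified model, assuming a standard-library pickle and SHA-256 in place of whatever Tango uses internally; toy_det_hash and Tokenizer are hypothetical names, and the sketch only consults the hook on the top-level object.

```python
import hashlib
import pickle
from typing import Any

def toy_det_hash(obj: Any) -> str:
    """Simplified stand-in for det_hash(): pickle, then hash the bytes."""
    hook = getattr(obj, "det_hash_object", None)
    if callable(hook):
        replacement = hook()
        if replacement is not None:
            obj = replacement  # hash the replacement instead of the object
        # a None result means: fall back to pickling the object itself
    return hashlib.sha256(pickle.dumps(obj, protocol=4)).hexdigest()

class Tokenizer:
    """Hypothetical class whose scratch cache should not affect its hash."""

    def __init__(self, vocab_name: str):
        self.vocab_name = vocab_name
        self.cache = {}  # scratch state, irrelevant to the results

    def det_hash_object(self) -> Any:
        # Only the vocabulary name identifies this tokenizer.
        return self.vocab_name

a = Tokenizer("vocab-a")
b = Tokenizer("vocab-a")
b.cache["hello"] = [1, 2]  # mutate scratch state only

same = toy_det_hash(a) == toy_det_hash(b)
different = toy_det_hash(a) != toy_det_hash(Tokenizer("vocab-b"))
```

Here the two tokenizers with the same vocabulary name hash identically even though their caches differ, which is the kind of control det_hash_object() is meant to give.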

abstract det_hash_object()

Return an object to use for deterministic hashing instead of self.

Return type:

Any

class tango.common.det_hash.DetHashFromInitParams(*args, **kwargs)

Add this class as a mixin base class to make sure your class’s det_hash is derived exclusively from the parameters passed to __init__().

det_hash_object()

Returns a copy of the parameters that were passed to the class instance’s __init__() method.

Return type:

Any
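
One way such a mixin could work is to capture the constructor arguments in __new__() and return them from det_hash_object(). The sketch below assumes that approach; ToyHashFromInitParams, Encoder, and toy_det_hash are hypothetical names built on the standard library, not Tango's classes.

```python
import hashlib
import pickle
from typing import Any

def toy_det_hash(obj: Any) -> str:
    """Simplified stand-in for det_hash() that honors det_hash_object()."""
    hook = getattr(obj, "det_hash_object", None)
    replacement = hook() if callable(hook) else None
    target = obj if replacement is None else replacement
    return hashlib.sha256(pickle.dumps(target, protocol=4)).hexdigest()

class ToyHashFromInitParams:
    """Hypothetical mixin: remember the constructor arguments at creation."""

    def __new__(cls, *args, **kwargs):
        instance = super().__new__(cls)
        # Store a copy so later changes to kwargs cannot alter the hash.
        instance._init_params = (args, dict(kwargs))
        return instance

    def det_hash_object(self) -> Any:
        return self._init_params

class Encoder(ToyHashFromInitParams):
    def __init__(self, hidden_size: int, dropout: float = 0.1):
        self.hidden_size = hidden_size
        self.dropout = dropout
        self.step_count = 0  # changes during use, not part of the hash

e1 = Encoder(256, dropout=0.2)
e2 = Encoder(256, dropout=0.2)
e2.step_count = 1000            # runtime state differs
e3 = Encoder(512, dropout=0.2)  # constructor arguments differ
```

In this sketch e1 and e2 hash identically despite their different runtime state, while e3 does not. Note that the sketch records positional and keyword arguments separately, so passing the same value positionally instead of by keyword would change the hash.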

class tango.common.det_hash.DetHashWithVersion

Add this class as a mixin base class to make sure your class’s det_hash can be modified by altering a static VERSION member of your class.

Let’s say you are working on training a model. Whenever you change code that’s part of your experiment, you have to change the VERSION of the step that’s running that code to tell Tango that the step has changed and should be re-run. But if you are training your model using Tango’s built-in TorchTrainStep, how do you change the version of the step? The answer is, leave the version of the step alone, and instead add a VERSION to your model by deriving from this class:

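from tango.common.det_hash import DetHashWithVersion

class MyModel(DetHashWithVersion):
    VERSION = "001"  # change this value whenever the model code changes

    def __init__(self, *args, **kwargs):  # constructor arguments elided
        ...

det_hash_object()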

Returns a tuple of VERSION and this instance itself.

Return type:

Any
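
The effect of changing VERSION can be illustrated with a small standard-library sketch. ToyHashWithVersion, MyModel, and toy_det_hash are hypothetical stand-ins; in particular, the sketch hashes the instance's attributes rather than the instance itself, which is a simplification made to keep the example self-contained.

```python
import hashlib
import pickle
from typing import Any

def toy_det_hash(obj: Any) -> str:
    """Simplified stand-in for det_hash() that honors det_hash_object()."""
    hook = getattr(obj, "det_hash_object", None)
    replacement = hook() if callable(hook) else None
    target = obj if replacement is None else replacement
    return hashlib.sha256(pickle.dumps(target, protocol=4)).hexdigest()

class ToyHashWithVersion:
    """Hypothetical mixin: fold a class-level VERSION into the hash."""

    VERSION = None

    def det_hash_object(self) -> Any:
        if self.VERSION is None:
            return None  # no version set: use the default behavior
        # Simplification: use the instance attributes, not the instance.
        return (self.VERSION, dict(vars(self)))

class MyModel(ToyHashWithVersion):
    VERSION = "001"

    def __init__(self, layers: int):
        self.layers = layers

model = MyModel(layers=4)
before = toy_det_hash(model)

MyModel.VERSION = "002"  # simulates editing VERSION after a code change
after = toy_det_hash(model)
```

The model's attributes are unchanged between the two calls, yet the hash differs, so anything keyed on that hash would be treated as new.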

tango.common.det_hash.det_hash(o)

Returns a deterministic hash code of an arbitrary Python object.

If you want to override how we calculate the deterministic hash, derive from the CustomDetHash class and implement CustomDetHash.det_hash_object().

Return type:

str