Why is the library named Tango?
The motivation behind this library is that we can make research easier by composing it into well-defined steps. What happens when you choreograph a number of steps together? Well, you get a dance. And since our team’s leader is part of a tango band, “AI2 Tango” was an obvious choice!
How can I debug my steps through the Tango CLI?
You can run the tango command through pdb. For example:
python -m pdb -m tango run config.jsonnet
How is Tango different from Metaflow, Airflow, or redun?
We’ve found that existing DAG execution engines like these are great for production workflows but less well suited to messy, collaborative research projects where the code changes constantly. AI2 Tango was built specifically for these kinds of research projects.
How does Tango’s caching mechanism work?
AI2 Tango caches the results of steps based on the unique_id of the step. The unique_id is essentially a hash of all of the inputs to the step along with:
the step class’s fully qualified name, and
the step class’s VERSION class variable (an arbitrary string).
Unlike some other workflow engines, such as redun, Tango does not take into account the source code of the class itself (other than its fully qualified name), because we’ve found that hashing the source code bytes is far too sensitive and less transparent for users.
When you change the source code of your step in a meaningful way, you can just manually change the VERSION class variable to indicate to Tango that the step has been updated.
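To make the caching rule concrete, here is a minimal, standard-library-only sketch of how an identifier could be derived from a step's inputs, its class's fully qualified name, and its VERSION string. This is not Tango's actual implementation: the class SketchStep, the function cached_result, the in-memory CACHE dictionary, the JSON serialization, and the choice of SHA-256 are all assumptions made purely for illustration.

```python
import hashlib
import json


class SketchStep:
    """Hypothetical stand-in for a step base class; not Tango's real Step."""

    VERSION = None  # arbitrary string, bumped by hand when the logic changes

    def __init__(self, **inputs):
        self.inputs = inputs

    @property
    def unique_id(self):
        cls = type(self)
        payload = {
            "class": f"{cls.__module__}.{cls.__qualname__}",  # fully qualified name
            "version": cls.VERSION,
            "inputs": self.inputs,
        }
        # sort_keys makes the serialization, and therefore the hash, deterministic.
        blob = json.dumps(payload, sort_keys=True).encode("utf-8")
        return f"{cls.__name__}-{hashlib.sha256(blob).hexdigest()[:16]}"

    def run(self, **inputs):
        raise NotImplementedError


CACHE = {}  # maps unique_id -> cached result (an assumption; a real cache would persist)


def cached_result(step):
    """Run the step only if no result is stored under its unique_id."""
    if step.unique_id not in CACHE:
        CACHE[step.unique_id] = step.run(**step.inputs)
    return CACHE[step.unique_id]


class AddNumbers(SketchStep):
    VERSION = "001"
    run_count = 0  # counts real executions, to make cache hits visible

    def run(self, a, b):
        AddNumbers.run_count += 1
        return a + b


step = AddNumbers(a=1, b=2)
cached_result(step)                  # executes run()
cached_result(AddNumbers(a=1, b=2))  # same unique_id, served from the cache
old_id = step.unique_id

AddNumbers.VERSION = "002"           # signal that the step has been updated
new_id = AddNumbers(a=1, b=2).unique_id  # no longer matches old_id
```

Because the source code is not part of the hash in this sketch, editing the body of run() without touching VERSION leaves the unique_id unchanged and the cached result is reused. Only a change to the inputs, the class name, or VERSION produces a new identifier and forces the step to run again.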