PROVIT - PROVenance Integration Tools¶
PROVIT is a lightweight, decentralized data provenance and documentation tool. It allows the user to track workflows and modifications of data files.
PROVIT works in a completely decentralized way: all information is stored in .prov files (as JSON-LD RDF graphs) alongside the corresponding data file in the file system. No additional database or server setup is needed.
A small subset of the W3C PROV-O vocabulary is implemented.
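Because the provenance lives in a plain JSON-LD file next to the data, it can be inspected without PROVIT itself. The following is a minimal sketch, assuming the sidecar file is named "<filename>.prov" (as described for `save()` in the Provenance Class section) and contains valid JSON. The helper names `prov_path_for` and `load_prov` are hypothetical and not part of PROVIT, and no assumption is made about the keys inside the graph.

```python
import json
from pathlib import Path


def prov_path_for(data_path):
    """Return the sidecar path, assuming the '<filename>.prov' naming scheme."""
    data_path = Path(data_path)
    return data_path.with_name(data_path.name + ".prov")


def load_prov(data_path):
    """Parse the sidecar file as JSON; return None if no sidecar exists."""
    prov_path = prov_path_for(data_path)
    if not prov_path.exists():
        return None
    with prov_path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

For RDF-aware processing, the parsed document could be handed to a JSON-LD library instead of being treated as a plain dictionary.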
PROVIT aims to provide an easy-to-use interface for users who have never worked with provenance tracking before. If you feel limited by PROVIT, you should have a look at more extensive implementations, e.g. prov.
Full documentation is available at provit.readthedocs.io.
Requirements¶
This software was tested on Linux with Python 3.5 and 3.6.
Installation¶
Installation via pip is recommended for end users. We strongly encourage end users to make use of a virtualenv.
pip¶
Create a virtual environment (optional) and install PROVIT into it with pip.
$ mkvirtualenv provit
$ pip install provit
git / Development¶
Clone the repository and create a virtualenv.
$ git clone https://github.com/diggr/provit
$ mkvirtualenv provit
Install it with pip in editable mode:
$ pip install -e ./provit
Usage¶
PROVIT provides a command line client which can be used to enrich any file based data with provenance information.
PROVIT also includes an (experimental) web-based interface (PROVIT Browser).
Command Line Client¶
Usage:
Open PROVIT Browser:
$ provit browser
Add a provenance event to a file:
$ provit add FILEPATH [OPTIONS]
Options:
-a AGENT, --agent AGENT | Provenance information: agent (multiple=True) |
--activity ACTIVITY | Provenance information: activity |
-d DESCRIPTION, --desc DESCRIPTION | Provenance information: Description of the data manipulation process |
-o ORIGIN, --origin ORIGIN | Provenance information: Data origin |
-s SOURCES, --sources SOURCES | Provenance information: Source files (multiple=True) |
--help | Show this message and exit. |
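When the command line client is driven from a script, the options above can be assembled programmatically. The sketch below is one possible way to do that, not part of PROVIT: the function names `build_add_command` and `run_add` are hypothetical, and it assumes that options marked multiple=True are passed by repeating the flag once per value.

```python
import subprocess


def build_add_command(filepath, agents=(), activity=None, description=None,
                      origin=None, sources=()):
    """Assemble the argument list for `provit add` from the documented options."""
    command = ["provit", "add", str(filepath)]
    # Assumption: multiple=True options are given once per value.
    for agent in agents:
        command += ["--agent", agent]
    if activity is not None:
        command += ["--activity", activity]
    if description is not None:
        command += ["--desc", description]
    if origin is not None:
        command += ["--origin", origin]
    for source in sources:
        command += ["--sources", source]
    return command


def run_add(filepath, **options):
    """Run `provit add`; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_add_command(filepath, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems when descriptions or file names contain spaces.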
Provenance Class¶
from provit import Provenance
# load existing provenance data for a file, or create new provenance for it
prov = Provenance("path/to/data_file")  # replace with the path to your data file
# add provenance metadata
prov.add(agents=[ "agent" ], activity="activity", description="...")
prov.add_primary_source("primary_source")
prov.add_sources([ "filepath1", "filepath2" ])
# return provenance as a JSON tree
prov_dict = prov.tree()
# save provenance metadata into "<filename>.prov" file
prov.save()
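In a data processing script, the calls above are typically repeated for each output file. The following sketch bundles them into one helper; it is an illustration rather than part of PROVIT. The name `record_step` is hypothetical, and the provenance class is passed in as a parameter (`prov_cls`) so the helper can be tried out without PROVIT installed. It only uses the constructor, `add`, `add_sources` and `save` as shown above.

```python
def record_step(prov_cls, output_path, agent, activity, description,
                source_paths=()):
    """Attach one provenance event to output_path and save it.

    prov_cls is expected to behave like provit.Provenance.
    """
    prov = prov_cls(output_path)
    prov.add(agents=[agent], activity=activity, description=description)
    if source_paths:
        # Link the input files this output was derived from.
        prov.add_sources(list(source_paths))
    prov.save()
    return prov
```

With PROVIT installed, the helper would be called with the real class, for example `record_step(Provenance, "clean.csv", "agent", "activity", "...", ["raw.csv"])`, where the file names are placeholders.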
Roadmap¶
General roadmap of the next steps in development:
- Tests
- Tutorials
- Windows support
- Agent management in PROVIT Browser
Overview¶
Authors: | P. Mühleder muehleder@ub.uni-leipzig.de, F. Rämisch raemisch@ub.uni-leipzig.de |
---|---|
License: | MIT |
Copyright: | 2018, Peter Mühleder and Universitätsbibliothek Leipzig |