Archivist is an R package for object management (storing, sharing, searching). I am going to present it at the useR! conference next week (I hope to meet some of you in Aalborg). Below you will find the two coolest (imho) features implemented in version 1.5 of archivist.
Archivist stores R objects (artifacts) in collections called repositories. A repository is either local (a directory) or remote (e.g. a GitHub repository). The function saveToRepo() saves an R object to a given repository. Repositories can then be shared, and other users can retrieve objects with the function aread().
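As a rough illustration of the save-and-retrieve cycle on a local repository, here is a minimal sketch. It makes a few assumptions that are not in the text above: the directory name is made up, createLocalRepo() is the name used by recent archivist releases (older releases, including 1.5, called it createEmptyRepo()), and loadFromLocalRepo() is used as the local counterpart of aread().

```r
library(archivist)

# Hypothetical repository location; a temporary directory keeps the sketch self-contained.
repo <- file.path(tempdir(), "demo_repo")

# Assumption: recent archivist versions. In version 1.5 use createEmptyRepo() instead.
createLocalRepo(repoDir = repo, default = TRUE)

# saveToRepo() stores the artifact and returns its md5 hash.
hash <- saveToRepo(iris, repoDir = repo)

# Read the artifact back, first by full hash, then by an abbreviated one.
restored <- loadFromLocalRepo(as.character(hash), repoDir = repo, value = TRUE)
short    <- substr(as.character(hash), 1, 8)
restored_short <- loadFromLocalRepo(short, repoDir = repo, value = TRUE)
```

With value = TRUE the object itself is returned, so it can be assigned to a name of your choice.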
The only argument of aread() is a hook: a string that combines a link to the repository with the object’s md5 hash. For example, you can use two lines of code to load the ggplot2 object presented below.
The result of aread() is an R object. One can modify it (change colors, title) or extract data from it. You can add such hooks to every plot, table or dataset in a report, blog post or application, so that readers can retrieve them easily. The hash is 32 characters long, but it is enough to specify only its first few characters.
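Why a few leading characters are enough can be shown with a toy lookup in base R. This is only an illustration of unique-prefix matching, not archivist's own code: resolve_hash() is a hypothetical helper, and how archivist itself treats an ambiguous abbreviation is not covered here. The hashes are the ones printed by ahistory() later in this post.

```r
# Toy illustration only, NOT archivist's implementation.
# resolve_hash() is a hypothetical helper: it expands an abbreviated hash
# to the single full hash that starts with it.
resolve_hash <- function(prefix, known_hashes) {
  hits <- known_hashes[startsWith(known_hashes, prefix)]
  if (length(hits) == 0) stop("no artifact matches prefix '", prefix, "'")
  if (length(hits) > 1) stop("prefix '", prefix, "' is ambiguous, add more characters")
  hits
}

# Hashes copied from the ahistory() output in this post.
hashes <- c("ff575c261c949d073b2895b05d1097c3",
            "d3696e13d15223c7d0bbccb33cc20a11",
            "990861c7c27812ee959f10e5f76fe2c3",
            "050e41ec3bc40b3004bc6bdd356acae7")

full <- resolve_hash("ff575c", hashes)
```

In a small repository even two or three characters are usually unique; longer prefixes only become necessary as the number of stored artifacts grows.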
In archivist you will find the operator %a%, which behaves like %>% from magrittr but additionally stores all partial results in a default repository. With ahistory() one can then retrieve the object’s history, with hooks to all partial results.
Let’s consider the following example.
library(archivist)
library(dplyr)
# every %a% step is also saved to the default repository
tmp = iris %a%
  filter(Sepal.Length < 6) %a%
  lm(Petal.Length~Species, data=.) %a%
  summary()
Having the object or its md5 hash, one can retrieve the object’s history.
ahistory(tmp)
iris                                   [ff575c261c949d073b2895b05d1097c3]
filter(Sepal.Length < 6)               [d3696e13d15223c7d0bbccb33cc20a11]
lm(Petal.Length ~ Species, data = .)   [990861c7c27812ee959f10e5f76fe2c3]
summary()                              [050e41ec3bc40b3004bc6bdd356acae7]
The goal of archivist is to enrich reports, applications, articles and blog posts with hooks to recoverable results.