Hi everyone. I’ve been speaking some with @sj, and he encouraged me to write up some thoughts I had. Two caveats first: 1) Given the current barrier to entry with Underlay (one of the things I would like to discuss), I can mostly only speculate as to what may be useful to the project. 2) I’m in a professional transition but for the time being in a full-time consultancy, so my engagement with Underlay is less than I would want it to be. With that said, I humbly offer my ideas for your consideration. Here we go.
Let’s start with H4Q, my project for the Hack for Sweden 2015 Open Data hackathon (references in Swedish). The name is a portmanteau of “hack 4 sweden”, my employer HiQ and, obviously, “hack”. As a 24-hour project it didn’t amount to much, but I implemented my best idea of what the Open Data community actually needed (in contrast to the other projects, which all made great use of the open data provided). I took that route because the number of agencies contributing data was an order of magnitude higher than the number of participating teams. Most teams also clustered on a small fraction of the most accessible datasets, and there were indications that many datasets were hard to consume and extract any value from.
My project was thus simply to assemble a powerful data science platform, tutorials and examples of Open Data use cases, together with a sandbox enabling exploration and eventually contributions back to the community. Crucially, it would work in a distributed fashion with a minimal barrier to entry. I say simply because, thanks to existing open technologies, the gist of it is these few lines of a Dockerfile (example updated for Jupyter):
FROM jupyter/scipy-notebook
RUN git clone https://github.com/H4Q/therepo.git /opt/repo
RUN cd /opt/repo; git checkout -b sandbox origin/sandbox
If you’re not already familiar with Docker: in an Infrastructure as Code fashion, it provides a kind of “containerized”, extremely lightweight virtual machine disk image, which can either be fetched from public registries or be re-assembled predictably by each end user. Docker images are usually less than 1 GB in size (this one happens to be 1.2 GB), thanks to community optimization and to being assembled in a layered file system.
Thus the above script defines a runnable image of Ubuntu 20.04, the Jupyter Notebook Scientific Python Stack and the sandbox branch of the H4Q repository, which also defines the project. Having Docker installed and the H4Q Git repository cloned, running this system is (should be) a matter of executing these two commands:
docker build -t h4q .
docker run -d -p 443:8888 -e "PASSWORD=foobar" h4q
It is easy to see how Docker has become the bread-and-butter of deploying any kind of system more complex than a single application.
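For anyone who wants to try this today, here is a hedged variant of the Dockerfile above. It is a sketch, not the original H4Q setup. It assumes that the base image runs build steps as an unprivileged user that cannot write under /opt, and that the image exposes that user's ID in an NB_UID environment variable; neither has been checked against the image used in 2015. It also clones the sandbox branch in a single step instead of checking it out afterwards.

```dockerfile
FROM jupyter/scipy-notebook

# Assumption: the base image's default user cannot write to /opt,
# so become root for the clone.
USER root

# Clone only the sandbox branch (same repository URL as in the post),
# then hand the checkout over to the notebook user.
RUN git clone --branch sandbox --single-branch \
        https://github.com/H4Q/therepo.git /opt/repo \
    && chown -R "${NB_UID}" /opt/repo

# Assumption: NB_UID is defined by the base image. Drop privileges again
# so the notebook server does not run as root.
USER ${NB_UID}
```

The build and run commands stay the same. If the base image turns out to run as root already, the two USER lines are harmless but unnecessary.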
There were a bunch more things in there, such as (over-)utilizing Git branching to let you make the project your own and a peer review / idea enrichment workflow, but in the current context that would probably be better provided by an integration with PubPub. Also, this was before either the repo2docker or Binder projects came into the picture.
Had I developed the project further (I didn’t; it was a fun but short-lived experiment), the next item would surely have been to provide a great README, including contribution guidelines.
Much more recently, after I had been aware of it for some time, Spotify finally published an article on their Golden Paths concept. Besides seeming like an excellent platform for providing guidelines, introductions, documentation and collaborative development of shared solutions, crucially it combines these under a single concept. The documentation is the functioning solution, and the functioning solution is also the opinionated guideline for how to work in the domain. Note the “all-in-one” property it has in common with my hack above.
Finally, these thoughts are inspired by my experiences with Secure Scuttlebutt. Ironically, partly by how difficult it is to get up to speed with Scuttlebutt without previous familiarity with the NodeJS ecosystem, but in this context that is secondary. I heard how the project boldly applies “value-based architecture”, arguably leading to a drastically improved ability to deliver value aligned with said values and goals. Product management is in no way my expertise, but judging from Underlay development so far, you are investing heavily in systems and protocol innovation that seems necessary to enable the values you wish to deliver. Still, if only to direct engagement efforts, I am urgently curious about the values and goals of the project on a short-, mid- and long-term basis. Achieving an explicit understanding of these also ought to simplify priorities and decision-making on all levels.
Summing up, that is my current line of inquiry with regard to Underlay. Which people would you want to get engaged with the project, how can you anticipate their needs, and how do you best spend effort to drive engagement? If we can attempt answers to those questions, I have provided some ideas for how to structure efforts to engage those rare but hopefully still plentiful minds in furthering Underlay’s goals.
Do let me know what you think and feel about this topic. Also, if I were to phrase these ideas for Commonplace, any hints regarding style and format would be appreciated.