Building Data Collaboratives

In recent years, a number of organizations have started to fund such collaboratives and have set up new PhD programs to train data scientists focused on building them. Berkeley, and BIDS in particular, has spearheaded the concept of ‘data collaboratives’ as networks of practice around a topical area of data.
Here’s an example from 2018:

What are examples of successful data collaboratives built from communities of practice?

Some efforts to train contributors to future collaboratives:

Some topical collaboratives:

  • A lightly curated overview courtesy of GovLab.
  • At MIT, the Climate CoLab, which collaboratively builds proposals (grounded in shared data and policy) to reach climate goals.

As we start defining registries, these are groups to consider. Many maintain repositories of datasets + annotations from a wide range of sources, which may make good registries.

As Zach noted recently, Schmidt’s Technology + Society team is a fine example, as it has a focus on data collaboratives: efforts to ‘harness the flood of data being generated by the private sector to create public value’, each supported by a coalition of agencies and organizations focused on a topic. Topics so far include the global refugee crisis, improving medical care, and increasing agricultural productivity in developing countries.

These collaboratives often have data maintenance + coordination + cross-referencing challenges well suited to the UL, and they like to fund the development of specific projects and tools.