Collection DOIs (and other upcoming DOI changes)

I’m getting close to having Crossref accept DOI deposits for our three initial collection types (books, journal issues, conferences), and I’m now working on the user-facing part. I want to figure out what we need to support before I get too deep into building it, so here’s a proposed user flow for getting a DOI for a collection (this would be launched from the Metadata tab of the /dashboard/collection page).


I’m thinking this logic would be encapsulated into a small textfield/button combination widget (which itself has hooks to launch confirmation and eventually Crossref login dialogs). Ideally once this works for collections I’d like to port it over to the pub-level DOI pane.

After thinking about this for a bit, I’m not sure whether we want to support the ability to just put a completely custom DOI into the metadata field. I can see why we’d want to do this, in case the PubPub version of a collection is mirroring some other canonical version. But in the absence of supporting metadata about the collection, it’s not clear whether we can actually use this DOI in the Crossref deposit for the collection’s constituent pubs. We might have to pull down that metadata from Crossref, which sounds like a completely separate can of worms.

So what does everyone think — am I missing anything or am I good to build this?

This is awesome - thanks so much for putting the diagram together.

I have a question about the “update all pubs whose 'default collection' is this one” step. Specifically, does this step create new DOIs for pubs that already have them? I could imagine the following being a desired process: I create a collection > create a pub > add the pub to the collection > assign a DOI to the pub, which will then have a DOI of 10.xx/{communityIDSlice}.{pubIDSlice}. I then add a collection DOI and reach the step in question, which would change the pub DOI to 10.xx/{communityIDSlice}.{collectionIDSlice}.{pubIDSlice}.

What is the best practice here around the permanence of the DOI? Of course, the original DOI has to always resolve to the pub - but it also feels like there should only ever be one DOI for a pub. Or, are we saying that by being ‘in’ a new collection - it is now a new thing, and deserves a new DOI?

Phone Ian in dead of night

This is a simple operations bug to resolve: “DOI generation is available Monday - Wednesday from 10am-6pm PST, Thursday-Friday from 11am-4pm EST, and Saturdays from 1pm-1:15pm CST”…

I once used a government visa site that only allowed form submissions during office hours. It was a nightmare.

What is the best practice here around the permanence of the DOI? Of course, the original DOI has to always resolve to the pub - but it also feels like there should only ever be one DOI for a pub.

I am inclined to agree and I think that to get around this, we should drop collection IDs from pub-level DOIs. A pub’s ID and its parent community’s ID are both permanent, so if we use this scheme: 10.21428/{communityId}.{pubId | collectionId}, the pub DOI will never have to change. However, we can still update the Crossref deposit for the pub’s DOI with a reference to the collection DOI. That is what I had in mind when I wrote this up — though not quite what we talked about on Slack, so I’m glad to be documenting it here.
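To make the scheme concrete, here is a minimal sketch of how such a DOI could be assembled. The function names and the eight-character slice length are hypothetical and only for illustration; the thread mentions ID "slices" but does not say how they are computed.

```python
DOI_PREFIX = "10.21428"  # prefix quoted in the scheme above


def id_slice(entity_id: str, length: int = 8) -> str:
    """Hypothetical slice: first `length` characters of the ID, dashes removed."""
    return entity_id.replace("-", "")[:length]


def make_doi(community_id: str, entity_id: str) -> str:
    """Build 10.21428/{communityId}.{pubId | collectionId}.

    `entity_id` is either a pub ID or a collection ID. No collection ID is
    mixed into a pub's DOI, so moving the pub between collections cannot
    change the result.
    """
    return f"{DOI_PREFIX}/{id_slice(community_id)}.{id_slice(entity_id)}"


if __name__ == "__main__":
    community = "aaaaaaaa-1111-2222-3333-444444444444"
    pub = "bbbbbbbb-1111-2222-3333-444444444444"
    print(make_doi(community, pub))
```

Because the pub DOI depends only on two permanent IDs, the relationship to a collection would have to live in the deposit metadata rather than in the DOI string itself.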

Or, are we saying that by being ‘in’ a new collection - it is now a new thing, and deserves a new DOI?

Based on my own cartoonish understanding of these systems and insight from MITP folks, I think it might occasionally be helpful to issue a new DOI for what is essentially the same piece of content, to support things like overlay journals or retrospective issues. But these re-publishings often contain corrections and new commentary, so it makes about as much sense to give them another DOI; maybe a clever future version of PubPub would understand this and be able to link the two DOIs together in the metadata.

Quoting from my own post on the KFG slack:

Hm, okay, I’m beginning to suspect that maybe no auto-updating is required after all. My mental model of Crossref deposits has been that they are the canonical representation of an entity on Crossref, and Crossref is just a table that maps DOIs to deposits. But it is more accurate to think of deposits as being the input that is used to update Crossref’s own internal representation, which seems to have a first-class understanding of collection <-> publication relationships and is thus considerably richer than a simple table. To be less vague about this, what I mean is that if we send an update for a collection, it is not necessary to send updates for each pub in that collection because at rest, the collection-level data in a pub deposit becomes part of the collection model in Crossref, and doesn’t really “live” on the pub-level one. Basically, this is good news, because we don’t need to build a whole system to trigger changes in many pub-level deposits based on changes in collection-level ones.

For the purposes of this thread what this means is that we don’t have to update pub-level metadata in response to collection-level changes at all, except perhaps when a collection is deleted.
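As a rough sketch of what that rule implies for our side, the function below decides which deposits to re-send for a given collection event. The `Collection` shape, the event names, and the choice to re-deposit pubs on deletion are all assumptions for illustration; the thread only says deletion is "perhaps" the one exception.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Collection:
    # Hypothetical minimal shape; the real model has more fields.
    id: str
    pub_ids: List[str] = field(default_factory=list)


def deposits_to_send(collection: Collection, event: str) -> List[Tuple[str, str]]:
    """Return (kind, id) pairs for the deposits to re-send after `event`."""
    if event == "collection_updated":
        # A collection-level change only needs the collection deposit;
        # pub-level deposits are left alone.
        return [("collection", collection.id)]
    if event == "collection_deleted":
        # Assumed exception: pubs may need to drop their reference
        # to the deleted collection.
        return [("pub", pub_id) for pub_id in collection.pub_ids]
    raise ValueError(f"unknown event: {event}")
```

The point of the sketch is that the fan-out to every pub happens in at most one branch, rather than on every collection edit.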

Small update: I’m reworking the pub options Collections pane to be a little more full-featured (currently it’s just a bunch of Blueprint Tag items) and I’m kicking around different names for what we’re internally calling the “primary collection” for a pub — the one that we generate citations and DOI deposits against. One I’m considering is “citation home”; it doesn’t make any more or less sense than anything else we’ve come up with so far, but I like the possibility of putting the word “citation” in there. Thoughts?


Sorry I’m jumping in late. Not sure how I missed this. This diagram is awesome and I think you’re on the right track. A few things:

I think we can launch without the ability to do this. But I think eventually we’ll need to, especially at the collection level. Three reasons:

  1. Porting content from other systems where a DOI is already assigned.
  2. For some people, the DOI is part of branding, and they really care about it.
  3. The edge-case you mentioned. I think we may want to pull back and research this a bit more. I’m increasingly wondering if this is needed at the collection level, or if it would suffice to deposit a Pub as a component of another object with a DOI. In that case, I think all we have to do is reference that other DOI – and it feels like a different scope from what we’re doing right now that requires more thought.

I’m not sure that captures enough of what you’re doing when you select that option. It’s more than just the citation - it’s also the way we recommend articles and lay out collections in the header, and it could influence design.

I think I still lean towards ‘Primary Collection’ specifically because it’s kind of vague (and it gives us space to do all the things that are done when something has a primary collection). Perhaps the row marked as ‘Primary’ can have a tooltip next to that label elaborating on what it means to be the ‘Primary collection’.