Collection management

I got started on thinking about Tags as collections, which we’re going to need soon to onboard HDSR and other clients. Here’s my initial PRD. Annotations welcome there, and feel free to add discussion/thoughts here as well.

(Also, process-wise, feels a bit funky to have drafted in PubPub but be posting here? But I think it was a much better interface to draft a research doc in than this. Maybe I should C&P here in the future? LMK what you think).

Awesome - this is great.

Regarding process, I think PubPub is the right place for that. The model I’ve had in my head is that Discourse is for things that look like conversations (you want replies, not edits) - and PubPub is for converging on a single document (you want edits, not replies). Annotations on PubPub make it a bit blurrier - but it still feels like there’s a separation between discussions that lead to edits (e.g. HDSR said that they want the ability to show 3 related articles at the bottom) and conversations that guide the process (e.g. we may want to push this feature to April so it can be designed after reviews).

Especially for historical purposes, I think it’s useful to always have a Discourse discussion that can catch process conversation around single, iterative docs. So, posting a Topic here with a short intro and a link to Figma, PubPub, Google Docs, etc seems good.

Discussed today:

  • Tags have “types” that roughly map to crossref types. Default is “unstructured” or something like that.
  • No nesting of tags.
  • Types come with some preset common metadata and any crossref requirements – ie author for book, editor for journal, etc.
  • You create tags with types (and associated objects) rather than creating tags and then editing them.
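To make the points above a bit more concrete, here is a minimal TypeScript sketch of typed tags with preset metadata that is checked at creation time. Everything in it is an assumption for illustration: the kind names, the `crossrefType` strings, and the field lists are hypothetical and are not taken from the spec or from Crossref's actual requirements.

```typescript
// Hypothetical sketch: kinds, Crossref type names, and fields are illustrative only.
type CollectionKind = "tag" | "book" | "issue";

interface MetadataField {
  name: string;
  required: boolean; // stands in for "Crossref requires this for the type"
}

interface CollectionSchema {
  crossrefType: string | null; // null for the unstructured default
  fields: MetadataField[];
}

const collectionSchemas: Record<CollectionKind, CollectionSchema> = {
  tag: { crossrefType: null, fields: [] },
  book: {
    crossrefType: "book",
    fields: [
      { name: "author", required: true },
      { name: "edition", required: false },
    ],
  },
  issue: {
    crossrefType: "journal_issue",
    fields: [
      { name: "editor", required: true },
      { name: "volume", required: false },
    ],
  },
};

// Names of required fields that are absent or empty in the supplied metadata.
function missingRequiredFields(
  kind: CollectionKind,
  metadata: Record<string, string>
): string[] {
  return collectionSchemas[kind].fields
    .filter((field) => field.required && !metadata[field.name])
    .map((field) => field.name);
}

interface Collection {
  title: string;
  kind: CollectionKind;
  metadata: Record<string, string>;
}

// The type and its metadata are supplied together at creation time,
// rather than creating a bare tag and editing it afterwards.
function createCollection(
  title: string,
  kind: CollectionKind,
  metadata: Record<string, string>
): Collection {
  const missing = missingRequiredFields(kind, metadata);
  if (missing.length > 0) {
    throw new Error(`Missing required metadata for ${kind}: ${missing.join(", ")}`);
  }
  return { title, kind, metadata };
}
```

The unstructured default has no required fields, so creating one stays as lightweight as it is today.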

Next steps

  • @gabestein to draft spec from crossref of needed types

@deepak.jagdish @ian:

I’ve updated the spec to include a very basic listing of tag types and metadata at the bottom. I’m still trying to figure out via a combination of crossref docs and user interviews what’s actually required in terms of metadata. In general, I think as long as we get mappings decently right, we can work on this feature a bit independently of actually depositing to crossref, then update that later.

The rest of the features we talked about are a bit more vaguely spelled out in the “What is it?” section.

Lmk about comments & qs – happy to (and maybe we just should) jump on a call before getting too deep into this?

Thanks for the documentation/update, @gabestein. I noodled on this on the dev side just enough today to get a feel for the open questions, some of which are:

  • discoverability: how important is it that a new community admin be able to figure out how to add “book”/“journal” structure without accidentally discovering this functionality under “tags”? Should there be some sort of UI onramp from the pub settings home, given the (perceived? apparent?) centrality of this feature to PubPub’s value prop?
  • create/edit distinction: how will the act of creating a collection-tag differ from the experience of editing the metadata of an existing tag? (I have my own opinion here — I’d prefer the interfaces match as closely as possible)
  • ordering: how do we surface the ability to order pubs within a collection-tag?
  • representation: [mostly a dev question] does tag metadata itself have structure in the database, or can it exist as an unstructured JSON blob attached to the tag? (I’d prefer the latter, at least for an MVP – it’s easier, more flexible, and I assume we don’t need to query/index on the metadata)
  • naming: I’m beginning to think that instead of {Book, Issue, …} being a type of tag, the top-level name for this should be a Collection, of which {Tag, Book, Issue, …} are all subtypes. I think this will be easier to grok, since people don’t have to extend their mental model of what a “tag” is to get what we’re doing.

Based on this, I have a few pretty weakly held opinions on how this feature should look:

  • We should rename tags to “collections” – or something that generalizes over {Tag, Book, Issue…} – rather than overloading the existing definition of tag.
  • There should be some kind of hint from the homepage of the community dashboard (or even a call-to-action in the pub header, like “New Collection” next to “New Pub”) to promote discoverability.
  • The Collections (née Tags) section of the community dashboard should have some pretty big CTAs to create {Book, Issue, …More} near the top, and I think it would be nice to contain this functionality in a wizard-like modal dialog. Editing the metadata of an existing tag would essentially pop up the same modal.

Awesome, thanks for spelling this out. Generally I think you’re going in the right direction, a few thoughts to add (and places to get @deepak.jagdish’s feedback).

Yeah, we’re bad at this right now as a whole and I like the direction you’re going. But I’d actually like to wait to address this issue a bit more holistically across the platform. In particular, as we discussed yesterday, we have plans in the next few months to think about templates, also to re-build the dashboard and better define community roles, etc. All this to say – I think we can wait on tackling discoverability. As a stop-gap, this’ll kick me to finally write decent guide documentation.

Only consideration here from my end is that at some point we’re probably going to want to do some validation against crossref specs and there are cases where we’re going to want to submit to crossref, get some data back from them, and update what we have.

I like this a lot. Tag as a subtype of collection alongside book etc (we may still want a generic that implies order?) makes way more sense than what we discussed yesterday.

In general, I agree the two interfaces should be the same. But I’m not sold on a wizard as the interface. I’d love to get @deepak.jagdish’s thoughts here, and I’m also checking with a few users of metadata systems. I’m hesitant about wizards because my hunch is that while creating collections won’t happen very often, doing stuff like adding pubs to them will, and that will often require top-level metadata creation. So if the editing interface imposes too much step-by-step wizard-like stuff, it could become cumbersome to edit.

Also worth thinking about adding Pubs to collections as part of the action here.

I think it would be good to get on a call about this tomorrow. I’m working on some strawman UI for how this might fit into the “tags” (“collections”?) pane on the dashboard — we could maybe take a peek at that.

One lingering question I have is about the importance of the “Link to page” functionality of tags — is that something we want all collection types to have? It seems like a somewhat niche feature, which makes me hopeful that we can hide it a little deeper in the information hierarchy to make room for more critical actions, but I’m not really sure what its use case is.

I’m pretty free tomorrow - ping whenever you’re free.

The link function is widely used by editors and, I suspect, readers (I’ll have to look at metrics to confirm), and my sense is that it’s pretty important. It essentially functions as a breadcrumb, which we get asked for surprisingly consistently - ie, many communities who create an “issue 1” collection/tag also create an “issue 1” page to display the collection, and that feature is how we do the tag/page association and allow readers to click from an article in issue 1 back to the issue 1 page.

That doesn’t mean it has to be universal or particularly prominent in the ui (I can see it not being an option for the tag type, for example), and we could certainly explore a name change to better suit the new taxonomy. We might also want to consider creating that association (and potentially the page itself) when editors create a new collection for convenience. Years of Wordpress experience have made me strongly against requiring that association and/or making a “default” collection page. A huge percentage of the time you end up needing to write exceptions/custom layouts/etc for each collection, and I think it’s better for editors to be deliberate about creating a page for a collection than to have a default.

Happy to discuss more tomorrow.

In the interest of having a reference point, here’s a workflow that I mocked up locally.

  1. We start on the Tags pane as usual, but it’s been renamed Collections and there’s a dropdown to the left of the creation textbox that lets you choose a different collection type to create.

  2. We switch over to Books and notice this list is empty, so we start typing to create a new book.

  3. The book appears in the list with a new “Edit metadata” button.

  4. By clicking “Edit metadata” we are taken into a modal dialog to, well, do just that. (TODO: auto-fill this information from community metadata)

So some of this is kind of silly (the copy, obviously) and I consider most of this to be throwaway except the underlying architecture changes, which I hope to keep. In particular this points to a new set of questions:

  • How can we surface the metadata fields at collection creation time, ideally without overwhelming the user?
  • How can we present a growing number of per-collection options on each row without either getting visually busy or hiding everything behind too many clicks? In addition to “edit metadata”, I imagine that at some point we’ll want another action available to manage collection contents, so the “add more buttons” approach doesn’t scale too much farther.
  • What’s a good workflow to move collections between types? Do we need to do this?

Hopefully this can be a starting point for fruitful discussion tomorrow.

Hey @ian, the prototype looks great! Clear and straightforward.

It took me some time to wrap my head around the hierarchy of pubs, communities, and their associated tags (or Collections). Here are some of my thoughts sketched out:


  • Nomenclature: I agree with you and @gabestein that we need to call it something different from just “tag”. At the very least, we should use a prefix with the term tag, such as Book Tag, or Journal Tag, or Conference Tag. This way, the intended use is explicit, and we don’t have to introduce yet another term to the already large-ish set of terms a new user needs to get used to when exploring the PubPub universe. Alternatively, we can use the term Collection, or Set, or Compilation, etc. (as shown in the list in the image). If we have to pick a term separate from tag, then Collection is something I can be comfortable with, especially since other digital media tools such as visual bookmarking tools, document editors, etc. tend to use that term to refer to a larger group of things.

  • User Flow: The screenshots you’ve shared indicate that the user arrives at the view for customizing/adding collections from the Community’s settings/manage panel. In the image I’ve shared this would be user flow path P1. I’m curious if the user takes path P2 (ie. wanting to add/customize collection info from the Pub view), then do we take the user to a different view? Or should the view look similar to what you have prototyped already? Maybe the answer to this is linked to the question @trich raised in the slack channel yesterday about whether to surface some of these Settings details into a separate flat page rather than sticking them into an overlay. But then again, his reference was only to the Pub itself and not the Community.

I’ve assumed that both P1 and P2 should ideally lead the user to the same view, but perhaps the list of actions we make available in the destination panel, or their associated copy, can be changed a bit, depending on where the user is coming from (Community, or Pub).

Assuming the user is attempting to add a new Collection from a Pub, or trying to attach an existing collection name to a Pub, then the flow could look like this:


In the case of path P3, the user wants to create a new Collection - in this case a Book, and so we render a view that permits her to enter the associated metadata.

P4 shows the path where an existing Collection name has been identified, and the user wants to edit some of the details in it. User could arrive at the destination view of P4 from a Community as well I suppose.

  • Colors & Iconography: I think it is important to visually distinguish a regular tag (the yellowish green) from more advanced Collections. We can do this with some differing colors and icons.

I haven’t populated the panel views with all options (such as Ordering, Page link, etc.) but they can be added as necessary.

Happy to get on a call today to discuss this further. Now that we have a couple of visual vantage points, it’ll be easier to make decisions and clarifications.

Thank you for all of this, @deepak.jagdish! Gabe and I got on a call just now to discuss all of this further. Let me try to summarize what we’re thinking about.

  • We’re committing to the name “collections”, which encompasses tags, books, journal articles, etc. I’ll be making changes to the codebase as I work to reflect this.
  • I’m interested in continuing to explore the P1 (management from community side) flow for the next few days, but I’m hopeful that managing this information from pub/community side can be relatively similar and make use of the same pieces of UI (for consistency’s sake, and 'cuz I’m lazy) as Deepak suggested.
  • DOIs are a special piece of metadata because we want to conditionally sync them with Crossref.
  • We want a P1-view that lets community admins add and remove pubs to a collection, and it would be helpful if that view could also allow them to add contextually-specific bits of information to the pub-collection relationship, e.g. “this pub serves as the supplementary materials for this book”.
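As a rough illustration of the last point, here is a minimal TypeScript sketch of a pub-collection relationship that carries its own contextual information. The `CollectionPub` shape and the `contextHint` and `rank` fields are hypothetical names made up for this example, not anything that exists in the codebase.

```typescript
// Hypothetical join record between a pub and a collection.
interface CollectionPub {
  collectionId: string;
  pubId: string;
  rank: number; // position of the pub within the collection
  contextHint: string | null; // e.g. "supplementary materials"
}

// Returns a new list with the pub appended to the collection.
// Adding a pub that is already in the collection is a no-op.
function addPubToCollection(
  links: CollectionPub[],
  collectionId: string,
  pubId: string,
  contextHint: string | null = null
): CollectionPub[] {
  const alreadyLinked = links.some(
    (link) => link.collectionId === collectionId && link.pubId === pubId
  );
  if (alreadyLinked) {
    return links;
  }
  const rank = links.filter((link) => link.collectionId === collectionId).length;
  return links.concat([{ collectionId, pubId, rank, contextHint }]);
}

// Returns a new list without the given pub-collection link.
// Ranks of the remaining links are left untouched in this sketch.
function removePubFromCollection(
  links: CollectionPub[],
  collectionId: string,
  pubId: string
): CollectionPub[] {
  return links.filter(
    (link) => !(link.collectionId === collectionId && link.pubId === pubId)
  );
}
```

Keeping the hint on the relationship rather than on the pub means the same pub could play different roles in different collections.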

Because we’re getting on a call tomorrow with Kelly and Rachel from the Press, I’m going to try to mock up the sync-DOI and manage-pubs-in-collection views so we have a toy version of each to show to them.

Geeky side note: I’m really happy today that we’re using the same programming language to write our frontend and backend, because I think there’s potential to write isomorphic code for schema/metadata munging that can be run in both places.

Also, I should have said as much up front, but I like your treatment of layout and information hierarchy, Deepak! I think those mocks will be super helpful to me once I understand how all the high-level pieces fit together and can focus a bit more on visual design — until then, things will continue to look…fugly.

Messing around with collections editor…

Super excited about the direction this is all going! Great work everyone!

I really like the decision to name these ‘Collections’ - with a ‘tag’ being a type of collection. That feels right.

I am cautious about having each type of collection seem like its own first-order element though. I think there’s a real risk of confusion with too many nouns that seem like they might be different things. I’d prefer it be clear that everything is a ‘Collection’. In pixels, I think this means 1) showing all of the collections at once (rather than having the type selection filter which are displayed), and 2) having the type selection come to the right of the title - making it clear that you are making a new Collection with title ‘Issue 1’ and giving that collection a type of Issue (rather than feeling like you are making a new Issue). See the (hastily constructed) mockup example below. Apologies if this feels like a subtle point - but in the past even having ‘Pages’ and ‘Collections’ be the same thing just with different types of metadata caused a lot of confusion because they were presented as first-order elements.


This might be too early for the level of planning at the moment, but I tend to prefer inline, scrollable content over a modal or wizard. Modals feel a bit heavier to me and are more cumbersome on mobile (you either sacrifice some precious horizontal space to provide a ‘click-out’ area or you have to add ‘close’/‘cancel’ buttons since you can’t hit ‘esc’. I also find that I want to use the back button to close modals on mobile, which then introduces url hackiness to solve). I tend to prefer expanding sections over modals. Happy to defer to others/Deepak on this, but maybe something like cards that get focus and expand could be useful here (example mockup below - again, excuse the haste).


I’m realizing as I write this, one assumption I have is that scroll is king. People know how to scroll and it is the easiest thing to do on mobile, desktop, and laptop. As long as it is well laid out and parsable, I prefer a long, scrollable block over a step-by-step modal.

For ordering, I agree with a drag-and-drop UI. We do this for ordering the Pub Block layout, but we could do a better job with the design of it.
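For reference, the data side of drag-and-drop ordering is small. Drag-and-drop libraries typically report a source index and a destination index when a drag ends, and the list is then reordered with a helper like the sketch below. The function name is hypothetical and this is not the existing Pub Block code.

```typescript
// Moves the item at fromIndex to toIndex and returns a new array.
// Out-of-range indices leave the order unchanged (a copy is still returned).
function reorder<T>(items: readonly T[], fromIndex: number, toIndex: number): T[] {
  const result = items.slice();
  const inRange = (index: number): boolean => index >= 0 && index < result.length;
  if (!inRange(fromIndex) || !inRange(toIndex)) {
    return result;
  }
  const moved = result.splice(fromIndex, 1);
  result.splice(toIndex, 0, ...moved);
  return result;
}
```

Because the helper never mutates its input, it can be used directly when updating component state after a drag ends.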


I’m fine with a JSON blob in the database for some of the more in-development metadata pieces, but ones that feel pretty solid (e.g. ‘type’, ‘linkedPageId’, ‘isPublic’) I think deserve a column.
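One way to picture that split, as a minimal TypeScript sketch: the stable fields get their own columns and everything else falls into a free-form blob. The three column names come from the examples above; the `splitCollectionFields` helper and the sample keys in the usage note are hypothetical.

```typescript
// Fields that feel solid enough to deserve their own database column.
const COLUMN_KEYS: readonly string[] = ["type", "linkedPageId", "isPublic"];

interface SplitFields {
  columns: Record<string, unknown>; // stored as real columns
  metadata: Record<string, unknown>; // stored as one JSON blob
}

// Separates incoming collection fields into column values and blob values.
function splitCollectionFields(fields: Record<string, unknown>): SplitFields {
  const columns: Record<string, unknown> = {};
  const metadata: Record<string, unknown> = {};
  for (const key of Object.keys(fields)) {
    if (COLUMN_KEYS.indexOf(key) >= 0) {
      columns[key] = fields[key];
    } else {
      metadata[key] = fields[key];
    }
  }
  return { columns, metadata };
}
```

For example, `splitCollectionFields({ type: "book", isPublic: true, isbn: "made-up" })` would put `type` and `isPublic` in `columns` and the made-up `isbn` key in `metadata`. Promoting a blob field to a column later only means adding its name to `COLUMN_KEYS` and migrating the data.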

Again - awesome work all!

Thanks for your thoughts, Travis!

I am cautious about having each type of collection seem like its own first-order element though. I think there’s a real risk of confusion with too many nouns that seem like they might be different things.

I agree that we should seek to minimize noun fatigue, and cultivate the user perspective that everything on this pane is really just a collection with added bits, not n new first-order things. I’m on board with trying to find a way to show them all at once, but I think their use cases are different enough that in practice people may want to see just tags, or just books, or at least sort a table by collection type. The tags-vs-everything-else difference is salient here, since IMO editing tags is likely to be an order-of-magnitude more frequent and lightweight interaction — maybe (ugh) Tags deserve their own dashboard pane, with all other collections displayed in another pane?

This might be too early for the level of planning at the moment, but I tend to prefer inline, scrollable content over a modal or wizard

I’m not excited about modals, exactly, but I think they might be appropriate here at least for the content editor, if not the metadata editor — it’s a hefty enough piece of UI, with its own scrolling panes, that I don’t think we want to place it within another scrolling pane. It feels like it should not be possible for the user to scroll away from this editor, because it’s annoying to accidentally do that (often when overscrolling sub-panes) and have to find your way back. I like your mockup of the inline metadata fields, and I think exploring that could lead to a better UI than I have now. At any rate, we can easily add or remove these things from modals, so no rush to make a decision now and IMO it’s no biggie either way. We should discuss this synchronously at some point – I’m especially interested in having a conversation about mobile support more generally.

We do [drag and drop] for ordering the Pub Block layout, but we could do a better job with the design of it.

I wish I had seen (or bothered to look for) this before rolling my own! :laughing: I think we’re using two different drag-n-drop libraries, though I’d advocate for switching to react-beautiful-dnd for all stuff like this from now on.

Anyway, I just came to drop this off — more refinements of the collections editor:

Our conversation with the MITP folks today gave us food for thought on how to approach the crossref/DOI/citations aspect of collections. In particular we need to decide:

  • When does crossref submission happen? Is it a state (think checkbox) or an action (think button)?
  • What implication does submitting a collection to crossref have for its constituent pubs? How do we communicate that to the user?

Kelly and Rachel didn’t have tons of pointed feedback on these questions (it seems like the need to explicitly deposit journal issue data as an object into Crossref isn’t really part of their workflow), but I think we got some good signal that the interface we’re building is in some sense sufficient and doesn’t have any huge blind spots. I’m going to punt on all of that for a little bit and focus on the implementation details of Collections on the backend while we decide what our angle there is.

I’ve added quite a bit to the spec based on discussion here and our conversations yesterday. See updates:

The gist is: I think from a metadata perspective collections and pubs actually live fairly independently of each other. You can deposit a pub alone. You can deposit collection metadata alone. You can deposit a pub that includes info about a collection without actually depositing the collection (that’s actually how PubPub and MITP do everything today). The deposit action that “matters” is the one we support today: depositing the pub for the first time. If it’s in a collection, it includes metadata from that collection, even if that collection has never been deposited.

There’s a fair amount of imputing that happens when you add a collection (collections may include community-level metadata in deposit/citation) and when you add a pub to a collection (pubs may include collection-level metadata in citation and deposit). This isn’t great from a dev or user perspective, but unfortunately it’s how crossref works and how people use crossref.
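To illustrate the imputing described above, here is a minimal TypeScript sketch in which a pub's deposit metadata is built by layering community, collection, and pub values. The precedence rule (the more specific level wins, and empty strings count as unset) is my assumption for the example, and `imputeMetadata` is a hypothetical name.

```typescript
type Metadata = Record<string, string>;

// Builds the metadata used for a pub's deposit/citation by layering
// community, then collection (if any), then pub values.
// Assumption: more specific levels override less specific ones.
function imputeMetadata(
  community: Metadata,
  collection: Metadata | null,
  pub: Metadata
): Metadata {
  const layers: Metadata[] = [community, collection === null ? {} : collection, pub];
  const merged: Metadata = {};
  for (const layer of layers) {
    for (const key of Object.keys(layer)) {
      const value = layer[key];
      if (value !== undefined && value !== "") {
        merged[key] = value;
      }
    }
  }
  return merged;
}
```

Note that the collection's values flow into the pub's deposit regardless of whether the collection itself has ever been deposited, which matches the behavior described above.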

Eventually, once a pub has been deposited, certain actions should automatically update the pub-level crossref deposit, with proper batching and the ability to send manual updates. Because collection-level metadata updates (and indeed direct deposits at all) are rare and fairly destructive, I think we should be able to get away with fully manual depositing and updating at the collection level.

From a reader/editor perspective, pubs “live” in non-tag collections, and the collection they live in affects the way they’re presented, browsed, and cited (ie, readers should know the container and might eventually browse a book differently than a journal).

Tag collections are looser, and tags “live” on pubs rather than the other way around.

It’s been a few days of radio silence on my end so I just want to give a quick update on what I’ve been doing:

  • Refactoring the site to use “collections” terminology instead of “tags”. This was a slog, but is done, and I think it’ll be worth it!
  • Refining the collections editor I’ve posted about above and wiring it up to the backend so it’s now fully functional.
  • Starting to think about a more flexible system to create Crossref deposits based on a {community, collection, pub} tuple. My goal is to have a system that can take those three objects and determine the correct part of the Crossref schema that applies to them. Right now we’re only citing pubs as journal_article entities within the context of a journal entity, but depending on which collection we’re using as context, we may want to create a conference instead of a journal, and use appropriate sub-entries.
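To sketch what the last item could look like, here is a minimal TypeScript example that takes the {community, collection, pub} tuple and picks a container element and an item element. The journal/journal_article pairing is the current behavior described above. The book/content_item and conference/conference_paper pairings reflect my understanding of the Crossref deposit schema and should be verified against it. The object shapes, the choice of which title goes on the container, and the function name are all hypothetical, and this is not the proof-of-concept code itself.

```typescript
type CollectionKind = "tag" | "issue" | "book" | "conference";

interface Community { title: string; }
interface Collection { kind: CollectionKind; title: string; }
interface Pub { title: string; }

interface DepositPlan {
  container: { element: string; title: string };
  item: { element: string; title: string };
}

// Chooses which part of the Crossref schema applies to the tuple.
// With no collection (or a tag/issue), fall back to today's behavior:
// a journal_article inside a journal named after the community.
function planDeposit(
  community: Community,
  collection: Collection | null,
  pub: Pub
): DepositPlan {
  const kind = collection === null ? null : collection.kind;
  if (collection !== null && kind === "book") {
    return {
      container: { element: "book", title: collection.title },
      item: { element: "content_item", title: pub.title },
    };
  }
  if (collection !== null && kind === "conference") {
    return {
      container: { element: "conference", title: collection.title },
      item: { element: "conference_paper", title: pub.title },
    };
  }
  return {
    container: { element: "journal", title: community.title },
    item: { element: "journal_article", title: pub.title },
  };
}
```

Keeping this as a pure function of the three objects makes it easy to stress-test against more submission types before any XML is generated.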

I have a proof-of-concept structure for code that will do the last thing, but right now all it does is replicate the existing journal/journal_article submission structure – I’ll want to stress-test it against a few more Crossref submission types before I feel comfortable with it :sweat: @trich this is part of what I want to show you tomorrow, and @gabestein I will probably seek out your input soon to make sure I’m doing something that at least points in the right direction!

Finally, some things that are weighing on my mind but I haven’t really turned my attention to yet:

  • Where and how we surface all collections to admins/readers rather than just tags; there are a few use cases for tags (like the ability to auto-apply them to new pubs) that I think may not generalize very well to collections. In the cases where all collections are visible, how to distinguish different kinds from one another.

  • The final structure of the Collections dashboard pane, addressing Travis’ and Deepak’s input about layout and the semantics of tags versus other collection types. Right now I have a bunch of pieces that we can kind of glue together in different combinations, but it’ll take some time to do that to everyone’s satisfaction.

  • The user-facing mechanics of creating Crossref submissions for collections.

  • The mechanics of setting the “default” collection for a pub, and automatically updating its Crossref submission to reflect this.

As you can imagine, this will probably keep me busy for a while. Personally I’m happy to keep working on this until it’s done to everyone’s satisfaction, but we may want to spend some time prioritizing this stuff and putting some of it “below the line” in light of the branches refactor and the HDSR work that needs to get done.

That’s awesome, Ian, thanks for the update.

Of course, feel free to reach out when it comes time to test some crossref stuff.

In general, I’d like us to be in the habit of shipping something at the end of the cycle unless we’ve specified beforehand that it’s going to be multi-cycle (like branches, for example). That’s both a best practice to avoid scope creep etc, and a way to make sure we can flexibly re-prioritize without having to feel like we’re dropping projects in the middle.

This case is a bit different because it wasn’t super well scoped in the beginning and required some discovery (that’s on me). But when that happens, I think it makes sense to scope down as we’re developing. Knowing what we know now, I think it would be fine to finish up just the collections editor piece absent crossref integration, and focus on integration next. If that makes sense to you, let’s chat today or tomorrow about scoping down to something that seems reasonable for release Friday, then we can focus next two weeks on the crossref integration and some front-end display.

I’m trying to write some flavor text around the “Link to page” dropdown and I’m starting to wonder whether linking to a page feels like a good model for how non-tag collections should work by default. Making users create a page to act as a table of contents is a flexible model, but the fallback behavior of having the collection’s URL be the Algolia search page seems weird outside of the tag context.

I’ve been playing with grid concepts for the dashboard collection view that @trich, @gabestein and I talked about last week — here’s one that makes collections seem a little “weightier”:


I feel like I could iterate on this concept for a while longer and see how it goes; there are a lot of size and positioning issues that are more salient here than when everything is simply row text. I’m not sure if it’s realistic to make tags smaller in this view, since they require all the controls that other collection types have.