Before reading this, please go through the schema documentation again from start to finish - some existing pages have been partially or totally re-written and many new ones have been added.
As the name implies, tasl is intended to be a minimal representation of algebraic data types, using RDF literals and URIs as primitives. On top of that, in places where it can be done consistently and unobtrusively, tasl has some syntactic sugar for common structural patterns.
I can expand on rationale if people are interested, but my general feeling is that I’m committed to these parts:
- the overall algebraic data model
- using URIs for naming classes, components, and options
- using {} for products and [] for coproducts
- calling coproducts “coproducts” and not “unions” or anything else. “union” in particular is misleading since they’re technically discriminated unions (which are different in very significant ways!)
- requiring that every URI used in the schema is from a namespace that is declared at the top. in other words, no “raw” or inline URIs, and you have to declare prefixes for the namespaces you use.
- no option for setting a “default” or empty-prefix namespace
- URIs aren’t quoted; you just write e.g. ex:Person directly. this is a major, forcing design decision w/r/t other syntax elements
- using <> for the URI type and <ex:jdkflsjk> for literals. this matches the informal design language i’ve been developing for graph visualizations, where URIs are diamonds and literals are corner-cut rectangles
- having a concept of a “type variable” that associates types with local (non-class, non-exported) identifiers that you can re-use in later type expressions
- defining global type variables for xsd datatypes like string, integer, etc. This feels like the appropriate degree of defaultishness
- the optional operator ? (it comes before the type, not after)
- using * for references
- using # for comments
- not supporting multi-line comments
- the name “tasl”
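To make the union-vs-coproduct point from the list above concrete, here is a minimal Python sketch (plain Python, not tasl, and not how any tasl implementation works). It assumes two hypothetical option keys, ex:celsius and ex:fahrenheit, that both carry an integer. A plain union forgets which side a value came from; a coproduct keeps the option key attached to every value.

```python
# Hypothetical option keys, used only for illustration.
CELSIUS = "ex:celsius"
FAHRENHEIT = "ex:fahrenheit"

celsius_values = {0, 100}
fahrenheit_values = {100, 212}

# Plain union: the shared value 100 collapses into a single element,
# so we can no longer tell which option it belonged to.
plain_union = celsius_values | fahrenheit_values

# Coproduct (discriminated union): every value is paired with its option key,
# so 100-as-celsius and 100-as-fahrenheit stay distinct.
coproduct = {(CELSIUS, v) for v in celsius_values} | {
    (FAHRENHEIT, v) for v in fahrenheit_values
}

print(len(plain_union))  # 3
print(len(coproduct))    # 4
```

The plain union has three elements and the coproduct has four, which is the "very significant" difference in miniature.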
here are some medium-strongly-held opinions:
- not capitalizing tasl, ever
- having the global variables unit and uri
and here are the things that I’m not really satisfied with or still questioning:
1. unit type syntax
Should the unit type have its own symbol (like !), or should we write it as the empty product object {}? Typically the unit type feels like its own, different kind of type. But mathematically speaking they’re identical. It’s weird to have two different syntactic ways of writing the same thing. It might not matter that much if people just end up using a unit global variable all the time. I’m inclined to switch to {}.
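The “mathematically identical” claim can be checked with a small Python sketch (an illustration only; the function name product_values is made up for this post). A product type has one value for every combination of component values, so a product with zero components has exactly one value, which is what a unit type is.

```python
from itertools import product


def product_values(*component_values):
    """Enumerate all values of a product type, given the values of each component."""
    return list(product(*component_values))


# A product with two components: 2 * 3 = 6 values.
pairs = product_values([True, False], ["a", "b", "c"])

# The empty product {}: exactly one value, the empty tuple.
unit_values = product_values()

print(len(pairs))   # 6
print(unit_values)  # [()]
```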
2. syntax for components and options
>- is pretty weird. I like the way it looks with the ligature font but I still haven’t gotten used to typing it.
I had originally wanted to use a reverse arrow <- for options, but that doesn’t play well with auto-matching brackets, which most IDEs do by default for unrecognized file types and which I’d like to enable anyway for the sake of URIs and literals (every time you type < for an option you’d accumulate an extra > that you’re not going to use).
We can’t use colons because the URIs will all have colons in them:
type foo {
ex:bar: integer
}
I still think using -> for both products and coproducts doesn’t do enough to visually distinguish them:
type foo {
ex:bar -> integer;
ex:jfklfjskl -> string;
ex:jfklfjskl -> string;
ex:jfklfjskl -> [
ex:ajklsa -> integer;
ex:ajklsa -> integer;
ex:ajklsa -> integer;
ex:ajklsa -> integer;
ex:ajklsa -> integer;
];
ex:jfklfjskl -> string;
ex:jfklfjskl -> string;
}
… here the arrows attract all the attention, and the crucial info is hidden at the top and bottom. This isn’t a design problem that most languages have because most languages just have one kind of syntactic map (and they usually get to use colons for it). e.g. in JavaScript the context that “this big block is an object” is almost always implicit and/or you can tell just by noticing the colons; there’s nothing you have to distinguish it from.
no spacer?
One totally different way to go is to not have a spacing token at all:
type foo {
ex:bar integer;
ex:jfklfjskl string;
ex:jfklfjskl string;
ex:jfklfjskl [
ex:ajklsa integer;
ex:ajklsa integer;
ex:ajklsa integer;
ex:ajklsa integer;
ex:ajklsa integer;
];
ex:jfklfjskl string;
ex:jfklfjskl string;
}
this doesn’t cause any technical syntactic problems with parsing, but it doesn’t really help either. plus, property names can vary in length a lot:
type foo {
ex:bar dateTime;
ex:someLongerPropertyName ? boolean;
}
… here it feels like the long property name makes it harder to see what’s going on with the one above it. Go doesn’t use colons in any of its struct or type declaration syntax, but it only works well and looks nice since they have a canonical formatter for every major IDE that auto-inserts spaces:
type foo struct {
    bar                    byte
    someLongerPropertyName bool
}
without that, I do feel like we should have some kind of spacer token, and arrows were the most natural candidate.
type foo {
ex:bar -> dateTime;
ex:someLongerPropertyName -> ? boolean;
}
a different delimiter?
I was reading about dhall and saw that they write coproducts like this: <foo integer | bar integer | baz string> (foo, bar, baz are the option keys). Right now we use the same delimiter ; for products and coproducts… maybe one way to go is to use a different delimiter for coproducts?
This looks really good when you write it on one line (like the dhall example), but it’s not obvious what the right thing to do for multi-line blocks is. If we use the pipe |, which is the most familiar union-ish symbol, we’d have to at least add a space:
type foo {
ex:bar integer;
ex:jfklfjskl string;
ex:jfklfjskl string;
ex:jfklfjskl [
ex:ajklsa integer |
ex:ajklsa integer |
ex:ajklsa integer |
ex:ajklsa integer |
ex:ajklsa integer |
];
ex:jfklfjskl string;
ex:jfklfjskl string;
}
… which is weird. If we move the pipe inside…
type foo {
ex:bar integer;
ex:jfklfjskl string;
ex:jfklfjskl string;
ex:jfklfjskl [
| ex:ajklsa integer
| ex:ajklsa integer
| ex:ajklsa integer
| ex:ajklsa integer
| ex:ajklsa integer
];
ex:jfklfjskl string;
ex:jfklfjskl string;
}
…then suddenly products and coproducts are very different! more different than we asked for!
I’m open to any and all suggestions here.
3. declaring things
It feels like tasl doesn’t really have a consistent approach to declaring things.
Right now, we have four kinds of declarations:
- namespace declarations:
namespace ex http://example.com/
- type declarations:
type foo ? { ex:bar -> integer }
- class declarations:
class ex:Person { ex:name -> foo }
- edge declarations:
edge ex:foo ==/ ex:bar /=> ex:baz
3.1. edges
The last of these - edge declarations - aren’t covered in the documentation yet. But the gist is that
edge ex:foo ==/ ex:bar /=> ex:baz
expands to
class ex:bar {
ul:source -> * ex:foo;
ul:target -> * ex:baz;
}
… note that the class that’s being declared is the middle URI of the edge declaration syntax. the ascii art is supposed to communicate that the middle URI is the “label” of the big arrow, which goes from source to target.
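As a rough illustration of that expansion, here is a Python sketch that rewrites the current edge syntax into the class declaration it stands for. It is plain string rewriting, not a real tasl parser; the function name expand_edge and the regular expression are assumptions made up for this post, and the two-space indentation in the output is just a choice.

```python
import re

# Matches the current edge syntax: edge <source> ==/ <label> /=> <target>
EDGE_PATTERN = re.compile(r"^edge\s+(\S+)\s+==/\s+(\S+)\s+/=>\s+(\S+)$")


def expand_edge(declaration):
    """Rewrite an edge declaration as the class declaration it stands for."""
    match = EDGE_PATTERN.match(declaration.strip())
    if match is None:
        raise ValueError(f"not an edge declaration: {declaration!r}")
    source, label, target = match.groups()
    # The middle URI (the label of the arrow) becomes the name of the class.
    return "\n".join([
        f"class {label} {{",
        f"  ul:source -> * {source};",
        f"  ul:target -> * {target};",
        "}",
    ])


print(expand_edge("edge ex:foo ==/ ex:bar /=> ex:baz"))
```

Running it on the example above prints a class ex:bar declaration with ul:source pointing at ex:foo and ul:target pointing at ex:baz.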
edge is a different kind of syntactic sugar than the optional operator ?. The optional operator works on types - it takes one type in and produces another type. The edge declaration needs to create a whole new class, which means it needs its own URI label and can’t be nested inside other type declarations (this is related to why there can’t be syntactic sugar for multi-valued properties).
One example of another shorthand syntax that we might want to add is list - letting people create classes for linked lists of things. This is the same “kind” as edge in the sense that it needs to be its own class and the user would have to give it its own URI name. One sketch looks like this:
list ex:IntegerList :: integer
which would expand to
class ex:IntegerList ? {
ul:head -> integer;
ul:tail -> * ex:IntegerList;
}
It feels like it’d be smart to anticipate these (regardless of whether list is a good idea or not) and try to preemptively unify the declaration syntax a bit. Maybe :: (or similar) could be the “this is shorthand syntax that expands to a larger class declaration” token, and we could write both edges and lists like this:
# notice how ex:bar comes first now!
edge ex:bar :: ex:foo ==> ex:baz
# and edges and lists look similar!
list ex:IntegerList :: integer
and other shorthand classes we find could all follow the label :: ...weirdstuff pattern. I’m medium-strongly in favor of making this change. I like :: but am open to other suggestions.
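One thing the label :: ... pattern would buy us is a single entry point for expanding every shorthand. Here is a hedged Python sketch of that idea. It assumes the proposed (not yet adopted) :: syntax, the function names expand_shorthand and render_class are invented for this post, and it only handles list element types that are a single token like integer.

```python
def render_class(label, fields, optional=False):
    """Render a class declaration in the syntax used elsewhere in this post."""
    opener = f"class {label} ? {{" if optional else f"class {label} {{"
    return "\n".join([opener] + [f"  {field};" for field in fields] + ["}"])


def expand_shorthand(declaration):
    """Expand a `keyword label :: body` shorthand into a class declaration."""
    head, separator, body = declaration.partition("::")
    if not separator:
        raise ValueError(f"missing '::' in {declaration!r}")
    keyword, label = head.split()
    parts = body.split()
    if keyword == "edge" and len(parts) == 3 and parts[1] == "==>":
        source, _, target = parts
        return render_class(
            label, [f"ul:source -> * {source}", f"ul:target -> * {target}"]
        )
    if keyword == "list" and len(parts) == 1:
        # Sketch only: the element type must be a single token such as integer.
        return render_class(
            label, [f"ul:head -> {parts[0]}", f"ul:tail -> * {label}"], optional=True
        )
    raise ValueError(f"unrecognized shorthand: {declaration!r}")


print(expand_shorthand("edge ex:bar :: ex:foo ==> ex:baz"))
print(expand_shorthand("list ex:IntegerList :: integer"))
```

Both shorthands go through the same split on ::, and only the part after it is keyword-specific, which is the kind of unification I’m after.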
3.2. namespaces, types, classes
Right now we don’t use = or any kind of assignment token, we just declare things with keywords. This is the simplest thing to do, but I wonder if we’re missing an opportunity to mark the distinction between namespaces and types, which are local to the schema, and classes (and edges and whatever else), which are exported to the world.
I’m particularly scared that the difference between types and classes is going to be confusing. Just declaring a type
type foo {
ex:name -> string
}
doesn’t do anything. Only classes actually matter, so the type has to be used in a class declaration before it has any effect:
type foo {
ex:name -> string
}
class ex:Thing foo
The thing that namespaces and types have in common is that they define local alphanumeric (i.e. not URI) identifiers (prefixes and type variables). I keep debating whether it’s worth it to try to visually distinguish them somehow, e.g. with an = token:
namespace ex = http://example.com/
type foo = { ex:name -> string }
class ex:Thing foo
… or maybe by adding another keyword, like export:
namespace ex http://example.com/
type foo { ex:name -> string }
export class ex:Thing foo
export edge ex:Thing2Thing :: ex:Thing ==> ex:Thing
… where just declaring class or edge without export in front of it is invalid syntax. Generally I like the pattern of using keywords for class-level things and tokens for type-level things (that’s e.g. why I don’t want to add type keywords like product { ... }) and this fits with that ethos.
I’m also open to more radical reworkings of the declaration syntax if people have any ideas they like.
4. the name “class”
I’m not particularly attached to it but I also don’t really like the alternatives that much.
Let me know if you have thoughts on any of these! I wrote this pretty fast so some of it might not make sense.