
Okay, since I don't seem to have enough energy for the serious things I have to do, time to start this thread on "Hawkeish Names" and how they could help extensions to ActivityStreams (the vocab used by ActivityPub) and other projects. The idea: decentralized terminology with some collaboration, without requiring a central authority to "ok" a term and without requiring anyone to maintain a namespace.

Thread below, I'll eventually turn it into a blogpost or ActivityPub issue.


First, let's lay out how things work now. A lot of people wonder why json-ld has a context. Well, English is imprecise, and in an open world system we need to be careful about which terms we mean. For example, imagine two ActivityPub implementors each adding a "run" extension, one meaning "run" a program and the other "run" a mile. How do you know which term is being used?

The json-ld context maps shorter human-readable terms to more precise URIs. E.g. "Follow" becomes "w3.org/ns/activitystreams#Foll".
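
Here's a minimal sketch of that mapping, to show how a context keeps the two "run" terms apart. It is not a real JSON-LD processor: `expand_keys` is a hypothetical helper that only handles flat, top-level keys, and both `example.org` vocabulary URIs are made up for illustration.

```python
# Simplified sketch of JSON-LD style term expansion. Only flat, top-level
# keys are handled, which is far less than a real JSON-LD processor does.


def expand_keys(document):
    """Replace each short key with the URI its @context maps it to."""
    context = document.get("@context", {})
    expanded = {}
    for key, value in document.items():
        if key == "@context":
            continue
        # Unmapped keys are kept as-is here to keep the sketch short.
        expanded[context.get(key, key)] = value
    return expanded


# Hypothetical vocabularies from two implementors who both picked "run".
program_doc = {
    "@context": {"run": "https://example.org/ns/programs#run"},
    "run": "backup.sh",
}
fitness_doc = {
    "@context": {"run": "https://example.org/ns/fitness#run"},
    "run": "1 mile",
}

print(expand_keys(program_doc))
print(expand_keys(fitness_doc))
```

Both documents use the same short key, but after expansion the keys are different URIs, so a consumer can tell the two meanings apart.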

Okay, now terms are unambiguous, and we can map new terms to new vocabularies etc. Except vocabularies are hard to maintain, they spread out, and are generally a pain in the ass.

And what about adding extensions to existing vocabularies? The SocialCG has spent about 9 months hemming and hawing about how or if to extend ActivityStreams. That sucks. Can we do something better?

@sandro made a suggestion earlier which I'll call "Hawke Names", from which "Hawkeish Names" are derived. Sandro's suggestion is: the actual important definition of a term isn't the short version of the term, but the paragraph-long definition of the term from the specification, so use that as your key. So in this case, name would be "A simple, human-readable, plain-text name for the object. HTML markup MUST NOT be included. The name MAY be expressed using multiple language-tagged values."

Except *nobody's* going to want to have every key in their json document bloat 20x, so that's not going to happen. So here's where I diverge from @sandro, who thinks this is an "optimization" but which I think is pretty key: you hash the definition text and use that as a hash-based URN (a kind of content-addressed URI). Just to illustrate with sha1 (yeah, I know), the key would look like: urn:sha1:fa53084596e3e1c04b37441b70ad0e6d90907163
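
A minimal sketch of that hashing step, using Python's standard `hashlib`. The function name `definition_urn`, the UTF-8 encoding, and the whitespace normalization are all assumptions for illustration; the thread doesn't pin down exactly which bytes get hashed, so the digest printed here should not be expected to match the one shown above.

```python
import hashlib


def definition_urn(definition: str) -> str:
    """Turn a term definition into a hash-based URN (hex-encoded sha1)."""
    # Assumption: collapse runs of whitespace so line wrapping in the spec
    # text does not change the hash. A real group would have to agree on this.
    normalized = " ".join(definition.split())
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
    return f"urn:sha1:{digest}"


NAME_DEFINITION = (
    "A simple, human-readable, plain-text name for the object. "
    "HTML markup MUST NOT be included. The name MAY be expressed using "
    "multiple language-tagged values."
)

print(definition_urn(NAME_DEFINITION))
```

Changing even one character of the definition gives a completely different URN, which is the point: the key is tied to the meaning, not to whoever controls a namespace.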

Except now you may observe: "Chris, nobody's going to want to read json documents where the hashes are the keys, that's unreadable" and that's true! Thankfully we already have the right solution, which is the json-ld context. Now we can map the context like so:

{"@context": {"name": "urn:sha1:fa53084596e3e1c04b37441b70ad0e6d90907163"},
"name": "Bob McFoo"}

Of course we can put the context somewhere else so we don't need to inline it every time; how and where we should host contexts is a whole topic of its own (hint, content-addressing is actually also the ideal answer)
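
As a rough sketch of the context living apart from the data: the same hash-keyed document can be given readable keys by whichever context a reader applies. `compact_keys` is a hypothetical helper, not a JSON-LD library call, and the "nombre" context is invented purely to show that short names are a local choice.

```python
# The URN is the example key from this thread.
NAME_URN = "urn:sha1:fa53084596e3e1c04b37441b70ad0e6d90907163"


def compact_keys(expanded, context):
    """Swap hash-URN keys for the short terms a context assigns to them."""
    reverse = {uri: term for term, uri in context.items()}
    compacted = {reverse.get(key, key): value for key, value in expanded.items()}
    return {"@context": context, **compacted}


# The unambiguous form, keyed by hash.
hashed_form = {NAME_URN: "Bob McFoo"}

# Two contexts that pick different short names for the same definition.
english = compact_keys(hashed_form, {"name": NAME_URN})
spanish = compact_keys(hashed_form, {"nombre": NAME_URN})

print(english)
print(spanish)
```

Neither context had to "claim" the short name from anyone. Both documents still mean the same thing because they resolve to the same hash.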

Ok, but how do people map from a hash to a term definition? I.e., how do people find and coordinate on the meaning of terms?

There are a couple of ways to do this; one might be that we actually hook it into a peer to peer CAS filesystem and there you go, you can get it. But that's too futuristic. There's a simpler route...

- We have a git repository full of all the well known definitions
- Terms are stored in files like
Person-693bd494bd6d46773c100cd0a60b47ef7b29c962.txt
- There are three directories for terms:
--- stable/: terms that have been around for a while, have significant community uptake and fair understanding in this domain
--- wip/: terms that are "in development" in the community. The community is still "converging" on the exact definition and functionality.
--- dustbin/: terms that didn't "make it"
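
The layout above could be sketched roughly like this. It assumes each file holds the definition text and that the hash in the filename is the sha1 of that text (the thread doesn't spell this out); `add_term` and `lookup` are hypothetical helpers, and a temporary directory stands in for the git checkout.

```python
import hashlib
import tempfile
from pathlib import Path

# The three status directories described above, searched in this order.
STATUS_DIRS = ("stable", "wip", "dustbin")


def add_term(repo, status, short_name, definition):
    """Write a definition to <status>/<short_name>-<sha1>.txt."""
    if status not in STATUS_DIRS:
        raise ValueError(f"unknown status directory: {status}")
    # Assumption: the filename hash is the sha1 of the UTF-8 definition text.
    digest = hashlib.sha1(definition.encode("utf-8")).hexdigest()
    directory = repo / status
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{short_name}-{digest}.txt"
    path.write_text(definition, encoding="utf-8")
    return path


def lookup(repo, urn):
    """Return (status, short_name, definition) for a urn:sha1 key, or None."""
    digest = urn.rsplit(":", 1)[-1]
    for status in STATUS_DIRS:
        directory = repo / status
        if not directory.is_dir():
            continue
        for path in directory.glob(f"*-{digest}.txt"):
            short_name = path.stem.rsplit("-", 1)[0]
            return status, short_name, path.read_text(encoding="utf-8")
    return None


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    # Placeholder definition text for illustration.
    path = add_term(repo, "wip", "Person", "Represents an individual person.")
    urn = "urn:sha1:" + path.stem.rsplit("-", 1)[1]
    print(lookup(repo, urn))
```

Promoting a term from wip/ to stable/ would then just be a file move (a `git mv`) with no change to the hash, so documents already using the term keep working.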

This can be managed as a normal git repo for the project that cares about these terms. Obviously, the hashing mechanism / URN type has to be explicitly agreed upon by the group. But you don't have to "wait" on the group to agree that it's okay for you to "take over" this shortname.

Another, maybe overly nerdy way to put it is to look at Zooko's triangle and see that hashes of definitions are decentralized and globally unique, but not human-meaningful. We need to bring back the human-meaningful mapping in two places: a place for devs to look terms up (the git repo) and readable keys in the document itself (brought by the context mapping).

Anyway @garbados @zack and @freakazoid all wanted to read this when I wrote it up, so there you go.

I'm not sure how clear it is, I'd be curious for feedback.

@cwebber personally, I don't understand why a bespoke format of git repository plus .txt files (and we still have to figure out how to distribute the git repository, or how to handle change control to the different directories) is preferable to the widely deployed DNS and HTTP protocols.

Forgive my naivete, but couldn't we just use URLs and make synonyms if we determine that my definition of a term is the same as yours? (But this isn't my fight, so feel free to ignore my perspective here.)

@npd the mutable web is fragile and pieces of it go down and rot all the time

@cwebber @sandro but hashes, unlike URIs accessible over the Web, aren't dereferenceable: a dev can't see the new vocabulary term and easily look up what it means.

@cwebber @npd You somewhat mischaracterized my proposal. I wasn't suggesting one use the definition text every place in the data you need to refer to the definition. Instead, you use a URI like now, but you connect it rigidly to the definition, so that different URIs can clearly refer to the same thing.

Fairly concrete proposal at sandhawke.github.io/mov/

The schemove implementation isn't quite done.

@cwebber @npd And I REALLY HATE using microblogging for technical discussion, so maybe move it to gh issues? Issues on mov are welcome.

@sandro @cwebber happy to move conversation elsewhere, like if there's a mailing list discussing this topic. I don't know if issues on your GitHub repository are appropriate, since I don't know if the discussion is specific to your particular proposal, or if you want issues on why we should look at alternatives to your proposal.

@npd @cwebber Certainly if there's anything imperfect :-) about movable-schemas it'd be nice to document/discuss in issues there. Then if there's an alternative, we could start talking about it there, sure.

The problem is Chris proposed an alternative to a strawman, so I don't know where to begin.

Also, Chris, if you want feedback on your proposal, maybe make a repo for it?

@sandro @npd Sorry, my response to yours was a strawman: I misunderstood your idea, though I thought it (as I misunderstood it) was good aside from one thing!

I'll have to re-read what you wrote, I'm happy for feedback on what I wrote even if it's not the same thing

I'll write up my proposal on the AP issue tracker anyway when I get some time to breathe

@cwebber Sorry, I know this is serious decentralization talk and all, but every time you write "Hawkeish Names," I'm thinking something like "How about Jeff? That's a hawkish name right there. I know a Jeff, and he's a total hawk."

@therealraccoon I think there's a reference being made here I don't know about
