semantic web / linked data braindump 

This is a biased braindump about linked data and the semantic web. @shel was interested in my thoughts but I wanted to open this to anyone. Keep in mind that a lot of other people are more well informed than I am and I'm sure to get things wrong. That said, here's how I see things:

What's "the right way" to write web applications? You sometimes hear RESTful people talk as if they're pursuing the true path, as Tim Berners-Lee envisioned it. However, while REST is compatible with Semantic Web / Linked Data stuff, TimBL and crew actually had a different vision than the one many RESTful people will describe. The vision: the web is a graph, and should even be machine readable as such.

TimBL has a whole lot of interesting design ideas, and you can read them here: w3.org/DesignIssues/

A lot of this vision for the web is not new; hypertext is an old idea. Project Xanadu is probably the most famous plan for a hypertext system, but it spent a very long time in engineering/vapor mode. What's most interesting about "The World Wide Web" is that it took off because it was easy to cobble together. Still, TimBL had a broader vision than what we see today.

One big idea among these is that "the web is a graph", and in fact it should even be a machine readable graph... that's basically "the semantic web". All vocabularies carefully defined, content written in a structure a computer can read.

What structure could that be? You've basically got two choices: trees or graphs. XML, JSON: these are trees. But note: the web itself is a graph... nodes point to other nodes. It makes sense that the base data structure is thus also a graph. A generalized graph model was described, called RDF.

RDF, or any graph, says you can describe the whole world in triples. Subject Predicate Object: Tim Likes Tea. One node pointing to another node. More here: w3.org/TR/rdf11-primer/

Except anyone can say anything about anything, so actually we need quads: Subject Predicate Object [Optional-Graph]: since you might say different things than I do about Tim, our different opinions may be in their own graph "islands"
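
To make the triple-vs-quad idea concrete, here's a minimal sketch in Go. The `Quad` type, the `InGraph` helper, and the graph names "my-opinions" and "your-opinions" are all made up for illustration; real RDF terms would be URIs rather than bare words.

```go
package main

import "fmt"

// Quad is one RDF-style statement. Graph is optional: an empty string
// means the statement is a plain triple with no named graph.
type Quad struct {
	Subject   string
	Predicate string
	Object    string
	Graph     string
}

// InGraph returns only the statements asserted in the named graph.
func InGraph(quads []Quad, graph string) []Quad {
	var out []Quad
	for _, q := range quads {
		if q.Graph == graph {
			out = append(out, q)
		}
	}
	return out
}

// exampleQuads holds two differing opinions about Tim, each on its own "island".
func exampleQuads() []Quad {
	return []Quad{
		{Subject: "Tim", Predicate: "Likes", Object: "Tea", Graph: "my-opinions"},
		{Subject: "Tim", Predicate: "Likes", Object: "Coffee", Graph: "your-opinions"},
	}
}

func main() {
	for _, q := range InGraph(exampleQuads(), "my-opinions") {
		fmt.Println(q.Subject, q.Predicate, q.Object)
	}
}
```

Both statements can sit in the same store without contradicting each other, because each one records whose graph it belongs to.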

So okay, we can do tim --likes--> tea and that's great, and even put it in its own graph, great.

But how do we describe Tim? or Tea? Or Likes?

If only we had universal resource identifiers for these... wait... we DO have Universal Resource Identifiers for these! URIs! So we can have a URI about Tim, but we can also define Tea and Likes in terms of URIs.

One problem though: it might not be easy to resolve URIs for everything. For example, we may want to be able to refer to a bicycle in the real world, out of band from the network, and how can we do that? This is the source of httpRange-14, an epic bikeshed and standards meme. There were various approaches suggested but the world has mostly settled on (ab)using the fragment part of URIs, the stuff after the hash... https://foo.example/objects/#this-bicycle

Why? Because fragments are resolved client-side!

This is also (ab)used to allow for vocabulary terms all the time.

IMO it's subpar, we should have two types for fragments-you-can-extract-from-documents and oob-fragments. But so it goes.
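
A rough sketch of what "resolved client-side" means in practice, using Go's standard `net/url` package. `SplitFragment` is a hypothetical helper name; the point is only that the part before the hash is what a client would actually fetch, while the fragment never leaves the client.

```go
package main

import (
	"fmt"
	"net/url"
)

// SplitFragment separates the part of a URI that a client would request
// from the fragment, which the client keeps to itself.
func SplitFragment(raw string) (document string, fragment string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	fragment = u.Fragment
	// Clear the fragment so String() yields only the fetchable document URI.
	u.Fragment = ""
	u.RawFragment = ""
	return u.String(), fragment, nil
}

func main() {
	doc, frag, err := SplitFragment("https://foo.example/objects/#this-bicycle")
	if err != nil {
		panic(err)
	}
	fmt.Println("fetch:", doc)
	fmt.Println("names:", frag)
}
```

So the document at https://foo.example/objects/ can be retrieved and can describe the bicycle, while the full URI with the fragment names the bicycle itself.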

Another interesting bit about RDF is that it's an abstraction. There *is* no official encoding. Instead, there are a lot of encodings: Turtle, n-quads, RDF-XML (the worst), and JSON-LD (I like this one because you don't have to know RDF to participate). Those are all serializations but RDF is the Platonic Form Of Data.

Here's the problem: the "semantic web" was most enthusiastically embraced by well meaning but super, well... academic'y academics. Most of this RDF and Semantic Web stuff became buried in jargon. RDF encodings, in general, required a certain amount of expertise to read. Also nobody but semantic web people were using those datastructures.

It seemed like the semantic web was dying under its own academic weight.

So what happened? The Semantic Web got two makeovers:

- It rebranded itself as Linked Data. In fact Linked Data was just a term TimBL used in another document talking about the Semantic Web (w3.org/DesignIssues/LinkedData), but people started to use it as its own term.
- JSON-LD happened. IMO this is a big win: unlike RDF-XML which never really worked if you tried to use XML tooling, a JSON developer *can* use JSON tooling and work with it.

In fact some people who worked on JSON-LD themselves had partly rejected the "Semantic Web" and its baggage... RDF compatibility was not even in JSON-LD's original design and came later. See Manu Sporny's "JSON-LD And Why I Hate The Semantic Web" article manu.sporny.org/2014/json-ld-o

JSON-LD did more than just make linked data accessible to JSON folks: it also gave us a tree-based structure from which one can work with linked data, and it works hard to stay valid in that structure (unlike, say, RDF-XML, which I will never stop beating up on).

IMO the world really is structured in graphs, but most people don't think in graphs, they think in terms of objects. So this matters a lot.

Anyway, lots of folks think they can get around ontologies (defining vocabulary) and linked data ideas but eventually all they do is reinvent ontologies and linked data.

Linked data is great, long live linked data!

Oh and PS: ActivityPub is a linked data system! And that's for a reason!

Oh I left something out (thanks for pointing it out @csarven) but anyway if you don't know RDF you probably know the "Entity Attribute Value" model, or even the "Property Value" model of things from many programming languages:

tim.likes = tea

now make those URIs and we're at RDF!
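
For the curious, here's a small sketch of that last step in Go: the same statement with URIs swapped in, written out as one N-Triples line. The example.org URIs and the `NTriple` helper are invented for illustration, and literals (plain strings, numbers) are not handled.

```go
package main

import "fmt"

// NTriple renders one statement as an N-Triples line, where every term
// is a URI wrapped in angle brackets. Literals are not handled here.
func NTriple(subject, predicate, object string) string {
	return fmt.Sprintf("<%s> <%s> <%s> .", subject, predicate, object)
}

func main() {
	// tim.likes = tea, with each of the three parts replaced by a URI.
	fmt.Println(NTriple(
		"https://example.org/people#tim",
		"https://example.org/vocab#likes",
		"https://example.org/drinks#tea",
	))
}
```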

@cwebber Linked data is fine in some vague notions, but here's two solid issues I have with it:
1) Authoritative Availability (in time). If the source(s) for linked data go down, you have a graph with missing edges and nodes. Linked data does not have a way to route around this censorship.
2) Automatically ingesting "meaning" beyond just the semantics. If two ontologies represent the same idea, it takes manpower to get the machines to equate the idea between the two sets of linked data.

@cj 1) content addressed linked data
2) there's owl:sameAs for that, but full ack that I think nobody's tooling is actually using that stuff
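
One way to picture "content addressed linked data" is the sketch below. It assumes the statements are already serialized one per line and contain no blank nodes (real RDF canonicalization is considerably harder). `ContentAddress` is a hypothetical helper, and how the digest gets turned into a URI is left open.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// ContentAddress hashes a list of statement lines so that the same
// statements always yield the same name, regardless of input order.
func ContentAddress(lines []string) string {
	sorted := append([]string(nil), lines...) // copy so the caller's slice is untouched
	sort.Strings(sorted)
	sum := sha256.Sum256([]byte(strings.Join(sorted, "\n")))
	return hex.EncodeToString(sum[:])
}

func main() {
	statements := []string{
		"<https://example.org/people#tim> <https://example.org/vocab#likes> <https://example.org/drinks#tea> .",
		"<https://example.org/people#tim> <https://example.org/vocab#owns> <https://foo.example/objects/#this-bicycle> .",
	}
	fmt.Println(ContentAddress(statements))
}
```

Because the name is derived from the data, anyone holding a copy can serve it and a reader can check that it matches, which is the sense in which it routes around a source going down.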

@cwebber 1) Sure, but in practice that does not bring about a semantic web. Individuals storing the entire web is a non-starter.

@cwebber Re: #2) That's precisely the problem. It takes hefty compute power, input, and machine learning for a computer to reconcile two ontologies and figure out which terms are the same as others. Otherwise it requires humans manually filling in "owl:sameAs" or hacking the code. Which breaks the point of a machine-readable semantic web.

@cwebber
Confession: the whole semantic FOAF thing fascinates me, but I'm irrationally/ignorantly convinced that it's massively overcomplicated.
Some data just doesn't seem to want to be described relationally, but I'd love to see something as simple as relational … err … relations for FOAF or similar.

@kevinmarks @shadowfirebird @cwebber
no … ooooo! Nice. Very much semantic _web_? Still, something to think about…

@cwebber JSON-LD is terrible for those writing servers with strong typing such as C/C++ and golang. Which is a big inhibitor for adoption. Which is why I'm trying to work on the golang bit. :)

@cwebber I am not super familiar with RDF, so I don't know if I need to read more or if the question is being dodged. The JSON-LD support in golang is non-existent.

@cwebber From my limited understanding, that extension to the library just ensures '@id' is treated as a URI.

Go requires declaring, statically:

type Object struct {
    Id string `json:"id"`
    // and so on
}

...except JSON-LD lets things be JSON objects, arrays, etc which breaks the entire JSON ecosystem in Golang that has built up. Which is why Swagger and automatic API code generation is used.
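
For what it's worth, one possible workaround for the string-or-array case is a custom `json.Unmarshaler` from the standard `encoding/json` package. This is only a sketch: `OneOrMany`, `parseObject`, and the `type` field are illustrative assumptions, and it does not cover values that are nested JSON objects.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OneOrMany accepts either a single JSON string or an array of strings,
// and always exposes the result as a slice.
type OneOrMany []string

// UnmarshalJSON tries the single-string form first, then the array form.
func (o *OneOrMany) UnmarshalJSON(data []byte) error {
	var single string
	if err := json.Unmarshal(data, &single); err == nil {
		*o = OneOrMany{single}
		return nil
	}
	var many []string
	if err := json.Unmarshal(data, &many); err != nil {
		return err
	}
	*o = many
	return nil
}

// Object is a hypothetical statically declared shape for a JSON-LD-ish document.
type Object struct {
	ID   string    `json:"id"`
	Type OneOrMany `json:"type"`
}

// parseObject decodes one document into the static Object shape.
func parseObject(doc string) (Object, error) {
	var obj Object
	err := json.Unmarshal([]byte(doc), &obj)
	return obj, err
}

func main() {
	docs := []string{
		`{"id": "https://foo.example/1", "type": "Note"}`,
		`{"id": "https://foo.example/1", "type": ["Note", "Article"]}`,
	}
	for _, doc := range docs {
		obj, err := parseObject(doc)
		if err != nil {
			panic(err)
		}
		fmt.Println(obj.ID, len(obj.Type))
	}
}
```

It keeps the struct statically typed, at the cost of writing one such wrapper type per shape that JSON-LD allows to vary.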

@cwebber turtle 4 ever! I agree that Rdf xml can burn.

@cwebber Do you need the fourth field if predicates are namespaced or themselves URIs? In TAO (Facebook's graph db) "assoc" (edge) types are identified by a 64 bit number just like everything else, and interpretation is up to whatever code uses it.

Then one could have statements about the various predicates in different vocabularies.

@seanl The graph is optional, hence why we usually refer to triples even though modern RDF has quads. In fact quads didn't even come till later in RDF's design.

@cwebber I've been thinking about this sort of thing a lot w.r.t. replacing Yelp, Google, etc. Ways to publish and discover statements about real world things. The best approach I can think of for actually identifying things is to just have a bunch of places that store descriptions and give them URLs, then gradually de-dup over time by replacing duplicates with redirects. Like Facebook does with places but decentralized.

@seanl Deduplication and archiving is a serious concern. If you've heard me talk about content addressed linked data lately, that's partly why...

@cwebber Content-addressed should work, since deduplication for semantic purposes could be handled by external metadata. All metadata for, say, a Yelp competitor should be fairly small even keeping all versions for all time, given that OpenStreetMap is a few terabytes.

@cwebber All these things attempt to redefine and ignore graph theory in terms of applications in the framework of "Web Science". A dead end.
Imagine if you did the same ignoring other well-established disciplines in logic and mathematics.

But you don't have to take my word for it, check out yourself how many references to "graph theory" or any major graph theoretical works appear referenced in w3c publications.
I don't have a good explanation for this.

@h Sorry but BS... I've worked with plenty of people who are semantic web folks and the problem isn't that they don't understand logic and math and graphs. The problem is that they assume the rest of the world does too.

@cwebber Dude, sorry but BS. I didn't say they don't understand. You have a serious problem twisting words.

@cwebber Imagine if any other serious work were to be taken at face value with no connection to any well-established science.
So basically these publications aren't held to well-established standards because they're folks you happen to know?

I know this is not what you mean, of course, this is not what I think, what I actually said was that I don't have a good explanation for it, if you cared to read without twisting words.

@h Alright I guess I misread you! I wasn't trying to twist your words intentionally.

@cwebber Good, that's better.

@cwebber having to click on all these 'show more' thingies is as annoying as using snapchat

@cwebber you might want to explain what a graph is, and why that's useful here.

@cwebber or at least what we mean by a graph in this context

@nightpool I was getting to it! octodon.social/@cwebber/995595 does that I hope but feel free to add

@cwebber i mean, i'm guessing this is good for pretty tech-y/mathy people but less techy/mathy people might be lost. what does it mean to say the "web itself is a graph"? what is a graph anyway and why do we think of things in terms of graphs?

@cwebber just making that connection more explicit i guess

@nightpool Okay fair enough: ever see those pictures where a bunch of nodes point at other nodes and make a kind of "mesh" structure? That's a graph!

But also, any webpage links to other webpages... wait a minute, one node linking to another, that's also a graph! (Subject-node, Predicate-link-edge, Object-node)

I know that's abstract so if you want to make it clearer go for it :)

@cwebber i mean, *I* get it. i've been doing this stuff for at least a year or so now. @-ing me isn't going to help explain it to other people :P

@nightpool @cwebber

The grim reality is that ordinary folks build the web. Ordinary folks who are not mathematical logicians. The main reason that the semantic web got nowhere was that it was way too complex. The barrier to entry was solid and impenetrable.

If the semantic web is to be reinvented it needs to be a lot simpler. Creating ontologies needs to be in some very simple format like dot or markdown, or maybe even simpler than that.

@bob @cwebber @nightpool *postmodernist voice* the problem with the semantic web is the underlying assumption that knowledge can be structured

@bruno @bob @nightpool *postmodernist voice* the problem with you sending me a message about postmodernist snark is I must assume that language itself cannot be successfully structured and thus this dialogue is an exercise in futility. Moving on!

@bruno @nightpool @cwebber I think there were multiple problems with the semantic web, including this one.

@bruno @cwebber @bob cc @impiaaa <-- should you build programs that make this assumption? why or why not? were you even aware that this was a possible assumption? to what extent is a utf8 string structured and what types of things does it limit?

@nightpool @cwebber @bob in more seriousness, any system of categorization or hierarchization of data points/objects is an exercise of power and an expression of a specific, idiosyncratic ideology. libraries can all use the dewey decimal system because they all have very similar institutional cultures and goals, and so they can reuse that structure over and over. how do you make the diverse ideologies and systems represented in different groups/people on a "semantic web" interface meaningfully?

@bruno @cwebber @nightpool

In defense of the semantic web, there was never anything saying that there had to be a single or definitive ontology. Potentially every site could have had its own ontology and reasoning system. But had the semantic web become popular there probably would have been a consolidating effect.

@bob @nightpool @cwebber Right, I think a fundamental design problem was that average people don't really need to think about ontologies very much, nor do they, while also an explicit ontology set up by some kind of ontology scientist would probably ill serve a lot of people (which is part of the role of a librarian; to be an interpreter between people's expectations/desires and the hierarchical organization system of a library)

@cwebber @nightpool @bob Like I'm one of a very small number of people who have actually used the semantic web outside of academia for practical, money-making and artistic purposes; I don't think there was a design for the semantic web where it would be widespread in the same way blogging is and not an institutional thing.
