Web 3.0 Is The Semantic Web?
So says John Markoff in today's NY Times.
I am really having trouble keeping all these versions of the web clear in my mind.
I think I should go back and look at the feature specs for web 1.0, 2.0, and 3.0 just to be clear about which version introduced which features.
Does anyone know where I can find them?

With all due respect, Mr. Sanchez, you should probably have submitted your executive summary directly to Union Square Ventures.
Fred:
Web 1.0 - The extension of businesses to an online environment, dramatically reducing costs. For example, Amazon selling stuff online and being able to cut costs by not having physical store locations.
Web 2.0 - The extension of Adsense onto websites as a real revenue model, and the use of the browser as a primary application platform. For example, Goowy and the "WebOS".
Web 3.0 - The realization that Adsense-driven websites are "Nonsense". The realization that the browser isn't capable of handling O/S-like functions. Finally, the realization that data is being disaggregated and non-portable. People will become connected to their data and all tethers will be cut.
For more information, see the e-mail I sent you yesterday. I can't wait for Web 3.0.
Posted by: Robert Dewey | November 12, 2006 at 10:38 AM
I think we're a long ways away from a semantic web. To get the level of results suggested by Markoff's article requires understanding the complexities of human behavior, which itself is one of the ultimate goals of A.I. Nothing in the near term can match the customer service of an actual human being.
Semantic search places more strain on the end user as it requires them to provide more input into their query. And sometimes it takes multiple queries to find the content/information you are looking for. Also, I believe a lot of Internet users right now don't use the full abilities of search engines of today.
I agree with Robert's definition of Web 3.0 to some degree. Projects like Parakey and Adobe's Apollo are bridging the gap between the OS and Browser. But, we're not going to see Adsense/Ad-driven websites going away soon.
A few weeks ago, I did a post on what Web 3.0 might be like. Perhaps, what I wrote is somewhere in the middle.
http://www.convos.com/home/insight-what-to-expect-from-web-30.html
Posted by: JP Checa | November 12, 2006 at 11:04 AM
Looked through your blog, JP, and your assessments are pretty much in line with my thoughts. I envision a "data everywhere" scenario, but I don't think companies like GDrive or Omnidrive offer the complete solution.
For example, in order to use GDrive, you will have to use a Google application to access your storage... That creates a continuity problem - you have 10 different storage providers and each storage provider has their own proprietary connection method (and associated application).
That means you will have to specifically seek out the devices that work with your storage, and no two devices will be interchangeable... Yuck.
What *REALLY* needs to happen is that storage providers use existing open protocols, such as NTFS (or design around one central platform and app, like the one we're working on). Users would just connect to their storage on any device and have instant access to their data. The really innovative aspect is that even an O/S could be held remotely. The O/S could even be engineered to determine which device a user is using, and/or automatically distribute the needed applications for that device. It's an open platform, so it would work with ANY O/S designed to take advantage of remote hosting (whether on an intranet OR globally accessible via the WWW). This is something we have been heavily experimenting with as well. "Open-ness" is the key to the future, as there only needs to be *ONE* open solution.
The broadband infrastructure in the United States isn't really capable of handling such a task right now... However, Asia is more than ready with high broadband penetration and very fast speeds (approaching 100 Mbps).
I just hope current VCs are awake and ready for the impending change.
Posted by: Robert Dewey | November 12, 2006 at 12:54 PM
I actually agree with Markoff's article and am glad that this is the direction being highlighted by him in a publication as prominent as the NYT.
There is a renaissance of sorts going on in a growing slice of the web application architects (for some time now) that embraces the underlying design of the web as a very large collection of distributed resources, and as frameworks evolve to rightly supplement this notion, we will begin to see more and more 'natural' cross-site mashups arise.
This is an important paradigm shift. While APIs currently do exist and allow "data in" and "data out" to happen, they do so in a way highly specific to each API-exposed site. To get data out of Flickr, for instance, you need to know and understand the Flickr API. What I think we will eventually get to are applications largely exposing global resources (à la REST); a web where everything really *is* a resource. And when that happens, the concept of "mash-up" will extend far beyond one or two web services at a time.
Indeed, consider http://tumbl.es/ -- this is a simple type-sensitive structured weblog, but it is itself a mashup application; the web is its database. It sources its posts (resources) from another web application which is responsible solely for capturing and publishing said posts (resources).
REST assured you're going to see more and more of this (pun intended). :-)
Posted by: Bosko | November 12, 2006 at 01:02 PM
Great summary of webs 1.0 - 3.0, Robert. You skipped right over "web 0.5" though.
"computerless email appliance."
see the full story here:
http://dogballs.typepad.com/my_weblog/2006/11/web_20_roundup.html
Posted by: marc | November 12, 2006 at 01:05 PM
We are making a very simple matter a little too complex. I have written extensively about this subject (see http://www.openlinksw.com/blog/~kidehen/?id=1072).
The Web has numerous dimensions to it, unfortunately we struggle with the basic acceptance of this reality. The dimensions are:
1.0 - Interactive Web (a Web of Hypertext Docs)
2.0 - Services Web (a Web of APIs and Services)
3.0 - Data Web (Semantic Web - Layer 1)
The limitations of Web 2.0 (Open Data Access) set the stage for Web 3.0 (if you want to call it that) appreciation. Just like the limitations of 1.0 set the stage for 2.0.
It really isn't complex, we are making our way through a continuum where, unfortunately, "versionitis", is the new mechanism for acknowledging the commencement and conclusion of critical inflections.
Posted by: Kingsley Idehen | November 12, 2006 at 01:24 PM
Kingsley,
Of course the "outside" isn't complex. It's the inner workings of HOW it will work that are complex.
It's easy to say "data will be everywhere" or "an O/S will dominate" (okay, back in the 70's)... But designing the roadmap to success is something completely different. Getting from point A to point B is simple... Getting from point A to point Z is something completely different - something a lot more complex.
Posted by: Robert Dewey | November 12, 2006 at 01:30 PM
web 1.0: html
web 2.0: xml/rss/microformats
web 3.0: computer talks back, tells you that's a stupid idea & send you a pink slip.
seriously tho, i think an interesting concept way before we get to semantic web is just to have a unified data language & event model. the internet becomes a pretty powerful device once that happens.
we're fairly close on the unified data language with xml/rss & microformats. don't know if we've quite nailed down the event model yet, but seems doable within the next few years.
Posted by: dave mcclure | November 12, 2006 at 01:58 PM
Dave,
I like how you break down the "versions" by technology, versus business models.
The semantic web is a complex solution. From what I gather, everyone will need their own database of information, and those databases will have to reference each other. There is a good article here:
http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21
The article describes device to device interactivity (i.e. the volume control), database relations (mom's doctor and insurance), etc..
In that pipe-dream world, everyone involved needs a database that is universally accessible. Where will these databases be stored? Will they be distributed databases? How will people be identified? What stops me from looking up someone else's "mom" or doctor?
Good idea and concept, but lots of potential issues.
Posted by: Robert Dewey | November 12, 2006 at 03:43 PM
Robert, you are forgetting the simple fact that the web, as it was originally conceived by Tim Berners-Lee, and as was later developed and highlighted by others, is a very large and distributed set of resources. HTTP itself was designed to POST, PUT, DELETE, and GET resources.
For various reasons, what has happened in the web application world has for long adopted the RPC style of design, not exposing resources but sets of remote procedures.
So it's not a "pipe-dream world," as you call it, or as the SciAm article you referred to at first glance appears to suggest, but it's really the reality and purpose of a very fundamental aspect of the web, and how new-web applications will evolve to embrace it, as well as take advantage of other applications embracing it.
In other words, the web *is* a distributed database of resources already, it's just that not everyone has been paying attention.
FWIW, I agree with Dave McClure's assessment of versioning, based on technology.
Posted by: Bosko | November 12, 2006 at 04:26 PM
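To make the resource-versus-RPC distinction in the comment above concrete, here is a minimal sketch, not anyone's actual implementation. It assumes a hypothetical in-memory `ResourceStore` (the class name, paths, and payloads are invented for illustration) in which the same four verbs apply to every resource, so a client does not need a site-specific list of procedures.

```python
import itertools


class ResourceStore:
    """Hypothetical in-memory store exposing one uniform interface.

    The four methods mirror the HTTP verbs GET, PUT, POST and DELETE;
    nothing here talks to a real network.
    """

    def __init__(self):
        self._resources = {}             # path -> representation
        self._ids = itertools.count(1)   # source of new identifiers for POST

    def get(self, path):
        """Return the representation stored at `path`, or None if absent."""
        return self._resources.get(path)

    def put(self, path, representation):
        """Create or replace the resource at a path chosen by the client."""
        self._resources[path] = representation

    def post(self, collection, representation):
        """Add a resource under `collection`; the store picks the new path."""
        path = f"{collection.rstrip('/')}/{next(self._ids)}"
        self._resources[path] = representation
        return path

    def delete(self, path):
        """Remove the resource; return True only if something was removed."""
        return self._resources.pop(path, None) is not None
```

An RPC-style design would instead expose one named procedure per operation and per kind of data, which is the site-specific knowledge the comment says clients currently need.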
Here's my two cents:
1 + 2 = 3
I had been holding back on this revelation, but the absurdity of ringing in the Web 3.0 era got me to reveal a simple truth that has escaped the geek circle jerkists of Silicon Valley who are wetting themselves over a semantic web where most normal people still haven't heard of del.icio.us or know how to contribute to Web 2.0 in any helpful way whatsoever.
The absurdity is that people think Web 3.0 can happen before they learn how to get normal people using Web 2.0 and off 1.0.
So, my equation again:
Web 3.0 = Web 1.0 + Web 2.0
OR
Web 3.0 = the usability of 1.0 and the group intelligence of 2.0. Let's hold off on Semantic Web crap until people get that normal people -- non-Valley non-techcrunchers non-AVC -- need to be a part of this revolution too.
Posted by: innonate | November 12, 2006 at 05:45 PM
Well, then I must pose the question;
What benefit does the semantic web bring to the average consumer? Easier to locate data, or what?
I'm sure there are obvious benefits, I'm just not well researched on "semantics" and think there are better fish to fry first (like getting data to the "web" before we actually try to organize it). I guess that's kind of what Nate (innonate) is trying to say.
Posted by: Robert Dewey | November 12, 2006 at 06:47 PM
I think John Markoff's article is good on prediction, but is rather high-level on details and examples.
He implies that in the Semantic Web computers will be able to solve complex optimization problems like picking a vacation on a budget. It seems like there is a whole lot of AI attached to the whole semantic web.
I think that this is somewhat misleading. Semantic Web is about annotating data so that it becomes easier to glue the data and services. Semantic Web is about killing the parser. It is about eliminating ambiguity. It has nothing to do with AI. AI is a totally separate, orthogonal problem.
What is missing in the article, however, is the answer to why a computer in the semantic web will be able to help you sift through vacations faster. The main reason is the semantic representation of your preferences. Since the data will be annotated with semantics, as you interact with the data, the computer will be "aware" of the kind of data you are interacting with.
For example, as you go to a travel site and look at information about Paris, the computer will read "tags" in this page. The tags will indicate that it is a travel site and the destination is a city called Paris. Now the computer will fetch a set of actions which make sense in the travel context. It will be able to show you the map of Paris, the weather there and also the deals which are going to be custom tailored to you.
The custom tailoring will be possible because the computer will have a set of your previous travel experiences. It will "know" where you have been, what sort of hotels you like and what airline you travel the most.
All of this is happening not via magic of AI, but via simple mapping and hardwiring.
I agree though, with overall conclusion - we are ready for the next phase of the web.
Alex
Posted by: Alex Iskold | November 12, 2006 at 10:39 PM
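The "simple mapping and hardwiring" described in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions: the tag names, the action names, and the shape of the travel history are all hypothetical and do not come from any real vocabulary or site.

```python
from collections import Counter

# Hypothetical hardwired table: page type -> actions that make sense there.
ACTIONS_BY_TYPE = {
    "travel": ["show_map", "show_weather", "show_deals"],
    "book": ["show_reviews", "compare_prices"],
}


def actions_for(tags):
    """Look up the actions for a page from its (assumed) semantic tags."""
    names = ACTIONS_BY_TYPE.get(tags.get("type"), [])
    subject = tags.get("destination", "")
    return [f"{name}({subject})" for name in names]


def tailor_deals(deals, past_trips):
    """Put deals from the user's most-flown airline first.

    This is a plain lookup plus a stable sort; no learning is involved.
    """
    if not past_trips:
        return list(deals)
    favorite, _count = Counter(t["airline"] for t in past_trips).most_common(1)[0]
    # False sorts before True, so matching deals move to the front.
    return sorted(deals, key=lambda deal: deal["airline"] != favorite)
```

For a page tagged as a travel page about Paris, `actions_for` returns the map, weather, and deals actions bound to "Paris", and `tailor_deals` reorders offers using nothing more than a count over past trips.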
I think Fred's point was the vernacular is murky at best but is necessary for you dudes to pretend like you really know what you are talking about so you can write articles, sell stuff etc. If it wasn't his point, it's mine. It's an overused oversimplification.
Posted by: michael | November 13, 2006 at 12:17 AM
So in this Web 3.0 World can someone tell me how A.I. is going to determine which information it is mining on our networked computers is mine, my wife's or my son's? If we all are linking to things sent to us by someone else (friends or family?) that are not necessarily our "cups of tea" in the long run, will this data be looked at as part of our profile mapping? I also have this nagging feeling that however click fraud is perpetrated, there will be similar issues produced by a semantic web concerning what information I really want and what a 3.0 world may determine is information I want due to usage that is not mine alone or due to places I am driven through a trick here or there with full intention of creating a map of my profile that is pre-determined by another entity and not me. (new life for spam and phishers?) I have always heard that the worst anti-semantic is another semantic...I never believed it until now.
Posted by: wa | November 13, 2006 at 07:40 AM
Classic - suckered the geeks in. Funny. Getting to apply undefined labels in hindsight. I suppose everyone knows what Web 4.0 is then? Maybe we can ask Sir Tim Berners-Lee.
Posted by: Tom Kirkham | November 14, 2006 at 09:28 PM
Web 3.0? Semantic Web? Holy Grail??
I was a student of AI at MIT in the mid 1980s. I watched with interest and skepticism as everyone began to call everything "AI" in order to get funding directed their way, and then in two short years, from 1985 to 1987, LISP went out, C came in, and all VC funded AI deals went bust. Soon everyone was saying AI was either a hoax, or a very hard thing to do, and it would take years. AI became a bad word.
The semantic web appears in too many respects to be the reincarnation of AI on the web, attempting to spark new life, new promises, hoopla, false hopes, hype, etc. In fact, it is many of the same people behind it.
The influencers downplay this all by saying that the semantic web isn't as hard as AI, or it isn't trying to make the web intelligent, but merely able to discover and integrate disparate pieces of information.
A web of data is substantially larger than a web of pages (our current HTML web). If the semantic web is to ever work, somebody must mark up this entire web. Based on a standard ontology that doesn't and may never exist. And then maintain it. Houston we have a problem. Not enough fuel.
The semantic web is an elegant idea that appears to be imminently stillborn for reasons of complete impracticality and lack of maintainable infrastructure.
The semantic web as currently envisioned will work only when:
1) ontologies can be created and maintained by text extractors and crawlers
2) the entire web can be marked up, semantically indexed, and maintained by spiders without human assistance
3) inference over the semantic web does not require an extremely deep heuristic search down multiple, redundant, cyclical pathways with many islands that are disconnected
4) the web becomes smart enough to eliminate websites or data elements that are incorrect, misleading, false, or just plain lousy
That web is either a long way off or is never going to happen. There are many good ideas in the thinking about the semantic web. But the entire vision of the semantic web must be revamped before it will ever work at all. It must allow errors, redundancy, ambiguity, statistical interpretations, opinions and vagaries -- and still produce something useful that is better than manual browsing and searching.
The semantic web as currently envisioned is one and the same as "The AI" in the Trilogy Foundation. We can make real progress when we drop out of the clouds and start looking at what can be done with a less than a perfect semantic web.
Thoughts?
Posted by: Ed Addison | November 21, 2006 at 11:45 PM
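Condition 3 in the comment above concerns search over a graph with cycles, redundant paths, and disconnected islands. The sketch below only illustrates the mechanics of that concern, not a claim about how any semantic web reasoner works: a breadth-first traversal that remembers visited nodes (so cycles do not loop forever) and stops at a depth limit (so the search stays shallow). The function name, the adjacency-dict format, and the node labels are assumptions made for the example.

```python
from collections import deque


def reachable(edges, start, max_depth):
    """Return the nodes reachable from `start` within `max_depth` hops.

    `edges` is a hypothetical adjacency dict: node -> list of linked nodes.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not expand this node
        for neighbor in edges.get(node, ()):
            if neighbor not in seen:  # skip nodes already met (handles cycles)
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen
```

Nodes on a disconnected island are simply never returned, which is one way to see why coverage of such a graph depends on how well it is linked.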
Ed,
Responses to your points re. Semantic Web Materialization:
<<
1) ontologies can be created and maintained by text extractors and crawlers
>>
Ontologies will be developed by Humans. This process has already commenced and far more landscape has been covered than you may be aware of. For instance, there is an Ontology for Online Communities with Semantics factored in. More importantly, most Blogs, Wikis, and other "points of presence" on the Web are already capable of generating Instance Data for this Ontology by way of the underlying platforms that drive these things. The Ontology is called: SIOC (Semantically-Interlinked Online Communities).
See: http://sioc-project.org/
<<
2) the entire web can be marked up, semantically indexed, and maintained by spiders without human assistance
>>
Most of it can, and already is :-)
Human assistance should, and would, be on an "exception basis", a preferred use of human time (IMHO). We do not need to annotate the Web manually when this labor-intensive process can be automated (see my earlier comments).
<<
3) inference over the semantic web does not require an extremely deep heuristic search down multiple, redundant, cyclical pathways with many islands that are disconnected
>>
When you have a foundation layer of RDF Data (generated in the manner I've discussed above), you then have a substrate that's far more palatable to Intelligent Reasoning. Note, the Semantic Web is made of many layers. The critical layer at this juncture is the Data-Web (Web of RDF Data). Note, when I refer to RDF I am not referring to RDF/XML the serialization format, I am referring to the Data Model (a Graph).
<<
4) the web becomes smart enough to eliminate websites or data elements that are incorrect, misleading, false, or just plain lousy
>>
The Semantic Web vision is not about eliminating Web Sites (The Hypertext-Document-Web). It is simply about adding another dimension of interaction to the Web. This is just like the Services-Web dimension as delivered by Web 2.0.
We are simply evolving within an innovation continuum. There is no mutual exclusivity about any of the Web Dimensions since they collectively provide us with a more powerful infrastructure for building and exploiting "collective wisdom".
Posted by: Kingsley Idehen | November 24, 2006 at 10:26 AM
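The point above that RDF is a data model (a graph) rather than a serialization format can be illustrated without any RDF library. In this minimal sketch a graph is just a set of (subject, predicate, object) triples. The identifiers and predicate names are made up for the example and are not taken from the SIOC vocabulary.

```python
# Hypothetical triples describing two posts in an online community.
triples = {
    ("post:1", "has_creator", "user:kingsley"),
    ("post:1", "has_topic", "topic:semantic-web"),
    ("post:2", "has_creator", "user:ed"),
    ("post:2", "reply_of", "post:1"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )
```

Asking "who created what?" is then `match(triples, p="has_creator")`, and the same set of triples could be written out in RDF/XML or any other serialization without changing the graph itself.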
http://www.nytimes.com/2006/11/12/business/12web.html?hp&ex=1163394000&en=a34a6306f48166fb&ei=5094&partner=homepage
http://avc.blogs.com/a_vc/2006/11/web_30_is_the_s.html
Most of this "semantic web" talk is drivel, a boring repeat of the "weak versus strong typing" debate in OO languages in the 80s and 90s. Today's web relies on links that are weakly typed, and that's for a reason: politics. Semantic webs don't scale because of varying points of view and incentives to lie.
That said, there are specific techniques that DO scale: tumbl.es, SRO tuples, wiki, meetup, comparison shopping, auctions, prediction markets, and many more. I agree with Ed that "we can make real progress when we drop out of the clouds and start looking at what can be done with a less than a perfect semantic web." That starts by not giving it such a monolithic name or pretending that there's no semantics now.
The web is already semantic: every link posted exists for a reason and is followed or not followed also for a reason, and those reasons are usually obvious to the agents (humans) that post and that click. If the creator was wholly trusted to state the reason (or type) for the reader, there'd be much deception (think spam), so the reader will always have to intuit it. (So far, we're clearly replaying the OO debate; the difference is that you have so much more help to figure out what links mean on the web, than you do for objects within an OO program.)
One of the main reasons people rely so heavily on search engines is so an expert collective can sort out the reasons links exist, and try to come up with a few heuristics that put links that exist for a reason that a reader cares about, up front. These semantics appear as a division between "organic" results (roughly simulating how humans look to "learn") and the "paid" results (how humans look to "buy"). Each gets their own column. Someday we'll have a third column of opportunities to "help" or answer questions as volunteers, which may lead us to paid work doing either (how humans look to "sell" what they know). This user effort is key.
Ed requires that his web 3.0 "becomes smart enough to eliminate websites or data elements that are incorrect, misleading, false, or just plain lousy", but the algorithms can only be so smart and they can never be wise. The real semantic game is getting users to classify things or negotiate category schemes: If you want less expert collectives to also mark pages with specific tags, they're doing so. If you want even less expert collectives to pound out a reliable namespace to name every concept, they've done so: the Wikipedia is web 1.5 at least. Web 2.0 may just expand it to billions of pages in a few hundred languages. On that you could expect to see dozens of good task-based information architectures evolve: web 3.0. But not one web 3.0. Several. In each of which, as Dave McClure says, "computer talks back, tells you that's a stupid idea & send you a pink slip" - for that web subset only. Imagine the unfairness of being fired from the whole web 3.0, say by having some "global reputation".
A lot of these semantic web approaches just assume that there's a single eventual future agreement on link types. There may be, but only once there were viable competing web 3.0 approaches would it make any sense to try to define a uniform general semantics to simplify all the core tasks of that emerging economy (based on those applications). Go read some linguistics: it's a very tough problem to tie actions to words. I do not believe the W3 people have any clue what they're doing, even their REST axioms and verbs are wrong: what purpose does DELETE serve other than to REDIRECT you to a 404? REST was a happy accident, not a plan. Wiki was another happy accident, and it has more of a future than webDAV.
There are other ontologies from those who do understand real living body semantics, such as Cliff Joslyn's group at LANL. But they biased their work rather heavily towards a mechanistic view of life that starts with biology. That's a more reliable path but will take a very long time to get up from a model of my mitochondrial DNA to planning my trip to France. The real web is more about human scales of action. I suggest this won't be solved by biologists, computer scientists or any hard science geeks.
I suggest rather that it's a social problem requiring negotiation and politics as usual, and that very good social software only comes out of addressing hard social problems. Not toy problems as SIOC addresses, but those that involve supporting lateral/peer human relationships in real working problem solving with the fate of living bodies at stake. Those of us doing this though do not yet agree on how to phrase even basic link and action semantics, or choose one set of verbs. The only real agreement I see among those doing real work on collective construction of serious knowledge bases is that "all users are trolls": stuck in a permanent structural conflict with hosts of these webs, who simply cannot step out of the way to let the users define their own semantics. They are psychologically unable usually, but also held emotionally to account and often also legally liable. The hosts feel they must be in charge.
It's similar to the dilemma of kings in the late 18th century knowing they had to lose control but not knowing how to do so safely, or a university administration at any time: students would ideally work out their own curricula and class schedules as a cohort, but since they can't agree fast enough and dull students can't take the chaos of negotiating, the professors must exist so the dull students can demand that they crack down on the chaos and impose structure (rather than fighting the bright ones; some bright students benefit from structure also, when they're not that interested in the particular topic).
However, good kings and good professors know that they need the chaos even if only to challenge their own long-held assumptions. They have the answers mostly right by the time they become surviving kings or actual professors, so most people won't or can't argue with them. They too need peers - and unruly trolls to question them sharply just to get that one question in 1000 that really changes their views.
To solve the problem of the dullish students who never challenge anyone or create any chaos and are only there to get a degree, most professors agree on just enough basic standard terms to put in textbooks to cover the undergraduate material, and then agree not to fight too viciously in public about everything else. That's exactly what web ontologists do when they agree on stuff in RDF, OVAL, OWL, DAML+OIL, and other mostly useless toys that will be discarded in a few years at best. They're textbooks for the dullish undergrads, not so useful to solve real problems. Who's seen such an application in the wild?
Today's real semantic webs are intranets, and they aren't using these "standards" and you can't see what they're doing, usually. Slowly they're evolving as I suggest, via wiki and social protocols. A dating service might be the most public semantic web application: each link has a pretty clear purpose. But you have to be a human with emotions to understand it at all. LinkedIn is also strongly semantic, since it has such a narrow field of discourse. Likewise, within organizations, you would understand the purpose of any link via your familiarity with the processes of the organization's daily work. If this is extremely rigorously documented, like the US military's SGML-based manuals for every process it performs, you might be able to say that all requirements of a semantic web are satisfied. But that's an exceptionally well run intranet, and certainly it isn't a peer-run system or participatory democracy. It's therefore foolish to assume that a military-style hierarchy will define web semantics or that a more controlled and centralized semantic test suite is going to yield some eventual standard.
That's not what happens in politics, and this is politics. What happens in politics is parties: groups of people who agree that they despise each other's views slightly less violently than everyone else's and would not fight against imposition of a regime by someone from within their own group, and at least not fight to the death against imposition by a regime by another, as long as another election came soon and everyone followed some rules to limit the adversarial process to use ballots and words, not bullets. A main purpose of parties is to let believers argue out their semantics before encountering others too different from themselves (and maybe having to fight them). Or, put another way, the tents are kept far enough apart that those peeing out don't pee only on each other, and most of the pee is expended during election time so there's less left between elections when people have to approach each other's tents...
So I submit that semantic webs won't evolve anywhere until we have semantic parties. Perfectly good strongly semantic hypertext systems from the 80s (IBIS, NoteCards) died absolutely dead in the web era instead of being adapted. Why? I submit it's because semantic webs don't scale well: little inhouse link-type regimes die in contact with the world wide web as surely as dictatorships that open up to emigration, journalism, art and trade must eventually embrace democracy. When characterizing the differences between hypertext systems I don't see a neat upgrade path from one technology to another, but rather several dimensions that mirror those in real politics: from authoritarian to peer participatory democracy, from top secret to utterly public and transparent, from fully web-integrated to standalone silos, from bodily harms at risk in every transaction to just talk about nothing of any bodily potential, from highly mobile to just-sit-on-your-butt, from well funded to volunteer only, from (most important) optimizing fundamental energy and material efficiency versus funding the next con job.
Somewhere in the middle of all these scales are the political applications that I've worked on lately. I require a full capital asset model and a mature regret minimization methodology just to sort out the right thing to do next. And I'm there because that's the exact middle of the problem - where everything is the most muddled and the water's warm. From all that pee, mostly...
Posted by: Craig Hubley | December 22, 2006 at 12:49 PM
As far as I understand it, the difference between Web 2.0 and Web 3.0 has more to do with how the metadata is generated. With Web 2.0 the metadata is created by people who tag websites and documents using sites like delicious or XML code. With Web 3.0 the hope is that search engines can be created that actively send out spiders to search the web and parse the web sites and documents ontologically without the aid of human beings. This would obviously take an enormous amount of processing speed and server space, but it would also alleviate concerns about people having to spend enormous amounts of time tagging everything on the web.
Posted by: carver94 | February 10, 2008 at 11:36 AM