There are only a few months left before the code freeze on September 1st. Now that Fields API has settled in core, it's time to extend it with some RDF semantics. DERI Galway is hosting an RDF in Drupal code sprint from May 11th until May 14th.
This sprint builds on the ideas Dries expressed in his recent posts "Drupal, the semantic web and search" and "RDFa and Drupal". With RDF in Drupal core and RDFa output by default, tens of thousands of websites will suddenly start publishing their data as RDF.
So far 8 people have signed up. How about you?
- Stéphane Corlosquet, DERI (organizer)
- Florian Loretan, HappyPixels (organizer)
- Rolf Guescini, Cerpus
- Benjamin Melançon, Agaric Design
- Stefan Freudenberg, Agaric Design
- Frédéric G. Marand, OSInet
- Mark Birbeck, webBackplane
- John Morahan, io1
Some others are willing to come but cannot afford the trip until some funding is secured. To help us fund the sprint and bring more Drupal rockstars on board, please consider making a donation using the ChipIn widget on this page. The money will be used to cover flight, food and hotel costs for the sprinters. All sprinters are generously donating their time to make this happen. It would also be great to fly in a few additional people with extensive testing and Fields experience. Any excess money will be used to add more people, or will be donated to the Drupal Association.
Goals of the code sprint
The RDF code sprint will focus on Drupal core and aims to integrate RDF semantics into it.
- Extend Fields API to integrate RDF mappings for each field instance. The semantics of a field can differ from one bundle to another. This can be stored either in the existing settings property or by adding an rdf_mappings property to the Field Instance objects.
- Modify the Fields UI (contrib) to allow editing of RDF mappings.
- Define the appropriate mappings for the core modules, based on the RDF core mapping proposal.
- Patch core modules with the mappings defined above.
- Export these mappings in RDFa via the theme layer and keep it as generic as possible in order to ease the work of the themers.
- Write tests for RDF in core.
- Identify other non-fieldable entities in core which could benefit from being RDF-ized, and see how to annotate them. Comment is one example. Terms also, though they might become fieldable.
- RSS 1 (RDF) in core. Arto volunteered to get started with that.
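To picture the first goal in the list above, here is a minimal sketch written in Python rather than Drupal's PHP. Nothing in it is Drupal API: the `RDF_MAPPINGS` table, the bundle and field names, and the `rdfa_wrap` helper are all hypothetical, and the predicates are only examples. It illustrates one idea, that the same field can carry different semantics in different bundles, and that a generic output helper could emit the RDFa `property` attribute from such a mapping.

```python
from html import escape

# Hypothetical per-instance mappings: (bundle, field name) -> list of CURIEs.
# The same "title" field means different things in different bundles.
RDF_MAPPINGS = {
    ("article", "title"): ["dc:title"],
    ("person", "title"): ["foaf:name"],
}


def rdfa_wrap(bundle, field, value):
    """Return the escaped value, wrapped in an RDFa span if a mapping exists."""
    text = escape(value)
    predicates = RDF_MAPPINGS.get((bundle, field))
    if not predicates:
        # Unmapped fields are output unchanged (apart from HTML escaping).
        return text
    return '<span property="{}">{}</span>'.format(" ".join(predicates), text)


print(rdfa_wrap("person", "title", "Frank Jones"))
print(rdfa_wrap("article", "title", "RDF in Drupal"))
```

Keeping the mapping keyed by bundle and field is just one possible layout; storing it in the instance settings, as the goal suggests, would work equally well for this illustration.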
For the currently open RDF issues, see RDF issues in core.
See also the RDF code sprint wiki page, where we will keep an up-to-date list of goals.
Comments
What is RDF?
I'm probably just showing how little I know, but what is RDF? And what does having it in core mean to me as a user?
- Alan Tutt
Exceptional Personal Development for Exceptional People
http://www.PowerKeysPub.com
Intro to the Semantic Web
RDF is a W3C standard to add semantics to the data of your site and enable interoperability on the Web. Think of it as RSS on steroids. Watch this great video: Intro to the Semantic Web.
Some of the ways it helps end-users...
1. Better SEO; RDF allows Google and other search engines to have context about your site's content. They'll understand that "Frank Jones" is the name of a person, not just some random text. They'll understand that a random node on your site is a review for a book with a rating of 2/5 stars. Think search engines on steroids.
2. Better opportunities for interoperability. Data on your site can be "mashed up" with data from other people's sites in all sorts of interesting ways.
3. Once you explain what the content of your pages is, it becomes really easy to pull in related content from elsewhere on your site (or elsewhere on the web), which helps your visitors find what they're looking for quickly and easily.
(scor, feel free to correct me if I'm wrong in any of this; this is just what I learned from researching OpenCalais the other week.)
right!
you're perfectly right webchick!
Thank you, Webchick. I have
Thank you, Webchick.
I have to say that this sounds like an interesting theory, and hopefully it will turn out to have practical uses as well. Has anyone split-tested this to see if it really does produce better SEO? Are there any examples of live sites using it to improve site usability?
- Alan Tutt
Exceptional Personal Development for Exceptional People
http://www.PowerKeysPub.com
Well note that this exists right now...
It's not like this is science-fiction stuff that "could" someday appear. :) Google, for example, is parsing this stuff as we speak, and directing priority traffic to sites that implement RDF and Microformats.
For example, try searching Google for "name of movie movie" and you'll see something like this:
That aggregated rating is parsed from sites that implement microformats to explain that the "5" that the search engine finds in that page is actually a "5 stars out of 5" rating on a movie review. If you click into that link, you'll see a variety of sites. One of them that usually comes up is http://www.commonsensemedia.org/ which also happens to be a Drupal site that implements the hreview microformat.
Drupal makes a particularly interesting/powerful platform to put RDF into because there is literally no limit to the type of content Drupal can manage, so we have a real opportunity to be leaders in this area, and move this power into the hands of people who are not comfortable hand-editing HTML.
Great job Angie!
That was really great Angie! The previous explanation along with this one in the screenshot just saved me a couple hours of reading.
Cheers,
Elijah Lynn
-----------------------------------------------
The Future is Open!
Yahoo! SearchMonkey
AlanT, make sure you also watch the video about SearchMonkey. It features enhanced results you can already see on Yahoo! search, for example when searching for art of pizza chicago.
Does this mean that Drupal will be documented?
It looks like if you have to ask, you don't belong. Continuing with the strong Drupal tradition of writing new code without backward compatibility, we can now release Drupal 7 years ahead of documentation for Drupal 6. In fact, you can forget about 6 documentation entirely - read the code, that should be enough.
Excellent!
I'm always happy to meet someone passionate about seeing Drupal's documentation improved. :)
It's important to note that anyone can click the "edit" tab on any handbook page and fix it if they notice something inaccurate. Or, if you come across something that's not documented yet, write down as much as you've managed to figure out, and then file an issue in the queue, either against a particular module if it's for that, or against the "Documentation" project if it's for something more general such as a page in the handbook. The documentation team is a really great bunch of volunteers who love to help those who want to help Drupal, and would be more than happy to proof-read your work, collaborate with you on something, or direct you to the proper channels. http://drupal.org/contribute/documentation has more information on getting involved.
Looking forward to your contributions! :D
TOTALLY off-topic for this thread, but...
I was bored, so did some quick calculations.
A quick and very unscientific grep of the drupal core modules says that from:
24699 lines (just the core /modules directory, not API, excluding the html templates)
22477 are /not/ blank.
4721 /look/ like inline function docs ( with *)- the core phpdoc documentation as seen on api.drupal.org
1586 are inline explanatory docs ( with //) - available on api.d.o and useful to any developer.
Taking a look at the code,
2601 lines are calls to t() - which contain more text than code and are just ui messages, it's not like there are per-line docs needed there.
2931 lines contain nothing but "}" on its own - not exactly confusing to anyone reading docs.
(2601 + 2931) = 5532 non-documentable lines
sooo .. the way I look at it, there are
(4721+1586)= 6307 lines of doc to (22477 -6307 -5532)= 10638 lines of code.
a little over 1 line of documentation per 1.7 lines of actual code that may need explaining. +/- 5%
So that's (a little) like the developers spending 22 minutes of every hour explaining what they are doing in the remaining 38 minutes.
Line-count-based metrics are an extremely flawed way of measuring code quality, BUT I still don't understand why these results (roughly 2 lines of docs for every 3 lines of code) could be held up to call Drupal6 'undocumented'.
Do we need the "talking about things" to outweigh the "actually doing things" portion of the code before it can be called "adequately documented"?
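As a hedged check of the arithmetic above, the same calculation can be written out in a few lines of Python. The counts are copied from the figures quoted above; the variable names are introduced here for illustration, and this is not the commenter's original script.

```python
# Line counts as quoted in the comment above.
non_blank = 22477        # non-blank lines in core /modules
docblock_lines = 4721    # lines that look like phpdoc ("*") documentation
inline_comments = 1586   # lines of inline "//" documentation
t_calls = 2601           # lines that are calls to t()
lone_braces = 2931       # lines containing only "}"

non_documentable = t_calls + lone_braces
doc_lines = docblock_lines + inline_comments
code_lines = non_blank - doc_lines - non_documentable

# Lines of code per line of documentation.
code_per_doc = code_lines / doc_lines
# Share of an hour spent "explaining", if time were split like the lines.
doc_minutes = 60 * doc_lines / (doc_lines + code_lines)

print(non_documentable, doc_lines, code_lines)
print(round(code_per_doc, 2), round(doc_minutes, 1))
```

The result agrees with the figures in the comment: about 1.7 lines of code per line of documentation, or roughly 22 minutes of every hour.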
FTR, to expose how bad my maths/cli skills are:
.. there are many tweaks that could be made to this algorithm, have fun.
Of course, I may have totally missed the point, as I'm only talking about docs intended for people who read documentation. I'm not sure what the wordcount on the Drupal handbooks vs contrib code would come in at.
.dan. is the New Zealand Drupal Developer working on Government Web Standards
I'm afraid RDF is an abstract
I'm afraid RDF is an abstract and hard to understand topic. Even amongst seasoned web developers, you'll be hard pressed to find much excitement in RDF. I think that enthusiasm is reflected in the chipin widget. Can you provide more information about how that would translate into real world applications and use cases?
If it's worth anything to
If it's worth anything to you, Dries has given a presentation about RDF and explained his reasoning why he supports it. Tim Berners-Lee supports it. Here's what he said:
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.
RDF _is_ cool, really!
I totally understand this reaction. As technologies go, RDF is dry as dust. It's one step removed from all the cool stuff that it enables, so people don't get excited about it.
As someone who is trying to marry science and semantics -- that is, change current scientific methods so that automated processes can _understand_ them, in a computational sense at least -- I am wildly enthusiastic about this work. I am convinced it will be the most important contribution of Drupal to its users, the users of Drupal sites, and to information technology in general, in this decade at least.
Many of the other links provide the additional information that you are asking for, and I probably can't improve upon them in a post. But I will give a use case from our community (since I wasn't sure where to leave my use case anyway). Those of you who like text more than video may find it helpful. This use case relates the more semantic-oriented technologies of this change to the practical Drupal web site technologies.
Right now we[1] collect references to other documents on the web about science data management. We are starting to categorize and rate them, using custom fields we created for each kind of reference. Anyone who wants to find and use these ratings has to go to the Drupal site, look up each page, and grab that data. The ratings themselves are terms that come from vocabularies we maintain on another 'vocabulary repository' system [2]. We will probably add the ratings data to a custom-built table, but that will cost development time and still require people to navigate to the table and view it, or copy/paste it, to use it for their own purposes. They can only use what we can provide by developing custom software. They can't automate it because if we change our format, or the name we used for the title of a category, their automated scraping of our page will break. They can't tell what the rating words mean unless we also specifically add taxonomies to Drupal that match our rating taxonomies on our own vocabulary system (or build additional tools to do that automatically, which we might have to do). They can't easily relate the ratings on our site to the ratings on another site, or know that our John Doe who rated this Content Standard is the same John Doe who rated it differently on another site.
In the brave new world of RDF in Drupal, here's how I hope it will work:
I will donate some of my own money to make this happen, although the benefit will go to my work life. If someone from the project finds this use case interesting they are welcome to contact me about it.
[1] Marine Metadata Interoperability project, http://marinemetadata.org
[2] MMI Ontology Registry and Repository, http://mmisw.org/or
Structural hints open interesting doors
While you're thinking about how RDF might empower what EPIC called "fact-stripping robots," give some attention to YQL Execute as well. The potential interaction of the two is mind-boggling.
RDF is out there right now and being used
I think some seasoned web developers might not be excited about RDF because they may not all come from a data or information architecture background.
Have a look at http://www.london-gazette.co.uk/
All the Corporate Insolvency notices (and many others) contain large amounts of RDF triples encoded as RDFa. The documents are self-describing: in combination with the ontologies pointed to by the CURIEs, a machine can infer all sorts of information such as company name, number, nature of business, directors, the court hearing date and place, which administrators were appointed, which company they worked for and at which office, and so on.
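To make the role of the CURIEs concrete, here is a minimal sketch in Python. A CURIE such as `foaf:name` is shorthand for a full IRI, obtained by replacing the prefix with its namespace. The `PREFIXES` table and the `expand_curie` helper are illustrative assumptions, not taken from the Gazette's markup; the two namespaces shown are the standard Dublin Core and FOAF ones.

```python
# Illustrative prefix table; a real page declares its own prefixes.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:reference' into a full IRI using the prefix table."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError("unknown or missing prefix in CURIE: {!r}".format(curie))
    return prefixes[prefix] + reference


print(expand_curie("foaf:name"))
print(expand_curie("dc:title"))
```

Once every predicate is a full IRI, a machine can look up the vocabulary it belongs to, which is what lets it tell a company name from a court hearing date.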
Phil
RDF Primer
http://www.w3.org/TR/xhtml-rdfa-primer/
My thoughts
I (obviously) support this big time:
I will take part as much as
I will take part as much as possible (a few meetings on Monday but will be around for most of the week).
Memory usage
I hope that those developing this can take into account memory footprint. I don’t believe I’m alone, based on the RDF module’s issue queue, in running into memory problems with Drupal 6 + RDF on a shared hosting account. Enabling that module seems to take up another ~4 MB.
If this is going into core, I hope that it won’t have that kind of impact on every single Drupal site upgraded to v7.
I’m not complaining or railing against anyone’s hard work on this effort — heck, I want to be able to run the RDF module now — but I think that memory usage is an important consideration.
reduced memory footprint
The RDF API module is not going into core and there won't be similar memory issues in core. We are working to make the RDF in core as lightweight as possible.
RDF in a pharmacist view
What I understand (as a non-technical guy) is that it will help the older search engines (Google & Yahoo) display results like the new search engine from ex-Googlers, http://www.cuil.com (pronounced "cool")...
If I am not asking too much, is it possible for standard search results of websites using Drupal to be displayed like Cuil search engine results?
If you search for "Drupal" in Google and Cuil, you'll know what I mean.
With RDF in Drupal 6 how much more SE traffic do you get?
If you implemented RDF for Drupal 6, would you share your SEO results, such as the % increase in organic search traffic? I am really curious how much I can benefit from its implementation. Some case studies would be great.
I just tried the Calais analyzer, but in some of my posts with 300 words, it could only identify maybe 2 words as the name of the person...
RDF / Microformats will be mainstream soon
Right now I wouldn't expect RDF integration to have a very large impact on traffic to your site. However, just two days ago Google announced support for RDF and Microformats in regular search. Yahoo already has this capability featured in SearchMonkey.
http://googlewebmastercentral.blogspot.com/2009/05/introducing-rich-snip...
http://developer.yahoo.com/searchmonkey/
So expect this to be a hot topic in the coming months/years. It's awesome that we're identifying this tremendous opportunity and making progress toward supporting it! (I donated, you should too)
Moved
Moved to http://drupal.org/node/443824#comment-2751460
-----------------------------------------------
The Future is Open!
I have been using Drupal
I have been using Drupal for a long time for my site and am keen on looking out for the latest updates on Drupal. This new RDF code from Drupal is very interesting and effective, especially its interoperability feature. I strongly recommend it to webmasters for their sites.
[Edit: spam links removed]
neswebdesign are (bad) spammers
Dear "SEO" guy.
Trying to spam the system by adding "rel='follow'" links to your signatures does not work. Perhaps you should consult with a web professional who actually understands SEO, because you are demonstrably useless at it.
.dan. is the New Zealand Drupal Developer working on Government Web Standards
Can we make arbitrary RDF assertions in Drupal7?
Hi guys
It isn't clear to me yet whether it is going to be possible or easy to make arbitrary assertions about a node in Drupal7. I'm thinking of something like the Relations API, but with the ability to make the relation to any URL, not only another Drupal node.
Scor's work on RDFCCK is great, I love that we can use vocabularies, but I want more than that - I want to be able to make assertions about existing nodes that I didn't plan on when I created the CCK model, and in fact that don't apply to most of the nodes, just some of them, so I don't want to add it as a property. I'm thinking of general predicates like
dc:created_by
or
abra:inspired_by
that could apply to many documents and media, but that I don't want to add as CCK properties, as there are a large number of different predicates I might want to apply.
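For readers unfamiliar with the idea, the kind of free-form assertion described here can be sketched as plain subject/predicate/object triples. This is a hypothetical Python illustration only: the `AssertionStore` class, the node paths, and the example URL are invented for the sketch and say nothing about how Drupal7 will or won't support this.

```python
from collections import defaultdict


class AssertionStore:
    """Hypothetical store of (predicate, object) pairs keyed by subject."""

    def __init__(self):
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        # Any predicate can be attached to any subject; no schema is needed.
        self._by_subject[subject].add((predicate, obj))

    def about(self, subject):
        # Return all assertions about one subject, in a stable order.
        return sorted(self._by_subject.get(subject, ()))


store = AssertionStore()
# The object can be any URL, not only another node on the same site.
store.add("node/12", "abra:inspired_by", "http://example.org/some-poem")
store.add("node/12", "dc:created_by", "http://example.org/people/jane")

print(store.about("node/12"))
print(store.about("node/99"))
```

Because the predicates are not part of the content model, only the nodes that actually need an assertion carry one.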
I don't always read this list; if this makes sense to anyone, could you possibly copy a reply to gvelez17 && gmail.com ?
thanks!
--Golda
http://iwhome.com