Problem/Motivation
Drupal 7 includes default RDF mappings from quite a few namespaces (FOAF, Dublin Core, SIOC, SKOS, etc). These were the popular vocabularies back in 2009. From the feedback we got, it seems that the number of namespaces in Drupal 7 confused people. Since then, schema.org was launched with a broad set of types and properties all under the same namespace, and it is backed by the major search engines.
Proposed resolution
List the types and properties from schema.org which are relevant for the types of data we have in core (taxonomy term, user, file, their fields, etc) and in the standard install profile. There might also be some terms from schema.org we could use to "suggest" mappings when a user creates a new content type or field depending on the field type. The notion of "default" mapping should go away in Drupal 8, and be replaced with suggested mappings so that users are not forced into using a particular default mapping without knowing it. This approach was used by Lin Clark in the microdata module.
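To make the difference between a default and a suggested mapping concrete, here is a minimal sketch in Python (chosen only for brevity; this is not Drupal code). The field type names, `SUGGESTED_MAPPINGS`, `suggest_mappings` and `apply_mapping` are all hypothetical; the property choices are taken from the mapping table in this issue.

```python
# Hypothetical lookup of suggested (not default) schema.org properties,
# keyed by field type. The field type names are assumptions.
SUGGESTED_MAPPINGS = {
    "image": ["schema:image"],
    "text_with_summary": ["schema:text"],
    # Marked with a question mark in the mapping table, so still tentative.
    "taxonomy_term_reference": ["schema:keywords"],
}


def suggest_mappings(field_type):
    """Return suggested schema.org properties for a field type.

    An empty list means there is nothing to suggest.
    """
    return list(SUGGESTED_MAPPINGS.get(field_type, []))


def apply_mapping(field_config, chosen=None):
    """Record a mapping only when the user explicitly picked one."""
    updated = dict(field_config)
    if chosen is not None:
        updated["rdf_properties"] = [chosen]
    return updated
```

The point of the sketch is that a suggestion is only offered; a field gets no mapping at all until the user accepts one.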
Remaining tasks
- Add test for user mapping and user page.
- Add test for term page.
- Add test for teasers in node listing.
- Add the mapping for page comments.
User interface changes
None. There is currently no UI in core to choose RDF mappings.
Property | Drupal 7 mapping | Drupal 8 mapping | Notes |
---|---|---|---|
# node.page | | | |
type | foaf:Document | schema:WebPage | |
title | dc:title | schema:name | |
created | dc:date, dc:created; datatype => xsd:dateTime; callback => date_iso8601 | schema:dateCreated | should use the HTML5 <time> element if rendered |
changed | dc:modified; datatype => xsd:dateTime; callback => date_iso8601 | schema:dateModified | should use the HTML5 <time> element if rendered |
body | content:encoded | schema:text; datatype => rdf:HTML | |
uid | sioc:has_creator (rel) | schema:author | |
name | foaf:name | NA | this should no longer need to be explicitly in comment, but only as a mapping to user |
comment_count | sioc:num_replies; datatype => xsd:integer | schema:interactionCount | |
last_activity | sioc:last_activity_date; datatype => xsd:dateTime; callback => date_iso8601 | | |
# node.article | | | |
type | sioc:Item, foaf:Document | schema:Article | Article is more generic than BlogPosting or NewsArticle, and given that we don't know what kind of article the end user will use this content type for, I think it's safer to just use the generic schema:Article |
title | (same as node.page) | (same as node.page) | |
created | (same as node.page) | (same as node.page) | |
changed | (same as node.page) | (same as node.page) | |
body | (same as node.page) | (same as node.page) | |
uid | (same as node.page) | (same as node.page) | |
name | (same as node.page) | NA | this should no longer need to be explicitly in comment, but only as a mapping to user |
comment_count | (same as node.page) | (same as node.page) | |
last_activity | (same as node.page) | (same as node.page) | |
field_image | og:image, rdfs:seeAlso (rel) | schema:image | |
field_tags | dc:subject (rel) | schema:keywords ? | |
# node.forum | | | |
type | sioc:Post, sioct:BoardPost | schema:Discussion | Not an official type yet, but there is a proposal for this type |
title | (same as node.page) | (same as node.page) | |
created | (same as node.page) | (same as node.page) | |
changed | (same as node.page) | (same as node.page) | |
body | (same as node.page) | (same as node.page) | |
uid | (same as node.page) | (same as node.page) | |
name | (same as node.page) | NA | this should no longer need to be explicitly in comment, but only as a mapping to user |
comment_count | (same as node.page) | (same as node.page) | |
last_activity | (same as node.page) | (same as node.page) | |
taxonomy_forums | sioc:has_container (rel) | | |
# comment (all) | | | |
type | sioc:Post, sioct:Comment | schema:Comment | |
title | (same as node.page) | schema:name | |
created | (same as node.page) | schema:dateCreated | |
changed | (same as node.page) | schema:dateModified | |
comment_body | content:encoded | schema:text | |
pid | sioc:reply_of (rel) | | |
uid | sioc:has_creator (rel) | schema:author | |
name | (same as node.page) | NA | this should no longer need to be explicitly in comment, but only as a mapping to user |
# taxonomy_vocabulary (all) | | | |
type | skos:ConceptScheme | | |
name | dc:title | | |
description | rdfs:comment | | |
# taxonomy_term (all) | | | |
type | skos:Concept | schema:Thing | |
name | rdfs:label, skos:prefLabel | schema:name | |
description | skos:definition | schema:description | |
vid | skos:inScheme (rel) | | |
parent | skos:broader (rel) | | |
# taxonomy_term.forums | | | |
type | sioc:Container, sioc:Forum | | |
name | (same as taxonomy_term.all) | | |
description | (same as taxonomy_term.all) | | |
vid | (same as taxonomy_term.all) | | |
parent | (same as taxonomy_term.all) | | |
# user (no bundle) | | | |
type | sioc:UserAccount | schema:Person | |
name | foaf:name | schema:name | |
homepage | foaf:page (rel) | NA | I would suggest no longer mapping the homepage field, because it is only available for anonymous users and its value cannot be trusted as much as that of an authenticated user (spam). People can create a homepage field on the user entity type if they want to |

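
The motivation for the change shows up when a few rows of the table above are written down as plain data. The following is only an illustrative Python sketch, not Drupal configuration; `NODE_PAGE_D7_TO_D8` and `namespaces_used` are hypothetical names, and the values are copied from the node.page rows.

```python
# A few node.page rows from the table above: (Drupal 7 terms, Drupal 8 term).
# None means the table lists no Drupal 8 mapping for that property.
NODE_PAGE_D7_TO_D8 = {
    "type": (["foaf:Document"], "schema:WebPage"),
    "title": (["dc:title"], "schema:name"),
    "created": (["dc:date", "dc:created"], "schema:dateCreated"),
    "changed": (["dc:modified"], "schema:dateModified"),
    "body": (["content:encoded"], "schema:text"),
    "uid": (["sioc:has_creator"], "schema:author"),
    "comment_count": (["sioc:num_replies"], "schema:interactionCount"),
    "last_activity": (["sioc:last_activity_date"], None),
}


def namespaces_used(mapping, column):
    """Collect the namespace prefixes used in the 'd7' or 'd8' column."""
    prefixes = set()
    for d7_terms, d8_term in mapping.values():
        if column == "d7":
            terms = d7_terms
        else:
            terms = [d8_term] if d8_term else []
        for term in terms:
            # A term looks like "prefix:LocalName"; keep only the prefix.
            prefixes.add(term.split(":", 1)[0])
    return prefixes
```

For these rows the Drupal 7 column draws on four prefixes (foaf, dc, content, sioc), while the Drupal 8 column uses only schema.
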
Comment | File | Size | Author |
---|---|---|---|
#46 | 1784234-46-schema-mapping.patch | 36.13 KB | linclark |
#44 | 1784234-45-schema-mapping_includes1941286.patch | 38.63 KB | linclark |
#44 | interdiff.txt | 3.92 KB | linclark |
#37 | 1784234-37-schema-mapping.patch | 36.17 KB | linclark |
#33 | 1784234-33-schema-mapping.patch | 36.22 KB | linclark |
Comments
Comment #1
Anonymous (not verified) commented: I can attest to this. In surveys I performed, half listed SEO as the primary reason they used the RDF module, but only 1 of the 30 respondents had made alterations to the core mappings (which would be required to get SEO results).
Comment #2
chx commented: Talked to dale42 about this today; I recommended typed_data()->getDefinitions() as the starting point.
Comment #3
Anonymous (not verified) commented: Usually, we map terms from vocabularies like Schema.org to fields and content types. While Schema.org does have its own datatypes, they aren't frequently used. Datatypes are often covered by XSD and other vocabularies. What specifically was dale42 interested in doing?
Comment #4
chx commented: No idea :) do you mean field types? dale42 was also very confusing :/ but anyways, field types and content types are not the same "level" so to speak. Nor are individual fields... I was trying to give some guidance on D8, anyways.
Comment #5
Anonymous (not verified) commented: These are the kinds of mappings that we'll need to figure out:
Comment #6
judahtanthony commented: FYI, I just got notice from Google that they are going to start looking at the itemprop="logo".
We could do something like:
We may have to make some style changes to accommodate this, but the idea remains.
Comment #7
scor commented: The logo property in question is for the Organization type. This would make sense if we were shipping with an organization content type. I don't think core will ever have such a content type, but distributions may.
Comment #8
Anonymous (not verified) commented: Topping. This should be handled ASAP if we're going to get it in before July 1.
Comment #9
scor commented: Yes, I'm going to take this one on.
Comment #10
Anonymous (not verified) commented: I think it will actually reduce the duplication of effort if this patch is finished before #1778410: Throw exception when RDF namespaces collide, so I've started on it.
The patch includes mappings and tests for Article and Page. I had two thoughts on the mappings you suggested for those:
This patch includes a concept called the SchemaOrgDataConverter. This would provide callbacks for formatting values to schema.org's standards... for example, interactionCount as "UserComment:5". However, I haven't written the tests for the front page yet (which is where comment count is displayed), so I haven't made it work yet.
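The converter idea can be sketched as follows. This is a Python stand-in written for illustration, not the code in the patch; the method name `interaction_count` and its parameters are assumptions, and only the "UserComment:5" output format comes from the description above.

```python
class SchemaOrgDataConverter:
    """Illustrative stand-in for a schema.org value-formatting helper."""

    @staticmethod
    def interaction_count(count, interaction_type="UserComment"):
        """Format a count as '<interaction type>:<count>'."""
        count = int(count)
        if count < 0:
            raise ValueError("count must be non-negative")
        return f"{interaction_type}:{count}"


# Usage: format a comment count of 5 for schema:interactionCount.
formatted = SchemaOrgDataConverter.interaction_count(5)
```

Keeping the formatting in one place means the mapping only has to name the callback, and the stored value (a plain integer) stays untouched.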
Comment #11
scor commented: This needs to be rerolled now that #1869600: Refactor RDF mappings to be inline with the new Entity Field API has been committed.
Let's keep going in the direction described in #15, I agree with all of your points, Lin.
Comment #12
Anonymous (not verified) commented: Rerolled to account for the refactor RDF patch and the taxonomy term API switching to entity reference.
Comment #13
Anonymous (not verified) commented: Reassign.
Comment #14
Anonymous (not verified) commented: I've added the mapping for article comments.
We currently do not have a mapping which relates the node to the comment (which would be schema:comment). This should be easy once #1907960: Helper issue for "Comment field" is in.
In rdf_preprocess_comment, we were accessing the mapping for 'title' instead of 'subject', which is what comment calls its title. I changed this, but it might break something else. I was surprised that our CommentAttributeTest didn't catch it.
Comment #16
Anonymous (not verified) commented: Fixing the test.
Comment #17
Anonymous (not verified) commented: I suggest that we don't even worry about mapping vocabularies... if the closest we can get is schema:Thing, then there isn't anything useful that could really be done with the information anyway.
Comment #18
Anonymous (not verified) commented: Changed the mapping for tags.
Comment #19
Anonymous (not verified) commented: Changed mapping for users.
Comment #20
Anonymous (not verified) commented: Moved the user mapping to user module (since it is independent of standard profile) and made some code style fixes.
I also changed the mappings for forum nodes and terms. I didn't add tests for this since it isn't part of the standard profile and we don't have tests for them at this point. I don't think it is worth the additional effort of testing them, since we are already testing the rendering of RDFa in nodes and terms.
Left to do:
Comment #20.0
Anonymous (not verified) commented: initial draft
Comment #21
Anonymous (not verified) commented: In this patch, I make it possible to use methods on classes as datatype callbacks, add a test for this, and add tests for the front page display.
I also rearranged the test assertions so that mappings that are the same (e.g., the shared mappings between page and article) are tested by the same function (e.g. _testCommonNodeProperties).
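A rough sketch of what "methods on classes as datatype callbacks" can look like, written in Python for illustration only (it is not the patch's code). The "Class::method" string form, `DateConverter`, `raw_value`, `resolve_callback` and the `datatype_callback` key are all assumptions made for this example.

```python
from datetime import datetime, timezone


class DateConverter:
    """Hypothetical class whose static method serves as a callback."""

    @staticmethod
    def iso8601(timestamp):
        # Format a Unix timestamp as an ISO 8601 string in UTC.
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def raw_value(value):
    """Plain-function callback: expose the value unchanged, as a string."""
    return str(value)


# Registries of what a callback string is allowed to refer to.
CLASSES = {"DateConverter": DateConverter}
FUNCTIONS = {"raw_value": raw_value}


def resolve_callback(spec):
    """Resolve 'Class::method' to a method, or a bare name to a function."""
    if "::" in spec:
        class_name, method_name = spec.split("::", 1)
        return getattr(CLASSES[class_name], method_name)
    return FUNCTIONS[spec]


def convert(value, mapping):
    """Apply the mapping's datatype callback, if it declares one."""
    spec = mapping.get("datatype_callback")
    return resolve_callback(spec)(value) if spec else value
```

With this shape, a mapping can point at either a plain function or a class method using the same key, and a mapping without a callback passes the value through untouched.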
Comment #23
Anonymous (not verified) commented: Whoops, forgot to merge.
This patch fixes that and also removes the test for the mapping config... this is already tested indirectly by testing the output.
Comment #25
Anonymous (not verified) commented: The last patch will fail because #1996714: Convert FileItem and ImageItem to extend EntityReferenceItem got in. I changed fid to target_id to fix that.
I also added tests for the user page and removed the hardcoded relationship between the user and the account.
Comment #27
Anonymous (not verified) commented: So it turns out we were hard-coding the placement of the content attribute for comment count and didn't have a datatype callback to handle it. I've added a callback for raw values.
I've also added a test for the taxonomy term page display. For some reason, the description isn't getting exposed. I assume we test for this in the taxonomy term test, so I'm not sure what's going on here. Will be looking into it.
Comment #28
Anonymous (not verified) commented: It turns out we weren't testing for the description in our TaxonomyAttributesTest.
@scor, do you have any idea why the description attributes wouldn't be placed?
Comment #29
scor commented: I need to review the whole patch, but here is a reply to your question.
We couldn't place the attributes because the term description was not a field, and adding markup around it required extra functionality that didn't make it in D7. I believe the right solution is #569434: Remove taxonomy term description field; provide description field for forum taxonomy (which afaict would not break any API and could be committed after July 1st). I'll try to get someone to review and push for this issue.
Comment #31
Anonymous (not verified) commented: Ah, that makes sense. Ok, I've removed that test and added a todo to add tests once it is a field.
I've also added the mapping for page comments. I figure a test is probably overkill, but we could add it if necessary.
I'll be going over it once more for code style, but the content should be ready.
Comment #32
scor commented: This is looking great so far!
Not sure what you mean with the comment "Article". Looks like you are setting the URI here.
You might want to add a link to the issue here, to show it's being taken care of.
link to issue?
Comment #33
Anonymous (not verified) commented: This patch fixes code style issues and addresses the comments from scor.
Comment #34
scor commented: Reviewed and tested the patch locally. It's working great and I can easily tweak the mappings by editing the YAML files.
This patch brings a truck load of new schema.org types and properties into Drupal 8 core. The article, forum and page nodes, as well as users, comments and taxonomy terms are now using schema.org for all their metadata and field instances defined by core. This patch also includes a small API addition for supporting a schema.org data conversion, and tests for the standard profile schema.org mappings. The @todos are waiting for API-cleanup issues to be resolved.
Comment #35
scor commented: #33: 1784234-33-schema-mapping.patch queued for re-testing.
Unfortunately this will need to be rerolled because of #1941286: Remove the process layer (rdf module).
Comment #37
Anonymous (not verified) commented: It seems that that patch may have broken the output of the comment created date. Too late in Ireland to debug tonight.
Comment #39
Anonymous (not verified) commented: So as I explain in https://drupal.org/node/1941286#comment-7595467, the patch in that issue removed the placement of certain comment attributes. @scor, since you were involved in that issue, can you debug that?
Once that regression is fixed, this patch should be good to go.
Comment #40
scor commented: Yes, let me take care of that...
Comment #41
scor commented: As explained in #1941286-40: Remove the process layer (rdf module), this fails because the standard profile uses Bartik. Could you try to switch the Standard profile theme to stark in setup() to see if it makes any difference? This is taken from testCommentLinks():
Comment #42
Anonymous (not verified) commented: You're right, it does fail because the standard profile uses Bartik, but also because of the change in #1941286: Remove the process layer (rdf module). The test worked with Bartik in #31.
Comment #43
scor commented: Could you try the patch #1941286-42: Remove the process layer (rdf module)?
Comment #44
Anonymous (not verified) commented: Scor's patch from the other issue fixes these test failures. I have included that patch and also fixed a typo.
Comment #45
scor commented: Follow up for #1941286: Remove the process layer (rdf module) was committed, so this patch can now be rerolled.
Comment #46
Anonymous (not verified) commented: Rerolled.
Comment #47
scor commented: Compared patch #33 (which was RTBC'ed) and #46 in my IDE and verified the changes are the same.
Back to RTBC (see #33).
Comment #48
yannickoo commented: Great work!
Comment #49
Dries commented: Committed to 8.x. This is exciting. Thanks.
Comment #50
Anonymous (not verified) commented: Since this changes the HTML data that is exposed, we should probably provide a change notice. I'll work on that.
Comment #51
Anonymous (not verified) commented: OK, change notice added.
Comment #51.0
Anonymous (not verified) commented: Updated remaining tasks.
Comment #52
bleen commented: For reference: https://drupal.org/node/2034127
Comment #53
scor commented: This record looks good to me. Thanks Lin.
Comment #54
tim.plunkett commented: I encountered this test while debugging another issue.
It declares none of its properties and uses _test as a prefix for methods (which we're trying to avoid), among other things.
I've opened #2036765: Drupal\rdf\Tests\StandardProfileTest needs clean up as a follow-up, please review those changes as I've seen these inconsistencies in several RDF tests before (#1869600-123: Refactor RDF mappings to be inline with the new Entity Field API, for example).
Comment #55.0
(not verified) commented: Updated mapping table to reflect what was in the patch.
Comment #56
jneubert commented: Just came across this while adapting the Drupal 7 mappings in ZBW Labs to the future standard schema.org mappings, and spotted
I suppose the question mark's there because "keywords" has "Text" as the expected type. Perhaps it would be more appropriate to use schema:about, which expects schema:Thing?
from https://schema.org/CreativeWork:
Comment #57
jneubert commented: Just another one: Re node.article
would perhaps be better as schema:articleBody (from https://schema.org/Article)
Comment #58
jneubert commented: At node.page, the "Drupal 8 mapping" column for last_activity is empty - by accident?
I couldn't find a schema property for this - perhaps stick with sioc:last_activity_date?