Rethinking Guernica Homepage

Explore one of the twentieth century's most iconic artworks using Drupal 8

The Museo Nacional Centro de Arte Reina Sofía, Spain's foremost institution of modern and contemporary art and one of the most important internationally, contracted Biko to develop an ambitious project around Picasso's most emblematic artwork.

Located in the center of Madrid, the museum is a must-see for tourists, art lovers and scholars. Its collection is intrinsically linked to history and time, and perhaps its most iconic work is Guernica by Pablo Picasso.

To talk about Guernica is to talk about Spain’s most iconic 20th century work of art. Its life is intertwined with the history of Spain and of the world, from its creation in the middle of the Spanish Civil War, through its travels around the world, to its return to Spain after the transition to democracy.

What challenge did we take on?

The Museo Reina Sofia was conducting the most ambitious research ever carried out on Guernica:

  • A comprehensive collection of all the documents that could be found related to the painting and its history.
  • The largest study of high-resolution images that has ever been carried out on a painting.

How could we take advantage of the online medium to create a digital product that would live up to expectations? That was the challenge we faced, and it kept us motivated from the start through to the launch of the site.

What did we do?

Our first challenge was to give shape to the project, so we started with a joint definition phase with the researchers and the people responsible at the Museum.

The results of this phase showed the need for a site structured in three sections:

  • Documentary archive, where site visitors can consult everything from the letters to Picasso commissioning the painting, to the photos of its arrival in Spain in 1981. A powerful search engine encourages the exploration of stories that explain the history of Guernica and provides additional information about its protagonists.
  • Chronology, through which the entire catalog of documentation can be explored using a spectacular interactive display.
  • Gigapixel, a technology that allows any user to discover Guernica from a perspective that was unthinkable before. The largest image of an artwork ever captured offers the possibility of studying the painting with natural, ultraviolet or infrared light, and X-rays.

Archivo Guernica - Global Architecture

Why Drupal was chosen: 

Conceived as an archive of archives, Rethinking Guernica is composed of different public and private archives from institutions and national and international agencies. It incorporates more than 2,000 interrelated documents which can be explored in depth thanks to a powerful search engine.

The core of the project architecture is an implementation of Drupal 8 as the content manager. Functionally, the researchers from the Museo Reina Sofia needed to manage all content easily, classify it into different taxonomies, and gradually incorporate multimedia material associated with each research document. Choosing Drupal maintained continuity with the general architecture of all the Museo Reina Sofia's portals, which are built on Drupal 7. However, with the move to Drupal 8 we gained all the advantages of the latest Drupal version, extending the project's lifecycle and making it easier to maintain. In addition, the ease of integration with the search system proved very useful.

Describe the project (goals, requirements and outcome): 

The implementation of this project was characterized by:

  • The use of five different content types, mainly document and story.
  • Ten different taxonomies that allow the classification of documents (author, bibliography, location, protagonist, provenance, etc.).
  • A content creation set-up based on the Paragraphs module.
  • A multimedia asset management system that allows simple and flexible association of documents, high-resolution images, videos, audio, etc.
  • Use of all of the core multilingual capabilities to publish in English and Spanish.
  • Use of the Migrate module to perform the initial import of thousands of files and content items from Excel-based sources.

Edit Backend - Guernica
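Conceptually, the Excel-based import boils down to a row-to-entity mapping. The sketch below is illustrative Python only (the real import used Drupal's Migrate module); the column and field names are assumptions.

```python
import csv
import io

# Illustrative sketch only: the real import used Drupal's Migrate module
# with Excel-based sources. Column and field names here are hypothetical.
def rows_to_documents(tsv_text):
    """Map spreadsheet rows to document dicts ready for import."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    documents = []
    for row in reader:
        documents.append({
            "title": row["title"].strip(),
            "year": int(row["year"]),
            # Multi-valued taxonomy cells arrive comma-separated
            "protagonists": [p.strip() for p in row["protagonists"].split(",") if p.strip()],
        })
    return documents

sample = (
    "title\tyear\tprotagonists\n"
    "Letter to Picasso\t1937\tJosep Renau, Pablo Picasso\n"
)
docs = rows_to_documents(sample)
```

In the real pipeline the equivalent mapping lives in a Migrate source plugin and YAML migration definition, so taxonomy terms are created or looked up as part of the import.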

Search system

A faceted search system covering the entire documentary archive was built on Apache Solr, an open source Apache Foundation product written in Java that we know in depth and are comfortable with. Integration with Drupal was achieved using the Search API and Facets modules.

Keys to building a good search engine

The keys to our approach:

  • Analyzing the project's data model.
  • Custom-configuring which fields to index and the weight to assign to each of these fields in the Solr index.
  • Building a hierarchy that assigns different weights to each field and entity, tuning the algorithm Solr uses to determine the relevance of each document.

Configuration of indexable field settings and relative weights.
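In Solr terms, per-field weights typically surface as boost factors on an eDisMax query. The Python sketch below shows the idea; the field names and weights are hypothetical, since the real values were tuned through the Search API configuration.

```python
# Hypothetical field names and weights; the real values were tuned
# through the Search API configuration, not hard-coded like this.
FIELD_BOOSTS = {"title": 5.0, "body": 1.0, "protagonist_names": 3.0}

def build_edismax_params(user_query):
    """Build Solr eDisMax request params with per-field boosts in `qf`."""
    qf = " ".join(f"{field}^{boost:g}" for field, boost in sorted(FIELD_BOOSTS.items()))
    return {"defType": "edismax", "q": user_query, "qf": qf}

params = build_edismax_params("bombing of Gernika")
```

A title match then scores five times higher than the same match in the body, which is what pushes the most relevant documents to the top of the researchers' result lists.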

We already had ample experience in the implementation of faceted search systems, but this time we wanted to tweak the usability of the interface further.

  • Ranking of facets: Since we wanted to offer a dedicated audience (researchers) all the filtering power possible, special attention had to be paid to the location of the facets, their ordering and their interaction when there may be hundreds of results to filter; otherwise the facet system could overwhelm the user and become unmanageable.
  • Interaction with each facet: For each facet available in a set of results we built a block that displays the filters with the most hits. We limited each block to 10 items, while allowing the user to open a pop-up to perform a free search on the facets with fewer hits. Each block of facets adapts its content to the set of results shown at each moment.
  • Interaction with the facet system on mobile devices: We took special care over the facet system on mobile devices. The responsive version for small screens uses a collapsible panel to display the different filters, focusing the user’s attention on the main interaction (the free search and the block of results) without losing the ability to refine the search.
  • Naturalization of the summary of search results: This is a component we are especially proud of, and one we took extra care over to improve the user experience:

    Naturalization of the summary of search results

In the past we had always modeled this summary of applied filters as a set of tags that the user could later delete. This time we wanted to describe the applied filters to the user in natural, grammatically correct language, as if we were telling them in person what we had found for their search. Technically, we had to implement a new service in Drupal (based on FacetsSummaryManager) that handles the rendering of these search summaries.
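As a rough illustration of the idea (not the actual Drupal service, which is PHP and handles both site languages), a natural-language summary renderer could look like this; the facet names and phrasing are hypothetical:

```python
# Illustrative only: the real service is PHP, built on FacetsSummaryManager,
# and renders summaries in both English and Spanish.
def summarize_search(term, filters, total):
    """Render a search term and applied facet filters as one sentence."""
    parts = []
    if term:
        parts.append(f'matching "{term}"')
    for facet, values in filters.items():
        if values:
            parts.append(f"with {facet} {' or '.join(values)}")
    detail = (" " + " and ".join(parts)) if parts else ""
    noun = "document" if total == 1 else "documents"
    return f"We found {total} {noun}{detail}."

summary = summarize_search("exhibition", {"author": ["Pablo Picasso"]}, 12)
```

The same data that previously drove a row of removable tags now drives a sentence, which is the "naturalization" the component's name refers to.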

SEO

A focus on SEO (search engine optimization) at every stage (conceptualization, information architecture and UX, design, layout, development and content), on top of Drupal's inherent capabilities, ensures high rankings and good visibility for the site. The basic SEO configuration in Drupal 8 that we use in all of our projects involves the following modules:

  • Pathauto (to configure the automatic generation of pretty URLs);
  • Metatag (for automatic generation of metadata for SEO and SEM), as well as a custom module to incorporate metadata using JSON-LD;
  • Simple_sitemap (for the publication of Google XML sitemaps).

Metatag config
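To illustrate the JSON-LD approach, the custom module emits a script block along these lines. The sketch below is Python rather than the module's PHP, and the property choices are assumptions based on the schema.org VisualArtwork type:

```python
import json

# Illustrative sketch: the real metadata is generated by a custom Drupal
# module in PHP; property choices follow schema.org's VisualArtwork type.
def artwork_jsonld(name, creator, url):
    """Build a schema.org VisualArtwork JSON-LD block for the page head."""
    data = {
        "@context": "https://schema.org",
        "@type": "VisualArtwork",
        "name": name,
        "creator": {"@type": "Person", "name": creator},
        "url": url,
    }
    return f'<script type="application/ld+json">{json.dumps(data)}</script>'

snippet = artwork_jsonld("Guernica", "Pablo Picasso", "https://guernica.museoreinasofia.es/")
```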

Gigapixel

The challenge: to build a web application that enables the exploration of Picasso's Guernica down to the smallest detail. As a baseline we had the set of images obtained during a 2012 study, "Journey to the Inside of Guernica". The study used a computer-controlled robotic device that moved in front of the work with an accuracy of 25 microns, capturing images and data with great precision in different channels: visible and ultraviolet light, multi-spectral infrared imaging, 3D scanning and spectral reflectance. The result was a set of images that make up the largest image (690,000 by 311,000 pixels) of an artwork ever captured. Browse it!

Robotic automated device installed in front of Guernica for the study.

Image processing

One of the biggest problems we faced in the project was the double objective of:

  • Making the image of Guernica available in a format that a web browser can manage;
  • While using a server infrastructure that supports very high traffic peaks without requiring a very large investment in hardware.

The Museo Reina Sofia’s Department of Restoration has been working for years in collaboration with the Universidad Politécnica de Madrid (Polytechnic University of Madrid) to process the images obtained by the robot and turn them into an image useful for their research. The result is a geo-referenced image hosted on a GeoServer server, a technology normally used to build interactive map applications. The problem with that format is that it requires a complex and costly server infrastructure to withstand the expected traffic.

The solution: transform the images used by GeoServer into a pyramid of static images; specifically, a pyramid readable by Leaflet.js, the open source online map generation library. Starting from the mosaic of highest-resolution images (for the visible channel alone, 51,224 images of 2,048 x 2,048 px), we processed them with the tools these mapping applications use for image operations (specifically gdal2tiles). This let us generate a pyramid of 12 depth levels and a total of 8,742,468 images of 256 x 256 pixels.

pyramid of static images
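The shape of the pyramid follows from the image dimensions: each shallower level halves the image, and each level is cut into 256-pixel tiles. A minimal sketch of that arithmetic, assuming square tiles and a simple halving scheme (actual gdal2tiles output depends on its options and includes additional files, so the published file total exceeds what this simple model alone predicts):

```python
import math

def pyramid_levels(side_a_px, side_b_px, levels, tile=256):
    """Tile-grid dimensions per zoom level, deepest level first.
    Assumes each shallower level halves the image and tiles are square."""
    out = []
    for depth in range(levels):
        a = math.ceil(side_a_px / (2 ** depth) / tile)
        b = math.ceil(side_b_px / (2 ** depth) / tile)
        out.append((a, b, a * b))
    return out

# Deepest visible-channel level from the figures below: 311,296 x 690,176 px.
levels = pyramid_levels(311296, 690176, 12)
```

At the deepest level alone this yields a grid of 1,216 by 2,696 tiles, over three million images, which is why the processing figures below run to days of machine time.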

Some details of the process give an idea of the volume of data processed:

  • Number of source files for the visible channel: 51,224
  • Number of files generated for the visible channel: 8,742,468
  • Size in pixels of the deepest zoom level: 311,296 by 690,176 px
  • Volume of data for the visible channel: 436 GB.
  • Processing time to transform the GeoServer image into Leaflet tiles: 162 hours. Almost 7 days to process an image! And this was using the most powerful server infrastructure we have at Biko, optimized to operate with large volumes of small files on disk. We spent an entire month running our machines at maximum performance to obtain the images of the 4 channels used.

Architecture of the Application

At the technical level, the Gigapixel application is a subproject completely independent of the rest of the portal. It is a Single Page Application built using only HTML, JS and CSS. The core of the application is the Leaflet.js library, which handles the pyramid of images hosted on Amazon S3. We complemented this library with a series of open source extensions:

  • Leaflet-hash: Gives us URLs with a hash code that reconstructs the state of the application, so that specific details of the image can be shared on social networks.
  • Leaflet-minimap: Displays a small overview image showing where the user’s current zoom is located.
  • Leaflet.sync: Allows the user to compare two views of the same image. This was another technical challenge that gave us headaches: the images of the different views do not have the same resolution and size in pixels, so we had to perform scaling and translation operations to align the two images, which took us back to our biggest nightmares from trigonometry lessons.

Gigapixel. Comparison of the visible channel with the ultraviolet channel, hole in the soldier’s eye.
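The alignment problem behind the synced comparison boils down to mapping a pixel coordinate from one channel's image into another's. A minimal sketch, with illustrative sizes (the real channels' resolutions and offsets differ):

```python
def map_point(x, y, src_size, dst_size, dst_offset=(0.0, 0.0)):
    """Map a pixel coordinate from one channel's image to another's.
    Channels were captured at different resolutions, so synced views need
    a per-axis scale and, when the crops differ, a translation."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    return (x * sx + dst_offset[0], y * sy + dst_offset[1])

# Illustrative sizes: suppose the ultraviolet channel were captured at
# half the visible channel's resolution.
uv_point = map_point(1000, 500, (690176, 311296), (345088, 155648))
```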

Chronology

Chronology is a tool for exploring Guernica’s documentary archive with a different approach from the search engine's. We show all the documents generated around the painting on a timeline: photos, posters, correspondence, etc. This application is designed for discovery and immersion. The content can be interactively filtered by author, document type, etc., narrowing its distribution over time.
This is a way to contextualize everything the work of art has generated throughout its milestones and historical moments in the 20th and 21st centuries.

Technically, it is built as a JavaScript application based on the data visualization library D3.js (Data-Driven Documents). This library is one of the most powerful instruments in existence for building data visualizations, and thanks to its versatility and respect for web standards it fit perfectly into the design of the project. The information is displayed as dynamic, fully interactive SVG, which enables direct interaction with the documents.

Chronology

Much of the work in this visualization involved studying the UX needed to make almost 2,000 interactive elements accessible on screen simultaneously, with subtle refinements such as zooming into years with a high concentration of documents, automatic scaling of each year to improve selection, and drag-and-drop to find similar documents.

The application obtains its data from Drupal through an API built as a custom module, which exposes the documents in TSV format for D3 to interpret.
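The endpoint's job is essentially flat serialization. A hedged sketch of the idea in Python (the real endpoint is a custom Drupal module in PHP, and the field names here are assumptions):

```python
# Illustrative only: the real endpoint is a custom Drupal module; on the
# client, D3 reads the result with d3.tsv(). Field names are hypothetical.
def documents_to_tsv(documents, fields):
    """Serialize document dicts to TSV: a header row, then one row each."""
    lines = ["\t".join(fields)]
    for doc in documents:
        lines.append("\t".join(str(doc.get(f, "")) for f in fields))
    return "\n".join(lines)

tsv = documents_to_tsv(
    [{"id": 1, "title": "Letter to Picasso", "year": 1937}],
    ["id", "title", "year"],
)
```

TSV keeps the payload small and trivially parseable, which matters when the timeline loads the whole archive at once.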

The home page, an interactive presentation for getting to know Guernica

How to tell a great story in very few words

To explain to the general public what Guernica is and to present the project, we modeled an interactive presentation with careful storytelling: a series of texts and images that accompany visitors as they scroll down the home page.

This interactive element is a JavaScript application with CSS3 animations, built on a number of JavaScript libraries, each with a different area of responsibility:

  • Fullpage.js: Provides the base on which a set of full-screen slides is built, and coordinates the transitions between the different slides.
  • GreenSock: Used to build scenes and animations linked to events controlled by Fullpage.js. For example, when the user enters a slide, an animation makes the background image opaque; at 0.5 seconds the main text moves along the X-axis; after one second the image caption appears, and so on.

Theming: HTML Components, Atomic Design

The portal uses a custom theme that extends the "stable" theme and incorporates a series of gulp tasks to compile SASS, optimize images and SVG icons, minify JS, etc.

For the construction of all of the project’s HTML we used the Atomic Design methodology focused on the cumulative reuse of simple modular elements (components) to create more complex information structures. Instead of designing pages, we designed components.

This methodology forced us to think about the system as a whole rather than about each page individually, and as a result, we obtained a catalog of HTML components that we constantly reuse to build pages, which provides greater consistency of user and graphic experience.

 HTML by Components, Atomic Design

Best practice for CSS coding
We needed our styles to have the same quality as the back-end code, and to achieve this:

  • We used the SASS preprocessor to write our CSS style sheets. This enabled us to write maintainable code, without duplication, that we can evolve further.
  • The SASS code follows a strict style guide with a set of best practices.
  • The code organization follows ITCSS rules, which help us organize our styles into layers by specificity and purpose.
  • We applied the BEM methodology to achieve more semantic, cohesive style names for each component.

SASS code folder structure based on ITCSS.
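The BEM convention can be checked mechanically. As a rough illustration, a small Python validator for block__element--modifier names (the exact rules a team adopts may be looser or stricter than this):

```python
import re

# One illustrative reading of BEM: lowercase hyphenated words for the
# block, an optional __element, and an optional --modifier.
WORD = r"[a-z][a-z0-9]*(-[a-z0-9]+)*"
BEM = re.compile(rf"^{WORD}(__{WORD})?(--{WORD})?$")

def is_bem(class_name):
    """Check a class name against the block__element--modifier convention."""
    return BEM.match(class_name) is not None
```

For example, `search-facet__item--active` passes, while a camel-cased name like `SearchFacet` does not.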

Hosted on the shoulders of giants

Another critical aspect of this project is the performance and scalability of the server architecture. As the project would receive a lot of attention from the media, we expected large spikes in traffic. We therefore had to strengthen the portal’s architecture to prepare for these spikes; the key is Amazon CloudFront.

On top of the server architecture used to host the Museo Reina Sofia’s projects, we implemented a CDN layer based on Amazon CloudFront, integrating it with Drupal via the CDN module.

In addition to a CDN layer, the project uses specific software to optimize the server’s performance: a Varnish cluster as a reverse proxy for caching static files and code generated by Drupal, a Memcache cluster as an in-memory cache for low level Drupal caching, and a Solr cluster for indexing and searches. In addition, all the storage of static files related to Gigapixel is supported by Amazon S3.

In the front-end layer we used and contributed our own module for optimizing CSS loading (https://www.drupal.org/project/critical_css), which extracts the CSS needed for the initial viewport and includes it inline in the header of each page.

Server architecture
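The inlining step behind critical CSS is conceptually simple. A minimal Python sketch of the idea (the actual Critical CSS module does this in PHP within Drupal's render pipeline):

```python
# Illustrative only: the real work happens in the Critical CSS Drupal
# module, which injects the extracted styles during page rendering.
def inline_critical_css(html, critical_css):
    """Inline above-the-fold CSS into the page head so the first paint
    does not wait for the full external stylesheet."""
    style = f"<style>{critical_css}</style>"
    return html.replace("</head>", style + "</head>", 1)

page = inline_critical_css("<html><head></head><body></body></html>", "body{margin:0}")
```

The remaining stylesheet is then loaded normally, so only the styles needed for the initial viewport block rendering.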

Organizations involved: 
Modules/Themes/Distributions
Why these modules/theme/distribution were chosen: 

The list above represents the main modules that we have used in the project. We use some other modules in our dev environments.

Community contributions: 

We have contributed the module Critical CSS and various patches to the search_api_sorts and facets_pretty_paths modules.

Guernica Gigapixel
Pablo Picasso
A document of the site