As one of the world's premier cancer centers, Memorial Sloan-Kettering Cancer Center is committed to exceptional patient care, leading-edge research, and superb educational programs.
Memorial Sloan-Kettering Cancer Center's broad mission requires that the www.mskcc.org design and information architecture serve a wide variety of audiences, each of which is interested in very different content. For example, each aspect of MSKCC's mission (treatment, research, and education) has a dedicated landing page. A newly diagnosed cancer patient is generally interested in seeing only information about their specific cancer type, while a postdoctoral student might want to view a researcher within a given research program or department. The primary goal of the redesign was to better address the needs of MSKCC's wide variety of users while cleanly conveying MSKCC's mission.
Take a video tour of Memorial Sloan-Kettering Cancer Center’s redesigned website.
In 2001, after several prototypes and iterations, the Memorial Sloan-Kettering Cancer Center website (www.mskcc.org) launched on my "homegrown" content management system (CMS) called the "Inettool." For the last decade, while maintaining and customizing the Inettool, I came to the realization that I was digging a "CMS hole" where my code and MSKCC's data were gradually being buried and trapped in this custom-built system. This experience led me to conclude that "custom-built software requires everything to be custom built." Using an open source CMS, like Drupal, prevents one from being trapped in a custom-built CMS, because open source code provides pre-existing and tested functionality that can be customized.
The simplest explanation for why I chose Drupal is "given enough eyeballs, all bugs are shallow." Drupal has a community of engaged participants looking at and contributing code. For me, Drupal's contributed code and open discussions are its biggest strengths; I have not hit any brick walls or black boxes while using Drupal to build the MSKCC.org website.
So, after 2 years of work, Memorial Sloan-Kettering Cancer Center's redesigned website, MSKCC.org, finally launched. Rebuilding and moving MSKCC to Drupal was my biggest project ever and my first large-scale Drupal project, which is why I feel it is important to share my experience...
About the "Switch"
As mentioned in the introduction, the previous website was built using a custom-built CMS called the "Inettool." The driving force behind the custom-built CMS was the institution's desire to have a specialized, customizable website.
I will let you in on a little secret and misconception: "most websites are never really that special, it is the institution and/or business behind the website that is special." In the case of MSKCC.org, the CMS is just a tool used to convey MSKCC's mission and message, which I personally summarize as "quality, compassionate care."
I successfully convinced MSKCC to switch to Drupal by explaining the ongoing challenges of maintaining their custom-built CMS. One final but key selling point for switching to Drupal was the availability of enterprise support from Acquia.
So MSKCC agreed to adopt Drupal and "the switch" got underway.
The "switch" can be broken down into several steps/decisions, which include:
- System Architecture
- Content Management
- Information Architecture
- Site Features (aka modules)
- Templates (aka themes and panels)
First, Some Stats
Below are some general statistics to help describe the scope of the migration and the general system architecture requirements.
- 3,364,927 unique visitors
- 206 modules enabled (142 contrib, 64 custom)
- 55 active users
- 33 content types
- 2003 book pages
- 108 primary links
- 26 secondary links
Conceptualizing a migration of data from the Inettool to Drupal was the first step of the switch process. There were many questions to be asked and answered on how the migration would be accomplished. For instance: What data would be moved? How much data could be cleanly migrated? How much data would require additional post-migration clean-up?
Migrating an existing website to Drupal can be fairly straightforward, since Drupal has several contributed modules to import and export data. Honestly, I made a 'newbie' mistake, which may still have been a good decision: I wrote a custom migration module from scratch. I saw this as an opportunity to learn PHP and the inner workings of Drupal's API and database structure, knowing that this code would be thrown away after the final migration. Anyone new to Drupal should be willing to throw away code; it is just part of the learning process.
Besides learning Drupal, I had three goals for my migration script, which were:
- Automated nightly builds so that everyone could review the migrated data as changes were being made.
- Single page imports that would be used to debug minor migration issues.
- To cleanly migrate 90% of the existing 10,000+ pages, thus requiring little post migration clean-up.
Apart from one or two issues that had to be fixed after the final migration, the data migration was successful. It required about 3 weeks of post-migration clean-up, although admittedly there was also a lot of pre-migration clean-up. Most importantly, when it was finally time to migrate the website, everyone on the project was comfortable and ready to move to Drupal.
Since the project began in 2009, the new website uses Drupal 6. Though the website has no patient health information (PHI), MSKCC reasonably required that the web servers be hosted internally. The key performance recommendation I made, especially for MSKCC's initial launch on Drupal, was to have no authenticated traffic on the website. By keeping all external users anonymous, every page on the website can be cached by a reverse proxy and the website can handle a fairly large load.
No one at MSKCC, myself included, had ever launched a large Drupal or LAMP stack website, so Acquia was brought in to do a general Drupal site audit and make server recommendations. The final solution was an F5 load balancer in front of 2 Varnish reverse proxy/web servers, 1 memcache server, and 2 master and slave MySQL DB servers.
In the end, the server architecture for this website is pretty much the standard set-up for a high-performance Drupal website. The website is very responsive and has come nowhere near reaching its max load.
Custom server requirements were added to the 'Site status' report using hook_requirements(). These custom requirements check for properly configured firewall rules, internal webservice access, and additional PHP add-ons, like Oracle's OCI8 Database drivers.
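As a rough illustration of the idea behind these custom requirements, here is a minimal, language-neutral sketch in Python. It is not the site's code and not Drupal's API: the function names `check_module` and `site_requirements` and the checked add-on names are hypothetical, and the severity values are only modeled on Drupal's `REQUIREMENT_OK` and `REQUIREMENT_ERROR` constants. Each check produces a title, a value, and a severity, keyed by requirement name, which is roughly the shape a status report needs.

```python
import importlib.util

# Severity levels modeled on Drupal's REQUIREMENT_OK / REQUIREMENT_ERROR constants.
REQUIREMENT_OK = 0
REQUIREMENT_ERROR = 2


def check_module(name, title):
    """Return one status-report entry for an importable add-on (hypothetical helper)."""
    found = importlib.util.find_spec(name) is not None
    return {
        "title": title,
        "value": "Installed" if found else "Not installed",
        "severity": REQUIREMENT_OK if found else REQUIREMENT_ERROR,
    }


def site_requirements(checks):
    """Collect entries keyed by requirement name, one per configured check."""
    return {key: check_module(module, title) for key, (module, title) in checks.items()}


# Example checks: one add-on that exists in the standard library, one that does not.
report = site_requirements({
    "json": ("json", "JSON support"),
    "missing_driver": ("no_such_driver_xyz", "Hypothetical database driver"),
})

for key, entry in report.items():
    print(key, entry["title"], entry["value"], entry["severity"])
```

A real implementation would add checks for firewall rules and internal web service access in the same way, each returning its own entry.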
The MSKCC website is primarily a content and information-driven site, which is why it was important to focus on the website's content types and navigation system before implementing site features (aka modules). The website has 33 content types, which may seem like a lot, but MSKCC's broad mission of treatment, research, and education requires some additional content type specificity. For example, doctors, researchers, and staff members each require their own content type, with custom fields and their own node access rules and controls.
Below are some notable content types:
An HTML fragment is a small piece of HTML code that is used as global content within the website's blocks, main menus, and/or super footers. HTML fragments are primarily used by web developers to build editable pieces of specialized but customizable content.
The view content type provides content administrators an easy mechanism for building listings of data (aka Views) on the website. The view content type includes several CCK fields that are passed as arguments to a selected view.
The teaser content type is a simple call-out, which consists of title, image, description, and a link that redirects to a complete web page. The teaser content type is used to create a specialized call-out for a page whose default teaser is not appropriate.
Out of the box, Drupal supports a primary and secondary menu. These menus are used in the main navigation bars at the top of the website. The primary and secondary menus handle the first 3 tiers of MSKCC.org, and then a combination of taxonomy, books, and Views manages the lower levels of the website's information architecture.
Drupal's taxonomy system is used to manage MSK's hierarchical medical specialties and even simple event categorization. I built a custom taxonomy helper module to generate hierarchical and alphabetical taxonomy term displays for finding a doctor by specialty or department.
Besides having a lot of unique content, the website has many unique sections maintained by different users. The Book module, included in Drupal core, was the best means to break down the website's very rich information architecture. A custom 'Book helper' module was created to allow administrators to customize a book's navigation using some additionally available menu features, including disabling menu items and customizing a menu item's title.
I use Views religiously for anything that is "a list of things." As long as the Views module remains this helpful for generating an SQL query and/or displaying its results, I am going to use it. A custom MSK views module was created to handle all Views-related customization, including altering queries, exposed filters, and additional template preprocessing.
Some Lessons Learned...
Follow Drupal's best practices
One of the key factors behind Drupal's healthy community of code contributors is the project's well-defined and enforced best practices. Before switching to Drupal, the only best practice I followed was trying to write clean code. Following Drupal best practices was the easiest way to improve my programming skills and the overall quality of the website's code.
Below are the five Drupal best practices that I adhered to during development of the MSKCC website:
- Code standards
Drupal's code standards are very well documented. The Coder module is extremely helpful in correcting any bad habits and mistakes.
- Version control
Use version control. 'Nuff said.
- API documentation
Generally, developers hate writing documentation! To encourage myself and all future developers on the website to write decent API documentation, we set up a secure api.mskcc.org website using the API module. Seeing one's lack of documentation or just grammatical mistakes on a website can be a great motivator to make improvements or correct errors.
- Issue tracking
Getting the project team, myself included, to switch from tracking issues by email to using a purpose-built tracking system took considerable effort, but everyone is now happily using Unfuddle to manage issues.
- Unit testing
SimpleTest is now part of Drupal 7, and unit testing is the only best practice that I admittedly fell short of implementing. It is something I hope to implement during the upgrade to D8.
Originally, I started out namespacing just my modules with msk_* and soon realized it helps to namespace every custom object including Views, Panels, Rules, and even CSS classes. I namespaced all my views with 'msk_', then included the type of view, and finally, provided a unique name for the view. For example, the clinical trials view is named 'msk_directory_trials' and the view used for the news feed content pane is named 'msk_content_pane_news_feed'.
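The naming convention described above can be sketched as a small helper. This is a minimal Python illustration rather than code from the project; the function names `view_name` and `is_namespaced` are hypothetical, and it assumes that machine names use lowercase words joined by underscores.

```python
import re

PREFIX = "msk"  # the project namespace used for every custom object


def slug(text):
    """Lowercase the text and join its words with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def view_name(view_type, unique_name):
    """Build '<prefix>_<type of view>_<unique name>'."""
    return "_".join([PREFIX, slug(view_type), slug(unique_name)])


def is_namespaced(machine_name):
    """Check that a machine name carries the project prefix."""
    return machine_name.startswith(PREFIX + "_")


print(view_name("directory", "trials"))
print(view_name("content pane", "news feed"))
```

A check like `is_namespaced` could be run over exported views, panels, or rules to spot any object that was created without the prefix.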
The project does not use the Features module but does export everything into code, including Views, Panels, Rules, and ImageCache. The website uses the Strongarm module to export almost all of the website's configuration settings (aka variables) into code. I created a Strongarm dump module which allows every system configuration page to be easily exported. When the site is updated to D8, it will use the Features module.
I personally use Google Docs to document and share everything. I also keep an organized list of any useful modules and/or Drupal-related blog posts. There is no 100% perfect resource for Drupal, so it is worth tracking discussions about tricks, hacks, and APIs for modules like Views and Panels.
For developer documentation, I made sure to include the recommended README.txt files and API comments with every module. I also set up a series of README files covering coding standards, installation guides, changes and issues with modules, etc. These are stored in SVN and available via a secure help section within the MSKCC website.
Drupal works... maybe this is too simple a statement for a complex project made up of close to 200 modules, but in the end Drupal accomplished what it was designed to do: build a website. Drupal allowed MSKCC to focus on their website's mission and not the technology behind it. In the end, MSKCC's goals were met because the website looks great and the information is easy to find.
While planning and implementing MSKCC's custom modules, I tried to make sure that any re-usable functionality was abstracted out into generic modules that could be shared with the Drupal community. Meanwhile, the great Git migration occurred, which changed and improved how the Drupal project and its contributed modules were being developed. One of the coolest changes was the addition of developer sandboxes. Sandboxes are basically open-sourced Drupal projects that are not fully-fledged projects, but they give developers a way to share their code. This is exactly what I intended to do.
During Randy Fay's DrupalCon presentation "Git on Drupal.org: It's Easier Than You Think!", I asked the question "should developers just sandbox all their code while working on a Drupal website?" The answer I got was "yes," so I decided to build and share my sandbox. I restructured my 'sites/all/modules' directory to reflect this by adding a 'sandbox' directory next to my 'contrib', 'custom', and 'dev' directories. I would describe this new 'sandbox' directory as holding code that sits somewhere between completely custom code and code that may one day be contributed back to the Drupal community.
Please note: Some of the 'sandbox' modules below have not been (and may never be) uploaded to Drupal.org because I feel I won't be able to fully support the code or there are similar modules already available on Drupal.org.
- Book author access: Allows a book's main page author to edit and manage all lower level book pages.
- User access control: Allows a user to grant other users access to update their content.
- API browser: Makes it easier to navigate API documentation and source code.
- Content labels: Allows administrators to update the titles and descriptions for a content type and its fields on one page.
- Content analyze: Adds an analyze (field lengths) tab to the content types - fields admin section.
- Strongarm dump: Allows module variables to be exported into arrays and objects that can be used by the Strongarm module.
- System summary: Builds a report to list site statistics and installed modules and themes.
- Image filter: Displays an image's title attribute as a caption below or next to an image.
- jQuery UI filter: Converts static HTML to a jQuery UI Accordion or Tabs widget.
- Menu filter: Inserts a menu's links as a list or dropdown within the body of selected text.
- Short-hand path filter: A filter that allows for short-hand redirect paths to be entered and replaced within any text.
- TOC filter: Converts header tags into a linked table of contents.
- Book helper: Improves Drupal's core Book module's functionality.
- Node parent title: Automatically prepends or appends a parent title to a node's title when it is saved.
- Node reference back reference: Automatically creates node reference back references for selected content types and fields.
- Weight reset: Adds a reset button to the weight-based sorting view from the Weight module.
- Taxonomy helper: Helps improve the presentation of vocabularies and term hierarchies using custom templates and Views.
- Taxonomy permissions: Adds 'view vocabulary terms permissions' for taxonomy-related pages.
- Add to calendar: Provides 'add to calendar' links for Outlook, Google Calendar, Yahoo! Calendar, and iCal.
- Inline links: Adds custom and automated inline links to content.
- Subscribe to feed: Allows users to subscribe to an RSS feed using an RSS/Podcast reader.
- Create content: Adds contextual information to 'Create content' menu links.
- Ctools jump menu style: Converts CTools jump menu into a stylized HTML menu.
- Global optimizer: Groups and optimizes CSS and JS into global and page-specific files.
- Flush page cache: Eases the pain of flushing Drupal's cache.
- Menu redirect: Adds the ability to set a menu item to be a redirect which prevents multiple menu items from being in the active menu trail at the same time.
- Views global settings: Allows Views admins to define global settings (i.e., caching) that are shared by all Views.
- Views URL alias node: Allows node-related Views to be filtered by path aliases.
- Webform disable results: Allows editors to disable the saving of Webform submissions.
- Webform results access control: Allows selected users and roles access to view and edit Webform results.
- GSK block: Defines custom blocks for GSK website.
- MSK main: Provides shared utility functions for MSK modules.
- MSK block: Defines custom blocks for MSK websites with some block helper functions.
- MSK deployment: Deploys the MSK SVN codebase to multiple web servers.
- MSK input filters: Input filters for MSK-specific content and customizations.
- MSK form tweaks: Alters system, contrib, and node forms.
- MSK glossary: Allows users to look up cancer-related terms within the NCI glossary.
- MSK group: Manages custom MSK groups (aka labs, core facilities, and research programs).
- MSK group access: Allows the author of a group to edit and manage all group pages.
- MSK herbs: Custom JSON webservice for MSK herbs.
- MSK media: Enhanced multimedia content generated from the media_brightcove.module.
- MSK menu: Adds additional functionality to Drupal's menu system.
- MSK menu block: Stores all MSK menu blocks in code with additional custom logic.
- MSK menu breadcrumb: Handles custom breadcrumbing for MSK books, groups, and orphaned nodes.
- MSK migrate: Stores and displays information about Inettool data migrated to Drupal including originating id, meta data, and redirects.
- MSK migrate inetdata: Migrates MSKCC inetdata table data to Drupal's webforms.
- MSK migrate Inettool: Migrates an inettool project's site architecture, content, and resources from the Inettool database to Drupal.
- MSK migrate PRG: Migrates doctor bios and related directories from the Physician Referral Guide (PRG) database to Drupal.
- MSK migrate protocols: Migrates MSKCC clinical trials (protocols) to Drupal.
- MSK node: Enhances and organizes Drupal's core and contributed node-related modules.
- MSK panels: Stores all MSK panels in code.
- MSK path: Manages MSK's SEO friendly paths.
- MSK RSS: Handles RSS and Podcast formatting for nodes and Views.
- MSK search: Handles customization of MSK (Google Mini) search results.
- MSK secure pages: Sets which pages are always going to be used in secure mode (SSL).
- MSK stats: Manages stat tracking tags/codes for MSKCC's DoubleClick and Did-It accounts.
- MSK theme: Contains re-usable theme and meta data related functions.
- MSK toolbar: Toolbar block for MSK, includes glossary, print, download, email, and share.
- MSK trials (aka protocols): Synchronizes trial content type with the MSK PIMS protocol database.
- MSK user: Adds additional information and functionality to Drupal's user profiles.
- MSK views: Stores all MSK views in code and enhances Views with exposed filters.
- MSK webform: Tweaks and adds additional functionality to the Webform module.
- MSK webform payment: Payment handler for MSK Webform module.
- MSK workflow: Custom workflow module that integrates revisioning and workflow.
- MSK wysiwyg: Enhances CKEditor WYSIWYG.
The launch of the new MSKCC.org was a joint effort of four different groups/organizations that were responsible for design, content, web development, and infrastructure. Magnani, Caruso and Dutton (MCD) designed the new site and re-worked the information architecture. The Big Blue House (my company) was responsible for all Drupal development. MSKCC's Department of Information Systems configured and administers the enterprise LAMP server stack. Finally, MSKCC's Department of Public Affairs manages the website day-to-day and is responsible for the high-quality content and beautiful photography, as well as ongoing strategy and optimization.