Hi all.

I maintain a site based on Commerce with several thousand unique products, all of them user manuals. I designed it myself and wrote some modules to fill gaps in functionality. Everything works great. The Drupal site sits on a subdomain, and the main domain is an encyclopedia of information about the companies and machines that we sell manuals for. The main site isn't made with Drupal; it has been written in Word and Publisher since about 1998. There are also many hundreds of pages (possibly 1-2k) on this site.
We are now at the stage where we would like to Drupalise the main site to improve its appearance and make it easier to manage. Impressive though it is, the menus are all hard-coded and it looks as though it was made in Publisher 2000 (which it still is). I'd also like to advertise products within the information we provide, add AdSense, etc.
Each page consists of text and images with captions that include HTML, and many pages alternate between the two, so simply using image fields on nodes won't cut it.

I've had a crack at this using the Paragraphs module and have come up with exactly what I need, but there are a couple of things that concern me, namely performance and futureproofing. Firstly, I'm worried that after I've spent hundreds of hours migrating content into Drupal, the database will be so huge that performance will suffer or backing up will become impossible. Then there is the potential issue of migrating the content out of Paragraphs in the future, since things change in Drupal all the time. I'm building in D7 and Paragraphs exists for 8, but will it definitely exist for 9? Is there some other module or set of modules I should be looking at for authoring this kind of content?

There are other issues I've yet to solve too, such as dealing with all the inevitable broken links.
Any words of advice from someone who's undertaken a migration of this scale, or has stretched Paragraphs to its limits would be really appreciated.

Comments

VM’s picture

drupal.org has over a million nodes. What you've not discussed is your hardware and user base. I suggest benchmarking based on your known variables.

griz’s picture

Good point. I'm using a Linode 8GB VPS for the Drupal site, and I haven't yet measured the traffic. This is the traffic on the main site last month:

Server Activity Totals
Total Sessions Served 766,139
Total Hits 7,242,494
Total Page Hits 1,481,380
Total Non Page Hits 5,761,114
Total Session Duration 118,692,600s
Total Transferred 654.34 GB

Server Activity Averages
Total Sessions Served 766,139
Average Hits Per Session 9
Average Page Hits Per Session 1
Average Session Duration 154s
Average Transfer/Session 895.56 kB

Page views per session breakdown
292945 (38.3%) sessions made 0 page requests
321106 (42.0%) sessions made 1 page requests
122572 (16.0%) sessions made 2-5 page requests
14923 (1.9%) sessions made 6-10 page requests
7521 (1.0%) sessions made 11-20 page requests
4272 (0.6%) sessions made 21-50 page requests
1136 (0.1%) sessions made 51-100 page requests
906 (0.1%) sessions made 101+ page requests

Time spent per session breakdown
642417 (83.9%) sessions lasted 0 minutes
18511 (2.4%) sessions lasted 1 minutes
34418 (4.5%) sessions lasted 2-5 minutes
40200 (5.3%) sessions lasted 6-15 minutes
19956 (2.6%) sessions lasted 16-30 minutes
4668 (0.6%) sessions lasted 31-45 minutes
1962 (0.3%) sessions lasted 46-60 minutes
3249 (0.4%) sessions lasted 61+ minutes
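
As a rough sanity check on these figures, the totals can be turned into average request rates. This is only a back-of-the-envelope sketch: the 30-day month is my assumption (the report doesn't state the exact period), and averages say nothing about peak load, which is what actually matters when sizing a server.

```python
# Figures copied from the "Server Activity Totals" report above.
total_hits = 7_242_494
page_hits = 1_481_380
non_page_hits = 5_761_114
sessions = 766_139
sessions_zero_pages = 292_945

# Assumption: the report covers a 30-day month.
SECONDS_IN_MONTH = 30 * 24 * 60 * 60


def per_second(count, seconds=SECONDS_IN_MONTH):
    """Average number of events per second over the reporting period."""
    return count / seconds


avg_hits_per_second = per_second(total_hits)
avg_page_hits_per_second = per_second(page_hits)

# How many non-page requests (images, CSS, etc.) arrive per page request.
non_page_ratio = non_page_hits / page_hits

# Share of sessions that never requested a page at all.
zero_page_share = sessions_zero_pages / sessions

print(f"Average hits per second:      {avg_hits_per_second:.2f}")
print(f"Average page hits per second: {avg_page_hits_per_second:.2f}")
print(f"Non-page hits per page hit:   {non_page_ratio:.2f}")
print(f"Sessions with no page hits:   {zero_page_share:.1%}")
```

The last figure is the one that matters for the hotlinking question below: sessions that fetch files without ever loading a page are what hotlinked images would look like, although bots and feed readers can produce the same pattern.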

  • Am I right in thinking that there could be a lot of hotlinking going on here? It's not really a concern as we pay a flat fee for this hosting, but may be an issue on my VPS.
  • I'm not too concerned about performance for anonymous traffic as I've already set up caching to deal with that.
  • drupal.org doesn't use the Paragraphs module, which is the thing that concerns me - it must significantly multiply the number of requests per page for logged-in users.
  • Benchmarking... not something I've done before. I know Google is my friend here, but I'd love some pointers. Do I use Devel generate to create a load of Paragraphs content and then simulate lots of page requests?
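
To make the "simulate lots of page requests" idea concrete, here is a minimal sketch of what I have in mind, using only the Python standard library. Everything in it is an assumption on my part: the function names are made up for illustration, the URLs would need to point at real test pages, and a dedicated load-testing tool would probably give more trustworthy numbers than a home-made script.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, headers=None, timeout=30):
    """Request one URL and return the elapsed time in seconds.

    `headers` is optional; passing a session cookie here is one
    (assumed) way to measure pages as a logged-in user.
    """
    request = urllib.request.Request(url, headers=headers or {})
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()  # Download the full body so the timing is realistic.
    return time.perf_counter() - start


def benchmark(urls, workers=5, fetch_fn=fetch):
    """Fetch every URL using a pool of worker threads and collect timings."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_fn, urls))


def summarize(timings):
    """Reduce a list of timings to a few headline numbers."""
    ordered = sorted(timings)
    # Approximate 95th percentile: the value below which ~95% of timings fall.
    p95_index = max(0, int(len(ordered) * 0.95) - 1)
    return {
        "requests": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }
```

Usage would be something like `summarize(benchmark(list_of_test_urls, workers=10))`, run once against pages built with Paragraphs and once against plain nodes with similar content, so the two sets of timings can be compared. Since anonymous traffic is served from cache, the interesting runs are the ones that bypass the page cache, for example by sending a logged-in session cookie through `headers`.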