Under what circumstances would one want to run 'git pull' on a platform? Or 'git checkout', for that matter? I don't see any post hooks to update sites on the platform, or anything like that. It seems like we're allowing, even encouraging, the modification of platform code under existing sites. This has always been strongly discouraged, for good reasons.
Comment | File | Size | Author |
---|---|---|---|
#13 | Hosting \| aegir3-lxc.devekko.net - Google Chrome_013.png | 107.35 KB | niccolox |
Comments
Comment #2
gboudrias (at Praxis Labs Coop) commented:
I've never been a fan of gitifying platforms, and I think we should remove the functionality while there are still few users. The feature belongs in a more controlled workflow environment such as devshop.
Comment #3
formatC'vt (as a volunteer) commented:
I do it when I need it =)
For example: upgrade core from 7.38 to 7.39
Comment #4
formatC'vt (as a volunteer) commented:
I think we should do this:
1) Separate user permissions for git pull/checkout on sites and on platforms
2) Add post hooks to update sites on the platform
What do you think?
Comment #5
ergonlogic commented:
TL;DR: The `git pull` task on platforms is dangerous and should probably be blocked pending re-work to make it safer.
Let me elucidate my concerns:
Changing code under a running site leaves it in an inconsistent state. That is, the new code expects a certain database schema, but until `update.php` is run on all sites on the platform, this won't be the case. `update.php` ought to be called when the site is in maintenance mode.
So the workflow, at this point, would be:
There are several issues with this workflow. If we were to run `update.php` on all sites in parallel, this could be very resource intensive, to the point of crashing the server if you've got a couple hundred sites on the platform. Otherwise, we'd be running updates serially, and some unlucky site will be in maintenance mode for the entire time it takes to update all the other sites. The uptime of the next-to-last site won't be much better.
But it gets worse. What if something goes wrong on one or more sites? At the very least, we'll need to take backups of the sites before running the `git pull`. Triggering dozens or hundreds of backups in parallel is a recipe for disaster, from a disk I/O standpoint at the very least. Plus, we'd need to put the sites in maintenance mode prior to the backups, to avoid data loss if we had to roll back.
I'm not even sure what a rollback in such a case should look like. We cannot just revert to the previous checkout on the platform without restoring the backups for all the sites that succeeded prior to the error that triggered the rollback. So what do we do with any broken sites? I suppose we could build a new platform based on the prior commit, and deploy the broken sites' backups there. But if we're going to do that, we're pretty much back to Aegir's recommended process of migrating between platforms, just taking a long, arduous and risky way around.
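To make the serial-update concern above concrete, here is a minimal Python sketch. It assumes that every site enters maintenance mode at the same moment, just before the code changes, and comes back up as soon as its own `update.php` run finishes. The function name and the per-site durations are hypothetical, chosen only for illustration; nothing here reflects what Aegir actually does.

```python
from itertools import accumulate


def serial_downtime(update_seconds):
    """Return per-site downtime when updates run one after another.

    Assumes (hypothetically) that all sites enter maintenance mode at the
    same moment, right before the code changes, and that each site leaves
    maintenance mode as soon as its own update has finished.
    """
    return list(accumulate(update_seconds))


# Hypothetical figures: 200 sites, 30 seconds of update.php each.
downtimes = serial_downtime([30] * 200)
print(downtimes[0])    # first site: 30 seconds
print(downtimes[-2])   # next-to-last site: 5970 seconds
print(downtimes[-1])   # last site: 6000 seconds (100 minutes)
```

Under these assumptions the last site is offline for the sum of all update times, which is why the number of sites on the platform matters so much.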
Maybe I'm missing something obvious, but to me platform-level git repos are pretty much conflating sites and platforms. As soon as you have a second site on a platform, at least one of them will suffer increased downtime and risk. Even for a single site per platform, we should still follow the process above to ensure the safety and reliability of such updates.
I suppose a user could safely maintain two platforms off the same git repo, and migrate sites in a leap-frog fashion. That is, have all sites on one platform, run `git pull` on the other platform, migrate all sites to the updated platform, then reverse that process for the next `git pull`. If this is the workflow we want to support, then we'd probably only want to allow pull tasks on platforms without any sites.
Again, unless I've missed something, we're looking at non-trivial work to support this. For now, I suggest that we simply disable pull tasks on platforms. If someone is already using it as part of a safe update process, I'd love to hear about it. In that case, we could split out platform-level git repos and pull tasks into a separate feature, and put a warning in the description pointing to documentation for the proper process.
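The leap-frog process described above, together with the rule that pulls are only allowed on platforms without sites, can be modelled in a few lines. This is a Python sketch under stated assumptions, not Aegir code: the `Platform` class, the `leapfrog` function, the revision labels and the site names are all hypothetical, and moving a list of names stands in for running a real migrate task per site.

```python
class Platform:
    """Hypothetical stand-in for an Aegir platform: a checkout plus its sites."""

    def __init__(self, name, commit):
        self.name = name
        self.commit = commit
        self.sites = []

    def pull(self, new_commit):
        # The guard suggested above: refuse to change code under hosted sites.
        if self.sites:
            raise RuntimeError(f"{self.name} still hosts sites; migrate them first")
        self.commit = new_commit


def leapfrog(active, idle, new_commit):
    """Update the empty platform, then move every site onto it.

    Returns the (active, idle) pair with roles swapped for the next round.
    """
    idle.pull(new_commit)
    # Moving the list stands in for running a migrate task per site.
    idle.sites, active.sites = active.sites, []
    return idle, active


a = Platform("platform-a", "rev1")
b = Platform("platform-b", "rev1")
a.sites = ["example.com", "example.org"]

active, idle = leapfrog(a, b, "rev2")           # sites now on platform-b at rev2
active, idle = leapfrog(active, idle, "rev3")   # and back on platform-a at rev3
print(active.name, active.commit, active.sites)
```

Because `pull` raises on a platform that still hosts sites, code never changes underneath a running site in this model; the cost is keeping a second platform around.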
I'm tempted to turn this into a bug report, since this functionality doesn't appear to conform to our usual policy of helping users not shoot themselves in the foot.
Comment #6
helmo (at Initfour websolutions) commented:
TL;DR: I do use it in two ways. Separating the permission and adding a warning seems fair.
1) For small sites I have a common platform in git, where the sites then also have their own git repo for the site specific stuff.
2) For larger sites I sometimes create a custom platform (often also for legacy reasons), these often existed in sites/default before I put them under Aegir.
My reasoning for a platform in Git:
* Complexity: I started with drush make years ago, but kept patching and debugging it. Maybe these days more of the edge cases are flushed out, but back then committing to git took way less time than figuring out the right make format for a library that comes as a zip, only available via a POST request, with the desired code in a subdirectory. Not to mention borky SSL certificates.
* Comparability: Another advantage of having all code in Git is, for me, the ease of reviewing. I just drush dl --gitcommit, and `git diff`. I could run diff on /var/aegir/platforms/platform-x/sites/all/modules/contrib/views /var/aegir/platforms/platform-y/sites/all/modules/contrib/views after building it, but it's more typing/TAB-ing.
* Dependency: And I don't like to depend on remote sites to host my production code. If the download page for some fancy jquery plugin is down when Drupal releases a security update, I'll have trouble building a new platform.
Situations to pull instead of migrating to a new platform:
* Small security updates that don't bring update hooks (yes I review that code before deploying)
And often, from looking at the code, I can see that maintenance mode is not needed, even with some update hooks.
I know it's not a best practice to do unless you know what you're doing.
* Minor CSS updates
* Updates already thoroughly tested on a staging site.
At the confirmation dialog we already have a checkbox for "Force: Reset --hard before pulling?" that could have an option to run db updates.
This also relates to #1456258: Limit git features by platform, where checkboxes for drush fra and drush cc all were also mentioned.
One of my thoughts is that rules integration could also help here to add more custom actions and safeguards. Unfortunately that D7 upgrade has not gotten finished, #2323959: Upgrade to 7.x-3.x
Comment #7
formatC'vt (as a volunteer) commented:
Yes, it's dangerous stuff and the result can be a nightmare, but no one is pushing you; it is your decision whether to use git or not.
Comment #8
cweagans commented:
Sorry for the late reply on this. As the original author of this code in (hosting|provision)_platform_git, I can give some info:
- This functionality was written for SLAC.
- All of their sites are deployed from a custom install profile (so if they need to recreate a site, they can just delete it and re-deploy it)
- They use it for pretty much everything (migrate is too slow for them: their sites are many GB between the files and database, and each of their environments (dev, stage, prod, etc.) is on a different web server cluster)
- In their case, there's literally no user generated content after about 8pm and before ~6am, so their backups run at night between those times.
- If they deploy new code and something breaks, they have an easy process to restore from a backup. Most of the time, it's CSS tweaks or security patches. Things like that. Even when they're deploying new functionality, though, it's more of a "We're adding this module, site owners. You're welcome to turn it on if you want.". They don't *ever* remove anything.
- Putting individual sites in Git didn't really make much sense for them because there is no custom code running on each site. You can basically think of their infrastructure as a SaaS product for internal use.
Generally, the thought process was that while the Migrate task provides good reliability at a technical level, you can also achieve that reliability in other ways - in their case, from a meatspace process.
Comment #9
cweagans commented:
(Oh, and generally +1 for keeping it and adding optional settings to improve reliability)
I'll invite the SLAC people to comment here.
Comment #10
ergonlogic commented:
Okay, fair enough. What all these responses have in common is that the operators know the risks, and are mitigating them through their own processes and/or project architecture outside of Aegir itself. This entails fairly advanced knowledge of Aegir, Drupal and git.
So I propose that we:
(3) would be in Aegir core, and should be pretty simple. We can also put clustering and other edge-case features in there.
Thoughts?
Comment #11
niccolox commented:
a few scatter-gun points
in short, instead of turning this feature off or hiding it away as "advanced", it's actually basic, and the simplest option: 1 site, 1 platform
we need a simple DEV>TEST>LIVE workflow for the single site/platform Git Platform use case, i.e. the theoretical Git Platform Deploy (dev>test>live feature)
for the complex mass hosted multiple site/per platform Git Platform use case, we need to offer even more guidance
one thing that I am learning more and more is that there are lots of POLICIES built into Aegir tools, and so it's a matter of gathering these use cases and best practices and baking them in as defaults, especially the simple and most common single site/single platform use case...
I would also add that the Aegir Summit featured a git-based workflow for platforms and sites; the genie is out of the bottle
Comment #12
cweagans commented:
For multisite mass hosting (where Aegir excels right now), I'm not sure that this workflow (git platforms) is the best. SLAC was a pretty weird project from a tech stack standpoint. For many users of Aegir, the current migrate workflow is really the "right" way to do it, but I agree that we shouldn't lock people out of other workflows if they are deemed to be appropriate.
Note that the PaaS solutions we're implementing for Aegir 4 essentially do the exact same thing as the Migrate task, but under the hood. The main reason for doing this is so that we can roll back if something is horribly wrong with the new container, and that, IMO, is a good thing to suggest by default in older versions of Aegir too.
hosting_platform_git/provision_platform_git were open sourced approximately when they were written, so they've been used by at least one org for two years.
I think the solution here is a two-parter:
1) Disable git platforms for new installs and hide it in an "Advanced" section on the features page
2) Get the Rules integration working so that people can start informing Aegir of their human-driven workflows. If that means that "Update this platform" = "Git pull and hope for the best", that's an end user decision. It could also mean, however, that "Update this platform" = "Provision a new platform with this Git repo + tag/branch on the same server, clone sites to that platform, check that everything is okay on those sites, then remove aliases from the old sites and apply them to the new sites so that the real site URL points to the new platform". Both are perfectly valid workflows depending on the org requirements, but we shouldn't decide which one, IMO.
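The two readings of "Update this platform" in the list above can be sketched as interchangeable strategies that the operator chooses between. This is a minimal Python illustration, not Rules integration or any Aegir API: the function names, the `WORKFLOWS` table, and the dictionary layout used for a platform are all hypothetical, and the `check` callback stands in for whatever verification a person or script would do on the cloned sites.

```python
def pull_in_place(platform, new_ref):
    """'Git pull and hope for the best': change code under the existing sites."""
    platform["ref"] = new_ref
    return platform


def provision_and_switch(platform, new_ref, check=lambda site: True):
    """Build a fresh platform, clone the sites, verify, then move the aliases."""
    fresh = {"ref": new_ref, "sites": list(platform["sites"]), "aliases": []}
    if not all(check(site) for site in fresh["sites"]):
        # The old platform is untouched, so it keeps serving traffic.
        return platform
    fresh["aliases"], platform["aliases"] = platform["aliases"], []
    return fresh


# The operator, not the tool, decides what "update" means.
WORKFLOWS = {"pull": pull_in_place, "switch": provision_and_switch}

old = {"ref": "v1", "sites": ["example.com"], "aliases": ["www.example.com"]}
live = WORKFLOWS["switch"](old, "v2")
print(live["ref"], live["aliases"], old["aliases"])
```

In the second strategy a failed check leaves the original platform and its aliases exactly as they were, which is the rollback property the migrate-style approach is valued for.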
Just my $0.02.
Comment #13
niccolox commented:
to be fair, the Git Pull Task is in Experimental and the Git Checkout is in Roles and Permissions
Features marked experimental have not been completed to a satisfactory level to be considered production ready, so use at your own risk.
some simple changes in description and grouping would help
Git pull task
Enables git pull tasks on sites and platforms.
Roles & permissions
Git checkout task
Enables git checkout tasks on sites and platforms.
Comment #14
niccolox commented:
some possible Git Platform related support modules/code
Drupal Git contrib
https://www.drupal.org/project/git_rules
https://www.drupal.org/project/git_status
https://www.drupal.org/project/git_wrapper
https://www.drupal.org/project/git_sync
https://www.drupal.org/project/erpal_git
https://www.drupal.org/project/git
https://www.drupal.org/project/git_deploy
https://www.drupal.org/project/git_hooker
https://www.drupal.org/project/features_git
https://www.drupal.org/project/git_filter
https://www.drupal.org/sandbox/bevan.wishart/1441638
https://www.drupal.org/sandbox/sun/1255586
git graphing
http://gitgraphjs.com/
https://github.com/alaingilbert/git2graph
https://github.com/gogits/gogs
https://github.com/mbostock/d3
https://github.com/blog/1093-introducing-the-new-github-graphs
Comment #15
ergonlogic commented:
FYI, 'Roles & Permissions' is a collapsed fieldset providing further details about the feature above.
The git functionality on platforms has distinct use-cases from that on sites, and so, should be separate features, imo. I really don't like seeing the 'git url' field when I'm creating a platform, for example, when what I really want is to keep site config under git. Likewise, in a single site per platform scenario, having git repos on sites is superfluous, as are their related tasks, etc.
I think we might want to consider some default behaviour for platforms under git. For example, we might want to lock the platform once a site is installed on it. This'd require an opt-in to the more risky workflows.
Comment #16
niccolox commented:
one of the confusing things is that when you come from a cloud lord git centric workflow and you see git and site, you think of git platform/site as the same thing
but in the Git Integration in Aegir it means a site folder, site level git repo
Comment #17
ergonlogic commented:
Just to be clear, I don't want to limit anyone's ability to build custom workflows with Aegir; quite the opposite. I'm only expressing concern with what we recommend, i.e., default behaviours. I think Aegir users have a right to expect that we won't recommend dangerous workflows.
Git support is one of the biggest new features in Aegir3, as evidenced by it being among the very first golden contrib to be added. I'd like to move it out of 'experimental'. After all, most, if not all, Aegir maintainers are using it in some fashion in production. So I'm working on (1) and (3) from #10. The others can be worked on later, if desired, in other issues.
Aegir4 ought to support platform-level git repos properly, along with sites/apps that use Composer, Gemfiles, etc. These serve the same purpose as Drush Make, which will likely go the way of the dinosaur, since Composer is much better. Common Git upstreams are also a reasonable way to use platform repos in a multi-site-like fashion. I believe our efforts are better served working towards Aegir4. Containers are popular for good reasons. Kubernetes and Openshift provide robust solutions to some of these very issues.
Comment #18
ergonlogic commented:
Pushed suggested changes to the `dev/2555129` branch. Note that it depends on 7a148998d4 in Hosting, where I added an 'advanced' group for Hosting features.
Note that I haven't tested this thoroughly, but the UI changes appear to work.
Comment #19
formatC'vt (as a volunteer) commented:
thx, I will do some tests in the next few days
Comment #20
helmo (at Initfour websolutions) commented:
I only did a very limited amount of testing, but added an update hook and fixed a typo in the dev branch.
Comment #21
formatC'vt (as a volunteer) commented:
We have a problem with pull because the hook form alter call order is:
but `hosting_git_pull_form_alter` should be called last, not first.
The module order is determined by system weight, then by module name.
And `_hosting_git_site_or_platform_enabled()` is undefined.
Comment #22
formatC'vt (as a volunteer) commented:
I do think we need to merge the `hosting_git`, `hosting_git_pull` and `hosting_git_checkout` modules, and `hosting_git` can host all of the code. And the `platform/site` modules can implement only `variable_set` on module enable, just because `hook_hosting_feature` requires a module.
Comment #23
ergonlogic commented:
I'm ambivalent about how to handle individual git tasks. Right now, they're all in the same Provision code, under hosting_git, but we have separate features for a couple tasks on the front-end, and no tasks for all the rest. We have feature requests for a number of these to be exposed via front-end tasks, though. If we go the route of individual modules per task, then we'd probably want to split up the backend accordingly.
Historically, we've tended towards monolithic code-bases, which complicates debugging and such, even if it saves on the extra boilerplate of separate modules. Considering how some of these tasks are fairly stable, it'd be nice to move them out of 'experimental'. If we're adding new tasks to the main hosting_git though, this'd keep us from doing that. At least, I'd argue that we shouldn't. Individual modules per task allows for a greater separation of concerns, and would allow us to add additional 'experimental' git tasks, while promoting the battle-tested ones.
As for the order of form_alter hooks, I'd suggest just lowering the weight of hosting_git.module, to ensure that it runs before the other alter hooks. Alternatively, increasing the weight of the others should accomplish the same thing.
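The ordering rule mentioned in #21 (system weight first, then module name) can be illustrated with a small sort. This is a Python sketch rather than Drupal code; the choice of `hosting_task` as a third module and all of the weights are assumptions made for the example, not values taken from Aegir.

```python
def call_order(weights):
    """Order modules by weight, then alphabetically by name."""
    return sorted(weights, key=lambda name: (weights[name], name))


# Hypothetical weights: everything at the default of 0.
weights = {"hosting_task": 0, "hosting_git": 0, "hosting_git_pull": 0}
print(call_order(weights))
# ['hosting_git', 'hosting_git_pull', 'hosting_task']

# Raising one module's weight moves its hooks to the end of the order.
weights["hosting_git_pull"] = 10
print(call_order(weights))
# ['hosting_git', 'hosting_task', 'hosting_git_pull']
```

With equal weights the name decides the order, so changing a weight on either side is the only lever for moving one module's alter hook relative to another's.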
Comment #24
Jon Pugh commented:
See branch `2897894-git-hooks`
This adds extensible "git hooks" to platforms and sites. Very simple to add:
There is a similar hosting hook.
To handle the platform/sites problem, if you run `provision-git-pull` or `provision-git-checkout` on a platform, it will run the configured git-hooks on all sites.
Comment #25
colan commented:
As Aegir Hosting Git does some overly permissive stuff (updating platforms in place as described in the issue description, which should be discouraged), you can now use the Aegir Platform Git module for platforms instead. It won't allow you to do this. It's a submodule of Aegir Deploy.
Comment #26
colan commented:
We should mention this on the project page here to make users aware of the danger, and provide a link to the other module.