As a first step to solve #1074960: Let maintainers set a default branch on git repositories we need methods to set and get the default branch of a repository. To set a default branch a symbolic ref from HEAD to refs/heads/<branchname> must be created. Then, when cloning, <branchname> will be checked out by default.

The attached patch adds a testcase for this and implements it. Please review.

Because of #1291272: VersionControlGitTestCase::versioncontrolCreateRepoFromClone() trys to call protected method VersioncontrolGitRepository::exec() the testCloning() method will fail - the secound patch makes the exec() method public to show that it would work, if #1291272: VersionControlGitTestCase::versioncontrolCreateRepoFromClone() trys to call protected method VersioncontrolGitRepository::exec() was fixed. It is only the test case that fails because of this, setDefaultBranch() works either way.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

pillarsdotnet’s picture

As requested. Since I can't fault the code, I'll nit-pick the documentation. Coding standards require that function summaries should start with a third-person present indicative verb, and fit on a single 80-character line.

From the first patch:
+   * Change the default branch.

Should perhaps be:

+   * Changes the default branch.

 

+   * Get the default branch of the repository. That is the branch HEAD points
+   * to.

Should perhaps be:

+   * Gets the default (HEAD) branch of the repository.

 

+   * Ensure that the default branch stays set after it has been changed.

Should perhaps be:

+   * Ensures that the default branch stays set after it has been changed.

 

+   * Ensure that cloned repositories actually have the default branch checked
+   * out.

Should perhaps be:

+   * Ensures that cloned repositories have the default branch checked out.
And in the second patch:
+   * Change the default branch.

Should perhaps be:

+   * Changes the default branch.

 

+   * Get the default branch of the repository. That is the branch HEAD points
+   * to.

Should perhaps be:

+   * Gets the default (HEAD) branch of the repository.

 

    * Execute a Git command using the root context and the command to be
    * executed.

Should perhaps be:

    * Executes a Git command using the root context.

 

+   * Implementation of getInfo()

Should be:

+   * Implements getInfo().

 

+   * Ensure that the default branch stays set after it has been changed.

Should perhaps be:

+   * Ensures that the default branch stays set after it has been changed.

 

+   * Ensure that cloned repositories actually have the default branch checked
+   * out.

Should perhaps be:

+   * Ensures that cloned repositories have the default branch checked out.
Niklas Fiekas’s picture

Thank you for your review, pillarsdotnet.

Both patches are the same except the latter contains this hunk:

@@ -161,7 +198,7 @@ class VersioncontrolGitRepository extends VersioncontrolRepository {
    *  Logged output from the command; an array of either strings or file
    *  pointers.
    */
-  protected function exec($command) {
+  public function exec($command) {
     if (!$this->envSet) {
       $this->setEnv();
     }

The new versions contain all your suggestions except:

    * Execute a Git command using the root context and the command to be
    * executed.

to

    * Executes a Git command using the root context.

That one was just from the context and I didn't touch it in my patch. Maybe this could be done in a seperate issue together with other coding-style / comment fixes.

sdboyer’s picture

Very happy to see somebody stepping up for this kinda stuff. Unfortunately I have to crush this a bit with many additional requirements, but @Niklas Fiekas was pretty gracious about that in irc :) brief summary of those issues:

  • We cannot assume direct writability to repositories on-disk. While it may be possible in certain configurations, it isn't possible on d.o (repositories are not locally available to webheads), and isn't advisable in any case (webserver shouldn't have write access to your git repos). To deal with this, we currently use a layer of indirection for async requests (such as setDefaultBranch()) that passes through a job queue, and is picked up by workers on the other side. This works kinda well enough for the async write situation. It's not robust enough to really deserve the title 'framework,' but it's good enough to not require rearchitecting.
  • Direct reads from the repository have no real support at the moment - at least, not when triggered from the web ui. The assumption with such reads are that they're a blocking request - the process handling the web request needs to have that data in order to render the page. getDefaultBranch() is an excellent example of this. This is a very different kind of challenge than the async one. It's already a requirement, though, for the new, Drupal-native version of a repository browser to go in. GitHub uses BERT, an RPC system, to do it; we're looking at a node.js-backed solution. The transport layer doesn't really matter that much, though, tbh.
  • Actually, getDefaultBranch() isn't something we necessarily want to require a blocking network + io call. We still need the sync framework described above, but for this specific case, it would be better to have the data stored locally - we can treat that as canonical of the user's preference, which lets us clean up any potential anomalies on disk. Doing that means adding a {versioncontrol_git_repositories} table, with a default_branch field, and corresponding CRUD to the various entities that manage it (VersioncontrolGitRepository, VersioncontrolGitRepositoryController. We could do this with an overridden controller or a hook, but I prefer overriding).

All that said, I don't really mind keeping these methods on the VersioncontrolGitRepository object itself; they make logical sense as part of the API. But they need to connect to these indirection layers instead of doing the work directly themselves.

Niklas Fiekas’s picture

Thanks for summarizing this. Two new patches.

The first one defines the schema of the versioncontrol_git_repositories table and creates it.

The secound extends the VersioncontrolRepositoryController with a VersioncontrolGitRepositoryController that loads data from that table and adds CRUD to the VersioncontrolGitRepository class.

Niklas Fiekas’s picture

Niklas Fiekas’s picture

Update:

I forked this on github, because I can rebase there to give you a linear history (at least until someone forks me): https://github.com/niklasf/versioncontrol_git

We've got the schema and the controller. Next are getDefaultBranch() and setDefaultBranch(). What do they do?

Can we do something like that in getDefaultBranch()?

How do I queue a write operation?

Niklas Fiekas’s picture

Update:

Implemented getter an setter.

I don't think we need an update function, since all default branches should be master.

Is this it, or do we need something else (except a UI)?

Niklas Fiekas’s picture

The attached patch:

  • Adds a versioncontrol_git_repositories table.
  • Adds a VersioncontrolGitRepositoryController (that depended on #1293720: Allow backend specific repository controllers).
  • Adds a getter and setter for VersioncontrolGitRepository::$default_branch.
  • Adds a fetchDefaultBranch() method to the worker.
  • Adds an update function that loads all repositories and queues them for an initial synchronization.

I rewrote the history to make those steps single commits in my fork on github: https://github.com/niklasf/versioncontrol_git/compare/11a4940156...8ea7c...

Niklas Fiekas’s picture

Status: Needs review » Fixed

Thanks for reviewing. I am marking this as fixed, becuase it's merged in.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

sdboyer’s picture

Status: Closed (fixed) » Needs work

The original 6203 update created the table, but it did not create records in that table for each currently existing repository. As a result, no subsequent update operations work as they assume those records to already exist.

Please add a 6205 update that creates records, even if they're empty ones, for all (git) repositories.

I'm sure this was my oversight, letting this in as it is.

Niklas Fiekas’s picture

Status: Needs work » Needs review
FileSize
858 bytes

Please test this one.

sdboyer’s picture

Yep, looks good. Go ahead and push it.

Niklas Fiekas’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

  • Commit b53bdeb on 6.x-2.x, fix-invalid-default-branches, fullsync-memory by Niklas Fiekas:
    Issue #1291308: Add versioncontrol_git_repositories table
    
    
  • Commit 26eef1e on 6.x-2.x, fix-invalid-default-branches, fullsync-memory by Niklas Fiekas:
    Issue #1291308: Add VersioncontrolGitRepositoryController
    
    
  • Commit 4b12120 on 6.x-2.x, fix-invalid-default-branches, fullsync-memory by Niklas Fiekas:
    Issue #1291308: Implement getDefaultBranch() and setDefaultBranch()
    
    
  • Commit b104913 on 6.x-2.x, fix-invalid-default-branches, fullsync-memory by Niklas Fiekas:
    Issue #1291308: Implement fetchDefaultBranch()
    
    
  • Commit fb6c927 on 6.x-2.x, fix-invalid-default-branches, fullsync-memory by Niklas Fiekas:
    Issue #1291308: Add an update function for an initial synchronization
    
    
  • Commit 219f536 on 6.x-2.x, fix-invalid-default-branches, fullsync-memory by sdboyer:
    #1291308 by Niklas Fiekas: Added Add setDefaultBranch() and...
  • Commit e84b44d on 6.x-2.x, fix-invalid-default-branches, fullsync-memory by Niklas Fiekas:
    Issue #1291308: Populate versioncontrol_git_repositories with empty rows...