Problem/Motivation

Column "path" is empty & in related #2335687: Show file size on admin/reports/xhprof (esp if path is missing) new size column is added

Proposed resolution

Save "path" (Request URI) in dump

Remaining tasks

Agree on implementation & commit

Members fund testing for the Drupal project. Drupal Association Learn more

Comments

lussoluca’s picture

In the D7 version only the mongodb storage shows the path of a profiled page (see #2335687: Show file size on admin/reports/xhprof (esp if path is missing)), in D8 version the storage is a straight port so here we also didn't have the path stored (and the showed).

We need to modify the way the profile is stored to be similar to the one used in Webprofiler (with an index file with metadata, such the path, the size, the timestamp, ...). If we all agreed on this implementation I can easily port the webprofiler code to the XHProf module.

Xano’s picture

Priority: Normal » Major

Bumping this to major, because it's a terrible pain in the ass.

Why is it that only MongoDB shows the path?

lussoluca’s picture

The XHProf file format doesn't save the profiled page's path. To overcome to this behavior the MongoDb submodule (not ported to D8 yet) saves the page's path in a field of the document.
I've only ported the filesystem storage which saves only the XHProf file so the information about page's path is lost. We need to do something similar to the webprofiler module: an index.csv file which stores metadata about each XHProf file

q0rban’s picture

Here's a patch to get this working in D7.

q0rban’s picture

Status: Active » Patch (to be ported)
lussoluca’s picture

I don't know if altering the XHProf data file is the best solution for this issue, I prefer to have metadata informations in a separate location.

juampynr’s picture

Patch at #4 applies cleanly and works fine.

donquixote’s picture

I don't know if altering the XHProf data file is the best solution for this issue, I prefer to have metadata informations in a separate location.

I guess the downside would be that the files are no longer readable by other programs.. valid point.

The options, as I see it, are

  • storing it in the database -> needs babysitting to not let it pile up over time
  • storing it in a csv file, as in #3 (lussoluca) -> seems ok, if we don't get too many files.
  • storing two files per instance, one for the meta and one for the data.

I think the csv file is the preferable solution.
But let's not forget we will parse this file not just on the overview page, but also to display the meta information for a specific xhprof run, on admin/reports/xhprof/%. We don't want this file to become too long.

Maybe we should quickly discuss the case where more than one site dumps xhprof files into the /tmp directory.. should they all go into the same csv file? We could avoid this by having a site-specific uuid in the filename. Which is no longer unique if you copy the database.. so use a hash of DRUPAL_ROOT instead?

lussoluca’s picture

We don't want this file to become too long.

In Webprofiler, for example, we have an option (enabled by default) to clear al profiles on cache clear, we can do the same here.

Maybe we should quickly discuss the case where more than one site dumps xhprof files into the /tmp directory.. should they all go into the same csv file? We could avoid this by having a site-specific uuid in the filename. Which is no longer unique if you copy the database.. so use a hash of DRUPAL_ROOT instead?

Can we store the csv file into the Drupal filesystem (public:// or private://) so it is site specific?

donquixote’s picture

In Webprofiler, for example, we have an option (enabled by default) to clear al profiles on cache clear, we can do the same here.

Maybe we also want to clear once it hits a specific number or file size..

Can we store the csv file into the Drupal filesystem (public:// or private://) so it is site specific?

The nice thing about /tmp is that the operating system does the housekeeping for us. And, I assume, a csv file in /tmp would be cleared at the same time as the log files.
But I am not fundamentally opposed to it, just mentioning.

And, we don't want to these files to be copied around on backup or deployment / migration.
There is a reason why we can specify a tmp location in Drupal's filesystem settings..
Unfortunately, people might simply put /tmp there, and then it is shared with other projects. But md5(DRUPAL_ROOT) should be fairly unique, or not?

AndrewsizZ’s picture

Created patch for 8.x version
@lussoluca yes I agree that batter save this path data(metadata informations) in a separate location, but for me looks also not bad to have all data in one place.
So maybe someone need for now, it will be enough ;)

AndrewsizZ’s picture

Status: Patch (to be ported) » Needs review
kdebisschop’s picture

@AndrewsizZ looking at the patch in #11, it seems to me that it requires unserializing each profile in the storage directory each time the index page is viewed (unless the page itself is cached). Am I reading the logic correctly?

If I am reading that correctly, do you have any thoughts or experience on the performance cost of unserializing that many files ? They can get large enough that I'm not sure we can easily ignore the cost of passing all that data through the serializer and then in the run object returned to scanXHProfDir() only to get the request path. Am I wrong?

Was the idea of a CSV or other index file deemed unworkable?

Thanks for your thoughts (and your patch which was helpful in understanding the issue)

kdebisschop’s picture

I have been exploring the idea of a CSV file myself by playing with the D8 patch. Since one person may be examining the file while another process is altering it, I think there are possible concurrency issues. Those are the sorts of things that make me wonder a database table might be worth it.

I explored creating separate smaller tables that have only the path in their contents - it would reduce the size of the file contents that needed to be loaded, but the summary page would still need to open very one of those little files to get the path.

Would anyone be interested in looking at a patch for d8 that puts it in the database and creates a cron to clean up removed files on a regular basis?

humansky’s picture

Issue tags: +needs backport to D7
andypost’s picture

  1. +++ b/src/Controller/XHProfController.php
    @@ -109,6 +109,10 @@ class XHProfController extends ControllerBase {
    +        '#template' => $this->t('Path: !path', array('!path' => $run->getPath())),
    

    this no longer works

  2. +++ b/src/XHProfLib/Run.php
    @@ -18,6 +18,11 @@ class Run {
       private $namespace;
    +  ¶
    +  /**
    
    +++ b/src/XHProfLib/Storage/FileStorage.php
    @@ -62,6 +62,7 @@ class FileStorage implements StorageInterface {
         $files = $this->scanXHProfDir($this->dir, $namespace);
    +    ¶
         $files = array_map(function ($f) {
    

    trailing whitespaces

  3. +++ b/src/XHProfLib/Storage/FileStorage.php
    @@ -75,13 +76,17 @@ class FileStorage implements StorageInterface {
    +      'path' => isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : $_SERVER['PHP_SELF'],
    

    there should be d8 method for that

andypost’s picture

Here's a reroll with fixes

andypost’s picture

+++ b/src/XHProfLib/Storage/FileStorage.php
@@ -110,11 +113,13 @@ class FileStorage implements StorageInterface {
       foreach (glob("{$this->dir}/*.{$this->suffix}") as $file) {
         preg_match("/(?:(?<run>\w+)\.)(?:(?<namespace>[^.]+)\.)(?<ext>[\w.]+)/", basename($file), $matches);
+        $run = $this->getRun($matches['run'], $matches['namespace']);

This could be very expensive to deserialize all runs to get its path