My goal is to read the analytics output at the end of this month and see how many people visited articles about "cats" versus articles about "dogs". I've inserted values into the custom variables section of the configs with what I think should be a correct way of tracking the taxonomy term name of each node visited.

In the custom variable fields, I set the NAME of one variable to be "Animal" and the VALUE to be "[node:field_animal:1:name]". If I understand right, this means that Google Analytics will track the first term stored in each field_animal field of each node visited. However, I get a warning from the module that says,

The Custom variable value #1 is using the following forbidden tokens with personal identifying information: [node:field_animal:1:name].

I think what is happening is that the module is poorly detecting the string "name" in the value as potentially personal information about a user, rather than as the name of the taxonomy term in a node. It sees "name" in the field and prevents me from saving the config.

If I pick a different token from the same field, like "[node:field_animal:1:tid]" (tid instead of name) then I can store the config and assume I'll be getting my information in tomorrow's analytics report.

Why does the term ID pass the tests while the term name fails?

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

hass’s picture

Status: Active » Fixed

This one of the known unwanted side effects. We need to change the regex for this. Please try to use any other variable.

seaneffel’s picture

Title: Term ID token trips the privacy warning » Change regex on Term ID privacy warning
Category: support » task
Status: Fixed » Active

Ok, instead of dead-ending the issue, let's change the scope and make it a task.

Also, if anyone can recommend a workaround to track taxonomy terms then it will take some pressure off these otherwise busy module maintainers.

hass’s picture

Category: task » support
Status: Active » Fixed

If you can share a patch, feel free to reopen.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

eelkeblok’s picture

Version: 7.x-1.3 » 7.x-2.x-dev
Issue summary: View changes
Status: Closed (fixed) » Needs review
FileSize
519 bytes

This is the exact issue I am running into. Previously, I've made use of the patch in #1307452, but it no longer applies to the 2.0 version of this module.

Can we make the :name] entry in the blacklist more specific? I suppose it is like it is to be generic and catch any token representing a person's name, but the way it is, it is too generic. I think it is a commendable idea to try and protect users from themselves, but it seems too restrictive, the way it is. The safety net is now preventing perfectly legitimate uses as well.

At least for this particular issue, I propose the accompanied patch. Maybe in general, you should not try and make the list exhaustive, but instead try and catch the most obvious violations. Or alternatively, I'd even suggest to go as far as offering a setting to disable the blacklisting completely and maybe displaying a message whenever it is set, warning the user that the safety net is gone and they are making any edits at their own risk.

eelkeblok’s picture

Category: Support request » Task
hass’s picture

Status: Needs review » Needs work

This does not work as you leave a lot of alternate user name tokens outside. That is why the name key is inside. I think we can only solve this with a whitelist.

eelkeblok’s picture

The token for which this issue is occuring is one for the name of a taxonomy term in a field. For an example, see the issue summary. So, whitelisting will not work, you'd need a whitelist entry for any taxonomy field (*could* be generated automatically, but I really feel this goes a bit too far for a courtesy function). What user name tokens are we leaving out, this way?

hass’s picture

current-user:name and tons of other as user:name is inherited to deeper structures. See token browser... you may need entity api to see these... not sure. I have seen tons of them.

eelkeblok’s picture

I have entity API, which indeed adds a sh*tload of tokens. I think my changes covers most of the intended behaviour of the filter. I did find a class that is not covered, which is original variants (e.g. current-user:original:name). However, I also found many examples of tokens that are harmless, would be very useful as a GA tracking metric, but are currently being prevented from selection. For example:

  • Filenames
  • Taxonomy terms (which led to the opening of this issue)
  • Content types
  • (Domain names, although those would not make a likely candidate to add as a custom metric, unless you are doing something wrong, I suppose)

As an aside, an important area the module is not covering is Real Name module. An administrator selects arbitrary fields to serve as parts of the real name of a user. As it is, the GA module does nothing to prevent an administrator from adding those. All in all, I think this feature is very much missing its target, as it is now.

Things to consider:

  • Does it really need to be this forceful (i.e. validation error, no way to add the token, even if the user knows what they are doing)? (Who's responsibility, ultimately, is it if a site violates Google's terms of use? Wouldn't it be enough to display a warning?)
  • How many false positives does the solution generate (as it turns out, quite a lot, especially taxonomy terms would seem to be a popular metric).
  • How many situations are there that are *not* covered by the filtering? (false negatives, basically; I think the most important one is introduced by Real Name module), but you can never take into account aal custom code that is out there).
hass’s picture

I'm happy if we are able to remove the false positives, but not at the full removsl costs, please. Above patch does not cover what you describe and would be a lot better, as I know.

hass’s picture

Status: Needs work » Closed (fixed)

If you can share a working patch, feel free to reopen.

eelkeblok’s picture

Title: Change regex on Term ID privacy warning » Do not prevent tokens potentially containing personally identifying tokens to be added
Status: Closed (fixed) » Needs review
FileSize
987 bytes

Alright. Since getting the actual check exactly to everyone's liking doesn't seem to be feasible right now, I propose the validation error (a.k.a. brick wall) is replaced with a warning message, that doesn't physically prevent tokens that trigger the filtering code from being added. I do firmly believe adding personally identifying information is mostly the user's reponsibility, not the module's. This change is implemented by the accompanying patch.

I updated the issue title to reflect this change in approach.

rwohleb’s picture

I think the hard block is a bit too big-brother, but if the maintainers don't like the idea of removing the hard block, how about we add a hidden bypass variable that can only be set via the settings file. This way a developer can bypass the hard block if they know what they are doing, but the casual users won't be able to get into trouble by accident. Even if the bypass variable is enabled, I still think warning messages are a good idea.

hass’s picture

Title: Do not prevent tokens potentially containing personally identifying tokens to be added » Change regex on Term ID privacy warning
Status: Needs review » Needs work

#1307452: Provide developers a way to bypass sometimes overzealous forbidden token checking is a dup and as noted more than once - I'm happy to change/fix the regex. You only need to share a patch for this and not any other stuff.

rwohleb’s picture

Yeah, I was just about to retract my comment after seeing the other ticket :P

hass’s picture

Status: Needs work » Closed (won't fix)

Looks like nobody is interested in fixing the false alarms.

osopolar’s picture

osopolar’s picture

Status: Closed (won't fix) » Needs review
hass’s picture

Great if it is that simple. We should check some more false positives I think.

hass’s picture

#10 in this case here has some other useful tokens that we could whitelist.

hass’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Can you add a test, please? Not sure what modules it requires to use [term:name]. I have taxonomy module enabled and some terms exists for sure, but token validation always fails with invalid token. Also tested with entity api, but no success.

hass’s picture

Two things are missing in this patch and until this is done I have no idea how you tested your patch or made it working. In both form elements the 'term'm type is missing and therfore the validation fails and the insert is not possible in token tree.

  $form['googleanalytics_custom_dimension']['indexes'][$i]['value'] = array(
    '#token_types' => array('node', 'term'),
  );

  $form['googleanalytics_custom_dimension']['googleanalytics_token_tree'] = array(
    '#token_types' => array('node', 'term'),
  );
hass’s picture

hass’s picture

Bug: user:original] is not blocked. So I'm adding this here to remind me on this before next release.

Your change introduces new issues. With your change the :name tokens are no longer properly detected. e.g. [user:name] is detected, but [current-user:name] not.

Test tokens I used:

[term:name] [term:tid]
[current-user:name]
[current-user:edit-url]
[current-user:original]
[user:name]
hass’s picture

Title: Change regex on Term ID privacy warning » Change regex on Term ID privacy warning / Allow taxonomy tokens in dimensions
osopolar’s picture

@hass: please see useless last patch from osopolar in #231451: Add hook to alter data before sending it to browser #88 and the example there, where with implementation of hook_googleanalytics_admin_settings_form_alter() more token types were made available.

hass’s picture

Does not make the patch here a correct one.

osopolar’s picture

I never claimed that! But it may help to test this issue and may help others to solve problems.

I can't see, why [current-user:name] should not be detected. I use Reggy on OS X to test regular expressions. There are also online solutions like regex101.com. I prepared the regex for privacy detection here: https://regex101.com/r/mE2gG9/2 and it detects every :name token except he term:name.

Edit: I'm not sure how you test the issue, make sure do not miss the /g modifier (match all) in the regex tester.

hass’s picture

I'm not sure how you test the issue, make sure do not miss the /g modifier (match all) in the regex tester.

I have simply inserted the placeholder in the module settings page and pressed save settings button.

  • hass committed 6827b21 on 8.x-2.x
    Issue #2037595 by hass: Tests if Custom Dimensions token form validation...

  • hass committed d85c28e on 8.x-2.x
    Issue #2037595 by hass: Tests if Custom Dimensions token form validation...

  • hass committed edee98b on 8.x-2.x
    Issue #2037595 by hass: Login is required for test
    

Status: Needs review » Needs work

The last submitted patch, 34: Issue-2037595-by-osopolar-hass-Change-regex-on-Term-D8.patch, failed testing.

The last submitted patch, 34: Issue-2037595-by-osopolar-hass-Change-regex-on-Term-D8.patch, failed testing.

The last submitted patch, 34: Issue-2037595-by-osopolar-hass-Change-regex-on-Term-D8.patch, failed testing.

The last submitted patch, 34: Issue-2037595-by-osopolar-hass-Change-regex-on-Term-D8.patch, failed testing.

The last submitted patch, 34: Issue-2037595-by-osopolar-hass-Change-regex-on-Term-D8.patch, failed testing.

  • hass committed 3fa1d07 on 7.x-2.x
    Issue #2037595 by hass: Tests if Custom Dimensions token form validation...