Allow Entity Mesh to analyze content using configurable user roles and implement Tracker system [#3544912]

Allow Entity Mesh to analyze content using configurable user roles

Problem/Motivation

Currently, the Entity Mesh module only analyzes links that are visible to anonymous users.
However, some sites we want to analyze are not accessible to anonymous users, so we cannot use the module in this context.

We need to extend Entity Mesh so that site administrators can configure which user role(s) should be used when crawling and analyzing links.

By default, the module should continue to behave as it does now (use the anonymous role), but it should allow selecting other roles through configuration.

This would make the module usable for a wider range of scenarios, especially for websites where access control is based on authenticated roles.

Steps to reproduce

Install and enable the Entity Mesh module.
Attempt to analyze a site where most or all content is restricted to authenticated users.

Observe that:

Entity Mesh only analyzes links visible to anonymous users.
Links restricted to authenticated or custom roles are skipped.

There is currently no configuration option to analyze the site using a different role.

Proposed resolution

Add a module configuration setting to choose which user role(s) Entity Mesh should use when analyzing links. By default, use the anonymous role for backward compatibility.

Reuse the functionality already proposed in issue #3535302, which implements the ability to generate a fake account object and assign roles dynamically.

At this point, there are two options:

either continue with the approach used in this issue, which involves changing the account globally in each processing step,
or implement a more refined solution, which would require at least the following two actions:

Replace new AnonymousUserSession() calls with the configurable fake account object.
Audit all access checks in the module to ensure they use the configured roles.
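
To make the "more refined" option concrete, here is a minimal, language-neutral sketch written in Python for brevity (the module itself is PHP). It is not the module's code: `MeshAccount`, `account_from_config`, `can_view`, the `roles` setting key and the permission map are all hypothetical names chosen for illustration. It only shows the idea of building one account object from configuration and passing it to every access check, with anonymous as the default.

```python
from dataclasses import dataclass, field

# Hypothetical role -> permissions map; on a real site this comes from role configuration.
ROLE_PERMISSIONS = {
    "anonymous": {"access content"},
    "authenticated": {"access content", "view intranet pages"},
    "editor": {"view unpublished content"},
}


@dataclass(frozen=True)
class MeshAccount:
    """Stand-in account used only while analyzing links (hypothetical name)."""

    roles: frozenset = field(default_factory=lambda: frozenset({"anonymous"}))

    def has_permission(self, permission: str) -> bool:
        # A permission is granted if any of the account's roles grants it.
        return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in self.roles)


def account_from_config(config: dict) -> MeshAccount:
    """Build the analysis account from settings; no configured roles means anonymous."""
    roles = set(config.get("roles") or []) - {"anonymous"}
    if not roles:
        return MeshAccount()
    # Any non-anonymous account also carries the 'authenticated' role.
    return MeshAccount(frozenset(roles | {"authenticated"}))


def can_view(required_permission: str, account: MeshAccount) -> bool:
    """Access check that receives the account explicitly instead of assuming anonymous."""
    return account.has_permission(required_permission)
```

With an empty configuration the account stays anonymous, so existing behavior is preserved; with `{"roles": ["editor"]}` the same check also honors the `editor` and `authenticated` permissions.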

Remaining tasks

Add a configuration form to select the role(s) used for crawling.

Extend the fake account functionality from issue #3535302 to support configured roles.

Add tests to confirm:

Default behavior remains anonymous-only.
Configured roles are respected during link analysis.

Entity Mesh Tracker System - Technical Overview

Purpose

The Tracker system provides a queue-based mechanism to manage and process entity
link analysis asynchronously, replacing the previous approach of truncating and
rebuilding the entire entity_mesh table during batch operations.

Database Schema

The entity_mesh_tracker table tracks entities pending analysis with
the following structure:

id: Primary key (auto-increment)
entity_type: Entity type identifier (e.g., 'node', 'taxonomy_term')
entity_id: Entity identifier
operation: Operation type (1 = process/update, 2 = delete)
status: Processing status (1 = pending, 2 = processing, 3 = processed, 4 = failed)
timestamp: Unix timestamp of last update
retry_count: Number of failed processing attempts

Indexes: entity_lookup (entity_type, entity_id), status, timestamp

Unique constraint: entity_type + entity_id combination
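
As an illustration of the structure above, the following sketch creates an equivalent table in an in-memory SQLite database. Python and SQLite are used only to keep the example self-contained; the module defines its table through Drupal's schema system. Column types, defaults and the index names `tracker_status` and `tracker_timestamp` are assumptions, and the lookup index and unique constraint are folded into a single unique index for simplicity.

```python
import sqlite3

# Column types and defaults are assumptions; the source lists only names and meanings.
SCHEMA = """
CREATE TABLE entity_mesh_tracker (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    entity_type TEXT    NOT NULL,
    entity_id   TEXT    NOT NULL,
    operation   INTEGER NOT NULL DEFAULT 1,  -- 1 = process/update, 2 = delete
    status      INTEGER NOT NULL DEFAULT 1,  -- 1 = pending ... 4 = failed
    timestamp   INTEGER NOT NULL,
    retry_count INTEGER NOT NULL DEFAULT 0
);
-- One unique index stands in for both entity_lookup and the unique constraint.
CREATE UNIQUE INDEX entity_lookup ON entity_mesh_tracker (entity_type, entity_id);
CREATE INDEX tracker_status ON entity_mesh_tracker (status);
CREATE INDEX tracker_timestamp ON entity_mesh_tracker (timestamp);
"""


def create_tracker_table(conn):
    """Create the tracker table and its indexes."""
    conn.executescript(SCHEMA)


def insert_row(conn, entity_type, entity_id, timestamp):
    """Plain insert (no upsert), used here to show the uniqueness rule."""
    conn.execute(
        "INSERT INTO entity_mesh_tracker (entity_type, entity_id, timestamp) VALUES (?, ?, ?)",
        (entity_type, entity_id, timestamp),
    )


conn = sqlite3.connect(":memory:")
create_tracker_table(conn)
insert_row(conn, "node", "1", 1700000000)
try:
    # A second row for the same entity violates the unique index.
    insert_row(conn, "node", "1", 1700000100)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Because each entity can appear only once, repeated saves of the same entity have to update the existing row rather than add a new one.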

Core Components

TrackerInterface

Defines the service contract, including constants for operation types and statuses.

Tracker Service (`entity_mesh.tracker`)

Implements tracking functionality:

addEntity(): Adds/updates entity in tracker (uses MERGE for upsert behavior)
addMultipleEntities(): Batch adds entities within transaction
getPendingEntities(): Retrieves entities awaiting processing (ordered by timestamp)
getFailedEntities(): Retrieves failed entities for retry logic
markAsProcessed(): Updates status to processed
markAsFailed(): Updates status to failed and increments retry_count
deleteEntity(): Removes entity from tracker
deleteProcessedRecords(): Cleanup of old processed records
getPendingCount()/getTotalCount(): Statistics methods
truncate(): Clears entire tracker table
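
A rough sketch of how a few of these methods could behave, again in Python with SQLite rather than the module's PHP. The method names mirror the list above in snake_case, but the bodies are assumptions: in particular, whether an upsert resets the status to pending and leaves retry_count untouched is a guess, and SQLite's `ON CONFLICT ... DO UPDATE` stands in for the MERGE query.

```python
import sqlite3
import time

OP_PROCESS, OP_DELETE = 1, 2
PENDING, PROCESSING, PROCESSED, FAILED = 1, 2, 3, 4


class Tracker:
    """Illustrative tracker backed by SQLite (not the module's implementation)."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            """CREATE TABLE IF NOT EXISTS entity_mesh_tracker (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   entity_type TEXT NOT NULL,
                   entity_id TEXT NOT NULL,
                   operation INTEGER NOT NULL,
                   status INTEGER NOT NULL,
                   timestamp INTEGER NOT NULL,
                   retry_count INTEGER NOT NULL DEFAULT 0,
                   UNIQUE (entity_type, entity_id))"""
        )

    def add_entity(self, entity_type, entity_id, operation=OP_PROCESS, now=None):
        """Upsert: insert a pending row, or reset the existing row to pending."""
        now = int(time.time()) if now is None else now
        self.conn.execute(
            """INSERT INTO entity_mesh_tracker
                   (entity_type, entity_id, operation, status, timestamp)
               VALUES (?, ?, ?, ?, ?)
               ON CONFLICT (entity_type, entity_id) DO UPDATE SET
                   operation = excluded.operation,
                   status = excluded.status,
                   timestamp = excluded.timestamp""",
            (entity_type, str(entity_id), operation, PENDING, now),
        )

    def add_multiple_entities(self, entities, now=None):
        """Add many (entity_type, entity_id) pairs inside one transaction."""
        with self.conn:  # commits on success, rolls back on error
            for entity_type, entity_id in entities:
                self.add_entity(entity_type, entity_id, now=now)

    def get_pending_entities(self, limit=50):
        """Oldest pending rows first; id breaks ties between equal timestamps."""
        cursor = self.conn.execute(
            """SELECT entity_type, entity_id, operation FROM entity_mesh_tracker
               WHERE status = ? ORDER BY timestamp, id LIMIT ?""",
            (PENDING, limit),
        )
        return cursor.fetchall()

    def mark_as_processed(self, entity_type, entity_id):
        self.conn.execute(
            "UPDATE entity_mesh_tracker SET status = ? WHERE entity_type = ? AND entity_id = ?",
            (PROCESSED, entity_type, str(entity_id)),
        )

    def mark_as_failed(self, entity_type, entity_id):
        self.conn.execute(
            """UPDATE entity_mesh_tracker SET status = ?, retry_count = retry_count + 1
               WHERE entity_type = ? AND entity_id = ?""",
            (FAILED, entity_type, str(entity_id)),
        )

    def get_pending_count(self):
        cursor = self.conn.execute(
            "SELECT COUNT(*) FROM entity_mesh_tracker WHERE status = ?", (PENDING,)
        )
        return cursor.fetchone()[0]
```

Calling `add_entity("node", 1)` twice leaves a single pending row for that node, which is the property the unique constraint and the upsert are meant to guarantee.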

Integration Points

Entity Hooks: Entity CRUD operations (insert/update/delete)
automatically add entries to tracker via entity hooks

Batch Processing: Refactored to populate tracker instead of
directly processing all entities

Cron Processing: Configurable cron job processes pending
entities with limit control (default: 50 per run, configurable via
entity_mesh.settings.cron_limit)

Drush Commands: New commands for manual tracker management and
processing

Processing Flow

Entity operation (create/update/delete) triggers tracker entry
Entry status = PENDING (1)
Cron or manual processing picks up pending entries
Status changes to PROCESSING (2) during analysis
On success: status = PROCESSED (3); on failure: status = FAILED (4) and retry_count is incremented
Failed entities can be reprocessed based on retry limits
Processed records older than configured days are automatically purged
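
The flow above can be read as a small state machine. The sketch below models it with in-memory records in Python; it is an illustration under stated assumptions, not the module's code. `Record`, `process_pending`, `requeue_failed`, `purge_processed` and the `max_retries` and `max_age_days` parameters are hypothetical names, since the source does not name the retry or purge settings.

```python
from dataclasses import dataclass

PENDING, PROCESSING, PROCESSED, FAILED = 1, 2, 3, 4


@dataclass
class Record:
    """One tracker row, reduced to the fields the flow needs."""

    entity_type: str
    entity_id: str
    status: int = PENDING
    timestamp: int = 0
    retry_count: int = 0


def process_pending(records, analyze, limit=50, now=0):
    """Process up to `limit` pending records, oldest first; return how many were picked up."""
    pending = sorted((r for r in records if r.status == PENDING), key=lambda r: r.timestamp)
    batch = pending[:limit]
    for record in batch:
        record.status = PROCESSING
        try:
            analyze(record)
        except Exception:
            # Any analysis error marks the record as failed and counts the attempt.
            record.status = FAILED
            record.retry_count += 1
        else:
            record.status = PROCESSED
        record.timestamp = now
    return len(batch)


def requeue_failed(records, max_retries=3):
    """Move failed records back to pending while they are under the retry limit."""
    requeued = 0
    for record in records:
        if record.status == FAILED and record.retry_count < max_retries:
            record.status = PENDING
            requeued += 1
    return requeued


def purge_processed(records, now, max_age_days):
    """Return the records that remain after dropping old processed ones."""
    cutoff = now - max_age_days * 86400
    return [r for r in records if not (r.status == PROCESSED and r.timestamp < cutoff)]
```

The `limit` argument plays the role of cron_limit: each run touches a bounded number of entities, and whatever is left stays pending for the next run.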

Configuration

cron_enabled: Enable/disable automatic cron processing (default: TRUE)
cron_limit: Maximum entities to process per cron run (default: 50)
processing_mode: Controls synchronous vs asynchronous processing behavior
synchronous_limit: Threshold for immediate vs queued processing

Benefits

Incremental processing: Only changed entities are analyzed
Performance: Avoids full table truncation and rebuild
Reliability: Retry mechanism for failed processing
Flexibility: Manual and automatic processing options
Scalability: Configurable limits prevent timeout issues on large sites

Issue fork entity_mesh-3544912

3544912-allow-entity-mesh (MR !43)

Comments

Comment #1

4 September 2025 at 15:26

lpeidro created an issue. See original summary.

Comment #2

lpeidro commented 4 September 2025 at 15:28

Issue summary:

View changes

Comment #3

lpeidro commented 4 September 2025 at 15:29

Title:

Allow Entity Mesh to analyze links using configurable user roles instead of anonymous-only

» Allow Entity Mesh to analyze content using configurable user roles instead of anonymous-only

Comment #4

lpeidro commented 4 September 2025 at 15:30

Issue summary:

View changes

Comment #5

5 September 2025 at 06:53

lpeidro opened merge request !43

Comment #6

lpeidro commented 6 September 2025 at 12:52

Status:

Active

» Needs review

The option to configure the user profile under which content processing is executed has been implemented.

The configuration now includes three mutually exclusive options:

Anonymous users
Authenticated users with no role or with additional roles
A specific user already registered in the database

This is useful for intranet environments or systems where authenticated users access the website.

Relevant functional tests have been added to ensure that access permissions to content are properly validated.

Ready for testing and suggestions for improvements.

Comment #7

lpeidro commented 6 September 2025 at 12:52

Assigned:

lpeidro

» Unassigned

Comment #8

lpeidro commented 13 October 2025 at 09:15

Title:

Allow Entity Mesh to analyze content using configurable user roles instead of anonymous-only

» Allow Entity Mesh to analyze content using configurable user roles and implement Tracker system

Issue summary:

View changes

Status:

Needs review

» Reviewed & tested by the community

Because this task needed better performance and a change in the way content is processed, we have also implemented a tracking system, together with a dedicated cron job for the module, to make processing more stable. I'm also taking this opportunity to update the issue title and description.

Comment #9

13 October 2025 at 16:33

lpeidro committed ae180f3e on 1.x

Issue #3544912: It is needed to clear clearMeshAccountCache during the...

Comment #10

13 October 2025 at 16:33

lpeidro committed 4883c4ee on 1.x
```
Issue #3544912: Fix coding standard
```

Comment #11

13 October 2025 at 16:33

lpeidro committed 343c9830 on 1.x

Issue #3544912: Include webform as dependency for test env

Comment #12

13 October 2025 at 16:33

lpeidro committed 98417c8a on 1.x

Issue #3544912: The webform access was not calculated properly

Comment #13

13 October 2025 at 16:33

lpeidro committed ccfdb2e0 on 1.x

Issue #3544912: Fix test issues due to merge the branch 1.x

Comment #14

13 October 2025 at 16:33

lpeidro committed 8f128809 on 1.x
```
Issue #3544912: Fix codding standard
```

Comment #15

13 October 2025 at 16:33

lpeidro committed e1da601d on 1.x
```
Issue #3544912: Fix issue
```

Comment #16

13 October 2025 at 16:33

lpeidro committed 96330cc5 on 1.x

Issue #3544912: Added Hook Update to set the cron values

Comment #17

13 October 2025 at 16:33

lpeidro committed e7385518 on 1.x

Issue #3544912: Fix issues detected during testing

Comment #18

13 October 2025 at 16:33

lpeidro committed f7fb6015 on 1.x
```
Issue #3544912: Update tests
```

Comment #19

13 October 2025 at 16:33

lpeidro committed 00017ffe on 1.x

Issue #3544912: The queue worker is not used anymore

Comment #20

13 October 2025 at 16:33

lpeidro committed 47bc8162 on 1.x

Issue #3544912: Implement the tracker in the hook alters

Comment #21

13 October 2025 at 16:33

lpeidro committed 7f2d6077 on 1.x

Issue #3544912: Implement the Tracker system in the Entity Class

Comment #22

13 October 2025 at 16:33

lpeidro committed aa557795 on 1.x

Issue #3544912: Added new methods needed for tracking

Comment #23

13 October 2025 at 16:33

lpeidro committed 1429d71d on 1.x

Issue #3544912: Created form for cron configuration

Comment #24

13 October 2025 at 16:33

lpeidro committed 2b93fdb6 on 1.x

Issue #3544912: When the tracker, the NodeBatch does not need to...

Comment #25

13 October 2025 at 16:33

lpeidro committed 084a4340 on 1.x
```
Issue #3544912: Added cron to process
```

Comment #26

13 October 2025 at 16:33

lpeidro committed 48b5a25e on 1.x

Issue #3544912: Refactor drush commands and ad a new one for track...

Comment #27

13 October 2025 at 16:33

lpeidro committed 9283a8d3 on 1.x

Issue #3544912: Remove batch process that is not needed

Comment #28

13 October 2025 at 16:33

lpeidro committed b7c1a5f6 on 1.x

Issue #3544912: We add button for process tracker entities.

Comment #29

13 October 2025 at 16:33

lpeidro committed 7724977d on 1.x

Issue #3544912: Ony truncate the tracker table.

Comment #30

13 October 2025 at 16:33

lpeidro committed 1af83816 on 1.x

Issue #3544912: We refactor the node batch to use the tracker

Comment #31

13 October 2025 at 16:33

lpeidro committed c6a97ef4 on 1.x

Issue #3544912: We restructure the form of batch process with the option...

Comment #32

13 October 2025 at 16:33

lpeidro committed 28073de9 on 1.x

Issue #3544912: We create the batch process for entities tracking

Comment #33

13 October 2025 at 16:33

lpeidro committed 15ceb96f on 1.x

Issue #3544912: Only we need a field of timestamp to execute in a proper...

Comment #34

13 October 2025 at 16:33

lpeidro committed 39b46b11 on 1.x

Issue #3544912: Created service tracker manager to add or remove...

Comment #35

13 October 2025 at 16:33

lpeidro committed e3e4d85b on 1.x

Issue #3544912: Create service for tracker actions on the database

Comment #36

13 October 2025 at 16:33

lpeidro committed 743030a5 on 1.x

Issue #3544912: Create data base table for tracker

Comment #37

13 October 2025 at 16:33

lpeidro committed 0989de57 on 1.x

Issue #3544912: Created test for the system tracker

Comment #38

13 October 2025 at 16:33

lpeidro committed a870fefb on 1.x
```
Issue #3544912: Fix coding standard
```

Comment #39

13 October 2025 at 16:33

lpeidro committed e9dd6e74 on 1.x
```
Issue #3544912: Fix unit tests
```

Comment #40

13 October 2025 at 16:33

lpeidro committed 255eec29 on 1.x

Issue #3544912: Overrite the haspermission method in DummyAccount...

Comment #41

13 October 2025 at 16:33

lpeidro committed 83cc0e34 on 1.x

Issue #3544912: Add form validation in the backend for configuration...

Comment #42

13 October 2025 at 16:33

lpeidro committed 7357f0a0 on 1.x
```
Issue #3544912: Fix test
```

Comment #43

13 October 2025 at 16:33

lpeidro committed 9e327a19 on 1.x
```
Issue #3544912: Fix test, create roles
```

Comment #44

13 October 2025 at 16:33

lpeidro committed 2dfef0ae on 1.x
```
Issue #3544912: Fix coding standard
```

Comment #45

13 October 2025 at 16:33

lpeidro committed 7a11fd85 on 1.x
```
Issue #3544912: Fix test
```

Comment #46

13 October 2025 at 16:33

lpeidro committed 42abd50f on 1.x
```
Issue #3544912: Fix test
```

Comment #47

13 October 2025 at 16:33

lpeidro committed 7cafc203 on 1.x
```
Issue #3544912: Fix coding standard
```

Comment #48

13 October 2025 at 16:33

lpeidro committed 41118f60 on 1.x

Issue #3544912: The account is not set or properly configured during...

Comment #49

13 October 2025 at 16:33

lpeidro committed 6e20e1a1 on 1.x
```
Issue #3544912: Fix coding standard
```

Comment #50

13 October 2025 at 16:33

lpeidro committed fefa9f00 on 1.x

Issue #3544912: Fix functional and kernel test II

Comment #51

13 October 2025 at 16:33

lpeidro committed 929e19fc on 1.x
```
Issue #3544912: Fix functional test
```

Comment #52

13 October 2025 at 16:33

lpeidro committed 038d766e on 1.x

Issue #3544912: Implement kernel test to check that check properly the...

Comment #53

13 October 2025 at 16:33

lpeidro committed a19c1593 on 1.x

Issue #3544912: Refactor the DummyAccount and the method check access

Comment #54

13 October 2025 at 16:33

lpeidro committed 2e02d039 on 1.x

Issue #3544912: Rename and update the method checkAccessEntity to be...

Comment #55

13 October 2025 at 16:33

lpeidro committed bbf8bc9d on 1.x

Issue #3544912: Improve the role selection and add the possibility to...

Comment #56

13 October 2025 at 16:33

lpeidro committed 431cc932 on 1.x

Issue #3544912: Improve explanation in form

Comment #57

13 October 2025 at 16:33

lpeidro committed a817f240 on 1.x

Issue #3544912: Added test for the configuration roles funcionality

Comment #58

13 October 2025 at 16:33

lpeidro committed 37369e63 on 1.x

Issue #3544912: Added funcionality to configure different roles.

Comment #59

lpeidro commented 13 October 2025 at 16:54

Issue summary:

View changes

Status:

Reviewed & tested by the community

» Fixed

Functionality merged.

Comment #60

13 October 2025 at 16:54

Now that this issue is closed, please review the contribution record.

As a contributor, attribute any organization that helped you, or if you volunteered your own time.

Maintainers, please credit people who helped resolve this issue.

Comment #61

27 October 2025 at 16:54

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.
