Problem/Motivation

The maxEmbeddingsInput() method currently returns a hardcoded value of 1024 tokens, with only a @todo comment acknowledging the limitation:

  public function maxEmbeddingsInput($model_id = ''): int {
    // @todo this is playing safe. Ideally, we should provide real number per model.
    return 1024;
  }

This has two problems:

1. Incorrect value: The mistral-embed model supports 8192 tokens, not 1024, so the hardcoded value unnecessarily limits the amount of text that can be embedded.
2. Not using the $model_id parameter: The method receives a model ID but ignores it. The Mistral API returns max_context_length for each model via the /v1/models endpoint, which should be used to return accurate limits per model.

Steps to reproduce

0. Have the ai_provider_mistral module installed and set up
1. Call $provider->maxEmbeddingsInput('mistral-embed')
2. Observe it returns 1024 regardless of model
3. Check the Mistral API: mistral-embed actually supports 8192 tokens

Proposed resolution

Update the method to dynamically fetch the model's max_context_length from the Mistral API, falling back to known defaults when the model is not found.
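The core of the proposed resolution can be sketched as a lookup over the decoded /v1/models response. This is a minimal sketch, not the actual patch: the function name and the exact response shape (`data` list of models with `id` and `max_context_length` keys, following the OpenAI-compatible layout) are assumptions.

```php
/**
 * Resolve the max embeddings input (in tokens) for a Mistral model from a
 * decoded /v1/models response, falling back to a safe default when the
 * model is missing. Illustrative sketch only; names are hypothetical.
 */
function mistral_max_input(array $models, string $model_id, int $fallback = 1024): int {
  foreach ($models['data'] ?? [] as $model) {
    // Match the requested model and use its advertised context length.
    if (($model['id'] ?? '') === $model_id && isset($model['max_context_length'])) {
      return (int) $model['max_context_length'];
    }
  }
  // Unknown model: keep the previous conservative default.
  return $fallback;
}
```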

Remaining tasks

- Implement dynamic fetching from API
- Add fallback for known models


Comments

petar_basic created an issue. See original summary.

petar_basic’s picture

Assigned: petar_basic » Unassigned
Status: Needs work » Needs review

Implemented maxEmbeddingsInput() to fetch the actual max_context_length from Mistral's /v1/models API endpoint instead of returning a hardcoded value of 1024.

- Defaults to mistral-embed model if no model_id specified
- Caches the result for 24 hours (to be consistent with getConfiguredModels())
- Falls back to 1024 if the model is not found in the API response
- Added kernel tests for both the API fetch and fallback scenarios
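The described behavior (default model, 24-hour cache, 1024 fallback) can be illustrated with a simplified, framework-free sketch. The real implementation uses Drupal's cache API and the provider's HTTP client; here a static array and an injected fetcher callback stand in for both, and all names are illustrative rather than the actual MR code.

```php
/**
 * Framework-free sketch of the patched method. A static array stands in
 * for Drupal's cache backend; $fetch_models stands in for the HTTP call
 * to Mistral's /v1/models endpoint.
 */
function max_embeddings_input(string $model_id, callable $fetch_models): int {
  static $cache = [];
  // Default to mistral-embed when no model ID is specified.
  $model_id = $model_id !== '' ? $model_id : 'mistral-embed';
  $now = time();
  // Serve a cached value for up to 24 hours, consistent with getConfiguredModels().
  if (isset($cache[$model_id]) && $cache[$model_id]['expires'] > $now) {
    return $cache[$model_id]['value'];
  }
  // Fall back to 1024 if the model is absent from the API response.
  $value = 1024;
  foreach ($fetch_models()['data'] ?? [] as $model) {
    if (($model['id'] ?? '') === $model_id && isset($model['max_context_length'])) {
      $value = (int) $model['max_context_length'];
      break;
    }
  }
  $cache[$model_id] = ['value' => $value, 'expires' => $now + 86400];
  return $value;
}
```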

Note on interface documentation:

The EmbeddingsInterface::maxEmbeddingsInput() docblock states it returns "Max input string length in bytes", but the actual usage in the ai_search module's EmbeddingStrategyPluginBase passes this value to TextChunker, which uses a tokenizer (getEncodedChunks). This means the value is interpreted as tokens, not bytes. The Mistral API returns max_context_length in tokens. The interface documentation should probably be updated to reflect this.
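If the interface docblock is updated as suggested, the corrected wording might look like the following. This is purely illustrative proposed text, not the actual interface source:

```php
/**
 * Maximum input size for embeddings calls.
 *
 * @param string $model_id
 *   The model ID, or an empty string for the provider default.
 *
 * @return int
 *   Max input length in tokens (not bytes); consumers such as the
 *   ai_search TextChunker treat this value as a token count.
 */
public function maxEmbeddingsInput($model_id = ''): int;
```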

petar_basic’s picture

Issue summary: View changes
petar_basic’s picture

Issue summary: View changes

To test, this can be run:

drush php-eval "
  \$provider = \Drupal::service('ai.provider')->createInstance('mistral');
  print 'maxEmbeddingsInput for mistral-embed: ' . \$provider->maxEmbeddingsInput('mistral-embed') . PHP_EOL;
"

fago made their first commit to this issue’s fork.

fago’s picture

Status: Needs review » Reviewed & tested by the community

good find, solid fix and tests, ready!

> (getEncodedChunks). This means the value is interpreted as tokens, not bytes. The Mistral API returns max_context_length in tokens. The interface documentation should probably be updated to reflect this.

let's open an issue and file an MR to fix that then?

  • fago committed aef865fe on 1.1.x authored by petar_basic
    fix: #3570539 maxEmbeddingsInput() should fetch token limits from API...
fago’s picture

Status: Reviewed & tested by the community » Fixed

Merged, so setting this one to fixed.


Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.