drupal / ai_recipe_audio_transcription
Transcribes an audio media item and creates a summary of the transcription.
Package info
git.drupalcode.org/project/ai_recipe_audio_transcription.git
Type: drupal-recipe
pkg:composer/drupal/ai_recipe_audio_transcription
Requires
- drupal/ai: ^1.4
- drupal/core: ^11.2
- drupal/views_bulk_operations: ^4.4
This package is not auto-updated.
Last update: 2026-06-11 14:55:30 UTC
README
Automatically transcribes audio files uploaded to the Media: Audio bundle and generates a short editorial summary from the transcription. Uses AI Automators with any speech-to-text-capable AI provider (e.g. OpenAI Whisper).
Requirements
- Drupal 11.2+
- AI module 1.4 or newer (provides the verifySetupAi config action)
- A configured default provider for the speech_to_text operation type at /admin/config/ai/settings → Default Providers (e.g. OpenAI Whisper)
- A configured default provider for the chat operation type, used for the summary step (any text-capable model)
Apply
composer require drupal/ai_recipe_audio_transcription
php core/scripts/drupal recipe recipes/contrib/ai_recipe_audio_transcription
drush cache:rebuild
If no default model for speech_to_text is configured, the recipe apply
will abort and roll back. Configure a default model and re-run.
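Before applying, it can help to confirm that both default providers are set. The helper below is a minimal sketch, not part of the recipe: the function name ai_default_provider is hypothetical, and it assumes the AI module stores these defaults in the ai.settings config object under a default_providers key, which you should verify on your own site.

```shell
# Hypothetical helper: print the default provider for one operation type.
# Assumption: defaults live in `ai.settings` under `default_providers.<type>`.
# Run `drush config:get ai.settings` to confirm the layout on your site.
ai_default_provider() {
  drush config:get ai.settings "default_providers.$1"
}

# Usage (requires the AI module to be installed already):
#   ai_default_provider speech_to_text
#   ai_default_provider chat
```

If either call prints nothing useful or reports a missing key, set the provider at /admin/config/ai/settings before running the recipe.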
What it does
- Applies core/recipes/audio_media_type (Drupal core Audio media type).
- Installs ai, ai_automators, views_bulk_operations, and options.
- Adds three fields to the Media: Audio bundle:
  - field_audio_transcription: long text, stores the full transcription.
  - field_audio_summary: long text, stores the AI-generated summary.
  - ai_automator_status: list field tracking automator processing state (default: pending).
- Registers four AI Automator rules on the Audio bundle:

  | Automator | Source field | Rule | Worker |
  | --- | --- | --- | --- |
  | Audio Transcription Action | field_media_audio_file | llm_audio_to_string_long | action |
  | Audio Transcription Default | field_media_audio_file | llm_audio_to_string_long | queue |
  | Audio Summary Action | field_audio_transcription | llm_string_long | action |
  | Audio Summary Default | field_audio_transcription | llm_string_long | queue |

  The summary automators use the prompt: "Create 2-3 sentence editor-friendly summary of the following content: {{ context }}"
- Configures both AI fields as textarea widgets on the Audio edit form (weight 19–20, hidden from the default view display).
- Adds a Summary exposed filter to the Media view and all displays of the Media Library view so editors can search audio by summary text.
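To check that the three fields were created, one option is to look up their field config objects. This is a sketch only: check_audio_fields is a hypothetical helper, it assumes the Audio bundle's machine name is audio, and it relies on Drupal's field.field.<entity type>.<bundle>.<field> config naming and on drush config:get returning a non-zero exit code when the object is missing.

```shell
# Hypothetical helper: report whether each field added by the recipe exists
# on the Audio bundle. Assumes the bundle machine name is `audio`.
check_audio_fields() {
  local missing=0 field
  for field in field_audio_transcription field_audio_summary ai_automator_status; do
    # A missing config object is expected to make `drush config:get` fail.
    if drush config:get "field.field.media.audio.$field" >/dev/null 2>&1; then
      echo "ok       $field"
    else
      echo "missing  $field"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_audio_fields && echo "all fields present"
```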
How automators run
Each field has two automator variants — action and queue — to support both on-demand and background processing.
On save (action worker)
The action automators fire synchronously when an Audio media item is
saved and ai_automator_status is pending. Transcription runs first
(weight 100), then the summary is generated from the resulting
transcription text (weight 102).
This is the default path for editors uploading audio through the UI.
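The ordering can be illustrated with a small self-contained sketch that sorts the rules by weight for one worker. It does not talk to Drupal. The weights are the ones quoted in this section (100 to 103); pairing 101 with the transcription queue rule and 103 with the summary queue rule is an assumption, and the function names are hypothetical.

```shell
# Rules as "weight worker name". Weights 101/103 are assigned to the queue
# variants by assumption; check the automator config for the real values.
automator_rules() {
  cat <<'EOF'
102 action Audio Summary Action
100 action Audio Transcription Action
103 queue Audio Summary Default
101 queue Audio Transcription Default
EOF
}

# Print the rule names for one worker, lowest weight first.
run_order() {
  automator_rules | awk -v w="$1" '$2 == w' | sort -n | cut -d' ' -f3-
}

# Usage:
#   run_order action
#   run_order queue
```

For either worker the transcription rule sorts ahead of the summary rule, which is what lets the summary read a populated field_audio_transcription.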
Via cron / queue (queue worker)
The queue automators (weights 101 and 103) process items through Drupal's queue system. Run the queue worker to process pending items in the background:
drush queue:run ai_automator_entity_queue
Or let cron pick it up automatically on the next scheduled run.
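For large backlogs it may be preferable to process the queue in bounded slices rather than in one long run. The wrapper below is a minimal sketch: run_automator_queue is a hypothetical name, and the 60-second default is an arbitrary choice. It uses the --time-limit option of drush queue:run.

```shell
# Hypothetical wrapper: process the automator queue for at most N seconds
# (default 60, chosen arbitrarily for this sketch).
run_automator_queue() {
  local seconds="${1:-60}"
  drush queue:run ai_automator_entity_queue --time-limit="$seconds"
}

# Usage:
#   run_automator_queue 300   # work for up to five minutes
#   drush queue:list          # see how many items remain in each queue
```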
Bulk processing with Views Bulk Operations
To re-run automators on existing audio items:
- Go to Content → Media (/admin/content/media).
- Filter to the Audio type.
- Select the items to process.
- Choose the Execute AI Automators VBO action and apply.
This triggers the action automators for every selected item.
The recipe does not add the VBO actions to the media overview view. If you want to use them, update the media overview view yourself. This is not done by default because Drupal core does not yet have config actions for operating on views; once it does, the VBO feature will be added. For now, adding the actions could also conflict with ai_recipe_image_classification, which adds VBO actions to the media overview page as well.
Checking automator status
The ai_automator_status field on each Audio media item reflects the
current processing state (pending, processing, done, failed).
You can add this field as a column in the Media view for monitoring.
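For a quick overview from the command line, the states can also be counted directly in the database. This is a sketch under several assumptions: the site uses Drupal's default SQL field storage (so the field lives in a media__ai_automator_status table with an ai_automator_status_value column), there is no table prefix, and the bundle machine name is audio. The function name is hypothetical.

```shell
# Hypothetical helper: count Audio media items per automator status.
# Assumes default SQL field storage, no table prefix, bundle name `audio`.
automator_status_counts() {
  drush sql:query "SELECT ai_automator_status_value AS status, COUNT(*) AS items FROM media__ai_automator_status WHERE bundle = 'audio' GROUP BY ai_automator_status_value"
}

# Usage:
#   automator_status_counts
```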
Cost note
Every audio file triggers one speech-to-text API call for transcription and one chat API call for the summary. Long audio files will consume more tokens. Review your provider's per-minute or per-token pricing before enabling bulk reprocessing of large media libraries.
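A rough estimate before a bulk run can be computed from the two calls per file. The sketch below is arithmetic only: estimate_cost is a hypothetical helper, the rates in the usage example are made-up placeholders rather than any provider's real prices, and it simplifies the summary call to a flat per-call cost even though real chat pricing scales with transcript length.

```shell
# estimate_cost FILES AVG_MINUTES STT_PRICE_PER_MINUTE SUMMARY_PRICE_PER_CALL
# Total = files * (minutes * speech-to-text rate + flat summary cost).
estimate_cost() {
  awk -v n="$1" -v m="$2" -v stt="$3" -v chat="$4" \
    'BEGIN { printf "%.2f\n", n * (m * stt + chat) }'
}

# Usage with placeholder rates (not real prices):
#   estimate_cost 100 10 0.01 0.05   # 100 files of 10 minutes each
```

Substitute the per-minute and per-call figures from your provider's current price list before relying on the result.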
Issue queue
Bugs and feature requests: https://www.drupal.org/project/issues/ai_recipe_audio_transcription