survos/ai-dataset-bundle

Dataset-scale AI batch processing for Survos datasets using canonical dataset paths and JSONL artifacts.

Package info

github.com/survos/ai-dataset-bundle

Type: symfony-bundle

pkg:composer/survos/ai-dataset-bundle

dev-main 2026-05-29 11:04 UTC

This package is auto-updated.

Last update: 2026-05-29 12:51:38 UTC


README

Dataset-scale AI batch processing for Survos/Museado datasets.

This bundle is intentionally separate from survos/ai-workflow-bundle. ai-workflow-bundle operates on individual workflow subjects. This bundle operates on dataset JSONL stages, writes durable batch artifacts, and uses the canonical workspace paths from survos/dataset-bundle.

Responsibilities

  • Read normalized rows from 20_normalize/{core}.jsonl.
  • Write provider-ready batch input JSONL and manifests to 40_ai/.
  • Submit/check/download OpenAI batch jobs through survos/ai-batch-bundle.
  • Convert downloaded batch responses into portable claim JSONL files.
  • Leave later enrichment/import stages to consume those claim files.
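The two transformations in this list (normalized row to batch input line, and batch output line to claim row) can be sketched as follows. This is a minimal illustration, not the bundle's code: the request envelope (`custom_id`, `method`, `url`, `body`) and the response envelope follow the OpenAI Batch API format, while the function names, the row fields, the model name, the prompt, and the claim shape (`subject`, `predicate`, `value`) are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Build one line of OpenAI Batch API input from a normalized row.
 * Assumption: the row has an "id" field; model and prompt are placeholders.
 */
function buildBatchRequestLine(array $row, string $model = 'gpt-4o-mini'): string
{
    $request = [
        // custom_id lets the output line be matched back to the source row.
        'custom_id' => (string) $row['id'],
        'method' => 'POST',
        'url' => '/v1/chat/completions',
        'body' => [
            'model' => $model,
            'messages' => [
                ['role' => 'system', 'content' => 'Summarize this record in one dense paragraph.'],
                ['role' => 'user', 'content' => json_encode($row, JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR)],
            ],
        ],
    ];

    // One JSON object per line, no pretty printing, so the file stays valid JSONL.
    return json_encode($request, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR);
}

/**
 * Convert one line of batch output into a claim row.
 * Returns null when the request failed or carries no message content.
 * Assumption: the claim shape below is hypothetical, not the survos/claims-bundle schema.
 */
function batchOutputLineToClaim(string $line): ?array
{
    $result = json_decode($line, true, 512, JSON_THROW_ON_ERROR);

    $statusCode = $result['response']['status_code'] ?? null;
    $content = $result['response']['body']['choices'][0]['message']['content'] ?? null;

    if ($statusCode !== 200 || !is_string($content)) {
        return null;
    }

    return [
        'subject' => $result['custom_id'],
        'predicate' => 'dense_summary',
        'value' => trim($content),
    ];
}
```

Keeping the source row's identifier in `custom_id` is what makes the claim file portable: each claim can be joined back to its normalized row without relying on line order in the batch output.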

Commands

The console commands are implemented as methods on Survos\AiDatasetBundle\Service\DatasetAiService. Each takes the dataset key as its argument:

php bin/console ai:dataset:estimate mus/aust --core=obj
php bin/console ai:dataset:prepare mus/aust --core=obj --force
php bin/console ai:dataset:submit mus/aust --core=obj --force
php bin/console ai:dataset:status mus/aust
php bin/console ai:dataset:download mus/aust --core=obj --force

ai:dataset:submit is the paid provider call. ai:dataset:estimate and ai:dataset:prepare run locally.

Files

For dataset mus/aust and core obj, the bundle uses:

Path                                          Purpose
20_normalize/obj.jsonl                        Normalized source records
40_ai/obj.dense_summary.batch.input.jsonl     OpenAI batch input
40_ai/obj.dense_summary.batch.json            Local batch manifest
40_ai/obj.dense_summary.batch.output.jsonl    Raw OpenAI batch output
40_ai/obj.jsonl                               Portable claim rows for enrichment

All paths are resolved with Survos\DataBundle\Service\DataPaths.
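The naming pattern in the table can be expressed as a small function. This is only a sketch of the convention: `aiArtifactPaths()` is a hypothetical helper, not part of `DataPaths`, and it assumes the middle segment of the file names (`dense_summary`) is a task name that could vary. Paths are relative to the dataset directory.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper that mirrors the file layout listed above.
 * The real bundle resolves paths through DataPaths; this only shows the pattern.
 *
 * @return array<string, string> relative paths keyed by role
 */
function aiArtifactPaths(string $core, string $task = 'dense_summary'): array
{
    // Shared stem for the three batch artifacts of one core and task.
    $stem = sprintf('40_ai/%s.%s.batch', $core, $task);

    return [
        'normalized'   => sprintf('20_normalize/%s.jsonl', $core),
        'batch_input'  => $stem . '.input.jsonl',
        'manifest'     => $stem . '.json',
        'batch_output' => $stem . '.output.jsonl',
        'claims'       => sprintf('40_ai/%s.jsonl', $core),
    ];
}
```

For example, `aiArtifactPaths('obj')['batch_input']` yields `40_ai/obj.dense_summary.batch.input.jsonl`, matching the table.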

Install

composer require survos/ai-dataset-bundle

Register the bundle in config/bundles.php:

Survos\AiDatasetBundle\SurvosAiDatasetBundle::class => ['all' => true],

Required runtime bundles:

  • survos/dataset-bundle
  • survos/jsonl-bundle
  • survos/ai-batch-bundle
  • survos/claims-bundle

Optional:

  • yethee/tiktoken for better token estimates.
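To illustrate why a tokenizer improves estimates, here is the kind of crude heuristic one might use without it. This is an assumption for illustration only: the README does not say how the bundle estimates tokens when yethee/tiktoken is absent, and the function names are hypothetical. The sketch uses the common rule of thumb of roughly four bytes per token for English text, which can be noticeably off for other languages or for JSON-heavy input.

```php
<?php
declare(strict_types=1);

/**
 * Rough token estimate: about 4 bytes per token (rule of thumb, not a tokenizer).
 */
function estimateTokens(string $text): int
{
    return (int) ceil(strlen($text) / 4);
}

/**
 * Sum the rough estimate over JSONL lines, skipping blank lines.
 *
 * @param iterable<string> $lines
 */
function estimateJsonlTokens(iterable $lines): int
{
    $total = 0;
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        $total += estimateTokens($line);
    }

    return $total;
}
```

A real tokenizer counts the model's actual tokens instead of guessing from length, which matters when the estimate is used to predict the cost of a paid batch submission.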