hokusai/CLAUDE.md
Konstantin Fickel 4def49350e
chore: rename bulkgen to hokusai
2026-02-20 17:08:12 +01:00

CLAUDE.md - hokusai development guide

Project overview

hokusai is a make-like build tool for AI-generated artifacts (images and text). A YAML config file defines targets with dependencies; hokusai builds a DAG with networkx and executes generation in parallel topological order using Mistral (text) and BlackForestLabs (images) as providers.

Commands

uv sync                  # install dependencies
uv run hokusai build     # build all targets
uv run hokusai build X   # build target X and its transitive deps
uv run hokusai clean     # remove generated artifacts + state file
uv run hokusai graph     # print dependency graph with stages
uv run pytest            # run tests

Code quality

Pre-commit hooks run automatically on git commit:

  • basedpyright - strict static type checking (config: pyrightconfig.json points to .devenv/state/venv)
  • ruff check - linting with auto-fix
  • ruff format - formatting
  • commitizen - enforces conventional commit messages (feat:, fix:, chore:, etc.)

Run manually:

basedpyright
ruff check
ruff format --check

Code style conventions

  • All function signatures must be fully typed. No Any unless truly unavoidable.
  • Use pathlib.Path everywhere, never os.path.
  • Use from __future__ import annotations in every module.
  • Use modern typing: str | None (not Optional[str]), Self, override, Annotated.
  • Pydantic BaseModel for data that serializes to/from YAML. dataclass for internal-only data structures (e.g. BuildResult).
  • Errors: assign the message to a variable first, then raise it: msg = "..."; raise ValueError(msg). This satisfies ruff's EM (flake8-errmsg) rules.
  • Commit messages follow conventional commits (feat:, fix:, refactor:, chore:).
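The conventions above could look like this in practice. This is a minimal sketch, not code from the repository: OutputCheck and check_output are hypothetical names chosen only to show the typing, pathlib, dataclass, and error-message patterns together.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass
class OutputCheck:
    """Internal-only record, so a dataclass rather than a pydantic model."""

    path: Path
    size_bytes: int | None  # modern union syntax instead of Optional[int]


def check_output(project_dir: Path, target_name: str) -> OutputCheck:
    """Describe an output file; raise if the project directory is missing."""
    if not project_dir.is_dir():
        # Message is assigned to a variable first, then raised.
        msg = f"Project directory does not exist: {project_dir}"
        raise ValueError(msg)
    path = project_dir / target_name
    size = path.stat().st_size if path.is_file() else None
    return OutputCheck(path=path, size_bytes=size)
```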

Architecture

Module structure

main.py                  # Entry point: imports and runs hokusai.cli.app
hokusai/
  __init__.py
  cli.py                 # Typer CLI: build, clean, graph commands
  config.py              # Pydantic models for YAML config
  graph.py               # networkx DAG construction and traversal
  builder.py             # Build orchestrator: incremental + parallel
  state.py               # .hokusai.state.yaml hash tracking
  providers/
    __init__.py           # Abstract Provider base class (ABC)
    image.py              # BlackForestLabs image generation
    text.py               # Mistral text generation

Data flow

  1. cli.py finds the *.hokusai.yaml file in the current working directory and calls load_config() from config.py
  2. config.py parses YAML into ProjectConfig (pydantic), which contains Defaults and dict[str, TargetConfig]
  3. graph.py builds an nx.DiGraph from target dependencies. get_build_order() uses nx.topological_generations() to return parallel batches
  4. builder.py run_build() iterates generations. Per generation:
    • Checks each target for dirtiness via state.py (SHA-256 hashes of inputs, prompt, model, extra params)
    • Skips targets whose deps already failed
    • Runs dirty targets concurrently with asyncio.gather()
    • Records state after each generation (crash resilience)
  5. providers/ dispatch by TargetType (inferred from file extension)

Key design decisions

  • Target type inference: .png/.jpg/.jpeg/.webp = image, .md/.txt = text. Defined in config.py as IMAGE_EXTENSIONS / TEXT_EXTENSIONS.
  • Prompt resolution: if the prompt string is a path to an existing file, its contents are read; otherwise it's used as-is. Done in builder.py:_resolve_prompt().
  • BFL client is synchronous: wrapped in asyncio.to_thread() in providers/image.py. Uses ClientConfig(sync=True, timeout=300) for internal polling.
  • Mistral client is natively async: uses complete_async() directly in providers/text.py.
  • Incremental builds: .hokusai.state.yaml tracks per-target: input file hashes, prompt hash, model name, and extra params hash. Any change marks the target dirty.
  • Error isolation: if a target fails, its dependents are marked "Dependency failed" but independent targets continue building.
  • State saved per-generation: partial progress survives crashes. At most one generation of work is lost.
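Three of the decisions above (type inference, prompt resolution, and the dirtiness check) can be sketched as follows. The function names, the plain-string target types, the case-insensitive suffix match, resolving prompt paths relative to project_dir, and the layout of the fingerprint dict are all assumptions made for illustration; the real helpers live in config.py, builder.py, and state.py and may differ.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path

IMAGE_EXTENSIONS = frozenset({".png", ".jpg", ".jpeg", ".webp"})
TEXT_EXTENSIONS = frozenset({".md", ".txt"})


def infer_target_type(target_name: str) -> str:
    """Map a target's file extension to "image" or "text"."""
    suffix = Path(target_name).suffix.lower()  # lowercasing is an assumption
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in TEXT_EXTENSIONS:
        return "text"
    msg = f"Cannot infer target type for {target_name!r}"
    raise ValueError(msg)


def resolve_prompt(prompt: str, project_dir: Path) -> str:
    """Return the file's contents if the prompt names an existing file."""
    try:
        candidate = project_dir / prompt
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8")
    except (OSError, ValueError):
        # Long or unusual prompt strings are not valid paths; use them as-is.
        pass
    return prompt


def sha256_text(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def fingerprint(
    input_files: list[str],
    prompt: str,
    model: str,
    extra_params: dict[str, object],
    project_dir: Path,
) -> dict[str, object]:
    """Collect everything that should trigger a rebuild when it changes."""
    return {
        "inputs": {
            name: hashlib.sha256((project_dir / name).read_bytes()).hexdigest()
            for name in input_files
        },
        "prompt": sha256_text(prompt),
        "model": model,
        # sort_keys makes the hash independent of dict ordering.
        "extra": sha256_text(json.dumps(extra_params, sort_keys=True)),
    }


def is_dirty(recorded: dict[str, object] | None, current: dict[str, object]) -> bool:
    """A target is dirty if it was never built or any tracked value changed."""
    return recorded != current
```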

Provider interface

All providers implement hokusai.providers.Provider:

async def generate(self, target_name, target_config, resolved_prompt, resolved_model, project_dir) -> None

The provider writes the result file to project_dir / target_name.
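A rough sketch of what the interface might look like with full type annotations. The parameter types are assumptions, since only the parameter names are given above; TargetConfig here is a simplified stand-in for the pydantic model in config.py, and EchoProvider is a hypothetical provider that exists only to show the contract.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class TargetConfig:
    """Simplified stand-in for hokusai.config.TargetConfig."""

    prompt: str
    inputs: list[str] = field(default_factory=list)


class Provider(ABC):
    """Base class: one async method that writes the artifact to disk."""

    @abstractmethod
    async def generate(
        self,
        target_name: str,
        target_config: TargetConfig,
        resolved_prompt: str,
        resolved_model: str,
        project_dir: Path,
    ) -> None:
        """Write the generated artifact to project_dir / target_name."""


class EchoProvider(Provider):
    """Hypothetical provider that writes the prompt back out; handy in tests."""

    async def generate(
        self,
        target_name: str,
        target_config: TargetConfig,
        resolved_prompt: str,
        resolved_model: str,
        project_dir: Path,
    ) -> None:
        output = project_dir / target_name
        output.write_text(f"[{resolved_model}] {resolved_prompt}\n", encoding="utf-8")
```

Because generate() returns None and communicates only through the written file, the builder can treat every provider the same way regardless of artifact type.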

Image provider specifics (BFL)

  • image_prompt field: base64-encoded reference image (from target.reference_image)
  • control_image field: base64-encoded control image (from target.control_images)
  • Result image URL is in result.result["sample"], downloaded via httpx
  • Supported models: flux-dev, flux-pro, flux-pro-1.1, flux-pro-1.1-ultra, flux-kontext-pro, flux-pro-1.0-canny, flux-pro-1.0-depth, flux-pro-1.0-fill, flux-pro-1.0-expand
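The encoding and threading pattern behind the image provider can be sketched without the BFL client. In this illustration blocking_generate is a hypothetical stand-in for the synchronous client call, its return shape is simplified to a plain dict with a "sample" key, and the httpx download step is omitted.

```python
from __future__ import annotations

import asyncio
import base64
import time
from pathlib import Path


def encode_image(path: Path) -> str:
    """Base64-encode an image file for use in a request payload."""
    return base64.b64encode(path.read_bytes()).decode("ascii")


def blocking_generate(payload: dict[str, str]) -> dict[str, str]:
    """Hypothetical stand-in for a synchronous client that polls until done."""
    time.sleep(0.01)  # pretend to wait for the remote job
    return {"sample": "https://example.invalid/result.png"}


async def request_image(prompt: str, reference: Path | None) -> str:
    """Run the blocking call in a worker thread and return the result URL."""
    payload = {"prompt": prompt}
    if reference is not None:
        payload["image_prompt"] = encode_image(reference)
    # to_thread keeps the event loop free while the sync client blocks.
    result = await asyncio.to_thread(blocking_generate, payload)
    return result["sample"]
```

Without asyncio.to_thread(), a single image target would block the event loop and serialize the whole generation.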

Text provider specifics (Mistral)

  • Text input files are appended to the prompt with --- Contents of <name> --- headers
  • Image inputs are only mentioned in the prompt as [Attached image: <name>]; the image data itself is not sent (no vision/multimodal support yet)
  • Raw LLM response is written directly to the output file, no post-processing
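The prompt assembly described above might look roughly like this. The function name assemble_prompt, the blank-line separator between parts, and the extension check used to tell images from text are assumptions; only the two header formats come from the description above.

```python
from __future__ import annotations

from pathlib import Path

IMAGE_EXTENSIONS = frozenset({".png", ".jpg", ".jpeg", ".webp"})


def assemble_prompt(prompt: str, inputs: list[str], project_dir: Path) -> str:
    """Append each input to the prompt: text inline, images as a note."""
    parts = [prompt]
    for name in inputs:
        if Path(name).suffix.lower() in IMAGE_EXTENSIONS:
            # Images are only mentioned by name; their bytes are not sent.
            parts.append(f"[Attached image: {name}]")
        else:
            content = (project_dir / name).read_text(encoding="utf-8")
            parts.append(f"--- Contents of {name} ---\n{content}")
    return "\n\n".join(parts)
```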

Environment variables

  • MISTRAL_API_KEY - required for text targets
  • BFL_API_KEY - required for image targets
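One way to fail early when a key is missing is sketched below. The helper require_api_key, the string target types, and the choice of RuntimeError are hypothetical and not taken from the codebase; only the two variable names come from the list above.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

API_KEY_VARS = {"text": "MISTRAL_API_KEY", "image": "BFL_API_KEY"}


def require_api_key(target_type: str, env: Mapping[str, str] | None = None) -> str:
    """Return the API key for a target type or raise a descriptive error."""
    source = os.environ if env is None else env
    var = API_KEY_VARS[target_type]
    key = source.get(var)
    if not key:
        msg = f"{var} must be set to build {target_type} targets"
        raise RuntimeError(msg)
    return key
```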

Dependencies

  • typer - CLI framework
  • pydantic - data validation and config models
  • pyyaml - YAML parsing
  • networkx - dependency graph
  • blackforest - BlackForestLabs API client (sync, uses requests; no type stubs)
  • mistralai - Mistral API client (supports async)
  • httpx - async HTTP for downloading BFL result images (transitive via mistralai)
  • hatchling - build backend