Content targets write literal text to files via 'content:' field, without requiring an AI provider or API keys. They are not archived when overwritten. Loop expansion allows defining 'loops:' at the top level with named lists of values. Targets with [var] in their name are expanded via cartesian product. Variables are substituted in all string fields. Explicit targets override expanded ones. Escaping: \[var] -> [var]. Expansion happens at config load time so the rest of the system (builder, graph, state) sees only expanded targets.
CLAUDE.md - hokusai development guide
Project overview
hokusai is a make-like build tool for AI-generated artifacts (images and text). A YAML config file defines targets with dependencies; hokusai builds a DAG with networkx and executes generation in parallel topological order using Mistral, OpenAI (text), and BlackForestLabs, OpenAI (images) as providers.
Commands
uv sync # install dependencies
uv run hokusai build # build all targets
uv run hokusai build X # build target X and its transitive deps
uv run hokusai regenerate X # force rebuild X even if up to date
uv run hokusai clean # remove generated artifacts + state file (or archive if configured)
uv run hokusai graph # print dependency graph with stages
uv run pytest # run tests
Code quality
Pre-commit hooks run automatically on git commit:
- basedpyright - strict static type checking (config: pyrightconfig.json points to .devenv/state/venv)
- ruff check - linting with auto-fix
- ruff format - formatting
- commitizen - enforces conventional commit messages (feat:, fix:, chore:, etc.)
Run manually:
basedpyright
ruff check
ruff format --check
Code style conventions
- All function signatures must be fully typed. No Any unless truly unavoidable.
- Use pathlib.Path everywhere, never os.path.
- Use from __future__ import annotations in every module.
- Use modern typing: str | None (not Optional[str]), Self, override, Annotated.
- Pydantic BaseModel for data that serializes to/from YAML. dataclass for internal-only data structures (e.g. BuildResult).
- Errors: raise with msg = "..."; raise ValueError(msg) pattern (ruff EM101/EM102 compliance).
- Commit messages follow conventional commits (feat:, fix:, refactor:, chore:).
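A minimal sketch of these conventions applied in one module. Only the BuildResult name comes from the list above; its fields and the read_prompt helper are hypothetical and exist purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass
class BuildResult:
    """Internal-only data, so a dataclass rather than a pydantic model.

    The field names are assumptions made for this sketch.
    """

    target: str
    error: str | None = None  # modern union syntax instead of Optional[str]


def read_prompt(path: Path) -> str:
    """Hypothetical helper: fully typed signature, pathlib instead of os.path."""
    if not path.is_file():
        # Assign the message to a variable first, then raise it.
        msg = f"Prompt file not found: {path}"
        raise ValueError(msg)
    return path.read_text(encoding="utf-8")
```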
Architecture
Module structure
main.py # Entry point: imports and runs hokusai.cli.app
hokusai/
    __init__.py
    cli.py # Typer CLI: build, regenerate, clean, graph, init, models commands
    config.py # Pydantic models for YAML config + loop expansion at load time
    expand.py # Loop variable extraction, substitution, and target expansion
    graph.py # networkx DAG construction and traversal
    builder.py # Build orchestrator: incremental + parallel
    state.py # .hokusai.state.yaml hash tracking
    archive.py # Archive helper for preserving previous generations
    prompt.py # Prompt resolution and placeholder substitution
    resolve.py # Model resolution (target -> provider/model)
    providers/
        __init__.py # Abstract Provider base class (ABC)
        models.py # ModelInfo and Capability definitions
        registry.py # Provider/model registry
        blackforest.py # BlackForestLabs FLUX image generation
        mistral.py # Mistral text generation
        openai_text.py # OpenAI text generation (GPT-4, GPT-5, o3, etc.)
        openai_image.py # OpenAI image generation (DALL-E, gpt-image)
        bfl.py # Low-level BFL API client
Data flow
- cli.py finds the *.hokusai.yaml in cwd, calls load_config() from config.py
- config.py parses YAML, expands loop templates via expand.py (cartesian product), then validates into ProjectConfig (pydantic), which contains Defaults, loops, and dict[str, TargetConfig]
- graph.py builds an nx.DiGraph from target dependencies. get_build_order() uses nx.topological_generations() to return parallel batches
- builder.py run_build() iterates generations. Per generation:
  - Checks each target for dirtiness via state.py (SHA-256 hashes of inputs, prompt, model, extra params)
  - Skips targets whose deps already failed
  - Runs dirty targets concurrently with asyncio.gather()
  - Records state after each generation (crash resilience)
- providers/ dispatch by TargetType (inferred from file extension)
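The generation loop above can be sketched with the standard library alone. This is an illustration under assumptions, not the builder's actual code: build_generations is a hand-rolled stand-in for nx.topological_generations(), the status strings other than "Dependency failed" are invented, and the dirtiness check and state recording are omitted.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable


def build_generations(deps: dict[str, set[str]]) -> list[list[str]]:
    """Batch targets so each batch depends only on earlier batches.

    deps maps every target to its dependencies; each dependency must
    itself be a key of deps.
    """
    remaining = {name: set(needed) for name, needed in deps.items()}
    generations: list[list[str]] = []
    while remaining:
        ready = sorted(name for name, needed in remaining.items() if not needed)
        if not ready:
            msg = "Dependency cycle detected"
            raise ValueError(msg)
        generations.append(ready)
        for name in ready:
            del remaining[name]
        for needed in remaining.values():
            needed.difference_update(ready)
    return generations


async def run_build(
    deps: dict[str, set[str]],
    build_one: Callable[[str], Awaitable[None]],
) -> dict[str, str]:
    """Run each generation concurrently, skipping targets whose deps failed."""
    status: dict[str, str] = {}
    for generation in build_generations(deps):
        runnable: list[str] = []
        for name in generation:
            if any(status[dep] != "ok" for dep in deps[name]):
                status[name] = "Dependency failed"
            else:
                runnable.append(name)
        # return_exceptions=True keeps one failure from cancelling its siblings.
        results = await asyncio.gather(
            *(build_one(name) for name in runnable), return_exceptions=True
        )
        for name, result in zip(runnable, results):
            failed = isinstance(result, BaseException)
            status[name] = f"failed: {result}" if failed else "ok"
        # A real builder would persist state here, once per generation.
    return status


async def _demo_build(name: str) -> None:
    """Hypothetical build step that fails for exactly one target."""
    if name == "b.txt":
        msg = "provider error"
        raise RuntimeError(msg)


deps = {"a.txt": set(), "b.txt": set(), "c.txt": {"b.txt"}, "d.txt": {"a.txt"}}
status = asyncio.run(run_build(deps, _demo_build))
```

In this demo c.txt is skipped because b.txt failed, while d.txt still builds because it depends only on a.txt.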
Key design decisions
- Target type inference: .png/.jpg/.jpeg/.webp = image, .md/.txt = text. Defined in config.py as IMAGE_EXTENSIONS / TEXT_EXTENSIONS.
- Prompt resolution: if the prompt string is a path to an existing file, its contents are read; otherwise it's used as-is. Supports {filename} placeholders. Done in prompt.py.
- Model resolution: resolve.py maps target config + defaults to a ModelInfo with provider, model name, and capabilities.
- Content targets: targets with content: write literal text to the file; no provider needed, no archiving on overwrite. State tracks the content string for incremental skip.
- Download targets: targets with a download: URL are fetched via httpx; state tracks the URL for incremental skip.
- Loop expansion: loops: defines named lists of values. Targets with [var] in their name are expanded via cartesian product at config load time (in expand.py). Only variables appearing in the target name trigger expansion. Explicit targets override expanded ones. Escaping: \[var] → literal [var]. Substitution applies to all string fields (prompt, content, download, inputs, reference_images, control_images). The rest of the system sees only expanded targets.
- BFL client is async: custom async client in providers/bfl.py polls for completion.
- Mistral client is natively async: uses complete_async() directly.
- OpenAI clients are async: use the official openai SDK with async methods.
- Incremental builds: .hokusai.state.yaml tracks per-target: input file hashes, prompt hash, model name, and extra params hash. Any change marks the target dirty.
- Archiving: when archive_folder is set, previous outputs are moved to archive/<name>.01.<ext> (incrementing) before rebuild or clean.
- Error isolation: if a target fails, its dependents are marked "Dependency failed" but independent targets continue building.
- State saved per-generation: partial progress survives crashes. At most one generation of work is lost.
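The loop expansion rules above can be sketched as follows. This is a simplified illustration, not the code in expand.py: targets are modelled as flat dicts of string fields, the function names are hypothetical, and explicit targets are passed through untouched (how the real tool treats escapes in non-loop targets is not stated here).

```python
from __future__ import annotations

import itertools
import re

# [name] not preceded by a backslash is a loop variable.
_VAR = re.compile(r"(?<!\\)\[(\w+)\]")
# A backslash-escaped [name] stays literal.
_ESCAPED = re.compile(r"\\\[(\w+)\]")


def substitute(text: str, values: dict[str, str]) -> str:
    """Replace known loop variables, then unescape escaped brackets."""
    replaced = _VAR.sub(lambda m: values.get(m.group(1), m.group(0)), text)
    return _ESCAPED.sub(r"[\1]", replaced)


def expand_targets(
    targets: dict[str, dict[str, str]],
    loops: dict[str, list[str]],
) -> dict[str, dict[str, str]]:
    """Expand templated targets via cartesian product of their loop variables."""
    expanded: dict[str, dict[str, str]] = {}
    explicit: dict[str, dict[str, str]] = {}
    for name, fields in targets.items():
        # Only loop variables that appear in the target name trigger expansion.
        names = [v for v in dict.fromkeys(_VAR.findall(name)) if v in loops]
        if not names:
            explicit[name] = dict(fields)
            continue
        for combo in itertools.product(*(loops[v] for v in names)):
            values = dict(zip(names, combo))
            expanded[substitute(name, values)] = {
                key: substitute(value, values) for key, value in fields.items()
            }
    # Explicit targets win over expanded ones with the same name.
    return {**expanded, **explicit}


loops = {"animal": ["cat", "dog"], "style": ["ink", "oil"]}
targets = {
    "[animal]-[style].txt": {"content": "A [animal] in [style], tagged \\[draft]"},
    "cat-ink.txt": {"content": "hand-written override"},
}
result = expand_targets(targets, loops)
```

Here the template yields four targets (cat-ink.txt, cat-oil.txt, dog-ink.txt, dog-oil.txt), and the explicit cat-ink.txt entry replaces the expanded one.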
Provider interface
All providers implement hokusai.providers.Provider:
async def generate(self, target_name, target_config, resolved_prompt, resolved_model, project_dir) -> None
The provider writes the result file to project_dir / target_name.
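A minimal sketch of what implementing this interface might look like. The parameter types are assumptions (the real signature presumably takes TargetConfig and ModelInfo objects rather than a dict and a string), and EchoProvider is a hypothetical provider that is not part of hokusai.

```python
from __future__ import annotations

import asyncio
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class Provider(ABC):
    """Sketch of the abstract base class; placeholder parameter types."""

    @abstractmethod
    async def generate(
        self,
        target_name: str,
        target_config: dict[str, object],
        resolved_prompt: str,
        resolved_model: str,
        project_dir: Path,
    ) -> None: ...


class EchoProvider(Provider):
    """Hypothetical provider that writes the prompt back out unchanged."""

    async def generate(
        self,
        target_name: str,
        target_config: dict[str, object],
        resolved_prompt: str,
        resolved_model: str,
        project_dir: Path,
    ) -> None:
        # The contract: write the result file to project_dir / target_name.
        (project_dir / target_name).write_text(resolved_prompt, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    project_dir = Path(tmp)
    asyncio.run(
        EchoProvider().generate("story.md", {}, "Once upon a time", "echo-1", project_dir)
    )
    written = (project_dir / "story.md").read_text(encoding="utf-8")
```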
Image provider specifics (BFL)
- Reference images are base64-encoded and passed as input_image (flux-2), image_prompt (flux-1.x), etc.
- Control images for canny/depth models use the control_image field
- Result image URL is polled and downloaded via httpx
- Supported models: flux-dev, flux-pro, flux-pro-1.1, flux-pro-1.1-ultra, flux-2-pro, flux-kontext-pro, flux-pro-1.0-canny, flux-pro-1.0-depth, flux-pro-1.0-fill, flux-pro-1.0-expand
Image provider specifics (OpenAI)
- Uses images.generate for text-to-image, images.edit for image-to-image
- Reference images passed as raw bytes to the edit endpoint
- Supported models: gpt-image-1.5, gpt-image-1, gpt-image-1-mini, dall-e-3, dall-e-2
Text provider specifics (Mistral)
- Text input files are appended to the prompt with --- Contents of <name> --- headers
- Image inputs are encoded as data URLs for multimodal models (pixtral)
- Raw LLM response is written directly to the output file, no post-processing
- Supported models: mistral-large-latest, mistral-small-latest, pixtral-large-latest, pixtral-12b-latest
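The input handling described above might look roughly like the following sketch. The helper names, the blank-line separators, and the use of the bare file name in the header are assumptions; only the header text and the data-URL encoding come from the list above.

```python
from __future__ import annotations

import base64
import mimetypes
from pathlib import Path

TEXT_EXTENSIONS = {".md", ".txt"}


def append_text_inputs(prompt: str, inputs: list[Path]) -> str:
    """Append each text input to the prompt under a 'Contents of' header."""
    parts = [prompt]
    for path in inputs:
        if path.suffix.lower() in TEXT_EXTENSIONS:
            body = path.read_text(encoding="utf-8")
            parts.append(f"--- Contents of {path.name} ---\n{body}")
    return "\n\n".join(parts)


def image_data_url(path: Path) -> str:
    """Encode an image file as a base64 data URL for a multimodal model."""
    mime, _ = mimetypes.guess_type(path.name)
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{encoded}"
```

Non-text inputs are ignored by append_text_inputs; in this sketch they would be passed through image_data_url separately.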
Text provider specifics (OpenAI)
- Similar to Mistral: text inputs appended, images encoded as data URLs
- Supported models: gpt-5, gpt-5-mini, gpt-5-nano, gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o3-mini, o3-pro, o4-mini
Environment variables
- MISTRAL_API_KEY - required for Mistral text models
- BFL_API_KEY - required for BlackForestLabs FLUX image models
- OPENAI_API_KEY - required for OpenAI text and image models
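A small sketch of how a build could check for missing keys before dispatching to providers. The provider labels and the missing_api_keys helper are hypothetical; only the three variable names come from the list above.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Hypothetical provider labels mapped to the documented variable names.
REQUIRED_KEYS: dict[str, str] = {
    "mistral": "MISTRAL_API_KEY",
    "blackforest": "BFL_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def missing_api_keys(
    providers: set[str],
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the variables that are unset or empty for the providers in use."""
    return sorted(
        REQUIRED_KEYS[name] for name in providers if not env.get(REQUIRED_KEYS[name])
    )
```

A project that uses only content or download targets would pass an empty set and need no keys at all.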
Dependencies
- typer - CLI framework
- pydantic - data validation and config models
- pyyaml - YAML parsing
- networkx - dependency graph
- mistralai - Mistral API client (supports async)
- openai - OpenAI API client (supports async)
- httpx - async HTTP for BFL polling and image downloads
- hatchling - build backend