docs: update README and CLAUDE.md with recent features

- Add regenerate command documentation
- Add download target type
- Fix reference_images field (list, not single string)
- Document archive behavior for clean command
- Update module structure to reflect actual files
- Add OpenAI provider documentation
- Update supported models list
- Add OPENAI_API_KEY to environment variables
2026-02-21 11:48:49 +01:00 · 2026-02-21 11:48:49 +01:00 · fda28b5afa
commit fda28b5afa
parent d76951fe47
2 changed files with 80 additions and 23 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -2,17 +2,18 @@

 ## Project overview

-hokusai is a `make`-like build tool for AI-generated artifacts (images and text). A YAML config file defines targets with dependencies; hokusai builds a DAG with networkx and executes generation in parallel topological order using Mistral (text) and BlackForestLabs (images) as providers.
+hokusai is a `make`-like build tool for AI-generated artifacts (images and text). A YAML config file defines targets with dependencies; hokusai builds a DAG with networkx and executes generation in parallel topological order using Mistral, OpenAI (text), and BlackForestLabs, OpenAI (images) as providers.

 ## Commands

 ```bash
-uv sync                  # install dependencies
-uv run hokusai build     # build all targets
-uv run hokusai build X   # build target X and its transitive deps
-uv run hokusai clean     # remove generated artifacts + state file
-uv run hokusai graph     # print dependency graph with stages
-uv run pytest            # run tests
+uv sync                      # install dependencies
+uv run hokusai build         # build all targets
+uv run hokusai build X       # build target X and its transitive deps
+uv run hokusai regenerate X  # force rebuild X even if up to date
+uv run hokusai clean         # remove generated artifacts + state file (or archive if configured)
+uv run hokusai graph         # print dependency graph with stages
+uv run pytest                # run tests
 ```

 ## Code quality
@@ -48,15 +49,23 @@ ruff format --check
 main.py                  # Entry point: imports and runs hokusai.cli.app
 hokusai/
  __init__.py
-  cli.py                 # Typer CLI: build, clean, graph commands
+  cli.py                 # Typer CLI: build, regenerate, clean, graph, init, models commands
  config.py              # Pydantic models for YAML config
  graph.py               # networkx DAG construction and traversal
  builder.py             # Build orchestrator: incremental + parallel
  state.py               # .hokusai.state.yaml hash tracking
+  archive.py             # Archive helper for preserving previous generations
+  prompt.py              # Prompt resolution and placeholder substitution
+  resolve.py             # Model resolution (target -> provider/model)
  providers/
    __init__.py           # Abstract Provider base class (ABC)
-    image.py              # BlackForestLabs image generation
-    text.py               # Mistral text generation
+    models.py             # ModelInfo and Capability definitions
+    registry.py           # Provider/model registry
+    blackforest.py        # BlackForestLabs FLUX image generation
+    mistral.py            # Mistral text generation
+    openai_text.py        # OpenAI text generation (GPT-4, GPT-5, o3, etc.)
+    openai_image.py       # OpenAI image generation (DALL-E, gpt-image)
+    bfl.py                # Low-level BFL API client
 ```

 ### Data flow
@@ -74,10 +83,14 @@ hokusai/
 ### Key design decisions

 - **Target type inference**: `.png/.jpg/.jpeg/.webp` = image, `.md/.txt` = text. Defined in `config.py` as `IMAGE_EXTENSIONS` / `TEXT_EXTENSIONS`.
-- **Prompt resolution**: if the `prompt` string is a path to an existing file, its contents are read; otherwise it's used as-is. Done in `builder.py:_resolve_prompt()`.
-- **BFL client is synchronous**: wrapped in `asyncio.to_thread()` in `providers/image.py`. Uses `ClientConfig(sync=True, timeout=300)` for internal polling.
-- **Mistral client is natively async**: uses `complete_async()` directly in `providers/text.py`.
+- **Prompt resolution**: if the `prompt` string is a path to an existing file, its contents are read; otherwise it's used as-is. Supports `{filename}` placeholders. Done in `prompt.py`.
+- **Model resolution**: `resolve.py` maps target config + defaults to a `ModelInfo` with provider, model name, and capabilities.
+- **Download targets**: targets with `download:` URL are fetched via httpx; state tracks the URL for incremental skip.
+- **BFL client is async**: custom async client in `providers/bfl.py` polls for completion.
+- **Mistral client is natively async**: uses `complete_async()` directly.
+- **OpenAI clients are async**: use the official `openai` SDK with async methods.
 - **Incremental builds**: `.hokusai.state.yaml` tracks per-target: input file hashes, prompt hash, model name, and extra params hash. Any change marks the target dirty.
+- **Archiving**: when `archive_folder` is set, previous outputs are moved to `archive/<name>.01.<ext>` (incrementing) before rebuild or clean.
 - **Error isolation**: if a target fails, its dependents are marked "Dependency failed" but independent targets continue building.
 - **State saved per-generation**: partial progress survives crashes. At most one generation of work is lost.
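
The archiving rule above (`archive/<name>.01.<ext>`, incrementing) can be sketched as follows. This is a minimal illustration, not the code in `archive.py`: the function names `next_archive_path` and `archive_output` are hypothetical, and the two-digit zero padding is inferred from the `.01.` example.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def next_archive_path(archive_dir: Path, target_name: str) -> Path:
    """Return the first unused `<stem>.NN<suffix>` path inside archive_dir."""
    target = Path(target_name)
    n = 1
    while True:
        # Assumed naming: hero.png -> hero.01.png, hero.02.png, ...
        candidate = archive_dir / f"{target.stem}.{n:02d}{target.suffix}"
        if not candidate.exists():
            return candidate
        n += 1


def archive_output(project_dir: Path, archive_dir: Path, target_name: str) -> Path | None:
    """Move an existing output into the archive; return its new path, or None."""
    source = project_dir / target_name
    if not source.exists():
        return None  # nothing to preserve on a first build
    archive_dir.mkdir(parents=True, exist_ok=True)
    destination = next_archive_path(archive_dir, target_name)
    shutil.move(str(source), str(destination))
    return destination
```

Scanning for the first free suffix keeps the helper stateless, so the archive folder itself is the only record of how many generations exist.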

@@ -91,21 +104,34 @@ The provider writes the result file to `project_dir / target_name`.

 ### Image provider specifics (BFL)

-- `image_prompt` field: base64-encoded reference image (from `target.reference_image`)
-- `control_image` field: base64-encoded control image (from `target.control_images`)
-- Result image URL is in `result.result["sample"]`, downloaded via httpx
-- Supported models: `flux-dev`, `flux-pro`, `flux-pro-1.1`, `flux-pro-1.1-ultra`, `flux-kontext-pro`, `flux-pro-1.0-canny`, `flux-pro-1.0-depth`, `flux-pro-1.0-fill`, `flux-pro-1.0-expand`
+- Reference images are base64-encoded and passed as `input_image` (flux-2), `image_prompt` (flux-1.x), etc.
+- Control images for canny/depth models use `control_image` field
+- Result image URL is polled and downloaded via httpx
+- Supported models: `flux-dev`, `flux-pro`, `flux-pro-1.1`, `flux-pro-1.1-ultra`, `flux-2-pro`, `flux-kontext-pro`, `flux-pro-1.0-canny`, `flux-pro-1.0-depth`, `flux-pro-1.0-fill`, `flux-pro-1.0-expand`
+
+### Image provider specifics (OpenAI)
+
+- Uses `images.generate` for text-to-image, `images.edit` for image-to-image
+- Reference images passed as raw bytes to the edit endpoint
+- Supported models: `gpt-image-1.5`, `gpt-image-1`, `gpt-image-1-mini`, `dall-e-3`, `dall-e-2`

 ### Text provider specifics (Mistral)

 - Text input files are appended to the prompt with `--- Contents of <name> ---` headers
-- Image inputs are noted as `[Attached image: <name>]` (no actual vision/multimodal yet)
+- Image inputs are encoded as data URLs for multimodal models (pixtral)
 - Raw LLM response is written directly to the output file, no post-processing
+- Supported models: `mistral-large-latest`, `mistral-small-latest`, `pixtral-large-latest`, `pixtral-12b-latest`
+
+### Text provider specifics (OpenAI)
+
+- Similar to Mistral: text inputs appended, images encoded as data URLs
+- Supported models: `gpt-5`, `gpt-5-mini`, `gpt-5-nano`, `gpt-4o`, `gpt-4o-mini`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `o3`, `o3-mini`, `o3-pro`, `o4-mini`

 ## Environment variables

-- `MISTRAL_API_KEY` - required for text targets
-- `BFL_API_KEY` - required for image targets
+- `MISTRAL_API_KEY` - required for Mistral text models
+- `BFL_API_KEY` - required for BlackForestLabs FLUX image models
+- `OPENAI_API_KEY` - required for OpenAI text and image models

 ## Dependencies

@@ -113,7 +139,7 @@ The provider writes the result file to `project_dir / target_name`.
 - `pydantic` - data validation and config models
 - `pyyaml` - YAML parsing
 - `networkx` - dependency graph
-- `blackforest` - BlackForestLabs API client (sync, uses `requests`; no type stubs)
 - `mistralai` - Mistral API client (supports async)
-- `httpx` - async HTTP for downloading BFL result images (transitive via mistralai)
+- `openai` - OpenAI API client (supports async)
+- `httpx` - async HTTP for BFL polling and image downloads
 - `hatchling` - build backend
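
The text-provider notes in this file say that text inputs are appended under `--- Contents of <name> ---` headers and that image inputs are encoded as data URLs. A rough sketch of that assembly step is below. The helper names (`image_to_data_url`, `assemble_inputs`), the blank-line separators, and the MIME table are assumptions for illustration, not the providers' actual code.

```python
from __future__ import annotations

import base64
from pathlib import Path

# Assumed mapping; the extensions mirror the image types listed in the docs.
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
}


def image_to_data_url(path: Path) -> str:
    """Encode an image file as a base64 `data:` URL."""
    mime = MIME_TYPES[path.suffix.lower()]
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def assemble_inputs(prompt: str, inputs: list[Path]) -> tuple[str, list[str]]:
    """Return the prompt with text inputs appended, plus image data URLs."""
    parts = [prompt]
    image_urls: list[str] = []
    for path in inputs:
        if path.suffix.lower() in MIME_TYPES:
            image_urls.append(image_to_data_url(path))
        else:
            body = path.read_text(encoding="utf-8")
            parts.append(f"--- Contents of {path.name} ---\n{body}")
    return "\n\n".join(parts), image_urls
```

A provider would then pass the returned text and data URLs to its SDK's multimodal message format, which differs between Mistral and OpenAI and is not shown here.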
--- a/README.md
+++ b/README.md
@@ -85,10 +85,11 @@ defaults:
 | `prompt` | string | Inline prompt text, or path to a prompt file |
 | `model` | string | Override the default model for this target |
 | `inputs` | list[string] | Files this target depends on (other targets or existing files) |
-| `reference_image` | string | Image file for image-to-image generation |
+| `reference_images` | list[string] | Image files for image-to-image generation |
 | `control_images` | list[string] | Control images (for canny/depth models) |
 | `width` | int | Image width in pixels |
 | `height` | int | Image height in pixels |
+| `download` | string | URL to download instead of generating (mutually exclusive with prompt) |

 Target type is inferred from the file extension:
 - **Image**: `.png`, `.jpg`, `.jpeg`, `.webp`
@@ -133,6 +134,23 @@ targets:

 hokusai resolves dependencies automatically. If you build a single target, its transitive dependencies are included.

+### Download targets
+
+Targets can download files from URLs instead of generating them:
+
+```yaml
+targets:
+  reference.jpg:
+    download: https://example.com/image.jpg
+
+  variation.png:
+    prompt: "A variation of this image in watercolor style"
+    reference_images:
+      - reference.jpg
+```
+
+Download targets participate in dependency resolution like any other target. They are skipped if the URL hasn't changed.
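
The two rules stated for download targets (`download` is mutually exclusive with `prompt`, and a target is skipped when its URL is unchanged) could look roughly like this. The project's config uses Pydantic models; this sketch uses a plain dataclass to stay dependency-free, and `TargetSpec`, `needs_download`, and the `recorded_urls` mapping are hypothetical names. Treating a missing output file as dirty is also an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TargetSpec:
    """Hypothetical stand-in for a target entry in the YAML config."""

    prompt: str | None = None
    download: str | None = None

    def __post_init__(self) -> None:
        # `download` and `prompt` may not be combined on one target.
        if self.prompt is not None and self.download is not None:
            raise ValueError("`download` and `prompt` are mutually exclusive")


def needs_download(
    name: str,
    spec: TargetSpec,
    recorded_urls: dict[str, str],
    output_exists: bool,
) -> bool:
    """Return True when the file must be fetched again."""
    if not output_exists:
        return True  # assumption: a missing output always triggers a fetch
    return recorded_urls.get(name) != spec.download
```

Here `recorded_urls` stands in for whatever the state file stores per target; the real state format is not shown in this diff.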
+
 ### Archiving previous outputs

 Set `archive_folder` at the top level to preserve previous versions of generated files. When a target is rebuilt, the existing output is moved to the archive folder with an incrementing numeric suffix:
@@ -157,10 +175,23 @@ Build all targets, or a specific target and its dependencies.
 - Runs independent targets in parallel
 - Continues building if a target fails (dependents of the failed target are skipped)

+### `hokusai regenerate <targets...>`
+
+Force regeneration of specific targets, ignoring their up-to-date status. Useful for getting a new variation of an AI-generated output without changing the prompt.
+
+```bash
+hokusai regenerate hero.png           # regenerate one target
+hokusai regenerate hero.png logo.png  # regenerate multiple targets
+```
+
+If `archive_folder` is set, the previous versions are archived before regeneration.
+
 ### `hokusai clean`

 Remove all generated target files and the build state file (`.hokusai.state.yaml`). Input files are preserved.

+If `archive_folder` is set, files are moved to the archive instead of being deleted.
+
 ### `hokusai graph`

 Print the dependency graph showing build stages: