docs: update README and CLAUDE.md with recent features
- Add regenerate command documentation
- Add download target type
- Fix reference_images field (list, not single string)
- Document archive behavior for clean command
- Update module structure to reflect actual files
- Add OpenAI provider documentation
- Update supported models list
- Add OPENAI_API_KEY to environment variables
parent d76951fe47
commit fda28b5afa
2 changed files with 80 additions and 23 deletions
CLAUDE.md (70 lines changed)
@@ -2,17 +2,18 @@

 ## Project overview

-hokusai is a `make`-like build tool for AI-generated artifacts (images and text). A YAML config file defines targets with dependencies; hokusai builds a DAG with networkx and executes generation in parallel topological order using Mistral (text) and BlackForestLabs (images) as providers.
+hokusai is a `make`-like build tool for AI-generated artifacts (images and text). A YAML config file defines targets with dependencies; hokusai builds a DAG with networkx and executes generation in parallel topological order using Mistral, OpenAI (text), and BlackForestLabs, OpenAI (images) as providers.

 ## Commands

 ```bash
-uv sync                    # install dependencies
-uv run hokusai build       # build all targets
-uv run hokusai build X     # build target X and its transitive deps
-uv run hokusai clean       # remove generated artifacts + state file
-uv run hokusai graph       # print dependency graph with stages
-uv run pytest              # run tests
+uv sync                       # install dependencies
+uv run hokusai build          # build all targets
+uv run hokusai build X        # build target X and its transitive deps
+uv run hokusai regenerate X   # force rebuild X even if up to date
+uv run hokusai clean          # remove generated artifacts + state file (or archive if configured)
+uv run hokusai graph          # print dependency graph with stages
+uv run pytest                 # run tests
 ```

 ## Code quality

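The project overview in the hunk above says targets are executed in "parallel topological order", and the `graph` command prints stages. A minimal sketch of that staging idea follows. It uses the standard-library `graphlib` rather than networkx to stay dependency-free, and the function name `build_stages` and the target names are hypothetical, not taken from hokusai.

```python
from graphlib import TopologicalSorter


def build_stages(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group targets into stages.

    `deps` maps each target to the set of targets it depends on.
    Every target in a stage depends only on targets from earlier stages,
    so all targets within one stage could be generated in parallel.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        stages.append(ready)
        sorter.done(*ready)  # unlocks targets whose dependencies are now built
    return stages


# Hypothetical targets: two images depend on a text file, a poster on both images.
deps = {
    "story.md": set(),
    "hero.png": {"story.md"},
    "villain.png": {"story.md"},
    "poster.png": {"hero.png", "villain.png"},
}
stages = build_stages(deps)
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(stage)}")
```

With networkx, `networkx.topological_generations` yields the same kind of grouping; whether hokusai uses that call is not stated in this commit.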
@@ -48,15 +49,23 @@ ruff format --check
 main.py # Entry point: imports and runs hokusai.cli.app
 hokusai/
   __init__.py
-  cli.py # Typer CLI: build, clean, graph commands
+  cli.py # Typer CLI: build, regenerate, clean, graph, init, models commands
   config.py # Pydantic models for YAML config
   graph.py # networkx DAG construction and traversal
   builder.py # Build orchestrator: incremental + parallel
   state.py # .hokusai.state.yaml hash tracking
+  archive.py # Archive helper for preserving previous generations
+  prompt.py # Prompt resolution and placeholder substitution
+  resolve.py # Model resolution (target -> provider/model)
   providers/
     __init__.py # Abstract Provider base class (ABC)
-    image.py # BlackForestLabs image generation
-    text.py # Mistral text generation
+    models.py # ModelInfo and Capability definitions
+    registry.py # Provider/model registry
+    blackforest.py # BlackForestLabs FLUX image generation
+    mistral.py # Mistral text generation
+    openai_text.py # OpenAI text generation (GPT-4, GPT-5, o3, etc.)
+    openai_image.py # OpenAI image generation (DALL-E, gpt-image)
+    bfl.py # Low-level BFL API client
 ```

 ### Data flow

@@ -74,10 +83,14 @@ hokusai/
 ### Key design decisions

 - **Target type inference**: `.png/.jpg/.jpeg/.webp` = image, `.md/.txt` = text. Defined in `config.py` as `IMAGE_EXTENSIONS` / `TEXT_EXTENSIONS`.
-- **Prompt resolution**: if the `prompt` string is a path to an existing file, its contents are read; otherwise it's used as-is. Done in `builder.py:_resolve_prompt()`.
-- **BFL client is synchronous**: wrapped in `asyncio.to_thread()` in `providers/image.py`. Uses `ClientConfig(sync=True, timeout=300)` for internal polling.
-- **Mistral client is natively async**: uses `complete_async()` directly in `providers/text.py`.
+- **Prompt resolution**: if the `prompt` string is a path to an existing file, its contents are read; otherwise it's used as-is. Supports `{filename}` placeholders. Done in `prompt.py`.
+- **Model resolution**: `resolve.py` maps target config + defaults to a `ModelInfo` with provider, model name, and capabilities.
+- **Download targets**: targets with `download:` URL are fetched via httpx; state tracks the URL for incremental skip.
+- **BFL client is async**: custom async client in `providers/bfl.py` polls for completion.
+- **Mistral client is natively async**: uses `complete_async()` directly.
+- **OpenAI clients are async**: use the official `openai` SDK with async methods.
 - **Incremental builds**: `.hokusai.state.yaml` tracks per-target: input file hashes, prompt hash, model name, and extra params hash. Any change marks the target dirty.
+- **Archiving**: when `archive_folder` is set, previous outputs are moved to `archive/<name>.01.<ext>` (incrementing) before rebuild or clean.
 - **Error isolation**: if a target fails, its dependents are marked "Dependency failed" but independent targets continue building.
 - **State saved per-generation**: partial progress survives crashes. At most one generation of work is lost.
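
The "Incremental builds" bullet lists what the state file tracks per target: input file hashes, prompt hash, model name, and an extra-params hash. One way to picture the dirty check is sketched below. The names `fingerprint` and `is_dirty`, the use of SHA-256, and the dictionary layout are assumptions for illustration; the actual layout of `.hokusai.state.yaml` is not shown in this commit.

```python
import hashlib
import json
from typing import Optional


def sha256_text(text: str) -> str:
    """Hex SHA-256 digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def fingerprint(inputs: dict[str, bytes], prompt: str, model: str, params: dict) -> dict:
    """Collect the four tracked properties of a target into one comparable record."""
    return {
        "inputs": {name: hashlib.sha256(data).hexdigest() for name, data in sorted(inputs.items())},
        "prompt": sha256_text(prompt),
        "model": model,
        # sort_keys makes the hash independent of dict insertion order
        "params": sha256_text(json.dumps(params, sort_keys=True)),
    }


def is_dirty(recorded: Optional[dict], current: dict) -> bool:
    """A target is dirty if it was never built or any tracked property changed."""
    return recorded != current


# Hypothetical target: same inputs and prompt, then a model change.
recorded = fingerprint({"story.md": b"once upon a time"}, "draw a hero", "flux-dev", {"width": 1024})
unchanged = fingerprint({"story.md": b"once upon a time"}, "draw a hero", "flux-dev", {"width": 1024})
new_model = fingerprint({"story.md": b"once upon a time"}, "draw a hero", "flux-pro", {"width": 1024})

print(is_dirty(recorded, unchanged))  # nothing changed, so the target can be skipped
print(is_dirty(recorded, new_model))  # model changed, so the target must be rebuilt
```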
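
The "Archiving" bullet describes moving previous outputs to `archive/<name>.01.<ext>` with an incrementing counter. A minimal sketch of picking the next free archive name is below. The helper name `next_archive_path`, the two-digit zero padding beyond `01`, and the first-free-index rule are assumptions; the commit only states that the number increments.

```python
import tempfile
from pathlib import Path


def next_archive_path(archive_dir: Path, target_name: str) -> Path:
    """Return the first unused path of the form <stem>.<NN><suffix> in archive_dir."""
    source = Path(target_name)
    index = 1
    while True:
        candidate = archive_dir / f"{source.stem}.{index:02d}{source.suffix}"
        if not candidate.exists():
            return candidate
        index += 1


# Demonstration in a throwaway directory with a hypothetical target name.
with tempfile.TemporaryDirectory() as tmp:
    archive_dir = Path(tmp) / "archive"
    archive_dir.mkdir()

    first = next_archive_path(archive_dir, "hero.png")
    first.write_bytes(b"previous output")  # stands in for moving the old file here

    second = next_archive_path(archive_dir, "hero.png")

print(first.name, second.name)
```

A real implementation would move the existing output to the returned path (for example with `shutil.move`) before the provider writes the new file.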
@@ -91,21 +104,34 @@ The provider writes the result file to `project_dir / target_name`.

 ### Image provider specifics (BFL)

-- `image_prompt` field: base64-encoded reference image (from `target.reference_image`)
-- `control_image` field: base64-encoded control image (from `target.control_images`)
-- Result image URL is in `result.result["sample"]`, downloaded via httpx
-- Supported models: `flux-dev`, `flux-pro`, `flux-pro-1.1`, `flux-pro-1.1-ultra`, `flux-kontext-pro`, `flux-pro-1.0-canny`, `flux-pro-1.0-depth`, `flux-pro-1.0-fill`, `flux-pro-1.0-expand`
+- Reference images are base64-encoded and passed as `input_image` (flux-2), `image_prompt` (flux-1.x), etc.
+- Control images for canny/depth models use `control_image` field
+- Result image URL is polled and downloaded via httpx
+- Supported models: `flux-dev`, `flux-pro`, `flux-pro-1.1`, `flux-pro-1.1-ultra`, `flux-2-pro`, `flux-kontext-pro`, `flux-pro-1.0-canny`, `flux-pro-1.0-depth`, `flux-pro-1.0-fill`, `flux-pro-1.0-expand`
+
+### Image provider specifics (OpenAI)
+
+- Uses `images.generate` for text-to-image, `images.edit` for image-to-image
+- Reference images passed as raw bytes to the edit endpoint
+- Supported models: `gpt-image-1.5`, `gpt-image-1`, `gpt-image-1-mini`, `dall-e-3`, `dall-e-2`

 ### Text provider specifics (Mistral)

 - Text input files are appended to the prompt with `--- Contents of <name> ---` headers
-- Image inputs are noted as `[Attached image: <name>]` (no actual vision/multimodal yet)
+- Image inputs are encoded as data URLs for multimodal models (pixtral)
 - Raw LLM response is written directly to the output file, no post-processing
+- Supported models: `mistral-large-latest`, `mistral-small-latest`, `pixtral-large-latest`, `pixtral-12b-latest`
+
+### Text provider specifics (OpenAI)
+
+- Similar to Mistral: text inputs appended, images encoded as data URLs
+- Supported models: `gpt-5`, `gpt-5-mini`, `gpt-5-nano`, `gpt-4o`, `gpt-4o-mini`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `o3`, `o3-mini`, `o3-pro`, `o4-mini`

 ## Environment variables

-- `MISTRAL_API_KEY` - required for text targets
-- `BFL_API_KEY` - required for image targets
+- `MISTRAL_API_KEY` - required for Mistral text models
+- `BFL_API_KEY` - required for BlackForestLabs FLUX image models
+- `OPENAI_API_KEY` - required for OpenAI text and image models

 ## Dependencies

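The text provider notes in the hunk above say text inputs are appended to the prompt under `--- Contents of <name> ---` headers and image inputs are encoded as data URLs. A minimal sketch of both steps follows. The helper names, the blank-line separator between sections, and the fallback MIME type are assumptions; only the header format and the use of data URLs come from the documentation being changed.

```python
import base64
import mimetypes


def append_text_inputs(prompt: str, text_inputs: dict[str, str]) -> str:
    """Append each text input to the prompt under a '--- Contents of <name> ---' header."""
    parts = [prompt]
    for name, contents in text_inputs.items():
        parts.append(f"--- Contents of {name} ---\n{contents}")
    return "\n\n".join(parts)  # separator is an assumption


def image_data_url(name: str, data: bytes) -> str:
    """Encode image bytes as a base64 data URL, guessing the MIME type from the file name."""
    mime, _ = mimetypes.guess_type(name)
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{encoded}"


# Hypothetical inputs for a text target.
full_prompt = append_text_inputs("Summarize the notes.", {"notes.txt": "hello"})
url = image_data_url("ref.png", b"abc")  # placeholder bytes, not a real PNG

print(full_prompt)
print(url)
```

How the resulting data URL is attached to a chat message differs between the Mistral and OpenAI SDKs and is not covered by this sketch.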
@@ -113,7 +139,7 @@ The provider writes the result file to `project_dir / target_name`.
 - `pydantic` - data validation and config models
 - `pyyaml` - YAML parsing
 - `networkx` - dependency graph
-- `blackforest` - BlackForestLabs API client (sync, uses `requests`; no type stubs)
 - `mistralai` - Mistral API client (supports async)
-- `httpx` - async HTTP for downloading BFL result images (transitive via mistralai)
+- `openai` - OpenAI API client (supports async)
+- `httpx` - async HTTP for BFL polling and image downloads
 - `hatchling` - build backend