streamd/REQUIREMENTS.md

# Streamd Requirements

Streamd (stylized as "Strea.md") is a personal knowledge management and time-tracking CLI tool that organizes time-ordered markdown files using `@Tag` annotations.

## Core Concepts

### Shard

A **Shard** is the fundamental unit of content. It represents a section of a markdown file (paragraph, heading, list item) that can contain markers and tags.

```
Shard {
    markers: [String]       // @Tag annotations at START of content
    tags: [String]          // @Tag annotations AFTER content begins
    start_line: int
    end_line: int
    children: [Shard]       // Nested shards (hierarchical)
}
```

### LocalizedShard

A **LocalizedShard** extends Shard with temporal and dimensional placement information.

```
LocalizedShard {
    markers: [String]
    tags: [String]
    start_line: int
    end_line: int
    moment: DateTime        // When this entry was created
    location: Map<String, String>  // Dimension placements
    children: [LocalizedShard]
}
```

---

## Tag Extraction Logic

### R1: Tag Recognition Pattern

Tags are recognized by the regex pattern: `@([^\s*\x60~\[\]]+)`

A tag is `@` followed by one or more characters, none of which may be:
- Whitespace
- Asterisks `*`
- Backticks `` ` ``
- Tildes `~`
- Brackets `[]`

**Examples of valid tags:**
- `@Task`, `@Done`, `@Waiting`
- `@Timesheet`, `@Break`
- `@ProjectName`, `@Client-ABC`
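
As a quick illustration, the R1 pattern can be exercised directly with Python's `re` module. This is only a sketch: the helper name `find_tags` is hypothetical and not part of Streamd.

```python
import re

# Pattern copied from R1; \x60 is the backtick character.
TAG_PATTERN = re.compile(r"@([^\s*\x60~\[\]]+)")


def find_tags(text: str) -> list[str]:
    """Return tag names (without the leading '@') in order of appearance."""
    return TAG_PATTERN.findall(text)


print(find_tags("@Task fix **@Client-ABC** and [@Done](url)"))
# ['Task', 'Client-ABC', 'Done']
```

Because `*`, `~`, and `]` are excluded from the character class, a tag ends cleanly when it is wrapped in emphasis, strikethrough, or link syntax.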

### R2: Marker vs Tag Distinction

The extraction MUST distinguish between **markers** and **tags** based on their position within a block:

| Type | Position | Purpose |
|------|----------|---------|
| **Marker** | Before any non-whitespace content | Semantic classification (triggers shard creation) |
| **Tag** | After non-whitespace content | Metadata annotation (does not trigger shard creation) |

**Example:**
```markdown
@Task @Streamd Working on feature    <!-- @Task and @Streamd are MARKERS -->
Some text here @CompletedFeature     <!-- @CompletedFeature is a TAG -->
```

### R3: Marker Boundary Tracking

The extraction algorithm MUST track a "marker boundary" state:

1. Start with `marker_boundary_encountered = false`
2. While processing tokens:
   - If the token is whitespace-only: continue (boundary not crossed)
   - If an `@Tag` token is found AND the boundary is NOT crossed: add it to markers
   - If an `@Tag` token is found AND the boundary is crossed: add it to tags
   - If any other non-whitespace content is found: set `marker_boundary_encountered = true`
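
The boundary tracking above can be sketched as follows. This is a simplified model, not the actual implementation: it assumes tokens are whitespace-separated words of plain text (no inline formatting, see R4), and `split_markers_and_tags` is a hypothetical name.

```python
import re

TAG_TOKEN = re.compile(r"@([^\s*\x60~\[\]]+)")


def split_markers_and_tags(text: str) -> tuple[list[str], list[str]]:
    """Classify @Tag tokens as markers (before content) or tags (after)."""
    markers: list[str] = []
    tags: list[str] = []
    boundary_crossed = False
    # str.split() drops whitespace, so whitespace never crosses the boundary.
    for token in text.split():
        match = TAG_TOKEN.fullmatch(token)
        if match is None:
            # Ordinary content ends the marker run.
            boundary_crossed = True
            # A mixed token such as "see:@Ref" may still carry tags.
            tags.extend(TAG_TOKEN.findall(token))
        elif boundary_crossed:
            tags.append(match.group(1))
        else:
            markers.append(match.group(1))
    return markers, tags


print(split_markers_and_tags("@Task @Streamd Working on feature"))
# (['Task', 'Streamd'], [])
print(split_markers_and_tags("Some text here @CompletedFeature"))
# ([], ['CompletedFeature'])
```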

### R4: Nested Token Handling

Tag extraction MUST handle nested markdown formatting:

- Emphasis: `*@Tag*` or `_@Tag_`
- Strong: `**@Tag**` or `__@Tag__`
- Strikethrough: `~~@Tag~~`
- Links: `[@Tag](url)`

Tags inside these formatting elements are still valid and MUST be extracted.

### R5: Applicable Block Types

Tag extraction applies to:
- Headings (`# Heading with @Tag`)
- Paragraphs (`@Tag in paragraph`)
- Quote blocks (`> @Tag in Quote`)
- List items (each item can have its own markers)

---

## Parsing Logic

### R6: Heading-Based Hierarchy

The parser MUST create a hierarchical shard structure based on markdown headings.

**Algorithm for determining split level:**

1. Find the minimum heading level that either:
   - Appears 2+ times in the block list, OR
   - Has markers AND is not the first heading
2. If no such level exists, do not split (return None)

**Example:**
```markdown
# Main Title
Content here

## Section A        <!-- Split point (level 2 appears twice) -->
Section A content

## Section B        <!-- Split point -->
Section B content
```
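
A minimal sketch of the split-level rule, assuming the block list has already been reduced to its headings in document order. The `Heading` type and `find_split_level` name are hypothetical, and "first heading" is read here as the first heading in the block list.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Heading:
    level: int
    has_markers: bool = False


def find_split_level(headings: list[Heading]) -> Optional[int]:
    """Return the minimum qualifying heading level, or None if nothing splits."""
    counts = Counter(h.level for h in headings)
    # Rule 1: levels that appear two or more times.
    candidates = {level for level, n in counts.items() if n >= 2}
    # Rule 2: headings with markers, unless it is the first heading.
    candidates |= {h.level for h in headings[1:] if h.has_markers}
    return min(candidates) if candidates else None


# The example above: one level-1 heading, two level-2 headings.
print(find_split_level([Heading(1), Heading(2), Heading(2)]))  # 2
```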

### R7: List Item Shard Creation

Each list item with markers MUST become its own shard:

```markdown
- @Task Item one    <!-- Shard 1 -->
- @Task Item two    <!-- Shard 2 -->
- Item three        <!-- NOT a separate shard (no markers) -->
```

### R8: Shard Simplification

When building shards, apply this optimization:
- If a shard has exactly 1 child AND no markers AND no tags, return the child directly instead of wrapping it
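
The simplification rule could look roughly like this. The `Shard` dataclass mirrors the structure from Core Concepts; the `simplify` function name is an assumption for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    markers: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    start_line: int = 0
    end_line: int = 0
    children: list["Shard"] = field(default_factory=list)


def simplify(shard: Shard) -> Shard:
    """Unwrap a shard that only exists to hold a single child."""
    if len(shard.children) == 1 and not shard.markers and not shard.tags:
        return shard.children[0]
    return shard


wrapper = Shard(children=[Shard(markers=["Task"])])
print(simplify(wrapper).markers)  # ['Task']
```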

---

## Dimension Placement Logic

### R9: Dimension Configuration

A **Dimension** defines a classification axis:

```
Dimension {
    display_name: String    // For UI display
    comment: String?        // Documentation
    propagate: bool         // Whether children inherit this dimension
}
```

### R10: Marker Configuration

A **Marker** defines how a tag affects dimension placement:

```
Marker {
    display_name: String
    placements: [MarkerPlacement]
}

MarkerPlacement {
    if_with: Set<String>    // Conditional: only apply if ALL these markers present
    dimension: String       // Target dimension name
    value: String?          // Value to assign (defaults to marker name)
    overwrites: bool        // Can overwrite existing placement
}
```

### R11: Conditional Placement

Placements with `if_with` conditions MUST only apply when ALL specified markers are present on the same shard.

**Example Configuration:**
```
Marker "Task" {
    placements: [
        { dimension: "task", value: "open" },
        { if_with: ["Done"], dimension: "task", value: "done", overwrites: true },
        { if_with: ["Waiting"], dimension: "task", value: "waiting", overwrites: true },
    ]
}
```

**Behavior:**
- `@Task` alone → `task: "open"`
- `@Task @Done` → `task: "done"` (conditional overrides default)
- `@Task @Waiting` → `task: "waiting"`

### R12: Localization Algorithm

The localization process MUST follow this algorithm:

```
function localize_shard(shard, config, propagated_from_parent, moment):
    position = copy(propagated_from_parent)  // Start with inherited
    private_position = {}                     // Non-propagating dimensions

    for marker in shard.markers:
        if marker in config.markers:
            for placement in config.markers[marker].placements:
                // Check conditional
                if placement.if_with is subset of shard.markers:
                    dimension = config.dimensions[placement.dimension]
                    value = placement.value OR marker

                    // Check if we can apply this placement
                    target = dimension.propagate ? position : private_position
                    if placement.dimension not in target OR placement.overwrites:
                        target[placement.dimension] = value

    // Recursively localize children with propagating dimensions
    children = [
        localize_shard(child, config, position, moment)
        for child in shard.children
    ]

    // Merge private dimensions into final position
    position.update(private_position)

    return LocalizedShard(
        markers: shard.markers,
        tags: shard.tags,
        start_line: shard.start_line,
        end_line: shard.end_line,
        location: position,
        moment: moment,
        children: children,
    )
```
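
For readers who want to experiment, here is a runnable Python sketch of the same algorithm. It is deliberately reduced: shards carry only markers and children, the result is a `(location, children)` tuple instead of a `LocalizedShard`, and the names `Placement`, `Shard`, and `localize` are illustrative. The conditional placement is given `overwrites=True` so that it can replace the default value, as described in R14.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Placement:
    dimension: str
    value: Optional[str] = None
    if_with: frozenset = frozenset()
    overwrites: bool = False


@dataclass
class Shard:
    markers: list
    children: list = field(default_factory=list)


def localize(shard, markers_cfg, propagating, inherited=None):
    """Return (location, child_results) following the R12 steps."""
    position = dict(inherited or {})   # inherited, propagating dimensions
    private = {}                       # non-propagating dimensions
    present = set(shard.markers)
    for marker in shard.markers:
        for p in markers_cfg.get(marker, []):
            if not p.if_with <= present:          # conditional check
                continue
            target = position if p.dimension in propagating else private
            if p.dimension not in target or p.overwrites:
                target[p.dimension] = p.value if p.value is not None else marker
    # Children only see the propagating dimensions.
    children = [localize(c, markers_cfg, propagating, position)
                for c in shard.children]
    return {**position, **private}, children


cfg = {
    "Task": [
        Placement("task", "open"),
        Placement("task", "done", frozenset({"Done"}), overwrites=True),
    ],
    "Project-X": [Placement("project")],
}
tree = Shard(["Project-X"], [Shard(["Task", "Done"], [Shard([])])])
location, children = localize(tree, cfg, propagating={"project"})
print(location)        # {'project': 'Project-X'}
print(children[0][0])  # {'project': 'Project-X', 'task': 'done'}
```

The grandchild in this example ends up with only `project`, because `task` is not in the propagating set.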

### R13: Dimension Propagation

When `propagate = true`:
- Children inherit the dimension value from their parent
- A child can override the inherited value with its own placement, provided that placement sets `overwrites: true` (see R12 and R14)

When `propagate = false`:
- Dimension value is NOT inherited by children
- Each shard must have its own marker to be placed in this dimension

**Example:**
```
dimensions: {
    "project": { propagate: true },   // Children inherit project
    "task": { propagate: false },     // Each task is independent
}
```

```markdown
# @Project-X
## @Task Item A     <!-- project: "Project-X", task: "open" -->
### Sub-item        <!-- project: "Project-X", task: (none) -->
## @Task Item B     <!-- project: "Project-X", task: "open" -->
```

### R14: Overwrite Behavior

Default: A placement does NOT overwrite an existing value in the dimension.

With `overwrites: true`: The placement WILL replace any existing value.

This allows conditional placements to override base placements.

---

## File Naming Convention

### R15: File Name Format

Files follow the pattern: `YYYYMMDD[-HHMMSS][_file_type] [markers].md`

- `YYYYMMDD`: Date (8 digits, required)
- `HHMMSS`: Time (4-6 digits, optional; shorter values are padded with zeros)
- `_file_type`: Optional alphanumeric suffix after the timestamp that identifies the file type (e.g. `_daily`)
- `[markers]`: Space-separated marker names extracted from file content

**Extraction regex for datetime:** `^(?P<date>\d{8})(?:-(?P<time>\d{4,6}))?.+\.md$`

**Extraction regex for file type:** `^\d{8}(?:-\d{4,6})?_([a-zA-Z0-9]+)`

When a `_file_type` suffix is present, it is stored in the `file_type` dimension of the root shard and propagates to all child shards.
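
The two extraction regexes can be combined into a small parser, sketched below. The function name `parse_file_name` is hypothetical, and the sketch assumes that a short time value is padded with zeros on the right (so `0830` becomes `083000`); the requirement does not state the padding direction. Calendar-invalid dates would raise `ValueError` here.

```python
import re
from datetime import datetime
from typing import Optional

DATETIME_RE = re.compile(r"^(?P<date>\d{8})(?:-(?P<time>\d{4,6}))?.+\.md$")
FILE_TYPE_RE = re.compile(r"^\d{8}(?:-\d{4,6})?_([a-zA-Z0-9]+)")


def parse_file_name(name: str) -> Optional[tuple]:
    """Return (moment, file_type) or None if the name does not match R15."""
    m = DATETIME_RE.match(name)
    if m is None:
        return None
    # Assumption: missing or short time is padded on the right with zeros.
    time = (m.group("time") or "").ljust(6, "0")
    moment = datetime.strptime(m.group("date") + time, "%Y%m%d%H%M%S")
    ft = FILE_TYPE_RE.match(name)
    return moment, (ft.group(1) if ft else None)


print(parse_file_name("20260413-083000_daily.md"))
# (datetime.datetime(2026, 4, 13, 8, 30), 'daily')
```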

### R16: Temporal Markers

Special markers can override the file timestamp:

- Date markers: `@YYYYMMDD` (8 digits)
- Time markers: `@HHMMSS` (6 digits)

These are used for entries that reference a different time than when the file was created.

---

## Timesheet Module

### R17: Timesheet Point Types

```
TimesheetPointType {
    Card,       // Clock in / start work
    Break,      // Clock out / end work
    SickLeave,
    Vacation,
    Holiday,
    Undertime,
}
```

### R18: Timesheet State Machine

Process timesheet shards chronologically per day:

1. Start state: "on break" (not working)
2. `Card` marker: Transition to "working", record start time
3. `Break` marker: Transition to "on break", emit timecard from previous start to now
4. Special markers (SickLeave, Vacation, etc.): Set day type

**Validation:** The last entry of each day MUST be a `Break` (cannot end day while working).
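
The state machine for a single day might be sketched like this. It assumes the input is already sorted chronologically and reduced to `(moment, type)` pairs, ignores the special day-type markers, and silently skips a repeated `Card` while already working; none of these details are fixed by the requirement, and `build_timecards` is a hypothetical name.

```python
from datetime import datetime


def build_timecards(points):
    """Turn one day's (moment, type) points into (start, end) timecards."""
    timecards = []
    started_at = None                       # None means "on break"
    for moment, kind in points:
        if kind == "Card" and started_at is None:
            started_at = moment             # transition to "working"
        elif kind == "Break" and started_at is not None:
            timecards.append((started_at, moment))
            started_at = None               # transition to "on break"
    if started_at is not None:
        raise ValueError("day ends while working: last entry must be a Break")
    return timecards


day = [
    (datetime(2026, 4, 13, 8, 30), "Card"),
    (datetime(2026, 4, 13, 12, 0), "Break"),
    (datetime(2026, 4, 13, 13, 0), "Card"),
    (datetime(2026, 4, 13, 17, 0), "Break"),
]
print(len(build_timecards(day)))  # 2
```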

### R18a: Timesheet Report Configuration

The `.streamd.toml` file in the base folder configures timesheet periods:

```toml
timezone = "Europe/Berlin"  # Optional timezone for day boundaries

[timesheet]
[[timesheet.periods]]
start = "2026-01-01"
end = "2026-06-30"
hours_per_week = 38.0

[[timesheet.periods]]
start = "2026-07-01"
end = "2026-12-31"
hours_per_week = 40.0
```

**Configuration Rules:**
- Dates use ISO 8601 format (`YYYY-MM-DD`)
- Periods MUST NOT overlap (validation error if they do)
- Gaps between periods are allowed — days in gaps have 0 expected hours
- `hours_per_week` is distributed over Mon-Fri (e.g., 38h/week = 7.6h/day)
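
These rules can be expressed as a small validation and lookup routine. The sketch below assumes period end dates are inclusive and uses hypothetical names (`Period`, `validate_periods`, `expected_hours`); parsing of `.streamd.toml` is left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Period:
    start: date
    end: date              # assumed inclusive
    hours_per_week: float


def validate_periods(periods):
    """Raise ValueError if any two periods overlap."""
    ordered = sorted(periods, key=lambda p: p.start)
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt.start <= prev.end:
            raise ValueError(f"periods overlap: {prev.end} / {nxt.start}")


def expected_hours(day, periods):
    """Expected hours for one day: hours_per_week spread over Mon-Fri."""
    if day.weekday() >= 5:                 # Saturday or Sunday
        return 0.0
    for p in periods:
        if p.start <= day <= p.end:
            return p.hours_per_week / 5
    return 0.0                             # day falls in a gap


periods = [
    Period(date(2026, 1, 1), date(2026, 6, 30), 38.0),
    Period(date(2026, 7, 1), date(2026, 12, 31), 40.0),
]
validate_periods(periods)
print(expected_hours(date(2026, 3, 2), periods))  # 7.6 (a Monday)
```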

### R18b: Day Type Rules

| Day Type | Expected Hours | Actual Hours |
|----------|---------------|--------------|
| Regular work day | period.hours_per_week / 5 | Sum of timecards |
| Weekend (Sat/Sun) | 0 | Sum of timecards (hidden if 0) |
| Sick Leave (@SickLeave) | Normal expected | max(expected, worked) |
| Vacation (@VacationDay) | Normal expected | expected + worked |
| Holiday (@Holiday) | 0 | Sum of timecards |
| Flex Day (@UndertimeDay) | Normal expected | 0 |
| Day in gap (no period) | 0 | Sum of timecards + warning |
| Missing (no entries) | Normal expected | 0 + warning |
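
The "Actual Hours" column of the table reduces to a short function. The day-type strings and the name `credited_hours` are placeholders chosen for this sketch, and warnings are not modeled.

```python
def credited_hours(day_type: str, expected: float, worked: float) -> float:
    """Actual hours credited for a day, following the R18b table."""
    if day_type == "sick_leave":
        return max(expected, worked)
    if day_type == "vacation":
        return expected + worked
    if day_type in ("undertime", "missing"):
        return 0.0
    # Regular days, weekends, holidays, and gap days count worked time only.
    return worked


print(credited_hours("sick_leave", 7.6, 2.0))  # 7.6
print(credited_hours("vacation", 8.0, 1.5))    # 9.5
```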

### R18c: Timesheet Report Warnings

The report generates warnings for:

1. **Missing days without explanation**: A weekday within a configured period has no timecard entries and no special day type marker
2. **Overlapping timecards**: Two or more timecards on the same day have overlapping time ranges
3. **Work outside configured periods**: Work logged on a day that falls outside all configured periods

---

## Query System

### R19: Shard Search

Provide recursive search through the shard tree:

- `find_shard(predicate)`: Find all shards matching a predicate function
- `find_by_position(dimension, value)`: Find shards where `location[dimension] == value`
- `find_by_set_dimension(dimension)`: Find shards where dimension exists in location
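
A possible shape for these three functions is shown below. The sketch passes the root shard explicitly, uses a trimmed-down `LocalizedShard` with only `location` and `children`, and returns matches in pre-order; all of these are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LocalizedShard:
    location: dict
    children: list = field(default_factory=list)


def find_shard(root, predicate):
    """Collect every shard in the tree (pre-order) that satisfies predicate."""
    found = [root] if predicate(root) else []
    for child in root.children:
        found.extend(find_shard(child, predicate))
    return found


def find_by_position(root, dimension, value):
    return find_shard(root, lambda s: s.location.get(dimension) == value)


def find_by_set_dimension(root, dimension):
    return find_shard(root, lambda s: dimension in s.location)


root = LocalizedShard({"project": "X"}, [
    LocalizedShard({"project": "X", "task": "open"}),
    LocalizedShard({"project": "X", "task": "done"}),
])
print(len(find_by_position(root, "task", "open")))  # 1
print(len(find_by_set_dimension(root, "task")))     # 2
```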

---

## CLI Commands

### R20: Core Commands

| Command | Description |
|---------|-------------|
| `streamd new` | Create new timestamped file, open editor, rename with markers on close |
| `streamd daily [YYYYMMDD]` | Open the earliest daily file for the given date (default: today in configured timezone), or create a new `_daily` file if none exists |
| `streamd todo` | List all shards with `task: "open"`, numbered, hiding future tasks |
| `streamd todo --show-future` | Include tasks with future dates in the todo listing |
| `streamd todo N edit` | Edit task N in editor, cursor positioned at task line |
| `streamd todo N done` | Mark task N as done by inserting `@Done` after `@Task` |
| `streamd edit [n]` | Edit nth file (supports negative indexing for recent files) |
| `streamd timesheet` | Generate formatted timesheet report with expected/actual hours |
| `streamd completions <shell>` | Generate shell completions (bash, zsh, fish, elvish, powershell) |
| `streamd lsp` | Start Language Server Protocol server over stdin/stdout |

### R21a: Daily Command Behavior

`streamd daily [YYYYMMDD]` provides quick access to the daily journal entry for a given date.

**Date resolution:**
- If a `YYYYMMDD` argument is provided, it is parsed as the target date.
- If no argument is given, today's date is used, interpreted in the repository timezone (from `.streamd.toml`, defaulting to UTC).

**File lookup:**
- All markdown files in the base folder are localized.
- Files with `file_type = "daily"` whose root shard `moment` falls within the target date (in the configured timezone) are collected.
- The file with the earliest `moment` is opened in `$EDITOR` (defaults to `vi`).

**File creation:**
- If no matching file is found, a new file is created at `<now_local>_daily.md` (e.g. `20260413-083000_daily.md`) containing `# ` and opened in the editor.
- The `_daily` suffix is permanent — it identifies the file type and is not renamed after editing.

### R21: Todo Command Behavior

**Task Numbering:**
- Tasks are numbered starting from 1 (oldest task = 1)
- Tasks are sorted by their `moment` field in ascending order
- Output format: `[N] --- file.md:line ---` followed by task content

**Future Task Filtering:**
- By default, tasks with `moment > now` are hidden from the listing
- The `--show-future` flag includes all tasks regardless of their moment
- When using `todo N edit` or `todo N done`, all tasks (including future) are considered for number lookup

**Edit Action (`todo N edit`):**
- Opens the task's file in `$EDITOR` (defaults to `vi`)
- Uses `+LINE` argument to position cursor at task's start line
- Errors if N is 0 or exceeds the task count

**Done Action (`todo N done`):**
- Reads the file and modifies the line at task's start_line
- Inserts ` @Done` immediately after `@Task`
- Preserves trailing newline if the original file had one
- Errors if:
  - N is 0 or exceeds the task count
  - Multiple `@Task` markers found on the same line
  - No `@Task` marker found on the expected line
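
The done action could be sketched as a pure function over the file contents. This assumes `start_line` is 1-based and that `@Task` must match as a whole tag (so `@TaskForce` does not count); the function name `mark_done` is hypothetical, and range checking of N is left to the caller.

```python
import re

# "@Task" not followed by another tag character (see the R1 pattern).
TASK_RE = re.compile(r"@Task(?![^\s*\x60~\[\]])")


def mark_done(text: str, start_line: int) -> str:
    """Insert ' @Done' right after the single @Task on the given line."""
    lines = text.split("\n")     # split/join keeps a trailing newline intact
    index = start_line - 1       # assumption: start_line is 1-based
    matches = list(TASK_RE.finditer(lines[index]))
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one @Task on line {start_line}, "
            f"found {len(matches)}"
        )
    end = matches[0].end()
    lines[index] = lines[index][:end] + " @Done" + lines[index][end:]
    return "\n".join(lines)


print(mark_done("# Notes\n- @Task write report\n", 2), end="")
# # Notes
# - @Task @Done write report
```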

---

## Application Configuration

### R22: Config File Location

The application configuration is stored at `~/.config/streamd/config.toml`:

```toml
base_folder = "/path/to/stream/files"
```

### R23: Environment Variable Override

When set, the `STREAMD_BASE_FOLDER` environment variable overrides the `base_folder` setting from the config file.

---

## Configuration Merging

### R24: Configuration Composition

Multiple configurations can be merged:

- Dimensions are combined (later configs can add new dimensions)
- Markers are combined (later configs can add new markers)
- This allows base configuration + domain-specific extensions

---

## LSP Server

### R25: LSP Subcommand

`streamd lsp` starts a Language Server Protocol server over stdin/stdout.

**Workspace root resolution:**
- The base folder is taken from `initializeParams.rootUri` (or `rootPath` as fallback).
- R22/R23 global config resolution is bypassed in LSP mode.

**Passive mode:**
- If `.streamd.toml` is absent from the workspace root, the server enters passive mode: all requests return empty results and no diagnostics are published.

**Config watching:**
- The server registers a `workspace/didChangeWatchedFiles` watcher for `.streamd.toml`.
- Config is reloaded without restarting the server when `.streamd.toml` changes.

**Document sync:**
- Full-document sync (`TextDocumentSyncKind::FULL`).
- Re-parses on `didOpen`, `didChange`, and `didSave`.

### R25a: LSP Completion

- Trigger character: `@`
- Returns marker names from the merged config (BasicTimesheetConfiguration + TaskConfiguration).
- Conditional suggestions: if marker A is on the line and A has placements with `if_with: {B}`, B is offered with higher priority.
- Temporal snippets: `@` followed by a digit offers `YYYYMMDD` and `HHMMSS` format snippets (R16).

### R25b: LSP Diagnostics

- **File-name format (R15)**: Warning when the file basename does not match `^(\d{8})(?:-(\d{4,6}))?.+\.md$`.
- **Timesheet violations (R18)**: Error when a day ends without a break; Warning for overlapping timecards.

### R25c: LSP Document Symbols

- Returns the `LocalizedShard` tree as nested `DocumentSymbol` nodes.
- Symbol names are derived from marker names or tag names.

### R25d: LSP Code Actions

- "Mark task as done": offered on any line containing `@Task` without `@Done`; inserts ` @Done` after `@Task`.

### R25e: LSP Cross-file Features

- `workspace/symbol`: searches all `.md` files in base folder (depth 1) for shards matching the query.
- `textDocument/references`: finds all occurrences of the `@Marker` under the cursor across the workspace.
- `textDocument/rename`: renames an `@Marker` across all files via `WorkspaceEdit`.