Aruuz Nigar — Pure Engine Execution Flow
Scope
This document describes the pure Aruuz scansion engine flow, excluding Flask, web routes, templates, or any UI concerns. It explains how input text is transformed into Aruuz (prosodic scansion) once it enters the engine layer, and how data flows across engine components until final scansion results are produced.
This document assumes: - Input is already available as cleaned line strings - The caller is responsible for input/output presentation
Engine Entry Point (Conceptual)
Conceptual entry:
List[str] → Aruuz Engine → List[scanOutput]
In the current codebase, the engine is entered indirectly via Flask. Conceptually, however, the engine begins when:
- A Scansion object is created
- One or more Lines objects are added
- scan_lines() is invoked
Phase 1: Line → Words
Lines Object Creation
Class: Lines
Responsibility: - Convert a raw line string into a structured list of words - Perform text cleaning and normalization
Flow:
1. Clean punctuation and zero-width characters
2. Split line into word tokens
3. Normalize word forms
4. Create Words objects
Output:
- Lines.original_line
- Lines.words_list: List[Words]
At this stage, no scansion logic exists. The engine is still purely lexical.
Phase 2: Engine Initialization
Scansion Object
Class: Scansion
Responsibility: - Maintain engine-wide state - Coordinate scansion phases - Provide access to database lookup
Key State:
- lst_lines: all lines to be scanned
- word_lookup: database interface (optional)
- Mode flags: free_verse, fuzzy
No computation occurs here beyond setup.
Phase 3: Line Registration
Adding Lines
Method: Scansion.add_line()
Responsibility:
- Register Lines objects for later processing
This stage merely accumulates input; no scansion is performed yet.
Phase 4: Global Scan Invocation
scan_lines()
Method: Scansion.scan_lines()
Responsibility: - Drive the full scansion process - Aggregate results across all lines
High-level flow: 1. Iterate through all registered lines 2. Scan each line independently 3. Collect all scan results 4. Select dominant meter (crunch)
Phase 5: Single-Line Scansion
scan_line()
This is the core engine pipeline for a single poetic line.
Step 5.1: Word → Code Assignment
Method: Scansion.word_code()
For each word:
1. If codes already exist, skip
2. Attempt database lookup
3. If found:
- Convert stored taqti to codes
- Attach variations if marked is_varied
4. If not found:
- Apply heuristic syllable analysis
Output:
- Each Words object now contains one or more scansion codes
This step transforms textual words into symbolic rhythm units.
Step 5.2: Contextual Word Adjustments
After raw codes are assigned, inter-word rules are applied:
- Al (ال) handling
- Izafat handling
- Ataf (و) handling
- Word grafting (وصال الف)
These rules: - Modify word codes - Introduce alternative code paths - Reflect pronunciation-dependent scansion behavior
At the end of this step, ambiguity is fully materialized.
Phase 6: Code Combination Explosion
CodeTree Construction
Class: CodeTree
Responsibility: - Represent all possible code combinations for the line
Concept: - Each word may have multiple codes - Each code introduces a branch - The tree encodes the Cartesian product of possibilities
This transforms linear code lists into a search space.
Phase 7: Meter Matching
Tree Traversal
Method: CodeTree.find_meter()
Responsibility: - Traverse all valid code paths - Match each path against known meter patterns
Key Behavior:
- x syllables act as wildcards
- PatternTree expands ambiguous matches
- Multiple meters may match a single path
Output:
- scanPath objects containing:
- Concrete code sequences
- Matching meter indices
Phase 8: scanPath → scanOutput
Each valid scan path is converted into a human-meaningful result.
scanOutput contains: - Word-by-word taqti - Full scansion code - Meter name - Feet breakdown
At this point, scansion is fully resolved for the line.
Phase 9: Cross-Line Consolidation
crunch()
Responsibility: - Enforce the classical assumption of a dominant meter - Score meter consistency across lines - Discard non-dominant meters
This step transforms multiple local truths into a global poetic interpretation.
Phase 10: Engine Output
Final Output:
List[scanOutput]
Each scanOutput represents:
- One line
- One meter
- Fully resolved Aruuz scansion
The engine does no formatting, rendering, or UI decisions.
Conceptual Summary
Lines (text)
↓
Words (lexical units)
↓
Codes (= - x)
↓
CodeTree (combinatorial space)
↓
scanPath (valid rhythmic paths)
↓
scanOutput (metrical meaning)
End of Pure Engine Flow Document