Asset-Frameworker/ProjectNotes/Data_Flow_Refinement_Plan.md
Rusfort 6971b8189f Data Flow Overhaul
Known regressions in current commit:
- No "extra" files
- GLOSS map does not look corrected
- "override" flag is not respected
2025-05-01 09:13:20 +02:00

Architectural Plan: Data Flow Refinement (v3)

Date: 2025-04-30

Author: Roo (Architect Mode)

Status: Approved

1. Goal

Refine the application's data flow to establish the GUI as the single source of truth for processing rules. This involves moving prediction/preset logic upstream from the backend processor and ensuring the backend receives a complete SourceRule object for processing, thereby simplifying the processor itself. This version of the plan involves creating a new processing module (processing_engine.py) instead of refactoring the existing asset_processor.py.

2. Proposed Data Flow

The refined data flow centralizes rule generation and modification within the GUI components before passing a complete, explicit rule set to the backend. The SourceRule object structure serves as a consistent data contract throughout the pipeline.

sequenceDiagram
    participant User
    participant GUI_MainWindow as GUI (main_window.py)
    participant GUI_Predictor as Predictor (prediction_handler.py)
    participant GUI_UnifiedView as Unified View (unified_view_model.py)
    participant Main as main.py
    participant ProcessingEngine as New Backend (processing_engine.py)
    participant Config as config.py

    User->>+GUI_MainWindow: Selects Input & Preset
    Note over GUI_MainWindow: Scans input, gets file list
    GUI_MainWindow->>+GUI_Predictor: Request Prediction(File List, Preset Name, Input ID)
    GUI_Predictor->>+Config: Load Preset Rules & Canonical Types
    Config-->>-GUI_Predictor: Return Rules & Types
    %% Prediction Logic (Internal to Predictor)
    Note over GUI_Predictor: Perform file analysis (based on list), apply preset rules, generate COMPLETE SourceRule hierarchy (only overridable fields populated)
    GUI_Predictor-->>-GUI_MainWindow: Return List[SourceRule] (Initial Rules)
    GUI_MainWindow->>+GUI_UnifiedView: Populate View(List[SourceRule])
    GUI_UnifiedView->>+Config: Read Allowed Asset/File Types for Dropdowns
    Config-->>-GUI_UnifiedView: Return Allowed Types
    Note over GUI_UnifiedView: Display rules, allow user edits
    User->>GUI_UnifiedView: Modifies Rules (Overrides)
    GUI_UnifiedView-->>GUI_MainWindow: Update SourceRule Objects in Memory
    User->>+GUI_MainWindow: Trigger Processing
    GUI_MainWindow->>+Main: Send Final List[SourceRule]
    Main->>+ProcessingEngine: Queue Task(SourceRule) for each input
    Note over ProcessingEngine: Execute processing based *solely* on the provided SourceRule and static config. No internal prediction/fallback.
    ProcessingEngine-->>-Main: Processing Result
    Main-->>-GUI_MainWindow: Update Status
    GUI_MainWindow-->>User: Show Result/Status
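
The data contract passed along this pipeline can be pictured as a small dataclass hierarchy. The following is a minimal sketch, not the project's actual rule_structure.py: the override field names come from section 3 of this plan, but which class owns each field, the `input_path`, `asset_name`, `file_path`, `assets` and `files` members, and the `validate` helper are assumptions made for illustration. ALLOWED_FILE_TYPES repeats only the entries this plan spells out.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Canonical lists as named in this plan (config.py). The plan elides part of
# ALLOWED_FILE_TYPES, so only the entries it names explicitly appear here.
ALLOWED_ASSET_TYPES = ["Surface", "Model", "Decal", "Atlas", "UtilityMap"]
ALLOWED_FILE_TYPES = ["MAP_COL", "MAP_NRM", "MODEL", "EXTRA", "FILE_IGNORE"]


@dataclass
class FileRule:
    file_path: str  # assumed field: path relative to the input root
    item_type_override: Optional[str] = None
    target_asset_name_override: Optional[str] = None
    output_format_override: Optional[str] = None


@dataclass
class AssetRule:
    asset_name: str  # assumed field
    asset_type: Optional[str] = None
    files: List[FileRule] = field(default_factory=list)


@dataclass
class SourceRule:
    input_path: str  # assumed field: the input source identifier
    supplier_identifier: Optional[str] = None
    assets: List[AssetRule] = field(default_factory=list)


def validate(rule: SourceRule) -> List[str]:
    """Hypothetical helper: list every value that is not in the canonical lists."""
    errors = []
    for asset in rule.assets:
        if asset.asset_type not in ALLOWED_ASSET_TYPES:
            errors.append(f"{asset.asset_name}: unknown asset type {asset.asset_type!r}")
        for file_rule in asset.files:
            if file_rule.item_type_override not in ALLOWED_FILE_TYPES:
                errors.append(
                    f"{file_rule.file_path}: unknown file type "
                    f"{file_rule.item_type_override!r}"
                )
    return errors
```

Because the types are plain strings rather than Enums, a check of this kind is what would keep GUI edits and predictor output aligned with the lists in config.py.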

3. Module-Specific Changes

  • config.py:

    • Add Canonical Lists: Introduce ALLOWED_ASSET_TYPES (e.g., ["Surface", "Model", "Decal", "Atlas", "UtilityMap"]) and ALLOWED_FILE_TYPES (e.g., ["MAP_COL", "MAP_NRM", ..., "MODEL", "EXTRA", "FILE_IGNORE"]).
    • Purpose: Single source of truth for GUI dropdowns and validation.
    • Existing Config: Retains static definitions like IMAGE_RESOLUTIONS, MAP_MERGE_RULES, JPG_QUALITY, etc.
  • rule_structure.py:

    • Remove Enums: Remove AssetType and ItemType Enums. Update AssetRule.asset_type, FileRule.item_type_override, etc., to use string types validated against config.py lists.
    • Field Retention: Keep FileRule.resolution_override and FileRule.channel_merge_instructions fields for structural consistency, but they will not be populated or used for overrides in this flow.
  • gui/prediction_handler.py (or equivalent):

    • Enhance Prediction Logic: Modify run_prediction method.
    • Input: Accept input_source_identifier (string), file_list (List[str] of relative paths), and preset_name (string) when called from GUI.
    • Load Config: Read ALLOWED_ASSET_TYPES, ALLOWED_FILE_TYPES, and preset rules.
    • Relocate Classification: Integrate classification/naming logic (previously in asset_processor.py) to operate on the provided file_list.
    • Generate Complete Rules: Populate SourceRule, AssetRule, and FileRule objects.
      • Set initial values only for overridable fields (e.g., asset_type, item_type_override, target_asset_name_override, supplier_identifier, output_format_override) based on preset rules/defaults.
      • Explicitly do not populate static config fields like FileRule.resolution_override or FileRule.channel_merge_instructions.
    • Temporary Files (if needed for non-GUI use): Direct path inputs (CLI/Docker) may later need logic for temporary extraction and cleanup; the primary GUI flow works from the provided file list.
    • Output: Emit rule_hierarchy_ready signal with the List[SourceRule].
  • NEW: processing_engine.py (New Module):

    • Purpose: Contains a new class (e.g., ProcessingEngine) for executing the processing pipeline based solely on a complete SourceRule and static configuration. Replaces asset_processor.py in the main workflow.
    • Initialization (__init__): Takes the static Configuration object as input.
    • Core Method (process): Accepts a single, complete SourceRule object. Orchestrates processing steps (workspace setup, extraction, map processing, merging, metadata, organization, cleanup).
    • Helper Methods (Refactored Logic): Implement simplified versions of processing helpers (e.g., _process_individual_maps, _merge_maps_from_source, _generate_metadata_file, _organize_output_files, _load_and_transform_source, _save_image).
      • Retrieve overridable parameters directly from the input SourceRule.
      • Retrieve static configuration parameters (resolutions, merge rules) only from the stored Configuration object.
      • Contain no prediction, classification, or fallback logic.
    • Dependencies: rule_structure.py, configuration.py, config.py, cv2, numpy, etc.
  • asset_processor.py (Old Module):

    • Status: Remains in the codebase unchanged for reference.
    • Usage: No longer called by main.py or GUI for standard processing.
  • gui/main_window.py:

    • Scan Input: Perform an initial scan of each input directory/archive to get its file list.
    • Initiate Prediction: Call PredictionHandler with the file list, preset, and input identifier.
    • Receive/Pass Rules: Handle rule_hierarchy_ready, pass SourceRule list to UnifiedViewModel.
    • Send Final Rules: Send the final SourceRule list to main.py.
  • gui/unified_view_model.py / gui/delegates.py:

    • Load Dropdown Options: Source dropdowns (AssetType, ItemType) from config.py.
    • Data Handling: Read/write user modifications to overridable fields in SourceRule objects.
    • No UI for Static Config: Do not provide UI editing for resolution or merge instructions.
  • main.py:

    • Receive Rule List: Accept List[SourceRule] from GUI.
    • Instantiate New Engine: Import and instantiate the new ProcessingEngine from processing_engine.py.
    • Queue Tasks: Iterate over the SourceRule list and queue one task per SourceRule.
    • Call New Engine: Pass the individual SourceRule object to ProcessingEngine.process for each task.
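
The processing_engine.py and main.py items above can be sketched as follows. This is a rough illustration under stated assumptions, not the planned implementation: the rule classes are flattened stand-ins for rule_structure.py, a plain dict stands in for the static Configuration object, the shape of the returned result is invented, and treating FILE_IGNORE as "skip this file" is an assumption. The point it shows is that the engine reads overridable values from the rule, reads static values from the stored configuration, and raises instead of guessing when a rule is incomplete.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FileRule:  # flattened stand-in for rule_structure.FileRule
    file_path: str
    item_type_override: Optional[str] = None


@dataclass
class SourceRule:  # stand-in: the AssetRule level is omitted for brevity
    input_path: str
    files: List[FileRule] = field(default_factory=list)


class ProcessingEngine:
    def __init__(self, config: Dict[str, object]) -> None:
        # Static configuration is stored once; it is never taken from the rule.
        self.config = config

    def process(self, source_rule: SourceRule) -> Dict[str, object]:
        processed, skipped = [], []
        for file_rule in source_rule.files:
            # No prediction or fallback: an incomplete rule is an error.
            if file_rule.item_type_override is None:
                raise ValueError(f"incomplete rule for {file_rule.file_path}")
            if file_rule.item_type_override == "FILE_IGNORE":  # assumed meaning
                skipped.append(file_rule.file_path)
            else:
                processed.append(file_rule.file_path)
        return {
            "input": source_rule.input_path,
            "processed": processed,
            "skipped": skipped,
            # Static parameter, read only from the stored configuration.
            "resolutions": self.config["IMAGE_RESOLUTIONS"],
        }


def run_all(rules: List[SourceRule], config: Dict[str, object]) -> List[Dict[str, object]]:
    """Hypothetical main.py loop: one engine, one process() call per SourceRule."""
    engine = ProcessingEngine(config)
    return [engine.process(rule) for rule in rules]
```

A real main.py would presumably hand these calls to a task queue or worker pool rather than run them in a list comprehension; the sequential loop only keeps the sketch short.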

4. Rationale / Benefits

  • Single Source of Truth: GUI holds the final SourceRule objects.
  • Backend Simplification: New processing_engine.py is focused solely on execution based on explicit rules and static config.
  • Decoupling: Reduced coupling between GUI/prediction and backend processing.
  • Clarity: Clearer data flow and component responsibilities.
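  • Maintainability: With prediction and execution separated, a problem can be traced to either rule generation or rule execution, which eases maintenance and debugging.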
  • Centralized Definitions: config.py centralizes allowed types.
  • Preserves Reference: Keeps asset_processor.py available for comparison.
  • Consistent Data Contract: The SourceRule structure is the same from predictor output to engine input, so a non-GUI caller could potentially pass rules to the engine directly, bypassing the GUI.

5. Potential Issues / Considerations

  • PredictionHandler Complexity: Will require careful implementation of classification/rule population logic.
  • Performance: Prediction logic must stay fast enough to keep the GUI responsive; run it off the main GUI thread (threading).
  • Rule Structure Completeness: Ensure SourceRule dataclasses hold all necessary overridable fields.
  • Preset Loading: Robust preset loading/interpretation needed in PredictionHandler.
  • Static Config Loading: Ensure the new ProcessingEngine correctly loads and uses the static Configuration object.
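
To make the PredictionHandler concern more concrete, the classification step could start as simple keyword matching over the provided file list. The sketch below is an assumption-heavy illustration: the preset format (a mapping from file type to filename keywords), the `predict` function, the flattened rule classes, and the choice of "EXTRA" for unmatched files are all hypothetical and not taken from the existing asset_processor.py logic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FileRule:  # flattened stand-in for rule_structure.FileRule
    file_path: str
    item_type_override: Optional[str] = None


@dataclass
class SourceRule:  # stand-in: the AssetRule level is omitted for brevity
    input_path: str
    files: List[FileRule] = field(default_factory=list)


def predict(
    input_source_identifier: str,
    file_list: List[str],
    preset: Dict[str, List[str]],
) -> SourceRule:
    """Build an initial SourceRule from relative paths and preset keywords.

    `preset` maps a file type (e.g. "MAP_COL") to lowercase substrings that
    identify it in a filename. This format is hypothetical.
    """
    rule = SourceRule(input_path=input_source_identifier)
    for path in file_list:
        name = path.lower()
        item_type = "EXTRA"  # assumed default for files no keyword matches
        for file_type, keywords in preset.items():
            if any(keyword in name for keyword in keywords):
                item_type = file_type
                break  # first matching file type wins
        # Only the overridable field is populated; static config is left alone.
        rule.files.append(FileRule(path, item_type))
    return rule
```

Since the first matching file type wins, the ordering of preset entries affects the result, which is one of the details a real implementation would need to pin down.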

6. Documentation

This document (ProjectNotes/Data_Flow_Refinement_Plan.md) serves as the architectural plan. Relevant sections of the Developer Guide will need updating upon implementation.