Pipeline simplification - Needs testing!
154
ProjectNotes/PipelineRefactoringPlan.md
Normal file
@@ -0,0 +1,154 @@
# Revised Refactoring Plan: Processing Pipeline

**Overall Goal:** To simplify the processing pipeline by refactoring the map merging process, consolidating map transformations (Gloss-to-Rough, Normal Green Invert), and creating a unified, configurable image saving utility. This plan aims to improve clarity, significantly reduce I/O by favoring in-memory operations, and make Power-of-Two (POT) scaling an optional, integrated step.

**I. Map Merging Stage (`processing/pipeline/stages/map_merging.py`)**

* **Objective:** Transform this stage from performing merges to generating tasks for merged images.
* **Changes to `MapMergingStage.execute()`:**
    1. Iterate through `context.config_obj.map_merge_rules`.
    2. Identify the required input map types and find their corresponding source file paths (either the original paths or the outputs of prior essential stages, if any).
    3. Create "merged image tasks" and add them to `context.merged_image_tasks`.
    4. Each task entry will contain:
        * `output_map_type`: Target map type (e.g., "MAP_NRMRGH").
        * `input_map_sources`: Details of source map types and file paths.
        * `merge_rule_config`: Complete merge rule configuration (including fallback values).
        * `source_dimensions`: Dimensions for the high-resolution merged map basis.
        * `source_bit_depths`: Information about the bit depth of original source maps (needed for the "respect_inputs" rule in the save utility).

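The task entries described above could be modelled roughly as follows. This is a minimal sketch, not the project's implementation: the `MergedImageTask` and `SourceMap` classes, the `build_merge_tasks` helper, the rule keys `"inputs"` and `"output_map_type"`, and the choice to skip rules with no available inputs are all assumptions made for illustration. Only the five field names come from the plan.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class SourceMap:
    """Hypothetical description of one already-discovered source file."""
    path: Path
    dimensions: Tuple[int, int]  # (width, height)
    bit_depth: int


@dataclass
class MergedImageTask:
    """A merge that is described here but executed later, in memory."""
    output_map_type: str
    input_map_sources: Dict[str, Optional[Path]]  # None marks a missing input
    merge_rule_config: Dict[str, Any]
    source_dimensions: Tuple[int, int]
    source_bit_depths: Dict[str, int] = field(default_factory=dict)


def build_merge_tasks(
    merge_rules: List[Dict[str, Any]],
    available: Dict[str, SourceMap],
) -> List[MergedImageTask]:
    """Turn merge rules into task entries without touching pixel data."""
    tasks: List[MergedImageTask] = []
    for rule in merge_rules:
        found = {t: available[t] for t in rule["inputs"] if t in available}
        if not found:
            continue  # assumption: a rule with no inputs at all is skipped
        # Assumption: the largest input (by area) is the high-resolution basis.
        largest = max(
            (s.dimensions for s in found.values()),
            key=lambda d: d[0] * d[1],
        )
        tasks.append(
            MergedImageTask(
                output_map_type=rule["output_map_type"],
                input_map_sources={
                    t: (found[t].path if t in found else None)
                    for t in rule["inputs"]
                },
                merge_rule_config=rule,
                source_dimensions=largest,
                source_bit_depths={t: s.bit_depth for t, s in found.items()},
            )
        )
    return tasks
```

Keeping missing inputs as `None` rather than dropping them lets the later stage decide whether to substitute the fallback values carried in `merge_rule_config`.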
**II. Individual Map Processing Stage (`processing/pipeline/stages/individual_map_processing.py`)**

* **Objective:** Adapt this stage to handle both individual raw maps and `merged_image_tasks`. It will perform necessary in-memory transformations (Gloss-to-Rough, Normal Green Invert) and prepare a single "high-resolution" source image (in memory) to be passed to the `UnifiedSaveUtility`.
* **Changes to `IndividualMapProcessingStage.execute()`:**
    1. **Input Handling Loop:** Iterate through `context.files_to_process` (regular maps) and `context.merged_image_tasks`.
    2. **Image Data Preparation:**
        * **For regular maps:** Load the source image file into memory (`current_image_data`). Determine `base_map_type` from the `FileRule`. Determine source bit depth.
        * **For `merged_image_tasks`:**
            * Attempt to load input map files specified in `input_map_sources`. If a file is missing, log a warning and generate placeholder data using fallback values from `merge_rule_config`. Handle other load errors.
            * Check dimensions of loaded/fallback data. Apply `MERGE_DIMENSION_MISMATCH_STRATEGY` (e.g., resize, log warning) or handle the "ERROR_SKIP" strategy (log error, mark task failed, continue).
            * Perform the merge operation in memory according to `merge_rule_config`. The result is `current_image_data`. `base_map_type` is the task's `output_map_type`.
    3. **In-Memory Transformations:**
        * **Gloss-to-Rough Conversion:**
            * If `base_map_type` starts with "MAP_GLOSS":
                * Perform inversion on `current_image_data` (in memory).
                * Update `base_map_type` to "MAP_ROUGH".
                * Log the conversion.
        * **Normal Map Green Channel Inversion:**
            * If `base_map_type` is "NORMAL" *and* `context.config_obj.general_settings.invert_normal_map_green_channel_globally` is true:
                * Perform green channel inversion on `current_image_data` (in memory).
                * Log the inversion.
    4. **Optional Initial Scaling (POT or other):**
        * Check `INITIAL_SCALING_MODE` from config.
        * If `"POT_DOWNSCALE"`: Perform POT downscaling on `current_image_data` (in memory) -> `image_to_save`.
        * If `"NONE"`: `image_to_save` = `current_image_data`.
        * *(Note: `image_to_save` now reflects any prior transformations)*.
    5. **Color Management:** Apply necessary color management to `image_to_save`.
    6. **Pass to Save Utility:** Pass `image_to_save`, the (potentially updated) `base_map_type`, original source bit depth info (for the "respect_inputs" rule), and other necessary details (like specific config values) to the `UnifiedSaveUtility`.
    7. **Remove Old Logic:** Remove the old save logic and the calls to the separate Gloss-to-Rough and Normal Green Channel stages.
    8. **Context Update:** Update `context.processed_maps_details` with results from the `UnifiedSaveUtility`, including notes about any conversions/inversions performed or merge task failures.

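The in-memory steps above (fallback-aware merging, gloss inversion, green-channel inversion, and choosing a POT target size) might look like the NumPy sketch below. All function names here are hypothetical, the plan does not specify them. The sketch also assumes integer images are inverted about their dtype's maximum, float images are normalised to [0, 1], fallback planes are 8-bit, and POT downscaling rounds each edge down to the nearest power of two. The actual resize call is left out.

```python
import numpy as np


def invert_gloss(image: np.ndarray) -> np.ndarray:
    """Gloss -> roughness: invert values about the full range of the dtype."""
    if np.issubdtype(image.dtype, np.integer):
        return np.iinfo(image.dtype).max - image
    return 1.0 - image  # assumption: float data lies in [0, 1]


def invert_green(image: np.ndarray) -> np.ndarray:
    """Invert channel index 1, which is green in both RGB and BGR layouts."""
    out = image.copy()
    out[..., 1] = invert_gloss(out[..., 1])
    return out


def pot_floor(n: int) -> int:
    """Largest power of two that is <= n (n must be >= 1)."""
    return 1 << (n.bit_length() - 1)


def pot_downscale_size(width: int, height: int) -> tuple:
    """Target size for POT downscaling; never larger than the input."""
    return pot_floor(width), pot_floor(height)


def merge_channels(inputs: dict, channel_order: list,
                   fallbacks: dict, shape: tuple) -> np.ndarray:
    """Stack single-channel planes, substituting a constant for missing ones."""
    planes = []
    for name in channel_order:
        plane = inputs.get(name)
        if plane is None:
            # Missing input: fill with the rule's fallback value instead.
            plane = np.full(shape, fallbacks[name], dtype=np.uint8)
        planes.append(plane)
    return np.stack(planes, axis=-1)
```

Because every helper returns a new array, the stage can chain them on `current_image_data` without writing intermediate files.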
**III. Unified Image Save Utility (New file: `processing/utils/image_saving_utils.py`)**

* **Objective:** Centralize all image saving logic (resolution variants, format, bit depth, compression).
* **Interface (e.g., `save_image_variants` function):**
    * **Inputs:**
        * `source_image_data (np.ndarray)`: High-res image data (in memory, potentially transformed).
        * `base_map_type (str)`: Final map type (e.g., "COL", "ROUGH", "NORMAL", "MAP_NRMRGH").
        * `source_bit_depth_info (list)`: List of original source bit depth(s).
        * Specific config values (e.g., `image_resolutions: dict`, `file_type_defs: dict`, `output_format_8bit: str`, etc.).
        * `output_filename_pattern_tokens (dict)`.
        * `output_base_directory (Path)`.
    * **Core Functionality:**
        1. Use provided configuration inputs.
        2. Determine Target Bit Depth:
            * Use `bit_depth_rule` for `base_map_type` from `file_type_defs`.
            * If "force_8bit": target 8-bit.
            * If "respect_inputs": If `any(depth > 8 for depth in source_bit_depth_info)`, target 16-bit, else 8-bit.
        3. Determine Output File Format(s) (based on target bit depth, config).
        4. Generate and Save Resolution Variants:
            * Iterate through `image_resolutions`.
            * Resize `source_image_data` (in memory) for each variant (no upscaling).
            * Construct filename and path.
            * Prepare save parameters.
            * Convert variant data to target bit depth/color space just before saving.
            * Save variant using `cv2.imwrite` or similar.
            * Discard in-memory variant after saving.
        5. Return List of Saved File Details: `{'path': str, 'resolution_key': str, 'format': str, 'bit_depth': int, 'dimensions': (w,h)}`.
    * **Memory Management:** Holds `source_image_data` + one variant in memory at a time.

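The two decisions the utility makes before writing anything (target bit depth and which resolution variants to produce) can be sketched in pure Python. The function names are hypothetical. The sketch assumes `image_resolutions` maps a resolution key to a longest-edge size in pixels, and that "no upscaling" means variants larger than the source are skipped rather than clamped; the plan does not settle either point.

```python
from typing import Dict, Iterable, List, Tuple


def determine_target_bit_depth(bit_depth_rule: str,
                               source_bit_depth_info: Iterable[int]) -> int:
    """Apply the plan's 'force_8bit' / 'respect_inputs' rules."""
    if bit_depth_rule == "force_8bit":
        return 8
    if bit_depth_rule == "respect_inputs":
        return 16 if any(depth > 8 for depth in source_bit_depth_info) else 8
    raise ValueError(f"Unknown bit_depth_rule: {bit_depth_rule!r}")


def plan_variants(source_dims: Tuple[int, int],
                  image_resolutions: Dict[str, int]
                  ) -> List[Tuple[str, Tuple[int, int]]]:
    """Return (resolution_key, (width, height)) pairs, largest first."""
    width, height = source_dims
    longest = max(width, height)
    plans = []
    ordered = sorted(image_resolutions.items(),
                     key=lambda item: item[1], reverse=True)
    for key, target in ordered:
        if target > longest:
            continue  # assumption: skip instead of upscaling
        scale = target / longest
        plans.append((key, (max(1, round(width * scale)),
                            max(1, round(height * scale)))))
    return plans


if __name__ == "__main__":
    print(determine_target_bit_depth("respect_inputs", [8, 16]))
    print(plan_variants((2048, 1024), {"4K": 4096, "2K": 2048, "1K": 1024}))
```

Planning the sizes up front keeps the save loop simple: it resizes, converts, writes, and discards one variant per planned entry.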
**IV. Configuration Changes (`config/app_settings.json`)**

1. **Add/Confirm Settings:**
    * `"INITIAL_SCALING_MODE": "POT_DOWNSCALE"` (Options: "POT_DOWNSCALE", "NONE").
    * `"MERGE_DIMENSION_MISMATCH_STRATEGY": "USE_LARGEST"` (Options: "USE_LARGEST", "USE_FIRST", "ERROR_SKIP").
    * Ensure `general_settings.invert_normal_map_green_channel_globally` exists (boolean).
2. **Review/Confirm Existing Settings:**
    * Ensure `IMAGE_RESOLUTIONS`, `FILE_TYPE_DEFINITIONS` (`bit_depth_rule`), `MAP_MERGE_RULES` (`output_bit_depth`, fallback values), format settings, quality settings are comprehensive.
3. **Remove Obsolete Setting:**
    * `RESPECT_VARIANT_MAP_TYPES`.

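A small check like the one below could confirm that a settings file matches the changes listed above. It is only a sketch: `validate_settings` is a hypothetical helper, and the exact nesting of `general_settings` inside `app_settings.json` is assumed from the dotted name used in the plan.

```python
import json

# Allowed values taken from the options listed in the plan.
ALLOWED_VALUES = {
    "INITIAL_SCALING_MODE": {"POT_DOWNSCALE", "NONE"},
    "MERGE_DIMENSION_MISMATCH_STRATEGY": {"USE_LARGEST", "USE_FIRST", "ERROR_SKIP"},
}


def validate_settings(settings: dict) -> list:
    """Return a list of human-readable problems; empty means the file is fine."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        value = settings.get(key)
        if value not in allowed:
            problems.append(f"{key}={value!r} is not one of {sorted(allowed)}")
    # Assumption: the flag lives under a top-level "general_settings" object.
    flag = settings.get("general_settings", {}).get(
        "invert_normal_map_green_channel_globally")
    if not isinstance(flag, bool):
        problems.append(
            "general_settings.invert_normal_map_green_channel_globally "
            "must be a boolean")
    if "RESPECT_VARIANT_MAP_TYPES" in settings:
        problems.append("RESPECT_VARIANT_MAP_TYPES is obsolete and should be removed")
    return problems


EXAMPLE_SETTINGS = json.loads("""
{
    "INITIAL_SCALING_MODE": "POT_DOWNSCALE",
    "MERGE_DIMENSION_MISMATCH_STRATEGY": "USE_LARGEST",
    "general_settings": {"invert_normal_map_green_channel_globally": false}
}
""")

if __name__ == "__main__":
    print(validate_settings(EXAMPLE_SETTINGS))
```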
**V. Data Flow Diagram (Mermaid)**

```mermaid
graph TD
    A[Start Asset Processing] --> B[File Rules Filter];
    B --> STAGE_INDIVIDUAL_MAP_PROCESSING[Individual Map Processing Stage];

    subgraph STAGE_INDIVIDUAL_MAP_PROCESSING [Individual Map Processing Stage]
        direction LR
        C1{Is it a regular map or merged task?}
        C1 -- Regular Map --> C2["Load Source Image File into Memory (current_image_data)"];
        C1 -- "Merged Task (from Map Merging Stage)" --> C3["Load Inputs (Handle Missing w/ Fallbacks) & Merge in Memory (Handle Dim Mismatch) (current_image_data)"];

        C2 --> C4[current_image_data];
        C3 --> C4;

        C4 --> C4_TRANSFORM{Transformations?};
        C4_TRANSFORM -- "Gloss Map?" --> C4a["Invert Data (in memory), Update base_map_type to ROUGH"];
        C4_TRANSFORM -- "Normal Map & Invert Config?" --> C4b["Invert Green Channel (in memory)"];
        C4_TRANSFORM -- No Transformation Needed --> C4_POST_TRANSFORM;
        C4a --> C4_POST_TRANSFORM;
        C4b --> C4_POST_TRANSFORM;

        C4_POST_TRANSFORM["current_image_data (potentially transformed)"] --> C5{INITIAL_SCALING_MODE};
        C5 -- "POT_DOWNSCALE" --> C6["Perform POT Scale (in memory) --> image_to_save"];
        C5 -- "NONE" --> C7[image_to_save = current_image_data];

        C6 --> C8["Apply Color Management to image_to_save (in memory)"];
        C7 --> C8;

        C8 --> UNIFIED_SAVE_UTILITY[Call Unified Save Utility with image_to_save, final base_map_type, source bit depth info, config];
    end

    UNIFIED_SAVE_UTILITY --> H[Update context.processed_maps_details with list of saved files & notes];
    H --> STAGE_METADATA_SAVE[Metadata Finalization & Save Stage];

    STAGE_MAP_MERGING[Map Merging Stage] --> N{Identify Merge Rules};
    N --> O["Create Merged Image Tasks (incl. inputs, config, source bit depths)"];
    %% Feed tasks
    O --> STAGE_INDIVIDUAL_MAP_PROCESSING;

    A --> STAGE_OTHER_INITIAL[Other Initial Stages]
    STAGE_OTHER_INITIAL --> STAGE_MAP_MERGING;

    STAGE_METADATA_SAVE --> Z[End Asset Processing];

    subgraph UNIFIED_SAVE_UTILITY_DETAILS ["Unified Save Utility (processing.utils.image_saving_utils)"]
        direction TB
        INPUTS[Input: in-memory image_to_save, final base_map_type, source_bit_depth_info, config_params, tokens, out_base_dir]
        INPUTS --> CONFIG_LOAD[1. Use Provided Config Params]
        CONFIG_LOAD --> DETERMINE_BIT_DEPTH["2. Determine Target Bit Depth (using rule & source_bit_depth_info)"]
        DETERMINE_BIT_DEPTH --> DETERMINE_FORMAT[3. Determine Output Format]
        DETERMINE_FORMAT --> LOOP_VARIANTS[4. For each Resolution:]
        LOOP_VARIANTS --> RESIZE_VARIANT["4a. Resize image_to_save to Variant (in memory)"]
        RESIZE_VARIANT --> PREPARE_SAVE[4b. Prepare Filename & Save Params]
        PREPARE_SAVE --> SAVE_IMAGE[4c. Convert & Save Variant to Disk]
        SAVE_IMAGE --> LOOP_VARIANTS;
        LOOP_VARIANTS --> OUTPUT_LIST[5. Return List of Saved File Details]
    end

    style STAGE_INDIVIDUAL_MAP_PROCESSING fill:#f9f,stroke:#333,stroke-width:2px;
    style STAGE_MAP_MERGING fill:#f9f,stroke:#333,stroke-width:2px;
    style UNIFIED_SAVE_UTILITY fill:#ccf,stroke:#333,stroke-width:2px;
    style UNIFIED_SAVE_UTILITY_DETAILS fill:#ccf,stroke:#333,stroke-width:1px,stroke-dasharray:5 5;
    style O fill:lightgrey,stroke:#333,stroke-width:2px;
    style C4_POST_TRANSFORM fill:#e6ffe6,stroke:#333,stroke-width:1px;
```
@@ -1,181 +0,0 @@
# Project Plan: Modularizing the Asset Processing Engine

**Last Updated:** May 9, 2025

**1. Project Vision & Goals**

* **Vision:** Transform the asset processing pipeline into a highly modular, extensible, and testable system.
* **Primary Goals:**
    1. Decouple processing steps into independent, reusable stages.
    2. Simplify the addition of new processing capabilities (e.g., GLOSS > ROUGH conversion, Alpha to MASK, Normal Map Green Channel inversion).
    3. Improve code maintainability and readability.
    4. Enhance unit and integration testing capabilities for each processing component.
    5. Centralize common utility functions (image manipulation, path generation).

**2. Proposed Architecture Overview**

* **Core Concept:** A `PipelineOrchestrator` will manage a sequence of `ProcessingStage`s. Each stage will operate on an `AssetProcessingContext` object, which carries all necessary data and state for a single asset through the pipeline.
* **Key Components:**
    * `AssetProcessingContext`: Data class holding asset-specific data, configuration, temporary paths, and status.
    * `PipelineOrchestrator`: Class to manage the overall processing flow for a `SourceRule`, iterating through assets and executing the pipeline of stages for each.
    * `ProcessingStage` (Base Class/Interface): Defines the contract for all individual processing stages (e.g., `execute(context)` method).
    * Specific Stage Classes: (e.g., `SupplierDeterminationStage`, `IndividualMapProcessingStage`, etc.)
    * Utility Modules: `image_processing_utils.py`, enhancements to `utils/path_utils.py`.

**3. Proposed File Structure**

* `processing/`
    * `pipeline/`
        * `__init__.py`
        * `asset_context.py` (Defines `AssetProcessingContext`)
        * `orchestrator.py` (Defines `PipelineOrchestrator`)
        * `stages/`
            * `__init__.py`
            * `base_stage.py` (Defines `ProcessingStage` interface)
            * `supplier_determination.py`
            * `asset_skip_logic.py`
            * `metadata_initialization.py`
            * `file_rule_filter.py`
            * `gloss_to_rough_conversion.py`
            * `alpha_extraction_to_mask.py`
            * `normal_map_green_channel.py`
            * `individual_map_processing.py`
            * `map_merging.py`
            * `metadata_finalization.py`
            * `output_organization.py`
    * `utils/`
        * `__init__.py`
        * `image_processing_utils.py` (New module for image functions)
* `utils/` (Top-level existing directory)
    * `path_utils.py` (To be enhanced with `sanitize_filename` from `processing_engine.py`)

**4. Detailed Phases and Tasks**

**Phase 0: Setup & Core Structures Definition**
*Goal: Establish the foundational classes for the new pipeline.*
* **Task 0.1: Define `AssetProcessingContext`**
    * Create `processing/pipeline/asset_context.py`.
    * Define the `AssetProcessingContext` data class with fields: `source_rule: SourceRule`, `asset_rule: AssetRule`, `workspace_path: Path`, `engine_temp_dir: Path`, `output_base_path: Path`, `effective_supplier: Optional[str]`, `asset_metadata: Dict`, `processed_maps_details: Dict[str, Dict[str, Dict]]`, `merged_maps_details: Dict[str, Dict[str, Dict]]`, `files_to_process: List[FileRule]`, `loaded_data_cache: Dict`, `config_obj: Configuration`, `status_flags: Dict`, `incrementing_value: Optional[str]`, `sha5_value: Optional[str]`.
    * Ensure proper type hinting.
* **Task 0.2: Define `ProcessingStage` Base Class/Interface**
    * Create `processing/pipeline/stages/base_stage.py`.
    * Define an abstract base class `ProcessingStage` with an abstract method `execute(self, context: AssetProcessingContext) -> AssetProcessingContext`.
* **Task 0.3: Implement Initial `PipelineOrchestrator`**
    * Create `processing/pipeline/orchestrator.py`.
    * Define the `PipelineOrchestrator` class.
    * Implement `__init__(self, config_obj: Configuration, stages: List[ProcessingStage])`.
    * Implement `process_source_rule(self, source_rule: SourceRule, workspace_path: Path, output_base_path: Path, overwrite: bool, incrementing_value: Optional[str], sha5_value: Optional[str]) -> Dict[str, List[str]]`.
        * Handles creation/cleanup of the main engine temporary directory.
        * Loops through `source_rule.assets`, initializes `AssetProcessingContext` for each.
        * Iterates `self.stages`, calling `stage.execute(context)`.
        * Collects overall status.

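The three structures defined in Phase 0 fit together roughly as in the sketch below. It is deliberately reduced: the context carries only a hypothetical `asset_name` plus `status_flags` and `processed_maps_details`, the orchestrator method is a simplified `process_assets` rather than the full `process_source_rule` signature, and the result keys `"processed"`, `"skipped"`, and `"failed"` as well as the `"skip_asset"` flag name are assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class AssetProcessingContext:
    """Reduced stand-in for the full context described in Task 0.1."""
    asset_name: str  # hypothetical; the plan uses asset_rule: AssetRule
    status_flags: Dict[str, Any] = field(default_factory=dict)
    processed_maps_details: Dict[str, Any] = field(default_factory=dict)


class ProcessingStage(ABC):
    """Contract from Task 0.2: take a context, return a context."""

    @abstractmethod
    def execute(self, context: AssetProcessingContext) -> AssetProcessingContext:
        ...


class PipelineOrchestrator:
    def __init__(self, stages: Iterable[ProcessingStage]):
        self.stages = list(stages)

    def process_assets(self, asset_names: Iterable[str]) -> Dict[str, List[str]]:
        results: Dict[str, List[str]] = {"processed": [], "skipped": [], "failed": []}
        for name in asset_names:
            context = AssetProcessingContext(asset_name=name)
            try:
                for stage in self.stages:
                    context = stage.execute(context)
                    if context.status_flags.get("skip_asset"):
                        break  # later stages do not run for skipped assets
            except Exception:
                results["failed"].append(name)
                continue
            skipped = context.status_flags.get("skip_asset")
            results["skipped" if skipped else "processed"].append(name)
        return results


class SkipIfPrefixed(ProcessingStage):
    """Toy stage: flags assets whose name starts with an underscore."""

    def execute(self, context: AssetProcessingContext) -> AssetProcessingContext:
        if context.asset_name.startswith("_"):
            context.status_flags["skip_asset"] = True
        return context
```

A quick usage check: `PipelineOrchestrator([SkipIfPrefixed()]).process_assets(["rock", "_draft"])` sorts `rock` into the processed list and `_draft` into the skipped list.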
**Phase 1: Utility Module Refactoring**
*Goal: Consolidate and centralize common utility functions.*
* **Task 1.1: Refactor Path Utilities**
    * Move `_sanitize_filename` from `processing_engine.py` to `utils/path_utils.py`.
    * Update uses to call the new utility function.
* **Task 1.2: Create `image_processing_utils.py`**
    * Create `processing/utils/image_processing_utils.py`.
    * Move general-purpose image functions from `processing_engine.py`:
        * `is_power_of_two`
        * `get_nearest_pot`
        * `calculate_target_dimensions`
        * `calculate_image_stats`
        * `normalize_aspect_ratio_change`
        * Core image loading, BGR<>RGB conversion, generic resizing (from `_load_and_transform_source`).
        * Core data type conversion for saving, color conversion for saving, `cv2.imwrite` call (from `_save_image`).
    * Ensure functions are pure and testable.

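As an illustration of what "pure and testable" can mean for the first two helpers, here is one possible shape for them. The names come from the task list, but the behaviour is a guess: the existing `processing_engine.py` versions are not shown here, so the tie-breaking rule (round up when equally close) and the handling of values below 1 are assumptions.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ...; False for zero, negatives, and everything else."""
    return n > 0 and (n & (n - 1)) == 0


def get_nearest_pot(n: int) -> int:
    """Closest power of two to n; ties round up, values below 1 map to 1."""
    if n < 1:
        return 1
    lower = 1 << (n.bit_length() - 1)  # largest power of two <= n
    upper = lower << 1                 # smallest power of two > n
    return lower if (n - lower) < (upper - n) else upper
```

Neither function touches files or global state, so each can be covered by a handful of plain assertions in a unit test.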
**Phase 2: Implementing Core Processing Stages (Migrating Existing Logic)**
*Goal: Migrate existing functionalities from `processing_engine.py` into the new stage-based architecture.*
(For each task: create stage file, implement class, move logic, adapt to `AssetProcessingContext`)
* **Task 2.1: Implement `SupplierDeterminationStage`**
* **Task 2.2: Implement `AssetSkipLogicStage`**
* **Task 2.3: Implement `MetadataInitializationStage`**
* **Task 2.4: Implement `FileRuleFilterStage`** (New logic for `item_type == "FILE_IGNORE"`)
* **Task 2.5: Implement `IndividualMapProcessingStage`** (Adapts `_process_individual_maps`, uses `image_processing_utils.py`)
* **Task 2.6: Implement `MapMergingStage`** (Adapts `_merge_maps`, uses `image_processing_utils.py`)
* **Task 2.7: Implement `MetadataFinalizationAndSaveStage`** (Adapts `_generate_metadata_file`, uses `utils.path_utils.generate_path_from_pattern`)
* **Task 2.8: Implement `OutputOrganizationStage`** (Adapts `_organize_output_files`)

**Phase 3: Implementing New Feature Stages**
*Goal: Add the new desired processing capabilities as distinct stages.*
* **Task 3.1: Implement `GlossToRoughConversionStage`** (Identify gloss, convert, invert, save temp, update `FileRule`)
* **Task 3.2: Implement `AlphaExtractionToMaskStage`** (Check existing mask, find MAP_COL with alpha, extract, save temp, add new `FileRule`)
* **Task 3.3: Implement `NormalMapGreenChannelStage`** (Identify normal maps, invert green based on config, save temp, update `FileRule`)

**Phase 4: Integration, Testing & Finalization**
*Goal: Assemble the pipeline, test thoroughly, and deprecate old code.*
* **Task 4.1: Configure `PipelineOrchestrator`**
    * Instantiate `PipelineOrchestrator` in main application logic with the ordered list of stage instances.
* **Task 4.2: Unit Testing**
    * Unit tests for each `ProcessingStage` (mocking `AssetProcessingContext`).
    * Unit tests for `image_processing_utils.py` and `utils/path_utils.py` functions.
* **Task 4.3: Integration Testing**
    * Test `PipelineOrchestrator` end-to-end with sample data.
    * Compare outputs with the existing engine for consistency.
* **Task 4.4: Documentation Update**
    * Update developer documentation (e.g., `Documentation/02_Developer_Guide/05_Processing_Pipeline.md`).
    * Document `AssetProcessingContext` and stage responsibilities.
* **Task 4.5: Deprecate/Remove Old `ProcessingEngine` Code**
    * Gradually remove refactored logic from `processing_engine.py`.

**5. Workflow Diagram**

```mermaid
graph TD
    AA[Load SourceRule & Config] --> BA(PipelineOrchestrator: process_source_rule);
    BA --> CA{For Each Asset in SourceRule};
    CA -- Yes --> DA(Orchestrator: Create AssetProcessingContext);
    DA --> EA(SupplierDeterminationStage);
    EA -- context --> FA(AssetSkipLogicStage);
    FA -- context --> GA{context.skip_asset?};
    GA -- Yes --> HA(Orchestrator: Record Skipped);
    HA --> CA;
    GA -- No --> IA(MetadataInitializationStage);
    IA -- context --> JA(FileRuleFilterStage);
    JA -- context --> KA(GlossToRoughConversionStage);
    KA -- context --> LA(AlphaExtractionToMaskStage);
    LA -- context --> MA(NormalMapGreenChannelStage);
    MA -- context --> NA(IndividualMapProcessingStage);
    NA -- context --> OA(MapMergingStage);
    OA -- context --> PA(MetadataFinalizationAndSaveStage);
    PA -- context --> QA(OutputOrganizationStage);
    QA -- context --> RA(Orchestrator: Record Processed/Failed);
    RA --> CA;
    CA -- No --> SA(Orchestrator: Cleanup Engine Temp Dir);
    SA --> TA[Processing Complete];

    subgraph Stages
        direction LR
        EA
        FA
        IA
        JA
        KA
        LA
        MA
        NA
        OA
        PA
        QA
    end

    subgraph Utils
        direction LR
        U1[image_processing_utils.py]
        U2[utils/path_utils.py]
    end

    NA -.-> U1;
    OA -.-> U1;
    KA -.-> U1;
    LA -.-> U1;
    MA -.-> U1;

    PA -.-> U2;
    QA -.-> U2;

    classDef context fill:#f9f,stroke:#333,stroke-width:2px;
    class DA,EA,FA,IA,JA,KA,LA,MA,NA,OA,PA,QA context;
```