Processing Pipeline Refactoring Plan
1. Problem Summary
The current processing pipeline, particularly the IndividualMapProcessingStage, exhibits maintainability challenges:
- High Complexity: The stage handles too many responsibilities (loading, merging, transformations, scaling, saving).
- Duplicated Logic: Image transformations (Gloss-to-Rough, Normal Green Invert) are duplicated within the stage instead of relying solely on dedicated stages or being handled consistently.
- Tight Coupling: Heavy reliance on the large, mutable AssetProcessingContext object creates implicit dependencies and makes isolated testing difficult.
2. Refactoring Goals
- Improve code readability and understanding.
- Enhance maintainability by localizing changes and removing duplication.
- Increase testability through smaller, focused components with clear interfaces.
- Clarify data dependencies between pipeline stages.
- Adhere more closely to the Single Responsibility Principle (SRP).
3. Proposed New Pipeline Stages
Replace the existing IndividualMapProcessingStage with the following sequence of smaller, focused stages, executed by the PipelineOrchestrator for each processing item:
- PrepareProcessingItemsStage:
  - Responsibility: Identifies and lists all items (FileRule, MergeTaskDefinition) to be processed from the main context.
  - Output: Updates context.processing_items.
- RegularMapProcessorStage: (Handles FileRule items)
  - Responsibility: Loads source image, determines internal map type (with suffix), applies relevant transformations (Gloss-to-Rough, Normal Green Invert), determines original metadata.
  - Output: ProcessedRegularMapData object containing transformed image data and metadata.
- MergedTaskProcessorStage: (Handles MergeTaskDefinition items)
  - Responsibility: Loads input images, applies transformations to inputs, handles fallbacks/resizing, performs merge operation.
  - Output: ProcessedMergedMapData object containing merged image data and metadata.
- InitialScalingStage: (Optional)
  - Responsibility: Applies configured scaling (e.g., POT downscale) to the processed image data received from the previous stage.
  - Output: Scaled image data.
- SaveVariantsStage:
  - Responsibility: Takes the final processed (and potentially scaled) image data and orchestrates saving variants using the save_image_variants utility.
  - Output: List of saved file details (saved_files_details).
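As a small illustration of the sizing arithmetic InitialScalingStage might need, here is a minimal sketch of a power-of-two (POT) downscale calculation. It assumes "POT downscale" means rounding each dimension down to the nearest power of two independently; the plan does not say whether the real implementation does this or preserves aspect ratio. The helper names pot_floor and pot_downscale_size are hypothetical.

```python
from __future__ import annotations


def pot_floor(n: int) -> int:
    """Return the largest power of two that is <= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("dimension must be at least 1")
    # bit_length() is the position of the highest set bit, so shifting 1
    # by (bit_length - 1) keeps only that bit.
    return 1 << (n.bit_length() - 1)


def pot_downscale_size(width: int, height: int) -> tuple[int, int]:
    """Target size when each axis is rounded down to a power of two.

    Assumption: axes are handled independently, so a non-square,
    non-POT image may change aspect ratio.
    """
    return pot_floor(width), pot_floor(height)
```

For example, pot_downscale_size(2000, 1000) yields (1024, 512), while an image that is already 1024 x 1024 is left at that size.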
4. Proposed Data Flow
- Input/Output Objects: Key stages (RegularMapProcessor, MergedTaskProcessor, InitialScaling, SaveVariants) will use specific Input and Output dataclasses for clearer interfaces.
- Orchestrator Role: The PipelineOrchestrator manages the overall flow. It calls stages, passes necessary data (extracting image data references and metadata from previous stage outputs to create inputs for the next), receives output objects, and integrates final results (like saved file details) back into the main AssetProcessingContext.
- Image Data Handling: Large image arrays (np.ndarray) are passed primarily via stage return values (Output objects) and used as inputs to subsequent stages, managed by the Orchestrator. They are not stored long-term in the main AssetProcessingContext.
- Main Context: The AssetProcessingContext remains for overall state (rules, paths, configuration access, final status tracking) and potentially for simpler stages with minimal side effects.
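The data flow described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the class names come from this plan, but every field, the execute method name, the stub stage bodies, and the placeholder map types are hypothetical. Image data is typed as Any so the sketch runs without NumPy; in the real pipeline it would be an np.ndarray.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FileRule:  # stand-in; the real class has more fields
    source_path: str


@dataclass
class MergeTaskDefinition:  # stand-in; the real class has more fields
    input_paths: List[str]


@dataclass
class ProcessedRegularMapData:
    image: Any  # np.ndarray in the real pipeline
    map_type: str


@dataclass
class ProcessedMergedMapData:
    image: Any
    map_type: str


@dataclass
class SaveVariantsInput:
    image: Any
    map_type: str


@dataclass
class SaveVariantsOutput:
    saved_files_details: List[Dict[str, Any]]


@dataclass
class AssetProcessingContext:
    # Only lightweight state lives here; no image arrays are stored.
    processing_items: List[Any] = field(default_factory=list)
    saved_files_details: List[Dict[str, Any]] = field(default_factory=list)


class RegularMapProcessorStage:
    def execute(self, item: FileRule) -> ProcessedRegularMapData:
        # Placeholder: the real stage loads and transforms the image.
        return ProcessedRegularMapData(image=item.source_path, map_type="regular")


class MergedTaskProcessorStage:
    def execute(self, item: MergeTaskDefinition) -> ProcessedMergedMapData:
        # Placeholder: the real stage loads inputs and merges them.
        return ProcessedMergedMapData(image=tuple(item.input_paths), map_type="merged")


class InitialScalingStage:
    def execute(self, image: Any) -> Any:
        return image  # placeholder: pass-through instead of real scaling


class SaveVariantsStage:
    def execute(self, data: SaveVariantsInput) -> SaveVariantsOutput:
        # The real stage would call save_image_variants; its signature is
        # not given in this plan, so it is not called here.
        return SaveVariantsOutput([{"map_type": data.map_type}])


class PipelineOrchestrator:
    def __init__(self, scaling: Optional[InitialScalingStage] = None) -> None:
        self.regular = RegularMapProcessorStage()
        self.merged = MergedTaskProcessorStage()
        self.scaling = scaling  # optional stage
        self.save = SaveVariantsStage()

    def run(self, context: AssetProcessingContext) -> AssetProcessingContext:
        for item in context.processing_items:
            # Dispatch on item type, as in the per-item loop.
            if isinstance(item, FileRule):
                processed = self.regular.execute(item)
            elif isinstance(item, MergeTaskDefinition):
                processed = self.merged.execute(item)
            else:
                raise TypeError(f"unsupported item: {type(item).__name__}")
            # Image data moves between stages via return values only.
            image = processed.image
            if self.scaling is not None:
                image = self.scaling.execute(image)
            result = self.save.execute(SaveVariantsInput(image, processed.map_type))
            # Only the final, lightweight results go back into the context.
            context.saved_files_details.extend(result.saved_files_details)
        return context
```

Because each stage takes and returns plain dataclasses, a stage can be exercised in a unit test by constructing its input directly, without building a full AssetProcessingContext.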
5. Visualization (Conceptual)
graph TD
    subgraph Proposed Pipeline Stages
        Start --> Prep[PrepareProcessingItemsStage]
        Prep --> ItemLoop{Loop per Item}
        ItemLoop -- FileRule --> RegProc[RegularMapProcessorStage]
        ItemLoop -- MergeTask --> MergeProc[MergedTaskProcessorStage]
        RegProc --> Scale(InitialScalingStage)
        MergeProc --> Scale
        Scale --> Save[SaveVariantsStage]
        Save --> UpdateContext[Update Main Context w/ Results]
        UpdateContext --> ItemLoop
    end
6. Benefits
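- Improved Readability & Understanding: each stage has a single, named responsibility.
- Enhanced Maintainability & Reduced Risk: removing the duplicated transformation logic keeps changes localized.
- Better Testability: small stages with explicit Input and Output objects can be tested in isolation.
- Clearer Dependencies: data moves through stage inputs and outputs rather than through the shared mutable context.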