Developer Guide: Processing Pipeline
This document details the step-by-step technical process executed by the ProcessingEngine class (processing_engine.py) when processing a single asset. A new instance of ProcessingEngine is created for each processing task to ensure state isolation.
The ProcessingEngine.process() method orchestrates the following pipeline based solely on the provided SourceRule object and the static Configuration object passed during engine initialization. It contains no internal prediction, classification, or fallback logic. All necessary overrides and static configuration values are accessed directly from these inputs.
The pipeline steps are:
- Workspace Preparation (External):
  - Before the ProcessingEngine is invoked, the calling code (e.g., main.ProcessingTask, monitor._process_archive_task) is responsible for setting up a temporary workspace.
  - This typically involves using utils.workspace_utils.prepare_processing_workspace, which creates a temporary directory and extracts the input source (archive or folder) into it.
  - The path to this prepared workspace is passed to the ProcessingEngine during initialization.
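As a rough illustration of the workspace preparation step, the sketch below creates a temporary directory and fills it from either a folder or an archive. The function name prepare_workspace_sketch and its exact behavior are assumptions made for illustration; the real utils.workspace_utils.prepare_processing_workspace may work differently.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_workspace_sketch(input_path: Path) -> Path:
    """Hypothetical stand-in for prepare_processing_workspace."""
    workspace = Path(tempfile.mkdtemp(prefix="asset_ws_"))
    try:
        if input_path.is_dir():
            # Copy the folder contents into the already existing workspace.
            shutil.copytree(input_path, workspace, dirs_exist_ok=True)
        else:
            # unpack_archive infers the format (zip, tar, ...) from the extension.
            shutil.unpack_archive(str(input_path), str(workspace))
    except Exception:
        # Do not leave a half-filled workspace behind if extraction fails.
        shutil.rmtree(workspace, ignore_errors=True)
        raise
    return workspace
```

The returned path is what a caller would hand to the ProcessingEngine at initialization.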
- Prediction and Rule Generation (External):
  - Also handled before the ProcessingEngine is invoked.
  - Either the RuleBasedPredictionHandler, LLMPredictionHandler (triggered by the GUI), or utils.prediction_utils.generate_source_rule_from_archive (used by the Monitor) analyzes the input files and generates a SourceRule object.
  - This SourceRule contains predicted classifications and initial overrides.
  - If using the GUI, the user can modify these rules.
  - The final SourceRule object is the primary input to the ProcessingEngine.process() method.
- File Inventory (_inventory_and_classify_files):
  - Scans the contents of the already prepared temporary workspace.
  - This step primarily inventories the files present. The classification (determining item_type, etc.) is taken directly from the input SourceRule.
  - Stores the file paths and their associated rules from the SourceRule in self.classified_files.
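The inventory step can be pictured as pairing files on disk with rules that already exist. The following is a minimal sketch under stated assumptions: the dataclass layouts, the file_path field, the files list, and the helper inventory_files are all hypothetical; only the names SourceRule, FileRule, item_type, item_type_override, and supplier_identifier come from this guide.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional


@dataclass
class FileRule:
    file_path: str  # assumed: path relative to the workspace root
    item_type: str
    item_type_override: Optional[str] = None

    @property
    def effective_item_type(self) -> str:
        # An override, when present, wins over the base type.
        return self.item_type_override or self.item_type


@dataclass
class SourceRule:
    supplier_identifier: str
    files: List[FileRule] = field(default_factory=list)  # assumed layout


def inventory_files(workspace: Path, rule: SourceRule) -> Dict[Path, FileRule]:
    """Pair each file found on disk with its rule; nothing is classified here."""
    rules_by_path = {Path(fr.file_path).as_posix(): fr for fr in rule.files}
    classified: Dict[Path, FileRule] = {}
    for path in sorted(workspace.rglob("*")):
        if not path.is_file():
            continue
        file_rule = rules_by_path.get(path.relative_to(workspace).as_posix())
        if file_rule is not None:
            # Assumption: files without a rule are simply left out.
            classified[path] = file_rule
    return classified
```

Note that the sketch never inspects file names or contents to guess a type, which mirrors the point that classification comes from the SourceRule alone.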
- Base Metadata Determination (_determine_base_metadata, _determine_single_asset_metadata):
  - Determines the base asset name, category, and archetype using the explicit values provided in the input SourceRule and the static Configuration. Overrides (like supplier_identifier, asset_type, asset_name_override) are taken directly from the SourceRule.
- Skip Check:
  - If the overwrite flag is False, checks if the final output directory already exists and contains metadata.json.
  - If so, processing for this asset is skipped.
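The skip check reduces to a small predicate. This is a sketch only; the function name should_skip and its signature are hypothetical, while the overwrite flag and metadata.json are taken from the description above.

```python
from pathlib import Path


def should_skip(output_dir: Path, overwrite: bool) -> bool:
    """Return True when the asset was already processed and must not be redone."""
    if overwrite:
        return False
    # Both conditions must hold: the directory exists and holds metadata.json.
    return output_dir.is_dir() and (output_dir / "metadata.json").is_file()
```

Keying the check on metadata.json rather than on the directory alone means a directory left behind by an interrupted run would not count as finished, since metadata.json is written late in the pipeline.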
- Map Processing (_process_maps):
  - Iterates through files classified as maps in the SourceRule.
  - Loads images (cv2.imread).
  - Handles Glossiness-to-Roughness inversion.
  - Resizes images based on Configuration.
  - Determines output bit depth and format based on Configuration and SourceRule.
  - Converts data types and saves images (cv2.imwrite).
    - The output filename uses the standard_type alias (e.g., COL, NRM) retrieved from the Configuration.FILE_TYPE_DEFINITIONS based on the file's effective item_type.
  - Calculates image statistics.
  - Stores processed map details.
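To make the filename rule in the map processing step concrete, here is a hedged sketch of the alias lookup. The shape of FILE_TYPE_DEFINITIONS, the item_type keys MAP_COL and MAP_NRM, the helper output_map_filename, and the filename pattern are all assumptions for illustration; only the standard_type aliases COL and NRM appear in this guide.

```python
# Hypothetical excerpt; the real Configuration.FILE_TYPE_DEFINITIONS may differ.
FILE_TYPE_DEFINITIONS = {
    "MAP_COL": {"standard_type": "COL"},
    "MAP_NRM": {"standard_type": "NRM"},
}


def output_map_filename(asset_name: str, item_type: str,
                        resolution_label: str, extension: str) -> str:
    """Build an output name from the standard_type alias of the effective item_type."""
    try:
        alias = FILE_TYPE_DEFINITIONS[item_type]["standard_type"]
    except KeyError:
        # No fallback or guessing: an unknown type is an error in the rules.
        raise ValueError(f"No standard_type defined for {item_type!r}") from None
    # Assumed pattern: <asset>_<alias>_<resolution>.<ext>
    return f"{asset_name}_{alias}_{resolution_label}.{extension.lstrip('.')}"
```

For example, output_map_filename("Bricks01", "MAP_COL", "4K", "png") yields "Bricks01_COL_4K.png" under these assumptions.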
- Map Merging (_merge_maps_from_source):
  - Iterates through MAP_MERGE_RULES in Configuration.
  - Identifies required source maps by checking the item_type_override within the SourceRule (specifically in the FileRule for each file). Files with a base item_type of "FILE_IGNORE" are explicitly excluded from consideration.
  - Loads source channels, handling missing inputs with defaults from Configuration or SourceRule.
  - Merges channels (cv2.merge).
  - Determines output format/bit depth and saves the merged map.
  - Stores merged map details.
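The source-selection part of map merging can be sketched without any image code. Everything structural here is an assumption: the layout of a merge rule, the item type names, the use of plain dicts in place of FileRule objects, and the helper resolve_merge_inputs. What the sketch does take from the text is matching on item_type_override, excluding files whose base item_type is "FILE_IGNORE", and falling back to defaults for missing inputs.

```python
# Hypothetical rule shape; real MAP_MERGE_RULES entries may look different.
MERGE_RULE = {
    "inputs": {"R": "MAP_AO", "G": "MAP_ROUGH", "B": "MAP_METAL"},
    "defaults": {"R": 1.0, "G": 0.5, "B": 0.0},
}


def resolve_merge_inputs(rule: dict, file_rules: list) -> dict:
    """Map each output channel to a source file path or, failing that, a default."""
    available = {}
    for fr in file_rules:
        if fr["item_type"] == "FILE_IGNORE":
            continue  # ignored files never take part in merging
        override = fr.get("item_type_override")
        if override:
            # Keep the first file seen for each overridden type.
            available.setdefault(override, fr["path"])
    return {
        channel: available.get(wanted, rule["defaults"][channel])
        for channel, wanted in rule["inputs"].items()
    }
```

In a real merge, each resolved path would be loaded as a single channel and each default expanded to a constant-valued channel before the channels are combined with cv2.merge.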
- Metadata File Generation (_generate_metadata_file):
  - Collects asset metadata, processed/merged map details, ignored files list, etc., primarily from the SourceRule and internal processing results.
  - Writes data to metadata.json in the temporary workspace.
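Writing the metadata file might look like the sketch below. The helper name write_metadata, the write-then-rename approach, and any dictionary keys a caller passes in are assumptions; the guide only states that the collected data is written to metadata.json in the temporary workspace.

```python
import json
from pathlib import Path


def write_metadata(workspace: Path, metadata: dict) -> Path:
    """Serialize the collected metadata to metadata.json inside the workspace."""
    target = workspace / "metadata.json"
    tmp = workspace / "metadata.json.tmp"
    # Write to a temporary name first so a crash cannot leave a truncated file
    # under the final name.
    tmp.write_text(json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8")
    tmp.replace(target)
    return target
```

The rename step is a design choice of this sketch, not something the guide describes; it matters because the skip check treats the presence of metadata.json as a sign that processing finished.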
- Output Organization (_organize_output_files):
  - Creates the final structured output directory (<output_base_dir>/<supplier_name>/<asset_name>/), using the supplier name from the SourceRule.
  - Moves processed maps, merged maps, models, metadata, and other classified files from the temporary workspace to the final output directory.
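The output organization step amounts to building the directory path and moving files into it. The sketch below assumes a flat layout inside the asset directory and uses a hypothetical helper named organize_output; the real method may sort files into subfolders.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def organize_output(workspace_files: Iterable[Path], output_base_dir: Path,
                    supplier_name: str, asset_name: str) -> List[Path]:
    """Move files into <output_base_dir>/<supplier_name>/<asset_name>/."""
    final_dir = Path(output_base_dir) / supplier_name / asset_name
    final_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in workspace_files:
        dest = final_dir / Path(src).name  # assumed: flat layout, names kept
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```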
- Workspace Cleanup (External):
  - After the ProcessingEngine.process() method completes (successfully or with errors), the calling code is responsible for cleaning up the temporary workspace directory created in Step 1. This is often done in a finally block where utils.workspace_utils.prepare_processing_workspace was called.
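The finally-based cleanup described for the calling code can be sketched as a small wrapper. The name run_with_workspace and the callback style are hypothetical; in the real callers the body would construct a ProcessingEngine and call process().

```python
import shutil
import tempfile
from pathlib import Path


def run_with_workspace(process):
    """Call process(workspace) and always remove the workspace afterwards."""
    workspace = Path(tempfile.mkdtemp(prefix="asset_ws_"))
    try:
        return process(workspace)
    finally:
        # Runs on success and on error alike, so no temporary data is leaked.
        shutil.rmtree(workspace, ignore_errors=True)
```

Because the cleanup sits in finally, an exception raised during processing still propagates to the caller after the directory has been removed.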
- (Optional) Blender Script Execution (External):
  - If triggered (e.g., via CLI arguments or GUI controls), the orchestrating code (e.g., main.ProcessingTask) executes the corresponding Blender scripts (blenderscripts/*.py) using subprocess.run after the ProcessingEngine.process() call completes successfully.
  - Note: Centralized logic for this was intended for utils/blender_utils.py, but this utility has not yet been implemented. See Developer Guide: Blender Integration Internals for more details.
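Launching a Blender script through subprocess.run could look roughly like this. The helper names and the way arguments are passed to the script are assumptions, not the project's actual code; the --background and --python options and the "--" separator are standard Blender command-line arguments.

```python
import subprocess
from typing import List, Sequence


def build_blender_command(blender_exe: str, script: str,
                          script_args: Sequence[str] = ()) -> List[str]:
    """Assemble a headless Blender invocation that runs one Python script."""
    # --background runs without the UI; --python executes the given script.
    cmd = [str(blender_exe), "--background", "--python", str(script)]
    if script_args:
        # Blender ignores everything after "--"; the script reads it from sys.argv.
        cmd += ["--", *map(str, script_args)]
    return cmd


def run_blender_script(blender_exe: str, script: str,
                       script_args: Sequence[str] = ()) -> subprocess.CompletedProcess:
    """Run the script and return the result; the caller inspects returncode."""
    return subprocess.run(
        build_blender_command(blender_exe, script, script_args),
        capture_output=True, text=True, check=False,
    )
```

Keeping command construction separate from execution makes the invocation easy to log and to test without Blender installed.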
This pipeline, executed by the ProcessingEngine, provides a clear and explicit processing flow based on the complete rule set provided by the GUI or other interfaces.