Ammar Bin Muhammad Shafiq Sarawanan, University of Wollongong Malaysia
The Malaysian furniture industry still relies heavily on manual quality inspection, which introduces inconsistency, inefficiency, and human error when identifying defects on processed wooden planks. This study presents a dual-stage machine learning framework that automates the detection and classification of surface defects in wooden materials using real-time image analysis. The framework combines YOLO-World, a zero-shot object detection model integrated with CLIP (Contrastive Language–Image Pre-training), for initial categorization of wood types and potential defects, with Mask R-CNN for pixel-level segmentation and detailed verification of defect regions. Four common defect types were studied: edge defects, scratches, surface anomalies, and discoloration. The YOLO-World model was trained using textual prompts and achieved high detection precision and recall across all classes. Mask R-CNN then refined these detections by segmenting defect boundaries with high spatial accuracy. Evaluated on a custom dataset of annotated images covering various wood types and defects, the system achieved a mean Average Precision at an IoU threshold of 0.50 (mAP@50) of 98.9% and a mAP@50-95 of 91.6%. These results demonstrate the feasibility and effectiveness of integrating multi-stage object detection and segmentation techniques to support automated visual inspection in industrial environments. By reducing dependency on manual inspection, the framework contributes toward improved quality control and operational efficiency in wood-based manufacturing.
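To make the two reported metrics concrete, the following minimal sketch shows how the intersection-over-union (IoU) thresholds behind mAP@50 and mAP@50-95 decide whether a predicted box counts as a match. It is an illustration only and not the evaluation code used in this study: the function names `box_iou` and `thresholds_passed` and the example coordinates are hypothetical, and the threshold range 0.50 to 0.95 in steps of 0.05 is assumed to follow the common COCO-style convention.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def thresholds_passed(pred, truth):
    """Return the IoU thresholds at which `pred` matches `truth`."""
    iou = box_iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if iou >= t]


# Hypothetical annotated defect box and a prediction shifted 10 px right.
truth_box = (0, 0, 100, 100)
pred_box = (10, 0, 110, 100)

iou = box_iou(pred_box, truth_box)
passed = thresholds_passed(pred_box, truth_box)
print(f"IoU = {iou:.3f}; matches at {len(passed)} of {len(IOU_THRESHOLDS)} thresholds")
```

A detection that overlaps loosely with the annotation can still count under mAP@50 while failing the stricter thresholds, which is why mAP@50-95 is typically lower than mAP@50 for the same model.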