Real-Time Pear Fruit Detection and Counting Using YOLOv4 Models and Deep SORT

In the rapidly advancing world of agriculture, technology is taking center stage, especially when it comes to improving efficiency and reducing manual labor. One area where technology is making huge strides is in the detection and counting of fruits, such as pears, using advanced AI models. This article delves into how real-time pear fruit detection can be achieved using YOLOv4 models and Deep SORT. If you’re a farmer, researcher, or tech enthusiast, this is an exciting development that could transform the way we handle post-harvest logistics.

Why Pear Counting Matters

For pear farmers, keeping track of yields is essential. Traditionally, this process is manual, which increases post-harvest losses because pears are highly perishable and packaging is inefficient. Moreover, rapid decision-making during extreme weather requires quick and accurate information. This study explores how integrating technology can offer a mobile-based solution for real-time pear counting using RGB data and cutting-edge algorithms. It could make a significant difference by saving time, reducing waste, and improving logistical operations.


YOLOv4: A Game Changer in Object Detection

The research primarily focuses on YOLOv4 (You Only Look Once), a state-of-the-art object detection model. YOLOv4 offers excellent performance, balancing both accuracy and speed. This model is highly favored for real-time applications as it runs two times faster than EfficientDet, a comparable detection model, while maintaining similar levels of accuracy.

YOLOv4 Variants:

  1. YOLOv4-CSP: Maximizes accuracy with an AP@0.50 of 98%, making it the best for precision.
  2. YOLOv4-tiny: Best suited for speed and lower computational costs, achieving over 50 frames per second (FPS) with minimal resources.
  3. YOLOv4: The ideal middle ground, providing both high accuracy and real-time speed (≥ 24 FPS) while balancing computational costs.

Overcoming Challenges with Deep SORT

One of the major challenges in pear detection is flickering or failure to detect fruits under tough conditions such as occlusion or poor lighting. For instance, merely detecting the fruits might not suffice in an orchard where visibility is hampered. This is where Deep SORT (Simple Online and Realtime Tracking with a deep association metric) steps in: a multiple-object tracking algorithm that ensures accurate pear counting even when the detection system falters. By assigning a unique ID to each detected fruit, it enables reliable counting over time.

Two counting methods were explored:

  • Unique ID method: Provided better reliability, boasting an F1count score of 87.85%, largely due to YOLOv4’s ability to minimize false negatives.
  • ROI Line: While more restrictive, it occasionally missed some detections due to flickering, making the unique ID method more consistent.
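To make the unique ID method concrete, here is a minimal sketch of the counting loop. The detector and tracker objects are hypothetical stand-ins for a trained YOLOv4 model and a Deep SORT instance; their method names are assumptions, not code from the study.

```python
def count_pears(video_frames, detector, tracker):
    """Count pears by collecting unique Deep SORT track IDs across all frames."""
    unique_ids = set()
    for frame in video_frames:                      # RGB frames from the orchard video
        detections = detector.detect(frame)         # assumed API: per-pear boxes + confidences
        tracks = tracker.update(detections, frame)  # assumed API: Deep SORT assigns persistent IDs
        for track in tracks:
            if track.is_confirmed():                # skip tentative tracks to avoid spurious counts
                unique_ids.add(track.track_id)
    return len(unique_ids)                          # each tracked pear is counted exactly once
```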

Methodology and Results

This study took a comprehensive approach to analyze various YOLO models and their integration with Deep SORT for accurate pear counting. Data collection involved recording RGB videos of pear orchards using high-quality mobile cameras. These videos were processed through different YOLO models to evaluate accuracy, speed, and computational requirements. The results showed that:

  • YOLOv4-CSP achieved the highest detection accuracy.
  • YOLOv4-tiny excelled in speed and low computational cost.
  • YOLOv4 struck the best balance between accuracy and real-time performance.

Actionable Tips for Farmers and Enthusiasts:

  • Choose the Right Model: If you’re aiming for the highest accuracy, go with YOLOv4-CSP. For faster, resource-efficient counting, opt for YOLOv4-tiny.
  • Addressing Lighting and Occlusion Issues: Use Deep SORT for better reliability under tough conditions like poor lighting or occlusion.
  • Balance Your Resources: For those who need a balance of speed and accuracy with limited computational resources, YOLOv4 is the sweet spot.

Conclusion

In summary, leveraging YOLOv4 models combined with Deep SORT for real-time pear counting can revolutionize the agricultural sector by reducing manual labor, improving accuracy, and speeding up processes. This study demonstrates the potential of integrating object detection and tracking algorithms in agricultural practices, paving the way for smarter farming solutions.

Key Takeaways:

  • Speed vs. Accuracy: Use YOLOv4-CSP for top-notch accuracy, YOLOv4-tiny for maximum speed, or YOLOv4 for a balanced approach.
  • Real-time Efficiency: Achieve real-time fruit counting using Deep SORT with YOLOv4, even in challenging conditions.
  • Technological Integration: Emphasize the role of AI and machine learning in transforming farming practices for better yield management.

| YOLO Model | Accuracy (AP@0.50) | Speed (FPS) | Computational Cost (FLOPs) |
| --- | --- | --- | --- |
| YOLOv4-CSP | 98% | Moderate | High |
| YOLOv4-tiny | Moderate | 50+ | Low |
| YOLOv4 | High | ≥ 24 | Balanced |

Data Preparation

11.3.2.1 Videos Were Converted into Image Frames

The video data for pear fruit detection was converted into image frames using the “Scene video filter” in VLC. Automatic screenshots were taken at half-second intervals. For a video with 60 frames per second (FPS), an image was captured every 30 frames, and for a 30 FPS video, one frame was taken every 15 frames. After eliminating frames that did not contain pear fruits, the dataset consisted of:

  • 314 images of 4k resolution
  • 134 images of 1920×1088 resolution
    This left a total of 448 images for further processing.
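The half-second sampling described above can also be reproduced programmatically. Below is a small OpenCV sketch that saves one frame every N frames (N = 30 for 60 FPS footage, N = 15 for 30 FPS); the study used VLC's Scene filter, so this script is only an illustrative alternative, and the file paths are placeholders.

```python
import cv2

def extract_frames(video_path, out_dir, every_n_frames=30):
    """Save one image every `every_n_frames` frames (about 0.5 s for a 60 FPS video)."""
    cap = cv2.VideoCapture(video_path)
    frame_idx, saved = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:                               # end of video
            break
        if frame_idx % every_n_frames == 0:      # keep every Nth frame
            cv2.imwrite(f"{out_dir}/frame_{saved:05d}.jpg", frame)  # out_dir must already exist
            saved += 1
        frame_idx += 1
    cap.release()
    return saved

# Example: a 60 FPS orchard video sampled at half-second intervals.
# extract_frames("orchard_60fps.mp4", "frames", every_n_frames=30)
```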

11.3.2.2 Labeling

The bounding boxes for pears in the images were labeled using Supervisely®. While Supervisely only exported labels in JSON format, the labels were converted to YOLO format using Roboflow®. Alternative tools for YOLO format annotation include LabelImg and OpenLabeling.
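For readers unfamiliar with the target format, a YOLO label file stores one line per object as class x_center y_center width height, all normalized to the image dimensions. The helper below is a hypothetical illustration of that conversion for a single box; it does not reproduce Supervisely's JSON layout or Roboflow's pipeline.

```python
def to_yolo(x_min, y_min, x_max, y_max, img_w, img_h, class_id=0):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) to a YOLO-format label line."""
    x_c = (x_min + x_max) / 2 / img_w            # normalized box-center x
    y_c = (y_min + y_max) / 2 / img_h            # normalized box-center y
    w = (x_max - x_min) / img_w                  # normalized width
    h = (y_max - y_min) / img_h                  # normalized height
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# Example: a pear bounding box (100, 200)-(220, 340) in a 1920x1088 frame.
print(to_yolo(100, 200, 220, 340, 1920, 1088))
```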

11.3.2.3 Data Augmentation

To enhance the robustness of the detection system, data augmentation was applied. This helped mitigate dataset bias and overfitting by simulating diverse scenarios:

  • Pixel-level augmentations (which leave the bounding boxes unchanged):
    • Random brightness adjustment (-25% to +25%)
    • Gamma exposure adjustment (-20% to +20%)
    • Coarse Dropout (up to 6% of the image’s pixels were modified)
  • Spatial-level augmentations (which alter both images and bounding boxes):
    • Random horizontal and vertical flips.

Images were resized to 416×416, 512×512, and 608×608 resolutions, with aspect ratios preserved using padding. After augmentation, the dataset was expanded to 1337 images.
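As a rough illustration, an augmentation pipeline like the one listed above could be written with the albumentations library as follows. The library choice, the probability values, and the CoarseDropout settings are assumptions for illustration (argument names follow albumentations 1.x), not the study's exact configuration.

```python
import albumentations as A

# Illustrative pipeline approximating the augmentations described above.
transform = A.Compose(
    [
        # Pixel-level: bounding boxes stay unchanged.
        A.RandomBrightnessContrast(brightness_limit=0.25, contrast_limit=0.0, p=0.5),
        A.RandomGamma(gamma_limit=(80, 120), p=0.5),          # roughly -20% to +20% exposure
        A.CoarseDropout(max_holes=8, max_height=24, max_width=24, p=0.3),
        # Spatial-level: boxes are transformed together with the image.
        A.HorizontalFlip(p=0.5),
        A.VerticalFlip(p=0.5),
        # Resize while preserving the aspect ratio via padding.
        A.LongestMaxSize(max_size=416),
        A.PadIfNeeded(min_height=416, min_width=416, border_mode=0),
    ],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

# augmented = transform(image=image, bboxes=yolo_boxes, class_labels=["pear"] * len(yolo_boxes))
```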

11.3.3 Data Splitting

The dataset was split into four parts using a 70:10:10:10 ratio:

  • Training: High-resolution images for training the model.
  • Training-validation: Unseen high-resolution images to check overfitting.
  • Validation and test sets: Composed of lower-resolution mobile phone images, targeting real-time pear detection.

11.3.4 Setting the Target Metric

For mobile applications, the pear fruit counting system had the following goals:

  • Maximize accuracy within real-time constraints (≥24 FPS).
  • Minimize GPU consumption, optimizing for devices with lower computational power.

11.3.5 Evaluation Metrics for the Detection

The detection models were evaluated using metrics from the Pascal VOC Challenge:

  • Intersection over Union (IoU): Ratio of overlap between prediction and ground truth bounding boxes.
  • True Positives (TP), False Positives (FP), and False Negatives (FN) were computed, leading to the following derived metrics:
    • Recall (R) = TP / (TP + FN)
    • Precision (P) = TP / (TP + FP)
    • F1 score = 2 * (P * R) / (P + R)
    • Average Precision (AP): Area under the precision-recall curve.
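The metrics above follow directly from the TP/FP/FN tallies. A minimal helper showing the definitions (IoU for axis-aligned boxes, then precision, recall, and F1) might look like this:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 score from raw detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Example: 90 correct detections, 10 false alarms, 20 missed pears
# -> precision = 0.90, recall ≈ 0.82, F1 ≈ 0.86.
print(detection_metrics(90, 10, 20))
```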

11.3.6 Components of the YOLOv4 Models

11.3.6.1 Cross-Stage Partial (CSP) Connection

CSP connections were used to reduce computational complexity by splitting the feature map of the base layer into two parts and merging them through a transition-concatenation-transition process.
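A highly simplified PyTorch sketch of the idea (not the exact CSPDarknet53 layer layout) is shown below: the base feature map is split in two, only one part passes through the computation stage, and the parts are merged back through a transition convolution.

```python
import torch
import torch.nn as nn

class CSPBlock(nn.Module):
    """Simplified cross-stage partial block: split the channels, process one part, merge."""

    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.stage = nn.Sequential(                 # only half the channels pass through here
            nn.Conv2d(half, half, 3, padding=1), nn.BatchNorm2d(half), nn.LeakyReLU(0.1),
            nn.Conv2d(half, half, 3, padding=1), nn.BatchNorm2d(half), nn.LeakyReLU(0.1),
        )
        self.transition = nn.Conv2d(channels, channels, 1)  # merge after concatenation

    def forward(self, x):
        part1, part2 = torch.chunk(x, 2, dim=1)    # split the base-layer feature map
        part2 = self.stage(part2)                  # heavy computation on one part only
        return self.transition(torch.cat([part1, part2], dim=1))

# Example: a 64-channel feature map keeps its shape while roughly halving stage computation.
# out = CSPBlock(64)(torch.randn(1, 64, 52, 52))
```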

11.3.6.2 CSPDarknet53

CSPDarknet53 was chosen as the backbone for YOLOv4 and YOLOv4-CSP, as it improved object detection accuracy, even though it lagged behind CSPResNext50 in image classification. Techniques like CutMix, Mosaic, and DropBlock were used to further enhance performance.

11.3.6.3 YOLOv4-Tiny’s Backbone: CSPOSANet

For YOLOv4-tiny, the backbone was designed for efficiency, implementing one-shot aggregation (OSA), which reduced redundant gradient information and improved computational efficiency.

11.3.6.4 Activation Functions

  • Leaky ReLU: Used in YOLOv4-tiny to maximize speed while maintaining accuracy.
  • Mish: Used in YOLOv4 and YOLOv4-CSP for better accuracy due to smoother optimization and generalization.
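Both activations are one-liners; the sketch below shows the difference (Leaky ReLU is piecewise linear, Mish is smooth everywhere):

```python
import torch
import torch.nn.functional as F

def leaky_relu(x, negative_slope=0.1):
    """YOLOv4-tiny: cheap piecewise-linear activation (small slope for x < 0)."""
    return F.leaky_relu(x, negative_slope)

def mish(x):
    """YOLOv4 / YOLOv4-CSP: smooth, non-monotonic activation x * tanh(softplus(x))."""
    return x * torch.tanh(F.softplus(x))

x = torch.linspace(-3, 3, 7)
print(leaky_relu(x))   # kinked at zero, very cheap to compute
print(mish(x))         # smooth everywhere, slightly more expensive
```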

11.3.6.5 Path Aggregation Network (PANet)

PANet aggregated parameters from different backbone levels to ensure that fine-grained information was retained across layers, enhancing feature pyramids.

11.3.6.6 Spatial Pyramid Pooling (SPP)

SPP allowed YOLOv4 and YOLOv4-CSP to handle variable-sized input images by pooling features into a fixed-length output.
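In YOLOv4 the SPP idea is implemented as stride-1 max pooling at several kernel sizes applied in parallel and concatenated with the input, which enlarges the receptive field without changing the spatial size. A compact sketch follows; the 5/9/13 kernel sizes are the commonly used values and are an assumption here.

```python
import torch
import torch.nn as nn

class SPPBlock(nn.Module):
    """YOLOv4-style SPP: multi-scale max pooling concatenated with the input feature map."""

    def __init__(self, pool_sizes=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in pool_sizes
        )

    def forward(self, x):
        # Stride-1 pooling preserves height and width, so the pooled maps can be
        # concatenated channel-wise with the original features.
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)

# Example: a (1, 512, 13, 13) feature map becomes (1, 2048, 13, 13).
# out = SPPBlock()(torch.randn(1, 512, 13, 13))
```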


This concludes the details of the data preparation, augmentation, and model architecture for pear fruit detection using the YOLOv4 family of models. The next step involves training and optimizing the models using custom anchors.


The error analysis of YOLOv4 models reveals that fine-tuning effectively improved the model’s precision and accuracy across different data sets. The increasing average precision (AP50) from train-val to val to test sets indicates that data mismatches were overcome, and no overfitting occurred during training. This stability across datasets means the models are generalizing well.

The results show that YOLOv4-CSP-608 performed best in detection metrics such as precision (P), false positive rate (FPR), recall (R), and average Intersection over Union (IoU). The use of the Mish activation function and higher network resolution could explain why this model captures intricate object features better than others. Surprisingly, even the smaller YOLOv4-tiny models showed good accuracy, especially at higher resolutions, performing comparably to larger models like YOLOv4-416.

Speed-accuracy tradeoff analysis highlighted that models like YOLOv4-512 and YOLOv4-608 achieved real-time speeds (≥24 FPS) while maintaining high accuracy. YOLOv4-CSP, although slightly slower, offered top-tier accuracy, especially at higher resolutions. The inference speed and GPU consumption analysis reaffirmed that YOLOv4-tiny had significantly lower computational requirements, making it ideal for resource-constrained environments.

When it came to illumination and occlusion challenges, models like YOLOv4-CSP-608 handled complex environments well, even in high-occlusion situations where other models struggled. This makes YOLOv4-CSP-608 the top choice for tasks requiring robustness in detection under difficult conditions.

Regarding the pear counting methods, the unique ID-based approach was more sensitive, resulting in better overall counting performance, particularly in terms of recall and false negative rate (FN). In contrast, the ROI line-based method was more precise but lacked sensitivity, leading to undercounts.

Finally, the FLOPs analysis showed that YOLOv4-tiny models have a clear advantage in terms of speed and computational cost, making them suitable for real-time applications on less powerful hardware, while YOLOv4 and YOLOv4-CSP provided higher accuracy but at the cost of more computational resources.

From the breakdown of the false negative counts in the ROI line-based counting system, it is evident that the sensitivity of this method was a major limitation. To enhance this, it is suggested to improve the tracking algorithm by prioritizing motion information over appearance, particularly under challenging illumination conditions. This would mitigate issues arising from occlusion and potentially increase the sensitivity of the system without sacrificing the correctness of detections.

11.4.9 Best Model Selection

Based on the error analysis and the performance comparison of the YOLOv4 models, it can be concluded that YOLOv4-CSP-608 performed the best overall in terms of detection metrics, inference speed, and computational efficiency. Despite its slightly lower frame rate compared to some other models, its detection accuracy and robustness against occlusion and varying illumination made it the most reliable choice for pear fruit detection in real-time orchard settings.

Other top contenders were YOLOv4-512 and YOLOv4-608, both of which also offered high precision and recall scores, as well as fast inference times that satisfied the real-time requirement. Interestingly, YOLOv4-tiny-608 also showed promise, especially considering its lower computational cost and comparable performance at higher resolutions.

11.5 Conclusions

The study explored and evaluated different versions of the YOLOv4 model family for detecting and counting pears in a real-time orchard environment. Several key insights were gained:

  1. Detection Accuracy: YOLOv4-CSP-608 achieved the highest precision, recall, and F1 score across all models, particularly excelling in complex scenes with occlusion and challenging illumination conditions. YOLOv4-512 and YOLOv4-608 also performed well in these areas.
  2. Inference Speed: The real-time speed requirement (≥24 FPS) was satisfied by models like YOLOv4-512, YOLOv4-416, and YOLOv4-tiny-608. YOLOv4-CSP-512 came close to meeting this threshold, with an inference speed of 21.4 FPS.
  3. Computational Efficiency: YOLOv4-tiny models demonstrated low computational requirements and fast inference speeds, making them attractive for deployment in resource-constrained environments. YOLOv4-tiny-608 in particular offered a good balance between speed and accuracy, showing that the tiny models can perform well at higher resolutions.
  4. Occlusion and Illumination: All YOLOv4 models were able to detect pears in images with varying degrees of occlusion and challenging lighting conditions. However, YOLOv4-CSP-608 and YOLOv4-512 were the most resilient to these challenges, outperforming the other models when object visibility was compromised.
  5. Counting Performance: Between the two methods for counting pears (ROI line-based and unique ID-based), the unique ID-based method was found to be superior due to its higher sensitivity, which resulted in better recall and F1 scores. The ROI line-based method, while more precise, suffered from a high false negative rate, which limited its usefulness in accurately counting pears in the orchard.

In conclusion, while YOLOv4-CSP-608 provided the best performance in terms of accuracy and robustness, YOLOv4-512 and YOLOv4-608 also showed excellent results and are recommended for real-time applications that prioritize both speed and detection quality. YOLOv4-tiny-608, due to its low computational cost and solid detection capabilities at higher resolutions, is an ideal candidate for environments where computational resources are limited but accurate detection is still required.

In the Mask R-CNN model, the head architecture plays a vital role in performing classification, regression, and mask generation. The Region of Interest (ROI)-Align operation refines the extracted features from the feature pyramid to a uniform size, creating two distinct branches: one for classification and bounding box regression (upper branch) and the other for mask generation (lower branch).


Classification and Regression (Upper Branch)

This branch mirrors the approach of Faster R-CNN, which first classifies each ROI into a specific category and then refines the bounding box around the detected object using regression. Specifically:

  • Classification determines which category the object belongs to using a fully connected layer followed by a Softmax function.
  • Bounding Box Regression refines the coordinates of the proposed regions, providing more accurate localization of objects.

Mask Generation (Lower Branch)

Unlike Faster R-CNN, Mask R-CNN introduces a fully convolutional network (FCN) in the lower branch for generating pixel-wise segmentation masks:

  • Instead of the fully connected layers found in traditional CNNs, this branch uses a series of convolutional layers to generate a 14 × 14 feature map, which is further up-sampled to 28 × 28.
  • This branch outputs a binary mask for each class and each region, providing precise pixel-level segmentation.

The Mask R-CNN model uses deconvolution operations to restore the feature map’s resolution, ensuring that each mask is of the same size as the output image.
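A rough PyTorch sketch of the lower branch described above is given below; the number of convolutional layers and the channel width are illustrative assumptions, not the exact head configuration used in the study.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskHead(nn.Module):
    """Sketch of the Mask R-CNN mask branch: a small FCN plus deconvolution to 28x28."""

    def __init__(self, in_channels=256, num_classes=1):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Transposed convolution up-samples the 14x14 ROI feature map to 28x28.
        self.upsample = nn.ConvTranspose2d(256, 256, kernel_size=2, stride=2)
        self.mask_logits = nn.Conv2d(256, num_classes, kernel_size=1)

    def forward(self, roi_features):               # (N, 256, 14, 14) from ROI-Align
        x = self.convs(roi_features)
        x = F.relu(self.upsample(x))               # (N, 256, 28, 28)
        return self.mask_logits(x)                 # one binary-mask logit map per class

# Example: MaskHead()(torch.randn(8, 256, 14, 14)) has shape (8, 1, 28, 28).
```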

Loss Function in Mask R-CNN

The loss function for Mask R-CNN includes multiple components:

  1. RPN (Region Proposal Network) Loss: Similar to Faster R-CNN, RPN loss consists of:
    • Classification Loss (L_CLS): This computes the loss for binary classification (object vs. background).
    • Bounding Box Regression Loss (L_BOX): Smooth L1 loss is applied to minimize the offset between predicted and ground truth bounding boxes.
    The formula for the RPN loss is: L_RPN = L_CLS + L_BOX
  2. Mask Loss (L_MASK): This loss is added by the mask branch and measures how well the model predicts the binary mask for each object. It uses binary cross-entropy loss to compute the error between the predicted and true masks. The total loss function is a combination of the RPN loss and the mask loss: L = L_RPN + L_MASK
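A minimal sketch of how these loss terms combine is shown below. Tensor shapes, the equal weighting of the terms, and the use of PyTorch functional losses are assumptions for illustration; the full Mask R-CNN formulation additionally restricts the mask loss to the ground-truth class of each ROI.

```python
import torch.nn.functional as F

def mask_rcnn_loss(cls_logits, cls_targets, box_preds, box_targets,
                   mask_logits, mask_targets):
    """Combine classification, box-regression, and mask losses: L = L_RPN + L_MASK."""
    l_cls = F.cross_entropy(cls_logits, cls_targets)                    # object vs. background
    l_box = F.smooth_l1_loss(box_preds, box_targets)                    # box offset regression
    l_rpn = l_cls + l_box                                               # L_RPN = L_CLS + L_BOX
    l_mask = F.binary_cross_entropy_with_logits(mask_logits, mask_targets)  # per-pixel BCE
    return l_rpn + l_mask                                               # L = L_RPN + L_MASK
```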

Performance Metrics

During testing, the performance of the Mask R-CNN model is evaluated using metrics like Precision and Recall:

  • Precision: The proportion of true positive detections (correct pear identifications) to the total number of positive detections: P = TP / (TP + FP)
  • Recall: The proportion of true positive detections to the total number of actual positive samples: R = TP / (TP + FN)

Since the task focuses on detecting pears, True Positives (TP) refer to correctly identified pears, while False Positives (FP) are incorrect identifications of background or leaves as pears. False Negatives (FN) occur when pears are missed.

Training Details

The model was trained on a dataset of 9054 RGBA images (3018 original and 6036 augmented images), divided into training, validation, and test sets at a ratio of 6:3:1. The training process used 80 epochs with a learning rate of 0.001, resulting in a significant drop in training and validation losses.


The section provides a detailed evaluation and comparison of various deep learning models—Mask R-CNN, Faster R-CNN, and YOLACT—used for recognizing pears in an orchard using 3D stereo camera datasets.

Key Points:

Further Development: Future research aims to enhance this technology for use in fast agricultural robots and potentially integrate distance-measuring modules to improve precision in orchard environments.

Evaluation Metrics: The models were assessed using standard evaluation metrics like Precision (P), Recall (R), Average Precision (AP), and mean Average Precision (mAP). Testing was conducted on datasets after 80 epochs, using a learning rate of 0.001. Mask R-CNN consistently outperformed the other models, achieving an impressive mAP of 99.45% on the testing set, compared to Faster R-CNN’s 87.52% and YOLACT’s 97.89%.

Effectiveness of Mask R-CNN:

Superior Accuracy: Mask R-CNN displayed significantly higher accuracy, especially in cases where pears were aggregated or under low light conditions, compared to Faster R-CNN and YOLACT. Its ability to generate masks (as opposed to just bounding boxes) allowed it to handle more complex detection tasks.

Independent vs. Aggregated Pear Detection: While all models performed well on independent pears, Mask R-CNN’s recognition of aggregated pears was superior, especially under difficult conditions like occlusions from branches and leaves.

Light Conditions: Mask R-CNN proved more robust in low-light conditions, outperforming Faster R-CNN and YOLACT.

Comparison of Models:

Faster R-CNN: While faster in processing, it struggled with distinguishing individual pears in clusters and had difficulties in varying light conditions, achieving lower accuracy than Mask R-CNN.

YOLACT: Although YOLACT is designed for real-time processing and outperformed Faster R-CNN in some instances, it lacked the accuracy of Mask R-CNN, particularly in complex environments with overlapping pears and varying light.

Discussion of Technology:

3D Stereo Camera (ZED): The use of the ZED 3D stereo camera, which provides depth information, was highlighted as a key advantage. Unlike monocular cameras, it improved the accuracy of object distance measurements and was particularly helpful in distinguishing pears from overlapping foliage.

Challenges with Traditional Cameras: The study noted limitations in using monocular cameras, particularly in extended measurement tasks and lower accuracy in complex environments.

Conclusion and Future Work:

Mask R-CNN for Agricultural Robots: The study suggests that Mask R-CNN is suitable for low-speed fruit-picking mechanisms due to its accuracy, despite its slower processing speed (5 fps).

This analysis emphasizes Mask R-CNN’s superior accuracy in handling complex orchard scenarios, offering a promising approach for future development in automated agricultural systems.

This paper investigates a methodology that uses thermal microcameras and deep learning algorithms, specifically YOLO (You Only Look Once), for detecting embryos in quail eggs during the early stages of incubation. The main goal is to develop a non-invasive, real-time method to distinguish fertilized from unfertilized eggs, which can help improve hatch rates and efficiency in quail farming.

Key Points:

  1. Thermal Imaging: The study uses thermal microcameras to capture radiometric images of quail eggs during the first 168 hours (7 days) of incubation. This non-contact method helps detect heat emitted by embryos, which can indicate whether an egg is fertilized.
  2. Deep Learning Models: Three deep learning models—YOLOv4, YOLOv5, and SSD-MobileNet V2—were tested for object detection on the thermal images. YOLOv5 performed the best with a mAP@0.50 of 99.5%, compared to 98.62% for YOLOv4 and 91.8% for SSD-MobileNet V2.
  3. Egg Turning Intervals: The study hypothesized that less frequent turning of eggs would result in better detection of embryo features because it keeps the developing embryo more static, making temperature distribution more distinguishable. However, the results showed no clear linear relationship between turning intervals and detection accuracy. YOLOv5 showed the highest F1 Score for a 12-hour turning period, with a score of 1.0, while the performance decreased for shorter intervals (6 hours and 1.5 hours).
  4. Experimental Setup: The study was conducted in a controlled laboratory environment with 120 quail eggs in total, divided into four groups based on turning periods (90 minutes, 6 hours, 12 hours) and one control group of unfertilized eggs. The thermal microcamera used was a FLIR® VUE 336, and data were collected and analyzed over seven days.
  5. Radiometric Corrections: The thermal images were adjusted using FLIR Thermal Studio™ to enhance the visibility of features. Specific radiometric corrections, such as isotherm filtering, were applied to better highlight the temperature differences between fertilized and unfertilized eggs.

This research contributes to the development of more efficient precision hatching systems that use thermal imaging and deep learning for automated embryo detection. This could lead to significant improvements in quail farming by reducing waste and increasing hatchability rates.

Deep Learning Algorithms 

The following is a structured breakdown of the YOLOv4, YOLOv5, and SSD-MobileNet V2 deep learning algorithms in the context of detecting unfertilized eggs with thermal imaging:

1. Introduction to YOLO Algorithms

  • YOLOv4:
    • Released in 2020, it is known for its speed and accuracy, improving upon YOLOv3 with the introduction of CSPDarknet-53 as its backbone.
    • Key innovations include the path aggregation network (PANet) and spatial pyramid pooling (SPP).
    • Capable of real-time object detection and designed for efficient training on a single GPU.
  • YOLOv5:
    • Developed by Ultralytics in 2021, it has gained popularity for being faster and more efficient than YOLOv4.
    • Uses a feature pyramid network (FPN) and path aggregation network (PAN) for improved accuracy and training speed.
    • Implemented in the PyTorch framework, suitable for low-end devices due to smaller file sizes.
  • SSD-MobileNet V2:
    • A single-shot detection (SSD) model known for its speed and efficiency on mobile devices and low-end computing platforms.
    • Introduces depth-wise convolution layers and an inverted residual structure, enhancing performance.

2. Methodology

  • Model Training:
    • All models were trained using a dataset of 420 images of unfertilized eggs, augmented to 1892 images through various transformations.
    • Labeling: YOLO format for YOLOv4 and YOLOv5, PASCAL VOC XML format for SSD-MobileNet V2.
    • Training Parameters:
      • YOLOv4: 64 batch size, input size 416×416, learning rate 0.0001, trained for 4000 steps.
      • YOLOv5: 16 batch size, input size 416×416, trained for 60 epochs in Google Colab.
      • SSD-MobileNet V2: 320×320 input size, 40,000 training steps.
  • Data Evaluation:
    • Precision, recall, F1 score, and mean average precision (mAP) were used for model evaluation.
    • The object detection metrics are based on true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).

3. Results

  • Training Performance:
    • YOLOv4 took approximately 20 hours, achieving mAP > 90% and an average loss < 0.5.
    • YOLOv5 trained in 1 hour with superior performance metrics.
    • SSD-MobileNet V2 completed training in 4 hours.
  • Detection Performance:
    • Evaluation metrics for YOLOv4, YOLOv5, and SSD-MobileNet V2 showed high precision and recall.
    • YOLOv5 achieved the highest mAP, followed by YOLOv4 and SSD-MobileNet V2.
    • The unfertilized eggs were assessed on the 8th day of incubation, confirming the detection accuracy across models.

4. Thermal Imaging Analysis

  • Different thermal patterns were observed between fertilized and unfertilized eggs.
  • The methodology for collecting images involved optimal positioning to enhance thermal imaging quality.

5. Conclusion

  • The study effectively demonstrates the capabilities of deep learning algorithms in thermal imaging for detecting unfertilized eggs.
  • YOLOv5 outperformed YOLOv4 and SSD-MobileNet V2 in terms of mAP and training efficiency, showcasing its suitability for real-time applications.

This analysis captures the essential methodology, results, and implications of using deep learning with thermal imaging for assessing egg fertilization.




Discussion

Embryo development in quail eggs lasts approximately 16.5 days (Ainsworth et al., 2010), contrasting with the 21.5 days typical for hen eggs. However, the embryonic development stages in both species share significant similarities, allowing for comparative studies. Various studies have characterized avian embryo development (Hamburger & Hamilton, 1951; Sittmann et al., 1966; Graham & Meier, 1975; Ruffins et al., 2007). While much of the focus has been on the biological functions of structures and genetic factors, recent advances in sensors and computational methods have opened new avenues for improving poultry yields.

This study explored the potential of thermal cameras as a nondestructive, noncontact technique to distinguish fertilized from unfertilized quail eggs, aiming to enhance hatching management efficiency. Unlike conventional incubators used in industrial settings, which typically position eggs vertically and turn them 45° for operational convenience, our method diverged from this standard approach.

Previous research has indicated that egg positioning during incubation does not significantly impact hatching rates (Van de Ven et al., 2011), while Oliveira et al. (2020) found that reduced egg turning frequency negatively affects chick hatchability. Notably, we did not evaluate hatchability or mortality in our study.

Thermal cameras capture images by analyzing the intensity of infrared wavelengths received by the thermal sensor, but they face challenges like low resolution and high costs (Williams et al., 2022). Reflectance, transmittance, and emissivity are critical factors influencing their application. Primarily, thermal imaging serves in nocturnal vision and body temperature monitoring. Our study proposed using thermal imaging to observe quail egg thermal behavior during incubation, leveraging isotherm filtering to identify unfertilized eggs by clustering radiometric information under specific thresholds. The embryo’s development, along with transformations in the yolk sac, allantois, and air chamber, affects gas dynamics through the eggshell’s micropores, enabling thermal imaging to capture these changes. Nonetheless, the absence of standardized isotherm controls may have contributed to the unclear features hindering accurate embryo classification.

13.5 Model Evaluation

Table 13.5 summarizes the evaluation metrics—precision (P), recall (R), and F1-score—across three models (YOLOv4, YOLOv5, and SSD-MobileNet V2) for detecting unfertilized eggs under the different egg-turning intervals (12 hours, 6 hours, and 1.5 hours):

| Model | Dataset (h) | Precision (P) | Recall (R) | F1-score |
| --- | --- | --- | --- | --- |
| YOLOv4 | 12 | 0.857 | 0.428 | 0.569 |
| YOLOv4 | 6 | 0.301 | 0.615 | 0.404 |
| YOLOv4 | 1.5 | 0.337 | 0.446 | 0.383 |
| YOLOv5 | 12 | 1.000 | 1.000 | 1.000 |
| YOLOv5 | 6 | 0.660 | 0.500 | 0.560 |
| YOLOv5 | 1.5 | 0.600 | 0.260 | 0.360 |
| SSD-MobileNet V2 | 12 | 1.000 | 0.420 | 0.590 |
| SSD-MobileNet V2 | 6 | 0.630 | 0.130 | 0.210 |
| SSD-MobileNet V2 | 1.5 | 0.000 | 0.000 | 0.000 |

The deep learning algorithms demonstrated considerable effectiveness in real-time vision systems, with high precision indicating the capability to extract features from thermal images of unfertilized eggs. Nevertheless, it became evident that additional data is crucial for enhancing model robustness. Classifying images using only one class may not sufficiently bias the model for more detailed feature extraction.

YOLOv5 outperformed other models, achieving a higher mean Average Precision (mAP@0.50) during training and demonstrating superior F1 scores on the testing dataset due to lower false positives across all detections. YOLOv4 followed closely, while SSD-MobileNet V2 produced the least favorable results.

Fertilized eggs exhibited thermal profiles with structures relatively consistent regardless of egg turning periods. However, the low resolution of thermal cameras can hinder feature recognition, with distance and transmittance effects posing additional challenges. Errors during data collection may also lead to subpar radiometric data and misclassification of eggs. The limited number of unfertilized eggs in each treatment hindered the determination of whether varying turning periods could enhance class detection. Nonetheless, using the collected datasets allowed for testing the deep learning model, showcasing its potential for early-stage embryo detection by excluding unfertilized eggs. While most unfertilized eggs were correctly identified, the false-positive detections affected overall precision—particularly notable in the 12-hour intervals, where only one fertilized egg contributed to low error rates.

Conclusion

This study successfully employed a nondestructive and noninvasive approach to distinguish fertilized from unfertilized quail eggs, leveraging thermal microcamera technology and deep learning algorithms. The integration of isotherm analysis during incubation with the YOLO object detection algorithm highlighted significant potential for automated computer vision systems in classifying unfertilized eggs at early stages. Distinct characteristics were identified between fertilized and unfertilized eggs across all treatments, confirming that egg rotation frequency does not impact early embryo identification. Unfertilized eggs could be detected after just 12 hours of incubation.

To assess model performance, we compressed the original dataset and consolidated images into a single dataset, allowing us to evaluate the model’s robustness effectively without overfitting. The training dataset exhibited high precision during validation; however, testing precision declined when images were subjected to treatment datasets. Potential factors for this decline include low resolution and data collection errors with the thermal microcamera at each period.

Future research will focus on improving detection precision through expanded datasets and the classification of fertilized eggs, ultimately refining the methodology for broader applications in avian egg classification.
