International Journal of

ADVANCED AND APPLIED SCIENCES

EISSN: 2313-3724, Print ISSN: 2313-626X

Frequency: 12 issues per year


 Volume 11, Issue 9 (September 2024), Pages: 154-163

----------------------------------------------

 Original Research Paper

Efficient batch size detection of apple fruit in a plantation environment

 Author(s): 

 Wahyu Pebrianto 1, Ahmad Hoirul Basori 2, *, Hendra Yufit Riskiawan 1, Andi Besse Firdausiah Mansur 2, Taufiq Rizaldi 1, Nouf Atiahallah Alghanmi 2, Alanoud Subahi 2, Hermawan Arief Putranto 1, Hanadi Alkhudhayr 2, Arwa Mashat 2, Yogiswara Yogiswara 1

 Affiliation(s):

 1Information Technology Department, Politeknik Negeri Jember, Jember, Indonesia
 2Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Rabigh 21911, Saudi Arabia


 * Corresponding Author. 

  Corresponding author's ORCID profile: https://orcid.org/0000-0001-9684-490X

 Digital Object Identifier (DOI)

 https://doi.org/10.21833/ijaas.2024.09.017

 Abstract

There is growing interest in using deep learning for object recognition in robots to enhance the efficiency of apple farming. While deep learning-based object detection has shown promising results in various visual tasks, more research is needed to accurately recognize apples in orchard environments. During the training phase, it is important to determine the optimal values of hyperparameters. This research aims to develop a deep learning model, YOLOv7, to reliably identify apples in orchards, training it with four different batch size values. Models trained on the MinneApple dataset with these batch sizes serve as our reference. To assess the model's ability to work in different situations, we evaluate it using test data with varying input scales. Our results show that the optimal batch size for detecting apples in orchards is 16, achieving a mean average precision (mAP) of 50%. Furthermore, our findings suggest that increasing the batch size beyond this point does not improve the efficiency of apple detection in orchard environments.
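The batch-size sweep described in the abstract can be sketched as a small script that generates one training run per candidate value. This is a hypothetical illustration only: it assumes the official YOLOv7 `train.py` script and its `--batch-size`, `--epochs`, and `--img-size` flags; the dataset config path, epoch count, image size, and all batch sizes other than 16 (the value the paper identifies as optimal) are placeholder assumptions, not the authors' exact configuration.

```python
# Hypothetical sketch of a batch-size sweep for YOLOv7 on MinneApple.
# Only the value 16 is taken from the abstract; the other candidates
# and all paths/hyperparameters below are illustrative assumptions.

BATCH_SIZES = [4, 8, 16, 32]  # four candidate values; 16 was optimal here

def build_train_command(batch_size: int) -> str:
    """Compose one YOLOv7 training command for a given batch size."""
    return (
        "python train.py"
        " --data data/minneapple.yaml"    # assumed dataset config path
        " --cfg cfg/training/yolov7.yaml"  # assumed model config path
        f" --batch-size {batch_size}"
        " --epochs 100"                    # assumed training length
        " --img-size 640"                  # assumed input scale
    )

# One command per candidate batch size; each run's mAP would then be
# compared on the held-out test data to pick the best setting.
commands = [build_train_command(b) for b in BATCH_SIZES]
for cmd in commands:
    print(cmd)
```

Each generated command would be run separately, and the resulting mAP scores compared to select the batch size used for the final model.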

 © 2024 The Authors. Published by IASE.

 This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

 Keywords

 Deep learning, Object recognition, Apple farming, Batch size, Orchard environments

 Article history

 Received 16 May 2024, Received in revised form 6 September 2024, Accepted 8 September 2024

 Acknowledgment

This research work was funded by Institutional Fund Projects under grant no (IFPIP:901-830-1443). The authors gratefully acknowledge the technical and financial support provided by the Ministry of Education and King Abdulaziz University, DSR, Jeddah, Saudi Arabia.

 Compliance with ethical standards

 Conflict of interest: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

 Citation:

 Pebrianto W, Basori AH, Riskiawan HY, Mansur ABF, Rizaldi T, Alghanmi NA, Subahi A, Putranto HA, Alkhudhayr H, Mashat A, and Yogiswara Y (2024). Efficient batch size detection of apple fruit in a plantation environment. International Journal of Advanced and Applied Sciences, 11(9): 154-163


 Figures

 Fig. 1 Fig. 2 Fig. 3 

 Tables

 Table 1 Table 2  

----------------------------------------------   

 References (60)

  1. Abdulkadirov R, Lyakhov P, and Nagornov N (2023). Survey of optimization algorithms in modern neural networks. Mathematics, 11(11): 2466. https://doi.org/10.3390/math11112466   [Google Scholar]
  2. Ahmad HM, Rahimi A, and Hayat K (2024). Capacity constraint analysis using object detection for smart manufacturing. arXiv preprint arXiv:2402.00243. https://doi.org/10.48550/arXiv.2402.00243   [Google Scholar]
  3. Apostolopoulos ID, Tzani M, and Aznaouridis SI (2023). A general machine learning model for assessing fruit quality using deep image features. AI, 4(4): 812–830. https://doi.org/10.3390/ai4040041   [Google Scholar]
  4. Arnold M and Gramza-Michalowska A (2023). Recent development on the chemical composition and phenolic extraction methods of apple (Malus domestica)—A review. Food and Bioprocess Technology, 17: 2519-2560. https://doi.org/10.1007/s11947-023-03208-9   [Google Scholar]
  5. Bishop CM and Bishop H (2023). Convolutional networks. In: Bishop CM and Bishop H (Eds.), Deep learning: Foundations and concepts: 287-324. Springer International Publishing, Cham, Switzerland. https://doi.org/10.1007/978-3-031-45468-4_10   [Google Scholar]
  6. Bochkovskiy A, Wang CY, and Liao HYM (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934. https://doi.org/10.48550/arXiv.2004.10934   [Google Scholar]
  7. Brock A, De S, Smith SL, and Simonyan K (2021). High-performance large-scale image recognition without normalization. Proceedings of the 38th International Conference on Machine Learning, PMLR 139: 1059-1071.   [Google Scholar]
  8. Chen G, Wang H, Chen K, Li Z, Song Z, Liu Y, Chen W, and Knoll A (2022). A survey of the four pillars for small object detection: Multiscale representation, contextual information, super-resolution, and region proposal. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 52(2): 936–953. https://doi.org/10.1109/TSMC.2020.3005231   [Google Scholar]
  9. Chen W, Zhang J, Guo B, Wei Q, and Zhu Z (2021). An apple detection method based on Des‐YOLO v4 algorithm for harvesting robots in complex environment. Mathematical Problems in Engineering, 2021(1): 7351470. https://doi.org/10.1155/2021/7351470   [Google Scholar]
  10. Ge Z, Liu S, Wang F, Li Z, and Sun J (2021). YOLOX: Exceeding YOLO series in 2021. arXiv preprint arXiv:2107.08430. https://doi.org/10.48550/arXiv.2107.08430   [Google Scholar]
  11. Girshick R (2015). Fast R-CNN. In the Proceedings of the IEEE International Conference on Computer Vision, IEEE, Santiago, Chile: 1440-1448. https://doi.org/10.1109/ICCV.2015.169   [Google Scholar]
  12. Girshick R, Donahue J, Darrell T, and Malik J (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA: 580–587. https://doi.org/10.1109/CVPR.2014.81   [Google Scholar]
  13. Goyal P, Dollár P, Girshick R, Noordhuis P, Wesolowski L, Kyrola A, Tulloch A, Jia Y, and He K (2017). Accurate, large minibatch SGD: Training ImageNet in 1 hour. arXiv preprint arXiv:1706.02677. https://doi.org/10.48550/arXiv.1706.02677   [Google Scholar]
  14. Han K, Wang Y, Tian Q, Guo J, Xu C, and Xu C (2020). GhostNet: More features from cheap operations. In the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA: 1580-1589. https://doi.org/10.1109/CVPR42600.2020.00165   [Google Scholar]
  15. Hani N, Roy P, and Isler V (2020). MinneApple: A benchmark dataset for apple detection and segmentation. IEEE Robotics and Automation Letters, 5(2): 852–858. https://doi.org/10.1109/LRA.2020.2965061   [Google Scholar]
  16. He K, Zhang X, Ren S, and Sun J (2016). Deep residual learning for image recognition. In the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA: 770–778. https://doi.org/10.1109/CVPR.2016.90   [Google Scholar]
  17. Hou Q, Zhou D, and Feng J (2021). Coordinate attention for efficient mobile network design. In the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA: 13708–13717. https://doi.org/10.1109/CVPR46437.2021.01350   [Google Scholar]
  18. Huang J, Shang Y, and Chen H (2019). Improved Viola-Jones face detection algorithm based on HoloLens. EURASIP Journal on Image and Video Processing, 2019: 41. https://doi.org/10.1186/s13640-019-0435-6   [Google Scholar]
  19. Ji W, Pan Y, Xu B, and Wang J (2022). A real-time apple targets detection method for picking robot based on ShufflenetV2-YOLOX. Agriculture, 12(6): 856. https://doi.org/10.3390/agriculture12060856   [Google Scholar]
  20. Kandel I and Castelli M (2020). The effect of batch size on the generalizability of the convolutional neural networks on a histopathology dataset. ICT Express, 6(4): 312–315. https://doi.org/10.1016/j.icte.2020.04.010   [Google Scholar]
  21. Kaplun D, Deka S, Bora A, Choudhury N, Basistha J, Purkayastha B, Mazumder IZ, Gulvanskii V, Sarma KK, and Misra DD (2024). An intelligent agriculture management system for rainfall prediction and fruit health monitoring. Scientific Reports, 14(1): 512. https://doi.org/10.1038/s41598-023-49186-y   [Google Scholar]
  22. Keskar NS, Mudigere D, Nocedal J, Smelyanskiy M, and Tang PTP (2016). On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836. https://doi.org/10.48550/arXiv.1609.04836   [Google Scholar]
  23. Kuznetsova A, Maleva T, and Soloviev V (2020). Using YOLOv3 algorithm with pre-and post-processing for apple detection in fruit-harvesting robot. Agronomy, 10(7): 1016. https://doi.org/10.3390/agronomy10071016   [Google Scholar]
  24. LeCun Y, Bengio Y, and Hinton G (2015). Deep learning. Nature, 521(7553): 436–444. https://doi.org/10.1038/nature14539   [Google Scholar]
  25. LeCun Y, Bottou L, Orr GB, and Müller KR (2002). Efficient backprop. In: Orr GB and Müller KR (Eds.), Neural networks: Tricks of the trade: 9-50. Springer Berlin Heidelberg, Berlin, Germany. https://doi.org/10.1007/3-540-49430-8_2   [Google Scholar]
  26. Lin TY, Goyal P, Girshick R, He K, and Dollar P (2020). Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(2): 318–327. https://doi.org/10.1109/TPAMI.2018.2858826   [Google Scholar]
  27. Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, and Zitnick CL (2014). Microsoft COCO: Common objects in context. In the Computer Vision–ECCV 2014: 13th European Conference, Springer International Publishing, Zurich, Switzerland: 740-755. https://doi.org/10.1007/978-3-319-10602-1_48   [Google Scholar]
  28. Lin Z, Gao W, Jia J, and Huang F (2021). CapsNet meets SIFT: A robust framework for distorted target categorization. Neurocomputing, 464(24): 290–316. https://doi.org/10.1016/j.neucom.2021.08.087   [Google Scholar]
  29. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, and Berg AC (2016). SSD: Single shot multibox detector. In the Computer Vision–ECCV 2016: 14th European Conference, Springer International Publishing, Amsterdam, Netherlands: 21-37. https://doi.org/10.1007/978-3-319-46448-0_2   [Google Scholar]
  30. Ma N, Zhang X, Zheng HT, and Sun J (2018). ShuffleNet V2: Practical guidelines for efficient CNN architecture design. In: Ferrari V, Hebert M, Sminchisescu C, and Weiss Y (Eds), Computer vision – ECCV 2018. Lecture notes in computer science: 122–138. Volume 11218, Springer, Cham, Switzerland. https://doi.org/10.1007/978-3-030-01264-9_8   [Google Scholar]
  31. Onishi Y, Yoshida T, Kurita H, Fukao T, Arihara H, and Iwai A (2019). An automated fruit harvesting robot by using deep learning. ROBOMECH Journal, 6(1): 2–9. https://doi.org/10.1186/s40648-019-0141-2   [Google Scholar]
  32. Padilla R, Netto SL, and Da Silva EA (2020). A survey on performance metrics for object-detection algorithms. In the International Conference on Systems, Signals and Image Processing, IEEE, Niteroi, Brazil: 237-242. https://doi.org/10.1109/IWSSIP48289.2020.9145130   [Google Scholar]
  33. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Köpf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S et al. (2019). Pytorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32: 8024–8035.   [Google Scholar]
  34. Pebrianto W, Mudjirahardjo P, and Pramono SH (2022). YOLO method analysis and comparison for real-time human face detection. In the 11th Electrical Power, Electronics, Communications, Controls and Informatics Seminar, IEEE, Malang, Indonesia: 333-338. https://doi.org/10.1109/EECCIS54468.2022.9902919   [Google Scholar]
  35. Pebrianto W, Mudjirahardjo P, and Pramono SH (2024). Partial half fine-tuning for object detection with unmanned aerial vehicles. IAES International Journal of Artificial Intelligence (IJ-AI), 13(1): 399-407. https://doi.org/10.11591/ijai.v13.i1.pp399-407   [Google Scholar]
  36. Qian X and Klabjan D (2020). The impact of the mini-batch size on the variance of gradients in stochastic gradient descent. arXiv preprint arXiv:2004.13146. https://doi.org/10.48550/arXiv.2004.13146   [Google Scholar]
  37. Redmon J and Farhadi A (2017). YOLO9000: Better, faster, stronger. In the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA: 6517–6525. https://doi.org/10.1109/CVPR.2017.690   [Google Scholar]
  38. Redmon J and Farhadi A (2018). YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767. https://doi.org/10.48550/arXiv.1804.02767   [Google Scholar]
  39. Redmon J, Divvala S, Girshick R, and Farhadi A (2016). You only look once: Unified, real-time object detection. In the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA: 779-788. https://doi.org/10.1109/CVPR.2016.91   [Google Scholar]
  40. Ren S, He K, Girshick R, and Sun J (2017). Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6): 1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031   [Google Scholar]
  41. Sato N and Iiduka H (2023). Existence and estimation of critical batch size for training generative adversarial networks with two time-scale update rule. In the International Conference on Machine Learning, PMLR, Honolulu, USA: 30080-30104.   [Google Scholar]
  42. Stapor P, Schmiester L, Wierling C, Merkt S, Pathirana D, Lange BMH, Weindl D, and Hasenauer J (2022). Mini-batch optimization enables training of ODE models on large-scale datasets. Nature Communications, 13: 34. https://doi.org/10.1038/s41467-021-27374-6   [Google Scholar]
  43. Sun L, Hu G, Chen C, Cai H, Li C, Zhang S, and Chen J (2022). Lightweight apple detection in complex orchards using YOLOV5-PRE. Horticulturae, 8(12): 1169. https://doi.org/10.3390/horticulturae8121169   [Google Scholar]
  44. Sun Z, Caetano E, Pereira S, and Moutinho C (2023). Employing histogram of oriented gradient to enhance concrete crack detection performance with classification algorithm and Bayesian optimization. Engineering Failure Analysis, 150: 107351. https://doi.org/10.1016/j.engfailanal.2023.107351   [Google Scholar]
  45. Tan M and Le Q (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. In the International Conference on Machine Learning, PMLR, Long Beach, USA: 6105-6114.   [Google Scholar]
  46. Tan M, Pang R, and Le QV (2020). EfficientDet: Scalable and efficient object detection. In the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA: 10778–10787. https://doi.org/10.1109/CVPR42600.2020.01079   [Google Scholar]
  47. Voulodimos A, Doulamis N, Doulamis A, and Protopapadakis E (2018). Deep learning for computer vision: A brief review. Computational Intelligence and Neuroscience, 2018: 7068349. https://doi.org/10.1155/2018/7068349   [Google Scholar]
  48. Wang CY, Bochkovskiy A, and Liao HYM (2021). Scaled-YOLOv4: Scaling cross stage partial network. In the IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA: 13024–13033. https://doi.org/10.1109/CVPR46437.2021.01283   [Google Scholar]
  49. Wang CY, Bochkovskiy A, and Liao HYM (2023a). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In the IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Vancouver, Canada: 7464–7475. https://doi.org/10.1109/CVPR52729.2023.00721   [Google Scholar]
  50. Wang CY, Liao HYM, and Yeh IH (2023b). Designing network design strategies through gradient path analysis. Journal of Information Science and Engineering, 39(2): 975–995.   [Google Scholar]
  51. Wang D and He D (2022). Apple detection and instance segmentation in natural environments using an improved mask scoring R-CNN model. Frontiers in Plant Science, 13: 1016470. https://doi.org/10.3389/fpls.2022.1016470   [Google Scholar]
  52. Woo S, Park J, Lee J, and Kweon IS (2018). CBAM: Convolutional block attention module. In: Ferrari V, Hebert M, Sminchisescu C, and Weiss Y (Eds.), Computer vision – ECCV 2018, Lecture notes in computer science: 3–19. Volume 11211, Springer, Cham, Switzerland. https://doi.org/10.1007/978-3-030-01234-2_1   [Google Scholar]
  53. Wu L, Ma J, Zhao Y, and Liu H (2021). Apple detection in complex scene using the improved YOLOv4 model. Agronomy, 11(3): 476. https://doi.org/10.3390/agronomy11030476   [Google Scholar]
  54. Xiao F, Wang H, Xu Y, and Zhang R (2023). Fruit detection and recognition based on deep learning for automatic harvesting: An overview and review. Agronomy, 13(6): 1625. https://doi.org/10.3390/agronomy13061625   [Google Scholar]
  55. Xuan G, Gao C, Shao Y, Zhang M, Wang Y, Zhong J, Li Q, and Peng H (2020). Apple detection in natural environment using deep learning algorithms. IEEE Access, 8: 216772-216780. https://doi.org/10.1109/ACCESS.2020.3040423   [Google Scholar]
  56. Yong H, Huang J, Meng D, Hua X, and Zhang L (2020). Momentum batch normalization for deep learning with small batch size. In the Computer Vision–ECCV 2020: 16th European Conference, Springer International Publishing, Glasgow, UK: 224-240. https://doi.org/10.1007/978-3-030-58610-2_14   [Google Scholar]
  57. Yoshida T, Kawahara T, and Fukao T (2022). Fruit recognition method for a harvesting robot with RGB-D cameras. ROBOMECH Journal, 9: 15. https://doi.org/10.1186/s40648-022-00230-y   [Google Scholar]
  58. You Y, Li J, Reddi S, Hseu J, Kumar S, Bhojanapalli S, Song X, Demmel J, Keutzer K, and Hsieh CJ (2019). Large batch optimization for deep learning: Training BERT in 76 minutes. arXiv preprint arXiv:1904.00962. https://doi.org/10.48550/arXiv.1904.00962   [Google Scholar]
  59. Zhang X, Zhou X, Lin M, and Sun J (2018). ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA: 6848-6856. https://doi.org/10.1109/CVPR.2018.00716   [Google Scholar]
  60. Zhao Z, Wang J, and Zhao H (2023). Research on apple recognition algorithm in complex orchard environment based on deep learning. Sensors, 23(12): 5425. https://doi.org/10.3390/s23125425   [Google Scholar]