1. Introduction
Navigational safety has been a serious concern for mariners since the beginning of seafaring. The need for accurate and reliable information to ensure safe navigation in diverse waterways has led to the development of various tools and systems (Chintan et al., 2022). A nautical chart is an essential tool that provides mariners with a visual representation of vital information for safe navigation. Nautical charts show bathymetric information such as depths, hazards to navigation, aids to navigation and important topographic information on the coastline, all of which can be used to design navigation routes to an intended destination (Palma, 2021). Traditional paper navigational charts (PNCs) have always been produced on large sheets of paper at various scales. To obtain sufficient knowledge, the mariner must consult all the charts that cover the area of interest and its surroundings (Blindheim and Arne, 2021). Later, with technological advancement, the International Hydrographic Organisation (IHO) introduced the concept of Electronic Navigational Charts (ENCs) in the 1990s. ENCs are digital versions of paper charts that can be displayed on electronic chart display and information systems (ECDISs) and offer many advantages over paper charts, including seamless coverage, real-time updates, easy integration with various navigation systems and improved searchability (IHO, 2000; Masetti et al., 2018; Palikaris and Mavraei, 2020). The IHO's new standard, S-100 (Universal Hydrographic Data Model), is intended to cater for future demands for digital products and services. It is not a mere replacement for the older ENC standard S-57, but an innovative way to manage maritime data in compliance with the ISO 19100 series (Palma, 2021). ENCs are thereby an essential component of e-navigation, the use of electronic means to enhance berth-to-berth navigation.
According to Brown et al. (1988), who compared the ease of using ENCs with that of PNCs, paper nautical charts are used less frequently than electronic nautical charts because of their limitations. However, mariners still keep all relevant paper nautical charts as a backup in case of unforeseen events, such as a system or power failure while using electronic charts. Even though paper nautical charts are common, there is no fast way to convert them to electronic versions, and ENCs still do not adequately cover the entire ocean. Therefore, a means is needed to accelerate the conversion of paper nautical charts to electronic navigational charts for all navigable waters and to enable full-scale use of ECDIS (Weintrit, 2001; IMO, 2006).
Chintan et al. (2022) provided an overview of the advancements in image classification algorithms based on Convolutional Neural Networks (CNNs), drawing on a comprehensive literature review of CNN architectures, optimisation techniques and their applications in various domains. They concluded that CNNs have significantly improved image classification and support a broad range of applications. They also highlighted the challenges and future research directions in the field of CNN-based image classification.
One such application is underwater target feature extraction and recognition. He et al. (2017) employed various deep learning architectures, such as CNNs, Deep Belief Networks (DBNs) and Recurrent Neural Networks (RNNs), to process and classify underwater sonar images. Zhixiong et al. (2020) revealed that deep learning methods were effective in extracting and recognising underwater target features. Deep learning techniques therefore have the potential to significantly improve underwater target recognition, which is essential for various applications, including marine surveillance and exploration. Girshick et al. (2014) conducted a review of text feature extraction techniques based on deep learning. The review analysed various deep learning models, such as CNNs, RNNs and autoencoders, to understand their application in extracting meaningful features from text data. The authors found that deep learning models are successful in text feature extraction, text classification and information retrieval. Further, the review concluded that deep learning-based text feature extraction techniques have advanced the field of natural language processing. However, some challenges remain in terms of model interpretability and scalability.
2. Problem
The current process of transforming traditional PNC symbol data into ENC symbols consumes substantial time and increases the potential for human error. The absence of an automated system for chart symbol recognition underlines the need for a more efficient solution. This study therefore aimed to replace the manual process of symbol recognition and conversion with an automated one, reducing the inefficiency associated with the current manual method.
3. Methodology
3.1 Symbol recognition model
The object detection model proposed in this study was designed to identify relevant symbols present within the input Raster Nautical Charts (RNCs) and categorise them effectively while enumerating them. The model, when presented with a user input RNC or a segment of a paper chart image, was expected to exhibit the capability to distinguish symbols and subsequently classify these symbols into related classes, such as major and minor lights, wrecks, rocks, and buoys.
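The classify-and-enumerate behaviour described above can be illustrated with a minimal sketch. It assumes a simplified detector output, a list of (class_id, confidence) pairs, and a hypothetical class list built from the categories named in the text. Neither the class names nor the confidence threshold are taken from the study.

```python
from collections import Counter

# Hypothetical class names, based on the symbol categories named in the text.
CLASS_NAMES = ["major_light", "minor_light", "wreck", "rock", "buoy"]


def count_symbols(detections, min_confidence=0.25):
    """Count detected symbols per class.

    `detections` is a list of (class_id, confidence) pairs, an assumed
    simplification of a detector's output. Detections below
    `min_confidence` (an assumed threshold) are ignored.
    """
    counts = Counter()
    for class_id, confidence in detections:
        if confidence >= min_confidence:
            counts[CLASS_NAMES[class_id]] += 1
    return dict(counts)
```

For example, `count_symbols([(0, 0.9), (3, 0.8), (3, 0.1)])` counts one major light and one rock, because the second rock detection falls below the assumed threshold.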
3.2 The data
The first stage involved a series of activities that included data gathering, data cleaning and data transformation into a suitable format for model generation. To acquire sample images of chart symbols, scanned images of paper nautical charts, referred to as RNCs, were used. These data were obtained from a variety of sources, including existing paper charts and online repositories. Despite this diversity of sources, the methodology aimed to ensure the quality of the training data and to create a large dataset, thereby enhancing the model's ability to detect paper chart symbols accurately in various contexts.
3.3 Data pre-processing
Data pre-processing involved resizing and cropping the images to improve their quality by removing unwanted elements and focusing on the desired symbols. It is also critical to ensure an even distribution of samples across the classes. Therefore, throughout data collection, the aim was to maintain a balanced distribution among all classes and to address key aspects such as data scarcity, class imbalance, inconsistent labelling, data quality, data diversity, data relevance and privacy concerns, so as to generate a dataset that could be used to train the models effectively. Finally, a comprehensive and accurate dataset for training the object detection model was created. This resulted in a robust and accurate model that is capable of detecting and classifying chart symbols effectively.
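One way to monitor the class balance mentioned above is to count annotated boxes per class. The sketch below assumes labels stored in the YOLO text convention (one line per box, starting with the class id). The function names are hypothetical and this is not the authors' tooling.

```python
from collections import Counter


def class_counts(label_lines):
    """Count boxes per class id from YOLO-style label lines.

    Each non-empty line is assumed to look like
    'class_id x_center y_center width height'.
    """
    return Counter(int(line.split()[0]) for line in label_lines if line.strip())


def class_distribution(label_lines):
    """Return the share of annotated boxes belonging to each class id."""
    counts = class_counts(label_lines)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {class_id: n / total for class_id, n in sorted(counts.items())}


def imbalance_ratio(label_lines):
    """Largest class count divided by the smallest; 1.0 means balanced."""
    counts = class_counts(label_lines)
    if not counts:
        raise ValueError("no labels provided")
    return max(counts.values()) / min(counts.values())
```

A ratio well above 1.0 would suggest collecting more samples for the rarer classes before training.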
3.4 Image annotation
Once the data were ready, the next step was to annotate the symbols using bounding boxes to guide the learning process of the object detection model. Bounding boxes were drawn around each object targeted for detection and the corresponding object class was labelled for each box. This labelling process supplies the ground truth information required for the model to learn to detect objects precisely. According to the literature, various labelling software tools are available for annotating datasets, including the Computer Vision Annotation Tool (CVAT), LabelImg and the Visual Object Tagging Tool (VoTT). Here, LabelImg, a free and open-source labelling tool, was used to label each symbol according to its name. It is crucial that each symbol in the dataset is labelled with the correct name, because proper labelling facilitates the model's learning process, enhances its detection performance and enables it to identify symbols correctly in future predictions. These images were then used to train an Artificial Neural Network (ANN) model for object detection using a supervised learning approach.
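YOLO-style training expects each bounding box as a class id followed by the box centre and size, normalised by the image dimensions. The sketch below shows that conversion from pixel corner coordinates, as one would obtain from a box drawn in an annotation tool. The function names and the six-decimal formatting are illustrative choices, not details reported in the study.

```python
def corners_to_yolo(box, image_width, image_height):
    """Convert (xmin, ymin, xmax, ymax) in pixels to normalised
    (x_center, y_center, width, height)."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2 / image_width
    y_center = (ymin + ymax) / 2 / image_height
    width = (xmax - xmin) / image_width
    height = (ymax - ymin) / image_height
    return x_center, y_center, width, height


def yolo_label_line(class_id, box, image_width, image_height):
    """Format one annotation as a YOLO label line."""
    values = corners_to_yolo(box, image_width, image_height)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)
```

For a box with corners (100, 200) and (300, 400) in a 1000 by 800 pixel image, the centre is at (0.2, 0.375) and the normalised size is 0.2 by 0.25.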
3.5 Training an object detection model using YOLOv5
YOLO, an acronym for You Only Look Once, is the deep learning-based approach employed in this study for object detection. It uses a single CNN to detect objects in an image (Redmon et al., 2016). The technique was first introduced in 2015, and the YOLO algorithm has since undergone significant improvements in subsequent versions, such as YOLOv2, released in 2016. YOLO's innovation lies in its use of a single CNN for object detection instead of requiring a separate network for each object class (Gong et al., 2022). It was the first object detection model to combine bounding box prediction and object classification within a single end-to-end differentiable network. After selecting YOLOv5m, the next step was to import the necessary libraries and load the datasets, which contained symbols of different classes. The scikit-learn, TensorFlow, OpenCV, NumPy and Matplotlib libraries were used to perform the various tasks involved. Google Colab, a cloud-based platform that provides free access to graphics processing unit (GPU) resources and allows seamless collaboration on Jupyter notebooks, was used as the development environment. With these libraries and tools, the object detection model was developed and trained efficiently on the paper chart symbol datasets to achieve optimal performance and accuracy. The libraries and the Google Colab environment also allowed easy experimentation and iteration, enabling the successful completion of the research.
3.6 Setting up the YOLOv5 environment
The YOLOv5 repository was cloned from its official GitHub source and the required dependencies were installed to begin working with YOLOv5. This step ensured that the programming environment was set up properly to run object detection training and inference commands. The dataset was split into two parts: 80% of the data were used for training and 20% for validation. Once the training parameters were set, the training command was executed to start the training process. During training, the model was optimised using a loss function that measures the difference between the predicted and ground-truth bounding boxes and class probabilities. Several other techniques, including hyperparameter tuning, were also used to optimise the performance of the model. The training process continued until the specified number of epochs was completed.
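The 80/20 division described above can be sketched as a seeded random split over image file names. The function name, the default seed and the example file names are assumptions for illustration; the study does not report how the split was drawn.

```python
import random


def split_dataset(image_names, train_fraction=0.8, seed=0):
    """Shuffle image names reproducibly and split them into
    training and validation lists."""
    names = sorted(image_names)          # fixed order before shuffling
    random.Random(seed).shuffle(names)   # seeded, so the split is repeatable
    cut = int(round(len(names) * train_fraction))
    return names[:cut], names[cut:]


# Hypothetical file names, for illustration only.
example_names = [f"chart_{i:03d}.png" for i in range(10)]
train_names, val_names = split_dataset(example_names)
```

Fixing the seed keeps the validation images the same across runs, so that results from different model variants are compared on identical data.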
3.7 Model validation and performance analysis
Upon completion of the model training, it was crucial to evaluate the model's accuracy using a distinct set of validation images that had not been used during training. This evaluation ensured that the model could generalise to previously unseen data and provided insight into its real-world performance. Therefore, the effectiveness of the proposed approach was assessed via a series of trials that used a dataset of paper chart symbols. This dataset contained images of various symbols extracted from various paper charts, including hazards, aids to navigation and other features. This practice offered an unbiased estimate of the model's generalisation capabilities when presented with new, unseen data. An evaluation process was then designed to assess the effectiveness and efficiency of the automated system in comparison with the manual method. The evaluation also aimed to identify any potential issues or limitations of the automated system. The overall methodology is given in Figure 1.
4. Results and analysis
The proposed methodology exhibited a high degree of accuracy in paper chart symbol detection and identification. Hence, this research revealed the potential of using artificial neural networks to detect and identify the symbols on conventional paper charts and subsequently enable the efficient production of Electronic Navigational Charts from existing paper charts. The use of ANNs removes the need for manual intervention in symbol detection and identification, an essential phase of the ENC production process, and thereby reduces human error, production time and cost. Further exploration and modification of this method may lead to even higher levels of accuracy and efficiency. To evaluate the effectiveness of the YOLOv5-based model in detecting various paper chart symbols, a series of experiments was conducted, and the overall efficiency and accuracy of the conversion process were analysed.
4.1 Detecting the symbols
Here, the performance of the YOLOv5-based model in detecting various chart symbols was evaluated using commonly employed metrics, including Precision (P), Recall (R), Average Precision (AP) and mean Average Precision (mAP). These metrics provided a comprehensive understanding of the model's performance in detecting paper chart symbols and of the effectiveness of the proposed approach. The confusion matrix showed that the model performed well in identifying the ‘Obstruction’ and ‘Major/Minor Light’ categories. The ‘Visible Wreck’ category was frequently misclassified as ‘Background’, suggesting a potential area for improvement. The Precision-Confidence and F1-Confidence curves provided insights into the model's precision and F1 score across different confidence thresholds, while the training and validation losses were used to assess the model's generalisation ability. These tools helped identify the strengths of the model and the areas that require improvement.
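For reference, precision, recall, the F1 score and mAP can be computed as in the minimal sketch below. It assumes the common convention that a detection counts as a true positive when its intersection over union (IoU) with a ground-truth box of the same class exceeds a threshold such as 0.5. The threshold used in the study is not stated, and the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true positive, false positive
    and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def mean_average_precision(ap_per_class):
    """mAP as the unweighted mean of per-class Average Precision values."""
    if not ap_per_class:
        return 0.0
    return sum(ap_per_class.values()) / len(ap_per_class)
```

Under these definitions, a class that is often predicted as background, as reported for ‘Visible Wreck’, accumulates false negatives and therefore shows low recall even when its precision is high.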
A comparison is provided in this research between three YOLOv5 models: YOLOv5n, YOLOv5s and YOLOv5m. Each model offers a distinct balance between detection accuracy and inference speed. The YOLOv5m model outperforms the other models in object detection, achieving a notable mean Average Precision of 0.837, thereby highlighting its reliability and performance (Figure 2). However, the model's performance may vary depending on the quality and diversity of the datasets used for training and validation.
Although the model exhibited satisfactory performance in detecting numerous chart symbols, certain symbols with low contrast might still present challenges for precise detection within the context of paper chart symbol recognition (Figures 3 and 4).
This research contributes a detailed analysis of the YOLOv5 model's performance in detecting paper chart symbols. The findings highlighted the strengths of the models and suggested potential areas for improvement, thereby offering a roadmap for future research and applications in the field of ENC production. The results show that the proposed automated system could potentially replace the current manual method for chart symbol detection, recognition and conversion.
5. Discussion and conclusions
The study demonstrated the ability of a YOLOv5-based neural network to identify paper chart symbols and convert them into their corresponding ENC symbols. This automation is intended to improve operational efficiency in the production of ENCs and to reduce human error, while complying with the standards of the International Hydrographic Organisation (IHO). The accuracy and efficiency of detecting and converting paper chart symbols were thereby demonstrated. By automating the conversion process, national hydrographic agencies and maritime organisations can update and maintain their chart databases more efficiently, providing up-to-date and accurate navigational information to mariners. Overall, this research highlights the application of artificial intelligence (AI) in the field of marine cartography. By using these technological advancements, the safety and efficiency of maritime navigation can be improved, ultimately benefiting the maritime industry.
5.1 Limitations
Despite some limitations, such as the model's varying performance on symbols with complex shapes, the proposed method offers a significant step forwards in the field of automated chart symbol conversion. One limitation encountered during this study was the lack of access to a diverse set of paper charts. Small-scale charts were mostly used, and the method still needs to be tested on large-scale harbour charts, which contain greater detail. This limited availability of charts posed a challenge in capturing a comprehensive range of nautical symbols, such as can buoys, which are more commonly found on larger-scale or specific types of charts. Consequently, it was difficult to maintain class balance within the dataset. Another limitation was that the model sometimes misidentified text on the paper charts as rocky areas. This occurred because the model was trained to detect rocky area symbols, which may share some visual similarities with the text on the charts. To mitigate this, a potential solution could involve an additional specialised model trained to detect text. Before the scanned image of the paper chart is fed into the primary model, it could be processed by this text detection model. Pixels identified as containing text could then be set to a single colour (such as black), which would prevent the primary model from confusing these text elements with rocky area symbols.
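The proposed text-masking step can be sketched as follows, assuming that a separate text detector (not part of this study) has already returned bounding boxes for the text regions. The image is represented here as a plain list of pixel rows to keep the example self-contained, and the function name is hypothetical.

```python
def mask_text_regions(image, text_boxes, fill_value=0):
    """Return a copy of a greyscale image with text regions painted over.

    `image` is a list of rows of pixel values. `text_boxes` holds
    (xmin, ymin, xmax, ymax) pixel boxes from an assumed text detector,
    with xmax and ymax exclusive. `fill_value` 0 corresponds to black.
    """
    masked = [row[:] for row in image]   # copy, so the scan is untouched
    height = len(masked)
    width = len(masked[0]) if height else 0
    for xmin, ymin, xmax, ymax in text_boxes:
        # Clip each box to the image bounds before painting.
        for y in range(max(0, ymin), min(height, ymax)):
            for x in range(max(0, xmin), min(width, xmax)):
                masked[y][x] = fill_value
    return masked
```

The masked copy would then be passed to the symbol detection model in place of the original scan. A practical drawback of this design is that symbols overlapping a text box would be hidden as well.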
5.2 Recommendations
The incorporation of the trained YOLOv5 model into CARIS S-57 Composer further highlights the practical applicability of this research in the real-world context of navigational chart production. The results of this study can serve as a foundation for the development and refinement of ANN-based techniques in the field of marine cartography. Future research could explore the integration of additional data sources to improve the robustness of the model, as well as the extension of the approach to other aspects of chart production, such as depth contour recognition and labelling. The proposed method also has the potential to be adapted for other domains, such as land surveying and geological mapping, where similar symbol recognition and conversion tasks are needed.