An official website of the United States government

Official websites use .gov A .gov website belongs to an official government organization in the United States.

Secure .gov websites use HTTPS A lock ( Lock Locked padlock icon ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

  • Publications
  • Account settings
  • Advanced Search
  • Journal List

Sensors (Basel, Switzerland) logo

A Systematic Literature Review on Cyber Threat Intelligence for Organizational Cybersecurity Resilience

Saqib saeed, sarah a suayyid, manal s al-ghamdi, hayfa al-muhaisen, abdullah m almuhaideb.

  • Author information
  • Article notes
  • Copyright and License information

Correspondence: [email protected]

Received 2023 Jul 7; Revised 2023 Aug 12; Accepted 2023 Aug 14; Collection date 2023 Aug.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( https://creativecommons.org/licenses/by/4.0/ ).

Cybersecurity is a significant concern for businesses worldwide, as cybercriminals target business data and system resources. Cyber threat intelligence (CTI) enhances organizational cybersecurity resilience by obtaining, processing, evaluating, and disseminating information about potential risks and opportunities inside the cyber domain. This research investigates how companies can employ CTI to improve their precautionary measures against security breaches. The study follows a systematic review methodology, including selecting primary studies based on specific criteria and quality valuation of the selected papers. As a result, a comprehensive framework is proposed for implementing CTI in organizations. The proposed framework is comprised of a knowledge base, detection models, and visualization dashboards. The detection model layer consists of behavior-based, signature-based, and anomaly-based detection. In contrast, the knowledge base layer contains information resources on possible threats, vulnerabilities, and dangers to key assets. The visualization dashboard layer provides an overview of key metrics related to cyber threats, such as an organizational risk meter, the number of attacks detected, types of attacks, and their severity level. This relevant systematic study also provides insight for future studies, such as how organizations can tailor their approach to their needs and resources to facilitate more effective collaboration between stakeholders while navigating legal/regulatory constraints related to information sharing.

Keywords: cybersecurity, cyber threat intelligence, business organizations, mitigation

1. Introduction

Cybersecurity is a significant concern for businesses worldwide, because cyber attackers constantly target corporate data and information technology (IT) resources to make money or gain a geopolitical advantage [ 1 ]. Cybersecurity can be defined as securing individual or organizational electronic data from unauthorized access. An attempt to gain unauthorized access is termed a cyber-attack, and these attacks may entail the theft of private data, intellectual property, confidential business strategy plans, and/or the disruption of mission-critical IT systems. Organized crime syndicates and nation-state paramilitary cyber organizations have also started using cyber-attacks as an operational strategy that has led to the development of advanced persistent threats (APTs), which are becoming increasingly difficult for organizations to defend against despite having formalized cybersecurity systems [ 2 , 3 ].

Cyber threat intelligence (CTI) has emerged as a potential solution for businesses to address security events’ increasing quantity and complexity. CTI refers to the proactive identification and analysis of cyber threats. However, subscribing to different threat intelligence sources can lead to information overload. A Threat Intelligence-Sharing Platform (TISP) can take cyber threat information data and turn it into actionable intelligence that can be fed into multiple technologies to help in incident response. Information security firms and the ecosystem currently offer TISP solutions in two categories: content aggregation, which provides numerous threat data feeds, and threat intelligence management, which generates economic value from the data obtained [ 4 ].

CTI is a process of gathering, analyzing, and distributing information to identify, monitor, and anticipate cyber threats. CTI can help businesses become more proactive in cybersecurity by identifying vulnerabilities before attackers exploit them [ 5 ]. For example, suppose a particular threat actor has been known to target companies using a specific malware or attack method. In this case, CTI could help to identify this pattern early on so that intrusion detection systems can be assigned to look for those patterns specifically. CTI also plays an essential role in detecting attacks by setting intrusion detection systems based on practices associated with certain threat actors or types of attacks identified by analyzing gathered intelligence data. Furthermore, CTI provides specific security plans tailored toward countering the mannerisms used by cyber threat actors, making it an essential tool for organizations looking at preventing, detecting, and responding effectively against potential cyber-attacks [ 2 ].

As a response to the benefits it provides and the present market trend, CTI has attracted the attention of most organizations. Consequently, CTI alters the organization’s processes and actions as it faces various issues [ 6 , 7 ]. Therefore, the current research will document state-of-the-art cyber threat intelligence. We have conducted a systematic literature review based on the scientific literature published in the last five years to highlight how evolving procedures and technology have helped organizations to improve the cybersecurity of their critical infrastructures by improving CTI. Based on the review, we have outlined a layered CTI framework for organizations to improve their cybersecurity resilience. This model provides a starting point for other researchers to deploy and test applications in organizations to improve their CTI.

This paper’s structure is as follows: Section 2 defines the procedures for selecting primary studies for systematic analysis. Section 3 explains the results of all of the designated significant research. Section 4 considers the findings in connection to the previously indicated research subjects. Section 5 concludes the analysis and makes some recommendations for further study.

2. Materials and Methods

Our review followed the guidelines provided by Kitchenham and Charters [ 8 ], along with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [ 9 ] recommendations for systematic review. An extensive systematic literature review (SLR) was conducted through an iterative planning, examination, and documentation process. Primary studies were extracted by directing keywords to scientific repositories. Keywords were designated to aid in developing study outcomes that contribute to the research. We mainly used “Cyber Threat” as the search term to extract relevant papers from the Institute of Electrical and Electronics Engineering (IEEE) digital library [ 10 ], the Association of Computing Machinery (ACM) digital library [ 11 ], and the university library’s online repository. Searches were performed on paper titles during March 2023. The inclusion criteria were as follows:

The selected paper should focus on CTI in business organizations.

The paper should be published between 2019 and 2023.

All papers were journal articles or conference proceedings; any other publication type was excluded.

From the first database keyword searches, we identified 294 research works, of which eight were duplicates. After we checked the title and keywords on each paper under the inclusion/exclusion criteria, we found that 70 papers were not journal or conference publications. Also, 153 papers could not be retrieved. As a result, the number of papers available to review was left at 63. After reading the 63 articles in full and using the inclusion/exclusion criteria, we settled on 52 papers for inclusion in our study. To verify that extracted data passed an assessment of quality, they underwent a test to assess that the data were complete and relevant, and subsequently, this assessment to determine the accuracy of the information included in these studies. The process of identifying the extracted studies went through several stages to have a group of papers that passed the following quality assessment stages.

Stage 1: The manuscript must be published in peer-reviewed journals or conference proceedings. Poster presentations, books, and blogs were left out due to quality concerns.

Stage 2: The paper must be focused on the CTI domain that impacts organizations’ performance.

Stage 3: The paper must be a case study, system application, or modeling implementation.

A quality evaluation checklist was used to evaluate the papers that were found. Eleven studies were excluded from the analysis, because they did not meet the checklist elements’ criteria. The information and data extracted from each paper were stored in a table containing important information for the classification of each paper. Figure 1 shows the distribution of the publication year of the paper. In contrast, Figure 2 presents the total number of accepted articles that went through the review process, starting from the selection of keywords in the selected databases to the last stage.

Figure 1

Annual total of original research articles.

Figure 2

Number of accepted articles in SLR.

The topics of the primary investigations were categorized more broadly based on the narrower emphasis of each publication. The Detection Model subcategory includes research to improve methods for detecting assaults, introduce novel detection models, or use novel detection characteristics. Knowledge Sharing and Training is an umbrella area for research on information dissemination, public education, and better decision making. Based on the presented categories, further analysis was conducted to extract each paper’s main details.

3.1. Detection Model

Suryotrisongko et al. [ 12 ] developed an automated mechanism for detecting botnet Domain Generation Algorithm (DGA) attacks using natural language processing methods and machine-learning procedures. The authors developed a new model that identifies features from unstructured reports and determines cyber threat actors, achieving high accuracy rates of 96% and a precision rate of 96.4%. The model is based on query logs from Domain Name System (DNS) servers from 2004 to 2015, and it identified 107 malicious domain names associated with botnet traffic [ 12 ]. In another study, Moraliyage et al. [ 13 ] proposed using Artificial Intelligence (AI) to categorize the sites based on their content, using a new approach called explainable deep learning. The explainable deep-learning approach analyzes images and text on each site with advanced AI algorithms, such as the Convolutional Neural Network for image analysis with Gradient-weighted Class Activation Mapping (Grad-CAM) and pre-trained word embedding for text analysis. Combining these techniques in two learning pathways—one focused on images, and one focused on text—the method can accurately identify different types of onion services while explaining how it made those classifications based on specific features or patterns found within each site’s content [ 13 ].

In another study, Irshad and Siddiqui [ 14 ] proposed a mechanism to identify features from reports about cyber attackers. They used natural language processing methods and machine-learning algorithms, specifically an embedding model called “Attack2vec”. This model was trained on domain-specific embeddings and optimized for the specific language used in cybersecurity reports. The authors then used machine-learning algorithms to classify different cyber threat actors based on these extracted features. The suggested technique yields excellent accuracy rates of 96 percent accuracy, 96.4 percent precision, 95.58 percent recall, and 95.75 percent F1 measure (which combines precision and recall) [ 14 ].

Finding and deriving threat actions from unstructured CTI information is a difficult task. The current methods rely on semantic dependency and ontology, but they have limitations in accurately extracting all key threat actions and measuring their information content. Zhang et al. [ 15 ] proposed a new approach called EX-Action to address this issue. It is a multimodal learning approach that uses mutual information and natural language processing (NLP) methods to classify threat actions from unstructured CTI records. The framework consists of two main steps: 1. Threat actions are extracted by matching syntactic rules based on the sentence structure in the CTI report. 2. Extracted threats are identified using multimodal learning algorithms. Normalized mutual information (NMI) is used as an evaluation indicator to evaluate the completeness of extracted information content. The proposed method was tested on 243 unstructured CTI reports, with excellent accuracy reaching 79 percent [ 15 ].

In addition, the significance of cyber-physical systems is critical in developing a maintainable computing ecosystem for scalable and secure network design. Cha et al. [ 16 ] proposed a methodology that involved collecting data from network devices, extracting meaningful information such as file hash values and IP addresses, and distributing this information with a centralized institution called a cloud server (CS) to build a credible dataset. Duplicate data generated by multiple feeds were removed when the authors created their datasets. Blockchain technology was used to protect the integrity of CS’s centralized data and rewarded companies that contributed to creating trustworthy datasets. This system reduced network load while ensuring reliability, privacy, scalability, and sustainability for large-scale IoT systems that generate big data communication efficiently. In confined test settings, employing the IP addresses of open-source intelligence CTI feeds saved roughly 15% of storage space relative to total network resources [ 16 ].

Gong and Lee [ 17 ] proposed a framework intended to assist enterprises in real-time detection, analysis, and response to cyber threats and reduce the effect of cyber-attacks on business operations. The framework was comprised of four stages: threat intelligence collection, threat analysis and triage, incident response planning, and execution. The research provides a complete account of each step and discusses the tools and techniques that can be used to implement them. Furthermore, the study’s authors conducted several evaluation experiments on the effectiveness of the proposed framework in detecting and responding to cyber threats in an energy cloud platform. The results showed that the framework could detect and respond to cyber threats in real time and significantly reduce the time taken to detect and mitigate cyber-attacks. Overall, the CTI framework proposed in this study offers a comprehensive approach to incident response in an energy cloud platform. The framework can assist firms in proactively detecting and responding to cyber-attacks, lowering the risk of disruption to their operations, and improving the overall security posture of the energy cloud platform. The study also highlights several challenges related to CTI. One of the main challenges is the difficulty in obtaining high-quality intelligence relevant to the organization’s specific needs. Developing artificial intelligence requires significant resources and expertise to collect, analyze, and validate intelligence data. Another challenge is the lack of standardization in intelligence collection, analysis, and sharing, which makes it difficult to compare and evaluate intelligence from different sources. Additionally, the study highlights the need for organizations to balance the benefits of sharing intelligence with the risks of sharing sensitive information with third parties. Finally, the fast speed of technological progress and the changing character of cyber threats mean that organizations must continually adapt and update their CTI strategies to stay ahead of the threats [ 17 ].

Ejaz et al. [ 18 ] explored the applications of machine learning (ML) in visualizing patterns in CTI data to improve cybersecurity. The study highlights the importance of CTI in protecting organizations against cyber threats and using ML techniques to analyze and visualize large volumes of CTI data. Organizations can take proactive measures to protect their systems from potential attacks by identifying patterns in the data. The article also identified several challenges related to CTI, including the complexity and volume of data, the lack of standardization in data collection and sharing, and the need for skilled personnel to analyze and interpret the data. The study suggests that addressing these issues could enhance the effectiveness of CTI and cybersecurity measures. Overall, the article focuses on the potential of ML techniques to improve the visualization and analysis of CTI data and strengthen cybersecurity measures in organizations. There are several challenges related to CTI, including the fact that the quality of the data used to generate CTI can vary widely. Incomplete or inaccurate data can lead to flawed threat assessments and ineffective security measures. Additionally, information overload, or the volume of data related to cybersecurity threats, can be overwhelming, creating difficulties for organizations in identifying and prioritizing the most relevant threats. Also, practical CTI requires specialized skills and knowledge, which can be difficult for organizations to acquire and maintain. Collaboration and information sharing about cyber threats across organizations can be challenging, as it requires trust and cooperation between different entities. Building and maintaining a robust CTI capability can be expensive, particularly for smaller organizations with limited resources [ 18 ].

Mendez Mena and Yang [ 19 ] developed a framework for decentralized threat intelligence that can be applied to traditional networks and the Internet of Things (IoT). The authors argue that traditional centralized threat intelligence approaches are insufficient for the rapidly evolving threat landscape and the growing number of connected IoT devices. The proposed framework includes several components: distributed threat data collection and analysis, decentralized threat intelligence sharing, and autonomous threat response. The article discusses the advantages of a decentralized approach to threat intelligence, including improved scalability, resiliency, and privacy. The authors also highlight the challenges that must be addressed, such as ensuring trust and consensus among distributed nodes and addressing potential performance bottlenecks. Overall, the article provides a comprehensive overview of the proposed framework for decentralized threat intelligence and its potential applications in securing networks and IoT devices [ 19 ].

Liu et al. [ 20 ] highlighted how current threat intelligence systems rely heavily on manual analysis, which is time-consuming and prone to errors. Therefore, they proposed a machine-learning-based system that automatically identifies relevant threat intelligence and provides actionable insights. The proposed approach, TriCTI, uses a combination of neural networks and trigger detection algorithms to identify patterns in threat intelligence data. The system was trained on a large dataset of CTI reports and could automatically extract relevant indicators of compromise (IOCs) and identify potential threat actors. The authors evaluate TriCTI’s performance on several datasets and compare it to other state-of-the-art threat intelligence systems. They show that TriCTI outperforms other systems regarding precision and recall, indicating that it can effectively identify relevant threat intelligence. The article also discusses the potential applications of TriCTI in cybersecurity operations, such as incident response and threat hunting. The authors argue that TriCTI can significantly reduce the time and effort required to identify and respond to cyber threats. Overall, the article presents a comprehensive overview of TriCTI, a machine-learning-based system for CTI discovery [ 20 ].

Kiwia et al. [ 21 ] presented a taxonomy for banking trojans based on the cyber kill chain model, a framework used to describe the stages of a cyber-attack. The taxonomy structures the characteristics and behavior of banking trojans and develops more effective countermeasures. The authors use an evolutionary computational intelligence approach to identify the standard features and behavior of banking trojans and cluster them into different categories based on the stages of the cyber kill chain. The resulting taxonomy is comprised of six categories: intelligence gathering, weaponization, delivery, exploitation, installation, and command and control. The study also analyzes the characteristics and behavior of each category of a banking trojan, including the techniques they use to evade detection and spread, the types of information they target, and the impact they can have on the victims. The authors argue that understanding the behavior and characteristics of banking trojans is crucial for developing effective countermeasures to protect against these types of attacks. Overall, this study provides a valuable framework for understanding the behavior and characteristics of banking trojans and highlights the need for ongoing research and development of countermeasures to protect against these attacks [ 21 ].

Gong and Lee [ 22 ] presented the BLOCIS framework, which addresses the limitations of existing CTI-sharing systems, such as the vulnerability to Sybil attacks and the lack of privacy and accountability. The authors use a combination of blockchain technology and game theory to create a decentralized and trustless system for sharing CTI. The system allows participants to contribute threat intelligence anonymously while ensuring that other participants validate and verify the information before being added to the blockchain. The study also provides a detailed analysis of the BLOCIS framework, including its architecture, algorithms, and protocols. The authors evaluate the framework’s effectiveness using simulation experiments and compare it to existing CTI-sharing systems. The study results show that the BLOCIS framework effectively prevents Sybil attacks and ensures participants’ privacy and accountability. The authors conclude that the BLOCIS framework can improve the effectiveness and efficiency of CTI sharing and enhance the overall security of the digital ecosystem. This study proposes a novel blockchain-based framework for sharing CTI in a Sybil-resistant manner, providing enhanced privacy and accountability. The BLOCIS framework could revolutionize the field of CTI sharing and contribute to a more secure digital ecosystem [ 22 ].

Borges Amaro et al. [ 23 ] designed a framework to address the challenges of managing the increasing amount of data generated by cyber threats and to provide organizations with a structured approach to making sense of the information. The proposed framework is comprised of six stages: data collection, data processing, data analysis, threat intelligence production, dissemination, and consumption. The study also discusses the various tools and techniques that can be used to implement each stage of the framework. Furthermore, the study emphasizes the importance of visualizing CTI data to aid in decision making and recommends the use of interactive dashboards and heat maps. The proposed methodological framework provides a comprehensive and structured approach to managing and utilizing CTI data for effective organizational decision making [ 23 ].

Al-Fawa’reh et al. [ 24 ] proposed a PCADNN model that combines principal component analysis (PCA) and deep neural network (DNN) algorithms to analyze network traffic data and detect anomalous behavior. The PCA algorithm was used to reduce the dimensionality of the input data, and the DNN algorithm was used to classify the data and identify abnormal patterns. The study’s authors conducted several experiments to evaluate the effectiveness of the proposed approach. The results showed that the PCADNN model could accurately detect anomalous network behavior with high precision and recall. Overall, the PCADNN model presented in this study provides a powerful tool for detecting cyber threats and improving organizations’ overall security posture. The method uses deep-learning algorithms to examine network activity data and find aberrant patterns, allowing enterprises to identify and react to cyber-attacks [ 24 ] rapidly.

In another study, Sun et al. [ 25 ] developed a technique based on automated intelligence production for cyber threat records using multi-source information fusion. The proposed approach integrates different data types, such as network traffic, system logs, and external threat intelligence feeds, to generate more comprehensive and accurate threat intelligence records. To show the viability of the suggested strategy, the authors created a prototype system. The study’s findings demonstrate that the technique may provide reliable threat intelligence records and boost the effectiveness of threat intelligence analysis. CTI has access to a wide range of data sources, including social media, the dark web, and open-source information, and it can analyze and keep track of these to gain vital insights regarding future cyber-attacks and their techniques. It enables businesses to actively reduce risks and safeguard their resources, systems, and networks. Moreover, CTI can provide situational awareness by informing organizations about emerging cyber threats and trends. It enables organizations to stay ahead of cyber attackers and respond more effectively to potential cyber incidents. The main challenges related to CTI include data quality, data overload, lack of standardization, and skill shortages. Thus, it can be challenging for organizations to build and maintain a CTI capability in-house. Implementing CTI can be expensive, particularly for smaller organizations with limited resources, which can make it difficult for them to justify the investment in CTI [ 25 ].

Serketzis et al. [ 26 ] highlighted organizations’ need for an effective incident response plan to deal with potential cyber-attacks. The authors also provide insights into how digital forensics can detect and investigate cyber-attacks and how actionable threat intelligence can improve the accuracy and speed of digital forensics investigations. The paper highlights the challenges faced by organizations in collecting and analyzing threat intelligence and provides recommendations for overcoming these challenges to achieve practical digital forensics measures [ 26 ].

Raptis et al. [ 27 ] highlighted that Connected and Autonomous Vehicles (CAVs) are becoming increasingly common, and thus cyber-attacks targeting them are also increasing. The authors developed a machine-learning-based framework called CAVeCTIR to address this issue. This framework utilizes natural language processing techniques and semantic analysis to match CTIRs with high accuracy. CAVeCTIR also includes a new feature selection method called “Minimum Redundancy Maximum Relevance” to select the most relevant features for matching CTIRs. The authors evaluated the performance of CAVeCTIR using a dataset of real-world CTIRs and compared it with two existing methods. The results showed that CAVeCTIR outperformed the current techniques, achieving an accuracy of 87.67%. The authors also conducted a sensitivity analysis and demonstrated the robustness of CAVeCTIR against different parameter settings. In conclusion, CAVeCTIR provides an effective solution for matching CTIRs related to CAVs, which can improve threat detection and response in the CAV ecosystem. The research shows the promise of machine-learning-based techniques to solve cybersecurity concerns in developing technology [ 27 ].

In another study, Alsaedi et al. [ 28 ] enhanced the accuracy of identifying harmful Uniform Resource Locators (URLs) by creating a detection model that uses CTI and two-stage ensemble learning. The model utilizes attributes extracted from internet searches and features based on CTI to enhance detection performance. The model proposed in this study demonstrates better results than other detection models, with an accuracy improvement of 7.8% and a reduction of 6.7% in false-positive rates compared to conventional URL-based models [ 28 ]. On the other hand, Van Haastrecht [ 29 ] highlighted that sharing platforms like the Malware Information Sharing Platform (MISP) could be helpful for SMEs if the shared intelligence is actionable. Therefore, a prototype application was developed to process MISP data, prioritize cybersecurity threats for SMEs, and provide customized recommendations. Further evaluations will refine the application and help SMEs to defend themselves against cyber-attacks more effectively [ 29 ]. Zhang et al. [ 30 ] proposed a solution called the CTI Automated Assessment Model (TIAM). TIAM evaluates sparsely populated threat intelligence from multiple perspectives. It utilizes automatic classification through feature extraction and integrates Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) to recognize attack methods linked with an Indicator of Compromise or IOC. The experiment demonstrates that TIAM could assess threat intelligence more efficiently, offering security managers valuable CTI [ 30 ].

In a study by Mishra et al. [ 31 ], anomalies in IoT networks were detected using message queuing telemetry transport (MQTT) and machine-learning algorithms, with a dataset of 4998 records and 34 features. Among the various classifiers employed, the random forest classifier demonstrated the highest level of accuracy at 99.94% [ 31 ].

The importance of sharing and promptly acting on high-quality CTI with the appropriate stakeholders is vital. To achieve this, Chatziamanetoglou and Rantos [ 32 ] proposed a blockchain-based system architecture for CTI that captures, evaluates, stores, and shares CTI while assessing its quality against predefined standards. The suggested system chooses validators and rates CTI inputs using a reputation- and trust-based method. The data are stored in a secure ledger that includes objective evaluation and validator performance and can be used to assess the reputation of CTI sources. The system’s dependability, consistency, and resistance to malicious activities were evaluated through a theoretical analysis using a probabilistic simulation, demonstrating an acceptable tolerance against malicious validators [ 32 ].

To overcome network security challenges, Li et al. [ 33 ] proposed an automatic CTI analysis method called K-CTIAA, which utilizes pre-trained models and knowledge graphs to identify threat actions from unstructured CTI data. K-CTIAA lessens the negative impacts of knowledge insertion, maps associated countermeasures using digital artifacts, and adds related knowledge in knowledge graphs to the corresponding place in CTI. In testing, K-CTIAA achieved an F1 score of 0.941 [ 33 ].

Sharing and exchanging CTI through blockchain technology can enhance protection measures, but existing models are susceptible to attacks and false reporting. Zhang et al. [ 34 ] proposed a novel blockchain-based CTI paradigm that integrates consortium blockchain and distributed reputation management systems for automated analysis and reaction to threat intelligence to overcome these problems. “Proof-of-Reputation” (PoR) consensus, a novel consensus method, satisfies the demand for a high transaction rate while establishing a reputation model for reliable network consensus. Experimental testing of the suggested model and consensus process revealed that it is safe and effective [ 34 ].

Data quantities have increased even more because of the proliferation of security devices and the growing complexity of information technology, causing difficulties for digital forensics and information security regulations. To solve this, Serketzis et al. [ 35 ] introduced the Digital Forensic Readiness (DFR) paradigm, which previously separated the concepts of forensic preparation and CTI. The model, which has good accuracy, precision, and recall rates and requires less data for analysis by researchers, is evaluated through experiments. The study indicates the value of integrating CTI and digital forensics procedures, offering a productive way to enhance operational DFR [ 35 ]. In addition, we have summarized the main contributions of each study. Table 1 summarizes the primary information from all papers relevant to detection model.

Summary of papers related to the detection model.

3.2. Knowledge Sharing and Training

In the context of knowledge sharing and training, Afzaliseresht et al. [ 36 ] discussed a common problem in cybersecurity in which organizations receive many machine-generated threat alerts. Still, only a very small percentage of them are investigated due to limited resources. To address this issue, they proposed a model that generates reports in natural language using storytelling techniques from security logs. This means that instead of receiving technical alerts that can be difficult for non-experts to understand, organizations would receive reports written in plain language with relevant information about potential threats and vulnerabilities. These reports can also be adjusted based on the reader’s level of expertise and preference. The proposed model is validated through a case study at a university’s Security Operations Center (SOC) and shown to provide better comprehension and completeness compared to existing methods for interpreting potential threats in cybersecurity contexts [ 36 ].

Sharing cyber threat information (CTI) is vital for enhancing security, yet many individuals are unwilling to share their CTI and prefer to merely consume it. In a research contribution, Riesco et al. [ 37 ] proposed a new approach to encourage sharing of CTI among several parties involved in cybersecurity information exchange. They suggest using blockchain technology and smart contracts to create incentives for knowledge sharing, which could contribute toward developing and implementing dynamic risk management systems to keep risks under control over time. Specifically, they suggest creating a marketplace for Ethereum blockchain smart contracts where participants can exchange CTI tokens as digital assets with a good value in the market. This would incentivize all parties to share their CTI while highlighting potential storage limitations/costs associated with transactions through simulations/experiments. Overall, this approach aims to improve security by encouraging more effective collaboration between stakeholders while navigating legal/regulatory constraints related to information sharing [ 37 ].

CTI focuses primarily on defense against these attacks, but there is a need for new methods to unmask attackers. Rana et al. [ 38 ] created malicious files as decoys, allowing the authors to gather information from susceptible PCs using honeypots. They used various tools for data analysis, including Visual Studio Code and Python. The evaluation method uses counterintelligence techniques such as cyber deception and decoy files to obtain adversary information. Overall, this research focuses on providing better proactive adversarial system intelligence by capturing attackers’ system information through accurate document-based tokens in a proactive defensive environment while executing threat hunting with TTPs (Tactics Techniques Procedures) [ 38 ].

In addition, Samtani et al. [ 39 ] developed a new system called the AZSecure Hacker Assets Portal (HAP), which is helps organizations increase their awareness of potential cyber threats by collecting data from various dark web platforms such as hacker forums, carding shops, Internet Relay Chat channels, and DarkNet marketplaces. The HAP platform uses advanced techniques such as CTI, data mining, and text mining to organize this information into an interface that allows for easy browsing, searching, and downloading content. It also offers dynamic visualizations to help scholars gain situational awareness and formulate novel research inquiries on emerging threat detection or critical hacker identification. HAP presently serves over 200 customers from academic institutions, police enforcement agencies, and industrial groups like General Electric and PayPal globally [ 39 ].

Koloveas et al. [ 40 ] presented a crawler architecture to collect cyber threat intelligence related to the Internet of Things (IoT) from the clear, social, and dark web. The architecture consists of four main components: a crawler module, a data pre-processing module, a data storage module, and a user interface module. The crawler module systematically and efficiently collects data from various sources, and the data pre-processing module cleans and filters the data. The data storage module securely stores the collected and pre-processed data and is scalable, and the user interface module provides a user-friendly interface for accessing the collected data. The proposed architecture was evaluated using a dataset of IoT-related cyber threat intelligence, and the evaluation results showed that the architecture effectively collects and processes cyber threat intelligence related to IoT from the clear, social, and dark web. Security analysts and researchers can use the proposed architecture to identify and analyze IoT threats and develop effective countermeasures to protect IoT devices and networks [ 40 ].

Basheer and Alkhatib [ 41 ] wrote a review article to outline the importance of testing and checking the dark web for CTI to prevent crimes and gain insight into criminal activities. The review includes recent research in the area, examining techniques, tools, methods, approaches, and outcomes. It also addresses technical hurdles, ethical considerations, and potential future developments [ 41 ].

Mundt and Baier [ 42 ] described an adaptive approach that uses CTI from the MITRE ATT&CK framework to simulate potential threats and identify weaknesses before they can be exploited. The process involves two main steps: automatically deriving the most critical threats for a business via CTI and designing a simulation gear based on attacks extracted from the MITRE ATT&CK framework to assess their impact. The aim is to enable companies to take proactive measures against data theft and double extortion attacks by simulating harmful technologies before they occur in operational environments [ 42 ].

On the other hand, Sakellariou et al. [ 43 ] emphasized using discussion forums as the raw data source for CTI and suggested a semantic schema for organizing the gathered data. The paper introduces the SECDFAN system, a comprehensive method for generating CTI products by analyzing forum content. Furthermore, a reference architecture was created systematically to address all CTI-related concerns, including product sharing and collaboration among security experts [ 43 ].

Sacher-Boldewin and Leverett [ 44 ] improved cyber defense by systematically categorizing and documenting possible failure states in a company’s security operations process. The system divides vulnerability management into three parts, starting from when a vulnerability is detected and rated as relevant, then asking whether prevention measures can be activated on time or if any signs of exploitation can be detected. The authors recommend using dimensions related to business processes, such as people, products, partners, etc. These dimensions are used to calculate the possible resolution categories by multiplying them by involved parties. The suggested system also highlights a direct connection between cybersecurity and risk quantification so that external and internal risks can be managed effectively and efficiently. It recommends building feedback options into existing processes by systematically categorizing possible failure states to help optimize workflows while delivering valuable metrics [ 44 ].

Koloveas et al. [ 45 ] introduced the “Integrated Framework for Threat Intelligence Mining and Extraction” system, INTIME. INTIME is a framework based on machine learning that gives a holistic perspective of the cyber threat intelligence process. It enables security experts to collect, evaluate, and exchange cyber threat data from various online sources, such as clear/deep/dark websites, forums, and social networks. Vulnerabilities/exploits/threat actors/cyber crime tools are among the information retrieved and managed via an integrated platform called MISP (Malware Information Sharing Platform), designed specifically for storing/sharing threat-related information across different organizations. One of the critical features of INTIME is its ability to gather CTI not only from structured sources like known security databases but also unstructured ones like deep net [ 45 ].

The importance of risk management in organizations is crucial, and real-time security threats can harm risk exposure levels. Riesco and Villagrá [ 46 ] highlighted companies’ challenges in managing risks, such as emerging techniques, asset complexity, and numerous vulnerabilities. To overcome these challenges, the authors suggest an architecture for dynamic risk assessment and management based on Web Ontology Language and Semantic Web Rule Language. The architecture includes a new semantic version of Structured Threat Information eXpression (STIX)v2.0 for exchanging CTI. The article demonstrates the effectiveness of the proposed framework in supporting decision making across different organizational levels using a leading cybersecurity organization. The proposed model aims to enable real-time risk management while integrating a mix of standards and ensuring ease of adoption [ 46 ].

Aljuhami and Bamasoud [ 47 ] investigated how Cyber Threat Information (CTI) can reduce cyber risks in Saudi universities by improving risk management. The study investigates CTI concepts, challenges, and risk management practices in higher education. Their work includes a review of previous studies and their relevance to the current research. The results highlight the importance of obtaining advanced and detailed information on cyber threats, or CTI, to deal with their constantly evolving nature. Integrating CTI into risk management enhances defenders’ ability to mitigate the increasing risk of cyber threats [ 47 ]. Sakellariou et al. [ 48 ] introduced essential CTI concepts and an eight-layer CTI in a similar study, with a reference model that can aid in the development of CTI systems. The model’s effectiveness is demonstrated through three case studies, resulting in the creation of CTI [ 48 ].

Dulaunoy et al. [ 49 ] developed a system for unreceptive DNS, malware hash archives, and Secure Sockets Layer (SSL) notaries. This system aims to support incident inquiries and infrastructure tracking by providing CTI. The authors explain that CSIRTs (Computer Security Incident Response Teams) use passive DNS and SSL databases to help with the incident reply. Still, they argue that their new passive SSH database would be a valuable addition to the CSIRT toolbox, because OpenSSH implementation is widely used on many servers as well as computers like MacOSs or Windows machines, which makes it an attractive target for attackers looking for vectors of attack or command-and-control mechanisms [ 49 ].

On the other hand, Gao et al. [ 50 ] developed a system called SecurityKG, which is a system that automates the collection and management of open-source CTI (OSCTI) from over 40 major security websites. It uses AI/NLP techniques to extract relevant information, such as potential threats, vulnerabilities, and risks to critical assets. The system also has an extendable backend that handles all gathered, extracted, and constructed OSCTI components. Additionally, it provides various interactivities through its user interface to facilitate knowledge graph exploration. SecurityKG aims to provide more comprehensive and accurate information about cyber threats [ 50 ].

Al-Mohannadi et al. [ 51 ] analyzed different types of web services and the ways in which adversaries can use them for malicious activities. The study suggests that CTI can be used to protect organizations from cyber threats by providing relevant information about potential attacks, vulnerabilities, and threat actors. This information can help organizations to develop better security strategies and responses to mitigate risks. The study highlights some of the main challenges related to CTI, such as the lack of standardization, data quality issues, and the need for skilled personnel to analyze and interpret the data. The research suggests that cloud-based web services can help to overcome some of these challenges by providing scalable and flexible solutions that can be customized to meet the specific needs of different organizations. The study suggests that cloud-based web services can enhance CTI by providing more comprehensive and accurate information about adversary activities. However, addressing the challenges related to data quality, standardization, and personnel skills is crucial to effectively use CTI to protect organizations from cyber threats [ 51 ]. Using AI and NLP to evaluate social media postings on cyber-attacks and electronic warfare, Sufi [ 52 ] offers a contemporary methodology. A single index is created for each nation using keyword-based index production techniques, and CNN is used to find abnormalities and their causes inside the index. The method is verified using real-time Twitter feeds, producing 75 daily cyber danger indices for six nations with anomalies. Decision makers may use the gathered intelligence to modify their cybersecurity readiness and lessen the harm done by cybercriminals [ 52 ].

In a different study, Cristea [ 53 ] examined the risks connected to potential threats from disruptive technologies in the context of the present financial systems. The study shows that by enhancing efficiency, cutting costs, and boosting transparency, disruptive technologies like blockchain, cryptocurrencies, and artificial intelligence have the potential to completely change the financial sector. However, these technologies also pose significant risks to the current financial systems, including cybersecurity threats, regulatory challenges, and financial instability. The study identifies five key categories of risks associated with disruptive technologies in the financial sector: technology, regulatory, market, operational, and systemic risks. The study concludes that though disruptive technologies significantly benefit the financial industry, managing the associated risks effectively is crucial. Financial institutions, policymakers, and regulators must work together to develop robust risk management strategies and regulatory frameworks to ensure that the benefits of disruptive technologies are realized while mitigating their potential risks [ 53 ].

Thach et al. [ 54 ] suggest that Industry 4.0 has greatly impacted the banking industry in Vietnam, especially regarding its technology quality management practices and cybersecurity risk management strategies. The paper identifies several key factors affecting the success of Industry 4.0 implementation in the banking sector, including regulatory compliance, data privacy, and talent management. Furthermore, the study highlights the importance of cybersecurity risk management in the banking industry, given the increasing prevalence of cyber threats and attacks. In the context of Industry 4.0, the study offers a thorough methodology for managing cybersecurity risks that includes proactive risk identification, risk assessment, risk reduction, and risk monitoring. Overall, the research underlines the necessity for financial institutions in developing nations like Vietnam to establish strong practices for technology quality management and cybersecurity risk management in order to be secure and competitive in the quickly changing digital environment of Industry 4.0 [ 54 ].

Tripodi [ 55 ] utilized a sociotechnical framework to analyze the sociopolitical and technological factors contributing to misinformation’s spread and persistence. They argue that the continuation of this misinformation is due to a complex interplay of social and technological factors, including political polarization, the use of social media platforms to spread misinformation, and the amplification of misinformation by influential individuals and organizations. The study also highlights the potential public health consequences of the persistence of this misinformation, including increased transmission of viruses and decreased compliance with public health guidelines. The authors concluded that addressing this issue will require a multifaceted approach that considers the social and technological factors contributing to the spread of misinformation [ 55 ].

Odemis et al. [ 56 ] developed a “Honeypsy” system designed to observe user behavior with CTI. The system collects and analyzes data from honeypots and decoy systems that mimic natural systems and are used to detect and monitor cyber threats. Honeypsy analyzes the gathered data and looks for unusual behavior using machine-learning methods. The system is appropriate for business situations, since it was made to be scalable and manage large volumes of data. The authors of the research carried out many tests to assess Honeypsy’s performance in identifying user activity in CTI. The results showed that the system can accurately detect anomalous behavior, such as the use of malicious tools and techniques, and can provide early warnings of potential cyber-attacks. The Honeypsy system offers a promising approach to detecting user behavior using machine-learning algorithms and honeypot data in CTI. The system can potentially improve the accuracy and efficiency of cyber threat detection in enterprise environments, helping organizations better protect their assets and data from cyber-attacks [ 56 ].

Vevera et al. [ 57 ] proposed an approach to help organizations make informed decisions when selecting CTI solutions and ensure they align with their specific needs and requirements. The proposed approach is comprised of six attributes: accuracy, reliability, timeliness, cost-effectiveness, usability, and comprehensiveness. This study thoroughly explains each attribute and discusses the criteria that can be used to evaluate them. Furthermore, the study’s authors conducted several experiments to demonstrate the effectiveness of the recommended technique. The findings suggest that the multi-attribute strategy can effectively evaluate CTI products and services and help organizations make informed decisions when selecting solutions that best fit their needs and requirements. Overall, the multi-attribute approach proposed in this study provides a structured and comprehensive approach to selecting CTI products and services. The system can help organizations to evaluate and compare different solutions based on multiple criteria, thereby improving the quality of their decision-making process and enhancing their overall cybersecurity posture [ 57 ].

In another study, Du et al. [ 58 ] highlighted the significance of CTI sharing in mitigating the risk of cyber-attacks and discussed the challenges and obstacles that hinder the sharing process. Moreover, the study presents the key players and initiatives involved in CTI sharing, such as the government, the private sector, and international organizations. Finally, the study provides an outlook for CTI sharing, with recommendations for enhancing the process and increasing its effectiveness [ 58 ].

Westerlund [ 59 ] discusses the potential implications of this technology for business and society, including the risks of cyber threats, such as fraud, identity theft, and misinformation. The author also examines the various applications of deep fake technology, such as entertainment and politics, and discusses the challenges of regulating and controlling its use. Overall, the article provides a valuable overview of the emergence of deep fake technology and its potential impact on business and society [ 59 ].

Due to privacy issues and the absence of a common dataset format, developing a machine-learning-based detection system using heterogeneous network data samples from different sources and organizations is challenging. Sarhan et al. [ 60 ] presented a cooperative CTI-sharing system that enables several enterprises to create, train, and test a powerful ML-based network intrusion detection system to solve this issue. A federated learning method was recommended to protect the confidentiality of each organization’s data. Additionally, network data traffic was made accessible in a common format to help uncover significant trends across various data sources. The authors used NF-UNSW-NB15-v2 and NF-BoT-IoT-v2, among other datasets and scenarios, to test the proposed framework. They found that it correctly classifies various traffic kinds without the necessity for inter-organizational data sharing.

To promote the standardized exchange of cyber threat intelligence, industry standards, including STIX, Trusted Automated Exchange of Intelligence Information (TAXII), and Cyber Observable eXpression (CybOX), were developed. Ramsdale et al. [ 61 ] evaluated these standards. The writers examine the formats and languages that can be used to exchange cyber threat intelligence as well as publicly available sources for threat feeds. Additionally, they look into the types of data offered by a sample of cyber threat intelligence feeds and the challenges involved in compiling and distributing data. The writers matched the data type and the data needs for various security activities. They conclude that many standards are poorly adopted and implemented, with suppliers preferring unique or conventional forms [ 61 ].

Oosthoek and Doerr [ 62 ] highlighted the inadequacy of the CTI (Cyber Threat Intelligence) field in their analysis; this is primarily due to flawed methodology. As a result, CTI currently delivers ineffective outcomes. However, the field has the potential to mature and improve by drawing from the methodology of its parent field, intelligence studies, and addressing challenges related to quality, bias, and actor naming. The article emphasizes the importance of scientific scrutiny and suggests that an alliance between the Intelligence Community (IC) and CTI can drive cyber defense in the future [ 62 ].

De Melo e Silva et al. [ 63 ] discuss the need for more efficient defense mechanisms in response to the changing cybersecurity landscape and emerging threats. They provide an overview of the cyber threat intelligence scenario and identify relevant standards and platforms. The evaluation of these standards and platforms reveals that STIX is the most widely adopted standard due to its holistic approach. At the same time, the Malware Information Sharing Platform (MISP) and OpenCTI are considered the most comprehensive and flexible platforms. However, finding a comprehensive solution for defense based on threat intelligence remains a challenge due to the divergent focuses of existing platforms [ 63 ]. Table 2 provides an overview of studies focusing on knowledge sharing and training.

Summary of papers related to knowledge sharing and training.

4. Discussion

Based on the literature review, developing a sophisticated and comprehensive CTI framework is crucial for organizations to manage and mitigate potential risks effectively. Cyber threats are becoming increasingly complex, making it difficult for traditional security measures to keep up. A well-designed cybersecurity threat intelligence framework can help organizations to avoid emerging threats by providing real-time insights into their critical assets’ potential risks. Such a framework should include identifying vulnerabilities within an organization’s infrastructure, detecting anomalous behavior or patterns associated with malicious activity, and sharing information about potential threats with stakeholders. Furthermore, having a structured approach toward cybersecurity threat intelligence helps to ensure consistency across different organizational departments. It also enables better collaboration between teams who are responsible for managing cyber-attack risk exposure. Accordingly, we propose a CTI framework to improve business organizations’ cyber threat response capabilities in this section. Figure 3 shows the architecture of the security framework presented in this paper. The proposed framework comprises a knowledge base, detection models, and visualization dashboards.

Figure 3

Proposed layered CTI framework.

The first layer, the knowledge base, includes information about potential threats, vulnerabilities, and risks to critical assets. This information can be obtained from sources like internal logs, the dark web, or external feeds. Organizations should establish a centralized repository of this information that all organizational stakeholders can access. This will help ensure that everyone has access to up-to-date threat intelligence data and can make informed decisions regarding risk management. Furthermore, machine-learning algorithms are increasingly used to analyze large volumes of data to identify patterns associated with cyber-attacks or malicious activity. By leveraging these technologies along with human expertise, organizations can improve their ability to develop a reliable database quickly and effectively. A comprehensive knowledge base is crucial for practical cybersecurity threat intelligence, providing valuable insights into potential risks aimed at an organization’s critical assets.

Human identification of cyber threats is limited due to the cognitive limitations of humans., As highlighted in the literature [ 12 , 25 , 27 , 31 , 45 ], machine-learning and artificial intelligence tools can help to identify malicious network traffic. Therefore, the proposed detection models’ second layer includes applications that include these technologies for signature-based, anomaly-based, and behavior-based detection. Signature-based detection involves comparing incoming traffic against known signatures or patterns associated with previously identified threats. This approach could prove to be effective at detecting known threats but may not be able to detect new or emerging threats. Anomaly-based detection involves identifying deviations from regular network activity that could indicate potential malicious activity. This approach could help to detect unknown attacks but may also generate false positives if legitimate activities are flagged as abnormal. Behavioral-based detection focuses on monitoring user behavior and identifying unusual actions deviating from established norms. This model could help to identify insider threat actors with authorized access to systems. Combining these approaches and machine-learning algorithms can help to improve accuracy and reduce false positives/negatives. By leveraging multiple layers of protection, organizations can quickly enhance their ability to detect emerging cyber threats while minimizing risk exposure.

The third layer, the visualization dashboard, provides an overview of key metrics related to cyber threats, such as the number of attacks detected, as well as types of attacks and their severity level. As shown by Samtani et al. [ 39 ], the visualization of cyber threats increases situational awareness among actors, so visualization tools can be used to represent complex data sets in a more intuitive way, which helps analysts to identify patterns or trends that may not be immediately apparent from raw data. This approach is advantageous when dealing with large volumes or diverse sources of information. The dashboards should be customizable based on user roles so stakeholders within an organization can access relevant information quickly. Visualization tools such as heat maps, graphs, etc. could help analysts to understand how different events are connected and their impact on critical assets. They can also be used to highlight which infrastructure assets will have higher vulnerabilities or how cybersecurity risks vary based on the current organizational situation. Overall, dashboards and visualization tools enhance cybersecurity threat intelligence by providing real-time insights into organizational data, such as an organizational risk meter, showing potential risks aimed at an organization’s critical assets, the number of recent attacks, threat levels, the average organizational response time, and the cost of recent cybersecurity attacks. By utilizing these technologies, organizations can quickly improve their ability to detect emerging threats while minimizing risk exposure.

Due to the enhanced digital transformation, business organizations must be resilient toward cybersecurity [ 64 , 65 , 66 , 67 ]. Given the increasing frequency and sophistication of cyber-attacks today, developing a robust cybersecurity threat intelligence framework should be considered one of the top priorities of any organization looking to protect its critical assets. Since humans are a weak link in cybersecurity [ 68 , 69 , 70 ], a robust CTI framework will also facilitate overcoming human security lapses. As proposed in our model, leveraging multiple layers such as a knowledge base, detection models, and visualization tools along with human expertise would enable effective management and mitigation of these evolving challenges.

Based on the details presented in this paper, there are a few limitations to consider. First, the proposed framework for CTI implementation may not apply to all organizations, as each organization has unique needs and resources. Therefore, organizations need to tailor their approach based on their specific requirements. Second, collaboration between stakeholders is essential for practical CTI; it can also pose challenges, such as restrictions on information sharing due to legal or regulatory constraints [ 71 , 72 ]. Organizations must navigate these challenges carefully while complying with relevant laws and regulations. Third, the empirical analysis conducted in this study was limited by the sample size and the scope of data collected from selected industries. This may limit its generalizability across other sectors or regions globally where CTI programs have been implemented successfully but were not included within this research project’s scope of work. Finally, and importantly, cyber threats continue to evolve rapidly over time, making it difficult, and even sometimes impossible, to keep up with new emerging trends without continuous monitoring and updating of one’s security posture strategy plan accordingly. Therefore, any framework should be considered dynamic rather than static when implementing CTI frameworks within an organization’s security posture strategy plan.

5. Conclusions

In conclusion, this paper significantly contributes to the field of CTI by proposing a comprehensive framework for the implementation of CTI. Through an extensive literature review, key components that are essential for practical CTI have been identified, including data collection and processing, analysis, and dissemination. The proposed framework could provide valuable guidance for organizations seeking to establish or improve their CTI capabilities. Additionally, the methodology employed in this research can serve as a model for future studies on related topics. In addition, the paper’s emphasis on collaboration between different stakeholders is particularly noteworthy, as it highlights the importance of information sharing in combating cyber threats. By bringing together experts from various fields, such as cybersecurity professionals, law enforcement agencies, and government officials, organizations can quickly leverage their collective knowledge to identify emerging threats. Several areas for future work could build upon the findings of this study. First, further research could be conducted to explore how organizations can effectively tailor their approach to CTI implementation based on their specific needs and resources. Second, additional studies could investigate how collaboration between stakeholders can be facilitated more effectively while navigating legal and regulatory constraints related to information sharing.

Acknowledgments

The authors would like to thank Tooba Nasir for the language review of the manuscript.

Author Contributions

Conceptualization, S.S.; methodology, S.A.S., M.S.A.-G. and H.A.-M.; data curation, S.A.S., M.S.A.-G. and H.A.-M.; writing—original draft preparation, S.S., S.A.S., M.S.A.-G. and H.A.-M.; writing—review and editing, A.M.A.; supervision, S.S.; funding acquisition, A.M.A. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Data availability statement, conflicts of interest.

The authors declare no conflict of interest.

Funding Statement

The APC was funded by the SAUDI ARAMCO Cybersecurity Chair, Imam Abdulrahman Bin Faisal University.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

  • 1. Lenka A., Goswami M., Singh H., Baskaran H. Effective Cybersecurity Operations for Enterprise-Wide Systems. IGI Global; Hershey, PA, USA: 2023. Cybersecurity Disclosure and Corporate Reputation: Rising Popularity of Cybersecurity in the Business World; pp. 169–183. [ Google Scholar ]
  • 2. Kotsias J., Ahmad A., Scheepers R. Adopting and integrating cyber-threat intelligence in a commercial organisation. Eur. J. Inf. Syst. 2023;32:35–51. doi: 10.1080/0960085X.2022.2088414. [ DOI ] [ Google Scholar ]
  • 3. Gately H. Doctoral Dissertation. Macquarie University; Sydney, Australia: 2023. Russian Organised Crime and Ransomware as a Service: State Cultivated Cybercrime. [ Google Scholar ]
  • 4. Abu M.S., Selamat S.R., Ariffin A., Yusof R. CTI–issue and challenges. Indones. J. Electr. Eng. Comput. Sci. 2018;10:371–379. [ Google Scholar ]
  • 5. Webb J., Maynard S., Ahmad A., Shanks G. Information security risk management: An intelligence-driven approach. Australas. J. Inf. Syst. 2014;18:391–404. doi: 10.3127/ajis.v18i3.1096. [ DOI ] [ Google Scholar ]
  • 6. Webb J., Maynard S., Ahmad A., Shanks G. Towards an intelligence-driven information security risk management process for organisations; Proceedings of the ACIS 2013 Proceedings, 52; Niigata, Japan. 16–20 June 2013. [ Google Scholar ]
  • 7. Schlette D., Caselli M., Pernul G. A comparative study on cyber threat intelligence: The security incident response perspective. IEEE Commun. Surv. Tutor. 2021;23:2525–2556. doi: 10.1109/COMST.2021.3117338. [ DOI ] [ Google Scholar ]
  • 8. Kitchenham B., Charters S. Guidelines for Performing Systematic Literature Reviews in Software Engineering. Elsevier; London, UK: 2007. Technical Report, EBSE Technical Report EBSE-2007-0. [ Google Scholar ]
  • 9. Page M.J., McKenzie J.E., Bossuyt P.M., Boutron I., Hoffmann T.C., Mulrow C.D., Moher D. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. doi: 10.1136/bmj.n71. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 10. [(accessed on 30 June 2023)]. Available online: https://ieeexplore.ieee.org/Xplore/home.jsp .
  • 11. [(accessed on 30 June 2023)]. Available online: https://dl.acm.org/
  • 12. Suryotrisongko H., Musashi Y., Tsuneda A., Sugitani K. Robust botnet DGA detection: Blending XAI and OSINT for CTI sharing. IEEE Access. 2022;10:34613–34624. doi: 10.1109/ACCESS.2022.3162588. [ DOI ] [ Google Scholar ]
  • 13. Moraliyage H., Sumanasena V., De Silva D., Nawaratne R., Sun L., Alahakoon D. Multimodal classification of onion services for proactive CTI using explainable deep learning. IEEE Access. 2022;10:56044–56056. doi: 10.1109/ACCESS.2022.3176965. [ DOI ] [ Google Scholar ]
  • 14. Irshad E., Siddiqui A.B. Cyber threat attribution using unstructured reports in CTI. Egypt. Inform. J. 2023;24:43–59. doi: 10.1016/j.eij.2022.11.001. [ DOI ] [ Google Scholar ]
  • 15. Zhang H., Shen G., Guo C., Cui Y., Jiang C. Ex-action: Automatically extracting threat actions from CTI report based on multimodal learning. Secur. Commun. Netw. 2021;2021:1–12. [ Google Scholar ]
  • 16. Cha J., Singh S.K., Pan Y., Park J.H. Blockchain-based CTI system architecture for sustainable computing. Sustainability. 2020;12:6401. doi: 10.3390/su12166401. [ DOI ] [ Google Scholar ]
  • 17. Gong S., Lee C. CTI framework for incident response in an energy cloud platform. Electronics. 2021;10:239. doi: 10.3390/electronics10030239. [ DOI ] [ Google Scholar ]
  • 18. Ejaz S., Noor U., Rashid Z. Visualizing Interesting Patterns in CTI Using Machine Learning Techniques. Cybern. Inf. Technol. 2022;22:96–113. [ Google Scholar ]
  • 19. Mendez Mena D., Yang B. Decentralized actionable CTI for networks and the internet of things. IoT. 2020;2:1–16. doi: 10.3390/iot2010001. [ DOI ] [ Google Scholar ]
  • 20. Liu J., Yan J., Jiang J., He Y., Wang X., Jiang Z., Yang P., Li N. TriCTI: An actionable CTI discovery system via trigger-enhanced neural network. Cybersecurity. 2022;5:8. doi: 10.1186/s42400-022-00110-3. [ DOI ] [ Google Scholar ]
  • 21. Kiwia D., Dehghantanha A., Choo K.K.R., Slaughter J. A cyber kill chain based taxonomy of banking Trojans for evolutionary computational intelligence. J. Comput. Sci. 2018;27:394–409. doi: 10.1016/j.jocs.2017.10.020. [ DOI ] [ Google Scholar ]
  • 22. Gong S., Lee C. Blocis: Blockchain-based CTI sharing framework for sybil-resistance. Electronics. 2020;9:521. doi: 10.3390/electronics9030521. [ DOI ] [ Google Scholar ]
  • 23. Borges Amaro L.J., Percilio Azevedo B.W., Lopes de Mendonca F.L., Giozza W.F., Albuquerque R.D.O., García Villalba L.J. Methodological framework to collect, process, analyze and visualize CTI data. Appl. Sci. 2022;12:1205. doi: 10.3390/app12031205. [ DOI ] [ Google Scholar ]
  • 24. Al-Fawa’reh M., Al-Fayoumi M., Nashwan S., Fraihat S. CTI using PCA-DNN model to detect abnormal network behavior. Egypt. Inform. J. 2022;23:173–185. doi: 10.1016/j.eij.2021.12.001. [ DOI ] [ Google Scholar ]
  • 25. Sun T., Yang P., Li M., Liao S. An automatic generation approach of the CTI records based on multi-source information fusion. Future Internet. 2021;13:40. doi: 10.3390/fi13020040. [ DOI ] [ Google Scholar ]
  • 26. Serketzis N., Katos V., Ilioudis C., Baltatzis D., Pangalos G.J. Actionable threat intelligence for digital forensics readiness. Inf. Comput. Secur. 2019;27:273–291. doi: 10.1108/ICS-09-2018-0110. [ DOI ] [ Google Scholar ]
  • 27. Raptis G.E., Katsini C., Alexakos C., Kalogeras A., Serpanos D. CAVeCTIR: Matching CTI Reports on Connected and Autonomous Vehicles Using Machine Learning. Appl. Sci. 2022;12:11631. doi: 10.3390/app122211631. [ DOI ] [ Google Scholar ]
  • 28. Alsaedi M., Ghaleb F.A., Saeed F., Ahmad J., Alasli M. CTI-based malicious url detection model using ensemble learning. Sensors. 2022;22:3373. doi: 10.3390/s22093373. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 29. Van Haastrecht M., Golpur G., Tzismadia G., Kab R., Priboi C., David D., Răcătăian A., Baumgartner L., Fricker S., Ruiz J.F., et al. A shared CTI solution for smes. Electronics. 2021;10:2913. doi: 10.3390/electronics10232913. [ DOI ] [ Google Scholar ]
  • 30. Zhang S., Chen P., Bai G., Wang S., Zhang M., Li S., Zhao C. An automatic assessment method of CTI combined with ATT&CK matrix. Wirel. Commun. Mob. Comput. 2022:7875910. [ Google Scholar ]
  • 31. Mishra S., Albarakati A., Sharma S.K. CTI for IoT Using Machine Leamrning. Processes. 2022;10:2673. doi: 10.3390/pr10122673. [ DOI ] [ Google Scholar ]
  • 32. Chatziamanetoglou D., Rantos K. Blockchain-Based CTI Sharing Using Proof-of-Quality Consensus. Secur. Commun. Netw. 2023:3303122. [ Google Scholar ]
  • 33. Li Z.X., Li Y.J., Liu Y.W., Liu C., Zhou N.X. K-CTIAA: Automatic Analysis of CTI Based on a Knowledge Graph. Symmetry. 2023;15:337. doi: 10.3390/sym15020337. [ DOI ] [ Google Scholar ]
  • 34. Zhang X., Miao X., Xue M. A Reputation-Based Approach Using Consortium Blockchain for CTI Sharing. Secur. Commun. Netw. 2022:7760509. doi: 10.1155/2022/7760509. [ DOI ] [ Google Scholar ]
  • 35. Serketzis N., Katos V., Ilioudis C., Baltatzis D., Pangalos G. Improving forensic triage efficiency through CTI. Future Internet. 2019;11:162. doi: 10.3390/fi11070162. [ DOI ] [ Google Scholar ]
  • 36. Afzaliseresht N., Miao Y., Michalska S., Liu Q., Wang H. From logs to stories: Human-centred data mining for CTI. IEEE Access. 2020;8:19089–19099. doi: 10.1109/ACCESS.2020.2966760. [ DOI ] [ Google Scholar ]
  • 37. Riesco R., Larriva-Novo X., Villagrá V.A. Cybersecurity threat intelligence knowledge exchange based on blockchain: Proposal of a new incentive model based on blockchain and Smart contracts to foster the cyber threat and risk intelligence exchange of information. Telecommun. Syst. 2020;73:259–288. doi: 10.1007/s11235-019-00613-4. [ DOI ] [ Google Scholar ]
  • 38. Rana M.U., Ellahi O., Alam M., Webber J.L., Mehbodniya A., Khan S. Offensive Security: CTI Enrichment With Counterintelligence and Counterattack. IEEE Access. 2022;10:108760–108774. doi: 10.1109/ACCESS.2022.3213644. [ DOI ] [ Google Scholar ]
  • 39. Samtani S., Li W., Benjamin V., Chen H. Informing CTI through dark Web situational awareness: The AZSecure hacker assets portal. Digit. Threats Res. Pract. (DTRAP) 2021;2:1–10. doi: 10.1145/3450972. [ DOI ] [ Google Scholar ]
  • 40. Koloveas P., Chantzios T., Tryfonopoulos C., Skiadopoulos S. A crawler architecture for harvesting the clear, social, and dark web for IoT-related cyber-threat intelligence; Proceedings of the 2019 IEEE World Congress on Services (SERVICES); Milan, Italy. 8–13 July 2019; pp. 3–8. [ Google Scholar ]
  • 41. Basheer R., Alkhatib B. Threats from the dark: A review over dark web investigation research for CTI. J. Comput. Netw. Commun. 2021;2021:1–21. doi: 10.1155/2021/1302999. [ DOI ] [ Google Scholar ]
  • 42. Mundt M., Baier H. Threat-based Simulation of Data Exfiltration Towards Mitigating Multiple Ransomware Extortions. Digit. Threats Res. Pract. 2022 doi: 10.1145/3568993. [ DOI ] [ Google Scholar ]
  • 43. Sakellariou G., Fouliras P., Mavridis I. SECDFAN: A CTI System for Discussion Forums Utilization. Eng. 2023;4:615–634. doi: 10.3390/eng4010037. [ DOI ] [ Google Scholar ]
  • 44. Sacher-Boldewin D., Leverett E. The Intelligent Process Lifecycle of Active Cyber Defenders. Digit. Threats Res. Pract. (DTRAP) 2022;3:1–17. doi: 10.1145/3499427. [ DOI ] [ Google Scholar ]
  • 45. Koloveas P., Chantzios T., Alevizopoulou S., Skiadopoulos S., Tryfonopoulos C. Intime: A machine learning-based framework for gathering and leveraging web data to cyber-threat intelligence. Electronics. 2021;10:818. doi: 10.3390/electronics10070818. [ DOI ] [ Google Scholar ]
  • 46. Riesco R., Villagrá V.A. Leveraging CTI for a dynamic risk framework: Automation by using a semantic reasoner and a new combination of standards (STIX™, SWRL and OWL) Int. J. Inf. Secur. 2019;18:715–739. doi: 10.1007/s10207-019-00433-2. [ DOI ] [ Google Scholar ]
  • 47. Aljuhami A.M., Bamasoud D.M. CTI in Risk Management. Int. J. Adv. Comput. Sci. Appl. 2021;12:156–164. [ Google Scholar ]
  • 48. Sakellariou G., Fouliras P., Mavridis I., Sarigiannidis P. A reference model for CTI systems. Electronics. 2022;11:1401. doi: 10.3390/electronics11091401. [ DOI ] [ Google Scholar ]
  • 49. Dulaunoy A., Huynen J.L., Thirion A. Active and Passive Collection of SSH key material for CTI. Digit. Threats Res. Pract. (DTRAP) 2022;3:1–5. doi: 10.1145/3491262. [ DOI ] [ Google Scholar ]
  • 50. Gao P., Liu X., Choi E., Soman B., Mishra C., Farris K., Song D. A system for automated open-source threat intelligence gathering and management; Proceedings of the 2021 International Conference on Management of Data; Xi’an, China. 20–25 June 2021; pp. 2716–2720. [ Google Scholar ]
  • 51. Al-Mohannadi H., Awan I., Al Hamar J. Analysis of adversary activities using cloud-based web services to enhance CTI. Serv. Oriented Comput. Appl. 2020;14:175–187. doi: 10.1007/s11761-019-00285-7. [ DOI ] [ Google Scholar ]
  • 52. Sufi F. A New Social Media-Driven CTI. Electronics. 2023;12:1242. doi: 10.3390/electronics12051242. [ DOI ] [ Google Scholar ]
  • 53. Cristea L.M. Risks Associated with Threats Related to Disruptive Technologies in the Current Financial Systems Context. Audit Financiar. 2021;1:119–129. doi: 10.20869/AUDITF/2021/161/002. [ DOI ] [ Google Scholar ]
  • 54. Thach N.N., Hanh H.T., Huy D.T.N., Vu Q.N. Technology quality management of the industry 4.0 and cybersecurity risk management on current banking activities in emerging markets-the case in Vietnam. Int. J. Qual. Res. 2021;15:840–856. doi: 10.24874/IJQR15.03-10. [ DOI ] [ Google Scholar ]
  • 55. Tripodi F.B. ReOpen demands as public health threat: A sociotechnical framework for understanding the stickiness of misinformation. Comput. Math. Organ. Theory. 2022;28:321–334. doi: 10.1007/s10588-021-09339-8. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 56. Odemis M., Yucel C., Koltuksuz A. Detecting user behavior in CTI: Development of honeypsy system. Secur. Commun. Netw. arXiv. 2022 doi: 10.1155/2022/7620125.2304.07411 [ DOI ] [ Google Scholar ]
  • 57. Vevera A.V., Cirnu C.E., Radulescu C.Z. A Multi-Attribute Approach for CTI Product and Services Selection. Stud. Inform. Control. 2022;31:13–23. doi: 10.24846/v31i1y202202. [ DOI ] [ Google Scholar ]
  • 58. Du L., Fan Y., Zhang L., Wang L., Sun T. A summary of the development of cyber security threat intelligence sharing. Int. J. Digit. Crime Forensics (IJDCF) 2020;12:54–67. doi: 10.4018/IJDCF.2020100105. [ DOI ] [ Google Scholar ]
  • 59. Westerlund M. The emergence of deepfake technology: A review. [(accessed on 30 June 2023)];Technol. Innov. Manag. Rev. 2019 9 doi: 10.22215/timreview/1282. Available online: https://timreview.ca/article/1282 . [ DOI ] [ Google Scholar ]
  • 60. Sarhan M., Layeghy S., Moustafa N., Portmann M. CTI sharing scheme based on federated learning for network intrusion detection. J. Netw. Syst. Manag. 2023;31:3. doi: 10.1007/s10922-022-09691-3. [ DOI ] [ Google Scholar ]
  • 61. Ramsdale A., Shiaeles S., Kolokotronis N. A comparative analysis of cyber-threat intelligence sources, formats and languages. Electronics. 2020;9:824. doi: 10.3390/electronics9050824. [ DOI ] [ Google Scholar ]
  • 62. Oosthoek K., Doerr C. CTI: A product without a process? Int. J. Intell. CounterIntell. 2021;34:300–315. doi: 10.1080/08850607.2020.1780062. [ DOI ] [ Google Scholar ]
  • 63. de Melo e Silva A., Costa Gondim J.J., de Oliveira Albuquerque R., García Villalba L.J. A methodology to evaluate standards and platforms within CTI. Future Internet. 2020;12:108. doi: 10.3390/fi12060108. [ DOI ] [ Google Scholar ]
  • 64. Al Obaidan F., Saeed S. Handbook of Research on Advancing Cybersecurity for Digital Transformation. IGI Global; Hershey, PA, USA: 2021. Digital transformation and cybersecurity challenges: A study of malware detection using machine learning techniques; pp. 203–226. [ Google Scholar ]
  • 65. Saeed S., Bolívar M.P.R., Thurasamy R. Pandemic, Lockdown, and Digital Transformation. Springer International Publishing; Cham, Switzerland: 2021. [ Google Scholar ]
  • 66. Naeem H., Ullah F., Naeem M.R., Khalid S., Vasan D., Jabbar S., Saeed S. Malware detection in industrial internet of things based on hybrid image visualization and deep learning model. Ad Hoc Netw. 2020;105:102154. doi: 10.1016/j.adhoc.2020.102154. [ DOI ] [ Google Scholar ]
  • 67. Mekala S.H., Baig Z., Anwar A., Zeadally S. Cybersecurity for industrial IoT (IIoT): Threats, countermeasures, challenges and future directions. Comput. Commun. 2023;208:294–320. doi: 10.1016/j.comcom.2023.06.020. [ DOI ] [ Google Scholar ]
  • 68. Saeed S. Education, Online Presence and Cybersecurity Implications: A Study of Information Security Practices of Computing Students in Saudi Arabia. Sustainability. 2023;15:9426. doi: 10.3390/su15129426. [ DOI ] [ Google Scholar ]
  • 69. Saeed S. Digital Workplaces and Information Security Behavior of Business Employees: An Empirical Study of Saudi Arabia. Sustainability. 2023;15:6019. doi: 10.3390/su15076019. [ DOI ] [ Google Scholar ]
  • 70. Kont K.R. Libraries and cyber security: The importance of the human factor in preventing cyber attacks. Libr. Hi Tech News. 2023 doi: 10.1108/LHTN-03-2023-0036. [ DOI ] [ Google Scholar ]
  • 71. Saeed S. A Customer-Centric View of E-Commerce Security and Privacy. Appl. Sci. 2023;13:1020. doi: 10.3390/app13021020. [ DOI ] [ Google Scholar ]
  • 72. Gull H., Alabbad D.A., Saqib M., Iqbal S.Z., Nasir T., Saeed S., Almuhaideb A.M. Handbook of Research on Cybersecurity Issues and Challenges for Business and FinTech Applications. IGI Global; Hershey, PA, USA: 2023. E-Commerce and Cybersecurity Challenges: Recent Advances and Future Trends; pp. 91–111. [ Google Scholar ]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

  • View on publisher site
  • PDF (1.0 MB)
  • Collections

Similar articles

Cited by other articles, links to ncbi databases.

  • Download .nbib .nbib
  • Format: AMA APA MLA NLM

Add to Collections

Advertisement

Advertisement

Cyber risk and cybersecurity: a systematic review of data availability

  • Open access
  • Published: 17 February 2022
  • Volume 47 , pages 698–736, ( 2022 )

Cite this article

You have full access to this open access article

literature review about cybersecurity

  • Frank Cremer 1 ,
  • Barry Sheehan   ORCID: orcid.org/0000-0003-4592-7558 1 ,
  • Michael Fortmann 2 ,
  • Arash N. Kia 1 ,
  • Martin Mullins 1 ,
  • Finbarr Murphy 1 &
  • Stefan Materne 2  

84k Accesses

111 Citations

43 Altmetric

Explore all metrics

Cybercrime is estimated to have cost the global economy just under USD 1 trillion in 2020, indicating an increase of more than 50% since 2018. With the average cyber insurance claim rising from USD 145,000 in 2019 to USD 359,000 in 2020, there is a growing necessity for better cyber information sources, standardised databases, mandatory reporting and public awareness. This research analyses the extant academic and industry literature on cybersecurity and cyber risk management with a particular focus on data availability. From a preliminary search resulting in 5219 cyber peer-reviewed studies, the application of the systematic methodology resulted in 79 unique datasets. We posit that the lack of available data on cyber risk poses a serious problem for stakeholders seeking to tackle this issue. In particular, we identify a lacuna in open databases that undermine collective endeavours to better manage this set of risks. The resulting data evaluation and categorisation will support cybersecurity researchers and the insurance industry in their efforts to comprehend, metricise and manage cyber risks.

Similar content being viewed by others

literature review about cybersecurity

Systematic Review: Cybersecurity Risk Taxonomy

literature review about cybersecurity

A Survey of Cybersecurity Risk Management Frameworks

literature review about cybersecurity

Cybersecurity Risk Management Frameworks in the Oil and Gas Sector: A Systematic Literature Review

Avoid common mistakes on your manuscript.

Introduction

Globalisation, digitalisation and smart technologies have escalated the propensity and severity of cybercrime. Whilst it is an emerging field of research and industry, the importance of robust cybersecurity defence systems has been highlighted at the corporate, national and supranational levels. The impacts of inadequate cybersecurity are estimated to have cost the global economy USD 945 billion in 2020 (Maleks Smith et al. 2020 ). Cyber vulnerabilities pose significant corporate risks, including business interruption, breach of privacy and financial losses (Sheehan et al. 2019 ). Despite the increasing relevance for the international economy, the availability of data on cyber risks remains limited. The reasons for this are many. Firstly, it is an emerging and evolving risk; therefore, historical data sources are limited (Biener et al. 2015 ). It could also be due to the fact that, in general, institutions that have been hacked do not publish the incidents (Eling and Schnell 2016 ). The lack of data poses challenges for many areas, such as research, risk management and cybersecurity (Falco et al. 2019 ). The importance of this topic is demonstrated by the announcement of the European Council in April 2021 that a centre of excellence for cybersecurity will be established to pool investments in research, technology and industrial development. The goal of this centre is to increase the security of the internet and other critical network and information systems (European Council 2021 ).

This research takes a risk management perspective, focusing on cyber risk and considering the role of cybersecurity and cyber insurance in risk mitigation and risk transfer. The study reviews the existing literature and open data sources related to cybersecurity and cyber risk. This is the first systematic review of data availability in the general context of cyber risk and cybersecurity. By identifying and critically analysing the available datasets, this paper supports the research community by aggregating, summarising and categorising all available open datasets. In addition, further information on datasets is attached to provide deeper insights and support stakeholders engaged in cyber risk control and cybersecurity. Finally, this research paper highlights the need for open access to cyber-specific data, without price or permission barriers.

The identified open data can support cyber insurers in their efforts on sustainable product development. To date, traditional risk assessment methods have been untenable for insurance companies due to the absence of historical claims data (Sheehan et al. 2021 ). These high levels of uncertainty mean that cyber insurers are more inclined to overprice cyber risk cover (Kshetri 2018 ). Combining external data with insurance portfolio data therefore seems to be essential to improve the evaluation of the risk and thus lead to risk-adjusted pricing (Bessy-Roland et al. 2021 ). This argument is also supported by the fact that some re/insurers reported that they are working to improve their cyber pricing models (e.g. by creating or purchasing databases from external providers) (EIOPA 2018 ). Figure  1 provides an overview of pricing tools and factors considered in the estimation of cyber insurance based on the findings of EIOPA ( 2018 ) and the research of Romanosky et al. ( 2019 ). The term cyber risk refers to all cyber risks and their potential impact.

figure 1

An overview of the current cyber insurance informational and methodological landscape, adapted from EIOPA ( 2018 ) and Romanosky et al. ( 2019 )

Besides the advantage of risk-adjusted pricing, the availability of open datasets helps companies benchmark their internal cyber posture and cybersecurity measures. The research can also help to improve risk awareness and corporate behaviour. Many companies still underestimate their cyber risk (Leong and Chen 2020 ). For policymakers, this research offers starting points for a comprehensive recording of cyber risks. Although in many countries, companies are obliged to report data breaches to the respective supervisory authority, this information is usually not accessible to the research community. Furthermore, the economic impact of these breaches is usually unclear.

As well as the cyber risk management community, this research also supports cybersecurity stakeholders. Researchers are provided with an up-to-date, peer-reviewed literature of available datasets showing where these datasets have been used. For example, this includes datasets that have been used to evaluate the effectiveness of countermeasures in simulated cyberattacks or to test intrusion detection systems. This reduces a time-consuming search for suitable datasets and ensures a comprehensive review of those available. Through the dataset descriptions, researchers and industry stakeholders can compare and select the most suitable datasets for their purposes. In addition, it is possible to combine the datasets from one source in the context of cybersecurity or cyber risk. This supports efficient and timely progress in cyber risk research and is beneficial given the dynamic nature of cyber risks.

Cyber risks are defined as “operational risks to information and technology assets that have consequences affecting the confidentiality, availability, and/or integrity of information or information systems” (Cebula et al. 2014 ). Prominent cyber risk events include data breaches and cyberattacks (Agrafiotis et al. 2018 ). The increasing exposure and potential impact of cyber risk have been highlighted in recent industry reports (e.g. Allianz 2021 ; World Economic Forum 2020 ). Cyberattacks on critical infrastructures are ranked 5th in the World Economic Forum's Global Risk Report. Ransomware, malware and distributed denial-of-service (DDoS) are examples of the evolving modes of a cyberattack. One example is the ransomware attack on the Colonial Pipeline, which shut down the 5500 mile pipeline system that delivers 2.5 million barrels of fuel per day and critical liquid fuel infrastructure from oil refineries to states along the U.S. East Coast (Brower and McCormick 2021 ). These and other cyber incidents have led the U.S. to strengthen its cybersecurity and introduce, among other things, a public body to analyse major cyber incidents and make recommendations to prevent a recurrence (Murphey 2021a ). Another example of the scope of cyberattacks is the ransomware NotPetya in 2017. The damage amounted to USD 10 billion, as the ransomware exploited a vulnerability in the windows system, allowing it to spread independently worldwide in the network (GAO 2021 ). In the same year, the ransomware WannaCry was launched by cybercriminals. The cyberattack on Windows software took user data hostage in exchange for Bitcoin cryptocurrency (Smart 2018 ). The victims included the National Health Service in Great Britain. As a result, ambulances were redirected to other hospitals because of information technology (IT) systems failing, leaving people in need of urgent assistance waiting. It has been estimated that 19,000 cancelled treatment appointments resulted from losses of GBP 92 million (Field 2018 ). Throughout the COVID-19 pandemic, ransomware attacks increased significantly, as working from home arrangements increased vulnerability (Murphey 2021b ).

Besides cyberattacks, data breaches can also cause high costs. Under the General Data Protection Regulation (GDPR), companies are obliged to protect personal data and safeguard the data protection rights of all individuals in the EU area. The GDPR allows data protection authorities in each country to impose sanctions and fines on organisations they find in breach. “For data breaches, the maximum fine can be €20 million or 4% of global turnover, whichever is higher” (GDPR.EU 2021 ). Data breaches often involve a large amount of sensitive data that has been accessed, unauthorised, by external parties, and are therefore considered important for information security due to their far-reaching impact (Goode et al. 2017 ). A data breach is defined as a “security incident in which sensitive, protected, or confidential data are copied, transmitted, viewed, stolen, or used by an unauthorized individual” (Freeha et al. 2021 ). Depending on the amount of data, the extent of the damage caused by a data breach can be significant, with the average cost being USD 392 million Footnote 1 (IBM Security 2020 ).

This research paper reviews the existing literature and open data sources related to cybersecurity and cyber risk, focusing on the datasets used to improve academic understanding and advance the current state-of-the-art in cybersecurity. Furthermore, important information about the available datasets is presented (e.g. use cases), and a plea is made for open data and the standardisation of cyber risk data for academic comparability and replication. The remainder of the paper is structured as follows. The next section describes the related work regarding cybersecurity and cyber risks. The third section outlines the review method used in this work and the process. The fourth section details the results of the identified literature. Further discussion is presented in the penultimate section and the final section concludes.

Related work

Due to the significance of cyber risks, several literature reviews have been conducted in this field. Eling ( 2020 ) reviewed the existing academic literature on the topic of cyber risk and cyber insurance from an economic perspective. A total of 217 papers with the term ‘cyber risk’ were identified and classified in different categories. As a result, open research questions are identified, showing that research on cyber risks is still in its infancy because of their dynamic and emerging nature. Furthermore, the author highlights that particular focus should be placed on the exchange of information between public and private actors. An improved information flow could help to measure the risk more accurately and thus make cyber risks more insurable and help risk managers to determine the right level of cyber risk for their company. In the context of cyber insurance data, Romanosky et al. ( 2019 ) analysed the underwriting process for cyber insurance and revealed how cyber insurers understand and assess cyber risks. For this research, they examined 235 American cyber insurance policies that were publicly available and looked at three components (coverage, application questionnaires and pricing). The authors state in their findings that many of the insurers used very simple, flat-rate pricing (based on a single calculation of expected loss), while others used more parameters such as the asset value of the company (or company revenue) or standard insurance metrics (e.g. deductible, limits), and the industry in the calculation. This is in keeping with Eling ( 2020 ), who states that an increased amount of data could help to make cyber risk more accurately measured and thus more insurable. Similar research on cyber insurance and data was conducted by Nurse et al. ( 2020 ). The authors examined cyber insurance practitioners' perceptions and the challenges they face in collecting and using data. In addition, gaps were identified during the research where further data is needed. The authors concluded that cyber insurance is still in its infancy, and there are still several unanswered questions (for example, cyber valuation, risk calculation and recovery). They also pointed out that a better understanding of data collection and use in cyber insurance would be invaluable for future research and practice. Bessy-Roland et al. ( 2021 ) come to a similar conclusion. They proposed a multivariate Hawkes framework to model and predict the frequency of cyberattacks. They used a public dataset with characteristics of data breaches affecting the U.S. industry. In the conclusion, the authors make the argument that an insurer has a better knowledge of cyber losses, but that it is based on a small dataset and therefore combination with external data sources seems essential to improve the assessment of cyber risks.

Several systematic reviews have been published in the area of cybersecurity (Kruse et al. 2017 ; Lee et al. 2020 ; Loukas et al. 2013 ; Ulven and Wangen 2021 ). In these papers, the authors concentrated on a specific area or sector in the context of cybersecurity. This paper adds to this extant literature by focusing on data availability and its importance to risk management and insurance stakeholders. With a priority on healthcare and cybersecurity, Kruse et al. ( 2017 ) conducted a systematic literature review. The authors identified 472 articles with the keywords ‘cybersecurity and healthcare’ or ‘ransomware’ in the databases Cumulative Index of Nursing and Allied Health Literature, PubMed and Proquest. Articles were eligible for this review if they satisfied three criteria: (1) they were published between 2006 and 2016, (2) the full-text version of the article was available, and (3) the publication is a peer-reviewed or scholarly journal. The authors found that technological development and federal policies (in the U.S.) are the main factors exposing the health sector to cyber risks. Loukas et al. ( 2013 ) conducted a review with a focus on cyber risks and cybersecurity in emergency management. The authors provided an overview of cyber risks in communication, sensor, information management and vehicle technologies used in emergency management and showed areas for which there is still no solution in the literature. Similarly, Ulven and Wangen ( 2021 ) reviewed the literature on cybersecurity risks in higher education institutions. For the literature review, the authors used the keywords ‘cyber’, ‘information threats’ or ‘vulnerability’ in connection with the terms ‘higher education, ‘university’ or ‘academia’. A similar literature review with a focus on Internet of Things (IoT) cybersecurity was conducted by Lee et al. ( 2020 ). The review revealed that qualitative approaches focus on high-level frameworks, and quantitative approaches to cybersecurity risk management focus on risk assessment and quantification of cyberattacks and impacts. In addition, the findings presented a four-step IoT cyber risk management framework that identifies, quantifies and prioritises cyber risks.

Datasets are an essential part of cybersecurity research, underlined by the following works. Ilhan Firat et al. ( 2021 ) examined various cybersecurity datasets in detail. The study was motivated by the fact that with the proliferation of the internet and smart technologies, the mode of cyberattacks is also evolving. However, in order to prevent such attacks, they must first be detected; the dissemination and further development of cybersecurity datasets is therefore critical. In their work, the authors observed studies of datasets used in intrusion detection systems. Khraisat et al. ( 2019 ) also identified a need for new datasets in the context of cybersecurity. The researchers presented a taxonomy of current intrusion detection systems, a comprehensive review of notable recent work, and an overview of the datasets commonly used for assessment purposes. In their conclusion, the authors noted that new datasets are needed because most machine-learning techniques are trained and evaluated on the knowledge of old datasets. These datasets do not contain new and comprehensive information and are partly derived from datasets from 1999. The authors noted that the core of this issue is the availability of new public datasets as well as their quality. The availability of data, how it is used, created and shared was also investigated by Zheng et al. ( 2018 ). The researchers analysed 965 cybersecurity research papers published between 2012 and 2016. They created a taxonomy of the types of data that are created and shared and then analysed the data collected via datasets. The researchers concluded that while datasets are recognised as valuable for cybersecurity research, the proportion of publicly available datasets is limited.

The main contributions of this review and what differentiates it from previous studies can be summarised as follows. First, as far as we can tell, it is the first work to summarise all available datasets on cyber risk and cybersecurity in the context of a systematic review and present them to the scientific community and cyber insurance and cybersecurity stakeholders. Second, we investigated, analysed, and made available the datasets to support efficient and timely progress in cyber risk research. And third, we enable comparability of datasets so that the appropriate dataset can be selected depending on the research area.

Methodology

Process and eligibility criteria.

The structure of this systematic review is inspired by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework (Page et al. 2021 ), and the search was conducted from 3 to 10 May 2021. Due to the continuous development of cyber risks and their countermeasures, only articles published in the last 10 years were considered. In addition, only articles published in peer-reviewed journals written in English were included. As a final criterion, only articles that make use of one or more cybersecurity or cyber risk datasets met the inclusion criteria. Specifically, these studies presented new or existing datasets, used them for methods, or used them to verify new results, as well as analysed them in an economic context and pointed out their effects. The criterion was fulfilled if it was clearly stated in the abstract that one or more datasets were used. A detailed explanation of this selection criterion can be found in the ‘Study selection’ section.

Information sources

In order to cover a complete spectrum of literature, various databases were queried to collect relevant literature on the topic of cybersecurity and cyber risks. Due to the spread of related articles across multiple databases, the literature search was limited to the following four databases for simplicity: IEEE Xplore, Scopus, SpringerLink and Web of Science. This is similar to other literature reviews addressing cyber risks or cybersecurity, including Sardi et al. ( 2021 ), Franke and Brynielsson ( 2014 ), Lagerström (2019), Eling and Schnell ( 2016 ) and Eling ( 2020 ). In this paper, all databases used in the aforementioned works were considered. However, only two studies also used all the databases listed. The IEEE Xplore database contains electrical engineering, computer science, and electronics work from over 200 journals and three million conference papers (IEEE 2021 ). Scopus includes 23,400 peer-reviewed journals from more than 5000 international publishers in the areas of science, engineering, medicine, social sciences and humanities (Scopus 2021 ). SpringerLink contains 3742 journals and indexes over 10 million scientific documents (SpringerLink 2021 ). Finally, Web of Science indexes over 9200 journals in different scientific disciplines (Science 2021 ).

A search string was created and applied to all databases. To make the search efficient and reproducible, the following search string with Boolean operator was used in all databases: cybersecurity OR cyber risk AND dataset OR database. To ensure uniformity of the search across all databases, some adjustments had to be made for the respective search engines. In Scopus, for example, the Advanced Search was used, and the field code ‘Title-ABS-KEY’ was integrated into the search string. For IEEE Xplore, the search was carried out with the Search String in the Command Search and ‘All Metadata’. In the Web of Science database, the Advanced Search was used. The special feature of this search was that it had to be carried out in individual steps. The first search was carried out with the terms cybersecurity OR cyber risk with the field tag Topic (T.S. =) and the second search with dataset OR database. Subsequently, these searches were combined, which then delivered the searched articles for review. For SpringerLink, the search string was used in the Advanced Search under the category ‘Find the resources with all of the words’. After conducting this search string, 5219 studies could be found. According to the eligibility criteria (period, language and only scientific journals), 1581 studies were identified in the databases:

Scopus: 135

Springer Link: 548

Web of Science: 534

An overview of the process is given in Fig.  2 . Combined with the results from the four databases, 854 articles without duplicates were identified.

figure 2

Literature search process and categorisation of the studies

Study selection

In the final step of the selection process, the articles were screened for relevance. Due to a large number of results, the abstracts were analysed in the first step of the process. The aim was to determine whether the article was relevant for the systematic review. An article fulfilled the criterion if it was recognisable in the abstract that it had made a contribution to datasets or databases with regard to cyber risks or cybersecurity. Specifically, the criterion was considered to be met if the abstract used datasets that address the causes or impacts of cyber risks, and measures in the area of cybersecurity. In this process, the number of articles was reduced to 288. The articles were then read in their entirety, and an expert panel of six people decided whether they should be used. This led to a final number of 255 articles. The years in which the articles were published and the exact number can be seen in Fig.  3 .

figure 3

Distribution of studies

Data collection process and synthesis of the results

For the data collection process, various data were extracted from the studies, including the names of the respective creators, the name of the dataset or database and the corresponding reference. It was also determined where the data came from. In the context of accessibility, it was determined whether access is free, controlled, available for purchase or not available. It was also determined when the datasets were created and the time period referenced. The application type and domain characteristics of the datasets were identified.

This section analyses the results of the systematic literature review. The previously identified studies are divided into three categories: datasets on the causes of cyber risks, datasets on the effects of cyber risks and datasets on cybersecurity. The classification is based on the intended use of the studies. This system of classification makes it easier for stakeholders to find the appropriate datasets. The categories are evaluated individually. Although complete information is available for a large proportion of datasets, this is not true for all of them. Accordingly, the abbreviation N/A has been inserted in the respective characters to indicate that this information could not be determined by the time of submission. The term ‘use cases in the literature’ in the following and supplementary tables refers to the application areas in which the corresponding datasets were used in the literature. The areas listed there refer to the topic area on which the researchers conducted their research. Since some datasets were used interdisciplinarily, the listed use cases in the literature are correspondingly longer. Before discussing each category in the next sections, Fig.  4 provides an overview of the number of datasets found and their year of creation. Figure  5 then shows the relationship between studies and datasets in the period under consideration. Figure  6 shows the distribution of studies, their use of datasets and their creation date. The number of datasets used is higher than the number of studies because the studies often used several datasets (Table 1 ).

figure 4

Distribution of dataset results

figure 5

Correlation between the studies and the datasets

figure 6

Distribution of studies and their use of datasets

Most of the datasets are generated in the U.S. (up to 58.2%). Canada and Australia rank next, with 11.3% and 5% of all the reviewed datasets, respectively.

Additionally, to create value for the datasets for the cyber insurance industry, an assessment of the applicability of each dataset has been provided for cyber insurers. This ‘Use Case Assessment’ includes the use of the data in the context of different analyses, calculation of cyber insurance premiums, and use of the information for the design of cyber insurance contracts or for additional customer services. To reasonably account for the transition of direct hyperlinks in the future, references were directed to the main websites for longevity (nearest resource point). In addition, the links to the main pages contain further information on the datasets and different versions related to the operating systems. The references were chosen in such a way that practitioners get the best overview of the respective datasets.

Case datasets

This section presents selected articles that use the datasets to analyse the causes of cyber risks. The datasets help identify emerging trends and allow pattern discovery in cyber risks. This information gives cybersecurity experts and cyber insurers the data to make better predictions and take appropriate action. For example, if certain vulnerabilities are not adequately protected, cyber insurers will demand a risk surcharge leading to an improvement in the risk-adjusted premium. Due to the capricious nature of cyber risks, existing data must be supplemented with new data sources (for example, new events, new methods or security vulnerabilities) to determine prevailing cyber exposure. The datasets of cyber risk causes could be combined with existing portfolio data from cyber insurers and integrated into existing pricing tools and factors to improve the valuation of cyber risks.

A portion of these datasets consists of several taxonomies and classifications of cyber risks. Aassal et al. ( 2020 ) propose a new taxonomy of phishing characteristics based on the interpretation and purpose of each characteristic. In comparison, Hindy et al. ( 2020 ) presented a taxonomy of network threats and the impact of current datasets on intrusion detection systems. A similar taxonomy was suggested by Kiwia et al. ( 2018 ). The authors presented a cyber kill chain-based taxonomy of banking Trojans features. The taxonomy built on a real-world dataset of 127 banking Trojans collected from December 2014 to January 2016 by a major U.K.-based financial organisation.

In the context of classification, Aamir et al. ( 2021 ) showed the benefits of machine learning for classifying port scans and DDoS attacks in a mixture of normal and attack traffic. Guo et al. ( 2020 ) presented a new method to improve malware classification based on entropy sequence features. The evaluation of this new method was conducted on different malware datasets.

To reconstruct attack scenarios and draw conclusions based on the evidence in the alert stream, Barzegar and Shajari ( 2018 ) use the DARPA2000 and MACCDC 2012 dataset for their research. Giudici and Raffinetti ( 2020 ) proposed a rank-based statistical model aimed at predicting the severity levels of cyber risk. The model used cyber risk data from the University of Milan. In contrast to the previous datasets, Skrjanc et al. ( 2018 ) used the older dataset KDD99 to monitor large-scale cyberattacks using a cauchy clustering method.

Amin et al. ( 2021 ) used a cyberattack dataset from the Canadian Institute for Cybersecurity to identify spatial clusters of countries with high rates of cyberattacks. In the context of cybercrime, Junger et al. ( 2020 ) examined crime scripts, key characteristics of the target company and the relationship between criminal effort and financial benefit. For their study, the authors analysed 300 cases of fraudulent activities against Dutch companies. With a similar focus on cybercrime, Mireles et al. ( 2019 ) proposed a metric framework to measure the effectiveness of the dynamic evolution of cyberattacks and defensive measures. To validate its usefulness, they used the DEFCON dataset.

Due to the rapidly changing nature of cyber risks, it is often impossible to obtain all information on them. Kim and Kim ( 2019 ) proposed an automated dataset generation system called CTIMiner that collects threat data from publicly available security reports and malware repositories. They released a dataset to the public containing about 640,000 records from 612 security reports published between January 2008 and 2019. A similar approach is proposed by Kim et al. ( 2020 ), using a named entity recognition system to extract core information from cyber threat reports automatically. They created a 498,000-tag dataset during their research (Ulven and Wangen 2021 ).

Within the framework of vulnerabilities and cybersecurity issues, Ulven and Wangen ( 2021 ) proposed an overview of mission-critical assets and everyday threat events, suggested a generic threat model, and summarised common cybersecurity vulnerabilities. With a focus on hospitality, Chen and Fiscus ( 2018 ) proposed several issues related to cybersecurity in this sector. They analysed 76 security incidents from the Privacy Rights Clearinghouse database. Supplementary Table 1 lists all findings that belong to the cyber causes dataset.

Impact datasets

This section outlines selected findings of the cyber impact dataset. For cyber insurers, these datasets can form an important basis for information, as they can be used to calculate cyber insurance premiums, evaluate specific cyber risks, formulate inclusions and exclusions in cyber wordings, and re-evaluate as well as supplement the data collected so far on cyber risks. For example, information on financial losses can help to better assess the loss potential of cyber risks. Furthermore, the datasets can provide insight into the frequency of occurrence of these cyber risks. The new datasets can be used to close any data gaps that were previously based on very approximate estimates or to find new results.

Eight studies addressed the costs of data breaches. For instance, Eling and Jung ( 2018 ) reviewed 3327 data breach events from 2005 to 2016 and identified an asymmetric dependence of monthly losses by breach type and industry. The authors used datasets from the Privacy Rights Clearinghouse for analysis. The Privacy Rights Clearinghouse datasets and the Breach level index database were also used by De Giovanni et al. ( 2020 ) to describe relationships between data breaches and bitcoin-related variables using the cointegration methodology. The data were obtained from the Department of Health and Human Services of healthcare facilities reporting data breaches and a national database of technical and organisational infrastructure information. Also in the context of data breaches, Algarni et al. ( 2021 ) developed a comprehensive, formal model that estimates the two components of security risks: breach cost and the likelihood of a data breach within 12 months. For their survey, the authors used two industrial reports from the Ponemon institute and VERIZON. To illustrate the scope of data breaches, Neto et al. ( 2021 ) identified 430 major data breach incidents among more than 10,000 incidents. The database created is available and covers the period 2018 to 2019.

With a direct focus on insurance, Biener et al. ( 2015 ) analysed 994 cyber loss cases from an operational risk database and investigated the insurability of cyber risks based on predefined criteria. For their study, they used data from the company SAS OpRisk Global Data. Similarly, Eling and Wirfs ( 2019 ) looked at a wide range of cyber risk events and actual cost data using the same database. They identified cyber losses and analysed them using methods from statistics and actuarial science. Using a similar reference, Farkas et al. ( 2021 ) proposed a method for analysing cyber claims based on regression trees to identify criteria for classifying and evaluating claims. Similar to Chen and Fiscus ( 2018 ), the dataset used was the Privacy Rights Clearinghouse database. Within the framework of reinsurance, Moro ( 2020 ) analysed cyber index-based information technology activity to see if index-parametric reinsurance coverage could suggest its cedant using data from a Symantec dataset.

Paté-Cornell et al. ( 2018 ) presented a general probabilistic risk analysis framework for cybersecurity in an organisation to be specified. The results are distributions of losses to cyberattacks, with and without considered countermeasures in support of risk management decisions based both on past data and anticipated incidents. The data used were from The Common Vulnerability and Exposures database and via confidential access to a database of cyberattacks on a large, U.S.-based organisation. A different conceptual framework for cyber risk classification and assessment was proposed by Sheehan et al. ( 2021 ). This framework showed the importance of proactive and reactive barriers in reducing companies’ exposure to cyber risk and quantifying the risk. Another approach to cyber risk assessment and mitigation was proposed by Mukhopadhyay et al. ( 2019 ). They estimated the probability of an attack using generalised linear models, predicted the security technology required to reduce the probability of cyberattacks, and used gamma and exponential distributions to best approximate the average loss data for each malicious attack. They also calculated the expected loss due to cyberattacks, calculated the net premium that would need to be charged by a cyber insurer, and suggested cyber insurance as a strategy to minimise losses. They used the CSI-FBI survey (1997–2010) to conduct their research.

In order to highlight the lack of data on cyber risks, Eling ( 2020 ) conducted a literature review in the areas of cyber risk and cyber insurance. Available information on the frequency, severity, and dependency structure of cyber risks was filtered out. In addition, open questions for future cyber risk research were set up. Another example of data collection on the impact of cyberattacks is provided by Sornette et al. ( 2013 ), who use a database of newspaper articles, press reports and other media to provide a predictive method to identify triggering events and potential accident scenarios and estimate their severity and frequency. A similar approach to data collection was used by Arcuri et al. ( 2020 ) to gather an original sample of global cyberattacks from newspaper reports sourced from the LexisNexis database. This collection is also used and applied to the fields of dynamic communication and cyber risk perception by Fang et al. ( 2021 ). To create a dataset of cyber incidents and disputes, Valeriano and Maness ( 2014 ) collected information on cyber interactions between rival states.

To assess trends and the scale of economic cybercrime, Levi ( 2017 ) examined datasets from different countries and their impact on crime policy. Pooser et al. ( 2018 ) investigated the trend in cyber risk identification from 2006 to 2015 and company characteristics related to cyber risk perception. The authors used a dataset of various reports from cyber insurers for their study. Walker-Roberts et al. ( 2020 ) investigated the spectrum of risk of a cybersecurity incident taking place in the cyber-physical-enabled world using the VERIS Community Database. The datasets of impacts identified are presented below. Due to overlap, some may also appear in the causes dataset (Supplementary Table 2).

Cybersecurity datasets

General intrusion detection.

General intrusion detection systems account for the largest share of countermeasure datasets. For companies or researchers focused on cybersecurity, the datasets can be used to test their own countermeasures or obtain information about potential vulnerabilities. For example, Al-Omari et al. ( 2021 ) proposed an intelligent intrusion detection model for predicting and detecting attacks in cyberspace, which was applied to dataset UNSW-NB 15. A similar approach was taken by Choras and Kozik ( 2015 ), who used machine learning to detect cyberattacks on web applications. To evaluate their method, they used the HTTP dataset CSIC 2010. For the identification of unknown attacks on web servers, Kamarudin et al. ( 2017 ) proposed an anomaly-based intrusion detection system using an ensemble classification approach. Ganeshan and Rodrigues ( 2020 ) showed an intrusion detection system approach, which clusters the database into several groups and detects the presence of intrusion in the clusters. In comparison, AlKadi et al. ( 2019 ) used a localisation-based model to discover abnormal patterns in network traffic. Hybrid models have been recommended by Bhattacharya et al. ( 2020 ) and Agrawal et al. ( 2019 ); the former is a machine-learning model based on principal component analysis for the classification of intrusion detection system datasets, while the latter is a hybrid ensemble intrusion detection system for anomaly detection using different datasets to detect patterns in network traffic that deviate from normal behaviour.

Agarwal et al. ( 2021 ) used three different machine learning algorithms in their research to find the most suitable for efficiently identifying patterns of suspicious network activity. The UNSW-NB15 dataset was used for this purpose. Kasongo and Sun ( 2020 ), Feed-Forward Deep Neural Network (FFDNN), Keshk et al. ( 2021 ), the privacy-preserving anomaly detection framework, and others also use the UNSW-NB 15 dataset as part of intrusion detection systems. The same dataset and others were used by Binbusayyis and Vaiyapuri ( 2019 ) to identify and compare key features for cyber intrusion detection. Atefinia and Ahmadi ( 2021 ) proposed a deep neural network model to reduce the false positive rate of an anomaly-based intrusion detection system. Fossaceca et al. ( 2015 ) focused in their research on the development of a framework that combined the outputs of multiple learners in order to improve the efficacy of network intrusion, and Gauthama Raman et al. ( 2020 ) presented a search algorithm based on Support Vector machine to improve the performance of the detection and false alarm rate to improve intrusion detection techniques. Ahmad and Alsemmeari ( 2020 ) targeted extreme learning machine techniques due to their good capabilities in classification problems and handling huge data. They used the NSL-KDD dataset as a benchmark.

With reference to prediction, Bakdash et al. ( 2018 ) used datasets from the U.S. Department of Defence to predict cyberattacks by malware. This dataset consists of weekly counts of cyber events over approximately seven years. Another prediction method was presented by Fan et al. ( 2018 ), which showed an improved integrated cybersecurity prediction method based on spatial-time analysis. Also, with reference to prediction, Ashtiani and Azgomi ( 2014 ) proposed a framework for the distributed simulation of cyberattacks based on high-level architecture. Kirubavathi and Anitha ( 2016 ) recommended an approach to detect botnets, irrespective of their structures, based on network traffic flow behaviour analysis and machine-learning techniques. Dwivedi et al. ( 2021 ) introduced a multi-parallel adaptive technique to utilise an adaption mechanism in the group of swarms for network intrusion detection. AlEroud and Karabatis ( 2018 ) presented an approach that used contextual information to automatically identify and query possible semantic links between different types of suspicious activities extracted from network flows.

Intrusion detection systems with a focus on IoT

In addition to general intrusion detection systems, a proportion of studies focused on IoT. Habib et al. ( 2020 ) presented an approach for converting traditional intrusion detection systems into smart intrusion detection systems for IoT networks. To enhance the process of diagnostic detection of possible vulnerabilities with an IoT system, Georgescu et al. ( 2019 ) introduced a method that uses a named entity recognition-based solution. With regard to IoT in the smart home sector, Heartfield et al. ( 2021 ) presented a detection system that is able to autonomously adjust the decision function of its underlying anomaly classification models to a smart home’s changing condition. Another intrusion detection system was suggested by Keserwani et al. ( 2021 ), which combined Grey Wolf Optimization and Particle Swam Optimization to identify various attacks for IoT networks. They used the KDD Cup 99, NSL-KDD and CICIDS-2017 to evaluate their model. Abu Al-Haija and Zein-Sabatto ( 2020 ) provide a comprehensive development of a new intelligent and autonomous deep-learning-based detection and classification system for cyberattacks in IoT communication networks that leverage the power of convolutional neural networks, abbreviated as IoT-IDCS-CNN (IoT-based Intrusion Detection and Classification System using Convolutional Neural Network). To evaluate the development, the authors used the NSL-KDD dataset. Biswas and Roy ( 2021 ) recommended a model that identifies malicious botnet traffic using novel deep-learning approaches like artificial neural networks gutted recurrent units and long- or short-term memory models. They tested their model with the Bot-IoT dataset.

With a more forensic background, Koroniotis et al. ( 2020 ) submitted a network forensic framework, which described the digital investigation phases for identifying and tracing attack behaviours in IoT networks. The suggested work was evaluated with the Bot-IoT and UINSW-NB15 datasets. With a focus on big data and IoT, Chhabra et al. ( 2020 ) presented a cyber forensic framework for big data analytics in an IoT environment using machine learning. Furthermore, the authors mentioned different publicly available datasets for machine-learning models.

A stronger focus on a mobile phones was exhibited by Alazab et al. ( 2020 ), which presented a classification model that combined permission requests and application programme interface calls. The model was tested with a malware dataset containing 27,891 Android apps. A similar approach was taken by Li et al. ( 2019a , b ), who proposed a reliable classifier for Android malware detection based on factorisation machine architecture and extraction of Android app features from manifest files and source code.

Literature reviews

In addition to the different methods and models for intrusion detection systems, various literature reviews on the methods and datasets were also found. Liu and Lang ( 2019 ) proposed a taxonomy of intrusion detection systems that uses data objects as the main dimension to classify and summarise machine learning and deep learning-based intrusion detection literature. They also presented four different benchmark datasets for machine-learning detection systems. Ahmed et al. ( 2016 ) presented an in-depth analysis of four major categories of anomaly detection techniques, which include classification, statistical, information theory and clustering. Hajj et al. ( 2021 ) gave a comprehensive overview of anomaly-based intrusion detection systems. Their article gives an overview of the requirements, methods, measurements and datasets that are used in an intrusion detection system.

Within the framework of machine learning, Chattopadhyay et al. ( 2018 ) conducted a comprehensive review and meta-analysis on the application of machine-learning techniques in intrusion detection systems. They also compared different machine learning techniques in different datasets and summarised the performance. Vidros et al. ( 2017 ) presented an overview of characteristics and methods in automatic detection of online recruitment fraud. They also published an available dataset of 17,880 annotated job ads, retrieved from the use of a real-life system. An empirical study of different unsupervised learning algorithms used in the detection of unknown attacks was presented by Meira et al. ( 2020 ).

New datasets

Kilincer et al. ( 2021 ) reviewed different intrusion detection system datasets in detail. They had a closer look at the UNS-NB15, ISCX-2012, NSL-KDD and CIDDS-001 datasets. Stojanovic et al. ( 2020 ) also provided a review on datasets and their creation for use in advanced persistent threat detection in the literature. Another review of datasets was provided by Sarker et al. ( 2020 ), who focused on cybersecurity data science as part of their research and provided an overview from a machine-learning perspective. Avila et al. ( 2021 ) conducted a systematic literature review on the use of security logs for data leak detection. They recommended a new classification of information leak, which uses the GDPR principles, identified the most widely publicly available dataset for threat detection, described the attack types in the datasets and the algorithms used for data leak detection. Tuncer et al. ( 2020 ) presented a bytecode-based detection method consisting of feature extraction using local neighbourhood binary patterns. They chose a byte-based malware dataset to investigate the performance of the proposed local neighbourhood binary pattern-based detection method. With a different focus, Mauro et al. ( 2020 ) gave an experimental overview of neural-based techniques relevant to intrusion detection. They assessed the value of neural networks using the Bot-IoT and UNSW-DB15 datasets.

Another category of results in the context of countermeasure datasets is those that were presented as new. Moreno et al. ( 2018 ) developed a database of 300 security-related accidents from European and American sources. The database contained cybersecurity-related events in the chemical and process industry. Damasevicius et al. ( 2020 ) proposed a new dataset (LITNET-2020) for network intrusion detection. The dataset is a new annotated network benchmark dataset obtained from the real-world academic network. It presents real-world examples of normal and under-attack network traffic. With a focus on IoT intrusion detection systems, Alsaedi et al. ( 2020 ) proposed a new benchmark IoT/IIot datasets for assessing intrusion detection system-enabled IoT systems. Also in the context of IoT, Vaccari et al. ( 2020 ) proposed a dataset focusing on message queue telemetry transport protocols, which can be used to train machine-learning models. To evaluate the performance of machine-learning classifiers, Mahfouz et al. ( 2020 ) created a dataset called Game Theory and Cybersecurity (GTCS). A dataset containing 22,000 malware and benign samples was constructed by Martin et al. ( 2019 ). The dataset can be used as a benchmark to test the algorithm for Android malware classification and clustering techniques. In addition, Laso et al. ( 2017 ) presented a dataset created to investigate how data and information quality estimates enable the detection of anomalies and malicious acts in cyber-physical systems. The dataset contained various cyberattacks and is publicly available.

In addition to the results described above, several other studies were found that fit into the category of countermeasures. Johnson et al. ( 2016 ) examined the time between vulnerability disclosures. Using another vulnerabilities database, Common Vulnerabilities and Exposures (CVE), Subroto and Apriyana ( 2019 ) presented an algorithm model that uses big data analysis of social media and statistical machine learning to predict cyber risks. A similar databank but with a different focus, Common Vulnerability Scoring System, was used by Chatterjee and Thekdi ( 2020 ) to present an iterative data-driven learning approach to vulnerability assessment and management for complex systems. Using the CICIDS2017 dataset to evaluate the performance, Malik et al. ( 2020 ) proposed a control plane-based orchestration for varied, sophisticated threats and attacks. The same dataset was used in another study by Lee et al. ( 2019 ), who developed an artificial security information event management system based on a combination of event profiling for data processing and different artificial network methods. To exploit the interdependence between multiple series, Fang et al. ( 2021 ) proposed a statistical framework. In order to validate the framework, the authors applied it to a dataset of enterprise-level security breaches from the Privacy Rights Clearinghouse and Identity Theft Center database. Another framework with a defensive aspect was recommended by Li et al. ( 2021 ) to increase the robustness of deep neural networks against adversarial malware evasion attacks. Sarabi et al. ( 2016 ) investigated whether and to what extent business details can help assess an organisation's risk of data breaches and the distribution of risk across different types of incidents to create policies for protection, detection and recovery from different forms of security incidents. They used data from the VERIS Community Database.

Datasets that have been classified into the cybersecurity category are detailed in Supplementary Table 3. Due to overlap, records from the previous tables may also be included.

This paper presented a systematic literature review of studies on cyber risk and cybersecurity that used datasets. Within this framework, 255 studies were fully reviewed and then classified into three different categories. Then, 79 datasets were consolidated from these studies. These datasets were subsequently analysed, and important information was selected through a process of filtering out. This information was recorded in a table and enhanced with further information as part of the literature analysis. This made it possible to create a comprehensive overview of the datasets. For example, each dataset contains a description of where the data came from and how the data has been used to date. This allows different datasets to be compared and the appropriate dataset for the use case to be selected. This research certainly has limitations, so our selection of datasets cannot necessarily be taken as a representation of all available datasets related to cyber risks and cybersecurity. For example, literature searches were conducted in four academic databases and only found datasets that were used in the literature. Many research projects also used old datasets that may no longer consider current developments. In addition, the data are often focused on only one observation and are limited in scope. For example, the datasets can only be applied to specific contexts and are also subject to further limitations (e.g. region, industry, operating system). In the context of the applicability of the datasets, it is unfortunately not possible to make a clear statement on the extent to which they can be integrated into academic or practical areas of application or how great this effort is. Finally, it remains to be pointed out that this is an overview of currently available datasets, which are subject to constant change.

Due to the lack of datasets on cyber risks in the academic literature, additional datasets on cyber risks were integrated as part of a further search. The search was conducted on the Google Dataset search portal. The search term used was ‘cyber risk datasets’. Over 100 results were found. However, due to the low significance and verifiability, only 20 selected datasets were included. These can be found in Table 2  in the “ Appendix ”.

The results of the literature review and datasets also showed that there continues to be a lack of available, open cyber datasets. This lack of data is reflected in cyber insurance, for example, as it is difficult to find a risk-based premium without a sufficient database (Nurse et al. 2020 ). The global cyber insurance market was estimated at USD 5.5 billion in 2020 (Dyson 2020 ). When compared to the USD 1 trillion global losses from cybercrime (Maleks Smith et al. 2020 ), it is clear that there exists a significant cyber risk awareness challenge for both the insurance industry and international commerce. Without comprehensive and qualitative data on cyber losses, it can be difficult to estimate potential losses from cyberattacks and price cyber insurance accordingly (GAO 2021 ). For instance, the average cyber insurance loss increased from USD 145,000 in 2019 to USD 359,000 in 2020 (FitchRatings 2021 ). Cyber insurance is an important risk management tool to mitigate the financial impact of cybercrime. This is particularly evident in the impact of different industries. In the Energy & Commodities financial markets, a ransomware attack on the Colonial Pipeline led to a substantial impact on the U.S. economy. As a result of the attack, about 45% of the U.S. East Coast was temporarily unable to obtain supplies of diesel, petrol and jet fuel. This caused the average price in the U.S. to rise 7 cents to USD 3.04 per gallon, the highest in seven years (Garber 2021 ). In addition, Colonial Pipeline confirmed that it paid a USD 4.4 million ransom to a hacker gang after the attack. Another ransomware attack occurred in the healthcare and government sector. The victim of this attack was the Irish Health Service Executive (HSE). A ransom payment of USD 20 million was demanded from the Irish government to restore services after the hack (Tidy 2021 ). In the car manufacturing sector, Miller and Valasek ( 2015 ) initiated a cyberattack that resulted in the recall of 1.4 million vehicles and cost manufacturers EUR 761 million. The risk that arises in the context of these events is the potential for the accumulation of cyber losses, which is why cyber insurers are not expanding their capacity. An example of this accumulation of cyber risks is the NotPetya malware attack, which originated in Russia, struck in Ukraine, and rapidly spread around the world, causing at least USD 10 billion in damage (GAO 2021 ). These events highlight the importance of proper cyber risk management.

This research provides cyber insurance stakeholders with an overview of cyber datasets. Cyber insurers can use the open datasets to improve their understanding and assessment of cyber risks. For example, the impact datasets can be used to better measure financial impacts and their frequencies. These data could be combined with existing portfolio data from cyber insurers and integrated with existing pricing tools and factors to better assess cyber risk valuation. Although most cyber insurers have sparse historical cyber policy and claims data, they remain too small at present for accurate prediction (Bessy-Roland et al. 2021 ). A combination of portfolio data and external datasets would support risk-adjusted pricing for cyber insurance, which would also benefit policyholders. In addition, cyber insurance stakeholders can use the datasets to identify patterns and make better predictions, which would benefit sustainable cyber insurance coverage. In terms of cyber risk cause datasets, cyber insurers can use the data to review their insurance products. For example, the data could provide information on which cyber risks have not been sufficiently considered in product design or where improvements are needed. A combination of cyber cause and cybersecurity datasets can help establish uniform definitions to provide greater transparency and clarity. Consistent terminology could lead to a more sustainable cyber market, where cyber insurers make informed decisions about the level of coverage and policyholders understand their coverage (The Geneva Association 2020).

In addition to the cyber insurance community, this research also supports cybersecurity stakeholders. The reviewed literature can be used to provide a contemporary, contextual and categorised summary of available datasets. This supports efficient and timely progress in cyber risk research and is beneficial given the dynamic nature of cyber risks. With the help of the described cybersecurity datasets and the identified information, a comparison of different datasets is possible. The datasets can be used to evaluate the effectiveness of countermeasures in simulated cyberattacks or to test intrusion detection systems.

In this paper, we conducted a systematic review of studies on cyber risk and cybersecurity databases. We found that most of the datasets are in the field of intrusion detection and machine learning and are used for technical cybersecurity aspects. The available datasets on cyber risks were relatively less represented. Due to the dynamic nature and lack of historical data, assessing and understanding cyber risk is a major challenge for cyber insurance stakeholders. To address this challenge, a greater density of cyber data is needed to support cyber insurers in risk management and researchers with cyber risk-related topics. With reference to ‘Open Science’ FAIR data (Jacobsen et al. 2020 ), mandatory reporting of cyber incidents could help improve cyber understanding, awareness and loss prevention among companies and insurers. Through greater availability of data, cyber risks can be better understood, enabling researchers to conduct more in-depth research into these risks. Companies could incorporate this new knowledge into their corporate culture to reduce cyber risks. For insurance companies, this would have the advantage that all insurers would have the same understanding of cyber risks, which would support sustainable risk-based pricing. In addition, common definitions of cyber risks could be derived from new data.

The cybersecurity databases summarised and categorised in this research could provide a different perspective on cyber risks that would enable the formulation of common definitions in cyber policies. The datasets can help companies addressing cybersecurity and cyber risk as part of risk management assess their internal cyber posture and cybersecurity measures. The paper can also help improve risk awareness and corporate behaviour, and provides the research community with a comprehensive overview of peer-reviewed datasets and other available datasets in the area of cyber risk and cybersecurity. This approach is intended to support the free availability of data for research. The complete tabulated review of the literature is included in the Supplementary Material.

This work provides directions for several paths of future work. First, there are currently few publicly available datasets for cyber risk and cybersecurity. The older datasets that are still widely used no longer reflect today's technical environment. Moreover, they can often only be used in one context, and the scope of the samples is very limited. It would be of great value if more datasets were publicly available that reflect current environmental conditions. This could help intrusion detection systems to consider current events and thus lead to a higher success rate. It could also compensate for the disadvantages of older datasets by collecting larger quantities of samples and making this contextualisation more widespread. Another area of research may be the integratability and adaptability of cybersecurity and cyber risk datasets. For example, it is often unclear to what extent datasets can be integrated or adapted to existing data. For cyber risks and cybersecurity, it would be helpful to know what requirements need to be met or what is needed to use the datasets appropriately. In addition, it would certainly be helpful to know whether datasets can be modified to be used for cyber risks or cybersecurity. Finally, the ability for stakeholders to identify machine-readable cybersecurity datasets would be useful because it would allow for even clearer delineations or comparisons between datasets. Due to the lack of publicly available datasets, concrete benchmarks often cannot be applied.

Average cost of a breach of more than 50 million records.

Aamir, M., S.S.H. Rizvi, M.A. Hashmani, M. Zubair, and J. Ahmad. 2021. Machine learning classification of port scanning and DDoS attacks: A comparative analysis. Mehran University Research Journal of Engineering and Technology 40 (1): 215–229. https://doi.org/10.22581/muet1982.2101.19 .

Article   Google Scholar  

Aamir, M., and S.M.A. Zaidi. 2019. DDoS attack detection with feature engineering and machine learning: The framework and performance evaluation. International Journal of Information Security 18 (6): 761–785. https://doi.org/10.1007/s10207-019-00434-1 .

Aassal, A. El, S. Baki, A. Das, and R.M. Verma. 2020. 2020. An in-depth benchmarking and evaluation of phishing detection research for security needs. IEEE Access 8: 22170–22192. https://doi.org/10.1109/ACCESS.2020.2969780 .

Abu Al-Haija, Q., and S. Zein-Sabatto. 2020. An efficient deep-learning-based detection and classification system for cyber-attacks in IoT communication networks. Electronics 9 (12): 26. https://doi.org/10.3390/electronics9122152 .

Adhikari, U., T.H. Morris, and S.Y. Pan. 2018. Applying Hoeffding adaptive trees for real-time cyber-power event and intrusion classification. IEEE Transactions on Smart Grid 9 (5): 4049–4060. https://doi.org/10.1109/tsg.2017.2647778 .

Agarwal, A., P. Sharma, M. Alshehri, A.A. Mohamed, and O. Alfarraj. 2021. Classification model for accuracy and intrusion detection using machine learning approach. PeerJ Computer Science . https://doi.org/10.7717/peerj-cs.437 .

Agrafiotis, I., J.R.C.. Nurse, M. Goldsmith, S. Creese, and D. Upton. 2018. A taxonomy of cyber-harms: Defining the impacts of cyber-attacks and understanding how they propagate. Journal of Cybersecurity 4: tyy006.

Agrawal, A., S. Mohammed, and J. Fiaidhi. 2019. Ensemble technique for intruder detection in network traffic. International Journal of Security and Its Applications 13 (3): 1–8. https://doi.org/10.33832/ijsia.2019.13.3.01 .

Ahmad, I., and R.A. Alsemmeari. 2020. Towards improving the intrusion detection through ELM (extreme learning machine). CMC Computers Materials & Continua 65 (2): 1097–1111. https://doi.org/10.32604/cmc.2020.011732 .

Ahmed, M., A.N. Mahmood, and J.K. Hu. 2016. A survey of network anomaly detection techniques. Journal of Network and Computer Applications 60: 19–31. https://doi.org/10.1016/j.jnca.2015.11.016 .

Al-Jarrah, O.Y., O. Alhussein, P.D. Yoo, S. Muhaidat, K. Taha, and K. Kim. 2016. Data randomization and cluster-based partitioning for Botnet intrusion detection. IEEE Transactions on Cybernetics 46 (8): 1796–1806. https://doi.org/10.1109/TCYB.2015.2490802 .

Al-Mhiqani, M.N., R. Ahmad, Z.Z. Abidin, W. Yassin, A. Hassan, K.H. Abdulkareem, N.S. Ali, and Z. Yunos. 2020. A review of insider threat detection: Classification, machine learning techniques, datasets, open challenges, and recommendations. Applied Sciences—Basel 10 (15): 41. https://doi.org/10.3390/app10155208 .

Al-Omari, M., M. Rawashdeh, F. Qutaishat, M. Alshira’H, and N. Ababneh. 2021. An intelligent tree-based intrusion detection model for cyber security. Journal of Network and Systems Management 29 (2): 18. https://doi.org/10.1007/s10922-021-09591-y .

Alabdallah, A., and M. Awad. 2018. Using weighted Support Vector Machine to address the imbalanced classes problem of Intrusion Detection System. KSII Transactions on Internet and Information Systems 12 (10): 5143–5158. https://doi.org/10.3837/tiis.2018.10.027 .

Alazab, M., M. Alazab, A. Shalaginov, A. Mesleh, and A. Awajan. 2020. Intelligent mobile malware detection using permission requests and API calls. Future Generation Computer Systems—the International Journal of eScience 107: 509–521. https://doi.org/10.1016/j.future.2020.02.002 .

Albahar, M.A., R.A. Al-Falluji, and M. Binsawad. 2020. An empirical comparison on malicious activity detection using different neural network-based models. IEEE Access 8: 61549–61564. https://doi.org/10.1109/ACCESS.2020.2984157 .

AlEroud, A.F., and G. Karabatis. 2018. Queryable semantics to detect cyber-attacks: A flow-based detection approach. IEEE Transactions on Systems, Man, and Cybernetics: Systems 48 (2): 207–223. https://doi.org/10.1109/TSMC.2016.2600405 .

Algarni, A.M., V. Thayananthan, and Y.K. Malaiya. 2021. Quantitative assessment of cybersecurity risks for mitigating data breaches in business systems. Applied Sciences (switzerland) . https://doi.org/10.3390/app11083678 .

Alhowaide, A., I. Alsmadi, and J. Tang. 2021. Towards the design of real-time autonomous IoT NIDS. Cluster Computing—the Journal of Networks Software Tools and Applications . https://doi.org/10.1007/s10586-021-03231-5 .

Ali, S., and Y. Li. 2019. Learning multilevel auto-encoders for DDoS attack detection in smart grid network. IEEE Access 7: 108647–108659. https://doi.org/10.1109/ACCESS.2019.2933304 .

AlKadi, O., N. Moustafa, B. Turnbull, and K.K.R. Choo. 2019. Mixture localization-based outliers models for securing data migration in cloud centers. IEEE Access 7: 114607–114618. https://doi.org/10.1109/ACCESS.2019.2935142 .

Allianz. 2021. Allianz Risk Barometer. https://www.agcs.allianz.com/content/dam/onemarketing/agcs/agcs/reports/Allianz-Risk-Barometer-2021.pdf . Accessed 15 May 2021.

Almiani, M., A. AbuGhazleh, A. Al-Rahayfeh, S. Atiewi, and Razaque, A. 2020. Deep recurrent neural network for IoT intrusion detection system. Simulation Modelling Practice and Theory 101: 102031. https://doi.org/10.1016/j.simpat.2019.102031

Alsaedi, A., N. Moustafa, Z. Tari, A. Mahmood, and A. Anwar. 2020. TON_IoT telemetry dataset: A new generation dataset of IoT and IIoT for data-driven intrusion detection systems. IEEE Access 8: 165130–165150. https://doi.org/10.1109/access.2020.3022862 .

Alsamiri, J., and K. Alsubhi. 2019. Internet of Things cyber attacks detection using machine learning. International Journal of Advanced Computer Science and Applications 10 (12): 627–634.

Alsharafat, W. 2013. Applying artificial neural network and eXtended classifier system for network intrusion detection. International Arab Journal of Information Technology 10 (3): 230–238.

Google Scholar  

Amin, R.W., H.E. Sevil, S. Kocak, G. Francia III., and P. Hoover. 2021. The spatial analysis of the malicious uniform resource locators (URLs): 2016 dataset case study. Information (switzerland) 12 (1): 1–18. https://doi.org/10.3390/info12010002 .

Arcuri, M.C., L.Z. Gai, F. Ielasi, and E. Ventisette. 2020. Cyber attacks on hospitality sector: Stock market reaction. Journal of Hospitality and Tourism Technology 11 (2): 277–290. https://doi.org/10.1108/jhtt-05-2019-0080 .

Arp, D., M. Spreitzenbarth, M. Hubner, H. Gascon, K. Rieck, and C.E.R.T. Siemens. 2014. Drebin: Effective and explainable detection of android malware in your pocket. In Ndss 14: 23–26.

Ashtiani, M., and M.A. Azgomi. 2014. A distributed simulation framework for modeling cyber attacks and the evaluation of security measures. Simulation 90 (9): 1071–1102. https://doi.org/10.1177/0037549714540221 .

Atefinia, R., and M. Ahmadi. 2021. Network intrusion detection using multi-architectural modular deep neural network. Journal of Supercomputing 77 (4): 3571–3593. https://doi.org/10.1007/s11227-020-03410-y .

Avila, R., R. Khoury, R. Khoury, and F. Petrillo. 2021. Use of security logs for data leak detection: A systematic literature review. Security and Communication Networks 2021: 29. https://doi.org/10.1155/2021/6615899 .

Azeez, N.A., T.J. Ayemobola, S. Misra, R. Maskeliunas, and R. Damasevicius. 2019. Network Intrusion Detection with a Hashing Based Apriori Algorithm Using Hadoop MapReduce. Computers 8 (4): 15. https://doi.org/10.3390/computers8040086 .

Bakdash, J.Z., S. Hutchinson, E.G. Zaroukian, L.R. Marusich, S. Thirumuruganathan, C. Sample, B. Hoffman, and G. Das. 2018. Malware in the future forecasting of analyst detection of cyber events. Journal of Cybersecurity . https://doi.org/10.1093/cybsec/tyy007 .

Barletta, V.S., D. Caivano, A. Nannavecchia, and M. Scalera. 2020. Intrusion detection for in-vehicle communication networks: An unsupervised Kohonen SOM approach. Future Internet . https://doi.org/10.3390/FI12070119 .

Barzegar, M., and M. Shajari. 2018. Attack scenario reconstruction using intrusion semantics. Expert Systems with Applications 108: 119–133. https://doi.org/10.1016/j.eswa.2018.04.030 .

Bessy-Roland, Y., A. Boumezoued, and C. Hillairet. 2021. Multivariate Hawkes process for cyber insurance. Annals of Actuarial Science 15 (1): 14–39.

Bhardwaj, A., V. Mangat, and R. Vig. 2020. Hyperband tuned deep neural network with well posed stacked sparse AutoEncoder for detection of DDoS attacks in cloud. IEEE Access 8: 181916–181929. https://doi.org/10.1109/ACCESS.2020.3028690 .

Bhati, B.S., C.S. Rai, B. Balamurugan, and F. Al-Turjman. 2020. An intrusion detection scheme based on the ensemble of discriminant classifiers. Computers & Electrical Engineering 86: 9. https://doi.org/10.1016/j.compeleceng.2020.106742 .

Bhattacharya, S., S.S.R. Krishnan, P.K.R. Maddikunta, R. Kaluri, S. Singh, T.R. Gadekallu, M. Alazab, and U. Tariq. 2020. A novel PCA-firefly based XGBoost classification model for intrusion detection in networks using GPU. Electronics 9 (2): 16. https://doi.org/10.3390/electronics9020219 .

Bibi, I., A. Akhunzada, J. Malik, J. Iqbal, A. Musaddiq, and S. Kim. 2020. A dynamic DL-driven architecture to combat sophisticated android malware. IEEE Access 8: 129600–129612. https://doi.org/10.1109/ACCESS.2020.3009819 .

Biener, C., M. Eling, and J.H. Wirfs. 2015. Insurability of cyber risk: An empirical analysis. The   Geneva Papers on Risk and Insurance—Issues and Practice 40 (1): 131–158. https://doi.org/10.1057/gpp.2014.19 .

Binbusayyis, A., and T. Vaiyapuri. 2019. Identifying and benchmarking key features for cyber intrusion detection: An ensemble approach. IEEE Access 7: 106495–106513. https://doi.org/10.1109/ACCESS.2019.2929487 .

Biswas, R., and S. Roy. 2021. Botnet traffic identification using neural networks. Multimedia Tools and Applications . https://doi.org/10.1007/s11042-021-10765-8 .

Bouyeddou, B., F. Harrou, B. Kadri, and Y. Sun. 2021. Detecting network cyber-attacks using an integrated statistical approach. Cluster Computing—the Journal of Networks Software Tools and Applications 24 (2): 1435–1453. https://doi.org/10.1007/s10586-020-03203-1 .

Bozkir, A.S., and M. Aydos. 2020. LogoSENSE: A companion HOG based logo detection scheme for phishing web page and E-mail brand recognition. Computers & Security 95: 18. https://doi.org/10.1016/j.cose.2020.101855 .

Brower, D., and M. McCormick. 2021. Colonial pipeline resumes operations following ransomware attack. Financial Times .

Cai, H., F. Zhang, and A. Levi. 2019. An unsupervised method for detecting shilling attacks in recommender systems by mining item relationship and identifying target items. The Computer Journal 62 (4): 579–597. https://doi.org/10.1093/comjnl/bxy124 .

Cebula, J.J., M.E. Popeck, and L.R. Young. 2014. A Taxonomy of Operational Cyber Security Risks Version 2 .

Chadza, T., K.G. Kyriakopoulos, and S. Lambotharan. 2020. Learning to learn sequential network attacks using hidden Markov models. IEEE Access 8: 134480–134497. https://doi.org/10.1109/ACCESS.2020.3011293 .

Chatterjee, S., and S. Thekdi. 2020. An iterative learning and inference approach to managing dynamic cyber vulnerabilities of complex systems. Reliability Engineering and System Safety . https://doi.org/10.1016/j.ress.2019.106664 .

Chattopadhyay, M., R. Sen, and S. Gupta. 2018. A comprehensive review and meta-analysis on applications of machine learning techniques in intrusion detection. Australasian Journal of Information Systems 22: 27.

Chen, H.S., and J. Fiscus. 2018. The inhospitable vulnerability: A need for cybersecurity risk assessment in the hospitality industry. Journal of Hospitality and Tourism Technology 9 (2): 223–234. https://doi.org/10.1108/JHTT-07-2017-0044 .

Chhabra, G.S., V.P. Singh, and M. Singh. 2020. Cyber forensics framework for big data analytics in IoT environment using machine learning. Multimedia Tools and Applications 79 (23–24): 15881–15900. https://doi.org/10.1007/s11042-018-6338-1 .

Chiba, Z., N. Abghour, K. Moussaid, A. Elomri, and M. Rida. 2019. Intelligent approach to build a Deep Neural Network based IDS for cloud environment using combination of machine learning algorithms. Computers and Security 86: 291–317. https://doi.org/10.1016/j.cose.2019.06.013 .

Choras, M., and R. Kozik. 2015. Machine learning techniques applied to detect cyber attacks on web applications. Logic Journal of the IGPL 23 (1): 45–56. https://doi.org/10.1093/jigpal/jzu038 .

Chowdhury, S., M. Khanzadeh, R. Akula, F. Zhang, S. Zhang, H. Medal, M. Marufuzzaman, and L. Bian. 2017. Botnet detection using graph-based feature clustering. Journal of Big Data 4 (1): 14. https://doi.org/10.1186/s40537-017-0074-7 .

Cost Of A Cyber Incident: Systematic Review And Cross-Validation, Cybersecurity & Infrastructure Agency , 1, https://www.cisa.gov/sites/default/files/publications/CISA-OCE_Cost_of_Cyber_Incidents_Study-FINAL_508.pdf (2020).

D’Hooge, L., T. Wauters, B. Volckaert, and F. De Turck. 2019. Classification hardness for supervised learners on 20 years of intrusion detection data. IEEE Access 7: 167455–167469. https://doi.org/10.1109/access.2019.2953451 .

Damasevicius, R., A. Venckauskas, S. Grigaliunas, J. Toldinas, N. Morkevicius, T. Aleliunas, and P. Smuikys. 2020. LITNET-2020: An annotated real-world network flow dataset for network intrusion detection. Electronics 9 (5): 23. https://doi.org/10.3390/electronics9050800 .

De Giovanni, A.L.D., and M. Pirra. 2020. On the determinants of data breaches: A cointegration analysis. Decisions in Economics and Finance . https://doi.org/10.1007/s10203-020-00301-y .

Deng, L., D. Li, X. Yao, and H. Wang. 2019. Retracted Article: Mobile network intrusion detection for IoT system based on transfer learning algorithm. Cluster Computing 22 (4): 9889–9904. https://doi.org/10.1007/s10586-018-1847-2 .

Donkal, G., and G.K. Verma. 2018. A multimodal fusion based framework to reinforce IDS for securing Big Data environment using Spark. Journal of Information Security and Applications 43: 1–11. https://doi.org/10.1016/j.jisa.2018.10.001 .

Dunn, C., N. Moustafa, and B. Turnbull. 2020. Robustness evaluations of sustainable machine learning models against data Poisoning attacks in the Internet of Things. Sustainability 12 (16): 17. https://doi.org/10.3390/su12166434 .

Dwivedi, S., M. Vardhan, and S. Tripathi. 2021. Multi-parallel adaptive grasshopper optimization technique for detecting anonymous attacks in wireless networks. Wireless Personal Communications . https://doi.org/10.1007/s11277-021-08368-5 .

Dyson, B. 2020. COVID-19 crisis could be ‘watershed’ for cyber insurance, says Swiss Re exec. https://www.spglobal.com/marketintelligence/en/news-insights/latest-news-headlines/covid-19-crisis-could-be-watershed-for-cyber-insurance-says-swiss-re-exec-59197154 . Accessed 7 May 2020.

EIOPA. 2018. Understanding cyber insurance—a structured dialogue with insurance companies. https://www.eiopa.europa.eu/sites/default/files/publications/reports/eiopa_understanding_cyber_insurance.pdf . Accessed 28 May 2018

Elijah, A.V., A. Abdullah, N.Z. JhanJhi, M. Supramaniam, and O.B. Abdullateef. 2019. Ensemble and deep-learning methods for two-class and multi-attack anomaly intrusion detection: An empirical study. International Journal of Advanced Computer Science and Applications 10 (9): 520–528.

Eling, M., and K. Jung. 2018. Copula approaches for modeling cross-sectional dependence of data breach losses. Insurance Mathematics & Economics 82: 167–180. https://doi.org/10.1016/j.insmatheco.2018.07.003 .

Eling, M., and W. Schnell. 2016. What do we know about cyber risk and cyber risk insurance? Journal of Risk Finance 17 (5): 474–491. https://doi.org/10.1108/jrf-09-2016-0122 .

Eling, M., and J. Wirfs. 2019. What are the actual costs of cyber risk events? European Journal of Operational Research 272 (3): 1109–1119. https://doi.org/10.1016/j.ejor.2018.07.021 .

Eling, M. 2020. Cyber risk research in business and actuarial science. European Actuarial Journal 10 (2): 303–333.

Elmasry, W., A. Akbulut, and A.H. Zaim. 2019. Empirical study on multiclass classification-based network intrusion detection. Computational Intelligence 35 (4): 919–954. https://doi.org/10.1111/coin.12220 .

Elsaid, S.A., and N.S. Albatati. 2020. An optimized collaborative intrusion detection system for wireless sensor networks. Soft Computing 24 (16): 12553–12567. https://doi.org/10.1007/s00500-020-04695-0 .

Estepa, R., J.E. Díaz-Verdejo, A. Estepa, and G. Madinabeitia. 2020. How much training data is enough? A case study for HTTP anomaly-based intrusion detection. IEEE Access 8: 44410–44425. https://doi.org/10.1109/ACCESS.2020.2977591 .

European Council. 2021. Cybersecurity: how the EU tackles cyber threats. https://www.consilium.europa.eu/en/policies/cybersecurity/ . Accessed 10 May 2021

Falco, G. et al. 2019. Cyber risk research impeded by disciplinary barriers. Science (American Association for the Advancement of Science) 366 (6469): 1066–1069.

Fan, Z.J., Z.P. Tan, C.X. Tan, and X. Li. 2018. An improved integrated prediction method of cyber security situation based on spatial-time analysis. Journal of Internet Technology 19 (6): 1789–1800. https://doi.org/10.3966/160792642018111906015 .

Fang, Z.J., M.C. Xu, S.H. Xu, and T.Z. Hu. 2021. A framework for predicting data breach risk: Leveraging dependence to cope with sparsity. IEEE Transactions on Information Forensics and Security 16: 2186–2201. https://doi.org/10.1109/tifs.2021.3051804 .

Farkas, S., O. Lopez, and M. Thomas. 2021. Cyber claim analysis using Generalized Pareto regression trees with applications to insurance. Insurance: Mathematics and Economics 98: 92–105. https://doi.org/10.1016/j.insmatheco.2021.02.009 .

Farsi, H., A. Fanian, and Z. Taghiyarrenani. 2019. A novel online state-based anomaly detection system for process control networks. International Journal of Critical Infrastructure Protection 27: 11. https://doi.org/10.1016/j.ijcip.2019.100323 .

Ferrag, M.A., L. Maglaras, S. Moschoyiannis, and H. Janicke. 2020. Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study. Journal of Information Security and Applications 50: 19. https://doi.org/10.1016/j.jisa.2019.102419 .

Field, M. 2018. WannaCry cyber attack cost the NHS £92m as 19,000 appointments cancelled. https://www.telegraph.co.uk/technology/2018/10/11/wannacry-cyber-attack-cost-nhs-92m-19000-appointments-cancelled/ . Accessed 9 May 2018.

FitchRatings. 2021. U.S. Cyber Insurance Market Update (Spike in Claims Leads to Decline in 2020 Underwriting Performance). https://www.fitchratings.com/research/insurance/us-cyber-insurance-market-update-spike-in-claims-leads-to-decline-in-2020-underwriting-performance-26-05-2021 .

Fossaceca, J.M., T.A. Mazzuchi, and S. Sarkani. 2015. MARK-ELM: Application of a novel Multiple Kernel Learning framework for improving the robustness of network intrusion detection. Expert Systems with Applications 42 (8): 4062–4080. https://doi.org/10.1016/j.eswa.2014.12.040 .

Franke, U., and J. Brynielsson. 2014. Cyber situational awareness–a systematic review of the literature. Computers & security 46: 18–31.

Freeha, K., K.J. Hwan, M. Lars, and M. Robin. 2021. Data breach management: An integrated risk model. Information & Management 58 (1): 103392. https://doi.org/10.1016/j.im.2020.103392 .

Ganeshan, R., and P. Rodrigues. 2020. Crow-AFL: Crow based adaptive fractional lion optimization approach for the intrusion detection. Wireless Personal Communications 111 (4): 2065–2089. https://doi.org/10.1007/s11277-019-06972-0 .

GAO. 2021. CYBER INSURANCE—Insurers and policyholders face challenges in an evolving market. https://www.gao.gov/assets/gao-21-477.pdf . Accessed 16 May 2021.

Garber, J. 2021. Colonial Pipeline fiasco foreshadows impact of Biden energy policy. https://www.foxbusiness.com/markets/colonial-pipeline-fiasco-foreshadows-impact-of-biden-energy-policy . Accessed 4 May 2021.

Gauthama Raman, M.R., N. Somu, S. Jagarapu, T. Manghnani, T. Selvam, K. Krithivasan, and V.S. Shankar Sriram. 2020. An efficient intrusion detection technique based on support vector machine and improved binary gravitational search algorithm. Artificial Intelligence Review 53 (5): 3255–3286. https://doi.org/10.1007/s10462-019-09762-z .

Gavel, S., A.S. Raghuvanshi, and S. Tiwari. 2021. Distributed intrusion detection scheme using dual-axis dimensionality reduction for Internet of things (IoT). Journal of Supercomputing . https://doi.org/10.1007/s11227-021-03697-5 .

GDPR.EU. 2021. FAQ. https://gdpr.eu/faq/ . Accessed 10 May 2021.

Georgescu, T.M., B. Iancu, and M. Zurini. 2019. Named-entity-recognition-based automated system for diagnosing cybersecurity situations in IoT networks. Sensors (switzerland) . https://doi.org/10.3390/s19153380 .

Giudici, P., and E. Raffinetti. 2020. Cyber risk ordering with rank-based statistical models. AStA Advances in Statistical Analysis . https://doi.org/10.1007/s10182-020-00387-0 .

Goh, J., S. Adepu, K.N. Junejo, and A. Mathur. 2016. A dataset to support research in the design of secure water treatment systems. In CRITIS.

Gong, X.Y., J.L. Lu, Y.F. Zhou, H. Qiu, and R. He. 2021. Model uncertainty based annotation error fixing for web attack detection. Journal of Signal Processing Systems for Signal Image and Video Technology 93 (2–3): 187–199. https://doi.org/10.1007/s11265-019-01494-1 .

Goode, S., H. Hoehle, V. Venkatesh, and S.A. Brown. 2017. USER compensation as a data breach recovery action: An investigation of the sony playstation network breach. MIS Quarterly 41 (3): 703–727.

Guo, H., S. Huang, C. Huang, Z. Pan, M. Zhang, and F. Shi. 2020. File entropy signal analysis combined with wavelet decomposition for malware classification. IEEE Access 8: 158961–158971. https://doi.org/10.1109/ACCESS.2020.3020330 .

Habib, M., I. Aljarah, and H. Faris. 2020. A Modified multi-objective particle swarm optimizer-based Lévy flight: An approach toward intrusion detection in Internet of Things. Arabian Journal for Science and Engineering 45 (8): 6081–6108. https://doi.org/10.1007/s13369-020-04476-9 .

Hajj, S., R. El Sibai, J.B. Abdo, J. Demerjian, A. Makhoul, and C. Guyeux. 2021. Anomaly-based intrusion detection systems: The requirements, methods, measurements, and datasets. Transactions on Emerging Telecommunications Technologies 32 (4): 36. https://doi.org/10.1002/ett.4240 .

Heartfield, R., G. Loukas, A. Bezemskij, and E. Panaousis. 2021. Self-configurable cyber-physical intrusion detection for smart homes using reinforcement learning. IEEE Transactions on Information Forensics and Security 16: 1720–1735. https://doi.org/10.1109/tifs.2020.3042049 .

Hemo, B., T. Gafni, K. Cohen, and Q. Zhao. 2020. Searching for anomalies over composite hypotheses. IEEE Transactions on Signal Processing 68: 1181–1196. https://doi.org/10.1109/TSP.2020.2971438

Hindy, H., D. Brosset, E. Bayne, A.K. Seeam, C. Tachtatzis, R. Atkinson, and X. Bellekens. 2020. A taxonomy of network threats and the effect of current datasets on intrusion detection systems. IEEE Access 8: 104650–104675. https://doi.org/10.1109/ACCESS.2020.3000179 .

Hong, W., D. Huang, C. Chen, and J. Lee. 2020. Towards accurate and efficient classification of power system contingencies and cyber-attacks using recurrent neural networks. IEEE Access 8: 123297–123309. https://doi.org/10.1109/ACCESS.2020.3007609 .

Husák, M., M. Zádník, V. Bartos, and P. Sokol. 2020. Dataset of intrusion detection alerts from a sharing platform. Data in Brief 33: 106530.

IBM Security. 2020. Cost of a Data breach Report. https://www.capita.com/sites/g/files/nginej291/files/2020-08/Ponemon-Global-Cost-of-Data-Breach-Study-2020.pdf . Accessed 19 May 2021.

IEEE. 2021. IEEE Quick Facts. https://www.ieee.org/about/at-a-glance.html . Accessed 11 May 2021.

Kilincer, I.F., F. Ertam, and S. Abdulkadir. 2021. Machine learning methods for cyber security intrusion detection: Datasets and comparative study. Computer Networks 188: 107840. https://doi.org/10.1016/j.comnet.2021.107840 .

Jaber, A.N., and S. Ul Rehman. 2020. FCM-SVM based intrusion detection system for cloud computing environment. Cluster Computing—the Journal of Networks Software Tools and Applications 23 (4): 3221–3231. https://doi.org/10.1007/s10586-020-03082-6 .

Jacobs, J., S. Romanosky, B. Edwards, M. Roytman, and I. Adjerid. 2019. Exploit prediction scoring system (epss). arXiv:1908.04856

Jacobsen, A. et al. 2020. FAIR principles: Interpretations and implementation considerations. Data Intelligence 2 (1–2): 10–29. https://doi.org/10.1162/dint_r_00024 .

Jahromi, A.N., S. Hashemi, A. Dehghantanha, R.M. Parizi, and K.K.R. Choo. 2020. An enhanced stacked LSTM method with no random initialization for malware threat hunting in safety and time-critical systems. IEEE Transactions on Emerging Topics in Computational Intelligence 4 (5): 630–640. https://doi.org/10.1109/TETCI.2019.2910243 .

Jang, S., S. Li, and Y. Sung. 2020. FastText-based local feature visualization algorithm for merged image-based malware classification framework for cyber security and cyber defense. Mathematics 8 (3): 13. https://doi.org/10.3390/math8030460 .

Javeed, D., T.H. Gao, and M.T. Khan. 2021. SDN-enabled hybrid DL-driven framework for the detection of emerging cyber threats in IoT. Electronics 10 (8): 16. https://doi.org/10.3390/electronics10080918 .

Johnson, P., D. Gorton, R. Lagerstrom, and M. Ekstedt. 2016. Time between vulnerability disclosures: A measure of software product vulnerability. Computers & Security 62: 278–295. https://doi.org/10.1016/j.cose.2016.08.004 .

Johnson, P., R. Lagerström, M. Ekstedt, and U. Franke. 2018. Can the common vulnerability scoring system be trusted? A Bayesian analysis. IEEE Transactions on Dependable and Secure Computing 15 (6): 1002–1015. https://doi.org/10.1109/TDSC.2016.2644614 .

Junger, M., V. Wang, and M. Schlömer. 2020. Fraud against businesses both online and offline: Crime scripts, business characteristics, efforts, and benefits. Crime Science 9 (1): 13. https://doi.org/10.1186/s40163-020-00119-4 .

Kalutarage, H.K., H.N. Nguyen, and S.A. Shaikh. 2017. Towards a threat assessment framework for apps collusion. Telecommunication Systems 66 (3): 417–430. https://doi.org/10.1007/s11235-017-0296-1 .

Kamarudin, M.H., C. Maple, T. Watson, and N.S. Safa. 2017. A LogitBoost-based algorithm for detecting known and unknown web attacks. IEEE Access 5: 26190–26200. https://doi.org/10.1109/ACCESS.2017.2766844 .

Kasongo, S.M., and Y.X. Sun. 2020. A deep learning method with wrapper based feature extraction for wireless intrusion detection system. Computers & Security 92: 15. https://doi.org/10.1016/j.cose.2020.101752 .

Keserwani, P.K., M.C. Govil, E.S. Pilli, and P. Govil. 2021. A smart anomaly-based intrusion detection system for the Internet of Things (IoT) network using GWO–PSO–RF model. Journal of Reliable Intelligent Environments 7 (1): 3–21. https://doi.org/10.1007/s40860-020-00126-x .

Keshk, M., E. Sitnikova, N. Moustafa, J. Hu, and I. Khalil. 2021. An integrated framework for privacy-preserving based anomaly detection for cyber-physical systems. IEEE Transactions on Sustainable Computing 6 (1): 66–79. https://doi.org/10.1109/TSUSC.2019.2906657 .

Khan, I.A., D.C. Pi, A.K. Bhatia, N. Khan, W. Haider, and A. Wahab. 2020. Generating realistic IoT-based IDS dataset centred on fuzzy qualitative modelling for cyber-physical systems. Electronics Letters 56 (9): 441–443. https://doi.org/10.1049/el.2019.4158 .

Khraisat, A., I. Gondal, P. Vamplew, J. Kamruzzaman, and A. Alazab. 2020. Hybrid intrusion detection system based on the stacking ensemble of C5 decision tree classifier and one class support vector machine. Electronics 9 (1): 18. https://doi.org/10.3390/electronics9010173 .

Khraisat, A., I. Gondal, P. Vamplew, and J. Kamruzzaman. 2019. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2 (1): 20. https://doi.org/10.1186/s42400-019-0038-7 .

Kilincer, I.F., F. Ertam, and A. Sengur. 2021. Machine learning methods for cyber security intrusion detection: Datasets and comparative study. Computer Networks 188: 16. https://doi.org/10.1016/j.comnet.2021.107840 .

Kim, D., and H.K. Kim. 2019. Automated dataset generation system for collaborative research of cyber threat analysis. Security and Communication Networks 2019: 10. https://doi.org/10.1155/2019/6268476 .

Kim, G., C. Lee, J. Jo, and H. Lim. 2020. Automatic extraction of named entities of cyber threats using a deep Bi-LSTM-CRF network. International Journal of Machine Learning and Cybernetics 11 (10): 2341–2355. https://doi.org/10.1007/s13042-020-01122-6 .

Kirubavathi, G., and R. Anitha. 2016. Botnet detection via mining of traffic flow characteristics. Computers & Electrical Engineering 50: 91–101. https://doi.org/10.1016/j.compeleceng.2016.01.012 .

Kiwia, D., A. Dehghantanha, K.K.R. Choo, and J. Slaughter. 2018. A cyber kill chain based taxonomy of banking Trojans for evolutionary computational intelligence. Journal of Computational Science 27: 394–409. https://doi.org/10.1016/j.jocs.2017.10.020 .

Koroniotis, N., N. Moustafa, and E. Sitnikova. 2020. A new network forensic framework based on deep learning for Internet of Things networks: A particle deep framework. Future Generation Computer Systems 110: 91–106. https://doi.org/10.1016/j.future.2020.03.042 .

Kruse, C.S., B. Frederick, T. Jacobson, and D. Kyle Monticone. 2017. Cybersecurity in healthcare: A systematic review of modern threats and trends. Technology and Health Care 25 (1): 1–10.

Kshetri, N. 2018. The economics of cyber-insurance. IT Professional 20 (6): 9–14. https://doi.org/10.1109/MITP.2018.2874210 .

Kumar, R., P. Kumar, R. Tripathi, G.P. Gupta, T.R. Gadekallu, and G. Srivastava. 2021. SP2F: A secured privacy-preserving framework for smart agricultural Unmanned Aerial Vehicles. Computer Networks . https://doi.org/10.1016/j.comnet.2021.107819 .

Kumar, R., and R. Tripathi. 2021. DBTP2SF: A deep blockchain-based trustworthy privacy-preserving secured framework in industrial internet of things systems. Transactions on Emerging Telecommunications Technologies 32 (4): 27. https://doi.org/10.1002/ett.4222 .

Laso, P.M., D. Brosset, and J. Puentes. 2017. Dataset of anomalies and malicious acts in a cyber-physical subsystem. Data in Brief 14: 186–191. https://doi.org/10.1016/j.dib.2017.07.038 .

Lee, J., J. Kim, I. Kim, and K. Han. 2019. Cyber threat detection based on artificial neural networks using event profiles. IEEE Access 7: 165607–165626. https://doi.org/10.1109/ACCESS.2019.2953095 .

Lee, S.J., P.D. Yoo, A.T. Asyhari, Y. Jhi, L. Chermak, C.Y. Yeun, and K. Taha. 2020. IMPACT: Impersonation attack detection via edge computing using deep Autoencoder and feature abstraction. IEEE Access 8: 65520–65529. https://doi.org/10.1109/ACCESS.2020.2985089 .

Leong, Y.-Y., and Y.-C. Chen. 2020. Cyber risk cost and management in IoT devices-linked health insurance. The Geneva Papers on Risk and Insurance—Issues and Practice 45 (4): 737–759. https://doi.org/10.1057/s41288-020-00169-4 .

Levi, M. 2017. Assessing the trends, scale and nature of economic cybercrimes: overview and Issues: In Cybercrimes, cybercriminals and their policing, in crime, law and social change. Crime, Law and Social Change 67 (1): 3–20. https://doi.org/10.1007/s10611-016-9645-3 .

Li, C., K. Mills, D. Niu, R. Zhu, H. Zhang, and H. Kinawi. 2019a. Android malware detection based on factorization machine. IEEE Access 7: 184008–184019. https://doi.org/10.1109/ACCESS.2019.2958927 .

Li, D.Q., and Q.M. Li. 2020. Adversarial deep ensemble: evasion attacks and defenses for malware detection. IEEE Transactions on Information Forensics and Security 15: 3886–3900. https://doi.org/10.1109/tifs.2020.3003571 .

Li, D.Q., Q.M. Li, Y.F. Ye, and S.H. Xu. 2021. A framework for enhancing deep neural networks against adversarial malware. IEEE Transactions on Network Science and Engineering 8 (1): 736–750. https://doi.org/10.1109/tnse.2021.3051354 .

Li, R.H., C. Zhang, C. Feng, X. Zhang, and C.J. Tang. 2019b. Locating vulnerability in binaries using deep neural networks. IEEE Access 7: 134660–134676. https://doi.org/10.1109/access.2019.2942043 .

Li, X., M. Xu, P. Vijayakumar, N. Kumar, and X. Liu. 2020. Detection of low-frequency and multi-stage attacks in industrial Internet of Things. IEEE Transactions on Vehicular Technology 69 (8): 8820–8831. https://doi.org/10.1109/TVT.2020.2995133 .

Liu, H.Y., and B. Lang. 2019. Machine learning and deep learning methods for intrusion detection systems: A survey. Applied Sciences—Basel 9 (20): 28. https://doi.org/10.3390/app9204396 .

Lopez-Martin, M., B. Carro, and A. Sanchez-Esguevillas. 2020. Application of deep reinforcement learning to intrusion detection for supervised problems. Expert Systems with Applications . https://doi.org/10.1016/j.eswa.2019.112963 .

Loukas, G., D. Gan, and Tuan Vuong. 2013. A review of cyber threats and defence approaches in emergency management. Future Internet 5: 205–236.

Luo, C.C., S. Su, Y.B. Sun, Q.J. Tan, M. Han, and Z.H. Tian. 2020. A convolution-based system for malicious URLs detection. CMC—Computers Materials Continua 62 (1): 399–411.

Mahbooba, B., M. Timilsina, R. Sahal, and M. Serrano. 2021. Explainable artificial intelligence (XAI) to enhance trust management in intrusion detection systems using decision tree model. Complexity 2021: 11. https://doi.org/10.1155/2021/6634811 .

Mahdavifar, S., and A.A. Ghorbani. 2020. DeNNeS: Deep embedded neural network expert system for detecting cyber attacks. Neural Computing & Applications 32 (18): 14753–14780. https://doi.org/10.1007/s00521-020-04830-w .

Mahfouz, A., A. Abuhussein, D. Venugopal, and S. Shiva. 2020. Ensemble classifiers for network intrusion detection using a novel network attack dataset. Future Internet 12 (11): 1–19. https://doi.org/10.3390/fi12110180 .

Maleks Smith, Z., E. Lostri, and J.A. Lewis. 2020. The hidden costs of cybercrime. https://www.mcafee.com/enterprise/en-us/assets/reports/rp-hidden-costs-of-cybercrime.pdf . Accessed 16 May 2021.

Malik, J., A. Akhunzada, I. Bibi, M. Imran, A. Musaddiq, and S.W. Kim. 2020. Hybrid deep learning: An efficient reconnaissance and surveillance detection mechanism in SDN. IEEE Access 8: 134695–134706. https://doi.org/10.1109/ACCESS.2020.3009849 .

Manimurugan, S. 2020. IoT-Fog-Cloud model for anomaly detection using improved Naive Bayes and principal component analysis. Journal of Ambient Intelligence and Humanized Computing . https://doi.org/10.1007/s12652-020-02723-3 .

Martin, A., R. Lara-Cabrera, and D. Camacho. 2019. Android malware detection through hybrid features fusion and ensemble classifiers: The AndroPyTool framework and the OmniDroid dataset. Information Fusion 52: 128–142. https://doi.org/10.1016/j.inffus.2018.12.006 .

Mauro, M.D., G. Galatro, and A. Liotta. 2020. Experimental review of neural-based approaches for network intrusion management. IEEE Transactions on Network and Service Management 17 (4): 2480–2495. https://doi.org/10.1109/TNSM.2020.3024225 .

McLeod, A., and D. Dolezel. 2018. Cyber-analytics: Modeling factors associated with healthcare data breaches. Decision Support Systems 108: 57–68. https://doi.org/10.1016/j.dss.2018.02.007 .

Meira, J., R. Andrade, I. Praca, J. Carneiro, V. Bolon-Canedo, A. Alonso-Betanzos, and G. Marreiros. 2020. Performance evaluation of unsupervised techniques in cyber-attack anomaly detection. Journal of Ambient Intelligence and Humanized Computing 11 (11): 4477–4489. https://doi.org/10.1007/s12652-019-01417-9 .

Miao, Y., J. Ma, X. Liu, J. Weng, H. Li, and H. Li. 2019. Lightweight fine-grained search over encrypted data in Fog computing. IEEE Transactions on Services Computing 12 (5): 772–785. https://doi.org/10.1109/TSC.2018.2823309 .

Miller, C., and C. Valasek. 2015. Remote exploitation of an unaltered passenger vehicle. Black Hat USA 2015 (S 91).

Mireles, J.D., E. Ficke, J.H. Cho, P. Hurley, and S.H. Xu. 2019. Metrics towards measuring cyber agility. IEEE Transactions on Information Forensics and Security 14 (12): 3217–3232. https://doi.org/10.1109/tifs.2019.2912551 .

Mishra, N., and S. Pandya. 2021. Internet of Things applications, security challenges, attacks, intrusion detection, and future visions: A systematic review. IEEE Access . https://doi.org/10.1109/ACCESS.2021.3073408 .

Monshizadeh, M., V. Khatri, B.G. Atli, R. Kantola, and Z. Yan. 2019. Performance evaluation of a combined anomaly detection platform. IEEE Access 7: 100964–100978. https://doi.org/10.1109/ACCESS.2019.2930832 .

Moreno, V.C., G. Reniers, E. Salzano, and V. Cozzani. 2018. Analysis of physical and cyber security-related events in the chemical and process industry. Process Safety and Environmental Protection 116: 621–631. https://doi.org/10.1016/j.psep.2018.03.026 .

Moro, E.D. 2020. Towards an economic cyber loss index for parametric cover based on IT security indicator: A preliminary analysis. Risks . https://doi.org/10.3390/risks8020045 .

Moustafa, N., E. Adi, B. Turnbull, and J. Hu. 2018. A new threat intelligence scheme for safeguarding industry 4.0 systems. IEEE Access 6: 32910–32924. https://doi.org/10.1109/ACCESS.2018.2844794 .

Moustakidis, S., and P. Karlsson. 2020. A novel feature extraction methodology using Siamese convolutional neural networks for intrusion detection. Cybersecurity . https://doi.org/10.1186/s42400-020-00056-4 .

Mukhopadhyay, A., S. Chatterjee, K.K. Bagchi, P.J. Kirs, and G.K. Shukla. 2019. Cyber Risk Assessment and Mitigation (CRAM) framework using Logit and Probit models for cyber insurance. Information Systems Frontiers 21 (5): 997–1018. https://doi.org/10.1007/s10796-017-9808-5 .

Murphey, H. 2021a. Biden signs executive order to strengthen US cyber security. https://www.ft.com/content/4d808359-b504-4014-85f6-68e7a2851bf1?accessToken=zwAAAXl0_ifgkc9NgINZtQRAFNOF9mjnooUb8Q.MEYCIQDw46SFWsMn1iyuz3kvgAmn6mxc0rIVfw10Lg1ovJSfJwIhAK2X2URzfSqHwIS7ddRCvSt2nGC2DcdoiDTG49-4TeEt&sharetype=gift?token=fbcd6323-1ecf-4fc3-b136-b5b0dd6a8756 . Accessed 7 May 2021.

Murphey, H. 2021b. Millions of connected devices have security flaws, study shows. https://www.ft.com/content/0bf92003-926d-4dee-87d7-b01f7c3e9621?accessToken=zwAAAXnA7f2Ikc8L-SADkm1N7tOH17AffD6WIQ.MEQCIDjBuROvhmYV0Mx3iB0cEV7m5oND1uaCICxJu0mzxM0PAiBam98q9zfHiTB6hKGr1gGl0Azt85yazdpX9K5sI8se3Q&sharetype=gift?token=2538218d-77d9-4dd3-9649-3cb556a34e51 . Accessed 6 May 2021.

Murugesan, V., M. Shalinie, and M.H. Yang. 2018. Design and analysis of hybrid single packet IP traceback scheme. IET Networks 7 (3): 141–151. https://doi.org/10.1049/iet-net.2017.0115 .

Mwitondi, K.S., and S.A. Zargari. 2018. An iterative multiple sampling method for intrusion detection. Information Security Journal 27 (4): 230–239. https://doi.org/10.1080/19393555.2018.1539790 .

Neto, N.N., S. Madnick, A.M.G. De Paula, and N.M. Borges. 2021. Developing a global data breach database and the challenges encountered. ACM Journal of Data and Information Quality 13 (1): 33. https://doi.org/10.1145/3439873 .

Nurse, J.R.C., L. Axon, A. Erola, I. Agrafiotis, M. Goldsmith, and S. Creese. 2020. The data that drives cyber insurance: A study into the underwriting and claims processes. In 2020 International conference on cyber situational awareness, data analytics and assessment (CyberSA), 15–19 June 2020.

Oliveira, N., I. Praca, E. Maia, and O. Sousa. 2021. Intelligent cyber attack detection and classification for network-based intrusion detection systems. Applied Sciences—Basel 11 (4): 21. https://doi.org/10.3390/app11041674 .

Page, M.J. et al. 2021. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Systematic Reviews 10 (1): 89. https://doi.org/10.1186/s13643-021-01626-4 .

Pajouh, H.H., R. Javidan, R. Khayami, A. Dehghantanha, and K.R. Choo. 2019. A two-layer dimension reduction and two-tier classification model for anomaly-based intrusion detection in IoT backbone networks. IEEE Transactions on Emerging Topics in Computing 7 (2): 314–323. https://doi.org/10.1109/TETC.2016.2633228 .

Parra, G.D., P. Rad, K.K.R. Choo, and N. Beebe. 2020. Detecting Internet of Things attacks using distributed deep learning. Journal of Network and Computer Applications 163: 13. https://doi.org/10.1016/j.jnca.2020.102662 .

Paté-Cornell, M.E., M. Kuypers, M. Smith, and P. Keller. 2018. Cyber risk management for critical infrastructure: A risk analysis model and three case studies. Risk Analysis 38 (2): 226–241. https://doi.org/10.1111/risa.12844 .

Pooser, D.M., M.J. Browne, and O. Arkhangelska. 2018. Growth in the perception of cyber risk: evidence from U.S. P&C Insurers. The Geneva Papers on Risk and Insurance—Issues and Practice 43 (2): 208–223. https://doi.org/10.1057/s41288-017-0077-9 .

Pu, G., L. Wang, J. Shen, and F. Dong. 2021. A hybrid unsupervised clustering-based anomaly detection method. Tsinghua Science and Technology 26 (2): 146–153. https://doi.org/10.26599/TST.2019.9010051 .

Qiu, J., W. Luo, L. Pan, Y. Tai, J. Zhang, and Y. Xiang. 2019. Predicting the impact of android malicious samples via machine learning. IEEE Access 7: 66304–66316. https://doi.org/10.1109/ACCESS.2019.2914311 .

Qu, X., L. Yang, K. Guo, M. Sun, L. Ma, T. Feng, S. Ren, K. Li, and X. Ma. 2020. Direct batch growth hierarchical self-organizing mapping based on statistics for efficient network intrusion detection. IEEE Access 8: 42251–42260. https://doi.org/10.1109/ACCESS.2020.2976810 .

Rahman, Md.S., S. Halder, Md. Ashraf Uddin, and U.K. Acharjee. 2021. An efficient hybrid system for anomaly detection in social networks. Cybersecurity 4 (1): 10. https://doi.org/10.1186/s42400-021-00074-w .

Ramaiah, M., V. Chandrasekaran, V. Ravi, and N. Kumar. 2021. An intrusion detection system using optimized deep neural network architecture. Transactions on Emerging Telecommunications Technologies 32 (4): 17. https://doi.org/10.1002/ett.4221 .

Raman, M.R.G., K. Kannan, S.K. Pal, and V.S.S. Sriram. 2016. Rough set-hypergraph-based feature selection approach for intrusion detection systems. Defence Science Journal 66 (6): 612–617. https://doi.org/10.14429/dsj.66.10802 .

Rathore, S., J.H. Park. 2018. Semi-supervised learning based distributed attack detection framework for IoT. Applied Soft Computing 72: 79–89. https://doi.org/10.1016/j.asoc.2018.05.049 .

Romanosky, S., L. Ablon, A. Kuehn, and T. Jones. 2019. Content analysis of cyber insurance policies: How do carriers price cyber risk? Journal of Cybersecurity (oxford) 5 (1): tyz002.

Sarabi, A., P. Naghizadeh, Y. Liu, and M. Liu. 2016. Risky business: Fine-grained data breach prediction using business profiles. Journal of Cybersecurity 2 (1): 15–28. https://doi.org/10.1093/cybsec/tyw004 .

Sardi, Alberto, Alessandro Rizzi, Enrico Sorano, and Anna Guerrieri. 2021. Cyber risk in health facilities: A systematic literature review. Sustainability 12 (17): 7002.

Sarker, Iqbal H., A.S.M. Kayes, Shahriar Badsha, Hamed Alqahtani, Paul Watters, and Alex Ng. 2020. Cybersecurity data science: An overview from machine learning perspective. Journal of Big Data 7 (1): 41. https://doi.org/10.1186/s40537-020-00318-5 .

Scopus. 2021. Factsheet. https://www.elsevier.com/__data/assets/pdf_file/0017/114533/Scopus_GlobalResearch_Factsheet2019_FINAL_WEB.pdf . Accessed 11 May 2021.

Sentuna, A., A. Alsadoon, P.W.C. Prasad, M. Saadeh, and O.H. Alsadoon. 2021. A novel Enhanced Naïve Bayes Posterior Probability (ENBPP) using machine learning: Cyber threat analysis. Neural Processing Letters 53 (1): 177–209. https://doi.org/10.1007/s11063-020-10381-x .

Shaukat, K., S.H. Luo, V. Varadharajan, I.A. Hameed, S. Chen, D.X. Liu, and J.M. Li. 2020. Performance comparison and current challenges of using machine learning techniques in cybersecurity. Energies 13 (10): 27. https://doi.org/10.3390/en13102509 .

Sheehan, B., F. Murphy, M. Mullins, and C. Ryan. 2019. Connected and autonomous vehicles: A cyber-risk classification framework. Transportation Research Part a: Policy and Practice 124: 523–536. https://doi.org/10.1016/j.tra.2018.06.033 .

Sheehan, B., F. Murphy, A.N. Kia, and R. Kiely. 2021. A quantitative bow-tie cyber risk classification and assessment framework. Journal of Risk Research 24 (12): 1619–1638.

Shlomo, A., M. Kalech, and R. Moskovitch. 2021. Temporal pattern-based malicious activity detection in SCADA systems. Computers & Security 102: 17. https://doi.org/10.1016/j.cose.2020.102153 .

Singh, K.J., and T. De. 2020. Efficient classification of DDoS attacks using an ensemble feature selection algorithm. Journal of Intelligent Systems 29 (1): 71–83. https://doi.org/10.1515/jisys-2017-0472 .

Skrjanc, I., S. Ozawa, T. Ban, and D. Dovzan. 2018. Large-scale cyber attacks monitoring using Evolving Cauchy Possibilistic Clustering. Applied Soft Computing 62: 592–601. https://doi.org/10.1016/j.asoc.2017.11.008 .

Smart, W. 2018. Lessons learned review of the WannaCry Ransomware Cyber Attack. https://www.england.nhs.uk/wp-content/uploads/2018/02/lessons-learned-review-wannacry-ransomware-cyber-attack-cio-review.pdf . Accessed 7 May 2021.

Sornette, D., T. Maillart, and W. Kröger. 2013. Exploring the limits of safety analysis in complex technological systems. International Journal of Disaster Risk Reduction 6: 59–66. https://doi.org/10.1016/j.ijdrr.2013.04.002 .

Sovacool, B.K. 2008. The costs of failure: A preliminary assessment of major energy accidents, 1907–2007. Energy Policy 36 (5): 1802–1820. https://doi.org/10.1016/j.enpol.2008.01.040 .

SpringerLink. 2021. Journal Search. https://rd.springer.com/search?facet-content-type=%22Journal%22 . Accessed 11 May 2021.

Stojanovic, B., K. Hofer-Schmitz, and U. Kleb. 2020. APT datasets and attack modeling for automated detection methods: A review. Computers & Security 92: 19. https://doi.org/10.1016/j.cose.2020.101734 .

Subroto, A., and A. Apriyana. 2019. Cyber risk prediction through social media big data analytics and statistical machine learning. Journal of Big Data . https://doi.org/10.1186/s40537-019-0216-1 .

Tan, Z., A. Jamdagni, X. He, P. Nanda, R.P. Liu, and J. Hu. 2015. Detection of denial-of-service attacks based on computer vision techniques. IEEE Transactions on Computers 64 (9): 2519–2533. https://doi.org/10.1109/TC.2014.2375218 .

Tidy, J. 2021. Irish cyber-attack: Hackers bail out Irish health service for free. https://www.bbc.com/news/world-europe-57197688 . Accessed 6 May 2021.

Tuncer, T., F. Ertam, and S. Dogan. 2020. Automated malware recognition method based on local neighborhood binary pattern. Multimedia Tools and Applications 79 (37–38): 27815–27832. https://doi.org/10.1007/s11042-020-09376-6 .

Uhm, Y., and W. Pak. 2021. Service-aware two-level partitioning for machine learning-based network intrusion detection with high performance and high scalability. IEEE Access 9: 6608–6622. https://doi.org/10.1109/ACCESS.2020.3048900 .

Ulven, J.B., and G. Wangen. 2021. A systematic review of cybersecurity risks in higher education. Future Internet 13 (2): 1–40. https://doi.org/10.3390/fi13020039 .

Vaccari, I., G. Chiola, M. Aiello, M. Mongelli, and E. Cambiaso. 2020. MQTTset, a new dataset for machine learning techniques on MQTT. Sensors 20 (22): 17. https://doi.org/10.3390/s20226578 .

Valeriano, B., and R.C. Maness. 2014. The dynamics of cyber conflict between rival antagonists, 2001–11. Journal of Peace Research 51 (3): 347–360. https://doi.org/10.1177/0022343313518940 .

Varghese, J.E., and B. Muniyal. 2021. An Efficient IDS framework for DDoS attacks in SDN environment. IEEE Access 9: 69680–69699. https://doi.org/10.1109/ACCESS.2021.3078065 .

Varsha, M. V., P. Vinod, K.A. Dhanya. 2017 Identification of malicious android app using manifest and opcode features. Journal of Computer Virology and Hacking Techniques 13 (2): 125–138. https://doi.org/10.1007/s11416-016-0277-z

Velliangiri, S., and H.M. Pandey. 2020. Fuzzy-Taylor-elephant herd optimization inspired Deep Belief Network for DDoS attack detection and comparison with state-of-the-arts algorithms. Future Generation Computer Systems—the International Journal of Escience 110: 80–90. https://doi.org/10.1016/j.future.2020.03.049 .

Verma, A., and V. Ranga. 2020. Machine learning based intrusion detection systems for IoT applications. Wireless Personal Communications 111 (4): 2287–2310. https://doi.org/10.1007/s11277-019-06986-8 .

Vidros, S., C. Kolias, G. Kambourakis, and L. Akoglu. 2017. Automatic detection of online recruitment frauds: Characteristics, methods, and a public dataset. Future Internet 9 (1): 19. https://doi.org/10.3390/fi9010006 .

Vinayakumar, R., M. Alazab, K.P. Soman, P. Poornachandran, A. Al-Nemrat, and S. Venkatraman. 2019. Deep learning approach for intelligent intrusion detection system. IEEE Access 7: 41525–41550. https://doi.org/10.1109/access.2019.2895334 .

Walker-Roberts, S., M. Hammoudeh, O. Aldabbas, M. Aydin, and A. Dehghantanha. 2020. Threats on the horizon: Understanding security threats in the era of cyber-physical systems. Journal of Supercomputing 76 (4): 2643–2664. https://doi.org/10.1007/s11227-019-03028-9 .

Web of Science. 2021. Web of Science: Science Citation Index Expanded. https://clarivate.com/webofsciencegroup/solutions/webofscience-scie/ . Accessed 11 May 2021.

World Economic Forum. 2020. WEF Global Risk Report. http://www3.weforum.org/docs/WEF_Global_Risk_Report_2020.pdf . Accessed 13 May 2020.

Xin, Y., L. Kong, Z. Liu, Y. Chen, Y. Li, H. Zhu, M. Gao, H. Hou, and C. Wang. 2018. Machine learning and deep learning methods for cybersecurity. IEEE Access 6: 35365–35381. https://doi.org/10.1109/ACCESS.2018.2836950 .

Xu, C., J. Zhang, K. Chang, and C. Long. 2013. Uncovering collusive spammers in Chinese review websites. In Proceedings of the 22nd ACM international conference on Information & Knowledge Management.

Yang, J., T. Li, G. Liang, W. He, and Y. Zhao. 2019. A Simple recurrent unit model based intrusion detection system with DCGAN. IEEE Access 7: 83286–83296. https://doi.org/10.1109/ACCESS.2019.2922692 .

Yuan, B.G., J.F. Wang, D. Liu, W. Guo, P. Wu, and X.H. Bao. 2020. Byte-level malware classification based on Markov images and deep learning. Computers & Security 92: 12. https://doi.org/10.1016/j.cose.2020.101740 .

Zhang, S., X.M. Ou, and D. Caragea. 2015. Predicting cyber risks through national vulnerability database. Information Security Journal 24 (4–6): 194–206. https://doi.org/10.1080/19393555.2015.1111961 .

Zhang, Y., P. Li, and X. Wang. 2019. Intrusion detection for IoT based on improved genetic algorithm and deep belief network. IEEE Access 7: 31711–31722.

Zheng, Muwei, Hannah Robbins, Zimo Chai, Prakash Thapa, and Tyler Moore. 2018. Cybersecurity research datasets: taxonomy and empirical analysis. In 11th {USENIX} workshop on cyber security experimentation and test ({CSET} 18).

Zhou, X., W. Liang, S. Shimizu, J. Ma, and Q. Jin. 2021. Siamese neural network based few-shot learning for anomaly detection in industrial cyber-physical systems. IEEE Transactions on Industrial Informatics 17 (8): 5790–5798. https://doi.org/10.1109/TII.2020.3047675 .

Zhou, Y.Y., G. Cheng, S.Q. Jiang, and M. Dai. 2020. Building an efficient intrusion detection system based on feature selection and ensemble classifier. Computer Networks 174: 17. https://doi.org/10.1016/j.comnet.2020.107247 .

Download references

Open Access funding provided by the IReL Consortium.

Author information

Authors and affiliations.

University of Limerick, Limerick, Ireland

Frank Cremer, Barry Sheehan, Arash N. Kia, Martin Mullins & Finbarr Murphy

TH Köln University of Applied Sciences, Cologne, Germany

Michael Fortmann & Stefan Materne

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to Barry Sheehan .

Ethics declarations

Conflict of interest.

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 334 kb)

Supplementary file1 (docx 418 kb), rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cremer, F., Sheehan, B., Fortmann, M. et al. Cyber risk and cybersecurity: a systematic review of data availability. Geneva Pap Risk Insur Issues Pract 47 , 698–736 (2022). https://doi.org/10.1057/s41288-022-00266-6

Download citation

Received : 15 June 2021

Accepted : 20 January 2022

Published : 17 February 2022

Issue Date : July 2022

DOI : https://doi.org/10.1057/s41288-022-00266-6

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Cyber insurance
  • Systematic review
  • Cybersecurity
  • Find a journal
  • Publish with us
  • Track your research