OFFICIAL SITE OF THE STATE OF NEW JERSEY


ChatGPT and Its Impact on Cybersecurity

Informational Report

Original Release Date: 11/29/2023

Artificial Intelligence

Artificial intelligence (AI) is a technology designed to mimic human cognitive functions, including learning, interacting, decision-making, and problem-solving. Machine learning is a subset of AI that plays a pivotal role in developing intelligence-based algorithms by learning from data. Within AI, deep learning exists as a subset that structures algorithms into layers and fosters data exchange through artificial neural networks.

While these technologies are interconnected, they differ in important ways. Deep learning, in particular, teaches computers to process data in a way loosely modeled on the human brain. Large language models such as the Generative Pre-trained Transformer (GPT) family are deep learning models that produce human-like text in response to prompts. A transformer is a type of neural network that is trained to assess the content and context of the input data, determining the importance of each part of the information. Transformers also predict text, such as a sentence or paragraph, based on their training and the patterns learned from their datasets. In the field of AI, training encompasses the process of instructing a computer system to identify patterns within datasets, make decisions based on the data presented, and run tests that evaluate the model’s comprehension of the data. It is important to understand that deep learning models lack the ability to think independently or retain information. Instead, specialized algorithms recognize patterns within data sequences, and those patterns shape the responses.
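As a rough illustration of how a transformer can determine "the importance of each part of the information," the sketch below computes attention weights with a scaled dot product followed by a softmax. The token list and the two-dimensional vectors are hypothetical values chosen for readability; real models learn much larger vectors during training, and this is not a description of any specific GPT model's internals.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Score each key against the query (scaled dot product), then normalize."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Hypothetical 2-dimensional vectors for three input tokens.
tokens = ["the", "bank", "river"]
keys = [[0.1, 0.0], [1.0, 0.5], [0.9, 0.8]]
query = [1.0, 1.0]  # hypothetical query vector for the token being processed

weights = attention_weights(query, keys)
for token, weight in zip(tokens, weights):
    print(f"{token}: {weight:.2f}")
```

A larger weight means the model would draw more heavily on that token when forming its output for the current position.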

What is ChatGPT?

In November 2022, OpenAI released a deep learning model called ChatGPT for public use. ChatGPT’s capabilities are extensive and have the potential to completely transform how we navigate the digital realm. This chatbot facilitates user engagement through conversations and is marketed as a user-friendly tool that provides a convenient alternative to conventional search engines and a way to handle routine tasks. It operates similarly to other chatbots: users input text prompts, and the chatbot responds by simulating human speech. Cybersecurity professionals may use ChatGPT to generate substantial volumes of text for a wide array of tasks such as writing, coding, debugging code, articulating complex issues, and more. Its ability to generate detailed and precise responses, along with its extensive knowledge of various topics, distinguishes it from other chatbots and makes it an appealing tool for many organizations.

Examples, capabilities, and limitations of ChatGPT

Image Source: Zapier

In June 2020, OpenAI released the GPT-3 language model, a neural network machine learning model. Given its learning method, this model was widely utilized in natural language processing to emulate human language. It laid the groundwork for ChatGPT, which was made accessible to the public and quickly amassed millions of active users. While older AI chatbots answered questions and provided detailed analyses, ChatGPT stands out because it uses a dialogue format, which enables it to address follow-up questions and ask clarifying questions to produce precise responses. Additionally, this chatbot rejects requests that violate the ethical principles that it has been programmed to uphold, including inquiries about illegal activity such as how to commit a crime, hack computer systems, conceal evidence, and more.

In November 2022, OpenAI introduced ChatGPT, built on a more advanced model known as GPT-3.5. This model integrates enhanced algorithms and is accessible to the public. Subsequently, OpenAI introduced a paid tier named ChatGPT Plus, which provides access to GPT-4 and offers faster response times and internet plugins. Unlike the prior GPT models in ChatGPT, access to GPT-4 requires a subscription fee. GPT-4 handles more intricate tasks than previous models (GPT-3 and GPT-3.5), including describing images, generating comprehensive image captions, and developing detailed responses of up to 25,000 words. Despite the recent release of GPT-4, GPT-5 is already being discussed. At the time of this writing, OpenAI has not announced a specific launch date for GPT-5.

How Does ChatGPT Work?

ChatGPT is built on a large language model (LLM), a deep neural network (DNN) that uses a software architecture known as a transformer network, in which interconnected nodes are arranged in a layered structure. These networks learn to perform tasks from extensive sequences of data, encompassing text, audio, images, videos, and even protein structures. LLMs are trained through next token prediction on large text samples drawn from various sources such as articles, websites, social media platforms, and GitHub. The text is first broken down into tokens, which represent words or parts of words.

OpenAI's examples of tokens

Image Source: PCMag
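To make the idea of tokens more concrete, the following is a minimal sketch of splitting words into pieces by greedy longest match against a small vocabulary. The vocabulary and the resulting splits are hypothetical and chosen only for illustration. Production tokenizers, including OpenAI's, use byte pair encoding with a vocabulary learned from data, which this sketch does not implement.

```python
def tokenize(text, vocab):
    """Greedy longest-match split of each word into pieces from vocab."""
    tokens = []
    for word in text.lower().split():
        start = 0
        while start < len(word):
            # Try the longest remaining piece first, then shrink.
            for end in range(len(word), start, -1):
                piece = word[start:end]
                # Fall back to a single character if nothing else matches.
                if piece in vocab or end - start == 1:
                    tokens.append(piece)
                    start = end
                    break
    return tokens

# Hypothetical vocabulary, not OpenAI's actual one.
VOCAB = {"words", "basic", "ally", "token", "s"}

print(tokenize("words", VOCAB))      # a word that is a single token
print(tokenize("basically", VOCAB))  # a word split into two tokens
print(tokenize("tokens", VOCAB))
```

The single-character fallback means any input can be tokenized, even if no vocabulary entry matches.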

All transformer-based LLMs use next token prediction, breaking down text into tokens. A token can be a whole word or a part of one: a word may consist of a single token (“words”) or be split into two (“basically”). An additional training step, reinforcement learning from human feedback (RLHF), distinguishes ChatGPT from other chatbots and enhances its functionality. During this stage, human annotators create prompts and rate the LLM’s output. The model refines its parameters based on these ratings to align with user intent.

An example human evaluation task comparing two LLM outputs.

Image Source: Cohereblog

This model’s approach involves predicting the next token in a sequence based on a sample of text. It then compares its prediction to the actual next token in the training data, adjusting its parameters to correct any mistakes and to capture recurring patterns identified within the examples. By continuously repeating this process, the model significantly enhances its language capabilities, enabling it to create comprehensive sequences of text when provided with a prompt.
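The predict, compare, and adjust loop described above can be sketched with a toy model that looks only at the previous token. This is a deliberately simplified stand-in, not ChatGPT's architecture: the corpus, learning rate, and epoch count are hypothetical, and a real LLM conditions on long contexts with billions of parameters.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_bigram(token_ids, vocab_size, epochs=200, lr=0.1):
    """Learn logits[prev][next] by gradient descent on cross-entropy loss."""
    logits = [[0.0] * vocab_size for _ in range(vocab_size)]
    pairs = list(zip(token_ids, token_ids[1:]))
    for _ in range(epochs):
        for prev, actual in pairs:
            probs = softmax(logits[prev])  # predict the next token
            # Compare with the actual token: the gradient of the
            # cross-entropy loss w.r.t. the logits is (probs - one_hot).
            for j in range(vocab_size):
                target = 1.0 if j == actual else 0.0
                logits[prev][j] -= lr * (probs[j] - target)  # adjust
    return logits

def predict_next(logits, prev):
    """Return the id of the most probable next token after prev."""
    probs = softmax(logits[prev])
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical toy corpus.
corpus = "the cat sat on the mat and the cat ran".split()
vocab = sorted(set(corpus))
ids = [vocab.index(w) for w in corpus]

model = train_bigram(ids, len(vocab))
next_id = predict_next(model, vocab.index("the"))
print("after 'the' the model predicts:", vocab[next_id])
```

Because "cat" follows "the" more often than "mat" does in this corpus, repeated adjustment shifts probability toward "cat".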

Human feedback and training are crucial in the development process of ChatGPT, guiding it through conversations and evaluating the quality of its responses. The training process begins with generic data and gradually incorporates more tailored data designed for specific tasks. ChatGPT’s training involves exposure to online text to learn human language, followed by the use of transcripts to understand diverse conversation topics. For the chatbot to improve, it must train continuously. Users can actively participate in this process by upvoting and downvoting responses with the thumbs up and thumbs down icons, respectively. Additionally, users can offer written feedback to improve the overall dialogue. Reward models are also used to help trainers determine the most effective answers.
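One common way to turn human ratings into a training signal for a reward model is a pairwise preference loss: the loss is small when the model scores the human-preferred response above the rejected one. The sketch below shows that formulation with hypothetical scores. The source does not state the exact objective OpenAI uses, so this should be read as an assumption about the general technique rather than a description of ChatGPT's training code.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: small when the preferred response scores higher."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)), written in a numerically stable form.
    return math.log1p(math.exp(-abs(margin))) + max(-margin, 0.0)

# Hypothetical reward-model scores for two responses to the same prompt;
# a human rater preferred the first response in each call.
agrees = preference_loss(reward_chosen=2.0, reward_rejected=-1.0)
disagrees = preference_loss(reward_chosen=-1.0, reward_rejected=2.0)

print(f"model agrees with rater:    loss = {agrees:.3f}")
print(f"model disagrees with rater: loss = {disagrees:.3f}")
```

Minimizing this loss over many rated pairs pushes the reward model toward the raters' preferences.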

ChatGPT – Pros

Streamline Communication

ChatGPT can amplify communication among teams and stakeholders in several ways. It serves as a centralized platform that allows individuals to communicate, access, and share information in one space. Users also utilize ChatGPT for questions, data clarification, and feedback. By functioning as a centralized platform, ChatGPT ensures that all individuals with access have the same information and can observe the decision-making process. ChatGPT fosters effective communication, reduces misinterpretation, and improves overall efficiency. It also aids in building trust between teams and stakeholders which provides everyone with the opportunity to express their thoughts and ideas throughout the decision-making process. Additionally, ChatGPT automates responses to notify stakeholders of breaches so they can take steps to swiftly resolve them. This chatbot also analyzes large data sequences, such as vulnerabilities or cybersecurity threats, to assist cybersecurity professionals in making informed decisions. Ultimately, ChatGPT enhances transparency and collaboration through communication.

Enhance Threat Intelligence and Analysis

In the realm of cybersecurity, ChatGPT’s capabilities are undeniably crucial in threat intelligence and analysis. The chatbot contains features that enable the identification of emerging threats, patterns, and anomalies in the data to develop intelligence by sorting and processing large amounts of textual data. These features enable cybersecurity analysts to stay informed of emerging threats and obtain the knowledge to mitigate them using effective cybersecurity strategies. ChatGPT’s insights significantly improve this process by providing context associated with the data. By filtering out irrelevant information, ChatGPT allows analysts to prioritize critical threats, which enhances the effectiveness of threat analysis and the overall cybersecurity landscape.

Strengthen Cybersecurity Awareness Training Programs

User activity is important in shaping the cybersecurity posture of an organization. ChatGPT serves as a valuable resource to educate users about effective cybersecurity practices, respond to questions, and simulate phishing attacks for cybersecurity employee training. By understanding skill gaps, ChatGPT develops personalized training content for each user to highlight the importance of cybersecurity, impact of cybersecurity threats, and effective mitigation strategies. Through intelligent conversations, ChatGPT recommends training materials and offers feedback to enhance the effectiveness of cybersecurity awareness training programs. Additionally, ChatGPT provides employees with access to virtual mentors for professional development by providing realistic interactions and offering practical advice. ChatGPT can also free up resources and function as a 24/7 support system to address queries or troubleshooting issues without the need for continual assistance from the support team.

ChatGPT – Cons

Risk of Bias & Outdated Information

ChatGPT is trained on information it receives from users along with information available on the internet up until 2021 and, therefore, has very limited knowledge of world events after that time. The responses it generates rely largely on the quality of the data it is trained on, which can result in biased or outdated information. This is dangerous because it can promote the spread of misinformation or harmful biases. Because ChatGPT relies on human training to advance, biases are oftentimes replicated in the datasets used to train the chatbot. This serves as a reminder of why it is important for individuals to conduct their own independent research, consulting sources other than ChatGPT to ensure that the data ChatGPT provides is accurate and reliable.

Data Privacy Concerns

ChatGPT provides numerous benefits for individuals and organizations alike, but it also raises data privacy concerns. According to OpenAI, ChatGPT records every prompt, including any personal information, files, and feedback it contains, and these records may be reviewed by OpenAI’s trainers to further improve ChatGPT. Several Samsung employees, for example, leaked sensitive data, including their programming code, on ChatGPT on three separate occasions. This example emphasizes the importance of enforcing stringent policies as well as the significance of cybersecurity awareness for employees. It is unknown to what extent the tool collects and retains prompts, since the chatbot automatically retains and utilizes prompts to train itself. Therefore, users are advised to refrain from sharing sensitive data.

Image Source: MakeUseOf

Fortunately, in April 2023, OpenAI introduced data privacy controls that allow users to maintain their privacy on ChatGPT by disabling the chat history via the Settings menu (Settings > Data controls > Chat history & training). According to OpenAI, when the chat history is disabled, the company only preserves conversations for 30 days. After the 30-day period, conversations are deleted permanently unless they include illegal or inappropriate content.

Amplified Cybersecurity Threats

ChatGPT’s advanced capabilities enable it to assist threat actors with the execution of cyberattacks. For example, traditional phishing emails are identified using key indicators, such as poor grammar, misspellings, and unusual greetings. However, with the help of ChatGPT, threat actors now possess the tools to develop highly sophisticated phishing emails that look legitimate, rendering many of these indicators ineffective. Threat actors can produce phishing emails that contain flawless language and personalized details, making it significantly more challenging to differentiate between legitimate emails and phishing emails.
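The weakness of surface-level indicators can be illustrated with a small sketch that counts misspellings and unusual greetings. The word lists, sample emails, and scoring rule are hypothetical and far simpler than any real email filter. The point is only that a well-written message produces no signal for a check of this kind.

```python
import re

# Hypothetical indicator lists for illustration only.
COMMON_MISSPELLINGS = {"recieve", "acount", "verifcation", "immediatly", "pasword"}
UNUSUAL_GREETINGS = ("dear customer", "dear user", "greetings of the day")

def indicator_score(email_text):
    """Count traditional surface-level phishing indicators in an email."""
    text = email_text.lower()
    words = re.findall(r"[a-z']+", text)
    misspelled = sum(1 for word in words if word in COMMON_MISSPELLINGS)
    greeting = sum(1 for phrase in UNUSUAL_GREETINGS if phrase in text)
    return misspelled + greeting

clumsy = "Dear customer, you must verify your acount immediatly to recieve access."
polished = (
    "Hi Jordan, following up on Tuesday's invoice. "
    "Please confirm your account details today."
)

print("clumsy email score:  ", indicator_score(clumsy))
print("polished email score:", indicator_score(polished))
```

Both messages could be malicious, yet only the clumsy one trips the indicators, which is why checks based on sender, links, and context matter more as message quality improves.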

Conclusion

ChatGPT’s capabilities have enabled it to become a force to be reckoned with within the realm of cybersecurity. As it continues to undergo the training process and make advancements, ChatGPT will become more intelligent and provide users with substantial value and countless new opportunities. By leveraging its capabilities and acknowledging its limitations, users can utilize this chatbot to facilitate an environment that enhances cybersecurity while upholding ethical values and moral practices.
