How to Become an Artificial Intelligence Hacker?

Cihan Ozhan
Jan 3, 2023

Hacking artificial intelligence or hacking with artificial intelligence…

The basic truth you come to realize as you work with cyber security and advanced software is that 100% security is not possible. Depending on our specialization, our job is either to hack something or to secure it, and both have limits. No matter how well you secure a system, your opponent does not sit idle. As in martial arts, while you plan how to hit your opponent, your opponent is planning the same for you and preparing for it…

Since the early 2000s, when I started researching software and security, the question people never tire of asking has been ‘how can I become a hacker?’. Anyone with security experience has been asked the same question over and over. My purpose in this article is not to teach you to be a hacker or how to hack, but to guide those who aim for a career in hacking artificial intelligence or hacking with artificial intelligence!

I have published many webinar/video recordings, presentations and articles. You can access them at cihanozhan.com.

We will examine the subject under two different headings:

  • AI Security/Hacking
  • AI for Security/Hacking

The two are different matters; don’t confuse them. To understand the basics, you should read my article ‘Artificial Intelligence Security: Introduction’.

Assuming you have read my previous article, we can move on to our topic.

AI Security : Hacking Artificial Intelligence

Good news for those who see artificial intelligence as a magic wand: it is not!

We divide artificial intelligence into categories such as machine learning, deep learning, etc. Underneath them there is only software and algorithms. I have never seen any magic while working on autonomous systems or hacking an AI, and I’m sure you won’t find any either. That’s why I suggest you don’t waste your time searching for it. Focus on software and algorithms.

If an AI model consists of algorithms and software, do the software security vulnerabilities that have been valid for many years also apply to the subject we are talking about? Yes, totally!

First of all, the key is software knowledge. The most basic mistake I see in young people who are looking for AI-oriented education or trying to improve themselves is a lack of software knowledge. While they look for AI magic, they assume software is unimportant, or they simply don’t know that it matters. Then tens of hours of coursework, and maybe hundreds of hours of math study, can end in disappointment once they join their first project in the industry. Because the industry means code: a product can consist of millions of lines of code. The mathematical algorithm you spent weeks trying to understand may amount to 1–2 lines within those million lines.

From this you should understand the importance of programming and data knowledge. Data here means SQL, CSV, image, video, audio, frequency, stream or any other form of data… Without correctly knowing the anatomy of a picture or video, you can neither develop AI on it nor hack it. Remember, hacking something should usually be the next level after learning to build it. A good architect can see all the weak points of an architecture!

As a programming language, the ideal choice at the beginning is Python. Python is amazing, though not because it is very fast or very secure. That is the subject of a separate article but, as I said, Python is the language that best fits the ‘ideal’ criteria for starting out.
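As a small illustration of what knowing the anatomy of a picture can mean in practice, here is a minimal sketch that builds and then parses only the header of a PNG file with the Python standard library. The helper names `build_ihdr` and `read_ihdr` are my own for this example, and it covers just the signature and the first (IHDR) chunk, not a full image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def build_ihdr(width, height):
    """Build the PNG signature plus IHDR chunk of an 8-bit RGB image (header only)."""
    # width, height, bit depth, color type (2 = RGB), compression, filter, interlace
    data = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    crc = zlib.crc32(b"IHDR" + data)
    return (PNG_SIGNATURE + struct.pack(">I", len(data))
            + b"IHDR" + data + struct.pack(">I", crc))


def read_ihdr(blob):
    """Parse the header and verify its checksum; raise ValueError if it is invalid."""
    if blob[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    length, chunk_type = struct.unpack(">I4s", blob[8:16])
    data = blob[16:16 + length]
    (stored_crc,) = struct.unpack(">I", blob[16 + length:20 + length])
    if chunk_type != b"IHDR" or zlib.crc32(chunk_type + data) != stored_crc:
        raise ValueError("corrupt IHDR chunk")
    width, height, bit_depth, color_type, _, _, _ = struct.unpack(">IIBBBBB", data)
    return {"width": width, "height": height,
            "bit_depth": bit_depth, "color_type": color_type}


header = build_ihdr(640, 480)
info = read_ihdr(header)
```

Once you can read a format at this level, you can also reason about which bytes an attacker can change without the parser noticing and which changes the checksum will catch.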

The topics to focus on for software-oriented security vulnerabilities are as follows:

  • Static Code Analysis
  • Dynamic Program Analysis
  • Operating System Architecture

When examining the security vulnerabilities of an AI model, we first look at code, environment and tool vulnerabilities. A vulnerability in pickle, the Python module you will likely use to serialize and store the parameters of an AI model, is directly a vulnerability in your model. Likewise, an ML/DL tool (TensorFlow, PyTorch, etc.) may run out of sight in the production environment, but if it has no authorization mechanism and you still treat it as trustworthy, you should be ready for a serious vulnerability. TF/PyTorch needs access to the operating system, network, disks, CPU and GPU to do its job, and you have to grant those permissions, but you can’t leave it unattended. Since an AI hacker looks at all this with manipulation in mind, they will use the vulnerabilities of these tools to attack the entire MLOps infrastructure. As a result, your AI model may be stolen, it may be poisoned so that its predictions change, or many other kinds of damage may occur.
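To make the pickle risk concrete, here is a minimal sketch using only the standard library. `MaliciousPayload` is a hypothetical stand-in for a tampered model file, and its payload is harmless arithmetic; a real attacker would put an operating system command there. The restricted unpickler follows the approach described in the Python documentation for limiting globals, and is only one possible mitigation.

```python
import io
import pickle


class MaliciousPayload:
    """Hypothetical stand-in for a tampered model file (illustration only)."""

    def __reduce__(self):
        # pickle will call eval("6 * 7") at load time. The expression is
        # harmless here, but it is fully controlled by whoever made the file.
        return (eval, ("6 * 7",))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so no callable can be resolved while loading."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data):
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


tampered = pickle.dumps(MaliciousPayload())
result = pickle.loads(tampered)  # runs eval; result is 42, not a model object

plain = pickle.dumps({"weights": [0.1, 0.2]})
restored = safe_loads(plain)  # plain containers and numbers need no globals

try:
    safe_loads(tampered)
    blocked = False
except pickle.UnpicklingError:
    blocked = True  # the restricted loader rejects the tampered file
```

The point is that loading a pickle is running code. If you cannot control where a model file comes from, prefer a format that stores only data, or restrict what the loader may resolve.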

Another mistake is the ‘closed network is safe’ approach! A closed network is a computer network operated so that it is closed to outside access. Take, for example, the view that a fully autonomous vehicle (a driverless car or a military land/air/sea vehicle, etc.) cannot be hacked because it is usually not open to direct access from the outside… Wrong! We do not need to access the network of an autonomous military ground vehicle. To be autonomous, that vehicle has to receive data through sensors, cameras, LIDAR and RADAR, and our goal is to attack the data it takes in from outside. In other words, you can run but you can’t hide… When AI and autonomy depend on data that must (mostly) be taken as input from outside, the closed network is not secure; it is open to hacking. So an AI hacker tries to deceive the ‘senses’ of artificial intelligence by manipulating input data, code and algorithms. Senses are not safe in humans or in AI.
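Manipulating input data can be sketched in a few lines. The example below applies the fast gradient sign method (FGSM), a well-known adversarial technique, to a tiny logistic regression classifier written in pure Python. The weights, the input and the epsilon value are made up for illustration; a real attack would target a trained model and real sensor data.

```python
import math

# Hypothetical weights of an already trained linear classifier.
WEIGHTS = [2.0, -3.0, 1.0]
BIAS = 0.0


def probability(x):
    """Probability of class 1 under logistic regression."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def predict(x):
    return 1 if probability(x) >= 0.5 else 0


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(x, true_label, epsilon):
    """Shift every feature by epsilon in the direction that increases the loss."""
    # For the logistic loss, the gradient with respect to the input is (p - y) * w.
    error = probability(x) - true_label
    return [xi + epsilon * sign(error * w) for xi, w in zip(x, WEIGHTS)]


x = [0.5, 0.1, 0.2]                          # classified as 1
x_adv = fgsm(x, true_label=1, epsilon=0.2)   # each feature moves by at most 0.2
```

Here `predict(x)` is 1 while `predict(x_adv)` is 0, although no feature changed by more than 0.2. The attacker never touched the network the model runs on, only the data it senses.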

There are thousands of psychology experiments on this subject that have been done on humans. All human senses and thoughts can be easily manipulated. This is also true for AI.

Static Code Analysis is actually a general software security topic. It is a valid analysis method for any project that involves code, whether mobile, web or system software. You need to improve yourself in this area.
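A very small static check can be written with Python’s own `ast` module. The sketch below scans source code, without running it, for a short list of calls that often deserve a closer look. The `RISKY_CALLS` list and the `SAMPLE` snippet are assumptions for this example; real static analyzers track far more than call names.

```python
import ast

# Call names that usually deserve a manual review (illustrative, not exhaustive).
RISKY_CALLS = {"eval", "exec", "pickle.load", "pickle.loads", "os.system"}


def call_name(node):
    """Return 'name' or 'module.attr' for simple calls, or '' for anything else."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def find_risky_calls(source):
    """Parse the source and list (line number, call name) for every risky call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and call_name(node) in RISKY_CALLS:
            findings.append((node.lineno, call_name(node)))
    return sorted(findings)


SAMPLE = '''
import pickle

def load_model(path):
    with open(path, "rb") as fh:
        return pickle.load(fh)

def run(expr):
    return eval(expr)
'''

findings = find_risky_calls(SAMPLE)
```

For the sample above, the scan reports `pickle.load` on line 6 and `eval` on line 9. The code under review is only parsed, never executed, which is what makes the analysis static.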

Dynamic Program Analysis, like Static Code Analysis, is a general software security topic. The main difference is that it analyzes an application at runtime rather than at the code level. In other words, it means pentesting software while it is running in an emulator, a lab or a real production environment; it is the process of analyzing and manipulating running software. This is another area in which you need to improve yourself for AI Security.
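The simplest form of runtime analysis is watching what a program actually does while it runs. Below is a minimal sketch that uses `sys.settrace` from the standard library to record which Python functions are called during one prediction. The `normalize` and `predict` functions are hypothetical placeholders for a real inference pipeline.

```python
import sys


def trace_calls(func, *args):
    """Run func(*args) and return (result, names of Python functions called)."""
    called = []

    def tracer(frame, event, arg):
        if event == "call":
            called.append(frame.f_code.co_name)
        return None  # no line-by-line tracing inside the frame

    previous = sys.gettrace()  # keep any debugger or coverage tracer
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, called


# Hypothetical stand-ins for a real preprocessing and inference pipeline.
def normalize(pixel):
    return pixel / 255.0


def predict(pixel):
    return 1 if normalize(pixel) > 0.5 else 0


result, called = trace_calls(predict, 200)
```

After this runs, `called` contains `predict` followed by `normalize`. Real dynamic analysis tools go much further (debuggers, fuzzers, instrumentation frameworks), but the idea is the same: observe and manipulate the software while it executes.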

Operating System Architecture is foundational knowledge, whether your focus is software, security or performance. The better you know it, the greater the technical depth you can reach.

Summary of What I Should Learn:

  • Advanced software: Python or C++ is suitable. Start with Python.
  • Secure Software Development
  • Static Code Analysis
  • Dynamic Program Analysis
  • Mastering Basic Machine Learning (Statistics) Algorithms
  • Enough Mastery to Write a Deep Neural Network from Scratch
  • AI Security Research: The deepest security area you can see.

AI for Security : Hacking with Artificial Intelligence

Working under this title is very nice, very enjoyable, but also very difficult ;)

This is the area where you need more general cybersecurity domain knowledge, because now you want to develop an artificial intelligence that does the job of a hacker or security expert.

We can divide this issue into two:

  • Offensive AI : Developing an AI for cyber attack
  • Defensive AI : Developing an AI for cyber defense

I do both in my job and my business (safebox.ai), and I enjoy both. However, here I will talk about how to build a career on the Defensive AI side and what you need for it.

Software! Without software, this stage is impossible. Don’t think of software as writing a 100-line script. A hacker who works in this field should be at the same level as a regular software developer. Again, Python is sufficient and will make your job easier in many respects.

Cyber security domain knowledge. For example, if you want to develop an AI-based detection system or load balancer against DDoS attacks, you naturally need in-depth knowledge of DDoS. This applies to every AI project: you cannot develop an AI project in an area of expertise that you do not know. At most, you can develop it at the level of a training exercise… If you want to develop an artificial intelligence in the field of Threat Intelligence (TI), you must master TI, software, the internet, protocols and general intelligence knowledge. You can’t think of yourself as just a software developer, just a TI person, or just an AI person. A person doing TI work draws on many kinds of background knowledge and experience, and on top of that, success also comes from research ability, instinct, intuition, some foresight, etc. The more of these you can transfer to the AI, the more successful you can be. For example, you cannot develop a very successful AI-based SQLi defender without deep knowledge of SQL Injection attacks… Nor can you develop an AI-based malware detection product without having developed malware yourself and built up your malware analysis skills. Domain knowledge is very important in these areas. You have to describe the job to the AI algorithmically and programmatically.
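To show how domain knowledge turns into code, here is a minimal statistical sketch for the DDoS case: it learns what a normal request rate looks like and flags rates far above it using a z-score. The class name, the baseline numbers and the threshold of 3 are all assumptions for illustration. Deciding what to measure and what counts as normal is exactly where DDoS knowledge comes in.

```python
import statistics


class RateAnomalyDetector:
    """Flags request rates that sit far above a learned baseline (z-score test)."""

    def __init__(self, baseline, threshold=3.0):
        # Assumes the baseline has at least two samples and is not constant,
        # otherwise the standard deviation is undefined or zero.
        self.mean = statistics.mean(baseline)
        self.stdev = statistics.stdev(baseline)
        self.threshold = threshold

    def z_score(self, rate):
        return (rate - self.mean) / self.stdev

    def is_anomalous(self, rate):
        return self.z_score(rate) > self.threshold


# Made-up requests-per-second samples from a quiet period.
baseline = [100, 104, 98, 101, 97, 103, 99, 102]
detector = RateAnomalyDetector(baseline)

normal_flag = detector.is_anomalous(105)  # slightly busy, still normal
attack_flag = detector.is_anomalous(900)  # far outside the baseline
```

A single number like this is easy to evade, for example with a slow attack that raises the rate gradually. Knowing such evasion techniques is the domain knowledge that tells you which additional features your model needs.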

One of the main problems you will encounter when working with AI is deciding which of the hundreds of AI algorithms to use for which vulnerability. We usually do not use a single algorithm; more than one is needed. This requires AI experience and mastery. Everyone in the industry can talk about ChatGPT, but knowing and understanding how it was developed is a different specialty. You will encounter very similar scenarios in the field of AI for Security.
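Combining more than one algorithm can be as simple as letting several detectors vote. The sketch below uses three deliberately naive, hand-written checks for SQL injection and flags an input only when at least two of them agree. The detectors, the keyword list and the thresholds are hypothetical; in a real product each voter would typically be a trained model.

```python
import re

SQL_KEYWORDS = re.compile(r"\b(union|select|drop|sleep)\b", re.IGNORECASE)


def has_sql_keyword(text):
    return bool(SQL_KEYWORDS.search(text))


def has_quote_or_comment(text):
    return "'" in text or "--" in text


def is_symbol_heavy(text):
    """True when more than 20% of the characters are punctuation or symbols."""
    if not text:
        return False
    symbols = sum(1 for ch in text if not ch.isalnum() and not ch.isspace())
    return symbols / len(text) > 0.2


DETECTORS = (has_sql_keyword, has_quote_or_comment, is_symbol_heavy)


def is_suspicious(text, min_votes=2):
    """Majority vote: flag the input only if enough detectors agree."""
    votes = sum(detector(text) for detector in DETECTORS)
    return votes >= min_votes


attack = is_suspicious("1' UNION SELECT password FROM users --")
surname = is_suspicious("O'Brien")
search = is_suspicious("select a laptop")
```

The attack string is flagged, while `O'Brien` and `select a laptop` each trigger only one detector and pass. Any single check on its own would have produced a false positive on one of them.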

Summary of What I Should Learn:

  • Advanced Software: Python
  • Cyber Security Domain Knowledge: for whichever vertical you will develop your project in…
  • Defense Knowledge and Experience for the Relevant Vulnerability
  • Mastering the Security Software Used for the Relevant Vulnerabilities
  • Mastering Basic Machine Learning (Statistics) Algorithms
  • Enough Mastery to Write a Deep Neural Network from Scratch
  • OWASP

I wouldn’t say that AI Security and AI for Security, the subjects of this article, are easy fields. However, you will see how they come to dominate the industry globally in the coming years. ;)

Cihan Ozhan
Safebox.ai
