Hackerspot: AI Security

How Does AI Actually Learn?

Chady — Sun, 10 May 2026 16:11:58 GMT

How does AI learn? Training an AI model isn’t magic. It’s a mechanical process: you show the model examples, measure how wrong it is, and adjust its internal knobs to be less wrong. Repeat millions of times, and you get a model that works.

Here’s the machinery underneath.

The Training Pipeline: Data to Model

Before training even starts, you need a plan for your data.

You collect raw data (emails, images, transactions, sensor readings—whatever your problem requires). You clean it (remove garbage, fix errors, handle missing values). You normalize it (scale numbers to a consistent range so the model doesn’t get confused by different units). Then you split it into three parts: a training set, a validation set, and a test set.

The training set is what the model learns from. You show it thousands of examples, and the model adjusts itself based on what it sees.

The validation set is a referee. While training happens, you periodically check the model against data it’s never seen before. If the model is overfitting—memorizing training examples instead of learning general patterns—the validation set will catch it. The model never learns from validation data; it’s only for observation.

The test set is a final exam. You keep it locked away until training is completely done. Only then do you measure the model’s real-world accuracy on data it’s truly never encountered.

This separation is critical. If you test on the same data the model was trained on, you’ll get an inflated score that doesn’t reflect how the model will perform on new problems.

Loss Functions: The Scoreboard

How does the model know it’s wrong?

A loss function measures how bad the model’s predictions are. The lower the loss, the better the model. Different problems use different loss functions.

For a spam filter, the loss might be: “How many emails did you misclassify?” If the model predicts “spam” for an email that’s actually legitimate, the loss goes up.

For an image classifier that identifies dog breeds, the loss might measure the probability distance between the predicted label and the true label. If the model is 90% confident it’s a poodle but it’s actually a dachshund, the loss is high. If it’s 95% confident it’s a dachshund, the loss is lower.

Here’s a concrete example:

Gradient Descent: Rolling Downhill

Now, how does the model actually adjust itself?

Imagine you’re blindfolded at the top of a hill, trying to reach the lowest point. You can’t see the whole landscape. You feel the slope under your feet, and you take a small step downhill. Then you check the slope again and take another step. Repeat long enough, and you’ll reach a valley.

Gradient descent is this process. The model calculates the slope of the loss function with respect to each of its parameters (called the “gradient”). Then it takes a small step in the direction that reduces loss. It does this thousands or millions of times.

The word “gradient” sounds fancy but it just means: “In which direction does the loss go down, and how steep is it?”

Backpropagation: Assigning Blame

Gradient descent needs to know which parameters to adjust. This is where backpropagation comes in.

Backpropagation is the mechanism that calculates how much each internal parameter contributed to the error. It works backward from the output, asking: “How did this layer’s weights affect the mistake? And the layer before that?”

Think of it as an error audit trail. If the model predicted 95 instead of 50, backpropagation traces the error backward through every calculation and says, “This weight contributed 3 to the error. That weight contributed 7. This one contributed -2.” Gradient descent then adjusts these weights based on their contributions.

You don’t need to understand the mathematics to use it. The key insight: backpropagation lets the model figure out what to fix.

Epochs and Batch Size: The Training Rhythm

Training happens in cycles.

An epoch is one full pass through the entire training dataset. If you have 10,000 training examples, one epoch means the model has seen all 10,000 exactly once.

But you don’t show the model all 10,000 at once. You show them in groups called batches. A batch size of 32 means you process 32 examples, calculate their total loss, backpropagate, adjust the weights, then move to the next 32. This happens because processing one example at a time is slow, and processing all of them at once requires too much memory.

A typical training run might look like: 100 epochs, batch size 32. The model sees all training data 100 times, processing it in batches of 32 each time. Loss decreases with each epoch until it plateaus. That’s when you stop.

Data Quality Beats Algorithm Quality

Here’s something instructors wish beginners knew: better data beats better algorithms.

You can have the fanciest, most sophisticated model ever designed. But if your training data is garbage—full of errors, biased, or unrepresentative of the real world—the model will be garbage. Conversely, mediocre algorithms trained on clean, representative data often outperform fancy algorithms trained on messy data.

This is why data preparation takes longer than algorithm selection in real projects. And why data engineers are in high demand.

The Trust Boundary: Training as a Security Gate

The training process is a boundary where trust matters.

If someone poisons your training data—inserting malicious examples or corrupting labels—the model learns the poisoned patterns. It becomes a poisoned model. The model doesn’t know it learned the wrong thing. It’s confident. It just works based on what it saw.

This is especially dangerous with self-supervised learning and large language models. An LLM trained on poisoned text learns “facts” that are false, and those falsehoods get baked into billions of parameters. The model has “memorized” the corruption.

This is why training data provenance (knowing where it came from and who had access to it) matters in security-critical applications.

Bringing It Together

Training is straightforward in outline: prepare data → measure loss → calculate gradients → adjust weights → repeat. But this simple loop, repeated millions of times on billions of examples, produces systems that can recognize patterns humans barely see.

The key to good models isn’t fancy mathematics. It’s clean data, a sensible loss function, and patience.

Supervised, Unsupervised, and Reinforcement Learning: What’s the Difference?

Chady — Mon, 04 May 2026 04:30:56 GMT

Machine learning isn’t one monolith. The way an AI system learns depends entirely on what data you have and what problem you’re solving. There are three main categories—supervised, unsupervised, and reinforcement learning—each built on a different principle.

Supervised Learning: Learning With a Teacher

Supervised learning works exactly as it sounds: the model learns from examples labeled with the correct answers.

You show the model thousands of emails marked “spam” or “not spam.” You show it thousands of medical images with a diagnosis already attached. You show it credit card transactions labeled “fraud” or “legitimate.” The model sees the input (the email text, the image, the transaction details) paired with the correct output, and learns to predict that output for new, unseen data.

This is the workhorse of applied AI. If you have labeled data, supervised learning is usually your first choice.

Real example: A bank wants to detect fraudulent transactions. They have historical data: millions of past transactions, each marked as either fraud or legitimate. The bank trains a supervised model on this data. When a new transaction arrives, the model predicts “fraud” or “legitimate” based on patterns it learned from the labeled examples.

Supervised learning does have a catch: someone has to label the data. For simple cases like emails (spam filters were manually curated for years), that’s feasible. For medical imaging, you need expert radiologists. Labeling is expensive, time-consuming, and sometimes requires domain expertise. And if the labels are wrong, the model learns the wrong thing—a vulnerability we’ll return to later.

Unsupervised Learning: Finding Patterns Without Answers

Unsupervised learning flips the script. You give the model unlabelled data and say: “Find patterns.”

The model isn’t trying to predict a specific output. It’s trying to discover structure. It might cluster customers into groups based on their shopping behaviour without being told what those groups should be. It might identify which transactions look weird compared to the crowd—potential fraud or system errors. It might compress images into a smaller representation that captures the essential structure while discarding noise.

Because there’s no “correct answer,” unsupervised learning is messier to evaluate. You have to decide whether the patterns the model found are useful. But it’s powerful when you have tons of unlabelled data and want to explore it without predefined categories.

Real example: An e-commerce platform has millions of user sessions but hasn’t manually categorised them. They run unsupervised clustering and discover that users naturally group into three distinct patterns: bargain hunters (frequent price checking), comparison shoppers (research-heavy), and impulse buyers (quick checkout). The platform never labelled these groups—the model found them.

The trade-off is looser control. You can’t easily specify what patterns you want to find. The model might find patterns that are statistically real but not useful for your business. It takes experimentation.

Reinforcement Learning: Learning Through Reward and Penalty

Reinforcement learning is the third path: the model learns by interacting with an environment and receiving rewards or penalties for its actions.

There’s no labelled training set. Instead, imagine a game-playing AI. It makes a move, sees the result, and gets a reward (if the move was good) or a penalty (if the move was bad). Over millions of games, it learns which moves tend to lead to victory. It never saw examples of “the correct move”—it discovered them through trial and error, guided by the reward signal.

Reinforcement learning powers game-playing systems like AlphaGo. It’s used in robotics (robots learn to walk by trial and error, getting rewarded for forward progress). It’s used in recommendation systems where the “reward” is whether a user clicks on a recommendation.

The catch: you have to design the reward carefully. If your reward signal is poorly designed, the system might find creative—and useless—ways to maximise it. An AI tasked with moving as fast as possible might learn to spin in circles instead of reaching the goal. We call this “reward hacking.”

The Variants: Semi-Supervised and Self-Supervised

Two hybrid approaches deserve mention.

Semi-supervised learning uses a mix of labelled and unlabelled data. When labelling is expensive, you label a small portion of your data, then use unsupervised techniques on the unlabelled portion to improve your model’s performance. It’s a practical compromise.

Self-supervised learning is newer and increasingly important. The model generates its own labels from structure in the data. For example, if you’re training on text, you might mask out a word and ask the model to predict it. No human labeller needed. Modern large language models (LLMs) are trained this way: they learn by predicting the next word in a sentence, which is an automatically-generated label that requires no human effort. This approach has made scaling possible.

Security: The Dark Side of Each Approach

Each learning paradigm has its own vulnerabilities.

In supervised learning, if an attacker poisons the labelled data—inserting examples with incorrect labels—they corrupt the model’s understanding. Imagine a spam classifier that’s been fed mislabelled emails by an attacker. It learns the wrong patterns.

In unsupervised learning, if you know the clustering boundaries the model uses, you can craft data to evade detection. An anomaly detector identifies outliers based on distance from cluster centres. If an attacker knows those centres, they can craft a transaction or behaviour that hides inside a normal cluster.

In reinforcement learning, an attacker can exploit the reward system itself. If the system values speed and an attacker can trigger rewards in unintended ways, the AI chases those rewards instead of the intended goal.

In self-supervised learning, poisoning the training data has a subtle but serious effect: the model learns corrupted structure and the falsehoods become baked into its weights. An LLM trained on poisoned text learns to “know” things that aren’t true.

So Which One Do I Use?

There’s no universal answer. The choice depends on what data you have, what problem you’re solving, and what kinds of errors you can tolerate.

Use supervised learning when you have labelled data and a clear prediction target.
Use unsupervised learning when you want to explore unlabelled data or detect anomalies without predefined categories.
Use reinforcement learning when you can simulate interaction with an environment and design a reward signal.

Most real systems use a hybrid approach. And whatever you choose, remember: the learning mechanism is a trust boundary. Poisoned data produces poisoned models.

What Is an AI Model, Actually?

Chady — Sun, 26 Apr 2026 16:34:46 GMT

An AI model is not software in the way you know software. It’s not a program with if-then statements. It’s a mathematical function with learned parameters—numbers that have been adjusted to recognize patterns in data.

Think of it like this: the architecture is the recipe structure. The weights (learned parameters) are the specific measurements tuned by tasting thousands of dishes.

Model = Architecture + Weights

The architecture is the skeleton—the layers of neurons, the way information flows through the system, and the rules that map inputs to outputs. You define the architecture. It’s the blueprint.

The weights are everything else. They’re numbers—sometimes billions of them. Each weight is a tiny adjustment that helps the model recognize patterns. You don’t define them; training does.

Here’s a concrete example. A simple image classifier might have this architecture:

Input layer (the image pixels)
Hidden layer 1 (256 neurons)
Hidden layer 2 (128 neurons)
Output layer (10 categories: cat, dog, bird, etc.)

The architecture tells you the shape. But there are millions of weights between those neurons. Those weights determine what the model actually “knows.” The same architecture trained on different data will have different weights and behave completely differently.

What a Model Actually Does

A model takes input and produces output. Here are some real examples:

Image model: you feed it a photo → it outputs a label (cat, dog, bird)
Language model: you feed it text → it outputs more text (a completion, an answer, a translation)
Audio model: you feed it sound → it outputs a transcript or classification
Tabular model: you feed it a row of numbers → it outputs a prediction (will this customer churn?)

The model doesn’t “think” in the way humans do. It doesn’t have reasoning or understanding. It’s a statistical function. Given input X, it produces output Y based on patterns it learned from training data.

For a language model like ChatGPT, the input is text. The model predicts the next word based on the previous words. Then it predicts the next word after that. And so on. Each prediction is a probability distribution over possible words.

It sounds simple because it is simple. The magic (and the mystery) comes from scale. Billions of parameters adjusted on trillions of words produce a system that appears to understand language. It’s actually pattern matching at extraordinary scale.

The Model File: Just Weights

When you download or run a model, what you’re actually getting is a file containing all those learned weights. Common formats include .pkl (pickle), .safetensors, .pth (PyTorch), or .bin (HuggingFace).

Inside that file: weights. Billions of decimal numbers. That’s the entire model. The architecture is usually defined separately (in code), but the weights are the actual learned knowledge.

This matters more than you might think. That model file is the system. If someone modifies the weights—even slightly—the model’s behavior changes. If a weight is corrupted, the output becomes unreliable. If a weight is deliberately tampered with, the model can be made to misbehave.

This is why the security of model files matters. An untrustworthy source for a model file is untrustworthy, full stop.

Why Model Files Can Be Dangerous

Pickle files (.pkl) deserve special mention because they can execute code when loaded. This is a legacy of how Python pickle works—it was designed to serialize arbitrary Python objects, including functions. An attacker can craft a malicious pickle file that runs code the moment you load it.

If you download a model in pickle format from an untrusted source and load it, you’re potentially running arbitrary code. Safer formats like .safetensors don’t have this vulnerability; they only contain numbers.

Models Are Not Programs

This is the mental shift that matters. A traditional program has logic you can read: function calls, conditionals, loops. A model has none of that. You can’t open a large language model and read “here’s where it decides whether to be helpful.” The behavior emerges from the weights.

This means:

Models are harder to audit. You can’t trace a decision path like you can in code.
Models are harder to explain. You can’t point to a line and say “this caused the output.”
Models fail in unexpected ways. They don’t fail because of a bug in your if-then logic; they fail because the pattern they learned doesn’t generalize.

The Practical Reality

In practice, when you use ChatGPT or Claude, you’re downloading (or accessing via API) a model file with billions of weights. The companies behind those models spent months training them on massive amounts of text using specialized hardware. Then they saved the weights to a file.

When you type a question, that file (the weights) processes your text through its learned patterns and produces an answer. The answer reflects what the model learned during training, for better and worse.

You’re not running a program. You’re querying a statistical function that’s been tuned to be useful.

What is Next

In the next post, we’ll look at different types of learning: supervised learning (where you have labels), unsupervised learning (where you don’t), and reinforcement learning (where the system learns from rewards and penalties).

For now, the key insight: an AI model is a mathematical function with parameters learned from data. The architecture is the shape. The weights are the knowledge. The model file is the saved state of that knowledge. Understanding this separates mystique from reality.

How Did We Get Here? The 70-Year History of AI in 5 Minutes

Hackerspot Team — Mon, 20 Apr 2026 22:04:46 GMT

AI didn’t arrive overnight. The field spent decades in the valley before climbing back out. Understanding where we came from explains why the present moment is actually different.

We’re Going to Solve Thinking (1950s–1970s)

In 1956, researchers at Dartmouth Summer Research Project coined the term “artificial intelligence.” They were optimistic—maybe too optimistic. The idea was that you could program a computer to reason like a human: give it rules and logic, and it would solve problems.

This “symbolic AI” approach ruled for decades. Engineers would manually write rules: if X, then Y. If the weather is rainy, then bring an umbrella. Simple. Clean. Wrong about almost everything complex.

By the 1970s and 1980s, reality had landed hard. The systems couldn’t handle the messiness of real data. They broke on edge cases. Funding evaporated. This first “AI winter” lasted years—not because the researchers were incompetent, but because the promise had outrun the technology.

The lesson: Hype without compute is just noise.

The Rise and Stall of Statistical Learning (1980s–2000s)

The field pivoted. Instead of hand-coding rules, why not let data teach the system? This was the birth of machine learning, statistical methods capable of learning patterns from examples.

By the 1990s and 2000s, these methods worked. Banks deployed neural networks to read handwritten checks. Spam filters learned what junk email looked like. Kaggle competitions crowned winners with algorithms called Gradient Boosting Machines (GBMs), statistical models that combined weak predictors into strong ones.

But progress stalled again. These methods were narrow: a model trained to recognize faces couldn’t suddenly translate English. Each task needed its own hand-engineered pipeline. The systems were brittle.

This wasn’t hype this time—the math worked. The problem was computing. Good statistical learning needs a lot of data, but good deep learning needs vastly more. CPUs couldn’t keep up.

The Deep Learning Inflection: 2012 and Beyond

Then GPUs happened.

In 2012, a team used graphics processors (hardware originally designed for video games) to train a deep neural network on image recognition. The network was called AlexNet. It crushed the competition, cutting error rates nearly in half. The jump was so large that the field collectively paused and said, “Oh. That’s what we’ve been waiting for.”

Deep learning worked because it scaled. More layers, more parameters, more compute. And crucially, with enough data and enough compute, you didn’t need engineers to hand-craft features. The network learned what to look for.

By the mid-2010s, deep learning was everywhere: computer vision, speech recognition, and machine translation.

Researchers noticed something: a new architecture called Transformers (introduced in a 2017 paper titled “Attention Is All You Need”) worked even better. Unlike previous models that read text one word at a time from left to right, Transformers could process entire sequences simultaneously. This "parallelization" allowed them to handle massive datasets with incredible speed, forming the technical foundation for everything that came next.

The Large Language Model Era: 2020 to Now

Starting in 2020, companies began scaling Transformer networks to absurd sizes. OpenAI’s GPT-3, released in 2020, had 175 billion parameters—numbers representing learned patterns. For context: a typical brain has about 86 billion neurons. GPT-3 wasn’t a brain, but it was scaled to a similar order of magnitude.

Then ChatGPT launched in late 2022. It was a GPT-3 variant, fine-tuned to answer questions in conversational English. It hit 1 million users in five days.

Since then: Claude (Anthropic), Gemini (Google), and countless others. The pattern is consistent: scale up, add more compute, train on more text, get smarter.

Why Now Is Actually Different

Here’s what matters: compute is the through-line. AI winters happened when promises exceeded compute capacity. Algorithms didn’t improve miraculously in 2012; GPUs made existing algorithms finally viable.

In 2019, researcher Richard Sutton summarized this shift in an essay titled “The Bitter Lesson.” His point was a blow to human ego: general methods that leverage massive computing always beat “clever” approaches where humans try to bake their own knowledge into the system. The field spent 70 years trying to be smart; it turns out that being “big” was the more effective strategy.

This is why 2020–2025 feels different: we have the compute. We understand the architecture. We have enough data. The constraint that killed AI twice before,” we don’t have enough resources to make this work,” has lifted.

The Cost of Progress: New Vulnerabilities

Each wave of AI introduced new security surfaces. Symbolic AI could fail in obvious ways. Statistical models were opaque but narrowly scoped. Deep learning is opaque and scaled to billions of parameters.

A model file containing billions of learned weights is now the system. Because these systems are pattern-matchers rather than reasoners, they lack an internal “truth check.” This has led to vulnerabilities such as Prompt Injection, in which a model is tricked into ignoring its safety guidelines. As we head into 2026, the threat has evolved into Indirect Prompt Injection, in which an AI can be subverted simply by reading a malicious website or document, turning the entire internet into a potential attack surface.

The attack surfaces keep evolving. So does the defense.

The Actual Arc

The 70-year history of AI is not a genius suddenly striking. It’s: promise, failure, reset, waiting for hardware, breakthrough, scale, repeat. Three phases: symbolic logic failed. Statistical learning stalled. Deep learning accelerated.

We’re in the deep learning phase now, and the resources have finally aligned. But the story isn’t over. As we move through 2026, the focus is shifting from raw scaling to reasoning efficiency, creating models that don’t just know everything, but can “think” through a problem before they speak. The next chapter isn’t just about more data; it’s about what we do with the intelligence we’ve finally managed to build.

What Is AI, Machine Learning, and Deep Learning?

Chady — Mon, 13 Apr 2026 21:54:43 GMT

You’ve heard all three terms. You’ve probably used them interchangeably. But AI, machine learning, and deep learning are not the same thing, and understanding the difference is the first step to understanding why AI systems are inherently fragile, how their "learning" can be turned against them, and why they often behave in ways that defy human logic

Please note that this post is the first of our AI Security series, where we bridge the gap between high-level hype and technical reality. Before we dive into the specialized vulnerabilities of these systems, we must first talk about the basics.
By establishing a clear, jargon-free understanding of how these technologies differ and how they learn, we lay the groundwork for the more complex security and architectural topics to follow in this series.

AI Is the Big Tent

Artificial intelligence (AI) is the broadest term. It refers to any system that exhibits intelligent behavior — reasoning, problem-solving, learning, or decision-making — that we’d normally associate with humans.

That definition is deliberately wide. A rule-based system that plays chess using handwritten rules counts as AI. So does a neural network that generates images from text. They’re very different technologies, but both fall under the AI umbrella.

The key idea is that AI is the goal (machine intelligence), not a specific technique.

Machine Learning Is How Most Modern AI Actually Works

Machine learning (ML) is a subset of AI. Instead of writing explicit rules, you show the system thousands (or millions) of examples, and it figures out the patterns on its own.

Think of it this way. You could write rules to identify spam email: “if the subject contains ‘FREE MONEY’, mark as spam.” But attackers adapt. Rules break. Machine learning takes a different approach: show the system 10 million emails labeled “spam” or “not spam”, and it learns to recognize the patterns itself — including patterns you never thought to write a rule for.

The core principle: ML systems generalize. They learn from past examples and apply that learning to new, unseen data. That’s what makes them powerful. It’s also what makes them fragile in ways traditional software isn’t — a topic we’ll come back to throughout this series.

Deep Learning Is ML With Many Layers

Deep learning (DL) is a subset of machine learning. It uses artificial neural networks, loosely inspired by how neurons connect in the brain, with many layers stacked on top of each other. That’s the “deep” part.

Each layer learns to recognize increasingly abstract features. In an image recognition system:

Layer 1 might detect edges
Layer 5 might detect shapes
Layer 20 might detect “cat ears.”

Deep learning is why we can now build systems that recognize faces, transcribe speech, translate languages, and generate text with remarkable fluency. It powers virtually every AI product you interact with today — from spam filters to ChatGPT.

The hierarchy, in plain terms:

Why Compute Beat Cleverness

Here’s one of the most important, and counterintuitive, lessons from 70 years of AI research.

Researchers spent decades trying to build cleverer algorithms. Handcrafting rules, encoding human knowledge, designing elegant mathematical models. And they were consistently outperformed by one simple strategy: throw more data and more computing power at a simpler approach.

Richard Sutton, a pioneer in AI research, called this “the bitter lesson” in 2019: general methods that leverage computation are ultimately the most effective, by a large margin.

What this means in practice: modern AI progress is driven less by brilliant new algorithms and more by scale — bigger datasets, more powerful GPUs, more parameters. GPT-3, the model behind early ChatGPT, has 175 billion parameters. Its successor models are larger still.

This has a direct security implication. Scale means complexity, and complexity means more attack surface. A system with 175 billion parameters is not something any human can fully inspect or understand. That opacity is a security property — and not a good one.

What AI Is Actually Good At?

A quick litmus test from the training material helps here. AI tends to work well when:

The problem isn’t already solved by simpler means
You have enough good-quality training data
Some margin of error is acceptable
The patterns you’re learning from are relatively stable over time

It tends to fail — sometimes catastrophically — when:

The situation is genuinely novel (unlike anything in the training data)
100% accuracy is required
The underlying patterns change faster than the model can be retrained
The training data was biased, poisoned, or just plain wrong

That last bullet is where security gets interesting. The training data is a trust boundary. If an attacker can influence what a model learns from, they can influence what the model does — permanently, and invisibly. More on that in Series 4.

Conclusion

AI, ML, and deep learning are not interchangeable buzzwords. They’re a nested hierarchy of increasingly specific techniques, all built on the same core idea: learn patterns from data rather than encode rules by hand.

What makes this matter for security is exactly what makes it powerful: these systems learn behaviors that nobody explicitly programmed. That means the attack surface includes the data, the training process, the model file, and the inference pipeline — not just the application code sitting on top.

The rest of this series builds the foundation you need to understand all of that. Next up: how we got from “AI” being coined as a term in 1956 to ChatGPT in 2022 — and what the detours tell us about where the real risks live.

Scaling Your Engineering Impact with Agents

Chady — Fri, 10 Apr 2026 16:30:58 GMT

We are moving past the era of the chatbot. Today, coding agents are beginning to handle the heavy lifting of implementation, but they are only as good as the engineer directing them. Much like a musical instrument, an agent can produce 'slop' or a masterpiece; the difference lies in your technique. I’ve put together a few simple shifts to help you move from writing every line of code to orchestrating the bigger picture

Access to Verification

The single most important factor in an agent’s success is whether it has access to verification. Without it, the agent is simply “guessing” based on patterns.

Provide Tool Access: Agents need to do what humans do: run the application, view logs, and perform tests.
Tighten the Feedback Loop: When an agent can see the output of its work—such as reading logs from a CI server—the quality of its code improves substantially.
Test the Tests: Agents often write code and tests at the same time, which can lead to tests that pass “by construction”. Always ask the agent to introduce a regression to ensure the test actually catches the error.

Work in “Plan Mode”

Don’t ask an agent to do everything at once. You will get better results by separating the “thinking” from the “doing”.

The Power of Plan Mode: In this mode, a system prompt strictly forbids the agent from writing code. This allows the agent to use all its resources to understand the problem and design an architecture.
Human-Led Design: You must still do the work to break down large, messy problems into small, manageable tasks. If the scope is too big, agents may confidently produce “slop”, thousands of lines of code containing hidden bugs.

System Prompt: The background instructions that tell the AI how to behave (e.g., “do not write any code”).

Manage the “Context Window”

An AI’s “memory” is known as its context window. If this window gets too full, the AI’s performance “drops off a cliff”.

The 50% Rule: Try to keep your conversation history below 50% of the context window to maintain high accuracy.
Fresh Starts: If an agent starts going in circles or hallucinating, the context is likely “corrupted”. It is often better to close the session and start a new one.
Track State in Markdown: Keep a .md file in your codebase to track project progress. This allows a new agent session to “read the file” and catch up instantly without wasting memory.

Context Window: The maximum amount of information (text and code) an AI can “remember” at one time.
Hallucination: When an AI confidently provides information that is false or incorrect.

Additional Tips for Better Results

Pick the Right Language: Agents are currently most effective with TypeScript and Go because their libraries are “source available” (the AI can read the actual code). They struggle more with the JVM (Java/Kotlin) because those libraries are often bytecode that the agent cannot read.
Use High-Quality Models: Cheaper models often waste time and tokens by spiraling or deleting code they don’t understand. Using a top-tier model often solves the problem on the first try.
Encode Skills: If you find yourself giving the same instructions repeatedly, turn them into a Skill. This is like giving the agent a permanent “how-to” guide for a specific task.

Tokens: The basic units (words or parts of words) that AI models use to process and “read” text.
Skill: A saved set of instructions that an agent can automatically use whenever it needs to perform a specific job.

Conclusion: From Code Writer to Orchestrator

The arrival of AI doesn’t minimize the need for great engineers; it changes what they focus on. In the past, value was measured by the “depth” of knowledge in a narrow niche. Today, value is shifting toward breadth.

Because the agent can handle the “depth” of implementation, the human engineer must provide the “breadth” of general knowledge. Understanding how networking, security, and architecture connect allows you to act as an orchestrator, delegating tasks while maintaining the high-level judgment that keeps the system robust.

Don’t be discouraged if your first hour with a coding agent feels clunky. It takes practice to develop the skill to use them well. Keep experimenting, keep breaking down your problems, and always give your agent a way to verify its work.

Is Your Security Team Scalable? Why LLMs are the Only Answer

Chady — Fri, 27 Mar 2026 16:31:11 GMT

Security teams have too much work and not enough time. There is a huge gap between the amount of new code being written and the number of people available to check it. I want to share how LLMs can help. We can use AI to act on your team's behalf, helping you work faster and focus on real threats.

Understanding the AI Engine

Before building AI tools, it is important to understand the technical rules that govern how these models process data. Knowing that models are stateless helps you design better systems that rely on context rather than memory.

Tokens and Context: AI reads words in small pieces called “tokens,” which represent about 3/4 of a word.
Stateless Nature: Most modern AI models are stateless, meaning they do not “learn” or change their internal weights while you are talking to them.
Memory: Because the AI is stateless, it doesn’t remember your last question; to give it “memory,” you must include the previous parts of the conversation in your new request.
Data Quality: It is better to give the AI high-quality information (context) in your prompt—sometimes up to 128k tokens—than to try and “train” or fine-tune the model itself.

Checking Projects Faster (SDLC)

The Software Development Life Cycle (SDLC) is the process of building software, and in a fast company, it can be very unpredictable. Using AI to automate the initial review of these projects allows security teams to prioritize the most dangerous changes.

Risk Scoring: You can use an AI bot to read design documents and give a “risk score” and “confidence level” to show which projects need a human expert first.
Watching Changes: If a developer changes a plan—for example, making a private tool public—the AI can see this change and raise the risk score immediately.
Passive Monitoring: AI can watch chat channels; if it sees a developer talking about a security mistake (like skipping a password check), it can alert the security team.

Managing Access (IAM)

Giving people the right permissions to use tools is often slow and creates friction for engineers. AI can simplify this by matching a user’s natural language request to the technical groups required to do their job.

Simple Language: Instead of searching for a specific technical group name, a user can describe what they need, and the AI finds the right access group for them.
Smart Approvals: AI can look at how a person usually works using “cosine similarity”; if their request looks normal for their role, it can be approved faster.
Audit Trails: All access granted through these AI tools is logged to create a clear history for security audits.

Sorting Bug Reports

If you have a “bug bounty” program, you might get thousands of reports every day, which is too much for humans to handle. AI can act as a first filter to remove noise and send real vulnerabilities to the right people.

Filtering the Noise: AI can quickly read reports and close the ones that are just complaints or “out of scope,” like missing email headers.
Directing Traffic: The AI can send payment issues to the billing team and general model errors to the safety team, so security engineers only see real technical bugs.
Improving Quality: AI can even ask the reporter for more information, like a missing URL, before a human ever has to look at the ticket.

Finding Attackers in Logs

Reviewing computer logs is a “needle in a haystack” problem where humans often get tired and miss important data. LLMs are consistently good at finding these small signs of an attack within massive amounts of noisy data.

Log Summarization: AI is great at finding one bad command hidden in thousands of lines of logs, such as a malicious one-liner used to start a reverse shell.
Interactive Remediation: If a user does something risky by accident, such as sharing a file publicly, a bot can message them to ask if it was intentional.
summarization for Defense: The AI summarizes these user conversations and sends them back to the incident response team for a final check.

Tips About Using AI

To get the best results from AI in a security context, you must move past simple trial-and-error and use data-driven methods. Following these expert tips will ensure your AI tools are helpful and accurate.

Treat it like an Expert: Always tell the AI: “You are an expert security engineer.” It will give you much better answers than if you treat it like an average worker.
Use Data, Not “Vibes”: Do not just guess whether the AI is working; use an “Evaluation Framework” with known-good answers to check the AI and improve your prompts.
Self-Correction: You can even use a second, smaller AI model to check the answers of the first model to ensure they are correct.
Keep Humans Involved: AI is not perfect and can “hallucinate” (make things up). A human should always be “in the loop” to review disputes or make high-stakes decisions.

Using these tools is easier than you think. By using AI for the “boring” parts of security, you allow your human experts to focus on the most important work.

Moving Software Security from “Human Speed” to AI

Chady — Fri, 13 Mar 2026 16:30:40 GMT

The AI hype is going full speed, and we are currently losing the race against hackers. While attackers use fast, automated tools to find flaws, we still rely on people to fix them by hand. This creates a dangerous gap. We can no longer manage security manually; we need AI agents that can think and act instantly. It is time to move from a slow, human process to a fast, machine-driven defense.

The reality of modern software is that it is growing too fast for humans to manage. We have millions of lines of code, constant updates, and new threats appearing every hour. Traditional security, where a human finds a bug, writes a fix, and tests it manually, is simply too slow. We are operating at “human speed” in a world that demands “machine speed.”

Today, I want to share a vision for an approach called Autonomous Security. This is the idea that we can use AI agents to automatically find and fix vulnerabilities, with higher quality than even the best human experts.

Finding Vulnerabilities with “Reasoning”

The biggest problem with traditional security scanners is that they aren’t “smart.” They look for patterns, but they don’t understand how code actually works. This leads to thousands of “false alarms” that waste our engineers’ time.

The idea we are moving toward involves an Agentic Reasoning Loop. Instead of a simple scan, we use an AI agent that acts like a researcher:

It makes a hypothesis: “I think there is a flaw in how this data is processed.”
It uses real tools: The AI uses debuggers and code browsers to test its theory.
It proves the flaw: the agent doesn’t report a bug unless it can actually cause the program to fail (a “crash verification”).

By requiring proof, we achieve zero false positives. We only focus on real, verified threats.

The “Self-Healing” Codebase

Finding a bug is only half the battle. The hardest part of my job is fixing a vulnerability without breaking the rest of the product. This is why many security patches take months to release.

We are now exploring a Rigorous Validation Pipeline for autonomous fixing. When the AI finds a flaw, it creates a “patch” and puts it through a gauntlet of tests:

Dynamic Analysis: Does the fix actually close the security hole?
Static Analysis: Does the new code follow our safety standards?
Differential Testing: Does the software still behave exactly the same for the end user?

By automating this validation, we can move from a months-long patching cycle to a minutes-long cycle. The software essentially begins to “heal” itself.

Shifting from Reactive to Proactive

Most security work today is reactive—we fix things after they are broken. I believe the future of this field is proactive hardening.

This vision has three parts:

Hardening: Automatically adding defensive layers to code as it’s being written.
Auto-Mending: Using AI to clean up old, “legacy” codebases that haven’t been touched in years.
Secure Generation: Training our AI models to write “secure-by-default” code, so the bugs never exist in the first place.

Why This Idea Changes Everything

The goal isn’t just to make developers faster; it’s to eliminate the “security debt” that every company carries. By combining the reasoning power of AI with strict, automated testing, we can create a digital world where vulnerabilities are the exception, not the rule.

We are entering an era where our defense is finally as fast as the code we create.

Let's Talk About the Security of AI Agents

Chady — Sat, 13 Dec 2025 05:14:25 GMT

AI is moving into a phase where it no longer just answers — it acts. LLM-driven AI agents are beginning to operate like autonomous digital workers, taking multi-step actions, interacting with live systems, and modifying environments without continuous human supervision.