This article was originally posted on Substack here

Why understanding senders, receivers, and referenced entities is becoming critical for effective data security and AI governance

For years, data security programs have been built around one central question: What type of data is this?

Is it PII or PHI?
Is it PCI data?
Is it confidential?
Does it match a known pattern?

This what-centric approach made perfect sense in a world of structured systems and relatively predictable data flows. Pattern matching, regex, and classification engines were the right tools for the job, and for a long time they delivered meaningful value.

But the environment we operate in today looks very different.

AI systems now generate and remix content at scale. SaaS platforms have multiplied the number of data pathways inside the enterprise. Collaboration tools have blurred traditional boundaries. And autonomous agents are beginning to act on behalf of users in ways that were hard to imagine just a few years ago.

In this new reality, something important is becoming clear.

The real question is no longer just what the data is.

It is increasingly who is involved.

3dcfc6fb-3a32-4ec0-8023-c80da0c29abf_1536x1024

What “The Who” Actually Means

When we talk about “the who,” we are not referring only to user identity in the narrow sense.

In modern data flows, the risk context typically spans three interconnected dimensions: who is sending or sharing the data, who is receiving it, and which people, customers, or business entities are referenced inside the content itself.

All three matter.

You can have the right sender and the right data classification and still create a serious issue by sending the information to the wrong recipient. Just as importantly, the sender and receiver may both be legitimate, yet the content itself may refer to the wrong customer or account. In regulated industries, that distinction alone can determine whether something is routine business or a reportable incident.

Traditional controls rarely reason across these three dimensions simultaneously. Increasingly, that is exactly where real risk lives.

When the Data Is “Correct” — but the Outcome Is Not

One of the most common patterns emerging in modern environments is deceptively simple: the data classification is technically correct, and the organization still has a problem.

From a risk and compliance perspective, the exact same piece of data can be perfectly acceptable in one context and a serious issue in another. The difference often comes down entirely to the entity context surrounding the transaction.

Consider a familiar scenario. A document containing customer PII is sent to the correct policyholder. That is normal business. The same document shared internally with an underwriter may also be entirely appropriate. But send that identical content to a personal Gmail account or to the wrong customer, and the situation changes immediately — sometimes dramatically.

Nothing about the data itself has changed. What changed is the who.

This pattern is becoming more pronounced as data moves faster and more autonomously across email, SaaS platforms, copilots, and emerging AI agents.

Why Regulations and Trust Are Fundamentally About “Who”

There is another reason this shift matters.

If you step back and read most privacy and industry regulations carefully, a consistent theme emerges. The rules are rarely written purely around abstract data types. Instead, they are fundamentally concerned with whose data must be protected (in many regulations they are referred to as “data subjects”), combined with what type of information is involved.

Whether the context is insurance, financial services, healthcare, or broader privacy regimes, the regulatory obligation almost always attaches to a specific person, customer, account holder, or business entity. The same is true for customer trust. Organizations do not lose trust because they mishandled “a pattern.” They lose trust because they exposed someone’s data.

This is one of the quiet mismatches in many legacy data security approaches. Controls became very good at recognizing patterns, but much less capable of reasoning about the business relationships and entity context that regulations and customers actually care about.

The Practical Impact Inside Security Programs

As organizations begin to incorporate entity context more deeply into their controls, the operational effects tend to show up quickly.

One of the first improvements many teams notice is in policy precision. Instead of relying on broad, pattern-triggered rules, security teams can begin to express guardrails that reflect how the business actually operates — who is allowed to share which customer data, under what circumstances, and with which counterparties – in other words, enabling truly differentiated policies.

That added context also tends to reduce one of the longest-running pain points in data security: false positives. When controls understand not just what the data looks like but whether the sender-receiver-entity relationship makes sense, large volumes of benign activity can be safely ignored. The signal-to-noise ratio improves, and alert fatigue becomes more manageable.

Just as important — and often less discussed — is the impact on false negatives. Pattern-only detection frequently misses situations where the data appears normal but the business context is wrong: the right information sent to the wrong customer, cross-account leakage, or AI-generated content that inadvertently mixes entities. Incorporating the “who” dimension expands coverage into classes of risk that traditional approaches struggle to see.

Investigation and attribution also become more efficient. When an incident does occur, security teams can move more quickly to answer the questions that matter most: whose data was involved, who sent it or made it available, who received it, and how the exposure actually unfolded.

Lower Friction for Humans, Not Just Better Signals for Tools

There is another benefit that tends to resonate strongly with the business: user experience.

Traditional pattern-based controls often interrupt legitimate work because they lack sufficient context. Employees encounter unnecessary blocks, repeated justification prompts, and manual override workflows that feel disconnected from how the business actually operates. Over time, this creates real friction and, in some environments, quiet workarounds.

Entity-aware controls are inherently more selective. Because they can reason about who is involved and whether the action makes business sense, they are better positioned to allow legitimate workflows to proceed smoothly while still stopping true risk.

The outcome is a combination that has historically been difficult to achieve at scale: stronger security posture, lower operational burden on IT and security teams, and a noticeably better experience for end users.

Why AI Is Accelerating the Shift

If this trend were only about email and SaaS, it would already be important. AI is making it urgent.

Copilots and autonomous agents aggregate data across silos, generate new derived content, and increasingly act on behalf of users at machine speed. In that environment, knowing only what the data looks like is no longer sufficient. Organizations need to understand whose data is being accessed, which entity relationships are in play, and who will ultimately receive the output.

Without that visibility, blind spots grow quickly.

Just as important in AI-driven environments is the question of attribution. As agents increasingly act on behalf of users, the line between who initiated an action, who authorized it, and who ultimately exposed the data can become blurred. Without strong attribution, security teams may detect that something happened yet still struggle to answer the most important questions: who was responsible, whose authority was used, and whether the action was legitimate in context. In the age of autonomous systems, understanding “the who” must include not only the entities in the data flow, but also clear attribution of the actions themselves.

This is why entity awareness is rapidly moving from a niche capability to a foundational requirement for governing AI-driven data flows safely.

At Bonfy.AI, this shift toward entity- and attribution-aware controls is a central focus of our platform and customer work.

Final Thought

For a long time, data security was largely treated as a pattern-matching problem.

Going forward, it is increasingly a context and entity problem.

The future of effective data security — and durable customer trust — will depend not only on understanding what the data is, but on deeply understanding who it belongs to, who is moving it, and who is supposed to see it.

The future of data security is not just about the what.

It is about the who.