Introducing Harbinger: An Observe-Only Telemetry Profiler for Microsoft Sentinel

Over the past 18 months, the AI conversation has largely centred on capability.

Which model is best? Who has the largest context window? Which benchmark has improved?

Those questions have pushed the industry ahead, but I believe the conversation is beginning to change.

My view is that H2 of 2026 won’t simply be about building more capable AI systems. It will increasingly be about building AI systems that are economically sustainable, operationally efficient, fit for purpose and governed.

Recent announcements only reinforce that direction.

With the general availability of Microsoft Copilot Cowork and the move to usage-based billing for Frontier customers from 1 July 2026, organisations are beginning to rethink their AI workloads. Rather than simply licensing AI capability, they now need to consider the cost of every long-running task, every tool invocation, every context retrieval and every model decision, as well as who can use these models.

To me, this isn’t just a licensing change. It’s a signal that AI is entering a new phase.

Over the past decade, we learned how to optimise cloud compute, storage and networking. I believe the next operational discipline will be understanding and optimising AI itself, not just tokens.

Inference.

Context windows.

Agent execution.

Latency.

Model selection.

The economics of AI.

The popularity of agentic infrastructure, caching, RAG and token compression shows how quickly this landscape is improving. For many organisations, especially those moving from experimentation into production, the question is no longer:

Can we use AI?

It’s becoming:

How do we use AI in the most effective and economical way?

This is already happening, and it’s reassuring that Rob Simms, Chief Technologist at CDW, shares the same view; you can read his article here. I don’t believe the answer will always be “use the largest model available.” I think we’ll see a layered approach emerge:

Small local models handling routine, repeatable work.

Frontier models reserved for genuinely complex reasoning.

Traditional deterministic software continuing to solve problems that don’t need AI at all.

This isn’t about replacing cloud AI. It’s about using the right model, in the right place, for the right workload.

AI in Security

Security is an intriguing example of AI application. Much of the telemetry we process every day is repetitive, highly structured and, for the most part, predictable. It often doesn’t require frontier-scale reasoning, but it does require speed, privacy, predictable operating costs and, increasingly, local processing. One question kept coming back to me, and it shaped the concept:

Could a small, locally hosted model help security teams understand their telemetry before it ever reached their SIEM?

That question became Harbinger.

What is Harbinger?

Harbinger is an open-source (Apache 2.0), observe-only telemetry profiler designed to help organisations better understand their security telemetry before making ingestion decisions.

Rather than sitting directly in the production log path, Harbinger operates alongside it.

It observes mirrored telemetry, analyses event characteristics using locally hosted language models through Ollama, and produces recommendations that help security teams understand:

  • What data they are receiving
  • How that data is being classified
  • Which Microsoft Sentinel tier may be most appropriate
  • Where optimisation opportunities may exist
  • The possible financial impact of ingestion decisions

Importantly, Harbinger never modifies, filters, routes or interferes with production telemetry.

Its role is observation, not enforcement.

Why Build It?

Throughout numerous Microsoft Sentinel deployments, migrations and optimisation engagements, one challenge has consistently surfaced.

Telemetry decisions are often made before organisations truly understand their telemetry.

The typical journey looks something like this:

  1. Connect a data source.
  2. Ingest everything.
  3. Receive the bill.
  4. Start asking which data is actually valuable.

By that point, the telemetry has already been collected, stored and paid for.

Harbinger explores a different approach:

  1. Mirror the telemetry.
  2. Understand it.
  3. Profile it.

Then make informed decisions about what belongs in Analytics Logs, Basic Logs or Auxiliary Logs. Rather than optimising after the cost has been incurred, optimise before the decision is made.
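To make the “optimise before the decision” argument concrete, here is a minimal sketch of the arithmetic in Python. The tier names come from the text above; the per-GB prices, the daily volumes, the source names and the function names are placeholders I have made up for illustration. They are not Microsoft’s pricing and not Harbinger’s actual code.

```python
from dataclasses import dataclass

# Placeholder prices per GB, for illustration only. These are NOT real
# Microsoft Sentinel prices; substitute the figures from your own agreement.
PLACEHOLDER_PRICE_PER_GB = {
    "Analytics": 5.00,
    "Basic": 1.00,
    "Auxiliary": 0.20,
}


@dataclass
class SourceProfile:
    """One telemetry source, its observed daily volume and its target tier."""
    name: str
    gb_per_day: float
    tier: str


def monthly_cost(profiles, prices=PLACEHOLDER_PRICE_PER_GB, days=30):
    """Sum volume x days x price-per-GB across all sources."""
    return sum(p.gb_per_day * days * prices[p.tier] for p in profiles)


# Scenario 1: connect the source and ingest everything into one tier.
ingest_everything = [
    SourceProfile("firewall", 40.0, "Analytics"),
    SourceProfile("dns", 25.0, "Analytics"),
    SourceProfile("auth", 5.0, "Analytics"),
]

# Scenario 2: profile first, then place each source in a tier (hypothetical choices).
profiled_first = [
    SourceProfile("firewall", 40.0, "Auxiliary"),
    SourceProfile("dns", 25.0, "Basic"),
    SourceProfile("auth", 5.0, "Analytics"),
]

before = monthly_cost(ingest_everything)
after = monthly_cost(profiled_first)
print(f"Ingest everything: {before:.2f} per month")
print(f"Profiled first:    {after:.2f} per month")
```

The numbers themselves are meaningless; the point is that the comparison can only be made once you know the volume per source, which is exactly what profiling a mirrored feed gives you.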

The Observe-Only Principle

The most important design decision in Harbinger is also its simplest.

It is intentionally observe-only.

Harbinger does not:

  • Modify logs
  • Filter logs
  • Route logs
  • Suppress logs
  • Replace your SIEM
  • Automatically optimise Sentinel

Instead, it watches a copy of your telemetry and generates recommendations. This design boundary is entirely deliberate: the moment a tool becomes responsible for handling production security telemetry, it also becomes part of the operational risk. By remaining outside the production ingestion path, Harbinger can deliver valuable insights without becoming another critical dependency.

For me, that’s a far more comfortable place for AI to operate.

How It Works

At a high level, the architecture is intentionally straightforward.

                Telemetry Sources
                        │
                        ▼
              Mirrored Syslog Feed
                        │
                        ▼
                    Harbinger
                        │
        ┌───────────────┼───────────────┐
        │               │               │
        ▼               ▼               ▼
 Classification    Aggregation      Profiling
        │               │               │
        └───────────────┼───────────────┘
                        ▼
              Recommendation Engine
                        │
                        ▼
             Markdown & JSON Reports

Telemetry is received, classified, aggregated and analysed locally before producing recommendations that can inform Microsoft Sentinel design decisions. No production telemetry is altered. No ingestion path is modified.
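The flow described above can be sketched in a few lines of Python. This is not Harbinger’s code: the keyword rules stand in for the local model’s classification, and the rule set, the category-to-tier mapping, the sample events and the function names are all hypothetical. It only shows the shape of classify, aggregate and report on a read-only copy of the data.

```python
import json
from collections import Counter

# Hypothetical keyword rules standing in for the local model's classification.
RULES = [
    ("authentication", ("sshd", "login", "failed password")),
    ("firewall", ("deny", "drop", "allow")),
    ("dns", ("query", "named")),
]

# Hypothetical category-to-tier suggestions, for illustration only.
SUGGESTED_TIER = {
    "authentication": "Analytics",
    "firewall": "Basic",
    "dns": "Auxiliary",
    "unclassified": "Review manually",
}


def classify(line: str) -> str:
    """Return the first category whose keywords appear in the event."""
    lowered = line.lower()
    for category, keywords in RULES:
        if any(keyword in lowered for keyword in keywords):
            return category
    return "unclassified"


def profile(lines):
    """Aggregate event counts and byte volume per category.

    The input is only read; nothing is filtered, rewritten or forwarded.
    """
    events = Counter()
    volume = Counter()
    for line in lines:
        category = classify(line)
        events[category] += 1
        volume[category] += len(line.encode("utf-8"))
    return {
        category: {
            "events": events[category],
            "bytes": volume[category],
            "suggested_tier": SUGGESTED_TIER[category],
        }
        for category in sorted(events)
    }


# Made-up sample events representing a mirrored syslog feed.
sample = [
    "<38>Jan 10 10:00:01 host1 sshd[411]: Failed password for root from 10.0.0.5",
    "<134>Jan 10 10:00:02 fw1 kernel: DROP IN=eth0 SRC=10.0.0.9 DST=10.0.0.1",
    "<134>Jan 10 10:00:03 fw1 kernel: DROP IN=eth0 SRC=10.0.0.7 DST=10.0.0.1",
    "<30>Jan 10 10:00:04 ns1 named[88]: query: example.org IN A",
    "<13>Jan 10 10:00:05 host2 cron[9]: job finished",
]

report = profile(sample)
print(json.dumps(report, indent=2))
```

The output is a recommendation for a human to review, which keeps the sketch consistent with the observe-only boundary: the function returns a report and has no way to touch the source feed.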

Why Local AI?

One of the earliest design decisions was to keep inference local. Harbinger currently uses Ollama to host lightweight language models on-premises.

That provides several benefits.

  • Sensitive telemetry remains local.
  • There are no cloud AI inference costs.
  • Latency is predictable.
  • Organisations retain complete control over processing.
  • Security teams can experiment without introducing another external dependency.

The AI isn’t the product.

It’s simply another tool in the pipeline.
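For readers who haven’t used Ollama, here is a rough sketch of what a local classification call can look like using only the Python standard library. It assumes Ollama is running on its default local port and uses its /api/generate endpoint with streaming turned off. The model name, the label set, the prompt wording and the function names are my assumptions for illustration, not Harbinger’s actual implementation.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "llama3.2"  # assumption: any small model you have already pulled locally
CATEGORIES = ("authentication", "firewall", "dns", "other")  # hypothetical labels


def build_request(event: str) -> urllib.request.Request:
    """Build the HTTP request for one event without sending it."""
    prompt = (
        "Classify this syslog event into exactly one of: "
        + ", ".join(CATEGORIES)
        + ". Reply with the label only.\n\nEvent: "
        + event
    )
    body = json.dumps({"model": MODEL, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_label(raw: bytes) -> str:
    """Extract the label from the response, falling back to 'other'."""
    text = json.loads(raw)["response"].strip().lower()
    return text if text in CATEGORIES else "other"


def classify_with_ollama(event: str, timeout: float = 30.0) -> str:
    """Send one event to the local model; the event never leaves the host."""
    with urllib.request.urlopen(build_request(event), timeout=timeout) as resp:
        return parse_label(resp.read())
```

With a model pulled and Ollama running, `classify_with_ollama("<38>Jan 10 10:00:01 host1 sshd[411]: Failed password for root")` returns one of the labels above. Treating any unexpected reply as "other" is a deliberate choice: small models don’t always follow the instruction exactly, and the rest of the pipeline shouldn’t have to care.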

Beyond Microsoft Sentinel

Although Harbinger currently focuses on Microsoft Sentinel recommendations, I increasingly think the underlying idea is much broader. The more I worked on the project, the more I found myself thinking about Telemetry Intelligence. Not simply collecting telemetry, not simply storing telemetry, but understanding telemetry before making operational and financial decisions.

Today, that output targets Microsoft Sentinel; tomorrow, there’s no reason similar concepts couldn’t support other SIEM platforms, security data lakes, or detection engineering workflows.

Open Source

Harbinger is available as an open-source project because I believe these ideas improve through discussion and experimentation.

If you’re a:

  • Security Architect
  • SOC Analyst
  • Detection Engineer
  • Microsoft Sentinel Consultant
  • Open-source contributor

I’d genuinely welcome your thoughts.

Particularly around:

  • Classification approaches
  • Validation methodologies
  • Performance benchmarking
  • Alternative model selection
  • Future use cases

What’s Next?

The current focus is on getting Harbinger into the community’s hands and helping the project mature.

Areas I’m particularly interested in include:

  • Improving performance
  • Improving documentation
  • Expanding benchmark testing
  • Building richer sample datasets
  • Measuring classification correctness
  • Investigating additional telemetry sources

Final Thoughts

The security industry has become exceptionally good at analysing telemetry after it has been collected. Perhaps the next challenge is understanding telemetry before we decide to collect it.

Harbinger is my effort to explore that idea. I think it’s ready for release, although some areas still need improving and others still need building.

Whether it becomes a useful tool, a wider framework for telemetry intelligence or simply sparks a few interesting conversations, I hope it inspires us to think differently about how AI, economics and security engineering intersect.
