@ddamenova
Last active May 13, 2026 14:00

IRQL - Incident Response Query Language

A collection of Kusto (KQL) functions that unify security logs behind a consistent, analyst-friendly dialect. IRQL encapsulates query logic in repeatable chunks, hides cluster/database locations and join keys, and projects disparate source schemas into a single, predictable schema. In addition, it expresses query logic as semantic intent via function naming. These functions were created by Saar Ron, John Lambert, and Diana Damenova.

These functions were authored alongside the Lift to Graph functions (Lift_To_Graph, Graph_Render_View, Graph_Fold_By_Property) and are designed to compose with them. Many of the IRQL primitives have a tabular form and a graph-lifted form, so the same logic drives both relational hunts and visual graph investigations.

Why IRQL?

KQL is a phenomenal tool for analyzing large quantities of data, but queries can get verbose quickly:

  • Schema drift across tables. The same concept shows up as ipAddress, IPAddress, IpAddress, ClientIp, callerIpAddress, cip. Timestamps appear as Timestamp, TIMESTAMP, ReportTime, env_time, EventTime, FirstSeen. Analysts waste cycles remembering which spelling applies where.
  • Cluster and database sprawl. Knowing where each table lives - and which join keys connect them - is tribal knowledge that doesn't scale.
  • Repeated, brittle join logic. The same enrichment patterns get copy-pasted across queries and silently break when upstream schemas shift.
  • Wall-of-text queries. Heavy copy/paste and arcane column parsing make queries long, repetitive, and hard to review.
  • High cognitive load for humans and LLMs alike. Thousands of tables with inconsistent naming is hostile to analyst onboarding and to AI-assisted authoring.

IRQL addresses this by hiding all of that complexity behind stable, intention-revealing functions:

  • Repeatable selectors encapsulate cluster/database location, join keys, and projection.
  • A unified projected schema (EnvTime, ClientIp, Username, Hostname, Url, …) makes downstream queries look the same regardless of source.
  • Function names describe intent, not mechanics. Enrich_Ip_Employee, Extract_Email_Sender_Domain, Get_Event_Authentication - a pipeline reads as a sequence of meaningful operations rather than a wall of joins, projections, and regexes. Queries are shorter, easier for humans to reason about, and easier for an AI to compose correctly because the names describe what's happening semantically.
  • A single consistent dialect. Once you learn one selector, every other selector reads the same way.
  • AI-ready by construction. A cohesive domain-specific dialect is dramatically easier for an LLM to generate correctly than raw telemetry, and analyst work written in IRQL becomes higher-quality training signal.
  • Composes with Kusto itself. IRQL is Kusto - every function interoperates cleanly with existing queries, dashboards, and detection pipelines.

Compounding benefits: faster analyst onboarding, faster query authoring, queries that better reflect their semantic intent, less copy-pasted logic, and a shared vocabulary that both humans and AI can read and write.

Function Catalog

IRQL functions fall into five groups:

  1. Selectors - Get_* functions. Return a projected, renamed view of the underlying table using the unified schema.
  2. Extractors - Extract_* functions. Transform fields from existing columns (domain from URL, first name from full name, domain from email sender).
  3. Enrichers - Enrich_* functions. Take a table with a known key column and left-join it against a primitive to add context.
  4. Graph-lifted variants - Extract_Node_*, Enrich_Node_*, Enrich_Graph_*. The same primitives, but operating over a graph table produced by Lift_To_Graph so the results materialize directly into a graph investigation.
  5. External enrichment - published functions wrapping two external threat-intel sources: Enrich_Sha256_VirusTotal for VirusTotal, and Get_CISA_KEV / Enrich_CISA_KEV for the CISA KEV catalog.

Examples below use the open KC7 JoJo's Hospital / Valdy Times datasets so you can copy, paste, and run them. Kusto connection string: https://kc7001.eastus.kusto.windows.net.


Before and After: What IRQL Looks Like

The IRQL pitch is easier to feel than to describe. Here's a real investigation query — surfacing AAD applications that are suddenly seeing sign-ins from a user-agent category they've never used before, a strong signal for token theft, OAuth abuse, and AiTM proxy activity — written first against raw telemetry, then in IRQL.

The IRQL primitives shown here don't exist in the published KC7 catalog (which is built around hospital and news-site datasets, not Entra ID sign-ins). They're written in the IRQL idiom to show what the dialect looks like when extended to a real Entra ID estate.

Before — raw KQL

let minimumAppThreshold = 100;
let timeframe = 1d;
let lookback_timeframe = 7d;
let ExtractBrowserTypeFromUA = (ua:string) {
    case(
        ua has "Edge/", dynamic({"AgentType": "Browser", "AgentName": "Edge"}),
        ua has "Edg/", dynamic({"AgentType": "Browser", "AgentName": "Edge"}),
        ua has "Trident/", dynamic({"AgentType": "Browser", "AgentName": "Internet Explorer"}),
        ua has "Chrome/" and ua has "Safari/", dynamic({"AgentType": "Browser", "AgentName": "Chrome"}),
        ua has "Gecko/" and ua has "Firefox/", dynamic({"AgentType": "Browser", "AgentName": "Firefox"}),
        not(ua has "Mobile/") and ua has "Safari/" and ua has "Version/", dynamic({"AgentType": "Browser", "AgentName": "Safari"}),
        ua startswith "Dalvik/" and ua has "Android", dynamic({"AgentType": "Browser", "AgentName": "Android Browser"}),
        ua startswith "MobileSafari//", dynamic({"AgentType": "Browser", "AgentName": "Mobile Safari"}),
        ua has "Mobile/" and ua has "Safari/" and ua has "Version/", dynamic({"AgentType": "Browser", "AgentName": "Mobile Safari"}),
        ua has "Mobile/" and ua has "FxiOS/", dynamic({"AgentType": "Browser", "AgentName": "IOS Firefox"}),
        ua has "Mobile/" and ua has "CriOS/", dynamic({"AgentType": "Browser", "AgentName": "IOS Chrome"}),
        ua has "Mobile/" and ua has "WebKit/", dynamic({"AgentType": "Browser", "AgentName": "Mobile Webkit"}),
        ua startswith "Excel/", dynamic({"AgentType": "OfficeApp", "AgentName": "Excel"}),
        ua startswith "Outlook/", dynamic({"AgentType": "OfficeApp", "AgentName": "Outlook"}),
        ua startswith "OneDrive/", dynamic({"AgentType": "OfficeApp", "AgentName": "OneDrive"}),
        ua startswith "OneNote/", dynamic({"AgentType": "OfficeApp", "AgentName": "OneNote"}),
        ua startswith "Office/", dynamic({"AgentType": "OfficeApp", "AgentName": "Office"}),
        ua startswith "PowerPoint/", dynamic({"AgentType": "OfficeApp", "AgentName": "PowerPoint"}),
        ua startswith "PowerApps/", dynamic({"AgentType": "OfficeApp", "AgentName": "PowerApps"}),
        ua startswith "SharePoint/", dynamic({"AgentType": "OfficeApp", "AgentName": "SharePoint"}),
        ua startswith "Word/", dynamic({"AgentType": "OfficeApp", "AgentName": "Word"}),
        ua startswith "Visio/", dynamic({"AgentType": "OfficeApp", "AgentName": "Visio"}),
        ua startswith "Whiteboard/", dynamic({"AgentType": "OfficeApp", "AgentName": "Whiteboard"}),
        ua =~ "Mozilla/5.0 (compatible; MSAL 1.0)", dynamic({"AgentType": "OfficeApp", "AgentName": "Office Telemetry"}),
        ua has ".NET CLR", dynamic({"AgentType": "Custom", "AgentName": "Dotnet"}),
        ua startswith "Java/", dynamic({"AgentType": "Custom", "AgentName": "Java"}),
        ua startswith "okhttp/", dynamic({"AgentType": "Custom", "AgentName": "okhttp"}),
        ua has "Drupal/", dynamic({"AgentType": "Custom", "AgentName": "Drupal"}),
        ua has "PHP/", dynamic({"AgentType": "Custom", "AgentName": "PHP"}),
        ua startswith "curl/", dynamic({"AgentType": "Custom", "AgentName": "curl"}),
        ua has "python-requests", dynamic({"AgentType": "Custom", "AgentName": "Python"}),
        pack("AgentType", "Other", "AgentName", extract(@"^([^/]*)/", 1, ua)))
};
let QueryUserAgents = (start_time:timespan, end_time:timespan) {
    union withsource=tbl_name AADNonInteractiveUserSignInLogs, SigninLogs
    | where TimeGenerated between (ago(start_time) .. ago(end_time))
    | where ResultType == 0 
    | extend ParsedUserAgent = ExtractBrowserTypeFromUA(UserAgent)
    | extend UserAgentType = tostring(ParsedUserAgent.AgentType)
    | extend UserAgentName = tostring(ParsedUserAgent.AgentName)
    | extend SimpleUserAgent = UserAgentType
    | where not(isempty(UserAgent))
    | where not(isempty(AppId))};
let BaselineUserAgents = materialize(
    QueryUserAgents(lookback_timeframe + timeframe, timeframe)
    | summarize RequestCount = count() by AppId, AppDisplayName, SimpleUserAgent);
let BaselineSummarizedAgents = (
    BaselineUserAgents
    | summarize BaselineUAs = make_set(SimpleUserAgent), BaselineRequestCount = sum(RequestCount) by AppId, AppDisplayName);
QueryUserAgents(timeframe, 0d)
| summarize count() by AppId, AppDisplayName, UserAgent, SimpleUserAgent
| join kind=leftanti BaselineUserAgents on AppId, AppDisplayName, SimpleUserAgent
| join BaselineSummarizedAgents on AppId, AppDisplayName
| where BaselineRequestCount > minimumAppThreshold 
| join (QueryUserAgents(timeframe, 0d)) on AppId, UserAgent
| project-away ParsedUserAgent, UserAgentName
| project-reorder TimeGenerated, AppDisplayName, UserPrincipalName, UserAgent, BaselineUAs
| summarize count() by UserPrincipalName, AppDisplayName, AppId, UserAgentType, SimpleUserAgent, UserAgent

The investigation question — which apps are seeing brand-new UA categories today vs. the past week? — is buried under a 30-branch user-agent classifier inlined as a lambda, a manual union across two sign-in tables, a parameterized subquery invoked at three different time windows, and three explicit joins to stitch baseline against current. The mechanics dominate the intent.

After — IRQL

let minimumAppThreshold = 100;
let timeframe = 1d;
let lookback = 7d;
let SignInsToday    = Get_Event_SignIn(timeframe, 0d);
let SignInsBaseline = Get_Event_SignIn(lookback + timeframe, timeframe);
let Baseline =
    SignInsBaseline
    | invoke Extract_SignIn_UserAgent_Category()
    | summarize RequestCount = count() by AppId, AppDisplayName, UserAgentCategory;
SignInsToday
| invoke Extract_SignIn_UserAgent_Category()
| invoke Enrich_App_New_UserAgent_Category(Baseline, minimumAppThreshold)
| project EnvTime, Username, AppDisplayName, AppId, UserAgentCategory, UserAgent, BaselineCategories
| summarize count() by Username, AppDisplayName, AppId, UserAgentCategory, UserAgent

Same question, expressed as named operations. Get_Event_SignIn is a selector in the Get_Event_* family — it hides the union of AADNonInteractiveUserSignInLogs and SigninLogs, the success filter, the empty-field guards, and projects into the unified schema. Extract_SignIn_UserAgent_Category wraps the 30-branch classifier and adds a single UserAgentCategory column, mirroring how Extract_Email_Sender_Domain adds a Domain column. Enrich_App_New_UserAgent_Category encapsulates the baseline-vs-current diff: the leftanti join, the per-app threshold filter, and the BaselineCategories set that lets analysts see what was normal alongside what's new.
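
To make the encapsulation concrete, here is a hypothetical sketch of what Get_Event_SignIn could look like. As noted above, these Entra ID primitives aren't in the published KC7 catalog, and the projected columns here are illustrative assumptions, not a definitive implementation:

// Hypothetical sketch only - not part of the published catalog.
// Hides the union, the success filter, the empty-field guards,
// and the rename into the unified schema.
.create-or-alter function with (folder="IRQL", docstring="Successful Entra ID sign-ins, unified schema")
Get_Event_SignIn(start_offset:timespan, end_offset:timespan) {
    union withsource=SourceTable AADNonInteractiveUserSignInLogs, SigninLogs
    | where TimeGenerated between (ago(start_offset) .. ago(end_offset))
    | where ResultType == 0
    | where isnotempty(UserAgent) and isnotempty(AppId)
    | project EnvTime = TimeGenerated, Username = UserPrincipalName,
              AppId, AppDisplayName, UserAgent
}

Once a definition like this exists, every hunt that touches sign-ins calls it by name instead of re-deriving the union and filters.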

The mechanical work is identical — same union, same classifier, same anti-join, same threshold. The difference is that none of it is in the analyst's face anymore. It lives once, inside the function definitions, and every hunt that touches AAD sign-ins reuses it. The pipeline reads as a sequence of meaningful operations rather than a wall of joins and regex, and an LLM composing a follow-up query has intent-revealing primitives to compose with instead of having to reconstruct the pattern from scratch.

1. Selectors - Get_*

Each Get_* function returns a projected view of a source table using the unified schema. Raw security tables typically carry dozens of columns, most of which are irrelevant to any given hunt - agent versions, internal correlation IDs, redundant timestamp variants, schema-versioning fields, and so on. The default Get_* form down-projects to just the columns analysts actually reach for day-to-day, renamed into the unified schema. The result is a narrower, more readable table that keeps queries focused and avoids drowning the analyst (or an LLM composing a query) in columns they'll never use.

When a hunt legitimately needs the fuller picture, every primitive has a Get_*_All companion that returns the same rows with a wider field set - still renamed into the unified schema, but without the trimming. The convention is simple: start with the minimal view, and opt into _All only when a specific investigation requires it.

Function Returns
Get_Event_Authentication Authentication events - EnvTime, Hostname, ClientIp, Username, Result
Get_Event_Authentication_All Full auth events including Description, UserAgent, PasswordHash
Get_Email Email events - EnvTime, EmailSender, EmailRecipient, Subject, Url
Get_Email_All Full email events including ReplyTo, Verdict
Get_Employees Employee directory - Name, ClientIp, Email, Username, Hostname, Role
Get_Employees_All Full employee directory including HireDate, UserAgent, Domain
Get_Event_FileCreation File creation events - EnvTime, Hostname, Filename, Path
Get_Event_FileCreation_All Full file creation events including Username, Sha256, ProcessName
Get_Event_NetworkInbound Inbound network events - EnvTime, ClientIp, Url
Get_Event_NetworkInbound_All Full inbound including Method, UserAgent, StatusCode
Get_Event_NetworkOutbound Outbound network events - EnvTime, ClientIp, Url
Get_Event_NetworkOutbound_All Full outbound including Method, UserAgent
Get_Dns_All Passive DNS - EnvTime, Domain, ClientIp
Get_Event_Process Process events - EnvTime, ProcessCommandLine, ProcessName, Hostname, Username
Get_Event_Process_All Full process events including ParentProcessName, ParentProcessHash, ProcessHash
Get_SecurityAlerts_All Security alerts - EnvTime, AlertType, Severity, Description, Indicators
Get_Network_Connection_All Network flow records - EnvTime, SourceIp, SourcePort, DestinationIp, DestinationPort, Protocol, Bytes

Usage:

Get_Event_NetworkOutbound
| summarize count() by ClientIp, Url
| top 20 by count_

No cluster string. No project-rename. No memorization of where OutboundNetworkEvents actually lives.
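
A selector definition in this style might look like the following sketch. The cluster/database reference matches the KC7 connection string given above, but the raw table column names here are illustrative assumptions, not the published definition:

// Hypothetical sketch - raw column names are assumptions.
.create-or-alter function with (folder="IRQL", docstring="Outbound network events, unified schema")
Get_Event_NetworkOutbound() {
    cluster("kc7001.eastus.kusto.windows.net").database("JoJosHospital").OutboundNetworkEvents
    | project EnvTime = timestamp, ClientIp = src_ip, Url = url
}

The location string and rename live here, once, and nowhere else.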


2. Extractors - Extract_*

Functions that derive a new column by transforming an existing one. Many security queries require transformations from one column to another; an example in the Defender XDR schema is transforming a RequestId into a unique token identifier.

Function Input Adds
Extract_Email_Sender_Domain(T) EmailSender:string Domain
Extract_Employee_Firstname(T) Name:string Firstname
Extract_Event_Network_Domain(T) Url:string DomainName

Usage:

Get_Email
| invoke Extract_Email_Sender_Domain()
| summarize count() by Domain

3. Enrichers - Enrich_*

Left-join helpers. Give them a table with the IRQL standardized schema and they'll attach the related context from the relevant primitive.

Function Key column Enriches with
Enrich_Event_Authentication_Username(T) Username Authentication events for each user
Enrich_Ip_Employee(T) ClientIp Employee identity from IP
Enrich_Username_Employee(T) Username Employee identity from username
Enrich_Ip_Domain(T) ClientIp DNS domains resolved to each IP
Enrich_Ip_Event_NetworkOutbound(T) ClientIp Outbound network events from each IP
Enrich_Ip_Network_Connection(T) ClientIp Network flow records sourced from each IP
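
Each enricher is essentially a thin left-join wrapper around the matching selector. A sketch of what Enrich_Username_Employee could look like (illustrative, not the published definition):

// Hypothetical sketch - left-joins employee context onto any table keyed by Username.
.create-or-alter function with (folder="IRQL", docstring="Attaches employee identity by Username")
Enrich_Username_Employee(T:(Username:string)) {
    T
    | join kind=leftouter (Get_Employees) on Username
    | project-away Username1
}

The leftouter kind preserves rows with no employee match, and the join key is fixed inside the function rather than re-specified by each caller.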

Usage:

Get_Event_Authentication
| where Result == "Failed Login"
| summarize FailedCount = count() by Username
| where FailedCount > 19
| invoke Enrich_Username_Employee()
| project Username, Name, Role, Email, FailedCount

Chain them together freely:

Get_Event_NetworkOutbound
| invoke Extract_Event_Network_Domain()
| invoke Enrich_Ip_Employee()
| where Role has_any ("Executive", "CEO", "Chief", "Director")
| project EnvTime, Name, DomainName, Url, Role

4. Graph-Lifted Variants

The same IRQL primitives, surfaced as node/edge enrichers so they compose directly with Lift_To_Graph. These operate on a graph table (the output of Lift_To_Graph) rather than a raw row-oriented table.

There are two flavors:

  • Extract_Node_* / Enrich_Node_* - adds new properties to existing nodes in place. The graph shape (nodes, edges) doesn't change; the nodes just get richer property bags and optionally updated display names.
  • Enrich_Graph_* - expands the graph by lifting new nodes and edges from the enrichment results, then unions them (deduplicated) with the original graph and wires up "Related To" edges between nodes that share the enrichment key.
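
The in-place flavor can be pictured as a join that only touches the property bag. A sketch, assuming Lift_To_Graph emits node rows with NodeId, DisplayName, and a dynamic Properties column (the actual graph-table schema is defined in the companion Lift to Graph gist, so treat this as illustrative):

// Hypothetical sketch - graph-table column names are assumptions.
Enrich_Node_Username_Employee(G:(NodeId:string, DisplayName:string, Properties:dynamic), newDisplayName:string) {
    G
    | extend Username = tostring(Properties.Username)
    | join kind=leftouter (Get_Employees) on Username
    | extend Properties = bag_merge(Properties, bag_pack("Name", Name, "Email", Email, "Role", Role))
    | extend DisplayName = iff(isnotempty(newDisplayName), tostring(Properties[newDisplayName]), DisplayName)
    | project-away Username, Username1, Name, ClientIp, Email, Hostname, Role
}

The node set and edge set are untouched; only the property bag (and optionally the display name) change.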

Extract_Node_* (in-place property enrichment)

Function Target property Adds to node properties
Extract_Node_Email_Sender_Domain(T, newDisplayName) EmailSender Domain
Extract_Node_Employee_Firstname(T, newDisplayName) Name Firstname
Extract_Node_Event_Network_Domain(T, newDisplayName) Url DomainName

Enrich_Node_* (in-place property enrichment via join)

Function Key property Adds
Enrich_Node_Event_Authentication_Username(T, newDisplayName) Username EnvTime, Hostname, ClientIp, Result
Enrich_Node_Ip_Employee(T, newDisplayName) ClientIp Name, Email, Username, Hostname, Role
Enrich_Node_Username_Employee(T, newDisplayName) Username Name, ClientIp, Email, Hostname, Role
Enrich_Node_Ip_Network_Connection(T, newDisplayName) ClientIp EnvTime, SourceIp, SourcePort, DestinationIp, DestinationPort, Protocol, Bytes
Enrich_Node_Ip_Domain(T, newDisplayName) ClientIp EnvTime, Domain
Enrich_Node_Ip_Event_NetworkOutbound(T, newDisplayName) ClientIp EnvTime, Url

The optional newDisplayName parameter lets you re-label a node using any field brought in by the enrichment - e.g. passing "Name" to Enrich_Node_Ip_Employee relabels IP nodes with the owning employee's name.

Enrich_Graph_* (structural graph expansion)

Function Key property Lifts into graph
Enrich_Graph_Event_Authentication_Username(T, mappingJson) Username New auth-event nodes + edges
Enrich_Graph_Ip_Employee(T, mappingJson) ClientIp New employee nodes + edges
Enrich_Graph_Username_Employee(T, mappingJson) Username New employee nodes + edges
Enrich_Graph_Ip_Network_Connection(T, mappingJson) ClientIp New network-flow nodes + edges
Enrich_Graph_Ip_Event_NetworkOutbound(T, mappingJson) ClientIp New outbound-network-event nodes + edges

Each Enrich_Graph_* function takes its own mappingJson (same schema as Lift_To_Graph) describing how the newly fetched rows should be lifted into nodes and edges. The result is unioned with the input graph via Graph_Merge_Dedup. If a mappingJson is not passed in, any pair of nodes sharing the enrichment key is linked with a Related To edge.


5. External Enrichment

Many security teams rely on disparate datasets, and IRQL lets us put a catalog, a centralized schema, and a single language in front of them. One example is open-source security datasets that are HTTPS-accessible, and therefore Kusto-accessible with an API key. This section introduces two functions, each targeting a different external source. Before using either, add the source domains to the cluster's callout policy allowlist:

.alter cluster policy callout @'[
    {
        "CalloutType": "webapi",
        "CalloutUriRegex": "www\\.virustotal\\.com/vtapi/.*",
        "CanCall": true
    },
    {
        "CalloutType": "webapi",
        "CalloutUriRegex": "www\\.cisa\\.gov/.*",
        "CanCall": true
    }
]'
Function Source Key Shape
Enrich_Sha256_VirusTotal(T) VirusTotal file/report Sha256 Batched GET, comma-separated hashes
Enrich_CISA_KEV(T) CISA Known Exploited Vulns catalog CveId Static JSON feed, no auth

Both depend on http_request / http_request_post, which is cluster-policy-gated - an admin needs to allowlist the destination domains before the functions will run.

Note on the KC7 examples. KC7's JoJosHospital and ValdyTimes datasets are synthetic - the datasets may not return matches when queried against VirusTotal or the CISA KEV catalog. The external-enrichment functions in this section are meant to be deployed on your own cluster against your own real telemetry. Copy the function definitions into your production database, allowlist the callout domains, and point them at live detection data - that's where the enrichment actually pays off.

VirusTotal (batched GET, comma-separated hashes)

VirusTotal's /vtapi/v2/file/report endpoint accepts a comma-separated list of resources in a single GET, so the whole batch fits in one http_request call - no mv-apply per-row fanout needed. The result is unpacked once and projected into a flat scan-result table. Free API key at https://www.virustotal.com/gui/join-us; the public API is rate-limited to 4 requests/minute, so cache aggressively for production use.

.create-or-alter function with (folder="IRQL", docstring="Returns the VirusTotal API key. Edit in place before first use.")
_get_VirusTotal_ApiKey() {
    tostring('<YOUR_VIRUSTOTAL_API_KEY>')
}

.create-or-alter function with (folder="IRQL", docstring="Enriches a table of SHA256 hashes with VirusTotal file/report scan results")
Enrich_Sha256_VirusTotal(T:(Sha256:string)) {
    let apikey = _get_VirusTotal_ApiKey();
    let file_ids = toscalar(
        T
        | where isnotempty(Sha256)
        | distinct Sha256
        | summarize list_Sha256 = make_list(Sha256)
        | extend hash_list = strcat_array(list_Sha256, ",")
        | project hash_list
    );
    let uri = 'https://www.virustotal.com/vtapi/v2/file/report?';
    let url = strcat(uri, "resource=", url_encode_component(file_ids), "&apikey=", url_encode_component(apikey));
    evaluate http_request(url)
    | mv-expand answer = ResponseBody
    | evaluate bag_unpack(answer)
    | project
        Sha256       = tostring(sha256),
        Md5          = tostring(md5),
        Sha1         = tostring(sha1),
        Positives    = tolong(positives),
        Total        = tolong(total),
        ScanDate     = todatetime(scan_date),
        Permalink    = tostring(permalink),
        ResponseCode = tolong(response_code),
        VerboseMsg   = tostring(verbose_msg),
        Scans        = scans
}

Usage - join against any table with a sha256 column:

Get_Event_FileCreation_All
| distinct Sha256
| invoke Enrich_Sha256_VirusTotal()

CISA KEV (static JSON feed, no auth)

Single GET against the static catalog, no per-row fanout anywhere. Cache it if you call it often - wrap the body in materialize() or pull the feed nightly into a real Kusto table via ingestion rather than hitting the URL per query.

.create-or-alter function with (folder="IRQL", docstring="Returns the current CISA Known Exploited Vulnerabilities catalog as a flat table")
Get_CISA_KEV() {
    let url = 'https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json';
    evaluate http_request(url)
    | mv-expand v = ResponseBody.vulnerabilities
    | project
        CveId              = tostring(v.cveID),
        VendorProject      = tostring(v.vendorProject),
        Product            = tostring(v.product),
        VulnerabilityName  = tostring(v.vulnerabilityName),
        DateAdded          = todatetime(v.dateAdded),
        ShortDescription   = tostring(v.shortDescription),
        RequiredAction     = tostring(v.requiredAction),
        DueDate            = todatetime(v.dueDate),
        KnownRansomwareUse = tostring(v.knownRansomwareCampaignUse),
        Notes              = tostring(v.notes)
}

Enrichment function wrapping the get function:

.create-or-alter function with (folder="IRQL", docstring="Enriches a table of CVE IDs with the current CISA Known Exploited Vulnerabilities catalog")
Enrich_CISA_KEV(T:(CveId:string)) {
    T
    | where isnotempty(CveId)
    | join kind=inner (Get_CISA_KEV()) on CveId
    | project-away CveId1
}
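
Following the caching advice above, a per-query cache is one materialize() away. For example:

let KEV = materialize(Get_CISA_KEV());
KEV
| summarize Count = count() by VendorProject
| top 10 by Count

materialize() ensures the feed is fetched once per query even if KEV is referenced in several branches; for anything called on a schedule, nightly ingestion into a real table is still the better choice.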

End-to-End Examples

The function catalogs above show each primitive in isolation. Real hunts compose them. The three examples below are end-to-end IRQL queries against the KC7 JoJosHospital / ValdyTimes datasets - each one starts from a single hunting question and walks it to a finished result using only IRQL primitives. The first example also includes a side-by-side comparison of what the same query looks like written against raw telemetry, to make the cognitive-load argument concrete.

Example 1: Phishing campaign ranked by recipient seniority

Question: Which sender domains are emailing the most senior people at the company, and how broadly are they spread across the org?

This is a classic phishing triage pattern. Bulk senders are noise. Senders that target executives across multiple roles are the ones worth pulling on. The query reads top-to-bottom as a sequence of intent-revealing steps: get email, extract sender domain, attach the recipient's identity, score by seniority, summarize per domain, rank.

Get_Email
| invoke Extract_Email_Sender_Domain()
| project EnvTime, EmailSender, Domain, Username = EmailRecipient, Subject, Url
| invoke Enrich_Username_Employee()
| extend Seniority = case(
    Role has_any ("CEO", "Chief", "Director", "VP", "President"), 3,
    Role has_any ("Manager", "Lead", "Senior"), 2,
    1)
| summarize TotalEmails = count(), SeniorityScore = sum(Seniority), Recipients = make_set(Name, 50), DistinctRecipients = dcount(Username) by Domain
| where DistinctRecipients >= 2
| order by SeniorityScore desc
| take 20

Example 2: Process-side pivot from a malicious file

Question: A specific malicious file (Raisin_Kane) showed up on some hosts. What did those hosts run around the same time that looks like hands-on-keyboard activity?

This is a classic post-exploitation pivot: start from a file-creation indicator, extract the hostnames, then sweep process telemetry on those hosts for living-off-the-land binaries and recon tooling.

let victims =
    Get_Event_FileCreation_All
    | where Filename has "Raisin_Kane"
    | distinct Hostname;
Get_Event_Process
| where Hostname in (victims)
| where ProcessCommandLine has_any ("rundll32", "regsvr32", "powershell", "systeminfo", "advanced-ip-scanner")
| project EnvTime, Hostname, Username, ProcessName, ProcessCommandLine
| order by EnvTime asc

The let-bound subquery isolates the affected hosts, and the main query sweeps process events on those hosts for the kinds of command-line patterns that show up in initial-access and discovery phases. The two Get_* selectors hide the fact that file-creation and process telemetry live in different tables with their own column conventions - the hunt reads as one continuous question.

Example 3: Suspicious-domain investigation as a graph

Question: Three known-bad sender domains are showing up in outbound traffic. Who's talking to them, and what does the cluster of activity look like visually?

This is where the graph-lifted IRQL variants earn their keep. The same Extract_* and Enrich_* primitives that worked tabularly above now operate on a graph, so the same logic produces a visual investigation surface instead of a flat result set.

Get_Event_NetworkOutbound
| invoke Extract_Event_Network_Domain()
| where DomainName has_any ("raisinkanes.com","nothing-to-see-here.net","totally-legit-domain.com")
| invoke Lift_To_Graph(SuspiciousDomainMapping())
| invoke Enrich_Node_Ip_Employee()
| invoke Graph_Fold_By_Property("Url", "DomainName")
| invoke Graph_Render_View()

What's happening:

  1. Pull outbound network events and extract the domain from each URL.
  2. Filter to the three suspicious domains.
  3. Lift the surviving rows into a graph using a Lift_To_Graph mapping (SuspiciousDomainMapping() - defined in the companion gist).
  4. Enrich every IP node in place with the employee who owns it. IPs that map to a person now carry name, role, email, hostname, and username on their property bag.
  5. Fold the graph by domain - URLs that share a domain collapse into a single domain node, so the graph shows one node per suspicious domain instead of one per unique URL.
  6. Render the result as a graph view.

The output is an interactive graph showing each suspicious domain at the center, with the IPs (now labeled with employee identity) that contacted them fanning out around it. The same primitives could have produced a tabular ranking - swapping the last three lines for summarize and project would give a flat result. The point is that one consistent dialect drives both modes.


Availability

The functions cataloged here are published and ready to invoke against the KC7 open datasets - specifically the JoJosHospital and ValdyTimes databases on kc7001.eastus. Spin up Kusto Explorer, point it at those databases, and the primitives, extractors, enrichers, and graph-lifted variants are all callable out of the box.

The concepts generalize. The exact cluster/database references in the Get_* primitives are tied to KC7, but the pattern - one primitive per source, a unified projected schema, extractors and enrichers as small composable functions, and graph-lifted variants built on top - applies to any telemetry estate. Most teams will want to fork these definitions, swap the cluster/database locations for their own, reconcile column names to their own unified schema, and add primitives for the sources that matter to them.

External enrichment functions are published separately. Two functions ship against different external sources:

  • Enrich_Sha256_VirusTotal - VirusTotal file/report, batched GET with comma-separated hashes. Free API key at https://www.virustotal.com/gui/join-us. Allowlist www.virustotal.com.
  • Get_CISA_KEV / Enrich_CISA_KEV - static JSON catalog, no auth. Allowlist www.cisa.gov.

Both depend on http_request / http_request_post, which are cluster-policy-gated. An admin needs to allowlist the destination domains before the functions will run. The KC7 datasets are synthetic and will not return matches against any of these live feeds; deploy these functions on your own cluster against your own real telemetry.
