Getting Started with Zeek for Network Threat Hunting

10 February 2026 | 6 min read | justruss.tech

Zeek is a passive network analysis framework. Where Wireshark gives you packets, Zeek gives you structured logs: it parses protocols, extracts fields, and writes one record per event to log files (tab-separated by default, JSON once you enable it in the configuration) that you can query like a database. For hunting across hours or days of traffic it is significantly faster than working with raw packet captures because you are querying structured records rather than filtering individual packets.
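
As a rough illustration of what "query like a database" means, here is a minimal sketch that groups JSON conn.log records by destination, much like a GROUP BY. The sample records and the function name count_by_destination are made up for the example; only the field names come from Zeek.

```python
import json
from collections import Counter

# Synthetic conn.log records in Zeek's JSON-lines layout (all values are made up).
SAMPLE_CONN_LOG = """\
{"ts": 1770700000.0, "id.orig_h": "10.0.0.5", "id.resp_h": "93.184.216.34", "id.resp_p": 443, "proto": "tcp"}
{"ts": 1770700030.0, "id.orig_h": "10.0.0.5", "id.resp_h": "93.184.216.34", "id.resp_p": 443, "proto": "tcp"}
{"ts": 1770700031.5, "id.orig_h": "10.0.0.9", "id.resp_h": "10.0.0.1", "id.resp_p": 53, "proto": "udp"}
"""

def count_by_destination(lines):
    """Count records per (destination host, destination port)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        counts[(rec["id.resp_h"], rec["id.resp_p"])] += 1
    return counts

# Print destinations from most to least contacted.
for (dst, port), n in count_by_destination(SAMPLE_CONN_LOG.splitlines()).most_common():
    print(f"{n:>5}  {dst}:{port}")
```

The same loop works on a real log by passing an open file handle instead of the sample lines.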

Installation on Ubuntu 22.04

echo "deb http://download.opensuse.org/repositories/security:/zeek/xUbuntu_22.04/ /" | \
    sudo tee /etc/apt/sources.list.d/security:zeek.list
curl -fsSL https://download.opensuse.org/repositories/security:zeek/xUbuntu_22.04/Release.key | \
    gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/security_zeek.gpg > /dev/null
sudo apt update && sudo apt install zeek -y

# Configure the monitored interface
# (the zeek package from this repository installs under /opt/zeek)
sudo nano /opt/zeek/etc/node.cfg
# Change interface=eth0 to your actual span/tap interface

# Enable JSON output (much better for SIEM ingestion and jq parsing)
echo "redef LogAscii::use_json = T;" | sudo tee -a /opt/zeek/share/zeek/site/local.zeek
echo "redef LogAscii::json_timestamps = JSON::TS_ISO8601;" | sudo tee -a /opt/zeek/share/zeek/site/local.zeek

# Start Zeek
sudo /opt/zeek/bin/zeekctl deploy
sudo /opt/zeek/bin/zeekctl status

# Verify logs are writing (each line is one JSON record; needs: sudo apt install jq)
tail -n 5 /opt/zeek/logs/current/conn.log | jq .

Log files: what they contain and when to use them

// conn.log: every network connection
// Fields: ts, id.orig_h, id.orig_p, id.resp_h, id.resp_p, proto, duration,
//         orig_bytes, resp_bytes, conn_state, history
// Use for: beaconing detection, data exfiltration volume, unusual destinations

// dns.log: every DNS query
// Fields: ts, id.orig_h, query, qtype_name, answers, TTLs
// Use for: DNS tunnelling, C2 domain detection, DGA detection

// ssl.log: TLS sessions
// Fields: ts, id.orig_h, id.resp_h, version, cipher, ja3, ja3s, server_name,
//         subject, issuer, validation_status
//         (ja3 and ja3s are added by the JA3 Zeek package, not by a default install)
// Use for: C2 TLS fingerprinting (JA3), expired/self-signed cert detection

// http.log: HTTP requests
// Fields: ts, id.orig_h, method, host, uri, user_agent, status_code,
//         request_body_len, response_body_len
// Use for: malicious user agents, suspicious URIs, large exfil in POST bodies

// files.log: file transfers across any protocol
// Fields: ts, id.orig_h, fuid, md5, sha1, sha256, mime_type, filename, source
// Use for: malware download detection, file hash reputation checking

// weird.log: protocol anomalies Zeek could not parse cleanly
// Use for: protocol-based evasion, malformed packets
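
To make the "data exfiltration volume" use of conn.log concrete, here is a small sketch that sums orig_bytes (payload bytes sent by the originator) per source and destination pair. The sample records and the helper name outbound_bytes_by_pair are hypothetical; treat it as a starting point rather than a finished detector.

```python
import json
from collections import defaultdict

# Synthetic conn.log records (values are made up for illustration).
SAMPLE = [
    '{"id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.10", "orig_bytes": 5000000, "resp_bytes": 1200}',
    '{"id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.10", "orig_bytes": 7000000, "resp_bytes": 900}',
    '{"id.orig_h": "10.0.0.7", "id.resp_h": "198.51.100.4", "orig_bytes": 800, "resp_bytes": 450000}',
    '{"id.orig_h": "10.0.0.7", "id.resp_h": "198.51.100.4"}',
]

def outbound_bytes_by_pair(lines):
    """Return [((src, dst), total_orig_bytes), ...] sorted by volume, largest first."""
    totals = defaultdict(int)
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        sent = rec.get("orig_bytes")
        # Records without a byte count are skipped rather than counted as zero
        if isinstance(sent, int):
            totals[(rec.get("id.orig_h"), rec.get("id.resp_h"))] += sent
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

for (src, dst), total in outbound_bytes_by_pair(SAMPLE):
    print(f"{src} -> {dst}: {total / 1e6:.2f} MB sent")
```

Pairs where the internal host sends far more than it receives are the ones worth a closer look.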

Hunting beaconing in conn.log

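# Method 1: jq + awk interval calculation
# (zeek-cut only reads Zeek's TSV format, so with JSON output enabled this uses jq)
# Assumes ISO 8601 timestamps as configured above; fractional seconds are dropped.
jq -r '[(.ts | sub("\\.[0-9]+Z$"; "Z") | fromdateiso8601),
        ."id.orig_h", ."id.resp_h", ."id.resp_p"] | @tsv' conn.log | \
    sort -k2,2 -k3,3 -k4,4n -k1,1n | \
    awk '
        # Input is grouped by src/dst/port and ordered by time within each group
        {key = $2" "$3" "$4}
        key == prev_key {
            diff = $1 - prev_ts
            sum[key] += diff
            sumsq[key] += diff * diff
            count[key]++
        }
        {prev_key = key; prev_ts = $1}
        END {
            for (k in count) {
                if (count[k] >= 10) {
                    mean = sum[k] / count[k]
                    var = sumsq[k] / count[k] - mean * mean
                    if (var < 0) var = 0
                    stdev = sqrt(var)
                    cv = (mean > 0) ? stdev / mean : 999
                    # Flag: very regular timing, 30s to 10min interval range
                    if (cv < 0.1 && mean > 30 && mean < 600)
                        printf "BEACON CANDIDATE: %s mean=%.1fs stdev=%.1fs cv=%.3f samples=%d\n",
                            k, mean, stdev, cv, count[k]
                }
            }
        }
    '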

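# Method 2: Python script for more sophisticated analysis
python3 << 'EOF'
import json, statistics
from collections import defaultdict
from datetime import datetime, timezone

def to_epoch(ts):
    # ts is epoch seconds by default, or an ISO 8601 string with JSON::TS_ISO8601
    if isinstance(ts, (int, float)):
        return float(ts)
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc).timestamp()

connections = defaultdict(list)
with open("/opt/zeek/logs/current/conn.log") as f:
    for line in f:
        try:
            rec = json.loads(line)
            if rec.get("proto") in ("tcp", "udp"):
                key = (rec["id.orig_h"], rec["id.resp_h"], str(rec["id.resp_p"]))
                connections[key].append(to_epoch(rec["ts"]))
        except (json.JSONDecodeError, KeyError, ValueError):
            continue  # skip malformed or incomplete records

print(f"Analysing {len(connections)} unique connections...")
for key, timestamps in sorted(connections.items()):
    if len(timestamps) < 10:
        continue
    timestamps.sort()
    intervals = [timestamps[i+1]-timestamps[i] for i in range(len(timestamps)-1)]
    if len(intervals) < 2:
        continue
    mean_i = statistics.mean(intervals)
    stdev_i = statistics.stdev(intervals)
    cv = stdev_i / mean_i if mean_i > 0 else 999
    # Flag: very regular timing, 30s to 10min interval range
    if cv < 0.1 and 30 < mean_i < 600:
        src, dst, port = key
        print(f"BEACON CANDIDATE: {src} -> {dst}:{port}")
        print(f"  mean={mean_i:.1f}s  stdev={stdev_i:.1f}s  cv={cv:.3f}  samples={len(intervals)}")
EOF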
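
To get a feel for why a coefficient of variation (stdev divided by mean) below 0.1 points at automation, the sketch below compares two synthetic timestamp series: one with a fixed 60 second check-in plus a little jitter, and one with randomly spaced gaps that average 60 seconds. Both series and the helper name interval_cv are invented for illustration; real traffic will be messier.

```python
import random
import statistics

def interval_cv(timestamps):
    """Coefficient of variation of the gaps between sorted timestamps, or None if too few."""
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return None
    mean = statistics.mean(gaps)
    return statistics.stdev(gaps) / mean if mean > 0 else None

rng = random.Random(42)  # fixed seed so repeated runs give the same numbers

# Implant-like: one connection every 60 s with up to +/-2 s of jitter
beacon = [i * 60 + rng.uniform(-2, 2) for i in range(50)]

# Human-like: gaps drawn from an exponential distribution with a 60 s mean
human, t = [], 0.0
for _ in range(50):
    t += rng.expovariate(1 / 60)
    human.append(t)

print(f"beacon cv = {interval_cv(beacon):.3f}")
print(f"human  cv = {interval_cv(human):.3f}")
```

The beacon series lands well under the 0.1 threshold while the random series sits far above it. Implants configured with heavy jitter will score higher, so treat the threshold as tunable.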

DNS tunnelling detection

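# zeek-cut only reads Zeek's TSV format; with JSON output enabled, use jq instead

# Long query names (DNS tunnelling uses subdomains to encode data)
jq -r 'select(.query != null) | [.query, ."id.orig_h"] | @tsv' dns.log | \
    awk '{if (length($1) > 50) print length($1), $0}' | \
    sort -rn | head -20

# High query rate per base domain (rapid polling from tunnelling tools)
jq -r '.query // empty' dns.log | \
    awk -F. 'NF > 2 {print $(NF-1)"."$NF}' | \
    sort | uniq -c | sort -rn | head -20

# Queries with high entropy subdomains (encoded data looks random)
python3 << 'EOF'
import json, math
from collections import Counter

def entropy(s):
    # Shannon entropy in bits per character
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())

with open("/opt/zeek/logs/current/dns.log") as f:
    for line in f:
        try:
            query = json.loads(line).get("query", "")
            if query:
                parts = query.split(".")
                # Everything left of the last two labels
                subdomain = ".".join(parts[:-2]) if len(parts) > 2 else ""
                if len(subdomain) > 15 and entropy(subdomain) > 3.5:
                    print(f"HIGH ENTROPY SUBDOMAIN: {query}")
                    print(f"  subdomain={subdomain} length={len(subdomain)} entropy={entropy(subdomain):.2f}")
        except json.JSONDecodeError:
            continue
EOF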
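
A complementary signal, sketched here under simple assumptions, is the number of distinct subdomains seen under one base domain: tunnelling tools rarely repeat a query name, while ordinary sites reuse a handful. The sample records and the function name unique_subdomains_per_domain are hypothetical, and taking the last two labels as the base domain will misgroup suffixes such as co.uk.

```python
import json
from collections import defaultdict

# Synthetic dns.log records (query names are made up).
SAMPLE = [
    '{"query": "mzxw6ytboi.tunnel.example"}',
    '{"query": "nbswy3dpeb.tunnel.example"}',
    '{"query": "o5xxe3deee.tunnel.example"}',
    '{"query": "www.example.com"}',
    '{"query": "www.example.com"}',
    '{"query": "example.com"}',
]

def unique_subdomains_per_domain(lines):
    """Map each base domain (last two labels) to the set of distinct subdomain prefixes."""
    seen = defaultdict(set)
    for line in lines:
        try:
            query = json.loads(line).get("query") or ""
        except (json.JSONDecodeError, AttributeError):
            continue  # not JSON, or not a JSON object
        labels = query.rstrip(".").lower().split(".")
        if len(labels) > 2:
            seen[".".join(labels[-2:])].add(".".join(labels[:-2]))
    return seen

# Domains with the most distinct subdomains first
result = unique_subdomains_per_domain(SAMPLE)
for domain, subs in sorted(result.items(), key=lambda kv: len(kv[1]), reverse=True):
    print(f"{len(subs):>5} unique subdomains  {domain}")
```

On real data, CDNs and some antivirus lookups also generate many unique names, so expect to keep an allowlist.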

JA3 fingerprinting for C2 detection

# Check all TLS connections against known malware JA3 hashes
# (the ja3/ja3s fields require the JA3 Zeek package, e.g. zkg install ja3)
# Known Cobalt Strike default: a0e9f5d64349fb13191bc781f81f42e1
# Known Metasploit Meterpreter: 51c64c77e60f3980eea90869b68c58a8

# zeek-cut only reads Zeek's TSV format; with JSON output enabled, use jq instead
jq -r '[.ja3, .ja3s, ."id.orig_h", ."id.resp_h", .server_name] | @tsv' ssl.log | \
    grep -E "a0e9f5d64349fb13191bc781f81f42e1|51c64c77e60f3980eea90869b68c58a8"

# More comprehensive check against a JA3 blocklist
python3 << 'EOF'
import json

# Known malicious JA3 hashes (expand from threat intel feeds)
blocklist = {
    "a0e9f5d64349fb13191bc781f81f42e1": "Cobalt Strike default",
    "51c64c77e60f3980eea90869b68c58a8": "Metasploit Meterpreter",
    "6734f37431670b3ab4292b8f60f29984": "Cobalt Strike malleable",
    "b386946a5a44d1ddcc843bc75336dfce": "Possible malware",
}

with open("/opt/zeek/logs/current/ssl.log") as f:
    for line in f:
        try:
            rec = json.loads(line)
            ja3 = rec.get("ja3", "")
            if ja3 in blocklist:
                print(f"BLOCKED JA3: {blocklist[ja3]}")
                print(f"  src={rec['id.orig_h']} dst={rec['id.resp_h']}")
                print(f"  sni={rec.get('server_name','unknown')} ja3={ja3}")
        except (json.JSONDecodeError, KeyError):
            continue  # skip malformed or incomplete records
EOF

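# Also check for self-signed certificates (common in C2 infrastructure)
# The status text comes from OpenSSL and reads "self signed" or "self-signed"
# depending on the OpenSSL version, so match both spellings
jq -r 'select((.validation_status // "") | test("^self[- ]signed certificate$"))
       | "SELF-SIGNED: " + ([."id.orig_h", ."id.resp_h", .server_name, .subject,
                             .issuer, .validation_status] | map(. // "-") | join("\t"))' ssl.log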
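
For context on what a JA3 value actually is: it is the MD5 of five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats) written as decimal numbers, with values joined by "-" and fields joined by ",". The sketch below follows that published recipe, including the rule that GREASE placeholder values are skipped. The function names and the input numbers are only illustrative.

```python
import hashlib

# GREASE placeholder values (0x0a0a, 0x1a1a, ..., 0xfafa) are left out of JA3
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}

def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the comma-separated JA3 string from decimal ClientHello values."""
    groups = (ciphers, extensions, curves, point_formats)
    joined = ["-".join(str(v) for v in g if v not in GREASE) for g in groups]
    return ",".join([str(version)] + joined)

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3 string; this is the value in the ja3 log field."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()

# Illustrative ClientHello values; 2570 (0x0a0a) is a GREASE cipher and gets dropped
example = (771, [2570, 4865, 4866], [0, 10], [29, 23], [0])
print(ja3_string(*example))
print(ja3_hash(*example))
```

Because the hash covers only how the client builds its handshake, two different programs sharing a TLS library can produce the same JA3, which is why a blocklist hit is a lead to investigate rather than a verdict.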

Feeding Zeek logs into Elastic

# Filebeat configuration for Zeek log ingestion
# Enable the module first: sudo filebeat modules enable zeek
# /etc/filebeat/modules.d/zeek.yml

- module: zeek
  connection:
    enabled: true
    var.paths: ["/opt/zeek/logs/current/conn.log"]
  dns:
    enabled: true
    var.paths: ["/opt/zeek/logs/current/dns.log"]
  ssl:
    enabled: true
    var.paths: ["/opt/zeek/logs/current/ssl.log"]
  http:
    enabled: true
    var.paths: ["/opt/zeek/logs/current/http.log"]
  files:
    enabled: true
    var.paths: ["/opt/zeek/logs/current/files.log"]

# Restart Filebeat
sudo systemctl restart filebeat

# Verify data is arriving in Kibana:
# Index: filebeat-*
# Filter: event.module: zeek