Writing Your First Yara Rule: From Sample to Signature

16 December 2025 | 5 min read | justruss.tech

Yara is a pattern-matching tool for malware identification. You define conditions based on strings, byte sequences, PE header characteristics, entropy values, or file properties, and Yara tells you whether a file matches. Writing rules from scratch against real samples is the most direct way to learn what makes a rule robust rather than fragile. This walkthrough covers the complete process, from raw sample to a production-quality rule.

Initial sample analysis workflow

# Step 1: Basic file identification
file Qakbot_loader.exe
# Output: PE32 executable (GUI) Intel 80386, for MS Windows

# Calculate hashes for record-keeping and VirusTotal lookup
sha256sum Qakbot_loader.exe
md5sum Qakbot_loader.exe

# Step 2: Check existing coverage before writing new rules
# Submit to VirusTotal, check if the family is already detected
# and what names it is being detected under

# Step 3: Extract strings - look for anything distinctive
# (the grep drops lines that consist of a single plain word)
strings -a -n 8 Qakbot_loader.exe | sort -u | \
    grep -v "^[A-Za-z][a-z]*$" | \
    head -100

# Notable strings from the Qakbot obama200 sample:
# SOFTWARE\Microsoft\Dtcpipe     (known Qakbot persistence key)
# obama200                        (campaign tag, hardcoded)
# PluginStart                     (export function name)
# PluginStop
# PluginCode
# EncryptData
# DecryptData
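
To make the string-extraction step concrete, here is a minimal Python sketch of roughly what `strings -a -n 8` does, extended to also pick up UTF-16LE ("wide") strings, which Windows binaries often contain. The function name `extract_strings` and the synthetic `blob` are illustrative assumptions; the buffer is made up and not taken from any real sample.

```python
import re


def extract_strings(data: bytes, min_len: int = 8) -> list[str]:
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = {m.group().decode("ascii") for m in ascii_re.finditer(data)}
    found |= {m.group().decode("utf-16-le") for m in wide_re.finditer(data)}
    return sorted(found)


# Synthetic buffer standing in for a sample (hypothetical content)
blob = (
    b"\x00\x01MZ\x90\x00"
    + b"SOFTWARE\\Microsoft\\Dtcpipe"
    + b"\x00\x00\x01"                      # non-printable separator
    + "PluginStart".encode("utf-16-le")    # wide string
    + b"\x00\x00"
    + b"short\x00"                         # below the minimum length
)

for s in extract_strings(blob):
    print(s)
```

Strings that appear only in wide form are the reason the rule later declares the registry key as `wide ascii`.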

Finding byte-level patterns with radare2

// Open in radare2 for binary analysis
r2 -A Qakbot_loader.exe

// Search for the XOR key used in config decryption
// Common Qakbot XOR keys appear as immediate values in XOR instructions
[0x00401000]> /x 35deadc0de
# Matches at: 0x00401890
# 0x35 is the opcode for XOR EAX, imm32; the four bytes after it are the
# immediate in little-endian order, so DE AD C0 DE encodes the key 0xDEC0ADDE

// View the decryption routine
[0x00401890]> pd 20
// Shows the XOR loop with the key

// Search for string references
[0x00401000]> iz
// Lists all strings in the binary with their addresses

// Look at cross-references to suspicious strings
[0x00401000]> axt @ [address_of_obama200_string]
// Shows where this string is referenced from
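
The byte search can also be reproduced outside radare2. The sketch below, using only the standard library, finds every offset of a byte pattern and decodes the 32-bit immediate that follows the `0x35` opcode. x86 stores immediates little-endian, so the byte order in the file is the reverse of the value shown in a disassembler. The helper names and the `code` buffer are hypothetical.

```python
import struct


def find_all(data: bytes, pattern: bytes) -> list[int]:
    """Return every offset at which pattern occurs in data."""
    offsets, start = [], 0
    while (pos := data.find(pattern, start)) != -1:
        offsets.append(pos)
        start = pos + 1
    return offsets


def xor_eax_imm32(value: int) -> bytes:
    """Encode XOR EAX, imm32: opcode 0x35 followed by the little-endian immediate."""
    return b"\x35" + struct.pack("<I", value)


# Hypothetical code fragment: two NOPs, the XOR instruction, then RET
code = b"\x90\x90" + bytes.fromhex("35deadc0de") + b"\xc3"

hits = find_all(code, bytes.fromhex("35deadc0de"))
imm = struct.unpack_from("<I", code, hits[0] + 1)[0]
print(hits, hex(imm))
print(xor_eax_imm32(0xDEADC0DE).hex())
```

When turning a key seen in a disassembler into a hex pattern, encode it the way `xor_eax_imm32` does and search for those bytes.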

PE structure analysis

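python3 << 'EOF'
# Requires the pefile package (pip install pefile)
import pefile

pe = pefile.PE("Qakbot_loader.exe")

print("=== Sections ===")
for section in pe.sections:
    name = section.Name.rstrip(b"\x00").decode(errors="replace")
    entropy = section.get_entropy()
    print(f"  {name:<8} raw_size={section.SizeOfRawData:<8} entropy={entropy:.2f}")
    # Entropy > 7.0 suggests packed or encrypted section
    if entropy > 7.0:
        print(f"    *** HIGH ENTROPY - likely packed/encrypted ***")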
print("\n=== Import Directory ===")
if hasattr(pe, 'DIRECTORY_ENTRY_IMPORT'):
    for entry in pe.DIRECTORY_ENTRY_IMPORT:
        print(f"  {entry.dll.decode()}")
        for imp in entry.imports:
            if imp.name:
                print(f"    {imp.name.decode()}")
EOF
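
For readers wondering where the 7.0 threshold comes from: entropy here is Shannon entropy over byte values, which ranges from 0 (every byte identical) to 8 bits per byte (all 256 values equally likely). The following is a minimal standard-library sketch of that calculation; the function name is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )


print(shannon_entropy(b"A" * 1024))            # constant data
print(shannon_entropy(bytes(range(256)) * 4))  # uniform byte distribution
print(shannon_entropy(b"This is ordinary English text, repeated. " * 20))
```

Compiled code and plain text usually land well below the maximum, while compressed or encrypted data approaches 8, which is why a value above 7.0 is treated as a packing indicator.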

Writing the rule: from indicators to condition

// Full production-quality Yara rule for Qakbot obama200 loader

rule Qakbot_obama200_loader {
    meta:
        author       = "justruss"
        description  = "Qakbot loader binary from the obama200 campaign (2023)"
        date         = "2023-08-20"
        hash_sample  = "3a4b5c6d7e8f..."
        reference    = "https://justruss.tech/post/writing-yara-rules"
        tlp          = "WHITE"

    strings:
        // Registry persistence key (this path is unique to Qakbot)
        $reg_key    = "SOFTWARE\\Microsoft\\Dtcpipe" wide ascii

        // Campaign identifier tag hardcoded in the config blob
        $campaign   = "obama200" nocase

        // XOR key as immediate operand in decryption loop
        // { 35 DE AD C0 DE } = XOR EAX, 0xDEC0ADDE (imm32 is stored little-endian)
        $xor_key    = { 35 DE AD C0 DE }

        // Export function names consistent across loader variants
        $export_1   = "PluginStart" ascii fullword
        $export_2   = "PluginStop"  ascii fullword
        $export_3   = "PluginCode"  ascii fullword

        // C2 communication strings
        $c2_enc     = "EncryptData" ascii
        $c2_dec     = "DecryptData" ascii

    condition:
        // Must be a valid PE file (MZ header + PE signature)
        uint16(0) == 0x5A4D
        and uint32(uint32(0x3C)) == 0x00004550

        // Reasonable size range for this loader family (50KB to 2MB)
        and filesize > 50KB
        and filesize < 2MB

        // Must match at least 2 of the 3 export function names
        // (provides resilience when one is removed in a variant)
        and 2 of ($export_1, $export_2, $export_3)

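        // Must match at least 1 unique family indicator
        and 1 of ($reg_key, $campaign, $xor_key)

        // Must match at least 1 of the crypto helper names
        // (Yara refuses to compile a rule that defines strings
        // its condition never references)
        and 1 of ($c2_enc, $c2_dec)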
}
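
The header test in the condition is compact enough to be opaque on first reading. As a rough illustration, this Python sketch performs the same two checks as `uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550`: read the `MZ` magic at offset 0, follow the 32-bit pointer stored at offset 0x3C, and verify the `PE\0\0` signature there. The function name and the synthetic buffers are assumptions for demonstration only.

```python
import struct


def has_pe_header(data: bytes) -> bool:
    """Mirror the rule's MZ + PE signature check on an in-memory buffer."""
    if len(data) < 0x40:
        return False
    if struct.unpack_from("<H", data, 0)[0] != 0x5A4D:        # "MZ"
        return False
    pe_offset = struct.unpack_from("<I", data, 0x3C)[0]       # e_lfanew
    if pe_offset + 4 > len(data):
        return False
    return struct.unpack_from("<I", data, pe_offset)[0] == 0x00004550  # "PE\0\0"


# Synthetic minimal header: MZ magic, e_lfanew = 0x80, PE signature at 0x80
fake_pe = bytearray(0x84)
fake_pe[0:2] = b"MZ"
struct.pack_into("<I", fake_pe, 0x3C, 0x80)
fake_pe[0x80:0x84] = b"PE\x00\x00"

print(has_pe_header(bytes(fake_pe)))
print(has_pe_header(b"MZ" + bytes(0x40)))
```

Checking both signatures, rather than `MZ` alone, keeps the rule from firing on arbitrary files that merely begin with those two bytes.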

Testing and validation

# Install Yara
sudo apt install yara -y

# Test against the original sample (must match)
yara -r qakbot_obama200.yar Qakbot_loader.exe
# Expected: Qakbot_obama200_loader Qakbot_loader.exe

# False positive check against clean Windows binaries
# (example path: a copy of C:\Windows\System32 from a clean install,
# copied or mounted on the analysis machine)
yara -r qakbot_obama200.yar /mnt/windows/Windows/System32/ 2>/dev/null
# Expected: (no output)

# False positive check against a broader clean file set
# (yara takes one scan target per invocation, so run it once per directory)
yara -r qakbot_obama200.yar /usr/bin/ 2>/dev/null
yara -r qakbot_obama200.yar /usr/lib/ 2>/dev/null
# Expected: (no output or very few unexpected matches)

# Test against your malware corpus if you have one
yara -r qakbot_obama200.yar /opt/malware_samples/ 2>/dev/null | grep Qakbot
# Shows all matching samples - useful for measuring family coverage

# Performance test for rules that will be used in high-throughput scanning
time yara qakbot_obama200.yar large_file.bin
# Scan time should be dominated by reading the file; a rule that is
# noticeably slower than others usually relies on short or very common patterns
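
Measuring family coverage from the corpus scan is easier with a small script than by eye. This sketch parses the default `yara` output format, one `<rule name> <path>` pair per line, and reports what fraction of a corpus a rule matched. The function names, the sample output text, and the corpus size of 4 are hypothetical.

```python
from collections import defaultdict


def parse_yara_output(text: str) -> dict[str, list[str]]:
    """Group default yara CLI output lines ("<rule> <path>") by rule name."""
    hits: dict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        rule, sep, path = line.strip().partition(" ")
        if sep:  # skip blank or malformed lines
            hits[rule].append(path)
    return dict(hits)


def coverage(hits: dict[str, list[str]], rule: str, corpus_size: int) -> float:
    """Fraction of the corpus matched by rule."""
    if corpus_size == 0:
        return 0.0
    return len(set(hits.get(rule, []))) / corpus_size


# Made-up output from scanning a hypothetical 4-file corpus
sample_output = """\
Qakbot_obama200_loader /opt/malware_samples/a.bin
Qakbot_obama200_loader /opt/malware_samples/b.bin
"""

hits = parse_yara_output(sample_output)
print(f"{coverage(hits, 'Qakbot_obama200_loader', 4):.0%}")
```

In practice the text would come from capturing the output of the corpus scan, and the corpus size from counting the files in the directory.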

Scanning memory for Yara matches

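# Yara can scan live process memory (requires appropriate privileges)
# The scan target is a PID, and each invocation takes a single target,
# so scanning all running processes means looping over the PIDs
for pid in $(ps ax -o pid=); do
    yara qakbot_obama200.yar "$pid" 2>/dev/null
done

# Or scan a specific process by PID
yara qakbot_obama200.yar 1234 2>/dev/null

# Note: filesize is undefined when the target is a process, so the size
# checks in the rule above never pass there; a memory-scanning variant of
# the rule needs to drop them

# Using Volatility 3 to run Yara against a memory dump
vol -f memory.raw windows.vadyarascan --yara-file qakbot_obama200.yar
# Scans each process VAD region against the rule, so every hit is
# attributed to a specific process rather than to an offset in raw memory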

Building a Yara rule set for your environment

# Organise rules by family and campaign for maintainability
# Directory structure:
# rules/
#   qakbot/
#     qakbot_obama200_2023.yar
#     qakbot_bb_2024.yar
#   cobalt_strike/
#     cs_default_watermarks.yar
#     cs_beacon_config.yar
#   generic/
#     process_injection_indicators.yar
#     credential_dumping_tools.yar

# Master rules file that includes all others
# master.yar:
#   include "rules/qakbot/qakbot_obama200_2023.yar"
#   include "rules/cobalt_strike/cs_default_watermarks.yar"
#   ...

# Run the master ruleset against a scan target
yara -r --print-tags --print-meta master.yar scan_target/
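
Keeping the master file in sync by hand gets tedious as the rule tree grows. One possible approach, sketched here with the standard library only, is to generate the `include` lines from the directory contents. The function name `build_master` is hypothetical, and the sketch assumes `master.yar` lives next to the `rules/` directory so that the relative include paths resolve.

```python
import tempfile
from pathlib import Path


def build_master(rules_dir: Path) -> str:
    """Return master.yar contents: one include line per .yar file under rules_dir."""
    base = rules_dir.parent  # master.yar is assumed to sit beside rules/
    includes = [
        f'include "{path.relative_to(base).as_posix()}"'
        for path in sorted(rules_dir.rglob("*.yar"))
    ]
    return "\n".join(includes) + "\n"


# Demonstration against a throwaway directory tree with empty rule files
with tempfile.TemporaryDirectory() as tmp:
    rules = Path(tmp) / "rules"
    for rel in (
        "qakbot/qakbot_obama200_2023.yar",
        "cobalt_strike/cs_default_watermarks.yar",
    ):
        target = rules / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    master_text = build_master(rules)

print(master_text, end="")
```

Sorting the paths keeps the generated file stable between runs, which makes changes to the rule set easy to review in version control.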