RCFMARTIN

This guide assumes a working Zabbix agent (covered in the agent post) on the host whose logs you want to read. The agent must run as a user with read access to the log files.

Most monitoring catches symptoms: CPU is high, the service is down, the disk is full. Log monitoring catches causes: the stack trace that preceded the crash, the auth failure pattern, the ORM warning that silently doubles your DB load. Zabbix has had log items since version 1.x, and they're still under-used.

Four Item Keys

Key	What it watches
`log[]`	One specific file
`logrt[]`	A rotating file (regex pattern matches new files)
`log.count[]`	How many lines matched (numeric, suitable for graphs)
`logrt.count[]`	Same, for rotating files

logrt[] is what you almost always want. Modern apps rotate logs (access.log, access.log.1, access.log.2.gz). log[] only watches one fixed file and misses everything that lands in a freshly rotated one.

The Item Definition

In Data Collection -> Hosts -> Items -> Create item:

Type: Zabbix agent (active). Log items must be active checks; passive doesn't work.
Type of information: Log
Update interval: 1m
Key: logrt[/var/log/nginx/access\.log.*,"HTTP/.*\" 5\d\d ",,,skip]

The key signature is:

logrt[<file_regex>, <content_regex>, <encoding>, <maxlines>, <mode>, <output>, <maxdelay>, <options>]

Three pieces matter most:

file_regex: a regex matching log file paths. The directory part is literal; the filename part is regex.
content_regex: a regex applied per line. Only matching lines become item values.
mode: all (default) or skip. Use skip so the first poll starts at the end of the file instead of replaying the entire backlog.
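
Before saving the item, it can help to check both regexes offline. The following is a minimal sketch, assuming Python's `re` module is close enough to the agent's regex engine for these simple patterns; the directory listing and the sample access-log lines are made up for illustration, and the helper names are hypothetical.

```python
import re


def split_file_regex(param: str) -> tuple[str, str]:
    """Split a logrt[] first parameter into (literal directory, filename regex)."""
    directory, _, filename_regex = param.rpartition("/")
    return directory, filename_regex


def matching_files(param: str, filenames: list[str]) -> list[str]:
    """Return the filenames that the filename part of the parameter matches."""
    _, filename_regex = split_file_regex(param)
    pattern = re.compile(filename_regex)
    return [name for name in filenames if pattern.search(name)]


# Hypothetical contents of /var/log/nginx
listing = ["access.log", "access.log.1", "access.log.2.gz", "error.log"]
print(matching_files(r"/var/log/nginx/access\.log.*", listing))

# The content regex from the item key, with the key's \" escaping removed.
content_regex = re.compile(r'HTTP/.*" 5\d\d ')

# Made-up lines in nginx's combined format.
sample_lines = [
    '203.0.113.7 - - [01/May/2024:12:00:00 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
    '203.0.113.7 - - [01/May/2024:12:00:01 +0000] "GET /api HTTP/1.1" 502 157 "-" "curl/8.0"',
]
print([bool(content_regex.search(line)) for line in sample_lines])
```

Only the second sample line should match, since it is the only one with a 5xx status directly after the closing quote of the request.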

Encoding Matters

Windows logs are often UTF-16 LE with BOM (Get-EventLog -LogName Application | Out-File foo.log defaults to that). Linux logs are UTF-8. Specify the encoding explicitly:

log[/var/log/app.log,error,UTF-8]
log[C:\App\app.log,error,UTF-16]

Get this wrong and the agent reads byte-pairs as garbled mojibake and silently matches nothing.
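
The failure mode is easy to reproduce outside Zabbix. This sketch illustrates the mechanism only and does not reflect the agent's internals: it encodes a made-up log line as UTF-16 LE, then searches it once under the wrong encoding assumption and once under the right one.

```python
import re

# A made-up log line, encoded the way a UTF-16 LE writer would store it (BOM omitted).
line = "2024-05-01 12:00:00 error: disk quota exceeded\n"
raw = line.encode("utf-16-le")

pattern = re.compile("error")

# Wrong assumption: every byte is one character, so a NUL lands between letters.
wrong = raw.decode("latin-1")
# Right assumption: decode with the encoding the file was written in.
right = raw.decode("utf-16-le")

print(bool(pattern.search(wrong)))   # the interleaved NUL bytes break the match
print(bool(pattern.search(right)))   # the same regex matches once decoded correctly
```

Nothing raises an error in the wrong case; the pattern simply never matches, which is why a mismatched encoding looks like a quiet log.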

Multi-line Records

Stack traces and pretty-printed JSON span many lines. Zabbix processes a log one line at a time, and the item key parameters are positional, with no option that merges lines. The practical approach is to match only the line that starts each record:

logrt[/var/log/app/server\.log,"^\d{4}-\d{2}-\d{2}",UTF-8,200,skip,,10]

The two relevant parameters:

maxlines (fourth parameter, 200 here): caps how many new lines per second the agent sends to the server or proxy. Prevents a 10MB log explosion from saturating the agent.
content_regex (^\d{4}-\d{2}-\d{2}): matches each line that starts a record, capturing the first line as the value. The full multi-line record stays in the file; you just key triggers off the first line.

Truly merging multi-line records into one Zabbix value isn't great with the standard agent. For real structured-log workloads, ship via Filebeat/Fluent Bit to a SIEM and let Zabbix monitor the SIEM's anomaly count via the API. Trying to do everything in logrt[] past a certain point is the wrong tool.
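
To make the "key triggers off the first line" idea concrete, here is a small sketch, assuming records start with an ISO-style date as in the regex above. The sample lines are invented, and `group_records` only shows what a downstream log shipper would do; the Zabbix agent does not do this.

```python
import re

RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def first_lines(lines: list[str]) -> list[str]:
    """What a line-at-a-time content regex would capture: record-start lines only."""
    return [line for line in lines if RECORD_START.search(line)]


def group_records(lines: list[str]) -> list[list[str]]:
    """What true merging looks like: continuation lines attach to the previous record."""
    records: list[list[str]] = []
    for line in lines:
        if RECORD_START.search(line) or not records:
            records.append([line])
        else:
            records[-1].append(line)
    return records


# Invented application log with one stack trace.
log_lines = [
    "2024-05-01 12:00:00 ERROR Unhandled exception",
    "Traceback (most recent call last):",
    '  File "app.py", line 10, in handler',
    "ValueError: bad input",
    "2024-05-01 12:00:05 INFO Request served",
]

print(first_lines(log_lines))
print([len(record) for record in group_records(log_lines)])
```

The first function returns two lines; the second returns two records of four lines and one line. The gap between the two is the part that belongs in a log shipper rather than in logrt[].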

Severity Mapping with Preprocessing

Add a preprocessing step on the item under Preprocessing -> Add:

Regular expression: pattern \b(ERROR|WARN|INFO|DEBUG)\b, output \1
Replace (one step per level): ERROR→4, WARN→3, INFO→2, DEBUG→1

Now the item value is a number. Keep Type of information: Log and set Log time format for the timestamp; the value itself carries the severity:

last(/Host/logrt[...]) >= 4

A trigger that fires on any ERROR or worse, regardless of which message.
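
The two preprocessing steps can be emulated offline to check the mapping before touching the item. This is a sketch of the logic only, not of Zabbix's preprocessing engine; the function names are hypothetical, and raising on a line without a level is an assumption about how you would want a non-matching line to surface.

```python
import re

SEVERITY_PATTERN = re.compile(r"\b(ERROR|WARN|INFO|DEBUG)\b")
SEVERITY_CODES = {"ERROR": "4", "WARN": "3", "INFO": "2", "DEBUG": "1"}


def extract_level(line: str) -> str:
    """Step 1: the regular expression step, keeping only capture group 1."""
    match = SEVERITY_PATTERN.search(line)
    if match is None:
        raise ValueError(f"no severity keyword in line: {line!r}")
    return match.group(1)


def replace_level(value: str) -> str:
    """Step 2: the chain of replace steps, one per severity word."""
    for word, code in SEVERITY_CODES.items():
        value = value.replace(word, code)
    return value


def severity(line: str) -> int:
    """Both steps combined, returning the numeric severity."""
    return int(replace_level(extract_level(line)))


print(severity("2024-05-01 12:00:00 ERROR db connection timed out"))
print(severity("2024-05-01 12:00:01 WARN slow query"))
```

Note that the word boundaries mean a line logging `WARNING` rather than `WARN` does not match, so check which spelling your application uses.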

Zabbix's Log time format field tells the agent how to parse the line's own timestamp into the event time. Without it, every event is timestamped with the moment the agent read it, which is useless for correlation across hosts.

A Real-World Example: Nginx 5xx

Key:            logrt[/var/log/nginx/access\.log.*,"HTTP/[0-9.]+\" 5\d\d ",,,skip,,200]
Type:           Log
Update int.:    30s
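Log time fmt:   yyyy-MM-dd hh:mm:ss

(The format is positional: y, M, d, h, m and s mark where the digits sit in the line, and any other character, such as p, is a filler for one position to skip. This example assumes a custom log_format that starts each line with a timestamp like 2024-05-01 12:34:56; the exact format depends on your log line.)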
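
Because the log time format is read position by position, it can be sanity-checked against a sample line before it goes into the item. The following is a rough emulation under that assumption, not the agent's actual parser: each y, M, d, h, m or s in the format picks up the character at the same position in the line, and every other character is skipped.

```python
from datetime import datetime


def parse_log_time(line: str, fmt: str) -> datetime:
    """Emulate positional parsing: collect digits under y/M/d/h/m/s placeholders."""
    fields = {ch: "" for ch in "yMdhms"}
    for position, ch in enumerate(fmt):
        if ch in fields and position < len(line):
            fields[ch] += line[position]
    return datetime(
        int(fields["y"]), int(fields["M"]), int(fields["d"]),
        int(fields["h"]), int(fields["m"]), int(fields["s"]),
    )


# A made-up line that starts with its timestamp.
print(parse_log_time("2024-05-01 12:34:56 upstream timed out", "yyyy-MM-dd hh:mm:ss"))

# A line with six leading positions to skip before the date.
print(parse_log_time(" 23480:20100328:154718.045 Zabbix agent started.", "pppppp:yyyyMMdd:hhmmss"))
```

If the format is shifted by even one position, the collected fields stop being digits and `int()` fails, which is a quick way to spot an off-by-one in the format string.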

Then a trigger:

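nodata(/web01/logrt[...],300) = 0 and find(/web01/logrt[...],5m,iregexp,"\" 500 ") = 1

"In the last 5 minutes there were log lines, and at least one of them was a 500." The pattern matches the status field as it appears in an access log line, right after the closing quote of the request. Use find(...) with a time period rather than last(...) to match across the recent values; without the period, find() checks only the latest value.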

Counting, Not Capturing: `log.count[]`

Sometimes you don't care about the content of every match, just the rate. log.count[] returns the number of matches per poll:

log.count[/var/log/auth.log,"Failed password",UTF-8,,skip]

Type: Numeric (unsigned). Now you can graph "failed SSH attempts per minute" and trigger on a rate spike:

sum(/host/log.count[...],1m) > 50

Cheaper than log[] for high-volume patterns where you only want aggregate behavior.
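
The difference between capturing and counting can be sketched in a few lines. This is an illustration with invented data, assuming each poll hands over only the lines written since the previous poll; the function names are hypothetical and the threshold of 50 mirrors the trigger above.

```python
import re


def count_matches(new_lines: list[str], pattern: str) -> int:
    """What a count-style item stores for one poll: the number of matching lines."""
    regex = re.compile(pattern)
    return sum(1 for line in new_lines if regex.search(line))


def rate_exceeded(counts_in_window: list[int], threshold: int = 50) -> bool:
    """The trigger side: sum the per-poll counts in the window and compare."""
    return sum(counts_in_window) > threshold


# Invented auth.log lines from one quiet poll and one poll during a scan.
quiet_poll = [
    "May  1 12:00:01 web01 sshd[811]: Accepted publickey for deploy from 198.51.100.4",
    "May  1 12:00:09 web01 sshd[812]: Failed password for root from 203.0.113.9",
]
noisy_poll = ["May  1 12:01:00 web01 sshd[900]: Failed password for admin from 203.0.113.9"] * 60

counts = [count_matches(quiet_poll, "Failed password"), count_matches(noisy_poll, "Failed password")]
print(counts)
print(rate_exceeded(counts[:1]), rate_exceeded(counts))
```

Only integers are stored per poll, which is what makes the count items cheap to keep and easy to graph.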

Permissions

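On Linux the agent runs as the non-privileged zabbix user, which cannot read most app logs. On Windows the service runs as LocalSystem by default; if you have moved it to a restricted service or domain account, that account needs read access too.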

# Linux: add zabbix to the right group
sudo usermod -a -G adm zabbix          # /var/log/syslog and friends
sudo chgrp zabbix /var/log/myapp/*.log && sudo chmod 640 /var/log/myapp/*.log    # group read, not world-readable
sudo systemctl restart zabbix-agent

# Windows: grant the agent's service account read on the log directory
$svc = Get-CimInstance Win32_Service -Filter "Name='Zabbix Agent'"
$account = $svc.StartName        # e.g. "LocalSystem" by default, or a dedicated service account

icacls 'C:\App\Logs' /grant "$($account):(RX)" /T

Don't make app log files world-readable to satisfy the agent. Either group-grant or use ACLs. World-readable logs are an audit finding.
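
A quick audit of the mode bits catches the world-readable case before an auditor does. This is a minimal sketch for POSIX systems using only the standard library; it inspects classic permission bits and does not see ACLs, and the temporary file stands in for a real log path.

```python
import os
import stat
import tempfile


def is_group_readable(path: str) -> bool:
    """True if the file's group has the read bit set."""
    return bool(os.stat(path).st_mode & stat.S_IRGRP)


def is_world_readable(path: str) -> bool:
    """True if 'other' has the read bit set, which is the audit finding."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


# Stand-in for an application log; a real check would walk the log directory.
with tempfile.NamedTemporaryFile(suffix=".log") as log_file:
    os.chmod(log_file.name, 0o640)  # owner read/write, group read, no access for others
    print(is_group_readable(log_file.name), is_world_readable(log_file.name))
```

A file that passes this check still needs the right group owner; the mode bits only say that some group can read it.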

Operational Notes

logrt[] remembers its position. The server keeps the last processed size and modification time for each log item and hands them back to the agent when it restarts, so the agent resumes where it left off. A deliberate re-scan means resetting that state, for example by recreating the item with mode all.
A rotation adds a new file that matches the regex. The agent orders the matching files by modification time and finishes reading the older ones before moving on to the newest, so lines written just before a rotation are still picked up.
Log items go on the active queue. If the agent's connection to the proxy is broken, log events buffer locally up to BufferSize. Default is small; raise it for hosts with chatty logs.
Trigger storms are easy. A misconfigured find() over an error pattern can fire dozens of times a minute. Use trigger dependencies and de-duplication in the action, not in the trigger expression.
Combine with the process logging post for "who launched this and what did its log say". Zabbix natively correlates by host and time, so a process's start and its first error line line up in Latest data automatically.

What to Do Next

logrt[] for rotating files, content regex for filtering, log.count[] for rate-style metrics, severity preprocessing for triggers. Get the encoding right, set skip so the first poll doesn't avalanche, watch your trigger expressions for storm potential, and group-grant the agent read access rather than world-reading. Once you have the agent watching the right files, every other monitoring layer in your stack benefits because the cause finally shows up in the same dashboard as the symptom.

Three concrete moves to add log monitoring to a host this week:

Pick one log file, not the whole directory. Application error log, auth log, or job runner log: start with the one whose error patterns you already know by heart. Validate that the regex catches what you expect before you scale to a second file.
Set skip on the first deploy. A new log item without skip will replay every error from the file's history and storm your alerting on day one. skip says "start watching from now"; turn it off only after you've validated the pattern.
Add log.count[] next to the first content match. Triggers on the content tell you something happened; triggers on the rate tell you it's a flood. The two together separate "one bad request" from "we're being scanned".

Pairs naturally with the process logging post (so a process's start and its first error line correlate by host and time) and the quieting alerts post (because log-based triggers are exactly the ones that need hysteresis and dedup to avoid storms).

Zabbix Log File Monitoring with log[] and logrt[]