Skip to content

Instantly share code, notes, and snippets.

@jordansissel
Last active June 7, 2021 21:10
Show Gist options
  • Select an option

  • Save jordansissel/3753353 to your computer and use it in GitHub Desktop.

Select an option

Save jordansissel/3753353 to your computer and use it in GitHub Desktop.

Revisions

  1. jordansissel revised this gist Jun 7, 2021. No changes.
  2. jordansissel created this gist Sep 20, 2012.
    110 changes: 110 additions & 0 deletions logstash-mysql-query-parse.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,110 @@
    # parsing mysql query logs with logstash

    The problem is that some lines in the file are missing timestamps when they
    aren't continuations of any previous line. It's dumb, really.

    The mysql query log is seriously bullshit format, but nothing logstash can't unscrew.

    The main goal here is to show how we can fix the 'missing timestamp' problem.

    % ruby bin/logstash agent -e '
    input {
    stdin {
    type => "mysql-slowquery"
    }
    }
    filter {
    # mysql query logs sometimes are missing a timestamp
    # and use spaces instead (WTF), so merge timestampless events with the
    # previous event.
    multiline {
    what => previous
    pattern => "^\s"
    }

    # pull out the timestamp (like, "120707 0:40:34")
    grok { pattern => "^%{NUMBER:date} *%{NOTSPACE:time}" }

    # put the timestamp into a single field
    mutate { replace => [ "time", "%{date} %{time}" ] }

    # parse the timestamp, which could be one or two digits.
    date { time => [ "YYMMdd H:mm:ss", "YYMMdd HH:mm:ss" ] }

    # remove time/date fields only previously added for parsing.
    mutate { remove => [ "time", "date" ] }

    # Now split up the multiline again, which keeps the timestamp for all split
    # out events. The defaults here are fine as they split '@message' by '\n'
    split { }
    }

    output {
    stdout { debug => true }
    }
    '

    ## input

    120707 0:40:34 4 Connect root@localhost on
    4 Query select @@version_comment limit 1
    120707 0:40:45 4 Query select * from mysql.user
    120707 0:41:18 5 Query hello world

    ## logstash output

    The key thing to note here is that the 'select @@version_comment ...' event now
    has the same timestamp as the previous line.

    {
    "@source" => "stdin://carrera.databits.net/",
    "@tags" => [
    [0] "multiline"
    ],
    "@fields" => {
    "time" => "120707 0:40:34"
    },
    "@timestamp" => "2012-07-07T04:40:34.000Z",
    "@source_host" => "carrera.databits.net",
    "@source_path" => "/",
    "@message" => "120707 0:40:34 4 Connect root@localhost on",
    "@type" => "mysql-slowquery"
    }
    {
    "@source" => "stdin://carrera.databits.net/",
    "@tags" => [
    [0] "multiline"
    ],
    "@fields" => {
    "time" => "120707 0:40:34"
    },
    "@timestamp" => "2012-07-07T04:40:34.000Z",
    "@source_host" => "carrera.databits.net",
    "@source_path" => "/",
    "@message" => " 4 Query select @@version_comment limit 1",
    "@type" => "mysql-slowquery"
    }
    {
    "@source" => "stdin://carrera.databits.net/",
    "@tags" => [],
    "@fields" => {
    "time" => "120707 0:40:45"
    },
    "@timestamp" => "2012-07-07T04:40:45.000Z",
    "@source_host" => "carrera.databits.net",
    "@source_path" => "/",
    "@message" => "120707 0:40:45 4 Query select * from mysql.user",
    "@type" => "mysql-slowquery"
    }
    {
    "@source" => "stdin://carrera.databits.net/",
    "@tags" => [],
    "@fields" => {
    "time" => "120707 0:41:18"
    },
    "@timestamp" => "2012-07-07T04:41:18.000Z",
    "@source_host" => "carrera.databits.net",
    "@source_path" => "/",
    "@message" => "120707 0:41:18 5 Query hello world",
    "@type" => "mysql-slowquery"
    }