I’ve been working on a personal project with Logstash lately, and it relies relatively heavily on grok filters for matching text and extracting matched parts. Today, I’ve been parsing syslog from Puppet to extract various metrics and timings, which will then be passed on from Logstash to Etsy’s statsd and then to graphite for display. Unfortunately, a few of my patterns are showing the “_grokparsefailure” tag and I just can’t seem to find the problem.
The logstash wiki provides a page on Testing your Grok patterns, as does Sean Laurent on his blog: Testing Logstash grok filters. Unfortunately, I work in a CentOS/RHEL shop, and we’re decidedly not a Ruby shop. Our Logstash install is using the monolithic/standalone Java JAR. We run Puppet, which is currently under ruby 1.8.7, and the jls-grok rubygem requires ruby 1.9. There’s no way I’d feel safe installing 1.9 on any of our machines, as they all run (and require) Puppet. So, I found out about RVM, the Ruby Version Manager, which allows you to run and switch between multiple ruby versions, and all of it is installed on a per-user basis. So, I created a new user on my Fedora 16 desktop called “rvmtest” and went about the process of setting up what’s needed to test grok patterns in the user’s local environment. I imagine this would work similarly under CentOS or RHEL, but the following is only tested on Fedora 16. If you have any issues, you should probably refer back to the RVM documentation.
- Create the isolated user, just to be extra careful. Log in as that user.
- As per Installing RVM:
curl https://raw.github.com/wayneeseguin/rvm/master/binscripts/rvm-installer | bash -s stable
- Edit your ~/.bashrc and add:
[[ -s "$HOME/.rvm/scripts/rvm" ]] && . "$HOME/.rvm/scripts/rvm"
[[ -r $rvm_path/scripts/completion ]] && . $rvm_path/scripts/completion
The first line sets up RVM for your sessions, and the second sources in tab-completion for the rvm command.
- source .bashrc
- If you’re interested, you can see a list of all known rubies with:
rvm list known
- Install Ruby (MRI) 1.9.2:
rvm install 1.9.2
- “switch” to that ruby:
rvm use 1.9.2
and confirm it by running ruby -v
- Make it the default ruby for us:
rvm use 1.9.2 --default
- Create a “gemset” (set of rubygems for our environment):
rvm gemset create groktest
- Use it, and set it as default:
rvm use 1.9.2@groktest --default
- For grok testing, install the jls-grok gem:
gem install jls-grok
- check that it’s there:
gem list
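- Download Logstash’s default grok patterns from GitHub and save them as grok-patterns in the directory where you’ll run the test script (the script below loads that file by name).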
- You should now be ready to test some grok patterns.
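Before writing any patterns, it can help to confirm that the shell really is using the RVM ruby and that the gem is visible. The following is a minimal sketch, not part of RVM or jls-grok: `grok_environment_report` is a hypothetical helper name, and `grok-pure` is the library name that the test script in this post requires.

```ruby
require 'rubygems'

# Report whether this ruby is new enough for jls-grok and whether the
# gem's library can be loaded. Hypothetical helper, for illustration only.
def grok_environment_report(required = '1.9.0')
  # Gem::Version compares version strings component by component.
  ruby_ok = Gem::Version.new(RUBY_VERSION) >= Gem::Version.new(required)

  grok_ok = begin
    require 'grok-pure' # library file provided by the jls-grok gem
    true
  rescue LoadError
    false
  end

  { :ruby_version => RUBY_VERSION, :ruby_ok => ruby_ok, :grok_loadable => grok_ok }
end

report = grok_environment_report
puts "ruby #{report[:ruby_version]}: #{report[:ruby_ok] ? 'ok' : 'too old'}"
puts "grok-pure: #{report[:grok_loadable] ? 'loadable' : 'missing (try: gem install jls-grok)'}"
```

If the second line reports the gem as missing, the usual culprit is a shell that has not switched to the 1.9.2@groktest gemset.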
While the two howtos linked above use irb to interactively test the patterns, I prefer something easier to move to production, more reliable, and more repeatable. The following quick little ruby script takes text to match against on STDIN (log files, messages, etc.) and prints the matches to STDOUT. The script is based on test.rb from jordansissel’s ruby-grok. Note one important thing here: I couldn’t get the shebang (#!) to work with anything other than the explicit path to my RVM ruby install (which ruby), so you’ll need to update it yourself.
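#!.rvm/rubies/ruby-1.9.2-p320/bin/ruby
# NOTE: adjust the shebang above to the path that `which ruby` prints for your RVM ruby.
require 'rubygems'
require 'grok-pure'
require 'pp'

grok = Grok.new
# Load the pattern definitions (SPACE, NUMBER, WORD, ...) from the downloaded file
grok.add_patterns_from_file("grok-patterns")

pattern = 'your_grok_pattern_here'
grok.compile(pattern)
puts "PATTERN: #{pattern}"

# Try the compiled pattern against every line read from STDIN
while a = gets
  puts "IN: #{a}"
  match = grok.match(a)
  if match
    puts "MATCH:"
    pp match.captures
  else
    puts "No Match."
  end
end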
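When a log file is long, a summary of what failed can be more useful than a line-by-line dump. Here is a rough sketch of that idea. `tally_matches` is a hypothetical helper, and it only assumes the matcher responds to `match` and returns something falsy on failure, as the Grok object does in the script above. The demo uses a plain Regexp as a stand-in so it runs without the gem.

```ruby
require 'stringio'

# Feed every line of `io` to `matcher` and tally the results.
# `matcher` can be a compiled Grok object or, as in this demo, a Regexp.
def tally_matches(io, matcher)
  tally = { :matched => 0, :failed => [] }
  io.each_line do |line|
    line = line.chomp
    if matcher.match(line)
      tally[:matched] += 1
    else
      tally[:failed] << line # keep the text so it can be inspected later
    end
  end
  tally
end

# Stand-in input; in practice this would be STDIN or an open log file.
sample = StringIO.new(<<LOG)
Updated 2 files in puppet svn (environment prod) to revision 754
Something unrelated
LOG

result = tally_matches(sample, /revision \d+\z/)
puts "matched: #{result[:matched]}"
result[:failed].each { |line| puts "NO MATCH: #{line}" }
```

The failed lines are exactly the ones that would end up tagged “_grokparsefailure” in Logstash, so they are the ones worth looking at first.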
Here’s an example using a pattern to capture information from custom syslog messages triggered by updating puppet configs. Here are some sample messages:
[rvmtest@jantmanwork ~]$ cat puppet.log
Updated 2 files in puppet svn (environment prod) to revision 754
Updated 3 files in puppet svn (environment prod) to revision 756
Updated 1 files in puppet svn (environment prod) to revision 757
And the pattern that I use:
Updated%{SPACE}%{NUMBER:puppet_svn_num_files}%{SPACE}files%{SPACE}in%{SPACE}puppet%{SPACE}svn%{SPACE}\(environment%{SPACE}%{WORD:puppet_svn_env}\)%{SPACE}to%{SPACE}revision%{SPACE}%{NUMBER:puppet_svn_revision}
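If the %{NAME:field} syntax looks opaque, it may help to see roughly what it turns into. The snippet below is a simplified illustration, not how jls-grok is actually implemented: `SKETCH_PATTERNS` and `expand_grok_sketch` are hypothetical names, and the three definitions are cut-down stand-ins for the real ones in the grok-patterns file. It expands each reference into a regular expression fragment and turns named references into named capture groups.

```ruby
# Cut-down stand-ins for the real definitions in Logstash's grok-patterns file.
SKETCH_PATTERNS = {
  'SPACE'  => '\s*',
  'NUMBER' => '\d+(?:\.\d+)?',
  'WORD'   => '\w+'
}

# Expand %{NAME} and %{NAME:field} into plain regex source.
# Named references become named capture groups; the rest are non-capturing.
def expand_grok_sketch(pattern)
  pattern.gsub(/%\{(\w+)(?::(\w+))?\}/) do
    name, field = $1, $2
    body = SKETCH_PATTERNS.fetch(name) { raise ArgumentError, "unknown pattern #{name}" }
    field ? "(?<#{field}>#{body})" : "(?:#{body})"
  end
end

pattern = 'Updated%{SPACE}%{NUMBER:puppet_svn_num_files}%{SPACE}files%{SPACE}in' \
          '%{SPACE}puppet%{SPACE}svn%{SPACE}\(environment%{SPACE}%{WORD:puppet_svn_env}\)' \
          '%{SPACE}to%{SPACE}revision%{SPACE}%{NUMBER:puppet_svn_revision}'
regex = Regexp.new(expand_grok_sketch(pattern))

m = regex.match('Updated 2 files in puppet svn (environment prod) to revision 754')
# Print each named field with the text it captured.
m.names.each { |field| puts "#{field}=#{m[field]}" }
```

Note the literal parentheses around the environment name are escaped in the pattern, exactly as they would be in an ordinary regular expression.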
And the output of the test script when fed that log file:
[rvmtest@jantmanwork ~]$ cat puppet.log | ./puppet-update-test.rb
PATTERN: Updated%{SPACE}%{NUMBER:puppet_svn_num_files}%{SPACE}files%{SPACE}in%{SPACE}puppet%{SPACE}svn%{SPACE}\(environment%{SPACE}%{WORD:puppet_svn_env}\)%{SPACE}to%{SPACE}revision%{SPACE}%{NUMBER:puppet_svn_revision}
IN: Updated 2 files in puppet svn (environment prod) to revision 754
MATCH:
{"SPACE"=>[" ", " ", " ", " ", " ", " ", " ", " ", " ", " "],
"NUMBER:puppet_svn_num_files"=>["2"],
"BASE10NUM"=>["2", "754"],
"WORD:puppet_svn_env"=>["prod"],
"NUMBER:puppet_svn_revision"=>["754"]}
IN: Updated 3 files in puppet svn (environment prod) to revision 756
MATCH:
{"SPACE"=>[" ", " ", " ", " ", " ", " ", " ", " ", " ", " "],
"NUMBER:puppet_svn_num_files"=>["3"],
"BASE10NUM"=>["3", "756"],
"WORD:puppet_svn_env"=>["prod"],
"NUMBER:puppet_svn_revision"=>["756"]}
IN: Updated 1 files in puppet svn (environment prod) to revision 757
MATCH:
{"SPACE"=>[" ", " ", " ", " ", " ", " ", " ", " ", " ", " "],
"NUMBER:puppet_svn_num_files"=>["1"],
"BASE10NUM"=>["1", "757"],
"WORD:puppet_svn_env"=>["prod"],
"NUMBER:puppet_svn_revision"=>["757"]}
Hopefully this will make the process a bit simpler for someone else…