Useful Technology Scripts

dbmon.pl A simple script to monitor MySQL queries and log them into a file for future processing. Can optionally kill long-running queries.

randomPassword.pl Generate a human-rememberable strong password based on subsets of dictionary words. Can also generate impossible-to-remember and very long passwords if you desire. Tries to avoid "obscene" words inside. Some sample output:

$ randomPassword.pl -n 5      $ randomPassword.pl -n 5 -l
SphinoHippalnas               GlycirCalibcierDarnete     
RemaipeChilvied               CozenifaConlarsEconver     
GymolAntoinnged               FlourumRedniumsVacmake     
FinasuVenisires               GrovelefMorciteAlsatid     
PerusWoolitgest               ForiqEarthwenneMonogct     
$ randomPassword.pl -f 2 -n 5    $ randomPassword.pl -f 2 -n 5 -l
Spebo'La35                       LAyxI^Bl53$Asseypo
Decif+DE58                       ISoT(Cla69)Mazourt
StaRAb%B83                       TaNaBA/T37)Hausdon
DEpteK!C14                       ExCaLu)N31!Navigac
BEmO(VeL17                       MAHop_SK42~Pigroun
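
The "subsets of dictionary words" idea can be sketched in a few lines of Python. This is only an illustration, not the logic of randomPassword.pl: the word list, the fragment lengths, and the names word_fragment and make_password are all assumptions made up for the example, and it does not attempt the obscenity filter or the options shown above.

```python
import random

# Hypothetical stand-in word list; a real tool would read a dictionary file.
WORDS = ["sphinx", "hippopotamus", "remain", "chilled", "gymnast", "antoinette",
         "venison", "woollen", "earthen", "monogram", "calibre", "flourish"]


def word_fragment(word, rng, min_len=3, max_len=6):
    """Return a capitalised contiguous slice of `word` (lengths are assumptions)."""
    length = min(len(word), rng.randint(min_len, max_len))
    start = rng.randint(0, len(word) - length)
    return word[start:start + length].capitalize()


def make_password(rng, fragments=3, words=WORDS):
    """Glue several word fragments together into one pronounceable-ish password."""
    return "".join(word_fragment(rng.choice(words), rng) for _ in range(fragments))


if __name__ == "__main__":
    # SystemRandom draws from the operating system's entropy source.
    rng = random.SystemRandom()
    for _ in range(5):
        print(make_password(rng))
```

Fragments of real words tend to stay pronounceable, which is what makes the result easier to remember than a string of random characters.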

distribution Generate an ASCII graph (histogram visualisation) of values fed into the script in any order. This is useful if you're at a terminal where you don't have access to a graphical interface, and so all the advice to generate PNGs or use a monitoring tool is irrelevant to you. Some features:

  1. Pass in width or height parameters to change the width of the chart or number of values charted.
  2. The header is printed on STDERR so you can pipe the output to sort if you want to see the histogram sorted by values.
  3. Tokenize and match (grep) the input within the tool.
  4. Colourise the output (no meaning to the colour as yet).
  5. See progress of script and statistics using --verbose.
  6. Output histogram in any character you want (or some sequences of characters).
  7. Already-tallied values (e.g. "du -sb /etc/*") can also be graphed with --graph.
  8. Pass in options in very-short form, e.g. "distribution -w=90 -h=15 -c".
Here's some sample output:
$ zcat /var/log/syslog*gz \
        | awk '{print $5" "$6}' \
        | distribution --tokenize=word --match=word --height=10 --verbose --char=o
 + Objects Processed: 124295.   
 tokens/lines examined: 124295
  tallied in histogram: 36711
     histogram entries: 140
               runtime: 109.03ms

Val           |Ct (Pct)       Histogram
kernel        |12112 (32.99%) ooooooooooooooooooooooooooooooooooooooooooooooooo
NetworkManager|5695 (15.51%)  ooooooooooooooooooooooo
info          |5371 (14.63%)  oooooooooooooooooooooo
client        |1633 (4.45%)   ooooooo
ovpn          |1633 (4.45%)   ooooooo
daemon        |868 (2.36%)    oooo
avahi         |853 (2.32%)    oooo
dhclient      |736 (2.00%)    ooo
Trying        |667 (1.82%)    ooo
dnsmasq       |562 (1.53%)    ooo
$ du -sb /etc/* \
        | distribution --graph --height=15 --char='|'
Val                   |Ct (Pct)         Histogram
/etc/mateconf         |7780758 (44.60%) |||||||||||||||||||||||||||||||||||||||
/etc/brltty           |3143272 (18.02%) ||||||||||||||||
/etc/apparmor.d       |1597915 (9.16%)  ||||||||
/etc/bash_completion.d|597836 (3.43%)   |||
/etc/mono             |535352 (3.07%)   |||
/etc/ssl              |465414 (2.67%)   |||
/etc/ardour2          |362303 (2.08%)   ||
/etc/X11              |226309 (1.30%)   ||
/etc/ImageMagick      |202358 (1.16%)   |
/etc/init.d           |143281 (0.82%)   |
/etc/ssh              |138042 (0.79%)   |
/etc/fonts            |119862 (0.69%)   |
/etc/sound            |112051 (0.64%)   |
/etc/xdg              |111971 (0.64%)   |
/etc/java-7-openjdk   |100414 (0.58%)   |
$ zcat access.log*gz \
        | awk '{print $7}' \
        | distribution -t=/ -h=15
Val            |Ct (Pct)      Histogram
Art            |1839 (16.58%) +++++++++++++++++++++++++++++++++++++++++++++++++
Rendered       |1596 (14.39%) ++++++++++++++++++++++++++++++++++++++++++
Blender        |1499 (13.52%) ++++++++++++++++++++++++++++++++++++++++
AznRigging     |760 (6.85%)   ++++++++++++++++++++
Music          |457 (4.12%)   ++++++++++++
Ringtones      |388 (3.50%)   +++++++++++
CuteStance     |280 (2.52%)   ++++++++
Traditional    |197 (1.78%)   ++++++
Technology     |171 (1.54%)   +++++
CreativeExhaust|134 (1.21%)   ++++
Fractals       |127 (1.15%)   ++++
robots.txt     |125 (1.13%)   ++++
RingtoneEP1.mp3|125 (1.13%)   ++++
Poetry         |108 (0.97%)   +++
RingtoneEP2.mp3|95 (0.86%)    +++
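
For a rough idea of what goes into output like the above, here is a minimal Python sketch of the tally-and-draw step. It is not the distribution script's code: the function name histogram_lines, its parameters, and the bar-scaling rule (bars scaled relative to the largest count) are assumptions chosen for the example. Percentages are taken against everything tallied, not just the rows shown.

```python
from collections import Counter


def histogram_lines(values, height=10, width=40, char="+"):
    """Tally `values` and return the top `height` entries as text rows."""
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(height)  # most frequent first
    if not top:
        return []
    max_count = top[0][1]
    # Pre-compute the "count (pct)" labels so the columns can be aligned.
    labels = [f"{count} ({100 * count / total:.2f}%)" for _, count in top]
    key_width = max(len(str(key)) for key, _ in top)
    label_width = max(len(label) for label in labels)
    lines = []
    for (key, count), label in zip(top, labels):
        # Scale each bar against the largest count; always draw at least one char.
        bar = char * max(1, round(width * count / max_count))
        lines.append(f"{str(key):<{key_width}}|{label:<{label_width}} {bar}")
    return lines


if __name__ == "__main__":
    sample = ["kernel"] * 6 + ["info"] * 3 + ["daemon"]
    for line in histogram_lines(sample, height=3, width=20, char="o"):
        print(line)
```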

diskTest.pl Brad Fitzpatrick of LiveJournal described the concept of a script that tests storage devices to make sure that when they claim some data is on disk, it actually is. After some web searching, I could not find the script, but only a note from Brad that they hadn't released it due to its being incomplete, so I wrote my own. If your storage device is hooked up to a Linux box, you can use this script to make sure your hardware isn't lying to you. We used it at Digg to make sure our disks weren't lying to us. Note: We decided to put many of our disks into a configuration where they WERE lying to us, because we get more performance that way. But now I'm just being pedantic.

Perl and Python MySQL DBI skeleton scripts. I keep writing these things from scratch. About time I stopped.

  1. mysqlDbiSkeleton.pl Perl version, with random useful concepts.
  2. mysqlDbiSkeleton-oneTable.pl Perl version that does work on some table, pausing between each portion.
  3. mysqlSkeleton.py Python version of the generic skeleton, with random useful bits.
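
The "work on some table, pausing between each portion" pattern from the one-table skeleton can be sketched without any database at all. The following Python is an assumption-laden illustration, not the skeleton's actual code: it presumes the table has a numeric id column to slice on, and the names id_ranges, process_in_chunks, and the work callback are hypothetical.

```python
import time


def id_ranges(first_id, last_id, chunk_size):
    """Yield inclusive (start, end) id ranges covering first_id..last_id."""
    start = first_id
    while start <= last_id:
        end = min(start + chunk_size - 1, last_id)
        yield start, end
        start = end + 1


def process_in_chunks(first_id, last_id, chunk_size, work, pause_seconds=1.0):
    """Call work(start, end) for each id range, sleeping after every chunk.

    `work` is a hypothetical callback; in a real script it would run something
    like an UPDATE or DELETE restricted to ids between start and end.
    Returns the number of chunks processed.
    """
    chunks_done = 0
    for start, end in id_ranges(first_id, last_id, chunk_size):
        work(start, end)
        chunks_done += 1
        time.sleep(pause_seconds)  # give the server room to serve other queries
    return chunks_done


if __name__ == "__main__":
    process_in_chunks(1, 10, 4, lambda s, e: print(f"ids {s}..{e}"), pause_seconds=0.1)
```

Keeping each portion small and sleeping in between means no single statement holds locks for long, at the cost of a slower overall run.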

Broken Perl "splitCompress" - it works, but it's way too slow to be practical. splitCompress.pl

LiveJournalScalingPaper.pdf A paper about how LiveJournal scaled up their system. Not just a MySQL paper, but covers a lot of the details you need to know about doing a MySQL-backed high-volume web site.

mysql-internals.pdf Internals of MySQL. I read this once, and I remember it being interesting, but not a lot of it stuck. Probably better for reference.

teMySQLcacti-20060810.tar.gz A highly-modified version of MySQL monitoring templates for Cacti. You import a single host template, which includes a bunch of graphs and graph types. Put a special PHP script into your cacti/scripts directory, and then you get pretty graphs of MySQL. Make sure you read the README before importing this template into your Cacti! This now also includes the host template and polling script for a memcached server. (also: previous 20060517 release)

Recently I did a Google search (for unrelated stuff) and found this page, which claims its Cacti templates are inspired by mine but are better. It couldn't hurt to look and decide for yourself.


Screenshot of a single host graphed with teMySQL Cacti host template. Here I've gone to the preview mode and typed the name of the host into the search text entry box. Larger graphs with legends appear if you view the host in the tree view.

Screenshot of one statistic for several hosts. To do this, go to the preview view of your graphs and make sure "host" is set to "none". In the text entry box just to the right of that, type "teMySQL - InnoDB I/O" (for example). Then you'll see that statistic for all hosts that have a graph of that sort.

Also, if you run memcached, here is a host type for graphing memcached statistics.

To the left is a screenshot of the memcached host type. This is a cluster of four memcached servers.