Various Shared Scripts, Modules and Packages

Author: Martin Blais


Various personal scripts that I want to share are available here. This is a space where I share miscellaneous code that does not deserve a project in its own right.

Python Scripts

Clojure Tools

Simplistic application runner for Clojure that automatically finds jar files and Leiningen projects and starts a Clojure VM with them. This is meant to run Clojure simply.

PDF Tools

Extract the first page of each of the PDF files provided on the command line and join them into a single output file. Use -l to get the last pages instead.
Read a list of image filenames, reduce the quality to a specified level and create a PDF file as output with each image on its own page. This can be used to package a list of scanned images in a single file. I needed this to submit exam results.

Mercurial Tools

Assist you in adding, deleting or masking with ignore those unknown or deleted files left in your Mercurial checkout/repo. This is a Mercurial version of svn-foreign below, implemented as a plugin.
Allow the user to split the list of modified files into many commits in a single edit.
A Mercurial plugin that prints out the repository as a tree. This is a work in progress.
A Mercurial extension that allows you to locally track which revisions have been pushed to a remote repository.
Automatically provide default credentials (username and password) read from a file.
Traverse a directory hierarchy and create sentinel files to keep the directories in Mercurial.
Merge GPG-encrypted text files using Mercurial. I keep some text files in armored encrypted format in repositories, and sometimes I have to merge them. This script decrypts them, invokes ediff, and then automatically re-encrypts the merged result.
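
Mercurial tracks files rather than directories, so an empty directory does not survive a commit. The sentinel-file entry above could be sketched roughly as follows. This is only a minimal sketch, not the actual script: the function name `add_sentinels` and the sentinel filename `.keep` are assumptions made for illustration, and the sketch only touches directories that contain no files of their own.

```python
import os

def add_sentinels(root, sentinel=".keep"):
    """Create an empty sentinel file in every directory under root that
    holds no files of its own, and return the paths that were created.

    The default name ".keep" is an assumption; any filename would do.
    """
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the repository metadata directory.
        if ".hg" in dirnames:
            dirnames.remove(".hg")
        if not filenames:
            path = os.path.join(dirpath, sentinel)
            with open(path, "w"):
                pass  # An empty file is enough to keep the directory.
            created.append(path)
    return created
```

The created files would still have to be added to the repository afterwards (for example with `hg add`).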

Subversion Tools

Allow the user to split the list of modified files into many commits in a single edit.
Assist you in adding, deleting or masking with svn:ignore those unregistered files left in your Subversion checkout. A nifty interactive command-line tool for every Subversion user.
Generate a text-based summary of the log file so you can keep track of your activities (or someone else's).
Replicates the directory structure and files of <src> into <dest>, performing the necessary additions and deletions to register the changed files in <dest> into Subversion. <dest> is assumed to be a Subversion checkout. Files that exist on both sides are diffed to figure out if there are changes to be copied.
Take a list of directories, each representing a version of some files (like a checkout of a release of some software), and import each of these sequentially over an existing checkout, registering the new fileset and creating a Subversion release for every directory imported.

A version of Subversion's own svndumpfilter script that untangles move/copy from filtered locations, by converting the move/copy operations into additions. You need to have (or recreate, if you can) a corresponding repository to do this, and the original revisions are fetched using svnadmin dump itself. This is a partial rewrite of Simon Tatham's svndumpfilter2 that consumes less memory.

(Note [2008-08]: Björn Eriksson contributed a backport to Python 2.2; you can find it here.)

A Subversion dumpfilter filter that removes empty revisions.

X Tools

Reconfigure your touchpad: dump the config parameters with synclient -l, edit them by hand, and load them up using synclient-load. Supports all parameters.

Python Scripts about Python Code

Finds the Python modules under the given paths and attempts to import them individually in subprocesses, in order to ensure that each module's import is independent of side effects and import order.
Generate a textual help file for a Python script. The format is reStructuredText and is meant to be converted by docutils to HTML.
Extracts literal blocks from a ReST document and saves them out to files.
Extracts name-value fields within a marked definition term from a ReST document and stores them in a database. This allows you to use ReST as an embedded input format for defining some table data.
Python AST pretty-printer.
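
The import check described in this section (importing each module in its own subprocess) could look roughly like the sketch below. It is not the actual script: the names `find_modules` and `check_imports` are hypothetical, and the sketch assumes that every `.py` file under the root is importable by a dotted name relative to that root.

```python
import os
import subprocess
import sys

def find_modules(root):
    """Yield dotted module names for the .py files found under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            parts = rel[:-3].split(os.sep)
            # A package is imported by its directory name.
            if parts[-1] == "__init__":
                parts = parts[:-1]
            if parts:
                yield ".".join(parts)

def check_imports(root):
    """Import each module in a fresh interpreter; return {name: ok}.

    A fresh process per module means no import can be satisfied by a
    side effect of an earlier one.
    """
    root = os.path.abspath(root)
    env = dict(os.environ)
    env["PYTHONPATH"] = root + os.pathsep + env.get("PYTHONPATH", "")
    results = {}
    for mod in find_modules(root):
        proc = subprocess.run(
            [sys.executable, "-c", "import " + mod],
            env=env,
            capture_output=True,
        )
        results[mod] = (proc.returncode == 0)
    return results
```

Starting one interpreter per module is slow, but it is the simplest way to guarantee that the imports cannot influence each other.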

Generic Scripts

A command-line utility to run some command in many directories, possibly filtering on the existence of some other file. This can be used, for example, to pull/update on many DVCS repositories in one command.
Like xargs but with a temporary file filled from stdin instead of the arguments themselves. In other words, read some bytes from stdin, write them to a temporary file and call the given command with the temporary filename as an argument, and delete the temporary file when the command is done running.
A version of cat that inserts random intervals of sleep between outputting each line. Useful for testing some types of programs.
Delayed cat (formerly called dout). Allows you to delay overwriting files that are in use during a pipe command. For example, this works: ``cat file | sed -e 's/a/b/g' | delaycat file``
Find files with a certain name under a hierarchy and cat them all to stdout, with separators in between. I use this to cat all the TODO files under a hierarchy and pipe into more.
Generate lists of files based on extensions and repeated occurrences of parts of filenames. This is useful for quickly grep-finding through a large number of files.
A simpler, better documented version of, written in Python. I was trying to use and I could not get it to do reliably what I wanted it to, and could not find the documentation. This one is a lot shorter and simpler and does what I expect it to.
Sort lines according to an arbitrary list of ordering string prefixes. You can use this for custom sorts that depend on some external processes, for example, ordering filenames in their order of compilation.
Return the current UNIX timestamp (the number of seconds since the UNIX epoch). Useful for debugging timezone and time conversion issues (I hate that).
Remove diacritics and weird characters from stdin. This can be useful to create id strings from text, such as when creating directory names.
Match files from different directories in pairs according to their basename and output the results where a match has succeeded, printing two filenames per-line of output.
Demo command runner/launcher with big buttons.
Demo command runner/launcher for individual tests in a Python module.
Pops up a giant stupid window with the shell's result code and last command run. You push this command in an interactive shell after a long-running command to warn yourself that the command is done. This is especially useful when you use multiple workspaces and you want to go work on other stuff.
Selects a random subset of the lines of a set of files.
Copy the given files into the specified directory, creating a random directory structure in the output directory to place the files in.
Looks for forgotten decrypted files in the given directories.
Find files under the given directory, list them, ask for confirmation, and then delete them if confirmed.
Walks a hierarchy of files and conditionally encrypts the files.
Process a C/C++ file as if to pre-process it, only processing include directives for a single include directory. This effectively generates one very large include file that can be included instead of the given file.
Script that looks at all the HTML files in a directory, reads them and creates a mirror site that wraps them up in a template.
Fetch an RSS file from the internet, parse it, remove items that have already been processed by this script before (they are recorded to a file each time) and send the new items by email.
From a list of files or directories, select the files that fit a requested amount of space as closely as possible, either by using the given ordering, or by using any ordering that fits the allowed size as closely as possible.
Script that takes a list of pairs of filenames and renames the files accordingly. The particularity of this script is that, compared to a Bourne shell script, it can easily handle arbitrary filenames with spaces, single quotes, double quotes and parentheses. There is no quoting happening on the filenames.
Prepend the date to every filename provided.
mvmk and cpmk
Like mv and cp but create the destination directory if desired.
Read a simplified format for CSS files, strip comments and output a valid CSS file.
Render a CSV file to various output formats.
A "fast" version of diff that does not actually look at the file contents, but that just looks at the sizes.
Look for files under a list of directories and list those which have basenames which appear more than once. Has options for Python.
Given a root directory, recurse in it and find all the duplicate files, files that have the same contents, but not necessarily the same filename.
Connect to a Cisco VPN under Linux, without requiring user interaction.
A Swiss-army-knife script that mimics the behaviour of open on the Mac, or of start under Windows: given some files as arguments, just run the most appropriate command.
A convenient tool to mount and unmount loop devices encrypted with losetup (e.g., serpent encryption) under Linux.
A script to search and filter the Python Jobs Board.
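
The delayed cat entry in the list above relies on one idea: do not open the output file until all of the input has been read. With a plain shell redirection the file is truncated before the first command in the pipeline gets to read it. A minimal sketch of that idea follows; the function signature is an assumption for illustration, not the actual script.

```python
def delaycat(stream, filename):
    """Read a binary stream to the end, then overwrite filename with it.

    Opening the output file only after the input is exhausted is what
    lets a pipeline read from and write to the same file. Returns the
    number of bytes written.
    """
    data = stream.read()  # Blocks until the upstream commands finish.
    with open(filename, "wb") as out:
        out.write(data)
    return len(data)
```

As a command-line tool this would be called as `delaycat(sys.stdin.buffer, sys.argv[1])`. The whole input is held in memory, which is fine for text files but not for very large data.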

Development Tools

Invoke ldd recursively on the given executables and output a Graphviz dot file of the dependencies between the binaries.
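
The recursive ldd traversal could be sketched as below. This is an illustration under assumptions, not the actual script: the function names are hypothetical, only `name => /path (address)` lines of ldd's output are followed, and the `run_ldd` parameter exists only so that the traversal can be exercised without calling the real ldd.

```python
import subprocess

def parse_ldd(output):
    """Return the resolved library paths listed in ldd's output."""
    paths = []
    for line in output.splitlines():
        fields = line.split()
        # Typical line: "libc.so.6 => /lib/libc.so.6 (0x00007f...)".
        # Lines without "=>" and "not found" entries are skipped.
        if "=>" in fields:
            idx = fields.index("=>")
            if idx + 1 < len(fields) and fields[idx + 1].startswith("/"):
                paths.append(fields[idx + 1])
    return paths

def ldd_graph(executables, run_ldd=None):
    """Follow dependencies recursively; return a sorted list of edges."""
    if run_ldd is None:
        def run_ldd(path):
            proc = subprocess.run(["ldd", path],
                                  capture_output=True, text=True)
            return proc.stdout
    edges = set()
    seen = set()
    todo = list(executables)
    while todo:
        path = todo.pop()
        if path in seen:
            continue
        seen.add(path)
        for dep in parse_ldd(run_ldd(path)):
            edges.add((path, dep))
            todo.append(dep)
    return sorted(edges)

def to_dot(edges):
    """Render (source, target) pairs as a Graphviz digraph."""
    lines = ["digraph ldd {"]
    for src, dst in edges:
        lines.append('  "%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines)
```

Because ldd already lists indirect dependencies for each binary, the resulting graph contains both direct and transitive edges; a real tool might want to prune the latter.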

Gentoo Stuff

A simple program that will print one-liner descriptions of Gentoo packages. Why this is missing from Gentoo itself I really don't understand.
Add a package to both the unmask and unstable portage files.

Database Tools

A module with helper functions for performing database migration.
A simple extension to DBAPI that provides an easier interface for building queries. More information can be found here.
Import CSV files as PostgreSQL database tables. The script tries to guess as much as it can by itself. This is useful for manipulating data created in CSV files; often you can avoid using a spreadsheet entirely.
Load a CSV file, find a cell with a SQL statement in it, then iterate through the csv file again, assuming we have columns with a header of ID and NAME, replacing the values in the SQL statement for every row. Print the replacements on stdout.
Split a PostgreSQL database plain format dump into individual files for each database object. This is useful if you have to compare the contents of two databases: you can use a graphical diff program to view their differences.
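
The CSV import entry above says the script guesses as much as it can by itself. One plausible form of such guessing is to infer a column type from the values and generate a table definition. The sketch below is an assumption about how that might be done, not the actual script; the function names and the choice of three candidate types are hypothetical.

```python
import csv
import io

def guess_type(values):
    """Guess a PostgreSQL column type from a list of string values."""
    values = [v for v in values if v != ""]  # Empty cells are ignored.
    if not values:
        return "text"
    # Try the narrowest type first and fall back to text.
    for pgtype, convert in (("integer", int), ("double precision", float)):
        try:
            for v in values:
                convert(v)
        except ValueError:
            continue
        return pgtype
    return "text"

def quote_ident(name):
    """Quote an SQL identifier, doubling embedded double quotes."""
    return '"%s"' % name.replace('"', '""')

def create_table_sql(table, csv_text):
    """Build a CREATE TABLE statement from CSV text with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = []
    for i, name in enumerate(header):
        coltype = guess_type([row[i] for row in data if i < len(row)])
        columns.append("%s %s" % (quote_ident(name.strip().lower()), coltype))
    return "CREATE TABLE %s (%s);" % (quote_ident(table), ", ".join(columns))
```

For a file with the header `ID,NAME,PRICE`, this yields integer, text and double precision columns when the data matches; loading the rows themselves is left out of the sketch.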

Misc Python Modules

A truly portable encapsulation of a path (not as a string). This makes it easier to write portable code. This library needs a bit of comment cleanup.
Generic parser for simple file format for keeping notes on some data items.
A simple extension to optparse from Python 2.4 to make it able to deal with arguments automatically.
A library for configuration files written in the form of Python classes. This basically adds support for declaration and validation of the configuration variables and is a very flexible and simple configuration system for any kind of thing.
A module to create classes that render calls to their instances instead of actually calling anything. This can be useful for implementing a quick unidirectional protocol where the server evals the lines in an environment, instead of using something like XML-RPC.
Inject some tracing builtins for debugging purposes. This module automatically injects a small number of useful tracing and debugging functions that you typically want to add quickly without having to insert an import.
A module that helps you query the user for choices on the command-line. The input is read with the tty in raw mode (not necessary to press enter).
A module that brings up the last exception from a Python program that has run on your system (work in progress).
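
The idea of classes that render calls to their instances, instead of calling anything, can be illustrated with a small proxy object. This is a minimal sketch under assumptions: the class name `CallRenderer` is hypothetical and the real module may work quite differently.

```python
class CallRenderer:
    """Render attribute accesses and calls as source text instead of
    executing anything."""

    def __init__(self, name):
        self._name = name

    def __getattr__(self, attr):
        # Only reached for attributes that are not found normally.
        if attr.startswith("_"):
            raise AttributeError(attr)
        return CallRenderer("%s.%s" % (self._name, attr))

    def __call__(self, *args, **kwargs):
        parts = [repr(a) for a in args]
        parts += ["%s=%r" % (k, v) for k, v in kwargs.items()]
        return "%s(%s)" % (self._name, ", ".join(parts))
```

A client would send the rendered line, for example the string produced by `CallRenderer("server").add_user("bob", admin=True)`, and the server would evaluate it in an environment where `server` is bound to a real object. Evaluating received text is only reasonable between trusted endpoints.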

Misc Other Scripts

A simple script that will parse your Videotron quota page and output the amount of download and upload you have incurred for the current period.



The computer programs or libraries on this page are provided for free. I am aware that some of the programs that I provide for free allow people to get their work done faster or better. If you benefit from any of this code, especially if you are using it within a commercial environment and it saves you time or work, please consider making a donation to my company's PayPal account by clicking on the link below.