Various Shared Scripts, Modules and Packages

Author: Martin Blais <blais@furius.ca>

Abstract

Various personal scripts that I want to share are available here. This is a space where I share miscellaneous code that does not deserve a project of its own.

Python Scripts

Clojure Tools

streamlined
Simplistic application runner for Clojure that automatically finds jar files and Leiningen projects and starts a Clojure VM with them. This is meant to run Clojure simply.

PDF Tools

pdf-heads
Extract the first page of each of the PDF files provided on the command line and join them into a single output file. Use -l to get the last pages instead.
pdf-from-images
Read a list of image filenames, reduce the quality to a specified level and create a PDF file as output with each image on its own page. This can be used to package a list of scanned images in a single file. I needed this to submit exam results.

Mercurial Tools

foreign
Assist you in adding, deleting or masking with ignore those unknown or deleted files left in your Mercurial checkout/repo. This is a Mercurial version of svn-foreign below, implemented as a plugin.
commits
Allow the user to split the list of modified files into many commits in a single edit.
tree
A Mercurial plugin that prints out the repository as a tree. This is a work in progress.
histpush
A Mercurial extension that allows you to locally track which revisions have been pushed to a remote repository.
defpasswd
Automatically provide default credentials (username/passwords) read from a file.
keep-empty-dirs
Traverse a directory hierarchy and create sentinel files to keep the directories in Mercurial.
xx-encrypted-nodeps
Merge GPG-encrypted text files using Mercurial. I keep some text files in armored encrypted format in repositories, and sometimes I have to merge them. This script decrypts the files, invokes ediff, and then automatically re-encrypts the merged result intelligently.
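
To illustrate the idea behind keep-empty-dirs above, here is a minimal sketch, not the actual script. It walks a hierarchy and drops an empty sentinel file into every empty directory, since Mercurial only tracks files. The sentinel name `.keep`, the function name `add_sentinels` and the `skip` parameter are assumptions made for this example.

```python
import os

SENTINEL = ".keep"  # hypothetical sentinel filename; the real script may use another


def add_sentinels(root, sentinel=SENTINEL, skip=(".hg",)):
    """Create an empty sentinel file in every empty directory under root.

    Returns the list of sentinel paths that were created.
    """
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Decide emptiness before pruning, so a directory that only
        # contains version-control metadata is not treated as empty.
        is_empty = not dirnames and not filenames
        # Prune metadata directories so we never write inside them.
        dirnames[:] = [d for d in dirnames if d not in skip]
        if is_empty:
            path = os.path.join(dirpath, sentinel)
            with open(path, "w"):
                pass
            created.append(path)
    return created
```

Running it a second time creates nothing new, because the directories now contain their sentinel.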

Subversion Tools

svn-commits
Allow the user to split the list of modified files into many commits in a single edit.
svn-foreign
Assist you in adding, deleting or masking with svn:ignore those unregistered files left in your Subversion checkout. A nifty interactive command-line tool for every Subversion user.
vc-summarize
Generate a text-based summary of the log file so you can keep track of your activities (or someone else's).
svn-copy-register
Replicates the directory structure and files of <src> into <dest>, performing the necessary additions and deletions to register the changed files in <dest> into Subversion. <dest> is assumed to be a Subversion checkout. Files that exist on both sides are diffed to figure out if there are changes to be copied.
svn-import-releases
Take a list of directories, each representing a version of some files (like a checkout of a release of some software), and import each of these sequentially over an existing checkout, registering the new fileset and creating a Subversion release for every directory imported.
svndumpfilter3

A version of Subversion's own svndumpfilter script that untangles move/copy from filtered locations, by converting the move/copy operations into additions. You need to have (or recreate, if you can) a corresponding repository to do this, and the original revisions are fetched using svnadmin dump itself. This is a partial rewrite of Simon Tatham's svndumpfilter2, which consumes less memory.

(Note [2008-08]: Björn Eriksson contributed a backport to Python 2.2; you can find it here)

svndropempty
A Subversion dump filter that removes empty revisions.
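
The comparison step described for svn-copy-register above could be sketched roughly as follows. This is only an illustration using Python's standard `filecmp` module, not the script's own code. The function name `plan_sync` and the shape of the returned plan are assumptions. The real tool would then copy the files and run the corresponding `svn add` and `svn delete` commands in <dest>.

```python
import filecmp
import os


def plan_sync(src, dest, ignore=(".svn",)):
    """Classify relative paths as to-add, to-delete or to-update in dest."""
    plan = {"add": [], "delete": [], "update": []}

    def visit(cmp, rel):
        # Names only in src must be added; names only in dest must be deleted.
        plan["add"] += [os.path.join(rel, n) for n in cmp.left_only]
        plan["delete"] += [os.path.join(rel, n) for n in cmp.right_only]
        # dircmp uses filecmp.cmp's default shallow comparison, so files with
        # identical type, size and mtime are treated as equal without being read.
        plan["update"] += [os.path.join(rel, n) for n in cmp.diff_files]
        for name, sub in cmp.subdirs.items():
            visit(sub, os.path.join(rel, name))

    visit(filecmp.dircmp(src, dest, ignore=list(ignore)), "")
    for paths in plan.values():
        paths.sort()
    return plan
```

Separating the planning from the actions makes it easy to print the plan and ask for confirmation before touching the checkout.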

X Tools

synclient-load
Reconfigure your touchpad: dump the config parameters with synclient -l, edit them by hand, and load them up using synclient-load. Supports all parameters.

Python Scripts about Python Code

python-recursive-import-test
Finds the Python modules under given paths and attempts to import them individually in subprocesses, in order to ensure that each module's import is independent of side-effects and import order.
python-genpage
Generate a textual help file for a Python script. The format is reStructuredText and is meant to be converted by docutils to HTML.
rst-literals
Extracts literal blocks from a ReST document and saves them out to files.
rst-fields
Extracts name-value fields within a marked definition term from a ReST document and stores them in a database. This allows you to use ReST as an embedded input format for defining some table data.
astpretty
Python AST pretty-printer.
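
The approach of python-recursive-import-test above can be sketched in a few lines. This is a simplified illustration rather than the script itself: each module gets a fresh interpreter, so no import can benefit from the side-effects of an earlier one. The function names `find_modules` and `import_failures` are made up for this example, and the sketch assumes that file names are valid module names.

```python
import os
import subprocess
import sys


def find_modules(root):
    """Yield dotted module names for the .py files found under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()
        for filename in sorted(filenames):
            if not filename.endswith(".py"):
                continue
            rel = os.path.relpath(os.path.join(dirpath, filename), root)
            parts = rel[:-3].split(os.sep)
            if parts[-1] == "__init__":
                parts.pop()  # a package is imported by its directory name
            if parts:
                yield ".".join(parts)


def import_failures(root):
    """Import each module in its own interpreter; return {name: stderr} for failures."""
    failures = {}
    # Simplification: this replaces any existing PYTHONPATH with root.
    env = dict(os.environ, PYTHONPATH=root)
    for name in find_modules(root):
        result = subprocess.run(
            [sys.executable, "-c", "import " + name],
            env=env, capture_output=True, text=True,
        )
        if result.returncode != 0:
            failures[name] = result.stderr
    return failures
```

An empty result means every module could be imported on its own.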

Generic Scripts

many
A command-line utility to run some command in many directories, possibly filtering on the existence of some other file. This can be used, for example, to pull/update on many DVCS repositories in one command.
xargf
Like xargs but with a temporary file filled from stdin instead of the arguments themselves. In other words, read some bytes from stdin, write them to a temporary file and call the given command with the temporary filename as an argument, and delete the temporary file when the command is done running.
rubbercat
A version of cat that inserts random intervals of sleep between outputting each line. Useful for testing some types of programs.
delaycat
Delayed cat (formerly called dout). Allows you to delay overwriting files that are still in use within a pipeline. For example, this works: ``cat file | sed -e 's/a/b/g' | delaycat file``
findcat
Find files with a certain name under a hierarchy and cat them all to stdout, with separators in between. I use this to cat all the TODO files under a hierarchy and pipe into more.
filesets
Generate lists of files based on extensions or repeated occurrences of parts of filenames. This is useful for quickly grep-finding through a large number of files.
pargrepp
A simpler, better documented version of pargrep.pl, written in Python. I was trying to use pargrep.pl and I could not get it to do reliably what I wanted it to, and could not find the documentation. This one is a lot shorter and simpler and does what I expect it to.
sort-order
Sort lines according to an arbitrary list of ordering string prefixes. You can use this for custom sorts that depend on some external processes, for example, ordering filenames in their order of compilation.
epoch
Return the current time as a UNIX timestamp (seconds since the UNIX epoch). Useful for debugging time zone conversion stuff (I hate that).
idify
Remove diacritics and weird characters from stdin. This can be useful to create id strings from text, such as when creating directory names.
match-files
Match files from different directories in pairs according to their basename and output the results where a match has succeeded, printing two filenames per-line of output.
bigbuts
Demo command runner/launcher with big buttons.
bigbuts.pytest
Demo command runner/launcher for individual tests in a Python module.
fin
Pops up a giant stupid window with the shell's result code and last command run. You push this command in an interactive shell after a long-running command to warn yourself that the command is done. This is especially useful when you use multiple workspaces and you want to go work on other stuff.
random-subset
Selects a random subset of the lines of a set of files.
random-tree-cp
Copy the given files in the specified directory, creating a random directory structure in the output directory to place the files in.
find-decrypted
Looks for forgotten decrypted files in the given directories.
find-rm
Find files under the given directory, list them, ask for confirmation and then delete them if confirmed.
encrypt-hier
Walks a hierarchy of files and conditionally encrypts the files.
flatten-baden
Process a C/C++ file as if to pre-process it, only processing include directives for a single include directory. This effectively generates one very large include file that can be included instead of the given file.
html-wrap
Script that looks at all the HTML files in a directory, reads them and creates a mirror site that wraps them up in a template.
rss-to-mail
Fetch an RSS file from the internet, parse it, remove items that have already been processed by this script before (they are recorded to a file each time) and send the new items by email.
fit-sizes
From a list of files or directories, select files whose total size fits as closely as possible within a requested amount of space, either by keeping the given ordering or by using whichever ordering gets closest to the allowed size.
mv-filelist
Script that takes a list of pairs of filenames and renames the files accordingly. Unlike a Bourne shell script, it can easily handle arbitrary filenames with spaces, single-quotes, double-quotes and parentheses, because no quoting happens on the filenames.
mv-prepend-date
Prepend the date to every filename provided.
mvmk and cpmk
Like mv and cp but create the destination directory if desired.
css-convert
Read a simplified format for CSS files, strips comments and outputs a valid CSS file.
csv-render
Render a CSV file to various output formats.
diff-fast
A "fast" version of diff that does not actually look at the file contents, but that just looks at the sizes.
find-duplicate-filenames
Look for files under a list of directories and list those which have basenames which appear more than once. Has options for Python.
find-duplicate-contents
Given a root directory, recurse in it and find all the duplicate files, files that have the same contents, but not necessarily the same filename.
cisco-connect
Connect to a Cisco VPN under Linux, without requiring user interaction.
open
A Swiss-army-knife script that mimics the behaviour of open on the Mac, or of start under Windows: given some files as arguments, just run the most appropriate command.
encmount
A convenient tool to mount and unmount loop devices encrypted with losetup (e.g., serpent encryption) under Linux.
pyjobs.py
A script to search and filter the Python Jobs Board.
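
As an illustration of what xargf above does, here is a minimal sketch of the same steps: write the input bytes to a temporary file, run the command with the temporary filename as its last argument, and delete the file once the command is done. The function name `run_with_tempfile` is an assumption for this example and the real script may differ in its details.

```python
import os
import subprocess
import tempfile


def run_with_tempfile(data, command):
    """Write data to a temporary file, run command with its name appended, then delete it.

    Returns the exit status of the command.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        return subprocess.call(list(command) + [path])
    finally:
        # Remove the file even if the command could not be started.
        os.remove(path)
```

A command-line wrapper would call it with the bytes read from `sys.stdin.buffer` and the command taken from `sys.argv[1:]`.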

Development Tools

depends-dot
Invoke ldd recursively on the given executables and output a Graphviz dot file of the dependencies between the binaries.
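
A rough sketch of how such a dependency graph could be produced is shown below. It is an illustration, not the depends-dot source. It assumes the usual `name => /path (0xaddress)` line format of ldd output, and the names `run_ldd`, `parse_ldd` and `dependency_dot` are made up for this example. The ldd runner is passed in as a parameter so the traversal can be tried without real binaries.

```python
import re
import subprocess

# Assumed ldd line format:
#     libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f...)
# Lines without "=>" or with "not found" are skipped by this sketch.
_LDD_LINE = re.compile(r"^\s*(\S+) => (\S+) \(0x[0-9a-f]+\)\s*$")


def run_ldd(path):
    """Return the raw text printed by ldd for one binary."""
    return subprocess.run(["ldd", path], capture_output=True, text=True).stdout


def parse_ldd(text):
    """Return the resolved library paths found in ldd output."""
    return [m.group(2) for m in map(_LDD_LINE.match, text.splitlines()) if m]


def dependency_dot(executables, ldd=run_ldd):
    """Follow dependencies from the given binaries and render them as Graphviz dot."""
    seen, edges, todo = set(), [], list(executables)
    while todo:
        binary = todo.pop()
        if binary in seen:
            continue
        seen.add(binary)
        for lib in parse_ldd(ldd(binary)):
            edges.append((binary, lib))
            todo.append(lib)
    lines = ["digraph dependencies {"]
    lines += ['  "%s" -> "%s";' % edge for edge in sorted(set(edges))]
    lines.append("}")
    return "\n".join(lines)
```

Note that ldd already lists indirect dependencies, so the edges leaving an executable include libraries it only needs transitively.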

Gentoo Stuff

edesc
A simple program that will print one-liner descriptions of Gentoo packages. Why this is missing from Gentoo itself I really don't understand.
gentoo-unmask
Add a package to both the unmask and unstable portage files.

Database Tools

dbmigration.py
A module with helper functions for performing database migration.
dbapiext.py
A simple extension to DBAPI that provides an easier interface for building queries. More information can be found here.
csv-db-import
Import CSV files as PostgreSQL database tables. The script tries to guess as much as it can by itself. This is useful to manipulate data created in CSV files; often you can completely get away from having to use a spreadsheet.
csv-sql-bulk
Load a CSV file, find a cell with a SQL statement in it, then iterate through the csv file again, assuming we have columns with a header of ID and NAME, replacing the values in the SQL statement for every row. Print the replacements on stdout.
pg-split-dump
Split a PostgreSQL database plain format dump into individual files for each database object. This is useful if you have to compare the contents of two databases: you can use a graphical diff program to view their differences.

Misc Python Modules

ipath.py
A truly portable encapsulation of a path (not as a string). This makes it easier to write portable code. This lib needs a bit of comment cleanup.
enjot.py
Generic parser for simple file format for keeping notes on some data items.
optparse_wargs.py
A simple extension to optparse from Python 2.4 to make it able to deal with arguments automatically.
turfig.py
A library for configuration files written in the form of Python classes. This basically adds support for declaration and validation of the configuration variables and is a very flexible and simple configuration system for any kind of thing.
callrdr.py
A module to create classes that render calls to their instances instead of actually calling anything. This can be useful for implementing a quick unidirectional protocol where the server evals the lines in an environment, instead of using something like XML-RPC.
injectrace.py
Inject some tracing builtins for debugging purposes. This module automatically injects a few useful tracing and debugging functions that you typically want to add quickly without having to insert an import.
rawquery.py
A module that helps you query the user for choices on the command-line. The input is read with the tty in raw mode (not necessary to press enter).
saverr.py
A module that brings up the last exception from a Python program that has run on your system (work in progress).
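
The idea behind callrdr.py above can be illustrated with a small sketch. This is not the module's actual interface: the class name `CallRenderer` and the exact rendering format are assumptions. Attribute access and calls on an instance produce a line of source text, which the other side of a one-way protocol could then eval in its own environment.

```python
class CallRenderer:
    """Render attribute access and calls as source text instead of executing them."""

    def __init__(self, name):
        self._name = name

    def __getattr__(self, attr):
        # Only invoked for attributes not found normally, so _name is safe.
        if attr.startswith("__"):
            raise AttributeError(attr)
        return CallRenderer("%s.%s" % (self._name, attr))

    def __call__(self, *args, **kwargs):
        rendered = [repr(a) for a in args]
        rendered += ["%s=%r" % item for item in sorted(kwargs.items())]
        return "%s(%s)" % (self._name, ", ".join(rendered))


server = CallRenderer("server")
line = server.move(1, 2, speed="fast")
# line is now the string "server.move(1, 2, speed='fast')"
```

The receiving side would evaluate each line with a real object bound to the name `server`. That is only reasonable when the sender is trusted, since eval runs arbitrary code.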

Misc Other Scripts

videotron-quota
A simple script that will parse your Videotron quota page and output the level of download and upload you have incurred for the current period.

Donations

Important

The computer programs and libraries on this page are provided for free. I am aware that some of them allow people to get their work done faster or better. If you benefit from any of this code, especially if you use it within a commercial environment and it saves you time or work, please consider making a donation to my company's PayPal account by clicking on the link below.