This page lists scripts I quickly hacked up to solve a problem but haven't had time to clean up for general use. Feel free to use them if you like. Unless otherwise stated, they're licensed under the GNU GPL version 2 or later.
Note: My multi-file projects may be found on my GitHub and Launchpad profiles.
""" A simple little script to convert Microsoft Word files into Konqueror .war archives (an HTML file and its dependencies inside a renamed .tar file) using wvHtml. TODO: - Add a file magic check to identify and rename RTF files with .doc extensions. """
#!/bin/sh
# enc_ogg+flac.sh
# By: Stephan Sokolow
#
# A wrapper script to allow KAudioCreator to encode to both Ogg Vorbis and FLAC
# in one run. Also normalizes the input file to avoid the need to do so twice
# later on.
#
# Licensed under the GNU GPL 2 or later.
""" A crude script for identifying audio files that generate duplicate waveforms Originally written as an experiment in identifying files that differ only in metadata. Supports anything sox does. It's CPU-bound, but because of sparse documentation on the MP3 format, it's the best I can do for now. Probably best to use FDMF (http://w140.com/audio/) with some stricter-than- default thresholds until I find the time to rewrite this to be I/O-bound. prints duplicates to stdout (one per-line) with groups of duplicates separated by empty lines. Status messages are sent to stderr. Warning: Seems to get stuck on .mpg files. Requires: sox """
""" Find Dupes Fast By Stephan Sokolow (ssokolow.com) Inspired by Dave Bolton's dedupe.py (http://davebolton.net/blog/?p=173) and Reasonable Software's NoClone. A simple script which identifies duplicate files several orders of magnitude more quickly than fdupes by using smarter algorithms. Most importantly, rather than calculating the MD5 sums for all files with non-unique sizes, this script groups files by their size and then does incremental comparisons. As such, files can be read in 4KiB chunks and the script will only read as many chunks as it needs in order to confirm that a file is unique. (There is no way to avoid reading the entire file if it does have duplicates) In addition, this script eliminates the tiny but present risk of hash collisions causing false positives by doing byte-by-byte comparison rather than hashing the files and then comparing hashes. This doesn't slow the process down because each chunk is only read from the disk once and duplicate-finding is an I/O-bound operation. Grouping by size is used to limit both the memory consumption and the number of open file handles when doing the byte-by-byte comparison. Finally, unlike with fdupes, under no circumstances will the --delete option allow you to accidentally delete every copy of a file. (No --symlinks option is supported and this script will not be confused by specifying the same directory multiple times on the command line or specifying a directory and its parent.) TODO: - Properly support file paths as arguments. As it is, they will be passed to os.walk() which will proceed to ignore them. - As I understand it, fnmatch.fnmatch uses regexes internally and doesn't cache them. Given how many times it gets called, I should try using re.compile with fnmatch.translate instead. - Group files by stat().st_ino to avoid reading from the same inode more than once and to allow advanced handling of hardlinks in --delete mode. - Identify the ideal values for CHUNK_SIZE and HEAD_SIZE... 
or how about dynamically tuning the read increment size based on the number of files being compared and possibly the available RAM? (To minimize seeking) block_size = min(max_block_size, max_consumption / file_count) Maybe a 64K maximum block size, 4K minimum block size, and an 8MB max consumption? (subordinate to minimum block size when in conflict) - Is there such a thing as a disk access profiler that I could use with this? - Offer a switch to automatically hardlink all duplicates found which share a common partition. - The result groups should be sorted by their first entry and the entries within each group should be sorted too. - Confirm that the byte-by-byte comparison's short-circuit evaluation is working properly and efficiently. - Run this through a memory profiler and look for obvious bloat to trim. - Look into possible solutions for pathological cases of thousands of files with the same exact size and same pre-filter results. (File handle exhaustion) - Look into supporting gettext localization. - Consider adding a command-line switch which skips the non-hash comparison for files which are smaller than HEAD_SIZE. (files that got hashed in their entirety and were MD5-identical) - Once ready, announce this in a comment at http://ubuntu.wordpress.com/2005/10/08/find-duplicate-copies-of-files/ """
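The size-then-chunks approach described above can be sketched roughly as follows. This is a minimal Python 3 illustration, not the script's actual code: the function names are hypothetical, the MD5 pre-filter on HEAD_SIZE is left out, and chunks are compared exactly by using their raw bytes as dictionary keys.

```python
import os
from collections import defaultdict

CHUNK_SIZE = 4096  # 4KiB, the chunk size mentioned in the description


def group_by_size(paths):
    """Group paths by file size, dropping sizes that occur only once."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)
    return [group for group in by_size.values() if len(group) > 1]


def split_by_content(paths):
    """Split same-sized files into groups with byte-identical contents.

    Each file is read one chunk at a time, and reading stops as soon as
    no other file in the group shares its contents so far.
    """
    handles = {path: open(path, 'rb') for path in paths}
    try:
        pending = [list(paths)]
        duplicates = []
        while pending:
            group = pending.pop()
            by_chunk = defaultdict(list)
            for path in group:
                by_chunk[handles[path].read(CHUNK_SIZE)].append(path)
            for chunk, matched in by_chunk.items():
                if len(matched) < 2:
                    continue  # Unique so far: stop reading this file
                if chunk:
                    pending.append(matched)  # Still identical: keep going
                else:
                    duplicates.append(sorted(matched))  # Reached EOF together
        return duplicates
    finally:
        for handle in handles.values():
            handle.close()


def find_dupes(paths):
    """Return a sorted list of sorted groups of duplicate files."""
    results = []
    for group in group_by_size(paths):
        results.extend(split_by_content(group))
    return sorted(results)
```

Because every file in a group has the same size, all surviving members hit end-of-file on the same read, which is what lets an empty chunk signal a confirmed duplicate group. Note that this sketch opens every file in a size group at once, so it shares the file handle exhaustion problem listed in the TODO.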
""" A simple clone of the KDE Fuzzy Clock widget for use in other desktops. The time appears as a tooltip if you hover your mouse over the tray icon. If you'd like additional levels of fuzziness, just ask. """
""" By: Stephan Sokolow (deitarion/SSokolow) A pure Python GIF validator which also counts the number of frames. Originally conceived as simiply a static/animated detector. TODO: - Clean up and reorganize the code some more. - Validate whatever I can in the color table and the other bit flags. - Find other things I can validate. - Provide a function to extract a GIF's comment blocks. """
""" A convenience wrapper for building and installing a new kernel on Gentoo Linux, complete with some extra bits to make maintaining a couple of backup kernels easy. Also handles mounting and unmounting /boot and calling module-rebuild to regenerate external kernel modules. This only writes a tertiary kernel if none exists to prevent two builds in a row from deleting the only good kernels available, so you'll want to add this to your /etc/conf.d/local.start: mount /boot if [ -e /boot/old_kernel_emergency ]; then echo " * Boot considered successful. Purging tertiary kernel." rm -rf /boot/old_kernel_emergency fi umount /boot Read the source for the rest of the details. TODO: - Should I have this regenerate the fbcondecor initrd? - Better instructions. Possibly a zip bundle with a README """
""" kde_gui_fixes.py A quick script to fix a couple of my KDE pet peeves: - The screensaver disabling mechanism doesn't guarantee it'll get re-enabled. - There's no way to ensure scrolling in the corners will switch desktops. Requires: PyGTK, PyDCOP (optional, default configuration only) (I used PyGTK because I'm more familiar with it than PyQt and I was rushed) Some code based on the shaped window example from the PyGTK 2.x tutorial. TODO: - Merge this with another idea of mine and whip up a nice GUI for it. """
""" A single-file Python CGI script for effortless sharing of other single-file scripts. If you're viewing a "Useful Hacks" list on my website, this is the code behind it. Simply put your desired description into each file's docstring (for shell scripts, it takes every commented line starting with the shabang and ending with the first non-comment line) and drop them into a folder along with this script. Currently supports Bourne-compatible shell scripts and Python scripts. Other languages under consideration. Non-obvious Features: - Hyperlinks URLs and obfuscates e-mail addresses in script descriptions. - Configurable license name hyperlinking Warnings: - The HTML templating is a quick hackjob. I'm not kidding. - Don't forget to remove the template bits specific to my site. TODO: - Switch to a proper templating solution? (No longer a single-file script) - Add caching eventually (current run time for my site, 0.1 seconds) """
"""deitarion/SSokolow's Linux WinSplit clone in need of a name When using --bindkeys, keybindings are Ctrl+Alt+0 through Ctrl+Alt+9 and Ctrl+Alt+Enter (keypad only). For non-keybinding use, see --help. Requirements: - Python 2.3 (Tested on 2.5 but I don't see any newer constructs in the code.) - PyGTK 2.2 (assuming get_active_window() isn't newer than that) - X11 (The code expects _NETWM hints and X11-style window decorations) - python-xlib (optional, required for --bindkeys, tested with 0.12) Known Bugs: - The internal keybindings only work with NumLock and CapsLock off. - The "monitor-switch" action only works on non-maximized windows. - The toggleMaximize function powering the "maximize" action can't unmaximize. (Workaround: Use one of the regular tiling actions to unmaximize) TODO: - Decide how to handle maximization and stick with it. - Implement the secondary major features of WinSplit Revolution (eg. process-shape associations, locking/welding window edges, etc.) - Clean up the code. It's functional, but an ugly rush-job. - Figure out how to implement a --list-keybindings option. - Consider binding KP+ and KP- to allow comfy customization of split widths. (or heights, for the vertical split) - Consider rewriting cycleDimensions to allow command-line use to jump to a specific index without actually flickering the window through all the intermediate shapes. - Expose a D-Bus API for --bindkeys and consider changing it so that, if python-xlib isn't present it displays an error message but keeps running anyway to provide the D-Bus service. - Can I hook into the GNOME and KDE keybinding APIs without using PyKDE or gnome-python? (eg. using D-Bus, perhaps?) References and code used: - http://faq.pygtk.org/index.py?req=show&file=faq23.039.htp - http://www.larsen-b.com/Article/184.html - http://www.pygtk.org/pygtk2tutorial/sec-MonitoringIO.html """
""" A pure-Python module for identifying and examining RAR files developed without any exposure to the original unrar code. (Just format docs from www.wotsit.org) TODO: - Document this module properly. - Complete the parsing of the RAR metadata. (eg. Identify directories, check CRCs, etc.) - Make sure this has the same coding and error conventions as my gif.py. - Consider releasing this under PSF 2.3 license instead. - Support extraction of files stored with no compression. (eg. XviD movies) - Look into supporting split and password-protected RARs """
"""
A quick-access auto-saving, auto-hiding scratchpad for jots, multi-step
copy-pasting, and anything else where a more specialized app is over-thinking
the problem and opening a plain old plaintext editor (like leafpad, notepad, or
kedit) is inefficient and potentially clutter-inducing.
Requires: PyGTK
Recommended: GtkSourceView and its Python bindings (Undo/Redo support)
Usage: Run it and then click the white line along the left edge of the desktop.
(Also supports drag-and-drop)
TODO:
- Include a "pushpin" icon/button in the lower-right corner to lock the tray open.
- Pressing Escape should collapse the tray.
- Implement some form of multi-note support for storing stuff that needs to be
"backgrounded". Maybe tabbing. (Similar reason to having a few virtual
desktops)
- Support some sort of resize handle or handles.
Known Bugs:
- The window hides while the context menu is visible (harmless but unintuitive)
- Quitting by closing the X connection (xkill) doesn't commit pending changes.
- A drag-and-drop which sends a drag motion event to this but then ends in a
drop to another window will temporarily confuse the auto-hide.
- ScratchTray currently depends on fcntl... which is non-portable. I'll update
it to use a portable wrapper once I've made appropriate preparations so that
my index.cgi script accepts zipped bundles.
(http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/65203)
- Resizing the tray on resolution change is currently broken. I'll take a look
at it soon.
"""
""" A horrendously CPU-bound but functional script for swapping the audio-track byte order in cdrdao .BIN files so that a .CUE file generated by toc2cue can be mounted by DOSBox or CDEmu. TODO: - Optimize and, if necessary, rewrite in C. """
"""upd_hosts.py Automatically generates /etc/hosts from /etc/hosts.local and the MVPS ad-blocking hosts list. Instructions: Put this file in /etc/cron.monthly and chmod it executable. Edit the ADHOST_SUFFIX_WHITELIST variable if you want. (Default is to allow only Project Wonderful because I respect them and they don't serve up Flash ads) TODO: - Use If-Modified-Since and ETags on the MVPS file so I can safely run this more often. (Perhaps also use the ZIP download to save bandwidth?) - Add a mode which doesn't require the local hosts file to be moved to /etc/hosts.local """
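The merge step can be sketched as follows in Python 3, working on text that has already been read or downloaded. Only the ADHOST_SUFFIX_WHITELIST name comes from the description above. Its value here, the function names, and the output layout are assumptions made for illustration.

```python
# Hypothetical example value; the real default lives in upd_hosts.py
ADHOST_SUFFIX_WHITELIST = ('projectwonderful.com',)


def filter_blocklist(blocklist_text, whitelist=ADHOST_SUFFIX_WHITELIST):
    """Yield blocklist entries whose hostname isn't covered by the whitelist."""
    for line in blocklist_text.splitlines():
        fields = line.split('#', 1)[0].split()  # Drop trailing comments
        if len(fields) < 2:
            continue  # Blank line, comment, or malformed entry
        host = fields[1].lower()
        if any(host == s or host.endswith('.' + s) for s in whitelist):
            continue  # Whitelisted domain or one of its subdomains
        yield '%s %s' % (fields[0], host)


def build_hosts(local_text, blocklist_text, whitelist=ADHOST_SUFFIX_WHITELIST):
    """Combine local hosts entries with the filtered ad-blocking entries."""
    lines = [local_text.rstrip('\n'), '', '# Ad-blocking entries']
    lines.extend(filter_blocklist(blocklist_text, whitelist))
    return '\n'.join(lines) + '\n'
```

Matching on `'.' + suffix` rather than a bare `endswith(suffix)` keeps an entry like `notprojectwonderful.com` from slipping through the whitelist by accident.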
#!/bin/bash
# A three-line wrapper script to allow double-clicked .exe files to be run
# with the working directory they expect.