Category Archives: Software

Duplicate Image Finder (in Perl)

I have duplicate photos in my image library.  We all do.  I want to weed them out.  The trouble with using straight md5 for this, though, is that EXIF data on JPEG files may be altered by my photo management tool.  Two files can then contain the exact same photo data, yet the associated extra data (date/time, ICC profile, tags, etc.) causes a pure checksum comparison to fail.

Here’s a Perl script which iterates through a folder list and sends all files off for md5 digest.  However, any jp(e)g files are first run through jpegtran and saved as a temporary file, so they can be “normalized” (i.e. converted to optimized, progressive, and EXIF-stripped) and the md5 is performed on just the image data.  This should find duplicates regardless of image program tampering.

#!/usr/bin/perl -w
use strict;

# path to jpegtran
my $JPEGTRAN_LOC = '/Users/grkenn/Pictures/jpegtran';

# Somewhat Advanced Photo Dupe Finder
# Greg Kennedy 2012

# Identifies duplicate photos by image data:
# strips EXIF info and converts to optimize + progressive
# before performing MD5 on image data

# Requires "jpegtran" application from libjpeg project
# Mac users: http://www.phpied.com/installing-jpegtran-mac-unix-linux/

use File::Find;
use Digest::MD5;

my %fingerprint;

my $ctx = Digest::MD5->new;

sub process
{
  my $filename = $_;

  # file is a directory
  if (-d $filename) { return; }
  # file is an OSX hidden resource fork
  if ($filename =~ m/^\._/) { return; }

  if ($filename =~ m/\.jpe?g$/i) {
    # attempt to use jpegtran to "normalize" jpg files
    # (list-form system() bypasses the shell, so quotes or $ in a filename are harmless)
    if (system($JPEGTRAN_LOC, '-copy', 'none', '-optimize', '-progressive', '-outfile', '/tmp/find_dupe.jpg', $filename)) {
      print STDERR "\tError normalizing file " . $File::Find::name . "\n\n";
    } else {
      $filename = '/tmp/find_dupe.jpg';
    }
  }

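  # open file (three-arg open, so an odd filename can't be read as an open mode)
  open (my $fp, '<', $filename) or die "Couldn't open $filename (source " . $File::Find::name . "): $!\n";
  binmode($fp);
  # MD5 digest on file; digest() also resets $ctx for the next file
  $ctx->addfile($fp);
  push (@{$fingerprint{$ctx->digest}}, $File::Find::name);
  close($fp);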
}

## Main script
if (scalar @ARGV == 0)
{
  print "Usage: ./find_dupe.pl folder [folder ...]\n";
  print "\tjpegtran MUST be in the path,\n";
  print "\tor edit the script and set JPEGTRAN_LOC to an absolute location\n";
  exit;
}

find(\&process, @ARGV);

print "Duplicates report:\n";

foreach my $md5sum (keys %fingerprint)
{
  if (scalar @{$fingerprint{$md5sum}} > 1)
  {
    print "--------------------\n";
    foreach my $fname (@{$fingerprint{$md5sum}})
    {
      print $fname . "\n";
    }
  }
}

The output looks something like this:

macmini:Pictures grkenn$ ./find_dupe.pl test_lib/
Duplicates report:
--------------------
test_lib/ufo or moon.jpg
test_lib/subdirectory/dupe_7.jpg
--------------------
test_lib/too cool jenny.jpg
test_lib/subdirectory/dupe1.jpg
test_lib/subdirectory/dupe2.jpg

SlugFest ’97 DX

Kids have big dreams.  Some of them want to grow up to be scientists or astronauts or football stars or President.  When I was in school I wanted to make video games.  So I took programming classes and worked hard on my craft.  I churned out lines of QBasic spaghetti code and, later, migrated to Visual Basic on Windows to do the same thing.  Surrounding much of what I produced was a feeling that I was destined to do something big with whatever I was working on: I was going to make a million dollars off some shareware game, or I would code up a groundbreakingly massive and open-ended world (and it would all fit on a 1.44MB floppy), or whatever.

Over a summer break in 1997 my cousin Rusty came to spend a week at my family’s house.  I don’t recall exactly how it happened – something to do with playing a lot of Myst, I think – but I managed to convince Rusty and my sister Erin to work on a video game.  We were going to make an awesome fighting game on PC.  It was going to be released on CD – ostensibly because we could put music in the empty space, but most likely because CD-ROM was the hot item of the day.

Our game was called SlugFest ’97.

And so we set to work, with the enthusiasm that only kids have.  For that whole week we invested our time and effort on producing this game.  We each painstakingly drew out MSPaint sprites for our assigned characters.  We coded and built and playtested.  When we weren’t working on it, we were talking about it: how to improve it, how to produce it, how to market it.  We even took time out to make a “The Making Of” video.  And as time grew short we did, in fact, wrap up a version that we were quite happy with.

Then reality set in: we called a local CD mastering shop (this was in the days before CD-R became widespread) and were told that we would be charged $100 to produce our CD.  In hindsight I think the clerk may have been confused about what we were asking and thought we wanted to book studio time.  In any case, we all realized that the dream was simply beyond our financial resources.  Enthusiasm drifted away.  Though we later made an attempt at a sequel (“SlugFest 2000”), it never made it past initial character design before we all lost interest and started playing BattleMasters on the landing at the top of the stairs.

Well.  I cleared out an old folder on my HD recently and ran across both the compiled version of the game, plus the source code.  Unfortunately, it doesn’t run on the most modern Windows version, and is hit-or-miss functional on the rest.  A quick calculation: Fourteen years of coding experience in the intervening time, including running a college game development club for two years… plus a stash of resources including a 2d game framework… a remake should take very little time indeed.  The actual “game logic” is absurdly simple.  The technology to realize the dream is here too – everyone has a CD burner these days.  Yes, I can do this.  I can release SlugFest.  (Since misquoting Steve Jobs is all the rage these days, I’ll throw in an old favorite: “Real Artists Ship”.)

On to the remake.  It actually was very easy.  The entire thing was rewritten from the ground up in C, using SDL as the backend library (plus SDL_Mixer and SDL_Image to provide sound and graphics loading).  I managed to squeeze in a few “DX Mode” features to inject a bit of modernity into the game.  I even cut some sound samples from ancient recordings of us to make fight sound effects.  The result is faster, better, and smaller than the original… though I included that in the installer too, for completeness’ sake.

In fact the most challenging part of all was the music.  When we wrote the game we had no music sources of our own and couldn’t burn anything to test with, but we wrote the game with CD support expecting to just substitute our own tracks in production.  Most playtesting happened to the tune of either No Doubt’s “Tragic Kingdom” CD, or some electronic “Phantom of the Opera” remix CD.  Inspiring, but copyrighted, and not really fitting for the remake.  Instead, I loaded the game up with MIDI files that I or my sisters had written in our school years and used them for background music.  Some of these hadn’t been heard in many years, because they were in a proprietary format that I first had to write a decoder for.  In the end I found 43 tracks worthy of inclusion, mostly without any musicality or rhythm.  Hey, if it worked for Marvel vs. Capcom 2…

And so we come to the release.  There is a web-downloadable installer here, if you want to try it out:

Download SlugFest ’97 DX – Installer – Windows, version 1.01.  1.6 MB
Download SlugFest ’97 DX – ZIP – Mac OSX (Intel 10.5+), version 1.01.  1.8 MB

The finishing touch is here, though: run the MIDIs through a MIDI -> WAV conversion tool (I used WinGroove), create a cue sheet, redo the installer to work from CD, and burn a copy.

Well, there you have it.  Childhood dream: accomplished.  Now if I could just figure out a way to market it… : )

Dumping C64 Tapes

While cleaning out the garage to set up my photo studio, I ran across boxes of Commodore 64 gear given to me by a friend for my birthday a couple years ago.  Among the items: some crusty old cassette tapes with data on them.  Back in the day, disk drives were expensive luxury items for saving your data (and the disks were costly too).  The cheaper alternative: a “datasette”, a specialized tape recorder that can save to and load from standard, widely available audio tapes.  There were some quirks to working with tapes: they are extremely slow, must be manually positioned at the right point, and like all magnetic media the tapes “should be” carefully stored.

Fortunately retrocomputing hobbyists have since worked out long-term digital storage solutions for the analog data on the tape of various 8-bit computers.  On Commodore machines this is the TAP file – a digital representation of the various pulse lengths detected on a tape.  It can be replayed in an emulator, or turned back into a pristine WAV file to be used in a real machine (record it back to a new tape, etc).
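To make "a digital representation of pulse lengths" concrete, here is a minimal Python sketch of reading a TAP image.  It assumes the commonly documented layout (a 12-byte "C64-TAPE-RAW" signature, a version byte, a little-endian data length at offset 16, then one byte per pulse measured in units of 8 clock cycles, with a zero byte marking a long pulse).  Check those details against your format reference before relying on them; the function name parse_tap is just for illustration.

```python
import struct

TAP_MAGIC = b"C64-TAPE-RAW"  # assumed 12-byte signature at the start of the file


def parse_tap(blob: bytes):
    """Return (version, pulses); each pulse is a length in CPU clock cycles."""
    if blob[:12] != TAP_MAGIC:
        raise ValueError("not a C64 TAP file")
    version = blob[12]
    # Assumed layout: 3 reserved bytes, then a 32-bit little-endian data size.
    (size,) = struct.unpack_from("<I", blob, 16)
    data = blob[20:20 + size]

    pulses = []
    i = 0
    while i < len(data):
        value = data[i]
        if value != 0:
            # Ordinary pulse: the byte counts units of 8 clock cycles.
            pulses.append(value * 8)
            i += 1
        elif version >= 1 and i + 3 < len(data):
            # Version 1: a zero byte is followed by an exact 24-bit cycle count.
            pulses.append(int.from_bytes(data[i + 1:i + 4], "little"))
            i += 4
        else:
            # Version 0: zero only says "longer than 255 * 8 cycles".
            pulses.append(None)
            i += 1
    return version, pulses
```

Short and long pulses show up as clusters of similar values in the returned list, which is what makes a hex editor usable for the manual cleanup.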

Now, the ‘preferred’ way to dump a tape is to use a special cable to link the C64 to your PC, but I only have three tapes here and there’s no way I would go to that much effort.  Instead I settled on the super cheap method: hook a standard tape deck to the line in of the computer sound card, record the WAV, then use a tool to make a TAP file out of it.  This is not a very good solution for people with a lot of tapes to dump, because the resulting file needs a lot of manual cleaning to work properly afterwards.  The tools I used to fix it all up were:

  • Audacity to record the wave.  Use a 96 kHz sample rate if your hardware supports it.
  • UberCassette (or an alternative, e.g. AudioTap) to turn a wave into a .tap file
  • tapclean to clean up, standardize, repair and verify the tap image
  • a hex editor to do a good bit of manual cleanup
  • A good reference on the tap file format!

End result of all this: I managed to fully dump the three tapes that I own.  One is “Dungeons of Death” for the VIC-20, a previously unreleased RPG.  Another is “Touch Typing Tutor” for C64: a .prg file existed, but no tape dump or scans.  And the last… a tape with programs that my friend had written back in 1985 or so.  Now that I’ve gotten the hang of this, I’m out of material!  So there’s an open request out for C64 tapes.  If you’ve got them, I may be able to dump them.

IMAP Webmail as File Storage

These days it seems like everyone is on the free web storage bandwagon, offering anywhere from tens of MB up to 2 GB of file storage and all kinds of options for sharing it with other people.  But remember back in the day when GMail was first launched, and it was a hit because it offered 1 gigabyte of storage space where its closest competitors had only 10 megabytes free?

Well, those inbox sizes have continued to grow, up to 7.5 GB as of this writing.  GMail isn’t the only one with a big mailbox any more, but it’s certainly the best known.  What if you could convert that space into a web-based drive and store files in it?  Even at a 3-to-4 ratio (base64 encoding turns every 3 bytes into 4 characters) you can still cram over 5 GB into it, which is enough for many people to store a complete photo library, a document backup, etc.  Free offsite backup for just a bit of legwork.
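The capacity estimate is quick to check.  This Python sketch only does the arithmetic; it ignores message headers and the line breaks that MIME adds to base64 bodies, so the real figure would be a little lower.  The helper name usable_bytes is made up for the example.

```python
import base64


def usable_bytes(mailbox_bytes: int) -> int:
    """Raw payload that fits once base64 turns every 3 bytes into 4 characters."""
    return mailbox_bytes * 3 // 4


GB = 10**9
mailbox = 7_500_000_000  # the 7.5 GB inbox mentioned above
print(usable_bytes(mailbox) / GB)  # 5.625

# Sanity check of the 3-to-4 ratio on real data: 768 bytes encode to 1024.
payload = bytes(768)
print(len(payload), len(base64.b64encode(payload)))  # 768 1024
```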

This is a script called “mail-drive.pl” which does exactly that: it provides an interface to an IMAP mailbox and allows users to put, get, list or delete files from the store.  Individual files are broken down into smaller attachment-sized pieces to fit into the mailbox on storage, and retrieved in the proper order when pulling to disk.  Using this with a wrapper script in a cron job will allow me to keep backups in a special GMail inbox, which I can then get back later if I ever need them.  Free offsite backup!
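The split-and-reassemble idea can be sketched in a few lines.  This is written in Python rather than the script's own Perl, and it is not mail-drive.pl's actual storage format: the chunk size, the "name [part/total]" subject convention, and the function names are all assumptions for illustration, and the IMAP upload and download steps are left out entirely.

```python
import base64

CHUNK_SIZE = 5 * 1024 * 1024  # hypothetical per-message attachment limit


def split_file(name: str, data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (subject, base64_body) pairs, one per message to be stored."""
    total = max(1, -(-len(data) // chunk_size))  # ceiling division
    for i in range(total):
        piece = data[i * chunk_size:(i + 1) * chunk_size]
        subject = f"{name} [{i + 1}/{total}]"
        yield subject, base64.b64encode(piece).decode("ascii")


def join_file(messages):
    """Rebuild the original bytes from (subject, body) pairs in any order."""
    def part_number(subject: str) -> int:
        # Pull the "3" out of "photo.jpg [3/7]".
        return int(subject.rsplit("[", 1)[1].split("/", 1)[0])

    ordered = sorted(messages, key=lambda m: part_number(m[0]))
    return b"".join(base64.b64decode(body) for _, body in ordered)
```

Carrying the part number in the subject means the pieces can come back from the mailbox in any order and still be reassembled correctly.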

Euro1943 v1.1b

Introduction

Join the Axis or the Allies in this multiplayer team-based action game. Pick up weapons to help you fight enemies and take over strategic capture points on the map. Climb into a tank, gunboat, or fighter plane and support the infantry on the ground. Or hop into the HQ and spend your team’s funds on weapons and vehicles for the players to use.

Euro1943 is a combination RTS/action game where players take on the role of both soldiers and generals. It was designed as an entry to the 4 Elements V contest at www.gamedev.net, where it took 7th place (out of 24 entrants). The README file contains gameplay information.

Please see LICENSE.TXT for more information about who can use this software, what you can do with it, etc. The long and short of it is that this is freeware, source code is not available (sorry, it’s a wreck), and I’m not responsible for what happens to your computer when using this software.

Screenshots

Downloads

Last updated Oct. 5, 2009