BBCode Parser (in PHP)

For a while now I’ve been working on a little blog engine in PHP. As part of the post rendering, I’ve settled on BBCode to do all markup and layout.

This meant I would need a BBCode parser. So, I started writing one. And because my projects always go this way, I ended up sinking more time into the BBCode parser than the blog engine itself – trying to handle every weird corner case without producing malformed HTML on output. The primary goals were:

  • Easy to use (one file with one function)
  • Correct (Unicode safe, always-valid HTML, reasonable fallbacks)
  • Easy to understand (avoid massive regexes)

Eventually, I ended up just splitting this piece off into its own project, since it may be useful beyond just my tiny blog.

I know there are a million BBCode parsers out there (and even a PHP extension to do it), some extensible and so on. This one is mine. Would love pull requests.

Writing a WebSocket Client in Perl 5

WebSockets are the latest way to provide bi-directional data transfer for HTTP applications. They replace outdated workarounds like AJAX, repeated polling, Comet, etc. A WebSocket connection starts out as an ordinary HTTP request (which itself runs over TCP/IP) and is then upgraded to its own framed protocol on the same connection; it can be wrapped in SSL for security (as in HTTPS). Being a Web Technology, it seems to have been developed by the JavaScript people exclusively for the JavaScript people – working with it outside a web browser or Node.js server can seem convoluted. But that’s the world they built, and we just live in it.

Basic Perl support / documentation for WebSockets was difficult for me to find. The Mojolicious framework (specifically the UserAgent module) has native WebSockets support and can act as a client, but I was looking for info on using WebSockets at a lower level, without Mojo or other frameworks. Hopefully, this post can shed some light on how you can use Perl to connect to a remote server using WebSockets, and get access to those sweet real-time services using our favorite language.

First off, if you can, you should just use AnyEvent::WebSocket::Client or Net::Async::WebSocket::Client (depending on your preference of async framework). The former has already done the hard work of combining the two packages you’d probably use anyway: Protocol::WebSocket::Client (for encoding / decoding WebSocket frames), and AnyEvent (for cross-platform non-blocking I/O and doing all the messy TCP socket stuff for you). Having already established my status as a Luddite (or rather, a desire to know what’s really going on), let’s try to reinvent the wheel and write our own client.

A Client for the Echo Test Service

The goal of this project is to interoperate with the WebSocket Echo Server at ws:// The Echo Server simply listens to any messages sent to it, and returns the same message to the caller. This is enough to build a simple client that we can then customize for other services. There are two things we need to make this work:

  • a plain Internet Socket, for managing the TCP connection to the remote server, and
  • a Protocol handler, for encoding / decoding data in the WebSocket format.

The second part of this is already done for us by Protocol::WebSocket::Client: given a stream of bytes, it can identify WebSocket frames and parse them into data from the server, and it can take data from our program and encapsulate it for sending. This tripped me up at first, so pay attention: Protocol::WebSocket does NOT actually do anything with the TCP socket itself – meaning it does not send or receive any data on its own! The class is responsible for only these things: packing/unpacking data, generating a properly formatted handshake for initiating a WebSocket connection, and sending a “close” message to the server signalling intent to disconnect.
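To make those responsibilities concrete, here is a rough sketch of what a protocol handler does under the hood. It is written in Python purely as an illustration of the wire format described in RFC 6455; it is not how Protocol::WebSocket::Client is implemented, and the function names (make_handshake, expected_accept, encode_text_frame) are made up for this example. Note that nothing here touches a socket: it only produces bytes for you to send.

```python
import base64
import hashlib
import os
import struct

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_handshake(host, path="/"):
    """Build the HTTP Upgrade request; returns (request_bytes, key)."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )
    return request.encode("ascii"), key


def expected_accept(key):
    """Value the server must return in Sec-WebSocket-Accept for our key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def encode_text_frame(message, mask_key=None):
    """Pack a string into a single masked text frame (client to server)."""
    payload = message.encode("utf-8")
    mask_key = mask_key or os.urandom(4)
    header = bytearray([0x81])            # FIN bit set, opcode 0x1 (text)
    n = len(payload)
    if n < 126:
        header.append(0x80 | n)           # 0x80 = "payload is masked"
    elif n < 65536:
        header.append(0x80 | 126)
        header += struct.pack("!H", n)    # 16-bit extended length
    else:
        header.append(0x80 | 127)
        header += struct.pack("!Q", n)    # 64-bit extended length
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return bytes(header) + mask_key + masked
```

The bytes from make_handshake and encode_text_frame are exactly the kind of thing you then have to write to the TCP socket yourself, which is why the socket handling is a separate concern.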

Given that Protocol::WebSocket::Client doesn’t do any TCP socket stuff itself, we have to manage all that. Fortunately, there’s the core module IO::Socket::INET which we can use. Protocol::WebSocket::Client also provides some hooks for points in the WebSocket flow, so that we can insert our own handlers at those points. Let’s get started with some code.

Continue reading

Info Page – @FPAdventuresBot

@FPAdventuresBot currently posts images from the following games:

  • Myst (1993)
    • Extract tool: Riveal
    • Image count: 1531
    • Data files / areas covered:
      • INTRO.DAT
      • MYST.DAT
      • STONE.DAT
      • SELEN.DAT
      • MECHAN.DAT
      • DUNNY.DAT
  • Lighthouse: The Dark Being (1996)
  • Shivers (1995)
    • Extract tool: SCI Resource Viewer + AutoHotkey
    • Image count: 2184
    • Data files / areas covered:
      • (all)
  • The 7th Guest (1993)
    • Extract tool: Custom
    • Image count: 482
    • Data files / areas covered:
      • AT, CH, DR, GA, HTBD, JHEK, LA, MB, MU, P, B, D, FH, HDISK, INTRO, K, LI, MC, N
  • The 11th Hour (1995)
    • Extract tool: Custom
    • Image count: 662
    • Data files / areas covered:
      • atpuz, common3, fhpuz, lapuz, nupuz, pgpuz, bapuz, common4, gapuz, lipuz, omod1, rtpuz, bdpuz, comrooms, htpuz, mbpuz, omod2, chpuz, dopuz, jhpuz, mupuz, omod3, common1, drpuz, kipuz, nav2, omod4, common2, dvmod1c, kxpuz, nav3, omod5
  • The Journeyman Project (1993)
    • Extract tool: ffmpeg
    • Image count: 2045
    • Data files / areas covered:
      • TSA/T_NAV.AVI
      • WSC/W_NAV.AVI
  • The Labyrinth of Time (1993)
    • Extract tool: Custom
    • Image count: 1760
    • Data files / areas covered:
  • Return to Zork (1993)
    • Extract tool: Custom
    • Image count: 256
    • Data files / areas covered:
      • MS-DOS
      • Mac
  • Riven: The Sequel to Myst (1997)
    • Extract tool: Riveal
    • Image count: 3480
    • Data files / areas covered:
      • a_Data.MHK
      • b_Data.MHK
      • g_Data.MHK
      • j_Data1.MHK
      • j_Data2.MHK
      • o_Data.MHK
      • p_Data.MHK
      • r_Data.MHK
      • t_Data1.MHK
      • t_Data2.MHK

Secret of Evermore (Bugfixed)

Secret of Evermore is a game for the Super Nintendo released in 1995. It continues to be a somewhat controversial title in the otherwise spotless Squaresoft SNES library… but I like it, and my wife plays it as a “comfort game” whenever she is feeling sick.

IPS is an antiquated binary patch file format, used to provide a “diff” of raw bytes that should be applied over an existing file. IPS patches for SNES games are widespread, and often do things like enable cheats, alter graphics, translate text, etc. In some cases, people have found bugs or glitches in the original code, and released an IPS patch to fix them. Often these are found from the work of speedrunners, who spot a glitch and exploit it to break the game in some way. The SNES hackers then identify the code problem behind the bug, and patch the raw binary code to close the hole. There’s a whole black art to crafting binary bugfixes – the space for a fix is severely limited, and if the new code is too big you have to find additional unused code area elsewhere to jump into (or optimize a different routine to make some free space!)
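For a sense of how simple the format is, here is a minimal sketch of reading and applying an IPS patch. It is in Python for illustration only (not the code of any existing patcher), and the names parse_ips and apply_ips are invented here. It assumes the commonly documented layout: a “PATCH” header, then records made of a 3-byte big-endian offset, a 2-byte size and the data (a size of 0 means a run-length record: 2-byte count plus one fill byte), ending with “EOF”.

```python
def parse_ips(patch):
    """Yield (offset, data) records from an IPS patch given as bytes."""
    if patch[:5] != b"PATCH":
        raise ValueError("not an IPS patch: missing PATCH header")
    pos = 5
    while True:
        marker = patch[pos:pos + 3]
        if marker == b"EOF":              # end-of-patch marker
            return
        if len(marker) < 3:
            raise ValueError("truncated IPS patch")
        offset = int.from_bytes(marker, "big")
        size = int.from_bytes(patch[pos + 3:pos + 5], "big")
        pos += 5
        if size == 0:
            # RLE record: 2-byte run length, then the byte to repeat.
            run = int.from_bytes(patch[pos:pos + 2], "big")
            data = patch[pos + 2:pos + 3] * run
            pos += 3
        else:
            data = patch[pos:pos + size]
            pos += size
        yield offset, data


def apply_ips(original, patch):
    """Return a patched copy of `original` (bytes)."""
    out = bytearray(original)
    for offset, data in parse_ips(patch):
        end = offset + len(data)
        if end > len(out):                # patch writes past the end: grow the file
            out.extend(b"\x00" * (end - len(out)))
        out[offset:end] = data
    return bytes(out)
```

For example, a patch with one plain record and one RLE record:

```python
patch = (b"PATCH"
         + b"\x00\x00\x02" + b"\x00\x02" + b"AB"             # write "AB" at offset 2
         + b"\x00\x00\x05" + b"\x00\x00" + b"\x00\x03" + b"Z"  # three "Z" bytes at offset 5
         + b"EOF")
print(apply_ips(b"01234567", patch))      # b'01AB4ZZZ'
```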

For some games, more than one bugfix patch is available. Managing these with the ubiquitous LunarIPS tool is a pain – you have to generate a bunch of intermediate ROMs, and the patches may conflict / overwrite one another without indication of the problem. There are better IPS patchers around, but I didn’t want to go on a research quest to find one. Besides, IPS is a pretty simple format – why not just write my own patcher?

I wanted a way to take a binary file, apply a complete “patch set” to it, and return the resulting bin. I wanted it to check for conflicts in patches, and give a descriptive message of which exact patches were colliding. And then I didn’t want to just serve up a cooked file, for copyright and maintainability reasons. “XYZ (Bugfixed)” hacks are too frequently outdated, as new patches are released. So I put a simple PHP frontend before it and a folder full of patches server-side.
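The conflict check boils down to looking for overlapping byte ranges between different patches. A small sketch of that idea, again in Python for illustration and not the actual code behind the tool: find_conflicts is a hypothetical helper, and it assumes each patch has already been reduced to a list of (offset, length) writes.

```python
from itertools import combinations


def find_conflicts(patches):
    """patches maps a patch name to a list of (offset, length) writes.

    Returns (name_a, name_b, start, end) for every pair of writes from
    different patches that touch the same bytes; `end` is exclusive.
    """
    conflicts = []
    for (name_a, writes_a), (name_b, writes_b) in combinations(patches.items(), 2):
        for off_a, len_a in writes_a:
            for off_b, len_b in writes_b:
                start = max(off_a, off_b)
                end = min(off_a + len_a, off_b + len_b)
                if start < end:           # the two ranges share at least one byte
                    conflicts.append((name_a, name_b, start, end))
    return conflicts


# Patch file names below are placeholders.
patches = {
    "a.ips": [(0x100, 4)],
    "b.ips": [(0x102, 4)],
    "c.ips": [(0x200, 1)],
}
for name_a, name_b, start, end in find_conflicts(patches):
    print(f"{name_a} and {name_b} both write to 0x{start:X}..0x{end - 1:X}")
```

Comparing every pair of writes is quadratic, which should be fine for a folder with a handful of small patches; a larger patch set would want the ranges sorted first.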

The tool is here:

Users can upload a file. If the SHA-1 matches, it gets patched and they download the fixed version. This tool is for Secret of Evermore, with all the patches (i.e. “hard work”) done by Assassin17.
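The SHA-1 gate is just a fingerprint comparison against the one file the patches were written for. A sketch of that step in Python (the real frontend is PHP; matches_expected_sha1 is a name made up for this example, and the expected hash would be whatever the known-good file hashes to):

```python
import hashlib


def matches_expected_sha1(data, expected_hex):
    """True if the uploaded bytes hash to the expected SHA-1 (hex string)."""
    return hashlib.sha1(data).hexdigest() == expected_hex.strip().lower()


# Well-known SHA-1 test vector for b"abc", standing in for a real file hash.
print(matches_expected_sha1(b"abc", "A9993E364706816ABA3E25717850C26C9CD0D89D"))  # True
```

Rejecting anything that does not match means the patches are only ever applied to the exact bytes they were made for.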

I may stand up sites for other games as I run across them, or merge these into a single “bugfixer” tool if it gets too out of control.

The script follows.
Continue reading

Downloading from Soundcloud

A couple of years ago, downloading a song from Soundcloud was pretty trivial. Their server would send you the complete 128kbps MP3, and then the local embedded control would allow you to seek at will. Because the file arrived in one large chunk, it was both easy to identify in cache, and easy to copy somewhere else to play back. Sometimes this still works… You’ll know by looking at the dev console and seeing whether it shows a huge MP3 file transfer. If so, you’re in luck! Copy it from the cache and you’re set.

Evidently they’ve changed this practice for other tunes, possibly to improve the latency of seeking at random in tracks, or possibly because they don’t want people getting music they shouldn’t be able to get. You can get Greasemonkey scripts which put the download button back, but these simply fire the URL off to a third-party site which “somehow” reconstructs the song and then sends it back your way. Very black-box magic stuff indeed.

However – If you can stream it, you can download it, as they say. Let’s take a look at how Soundcloud actually gets a song to you, and see if we can still figure out how to download something we may not really be allowed to.

Start the browser’s Developer Console and then browse to a song you want to hear. Keep an eye on the “network” activity, it will give you clues as to what is actually going on. As the song begins playing, you’ll see a lot of small network requests to magically named files:

This seems promising. Download one or two and run “file” on them, and you get:
$ file *
c5f47vUnF3Ow.128.mp3?f10880d39085a94a0418a7e163b03d5226edfe2317e6aa1445547d76cf23a7ca5b08b0b9169eed2c0a13f681ab93c51d8e788dcaa887622ee2905d7463e4fd982e918b5b687caf75047026a3429731c5010a16: MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, JntStereo
c5f47vUnF3Ow.128.mp3?f10880d39085a94a0418a7e167b03d5249919aaf544816306c9a5e3ca05a129454accfdda2750c51705ac2f68f036a37b2c482058312ab10625db87a6e3ab6dc1d1631dbd883a3f38786db484e66359daf667314eb8f03: MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, JntStereo

Okay, so Soundcloud has broken the file into parts and is playing them back in sequence. You can pop one of these into your media player and listen to a portion of the song. We’re close, but how do we know where to find all these parts and put them together in order? Easy: there’s an m3u8 that has that for you – check the Dev Console again! Soundcloud’s player is using this to fetch the data in order from various URLs, and then stream it to you. For example, something like this:


An interesting aside: these URLs seem to time out after a short period, leading to 403 Forbidden errors if you try to access them again. No doubt these huge URL parameters encode some browser session or timestamp which becomes invalid after a while. If that happens, reload the page and start playing again to generate new URLs.

So to recap all this: we need to

  • take the m3u8 file,
  • retrieve each mp3 segment,
  • and concatenate them together.

Getting the m3u8 programmatically is hard, so just copy it from the browser : ) And to put these together you’ll need mp3cat installed – see for info.

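#!/usr/bin/env perl
use strict;
use warnings;
# playlist.m3u8 is the playlist copied from the browser; outdir/ must already exist.
open(my $fp, '<', 'playlist.m3u8') or die "can't open playlist: $!";

my $piece = 0;
while (my $url = <$fp>) {
    $url =~ s/\s+$//;                       # strip the newline (and any CR)
    next if $url eq '' or $url =~ m/^#/;    # skip blank lines and m3u8 directives
    # Zero-padded numeric names keep the shell glob below in playback order.
    my $filename = sprintf('outdir/%03d.mp3', $piece++);
    # List-form system() bypasses the shell, so '?' and '&' in the URL are safe.
    system('wget', '--no-check-certificate', '-O', $filename, $url) == 0
        or die "wget failed for $url";
}

# Concatenate the segments in order and let mp3cat write one clean stream.
print `cat outdir/*.mp3 | mp3cat - - > output.mp3`;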