{"id":354,"date":"2019-03-11T17:44:35","date_gmt":"2019-03-11T22:44:35","guid":{"rendered":"https:\/\/greg-kennedy.com\/wordpress\/?p=354"},"modified":"2019-03-11T19:08:26","modified_gmt":"2019-03-12T00:08:26","slug":"writing-a-websocket-client-in-perl-5","status":"publish","type":"post","link":"https:\/\/greg-kennedy.com\/wordpress\/2019\/03\/11\/writing-a-websocket-client-in-perl-5\/","title":{"rendered":"Writing a WebSocket Client in Perl 5"},"content":{"rendered":"\n<p>WebSockets are the latest way to provide bi-directional data transfer for HTTP applications.  They replace outdated workarounds like AJAX, repeated polling, Comet, etc.  WebSockets are a special protocol atop HTTP (which in itself runs over TCP\/IP), and can be wrapped in SSL for security (as in HTTPS).  Being a Web Technology, it seems to have been developed by the JavaScript people exclusively for the JavaScript people &#8211; working with it outside a web browser or Node.js server <a href=\"http:\/\/lucumr.pocoo.org\/2012\/9\/24\/websockets-101\/\">can seem convoluted<\/a>.  But that&#8217;s the world they built, and we just live in it.<\/p>\n\n\n\n<p>Basic Perl support \/ documentation for WebSockets was difficult for me to find.  The <a href=\"https:\/\/metacpan.org\/pod\/Mojolicious\">Mojolicious <\/a>framework (specifically the UserAgent module) has native WebSockets support and can act as a client, but I was looking for info on working with WebSockets on a lower level \/ without Mojo or other frameworks.  
Hopefully, this post can shed some light on how you can use Perl to connect to a remote server using WebSockets, and get access to those sweet real-time services using our favorite language.<\/p>\n\n\n\n<p>First off, if you can, you should <strong>just use <\/strong><a href=\"https:\/\/metacpan.org\/pod\/AnyEvent::WebSocket::Client\"><strong>AnyEvent::WebSocket::Client<\/strong><\/a><strong> or <\/strong><a href=\"https:\/\/metacpan.org\/pod\/Net::Async::WebSocket::Client\"><strong>Net::Async::WebSocket::Client<\/strong><\/a> (depending on your preference of async framework).  This package has already done the hard work of combining the two packages you&#8217;d probably use anyway, Protocol::WebSocket::Client (for encoding \/ decoding WebSocket frames), and AnyEvent (for cross-platform non-blocking I\/O and doing all the messy TCP socket stuff for you).  Having already established <del>my status as a Luddite<\/del> a desire to know what&#8217;s really going on, let&#8217;s try to reinvent the wheel and write our own client.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">A Client for the Echo Test Service<\/h2>\n\n\n\n<p>The goal of this project is to interoperate with the <a href=\"https:\/\/www.websocket.org\/echo.html\">WebSocket Echo Server<\/a> at ws:\/\/echo.websocket.org.  The Echo Server simply listens to any messages sent to it, and returns the same message to the caller.  This is enough to build a simple client that we can then customize for other services.  
There are two things we need to make this work:<br><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>a plain Internet Socket, for managing the TCP connection to the remote server, and<\/li><li>a Protocol handler, for encoding \/ decoding data in the WebSocket format.<\/li><\/ul>\n\n\n\n<p>The second part of this is already done for us by <a href=\"https:\/\/metacpan.org\/pod\/Protocol::WebSocket::Client\">Protocol::WebSocket::Client<\/a>: given a stream of bytes, it can identify WebSocket frames and parse them into data from the server, and it can take data from our program and encapsulate it for sending.  <strong>This tripped me up at first, so pay attention:<\/strong>  Protocol::WebSocket <strong>does NOT<\/strong> actually do anything with the TCP socket itself &#8211; meaning it <strong>does not<\/strong> send or receive any data on its own!  The class is responsible for only these things: packing\/unpacking data, generating a properly formatted handshake for initiating a WebSocket connection, and sending a &#8220;close&#8221; message to the server signalling intent to disconnect.<\/p>\n\n\n\n<p>Given that Protocol::WebSocket::Client doesn&#8217;t do any TCP socket stuff itself, we have to manage all that.  Fortunately, there&#8217;s the core module <a href=\"https:\/\/perldoc.perl.org\/IO\/Socket\/INET.html\">IO::Socket::INET<\/a> which we can use.  Protocol::WebSocket::Client also provides some hooks for points in the WebSocket flow, so that we can insert our own handlers at those points.  
Let&#8217;s get started with some code.<\/p>\n\n\n\n<!--more-->\n\n\n\n<h3 class=\"wp-block-heading\">Example Code<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/usr\/bin\/env perl\nuse v5.014;\nuse warnings;\n\n# Simple WebSocket test client using blocking I\/O\n#  Greg Kennedy 2019\n\n# Core IO::Socket::INET for creating sockets to the Internet\nuse IO::Socket::INET;\n# Protocol handler for WebSocket HTTP protocol\nuse Protocol::WebSocket::Client;\n\n# Uncomment this line if your IO::Socket::INET is below v1.18 -\n#  it enables auto-flush on socket operations.\n#$|= 1;\n\n#####################\n\ndie \"Usage: $0 URL\" unless scalar @ARGV == 1;\n\nmy $url = $ARGV[0];\n\n# Protocol::WebSocket takes a full URL, but IO::Socket::* uses only a host\n#  and port.  This regex section retrieves host\/port from URL.\nmy ($proto, $host, $port, $path);\nif ($url =~ m\/^(?:(?&lt;proto>ws|wss):\\\/\\\/)?(?&lt;host>[^\\\/:]+)(?::(?&lt;port>\\d+))?(?&lt;path>\\\/.*)?$\/)\n{\n  $host = $+{host};\n  $path = $+{path};\n\n  if (defined $+{proto} &amp;&amp; defined $+{port}) {\n    $proto = $+{proto};\n    $port = $+{port};\n  } elsif (defined $+{port}) {\n    $port = $+{port};\n    if ($port == 443) { $proto = 'wss' } else { $proto = 'ws' }\n  } elsif (defined $+{proto}) {\n    $proto = $+{proto};\n    if ($proto eq 'wss') { $port = 443 } else { $port = 80 }\n  } else {\n    $proto = 'ws';\n    $port = 80;\n  }\n} else {\n  die \"Failed to parse Host\/Port from URL.\";\n}\n\nsay \"Attempting to open blocking INET socket to $proto:\/\/$host:$port...\";\n\n# create a basic TCP socket connected to the remote server.\nmy $tcp_socket = IO::Socket::INET->new(\n  PeerAddr => $host,\n  PeerPort => \"$proto($port)\",\n  Proto => 'tcp',\n  Blocking => 1\n) or die \"Failed to connect to socket: $@\";\n\n# create a websocket protocol handler\n#  this doesn't actually \"do\" anything with the socket:\n#  it just encodes \/ decode WebSocket messages.  
We have to send them ourselves.\nsay \"Trying to create Protocol::WebSocket::Client handler for $url...\";\nmy $client = Protocol::WebSocket::Client->new(url => $url);\n\n# This is a helper function to take input from stdin, and\n#  * if \"exit\" is entered, disconnect and quit\n#  * otherwise, send the value to the remote server.\nsub get_console_input\n{\n  say \"Type 'exit' to quit, anything else to message the server.\";\n\n  # get something from the user\n  my $input;\n  do { $input = &lt;STDIN>;  chomp $input } while ($input eq '');\n\n  if ($input eq 'exit') {\n    $client->disconnect;\n    exit;\n  } else {\n    $client->write($input);\n  }\n}\n\n# Set up the various methods for the WS Protocol handler\n#  On Write: take the buffer (WebSocket packet) and send it on the socket.\n$client->on(\n  write => sub {\n    my $client = shift;\n    my ($buf) = @_;\n\n    syswrite $tcp_socket, $buf;\n  }\n);\n\n# On Connect: this is what happens after the handshake succeeds, and we\n#  are \"connected\" to the service.\n$client->on(\n  connect => sub {\n    my $client = shift;\n\n   # You may wish to set a global variable here (our $isConnected), or\n   #  just put your logic as I did here.  Or nothing at all :)\n   say \"Successfully connected to service!\";\n\n   get_console_input();\n  }\n);\n\n# On Error, print to console.  This can happen if the handshake\n#  fails for whatever reason.\n$client->on(\n  error => sub {\n    my $client = shift;\n    my ($buf) = @_;\n\n    say \"ERROR ON WEBSOCKET: $buf\";\n    $tcp_socket->close;\n    exit;\n  }\n);\n\n# On Read: This method is called whenever a complete WebSocket \"frame\"\n#  is successfully parsed.\n# We will simply print the decoded packet to screen.  Depending on the service,\n#  you may e.g. 
call decode_json($buf) or whatever.\n$client->on(\n  read => sub {\n    my $client = shift;\n    my ($buf) = @_;\n\n    say \"Received from socket: '$buf'\";\n\n    # it's our \"turn\" to send a message.\n    get_console_input();\n  }\n);\n\n# Now that we've set all that up, call connect on $client.\n#  This causes the Protocol object to create a handshake and write it\n#  (using the on_write method we specified - which includes syswrite $tcp_socket)\nsay \"Calling connect on client...\";\n$client->connect;\n\n# Now, we go into a loop, calling sysread and passing results to client->read.\n#  The client Protocol object knows what to do with the data, and will\n#  call our hooks (on_connect, on_read, on_read, on_read...) accordingly.\nwhile ($tcp_socket->connected) {\n  # await response\n  my $recv_data;\n  my $bytes_read = sysread $tcp_socket, $recv_data, 16384;\n\n  if (!defined $bytes_read) { die \"sysread on tcp_socket failed: $!\" }\n  elsif ($bytes_read == 0) { die \"Connection terminated.\" }\n\n  # unpack response - this triggers any handler if a complete packet is read.\n  $client->read($recv_data);\n}<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Running the Example<\/h3>\n\n\n\n<p>Save this to a file (blocking-client.pl) and execute it, passing a URL on the command line.  If all goes well, you should connect to the remote service, and then be prompted to type messages.  Sending a message should return the exact same message from the server.  Typing &#8220;exit&#8221; will send a final &#8220;close&#8221; packet to the server, and then exit your program.  
An example session looks like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ .\/blocking-client.pl ws:\/\/echo.websocket.org\nAttempting to open blocking INET socket to ws:\/\/echo.websocket.org:80...\nTrying to create Protocol::WebSocket::Client handler for ws:\/\/echo.websocket.org...\nCalling connect on client...\nSuccessfully connected to service!\nType 'exit' to quit, anything else to message the server.\nHELLO THERE!\nReceived from socket: 'HELLO THERE!'\nType 'exit' to quit, anything else to message the server.\nIt seems to be working.\nReceived from socket: 'It seems to be working.'\nType 'exit' to quit, anything else to message the server.\nexit\n$<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">SSL Support<\/h2>\n\n\n\n<p>The example above works for non-encrypted WebSockets only.  As WS is a layer atop HTTP, it is also possible to run WebSockets over HTTPS using SSL, usually on port 443 instead.  Secure WebSocket URLs begin with wss:\/\/ instead of ws:\/\/.<\/p>\n\n\n\n<p>However, IO::Socket::INET does not have built-in support for SSL.  Thus attempting to connect to the &#8220;secure&#8221; test server at wss:\/\/echo.websocket.org will cause the program to hang and never complete the handshake.  This is because you are speaking unencrypted HTTP to a server expecting encrypted HTTP.<\/p>\n\n\n\n<p>There are a few ways around this.  The first is to replace IO::Socket::INET with something that is SSL-aware, as in <a href=\"https:\/\/metacpan.org\/pod\/IO::Socket::SSL\">IO::Socket::SSL<\/a>.  With this module, most functions are the same, and it should transparently handle the SSL for you.  
You may also try a module that provides SSL over an existing INET socket, as in <a href=\"https:\/\/metacpan.org\/pod\/Net::SSL\">Net::SSL<\/a>, or for the truly hardcore use <a href=\"https:\/\/metacpan.org\/pod\/Net::SSLeay\">Net::SSLeay<\/a> (Perl bindings to OpenSSL).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Non-Blocking I\/O<\/h2>\n\n\n\n<p>While this example &#8220;works&#8221;, it has a major drawback: It uses blocking I\/O.  This means that socket communication happens in the foreground: when calling send() or recv() on the socket object, your program will halt at that point until data is available.  While waiting to recv() some data, you can&#8217;t do anything else.  That works OK for this Echo test, but remember that WebSockets are bi-directional: you should be able to juggle multiple things at once.  For example, some services require you to send a periodic &#8220;heartbeat&#8221; to keep connected &#8211; but if already blocking on recv(), you can&#8217;t send() the necessary packet!  Even the Echo example is limited: ideally, you should be able to send() two messages, then recv() two responses.  But because of the design of the example script, it is forced into a &#8220;taking turns&#8221; pattern instead.<\/p>\n\n\n\n<p>Again, there are ways around this.  For a half-solution, you can continue to use blocking I\/O, but with a timeout period.  This allows you to wait a certain time to recv() \/ send() data before &#8220;giving up&#8221;.  The function to adjust socket parameters is <a href=\"https:\/\/perldoc.perl.org\/functions\/setsockopt.html\">setsockopt()<\/a>.<\/p>\n\n\n\n<p>Another way to handle this is to use IO::Select instead.  
The Select object lets you pool sockets and then test at once if any have data waiting &#8211; thus, you can &#8220;multiplex&#8221; inputs together for asynchronous operation, even if the underlying sockets still block.<\/p>\n\n\n\n<p>You can also create the sockets in non-blocking mode and poll them &#8211; reading when no data is available returns immediately, with the error EWOULDBLOCK.  A further abstraction is an I\/O handling library such as IO::Poll, AnyEvent, POE, etc.  Of course, if you&#8217;re going to go THAT route, then maybe you should just do what I said at the start and use AnyEvent::WebSocket::Client &#8211; it combines Protocol::WebSocket with AnyEvent and gives a clean, non-blocking interface for interacting with remote web services.<\/p>\n\n\n\n<p>For a detailed look at each of these methods, and a module that can handle some of them for you, I recommend reading the post &#8220;<a href=\"https:\/\/medium.com\/booking-com-development\/io-socket-timeout-socket-timeout-made-easy-4dfd43e777f7\">IO::Socket::Timeout: socket timeout made easy<\/a>&#8221; on Medium.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">A Non-Blocking, SSL-Aware Client<\/h2>\n\n\n\n<p>I&#8217;ll go ahead and modify my client to address the two issues above.  This version uses IO::Socket::SSL instead (for wss:\/\/ connections), and IO::Select to handle non-blocking reads on the socket and STDIN, thus enabling a fully asynchronous connection to a WebSocket service.  
The initial connection uses a blocking connection until the initial handshake is complete; afterwards, select() is used to multiplex reads from the TCP socket and STDIN at the same time.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/usr\/bin\/env perl\nuse v5.014;\nuse warnings;\n\n# Perl WebSocket test client\n#  Greg Kennedy 2019\n\n# IO::Socket::SSL lets us open encrypted (wss) connections\nuse IO::Socket::SSL;\n# IO::Select to \"peek\" IO::Sockets for activity\nuse IO::Select;\n# Protocol handler for WebSocket HTTP protocol\nuse Protocol::WebSocket::Client;\n\n#####################\n\ndie \"Usage: $0 URL\" unless scalar @ARGV == 1;\n\nmy $url = $ARGV[0];\n\n# Protocol::WebSocket takes a full URL, but IO::Socket::* uses only a host\n#  and port.  This regex section retrieves host\/port from URL.\nmy ($proto, $host, $port, $path);\nif ($url =~ m\/^(?:(?&lt;proto>ws|wss):\\\/\\\/)?(?&lt;host>[^\\\/:]+)(?::(?&lt;port>\\d+))?(?&lt;path>\\\/.*)?$\/)\n{\n  $host = $+{host};\n  $path = $+{path};\n\n  if (defined $+{proto} &amp;&amp; defined $+{port}) {\n    $proto = $+{proto};\n    $port = $+{port};\n  } elsif (defined $+{port}) {\n    $port = $+{port};\n    if ($port == 443) { $proto = 'wss' } else { $proto = 'ws' }\n  } elsif (defined $+{proto}) {\n    $proto = $+{proto};\n    if ($proto eq 'wss') { $port = 443 } else { $port = 80 }\n  } else {\n    $proto = 'ws';\n    $port = 80;\n  }\n} else {\n  die \"Failed to parse Host\/Port from URL.\";\n}\n\nsay \"Attempting to open SSL socket to $proto:\/\/$host:$port...\";\n\n# create a connecting socket\n#  SSL_startHandshake is dependent on the protocol: this lets us use one socket\n#  to work with either SSL or non-SSL sockets.\nmy $tcp_socket = IO::Socket::SSL->new(\n  PeerAddr => $host,\n  PeerPort => \"$proto($port)\",\n  Proto => 'tcp',\n  SSL_startHandshake => ($proto eq 'wss' ? 
1 : 0),\n  Blocking => 1\n) or die \"Failed to connect to socket: $@\";\n\n# create a websocket protocol handler\n#  this doesn't actually \"do\" anything with the socket:\n#  it just encodes \/ decodes WebSocket messages.  We have to send them ourselves.\nsay \"Trying to create Protocol::WebSocket::Client handler for $url...\";\nmy $client = Protocol::WebSocket::Client->new(url => $url);\n\n# Set up the various methods for the WS Protocol handler\n#  On Write: take the buffer (WebSocket packet) and send it on the socket.\n$client->on(\n  write => sub {\n    my $client = shift;\n    my ($buf) = @_;\n\n    syswrite $tcp_socket, $buf;\n  }\n);\n\n# On Connect: this is what happens after the handshake succeeds, and we\n#  are \"connected\" to the service.\n$client->on(\n  connect => sub {\n    my $client = shift;\n\n   # You may wish to set a global variable here (our $isConnected), or\n   #  just put your logic as I did here.  Or nothing at all :)\n   say \"Successfully connected to service!\";\n  }\n);\n\n# On Error, print to console.  This can happen if the handshake\n#  fails for whatever reason.\n$client->on(\n  error => sub {\n    my $client = shift;\n    my ($buf) = @_;\n\n    say \"ERROR ON WEBSOCKET: $buf\";\n    $tcp_socket->close;\n    exit;\n  }\n);\n\n# On Read: This method is called whenever a complete WebSocket \"frame\"\n#  is successfully parsed.\n# We will simply print the decoded packet to screen.  Depending on the service,\n#  you may e.g. call decode_json($buf) or whatever.\n$client->on(\n  read => sub {\n    my $client = shift;\n    my ($buf) = @_;\n\n    say \"Received from socket: '$buf'\";\n  }\n);\n\n# Now that we've set all that up, call connect on $client.\n#  This causes the Protocol object to create a handshake and write it\n#  (using the on_write method we specified - which includes syswrite $tcp_socket)\nsay \"Calling connect on client...\";\n$client->connect;\n\n# read until handshake is complete.\nwhile (! 
$client->{hs}->is_done)\n{\n  my $recv_data;\n\n  my $bytes_read = sysread $tcp_socket, $recv_data, 16384;\n\n  if (!defined $bytes_read) { die \"sysread on tcp_socket failed: $!\" }\n  elsif ($bytes_read == 0) { die \"Connection terminated.\" }\n\n  $client->read($recv_data);\n}\n\n# Create a Socket Set for Select.\n#  We can then test this in a loop to see if we should call read.\nmy $set = IO::Select->new($tcp_socket, \\*STDIN);\n\nwhile (1) {\n  # call select and see who's got data\n  my ($ready) = IO::Select->select($set);\n\n  foreach my $ready_socket (@$ready) {\n    # read data from ready socket\n    my $recv_data;\n    my $bytes_read = sysread $ready_socket, $recv_data, 16384;\n\n    # handler by socket type\n    if ($ready_socket == \\*STDIN) {\n      # Input from user (keyboard, cat, etc)\n      if (!defined $bytes_read) { die \"Error reading from STDIN: $!\" }\n      elsif ($bytes_read == 0) {\n        # STDIN closed (ctrl+D or EOF)\n        say \"Connection terminated by user, sending disconnect to remote.\";\n        $client->disconnect;\n        $tcp_socket->close;\n        exit;\n      } else {\n        chomp $recv_data;\n        $client->write($recv_data);\n      }\n    } else {\n      # Input arrived from remote WebSocket!\n      if (!defined $bytes_read) { die \"Error reading from tcp_socket: $!\" }\n      elsif ($bytes_read == 0) {\n        # Remote socket closed\n        say \"Connection terminated by remote.\";\n        exit;\n      } else {\n        # unpack response - this triggers any handler if a complete packet is read.\n        $client->read($recv_data);\n      }\n    }\n  }\n}<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>WebSockets are the latest way to provide bi-directional data transfer for HTTP applications. They replace outdated workarounds like AJAX, repeated polling, Comet, etc. WebSockets are a special protocol atop HTTP (which in itself runs over TCP\/IP), and can be wrapped in SSL for security (as in HTTPS). 
Being a Web Technology, it seems to have [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8],"tags":[],"class_list":["post-354","post","type-post","status-publish","format-standard","hentry","category-software"],"_links":{"self":[{"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/posts\/354","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/comments?post=354"}],"version-history":[{"count":6,"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/posts\/354\/revisions"}],"predecessor-version":[{"id":361,"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/posts\/354\/revisions\/361"}],"wp:attachment":[{"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/media?parent=354"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/categories?post=354"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/greg-kennedy.com\/wordpress\/wp-json\/wp\/v2\/tags?post=354"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}