In this final entry of the Ruby and D interop series I’ll be taking you through a practical usage example. We’ll look at an algorithm from the early Section 3 ARG days that does some byte manipulation to create a crude form of ‘encryption’.

From here on out I’ll be using the dub package manager that comes bundled with D to handle the build process.

The Sleuth Algo

# Sleuth algorithm as written in Ruby.
# Expects key and datanode to be arrays of integers.
# Returns an array of integers.
def rb_decrypt(key, datanode)
    output = Array.new(datanode.size / 2, 0) # Upfront allocation for our in-place map.
    i = -1
    output.map! {
        i += 1
        k = key[i % key.size]
        b = datanode[(i * 2) + 1] & 1
        (((datanode[i * 2] + (k & 1)) * 2) - b - k) % 256
    }
end

The specifics of the algorithm are for another blog post, but it’s enough to know that each byte of output is produced from two input bytes taken from an array of bytes labeled datanode. The final output is largely determined by the key bytes, which gives the illusion of a cryptographic function.
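To make the two-bytes-in, one-byte-out idea concrete, here is a small worked example in Ruby. It is only a sketch: decode_byte is a hypothetical helper name of mine, and the sample data and key are borrowed from the D unit test further down in this post.

```ruby
# Hypothetical helper: decode one output byte from a pair of input bytes.
# hi is the even-indexed byte, lo the odd-indexed byte, k the key byte.
def decode_byte(hi, lo, k)
  b = lo & 1
  (((hi + (k & 1)) * 2) - b - k) % 256
end

# Sample data and key taken from the D unit test in this post.
data = "R\xC7Z\x86aC[Jd\xE17\xDFc;U\x89Z7Y\x1D[\x8Ec\x19;1a\xCF\\\x9F^Rb\xEE".b.bytes
key  = "PASSWORD".bytes

# Walk the input two bytes at a time, cycling through the key.
decoded = data.each_slice(2).each_with_index.map do |(hi, lo), i|
  decode_byte(hi, lo, key[i % key.size])
end

puts decoded.pack("C*")
```

For the first pair (0x52, 0xC7) with key byte 'P' (80), b is 1 and k & 1 is 0, so the result is (82 * 2) - 1 - 80 = 83, which is 'S'. The whole run prints "Super secret text", the same result the unit test expects.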

On my machine, running this algorithm against the 8MB data node provided in the sleuth example takes about 1.3 seconds on average. This isn’t exactly terrible if you already know the correct key for the data node, but if you had to brute-force the key… you might be here a while.

Sleuth in D

import rubyffi;
mixin rubyffi!__MODULE__;

export extern(C) {	
    @Ruby ubyte* decrypt(immutable(ubyte*) key, in size_t keyLen, immutable(ubyte*) datanode, in size_t nodeLen) pure {
        import core.memory : GC;
        immutable nodeLength = nodeLen / 2;
        ubyte *output = cast(ubyte*)GC.calloc(nodeLength);
        ubyte b = 0, k = 0;
        for (size_t i = 0; i < nodeLength; i++) {
            k = key[i % keyLen];
            b = datanode[(i * 2) + 1] & 1;
            output[i] = (((datanode[i * 2] + (k & 1)) * 2) - b - k) & 0xFF;
        }
        return output;
    }

    @Ruby ubyte* decrypt_p(immutable(ubyte*) key, in size_t keyLen, immutable(ubyte*) datanode, in size_t nodeLen) {
        import std.parallelism : parallel;
        ubyte[] output = new ubyte[nodeLen / 2];
        foreach (i, ref decrypted; output.parallel) {
            immutable k = key[i % keyLen];
            immutable b = datanode[(i * 2) + 1] & 1;
            decrypted = (((datanode[i * 2] + (k & 1)) * 2) - b - k) & 0xFF;
        }
        return output.ptr;
    }
}

decrypt is written in a more C-like style as I wanted to see the differences between a C and D approach. The only real difference here from a straight C implementation is the GC.calloc() instead of calloc().

decrypt_p is written in a more idiomatic D style, with a little parallelism thrown in for fun. The really nice thing about it is that it’s a straight translation from Ruby to D.
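One detail of the translation is worth a second look: the Ruby version wraps the result with % 256, while both D versions use & 0xFF. The intermediate value can go negative (for example when the data byte is 0 and the key byte is 254), and D's % keeps the sign of the left operand, so a mask is the safer way to get a byte there. In Ruby the two operations agree, which the following quick check illustrates. It is a sanity check of my own, not something from the sleuth repo.

```ruby
# The expression ((d + (k & 1)) * 2) - b - k, for byte-sized d and k and b in {0, 1},
# ranges from -255 (d = 0, k = 254, b = 1) up to 511 (d = 255, k = 1, b = 0).
candidates = (-255..511)

# Collect any value where Ruby's modulo and the bit mask disagree.
mismatches = candidates.reject { |v| v % 256 == (v & 0xFF) }

puts mismatches.empty?
puts(-255 % 256)
puts(-255 & 0xFF)
```

Both of the last two lines print 1, because Ruby's % always returns a result with the sign of the divisor and its integers behave as two's complement under &.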

If we want to test these functions and make sure they’re working correctly, we can take advantage of D’s built-in unit testing.

unittest { 
    import std.string;
    immutable data = "R\xC7Z\x86aC[Jd\xE17\xDFc;U\x89Z7Y\x1D[\x8Ec\x19;1a\xCF\\\x9F^Rb\xEE".representation;
    immutable size = data.length / 2;
    immutable key = "PASSWORD".representation;
    immutable result = "Super secret text";
    string dp = cast(string)decrypt_p(key.ptr, key.length, data.ptr, data.length)[0..size];
    string d = cast(string)decrypt(key.ptr, key.length, data.ptr, data.length)[0..size];
    assert(dp == result, "decrypt_p: wrong result.");
    assert(d == result, "decrypt: wrong result.");
}

If we append this code to the code above and run dub test we’ll see that all of our tests pass.

I’m using the rubyffi mixin from the second part in this series to simplify the binding generation step. You can simply run ./bindgen in the example from the repo to produce the correct bindings.

Likewise you can run dub build --build=release to compile the sleuth example library.

Passing pointers

If we inspect the code produced by ./bindgen we’ll see that we have

 ...
attach_function :decrypt, [:pointer, :uint, :pointer, :uint], :pointer
attach_function :decrypt_p, [:pointer, :uint, :pointer, :uint], :pointer
 ...

The rubyffi mixin has translated the immutable(ubyte*) and ubyte* types into :pointer on the Ruby end for us.

In our Ruby application code we want to write a small helper function to deal with these pointers.

# Expects key and datanode as strings.
def decrypt(k, dn)
    # Size the buffers in bytes; bytesize matches bytes.size even for multi-byte strings.
    kp = FFI::MemoryPointer.new :uchar, k.bytesize
    dp = FFI::MemoryPointer.new :uchar, dn.bytesize
    # Place all the key and datanode bytes starting from offset 0.
    kp.put_array_of_uchar 0, k.bytes
    dp.put_array_of_uchar 0, dn.bytes
    out_len = dn.bytesize / 2
    # Retrieve the result from the pointer and only read as many bytes as we need.
    SLEUTH
        .decrypt(kp, k.bytesize, dp, dn.bytesize)
        .read_array_of_uchar(out_len)
end

This function simply sets up the pointers we require with the correct data types and implicitly returns the result as an array of integers on the Ruby end. I suspect it should be possible to update rubyffi to generate these stubs for us.
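Because the helper hands back a plain array of integers, turning the result into a Ruby string again is a job for Array#pack. A minimal round trip, independent of FFI and using a made-up sample string, looks like this:

```ruby
# Made-up binary sample; .b marks it as ASCII-8BIT so no encoding is assumed.
original = "data\x00node\xFF".b

as_ints  = original.bytes       # array of integers 0..255, one per byte
restored = as_ints.pack("C*")   # "C*" packs each integer as one unsigned byte

puts as_ints.inspect
puts restored == original
```

String#bytes and pack("C*") are inverses for byte-sized integers, which is why the final script can write the decrypted array straight to disk after packing it.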

Putting it all together

Now that we have our D decrypt function accessible to Ruby, we can actually look at ‘decrypting’ the data.node file.

#!/usr/bin/ruby
require 'ffi'
require './app'

def decrypt(k, dn)
    kp = FFI::MemoryPointer.new :uchar, k.length
    dp = FFI::MemoryPointer.new :uchar, dn.length
    kp.put_array_of_uchar 0, k.bytes
    dp.put_array_of_uchar 0, dn.bytes
    APP
        .decrypt_p(kp, k.length, dp, dn.length)
        .read_array_of_uchar(dn.length/2)
end

datanode = IO.binread("data.node")

key = "0000-VGNS-CLAW-BEES-K-RADZ-KON-CULT-A3LG-TOA-S038-S27-D118-WULF-O47-DEW-J009-S01-BNDR-WOLF-EIRE-RYAN-PNDA-SKY-YMCA-GL13-MOA-F09-KRPY-3UP-MG91-MAK-LAW-S692-TEXX-BOB-WURF-AIR-JEDI-WPX-RAZ4-M0A-SOCW-XENO-GUS-SSDD-ES91-117-U190-ENZO-IIII-RC0N"

IO.binwrite("out.mp4",decrypt(key,datanode).reverse.pack("C*"))

Taken directly from the sleuth example repo

Build instructions:

  1. ./bindgen
  2. dub build --build=release
  3. ./main.rb

You should get a working MP4 out.mp4 as the output.

Was it worth it?

Well let’s run a simple ./benchmark.rb from the repo and see what shakes out.

Rehearsal ------------------------------------------------
rb_decrypt:    1.250000   0.000000   1.250000 (  1.356179)
d_decrypt:     0.078000   0.016000   0.094000 (  0.101248)
d_decrypt_p:   0.110000   0.015000   0.125000 (  0.083242)
--------------------------------------- total: 1.469000sec

                   user     system      total        real
rb_decrypt:    1.313000   0.000000   1.313000 (  1.341551)
d_decrypt:     0.094000   0.000000   0.094000 (  0.105518)
d_decrypt_p:   0.062000   0.016000   0.078000 (  0.091533)

It looks like the D functions execute roughly 13 to 15 times faster than their Ruby counterpart on the final pass, which is a pretty solid performance increase for very little work.
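The speedup is simply the ratio of the "real" times. A quick sketch of that arithmetic, with the numbers copied from the final (non-rehearsal) pass of the benchmark output above:

```ruby
# Wall-clock ("real") seconds from the final pass of the benchmark output.
real = { rb_decrypt: 1.341551, d_decrypt: 0.105518, d_decrypt_p: 0.091533 }

# Ratio of the Ruby time to each D time, rounded to one decimal place.
speedups = real.reject { |name, _| name == :rb_decrypt }
               .transform_values { |t| (real[:rb_decrypt] / t).round(1) }

speedups.each { |name, x| puts "#{name}: #{x}x faster than rb_decrypt" }
```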

d_decrypt_p just ekes out a win over the serial d_decrypt implementation, though how much it wins (or loses) by will vary depending on how many cores you have. I was quite impressed by how easy it was to parallelise the algorithm. All I had to do was drop .parallel on the output variable, almost as an afterthought to see how it would work out, and… work out it did!
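The repo's benchmark.rb isn't reproduced in this post, but output in this shape (a "Rehearsal" pass followed by the real one) is what Ruby's standard Benchmark.bmbm prints. A minimal sketch of such a harness might look like the following. It times only the pure-Ruby version, against pseudo-random bytes rather than the real data.node, and the buffer size and label are my own choices; the D entries would be added as further x.report lines that call the FFI helper.

```ruby
require 'benchmark'

# Pure-Ruby sleuth decrypt, as shown earlier in the post.
def rb_decrypt(key, datanode)
  output = Array.new(datanode.size / 2, 0)
  i = -1
  output.map! do
    i += 1
    k = key[i % key.size]
    b = datanode[(i * 2) + 1] & 1
    (((datanode[i * 2] + (k & 1)) * 2) - b - k) % 256
  end
end

key = "PASSWORD".bytes
# Assumption: 64 KiB of seeded pseudo-random input stands in for data.node.
datanode = Random.new(42).bytes(64 * 1024).bytes

# bmbm runs every report twice: once as a rehearsal, once for the real timings.
results = Benchmark.bmbm do |x|
  x.report("rb_decrypt:") { rb_decrypt(key, datanode) }
end
```

bmbm is a reasonable fit here because the rehearsal pass warms things up before the timings that count, which matters for allocation-heavy Ruby code.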

If I had to brute-force the key to one of these data nodes I know which path I’d choose.

Series conclusion

In part 1 I provided an overview of writing D code and creating the bindings from scratch to be run by Ruby.

In part 2 we progressed to using D to generate our bindings at compile time using a library I’ve authored to expedite the process of getting up and running.

By this point I hope you have a good idea of how to start using D to make Ruby gains. I hope I’ve demonstrated that it’s not scary or hard to get down to that ‘low-level’ and optimise some piece of code that gets run often in your Ruby application. D should not be too unfamiliar territory to a Ruby developer; UFCS will make you feel right at home!

I’m not going to say it’s a pain-free experience, but this method should certainly be quicker than wading into building native C extensions with the Ruby devkit.

I find being able to easily unit test my extension code to be a big win.

One final thought. We do have to be a bit wary of who owns the memory allocated in our D functions. We passed a pointer to D-allocated memory over to Ruby… the D garbage collector has no idea that Ruby is using this memory and could collect it (or not) at any time, causing us issues.

It doesn’t really apply to this example as we’re acquiring a pointer to fresh memory on each invocation but it’s definitely something I need to investigate further - perhaps a future blog entry.

Resources

  1. RubyFFI Binding Guide
  2. d-ruby-ffi on GitHub
  3. d-ruby-ffi Examples
  4. std.parallelism