Wednesday, February 12, 2014

Book Review: Ruby Under a Microscope

Let's start with a disclaimer. I'm not a Computer Science major, nor did I complete a Computer Science course of study in school. I'm a software tester and one that finds themselves frequently using programming languages of various stripes for various purposes. Ruby is one of the most popular languages in current use, and for many, it's a language that allows them to learn some basic terms, some programming constructs, and then lets them just use it. It's clean, it's elegant, it's almost simple. It's a language that invites the user to just go with it.

For some, though, there's that sense of curiosity... what is my Ruby program really doing? How can I see what the system is actually doing with my code? What's going on underneath the hood? If such explorations interest you, then "Ruby Under a Microscope" by Pat Shaughnessy tackles that subject handily.

A word of warning going in. This is not a general language book. You will not learn much about programming in Ruby here. You should have a decent understanding of what Ruby syntax looks like and how it works. Having said that, you don't need to have years of experience with Ruby to appreciate this book for what it does. It takes some key areas of the language, and through examples, some small programs, and a variety of tools, lets you see and understand what Ruby actually looks like up close and personal.


Chapter 1 focuses on how Ruby understands the text that you type into your Ruby program. Ruby converts your source code first into tokens, and then converts that input stream into an "abstract syntax tree”. Through tools like “Ripper”, you can see this process and watch it in something resembling natural language (well, kind of. It’s still debug code, but it’s a lot less intimidating than one might think).

Chapter 2 covers how Ruby compiles code. Wait, isn’t Ruby a “scripting language, no compiler required? With 1.8 and earlier, yes, but with 1.9 and up, Ruby is compiled just like many other languages we are familiar with. The difference? Ruby does it automatically. You never need to invoke the compiler. Ruby also has its own “virtual machine” (YARV, or "Yet Another Ruby Virtual Machine) that it compiles its bytecode for. Ultimately, the byte code for YARV is what we witness running.

Chapter 3 goes into greater detail about how YARV runs our code. By comparing the steps necessary to run a simple program, we can compare the time it takes to run a program in Ruby 1.8 (which doesn’t have a compile sequence, it just runs) and Ruby 1.9 and 2.0, which do have compile sequences. For simple and brief interactions, it actually looks like Ruby 1.8 performs better, but for longer runs with more iterations, 1.9 and 2.0 have a huge advantage over 1.8 by virtue of its compile cycle. 

Chapter 4 focuses more attention on the virtual machine and how control structures and methods are handled within the YARV. If statements, for loops and calls to various methods demonstrate how ruby breaks down the instructions, as well as how it utilizes its internal calls to “jump” from one piece of code to another.  Ruby categorizes methods into 11 types, and labels its built-in methods as CFUNC methods, meaning they are implemented in C. Ruby also uses a hash to keep track of the number of arguments, her labels and what their default values should be.

Chapter 5 looks at objects and classes, specifically, Ruby’s internal objects and classes. Each Ruby object is, ultimately,  a class pointer paired with an array of instance variables, and everything in Ruby is an object. Several generic objects are shown, along with their C structures (RString, RArray, RRegexp, etc.), and demonstrate how they, likewise, are also simple combinations of a class pointer and instance variables. Classes are a little more involved. Each Ruby class can be defined as Ruby object (with its class pointer & instance variables) plus method definitions, attribute names, a constants table, and a “superclass” pointer.

Chapter 6 brings us deeper into methods and constants, specifically how these aspects are found and represented. Ruby lets a programmer look at programs with two contrasting paradigms. Our code can be organized through classes and superclasses, or it can be organized through "lexical scope". Which approach makes the most sense? it depends on what you want to have your program accomplish.

Chapter 7 gets into one of the key attributes of Ruby internals, the Hash Table. These are interesting data structures that allow a program to return values quickly, and to automatically increase in size as more elements are added. the chapter takes a deep dive into Ruby’s hash function and how it allows for elements to be accessed quickly.

 Chapter 8 covers blocks, and how the Blocks concept in Ruby borrows from the “closure” idea first prosed in the Lisp language several decades back. Blocks can be defined as “a combination of a function and an environment to use when calling that function”.  Using “lambda”,  a block can become a data value that can be passed, saved, and reused.

 Chapter 9 discusses Metaprogramming, a means to program in a way that code can inspect and change itself, dynamically. In other words, by referencing itself, your program can change itself! I’ll admit, this is one of the aspects of Ruby (or any language) that I have trouble getting my head around, and while I won’t claim to have mastery of these ideas after this chapter, I feel I have a little better feel for what’s happening.

 Chapter 10 takes us into the Java realm and shows us Ruby implemented in Java, as opposed to how we’ve been interacting with it thus far in C. The flow is similar, but each Ruby script gets compiled into a Java executable, and then is physically run by the Java Virtual Machine. We see how “Jay” parses the lines of code (much the way Bison does for MRI). By monitoring Java’s Just In time Compiler, we can see which class and structures are called whenever we create a script and run it. We can also see where, by focusing on various “hot spots in our program and compiling them into Java, we can save time in key areas compared to C implemented MRI.

Chapter 11 introduces Rubinious, a version of Ruby implemented with Ruby.  Well, it’s actually a virtual machine using C++ to run  Ruby code. What makes it different is that, rather than relying on C or Java Structures for the built-in classes, Rubinious does it with Ruby code. What does this mean? we can see how Ruby works internally without having to know C or Java. It’s all done in Ruby, and we can see how by reading the source code.

 Chapter 12 explores Ruby’s Garbage Collection, and how it differs, and is similar, in MRI, JRuby and Rubinious. Garbage collection helps us with three processes;  allocating memory for use by new objects,  identifying which objects a program is no longer using and reclaiming memory from unused objects. Various programs and examples demonstrate which objects are mapped where, and how to see when they are deallocated and their memory freed. Various algorithms for the various virtual machines are explored, but this is just a cursory overview of all the options and how they are implemented. Still, it’s an interesting view into a process that many of us take for granted because Ruby and many other languages basically let us take them for granted. 

Bottom Line:


"Ruby Under a Microscope" does something fairly ambitious. It attempts to write a system internals book in a language that non-computer scientists can readily understand. While there are numerous code snippets and examples to try and examine, the ability to look at the various Ruby internals and systems and see how they fit together can be accomplished by someone with general skills and basic familiarity with programming at the script level (which for many of us is as far as we typically get). An old saying says you can’t tell where you are going if you don’t know where you’ve been. Similarly, we can’t expect to get the most out of languages like Ruby without having a more clear idea what’s happening under the hood. It’s entirely possible to work with Ruby and never learn some of this stuff, but having a guide like "Ruby Under a Microscope” opens up a variety of avenues, and does so in a way that will make the journey interesting and, dare I say it, even a little fun.

No comments: