A while ago, I did a pair programming exercise as part of an interview. The goal was to implement a tail-like command in Ruby. In a nutshell, the tail command reads the last few lines of the text it receives as input and prints them to standard output. For the finer details, check here.
In this post, we are going to implement a limited version of the tail command in Ruby. A complete example of the code is located at https://github.com/renehernandez/bok-tail. Without further delay, let’s jump right into it.
Tail logic implementation
In order to produce a decoupled implementation, we are going to focus first on the logic of reading the last n lines of a file in Ruby. We’ll discuss different ways to do it in the next sections.
Using readlines method
From the IO library in Ruby, the method readlines will return an array with all the lines of text in the given file.
def read(filename, num_lines = 10)
  output = ''
  File.readlines(filename).last(num_lines).each do |line|
    output << line
  end
  output
end
In the code above, after reading all the lines, we use last to keep only the number of lines we want, and at the end we return the accumulated output string. This solution is very simple, but it has a significant drawback: to read the last lines of the file, we load the entire content of the file into memory. What happens if we want to read the last 20 lines of a 2 GB file?
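As a middle ground, here is a minimal sketch that still scans the whole file but keeps at most num_lines lines in memory at any time. It is not part of the original exercise: the helper name last_lines_streaming and the temporary sample file are assumptions made for illustration.

```ruby
require 'tempfile'

# Hypothetical helper: streams the file line by line and keeps only
# the most recent num_lines lines in a small buffer.
def last_lines_streaming(filename, num_lines = 10)
  buffer = []
  File.foreach(filename) do |line|
    buffer << line
    buffer.shift if buffer.size > num_lines # drop the oldest line
  end
  buffer.join
end

# Small demonstration on a temporary file with five lines.
sample = Tempfile.new('tail-demo')
sample.write("one\ntwo\nthree\nfour\nfive\n")
sample.flush
print last_lines_streaming(sample.path, 2)
```

This bounds memory usage, but it still reads every byte of the file, which is the cost the next approach avoids.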
Using read and seek methods
Mirroring the C file API, the read method reads a file starting from the current offset. To set that offset, we will use the seek method. Both are part of the IO library.
def read(filename, num_lines = 10)
  File.open(filename) do |fd|
    return '' if fd.size.zero?          # nothing to seek into
    pos = 0
    current_num = 0
    loop do
      pos -= 1
      fd.seek(pos, IO::SEEK_END)
      at_start = fd.tell.zero?          # use tell before read moves the offset
      char = fd.read(1)
      current_num += 1 if eol?(char)
      break if current_num > num_lines  # offset is right after that newline
      if at_start                       # reached the beginning of the file
        fd.rewind
        break
      end
    end
    fd.read
  end
end
def eol?(char)
  char == "\n"
end
This new version of the read method is longer, but far more memory-efficient. The method starts at the end of the file and moves backwards one byte at a time using the seek method, checking each byte for the line terminator character (\n). Once it has found the required number of lines, or has reached the beginning of the file, it returns the content from that offset to the end using the read method.
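To make the offset arithmetic concrete, the following sketch shows how seek, read and tell interact on a tiny file. The seek_demo helper and the 6-byte sample content are assumptions made for this illustration only.

```ruby
require 'tempfile'

# Hypothetical demo: returns a few observations about offsets
# in a 6-byte sample file.
def seek_demo
  sample = Tempfile.new('seek-demo')
  sample.write("ab\ncd\n") # 6 bytes: a, b, \n, c, d, \n
  sample.flush
  File.open(sample.path) do |fd|
    fd.seek(-1, IO::SEEK_END)    # offset 5, the final "\n"
    last_byte = fd.read(1)
    offset_after_read = fd.tell  # read(1) advanced the offset by one byte
    fd.seek(-3, IO::SEEK_END)    # offset 3, the "c"
    rest = fd.read               # everything from "c" to the end
    { last_byte: last_byte, offset_after_read: offset_after_read, rest: rest }
  end
end

p seek_demo
```

The key detail is that read advances the offset, so tell reports the position after the byte that was just read, not the position that seek moved to.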
Reading last number of bytes
The tail command provides another way of reading a file. Instead of specifying the number of lines, it lets you set the number of bytes to read from the end of file (EOF) backwards.
Using the previous code, we just have to make a few modifications. The result could be as follows:
def read(filename, num = 10, bytes = false)
  File.open(filename) do |fd|
    return '' if fd.size.zero?          # nothing to seek into
    pos = 0
    current_num = 0
    loop do
      pos -= 1
      fd.seek(pos, IO::SEEK_END)
      at_start = fd.tell.zero?          # use tell before read moves the offset
      char = fd.read(1)
      current_num += 1 if bytes || eol?(char)
      break if current_num > num        # offset is right after the extra byte
      if at_start                       # reached the beginning of the file
        fd.rewind
        break
      end
    end
    fd.read
  end
end
def eol?(char)
  char == "\n"
end
We add a boolean parameter called bytes and modify the condition that increments the current_num counter: in bytes mode every byte counts, otherwise only line terminators do. With these changes, we can now read either the last n lines or the last n bytes of the given file.
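As a side note, byte mode does not strictly need the loop, because the target offset is known up front. A minimal sketch of that alternative, assuming a hypothetical last_bytes helper that is not part of the original code:

```ruby
require 'tempfile'

# Hypothetical helper: reads the last num bytes with a single seek.
def last_bytes(filename, num = 10)
  File.open(filename) do |fd|
    count = [num, fd.size].min     # never seek before the start of the file
    fd.seek(-count, IO::SEEK_END)
    fd.read
  end
end

# Small demonstration on a temporary file.
sample = Tempfile.new('bytes-demo')
sample.write('hello world')
sample.flush
puts last_bytes(sample.path, 5)
```

The post keeps a single loop for both modes so the two features share one code path; the direct seek is simply cheaper when only bytes are requested.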
Final structure
With the main method already solved, we can proceed to create a class that contains this logic plus some extra features used by the CLI implementation. The class TailLogic reads a specific number of lines or bytes, while keeping the open file (the fd accessor) so it can be reused across calls. It includes several other methods:
- get_fd: Gets the file descriptor and updates the corresponding stats (mtime and size)
- read_all: Reads the file to the end, starting from the current offset.
- file_changed?: Checks if the file has different stats from the ones stored by the class.
class TailLogic
  attr_accessor :filename, :fd
  attr_reader :mtime, :size
  def initialize(filename)
    @filename = filename
  end
  def get_fd
    @fd ||= File.open(filename)
    update_stats
  end
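  def read(num = 10, bytes = false)
    get_fd
    return '' if fd.size.zero?        # nothing to seek into
    pos = 0
    current_num = 0
    loop do
      pos -= 1
      fd.seek(pos, IO::SEEK_END)
      at_start = fd.tell.zero?        # use tell before read moves the offset
      char = fd.read(1)
      current_num += 1 if bytes || eol?(char)
      break if current_num > num
      if at_start                     # reached the beginning of the file
        fd.rewind
        break
      end
    end
    update_stats
    fd.read
  end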
  def read_all
    update_stats
    fd.read
  end
  def file_changed?
    mtime != fd.stat.mtime || size != fd.size
  end
  private
  def eol?(char)
    char == "\n"
  end
  def update_stats
    @mtime = fd.stat.mtime
    @size = fd.size
  end
end
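The change detection idea can be tried in isolation. The sketch below is a hypothetical demo (the follow_demo name and the temporary file are assumptions) that compares only the size, since mtime may not change between two writes that happen very close together.

```ruby
require 'tempfile'

# Hypothetical demo: a reader notices growth by comparing sizes,
# then reads only the newly appended data.
def follow_demo
  writer = Tempfile.new('follow-demo')
  writer.write("first\n")
  writer.flush
  File.open(writer.path) do |reader|
    reader.read                  # consume current content; offset is at EOF
    known_size = reader.size     # remember the size we have already seen
    writer.write("second\n")     # simulate another process appending
    writer.flush
    changed = reader.size != known_size
    [changed, reader.read]       # read resumes from the previous offset
  end
end

changed, new_content = follow_demo
puts "changed: #{changed}"
print new_content
```

Because the reader keeps its offset between calls, the second read returns only the appended data, which is what read_all relies on.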
Now, we are ready to move on to hacking a simple command line interface :).
CLI implementation
Based on the features already presented in the TailLogic class, our new shiny CLI will read the last n lines or the last n bytes of the given file. There is one extra piece of functionality we are going to implement as well: following the file. The tail command offers the --follow option, which keeps track in real time of new information appended to a file.
Using optparse library
In this implementation, we follow a file by checking whether its mtime (modification time) or its size has changed and, if so, writing the new content to stdout.
OptionParser is an advanced command-line option utility. It provides a lot of features to create powerful command-line applications.
Before processing the files with our new shiny command line application, let’s configure the options and how to handle them.
Parsing the command options
Using the OptionParser class, we can structure the CLI help messages around the common and specific options of the tail command, as shown in the configure_parser method. We will use an OpenStruct object, options, to store the selected options once the parser object has processed them. The parse method receives the input args from the command line and extracts the options from them.
require 'optparse'
require 'ostruct'
require_relative 'tail_logic'
class TailCLI
  VERSION = '1.0.0'
  attr_reader :parser, :options
  def initialize
    @options = OpenStruct.new(
      follow: false,
      num_lines: 10,
      bytes: false
    )
    configure_parser
  end
  def parse(args)
    parser.parse!(args)
  end
  private
  def configure_parser
    @parser = OptionParser.new
    parser.banner = 'Usage: ./tail.rb [options] FILENAME'
    parser.separator ''
    parser.separator 'Specific options:'
    # Additional options
    follow_file_option
    lines_number_option
    bytes_option
    parser.separator ''
    parser.separator 'Common Options'
    parser.on_tail('-h', '--help', 'Show this message') do
      puts parser
      exit
    end
    parser.on('--version', 'Show version') do
      puts VERSION # Tail::VERSION is not defined; the constant lives in TailCLI
      exit
    end
  end
  def follow_file_option
    parser.on('-f', '--follow', TrueClass,
                     'outputs appended data as the file grows') do |f|
      options.follow = f
    end
  end
  def lines_number_option
    parser.on('-n NUMBER', '--number NUMBER', Integer, # mandatory, so n is never nil
                     'specifies number of lines to read') do |n|
      options.num_lines = n
    end
  end
  def bytes_option
    parser.on('-b', '--bytes', TrueClass,
                     'reads bytes instead of lines') do |b|
      options.bytes = b
    end
  end
end
cli = TailCLI.new
cli.parse(ARGV)
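To see what parse! does with the arguments, here is a minimal standalone sketch. The parse_demo helper and the plain Hash used for options are assumptions made for illustration; the class above stores the same information in an OpenStruct.

```ruby
require 'optparse'

# Hypothetical helper: parses a small subset of the options
# from an array of arguments.
def parse_demo(args)
  options = { follow: false, num_lines: 10 }
  parser = OptionParser.new
  parser.on('-f', '--follow') { options[:follow] = true }
  parser.on('-n NUMBER', '--number NUMBER', Integer) { |n| options[:num_lines] = n }
  parser.parse!(args) # removes the recognized options from args in place
  [options, args]
end

options, rest = parse_demo(['--follow', '-n', '20', 'path/to/filename'])
puts "follow: #{options[:follow]}, lines: #{options[:num_lines]}, rest: #{rest.join(' ')}"
```

Since parse! strips the options it recognizes, whatever remains in the array is the list of positional arguments, in this case the filename.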
Hey! We are almost done!! Our final step will allow us to execute the tail functionalities.
Executing the tail command
Now, we add the remaining bits to make everything work. First, we make it possible to run the file as an executable by adding a shebang line that locates ruby through env. Then, we add the run method, which calls the read method from the TailLogic class and, if the follow option is present, continuously checks whether the file has changed and outputs the new content.
#! /usr/bin/env ruby
require 'optparse'
require 'ostruct'
require_relative 'tail_logic'
class TailCLI
  VERSION = '1.0.0'
  attr_reader :parser, :options, :file
  def initialize
    @options = OpenStruct.new(
      verbose: false,
      follow: false,
      num_lines: 10,
      bytes: false
    )
    configure_parser
  end
  def parse(args)
    parser.parse!(args)
  end
  def run(args)
    @file = args.pop
    tail = TailLogic.new(file)
    puts tail.read(options[:num_lines], options[:bytes])
    if options[:follow]
      while true
        if tail.file_changed?
          puts tail.read_all
        end
        sleep(1)
      end
    end
  end
  private
  def configure_parser
    @parser = OptionParser.new
    parser.banner = 'Usage: ./tail.rb [options] FILENAME'
    parser.separator ''
    parser.separator 'Specific options:'
    # Additional options
    follow_file_option
    lines_number_option
    bytes_option
    parser.separator ''
    parser.separator 'Common Options'
    parser.on_tail('-h', '--help', 'Show this message') do
      puts parser
      exit
    end
    parser.on('--version', 'Show version') do
      puts VERSION # Tail::VERSION is not defined; the constant lives in TailCLI
      exit
    end
  end
  def follow_file_option
    parser.on('-f', '--follow', TrueClass,
                     'outputs appended data as the file grows') do |f|
      options.follow = f
    end
  end
  def lines_number_option
    parser.on('-n NUMBER', '--number NUMBER', Integer, # mandatory, so n is never nil
                     'specifies number of lines to read') do |n|
      options.num_lines = n
    end
  end
  def bytes_option
    parser.on('-b', '--bytes', TrueClass,
                     'reads bytes instead of lines') do |b|
      options.bytes = b
    end
  end
end
cli = TailCLI.new
cli.parse(ARGV)
cli.run(ARGV)
Assuming the file is called tail_cli.rb, we can run it as:
$ ruby tail_cli.rb --follow -n 20 path/to/filename
Or if we have made it executable, as follows:
$ ./tail_cli.rb -n 50 --bytes path/to/filename
Alternatives
Of course, there are different ways to implement the CLI, but we have provided here something simple that works without relying on any external gem. Several things could be done differently, like using the rb-inotify gem, which uses the inotify Linux kernel subsystem to listen and react to changes in the filesystem, or using the Thor gem to provide the CLI interface.
To wrap it up, thanks for reading till the very end. Hopefully, you have learned something new or refreshed previous knowledge. See you and stay tuned for more!!