Implementing the tail command in Ruby


A while ago, I did a pair programming exercise as part of an interview. The goal was to implement a tail-like command in Ruby. In a nutshell, the tail command reads the last few lines of any text given to it as input and prints them to standard output. For the finer details, see the tail manual page (man tail).

In this post, we are going to implement a limited version of the tail command in Ruby. A complete example of the code is located at https://github.com/renehernandez/bok-tail. Without further delay, let’s jump right into it.

Tail logic implementation

In order to produce a decoupled implementation, we are going to focus first on the logic of reading the last n lines of a file in Ruby. We’ll discuss different ways to do it in the next sections.

Using readlines method

The readlines method, from Ruby's IO library, returns an array with all the lines of text in the given file.

def read(filename, num_lines = 10)
  output = ''

  File.readlines(filename).last(num_lines).each do |line|
    output << line
  end

  output
end

In the code above, after reading all the lines, we use last to select the number of lines we want to process, and at the end we return the output string from the method. This solution is very simple, but it has a significant drawback: in order to read the last lines of the file, we load the entire content of the file into memory. What happens if we want to read the last 20 lines of a 2 GB file?
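
One way to soften the memory problem while still reading the file from the start is to stream it line by line and keep only the most recent lines. The sketch below is a variation for illustration, not the approach this post ends up with; the name tail_streaming is hypothetical. It uses File.foreach, which yields one line at a time, so memory stays bounded by num_lines, although the whole file is still read sequentially.

```ruby
# Hypothetical helper (not part of the original code): streams the file
# and keeps at most num_lines lines in memory at any time.
def tail_streaming(filename, num_lines = 10)
  buffer = []

  File.foreach(filename) do |line|
    buffer << line
    # Drop the oldest line once the buffer grows past the limit
    buffer.shift if buffer.size > num_lines
  end

  buffer.join
end
```

This keeps memory usage small, but a 2 GB file would still be scanned from beginning to end, which is what the seek-based approach avoids.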

Using read and seek methods

Like the C functions for file access, the read method reads from the file's current offset. To set this offset, we will use the seek method. Both are part of the IO library.
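
To make the interplay between seek, read and tell concrete, here is a minimal sketch on a temporary 12-byte file. The helper name seek_demo and the file content are assumptions made for illustration only.

```ruby
require 'tempfile'

# Hypothetical helper: shows how the offset moves around seek and read.
def seek_demo(path)
  File.open(path) do |fd|
    fd.seek(-6, IO::SEEK_END) # six bytes before the end of the file
    before = fd.tell          # absolute offset after the seek
    char = fd.read(1)         # reads one byte and advances the offset
    after = fd.tell
    rest = fd.read            # everything from the offset up to EOF
    [before, char, after, rest]
  end
end

Tempfile.create('seek_demo') do |tmp|
  tmp.write("hello\nworld\n") # 12 bytes
  tmp.flush
  p seek_demo(tmp.path) # => [6, "w", 7, "orld\n"]
end
```

Note that read(1) moves the offset forward by one byte, so after reading a newline the offset already points at the first character of the next line.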

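def read(filename, num_lines = 10)
  File.open(filename) do |fd|
    pos = 0
    current_num = 0

    # Stop once every byte has been visited (beginning of file reached)
    while pos.abs < fd.size
      pos -= 1
      fd.seek(pos, IO::SEEK_END)
      char = fd.read(1)

      current_num += 1 if eol?(char)

      # After read(1), the offset sits just past the newline we found
      break if current_num > num_lines
    end

    # Beginning of file reached first: return the whole file
    fd.rewind if current_num <= num_lines
    fd.read
  end
end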

def eol?(char)
  char == "\n"
end

This new version of the read method is longer, but it no longer loads the whole file into memory. The method starts at the end of the file and moves backwards one byte at a time using the seek method, checking each byte for the line terminator character (\n). Once it has counted the required number of lines or has reached the beginning of the file, it returns the content from that offset onwards using the read method.
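
Seeking and reading one byte at a time means two calls per byte. A possible refinement, sketched here under the assumption that a fixed chunk size is acceptable, is to walk backwards in blocks and count the newlines inside each block. The name tail_chunked and the chunk_size parameter are hypothetical and not part of the original code.

```ruby
# Hypothetical helper: same idea as the backward scan, but reads
# chunk_size bytes per seek instead of a single byte.
def tail_chunked(filename, num_lines = 10, chunk_size = 4096)
  File.open(filename, 'rb') do |fd|
    pos = fd.size
    newlines = 0

    while pos > 0
      step = [chunk_size, pos].min
      pos -= step
      fd.seek(pos, IO::SEEK_SET)
      chunk = fd.read(step)

      # Walk the chunk from its last byte towards its first
      (step - 1).downto(0) do |i|
        next unless chunk.getbyte(i) == 10 # byte value of "\n"

        newlines += 1
        if newlines > num_lines
          # Start right after the newline that precedes the wanted lines
          fd.seek(pos + i + 1, IO::SEEK_SET)
          return fd.read
        end
      end
    end

    # Fewer lines than requested: return the whole file
    fd.rewind
    fd.read
  end
end
```

Like the byte-by-byte version, this sketch assumes the file ends with a newline.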

Reading last number of bytes

The tail command provides another way of reading a file. Instead of specifying the number of lines, it lets you set the number of bytes to read from the end of file (EOF) backwards.

Using the previous code, we just have to make a few modifications. The result could be as follows:

def read(filename, num = 10, bytes = false)
  File.open(filename) do |fd|
    pos = 0
    current_num = 0

    # Stop once every byte has been visited (beginning of file reached)
    while pos.abs < fd.size
      pos -= 1
      fd.seek(pos, IO::SEEK_END)
      char = fd.read(1)

      # In bytes mode every byte counts; otherwise only newlines do
      current_num += 1 if bytes || eol?(char)

      break if current_num > num
    end

    # Beginning of file reached first: return the whole file
    fd.rewind if current_num <= num
    fd.read
  end
end

def eol?(char)
  char == "\n"
end

We add a boolean parameter called bytes and modify the condition that increments the current_num variable. With these changes, we can now read either the last n lines or the last n bytes of the given file.
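
When only bytes are requested, the loop is not strictly needed, because the target offset can be computed directly. The following is a minimal sketch of that shortcut; the name tail_bytes is hypothetical, and the count is clamped to the file size so that seek never moves before the beginning of the file.

```ruby
# Hypothetical helper: jumps straight to the last num_bytes bytes.
def tail_bytes(filename, num_bytes = 10)
  File.open(filename, 'rb') do |fd|
    # Never seek further back than the file is long
    count = [num_bytes, fd.size].min
    fd.seek(-count, IO::SEEK_END)
    fd.read
  end
end
```

Keeping a single loop for both modes, as the post does, trades this shortcut for less code.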

Final structure

With the main method already solved, we can proceed to create a class that will contain this logic and some extra features to be used by the CLI implementation. The TailLogic class allows reading a specific number of lines or bytes, while keeping the file descriptor (the fd accessor) around so that the class can reuse it. It includes several other methods:

  • get_fd: Gets the file descriptor and updates the corresponding stats (mtime and size)
  • read_all: Reads the file to the end, starting from the current offset.
  • file_changed?: Checks if the file has different stats from the ones stored by the class.

class TailLogic

  attr_accessor :filename, :fd
  attr_reader :mtime, :size

  def initialize(filename)
    @filename = filename
  end

  def get_fd
    @fd ||= File.open(filename)
    update_stats
  end

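  def read(num = 10, bytes = false)
    get_fd
    pos = 0
    current_num = 0

    # Stop once every byte has been visited (beginning of file reached)
    while pos.abs < fd.size
      pos -= 1
      fd.seek(pos, IO::SEEK_END)
      char = fd.read(1)

      current_num += 1 if bytes || eol?(char)

      break if current_num > num
    end

    # Beginning of file reached first: return the whole file
    fd.rewind if current_num <= num
    update_stats
    fd.read
  end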

  def read_all
    update_stats
    fd.read
  end

  def file_changed?
    mtime != fd.stat.mtime || size != fd.size
  end

  private
  def eol?(char)
    char == "\n"
  end

  def update_stats
    @mtime = fd.stat.mtime
    @size = fd.size
  end

end
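
The read_all method works because an open file keeps its offset between calls: once a read has reached the end of the file, a later read returns only what has been appended since. Here is a minimal sketch of that behaviour using a temporary file; the helper name offset_demo is hypothetical.

```ruby
require 'tempfile'

# Hypothetical helper: reads a file to the end, appends to it through a
# second handle, and reads again from the same offset.
def offset_demo
  result = nil

  Tempfile.create('offset_demo') do |writer|
    writer.write("first\n")
    writer.flush

    File.open(writer.path) do |fd|
      initial = fd.read  # whole content; offset is now at EOF
      nothing = fd.read  # nothing new yet, so an empty string

      writer.write("second\n")
      writer.flush

      appended = fd.read # only the newly appended data
      result = [initial, nothing, appended]
    end
  end

  result
end

p offset_demo # => ["first\n", "", "second\n"]
```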

Now, we are ready to move on to hacking a simple command line interface :).

CLI implementation

Based on the features already presented in the TailLogic class, our new shiny CLI will read the last n lines or the last n bytes of the given file. There is one extra piece of functionality we are going to implement as well: following the file. The tail command provides the --follow option, which lets us track in real time when new content is added to a file.

Using optparse library

For this implementation, we will support following a file by checking whether its mtime (modification time) or its size has changed and, in that case, writing the new content to stdout.
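
The change check itself can be sketched independently of the class. The example below is a minimal illustration, assuming a snapshot of mtime and size taken with File.stat is enough to detect appended data; the names FileStats, snapshot and changed? are hypothetical.

```ruby
# Hypothetical value object holding the two stats we compare.
FileStats = Struct.new(:mtime, :size)

# Takes a snapshot of the current modification time and size.
def snapshot(path)
  stat = File.stat(path)
  FileStats.new(stat.mtime, stat.size)
end

# True when either the mtime or the size differs from the snapshot.
def changed?(path, previous)
  snapshot(path) != previous
end
```

A polling loop would take a snapshot, sleep for a while, call changed? and take a fresh snapshot whenever it returns true.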

OptionParser is an advanced command-line option utility. It provides a lot of features to create powerful command-line applications.

Before processing the files with our new shiny command line application, let’s configure the options and how to handle them.

Parsing the command options

Using the OptionParser class, we can structure the CLI help messages around the common and specific options of the tail command, as shown in the configure_parser method. We will use an OpenStruct object, options, to store the selected options once they have been parsed by the parser object. The parse method receives the input args from the command line and extracts the options from them.

require 'optparse'
require 'ostruct'
require_relative 'tail_logic'

class TailCLI
  VERSION = '1.0.0'

  attr_reader :parser, :options

  def initialize
    @options = OpenStruct.new(
      follow: false,
      num_lines: 10,
      bytes: false
    )

    configure_parser
  end

  def parse(args)
    parser.parse!(args)
  end

  private
  def configure_parser
    @parser = OptionParser.new

    parser.banner = 'Usage: ./tail.rb [options] FILENAME'

    parser.separator ''
    parser.separator 'Specific options:'

    # Additional options
    follow_file_option
    lines_number_option
    bytes_option

    parser.separator ''
    parser.separator 'Common Options'

    parser.on_tail('-h', '--help', 'Show this message') do
      puts parser
      exit
    end

    parser.on('--version', 'Show version' ) do |opt|
      puts VERSION
      exit
    end
  end

  def follow_file_option
    parser.on('-f', '--follow', TrueClass,
                     'outputs appended data as the file grows') do |f|
      options.follow = f
    end
  end

  def lines_number_option
    parser.on('-n [NUMBER]', '--number [NUMBER]', Integer,
                     'specifies number of lines to read') do |n|
      options.num_lines = n
    end
  end

  def bytes_option
    parser.on('-b', '--bytes', TrueClass,
                     'reads bytes instead of lines') do |b|
      options.bytes = b
    end
  end

end

cli = TailCLI.new
cli.parse(ARGV)
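
It helps to know what parse! does to the argument array: it removes the options it recognizes and leaves the remaining arguments, such as the filename, in place. The sketch below shows this on a reduced parser. The helper name parse_demo is hypothetical, and unlike the class above it declares the number as a mandatory argument and stores the options in a plain Hash.

```ruby
require 'optparse'

# Hypothetical helper: parses a copy-free argument array and returns the
# collected options together with what parse! left behind.
def parse_demo(args)
  options = { follow: false, num_lines: 10 }

  parser = OptionParser.new
  parser.on('-f', '--follow') { options[:follow] = true }
  parser.on('-n NUMBER', '--number NUMBER', Integer) do |n|
    options[:num_lines] = n
  end

  # parse! modifies args in place and returns the leftover arguments
  remaining = parser.parse!(args)
  [options, remaining]
end

p parse_demo(%w[--follow -n 20 path/to/filename])
# => [{:follow=>true, :num_lines=>20}, ["path/to/filename"]]
# (Ruby 3.4 and later print the hash as {follow: true, num_lines: 20})
```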

Hey! We are almost done!! Our final step will allow us to execute the tail functionalities.

Executing the tail command

Now, we add the remaining bits to make everything work. First, we make it possible to run the file as an executable by adding a shebang line that looks up ruby through env. Then, we add the run method, which calls the read method from the TailLogic class and, if the follow option is present, continuously checks whether the file has changed and outputs the new content.

#! /usr/bin/env ruby

require 'optparse'
require 'ostruct'
require_relative 'tail_logic'

class TailCLI
  VERSION = '1.0.0'

  attr_reader :parser, :options, :file

  def initialize
    @options = OpenStruct.new(
      verbose: false,
      follow: false,
      num_lines: 10,
      bytes: false
    )

    configure_parser
  end

  def parse(args)
    parser.parse!(args)
  end

  def run(args)
    @file = args.pop
    tail = TailLogic.new(file)
    puts tail.read(options[:num_lines], options[:bytes])

    if options[:follow]
      while true
        if tail.file_changed?
          puts tail.read_all
        end
        sleep(1)
      end
    end
  end

  private
  def configure_parser
    @parser = OptionParser.new

    parser.banner = 'Usage: ./tail.rb [options] FILENAME'

    parser.separator ''
    parser.separator 'Specific options:'

    # Additional options
    follow_file_option
    lines_number_option
    bytes_option

    parser.separator ''
    parser.separator 'Common Options'

    parser.on_tail('-h', '--help', 'Show this message') do
      puts parser
      exit
    end

    parser.on('--version', 'Show version' ) do |opt|
      puts VERSION
      exit
    end
  end

  def follow_file_option
    parser.on('-f', '--follow', TrueClass,
                     'outputs appended data as the file grows') do |f|
      options.follow = f
    end
  end

  def lines_number_option
    parser.on('-n [NUMBER]', '--number [NUMBER]', Integer,
                     'specifies number of lines to read') do |n|
      options.num_lines = n
    end
  end

  def bytes_option
    parser.on('-b', '--bytes', TrueClass,
                     'reads bytes instead of lines') do |b|
      options.bytes = b
    end
  end

end

cli = TailCLI.new
cli.parse(ARGV)
cli.run(ARGV)

Assuming the file is called tail_cli.rb, we can run it as:

$ ruby tail_cli.rb --follow -n 20 path/to/filename

Or if we have made it executable, as follows:

$ ./tail_cli.rb -n 50 --bytes path/to/filename

Alternatives

Of course, there are different ways to implement the CLI, but we have provided here something simple that works without relying on any external gem. Several things could be done differently, like using the rb-inotify gem, which uses the inotify Linux kernel subsystem to listen and react to changes in the filesystem, or using the Thor gem to provide the CLI interface.

To wrap it up, thanks for reading till the very end. Hopefully, you have learned something new or refreshed previous knowledge. See you and stay tuned for more!!

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.