Using Ragel to implement a TFTP protocol client

This post is about how to implement the TFTP protocol (RFC 1350) with Ragel.

Ragel is a very powerful state machine compiler, well suited to text parsing. One of the tasks Ragel can be used for is "writing robust protocol implementations". More detailed information about Ragel can be found here: http://www.colm.net/open-source/ragel.

Indeed, there are projects which successfully use ragel to parse network protocol packets:

  • Mongrel, the web server developed by Zed Shaw. Mongrel and its derivatives (like Thin) use Ragel to parse the HTTP protocol.
  • OverSIP, a SIP proxy developed by Iñaki Baz Castillo, which uses Ragel to parse the SIP protocol.

Both HTTP and SIP are text-based protocols. There are many articles and blogs that explain how to parse text with Ragel. TFTP is a bit different: its packets are binary, and I thought it would be fun to try to implement this protocol with Ragel.

I decided to use C and the following tools for my little TFTP client project (client only, no server):
  • Autotools to build the client.
  • cmocka and DejaGNU for unit tests and front-end tests.
  • The Apache Portable Runtime (APR) library for most of the functionality. APR is a small and very convenient collection of functions for programming sockets, I/O, memory management, etc.

The whole source code of the project can be found here: https://github.com/staskobzar/tftp-ragel
I will not describe every part of the project in detail. It is a small project, and it is easy to understand how it works from reading the sources.

TFTP is probably the simplest network protocol. It has only five message types:
  • RRQ for requesting a file from the TFTP server
  • WRQ for sending (writing) a file to the TFTP server
  • DATA for transferring file contents in small chunks
  • ACK for acknowledging data packets
  • ERROR for error messages
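A DATA packet, for example, looks like this (layout as given in RFC 1350; the opcode of DATA is 3):

   2 bytes     2 bytes      n bytes
   ----------------------------------
  | Opcode |   Block #  |   Data     |
   ----------------------------------

Other packets are similar, and a detailed description can be found in RFC 1350.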
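To make the DATA layout concrete, here is a minimal sketch in C that assembles such a packet by hand. The helper name build_data_packet and the two constants are my own illustration and are not taken from the tftp-ragel sources. The only facts used are from RFC 1350: a 2-byte opcode (3 for DATA), a 2-byte block number in network byte order, and up to 512 bytes of data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TFTP_OP_DATA  3    /* DATA opcode, per RFC 1350 */
#define TFTP_DATA_MAX 512  /* largest payload of a single DATA packet */

/* Hypothetical helper (not part of tftp-ragel): writes a DATA packet
 * into out and returns its length, or 0 if the payload is too large
 * or the output buffer is too small. */
size_t build_data_packet(unsigned char *out, size_t out_len, uint16_t block,
                         const unsigned char *data, size_t n)
{
  assert(out != NULL);
  if (n > TFTP_DATA_MAX || out_len < 4 + n)
    return 0;

  out[0] = 0x00;                           /* opcode, high byte */
  out[1] = TFTP_OP_DATA;                   /* opcode, low byte */
  out[2] = (unsigned char)(block >> 8);    /* block number, high byte */
  out[3] = (unsigned char)(block & 0xff);  /* block number, low byte */
  if (n > 0)
    memcpy(out + 4, data, n);              /* payload follows the header */
  return 4 + n;
}
```

Calling it with block number 258 and the payload "hi" should yield the six bytes 00 03 01 02 68 69.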

Ragel parses the byte sequence of a packet and uses actions to store the result in C structures. The Ragel machine looks like this:

  MODE_OCTET  = /octet/i;
  MODE_ASCII  = /netascii/i;
  MODE_MAIL   = /mail/i;
  MODE        = MODE_OCTET | MODE_ASCII | MODE_MAIL;
  BLOCK       = extend extend %block;
  ERCODE      = BLOCK;
  ASCII       = 1..127;

  RQ    = ASCII+ 0x0 @filename MODE 0x0 @mode;
  RRQ   = 0x00 0x01 >{pack->opcode = E_RRQ;} RQ;
  WRQ   = 0x00 0x02 >{pack->opcode = E_WRQ;} RQ;
  DATA  = 0x00 0x03 BLOCK extend{1,512}  %pack_data;
  ACK   = 0x00 0x04 BLOCK                %pack_ack;
  ERROR = 0x00 0x05 ERCODE ASCII+ 0x0    %pack_error;

  tftp := (RRQ | WRQ | DATA | ACK | ERROR);


The whole parsing machine for TFTP fits in several lines. Handy, isn't it? Well, there is more code in the actions and other functions, but the parsing part is pretty easy to implement with Ragel.
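To see what kind of input the RRQ rule is meant to accept, here is a small sketch that builds a read request by hand: opcode 00 01, a non-empty filename, a zero byte, the mode string, and a final zero byte. The function build_rrq is a hypothetical helper written for this illustration only. It does not check that the filename is 7-bit ASCII or that the mode is one of the three known modes; the Ragel machine does that on the parsing side.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of tftp-ragel): writes an RRQ packet
 * into out and returns its length, or 0 if the filename is empty or
 * the buffer is too small. */
size_t build_rrq(unsigned char *out, size_t out_len,
                 const char *filename, const char *mode)
{
  assert(out != NULL);
  size_t flen  = strlen(filename);
  size_t mlen  = strlen(mode);
  size_t total = 2 + flen + 1 + mlen + 1;

  if (flen == 0 || total > out_len)
    return 0;

  out[0] = 0x00;                                /* opcode, high byte */
  out[1] = 0x01;                                /* opcode, low byte: RRQ */
  memcpy(out + 2, filename, flen + 1);          /* filename and its 0x0 */
  memcpy(out + 2 + flen + 1, mode, mlen + 1);   /* mode and its 0x0 */
  return total;
}
```

For the filename "a.txt" and the mode "octet" this produces a 14-byte packet, which is the shape described by the RQ rule above.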

"BLOCK" uses the built-in "extend" machine, which matches any single byte: the range -128..127 for signed alphabets, or 0..255 for unsigned ones. A block number consists of 2 bytes, and to get its value the two bytes are converted to an unsigned 16-bit integer by the action "%block":

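  action block {
    unsigned char hi  = *mark & 0xff;        // first byte on the wire is the high byte
    unsigned char low = *(mark + 1) & 0xff;  // second byte is the low byte
    block_num = (hi << 8) | low;             // network byte order (big-endian)
    mark = p;                                // move the mark past the block number
  }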
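The same conversion can be tried outside of Ragel. The sketch below is a standalone helper, read_u16_be, which is a name I made up for this example. It takes a plain char pointer, as the parsing buffer does, and converts each byte to unsigned char before shifting, because char may be signed and a byte such as 0xff would otherwise be sign-extended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of tftp-ragel): decodes two bytes in
 * network byte order (big-endian) into an unsigned 16-bit value. */
uint16_t read_u16_be(const char *p)
{
  assert(p != NULL);
  /* char may be signed, so convert each byte to unsigned first */
  unsigned int hi  = (unsigned char)p[0];
  unsigned int low = (unsigned char)p[1];
  return (uint16_t)((hi << 8) | low);
}
```

For the bytes 01 02 this gives 258, that is 0x0102.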

The whole machine is in the file src/lib/tftp_msg.rl. The parsing function itself is therefore quite small:

tftp_pack* tftp_packet_read (char* packet, apr_size_t len, apr_pool_t *mp)
{
  int cs;
  char *p     = packet;
  char *pe    = p + len;
  char *eof   = pe;
  char *mark  = p + 2;
  uint16_t block_num;
  tftp_pack *pack = (tftp_pack *) apr_palloc (mp, sizeof(tftp_pack));
  pack->data = (union data*) apr_palloc (mp, sizeof(union data));

  %%write init;
  %%write exec;

  if ( cs < tftp_first_final )
    return NULL;

  return pack;
}

Just several lines!

The next step is to implement a finite state machine for the TFTP protocol itself. From the specification, my understanding of the TFTP machine looks like this:


I've chosen to implement it with a state transition table. Each record of the table is described by a C structure:


struct trans_table {
  state   current_state;    /*!< State */
  enum    opcodes event;    /*!< Event */
  state   (*action)(void);  /*!< Pointer to func to execute. */
};

The transition table that implements the machine graph above is the following:

static struct trans_table transition[] = {
  {INIT,  E_RRQ,    tftp_proto_rq         },
  {INIT,  E_WRQ,    tftp_proto_rq         },
  {RECV,  E_ERROR,  tftp_proto_error      },
  {RECV,  E_DATA,   tftp_proto_recv_data  },
  {RECV,  E_ACK,    tftp_proto_send_data  },
  {SEND,  E_ACK,    tftp_proto_ack        },
  {SEND,  E_ERROR,  tftp_proto_error      },
  /* sentinel */
  {END,   0,        NULL                  }
};
  
Implementation of the functions for the table can be found in file "src/lib/tftp_proto.c".
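A table like this is usually driven by a small lookup loop: find the record that matches the current state and the incoming event, call its action, and take the returned value as the next state. Here is a minimal, self-contained sketch of that idea. Everything in it is an assumption made for illustration: the enum values, the placeholder handlers, the demo_table contents and the next_state function are not the project's real code, which lives in src/lib/tftp_proto.c.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definitions, for illustration only */
typedef enum { INIT, RECV, SEND, END } state;
enum opcodes { E_RRQ = 1, E_WRQ, E_DATA, E_ACK, E_ERROR };

struct trans_table {
  state        current_state;   /* state the record applies to */
  enum opcodes event;           /* event that triggers the record */
  state        (*action)(void); /* handler, returns the next state */
};

/* Placeholder handlers: real ones would send and receive packets */
static state on_rq(void)    { return RECV; }
static state on_data(void)  { return RECV; }
static state on_error(void) { return END;  }

static const struct trans_table demo_table[] = {
  {INIT, E_RRQ,   on_rq   },
  {RECV, E_DATA,  on_data },
  {RECV, E_ERROR, on_error},
  /* sentinel */
  {END,  (enum opcodes)0, NULL}
};

/* Walk the table until the sentinel (NULL action) is reached.
 * An event with no matching record ends the session. */
state next_state(const struct trans_table *t, state cur, enum opcodes ev)
{
  assert(t != NULL);
  for (; t->action != NULL; t++) {
    if (t->current_state == cur && t->event == ev)
      return t->action();
  }
  return END;
}
```

Treating an unmatched event as END is a design choice of this sketch; a real client might prefer to send an ERROR packet first.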

That's it for the TFTP client with Ragel. Check GitHub for the whole source code.

Comments

  1. Hey, that's a nice article for someone starting to learn Ragel. I began learning Ragel 2 weeks ago.
    I do not understand why you have designed another FSM based on the transition table. Doesn't Ragel itself create its own state machine based on the description given in 'main := ...'?

  2. Hi there,
    That's true, but I did not find a way to do it from inside the parsing function. The Ragel FSM works on the current packet, and you need some way to let it know the current state of your TFTP session. Also, this is just a learning project, so it was a good opportunity to play with FSMs.
    Let me know if you find a way to do it from inside the ragel FSM.
