Parsing YAML documents with libyaml in C

YAML (a human-readable data serialization language, as Wikipedia describes it) is widely used these days as an alternative to XML, JSON and other formats. Anybody who works with Ruby on Rails uses a YAML configuration file for the database connection.

It is a simple but at the same time flexible standard that should definitely replace outdated formats like INI.

YAML is supported by lots of programming languages. Check the list of projects at yaml.org.

NOTE: Another blog article covers an implementation of YAML parsing with libyaml and lemon.

Available C libs


As far as I found, there are two projects that provide YAML libraries for C: syck and libyaml. The first one, "syck", is outdated and supports only YAML version 1.0. The other one, "libyaml", looks like the current "industry standard" and is used by big projects such as Ruby, Python, Perl and PHP.

Where to get libyaml information


I did not find much information or documentation for libyaml. The official website has only very basic information. I also found a nice tutorial by Andrew Poelstra, written in 2011, and several discussions on Stack Overflow.

The source header file is also very well documented. Doxygen-generated HTML documentation can be found in the library source, in the "doc/html/" directory.


What does libyaml do?


It is a small library that provides a powerful API for a YAML parser and emitter. The emitter produces YAML documents. The parser takes an input stream of bytes and produces a sequence of events or tokens.

To start parsing, two functions should be called:

  
yaml_parser_t parser;
FILE *file = fopen("/etc/config.yaml", "rb");

yaml_parser_initialize(&parser);
yaml_parser_set_input_file(&parser, file);

The function "yaml_parser_set_input_file" can be replaced with "yaml_parser_set_input_string" to parse from a string in memory instead of a file.

After the work is done, "yaml_parser_delete" must be called to free the parser's memory.

There are also three functions, one for each type of parsing:

  • yaml_parser_parse - parse and produce events (yaml_event_t)
  • yaml_parser_scan - parse and produce tokens (yaml_token_t)
  • yaml_parser_load - parse and produce next YAML document (yaml_document_t)

These functions cannot be mixed on the same parser, or they will break it. Each one has a corresponding function to free the allocated memory: yaml_event_delete, yaml_token_delete and yaml_document_delete.

Example usage code for the first two functions can be found in the blog article I mentioned before.

Here is a document parsing example:


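#include <stdio.h>
#include <yaml.h>

int main(int argc, const char *argv[])
{
  FILE *file;
  yaml_parser_t parser;
  yaml_document_t document;
  yaml_node_t *node;
  int i = 1; /* node indexes start at 1 */
  int rc = 1;

  if (argc < 2) {
    fprintf(stderr, "Usage: %s <file.yaml>\n", argv[0]);
    return 1;
  }

  file = fopen(argv[1], "rb");
  if (!file) {
    perror(argv[1]);
    return 1;
  }

  /* Calls with side effects must not live inside assert():
     building with NDEBUG would remove them. */
  if (!yaml_parser_initialize(&parser)) {
    fprintf(stderr, "Failed to initialize the YAML parser\n");
    fclose(file);
    return 1;
  }

  yaml_parser_set_input_file(&parser, file);

  if (!yaml_parser_load(&parser, &document)) {
    fprintf(stderr, "Parse error: %s\n",
            parser.problem ? parser.problem : "unknown");
    goto done;
  }

  /* yaml_document_get_node returns NULL once the index is out of range */
  while ((node = yaml_document_get_node(&document, i)) != NULL) {
    printf("Node [%d]: %d\n", i++, node->type);
    if (node->type == YAML_SCALAR_NODE) {
      printf("Scalar [%d]: %s\n", node->data.scalar.style,
             (const char *)node->data.scalar.value);
    }
  }
  yaml_document_delete(&document);
  rc = 0;

done:
  yaml_parser_delete(&parser);
  if (fclose(file) != 0) rc = 1;

  return rc;
}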


Since libyaml can produce tokens, it can probably be used with LALR parser generators such as Yacc, Bison or lemon.
