Bluish Coder

Programming Languages, Martials Arts and Computers. The Weblog of Chris Double.


2010-06-02

Safer C Code Using ATS

When developing the original Ogg backend for Firefox we had to integrate and use a number of C libraries. Over the months after landing a number of bugs were raised for various issues in the backend. A fair number of these were the result of common coding errors. Things like not checking return values of the C API calls and not releasing resources correctly. This was internal to the third party libraries we used as well as in our own code. This has made me interested in looking at programming languges that make these sorts of errors easier to find.

I've previously written about ATS, a programming language with dependent and linear types. With ATS you can use C code directly and annotate the code with types to make some usage errors detectable at compile time. Some papers that cover this usage of ATS are:

In this post I look at ways of using a C library from ATS that shows dealing with static checking of return values and resource releasing. I start from a simple wrapper and then explore how I can use ATS to make usage of the API safer from common programmer errors. The library I'm using to do this is libcurl.

A simple usage of libcurl to retrieve the HTML from a URL and print it to stdout is (simple.c):

#include <curl/curl.h>

int main(void)
{
  CURL *curl;
  CURLcode res;

  curl = curl_easy_init();
  curl_easy_setopt(curl, CURLOPT_URL, "bluishcoder.co.nz");
  res = curl_easy_perform(curl);
  curl_easy_cleanup(curl);
  return 0;
}

This can be compiled and run with:

$ gcc -o simple simple.c `curl-config --libs`
$ ./simple
...

Ensuring release of resources

My first attempt at translating this into ATS is in simple1.dats (pretty-printed version):

%{^
#include <curl/curl.h>
%}

absviewtype CURLptr (l:addr) // CURL*

abst@ype CURLoption = $extype "CURLoption"
macdef CURLOPT_URL = $extval(CURLoption, "CURLOPT_URL")

extern fun curl_easy_init
  ()
  : [l:addr] CURLptr l = "#curl_easy_init"
extern fun curl_easy_setopt {l:addr} {p:type}
  (handle: !CURLptr l, option: CURLoption, parameter: p)
  : int = "#curl_easy_setopt"
extern fun curl_easy_perform {l:addr}
  (handle: !CURLptr l)
  : int = "#curl_easy_perform"
extern fun curl_easy_cleanup {l:addr}
  (handle: CURLptr l)
  : void = "#curl_easy_cleanup"

implement main() = let
  val curl  = curl_easy_init();
  val res = curl_easy_setopt(curl, CURLOPT_URL, "www.bluishcoder.co.nz");
  val res = curl_easy_perform(curl);
  val () = curl_easy_cleanup(curl);
in
  ()
end;

The 'main' function looks very much like the C code. The definitions before that make the libcurl C functions, types and enums available to ATS. To compile and run this code:

$ atscc -o simple1 simple1.dats `curl-config --libs`
$ ./simple1
..

The libcurl type CURLoption is a C enum. To wrap this in ATS I use $extype to refer to the C name directly:

abst@ype CURLoption = $extype "CURLoption"

I only reference one value of these types. That is CURLOPT_URL and again I reference the C value directly. Since this is a value rather than a type I use $extval:

macdef CURLOPT_URL = $extval(CURLoption, "CURLOPT_URL")

The CURL type is an abstract type in C - we don't care what it is actually composed of. A CURL object is always refered to via a pointer. This is modeled in ATS with:

absviewtype CURLptr (l:addr) // CURL*

Think of 'CURLptr' as being 'a reference to a CURL object'. It's basically a C 'CURL*'. The definition above states that a CURLptr is an object of type 'CURLptr' located at a memory address 'l'.

The C definition of curl_easy_init looks like:

CURL *curl_easy_init(void);

The ATS definition is:

extern fun curl_easy_init () : [l:addr] CURLptr l = "#curl_easy_init" 

The 'extern' at the beginning and the '#curl_easy_init' at the end means that the implementation of this function is a C function called 'curl_easy_init' in an external library. The '[l:addr]' following the function name gives the type of the 'l' used in the return type - that being a pointer. The return type is 'CURLptr l' which I described previously. The function returns a CURLptr object located at pointer address 'l'.

The C definition of curl_easy_setopt looks like:

CURLcode curl_easy_setopt(CURL *curl, CURLoption option, ...);

It is a C function that takes a variable number of arguments. For this example I'm only using one additional parameter so I cheat and declare it in ATS as accepting one. I'll cover how to do variable argument functions later. The ATS definition is:

extern fun curl_easy_setopt {l:addr} {p:type}
  (handle: !CURLptr l, option: CURLoption, parameter: p)
  : int = "#curl_easy_setopt"

The function takes three arguments and returns one. The arguments it takes are:

  • a '!CURLptr l'. As described before this is a CURLptr object located at address 'l'. The '!' means curl_easy_setopt does not consume the object - we can continue to use it later. The 'l' is declared to be of type 'addr' earlier in the definition using {l:addr}.
  • a 'CURLoption'
  • a parameter value of type 'p'. This is of type 'type' (from the {p:type} earlier in the definition). This is the type of all ATS objects that fit in a machine word (pointers, integers, etc).

The remaining two CURL functions used in the example are variants of these. The main difference is in the ATS definition of 'curl_easy_cleanup':

extern fun curl_easy_cleanup {l:addr}
  (handle: CURLptr l)
  : void = "#curl_easy_cleanup"

Here the 'handle' parameter is a 'CURLptr l'. There is no '!' before it. This means the function consumes the argument and it cannot be used after this call. The type 'CURLptr l' is a linear type. It must be destroyed at some point in the program and once destroyed it cannot be re-used.

By defining 'curl_easy_cleanup' in this way we enforce a contract which states that the programmer must call this function if they have a live 'CURLptr l' object. You can see this in practice if you remove the cleanup call and try to compile. A compile error will result. A compile error will also occur if you try to use the curl handle after the cleanup call.

Because of the usage of the linear type we're already safe from one class of common programmer error. That of not calling cleanup functions for allocated objects.

Handling NULL pointers

There is a bug lurking in this version (and the C program). 'curl_easy_init' can return NULL and if it does you should not call any other curl functions. Using ATS we can define things in such a way that it is a requirement to check for NULL to successfully compile.

A modified version of the example with the changes to support this is in simple2.dats (pretty-printed version).

In this version we expand on the type CURLptr by adding two typedefs. One for pointers which can be NULL (CURLptr0) and one for pointers which cannot be NULL (CURLptr1):

absviewtype CURLptr (l:addr) // CURL*
viewtypedef CURLptr0 = [l:addr | l >= null] CURLptr l
viewtypedef CURLptr1 = [l:addr | l >  null] CURLptr l

The definitions of the Curl functions that are used are changed to use these types in the appropriate places:

extern fun curl_easy_init
  ()
  : CURLptr0 = "#curl_easy_init"
extern fun curl_easy_setopt {p:type}
  (handle: !CURLptr1, option: CURLoption, parameter: p)
  : int = "#curl_easy_setopt"
extern fun curl_easy_perform
  ( handle: !CURLptr1)
  : int = "#curl_easy_perform"
extern fun curl_easy_cleanup
  (handle: CURLptr1)
  : void = "#curl_easy_cleanup"

'curl_easy_init' is changed to return a CURLptr0 since it can return NULL. The other functions use CURLptr1 to signify they must not be NULL. Now if we keep the 'main' function as before without checking the result of 'curl_easy_init' we get a compile time error saying the type of the 'curl' variable does not match that required by 'curl_easy_opt'.

The fix is to add a check for NULL:

implement main() = let
  val curl = curl_easy_init();
  val () = assert_errmsg(CURLptr_isnot_null curl, "curl_easy_init failed");
  val res = curl_easy_setopt(curl, CURLOPT_URL, "www.bluishcoder.co.nz");
  val res = curl_easy_perform(curl);
  val ()  = curl_easy_cleanup(curl);
in
  ()
end;

The compiler knows that after the assert check that 'curl' must not be NULL. This allows the remaining parts of the program to typecheck.

Enforce checking of return values

Another common error is that of not checking the return values of C api calls for errors. This proved to be an issue in the first versions of the Firefox Ogg video backend where return values were not checked in some of the third party libraries we were using. Types can be defined in ATS to give compile time errors if return values are not checked for errors.

'curl_easy_setopt' and 'curl_easy_perform' both return a value indicating success or failure. A non-zero value indicates an error.

See simple3.dats (pretty-printed version) for the changes required to enforce checking of the values. The definition of 'curl_easy_setopt' has changed to:

extern fun curl_easy_setopt {p:type}
  (handle: !CURLptr1 >> opt(CURLptr1, err == 0),
   option: CURLoption, parameter: p)
  : #[err:int] int err = "#curl_easy_setopt"

The same change was made to 'curl_easy_perform'. The parameter 'handle' has changed to have a type of '!CURLptr1 >> opt(CURLptr1, err == 0)'. The type definition before the '>>' is what the function accepts as input. After the '>>' is the type of the parameter after the function returns.

This says that after the function returns the type of 'handle' is a 'CURLptr1' if 'err' is zero. 'err' is defined later in the definition as an integer. This forces calling code to check the return value to see if it is zero. That code can then continue to use the CURLptr1 type by extracting it from the 'opt'. This is the adjusted 'main' function showing how this works:

implement main() = let
  val curl = curl_easy_init();
  val () = assert_errmsg(CURLptr_isnot_null curl, "curl_easy_init failed");
  val res = curl_easy_setopt(curl, CURLOPT_URL, "www.bluishcoder.co.nz");
  val () = assert_errmsg(res = 0, "curl_easy_setopt failed");
  prval () = opt_unsome(curl);
  val res = curl_easy_perform(curl);
  val () = assert_errmsg(res = 0, "curl_easy_perform failed");
  prval () = opt_unsome(curl);
  val ()  = curl_easy_cleanup(curl);
in
 ()
end;

Notice the assertion check for the return value of the functions. This is followed by an 'opt_unsome' call to 'un-opt' the type and continue using it as a 'CURLptr1'. If either the assert check, or the 'opt_unsome' is commented out the code won't compile. If the assert check is done for a value other than zero it won't compile. The code could also check the result using an 'if' statement - I use assert here for brevity.

Tags


This site is accessable over tor as hidden service 6vp5u25g4izec5c37wv52skvecikld6kysvsivnl6sdg6q7wy25lixad.onion, or Freenet using key:
USK@1ORdIvjL2H1bZblJcP8hu2LjjKtVB-rVzp8mLty~5N4,8hL85otZBbq0geDsSKkBK4sKESL2SrNVecFZz9NxGVQ,AQACAAE/bluishcoder/-61/


Tags

Archives
Links