An Introduction to Clang

By Mark Wilson | Monday, April 8, 2013

In this blog post I will be writing about some of my experiences with clang. What is clang? It is a front end to the LLVM compiler and is designed to compile C, C++, Objective-C, and Objective-C++ to machine code. The LLVM Project "is a collection of modular and reusable compiler and toolchain technologies," meaning that you could use LLVM to create a compiler for just about any language you'd like, including your own invented language, were you so inclined. The LLVM core libraries include things such as code generation for a number of CPUs and optimization technology. As my purpose is to write about clang, all we really need to know is that clang is a compiler that can be used in place of gcc to create executable programs from source.

Apple is the primary developer of clang. On Mac OS X and iOS, clang is the official compiler that Apple supplies in its SDK. Clang is also the default compiler on FreeBSD. Clang is considered a production-quality compiler for C++98 and implements much of the new C++11 standard, so don't be afraid to use it. I have compiled Qt 5 with it (using "configure -platform linux-clang") with no issues. Clang and gcc are compatible, so even if Qt was built with gcc you can compile your Qt-based application with clang. Invoking qmake like this will set the compiler:

% qmake QMAKE_CC=clang QMAKE_CXX=clang++

(QMAKE_CC is not needed unless you have C files in your project.) Clang is available as a package for most Linux distributions. Under Ubuntu the compiler package is "clang" and the clang library is "libclang1", with the development package "libclang-dev". Alternatively, you can build it from source code.

The gcc compiler family is well established, free, and works just fine. Why would you want to switch to clang? Clang generally compiles faster and uses less memory, and it is based on a modular design (as opposed to the monolithic design of gcc). Clang offers more readable error and warning diagnostics and explicitly highlights the related source. Clang even offers "fix-it hints" for many common errors.

For example, here is the output for a very simple error:

file.cc:6:11: error: expected ';' after expression
    i += 8
          ^
          ;

Contrast this with the output from gcc for the same error:

file.cc:7:1: error: expected ';' before '}' token

The clang output is much more expressive and helps you fix the problem faster.
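The source of file.cc is not shown here, so the following is only a hypothetical reconstruction: a small function (addEight is an invented name) laid out so that the statement sits on line 6 and the closing brace on line 7. With the semicolon deleted, clang points at line 6, column 11, where the semicolon belongs, while gcc points at the next token it sees, the '}' on line 7. This is the corrected version:

```cpp
// file.cc: hypothetical reconstruction; addEight is an invented name.
// Deleting the ';' on line 6 should reproduce diagnostics like those above.
void addEight(int& i)
{
    // The statement below is line 6 of this file.
    i += 8;
}
```

Calling addEight on an int variable holding 1 leaves it holding 9.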

Let's look at a less trivial example. A common (and valid) complaint about C++ compilers is the output for template errors. Clang's more expressive diagnostics make it much easier to work with template code. Here is a simple program that uses std::map:

#include <string>
#include <map>

int
main(int argc, char** argv)
{
    std::map<std::string, int> aMap;

    aMap[1] = "clang";
}

Notice that I "accidentally" used an integer for the key and a string for the contents. Here is what gcc spews onto the screen to help you out with this error:

try.cc: In function 'int main(int, char**)':
try.cc:9:11: error: invalid user-defined conversion from 'int' to 'const key_type& {aka const std::basic_string<char>&}' [-fpermissive]
In file included from /usr/include/c++/4.7/string:55:0,
                 from try.cc:1:
/usr/include/c++/4.7/bits/basic_string.tcc:214:5: note: candidate is: std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, const _Alloc&) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]
/usr/include/c++/4.7/bits/basic_string.tcc:214:5: note: no known conversion for argument 1 from 'int' to 'const char*'
try.cc:9:11: error: invalid conversion from 'int' to 'const char*' [-fpermissive]
In file included from /usr/include/c++/4.7/string:55:0,
                 from try.cc:1:
/usr/include/c++/4.7/bits/basic_string.tcc:214:5: error: initializing argument 1 of 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, const _Alloc&) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' [-fpermissive]
try.cc:9:15: error: invalid conversion from 'const char*' to 'std::map<std::basic_string<char>, int>::mapped_type {aka int}' [-fpermissive]

Imagine this is your first time using map; you'd probably never want to use it again, or any templates for that matter! Here is the output from clang:


 
try.cc:9:9: error: no viable overloaded operator[] for type 'std::map<std::string, int>'
    aMap[1] = "clang";
    ~~~~^~
/usr/lib/gcc/i686-linux-gnu/4.7/../../../../include/c++/4.7/bits/stl_map.h:450:7: note: candidate function not viable: no known conversion from 'int' to 'const key_type' (aka 'const std::basic_string<char>') for 1st argument
      operator[](const key_type& __k)
      ^
1 error generated.

This is obviously much easier to read and fix.
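For completeness, the fix is to swap the two operands so that the string is the key and the integer is the mapped value. Here is a minimal sketch of the corrected statement; the wrapper function makeMap is my own invention for illustration and is not part of the original program:

```cpp
#include <map>
#include <string>

// makeMap is a hypothetical helper wrapping the corrected statement.
std::map<std::string, int> makeMap()
{
    std::map<std::string, int> aMap;

    // The key is a std::string and the mapped value is an int,
    // which matches the declaration of aMap.
    aMap["clang"] = 1;
    return aMap;
}
```

With the types lined up, both gcc and clang should accept this without complaint.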

Clang is a great compiler, but what I think makes it really interesting is its modular design. Clang was designed from the ground up to be an API. The clang API provides methods to allow full introspection into a C++ program. Clang tracks every aspect of your source, from comments to macro expansions, and provides information on the location of the various tokens that make up your program. This means that you can create a variety of tools: syntax-aware editors, syntax checkers, code generators, etc.

An example of a tool written with the clang API is clang-check, which comes with clang. Clang-check is great for dumping out information about your source, finding errors, and even fixing those errors.

Let's write two simple classes:

// Class.h
#ifndef _CLASS_H_
#define _CLASS_H_
class Base
{
public:
    Base();
    int dosomething(int i);
};
class Derived : public Base
{
public:
    Derived();
    void dosomethingelse(double d);
};
#endif

 

// Class.cc
#include <stdio.h>
#include "Class.h"

Base::
Base()
{
}
int Base::
dosomething(int i)
{
}
Derived::
Derived() :
    Bse()
{
}
void Derived::
dosomethingelse(double d)
{
    printf("The value is %i\n", d);
}

I've introduced two simple errors in Class.cc. If you can't find them, clang-check will, and it can even fix some of the errors in place for you! Here is the command line for running clang-check against Class.cc:

/usr/local/bin/clang-check Class.cc -fixit -- clang++

Everything after the -- tells clang-check how the program would normally be compiled; you would include any necessary -I (for include paths) options, defines, etc. This information can also be provided by a compilation database, which can be produced by CMake. Here is the output of clang-check:

/home/mwilson/forClangBlog/Class.cc:13:1: warning: control reaches end of
      non-void function [-Wreturn-type]
}
^
/home/mwilson/forClangBlog/Class.cc:18:5: error: initializer 'Bse' does not name
      a non-static data member or base class; did you mean the base class
      'Base'?
    Bse()
    ^~~
    Base
/home/mwilson/forClangBlog/Class.cc:18:5: note: FIX-IT applied suggested code
      changes
/home/mwilson/forClangBlog/Class.h:14:17: note: base class 'Base' specified here
class Derived : public Base 
                ^~~~~~~~~~~
/home/mwilson/forClangBlog/Class.cc:25:33: warning: format specifies type 'int'
      but the argument has type 'double' [-Wformat]
    printf("The value is %i\n", d);
                         ~~     ^
                         %f
/home/mwilson/forClangBlog/Class.cc:25:33: note: FIX-IT applied suggested code
      changes

If you go back into Class.cc, you will see that the suggested fixes were applied.
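As a rough sketch of the result, Class.cc should now look something like the code below. The declarations from Class.h are repeated so the example stands on its own. One assumption to flag loudly: the -Wreturn-type warning has no fix-it, so clang-check leaves dosomething without a return statement; the "return i;" here is my own placeholder, not something the tool wrote.

```cpp
#include <stdio.h>

// Declarations from Class.h, repeated here so the sketch is self-contained.
class Base
{
public:
    Base();
    int dosomething(int i);
};
class Derived : public Base
{
public:
    Derived();
    void dosomethingelse(double d);
};

Base::
Base()
{
}
int Base::
dosomething(int i)
{
    // Assumption: clang-check only warns here; returning i is a placeholder.
    return i;
}
Derived::
Derived() :
    Base()   // fix-it applied: 'Bse' replaced with 'Base'
{
}
void Derived::
dosomethingelse(double d)
{
    printf("The value is %f\n", d);   // fix-it applied: '%i' replaced with '%f'
}
```

Running clang-check over this version should no longer report the three diagnostics shown above.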

In a future posting I will show details on how to use the clang API to make your own tools.

