Introduction

Boost.AFIO is a C++ library which lets you schedule an ordered dependency graph of file and filesystem input/output operations to be executed asynchronously to the maximum capacity of your hardware. If you want to do portable asynchronous filesystem and file i/o in C++, especially if you need to easily control the order in which reads and writes are issued, this is the library to look at.

As a quick checklist, if you have ever experienced any of these problems, then AFIO may be useful to you:

  1. Your spinning magnetic rust hard drive goes bananas when some routine in your code tries to do something to storage, and latency per op starts heading into the seconds range.
  2. Your super fast SSD which is supposed to be delivering hundreds of thousands of ops/sec is barely managing a tenth of its supposed ability with your code. After reading about the importance of high queue depth to maximising performance from SSDs, you try opening many handles to the same file and firing an army of thread pool workers at the problem to try and increase queue depth, but your performance actually drops below the single threaded case.
  3. Your code has to interact with a regularly changing filesystem and not get weird race errors e.g. you try to create a new file in path /X/Y/Z, but some other program has just renamed directory /X/Y to /A/B in the time between you deciding on /X/Y/Z and getting round to it.
  4. Your code keeps file handles open a long time in a place where others might delete or rename them, including any part of the directory hierarchy preceding the file.
  5. Deleting directory trees randomly fails on Microsoft Windows for no obvious reason.
  6. Your code needs to read and write files concurrently to other code without resorting to shared memory region tricks e.g. if the files reside on a Samba or NFS network shared drive.
  7. Your CPU needs to be doing more useful work instead of copying memory to and from disc i/o buffers. As great as the STL iostream buffering is, unless disabled it doubles the LL cache pressure on your CPU, evicting other more useful data. The STL iostreams design almost certainly won't allow the kernel to use VM tricks to directly busmaster DMA from its buffers to the hard drive, so the kernel will have to copy those buffers a third time. That means that for every 1 KB you read or write you are evicting, as a minimum, 3 KB from the LL caches in your CPU, all of which must be refilled with more useful data later.
  8. Your code wants to experience various filing system features identically across platforms which also work on shared Samba and NFS network drives, such as:
    • Deleting and renaming open files.
    • Files having unique inode values.
    • POSIX timestamping of last accessed, last modified, last status changed and created.
    • File extent management and traversal.
    • Explicitly documented filing system race guarantees.
    • Interrogation of filing system characteristics, devices and mount points.
    • Ten million item directories, or more. We have tested twenty five million item directories on NTFS and ext4 and performance was actually tolerable with under a second pause. Ten million item directories is plenty fast, and one million item directories you won't notice over a ten item directory. Note that your GUI file explorer will very likely hang on ten million item directories, indeed so do most command line tools.
    • Exclusive lock files (manually operated support already there, async support coming in v1.5).
    • File change monitoring (coming in v1.5).
    • File byte range advisory locking (coming in v1.5).
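Item 7 above mentions that iostream buffering can be disabled. As a minimal sketch of what that might look like with only the standard library (no AFIO involved): `write_unbuffered` and `read_all` are hypothetical helper names invented for this illustration, and the effect of `pubsetbuf(nullptr, 0)` is implementation-defined, so treat this as a request to the standard library rather than a guarantee.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not part of AFIO): writes `data` to `path` after asking
// the stream to drop its internal buffer. Returns true if the write succeeded.
bool write_unbuffered(const std::string &path, const std::string &data)
{
  assert(!path.empty());
  std::ofstream out;
  // Request an unbuffered stream. The effect is implementation-defined, and
  // some implementations only honour the request before the file is opened,
  // so it is made here, ahead of open().
  out.rdbuf()->pubsetbuf(nullptr, 0);
  out.open(path, std::ios::binary | std::ios::trunc);
  if(!out)
    return false;
  out.write(data.data(), static_cast<std::streamsize>(data.size()));
  return out.good();
}

// Hypothetical helper: reads the whole file back so the write can be checked.
std::string read_all(const std::string &path)
{
  std::ifstream in(path, std::ios::binary);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

Calling `write_unbuffered("example.bin", "hello")` and then `read_all("example.bin")` should round-trip the data. Whether a copy through the stream buffer is actually avoided depends on your standard library, so it is worth benchmarking rather than assuming.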
Note

Lest there be any disappointment in expectations, using AFIO alone will not magically improve your filesystem performance. If anything, there is a performance penalty in naive use of AFIO as a direct replacement for naive synchronous file i/o. What AFIO gives you is a large amount of control with easy to twiddle knobs for benchmarking optimal filesystem strategies under various use cases. In other words, using AFIO lets you more easily write and test code that never performs really badly in corner cases.

Boost.AFIO is a Boost.APIBind based Boost library, and therefore is capable of any combination of the following build configurations:

You may note that as a result AFIO can be used as a completely standalone header-only library totally independent from any dependencies on Boost which can be dropped into any existing build system as a simple single header include. This, incidentally, also extends to its unit test suite which can use either Boost.Test OR https://github.com/philsquared/Catch (actually my own thread safe fork of CATCH).

Boost.AFIO provides a pure portable POSIX file i/o backend. Specialised file i/o backends making use of host OS asynchronous file i/o facilities are provided for:

Boost.AFIO is regularly compiled and per-commit unit tested on these platforms:

Boost.AFIO extends Boost.ASIO and is therefore dependent on ASIO (Boost or standalone). With a good modern compiler you can expect 50-90% of the throughput of using raw Boost.ASIO at a latency of about 60,000 +/- 600 CPU cycles to get notified of the completion of an operation. This library was brought to Boost as part of Google Summer of Code 2013.
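To put the completion latency quoted above into wall clock terms, a small conversion helps. The sketch below is plain arithmetic, not an AFIO facility: `cycles_to_microseconds` is a hypothetical helper, and the clock rate you pass in is an assumption about your own hardware.

```cpp
#include <cassert>

// Converts a CPU cycle count to microseconds for a given clock rate in GHz.
// The clock rate is an assumption supplied by the caller, not a figure
// taken from AFIO.
double cycles_to_microseconds(double cycles, double clock_ghz)
{
  assert(clock_ghz > 0.0);
  // A 1 GHz clock executes 1000 cycles per microsecond.
  return cycles / (clock_ghz * 1000.0);
}
```

For instance, on a hypothetical 3 GHz CPU, `cycles_to_microseconds(60000.0, 3.0)` gives 20 microseconds, and the +/- 600 cycle spread corresponds to about 0.2 microseconds.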

Boost.AFIO is a C++11 only library, and it requires, as an absolute minimum, a compiler with:

Some popular compilers known to be minimally sufficient thanks to our Jenkins CI bot include:

The Jenkins CI bot runs a full suite of static analysis tools (currently clang and MSVC static analysers and clang-tidy, the ABI stability compliance checker is planned), runtime analysis tools (currently the clang undefined behaviour and thread sanitisers plus valgrind memcheck) plus a full set of unit tests for all supported compilers on all supported platforms for every single commit to master branch and every single pull request. Additionally, the Travis CI bot runs a full set of code coverage for the unit tests which is pushed to coveralls.io. You can view the build and unit test CI dashboard for all compilers and platforms here.

Important

Note that Boost.AFIO has not passed Boost peer review, and therefore is not a part of the Boost C++ libraries.

As a very quick example of minimal usage:

using namespace BOOST_AFIO_V2_NAMESPACE;
using BOOST_AFIO_V2_NAMESPACE::rmdir;
std::shared_ptr<boost::afio::dispatcher> dispatcher =
  boost::afio::make_dispatcher().get();
current_dispatcher_guard h(dispatcher);

// Free function
try
{
  // Schedule creating a directory called testdir
  auto mkdir(async_dir("testdir", boost::afio::file_flags::create));
  // Schedule creating a file called testfile in testdir, but only once testdir
  // has been created
  auto mkfile(async_file(mkdir, "testfile", boost::afio::file_flags::create));
  // Schedule creating a symbolic link called linktodir to the item referred to
  // by the precondition, i.e. testdir. Note that on Windows you can only
  // symbolically link directories.
  auto mklink(
    async_symlink(mkdir, "linktodir", mkdir, boost::afio::file_flags::create));

  // Schedule deleting the symbolic link only after it has been created
  auto rmlink(async_rmsymlink(mklink));
  // Schedule deleting the file only after it has been created
  auto rmfile(async_close(async_rmfile(mkfile)));
  // Schedule waiting until both the preceding operations have finished
  auto barrier(dispatcher->barrier({rmlink, rmfile}));
  // Schedule deleting the directory only after the barrier completes
  auto rmdir(async_rmdir(depends(barrier.front(), mkdir)));
  // Check ops for errors
  boost::afio::when_all_p(mkdir, mkfile, mklink, rmlink, rmfile, rmdir).wait();
}
catch(...)
{
  std::cerr << boost::current_exception_diagnostic_information(true) << std::endl;
  throw;
}

// Batch
try
{
  // Schedule creating a directory called testdir
  auto mkdir(
  dispatcher->dir(std::vector<boost::afio::path_req>(
                  1, boost::afio::path_req(
                     "testdir", boost::afio::file_flags::create))).front());
  // Schedule creating a file called testfile in testdir, but only once testdir
  // has been created
  auto mkfile(dispatcher->file(std::vector<boost::afio::path_req>(
                               1, boost::afio::path_req::relative(
                                  mkdir, "testfile",
                                  boost::afio::file_flags::create))).front());
  // Schedule creating a symbolic link called linktodir to the item referred to
  // by the precondition, i.e. testdir. Note that on Windows you can only
  // symbolically link directories. Note also that symlinks must *always* be
  // created with an absolute path, as that is how they are stored.
  auto mklink(dispatcher->symlink(std::vector<boost::afio::path_req>(
                                  1, boost::afio::path_req::absolute(
                                     mkdir, "testdir/linktodir",
                                     boost::afio::file_flags::create))).front());

  // Schedule deleting the symbolic link only after it has been created
  auto rmlink(dispatcher->close(std::vector<future<>>(
                                1, dispatcher->rmsymlink(mklink))).front());
  // Schedule deleting the file only after it has been created
  auto rmfile(
  dispatcher->close(std::vector<future<>>(1, dispatcher->rmfile(mkfile))).front());
  // Schedule waiting until both the preceding operations have finished
  auto barrier(dispatcher->barrier({rmlink, rmfile}));
  // Schedule deleting the directory only after the barrier completes
  auto rmdir(
  dispatcher->rmdir(std::vector<path_req>(
                    1, dispatcher->depends(barrier.front(), mkdir))).front());
  // Check ops for errors
  boost::afio::when_all_p(mkdir, mkfile, mklink, rmlink, rmfile, rmdir).wait();
}
catch(...)
{
  std::cerr << boost::current_exception_diagnostic_information(true) << std::endl;
  throw;
}
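
For comparison, the ordering that the listing above expresses through AFIO preconditions can be approximated with only the C++17 standard library. The following is a rough sketch under stated assumptions, not how AFIO is implemented: `run_ordered_ops` is a hypothetical helper, each step blocks a thread while waiting on a `std::shared_future` rather than being scheduled by a dispatcher, and the symbolic link steps are left out to keep it short.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <future>

namespace fs = std::filesystem;

// Hypothetical helper, not part of AFIO: creates testdir and testfile under
// `root`, then removes them again, with each step waiting on the step it
// depends on. Returns true if every step succeeded and testdir is gone.
bool run_ordered_ops(const fs::path &root)
{
  assert(!root.empty());
  const fs::path dir = root / "testdir";
  const fs::path file = dir / "testfile";

  // Step 1: create the directory.
  std::shared_future<bool> mkdir =
    std::async(std::launch::async, [dir] {
      return fs::create_directory(dir);
    }).share();

  // Step 2: create the file only once the directory has been created.
  std::shared_future<bool> mkfile =
    std::async(std::launch::async, [mkdir, file] {
      if(!mkdir.get())
        return false;
      std::ofstream out(file);
      return out.good();
    }).share();

  // Step 3: remove the file only after it has been created.
  std::shared_future<bool> rmfile =
    std::async(std::launch::async, [mkfile, file] {
      return mkfile.get() && fs::remove(file);
    }).share();

  // Step 4: remove the directory only after the file has been removed.
  std::shared_future<bool> rmdir =
    std::async(std::launch::async, [rmfile, dir] {
      return rmfile.get() && fs::remove(dir);
    }).share();

  // Waiting on the last step implies that every earlier step has finished.
  return rmdir.get() && !fs::exists(dir);
}
```

Calling `run_ordered_ops` with an existing, writable directory that does not already contain `testdir` should return true. The cost of this approach is one blocked thread per pending step, which is the kind of overhead a dependency scheduling library is meant to avoid.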

