Boost.AFIO came about out of the need for a scalable, high performance, portable asynchronous file i/o and filesystem implementation library for TripleGit, a forthcoming ACID compliant transactional persistence layer for graph stores built on top of the filing system: call it a “SQLite3 but for graphstores”[1]. The fact that a portable asynchronous file i/o and filesystem library for C++ was needed at all came as a bit of a surprise: one thinks of these things as done and dusted decades ago, but it turns out that the fully featured libuv, a C library, is good enough for most people needing portable asynchronous file i/o. However, as great as libuv is, it isn't very C++-ish, and hooking it in with Boost.ASIO (parts of which are expected to enter the ISO C++ language standard) isn't particularly clean. I therefore resolved to write a native Boost asynchronous file i/o and filesystem implementation, and to keep it as simple as possible.
AFIO started life as a C++ 0x library written for an early Visual Studio 2013 Community Preview back in 2012 as an outside-of-work side project when I was working at BlackBerry. It was ported to Boost during Google Summer of Code 2013 with the help of student Paul Kirth, and VS2012 and VS2010 support was added. For v1.0, AFIO used a simple dispatch engine which kept the extant ops in a hash table, and the entire dispatch engine was protected by a single giant recursive mutex. Performance never exceeded about 150k ops/sec on a four core Intel Ivy Bridge CPU.
That performance was embarrassing, so for v1.1 the entire engine was rewritten using atomic shared pointers to be completely lock free, and very nearly wait free if it weren't for the thin spin locks around the central ops hash table. Now performance can reach 1.5m ops/sec on a four core Intel Ivy Bridge CPU, or more than half of Boost.ASIO's maximum dispatch rate.
For the v1.2 engine, another large refactor was done, this time to substantially simplify the engine by removing the use of std::packaged_task<> completely, replacing it with a new intrusive-capable enqueued_task<> which permits the engine to early out in many cases, plus allowing the consolidation of all spinlocked points down to just two: one in dispatch, and one other in completion, which is now optimal. Performance of the v1.2 engine rose by about 20% over the v1.1 engine, plus AFIO is now fully clean on all race detecting tools.
For the v1.3 engine, yet another large refactor was done, though not for performance but rather to make it much easier to maintain AFIO in the future, especially after acceptance into Boost, whereupon one cannot arbitrarily break API anymore and one must maintain backwards compatibility. To this end the dependencies between AFIO and Boost were completely abstracted into a substitutable symbol aliasing layer such that any combination of Boost and C++ 11 STL threading/chrono, filesystem and networking can be selected externally using macros. Indeed, any of the eight build combinations can coexist in the same translation unit too, and I have unit test runs which prove it! With the v1.3 engine AFIO optionally no longer needs Boost at all, not even for its unit testing. However, the cost was dropping support for all Visual Studios before 2013 and all GCCs before 4.7, as they don't have the template aliasing support needed to implement the STL abstraction layer. A very large amount of legacy cruft code, e.g. support for non-variadic templates, was cleaned out for the v1.3 release.
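The idea of a substitutable symbol aliasing layer can be pictured with a minimal sketch. This is not AFIO's actual layer: the macro `DEMO_USE_BOOST_THREAD`, the namespace `demo::stl` and the two helper functions are hypothetical names chosen for illustration. Library code names only the aliases, so the choice between Boost and the C++ 11 STL is made externally by a macro. The template alias for `lock_guard` is the language feature that older compilers lacked.

```cpp
// Hypothetical switch: define DEMO_USE_BOOST_THREAD externally to select Boost.
#ifdef DEMO_USE_BOOST_THREAD
#include <boost/chrono.hpp>
#include <boost/thread.hpp>
#else
#include <chrono>
#include <mutex>
#endif

namespace demo {
namespace stl {
#ifdef DEMO_USE_BOOST_THREAD
namespace chrono = boost::chrono;
using boost::mutex;
template <class M> using lock_guard = boost::lock_guard<M>;  // template alias
#else
namespace chrono = std::chrono;
using std::mutex;
template <class M> using lock_guard = std::lock_guard<M>;    // template alias
#endif
}  // namespace stl

// Library code refers only to demo::stl::*, never to std:: or boost:: directly.
inline int locked_increment(stl::mutex &m, int &counter) {
  stl::lock_guard<stl::mutex> guard(m);
  return ++counter;
}

// Works identically with either chrono implementation selected above.
inline long long elapsed_ms(stl::chrono::steady_clock::time_point from,
                            stl::chrono::steady_clock::time_point to) {
  return stl::chrono::duration_cast<stl::chrono::milliseconds>(to - from).count();
}
}  // namespace demo
```

Letting several configurations coexist in one translation unit would additionally need each configuration to live in its own namespace, which this sketch does not attempt.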
During ACCU and C++ Now 2015 I spoke with a number of ISO WG21 committee members about the structural design problems in iostreams and the Filesystem TS (lack of filesystem race freedom, lack of context dependent filesystem), and what design the committee would prefer to see to fix those problems. As it happens, we were all close to the same page, so from the v1.4 engine onwards I resolved to refactor the AFIO API thusly:
The use of this future factory toolkit makes the AFIO continuations infrastructure redundant, and it will therefore be removed shortly. The monadic programming library also simplifies quite a bit of internal AFIO implementation code: thanks to the monads, one can use noexcept design throughout and therefore skip dealing directly with exception safety, as the monads take away the potential of control flow being reversed by an exception throw.
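To make the noexcept point concrete, here is a minimal sketch of a value-or-error carrier. It is not the API of the monadic programming library mentioned above: `demo::result`, `then`, `parse_digit` and `checked_double` are all hypothetical names. Errors travel as a `std::error_code` inside the return value, so every function can be `noexcept` and control flow only ever moves forwards.

```cpp
#include <system_error>
#include <utility>

namespace demo {
// Hypothetical carrier holding either a value or an error code.
template <class T> class result {
  T value_{};
  std::error_code ec_{};

public:
  result(T v) noexcept : value_(std::move(v)) {}     // success
  result(std::error_code ec) noexcept : ec_(ec) {}   // failure
  bool has_value() const noexcept { return !ec_; }
  const T &value() const noexcept { return value_; }
  std::error_code error() const noexcept { return ec_; }

  // Run f on the value if there is one, otherwise pass the error through.
  // f is assumed not to throw; if it did, noexcept would terminate.
  template <class F>
  auto then(F &&f) const noexcept -> decltype(f(std::declval<const T &>())) {
    using next = decltype(f(std::declval<const T &>()));
    if (!has_value()) return next(ec_);
    return f(value_);
  }
};

inline result<int> parse_digit(char c) noexcept {
  if (c < '0' || c > '9')
    return std::make_error_code(std::errc::invalid_argument);
  return c - '0';
}

inline result<int> checked_double(int v) noexcept {
  if (v > 4) return std::make_error_code(std::errc::result_out_of_range);
  return v * 2;
}
}  // namespace demo
```

A chain such as `demo::parse_digit('3').then(demo::checked_double)` either yields a value or carries the first error encountered, with no try/catch anywhere in the calling code.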
Note: This version of AFIO being presented for Boost review does not yet make use of lightweight future-promises, and instead mocks up the eventual API using the existing highly mature and very well tested engine. The API presented is expected to be final, except for the very few items specified as deprecated (see below for a list). This has been done in order to test that the engine rewrite based on lightweight future-promise exactly matches the behaviour of the current engine using an identical unit test suite. I should emphasise that I expect any programs written to match the presented API to continue to work after the engine rewrite; after all, the internal unit test suite will do so.
    afio::handle_ptr &h; // Some open file handle

    // Create a sibling file to h->path() race free
    afio::handle_ptr newfileh=afio::file(h, "../newfile", afio::file_flags::create);

    // Asynchronously create a sibling file to h->path() race free
    // afio::future has type void because afio::future *always* carries a shared pointer to a handle
    // (it only gets a type when the operation returns more than a file handle)
    afio::future<void> newfilefh2=afio::async_file(h, "../newfile", afio::file_flags::create);

    // Wait for the asynchronous file creation to complete, rethrowing any error or exception
    afio::handle_ptr newfileh2=newfilefh2.get_handle();
I'll admit this design isn't quite what the members of WG21 had in mind, especially the notion that afio::future<void> with type void always carries a shared pointer to a handle and that implicit type slicing from afio::future<T> to afio::future<void> is not just allowed but absolutely essential. However apart from that, the above API is probably quite close to what members of WG21 were thinking.
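The slicing relationship can be modelled with a small mock. This is only an illustration under stated assumptions: the types in namespace `demo` are hypothetical stand-ins, and implementing the typed future by public inheritance is an assumption of this sketch, not a description of how AFIO does it. The point shown is that a function taking `future<void>` accepts any `future<T>`, keeping the handle and dropping the typed result.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

namespace demo {
// Hypothetical stand-ins for a file handle and its shared pointer.
struct handle { std::string path; };
using handle_ptr = std::shared_ptr<handle>;

template <class T> class future;

// The void form carries only the shared pointer to the handle.
template <> class future<void> {
  handle_ptr h_;

public:
  explicit future(handle_ptr h) : h_(std::move(h)) {}
  handle_ptr get_handle() const { return h_; }
};

// The typed form adds a result on top of the handle. Deriving publicly from
// future<void> is what makes the implicit slice possible in this sketch.
template <class T> class future : public future<void> {
  T value_;

public:
  future(handle_ptr h, T v) : future<void>(std::move(h)), value_(std::move(v)) {}
  const T &get() const { return value_; }
};

// Any future can act as a precondition: only the handle is needed here.
inline std::string path_of(future<void> precondition) {
  return precondition.get_handle()->path;
}
}  // namespace demo
```

Passing a `demo::future<std::size_t>` to `demo::path_of()` compiles without a cast, which is the property the paragraph above calls essential.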
filesystem abstract base class.
filesystem instance to be used by the global Filesystem TS functions on that thread.
AFIO v1.4 and probably v1.5 won't implement this, as this is really a thing for Boost.Filesystem to do. However, AFIO's make_dispatcher() already takes a URI and there is a RAII facility for setting the current thread local dispatcher. AFIO is therefore ready for a Filesystem TS implementation matching the above design to be written on top of it, in that a dispatcher instance has a suite of virtual member functions which define what some filesystem is or does.
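The combination of virtual member functions and a thread local current dispatcher can be sketched as follows. None of these names are AFIO's: `demo::dispatcher`, `memory_dispatcher`, `scoped_dispatcher` and the free function `exists()` are hypothetical, and the in-memory backend exists only to keep the example self-contained.

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>

namespace demo {
// Hypothetical abstract base: virtual functions define what a filesystem does.
struct dispatcher {
  virtual ~dispatcher() = default;
  virtual bool exists(const std::string &path) const = 0;
};

// Toy backend which only knows a fixed set of paths.
class memory_dispatcher : public dispatcher {
  std::set<std::string> paths_;

public:
  explicit memory_dispatcher(std::set<std::string> paths) : paths_(std::move(paths)) {}
  bool exists(const std::string &path) const override { return paths_.count(path) != 0; }
};

// One slot per thread holding that thread's current dispatcher.
inline std::shared_ptr<dispatcher> &current_slot() {
  thread_local std::shared_ptr<dispatcher> slot;
  return slot;
}
inline std::shared_ptr<dispatcher> current_dispatcher() { return current_slot(); }

// RAII guard: installs a dispatcher for this thread, restores the old one on exit.
class scoped_dispatcher {
  std::shared_ptr<dispatcher> previous_;

public:
  explicit scoped_dispatcher(std::shared_ptr<dispatcher> d)
      : previous_(std::move(current_slot())) {
    current_slot() = std::move(d);
  }
  ~scoped_dispatcher() { current_slot() = std::move(previous_); }
  scoped_dispatcher(const scoped_dispatcher &) = delete;
  scoped_dispatcher &operator=(const scoped_dispatcher &) = delete;
};

// A global function in the style of the Filesystem TS, routed through
// whatever dispatcher is current on the calling thread.
inline bool exists(const std::string &path) {
  std::shared_ptr<dispatcher> d = current_dispatcher();
  return d && d->exists(path);
}
}  // namespace demo
```

Guards nest, so an inner scope can temporarily redirect the global functions to a different backend and the outer one is restored when the inner guard is destroyed.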
AFIO v1.4 only provides POSIX and NT kernel filesystem backends currently, however it is expected that v1.5 will add a new temporary filesystem backend which lets programs portably work inside tmpfs in whichever form that takes across Linux, FreeBSD, Apple OS X and Microsoft Windows. Additional backends implementing say a ZIP archive filesystem are similarly easy to add on.
As mentioned above, note that due to this refactoring some parts of the v1.4 release of AFIO are deprecated and are expected to be removed shortly. You can find a list of these shortly-to-be-removed APIs and parts here. The list is not long, and the removals are obvious.
[1] The UnQLite embedded NoSQL database engine is exactly one of those, of course. However, I intend TripleGit for implementing portable Component Objects for C++ extending C++ Modules, which means I need a database engine suitable for incorporation into a dynamic linker, and unfortunately UnQLite is not quite that.