This peer review edition of AFIO v1.4 has been “mocked up” with an API that
should very closely resemble the eventual API of the v1.4 engine, which will
be rewritten to use the newly written lightweight future-promise factory
toolkit in the forthcoming Boost.Outcome.
It is, however, still the mature v1.3 engine underneath, with a faked wrapper
API simulating the v1.4 engine. Known deviations from the eventual v1.4 release:
-
AFIO's future is a shim type standing in for the eventual custom Boost.Monad
based future. Continuations work, but are horribly hacked together: if your
continuation returns a Boost.Monad future, it shims in a one way conversion
to lightweight futures. If your continuation returns anything else, including
an AFIO future, it stays within AFIO < v1.4's fairly broken future semantics
(AFIO was first written back in 2012 when the Concurrency TS looked very
different indeed, and AFIO has not kept up).
Warning: AFIO's future shim type can't do continuations on anything but
future<void>. For any future<T> it actually executes the continuation
immediately, which will usually just happen to work through fortune as the
continuation usually does a get().
-
Relatedly, because AFIO < v1.4 implemented future continuations atop STL and
Boost futures using an internal hash table to look up the associated extra
operation metadata per future, these APIs will also be vanishing:
-
async_io_dispatcher_base::completion() - replaced with future.then().
-
async_io_dispatcher_base::call() - replaced with future.then().
-
async_io_dispatcher_base::barrier() - replaced with when_all().
-
Lightweight future-promise also enables C++ 1z coroutines, so the code
examples in the documentation can be rewritten to use C++ 1z coroutines
where appropriate. This should particularly aid the find regex in files
tutorial example, which is currently very messy.
-
The v1.4 engine will be rewritten yet again to use a new custom future
implementation whereby async_io_op shall become afio::future<T>.
This should let the API no longer return two sets of futures when returning
results (async_io_op therefore matches afio::future<void>), plus
make best use of the proposed concurrent_unordered_map by finally actually
processing batches of operations as a batch, instead of one at a time.
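By way of illustration only (async_file() and async_write() below stand in
for whichever free functions finally ship, and the exact signatures are
assumptions rather than quotes from the reference), the idea is that a single
afio::future<> carries the scheduled operation, its error state and the
handle it produces, so each operation simply consumes the future returned by
its precondition:

    #include "boost/afio/afio.hpp"
    namespace afio = boost::afio;

    int main()
    {
        // make_dispatcher() takes a URI and returns a monad (see later note).
        afio::dispatcher_ptr d = afio::make_dispatcher("file:///").get();
        // Each hypothetical free function takes the preceding afio::future<>
        // as its precondition and returns a new afio::future<> which is both
        // the scheduled operation and the handle it will produce.
        afio::future<> opened = afio::async_file(d, "example.txt",
            afio::file_flags::create | afio::file_flags::read_write);
        char buffer[] = "hello";
        afio::future<> written = afio::async_write(opened, buffer, sizeof(buffer) - 1, 0);
        written.get();                                // rethrows any error in the chain
        afio::handle_ptr h = written.get_handle();    // the same future also yields the handle
    }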
-
The v1.4 engine's afio::future<T> ought to transparently support Boost.Fiber;
this should let you program against AFIO using awaitable resumable functions,
which is much cleaner. The v1.3 engine already reduced latency by 50% over
the v1.2 engine, but with Fibers we finally ought to eliminate the O(waiters)
scaling for op completions in favour of O(1) to waiters.
-
AFIO v1.5 will abstract out all the OS specific code into a C (not C++)
abstraction layer. This makes it far easier to reuse the same code for
a later AFIO based Filesystem TS implementation. It also eliminates the
use of C++ exceptions at the OS specific layer. Note that I may still use
lambdas in the C abstraction layer: if compiled as C these appear as C
callbacks, but if compiled as C++ they are some callable type.
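A hedged sketch of the idea (afio_c_completion and afio_c_invoke are
hypothetical names, not the actual v1.5 layer): the abstraction layer
declares plain function pointer callbacks, which C sees as ordinary C
callbacks, while C++ callers may still pass captureless lambdas because
those convert implicitly to function pointers:

    #include <cstdio>

    extern "C" {
    // Plain C callback type exposed by the abstraction layer.
    typedef void (*afio_c_completion)(int errcode, void *user_data);

    // Stand-in for an abstraction layer entry point; a real one would issue I/O.
    void afio_c_invoke(afio_c_completion cb, void *user_data) { cb(0, user_data); }
    }

    int main()
    {
        // A captureless C++ lambda decays to the C callback type.
        afio_c_invoke([](int errcode, void *) { std::printf("done, error=%d\n", errcode); },
                      nullptr);
    }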
-
It is hoped that the v1.5 engine will have two additional bindings, one
for Microsoft COM thanks to John Bandela's CppComponents (https://github.com/jbandela/cppcomponents)
and another with plain C bindings (which actually wrap the COM object).
Neither binding requires anything from Microsoft, and both work as expected
on POSIX.
-
The v1.5 engine will finally make use of the API support for alternative
async_io_dispatcher implementations, adding at least one for temp file
support with special temp file semantics, and maybe others with transparent
hashing and bit flip healing (see below). The temp file support will allow
anonymous and named temp files which can be matched to the device volume of
some path, thus allowing file moves to be done without copying. Anonymous
temp files can use Linux specific facilities where available.
-
Finally, making good use of the new coroutine support, the v1.5 engine should
have an async fast batch hash engine which provides transparent hashing
of all async reads and writes with optional SECDED ECC calculation. Hashes
will probably be initially limited to Blake2b (crypto strong and very fast,
even on ARM) and SpookyHash (not crypto strong, but superbly fast, especially
for small blocks), though I may dust off my 4-SHA SSE2/NEON SHA256 engine for
Intel and ARM for the giggle. In addition to fast batch hashing, the coroutine
support should make filing system tree algorithms enormously easier to write,
so expect an optimally fast, race free directory tree visitor, deleter,
mover and copier which is the find in files tutorial made generic.
-
When ignoring the close of a cached directory handle, kick out its weak_ptr
from the central directory cache if its reference count is 1.
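A minimal sketch of the intent, with hypothetical names (the real cache is
internal to AFIO): when the ignored close finds that the only remaining owner
is the closer itself, the cached weak_ptr is the last trace of the handle and
can simply be dropped:

    #include <memory>
    #include <string>
    #include <unordered_map>

    struct dir_handle { /* cached directory handle */ };
    std::unordered_map<std::string, std::weak_ptr<dir_handle>> dir_cache;

    void on_ignored_close(const std::string &path, const std::shared_ptr<dir_handle> &h)
    {
        // use_count() == 1 means only the closer still owns the handle, so
        // evict the cache's weak_ptr and let the handle die with its owner.
        if (h.use_count() == 1)
            dir_cache.erase(path);
    }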
-
Individual file change monitoring. This would be very useful for implementing
distributed mutual exclusion algorithms to avoid spinning on file updates.
-
Related to the preceding item, formal async lock file support with deadline
timeouts.
-
Portable fast file byte range advisory locking which works across network
shares, but can still utilise shared memory when possible.
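For reference, the traditional portable building block here is the POSIX
fcntl() byte range advisory lock, which does work across NFS shares; a sketch
of that baseline (not AFIO's eventual API) follows:

    #include <fcntl.h>
    #include <unistd.h>

    // Block until an exclusive advisory lock is held on [offset, offset + length).
    bool lock_byte_range(int fd, off_t offset, off_t length)
    {
        struct flock fl {};
        fl.l_type = F_WRLCK;     // exclusive (write) lock
        fl.l_whence = SEEK_SET;
        fl.l_start = offset;
        fl.l_len = length;
        return fcntl(fd, F_SETLKW, &fl) == 0;
    }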
-
Related to locking files, the ability to copy files (unwritten destination
locked until complete) using a variety of algorithms (including cheap copies
on btrfs [cp --reflink=auto]) with progress notification via the file change
monitoring. Extend rename file and delete file with lock file options.
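As a sketch of the btrfs "cheap copy" path only (this is what cp
--reflink=auto does underneath; the helper name is hypothetical and Linux
specific): clone the source extents into the destination with the clone
ioctl, falling back to a normal copy when the filesystem refuses:

    #include <fcntl.h>
    #include <linux/fs.h>     // FICLONE, the generic spelling of BTRFS_IOC_CLONE
    #include <sys/ioctl.h>
    #include <unistd.h>

    // Returns true if dst now shares src's extents; callers fall back to a
    // byte-by-byte copy when this returns false.
    bool reflink_copy(const char *src, const char *dst)
    {
        int in = open(src, O_RDONLY);
        int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        bool ok = in >= 0 && out >= 0 && ioctl(out, FICLONE, in) == 0;
        if (in >= 0) close(in);
        if (out >= 0) close(out);
        return ok;
    }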
-
Also related to locking files, DeleteOnClose should use advisory locks
to have the last handle close do the file delete on POSIX.
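A sketch of how that could look with flock() (hypothetical helper, ignoring
the race with a concurrent fresh open): every opener holds a shared lock, and
whoever can upgrade to an exclusive non-blocking lock at close time is the
last holder and performs the unlink:

    #include <sys/file.h>
    #include <unistd.h>

    void close_delete_on_close(int fd, const char *path)
    {
        // Succeeds only when no other handle still holds its shared flock();
        // our own shared lock on this fd is simply converted, not a conflict.
        if (flock(fd, LOCK_EX | LOCK_NB) == 0)
            unlink(path);
        close(fd);   // closing releases the lock either way
    }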
-
Right now, specifying an empty path with a precondition to mean "same as
precondition" is inefficient and racy: it opens the containing directory and
uses the containing directory as the base with its leafname. What it should
do is duplicate the handle and use fcntl() etc. to reopen the handle with the
desired access and flags.
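A rough sketch of that approach on Linux (not AFIO code; the /proc reopen is
Linux specific): duplicate the descriptor when only status flags change, and
reopen via /proc/self/fd when the access mode itself must change, since
fcntl() cannot alter O_RDONLY/O_RDWR after the fact:

    #include <cstdio>
    #include <fcntl.h>
    #include <unistd.h>

    int reopen_with_flags(int fd, int new_flags)
    {
        if (((fcntl(fd, F_GETFL) ^ new_flags) & O_ACCMODE) == 0)
        {
            int dup_fd = fcntl(fd, F_DUPFD_CLOEXEC, 0);  // same access mode: just duplicate
            fcntl(dup_fd, F_SETFL, new_flags);           // adjust status flags (e.g. O_APPEND)
            return dup_fd;
        }
        char proc_path[64];
        std::snprintf(proc_path, sizeof(proc_path), "/proc/self/fd/%d", fd);
        return open(proc_path, new_flags);               // race free reopen of the same file
    }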
-
Extended attributes support. TripleGit could use this to avoid a second
file handle open of metadata per graph object read. Unsure if NTFS is any
faster opening EA though, need to test.
-
AFIO's stable DLL ABI is intentionally C compatible, so all one needs is
a libclang tool for generating C wrappers for the DLL ABI which massage
C arrays into pseudo std::vector const lvalue refs. Such C bindings would
be entirely batch based and have none of the friendliness of programming
AFIO in C++, but they ought to work quite well.
-
Kernel side file <=> socket data copying via sendfile and splice
on Linux. Integrating splice as async with ASIO is unfortunately painful
:(
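For reference, the synchronous primitive on the file to socket side is simple
enough; an async wrapper around something like this is what is wanted (a
sketch, not AFIO code):

    #include <sys/types.h>
    #include <sys/sendfile.h>

    // Copy up to count bytes from file_fd to socket_fd entirely in the kernel.
    // Returns bytes copied or -1; the kernel advances offset for us.
    ssize_t file_to_socket(int socket_fd, int file_fd, off_t &offset, size_t count)
    {
        return sendfile(socket_fd, file_fd, &offset, count);
    }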
-
async_io_dispatcher_base::read_partial() to read as much of a single buffer
as possible, rather than only complete buffers.
-
Fast, scalable portable directory contents change monitoring. It should
be able to monitor a 1M entry directory experiencing 1% entry changes per
second without using a shocking amount of RAM.
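For context, the Linux primitive such an implementation would most likely sit
on is inotify; a minimal sketch (not AFIO code, and the scalability work is
precisely what inotify alone does not give you):

    #include <sys/inotify.h>

    // Returns an fd from which inotify_event records describing entry
    // creations, deletions and renames in path can be read.
    int watch_directory(const char *path)
    {
        int fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
        if (fd >= 0)
            inotify_add_watch(fd, path, IN_CREATE | IN_DELETE | IN_MOVED_FROM | IN_MOVED_TO);
        return fd;
    }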
-
ACL support
-
Asynchronous file handle closing in ~async_io_handle() (currently if not
explicitly closed, the async_io_handle destructor must synchronously close)
-
Unit tests were not trapping exception throws from normalise_path(path_normalise::guid_all),
link() or file(file_flags::create_compressed). This caused test failure
on FAT32 and ReFS volumes on Windows as those operations are not supported.
-
Rename file_buffer_allocator to page_allocator. Fixes issue #95 from review.
-
Replace all use of std::cerr with BOOST_AFIO_LOG_FATAL_EXIT(). Fixes issue
#104 from review.
-
All synchronous free functions not returning anything now return a void.
Fixes issue #107 from review.
-
ASIO_STANDALONE is now detected using #ifdef not #if. Fixes issue #113
from review.
-
Removed VS2013 support as lightweight future-promise will require VS2015:
-
Replaced all BOOST_NOEXCEPT and BOOST_NOEXCEPT_OR_NOTHROW with noexcept
-
Replaced all BOOST_CONSTEXPR with constexpr
-
Replaced camel cased items in file_flags with Boost compliant forms.
-
Replaced all async_io_op with afio::future<void>. async_io_op::get()
is now called future<>::get_handle(), plus the former structure members
are now member function accessors.
-
Replaced all functions returning pairs of futures with afio::future<T>
instead.
-
Docs now contain a commenting facility on every reference page, plus on some
of the essay pages.
-
Online docs now include a "Try AFIO now in online compiler" button on
every page.
-
Implemented multi-version compatibility, and backported that patch to the
v1.3 branch. The v1.3 branch is now included in the v1.4 branch as v1.
-
Implemented free functions for future continuation for all AFIO operations.
-
async_io_handle is now handle. Instead of explicit std::shared_ptr<async_io_handle>
we now have handle_ptr.
-
async_file_io_dispatcher_base is now dispatcher. Instead of explicit std::shared_ptr<async_file_io_dispatcher_base>
we now have dispatcher_ptr. make_async_io_dispatcher() is now make_dispatcher().
-
make_dispatcher() now takes a URI and returns a monad.
-
Fixed issue #84 Symlinks were never being followed on Windows.
-
Fixed issue #83 Fix failure to handle Dedup reparse point types.
-
async_data_op_req is now io_req, similarly for make_async_data_op_req.
-
async_enumerate_op_req is now enumerate_req.
-
async_path_op_req is now path_req.
-
Added optional additional target parameter to symlink().
-
Added reparse_point flag to stat_t.
-
Verified as working with Boost 1.59 release.
-
handle::try_mapfile() is no more; we now have handle::map_file(), which can
now also map writeable files, plus map offsets.
Still to do in this release:
-
ADD URI REGEX REGISTRATION SYSTEM (need priorities? Need to do some research.
Make sure it's DLL unload safe).
-
Add unit test for multi-version use within the same translation unit.
-
Added Appveyor CI support which complements the Travis CI support.
-
Verified as working with Boost 1.58 release.
AFIO is now a Boost.BindLib based library. This has resulted in an enormous
change set which is only barely summarised here:
Fixed buffer underflow when decoding Win32 error codes to strings. Thanks to
ariccio for reporting this.
Relocated docs from ci.nedprod.com to http://boostgsoc13.github.io/boost.afio/
Updated the stale CI test dashboard copy in the DocBook edition.
Finished getting a ThreadSanitizer (tsan) + UndefinedBehaviorSanitizer (ubsan)
pass running per-commit on Travis CI (>= v1.2 was tsan clean, I just hadn't
bothered getting a CI to verify it to be so per commit).
Fixed bug where --lto wasn't turning on the optimiser for LTO output. Sorry.
Added a benchmark testing for latency under concurrency loads.
Added a new FAQ entry on AFIO execution latencies.
Reorganised source code structure to fit modular Boost. AFIO is now a Boost
v1.56 module just like any other. Obviously this will break source code compatibility
with all preceding Boosts.
Fixed a bug in the custom unit testing framework which was throwing away any
exceptions being thrown by the tests (thanks to Paul Kirth for finding this
and reporting it). Fixing this bug revealed that enumerate() with a glob on
Windows has never worked properly and the exception thrown by MSVC's checked
iterators was hiding the problem, so fixed that bug too.
Added async_io_dispatcher_base::post_op_filter() and async_io_dispatcher_base::post_readwrite_filter(),
including documentation examples and integrating filters into the unit testing.
post_readwrite_filter() ought to be particularly useful to those seeking deep
ASIO integration. Thanks to Bjorn Reese for the long discussions leading up
to this choice of improved ASIO support.
While updating the benchmarks below, now that I have regained access to my
developer workstation, I discovered a severe performance regression in the
v1.2 engine of around 27% over the v1.1 engine. Steps taken:
1. The shared state in every async_io_op was a shared_ptr, now it is the underlying
shared_future. Eliminated copies of shared_ptr, now we always use the shared_future
in enqueued_task directly. This reduced regression to 18%.
2. Removed more code from inside the TSX locks. This reduced regression to
16%.
3. Removed the second TSX lock from complete_async_op(). This eliminated the
regression and actually gained 2% over the v1.1 engine.
4. Removed the second TSX lock from chain_async_op(). This added a further
10% over the v1.1 engine, so we are now 12% faster, which is about right
given that the v1.2 engine removed 15% of the code.
Added nested TSX transaction support.
The CI shows that clang 3.1 now produces segfaulting binaries with this release,
so rather than debug clang, I simply dropped clang 3.1 support. AFIO now requires
clang 3.2 or better.
This is a major refactor of AFIO's core op dispatch engine to trim it down
by about 15%. Key breaking differences from the v1.1 series of AFIO are as
follows:
-
Replaced all use of packaged_task with enqueued_task, a custom implementation
which makes possible many performance improvements throughout the engine.
-
thread_source::enqueue() now can take a preprepared enqueued_task.
-
thread_source::enqueue() now always returns a shared_future instead of
a future. This has had knock on effects throughout AFIO, so many futures
are now shared_future.
-
Completion handler spec has changed from:
pair<bool, shared_ptr<async_io_handle>> (*)(size_t id, shared_ptr<async_io_handle> h, exception_ptr *e)
to:
pair<bool, shared_ptr<async_io_handle>> (*)(size_t id, async_io_op preceding)
This substantially improves performance, simplifies the implementation,
and lets completion handlers more readily retrieve the error state of
preceding operations and react appropriately.
-
All restrictions on immediate completions have been removed. You can now
do anything in an immediate completion that you can do in a normal completion.
-
async_io_op::h now always refers to a correct future i.e. the future is
no longer lazily allocated.
-
Now that op futures are always correct, when_all(ops) has been drastically
simplified to an implementation which literally assembles the futures into
a list and passes them to boost::wait_for_all().
-
Added when_any(ops).
Added --fast-build to the test Jamfile to preserve my sanity when attempting
to work with AFIO on an Intel Atom 220 netbook.
Fixed failure to auto-const an async_data_op_req<boost::asio::mutable_buffer>
when used for writing. Thanks to Bjorn Reese for reporting this.
Replaced use of std::runtime_error with std::invalid_argument where that makes
sense. Thanks to Bjorn Reese for reporting this.
Replaced throwing of std::ios_base::failure with std::system_error. Thanks
to Bjorn Reese for suggesting and submitting a patch for this.
async_io_dispatcher_base::enumerate() did not take a metadata_flags, and it
was supposed to. Thanks to Bjorn Reese for reporting this.
Added a unit compilation test to ensure that implicit construction from a single
arg to the op convenience classes works as intended.
Significantly optimised build system and added in precompiled headers support.
Combined with --fast-build this provides an 8x build time improvement.
boost::afio::stat_t::st_type() is now a boost::filesystem::file_type instead
of replicating the POSIX file type codes. Thanks to Bjorn Reese for suggesting
this.
boost::afio::stat_t::st_mode() is now st_perms(). Also disabled unused fields
in stat_t on Windows. Thanks to Bjorn Reese for suggesting this.
Immediate completions no longer hold the opslock, which meant the opslock could
be changed from a recursive mutex to a spinlock. The new, more parallelised,
behaviour illuminated a number of new race conditions in when_all() which have
been fixed.
Completely gutted dispatch engine and replaced with a new, almost entirely
wait free implementation based on throwing atomics at the problem. If it weren't
for the spin lock around the central ops hash table, AFIO would now be an entirely
wait free design.
In order to do something about that spin lock, replaced all locking in AFIO
(apart from the directory file handle cache) with memory transactions instead.
This does CPUID at runtime and will use Intel's TSX-NI memory transaction implementation
if available, if not it falls back to a spin lock based emulation. On memory
transaction capable CPUs, AFIO is now almost entirely wait free, apart from
when it has to fetch memory from the kernel.
Made AFIO usable as headers only.
First release for end of Google Summer of Code 2013.