
rwsem: Support optimistic spinning

We have reached the point where our mutexes are quite fine tuned for a
number of situations.  This includes the use of heuristics and optimistic
spinning, based on MCS locking techniques.

Exclusive ownership of a read-write semaphore is, conceptually, just about
the same as a mutex, making them close cousins.  To this end we need to
make them both perform similarly, and right now, rwsems are simply not up
to it.  This was discovered both by reverting commit 4fc3f1d6 (mm/rmap,
migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable)
and, similarly, by converting some other mutexes (e.g. i_mmap_mutex) to
rwsems.  This creates a situation where users have to choose between a
rwsem and a mutex while taking this important performance difference into
account.  Specifically, the biggest difference between the two locks is
that when we fail to acquire a mutex in the fastpath, optimistic spinning
comes into play and we can avoid a large amount of unnecessary sleeping
and the overhead of moving tasks in and out of the wait queue.  Rwsems do
not have such logic.
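
As a rough illustration of the idea, the following userspace sketch models
the decision a contending task makes: try the fastpath, then spin only
while the current holder is still running, and otherwise fall back to
sleeping.  It is not the kernel implementation.  The names (struct owner,
struct sketch_lock, acquire(), the on_cpu flag) and the max_spins bound are
assumptions made for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task; objects are assumed to outlive the lock. */
struct owner {
    atomic_bool on_cpu;              /* true while this task is running */
};

/* Hypothetical lock: holder is NULL when the lock is free. */
struct sketch_lock {
    _Atomic(struct owner *) holder;
};

enum acquire_result { ACQUIRED_FAST, ACQUIRED_SPIN, MUST_SLEEP };

/* Fastpath: a single compare-and-swap from "free" to "held by self". */
static bool try_acquire(struct sketch_lock *l, struct owner *self)
{
    struct owner *expected = NULL;
    return atomic_compare_exchange_strong(&l->holder, &expected, self);
}

static void release(struct sketch_lock *l, struct owner *self)
{
    assert(atomic_load(&l->holder) == self);   /* only the holder unlocks */
    atomic_store(&l->holder, NULL);
}

/*
 * Try the fastpath, then spin optimistically.  Spinning stops as soon as
 * the holder is seen off-CPU, or after max_spins iterations (an assumed
 * bound for this sketch).  MUST_SLEEP means the caller should queue itself
 * and block.
 */
static enum acquire_result acquire(struct sketch_lock *l, struct owner *self,
                                   unsigned long max_spins)
{
    if (try_acquire(l, self))
        return ACQUIRED_FAST;

    for (unsigned long i = 0; i < max_spins; i++) {
        struct owner *h = atomic_load(&l->holder);

        if (h == NULL) {
            if (try_acquire(l, self))
                return ACQUIRED_SPIN;
            continue;                /* lost the race, keep spinning */
        }
        if (!atomic_load(&h->on_cpu))
            break;                   /* holder is not running: stop */
    }
    return MUST_SLEEP;
}
```

The point of checking on_cpu is that a running holder is likely to release
the lock soon, so a short spin is cheaper than a sleep and wakeup.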

This patch, based on the work from Tim Chen and me, adds support for
write-side optimistic spinning when the lock is contended.  It also
includes support for the recently added cancelable MCS locking for
adaptive spinning.  Note that this is only applicable to the xadd method,
and the spinlock rwsem variant remains intact.
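
For readers unfamiliar with MCS locking, here is a minimal userspace sketch
of a plain MCS queue lock using C11 atomics.  It only shows the queueing
idea, where each waiter spins on its own node rather than on a shared
word.  It does not implement the cancelable variant mentioned above, and
the names (struct mcs_node, mcs_lock_acquire(), mcs_lock_release()) are
chosen for this example rather than taken from the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue node per waiter, normally placed on the waiter's stack. */
struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;              /* set once the lock is handed over */
};

/* The lock itself is just a pointer to the tail of the waiter queue. */
struct mcs_lock {
    _Atomic(struct mcs_node *) tail;
};

static void mcs_lock_acquire(struct mcs_lock *lock, struct mcs_node *node)
{
    assert(node != NULL);
    atomic_store(&node->next, NULL);
    atomic_store(&node->locked, false);

    /* Append ourselves to the queue and learn who was last before us. */
    struct mcs_node *prev = atomic_exchange(&lock->tail, node);
    if (prev == NULL)
        return;                      /* queue was empty: lock is ours */

    /* Link behind the predecessor, then spin on our own node only. */
    atomic_store(&prev->next, node);
    while (!atomic_load(&node->locked))
        ;
}

static void mcs_lock_release(struct mcs_lock *lock, struct mcs_node *node)
{
    struct mcs_node *next = atomic_load(&node->next);

    if (next == NULL) {
        /* No visible successor: try to mark the queue empty. */
        struct mcs_node *expected = node;
        if (atomic_compare_exchange_strong(&lock->tail, &expected, NULL))
            return;
        /* A successor is enqueueing; wait until it links itself. */
        while ((next = atomic_load(&node->next)) == NULL)
            ;
    }
    atomic_store(&next->locked, true);   /* hand the lock to the successor */
}
```

Because each waiter spins on a flag in its own node, contended waiters do
not all bounce the same cache line, which is what makes this scheme
attractive for queueing spinners.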

Allowing optimistic spinning before putting the writer on the wait queue
reduces wait queue contention and provides a greater chance for the rwsem
to get acquired.  With these changes, rwsem is on par with mutex.  The
performance benefits can be seen on a number of workloads.  For instance,
on an 8-socket, 80-core, 64-bit Westmere box, aim7 shows the following
improvements in throughput:

+--------------+---------------------+-----------------+
|   Workload   | throughput-increase | number of users |
+--------------+---------------------+-----------------+
| alltests     | 20%                 | >1000           |
| custom       | 27%, 60%            | 10-100, >1000   |
| high_systime | 36%, 30%            | >100, >1000     |
| shared       | 58%, 29%            | 10-100, >1000   |
+--------------+---------------------+-----------------+

There was also improvement on smaller systems, such as a quad-core x86-64
laptop running a 30Gb PostgreSQL (pgbench) workload, with up to 60% higher
throughput for over 50 clients.  Additionally, benefits were noticed in
exim (mail server) workloads.  When compared against regular non-blocking
rw locks ([q]rwlock_t), rwsems with this change can outperform them, for
instance when studying the popular anon-vma lock:

http://www.spinics.net/lists/linux-mm/msg72705.html

Furthermore, no performance regressions have been seen at all.

This patch applies on top of the -tip branch.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

author	Davidlohr Bueso <davidlohr@hp.com>
	Sat, 17 May 2014 13:19:24 +0000 (23:19 +1000)
committer	Stephen Rothwell <sfr@canb.auug.org.au>
	Wed, 21 May 2014 07:11:31 +0000 (17:11 +1000)
commit	b91813f5d6a4a3f26606e52602e3691126203597
tree	5551937e568074a5b05e99143d9f992c08b43494
parent	28f7b0e521dd210878083472c44686ba46794d62

include/linux/rwsem.h
kernel/locking/rwsem-xadd.c
kernel/locking/rwsem.c