git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
10 years agoARM: dts: karo: provide inverted PWM signal by reversing brightness-levels kc-master kc/kc-master
Lothar Waßmann [Tue, 5 Nov 2013 15:34:56 +0000 (16:34 +0100)]
ARM: dts: karo: provide inverted PWM signal by reversing brightness-levels

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agopwm-backlight: use duty_cycle rather than brightness value to decide whether to shutd...
Lothar Waßmann [Tue, 5 Nov 2013 15:30:14 +0000 (16:30 +0100)]
pwm-backlight: use duty_cycle rather than brightness value to decide whether to shutdown PWM

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agopwm: print error messages using pr_err() rather than pr_debug()
Lothar Waßmann [Tue, 5 Nov 2013 15:28:51 +0000 (16:28 +0100)]
pwm: print error messages using pr_err() rather than pr_debug()

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx6: add uart2 rtscts pins
Lothar Waßmann [Tue, 5 Nov 2013 09:54:22 +0000 (10:54 +0100)]
ARM: dts: imx6: add uart2 rtscts pins

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: imx6q: add missing sentinel to divider table
Lothar Waßmann [Thu, 31 Oct 2013 11:48:15 +0000 (12:48 +0100)]
ARM: imx6q: add missing sentinel to divider table

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx53: Add support for Ka-Ro electronics TX53 modules
Lothar Waßmann [Tue, 29 Oct 2013 07:46:59 +0000 (08:46 +0100)]
ARM: dts: imx53: Add support for Ka-Ro electronics TX53 modules

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx53: Add another pwm pinctrl
Lothar Waßmann [Tue, 29 Oct 2013 07:45:04 +0000 (08:45 +0100)]
ARM: dts: imx53: Add another pwm pinctrl

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx53: Add sata support
Lothar Waßmann [Tue, 29 Oct 2013 07:44:35 +0000 (08:44 +0100)]
ARM: dts: imx53: Add sata support

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoahci_platform: enable DT probing
Lothar Waßmann [Thu, 24 Oct 2013 08:23:18 +0000 (10:23 +0200)]
ahci_platform: enable DT probing

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoahci_platform: fix error return values in ahci_probe() function
Lothar Waßmann [Thu, 24 Oct 2013 07:57:09 +0000 (09:57 +0200)]
ahci_platform: fix error return values in ahci_probe() function

- ENODEV is more appropriate than EINVAL if platform_get_resource() fails
- promote the return value from platform_get_irq() rather than inventing
  a new one

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: tx28: use defined constants for GPIO polarity
Lothar Waßmann [Thu, 24 Oct 2013 07:44:05 +0000 (09:44 +0200)]
ARM: dts: tx28: use defined constants for GPIO polarity

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: tx28: fix compatible property for edt-ft5x06
Lothar Waßmann [Thu, 24 Oct 2013 07:39:16 +0000 (09:39 +0200)]
ARM: dts: tx28: fix compatible property for edt-ft5x06

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx6qdl: add support for Ka-Ro TX6 modules
Lothar Waßmann [Thu, 24 Oct 2013 07:38:02 +0000 (09:38 +0200)]
ARM: dts: imx6qdl: add support for Ka-Ro TX6 modules

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx6qdl: add more pinctrls for usbh1
Lothar Waßmann [Thu, 24 Oct 2013 07:36:11 +0000 (09:36 +0200)]
ARM: dts: imx6qdl: add more pinctrls for usbh1

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx6qdl: add more pinctrls for pwm
Lothar Waßmann [Thu, 24 Oct 2013 07:35:52 +0000 (09:35 +0200)]
ARM: dts: imx6qdl: add more pinctrls for pwm

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx6qdl: rename pinctrl_pwm0 for consistency
Lothar Waßmann [Thu, 24 Oct 2013 07:35:28 +0000 (09:35 +0200)]
ARM: dts: imx6qdl: rename pinctrl_pwm0 for consistency

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx6qdl: add more pinctrls for flexcan
Lothar Waßmann [Thu, 24 Oct 2013 07:32:06 +0000 (09:32 +0200)]
ARM: dts: imx6qdl: add more pinctrls for flexcan

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx6qdl: add more pinctrls for enet
Lothar Waßmann [Thu, 24 Oct 2013 07:31:12 +0000 (09:31 +0200)]
ARM: dts: imx6qdl: add more pinctrls for enet

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx6qdl: add more pinctrls for audmux
Lothar Waßmann [Thu, 24 Oct 2013 07:29:33 +0000 (09:29 +0200)]
ARM: dts: imx6qdl: add more pinctrls for audmux

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoARM: dts: imx6qdl: add pinctrl for uart3 RTS/CTS
Lothar Waßmann [Thu, 24 Oct 2013 07:21:37 +0000 (09:21 +0200)]
ARM: dts: imx6qdl: add pinctrl for uart3 RTS/CTS

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
10 years agoInput: edt-ft5x06 - add DT support
Lothar Waßmann [Thu, 24 Oct 2013 06:59:32 +0000 (08:59 +0200)]
Input: edt-ft5x06 - add DT support

10 years agodrm/imx: convert dev_dbg() to dev_err() for error message
Lothar Waßmann [Tue, 22 Oct 2013 11:49:21 +0000 (13:49 +0200)]
drm/imx: convert dev_dbg() to dev_err() for error message

10 years agoimx-drm: use native mode if available rather than first in list
Lothar Waßmann [Mon, 21 Oct 2013 14:43:24 +0000 (16:43 +0200)]
imx-drm: use native mode if available rather than first in list

10 years agonet: fec: call dma_mapping_error() where appropriate
Lothar Waßmann [Mon, 21 Oct 2013 14:36:23 +0000 (16:36 +0200)]
net: fec: call dma_mapping_error() where appropriate

This patch fixes the warning:
| DMA-API: device driver failed to check map error
when compiled with CONFIG_DMA_API_DEBUG enabled.

10 years agonet: fec: fix phy-reset-duration limiting
Lothar Waßmann [Mon, 21 Oct 2013 14:31:30 +0000 (16:31 +0200)]
net: fec: fix phy-reset-duration limiting

One second is 1000 milliseconds, not just '1'.
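As a minimal sketch of the unit issue described above, and not the actual driver change, the limit on a reset duration given in milliseconds could look like this. The helper name `clamp_phy_reset_ms` and the constant `MAX_PHY_RESET_MS` are hypothetical, and clamping to the limit is only one plausible reading of the commit message.

```c
#include <assert.h>

/* Hypothetical limit: one second, expressed in milliseconds. */
#define MAX_PHY_RESET_MS 1000

/*
 * Hypothetical helper: cap a PHY reset duration (in ms) at one second.
 * Using '1' here instead of 1000 would cap it at one millisecond.
 */
static int clamp_phy_reset_ms(int msec)
{
	if (msec > MAX_PHY_RESET_MS)
		msec = MAX_PHY_RESET_MS;
	return msec;
}
```

Durations at or below the limit pass through unchanged; only larger values are reduced.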

10 years agoAdd linux-next specific files for 20131105 next-20131105
Stephen Rothwell [Tue, 5 Nov 2013 07:08:53 +0000 (18:08 +1100)]
Add linux-next specific files for 20131105

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
10 years agoMerge branch 'akpm/master'
Stephen Rothwell [Tue, 5 Nov 2013 06:42:02 +0000 (17:42 +1100)]
Merge branch 'akpm/master'

10 years agosound/core/memalloc.c: use gen_pool_dma_alloc() to allocate iram buffer
Nicolin Chen [Tue, 5 Nov 2013 06:07:11 +0000 (17:07 +1100)]
sound/core/memalloc.c: use gen_pool_dma_alloc() to allocate iram buffer

Now that gen_pool_dma_alloc() has been introduced, use it to simplify the code.

Signed-off-by: Nicolin Chen <b42378@freescale.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoblk-mq: use __smp_call_function_single directly
Christoph Hellwig [Tue, 5 Nov 2013 06:07:11 +0000 (17:07 +1100)]
blk-mq: use __smp_call_function_single directly

Now that __smp_call_function_single is available for all builds and uses
llists to queue up items without taking a lock or disabling interrupts
there is no need to wrap around it in the block code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel: use lockless list for smp_call_function_single
Christoph Hellwig [Tue, 5 Nov 2013 06:07:10 +0000 (17:07 +1100)]
kernel: use lockless list for smp_call_function_single

Make smp_call_function_single and friends more efficient by using
a lockless list.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agollists-move-llist_reverse_order-from-raid5-to-llistc-fix
Andrew Morton [Tue, 5 Nov 2013 06:07:10 +0000 (17:07 +1100)]
llists-move-llist_reverse_order-from-raid5-to-llistc-fix

fix comment typo, per Jan

Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agollists: move llist_reverse_order from raid5 to llist.c
Christoph Hellwig [Tue, 5 Nov 2013 06:07:09 +0000 (17:07 +1100)]
llists: move llist_reverse_order from raid5 to llist.c

Make this useful helper available for other users.
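The idea behind a lockless list and the reversal helper can be sketched in user space with C11 atomics. This is an illustration only, not the kernel's llist code: the names `lnode`, `lhead`, `l_add`, `l_del_all` and `l_reverse` are hypothetical stand-ins. Producers push at the head with a compare-and-swap, a consumer detaches the whole list in one exchange, and the reversal restores insertion order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct lnode {
	struct lnode *next;
	int value;
};

struct lhead {
	_Atomic(struct lnode *) first;
};

/* Push a node at the head without taking a lock. */
static void l_add(struct lhead *h, struct lnode *n)
{
	struct lnode *old = atomic_load(&h->first);

	do {
		n->next = old;	/* 'old' is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak(&h->first, &old, n));
}

/* Detach the whole list in one atomic step; the result is newest first. */
static struct lnode *l_del_all(struct lhead *h)
{
	return atomic_exchange(&h->first, NULL);
}

/* Reverse a detached list so entries come out in insertion order. */
static struct lnode *l_reverse(struct lnode *head)
{
	struct lnode *rev = NULL;

	while (head) {
		struct lnode *next = head->next;

		head->next = rev;
		rev = head;
		head = next;
	}
	return rev;
}
```

The reversal runs on a list that has already been detached, so it needs no atomics.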

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel: fix generic_exec_single indentation
Christoph Hellwig [Tue, 5 Nov 2013 06:07:09 +0000 (17:07 +1100)]
kernel: fix generic_exec_single indentation

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel-provide-a-__smp_call_function_single-stub-for-config_smp-fix
Andrew Morton [Tue, 5 Nov 2013 06:07:08 +0000 (17:07 +1100)]
kernel-provide-a-__smp_call_function_single-stub-for-config_smp-fix

x86_64 allnoconfig:

kernel/up.c:25: error: redefinition of '__smp_call_function_single'
include/linux/smp.h:154: note: previous definition of '__smp_call_function_single' was here

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel: provide a __smp_call_function_single stub for !CONFIG_SMP
Christoph Hellwig [Tue, 5 Nov 2013 06:07:07 +0000 (17:07 +1100)]
kernel: provide a __smp_call_function_single stub for !CONFIG_SMP

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agokernel: remove CONFIG_USE_GENERIC_SMP_HELPERS
Christoph Hellwig [Tue, 5 Nov 2013 06:07:07 +0000 (17:07 +1100)]
kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS

We've switched over every architecture that supports SMP to it, so remove
the now useless config variable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agorevert "softirq: Add support for triggering softirq work on softirqs"
Christoph Hellwig [Tue, 5 Nov 2013 06:07:06 +0000 (17:07 +1100)]
revert "softirq: Add support for triggering softirq work on softirqs"

This commit was incomplete in that the code to remove items from the
per-cpu lists was missing, and it never acquired a user in the 5 years it
has been in the tree.  We're going to implement what it seems to try to
achieve in a simpler way, and this code is in the way of doing so.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoscripts/tags.sh: remove obsolete __devinit[const|data]
Michael Opdenacker [Tue, 5 Nov 2013 06:07:06 +0000 (17:07 +1100)]
scripts/tags.sh: remove obsolete __devinit[const|data]

This removes the use of __devinitconst and __devinitdata in scripts/tags.sh,
which were removed in 3.8.

Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/w1/masters/w1-gpio.c: use dev_get_platdata()
Jingoo Han [Tue, 5 Nov 2013 06:07:05 +0000 (17:07 +1100)]
drivers/w1/masters/w1-gpio.c: use dev_get_platdata()

Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly.  This is a cosmetic change to make
the code simpler and enhance the readability.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosched: remove INIT_COMPLETION
Wolfram Sang [Tue, 5 Nov 2013 06:07:05 +0000 (17:07 +1100)]
sched: remove INIT_COMPLETION

All users are converted over to reinit_completion(). Remove the old
macro now.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agotree-wide-use-reinit_completion-instead-of-init_completion-fix
Andrew Morton [Tue, 5 Nov 2013 06:07:04 +0000 (17:07 +1100)]
tree-wide-use-reinit_completion-instead-of-init_completion-fix

Cc: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agotree-wide: use reinit_completion instead of INIT_COMPLETION
Wolfram Sang [Tue, 5 Nov 2013 06:07:03 +0000 (17:07 +1100)]
tree-wide: use reinit_completion instead of INIT_COMPLETION

Use this new function to make code more comprehensible, since we are
reinitializing the completion, not initializing.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13)
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosched: replace INIT_COMPLETION with reinit_completion
Wolfram Sang [Tue, 5 Nov 2013 06:07:03 +0000 (17:07 +1100)]
sched: replace INIT_COMPLETION with reinit_completion

For the casual device driver writer, it is hard to remember when to use
init_completion (to init a completion structure) or INIT_COMPLETION (to
*reinit* a completion structure).  Furthermore, while all other completion
functions expect a pointer as a parameter, INIT_COMPLETION does not.  To
make it easier to remember which function to use and to make code more
readable, introduce a new inline function with the proper name and
consistent argument type.  Update the kernel-doc for init_completion while
we are here.
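The difference in calling convention can be illustrated with a simplified user-space stand-in. The struct below keeps only a `done` counter and omits the wait queue of the kernel's completion, so treat it as a sketch of the interface shape rather than the real implementation.

```c
#include <assert.h>

/* Simplified stand-in for struct completion (no wait queue). */
struct completion {
	unsigned int done;
};

/* Old style: a macro that takes the structure itself, not a pointer. */
#define INIT_COMPLETION(x)	((x).done = 0)

/* First-time initialization of a completion. */
static inline void init_completion(struct completion *x)
{
	x->done = 0;
}

/* Signal the completion once. */
static inline void complete(struct completion *x)
{
	x->done++;
}

/* New style: reinitialize through a pointer, like every other helper. */
static inline void reinit_completion(struct completion *x)
{
	x->done = 0;
}
```

With the inline function, a driver writes `reinit_completion(&c)` in the same form as `init_completion(&c)` and `complete(&c)`, instead of remembering that the macro wants `c` itself.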

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13)
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-hid-sensor-time.c: enable HID input processing early
Alexander Holler [Tue, 5 Nov 2013 06:07:02 +0000 (17:07 +1100)]
drivers/rtc/rtc-hid-sensor-time.c: enable HID input processing early

Enable the processing of HID input records before the RTC is registered,
so that the RTC register function can read the clock.  Without this, the
clock can only be read after the probe function has finished.

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agodrivers/rtc/rtc-hid-sensor-time.c: use dev_get_platdata()
Jingoo Han [Tue, 5 Nov 2013 06:07:02 +0000 (17:07 +1100)]
drivers/rtc/rtc-hid-sensor-time.c: use dev_get_platdata()

Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly.  This is a cosmetic change to make
the code simpler and enhance the readability.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agovsprintf: ignore %n again
Kees Cook [Tue, 5 Nov 2013 06:07:01 +0000 (17:07 +1100)]
vsprintf: ignore %n again

This ignores %n in printf again, as was originally documented.
Implementing %n poses a greater security risk than utility, so it should
stay ignored.  To help anyone attempting to use %n, a warning will be
emitted if it is encountered.

Based on an earlier patch by Joe Perches.

Because %n was designed to write to pointers on the stack, it has been
frequently used as an attack vector when bugs are found that leak
user-controlled strings into functions that ultimately process format
strings.  While this class of bug can still be turned into an information
leak, removing %n eliminates the common method of elevating such a bug
into an arbitrary kernel memory writing primitive, significantly reducing
the danger of this class of bug.

For seq_file users that need to know the length of a written string for
padding, please see seq_setwidth() and seq_pad() instead.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Joe Perches <joe@perches.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoseq_file: remove "%n" usage from seq_file users
Tetsuo Handa [Tue, 5 Nov 2013 06:07:00 +0000 (17:07 +1100)]
seq_file: remove "%n" usage from seq_file users

All seq_printf() users of "%n" use it for calculating the padding size;
convert them to use the seq_setwidth() / seq_pad() pair.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoseq_file: introduce seq_setwidth() and seq_pad()
Tetsuo Handa [Tue, 5 Nov 2013 06:07:00 +0000 (17:07 +1100)]
seq_file: introduce seq_setwidth() and seq_pad()

There are several users who want to know the number of bytes written by
seq_*() for alignment purposes.  Currently they use the %n format to find
out, because seq_*() returns 0 on success.

This patch introduces seq_setwidth() and seq_pad() for allowing them to
align without using %n format.
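The pattern described above, padding a field to a fixed width without `%n`, can be sketched in user space as follows. The type `out_buf` and the functions `out_printf`, `out_setwidth` and `out_pad` are hypothetical stand-ins, not the seq_file API itself: the buffer remembers the offset the field should reach, and the pad step fills the gap after the text has been written.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct out_buf {
	char data[128];
	size_t count;		/* bytes written so far */
	size_t pad_until;	/* offset the current field should reach */
};

/* Append formatted text, truncating if the buffer is full. */
static void out_printf(struct out_buf *b, const char *fmt, ...)
{
	size_t room = sizeof(b->data) - b->count;
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(b->data + b->count, room, fmt, ap);
	va_end(ap);
	if (n > 0)
		b->count += ((size_t)n < room) ? (size_t)n : room - 1;
}

/* Record where the field that starts here should end. */
static void out_setwidth(struct out_buf *b, size_t width)
{
	b->pad_until = b->count + width;
}

/* Fill with spaces up to the recorded offset, then append 'c' if nonzero. */
static void out_pad(struct out_buf *b, char c)
{
	while (b->count < b->pad_until && b->count < sizeof(b->data) - 1)
		b->data[b->count++] = ' ';
	if (c && b->count < sizeof(b->data) - 1)
		b->data[b->count++] = c;
	b->data[b->count] = '\0';
}
```

A caller sets the width, prints the field, then pads, so the number of bytes written never has to be reported back through the format string.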

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm-dynamically-allocate-page-ptl-if-it-cannot-be-embedded-to-struct-page-fix-fix
Andrew Morton [Tue, 5 Nov 2013 06:06:59 +0000 (17:06 +1100)]
mm-dynamically-allocate-page-ptl-if-it-cannot-be-embedded-to-struct-page-fix-fix

Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: try to detect that page->ptl is in use
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:59 +0000 (17:06 +1100)]
mm: try to detect that page->ptl is in use

prep_new_page() initializes page->private (and therefore page->ptl) with 0.
Make sure nobody started using it between the allocation of the page and
the page table constructor.

This can happen if an arch tries to use slab for page table allocation:
slab code uses page->slab_cache and page->first_page (for tail pages),
which share storage with page->ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: dynamically allocate page->ptl if it cannot be embedded to struct page
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:58 +0000 (17:06 +1100)]
mm: dynamically allocate page->ptl if it cannot be embedded to struct page

If split page table lock is in use, we embed the lock into struct page of
table's page.  We have to disable split lock if spinlock_t is too big to
be embedded, like when DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC is enabled.

This patch adds support for dynamic allocation of split page table lock if
we can't embed it to struct page.

page->ptl is unsigned long now and we use it as spinlock_t if
sizeof(spinlock_t) <= sizeof(long), otherwise it's a pointer to spinlock_t.

The spinlock_t is allocated in pgtable_page_ctor() for PTE table and in
pgtable_pmd_page_ctor() for PMD table.  All other helpers are converted to
support dynamically allocated page->ptl.
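The embed-or-allocate decision described above can be sketched in user space. Everything here is a stand-in: `big_lock_t` imitates a spinlock_t enlarged by debugging options, `fake_page` imitates struct page, and the helper bodies are simplified guesses at the shape of the logic rather than the kernel's code. The sketch also assumes a platform where a pointer fits in an unsigned long, as on Linux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for a spinlock_t that is too big to fit in a long. */
typedef struct {
	int slots[8];
} big_lock_t;

/* Stand-in for struct page: only the ptl storage is modelled. */
struct fake_page {
	unsigned long ptl;
};

#define LOCK_FITS_IN_LONG (sizeof(big_lock_t) <= sizeof(unsigned long))

/* Allocate the lock only when it cannot live inside page->ptl itself. */
static bool ptlock_alloc(struct fake_page *page)
{
	big_lock_t *lock;

	if (LOCK_FITS_IN_LONG)
		return true;	/* embedded, nothing to allocate */
	lock = calloc(1, sizeof(*lock));
	if (!lock)
		return false;
	page->ptl = (unsigned long)lock;
	return true;
}

/* Return the lock, whether it is embedded or dynamically allocated. */
static big_lock_t *ptlock_ptr(struct fake_page *page)
{
	if (LOCK_FITS_IN_LONG)
		return (big_lock_t *)&page->ptl;
	return (big_lock_t *)page->ptl;
}

/* Release a dynamically allocated lock; a no-op when it is embedded. */
static void ptlock_free(struct fake_page *page)
{
	if (!LOCK_FITS_IN_LONG) {
		free((void *)page->ptl);
		page->ptl = 0;
	}
}
```

Because allocation can fail, the constructor that calls the allocation helper needs a way to report failure to its caller.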

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Reviewed-by: Peter Zijlstra <peterz@infradead.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoxtensa: use buddy allocator for PTE table
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:58 +0000 (17:06 +1100)]
xtensa: use buddy allocator for PTE table

At the moment xtensa uses slab allocator for PTE table.  It doesn't work
with enabled split page table lock: slab uses page->slab_cache and
page->first_page for its pages.  These fields share storage with
page->ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Chris Zankel <chris@zankel.net>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoiommu/arm-smmu: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:57 +0000 (17:06 +1100)]
iommu/arm-smmu: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoxtensa: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:56 +0000 (17:06 +1100)]
xtensa: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agox86: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:56 +0000 (17:06 +1100)]
x86: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agounicore32: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:55 +0000 (17:06 +1100)]
unicore32: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoum: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:55 +0000 (17:06 +1100)]
um: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agotile: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:54 +0000 (17:06 +1100)]
tile: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosparc: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:54 +0000 (17:06 +1100)]
sparc: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agosh: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:53 +0000 (17:06 +1100)]
sh: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoscore: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:53 +0000 (17:06 +1100)]
score: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Acked-by: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agos390: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:52 +0000 (17:06 +1100)]
s390: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agopowerpc: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:52 +0000 (17:06 +1100)]
powerpc: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoparisc: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:51 +0000 (17:06 +1100)]
parisc: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomips: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:50 +0000 (17:06 +1100)]
mips: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agometag: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:50 +0000 (17:06 +1100)]
metag: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agom68k-handle-pgtable_page_ctor-fail-fix-fix
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:49 +0000 (17:06 +1100)]
m68k-handle-pgtable_page_ctor-fail-fix-fix

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agom68k-handle-pgtable_page_ctor-fail-fix
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:49 +0000 (17:06 +1100)]
m68k-handle-pgtable_page_ctor-fail-fix

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agom68k: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:48 +0000 (17:06 +1100)]
m68k: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agom32r: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:48 +0000 (17:06 +1100)]
m32r: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoia64: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:47 +0000 (17:06 +1100)]
ia64: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agohexagon: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:47 +0000 (17:06 +1100)]
hexagon: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agofrv: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:46 +0000 (17:06 +1100)]
frv: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agocris: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:45 +0000 (17:06 +1100)]
cris: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mikael Starvik <starvik@axis.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoavr32: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:45 +0000 (17:06 +1100)]
avr32: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoarm64: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:44 +0000 (17:06 +1100)]
arm64: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoarm: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:44 +0000 (17:06 +1100)]
arm: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoarc: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:43 +0000 (17:06 +1100)]
arc: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [for arch/arc bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoalpha: handle pgtable_page_ctor() fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:43 +0000 (17:06 +1100)]
alpha: handle pgtable_page_ctor() fail

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoopenrisc: add missing pgtable_page_ctor/dtor calls
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:42 +0000 (17:06 +1100)]
openrisc: add missing pgtable_page_ctor/dtor calls

It will fix NR_PAGETABLE accounting.  It's also required if the arch is
ever going to support split ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mn10300: add missing pgtable_page_ctor/dtor calls
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:42 +0000 (17:06 +1100)]
mn10300: add missing pgtable_page_ctor/dtor calls

This fixes NR_PAGETABLE accounting.  It is also required if the arch is
ever going to support split ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago microblaze: add missing pgtable_page_ctor/dtor calls
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:41 +0000 (17:06 +1100)]
microblaze: add missing pgtable_page_ctor/dtor calls

This fixes NR_PAGETABLE accounting.  It is also required if the arch is
ever going to support split ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: allow pgtable_page_ctor() to fail
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:40 +0000 (17:06 +1100)]
mm: allow pgtable_page_ctor() to fail

Change the pgtable_page_ctor() return type from void to bool.  It returns
true if initialization is successful and false otherwise.

The current implementation never fails, but that will change later.
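
As a rough illustration only (a userspace sketch, not the kernel code), a
caller can react to a failing constructor as shown below.  The names
struct page_model, pgtable_page_ctor_model(), pte_alloc_one_model() and
force_ctor_failure are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct page_model {
	bool ctor_done;
};

/* Knob invented for this sketch: force the constructor to fail. */
static bool force_ctor_failure;

/* Models the constructor after the change: returns true on success. */
static bool pgtable_page_ctor_model(struct page_model *page)
{
	if (force_ctor_failure)
		return false;
	page->ctor_done = true;
	return true;
}

/* Models an arch allocation helper: undo the allocation if the ctor fails. */
static struct page_model *pte_alloc_one_model(void)
{
	struct page_model *page = calloc(1, sizeof(*page));

	if (!page)
		return NULL;
	if (!pgtable_page_ctor_model(page)) {
		free(page);
		return NULL;
	}
	return page;
}
```

The point of the sketch is the ordering: the page is released before the
error is reported, so a failed constructor does not leak the allocation.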

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago xtensa: fix potential NULL-pointer dereference
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:40 +0000 (17:06 +1100)]
xtensa: fix potential NULL-pointer dereference

Add missing check for memory allocation failure.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago m32r: fix potential NULL-pointer dereference
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:39 +0000 (17:06 +1100)]
m32r: fix potential NULL-pointer dereference

Add missing check for memory allocation failure.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago cris: fix potential NULL-pointer dereference
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:39 +0000 (17:06 +1100)]
cris: fix potential NULL-pointer dereference

Add missing check for memory allocation failure.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mikael Starvik <starvik@axis.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:38 +0000 (17:06 +1100)]
x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds

In the split page table lock case, we embed spinlock_t into struct page.
For obvious reasons, we don't want to increase the size of struct page if
spinlock_t is too big, as with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC or on an
-rt kernel.  So we disable split page table lock if spinlock_t is too big.

This patchset allows the lock to be allocated dynamically if spinlock_t is
big.  In that case page->ptl stores a pointer to the spinlock instead of
the spinlock itself.  It costs an additional cache line for the indirect
access, but fixes page fault scalability for multi-threaded applications.

LOCK_STAT depends on DEBUG_SPINLOCK, so on the current kernel, enabling
LOCK_STAT to analyse scalability issues breaks scalability.  ;)

The patchset mostly fixes this.  Results for ./thp_memscale -c 80 -b 512M
on a 4-socket machine:

baseline, no CONFIG_LOCK_STAT: 9.115460703 seconds time elapsed
baseline, CONFIG_LOCK_STAT=y: 53.890567123 seconds time elapsed
patched, no CONFIG_LOCK_STAT: 8.852250368 seconds time elapsed
patched, CONFIG_LOCK_STAT=y: 11.069770759 seconds time elapsed

The patch count is scary, but most of them are trivial.  Overview:

 Patches 1-4    A few bug fixes.  No dependencies on other patches.
                Should probably be applied as soon as possible.

 Patch 5        Changes the signature of pgtable_page_ctor().  We will use
                it for dynamic lock allocation, so it can fail.

 Patches 6-8    Add missing constructor/destructor calls on a few archs.
                This fixes NR_PAGETABLE accounting and prepares them to use
                split ptl.

 Patches 9-33   Add pgtable_page_ctor() failure handling to all archs.

 Patch 34       Finally adds support for dynamically-allocated page->ptl.
                Also contains documentation for split page table lock.

This patch (of 34):

I missed that we preallocate a few pmds in pgd_alloc() if X86_PAE is
enabled.  Let's add the missed constructor/destructor calls.

I didn't notice it during testing since prep_new_page() clears
page->mapping and therefore page->ptl.  That is effectively equal to
spin_lock_init(&page->ptl).

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago x86-mm-enable-split-page-table-lock-for-pmd-level-checkpatch-fixes
Andrew Morton [Tue, 5 Nov 2013 06:06:38 +0000 (17:06 +1100)]
x86-mm-enable-split-page-table-lock-for-pmd-level-checkpatch-fixes

ERROR: need consistent spacing around '|' (ctx:VxW)
#62: FILE: arch/x86/include/asm/pgalloc.h:84:
+ page = alloc_pages(GFP_KERNEL | __GFP_REPEAT| __GFP_ZERO, 0);
                                              ^

total: 1 errors, 0 warnings, 32 lines checked

./patches/x86-mm-enable-split-page-table-lock-for-pmd-level.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago x86, mm: enable split page table lock for PMD level
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:37 +0000 (17:06 +1100)]
x86, mm: enable split page table lock for PMD level

Enable PMD split page table lock for X86_64 and PAE.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: implement split page table lock for PMD level
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:37 +0000 (17:06 +1100)]
mm: implement split page table lock for PMD level

The basic idea is the same as with PTE level: the lock is embedded into
the struct page of the table's page.

We can't use mm->pmd_huge_pte to store pgtables for THP, since we don't
take mm->page_table_lock anymore.  Let's reuse page->lru of the table's
page for that.

pgtable_pmd_page_ctor() returns true if initialization is successful and
false otherwise.  The current implementation never fails, but the
assumption that the constructor can fail will help to port it to -rt,
where spinlock_t is rather huge and cannot be embedded into struct page --
dynamic allocation is required.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: convert the rest to new page table lock api
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:36 +0000 (17:06 +1100)]
mm: convert the rest to new page table lock api

Only trivial cases are left.  Let's convert them all together.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm-hugetlb-convert-hugetlbfs-to-use-split-pmd-lock-checkpatch-fixes
Andrew Morton [Tue, 5 Nov 2013 06:06:35 +0000 (17:06 +1100)]
mm-hugetlb-convert-hugetlbfs-to-use-split-pmd-lock-checkpatch-fixes

ERROR: code indent should use tabs where possible
#65: FILE: include/linux/hugetlb.h:396:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#65: FILE: include/linux/hugetlb.h:396:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#67: FILE: include/linux/hugetlb.h:398:
+       if (huge_page_size(h) == PMD_SIZE)$

WARNING: suspect code indent for conditional statements (7, 15)
#67: FILE: include/linux/hugetlb.h:398:
+       if (huge_page_size(h) == PMD_SIZE)
+               return pmd_lockptr(mm, (pmd_t *) pte);

ERROR: code indent should use tabs where possible
#68: FILE: include/linux/hugetlb.h:399:
+               return pmd_lockptr(mm, (pmd_t *) pte);$

WARNING: please, no spaces at the start of a line
#68: FILE: include/linux/hugetlb.h:399:
+               return pmd_lockptr(mm, (pmd_t *) pte);$

WARNING: please, no spaces at the start of a line
#69: FILE: include/linux/hugetlb.h:400:
+       VM_BUG_ON(huge_page_size(h) == PAGE_SIZE);$

WARNING: please, no spaces at the start of a line
#70: FILE: include/linux/hugetlb.h:401:
+       return &mm->page_table_lock;$

ERROR: code indent should use tabs where possible
#90: FILE: include/linux/hugetlb.h:436:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#90: FILE: include/linux/hugetlb.h:436:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#92: FILE: include/linux/hugetlb.h:438:
+       return &mm->page_table_lock;$

ERROR: code indent should use tabs where possible
#97: FILE: include/linux/hugetlb.h:443:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#97: FILE: include/linux/hugetlb.h:443:
+               struct mm_struct *mm, pte_t *pte)$

WARNING: please, no spaces at the start of a line
#99: FILE: include/linux/hugetlb.h:445:
+       spinlock_t *ptl;$

WARNING: please, no spaces at the start of a line
#100: FILE: include/linux/hugetlb.h:446:
+       ptl = huge_pte_lockptr(h, mm, pte);$

WARNING: please, no spaces at the start of a line
#101: FILE: include/linux/hugetlb.h:447:
+       spin_lock(ptl);$

WARNING: please, no spaces at the start of a line
#102: FILE: include/linux/hugetlb.h:448:
+       return ptl;$

WARNING: line over 80 characters
#264: FILE: mm/hugetlb.c:2668:
+  * race occurs while re-acquiring page table lock, and

total: 4 errors, 14 warnings, 474 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-hugetlb-convert-hugetlbfs-to-use-split-pmd-lock.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm, hugetlb: convert hugetlbfs to use split pmd lock
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:35 +0000 (17:06 +1100)]
mm, hugetlb: convert hugetlbfs to use split pmd lock

Hugetlb supports multiple page sizes. We use split lock only for PMD
level, but not for PUD.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm, thp: do not access mm->pmd_huge_pte directly
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:34 +0000 (17:06 +1100)]
mm, thp: do not access mm->pmd_huge_pte directly

Currently mm->pmd_huge_pte is protected by the page table lock.  That will
not work with the split lock.  We have to have a per-pmd pmd_huge_pte for
proper access serialization.

For now, let's just introduce a wrapper to access mm->pmd_huge_pte.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm, thp: move ptl taking inside page_check_address_pmd()
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:34 +0000 (17:06 +1100)]
mm, thp: move ptl taking inside page_check_address_pmd()

With split page table lock we can't know which lock we need to take before
we find the relevant pmd.

Let's move lock taking inside the function.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm, thp: change pmd_trans_huge_lock() to return taken lock
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:33 +0000 (17:06 +1100)]
mm, thp: change pmd_trans_huge_lock() to return taken lock

With split ptlock it's important to know which lock pmd_trans_huge_lock()
took.  This patch adds one more parameter to the function to return the
lock.

In most places migration to the new API is trivial.  The exception is
move_huge_pmd(): we need to take two locks if the pmd tables are different.
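
The two-lock case can be sketched in userspace roughly as follows.  This is
an illustration under assumptions, not the patch itself: struct lock_model,
struct pmd_table_model and all *_model() functions are hypothetical, and
the "lock" merely tracks ownership instead of spinning.

```c
#include <assert.h>

/* Hypothetical stand-in for spinlock_t; 'held' only tracks ownership. */
struct lock_model {
	int held;
};

/* Hypothetical stand-in for a pmd table page carrying its own lock. */
struct pmd_table_model {
	struct lock_model ptl;
};

static void lock_model_acquire(struct lock_model *lock)
{
	assert(!lock->held);	/* a real spinlock would spin here instead */
	lock->held = 1;
}

static void lock_model_release(struct lock_model *lock)
{
	assert(lock->held);
	lock->held = 0;
}

/*
 * Models the new calling convention: the lock that was taken is handed
 * back through *ptl so the caller knows what to unlock later.
 */
static int trans_huge_lock_model(struct pmd_table_model *table,
				 struct lock_model **ptl)
{
	*ptl = &table->ptl;
	lock_model_acquire(*ptl);
	return 1;
}

/*
 * Models the move case: take the old table's lock, then the new table's
 * lock only if it is a different one.  Returns the number of locks taken.
 */
static int move_lock_model(struct pmd_table_model *old_table,
			   struct pmd_table_model *new_table,
			   struct lock_model **old_ptl,
			   struct lock_model **new_ptl)
{
	trans_huge_lock_model(old_table, old_ptl);
	*new_ptl = &new_table->ptl;
	if (*new_ptl != *old_ptl) {
		lock_model_acquire(*new_ptl);
		return 2;
	}
	return 1;
}

/* Release in reverse order, skipping the second lock if it is the same. */
static void move_unlock_model(struct lock_model *old_ptl,
			      struct lock_model *new_ptl)
{
	if (new_ptl != old_ptl)
		lock_model_release(new_ptl);
	lock_model_release(old_ptl);
}
```

Comparing the two lock pointers before taking the second one is what keeps
the same-table case from trying to acquire one lock twice.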

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: introduce api for split page table lock for PMD level
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:33 +0000 (17:06 +1100)]
mm: introduce api for split page table lock for PMD level

Basic API, backed by mm->page_table_lock for now.  The actual
implementation will be added later.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: convert mm->nr_ptes to atomic_long_t
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:32 +0000 (17:06 +1100)]
mm: convert mm->nr_ptes to atomic_long_t

With split page table lock for PMD level we can't hold mm->page_table_lock
while updating nr_ptes.

Let's convert it to atomic_long_t to avoid races.
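
For illustration, the same idea in userspace C11 looks roughly like this.
It uses <stdatomic.h> rather than the kernel's atomic_long_t API, and
struct mm_model and the mm_model_*() helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for mm_struct with an atomic page table counter. */
struct mm_model {
	atomic_long nr_ptes;
};

static void mm_model_init(struct mm_model *mm)
{
	atomic_init(&mm->nr_ptes, 0);
}

/* Updates need no external lock; the read-modify-write is atomic. */
static void mm_model_inc_nr_ptes(struct mm_model *mm)
{
	atomic_fetch_add(&mm->nr_ptes, 1);
}

static void mm_model_dec_nr_ptes(struct mm_model *mm)
{
	atomic_fetch_sub(&mm->nr_ptes, 1);
}

static long mm_model_nr_ptes(struct mm_model *mm)
{
	return atomic_load(&mm->nr_ptes);
}
```

With a plain integer, two threads holding different per-table locks could
lose an update; the atomic counter removes the need for one common lock.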

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:31 +0000 (17:06 +1100)]
mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS

We're going to introduce split page table lock for PMD level.  Let's
rename the existing split ptlock for PTE level to avoid confusion.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 years ago mm: avoid increase sizeof(struct page) due to split page table lock
Kirill A. Shutemov [Tue, 5 Nov 2013 06:06:31 +0000 (17:06 +1100)]
mm: avoid increase sizeof(struct page) due to split page table lock

Alex Thorlton noticed that some massively threaded workloads perform
poorly if THP is enabled.  This patchset fixes this by introducing split
page table lock for PMD tables.  hugetlbfs is not covered yet.

This patchset is based on work by Naoya Horiguchi.

: akpm result summary:
:
: THP off, v3.12-rc2: 18.059261877 seconds time elapsed
: THP off, patched:   16.768027318 seconds time elapsed
:
: THP on, v3.12-rc2:  42.162306788 seconds time elapsed
: THP on, patched:    8.397885779 seconds time elapsed
:
: HUGETLB, v3.12-rc2: 47.574936948 seconds time elapsed
: HUGETLB, patched:   19.447481153 seconds time elapsed

THP off, v3.12-rc2:
-------------------

 Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):

    1037072.835207 task-clock                #   57.426 CPUs utilized            ( +-  3.59% )
            95,093 context-switches          #    0.092 K/sec                    ( +-  3.93% )
               140 cpu-migrations            #    0.000 K/sec                    ( +-  5.28% )
        10,000,550 page-faults               #    0.010 M/sec                    ( +-  0.00% )
 2,455,210,400,261 cycles                    #    2.367 GHz                      ( +-  3.62% ) [83.33%]
 2,429,281,882,056 stalled-cycles-frontend   #   98.94% frontend cycles idle     ( +-  3.67% ) [83.33%]
 1,975,960,019,659 stalled-cycles-backend    #   80.48% backend  cycles idle     ( +-  3.88% ) [66.68%]
    46,503,296,013 instructions              #    0.02  insns per cycle
                                             #   52.24  stalled cycles per insn  ( +-  3.21% ) [83.34%]
     9,278,997,542 branches                  #    8.947 M/sec                    ( +-  4.00% ) [83.34%]
        89,881,640 branch-misses             #    0.97% of all branches          ( +-  1.17% ) [83.33%]

      18.059261877 seconds time elapsed                                          ( +-  2.65% )

THP on, v3.12-rc2:
------------------

 Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):

    3114745.395974 task-clock                #   73.875 CPUs utilized            ( +-  1.84% )
           267,356 context-switches          #    0.086 K/sec                    ( +-  1.84% )
                99 cpu-migrations            #    0.000 K/sec                    ( +-  1.40% )
            58,313 page-faults               #    0.019 K/sec                    ( +-  0.28% )
 7,416,635,817,510 cycles                    #    2.381 GHz                      ( +-  1.83% ) [83.33%]
 7,342,619,196,993 stalled-cycles-frontend   #   99.00% frontend cycles idle     ( +-  1.88% ) [83.33%]
 6,267,671,641,967 stalled-cycles-backend    #   84.51% backend  cycles idle     ( +-  2.03% ) [66.67%]
   117,819,935,165 instructions              #    0.02  insns per cycle
                                             #   62.32  stalled cycles per insn  ( +-  4.39% ) [83.34%]
    28,899,314,777 branches                  #    9.278 M/sec                    ( +-  4.48% ) [83.34%]
        71,787,032 branch-misses             #    0.25% of all branches          ( +-  1.03% ) [83.33%]

      42.162306788 seconds time elapsed                                          ( +-  1.73% )

HUGETLB, v3.12-rc2:
-------------------

 Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):

    2588052.787264 task-clock                #   54.400 CPUs utilized            ( +-  3.69% )
           246,831 context-switches          #    0.095 K/sec                    ( +-  4.15% )
               138 cpu-migrations            #    0.000 K/sec                    ( +-  5.30% )
            21,027 page-faults               #    0.008 K/sec                    ( +-  0.01% )
 6,166,666,307,263 cycles                    #    2.383 GHz                      ( +-  3.68% ) [83.33%]
 6,086,008,929,407 stalled-cycles-frontend   #   98.69% frontend cycles idle     ( +-  3.77% ) [83.33%]
 5,087,874,435,481 stalled-cycles-backend    #   82.51% backend  cycles idle     ( +-  4.41% ) [66.67%]
   133,782,831,249 instructions              #    0.02  insns per cycle
                                             #   45.49  stalled cycles per insn  ( +-  4.30% ) [83.34%]
    34,026,870,541 branches                  #   13.148 M/sec                    ( +-  4.24% ) [83.34%]
        68,670,942 branch-misses             #    0.20% of all branches          ( +-  3.26% ) [83.33%]

      47.574936948 seconds time elapsed                                          ( +-  2.09% )

THP off, patched:
-----------------

 Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):

     943301.957892 task-clock                #   56.256 CPUs utilized            ( +-  3.01% )
            86,218 context-switches          #    0.091 K/sec                    ( +-  3.17% )
               121 cpu-migrations            #    0.000 K/sec                    ( +-  6.64% )
        10,000,551 page-faults               #    0.011 M/sec                    ( +-  0.00% )
 2,230,462,457,654 cycles                    #    2.365 GHz                      ( +-  3.04% ) [83.32%]
 2,204,616,385,805 stalled-cycles-frontend   #   98.84% frontend cycles idle     ( +-  3.09% ) [83.32%]
 1,778,640,046,926 stalled-cycles-backend    #   79.74% backend  cycles idle     ( +-  3.47% ) [66.69%]
    45,995,472,617 instructions              #    0.02  insns per cycle
                                             #   47.93  stalled cycles per insn  ( +-  2.51% ) [83.34%]
     9,179,700,174 branches                  #    9.731 M/sec                    ( +-  3.04% ) [83.35%]
        89,166,529 branch-misses             #    0.97% of all branches          ( +-  1.45% ) [83.33%]

      16.768027318 seconds time elapsed                                          ( +-  2.47% )

THP on, patched:
----------------

 Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):

     458793.837905 task-clock                #   54.632 CPUs utilized            ( +-  0.79% )
            41,831 context-switches          #    0.091 K/sec                    ( +-  0.97% )
                98 cpu-migrations            #    0.000 K/sec                    ( +-  1.66% )
            57,829 page-faults               #    0.126 K/sec                    ( +-  0.62% )
 1,077,543,336,716 cycles                    #    2.349 GHz                      ( +-  0.81% ) [83.33%]
 1,067,403,802,964 stalled-cycles-frontend   #   99.06% frontend cycles idle     ( +-  0.87% ) [83.33%]
   864,764,616,143 stalled-cycles-backend    #   80.25% backend  cycles idle     ( +-  0.73% ) [66.68%]
    16,129,177,440 instructions              #    0.01  insns per cycle
                                             #   66.18  stalled cycles per insn  ( +-  7.94% ) [83.35%]
     3,618,938,569 branches                  #    7.888 M/sec                    ( +-  8.46% ) [83.36%]
        33,242,032 branch-misses             #    0.92% of all branches          ( +-  2.02% ) [83.32%]

       8.397885779 seconds time elapsed                                          ( +-  0.18% )

HUGETLB, patched:
-----------------

 Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):

     395353.076837 task-clock                #   20.329 CPUs utilized            ( +-  8.16% )
            55,730 context-switches          #    0.141 K/sec                    ( +-  5.31% )
               138 cpu-migrations            #    0.000 K/sec                    ( +-  4.24% )
            21,027 page-faults               #    0.053 K/sec                    ( +-  0.00% )
   930,219,717,244 cycles                    #    2.353 GHz                      ( +-  8.21% ) [83.32%]
   914,295,694,103 stalled-cycles-frontend   #   98.29% frontend cycles idle     ( +-  8.35% ) [83.33%]
   704,137,950,187 stalled-cycles-backend    #   75.70% backend  cycles idle     ( +-  9.16% ) [66.69%]
    30,541,538,385 instructions              #    0.03  insns per cycle
                                             #   29.94  stalled cycles per insn  ( +-  3.98% ) [83.35%]
     8,415,376,631 branches                  #   21.286 M/sec                    ( +-  3.61% ) [83.36%]
        32,645,478 branch-misses             #    0.39% of all branches          ( +-  3.41% ) [83.32%]

      19.447481153 seconds time elapsed                                          ( +-  2.00% )

This patch (of 11):

CONFIG_GENERIC_LOCKBREAK increases sizeof(spinlock_t) to 8 bytes.  This
increases sizeof(struct page) by 4 bytes on 32-bit systems if the split
page table lock is in use, since page->ptl shares space in a union with
longs and pointers, which are only 4 bytes wide there.

Let's disable split page table lock on 32-bit systems with
GENERIC_LOCKBREAK enabled.
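
The size effect can be illustrated with a small model.  This is only a
sketch, not the kernel's definitions: every type name below is hypothetical,
uint32_t stands in for a 32-bit long or pointer so that the sizes are the
same on any build host, and the extra lock-break field is modelled as one
additional 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a 4-byte spinlock without GENERIC_LOCKBREAK. */
typedef struct { uint32_t raw_lock; } model_spinlock_t;

/* Model of the 8-byte spinlock with GENERIC_LOCKBREAK (assumed layout). */
typedef struct { uint32_t raw_lock; uint32_t break_lock; } model_spinlock_lb_t;

/* ptl sharing a union with a word-sized member, as on a 32-bit system. */
union model_slot    { uint32_t word; model_spinlock_t    ptl; };
union model_slot_lb { uint32_t word; model_spinlock_lb_t ptl; };

/* The union is as large as its largest member: 4 bytes vs. 8 bytes. */
static_assert(sizeof(union model_slot) == 4, "lock fits in one word");
static_assert(sizeof(union model_slot_lb) == 8, "lock spills past one word");

/* Bytes by which the enclosing structure grows in this model. */
size_t union_growth(void)
{
    return sizeof(union model_slot_lb) - sizeof(union model_slot);
}
```

In this model union_growth() returns 4, the same 4-byte growth described
above for struct page.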

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>