git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
8 years ago fs/coda: fix readlink buffer overflow
Jan Harkes [Wed, 9 Sep 2015 22:38:01 +0000 (15:38 -0700)]
fs/coda: fix readlink buffer overflow

Dan Carpenter discovered a buffer overflow in the Coda file system
readlink code.  A userspace file system daemon can return a 4096 byte
result which then triggers a one byte write past the allocated readlink
result buffer.
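
A minimal userspace sketch of the kind of bounds check involved, assuming
a fixed-size result buffer that must also hold a terminating NUL.  The
helper name is hypothetical and is not taken from the Coda sources:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a daemon-supplied link target into buf and
 * NUL-terminate it.  Returns 0 on success, -1 if the reply would leave
 * no room for the terminator (the case that overflows by one byte).
 */
int sketch_copy_link(char *buf, size_t bufsize,
		     const char *reply, size_t reply_len)
{
	if (bufsize == 0 || reply_len > bufsize - 1)
		return -1;
	memcpy(buf, reply, reply_len);
	buf[reply_len] = '\0';
	return 0;
}
```

With a 4096 byte buffer, a 4096 byte reply is rejected while a 4095 byte
reply still fits together with its terminator.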

This does not trigger with an unmodified Coda implementation because Coda
has a 1024 byte limit for symbolic links; however, other userspace file
systems using the Coda kernel module could be affected.

Although this is an obvious overflow, I don't think it needs to be treated
as particularly sensitive from a security perspective, because the overflow
is on the Coda userspace daemon side, and the daemon already needs root to
open Coda's kernel device and to mount the file system before we get to the
point that links can be read.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: add constant comparison on left side test
Joe Perches [Wed, 9 Sep 2015 22:37:58 +0000 (15:37 -0700)]
checkpatch: add constant comparison on left side test

"CONST <comparison> variable" checks like:

        if (NULL != foo)
and
        while (0 < bar(...))

where a constant (or what appears to be a constant like an upper case
identifier) is on the left of a comparison are generally preferred to be
written using the constant on the right side like:

        if (foo != NULL)
and
        while (bar(...) > 0)

Add a test for this.

Add a --fix option too, but only do it when the code is immediately
surrounded by parentheses to avoid misfixing things like "(0 < bar() +
constant)"
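
Moving the constant to the right means the comparison operator has to be
mirrored, not just the operands swapped.  A small sketch of that mapping
in C (the function name is hypothetical; checkpatch itself is a Perl
script and does this differently):

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the operator to use once the two operands are swapped,
 * or NULL if op is not a comparison operator.
 */
const char *sketch_mirror_comparison(const char *op)
{
	static const char *const pairs[][2] = {
		{ "<", ">" }, { ">", "<" },
		{ "<=", ">=" }, { ">=", "<=" },
		{ "==", "==" }, { "!=", "!=" },
	};
	size_t i;

	for (i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++)
		if (strcmp(op, pairs[i][0]) == 0)
			return pairs[i][1];
	return NULL;
}
```

So "0 < bar(...)" becomes "bar(...) > 0", while "NULL != foo" keeps its
operator and becomes "foo != NULL".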

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Nicolas Morey Chaisemartin <nmorey@kalray.eu>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: add __pmem to $Sparse annotations
Joe Perches [Wed, 9 Sep 2015 22:37:55 +0000 (15:37 -0700)]
checkpatch: add __pmem to $Sparse annotations

commit 61031952f4c8 ("arch, x86: pmem api for ensuring durability of
persistent memory updates") added a new __pmem annotation for sparse
verification.  Add __pmem to the $Sparse variable so checkpatch can
appropriately ignore uses of this attribute too.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: fix left brace warning
Eddie Kovsky [Wed, 9 Sep 2015 22:37:52 +0000 (15:37 -0700)]
checkpatch: fix left brace warning

Using checkpatch.pl with Perl 5.22.0 generates the following warning:

    Unescaped left brace in regex is deprecated, passed through in regex;

This patch fixes the warnings by escaping occurrences of the left brace
inside the regular expression.

Signed-off-by: Eddie Kovsky <ewk@edkovsky.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: avoid some commit message long line warnings
Joe Perches [Wed, 9 Sep 2015 22:37:50 +0000 (15:37 -0700)]
checkpatch: avoid some commit message long line warnings

Fixes: and Link: lines may exceed 75 chars in the commit log.
So too can stack dump and dmesg lines and lines that seem
like filenames.

And Fixes: lines don't need to have a "commit" prefix before the
commit id.

Add exceptions for these types of lines.

Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: emit an error on formats with 0x%<decimal>
Joe Perches [Wed, 9 Sep 2015 22:37:47 +0000 (15:37 -0700)]
checkpatch: emit an error on formats with 0x%<decimal>

Using 0x%d is wrong.  Emit a message when it happens.

Miscellanea:

Improve the %Lu warning to match formats like %16Lu.
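
To illustrate why 0x%d is wrong: the "0x" prefix promises hexadecimal
but %d prints decimal digits.  A short sketch using snprintf (the helper
names are made up for this example):

```c
#include <stddef.h>
#include <stdio.h>

/* Misleading: prints decimal digits behind a hexadecimal prefix. */
void sketch_format_wrong(char *out, size_t n, int v)
{
	snprintf(out, n, "0x%d", v);
}

/* Consistent: the prefix and the conversion are both hexadecimal. */
void sketch_format_right(char *out, size_t n, unsigned int v)
{
	snprintf(out, n, "0x%x", v);
}
```

For the value 255 the first variant yields "0x255" and the second
"0xff".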

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: make --strict the default for drivers/staging files and patches
Joe Perches [Wed, 9 Sep 2015 22:37:44 +0000 (15:37 -0700)]
checkpatch: make --strict the default for drivers/staging files and patches

Making --strict the default for staging may help some people submit
patches without obvious defects.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: always check block comment styles
Joe Perches [Wed, 9 Sep 2015 22:37:41 +0000 (15:37 -0700)]
checkpatch: always check block comment styles

Some of the block comment tests that are used only for networking are
appropriate for all patches.

For example, these styles are not encouraged:

/*
 block comment without introductory *
*/
and
/*
 * block comment with line terminating */

Remove the networking specific test and add comments.

There are some infrequent false positives where code is lazily
commented out using /* and */ rather than using #if 0/#endif blocks
like:
/* case foo:
case bar: */
case baz:

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: report the right line # when using --emacs and --file
Joe Perches [Wed, 9 Sep 2015 22:37:39 +0000 (15:37 -0700)]
checkpatch: report the right line # when using --emacs and --file

commit 34d8815f9512 ("checkpatch: add --showfile to allow input via pipe
to show filenames") broke the --emacs with --file option.

Fix it.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: add some <foo>_destroy functions to NEEDLESS_IF tests
Joe Perches [Wed, 9 Sep 2015 22:37:36 +0000 (15:37 -0700)]
checkpatch: add some <foo>_destroy functions to NEEDLESS_IF tests

Sergey Senozhatsky has modified several destroy functions that can
now be called with NULL values.

 - kmem_cache_destroy()
 - mempool_destroy()
 - dma_pool_destroy()

Update checkpatch to warn when those functions are preceded by an if.

Also make --fix rewrite these calls, but only when the code is
indented with leading tabs.

from:
	if (foo)
		<func>(foo);
to:
	<func>(foo);
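
A userspace sketch of why the "if" becomes needless once a destroy
function tolerates NULL.  The sketch_pool type and its functions are
hypothetical stand-ins, not the kernel implementations:

```c
#include <stdlib.h>

struct sketch_pool {
	int *items;
};

struct sketch_pool *sketch_pool_create(size_t n)
{
	struct sketch_pool *pool = malloc(sizeof(*pool));

	if (!pool)
		return NULL;
	pool->items = calloc(n, sizeof(*pool->items));
	if (!pool->items) {
		free(pool);
		return NULL;
	}
	return pool;
}

/* The NULL check lives here, so callers do not need their own. */
void sketch_pool_destroy(struct sketch_pool *pool)
{
	if (!pool)
		return;
	free(pool->items);
	free(pool);
}
```

Callers can then write sketch_pool_destroy(pool) unconditionally, which
is the form the new checkpatch test steers towards.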

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: Allow longer declaration macros
Joe Perches [Wed, 9 Sep 2015 22:37:33 +0000 (15:37 -0700)]
checkpatch: Allow longer declaration macros

Some really long declaration macros exist.

For instance:
   DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
and
   DECLARE_DM_KCOPYD_THROTTLE_WITH_MODULE_PARM(name, description)

Increase the limit from 2 words to 6 after DECLARE/DEFINE uses.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: improve SUSPECT_CODE_INDENT test
Joe Perches [Wed, 9 Sep 2015 22:37:30 +0000 (15:37 -0700)]
checkpatch: improve SUSPECT_CODE_INDENT test

Many lines exist like

if (foo)
bar;

where the tabbed indentation of the branch is not one more than the "if"
line above it.

checkpatch should emit a warning on those lines.

Miscellanea:

o Remove comments from branch blocks
o Skip blank lines in block

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: add warning on BUG/BUG_ON use
Joe Perches [Wed, 9 Sep 2015 22:37:27 +0000 (15:37 -0700)]
checkpatch: add warning on BUG/BUG_ON use

Using BUG/BUG_ON crashes the kernel and is just unfriendly.

Enable code that emits a warning on BUG/BUG_ON use.

Make the code emit the message at WARNING level when scanning a patch and
at CHECK level when scanning files so that script users don't feel an
obligation to fix code that might be above their pay grade.

Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago checkpatch: warn on bare SHA-1 commit IDs in commit logs
Joe Perches [Wed, 9 Sep 2015 22:37:25 +0000 (15:37 -0700)]
checkpatch: warn on bare SHA-1 commit IDs in commit logs

Commit IDs should have commit descriptions too.  Warn when a 12 to 40 byte
SHA-1 is used in commit logs.
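
A rough sketch of the "12 to 40 byte SHA-1" test in C, assuming
lower-case hex digits only.  checkpatch itself uses a Perl regular
expression; this helper is hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* True if tok consists only of 12 to 40 lower-case hex digits. */
bool sketch_looks_like_sha1(const char *tok)
{
	size_t len = 0;

	for (; tok[len]; len++) {
		char c = tok[len];

		if (!((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')))
			return false;
	}
	return len >= 12 && len <= 40;
}
```

A 12 character abbreviation such as 61031952f4c8 matches, while an 11
character prefix or an ordinary word does not.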

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago lib/test_kasan.c: make kmalloc_oob_krealloc_less more correctly
Wang Long [Wed, 9 Sep 2015 22:37:22 +0000 (15:37 -0700)]
lib/test_kasan.c: make kmalloc_oob_krealloc_less more correctly

In kmalloc_oob_krealloc_less, I think it is better to test
the size2 boundary.

Without the krealloc call, an access at position size1 is already out of
bounds while an access at position size2 is not.  After the krealloc call,
an access at position size2 is out of bounds.  So using size2 is more
correct.

Signed-off-by: Wang Long <long.wanglong@huawei.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago lib/test_kasan.c: fix a typo
Wang Long [Wed, 9 Sep 2015 22:37:19 +0000 (15:37 -0700)]
lib/test_kasan.c: fix a typo

Signed-off-by: Wang Long <long.wanglong@huawei.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago lib/string_helpers: rename "esc" arg to "only"
Kees Cook [Wed, 9 Sep 2015 22:37:16 +0000 (15:37 -0700)]
lib/string_helpers: rename "esc" arg to "only"

To further clarify the purpose of the "esc" argument, rename it to "only"
to reflect that it is a limit, not a list of additional characters to
escape.

Signed-off-by: Kees Cook <keescook@chromium.org>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago lib/string_helpers: clarify esc arg in string_escape_mem
Kees Cook [Wed, 9 Sep 2015 22:37:14 +0000 (15:37 -0700)]
lib/string_helpers: clarify esc arg in string_escape_mem

The esc argument is used to reduce which characters will be escaped.  For
example, using " " with ESCAPE_SPACE will not produce any escaped spaces.
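
A simplified userspace sketch of that limiting behaviour.  It models only
whitespace escaping (\n, \r, \t, \v, \f) and treats an empty filter as no
filter, which is an assumption; the function names are hypothetical and
this is not the kernel's string_escape_mem():

```c
#include <stddef.h>
#include <string.h>

/* Map a whitespace control character to its escape letter, or 0. */
static char sketch_space_letter(char c)
{
	switch (c) {
	case '\n': return 'n';
	case '\r': return 'r';
	case '\t': return 't';
	case '\v': return 'v';
	case '\f': return 'f';
	default:   return 0;
	}
}

/*
 * Copy src to dst, escaping whitespace control characters.  When only
 * is a non-empty string, characters outside it are never escaped.
 * Returns the output length, or (size_t)-1 if dst is too small.
 */
size_t sketch_escape_space(const char *src, char *dst, size_t dstsz,
			   const char *only)
{
	int restricted = only && *only;
	size_t out = 0;

	for (; *src; src++) {
		char letter = sketch_space_letter(*src);
		size_t need;

		if (letter && restricted && !strchr(only, *src))
			letter = 0;
		need = letter ? 2 : 1;
		if (out + need >= dstsz)
			return (size_t)-1;
		if (letter) {
			dst[out++] = '\\';
			dst[out++] = letter;
		} else {
			dst[out++] = *src;
		}
	}
	if (out >= dstsz)
		return (size_t)-1;
	dst[out] = '\0';
	return out;
}
```

Passing " " as the filter leaves the input untouched: a space has no
escape in this class, and every other character is excluded by the
filter.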

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago hexdump: do not print debug dumps for !CONFIG_DEBUG
Linus Walleij [Wed, 9 Sep 2015 22:37:11 +0000 (15:37 -0700)]
hexdump: do not print debug dumps for !CONFIG_DEBUG

print_hex_dump_debug() is likely supposed to be analogous to pr_debug() or
dev_dbg() & friends.  Currently it will adhere to dynamic debug, but will
not stub out prints if CONFIG_DEBUG is not set.  Let's make it do the
right thing, because I am tired of having my dmesg buffer full of hex
dumps on production systems.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago lib/bitmap.c: bitmap_parselist can accept string with whitespaces on head or tail
Pan Xinhui [Wed, 9 Sep 2015 22:37:08 +0000 (15:37 -0700)]
lib/bitmap.c: bitmap_parselist can accept string with whitespaces on head or tail

In __bitmap_parselist we can accept whitespace at the head or tail of each
chunk being parsed.  If the input has valid ranges, there is no reason to
reject it.

For example, bitmap_parselist(" 1-3, 5, ", &mask, nmaskbits).  After
separating the string, we get " 1-3", " 5", and " ".  It's possible and
reasonable to accept such a string as long as the parsing result is correct.
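
A much simplified userspace sketch of a whitespace-tolerant list parser,
limited to a single 64-bit mask.  All names are hypothetical and the
structure does not follow the kernel's __bitmap_parselist:

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>

/* Parse decimal digits; returns the digit count, or -1 if above 63. */
static int sketch_parse_num(const char **sp, unsigned int *val)
{
	const char *s = *sp;
	int digits = 0;

	*val = 0;
	while (isdigit((unsigned char)*s)) {
		*val = *val * 10 + (unsigned int)(*s - '0');
		if (*val > 63)
			return -1;
		digits++;
		s++;
	}
	*sp = s;
	return digits;
}

static const char *sketch_skip_space(const char *s)
{
	while (isspace((unsigned char)*s))
		s++;
	return s;
}

/* Parse a list such as " 1-3, 5, " into a 64-bit mask. */
int sketch_parselist(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	for (;;) {
		unsigned int a, b, i;
		int n;

		s = sketch_skip_space(s);
		if (*s == '\0')
			break;		/* blank chunk at the tail */
		n = sketch_parse_num(&s, &a);
		if (n < 0)
			return -ERANGE;
		if (n == 0)
			return -EINVAL;
		b = a;
		if (*s == '-') {
			s++;
			n = sketch_parse_num(&s, &b);
			if (n < 0)
				return -ERANGE;
			if (n == 0 || b < a)
				return -EINVAL;	/* "0-" or reversed range */
		}
		for (i = a; i <= b; i++)
			m |= (uint64_t)1 << i;
		s = sketch_skip_space(s);
		if (*s == ',')
			s++;
		else if (*s != '\0')
			return -EINVAL;
	}
	*mask = m;
	return 0;
}
```

In this sketch " 1-3, 5, " sets bits 1, 2, 3 and 5, and a range that is
left open such as "1,0-" is rejected with -EINVAL.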

Signed-off-by: Pan Xinhui <xinhuix.pan@intel.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago lib/bitmap.c: fix a special string handling bug in __bitmap_parselist
Pan Xinhui [Wed, 9 Sep 2015 22:37:05 +0000 (15:37 -0700)]
lib/bitmap.c: fix a special string handling bug in __bitmap_parselist

If the string ends with '-', for example bitmap_parselist("1,0-", &mask,
nmaskbits), it is not a valid pattern, so add a check after the loop and
return -EINVAL in that case.

Signed-off-by: Pan Xinhui <xinhuix.pan@intel.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago lib/bitmap.c: correct a code style and do some, optimization
Pan Xinhui [Wed, 9 Sep 2015 22:37:02 +0000 (15:37 -0700)]
lib/bitmap.c: correct a code style and do some, optimization

We can avoid incrementing ndigits inside the loop.  Save the current
totaldigits to ndigits before the loop, and check ndigits against
totaldigits after the loop.

Signed-off-by: Pan Xinhui <xinhuix.pan@intel.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago proc: convert to kstrto*()/kstrto*_from_user()
Alexey Dobriyan [Wed, 9 Sep 2015 22:36:59 +0000 (15:36 -0700)]
proc: convert to kstrto*()/kstrto*_from_user()

Convert from manual allocation/copy_from_user/...  to the kstrto*() family,
which was designed for exactly that.

One case cannot be converted to kstrto*_from_user() to make the code even
simpler, because of whitespace stripping, oh well...

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago kstrto*: accept "-0" for signed conversion
Alexey Dobriyan [Wed, 9 Sep 2015 22:36:17 +0000 (15:36 -0700)]
kstrto*: accept "-0" for signed conversion

strtol(3) et al accept "-0", so should we.
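
A userspace sketch of a strict signed conversion built on strtol(3),
which accepts "-0" as zero.  The strictness rules here (no leading
whitespace, one optional trailing newline) are assumptions made for the
example, and sketch_strtoint is a hypothetical name:

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Convert a whole decimal string to int; 0 on success, -errno on error. */
int sketch_strtoint(const char *s, int *res)
{
	char *end;
	long v;

	/* Unlike strtol(3), refuse leading whitespace. */
	if (!(isdigit((unsigned char)s[0]) || s[0] == '-' || s[0] == '+'))
		return -EINVAL;
	errno = 0;
	v = strtol(s, &end, 10);
	if (end == s)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -ERANGE;
	*res = (int)v;
	return 0;
}
```

With this helper "-0" converts to 0 just like "0" does, while input with
trailing garbage is rejected.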

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago MAINTAINERS/CREDITS: mark MaxRAID as Orphan, move Anil Ravindranath to CREDITS
Joe Perches [Wed, 9 Sep 2015 22:36:14 +0000 (15:36 -0700)]
MAINTAINERS/CREDITS: mark MaxRAID as Orphan, move Anil Ravindranath to CREDITS

Anil's email address bounces and he hasn't had a signoff
in over 5 years.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago include/linux/printk.h: include pr_fmt in pr_debug_ratelimited
Jason A. Donenfeld [Wed, 9 Sep 2015 22:36:12 +0000 (15:36 -0700)]
include/linux/printk.h: include pr_fmt in pr_debug_ratelimited

The other two implementations of pr_debug_ratelimited include pr_fmt,
along with every other pr_* function.  But pr_debug_ratelimited forgot to
add it with the CONFIG_DYNAMIC_DEBUG implementation.

This patch unifies the behavior.
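
A userspace sketch of what wrapping the format in pr_fmt() buys.  The
two sketch_print_* macros are hypothetical and write into a buffer with
snprintf instead of logging:

```c
#include <stdio.h>

/* A module typically defines pr_fmt to add a prefix to every message. */
#define pr_fmt(fmt) "sketch: " fmt

/* Forgets pr_fmt: the prefix is silently missing. */
#define sketch_print_raw(buf, size, fmt, ...) \
	snprintf(buf, size, fmt, __VA_ARGS__)

/* Wraps the format in pr_fmt like the other pr_* helpers do. */
#define sketch_print_prefixed(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), __VA_ARGS__)
```

The same call then produces "7" through the first macro and "sketch: 7"
through the second, which is the kind of inconsistency being removed.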

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago kernel/cred.c: remove unnecessary kdebug atomic reads
Joe Perches [Wed, 9 Sep 2015 22:36:09 +0000 (15:36 -0700)]
kernel/cred.c: remove unnecessary kdebug atomic reads

Commit e0e817392b9a ("CRED: Add some configurable debugging [try #6]")
added the kdebug mechanism to this file back in 2009.

The kdebug macro calls no_printk which always evaluates arguments.

Most of the kdebug uses have an unnecessary call of
atomic_read(&cred->usage)

Make the kdebug macro do nothing by defining it with
do { if (0) no_printk(...); } while (0)
when not enabled.
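
A userspace sketch of the difference.  sketch_no_printk and read_usage
are hypothetical stand-ins for no_printk and atomic_read(&cred->usage);
the counter records how often the argument is actually evaluated:

```c
static int usage_reads;

/* Stand-in for an argument with a side effect that cannot be dropped. */
static int read_usage(void)
{
	usage_reads++;
	return 1;
}

/* Does nothing, but as a function call it still evaluates its arguments. */
static inline int sketch_no_printk(const char *fmt, ...)
{
	(void)fmt;
	return 0;
}

#define kdebug_old(...) sketch_no_printk(__VA_ARGS__)
#define kdebug_new(...) \
	do { if (0) sketch_no_printk(__VA_ARGS__); } while (0)

/* Number of argument evaluations caused by one kdebug_old() use. */
int sketch_reads_old(void)
{
	usage_reads = 0;
	kdebug_old("usage=%d\n", read_usage());
	return usage_reads;
}

/* Number of argument evaluations caused by one kdebug_new() use. */
int sketch_reads_new(void)
{
	usage_reads = 0;
	kdebug_new("usage=%d\n", read_usage());
	return usage_reads;
}
```

The "if (0)" form keeps the compiler's format checking but never runs
the call, so the argument is evaluated once in the old form and not at
all in the new one.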

$ size kernel/cred.o* (defconfig x86-64)
   text    data     bss     dec     hex filename
   2748     336       8    3092     c14 kernel/cred.o.new
   2788     336       8    3132     c3c kernel/cred.o.old

Miscellanea:
o Neaten the #define kdebug macros while there

Signed-off-by: Joe Perches <joe@perches.com>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago kernel/extable.c: remove duplicated include
Wei Yongjun [Wed, 9 Sep 2015 22:36:06 +0000 (15:36 -0700)]
kernel/extable.c: remove duplicated include

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago include/linux/poison.h: remove not-used poison pointer macros
Vasily Kulikov [Wed, 9 Sep 2015 22:36:03 +0000 (15:36 -0700)]
include/linux/poison.h: remove not-used poison pointer macros

Signed-off-by: Vasily Kulikov <segoon@openwall.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago include/linux/poison.h: fix LIST_POISON{1,2} offset
Vasily Kulikov [Wed, 9 Sep 2015 22:36:00 +0000 (15:36 -0700)]
include/linux/poison.h: fix LIST_POISON{1,2} offset

Poison pointer values should be small enough to find room in
non-mmap'able/hardly-mmap'able space.  E.g.  on x86 "poison pointer space"
is located starting from 0x0.  Given unprivileged users cannot mmap
anything below mmap_min_addr, it should be safe to use poison pointers
lower than mmap_min_addr.

The current poison pointer values of LIST_POISON{1,2} might be too big for
mmap_min_addr values equal to or less than 1 MB (common case, e.g.  Ubuntu
uses only 0x10000).  There is little point in using such a big value given
the "poison pointer space" below 1 MB is not yet exhausted.  Changing it
to a smaller value solves the problem for small mmap_min_addr setups.
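
The condition being argued can be sketched as a one-line check.  The
helper name is hypothetical, and the sample values in the note below
(0x00100100 and 0x100) are illustrative assumptions, not values quoted
from this commit:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A poison value is only safe from unprivileged mmap() if it lies
 * below mmap_min_addr.
 */
bool sketch_poison_is_unmappable(uintptr_t poison, uintptr_t mmap_min_addr)
{
	return poison < mmap_min_addr;
}
```

With mmap_min_addr at 0x10000, a poison value just above 1 MB such as
0x00100100 fails the check, while a small value such as 0x100 passes.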

The values are suggested by Solar Designer:
http://www.openwall.com/lists/oss-security/2015/05/02/6

Signed-off-by: Vasily Kulikov <segoon@openwall.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago proc: change proc_subdir_lock to a rwlock
Waiman Long [Wed, 9 Sep 2015 22:35:57 +0000 (15:35 -0700)]
proc: change proc_subdir_lock to a rwlock

The proc_subdir_lock spinlock is used to allow only one task at a time to
change the proc directory structure or to look up information in it.
However, the information lookup part can actually be entered by
more than one task as the pde_get() and pde_put() reference count update
calls in the critical sections are atomic increment and decrement
respectively and so are safe with concurrent updates.

The x86 architecture already uses qrwlock, which is fair, and other
architectures like ARM are in the process of switching to qrwlock.  So
unfairness shouldn't be a concern in that conversion.

This patch changed the proc_subdir_lock to a rwlock in order to enable
concurrent lookup. The following functions were modified to take a
write lock:
 - proc_register()
 - remove_proc_entry()
 - remove_proc_subtree()

The following functions were modified to take a read lock:
 - xlate_proc_name()
 - proc_lookup_de()
 - proc_readdir_de()

A parallel /proc filesystem search with the "find" command (1000 threads)
was run on a 4-socket Haswell-EX box (144 threads).  Before the patch, the
parallel search took about 39s.  After the patch, the parallel find took
only 25s, a saving of about 14s.

The micro-benchmark that I used was artificial, but it was used to
reproduce an exit hanging problem that I saw in a real application.  In
fact, only allowing one task to do a lookup seems too limiting to me.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago procfs: always expose /proc/<pid>/map_files/ and make it readable
Calvin Owens [Wed, 9 Sep 2015 22:35:54 +0000 (15:35 -0700)]
procfs: always expose /proc/<pid>/map_files/ and make it readable

Currently, /proc/<pid>/map_files/ is restricted to CAP_SYS_ADMIN, and is
only exposed if CONFIG_CHECKPOINT_RESTORE is set.

Each mapped file region gets a symlink in /proc/<pid>/map_files/
corresponding to the virtual address range at which it is mapped.  The
symlinks work like the symlinks in /proc/<pid>/fd/, so you can follow them
to the backing file even if that backing file has been unlinked.
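
A small sketch of how a tool might decode one entry name into its
address range, assuming the names have the form "<start>-<end>" in
hexadecimal without a 0x prefix.  The helper is hypothetical:

```c
#include <stdio.h>

/*
 * Parse a map_files entry name such as "400000-401000".
 * Returns 0 on success, -1 if the name does not look like a range.
 */
int sketch_parse_map_files_name(const char *name,
				unsigned long *start, unsigned long *end)
{
	char tail;

	/* Exactly two conversions: a third means trailing garbage. */
	if (sscanf(name, "%lx-%lx%c", start, end, &tail) != 2)
		return -1;
	if (*start >= *end)
		return -1;
	return 0;
}
```

The resulting range can then be matched against the address ranges
listed in /proc/<pid>/maps.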

Currently, files which are mapped, unlinked, and closed are impossible to
stat() from userspace.  Exposing /proc/<pid>/map_files/ closes this
functionality "hole".

Not being able to stat() such files makes noticing and explicitly
accounting for the space they use on the filesystem impossible.  You can
work around this by summing up the space used by every file in the
filesystem and subtracting that total from what statfs() tells you, but
that obviously isn't great, and it becomes unworkable once your filesystem
becomes large enough.

This patch moves map_files/ out from behind CONFIG_CHECKPOINT_RESTORE, and
adjusts the permissions enforced on it as follows:

* proc_map_files_lookup()
* proc_map_files_readdir()
* map_files_d_revalidate()

Remove the CAP_SYS_ADMIN restriction, leaving only the current
restriction requiring PTRACE_MODE_READ. The information made
available to userspace by these three functions is already
available in /proc/PID/maps with MODE_READ, so I don't see any
reason to limit them any further (see below for more detail).

* proc_map_files_follow_link()

This stub has been added, and requires that the user have
CAP_SYS_ADMIN in order to follow the links in map_files/,
since there was concern on LKML both about the potential for
bypassing permissions on ancestor directories in the path to
files pointed to, and about what happens with more exotic
memory mappings created by some drivers (ie dma-buf).

In older versions of this patch, I changed every permission check in
the four functions above to enforce MODE_ATTACH instead of MODE_READ.
This was an oversight on my part, and after revisiting the discussion
it seems that nobody was concerned about anything outside of what is
made possible by ->follow_link(). So in this version, I've left the
checks for PTRACE_MODE_READ as-is.

[akpm@linux-foundation.org: catch up with concurrent proc_pid_follow_link() changes]
Signed-off-by: Calvin Owens <calvinowens@fb.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago proc: add cond_resched to /proc/kpage* read/write loop
Vladimir Davydov [Wed, 9 Sep 2015 22:35:51 +0000 (15:35 -0700)]
proc: add cond_resched to /proc/kpage* read/write loop

Reading/writing a /proc/kpage* file may take a long time on machines with
a lot of RAM installed.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Suggested-by: Andres Lagar-Cavilla <andreslc@google.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago proc: export idle flag via kpageflags
Vladimir Davydov [Wed, 9 Sep 2015 22:35:48 +0000 (15:35 -0700)]
proc: export idle flag via kpageflags

As noted by Minchan, a benefit of reading the idle flag from /proc/kpageflags
is that one can easily filter dirty and/or unevictable pages while
estimating the size of unused memory.

Note that the idle flag read from /proc/kpageflags may be stale in case the
page was accessed via a PTE, because it would be too costly to iterate
over all page mappings on each /proc/kpageflags read to provide an
up-to-date value.  To make sure the flag is up-to-date one has to read
/sys/kernel/mm/page_idle/bitmap first.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago mm: introduce idle page tracking
Vladimir Davydov [Wed, 9 Sep 2015 22:35:45 +0000 (15:35 -0700)]
mm: introduce idle page tracking

Knowing the portion of memory that is not used by a certain application or
memory cgroup (idle memory) can be useful for partitioning the system
efficiently, e.g.  by setting memory cgroup limits appropriately.
Currently, the only means to estimate the amount of idle memory provided
by the kernel is /proc/PID/{clear_refs,smaps}: the user can clear the
access bit for all pages mapped to a particular process by writing 1 to
clear_refs, wait for some time, and then count smaps:Referenced.  However,
this method has two serious shortcomings:

 - it does not count unmapped file pages
 - it affects the reclaimer logic

To overcome these drawbacks, this patch introduces two new page flags,
Idle and Young, and a new sysfs file, /sys/kernel/mm/page_idle/bitmap.
A page's Idle flag can only be set from userspace by setting a bit in
/sys/kernel/mm/page_idle/bitmap at the offset corresponding to the page,
and it is cleared whenever the page is accessed either through page tables
(it is cleared in page_referenced() in this case) or using the read(2)
system call (mark_page_accessed()). Thus by setting the Idle flag for
pages of a particular workload, which can be found e.g.  by reading
/proc/PID/pagemap, waiting for some time to let the workload access its
working set, and then reading the bitmap file, one can estimate the amount
of pages that are not used by the workload.
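
A sketch of how a tool might locate the bit for a given page in the
bitmap file.  It assumes the layout is an array of native-endian 8-byte
words with PFN i stored in bit i % 64 of word i / 64; that layout is not
spelled out in this message, so treat it as an assumption.  The names
are hypothetical:

```c
#include <stdint.h>

struct sketch_bit_pos {
	uint64_t byte_offset;	/* file offset of the 8-byte word */
	unsigned int bit;	/* bit index inside that word */
};

/* Locate the Idle bit of a page frame number in the bitmap file. */
struct sketch_bit_pos sketch_idle_bit_pos(uint64_t pfn)
{
	struct sketch_bit_pos pos;

	pos.byte_offset = (pfn / 64) * 8;
	pos.bit = (unsigned int)(pfn % 64);
	return pos;
}
```

Under that assumption PFN 130 lives in bit 2 of the word at byte offset
16.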

The Young page flag is used to avoid interference with the memory
reclaimer.  A page's Young flag is set whenever the Access bit of a page
table entry pointing to the page is cleared by writing to the bitmap file.
If page_referenced() is called on a Young page, it will add 1 to its
return value, therefore concealing the fact that the Access bit was
cleared.

Note, since there is no room for extra page flags on 32 bit, this feature
uses extended page flags when compiled on 32 bit.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: kpageidle requires an MMU]
[akpm@linux-foundation.org: decouple from page-flags rework]
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago mmu-notifier: add clear_young callback
Vladimir Davydov [Wed, 9 Sep 2015 22:35:41 +0000 (15:35 -0700)]
mmu-notifier: add clear_young callback

In the scope of the idle memory tracking feature, which is introduced by
the following patch, we need to clear the referenced/accessed bit not only
in primary, but also in secondary ptes.  The latter is required in order
to estimate wss of KVM VMs.  At the same time we want to avoid flushing
tlb, because it is quite expensive and it won't really affect the final
result.

Currently, there is no function for clearing pte young bit that would meet
our requirements, so this patch introduces one.  To achieve that we have
to add a new mmu-notifier callback, clear_young, since there is no method
for testing-and-clearing a secondary pte w/o flushing tlb.  The new method
is not mandatory and currently only implemented by KVM.
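
"Not mandatory" here means callers must cope with a notifier that does
not provide the method.  A generic sketch of that optional-callback
pattern, with a hypothetical ops structure that is not the kernel's
struct mmu_notifier_ops:

```c
struct sketch_notifier_ops {
	/* Optional: may be left NULL by implementations. */
	int (*clear_young)(unsigned long start, unsigned long end);
};

/* Call the callback if present; report "not young" otherwise. */
int sketch_notify_clear_young(const struct sketch_notifier_ops *ops,
			      unsigned long start, unsigned long end)
{
	if (!ops || !ops->clear_young)
		return 0;
	return ops->clear_young(start, end);
}

/* Sample implementation: reports any non-empty range as young. */
int sketch_report_young(unsigned long start, unsigned long end)
{
	return end > start;
}
```

A notifier that leaves the pointer unset simply contributes nothing to
the result.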

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoproc: add kpagecgroup file
Vladimir Davydov [Wed, 9 Sep 2015 22:35:38 +0000 (15:35 -0700)]
proc: add kpagecgroup file

/proc/kpagecgroup contains a 64-bit inode number of the memory cgroup each
page is charged to, indexed by PFN.  Having this information is useful for
estimating a cgroup working set size.

The file is present if CONFIG_PROC_PAGE_MONITOR && CONFIG_MEMCG.
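
A minimal sketch of looking up the cgroup inode for a single PFN, assuming
the file is a flat array of native-endian 64-bit values indexed by PFN as
described above.  The helper name read_cgroup_ino and the in-memory stand-in
for the file are hypothetical; this is an illustration, not part of the patch.

```python
import io
import struct

ENTRY_SIZE = 8  # one 64-bit inode number per PFN


def read_cgroup_ino(f, pfn):
    """Return the memcg inode number recorded for page frame `pfn`."""
    f.seek(pfn * ENTRY_SIZE)
    data = f.read(ENTRY_SIZE)
    if len(data) != ENTRY_SIZE:
        raise ValueError("PFN %d is beyond the end of the file" % pfn)
    (ino,) = struct.unpack("Q", data)
    return ino


# Stand-in for /proc/kpagecgroup: three pages charged to inodes 7, 7 and 42.
fake_kpagecgroup = io.BytesIO(struct.pack("3Q", 7, 7, 42))
```

On a real system the same helper could be pointed at a file object opened
on /proc/kpagecgroup in binary mode, which typically requires root.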

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomemcg: zap try_get_mem_cgroup_from_page
Vladimir Davydov [Wed, 9 Sep 2015 22:35:35 +0000 (15:35 -0700)]
memcg: zap try_get_mem_cgroup_from_page

It is only used in mem_cgroup_try_charge, so fold it in and zap it.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agohwpoison: use page_cgroup_ino for filtering by memcg
Vladimir Davydov [Wed, 9 Sep 2015 22:35:31 +0000 (15:35 -0700)]
hwpoison: use page_cgroup_ino for filtering by memcg

Hwpoison allows filtering pages by memory cgroup ino.  Currently, it
calls try_get_mem_cgroup_from_page to obtain the cgroup from a page and
then its ino using cgroup_ino, but now we have a helper method for
that, page_cgroup_ino, so use it instead.

This patch also loosens the hwpoison memcg filter dependency rules - it
makes it depend on CONFIG_MEMCG instead of CONFIG_MEMCG_SWAP, because
the hwpoison memcg filter does not require anything (nor did it ever) from
the CONFIG_MEMCG_SWAP side.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomemcg: add page_cgroup_ino helper
Vladimir Davydov [Wed, 9 Sep 2015 22:35:28 +0000 (15:35 -0700)]
memcg: add page_cgroup_ino helper

This patchset introduces a new user API for tracking user memory pages
that have not been used for a given period of time.  The purpose of this
is to provide the userspace with the means of tracking a workload's
working set, i.e.  the set of pages that are actively used by the
workload.  Knowing the working set size can be useful for partitioning the
system more efficiently, e.g.  by tuning memory cgroup limits
appropriately, or for job placement within a compute cluster.

==== USE CASES ====

The unified cgroup hierarchy has memory.low and memory.high knobs, which
are defined as the low and high boundaries for the workload working set
size.  However, the working set size of a workload may be unknown or
change in time.  With this patch set, one can periodically estimate the
amount of memory unused by each cgroup and tune their memory.low and
memory.high parameters accordingly, therefore optimizing the overall
memory utilization.

Another use case is balancing workloads within a compute cluster.  Knowing
how much memory is not really used by a workload unit may help take a more
optimal decision when considering migrating the unit to another node
within the cluster.

Also, as noted by Minchan, this would be useful for per-process reclaim
(https://lwn.net/Articles/545668/). With idle tracking, a smart userspace
memory manager could reclaim only idle pages.

==== USER API ====

The user API consists of two new files:

 * /sys/kernel/mm/page_idle/bitmap.  This file implements a bitmap where each
   bit corresponds to a page, indexed by PFN. When the bit is set, the
   corresponding page is idle. A page is considered idle if it has not been
   accessed since it was marked idle. To mark a page idle one should set the
   bit corresponding to the page by writing to the file. A value written to the
   file is OR-ed with the current bitmap value. Only user memory pages can be
   marked idle, for other page types input is silently ignored. Writing to this
   file beyond max PFN results in the ENXIO error. Only available when
   CONFIG_IDLE_PAGE_TRACKING is set.

   This file can be used to estimate the amount of pages that are not
   used by a particular workload as follows:

   1. mark all pages of interest idle by setting corresponding bits in the
      /sys/kernel/mm/page_idle/bitmap
   2. wait until the workload accesses its working set
   3. read /sys/kernel/mm/page_idle/bitmap and count the number of bits set

 * /proc/kpagecgroup.  This file contains a 64-bit inode number of the
   memory cgroup each page is charged to, indexed by PFN. Only available when
   CONFIG_MEMCG is set.

   This file can be used to find all pages (including unmapped file pages)
   accounted to a particular cgroup. Using /sys/kernel/mm/page_idle/bitmap, one
   can then estimate the cgroup working set size.

For an example of using these files for estimating the amount of unused
memory pages per each memory cgroup, please see the script attached
below.
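
As a rough illustration of the bitmap layout, the sketch below locates the
bit for a given PFN and counts idle pages in a buffer.  It assumes the bitmap
is accessed as an array of native-endian 64-bit words, with PFN i mapped to
bit i % 64 of word i / 64 (the same 8-byte granularity the attached script
uses).  The helper names are hypothetical.

```python
import struct

WORD_BYTES = 8
WORD_BITS = 64


def bitmap_location(pfn):
    """Return (byte offset of the 64-bit word, bit index inside it) for `pfn`."""
    return (pfn // WORD_BITS) * WORD_BYTES, pfn % WORD_BITS


def count_idle_bits(raw):
    """Count set bits in a buffer read from the bitmap.

    `raw` must have a length that is a multiple of 8 bytes.
    """
    words = struct.unpack("%dQ" % (len(raw) // WORD_BYTES), raw)
    return sum(bin(w).count("1") for w in words)
```

Reads and writes on the bitmap would then be issued in whole 8-byte words at
the byte offset returned by bitmap_location().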

==== REASONING ====

The reason to introduce the new user API instead of using
/proc/PID/{clear_refs,smaps} is that the latter has two serious
drawbacks:

 - it does not count unmapped file pages
 - it affects the reclaimer logic

The new API attempts to overcome them both. For more details on how it
is achieved, please see the comment to patch 6.

==== PATCHSET STRUCTURE ====

The patch set is organized as follows:

 - patch 1 adds page_cgroup_ino() helper for the sake of
   /proc/kpagecgroup and patches 2-3 do related cleanup
 - patch 4 adds /proc/kpagecgroup, which reports cgroup ino each page is
   charged to
 - patch 5 introduces a new mmu notifier callback, clear_young, which is
   a lightweight version of clear_flush_young; it is used in patch 6
 - patch 6 implements the idle page tracking feature, including the
   userspace API, /sys/kernel/mm/page_idle/bitmap
 - patch 7 exports idle flag via /proc/kpageflags

==== SIMILAR WORKS ====

Originally, the patch for tracking idle memory was proposed back in 2011
by Michel Lespinasse (see http://lwn.net/Articles/459269/).  The main
difference between Michel's patch and this one is that Michel implemented
a kernel space daemon for estimating idle memory size per cgroup while
this patch only provides the userspace with the minimal API for doing the
job, leaving the rest up to the userspace.  However, they both share the
same idea of Idle/Young page flags to avoid affecting the reclaimer logic.

==== PERFORMANCE EVALUATION ====

SPECjvm2008 (https://www.spec.org/jvm2008/) was used to evaluate the
performance impact introduced by this patch set.  Three runs were carried
out:

 - base: kernel without the patch
 - patched: patched kernel, the feature is not used
 - patched-active: patched kernel, 1 minute-period daemon is used for
   tracking idle memory

For tracking idle memory, idlememstat utility was used:
https://github.com/locker/idlememstat

testcase            base            patched        patched-active

compiler       537.40 ( 0.00)%   532.26 (-0.96)%   538.31 ( 0.17)%
compress       305.47 ( 0.00)%   301.08 (-1.44)%   300.71 (-1.56)%
crypto         284.32 ( 0.00)%   282.21 (-0.74)%   284.87 ( 0.19)%
derby          411.05 ( 0.00)%   413.44 ( 0.58)%   412.07 ( 0.25)%
mpegaudio      189.96 ( 0.00)%   190.87 ( 0.48)%   189.42 (-0.28)%
scimark.large   46.85 ( 0.00)%    46.41 (-0.94)%    47.83 ( 2.09)%
scimark.small  412.91 ( 0.00)%   415.41 ( 0.61)%   421.17 ( 2.00)%
serial         204.23 ( 0.00)%   213.46 ( 4.52)%   203.17 (-0.52)%
startup         36.76 ( 0.00)%    35.49 (-3.45)%    35.64 (-3.05)%
sunflow        115.34 ( 0.00)%   115.08 (-0.23)%   117.37 ( 1.76)%
xml            620.55 ( 0.00)%   619.95 (-0.10)%   620.39 (-0.03)%

composite      211.50 ( 0.00)%   211.15 (-0.17)%   211.67 ( 0.08)%

time idlememstat:

17.20user 65.16system 2:15:23elapsed 1%CPU (0avgtext+0avgdata 8476maxresident)k
448inputs+40outputs (1major+36052minor)pagefaults 0swaps

==== SCRIPT FOR COUNTING IDLE PAGES PER CGROUP ====
#! /usr/bin/python
#

import os
import stat
import errno
import struct

CGROUP_MOUNT = "/sys/fs/cgroup/memory"
BUFSIZE = 8 * 1024  # must be multiple of 8

def get_hugepage_size():
    with open("/proc/meminfo", "r") as f:
        for s in f:
            k, v = s.split(":")
            if k == "Hugepagesize":
                return int(v.split()[0]) * 1024

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")
HUGEPAGE_SIZE = get_hugepage_size()

def set_idle():
    f = open("/sys/kernel/mm/page_idle/bitmap", "wb", BUFSIZE)
    while True:
        try:
            f.write(struct.pack("Q", pow(2, 64) - 1))
        except IOError as err:
            if err.errno == errno.ENXIO:
                break
            raise
    f.close()

def count_idle():
    f_flags = open("/proc/kpageflags", "rb", BUFSIZE)
    f_cgroup = open("/proc/kpagecgroup", "rb", BUFSIZE)

    with open("/sys/kernel/mm/page_idle/bitmap", "rb", BUFSIZE) as f:
        while f.read(BUFSIZE): pass  # update idle flag

    idlememsz = {}
    while True:
        s1, s2 = f_flags.read(8), f_cgroup.read(8)
        if not s1 or not s2:
            break

        flags, = struct.unpack('Q', s1)
        cgino, = struct.unpack('Q', s2)

        unevictable = (flags >> 18) & 1
        huge = (flags >> 22) & 1
        idle = (flags >> 25) & 1

        if idle and not unevictable:
            idlememsz[cgino] = idlememsz.get(cgino, 0) + \
                (HUGEPAGE_SIZE if huge else PAGE_SIZE)

    f_flags.close()
    f_cgroup.close()
    return idlememsz

if __name__ == "__main__":
    print "Setting the idle flag for each page..."
    set_idle()

    raw_input("Wait until the workload accesses its working set, "
              "then press Enter")

    print "Counting idle pages..."
    idlememsz = count_idle()

    for dir, subdirs, files in os.walk(CGROUP_MOUNT):
        ino = os.stat(dir)[stat.ST_INO]
        print dir + ": " + str(idlememsz.get(ino, 0) / 1024) + " kB"
==== END SCRIPT ====

This patch (of 8):

Add page_cgroup_ino() helper to memcg.

This function returns the inode number of the closest online ancestor of
the memory cgroup a page is charged to.  It is required for exporting
information about which page is charged to which cgroup to userspace,
which will be introduced by a following patch.
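
The "closest online ancestor" lookup can be pictured with a small model.
This is only a sketch under stated assumptions: the Cgroup class, its fields
and the 0 return value for "no online ancestor" are hypothetical stand-ins,
not the kernel data structures.

```python
class Cgroup:
    """Toy stand-in for a memory cgroup with a parent link."""

    def __init__(self, ino, parent=None, online=True):
        self.ino = ino
        self.parent = parent
        self.online = online


def closest_online_ino(cg):
    """Walk up the hierarchy until an online cgroup is found.

    Returns that cgroup's inode number, or 0 if there is none (assumed
    convention for this sketch).
    """
    while cg is not None and not cg.online:
        cg = cg.parent
    return cg.ino if cg is not None else 0
```

A page charged to a cgroup that has since gone offline is thus reported
against the nearest ancestor that still exists from userspace's point of view.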

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agozswap: update docs for runtime-changeable attributes
Dan Streetman [Wed, 9 Sep 2015 22:35:25 +0000 (15:35 -0700)]
zswap: update docs for runtime-changeable attributes

Change the Documentation/vm/zswap.txt doc to indicate that the "zpool" and
"compressor" params are now changeable at runtime.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agozswap: change zpool/compressor at runtime
Dan Streetman [Wed, 9 Sep 2015 22:35:21 +0000 (15:35 -0700)]
zswap: change zpool/compressor at runtime

Update the zpool and compressor parameters to be changeable at runtime.
When changed, a new pool is created with the requested zpool/compressor,
and added as the current pool at the front of the pool list.  Previous
pools remain in the list only so that existing compressed pages can be
removed from them.  The old pool(s) are removed once they become empty.
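
The pool list behaviour described above can be modelled roughly as follows.
This is a toy sketch, not the zswap code: the Pool and PoolList classes and
their method names are hypothetical, and locking and reference counting are
ignored entirely.

```python
class Pool:
    """One zpool/compressor pair plus a count of pages still stored in it."""

    def __init__(self, zpool, compressor):
        self.zpool = zpool
        self.compressor = compressor
        self.entries = 0


class PoolList:
    """The current pool lives at the front; older pools only drain."""

    def __init__(self, zpool, compressor):
        self.pools = [Pool(zpool, compressor)]

    @property
    def current(self):
        return self.pools[0]

    def change(self, zpool, compressor):
        # A parameter change creates a new pool and puts it at the front.
        self.pools.insert(0, Pool(zpool, compressor))
        self._reap()

    def store(self):
        # New pages always go to the current pool.
        self.current.entries += 1
        return self.current

    def free(self, pool):
        pool.entries -= 1
        self._reap()

    def _reap(self):
        # Drop old pools once they are empty, but never the current one.
        self.pools = [self.pools[0]] + [p for p in self.pools[1:] if p.entries > 0]
```

In this model a parameter change never migrates existing pages; the old pool
simply disappears once its last page has been freed.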

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Seth Jennings <sjennings@variantweb.net>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agozswap: dynamic pool creation
Dan Streetman [Wed, 9 Sep 2015 22:35:19 +0000 (15:35 -0700)]
zswap: dynamic pool creation

Add dynamic creation of pools.  Move the static crypto compression per-cpu
transforms into each pool.  Add a pointer to zswap_entry to the pool it's
in.

This is required by the following patch which enables changing the zswap
zpool and compressor params at runtime.

[akpm@linux-foundation.org: fix merge snafus]
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Seth Jennings <sjennings@variantweb.net>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agozpool: add zpool_has_pool()
Dan Streetman [Wed, 9 Sep 2015 22:35:16 +0000 (15:35 -0700)]
zpool: add zpool_has_pool()

This series makes creation of the zpool and compressor dynamic, so that
they can be changed at runtime.  This makes using/configuring zswap
easier, as before this zswap had to be configured at boot time, using boot
params.

This uses a single list to track both the zpool and compressor together,
although Seth had mentioned an alternative which is to track the zpools
and compressors using separate lists.  In the most common case, only a
single zpool and single compressor, using one list is slightly simpler
than using two lists, and for the uncommon case of multiple zpools and/or
compressors, using one list is slightly less simple (and uses slightly
more memory, probably) than using two lists.

This patch (of 4):

Add zpool_has_pool() function, indicating if the specified type of zpool
is available (i.e.  zsmalloc or zbud).  This allows checking if a pool is
available, without actually trying to allocate it, similar to
crypto_has_alg().

This is used by a following patch to zswap that enables the dynamic
runtime creation of zswap zpools.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Seth Jennings <sjennings@variantweb.net>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agotcp_cubic: better follow cubic curve after idle period
Eric Dumazet [Thu, 10 Sep 2015 04:55:07 +0000 (21:55 -0700)]
tcp_cubic: better follow cubic curve after idle period

Jana Iyengar found an interesting issue on CUBIC :

The epoch is only updated/reset initially and when experiencing losses.
The delta "t" of now - epoch_start can be arbitrarily large after app idle,
as can the bic_target. Consequently the slope (inverse of
ca->cnt) would be really large, and eventually ca->cnt would be
lower-bounded to 2 to have delayed-ACK slow-start behavior.

This particularly shows up when slow_start_after_idle is disabled
as a dangerous cwnd inflation (1.5 x RTT) after a few seconds of idle
time.

Jana's initial fix was to reset epoch_start if app limited,
but Neal pointed out it would ask the CUBIC algorithm to recalculate the
curve so that we again start growing steeply upward from where cwnd is
now (as CUBIC does just after a loss). Ideally we'd want the cwnd growth
curve to be the same shape, just shifted later in time by the amount of
the idle period.
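
The idea of shifting the curve in time can be sketched with a toy model of
the CUBIC window function W(t) = C*(t - K)^3 + W_max.  Everything here is an
assumption made for illustration: the constant C = 0.4 is the commonly cited
CUBIC value, the function names are hypothetical, and time is in plain
seconds rather than the kernel's units.

```python
C = 0.4  # CUBIC scaling constant (assumed, not taken from this commit)


def cubic_window(t, w_max, k):
    """Window target t seconds after the epoch start."""
    return C * (t - k) ** 3 + w_max


def shift_epoch(epoch_start, now, idle):
    """Move the epoch later by the idle time, but never past `now`."""
    return min(epoch_start + idle, now)


# Sender was active for 2 s, then idle for 10 s; it is now t = 12 s.
epoch_start, now, idle = 0, 12, 10
unshifted = cubic_window(now - epoch_start, w_max=100, k=3)
shifted = cubic_window(now - shift_epoch(epoch_start, now, idle), w_max=100, k=3)
```

With the shift, the sender resumes at the point on the curve where it went
idle; without it, the target is computed as if the curve had kept advancing
throughout the idle period.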

Reported-by: Jana Iyengar <jri@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Sangtae Ha <sangtae.ha@gmail.com>
Cc: Lawrence Brakmo <lawrence@brakmo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agotcp: generate CA_EVENT_TX_START on data frames
Neal Cardwell [Thu, 10 Sep 2015 04:54:37 +0000 (21:54 -0700)]
tcp: generate CA_EVENT_TX_START on data frames

Issuing a CC TX_START event on control frames like pure ACK
is a waste of time, as a CC should not care.

Following patch needs this change, as we want CUBIC to properly track
idle time at a low cost, with a single TX_START being generated.

Yuchung might slightly refine the condition triggering TX_START
on a followup patch.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Cc: Jana Iyengar <jri@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Sangtae Ha <sangtae.ha@gmail.com>
Cc: Lawrence Brakmo <lawrence@brakmo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoxen-netfront: respect user provided max_queues
Wei Liu [Thu, 10 Sep 2015 10:18:58 +0000 (11:18 +0100)]
xen-netfront: respect user provided max_queues

Originally that parameter was always reset to num_online_cpus during
module initialisation, which renders it useless.

The fix is to only set max_queues to num_online_cpus when user has not
provided a value.
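
In other words, the default only applies when nothing was supplied.  A
minimal sketch of that rule, assuming a value of 0 means "not set by the
user" (the function name is hypothetical):

```python
def effective_max_queues(user_value, online_cpus):
    """Fall back to the CPU count only when the user gave no value (0)."""
    return user_value if user_value else online_cpus
```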

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Tested-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoxen-netback: respect user provided max_queues
Wei Liu [Thu, 10 Sep 2015 10:18:57 +0000 (11:18 +0100)]
xen-netback: respect user provided max_queues

Originally that parameter was always reset to num_online_cpus during
module initialisation, which renders it useless.

The fix is to only set max_queues to num_online_cpus when user has not
provided a value.

Reported-by: Johnny Strom <johnny.strom@linuxsolutions.fi>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agor8169: Fix sleeping function called during get_stats64, v2
Corinna Vinschen [Thu, 10 Sep 2015 08:47:35 +0000 (10:47 +0200)]
r8169: Fix sleeping function called during get_stats64, v2

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104031
Fixes: 6e85d5ad36a26debc23a9a865c029cbe242b2dc8
Based on the discussion starting at
http://www.spinics.net/lists/netdev/msg342193.html

Tested locally on RTL8168evl/8111evl with various concurrent processes
accessing /proc/net/dev while changing the link state as well as
removing/reloading the r8169 module.

Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: poma <pomidorabelisima@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrm/i915: Allow DSI dual link to be configured on any pipe
Gaurav K Singh [Mon, 3 Aug 2015 10:15:32 +0000 (15:45 +0530)]
drm/i915: Allow DSI dual link to be configured on any pipe

Just as for single link MIPI panels, the pipe to be configured for dual
link panels is based on the DVO port from VBT Block 2. In hardware,
Port A is mapped with Pipe A and Port C is mapped with Pipe B.

This issue got introduced in -

commit 7e9804fdcffc650515c60f524b8b2076ee59e710
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Fri Jan 16 14:27:23 2015 +0200

    drm/i915/dsi: add drm mipi dsi host support

Cc: stable@vger.kernel.org # v4.0
Signed-off-by: Gaurav K Singh <gaurav.k.singh@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
8 years agodrm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOS
Ville Syrjälä [Tue, 8 Sep 2015 18:05:12 +0000 (21:05 +0300)]
drm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOS

If one disables DDR DVFS in the BIOS, Punit will apparently ignore
all DDR DVFS requests. Currently we assume that DDR DVFS is always
operational, which leads to errors in dmesg when the DDR DVFS requests
time out.

Fix the problem by gently prodding Punit during driver load to find out
whether it will respond to DDR DVFS requests. If the request times out,
we assume that DDR DVFS has been permanently disabled in the BIOS and
no longer pester the Punit about it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91629
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Tested-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
8 years agodrm/i915: Fix CSR MMIO address check
Takashi Iwai [Wed, 9 Sep 2015 14:52:09 +0000 (16:52 +0200)]
drm/i915: Fix CSR MMIO address check

Fix a wrong logical AND (&&) used for the range check of CSR MMIO.

Spotted nicely by gcc -Wlogical-op flag:
  drivers/gpu/drm/i915/intel_csr.c: In function ‘finish_csr_load’:
  drivers/gpu/drm/i915/intel_csr.c:353:41: warning: logical ‘and’ of mutually exclusive tests is always false [-Wlogical-op]
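
Why such a check can never fire is easy to see in a small model.  This is a
sketch only: the range bounds and function names below are hypothetical and
are not copied from intel_csr.c.

```python
START, END = 0x80000, 0x8FFFF  # hypothetical range bounds


def out_of_range_buggy(addr):
    # No address is both below START and above END, so this is always False.
    return addr < START and addr > END


def out_of_range_fixed(addr):
    # An address is out of range if it falls on either side of the window.
    return addr < START or addr > END
```

With the logical AND, every address passes the range check, including ones
well outside the window; the logical OR rejects them as intended.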

Fixes: eb805623d8b1 ('drm/i915/skl: Add support to load SKL CSR firmware.')
Cc: <stable@vger.kernel.org> # v4.2
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
8 years agoether: add IEEE 1722 ethertype - TSN
Henrik Austad [Wed, 9 Sep 2015 10:25:17 +0000 (12:25 +0200)]
ether: add IEEE 1722 ethertype - TSN

IEEE 1722 describes AVB (later renamed to TSN - Time Sensitive
Networking), a protocol, encapsulation and synchronization to utilize
standard networks for audio/video (and later other time-sensitive)
streams.

This standard uses ethertype 0x22F0.

http://standards.ieee.org/develop/regauth/ethertype/eth.txt

This is a respin of a previous patch ("ether: add AVB frame type
ETH_P_AVB")

CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
CC: linux-api@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Henrik Austad <henrik@austad.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoelf-em.h: move EM_MICROBLAZE to the common header
Mike Frysinger [Tue, 18 Aug 2015 07:28:01 +0000 (03:28 -0400)]
elf-em.h: move EM_MICROBLAZE to the common header

The linux/audit.h header uses EM_MICROBLAZE in order to define
AUDIT_ARCH_MICROBLAZE, but it's only available in the microblaze
asm headers.  Move it to the common elf-em.h header so that the
define can be used on non-microblaze systems.  Otherwise we get
build errors that EM_MICROBLAZE isn't defined when we try to use
the AUDIT_ARCH_MICROBLAZE symbol.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
8 years agonetlink, mmap: fix edge-case leakages in nf queue zero-copy
Daniel Borkmann [Thu, 10 Sep 2015 00:10:57 +0000 (02:10 +0200)]
netlink, mmap: fix edge-case leakages in nf queue zero-copy

When netlink mmap on receive side is the consumer of nf queue data,
it can happen that in some edge cases, we write skb shared info into
the user space mmap buffer:

Assume a possible rx ring frame size of only 4096, and the network skb,
which is being zero-copied into the netlink skb, contains page frags
with an overall skb->len larger than the linear part of the netlink
skb.

skb_zerocopy(), which is generic and thus not aware of the fact that
shared info cannot be accessed for such skbs then tries to write and
fill frags, thus leaking kernel data/pointers and in some corner cases
possibly writing out of bounds of the mmap area (when filling the
last slot in the ring buffer this way).

I.e. the ring buffer slot is then of status NL_MMAP_STATUS_VALID, has
an advertised length larger than 4096, where the linear part is visible
at the slot beginning, and the leaked sizeof(struct skb_shared_info)
has been written to the beginning of the next slot (also corrupting
the struct nl_mmap_hdr slot header incl. status etc), since skb->end
points to skb->data + ring->frame_size - NL_MMAP_HDRLEN.

The fix adds and lets __netlink_alloc_skb() take the actual needed
linear room for the network skb + meta data into account. It's completely
irrelevant for non-mmaped netlink sockets, but in case mmap sockets
are used, it can be decided whether the available skb_tailroom() is
really large enough for the buffer, or whether it needs to internally
fallback to a normal alloc_skb().

From nf queue side, the information whether the destination port is
an mmap RX ring is not really available without extra port-to-socket
lookup, thus it can only be determined in lower layers i.e. when
__netlink_alloc_skb() is called that checks internally for this. I
chose to add the extra ldiff parameter as mmap will then still work:
We have data_len and hlen in nfqnl_build_packet_message(), data_len
is the full length (capped at queue->copy_range) for skb_zerocopy()
and hlen some possible part of data_len that needs to be copied; the
rem_len variable indicates the needed remaining linear mmap space.

The only other workaround in nf queue internally would be after
allocation time by f.e. cap'ing the data_len to the skb_tailroom()
iff we deal with an mmap skb, but that would 1) expose the fact that
we use a mmap skb to upper layers, and 2) trim the skb where we
otherwise could just have moved the full skb into the normal receive
queue.

After the patch, in my test case the ring slot doesn't fit and therefore
shows NL_MMAP_STATUS_COPY, where a full skb carries all the data and
thus needs to be picked up via recv().

Fixes: 3ab1f683bf8b ("nfnetlink: add support for memory mapped netlink")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonetlink, mmap: don't walk rx ring on poll if receive queue non-empty
Daniel Borkmann [Wed, 9 Sep 2015 23:20:46 +0000 (01:20 +0200)]
netlink, mmap: don't walk rx ring on poll if receive queue non-empty

In case of netlink mmap, there can be situations where received frames
have to be placed into the normal receive queue. The ring buffer indicates
this through NL_MMAP_STATUS_COPY, so the user is asked to pick them up
via recvmsg(2) syscall, and to put the slot back to NL_MMAP_STATUS_UNUSED.

Commit 0ef707700f1c ("netlink: rx mmap: fix POLLIN condition") changed
polling, so that we walk in the worst case the whole ring through the
new netlink_has_valid_frame(), for example, when the ring would have no
NL_MMAP_STATUS_VALID, but at least one NL_MMAP_STATUS_COPY frame.

Since we do a datagram_poll() already earlier to pick up a mask that could
possibly contain POLLIN | POLLRDNORM already (due to NL_MMAP_STATUS_COPY),
we can skip checking the rx ring entirely.

In case the kernel is compiled with !CONFIG_NETLINK_MMAP, then all this is
irrelevant anyway as netlink_poll() is just defined as datagram_poll().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agocxgb4: changes for new firmware 1.14.4.0
Hariprasad Shenai [Thu, 10 Sep 2015 04:25:13 +0000 (09:55 +0530)]
cxgb4: changes for new firmware 1.14.4.0

Incorporate fw_ldst_cmd structure change for new firmware and also
update version string for the same

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: fec: add netif status check before set mac address
Nimrod Andy [Thu, 10 Sep 2015 01:35:39 +0000 (09:35 +0800)]
net: fec: add netif status check before set mac address

The following sequence causes a system hang:
ifconfig eth0 down
ifconfig eth0 hw ether 00:10:19:19:81:19

After eth0 goes down, all fec clocks are gated off. In the .fec_set_mac_address()
function, the new MAC address is written to the registers, which hangs the system.

So add a netif status check to avoid register access while the clocks are
gated off. The new MAC address is written into the related registers once eth0 is up.

V2:
As Lucas Stach suggested, add a comment in the code to explain why it is needed.

CC: Lucas Stach <l.stach@pengutronix.de>
CC: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'r8152-autoresume'
David S. Miller [Thu, 10 Sep 2015 03:27:54 +0000 (20:27 -0700)]
Merge branch 'r8152-autoresume'

Hayes Wang says:

====================
r8152: fix the autoresume may fail

Fix the autosuspend issues that occur on link changes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agor8152: fix the runtime suspend issues
hayeswang [Mon, 7 Sep 2015 03:57:44 +0000 (11:57 +0800)]
r8152: fix the runtime suspend issues

Fix the runtime suspend issues that result from link changes.

Case 1:
a) link down occurs.
b) driver disables tx/rx.
c) autosuspend occurs.
d) hw linking up.
e) device suspends without enabling tx/rx.
f) couldn't wake up when receiving packets.

Case 2:
a) Nway results in linking down.
b) autosuspend occurs.
c) device suspends.
d) device may not wake up when linking up.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agor8152: split DRIVER_VERSION
hayeswang [Mon, 7 Sep 2015 03:57:43 +0000 (11:57 +0800)]
r8152: split DRIVER_VERSION

Split DRIVER_VERSION into NETNEXT_VERSION and NET_VERSION. Then,
according to the value of DRIVER_VERSION, we could know which
patches are used generally without comparing the source code.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoipv6: fix ifnullfree.cocci warnings
Wu Fengguang [Wed, 9 Sep 2015 22:57:12 +0000 (06:57 +0800)]
ipv6: fix ifnullfree.cocci warnings

net/ipv6/route.c:2946:3-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.

 NULL check before some freeing functions is not needed.

 Based on checkpatch warning
 "kfree(NULL) is safe this check is probably not required"
 and kfreeaddr.cocci by Julia Lawall.

Generated by: scripts/coccinelle/free/ifnullfree.cocci

CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoadd microchip LAN88xx phy driver
Woojung.Huh@microchip.com [Wed, 9 Sep 2015 20:49:53 +0000 (20:49 +0000)]
add microchip LAN88xx phy driver

Add Microchip LAN88XX phy driver for phylib.

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agostmmac: fix check for phydev being open
Alexey Brodkin [Wed, 9 Sep 2015 15:01:08 +0000 (18:01 +0300)]
stmmac: fix check for phydev being open

The current check of phydev with IS_ERR(phydev) may not make much sense
because of_phy_connect() returns NULL on failure instead of an error value.

Still, for checking the result of phy_connect(), IS_ERR() makes perfect sense.

So let's use the combined check IS_ERR_OR_NULL(), which covers both cases.
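
As an illustration of why the combined check is needed, here is a minimal
userspace sketch of the error-pointer convention. The lower-case helpers
err_ptr(), is_err() and is_err_or_null() are stand-ins written for this
example, not the kernel's own definitions. A NULL pointer is not an error
pointer, so a check using only is_err() lets NULL through.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <errno.h>

/* Userspace stand-ins for the kernel's error-pointer helpers (illustrative). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	/* Encode a negative errno value in the pointer itself. */
	return (void *)(intptr_t)error;
}

static inline bool is_err(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool is_err_or_null(const void *ptr)
{
	/* Covers both failure styles: NULL and an encoded errno. */
	return ptr == NULL || is_err(ptr);
}
```

With these helpers, is_err(NULL) is false while is_err_or_null(NULL) is
true, which is the gap the commit closes.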

Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: qlcnic: delete redundant memsets
Rasmus Villemoes [Wed, 9 Sep 2015 08:38:05 +0000 (10:38 +0200)]
net: qlcnic: delete redundant memsets

In all cases, mbx->req.arg and mbx->rsp.arg have just been allocated
using kcalloc(), so these six memsets are redundant.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: mv643xx_eth: use kzalloc
Rasmus Villemoes [Wed, 9 Sep 2015 08:38:04 +0000 (10:38 +0200)]
net: mv643xx_eth: use kzalloc

The double memset is a little ugly; using kzalloc avoids it altogether.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: jme: use kzalloc() instead of kmalloc+memset
Rasmus Villemoes [Wed, 9 Sep 2015 08:38:03 +0000 (10:38 +0200)]
net: jme: use kzalloc() instead of kmalloc+memset

Using kzalloc saves a tiny bit on .text.
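
The pattern being replaced can be sketched in userspace, with malloc() and
calloc() standing in for kmalloc() and kzalloc(). The structure and function
names below are hypothetical and are not taken from the driver.

```c
#include <stdlib.h>
#include <string.h>

struct ring_state {		/* hypothetical structure for illustration */
	int head;
	int tail;
	unsigned char buf[64];
};

/* Old pattern: allocate, then clear in a separate step. */
static struct ring_state *ring_alloc_two_step(void)
{
	struct ring_state *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	memset(r, 0, sizeof(*r));
	return r;
}

/* New pattern: a single call that returns zeroed memory. */
static struct ring_state *ring_alloc_zeroed(void)
{
	return calloc(1, sizeof(struct ring_state));
}
```

Both functions return identical zero-filled objects; the second simply has
one call site fewer, which is where the small .text saving comes from.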

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: cavium: liquidio: use kzalloc in setup_glist()
Rasmus Villemoes [Wed, 9 Sep 2015 08:38:02 +0000 (10:38 +0200)]
net: cavium: liquidio: use kzalloc in setup_glist()

We save a little .text and get rid of the sizeof(...) style
inconsistency.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge tag 'qcom-soc-for-4.3-rc2' of git://codeaurora.org/quic/kernel/agross-msm into...
Kevin Hilman [Wed, 9 Sep 2015 23:15:34 +0000 (16:15 -0700)]
Merge tag 'qcom-soc-for-4.3-rc2' of git://codeaurora.org/quic/kernel/agross-msm into next/late

Qualcomm ARM Based SoC Updates for 4.3-rc2

* Fix errant private access in SMEM
* Fix use of correct remote processor ID in SMD transactions
* Correct SMD fBLOCKREADINTR handling

* tag 'qcom-soc-for-4.3-rc2' of git://codeaurora.org/quic/kernel/agross-msm:
  soc: qcom: smd: Correct fBLOCKREADINTR handling
  soc: qcom: smd: Use correct remote processor ID
  soc: qcom: smem: Fix errant private access
  devicetree: soc: Add Qualcomm SMD based RPM DT binding
  soc: qcom: Driver for the Qualcomm RPM over SMD
  soc: qcom: Add Shared Memory Driver
  soc: qcom: Add device tree binding for Shared Memory Device
  drivers: qcom: Select QCOM_SCM unconditionally for QCOM_PM
  soc: qcom: Add Shared Memory Manager driver

8 years agoMerge tag 'qcom-dt-for-4.3-rc2' of git://codeaurora.org/quic/kernel/agross-msm into...
Kevin Hilman [Wed, 9 Sep 2015 23:15:19 +0000 (16:15 -0700)]
Merge tag 'qcom-dt-for-4.3-rc2' of git://codeaurora.org/quic/kernel/agross-msm into next/late

Qualcomm ARM Based Device Tree Updates for v4.3-rc2

* Add labels for serial nodes to be used for aliasing and stdout-path
* Add stdout-path for APQ8064 Compulab QS600
* Add stdout-path for APQ8064 Inforce 6410
* Add stdout-path for APQ8074 Dragonboard
* Add stdout-path for APQ8084 Inforce 6540
* Add stdout-path for APQ8084 MTP
* Add stdout-path for IPQ8064 AP148
* Add stdout-path for MSM8660 Surf
* Add stdout-path for MSM8960 CDP
* Add stdout-path for MSM8974 Xperia Honami

* tag 'qcom-dt-for-4.3-rc2' of git://codeaurora.org/quic/kernel/agross-msm: (24 commits)
  ARM: dts: qcom: msm8974-sony-xperia-honami: Use stdout-path
  ARM: dts: qcom: msm8960-cdp: Use stdout-path
  ARM: dts: qcom: msm8660-surf: Use stdout-path
  ARM: dts: qcom: ipq8064-ap148: Use stdout-path
  ARM: dts: qcom: apq8084-mtp: Use stdout-path
  ARM: dts: qcom: apq8084-ifc6540: Use stdout-path
  ARM: dts: qcom: apq8074-dragonboard: Use stdout-path
  ARM: dts: qcom: apq8064-ifc6410: Use stdout-path
  ARM: dts: qcom: apq8064-cm-qs600: Use stdout-path
  ARM: dts: qcom: Label serial nodes for aliasing and stdout-path
  ARM: dts: qs600: Add real regulators to sdcc
  ARM: dts: ifc6410: add real regulators for sdcc nodes.
  ARM: dts: apq8064: remove temporary fixed regulator for mmc
  ARM: dts: apq8064: fix missing gsbi cell-index
  ARM: dts: apq8064: Add DT support for GSBI6 and for UART pin mux
  ARM: dts: apq8064: add pm8921 mpp support
  ARM: dts: apq8064: Add pm8921 mfd and its gpio node
  ARM: dts: msm8974: Add smem reservation and node
  ARM: dts: msm8974: Add tcsr mutex node
  ARM: dts: qcom: Add ks8851 node for wired ethernet
  ...

8 years agoMerge branch 'next/defconfig' into next/late
Kevin Hilman [Wed, 9 Sep 2015 23:07:41 +0000 (16:07 -0700)]
Merge branch 'next/defconfig' into next/late

* next/defconfig: (45 commits)
  ARM: multi_v7_defconfig: Enable PBIAS regulator
  ARM: add TC2 PM support to multi_v7_defconfig
  ARM: tegra: Update multi_v7_defconfig
  ARM: tegra: Update default configuration
  ARM: at91/defconfig: at91_dt: remove ARM_AT91_ETHER
  ARM: at91/defconfig: at91_dt: enable DRM hlcdc support
  ARM: at91: at91_dt_defconfig: enable ISI and ov2640 support
  ARM: multi_v7_defconfig: Enable Allwinner P2WI, PWM, DMA_SUN6I, cryptodev
  ARM: sunxi_defconfig: Enable DMA_SUN6I, P2WI, PWM, cryptodev, EXTCON, FHANDLE
  ARM: shmobile: Enable fixed voltage regulator in shmobile_defconfig
  ARM: multi_v7_defconfig: Select MX6UL and MX7D
  ARM: prima2_defconfig: enable build for hwspinlock
  ARM: prima2_defconfig: enable build for RTC
  ARM: prima2_defconfig: enable build for misc input
  ARM: prima2_defconfig: enable build for SiRFSoC SDHC host
  ARM: prima2_defconfig: fix the outdated defconfig
  ARM: imx_v6_v7_defconfig: Select CONFIG_IKCONFIG_PROC
  ARM: defconfig: orion5x: add DT support
  ARM: qcom_defconfig: Enable options for KS8851 ethernet
  ARM: multi_v7_defconfig: Enable support for PWM Regulators
  ...

8 years agoARM: multi_v7_defconfig: Enable PBIAS regulator
Kishon Vijay Abraham I [Fri, 4 Sep 2015 12:13:15 +0000 (17:43 +0530)]
ARM: multi_v7_defconfig: Enable PBIAS regulator

The PBIAS regulator is required for the MMC module in OMAP2, OMAP3, OMAP4,
OMAP5 and DRA7 SoCs. Enable it here.

Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
8 years agoMerge branch 'drivers/reset' into next/late
Kevin Hilman [Wed, 9 Sep 2015 22:42:45 +0000 (15:42 -0700)]
Merge branch 'drivers/reset' into next/late

* drivers/reset:
  reset: ath79: Fix missing spin_lock_init
  reset: Add (devm_)reset_control_get stub functions
  reset: reset-zynq: Adding support for Xilinx Zynq reset controller.
  docs: dts: Added documentation for Xilinx Zynq Reset Controller bindings.
  MIPS: ath79: Add the reset controller to the AR9132 dtsi
  reset: Add a driver for the reset controller on the AR71XX/AR9XXX
  devicetree: Add bindings for the ATH79 reset controller
  reset: socfpga: Update reset-socfpga to read the altr,modrst-offset property
  doc: dt: add documentation for lpc1850-rgu reset driver
  reset: add driver for lpc18xx rgu
  reset: sti: constify of_device_id array
  ARM: STi: DT: Move reset controller constants into common location
  MAINTAINERS: add include/dt-bindings/reset path to reset controller entry

8 years agoMerge tag 'reset-for-4.3-fixes' of git://git.pengutronix.de/git/pza/linux into driver...
Kevin Hilman [Wed, 9 Sep 2015 22:41:42 +0000 (15:41 -0700)]
Merge tag 'reset-for-4.3-fixes' of git://git.pengutronix.de/git/pza/linux into drivers/reset

Merge "Reset controller fixes for v4.3" from Philipp Zabel:

Reset controller fixes for v4.3

- added stubs to avoid build breakage in COMPILE_TEST
  configurations with RESET_CONTROLLER disabled
- fixed missing spinlock initialization in ath79 driver

* tag 'reset-for-4.3-fixes' of git://git.pengutronix.de/git/pza/linux:
  reset: ath79: Fix missing spin_lock_init
  reset: Add (devm_)reset_control_get stub functions

8 years agonet: ipv6: use common fib_default_rule_pref
Phil Sutter [Wed, 9 Sep 2015 12:20:56 +0000 (14:20 +0200)]
net: ipv6: use common fib_default_rule_pref

This switches IPv6 policy routing to use the shared
fib_default_rule_pref() function of IPv4 and DECnet. It is also used in
multicast routing for IPv4 as well as IPv6.

The motivation for this patch is a complaint about iproute2 behaving
inconsistently between IPv4 and IPv6 when adding policy rules: Formerly,
IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the
assigned priority value was decreased with each rule added.

Since all users of the default_pref field have now been converted to
assign the generic function fib_default_rule_pref(), fib_nl_newrule()
may just use it directly instead. Therefore get rid of the function
pointer altogether and make fib_default_rule_pref() static, as it's not
used outside fib_rules.c anymore.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethoc: Remove unnecessary #ifdef CONFIG_OF
Tobias Klauser [Wed, 9 Sep 2015 09:24:29 +0000 (11:24 +0200)]
net: ethoc: Remove unnecessary #ifdef CONFIG_OF

For !CONFIG_OF, of_get_property() is defined to always return NULL. Thus
there's no need to protect the call to of_get_property() with #ifdef
CONFIG_OF.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: dsa: bcm_sf2: Fix 64-bits register writes
Florian Fainelli [Wed, 9 Sep 2015 03:06:41 +0000 (20:06 -0700)]
net: dsa: bcm_sf2: Fix 64-bits register writes

The macro to write 64-bit quantities to the 32-bit registers swapped
the value and offset arguments. We want to preserve the ordering of the
arguments with respect to how writel() is implemented, for instance:
value first, offset/base second.
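
A minimal sketch of the intended argument order, using a plain array in
place of memory-mapped registers. The names fake_regs, reg_writel() and
reg_writeq() are hypothetical, and the low-word-at-lower-offset layout is an
assumption of this sketch rather than a description of the hardware.

```c
#include <stdint.h>

/* Hypothetical register file standing in for memory-mapped registers. */
static uint32_t fake_regs[4];

/* Same argument order as writel(): value first, then offset. */
static void reg_writel(uint32_t val, unsigned int off)
{
	fake_regs[off / 4] = val;
}

/*
 * 64-bit write split into two 32-bit writes, keeping value-first order.
 * Placing the low word at the lower offset is an assumption of this sketch.
 */
static void reg_writeq(uint64_t val, unsigned int off)
{
	reg_writel((uint32_t)val, off);
	reg_writel((uint32_t)(val >> 32), off + 4);
}
```

Because both arguments are integers, swapping them compiles without
complaint, which is how such a bug can go unnoticed.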

Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobpf: fix out of bounds access in verifier log
Alexei Starovoitov [Tue, 8 Sep 2015 20:40:01 +0000 (13:40 -0700)]
bpf: fix out of bounds access in verifier log

When the verifier log is enabled, print_bpf_insn() does
bpf_alu_string[BPF_OP(insn->code) >> 4]
and
bpf_jmp_string[BPF_OP(insn->code) >> 4]
where BPF_OP is a 4-bit instruction opcode.
Malformed insns can cause out of bounds access.
Fix it by sizing arrays appropriately.
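
The idea of sizing the arrays appropriately can be sketched as follows: with
a 4-bit index, a table of 16 entries can never be indexed out of bounds,
whatever the instruction contains. The table contents and the names
alu_names and alu_name() are illustrative, not the kernel's.

```c
#include <stddef.h>

/* Classic BPF encoding: the operation lives in the upper four bits. */
#define OP_FIELD(code) ((code) & 0xf0)

/*
 * Sized to 16 so that every 4-bit index is in bounds. Entries that are
 * not listed are implicitly NULL.
 */
static const char *const alu_names[16] = {
	[0x0] = "+=",
	[0x1] = "-=",
	[0x2] = "*=",
	[0x3] = "/=",
};

static const char *alu_name(unsigned char code)
{
	const char *name = alu_names[OP_FIELD(code) >> 4];

	/* Unknown operations get a placeholder instead of a wild read. */
	return name ? name : "unknown";
}
```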

The bug was found by clang address sanitizer with libfuzzer.

Reported-by: Yonghong Song <yhs@plumgrid.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoipv6: fix multipath route replace error recovery
Roopa Prabhu [Tue, 8 Sep 2015 17:53:04 +0000 (10:53 -0700)]
ipv6: fix multipath route replace error recovery

Problem:
The ecmp route replace support for ipv6 in the kernel deletes the
existing ecmp route too early, i.e. when it installs the first nexthop.
If there is an error in installing the subsequent nexthops, it's too late
to recover the already deleted existing route, leaving the fib
in an inconsistent state.

This patch reduces the possibility of this by doing the following:
a) Changes the existing multipath route add code to a two stage process:
  build rt6_infos + insert them
ip6_route_add rt6_info creation code is moved into
ip6_route_info_create.
b) This ensures that most errors are caught during building rt6_infos
  and we fail early
c) Separates multipath add and del code. Because add needs the special
  two stage mode in a) and delete essentially does not care.
d) In any event if the code fails during inserting a route again, a
  warning is printed (This should be unlikely)
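
The two stage idea in a) and b) can be sketched in a simplified form: build
and validate everything first, and touch the existing state only once that
has succeeded. The types and functions below are hypothetical. In particular,
detecting a duplicate nexthop during the build stage is an assumption of this
sketch, not a statement about where the kernel detects it.

```c
#include <stdbool.h>

#define MAX_HOPS 8

struct route {			/* hypothetical stand-in for a multipath route */
	int hops[MAX_HOPS];
	int nhops;
};

/* Stage 1: validate and build the new hop list without touching the route. */
static bool build_hops(const int *req, int n, int *out)
{
	if (n <= 0 || n > MAX_HOPS)
		return false;
	for (int i = 0; i < n; i++) {
		for (int j = 0; j < i; j++)
			if (req[j] == req[i])
				return false;	/* duplicate nexthop */
		out[i] = req[i];
	}
	return true;
}

/*
 * Stage 2 runs only if stage 1 succeeded, so a bad request leaves the
 * existing route untouched.
 */
static bool route_replace(struct route *rt, const int *req, int n)
{
	int staged[MAX_HOPS];

	if (!build_hops(req, n, staged))
		return false;
	for (int i = 0; i < n; i++)
		rt->hops[i] = staged[i];
	rt->nhops = n;
	return true;
}
```

In this simplified model the insert stage cannot fail; in the kernel it
still can, which is why d) only prints a warning for that case.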

Before the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024

/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists

$ip -6 route show
/* previously added ecmp route 3000:1000:1000:1000::2 disappears from
 * kernel */

After the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024

/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists

$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024

Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosoc: qcom: smd: Correct fBLOCKREADINTR handling
Bjorn Andersson [Mon, 24 Aug 2015 20:38:46 +0000 (13:38 -0700)]
soc: qcom: smd: Correct fBLOCKREADINTR handling

fBLOCKREADINTR is masking the notification from the remote and should
hence be cleared while we're waiting for the tx fifo to drain. Also change
the reset state to mask the notification, as send is the only use case
where we're interested in it.

Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Andy Gross <agross@codeaurora.org>
8 years agosoc: qcom: smd: Use correct remote processor ID
Andy Gross [Wed, 26 Aug 2015 19:42:45 +0000 (14:42 -0500)]
soc: qcom: smd: Use correct remote processor ID

This patch fixes SMEM addressing issues when remote processors need to use
secure SMEM partitions.

Signed-off-by: Andy Gross <agross@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
8 years agosoc: qcom: smem: Fix errant private access
Andy Gross [Wed, 12 Aug 2015 04:48:15 +0000 (23:48 -0500)]
soc: qcom: smem: Fix errant private access

This patch corrects private partition item access.  Instead of falling back to
global for instances where we have an actual host and remote partition existing,
return the results of the private lookup.

Signed-off-by: Andy Gross <agross@codeaurora.org>
8 years agoMerge tag 'qcom-soc-for-4.3' into v4.2-rc2
Andy Gross [Wed, 9 Sep 2015 20:56:35 +0000 (15:56 -0500)]
Merge tag 'qcom-soc-for-4.3' into v4.2-rc2

Qualcomm ARM Based SoC Updates for 4.3

* Add SMEM driver
* Add SMD driver
* Add RPM over SMD driver
* Select QCOM_SCM by default

8 years agointel_pstate: fix PCT_TO_HWP macro
Kristen Carlson Accardi [Wed, 9 Sep 2015 18:41:22 +0000 (11:41 -0700)]
intel_pstate: fix PCT_TO_HWP macro

PCT_TO_HWP does not take the actual range of pstates exported
by HWP_CAPABILITIES into account, and is broken on most platforms.
Remove the macro and set the min and max pstate for hwp by
determining the range and adjusting by the min and max percent
limits values.
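
One way to read "determining the range and adjusting by the percent limits"
is the arithmetic below. This is a sketch under that reading; the function
name pct_to_pstate() and its parameters are hypothetical, and the driver's
exact computation is not shown in this message.

```c
/*
 * Map a percentage onto the hardware range [hw_min, hw_max].
 * Integer division truncates toward zero.
 */
static int pct_to_pstate(int hw_min, int hw_max, int pct)
{
	int range = hw_max - hw_min;

	return hw_min + range * pct / 100;
}
```

For a hardware range of 8 to 32, 0 percent maps to 8, 50 percent to 20 and
100 percent to 32, so the result always stays inside the exported range.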

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
8 years agointel_pstate: Fix user input of min/max to legal policy region
Chen Yu [Wed, 9 Sep 2015 10:27:31 +0000 (18:27 +0800)]
intel_pstate: Fix user input of min/max to legal policy region

In the current code, improper user input can make max_perf_pct
smaller than min_perf_pct:

$ grep . /sys/devices/system/cpu/intel_pstate/m*_perf_pct
/sys/devices/system/cpu/intel_pstate/max_perf_pct:100
/sys/devices/system/cpu/intel_pstate/min_perf_pct:100

$ echo 80 > /sys/devices/system/cpu/intel_pstate/max_perf_pct

$ grep . /sys/devices/system/cpu/intel_pstate/m*_perf_pct
/sys/devices/system/cpu/intel_pstate/max_perf_pct:80
/sys/devices/system/cpu/intel_pstate/min_perf_pct:100

Fix this problem in two steps:
 1. Normalize the user input to [min_policy, max_policy].
 2. Make sure max_perf_pct>=min_perf_pct, suggested by Seiichi Ikarashi.
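
The two steps can be sketched for the max_perf_pct store path as below. The
struct, the helper clamp_int() and the function set_max_perf_pct() are
hypothetical names for this example; only the field names follow the sysfs
files shown above.

```c
struct perf_limits {		/* hypothetical; fields follow the sysfs files */
	int min_perf_pct;
	int max_perf_pct;
};

static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Store a user-supplied maximum: step 1 clamps it to the policy range,
 * step 2 keeps it from dropping below the current minimum.
 */
static void set_max_perf_pct(struct perf_limits *l, int input,
			     int min_policy, int max_policy)
{
	int v = clamp_int(input, min_policy, max_policy);

	if (v < l->min_perf_pct)
		v = l->min_perf_pct;
	l->max_perf_pct = v;
}
```

With min_perf_pct at 100, writing 80 now leaves max_perf_pct at 100 instead
of producing the inverted pair shown in the example above.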

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
8 years agoPM / OPP: Return suspend_opp only if it is enabled
Viresh Kumar [Wed, 9 Sep 2015 11:28:22 +0000 (16:58 +0530)]
PM / OPP: Return suspend_opp only if it is enabled

There is no point in returning suspend_opp if it is disabled by the core,
as we can't use it at all. Fix it.

Fixes: 4eafbd15b6c8 ("PM / OPP: add dev_pm_opp_get_suspend_opp() helper")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
8 years agoARM: dts: qcom: msm8974-sony-xperia-honami: Use stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:53 +0000 (14:31 -0700)]
ARM: dts: qcom: msm8974-sony-xperia-honami: Use stdout-path

Use stdout-path so that we don't have to put the console on the
kernel command line.

Cc: Tim Bird <tim.bird@sonymobile.com>
Cc: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoARM: dts: qcom: msm8960-cdp: Use stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:52 +0000 (14:31 -0700)]
ARM: dts: qcom: msm8960-cdp: Use stdout-path

Use stdout-path so that we don't have to put the console on the
kernel command line.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoARM: dts: qcom: msm8660-surf: Use stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:51 +0000 (14:31 -0700)]
ARM: dts: qcom: msm8660-surf: Use stdout-path

Use stdout-path so that we don't have to put the console on the
kernel command line.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoARM: dts: qcom: ipq8064-ap148: Use stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:50 +0000 (14:31 -0700)]
ARM: dts: qcom: ipq8064-ap148: Use stdout-path

Use stdout-path so that we don't have to put the console on the
kernel command line.

Cc: Mathieu Olivari <mathieu@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoARM: dts: qcom: apq8084-mtp: Use stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:49 +0000 (14:31 -0700)]
ARM: dts: qcom: apq8084-mtp: Use stdout-path

Use stdout-path so that we don't have to put the console on the
kernel command line.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoARM: dts: qcom: apq8084-ifc6540: Use stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:48 +0000 (14:31 -0700)]
ARM: dts: qcom: apq8084-ifc6540: Use stdout-path

Use stdout-path so that we don't have to put the console on the
kernel command line.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoARM: dts: qcom: apq8074-dragonboard: Use stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:47 +0000 (14:31 -0700)]
ARM: dts: qcom: apq8074-dragonboard: Use stdout-path

Use stdout-path so that we don't have to put the console on the
kernel command line.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoARM: dts: qcom: apq8064-ifc6410: Use stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:46 +0000 (14:31 -0700)]
ARM: dts: qcom: apq8064-ifc6410: Use stdout-path

Use stdout-path so that we don't have to put the console on the
kernel command line.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoARM: dts: qcom: apq8064-cm-qs600: Use stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:45 +0000 (14:31 -0700)]
ARM: dts: qcom: apq8064-cm-qs600: Use stdout-path

Use stdout-path so that we don't have to put the console on the
kernel command line.

Cc: Mike Rapoport <mike.rapoport@gmail.com>
Cc: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoARM: dts: qcom: Label serial nodes for aliasing and stdout-path
Stephen Boyd [Tue, 16 Jun 2015 21:31:44 +0000 (14:31 -0700)]
ARM: dts: qcom: Label serial nodes for aliasing and stdout-path

Add a label to the serial nodes that are being used for the
console.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
8 years agoMerge tag 'qcom-dt-for-4.3' into v4.2-rc2
Andy Gross [Wed, 9 Sep 2015 19:56:32 +0000 (14:56 -0500)]
Merge tag 'qcom-dt-for-4.3' into v4.2-rc2

Qualcomm ARM Based Device Tree Updates for v4.3

* Switch to use pinctrl compatible for GPIOs
* Add RPM regulators for MSM8960
* Add SPI Ethernet support on MSM8960 CDP
* Add SMEM support along with dependencies
* Add PM8921 support for GPIO and MPP
* Fix GSBI cell index
* Switch to use real regulators on APQ8064 w/ SDCC

8 years agoebpf: fix fd refcount leaks related to maps in bpf syscall
Daniel Borkmann [Tue, 8 Sep 2015 16:00:09 +0000 (18:00 +0200)]
ebpf: fix fd refcount leaks related to maps in bpf syscall

We may already have gotten a proper fd struct through fdget(), so
whenever we return at the end of a map operation, we need to call
fdput(). However, each map operation from syscall side first probes
CHECK_ATTR() to verify that unused fields in the bpf_attr union are
zero.

In case of malformed input, we return with error, but the lookup to
the map_fd was already performed at that time, so that we return
without a corresponding fdput(). Fix it by performing an fdget()
only right before bpf_map_get(). The fdget() invocation on maps in
the verifier is not affected.
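
The ordering problem can be illustrated with a plain counter in place of the
file reference. ref_get() and ref_put() below are hypothetical stand-ins for
fdget() and fdput(), and the input_ok flag stands in for the CHECK_ATTR()
validation.

```c
#include <stdbool.h>

static int live_refs;	/* references currently held (illustrative) */

static void ref_get(void) { live_refs++; }
static void ref_put(void) { live_refs--; }

/*
 * Leaky order: the reference is taken before the input is validated,
 * and the early return forgets to drop it.
 */
static int op_leaky(bool input_ok)
{
	ref_get();
	if (!input_ok)
		return -1;
	ref_put();
	return 0;
}

/* Fixed order: validate first, take the reference only when needed. */
static int op_fixed(bool input_ok)
{
	if (!input_ok)
		return -1;
	ref_get();
	ref_put();
	return 0;
}
```

After a rejected call, op_leaky() leaves live_refs at 1 while op_fixed()
leaves it at 0.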

Fixes: db20fd2b0108 ("bpf: add lookup/update/delete/iterate methods to BPF maps")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoWatchdog: Fix parent of watchdog_devices
Pratyush Anand [Thu, 20 Aug 2015 08:35:01 +0000 (14:05 +0530)]
Watchdog: Fix parent of watchdog_devices

/sys/class/watchdog/watchdogn/device/modalias can help to identify the
driver/module for a given watchdog node. However, many wdt devices do not
set their parent, so we do not see a device entry in sysfs for
such devices.

This patch sets the parent of watchdog_device so that
/sys/class/watchdog/watchdogn/device is populated.

Exceptions: booke, diag288, octeon, softdog and w83627hf. They do not
have any parent, and it is unclear how the driver can be identified for
these devices.

Signed-off-by: Pratyush Anand <panand@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Lubomir Rintel <lkundrak@v3.sk>
Acked-by: Maxime Coquelin <maxime.coquelin@st.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
8 years agowatchdog: at91rm9200: Correct check for syscon_node_to_regmap() errors
Bjorn Andersson [Mon, 17 Aug 2015 16:19:03 +0000 (09:19 -0700)]
watchdog: at91rm9200: Correct check for syscon_node_to_regmap() errors

syscon_node_to_regmap() returns a regmap or an ERR_PTR().

Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>