]> git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
12 years agoeCryptfs: Unlock keys needed by ecryptfsd
Tyler Hicks [Wed, 27 Jul 2011 00:47:08 +0000 (19:47 -0500)]
eCryptfs: Unlock keys needed by ecryptfsd

commit b2987a5e05ec7a1af7ca42e5d5349d7a22753031 upstream.

Fixes a regression caused by b5695d04634fa4ccca7dcbc05bb4a66522f02e0b

Kernel keyring keys containing eCryptfs authentication tokens should not
be write locked when calling out to ecryptfsd to wrap and unwrap file
encryption keys. The eCryptfs kernel code can not hold the key's write
lock because ecryptfsd needs to request the key after receiving such a
request from the kernel.

Without this fix, all file opens and creates will timeout and fail when
using the eCryptfs PKI infrastructure. This is not an issue when using
passphrase-based mount keys, which is the most widely deployed eCryptfs
configuration.

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Acked-by: Roberto Sassu <roberto.sassu@polito.it>
Tested-by: Roberto Sassu <roberto.sassu@polito.it>
Tested-by: Alexis Hafner1 <haf@zurich.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoecryptfs: Make inode bdi consistent with superblock bdi
Thieu Le [Tue, 26 Jul 2011 23:15:10 +0000 (16:15 -0700)]
ecryptfs: Make inode bdi consistent with superblock bdi

commit 985ca0e626e195ea08a1a82b8dbeb6719747429a upstream.

Make the inode mapping bdi consistent with the superblock bdi so that
dirty pages are flushed properly.

Signed-off-by: Thieu Le <thieule@chromium.org>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoext3: Fix oops in ext3_try_to_allocate_with_rsv()
Jan Kara [Mon, 30 May 2011 11:29:20 +0000 (13:29 +0200)]
ext3: Fix oops in ext3_try_to_allocate_with_rsv()

commit ad95c5e9bc8b5885f94dce720137cac8fa8da4c9 upstream.

Block allocation is called from two places: ext3_get_blocks_handle() and
ext3_xattr_block_set(). These two callers are not necessarily synchronized
because xattr code holds only xattr_sem and i_mutex, and
ext3_get_blocks_handle() may hold only truncate_mutex when called from
writepage() path. Block reservation code does not expect two concurrent
allocations to happen to the same inode and thus assertions can be triggered
or reservation structure corruption can occur.

Fix the problem by taking truncate_mutex in xattr code to serialize
allocations.

CC: Sage Weil <sage@newdream.net>
Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails
Jiaying Zhang [Mon, 11 Jul 2011 00:07:25 +0000 (20:07 -0400)]
ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails

commit 575a1d4bdfa2ea9fc10733013136145b497e1be0 upstream.

Upon corrupted inode or disk failures, we may fail after we already
allocate some blocks from the inode or take some blocks from the
inode's preallocation list, but before we successfully insert the
corresponding extent to the extent tree. In this case, we should free
any allocated blocks and discard the inode's preallocated blocks
because the entries in the inode's preallocation list may be in an
inconsistent state.

Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoext4: fix i_blocks/quota accounting when extent insertion fails
Maxim Patlasov [Sun, 10 Jul 2011 23:37:48 +0000 (19:37 -0400)]
ext4: fix i_blocks/quota accounting when extent insertion fails

commit 7132de744ba76930d13033061018ddd7e3e8cd91 upstream.

The current implementation of ext4_free_blocks() always calls
dquot_free_block This looks quite sensible in the most cases: blocks
to be freed are associated with inode and were accounted in quota and
i_blocks some time ago.

However, there is a case when blocks to free were not accounted by the
time calling ext4_free_blocks() yet:

1. delalloc is on, write_begin pre-allocated some space in quota
2. write-back happens, ext4 allocates some blocks in ext4_ext_map_blocks()
3. then ext4_ext_map_blocks() gets an error (e.g.  ENOSPC) from
   ext4_ext_insert_extent() and calls ext4_free_blocks().

In this scenario, ext4_free_blocks() calls dquot_free_block() who, in
turn, decrements i_blocks for blocks which were not accounted yet (due
to delalloc) After clean umount, e2fsck reports something like:

> Inode 21, i_blocks is 5080, should be 5128.  Fix<y>?
because i_blocks was erroneously decremented as explained above.

The patch fixes the problem by passing the new flag
EXT4_FREE_BLOCKS_NO_QUOT_UPDATE to ext4_free_blocks(), to request
that the dquot_free_block() call be skipped.

Signed-off-by: Maxim Patlasov <maxim.patlasov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoxtensa: prevent arbitrary read in ptrace
Dan Rosenberg [Tue, 26 Jul 2011 00:11:53 +0000 (17:11 -0700)]
xtensa: prevent arbitrary read in ptrace

commit 0d0138ebe24b94065580bd2601f8bb7eb6152f56 upstream.

Prevent an arbitrary kernel read.  Check the user pointer with access_ok()
before copying data in.

[akpm@linux-foundation.org: s/EIO/EFAULT/]
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: Christian Zankel <chris@zankel.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agomm/backing-dev.c: reset bdi min_ratio in bdi_unregister()
Peter Zijlstra [Tue, 26 Jul 2011 00:11:57 +0000 (17:11 -0700)]
mm/backing-dev.c: reset bdi min_ratio in bdi_unregister()

commit ccb6108f5b0b541d3eb332c3a73e645c0f84278e upstream.

Vito said:

: The system has many usb disks coming and going day to day, with their
: respective bdi's having min_ratio set to 1 when inserted.  It works for
: some time until eventually min_ratio can no longer be set, even when the
: active set of bdi's seen in /sys/class/bdi/*/min_ratio doesn't add up to
: anywhere near 100.
:
: This then leads to an unrelated starvation problem caused by write-heavy
: fuse mounts being used atop the usb disks, a problem the min_ratio setting
: at the underlying devices bdi effectively prevents.

Fix this leakage by resetting the bdi min_ratio when unregistering the
BDI.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Vito Caputo <lkml@pengaru.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agomm/futex: fix futex writes on archs with SW tracking of dirty & young
Benjamin Herrenschmidt [Tue, 26 Jul 2011 00:12:32 +0000 (17:12 -0700)]
mm/futex: fix futex writes on archs with SW tracking of dirty & young

commit 2efaca927f5cd7ecd0f1554b8f9b6a9a2c329c03 upstream.

I haven't reproduced it myself but the fail scenario is that on such
machines (notably ARM and some embedded powerpc), if you manage to hit
that futex path on a writable page whose dirty bit has gone from the PTE,
you'll livelock inside the kernel from what I can tell.

It will go in a loop of trying the atomic access, failing, trying gup to
"fix it up", getting succcess from gup, go back to the atomic access,
failing again because dirty wasn't fixed etc...

So I think you essentially hang in the kernel.

The scenario is probably rare'ish because affected architecture are
embedded and tend to not swap much (if at all) so we probably rarely hit
the case where dirty is missing or young is missing, but I think Shan has
a piece of SW that can reliably reproduce it using a shared writable
mapping & fork or something like that.

On archs who use SW tracking of dirty & young, a page without dirty is
effectively mapped read-only and a page without young unaccessible in the
PTE.

Additionally, some architectures might lazily flush the TLB when relaxing
write protection (by doing only a local flush), and expect a fault to
invalidate the stale entry if it's still present on another processor.

The futex code assumes that if the "in_atomic()" access -EFAULT's, it can
"fix it up" by causing get_user_pages() which would then be equivalent to
taking the fault.

However that isn't the case.  get_user_pages() will not call
handle_mm_fault() in the case where the PTE seems to have the right
permissions, regardless of the dirty and young state.  It will eventually
update those bits ...  in the struct page, but not in the PTE.

Additionally, it will not handle the lazy TLB flushing that can be
required by some architectures in the fault case.

Basically, gup is the wrong interface for the job.  The patch provides a
more appropriate one which boils down to just calling handle_mm_fault()
since what we are trying to do is simulate a real page fault.

The futex code currently attempts to write to user memory within a
pagefault disabled section, and if that fails, tries to fix it up using
get_user_pages().

This doesn't work on archs where the dirty and young bits are maintained
by software, since they will gate access permission in the TLB, and will
not be updated by gup().

In addition, there's an expectation on some archs that a spurious write
fault triggers a local TLB flush, and that is missing from the picture as
well.

I decided that adding those "features" to gup() would be too much for this
already too complex function, and instead added a new simpler
fixup_user_fault() which is essentially a wrapper around handle_mm_fault()
which the futex code can call.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix some nits Darren saw, fiddle comment layout]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Shan Hai <haishan.bai@gmail.com>
Tested-by: Shan Hai <haishan.bai@gmail.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Darren Hart <darren.hart@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agogeode: reflect mfgpt dependency on mfd
Philip A. Prindeville [Tue, 26 Jul 2011 00:13:05 +0000 (17:13 -0700)]
geode: reflect mfgpt dependency on mfd

commit 703f03c896fdbd726b809066ae279df513992f0e upstream.

As stated in drivers/mfd/cs5535-mfd.c, the mfd driver exposes the BARs
which then make the GPIO, MFGPT, ACPI, etc.  all visible to the system.

So the dependencies of the MFGPT stuff have changed, and most people
expect Kconfig to bring in the necessary dependencies.  Without them, the
module fails to load and most people don't understand why because the
details of the rewrite aren't captured anywhere most people who know to
look.

This dependency needs to be reflected in Kconfig.

Signed-off-by: Philip A. Prindeville <philipp@redfish-solutions.com>
Acked-by: Alexandros C. Couloumbis <alex@ozo.com>
Acked-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agodrivers/firmware/sigma.c needs MODULE_LICENSE
Randy Dunlap [Tue, 26 Jul 2011 00:13:21 +0000 (17:13 -0700)]
drivers/firmware/sigma.c needs MODULE_LICENSE

commit 27c46a2546c75c6814562e85b751e3d64c188ad5 upstream.

Fix module tainting message:

  sigma: module license 'unspecified' taints kernel.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agocciss: do not attempt to read from a write-only register
Stephen M. Cameron [Sat, 9 Jul 2011 07:04:12 +0000 (09:04 +0200)]
cciss: do not attempt to read from a write-only register

commit 07d0c38e7d84f911c72058a124c7f17b3c779a65 upstream.

Most smartarrays will tolerate it, but some new ones don't.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Note: this is a regression caused by commit 1ddd5049
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoPCI: ARI is a PCIe v2 feature
Chris Wright [Wed, 13 Jul 2011 17:14:33 +0000 (10:14 -0700)]
PCI: ARI is a PCIe v2 feature

commit 864d296cf948aef0fa32b81407541572583f7572 upstream.

The function pci_enable_ari() may mistakenly set the downstream port
of a v1 PCIe switch in ARI Forwarding mode.  This is a PCIe v2 feature,
and with an SR-IOV device on that switch port believing the switch above
is ARI capable it may attempt to use functions 8-255, translating into
invalid (non-zero) device numbers for that bus.  This has been seen
to cause Completion Timeouts and general misbehaviour including hangs
and panics.

Acked-by: Don Dutile <ddutile@redhat.com>
Tested-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoXZ: Fix missing <linux/kernel.h> include
Lasse Collin [Sun, 24 Jul 2011 16:54:25 +0000 (19:54 +0300)]
XZ: Fix missing <linux/kernel.h> include

commit 81d67439855a7f928d90965d832aa4f2fb677342 upstream.

<linux/kernel.h> is needed for min_t. The old version
happened to work on x86 because <asm/unaligned.h>
indirectly includes <linux/kernel.h>, but it didn't
work on ARM.

<linux/kernel.h> includes <asm/byteorder.h> so it's
not necessary to include it explicitly anymore.

Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agotracing: Have "enable" file use refcounts like the "filter" file
Steven Rostedt [Tue, 5 Jul 2011 18:32:51 +0000 (14:32 -0400)]
tracing: Have "enable" file use refcounts like the "filter" file

commit 40ee4dffff061399eb9358e0c8fcfbaf8de4c8fe upstream.

The "enable" file for the event system can be removed when a module
is unloaded and the event system only has events from that module.
As the event system nr_events count goes to zero, it may be freed
if its ref_count is also set to zero.

Like the "filter" file, the "enable" file may be opened by a task and
referenced later, after a module has been unloaded and the events for
that event system have been removed.

Although the "filter" file referenced the event system structure,
the "enable" file only references a pointer to the event system
name. Since the name is freed when the event system is removed,
it is possible that an access to the "enable" file may reference
a freed pointer.

Update the "enable" file to use the subsystem_open() routine that
the "filter" file uses, to keep a reference to the event system
structure while the "enable" file is opened.

Reported-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agotracing: Fix bug when reading system filters on module removal
Steven Rostedt [Tue, 5 Jul 2011 15:36:06 +0000 (11:36 -0400)]
tracing: Fix bug when reading system filters on module removal

commit e9dbfae53eeb9fc3d4bb7da3df87fa9875f5da02 upstream.

The event system is freed when its nr_events is set to zero. This happens
when a module created an event system and then later the module is
removed. Modules may share systems, so the system is allocated when
it is created and freed when the modules are unloaded and all the
events under the system are removed (nr_events set to zero).

The problem arises when a task opened the "filter" file for the
system. If the module is unloaded and it removed the last event for
that system, the system structure is freed. If the task that opened
the filter file accesses the "filter" file after the system has
been freed, the system will access an invalid pointer.

By adding a ref_count, and using it to keep track of what
is using the event system, we can free it after all users
are finished with the event system.

Reported-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoirq_work, alpha: Fix up arch hooks
Peter Zijlstra [Tue, 28 Jun 2011 10:15:51 +0000 (12:15 +0200)]
irq_work, alpha: Fix up arch hooks

commit 0f933625e7b6c3d91878ae95e341bf1984db7eaf upstream.

Commit e360adbe29 ("irq_work: Add generic hardirq context
callbacks") fouled up the Alpha bit, not properly naming the
arch specific function that raises the 'self-IPI'.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Link: http://lkml.kernel.org/n/tip-gukh0txmql2l4thgrekzzbfy@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agopowerpc/kdump: Fix timeout in crash_kexec_wait_realmode
Michael Neuling [Mon, 4 Jul 2011 20:40:10 +0000 (20:40 +0000)]
powerpc/kdump: Fix timeout in crash_kexec_wait_realmode

commit 63f21a56f1cc0b800a4c00349c59448f82473d19 upstream.

The existing code it pretty ugly.  How about we clean it up even more
like this?

From: Anton Blanchard <anton@samba.org>

We check for timeout expiry in the outer loop, but we also need to
check it in the inner loop or we can lock up forever waiting for a
CPU to hit real mode.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agooprofile, x86: Fix nmi-unsafe callgraph support
Robert Richter [Fri, 3 Jun 2011 14:37:47 +0000 (16:37 +0200)]
oprofile, x86: Fix nmi-unsafe callgraph support

commit a0e3e70243f5b270bc3eca718f0a9fa5e6b8262e upstream.

Current oprofile's x86 callgraph support may trigger page faults
throwing the BUG_ON(in_nmi()) message below. This patch fixes this by
using the same nmi-safe copy-from-user code as in perf.

------------[ cut here ]------------
kernel BUG at .../arch/x86/kernel/traps.c:436!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:07:00.0/0000:08:04.0/net/eth0/broadcast
CPU 5
Modules linked in:

Pid: 8611, comm: opcontrol Not tainted 2.6.39-00007-gfe47ae7 #1 Advanced Micro Device Anaheim/Anaheim
RIP: 0010:[<ffffffff813e8e35>]  [<ffffffff813e8e35>] do_nmi+0x22/0x1ee
RSP: 0000:ffff88042fd47f28  EFLAGS: 00010002
RAX: ffff88042c0a7fd8 RBX: 0000000000000001 RCX: 00000000c0000101
RDX: 00000000ffff8804 RSI: ffffffffffffffff RDI: ffff88042fd47f58
RBP: ffff88042fd47f48 R08: 0000000000000004 R09: 0000000000001484
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88042fd47f58
R13: 0000000000000000 R14: ffff88042fd47d98 R15: 0000000000000020
FS:  00007fca25e56700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000074 CR3: 000000042d28b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process opcontrol (pid: 8611, threadinfo ffff88042c0a6000, task ffff88042c532310)
Stack:
 0000000000000000 0000000000000001 ffff88042c0a7fd8 0000000000000000
 ffff88042fd47de8 ffffffff813e897a 0000000000000020 ffff88042fd47d98
 0000000000000000 ffff88042c0a7fd8 ffff88042fd47de8 0000000000000074
Call Trace:
 <NMI>
 [<ffffffff813e897a>] nmi+0x1a/0x20
 [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771
 <<EOE>>
Code: ff 59 5b 41 5c 41 5d c9 c3 55 65 48 8b 04 25 88 b5 00 00 48 89 e5 41 55 41 54 49 89 fc 53 48 83 ec 08 f6 80 47 e0 ff ff 04 74 04 <0f> 0b eb fe 81 80 44 e0 ff ff 00 00 01 04 65 ff 04 25 c4 0f 01
RIP  [<ffffffff813e8e35>] do_nmi+0x22/0x1ee
 RSP <ffff88042fd47f28>
---[ end trace ed6752185092104b ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 8611, comm: opcontrol Tainted: G      D     2.6.39-00007-gfe47ae7 #1
Call Trace:
 <NMI>  [<ffffffff813e5e0a>] panic+0x8c/0x188
 [<ffffffff813e915c>] oops_end+0x81/0x8e
 [<ffffffff8100403d>] die+0x55/0x5e
 [<ffffffff813e8c45>] do_trap+0x11c/0x12b
 [<ffffffff810023c8>] do_invalid_op+0x91/0x9a
 [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee
 [<ffffffff8131e6fa>] ? oprofile_add_sample+0x83/0x95
 [<ffffffff81321670>] ? op_amd_check_ctrs+0x4f/0x2cf
 [<ffffffff813ee4d5>] invalid_op+0x15/0x20
 [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee
 [<ffffffff813e8e7a>] ? do_nmi+0x67/0x1ee
 [<ffffffff813e897a>] nmi+0x1a/0x20
 [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771
 <<EOE>>

Cc: John Lumby <johnlumby@hotmail.com>
Cc: Maynard Johnson <maynardj@us.ibm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agokexec, x86: Fix incorrect jump back address if not preserving context
Huang Ying [Thu, 14 Jul 2011 01:34:37 +0000 (09:34 +0800)]
kexec, x86: Fix incorrect jump back address if not preserving context

commit 050438ed5a05b25cdf287f5691e56a58c2606997 upstream.

In kexec jump support, jump back address passed to the kexeced
kernel via function calling ABI, that is, the function call
return address is the jump back entry.

Furthermore, jump back entry == 0 should be used to signal that
the jump back or preserve context is not enabled in the original
kernel.

But in the current implementation the stack position used for
function call return address is not cleared context
preservation is disabled. The patch fixes this bug.

Reported-and-tested-by: Yin Kangkai <kangkai.yin@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Link: http://lkml.kernel.org/r/1310607277-25029-1-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agopnfs: use lwb as layoutcommit length
Peng Tao [Sun, 31 Jul 2011 00:52:34 +0000 (20:52 -0400)]
pnfs: use lwb as layoutcommit length

commit 3557c6c3be5b2ca0b11365db7f8a813253eb520b upstream.

Using NFS4_MAX_UINT64 will break current protocol.

[Needed in v3.0]
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agopnfs: let layoutcommit handle a list of lseg
Peng Tao [Sun, 31 Jul 2011 00:52:33 +0000 (20:52 -0400)]
pnfs: let layoutcommit handle a list of lseg

commit a9bae5666d0510ad69bdb437371c9a3e6b770705 upstream.

There can be multiple lseg per file, so layoutcommit should be
able to handle it.

[Needed in v3.0]
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agopnfs: save layoutcommit cred at layout header init
Peng Tao [Sun, 31 Jul 2011 00:52:32 +0000 (20:52 -0400)]
pnfs: save layoutcommit cred at layout header init

commit 9fa4075878a5faac872a63f4a97ce79c776264e9 upstream.

No need to save it for every lseg.
No need to save it at every pnfs_set_layoutcommit.

[Needed in v3.0]
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agopnfs: save layoutcommit lwb at layout header
Peng Tao [Sun, 31 Jul 2011 00:52:31 +0000 (20:52 -0400)]
pnfs: save layoutcommit lwb at layout header

commit acff5880539fe33897d016c0f3dcf062e67c61b6 upstream.

No need to save it for every lseg.

[Needed in v3.0]
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoALSA: hda - Fix duplicated DAC assignments for Realtek
Takashi Iwai [Wed, 27 Jul 2011 14:41:57 +0000 (16:41 +0200)]
ALSA: hda - Fix duplicated DAC assignments for Realtek

commit c48a8fb0d31d6147d8d76b8e2ad7f51a2fbb5c4d upstream.

Copying hp_pins and speaker_pins from line_out_pins may confuse the
parser, and it can lead to duplicated initializations for the same pin
with a wrong DAC assignment.  The problem appears in 3.0 kernel code.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoALSA: virtuoso: fix silent analog output on Xonar Essence ST Deluxe
Clemens Ladisch [Sun, 17 Jul 2011 20:18:05 +0000 (22:18 +0200)]
ALSA: virtuoso: fix silent analog output on Xonar Essence ST Deluxe

commit c81c6b356b52d3fcb4d531d149573fc100aad643 upstream.

Commit dd203fa97bd5 (ALSA: virtuoso: remove non-working controls on
Essence ST Deluxe) made it impossible to adjust the volume after the
driver initialized it to muted.

Ensure that those DACs that can be accessed with I2C are initialized
to the same volume that is the reset default of the DAC without I2C.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agodrm/radeon/kms: add missing vddci setting on NI+
Alex Deucher [Mon, 25 Jul 2011 22:50:08 +0000 (18:50 -0400)]
drm/radeon/kms: add missing vddci setting on NI+

commit 4639dd21e759e32125adc7171abf6cb8140d54cf upstream.

Need to add vddci setting to pm init as well as
resume.  Fixes hangs on load on some boards.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=38754

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agodrm/radeon/kms: fix DP training for DPEncoderService revision bigger than 1.1
Jerome Glisse [Mon, 25 Jul 2011 15:57:43 +0000 (11:57 -0400)]
drm/radeon/kms: fix DP training for DPEncoderService revision bigger than 1.1

commit 5a96a899bbdee86024ab9ea6d02b9e242faacbed upstream.

DPEncoderService newer than 1.1 can't properly program the DP (display port)
link training. When facing such version use the DIGxEncoderControl method
instead. Fix DP link training on some R7XX.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agodrm/radeon/kms: fix i2c map for rv250/280
Alex Deucher [Sat, 23 Jul 2011 18:02:04 +0000 (18:02 +0000)]
drm/radeon/kms: fix i2c map for rv250/280

commit 6dd666333ddee39903d86f870d5c40d9f100e0cc upstream.

Those chips have crt2_ddc bus.

Fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=39672

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agohpsa: do not attempt to read from a write-only register
Stephen M. Cameron [Thu, 21 Jul 2011 18:16:05 +0000 (13:16 -0500)]
hpsa: do not attempt to read from a write-only register

commit fec62c368b9c8b05d5124ca6c3b8336b537f26f3 upstream.

Most smartarrays tolerate it, but a few new ones don't.
Without this change some newer Smart Arrays will lock up
and i/o will grind to a halt.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agopmcraid: reject negative request size
Dan Rosenberg [Mon, 11 Jul 2011 21:08:23 +0000 (14:08 -0700)]
pmcraid: reject negative request size

commit b5b515445f4f5a905c5dd27e6e682868ccd6c09d upstream.

There's a code path in pmcraid that can be reached via device ioctl that
causes all sorts of ugliness, including heap corruption or triggering the
OOM killer due to consecutive allocation of large numbers of pages.

First, the user can call pmcraid_chr_ioctl(), with a type
PMCRAID_PASSTHROUGH_IOCTL.  This calls through to
pmcraid_ioctl_passthrough().  Next, a pmcraid_passthrough_ioctl_buffer
is copied in, and the request_size variable is set to
buffer->ioarcb.data_transfer_length, which is an arbitrary 32-bit
signed value provided by the user.  If a negative value is provided
here, bad things can happen.  For example,
pmcraid_build_passthrough_ioadls() is called with this request_size,
which immediately calls pmcraid_alloc_sglist() with a negative size.
The resulting math on allocating a scatter list can result in an
overflow in the kzalloc() call (if num_elem is 0, the sglist will be
smaller than expected), or if num_elem is unexpectedly large the
subsequent loop will call alloc_pages() repeatedly, a high number of
pages will be allocated and the OOM killer might be invoked.

It looks like preventing this value from being negative in
pmcraid_ioctl_passthrough() would be sufficient.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agofix crash in scsi_dispatch_cmd()
James Bottomley [Thu, 7 Jul 2011 20:45:40 +0000 (15:45 -0500)]
fix crash in scsi_dispatch_cmd()

commit bfe159a51203c15d23cb3158fffdc25ec4b4dda1 upstream.

USB surprise removal of sr is triggering an oops in
scsi_dispatch_command().  What seems to be happening is that USB is
hanging on to a queue reference until the last close of the upper
device, so the crash is caused by surprise remove of a mounted CD
followed by attempted unmount.

The problem is that USB doesn't issue its final commands as part of
the SCSI teardown path, but on last close when the block queue is long
gone.  The long term fix is probably to make sr do the teardown in the
same way as sd (so remove all the lower bits on ejection, but keep the
upper disk alive until last close of user space).  However, the
current oops can be simply fixed by not allowing any commands to be
sent to a dead queue.

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoses: requesting a fault indication
Douglas Gilbert [Thu, 9 Jun 2011 04:27:07 +0000 (00:27 -0400)]
ses: requesting a fault indication

commit 2a350cab9daf9a46322d83b091bb05cf54ccf6ab upstream.

Noticed that when the sysfs interface of the SCSI SES
driver was used to request a fault indication the LED
flashed but the buzzer didn't sound. So it was doing
what REQUEST IDENT (locate) should do.

Changelog:
   - fix the setting of REQUEST FAULT for the device slot
     and array device slot elements in the enclosure control
     diagnostic page
   - note the potentially defective code that reads the
     FAULT SENSED and FAULT REQUESTED bits from the enclosure
     status diagnostic page

The attached patch is against git/scsi-misc-2.6

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agosr: check_events() ignore GET_EVENT when TUR says otherwise
Kay Sievers [Thu, 30 Jun 2011 13:03:48 +0000 (15:03 +0200)]
sr: check_events() ignore GET_EVENT when TUR says otherwise

commit 79b9677d885d1a792bc103f2febb06f91f92de43 upstream.

Some broken devices indicates that media has changed on every
GET_EVENT_STATUS_NOTIFICATION.  This translates into MEDIA_CHANGE
uevent on every open() which lets udev run into a loop.

Verify GET_EVENT result against TUR and if it generates spurious
events for several times in a row, ignore the GET_EVENT events, and
trust only the TUR status.

This is the log of a USB stick with a (broken) fake CDROM drive:

 scsi 5:0:0:0: Direct-Access     SanDisk  U3 Cruzer Micro  8.02 PQ: 0 ANSI: 0 CCS
 sd 5:0:0:0: Attached scsi generic sg3 type 0
 scsi 5:0:0:1: CD-ROM            SanDisk  U3 Cruzer Micro  8.02 PQ: 0 ANSI: 0
 sd 5:0:0:0: [sdb] Attached SCSI removable disk
 sr2: scsi3-mmc drive: 48x/48x tray
 sr 5:0:0:1: Attached scsi CD-ROM sr2
 sr 5:0:0:1: Attached scsi generic sg4 type 5
 sr2: GET_EVENT and TUR disagree continuously, suppress GET_EVENT events
 sd 5:0:0:0: [sdb] 31777279 512-byte logical blocks: (16.2 GB/15.1 GiB)
 sd 5:0:0:0: [sdb] No Caching mode page present
 sd 5:0:0:0: [sdb] Assuming drive cache: write through
 sd 5:0:0:0: [sdb] No Caching mode page present
 sd 5:0:0:0: [sdb] Assuming drive cache: write through
 sdb: sdb1

-tj: Updated to consider only spurious GET_EVENT events among
     different types of disagreement and allow using TUR for kernel
     event polling after GET_EVENT is ignored.

Reported-By: Markus Rathgeb maggu2810@googlemail.com
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoBlacklist Traxdata CDR4120 and IOMEGA Zip drive to avoid lock ups.
Werner Fink [Thu, 9 Jun 2011 05:24:24 +0000 (10:54 +0530)]
Blacklist Traxdata CDR4120 and IOMEGA Zip drive to avoid lock ups.

commit 82103978189e9731658cd32da5eb85ab7b8542b8 upstream.

This patch resulted from the discussion at
https://bugzilla.novell.com/show_bug.cgi?id=679277,
https://bugzilla.novell.com/show_bug.cgi?id=681840 .

Signed-off-by: Werner Fink <werner@novell.com>
Signed-off-by: Ankit Jain <jankit@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoperf: Fix software event overflow
Peter Zijlstra [Thu, 28 Jul 2011 18:47:10 +0000 (20:47 +0200)]
perf: Fix software event overflow

The below patch is for -stable only, upstream has a much larger patch
that contains the below hunk in commit a8b0ca17b80e92faab46ee7179ba9e99ccb61233

Vince found that under certain circumstances software event overflows
go wrong and deadlock. Avoid trying to delete a timer from the timer
callback.

Reported-by: Vince Weaver <vweaver1@eecs.utk.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agox86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS
Len Brown [Thu, 14 Jul 2011 04:53:24 +0000 (00:53 -0400)]
x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS

commit abe48b108247e9b90b4c6739662a2e5c765ed114 upstream.

Since 2.6.36 (23016bf0d25), Linux prints the existence of "epb" in /proc/cpuinfo,
Since 2.6.38 (d5532ee7b40), the x86_energy_perf_policy(8) utility has
been available in-tree to update MSR_IA32_ENERGY_PERF_BIAS.

However, the typical BIOS fails to initialize the MSR, presumably
because this is handled by high-volume shrink-wrap operating systems...

Linux distros, on the other hand, do not yet invoke x86_energy_perf_policy(8).
As a result, WSM-EP, SNB, and later hardware from Intel will run in its
default hardware power-on state (performance), which assumes that users
care for performance at all costs and not for energy efficiency.
While that is fine for performance benchmarks, the hardware's intended default
operating point is "normal" mode...

Initialize the MSR to the "normal" by default during kernel boot.

x86_energy_perf_policy(8) is available to change the default after boot,
should the user have a different preference.

Signed-off-by: Len Brown <len.brown@intel.com>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1107140051020.18606@x980
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoperf tools: Fix endian conversion reading event attr from file header
David Ahern [Fri, 15 Jul 2011 18:34:09 +0000 (12:34 -0600)]
perf tools: Fix endian conversion reading event attr from file header

commit eda3913bb70ecebac13adccffe1e7f96e93cee02 upstream.

The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.

With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.

-v2:
 - changed the existing perf_event__attr_swap() to swap only elements
   of perf_event_attr and exported it for use in swapping the
   attributes in the file header
 - updated swap_ops used for processing events

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoperf tools, x86: Fix 32-bit compile on 64-bit system
David Ahern [Mon, 11 Jul 2011 21:38:24 +0000 (15:38 -0600)]
perf tools, x86: Fix 32-bit compile on 64-bit system

commit 08a4a43fc407d780bdde36d98f89c0dbb2a6be6b upstream.

Builds for 32-bit perf binaries on a 64-bit host currently fail
with this error:

 [...]
 bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages:
 bench/../../../arch/x86/lib/memcpy_64.S:29: Error: bad register name `%rdi'
 bench/../../../arch/x86/lib/memcpy_64.S:34: Error: invalid instruction suffix for `movs'
 bench/../../../arch/x86/lib/memcpy_64.S:50: Error: bad register name `%rdi'
 bench/../../../arch/x86/lib/memcpy_64.S:61: Error: bad register name `%rdi'
 ...

The problem is the detection of the host arch without considering passed in
flags. This change fixes 32-bit builds via:

make EXTRA_CFLAGS=-m32

and 64-bit builds still reference the memcpy_64.S.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1310420304-21452-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoirq_work, ppc: Fix up arch hooks
Peter Zijlstra [Mon, 27 Jun 2011 15:22:43 +0000 (17:22 +0200)]
irq_work, ppc: Fix up arch hooks

commit 4f8b50bbbe63ae4ec6bea28a90a9a603c745ea71 upstream.

Commit e360adbe29 ("irq_work: Add generic hardirq context
callbacks") fouled up the ppc bit, not properly naming the
arch specific function that raises the 'self-IPI'.

Cc: Huang Ying <ying.huang@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-eg0aqien8p1aqvzu9dft6dtv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agomac80211: Restart STA timers only on associated state
Rajkumar Manoharan [Thu, 7 Jul 2011 18:03:39 +0000 (23:33 +0530)]
mac80211: Restart STA timers only on associated state

commit 676b58c27475a9defccc025fea1cbd2b141ee539 upstream.

A panic was observed when the device is failed to resume properly,
and there are no running interfaces. ieee80211_reconfig tries
to restart STA timers on unassociated state.

Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agortlwifi: rtl8192cu: Fix duplicate if test
Larry Finger [Sat, 9 Jul 2011 18:15:58 +0000 (13:15 -0500)]
rtlwifi: rtl8192cu: Fix duplicate if test

commit 1288aa4e80145d9f4196df32f717b4c1cf6aab61 upstream.

A typo causes routine rtl92cu_phy_rf6052_set_cck_txpower() to test the
same condition twice. The problem was found using cppcheck-1.49, and the
proper fix was verified against the pre-mac80211 version of the code.

Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agolibsas: remove expander from dev list on error
Luben Tuikov [Wed, 27 Jul 2011 06:10:48 +0000 (23:10 -0700)]
libsas: remove expander from dev list on error

commit 5911e963d3718e306bcac387b83e259aa4228896 upstream.

If expander discovery fails (sas_discover_expander()), remove the
expander from the port device list (sas_ex_discover_expander()),
before freeing it. Else the list is corrupted and, e.g., when we
attempt to send SMP commands to other devices, the kernel oopses.

Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Reviewed-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoIB/srp: Avoid duplicate devices from LUN scan
Bart Van Assche [Wed, 13 Jul 2011 16:19:16 +0000 (09:19 -0700)]
IB/srp: Avoid duplicate devices from LUN scan

commit fd1b6c4a693c9cac59375ffb36ffe5d7c079037c upstream.

SCSI scanning of a channel:id:lun triplet in Linux works as follows
(function scsi_scan_target() in drivers/scsi/scsi_scan.c):

- If lun == SCAN_WILD_CARD, send a REPORT LUNS command to the target
  and process the result.

- If lun != SCAN_WILD_CARD, send an INQUIRY command to the LUN
  corresponding to the specified channel:id:lun triplet to verify
  whether the LUN exists.

So a SCSI driver must either take the channel and target id values in
account in its quecommand() function or it should declare that it only
supports one channel and one target id.

Currently the ib_srp driver does neither.  As a result scanning the
SCSI bus via e.g. rescan-scsi-bus.sh causes many duplicate SCSI
devices to be created. For each 0:0:L device, several duplicates are
created with the same LUN number and with (C:I) != (0:0). Fix this by
declaring that the ib_srp driver only supports one channel and one
target id.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agofirewire: cdev: prevent race between first get_info ioctl and bus reset event queuing
Stefan Richter [Sat, 9 Jul 2011 14:43:22 +0000 (16:43 +0200)]
firewire: cdev: prevent race between first get_info ioctl and bus reset event queuing

commit 93b37905f70083d6143f5f4dba0a45cc64379a62 upstream.

Between open(2) of a /dev/fw* and the first FW_CDEV_IOC_GET_INFO
ioctl(2) on it, the kernel already queues FW_CDEV_EVENT_BUS_RESET events
to be read(2) by the client.  The get_info ioctl is practically always
issued right away after open, hence this condition only occurs if the
client opens during a bus reset, especially during a rapid series of bus
resets.

The problem with this condition is twofold:

  - These bus reset events carry the (as yet undocumented) @closure
    value of 0.  But it is not the kernel's place to choose closures;
    they are privat to the client.  E.g., this 0 value forced from the
    kernel makes it unsafe for clients to dereference it as a pointer to
    a closure object without NULL pointer check.

  - It is impossible for clients to determine the relative order of bus
    reset events from get_info ioctl(2) versus those from read(2),
    except in one way:  By comparison of closure values.  Again, such a
    procedure imposes complexity on clients and reduces freedom in use
    of the bus reset closure.

So, change the ABI to suppress queuing of bus reset events before the
first FW_CDEV_IOC_GET_INFO ioctl was issued by the client.

Note, this ABI change cannot be version-controlled.  The kernel cannot
distinguish old from new clients before the first FW_CDEV_IOC_GET_INFO
ioctl.

We will try to back-merge this change into currently maintained stable/
longterm series, and we only document the new behaviour.  The old
behavior is now considered a kernel bug, which it basically is.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@kernel.org>
12 years agofirewire: cdev: return -ENOTTY for unimplemented ioctls, not -EINVAL
Stefan Richter [Sat, 9 Jul 2011 14:42:26 +0000 (16:42 +0200)]
firewire: cdev: return -ENOTTY for unimplemented ioctls, not -EINVAL

commit d873d794235efa590ab3c94d5ee22bb1fab19ac4 upstream.

On Jun 27 Linus Torvalds wrote:
> The correct error code for "I don't understand this ioctl" is ENOTTY.
> The naming may be odd, but you should think of that error value as a
> "unrecognized ioctl number, you're feeding me random numbers that I
> don't understand and I assume for historical reasons that you tried to
> do some tty operation on me".
[...]
> The EINVAL thing goes way back, and is a disaster. It predates Linux
> itself, as far as I can tell. You'll find lots of man-pages that have
> this line in it:
>
>   EINVAL Request or argp is not valid.
>
> and it shows up in POSIX etc. And sadly, it generally shows up
> _before_ the line that says
>
>   ENOTTY The specified request does not apply to the kind of object
> that the descriptor d references.
>
> so a lot of people get to the EINVAL, and never even notice the ENOTTY.
[...]
> At least glibc (and hopefully other C libraries) use a _string_ that
> makes much more sense: strerror(ENOTTY) is "Inappropriate ioctl for
> device"

So let's correct this in the <linux/firewire-cdev.h> ABI while it is
still young, relative to distributor adoption.

Side note:  We return -ENOTTY not only on _IOC_TYPE or _IOC_NR mismatch,
but also on _IOC_SIZE mismatch.  An ioctl with an unsupported size of
argument structure can be seen as an unsupported version of that ioctl.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoethtool: Allow zero-length register dumps again
Ben Hutchings [Thu, 21 Jul 2011 22:25:30 +0000 (15:25 -0700)]
ethtool: Allow zero-length register dumps again

commit 67ae7cf1eeda777f79259c4c6cb17a0bd28dee71 upstream.

Some drivers (ab)use the ethtool_ops::get_regs operation to expose
only a hardware revision ID.  Commit
a77f5db361ed9953b5b749353ea2c7fed2bf8d93 ('ethtool: Allocate register
dump buffer with vmalloc()') had the side-effect of breaking these, as
vmalloc() returns a null pointer for size=0 whereas kmalloc() did not.

For backward-compatibility, allow zero-length dumps again.

Reported-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agojme: Fix unmap error (Causing system freeze)
Guo-Fu Tseng [Wed, 20 Jul 2011 16:57:36 +0000 (16:57 +0000)]
jme: Fix unmap error (Causing system freeze)

commit 94c5b41b327e08de0ddf563237855f55080652a1 upstream.

This patch add the missing dma_unmap().
Which solved the critical issue of system freeze on heavy load.

Michal Miroslaw's rejected patch:
[PATCH v2 10/46] net: jme: convert to generic DMA API
Pointed out the issue also, thank you Michal.
But the fix was incorrect. It would unmap needed address
when low memory.

Got lots of feedback from End user and Gentoo Bugzilla.
https://bugs.gentoo.org/show_bug.cgi?id=373109
Thank you all. :)

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agostaging: brcm80211: fix for reported log spam problem
Roland Vossen [Wed, 29 Jun 2011 23:48:22 +0000 (16:48 -0700)]
staging: brcm80211: fix for reported log spam problem

commit 37c962d195005d009e130e65a9e55960996c3cab upstream.

Every few minutes, this message would appear in syslog:

ieee80211 ph0: wl_ops_bss_info_changed: BSS idle: true (implement)

The message has been deleted, the driver requires no special action on this
particular event (). See: https://bugzilla.kernel.org/show_bug.cgi?id=38162

Reported-by: David Hill <hilld@binarystorm.net>
Signed-off-by: Roland Vossen <rvossen@broadcom.com>
Reviewed-by: Arend van Spriel <arend@broadcom.com>
Reviewed-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
12 years agoCIFS: Fix oops while mounting with prefixpath
Pavel Shilovsky [Mon, 25 Jul 2011 13:59:10 +0000 (17:59 +0400)]
CIFS: Fix oops while mounting with prefixpath

commit f5bc1e755d23d022bf948904386337fc3e5e29a8 upstream.

commit fec11dd9a0109fe52fd631e5c510778d6cbff6cc caused
a regression when we have already mounted //server/share/a
and want to mount //server/share/a/b.

The problem is that lookup_one_len calls __lookup_hash
with nd pointer as NULL. Then __lookup_hash calls
do_revalidate in the case when dentry exists and we end
up with NULL pointer deference in cifs_d_revalidate:

if (nd->flags & LOOKUP_RCU)
return -ECHILD;

Fix this by checking nd for NULL.

Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoath9k_hw: Fix incorrect key_miss handling
Senthil Balasubramanian [Mon, 11 Jul 2011 18:32:56 +0000 (00:02 +0530)]
ath9k_hw: Fix incorrect key_miss handling

commit 0472ade031b5c0c69c21cf96acf64c50eb9ba3c2 upstream.

Decryping frames on key_miss handling shouldn't be done for Michael
MIC failed frames as h/w would have already decrypted such frames
successfully anyway.

Also leaving CRC and PHY error(where the frame is going to be dropped
anyway), we are left to prcoess Decrypt error for which s/w decrypt is
selected anway and so having key_miss as a separate check doesn't serve
anything. So making key_miss handling mutually exlusive with other RX
status handling makes much more sense.

This patch addresses an issue with STA not reporting MIC failure events
resulting in STA being disconnected immediately.

Signed-off-by: Senthil Balasubramanian <senthilb@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoath6kl: fix crash when interface is closed but scan is ongoing
Kalle Valo [Mon, 13 Jun 2011 08:54:06 +0000 (11:54 +0300)]
ath6kl: fix crash when interface is closed but scan is ongoing

commit 98ab5c7755b5cc9e1a8f2a57ccb22eac5e13ec50 upstream.

When ath6kl module was resumed while a scan was ongoing, for example during
suspend, the driver would crash in ar6k_cfg80211_scanComplete_event():

[26581.586440] Call Trace:
[26581.586440]  [<f99ffeda>] ? ar6k_cfg80211_scanComplete_event+0xaa/0xaa [ath6kl]
[26581.586440]  [<f9a0a020>] wmi_iterate_nodes+0xb/0xd [ath6kl]
[26581.586440]  [<f99ffe78>] ar6k_cfg80211_scanComplete_event+0x48/0xaa [ath6kl]
[26581.586440]  [<f9a038ae>] ar6000_close+0x77/0x7e [ath6kl]
[26581.586440]  [<c139c25d>] __dev_close_many+0x87/0xab
[26581.586440]  [<c139c30a>] dev_close_many+0x54/0xab
[26581.586440]  [<c139c437>] rollback_registered_many+0xa5/0x19e
[26581.586440]  [<c139c595>] rollback_registered+0x23/0x2f
[26581.586440]  [<c139c5ed>] unregister_netdevice_queue+0x4c/0x69
[26581.586440]  [<c139c6b2>] unregister_netdev+0x18/0x1f
[26581.586440]  [<f9a00d4c>] ar6000_destroy+0xf8/0x115 [ath6kl]
[26581.586440]  [<f9a0c765>] ar6k_cleanup_module+0x20/0x29 [ath6kl]
[26581.586440]  [<c1062843>] sys_delete_module+0x181/0x1d9
[26581.586440]  [<c105876b>] ? lock_release_holdtime+0x2b/0xcd
[26581.586440]  [<c10b55dc>] ? sys_munmap+0x3b/0x42
[26581.586440]  [<c14a99dc>] ? restore_all+0xf/0xf
[26581.586440]  [<c14aeb6c>] sysenter_do_call+0x12/0x32
[26581.586440] Code: 89 53 6c 75 07 89 d8 e8 c0 ff ff ff 89 f0 e8 2c f2 a9 c7 5b 5e 5d c3 55 89 e5 57 56 53 89 c3 83 ec 08 89 55 f0 8d 78 04 89 4d ec <8b> b0 b8 00 00 00 46 89 b0 b8 00 00 00 89 f8 e8 ae ed a9 c7 8b

Fix the function not to iterate nodes when the scan is aborted. The nodes
are already freed when the module is being unloaded. Patch "ath6kl: Fix a
kernel panic furing suspend/resume" tried to fix this already but it wasn't
enough as a pointer was still used even after the null check. This patch
removes the null check entirely as the wmi structure is not accessed anymore
during module unload.

Also fix a bug where the status was checked as a bitfield with '&' operator.
But it's not a bitfield, just a regular (enum like) value.

Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoath6kl: cache firmware
Kalle Valo [Mon, 13 Jun 2011 08:54:18 +0000 (11:54 +0300)]
ath6kl: cache firmware

commit b42a7b1bc7c0f535dfe35b2c934f239c60bb8d30 upstream.

Drivers should not request firmware during resume. Fix ath6kl to
cache the firmware instead.

Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoASoC: Mark cache as dirty when suspending
Mark Brown [Mon, 18 Jul 2011 04:17:13 +0000 (13:17 +0900)]
ASoC: Mark cache as dirty when suspending

commit 7be4ba24a3ea53bc8ade841635e4d4a59e98ceb5 upstream.

Since quite a few drivers are not managing to flag the cache as needing
to be resynced after suspend and it's a reasonable thing to do flag the
cache as needing sync automatically when suspending.

The expectation is that systems will mainly only keep the CODEC powered
when doing audio through the CODEC so we won't actually suspend the
device anyway; drivers which want to can override this behaviour when
they resume.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoASoC: davinci: fix codec start and stop functions
Rajashekhara, Sudhakar [Wed, 20 Jul 2011 12:06:04 +0000 (17:36 +0530)]
ASoC: davinci: fix codec start and stop functions

commit 3012f43eaf7592d8121426918e43e3b5db013aff upstream.

According to DM365 voice codec data sheet at [1], before starting
recording or playback, ADC/DAC modules should follow a reset and
enable cycle. Writing a 1 to the ADC/DAC bit in the register resets
the module and clearing the bit to 0 will enable the module. But the
driver seems to be doing the reverse of it.

[1] http://focus.ti.com/lit/ug/sprufi9b/sprufi9b.pdf

Signed-off-by: Rajashekhara, Sudhakar <sudhakar.raj@ti.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoASoC: davinci: add missing break statement
Rajashekhara, Sudhakar [Wed, 20 Jul 2011 12:07:18 +0000 (17:37 +0530)]
ASoC: davinci: add missing break statement

commit 82d1d521036eb3f5aae48b847f939d99a44c18bb upstream.

In davinci_vcif_trigger() function, a break() statement was missing
causing the davinci_vcif_stop() function to be called as a fallback
after calling davinci_vcif_start().

Signed-off-by: Rajashekhara, Sudhakar <sudhakar.raj@ti.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoARM: pxa/cm-x300: fix V3020 RTC functionality
Igor Grinberg [Mon, 9 May 2011 11:41:46 +0000 (14:41 +0300)]
ARM: pxa/cm-x300: fix V3020 RTC functionality

commit 6c7b3ea52e345ab614edb91d3f0e9f3bb3713871 upstream.

While in sleep mode the CS# and other V3020 RTC GPIOs must be driven
high, otherwise V3020 RTC fails to keep the right time in sleep mode.

Signed-off-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agodrivers/rtc/rtc-tegra.c: properly initialize spinlock
Uwe Kleine-König [Tue, 26 Jul 2011 00:13:34 +0000 (17:13 -0700)]
drivers/rtc/rtc-tegra.c: properly initialize spinlock

commit e57ee01750c4954fd0b5e3c6109cd4b870880eb9 upstream.

Using __SPIN_LOCK_UNLOCKED for a dynamically allocated lock is wrong and
breaks the build with PREEMPT_RT_FULL.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Andrew Chew <achew@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agortc: limit frequency
Thomas Gleixner [Tue, 26 Jul 2011 23:08:19 +0000 (16:08 -0700)]
rtc: limit frequency

commit 431e2bcc371016824f419baa745f82388258f3ee upstream.

Due to the hrtimer self rearming mode a user can DoS the machine simply
because it's starved by hrtimer events.

The RTC hrtimer is self rearming.  We really need to limit the frequency
to something sensible.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agortc: fix hrtimer deadlock
Thomas Gleixner [Tue, 26 Jul 2011 23:08:20 +0000 (16:08 -0700)]
rtc: fix hrtimer deadlock

commit b830ac1d9a2262093bb0f3f6a2fd2a1c8278daf5 upstream.

Ben reported a lockup related to rtc. The lockup happens due to:

CPU0                                        CPU1

rtc_irq_set_state()     __run_hrtimer()
  spin_lock_irqsave(&rtc->irq_task_lock)    rtc_handle_legacy_irq();
      spin_lock(&rtc->irq_task_lock);
  hrtimer_cancel()
    while (callback_running);

So the running callback never finishes as it's blocked on
rtc->irq_task_lock.

Use hrtimer_try_to_cancel() instead and drop rtc->irq_task_lock while
waiting for the callback.  Fix this for both rtc_irq_set_state() and
rtc_irq_set_freq().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Ben Greear <greearb@candelatech.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agortc: handle errors correctly in rtc_irq_set_state()
Thomas Gleixner [Tue, 26 Jul 2011 23:08:18 +0000 (16:08 -0700)]
rtc: handle errors correctly in rtc_irq_set_state()

commit 2c4f57d12df7696d65b0247bfd57fd082a7719e6 upstream.

The code checks the correctness of the parameters, but unconditionally
arms/disarms the hrtimer.

The result is that a random task might arm/disarm rtc timer and surprise
the real owner by either generating events or by stopping them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agousb: musb: restore INDEX register in resume path
Ajay Kumar Gupta [Fri, 8 Jul 2011 09:36:13 +0000 (15:06 +0530)]
usb: musb: restore INDEX register in resume path

commit 3c5fec75e121b21a2eb35e5a6b44291509abba6f upstream.

Restoring the missing INDEX register value in musb_restore_context().
Without this suspend resume functionality is broken with offmode
enabled.

Acked-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Ajay Kumar Gupta <ajay.gupta@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: EHCI: go back to using the system clock for QH unlinks
Alan Stern [Tue, 5 Jul 2011 16:34:05 +0000 (12:34 -0400)]
USB: EHCI: go back to using the system clock for QH unlinks

commit 004c19682884d4f40000ce1ded53f4a1d0b18206 upstream.

This patch (as1477) fixes a problem affecting a few types of EHCI
controller.  Contrary to what one might expect, these controllers
automatically stop their internal frame counter when no ports are
enabled.  Since ehci-hcd currently relies on the frame counter for
determining when it should unlink QHs from the async schedule, those
controllers run into trouble: The frame counter stops and the QHs
never get unlinked.

Some systems have also experienced other problems traced back to
commit b963801164618e25fbdc0cd452ce49c3628b46c8 (USB: ehci-hcd unlink
speedups), which made the original switch from using the system clock
to using the frame counter.  It never became clear what the reason was
for these problems, but evidently it is related to use of the frame
counter.

To fix all these problems, this patch more or less reverts that commit
and goes back to using the system clock.  But this can't be done
cleanly because other changes have since been made to the scan_async()
subroutine.  One of these changes involved the tricky logic that tries
to avoid rescanning QHs that have already been seen when the scanning
loop is restarted, which happens whenever an URB is given back.
Switching back to clock-based unlinks would make this logic even more
complicated.

Therefore the new code doesn't rescan the entire async list whenever a
giveback occurs.  Instead it rescans only the current QH and continues
on from there.  This requires the use of a separate pointer to keep
track of the next QH to scan, since the current QH may be unlinked
while the scanning is in progress.  That new pointer must be global,
so that it can be adjusted forward whenever the _next_ QH gets
unlinked.  (uhci-hcd uses this same trick.)

Simplification of the scanning loop removes a level of indentation,
which accounts for the size of the patch.  The amount of code changed
is relatively small, and it isn't exactly a reversion of the
b963801164 commit.

This fixes Bugzilla #32432.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Matej Kenda <matejken@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: OHCI: fix another regression for NVIDIA controllers
Alan Stern [Fri, 15 Jul 2011 21:22:15 +0000 (17:22 -0400)]
USB: OHCI: fix another regression for NVIDIA controllers

commit 6ea12a04d295235ed67010a09fdea58c949e3eb0 upstream.

The NVIDIA series of OHCI controllers continues to be troublesome.  A
few people using the MCP67 chipset have reported that even with the
most recent kernels, the OHCI controller fails to handle new
connections and spams the system log with "unable to enumerate USB
port" messages.  This is different from the other problems previously
reported for NVIDIA OHCI controllers, although it is probably related.

It turns out that the MCP67 controller does not like to be kept in the
RESET state very long.  After only a few seconds, it decides not to
work any more.  This patch (as1479) changes the PCI initialization
quirk code so that NVIDIA controllers are switched into the SUSPEND
state after 50 ms of RESET.  With no interrupts enabled and all the
downstream devices reset, and thus unable to send wakeup requests,
this should be perfectly safe (even for non-NVIDIA hardware).

The removal code in ohci-hcd hasn't been changed; it will still leave
the controller in the RESET state.  As a result, if someone unloads
ohci-hcd and then reloads it, the controller won't work again until
the system is rebooted.  If anybody complains about this, the removal
code can be updated similarly.

This fixes Bugzilla #22052.

Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoStaging: hv: netvsc: Increase the timeout value in the netvsc driver
K. Y. Srinivasan [Thu, 16 Jun 2011 20:16:35 +0000 (13:16 -0700)]
Staging: hv: netvsc: Increase the timeout value in the netvsc driver

commit 5c5781b3f88567211ecaaada13431af15c8c6003 upstream.

On some loaded windows hosts, we have discovered that the host may not
respond to guest requests within the specified time (one second)
as evidenced by the guest timing out. Fix this problem by increasing
the timeout to 5 seconds.

It may be useful to apply this patch to the 3.0 kernel as well.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Hank Janssen <hjanssen@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoStaging: hv: vmbus: Increase the timeout value in the vmbus driver
K. Y. Srinivasan [Thu, 16 Jun 2011 20:16:34 +0000 (13:16 -0700)]
Staging: hv: vmbus: Increase the timeout value in the vmbus driver

commit 2dfde9644fe8c4a77f9c73f95b25d6300ca23b5d upstream.

On some loaded windows hosts, we have discovered that the host may not
respond to guest requests within the specified time (one second)
as evidenced by the guest timing out. Fix this problem by increasing
the timeout to 5 seconds.

It may be useful to apply this patch to the 3.0 kernel as well.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Hank Janssen <hjanssen@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoStaging: hv: storvsc: Increase the timeout value in the storvsc driver
K. Y. Srinivasan [Thu, 16 Jun 2011 20:16:36 +0000 (13:16 -0700)]
Staging: hv: storvsc: Increase the timeout value in the storvsc driver

commit 46d2eb6d82ef44be58ae192c35e8cd52485f02eb upstream.

On some loaded windows hosts, we have discovered that the host may not
respond to guest requests within the specified time (one second)
as evidenced by the guest timing out. Fix this problem by increasing
the timeout to 5 seconds.

It may be useful to apply this patch to the 3.0 kernel as well.
the 3.0 kernel as well.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Hank Janssen <hjanssen@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agostaging: comedi: fix infoleak to userspace
Vasiliy Kulikov [Sun, 26 Jun 2011 08:56:22 +0000 (12:56 +0400)]
staging: comedi: fix infoleak to userspace

commit 819cbb120eaec7e014e5abd029260db1ca8c5735 upstream.

driver_name and board_name are pointers to strings, not buffers of size
COMEDI_NAMELEN.  Copying COMEDI_NAMELEN bytes of a string containing
less than COMEDI_NAMELEN-1 bytes would leak some unrelated bytes.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agostaging: r8192e_pci: Handle duplicate PCI ID 0x10ec:0x8192 conflict with rtl8192se
Larry Finger [Sun, 19 Jun 2011 03:34:34 +0000 (22:34 -0500)]
staging: r8192e_pci: Handle duplicate PCI ID 0x10ec:0x8192 conflict with rtl8192se

commit 1c50bf7e415cf6ce9545dbecc2ac0d89d3916c53 upstream.

There are two devices with PCI ID 0x10ec:0x8192, namely RTL8192E and
RTL8192SE. The method of distinguishing them is by the revision ID
at offset 0x8 of the PCI configuration space. If the value is 0x10,
then the device uses rtl8192se for a driver.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoStaging: usbip: vhci-hcd: Do not kill already dead RX/TX kthread
Tobias Klauser [Fri, 24 Jun 2011 13:48:47 +0000 (15:48 +0200)]
Staging: usbip: vhci-hcd: Do not kill already dead RX/TX kthread

commit 8547d4cc2b616e4f1dafebe2c673fc986422b506 upstream.

When unbinding a device on the host which was still attached on the
client, I got a NULL pointer dereference on the client. This turned out
to be due to kthread_stop() being called on an already dead kthread.

Here is how I was able to reproduce the problem:

 server:# usbip bind -b 1-2
                                client:# usbip attach -h server -b 1-2
 server:# usbip unbind -b 1-2

This patch fixes the problem by checking the kthread before attempting
to kill it, as it is done on the opposite side in
stub_shutdown_connection().

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agogro: Only reset frag0 when skb can be pulled
Herbert Xu [Wed, 27 Jul 2011 13:16:28 +0000 (06:16 -0700)]
gro: Only reset frag0 when skb can be pulled

commit 17dd759c67f21e34f2156abcf415e1f60605a188 upstream.

Currently skb_gro_header_slow unconditionally resets frag0 and
frag0_len.  However, when we can't pull on the skb this leaves
the GRO fields in an inconsistent state.

This patch fixes this by only resetting those fields after the
pskb_may_pull test.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agobridge: send proper message_age in config BPDU
stephen hemminger [Fri, 22 Jul 2011 07:47:06 +0000 (07:47 +0000)]
bridge: send proper message_age in config BPDU

commit 0c03150e7ea8f7fcd03cfef29385e0010b22ee92 upstream.

A bridge topology with three systems:

      +------+  +------+
      | A(2) |--| B(1) |
      +------+  +------+
           \    /
          +------+
          | C(3) |
          +------+

What is supposed to happen:
 * bridge with the lowest ID is elected root (for example: B)
 * C detects that A->C is higher cost path and puts in blocking state

What happens. Bridge with lowest id (B) is elected correctly as
root and things start out fine initially. But then config BPDU
doesn't get transmitted from A -> C. Because of that
the link from A-C is transistioned to the forwarding state.

The root cause of this is that the configuration messages
is generated with bogus message age, and dropped before
sending.

In the standardmessage_age is supposed to be:
  the time since the generation of the Configuration BPDU by
  the Root that instigated the generation of this Configuration BPDU.

Reimplement this by recording the timestamp (age + jiffies) when
recording config information. The old code incorrectly used the time
elapsed on the ageing timer which was incorrect.

See also:
  https://bugzilla.vyatta.com/show_bug.cgi?id=7164

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agommc: sdhci-esdhc-imx: SDHCI_CARD_PRESENT does not get cleared
Shawn Guo [Tue, 21 Jun 2011 14:41:49 +0000 (22:41 +0800)]
mmc: sdhci-esdhc-imx: SDHCI_CARD_PRESENT does not get cleared

commit 803862a6f7de4939e0a557214e5e4b37e36f87ff upstream.

The function esdhc_readl_le intends to clear bit SDHCI_CARD_PRESENT,
when the card detect gpio tells there is no card.  But it does not
clear the bit actually.  The patch gives a fix on that.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agommc: Added quirks for Ricoh 1180:e823 lower base clock frequency
Manoj Iyer [Mon, 11 Jul 2011 21:28:35 +0000 (16:28 -0500)]
mmc: Added quirks for Ricoh 1180:e823 lower base clock frequency

commit 15bed0f2fa8e1d7db201692532c210a7823d2d21 upstream.

Ricoh 1180:e823 does not recognize certain types of SD/MMC cards,
as reported at http://launchpad.net/bugs/773524.  Lowering the SD
base clock frequency from 200Mhz to 50Mhz fixes this issue. This
solution was suggest by Koji Matsumuro, Ricoh Company, Ltd.

This change has no negative performance effect on standard SD
cards, though it's quite possible that there will be one on
UHS-1 cards.

Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Tested-by: Daniel Manrique <daniel.manrique@canonical.com>
Cc: Koji Matsumuro <matsumur@nts.ricoh.co.jp>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: serial: add IDs for WinChipHead USB->RS232 adapter
Wolfgang Denk [Tue, 19 Jul 2011 09:25:38 +0000 (11:25 +0200)]
USB: serial: add IDs for WinChipHead USB->RS232 adapter

commit 026dfaf18973404a01f488d6aa556a8c466e06a4 upstream.

Add ID 4348:5523 for WinChipHead USB->RS 232 adapter with
Prolifec PL2303 chipset

Signed-off-by: Wolfgang Denk <wd@denx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoLinux 3.0 v3.0
Linus Torvalds [Fri, 22 Jul 2011 02:17:23 +0000 (19:17 -0700)]
Linux 3.0

12 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel...
Linus Torvalds [Fri, 22 Jul 2011 00:20:57 +0000 (17:20 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  sparc,kgdbts: fix compile regression with kgdb test suite

12 years agosparc,kgdbts: fix compile regression with kgdb test suite
Jason Wessel [Tue, 19 Jul 2011 20:43:16 +0000 (15:43 -0500)]
sparc,kgdbts: fix compile regression with kgdb test suite

Commit 63ab25ebbc (kgdbts: unify/generalize gdb breakpoint adjustment)
introduced a compile regression on sparc.

kgdbts.c: In function 'check_and_rewind_pc':
kgdbts.c:307: error: implicit declaration of function 'instruction_pointer_set'

Simply add the correct macro definition for instruction pointer on the
Sparc architecture.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: David S. Miller <davem@davemloft.net>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Thu, 21 Jul 2011 21:28:01 +0000 (14:28 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  CIFS: Fix wrong length in cifs_iovec_read

12 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 21 Jul 2011 19:25:39 +0000 (12:25 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Make Dell Latitude E6420 use reboot=pci
  x86: Make Dell Latitude E5420 use reboot=pci

12 years agox86: Make Dell Latitude E6420 use reboot=pci
H. Peter Anvin [Thu, 21 Jul 2011 18:22:21 +0000 (11:22 -0700)]
x86: Make Dell Latitude E6420 use reboot=pci

Yet another variant of the Dell Latitude series which requires
reboot=pci.

From the E5420 bug report by Daniel J Blueman:

> The E6420 is affected also (same platform, different casing and
> features), which provides an external confirmation of the issue; I can
> submit a patch for that later or include it if you prefer:
> http://linux.koolsolutions.com/2009/08/04/howto-fix-linux-hangfreeze-during-reboots-and-restarts/

Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
12 years agox86: Make Dell Latitude E5420 use reboot=pci
Daniel J Blueman [Fri, 13 May 2011 01:04:59 +0000 (09:04 +0800)]
x86: Make Dell Latitude E5420 use reboot=pci

Rebooting on the Dell E5420 often hangs with the keyboard or ACPI
methods, but is reliable via the PCI method.

[ hpa: this was deferred because we believed for a long time that the
  recent reshuffling of the boot priorities in commit
  660e34cebf0a11d54f2d5dd8838607452355f321 fixed this platform.
  Unfortunately that turned out to be incorrect. ]

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Link: http://lkml.kernel.org/r/1305248699-2347-1-git-send-email-daniel.blueman@gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
12 years agoMerge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keith...
Linus Torvalds [Thu, 21 Jul 2011 18:07:18 +0000 (11:07 -0700)]
Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6

* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6:
  drm/i915: Fix unfenced alignment on pre-G33 hardware
  drm/i915: Add quirk to disable SSC on Lenovo U160 LVDS

12 years agovfs: drop conditional inode prefetch in __do_lookup_rcu
Linus Torvalds [Thu, 21 Jul 2011 18:01:42 +0000 (11:01 -0700)]
vfs: drop conditional inode prefetch in __do_lookup_rcu

It seems to hurt performance in real life.  Yes, the inode will be used
later, but the conditional doesn't seem to predict all that well
(negative dentries are not uncommon) and it looks like the cost of
prefetching is simply higher than depending on the cache doing the right
thing.

As usual.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoFS-Cache: Fix __fscache_uncache_all_inode_pages()'s outer loop
Jan Beulich [Thu, 21 Jul 2011 14:02:43 +0000 (15:02 +0100)]
FS-Cache: Fix __fscache_uncache_all_inode_pages()'s outer loop

The compiler, at least for ix86 and m68k, validly warns that the
comparison:

next <= (loff_t)-1

is always true (and it's always true also for x86-64 and probably all
other arches - as long as pgoff_t isn't wider than loff_t).  The
intention appears to be to avoid wrapping of "next", so rather than
eliminating the pointless comparison, fix the loop to indeed get exited
when "next" would otherwise wrap.

On m68k the following warning is observed:

  fs/fscache/page.c: In function '__fscache_uncache_all_inode_pages':
  fs/fscache/page.c:979: warning: comparison is always false due to limited range of data type

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Suresh Jayaraman <sjayaraman@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoCIFS: Fix wrong length in cifs_iovec_read
Pavel Shilovsky [Wed, 20 Jul 2011 14:24:09 +0000 (18:24 +0400)]
CIFS: Fix wrong length in cifs_iovec_read

Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
12 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 20 Jul 2011 22:56:25 +0000 (15:56 -0700)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  signal: align __lock_task_sighand() irq disabling and RCU
  softirq,rcu: Inform RCU of irq_exit() activity
  sched: Add irq_{enter,exit}() to scheduler_ipi()
  rcu: protect __rcu_read_unlock() against scheduler-using irq handlers
  rcu: Streamline code produced by __rcu_read_unlock()
  rcu: Fix RCU_BOOST race handling current->rcu_read_unlock_special
  rcu: decrease rcu_report_exp_rnp coupling with scheduler

12 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 20 Jul 2011 22:55:48 +0000 (15:55 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Avoid creating superfluous NUMA domains on non-NUMA systems
  sched: Allow for overlapping sched_domain spans
  sched: Break out cpu_power from the sched_group structure

12 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 20 Jul 2011 22:33:59 +0000 (15:33 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86. reboot: Make Dell Latitude E6320 use reboot=pci
  x86, doc only: Correct real-mode kernel header offset for init_size
  x86: Disable AMD_NUMA for 32bit for now

12 years agoMerge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck...
Ingo Molnar [Wed, 20 Jul 2011 18:59:26 +0000 (20:59 +0200)]
Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/urgent

12 years agosignal: align __lock_task_sighand() irq disabling and RCU
Paul E. McKenney [Tue, 19 Jul 2011 10:25:36 +0000 (03:25 -0700)]
signal: align __lock_task_sighand() irq disabling and RCU

The __lock_task_sighand() function calls rcu_read_lock() with interrupts
and preemption enabled, but later calls rcu_read_unlock() with interrupts
disabled.  It is therefore possible that this RCU read-side critical
section will be preempted and later RCU priority boosted, which means that
rcu_read_unlock() will call rt_mutex_unlock() in order to deboost itself, but
with interrupts disabled. This results in lockdep splats, so this commit
nests the RCU read-side critical section within the interrupt-disabled
region of code.  This prevents the RCU read-side critical section from
being preempted, and thus prevents the attempt to deboost with interrupts
disabled.

It is quite possible that a better long-term fix is to make rt_mutex_unlock()
disable irqs when acquiring the rt_mutex structure's ->wait_lock.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agosoftirq,rcu: Inform RCU of irq_exit() activity
Peter Zijlstra [Tue, 19 Jul 2011 22:32:00 +0000 (15:32 -0700)]
softirq,rcu: Inform RCU of irq_exit() activity

The rcu_read_unlock_special() function relies on in_irq() to exclude
scheduler activity from interrupt level.  This fails because exit_irq()
can invoke the scheduler after clearing the preempt_count() bits that
in_irq() uses to determine that it is at interrupt level.  This situation
can result in failures as follows:

 $task IRQ SoftIRQ

 rcu_read_lock()

 /* do stuff */

 <preempt> |= UNLOCK_BLOCKED

 rcu_read_unlock()
   --t->rcu_read_lock_nesting

irq_enter();
/* do stuff, don't use RCU */
irq_exit();
  sub_preempt_count(IRQ_EXIT_OFFSET);
  invoke_softirq()

ttwu();
  spin_lock_irq(&pi->lock)
  rcu_read_lock();
  /* do stuff */
  rcu_read_unlock();
    rcu_read_unlock_special()
      rcu_report_exp_rnp()
        ttwu()
          spin_lock_irq(&pi->lock) /* deadlock */

   rcu_read_unlock_special(t);

Ed can simply trigger this 'easy' because invoke_softirq() immediately
does a ttwu() of ksoftirqd/# instead of doing the in-place softirq stuff
first, but even without that the above happens.

Cure this by also excluding softirqs from the
rcu_read_unlock_special() handler and ensuring the force_irqthreads
ksoftirqd/# wakeup is done from full softirq context.

[ Alternatively, delaying the ->rcu_read_lock_nesting decrement
  until after the special handling would make the thing more robust
  in the face of interrupts as well.  And there is a separate patch
  for that. ]

Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-and-tested-by: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agosched: Add irq_{enter,exit}() to scheduler_ipi()
Peter Zijlstra [Tue, 19 Jul 2011 22:07:25 +0000 (15:07 -0700)]
sched: Add irq_{enter,exit}() to scheduler_ipi()

Ensure scheduler_ipi() calls irq_{enter,exit} when it does some actual
work. Traditionally we never did any actual work from the resched IPI
and all magic happened in the return from interrupt path.

Now that we do do some work, we need to ensure irq_{enter,exit} are
called so that we don't confuse things.

This affects things like timekeeping, NO_HZ and RCU, basically
everything with a hook in irq_enter/exit.

Explicit examples of things going wrong are:

  sched_clock_cpu() -- has a callback when leaving NO_HZ state to take
                    a new reading from GTOD and TSC. Without this
                    callback, time is stuck in the past.

  RCU -- needs in_irq() to work in order to avoid some nasty deadlocks

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: protect __rcu_read_unlock() against scheduler-using irq handlers
Paul E. McKenney [Mon, 18 Jul 2011 04:14:35 +0000 (21:14 -0700)]
rcu: protect __rcu_read_unlock() against scheduler-using irq handlers

The addition of RCU read-side critical sections within runqueue and
priority-inheritance lock critical sections introduced some deadlock
cycles, for example, involving interrupts from __rcu_read_unlock()
where the interrupt handlers call wake_up().  This situation can cause
the instance of __rcu_read_unlock() invoked from interrupt to do some
of the processing that would otherwise have been carried out by the
task-level instance of __rcu_read_unlock().  When the interrupt-level
instance of __rcu_read_unlock() is called with a scheduler lock held
from interrupt-entry/exit situations where in_irq() returns false,
deadlock can result.

This commit resolves these deadlocks by using negative values of
the per-task ->rcu_read_lock_nesting counter to indicate that an
instance of __rcu_read_unlock() is in flight, which in turn prevents
instances from interrupt handlers from doing any special processing.
This patch is inspired by Steven Rostedt's earlier patch that similarly
made __rcu_read_unlock() guard against interrupt-mediated recursion
(see https://lkml.org/lkml/2011/7/15/326), but this commit refines
Steven's approach to avoid the need for preemption disabling on the
__rcu_read_unlock() fastpath and to also avoid the need for manipulating
a separate per-CPU variable.

This patch avoids need for preempt_disable() by instead using negative
values of the per-task ->rcu_read_lock_nesting counter.  Note that nested
rcu_read_lock()/rcu_read_unlock() pairs are still permitted, but they will
never see ->rcu_read_lock_nesting go to zero, and will therefore never
invoke rcu_read_unlock_special(), thus preventing them from seeing the
RCU_READ_UNLOCK_BLOCKED bit should it be set in ->rcu_read_unlock_special.
This patch also adds a check for ->rcu_read_unlock_special being negative
in rcu_check_callbacks(), thus preventing the RCU_READ_UNLOCK_NEED_QS
bit from being set should a scheduling-clock interrupt occur while
__rcu_read_unlock() is exiting from an outermost RCU read-side critical
section.

Of course, __rcu_read_unlock() can be preempted during the time that
->rcu_read_lock_nesting is negative.  This could result in the setting
of the RCU_READ_UNLOCK_BLOCKED bit after __rcu_read_unlock() checks it,
and would also result it this task being queued on the corresponding
rcu_node structure's blkd_tasks list.  Therefore, some later RCU read-side
critical section would enter rcu_read_unlock_special() to clean up --
which could result in deadlock if that critical section happened to be in
the scheduler where the runqueue or priority-inheritance locks were held.

This situation is dealt with by making rcu_preempt_note_context_switch()
check for negative ->rcu_read_lock_nesting, thus refraining from
queuing the task (and from setting RCU_READ_UNLOCK_BLOCKED) if we are
already exiting from the outermost RCU read-side critical section (in
other words, we really are no longer actually in that RCU read-side
critical section).  In addition, rcu_preempt_note_context_switch()
invokes rcu_read_unlock_special() to carry out the cleanup in this case,
which clears out the ->rcu_read_unlock_special bits and dequeues the task
(if necessary), in turn avoiding needless delay of the current RCU grace
period and needless RCU priority boosting.

It is still illegal to call rcu_read_unlock() while holding a scheduler
lock if the prior RCU read-side critical section has ever had either
preemption or irqs enabled.  However, the common use case is legal,
namely where then entire RCU read-side critical section executes with
irqs disabled, for example, when the scheduler lock is held across the
entire lifetime of the RCU read-side critical section.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agosched: Avoid creating superfluous NUMA domains on non-NUMA systems
Peter Zijlstra [Wed, 20 Jul 2011 16:42:57 +0000 (18:42 +0200)]
sched: Avoid creating superfluous NUMA domains on non-NUMA systems

When creating sched_domains, stop when we've covered the entire
target span instead of continuing to create domains, only to
later find they're redundant and throw them away again.

This avoids single node systems from touching funny NUMA
sched_domain creation code and reduces the risks of the new
SD_OVERLAP code.

Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Anton Blanchard <anton@samba.org>
Cc: mahesh@linux.vnet.ibm.com
Cc: benh@kernel.crashing.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1311180177.29152.57.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agosched: Allow for overlapping sched_domain spans
Peter Zijlstra [Fri, 15 Jul 2011 08:35:52 +0000 (10:35 +0200)]
sched: Allow for overlapping sched_domain spans

Allow for sched_domain spans that overlap by giving such domains their
own sched_group list instead of sharing the sched_groups amongst
each-other.

This is needed for machines with more than 16 nodes, because
sched_domain_node_span() will generate a node mask from the
16 nearest nodes without regard if these masks have any overlap.

Currently sched_domains have a sched_group that maps to their child
sched_domain span, and since there is no overlap we share the
sched_group between the sched_domains of the various CPUs. If however
there is overlap, we would need to link the sched_group list in
different ways for each cpu, and hence sharing isn't possible.

In order to solve this, allocate private sched_groups for each CPU's
sched_domain but have the sched_groups share a sched_group_power
structure such that we can uniquely track the power.

Reported-and-tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-08bxqw9wis3qti9u5inifh3y@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agosched: Break out cpu_power from the sched_group structure
Peter Zijlstra [Thu, 14 Jul 2011 11:00:06 +0000 (13:00 +0200)]
sched: Break out cpu_power from the sched_group structure

In order to prepare for non-unique sched_groups per domain, we need to
carry the cpu_power elsewhere, so put a level of indirection in.

Reported-and-tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-qkho2byuhe4482fuknss40ad@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Wed, 20 Jul 2011 05:10:28 +0000 (22:10 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix file mode calculation

12 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
Linus Torvalds [Wed, 20 Jul 2011 05:10:05 +0000 (22:10 -0700)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  davinci: DM365 EVM: fix video input mux bits
  ARM: davinci: Check for NULL return from irq_alloc_generic_chip
  arm: davinci: Fix low level gpio irq handlers' argument

12 years agovmscan: fix a livelock in kswapd
Shaohua Li [Tue, 19 Jul 2011 15:49:26 +0000 (08:49 -0700)]
vmscan: fix a livelock in kswapd

I'm running a workload which triggers a lot of swap in a machine with 4
nodes.  After I kill the workload, I found a kswapd livelock.  Sometimes
kswapd3 or kswapd2 are keeping running and I can't access filesystem,
but most memory is free.

This looks like a regression since commit 08951e545918c159 ("mm: vmscan:
correct check for kswapd sleeping in sleeping_prematurely").

Node 2 and 3 have only ZONE_NORMAL, but balance_pgdat() will return 0
for classzone_idx.  The reason is end_zone in balance_pgdat() is 0 by
default, if all zones have watermark ok, end_zone will keep 0.

Later sleeping_prematurely() always returns true.  Because this is an
order 3 wakeup, and if classzone_idx is 0, both balanced_pages and
present_pages in pgdat_balanced() are 0.  We add a special case here.
If a zone has no page, we think it's balanced.  This fixes the livelock.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agofs/libfs.c: fix simple_attr_write() on 32bit machines
Akinobu Mita [Tue, 19 Jul 2011 15:49:25 +0000 (08:49 -0700)]
fs/libfs.c: fix simple_attr_write() on 32bit machines

Assume that /sys/kernel/debug/dummy64 is debugfs file created by
debugfs_create_x64().

# cd /sys/kernel/debug
# echo 0x1234567812345678 > dummy64
# cat dummy64
0x0000000012345678

# echo 0x80000000 > dummy64
# cat dummy64
0xffffffff80000000

A value larger than INT_MAX cannot be written to the debugfs file created
by debugfs_create_u64 or debugfs_create_x64 on 32bit machine.  Because
simple_attr_write() uses simple_strtol() for the conversion.

To fix this, use simple_strtoll() instead.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>