]> git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
12 years agoLinux 2.6.38.8 v2.6.38.8
Greg Kroah-Hartman [Fri, 3 Jun 2011 01:35:11 +0000 (10:35 +0900)]
Linux 2.6.38.8

12 years agoAppArmor: fix oops in apparmor_setprocattr
Kees Cook [Tue, 31 May 2011 18:31:41 +0000 (11:31 -0700)]
AppArmor: fix oops in apparmor_setprocattr

commit a5b2c5b2ad5853591a6cac6134cd0f599a720865 upstream.

When invalid parameters are passed to apparmor_setprocattr a NULL deref
oops occurs when it tries to record an audit message. This is because
it is passing NULL for the profile parameter for aa_audit. But aa_audit
now requires that the profile passed is not NULL.

Fix this by passing the current profile on the task that is trying to
setprocattr.

Signed-off-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoext4: Use schedule_timeout_interruptible() for waiting in lazyinit thread
Lukas Czerner [Fri, 20 May 2011 17:49:04 +0000 (13:49 -0400)]
ext4: Use schedule_timeout_interruptible() for waiting in lazyinit thread

commit 4ed5c033c11b33149d993734a6a8de1016e8f03f upstream.

In order to make lazyinit eat approx. 10% of io bandwidth at max, we
are sleeping between zeroing each single inode table. For that purpose
we are using timer which wakes up thread when it expires. It is set
via add_timer() and this may cause troubles in the case that thread
has been woken up earlier and in next iteration we call add_timer() on
still running timer hence hitting BUG_ON in add_timer(). We could fix
that by using mod_timer() instead however we can use
schedule_timeout_interruptible() for waiting and hence simplifying
things a lot.

This commit exchange the old "waiting mechanism" with simple
schedule_timeout_interruptible(), setting the time to sleep. Hence we
do not longer need li_wait_daemon waiting queue and others, so get rid
of it.

Addresses-Red-Hat-Bugzilla: #699708

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoxen mmu: fix a race window causing leave_mm BUG()
Tian, Kevin [Thu, 12 May 2011 02:56:08 +0000 (10:56 +0800)]
xen mmu: fix a race window causing leave_mm BUG()

commit 7899891c7d161752f29abcc9bc0a9c6c3a3af26c upstream.

There's a race window in xen_drop_mm_ref, where remote cpu may exit
dirty bitmap between the check on this cpu and the point where remote
cpu handles drop request. So in drop_other_mm_ref we need check
whether TLB state is still lazy before calling into leave_mm. This
bug is rarely observed in earlier kernel, but exaggerated by the
commit 831d52bc153971b70e64eccfbed2b232394f22f8
("x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm")
which clears bitmap after changing the TLB state. the call trace is as below:

---------------------------------
kernel BUG at arch/x86/mm/tlb.c:61!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
CPU 1
Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcs pkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
RSP: e02b:ffff88002805be48  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
FS:  00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000000 00
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
Stack:
 ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
<0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
<0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
Call Trace:
 <IRQ>
 [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
 [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
 [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
 [<ffffffff81013efe>] xen_do_hyper visor_callback+0x1e/0x30
 <EOI>
 [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
 [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
 [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
 [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
 [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
 [<ffffffff 8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
 [<ffffffff81013daa>] ? child_rip+0xa/0x20
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
 [<ffffffff81013da0>] ? child_rip+0x0/0x20
Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
 RSP <ffff88002805be48>
---[ end trace ce9cee6832a9c503 ]---

Tested-by: Maoxiaoyun<tinnycloud@hotmail.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
[v1: Fleshed out the git description a bit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoPCI: Add quirk for setting valid class for TI816X Endpoint
Hemant Pedanekar [Tue, 5 Apr 2011 07:02:50 +0000 (12:32 +0530)]
PCI: Add quirk for setting valid class for TI816X Endpoint

commit 63c4408074cbcc070ac17fc10e524800eb9bd0b0 upstream.

TI816X (common name for DM816x/C6A816x/AM389x family) devices configured
to boot as PCIe Endpoint have class code = 0. This makes kernel PCI bus
code to skip allocating BARs to these devices resulting into following
type of error when trying to enable them:

"Device 0000:01:00.0 not available because of resource collisions"

The device cannot be operated because of the above issue.

This patch adds a ID specific (TI VENDOR ID and 816X DEVICE ID based)
'early' fixup quirk to replace class code with
PCI_CLASS_MULTIMEDIA_VIDEO as class.

Signed-off-by: Hemant Pedanekar <hemantp@ti.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoNFSv4.1: Fix the handling of NFS4ERR_SEQ_MISORDERED errors
Trond Myklebust [Thu, 26 May 2011 18:26:35 +0000 (14:26 -0400)]
NFSv4.1: Fix the handling of NFS4ERR_SEQ_MISORDERED errors

commit 444f72fe7e7b5f4db34cee933fa3546ebb8e9122 upstream.

Currently, the call to nfs4_schedule_session_recovery() will actually just
result in a test of the lease when what we really want is to force a
session reset.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoNFSv4: Handle expired stateids when the lease is still valid
Trond Myklebust [Thu, 26 May 2011 18:26:35 +0000 (14:26 -0400)]
NFSv4: Handle expired stateids when the lease is still valid

commit 0ced63d1a245ac11241a5d37932e6d04d9c8040d upstream.

Currently, if the server returns NFS4ERR_EXPIRED in reply to a READ or
WRITE, but the RENEW test determines that the lease is still active, we
fail to recover and end up looping forever in a READ/WRITE + RENEW death
spiral.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoSUNRPC: Deal with the lack of a SYN_SENT sk->sk_state_change callback...
Trond Myklebust [Sat, 19 Mar 2011 00:21:23 +0000 (20:21 -0400)]
SUNRPC: Deal with the lack of a SYN_SENT sk->sk_state_change callback...

commit fe19a96b10032035a35779f42ad59e35d6dd8ffd upstream.

The TCP connection state code depends on the state_change() callback
being called when the SYN_SENT state is set. However the networking layer
doesn't actually call us back in that case.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agodrm/radeon/kms: add wait idle ioctl for eg->cayman
Dave Airlie [Thu, 19 May 2011 04:14:43 +0000 (14:14 +1000)]
drm/radeon/kms: add wait idle ioctl for eg->cayman

commit 97bfd0acd32e9639c9136e03955d574655d5cc2b upstream.

None of the latest GPUs had this hooked up, this is necessary for
correct operation in a lot of cases, however we should test this on a few
GPUs in these families as we've had problems in this area before.

Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agodrm/radeon/evergreen/btc/fusion: setup hdp to invalidate and flush when asked
Alex Deucher [Thu, 19 May 2011 15:07:57 +0000 (11:07 -0400)]
drm/radeon/evergreen/btc/fusion: setup hdp to invalidate and flush when asked

commit f25a5c63bfa017498c9adecb24d649ae96ba5c68 upstream.

This needs to be explicitly set on btc.  It's set by default
on evergreen/fusion, so it fine to just unconditionally enable it for
all chips.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agodrm/i915: fix user irq miss in BSD ring on g4x
Boqun Feng [Mon, 16 May 2011 08:02:39 +0000 (16:02 +0800)]
drm/i915: fix user irq miss in BSD ring on g4x

commit 5bfa1063a775836a84f97e4df863fc36e1f856ad upstream.

On g4x, user interrupt in BSD ring is missed.
This is because though g4x and ironlake share the same bsd_ring,
their interrupt control interfaces have _two_ differences.

1.different irq enable/disable functions:
On g4x are i915_enable_irq and i915_disable_irq.
On ironlake are ironlake_enable_irq and ironlake_disable_irq.
2.different irq flag:
On g4x user interrupt flag in BSD ring on is I915_BSD_USER_INTERRUPT.
On ironlake is GT_BSD_USER_INTERRUPT

Old bsd_ring_get/put_irq call ring_get_irq and ring_get_irq.
ring_get_irq and ring_put_irq only call ironlake_enable/disable_irq.
So comes the irq miss on g4x.

To fix this, as other rings' code do, conditionally call different
functions(i915_enable/disable_irq and ironlake_enable/disable_irq)
and use different interrupt flags in bsd_ring_get/put_irq.

Signed-off-by: Boqun Feng <boqun.feng@intel.com>
Reviewed-by: Xiang, Haihao <haihao.xiang@intel.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agobrd: handle on-demand devices correctly
Namhyung Kim [Thu, 26 May 2011 19:06:50 +0000 (21:06 +0200)]
brd: handle on-demand devices correctly

commit af46566885a373b0a526932484cd8fef8de7b598 upstream.

When finding or allocating a ram disk device, brd_probe() did not take
partition numbers into account so that it can result to a different
device. Consider following example (I set CONFIG_BLK_DEV_RAM_COUNT=4
for simplicity) :

$ sudo modprobe brd max_part=15
$ ls -l /dev/ram*
brw-rw---- 1 root disk 1,  0 2011-05-25 15:41 /dev/ram0
brw-rw---- 1 root disk 1, 16 2011-05-25 15:41 /dev/ram1
brw-rw---- 1 root disk 1, 32 2011-05-25 15:41 /dev/ram2
brw-rw---- 1 root disk 1, 48 2011-05-25 15:41 /dev/ram3
$ sudo mknod /dev/ram4 b 1 64
$ sudo dd if=/dev/zero of=/dev/ram4 bs=4k count=256
256+0 records in
256+0 records out
1048576 bytes (1.0 MB) copied, 0.00215578 s, 486 MB/s
namhyung@leonhard:linux$ ls -l /dev/ram*
brw-rw---- 1 root disk 1,    0 2011-05-25 15:41 /dev/ram0
brw-rw---- 1 root disk 1,   16 2011-05-25 15:41 /dev/ram1
brw-rw---- 1 root disk 1,   32 2011-05-25 15:41 /dev/ram2
brw-rw---- 1 root disk 1,   48 2011-05-25 15:41 /dev/ram3
brw-r--r-- 1 root root 1,   64 2011-05-25 15:45 /dev/ram4
brw-rw---- 1 root disk 1, 1024 2011-05-25 15:44 /dev/ram64

After this patch, /dev/ram4 - instead of /dev/ram64 - was
accessed correctly.

In addition, 'range' passed to blk_register_region() should
include all range of dev_t that RAMDISK_MAJOR can address.
It does not need to be limited by partition numbers unless
'rd_nr' param was specified.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agobrd: limit 'max_part' module param to DISK_MAX_PARTS
Namhyung Kim [Thu, 26 May 2011 19:06:50 +0000 (21:06 +0200)]
brd: limit 'max_part' module param to DISK_MAX_PARTS

commit 315980c8688c4b06713c1a5fe9d64cdf8ab57a72 upstream.

The 'max_part' parameter controls the number of maximum partition
a brd device can have. However if a user specifies very large
value it would exceed the limitation of device minor number and
can cause a kernel panic (or, at least, produce invalid device
nodes in some cases).

On my desktop system, following command kills the kernel. On qemu,
it triggers similar oops but the kernel was alive:

$ sudo modprobe brd max_part=100000
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
 IP: [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
 PGD 7af1067 PUD 7b19067 PMD 0
 Oops: 0000 [#1] SMP
 last sysfs file:
 CPU 0
 Modules linked in: brd(+)

 Pid: 44, comm: insmod Tainted: G        W   2.6.39-qemu+ #158 Bochs Bochs
 RIP: 0010:[<ffffffff81110a9a>]  [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
 RSP: 0018:ffff880007b15d78  EFLAGS: 00000286
 RAX: ffff880007b05478 RBX: ffff880007a52760 RCX: ffff880007b15dc8
 RDX: ffff880007a4f900 RSI: ffff880007b15e48 RDI: ffff880007a52760
 RBP: ffff880007b15da8 R08: 0000000000000002 R09: 0000000000000000
 R10: ffff880007b15e48 R11: ffff880007b05478 R12: 0000000000000000
 R13: ffff880007b05478 R14: 0000000000400920 R15: 0000000000000063
 FS:  0000000002160880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000058 CR3: 0000000007b1c000 CR4: 00000000000006b0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
 Process insmod (pid: 44, threadinfo ffff880007b14000, task ffff880007acb980)
 Stack:
  ffff880007b15dc8 ffff880007b05478 ffff880007b15da8 00000000fffffffe
  ffff880007a52760 ffff880007b05478 ffff880007b15de8 ffffffff81143c0a
  0000000000400920 ffff880007a52760 ffff880007b05478 0000000000000000
 Call Trace:
  [<ffffffff81143c0a>] kobject_add_internal+0xdf/0x1a0
  [<ffffffff81143da1>] kobject_add_varg+0x41/0x50
  [<ffffffff81143e6b>] kobject_add+0x64/0x66
  [<ffffffff8113bbe7>] blk_register_queue+0x5f/0xb8
  [<ffffffff81140f72>] add_disk+0xdf/0x289
  [<ffffffffa00040df>] brd_init+0xdf/0x1aa [brd]
  [<ffffffffa0004000>] ? 0xffffffffa0003fff
  [<ffffffffa0004000>] ? 0xffffffffa0003fff
  [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
  [<ffffffff8108516c>] sys_init_module+0x9c/0x1dc
  [<ffffffff812ff4bb>] system_call_fastpath+0x16/0x1b
 Code: 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 18 48 85 ff 75 04 0f 0b eb fe 48 8b 47 18 49 c7 c4 70 1e 4d 81 48 85 c0 74 04 4c 8b 60 30
  8b 44 24 58 45 31 ed 0f b6 c4 85 c0 74 0d 48 8b 43 28 48 89
 RIP  [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
  RSP <ffff880007b15d78>
 CR2: 0000000000000058
 ---[ end trace aebb1175ce1f6739 ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoatm: expose ATM device index in sysfs
Dan Williams [Fri, 27 May 2011 04:51:54 +0000 (04:51 +0000)]
atm: expose ATM device index in sysfs

commit e7a46b4d0839c2a3aa2e0ae0b145f293f6738498 upstream.

It's currently exposed only through /proc which, besides requiring
screen-scraping, doesn't allow userspace to distinguish between two
identical ATM adapters with different ATM indexes.  The ATM device index
is required when using PPPoATM on a system with multiple ATM adapters.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agotmpfs: fix race between truncate and writepage
Hugh Dickins [Sat, 28 May 2011 20:14:09 +0000 (13:14 -0700)]
tmpfs: fix race between truncate and writepage

commit 826267cf1e6c6899eda1325a19f1b1d15c558b20 upstream.

While running fsx on tmpfs with a memhog then swapoff, swapoff was hanging
(interruptibly), repeatedly failing to locate the owner of a 0xff entry in
the swap_map.

Although shmem_writepage() does abandon when it sees incoming page index
is beyond eof, there was still a window in which shmem_truncate_range()
could come in between writepage's dropping lock and updating swap_map,
find the half-completed swap_map entry, and in trying to free it,
leave it in a state that swap_shmem_alloc() could not correct.

Arguably a bug in __swap_duplicate()'s and swap_entry_free()'s handling
of the different cases, but easiest to fix by moving swap_shmem_alloc()
under cover of the lock.

More interesting than the bug: it's been there since 2.6.33, why could
I not see it with earlier kernels?  The mmotm of two weeks ago seems to
have some magic for generating races, this is just one of three I found.

With yesterday's git I first saw this in mainline, bisected in search of
that magic, but the easy reproducibility evaporated.  Oh well, fix the bug.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoARM: 6941/1: cache: ensure MVA is cacheline aligned in flush_kern_dcache_area
Will Deacon [Thu, 26 May 2011 10:20:19 +0000 (11:20 +0100)]
ARM: 6941/1: cache: ensure MVA is cacheline aligned in flush_kern_dcache_area

commit a248b13b21ae00b97638b4f435c8df3075808b5d upstream.

The v6 and v7 implementations of flush_kern_dcache_area do not align
the passed MVA to the size of a cacheline in the data cache. If a
misaligned address is used, only a subset of the requested area will
be flushed. This has been observed to cause failures in SMP boot where
the secondary_data initialised by the primary CPU is not cacheline
aligned, causing the secondary CPUs to read incorrect values for their
pgd and stack pointers.

This patch ensures that the base address is cacheline aligned before
flushing the d-cache.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agodm table: reject devices without request fns
Milan Broz [Sun, 29 May 2011 12:02:52 +0000 (13:02 +0100)]
dm table: reject devices without request fns

commit f4808ca99a203f20b4475601748e44b25a65bdec upstream.

This patch adds a check that a block device has a request function
defined before it is used.  Otherwise, misconfiguration can cause an oops.

Because we are allowing devices with zero size e.g. an offline multipath
device as in commit 2cd54d9bedb79a97f014e86c0da393416b264eb3
("dm: allow offline devices") there needs to be an additional check
to ensure devices are initialised.  Some block devices, like a loop
device without a backing file, exist but have no request function.

Reproducer is trivial: dm-mirror on unbound loop device
(no backing file on loop devices)

dmsetup create x --table "0 8 mirror core 2 8 sync 2 /dev/loop0 0 /dev/loop1 0"

and mirror resync will immediatelly cause OOps.

BUG: unable to handle kernel NULL pointer dereference at   (null)
 ? generic_make_request+0x2bd/0x590
 ? kmem_cache_alloc+0xad/0x190
 submit_bio+0x53/0xe0
 ? bio_add_page+0x3b/0x50
 dispatch_io+0x1ca/0x210 [dm_mod]
 ? read_callback+0x0/0xd0 [dm_mirror]
 dm_io+0xbb/0x290 [dm_mod]
 do_mirror+0x1e0/0x748 [dm_mirror]

Signed-off-by: Milan Broz <mbroz@redhat.com>
Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoidle governor: Avoid lock acquisition to read pm_qos before entering idle
Tim Chen [Fri, 11 Feb 2011 20:49:04 +0000 (12:49 -0800)]
idle governor: Avoid lock acquisition to read pm_qos before entering idle

commit 333c5ae9948194428fe6c5ef5c088304fc98263b upstream.

Thanks to the reviews and comments by Rafael, James, Mark and Andi.
Here's version 2 of the patch incorporating your comments and also some
update to my previous patch comments.

I noticed that before entering idle state, the menu idle governor will
look up the current pm_qos target value according to the list of qos
requests received.  This look up currently needs the acquisition of a
lock to access the list of qos requests to find the qos target value,
slowing down the entrance into idle state due to contention by multiple
cpus to access this list.  The contention is severe when there are a lot
of cpus waking and going into idle.  For example, for a simple workload
that has 32 pair of processes ping ponging messages to each other, where
64 cpu cores are active in test system, I see the following profile with
37.82% of cpu cycles spent in contention of pm_qos_lock:

-     37.82%          swapper  [kernel.kallsyms]          [k]
_raw_spin_lock_irqsave
   - _raw_spin_lock_irqsave
      - 95.65% pm_qos_request
           menu_select
           cpuidle_idle_call
         - cpu_idle
              99.98% start_secondary

A better approach will be to cache the updated pm_qos target value so
reading it does not require lock acquisition as in the patch below.
With this patch the contention for pm_qos_lock is removed and I saw a
2.2X increase in throughput for my message passing workload.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Acked-by: mark gross <markgross@thegnar.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agocpuidle: menu: fixed wrapping timers at 4.294 seconds
Tero Kristo [Thu, 24 Feb 2011 15:19:23 +0000 (17:19 +0200)]
cpuidle: menu: fixed wrapping timers at 4.294 seconds

commit 7467571f4480b273007517b26297c07154c73924 upstream.

Cpuidle menu governor is using u32 as a temporary datatype for storing
nanosecond values which wrap around at 4.294 seconds. This causes errors
in predicted sleep times resulting in higher than should be C state
selection and increased power consumption. This also breaks cpuidle
state residency statistics.

Signed-off-by: Tero Kristo <tero.kristo@nokia.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoi8k: Avoid lahf in 64-bit code
Luca Tettamanti [Wed, 25 May 2011 18:43:31 +0000 (20:43 +0200)]
i8k: Avoid lahf in 64-bit code

commit bc1f419c76a2d6450413ce4349f4e4a07be011d5 upstream.

i8k uses lahf to read the flag register in 64-bit code; early x86-64
CPUs, however, lack this instruction and we get an invalid opcode
exception at runtime.
Use pushf to load the flag register into the stack instead.

Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
Reported-by: Jeff Rickman <jrickman@myamigos.us>
Tested-by: Jeff Rickman <jrickman@myamigos.us>
Tested-by: Harry G McGavran Jr <w5pny@arrl.net>
Cc: Massimo Dal Zotto <dz@debian.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agokbuild: Fix GNU make v3.80 compatibility
Kevin Cernekee [Tue, 10 May 2011 22:47:16 +0000 (15:47 -0700)]
kbuild: Fix GNU make v3.80 compatibility

commit 43f67c98161c65f1b2e3af3a9ce6741850072c06 upstream.

According to Documentation/Changes, the kernel should be buildable with
GNU make 3.80+.  Commit 88d7be031f9f975bb3f50a0b5ef3796a671e7edf (kbuild:
Use a single clean rule for kernel and external modules) introduced the
"$(or" construct, which requires make 3.81.  This causes "make clean" to
malfunction when it is used with external modules.

Replace "$(or" with an equivalent "$(if" expression, to restore backward
compatibility.

Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUBIFS: fix a rare memory leak in ro to rw remounting path
Artem Bityutskiy [Fri, 6 May 2011 14:08:56 +0000 (17:08 +0300)]
UBIFS: fix a rare memory leak in ro to rw remounting path

commit eaeee242c531cd4b0a4a46e8b5dd7ef504380c42 upstream.

When re-mounting from R/O mode to R/W mode and the LEB count in the superblock
is not up-to date, because for the underlying UBI volume became larger, we
re-write the superblock. We allocate RAM for these purposes, but never free it.
So this is a memory leak, although very rare one.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoeCryptfs: Allow 2 scatterlist entries for encrypted filenames
Tyler Hicks [Tue, 17 May 2011 05:50:33 +0000 (00:50 -0500)]
eCryptfs: Allow 2 scatterlist entries for encrypted filenames

commit 8d08dab786ad5cc2aca2bf870de370144b78c85a upstream.

The buffers allocated while encrypting and decrypting long filenames can
sometimes straddle two pages. In this situation, virt_to_scatterlist()
will return -ENOMEM, causing the operation to fail and the user will get
scary error messages in their logs:

kernel: ecryptfs_write_tag_70_packet: Internal error whilst attempting
to convert filename memory to scatterlist; expected rc = 1; got rc =
[-12]. block_aligned_filename_size = [272]
kernel: ecryptfs_encrypt_filename: Error attempting to generate tag 70
packet; rc = [-12]
kernel: ecryptfs_encrypt_and_encode_filename: Error attempting to
encrypt filename; rc = [-12]
kernel: ecryptfs_lookup: Error attempting to encrypt and encode
filename; rc = [-12]

The solution is to allow up to 2 scatterlist entries to be used.

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agop54usb: add zoom 4410 usbid
Christian Lamparter [Fri, 13 May 2011 19:47:23 +0000 (21:47 +0200)]
p54usb: add zoom 4410 usbid

commit 9368a9a2378ab721f82f59430a135b4ce4ff5109 upstream.

Reported-by: Mark Davis <marked86@gmail.com>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agosh: fixup fpu.o compile order
Kuninori Morimoto [Fri, 15 Apr 2011 07:44:27 +0000 (16:44 +0900)]
sh: fixup fpu.o compile order

commit a375b15164dd9264f724ad941825e52c90145151 upstream.

arch_ptrace() was modified to reference init_fpu() to fix up xstate
initialization, which overlooked the fact that there are configurations
that don't enable any of hard FPU support or emulation, resulting in
build errors on DSP parts.

Given that init_fpu() simply sets up the xstate slab cache and is
side-stepped entirely for the DSP case, we can simply always build in the
helper and fix up the references.

Reported-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agosh: clkfwk: fixup clk_rate_table_build parameter in div6 clock
Kuninori Morimoto [Thu, 14 Apr 2011 08:13:53 +0000 (17:13 +0900)]
sh: clkfwk: fixup clk_rate_table_build parameter in div6 clock

commit 52c10ad22b7e317960b4d411c9a9ddeaf3d5ae39 upstream.

div6 clock should not use arch_flags for clk_rate_table_build,
because SH_CLK_DIV6_EXT doesn't care .arch_flags.
clk->freq_table[] will be all CPUFREQ_ENTRY_INVALID without this patch.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agocx88: hold device lock during sub-driver initialization
Jonathan Nieder [Sun, 1 May 2011 09:29:56 +0000 (06:29 -0300)]
cx88: hold device lock during sub-driver initialization

commit 1d6213ab995c61f7d1d81cf6cf876acf15d6e714 upstream.

cx8802_blackbird_probe makes a device node for the mpeg sub-device
before it has been added to dev->drvlist.  If the device is opened
during that time, the open succeeds but request_acquire cannot be
called, so the reference count remains zero.  Later, when the device
is closed, the reference count becomes negative --- uh oh.

Close the race by holding core->lock during probe and not releasing
until the device is in drvlist and initialization finished.
Previously the BKL prevented this race.

Reported-by: Andreas Huber <hobrom@gmx.at>
Tested-by: Andi Huber <hobrom@gmx.at>
Tested-by: Marlon de Boer <marlon@hyves.nl>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agocx88: fix locking of sub-driver operations
Jonathan Nieder [Sun, 1 May 2011 09:29:37 +0000 (06:29 -0300)]
cx88: fix locking of sub-driver operations

commit 1fe70e963028f34ba5e32488a7870ff4b410b19b upstream.

The BKL conversion of this driver seems to have gone wrong.
Loading the cx88-blackbird driver deadlocks.

The cause: mpeg_ops::open in the cx2388x blackbird driver acquires the
device lock and calls the sub-driver's request_acquire, which tries to
acquire the lock again.  Fix it by clarifying the semantics of
request_acquire, request_release, advise_acquire, and advise_release:
now all will rely on the caller to acquire the device lock.

Based on work by Ben Hutchings <ben@decadent.org.uk>.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=31962
Reported-by: Andi Huber <hobrom@gmx.at>
Tested-by: Andi Huber <hobrom@gmx.at>
Tested-by: Marlon de Boer <marlon@hyves.nl>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agocx88: protect per-device driver list with device lock
Jonathan Nieder [Sun, 1 May 2011 09:29:16 +0000 (06:29 -0300)]
cx88: protect per-device driver list with device lock

commit 8a317a8760cfffa8185b56ff59fb4b6c58488d79 upstream.

The BKL conversion of this driver seems to have gone wrong.  Various
uses of the sub-device and driver lists appear to be subject to race
conditions.

In particular, some functions access drvlist without a relevant lock
held, which will race against removal of drivers.  Let's start with
that --- clean up by consistently protecting dev->drvlist with
dev->core->lock, noting driver functions that require the device lock
to be held or not to be held.

After this patch, there are still some races --- e.g.,
cx8802_blackbird_remove can run between the time the blackbird driver
is acquired and the time it is used in mpeg_release, and there's a
similar race in cx88_dvb_bus_ctrl.  Later patches will address the
remaining known races and the deadlock noticed by Andi.  This patch
just makes the semantics clearer in preparation for those later
changes.

Based on work by Ben Hutchings <ben@decadent.org.uk>.

Tested-by: Andi Huber <hobrom@gmx.at>
Tested-by: Marlon de Boer <marlon@hyves.nl>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: remove remaining usages of hcd->state from usbcore and fix regression
Alan Stern [Tue, 17 May 2011 21:27:12 +0000 (17:27 -0400)]
USB: remove remaining usages of hcd->state from usbcore and fix regression

commit 69fff59de4d844f8b4c2454c3c23d32b69dcbfd7 upstream.

This patch (as1467) removes the last usages of hcd->state from
usbcore.  We no longer check to see if an interrupt handler finds that
a controller has died; instead we rely on host controller drivers to
make an explicit call to usb_hc_died().

This fixes a regression introduced by commit
9b37596a2e860404503a3f2a6513db60c296bfdc (USB: move usbcore away from
hcd->state).  It used to be that when a controller shared an IRQ with
another device and an interrupt arrived while hcd->state was set to
HC_STATE_HALT, the interrupt handler would be skipped.  The commit
removed that test; as a result the current code doesn't skip calling
the handler and ends up believing the controller has died, even though
it's only temporarily stopped.  The solution is to ignore HC_STATE_HALT
following the handler's return.

As a consequence of this change, several of the host controller
drivers need to be modified.  They can no longer implicitly rely on
usbcore realizing that a controller has died because of hcd->state.
The patch adds calls to usb_hc_died() in the appropriate places.

The patch also changes a few of the interrupt handlers.  They don't
expect to be called when hcd->state is equal to HC_STATE_HALT, even if
the controller is still alive.  Early returns were added to avoid any
confusion.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Manuel Lauss <manuel.lauss@googlemail.com>
CC: Rodolfo Giometti <giometti@linux.it>
CC: Olav Kongas <ok@artecdesign.ee>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoOHCI: fix regression caused by nVidia shutdown workaround
Alan Stern [Mon, 16 May 2011 16:15:19 +0000 (12:15 -0400)]
OHCI: fix regression caused by nVidia shutdown workaround

commit 2b7aaf503d56216b847c8265421d2a7d9b42df3e upstream.

This patch (as1463) fixes a regression caused by commit
3df7169e73fc1d71a39cffeacc969f6840cdf52b (OHCI: work around for nVidia
shutdown problem).

The original problem encountered by people using NVIDIA chipsets was
that USB devices were not turning off when the system shut down.  For
example, the LED on an optical mouse would remain on, draining a
laptop's battery.  The problem was caused by a bug in the chipset; an
OHCI controller in the Reset state would continue to drive a bus reset
signal even after system shutdown.  The workaround was to put the
controllers into the Suspend state instead.

It turns out that later NVIDIA chipsets do not suffer from this bug.
Instead some have the opposite bug: If a system is shut down while an
OHCI controller is in the Suspend state, USB devices remain powered!
On other systems, shutting down with a Suspended controller causes the
system to reboot immediately.  Thus, working around the original bug
on some machines exposes other bugs on other machines.

The best solution seems to be to limit the workaround to OHCI
controllers with a low-numbered PCI product ID.  I don't know exactly
at what point NVIDIA changed their chipsets; the value used here is a
guess.  So far it was worked out okay for all the people who have
tested it.

This fixes Bugzilla #35032.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Andre "Osku" Schmidt <andre.osku.schmidt@googlemail.com>
Tested-by: Yury Siamashka <yurand2@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: option: add support for Huawei E353 device
Marcin Gałczyński [Sun, 15 May 2011 09:41:54 +0000 (11:41 +0200)]
USB: option: add support for Huawei E353 device

commit 610ba42f29c3dfa46a05ff8c2cadc29f544ff76d upstream.

I am sharing patch to the devices/usb/serial/option.c. This allows
operation of Huawei E353 broadband modem using the “option” driver. The
patch simply adds new constant with proper product ID and an entry to
usb_device_id. I worked on the 2.6.38.6 sources. Tested on Dell inspiron
1764 (i3 core cpu) and brand new Huawei E353 modem, Fedora 15 beta.

Looking at the type of change, i doubt it has potential to introduce
problems in other parts of kernel or the driver itself.

Signed-off-by: Marcin Galczynski <marcin@galczynski.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoxhci: Fix memory leak bug when dropping endpoints
Sarah Sharp [Fri, 13 May 2011 01:06:37 +0000 (18:06 -0700)]
xhci: Fix memory leak bug when dropping endpoints

commit 834cb0fc4712a3b21c6b8c5cb55bd13607191311 upstream.

When the USB core wants to change to an alternate interface setting that
doesn't include an active endpoint, or de-configuring the device, the xHCI
driver needs to issue a Configure Endpoint command to tell the host to
drop some endpoints from the schedule.  After the command completes, the
xHCI driver needs to free rings for any endpoints that were dropped.

Unfortunately, the xHCI driver wasn't actually freeing the endpoint rings
for dropped endpoints.  The rings would be freed if the endpoint's
information was simply changed (and a new ring was installed), but dropped
endpoints never had their rings freed.  This caused errors when the ring
segment DMA pool was freed when the xHCI driver was unloaded:

[ 5582.883995] xhci_hcd 0000:06:00.0: dma_pool_destroy xHCI ring segments, ffff88003371d000 busy
[ 5582.884002] xhci_hcd 0000:06:00.0: dma_pool_destroy xHCI ring segments, ffff880033716000 busy
[ 5582.884011] xhci_hcd 0000:06:00.0: dma_pool_destroy xHCI ring segments, ffff880033455000 busy
[ 5582.884018] xhci_hcd 0000:06:00.0: Freed segment pool
[ 5582.884026] xhci_hcd 0000:06:00.0: Freed device context pool
[ 5582.884033] xhci_hcd 0000:06:00.0: Freed small stream array pool
[ 5582.884038] xhci_hcd 0000:06:00.0: Freed medium stream array pool
[ 5582.884048] xhci_hcd 0000:06:00.0: xhci_stop completed - status = 1
[ 5582.884061] xhci_hcd 0000:06:00.0: USB bus 3 deregistered
[ 5582.884193] xhci_hcd 0000:06:00.0: PCI INT A disabled

Fix this issue and free endpoint rings when their endpoints are
successfully dropped.

This patch should be backported to kernels as old as 2.6.31.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoxhci: Fix memory leak in ring cache deallocation.
Sarah Sharp [Mon, 16 May 2011 20:09:08 +0000 (13:09 -0700)]
xhci: Fix memory leak in ring cache deallocation.

commit 30f89ca021c3e584b61bc5a14eede89f74b2e826 upstream.

When an endpoint ring is freed, it is either cached in a per-device ring
cache, or simply freed if the ring cache is full.  If the ring was added
to the cache, then virt_dev->num_rings_cached is incremented.  The cache
is designed to hold up to 31 endpoint rings, in array indexes 0 to 30.
When the device is freed (when the slot was disabled),
xhci_free_virt_device() is called, it would free the cached rings in
array indexes 0 to virt_dev->num_rings_cached.

Unfortunately, the original code in xhci_free_or_cache_endpoint_ring()
would put the first entry into the ring cache in array index 1, instead of
array index 0.  This was caused by the second assignment to rings_cached:

rings_cached = virt_dev->num_rings_cached;
if (rings_cached < XHCI_MAX_RINGS_CACHED) {
virt_dev->num_rings_cached++;
rings_cached = virt_dev->num_rings_cached;
virt_dev->ring_cache[rings_cached] =
virt_dev->eps[ep_index].ring;

This meant that when the device was freed, cached rings with indexes 0 to
N would be freed, and the last cached ring in index N+1 would not be
freed.  When the driver was unloaded, this caused interesting messages
like:

xhci_hcd 0000:06:00.0: dma_pool_destroy xHCI ring segments, ffff880063040000 busy

This should be queued to stable kernels back to 2.6.33.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoxhci: Fix full speed bInterval encoding.
Sarah Sharp [Fri, 13 May 2011 20:10:01 +0000 (13:10 -0700)]
xhci: Fix full speed bInterval encoding.

commit b513d44751bfb609a3c20463f764c8ce822d63e9 upstream.

Dmitry's patch

dfa49c4ad120a784ef1ff0717168aa79f55a483a USB: xhci - fix math in xhci_get_endpoint_interval()

introduced a bug.  The USB 2.0 spec says that full speed isochronous endpoints'
bInterval must be decoded as an exponent to a power of two (e.g. interval =
2^(bInterval - 1)).  Full speed interrupt endpoints, on the other hand, don't
use exponents, and the interval in frames is encoded straight into bInterval.

Dmitry's patch was supposed to fix up the full speed isochronous to parse
bInterval as an exponent, but instead it changed the *interrupt* endpoint
bInterval decoding.  The isochronous endpoint encoding was the same.

This caused full speed devices with interrupt endpoints (including mice, hubs,
and USB to ethernet devices) to fail under NEC 0.96 xHCI host controllers:

[  100.909818] xhci_hcd 0000:06:00.0: add ep 0x83, slot id 1, new drop flags = 0x0, new add flags = 0x99, new slot info = 0x38100000
[  100.909821] xhci_hcd 0000:06:00.0: xhci_check_bandwidth called for udev ffff88011f0ea000
...
[  100.910187] xhci_hcd 0000:06:00.0: ERROR: unexpected command completion code 0x11.
[  100.910190] xhci_hcd 0000:06:00.0: xhci_reset_bandwidth called for udev ffff88011f0ea000

When the interrupt endpoint was added and a Configure Endpoint command was
issued to the host, the host controller would return a very odd error message
(0x11 means "Slot Not Enabled", which isn't true because the slot was enabled).
Probably the host controller was getting very confused with the bad encoding.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: Dmitry Torokhov <dtor@vmware.com>
Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Tested-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agousb: gadget: rndis: don't test against req->length
Felipe Balbi [Fri, 13 May 2011 13:53:48 +0000 (16:53 +0300)]
usb: gadget: rndis: don't test against req->length

commit 472b91274a6c6857877b5caddb875dcb5ecdfcb8 upstream.

composite.c always sets req->length to zero
and expects function driver's setup handlers
to return the amount of bytes to be used
on req->length. If we test against req->length
w_length will always be greater than req->length
thus making us always stall that particular
SEND_ENCAPSULATED_COMMAND request.

Tested against a Windows XP SP3.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agousb/gadget: at91sam9g20 fix end point max packet size
Jean-Christophe PLAGNIOL-VILLARD [Fri, 13 May 2011 15:03:02 +0000 (17:03 +0200)]
usb/gadget: at91sam9g20 fix end point max packet size

commit bf1f0a05d472e33dda8e5e69525be1584cdbd03a upstream.

on 9g20 they are the same as the 9260

Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoxhci: Fix bug in control transfer cancellation.
Sarah Sharp [Fri, 6 May 2011 02:08:09 +0000 (19:08 -0700)]
xhci: Fix bug in control transfer cancellation.

commit 3abeca998a44205cfd837fa0bf1f7c24f8294acb upstream.

When the xHCI driver attempts to cancel a transfer, it issues a Stop
Endpoint command and waits for the host controller to indicate which TRB
it was in the middle of processing.  The host will put an event TRB with
completion code COMP_STOP on the event ring if it stops on a control
transfer TRB (or other types of transfer TRBs).  The ring handling code
is supposed to set ep->stopped_trb to the TRB that the host stopped on
when this happens.

Unfortunately, there is a long-standing bug in the control transfer
completion code.  It doesn't actually check to see if COMP_STOP is set
before attempting to process the transfer based on which part of the
control TD completed.  So when we get an event on the data phase of the
control TRB with COMP_STOP set, it thinks it's a normal completion of
the transfer and doesn't set ep->stopped_td or ep->stopped_trb.

When the ring handling code goes on to process the completion of the Stop
Endpoint command, it sees that ep->stopped_trb is not a part of the TD
it's trying to cancel.  It thinks the hardware has its enqueue pointer
somewhere further up in the ring, and thinks it's safe to turn the control
TRBs into no-op TRBs.  Since the hardware was in the middle of the control
TRBs to be cancelled, the proper software behavior is to issue a Set TR
dequeue pointer command.

It turns out that the NEC host controllers can handle active TRBs being
set to no-op TRBs after a stop endpoint command, but other host
controllers have issues with this out-of-spec software behavior.  Fix this
behavior.

This patch should be backported to kernels as far back as 2.6.31, but it
may be a bit challenging, since process_ctrl_td() was introduced in some
refactoring done in 2.6.36, and some endian-safe patches added in 2.6.40
that touch the same lines.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: cdc_acm: Fix oops when Droids MuIn LCD is connected
Erik Slagter [Wed, 11 May 2011 10:06:55 +0000 (12:06 +0200)]
USB: cdc_acm: Fix oops when Droids MuIn LCD is connected

commit fd5054c169d29747a44b4e1419ff47f57ae82dbc upstream.

The Droids MuIn LCD operates like a serial remote terminal.
Data received are displayed directly on the LCD. This patch
fixes the kernel null pointer oops when it is plugged in.

Add NO_DATA_INTERFACE quirk to tell the driver that "control"
and "data" interfaces are not separated for this device, which
prevents dereferencing a null pointer in the device probe code.

Signed-off-by: Erik Slagter <erik@slagter.name>
Signed-off-by: Maxin B. John <maxin.john@gmail.com>
Tested-by: Erik Slagter <erik@slagter.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoBind only modem AT command endpoint to option module.
Marius B. Kotsbak [Mon, 21 Mar 2011 22:27:21 +0000 (23:27 +0100)]
Bind only modem AT command endpoint to option module.

commit 15b2f3204a5c878c32939094775fb7349f707263 upstream.

Network interface is handled by upcoming gt_b3730 module.

Removed "GT-B3710" from comment, it is another modem with another USB ID.

Signed-off-by: Marius B. Kotsbak <marius@kotsbak.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: gamin_gps: Fix for data transfer problems in native mode
Hermann Kneissel [Fri, 29 Apr 2011 06:58:43 +0000 (08:58 +0200)]
USB: gamin_gps: Fix for data transfer problems in native mode

commit b4026c4584cd70858d4d3450abfb1cd0714d4f32 upstream.

This patch fixes a problem where data received from the gps is sometimes
transferred incompletely to the serial port. If used in native mode now
all data received via the bulk queue will be forwarded to the serial
port.

Signed-off-by: Hermann Kneissel <herkne@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: gadget: g_multi: fixed vendor and product ID in inf files
Michal Nazarewicz [Tue, 26 Apr 2011 17:08:36 +0000 (19:08 +0200)]
USB: gadget: g_multi: fixed vendor and product ID in inf files

commit 7701846fd52f86dffe50715e0e63154088b7c982 upstream.

Commit 1c6529e92b "USB: gadget: g_multi: fixed vendor and
product ID" replaced g_multi's vendor and product ID with
proper ID's from Linux Foundation.  This commit now updates
INF files in the Documentation/usb directory which were
omitted in the original commit.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: serial: ftdi_sio: adding support for TavIR STK500
Benedek László [Wed, 20 Apr 2011 01:22:21 +0000 (03:22 +0200)]
USB: serial: ftdi_sio: adding support for TavIR STK500

commit 37909fe588c9e09ab57cd267e98678a17ceda64a upstream.

Adding support for the TavIR STK500 (id 0403:FA33)
Atmel AVR programmer device based on FTDI FT232RL.

Signed-off-by: Benedek László <benedekl@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: moto_modem: Add USB identifier for the Motorola VE240.
Elizabeth Jennifer Myers [Sat, 16 Apr 2011 18:49:51 +0000 (14:49 -0400)]
USB: moto_modem: Add USB identifier for the Motorola VE240.

commit 3938a0b32dc12229e76735679b37095bc2bc1578 upstream.

Tested on my phone, the ttyUSB device is created and is fully
functional.

Signed-off-by: Elizabeth Jennifer Myers <elizabeth@sporksirc.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoUSB: CP210x Add 4 Device IDs for AC-Services Devices
Craig Shelley [Sun, 20 Mar 2011 13:51:13 +0000 (13:51 +0000)]
USB: CP210x Add 4 Device IDs for AC-Services Devices

commit 4eff0b40a7174896b860312910e0db51f2dcc567 upstream.

This patch adds 4 device IDs for CP2102 based devices manufactured by
AC-Services. See http://www.ac-services.eu for further info.

Signed-off-by: Craig Shelley <craig@microtron.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoi2c/writing-clients: Fix foo_driver.id_table
Vikram Narayanan [Tue, 24 May 2011 18:58:48 +0000 (20:58 +0200)]
i2c/writing-clients: Fix foo_driver.id_table

commit 3116c86033079a1d4d4e84c40028f96b614843b8 upstream.

The i2c_device_id structure variable's name is not used in the
i2c_driver structure.

Signed-off-by: Vikram Narayanan <vikram186@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoloop: handle on-demand devices correctly
Namhyung Kim [Tue, 24 May 2011 14:48:55 +0000 (16:48 +0200)]
loop: handle on-demand devices correctly

commit a1c15c59feee36267c43142a41152fbf7402afb6 upstream.

When finding or allocating a loop device, loop_probe() did not take
partition numbers into account so that it can result to a different
device. Consider following example:

$ sudo modprobe loop max_part=15
$ ls -l /dev/loop*
brw-rw---- 1 root disk 7,   0 2011-05-24 22:16 /dev/loop0
brw-rw---- 1 root disk 7,  16 2011-05-24 22:16 /dev/loop1
brw-rw---- 1 root disk 7,  32 2011-05-24 22:16 /dev/loop2
brw-rw---- 1 root disk 7,  48 2011-05-24 22:16 /dev/loop3
brw-rw---- 1 root disk 7,  64 2011-05-24 22:16 /dev/loop4
brw-rw---- 1 root disk 7,  80 2011-05-24 22:16 /dev/loop5
brw-rw---- 1 root disk 7,  96 2011-05-24 22:16 /dev/loop6
brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7
$ sudo mknod /dev/loop8 b 7 128
$ sudo losetup /dev/loop8 ~/temp/disk-with-3-parts.img
$ sudo losetup -a
/dev/loop128: [0805]:278201 (/home/namhyung/temp/disk-with-3-parts.img)
$ ls -l /dev/loop*
brw-rw---- 1 root disk 7,    0 2011-05-24 22:16 /dev/loop0
brw-rw---- 1 root disk 7,   16 2011-05-24 22:16 /dev/loop1
brw-rw---- 1 root disk 7, 2048 2011-05-24 22:18 /dev/loop128
brw-rw---- 1 root disk 7, 2049 2011-05-24 22:18 /dev/loop128p1
brw-rw---- 1 root disk 7, 2050 2011-05-24 22:18 /dev/loop128p2
brw-rw---- 1 root disk 7, 2051 2011-05-24 22:18 /dev/loop128p3
brw-rw---- 1 root disk 7,   32 2011-05-24 22:16 /dev/loop2
brw-rw---- 1 root disk 7,   48 2011-05-24 22:16 /dev/loop3
brw-rw---- 1 root disk 7,   64 2011-05-24 22:16 /dev/loop4
brw-rw---- 1 root disk 7,   80 2011-05-24 22:16 /dev/loop5
brw-rw---- 1 root disk 7,   96 2011-05-24 22:16 /dev/loop6
brw-rw---- 1 root disk 7,  112 2011-05-24 22:16 /dev/loop7
brw-r--r-- 1 root root 7,  128 2011-05-24 22:17 /dev/loop8

After this patch, /dev/loop8 - instead of /dev/loop128 - was
accessed correctly.

In addition, 'range' passed to blk_register_region() should
include all range of dev_t that LOOP_MAJOR can address. It does
not need to be limited by partition numbers unless 'max_loop'
param was specified.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoloop: limit 'max_part' module param to DISK_MAX_PARTS
Namhyung Kim [Tue, 24 May 2011 14:48:54 +0000 (16:48 +0200)]
loop: limit 'max_part' module param to DISK_MAX_PARTS

commit 78f4bb367fd147a0e7e3998ba6e47109999d8814 upstream.

The 'max_part' parameter controls the number of maximum partition
a loop block device can have. However if a user specifies very
large value it would exceed the limitation of device minor number
and can cause a kernel panic (or, at least, produce invalid
device nodes in some cases).

On my desktop system, following command kills the kernel. On qemu,
it triggers similar oops but the kernel was alive:

$ sudo modprobe loop max_part0000
 ------------[ cut here ]------------
 kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
 invalid opcode: 0000 [#1] SMP
 last sysfs file:
 CPU 0
 Modules linked in: loop(+)

 Pid: 43, comm: insmod Tainted: G        W   2.6.39-qemu+ #155 Bochs Bochs
 RIP: 0010:[<ffffffff8113ce61>]  [<ffffffff8113ce61>] internal_create_group=
+0x2a/0x170
 RSP: 0018:ffff880007b3fde8  EFLAGS: 00000246
 RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4
 RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878
 RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000
 R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50
 R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800
 FS:  0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:00000000000000=
00
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
 Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c=
0)
 Stack:
  ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b
  0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868
  0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8
 Call Trace:
  [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af
  [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e
  [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12
  [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16
  [<ffffffff8116a090>] blk_register_queue+0x47/0xf7
  [<ffffffff8116f527>] add_disk+0xdf/0x290
  [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop]
  [<ffffffffa0006000>] ? 0xffffffffa0005fff
  [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
  [<ffffffff81096804>] sys_init_module+0x9c/0x1e0
  [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b
 Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb=
 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe =
48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49
 RIP  [<ffffffff8113ce61>] internal_create_group+0x2a/0x170
  RSP <ffff880007b3fde8>
 ---[ end trace a123eb592043acad ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agomm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath()
Andrew Barry [Wed, 25 May 2011 00:12:52 +0000 (17:12 -0700)]
mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath()

commit cfa54a0fcfc1017c6f122b6f21aaba36daa07f71 upstream.

I believe I found a problem in __alloc_pages_slowpath, which allows a
process to get stuck endlessly looping, even when lots of memory is
available.

Running an I/O and memory intensive stress-test I see a 0-order page
allocation with __GFP_IO and __GFP_WAIT, running on a system with very
little free memory.  Right about the same time that the stress-test gets
killed by the OOM-killer, the utility trying to allocate memory gets stuck
in __alloc_pages_slowpath even though most of the systems memory was freed
by the oom-kill of the stress-test.

The utility ends up looping from the rebalance label down through the
wait_iff_congested continiously.  Because order=0,
__alloc_pages_direct_compact skips the call to get_page_from_freelist.
Because all of the reclaimable memory on the system has already been
reclaimed, __alloc_pages_direct_reclaim skips the call to
get_page_from_freelist.  Since there is no __GFP_FS flag, the block with
__alloc_pages_may_oom is skipped.  The loop hits the wait_iff_congested,
then jumps back to rebalance without ever trying to
get_page_from_freelist.  This loop repeats infinitely.

The test case is pretty pathological.  Running a mix of I/O stress-tests
that do a lot of fork() and consume all of the system memory, I can pretty
reliably hit this on 600 nodes, in about 12 hours.  32GB/node.

Signed-off-by: Andrew Barry <abarry@cray.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Rik van Riel<riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoHID: magicmouse: ignore 'ivalid report id' while switching modes
Jiri Kosina [Thu, 19 May 2011 15:58:07 +0000 (17:58 +0200)]
HID: magicmouse: ignore 'ivalid report id' while switching modes

commit 23746a66d7d9e73402c68ef00d708796b97ebd72 upstream.

The device reponds with 'invalid report id' when feature report switching it
into multitouch mode is sent to it.

This has been silently ignored before 0825411ade ("HID: bt: Wait for ACK
on Sent Reports"), but since this commit, it propagates -EIO from the _raw
callback .

So let the driver ignore -EIO as response to 0xd7,0x01 report, as that's
how the device reacts in normal mode.

Sad, but following reality.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=35022

Tested-by: Chase Douglas <chase.douglas@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoASoC: Add some missing volume update bit sets for wm_hubs devices
Mark Brown [Sun, 15 May 2011 19:18:38 +0000 (12:18 -0700)]
ASoC: Add some missing volume update bit sets for wm_hubs devices

commit fb5af53d421d80725172427e9076f6e889603df6 upstream.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoASoC: Ensure output PGA is enabled for line outputs in wm_hubs
Mark Brown [Sun, 15 May 2011 00:21:28 +0000 (17:21 -0700)]
ASoC: Ensure output PGA is enabled for line outputs in wm_hubs

commit d0b48af6c2b887354d0893e598d92911ce52620e upstream.

Also fix a left/right typo while we're at it.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoALSA: hda - Use LPIB for ATI/AMD chipsets as default
Takashi Iwai [Fri, 20 May 2011 14:29:09 +0000 (16:29 +0200)]
ALSA: hda - Use LPIB for ATI/AMD chipsets as default

commit 50e3bbf9898840eead86f90a43b3625a2b2f4112 upstream.

ATI and AMD chipsets seem not providing the proper position-buffer
information, and it also doesn't provide FIFO register required by
VIACOMBO fix.  It's better to use LPIB for these.

Reported-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoALSA: hda - Fix input-src parse in patch_analog.c
Adrian Wilkins [Thu, 19 May 2011 20:52:38 +0000 (21:52 +0100)]
ALSA: hda - Fix input-src parse in patch_analog.c

commit 5a2d227fdc7a02ed1b4cebba391d8fb9ad57caaf upstream.

Compare pin type enum to the pin type and not the array index.
Fixes bug#0005368.

Signed-off-by: Adrian Wilkins <adrian.wilkins@nhs.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoALSA: HDA: Add quirk for Lenovo U350
David Henningsson [Mon, 23 May 2011 06:26:16 +0000 (08:26 +0200)]
ALSA: HDA: Add quirk for Lenovo U350

commit d2859fd49200f1f3efd8acdb54b6d51d3ab82302 upstream.

Add model=asus quirk for Lenovo Ideapad U350 to make internal mic
work correctly.

BugLink: http://bugs.launchpad.net/bugs/751681
Reported-by: Kent Baxley <kent.baxley@canonical.com>
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoALSA: HDA: Use one dmic only for Dell Studio 1558
David Henningsson [Mon, 16 May 2011 10:09:29 +0000 (12:09 +0200)]
ALSA: HDA: Use one dmic only for Dell Studio 1558

commit e033ebfb399227e01686260ac271029011bc6b47 upstream.

There are no signs of a dmic at node 0x0b, so the user is left with
an additional internal mic which does not exist. This commit removes
that non-existing mic.

BugLink: http://bugs.launchpad.net/bugs/731706
Reported-by: James Page <james.page@canonical.com>
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoips: use interruptible waits in ips-monitor
Jesse Barnes [Mon, 28 Mar 2011 10:36:30 +0000 (06:36 -0400)]
ips: use interruptible waits in ips-monitor

commit a3424216e4935221fdaa5ca3c26e024f11297164 upstream.

This is what I intended to do since:
  1) the driver handles variable waits just fine, and
  2) interruptible waits aren't reported as load in the load avg.

Reported-and-tested-by: Andreas Hartmann <andihartmann@freenet.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: Leann Ogasawara <leann.ogasawara@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agomm: vmscan: correctly check if reclaimer should schedule during shrink_slab
Minchan Kim [Wed, 25 May 2011 00:11:11 +0000 (17:11 -0700)]
mm: vmscan: correctly check if reclaimer should schedule during shrink_slab

commit f06590bd718ed950c98828e30ef93204028f3210 upstream.

It has been reported on some laptops that kswapd is consuming large
amounts of CPU and not being scheduled when SLUB is enabled during large
amounts of file copying.  It is expected that this is due to kswapd
missing every cond_resched() point because;

shrink_page_list() calls cond_resched() if inactive pages were isolated
        which in turn may not happen if all_unreclaimable is set in
        shrink_zones(). If for whatver reason, all_unreclaimable is
        set on all zones, we can miss calling cond_resched().

balance_pgdat() only calls cond_resched if the zones are not
        balanced. For a high-order allocation that is balanced, it
        checks order-0 again. During that window, order-0 might have
        become unbalanced so it loops again for order-0 and returns
        that it was reclaiming for order-0 to kswapd(). It can then
        find that a caller has rewoken kswapd for a high-order and
        re-enters balance_pgdat() without ever calling cond_resched().

shrink_slab only calls cond_resched() if we are reclaiming slab
pages. If there are a large number of direct reclaimers, the
shrinker_rwsem can be contended and prevent kswapd calling
cond_resched().

This patch modifies the shrink_slab() case.  If the semaphore is
contended, the caller will still check cond_resched().  After each
successful call into a shrinker, the check for cond_resched() remains in
case one shrinker is particularly slow.

[mgorman@suse.de: preserve call to cond_resched after each call into shrinker]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Colin King <colin.king@canonical.com>
Cc: Raghavendra D Prabhu <raghu.prabhu13@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agomm: vmscan: correct use of pgdat_balanced in sleeping_prematurely
Johannes Weiner [Wed, 25 May 2011 00:11:09 +0000 (17:11 -0700)]
mm: vmscan: correct use of pgdat_balanced in sleeping_prematurely

commit afc7e326a3f5bafc41324d7926c324414e343ee5 upstream.

There are a few reports of people experiencing hangs when copying large
amounts of data with kswapd using a large amount of CPU which appear to be
due to recent reclaim changes.  SLUB using high orders is the trigger but
not the root cause as SLUB has been using high orders for a while.  The
root cause was bugs introduced into reclaim which are addressed by the
following two patches.

Patch 1 corrects logic introduced by commit 1741c877 ("mm: kswapd:
        keep kswapd awake for high-order allocations until a percentage of
        the node is balanced") to allow kswapd to go to sleep when
        balanced for high orders.

Patch 2 notes that it is possible for kswapd to miss every
        cond_resched() and updates shrink_slab() so it'll at least reach
        that scheduling point.

Chris Wood reports that these two patches in isolation are sufficient to
prevent the system hanging.  AFAIK, they should also resolve similar hangs
experienced by James Bottomley.

This patch:

Johannes Weiner poined out that the logic in commit 1741c877 ("mm: kswapd:
keep kswapd awake for high-order allocations until a percentage of the
node is balanced") is backwards.  Instead of allowing kswapd to go to
sleep when balancing for high order allocations, it keeps it kswapd
running uselessly.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Colin King <colin.king@canonical.com>
Cc: Raghavendra D Prabhu <raghu.prabhu13@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agomd/bitmap: fix saving of events_cleared and other state.
NeilBrown [Wed, 11 May 2011 04:26:30 +0000 (14:26 +1000)]
md/bitmap: fix saving of events_cleared and other state.

commit 8258c53208d7a9b7207e7d4dae36d2ea384cb278 upstream.

If a bitmap is found to be 'stale' the events_cleared value
is set to match 'events'.
However if the array is degraded this does not get stored on disk.
This can subsequently lead to incorrect behaviour.

So change bitmap_update_sb to always update events_cleared in the
superblock from the known events_cleared.
For neatness also set ->state from ->flags.
This requires updating ->state whenever we update ->flags, which makes
sense anyway.

This is suitable for any active -stable release.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agomd: Fix race when creating a new md device.
NeilBrown [Tue, 10 May 2011 07:49:01 +0000 (17:49 +1000)]
md: Fix race when creating a new md device.

commit b0140891a8cea36469f58d23859e599b1122bd37 upstream.

There is a race when creating an md device by opening /dev/mdXX.

If two processes do this at much the same time they will follow the
call path
  __blkdev_get -> get_gendisk -> kobj_lookup

The first will call
  -> md_probe -> md_alloc -> add_disk -> blk_register_region

and the race happens when the second gets to kobj_lookup after
add_disk has called blk_register_region but before it returns to
md_alloc.

In the case the second will not call md_probe (as the probe is already
done) but will get a handle on the gendisk, return to __blkdev_get
which will then call md_open (via the ->open) pointer.

As mddev->gendisk hasn't been set yet, md_open will think something is
wrong an return with ERESTARTSYS.

This can loop endlessly while the first thread makes no progress
through add_disk.  Nothing is blocking it, but due to scheduler
behaviour it doesn't get a turn.
So this is essentially a live-lock.

We fix this by simply moving the assignment to mddev->gendisk before
the call the add_disk() so md_open doesn't get confused.
Also move blk_queue_flush earlier because add_disk should be as late
as possible.

To make sure that md_open doesn't complete until md_alloc has done all
that is needed, we take mddev->open_mutex during the last part of
md_alloc.  md_open will wait for this.

This can cause a lock-up on boot so Cc:ing for stable.
For 2.6.36 and earlier a different patch will be needed as the
'blk_queue_flush' call isn't there.

Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoseqlock: Don't smp_rmb in seqlock reader spin loop
Milton Miller [Thu, 12 May 2011 09:13:54 +0000 (04:13 -0500)]
seqlock: Don't smp_rmb in seqlock reader spin loop

commit 5db1256a5131d3b133946fa02ac9770a784e6eb2 upstream.

Move the smp_rmb after cpu_relax loop in read_seqlock and add
ACCESS_ONCE to make sure the test and return are consistent.

A multi-threaded core in the lab didn't like the update
from 2.6.35 to 2.6.36, to the point it would hang during
boot when multiple threads were active.  Bisection showed
af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867 (clockevents:
Remove the per cpu tick skew) as the culprit and it is
supported with stack traces showing xtime_lock waits including
tick_do_update_jiffies64 and/or update_vsyscall.

Experimentation showed the combination of cpu_relax and smp_rmb
was significantly slowing the progress of other threads sharing
the core, and this patch is effective in avoiding the hang.

A theory is the rmb is affecting the whole core while the
cpu_relax is causing a resource rebalance flush, together they
cause an interfernce cadance that is unbroken when the seqlock
reader has interrupts disabled.

At first I was confused why the refactor in
3c22cd5709e8143444a6d08682a87f4c57902df3 (kernel: optimise
seqlock) didn't affect this patch application, but after some
study that affected seqcount not seqlock. The new seqcount was
not factored back into the seqlock.  I defer that the future.

While the removal of the timer interrupt offset created
contention for the xtime lock while a cpu does the
additonal work to update the system clock, the seqlock
implementation with the tight rmb spin loop goes back much
further, and is just waiting for the right trigger.

Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: <linuxppc-dev@lists.ozlabs.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Link: http://lkml.kernel.org/r/%3Cseqlock-rmb%40mdm.bga.com%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoFix for buffer overflow in ldm_frag_add not sufficient
Timo Warns [Thu, 19 May 2011 07:24:17 +0000 (09:24 +0200)]
Fix for buffer overflow in ldm_frag_add not sufficient

commit cae13fe4cc3f24820ffb990c09110626837e85d4 upstream.

As Ben Hutchings discovered [1], the patch for CVE-2011-1017 (buffer
overflow in ldm_frag_add) is not sufficient.  The original patch in
commit c340b1d64000 ("fs/partitions/ldm.c: fix oops caused by corrupted
partition table") does not consider that, for subsequent fragments,
previously allocated memory is used.

[1] http://lkml.org/lkml/2011/5/6/407

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Timo Warns <warns@pre-sense.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agostaging: r8712u: Fix driver to support ad-hoc mode
Jeff Chua [Wed, 27 Apr 2011 16:25:14 +0000 (11:25 -0500)]
staging: r8712u: Fix driver to support ad-hoc mode

commit 62819fd9481021db7f87d5f61f2e2fd2be1dfcfa upstream.

Driver r8712u is unable to handle ad-hoc mode. The issue is that when
the driver first starts, there will not be an SSID for association.
The fix is to always call the "select and join from scan" routine when
in ad-hoc mode.

Note: Ad-hoc mode worked intermittently before. If the driver had
previously been associated, then things were OK.

Signed-off-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agostaging: usbip: fix wrong endian conversion
David Chang [Thu, 12 May 2011 10:31:11 +0000 (18:31 +0800)]
staging: usbip: fix wrong endian conversion

commit cacd18a8476ce145ca5dcd46dc5b75585fd1289c upstream.

Fix number_of_packets wrong endian conversion in function
correct_endian_ret_submit()

Signed-off-by: David Chang <dchang@novell.com>
Acked-by: Arjan Mels <arjan.mels@gmx.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoWhen mandatory encryption on share, fail mount
Steve French [Thu, 26 May 2011 18:38:54 +0000 (18:38 +0000)]
When mandatory encryption on share, fail mount

commit 6848b7334b24b47aa3d0e70342ff839ffa95d5fa upstream.

    When mandatory encryption is configured in samba server on a
    share (smb.conf parameter "smb encrypt = mandatory") the
    server will hang up the tcp session when we try to send
    the first frame after the tree connect if it is not a
    QueryFSUnixInfo, this causes cifs mount to hang (it must
    be killed with ctl-c).  Move the QueryFSUnixInfo call
    earlier in the mount sequence, and check whether the SetFSUnixInfo
    fails due to mandatory encryption so we can return a sensible
    error (EACCES) on mount.

    In a future patch (for 2.6.40) we will support mandatory
    encryption.

Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agorcu: Fix unpaired rcu_irq_enter() from locking selftests
Frederic Weisbecker [Fri, 20 May 2011 00:09:54 +0000 (02:09 +0200)]
rcu: Fix unpaired rcu_irq_enter() from locking selftests

commit ba9f207c9f82115aba4ce04b22e0081af0ae300f upstream.

HARDIRQ_ENTER() maps to irq_enter() which calls rcu_irq_enter().
But HARDIRQ_EXIT() maps to __irq_exit() which doesn't call
rcu_irq_exit().

So for every locking selftest that simulates hardirq disabled,
we create an imbalance in the rcu extended quiescent state
internal state.

As a result, after the first missing rcu_irq_exit(), subsequent
irqs won't exit dyntick-idle mode after leaving the interrupt
handler.  This means that RCU won't see the affected CPU as being
in an extended quiescent state, resulting in long grace-period
delays (as in grace periods extending for hours).

To fix this, just use __irq_enter() to simulate the hardirq
context. This is sufficient for the locking selftests as we
don't need to exit any extended quiescent state or perform
any check that irqs normally do when they wake up from idle.

As a side effect, this patch makes it possible to restore
"rcu: Decrease memory-barrier usage based on semi-formal proof",
which eventually helped finding this bug.

Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agooprofile, x86: Enable preemption during pci device setup in IBS init
Robert Richter [Fri, 20 May 2011 07:46:54 +0000 (09:46 +0200)]
oprofile, x86: Enable preemption during pci device setup in IBS init

commit 3d2606f42984613d324ad3047cf503bcddc3880a upstream.

IBS initialization is a mix of per-core register access and per-node
pci device setup. Register access should be pinned to the cpu, but pci
setup must run with preemption enabled.

This patch better separates the code into non-/preemptible sections
and fixes sleeping with preemption disabled. See bug message below.

Fixes also freeing the eilvt entry by introducing put_eilvt().

BUG: sleeping function called from invalid context at mm/slub.c:824
in_atomic(): 1, irqs_disabled(): 0, pid: 32357, name: modprobe
INFO: lockdep is turned off.
Pid: 32357, comm: modprobe Not tainted 2.6.39-rc7+ #14
Call Trace:
 [<ffffffff8104bdc8>] __might_sleep+0x112/0x117
 [<ffffffff81129693>] kmem_cache_alloc_trace+0x4b/0xe7
 [<ffffffff81278f14>] kzalloc.constprop.0+0x29/0x2b
 [<ffffffff81278f4c>] pci_get_subsys+0x36/0x78
 [<ffffffff81022689>] ? setup_APIC_eilvt+0xfb/0x139
 [<ffffffff81278fa4>] pci_get_device+0x16/0x18
 [<ffffffffa06c8b5d>] op_amd_init+0xd3/0x211 [oprofile]
 [<ffffffffa064d000>] ? 0xffffffffa064cfff
 [<ffffffffa064d298>] op_nmi_init+0x21e/0x26a [oprofile]
 [<ffffffffa064d062>] oprofile_arch_init+0xe/0x26 [oprofile]
 [<ffffffffa064d010>] oprofile_init+0x10/0x42 [oprofile]
 [<ffffffff81002099>] do_one_initcall+0x7f/0x13a
 [<ffffffff81096524>] sys_init_module+0x132/0x281
 [<ffffffff814cc682>] system_call_fastpath+0x16/0x1b

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agox86, cpufeature: Update CPU feature RDRND to RDRAND
Kees Cook [Tue, 24 May 2011 23:29:26 +0000 (16:29 -0700)]
x86, cpufeature: Update CPU feature RDRND to RDRAND

commit 7ccafc5f75c87853f3c49845d5a884f2376e03ce upstream.

The Intel manual changed the name of the CPUID bit to match the
instruction name. We should follow suit for sanity's sake. (See Intel SDM
Volume 2, Table 3-20 "Feature Information Returned in the ECX Register".)

[ hpa: we can only do this at this time because there are currently no CPUs
  with this feature on the market, hence this is pre-hardware enabling.
  However, Cc:'ing stable so that stable can present a consistent ABI. ]

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Link: http://lkml.kernel.org/r/20110524232926.GA27728@outflux.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agox86, amd: Use _safe() msr access for GartTlbWlk disable code
Roedel, Joerg [Thu, 19 May 2011 09:13:39 +0000 (11:13 +0200)]
x86, amd: Use _safe() msr access for GartTlbWlk disable code

commit d47cc0db8fd6011de2248df505fc34990b7451bf upstream.

The workaround for Bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=33012

introduced a read and a write to the MC4 mask msr.

Unfortunatly this MSR is not emulated by the KVM hypervisor
so that the kernel will get a #GP and crashes when applying
this workaround when running inside KVM.

This issue was reported as:

https://bugzilla.kernel.org/show_bug.cgi?id=35132

and is fixed with this patch. The change just let the kernel
ignore any #GP it gets while accessing this MSR by using the
_safe msr access methods.

Reported-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agox86, amd: Do not enable ARAT feature on AMD processors below family 0x12
Boris Ostrovsky [Thu, 26 May 2011 15:19:52 +0000 (11:19 -0400)]
x86, amd: Do not enable ARAT feature on AMD processors below family 0x12

commit e9cdd343a5e42c43bcda01e609fa23089e026470 upstream.

Commit b87cf80af3ba4b4c008b4face3c68d604e1715c6 added support for
ARAT (Always Running APIC timer) on AMD processors that are not
affected by erratum 400. This erratum is present on certain processor
families and prevents APIC timer from waking up the CPU when it
is in a deep C state, including C1E state.

Determining whether a processor is affected by this erratum may
have some corner cases and handling these cases is somewhat
complicated. In the interest of simplicity we won't claim ARAT
support on processor families below 0x12 and will go back to
broadcasting timer when going idle.

Signed-off-by: Boris Ostrovsky <ostr@amd64.org>
Link: http://lkml.kernel.org/r/1306423192-19774-1-git-send-email-ostr@amd64.org
Tested-by: Boris Petkov <borislav.petkov@amd.com>
Cc: Hans Rosenfeld <Hans.Rosenfeld@amd.com>
Cc: Andreas Herrmann <Andreas.Herrmann3@amd.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agox86, ioapic: Fix potential resume deadlock
Daniel J Blueman [Wed, 18 May 2011 23:31:31 +0000 (16:31 -0700)]
x86, ioapic: Fix potential resume deadlock

commit b64ce24daffb634b5b3133a2e411bd4de50654e8 upstream.

Fix a potential deadlock when resuming; here the calling
function has disabled interrupts, so we cannot sleep.

Change the memory allocation flag from GFP_KERNEL to GFP_ATOMIC.

TODO: We can do away with this memory allocation during resume
      by reusing the ioapic suspend/resume code that uses boot time
      allocated buffers, but we want to keep this -stable patch
      simple.

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110518233157.385970138@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agotarget: Fix task->task_execute_queue=1 clear bug + LUN_RESET OOPs
Nicholas Bellinger [Fri, 20 May 2011 03:19:12 +0000 (20:19 -0700)]
target: Fix task->task_execute_queue=1 clear bug + LUN_RESET OOPs

commit af57c3ac9947990da2608561b71f4799eb7795c6 upstream.

This patch fixes a bug where task->task_execute_queue=1 was not being
cleared once se_task had been removed from se_device->execute_task_list,
resulting in an OOPs in core_tmr_lun_reset() for the task->task_active=0
case where transport_remove_task_from_execute_queue() was incorrectly
being called.

This patch fixes two cases in transport_get_task_from_execute_queue()
and transport_remove_task_from_execute_queue() to properly clear
task->task_execute_queue=0 once list_del(&task->t_execute_list) has
been called.

It also adds an explict check in transport_remove_task_from_execute_queue()
to dump_stack + return if called with task->task_execute_queue=0.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agotarget: Fix bug with task_sg chained transport_free_dev_tasks release
Nicholas Bellinger [Fri, 20 May 2011 03:19:11 +0000 (20:19 -0700)]
target: Fix bug with task_sg chained transport_free_dev_tasks release

commit f436677262a5b524ac87675014c6d4e8ee153029 upstream.

This patch addresses a bug in the target core release path for HW
operation where transport_free_dev_tasks() was incorrectly being called
from transport_lun_remove_cmd() while releasing a se_cmd reference and
calling struct target_core_fabric_ops->queue_data_in().

This would result in a OOPs with HW target mode when the release of
se_task->task_sg[] would happen before pci_unmap_sg() can be called in
HW target mode fabric module code.  This patch addresses the issue by
moving transport_free_dev_tasks() from transport_lun_remove_cmd() into
transport_generic_free_cmd(), and adding TRANSPORT_FREE_CMD_INTR and
transport_generic_free_cmd_intr() to allow se_cmd descriptor release
to happen fromfrom within transport_processing_thread() process context
when release of se_cmd is not possible from HW interrupt context.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agotarget: Fix interrupt context bug with stats_lock and core_tmr_alloc_req
Nicholas Bellinger [Fri, 20 May 2011 03:19:10 +0000 (20:19 -0700)]
target: Fix interrupt context bug with stats_lock and core_tmr_alloc_req

commit 53ab6709b4d35b1924240854d794482fd7d33d4a upstream.

This patch fixes two bugs wrt to the interrupt context usage of target
core with HW target mode drivers.  It first converts the usage of struct
se_device->stats_lock in transport_get_lun_for_cmd() and core_tmr_lun_reset()
to properly use spin_lock_irq() to address an BUG with CONFIG_LOCKDEP_SUPPORT=y
enabled.

This patch also adds a 'in_interrupt()' check to allow GFP_ATOMIC usage from
core_tmr_alloc_req() to fix a 'sleeping in interrupt context' BUG with HW
target fabrics that require this logic to function.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agotarget: Fix multi task->task_sg[] chaining logic bug
Nicholas Bellinger [Fri, 20 May 2011 03:19:09 +0000 (20:19 -0700)]
target: Fix multi task->task_sg[] chaining logic bug

commit 97868c8905a1537153d406c4a3aa39a503a5c299 upstream.

This patch fixes a bug in transport_do_task_sg_chain() used by HW target
mode modules with sg_chain() to provide a single sg_next() walkable memory
layout for use with pci_map_sg() and friends.  This patch addresses an
issue with mapping multiple small block max_sector tasks across multiple
struct se_task->task_sg[] mappings for HW target mode operation.

This was causing OOPs with (cmd->t_task->t_tasks_no > 1) I/O traffic for
HW target drivers using transport_do_task_sg_chain(), and has been tested
so far with tcm_fc(openfcoe), tcm_qla2xxx, and ib_srpt fabrics with
t_tasks_no > 1 IBLOCK backends using a smaller max_sectors to trigger the
original issue.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Acked-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoFix Ultrastor asm snippet
Samuel Thibault [Wed, 18 May 2011 15:06:05 +0000 (17:06 +0200)]
Fix Ultrastor asm snippet

commit fad4dab5e44e10acf6b0235e469cb8e773b58e31 upstream.

Commit 1292500b replaced

"=m" (*field) : "1" (*field)

with

"=m" (*field) :

with comment "The following patch fixes it by using the '+' operator on
the (*field) operand, marking it as read-write to gcc."
'+' was actually forgotten.  This really puts it.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agobnx2i: Updated the connection shutdown/cleanup timeout
Eddie Wai [Mon, 16 May 2011 18:13:19 +0000 (11:13 -0700)]
bnx2i: Updated the connection shutdown/cleanup timeout

commit d5307a078bb0288945c900c6f4a2fd77ba6d0817 upstream.

Modified the 10s wait time for inflight offload connections to
advance to the next state to 2s based on test result.
Modified the 20s shutdown timeout to 30s based on test result.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agobnx2i: Fixed packet error created when the sq_size is set to 16
Eddie Wai [Mon, 16 May 2011 18:13:18 +0000 (11:13 -0700)]
bnx2i: Fixed packet error created when the sq_size is set to 16

commit 7287c63e986fe1a51a89f4bb1327320274a7a741 upstream.

The number of chip's internal command cell, which is use to generate
SCSI cmd packets to the target, was not initialized correctly by
the driver when the sq_size is changed from the default 128.
This, in turn, will create a problem where the chip's transmit pipe
will erroneously reuse an old command cell that is no longer valid.
The fix is to correctly initialize the chip's command cell upon setup.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agompt2sas: move even handling of MPT2SAS_TURN_ON_FAULT_LED into process context
Kashyap, Desai [Wed, 4 May 2011 11:05:58 +0000 (16:35 +0530)]
mpt2sas: move even handling of MPT2SAS_TURN_ON_FAULT_LED into process context

commit 3ace8e052be5293ebb3e00f819effccc64108a38 upstream.

Driver was a sending a SEP request during interrupt context which
required to go to sleep.

The fix is to rearrange the code so a fake event
MPT2SAS_TURN_ON_FAULT_LED is fired from interrupt context, then later
during the kernel worker threads processing, the SEP request is issued
to firmware.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agobonding: prevent deadlock on slave store with alb mode (v3)
Neil Horman [Wed, 25 May 2011 08:13:01 +0000 (08:13 +0000)]
bonding: prevent deadlock on slave store with alb mode (v3)

[ Upstream commit 9fe0617d9b6d21f700ee9e658e1c9fe3be2fb402 ]

This soft lockup was recently reported:

[root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
[root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.
bonding bond5: master_dev is not up in bond_enslave
[root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.

BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
CPU 12:
Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
be2d
Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
RIP: 0010:[<ffffffff80064bf0>]  [<ffffffff80064bf0>]
.text.lock.spinlock+0x26/00
RSP: 0018:ffff810113167da8  EFLAGS: 00000286
RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0

Call Trace:
 [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
 [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
 [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
 [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
 [<ffffffff8006457b>] __down_write_nested+0x12/0x92
 [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
 [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
 [<ffffffff80016b87>] vfs_write+0xce/0x174
 [<ffffffff80017450>] sys_write+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

It occurs because we are able to change the slave configuarion of a bond while
the bond interface is down.  The bonding driver initializes some data structures
only after its ndo_open routine is called.  Among them is the initalization of
the alb tx and rx hash locks.  So if we add or remove a slave without first
opening the bond master device, we run the risk of trying to lock/unlock a
spinlock that has garbage for data in it, which results in our above softlock.

Note that sometimes this works, because in many cases an unlocked spinlock has
the raw_lock parameter initialized to zero (meaning that the kzalloc of the
net_device private data is equivalent to calling spin_lock_init), but thats not
true in all cases, and we aren't guaranteed that condition, so we need to pass
the relevant spinlocks through the spin_lock_init function.

Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
the ndo_init path, so they are ready for use by the bond_store_slaves path.

Change notes:
v2) Based on conversation with Jay and Nicolas it seems that the ability to
enslave devices while the bond master is down should be safe to do.  As such
this is an outlier bug, and so instead we'll just initalize the errant spinlocks
in the init path rather than the open path, solving the problem.  We'll also
remove the warnings about the bond being down during enslave operations, since
it should be safe

v3) Fix spelling error

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: jtluka@redhat.com
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: nicolas.2p.debian@gmail.com
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agosch_sfq: fix peek() implementation
Eric Dumazet [Wed, 25 May 2011 04:40:11 +0000 (04:40 +0000)]
sch_sfq: fix peek() implementation

[ Upstream commit 07bd8df5df4369487812bf85a237322ff3569b77 ]

Since commit eeaeb068f139 (sch_sfq: allow big packets and be fair),
sfq_peek() can return a different skb that would be normally dequeued by
sfq_dequeue() [ if current slot->allot is negative ]

Use generic qdisc_peek_dequeued() instead of custom implementation, to
get consistent result.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jesper Dangaard Brouer <hawk@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agosctp: fix memory leak of the ASCONF queue when free asoc
Wei Yongjun [Tue, 24 May 2011 21:48:02 +0000 (21:48 +0000)]
sctp: fix memory leak of the ASCONF queue when free asoc

[ Upstream commit 8b4472cc13136d04727e399c6fdadf58d2218b0a ]

If an ASCONF chunk is outstanding, then the following ASCONF
chunk will be queued for later transmission. But when we free
the asoc, we forget to free the ASCONF queue at the same time,
this will cause memory leak.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agosch_sfq: avoid giving spurious NET_XMIT_CN signals
Eric Dumazet [Mon, 23 May 2011 11:02:42 +0000 (11:02 +0000)]
sch_sfq: avoid giving spurious NET_XMIT_CN signals

[ Upstream commit 8efa885406359af300d46910642b50ca82c0fe47 ]

While chasing a possible net_sched bug, I found that IP fragments have
litle chance to pass a congestioned SFQ qdisc :

- Say SFQ qdisc is full because one flow is non responsive.
- ip_fragment() wants to send two fragments belonging to an idle flow.
- sfq_enqueue() queues first packet, but see queue limit reached :
- sfq_enqueue() drops one packet from 'big consumer', and returns
NET_XMIT_CN.
- ip_fragment() cancel remaining fragments.

This patch restores fairness, making sure we return NET_XMIT_CN only if
we dropped a packet from the same flow.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agonet: add skb_dst_force() in sock_queue_err_skb()
Eric Dumazet [Wed, 18 May 2011 06:21:31 +0000 (02:21 -0400)]
net: add skb_dst_force() in sock_queue_err_skb()

[ Upstream commit abb57ea48fd9431fa320a5c55f73e6b5a44c2efb ]

Commit 7fee226ad239 (add a noref bit on skb dst) forgot to use
skb_dst_force() on packets queued in sk_error_queue

This triggers following warning, for applications using IP_CMSG_PKTINFO
receiving one error status

------------[ cut here ]------------
WARNING: at include/linux/skbuff.h:457 ip_cmsg_recv_pktinfo+0xa6/0xb0()
Hardware name: 2669UYD
Modules linked in: isofs vboxnetadp vboxnetflt nfsd ebtable_nat ebtables
lib80211_crypt_ccmp uinput xcbc hdaps tp_smapi thinkpad_ec radeonfb fb_ddc
radeon ttm drm_kms_helper drm ipw2200 intel_agp intel_gtt libipw i2c_algo_bit
i2c_i801 agpgart rng_core cfbfillrect cfbcopyarea cfbimgblt video raid10 raid1
raid0 linear md_mod vboxdrv
Pid: 4697, comm: miredo Not tainted 2.6.39-rc6-00569-g5895198-dirty #22
Call Trace:
 [<c17746b6>] ? printk+0x1d/0x1f
 [<c1058302>] warn_slowpath_common+0x72/0xa0
 [<c15bbca6>] ? ip_cmsg_recv_pktinfo+0xa6/0xb0
 [<c15bbca6>] ? ip_cmsg_recv_pktinfo+0xa6/0xb0
 [<c1058350>] warn_slowpath_null+0x20/0x30
 [<c15bbca6>] ip_cmsg_recv_pktinfo+0xa6/0xb0
 [<c15bbdd7>] ip_cmsg_recv+0x127/0x260
 [<c154f82d>] ? skb_dequeue+0x4d/0x70
 [<c1555523>] ? skb_copy_datagram_iovec+0x53/0x300
 [<c178e834>] ? sub_preempt_count+0x24/0x50
 [<c15bdd2d>] ip_recv_error+0x23d/0x270
 [<c15de554>] udp_recvmsg+0x264/0x2b0
 [<c15ea659>] inet_recvmsg+0xd9/0x130
 [<c1547752>] sock_recvmsg+0xf2/0x120
 [<c11179cb>] ? might_fault+0x4b/0xa0
 [<c15546bc>] ? verify_iovec+0x4c/0xc0
 [<c1547660>] ? sock_recvmsg_nosec+0x100/0x100
 [<c1548294>] __sys_recvmsg+0x114/0x1e0
 [<c1093895>] ? __lock_acquire+0x365/0x780
 [<c1148b66>] ? fget_light+0xa6/0x3e0
 [<c1148b7f>] ? fget_light+0xbf/0x3e0
 [<c1148aee>] ? fget_light+0x2e/0x3e0
 [<c1549f29>] sys_recvmsg+0x39/0x60

Close bug https://bugzilla.kernel.org/show_bug.cgi?id=34622

Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoigmp: call ip_mc_clear_src() only when we have no users of ip_mc_list
Veaceslav Falico [Mon, 23 May 2011 23:15:05 +0000 (23:15 +0000)]
igmp: call ip_mc_clear_src() only when we have no users of ip_mc_list

[ Upstream commit 24cf3af3fed5edcf90bc2a0ed181e6ce1513d2dc ]

In igmp_group_dropped() we call ip_mc_clear_src(), which resets the number
of source filters per mulitcast. However, igmp_group_dropped() is also
called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER, which
means that the group might get added back on NETDEV_UP, NETDEV_REGISTER and
NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source
filters.

To fix that, we must clear the source filters only when there are no users
in the ip_mc_list, i.e. in ip_mc_dec_group() and on device destroy.

Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agovlan: fix GVRP at dismantle time MIME-Version: 1.0
Eric Dumazet [Tue, 10 May 2011 19:22:54 +0000 (12:22 -0700)]
vlan: fix GVRP at dismantle time MIME-Version: 1.0

[ Upstream commit 0442277740ec56109c5b5f7bcfded299cf9e72bd ]

ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
ip link set eth2.103 up
rmmod tg3    # driver providing eth2

 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
 PGD 11d251067 PUD 11b9e0067 PMD 0
 Oops: 0000 [#1] SMP
 last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
 CPU 0
 Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]

 Pid: 11494, comm: rmmod Tainted: G        W   2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
 RIP: 0010:[<ffffffffa0030c9e>]  [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
 RSP: 0018:ffff88007a19bae8  EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff88011b5e2000 RCX: 0000000000000002
 RDX: 0000000000000000 RSI: 0000000000000175 RDI: ffffffffa0030d5b
 RBP: ffff88007a19bb18 R08: 0000000000000001 R09: ffff88011bd64a00
 R10: ffff88011d34ec00 R11: 0000000000000000 R12: 0000000000000002
 R13: ffff88007a19bc48 R14: ffff88007a19bb88 R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77d76c0
 CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 000000011a675000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process rmmod (pid: 11494, threadinfo ffff88007a19a000, task ffff8800798595c0)
 Stack:
  ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
  ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
  ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
 Call Trace:
  [<ffffffffa003a5f6>] vlan_gvrp_request_leave+0x46/0x50 [8021q]
  [<ffffffffa00397e7>] vlan_dev_stop+0xb7/0xc0 [8021q]
  [<ffffffff8137e427>] __dev_close_many+0x87/0xe0
  [<ffffffff8137e507>] dev_close_many+0x87/0x110
  [<ffffffff8137e630>] rollback_registered_many+0xa0/0x240
  [<ffffffff8137e7e9>] unregister_netdevice_many+0x19/0x60
  [<ffffffffa00389eb>] vlan_device_event+0x53b/0x550 [8021q]
  [<ffffffff8143f448>] ? ip6mr_device_event+0xa8/0xd0
  [<ffffffff81479d03>] notifier_call_chain+0x53/0x80
  [<ffffffff81062539>] __raw_notifier_call_chain+0x9/0x10
  [<ffffffff81062551>] raw_notifier_call_chain+0x11/0x20
  [<ffffffff8137df82>] call_netdevice_notifiers+0x32/0x60
  [<ffffffff8137e69f>] rollback_registered_many+0x10f/0x240
  [<ffffffff8137e85f>] rollback_registered+0x2f/0x40
  [<ffffffff8137e8c8>] unregister_netdevice_queue+0x58/0x90
  [<ffffffff8137e9eb>] unregister_netdev+0x1b/0x30
  [<ffffffffa005d73f>] tg3_remove_one+0x6f/0x10b [tg3]

We should call vlan_gvrp_request_leave() from unregister_vlan_dev(),
not from vlan_dev_stop(), because vlan_gvrp_uninit_applicant()
is called right after unregister_netdevice_queue(). In batch mode,
unregister_netdevice_queue() doesn’t immediately call vlan_dev_stop().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agotcp: len check is unnecessarily devastating, change to WARN_ON
Ilpo Järvinen [Sat, 2 Apr 2011 04:47:41 +0000 (21:47 -0700)]
tcp: len check is unnecessarily devastating, change to WARN_ON

[ Upstream commit 2fceec13375e5d98ef033c6b0ee03943fc460950 ]

All callers are prepared for alloc failures anyway, so this error
can safely be boomeranged to the callers domain without super
bad consequences. ...At worst the connection might go into a state
where each RTO tries to (unsuccessfully) re-fragment with such
a mis-sized value and eventually dies.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoSCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
Jacek Luczak [Thu, 19 May 2011 09:55:13 +0000 (09:55 +0000)]
SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()

[ Upstream commit c182f90bc1f22ce5039b8722e45621d5f96862c2 ]

During the sctp_close() call, we do not use rcu primitives to
destroy the address list attached to the endpoint.  At the same
time, we do the removal of addresses from this list before
attempting to remove the socket from the port hash

As a result, it is possible for another process to find the socket
in the port hash that is in the process of being closed.  It then
proceeds to traverse the address list to find the conflict, only
to have that address list suddenly disappear without rcu() critical
section.

Fix issue by closing address list removal inside RCU critical
section.

Race can result in a kernel crash with general protection fault or
kernel NULL pointer dereference:

kernel: general protection fault: 0000 [#1] SMP
kernel: RIP: 0010:[<ffffffffa02f3dde>]  [<ffffffffa02f3dde>] sctp_bind_addr_conflict+0x64/0x82 [sctp]
kernel: Call Trace:
kernel:  [<ffffffffa02f415f>] ? sctp_get_port_local+0x17b/0x2a3 [sctp]
kernel:  [<ffffffffa02f3d45>] ? sctp_bind_addr_match+0x33/0x68 [sctp]
kernel:  [<ffffffffa02f4416>] ? sctp_do_bind+0xd3/0x141 [sctp]
kernel:  [<ffffffffa02f5030>] ? sctp_bindx_add+0x4d/0x8e [sctp]
kernel:  [<ffffffffa02f5183>] ? sctp_setsockopt_bindx+0x112/0x4a4 [sctp]
kernel:  [<ffffffff81089e82>] ? generic_file_aio_write+0x7f/0x9b
kernel:  [<ffffffffa02f763e>] ? sctp_setsockopt+0x14f/0xfee [sctp]
kernel:  [<ffffffff810c11fb>] ? do_sync_write+0xab/0xeb
kernel:  [<ffffffff810e82ab>] ? fsnotify+0x239/0x282
kernel:  [<ffffffff810c2462>] ? alloc_file+0x18/0xb1
kernel:  [<ffffffff8134a0b1>] ? compat_sys_setsockopt+0x1a5/0x1d9
kernel:  [<ffffffff8134aaf1>] ? compat_sys_socketcall+0x143/0x1a4
kernel:  [<ffffffff810467dc>] ? sysenter_dispatch+0x7/0x32

Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoRevert "tcp: disallow bind() to reuse addr/port"
David S. Miller [Wed, 13 Apr 2011 19:01:14 +0000 (12:01 -0700)]
Revert "tcp: disallow bind() to reuse addr/port"

[ Upstream commit 3e8c806a08c7beecd972e7ce15c570b9aba64baa ]

This reverts commit c191a836a908d1dd6b40c503741f91b914de3348.

It causes known regressions for programs that expect to be able to use
SO_REUSEADDR to shutdown a socket, then successfully rebind another
socket to the same ID.

Programs such as haproxy and amavisd expect this to work.

This should fix kernel bugzilla 32832.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoRevert "bridge: Forward reserved group addresses if !STP"
David S. Miller [Fri, 22 Apr 2011 04:17:25 +0000 (21:17 -0700)]
Revert "bridge: Forward reserved group addresses if !STP"

[ Upstream commit f01cb5fbea1c1613621f9f32f385e12c1a29dde0 ]

This reverts commit 1e253c3b8a1aeed51eef6fc366812f219b97de65.

It breaks 802.3ad bonding inside of a bridge.

The commit was meant to support transport bridging, and specifically
virtual machines bridged to an ethernet interface connected to a
switch port wiht 802.1x enabled.

But this isn't the way to do it, it breaks too many other things.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agonet: use hlist_del_rcu() in dev_change_name()
Eric Dumazet [Tue, 17 May 2011 17:56:59 +0000 (13:56 -0400)]
net: use hlist_del_rcu() in dev_change_name()

[ Upstream commit 372b2312010bece1e36f577d6c99a6193ec54cbd ]

Using plain hlist_del() in dev_change_name() is wrong since a
concurrent reader can crash trying to dereference LIST_POISON1.

Bug introduced in commit 72c9528bab94 (net: Introduce
dev_get_by_name_rcu())

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agonet: Do not wrap sysctl igmp_max_memberships in IP_MULTICAST
Joakim Tjernlund [Tue, 12 Apr 2011 20:59:33 +0000 (13:59 -0700)]
net: Do not wrap sysctl igmp_max_memberships in IP_MULTICAST

[ Upstream commit 192910a6cca5e50e5bd6cbd1da0e7376c7adfe62 ]

controlling igmp_max_membership is useful even when IP_MULTICAST
is off.
Quagga(an OSPF deamon) uses multicast addresses for all interfaces
using a single socket and hits igmp_max_membership limit when
there are 20 interfaces or more.
Always export sysctl igmp_max_memberships in proc, just like
igmp_max_msf

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agomacvlan: fix panic if lowerdev in a bond
Eric Dumazet [Fri, 20 May 2011 18:59:23 +0000 (14:59 -0400)]
macvlan: fix panic if lowerdev in a bond

[ Upstream commit d93515611bbc70c2fe4db232e5feb448ed8e4cc9 ]

commit a35e2c1b6d905 (macvlan: use rx_handler_data pointer to store
macvlan_port pointer V2) added a bug in macvlan_port_create()

Steps to reproduce the bug:

# ifenslave bond0 eth0 eth1

# ip link add link eth0 up name eth0#1 type macvlan
->error EBUSY

# ip link add link eth0 up name eth0#1 type macvlan
->panic

Fix: Dont set IFF_MACVLAN_PORT in error case.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoipv6: udp: fix the wrong headroom check
Shan Wei [Tue, 19 Apr 2011 22:52:49 +0000 (22:52 +0000)]
ipv6: udp: fix the wrong headroom check

[ Upstream commit a9cf73ea7ff78f52662c8658d93c226effbbedde ]

At this point, skb->data points to skb_transport_header.
So, headroom check is wrong.

For some case:bridge(UFO is on) + eth device(UFO is off),
there is no enough headroom for IPv6 frag head.
But headroom check is always false.

This will bring about data be moved to there prior to skb->head,
when adding IPv6 frag header to skb.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoipv6: Remove hoplimit initialization to -1
Thomas Egerer [Wed, 20 Apr 2011 22:56:02 +0000 (22:56 +0000)]
ipv6: Remove hoplimit initialization to -1

[ Upstream commit e965c05dabdabb85af0187952ccd75e43995c4b3 ]

The changes introduced with git-commit a02e4b7d ("ipv6: Demark default
hoplimit as zero.") missed to remove the hoplimit initialization. As a
result, ipv6_get_mtu interprets the return value of dst_metric_raw
(-1) as 255 and answers ping6 with this hoplimit.  This patche removes
the line such that ping6 is answered with the hoplimit value
configured via sysctl.

Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoinetpeer: reduce stack usage
Eric Dumazet [Mon, 11 Apr 2011 22:39:40 +0000 (22:39 +0000)]
inetpeer: reduce stack usage

[ Upstream commit 66944e1c5797562cebe2d1857d46dff60bf9a69e ]

On 64bit arches, we use 752 bytes of stack when cleanup_once() is called
from inet_getpeer().

Lets share the avl stack to save ~376 bytes.

Before patch :

# objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl

0x000006c3 unlink_from_pool [inetpeer.o]: 376
0x00000721 unlink_from_pool [inetpeer.o]: 376
0x00000cb1 inet_getpeer [inetpeer.o]: 376
0x00000e6d inet_getpeer [inetpeer.o]: 376
0x0004 inet_initpeers [inetpeer.o]: 112
# size net/ipv4/inetpeer.o
   text    data     bss     dec     hex filename
   5320     432      21    5773    168d net/ipv4/inetpeer.o

After patch :

objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl
0x00000c11 inet_getpeer [inetpeer.o]: 376
0x00000dcd inet_getpeer [inetpeer.o]: 376
0x00000ab9 peer_check_expire [inetpeer.o]: 328
0x00000b7f peer_check_expire [inetpeer.o]: 328
0x0004 inet_initpeers [inetpeer.o]: 112
# size net/ipv4/inetpeer.o
   text    data     bss     dec     hex filename
   5163     432      21    5616    15f0 net/ipv4/inetpeer.o

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Scot Doyle <lkml@scotdoyle.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Reviewed-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoirda: fix locking unbalance in irda_sendmsg
Dave Jones [Tue, 12 Apr 2011 22:29:54 +0000 (15:29 -0700)]
irda: fix locking unbalance in irda_sendmsg

[ Upstream commit 020318d0d2af51e0fd59ba654ede9b2171558720 ]

5b40964eadea40509d353318d2c82e8b7bf5e8a5 ("irda: Remove BKL instances
from af_irda.c") introduced a path where we have a locking unbalance.
If we pass invalid flags, we unlock a socket we never locked,
resulting in this...

=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
trinity/20101 is trying to release lock (sk_lock-AF_IRDA) at:
[<ffffffffa057f001>] irda_sendmsg+0x207/0x21d [irda]
but there are no more locks to release!

other info that might help us debug this:
no locks held by trinity/20101.

stack backtrace:
Pid: 20101, comm: trinity Not tainted 2.6.39-rc3+ #3
Call Trace:
 [<ffffffffa057f001>] ? irda_sendmsg+0x207/0x21d [irda]
 [<ffffffff81085041>] print_unlock_inbalance_bug+0xc7/0xd2
 [<ffffffffa057f001>] ? irda_sendmsg+0x207/0x21d [irda]
 [<ffffffff81086aca>] lock_release+0xcf/0x18e
 [<ffffffff813ed190>] release_sock+0x2d/0x155
 [<ffffffffa057f001>] irda_sendmsg+0x207/0x21d [irda]
 [<ffffffff813e9f8c>] __sock_sendmsg+0x69/0x75
 [<ffffffff813ea105>] sock_sendmsg+0xa1/0xb6
 [<ffffffff81100ca3>] ? might_fault+0x5c/0xac
 [<ffffffff81086b7c>] ? lock_release+0x181/0x18e
 [<ffffffff81100cec>] ? might_fault+0xa5/0xac
 [<ffffffff81100ca3>] ? might_fault+0x5c/0xac
 [<ffffffff81133b94>] ? fcheck_files+0xb9/0xf0
 [<ffffffff813f387a>] ? copy_from_user+0x2f/0x31
 [<ffffffff813f3b70>] ? verify_iovec+0x52/0xa6
 [<ffffffff813eb4e3>] sys_sendmsg+0x23a/0x2b8
 [<ffffffff81086b7c>] ? lock_release+0x181/0x18e
 [<ffffffff810773c6>] ? up_read+0x28/0x2c
 [<ffffffff814bec3d>] ? do_page_fault+0x360/0x3b4
 [<ffffffff81087043>] ? trace_hardirqs_on_caller+0x10b/0x12f
 [<ffffffff810458aa>] ? finish_task_switch+0xb2/0xe3
 [<ffffffff8104583e>] ? finish_task_switch+0x46/0xe3
 [<ffffffff8108364a>] ? trace_hardirqs_off_caller+0x33/0x90
 [<ffffffff814bbaf9>] ? retint_swapgs+0x13/0x1b
 [<ffffffff81087043>] ? trace_hardirqs_on_caller+0x10b/0x12f
 [<ffffffff810a9dd3>] ? audit_syscall_entry+0x11c/0x148
 [<ffffffff8125609e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff814c22c2>] system_call_fastpath+0x16/0x1b

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agoieee802154: Remove hacked CFLAGS in net/ieee802154/Makefile
David S. Miller [Tue, 12 Apr 2011 22:33:23 +0000 (15:33 -0700)]
ieee802154: Remove hacked CFLAGS in net/ieee802154/Makefile

[ Upstream commit bfac3693c426d280b026f6a1b77dc2294ea43fea ]

It adds -Wall (which the kernel carefully controls already) and of all
things -DDEBUG (which should be set by other means if desired, please
we have dynamic-debug these days).

Kill this noise.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agobridge: fix forwarding of IPv6
Stephen Hemminger [Fri, 13 May 2011 20:03:24 +0000 (16:03 -0400)]
bridge: fix forwarding of IPv6

[ Upstream commit cb68552858c64db302771469b1202ea09e696329 ]

The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e
    bridge: Reset IPCB when entering IP stack on NF_FORWARD
broke forwarding of IPV6 packets in bridge because it would
call bp_parse_ip_options with an IPV6 packet.

Reported-by: Noah Meyerhans <noahm@debian.org>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>