]> git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
14 years agoLinux 2.6.30.9 v2.6.30.9
Greg Kroah-Hartman [Mon, 5 Oct 2009 15:38:08 +0000 (08:38 -0700)]
Linux 2.6.30.9

14 years agommap: avoid unnecessary anon_vma lock acquisition in vma_adjust()
Lee Schermerhorn [Tue, 22 Sep 2009 00:03:40 +0000 (17:03 -0700)]
mmap: avoid unnecessary anon_vma lock acquisition in vma_adjust()

commit 252c5f94d944487e9f50ece7942b0fbf659c5c31 upstream.

We noticed very erratic behavior [throughput] with the AIM7 shared
workload running on recent distro [SLES11] and mainline kernels on an
8-socket, 32-core, 256GB x86_64 platform.  On the SLES11 kernel
[2.6.27.19+] with Barcelona processors, as we increased the load [10s of
thousands of tasks], the throughput would vary between two "plateaus"--one
at ~65K jobs per minute and one at ~130K jpm.  The simple patch below
causes the results to smooth out at the ~130k plateau.

But wait, there's more:

We do not see this behavior on smaller platforms--e.g., 4 socket/8 core.
This could be the result of the larger number of cpus on the larger
platform--a scalability issue--or it could be the result of the larger
number of interconnect "hops" between some nodes in this platform and how
the tasks for a given load end up distributed over the nodes' cpus and
memories--a stochastic NUMA effect.

The variability in the results are less pronounced [on the same platform]
with Shanghai processors and with mainline kernels.  With 31-rc6 on
Shanghai processors and 288 file systems on 288 fibre attached storage
volumes, the curves [jpm vs load] are both quite flat with the patched
kernel consistently producing ~3.9% better throughput [~80K jpm vs ~77K
jpm] than the unpatched kernel.

Profiling indicated that the "slow" runs were incurring high[er]
contention on an anon_vma lock in vma_adjust(), apparently called from the
sbrk() system call.

The patch:

A comment in mm/mmap.c:vma_adjust() suggests that we don't really need the
anon_vma lock when we're only adjusting the end of a vma, as is the case
for brk().  The comment questions whether it's worth while to optimize for
this case.  Apparently, on the newer, larger x86_64 platforms, with
interesting NUMA topologies, it is worth while--especially considering
that the patch [if correct!] is quite simple.

We can detect this condition--no overlap with next vma--by noting a NULL
"importer".  The anon_vma pointer will also be NULL in this case, so
simply avoid loading vma->anon_vma to avoid the lock.

However, we DO need to take the anon_vma lock when we're inserting a vma
['insert' non-NULL] even when we have no overlap [NULL "importer"], so we
need to check for 'insert', as well.  And Hugh points out that we should
also take it when adjusting vm_start (so that rmap.c can rely upon
vma_address() while it holds the anon_vma lock).

akpm: Zhang Yanmin reprts a 150% throughput improvement with aim7, so it
might be -stable material even though thiss isn't a regression: "this
issue is not clear on dual socket Nehalem machine (2*4*2 cpu), but is
severe on large machine (4*8*2 cpu)"

[hugh.dickins@tiscali.co.uk: test vma start too]
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Eric Whitney <eric.whitney@hp.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agomm: fix anonymous dirtying
Hugh Dickins [Tue, 22 Sep 2009 00:03:29 +0000 (17:03 -0700)]
mm: fix anonymous dirtying

commit 1ac0cb5d0e22d5e483f56b2bc12172dec1cf7536 upstream.

do_anonymous_page() has been wrong to dirty the pte regardless.
If it's not going to mark the pte writable, then it won't help
to mark it dirty here, and clogs up memory with pages which will
need swap instead of being thrown away.  Especially wrong if no
overcommit is chosen, and this vma is not yet VM_ACCOUNTed -
we could exceed the limit and OOM despite no overcommit.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agothinkpad-acpi: fix incorrect use of TPACPI_BRGHT_MODE_ECNVRAM
Henrique de Moraes Holschuh [Sat, 26 Sep 2009 00:57:33 +0000 (21:57 -0300)]
thinkpad-acpi: fix incorrect use of TPACPI_BRGHT_MODE_ECNVRAM

HBRV-based default selection of backlight control strategy didn't work
well, at least the X41 defines it but doesn't use it and I don't think
it will stop there.  Switch to a blacklist, and make sure only Radeon-
based models get ECNVRAM.

Symptoms of incorrect backlight mode selection are:

1. Non-working backlight control through sysfs;

2. Backlight gets reset to the lowest level at every shutdown, reboot
   and when thinkpad-acpi gets unloaded;

This fixes a regression in 2.6.30, bugzilla #13826.  This fix is
already present on 2.6.31.

This is a minimal patch for 2.6.30-stable, based on mainline
commits: 050df107c408a3df048524b3783a5fc6d4dccfdb,
 7d95a3d564901e88ed42810f054e579874151999,
 59fe4fe34d7afdf63208124f313be9056feaa2f4,
 6da25bf51689a5cc60370d30275dbb9e6852e0cb

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Reported-by: Tobias Diedrich <ranma+kernel@tdiedrich.de>
Reported-by: Robert de Rooy <robert.de.rooy@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoPM / yenta: Fix cardbus suspend/resume regression
Rafael J. Wysocki [Mon, 28 Sep 2009 22:11:03 +0000 (00:11 +0200)]
PM / yenta: Fix cardbus suspend/resume regression

commit 0c570cdeb8fdfcb354a3e9cd81bfc6a09c19de0c upstream.

Since 2.6.29 the PCI PM core have been restoring the standard
configuration registers of PCI devices in the early phase of
resume.  In particular, PCI devices without drivers have been handled
this way since commit 355a72d75b3b4f4877db4c9070c798238028ecb5
(PCI: Rework default handling of suspend and resume).  Unfortunately,
this leads to post-resume problems with CardBus devices which cannot
be accessed in the early phase of resume, because the sockets they
are on have not been woken up yet at that point.

To solve this problem, move the yenta socket resume to the early
phase of resume and, analogously, move the suspend of it to the late
phase of suspend.  Additionally, remove some unnecessary PCI code
from the yenta socket's resume routine.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13092, which is a
post-2.6.28 regression.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Florian <fs-kernelbugzilla@spline.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoPM / PCMCIA: Drop second argument of pcmcia_socket_dev_suspend()
Rafael J. Wysocki [Mon, 28 Sep 2009 22:10:41 +0000 (00:10 +0200)]
PM / PCMCIA: Drop second argument of pcmcia_socket_dev_suspend()

commit 827b4649d4626bf97b203b4bcd69476bb9b4e760 upstream.

pcmcia_socket_dev_suspend() doesn't use its second argument, so it
may be dropped safely.

This change is necessary for the subsequent yenta suspend/resume fix.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years ago/proc/kcore: work around a BUG()
KAMEZAWA Hiroyuki [Tue, 22 Sep 2009 00:01:02 +0000 (17:01 -0700)]
/proc/kcore: work around a BUG()

Not upstream due to other fixes in .32

Works around a BUG() which is triggered when the kernel accesses holes in
vmalloc regions.

BUG: unable to handle kernel paging request at fa54c000
IP: [<c04f687a>] read_kcore+0x260/0x31a
*pde = 3540b067 *pte = 00000000
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.2/0000:03:00.0/ieee80211/phy0/rfkill0/state
Modules linked in: fuse sco bridge stp llc bnep l2cap bluetooth sunrpc nf_conntrack_ftp ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath uinput usb_storage arc4 ecb snd_hda_codec_realtek snd_hda_intel ath5k snd_hda_codec snd_hwdep iTCO_wdt snd_pcm iTCO_vendor_support pcspkr i2c_i801 mac80211 joydev snd_timer serio_raw r8169 snd soundcore mii snd_page_alloc ath cfg80211 ata_generic i915 drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Sep  4 12:45:16 tuxedu kernel: Pid: 2266, comm: cat Not tainted (2.6.31-rc8 #2) Joybook Lite U101
EIP: 0060:[<c04f687a>] EFLAGS: 00010286 CPU: 0
EIP is at read_kcore+0x260/0x31a
EAX: f5e5ea00 EBX: fa54d000 ECX: 00000400 EDX: 00001000
ESI: fa54c000 EDI: f44ad000 EBP: e4533f4c ESP: e4533f24
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 2266, ti=e4532000 task=f09d19a0 task.ti=e4532000)
Stack:
00005000 00000000 f44ad000 09d9c000 00003000 fa54c000 00001000 f6d16f60
 e4520b80 fffffffb e4533f70 c04ef8eb e4533f98 00008000 09d97000 c04f661a
 e4520b80 09d97000 c04ef88c e4533f8c c04ba531 e4533f98 c04c0930 e4520b80
Call Trace:
[<c04ef8eb>] ? proc_reg_read+0x5f/0x73
[<c04f661a>] ? read_kcore+0x0/0x31a
[<c04ef88c>] ? proc_reg_read+0x0/0x73
[<c04ba531>] ? vfs_read+0x82/0xe1
[<c04c0930>] ? path_put+0x1a/0x1d
[<c04ba62e>] ? sys_read+0x40/0x62
[<c0403298>] ? sysenter_do_call+0x12/0x2d
Code: 39 f3 89 ca 0f 43 f3 89 fb 29 f2 29 f3 39 cf 0f 46 d3 29 55 dc 8d 1c 32 f6 40 0c 01 75 18 89 d1 89 f7 c1 e9 02 2b 7d ec 03 7d e0 <f3> a5 89 d1 83 e1 03 74 02 f3 a4 8b 00 83 7d dc 00 74 04 85 c0
EIP: [<c04f687a>] read_kcore+0x260/0x31a SS:ESP 0068:e4533f24
CR2: 00000000fa54c000

To access vmalloc area which may have memory holes, copy_from_user is
useful.  So this:

 # cat /proc/kcore > /dev/null

will not panic.

This is a minimal fix, suitable for 2.6.30.x and 2.6.31.  More extensive
/proc/kcore changes are planned for 2.6.32.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Nick Craig-Wood <nick@craig-wood.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Reported-by: <kbowa@tuxedu.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agopowerpc: Fix incorrect setting of __HAVE_ARCH_PTE_SPECIAL
Weirich, Bernhard [Thu, 24 Sep 2009 07:16:53 +0000 (17:16 +1000)]
powerpc: Fix incorrect setting of __HAVE_ARCH_PTE_SPECIAL

[I'm going to fix upstream differently, by having all CPU types
actually support _PAGE_SPECIAL, but I prefer the simple and obvious
fix for -stable. -- Ben]

The test that decides whether to define __HAVE_ARCH_PTE_SPECIAL on
powerpc is bogus and will end up always defining it, even when
_PAGE_SPECIAL is not supported (in which case it's 0) such as on
8xx or 40x processors.

Signed-off-by: Bernhard Weirich <bernhard.weirich@riedel.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agopowerpc/8xx: Fix regression introduced by cache coherency rewrite
Rex Feany [Thu, 24 Sep 2009 07:16:54 +0000 (17:16 +1000)]
powerpc/8xx: Fix regression introduced by cache coherency rewrite

commit e0908085fc2391c85b85fb814ae1df377c8e0dcb upstream.

After upgrading to the latest kernel on my mpc875 userspace started
running incredibly slow (hours to get to a shell, even!).
I tracked it down to commit 8d30c14cab30d405a05f2aaceda1e9ad57800f36,
that patch removed a work-around for the 8xx. Adding it
back makes my problem go away.

Signed-off-by: Rex Feany <rfeany@mrv.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agohugetlb: restore interleaving of bootmem huge pages (2.6.31)
Lee Schermerhorn [Tue, 22 Sep 2009 00:01:04 +0000 (17:01 -0700)]
hugetlb: restore interleaving of bootmem huge pages (2.6.31)

Not upstream as it is fixed differently in .32

I noticed that alloc_bootmem_huge_page() will only advance to the next
node on failure to allocate a huge page.  I asked about this on linux-mm
and linux-numa, cc'ing the usual huge page suspects.  Mel Gorman
responded:

I strongly suspect that the same node being used until allocation
failure instead of round-robin is an oversight and not deliberate
at all. It appears to be a side-effect of a fix made way back in
commit 63b4613c3f0d4b724ba259dc6c201bb68b884e1a ["hugetlb: fix
hugepage allocation with memoryless nodes"]. Prior to that patch
it looked like allocations would always round-robin even when
allocation was successful.

Andy Whitcroft countered that the existing behavior looked like Andi
Kleen's original implementation and suggested that we ask him.  We did and
Andy replied that his intention was to interleave the allocations.  So,
...

This patch moves the advance of the hstate next node from which to
allocate up before the test for success of the attempted allocation.  This
will unconditionally advance the next node from which to alloc,
interleaving successful allocations over the nodes with sufficient
contiguous memory, and skipping over nodes that fail the huge page
allocation attempt.

Note that alloc_bootmem_huge_page() will only be called for huge pages of
order > MAX_ORDER.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoFix idle time field in /proc/uptime
Michael Abbott [Thu, 24 Sep 2009 08:15:19 +0000 (10:15 +0200)]
Fix idle time field in /proc/uptime

commit 96830a57de1197519b62af6a4c9ceea556c18c3d upstream.

Git commit 79741dd changes idle cputime accounting, but unfortunately
the /proc/uptime file hasn't caught up.  Here the idle time calculation
from /proc/stat is copied over.

Signed-off-by: Michael Abbott <michael.abbott@diamond.ac.uk>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonetfilter: nf_nat: fix inverted logic for persistent NAT mappings
Patrick McHardy [Thu, 17 Sep 2009 11:58:26 +0000 (13:58 +0200)]
netfilter: nf_nat: fix inverted logic for persistent NAT mappings

netfilter: nf_nat: fix inverted logic for persistent NAT mappings

Upstream commit cce5a5c3:

Kernel 2.6.30 introduced a patch [1] for the persistent option in the
netfilter SNAT target. This is exactly what we need here so I had a quick look
at the code and noticed that the patch is wrong. The logic is simply inverted.
The patch below fixes this.

Also note that because of this the default behavior of the SNAT target has
changed since kernel 2.6.30 as it now ignores the destination IP in choosing
the source IP for nating (which should only be the case if the persistent
option is set).

[1] http://git.eu.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=98d500d66cb7940747b424b245fc6a51ecfbf005

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonetfilter: ebt_ulog: fix checkentry return value
Patrick McHardy [Thu, 17 Sep 2009 11:58:30 +0000 (13:58 +0200)]
netfilter: ebt_ulog: fix checkentry return value

netfilter: ebt_ulog: fix checkentry return value

Upstream commit 8a56df0a:

Commit 19eda87 (netfilter: change return types of check functions for
Ebtables extensions) broke the ebtables ulog module by missing a return
value conversion.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonetfilter: bridge: refcount fix
Patrick McHardy [Thu, 17 Sep 2009 11:58:29 +0000 (13:58 +0200)]
netfilter: bridge: refcount fix

netfilter: bridge: refcount fix

Upstream commit f3abc9b9:

commit f216f082b2b37c4943f1e7c393e2786648d48f6f
([NETFILTER]: bridge netfilter: deal with martians correctly)
added a refcount leak on in_dev.

Instead of using in_dev_get(), we can use __in_dev_get_rcu(),
as netfilter hooks are running under rcu_read_lock(), as pointed
by Patrick.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoFix NULL ptr regression in powernow-k8
Kurt Roeckx [Wed, 16 Sep 2009 15:09:32 +0000 (11:09 -0400)]
Fix NULL ptr regression in powernow-k8

commit f0adb134d8dc9993a9998dc50845ec4f6ff4fadc upstream.

Fixes bugzilla #13780

From: Kurt Roeckx <kurt@roeckx.be>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonet: Make the copy length in af_packet sockopt handler unsigned
Arjan van de Ven [Wed, 30 Sep 2009 11:54:47 +0000 (13:54 +0200)]
net: Make the copy length in af_packet sockopt handler unsigned

fixed upstream in commit b7058842c940ad2c08dd829b21e5c92ebe3b8758 in a different way

The length of the to-copy data structure is currently stored in
a signed integer. However many comparisons are done with sizeof(..)
which is unsigned. It's more suitable for this variable to be unsigned
to make these comparisons more naturally right.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonet ax25: Fix signed comparison in the sockopt handler
Arjan van de Ven [Wed, 30 Sep 2009 11:51:11 +0000 (13:51 +0200)]
net ax25: Fix signed comparison in the sockopt handler

fixed upstream in commit b7058842c940ad2c08dd829b21e5c92ebe3b8758 in a different way

The ax25 code tried to use

        if (optlen < sizeof(int))
                return -EINVAL;

as a security check against optlen being negative (or zero) in the
set socket option.

Unfortunately, "sizeof(int)" is an unsigned property, with the
result that the whole comparison is done in unsigned, letting
negative values slip through.

This patch changes this to

        if (optlen < (int)sizeof(int))
                return -EINVAL;

so that the comparison is done as signed, and negative values
get properly caught.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoahci: restore pci_intx() handling
Tejun Heo [Wed, 16 Sep 2009 15:34:39 +0000 (00:34 +0900)]
ahci: restore pci_intx() handling

commit 31b239ad1ba7225435e13f5afc47e48eb674c0cc upstream.

Commit a5bfc4714b3f01365aef89a92673f2ceb1ccf246 dropped explicit
pci_intx() manipulation from ahci because it seemed unnecessary and
ahci doesn't seem to be the right place to be tweaking it if it were.
This was largely okay but there are exceptions.  There was one on an
embedded platform which was fixed via firmware and now bko#14124
reports it on a HP DL320.

  http://bugzilla.kernel.org/show_bug.cgi?id=14124

I still think this isn't something libata drivers should be caring
about (the only ones which are calling pci_intx() explicitly are
libata ones and one other driver) but for now reverting the change
seems to be the right thing to do.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoRevert "KVM: x86: check for cr3 validity in ioctl_set_sregs"
Marcelo Tosatti [Mon, 28 Sep 2009 18:05:53 +0000 (15:05 -0300)]
Revert "KVM: x86: check for cr3 validity in ioctl_set_sregs"

(cherry picked from commit dc7e795e3dd2a763e5ceaa1615f307e808cf3932)

This reverts commit 6c20e1442bb1c62914bb85b7f4a38973d2a423ba.

To my understanding, it became obsolete with the advent of the more
robust check in mmu_alloc_roots (89da4ff17f). Moreover, it prevents
the conceptually safe pattern

 1. set sregs
 2. register mem-slots
 3. run vcpu

by setting a sticky triple fault during step 1.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: fix cpuid E2BIG handling for extended request types
Mark McLoughlin [Fri, 18 Sep 2009 23:08:07 +0000 (20:08 -0300)]
KVM: fix cpuid E2BIG handling for extended request types

(cherry picked from commit cb007648de83cf226d69ec76e1c01848b4e8e49f)

If we run out of cpuid entries for extended request types
we should return -E2BIG, just like we do for the standard
request types.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM guest: fix bogus wallclock physical address calculation
Glauber Costa [Fri, 18 Sep 2009 23:08:06 +0000 (20:08 -0300)]
KVM guest: fix bogus wallclock physical address calculation

(cherry picked from commit a20316d2aa41a8f4fd171648bad8f044f6060826)

The use of __pa() to calculate the address of a C-visible symbol
is wrong, and can lead to unpredictable results. See arch/x86/include/asm/page.h
for details.

It should be replaced with __pa_symbol(), that does the correct math here,
by taking relocations into account.  This ensures the correct wallclock data
structure physical address is passed to the hypervisor.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: limit lapic periodic timer frequency
Marcelo Tosatti [Fri, 18 Sep 2009 23:08:05 +0000 (20:08 -0300)]
KVM: limit lapic periodic timer frequency

(cherry picked from commit 1444885a045fe3b1905a14ea1b52540bf556578b)

Otherwise its possible to starve the host by programming lapic timer
with a very high frequency.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: MMU: fix bogus alloc_mmu_pages assignment
Marcelo Tosatti [Fri, 18 Sep 2009 23:08:04 +0000 (20:08 -0300)]
KVM: MMU: fix bogus alloc_mmu_pages assignment

(cherry picked from commit b90c062c65cc8839edfac39778a37a55ca9bda36)

Remove the bogus n_free_mmu_pages assignment from alloc_mmu_pages.

It breaks accounting of mmu pages, since n_free_mmu_pages is modified
but the real number of pages remains the same.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: MMU: fix missing locking in alloc_mmu_pages
Marcelo Tosatti [Fri, 18 Sep 2009 23:08:03 +0000 (20:08 -0300)]
KVM: MMU: fix missing locking in alloc_mmu_pages

(cherry picked from commit 6a1ac77110ee3e8d8dfdef8442f3b30b3d83e6a2)

n_requested_mmu_pages/n_free_mmu_pages are used by
kvm_mmu_change_mmu_pages to calculate the number of pages to zap.

alloc_mmu_pages, called from the vcpu initialization path, modifies this
variables without proper locking, which can result in a negative value
in kvm_mmu_change_mmu_pages (say, with cpu hotplug).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: x86: Disallow hypercalls for guest callers in rings > 0
Jan Kiszka [Fri, 18 Sep 2009 23:08:02 +0000 (20:08 -0300)]
KVM: x86: Disallow hypercalls for guest callers in rings > 0

(cherry picked from commit 07708c4af1346ab1521b26a202f438366b7bcffd)

So far unprivileged guest callers running in ring 3 can issue, e.g., MMU
hypercalls. Normally, such callers cannot provide any hand-crafted MMU
command structure as it has to be passed by its physical address, but
they can still crash the guest kernel by passing random addresses.

To close the hole, this patch considers hypercalls valid only if issued
from guest ring 0. This may still be relaxed on a per-hypercall base in
the future once required.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: MMU: make __kvm_mmu_free_some_pages handle empty list
Izik Eidus [Fri, 18 Sep 2009 23:08:01 +0000 (20:08 -0300)]
KVM: MMU: make __kvm_mmu_free_some_pages handle empty list

(cherry picked from commit 3b80fffe2b31fb716d3ebe729c54464ee7856723)

First check if the list is empty before attempting to look at list
entries.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: VMX: Fix cr8 exiting control clobbering by EPT
Gleb Natapov [Fri, 18 Sep 2009 23:08:00 +0000 (20:08 -0300)]
KVM: VMX: Fix cr8 exiting control clobbering by EPT

(cherry picked from commit 5fff7d270bd6a4759b6d663741b729cdee370257)
Don't call adjust_vmx_controls() two times for the same control.
It restores options that were dropped earlier.  This loses us the cr8
exit control, which causes a massive performance regression Windows x64.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: VMX: Check cpl before emulating debug register access
Avi Kivity [Fri, 18 Sep 2009 23:07:59 +0000 (20:07 -0300)]
KVM: VMX: Check cpl before emulating debug register access

(cherry picked from commit 0a79b009525b160081d75cef5dbf45817956acf2)

Debug registers may only be accessed from cpl 0.  Unfortunately, vmx will
code to emulate the instruction even though it was issued from guest
userspace, possibly leading to an unexpected trap later.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoRe-enable Lanman security
Chuck Ebbert [Tue, 15 Sep 2009 05:53:21 +0000 (01:53 -0400)]
Re-enable Lanman security

commit 20d1752f3d6bd32beb90949559e0d14a0b234445 upstream.

commit ac68392460ffefed13020967bae04edc4d3add06 ("[CIFS] Allow raw
ntlmssp code to be enabled with sec=ntlmssp") added a new bit to the
allowed security flags mask but seems to have inadvertently removed
Lanman security from the allowed flags. Add it back.

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agop54usb: add Zcomax XG-705A usbid
Christian Lamparter [Mon, 14 Sep 2009 21:08:43 +0000 (23:08 +0200)]
p54usb: add Zcomax XG-705A usbid

commit f7f71173ea69d4dabf166533beffa9294090b7ef upstream.

This patch adds a new usbid for Zcomax XG-705A to the device table.

Reported-by: Jari Jaakola <jari.jaakola@gmail.com>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonilfs2: fix missing zero-fill initialization of btree node cache
Ryusuke Konishi [Sun, 27 Sep 2009 16:46:11 +0000 (01:46 +0900)]
nilfs2: fix missing zero-fill initialization of btree node cache

commit 1f28fcd925b2b3157411bbd08f0024b55b70d8dd upstream.

This will fix file system corruption which infrequently happens after
mount.  The problem was reported from users with the title "[NILFS
users] Fail to mount NILFS." (Message-ID:
<200908211918.34720.yuri@itinteg.net>), and so forth.  I've also
experienced the corruption multiple times on kernel 2.6.30 and 2.6.31.

The problem turned out to be caused due to discordance between
mapping->nrpages of a btree node cache and the actual number of pages
hung on the cache; if the mapping->nrpages becomes zero even as it has
pages, truncate_inode_pages() returns without doing anything.  Usually
this is harmless except it may cause page leak, but garbage collection
fairly infrequently sees a stale page remained in the btree node cache
of DAT (i.e. disk address translation file of nilfs), and induces the
corruption.

I identified a missing initialization in btree node caches was the
root cause.  This corrects the bug.

I've tested this for kernel 2.6.30 and 2.6.31.

Reported-by: Yuri Chislov <yuri@itinteg.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agokallsyms: fix segfault in prefix_underscores_count()
Paul Mundt [Tue, 22 Sep 2009 23:44:12 +0000 (16:44 -0700)]
kallsyms: fix segfault in prefix_underscores_count()

commit a9ece53c4089ef23d4002d34c4c7148d94622a40 upstream.

Commit b478b782e110fdb4135caa3062b6d687e989d994 "kallsyms, tracing: output
more proper symbol name" introduces a "bugfix" that introduces a segfault
in kallsyms in my configurations.

The cause is the introduction of prefix_underscores_count() which attempts
to count underscores, even in symbols that do not have them.  As a result,
it just uselessly runs past the end of the buffer until it crashes:

  CC      init/version.o
  LD      init/built-in.o
  LD      .tmp_vmlinux1
  KSYM    .tmp_kallsyms1.S
/bin/sh: line 1: 16934 Done                    sh-linux-gnu-nm -n .tmp_vmlinux1
     16935 Segmentation fault      | scripts/kallsyms > .tmp_kallsyms1.S
make: *** [.tmp_kallsyms1.S] Error 139

This simplifies the logic and just does a straightforward count.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Paulo Marques <pmarques@grupopie.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agofs: make sure data stored into inode is properly seen before unlocking new inode
Jan Kara [Tue, 22 Sep 2009 00:01:06 +0000 (17:01 -0700)]
fs: make sure data stored into inode is properly seen before unlocking new inode

commit 580be0837a7a59b207c3d5c661d044d8dd0a6a30 upstream.

In theory it could happen that on one CPU we initialize a new inode but
clearing of I_NEW | I_LOCK gets reordered before some of the
initialization.  Thus on another CPU we return not fully uptodate inode
from iget_locked().

This seems to fix a corruption issue on ext3 mounted over NFS.

[akpm@linux-foundation.org: add some commentary]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoACPI: pci_slot.ko wants a 64-bit _SUN
Alex Chiang [Tue, 4 Aug 2009 20:44:17 +0000 (14:44 -0600)]
ACPI: pci_slot.ko wants a 64-bit _SUN

commit 7e24bc1ce669b2876ffa475ea1147f2bb9ffdc52 upstream.

Similar to commit b6adc195 (PCI hotplug: acpiphp wants a 64-bit
_SUN), pci_slot.ko reads and creates sysfs directories based on
the _SUN method.

Certain HP platforms return 64 bits in _SUN. This change to
pci_slot.ko allows us to see the correct sysfs directories.

Reported-by: Chad Smith <chad.smith@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoLinux 2.6.30.8 v2.6.30.8
Greg Kroah-Hartman [Thu, 24 Sep 2009 15:28:02 +0000 (08:28 -0700)]
Linux 2.6.30.8

14 years agopowerpc/pseries: Fix to handle slb resize across migration
Brian King [Fri, 28 Aug 2009 12:06:29 +0000 (12:06 +0000)]
powerpc/pseries: Fix to handle slb resize across migration

commit 46db2f86a3b2a94e0b33e0b4548fb7b7b6bdff66 upstream.

The SLB can change sizes across a live migration, which was not
being handled, resulting in possible machine crashes during
migration if migrating to a machine which has a smaller max SLB
size than the source machine. Fix this by first reducing the
SLB size to the minimum possible value, which is 32, prior to
migration. Then during the device tree update which occurs after
migration, we make the call to ensure the SLB gets updated. Also
add the slb_size to the lparcfg output so that the migration
tools can check to make sure the kernel has this capability
before allowing migration in scenarios where the SLB size will change.

BenH: Fixed #include <asm/mmu-hash64.h> -> <asm/mmu.h> to avoid
      breaking ppc32 build

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoPCI: Unhide the SMBus on the Compaq Evo D510 USDT
Jean Delvare [Tue, 28 Jul 2009 09:49:19 +0000 (11:49 +0200)]
PCI: Unhide the SMBus on the Compaq Evo D510 USDT

commit 6b5096e4d4496e185cd1ada5d1b8e1d941c805ed upstream.

One more form factor for Compaq Evo D510, which needs the same quirk
as the other form factors. Apparently there's no hardware monitoring
chip on that one, but SPD EEPROMs, so it's still worth unhiding the
SMBus.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Nuzhna Pomoshch <nuzhna_pomoshch@yahoo.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agolibata: fix off-by-one error in ata_tf_read_block()
Tejun Heo [Sun, 16 Aug 2009 12:21:21 +0000 (21:21 +0900)]
libata: fix off-by-one error in ata_tf_read_block()

commit ac8672ea922bde59acf50eaa1eaa1640a6395fd2 upstream.

ata_tf_read_block() has off-by-one error when converting CHS address
to LBA.  The bug isn't very visible because ata_tf_read_block() is
used only when generating sense data for a failed RW command and CHS
addressing isn't used too often these days.

This problem was spotted by Atsushi Nemoto.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agovirtio_blk: don't bounce highmem requests
Christoph Hellwig [Fri, 11 Sep 2009 22:49:19 +0000 (18:49 -0400)]
virtio_blk: don't bounce highmem requests

commit 4eff3cae9c9809720c636e64bc72f212258e0bd5 upstream

virtio_blk: don't bounce highmem requests

By default a block driver bounces highmem requests, but virtio-blk is
perfectly fine with any request that fit into it's 64 bit addressing scheme,
mapped in the kernel virtual space or not.

Besides improving performance on highmem systems this also makes the
reproducible oops in __bounce_end_io go away (but hiding the real cause).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoV4L: em28xx: set up tda9887_conf in em28xx_card_setup()
Franklin Meng [Sat, 12 Sep 2009 14:31:05 +0000 (10:31 -0400)]
V4L: em28xx: set up tda9887_conf in em28xx_card_setup()

V4L: em28xx: set up tda9887_conf in em28xx_card_setup()

(cherry picked from commit ae3340cbf59ea362c2016eea762456cc0969fd9e)

Added tda9887_conf set up into em28xx_card_setup()

Signed-off-by: Franklin Meng <fmeng2002@yahoo.com>
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agox86, pat: Fix cacheflush address in change_page_attr_set_clr()
Jack Steiner [Thu, 3 Sep 2009 17:56:02 +0000 (12:56 -0500)]
x86, pat: Fix cacheflush address in change_page_attr_set_clr()

commit fa526d0d641b5365676a1fb821ce359e217c9b85 upstream.

Fix address passed to cpa_flush_range() when changing page
attributes from WB to UC. The address (*addr) is
modified by __change_page_attr_set_clr(). The result is that
the pages being flushed start at the _end_ of the changed range
instead of the beginning.

This should be considered for 2.6.30-stable and 2.6.31-stable.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agox86/i386: Make sure stack-protector segment base is cache aligned
Jeremy Fitzhardinge [Thu, 3 Sep 2009 19:27:15 +0000 (12:27 -0700)]
x86/i386: Make sure stack-protector segment base is cache aligned

commit 1ea0d14e480c245683927eecc03a70faf06e80c8 upstream.

The Intel Optimization Reference Guide says:

In Intel Atom microarchitecture, the address generation unit
assumes that the segment base will be 0 by default. Non-zero
segment base will cause load and store operations to experience
a delay.
- If the segment base isn't aligned to a cache line
  boundary, the max throughput of memory operations is
  reduced to one [e]very 9 cycles.
[...]
Assembly/Compiler Coding Rule 15. (H impact, ML generality)
For Intel Atom processors, use segments with base set to 0
whenever possible; avoid non-zero segment base address that is
not aligned to cache line boundary at all cost.

We can't avoid having a non-zero base for the stack-protector
segment, but we can make it cache-aligned.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
LKML-Reference: <4AA01893.6000507@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agox86: Fix x86_model test in es7000_apic_is_cluster()
Roel Kluin [Tue, 25 Aug 2009 13:35:12 +0000 (15:35 +0200)]
x86: Fix x86_model test in es7000_apic_is_cluster()

commit 005155b1f626d2b2d7932e4afdf4fead168c6888 upstream.

For the x86_model to be greater than 6 or less than 12 is
logically always true.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosound: oxygen: work around MCE when changing volume
Clemens Ladisch [Mon, 7 Sep 2009 08:18:54 +0000 (10:18 +0200)]
sound: oxygen: work around MCE when changing volume

commit f1bc07af9a9edc5c1d4bdd971f7099316ed2e405 upstream.

When the volume is changed continuously (e.g., when the user drags a
volume slider with the mouse), the driver does lots of I2C writes.
Apparently, the sound chip can get confused when we poll the I2C status
register too much, and fails to complete a read from it.  On the PCI-E
models, the PCI-E/PCI bridge gets upset by this and generates a machine
check exception.

To avoid this, this patch replaces the polling with an unconditional
wait that is guaranteed to be long enough.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Tested-by: Johann Messner <johann.messner at jku.at>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoPCI: apply nv_msi_ht_cap_quirk on resume too
Tejun Heo [Tue, 21 Jul 2009 23:08:43 +0000 (16:08 -0700)]
PCI: apply nv_msi_ht_cap_quirk on resume too

commit 6dab62ee5a3bf4f71b8320c09db2e6022a19f40e upstream.

http://bugzilla.kernel.org/show_bug.cgi?id=12542 reports that with the
quirk not applied on resume, msi stops working after resuming and mcp78s
ahci fails due to IRQ mis-delivery.  Apply it on resume too.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peer Chen <pchen@nvidia.com>
Cc: Tj <linux@tjworld.net>
Reported-by: Nicolas Derive <kalon33@ubuntu.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agomlx4_core: Allocate and map sufficient ICM memory for EQ context
Roland Dreier [Sun, 6 Sep 2009 03:24:49 +0000 (20:24 -0700)]
mlx4_core: Allocate and map sufficient ICM memory for EQ context

commit fa0681d2129732027355d6b7083dd8932b9b799d upstream.

The current implementation allocates a single host page for EQ context
memory, which was OK when we only allocated a few EQs.  However, since
we now allocate an EQ for each CPU core, this patch removes the
hard-coded limit (which we exceed with 4 KB pages and 128 byte EQ
context entries with 32 CPUs) and uses the same ICM table code as all
other context tables, which ends up simplifying the code quite a bit
while fixing the problem.

This problem was actually hit in practice on a dual-socket Nehalem box
with 16 real hardware threads and sufficiently odd ACPI tables that it
shows on boot

    SMP: Allowing 32 CPUs, 16 hotplug CPUs

so num_possible_cpus() ends up 32, and mlx4 ends up creating 33 MSI-X
interrupts and 33 EQs.  This mlx4 bug means that mlx4 can't even
initialize at all on this quite mainstream system.

Reported-by: Eli Cohen <eli@mellanox.co.il>
Tested-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoASoC: Fix WM835x Out4 capture enumeration
Mark Brown [Mon, 7 Sep 2009 17:09:58 +0000 (18:09 +0100)]
ASoC: Fix WM835x Out4 capture enumeration

commit 87831cb660954356d68cebdb1406f3be09e784e9 upstream.

It's the 8th enum of a zero indexed array. This is why I don't let
new drivers use these arrays of enums...

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoARM: 5691/1: fix cache aliasing issues between kmap() and kmap_atomic() with highmem
Nicolas Pitre [Thu, 3 Sep 2009 20:45:59 +0000 (21:45 +0100)]
ARM: 5691/1: fix cache aliasing issues between kmap() and kmap_atomic() with highmem

commit 7929eb9cf643ae416e5081b2a6fa558d37b9854c upstream.

Let's suppose a highmem page is kmap'd with kmap().  A pkmap entry is
used, the page mapped to it, and the virtual cache is dirtied.  Then
kunmap() is used which does virtually nothing except for decrementing a
usage count.

Then, let's suppose the _same_ page gets mapped using kmap_atomic().
It is therefore mapped onto a fixmap entry instead, which has a
different virtual address unaware of the dirty cache data for that page
sitting in the pkmap mapping.

Fortunately it is easy to know if a pkmap mapping still exists for that
page and use it directly with kmap_atomic(), thanks to kmap_high_get().

And actual testing with a printk in the added code path shows that this
condition is actually met *extremely* frequently.  Seems that we've been
quite lucky that things have worked so well with highmem so far.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoALSA: cs46xx - Fix minimum period size
Sophie Hamilton [Tue, 8 Sep 2009 08:58:42 +0000 (10:58 +0200)]
ALSA: cs46xx - Fix minimum period size

commit 6148b130eb84edc76e4fa88da1877b27be6c2f06 upstream.

Fix minimum period size for cs46xx cards. This fixes a problem in the
case where neither a period size nor a buffer size is passed to ALSA;
this is the case in Audacious, OpenAL, and others.

Signed-off-by: Sophie Hamilton <kernel@theblob.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoudf: Use device size when drive reported bogus number of written blocks
Jan Kara [Thu, 18 Jun 2009 10:33:16 +0000 (12:33 +0200)]
udf: Use device size when drive reported bogus number of written blocks

commit 24a5d59f3477bcff4c069ff4d0ca9a3e037d0235 upstream.

Some drives report 0 as the number of written blocks when there are some blocks
recorded. Use device size in such case so that we can automagically mount such
media.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoTPM: Fixup boot probe timeout for tpm_tis driver
Jason Gunthorpe [Wed, 9 Sep 2009 23:22:18 +0000 (17:22 -0600)]
TPM: Fixup boot probe timeout for tpm_tis driver

commit ec57935837a78f9661125b08a5d08b697568e040 upstream.

When probing the device in tpm_tis_init the call request_locality
uses timeout_a, which wasn't being initalized until after
request_locality. This results in request_locality falsely timing
out if the chip is still starting. Move the initialization to before
request_locality.

This probably only matters for embedded cases (ie mine), a BIOS likely
gets the TPM into a state where this code path isn't necessary.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agopowerpc/ps3: Workaround for flash memory I/O error
Geoff Levand [Wed, 9 Sep 2009 13:28:05 +0000 (13:28 +0000)]
powerpc/ps3: Workaround for flash memory I/O error

commit bc00351edd5c1b84d48c3fdca740fedfce4ae6ce upstream.

A workaround for flash memory I/O errors when the PS3 internal
hard disk has not been formatted for OtherOS use.

This error condition mainly effects 'Live CD' users who have not
formatted the PS3's internal hard disk for OtherOS.

Fixes errors similar to these when using the ps3-flash-util
or ps3-boot-game-os programs:

  ps3flash read failed 0x2050000
  os_area_header_read: read error: os_area_header: Input/output error
  main:627: os_area_read_hp error.
  ERROR: can't change boot flag

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agofix undefined reference to user_shm_unlock
Hugh Dickins [Sat, 12 Sep 2009 11:21:27 +0000 (12:21 +0100)]
fix undefined reference to user_shm_unlock

commit 2195d2818c37bdf263865f1e9effccdd9fc5f9d4 upstream.

My 353d5c30c666580347515da609dd74a2b8e9b828 "mm: fix hugetlb bug due to
user_shm_unlock call" broke the CONFIG_SYSVIPC !CONFIG_MMU build of both
2.6.31 and 2.6.30.6: "undefined reference to `user_shm_unlock'".

gcc didn't understand my comment! so couldn't figure out to optimize
away user_shm_unlock() from the error path in the hugetlb-less case, as
it does elsewhere.  Help it to do so, in a language it understands.

Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agocfg80211: fix looping soft lockup in find_ie()
Bob Copeland [Tue, 1 Sep 2009 22:12:11 +0000 (18:12 -0400)]
cfg80211: fix looping soft lockup in find_ie()

commit fcc6cb0c13555e78c2d47257b6d1b5e59b0c419a upstream.

The find_ie() function uses a size_t for the len parameter, and
directly uses len as a loop variable.  If any received packets
are malformed, it is possible for the decrease of len to overflow,
and since the result is unsigned, the loop will not terminate.
Change it to a signed int so the loop conditional works for
negative values.

This fixes the following soft lockup:

[38573.102007] BUG: soft lockup - CPU#0 stuck for 61s! [phy0:2230]
[38573.102007] Modules linked in: aes_i586 aes_generic fuse af_packet ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state iptable_filter ip_tables x_tables acpi_cpufreq binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod kvm_intel kvm uinput i915 arc4 ecb drm snd_hda_codec_idt ath5k snd_hda_intel hid_apple mac80211 usbhid appletouch snd_hda_codec snd_pcm ath cfg80211 snd_timer i2c_algo_bit ohci1394 video snd processor ieee1394 rfkill ehci_hcd sg sky2 backlight snd_page_alloc uhci_hcd joydev output ac thermal button battery sr_mod applesmc cdrom input_polldev evdev unix [last unloaded: scsi_wait_scan]
[38573.102007] irq event stamp: 2547724535
[38573.102007] hardirqs last  enabled at (2547724534): [<c1002ffc>] restore_all_notrace+0x0/0x18
[38573.102007] hardirqs last disabled at (2547724535): [<c10038f4>] apic_timer_interrupt+0x28/0x34
[38573.102007] softirqs last  enabled at (92950144): [<c103ab48>] __do_softirq+0x108/0x210
[38573.102007] softirqs last disabled at (92950274): [<c1348e74>] _spin_lock_bh+0x14/0x80
[38573.102007]
[38573.102007] Pid: 2230, comm: phy0 Tainted: G        W  (2.6.31-rc7-wl #8) MacBook1,1
[38573.102007] EIP: 0060:[<f8ea2d50>] EFLAGS: 00010292 CPU: 0
[38573.102007] EIP is at cmp_ies+0x30/0x180 [cfg80211]
[38573.102007] EAX: 00000082 EBX: 00000000 ECX: ffffffc1 EDX: d8efd014
[38573.102007] ESI: ffffff7c EDI: 0000004d EBP: eee2dc50 ESP: eee2dc3c
[38573.102007]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[38573.102007] CR0: 8005003b CR2: d8efd014 CR3: 01694000 CR4: 000026d0
[38573.102007] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[38573.102007] DR6: ffff0ff0 DR7: 00000400
[38573.102007] Call Trace:
[38573.102007]  [<f8ea2f8d>] cmp_bss+0xed/0x100 [cfg80211]
[38573.102007]  [<f8ea33e4>] cfg80211_bss_update+0x84/0x410 [cfg80211]
[38573.102007]  [<f8ea3884>] cfg80211_inform_bss_frame+0x114/0x180 [cfg80211]
[38573.102007]  [<f97255ff>] ieee80211_bss_info_update+0x4f/0x180 [mac80211]
[38573.102007]  [<f972b118>] ieee80211_rx_bss_info+0x88/0xf0 [mac80211]
[38573.102007]  [<f9739297>] ? ieee802_11_parse_elems+0x27/0x30 [mac80211]
[38573.102007]  [<f972b224>] ieee80211_rx_mgmt_probe_resp+0xa4/0x1c0 [mac80211]
[38573.102007]  [<f972bc59>] ieee80211_sta_rx_queued_mgmt+0x919/0xc50 [mac80211]
[38573.102007]  [<c1009707>] ? sched_clock+0x27/0xa0
[38573.102007]  [<c1009707>] ? sched_clock+0x27/0xa0
[38573.102007]  [<c105ffd0>] ? mark_held_locks+0x60/0x80
[38573.102007]  [<c1348be5>] ? _spin_unlock_irqrestore+0x55/0x70
[38573.102007]  [<c134baa5>] ? sub_preempt_count+0x85/0xc0
[38573.102007]  [<c1348bce>] ? _spin_unlock_irqrestore+0x3e/0x70
[38573.102007]  [<c12c1c0f>] ? skb_dequeue+0x4f/0x70
[38573.102007]  [<f972c021>] ieee80211_sta_work+0x91/0xb80 [mac80211]
[38573.102007]  [<c1009707>] ? sched_clock+0x27/0xa0
[38573.102007]  [<c134baa5>] ? sub_preempt_count+0x85/0xc0
[38573.102007]  [<c10479af>] worker_thread+0x18f/0x320
[38573.102007]  [<c104794e>] ? worker_thread+0x12e/0x320
[38573.102007]  [<c1348be5>] ? _spin_unlock_irqrestore+0x55/0x70
[38573.102007]  [<f972bf90>] ? ieee80211_sta_work+0x0/0xb80 [mac80211]
[38573.102007]  [<c104cbb0>] ? autoremove_wake_function+0x0/0x50
[38573.102007]  [<c1047820>] ? worker_thread+0x0/0x320
[38573.102007]  [<c104c854>] kthread+0x84/0x90
[38573.102007]  [<c104c7d0>] ? kthread+0x0/0x90
[38573.102007]  [<c1003ab7>] kernel_thread_helper+0x7/0x10

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agobinfmt_elf: fix PT_INTERP bss handling
Roland McGrath [Wed, 9 Sep 2009 02:49:40 +0000 (19:49 -0700)]
binfmt_elf: fix PT_INTERP bss handling

commit 9f0ab4a3f0fdb1ff404d150618ace2fa069bb2e1 upstream.

In fs/binfmt_elf.c, load_elf_interp() calls padzero() for .bss even if
the PT_LOAD has no PROT_WRITE and no .bss.  This generates EFAULT.

Here is a small test case.  (Yes, there are other, useful PT_INTERP
which have only .text and no .data/.bss.)

----- ptinterp.S
_start: .globl _start
 nop
 int3
-----
$ gcc -m32 -nostartfiles -nostdlib -o ptinterp ptinterp.S
$ gcc -m32 -Wl,--dynamic-linker=ptinterp -o hello hello.c
$ ./hello
Segmentation fault  # during execve() itself

After applying the patch:
$ ./hello
Trace trap  # user-mode execution after execve() finishes

If the ELF headers are actually self-inconsistent, then dying is fine.
But having no PROT_WRITE segment is perfectly normal and correct if
there is no segment with p_memsz > p_filesz (i.e. bss).  John Reiser
suggested checking for PROT_WRITE in the bss logic.  I think it makes
most sense to simply apply the bss logic only when there is bss.

This patch looks less trivial than it is due to some reindentation.
It just moves the "if (last_bss > elf_bss) {" test up to include the
partial-page bss logic as well as the more-pages bss logic.

Reported-by: John Reiser <jreiser@bitwagon.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoath5k: write PCU registers on initial reset
Bob Copeland [Sun, 5 Jul 2009 01:03:13 +0000 (21:03 -0400)]
ath5k: write PCU registers on initial reset

commit 3355443ad7601991affa5992b0d53870335af765 upstream.

"Ath5k: unify resets"
introduced a regression into 2.6.28 where the PCU registers are never
initialized, due to ath5k_reset() always passing true for change_channel.
We subsequently program a lot of these registers but several may start
in an unknown state.

Reported-by: Forrest Zhang <forrest@hifulltech.com>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoagp/intel: remove restore in resume
Zhenyu Wang [Mon, 14 Sep 2009 02:47:06 +0000 (10:47 +0800)]
agp/intel: remove restore in resume

commit 121264827656f5f06328b17983c796af17dc5949 upstream.

As early pci resume has already restored config for host
bridge and graphics device, don't need to restore it again,
This removes an original order hack for graphics device restore.

This fixed the resume hang issue found by Alan Stern on 845G,
caused by extra config restore on graphics device.

Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosg: fix oops in the error path in sg_build_indirect()
Michal Schmidt [Thu, 3 Sep 2009 12:27:08 +0000 (14:27 +0200)]
sg: fix oops in the error path in sg_build_indirect()

commit e71044ee2efa4792e21d243b03d49006db66aec9 upstream.

When the allocation fails in sg_build_indirect(), an oops happens in
the error path. It's caused by an obvious typo.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reported-by: Bob Tracy <rct@gherkin.frus.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoInput: joydev - decouple axis and button map ioctls from input constants
Stephen Kitt [Wed, 12 Aug 2009 08:12:08 +0000 (01:12 -0700)]
Input: joydev - decouple axis and button map ioctls from input constants

commit ec8b4b7085605e801a7740a2c3c33256aebe249c upstream.

The KEY_MAX change in 2.6.28 changed the values of the JSIOCSBTNMAP and
JSIOCGBTNMAP constants; software compiled with the old values no longer
works with kernels following 2.6.28, because the ioctl switch statement
no longer matches the values given by the software. This patch handles
these ioctls independently of the length of data specified, and applies the
same treatment to JSIOCSAXMAP and JSIOCGAXMAP which currently depend on
ABS_MAX.

Signed-off-by: Stephen Kitt <steve@sk2.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoLinux 2.6.30.7 v2.6.30.7
Greg Kroah-Hartman [Tue, 15 Sep 2009 17:46:05 +0000 (10:46 -0700)]
Linux 2.6.30.7

14 years agodm snapshot: fix on disk chunk size validation
Mikulas Patocka [Fri, 4 Sep 2009 19:40:43 +0000 (20:40 +0100)]
dm snapshot: fix on disk chunk size validation

commit ae0b7448e91353ea5f821601a055aca6b58042cd upstream.

Fix some problems seen in the chunk size processing when activating a
pre-existing snapshot.

For a new snapshot, the chunk size can either be supplied by the creator
or a default value can be used.  For an existing snapshot, the
chunk size in the snapshot header on disk should always be used.

If someone attempts to load an existing snapshot and has the 'default
chunk size' option set, the kernel uses its default value even when it
is incorrect for the snapshot being loaded.  This patch ensures the
correct on-disk value is always used.

Secondly, when the code does use the chunk size stored on the disk it is
prudent to revalidate it, so the code can exit cleanly if it got
corrupted as happened in
https://bugzilla.redhat.com/show_bug.cgi?id=461506 .

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodm exception store: split set_chunk_size
Mikulas Patocka [Fri, 4 Sep 2009 19:40:41 +0000 (20:40 +0100)]
dm exception store: split set_chunk_size

commit 2defcc3fb4661e7351cb2ac48d843efc4c64db13 upstream.

Break the function set_chunk_size to two functions in preparation for
the fix in the following patch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodm snapshot: fix header corruption race on invalidation
Mikulas Patocka [Fri, 4 Sep 2009 19:40:39 +0000 (20:40 +0100)]
dm snapshot: fix header corruption race on invalidation

commit 61578dcd3fafe6babd72e8db32110cc0b630a432 upstream.

If a persistent snapshot fills up, a race can corrupt the on-disk header
which causes a crash on any future attempt to activate the snapshot
(typically while booting).  This patch fixes the race.

When the snapshot overflows, __invalidate_snapshot is called, which calls
snapshot store method drop_snapshot. It goes to persistent_drop_snapshot that
calls write_header. write_header constructs the new header in the "area"
location.

Concurrently, an existing kcopyd job may finish, call copy_callback
and commit_exception method, that goes to persistent_commit_exception.
persistent_commit_exception doesn't do locking, relying on the fact that
callbacks are single-threaded, but it can race with snapshot invalidation and
overwrite the header that is just being written while the snapshot is being
invalidated.

The result of this race is a corrupted header being written that can
lead to a crash on further reactivation (if chunk_size is zero in the
corrupted header).

The fix is to use separate memory areas for each.

See the bug: https://bugzilla.redhat.com/show_bug.cgi?id=461506

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodm snapshot: refactor zero_disk_area to use chunk_io
Mikulas Patocka [Fri, 4 Sep 2009 19:40:37 +0000 (20:40 +0100)]
dm snapshot: refactor zero_disk_area to use chunk_io

commit 02d2fd31defce6ff77146ad0fef4f19006055d86 upstream.

Refactor chunk_io to prepare for the fix in the following patch.

Pass an area pointer to chunk_io and simplify zero_disk_area to use
chunk_io.  No functional change.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodm raid1: do not allow log_failure variable to unset after being set
Jonathan Brassow [Fri, 4 Sep 2009 19:40:32 +0000 (20:40 +0100)]
dm raid1: do not allow log_failure variable to unset after being set

commit d2b698644c97cb033261536a4f2010924a00eac9 upstream.

This patch fixes a bug which was triggering a case where the primary leg
could not be changed on failure even when the mirror was in-sync.

The case involves the failure of the primary device along with
the transient failure of the log device.  The problem is that
bios can be put on the 'failures' list (due to log failure)
before 'fail_mirror' is called due to the primary device failure.
Normally, this is fine, but if the log device failure is transient,
a subsequent iteration of the work thread, 'do_mirror', will
reset 'log_failure'.  The 'do_failures' function then resets
the 'in_sync' variable when processing bios on the failures list.
The 'in_sync' variable is what is used to determine if the
primary device can be switched in the event of a failure.  Since
this has been reset, the primary device is incorrectly assumed
to be not switchable.

The case has been seen in the cluster mirror context, where one
machine realizes the log device is dead before the other machines.
As the responsibilities of the server migrate from one node to
another (because the mirror is being reconfigured due to the failure),
the new server may think for a moment that the log device is fine -
thus resetting the 'log_failure' variable.

In any case, it is inappropiate for us to reset the 'log_failure'
variable.  The above bug simply illustrates that it can actually
hurt us.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosound: oxygen: fix MCLK rate for 192 kHz playback
Clemens Ladisch [Tue, 1 Sep 2009 06:23:58 +0000 (08:23 +0200)]
sound: oxygen: fix MCLK rate for 192 kHz playback

commit b91ab72b830e1494c2c7f8de05ccb2ab2c9cfb26 upstream.

Do not forget to program the MCLK ratio for the I2S output.
Otherwise, the master clock frequency can be too high for
the DACs at sample frequencies above 96 kHz.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosound: oxygen: handle cards with missing EEPROM
Clemens Ladisch [Wed, 2 Sep 2009 16:25:39 +0000 (18:25 +0200)]
sound: oxygen: handle cards with missing EEPROM

commit 92653453c3015c083b9fe0ad48261c6b2267d482 upstream.

The card model detection code introduced in 2.6.30 that tries to work
around partially broken EEPROM contents by reading the EEPROM directly
does not handle cards where the EEPROM has been omitted.  In this case,
we have to use the default ID to allow the driver to load.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Reported-and-tested-by: Ozan Çağlayan <ozan@pardus.org.tr>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoSCSI: sd: fix bug in SCSI async probing
James Bottomley [Tue, 26 May 2009 20:35:48 +0000 (20:35 +0000)]
SCSI: sd: fix bug in SCSI async probing

commit 601e7638254c118fca135af9b1a9f35061420f62 upstream.

The async split up of probing in sd.c created a potential failure case where
something goes wrong with device_add(), but which we don't recover properly.
Since, in general, asynchronous error handling is hard, move the device_add()
into the asynchronous path (it should be fast) and make sure all the deferred
processing cannot fail.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoPCI SR-IOV: correct broken resource alignment calculations
Chris Wright [Fri, 28 Aug 2009 20:00:06 +0000 (13:00 -0700)]
PCI SR-IOV: correct broken resource alignment calculations

commit 6faf17f6f1ffc586d16efc2f9fa2083a7785ee74 upstream.

An SR-IOV capable device includes an SR-IOV PCIe capability which
describes the Virtual Function (VF) BAR requirements.  A typical SR-IOV
device can support multiple VFs whose BARs must be in a contiguous region,
effectively an array of VF BARs.  The BAR reports the size requirement
for a single VF.  We calculate the full range needed by simply multiplying
the VF BAR size with the number of possible VFs and create a resource
spanning the full range.

This all seems sane enough except it artificially inflates the alignment
requirement for the VF BAR.  The VF BAR need only be aligned to the size
of a single BAR not the contiguous range of VF BARs.  This can cause us
to fail to allocate resources for the BAR despite the fact that we
actually have enough space.

This patch adds a thin PCI specific layer over the generic
resource_alignment() function which is aware of the special nature of
VF BARs and does sorting and allocation based on the smaller alignment
requirement.

I recognize that while resource_alignment is generic, it's basically a
PCI helper.  An alternative to this patch is to add PCI VF BAR specific
information to struct resource.  I opted for the extra layer rather than
adding such PCI specific information to struct resource.  This does
have the slight downside that we don't cache the BAR size and re-read
for each alignment query (happens a small handful of times during boot
for each VF BAR).

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key
Ryusuke Konishi [Sat, 29 Aug 2009 19:21:41 +0000 (04:21 +0900)]
nilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key

commit b1f1b8ce0a1d71cbc72f7540134d52b79bd8f5ac upstream.

This will fix the following preempt count underflow reported from
users with the title "[NILFS users] segctord problem" (Message-ID:
<949415.6494.qm@web58808.mail.re1.yahoo.com> and Message-ID:
<debc30fc0908270825v747c1734xa59126623cfd5b05@mail.gmail.com>):

 WARNING: at kernel/sched.c:4890 sub_preempt_count+0x95/0xa0()
 Hardware name: HP Compaq 6530b (KR980UT#ABC)
 Modules linked in: bridge stp llc bnep rfcomm l2cap xfs exportfs nilfs2 cowloop loop vboxnetadp vboxnetflt vboxdrv btusb bluetooth uvcvideo videodev v4l1_compat v4l2_compat_ioctl32 arc4 snd_hda_codec_analog ecb iwlagn iwlcore rfkill lib80211 mac80211 snd_hda_intel snd_hda_codec ehci_hcd uhci_hcd usbcore snd_hwdep snd_pcm tg3 cfg80211 psmouse snd_timer joydev libphy ohci1394 snd_page_alloc hp_accel lis3lv02d ieee1394 led_class i915 drm i2c_algo_bit video backlight output i2c_core dm_crypt dm_mod
 Pid: 4197, comm: segctord Not tainted 2.6.30-gentoo-r4-64 #7
 Call Trace:
  [<ffffffff8023fa05>] ? sub_preempt_count+0x95/0xa0
  [<ffffffff802470f8>] warn_slowpath_common+0x78/0xd0
  [<ffffffff8024715f>] warn_slowpath_null+0xf/0x20
  [<ffffffff8023fa05>] sub_preempt_count+0x95/0xa0
  [<ffffffffa04ce4db>] nilfs_btnode_prepare_change_key+0x11b/0x190 [nilfs2]
  [<ffffffffa04d01ad>] nilfs_btree_assign_p+0x19d/0x1e0 [nilfs2]
  [<ffffffffa04d10ad>] nilfs_btree_assign+0xbd/0x130 [nilfs2]
  [<ffffffffa04cead7>] nilfs_bmap_assign+0x47/0x70 [nilfs2]
  [<ffffffffa04d9bc6>] nilfs_segctor_do_construct+0x956/0x20f0 [nilfs2]
  [<ffffffff805ac8e2>] ? _spin_unlock_irqrestore+0x12/0x40
  [<ffffffff803c06e0>] ? __up_write+0xe0/0x150
  [<ffffffff80262959>] ? up_write+0x9/0x10
  [<ffffffffa04ce9f3>] ? nilfs_bmap_test_and_clear_dirty+0x43/0x60 [nilfs2]
  [<ffffffffa04cd627>] ? nilfs_mdt_fetch_dirty+0x27/0x60 [nilfs2]
  [<ffffffffa04db5fc>] nilfs_segctor_construct+0x8c/0xd0 [nilfs2]
  [<ffffffffa04dc3dc>] nilfs_segctor_thread+0x15c/0x3a0 [nilfs2]
  [<ffffffffa04dbe20>] ? nilfs_construction_timeout+0x0/0x10 [nilfs2]
  [<ffffffff80252633>] ? add_timer+0x13/0x20
  [<ffffffff802370da>] ? __wake_up_common+0x5a/0x90
  [<ffffffff8025e960>] ? autoremove_wake_function+0x0/0x40
  [<ffffffffa04dc280>] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2]
  [<ffffffffa04dc280>] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2]
  [<ffffffff8025e556>] kthread+0x56/0x90
  [<ffffffff8020cdea>] child_rip+0xa/0x20
  [<ffffffff8025e500>] ? kthread+0x0/0x90
  [<ffffffff8020cde0>] ? child_rip+0x0/0x20

This problem was caused due to a missing radix_tree_preload() call in
the retry path of nilfs_btnode_prepare_change_key() function.

Reported-by: Eric A <eric225125@yahoo.com>
Reported-by: Jerome Poulin <jeromepoulin@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Jerome Poulin <jeromepoulin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoslub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU
Eric Dumazet [Thu, 3 Sep 2009 19:38:59 +0000 (22:38 +0300)]
slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU

commit d76b1590e06a63a3d8697168cd0aabf1c4b3cb3a upstream.

kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close() and
*before* sysfs_slab_remove() or risk rcu_free_slab() being called after
kmem_cache is deleted (kfreed).

rmmod nf_conntrack can crash the machine because it has to kmem_cache_destroy()
a SLAB_DESTROY_BY_RCU enabled cache.

Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoJFFS2: add missing verify buffer allocation/deallocation
Massimo Cirillo [Thu, 27 Aug 2009 08:44:09 +0000 (10:44 +0200)]
JFFS2: add missing verify buffer allocation/deallocation

commit bc8cec0dff072f1a45ce7f6b2c5234bb3411ac51 upstream.

The function jffs2_nor_wbuf_flash_setup() doesn't allocate the verify buffer
if CONFIG_JFFS2_FS_WBUF_VERIFY is defined, so causing a kernel panic when
that macro is enabled and the verify function is called. Similarly the
jffs2_nor_wbuf_flash_cleanup() must free the buffer if
CONFIG_JFFS2_FS_WBUF_VERIFY is enabled.
The following patch fixes the problem.
The following patch applies to 2.6.30 kernel.

Signed-off-by: Massimo Cirillo <maxcir@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosparc: sys32.S incorrect compat-layer splice() system call
Mathieu Desnoyers [Wed, 19 Aug 2009 03:16:55 +0000 (20:16 -0700)]
sparc: sys32.S incorrect compat-layer splice() system call

[ Upstream commit e2c6cbd9ace61039d3de39e717195e38f1492aee ]

I think arch/sparc/kernel/sys32.S has an incorrect splice definition:

SIGN2(sys32_splice, sys_splice, %o0, %o1)

The splice() prototype looks like :

       long splice(int fd_in, loff_t *off_in, int fd_out,
                   loff_t *off_out, size_t len, unsigned int flags);

So I think we should have :

SIGN2(sys32_splice, sys_splice, %o0, %o2)

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosparc64: Fix bootup with mcount in some configs.
David S. Miller [Fri, 4 Sep 2009 10:38:54 +0000 (03:38 -0700)]
sparc64: Fix bootup with mcount in some configs.

[ Upstream commit bd4352cadfacb9084c97c853b025fac010266c26 ]

Functions invoked early when booting up a cpu can't use
tracing because mcount requires a valid 'current_thread_info()'
and TLB mappings to be setup.

The code path of sun4v_register_mondo_queues --> register_one_mondo
is one such case.  sun4v_register_mondo_queues already has the
necessary 'notrace' annotation, but register_one_mondo does not.

Normally register_one_mondo is inlined so the bug doesn't trigger,
but with some config/compiler combinations, it won't be so we
must properly mark it notrace.

While we're here, add 'notrace' annoations to prom_printf and
prom_halt so that early error handling won't have the same problem.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Reported-by: Leif Sawyer <lsawyer@gci.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosparc64: Validate linear D-TLB misses.
David S. Miller [Tue, 25 Aug 2009 23:47:46 +0000 (16:47 -0700)]
sparc64: Validate linear D-TLB misses.

[ Upstream commit d8ed1d43e17898761c7221014a15a4c7501d2ff3 ]

When page alloc debugging is not enabled, we essentially accept any
virtual address for linear kernel TLB misses.  But with kgdb, kernel
address probing, and other facilities we can try to access arbitrary
crap.

So, make sure the address we miss on will translate to physical memory
that actually exists.

In order to make this work we have to embed the valid address bitmap
into the kernel image.  And in order to make that less expensive we
make an adjustment, in that the max physical memory address is
decreased to "1 << 41", even on the chips that support a 42-bit
physical address space.  We can do this because bit 41 indicates
"I/O space" and thus covers non-memory ranges.

The result of this is that:

1) kpte_linear_bitmap shrinks from 2K to 1K in size

2) we need 64K more for the valid address bitmap

We can't let the valid address bitmap be dynamically allocated
once we start using it to validate TLB misses, otherwise we have
crazy issues to deal with wrt. recursive TLB misses and such.

If we're in a TLB miss it could be the deepest trap level that's legal
inside of the cpu.  So if we TLB miss referencing the bitmap, the cpu
will be out of trap levels and enter RED state.

To guard against out-of-range accesses to the bitmap, we have to check
to make sure no bits in the physical address above bit 40 are set.  We
could export and use last_valid_pfn for this check, but that's just an
unnecessary extra memory reference.

On the plus side of all this, since we load all of these translations
into the special 4MB mapping TSB, and we check the TSB first for TLB
misses, there should be absolutely no real cost for these new checks
in the TLB miss path.

Reported-by: heyongli@gmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.
David S. Miller [Thu, 3 Sep 2009 09:35:20 +0000 (02:35 -0700)]
sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.

[ Upstream commit e6617c6ec28a17cf2f90262b835ec05b9b861400 ]

This is a compromise and a temporary workaround for bootup NMI
watchdog triggers some people see with qla2xxx devices present.

This happens when, for example:

CPU 0 is in the driver init and looping submitting mailbox commands to
load the firmware, then waiting for completion.

CPU 1 is receiving the device interrupts.  CPU 1 is where the NMI
watchdog triggers.

CPU 0 is submitting mailbox commands fast enough that by the time CPU
1 returns from the device interrupt handler, a new one is pending.
This sequence runs for more than 5 seconds.

The problematic case is CPU 1's timer interrupt running when the
barrage of device interrupts begin.  Then we have:

timer interrupt
return for softirq checking
pending, thus enable interrupts

 qla2xxx interrupt
 return
 qla2xxx interrupt
 return
 ... 5+ seconds pass
 final qla2xxx interrupt for fw load
 return

run timer softirq
return

At some point in the multi-second qla2xxx interrupt storm we trigger
the NMI watchdog on CPU 1 from the NMI interrupt handler.

The timer softirq, once we get back to running it, is smart enough to
run the timer work enough times to make up for the missed timer
interrupts.

However, the NMI watchdogs (both x86 and sparc) use the timer
interrupt count to notice the cpu is wedged.  But in the above
scenerio we'll receive only one such timer interrupt even if we last
all the way back to running the timer softirq.

The default watchdog trigger point is only 5 seconds, which is pretty
low (the softwatchdog triggers at 60 seconds).  So increase it to 30
seconds for now.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonet: net_assign_generic() fix
Eric Dumazet [Tue, 28 Jul 2009 02:36:15 +0000 (02:36 +0000)]
net: net_assign_generic() fix

[ Upstream commit 144586301f6af5ae5943a002f030d8c626fa4fdd ]

memcpy() should take into account size of pointers,
not only number of pointers to copy.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agopppol2tp: calls unregister_pernet_gen_device() at unload time
Eric Dumazet [Tue, 28 Jul 2009 03:47:39 +0000 (03:47 +0000)]
pppol2tp: calls unregister_pernet_gen_device() at unload time

[ Upstream commit 446e72f30eca76d6f9a1a54adf84d2c6ba2831f8 ]

Failure to call unregister_pernet_gen_device() can exhaust memory
if module is loaded/unloaded many times.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoppp: fix lost fragments in ppp_mp_explode() (resubmit)
Ben McKeegan [Tue, 28 Jul 2009 07:43:57 +0000 (07:43 +0000)]
ppp: fix lost fragments in ppp_mp_explode() (resubmit)

[ Upstream commit a53a8b56827cc429c6d9f861ad558beeb5f6103f ]

This patch fixes the corner cases where the sum of MTU of the free
channels (adjusted for fragmentation overheads) is less than the MTU
of PPP link.  There are at least 3 situations where this case might
arise:

- some of the channels are busy

- the multilink session is running in a degraded state (i.e. with less
than its full complement of active channels)

- by design, where multilink protocol is being used to artificially
increase the effective link MTU of a single link.

Without this patch, at most 1 fragment is ever sent per free channel
for a given PPP frame and any remaining part of the PPP frame that
does not fit into those fragments is silently discarded.

This patch restores the original behaviour which was broken by commit
9c705260feea6ae329bc6b6d5f6d2ef0227eda0a 'ppp:ppp_mp_explode()
redesign'.  Once all 'free' channels have been given a fragment, an
additional fragment is queued to each available channel in turn, as many
times as necessary, until the entire PPP frame has been consumed.

Signed-off-by: Ben McKeegan <ben@netservers.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agogre: Fix MTU calculation for bound GRE tunnels
Tom Goff [Fri, 14 Aug 2009 23:33:56 +0000 (16:33 -0700)]
gre: Fix MTU calculation for bound GRE tunnels

[ Upstream commit 8cdb045632e5ee22854538619ac6f150eb0a4894 ]

The GRE header length should be subtracted when the tunnel MTU is
calculated.  This just corrects for the associativity change
introduced by commit 42aa916265d740d66ac1f17290366e9494c884c2
("gre: Move MTU setting out of ipgre_tunnel_bind_dev").

Signed-off-by: Tom Goff <thomas.goff@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoE100: fix interaction with swiotlb on X86.
Krzysztof Hałasa [Mon, 24 Aug 2009 02:02:13 +0000 (19:02 -0700)]
E100: fix interaction with swiotlb on X86.

[ Upstream commit 6ff9c2e7fa8ca63a575792534b63c5092099c286 ]

E100 places it's RX packet descriptors inside skb->data and uses them
with bidirectional streaming DMA mapping. Data in descriptors is
accessed simultaneously by the chip (writing status and size when
a packet is received) and CPU (reading to check if the packet was
received). This isn't a valid usage of PCI DMA API, which requires use
of the coherent (consistent) memory for such purpose. Unfortunately e100
chips working in "simplified" RX mode have to store received data
directly after the descriptor. Fixing the driver to conform to the API
would require using unsupported "flexible" RX mode or receiving data
into a coherent memory and using CPU to copy it to network buffers.

This patch, while not yet making the driver conform to the PCI DMA API,
allows it to work correctly on X86 with swiotlb (while not breaking
other architectures).

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodccp: missing destroy of percpu counter variable while unload module
Wei Yongjun [Tue, 4 Aug 2009 21:44:39 +0000 (21:44 +0000)]
dccp: missing destroy of percpu counter variable while unload module

[ Upstream commit 476181cb05c6a3aea3ef42309388e255c934a06f ]

percpu counter dccp_orphan_count is init in dccp_init() by
percpu_counter_init() while dccp module is loaded, but the
destroy of it is missing while dccp module is unloaded. We
can get the kernel WARNING about this. Reproduct by the
following commands:

  $ modprobe dccp
  $ rmmod dccp
  $ modprobe dccp

WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
Hardware name: VMware Virtual Platform
list_add corruption. next->prev should be prev (c080c0c4), but was (null). (next
=ca7188cc).
Modules linked in: dccp(+) nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc
Pid: 1956, comm: modprobe Not tainted 2.6.31-rc5 #55
Call Trace:
 [<c042f8fa>] warn_slowpath_common+0x6a/0x81
 [<c053a6cb>] ? __list_add+0x27/0x5c
 [<c042f94f>] warn_slowpath_fmt+0x29/0x2c
 [<c053a6cb>] __list_add+0x27/0x5c
 [<c053c9b3>] __percpu_counter_init+0x4d/0x5d
 [<ca9c90c7>] dccp_init+0x19/0x2ed [dccp]
 [<c0401141>] do_one_initcall+0x4f/0x111
 [<ca9c90ae>] ? dccp_init+0x0/0x2ed [dccp]
 [<c06971b5>] ? notifier_call_chain+0x26/0x48
 [<c0444943>] ? __blocking_notifier_call_chain+0x45/0x51
 [<c04516f7>] sys_init_module+0xac/0x1bd
 [<c04028e4>] sysenter_do_call+0x12/0x22

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoLinux 2.6.30.6 v2.6.30.6
Greg Kroah-Hartman [Wed, 9 Sep 2009 03:39:34 +0000 (20:39 -0700)]
Linux 2.6.30.6

14 years agoocfs2: ocfs2_write_begin_nolock() should handle len=0
Sunil Mushran [Fri, 4 Sep 2009 18:12:01 +0000 (11:12 -0700)]
ocfs2: ocfs2_write_begin_nolock() should handle len=0

commit 8379e7c46cc48f51197dd663fc6676f47f2a1e71 upstream.

Bug introduced by mainline commit e7432675f8ca868a4af365759a8d4c3779a3d922
The bug causes ocfs2_write_begin_nolock() to oops when len=0.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoNET: llc, zero sockaddr_llc struct
Jiri Slaby [Mon, 24 Aug 2009 05:55:51 +0000 (22:55 -0700)]
NET: llc, zero sockaddr_llc struct

commit 28e9fc592cb8c7a43e4d3147b38be6032a0e81bc upstream.

sllc_arphrd member of sockaddr_llc might not be changed. Zero sllc
before copying to the above layer's structure.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agorose: Fix rose_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 03:34:06 +0000 (03:34 +0000)]
rose: Fix rose_getname() leak

commit 17ac2e9c58b69a1e25460a568eae1b0dc0188c25 upstream.

rose_getname() can leak kernel memory to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoeconet: Fix econet_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 03:48:36 +0000 (03:48 +0000)]
econet: Fix econet_getname() leak

commit 80922bbb12a105f858a8f0abb879cb4302d0ecaa upstream.

econet_getname() can leak kernel memory to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonetrom: Fix nr_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 03:31:07 +0000 (03:31 +0000)]
netrom: Fix nr_getname() leak

commit f6b97b29513950bfbf621a83d85b6f86b39ec8db upstream.

nr_getname() can leak kernel memory to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoappletalk: fix atalk_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 02:27:43 +0000 (02:27 +0000)]
appletalk: fix atalk_getname() leak

commit 3d392475c873c10c10d6d96b94d092a34ebd4791 upstream.

atalk_getname() can leak 8 bytes of kernel memory to user

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoirda: Fix irda_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 03:55:04 +0000 (03:55 +0000)]
irda: Fix irda_getname() leak

commit 09384dfc76e526c3993c09c42e016372dc9dd22c upstream.

irda_getname() can leak kernel memory to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agocan: Fix raw_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 20:27:04 +0000 (20:27 +0000)]
can: Fix raw_getname() leak

commit e84b90ae5eb3c112d1f208964df1d8156a538289 upstream.

raw_getname() can leak 10 bytes of kernel memory to user

(two bytes hole between can_family and can_ifindex,
8 bytes at the end of sockaddr_can structure)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoxenfb: connect to backend before registering fb
Michal Schmidt [Thu, 27 Aug 2009 19:22:43 +0000 (12:22 -0700)]
xenfb: connect to backend before registering fb

commit 0a80fb10239b04c45e5e80aad8d4b2ca5ac407b2 upstream.

As soon as the framebuffer is registered, our methods may be called by the
kernel. This leads to a crash as xenfb_refresh() gets called before we have
the irq.

Connect to the backend before registering our framebuffer with the kernel.

[ Fixes bug http://bugzilla.kernel.org/show_bug.cgi?id=14059 ]

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoar9170: fix read & write outside array bounds
Dan Carpenter [Sun, 9 Aug 2009 12:24:09 +0000 (14:24 +0200)]
ar9170: fix read & write outside array bounds

commit e9d126cdfa60b575f1b5b02024c4faee27dccf07 upstream.

Backport done by Christian Lamparter <chunkeey@googlemail.com>

queue == __AR9170_NUM_TXQ would cause a bug on the next line.

found by Smatch ( http://repo.or.cz/w/smatch.git ).

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoipv4: make ip_append_data() handle NULL routing table
Julien TINNES [Thu, 27 Aug 2009 13:26:58 +0000 (15:26 +0200)]
ipv4: make ip_append_data() handle NULL routing table

commit 788d908f2879a17e5f80924f3da2e23f1034482d upstream.

Add a check in ip_append_data() for NULL *rtp to prevent future bugs in
callers from being exploitable.

Signed-off-by: Julien Tinnes <julien@cr0.org>
Signed-off-by: Tavis Ormandy <taviso@sdf.lonestar.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agopowerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registration
Geert Uytterhoeven [Sun, 23 Aug 2009 22:54:32 +0000 (22:54 +0000)]
powerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registration

commit 7b6a09f3d6aedeaac923824af2a5df30300b56e9 upstream.

On non-PS3, we get:

| kernel BUG at drivers/rtc/rtc-ps3.c:36!

because the rtc-ps3 platform device is registered unconditionally in a kernel
with builtin support for PS3.

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: EHCI: fix two new bugs related to Clear-TT-Buffer
Alan Stern [Fri, 31 Jul 2009 14:40:22 +0000 (10:40 -0400)]
USB: EHCI: fix two new bugs related to Clear-TT-Buffer

commit 7a0f0d951273eee889c2441846842348ebc00a2a upstream.

This patch (as1273) fixes two(!) bugs introduced by the new
Clear-TT-Buffer implementation in ehci-hcd.

It is now possible for an idle QH to have some URBs on its
queue -- this will happen if a Clear-TT-Buffer is pending for
the QH's endpoint.  Consequently we should not issue a warning
when someone tries to unlink an URB from an idle QH; instead
we should process the request immediately.

The refcounts for QHs could get messed up, because
submit_async() would increment the refcount when calling
qh_link_async() and qh_link_async() would then refuse to link
the QH into the schedule if a Clear-TT-Buffer was pending.
Instead we should increment the refcount only when the QH
actually is added to the schedule.  The current code tries to
be clever by leaving the refcount alone if an unlink is
immediately followed by a relink; the patch changes this to an
unconditional decrement and increment (although they occur in
the opposite order).

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: David Brownell <david-b@pacbell.net>
Tested-by: Manuel Lauss <manuel.lauss@gmail.com>
Tested-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: EHCI: use the new clear_tt_buffer interface
Alan Stern [Mon, 29 Jun 2009 14:47:30 +0000 (10:47 -0400)]
USB: EHCI: use the new clear_tt_buffer interface

commit 914b701280a76f96890ad63eb0fa99bf204b961c upstream.

This patch (as1256) changes ehci-hcd and all the other drivers in the
EHCI family to make use of the new clear_tt_buffer callbacks.  When a
Clear-TT-Buffer request is in progress for a QH, the QH is not allowed
to be linked into the async schedule until the request is finished.
At that time, if there are any URBs queued for the QH, it is linked
into the async schedule.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: fix the clear_tt_buffer interface
Alan Stern [Mon, 29 Jun 2009 14:43:32 +0000 (10:43 -0400)]
USB: fix the clear_tt_buffer interface

commit cb88a1b887bb8908f6e00ce29e893ea52b074940 upstream.

This patch (as1255) updates the interface for calling
usb_hub_clear_tt_buffer().  Even the name of the function is changed!

When an async URB (i.e., Control or Bulk) going through a high-speed
hub to a non-high-speed device is cancelled or fails, the hub's
Transaction Translator buffer may be left busy still trying to
complete the transaction.  The buffer has to be cleared; that's what
usb_hub_clear_tt_buffer() does.

It isn't safe to send any more URBs to the same endpoint until the TT
buffer is fully clear.  Therefore the HCD needs to be told when the
Clear-TT-Buffer request has finished.  This patch adds a callback
method to struct hc_driver for that purpose, and makes the hub driver
invoke the callback at the proper time.

The patch also changes a couple of names; "hub_tt_kevent" and
"tt.kevent" now look rather antiquated.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoipv6: Fix commit 63d9950b08184e6531adceb65f64b429909cc101 (ipv6: Make v4-mapped bindi...
Bruno Prémont [Mon, 24 Aug 2009 02:06:28 +0000 (19:06 -0700)]
ipv6: Fix commit 63d9950b08184e6531adceb65f64b429909cc101 (ipv6: Make v4-mapped bindings consistent with IPv4)

commit ca6982b858e1d08010c1d29d8e8255b2ac2ad70a upstream.

Commit 63d9950b08184e6531adceb65f64b429909cc101
  (ipv6: Make v4-mapped bindings consistent with IPv4)
changes behavior of inet6_bind() for v4-mapped addresses so it should
behave the same way as inet_bind().

During this change setting of err to -EADDRNOTAVAIL got lost:

af_inet.c:469 inet_bind()
err = -EADDRNOTAVAIL;
if (!sysctl_ip_nonlocal_bind &&
    !(inet->freebind || inet->transparent) &&
    addr->sin_addr.s_addr != htonl(INADDR_ANY) &&
    chk_addr_ret != RTN_LOCAL &&
    chk_addr_ret != RTN_MULTICAST &&
    chk_addr_ret != RTN_BROADCAST)
goto out;

af_inet6.c:463 inet6_bind()
if (addr_type == IPV6_ADDR_MAPPED) {
int chk_addr_ret;

/* Binding to v4-mapped address on a v6-only socket
 * makes no sense
 */
if (np->ipv6only) {
err = -EINVAL;
goto out;
}

/* Reproduce AF_INET checks to make the bindings consitant */
v4addr = addr->sin6_addr.s6_addr32[3];
chk_addr_ret = inet_addr_type(net, v4addr);
if (!sysctl_ip_nonlocal_bind &&
    !(inet->freebind || inet->transparent) &&
    v4addr != htonl(INADDR_ANY) &&
    chk_addr_ret != RTN_LOCAL &&
    chk_addr_ret != RTN_MULTICAST &&
    chk_addr_ret != RTN_BROADCAST)
goto out;
} else {

Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agokthreads: fix kthread_create() vs kthread_stop() race
Oleg Nesterov [Mon, 24 Aug 2009 10:45:29 +0000 (12:45 +0200)]
kthreads: fix kthread_create() vs kthread_stop() race

The bug should be "accidently" fixed by recent changes in 2.6.31,
all kernels <= 2.6.30 need the fix. The problem was never noticed before,
it was found because it causes mysterious failures with GFS mount/umount.

Credits to Robert Peterson. He blaimed kthread.c from the very beginning.
But, despite my promise, I forgot to inspect the old implementation until
he did a lot of testing and reminded me. This led to huge delay in fixing
this bug.

kthread_stop() does put_task_struct(k) before it clears kthread_stop_info.k.
This means another kthread_create() can re-use this task_struct, but the
new kthread can still see kthread_should_stop() == T and exit even without
calling threadfn().

Reported-by: Robert Peterson <rpeterso@redhat.com>
Tested-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>