Jean Delvare [Sun, 21 Oct 2012 22:53:21 +0000 (09:53 +1100)]
i2c-i801: Simplify dependency towards GPIOLIB
Arbitrarily selecting GPIOLIB causes trouble on some architectures,
so don't do that. Instead, just make the optional multiplexing code
depend on CONFIG_I2C_MUX_GPIO instead of CONFIG_I2C_MUX for now. We
can revisit if the i2c-i801 driver ever supports other multiplexing
flavors.
Also make that optional code depend on DMI, as it won't do anything
without that.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Fengguang Wu <fengguang.wu@intel.com>
Jean Delvare [Sun, 21 Oct 2012 10:05:51 +0000 (12:05 +0200)]
Documentation: Reflect the new location of the NMI watchdog info
Commit 9919cba7 ("watchdog: Update documentation") moved the
NMI watchdog documentation from nmi_watchdog.txt to
lockup-watchdogs.txt. Update the index file accordingly.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/20121021120551.4656d99b@endymion.delvare Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Sun, 21 Oct 2012 16:16:39 +0000 (18:16 +0200)]
Merge tag 'numascale_mce_fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/urgent
Pull a fix from Borislav Petkov:
"Fix for a bug causing an OOPS on confederate systems from Numascale
where northbridge descriptors are not unique, causing a lookup failure
of the northbridge descriptor in the amd_nb.c code.
Rik van Riel [Thu, 18 Oct 2012 21:20:21 +0000 (17:20 -0400)]
numa, mm: Rename the PROT_NONE fault handling functions to *_numa()
Having the function name indicate what the function is used
for makes the code a little easier to read. Furthermore,
the fault handling code largely consists of do_...._page
functions.
Rename the NUMA working set sampling fault handling functions
to _numa() names, to indicate what they are used for.
This separates the naming from the regular PROT_NONE namings.
Andrea Arcangeli [Tue, 18 Sep 2012 00:14:51 +0000 (02:14 +0200)]
x86, mm: Prevent gcc to re-read the pagetables
GCC is very likely to read the pagetables just once and cache them in
the local stack or in a register, but it is can also decide to re-read
the pagetables. The problem is that the pagetable in those places can
change from under gcc.
In the page fault we only hold the ->mmap_sem for reading and both the
page fault and MADV_DONTNEED only take the ->mmap_sem for reading and we
don't hold any PT lock yet.
In get_user_pages_fast() the TLB shootdown code can clear the pagetables
before firing any TLB flush (the page can't be freed until the TLB
flushing IPI has been delivered but the pagetables will be cleared well
before sending any TLB flushing IPI).
With THP/hugetlbfs the pmd (and pud for hugetlbfs giga pages) can
change as well under gup_fast, it won't just be cleared for the same
reasons described above for the pte in the page fault case.
[ This patch was picked up from the AutoNUMA tree. ]
Originally-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com>
[ Ported to this tree, because we are modifying the page tables
at a high rate here, so this problem is potentially more
likely to show up in practice. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mel Gorman [Wed, 12 Oct 2011 19:06:51 +0000 (21:06 +0200)]
mm: Check if PTE is already allocated during page fault
With transparent hugepage support, handle_mm_fault() has to be careful
that a normal PMD has been established before handling a PTE fault. To
achieve this, it used __pte_alloc() directly instead of pte_alloc_map
as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
is called once it is known the PMD is safe.
pte_alloc_map() is smart enough to check if a PTE is already present
before calling __pte_alloc but this check was lost. As a consequence,
PTEs may be allocated unnecessarily and the page table lock taken.
Thi useless PTE does get cleaned up but it's a performance hit which
is visible in page_test from aim9.
This patch simply re-adds the check normally done by pte_alloc_map to
check if the PTE needs to be allocated before taking the page table
lock. The effect is noticable in page_test from aim9.
(While this affects 2.6.38, it is a performance rather than a
functional bug and normally outside the rules -stable. While the big
performance differences are to a microbench, the difference in fork
and exec performance may be significant enough that -stable wants to
consider the patch)
Reported-by: Raz Ben Yehuda <raziebe@gmail.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rik van Riel <riel@redhat.com>
[ Picked this up from the AutoNUMA tree to help
it upstream and to allow apples-to-apples
performance comparisons. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
numa, mm: Fix NUMA hinting page faults from gup/gup_fast
Introduce FOLL_NUMA to tell follow_page to check
pte/pmd_numa. get_user_pages must use FOLL_NUMA, and it's safe to do
so because it always invokes handle_mm_fault and retries the
follow_page later.
KVM secondary MMU page faults will trigger the NUMA hinting page
faults through gup_fast -> get_user_pages -> follow_page ->
handle_mm_fault.
Other follow_page callers like KSM should not use FOLL_NUMA, or they
would fail to get the pages if they use follow_page instead of
get_user_pages.
[ This patch was picked up from the AutoNUMA tree. ]
Originally-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com>
[ ported to this tree. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Sun, 14 Oct 2012 14:59:13 +0000 (16:59 +0200)]
sched, numa, mm: Implement constant, per task Working Set Sampling (WSS) rate
Previously, to probe the working set of a task, we'd use
a very simple and crude method: mark all of its address
space PROT_NONE.
That method has various (obvious) disadvantages:
- it samples the working set at dissimilar rates,
giving some tasks a sampling quality advantage
over others.
- creates performance problems for tasks with very
large working sets
- over-samples processes with large address spaces but
which only very rarely execute
Improve that method by keeping a rotating offset into the
address space that marks the current position of the scan,
and advance it by a constant rate (in a CPU cycles execution
proportional manner). If the offset reaches the last mapped
address of the mm then it then it starts over at the first
address.
The per-task nature of the working set sampling functionality
in this tree allows such constant rate, per task,
execution-weight proportional sampling of the working set,
with an adaptive sampling interval/frequency that goes from
once per 100 msecs up to just once per 1.6 seconds.
The current sampling volume is 256 MB per interval.
As tasks mature and converge their working set, so does the
sampling rate slow down to just a trickle, 256 MB per 1.6
seconds of CPU time executed.
This, beyond being adaptive, also rate-limits rarely
executing systems and does not over-sample on overloaded
systems.
[ In AutoNUMA speak, this patch deals with the effective sampling
rate of the 'hinting page fault'. AutoNUMA's scanning is
currently rate-limited, but it is also fundamentally
single-threaded, executing in the knuma_scand kernel thread,
so the limit in AutoNUMA is global and does not scale up with
the number of CPUs, nor does it scan tasks in an execution
proportional manner.
So the idea of rate-limiting the scanning was first implemented
in the AutoNUMA tree via a global rate limit. This patch goes
beyond that by implementing an execution rate proportional
working set sampling rate that is not implemented via a single
global scanning daemon. ]
Based-on-idea-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Link: http://lkml.kernel.org/n/tip-wt5b48o2226ec63784i58s3j@git.kernel.org
[ Wrote changelog and fixed bug. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Julia Lawall [Sun, 21 Oct 2012 10:52:03 +0000 (12:52 +0200)]
ALSA: sound/isa/opti9xx/miro.c: eliminate possible double free
snd_miro_probe is a static function that is only called twice in the file
that defines it. At each call site, its argument is freed using
snd_card_free. Thus, there is no need for snd_miro_probe to call
snd_card_free on its argument on any of its error exit paths.
Because snd_card_free both reads the fields of its argument and kfrees its
argments, the results of the second snd_card_free should be unpredictable.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
Pete Leigh [Sun, 21 Oct 2012 09:29:17 +0000 (10:29 +0100)]
ALSA: usb-audio: USB audio quirk for Roland VG-99 advanced mode
Without this quirk the VG-99 will work in standard mode (set under
USB on System menu page 2) giving 16 bits at 44.1 Khz audio in/out
but no midi, and is not recognised when set to advanced mode.
After applying this, I can also use the VG-99 in advanced mode: 24
24 bits audio in/out at 44.1 Khz, and midi in/out. Sysex is so far
untested.
In standard mode, the device appears with ID 0x00b3, so the
behaviour isn't affected by this quirk.
Thanks to Clemens Ladisch for simplifying and correcting my initial
attempt!
Signed-off-by: Pete Leigh <pete.leigh@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>