]> git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
15 years agosdhci: Add quirk for controllers with no end-of-busy IRQ
Ben Dooks [Fri, 20 Feb 2009 17:33:08 +0000 (20:33 +0300)]
sdhci: Add quirk for controllers with no end-of-busy IRQ

commit f945405cdecd9e0ae3e58ff84cabd19b4522965e upstream.

The Samsung SDHCI (and FSL eSDHC) controller block seems to fail
to generate an INT_DATA_END after the transfer has completed and
the bus busy state finished.

Changes in e809517f6fa5803a5a1cd56026f0e2190fc13d5c to use the
new busy method are the cause of the behaviour change.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agopowerpc: Fix load/store float double alignment handler
Michael Neuling [Thu, 19 Feb 2009 18:52:20 +0000 (18:52 +0000)]
powerpc: Fix load/store float double alignment handler

commit 49f297f8df9adb797334155470ea9ca68bdb041e upstream.

When we introduced VSX, we changed the way FPRs are stored in the
thread_struct.  Unfortunately we missed the load/store float double
alignment handler code when updating how we access FPRs in the
thread_struct.

Below fixes this and merges the little/big endian case.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoproc: fix PG_locked reporting in /proc/kpageflags
Helge Bahmann [Fri, 20 Feb 2009 13:24:12 +0000 (16:24 +0300)]
proc: fix PG_locked reporting in /proc/kpageflags

commit e07a4b9217d1e97d2f3a62b6b070efdc61212110 upstream.

Expr always evaluates to zero.

Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agocopy_process: fix CLONE_PARENT && parent_exec_id interaction
Oleg Nesterov [Mon, 2 Mar 2009 21:58:45 +0000 (22:58 +0100)]
copy_process: fix CLONE_PARENT && parent_exec_id interaction

commit 2d5516cbb9daf7d0e342a2e3b0fc6f8c39a81205 upstream.

CLONE_PARENT can fool the ->self_exec_id/parent_exec_id logic. If we
re-use the old parent, we must also re-use ->parent_exec_id to make
sure exit_notify() sees the right ->xxx_exec_id's when the CLONE_PARENT'ed
task exits.

Also, move down the "p->parent_exec_id = p->self_exec_id" thing, to place
two different cases together.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoPCI: Add PCI quirk to disable L0s ASPM state for 82575 and 82598
Alexander Duyck [Thu, 5 Mar 2009 18:57:28 +0000 (13:57 -0500)]
PCI: Add PCI quirk to disable L0s ASPM state for 82575 and 82598

commit 649426efcfbc67a8b033497151816cbac9fd0cfa upstream.

This patch is intended to disable L0s ASPM link state for 82598 (ixgbe)
parts due to the fact that it is possible to corrupt TX data when coming
back out of L0s on some systems.  The workaround had been added for 82575
(igb) previously, but did not use the ASPM api.  This quirk uses the ASPM
api to prevent the ASPM subsystem from re-enabling the L0s state.

Instead of adding the fix in igb to the ixgbe driver as well it was
decided to move it into a pci quirk.  It is necessary to move the fix out
of the driver and into a pci quirk in order to prevent the issue from
occuring prior to driver load to handle the possibility of the device being
passed to a VM via direct assignment.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agofore200: fix oops on failed firmware load
Meelis Roos [Wed, 11 Feb 2009 01:19:19 +0000 (17:19 -0800)]
fore200: fix oops on failed firmware load

commit fcffd0d8bbddac757cd856e635ac75e8eb4518bc upstream.

Fore 200 ATM driver fails to handle request_firmware failures and oopses
when no firmware file was found. Fix it by checking for the right return
values and propaganting the return value up.

Signed-off-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agozaurus: add usb id for motomagx phones
Dmitriy Taychenachev [Tue, 24 Feb 2009 18:42:48 +0000 (18:42 +0000)]
zaurus: add usb id for motomagx phones

commit 52c0326beaa3cb0049d0f1c51c6ad5d4a04e4430 upstream.

The Motorola MOTOMAGX phones (Z6, E8, Zn5 so far) are providing
combined ACM/BLAN USB configuration. Since it has Vendor Specific
class, the corresponding drivers (cdc-acm, zaurus) can't find it just
by interface info. This patch adds usb id so the zaurus driver can
properly handle this combined device.

Signed-off-by: Dmitriy Taychenachev <dimichxp@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agocdc_ether: add usb id for Ericsson F3507g
Bjørn Mork [Wed, 25 Feb 2009 04:33:58 +0000 (04:33 +0000)]
cdc_ether: add usb id for Ericsson F3507g

commit cac477e8f1038c41b6f29d3161ce351462ef3df7 upstream.

The Ericsson F3507g wireless broadband module provides a CDC Ethernet
compliant interface, but identifies it as a "Mobile Direct Line" CDC
subclass, thereby preventing the CDC Ethernet class driver from picking
it up.  This patch adds the device id to cdc_ether.c as a workaround.

Ericsson has provided a "class" driver for this device:
http://kerneltrap.org/mailarchive/linux-net/2008/10/28/3832094
But closer inspection of that driver reveals that it adds little more
than duplication of code from cdc_ether.c.  See also
http://marc.info/?l=linux-usb&m=123334979706403&w=2

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoasix: new device ids
Greg Kroah-Hartman [Wed, 25 Feb 2009 07:52:24 +0000 (23:52 -0800)]
asix: new device ids

commit fef7cc0893146550b286b13c0e6e914556142730 upstream.

This patch adds two new device ids to the asix driver.

One comes directly from the asix driver on their web site, the other was
reported by Armani Liao as needed for the MSI X320 to get the driver to
work properly for it.

Reported-by: Armani Liao <aliao@novell.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoxen/blkfront: use blk_rq_map_sg to generate ring entries
Jens Axboe [Tue, 24 Feb 2009 07:10:09 +0000 (08:10 +0100)]
xen/blkfront: use blk_rq_map_sg to generate ring entries

commit 9e973e64ac6dc504e6447d52193d4fff1a670156 upstream.

On occasion, the request will apparently have more segments than we
fit into the ring. Jens says:

> The second problem is that the block layer then appears to create one
> too many segments, but from the dump it has rq->nr_phys_segments ==
> BLKIF_MAX_SEGMENTS_PER_REQUEST. I suspect the latter is due to
> xen-blkfront not handling the merging on its own. It should check that
> the new page doesn't form part of the previous page. The
> rq_for_each_segment() iterates all single bits in the request, not dma
> segments. The "easiest" way to do this is to call blk_rq_map_sg() and
> then iterate the mapped sg list. That will give you what you are
> looking for.

> Here's a test patch, compiles but otherwise untested. I spent more
> time figuring out how to enable XEN than to code it up, so YMMV!
> Probably the sg list wants to be put inside the ring and only
> initialized on allocation, then you can get rid of the sg on stack and
> sg_init_table() loop call in the function. I'll leave that, and the
> testing, to you.

[Moved sg array into info structure, and initialize once. -J]

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Sven Köhler <sven.koehler@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoxen: disable interrupts early, as start_kernel expects
Jeremy Fitzhardinge [Wed, 25 Feb 2009 17:42:25 +0000 (09:42 -0800)]
xen: disable interrupts early, as start_kernel expects

commit 55d8085671863fe4ee6a17b7814bd38180a44e1d upstream.

This avoids a lockdep warning from:
if (DEBUG_LOCKS_WARN_ON(unlikely(!early_boot_irqs_enabled)))
return;
in trace_hardirqs_on_caller();

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86-64: syscall-audit: fix 32/64 syscall hole
Roland McGrath [Sat, 28 Feb 2009 03:03:24 +0000 (19:03 -0800)]
x86-64: syscall-audit: fix 32/64 syscall hole

commit ccbe495caa5e604b04d5a31d7459a6f6a76a756c upstream.

On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
ljmp, and then use the "syscall" instruction to make a 64-bit system
call.  A 64-bit process make a 32-bit system call with int $0x80.

In both these cases, audit_syscall_entry() will use the wrong system
call number table and the wrong system call argument registers.  This
could be used to circumvent a syscall audit configuration that filters
based on the syscall numbers or argument details.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86-64: seccomp: fix 32/64 syscall hole
Roland McGrath [Sat, 28 Feb 2009 07:25:54 +0000 (23:25 -0800)]
x86-64: seccomp: fix 32/64 syscall hole

commit 5b1017404aea6d2e552e991b3fd814d839e9cd67 upstream.

On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
ljmp, and then use the "syscall" instruction to make a 64-bit system
call.  A 64-bit process make a 32-bit system call with int $0x80.

In both these cases under CONFIG_SECCOMP=y, secure_computing() will use
the wrong system call number table.  The fix is simple: test TS_COMPAT
instead of TIF_IA32.  Here is an example exploit:

/* test case for seccomp circumvention on x86-64

   There are two failure modes: compile with -m64 or compile with -m32.

   The -m64 case is the worst one, because it does "chmod 777 ." (could
   be any chmod call).  The -m32 case demonstrates it was able to do
   stat(), which can glean information but not harm anything directly.

   A buggy kernel will let the test do something, print, and exit 1; a
   fixed kernel will make it exit with SIGKILL before it does anything.
*/

#define _GNU_SOURCE
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <linux/prctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <asm/unistd.h>

int
main (int argc, char **argv)
{
  char buf[100];
  static const char dot[] = ".";
  long ret;
  unsigned st[24];

  if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0)
    perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?");

#ifdef __x86_64__
  assert ((uintptr_t) dot < (1UL << 32));
  asm ("int $0x80 # %0 <- %1(%2 %3)"
       : "=a" (ret) : "0" (15), "b" (dot), "c" (0777));
  ret = snprintf (buf, sizeof buf,
  "result %ld (check mode on .!)\n", ret);
#elif defined __i386__
  asm (".code32\n"
       "pushl %%cs\n"
       "pushl $2f\n"
       "ljmpl $0x33, $1f\n"
       ".code64\n"
       "1: syscall # %0 <- %1(%2 %3)\n"
       "lretl\n"
       ".code32\n"
       "2:"
       : "=a" (ret) : "0" (4), "D" (dot), "S" (&st));
  if (ret == 0)
    ret = snprintf (buf, sizeof buf,
    "stat . -> st_uid=%u\n", st[7]);
  else
    ret = snprintf (buf, sizeof buf, "result %ld\n", ret);
#else
# error "not this one"
#endif

  write (1, buf, ret);

  syscall (__NR_exit, 1);
  return 2;
}

Signed-off-by: Roland McGrath <roland@redhat.com>
[ I don't know if anybody actually uses seccomp, but it's enabled in
  at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agohpilo: new pci device
David Altobelli [Fri, 27 Feb 2009 22:03:09 +0000 (14:03 -0800)]
hpilo: new pci device

commit 31d8b5631f095cb7100cfccc95c801a2547ffe2b upstream.

Future iLO devices will have an HP vendor id.

Signed-off-by: David Altobelli <david.altobelli@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoselinux: Fix the NetLabel glue code for setsockopt()
Paul Moore [Fri, 20 Feb 2009 21:33:02 +0000 (16:33 -0500)]
selinux: Fix the NetLabel glue code for setsockopt()

commit 09c50b4a52c01a1f450b8eec819089e228655bfb upstream.

At some point we (okay, I) managed to break the ability for users to use the
setsockopt() syscall to set IPv4 options when NetLabel was not active on the
socket in question.  The problem was noticed by someone trying to use the
"-R" (record route) option of ping:

 # ping -R 10.0.0.1
 ping: record route: No message of desired type

The solution is relatively simple, we catch the unlabeled socket case and
clear the error code, allowing the operation to succeed.  Please note that we
still deny users the ability to override IPv4 options on socket's which have
NetLabel labeling active; this is done to ensure the labeling remains intact.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoselinux: Fix a panic in selinux_netlbl_inode_permission()
Paul Moore [Fri, 27 Feb 2009 20:00:03 +0000 (15:00 -0500)]
selinux: Fix a panic in selinux_netlbl_inode_permission()

commit d7f59dc4642ce2fc7b79fcd4ec02ffce7f21eb02 upstream.

Rick McNeal from LSI identified a panic in selinux_netlbl_inode_permission()
caused by a certain sequence of SUNRPC operations.  The problem appears to be
due to the lack of NULL pointer checking in the function; this patch adds the
pointer checks so the function will exit safely in the cases where the socket
is not completely initialized.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86-64: fix int $0x80 -ENOSYS return
Roland McGrath [Sat, 7 Feb 2009 02:15:18 +0000 (18:15 -0800)]
x86-64: fix int $0x80 -ENOSYS return

commit c09249f8d1b84344eca882547afdbffee8c09d14 upstream.

One of my past fixes to this code introduced a different new bug.
When using 32-bit "int $0x80" entry for a bogus syscall number,
the return value is not correctly set to -ENOSYS.  This only happens
when neither syscall-audit nor syscall tracing is enabled (i.e., never
seen if auditd ever started).  Test program:

/* gcc -o int80-badsys -m32 -g int80-badsys.c
   Run on x86-64 kernel.
   Note to reproduce the bug you need auditd never to have started.  */

#include <errno.h>
#include <stdio.h>

int
main (void)
{
  long res;
  asm ("int $0x80" : "=a" (res) : "0" (99999));
  printf ("bad syscall returns %ld\n", res);
  return res != -ENOSYS;
}

The fix makes the int $0x80 path match the sysenter and syscall paths.

Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86: tone down mtrr_trim_uncached_memory() warning
Ingo Molnar [Thu, 29 Jan 2009 10:45:35 +0000 (11:45 +0100)]
x86: tone down mtrr_trim_uncached_memory() warning

commit bf3647c44bc76c43c4b2ebb4c37a559e899ac70e upstream.

kerneloops.org is reporting a lot of these warnings that come due to
vmware not setting up any MTRRs for emulated CPUs:

| Reported 709 times (14696 total reports)
| BIOS bug (often in VMWare) where the MTRR's are set up incorrectly
| or not at all
|
| This warning was last seen in version 2.6.29-rc2-git1, and first
| seen in 2.6.24.
|
| More info:
|   http://www.kerneloops.org/searchweek.php?search=mtrr_trim_uncached_memory

Keep a one-liner KERN_INFO about it - so that we have so notice if empty
MTRRs are caused by native hardware/BIOS weirdness.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agommc_test: fix basic read test
Rabin Vincent [Fri, 13 Feb 2009 17:25:26 +0000 (22:55 +0530)]
mmc_test: fix basic read test

commit 58a5dd3e0e77029d3db1f8fa75d0b54b38169d5d upstream.

Due to a typo in the Basic Read test, it's currently identical to the
Basic Write test.  Fix this.

Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoMMC: fix bug - SDHC card capacity not correct
Yi Li [Thu, 5 Feb 2009 07:31:57 +0000 (15:31 +0800)]
MMC: fix bug - SDHC card capacity not correct

commit 444122fd58fdc83c96877a92b3f6288cafddb08d upstream.

Signed-off-by: Yi Li <yi.li@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agommc: s3cmci: fix s3c2410_dma_config() arguments.
Ben Dooks [Thu, 12 Mar 2009 21:31:33 +0000 (14:31 -0700)]
mmc: s3cmci: fix s3c2410_dma_config() arguments.

commit 7c48ed3383bfb2106694807361ec187fe8a4333d upstream.

The s3cmci driver is calling s3c2410_dma_config with incorrect data for
the DCON register.  The S3C2410_DCON_HWTRIG is implicit in the channel
configuration and the device selection of S3C2410_DCON_CH0_SDI is
incorrect as the DMA system may not select channel 0.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Acked-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agos3cmci: Fix hangup in do_pio_write()
Yauhen Kharuzhy [Wed, 11 Feb 2009 21:25:52 +0000 (13:25 -0800)]
s3cmci: Fix hangup in do_pio_write()

commit 994244883739e4044bef76d4e5d7a9b66dc6c7b6 upstream.

This commit fixes the regression what was added by commit
088a78af978d0c8e339071a9b2bca1f4cb368f30 "s3cmci: Support transfers
which are not multiple of 32 bits."

fifo_free() now returns amount of available space in FIFO buffer in
bytes.  But do_pio_write() writes to FIFO 32-bit words.  Condition for
return from cycle is (fifo_free() == 0), but when fifo has 1..3 bytes
of free space then this condition will never be true and system hangs.

This patch changes condition in the while() to (fifo_free() > 3).

Signed-off-by: Yauhen Kharuzhy <jekhor@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
15 years agommc: fix data timeout for SEND_EXT_CSD
Adrian Hunter [Tue, 10 Feb 2009 14:32:33 +0000 (16:32 +0200)]
mmc: fix data timeout for SEND_EXT_CSD

commit cda56ac29f2d8288d62978272856884d26e0b47b upstream.

Commit 0d3e0460f307e84904968aad6cff97bd688583d8
"MMC: CSD and CID timeout values" inadvertently broke
the timeout for the MMC command SEND_EXT_CSD.

This patch puts it back again.

Depending on the characteristics of the controller,
this bug may prevent the use of MMC cards.

Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agolibata: make sure port is thawed when skipping resets
Tejun Heo [Wed, 4 Mar 2009 06:59:30 +0000 (15:59 +0900)]
libata: make sure port is thawed when skipping resets

commit d6515e6ff4ad3db4bd5ef2dd4e1026a7aca2482e upstream.

When SCR access is available and the link is offline, softreset is
skipped as it only wastes time and some controllers don't respond very
well.  However, the skip path forgot to thaw the port, which not only
blocks further event notification from the port but also causes
repeated EH invocations on the same event on drivers which rely on
->thaw() to clear events if the IRQ is shared with another device or
port.

This problem has always been there but is uncovered by recent sata_nv
nf2/3 change which dropped hardreset support while maintaining SCR
access.  nf2/3 doesn't clear hotplug event mask from the interrupt
handler but relies on ->thaw() to clear them.  When the hardreset was
there, the reset action was never skipped and the port was always
thawed but, with the hardreset gone, ->prereset() determines that
there's no need for softreset and both ->softreset() and ->thaw() are
skipped.  This leads to stuck hotplug event in the IRQ status register
triggering hotplug event whenever IRQ is delieverd on the same IRQ.
As the controller shares the same IRQ for both ports, this happens on
every IO if one port is occpupied and the other isn't.

This patch fixes the problem by making sure that the port is thawed on
reset-skip path.

bko#11615 reports this problem.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Robert Hancock <hancockrwd@gmail.com>
Reported-by: Dan Andresan <danyer@gmail.com>
Reported-by: Arne Woerner <arne_woerner@yahoo.com>
Reported-by: Stefan Lippers-Hollmann <s.L-H@gmx.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agolibata: Don't trust current capacity values in identify words 57-58
Robert Hancock [Tue, 17 Feb 2009 02:15:08 +0000 (20:15 -0600)]
libata: Don't trust current capacity values in identify words 57-58

commit 968e594afdbc40b4270f9d4032ae8350475749d6 upstream.

Hanno Böck reported a problem where an old Conner CP30254 240MB hard drive
was reported as 1.1TB in capacity by libata:

http://lkml.org/lkml/2009/2/13/134

This was caused by libata trusting the drive's reported current capacity in
sectors in identify words 57 and 58 if the drive does not support LBA and the
current CHS translation values appear valid. Unfortunately it seems older
ATA specs were vague about what this field should contain and a number of drives
used values with wrong byte order or that were totally bogus. There's no
unique information that it conveys and so we can just calculate the number
of sectors from the reported current CHS values.

While we're at it, clean up this function to use named constants for the
identify word values.

Signed-off-by: Robert Hancock <hancockrwd@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agojsm: additional device support
Adam Lackorzynski [Wed, 18 Feb 2009 22:48:34 +0000 (14:48 -0800)]
jsm: additional device support

commit ffa7525c13eb3db0fd19a3e1cffe2ce6f561f5f3 upstream.

I have a Digi Neo 8 PCI card (114f:00b1) Serial controller: Digi
International Digi Neo 8 (rev 05)

that works with the jsm driver after using the following patch.

Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Cc: Scott H Kilau <Scott_Kilau@digi.com>
Cc: Wendy Xiong <wendyx@us.ibm.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoPCI: Enable PCIe AER only after checking firmware support
Andrew Patterson [Fri, 20 Feb 2009 23:04:59 +0000 (16:04 -0700)]
PCI: Enable PCIe AER only after checking firmware support

commit 1f9f13c8d59c1d8da1a602b71d1ab96d1d37d69e upstream.

The PCIe port driver currently sets the PCIe AER error reporting bits for
any root or switch port without first checking to see if firmware will grant
control. This patch moves setting these bits to the AER service driver
aer_enable_port routine.  The bits are then set for the root port and any
downstream switch ports after the check for firmware support (aer_osc_setup)
is made. The patch also unsets the bits in a similar fashion when the AER
service driver is unloaded.

Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoPCIe: portdrv: call pci_disable_device during remove
Alex Chiang [Sun, 8 Mar 2009 02:35:47 +0000 (19:35 -0700)]
PCIe: portdrv: call pci_disable_device during remove

commit d89987193631bf23d1735c55d13a06d4b8d0e9bd upstream.

The PCIe port driver calls pci_enable_device() during probe but
never calls pci_disable_device() during remove.

Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agofs: new inode i_state corruption fix
Nick Piggin [Thu, 12 Mar 2009 21:31:38 +0000 (14:31 -0700)]
fs: new inode i_state corruption fix

commit 7ef0d7377cb287e08f3ae94cebc919448e1f5dff upstream.

There was a report of a data corruption
http://lkml.org/lkml/2008/11/14/121.  There is a script included to
reproduce the problem.

During testing, I encountered a number of strange things with ext3, so I
tried ext2 to attempt to reduce complexity of the problem.  I found that
fsstress would quickly hang in wait_on_inode, waiting for I_LOCK to be
cleared, even though instrumentation showed that unlock_new_inode had
already been called for that inode.  This points to memory scribble, or
synchronisation problme.

i_state of I_NEW inodes is not protected by inode_lock because other
processes are not supposed to touch them until I_LOCK (and I_NEW) is
cleared.  Adding WARN_ON(inode->i_state & I_NEW) to sites where we modify
i_state revealed that generic_sync_sb_inodes is picking up new inodes from
the inode lists and passing them to __writeback_single_inode without
waiting for I_NEW.  Subsequently modifying i_state causes corruption.  In
my case it would look like this:

CPU0                            CPU1
unlock_new_inode()              __sync_single_inode()
 reg <- inode->i_state
 reg -> reg & ~(I_LOCK|I_NEW)   reg <- inode->i_state
 reg -> inode->i_state          reg -> reg | I_SYNC
                                reg -> inode->i_state

Non-atomic RMW on CPU1 overwrites CPU0 store and sets I_LOCK|I_NEW again.

Fix for this is rather than wait for I_NEW inodes, just skip over them:
inodes concurrently being created are not subject to data integrity
operations, and should not significantly contribute to dirty memory
either.

After this change, I'm unable to reproduce any of the added warnings or
hangs after ~1hour of running.  Previously, the new warnings would start
immediately and hang would happen in under 5 minutes.

I'm also testing on ext3 now, and so far no problems there either.  I
don't know whether this fixes the problem reported above, but it fixes a
real problem for me.

Cc: "Jorge Boncompte [DTI2]" <jorge@dti2.net>
Reported-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoproc: fix kflags to uflags copying in /proc/kpageflags
Wu Fengguang [Wed, 11 Mar 2009 01:00:04 +0000 (09:00 +0800)]
proc: fix kflags to uflags copying in /proc/kpageflags

commit ad3bdefe877afb47480418fdb05ecd42842de65e upstream.

Fix kpf_copy_bit(src,dst) to be kpf_copy_bit(dst,src) to match the
actual call patterns, e.g. kpf_copy_bit(kflags, KPF_LOCKED, PG_locked).

This misplacement of src/dst only affected reporting of PG_writeback,
PG_reclaim and PG_buddy. For others kflags==uflags so not affected.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomtd_dataflash: fix probing of AT45DB321C chips.
Will Newton [Tue, 10 Mar 2009 19:55:53 +0000 (12:55 -0700)]
mtd_dataflash: fix probing of AT45DB321C chips.

commit 229cc58ba2b5a83b0b55764c6cb98695c106238a upstream.

Commit 771999b65f79264acde4b855e5d35696eca5e80c ("[MTD] DataFlash: bugfix,
binary page sizes now handled") broke support for probing AT45DB321C flash
chips.  These chips do not support the "page size" status bit, so if we
match the JEDEC id return early.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Will Newton <will.newton@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agointel-agp: fix a panic with 1M of shared memory, no GTT entries
Lubomir Rintel [Tue, 10 Mar 2009 19:55:54 +0000 (12:55 -0700)]
intel-agp: fix a panic with 1M of shared memory, no GTT entries

commit 9c1e8a4ebcc04226cb6f3a1bf1d72f4cafd6b089 upstream.

When GTT size is equal to amount of video memory, the amount of GTT
entries is computed lower than zero, which is invalid and leads to
off-by-one error in intel_i915_configure()

Originally posted here:
http://bugzilla.kernel.org/show_bug.cgi?id=12539
http://bugzilla.redhat.com/show_bug.cgi?id=445592

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: Lubomir Rintel <lkundrak@v3.sk>
Cc: Dave Airlie <airlied@linux.ie>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86: add Dell XPS710 reboot quirk
Leann Ogasawara [Wed, 4 Mar 2009 19:53:00 +0000 (11:53 -0800)]
x86: add Dell XPS710 reboot quirk

commit dd4124a8a06bca89c077a16437edac010f0bb993 upstream.

Dell XPS710 will hang on reboot.  This is resolved by adding a quirk to
set bios reboot.

Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Cc: "manoj.iyer" <manoj.iyer@canonical.com>
LKML-Reference: <1236196380.3231.89.camel@emiko>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86: oprofile: don't set counter width from cpuid on Core2
Tim Blechmann [Thu, 19 Feb 2009 16:34:03 +0000 (17:34 +0100)]
x86: oprofile: don't set counter width from cpuid on Core2

commit 780eef9492b16a1543a3b2ae9f9526a735fc9856 upstream.

Impact: fix stuck NMIs and non-working oprofile on certain CPUs

Resetting the counter width of the performance counters on Intel's
Core2 CPUs, breaks the delivery of NMIs, when running in x86_64 mode.

This should fix bug #12395:

  http://bugzilla.kernel.org/show_bug.cgi?id=12395

Signed-off-by: Tim Blechmann <tim@klingt.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <20090303100412.GC10085@erda.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosdhci: fix led naming
Helmut Schaa [Sat, 14 Feb 2009 15:22:39 +0000 (16:22 +0100)]
sdhci: fix led naming

commit 5dbace0c9ba110c1a3810a89fa6bf12b7574b5a3 upstream.

Fix the led device naming for the sdhci driver.

The led class documentation defines the led name to have the
form "devicename:colour:function" while not applicable sections
should be left blank.

To comply with the documentation the led device name is changed
from "mmc*" to "mmc*::".

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoinotify: fix GFP_KERNEL related deadlock
Ingo Molnar [Wed, 18 Feb 2009 22:48:43 +0000 (14:48 -0800)]
inotify: fix GFP_KERNEL related deadlock

commit f04b30de3c82528f1ab4c58b3dd4c975f5341901 upstream.

Enhanced lockdep coverage of __GFP_NOFS turned up this new lockdep
assert:

[ 1093.677775]
[ 1093.677781] =================================
[ 1093.680031] [ INFO: inconsistent lock state ]
[ 1093.680031] 2.6.29-rc5-tip-01504-gb49eca1-dirty #1
[ 1093.680031] ---------------------------------
[ 1093.680031] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 1093.680031] kswapd0/308 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 1093.680031]  (&inode->inotify_mutex){+.+.?.}, at: [<c0205942>] inotify_inode_is_dead+0x20/0x80
[ 1093.680031] {RECLAIM_FS-ON-W} state was registered at:
[ 1093.680031]   [<c01696b9>] mark_held_locks+0x43/0x5b
[ 1093.680031]   [<c016baa4>] lockdep_trace_alloc+0x6c/0x6e
[ 1093.680031]   [<c01cf8b0>] kmem_cache_alloc+0x20/0x150
[ 1093.680031]   [<c040d0ec>] idr_pre_get+0x27/0x6c
[ 1093.680031]   [<c02056e3>] inotify_handle_get_wd+0x25/0xad
[ 1093.680031]   [<c0205f43>] inotify_add_watch+0x7a/0x129
[ 1093.680031]   [<c020679e>] sys_inotify_add_watch+0x20f/0x250
[ 1093.680031]   [<c010389e>] sysenter_do_call+0x12/0x35
[ 1093.680031]   [<ffffffff>] 0xffffffff
[ 1093.680031] irq event stamp: 60417
[ 1093.680031] hardirqs last  enabled at (60417): [<c018d5f5>] call_rcu+0x53/0x59
[ 1093.680031] hardirqs last disabled at (60416): [<c018d5b9>] call_rcu+0x17/0x59
[ 1093.680031] softirqs last  enabled at (59656): [<c0146229>] __do_softirq+0x157/0x16b
[ 1093.680031] softirqs last disabled at (59651): [<c0106293>] do_softirq+0x74/0x15d
[ 1093.680031]
[ 1093.680031] other info that might help us debug this:
[ 1093.680031] 2 locks held by kswapd0/308:
[ 1093.680031]  #0:  (shrinker_rwsem){++++..}, at: [<c01b0502>] shrink_slab+0x36/0x189
[ 1093.680031]  #1:  (&type->s_umount_key#4){+++++.}, at: [<c01e6d77>] shrink_dcache_memory+0x110/0x1fb
[ 1093.680031]
[ 1093.680031] stack backtrace:
[ 1093.680031] Pid: 308, comm: kswapd0 Not tainted 2.6.29-rc5-tip-01504-gb49eca1-dirty #1
[ 1093.680031] Call Trace:
[ 1093.680031]  [<c016947a>] valid_state+0x12a/0x13d
[ 1093.680031]  [<c016954e>] mark_lock+0xc1/0x1e9
[ 1093.680031]  [<c016a5b4>] ? check_usage_forwards+0x0/0x3f
[ 1093.680031]  [<c016ab74>] __lock_acquire+0x2c6/0xac8
[ 1093.680031]  [<c01688d9>] ? register_lock_class+0x17/0x228
[ 1093.680031]  [<c016b3d3>] lock_acquire+0x5d/0x7a
[ 1093.680031]  [<c0205942>] ? inotify_inode_is_dead+0x20/0x80
[ 1093.680031]  [<c08824c4>] __mutex_lock_common+0x3a/0x4cb
[ 1093.680031]  [<c0205942>] ? inotify_inode_is_dead+0x20/0x80
[ 1093.680031]  [<c08829ed>] mutex_lock_nested+0x2e/0x36
[ 1093.680031]  [<c0205942>] ? inotify_inode_is_dead+0x20/0x80
[ 1093.680031]  [<c0205942>] inotify_inode_is_dead+0x20/0x80
[ 1093.680031]  [<c01e6672>] dentry_iput+0x90/0xc2
[ 1093.680031]  [<c01e67a3>] d_kill+0x21/0x45
[ 1093.680031]  [<c01e6a46>] __shrink_dcache_sb+0x27f/0x355
[ 1093.680031]  [<c01e6dc5>] shrink_dcache_memory+0x15e/0x1fb
[ 1093.680031]  [<c01b05ed>] shrink_slab+0x121/0x189
[ 1093.680031]  [<c01b0d12>] kswapd+0x39f/0x561
[ 1093.680031]  [<c01ae499>] ? isolate_pages_global+0x0/0x233
[ 1093.680031]  [<c0157eae>] ? autoremove_wake_function+0x0/0x43
[ 1093.680031]  [<c01b0973>] ? kswapd+0x0/0x561
[ 1093.680031]  [<c0157daf>] kthread+0x41/0x82
[ 1093.680031]  [<c0157d6e>] ? kthread+0x0/0x82
[ 1093.680031]  [<c01043ab>] kernel_thread_helper+0x7/0x10

inotify_handle_get_wd() does idr_pre_get() which does a
kmem_cache_alloc() without __GFP_FS - and is hence deadlockable under
extreme MM pressure.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: MinChan Kim <minchan.kim@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoHID: fix bus endianity in file2alias
Jiri Slaby [Tue, 17 Feb 2009 11:38:36 +0000 (12:38 +0100)]
HID: fix bus endianity in file2alias

commit 2b639386a2a26c84c8d26c649cf657ebd43a7bc8 upstream.

Fix endianness of bus member of hid_device_id in modpost.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Reported-by: Nye Liu <nyet@mrv.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86, vmi: TSC going backwards check in vmi clocksource
Alok N Kataria [Wed, 18 Feb 2009 20:33:55 +0000 (12:33 -0800)]
x86, vmi: TSC going backwards check in vmi clocksource

commit 48ffc70b675aa7798a52a2e92e20f6cce9140b3d upstream.

Impact: fix time warps under vmware

Similar to the check for TSC going backwards in the TSC clocksource,
we also need this check for VMI clocksource.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago8250: fix boot hang with serial console when using with Serial Over Lan port
Mauro Carvalho Chehab [Fri, 20 Feb 2009 23:38:52 +0000 (15:38 -0800)]
8250: fix boot hang with serial console when using with Serial Over Lan port

commit b6adea334c6c89d5e6c94f9196bbf3a279cb53bd upstream.

Intel 8257x Ethernet boards have a feature called Serial Over Lan.

This feature works by emulating a serial port, and it is detected by
kernel as a normal 8250 port.  However, this emulation is not perfect, as
also noticed on changeset 7500b1f602aad75901774a67a687ee985d85893f.

Before this patch, the kernel were trying to check if the serial TX is
capable of work using IRQ's.

This were done with a code similar this:

        serial_outp(up, UART_IER, UART_IER_THRI);
        lsr = serial_in(up, UART_LSR);
        iir = serial_in(up, UART_IIR);
        serial_outp(up, UART_IER, 0);

        if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
up->bugs |= UART_BUG_TXEN;

This works fine for other 8250 ports, but, on 8250-emulated SoL port, the
chip is a little lazy to down UART_IIR_NO_INT at UART_IIR register.

Due to that, UART_BUG_TXEN is sometimes enabled.  However, as TX IRQ keeps
working, and the TX polling is now enabled, the driver miss-interprets the
IRQ received later, hanging up the machine until a key is pressed at the
serial console.

This is the 6 version of this patch.  Previous versions were trying to
introduce a large enough delay between serial_outp and serial_in(up,
UART_IIR), but not taking forever.  However, the needed delay couldn't be
safely determined.

At the experimental tests, a delay of 1us solves most of the cases, but
still hangs sometimes.  Increasing the delay to 5us was better, but still
doesn't solve.  A very high delay of 50 ms seemed to work every time.

However, poking around with delays and pray for it to be enough doesn't
seem to be a good approach, even for a quirk.

So, instead of playing with random large arbitrary delays, let's just
disable UART_BUG_TXEN for all SoL ports.

[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoFix fixpoint divide exception in acct_update_integrals
Heiko Carstens [Mon, 9 Mar 2009 12:31:59 +0000 (13:31 +0100)]
Fix fixpoint divide exception in acct_update_integrals

commit 6d5b5acca9e566515ef3f1ed617e7295c4f94345 upstream.

Frans Pop reported the crash below when running an s390 kernel under Hercules:

  Kernel BUG at 000738b4  verbose debug info unavailable!
  fixpoint divide exception: 0009  #1! SMP
  Modules linked in: nfs lockd nfs_acl sunrpc ctcm fsm tape_34xx
     cu3088 tape ccwgroup tape_class ext3 jbd mbcache dm_mirror dm_log dm_snapshot
     dm_mod dasd_eckd_mod dasd_mod
  CPU: 0 Not tainted 2.6.27.19 #13
  Process awk (pid: 2069, task: 0f9ed9b8, ksp: 0f4f7d18)
  Krnl PSW : 070c1000 800738b4 (acct_update_integrals+0x4c/0x118)
             R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0
  Krnl GPRS: 00000000 000007d0 7fffffff fffff830
             00000000 ffffffff 00000002 0f9ed9b8
             00000000 00008ca0 00000000 0f9ed9b8
             0f9edda4 8007386e 0f4f7ec8 0f4f7e98
  Krnl Code: 800738aaa71807d0         lhi     %r1,2000
             800738ae8c200001         srdl    %r2,1
             800738b2: 1d21             dr      %r2,%r1
            >800738b45810d10e         l       %r1,270(%r13)
             800738b8: 1823             lr      %r2,%r3
             800738ba4130f060         la      %r3,96(%r15)
             800738be: 0de1             basr    %r14,%r1
             800738c05800f060         l       %r0,96(%r15)
  Call Trace:
  ( <000000000004fdea>! blocking_notifier_call_chain+0x1e/0x2c)
    <0000000000038502>! do_exit+0x106/0x7c0
    <0000000000038c36>! do_group_exit+0x7a/0xb4
    <0000000000038c8e>! SyS_exit_group+0x1e/0x30
    <0000000000021c28>! sysc_do_restart+0x12/0x16
    <0000000077e7e924>! 0x77e7e924

Reason for this is that cpu time accounting usually only happens from
interrupt context, but acct_update_integrals gets also called from
process context with interrupts enabled.

So in acct_update_integrals we may end up with the following scenario:

Between reading tsk->stime/tsk->utime and tsk->acct_timexpd an interrupt
happens which updates accouting values.  This causes acct_timexpd to be
greater than the former stime + utime.  The subsequent calculation of

dtime = cputime_sub(time, tsk->acct_timexpd);

will be negative and the division performed by

cputime_to_jiffies(dtime)

will generate an exception since the result won't fit into a 32 bit
register.

In order to fix this just always disable interrupts while accessing any
of the accounting values.

Reported by: Frans Pop <elendil@planet.nl>
Tested by: Frans Pop <elendil@planet.nl>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agovmalloc: call flush_cache_vunmap() from unmap_kernel_range()
Tejun Heo [Fri, 20 Feb 2009 23:38:48 +0000 (15:38 -0800)]
vmalloc: call flush_cache_vunmap() from unmap_kernel_range()

commit f6fcba7014f9cc535fa75ef98c008b24e49e2212 upstream.

Impact: proper vcache flush on unmap_kernel_range()

flush_cache_vunmap() should be called before pages are unmapped.  Add
a call to it in unmap_kernel_range().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Nick Piggin <npiggin@suse.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoacer-wmi: fix regression in backlight detection
Michael Spang [Thu, 12 Mar 2009 21:31:34 +0000 (14:31 -0700)]
acer-wmi: fix regression in backlight detection

commit 1ba869ec581fd9078b684c56c399ffe3d2345e27 upstream.

Currently we disable the Acer WMI backlight device if there is no ACPI
backlight device.  As a result, we end up with no backlight device at all.
 We should instead disable it if there is an ACPI device, as the other
laptop drivers do.  This regression was introduced in febf2d9 ("Acer-WMI:
fingers off backlight if video.ko is serving this functionality").

Each laptop driver with backlight support got a similar change around
febf2d9.  The changes to the other drivers look correct; see e.g.
a598c82f for a similar but correct change.  The regression is also in
2.6.28.

Signed-off-by: Michael Spang <mspang@csclub.uwaterloo.ca>
Acked-by: Thomas Renninger <trenn@suse.de>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carlos Corbacho <carlos@strangeworlds.co.uk>
Cc: Len Brown <len.brown@intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: aw2: do not grab every saa7146 based device
Anssi Hannula [Sun, 22 Feb 2009 12:42:54 +0000 (14:42 +0200)]
ALSA: aw2: do not grab every saa7146 based device

commit e8bf069c419c1dc0657e02636441fe1179a9db14 upstream.

Audiowerk2 driver snd-aw2 is bound to any saa7146 device as it does not
check subsystem ids. Many DVB devices are saa7146 based, so aw2 driver
grabs them as well.

According to http://lkml.org/lkml/2008/10/15/311 aw2 devices have the
subsystem ids set to 0, the saa7146 default.

Fix conflicts with DVB devices by checking for subsystem ids = 0
specifically.

Signed-off-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - add another MacBook Pro 3,1 SSID
Luke Yelavich [Mon, 23 Feb 2009 02:00:33 +0000 (13:00 +1100)]
ALSA: hda - add another MacBook Pro 3,1 SSID

commit 2d4663816064fabb68935f920bbd7ccdc7f9392d upstream.

Reference: Ubuntu bug #33245
    https://bugs.launchpad.net/bugs/332456

Signed-off-by: Luke Yelavich <themuso@ubuntu.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - Fix digital mic on dell-m4-1 and dell-m4-3
Takashi Iwai [Fri, 27 Feb 2009 16:36:33 +0000 (17:36 +0100)]
ALSA: hda - Fix digital mic on dell-m4-1 and dell-m4-3

commit ea18aa464452c3e6550320d247c0306aaa2d156f upstream.

Fix num_dmuxes initialization for dell-m4-1 and dell-m4-3 models
of IDT 92HD71bxx codec, which was wrongly set to zero.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: fix excessive background noise introduced by OSS emulation rate shrink
Steve Chen [Sat, 21 Feb 2009 14:05:04 +0000 (08:05 -0600)]
ALSA: fix excessive background noise introduced by OSS emulation rate shrink

commit 5370d96f85962769ea3df3a81cc885f257c51589 upstream.

Incorrect variable was used to get the next sample which caused S2
to be stuck with the same value resulting in loud background noise.

Signed-off-by: Steve Chen <schen@mvista.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosound: usb-audio: fix uninitialized variable with M-Audio MIDI interfaces
Clemens Ladisch [Mon, 16 Feb 2009 14:22:39 +0000 (15:22 +0100)]
sound: usb-audio: fix uninitialized variable with M-Audio MIDI interfaces

commit e156ac4c571e3be741bc411e58820b74a9295c72 upstream.

Fix the snd_usbmidi_create_endpoints_midiman() function, which forgot to
set the out_interval member of the endpoint info structure for Midiman/
M-Audio devices.  Since kernel 2.6.24, any non-zero value makes the
driver use interrupt transfers instead of bulk transfers.  With EHCI
controllers, these random interval values result in unbearably large
latencies for output MIDI transfers.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Reported-by: David <devurandom@foobox.com>
Tested-by: David <devurandom@foobox.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: usb-audio - Workaround for misdetected sample rate with CM6207
Joris van Rantwijk [Mon, 16 Feb 2009 21:58:23 +0000 (22:58 +0100)]
ALSA: usb-audio - Workaround for misdetected sample rate with CM6207

commit 3b03cc5b86e2052295b9b484f37226ee15c87924 upstream.

The CM6207 incorrectly advertises its 96 kHz playback setting as 48 kHz
in its USB device descriptor. This patch extends an existing workaround
in usbaudio.c to also cover the CM6207.

This resolves issue 0004249 in the ALSA bug tracker.

Signed-off-by: Joris van Rantwijk <jorispubl@xs4all.nl>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: usb-audio - Fix non-continuous rate detection
Takashi Iwai [Mon, 16 Feb 2009 21:48:12 +0000 (22:48 +0100)]
ALSA: usb-audio - Fix non-continuous rate detection

commit 0412558c873f716efe902b397af0653a550f7341 upstream.

The detection of non-continuous rates (given via rate tables) isn't
processed properly (e.g. for type II).

This patch fixes and simplifies the detection code.

Tested-by: Joris van Rantwijk <jorispubl@xs4all.nl>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosound: virtuoso: revert "do not overwrite EEPROM on Xonar D2/D2X"
Clemens Ladisch [Tue, 17 Feb 2009 08:50:30 +0000 (09:50 +0100)]
sound: virtuoso: revert "do not overwrite EEPROM on Xonar D2/D2X"

commit 6ce6c473a7fd742fdb0db95841e2c4c6b37337c5 upstream.

This reverts commit 7e86c0e6850504ec9516b953f316a47277825e33 ("do not
overwrite EEPROM on Xonar D2/D2X") because it did not actually help with
the problem.

More user reports show that the overwriting of the EEPROM is not
triggered by using this driver but by installing Linux, and that the
installation of any other operating system (even one without any CMI8788
driver) has the same effect.  In other words, the presence of this
driver does not have any effect on the occurrence of the error.  (So
far, the available evidence seems to point to a BIOS bug.)

Furthermore, it turns out that the EEPROM chip is protected against
stray write commands by the command format and by requiring a separate
write-enable command, so the error scenario in the previous commit (that
SPI writes can be misinterpreted as an EEPROM write command) is not even
theoretically possible.

The mixer control that was removed as a consequence of the previous
commit can only be partially emulated in userspace, which also means it
cannot be seen be the in-kernel OSS API emulation, so it is better to
revert that change.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomd/raid10: Don't skip more than 1 bitmap-chunk at a time during recovery.
NeilBrown [Wed, 25 Feb 2009 02:18:47 +0000 (13:18 +1100)]
md/raid10: Don't skip more than 1 bitmap-chunk at a time during recovery.

commit 09b4068a7fe442efc40e9dcbcf5ff37c3338ab15 upstream.

When doing recovery on a raid10 with a write-intent bitmap, we only
need to recovery chunks that are flagged in the bitmap.

However if we choose to skip a chunk as it isn't flag, the code
currently skips the whole raid10-chunk, thus it might not recovery
some blocks that need recovering.

This patch fixes it.

In case that is confusing, it might help to understand that there
is a 'raid10 chunk size' which guides how data is distributed across
the devices, and a 'bitmap chunk size' which says how much data
corresponds to a single bit in the bitmap.

This bug only affects cases where the bitmap chunk size is smaller
than the raid10 chunk size.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomd/raid10: Don't call bitmap_cond_end_sync when we are doing recovery.
NeilBrown [Wed, 25 Feb 2009 02:18:47 +0000 (13:18 +1100)]
md/raid10: Don't call bitmap_cond_end_sync when we are doing recovery.

commit 78200d45cde2a79c0d0ae0407883bb264caa3c18 upstream.

For raid1/4/5/6, resync (fixing inconsistencies between devices) is
very similar to recovery (rebuilding a failed device onto a spare).
The both walk through the device addresses in order.

For raid10 it can be quite different.  resync follows the 'array'
address, and makes sure all copies are the same.  Recover walks
through 'device' addresses and recreates each missing block.

The 'bitmap_cond_end_sync' function allows the write-intent-bitmap
(When present) to be updated to reflect a partially completed resync.
It makes assumptions which mean that it does not work correctly for
raid10 recovery at all.

In particularly, it can cause bitmap-directed recovery of a raid10 to
not recovery some of the blocks that need to be recovered.

So move the call to bitmap_cond_end_sync into the resync path, rather
than being in the common "resync or recovery" path.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomd: avoid races when stopping resync.
NeilBrown [Wed, 25 Feb 2009 02:18:47 +0000 (13:18 +1100)]
md: avoid races when stopping resync.

commit 73d5c38a9536142e062c35997b044e89166e063b upstream.

There has been a race in raid10 and raid1 for a long time
which has only recently started showing up due to a scheduler changed.

When a sync_read request finishes, as soon as reschedule_retry
is called, another thread can mark the resync request as having
completed, so md_do_sync can finish, ->stop can be called, and
->conf can be freed.  So using conf after reschedule_retry is not
safe.

Similarly, when finishing a sync_write, calling md_done_sync must be
the last thing we do, as it allows a chain of events which will free
conf and other data structures.

The first of these requires action in raid10.c
The second requires action in raid1.c and raid10.c

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: EHCI: slow down ITD reuse
Karsten Wiese [Mon, 9 Feb 2009 00:07:58 +0000 (16:07 -0800)]
USB: EHCI: slow down ITD reuse

commit 9aa09d2f8f4bc440d6db1c3414d4009642875240 upstream.

Currently ITDs are immediately recycled whenever their URB completes.
However, EHCI hardware can sometimes remember some ITD state.  This
means that when the ITD is reused before end-of-frame it may sometimes
cause the hardware to reference bogus state.

This patch defers reusing such ITDs by moving them into a new ehci member
cached_itd_list. ITDs resting in cached_itd_list are moved back into their
stream's free_list once scan_periodic() detects that the active frame has
elapsed.

This makes the snd_usb_us122l driver (in kernel since .28) work right
when it's hooked up through EHCI.

[ dbrownell@users.sourceforge.net: comment fixups ]

Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Tested-by: Philippe Carriere <philippe-f.carriere@wanadoo.fr>
Tested-by: Federico Briata <federicobriata@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: option: add BenQ 3g modem information
Jesse Sung [Sat, 21 Feb 2009 05:13:45 +0000 (21:13 -0800)]
USB: option: add BenQ 3g modem information

commit 28fb66821f884870987a0b5ab064ef651d9f7c16 upstream.

This patch addes the BenQ 3g modem support to the option driver.

From: Jesse Sung <jsung@novell.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoRDMA/nes: Don't allow userspace QPs to use STag zero
Faisal Latif [Thu, 12 Mar 2009 21:34:59 +0000 (14:34 -0700)]
RDMA/nes: Don't allow userspace QPs to use STag zero

commit c12e56ef6951f4fce1afe9ef6aab9243ea9a9b04 upstream.

STag zero is a special STag that allows consumers to access any bus
address without registering memory.  The nes driver unfortunately
allows STag zero to be used even with QPs created by unprivileged
userspace consumers, which means that any process with direct verbs
access to the nes device can read and write any memory accessible to
the underlying PCI device (usually any memory in the system).  Such
access is usually given for cluster software such as MPI to use, so
this is a local privilege escalation bug on most systems running this
driver.

The driver was using STag zero to receive the last streaming mode
data; to allow STag zero to be disabled for unprivileged QPs, the
driver now registers a special MR for this data.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoWATCHDOG: rc32434_wdt: fix sections
Phil Sutter [Sun, 8 Feb 2009 15:44:42 +0000 (16:44 +0100)]
WATCHDOG: rc32434_wdt: fix sections

commit d9a8798c4bab5ccd40e45e011f668099cfb3eb83 upstream.

Fix init and exit sections.

Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoWATCHDOG: rc32434_wdt: fix watchdog driver
Phil Sutter [Sun, 8 Feb 2009 15:44:42 +0000 (16:44 +0100)]
WATCHDOG: rc32434_wdt: fix watchdog driver

commit 0af98d37e85e6958eb84987b1f60da3b54008317 upstream.

The existing driver code wasn't working. Neither the timeout was set
correctly, nor system reset was being triggered, as the driver seemed
to keep the WDT alive himself. There was also some unnecessary code.

Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoWATCHDOG: ks8695_wdt.c: 'CLOCK_TICK_RATE' undeclared
Alexey Dobriyan [Thu, 12 Feb 2009 10:42:41 +0000 (13:42 +0300)]
WATCHDOG: ks8695_wdt.c: 'CLOCK_TICK_RATE' undeclared

commit b02c387892fc6b3cc59c78ab2f79413d55f50190 upstream.

On arm-acs5k_tiny:

drivers/watchdog/ks8695_wdt.c:68: error: 'CLOCK_TICK_RATE' undeclared
(first use in this function)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agortl8187: New USB ID's for RTL8187L
Larry Finger [Tue, 17 Feb 2009 20:31:12 +0000 (14:31 -0600)]
rtl8187: New USB ID's for RTL8187L

commit 046ee5d26ac91316a8ac0a29c0b33139dc9da20d upstream.

Add new USB ID codes. These come from two postings on forums and
mailing lists, and four are derived from the .inf that accompanies
the latest Realtek Windows driver for the RTL8187L.

Thanks to Viktor Ilijašić <viktor.ilijasic@gmail.com> and Xose Vazquez
Perez <xose.vazquez@gmail.com> for reporting these new ID's.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: cdc-acm: add usb id for motomagx phones
Dmitriy Taychenachev [Wed, 25 Feb 2009 04:36:51 +0000 (12:36 +0800)]
USB: cdc-acm: add usb id for motomagx phones

commit 155df65ae11dfc322214c6f887185929c809df1b upstream.

The Motorola MOTOMAGX phones (Z6, E8, Zn5 so far) are providing
combined ACM/BLAN USB configuration. Since it has Vendor Specific
class, the corresponding drivers (cdc-acm, zaurus) can't find it just
by interface info. This patch adds usb id so the cdc-acm driver can
properly handle this combined device.

Signed-off-by: Dmitriy Taychenachev <dimichxp@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: usb-storage: add IGNORE_RESIDUE flag for Genesys Logic adapters
Alan Stern [Mon, 23 Feb 2009 17:02:05 +0000 (12:02 -0500)]
USB: usb-storage: add IGNORE_RESIDUE flag for Genesys Logic adapters

commit 5126a2674ddac0804450f59da25a058cca629d38 upstream.

This patch (as1219) adds the IGNORE_RESIDUE flag to the unusual_devs
entries for Genesys Logic's USB-IDE adapter.  Although this device
usually gets the residue correct, there is one command crucial to the
operation of CD and DVD drives which it messes up.

Tested-by: Mike Lampard <mike@mtgambier.net>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: usb_get_string should check the descriptor type
Alan Stern [Fri, 20 Feb 2009 21:33:08 +0000 (16:33 -0500)]
USB: usb_get_string should check the descriptor type

commit 67f5a4ba9741fcef3f4db3509ad03565d9e33af2 upstream.

This patch (as1218) fixes a problem with a radio-control joystick used
in the "walkera 4#3" helicopter.  This device responds to the initial
Get-String-Descriptor request for string 0 (which is really the list
of supported languages) by sending its config descriptor!  The
usb_get_string() routine needs to check whether it got the right
type of descriptor.

Oddly enough, this sort of check is already present in
usb_get_descriptor().  The patch changes the error code from -EPROTO
to -ENODATA, because -EPROTO shows up in so many other contexts to
indicate a hardware failure rather than a firmware error.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Guillermo Jarabo <williamjap@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoSCSI: sd: revive sd_index_lock
Tejun Heo [Sat, 21 Feb 2009 02:04:45 +0000 (11:04 +0900)]
SCSI: sd: revive sd_index_lock

commit 4034cc68157bfa0b6622efe368488d3d3e20f4e6 upstream.

Commit f27bac2761cab5a2e212dea602d22457a9aa6943 which converted sd to
use ida instead of idr incorrectly removed sd_index_lock around id
allocation and free.  idr/ida do have internal locks but they protect
their free object lists not the allocation itself.  The caller is
responsible for that.  This missing synchronization led to the same id
being assigned to multiple devices leading to oops.

Reported and tracked down by Stuart Hayes of Dell.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoJFFS2: fix mount crash caused by removed nodes
Thomas Gleixner [Mon, 16 Feb 2009 20:29:31 +0000 (21:29 +0100)]
JFFS2: fix mount crash caused by removed nodes

commit 4c41bd0ec953954158f92bed5d3062645062b98e upstream.

At scan time we observed following scenario:

   node A inserted
   node B inserted
   node C inserted -> sets overlapped flag on node B

   node A is removed due to CRC failure -> overlapped flag on node B remains

   while (tn->overlapped)
     tn = tn_prev(tn);

   ==> crash, when tn_prev(B) is referenced.

When the ultimate node is removed at scan time and the overlapped flag
is set on the penultimate node, then nothing updates the overlapped
flag of that node. The overlapped iterators blindly expect that the
ultimate node does not have the overlapped flag set, which causes the
scan code to crash.

It would be a huge overhead to go through the node chain on node
removal and fix up the overlapped flags, so detecting such a case on
the fly in the overlapped iterators is a simpler and reliable
solution.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoSCSI: hptiop: Add new PCI device ID
HighPoint Linux Team [Thu, 12 Feb 2009 03:28:31 +0000 (11:28 +0800)]
SCSI: hptiop: Add new PCI device ID

commit b73a77494292b930642fbf87de3e3196593f7593 upstream.

Signed-off-by: HighPoint Linux Team <linux@highpoint-tech.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoPCI quirk: enable MSI on 8132
Yinghai Lu [Wed, 18 Feb 2009 04:40:09 +0000 (20:40 -0800)]
PCI quirk: enable MSI on 8132

commit e0ae4f5503235ba4449ffb5bcb4189edcef4d584 upstream.

David reported that LSI SAS doesn't work with MSI.  It turns out that
his BIOS doesn't enable it, but the HT MSI 8132 does support HT MSI.
Add quirk to enable it

Reported-by: David Lang <david@lang.hm>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: vmap fix overflow
Nick Piggin [Fri, 27 Feb 2009 22:03:03 +0000 (14:03 -0800)]
mm: vmap fix overflow

commit 7766970cc13e9071b356b1f2a48a9eb8675bfcce upstream.

The new vmap allocator can wrap the address and get confused in the case
of large allocations or VMALLOC_END near the end of address space.

Problem reported by Christoph Hellwig on a 32-bit XFS workload.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: fix lazy vmap purging (use-after-free error)
Vegard Nossum [Fri, 27 Feb 2009 22:03:04 +0000 (14:03 -0800)]
mm: fix lazy vmap purging (use-after-free error)

commit cbb766766f3f2f6d9326c561b1020590642c6e39 upstream.

I just got this new warning from kmemcheck:

    WARNING: kmemcheck: Caught 32-bit read from freed memory (c7806a60)
    a06a80c7ecde70c1a04080c700000000a06709c1000000000000000000000000
     f f f f f f f f f f f f f f f f f f f f f f f f f f f f f f f f
     ^

    Pid: 0, comm: swapper Not tainted (2.6.29-rc4 #230)
    EIP: 0060:[<c1096df7>] EFLAGS: 00000286 CPU: 0
    EIP is at __purge_vmap_area_lazy+0x117/0x140
    EAX: 00070f43 EBX: c7806a40 ECX: c1677080 EDX: 00027b66
    ESI: 00002001 EDI: c170df0c EBP: c170df00 ESP: c178830c
     DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
    CR0: 80050033 CR2: c7806b14 CR3: 01775000 CR4: 00000690
    DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
    DR6: 00004000 DR7: 00000000
     [<c1096f3e>] free_unmap_vmap_area_noflush+0x6e/0x70
     [<c1096f6a>] remove_vm_area+0x2a/0x70
     [<c1097025>] __vunmap+0x45/0xe0
     [<c10970de>] vunmap+0x1e/0x30
     [<c1008ba5>] text_poke+0x95/0x150
     [<c1008ca9>] alternatives_smp_unlock+0x49/0x60
     [<c171ef47>] alternative_instructions+0x11b/0x124
     [<c171f991>] check_bugs+0xbd/0xdc
     [<c17148c5>] start_kernel+0x2ed/0x360
     [<c171409e>] __init_begin+0x9e/0xa9
     [<ffffffff>] 0xffffffff

It happened here:

    $ addr2line -e vmlinux -i c1096df7
    mm/vmalloc.c:540

Code:

list_for_each_entry(va, &valist, purge_list)
__free_vmap_area(va);

It's this instruction:

    mov    0x20(%ebx),%edx

Which corresponds to a dereference of va->purge_list.next:

    (gdb) p ((struct vmap_area *) 0)->purge_list.next
    Cannot access memory at address 0x20

It seems that we should use "safe" list traversal here, as the element
is freed inside the loop. Please verify that this is the right fix.

Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoFix oops in cifs_strfromUCS_le mounting to servers which do not specify their OS
Steve French [Tue, 17 Feb 2009 01:29:40 +0000 (01:29 +0000)]
Fix oops in cifs_strfromUCS_le mounting to servers which do not specify their OS

commit 69765529d701c838df19ea1f5ad2f33a528261ae upstream.

Fixes kernel bug #10451 http://bugzilla.kernel.org/show_bug.cgi?id=10451

Certain NAS appliances do not set the operating system or network operating system
fields in the session setup response on the wire.  cifs was oopsing on the unexpected
zero length response fields (when trying to null terminate a zero length field).

This fixes the oops.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: fix memmap init for handling memory hole
KAMEZAWA Hiroyuki [Wed, 18 Feb 2009 22:48:33 +0000 (14:48 -0800)]
mm: fix memmap init for handling memory hole

commit cc2559bccc72767cb446f79b071d96c30c26439b upstream.

Now, early_pfn_in_nid(PFN, NID) may returns false if PFN is a hole.
and memmap initialization was not done. This was a trouble for
sparc boot.

To fix this, the PFN should be initialized and marked as PG_reserved.
This patch changes early_pfn_in_nid() return true if PFN is a hole.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: clean up for early_pfn_to_nid()
KAMEZAWA Hiroyuki [Wed, 18 Feb 2009 22:48:32 +0000 (14:48 -0800)]
mm: clean up for early_pfn_to_nid()

commit f2dbcfa738368c8a40d4a5f0b65dc9879577cb21 upstream.

What's happening is that the assertion in mm/page_alloc.c:move_freepages()
is triggering:

BUG_ON(page_zone(start_page) != page_zone(end_page));

Once I knew this is what was happening, I added some annotations:

if (unlikely(page_zone(start_page) != page_zone(end_page))) {
printk(KERN_ERR "move_freepages: Bogus zones: "
       "start_page[%p] end_page[%p] zone[%p]\n",
       start_page, end_page, zone);
printk(KERN_ERR "move_freepages: "
       "start_zone[%p] end_zone[%p]\n",
       page_zone(start_page), page_zone(end_page));
printk(KERN_ERR "move_freepages: "
       "start_pfn[0x%lx] end_pfn[0x%lx]\n",
       page_to_pfn(start_page), page_to_pfn(end_page));
printk(KERN_ERR "move_freepages: "
       "start_nid[%d] end_nid[%d]\n",
       page_to_nid(start_page), page_to_nid(end_page));
 ...

And here's what I got:

move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
move_freepages: start_nid[1] end_nid[0]

My memory layout on this box is:

[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0081ff5d
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[8] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x00020000
[    0.000000]     1: 0x00800000 -> 0x0081f7ff
[    0.000000]     1: 0x0081f800 -> 0x0081fe50
[    0.000000]     1: 0x0081fed1 -> 0x0081fed8
[    0.000000]     1: 0x0081feda -> 0x0081fedb
[    0.000000]     1: 0x0081fedd -> 0x0081fee5
[    0.000000]     1: 0x0081fee7 -> 0x0081ff51
[    0.000000]     1: 0x0081ff59 -> 0x0081ff5d

So it's a block move in that 0x81f600-->0x81f7ff region which triggers
the problem.

This patch:

Declaration of early_pfn_to_nid() is scattered over per-arch include
files, and it seems it's complicated to know when the declaration is used.
 I think it makes fix-for-memmap-init not easy.

This patch moves all declaration to include/linux/mm.h

After this,
  if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use static definition in include/linux/mm.h
  else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use generic definition in mm/page_alloc.c
  else
     -> per-arch back end function will be called.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoaoe: ignore vendor extension AoE responses
Ed Cashin [Wed, 18 Feb 2009 22:48:13 +0000 (14:48 -0800)]
aoe: ignore vendor extension AoE responses

commit b6d6c5175809934e04a606d9193ef04924a7a7d9 upstream.

The Welland ME-747K-SI AoE target generates unsolicited AoE responses that
are marked as vendor extensions.  Instead of ignoring these packets, the
aoe driver was generating kernel messages for each unrecognized response
received.  This patch corrects the behavior.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Reported-by: <karaluh@karaluh.pl>
Tested-by: <karaluh@karaluh.pl>
Cc: Alex Buell <alex.buell@munted.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agotimerfd: add flags check
Davide Libenzi [Wed, 18 Feb 2009 22:48:18 +0000 (14:48 -0800)]
timerfd: add flags check

commit 610d18f4128ebbd88845d0fc60cce67b49af881e upstream.

As requested by Michael, add a missing check for valid flags in
timerfd_settime(), and make it return EINVAL in case some extra bits are
set.

Michael said:
If this is to be any use to userland apps that want to check flag
support (perhaps it is too late already), then the sooner we get it
into the kernel the better: 2.6.29 would be good; earlier stables as
well would be even better.

[akpm@linux-foundation.org: remove unused TFD_FLAGS_SET]
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agovt: Declare PIO_CMAP/GIO_CMAP as compatbile ioctls.
Bill Nottingham [Wed, 18 Feb 2009 22:48:39 +0000 (14:48 -0800)]
vt: Declare PIO_CMAP/GIO_CMAP as compatbile ioctls.

commit 2db69a9340da12a4db44edb7506dd68799aeff55 upstream.

Otherwise, these don't work when called from 32-bit userspace on 64-bit
kernels.

Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoseq_file: properly cope with pread
Eric Biederman [Wed, 18 Feb 2009 22:48:16 +0000 (14:48 -0800)]
seq_file: properly cope with pread

commit 8f19d472935c83d823fa4cf02bcc0a7b9952db30 upstream.

Currently seq_read assumes that the offset passed to it is always the
offset it passed to user space.  In the case pread this assumption is
broken and we do the wrong thing when presented with pread.

To solve this I introduce an offset cache inside of struct seq_file so we
know where our logical file position is.  Then in seq_read if we try to
read from another offset we reset our data structures and attempt to go to
the offset user space wanted.

[akpm@linux-foundation.org: restore FMODE_PWRITE]
[pjt@google.com: seq_open needs its fmode opened up to take advantage of this]
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Paul Turner <pjt@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agovfs: separate FMODE_PREAD/FMODE_PWRITE into separate flags
Paul Turner [Wed, 18 Feb 2009 22:48:15 +0000 (14:48 -0800)]
vfs: separate FMODE_PREAD/FMODE_PWRITE into separate flags

commit 55ec82176eca52e4e0530a82a0eb59160a1a95a1 upstream.

Separate FMODE_PREAD and FMODE_PWRITE into separate flags to reflect the
reality that the read and write paths may have independent restrictions.

A git grep verifies that these flags are always cleared together so this
new behavior will only apply to interfaces that change to clear flags
individually.

This is required for "seq_file: properly cope with pread", a post-2.6.25
regression fix.

[akpm@linux-foundation.org: add comment]
Signed-off-by: Paul Turner <pjt@google.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosparc64: Fix DAX handling via userspace access from kernel.
David S. Miller [Tue, 20 Jan 2009 06:56:51 +0000 (22:56 -0800)]
sparc64: Fix DAX handling via userspace access from kernel.

[ Upstream commit fcd26f7ae2ea5889134e8b3d60a42ce8b993c95f ]

If we do a userspace access from kernel mode, and get a
data access exception, we need to check the exception
table just like a normal fault does.

The spitfire DAX handler was doing this, but such logic
was missing from the sun4v DAX code.

Reported-by: Dennis Gilmore <dgilmore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosparc64: Fix crashes in jbusmc_print_dimm()
David S. Miller [Wed, 11 Feb 2009 08:54:07 +0000 (00:54 -0800)]
sparc64: Fix crashes in jbusmc_print_dimm()

[ Upstream commit 1b0e235cc9bfae4bc0f5cd0cba929206fb0f6a64 ]

Return was missing for the case where there is no dimm
info match.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonet: Kill skb_truesize_check(), it only catches false-positives.
David S. Miller [Wed, 18 Feb 2009 05:24:05 +0000 (21:24 -0800)]
net: Kill skb_truesize_check(), it only catches false-positives.

[ Upstream commit 92a0acce186cde8ead56c6915d9479773673ea1a ]

A long time ago we had bugs, primarily in TCP, where we would modify
skb->truesize (for TSO queue collapsing) in ways which would corrupt
the socket memory accounting.

skb_truesize_check() was added in order to try and catch this error
more systematically.

However this debugging check has morphed into a Frankenstein of sorts
and these days it does nothing other than catch false-positives.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonet: amend the fix for SO_BSDCOMPAT gsopt infoleak
Eugene Teo [Mon, 23 Feb 2009 23:38:41 +0000 (15:38 -0800)]
net: amend the fix for SO_BSDCOMPAT gsopt infoleak

[ Upstream commit 50fee1dec5d71b8a14c1b82f2f42e16adc227f8b ]

The fix for CVE-2009-0676 (upstream commit df0bca04) is incomplete. Note
that the same problem of leaking kernel memory will reappear if someone
on some architecture uses struct timeval with some internal padding (for
example tv_sec 64-bit and tv_usec 32-bit) --- then, you are going to
leak the padded bytes to userspace.

Signed-off-by: Eugene Teo <eugeneteo@kernel.sg>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoLinux 2.6.28.7 v2.6.28.7
Greg Kroah-Hartman [Fri, 20 Feb 2009 22:41:27 +0000 (14:41 -0800)]
Linux 2.6.28.7

15 years agoFix longstanding "error: storage size of '__mod_dmi_device_table' isn't known"
Alexey Dobriyan [Wed, 21 Jan 2009 22:58:36 +0000 (01:58 +0300)]
Fix longstanding "error: storage size of '__mod_dmi_device_table' isn't known"

commit 40413dcb7b273bda681dca38e6ff0bbb3728ef11 upstream.

gcc 3.4.6 doesn't like MODULE_DEVICE_TABLE(dmi, x) expansion enough to
error out.  Shut it up in a most simple way.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Initialize the new group descriptor when resizing the filesystem
Theodore Ts'o [Tue, 17 Feb 2009 15:32:42 +0000 (10:32 -0500)]
ext4: Initialize the new group descriptor when resizing the filesystem

(cherry picked from commit fdff73f094e7220602cc3f8959c7230517976412)

Make sure all of the fields of the group descriptor are properly
initialized.  Previously, we allowed bg_flags field to be contain
random garbage, which could trigger non-deterministic behavior,
including a kernel OOPS.

http://bugzilla.kernel.org/show_bug.cgi?id=12433

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agojbd2: On a __journal_expect() assertion failure printk "JBD2", not "EXT3-fs"
Theodore Ts'o [Tue, 17 Feb 2009 15:32:41 +0000 (10:32 -0500)]
jbd2: On a __journal_expect() assertion failure printk "JBD2", not "EXT3-fs"

(cherry picked from commit 08ec8c3878cea0bf91f2ba3c0badf44b383752d0)

Otherwise it can be very confusing to find a "EXT3-fs: " failure in
the middle of EXT4-fs failures, and it makes it harder to track the
source of the failure.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Add sanity check to make_indexed_dir
Theodore Ts'o [Tue, 17 Feb 2009 15:32:40 +0000 (10:32 -0500)]
ext4: Add sanity check to make_indexed_dir

(cherry picked from commit e6b8bc09ba2075cd91fbffefcd2778b1a00bd76f)

Make sure the rec_len field in the '..' entry is sane, lest we overrun
the directory block and cause a kernel oops on a purposefully
corrupted filesystem.

Thanks to Sami Liedes for reporting this bug.

http://bugzilla.kernel.org/show_bug.cgi?id=12430

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: only use i_size_high for regular files
Theodore Ts'o [Tue, 17 Feb 2009 15:32:39 +0000 (10:32 -0500)]
ext4: only use i_size_high for regular files

(cherry picked from commit 06a279d636734da32bb62dd2f7b0ade666f65d7c)

Directories are not allowed to be bigger than 2GB, so don't use
i_size_high for anything other than regular files.  E2fsck should
complain about these inodes, but the simplest thing to do for the
kernel is to only use i_size_high for regular files.

This prevents an intentially corrupted filesystem from causing the
kernel to burn a huge amount of CPU and issuing error messages such
as:

EXT4-fs warning (device loop0): ext4_block_to_path: block 135090028 > max

Thanks to David Maciejak from Fortinet's FortiGuard Global Security
Research Team for reporting this issue.

http://bugzilla.kernel.org/show_bug.cgi?id=12375

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Add sanity checks for the superblock before mounting the filesystem
Theodore Ts'o [Tue, 17 Feb 2009 15:32:38 +0000 (10:32 -0500)]
ext4: Add sanity checks for the superblock before mounting the filesystem

(cherry picked from commit 4ec110281379826c5cf6ed14735e47027c3c5765)

This avoids insane superblock configurations that could lead to kernel
oops due to null pointer derefences.

http://bugzilla.kernel.org/show_bug.cgi?id=12371

Thanks to David Maciejak at Fortinet's FortiGuard Global Security
Research Team who discovered this bug independently (but at
approximately the same time) as Thiemo Nagel, who submitted the patch.

Signed-off-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:37 +0000 (10:32 -0500)]
ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc

(cherry picked from commit 0087d9fb3f29f59e8d42c8b058376d80e5adde4c)

With nodelalloc option we need to update the dirty block counter on
block allocation failure. This is needed because we increment the
dirty block counter early in the block allocation phase. Without
the patch s_dirty_blocks_counter goes wrong so that filesystem's
free blocks decreases incorrectly.

Tested-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Init the complete page while building buddy cache
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:36 +0000 (10:32 -0500)]
ext4: Init the complete page while building buddy cache

(cherry picked from commit 29eaf024980e07cc01f31ae4ea5d68c917f4b7da)

We need to init the complete page during buddy cache init
by setting the contents to '1'.  Otherwise we can see the
following errors after doing an online resize of the
filesystem:

EXT4-fs error (device sdb1): ext4_mb_mark_diskspace_used:
Allocating block 1040385 in system zone of 127 group

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Don't allow new groups to be added during block allocation
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:35 +0000 (10:32 -0500)]
ext4: Don't allow new groups to be added during block allocation

(cherry picked from commit 8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae)

After we mark the blocks in the buddy cache as allocated,
we need to ensure that we don't reinit the buddy cache until
the block bitmap is updated.  This commit achieves this by holding
the group_info alloc_semaphore till ext4_mb_release_context

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: mark the blocks/inode bitmap beyond end of group as used
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:34 +0000 (10:32 -0500)]
ext4: mark the blocks/inode bitmap beyond end of group as used

(cherry picked from commit 648f5879f5892dddd3ba71cd0d285599f40f2512)

We need to mark the block/inode bitmap beyond the end of the group
with '1'.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Use new buffer_head flag to check uninit group bitmaps initialization
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:33 +0000 (10:32 -0500)]
ext4: Use new buffer_head flag to check uninit group bitmaps initialization

(cherry picked from commit 2ccb5fb9f113dae969d1ae9b6c10e80fa34f8cd3)

For uninit block group, the on-disk bitmap is not initialized. That
implies we cannot depend on the uptodate flag on the bitmap
buffer_head to find bitmap validity.  Use a new buffer_head flag which
would be set after we properly initialize the bitmap.  This also
prevents (re-)initializing the uninit group bitmap every time we call
ext4_read_block_bitmap().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agojbd2: Add BH_JBDPrivateStart
Mark Fasheh [Tue, 17 Feb 2009 15:32:32 +0000 (10:32 -0500)]
jbd2: Add BH_JBDPrivateStart

(cherry picked from commit e97fcd95a4778a8caf1980c6c72fdf68185a0838)

Add this so that file systems using JBD2 can safely allocate unused b_state
bits.

In this case, we add it so that Ocfs2 can define a single bit for tracking
the validation state of a buffer.

Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Fix the race between read_inode_bitmap() and ext4_new_inode()
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:31 +0000 (10:32 -0500)]
ext4: Fix the race between read_inode_bitmap() and ext4_new_inode()

(cherry picked from commit 393418676a7602e1d7d3f6e560159c65c8cbd50e)

We need to make sure we update the inode bitmap and clear
EXT4_BG_INODE_UNINIT flag with sb_bgl_lock held, since
ext4_read_inode_bitmap() looks at EXT4_BG_INODE_UNINIT to decide
whether to initialize the inode bitmap each time it is called.
(introduced by commit c806e68f.)

ext4_read_inode_bitmap does:

spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
ext4_init_inode_bitmap(sb, bh, block_group, desc);

and ext4_new_inode does
if (!ext4_set_bit_atomic(sb_bgl_lock(sbi, group),
                   ino, inode_bitmap_bh->b_data))
   ......
   ...
spin_lock(sb_bgl_lock(sbi, group));

gdp->bg_flags &= cpu_to_le16(~EXT4_BG_INODE_UNINIT);
i.e., on allocation we update the bitmap then we take the sb_bgl_lock
and clear the EXT4_BG_INODE_UNINIT flag. What can happen is a
parallel ext4_read_inode_bitmap can zero out the bitmap in between
the above ext4_set_bit_atomic and spin_lock(sb_bg_lock..)

The race results in below user visible errors
EXT4-fs error (device sdb1): ext4_free_inode: bit already cleared for inode 168449
EXT4-fs warning (device sdb1): ext4_unlink: Deleting nonexistent file ...
EXT4-fs warning (device sdb1): ext4_rmdir: empty directory has too many links ...
ls: /mnt/tmp/f/p369/d3/d6/d39/db2/dee/d10f/d3f/l71: Stale NFS file handle

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Fix race between read_block_bitmap() and mark_diskspace_used()
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:30 +0000 (10:32 -0500)]
ext4: Fix race between read_block_bitmap() and mark_diskspace_used()

(cherry picked from commit e8134b27e351e813414da3b95aa8eac6d3908088)

We need to make sure we update the block bitmap and clear
EXT4_BG_BLOCK_UNINIT flag with sb_bgl_lock held, since
ext4_read_block_bitmap() looks at EXT4_BG_BLOCK_UNINIT to decide
whether to initialize the block bitmap each time it is called
(introduced by commit c806e68f), and this can race with block
allocations in ext4_mb_mark_diskspace_used().

ext4_read_block_bitmap does:

spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
ext4_init_block_bitmap(sb, bh, block_group, desc);

Now on the block allocation side we do

mb_set_bits(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group), bitmap_bh->b_data,
ac->ac_b_ex.fe_start, ac->ac_b_ex.fe_len);
....
spin_lock(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group));
if (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);

ie on allocation we update the bitmap then we take the sb_bgl_lock
and clear the EXT4_BG_BLOCK_UNINIT flag. What can happen is a
parallel ext4_read_block_bitmap can zero out the bitmap in between
the above mb_set_bits and spin_lock(sb_bg_lock..)

The race results in below user visible errors
EXT4-fs error (device sdb1): ext4_mb_release_inode_pa: free 100, pa_free 105
EXT4-fs error (device sdb1): mb_free_blocks: double-free of inode 0's block ..

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: don't use blocks freed but not yet committed in buddy cache init
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:29 +0000 (10:32 -0500)]
ext4: don't use blocks freed but not yet committed in buddy cache init

(cherry picked from commit 7a2fcbf7f85737735fd44eb34b62315bccf6d6e4)

When we generate buddy cache (especially during resize) we need to
make sure we don't use the blocks freed but not yet comitted.  This
makes sure we have the right value of free blocks count in the group
info and also in the bitmap.  This also ensures the ordered mode
consistency

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: cleanup mballoc header files
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:28 +0000 (10:32 -0500)]
ext4: cleanup mballoc header files

(cherry picked from commit c3a326a657562dab81acf05aee106dc1fe345eb4)

Move some of the forward declaration of the static functions
to mballoc.c where they are used. This enables us to include
mballoc.h in other .c files. Also correct the buddy cache
documentation.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Use EXT4_GROUP_INFO_NEED_INIT_BIT during resize
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:27 +0000 (10:32 -0500)]
ext4: Use EXT4_GROUP_INFO_NEED_INIT_BIT during resize

(cherry picked from commit 920313a726e04fef0f2c0bcb04ad8229c0e700d8)

The new groups added during resize are flagged as
need_init group. Make sure we properly initialize these
groups. When we have block size < page size and we are adding
new groups the page may still be marked uptodate even though
we haven't initialized the group. While forcing the init
of buddy cache we need to make sure other groups part of the
same page of buddy cache is not using the cache.
group_info->alloc_sem is added to ensure the same.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext4: Add blocks added during resize to bitmap
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:32:26 +0000 (10:32 -0500)]
ext4: Add blocks added during resize to bitmap

(cherry picked from commit e21675d4b63975d09eb75c443c48ebe663d23e18)

With this change new blocks added during resize
are marked as free in the block bitmap and the
group is flagged with EXT4_GROUP_INFO_NEED_INIT_BIT
flag.  This makes sure when mballoc tries to allocate
blocks from the new group we would reload the
buddy information using the bitmap present in the disk.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>