]> git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
16 years agonf_nat: fix memset error
Li Zefan [Wed, 28 Nov 2007 08:56:27 +0000 (09:56 +0100)]
nf_nat: fix memset error

This patch fixes an incorrect memset in the NAT code, causing
misbehaviour when unloading and reloading the NAT module.
Applies to stable-2.6.22 and stable-2.6.23.

Please apply, thanks.
[NETFILTER]: nf_nat: fix memset error

Upstream commit e0bf9cf15fc30d300b7fbd821c6bc975531fab44

The size passing to memset is the size of a pointer. Fixes
misbehaviour when unloading and reloading the NAT module.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoesp_scsi: fix reset cleanup spinlock recursion
Maciej W. Rozycki [Mon, 10 Dec 2007 23:49:31 +0000 (15:49 -0800)]
esp_scsi: fix reset cleanup spinlock recursion

patch 522939d45c293388e6a360210905f9230298df16 in mainline.

The esp_reset_cleanup() function is called with the host lock held and
invokes starget_for_each_device() which wants to take it too.  Here is a
fix along the lines of shost_for_each_device()/__shost_for_each_device()
adding a __starget_for_each_device() counterpart which assumes the lock
has already been taken.

Eventually, I think the driver should get modified so that more work is
done as a softirq rather than in the interrupt context, but for now it
fixes a bug that causes the spinlock debugger to fire.

While at it, it fixes a small number of cosmetic problems with
starget_for_each_device() too.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agorevert "dpt_i2o: convert to SCSI hotplug model"
Andrew Morton [Mon, 10 Dec 2007 23:49:20 +0000 (15:49 -0800)]
revert "dpt_i2o: convert to SCSI hotplug model"

patch 24601bbcacb3356657747f2e64317923feb7a1a2 in mainline.

revert

    commit 55d9fcf57ba5ec427544fca7abc335cf3da78160
    Author: Matthew Wilcox <matthew@wil.cx>
    Date:   Mon Jul 30 15:19:18 2007 -0600

        [SCSI] dpt_i2o: convert to SCSI hotplug model

         - Delete refereces to HOSTS_C
         - Switch to module_init/module_exit instead of detect/release
         - Don't pass around the host template and rename it to adpt_template
         - Switch from scsi_register/scsi_unregister to scsi_host_alloc,
           scsi_add_host, scsi_scan_host and scsi_host_put.

Because it caused (for unknown reasons) Andres' all-data-reads-as-zeroes
problem, reported at
http://groups.google.com/group/fa.linux.kernel/msg/083a9acff0330234

Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Mark Salyzyn <mark_salyzyn@adaptec.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Anders Henke <anders.henke@1und1.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agofb_ddc: fix DDC lines quirk
Jean Delvare [Thu, 29 Nov 2007 00:21:35 +0000 (16:21 -0800)]
fb_ddc: fix DDC lines quirk

patch b64d70825abbf706bbe80be1b11b09514b71f45e in mainline.

The code in fb_ddc_read() is said to be based on the implementation of the
radeon driver:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=fc5891c8a3ba284f13994d7bc1f1bfa8283982de

However, comparing the old radeon driver code with the new fb_ddc code
reveals some differences.  Most notably, the I2C bus lines are held at the
end of the function, while the original code was releasing them (as the
comment above correctly says.)

There are a few other differences, which appear to be responsible for read
failures on my system.  While tracing low-level I2C code in i2c-algo-bit, I
noticed that the initial attempt to read the EDID always failed.  It takes
one retry for the read to succeed.  As we are about to remove this
automatic retry property from i2c-algo-bit, reading the EDID would really
fail.

As a summary, the I2C lines quirk which is supposedly needed to read EDID
on some older monitors is currently breaking the (first) read on all other
monitors (and might not even work with older ones - did anyone try since
October 2006?)

After applying the patch below, which makes the code in fb_ddc_read()
really similar to what the radeon driver used to have, the first EDID read
succeeds again.

On top of that, as it appears that this code has been broken for one year
now and nobody seems to have complained, I'm curious if it makes sense to
keep this quirk in place.  It makes the code more complex and slower just
for the sake of monitors which I guess nobody uses anymore.  Can't we just
get rid of it?

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Roger Leigh <rleigh@whinlatter.ukfsn.org>
Tested-by: Michael Buesch <mb@bu3sch.de>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agowait_task_stopped(): pass correct exit_code to wait_noreap_copyout()
Scott James Remnant [Thu, 29 Nov 2007 00:22:07 +0000 (16:22 -0800)]
wait_task_stopped(): pass correct exit_code to wait_noreap_copyout()

patch e6ceb32aa25fc33f21af84cc7a32fe289b3e860c in mainline.

In wait_task_stopped() exit_code already contains the right value for the
si_status member of siginfo, and this is simply set in the non WNOWAIT
case.

If you call waitid() with a stopped or traced process, you'll get the signal
in siginfo.si_status as expected -- however if you call waitid(WNOWAIT) at the
same time, you'll get the signal << 8 | 0x7f

Pass it unchanged to wait_noreap_copyout(); we would only need to shift it
and add 0x7f if we were returning it in the user status field and that
isn't used for any function that permits WNOWAIT.

Signed-off-by: Scott James Remnant <scott@ubuntu.com>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoPNP: increase the maximum number of resources
Zhao Yakui [Thu, 29 Nov 2007 00:21:21 +0000 (16:21 -0800)]
PNP: increase the maximum number of resources

patch a7839e960675b549f06209d18283d5cee2ce9261 in mainline.

On some systems the number of resources(IO,MEM) returnedy by PNP device is
greater than the PNP constant, for example motherboard devices.  It brings
that some resources can't be reserved and resource confilicts.  This will
cause PCI resources are assigned wrongly in some systems, and cause hang.
This is a regression since we deleted ACPI motherboard driver and use PNP
system driver.

[akpm@linux-foundation.org: fix text and coding-style a bit]
Signed-off-by: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoFreezer: Fix APM emulation breakage
Rafael J. Wysocki [Wed, 21 Nov 2007 01:53:14 +0000 (02:53 +0100)]
Freezer: Fix APM emulation breakage

patch cb43c54ca05c01533c45e4d3abfe8f99b7acf624 in mainline.

The APM emulation is currently broken as a result of commit
831441862956fffa17b9801db37e6ea1650b0f69
"Freezer: make kernel threads nonfreezable by default"
that removed the PF_NOFREEZE annotations from apm_ioctl() without adding
the appropriate freezer hooks.  Fix it and remove the unnecessary variable flags
from apm_ioctl().

Special thanks to Franck Bui-Huu <vagabon.xyz@gmail.com> for pointing out the
problem.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agofutex: fix for futex_wait signal stack corruption
Steven Rostedt [Wed, 5 Dec 2007 14:46:09 +0000 (15:46 +0100)]
futex: fix for futex_wait signal stack corruption

From Steven Rostedt <srostedt@redhat.com>

patch ce6bd420f43b28038a2c6e8fbb86ad24014727b6 in mainline.

David Holmes found a bug in the -rt tree with respect to
pthread_cond_timedwait. After trying his test program on the latest git
from mainline, I found the bug was there too.  The bug he was seeing
that his test program showed, was that if one were to do a "Ctrl-Z" on a
process that was in the pthread_cond_timedwait, and then did a "bg" on
that process, it would return with a "-ETIMEDOUT" but early. That is,
the timer would go off early.

Looking into this, I found the source of the problem. And it is a rather
nasty bug at that.

Here's the relevant code from kernel/futex.c: (not in order in the file)

[...]
smlinkage long sys_futex(u32 __user *uaddr, int op, u32 val,
                          struct timespec __user *utime, u32 __user *uaddr2,
                          u32 val3)
{
        struct timespec ts;
        ktime_t t, *tp = NULL;
        u32 val2 = 0;
        int cmd = op & FUTEX_CMD_MASK;

        if (utime && (cmd == FUTEX_WAIT || cmd == FUTEX_LOCK_PI)) {
                if (copy_from_user(&ts, utime, sizeof(ts)) != 0)
                        return -EFAULT;
                if (!timespec_valid(&ts))
                        return -EINVAL;

                t = timespec_to_ktime(ts);
                if (cmd == FUTEX_WAIT)
                        t = ktime_add(ktime_get(), t);
                tp = &t;
        }
[...]
        return do_futex(uaddr, op, val, tp, uaddr2, val2, val3);
}

[...]

long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
                u32 __user *uaddr2, u32 val2, u32 val3)
{
        int ret;
        int cmd = op & FUTEX_CMD_MASK;
        struct rw_semaphore *fshared = NULL;

        if (!(op & FUTEX_PRIVATE_FLAG))
                fshared = &current->mm->mmap_sem;

        switch (cmd) {
        case FUTEX_WAIT:
                ret = futex_wait(uaddr, fshared, val, timeout);

[...]

static int futex_wait(u32 __user *uaddr, struct rw_semaphore *fshared,
                      u32 val, ktime_t *abs_time)
{
[...]
               struct restart_block *restart;
                restart = &current_thread_info()->restart_block;
                restart->fn = futex_wait_restart;
                restart->arg0 = (unsigned long)uaddr;
                restart->arg1 = (unsigned long)val;
                restart->arg2 = (unsigned long)abs_time;
                restart->arg3 = 0;
                if (fshared)
                        restart->arg3 |= ARG3_SHARED;
                return -ERESTART_RESTARTBLOCK;
[...]

static long futex_wait_restart(struct restart_block *restart)
{
        u32 __user *uaddr = (u32 __user *)restart->arg0;
        u32 val = (u32)restart->arg1;
        ktime_t *abs_time = (ktime_t *)restart->arg2;
        struct rw_semaphore *fshared = NULL;

        restart->fn = do_no_restart_syscall;
        if (restart->arg3 & ARG3_SHARED)
                fshared = &current->mm->mmap_sem;
        return (long)futex_wait(uaddr, fshared, val, abs_time);
}

So when the futex_wait is interrupt by a signal we break out of the
hrtimer code and set up or return from signal. This code does not return
back to userspace, so we set up a RESTARTBLOCK.  The bug here is that we
save the "abs_time" which is a pointer to the stack variable "ktime_t t"
from sys_futex.

This returns and unwinds the stack before we get to call our signal. On
return from the signal we go to futex_wait_restart, where we update all
the parameters for futex_wait and call it. But here we have a problem
where abs_time is no longer valid.

I verified this with print statements, and sure enough, what abs_time
was set to ends up being garbage when we get to futex_wait_restart.

The solution I did to solve this (with input from Linus Torvalds)
was to add unions to the restart_block to allow system calls to
use the restart with specific parameters.  This way the futex code now
saves the time in a 64bit value in the restart block instead of storing
it on the stack.

Note: I'm a bit nervious to add "linux/types.h" and use u32 and u64
in thread_info.h, when there's a #ifdef __KERNEL__ just below that.
Not sure what that is there for.  If this turns out to be a problem, I've
tested this with using "unsigned int" for u32 and "unsigned long long" for
u64 and it worked just the same. I'm using u32 and u64 just to be
consistent with what the futex code uses.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoisdn: avoid copying overly-long strings
Karsten Keil [Thu, 22 Nov 2007 11:43:13 +0000 (12:43 +0100)]
isdn: avoid copying overly-long strings

patch 0f13864e5b24d9cbe18d125d41bfa4b726a82e40 in mainline.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9416

Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86 setup: add a near jump to serialize %cr0 on 386/486
H. Peter Anvin [Mon, 5 Nov 2007 01:50:12 +0000 (17:50 -0800)]
x86 setup: add a near jump to serialize %cr0 on 386/486

patch 7ed192906a2144ebc8ca2925a85d27b9c5355668 in mainline.

The 386 and 486 needs a jump immediately after setting %cr0 in order
to serialize the pipeline.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: VMX: Reset mmu context when entering real mode
Eddie Dong [Sun, 2 Dec 2007 11:18:47 +0000 (13:18 +0200)]
KVM: VMX: Reset mmu context when entering real mode

patch 8668a3c468ed55d19514117a5a959d91d3d03823 in mainline.

Resetting an SMP guest will force AP enter real mode (RESET) with
paging enabled in protected mode. While current enter_rmode() can
only handle mode switch from nonpaging mode to real mode which leads
to SMP reboot failure.

Fix by reloading the mmu context on entering real mode.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: VMX: Force vm86 mode if setting flags during real mode
Avi Kivity [Sun, 2 Dec 2007 11:18:46 +0000 (13:18 +0200)]
KVM: VMX: Force vm86 mode if setting flags during real mode

patch 78f7826868da8e27d097802139a3fec39f47f3b8 in mainline.

When resetting from userspace, we need to handle the flags being cleared
even after we are in real mode.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: Skip pio instruction when it is emulated, not executed
Avi Kivity [Sun, 2 Dec 2007 11:18:45 +0000 (13:18 +0200)]
KVM: Skip pio instruction when it is emulated, not executed

patch 0967b7bf1c22b55777aba46ff616547feed0b141 in mainline.

If we defer updating rip until pio instructions are executed, we have a
problem with reset:  a pio reset updates rip, and when the instruction
completes we skip the emulated instruction, pointing rip somewhere completely
unrelated.

Fix by updating rip when we see decode the instruction, not after emulation.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: SVM: Fix FPU leak while emulating clts
Amit Shah [Sun, 2 Dec 2007 11:18:44 +0000 (13:18 +0200)]
KVM: SVM: Fix FPU leak while emulating clts

patch 404fb881b82cf0cf6981832f8d31a7484e4dee81 in mainline.

The clts code didn't use set_cr0 properly, so our lazy FPU
processing wasn't being done by the clts instruction at all.

(this isn't called on Intel as the hardware does the decode for us)

Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: Fix hang on uniprocessor
Marko Kohtala [Sun, 2 Dec 2007 11:18:43 +0000 (13:18 +0200)]
KVM: Fix hang on uniprocessor

This is not in mainline, as it was fixed differently in that tree.

first_cpu(cpus) returns the only CPU when NR_CPUS is 1 regardless of
the cpus mask. Therefore we avoid a kernel hang in
KVM_SET_MEMORY_REGION ioctl on uniprocessor by not entering the loop at
all.

Signed-off-by: Marko Kohtala <marko.kohtala@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: x86 emulator: Use emulator_write_emulated and not emulator_write_std
Amit Shah [Sun, 2 Dec 2007 11:18:42 +0000 (13:18 +0200)]
KVM: x86 emulator: Use emulator_write_emulated and not emulator_write_std

patch 00b2ef475d4728ca53a2bc788c7978042907e354 in mainline.

emulator_write_std() is not implemented, and calling write_emulated should
work just as well in place of write_std.

Fixes emulator failures with the push r/m instruction.

Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: SVM: Intercept the 'invd' and 'wbinvd' instructions
Avi Kivity [Sun, 2 Dec 2007 11:18:41 +0000 (13:18 +0200)]
KVM: SVM: Intercept the 'invd' and 'wbinvd' instructions

patch cf5a94d1331b411b84414c13e43f578260942d6b in mainline.

'invd' can destroy host data, and 'wbinvd' allows the guest to induce
long (milliseconds) latencies.

Noted by Ben Serebrin.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: x86 emulator: invd instruction
Avi Kivity [Sun, 2 Dec 2007 11:18:40 +0000 (13:18 +0200)]
KVM: x86 emulator: invd instruction

patch 651a3e29b3d19418d7a8a9787906061f9be7cc5f in mainline.

Emulate the 'invd' instruction (opcode 0f 08).

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: x86 emulator: fix access registers for instructions with ModR/M byte and Mod = 3
Aurelien Jarno [Sun, 2 Dec 2007 11:18:39 +0000 (13:18 +0200)]
KVM: x86 emulator: fix access registers for instructions with ModR/M byte and Mod = 3

patch 4e62417bf317504c0b85e0d7abd236f334f54eaf in mainline.

The patch belows changes the access type to register from memory for
instructions that are declared as SrcMem or DstMem, but have a
ModR/M byte with Mod = 3.

It fixes (at least) the lmsw and smsw instructions on an AMD64 CPU,
which are needed for FreeBSD.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoKVM: x86 emulator: implement 'movnti mem, reg'
Sheng Yang [Sun, 2 Dec 2007 11:18:38 +0000 (13:18 +0200)]
KVM: x86 emulator: implement 'movnti mem, reg'

patch a012e65aee48379a7a87eadafa74f878b61522b9 in mainline.

Implement emulation of instruction:
    movnti m32/m64, r32/r64
    opcode: 0x0f 0xc3

Needed to support Linux 2.6.16 as guest (used for mmio).

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agohrtimers: avoid overflow for large relative timeouts (CVE-2007-5966)
Thomas Gleixner [Fri, 7 Dec 2007 18:16:17 +0000 (19:16 +0100)]
hrtimers: avoid overflow for large relative timeouts (CVE-2007-5966)

patch 62f0f61e6673e67151a7c8c0f9a09c7ea43fe2b5 in mainline

Relative hrtimers with a large timeout value might end up as negative
timer values, when the current time is added in hrtimer_start().

This in turn is causing the clockevents_set_next() function to set an
huge timeout and sleep for quite a long time when we have a clock
source which is capable of long sleeps like HPET. With PIT this almost
goes unnoticed as the maximum delta is ~27ms. The non-hrt/nohz code
sorts this out in the next timer interrupt, so we never noticed that
problem which has been there since the first day of hrtimers.

This bug became more apparent in 2.6.24 which activates HPET on more
hardware.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoforcedeth boot delay fix
Ayaz Abdulla [Wed, 21 Nov 2007 23:02:58 +0000 (15:02 -0800)]
forcedeth boot delay fix

patch 9e555930bd873d238f5f7b9d76d3bf31e6e3ce93 in mainline.

Fix a long boot delay in the forcedeth driver.  During initialization, the
timeout for the handshake between mgmt unit and driver can be very long.
The patch reduces the timeout by eliminating a extra loop around the
timeout logic.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9308

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Cc: Alex Howells <astinus@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoforcedeth: new mcp79 pci ids
Ayaz Abdulla [Sat, 24 Nov 2007 01:54:01 +0000 (20:54 -0500)]
forcedeth: new mcp79 pci ids

patch 490dde8990c55662596a4be71b5070bd7d382d4a in mainline.

This patch adds new device ids and features for mcp79 devices into the
forcedeth driver.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
index 92ce2e3..f9ba0ac 100644

16 years agoI4L: fix isdn_ioctl memory overrun vulnerability
Karsten Keil [Sat, 1 Dec 2007 20:16:15 +0000 (12:16 -0800)]
I4L: fix isdn_ioctl memory overrun vulnerability

patch eafe1aa37e6ec2d56f14732b5240c4dd09f0613a in mainline.

Fix possible memory overrun issue in the isdn ioctl code.  Found by ADLAB
<adlab@venustech.com.cn>

Signed-off-by: Karsten Keil <kkeil@suse.de>
Cc: ADLAB <adlab@venustech.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agotmpfs: restore missing clear_highpage
Hugh Dickins [Wed, 28 Nov 2007 18:55:10 +0000 (18:55 +0000)]
tmpfs: restore missing clear_highpage

patch e84e2e132c9c66d8498e7710d4ea532d1feaaac5 in mainline

tmpfs was misconverted to __GFP_ZERO in 2.6.11.  There's an unusual case in
which shmem_getpage receives the page from its caller instead of allocating.
We must cover this case by clear_highpage before SetPageUptodate, as before.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: fix up EHCI startup synchronization
David Brownell [Wed, 28 Nov 2007 22:50:03 +0000 (14:50 -0800)]
USB: fix up EHCI startup synchronization

patch 1cb52658b4f5b10a9e91f8e1c21ca2bcc1b9a3ca in mainline.

A recent patch added software synchronization during EHCI startup,
so ports aren't switched away from the companion controllers after
resets have started.  This patch adds a short delay letting hardware
finish that port switching before any new resets begin ... so both
ends of that hardware race window are closed.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Dave Miller <davem@davemloft.net>
Cc: Dely Sy <dely.l.sy@intel.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: make the microtek driver and HAL cooperate
Oliver Neukum [Wed, 28 Nov 2007 22:50:02 +0000 (14:50 -0800)]
USB: make the microtek driver and HAL cooperate

patch 5cf1973a44bd298e3cfce6f6af8faa8c9d0a6d55 in mainline

to make HAL like the microtek driver's devices the parent must be
correctly set.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoInput: ALPS - add support for model found in Dell Vostro 1400
William Pettersson [Wed, 21 Nov 2007 22:11:07 +0000 (17:11 -0500)]
Input: ALPS - add support for model found in Dell Vostro 1400

changeset dac4ae0daa1be36ab015973ed9e9dc04a2684395 in mainline.

Input: ALPS - add support for model found in Dell Vostro 1400

Signed-off-by: William Pettersson <william.pettersson@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoFix synchronize_irq races with IRQ handler
Herbert Xu [Wed, 21 Nov 2007 22:09:39 +0000 (17:09 -0500)]
Fix synchronize_irq races with IRQ handler

patch a98ce5c6feead6bfedefabd46cb3d7f5be148d9a in mainline.

Fix synchronize_irq races with IRQ handler

As it is some callers of synchronize_irq rely on memory barriers
to provide synchronisation against the IRQ handlers.  For example,
the tg3 driver does

tp->irq_sync = 1;
smp_mb();
synchronize_irq();

and then in the IRQ handler:

if (!tp->irq_sync)
netif_rx_schedule(dev, &tp->napi);

Unfortunately memory barriers only work well when they come in
pairs.  Because we don't actually have memory barriers on the
IRQ path, the memory barrier before the synchronize_irq() doesn't
actually protect us.

In particular, synchronize_irq() may return followed by the
result of netif_rx_schedule being made visible.

This patch (mostly written by Linus) fixes this by using spin
locks instead of memory barries on the synchronize_irq() path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoPKT_SCHED: Check subqueue status before calling hard_start_xmit
Peter P Waskiewicz Jr [Wed, 21 Nov 2007 12:32:57 +0000 (20:32 +0800)]
PKT_SCHED: Check subqueue status before calling hard_start_xmit

[PKT_SCHED]: Check subqueue status before calling hard_start_xmit

[ Upstream commit: 5f1a485d5905aa641f33009019b3699076666a4c ]

The only qdiscs that check subqueue state before dequeue'ing are PRIO
and RR.  The other qdiscs, including the default pfifo_fast qdisc,
will allow traffic bound for subqueue 0 through to hard_start_xmit.
The check for netif_queue_stopped() is done above in pkt_sched.h, so
it is unnecessary for qdisc_restart().  However, if the underlying
driver is multiqueue capable, and only sets queue states on subqueues,
this will allow packets to enter the driver when it's currently unable
to process packets, resulting in expensive requeues and driver
entries.  This patch re-adds the check for the subqueue status before
calling hard_start_xmit, so we can try and avoid the driver entry when
the queues are stopped.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agosched: some proc entries are missed in sched_domain sys_ctl debug code
Zou Nan hai [Mon, 15 Oct 2007 15:00:14 +0000 (17:00 +0200)]
sched: some proc entries are missed in sched_domain sys_ctl debug code

patch ace8b3d633f93da8535921bf3e3679db3c619578 in mainline.

cache_nice_tries and flags entry do not appear in proc fs sched_domain
directory, because ctl_table entry is skipped.

This patch fixes the issue.

Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoFuture of Linux 2.6.22.y series
Christian Borntraeger [Tue, 6 Nov 2007 11:26:15 +0000 (12:26 +0100)]
Future of Linux 2.6.22.y series

commit 5d0360ee96a5ef953dbea45873c2a8c87e77d59b upstream.

We have seen ramdisk based install systems, where some pages of mapped
libraries and programs were suddendly zeroed under memory pressure. This
should not happen, as the ramdisk avoids freeing its pages by keeping
them dirty all the time.

It turns out that there is a case, where the VM makes a ramdisk page
clean, without telling the ramdisk driver.  On memory pressure
shrink_zone runs and it starts to run shrink_active_list.  There is a
check for buffer_heads_over_limit, and if true, pagevec_strip is called.
pagevec_strip calls try_to_release_page. If the mapping has no
releasepage callback, try_to_free_buffers is called. try_to_free_buffers
has now a special logic for some file systems to make a dirty page
clean, if all buffers are clean. Thats what happened in our test case.

The simplest solution is to provide a noop-releasepage callback for the
ramdisk driver. This avoids try_to_free_buffers for ramdisk pages.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: Fix NULL pointer dereference in nf_nat_move_storage()
Evgeniy Polyakov [Wed, 21 Nov 2007 12:32:56 +0000 (20:32 +0800)]
NETFILTER: Fix NULL pointer dereference in nf_nat_move_storage()

[NETFILTER]: Fix NULL pointer dereference in nf_nat_move_storage()

[ Upstream commit: 7799652557d966e49512479f4d3b9079bbc01fff ]

Reported by Chuck Ebbert as:

https://bugzilla.redhat.com/show_bug.cgi?id=259501#c14

This routine is called each time hash should be replaced, nf_conn has
extension list which contains pointers to connection tracking users
(like nat, which is right now the only such user), so when replace takes
place it should copy own extensions. Loop above checks for own
extension, but tries to move higer-layer one, which can lead to above
oops.

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNET: random : secure_tcp_sequence_number should not assume CONFIG_KTIME_SCALAR
Eric Dumazet [Wed, 21 Nov 2007 12:32:55 +0000 (20:32 +0800)]
NET: random : secure_tcp_sequence_number should not assume CONFIG_KTIME_SCALAR

[NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_SCALAR

[ Upstream commit: 6dd10a62353a50b30b30e0c18653650975b29c71 ]

All 32 bits machines but i386 dont have CONFIG_KTIME_SCALAR. On these
machines, ktime.tv64 is more than 4 times the (correct) result given
by ktime_to_ns()

Again on these machines, using ktime_get_real().tv64 >> 6 give a
32bits rollover every 64 seconds, which is not wanted (less than the
120 s MSL)

Using ktime_to_ns() is the portable way to get nsecs from a ktime, and
have correct code.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agolibertas: properly account for queue commands
Marcelo Tosatti [Tue, 20 Nov 2007 18:54:52 +0000 (13:54 -0500)]
libertas: properly account for queue commands

patch 29f5f2a19b055feabfcc6f92e1d40ec092c373ea in mainline.

Properly account for queue commands, this fixes a problem reported
by Holger Schurig when using the debugfs interface.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoLinux 2.6.23.9 v2.6.23.9
Greg Kroah-Hartman [Mon, 26 Nov 2007 17:51:43 +0000 (09:51 -0800)]
Linux 2.6.23.9

16 years agoipw2200: batch non-user-requested scan result notifications
Dan Williams [Tue, 9 Oct 2007 17:55:24 +0000 (13:55 -0400)]
ipw2200: batch non-user-requested scan result notifications

patch 0b5316769774d1dc2fdd702e095f9e6992af269a in mainline.

ipw2200 makes extensive use of background scanning when unassociated or
down.  Unfortunately, the firmware sends scan completed events many
times per second, which the driver pushes directly up to userspace.
This needlessly wakes up processes listening for wireless events many
times per second.  Batch together scan completed events for
non-user-requested scans and send them up to userspace every 4 seconds.
Scan completed events resulting from an SIOCSIWSCAN call are pushed up
without delay.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: Tobias Powalowski <t.powa@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: Nikon D40X unusual_devs entry
Ortwin Glück [Thu, 11 Oct 2007 15:29:43 +0000 (17:29 +0200)]
USB: Nikon D40X unusual_devs entry

patch d466a9190ff1ceddfee50686e61d63590fc820d9 in mainline.

Not surprisingly the Nikon D40X DSC needs the same quirks as the D40,
but it has a separate ID.
See http://bugs.gentoo.org/show_bug.cgi?id=191431

From: Ortwin Glück <odi@odi.ch>
Cc: Tobias Powalowski <t.powa@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: unusual_devs modification for Nikon D200
Phil Dibowitz [Sun, 23 Sep 2007 03:58:12 +0000 (20:58 -0700)]
USB: unusual_devs modification for Nikon D200

patch 16eb345f4d9189b59bae576ae63cba7ca77817b2 in mainline.

Upgrade the unusual_devs.h file to support the Nikon D200

Signed-off-by: Mike Pagano <mpagano-kernel@mpagano.com>
Signed-off-by: Phil Dibowitz <phil@ipom.com>
Cc: Tobias Powalowski <t.powa@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agosoftlockup: use cpu_clock() instead of sched_clock()
Ingo Molnar [Wed, 17 Oct 2007 06:26:06 +0000 (23:26 -0700)]
softlockup: use cpu_clock() instead of sched_clock()

patch a3b13c23f186ecb57204580cc1f2dbe9c284953a in mainline.

sched_clock() is not a reliable time-source, use cpu_clock() instead.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agosoftlockup watchdog fixes and cleanups
Ingo Molnar [Sun, 18 Nov 2007 00:55:38 +0000 (01:55 +0100)]
softlockup watchdog fixes and cleanups

This is a merge of commits a5f2ce3c6024a5bb895647b6bd88ecae5001020a and
43581a10075492445f65234384210492ff333eba in mainline to fix a warning in
the 2.6.23.3 kernel release.

softlockup watchdog: style cleanups

kernel/softirq.c grew a few style uncleanlinesses in the past few
months, clean that up. No functional changes:

text    data     bss     dec     hex filename
1126      76       4    1206     4b6 softlockup.o.before
1129      76       4    1209     4b9 softlockup.o.after

( the 3 bytes .text increase is due to the "<1>" appended to one of
the printk messages. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
softlockup: improve debug output

Improve the debuggability of kernel lockups by enhancing the debug
output of the softlockup detector: print the task that causes the lockup
and try to print a more intelligent backtrace.

The old format was:

BUG: soft lockup detected on CPU#1!
[<c0105e4a>] show_trace_log_lvl+0x19/0x2e
[<c0105f43>] show_trace+0x12/0x14
[<c0105f59>] dump_stack+0x14/0x16
[<c015f6bc>] softlockup_tick+0xbe/0xd0
[<c013457d>] run_local_timers+0x12/0x14
[<c01346b8>] update_process_times+0x3e/0x63
[<c0145fb8>] tick_sched_timer+0x7c/0xc0
[<c0140a75>] hrtimer_interrupt+0x135/0x1ba
[<c011bde7>] smp_apic_timer_interrupt+0x6e/0x80
[<c0105aa3>] apic_timer_interrupt+0x33/0x38
[<c0104f8a>] syscall_call+0x7/0xb
=======================

The new format is:

BUG: soft lockup detected on CPU#1! [prctl:2363]

Pid: 2363, comm:                prctl
EIP: 0060:[<c013915f>] CPU: 1
EIP is at sys_prctl+0x24/0x18c
EFLAGS: 00000213    Not tainted  (2.6.22-cfs-v20 #26)
EAX: 00000001 EBX: 000003e7 ECX: 00000001 EDX: f6df0000
ESI: 000003e7 EDI: 000003e7 EBP: f6df0fb0 DS: 007b ES: 007b FS: 00d8
CR0: 8005003b CR2: 4d8c3340 CR3: 3731d000 CR4: 000006d0
[<c0105e4a>] show_trace_log_lvl+0x19/0x2e
[<c0105f43>] show_trace+0x12/0x14
[<c01040be>] show_regs+0x1ab/0x1b3
[<c015f807>] softlockup_tick+0xef/0x108
[<c013457d>] run_local_timers+0x12/0x14
[<c01346b8>] update_process_times+0x3e/0x63
[<c0145fcc>] tick_sched_timer+0x7c/0xc0
[<c0140a89>] hrtimer_interrupt+0x135/0x1ba
[<c011bde7>] smp_apic_timer_interrupt+0x6e/0x80
[<c0105aa3>] apic_timer_interrupt+0x33/0x38
[<c0104f8a>] syscall_call+0x7/0xb
=======================

Note that in the old format we only knew that some system call locked
up, we didnt know _which_. With the new format we know that it's at a
specific place in sys_prctl(). [which was where i created an artificial
kernel lockup to test the new format.]

This is also useful if the lockup happens in user-space - the user-space
EIP (and other registers) will be printed too. (such a lockup would
either suggest that the task was running at SCHED_FIFO:99 and looping
for more than 10 seconds, or that the softlockup detector has a
false-positive.)

The task name is printed too first, just in case we dont manage to print
a useful backtrace.

[satyam@infradead.org: fix warning]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Satyam Sharma <satyam@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: fix freeze in x86_64 RTC update code in time_64.c
David P. Reed [Wed, 14 Nov 2007 22:47:35 +0000 (17:47 -0500)]
x86: fix freeze in x86_64 RTC update code in time_64.c

patch c399da0d97e06803e51085ec076b63a3168aad1b in mainline.

x86: fix freeze in x86_64 RTC update code in time_64.c

Fix hard freeze on x86_64 when the ntpd service calls
update_persistent_clock()

A repeatable but randomly timed freeze has been happening in Fedora 6
and 7 for the last year, whenever I run the ntpd service on my AMD64x2
HP Pavilion dv9000z laptop.  This freeze is due to the use of
spin_lock(&rtc_lock) under the assumption (per a bad comment) that
set_rtc_mmss is called only with interrupts disabled.  The call from
ntp.c to update_persistent_clock is made with interrupts enabled.

[ tglx@linutronix.de: ported to 2.6.23.stable ]

Signed-off-by: David P. Reed <dpreed@reed.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agontp: fix typo that makes sync_cmos_clock erratic
David P. Reed [Wed, 14 Nov 2007 22:49:21 +0000 (17:49 -0500)]
ntp: fix typo that makes sync_cmos_clock erratic

patch fa6a1a554b50cbb7763f6907e6fef927ead480d9 in mainline.

ntp: fix typo that makes sync_cmos_clock erratic

Fix a typo in ntp.c that has caused updating of the persistent (RTC)
clock when synced to NTP to behave erratically.

When debugging a freeze that arises on my AMD64 machines when I
run the ntpd service, I added a number of printk's to monitor the
sync_cmos_clock procedure.  I discovered that it was not syncing to
cmos RTC every 11 minutes as documented, but instead would keep trying
every second for hours at a time.  The reason turned out to be a typo
in sync_cmos_clock, where it attempts to ensure that
update_persistent_clock is called very close to 500 msec. after a 1
second boundary (required by the PC RTC's spec). That typo referred to
"xtime" in one spot, rather than "now", which is derived from "xtime"
but not equal to it.  This makes the test erratic, creating a
"coin-flip" that decides when update_persistent_clock is called - when
it is called, which is rarely, it may be at any time during the one
second period, rather than close to 500 msec, so the value written is
needlessly incorrect, too.

Signed-off-by: David P. Reed <dpreed@reed.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: return correct error code from child_rip in x86_64 entry.S
Andrey Mirkin [Wed, 17 Oct 2007 16:04:33 +0000 (18:04 +0200)]
x86: return correct error code from child_rip in x86_64 entry.S

patch 1c5b5cfd290b6cb7c67020ef420e275f746a7236 in mainline.

x86: return correct error code from child_rip in x86_64 entry.S

Right now register edi is just cleared before calling do_exit.
That is wrong because correct return value will be ignored.
Value from rax should be copied to rdi instead of clearing edi.

AK: changed to 32bit move because it's strictly an int

[ tglx: arch/x86 adaptation ]

Signed-off-by: Andrey Mirkin <major@openvz.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: NX bit handling in change_page_attr()
Huang, Ying [Wed, 17 Oct 2007 16:04:35 +0000 (18:04 +0200)]
x86: NX bit handling in change_page_attr()

patch 84e0fdb1754d066dd0a8b257de7299f392d1e727 in mainline.

x86: NX bit handling in change_page_attr()

This patch fixes a bug of change_page_attr/change_page_attr_addr on
Intel x86_64 CPUs.  After changing page attribute to be executable with
these functions, the page remains un-executable on Intel x86_64 CPU.
Because on Intel x86_64 CPU, only if the "NX" bits of all four level
page tables are cleared, the corresponding page is executable (refer to
section 4.13.2 of Intel 64 and IA-32 Architectures Software Developer's
Manual).  So, the bug is fixed through clearing the "NX" bit of PMD when
splitting the huge PMD.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: mark read_crX() asm code as volatile
Kirill Korotaev [Wed, 17 Oct 2007 16:04:33 +0000 (18:04 +0200)]
x86: mark read_crX() asm code as volatile

patch c1217a75ea102d4e69321f210fab60bc47b9a48e in mainline.

x86: mark read_crX() asm code as volatile

Some gcc versions (I checked at least 4.1.1 from RHEL5 & 4.1.2 from gentoo)
can generate incorrect code with read_crX()/write_crX() functions mix up,
due to cached results of read_crX().

The small app for x8664 below compiled with -O2 demonstrates this
(i686 does the same thing):

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: fix off-by-one in find_next_zero_string
Andrew Hastings [Wed, 17 Oct 2007 16:04:33 +0000 (18:04 +0200)]
x86: fix off-by-one in find_next_zero_string

patch 801916c1b369b637ce799e6c71a94963ff63df79 in mainline.

x86: fix off-by-one in find_next_zero_string

Fix an off-by-one error in find_next_zero_string which prevents
allocating the last bit.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Andrew Hastings <abh@cray.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoi386: avoid temporarily inconsistent pte-s
Jan Beulich [Wed, 17 Oct 2007 16:04:33 +0000 (18:04 +0200)]
i386: avoid temporarily inconsistent pte-s

patch aa506dc7b12d03fbf8fd11aab752aed1aadd9c07 in mainline.

i386: avoid temporarily inconsistent pte-s

One more of these issues (which were considered fixed a few releases
back): other than on x86-64, i386 allows set_fixmap() to replace
already present mappings. Consequently, on PAE, care must be taken to
not update the high half of a pte while the low half is still holding
the old value.

 [tglx: arch/x86 adaptation]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agolibcrc32c: keep intermediate crc state in cpu order
Herbert Xu [Thu, 15 Nov 2007 01:07:23 +0000 (09:07 +0800)]
libcrc32c: keep intermediate crc state in cpu order

It's upstream changeset ef19454bd437b2ba14c9cda1de85debd9f383484.

[LIB] crc32c: Keep intermediate crc state in cpu order

crypto/crc32.c:chksum_final() is computing the digest as
*(__le32 *)out = ~cpu_to_le32(mctx->crc);
so the low-level crc32c_le routines should just keep
the crc in cpu order, otherwise it is getting swabbed
one too many times on big-endian machines.

Signed-off-by: Benny Halevy <bhalevy@fs1.bhalevy.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agogeode: Fix not inplace encryption
Sebastian Siewior [Sat, 10 Nov 2007 11:37:49 +0000 (19:37 +0800)]
geode: Fix not inplace encryption

patch 2e21630ddc3fb717dc645356b75771c6a52dc627 in mainline.

Currently the Geode AES module fails to encrypt or decrypt if
the coherent bits are not set what is currently the case if the
encryption does not occur inplace. However, the encryption works
on my Geode machine _only_ if the coherent bits are always set.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Acked-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoFix divide-by-zero in the 2.6.23 scheduler code
Chuck Ebbert [Wed, 14 Nov 2007 23:33:16 +0000 (18:33 -0500)]
Fix divide-by-zero in the 2.6.23 scheduler code

No patch in mainline as this logic has been removed from 2.6.24 so it is
not necessary.

https://bugzilla.redhat.com/show_bug.cgi?id=340161

The problem code has been removed in 2.6.24. The below patch disables
SCHED_FEAT_PRECISE_CPU_LOAD which causes the offending code to be skipped
but does not prevent the user from enabling it.

The divide-by-zero is here in kernel/sched.c:

static void update_cpu_load(struct rq *this_rq)
{
u64 fair_delta64, exec_delta64, idle_delta64, sample_interval64, tmp64;
unsigned long total_load = this_rq->ls.load.weight;
unsigned long this_load =  total_load;
struct load_stat *ls = &this_rq->ls;
int i, scale;

this_rq->nr_load_updates++;
if (unlikely(!(sysctl_sched_features & SCHED_FEAT_PRECISE_CPU_LOAD)))
goto do_avg;

/* Update delta_fair/delta_exec fields first */
update_curr_load(this_rq);

fair_delta64 = ls->delta_fair + 1;
ls->delta_fair = 0;

exec_delta64 = ls->delta_exec + 1;
ls->delta_exec = 0;

sample_interval64 = this_rq->clock - ls->load_update_last;
ls->load_update_last = this_rq->clock;

if ((s64)sample_interval64 < (s64)TICK_NSEC)
sample_interval64 = TICK_NSEC;

if (exec_delta64 > sample_interval64)
exec_delta64 = sample_interval64;

idle_delta64 = sample_interval64 - exec_delta64;

======> tmp64 = div64_64(SCHED_LOAD_SCALE * exec_delta64, fair_delta64);
tmp64 = div64_64(tmp64 * exec_delta64, sample_interval64);

this_load = (unsigned long)tmp64;

do_avg:

/* Update our load: */
for (i = 0, scale = 1; i < CPU_LOAD_IDX_MAX; i++, scale += scale) {
unsigned long old_load, new_load;

/* scale is effectively 1 << i now, and >> i divides by scale */

old_load = this_rq->cpu_load[i];
new_load = this_load;

this_rq->cpu_load[i] = (old_load*(scale-1) + new_load) >> i;
}
}

For stable only; the code has been removed in 2.6.24.

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoACPI: VIDEO: Adjust current level to closest available one.
Alexey Starikovskiy [Thu, 15 Nov 2007 07:04:29 +0000 (08:04 +0100)]
ACPI: VIDEO: Adjust current level to closest available one.

patch 63f0edfc0b7f8058f9d3f9b572615ec97ae011ba in mainline.

ACPI: VIDEO: Adjust current level to closest available one.

Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Tobias Powalowski <t.powa@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agolibata: sata_sis: use correct S/G table size
Jeff Garzik [Thu, 15 Nov 2007 06:59:44 +0000 (07:59 +0100)]
libata: sata_sis: use correct S/G table size

patch 96af154710d44b574515431a0bb014888398a741 in mainline.

[libata] sata_sis: use correct S/G table size

sata_sis has the same restrictions as other SFF controllers, and so must
use LIBATA_MAX_PRD to denote that SCSI may only fill ATA_MAX_PRD/2
entries, due to our need to handle IOMMU merging.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: Tobias Powalowski <t.powa@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agosata_sis: fix SCR read breakage
Tejun Heo [Thu, 15 Nov 2007 06:59:44 +0000 (07:59 +0100)]
sata_sis: fix SCR read breakage

patch aaa092a114696f4425cd57c4d7fa05110007e247 in mainline.

sata_sis: fix SCR read breakage

SCR read for controllers which uses PCI configuration space for SCR
access got broken while adding @val argument to SCR accessors.  Fix
it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Cc: Tobias Powalowski <t.powa@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoreiserfs: don't drop PG_dirty when releasing sub-page-sized dirty file
Fengguang Wu [Thu, 15 Nov 2007 00:59:54 +0000 (16:59 -0800)]
reiserfs: don't drop PG_dirty when releasing sub-page-sized dirty file

patch c06a018fa5362fa9ed0768bd747c0fab26bc8849 in mainline.

This is not a new problem in 2.6.23-git17.  2.6.22/2.6.23 is buggy in the
same way.

Reiserfs could accumulate dirty sub-page-size files until umount time.
They cannot be synced to disk by pdflush routines or explicit `sync'
commands.  Only `umount' can do the trick.

The direct cause is: the dirty page's PG_dirty is wrongly _cleared_.
Call trace:
 [<ffffffff8027e920>] cancel_dirty_page+0xd0/0xf0
 [<ffffffff8816d470>] :reiserfs:reiserfs_cut_from_item+0x660/0x710
 [<ffffffff8816d791>] :reiserfs:reiserfs_do_truncate+0x271/0x530
 [<ffffffff8815872d>] :reiserfs:reiserfs_truncate_file+0xfd/0x3b0
 [<ffffffff8815d3d0>] :reiserfs:reiserfs_file_release+0x1e0/0x340
 [<ffffffff802a187c>] __fput+0xcc/0x1b0
 [<ffffffff802a1ba6>] fput+0x16/0x20
 [<ffffffff8029e676>] filp_close+0x56/0x90
 [<ffffffff8029fe0d>] sys_close+0xad/0x110
 [<ffffffff8020c41e>] system_call+0x7e/0x83

Fix the bug by removing the cancel_dirty_page() call. Tests show that
it causes no bad behaviors on various write sizes.

=== for the patient ===
Here are more detailed demonstrations of the problem.

1) the page has both PG_dirty(D)/PAGECACHE_TAG_DIRTY(d) after being written to;
   and then only PAGECACHE_TAG_DIRTY(d) remains after the file is closed.

------------------------------ screen 0 ------------------------------
[T0] root /home/wfg# cat > /test/tiny
[T1] hi
[T2] root /home/wfg#

------------------------------ screen 1 ------------------------------
[T1] root /home/wfg# echo /test/tiny > /proc/filecache
[T1] root /home/wfg# cat /proc/filecache
     # file /test/tiny
     # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback
     # idx   len     state   refcnt
     0       1       ___UD__Bd_      2
[T2] root /home/wfg# cat /proc/filecache
     # file /test/tiny
     # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback
     # idx   len     state   refcnt
     0       1       ___U___Bd_      2

2) note the non-zero 'cancelled_write_bytes' after /tmp/hi is copied.

------------------------------ screen 0 ------------------------------
[T0] root /home/wfg# echo hi > /tmp/hi
[T1] root /home/wfg# cp /tmp/hi /dev/stdin /test
[T2] hi
[T3] root /home/wfg#

------------------------------ screen 1 ------------------------------
[T1] root /proc/4397# cd /proc/`pidof cp`
[T1] root /proc/4713# cat io
     rchar: 8396
     wchar: 3
     syscr: 20
     syscw: 1
     read_bytes: 0
     write_bytes: 20480
     cancelled_write_bytes: 4096
[T2] root /proc/4713# cat io
     rchar: 8399
     wchar: 6
     syscr: 21
     syscw: 2
     read_bytes: 0
     write_bytes: 24576
     cancelled_write_bytes: 4096

//Question: the 'write_bytes' is a bit more than expected ;-)

Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Reviewed-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: disable preemption in delay_tsc()
Andrew Morton [Thu, 15 Nov 2007 01:00:41 +0000 (17:00 -0800)]
x86: disable preemption in delay_tsc()

patch 35d5d08a085c56f153458c3f5d8ce24123617faf in mainline.

Marin Mitov points out that delay_tsc() can misbehave if it is preempted and
rescheduled on a different CPU which has a skewed TSC.  Fix it by disabling
preemption.

(I assume that the worst-case behaviour here is a stall of 2^32 cycles)

Cc: Andi Kleen <ak@suse.de>
Cc: Marin Mitov <mitov@issp.bas.bg>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agodmaengine: fix broken device refcounting
Haavard Skinnemoen [Thu, 15 Nov 2007 00:59:27 +0000 (16:59 -0800)]
dmaengine: fix broken device refcounting

patch 348badf1e825323c419dd118f65783db0f7d2ec8 in mainline.

When a DMA device is unregistered, its reference count is decremented twice
for each channel: Once dma_class_dev_release() and once in
dma_chan_cleanup().  This may result in the DMA device driver's remove()
function completing before all channels have been cleaned up, causing lots
of use-after-free fun.

Fix it by incrementing the device's reference count twice for each
channel during registration.

[dan.j.williams@intel.com: kill unnecessary client refcounting]
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agonfsd4: recheck for secure ports in fh_verify
J. Bruce Fields [Mon, 12 Nov 2007 21:05:03 +0000 (16:05 -0500)]
nfsd4: recheck for secure ports in fh_verify

patch 6fa02839bf9412e18e773d04e96182b4cd0b5d57 in mainline.

As with

7fc90ec93a5eb71f4b08... "call nfsd_setuser() on fh_compose()..."

this is a case where we need to redo a security check in fh_verify()
even though the filehandle already has an associated dentry--if the
filehandle was created by fh_compose() in an earlier operation of the
nfsv4 compound, then we may not have done these checks yet.

Without this fix it is possible, for example, to traverse from an export
without the secure ports requirement to one with it in a single
compound, and bypass the secure port check on the new export.

While we're here, fix up some minor style problems and change a printk()
to a dprintk(), to make it harder for random unprivileged users to spam
the logs.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reviewed-By: NeilBrown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoknfsd: fix spurious EINVAL errors on first access of new filesystem
J. Bruce Fields [Mon, 12 Nov 2007 21:05:02 +0000 (16:05 -0500)]
knfsd: fix spurious EINVAL errors on first access of new filesystem

patch ac8587dcb58e40dd336d99d60f852041e06cc3dd in mainline.

The v2/v3 acl code in nfsd is translating any return from fh_verify() to
nfserr_inval.  This is particularly unfortunate in the case of an
nfserr_dropit return, which is an internal error meant to indicate to
callers that this request has been deferred and should just be dropped
pending the results of an upcall to mountd.

Thanks to Roland <devzero@web.de> for bug report and data collection.

Cc: Roland <devzero@web.de>
Acked-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reviewed-By: NeilBrown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoraid5: fix unending write sequence
Dan Williams [Thu, 15 Nov 2007 00:59:35 +0000 (16:59 -0800)]
raid5: fix unending write sequence

patch 6c55be8b962f1bdc592d579e81fc27b11ea53dfc in mainline.

<debug output from Joel's system>
handling stripe 7629696, state=0x14 cnt=1, pd_idx=2 ops=0:0:0
check 5: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800ffcffcc0 written 0000000000000000
check 4: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800fdd4e360 written 0000000000000000
check 3: state 0x1 toread 0000000000000000 read 0000000000000000 write 0000000000000000 written 0000000000000000
check 2: state 0x1 toread 0000000000000000 read 0000000000000000 write 0000000000000000 written 0000000000000000
check 1: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800ff517e40 written 0000000000000000
check 0: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800fd4cae60 written 0000000000000000
locked=4 uptodate=2 to_read=0 to_write=4 failed=0 failed_num=0
for sector 7629696, rmw=0 rcw=0
</debug>

These blocks were prepared to be written out, but were never handled in
ops_run_biodrain(), so they remain locked forever.  The operations flags
are all clear which means handle_stripe() thinks nothing else needs to be
done.

This state suggests that the STRIPE_OP_PREXOR bit was sampled 'set' when it
should not have been.  This patch cleans up cases where the code looks at
sh->ops.pending when it should be looking at the consistent stack-based
snapshot of the operations flags.

Report from Joel:
Resync done. Patch fix this bug.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Joel Bertrand <joel.bertrand@systella.fr>
Cc: <stable@kernel.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agooProfile: oops when profile_pc() returns ~0LU
Philippe Elie [Thu, 15 Nov 2007 00:58:48 +0000 (16:58 -0800)]
oProfile: oops when profile_pc() returns ~0LU

patch df9d177aa28d50e64bae6fbd6b263833079e3571 in mainline.

Instruction pointer returned by profile_pc() can be a random value.  This
break the assumption than we can safely set struct op_sample.eip field to a
magic value to signal to the per-cpu buffer reader side special event like
task switch ending up in a segfault in get_task_mm() when profile_pc()
return ~0UL.  Fixed by sanitizing the sampled eip and reject/log invalid
eip.

Problem reported by Sami Farin, patch tested by him.

Signed-off-by: Philippe Elie <phil.el@wanadoo.fr>
Tested-by: Sami Farin <safari-kernel@safari.iki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agodrivers/video/ps3fb: fix memset size error
Li Zefan [Thu, 15 Nov 2007 00:58:33 +0000 (16:58 -0800)]
drivers/video/ps3fb: fix memset size error

patch 3cc2c17700c98b0af778566b0af6292b23b01430 in mainline.

The size passing to memset is wrong.

Signed-off-by Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoi2c/eeprom: Hide Sony Vaio serial numbers
Jean Delvare [Fri, 16 Nov 2007 09:34:17 +0000 (10:34 +0100)]
i2c/eeprom: Hide Sony Vaio serial numbers

patch 0f2cbd38aa377e30df3b7602abed69464d1970aa in mainline.

The sysfs interface to DMI data takes care to not make the system
serial number and UUID world-readable, presumably due to privacy
concerns. For consistency, we should not let the eeprom driver
export these same strings to the world on Sony Vaio laptops.
Instead, only make them readable by root, as we already do for BIOS
passwords.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoi2c/eeprom: Recognize VGN as a valid Sony Vaio name prefix
Jean Delvare [Fri, 16 Nov 2007 09:37:55 +0000 (10:37 +0100)]
i2c/eeprom: Recognize VGN as a valid Sony Vaio name prefix

patch 8b925a3dd8a4d7451092cb9aa11da727ba69e0f0 in mainline.

Recent (i.e. 2005 and later) Sony Vaio laptops have names beginning
with VGN rather than PCG. Update the eeprom driver so that it
recognizes these.

Why this matters: the eeprom driver hides private data from the
EEPROMs it recognizes as Vaio EEPROMs (passwords, serial number...) so
if the driver fails to recognize a Vaio EEPROM as such, the private
data is exposed to the world.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoi2c-pasemi: Fix NACK detection
Jean Delvare [Fri, 16 Nov 2007 09:24:36 +0000 (10:24 +0100)]
i2c-pasemi: Fix NACK detection

patch be8a1f7cd4501c3b4b32543577a33aee6d2193ac in mainline.

Turns out we don't actually check the status to see if there was a
device out there to talk to, just if we had a timeout when doing so.

Add the proper check, so we don't falsly think there are devices
on the bus that are not there, etc.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoLinux 2.6.23.8 v2.6.23.8
Greg Kroah-Hartman [Fri, 16 Nov 2007 18:14:27 +0000 (10:14 -0800)]
Linux 2.6.23.8

16 years agowait_task_stopped: Check p->exit_state instead of TASK_TRACED (CVE-2007-5500)
Roland McGrath [Wed, 14 Nov 2007 06:11:50 +0000 (22:11 -0800)]
wait_task_stopped: Check p->exit_state instead of TASK_TRACED (CVE-2007-5500)

patch a3474224e6a01924be40a8255636ea5522c1023a in mainline

The original meaning of the old test (p->state > TASK_STOPPED) was
"not dead", since it was before TASK_TRACED existed and before the
state/exit_state split.  It was a wrong correction in commit
14bf01bb0599c89fc7f426d20353b76e12555308 to make this test for
TASK_TRACED instead.  It should have been changed when TASK_TRACED
was introducted and again when exit_state was introduced.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Kees Cook <kees@ubuntu.com>
Acked-by: Scott James Remnant <scott@ubuntu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoTCP: Make sure write_queue_from does not begin with NULL ptr (CVE-2007-5501)
Ilpo Järvinen [Wed, 14 Nov 2007 23:47:18 +0000 (15:47 -0800)]
TCP: Make sure write_queue_from does not begin with NULL ptr (CVE-2007-5501)

patch 96a2d41a3e495734b63bff4e5dd0112741b93b38 in mainline.

NULL ptr can be returned from tcp_write_queue_head to cached_skb
and then assigned to skb if packets_out was zero. Without this,
system is vulnerable to a carefully crafted ACKs which obviously
is remotely triggerable.

Besides, there's very little that needs to be done in sacktag
if there weren't any packets outstanding, just skipping the rest
doesn't hurt.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoLinux 2.6.23.7 v2.6.23.7
Greg Kroah-Hartman [Fri, 16 Nov 2007 17:43:31 +0000 (09:43 -0800)]
Linux 2.6.23.7

16 years agoNFS: Fix a writeback race...
Trond Myklebust [Fri, 19 Oct 2007 21:26:11 +0000 (17:26 -0400)]
NFS: Fix a writeback race...

patch 61e930a904966cc37e0a3404276f0b73037e57ca in mainline

This patch fixes a regression that was introduced by commit
44dd151d5c21234cc534c47d7382f5c28c3143cd

We cannot zero the user page in nfs_mark_uptodate() any more, since

  a) We'd be modifying the page without holding the page lock
  b) We can race with other updates of the page, most notably
     because of the call to nfs_wb_page() in nfs_writepage_setup().

Instead, we do the zeroing in nfs_update_request() if we see that we're
creating a request that might potentially be marked as up to date.

Thanks to Olivier Paquet for reporting the bug and providing a test-case.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoocfs2: fix write() performance regression
Mark Fasheh [Mon, 12 Nov 2007 22:09:22 +0000 (14:09 -0800)]
ocfs2: fix write() performance regression

patch 4e9563fd55ff4479f2b118d0757d121dd0cfc39c in mainline.

ocfs2: fix write() performance regression

On file systems which don't support sparse files, Ocfs2_map_page_blocks()
was reading blocks on appending writes. This caused write performance to
suffer dramatically. Fix this by detecting an appending write on a nonsparse
fs and skipping the read.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agominixfs: limit minixfs printks on corrupted dir i_size (CVE-2006-6058)
Eric Sandeen [Wed, 17 Oct 2007 06:27:15 +0000 (23:27 -0700)]
minixfs: limit minixfs printks on corrupted dir i_size (CVE-2006-6058)

patch f44ec6f3f89889a469773b1fd894f8fcc07c29cf upstream.

This attempts to address CVE-2006-6058
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-6058

first reported at http://projects.info-pull.com/mokb/MOKB-17-11-2006.html

Essentially a corrupted minix dir inode reporting a very large
i_size will loop for a very long time in minix_readdir, minix_find_entry,
etc, because on EIO they just move on to try the next page.  This is
under the BKL, printk-storming as well.  This can lock up the machine
for a very long time.  Simply ratelimiting the printks gets things back
under control.  Make the message a bit more informative while we're here.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Bodo Eggert <7eggert@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoLinux 2.6.23.6 v2.6.23.6
Greg Kroah-Hartman [Fri, 16 Nov 2007 17:33:58 +0000 (09:33 -0800)]
Linux 2.6.23.6

16 years agoACPI: suspend: Wrong order of GPE restore.
Alexey Starikovskiy [Tue, 13 Nov 2007 00:09:01 +0000 (19:09 -0500)]
ACPI: suspend: Wrong order of GPE restore.

commit 1dbc1fda5d8ca907f320b806005d4a447977d26a in mainline.

ACPI: suspend: Wrong order of GPE restore.

acpi_leave_sleep_state() should have correct list of wake and
runtime GPEs, which is available only after disable_wakeup_device()
is called.

[cebbert@redhat.com: backport to 2.6.23]

Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoACPI: sleep: Fix GPE suspend cleanup
Alexey Starikovskiy [Tue, 13 Nov 2007 00:06:40 +0000 (19:06 -0500)]
ACPI: sleep: Fix GPE suspend cleanup

patch is 9c1c6a1ba786d58bd03e27ee49f89a5685e8e07b in mainline.

ACPI: sleep: Fix GPE suspend cleanup

Commit 9b039330808b83acac3597535da26f47ad1862ce removed
acpi_gpe_sleep_prepare(), the only function used at S5 transition
Add call to generic acpi_enable_wake_device().

Reference: https://bugzilla.novell.com/show_bug.cgi?id=299882

Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agolibata: backport ATA_FLAG_NO_SRST and ATA_FLAG_ASSUME_ATA, part 2
Tejun Heo [Thu, 25 Oct 2007 06:53:19 +0000 (15:53 +0900)]
libata: backport ATA_FLAG_NO_SRST and ATA_FLAG_ASSUME_ATA, part 2

Differs from mainline, but the functionality is already there.

P5W-DH Deluxe has ICH7R which doesn't have PMP support but SIMG 4726
hardwired to the second port of AHCI controller at PCI device 1f.2.
The 4726 doesn't work as PMP but as a storage processor which can do
hardware RAID on downstream ports.

When no device is attached to the downstream port of the 4726, pseudo
ATA device for configuration appears.  Unfortunately, ATA emulation on
the device is very lousy and causes long hang during boot.

This patch implements workaround for the board.  If the mainboard is
P5W-DH Deluxe (matched using DMI), only hardreset is used on the
second port of AHCI controller @ 1f.2 and the hardreset doesn't depend
on receiving the first FIS and just proceed to IDENTIFY.

This workaround fixes bugzilla #8923.

  http://bugzilla.kernel.org/show_bug.cgi?id=8923

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agolibata: backport ATA_FLAG_NO_SRST and ATA_FLAG_ASSUME_ATA
Tejun Heo [Thu, 25 Oct 2007 06:51:57 +0000 (15:51 +0900)]
libata: backport ATA_FLAG_NO_SRST and ATA_FLAG_ASSUME_ATA

Differs from mainline, but the functionality is already there.

Backport ATA_FLAG_NO_SRST and ATA_FLAG_ASSUME_ATA.  These are
originally link flags (ATA_LFLAG_*) but link abstraction doesn't exist
on 2.6.23, so make it port flags.

This is for the following workaround for ASUS P5W DH Deluxe.

These new flags don't introduce any behavior change unless set and
nobody sets them yet.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agolibata: add HTS542525K9SA00 to NCQ blacklist
Tejun Heo [Wed, 24 Oct 2007 02:47:45 +0000 (11:47 +0900)]
libata: add HTS542525K9SA00 to NCQ blacklist

patch e14cbfa630cd3ab2631ee21b718b290928f47868 in mainline.

Another one doing spurious NCQ completions.  Blacklist it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoradeon: set the address to access the GART table on the CPU side correctly
Dave Airlie [Tue, 6 Nov 2007 00:33:10 +0000 (00:33 +0000)]
radeon: set the address to access the GART table on the CPU side correctly

Upstream as 7fc86860cf73e060ab8ed9763010dfe5b5389b1c

This code relied on the CPU and GPU address for the aperture being the same,
On some r5xx hardware I was playing with I noticed that this isn't always true.
This fixes issues seen on some r400 cards. (bugs.freedesktop.org 9957)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoChar: moxa, fix and optimise empty timer
Jiri Slaby [Thu, 18 Oct 2007 10:06:19 +0000 (03:06 -0700)]
Char: moxa, fix and optimise empty timer

patch c43422053bea7a5ce09f18d0c50a606fe1a549f4 in mainline.

moxa, fix and optimise empty timer

don't wait and delete empty timer in empty timer function. Also fire next
empty timer at rounded jiffies to save power.

This fixes a lockup, because we wait for ourselves to finish forever.
(i.e.  sync called from the timer itself).

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoChar: rocket, fix dynamic_dev tty
Jiri Slaby [Thu, 18 Oct 2007 10:06:26 +0000 (03:06 -0700)]
Char: rocket, fix dynamic_dev tty

patch ac6aec2f5683588361ab408cb3346b08c66bdfbe in mainline.

- register_device unconditionally (non-pci dependent) to have also isa
  devices in /dev
- unregister devices on module removal
- don't set TTY_DRIVER_DYNAMIC_DEV twice (removed the one dependent on some
  macro)

This is the substantial part of the patch and the previous point is for
not checking which devices to unregister and which not (simply register
and unregister all found no matter on which bus they are plugged).

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Ferenc Wagner <wferi@niif.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agohptiop: avoid buffer overflow when returning sense data
HighPoint Linux Team [Tue, 16 Oct 2007 21:28:24 +0000 (14:28 -0700)]
hptiop: avoid buffer overflow when returning sense data

patch 0fec02c93f60fb44ba3a24a0d3e4a52521d34d3f in mainline.

avoid buffer overflow when returning sense data.

With current adapter firmware the driver is working but future firmware
updates may return sense data larger than 96 bytes, causing overflow on
scp->sense_buffer and a kernel crash.

This fix should be backported to earlier kernels.

Signed-off-by: HighPoint Linux Team <linux@highpoint-tech.com>
Signed-off-by: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoide: Fix cs5535 driver accessing beyond array boundary
Benjamin Herrenschmidt [Thu, 18 Oct 2007 22:30:05 +0000 (00:30 +0200)]
ide: Fix cs5535 driver accessing beyond array boundary

patch 15d8061bf02aa299b2447f7a22fd18b4a503ea9d in mainline.

The cs5535 uses an incorrect construct to access the other drive of a pair,
causing it to access beyond an array boundary on the secondary interface.

This fixes it by using the new ide_get_paired_drive() helper instead.

Bart: patch description fixes

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andrew Morton <akpm@osdl.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoide: Fix siimage driver accessing beyond array boundary
Benjamin Herrenschmidt [Thu, 18 Oct 2007 22:30:05 +0000 (00:30 +0200)]
ide: Fix siimage driver accessing beyond array boundary

patch a87a87ccdc541e0a0cc8c7d01a365be8d9153a7b in mainline.

The siimage uses an incorrect construct to access the other drive of a pair,
causing it to access beyond an array boundary on the secondary interface.

This fixes it by using the new ide_get_paired_drive() helper instead.

Bart: patch description fixes

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andrew Morton <akpm@osdl.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoide: Add ide_get_paired_drive() helper
Benjamin Herrenschmidt [Thu, 18 Oct 2007 22:30:05 +0000 (00:30 +0200)]
ide: Add ide_get_paired_drive() helper

patch 1b678347121001c3c230c6eccfdf9f65c3ec1a4e in mainline.

This adds a helper to get to the "other" drive on a pair connected
to a given hwif.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andrew Morton <akpm@osdl.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoide: fix serverworks.c UDMA regression
Tony Battersby [Tue, 16 Oct 2007 20:29:52 +0000 (22:29 +0200)]
ide: fix serverworks.c UDMA regression

patch 0c824b51b338c808de650b440ba5f9f4a725f7fc in mainline.

The patch described by the following excerpt from ChangeLog-2.6.22 makes
it impossible to use UDMA on a Tyan S2707 motherboard (SvrWks CSB5):

commit 2d5eaa6dd744a641e75503232a01f52d0768884c
Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Date:   Thu May 10 00:01:08 2007 +0200

    ide: rework the code for selecting the best DMA transfer mode (v3)

    ...

This one-line patch against 2.6.23 fixes the problem.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoi4l: fix random freezes with AVM B1 drivers
Karsten Keil [Thu, 18 Oct 2007 10:04:31 +0000 (03:04 -0700)]
i4l: fix random freezes with AVM B1 drivers

patch 9713d9e650045f7f2afd81d58a068827be306993 in mainline.

This fix the same issue which was debbuged for the C4 controller for the B1
versions.

The capilib_ function modify or traverse a linked list without locking.

This patch extends the existing locking to the calls of these function to
prevent access to a list which is in the middle of a modification.

Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoi4l: Fix random hard freeze with AVM c4 card
Karsten Keil [Thu, 18 Oct 2007 10:04:32 +0000 (03:04 -0700)]
i4l: Fix random hard freeze with AVM c4 card

patch 1ccfd63367c1a6aaf8b33943f18856dde85f2f0b in mainline.

The patch
- Includes the call to capilib_data_b3_req in the spinlock. This routine
  in turn calls the offending mq_enqueue routine that triggered the
  freeze if not locked.  This should also fix other indicators of
  incosistent capilib_msgidqueue list, that trigger messages like:
  Oct  5 03:05:57 BERL0 kernel: kcapi: msgid 3019 ncci 0x30301 not on queue
  that we saw several times a day (usually several in a row).
- Fixes all occurrences of c4_dispatch_tx to be called with active
  spinlock, there were some instances where no lock was active. Mostly
  these are in very infrequently called routines, so the additional
  performance penalty is minimal.

Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Rainer Brestan <rainer.brestan@frequentis.com>
Signed-off-by: Ralf Schlatterbeck <rsc@runtux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoALSA: hda-codec - Add array terminator for dmic in STAC codec
Takashi Iwai [Mon, 15 Oct 2007 12:37:11 +0000 (14:37 +0200)]
ALSA: hda-codec - Add array terminator for dmic in STAC codec

patch f6e9852ad05fa28301c83d4e2b082620de010358 in mainline.

[ALSA] hda-codec - Add array terminator for dmic in STAC codec

Reported by Jan-Marek Glogowski.

The dmic array is passed to snd_hda_parse_pin_def_config() and
should be zero-terminated.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: usbserial - fix potential deadlock between write() and IRQ
Jiri Kosina [Fri, 19 Oct 2007 22:05:19 +0000 (00:05 +0200)]
USB: usbserial - fix potential deadlock between write() and IRQ

patch acd2a847e7fee7df11817f67dba75a2802793e5d in mainline.

USB: usbserial - fix potential deadlock between write() and IRQ

usb_serial_generic_write() doesn't disable interrupts when taking port->lock,
and could therefore deadlock with usb_serial_generic_read_bulk_callback()
being called from interrupt, taking the same lock. Fix it.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Larry Finger <larry.finger@lwfinger.net>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: add URB_FREE_BUFFER to permissible flags
Oliver Neukum [Thu, 25 Oct 2007 20:14:04 +0000 (13:14 -0700)]
USB: add URB_FREE_BUFFER to permissible flags

patch 0b28baaf74ca04be2e0cc4d4dd2bbc801697f744 in mainline.

URB_FREE_BUFFER needs to be allowed in the sanity checks to use drivers that
use that flag.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: mutual exclusion for EHCI init and port resets
Alan Stern [Fri, 12 Oct 2007 22:19:14 +0000 (15:19 -0700)]
USB: mutual exclusion for EHCI init and port resets

patch 32fe01985aa2cb2562f6fc171e526e279abe10db in mainline.

This patch (as999) fixes a problem that sometimes shows up when host
controller driver modules are loaded in the wrong order.  If ehci-hcd
happens to initialize an EHCI controller while the companion OHCI or
UHCI controller is in the middle of a port reset, the reset can fail
and the companion may get very confused.  The patch adds an
rw-semaphore and uses it to keep EHCI initialization and port resets
mutually exclusive.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: David Brownell <david-b@pacbell.net>
Cc: David Miller <davem@davemloft.net>
Cc: Dely L Sy <dely.l.sy@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agousb-gadget-ether: prevent oops caused by error interrupt race
Benedikt Spranger [Fri, 12 Oct 2007 22:18:59 +0000 (15:18 -0700)]
usb-gadget-ether: prevent oops caused by error interrupt race

patch 5395353e0c8272fe73ac914acd7e4add0da2bef0 in mainline.

Fix a longstanding race in the Ethernet gadget driver, which can cause an
oops on device disconnect.  The fix is just to make the TX path check
whether its freelist is empty.  That check is otherwise not necessary,
since the queue is always stopped when that list empties (and restarted
when request completion puts an entry back on that freelist).

The race window starts when the network code decides to transmit a packet,
and ends when hard_start_xmit() grabs the freelist lock.  When disconnect()
is called inside that window, it shuts down the TX queue and breaks the
otherwise-solid assumption that packets are never sent through a TX queue
that's stopped.

Signed-off-by: Benedikt Spranger <bene@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: remove USB_QUIRK_NO_AUTOSUSPEND
Alan Stern [Fri, 12 Oct 2007 22:18:49 +0000 (15:18 -0700)]
USB: remove USB_QUIRK_NO_AUTOSUSPEND

patch a691efa9888e71232dfb4088fb8a8304ffc7b0f9 in mainline.

This patch (as995) cleans up the remains of the former NO_AUTOSUSPEND
quirk.  Since autosuspend is disabled by default, we will let
userspace worry about which devices can safely be suspended.  Thus the
lengthy series of quirk entries is no longer needed, and neither is
the quirk ID.  I suppose someone might eventually run across a hub
that can't be suspended; let's ignore the possibility for now.

The patch also cleans up the hasty way in which autosuspend gets
disabled.  Setting udev->autosuspend_delay to -1 wasn't quite right,
because the value is always supposed to be a multiple of HZ.  It's
better to leave the delay value alone and set autosuspend_disabled,
which is what the quirk routine used to do.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoMSI: Use correct data offset for 32-bit MSI in read_msi_msg()
Roland Dreier [Fri, 12 Oct 2007 22:16:16 +0000 (15:16 -0700)]
MSI: Use correct data offset for 32-bit MSI in read_msi_msg()

patch cbf5d9e6b9bcf03291cbb51db144b3e2773a8a2d in mainline.

While reading the MSI code trying to find a reason why MSI wouldn't
work for devices that have a 32-bit MSI address capability, I noticed
that read_msi_msg() seems to read the message data from the wrong
offset in this case.

Signed-off-by: Roland Dreier <roland@digitalvampire.org>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agomd: raid5: fix clearing of biofill operations
Dan Williams [Tue, 23 Oct 2007 03:45:11 +0000 (20:45 -0700)]
md: raid5: fix clearing of biofill operations

raid5: fix clearing of biofill operations

This is the correct merge of the two upstream patches for this issue (it
was mis-merged...)

ops_complete_biofill() runs outside of spin_lock(&sh->lock) and clears the
'pending' and 'ack' bits.  Since the test_and_ack_op() macro only checks
against 'complete' it can get an inconsistent snapshot of pending work.

Move the clearing of these bits to handle_stripe5(), under the lock.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Joel Bertrand <joel.bertrand@systella.fr>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agomd: fix an unsigned compare to allow creation of bitmaps with v1.0 metadata
NeilBrown [Tue, 23 Oct 2007 03:45:11 +0000 (20:45 -0700)]
md: fix an unsigned compare to allow creation of bitmaps with v1.0 metadata

patch 85bfb4da8cad483a4e550ec89060d05a4daf895b in mainline.

As page->index is unsigned, this all becomes an unsigned comparison, which
 almost always returns an error.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agodm: fix thaw_bdev
Jun'ichi Nomura [Fri, 12 Oct 2007 17:15:25 +0000 (18:15 +0100)]
dm: fix thaw_bdev

patch ae9da83f6d800fe1f3b23bfbc8f7222ad1c5bb74 in mainline.

This patch fixes a bd_mount_sem counter corruption bug in device-mapper.

thaw_bdev() should be called only when freeze_bdev() was called for the
device.
Otherwise, thaw_bdev() will up bd_mount_sem and corrupt the semaphore counter.
struct block_device with the corrupted semaphore may remain in slab cache
and be reused later.

Attached patch will fix it by calling unlock_fs() instead.
unlock_fs() will determine whether it should call thaw_bdev()
by checking the device is frozen or not.

Easy reproducer is:
  #!/bin/sh
  while [ 1 ]; do
     dmsetup --notable create a
     dmsetup --nolockfs suspend a
     dmsetup remove a
  done

It's not easy to see the effect of corrupted semaphore.
So I have tested with putting printk below in bdev_alloc_inode():
        if (atomic_read(&ei->bdev.bd_mount_sem.count) != 1)
                printk(KERN_DEBUG "Incorrect semaphore count = %d (%p)\n",
                        atomic_read(&ei->bdev.bd_mount_sem.count),
                        &ei->bdev);

Without the patch, I saw something like:
 Incorrect semaphore count = 17 (f2ab91c0)

With the patch, the message didn't appear.

The bug was introduced in 2.6.16 with this bug fix:

commit d9dde59ba03095e526640988c0fedd75e93bc8b7
Date:   Fri Feb 24 13:04:24 2006 -0800

    [PATCH] dm: missing bdput/thaw_bdev at removal

    Need to unfreeze and release bdev otherwise the bdev inode with
    inconsistent state is reused later and cause problem.

and backported to 2.6.15.5.

It occurs only in free_dev(), which is called only when the dm device is
removed.  The buggy code is executed only if md->suspended_bdev is
non-NULL and that can happen only when the device was suspended without
noflush.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agodm delay: fix status
Milan Broz [Fri, 12 Oct 2007 17:14:55 +0000 (18:14 +0100)]
dm delay: fix status

patch 79662d1ea37392651f2cff08626cab6a40ba3adc in mainline.

Fix missing space in dm-delay target status output
if separate read and write delay are configured.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agolibata: sync NCQ blacklist with upstream
Tejun Heo [Thu, 11 Oct 2007 01:55:15 +0000 (10:55 +0900)]
libata: sync NCQ blacklist with upstream

Synchronize NCQ blacklist with the current upstream.  Based on changes
already in Linus's 2.6.24-rc kernel tree.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>