]> git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
15 years agoLinux 2.6.24.6 v2.6.24.6
Greg Kroah-Hartman [Thu, 1 May 2008 21:50:00 +0000 (14:50 -0700)]
Linux 2.6.24.6

15 years agoFix dnotify/close race (CVE-2008-1375)
Al Viro [Thu, 1 May 2008 02:52:22 +0000 (03:52 +0100)]
Fix dnotify/close race (CVE-2008-1375)

commit 214b7049a7929f03bbd2786aaef04b8b79db34e2 upstream.

We have a race between fcntl() and close() that can lead to
dnotify_struct inserted into inode's list *after* the last descriptor
had been gone from current->files.

Since that's the only point where dnotify_struct gets evicted, we are
screwed - it will stick around indefinitely.  Even after struct file in
question is gone and freed.  Worse, we can trigger send_sigio() on it at
any later point, which allows to send an arbitrary signal to arbitrary
process if we manage to apply enough memory pressure to get the page
that used to host that struct file and fill it with the right pattern...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoISDN: Do not validate ISDN net device address prior to interface-up
Paul Bolle [Mon, 14 Apr 2008 05:44:20 +0000 (22:44 -0700)]
ISDN: Do not validate ISDN net device address prior to interface-up

Commit bada339 (Validate device addr prior to interface-up) caused a regression
in the ISDN network code, see: http://bugzilla.kernel.org/show_bug.cgi?id=9923
The trivial fix is to remove the pointer to eth_validate_addr() in the
net_device struct in isdn_net_init().

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoV4L: cx88: enable radio GPIO correctly
Steven Toth [Fri, 25 Apr 2008 00:52:42 +0000 (20:52 -0400)]
V4L: cx88: enable radio GPIO correctly

This patch fixes an issue on the HVR1300, where GPIO is blown away due to
the radio input being undefined, breaking the functionality of the DVB
demodulator and MPEG2 encoder used on the cx8802 mpeg TS port.

This is a minimal patch for 2.6.26 and the -stable series.  This must be
fixed a better way for 2.6.27.

Signed-off-by: Steven Toth <stoth@hauppauge.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
(cherry picked from commit 6b92b3bd7ac91b7e255541f4be9bfd55b12dae41)
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoV4L: Fix VIDIOCGAP corruption in ivtv
Alan Cox [Fri, 25 Apr 2008 00:52:26 +0000 (20:52 -0400)]
V4L: Fix VIDIOCGAP corruption in ivtv

Frank Bennett reported that ivtv was causing skype to crash. With help
from one of their developers he showed it was a kernel problem.
VIDIOCGCAP copies a name into a fixed length buffer - ivtv uses names
that are too long and does not truncate them so corrupts a few bytes of
the app data area.

Possibly the names also want trimming but for now this should fix the
corruption case.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
(cherry picked from commit d2b213f7b76f187c4391079c7581d3a08b940133)
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: remove broken usb-serial num_endpoints check
Greg Kroah-Hartman [Thu, 17 Apr 2008 03:05:15 +0000 (03:05 +0000)]
USB: remove broken usb-serial num_endpoints check

commit: 07c3b1a1001614442c665570942a3107a722c314

The num_interrupt_in, num_bulk_in, and other checks in the usb-serial
code are just wrong, there are too many different devices out there with
different numbers of endpoints.  We need to just be sticking with the
device ids instead of trying to catch this kind of thing.  It broke too
many different devices.

This fixes a large number of usb-serial devices to get them working
properly again.

Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoIncrease the max_burst threshold from 3 to tp->reordering.
John Heffner [Fri, 25 Apr 2008 08:43:57 +0000 (01:43 -0700)]
Increase the max_burst threshold from 3 to tp->reordering.

[ Upstream commit: dd9e0dda66ba38a2ddd1405ac279894260dc5c36 ]

This change is necessary to allow cwnd to grow during persistent
reordering.  Cwnd moderation is applied when in the disorder state
and an ack that fills the hole comes in.  If the hole was greater
than 3 packets, but less than tp->reordering, cwnd will shrink when
it should not have.

Signed-off-by: John Heffner <jheffner@napa.none>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoJFFS2: Fix free space leak with in-band cleanmarkers
David Woodhouse [Wed, 23 Apr 2008 10:15:35 +0000 (11:15 +0100)]
JFFS2: Fix free space leak with in-band cleanmarkers

We were accounting for the cleanmarker by calling jffs2_link_node_ref()
(without locking!), which adjusted both superblock and per-eraseblock
accounting, subtracting the size of the cleanmarker from {jeb,c}->free_size
and adding it to {jeb,c}->used_size.

But only _then_ were we adding the size of the newly-erased block back
to the superblock counts, and we were adding each of jeb->{free,used}_size
to the corresponding superblock counts. Thus, the size of the cleanmarker
was effectively subtracted from the superblock's free_size _twice_.

Fix this, by always adding a full eraseblock size to c->free_size when
we've erased a block. And call jffs2_link_node_ref() under the proper
lock, while we're at it.

Thanks to Alexander Yurchenko and/or Damir Shayhutdinov for (almost)
pinpointing the problem.

[Backport of commit 014b164e1392a166fe96e003d2f0e7ad2e2a0bb7]

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: gadget: queue usb USB_CDC_GET_ENCAPSULATED_RESPONSE message
Jan Altenberg [Tue, 19 Feb 2008 00:44:50 +0000 (01:44 +0100)]
USB: gadget: queue usb USB_CDC_GET_ENCAPSULATED_RESPONSE message

backport of 41566bcf35a8b23ce4715dadb5acfd1098c1d3e4

commit 0cf4f2de0a0f4100795f38ef894d4910678c74f8 introduced a bug, which
prevents sending an USB_CDC_GET_ENCAPSULATED_RESPONSE message. This
breaks the RNDIS initialization (especially / only Windoze machines
dislike this behavior...).

Signed-off-by: Benedikt Spranger <b.spranger@linutronix.de>
Signed-off-by: Jan Altenberg <jan.altenberg@linutronix.de>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Vernon Sauder <vernoninhand@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agotehuti: move ioctl perm check closer to function start (CVE-2008-1675)
Jeff Garzik [Fri, 25 Apr 2008 07:11:31 +0000 (03:11 -0400)]
tehuti: move ioctl perm check closer to function start (CVE-2008-1675)

Commit f946dffed6334f08da065a89ed65026ebf8b33b4 upstream

Noticed by davem.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agotehuti: check register size (CVE-2008-1675)
Francois Romieu [Sun, 20 Apr 2008 17:32:34 +0000 (19:32 +0200)]
tehuti: check register size (CVE-2008-1675)

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86: Fix 32-bit x86 MSI-X allocation leakage
PJ Waskiewicz [Mon, 28 Apr 2008 18:56:03 +0000 (11:56 -0700)]
x86: Fix 32-bit x86 MSI-X allocation leakage

commit 9d9ad4b51d2b29b5bbeb4011f5e76f7538119cf9 upstream

This bug was introduced in the 2.6.24 lguest tree merge, where
MSI-X vector allocation will eventually fail.  The cause is the new
bit array tracking used vectors is not getting cleared properly on
IRQ destruction on the 32-bit APIC code.

This can be seen easily using the ixgbe 10 GbE driver on multi-core
systems by simply loading and unloading the driver a few times.
Depending on the number of available vectors on the host system, the
MSI-X allocation will eventually fail, and the driver will only be
able to use legacy interrupts.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agofix oops on rmmod capidrv
Karsten Keil [Fri, 25 Jan 2008 10:55:28 +0000 (11:55 +0100)]
fix oops on rmmod capidrv

commit eb36f4fc019835cecf0788907f6cab774508087b upstream.

Fix overwriting the stack with the version string
(it is currently 10 bytes + zero) when unloading the
capidrv module. Safeguard against overwriting it
should the version string grow in the future.

Should fix Kernel Bug Tracker Bug 9696.

Signed-off-by: Gerd v. Egidy <gerd.von.egidy@intra2net.com>
Acked-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosplice: use mapping_gfp_mask
Hugh Dickins [Thu, 3 Apr 2008 22:35:22 +0000 (23:35 +0100)]
splice: use mapping_gfp_mask

upstream commit: 4cd13504652d28e16bf186c6bb2bbb3725369383

The loop block driver is careful to mask __GFP_IO|__GFP_FS out of its
mapping_gfp_mask, to avoid hangs under memory pressure.  But nowadays
it uses splice, usually going through __generic_file_splice_read.  That
must use mapping_gfp_mask instead of GFP_KERNEL to avoid those hangs.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoLinux 2.6.24.5 v2.6.24.5
Chris Wright [Sat, 19 Apr 2008 01:53:39 +0000 (18:53 -0700)]
Linux 2.6.24.5

16 years agolocks: fix possible infinite loop in fcntl(F_SETLKW) over nfs
J. Bruce Fields [Mon, 14 Apr 2008 19:03:02 +0000 (15:03 -0400)]
locks: fix possible infinite loop in fcntl(F_SETLKW) over nfs

upstream commit: 19e729a928172103e101ffd0829fd13e68c13f78

Miklos Szeredi found the bug:

"Basically what happens is that on the server nlm_fopen() calls
nfsd_open() which returns -EACCES, to which nlm_fopen() returns
NLM_LCK_DENIED.

"On the client this will turn into a -EAGAIN (nlm_stat_to_errno()),
which in will cause fcntl_setlk() to retry forever."

So, for example, opening a file on an nfs filesystem, changing
permissions to forbid further access, then trying to lock the file,
could result in an infinite loop.

And Trond Myklebust identified the culprit, from Marc Eshel and I:

7723ec9777d9832849b76475b1a21a2872a40d20 "locks: factor out
generic/filesystem switch from setlock code"

That commit claimed to just be reshuffling code, but actually introduced
a behavioral change by calling the lock method repeatedly as long as it
returned -EAGAIN.

We assumed this would be safe, since we assumed a lock of type SETLKW
would only return with either success or an error other than -EAGAIN.
However, nfs does can in fact return -EAGAIN in this situation, and
independently of whether that behavior is correct or not, we don't
actually need this change, and it seems far safer not to depend on such
assumptions about the filesystem's ->lock method.

Therefore, revert the problematic part of the original commit.  This
leaves vfs_lock_file() and its other callers unchanged, while returning
fcntl_setlk and fcntl_setlk64 to their former behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agofile capabilities: remove cap_task_kill()
Serge Hallyn [Fri, 29 Feb 2008 15:14:57 +0000 (15:14 +0000)]
file capabilities: remove cap_task_kill()

upstream commit: aedb60a67c10a0861af179725d060765262ba0fb

The original justification for cap_task_kill() was as follows:

check_kill_permission() does appropriate uid equivalence checks.
However with file capabilities it becomes possible for an
unprivileged user to execute a file with file capabilities
resulting in a more privileged task with the same uid.

However now that cap_task_kill() always returns 0 (permission
granted) when p->uid==current->uid, the whole hook is worthless,
and only likely to create more subtle problems in the corner cases
where it might still be called but return -EPERM.  Those cases
are basically when uids are different but euid/suid is equivalent
as per the check in check_kill_permission().

One example of a still-broken application is 'at' for non-root users.

This patch removes cap_task_kill().

Signed-off-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Earlier-version-tested-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[chrisw@sous-sol.org: backport to 2.6.24.4]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agomacb: Call phy_disconnect on removing
Atsushi Nemoto [Thu, 10 Apr 2008 14:30:07 +0000 (23:30 +0900)]
macb: Call phy_disconnect on removing

upstream commit: 84b7901f8d5a17536ef2df7fd628ab865df8fe3a

Call phy_disconnect() on remove routine.  Otherwise the phy timer
causes a kernel crash when unloading.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agofbdev: fix /proc/fb oops after module removal
Alexey Dobriyan [Wed, 16 Apr 2008 02:45:07 +0000 (02:45 +0000)]
fbdev: fix /proc/fb oops after module removal

upstream commit: c43f89c2084f46e3ec59ddcbc52ecf4b1e9b015a

/proc/fb is not removed during rmmod.

Steps to reproduce:

modprobe fb
rmmod fb
ls /proc

BUG: unable to handle kernel paging request at ffffffffa0094370
IP: [<ffffffff802b92a1>] proc_get_inode+0x101/0x130
PGD 203067 PUD 207063 PMD 17e758067 PTE 0
Oops: 0000 [1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:05:02.0/resource
CPU 1
Modules linked in: nf_conntrack_irc xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables vfat fat usbhid ehci_hcd uhci_hcd usbcore sr_mod cdrom [last unloaded: fb]
Pid: 21205, comm: ls Not tainted 2.6.25-rc8-mm2 #14
RIP: 0010:[<ffffffff802b92a1>]  [<ffffffff802b92a1>] proc_get_inode+0x101/0x130
RSP: 0018:ffff81017c4bfc78  EFLAGS: 00010246
RAX: 0000000000008000 RBX: ffff8101787f5470 RCX: 0000000048011ccc
RDX: ffffffffa0094320 RSI: ffff810006ad43b0 RDI: ffff81017fc2cc00
RBP: ffff81017e450300 R08: 0000000000000002 R09: ffff81017c5d1000
R10: 0000000000000000 R11: 0000000000000246 R12: ffff81016b903a28
R13: ffff81017f822020 R14: ffff81017c4bfd58 R15: ffff81017f822020
FS:  00007f08e71696f0(0000) GS:ffff81017fc06480(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffa0094370 CR3: 000000017e54a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ls (pid: 21205, threadinfo ffff81017c4be000, task ffff81017de48770)
Stack:  ffff81017c5d1000 00000000ffffffea ffff81017e450300 ffffffff802bdd1e
 ffff81017f802258 ffff81017c4bfe48 ffff81016b903a28 ffff81017f822020
 ffff81017c4bfd48 ffffffff802b9ba0 ffff81016b903a28 ffff81017f802258
Call Trace:
 [<ffffffff802bdd1e>] ? proc_lookup_de+0x8e/0x100
 [<ffffffff802b9ba0>] ? proc_root_lookup+0x20/0x60
 [<ffffffff802882a7>] ? do_lookup+0x1b7/0x210
 [<ffffffff8028883d>] ? __link_path_walk+0x53d/0x7f0
 [<ffffffff80295eb8>] ? mntput_no_expire+0x28/0x130
 [<ffffffff80288b4a>] ? path_walk+0x5a/0xc0
 [<ffffffff80288dd3>] ? do_path_lookup+0x83/0x1c0
 [<ffffffff80287785>] ? getname+0xe5/0x210
 [<ffffffff80289adb>] ? __user_walk_fd+0x4b/0x80
 [<ffffffff8028236c>] ? vfs_lstat_fd+0x2c/0x70
 [<ffffffff8028bf1e>] ? filldir+0xae/0xf0
 [<ffffffff802b92e9>] ? de_put+0x9/0x50
 [<ffffffff8029633d>] ? mnt_want_write+0x2d/0x80
 [<ffffffff8029339f>] ? touch_atime+0x1f/0x170
 [<ffffffff802b9b1d>] ? proc_root_readdir+0x7d/0xa0
 [<ffffffff802825e7>] ? sys_newlstat+0x27/0x50
 [<ffffffff8028bffb>] ? vfs_readdir+0x9b/0xd0
 [<ffffffff8028c0fe>] ? sys_getdents+0xce/0xe0
 [<ffffffff8020b39b>] ? system_call_after_swapgs+0x7b/0x80

Code: b7 83 b2 00 00 00 25 00 f0 00 00 3d 00 80 00 00 74 19 48 89 93 f0 00 00 00 48 89 df e8 39 9a fd ff 48 89 d8 48 83 c4 08 5b 5d c3 <48> 83 7a 50 00 48 c7 c0 60 16 45 80 48 c7 c2 40 17 45 80 48 0f
RIP  [<ffffffff802b92a1>] proc_get_inode+0x101/0x130
 RSP <ffff81017c4bfc78>
CR2: ffffffffa0094370
---[ end trace c71hiarjan8ab739 ]---

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
"Antonino A. Daplas" <adaplas@pol.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoacpi: bus: check once more for an empty list after locking it
Chuck Ebbert [Wed, 16 Apr 2008 02:45:05 +0000 (02:45 +0000)]
acpi: bus: check once more for an empty list after locking it

upstream commit: f0a37e008750ead1751b7d5e89d220a260a46147

List could have become empty after the unlocked check that was made earlier,
so check again inside the lock.

Should fix https://bugzilla.redhat.com/show_bug.cgi?id=427765

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Cc: <stable@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoPARISC fix signal trampoline cache flushing
Kyle McMartin [Tue, 15 Apr 2008 22:36:38 +0000 (18:36 -0400)]
PARISC fix signal trampoline cache flushing

upstream commit: cf39cc3b56bc4a562db6242d3069f65034ec7549

The signal trampolines were accidently flushing the kernel I$ instead of
the users.  Fix that up, and also add a missing user D$ flush while
we're at it.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoPARISC pdc_console: fix bizarre panic on boot
Kyle McMartin [Tue, 15 Apr 2008 16:46:03 +0000 (11:46 -0500)]
PARISC pdc_console: fix bizarre panic on boot

upstream commit ef1afd4d79f0479960ff36bb5fe6ec6eba1ebff2

commit 721fdf34167580ff98263c74cead8871d76936e6
Author: Kyle McMartin <kyle@shortfin.cabal.ca>
Date:   Thu Dec 6 09:32:15 2007 -0800

    [PARISC] print more than one character at a time for pdc console

introduced a subtle bug by accidentally removing the "static" from
iodc_dbuf. This resulted in, what appeared to be, a trap without
*current set to a task. Probably the result of a trap in real mode
while calling firmware.

Also do other misc clean ups. Since the only input from firmware is non
blocking, share iodc_dbuf between input and output, and spinlock the
only callers.

[jejb: fixed up rejections against the stable tree]

Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoPARISC futex: special case cmpxchg NULL in kernel space
Kyle McMartin [Tue, 15 Apr 2008 15:45:11 +0000 (10:45 -0500)]
PARISC futex: special case cmpxchg NULL in kernel space

upstream commit: c20a84c91048c76c1379011c96b1a5cee5c7d9a0

commit f9e77acd4060fefbb60a351cdb8d30fca27fe194
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun Feb 24 02:10:05 2008 +0000

    futex: runtime enable pi and robust functionality

which was backported to stable based on mainline Commit
a0c1e9073ef7428a14309cba010633a6cd6719ea added code to futex.c
to detect whether futex_atomic_cmpxchg_inatomic was implemented at run
time:

+       curval = cmpxchg_futex_value_locked(NULL, 0, 0);
+       if (curval == -EFAULT)
+               futex_cmpxchg_enabled = 1;

This is bogus on parisc, since page zero in kernel virtual space is the
gateway page for syscall entry, and should not be read from the kernel.
(That, and we really don't like the kernel faulting on its own address
 space...)

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agopnpacpi: reduce printk severity for "pnpacpi: exceeded the max number of ..."
Len Brown [Tue, 15 Apr 2008 07:16:56 +0000 (03:16 -0400)]
pnpacpi: reduce printk severity for "pnpacpi: exceeded the max number of ..."

upstream commit 33fd7afd66ffdc6addf1b085fe6403b6af532f8e

We have been printing these messages at KERN_ERR since 2.6.24,
per http://bugzilla.kernel.org/show_bug.cgi?id=9535

But KERN_ERR pops up on a console booted with "quiet"
and causes users to get alarmed and file bugs
about the message itself:
https://bugzilla.redhat.com/show_bug.cgi?id=436589

So reduce the severity of these messages to
KERN_WARNING, which is not printed by "quiet".

This message will still be seen without "quiet",
but a lot of messages are printed in that mode
and it will be less likely to cause undue alarm.

We could go all the way to KERN_DEBUG, but this
is a real warning after all, so it seems prudent
not to require "debug" to see it.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoPOWERPC: Fix build of modular drivers/macintosh/apm_emu.c
Guido Guenther [Tue, 15 Apr 2008 13:45:51 +0000 (13:45 +0000)]
POWERPC: Fix build of modular drivers/macintosh/apm_emu.c

upstream commit: 620a245978d007279bc5c7c64e15f5f63af9af98

Currently, if drivers/macintosh/apm_emu is a module and the config
doesn't have CONFIG_SUSPEND we get:

ERROR: "pmu_batteries" [drivers/macintosh/apm_emu.ko] undefined!
ERROR: "pmu_battery_count" [drivers/macintosh/apm_emu.ko] undefined!
ERROR: "pmu_power_flags" [drivers/macintosh/apm_emu.ko] undefined!

on PPC32.  The variables aren't wrapped in '#if defined(CONFIG_SUSPEND)'
so we probably shouldn't wrap the exports either.  This removes the
CONFIG_SUSPEND part of the export, which fixes compilation on ppc32.

Signed-off-by: Guido Guenther <agx@sigxcpu.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
mpagano@gentoo.org notes:

The details can be found at http://bugs.gentoo.org/show_bug.cgi?id=217629.

Cc: Mike Pagano <mpagano@gentoo.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agomd: close a livelock window in handle_parity_checks5
Dan Williams [Fri, 11 Apr 2008 16:55:06 +0000 (16:55 +0000)]
md: close a livelock window in handle_parity_checks5

upstream commit: bd2ab67030e9116f1e4aae1289220255412b37fd

If a failure is detected after a parity check operation has been initiated,
but before it completes handle_parity_checks5 will never quiesce operations on
the stripe.

Explicitly handle this case by "canceling" the parity check, i.e.  clear the
STRIPE_OP_CHECK flags and queue the stripe on the handle list again to refresh
any non-uptodate blocks.

Kernel versions >= 2.6.23 are susceptible.

Cc: <stable@kernel.org>
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agosignalfd: fix for incorrect SI_QUEUE user data reporting
Davide Libenzi [Fri, 11 Apr 2008 16:55:04 +0000 (16:55 +0000)]
signalfd: fix for incorrect SI_QUEUE user data reporting

upstream commit: 0859ab59a8a48d2a96b9d2b7100889bcb6bb5818

Michael Kerrisk found out that signalfd was not reporting back user data
pushed using sigqueue:

  http://groups.google.com/group/linux.kernel/msg/9397cab8551e3123

The following patch makes signalfd report back the ssi_ptr and ssi_int members
of the signalfd_siginfo structure.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Acked-by: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoplip: replace spin_lock_irq with spin_lock_irqsave in irq context
Mikulas Patocka [Mon, 31 Mar 2008 23:22:45 +0000 (01:22 +0200)]
plip: replace spin_lock_irq with spin_lock_irqsave in irq context

upstream commit: cabce28ec0a0ae3d0ddfa4461f0e8be94ade9e46

Plip uses spin_lock_irq/spin_unlock_irq in its IRQ handler (called from
parport IRQ handler), the latter enables interrupts without parport
subsystem IRQ handler expecting it.

The bug can be seen if you compile kernel with lock dependency checking
and use plip --- it produces a warning.

This patch changes it to spin_lock_irqsave/spin_lock_irqrestore, so that
it doesn't enable interrupts when already disabled.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoacpi: fix "buggy BIOS check" when CPUs are hot removed
Alok Kataria [Thu, 10 Apr 2008 01:50:05 +0000 (01:50 +0000)]
acpi: fix "buggy BIOS check" when CPUs are hot removed

upstream commit: ba62b077871a5255e271f4fdae57167651839277

Fixes a BUG in ACPI hotplugging.

processor_device_array[pr->id] needs to be set to NULL when removing a CPU.
Else the "buggy BIOS check" in acpi_processor_start mistakenly fires when a
CPU is removed from the system and then later re-added.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Arai <arai@vmware.com>
Cc: Len Brown <lenb@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoHFS+: fix unlink of links
Roman Zippel [Wed, 9 Apr 2008 15:44:07 +0000 (17:44 +0200)]
HFS+: fix unlink of links

upstream commit: 76b0c26af2736b7e5b87e6ed7ab63901483d5736

Some time ago while attempting to handle invalid link counts, I botched
the unlink of links itself, so this patch fixes this now correctly, so
that only the link count of nodes that don't point to links is ignored.
Thanks to Vlado Plaga <rechner@vlado-do.de> to notify me of this
problem.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoDVB: tda10086: make the 22kHz tone for DISEQC a config option
Hartmut Hackmann [Wed, 9 Apr 2008 01:12:41 +0000 (21:12 -0400)]
DVB: tda10086: make the 22kHz tone for DISEQC a config option

(backported from commit ea75baf4b0f117564bd50827a49c4b14d61d24e9)

Some cards need the diseqc signal modulated, while some just need
the envelope to control the LNB supply.

This fixes Bug 9887

Signed-off-by: Hartmut Hackmann <hartmut.hackmann@t-online.de>
Acked-by: Oliver Endriss <o.endriss@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Hermann Pitton <hermann-pitton@arcor.de>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSPARC64: Fix FPU saving in 64-bit signal handling.
David S. Miller [Tue, 8 Apr 2008 05:24:24 +0000 (22:24 -0700)]
SPARC64: Fix FPU saving in 64-bit signal handling.

Upstream commit: 7c3cce978e4f933ac13758ec5d2554fc8d0927d2

The calculation of the FPU reg save area pointer
was wrong.

Based upon an OOPS report from Tom Callaway.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agobluetooth: hci_core: defer hci_unregister_sysfs()
Dave Young [Thu, 6 Mar 2008 02:45:59 +0000 (18:45 -0800)]
bluetooth: hci_core: defer hci_unregister_sysfs()

upstream commit: 147e2d59833e994cc99341806a88b9e59be41391

Alon Bar-Lev reports:

 Feb 16 23:41:33 alon1 usb 3-1: configuration #1 chosen from 1 choice
Feb 16 23:41:33 alon1 BUG: unable to handle kernel NULL pointer
dereference at virtual address 00000008
Feb 16 23:41:33 alon1 printing eip: c01b2db6 *pde = 00000000
Feb 16 23:41:33 alon1 Oops: 0000 [#1] PREEMPT
Feb 16 23:41:33 alon1 Modules linked in: ppp_deflate zlib_deflate
zlib_inflate bsd_comp ppp_async rfcomm l2cap hci_usb vmnet(P)
vmmon(P) tun radeon drm autofs4 ipv6 aes_generic crypto_algapi
ieee80211_crypt_ccmp nf_nat_irc nf_nat_ftp nf_conntrack_irc
nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat ipt_REJECT
xt_tcpudp ipt_LOG xt_limit xt_state nf_conntrack_ipv4 nf_conntrack
iptable_filter ip_tables x_tables snd_pcm_oss snd_mixer_oss
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
bluetooth ppp_generic slhc ioatdma dca cfq_iosched cpufreq_powersave
cpufreq_ondemand cpufreq_conservative acpi_cpufreq freq_table uinput
fan af_packet nls_cp1255 nls_iso8859_1 nls_utf8 nls_base pcmcia
snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm nsc_ircc snd_timer
ipw2200 thinkpad_acpi irda snd ehci_hcd yenta_socket uhci_hcd
psmouse ieee80211 soundcore intel_agp hwmon rsrc_nonstatic pcspkr
e1000 crc_ccitt snd_page_alloc i2c_i801 ieee80211_crypt pcmcia_core
agpgart thermal bat!
tery nvram rtc sr_mod ac sg firmware_class button processor cdrom
unix usbcore evdev ext3 jbd ext2 mbcache loop ata_piix libata sd_mod
scsi_mod
Feb 16 23:41:33 alon1
Feb 16 23:41:33 alon1 Pid: 4, comm: events/0 Tainted: P
(2.6.24-gentoo-r2 #1)
Feb 16 23:41:33 alon1 EIP: 0060:[<c01b2db6>] EFLAGS: 00010282 CPU: 0
Feb 16 23:41:33 alon1 EIP is at sysfs_get_dentry+0x26/0x80
Feb 16 23:41:33 alon1 EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX:
f48a2210
Feb 16 23:41:33 alon1 ESI: f72eb900 EDI: f4803ae0 EBP: f4803ae0 ESP:
f7c49efc
Feb 16 23:41:33 alon1 hcid[7004]: HCI dev 0 registered
Feb 16 23:41:33 alon1 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Feb 16 23:41:33 alon1 Process events/0 (pid: 4, ti=f7c48000
task=f7c3efc0 task.ti=f7c48000)
Feb 16 23:41:33 alon1 Stack: f7cb6140 f4822668 f7e71e10 c01b304d
ffffffff ffffffff fffffffe c030ba9c
Feb 16 23:41:33 alon1 f7cb6140 f4822668 f6da6720 f7cb6140 f4822668
f6da6720 c030ba8e c01ce20b
Feb 16 23:41:33 alon1 f6e9dd00 c030ba8e f6da6720 f6e9dd00 f6e9dd00
00000000 f4822600 00000000
Feb 16 23:41:33 alon1 Call Trace:
Feb 16 23:41:33 alon1 [<c01b304d>] sysfs_move_dir+0x3d/0x1f0
Feb 16 23:41:33 alon1 [<c01ce20b>] kobject_move+0x9b/0x120
Feb 16 23:41:33 alon1 [<c0241711>] device_move+0x51/0x110
Feb 16 23:41:33 alon1 [<f9aaed80>] del_conn+0x0/0x70 [bluetooth]
Feb 16 23:41:33 alon1 [<f9aaed99>] del_conn+0x19/0x70 [bluetooth]
Feb 16 23:41:33 alon1 [<c012c1a1>] run_workqueue+0x81/0x140
Feb 16 23:41:33 alon1 [<c02c0c88>] schedule+0x168/0x2e0
Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50
Feb 16 23:41:33 alon1 [<c012c9cb>] worker_thread+0x9b/0xf0
Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50
Feb 16 23:41:33 alon1 [<c012c930>] worker_thread+0x0/0xf0
Feb 16 23:41:33 alon1 [<c012f962>] kthread+0x42/0x70
Feb 16 23:41:33 alon1 [<c012f920>] kthread+0x0/0x70
Feb 16 23:41:33 alon1 [<c0104c2f>] kernel_thread_helper+0x7/0x18
Feb 16 23:41:33 alon1 =======================
Feb 16 23:41:33 alon1 Code: 26 00 00 00 00 57 89 c7 a1 50 1b 3a c0
56 53 8b 70 38 85 f6 74 08 8b 0e 85 c9 74 58 ff 06 8b 56 50 39 fa 74
47 89 fb eb 02 89 c3 <8b> 43 08 39 c2 75 f7 8b 46 08 83 c0 68 e8 98
e7 10 00 8b 43 10
Feb 16 23:41:33 alon1 EIP: [<c01b2db6>] sysfs_get_dentry+0x26/0x80
SS:ESP 0068:f7c49efc
Feb 16 23:41:33 alon1 ---[ end trace aae864e9592acc1d ]---

Defer hci_unregister_sysfs because hci device could be destructed
while hci conn devices still there.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Stefan Seyfried <seife@suse.de>
Acked-by: Alon Bar-Lev <alon.barlev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
dsd@gentoo.org notes:

This patch fixes http://bugs.gentoo.org/211179

Cc: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agosis190: read the mac address from the eeprom first
Francois Romieu [Mon, 18 Feb 2008 20:20:32 +0000 (21:20 +0100)]
sis190: read the mac address from the eeprom first

upstream commit: 563e0ae06ff18f0b280f11cf706ba0172255ce52

Reading a serie of zero from the cmos sram area do not work
well with is_valid_ether_addr(). Let's read the mac address
from the eeprom first as it seems more reliable.

Fix for http://bugzilla.kernel.org/show_bug.cgi?id=9831

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
dsd@gentoo.org notes:
This patch fixes http://bugs.gentoo.org/207706

Cc: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agolibata: assume no device is attached if both IDENTIFYs are aborted
Tejun Heo [Sun, 23 Mar 2008 06:16:53 +0000 (15:16 +0900)]
libata: assume no device is attached if both IDENTIFYs are aborted

upstream commit: 1ffc151fcddf524d0c76709d7e7a2af0255acb6b

This is to fix bugzilla #10254.  QSI cdrom attached to pata_sis as
secondary master appears as phantom device for the slave.
Interestingly, instead of not setting DRQ after IDENTIFY which
triggers NODEV_HINT, it aborts both IDENTIFY and IDENTIFY PACKET which
makes EH retry.

Modify EH such that it assumes no device is attached if both flavors
of IDENTIFY are aborted by the device.  There really isn't much point
in retrying when the device actively aborts the commands.

While at it, convert NODEV detection message to ata_dev_printk() to
help debugging obscure detection problems.

This problem was reported by Jan Bücken.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jan Bücken <jb.faq@gmx.de>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
dsd@gentoo.org notes:

This patch fixes http://bugs.gentoo.org/211369

Cc: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSPARC64: flush_ptrace_access() needs preemption disable.
David S. Miller [Mon, 7 Apr 2008 07:26:11 +0000 (00:26 -0700)]
SPARC64: flush_ptrace_access() needs preemption disable.

Upstream commit: f6a843d939ade435e060d580f5c56d958464f8a5

Based upon a report by Mariusz Kozlowski.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSPARC64: Fix __get_cpu_var in preemption-enabled area.
David S. Miller [Mon, 7 Apr 2008 07:25:35 +0000 (00:25 -0700)]
SPARC64: Fix __get_cpu_var in preemption-enabled area.

Upstream commit: 69072f6e8e4bd4799d2a54e4ff8771d0657512c1

Reported by Mariusz Kozlowski.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSPARC64: Fix atomic backoff limit.
David S. Miller [Mon, 7 Apr 2008 07:25:20 +0000 (00:25 -0700)]
SPARC64: Fix atomic backoff limit.

Upstream commit: 4cfea5a7dfcc2766251e50ca30271a782d5004ad

4096 will not fit into the immediate field of a compare instruction,
in fact it will end up being -4096 causing the check to fail every
time and thus disabling backoff.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoVLAN: Don't copy ALLMULTI/PROMISC flags from underlying device
Patrick McHardy [Mon, 7 Apr 2008 06:46:45 +0000 (23:46 -0700)]
VLAN: Don't copy ALLMULTI/PROMISC flags from underlying device

Upstream commit: 0ed21b321a13421e2dfeaa70a6c324e05e3e91e6

Changing these flags requires to use dev_set_allmulti/dev_set_promiscuity
or dev_change_flags. Setting it directly causes two unwanted effects:

- the next dev_change_flags call will notice a difference between
  dev->gflags and the actual flags, enable promisc/allmulti
  mode and incorrectly update dev->gflags

- this keeps the underlying device in promisc/allmulti mode until
  the VLAN device is deleted

[ Ported back to 2.6.24 VLAN code. -DaveM ]

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoTCP: Let skbs grow over a page on fast peers
Herbert Xu [Mon, 7 Apr 2008 06:43:38 +0000 (23:43 -0700)]
TCP: Let skbs grow over a page on fast peers

Upstream commit: 69d1506731168d6845a76a303b2c45f7c05f3f2c

While testing the virtio-net driver on KVM with TSO I noticed
that TSO performance with a 1500 MTU is significantly worse
compared to the performance of non-TSO with a 16436 MTU.  The
packet dump shows that most of the packets sent are smaller
than a page.

Looking at the code this actually is quite obvious as it always
stop extending the packet if it's the first packet yet to be
sent and if it's larger than the MSS.  Since each extension is
bound by the page size, this means that (given a 1500 MTU) we're
very unlikely to construct packets greater than a page, provided
that the receiver and the path is fast enough so that packets can
always be sent immediately.

The fix is also quite obvious.  The push calls inside the loop
is just an optimisation so that we don't end up doing all the
sending at the end of the loop.  Therefore there is no specific
reason why it has to do so at MSS boundaries.  For TSO, the
most natural extension of this optimisation is to do the pushing
once the skb exceeds the TSO size goal.

This is what the patch does and testing with KVM shows that the
TSO performance with a 1500 MTU easily surpasses that of a 16436
MTU and indeed the packet sizes sent are generally larger than
16436.

I don't see any obvious downsides for slower peers or connections,
but it would be prudent to test this extensively to ensure that
those cases don't regress.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoTCP: Fix shrinking windows with window scaling
Patrick McHardy [Mon, 7 Apr 2008 06:43:18 +0000 (23:43 -0700)]
TCP: Fix shrinking windows with window scaling

Upstream commit: 607bfbf2d55dd1cfe5368b41c2a81a8c9ccf4723

When selecting a new window, tcp_select_window() tries not to shrink
the offered window by using the maximum of the remaining offered window
size and the newly calculated window size. The newly calculated window
size is always a multiple of the window scaling factor, the remaining
window size however might not be since it depends on rcv_wup/rcv_nxt.
This means we're effectively shrinking the window when scaling it down.

The dump below shows the problem (scaling factor 2^7):

- Window size of 557 (71296) is advertised, up to 3111907257:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557 <...>

- New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes
  below the last end:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514 <...>

The number 40 results from downscaling the remaining window:

3111907257 - 3111841425 = 65832
65832 / 2^7 = 514
65832 % 2^7 = 40

If the sender uses up the entire window before it is shrunk, this can have
chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq()
will notice that the window has been shrunk since tcp_wnd_end() is before
tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number.
This will fail the receivers checks in tcp_sequence() however since it
is before it's tp->rcv_wup, making it respond with a dupack.

If both sides are in this condition, this leads to a constant flood of
ACKs until the connection times out.

Make sure the window is never shrunk by aligning the remaining window to
the window scaling factor.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoNET: Fix multicast device ioctl checks
Patrick McHardy [Mon, 7 Apr 2008 06:42:55 +0000 (23:42 -0700)]
NET: Fix multicast device ioctl checks

Upstream commit: 61ee6bd487b9cc160e533034eb338f2085dc7922

SIOCADDMULTI/SIOCDELMULTI check whether the driver has a set_multicast_list
method to determine whether it supports multicast. Drivers implementing
secondary unicast support use set_rx_mode however.

Check for both dev->set_multicast_mode and dev->set_rx_mode to determine
multicast capabilities.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSCTP: Fix local_addr deletions during list traversals.
Chidambar 'ilLogict' Zinnoury [Mon, 7 Apr 2008 06:42:35 +0000 (23:42 -0700)]
SCTP: Fix local_addr deletions during list traversals.

Upstream commit: 22626216c46f2ec86287e75ea86dd9ac3df54265

Since the lists are circular, we need to explicitely tag
the address to be deleted since we might end up freeing
the list head instead.  This fixes some interesting SCTP
crashes.

Signed-off-by: Chidambar 'ilLogict' Zinnoury <illogict@online.fr>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agosch_htb: fix "too many events" situation
Martin Devera [Mon, 7 Apr 2008 06:42:10 +0000 (23:42 -0700)]
sch_htb: fix "too many events" situation

Upstream commit: 8f3ea33a5078a09eba12bfe57424507809367756

HTB is event driven algorithm and part of its work is to apply
scheduled events at proper times. It tried to defend itself from
livelock by processing only limited number of events per dequeue.
Because of faster computers some users already hit this hardcoded
limit.

This patch limits processing up to 2 jiffies (why not 1 jiffie ?
because it might stop prematurely when only fraction of jiffie
remains).

Signed-off-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoNET: Add preemption point in qdisc_run
Herbert Xu [Mon, 7 Apr 2008 06:41:50 +0000 (23:41 -0700)]
NET: Add preemption point in qdisc_run

Upstream commit: 2ba2506ca7ca62c56edaa334b0fe61eb5eab6ab0

The qdisc_run loop is currently unbounded and runs entirely in a
softirq.  This is bad as it may create an unbounded softirq run.

This patch fixes this by calling need_resched and breaking out if
necessary.

It also adds a break out if the jiffies value changes since that would
indicate we've been transmitting for too long which starves other
softirqs.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoPPPOL2TP: Fix SMP issues in skb reorder queue handling
James Chapman [Mon, 7 Apr 2008 06:41:29 +0000 (23:41 -0700)]
PPPOL2TP: Fix SMP issues in skb reorder queue handling

Upstream commit: e653181dd6b3ad38ce14904351b03a5388f4b0f7

When walking a session's packet reorder queue, use
skb_queue_walk_safe() since the list could be modified inside the
loop.

Rearrange the unlinking skbs from the reorder queue such that it is
done while the queue lock is held in pppol2tp_recv_dequeue() when
walking the skb list.

A version of this patch was suggested by Jarek Poplawski.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoPPPOL2TP: Make locking calls softirq-safe
James Chapman [Mon, 7 Apr 2008 06:41:18 +0000 (23:41 -0700)]
PPPOL2TP: Make locking calls softirq-safe

Upstream commit: cf3752e2d203bbbfc88d29e362e6938cef4339b3

Fix locking issues in the pppol2tp driver which can cause a kernel
crash on SMP boxes. There were two problems:-

1. The driver was violating read_lock() and write_lock() scheduling
   rules because it wasn't using softirq-safe locks in softirq
   contexts. So we now consistently use the _bh variants of the lock
   functions.

2. The driver was calling sk_dst_get() in pppol2tp_xmit() which was
   taking sk_dst_lock in softirq context. We now call __sk_dst_get().

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agonetpoll: zap_completion_queue: adjust skb->users counter
Jarek Poplawski [Mon, 7 Apr 2008 06:40:53 +0000 (23:40 -0700)]
netpoll: zap_completion_queue: adjust skb->users counter

Upstream commit: 8a455b087c9629b3ae3b521b4f1ed16672f978cc

zap_completion_queue() retrieves skbs from completion_queue where they have
zero skb->users counter.  Before dev_kfree_skb_any() it should be non-zero
yet, so it's increased now.

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoLLC: Restrict LLC sockets to root
Patrick McHardy [Mon, 7 Apr 2008 06:40:33 +0000 (23:40 -0700)]
LLC: Restrict LLC sockets to root

Upstream commit: 3480c63bdf008e9289aab94418f43b9592978fff

LLC currently allows users to inject raw frames, including IP packets
encapsulated in SNAP. While Linux doesn't handle IP over SNAP, other
systems do. Restrict LLC sockets to root similar to packet sockets.

[ Modified Patrick's patch to use CAP_NEW_RAW --DaveM ]

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoINET: inet_frag_evictor() must run with BH disabled
David S. Miller [Mon, 7 Apr 2008 06:40:06 +0000 (23:40 -0700)]
INET: inet_frag_evictor() must run with BH disabled

Part of upstream commit: e8e16b706e8406f1ab3bccab16932ebc513896d8

Based upon a lockdep trace from Dave Jones.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSUNGEM: Fix NAPI assertion failure.
David S. Miller [Mon, 7 Apr 2008 06:37:08 +0000 (23:37 -0700)]
SUNGEM: Fix NAPI assertion failure.

Upstream commit: da990a2402aeaee84837f29054c4628eb02f7493

As reported by Johannes Berg:

I started getting this warning with recent kernels:

[  773.908927] ------------[ cut here ]------------
[  773.908954] Badness at net/core/dev.c:2204
 ...

If we loop more than once in gem_poll(), we'll
use more than the real budget in our gem_rx()
calls, thus eventually trigger the caller's
assertions in net_rx_action().

Subtract "work_done" from "budget" for the second
arg to gem_rx() to fix the bug.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoNET: include <linux/types.h> into linux/ethtool.h for __u* typedef
Kirill A. Shutemov [Mon, 7 Apr 2008 06:35:53 +0000 (23:35 -0700)]
NET: include <linux/types.h> into linux/ethtool.h for __u* typedef

Upstream commit: e621e69137b24fdbbe7ad28214e8d81e614c25b7

Signed-off-by: Kirill A. Shutemov <k.shutemov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoAX25 ax25_out: check skb for NULL in ax25_kick()
Jarek Poplawski [Mon, 7 Apr 2008 06:35:31 +0000 (23:35 -0700)]
AX25 ax25_out: check skb for NULL in ax25_kick()

Upstream commit: f47b7257c7368698eabff6fd7b340071932af640

According to some OOPS reports ax25_kick tries to clone NULL skbs
sometimes. It looks like a race with ax25_clear_queues(). Probably
there is no need to add more than a simple check for this yet.
Another report suggested there are probably also cases where ax25
->paclen == 0 can happen in ax25_output(); this wasn't confirmed
during testing but let's leave this debugging check for some time.

Reported-and-tested-by: Jann Traschewski <jann@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoipmi: change device node ordering to reflect probe order
Carol Hebert [Fri, 4 Apr 2008 21:30:03 +0000 (14:30 -0700)]
ipmi: change device node ordering to reflect probe order

upstream commit: abd24df828f1a72971db29d1b74fefae104ea9e2

In 2.6.14 a patch was merged which switching the order of the ipmi device
naming from in-order-of-discovery over to reverse-order-of-discovery.

So on systems with multiple BMC interfaces, the ipmi device names are being
created in reverse order relative to how they are discovered on the system
(e.g.  on an IBM x3950 multinode server with N nodes, the device name for the
BMC in the first node is /dev/ipmiN-1 and the device name for the BMC in the
last node is /dev/ipmi0, etc.).

The problem is caused by the list handling routines chosen in dmi_scan.c.
Using list_add() causes the multiple ipmi devices to be added to the device
list using a stack-paradigm and so the ipmi driver subsequently pulls them off
during initialization in LIFO order.  This patch changes the
dmi_save_ipmi_device() list handling paradigm to a queue, thereby allowing the
ipmi driver to build the ipmi device names in the order in which they are
found on the system.

Signed-off-by: Carol Hebert <cah@us.ibm.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agomtd: fix broken state in CFI driver caused by FL_SHUTDOWN
Alexey Korolev [Fri, 4 Apr 2008 22:15:06 +0000 (22:15 +0000)]
mtd: fix broken state in CFI driver caused by FL_SHUTDOWN

upstream commit: fb6d080c6f75dfd7e23d5a3575334785aa8738eb

THe CFI driver in 2.6.24 kernel is broken.  Not so intensive read/write
operations cause incomplete writes which lead to kernel panics in JFFS2.

We investigated the issue - it is caused by bug in FL_SHUTDOWN parsing code.
Sometimes chip returns -EIO as if it is in FL_SHUTDOWN state when it should
wait in FL_PONT (error in order of conditions).

The following patch fixes the bug in state parsing code of CFI.  Also I've
added comments to notify developers if they want to add new case in future.

Signed-off-by: Alexey Korolev <akorolev@infradead.org>
Reviewed-by: Joern Engel <joern@logfs.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoCRYPTO xcbc: Fix crash when ipsec uses xcbc-mac with big data chunk
Joy Latten [Fri, 4 Apr 2008 12:05:02 +0000 (20:05 +0800)]
CRYPTO xcbc: Fix crash when ipsec uses xcbc-mac with big data chunk

upstream commit: 1edcf2e1ee2babb011cfca80ad9d202e9c491669

The kernel crashes when ipsec passes a udp packet of about 14XX bytes
of data to aes-xcbc-mac.

It seems the first xxxx bytes of the data are in first sg entry,
and remaining xx bytes are in next sg entry. But we don't
check next sg entry to see if we need to go look the page up.

I noticed in hmac.c, we do a scatterwalk_sg_next(), to do this check
and possible lookup, thus xcbc.c needs to use this routine too.

A 15-hour run of an ipsec stress test sending streams of tcp and
udp packets of various sizes,  using this patch and
aes-xcbc-mac completed successfully, so hopefully this fixes the
problem.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[chrisw@sous-sol.org: backport to 2.6.24.4]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoUSB: serial: ti_usb_3410_5052: Correct TUSB3410 endpoint requirements.
Robert Spanton [Wed, 2 Apr 2008 23:15:15 +0000 (23:15 +0000)]
USB: serial: ti_usb_3410_5052: Correct TUSB3410 endpoint requirements.

upstream commit: 1bfd6693cd66f1e79abce62d3e8c3647e1f59a55

The changes introduced in commit
063a2da8f01806906f7d7b1a1424b9afddebc443 changed the semantics of the
num_interrupt_in, num_interrupt_out, num_bulk_in and num_bulk_out
entries of the usb_serial_driver struct to be the number of endpoints
the device has when probed.

This patch changes the ti_1port_device usb_serial_driver struct to
reflect this change.  The single port devices only have 1
bulk_out endpoint in their initial configuration, and so this patch
changes the number of other types to NUM_DONT_CARE.

The same change probably needs doing to the ti_2port_device struct,
but I don't have a two port device at hand.

Signed-off-by: Robert Spanton <rspanton@zepler.net>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoUSB: serial: fix regression in Visor/Palm OS module for kernels >= 2.6.24
Brad Sawatzky [Wed, 2 Apr 2008 23:15:13 +0000 (23:15 +0000)]
USB: serial: fix regression in Visor/Palm OS module for kernels >= 2.6.24

upstream commit: d04863e9e65767feff7807c8f693ac2719dd1944

Fixes a bug/inconsistency revealed by the additional sanity checking in
   commit 063a2da8f01806906f7d7b1a1424b9afddebc443
introduced in the original 2.6.24 branch.

The Handspring Visor / PalmOS 4 device structure defines .num_bulk_out=2
but the usb-serial probe returns num_bulk_out=3, triggering the check in
the above commit and forcing a bail out when the device (a Garmin iQue in
my case) attempts to connect.  The patch bumps the expected number of
endpoints to 3.

FWIW, this patch will probably solve the following kernel bug report for
Treo users (identical symptoms, different model PalmOS units):
  <http://bugzilla.kernel.org/show_bug.cgi?id=10118>

Signed-off-by: Brad Sawatzky <brad+kernel@swatter.net>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoUSB: Allow initialization of broken keyspan serial adapters.
Clark Rawlins [Wed, 2 Apr 2008 23:15:09 +0000 (23:15 +0000)]
USB: Allow initialization of broken keyspan serial adapters.

upstream commit: 822470537d0fc1dee38a2a9c8b8c398bfbb332bb

Fixes the keyspan driver after the addition of additional
checking of driver requirements introduced in usb-serial.c
commit 063a2da8f01806906f7d7b1a1424b9afddebc443.  The initialization
of the keyspan usb_serial_driver structs were not initializing the
num_interrupt_out field and the additional checking was rejecting
the end point so the driver wouldn't finish initializing.

This commit initializes the fields to NUM_DONT_CARE.
It works for the keyspan USA-49WG and doesn't break the USA-19HS
which are the two keyspan devices I have to test with.

Signed-off-by: Clark Rawlins <clark.rawlins@escient.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agovmcoreinfo: add the symbol "phys_base"
Ken'ichi Ohmichi [Wed, 2 Apr 2008 23:15:03 +0000 (23:15 +0000)]
vmcoreinfo: add the symbol "phys_base"

upstream commit: 629c8b4cdb354518308663aff2f719e02f69ffbe

Fix the problem that makedumpfile sometimes fails on x86_64 machine.

This patch adds the symbol "phys_base" to a vmcoreinfo data.  The
vmcoreinfo data has the minimum debugging information only for dump
filtering.  makedumpfile (dump filtering command) gets it to distinguish
unnecessary pages, and makedumpfile creates a small dumpfile.

On x86_64 kernel which compiled with CONFIG_PHYSICAL_START=0x0 and
CONFIG_RELOCATABLE=y, makedumpfile fails like the following:

 # makedumpfile -d31 /proc/vmcore dumpfile
 The kernel version is not supported.
 The created dumpfile may be incomplete.
 _exclude_free_page: Can't get next online node.

 makedumpfile Failed.
 #

The cause is the lack of the symbol "phys_base" in a vmcoreinfo data.
If the symbol "phys_base" does not exist, makedumpfile considers an
x86_64 kernel as non relocatable.  As the result, makedumpfile
misunderstands the physical address where the kernel is loaded, and it
cannot translate a kernel virtual address to physical address correctly.

To fix this problem, this patch adds the symbol "phys_base" to a
vmcoreinfo data.

Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <stable@kernel.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agohwmon: (w83781d) Fix I/O resource conflict with PNP
Jean Delvare [Sun, 9 Mar 2008 12:34:28 +0000 (13:34 +0100)]
hwmon: (w83781d) Fix I/O resource conflict with PNP

upstream commit: 2961cb22ef02850d90e7a12c28a14d74e327df8d

Only request I/O ports 0x295-0x296 instead of the full I/O address
range. This solves a conflict with PNP resources on a few motherboards.

Also request the I/O ports in two parts (4 low ports, 4 high ports)
during device detection, otherwise the PNP resource makes the request
(and thus the detection) fail.

This fixes lm-sensors ticket #2306:
http://www.lm-sensors.org/ticket/2306

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agopci: revert SMBus unhide on HP Compaq nx6110
Jean Delvare [Fri, 28 Mar 2008 21:16:04 +0000 (14:16 -0700)]
pci: revert SMBus unhide on HP Compaq nx6110

upstream commit: a99acc832de1104afaba02d7c2576fd9b9fd6422

This reverts commit 3c0a654e390d00fef9d8faed758f5e1e8078adb5 and
fixes kernel bug #10245:

http://bugzilla.kernel.org/show_bug.cgi?id=10245

The HP Compaq nc6120 has the same PCI sub-device ID as the nx6110, and the
SMBus is used by ACPI for thermal management on the nc6120, so Linux should
not attach a native driver to it.  This means that this quirk is unsafe and
has to be removed.

I also added a comment to help developers realize that adding new IDs to this
SMBus unhiding quirk table should be done only with great care, and in
particular only after checking that ACPI is not making use of the SMBus.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Tomasz Koprowski <tomek@koprowski.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agovfs: fix data leak in nobh_write_end()
Dmitri Monakhov [Fri, 28 Mar 2008 22:10:07 +0000 (22:10 +0000)]
vfs: fix data leak in nobh_write_end()

upstream commit: 5b41e74ad1b0bf7bc51765ae74e5dc564afc3e48

Current nobh_write_end() implementation ignore partial writes(copied < len)
case if page was fully mapped and simply mark page as Uptodate, which is
totally wrong because area [pos+copied, pos+len) wasn't updated explicitly in
previous write_begin call.  It simply contains garbage from pagecache and
result in data leakage.

#TEST_CASE_BEGIN:
~~~~~~~~~~~~~~~~
In fact issue triggered by classical testcase
open("/mnt/test", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
ftruncate(3, 409600)                    = 0
writev(3, [{"a", 1}, {NULL, 4095}], 2)  = 1
##TESTCASE_SOURCE:
~~~~~~~~~~~~~~~~~
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <sys/mman.h>
#include <errno.h>
int main(int argc, char **argv)
{
int fd,  ret;
void* p;
struct iovec iov[2];
fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0666);
ftruncate(fd, 409600);
iov[0].iov_base="a";
iov[0].iov_len=1;
iov[1].iov_base=NULL;
iov[1].iov_len=4096;
ret = writev(fd, iov, sizeof(iov)/sizeof(struct iovec));
printf("writev  = %d, err = %d\n", ret, errno);
return 0;
}
##TESTCASE RESULT:
~~~~~~~~~~~~~~~~~~
[root@ts63 ~]# mount | grep mnt2
/dev/mapper/test on /mnt2 type ext2 (rw,nobh)
[root@ts63 ~]#  /tmp/writev /mnt2/test
writev  = 1, err = 0
[root@ts63 ~]# hexdump -C /mnt2/test

00000000  61 65 62 6f 6f 74 00 00  f0 b9 b4 59 3a 00 00 00  |aeboot.....Y:...|
00000010  20 00 00 00 00 00 00 00  21 00 00 00 00 00 00 00  | .......!.......|
00000020  df df df df df df df df  df df df df df df df df  |................|
00000030  3a 00 00 00 2a 00 00 00  21 00 00 00 00 00 00 00  |:...*...!.......|
00000040  60 c0 8c 00 00 00 00 00  40 4a 8d 00 00 00 00 00  |`.......@J......|
00000050  00 00 00 00 00 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000060  74 69 6d 65 20 64 64 20  69 66 3d 2f 64 65 76 2f  |time dd if=/dev/|
00000070  6c 6f 6f 70 30 20 20 6f  66 3d 2f 64 65 76 2f 6e  |loop0  of=/dev/n|
skip..
00000f50  00 00 00 00 00 00 00 00  31 00 00 00 00 00 00 00  |........1.......|
00000f60  6d 6b 66 73 2e 65 78 74  33 20 2f 64 65 76 2f 76  |mkfs.ext3 /dev/v|
00000f70  7a 76 67 2f 74 65 73 74  20 2d 62 34 30 39 36 00  |zvg/test -b4096.|
00000f80  a0 fe 8c 00 00 00 00 00  21 00 00 00 00 00 00 00  |........!.......|
00000f90  23 31 32 30 35 39 35 30  34 30 34 00 3a 00 00 00  |#1205950404.:...|
00000fa0  20 00 8d 00 00 00 00 00  21 00 00 00 00 00 00 00  | .......!.......|
00000fb0  d0 cf 8c 00 00 00 00 00  10 d0 8c 00 00 00 00 00  |................|
00000fc0  00 00 00 00 00 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000fd0  6d 6f 75 6e 74 20 2f 64  65 76 2f 76 7a 76 67 2f  |mount /dev/vzvg/|
00000fe0  74 65 73 74 20 20 2f 76  7a 20 2d 6f 20 64 61 74  |test  /vz -o dat|
00000ff0  61 3d 77 72 69 74 65 62  61 63 6b 00 00 00 00 00  |a=writeback.....|
00001000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

As you can see file's page contains garbage from pagecache instead of zeros.
#TEST_CASE_END

Attached patch:
- Add sanity check BUG_ON in order to prevent incorrect usage by caller,
  This is function invariant because page can has buffers and in no zero
  *fadata pointer at the same time.
- Always attach buffers to page is it is partial write case.
- Always switch back to generic_write_end if page has buffers.
  This is reasonable because if page already has buffer then generic_write_begin
  was called previously.

Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
Reviewed-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoalloc_percpu() fails to allocate percpu data
Eric Dumazet [Fri, 28 Mar 2008 18:42:43 +0000 (14:42 -0400)]
alloc_percpu() fails to allocate percpu data

upstream commit: be852795e1c8d3829ddf3cb1ce806113611fa555

Some oprofile results obtained while using tbench on a 2x2 cpu machine were
very surprising.

For example, loopback_xmit() function was using high number of cpu cycles
to perform the statistic updates, supposed to be real cheap since they use
percpu data

        pcpu_lstats = netdev_priv(dev);
        lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
        lb_stats->packets++;  /* HERE : serious contention */
        lb_stats->bytes += skb->len;

struct pcpu_lstats is a small structure containing two longs.  It appears
that on my 32bits platform, alloc_percpu(8) allocates a single cache line,
instead of giving to each cpu a separate cache line.

Using the following patch gave me impressive boost in various benchmarks
( 6 % in tbench)
(all percpu_counters hit this bug too)

Long term fix (ie >= 2.6.26) would be to let each CPU allocate their own
block of memory, so that we dont need to roudup sizes to L1_CACHE_BYTES, or
merging the SGI stuff of course...

Note : SLUB vs SLAB is important here to *show* the improvement, since they
dont have the same minimum allocation sizes (8 bytes vs 32 bytes).  This
could very well explain regressions some guys reported when they switched
to SLUB.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoPERCPU : __percpu_alloc_mask() can dynamically size percpu_data storage
Eric Dumazet [Fri, 28 Mar 2008 18:42:42 +0000 (14:42 -0400)]
PERCPU : __percpu_alloc_mask() can dynamically size percpu_data storage

upstream commit: b3242151906372f30f57feaa43b4cac96a23edb1

Instead of allocating a fix sized array of NR_CPUS pointers for percpu_data,
we can use nr_cpu_ids, which is generally < NR_CPUS.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoxen: fix UP setup of shared_info
Jeremy Fitzhardinge [Thu, 27 Mar 2008 20:35:05 +0000 (20:35 +0000)]
xen: fix UP setup of shared_info

upstream commit: 2e8fe719b57bbdc9e313daed1204bb55fed3ed44

We need to set up the shared_info pointer once we've mapped the real
shared_info into its fixmap slot.  That needs to happen once the general
pagetable setup has been done.  Previously, the UP shared_info was set
up one in xen_start_kernel, but that was left pointing to the dummy
shared info.  Unfortunately there's no really good place to do a later
setup of the shared_info in UP, so just do it once the pagetable setup
has been done.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[chrisw@sous-sol.org: backport to 2.6.24.4]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoxen: mask out SEP from CPUID
Jeremy Fitzhardinge [Fri, 29 Feb 2008 17:55:43 +0000 (18:55 +0100)]
xen: mask out SEP from CPUID

upstream commit: d40e705903397445c6861a0a56c23e5b2e8f9b9a

Fix 32-on-64 pvops kernel:

we don't want userspace using syscall/sysenter, even if the hypervisor
supports it, so mask it out from CPUID.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoxen: fix RMW when unmasking events
Jeremy Fitzhardinge [Thu, 27 Mar 2008 20:35:06 +0000 (20:35 +0000)]
xen: fix RMW when unmasking events

upstream commit: 04c44a080d2f699a3042d4e743f7ad2ffae9d538

xen_irq_enable_direct and xen_sysexit were using "andw $0x00ff,
XEN_vcpu_info_pending(vcpu)" to unmask events and test for pending ones
in one instuction.

Unfortunately, the pending flag must be modified with a locked operation
since it can be set by another CPU, and the unlocked form of this
operation was causing the pending flag to get lost, allowing the processor
to return to usermode with pending events and ultimately deadlock.

The simple fix would be to make it a locked operation, but that's rather
costly and unnecessary.  The fix here is to split the mask-clearing and
pending-testing into two instructions; the interrupt window between
them is of no concern because either way pending or new events will
be processed.

This should fix lingering bugs in using direct vcpu structure access too.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoslab: fix cache_cache bootstrap in kmem_cache_init()
Daniel Yeisley [Wed, 26 Mar 2008 21:37:41 +0000 (23:37 +0200)]
slab: fix cache_cache bootstrap in kmem_cache_init()

upstream commit: ec1f5eeeb5a79a0d48036de649a3498da42db565

Commit 556a169dab38b5100df6f4a45b655dddd3db94c1 ("slab: fix bootstrap on
memoryless node") introduced bootstrap-time cache_cache list3s for all nodes
but forgot that initkmem_list3 needs to be accessed by [somevalue + node]. This
patch fixes list_add() corruption in mm/slab.c seen on the ES7000.

Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoNOHZ: reevaluate idle sleep length after add_timer_on()
Thomas Gleixner [Wed, 26 Mar 2008 18:35:10 +0000 (18:35 +0000)]
NOHZ: reevaluate idle sleep length after add_timer_on()

upstream commit: 06d8308c61e54346585b2691c13ee3f90cb6fb2f

add_timer_on() can add a timer on a CPU which is currently in a long
idle sleep, but the timer wheel is not reevaluated by the nohz code on
that CPU. So a timer can be delayed for quite a long time. This
triggered a false positive in the clocksource watchdog code.

To avoid this we need to wake up the idle CPU and enforce the
reevaluation of the timer wheel for the next timer event.

Add a function, which checks a given CPU for idle state, marks the
idle task with NEED_RESCHED and sends a reschedule IPI to notify the
other CPU of the change in the timer wheel.

Call this function from add_timer_on().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
--
 include/linux/sched.h |    6 ++++++
 kernel/sched.c        |   43 +++++++++++++++++++++++++++++++++++++++++++
 kernel/timer.c        |   10 +++++++++-
 3 files changed, 58 insertions(+), 1 deletion(-)

16 years agoinotify: remove debug code
Nick Piggin [Tue, 25 Mar 2008 12:48:18 +0000 (13:48 +0100)]
inotify: remove debug code

upstream commit: 0d71bd5993b630a989d15adc2562a9ffe41cd26d

The inotify debugging code is supposed to verify that the
DCACHE_INOTIFY_PARENT_WATCHED scalability optimisation does not result in
notifications getting lost nor extra needless locking generated.

Unfortunately there are also some races in the debugging code.  And it isn't
very good at finding problems anyway.  So remove it for now.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Robert Love <rlove@google.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Jan Kara <jack@ucw.cz>
Cc: Yan Zheng <yanzheng@21cn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Lamparter <chunkeey@web.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoinotify: fix race
Nick Piggin [Tue, 25 Mar 2008 12:48:15 +0000 (13:48 +0100)]
inotify: fix race

upstream commit: d599e36a9ea85432587f4550acc113cd7549d12a

There is a race between setting an inode's children's "parent watched" flag
when placing the first watch on a parent, and instantiating new children of
that parent: a child could miss having its flags set by
set_dentry_child_flags, but then inotify_d_instantiate might still see
!inotify_inode_watched.

The solution is to set_dentry_child_flags after adding the watch.  Locking is
taken care of, because both set_dentry_child_flags and inotify_d_instantiate
hold dcache_lock and child->d_locks.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Robert Love <rlove@google.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Jan Kara <jack@ucw.cz>
Cc: Yan Zheng <yanzheng@21cn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Lamparter <chunkeey@web.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoUSB: new quirk flag to avoid Set-Interface
Alan Stern [Tue, 25 Mar 2008 06:35:12 +0000 (06:35 +0000)]
USB: new quirk flag to avoid Set-Interface

upstream commit: 392e1d9817d0024c96aae237c3c4349e47c976fd

This patch (as1057) fixes a problem with the X-Rite/Gretag-Macbeth
Eye-One Pro display colorimeter; the device crashes when it receives a
Set-Interface request.  A new quirk (USB_QUIRK_NO_SET_INTF) is
introduced and a quirks entry is created for this device.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[chrisw@sous-sol.org: backport to 2.6.24.4]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoUSB: add support for Motorola ROKR Z6 cellphone in mass storage mode
Constantin Baranov [Tue, 25 Mar 2008 06:35:11 +0000 (06:35 +0000)]
USB: add support for Motorola ROKR Z6 cellphone in mass storage mode

upstream commit: cc36bdd47ae51b66780b317c1fa519221f894405

Motorola ROKR Z6 cellphone has bugs in its USB, so it is impossible to use
it as mass storage. Patch describes new "unusual" USB device for it with
FIX_INQUIRY and FIX_CAPACITY flags and new BULK_IGNORE_TAG flag.
Last flag relaxes check for equality of bcs->Tag and us->tag in
usb_stor_Bulk_transport routine.

Signed-off-by: Constantin Baranov <const@tltsu.ru>
Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoUIO: add pgprot_noncached() to UIO mmap code
Jean-Samuel Chenard [Tue, 25 Mar 2008 06:35:08 +0000 (06:35 +0000)]
UIO: add pgprot_noncached() to UIO mmap code

upstream commit: c9698d6b1a90929e427a165bd8283f803f57d9bd

Mapping of physical memory in UIO needs pgprot_noncached() to ensure
that IO memory is not cached. Without pgprot_noncached(), it (accidentally)
works on x86 and arm, but fails on PPC.

Signed-off-by: Jean-Samuel Chenard <jsamch@gmail.com>
Signed-off-by: Hans J Koch <hjk@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoV4L: ivtv: Add missing sg_init_table()
Ian Armstrong [Sat, 22 Mar 2008 19:59:02 +0000 (15:59 -0400)]
V4L: ivtv: Add missing sg_init_table()

upstream commit: 165e1213e13b49761f8b3fd9314701f83cf3db3a

If a dma transfer is attempted for either yuv or framebuffer output, a
missing sg_init_table() call causes a kernel BUG in scatterlist.h if
CONFIG_DEBUG_SG is set.

Signed-off-by: Ian Armstrong <ian@iarmst.demon.co.uk>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agomd: remove the 'super' sysfs attribute from devices in an 'md' array
NeilBrown [Tue, 25 Mar 2008 04:21:26 +0000 (21:21 -0700)]
md: remove the 'super' sysfs attribute from devices in an 'md' array

upstream commit: 0e82989d95cc46cc58622381eafa54f7428ee679

Exposing the binary blob which is the md 'super-block' via sysfs doesn't
really fit with the whole sysfs model, and ever since commit
8118a859dc7abd873193986c77a8d9bdb877adc8 ("sysfs: fix off-by-one error
in fill_read_buffer()") it doesn't actually work at all (as the size of
the blob is often one page).

(akpm: as in, fs/sysfs/file.c:fill_read_buffer() goes BUG)

So just remove it altogether.  It isn't really useful.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agomtd: memory corruption in block2mtd.c
Ingo van Lil [Tue, 25 Mar 2008 02:40:04 +0000 (02:40 +0000)]
mtd: memory corruption in block2mtd.c

upstream commit: 2875fb65f8e40401c4b781ebc5002df10485f635

The block2mtd driver (drivers/mtd/devices/block2mtd.c) will kfree an on-stack
pointer when handling an invalid argument line (e.g.
block2mtd=/dev/loop0,xxx).

The kfree was added some time ago when "name" was dynamically allocated.

Signed-off-by: Ingo van Lil <inguin@gmx.de>
Acked-by: Joern Engel <joern@lazybastard.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agokbuild: soften modpost checks when doing cross builds
Sam Ravnborg [Tue, 25 Mar 2008 02:40:08 +0000 (02:40 +0000)]
kbuild: soften modpost checks when doing cross builds

upstream commit: 4ce6efed48d736e3384c39ff87bda723e1f8e041

The module alias support in the kernel have a consistency
check where it is checked that the size of a structure
in the kernel and on the build host are the same.
For cross builds this check does not make sense so detect
when we do cross builds and silently skip the check in these
situations.
This fixes a build bug for a wireless driver when cross building
for arm.

Acked-by: Michael Buesch <mb@bu3sch.de>
Tested-by: Gordon Farquharson <gordonfarquharson@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
[chrisw@sous-sol.org: backport to 2.6.24.4]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agotime: prevent the loop in timespec_add_ns() from being optimised away
Segher Boessenkool [Tue, 4 Mar 2008 22:59:54 +0000 (14:59 -0800)]
time: prevent the loop in timespec_add_ns() from being optimised away

upstream commit: 38332cb98772f5ea757e6486bed7ed0381cb5f98

Since some architectures don't support __udivdi3().

Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sedat Dilek <sedat.dilek@googlemail.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoLinux 2.6.24.4 v2.6.24.4
Chris Wright [Mon, 24 Mar 2008 18:49:18 +0000 (11:49 -0700)]
Linux 2.6.24.4

16 years agoS390 futex: let futex_atomic_cmpxchg_pt survive early functional tests.
Heiko Carstens [Thu, 20 Mar 2008 16:33:38 +0000 (17:33 +0100)]
S390 futex: let futex_atomic_cmpxchg_pt survive early functional tests.

a0c1e9073ef7428a14309cba010633a6cd6719ea "futex: runtime enable pi and
robust functionality" introduces a test wether futex in atomic stuff
works or not.
It does that by writing to address 0 of the kernel address space. This
will crash on older machines where addressing mode switching is enabled
but where the mvcos instruction is not available. Page table walking is
done by hand and therefore the code tries to access current->mm which
is NULL.
Therefore add an extra check, so we survive the early test.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoslab: NUMA slab allocator migration bugfix
Joe Korty [Wed, 5 Mar 2008 23:04:59 +0000 (15:04 -0800)]
slab: NUMA slab allocator migration bugfix

NUMA slab allocator cpu migration bugfix

The NUMA slab allocator (specifically, cache_alloc_refill)
is not refreshing its local copies of what cpu and what
numa node it is on, when it drops and reacquires the irq
block that it inherited from its caller.  As a result
those values become invalid if an attempt to migrate the
process to another numa node occured while the irq block
had been dropped.

The solution is to make cache_alloc_refill reload these
variables whenever it drops and reacquires the irq block.

The error is very difficult to hit.  When it does occur,
one gets the following oops + stack traceback bits in
check_spinlock_acquired:

kernel BUG at mm/slab.c:2417
cache_alloc_refill+0xe6
kmem_cache_alloc+0xd0
...

This patch was developed against 2.6.23, ported to and
compiled-tested only against 2.6.25-rc4.

Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agorelay: fix subbuf_splice_actor() adding too many pages
Jens Axboe [Mon, 17 Mar 2008 08:04:59 +0000 (09:04 +0100)]
relay: fix subbuf_splice_actor() adding too many pages

If subbuf_pages was larger than the max number of pages the pipe
buffer will hold, subbuf_splice_actor() would happily go beyond
the array size.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoBLUETOOTH: Fix bugs in previous conn add/del workqueue changes.
Dave Young [Fri, 1 Feb 2008 02:33:10 +0000 (18:33 -0800)]
BLUETOOTH: Fix bugs in previous conn add/del workqueue changes.

Jens Axboe noticed that we were queueing &conn->work on both btaddconn
and keventd_wq.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSCSI advansys: Fix bug in AdvLoadMicrocode
Matthew Wilcox [Fri, 21 Mar 2008 15:15:03 +0000 (15:15 +0000)]
SCSI advansys: Fix bug in AdvLoadMicrocode

commit: 951b62c11e86acf8c55d9828aa8c921575023c29

buf[i] can be up to 0xfd, so doubling it and assigning the result to an
unsigned char truncates the value.  Just use an unsigned int instead;
it's only a temporary.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoasync_tx: avoid the async xor_zero_sum path when src_cnt > device->max_xor
Dan Williams [Wed, 19 Mar 2008 04:40:04 +0000 (04:40 +0000)]
async_tx: avoid the async xor_zero_sum path when src_cnt > device->max_xor

commit: 8d8002f642886ae256a3c5d70fe8aff4faf3631a

If the channel cannot perform the operation in one call to
->device_prep_dma_zero_sum, then fallback to the xor+page_is_zero path.
This only affects users with arrays larger than 16 devices on iop13xx or
32 devices on iop3xx.

Cc: <stable@kernel.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoaio: bad AIO race in aio_complete() leads to process hang
Quentin Barnes [Thu, 20 Mar 2008 02:45:07 +0000 (02:45 +0000)]
aio: bad AIO race in aio_complete() leads to process hang

commit: 6cb2a21049b8990df4576c5fce4d48d0206c22d5

My group ran into a AIO process hang on a 2.6.24 kernel with the process
sleeping indefinitely in io_getevents(2) waiting for the last wakeup to come
and it never would.

We ran the tests on x86_64 SMP.  The hang only occurred on a Xeon box
("Clovertown") but not a Core2Duo ("Conroe").  On the Xeon, the L2 cache isn't
shared between all eight processors, but is L2 is shared between between all
two processors on the Core2Duo we use.

My analysis of the hang is if you go down to the second while-loop
in read_events(), what happens on processor #1:
1) add_wait_queue_exclusive() adds thread to ctx->wait
2) aio_read_evt() to check tail
3) if aio_read_evt() returned 0, call [io_]schedule() and sleep

In aio_complete() with processor #2:
A) info->tail = tail;
B) waitqueue_active(&ctx->wait)
C) if waitqueue_active() returned non-0, call wake_up()

The way the code is written, step 1 must be seen by all other processors
before processor 1 checks for pending events in step 2 (that were recorded by
step A) and step A by processor 2 must be seen by all other processors
(checked in step 2) before step B is done.

The race I believed I was seeing is that steps 1 and 2 were
effectively swapped due to the __list_add() being delayed by the L2
cache not shared by some of the other processors.  Imagine:
proc 2: just before step A
proc 1, step 1: adds to ctx->wait, but is not visible by other processors yet
proc 1, step 2: checks tail and sees no pending events
proc 2, step A: updates tail
proc 1, step 3: calls [io_]schedule() and sleeps
proc 2, step B: checks ctx->wait, but sees no one waiting, skips wakeup
                so proc 1 sleeps indefinitely

My patch adds a memory barrier between steps A and B.  It ensures that the
update in step 1 gets seen on processor 2 before continuing.  If processor 1
was just before step 1, the memory barrier makes sure that step A (update
tail) gets seen by the time processor 1 makes it to step 2 (check tail).

Before the patch our AIO process would hang virtually 100% of the time.  After
the patch, we have yet to see the process ever hang.

Signed-off-by: Quentin Barnes <qbarnes+linux@yahoo-inc.com>
Reviewed-by: Zach Brown <zach.brown@oracle.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: <stable@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ We should probably disallow that "if (waitqueue_active()) wake_up()"
  coding pattern, because it's so often buggy wrt memory ordering ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agojbd: correctly unescape journal data blocks
Duane Griffin [Thu, 20 Mar 2008 02:45:06 +0000 (02:45 +0000)]
jbd: correctly unescape journal data blocks

commit: 439aeec639d7c57f3561054a6d315c40fd24bb74

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JFS_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agojbd2: correctly unescape journal data blocks
Duane Griffin [Thu, 20 Mar 2008 02:45:05 +0000 (02:45 +0000)]
jbd2: correctly unescape journal data blocks

commit: d00256766a0b4f1441931a7f569a13edf6c68200

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JBD2_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agozisofs: fix readpage() outside i_size
Dave Young [Thu, 20 Mar 2008 02:45:04 +0000 (02:45 +0000)]
zisofs: fix readpage() outside i_size

commit: 08ca0db8aa2db4ddcf487d46d85dc8ffb22162cc

A read request outside i_size will be handled in do_generic_file_read().  So
we just return 0 to avoid getting -EIO as normal reading, let
do_generic_file_read do the rest.

At the same time we need unlock the page to avoid system stuck.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10227

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Report-by: Christian Perle <chris@linuxinfotag.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: nfnetlink_log: fix computation of netlink skb size
Eric Leblond [Mon, 17 Mar 2008 14:41:47 +0000 (15:41 +0100)]
NETFILTER: nfnetlink_log: fix computation of netlink skb size

Upstream commit 7000d38d:

This patch is similar to nfnetlink_queue fixes. It fixes the computation
of skb size by using NLMSG_SPACE instead of NLMSG_ALIGN.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: nfnetlink_queue: fix computation of allocated size for netlink skb
Eric Leblond [Mon, 17 Mar 2008 14:41:46 +0000 (15:41 +0100)]
NETFILTER: nfnetlink_queue: fix computation of allocated size for netlink skb

Upstream commit cabaa9bf:

Size of the netlink skb was wrongly computed because the formula was using
NLMSG_ALIGN instead of NLMSG_SPACE. NLMSG_ALIGN does not add the room for
netlink header as NLMSG_SPACE does. This was causing a failure of message
building in some cases.

On my test system, all messages for packets in range [8*k+41, 8*k+48] where k
is an integer were invalid and the corresponding packets were dropped.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: xt_time: fix failure to match on Sundays
Jan Engelhardt [Mon, 17 Mar 2008 14:41:44 +0000 (15:41 +0100)]
NETFILTER: xt_time: fix failure to match on Sundays

Upstream commit 4f4c9430:

xt_time_match() in net/netfilter/xt_time.c in kernel 2.6.24 never
matches on Sundays. On my host I have a rule like

iptables -A OUTPUT -m time --weekdays Sun -j REJECT

and it never matches. The problem is in localtime_2(), which uses

    r->weekday = (4 + r->dse) % 7;

to map the epoch day onto a weekday in {0,...,6}. In particular this
gives 0 for Sundays. But 0 has to be wrong; a weekday of 0 can never
match. xt_time_match() has

    if (!(info->weekdays_match & (1 << current_time.weekday)))
        return false;

and when current_time.weekday = 0, the result of the & is always
zero, even when info->weekdays_match = XT_TIME_ALL_WEEKDAYS = 0xFE.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agosched_nr_migrate wrong mode bits
Michal Schmidt [Mon, 17 Mar 2008 23:13:16 +0000 (00:13 +0100)]
sched_nr_migrate wrong mode bits

sched_nr_migrate has strange permission bits:

 $ ls -l /proc/sys/kernel/sched_nr_migrate
 --w----r-T 1 root root 0 2008-03-17 23:31 /proc/sys/kernel/sched_nr_migrate

The bug is an obvious decimal/octal confusion.

Fixed (collaterally) in Linus's tree by Peter Zijlstra with commit fa85ae241
"sched: rt time limit" (in 2.6.25-rc1).

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agonfsd: fix oops on access from high-numbered ports
J. Bruce Fields [Fri, 14 Mar 2008 23:37:11 +0000 (19:37 -0400)]
nfsd: fix oops on access from high-numbered ports

This bug was always here, but before my commit 6fa02839bf9412e18e77
("recheck for secure ports in fh_verify"), it could only be triggered by
failure of a kmalloc().  After that commit it could be triggered by a
client making a request from a non-reserved port for access to an export
marked "secure".  (Exports are "secure" by default.)

The result is a struct svc_export with a reference count one too low,
resulting in likely oopses next time the export is accessed.

The reference counting here is not straightforward; a later patch will
clean up fh_verify().

Thanks to Lukas Hejtmanek for the bug report and followup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agosched: fix race in schedule()
Hiroshi Shimamoto [Mon, 10 Mar 2008 18:01:20 +0000 (11:01 -0700)]
sched: fix race in schedule()

Fix a hard to trigger crash seen in the -rt kernel that also affects
the vanilla scheduler.

There is a race condition between schedule() and some dequeue/enqueue
functions; rt_mutex_setprio(), __setscheduler() and sched_move_task().

When scheduling to idle, idle_balance() is called to pull tasks from
other busy processor. It might drop the rq lock. It means that those 3
functions encounter on_rq=0 and running=1. The current task should be
put when running.

Here is a possible scenario:

   CPU0                               CPU1
    |                              schedule()
    |                              ->deactivate_task()
    |                              ->idle_balance()
    |                              -->load_balance_newidle()
rt_mutex_setprio()                     |
    |                              --->double_lock_balance()
    *get lock                          *rel lock
    * on_rq=0, ruuning=1               |
    * sched_class is changed           |
    *rel lock                          *get lock
    :                                  |
                                       :
                                   ->put_prev_task_rt()
                                   ->pick_next_task_fair()
                                       => panic

The current process of CPU1(P1) is scheduling. Deactivated P1, and the
scheduler looks for another process on other CPU's runqueue because CPU1
will be idle. idle_balance(), load_balance_newidle() and
double_lock_balance() are called and double_lock_balance() could drop
the rq lock. On the other hand, CPU0 is trying to boost the priority of
P1. The result of boosting only P1's prio and sched_class are changed to
RT. The sched entities of P1 and P1's group are never put. It makes
cfs_rq invalid, because the cfs_rq has curr and no leaf, but
pick_next_task_fair() is called, then the kernel panics.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI: mpt fusion: don't oops if NumPhys==0
Krzysztof Oledzki [Tue, 4 Mar 2008 22:56:23 +0000 (14:56 -0800)]
SCSI: mpt fusion: don't oops if NumPhys==0

Don't oops if NumPhys==0, instead return -ENODEV.
This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9909

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Acked-by: Eric Moore <Eric.Moore@lsi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI: gdth: fix to internal commands execution
Boaz Harrosh [Thu, 6 Mar 2008 02:10:13 +0000 (02:10 +0000)]
SCSI: gdth: fix to internal commands execution

commit: ee54cc6af95a7fa09da298493b853a9e64fa8abd

The recent patch named:
  [SCSI] gdth: !use_sg cleanup and use of scsi accessors

has done a bad job in handling internal commands issued by gdth_execute().

Internal commands are issued with device gdth_cmd_str ready made directly
to the card, without any mapping or translations of scsi commands. So here
I added a gdth_cmd_str pointer to the gdth_cmndinfo private structure which
is then copied directly to host.

following this patch is a cleanup that removes the home cooked accessors
and reverts them to regular scsi_cmnd accessors. Since they are not used
anymore. After review maybe the 2 patches should be squashed together.

FIXME: There is still a problem with gdth_get_info(). as reported there
   is a WARN_ON trigerd in dma_free_coherent() when doing:
   $ cat /proc/sys/gdth/0

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Joerg Dorchain: <joerg@dorchain.net>
Tested-by: Stefan Priebe <s.priebe@allied-internet.ag>
Tested-by: Jon Chelton <jchelton@ffpglobal.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI: gdth: bugfix for the at-exit problems
Boaz Harrosh [Thu, 6 Mar 2008 02:10:15 +0000 (02:10 +0000)]
SCSI: gdth: bugfix for the at-exit problems

commit: b31ddd31c266c2ad1b708cad0d3d8e0aa7fa2737

gdth_exit would first remove all cards then stop the timer
and would not sync with the timer function. This caused a crash
in gdth_timer() when module was unloaded.
So del_timer_sync the timer before we delete the cards.

also the reboot notifier function would crash. So clean
that up and fix the crashes.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Joerg Dorchain: <joerg@dorchain.net>
Tested-by: Stefan Priebe <s.priebe@allied-internet.ag>
Tested-by: Jon Chelton <jchelton@ffpglobal.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>