Eric Sandeen [Mon, 11 Jun 2007 05:02:45 +0000 (14:02 +0900)]
[PATCH] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses
Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch
For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there is
no synchronization with readdir. This patch follows Tejun's scheme of
allocating and storing an inode number in the new s_ino member of a
sysfs_dirent, when dirents are created, and retrieving it from there
for readdir, so that the pointer chain doesn't have to be traversed.
Tejun's upstream patch uses a new-ish "ida" allocator which brings
along some extra complexity; this -stable patch has a brain-dead
incrementing counter which does not guarantee uniqueness, but because
sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
guaranteed today anyway.
Subject: [PATCH] [NET]: Do not dereference iov if length is zero
When msg_iovlen is zero we shouldn't try to dereference
msg_iov. Right now the only thing that tries to do so
is skb_copy_and_csum_datagram_iovec. Since the total
length should also be zero if msg_iovlen is zero, it's
sufficient to check the total length there and simply
return if it's zero.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In testing our ESP/AH offload hardware, I discovered an issue with how
AH handles mutable fields in IPv4. RFC 4302 (AH) states the following
on the subject:
For IPv4, the entire option is viewed as a unit; so even
though the type and length fields within most options are immutable
in transit, if an option is classified as mutable, the entire option
is zeroed for ICV computation purposes.
The current implementation does not zero the type and length fields,
resulting in authentication failures when communicating with hosts
that do (i.e. FreeBSD).
I have tested record route and timestamp options (ping -R and ping -T)
on a small network involving Windows XP, FreeBSD 6.2, and Linux hosts,
with one router. In the presence of these options, the FreeBSD and
Linux hosts (with the patch or with the hardware) can communicate.
The Windows XP host simply fails to accept these packets with or
without the patch.
I have also been trying to test source routing options (using
traceroute -g), but haven't had much luck getting this option to work
*without* AH, let alone with.
Signed-off-by: Nick Bowler <nbowler@ellipticsemi.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[IPv6]: Fix NULL pointer dereference in ip6_flush_pending_frames
Some of skbs in sk->write_queue do not have skb->dst because
we do not fill skb->dst when we allocate new skb in append_data().
BTW, I think we may not need to (or we should not) increment some stats
when using corking; if 100 sendmsg() (with MSG_MORE) result in 2 packets,
how many should we increment?
If 100, we should set skb->dst for every queued skbs.
If 1 (or 2 (*)), we increment the stats for the first queued skb and
we should just skip incrementing OutDiscards for the rest of queued skbs,
adn we should also impelement this semantics in other places;
e.g., we should increment other stats just once, not 100 times.
*: depends on the place we are discarding the datagram.
I guess should just increment by 1 (or 2).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As noticed by Chuck Ebbert, commit c5e3ae8823693b260ce1f217adca8add1bc0b3de
introduced a copy-paste typo, as realtek phy is 0x732 and not 0x1c1. Obvious
fix below suggested by Ayaz Abdulla.
Signed-off-by: Willy Tarreau <w@1wt.eu> Cc: Ayaz Abdulla <aabdulla@nvidia.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[CRYPTO] blkcipher: Fix handling of kmalloc page straddling
The function blkcipher_get_spot tries to return a buffer of
the specified length that does not straddle a page. It has
an off-by-one bug so it may advance a page unnecessarily.
What's worse, one of its callers doesn't provide a buffer
that's sufficiently long for this operation.
This patch fixes both problems. Thanks to Bob Gilligan for
diagnosing this problem and providing a fix.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There's a race condition in blk_queue_end_tag() for shared tag maps,
users include stex (promise supertrak thingy) and qla2xxx. The former
at least has reported bugs in this area, not sure why we haven't seen
any for the latter. It could be because the window is narrow and that
other conditions in the qla2xxx code hide this. It's a real bug,
though, as the stex smp users can attest.
We need to ensure two things - the tag bit clearing needs to happen
AFTER we cleared the tag pointer, as the tag bit clearing/setting is
what protects this map. Secondly, we need to ensure that the visibility
of the tag pointer and tag bit clear are ordered properly.
[ I removed the SMP barriers - "test_and_clear_bit()" already implies
all the required barriers. -- Linus ]
Also see http://bugzilla.kernel.org/show_bug.cgi?id=7842
Stefan Richter [Fri, 21 Sep 2007 06:11:08 +0000 (08:11 +0200)]
[PATCH] ieee1394: ohci1394: fix initialization if built non-modular
Initialization of ohci1394 was broken according to one reporter if the
driver was statically linked, i.e. not built as loadable module. Dmesg:
PCI: Device 0000:02:07.0 not available because of resource collisions
ohci1394: Failed to enable OHCI hardware.
This was reported for a Toshiba Satellite 5100-503. The cause is commit 8df4083c5291b3647e0381d3c69ab2196f5dd3b7 in Linux 2.6.19-rc1 which only
served purposes of early remote debugging via FireWire. This
functionality is better provided by the currently out-of-tree driver
ohci1394_earlyinit. Reversal of the commit was OK'd by Andi Kleen.
Convert asserts (BUGs) in dx_probe from bad on-disk data to recoverable
errors with helpful warnings. With help catching other asserts from Duane
Griffin <duaneg@dghda.com>
The inode->i_flock list contains the leases, flocks and posix
locks in the specified order. However, the flocks are added in
the head of this list thus hiding the leases from F_GETLEASE
command, from time_out_leases() and other code that expects
the leases to come first.
The futex list traversal on the compat side appears to have
a bug.
It's loop termination condition compares:
while (compat_ptr(uentry) != &head->list)
But that can't be right because "uentry" has the special
"pi" indicator bit still potentially set at bit 0. This
is cleared by fetch_robust_entry() into the "entry"
return value.
What this seems to mean is that the list won't terminate
when list iteration gets back to the the head. And we'll
also process the list head like a normal entry, which could
cause all kinds of problems.
So we should check for equality with "entry". That pointer
is of the non-compat type so we have to do a little casting
to keep the compiler and sparse happy.
The same problem can in theory occur with the 'pending'
variable, although that has not been reported from users
so far.
Based on the original patch from David Miller.
Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Miller <davem@davemloft.net> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[PATCH] V4L: cx88: Avoid a NULL pointer dereference during mpeg_open()
Bug: With a hardware encoder board installed as cx88[1] and a
non-encoder boards installed as cx88[0], an OOPS is generated
during cx8802_get_device() called from mpeg_open().
Signed-off-by: Steven Toth <stoth@hauppauge.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When we flush register state for FP, Altivec, or SPE in flush_*_to_thread
we need to respect the task_struct that the caller has passed to us.
Most cases we are called with current, however sometimes (ptrace) we may
be passed a different task_struct.
This showed up when using gdbserver debugging a simple program that used
floating point. When gdb tried to show the FP regs they all showed up as
0, because the child's FP registers were never properly flushed to memory.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>.
collect_signal: sigqueue_free:
list_del_init(&first->list);
if (!list_empty(&q->list)) {
// not taken
}
q->flags &= ~SIGQUEUE_PREALLOC;
__sigqueue_free(first); __sigqueue_free(q);
Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
obviously bad implications.
In particular, this double free breaks the array_cache->avail logic, so the
same sigqueue could be "allocated" twice, and the bug can manifest itself via
the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.
Hopefully this can explain these mysterious bug-reports, see
Alexey Dobriyan reports this patch makes the difference for the testcase, but
nobody has an access to the application which opened the problems originally.
Also, this patch removes tasklist lock/unlock, ->siglock is enough.
Oliver Neukum [Wed, 22 Aug 2007 22:15:43 +0000 (15:15 -0700)]
[PATCH] USB: fix DoS in pwc USB video driver
the pwc driver has a disconnect method that waits for user space to
close the device. This opens up an opportunity for a DoS attack,
blocking the USB subsystem and making khubd's task busy wait in
kernel space. This patch shifts freeing resources to close if an opened
device is disconnected.
Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Wed, 22 Aug 2007 22:15:42 +0000 (15:15 -0700)]
[PATCH] USB: allow retry on descriptor fetch errors
This patch (as964) was suggested by Steffen Koepf. It makes
usb_get_descriptor() retry on all errors other than ETIMEDOUT, instead
of only on EPIPE. This helps with some devices.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Miller [Wed, 22 Aug 2007 04:14:45 +0000 (21:14 -0700)]
[PATCH] TCP: Do not autobind ports for TCP sockets
[TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg().
As discovered by Evegniy Polyakov, if we try to sendmsg after
a connection reset, we can do incredibly stupid things.
The core issue is that inet_sendmsg() tries to autobind the
socket, but we should never do that for TCP. Instead we should
just go straight into TCP's sendmsg() code which will do all
of the necessary state and pending socket error checks.
TCP's sendpage already directly vectors to tcp_sendpage(), so this
merely brings sendmsg() in line with that.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Herbert Xu [Wed, 22 Aug 2007 04:09:15 +0000 (21:09 -0700)]
[PATCH] NET: Fix missing rcu unlock in __sock_create()
[NET]: Fix unbalanced rcu_read_unlock in __sock_create
The recent RCU work created an unbalanced rcu_read_unlock
in __sock_create. This patch fixes that. Reported by
oleg 123.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Herbert Xu [Wed, 22 Aug 2007 04:07:30 +0000 (21:07 -0700)]
[PATCH] SNAP: Fix SNAP protocol header accesses.
The snap_rcv code reads 5 bytes so we should make sure that
we have 5 bytes in the head before proceeding.
Based on diagnosis and fix by Evgeniy Polyakov, reported by
Alan J. Wylie.
Patch also kills the skb->sk assignment before kfree_skb
since it's redundant.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Chuck Ebbert [Wed, 22 Aug 2007 04:05:14 +0000 (21:05 -0700)]
[PATCH] Netfilter: Missing Kbuild entry for netfilter
Author: Chuck Ebbert <cebbert@redhat.com>
Add xt_statistic.h to the list of headers to install.
Apparently needed to build newer versions of iptables.
Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The problem was that GFP_KERNEL was used; fixed by using gfp_any().
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Francois Romieu [Tue, 14 Aug 2007 22:29:27 +0000 (00:29 +0200)]
[PATCH] r8169: avoid needless NAPI poll scheduling
Theory : though needless, it should not have hurt.
Practice: it does not play nice with DEBUG_SHIRQ + LOCKDEP + UP
(see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=3D242572).
The patch makes sense in itself but I should dig why it has an effect
on #242572 (assuming that NAPI do not change in a near future).
[PATCH] AVR32: Fix atomic_add_unless() and atomic_sub_unless()
These functions depend on "result" being initalized to 0, but "result"
is not included as an input constraint to the inline assembly block
following its initialization, only as an output constraint. Thus gcc
thinks it doesn't need to initialize it, so result ends up undefined
if the "unless" condition is true.
This fixes an oops in sunrpc where the faulty atomics caused
rpciod_up() to not start the workqueue as it should.
Bob Moore [Wed, 15 Aug 2007 18:58:15 +0000 (14:58 -0400)]
[PATCH] ACPICA: Fixed possible corruption of global GPE list
ACPICA: Fixed possible corruption of global GPE list
Fixed a problem in acpi_ev_delete_gpe_xrupt where the global interrupt
list could be corrupted if the interrupt being removed was at
the head of the list. Reported by Linn Crosetto.
Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Tom Alsberg [Tue, 8 May 2007 14:30:31 +0000 (07:30 -0700)]
CPU time limit patch / setrlimit(RLIMIT_CPU, 0) cheat fix
CPU time limit patch / setrlimit(RLIMIT_CPU, 0) cheat fix
As discovered here today, the change in Kernel 2.6.17 intended to inhibit
users from setting RLIMIT_CPU to 0 (as that is equivalent to unlimited) by
"cheating" and setting it to 1 in such a case, does not make a difference,
as the check is done in the wrong place (too late), and only applies to the
profiling code.
On all systems I checked running kernels above 2.6.17, no matter what the
hard and soft CPU time limits were before, a user could escape them by
issuing in the shell (sh/bash/zsh) "ulimit -t 0", and then the user's
process was not ever killed.
Attached is a trivial patch to fix that. Simply moving the check to a
slightly earlier location (specifically, before the line that actually
assigns the limit - *old_rlim = new_rlim), does the trick.
Do note that at least the zsh (but not ash, dash, or bash) shell has the
problem of "caching" the limits set by the ulimit command, so when running
zsh the fix will not immediately be evident - after entering "ulimit -t 0",
"ulimit -a" will show "-t: cpu time (seconds) 0", even though the actual
limit as returned by getrlimit(...) will be 1. It can be verified by
opening a subshell (which will not have the values of the parent shell in
cache) and checking in it, or just by running a CPU intensive command like
"echo '65536^1048576' | bc" and verifying that it dumps core after one
second.
Regardless of whether that is a misfeature in the shell, perhaps it would
be better to return -EINVAL from setrlimit in such a case instead of
cheating and setting to 1, as that does not really reflect the actual state
of the process anymore. I do not however know what the ground for that
decision was in the original 2.6.17 change, and whether there would be any
"backward" compatibility issues, so I preferred not to touch that right
now.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Willy Tarreau [Tue, 28 Aug 2007 05:02:24 +0000 (07:02 +0200)]
[PATCH] Revert "Keep rfcomm_dev on the list until it is freed"
Randy Dunlap and Boris B. Zhmurov reported build failure of
rfcomm/tty.c since 2.6.20.17. Marcel Holtmann confirmed that
the following patch was inappropriate for this kernel. Remove
it.
Marcel Holtmann [Fri, 17 Aug 2007 19:47:58 +0000 (21:47 +0200)]
[PATCH] Reset current->pdeath_signal on SUID binary execution (CVE-2007-3848)
This fixes a vulnerability in the "parent process death signal"
implementation discoverd by Wojciech Purczynski of COSEINC PTE Ltd.
and iSEC Security Research.
Venki Pallipadi [Wed, 20 Jun 2007 21:24:52 +0000 (14:24 -0700)]
[PATCH] CPUFREQ: ondemand: add a check to avoid negative load calculation
Due to rounding and inexact jiffy accounting, idle_ticks can sometimes
be higher than total_ticks. Make sure those cases are handled as
zero load case.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Venki Pallipadi [Wed, 20 Jun 2007 21:26:24 +0000 (14:26 -0700)]
[PATCH] CPUFREQ: ondemand: fix tickless accounting and software coordination bug
With tickless kernel and software coordination os P-states, ondemand
can look at wrong idle statistics. This can happen when ondemand sampling
is happening on CPU 0 and due to software coordination sampling also looks at
utilization of CPU 1. If CPU 1 is in tickless state at that moment, its idle
statistics will not be uptodate and CPU 0 thinks CPU 1 is idle for less
amount of time than it actually is.
This can be resolved by looking at all the busy times of CPUs, which is
accurate, even with tickless, and use that to determine idle time in a
round about way (total time - busy time).
Thanks to Arjan for originally reporting the ondemand bug on
Lenovo T61.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Helge Deller [Fri, 10 Aug 2007 20:00:45 +0000 (13:00 -0700)]
[PATCH] stifb: detect cards in double buffer mode more reliably
Visualize-EG, Graffiti and A4450A graphics cards on PARISC can
be configured in double-buffer and standard mode, but the stifb
driver supports standard mode only.
This patch detects double-buffered cards more reliable.
It is a real bugfix for a very nasty problem for all parisc users which have
wrongly configured their graphic card. The problem: The stifb graphics driver
will not detect that the card is wrongly configured and then nevertheless just
enables the graphics mode, which it shouldn't. In the end, the user will see
no further updates / boot messages on the screen.
We had documented this problem already on our FAQ
(http://parisc-linux.org/faq/index.html#viseg "Why do I get corrupted graphics
with my Vis-EG/Graffiti/A4450A card?") but people still run into this problem.
So having this fix in as early as possible can help us.
Badari Pulavarty [Fri, 10 Aug 2007 20:00:44 +0000 (13:00 -0700)]
[PATCH] direct-io: fix error-path crashes
Need to initialize map_bh.b_state to zero. Otherwise, in case of a faulty
user-buffer its possible to go into dio_zero_block() and submit a page by
mistake - since it checks for buffer_new().
Michael Buesch [Tue, 7 Aug 2007 10:20:40 +0000 (12:20 +0200)]
[PATCH] softmac: Fix deadlock of wx_set_essid with assoc work
The essid wireless extension does deadlock against the assoc mutex,
as we don't unlock the assoc mutex when flushing the workqueue, which
also holds the lock.
If root raised the default wakeup threshold over the size of the
output pool, the pool transfer function could overflow the stack with
RNG bytes, causing a DoS or potential privilege escalation.
(Bug reported by the PaX Team <pageexec@freemail.hu>)
Reading /proc/net/anycast6 when there is no anycast address
on an interface results in an ever-increasing inet6_dev reference
count, as well as a reference to the netdevice you can't get rid of.
Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Marcus Meissner <meissner@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Ville Tervo [Wed, 11 Jul 2007 07:23:41 +0000 (09:23 +0200)]
[PATCH] Keep rfcomm_dev on the list until it is freed
This patch changes the RFCOMM TTY release process so that the TTY is kept
on the list until it is really freed. A new device flag is used to keep
track of released TTYs.
Mikko Rapeli [Wed, 11 Jul 2007 07:18:15 +0000 (09:18 +0200)]
[PATCH] Hangup TTY before releasing rfcomm_dev
The core problem is that RFCOMM socket layer ioctl can release
rfcomm_dev struct while RFCOMM TTY layer is still actively using
it. Calling tty_vhangup() is needed for a synchronous hangup before
rfcomm_dev is freed.
Addresses the oops at http://bugzilla.kernel.org/show_bug.cgi?id=7509
Stefan Bader [Thu, 12 Jul 2007 16:28:33 +0000 (17:28 +0100)]
[PATCH] dm: disable barriers
This patch causes device-mapper to reject any barrier requests. This is done
since most of the targets won't handle this correctly anyway. So until the
situation improves it is better to reject these requests at the first place.
Since barrier requests won't get to the targets, the checks there can be
removed.
Signed-off-by: Stefan Bader <shbader@de.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Milan Broz [Thu, 12 Jul 2007 16:28:13 +0000 (17:28 +0100)]
[PATCH] dm snapshot: permit invalid activation
Allow invalid snapshots to be activated instead of failing.
This allows userspace to reinstate any given snapshot state - for
example after an unscheduled reboot - and clean up the invalid snapshot
at its leisure.
Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
J. Bruce Fields [Tue, 24 Jul 2007 01:43:52 +0000 (18:43 -0700)]
[PATCH] nfsd: fix possible oops on re-insertion of rpcsec_gss modules
The handling of the re-registration case is wrong here; the "test" that was
returned from auth_domain_lookup will not be used again, so that reference
should be put. And auth_domain_lookup never did anything with "new" in
this case, so we should just clean it up ourself.
Thanks to Akinobu Mita for bug report, analysis, and testing.
Cc: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
[PATCH] do not limit locked memory when RLIMIT_MEMLOCK is RLIM_INFINITY
Fix a bug in mm/mlock.c on 32-bit architectures that prevents a user from
locking more than 4GB of shared memory, or allocating more than 4GB of
shared memory in hugepages, when rlim[RLIMIT_MEMLOCK] is set to
RLIM_INFINITY.
Signed-off-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Acked-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
[PATCH] acpi-cpufreq: Proper ReadModifyWrite of PERF_CTL MSR
[CPUFREQ] acpi-cpufreq: Proper ReadModifyWrite of PERF_CTL MSR
During recent acpi-cpufreq changes, writing to PERF_CTL msr
changed from RMW of entire 64 bit to RMW of low 32 bit and clearing of
upper 32 bit. Fix it back to do a proper RMW of the MSR.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Define two convenient macros for read-ahead:
- MAX_RA_PAGES: rounded down counterpart of VM_MAX_READAHEAD
- MIN_RA_PAGES: rounded _up_ counterpart of VM_MIN_READAHEAD
Note that the rounded up MIN_RA_PAGES will work flawlessly with _large_
page sizes like 64k.
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Cc: Steven Pratt <slpratt@austin.ibm.com> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
J. Bruce Fields [Thu, 19 Jul 2007 08:49:18 +0000 (01:49 -0700)]
[PATCH] nfsd: fix possible read-ahead cache and export table corruption
The value of nperbucket calculated here is too small--we should be rounding up
instead of down--with the result that the index j in the following loop can
overflow the raparm_hash array. At least in my case, the next thing in memory
turns out to be export_table, so the symptoms I see are crashes caused by the
appearance of four zeroed-out export entries in the first bucket of the hash
table of exports (which were actually entries in the readahead cache, a
pointer to which had been written to the export table in this initialization
code).
Jean Tourrilhes [Tue, 17 Jul 2007 15:46:33 +0000 (10:46 -0500)]
[PATCH] softmac: Fix ESSID problem
Victor Porton reported that the SoftMAC layer had random problem when setting the ESSID :
http://bugzilla.kernel.org/show_bug.cgi?id=8686 After investigation, it turned out to be
worse, the SoftMAC layer is left in an inconsistent state. The fix is pretty trivial.
Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com> Acked-by: Michael Buesch <mb@bu3sch.de> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
[PATCH] Include serial_reg.h with userspace headers
As reported by Gustavo de Nardin <gustavodn@mandriva.com.br>, while trying to
compile xosview (http://xosview.sourceforge.net/) with upstream kernel
headers being used you get the following errors:
serialmeter.cc:48:30: error: linux/serial_reg.h: No such file or directory
serialmeter.cc: In member function 'virtual void
SerialMeter::checkResources()':
serialmeter.cc:71: error: 'UART_LSR' was not declared in this scope
serialmeter.cc:71: error: 'UART_MSR' was not declared in this scope
...
Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br> Cc: Gustavo de Nardin <gustavodn@mandriva.com.br> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Mingming Cao [Tue, 31 Jul 2007 07:37:46 +0000 (00:37 -0700)]
[PATCH] "ext4_ext_put_in_cache" uses __u32 to receive physical block number
Yan Zheng wrote:
> I think I found a bug in ext4/extents.c, "ext4_ext_put_in_cache" uses
> "__u32" to receive physical block number. "ext4_ext_put_in_cache" is
> used in "ext4_ext_get_blocks", it sets ext4 inode's extent cache
> according most recently tree lookup (higher 16 bits of saved physical
> block number are always zero). when serving a mapping request,
> "ext4_ext_get_blocks" first check whether the logical block is in
> inode's extent cache. if the logical block is in the cache and the
> cached region isn't a gap, "ext4_ext_get_blocks" gets physical block
> number by using cached region's physical block number and offset in
> the cached region. as described above, "ext4_ext_get_blocks" may
> return wrong result when there are physical block numbers bigger than
> 0xffffffff.
>
You are right. Thanks for reporting this!
Signed-off-by: Mingming Cao <cmm@us.ibm.com> Cc: Yan Zheng <yanzheng@21cn.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Arne Redlich [Tue, 31 Jul 2007 07:37:57 +0000 (00:37 -0700)]
[PATCH] md: handle writes to broken raid10 arrays gracefully
When writing to a broken array, raid10 currently happily emits empty bio
lists. IOW, the master bio will never be completed, sending writers to
UNINTERRUPTIBLE_SLEEP forever.
Signed-off-by: Arne Redlich <agr@powerkom-dd.de> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Pavel Emelianov [Tue, 31 Jul 2007 07:38:48 +0000 (00:38 -0700)]
[PATCH] Fix user struct leakage with locked IPC shem segment
When user locks an ipc shmem segmant with SHM_LOCK ctl and the segment is
already locked the shmem_lock() function returns 0. After this the
subsequent code leaks the existing user struct:
Other results of this are:
1. the new shp->mlock_user is not get-ed and will point to freed
memory when the task dies.
2. the RLIMIT_MEMLOCK is screwed on both user structs.
Is there a reason why the "online" file in the subdirectories for the CPUs
in /sys/devices/system isn't world-readable? I cannot imagine it to be
security relevant especially now that a getcpu() syscall can be used to
determine what CPUa thread runs on.
The file is useful to correctly implement the sysconf() function to return
the number of online CPUs. In the presence of hotplug we currently cannot
provide this information. The patch below should to it.
This 965G and above chipsets moved the batch buffer non-secure bits to
another place. This means that previous drm's allowed in-secure batchbuffers
to be submitted to the hardware from non-privileged users who are logged
into X and and have access to direct rendering.
If add_to_page_cache_lru() fails, the page will not be locked. But
splice jumps to an error path that does a page release and unlock,
causing a BUG() in unlock_page().
Fix this by adding one more label that just releases the page. This bug
was actually triggered on EL5 by gurudas pai <gurudas.pai@oracle.com>
using fio.
Hans Verkuil [Tue, 24 Jul 2007 12:07:17 +0000 (08:07 -0400)]
[PATCH] V4L: Add check for valid control ID to v4l2_ctrl_next
If v4l2_ctrl_next is called without the V4L2_CTRL_FLAG_NEXT_CTRL then it
should check whether the passed control ID is valid and return 0 if it
isn't. Otherwise a for-loop over the control IDs will never end.
Alan Cox [Mon, 23 Jul 2007 13:51:05 +0000 (14:51 +0100)]
[PATCH] aacraid: fix security hole
On the SCSI layer ioctl path there is no implicit permissions check for
ioctls (and indeed other drivers implement unprivileged ioctls). aacraid
however allows all sorts of very admin only things to be done so should
check.
Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: Mark Salyzyn <mark_salyzyn@adaptec.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Alan Stern [Fri, 20 Jul 2007 03:44:51 +0000 (20:44 -0700)]
[PATCH] USB: fix warning caused by autosuspend counter going negative
This patch (as937) fixes a minor bug in the autosuspend usage-counting
code. Each hub's usage counter keeps track of the number of
unsuspended children. However the current driver increments the
counter after registering a new child, by which time the child may
already have been suspended and caused the counter to go negative.
The obvious solution is to increment the counter before registering
the child.
[PATCH] KVM: SVM: Reliably detect if SVM was disabled by BIOS
This patch adds an implementation to the svm is_disabled function to
detect reliably if the BIOS disabled the SVM feature in the CPU. This
fixes the issues with kernel panics when loading the kvm-amd module on
machines where SVM is available but disabled.
[TCPv6] MD5SIG: Ensure to reset allocation count to avoid panic.
After clearing all passwords for IPv6 peers, we need to
set allocation count to zero as well as we free the storage.
Otherwise, we panic when a user trys to (re)add a password.
Discovered and fixed by MIYAJIMA Mitsuharu <miyajima.mitsuharu@anchor.jp>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Mark Fortescue [Wed, 25 Jul 2007 04:45:44 +0000 (21:45 -0700)]
[PATCH] Fix sparc32 udelay() rounding errors.
[SPARC32]: Fix rounding errors in ndelay/udelay implementation.
__ndelay and __udelay have not been delayung >= specified time.
The problem with __ndelay has been tacked down to the rounding of the
multiplier constant. By changing this, delays > app 18us are correctly
calculated.
The problem with __udelay has also been tracked down to rounding issues.
Changing the multiplier constant (to match that used in sparc64) corrects
for large delays and adding in a rounding constant corrects for trunctaion
errors in the claculations.
Many short delays will return without looping. This is not an error as there
is the fixed delay of doing all the maths to calculate the loop count.
Signed-off-by: Mark Fortescue <mark@mtfhpc.demon.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Sparc optimized memset (arch/sparc/lib/memset.S) does not fill last
byte of the memory area, if area size is less than 8 bytes and start
address is not word (4-bytes) aligned.
Here is code chunk where bug located:
/* %o0 - memory address, %o1 - size, %g3 - value */
8:
add %o0, 1, %o0
subcc %o1, 1, %o1
bne,a 8b
stb %g3, [%o0 - 1]
This code should write byte every loop iteration, but last time delay
instruction stb is not executed because branch instruction sets
"annul" bit.
Signed-off-by: Alexander Shmelev <ashmelev@task.sun.mcst.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
David S. Miller [Fri, 20 Jul 2007 05:06:09 +0000 (22:06 -0700)]
[PATCH] Sparc64 bootup assembler bug
[SPARC64]: Fix two year old bug in early bootup asm.
We try to fetch the CIF entry pointer from %o4, but that
can get clobbered by the early OBP calls. It is saved
in %l7 already, so actually this "mov %o4, %l7" can just
be completely removed with no other changes.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
93ec2c723e3f8a216dde2899aeb85c648672bc6b applied excessive duct tape to
the netpoll beast's netpoll_cleanup(), thus substituting one leak with
another, and opening up a little buglet :-)
net_device->npinfo (netpoll_info) is a shared and refcounted object and
cannot simply be set NULL the first time netpoll_cleanup() is called.
Otherwise, further netpoll_cleanup()'s see np->dev->npinfo == NULL and
become no-ops, thus leaking. And it's a bug too: the first call to
netpoll_cleanup() would thus (annoyingly) "disable" other (still alive)
netpolls too. Maybe nobody noticed this because netconsole (only user
of netpoll) never supported multiple netpoll objects earlier.
This is a trivial and obvious one-line fixlet.
Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
[IPV6]: Call inet6addr_chain notifiers on link down
Currently if the link is brought down via ip link or ifconfig down,
the inet6addr_chain notifiers are not called even though all
the addresses are removed from the interface. This caused SCTP
to add duplicate addresses to it's list.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
[IPV6]: MSG_ERRQUEUE messages do not pass to connected raw sockets
From: Dmitry Butskoy <dmitry@butskoy.name>
Taken from http://bugzilla.kernel.org/show_bug.cgi?id=8747
Problem Description:
It is related to the possibility to obtain MSG_ERRQUEUE messages from the udp
and raw sockets, both connected and unconnected.
There is a little typo in net/ipv6/icmp.c code, which prevents such messages
to be delivered to the errqueue of the correspond raw socket, when the socket
is CONNECTED. The typo is due to swap of local/remote addresses.
Consider __raw_v6_lookup() function from net/ipv6/raw.c. When a raw socket is
looked up usual way, it is something like:
sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif);
where "daddr" is a destination address of the incoming packet (IOW our local
address), "saddr" is a source address of the incoming packet (the remote end).
But when the raw socket is looked up for some icmp error report, in
net/ipv6/icmp.c:icmpv6_notify() , daddr/saddr are obtained from the echoed
fragment of the "bad" packet, i.e. "daddr" is the original destination
address of that packet, "saddr" is our local address. Hence, for
icmpv6_notify() must use "saddr, daddr" in its arguments, not "daddr, saddr"
...
Steps to reproduce:
Create some raw socket, connect it to an address, and cause some error
situation: f.e. set ttl=1 where the remote address is more than 1 hop to reach.
Set IPV6_RECVERR .
Then send something and wait for the error (f.e. poll() with POLLERR|POLLIN).
You should receive "time exceeded" icmp message (because of "ttl=1"), but the
socket do not receive it.
If you do not connect your raw socket, you will receive MSG_ERRQUEUE
successfully. (The reason is that for unconnected socket there are no actual
checks for local/remote addresses).
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
-Fixes ABBA deadlock noted by Patrick McHardy <kaber@trash.net>:
> There is at least one ABBA deadlock, est_timer() does:
> read_lock(&est_lock)
> spin_lock(e->stats_lock) (which is dev->queue_lock)
>
> and qdisc_destroy calls htb_destroy under dev->queue_lock, which
> calls htb_destroy_class, then gen_kill_estimator and this
> write_locks est_lock.
To fix the ABBA deadlock the rate estimators are now kept on an rcu list.
-The est_lock changes the use from protecting the list to protecting
the update to the 'bstat' pointer in order to avoid NULL dereferencing.
-The 'interval' member of the gen_estimator structure removed as it is
not needed.
Signed-off-by: Ranko Zivojnovic <ranko@spidernet.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
SCTP: Add scope_id validation for link-local binds
SCTP currently permits users to bind to link-local addresses,
but doesn't verify that the scope id specified at bind matches
the interface that the address is configured on. It was report
that this can hang a system.
Adrian Bunk [Wed, 18 Jul 2007 09:37:05 +0000 (02:37 -0700)]
[PATCH] Missing header include in ipt_iprange.h
[NETFILTER]: ipt_iprange.h must #include <linux/types.h>
ipt_iprange.h must #include <linux/types.h> since it uses __be32.
This patch fixes kernel Bugzilla #7604.
Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
Patrick McHardy [Wed, 18 Jul 2007 09:26:27 +0000 (02:26 -0700)]
[PATCH] Fix IPCOMP crashes.
[XFRM]: Fix crash introduced by struct dst_entry reordering
XFRM expects xfrm_dst->u.next to be same pointer as dst->next, which
was broken by the dst_entry reordering in commit 1e19e02c~, causing
an oops in xfrm_bundle_ok when walking the bundle upwards.
Kill xfrm_dst->u.next and change the only user to use dst->next instead.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>