]> git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
16 years agoLinux 2.6.24.4 v2.6.24.4
Chris Wright [Mon, 24 Mar 2008 18:49:18 +0000 (11:49 -0700)]
Linux 2.6.24.4

16 years agoS390 futex: let futex_atomic_cmpxchg_pt survive early functional tests.
Heiko Carstens [Thu, 20 Mar 2008 16:33:38 +0000 (17:33 +0100)]
S390 futex: let futex_atomic_cmpxchg_pt survive early functional tests.

a0c1e9073ef7428a14309cba010633a6cd6719ea "futex: runtime enable pi and
robust functionality" introduces a test wether futex in atomic stuff
works or not.
It does that by writing to address 0 of the kernel address space. This
will crash on older machines where addressing mode switching is enabled
but where the mvcos instruction is not available. Page table walking is
done by hand and therefore the code tries to access current->mm which
is NULL.
Therefore add an extra check, so we survive the early test.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoslab: NUMA slab allocator migration bugfix
Joe Korty [Wed, 5 Mar 2008 23:04:59 +0000 (15:04 -0800)]
slab: NUMA slab allocator migration bugfix

NUMA slab allocator cpu migration bugfix

The NUMA slab allocator (specifically, cache_alloc_refill)
is not refreshing its local copies of what cpu and what
numa node it is on, when it drops and reacquires the irq
block that it inherited from its caller.  As a result
those values become invalid if an attempt to migrate the
process to another numa node occured while the irq block
had been dropped.

The solution is to make cache_alloc_refill reload these
variables whenever it drops and reacquires the irq block.

The error is very difficult to hit.  When it does occur,
one gets the following oops + stack traceback bits in
check_spinlock_acquired:

kernel BUG at mm/slab.c:2417
cache_alloc_refill+0xe6
kmem_cache_alloc+0xd0
...

This patch was developed against 2.6.23, ported to and
compiled-tested only against 2.6.25-rc4.

Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agorelay: fix subbuf_splice_actor() adding too many pages
Jens Axboe [Mon, 17 Mar 2008 08:04:59 +0000 (09:04 +0100)]
relay: fix subbuf_splice_actor() adding too many pages

If subbuf_pages was larger than the max number of pages the pipe
buffer will hold, subbuf_splice_actor() would happily go beyond
the array size.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoBLUETOOTH: Fix bugs in previous conn add/del workqueue changes.
Dave Young [Fri, 1 Feb 2008 02:33:10 +0000 (18:33 -0800)]
BLUETOOTH: Fix bugs in previous conn add/del workqueue changes.

Jens Axboe noticed that we were queueing &conn->work on both btaddconn
and keventd_wq.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSCSI advansys: Fix bug in AdvLoadMicrocode
Matthew Wilcox [Fri, 21 Mar 2008 15:15:03 +0000 (15:15 +0000)]
SCSI advansys: Fix bug in AdvLoadMicrocode

commit: 951b62c11e86acf8c55d9828aa8c921575023c29

buf[i] can be up to 0xfd, so doubling it and assigning the result to an
unsigned char truncates the value.  Just use an unsigned int instead;
it's only a temporary.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoasync_tx: avoid the async xor_zero_sum path when src_cnt > device->max_xor
Dan Williams [Wed, 19 Mar 2008 04:40:04 +0000 (04:40 +0000)]
async_tx: avoid the async xor_zero_sum path when src_cnt > device->max_xor

commit: 8d8002f642886ae256a3c5d70fe8aff4faf3631a

If the channel cannot perform the operation in one call to
->device_prep_dma_zero_sum, then fallback to the xor+page_is_zero path.
This only affects users with arrays larger than 16 devices on iop13xx or
32 devices on iop3xx.

Cc: <stable@kernel.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoaio: bad AIO race in aio_complete() leads to process hang
Quentin Barnes [Thu, 20 Mar 2008 02:45:07 +0000 (02:45 +0000)]
aio: bad AIO race in aio_complete() leads to process hang

commit: 6cb2a21049b8990df4576c5fce4d48d0206c22d5

My group ran into a AIO process hang on a 2.6.24 kernel with the process
sleeping indefinitely in io_getevents(2) waiting for the last wakeup to come
and it never would.

We ran the tests on x86_64 SMP.  The hang only occurred on a Xeon box
("Clovertown") but not a Core2Duo ("Conroe").  On the Xeon, the L2 cache isn't
shared between all eight processors, but is L2 is shared between between all
two processors on the Core2Duo we use.

My analysis of the hang is if you go down to the second while-loop
in read_events(), what happens on processor #1:
1) add_wait_queue_exclusive() adds thread to ctx->wait
2) aio_read_evt() to check tail
3) if aio_read_evt() returned 0, call [io_]schedule() and sleep

In aio_complete() with processor #2:
A) info->tail = tail;
B) waitqueue_active(&ctx->wait)
C) if waitqueue_active() returned non-0, call wake_up()

The way the code is written, step 1 must be seen by all other processors
before processor 1 checks for pending events in step 2 (that were recorded by
step A) and step A by processor 2 must be seen by all other processors
(checked in step 2) before step B is done.

The race I believed I was seeing is that steps 1 and 2 were
effectively swapped due to the __list_add() being delayed by the L2
cache not shared by some of the other processors.  Imagine:
proc 2: just before step A
proc 1, step 1: adds to ctx->wait, but is not visible by other processors yet
proc 1, step 2: checks tail and sees no pending events
proc 2, step A: updates tail
proc 1, step 3: calls [io_]schedule() and sleeps
proc 2, step B: checks ctx->wait, but sees no one waiting, skips wakeup
                so proc 1 sleeps indefinitely

My patch adds a memory barrier between steps A and B.  It ensures that the
update in step 1 gets seen on processor 2 before continuing.  If processor 1
was just before step 1, the memory barrier makes sure that step A (update
tail) gets seen by the time processor 1 makes it to step 2 (check tail).

Before the patch our AIO process would hang virtually 100% of the time.  After
the patch, we have yet to see the process ever hang.

Signed-off-by: Quentin Barnes <qbarnes+linux@yahoo-inc.com>
Reviewed-by: Zach Brown <zach.brown@oracle.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: <stable@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ We should probably disallow that "if (waitqueue_active()) wake_up()"
  coding pattern, because it's so often buggy wrt memory ordering ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agojbd: correctly unescape journal data blocks
Duane Griffin [Thu, 20 Mar 2008 02:45:06 +0000 (02:45 +0000)]
jbd: correctly unescape journal data blocks

commit: 439aeec639d7c57f3561054a6d315c40fd24bb74

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JFS_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agojbd2: correctly unescape journal data blocks
Duane Griffin [Thu, 20 Mar 2008 02:45:05 +0000 (02:45 +0000)]
jbd2: correctly unescape journal data blocks

commit: d00256766a0b4f1441931a7f569a13edf6c68200

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JBD2_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agozisofs: fix readpage() outside i_size
Dave Young [Thu, 20 Mar 2008 02:45:04 +0000 (02:45 +0000)]
zisofs: fix readpage() outside i_size

commit: 08ca0db8aa2db4ddcf487d46d85dc8ffb22162cc

A read request outside i_size will be handled in do_generic_file_read().  So
we just return 0 to avoid getting -EIO as normal reading, let
do_generic_file_read do the rest.

At the same time we need unlock the page to avoid system stuck.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10227

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Report-by: Christian Perle <chris@linuxinfotag.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: nfnetlink_log: fix computation of netlink skb size
Eric Leblond [Mon, 17 Mar 2008 14:41:47 +0000 (15:41 +0100)]
NETFILTER: nfnetlink_log: fix computation of netlink skb size

Upstream commit 7000d38d:

This patch is similar to nfnetlink_queue fixes. It fixes the computation
of skb size by using NLMSG_SPACE instead of NLMSG_ALIGN.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: nfnetlink_queue: fix computation of allocated size for netlink skb
Eric Leblond [Mon, 17 Mar 2008 14:41:46 +0000 (15:41 +0100)]
NETFILTER: nfnetlink_queue: fix computation of allocated size for netlink skb

Upstream commit cabaa9bf:

Size of the netlink skb was wrongly computed because the formula was using
NLMSG_ALIGN instead of NLMSG_SPACE. NLMSG_ALIGN does not add the room for
netlink header as NLMSG_SPACE does. This was causing a failure of message
building in some cases.

On my test system, all messages for packets in range [8*k+41, 8*k+48] where k
is an integer were invalid and the corresponding packets were dropped.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: xt_time: fix failure to match on Sundays
Jan Engelhardt [Mon, 17 Mar 2008 14:41:44 +0000 (15:41 +0100)]
NETFILTER: xt_time: fix failure to match on Sundays

Upstream commit 4f4c9430:

xt_time_match() in net/netfilter/xt_time.c in kernel 2.6.24 never
matches on Sundays. On my host I have a rule like

iptables -A OUTPUT -m time --weekdays Sun -j REJECT

and it never matches. The problem is in localtime_2(), which uses

    r->weekday = (4 + r->dse) % 7;

to map the epoch day onto a weekday in {0,...,6}. In particular this
gives 0 for Sundays. But 0 has to be wrong; a weekday of 0 can never
match. xt_time_match() has

    if (!(info->weekdays_match & (1 << current_time.weekday)))
        return false;

and when current_time.weekday = 0, the result of the & is always
zero, even when info->weekdays_match = XT_TIME_ALL_WEEKDAYS = 0xFE.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agosched_nr_migrate wrong mode bits
Michal Schmidt [Mon, 17 Mar 2008 23:13:16 +0000 (00:13 +0100)]
sched_nr_migrate wrong mode bits

sched_nr_migrate has strange permission bits:

 $ ls -l /proc/sys/kernel/sched_nr_migrate
 --w----r-T 1 root root 0 2008-03-17 23:31 /proc/sys/kernel/sched_nr_migrate

The bug is an obvious decimal/octal confusion.

Fixed (collaterally) in Linus's tree by Peter Zijlstra with commit fa85ae241
"sched: rt time limit" (in 2.6.25-rc1).

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agonfsd: fix oops on access from high-numbered ports
J. Bruce Fields [Fri, 14 Mar 2008 23:37:11 +0000 (19:37 -0400)]
nfsd: fix oops on access from high-numbered ports

This bug was always here, but before my commit 6fa02839bf9412e18e77
("recheck for secure ports in fh_verify"), it could only be triggered by
failure of a kmalloc().  After that commit it could be triggered by a
client making a request from a non-reserved port for access to an export
marked "secure".  (Exports are "secure" by default.)

The result is a struct svc_export with a reference count one too low,
resulting in likely oopses next time the export is accessed.

The reference counting here is not straightforward; a later patch will
clean up fh_verify().

Thanks to Lukas Hejtmanek for the bug report and followup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agosched: fix race in schedule()
Hiroshi Shimamoto [Mon, 10 Mar 2008 18:01:20 +0000 (11:01 -0700)]
sched: fix race in schedule()

Fix a hard to trigger crash seen in the -rt kernel that also affects
the vanilla scheduler.

There is a race condition between schedule() and some dequeue/enqueue
functions; rt_mutex_setprio(), __setscheduler() and sched_move_task().

When scheduling to idle, idle_balance() is called to pull tasks from
other busy processor. It might drop the rq lock. It means that those 3
functions encounter on_rq=0 and running=1. The current task should be
put when running.

Here is a possible scenario:

   CPU0                               CPU1
    |                              schedule()
    |                              ->deactivate_task()
    |                              ->idle_balance()
    |                              -->load_balance_newidle()
rt_mutex_setprio()                     |
    |                              --->double_lock_balance()
    *get lock                          *rel lock
    * on_rq=0, ruuning=1               |
    * sched_class is changed           |
    *rel lock                          *get lock
    :                                  |
                                       :
                                   ->put_prev_task_rt()
                                   ->pick_next_task_fair()
                                       => panic

The current process of CPU1(P1) is scheduling. Deactivated P1, and the
scheduler looks for another process on other CPU's runqueue because CPU1
will be idle. idle_balance(), load_balance_newidle() and
double_lock_balance() are called and double_lock_balance() could drop
the rq lock. On the other hand, CPU0 is trying to boost the priority of
P1. The result of boosting only P1's prio and sched_class are changed to
RT. The sched entities of P1 and P1's group are never put. It makes
cfs_rq invalid, because the cfs_rq has curr and no leaf, but
pick_next_task_fair() is called, then the kernel panics.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI: mpt fusion: don't oops if NumPhys==0
Krzysztof Oledzki [Tue, 4 Mar 2008 22:56:23 +0000 (14:56 -0800)]
SCSI: mpt fusion: don't oops if NumPhys==0

Don't oops if NumPhys==0, instead return -ENODEV.
This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9909

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Acked-by: Eric Moore <Eric.Moore@lsi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI: gdth: fix to internal commands execution
Boaz Harrosh [Thu, 6 Mar 2008 02:10:13 +0000 (02:10 +0000)]
SCSI: gdth: fix to internal commands execution

commit: ee54cc6af95a7fa09da298493b853a9e64fa8abd

The recent patch named:
  [SCSI] gdth: !use_sg cleanup and use of scsi accessors

has done a bad job in handling internal commands issued by gdth_execute().

Internal commands are issued with device gdth_cmd_str ready made directly
to the card, without any mapping or translations of scsi commands. So here
I added a gdth_cmd_str pointer to the gdth_cmndinfo private structure which
is then copied directly to host.

following this patch is a cleanup that removes the home cooked accessors
and reverts them to regular scsi_cmnd accessors. Since they are not used
anymore. After review maybe the 2 patches should be squashed together.

FIXME: There is still a problem with gdth_get_info(). as reported there
   is a WARN_ON trigerd in dma_free_coherent() when doing:
   $ cat /proc/sys/gdth/0

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Joerg Dorchain: <joerg@dorchain.net>
Tested-by: Stefan Priebe <s.priebe@allied-internet.ag>
Tested-by: Jon Chelton <jchelton@ffpglobal.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI: gdth: bugfix for the at-exit problems
Boaz Harrosh [Thu, 6 Mar 2008 02:10:15 +0000 (02:10 +0000)]
SCSI: gdth: bugfix for the at-exit problems

commit: b31ddd31c266c2ad1b708cad0d3d8e0aa7fa2737

gdth_exit would first remove all cards then stop the timer
and would not sync with the timer function. This caused a crash
in gdth_timer() when module was unloaded.
So del_timer_sync the timer before we delete the cards.

also the reboot notifier function would crash. So clean
that up and fix the crashes.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Joerg Dorchain: <joerg@dorchain.net>
Tested-by: Stefan Priebe <s.priebe@allied-internet.ag>
Tested-by: Jon Chelton <jchelton@ffpglobal.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoFix default compose table initialization
Samuel Thibault [Mon, 3 Mar 2008 01:23:49 +0000 (01:23 +0000)]
Fix default compose table initialization

commit: 5ce2087ed0eb424e0889bdc9102727f65d2ecdde

Oddly enough, unsigned int c = '\300'; puts a "negative" value in c, not
0300...  This fixes the default unicode compose table by using integers
instead of character constants.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fold in upstream commit 10a7f3135ac4937a3dc8ed11614a2b70cbd44728 (Build
fix for drivers/s390/char/defkeymap.c) from Tony Breeds.

Commit 5ce2087ed0eb424e0889bdc9102727f65d2ecdde (Fix default compose
table initialization) left a trailing quote.

  CC      drivers/s390/char/defkeymap.o
drivers/s390/char/defkeymap.c:155: error: missing terminating ' character
drivers/s390/char/defkeymap.c:156: error: syntax error before ';' token
make[3]: *** [drivers/s390/char/defkeymap.o] Error 1

Fix that.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: don't use P6_NOPs if compiling with CONFIG_X86_GENERIC
H. Peter Anvin [Tue, 18 Mar 2008 23:23:07 +0000 (19:23 -0400)]
x86: don't use P6_NOPs if compiling with CONFIG_X86_GENERIC

Commit: 959b3be64cab9160cd74532a49b89cdd918d38e9

x86: don't use P6_NOPs if compiling with CONFIG_X86_GENERIC

P6_NOPs are definitely not supported on some VIA CPUs, and possibly
(unverified) on AMD K7s.  It is also the only thing that prevents a
686 kernel from running on Transmeta TM3x00/5x00 (Crusoe) series.

The performance benefit over generic NOPs is very small, so when
building for generic consumption, avoid using them.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[cebbert@redhat.com: backport take 2, with parens this time]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI: fix BUG when sum(scatterlist) > bufflen
Tony Battersby [Wed, 5 Mar 2008 16:23:26 +0000 (10:23 -0600)]
SCSI: fix BUG when sum(scatterlist) > bufflen

commit: 4d2de3a50ce19af2008a90636436a1bf5b3b697b

When sending a SCSI command to a tape drive via the SCSI Generic (sg)
driver, if the command has a data transfer length more than
scatter_elem_sz (32 KB default) and not a multiple of 512, then I either
hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else
the command never completes (depending on the LLDD).

When constructing scatterlists, the sg driver rounds up the scatterlist
element sizes to be a multiple of 512.  This can result in
sum(scatterlist lengths) > bufflen.  In this case, scsi_req_map_sg()
incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to
bufflen.  When the command completes, req_bio_endio() detects that
bio->bi_size != 0, and so it doesn't call bio_endio().  This causes the
command to be resubmitted, resulting in BUG_ON or the command never
completing.

This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather
than to sum(scatterlist lengths), which fixes the problem.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: ehci: handle large bulk URBs correctly (again)
Misha Zhilin [Wed, 5 Mar 2008 00:50:27 +0000 (00:50 +0000)]
USB: ehci: handle large bulk URBs correctly (again)

commit: b5f7a0ec11694e60c99d682549dfaf8a03d7ad97

USB: ehci: Fixes completion for multi-qtd URB the short read case

When use of urb->status in the EHCI driver was reworked last August
(commit 14c04c0f88f228fee1f412be91d6edcb935c78aa), a bug was inserted
in the handling of early completion for bulk transactions that need
more than one qTD (e.g. more than 20KB in one URB).

This patch resolves that problem by ensuring that the early completion
status is preserved until the URB is handed back to its submitter,
instead of resetting it after each qTD.

Signed-off-by: Misha Zhilin <misha@epiphan.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoUSB: ftdi_sio - really enable EM1010PC
Sven Andersen [Tue, 4 Mar 2008 21:09:11 +0000 (22:09 +0100)]
USB: ftdi_sio - really enable EM1010PC

[upstream commit: 4ae897df]

Add EM1010PC to ftdi_sio.c

Signed-off-by: Sven Andersen <s.andersen@cryonet.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: ftdi_sio: Workaround for broken Matrix Orbital serial port
Kevin Vance [Wed, 5 Mar 2008 00:50:26 +0000 (00:50 +0000)]
USB: ftdi_sio: Workaround for broken Matrix Orbital serial port

commit: 546d7eec389a3df3173b3131d92829c14e614601

Workaround for the FT232RL-based, Matrix Orbital VK204-25-USB serial port
added to the ftdi_sio driver.

The device has an invalid endpoint descriptor, which must be modified
before it can be used.

Signed-off-by: Kevin Vance <kvance@kvance.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[chrisw@sous-sol.org: backport to 2.6.24.3 w/out ftdi_jtag_probe]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoVT notifier fix for VT switch
Samuel Thibault [Wed, 5 Mar 2008 00:50:23 +0000 (00:50 +0000)]
VT notifier fix for VT switch

commit: 8182ec49a73729334f5a6c65a607ba7009ebd6d6

VT notifier callbacks need to be aware of console switches.  This is already
partially done from console_callback(), but at that time fg_console, cursor
positions, etc.  are not yet updated and hence screen readers fetch the old
values.

This adds an update notify after all of the values are updated in
redraw_screen(vc, 1).

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoeCryptfs: make ecryptfs_prepare_write decrypt the page
Michael Halcrow [Wed, 5 Mar 2008 00:50:13 +0000 (00:50 +0000)]
eCryptfs: make ecryptfs_prepare_write decrypt the page

commit: e4465fdaeb3f7b5ef47f389d3eac76db79ff20d8

When the page is not up to date, ecryptfs_prepare_write() should be
acting much like ecryptfs_readpage(). This includes the painfully
obvious step of actually decrypting the page contents read from the
lower encrypted file.

Note that this patch resolves a bug in eCryptfs in 2.6.24 that one can
produce with these steps:

# mount -t ecryptfs /secret /secret
# echo "abc" > /secret/file.txt
# umount /secret
# mount -t ecryptfs /secret /secret
# echo "def" >> /secret/file.txt
# cat /secret/file.txt

Without this patch, the resulting data returned from cat is likely to
be something other than "abc\ndef\n".

(Thanks to Benedikt Driessen for reporting this.)

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Benedikt Driessen <bdriessen@escrypt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoioat: fix 'ack' handling, driver must ensure that 'ack' is zero
Dan Williams [Tue, 4 Mar 2008 19:40:09 +0000 (19:40 +0000)]
ioat: fix 'ack' handling, driver must ensure that 'ack' is zero

commit: 6497dcffe07b7c3d863f9899280c4f6eae999161

Initialize 'ack' to zero in case the descriptor has been recycled.

Prevents "kernel BUG at crypto/async_tx/async_xor.c:185!"

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agomacb: Fix speed setting
Atsushi Nemoto [Mon, 3 Mar 2008 15:51:48 +0000 (16:51 +0100)]
macb: Fix speed setting

commit: 179956f498bd8cc55fb803c4ee0cf18be59c8b01

Fix NCFGR.SPD setting on 10Mbps.  This bug was introduced by
conversion to generic PHY layer in kernel 2.6.23.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: move out tick_nohz_stop_sched_tick() call from the loop
Hiroshi Shimamoto [Wed, 30 Jan 2008 12:33:00 +0000 (13:33 +0100)]
x86: move out tick_nohz_stop_sched_tick() call from the loop

[upstream commit: 3d97775a]

Move out tick_nohz_stop_sched_tick() call from the loop in cpu_idle
same as 32-bit version.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoatmel_spi: fix clock polarity
Atsushi Nemoto [Fri, 29 Feb 2008 14:16:16 +0000 (15:16 +0100)]
atmel_spi: fix clock polarity

commit: f6febccd7f86fbe94858a4a32d9384cc014c9f40

The atmel_spi driver does not initialize clock polarity correctly (except for
at91rm9200 CS0 channel) in some case.

The atmel_spi driver uses gpio-controlled chipselect.  OTOH spi clock signal
is controlled by CSRn.CPOL bit, but this register controls clock signal
correctly only in 'real transfer' duration.  At the time of cs_activate()
call, CSRn.CPOL will be initialized correctly, but the controller do not know
which channel is to be used next, so clock signal will stay at the inactive
state of last transfer.  If clock polarity of new transfer and last transfer
was differ, new transfer will start with wrong clock signal state.

For example, if you started SPI MODE 2 or 3 transfer after SPI MODE 0 or 1
transfer, the clock signal state at the assertion of chipselect will be low.
Of course this will violates SPI transfer.

This patch is short term solution for this problem.  It makes all CSRn.CPOL
match for the transfer before activating chipselect.  For longer term, the
best fix might be to let NPCS0 stay selected permanently in MR and overwrite
CSR0 with to the new slave's settings before asserting CS.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agob43: Backport bcm4311 fix
Michael Buesch [Fri, 29 Feb 2008 11:55:41 +0000 (12:55 +0100)]
b43: Backport bcm4311 fix

This is a backport of upstream commit 013978b6 ("b43: Changes to enable
BCM4311 rev 02 with wireless core revision 13") and the changes include
the following:

(1) Add the 802.11 rev 13 device to the ssb_device_id table to load b43.
(2) Add PHY revision 9 to the supported list.
(3) Change the 2-bit routing code for address extensions to 0b10 rather
    than the 0b01 used for the 32-bit case.
(4) Remove some magic numbers in the DMA setup.

The DMA implementation for this chip supports full 64-bit addressing with
one exception. Whenever the Descriptor Ring Buffer is in high memory, a
fatal DMA error occurs. This problem was not present in 2.6.24-rc2 due
to code to "Bias the placement of kernel pages at lower PFNs". When
commit 44048d70 reverted that code, the DMA error appeared. As a "fix",
use the GFP_DMA flag when allocating the buffer for 64-bit DMA. At present,
this problem is thought to arise from a hardware error.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoarcmsr: fix IRQs disabled warning spew
Mike Pagano [Thu, 28 Feb 2008 00:35:01 +0000 (19:35 -0500)]
arcmsr: fix IRQs disabled warning spew

As of 2.6.24, running the archttp passthrough daemon with the arcmsr
driver produces an endless spew of dma_free_coherent warnings:

WARNING: at arch/x86/kernel/pci-dma_64.c:169 dma_free_coherent()

It turns out that coherent memory is not needed, so commit 76d78300 by
Nick Cheng <nick.cheng@areca.com.tw> switched it to kmalloc (as well as
making a lot of other changes which have not been included here).

James Bottomley pointed out that the new kmalloc usage was also wrong,
I corrected this in commit 69e562c2.

This patch combines both of the above for the purpose of fixing 2.6.24.
details in http://bugs.gentoo.org/208493.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Cc: Nick Cheng <nick.cheng@areca.com.tw>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoe1000e: Fix CRC stripping in hardware context bug
Auke Kok [Tue, 12 Feb 2008 23:20:24 +0000 (15:20 -0800)]
e1000e: Fix CRC stripping in hardware context bug

CRC stripping was only correctly enabled for packet split recieves
which is used when receiving jumbo frames. Correctly enable SECRC
also for normal buffer packet receives.

Tested by Andy Gospodarek and Johan Andersson, see bugzilla #9940.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Cc: Mike Pagano <mike@mpagano.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoPCI x86: always use conf1 to access config space below 256 bytes
Ivan Kokshaysky [Mon, 14 Jan 2008 22:31:09 +0000 (17:31 -0500)]
PCI x86: always use conf1 to access config space below 256 bytes

[upsteam commit: a0ca9909]

Thanks to Loic Prylli <loic@myri.com>, who originally proposed
this idea.

Always using legacy configuration mechanism for the legacy config space
and extended mechanism (mmconf) for the extended config space is
a simple and very logical approach. It's supposed to resolve all
known mmconf problems. It still allows per-device quirks (tweaking
dev->cfg_size). It also allows to get rid of mmconf fallback code.

This patch fixes a boot hang on Intel Q35 chipset as detailed at:
  http://bugs.gentoo.org/198810

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Pagano <mpagano@gentoo.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agomoduleparam: fix alpha, ia64 and ppc64 compile failures
Ivan Kokshaysky [Wed, 13 Feb 2008 23:03:26 +0000 (15:03 -0800)]
moduleparam: fix alpha, ia64 and ppc64 compile failures

[upstream commit: 91d35dd9]

On alpha, ia64 and ppc64 only relocations to local data can go into
read-only sections. The vast majority of module parameters use the global
generic param_set_*/param_get_* functions, so the 'const' attribute for
struct kernel_param is not only useless, but it also causes compile
failures due to 'section type conflict' in those rare cases where
param_set/get are local functions.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=8964

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Pagano <mpagano@gentoo.org>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agopata_hpt*, pata_serverworks: fix UDMA masking
Alan Cox [Tue, 26 Feb 2008 21:35:54 +0000 (13:35 -0800)]
pata_hpt*, pata_serverworks: fix UDMA masking

[upstream commit: 6ddd6861]

When masking, mask out the modes that are unsupported not the ones
that are supported.  This makes life happier.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI advansys: fix overrun_buf aligned bug
FUJITA Tomonori [Tue, 26 Feb 2008 17:06:18 +0000 (02:06 +0900)]
SCSI advansys: fix overrun_buf aligned bug

commit 7d5d408c77cee95d1380511de46b7a4c8dc2211d

struct asc_dvc_var needs overrun buffer to be placed on an 8 byte
boundary. advansys defines struct asc_dvc_var:

struct asc_dvc_var {
    ...
    uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8);

The problem is that struct asc_dvc_var is placed on
shost->hostdata. So if the hostdata is not on an 8 byte boundary, the
advansys crashes. The hostdata is placed on a sizeof(unsigned long)
boundary so the 8 byte boundary is not garanteed with x86_32.

With 2.6.23 and 2.6.24, the hostdata is on an 8 byte boundary by
chance, but with the current git, it's not.

This patch removes overrun_buf static array and use kzalloc.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
FUJITA Tomonori notes:
  We thought that 2.6.24 doesn't have this bug, however it does.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: fix ebtable targets return
Patrick McHardy [Mon, 25 Feb 2008 14:01:04 +0000 (15:01 +0100)]
NETFILTER: fix ebtable targets return

Upstream commit 1b04ab459:

The function ebt_do_table doesn't take NF_DROP as a verdict from the targets.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: Fix incorrect use of skb_make_writable
Patrick McHardy [Mon, 25 Feb 2008 14:01:02 +0000 (15:01 +0100)]
NETFILTER: Fix incorrect use of skb_make_writable

Upstream commit eb1197bc0:

http://bugzilla.kernel.org/show_bug.cgi?id=9920
The function skb_make_writable returns true or false.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoNETFILTER: nfnetlink_queue: fix SKB_LINEAR_ASSERT when mangling packet data
Patrick McHardy [Mon, 25 Feb 2008 14:01:01 +0000 (15:01 +0100)]
NETFILTER: nfnetlink_queue: fix SKB_LINEAR_ASSERT when mangling packet data

Upstream commit e2b58a67:

As reported by Tomas Simonaitis <tomas.simonaitis@gmail.com>, inserting new
data in skbs queued over {ip,ip6,nfnetlink}_queue triggers a SKB_LINEAR_ASSERT
in skb_put().

Going back through the git history, it seems this bug is present since at
least 2.6.12-rc2, probably even since the removal of skb_linearize() for
netfilter.

Linearize non-linear skbs through skb_copy_expand() when enlarging them.
Tested by Thomas, fixes bugzilla #9933.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agospi: pxa2xx_spi clock polarity fix
Ned Forrester [Sun, 24 Feb 2008 02:10:06 +0000 (02:10 +0000)]
spi: pxa2xx_spi clock polarity fix

commit: b97c74bddce4e2c6fef6b3b58910b4fd9eb7f3b8

Fixes a sequencing bug in spi driver pxa2xx_spi.c in which the chip select
for a transfer may be asserted before the clock polarity is set on the
interface.  As a result of this bug, the clock signal may have the wrong
polarity at transfer start, so it may need to make an extra half transition
before the intended clock/data signals begin.  (This probably means all
transfers are one bit out of sequence.)

This only occurs on the first transfer following a change in clock polarity
in systems using more than one more than one such polarity.  The fix
assures that the clock mode is properly set before asserting chip select.

This bug was introduced in a patch merged on 2006/12/10, kernel 2.6.20.
The patch defines an additional bit in: include/asm-arm/arch-pxa/regs-ssp.h
for 2.6.25 and newer kernels but this addition must be made in:
include/asm-arm/arch-pxa/pxa-regs.h for kernels between 2.6.20 and 2.6.24,
inclusive

Signed-off-by: Ned Forrester <nforrester@whoi.edu>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoufs: fix parenthesisation in ufs_set_fs_state()
Roel Kluin [Sun, 24 Feb 2008 02:10:08 +0000 (02:10 +0000)]
ufs: fix parenthesisation in ufs_set_fs_state()

commit: f81e8a43871f44f98dd14e83a83bf9ca0b3b46c5

This bug snuck in with

commit 252e211e90ce56bf005cb533ad5a297c18c19407
Author: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Date:   Tue Oct 16 23:26:31 2007 -0700

    Add in SunOS 4.1.x compatible mode for UFS

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Acked-by: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agohugetlb: ensure we do not reference a surplus page after handing it to buddy
Andy Whitcroft [Sun, 24 Feb 2008 02:10:08 +0000 (02:10 +0000)]
hugetlb: ensure we do not reference a surplus page after handing it to buddy

commit: e5df70ab194543522397fa3da8c8f80564a0f7d3

When we free a page via free_huge_page and we detect that we are in surplus
the page will be returned to the buddy.  After this we no longer own the page.

However at the end free_huge_page we clear out our mapping pointer from
page private.  Even where the page is not a surplus we free the page to
the hugepage pool, drop the pool locks and then clear page private.  In
either case the page may have been reallocated.  BAD.

Make sure we clear out page private before we free the page.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agofile capabilities: simplify signal check
Serge E. Hallyn [Sun, 24 Feb 2008 02:10:07 +0000 (02:10 +0000)]
file capabilities: simplify signal check

commit: 094972840f2e7c1c6fc9e1a97d817cc17085378e

Simplify the uid equivalence check in cap_task_kill().  Anyone can kill a
process owned by the same uid.

Without this patch wireshark is reported to fail.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agofutex: runtime enable pi and robust functionality
Thomas Gleixner [Sun, 24 Feb 2008 02:10:05 +0000 (02:10 +0000)]
futex: runtime enable pi and robust functionality

commit: a0c1e9073ef7428a14309cba010633a6cd6719ea

Not all architectures implement futex_atomic_cmpxchg_inatomic().  The default
implementation returns -ENOSYS, which is currently not handled inside of the
futex guts.

Futex PI calls and robust list exits with a held futex result in an endless
loop in the futex code on architectures which have no support.

Fixing up every place where futex_atomic_cmpxchg_inatomic() is called would
add a fair amount of extra if/else constructs to the already complex code.  It
is also not possible to disable the robust feature before user space tries to
register robust lists.

Compile time disabling is not a good idea either, as there are already
architectures with runtime detection of futex_atomic_cmpxchg_inatomic support.

Detect the functionality at runtime instead by calling
cmpxchg_futex_value_locked() with a NULL pointer from the futex initialization
code.  This is guaranteed to fail, but the call of
futex_atomic_cmpxchg_inatomic() happens with pagefaults disabled.

On architectures, which use the asm-generic implementation or have a runtime
CPU feature detection, a -ENOSYS return value disables the PI/robust features.

On architectures with a working implementation the call returns -EFAULT and
the PI/robust features are enabled.

The relevant syscalls return -ENOSYS and the robust list exit code is blocked,
when the detection fails.

Fixes http://lkml.org/lkml/2008/2/11/149
Originally reported by: Lennart Buytenhek

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Riku Voipio <riku.voipio@movial.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agofutex: fix init order
Thomas Gleixner [Sun, 24 Feb 2008 02:10:06 +0000 (02:10 +0000)]
futex: fix init order

commit: 3e4ab747efa8e78562ec6782b08bbf21a00aba1b

When the futex init code fails to initialize the futex pseudo file system it
returns early without initializing the hash queues.  Should the boot succeed
then a futex syscall which tries to enqueue a waiter on the hashqueue will
crash due to the unitilialized plist heads.

Initialize the hash queues before the filesystem.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Riku Voipio <riku.voipio@movial.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoARM pxa: fix clock lookup to find specific device clocks
Russell King [Sun, 24 Feb 2008 14:55:37 +0000 (15:55 +0100)]
ARM pxa: fix clock lookup to find specific device clocks

commit: a0dd005d1d9f4c3beab52086f3844ef9342d1e67

Ensure that the clock lookup always finds an entry for a specific
device and ID before it falls back to finding just by ID.  This
fixes a problem reported by Holger Schurig where the BTUART was
assigned the wrong clock.

Tested-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Uli Luckas notes:

  The patch fixes the otherwise unusable bluetooth uart on pxa25x. The
  patch is written by Russell King [1] who also gave his OK for
  stable inclusion [2].  The patch is also available as commit
  a0dd005d1d9f4c3beab52086f3844ef9342d1e67 to Linus' tree.

  [1] http://marc.info/?l=linux-arm-kernel&m=120298366510315
  [2] http://marc.info/?l=linux-arm-kernel&m=120384388411097

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: replace LOCK_PREFIX in futex.h
Thomas Gleixner [Sat, 23 Feb 2008 16:56:56 +0000 (11:56 -0500)]
x86: replace LOCK_PREFIX in futex.h

Commit: 9d55b9923a1b7ea8193b8875c57ec940dc2ff027

The exception fixup for the futex macros __futex_atomic_op1/2 and
futex_atomic_cmpxchg_inatomic() is missing an entry when the lock
prefix is replaced by a NOP via SMP alternatives.

Chuck Ebert tracked this down from the information provided in:
https://bugzilla.redhat.com/show_bug.cgi?id=429412

A possible solution would be to add another fixup after the
LOCK_PREFIX, so both the LOCK and NOP case have their own entry in the
exception table, but it's not really worth the trouble.

Simply replace LOCK_PREFIX with lock and keep those untouched by SMP
alternatives.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[cebbert@redhat.com: backport to 2.6.24]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI aic94xx: fix REQ_TASK_ABORT and REQ_DEVICE_RESET
James Bottomley [Sat, 23 Feb 2008 20:55:15 +0000 (20:55 +0000)]
SCSI aic94xx: fix REQ_TASK_ABORT and REQ_DEVICE_RESET

commit: cb84e2d2ff3b50c0da5a7604a6d8634294a00a01

This driver has been failing under heavy load with

aic94xx: escb_tasklet_complete: REQ_TASK_ABORT, reason=0x6
aic94xx: escb_tasklet_complete: Can't find task (tc=4) to abort!

The second message is because the driver fails to identify the task
it's being asked to abort.  On closer inpection, there's a thinko in
the for each task loop over pending tasks in both the REQ_TASK_ABORT
and REQ_DEVICE_RESET cases where it doesn't look at the task on the
pending list but at the one on the ESCB (which is always NULL).

Fix by looking at the right task.  Also add a print for the case where
the pending SCB doesn't have a task attached.

Not sure if this will fix all the problems, but it's a definite first
step.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI gdth: don't call pci_free_consistent under spinlock
James Bottomley [Sat, 23 Feb 2008 20:55:14 +0000 (20:55 +0000)]
SCSI gdth: don't call pci_free_consistent under spinlock

commit: ff83efacf2b77a1fe8942db6613825a4b80ee5e2

The spinlock is held over too large a region: pscratch is a permanent
address (it's allocated at boot time and never changes).  All you need
the smp lock for is mediating the scratch in use flag, so fix this by
moving the spinlock into the case where we set the pscratch_busy flag
to false.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI ips: fix data buffer accessors conversion bug
FUJITA Tomonori [Sat, 23 Feb 2008 20:55:12 +0000 (20:55 +0000)]
SCSI ips: fix data buffer accessors conversion bug

commit: 2b28a4721e068ac89bd5435472723a1bc44442fe

This fixes a bug that can't handle a passthru command with more than
two sg entries.

Big thanks to Tim Pepper for debugging the problem.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Mark Salyzyn <Mark_Salyzyn@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agousb-storage: don't access beyond the end of the sg buffer
Alan Stern [Fri, 22 Feb 2008 22:03:25 +0000 (17:03 -0500)]
usb-storage: don't access beyond the end of the sg buffer

This patch (as1038) fixes a bug in usb_stor_access_xfer_buf() and
usb_stor_set_xfer_buf() (the bug was originally found by Boaz
Harrosh): The routine must not attempt to write beyond the end of a
scatter-gather list or beyond the number of bytes requested.

This is the minimal 2.6.24 equivalent to as1035 +
as1037 (7084191d53b224b953c8e1db525ea6c31aca5fc7 "USB:
usb-storage: don't access beyond the end of the sg buffer" +
6d512a80c26d87f8599057c86dc920fbfe0aa3aa "usb-storage: update earlier
scatter-gather bug fix").  Mark Glines has confirmed that it fixes
his problem.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Mark Glines <mark@glines.org>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agofuse: fix permission checking
Miklos Szeredi [Sat, 23 Feb 2008 23:23:27 +0000 (15:23 -0800)]
fuse: fix permission checking

[upstream commit 1a823ac9ff09cbdf39201df37b7ede1f9395de83]

I added a nasty local variable shadowing bug to fuse in 2.6.24, with the
result, that the 'default_permissions' mount option is basically ignored.

How did this happen?

 - old err declaration in inner scope
 - new err getting declared in outer scope
 - 'return err' from inner scope getting removed
 - old declaration not being noticed

-Wshadow would have saved us, but it doesn't seem practical for
the kernel :(

More testing would have also saved us :((

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoCRYPTO xts: Use proper alignment
Sebastian Siewior [Wed, 12 Mar 2008 04:18:36 +0000 (12:18 +0800)]
CRYPTO xts: Use proper alignment

[ Upstream commit: 6212f2c7f70c591efb0d9f3d50ad29112392fee2 ]

The XTS blockmode uses a copy of the IV which is saved on the stack
and may or may not be properly aligned. If it is not, it will break
hardware cipher like the geode or padlock.
This patch encrypts the IV in place so we don't have to worry about
alignment.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Tested-by: Stefan Hellermann <stefan@the2masters.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoCRYPTO xcbc: Fix crash with IPsec
Joy Latten [Wed, 12 Mar 2008 04:17:45 +0000 (12:17 +0800)]
CRYPTO xcbc: Fix crash with IPsec

[ Upstream commit: 2f40a178e70030c4712fe63807c883f34c3645eb ]

When using aes-xcbc-mac for authentication in IPsec,
the kernel crashes. It seems this algorithm doesn't
account for the space IPsec may make in scatterlist for authtag.
Thus when crypto_xcbc_digest_update2() gets called,
nbytes may be less than sg[i].length.
Since nbytes is an unsigned number, it wraps
at the end of the loop allowing us to go back
into loop and causing crash in memcpy.

I used update function in digest.c to model this fix.
Please let me know if it looks ok.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI ips: handle scsi_add_host() failure, and other err cleanups
Jeff Garzik [Wed, 12 Mar 2008 01:25:42 +0000 (10:25 +0900)]
SCSI ips: handle scsi_add_host() failure, and other err cleanups

commit 2551a13e61d3c3df6c2da6de5a3ece78e6d67111

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: Mark Salyzyn <mark_salyzyn@adaptec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
FUJITA Tomonori notes:
  It didn't intend to fix a critical bug, however, it turned out that it
  does. Without this patch, the ips driver in 2.6.23 and 2.6.24 doesn't
  work at all. You can find the more details at the following thread:

  http://marc.info/?t=120293911900023&r=1&w=2

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: adjust enable_NMI_through_LVT0()
Jan Beulich [Tue, 11 Mar 2008 10:30:25 +0000 (11:30 +0100)]
x86: adjust enable_NMI_through_LVT0()

commit e94271017f0933b29362a3c9dea5a6b9d04d98e1

Its previous use in a call to on_each_cpu() was pointless, as at the
time that code gets executed only one CPU is online. Further, the
function can be __cpuinit, and for this to work without
CONFIG_HOTPLUG_CPU setup_nmi() must also get an attribute (this one
can even be __init; on 64-bits check_timer() also was lacking that
attribute).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ tglx@linutronix.de: backport to 2.6.24.3]
Cc: Justin Piszcz <jpiszcz@lucidpixels.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agodrivers: fix dma_get_required_mask
James Bottomley [Tue, 11 Mar 2008 01:50:07 +0000 (01:50 +0000)]
drivers: fix dma_get_required_mask

commit: e88a0c2ca81207a75afe5bbb8020541dabf606ac
Date: Sun, 9 Mar 2008 11:57:56 -0500
Subject: drivers: fix dma_get_required_mask

There's a bug in the current implementation of dma_get_required_mask()
where it ands the returned mask with the current device mask.  This
rather defeats the purpose if you're using the call to determine what
your mask should be (since you will at that time have the default
DMA_32BIT_MASK).  This bug results in any driver that uses this function
*always* getting a 32 bit mask, which is wrong.

Fix by removing the and with dev->dma_mask.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoiov_iter_advance() fix
Nick Piggin [Tue, 11 Mar 2008 01:50:06 +0000 (01:50 +0000)]
iov_iter_advance() fix

commit: f7009264c519603b8ec67c881bd368a56703cfc9

iov_iter_advance() skips over zero-length iovecs, however it does not properly
terminate at the end of the iovec array.  Fix this by checking against
i->count before we skip a zero-length iov.

The bug was reproduced with a test program that continually randomly creates
iovs to writev.  The fix was also verified with the same program and also it
could verify that the correct data was contained in the file after each
writev.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Tested-by: "Kevin Coffman" <kwc@citi.umich.edu>
Cc: "Alexey Dobriyan" <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agox86: Clear DF before calling signal handler
Aurelien Jarno [Sat, 8 Mar 2008 10:43:52 +0000 (11:43 +0100)]
x86: Clear DF before calling signal handler

x86: Clear DF before calling signal handler

The Linux kernel currently does not clear the direction flag before
calling a signal handler, whereas the x86/x86-64 ABI requires that.
This become a real problem with gcc version 4.3, which assumes that
the direction flag is correctly cleared at the entry of a function.

This patches changes the setup_frame() functions to clear the
direction before entering the signal handler.

This is a backport of patch e40cd10ccff3d9fbffd57b93780bee4b7b9bff51
("x86: clear DF before calling signal handler") that has been applied
in 2.6.25-rc.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoub: fix up the conversion to sg_init_table()
Pete Zaitcev [Fri, 7 Mar 2008 16:36:54 +0000 (17:36 +0100)]
ub: fix up the conversion to sg_init_table()

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Cc: "Oliver Pinter" <oliver.pntr@gmail.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoMIPS: Mark all but i8259 interrupts as no-probe.
Ralf Baechle [Fri, 8 Feb 2008 12:22:02 +0000 (04:22 -0800)]
MIPS: Mark all but i8259 interrupts as no-probe.

Use set_irq_noprobe() to mark all MIPS interrupts as non-probe.  Override that
default for i8259 interrupts.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-and-tested-by: Rob Landley <rob@landley.net>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoIRQ_NOPROBE helper functions
Ralf Baechle [Fri, 8 Feb 2008 12:22:01 +0000 (04:22 -0800)]
IRQ_NOPROBE helper functions

Probing non-ISA interrupts using the handle_percpu_irq as their handle_irq
method may crash the system because handle_percpu_irq does not check
IRQ_WAITING.  This for example hits the MIPS Qemu configuration.

This patch provides two helper functions set_irq_noprobe and set_irq_probe to
set rsp.  clear the IRQ_NOPROBE flag.  The only current caller is MIPS code
but this really belongs into generic code.

As an aside, interrupt probing these days has become a mostly obsolete if not
dangerous art.  I think Linux interrupts should be changed to default to
non-probing but that's subject of this patch.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-and-tested-by: Rob Landley <rob@landley.net>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoIPCOMP: Disable BH on output when using shared tfm
Herbert Xu [Thu, 6 Mar 2008 04:07:37 +0000 (20:07 -0800)]
IPCOMP: Disable BH on output when using shared tfm

Upstream commit: 21e43188f272c7fd9efc84b8244c0b1dfccaa105

Because we use shared tfm objects in order to conserve memory,
(each tfm requires 128K of vmalloc memory), BH needs to be turned
off on output as that can occur in process context.

Previously this was done implicitly by the xfrm output code.
That was lost when it became lockless.  So we need to add the
BH disabling to IPComp directly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoIPCONFIG: The kernel gets no IP from some DHCP servers
Stephen Hemminger [Wed, 5 Mar 2008 22:44:01 +0000 (14:44 -0800)]
IPCONFIG: The kernel gets no IP from some DHCP servers

Upstream commit: dea75bdfa57f75a7a7ec2961ec28db506c18e5db

From: Stephen Hemminger <shemminger@linux-foundation.org>

Based upon a patch by Marcel Wappler:

   This patch fixes a DHCP issue of the kernel: some DHCP servers
   (i.e.  in the Linksys WRT54Gv5) are very strict about the contents
   of the DHCPDISCOVER packet they receive from clients.

   Table 5 in RFC2131 page 36 requests the fields 'ciaddr' and
   'siaddr' MUST be set to '0'.  These DHCP servers ignore Linux
   kernel's DHCP discovery packets with these two fields set to
   '255.255.255.255' (in contrast to popular DHCP clients, such as
   'dhclient' or 'udhcpc').  This leads to a not booting system.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoIPV4: Remove IP_TOS setting privilege checks.
David S. Miller [Wed, 5 Mar 2008 22:44:30 +0000 (14:44 -0800)]
IPV4: Remove IP_TOS setting privilege checks.

Upstream commit: e4f8b5d4edc1edb0709531bd1a342655d5e8b98e

Various RFCs have all sorts of things to say about the CS field of the
DSCP value.  In particular they try to make the distinction between
values that should be used by "user applications" and things like
routing daemons.

This seems to have influenced the CAP_NET_ADMIN check which exists for
IP_TOS socket option settings, but in fact it has an off-by-one error
so it wasn't allowing CS5 which is meant for "user applications" as
well.

Further adding to the inconsistency and brokenness here, IPV6 does not
validate the DSCP values specified for the IPV6_TCLASS socket option.

The real actual uses of these TOS values are system specific in the
final analysis, and these RFC recommendations are just that, "a
recommendation".  In fact the standards very purposefully use
"SHOULD" and "SHOULD NOT" when describing how these values can be
used.

In the final analysis the only clean way to provide consistency here
is to remove the CAP_NET_ADMIN check.  The alternatives just don't
work out:

1) If we add the CAP_NET_ADMIN check to ipv6, this can break existing
   setups.

2) If we just fix the off-by-one error in the class comparison in
   IPV4, certain DSCP values can be used in IPV6 but not IPV4 by
   default.  So people will just ask for a sysctl asking to
   override that.

I checked several other freely available kernel trees and they
do not make any privilege checks in this area like we do.  For
the BSD stacks, this goes back all the way to Stevens Volume 2
and beyond.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoIPV6: dst_entry leak in ip4ip6_err.
Denis V. Lunev [Wed, 5 Mar 2008 22:43:05 +0000 (14:43 -0800)]
IPV6: dst_entry leak in ip4ip6_err.

Upstream commit: 9937ded8e44de8865cba1509d24eea9d350cebf0

The result of the ip_route_output is not assigned to skb. This means that
- it is leaked
- possible OOPS below dereferrencing skb->dst
- no ICMP message for this case

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoIPV6: Fix IPsec datagram fragmentation
Herbert Xu [Wed, 5 Mar 2008 22:46:35 +0000 (14:46 -0800)]
IPV6: Fix IPsec datagram fragmentation

Upstream commits: 28a89453b1e8de8d777ad96fa1eef27b5d1ce074
                  b5c15fc004ac83b7ad280acbe0fd4bbed7e2c8d4

This is a long-standing bug in the IPsec IPv6 code that breaks
when we emit a IPsec tunnel-mode datagram packet.  The problem
is that the code the emits the packet assumes the IPv6 stack
will fragment it later, but the IPv6 stack assumes that whoever
is emitting the packet is going to pre-fragment the packet.

In the long term we need to fix both sides, e.g., to get the
datagram code to pre-fragment as well as to get the IPv6 stack
to fragment locally generated tunnel-mode packet.

For now this patch does the second part which should make it
work for the IPsec host case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoNET: Fix race in dev_close(). (Bug 9750)
Matti Linnanvuori [Wed, 5 Mar 2008 22:42:41 +0000 (14:42 -0800)]
NET: Fix race in dev_close(). (Bug 9750)

Upstream commit: d8b2a4d21e0b37b9669b202867bfef19f68f786a

There is a race in Linux kernel file net/core/dev.c, function dev_close.
The function calls function dev_deactivate, which calls function
dev_watchdog_down that deletes the watchdog timer. However, after that, a
driver can call netif_carrier_ok, which calls function
__netdev_watchdog_up that can add the watchdog timer again. Function
unregister_netdevice calls function dev_shutdown that traps the bug
!timer_pending(&dev->watchdog_timer). Moving dev_deactivate after
netif_running() has been cleared prevents function netif_carrier_on
from calling __netdev_watchdog_up and adding the watchdog timer again.

Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoNET: Messed multicast lists after dev_mc_sync/unsync
Jorge Boncompte [DTI2] [Wed, 5 Mar 2008 22:47:01 +0000 (14:47 -0800)]
NET: Messed multicast lists after dev_mc_sync/unsync

Upstream commit: 12aa343add3eced38a44bdb612b35fdf634d918c

Commit a0a400d79e3dd7843e7e81baa3ef2957bdc292d0 ("[NET]: dev_mcast:
add multicast list synchronization helpers") from you introduced a new
field "da_synced" to struct dev_addr_list that is not properly
initialized to 0. So when any of the current users (8021q, macvlan,
mac80211) calls dev_mc_sync/unsync they mess the address list for both
devices.

The attached patch fixed it for me and avoid future problems.

Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoNIU: Bump driver version and release date.
David S. Miller [Wed, 5 Mar 2008 22:49:01 +0000 (14:49 -0800)]
NIU: Bump driver version and release date.

Upstream commit: a442585952f137bd4cdb1f2f3166e4157d383b82

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoNIU: Fix BMAC alternate MAC address indexing.
Matheos Worku [Wed, 5 Mar 2008 22:47:36 +0000 (14:47 -0800)]
NIU: Fix BMAC alternate MAC address indexing.

Upstream commit: 3b5bcedeeb755b6e813537fcf4c32f010b490aef

BMAC port alternate MAC address index needs to start at 1. Index 0 is
used for the main MAC address.

Signed-off-by: Matheos Worku <matheos.worku@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoNIU: More BMAC alt MAC address fixes.
Matheos Worku [Wed, 5 Mar 2008 22:48:37 +0000 (14:48 -0800)]
NIU: More BMAC alt MAC address fixes.

Upstream commit: fa907895b7b776208a1406efe5ba7ffe0f49f507

From: Matheos Worku <Matheos.Worku@Sun.COM>

1) niu_enable_alt_mac() needs to be adjusted so that the mask
   is computed properly for the BMAC case.

2) BMAC has 6 alt MAC addresses available, not 7.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoTCP: Improve ipv4 established hash function.
David S. Miller [Wed, 5 Mar 2008 22:49:24 +0000 (14:49 -0800)]
TCP: Improve ipv4 established hash function.

Upstream commit: 7adc3830f90df04a13366914d80a3ed407db5381

If all of the entropy is in the local and foreign addresses,
but xor'ing together would cancel out that entropy, the
current hash performs poorly.

Suggested by Cosmin Ratiu:

Basically, the situation is as follows: There is a client
machine and a server machine. Both create 15000 virtual
interfaces, open up a socket for each pair of interfaces and
do SIP traffic. By profiling I noticed that there is a lot of
time spent walking the established hash chains with this
particular setup.

The addresses were distributed like this: client interfaces
were 198.18.0.1/16 with increments of 1 and server interfaces
were 198.18.128.1/16 with increments of 1. As I said, there
were 15000 interfaces. Source and destination ports were 5060
for each connection.  So in this case, ports don't matter for
hashing purposes, and the bits from the address pairs used
cancel each other, meaning there are no differences in the
whole lot of pairs, so they all end up in the same hash chain.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSPARC: Fix link errors with gcc-4.3
David S. Miller [Thu, 6 Mar 2008 22:47:51 +0000 (14:47 -0800)]
SPARC: Fix link errors with gcc-4.3

Upstream commit: f0e98c387e61de00646be31fab4c2fa0224e1efb

Reported by Adrian Bunk.

Just like in changeset a3f9985843b674cbcb58f39fab8416675e7ab842
("[SPARC64]: Move kernel unaligned trap handlers into assembler
file.") we have to move the assembler bits into a seperate
asm file because as far as the compiler is concerned
these inline bits we're doing in unaligned.c are unreachable.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoSPARC64: Loosen checks in exception table handling.
David S. Miller [Thu, 6 Mar 2008 22:47:20 +0000 (14:47 -0800)]
SPARC64: Loosen checks in exception table handling.

Upstream commits: 622eaec613130e6ea78f2a5d5070e3278b21cd8f
                  be71716e464f4ea38f08034dc666f2feb55535d9

Some parts of the kernel now do things like do *_user() accesses while
set_fs(KERNEL_DS) that fault on purpose.

See, for example, the code added by changeset
a0c1e9073ef7428a14309cba010633a6cd6719ea ("futex: runtime enable pi
and robust functionality").

That trips up the ASI sanity checking we make in do_kernel_fault().

Just remove it for now.  Maybe we can add it back later with an added
conditional which looks at the current get_fs() value.

Also, because of the new futex validation init handler, we have
to accept faults in init section text as well as the normal
kernel text.

Thanks to Tom Callaway for the bug report.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoRevert "NET: Add if_addrlabel.h to sanitized headers."
Greg Kroah-Hartman [Fri, 7 Mar 2008 00:00:36 +0000 (16:00 -0800)]
Revert "NET: Add if_addrlabel.h to sanitized headers."

This reverts commit 5fb7ba76544d95bfa05199f7394a442de5660be7.

It was incorrectly added to the .24.y stable tree and causes build
breakages.

Cc: Stephen Hemminger <stephen.hemminger@vyatta.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
16 years agoLinux 2.6.24.3 v2.6.24.3
Greg Kroah-Hartman [Tue, 26 Feb 2008 00:20:20 +0000 (16:20 -0800)]
Linux 2.6.24.3

16 years agox86_64: CPA, fix cache attribute inconsistency bug
Ingo Molnar [Fri, 15 Feb 2008 19:59:33 +0000 (20:59 +0100)]
x86_64: CPA, fix cache attribute inconsistency bug

(no matching git id as the upstream code is rewritten)

fix CPA cache attribute bug in v2.6.24. When phys_base is nonzero (when
CONFIG_RELOCATABLE=y) then change_page_attr_addr() miscalculates the
secondary alias address by -14 MB (depending on the configured offset).

The default 64-bit kernels of Fedora and Ubuntu are affected:

   $ grep RELOCA /boot/config-2.6.23.9-85.fc8
     CONFIG_RELOCATABLE=y

   $ grep RELOC /boot/config-2.6.22-14-generic
     CONFIG_RELOCATABLE=y

and probably on many other distros as well.

the bug affects all pages in the first 40 MB of physical RAM that
are allocated by some subsystem that does ioremap_nocache() on them:

       if (__pa(address) < KERNEL_TEXT_SIZE) {

Hence we might leave page table entries with inconsistent cache
attributes around (pages mapped at both UnCacheable and Write-Back),
and we can also set the wrong kernel text pages to UnCacheable.

the effects of this bug can be random slowdowns and other misbehavior.
If for example AGP allocates its aperture pages into the first 40 MB
of physical RAM, then the -14 MB bug might mark random kernel texto
pages as uncacheable, slowing down a random portion of the 64-bit
kernel until the AGP driver is unloaded.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agobonding: fix NULL pointer deref in startup processing
Jay Vosburgh [Fri, 15 Feb 2008 18:00:41 +0000 (10:00 -0800)]
bonding: fix NULL pointer deref in startup processing

patch 4fe4763cd8cacd81d892193efb48b99c99c15323 in mainline.

Fix the "are we creating a duplicate" check to not compare
the name if the name is NULL (meaning that the system should select
a name).  Bug reported by Benny Amorsen <benny+usenet@amorsen.dk>.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoPOWERPC: Revert chrp_pci_fixup_vt8231_ata devinit to fix libata on pegasos
Olaf Hering [Fri, 22 Feb 2008 00:41:44 +0000 (19:41 -0500)]
POWERPC: Revert chrp_pci_fixup_vt8231_ata devinit to fix libata on pegasos

Commit: 092ca5bd61da6344f3b249754b337f2d48dfe08d

[POWERPC] Revert chrp_pci_fixup_vt8231_ata devinit to fix libata on pegasos

Commit 6d98bda79bea0e1be26c0767d0e9923ad3b72f2e changed the init order
for chrp_pci_fixup_vt8231_ata().

It can not work anymore because either the irq is not yet set to 14 or
pci_get_device() returns nothing.  At least the printk() in
chrp_pci_fixup_vt8231_ata() does not trigger anymore.
pata_via works again on Pegasos with the change below.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoPCMCIA: Fix station address detection in smc
Chuck Ebbert [Fri, 22 Feb 2008 00:33:00 +0000 (19:33 -0500)]
PCMCIA: Fix station address detection in smc

Commit: a1a98b72dbd17e53cd92b8e78f404525ebcfd981

Fix station address detection in smc

Megahertz EM1144 PCMCIA ethernet adapter needs special handling
because it has two VERS_1 tuples and the station address is in
the second one. Conversion to generic handling of these fields
broke it. Reverting that fixes the device.

  https://bugzilla.redhat.com/show_bug.cgi?id=233255

Thanks go to Jon Stanley for not giving up on this one until the
problem was found.

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSCSI: gdth: scan for scsi devices
Boaz Harrosh [Thu, 14 Feb 2008 21:15:08 +0000 (21:15 +0000)]
SCSI: gdth: scan for scsi devices

commit: 61c92814dc324b541391757062ff02fbf3b08086

The patch: "gdth: switch to modern scsi host registration"

missed one simple fact when moving a way from scsi_module.c.
That is to call scsi_scan_host() on the probed host.
With this the gdth driver from 2.6.24 is again able to
see drives and boot.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Joerg Dorchain <joerg@dorchain.net>
Tested-by: Stefan Priebe <s.priebe@allied-internet.ag>
Tested-by: Jon Chelton <jchelton@ffpglobal.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoUSB: fix pm counter leak in usblp
Oliver Neukum [Fri, 22 Feb 2008 00:35:05 +0000 (00:35 +0000)]
USB: fix pm counter leak in usblp

commit 1902869019918411c148c18cc3a22aade569ac9a upstream

if you fail in open() you must decrement the pm counter again.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoS390: Fix futex_atomic_cmpxchg_std inline assembly.
Heiko Carstens [Tue, 19 Feb 2008 17:20:11 +0000 (17:20 +0000)]
S390: Fix futex_atomic_cmpxchg_std inline assembly.

commit: d5b02b3ff1d9a2e1074f559c84ed378cfa6fc3c0 upstream

Add missing exception table entry so that the kernel can handle
proctection exceptions as well on the cs instruction. Currently only
specification exceptions are handled correctly.
The missing entry allows user space to crash the kernel.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agogenirq: do not leave interupts enabled on free_irq
Thomas Gleixner [Tue, 19 Feb 2008 23:29:02 +0000 (00:29 +0100)]
genirq: do not leave interupts enabled on free_irq

commit 89d694b9dbe769ca1004e01db0ca43964806a611

The default_disable() function was changed in commit:

 76d2160147f43f982dfe881404cfde9fd0a9da21
 genirq: do not mask interrupts by default

It removed the mask function in favour of the default delayed
interrupt disabling. Unfortunately this also broke the shutdown in
free_irq() when the last handler is removed from the interrupt for
those architectures which rely on the default implementations. Now we
can end up with a enabled interrupt line after the last handler was
removed, which can result in spurious interrupts.

Fix this by adding a default_shutdown function, which is only
installed, when the irqchip implementation does provide neither a
shutdown nor a disable function.

Pointed-out-by: Michael Hennerich <Michael.Hennerich@analog.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Michael Hennerich <Michael.Hennerich@analog.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agohrtimer: catch expired CLOCK_REALTIME timers early
Thomas Gleixner [Wed, 20 Feb 2008 00:04:56 +0000 (01:04 +0100)]
hrtimer: catch expired CLOCK_REALTIME timers early

commit 63070a79ba482c274bad10ac8c4b587a3e011f2c

A CLOCK_REALTIME timer, which has an absolute expiry time less than
the clock realtime offset calls with a negative delta into the clock
events code and triggers the WARN_ON() there.

This is a false positive and needs to be prevented. Check the result
of timer->expires - timer->base->offset right away and return -ETIME
right away.

Thanks to Frans Pop, who reported the problem and tested the fixes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agohrtimer: check relative timeouts for overflow
Thomas Gleixner [Wed, 20 Feb 2008 00:03:00 +0000 (01:03 +0100)]
hrtimer: check relative timeouts for overflow

commit: 5a7780e725d1bb4c3094fcc12f1c5c5faea1e988

Various user space callers ask for relative timeouts. While we fixed
that overflow issue in hrtimer_start(), the sites which convert
relative user space values to absolute timeouts themself were uncovered.

Instead of putting overflow checks into each place add a function
which does the sanity checking and convert all affected callers to use
it.

Thanks to Frans Pop, who reported the problem and tested the fixes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoSLUB: Deal with annoying gcc warning on kfree()
Christoph Lameter [Fri, 8 Feb 2008 01:47:41 +0000 (17:47 -0800)]
SLUB: Deal with annoying gcc warning on kfree()

patch 5bb983b0cce9b7b281af15730f7019116dd42568 in mainline.

gcc 4.2 spits out an annoying warning if one casts a const void *
pointer to a void * pointer. No warning is generated if the
conversion is done through an assignment.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agohrtimer: fix *rmtp/restarts handling in compat_sys_nanosleep()
Oleg Nesterov [Tue, 19 Feb 2008 23:48:53 +0000 (00:48 +0100)]
hrtimer: fix *rmtp/restarts handling in compat_sys_nanosleep()

commit 416529374b4793ba2d2e97e736d108a2e0f3ef07

Spotted by Pavel Emelyanov and Alexey Dobriyan.

compat_sys_nanosleep() implicitly uses hrtimer_nanosleep_restart(), this can't
work. Make a suitable compat_nanosleep_restart() helper.

Introduced by commit c70878b4e0b6cf8d2f1e46319e48e821ef4a8aba
hrtimer: hook compat_sys_nanosleep up to high res timer code

Also, set ->addr_limit = KERNEL_DS before doing hrtimer_nanosleep(), this func
was changed by the previous patch and now takes the "__user *" parameter.

Thanks to Ingo Molnar for fixing the bug in this patch.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Pavel Emelyanov <xemul@sw.ru>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Toyo Abe <toyoa@mvista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agohrtimer: fix *rmtp handling in hrtimer_nanosleep()
Oleg Nesterov [Tue, 19 Feb 2008 23:48:06 +0000 (00:48 +0100)]
hrtimer: fix *rmtp handling in hrtimer_nanosleep()

commit 080344b98805553f9b01de0f59a41b1533036d8d

Spotted by Pavel Emelyanov and Alexey Dobriyan.

hrtimer_nanosleep() sets restart_block->arg1 = rmtp, but this rmtp points to
the local variable which lives in the caller's stack frame. This means that
if sys_restart_syscall() actually happens and it is interrupted as well, we
don't update the user-space variable, but write into the already dead stack
frame.

Introduced by commit 04c227140fed77587432667a574b14736a06dd7f
hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easier

Change the callers to pass "__user *rmtp" to hrtimer_nanosleep(), and change
hrtimer_nanosleep() to use copy_to_user() to actually update *rmtp.

Small problem remains. man 2 nanosleep states that *rtmp should be written if
nanosleep() was interrupted (it says nothing whether it is OK to update *rmtp
if nanosleep returns 0), but (with or without this patch) we can dirty *rem
even if nanosleep() returns 0.

NOTE: this patch doesn't change compat_sys_nanosleep(), because it has other
bugs. Fixed by the next patch.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Pavel Emelyanov <xemul@sw.ru>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Toyo Abe <toyoa@mvista.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoDisable G5 NAP mode during SMU commands on U3
Benjamin Herrenschmidt [Thu, 7 Feb 2008 03:29:43 +0000 (14:29 +1100)]
Disable G5 NAP mode during SMU commands on U3

patch 592a607bbc053bc6f614a0e619326009f4b3829e in mainline.

It appears that with the U3 northbridge, if the processor is in NAP
mode the whole time while waiting for an SMU command to complete,
then the SMU will fail.  It could be related to the weird backward
mechanism the SMU uses to get to system memory via i2c to the
northbridge that doesn't operate properly when the said bridge is
in napping along with the CPU.  That is on U3 at least, U4 doesn't
seem to be affected.

This didn't show before NO_HZ as the timer wakeup was enough to make
it work it seems, but that is no longer the case.

This fixes it by disabling NAP mode on those machines while
an SMU command is in flight.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoBe more robust about bad arguments in get_user_pages()
Jonathan Corbet [Mon, 11 Feb 2008 23:17:33 +0000 (16:17 -0700)]
Be more robust about bad arguments in get_user_pages()

patch 900cf086fd2fbad07f72f4575449e0d0958f860f in mainline.

So I spent a while pounding my head against my monitor trying to figure
out the vmsplice() vulnerability - how could a failure to check for
*read* access turn into a root exploit? It turns out that it's a buffer
overflow problem which is made easy by the way get_user_pages() is
coded.

In particular, "len" is a signed int, and it is only checked at the
*end* of a do {} while() loop.  So, if it is passed in as zero, the loop
will execute once and decrement len to -1.  At that point, the loop will
proceed until the next invalid address is found; in the process, it will
likely overflow the pages array passed in to get_user_pages().

I think that, if get_user_pages() has been asked to grab zero pages,
that's what it should do.  Thus this patch; it is, among other things,
enough to block the (already fixed) root exploit and any others which
might be lurking in similar code.  I also think that the number of pages
should be unsigned, but changing the prototype of this function probably
requires some more careful review.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoAUDIT: Increase skb->truesize in audit_expand
Herbert Xu [Fri, 15 Feb 2008 09:32:40 +0000 (01:32 -0800)]
AUDIT: Increase skb->truesize in audit_expand

Upstream commit: 406a1d868001423c85a3165288e566e65f424fe6

The recent UDP patch exposed this bug in the audit code.  It
was calling pskb_expand_head without increasing skb->truesize.
The caller of pskb_expand_head needs to do so because that function
is designed to be called in places where truesize is already fixed
and therefore it doesn't update its value.

Because the audit system is using it in a place where the truesize
has not yet been fixed, it needs to update its value manually.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoBLUETOOTH: Add conn add/del workqueues to avoid connection fail.
Dave Young [Fri, 15 Feb 2008 09:34:03 +0000 (01:34 -0800)]
BLUETOOTH: Add conn add/del workqueues to avoid connection fail.

Upstream commit: b6c0632105f7d7548f1d642ba830088478d4f2b0

The bluetooth hci_conn sysfs add/del executed in the default
workqueue.  If the del_conn is executed after the new add_conn with
same target, add_conn will failed with warning of "same kobject name".

Here add btaddconn & btdelconn workqueues, flush the btdelconn
workqueue in the add_conn function to avoid the issue.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoINET: Prevent out-of-sync truesize on ip_fragment slow path
Herbert Xu [Fri, 15 Feb 2008 09:55:06 +0000 (01:55 -0800)]
INET: Prevent out-of-sync truesize on ip_fragment slow path

Upstream commit: 29ffe1a5c52dae13b6efead97aab9b058f38fce4

When ip_fragment has to hit the slow path the value of skb->truesize
may go out of sync because we would have updated it without changing
the packet length.  This violates the constraints on truesize.

This patch postpones the update of skb->truesize to prevent this.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoINET_DIAG: Fix inet_diag_lock_handler error path.
Arnaldo Carvalho de Melo [Fri, 15 Feb 2008 09:41:34 +0000 (01:41 -0800)]
INET_DIAG: Fix inet_diag_lock_handler error path.

Upstream commit: 8cf8e5a67fb07f583aac94482ba51a7930dab493

Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=9825
The inet_diag_lock_handler function uses ERR_PTR to encode errors but
its callers were testing against NULL.

This only happens when the only inet_diag modular user, DCCP, is not
built into the kernel or available as a module.

Also there was a problem with not dropping the mutex lock when a handler
was not found, also fixed in this patch.

This caused an OOPS and ss would then hang on subsequent calls, as
&inet_diag_table_mutex was being left locked.

Thanks to spike at ml.yaroslavl.ru for report it after trying 'ss -d'
on a kernel that doesn't have DCCP available.

This bug was introduced in cset
d523a328fb0271e1a763e985a21f2488fd816e7e ("Fix inet_diag dead-lock
regression"), after 2.6.24-rc3, so just 2.6.24 seems to be affected.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
16 years agoIPCOMP: Fetch nexthdr before ipch is destroyed
Herbert Xu [Fri, 15 Feb 2008 09:44:03 +0000 (01:44 -0800)]
IPCOMP: Fetch nexthdr before ipch is destroyed

Upstream commit: 2614fa59fa805cd488083c5602eb48533cdbc018

When I moved the nexthdr setting out of IPComp I accidently moved
the reading of ipch->nexthdr after the decompression.  Unfortunately
this means that we'd be reading from a stale ipch pointer which
doesn't work very well.

This patch moves the reading up so that we get the correct nexthdr
value.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>