git.kernelconcepts.de Git - karo-tx-linux.git/log
Linux 2.6.27.19 (tag v2.6.27.19)
Greg Kroah-Hartman [Fri, 20 Feb 2009 22:39:34 +0000 (14:39 -0800)]

ext4: Initialize the new group descriptor when resizing the filesystem
Theodore Ts'o [Tue, 17 Feb 2009 15:58:44 +0000 (10:58 -0500)]

(cherry picked from commit fdff73f094e7220602cc3f8959c7230517976412)

Make sure all of the fields of the group descriptor are properly
initialized.  Previously, we allowed the bg_flags field to contain
random garbage, which could trigger non-deterministic behavior,
including a kernel OOPS.

http://bugzilla.kernel.org/show_bug.cgi?id=12433

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
jbd2: On a __journal_expect() assertion failure printk "JBD2", not "EXT3-fs"
Theodore Ts'o [Tue, 17 Feb 2009 15:58:43 +0000 (10:58 -0500)]

(cherry picked from commit 08ec8c3878cea0bf91f2ba3c0badf44b383752d0)

Otherwise it can be very confusing to find an "EXT3-fs: " failure in
the middle of EXT4-fs failures, and it makes it harder to track the
source of the failure.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Add sanity check to make_indexed_dir
Theodore Ts'o [Tue, 17 Feb 2009 15:58:42 +0000 (10:58 -0500)]

(cherry picked from commit e6b8bc09ba2075cd91fbffefcd2778b1a00bd76f)

Make sure the rec_len field in the '..' entry is sane, lest we overrun
the directory block and cause a kernel oops on a purposefully
corrupted filesystem.
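
As a rough illustration of this kind of check, the sketch below validates a
record length against the remaining space in a directory block.  It is a
user-space model, not the code from the patch: toy_rec_len_is_sane, the
12-byte minimum and the 4-byte alignment rule are assumptions made for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed minimum entry size: 8-byte header plus a name padded to 4 bytes. */
#define TOY_MIN_REC_LEN 12u

/*
 * Return 1 if an entry that starts at `offset` inside a directory block of
 * `blocksize` bytes has a rec_len that keeps the next entry inside the
 * block, and 0 otherwise.  Hypothetical helper, for illustration only.
 */
static int toy_rec_len_is_sane(size_t offset, uint16_t rec_len, size_t blocksize)
{
    assert(offset <= blocksize);

    if (rec_len < TOY_MIN_REC_LEN)
        return 0;   /* too small to hold a valid entry */
    if (rec_len % 4 != 0)
        return 0;   /* entries are assumed to be 4-byte aligned */
    if ((size_t)rec_len > blocksize - offset)
        return 0;   /* following rec_len would overrun the block */
    return 1;
}
```

A caller would reject the directory block (rather than walk past its end)
whenever the helper returns 0.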

Thanks to Sami Liedes for reporting this bug.

http://bugzilla.kernel.org/show_bug.cgi?id=12430

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: only use i_size_high for regular files
Theodore Ts'o [Tue, 17 Feb 2009 15:58:41 +0000 (10:58 -0500)]

(cherry picked from commit 06a279d636734da32bb62dd2f7b0ade666f65d7c)

Directories are not allowed to be bigger than 2GB, so don't use
i_size_high for anything other than regular files.  E2fsck should
complain about these inodes, but the simplest thing to do for the
kernel is to only use i_size_high for regular files.
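
The rule can be sketched in user space as follows.  This is a simplified
model under stated assumptions: toy_inode_size and the TOY_S_* constants are
names invented for the example (the constants mirror the usual POSIX file
type bits), and the real kernel reads these fields from the on-disk inode
with endian conversion.

```c
#include <stdint.h>

/* File type bits, mirroring the conventional S_IFMT layout. */
#define TOY_S_IFMT  0170000u
#define TOY_S_IFREG 0100000u
#define TOY_S_IFDIR 0040000u

/*
 * Combine the low and high size words into a 64-bit size, but honour the
 * high word only for regular files.  For every other file type the high
 * word is ignored, so a corrupted value there cannot inflate the size.
 */
static uint64_t toy_inode_size(uint16_t mode, uint32_t size_lo, uint32_t size_high)
{
    if ((mode & TOY_S_IFMT) == TOY_S_IFREG)
        return ((uint64_t)size_high << 32) | size_lo;
    return size_lo;
}
```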

This prevents an intentionally corrupted filesystem from causing the
kernel to burn a huge amount of CPU and issue error messages such
as:

EXT4-fs warning (device loop0): ext4_block_to_path: block 135090028 > max

Thanks to David Maciejak from Fortinet's FortiGuard Global Security
Research Team for reporting this issue.

http://bugzilla.kernel.org/show_bug.cgi?id=12375

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Add sanity checks for the superblock before mounting the filesystem
Theodore Ts'o [Tue, 17 Feb 2009 15:58:40 +0000 (10:58 -0500)]

(cherry picked from commit 4ec110281379826c5cf6ed14735e47027c3c5765)

This avoids insane superblock configurations that could lead to a kernel
oops due to null pointer dereferences.

http://bugzilla.kernel.org/show_bug.cgi?id=12371

Thanks to David Maciejak at Fortinet's FortiGuard Global Security
Research Team who discovered this bug independently (but at
approximately the same time) as Thiemo Nagel, who submitted the patch.

Signed-off-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Init the complete page while building buddy cache
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:39 +0000 (10:58 -0500)]

(cherry picked from commit 29eaf024980e07cc01f31ae4ea5d68c917f4b7da)

We need to init the complete page during buddy cache init
by setting the contents to '1'.  Otherwise we can see the
following errors after doing an online resize of the
filesystem:

EXT4-fs error (device sdb1): ext4_mb_mark_diskspace_used:
Allocating block 1040385 in system zone of 127 group

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Don't allow new groups to be added during block allocation
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:38 +0000 (10:58 -0500)]

(cherry picked from commit 8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae)

After we mark the blocks in the buddy cache as allocated,
we need to ensure that we don't reinit the buddy cache until
the block bitmap is updated.  This commit achieves this by holding
the group_info alloc_semaphore until ext4_mb_release_context.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: mark the blocks/inode bitmap beyond end of group as used
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:37 +0000 (10:58 -0500)]

(cherry picked from commit 648f5879f5892dddd3ba71cd0d285599f40f2512)

We need to mark the block/inode bitmap beyond the end of the group
with '1'.
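
A minimal sketch of the idea, assuming a byte-addressed bitmap with bit 0 in
the least significant position of byte 0.  toy_mark_bitmap_end and
toy_test_bit are hypothetical helpers written for this example, not the
kernel's functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Set every bit in [start_bit, end_bit) so that the allocator treats the
 * padding beyond the end of the group as already in use.
 */
static void toy_mark_bitmap_end(size_t start_bit, size_t end_bit, uint8_t *bitmap)
{
    assert(start_bit <= end_bit);

    for (size_t i = start_bit; i < end_bit; i++)
        bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Return the value (0 or 1) of a single bit. */
static int toy_test_bit(size_t bit, const uint8_t *bitmap)
{
    return (bitmap[bit / 8] >> (bit % 8)) & 1;
}
```

For a group with 10 valid entries stored in a 32-bit bitmap, calling
toy_mark_bitmap_end(10, 32, bitmap) leaves bits 0-9 untouched and marks bits
10-31 as used.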

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Use new buffer_head flag to check uninit group bitmaps initialization
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:36 +0000 (10:58 -0500)]

(cherry picked from commit 2ccb5fb9f113dae969d1ae9b6c10e80fa34f8cd3)

For an uninit block group, the on-disk bitmap is not initialized.  That implies
we cannot depend on the uptodate flag on the bitmap buffer_head to
determine bitmap validity.  Use a new buffer_head flag which is set after
we properly initialize the bitmap.  This also prevents re-initializing
the uninit group bitmap every time we do an
ext4_read_block_bitmap.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
jbd2: Add BH_JBDPrivateStart
Mark Fasheh [Tue, 17 Feb 2009 15:58:35 +0000 (10:58 -0500)]

(cherry picked from commit e97fcd95a4778a8caf1980c6c72fdf68185a0838)

Add this so that file systems using JBD2 can safely allocate unused b_state
bits.

In this case, we add it so that Ocfs2 can define a single bit for tracking
the validation state of a buffer.

Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Fix the race between read_inode_bitmap() and ext4_new_inode()
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:34 +0000 (10:58 -0500)]

(cherry picked from commit 393418676a7602e1d7d3f6e560159c65c8cbd50e)

We need to make sure we update the inode bitmap and clear
EXT4_BG_INODE_UNINIT flag with sb_bgl_lock held, since
ext4_read_inode_bitmap() looks at EXT4_BG_INODE_UNINIT to decide
whether to initialize the inode bitmap each time it is called.
(introduced by commit c806e68f.)

ext4_read_inode_bitmap does:

spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
ext4_init_inode_bitmap(sb, bh, block_group, desc);

and ext4_new_inode does:

if (!ext4_set_bit_atomic(sb_bgl_lock(sbi, group),
                   ino, inode_bitmap_bh->b_data))
   ......
   ...
spin_lock(sb_bgl_lock(sbi, group));

gdp->bg_flags &= cpu_to_le16(~EXT4_BG_INODE_UNINIT);

i.e., on allocation we update the bitmap, then we take the sb_bgl_lock
and clear the EXT4_BG_INODE_UNINIT flag.  What can happen is that a
parallel ext4_read_inode_bitmap zeroes out the bitmap in between
the above ext4_set_bit_atomic and spin_lock(sb_bgl_lock..).

The race results in the following user-visible errors:
EXT4-fs error (device sdb1): ext4_free_inode: bit already cleared for inode 168449
EXT4-fs warning (device sdb1): ext4_unlink: Deleting nonexistent file ...
EXT4-fs warning (device sdb1): ext4_rmdir: empty directory has too many links ...
ls: /mnt/tmp/f/p369/d3/d6/d39/db2/dee/d10f/d3f/l71: Stale NFS file handle
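
The locking rule described above can be modelled in user space roughly as
follows.  This is a sketch under assumptions, not the patched kernel code: a
pthread mutex stands in for sb_bgl_lock, and struct toy_group and the toy_*
functions are invented for the example.  The point it illustrates is that
the bitmap update and the clearing of the UNINIT flag happen inside one
critical section, so a concurrent reader can never see the flag still set
after a bit has been allocated.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define TOY_BG_INODE_UNINIT 0x1u

struct toy_group {
    pthread_mutex_t lock;   /* plays the role of sb_bgl_lock */
    uint16_t flags;
    uint8_t bitmap[8];      /* 64 inodes per group in this model */
};

static void toy_group_init(struct toy_group *g)
{
    pthread_mutex_init(&g->lock, NULL);
    g->flags = TOY_BG_INODE_UNINIT;
    memset(g->bitmap, 0xAA, sizeof(g->bitmap));   /* simulated garbage */
}

/* Reader side: zero the bitmap only while the group is still UNINIT. */
static void toy_read_inode_bitmap(struct toy_group *g)
{
    pthread_mutex_lock(&g->lock);
    if (g->flags & TOY_BG_INODE_UNINIT)
        memset(g->bitmap, 0, sizeof(g->bitmap));
    pthread_mutex_unlock(&g->lock);
}

/*
 * Allocator side: set the bit and clear the flag under the same lock.
 * Returns 0 on success, -1 if the inode is already in use.
 */
static int toy_new_inode(struct toy_group *g, unsigned ino)
{
    int ret = -1;

    assert(ino < 8 * sizeof(g->bitmap));

    pthread_mutex_lock(&g->lock);
    if (!(g->bitmap[ino / 8] & (1u << (ino % 8)))) {
        g->bitmap[ino / 8] |= (uint8_t)(1u << (ino % 8));
        g->flags &= (uint16_t)~TOY_BG_INODE_UNINIT;
        ret = 0;
    }
    pthread_mutex_unlock(&g->lock);
    return ret;
}
```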

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Fix race between read_block_bitmap() and mark_diskspace_used()
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:33 +0000 (10:58 -0500)]

(cherry picked from commit e8134b27e351e813414da3b95aa8eac6d3908088)

We need to make sure we update the block bitmap and clear
EXT4_BG_BLOCK_UNINIT flag with sb_bgl_lock held, since
ext4_read_block_bitmap() looks at EXT4_BG_BLOCK_UNINIT to decide
whether to initialize the block bitmap each time it is called
(introduced by commit c806e68f), and this can race with block
allocations in ext4_mb_mark_diskspace_used().

ext4_read_block_bitmap does:

spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
ext4_init_block_bitmap(sb, bh, block_group, desc);

Now on the block allocation side we do

mb_set_bits(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group), bitmap_bh->b_data,
ac->ac_b_ex.fe_start, ac->ac_b_ex.fe_len);
....
spin_lock(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group));
if (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);

i.e., on allocation we update the bitmap, then we take the sb_bgl_lock
and clear the EXT4_BG_BLOCK_UNINIT flag.  What can happen is that a
parallel ext4_read_block_bitmap zeroes out the bitmap in between
the above mb_set_bits and spin_lock(sb_bgl_lock..).

The race results in the following user-visible errors:
EXT4-fs error (device sdb1): ext4_mb_release_inode_pa: free 100, pa_free 105
EXT4-fs error (device sdb1): mb_free_blocks: double-free of inode 0's block ..

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: don't use blocks freed but not yet committed in buddy cache init
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:32 +0000 (10:58 -0500)]

(cherry picked from commit 7a2fcbf7f85737735fd44eb34b62315bccf6d6e4)

When we generate the buddy cache (especially during resize) we need to
make sure we don't use blocks that were freed but not yet committed.  This
makes sure we have the right free blocks count in the group
info and also in the bitmap.  This also ensures ordered mode
consistency.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Use an rbtree for tracking blocks freed during transaction.
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:31 +0000 (10:58 -0500)]

(cherry picked from commit c894058d66637c7720569fbe12957f4de64d9991 to allow
commit e21675d4 to be included in 2.6.27.y)

With this patch we track the blocks freed during a transaction using a
red-black tree.  We also make sure contiguous freed blocks are collected
in one node in the tree.
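
The merging behaviour can be illustrated with a much simpler container.  The
sketch below deliberately uses a small sorted array instead of a red-black
tree, so it shows only how contiguous freed ranges collapse into one entry,
not how the patch stores them.  All toy_* names are assumptions, and the
function assumes that freed ranges never overlap.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_extent {
    uint64_t start;
    uint64_t count;
};

#define TOY_MAX_EXTENTS 16

struct toy_free_set {
    size_t n;
    struct toy_extent e[TOY_MAX_EXTENTS];   /* sorted by start */
};

/* Record a freed range; returns 0 on success, -1 if the set is full. */
static int toy_free_blocks(struct toy_free_set *s, uint64_t start, uint64_t count)
{
    size_t i = 0;

    /* Find the first extent that starts at or after the new range. */
    while (i < s->n && s->e[i].start < start)
        i++;

    /* Extend the left neighbour if the new range starts where it ends. */
    if (i > 0 && s->e[i - 1].start + s->e[i - 1].count == start) {
        s->e[i - 1].count += count;
        /* The grown extent may now touch the right neighbour as well. */
        if (i < s->n && s->e[i - 1].start + s->e[i - 1].count == s->e[i].start) {
            s->e[i - 1].count += s->e[i].count;
            for (size_t j = i; j + 1 < s->n; j++)
                s->e[j] = s->e[j + 1];
            s->n--;
        }
        return 0;
    }

    /* Extend the right neighbour downwards if the new range ends at it. */
    if (i < s->n && start + count == s->e[i].start) {
        s->e[i].start = start;
        s->e[i].count += count;
        return 0;
    }

    /* Otherwise insert a new extent, keeping the array sorted. */
    if (s->n == TOY_MAX_EXTENTS)
        return -1;
    for (size_t j = s->n; j > i; j--)
        s->e[j] = s->e[j - 1];
    s->e[i].start = start;
    s->e[i].count = count;
    s->n++;
    return 0;
}
```

A tree gives the same merging logic with logarithmic lookup of the
neighbouring ranges, which matters when a transaction frees many scattered
blocks.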

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: cleanup mballoc header files
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:30 +0000 (10:58 -0500)]

(cherry picked from commit c3a326a657562dab81acf05aee106dc1fe345eb4)

Move some of the forward declarations of the static functions
to mballoc.c where they are used.  This enables us to include
mballoc.h in other .c files.  Also correct the buddy cache
documentation.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Use EXT4_GROUP_INFO_NEED_INIT_BIT during resize
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:29 +0000 (10:58 -0500)]

(cherry picked from commit 920313a726e04fef0f2c0bcb04ad8229c0e700d8)

The new groups added during resize are flagged as
need_init groups.  Make sure we properly initialize these
groups.  When we have block size < page size and we are adding
new groups, the page may still be marked uptodate even though
we haven't initialized the group.  While forcing the init
of the buddy cache, we need to make sure other groups that are part of
the same buddy cache page are not using the cache.
group_info->alloc_sem is added to ensure this.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Add blocks added during resize to bitmap
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:28 +0000 (10:58 -0500)]

(cherry picked from commit e21675d4b63975d09eb75c443c48ebe663d23e18)

With this change, new blocks added during resize
are marked as free in the block bitmap and the
group is flagged with the EXT4_GROUP_INFO_NEED_INIT_BIT
flag.  This makes sure that when mballoc tries to allocate
blocks from the new group, we reload the
buddy information using the bitmap present on disk.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Don't overwrite allocation_context ac_status
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:27 +0000 (10:58 -0500)]

(cherry picked from commit 032115fcef837a00336ddf7bda584e89789ea498)

We can call ext4_mb_check_limits even after successfully allocating
the requested blocks.  In that case, make sure we don't overwrite
ac_status if it already has the status AC_STATUS_FOUND.  This fixes
the lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
2.6.28-rc6-autokern1 #1
---------------------------------------------
fsstress/11948 is trying to acquire lock:
 (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278
.....

stack backtrace:
.....
 [<c04db974>] ext4_mb_regular_allocator+0xbb5/0xd44
.....

but task is already holding lock:
 (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278
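
The guard described at the top of this message amounts to a small state
check.  A minimal sketch, with enum values and the function name invented
for the example rather than taken from mballoc:

```c
/* Hypothetical allocation states for the example. */
enum toy_ac_status {
    TOY_AC_STATUS_CONTINUE = 1,
    TOY_AC_STATUS_FOUND    = 2,
    TOY_AC_STATUS_BREAK    = 3
};

/*
 * Ask the allocator loop to stop scanning, unless an allocation has
 * already succeeded.  A FOUND status is never downgraded, so the caller
 * does not go back and try to load (and lock) the same group again.
 */
static void toy_request_break(enum toy_ac_status *status)
{
    if (*status != TOY_AC_STATUS_FOUND)
        *status = TOY_AC_STATUS_BREAK;
}
```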

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
jbd2: Add barrier not supported test to journal_wait_on_commit_record
Theodore Ts'o [Tue, 17 Feb 2009 15:58:26 +0000 (10:58 -0500)]

(cherry picked from commit fd98496f467b3d26d05ab1498f41718b5ef13de5)

Xen doesn't report that barriers are not supported until buffer I/O is
reported as completed, instead of when the buffer I/O is submitted.
Add a check and a fallback codepath to journal_wait_on_commit_record()
to detect this case, so that attempts to mount ext4 filesystems on
LVM/devicemapper devices on Xen guests don't blow up with an "Aborting
journal on device XXX"; "Remounting filesystem read-only" error.

Thanks to Andreas Sundstrom for reporting this issue.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Widen type of ext4_sb_info.s_mb_maxs[]
Yasunori Goto [Tue, 17 Feb 2009 15:58:25 +0000 (10:58 -0500)]

(cherry picked from commit ff7ef329b268b603ea4a2303241ef1c3829fd574)

I chased down the cause of the following ext4 oops report, which was
observed on an ia64 box.

http://bugzilla.kernel.org/show_bug.cgi?id=12018

The cause is the element type of the s_mb_maxs array, which is defined as
"unsigned short" in the ext4_sb_info structure.  If the file system's block
size is 8k or greater, an unsigned short is not wide enough to contain the
value fs->blocksize << 3.
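
The arithmetic is easy to check in isolation.  In the sketch below,
uint16_t models a 16-bit unsigned short and the two helper names are
invented for the example: with an 8k block size, blocksize << 3 is 65536,
which does not fit in 16 bits and wraps to 0.

```c
#include <stdint.h>

/* Narrow storage: the result is truncated to 16 bits. */
static uint16_t toy_maxs_narrow(uint32_t blocksize)
{
    return (uint16_t)(blocksize << 3);
}

/* Widened storage: the full value is preserved. */
static uint32_t toy_maxs_wide(uint32_t blocksize)
{
    return blocksize << 3;
}
```

With a 4k block size both helpers agree (32768); from 8k upwards only the
wide version holds the intended value.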

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: avoid ext4_error when mounting a fs with a single bg
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:24 +0000 (10:58 -0500)]

(cherry picked from commit 565a9617b2151e21b22700e97a8b04e70e103153)

Remove some completely unneeded code which caused an ext4_error
to be generated when mounting a file system with only a single block
group.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Fix the delalloc writepages to allocate blocks at the right offset.
Aneesh Kumar K.V [Tue, 17 Feb 2009 15:58:23 +0000 (10:58 -0500)]

(cherry picked from commit 791b7f08954869d7b8ff438f3dac3cfb39778297)

When iterating through the pages which have mapped buffer_heads, we
failed to update the b_state value. This results in allocating blocks
at logical offset 0.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: tone down ext4_da_writepages warnings
Theodore Ts'o [Tue, 17 Feb 2009 15:58:22 +0000 (10:58 -0500)]

(cherry picked from commit 2a21e37e48b94388f2cc8c0392f104f5443d4bb8)

If the filesystem has errors, ext4_da_writepages() will return a *lot*
of errors, including lots and lots of stack dumps.  While it's true
that we are dropping user data on the floor, which is unfortunate, the
stack dumps aren't helpful, and they tend to obscure the true original
root cause of the problem.  So in the case where the filesystem has
aborted, return an EROFS right away.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4: Add support for non-native signed/unsigned htree hash algorithms
Theodore Ts'o [Tue, 17 Feb 2009 15:58:21 +0000 (10:58 -0500)]

(cherry picked from commit f99b25897a86fcfff9140396a97261ae65fed872)

The original ext3 hash algorithms assumed that variables of type char
were signed, as God and K&R intended.  Unfortunately, this assumption
is not true on some architectures.  Userspace support for marking
filesystems with non-native signed/unsigned chars was added two years
ago, but the kernel-side support was never added (until now).
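
To see why char signedness matters for a name hash, consider the toy hash
below.  It is not the htree hash; it is a deliberately simple multiplicative
hash, with function names invented for the example, that differs only in
whether each byte is widened as signed or unsigned.  The two variants agree
on pure ASCII names and diverge as soon as a byte has its top bit set.

```c
#include <stddef.h>
#include <stdint.h>

/* Widen each byte as a signed value, as on signed-char architectures. */
static uint32_t toy_hash_signed(const char *name, size_t len)
{
    const signed char *p = (const signed char *)name;
    uint32_t h = 0;

    while (len--)
        h = h * 31u + (uint32_t)(int32_t)*p++;
    return h;
}

/* Widen each byte as an unsigned value, as on unsigned-char architectures. */
static uint32_t toy_hash_unsigned(const char *name, size_t len)
{
    const unsigned char *p = (const unsigned char *)name;
    uint32_t h = 0;

    while (len--)
        h = h * 31u + (uint32_t)*p++;
    return h;
}
```

A directory indexed with one variant cannot be looked up with the other
once such names exist, which is why the filesystem has to record which
flavour was used.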

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
x86/cpa: make sure cpa is safe to call in lazy mmu mode
Jeremy Fitzhardinge [Wed, 11 Feb 2009 17:32:19 +0000 (09:32 -0800)]

commit 4f06b0436b2ddbd3b67b10e77098a6862787b3eb upstream.

Impact: fix race leading to crash under KVM and Xen

The CPA code may be called while we're in lazy mmu update mode - for
example, when using DEBUG_PAGE_ALLOC and doing a slab allocation
in an interrupt handler which interrupted a lazy mmu update.  In this
case, the in-memory pagetable state may be out of date due to pending
queued updates.  We need to flush any pending updates before inspecting
the page table.  Similarly, we must explicitly flush any modifications
CPA may have made (which comes down to flushing queued operations when
flushing the TLB).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
SCSI: libiscsi: fix iscsi pool leak
Mike Christie [Fri, 16 Jan 2009 18:36:51 +0000 (12:36 -0600)]

commit 2f5899a39dcffb404c9a3d06ad438aff3e03bf04 upstream.

I am not sure what happened. It looks like we have always leaked
the q->queue that is allocated from the kfifo_init call. nab finally
noticed that we were leaking and this patch fixes it by adding a
kfree call to iscsi_pool_free. kfifo_free is not used per kfifo_init's
instructions to use kfree.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext2/xip: refuse to change xip flag during remount with busy inodes
Carsten Otte [Wed, 11 Feb 2009 21:04:37 +0000 (13:04 -0800)]

commit 0e4a9b59282914fe057ab17027f55123964bc2e2 upstream.

For a reason that I was unable to understand in three months of debugging,
mount ext2 -o remount stopped working properly when remounting from
regular operation to xip, or the other way around.  According to a git
bisect search, the problem was introduced with the VM_MIXEDMAP/PTE_SPECIAL
rework in the vm:

commit 70688e4dd1647f0ceb502bbd5964fa344c5eb411
Author: Nick Piggin <npiggin@suse.de>
Date:   Mon Apr 28 02:13:02 2008 -0700

    xip: support non-struct page backed memory

In the failing scenario, the filesystem is mounted read only via the root=
kernel parameter on s390x.  During remount (in rc.sysinit), the inodes of
the bash binary and its libraries are busy and cannot be invalidated (the
bash which is running rc.sysinit resides on the subject filesystem).
Afterwards, another bash process (running ifup-eth) recurses into a
subshell and runs dup_mm (via fork).  Some of the mappings in this bash
process were created from inodes that could not be invalidated during
remount.

Both parent and child process crash some time later due to inconsistencies
in their address spaces.  The issue seems to be timing sensitive; various
attempts to recreate it have failed.

This patch refuses to change the xip flag during remount in case some
inodes cannot be invalidated.  This patch keeps users from running into
that issue.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Jared Hulbert <jaredeh@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
btsdio: free sk_buff with kfree_skb
Sergio Luis [Mon, 27 Oct 2008 06:08:48 +0000 (23:08 -0700)]

commit cbfd24a75f98fe731547d3bc995f3a1f1fed6b20 upstream.

Free the sk_buff with kfree_skb instead of kfree.

Signed-off-by: Sergio Luis <sergio@larces.uece.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Bluetooth: Fix TX error path in btsdio driver
Tomas Winkler [Sun, 30 Nov 2008 11:17:18 +0000 (12:17 +0100)]

commit 7644d63d1348ec044ccd8f775fefe5eb7cbcac69 upstream.

This patch fixes accumulation of the header when a packet is requeued
in the error path.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add support for VT6415 PCIE PATA IDE Host Controller
Zlatko Calusic [Wed, 18 Feb 2009 00:33:34 +0000 (01:33 +0100)]

commit 5955c7a2cfb6a35429adea5dc480002b15ca8cfc upstream.

Signed-off-by: Zlatko Calusic <zlatko.calusic@iskon.hr>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
3c505: do not set pcb->data.raw beyond its size
Roel Kluin [Fri, 13 Feb 2009 00:52:31 +0000 (16:52 -0800)]

commit 501aa061bd68169a5b54c123641f8dfa9ad31545 upstream.

Ensure that we do not set pcb->data.raw beyond its size, print an error message
and return false if we attempt to. A timeout message was printed one too early.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
sata_nv: give up hardreset on nf2
Tejun Heo [Thu, 12 Feb 2009 01:34:32 +0000 (10:34 +0900)]

commit 7dac745b8e367c99175b8f0d014d996f0e5ed9e5 upstream.

Kernel bz#12176 reports that nf2 hardreset simply doesn't work.  Give
up.  Argh...

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Robert Hancock <hancockr@shaw.ca>
Reported-by: Saro <saro_v@hotmail.it>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
powerpc/vsx: Fix VSX alignment handler for regs 32-63
Michael Neuling [Thu, 12 Feb 2009 19:08:58 +0000 (19:08 +0000)]

commit 26456dcfb8d8e43b1b64b2a14710694cf7a72f05 upstream.

Fix the VSX alignment handler for VSX registers 32-63.  These are stored
in the VMX part of the thread_struct, not the FPR part.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix Intel IOMMU write-buffer flushing
David Woodhouse [Fri, 13 Feb 2009 23:18:03 +0000 (23:18 +0000)]

commit ca77fde8e62cecb2c0769052228d15b901367af8 upstream.

This is the cause of the DMA faults and disk corruption that people have
been seeing. Some chipsets neglect to report the RWBF "capability" --
the flag which says that we need to flush the chipset write-buffer when
changing the DMA page tables, to ensure that the change is visible to
the IOMMU.

Override that bit on the affected chipsets, and everything is happy
again.

Thanks to Chris and Bhavesh and others for helping to debug.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Tested-by: Chris Wright <chrisw@sous-sol.org>
Reviewed-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
mqueue: fix si_pid value in mqueue do_notify()
Sukadev Bhattiprolu [Thu, 8 Jan 2009 02:08:50 +0000 (18:08 -0800)]

commit a6684999f7c6bddd75cf9755ad7ff44435f72fff upstream.

If a process registers for asynchronous notification on a POSIX message
queue, it gets a signal and a siginfo_t structure when a message arrives
on the message queue.  The si_pid in the siginfo_t structure is set to the
PID of the process that sent the message to the message queue.

The principle is the following:
. when mq_notify(SIGEV_SIGNAL) is called, the caller registers for
  notification when a msg arrives. The associated pid structure is stored in
  inode_info->notify_owner. Let's call this process P1.
. when mq_send() is called by say P2, P2 sends a signal to P1 to notify
  him about msg arrival.

The way .si_pid is set today is not correct, since it doesn't take into account
the fact that the process that is sending the message might not be in the
same namespace as the notified one.

This patch proposes to set si_pid to the sender's pid as seen in the
notify_owner's namespace.

Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Bastian Blank <bastian@waldi.eu.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
pid: implement ns_of_pid
Eric W. Biederman [Thu, 8 Jan 2009 02:08:46 +0000 (18:08 -0800)]

commit f9fb860f67b9542cd78d1558dec7058092b57d8e upstream.

A current problem with the pid namespace is that it is easy to do pid
related work after exit_task_namespaces which drops the nsproxy pointer.

However if we are doing pid namespace related work we are always operating
on some struct pid which retains the pid_namespace pointer of the pid
namespace it was allocated in.

So provide ns_of_pid which allows us to find the pid namespace a pid was
allocated in.

Using this we have the needed infrastructure to do pid namespace related
work at any time we have a struct pid, removing the chance of accidentally
having a NULL pointer dereference when accessing current->nsproxy.
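
As a rough user-space model of the helper, assume that a pid object stores
one (number, namespace) pair per nesting level and that the deepest level is
the namespace it was allocated in.  The struct layouts and toy_* names below
are simplifications made for this example, not the kernel definitions.

```c
#include <stddef.h>

struct toy_pid_namespace {
    int level;
};

struct toy_upid {
    int nr;                         /* pid number as seen in `ns` */
    struct toy_pid_namespace *ns;
};

#define TOY_MAX_LEVELS 4

struct toy_pid {
    unsigned level;                 /* index of the deepest namespace */
    struct toy_upid numbers[TOY_MAX_LEVELS];
};

/*
 * Return the namespace the pid was allocated in, or NULL for a NULL pid.
 * Nothing here depends on the current task, so the lookup keeps working
 * after the task has dropped its own namespace pointers.
 */
static struct toy_pid_namespace *toy_ns_of_pid(const struct toy_pid *pid)
{
    if (pid == NULL)
        return NULL;
    return pid->numbers[pid->level].ns;
}
```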

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Bastian Blank <bastian@waldi.eu.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Linux 2.6.27.18 (tag v2.6.27.18)
Greg Kroah-Hartman [Tue, 17 Feb 2009 17:47:50 +0000 (09:47 -0800)]

net: Fix data corruption when splicing from sockets.
Jarek Poplawski [Tue, 20 Jan 2009 01:03:56 +0000 (17:03 -0800)]

[ Upstream commit 8b9d3728977760f6bd1317c4420890f73695354e ]

The trick in socket splicing where we try to convert the skb->data
into a page based reference using virt_to_page() does not work so
well.

The idea is to pass the virt_to_page() reference via the pipe
buffer, and refcount the buffer using a SKB reference.

But if we are splicing from a socket to a socket (via sendpage)
this doesn't work.

The from side processing will grab the page (and SKB) references.
The sendpage() calls will grab page references only, return, and
then the from side processing completes and drops the SKB ref.

The page based reference to skb->data is not enough to keep the
kmalloc() buffer backing it from being reused.  Yet, that is
all that the socket send side has at this point.

This leads to data corruption if the skb->data buffer is reused
by SLAB before the send side socket actually gets the TX packet
out to the device.

The fix employed here is to simply allocate a page and copy the
skb->data bytes into that page.

This will hurt performance, but there is no clear way to fix this
properly without a copy at the present time, and it is important
to get rid of the data corruption.

With fixes from Herbert Xu.

Tested-by: Willy Tarreau <w@1wt.eu>
Foreseen-by: Changli Gao <xiaosuo@gmail.com>
Diagnosed-by: Willy Tarreau <w@1wt.eu>
Reported-by: Willy Tarreau <w@1wt.eu>
Fixed-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: mtpav - Fix initial value for input hwport
Takashi Iwai [Wed, 11 Feb 2009 23:06:42 +0000 (00:06 +0100)]
ALSA: mtpav - Fix initial value for input hwport

commit 32cf9a16f4af01573ddec1eb073111fc20a9d7d4 upstream.

Fix the initial value for input hwport.  The old value (-1) may cause
an Oops when a realtime MIDI byte is received before the input port is
explicitly given.
It is now set to broadcast by default instead.

Tested-by: Holger Dehnhardt <dehnhardt@ahdehnhardt.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomac80211: fix a buffer overrun in station debug code
Jianjun Kong [Tue, 11 Nov 2008 05:37:39 +0000 (21:37 -0800)]
mac80211: fix a buffer overrun in station debug code

commit 013cd397532e5803a1625954a884d021653da720 upstream.

net/mac80211/debugfs_sta.c:
The trailing zero was written to state[4], which is out of bounds.

Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86: fixup config space size of CPU functions for AMD family 11h
Andreas Herrmann [Tue, 25 Nov 2008 16:18:03 +0000 (17:18 +0100)]
x86: fixup config space size of CPU functions for AMD family 11h

commit ffd565a8b817d1eb4b25184e8418e8d96c3f56f6 upstream.

Impact: extend allowed configuration space access on 11h CPUs from 256 to 4K

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoide/libata: fix ata_id_is_cfa() (take 4)
Sergei Shtylyov [Sun, 1 Feb 2009 16:46:39 +0000 (20:46 +0400)]
ide/libata: fix ata_id_is_cfa() (take 4)

commit 2999b58b795ad81f10e34bdbbfd2742172f247e4 upstream.

When checking for the CFA feature set support, ata_id_is_cfa() tests bit 2 in
word 82 of the identify data instead of word 83;  it also checks the ATA/PI
version support in word 80 (which the CompactFlash specifications have as
reserved), so it has not the slightest chance of working on the modern CF
cards that don't have 0x848A in word 0...
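
The corrected test described above can be sketched as follows. This is a minimal illustration, not the kernel helper: the function and constant names are hypothetical, and only the word numbers, bit 2, and the 0x848A signature are taken from the text. The real helper may validate additional bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the word indices and values come from the text above. */
#define ID_WORD_CONFIG        0        /* word 0: general configuration */
#define ID_WORD_COMMAND_SET_2 83       /* word 83: command sets supported */
#define CFA_SIGNATURE         0x848A   /* traditional CompactFlash signature */
#define CFA_FEATURE_BIT       (1u << 2)

/* Returns true if the identify data looks like a CFA device. */
static bool id_is_cfa_sketch(const uint16_t *id)
{
    /* Older cards announce themselves in word 0. */
    if (id[ID_WORD_CONFIG] == CFA_SIGNATURE)
        return true;
    /* Otherwise look at bit 2 of word 83 (not word 82). */
    return (id[ID_WORD_COMMAND_SET_2] & CFA_FEATURE_BIT) != 0;
}
```

A card that sets bit 2 only in word 82 is no longer reported as CFA, while one that sets it in word 83 is.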

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agolibata: fix EH device failure handling
Tejun Heo [Thu, 29 Jan 2009 11:31:29 +0000 (20:31 +0900)]
libata: fix EH device failure handling

commit d89293abd95bfd7dd9229087d6c30c1464c5ac83 upstream.

The dev->pio_mode > XFER_PIO_0 test is there to avoid unnecessary
speed down warning messages but it accidentally disabled SATA link spd
down during configuration phase after reset where PIO mode is always
zero.

This patch fixes the problem by moving the test where it belongs.
This makes libata probing sequence behave better when the connection
is flaky at higher link speeds which isn't too uncommon for eSATA
devices.

[cebbert@redhat.com: trivial backport to 2.6.27]

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoHID: adjust report descriptor fixup for MS 1028 receiver
Jiri Kosina [Tue, 10 Feb 2009 22:00:34 +0000 (17:00 -0500)]
HID: adjust report descriptor fixup for MS 1028 receiver

commit 0fb21de0799a985d2da3da14ae5625d724256638 upstream

[Backport to 2.6.27: cebbert@redhat.com]

The report descriptor fixup for the MS 1028 receiver also changes the values
for Keyboard and Consumer, which incorrectly trims the range and causes valid
events to be thrown away before they reach userspace.

We need to keep the GenDesk usage fixup though, as it reports totally bogus
axis values.

Reported-by: Lucas Gadani <lgadani@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agobluetooth hid: enable quirk handling for Apple Wireless Keyboards in 2.6.27
Torsten Rausche [Thu, 12 Feb 2009 01:32:44 +0000 (02:32 +0100)]
bluetooth hid: enable quirk handling for Apple Wireless Keyboards in 2.6.27

This patch is basically a backport of
commit ee8a1a0a1a5817accd03ced7e7ffde3a4430f485 upstream
which was made after the big HID overhaul in 2.6.28.

Kernel 2.6.27 fails to handle quirks for the aluminum Apple Wireless
Keyboard because it is handled as a USB device and not as a Bluetooth
device. This patch expands 'hidp_blacklist' to make the kernel handle
the keyboard in the same way as the Apple wireless Mighty Mouse (also a
Bluetooth device).

Signed-off-by: Torsten Rausche <torsten@rausche.net>
Cc: Jan Scholz <Scholz@fias.uni-frankfurt.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonetfilter: xt_sctp: sctp chunk mapping doesn't work
Qu Haoran [Thu, 12 Feb 2009 07:07:38 +0000 (08:07 +0100)]
netfilter: xt_sctp: sctp chunk mapping doesn't work

Upstream commit: d4e2675a

When the user tries to map all chunks given in the argument, the kernel
works on a copy of the chunkmap, but at the end it checks the
original instead of the copy.

Signed-off-by: Qu Haoran <haoran.qu@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonetfilter: fix tuple inversion for Node information request
Eric Leblond [Thu, 12 Feb 2009 07:07:37 +0000 (08:07 +0100)]
netfilter: fix tuple inversion for Node information request

Upstream commit: a51f42f3c

The patch fixes a typo in the inverse mapping of Node Information
request. Following draft-ietf-ipngwg-icmp-name-lookups-09, "Querier"
sends a type 139 (ICMPV6_NI_QUERY) packet to "Responder", which answers
with a type 140 (ICMPV6_NI_REPLY) packet.
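
The query/reply pairing can be illustrated with a small inverse-type table. This is only a sketch: the table layout (inverse type plus one, with zero meaning "no inverse") and all names ending in `_sketch` are assumptions made for the example; only the type numbers 139 and 140 come from the text.

```c
#include <stdint.h>

#define ICMPV6_NI_QUERY 139
#define ICMPV6_NI_REPLY 140

/* Assumed layout: entry = inverse type + 1, and 0 means "no inverse". */
static const uint8_t invmap_sketch[] = {
    [ICMPV6_NI_QUERY - 128] = ICMPV6_NI_REPLY + 1,
    [ICMPV6_NI_REPLY - 128] = ICMPV6_NI_QUERY + 1,
};

/* Stores the inverse of 'type' in *inverse; returns 0 on success, -1 if none. */
static int invert_type_sketch(uint8_t type, uint8_t *inverse)
{
    unsigned int idx;

    if (type < 128)
        return -1;
    idx = type - 128u;
    if (idx >= sizeof(invmap_sketch) || invmap_sketch[idx] == 0)
        return -1;
    *inverse = (uint8_t)(invmap_sketch[idx] - 1);
    return 0;
}
```

With such a table, a query maps to a reply and a reply maps back to a query; a typo that maps a type to itself breaks that symmetry.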

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosparc64: Annotate sparc64 specific syscalls with SYSCALL_DEFINEx()
David S. Miller [Fri, 13 Feb 2009 09:09:19 +0000 (01:09 -0800)]
sparc64: Annotate sparc64 specific syscalls with SYSCALL_DEFINEx()

[ Upstream commit e42650196df34789c825fa83f8bb37a5d5e52c14 ]

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosparc: Enable syscall wrappers for 64-bit (CVE-2009-0029)
Christian Borntraeger [Fri, 13 Feb 2009 09:08:47 +0000 (01:08 -0800)]
sparc: Enable syscall wrappers for 64-bit (CVE-2009-0029)

[ Upstream commit 67605d6812691bbd2158d2f60259e0407611bc1b ]

sparc64 needs sign-extended function parameters. We have to enable
the system call wrappers.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agotcp: Fix length tcp_splice_data_recv passes to skb_splice_bits.
Dimitris Michailidis [Tue, 27 Jan 2009 06:15:31 +0000 (22:15 -0800)]
tcp: Fix length tcp_splice_data_recv passes to skb_splice_bits.

[ Upstream commit 9fa5fdf291c9b58b1cb8b4bb2a0ee57efa21d635 ]

tcp_splice_data_recv has two lengths to consider: the len parameter it
gets from tcp_read_sock, which specifies the amount of data in the skb,
and rd_desc->count, which is the amount of data the splice caller still
wants.  Currently it passes just the latter to skb_splice_bits, which then
splices min(rd_desc->count, skb->len - offset) bytes.

Most of the time this is fine, except when the skb contains urgent data.
In that case len goes only up to the urgent byte and is less than
skb->len - offset.  By ignoring len tcp_splice_data_recv may a) splice
data tcp_read_sock told it not to, b) return to tcp_read_sock a value > len.

Now, tcp_read_sock doesn't handle used > len and leaves the socket in a
bad state (both sk_receive_queue and copied_seq are bad at that point)
resulting in duplicated data and corruption.

Fix by passing min(rd_desc->count, len) to skb_splice_bits.
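
A userspace sketch of the length calculation may help. The function names are hypothetical and the parameters simply mirror the quantities named above (rd_desc->count as `wanted`, the tcp_read_sock limit as `len`); this is not the kernel code.

```c
#include <stddef.h>

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Old behaviour: only the caller's remaining count bounds the splice. */
static size_t splice_len_old_sketch(size_t wanted, size_t skb_len, size_t offset)
{
    return min_size(wanted, skb_len - offset);
}

/* Fixed behaviour: never splice more than tcp_read_sock allowed (len). */
static size_t splice_len_fixed_sketch(size_t wanted, size_t len)
{
    return min_size(wanted, len);
}
```

For an skb of 1000 bytes whose urgent byte limits `len` to 300, the old calculation yields 1000 while the fixed one yields 300.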

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agotcp: splice as many packets as possible at once
Willy Tarreau [Wed, 14 Jan 2009 00:04:36 +0000 (16:04 -0800)]
tcp: splice as many packets as possible at once

[ Upstream commit 33966dd0e2f68f26943cd9ee93ec6abbc6547a8e ]

As spotted by Willy Tarreau, current splice() from tcp socket to pipe is not
optimal. It processes at most one segment per call.
This results in low performance and very high overhead due to syscall rate
when splicing from interfaces which do not support LRO.

Willy provided a patch inside tcp_splice_read(), but a better fix
is to let tcp_read_sock() process as many segments as possible, so
that tcp_rcv_space_adjust() and tcp_cleanup_rbuf() are called less
often.

With this change, splice() behaves like tcp_recvmsg(), being able
to consume many skbs in one system call. With typical 1460 bytes
of payload per frame, that means splice(SPLICE_F_NONBLOCK) can return
16*1460 = 23360 bytes.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agopacket: Avoid lock_sock in mmap handler
Herbert Xu [Fri, 30 Jan 2009 22:12:06 +0000 (14:12 -0800)]
packet: Avoid lock_sock in mmap handler

[ Upstream commit 905db44087855e3c1709f538ecdc22fd149cadd8 ]

As the mmap handler gets called under mmap_sem, and we may grab
mmap_sem elsewhere under the socket lock to access user data, we
should avoid grabbing the socket lock in the mmap handler.

Since the only thing we care about in the mmap handler is for
pg_vec* to be invariant, i.e., to exclude packet_set_ring, we
can achieve this by simply using a new mutex.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Martin MOKREJŠ <mmokrejs@ribosome.natur.cuni.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonet: Fix OOPS in skb_seq_read().
Shyam Iyer [Fri, 30 Jan 2009 00:12:42 +0000 (16:12 -0800)]
net: Fix OOPS in skb_seq_read().

[ Upstream commit 71b3346d182355f19509fadb8fe45114a35cc499 ]

It oopsed for me in skb_seq_read. addr2line said it was
linux-2.6/net/core/skbuff.c:2228, which is this line:

while (st->frag_idx < skb_shinfo(st->cur_skb)->nr_frags) {

I added some printks in there and it looks like we hit this:

        } else if (st->root_skb == st->cur_skb &&
                   skb_shinfo(st->root_skb)->frag_list) {
                 st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                 st->frag_idx = 0;
                 goto next_skb;
        }

Actually I did some testing and added a few printks and found that the
st->cur_skb->data was 0 and hence the ptr used by iscsi_tcp was null.
This caused the kernel panic.

  if (abs_offset < block_limit) {
- *data = st->cur_skb->data + abs_offset;
+ *data = st->cur_skb->data + (abs_offset - st->stepped_offset);

I enabled debug_tcp and, with a few printks, found that the code did
not go to the next_skb label. The sequence being followed was this:

It hit this if condition:

        if (st->cur_skb->next) {
                st->cur_skb = st->cur_skb->next;
                st->frag_idx = 0;
                goto next_skb;

And so, now the st pointer is shifted to the next skb whereas actually
it should have hit the second else if first since the data is in the
frag_list.

        else if (st->root_skb == st->cur_skb &&
                 skb_shinfo(st->root_skb)->frag_list) {
                st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                goto next_skb;
        }

By reversing the two conditions, the attached patch fixes the issue for me
on top of Herbert's patches.

Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonet: Fix frag_list handling in skb_seq_read
Herbert Xu [Fri, 30 Jan 2009 00:07:52 +0000 (16:07 -0800)]
net: Fix frag_list handling in skb_seq_read

[ Upstream commit 95e3b24cfb4ec0479d2c42f7a1780d68063a542a ]

The frag_list handling was broken in skb_seq_read:

1) We didn't add the stepped offset when looking at the head
area of fragments other than the first.

2) We didn't take the stepped offset away when setting the data
pointer in the head area.

3) The frag index wasn't reset.

This patch fixes all three issues.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agovirtio_net: Fix MAX_PACKET_LEN to support 802.1Q VLANs
Alex Williamson [Fri, 13 Feb 2009 08:06:29 +0000 (00:06 -0800)]
virtio_net: Fix MAX_PACKET_LEN to support 802.1Q VLANs

[ Upstream commit e918085aaff34086e265f825dd469926b1aec4a4 ]

802.1Q expanded the maximum ethernet frame size by 4 bytes for the
VLAN tag.  We're not taking this into account in virtio_net, which
means the buffers we provide to the backend in the virtqueue RX ring
aren't big enough to hold a full MTU VLAN packet.  For QEMU/KVM,
this results in the backend exiting with a packet truncation error.
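
The size arithmetic can be checked with a short sketch. The constant names carry a `_SK` suffix to mark them as local to this example; the values are the standard Ethernet header (14), VLAN tag (4), and payload (1500) sizes, and any virtio header placed in front of the frame is ignored here.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    ETH_HLEN_SK         = 14,    /* destination, source, ethertype */
    VLAN_HLEN_SK        = 4,     /* 802.1Q tag */
    ETH_DATA_LEN_SK     = 1500,  /* maximum payload (MTU) */
    MAX_PACKET_LEN_OLD  = ETH_HLEN_SK + ETH_DATA_LEN_SK,
    MAX_PACKET_LEN_VLAN = ETH_HLEN_SK + VLAN_HLEN_SK + ETH_DATA_LEN_SK
};

/* True if a received frame of frame_len bytes fits in an RX buffer. */
static bool frame_fits_sketch(size_t frame_len, size_t buf_len)
{
    return frame_len <= buf_len;
}
```

A full-MTU tagged frame is 1518 bytes, so it does not fit in a 1514-byte buffer but does fit once the tag is accounted for.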

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoudp: increments sk_drops in __udp_queue_rcv_skb()
Eric Dumazet [Mon, 2 Feb 2009 21:41:57 +0000 (13:41 -0800)]
udp: increments sk_drops in __udp_queue_rcv_skb()

[ Upstream commit e408b8dcb5ce42243a902205005208e590f28454 ]

Commit 93821778def10ec1e69aa3ac10adee975dad4ff3 (udp: Fix rcv socket
locking) accidentally removed sk_drops increments for UDP IPV4
sockets.

This field can be used to detect incorrect sizing of socket receive
buffers.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoudp: Fix UDP short packet false positive
Jesper Dangaard Brouer [Thu, 5 Feb 2009 23:05:45 +0000 (15:05 -0800)]
udp: Fix UDP short packet false positive

[ Upstream commit 7b5e56f9d635643ad54f2f42e69ad16b80a2cff1 ]

The UDP header pointer assignment must happen after calling
pskb_may_pull(), as pskb_may_pull() can potentially alter the SKB
buffer.

This was exposed by running multicast traffic through the NIU driver,
as it won't prepull the protocol headers into the linear area on
receive.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agotun: Fix unicast filter overflow
Alex Williamson [Mon, 9 Feb 2009 01:49:17 +0000 (17:49 -0800)]
tun: Fix unicast filter overflow

[ Upstream commit cfbf84fcbcda98bb91ada683a8dc8e6901a83ebd ]

Tap devices can make use of a small MAC filter set via the
TUNSETTXFILTER ioctl.  The filter has a set of exact matches
plus a hash for imperfect filtering of additional multicast
addresses.  The current code is unbalanced, adding unicast
addresses to the multicast hash, but only checking the hash
against multicast addresses.  This results in the filter
dropping unicast addresses that overflow the exact filter.
The fix is simply to disable the filter by leaving count set
to zero if we find non-multicast addresses after the exact
match table is filled.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agotun: Add some missing TUN compat ioctl translations.
David S. Miller [Fri, 30 Jan 2009 00:53:35 +0000 (16:53 -0800)]
tun: Add some missing TUN compat ioctl translations.

[ Upstream commit df1c46b2b6876d0a1b1b4740f009fa69d95ebbc9 ]

Based upon a report from Michael Tokarev <mjt@tls.msk.ru>:

Just saw in dmesg:

ioctl32(kvm:4408): Unknown cmd fd(9) cmd(800454cf){t:'T';sz:4} arg(ffc668e4) on /dev/net/tun

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosungem: Soft lockup in sungem on Netra AC200 when switching interface up
Ilkka Virta [Sat, 7 Feb 2009 06:00:36 +0000 (22:00 -0800)]
sungem: Soft lockup in sungem on Netra AC200 when switching interface up

[ Upstream commit 71822faa3bc0af5dbf5e333a2d085f1ed7cd809f ]

From: Ilkka Virta <itvirta@iki.fi>

In the lockup situation the driver seems to go off in an eternal storm
of interrupts right after calling request_irq(). It doesn't actually
do anything interesting in the interrupt handler. Since connecting the link
afterwards works, something later in initialization must fix this.

Looking at gem_do_start() and gem_open(), it seems that the only thing
done while opening the device after the request_irq(), is a call to
napi_enable().

I don't know what the ordering requirements are for the
initialization, but I boldly tried to move the napi_enable() call
inside gem_do_start() before the link state is checked and interrupts
subsequently enabled, and it seems to work for me. Doesn't even break
anything too obvious...

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosky2: fix hard hang with netconsoling and iface going up
Alexey Dobriyan [Fri, 30 Jan 2009 21:45:31 +0000 (13:45 -0800)]
sky2: fix hard hang with netconsoling and iface going up

[ Upstream commit a11da890e4c9850411303efcf6514f048ca880ee ]

Printing anything over netconsole before hw is up and running is,
of course, not going to work.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonet: packet socket packet_lookup_frame fix
Sebastiano Di Paola [Fri, 30 Jan 2009 23:37:17 +0000 (23:37 +0000)]
net: packet socket packet_lookup_frame fix

[ Upstream commit f9e6934502e46c363100245f137ddf0f4b1cb574 ]

packet_lookup_frame() fails to get the user frame if the current frame
header status contains extra flags.
This is due to a wrong assumption about operator precedence in the
frame status tests.
Fixed by forcing the right precedence with explicit brackets.
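
The precedence pitfall can be reproduced in a few lines. The exact shape of the kernel's test is an assumption here, and the function names are hypothetical; the status values match the packet socket ring ABI (kernel-owned is 0, user-owned is 1, and 2 is used as an example of an extra flag).

```c
#define TP_STATUS_KERNEL 0UL
#define TP_STATUS_USER   1UL
#define TP_STATUS_COPY   2UL  /* an extra flag that may accompany USER */

/* Buggy: parses as (status != tp_status) ? USER : KERNEL. */
static int status_mismatch_buggy(unsigned long tp_status, unsigned long status)
{
    return (status != tp_status ? TP_STATUS_USER : TP_STATUS_KERNEL) != 0;
}

/* Fixed: reduce tp_status to USER or KERNEL first, then compare. */
static int status_mismatch_fixed(unsigned long tp_status, unsigned long status)
{
    return status != (tp_status ? TP_STATUS_USER : TP_STATUS_KERNEL);
}
```

With a frame status of USER plus an extra flag, the buggy form reports a mismatch against USER, whereas the bracketed form treats any non-zero status as user-owned.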

Signed-off-by: Paolo Abeni <paolo.abeni@gmail.com>
Signed-off-by: Sebastiano Di Paola <sebastiano.dipaola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonet: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2
Clément Lecigne [Fri, 13 Feb 2009 00:59:09 +0000 (16:59 -0800)]
net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2

[ Upstream commit df0bca049d01c0ee94afb7cd5dfd959541e6c8da ]

In the function sock_getsockopt(), located in net/core/sock.c, optval v.val
is not correctly initialized and is returned directly to userland when
the SO_BSDCOMPAT option is requested.

This dummy code should trigger the bug:

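#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>

int main(void)
{
	unsigned char buf[4] = { 0, 0, 0, 0 };
	socklen_t len = sizeof(buf);	/* in: buffer size, out: option length */
	int sock;

	sock = socket(33, 2, 2);
	/* level 1 is SOL_SOCKET on x86 Linux; any leaked bytes land in buf */
	getsockopt(sock, 1, SO_BSDCOMPAT, buf, &len);
	printf("%x%x%x%x\n", buf[0], buf[1], buf[2], buf[3]);
	close(sock);
	return 0;
}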

Here is a patch that fixes this bug by initializing v.val just after its
declaration.

Signed-off-by: Clément Lecigne <clement.lecigne@netasq.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoipv6: Copy cork options in ip6_append_data
Herbert Xu [Thu, 5 Feb 2009 23:15:50 +0000 (15:15 -0800)]
ipv6: Copy cork options in ip6_append_data

[ Upstream commit 0178b695fd6b40a62a215cbeb03dd51ada3bb5e0 ]

As the options passed to ip6_append_data may be ephemeral, we need
to duplicate them for corking.  This patch applies the simplest fix
which is to memdup all the relevant bits.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoipv6: Disallow rediculious flowlabel option sizes.
David S. Miller [Fri, 6 Feb 2009 08:49:55 +0000 (00:49 -0800)]
ipv6: Disallow rediculious flowlabel option sizes.

[ Upstream commit 684de409acff8b1fe8bf188d75ff2f99c624387d ]

Just like PKTINFO, limit the options area to 64K.

Based upon report by Eric Sesterhenn and analysis by
Roland Dreier.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoipv4: fix infinite retry loop in IP-Config
Benjamin Zores [Fri, 30 Jan 2009 00:19:13 +0000 (16:19 -0800)]
ipv4: fix infinite retry loop in IP-Config

[ Upstream commit 9d8dba6c979fa99c96938c869611b9a23b73efa9 ]

Signed-off-by: Benjamin Zores <benjamin.zores@alcatel-lucent.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agodrivers/net/skfp: if !capable(CAP_NET_ADMIN): inverted logic
Roel Kluin [Fri, 30 Jan 2009 01:32:20 +0000 (17:32 -0800)]
drivers/net/skfp: if !capable(CAP_NET_ADMIN): inverted logic

[ Upstream commit c25b9abbc2c2c0da88e180c3933d6e773245815a ]

Fix inverted logic

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosctp: Properly timestamp outgoing data chunks for rtx purposes
Vlad Yasevich [Thu, 22 Jan 2009 22:53:01 +0000 (14:53 -0800)]
sctp: Properly timestamp outgoing data chunks for rtx purposes

[ Upstream commit 759af00ebef858015eb68876ac1f383bcb6a1774 ]

Recent changes to the retransmit code exposed a long standing
bug where it was possible for a chunk to be time stamped
after the retransmit timer was reset.  This caused a rare
situation where the retransmit timer has expired, but
nothing was marked for retransmission because all of the
timestamps on data were less than 1 rto ago.  As a result,
the timer was never restarted since nothing was retransmitted,
and this resulted in a hung association that couldn't
complete the data transfer.  The solution is to timestamp
the chunk when it's added to the packet for transmission
purposes.  After the packet is transmitted the rtx timer
is restarted.  This guarantees that when the timer expires,
there will be data to retransmit.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosctp: Correctly start rtx timer on new packet transmissions.
Vlad Yasevich [Thu, 22 Jan 2009 22:52:43 +0000 (14:52 -0800)]
sctp: Correctly start rtx timer on new packet transmissions.

[ Upstream commit 6574df9a89f9f7da3a4e5cee7633d430319d3350 ]

Commit 62aeaff5ccd96462b7077046357a6d7886175a57
(sctp: Start T3-RTX timer when fast retransmitting lowest TSN)
introduced a regression where it was possible to forcibly
restart the sctp retransmit timer at the transmission of any
new chunk.  This resulted in much longer timeout times and
sometimes hung sctp connections.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosctp: Fix crc32c calculations on big-endian arhes.
Vlad Yasevich [Thu, 22 Jan 2009 22:52:23 +0000 (14:52 -0800)]
sctp: Fix crc32c calculations on big-endian arhes.

[ Upstream commit 9c5ff5f75d0d0a1c7928ecfae3f38418b51a88e3 ]

The crc32c algorithm provides a byteswapped result.  On little-endian
arches, the result ends up in big-endian/network byte order.
On big-endian arches, the result ends up in little-endian
order and needs to be byte swapped again.  Thus calling cpu_to_le32
gives the right output.

Tested-by: Jukka Taimisto <jukka.taimisto@mail.suomi.net>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agozd1211rw: treat MAXIM_NEW_RF(0x08) as UW2453_RF(0x09) for TP-Link WN322/422G
Hin-Tak Leung [Wed, 4 Feb 2009 23:40:43 +0000 (23:40 +0000)]
zd1211rw: treat MAXIM_NEW_RF(0x08) as UW2453_RF(0x09) for TP-Link WN322/422G

commit efb43f4b2ccf8066abc3920a0e6858e4350a65c7 upstream.

Three people (Petr Mensik <pihhan@cipis.net>
["si" should be U+0161 U+00ED], Stephen Ho <stephenhoinhk@gmail.com>
on zd1211-devs and Ismael Ojeda Perez <iojedaperez@gmail.com>
on linux-wireless) reported success in getting TP-Link WN322G/WN422G
working by treating MAXIM_NEW_RF(0x08) as UW2453_RF(0x09) for rf
chip hardware initialization.

Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Tested-by: Petr Mensik <pihhan@cipis.net>
Tested-by: Stephen Ho <stephenhoinhk@gmail.com>
Tested-by: Ismael Ojeda Perez <iojedaperez@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agozd1211rw: adding 0ace:0xa211 as a ZD1211 device
Hin-Tak Leung [Sun, 8 Feb 2009 02:13:56 +0000 (02:13 +0000)]
zd1211rw: adding 0ace:0xa211 as a ZD1211 device

commit 14990c69b5f51dd57b4e0e2373de50239ac861e2 upstream.

Christoph Biedl <sourceforge.bnwi@manchmal.in-ulm.de> reported success
in the sourceforge zd1211 mailing list on this addition. This product ID
was supported by the vendor driver ZD1211LnxDrv 2.22.0.0 (and possibly
earlier) and it probably should have been added earlier.

Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Tested-by: Christoph Biedl <sourceforge.bnwi@manchmal.in-ulm.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86, vmi: put a missing paravirt_release_pmd in pgd_dtor
Alok Kataria [Fri, 6 Feb 2009 18:29:35 +0000 (10:29 -0800)]
x86, vmi: put a missing paravirt_release_pmd in pgd_dtor

commit 55a8ba4b7f76bebd7e8ce3f74c04b140627a1bad upstream.

Commit 6194ba6ff6ccf8d5c54c857600843c67aa82c407 ("x86: don't special-case
pmd allocations as much") made changes to the way we handle pmd allocations,
and while doing that it dropped a call to paravirt_release_pd on the
pgd page from the pgd_dtor code path.

Because of this missing release, the hypervisor is unaware that the
pgd page has been freed, and it ends up still tracking this page as a
page table page.

After this the guest may start using the same page for other purposes, and
depending on what use the page is put to, it may result in various performance
and/or functional issues (hangs, reboots).

Since this release is only required for VMI, I now release the pgd page from
the (vmi)_pgd_free hook.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agowriteback: fix break condition
Federico Cuello [Wed, 11 Feb 2009 21:04:39 +0000 (13:04 -0800)]
writeback: fix break condition

commit 89e1219004b3657cc014521663eeef0744f1c99d upstream.

Commit dcf6a79dda5cc2a2bec183e50d829030c0972aaa ("write-back: fix
nr_to_write counter") fixed nr_to_write counter, but didn't set the break
condition properly.

If nr_to_write == 0 after being decremented it will loop one more time
before setting done = 1 and breaking the loop.

[akpm@linux-foundation.org: coding-style fixes]
Cc: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agowrite-back: fix nr_to_write counter
Artem Bityutskiy [Mon, 2 Feb 2009 16:33:49 +0000 (18:33 +0200)]
write-back: fix nr_to_write counter

commit dcf6a79dda5cc2a2bec183e50d829030c0972aaa upstream.

Commit 05fe478dd04e02fa230c305ab9b5616669821dd3 introduced some
@wbc->nr_to_write breakage.

It made the following changes:
 1. Decrement wbc->nr_to_write instead of nr_to_write
 2. Decrement wbc->nr_to_write _only_ if wbc->sync_mode == WB_SYNC_NONE
 3. If synced nr_to_write pages, stop only if wbc->sync_mode ==
    WB_SYNC_NONE, otherwise keep going.

However, according to the commit message, the intention was to only make
change 3.  Change 1 is a bug.  Change 2 does not seem to be necessary,
and it breaks UBIFS expectations, so if needed, it should be done
separately later.  And change 2 does not seem to be documented in the
commit message.

This patch does the following:
 1. Undo changes 1 and 2
 2. Add a comment explaining change 3 (it is very useful to have comments
    in _code_, not only in the commit).

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agow1: w1 temp calculation overflow fix
Ian Dall [Wed, 11 Feb 2009 21:04:46 +0000 (13:04 -0800)]
w1: w1 temp calculation overflow fix

commit 507e2fbaaacb6f164b4125b87c5002f95143174b upstream.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12646

When the temperature exceeds 32767 milli-degrees the temperature overflows
to -32768 milli-degrees.  These are both well within the -55 to +125 degree
range for the sensor.

Fix overflow in left-shift of a u8.
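
A hedged sketch of an overflow-free conversion follows. It is not the driver's code: the function name is hypothetical, and it assumes a 16-bit two's complement reading in rom[0] (low byte) and rom[1] (high byte) with a resolution of 1/16 degree per bit.

```c
#include <stdint.h>

/* Converts a raw 16-bit sensor reading to milli-degrees using int math. */
static int temp_millideg_sketch(const uint8_t rom[9])
{
    /* u8 operands are promoted to int, so the shift cannot overflow here. */
    int raw = (rom[1] << 8) | rom[0];

    /* Interpret the 16-bit pattern as two's complement. */
    if (raw >= 0x8000)
        raw -= 0x10000;

    /* Keep the product in an int: values above 32767 must survive. */
    return raw * 1000 / 16;
}
```

A reading of 0x0550 gives 85000 milli-degrees, which would not fit in a 16-bit signed variable.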

Signed-off-by: Ian Dall <ian@beware.dropbear.id.au>
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosyscall define: fix uml compile bug
Heiko Carstens [Wed, 11 Feb 2009 21:04:38 +0000 (13:04 -0800)]
syscall define: fix uml compile bug

commit 6c5979631b4b03c9288776562c18036765e398c1 upstream.

With the new system call defines we get this on uml:

arch/um/sys-i386/built-in.o: In function `sys_call_table':
(.rodata+0x308): undefined reference to `sys_sigprocmask'

Reason for this is that uml passes the preprocessor option
-Dsigprocmask=kernel_sigprocmask to gcc when compiling the kernel.
This causes SYSCALL_DEFINE3(sigprocmask, ...) to be expanded to
SYSCALL_DEFINEx(3, kernel_sigprocmask, ...) and finally to a system
call named sys_kernel_sigprocmask.  However sys_sigprocmask is missing
because of this.

To avoid macro expansion for the system call name just concatenate the
name at first define instead of carrying it through several levels.
This was pointed out by Al Viro.
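
The preprocessor behaviour behind this can be demonstrated outside the kernel. In this sketch the `#define` stands in for the -D option, and the LATE/EARLY macro names are hypothetical stand-ins for the two ways of building the symbol name; it is not the kernel's SYSCALL_DEFINE implementation.

```c
#include <string.h>

/* Stands in for the -Dsigprocmask=kernel_sigprocmask command-line option. */
#define sigprocmask kernel_sigprocmask

#define STR_(x) #x
#define STR(x) STR_(x)

/* Late concatenation: 'name' is passed on as a plain argument first,
 * so it is macro-expanded before ## ever sees it. */
#define LATE_INNER(name) sys_##name
#define LATE(name) LATE_INNER(name)

/* Early concatenation: 'name' is an operand of ## at the first level,
 * and operands of ## are not macro-expanded. */
#define EARLY_INNER(sname) sname
#define EARLY(name) EARLY_INNER(sys_##name)

static const char *late_name(void)  { return STR(LATE(sigprocmask)); }
static const char *early_name(void) { return STR(EARLY(sigprocmask)); }

/* Non-zero when the two approaches produce different symbol names. */
static int names_differ(void)
{
    return strcmp(late_name(), early_name()) != 0;
}
```

The late form yields sys_kernel_sigprocmask, while the early form yields the expected sys_sigprocmask.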

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agopowerpc/fsl-booke: Fix mapping functions to use phys_addr_t
Kumar Gala [Tue, 10 Feb 2009 03:08:07 +0000 (21:08 -0600)]
powerpc/fsl-booke: Fix mapping functions to use phys_addr_t

commit 6c24b17453c8dc444a746e45b8a404498fc9fcf7 upstream.

Fixed v_mapped_by_tlbcam() and p_mapped_by_tlbcam() to use phys_addr_t
instead of unsigned long.  In 36-bit physical mode we really need these
functions to deal with phys_addr_t when trying to match a physical
address or when returning one.
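
The truncation problem can be shown with fixed-width types. This sketch assumes unsigned long is 32 bits wide on the affected 32-bit targets and models it with uint32_t; the function names and the example address are illustrative only.

```c
#include <stdint.h>

/* Models returning a physical address through a 32-bit unsigned long. */
static uint32_t truncate_to_ulong32_sketch(uint64_t phys)
{
    return (uint32_t)phys;      /* the upper 4 of 36 bits are lost */
}

/* True if the address survives a round trip through the 32-bit type. */
static int survives_ulong32_sketch(uint64_t phys)
{
    return (uint64_t)truncate_to_ulong32_sketch(phys) == phys;
}
```

A 36-bit address such as 0xFE0000000 comes back as 0xE0000000, so a lookup by physical address can never match it.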

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agopowerpc: Fix swapcontext system for VSX + old ucontext size
Michael Neuling [Thu, 23 Oct 2008 00:42:36 +0000 (00:42 +0000)]
powerpc: Fix swapcontext system for VSX + old ucontext size

commit 16c29d180becc5bdf92fd0fc7314a44a671b5f4e upstream.

Since VSX support was added, we now have two sizes of ucontext_t;
the older, smaller size without the extra VSX state, and the new
larger size with the extra VSX state.  A program using the
sys_swapcontext system call and supplying smaller ucontext_t
structures will currently get an EINVAL error if the task has
used VSX (e.g. because of calling library code that uses VSX) and
the old_ctx argument is non-NULL (i.e. the program is asking for
its current context to be saved).  Thus the program will start
getting EINVAL errors on calls that previously worked.

This commit changes this behaviour so that we don't send an EINVAL in
this case.  It will now return the smaller context but the VSX MSR bit
will always be cleared to indicate that the ucontext_t doesn't include
the extra VSX state, even if the task has executed VSX instructions.

Both 32 and 64 bit cases are updated.

[paulus@samba.org - also fix some access_ok() and get_user() calls]

Thanks to Ben Herrenschmidt for noticing this problem.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago parport: parport_serial, don't bind netmos ibm 0299
Jiri Slaby [Wed, 11 Feb 2009 21:04:40 +0000 (13:04 -0800)]
parport: parport_serial, don't bind netmos ibm 0299

commit 3abdbf90a3ffb006108c831c56b092e35483b6ec upstream.

Since netmos 9835 with subids 0x1014(IBM):0x0299 is now bound with
serial/8250_pci, because it has no parallel ports and subdevice id isn't
in the expected form, return -ENODEV from probe function.

This is performed in netmos preinit_hook.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago nbd: fix I/O hang on disconnected nbds
Paul Clements [Wed, 11 Feb 2009 21:04:45 +0000 (13:04 -0800)]
nbd: fix I/O hang on disconnected nbds

commit 4d48a542b42747c36a5937447d9c3de7c897ea50 upstream.

Fix a problem that causes I/O to a disconnected (or partially initialized)
nbd device to hang indefinitely.  To reproduce:

# ioctl NBD_SET_SIZE_BLOCKS /dev/nbd23 514048
# dd if=/dev/nbd23 of=/dev/null bs=4096 count=1

...hangs...

This can also occur when an nbd device loses its nbd-client/server
connection.  Although we clear the queue of any outstanding I/Os after the
client/server connection fails, any additional I/Os that get queued later
will hang.

This bug may also be the problem reported in this bug report:
http://bugzilla.kernel.org/show_bug.cgi?id=12277

Testing would need to be performed to determine if the two issues are the
same.

This problem was introduced by the new request handling thread code ("NBD:
allow nbd to be used locally", 3/2008), which entered into mainline around
2.6.25.

The fix, which is fairly simple, is to restore the check for lo->sock
being NULL in do_nbd_request.  This causes I/O to an uninitialized nbd to
immediately fail with an I/O error, as it did prior to the introduction of
this bug.
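
The shape of the restored check can be sketched in user space. This is a
model under stated assumptions, not the driver code: struct nbd_model and
nbd_model_submit() are hypothetical names, and returning -EIO stands in for
failing the request with an I/O error.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct nbd_model {
    void *sock;   /* NULL until a client/server connection exists */
    int queued;   /* number of requests handed to the worker */
};

/* Fail the request at once when there is no socket, instead of queueing
 * it where nothing would ever complete it. */
int nbd_model_submit(struct nbd_model *lo)
{
    if (lo->sock == NULL)
        return -EIO;
    lo->queued++;
    return 0;
}
```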

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Reported-by: Jon Nelson <jnelson-kernel-bugzilla@jamponi.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago lockd: fix regression in lockd's handling of blocked locks
J. Bruce Fields [Wed, 4 Feb 2009 22:35:38 +0000 (17:35 -0500)]
lockd: fix regression in lockd's handling of blocked locks

commit 9d9b87c1218be78ddecbc85ec3bb91c79c1d56ab upstream.

If a client requests a blocking lock, is denied, then requests it again,
then here in nlmsvc_lock() we will call vfs_lock_file() without FL_SLEEP
set, because we've already queued a block and don't need the locks code
to do it again.

But that means vfs_lock_file() will return -EAGAIN instead of
FILE_LOCK_DENIED.  So we still need to translate that -EAGAIN return
into a nlm_lck_blocked error in this case, and put ourselves back on
lockd's block list.
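
The translation described above can be modelled as a small mapping function.
This is only a sketch: the enum and nlm_model_map() are hypothetical names,
the wait flag stands for "the client asked for a blocking lock", and all
other error codes are collapsed into a single denied result for brevity.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NLM status codes. */
enum nlm_model_status { MODEL_GRANTED, MODEL_DENIED, MODEL_BLOCKED };

/* Map the result of the lock attempt to a status for the client. */
enum nlm_model_status nlm_model_map(int error, bool wait)
{
    switch (error) {
    case 0:
        return MODEL_GRANTED;
    case -EAGAIN:
        /* A blocking request for an already queued lock goes back on
         * the block list rather than being reported as denied. */
        return wait ? MODEL_BLOCKED : MODEL_DENIED;
    default:
        /* Simplification: the real code distinguishes more cases. */
        return MODEL_DENIED;
    }
}
```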

The bug was introduced by bde74e4bc64415b1 "locks: add special return
value for asynchronous locks".

Thanks to Frank van Maarseveen for the report; his original test
case was essentially

for i in `seq 30`; do flock /nfsmount/foo sleep 10 & done

Tested-by: Frank van Maarseveen <frankvm@frankvm.com>
Reported-by: Frank van Maarseveen <frankvm@frankvm.com>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago kernel-doc: fix syscall wrapper processing
Randy Dunlap [Wed, 11 Feb 2009 21:04:33 +0000 (13:04 -0800)]
kernel-doc: fix syscall wrapper processing

commit b4870bc5ee8c7a37541a3eb1208b5c76c13a078a upstream.

Fix kernel-doc processing of SYSCALL wrappers.

The SYSCALL wrapper patches played havoc with kernel-doc for
syscalls.  Syscalls that were scanned for DocBook processing
reported warnings like this one, for sys_tgkill:

Warning(kernel/signal.c:2285): No description found for parameter 'tgkill'
Warning(kernel/signal.c:2285): No description found for parameter 'pid_t'
Warning(kernel/signal.c:2285): No description found for parameter 'int'

because the macro parameters all "look like" function parameters,
although they are not:

/**
 *  sys_tgkill - send signal to one specific thread
 *  @tgid: the thread group ID of the thread
 *  @pid: the PID of the thread
 *  @sig: signal to be sent
 *
 *  This syscall also checks the @tgid and returns -ESRCH even if the PID
 *  exists but it's not belonging to the target process anymore. This
 *  method solves the problem of threads exiting and PIDs getting reused.
 */
SYSCALL_DEFINE3(tgkill, pid_t, tgid, pid_t, pid, int, sig)
{
...

This patch special-cases the handling of SYSCALL_DEFINE* function
prototypes by expanding them to
long sys_foobar(type1 arg1, type2 arg2, ...)
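
The rewrite can be illustrated with a small string transformation. This is a
sketch only and not the kernel-doc script itself: expand_syscall_define() and
next_field() are hypothetical helpers, and emitting "void" for a wrapper
without parameters is an assumption of this example.

```c
#include <stdio.h>
#include <string.h>

/* Copy the next comma-separated field from *p into tok, trimming spaces.
 * Returns 1 if a field was read, 0 at the closing ')' or end of string. */
static int next_field(const char **p, char *tok, size_t toksz)
{
    const char *s = *p;
    size_t n = 0;

    while (*s == ' ' || *s == ',')
        s++;
    if (*s == '\0' || *s == ')')
        return 0;
    while (*s != '\0' && *s != ',' && *s != ')') {
        if (n + 1 < toksz)
            tok[n++] = *s;
        s++;
    }
    while (n > 0 && tok[n - 1] == ' ')
        n--;
    tok[n] = '\0';
    *p = s;
    return 1;
}

/* Rewrite "SYSCALL_DEFINEn(name, type1, arg1, ...)" into
 * "long sys_name(type1 arg1, ...)". Returns 0 on success, -1 otherwise. */
int expand_syscall_define(const char *proto, char *out, size_t outsz)
{
    const char *p = strchr(proto, '(');
    char name[64], type[64], arg[64];
    size_t len;
    int nargs = 0;

    if (strncmp(proto, "SYSCALL_DEFINE", 14) != 0 || p == NULL)
        return -1;
    p++;
    if (!next_field(&p, name, sizeof(name)))
        return -1;
    len = (size_t)snprintf(out, outsz, "long sys_%s(", name);
    /* The remaining fields come in (type, argument) pairs. */
    while (len < outsz && next_field(&p, type, sizeof(type)) &&
           next_field(&p, arg, sizeof(arg))) {
        len += (size_t)snprintf(out + len, outsz - len, "%s%s %s",
                                nargs ? ", " : "", type, arg);
        nargs++;
    }
    if (len < outsz)
        snprintf(out + len, outsz - len, "%s)", nargs ? "" : "void");
    return 0;
}
```

Applied to the tgkill line above, this yields a prototype whose parameters
are tgid, pid and sig, which is what the kernel-doc comment documents.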

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago iwlwifi: scan correct setting of valid rx_chains
Tomas Winkler [Mon, 6 Oct 2008 08:05:29 +0000 (16:05 +0800)]
iwlwifi: scan correct setting of valid rx_chains

commit d588be6bae40f7965f1b681a4dbc3254411787b9 upstream.

This patch sets the rx_chain bitmap correctly according to the hw configuration.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago Fix page writeback thinko, causing Berkeley DB slowdown
Nick Piggin [Thu, 12 Feb 2009 03:34:23 +0000 (04:34 +0100)]
Fix page writeback thinko, causing Berkeley DB slowdown

commit 3a4c6800f31ea8395628af5e7e490270ee5d0585 upstream.

A bug was introduced into write_cache_pages cyclic writeout by commit
31a12666d8f0c22235297e1c1575f82061480029 ("mm: write_cache_pages cyclic
fix").  The intention (and comments) is that we should cycle back and
look for more dirty pages at the beginning of the file if there is no
more work to be done.

But the !done condition was dropped from the test.  This means that any
time the page writeout loop breaks (e.g. due to nr_to_write == 0), we
will set index to 0, then goto again.  This will set done_index to
index, then find done is set, so will proceed to the end of the
function.  When updating mapping->writeback_index for cyclic writeout,
we now use done_index == 0, so we're always cycling back to 0.
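
The control flow described above can be modelled in a few lines. This is a
simplified sketch, not write_cache_pages itself: writeback_model() is a
hypothetical function, pages are plain booleans, and the check_done flag
switches between the buggy test (false) and the test with the !done
condition restored (true). The return value plays the role of the new
mapping->writeback_index.

```c
#include <stdbool.h>

long writeback_model(bool *dirty, long npages, long start,
                     long nr_to_write, bool check_done)
{
    long index = start;
    long done_index;
    bool cycled = (start == 0);
    bool done = false;

retry:
    done_index = index;
    while (!done && index < npages) {
        if (dirty[index]) {
            dirty[index] = false;       /* "write" the page */
            done_index = index + 1;     /* resume after it next time */
            if (--nr_to_write <= 0)
                done = true;            /* quota used up, stop scanning */
        }
        index++;
    }
    /* Wrap to the start of the file only if there is still work to do.
     * Without the !done test the wrap always happens, done_index is
     * reset to 0, and the loop above is skipped because done is set. */
    if (!cycled && (!check_done || !done)) {
        cycled = true;
        index = 0;
        goto retry;
    }
    return done_index;
}
```

Starting at page 3 of 8 dirty pages with a quota of 2, the buggy variant
reports 0 as the next index while the fixed variant reports 5.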

This seemed to be causing random mmap writes (slapadd and iozone) to
start writing more pages from the LRU and writeout would slow down, and
caused bugzilla entry

http://bugzilla.kernel.org/show_bug.cgi?id=12604

about Berkeley DB slowing down dramatically.

With this patch, iozone random write performance is increased nearly
5x on my system (iozone -B -r 4k -s 64k -s 512m -s 1200m on ext2).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reported-and-tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago Linux 2.6.27.17 v2.6.27.17
Greg Kroah-Hartman [Fri, 13 Feb 2009 01:23:25 +0000 (17:23 -0800)]
Linux 2.6.27.17

15 years ago Revert "ACPI: dock: Don't eval _STA on every show_docked sysfs read"
Greg Kroah-Hartman [Fri, 13 Feb 2009 01:16:10 +0000 (17:16 -0800)]
Revert "ACPI: dock: Don't eval _STA on every show_docked sysfs read"

This reverts commit 1d672ef324e78a467603ef55aa4558cac9f895ba.

Thanks to David Engel <david@istwok.net> for pointing out the problem.
I had not added a previous commit that this patch relied on, causing an
oops whenever the dock sysfs file was read.

Reported-by: David Engel <david@istwok.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago Linux 2.6.27.16 v2.6.27.16
Greg Kroah-Hartman [Thu, 12 Feb 2009 17:38:48 +0000 (09:38 -0800)]
Linux 2.6.27.16

15 years ago genirq: NULL struct irq_desc's member 'name' in dynamic_irq_cleanup()
Dean Nelson [Sat, 18 Oct 2008 23:06:56 +0000 (16:06 -0700)]
genirq: NULL struct irq_desc's member 'name' in dynamic_irq_cleanup()

commit b6f3b7803a9231eddc36d0a2a6d2d8105ef89344 upstream.

If the member 'name' of the irq_desc structure happens to point to a
character string that is resident within a kernel module, problems ensue
if that module is rmmod'd (at which time dynamic_irq_cleanup() is called)
and then later show_interrupts() is called by someone.

It is also not a good thing if the character string resided in kmalloc'd
space that has been kfree'd (after having called dynamic_irq_cleanup()).
dynamic_irq_cleanup() fails to NULL the 'name' member and
show_interrupts() references it on a few architectures (like h8300, sh and
x86).
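
The hazard is an ordinary dangling pointer, which a user-space model can
show. The names below (irq_desc_model, irq_model_cleanup, irq_model_show)
are hypothetical, and the NULL check in the reader is an assumption of this
sketch rather than a description of any architecture's show_interrupts().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the descriptor; only 'name' matters here. */
struct irq_desc_model {
    const char *name;   /* may point into module or heap memory */
};

/* Clear the pointer on cleanup so that later readers cannot follow it
 * into memory that has since been unloaded or freed. */
void irq_model_cleanup(struct irq_desc_model *desc)
{
    desc->name = NULL;
}

/* Reader that formats the name, treating NULL as "no name". */
int irq_model_show(const struct irq_desc_model *desc, char *buf, size_t bufsz)
{
    return snprintf(buf, bufsz, "%s", desc->name ? desc->name : "");
}
```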

Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago sctp: Fix another socket race during accept/peeloff
Vlad Yasevich [Thu, 22 Jan 2009 22:53:23 +0000 (14:53 -0800)]
sctp: Fix another socket race during accept/peeloff

commit ae53b5bd77719fed58086c5be60ce4f22bffe1c6 upstream.

There is a race between sctp_rcv() and sctp_accept() where we
have moved the association from the listening socket to the
accepted socket, but sctp_rcv() processing cached the old
socket and continues to use it.

The easy solution is to check for the socket mismatch once we've
grabbed the socket lock.  If we hit a mismatch, that means
that we are currently holding the lock on the listening socket,
but the association is referencing a newly accepted socket.  We need
to drop the lock on the old socket and grab the lock on the new one.
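
The lock-then-recheck pattern can be sketched as follows. This is a
single-threaded model with hypothetical names (sock_model, assoc_model,
lock_assoc_sock), and an int flag stands in for the real socket lock, so it
shows the control flow only and provides no actual synchronization.

```c
/* Hypothetical stand-ins; 'locked' models the socket lock state. */
struct sock_model  { int locked; };
struct assoc_model { struct sock_model *sk; };

static void model_lock(struct sock_model *sk)   { sk->locked = 1; }
static void model_unlock(struct sock_model *sk) { sk->locked = 0; }

/* Lock the socket that was cached earlier, then verify that the
 * association still belongs to it. If the association has migrated in
 * the meantime, switch to the socket it now references. Returns the
 * socket that ends up locked. */
struct sock_model *lock_assoc_sock(struct assoc_model *asoc,
                                   struct sock_model *cached)
{
    struct sock_model *sk = cached;

    model_lock(sk);
    if (asoc->sk != sk) {
        model_unlock(sk);
        sk = asoc->sk;
        model_lock(sk);
    }
    return sk;
}
```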

A more proper solution might be to create accepted sockets when
the new association is established, similar to TCP.  That would
eliminate the race for 1-to-1 style sockets, but it would still
exist for 1-to-many sockets where a user wished to peel off an
association.  For now, we'll live with this easy solution as
it addresses the problem.

Reported-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago USB: usb-storage: add Pentax to the bad-vendor list
Alan Stern [Wed, 4 Feb 2009 20:48:03 +0000 (15:48 -0500)]
USB: usb-storage: add Pentax to the bad-vendor list

commit 506e9469833c66ed6bb9acd902e208f7301b6adb upstream.

This patch (as1202) adds Pentax to usb-storage's list of bad vendors
whose devices always need the CAPACITY_HEURISTICS flag.  This is in
addition to the existing entries: Nokia, Nikon, and Motorola.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Virgo Pärna <virgo.parna@mail.ee>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago USB: two more usb ids for ti_usb_3410_5052
Oliver Neukum [Wed, 4 Feb 2009 15:38:33 +0000 (16:38 +0100)]
USB: two more usb ids for ti_usb_3410_5052

commit 97dcf0416e390fc5c997d4ea60e6f975c7b7a1c3 upstream.

This patch adds device IDs and balances the counts to make the
hot ID addition mechanism work.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Cc: Chris Adams <cmadams@hiwaay.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago USB: option: New mobile broadband modems to be supported
Dirk De Schepper [Fri, 6 Feb 2009 20:48:34 +0000 (20:48 +0000)]
USB: option: New mobile broadband modems to be supported

commit c200b9c9e8ec93cdd262cfa1699ad92e883d4876 upstream.

- New Novatel and Dell mobile broadband modem products added
- Dell PID variables used instead of numerical PIDs for known
  products

Signed-off-by: Dirk De Schepper <ddeschepper@nvtl.com>
Signed-off-by: Matthias Urlichs <matthias@urlichs.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago USB: new id for ti_usb_3410_5052 driver
Oliver Neukum [Mon, 12 Jan 2009 12:31:16 +0000 (13:31 +0100)]
USB: new id for ti_usb_3410_5052 driver

commit 1a1fab513734b3a4fca1bee8229e5ff7e1cb873c upstream.

This adds a new device id.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago Revert USB: option: add Pantech cards
Greg Kroah-Hartman [Wed, 4 Feb 2009 00:02:21 +0000 (16:02 -0800)]
Revert USB: option: add Pantech cards

commit 6b40c0057a7935bcf63a38a924094c7e61d4731f upstream.

Revert 8b6346ec899713a90890c9e832f7eff91ea73504 as these devices really
work just fine with the cdc-acm driver, as they follow the spec
properly.

Thanks to Chuck Ebbert for pointing out the problem here.

Cc: Chuck Ebbert <cebbert@redhat.com>
Cc: Dan Williams <dcbw@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago ACPI: video: Fix reversed brightness behavior on ThinkPad SL series
Zhang Rui [Thu, 11 Dec 2008 21:24:52 +0000 (16:24 -0500)]
ACPI: video: Fix reversed brightness behavior on ThinkPad SL series

commit 935e5f290ec1eb0f1c15004421f5fd3154380fd5 upstream.

Section B.6.2 of the ACPI 3.0b specification, which defines the _BCL
method, doesn't require the returned brightness levels to be sorted.
At least the ThinkPad SL300 (and probably all IdeaPads) returns the
array reversed (i.e. the brightest levels have the lowest indexes), which
causes brightness management to behave in a completely reversed
manner on these machines (brightness increases when the laptop is
idle, while the display dims when used).

Sorting the array by brightness level values after reading the list
fixes the issue.
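
A minimal sketch of such a sort, using the C library's qsort() in place of
the kernel's own sort helper. The function names are hypothetical, and both
the ascending order and the plain int array are assumptions of this example;
the layout of the real level list is not shown in this log.

```c
#include <stdlib.h>

/* Three-way comparison of two brightness levels for qsort(). */
static int cmp_level(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Sort the levels so that a higher index always means a brighter
 * setting, whatever order the firmware returned them in. */
void sort_brightness_levels(int *levels, size_t count)
{
    qsort(levels, count, sizeof(levels[0]), cmp_level);
}
```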

http://bugzilla.kernel.org/show_bug.cgi?id=12037

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago ACPI: don't load acpi_cpufreq if acpi=off
Yinghai Lu [Thu, 25 Sep 2008 02:04:31 +0000 (19:04 -0700)]
ACPI: don't load acpi_cpufreq if acpi=off

commit ee297533279a802eac8b1cbea8e65b24b36a1aac upstream.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago ACPICA: Add function to dereference returned reference objects
Lin Ming [Mon, 4 Aug 2008 05:22:10 +0000 (13:22 +0800)]
ACPICA: Add function to dereference returned reference objects

commit bbc241340681557a16982f4d1840f00963bc05b4 upstream.

Examines the return object from a call to acpi_evaluate_object.
Any Index or RefOf references are automatically dereferenced in
an attempt to return something useful (these reference types
cannot be converted into an external ACPI_OBJECT.)
Lin Ming, Bob Moore.

http://bugzilla.kernel.org/show_bug.cgi?id=11105

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years ago ACPICA: Copy dynamically loaded tables to local buffer
Dennis Noordsij [Fri, 15 Aug 2008 01:37:58 +0000 (09:37 +0800)]
ACPICA: Copy dynamically loaded tables to local buffer

commit f0e0da8a6cca44396c7a711e308d58084e881617 upstream.

Previously, dynamically loaded tables were simply mapped, but on some machines
this memory is corrupted after suspend. Now copy the table to a local buffer.
For the OpRegion case, add checksum verification and use the table length from
the table header, not the region length. For the Buffer case, use the table length also.

http://bugzilla.kernel.org/show_bug.cgi?id=10734

Signed-off-by: Dennis Noordsij <dennis.noordsij@helsinki.fi>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>