Use ARRAY_SIZE instead of sizeof to get proper max for label length.
Since this is just a read out of bounds it's not that bad, but the
problem becomes user-visible eg if one tries to use DEBUG_PAGEALLOC and
DEBUG_RODATA, at least with some enhancements from Hiroshi. Of course
the destination array can contain garbage when we read beyond the end of
source array so that would be another user-visible problem.
Signed-off-by: Antti P Miettinen <amiettinen@nvidia.com> Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com> Tested-by: Hiroshi Doyu <hdoyu@nvidia.com> Cc: Will Drewry <wad@chromium.org> Cc: Matt Fleming <matt.fleming@intel.com> Acked-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johan Hovold [Thu, 21 Nov 2013 22:32:04 +0000 (14:32 -0800)]
ARM: drivers/rtc/rtc-at91rm9200.c: disable interrupts at shutdown
Make sure RTC-interrupts are disabled at shutdown.
As the RTC is generally powered by backup power (VDDBU), its interrupts
are not disabled on wake-up, user, watchdog or software reset. This
could cause troubles on other systems (e.g. older kernels) if an
interrupt occurs before a handler has been installed at next boot.
Let us be well-behaved and disable them on clean shutdowns at least (as
do the RTT-based rtc-at91sam9 driver).
Signed-off-by: Johan Hovold <jhovold@gmail.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Andrew Victor <linux@maxim.org.za> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Thu, 21 Nov 2013 22:32:02 +0000 (14:32 -0800)]
mm: hugetlbfs: fix hugetlbfs optimization
Commit 7cb2ef56e6a8 ("mm: fix aio performance regression for database
caused by THP") can cause dereference of a dangling pointer if
split_huge_page runs during PageHuge() if there are updates to the
tail_page->private field.
Also it is repeating compound_head twice for hugetlbfs and it is running
compound_head+compound_trans_head for THP when a single one is needed in
both cases.
The new code within the PageSlab() check doesn't need to verify that the
THP page size is never bigger than the smallest hugetlbfs page size, to
avoid memory corruption.
A longstanding theoretical race condition was found while fixing the
above (see the change right after the skip_unlock label, that is
relevant for the compound_lock path too).
By re-establishing the _mapcount tail refcounting for all compound
pages, this also fixes the below problem:
The oops is caused because shm_destroy() calls fput() after dropping the
ipc_lock. fput() clears the file's f_inode, f_path.dentry, and
f_path.mnt, which causes various NULL pointer references in task 2. I
reliably see the oops in task 2 if with shmlock, shmu
This patch fixes the races by:
1) set shm_file=NULL in shm_destroy() while holding ipc_object_lock().
2) modify at risk operations to check shm_file while holding
ipc_object_lock().
Example workloads, which each trigger oops...
Workload 1:
while true; do
id=$(shmget 1 4096)
shm_rmid $id &
shmlock $id &
wait
done
The oops stack shows accessing NULL f_inode due to racing fput:
_raw_spin_lock
shmem_lock
SyS_shmctl
Workload 2:
while true; do
id=$(shmget 1 4096)
shmat $id 4096 &
shm_rmid $id &
wait
done
The oops stack is similar to workload 1 due to NULL f_inode:
touch_atime
shmem_mmap
shm_mmap
mmap_region
do_mmap_pgoff
do_shmat
SyS_shmat
Workload 3:
while true; do
id=$(shmget 1 4096)
shmlock $id
shm_rmid $id &
shmunlock $id &
wait
done
The oops stack shows second fput tripping on an NULL f_inode. The
first fput() completed via from shm_destroy(), but a racing thread did
a get_file() and queued this fput():
locks_remove_flock
__fput
____fput
task_work_run
do_notify_resume
int_signal
Fixes: c2c737a0461e ("ipc,shm: shorten critical region for shmat") Fixes: 2caacaa82a51 ("ipc,shm: shorten critical region for shmctl") Signed-off-by: Greg Thelen <gthelen@google.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> # 3.10.17+ 3.11.6+ Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
and a non-hugetlbfs page has no page_hstate(). This works 99% of the
time because page_hstate() determines the hstate from the page order
alone. Since the page order of a THP page matches the default hugetlbfs
page order, it works.
But, if you change the default huge page size on the boot command-line
(say default_hugepagesz=1G), then we might not even *have* a 2MB hstate
so page_hstate() returns null and copy_huge_page() oopses pretty fast
since copy_huge_page() dereferences the hstate:
Mel noticed that the migration code is really the only user of these
functions. This moves all the copy code over to migrate.c and makes
copy_huge_page() work for THP by checking for it explicitly.
I believe the bug was introduced in commit b32967ff101a ("mm: numa: Add
THP migration for the NUMA working set scanning fault case")
[akpm@linux-foundation.org: fix coding-style and comment text, per Naoya Horiguchi] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 21 Nov 2013 22:31:57 +0000 (14:31 -0800)]
checkpatch: fix "Use of uninitialized value" warnings
checkpatch is currently confused about some complex macros and references
undefined variables $stat and $cond.
Make sure these are defined before using them.
Signed-off-by: Joe Perches <joe@perches.com> Reported-by: Gerhard Sittig <gsi@denx.de> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Junxiao Bi [Thu, 21 Nov 2013 22:31:56 +0000 (14:31 -0800)]
configfs: fix race between dentry put and lookup
A race window in configfs, it starts from one dentry is UNHASHED and end
before configfs_d_iput is called. In this window, if a lookup happen,
since the original dentry was UNHASHED, so a new dentry will be
allocated, and then in configfs_attach_attr(), sd->s_dentry will be
updated to the new dentry. Then in configfs_d_iput(),
BUG_ON(sd->s_dentry != dentry) will be triggered and system panic.
sys_open: sys_close:
... fput
dput
dentry_kill
__d_drop <--- dentry unhashed here,
but sd->dentry still point
to this dentry.
lookup_real
configfs_lookup
configfs_attach_attr---> update sd->s_dentry
to new allocated dentry here.
To fix it, change configfs_d_iput to not update sd->s_dentry if
sd->s_count > 2, that means there are another dentry is using the sd
beside the one that is going to be put. Use configfs_dirent_lock in
configfs_attach_attr to sync with configfs_d_iput.
With the following steps, you can reproduce the bug.
1. enable ocfs2, this will mount configfs at /sys/kernel/config and
fill configure in it.
2. run the following script.
while [ 1 ]; do cat /sys/kernel/config/cluster/$your_cluster_name/idle_timeout_ms > /dev/null; done &
while [ 1 ]; do cat /sys/kernel/config/cluster/$your_cluster_name/idle_timeout_ms > /dev/null; done &
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Herbert Xu [Thu, 21 Nov 2013 19:10:04 +0000 (11:10 -0800)]
gso: handle new frag_list of frags GRO packets
Recently GRO started generating packets with frag_lists of frags.
This was not handled by GSO, thus leading to a crash.
Thankfully these packets are of a regular form and are easy to
handle. This patch handles them in two ways. For completely
non-linear frag_list entries, we simply continue to iterate over
the frag_list frags once we exhaust the normal frags. For frag_list
entries with linear parts, we call pskb_trim on the first part
of the frag_list skb, and then process the rest of the frags in
the usual way.
This patch also kills a chunk of dead frag_list code that has
obviously never ever been run since it ends up generating a bogus
GSO-segmented packet with a frag_list entry.
Future work is planned to split super big packets into TSO
ones.
Fixes: 8a29111c7ca6 ("net: gro: allow to build full sized skb") Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be> Reported-by: Jerry Chu <hkchu@google.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Tested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In the case that atomic_open calls finish_no_open() with
the dentry that was supplied to gfs2_atomic_open() an
extra reference count is required. This patch fixes that
issue preventing a bug trap triggering at umount time.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Johannes Berg [Thu, 21 Nov 2013 17:20:28 +0000 (18:20 +0100)]
genetlink: fix genl_set_err() group ID
Fix another really stupid bug - I introduced genl_set_err()
precisely to be able to adjust the group and reject invalid
ones, but then forgot to do so.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Thu, 21 Nov 2013 17:17:04 +0000 (18:17 +0100)]
genetlink: fix genlmsg_multicast() bug
Unfortunately, I introduced a tremendously stupid bug into
genlmsg_multicast() when doing all those multicast group
changes: it adjusts the group number, but then passes it
to genlmsg_multicast_netns() which does that again.
Somehow, my tests failed to catch this, so add a warning
into genlmsg_multicast_netns() and remove the offending
group ID adjustment.
Also add a warning to the similar code in other functions
so people who misuse them are more loudly warned.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 21 Nov 2013 15:50:58 +0000 (16:50 +0100)]
packet: fix use after free race in send path when dev is released
Salam reported a use after free bug in PF_PACKET that occurs when
we're sending out frames on a socket bound device and suddenly the
net device is being unregistered. It appears that commit 827d9780
introduced a possible race condition between {t,}packet_snd() and
packet_notifier(). In the case of a bound socket, packet_notifier()
can drop the last reference to the net_device and {t,}packet_snd()
might end up suddenly sending a packet over a freed net_device.
To avoid reverting 827d9780 and thus introducing a performance
regression compared to the current state of things, we decided to
hold a cached RCU protected pointer to the net device and maintain
it on write side via bind spin_lock protected register_prot_hook()
and __unregister_prot_hook() calls.
In {t,}packet_snd() path, we access this pointer under rcu_read_lock
through packet_cached_dev_get() that holds reference to the device
to prevent it from being freed through packet_notifier() while
we're in send path. This is okay to do as dev_put()/dev_hold() are
per-cpu counters, so this should not be a performance issue. Also,
the code simplifies a bit as we don't need need_rls_dev anymore.
Fixes: 827d978037d7 ("af-packet: Use existing netdev reference for bound sockets.") Reported-by: Salam Noureddine <noureddine@aristanetworks.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com> Cc: Ben Greear <greearb@candelatech.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Vrabel [Thu, 21 Nov 2013 15:26:09 +0000 (15:26 +0000)]
xen-netback: stop the VIF thread before unbinding IRQs
If the VIF thread is still running after unbinding the Tx and Rx IRQs
in xenvif_disconnect(), the thread may attempt to raise an event which
will BUG (as the irq is unbound).
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Shehata [Thu, 21 Nov 2013 14:24:50 +0000 (22:24 +0800)]
staging/lustre/lnet: Fix assert on empty group in selftest module
The core of the issue is that the selftest module doesn't sanitize its
own API, but it depends on lst utility to do such checks. As a result
this issue manifests itself in this particular LU through an assert
on an empty group. If the NID is misspelled then an empty group is
added. An error output is provided, but if that's never checked in a
batch script, as is the case with this issue, then the script will try
to add an empty group to a test to run in a batch, and that will cause
an assert
The fix is two fold. Ensure that lst utility checks that a group is
added with at least one node. If not the group is subsequently
deleted. And the add_test command would fail, since the group no
longer exists.
The second fix is to ensure that the kernel module itself sanitizes
its own API in this particular case, so that if a different utility is
used other than lst to communicate with the selftest kernel module
then this error would be caught. This fix looks up the batch and the
groups, src and dst, in the ioctl handle and sanitizes that input at
this point. If the group looked up either doesn't exist or doesn't
have at least one ACTIVE node, then the command fails.
NOTE:there are many other cases in the code where the selftest kernel
module doesn't check for sanity of the input, but depends totally on
the lst module to do such checks. Particularly around length of
strings passed in. Thus it is possible to crash the selftest module
if someone tries to create another userspace app to communicate with
the selftest kernel module without ensuring sanity of the params sent
to the kernel module. In effect, it's always assumed that lst is the
front end for selftest and no other front end is to be used.
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3093
Lustre-change: http://review.whamcloud.com/6092 Signed-off-by: Amir Shehata <amir.shehata@intel.com> Reviewed-by: Isaac Huang <he.huang@intel.com> Reviewed-by: Liang Zhen <liang.zhen@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
[coding style fix of the original patch is splitted into a new one -- PengTao] Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andreas Dilger [Thu, 21 Nov 2013 14:24:49 +0000 (22:24 +0800)]
staging/lustre/ldlm: fix resource/fid check, use DLDLMRES
In ll_md_blocking_ast() the FID/resource comparison is incorrectly
checking for fid_ver() stored in res_id.name[2] instead of name[1]
changed since http://review.whamcloud.com/2271 (commit 4f91d5161d00)
landed. This does not impact current clients, since name[2] and
fid_ver() are always zero, but it could cause problems in the future.
In ldlm_cli_enqueue_fini() use ldlm_res_eq() instead of comparing
each of the resource fields separately.
Use DLDLMRES/PLDLMRES when printing resource names everywhere.
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2901
Lustre-change: http://review.whamcloud.com/6592 Signed-off-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Johann Lombardi <johann.lombardi@intel.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
JC Lafoucriere [Thu, 21 Nov 2013 14:24:48 +0000 (22:24 +0800)]
staging/lustre/llite: Access to released file triggers a restore
When a client accesses data in a released file,
or truncate it, client must trig a restore request.
During this restore, the client must not glimpse and
must use size from MDT. To bring the "restore is running"
information on the client we add a new t_state bit field
to mdt_info which will be used to carry transient file state.
To memorise this information in the inode we add a new flag
LLIF_FILE_RESTORING.
David Herrmann [Thu, 21 Nov 2013 10:50:50 +0000 (20:50 +1000)]
drm/sysfs: fix hotplug regression since lifetime changes
airlied:
The lifetime changes introduced in 5bdebb183c9702a8c57a01dff09337be3de337a6
tried to use device_create, however that led to the regression where dev->type
wasn't getting set correctly. First attempt at fixing it would have led to
a race, so this undoes the device_createa work and does it all manually
making sure the dev->type is setup before we register the device.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Commit [e66cf1610: GFS2: Use lockref for glocks] replaced call:
atomic_read(&gi->gl->gl_ref) == 0
with:
__lockref_is_dead(&gl->gl_lockref)
therefore changing how gl is accessed, from gi->gl to plan gl.
However, gl can be a NULL pointer, and so gi->gl needs to be
used instead (which is guaranteed not to be NULL because fo
the while loop checking that condition).
Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Using the address of 'empty_zero_page' as source address in order to
clear a page is wrong. On some architectures empty_zero_page is only the
pointer to the struct page of the empty_zero_page. Therefore the clear
page operation would copy the contents of a couple of struct pages instead
of clearing a page. For kvm only arm/arm64 are affected by this bug.
To fix this use the ZERO_PAGE macro instead which will return the struct
page address of the empty_zero_page on all architectures.
Inki Dae [Thu, 21 Nov 2013 03:09:51 +0000 (12:09 +0900)]
drm/exynos: g2d: fix memory leak to userptr
This patch releases a vma object when cleaning up userptr resources.
A new vma object was allocated and copied when getting userptr pages
so the new vma object should be freed properly if the userptr pages
aren't used anymore.
Signed-off-by: Inki Dae <inki.dae@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 21 Nov 2013 08:46:56 +0000 (18:46 +1000)]
Merge branch 'ttm-fixes-3.13' of git://people.freedesktop.org/~thomash/linux into drm-fixes
The set_need_resched() removal fix and yet another fix in
ttm_bo_move_memcpy().
* 'ttm-fixes-3.13' of git://people.freedesktop.org/~thomash/linux:
drm/ttm: Remove set_need_resched from the ttm fault handler
drm/ttm: Don't move non-existing data
Dave Airlie [Thu, 21 Nov 2013 08:46:26 +0000 (18:46 +1000)]
Merge branch 'vmwgfx-fixes-3.13' of git://people.freedesktop.org/~thomash/linux into drm-fixes
Below is a fix for a false lockep warning,
and the vmwgfx prime implementation.
* 'vmwgfx-fixes-3.13' of git://people.freedesktop.org/~thomash/linux:
drm/vmwgfx: Make vmwgfx dma buffers prime aware
drm/vmwgfx: Make surfaces prime-aware
drm/vmwgfx: Hook up the prime ioctls
drm/ttm: Add a minimal prime implementation for ttm base objects
drm/vmwgfx: Fix false lockdep warning
drm/ttm: Allow execbuf util reserves without ticket
Dave Airlie [Thu, 21 Nov 2013 08:45:51 +0000 (18:45 +1000)]
Merge tag 'drm-intel-fixes-2013-11-20' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
Just a small pile of fixes for bugs and a few regressions. I'm still
trying to track down a driver load hang on my g33 (which infuriatingly
doesn't happen when loading the module manually after boot), somehow
bisecting loves to go astray on this one :( And there's a (harmless)
locking WARN in the suspend code due to one of Jesse's vlv backlight
rework patches. Otherwise nothing outstanding afaik.
* tag 'drm-intel-fixes-2013-11-20' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: Fix gen3 self-refresh watermarks
drm/i915: Replicate BIOS eDP bpp clamping hack for hsw
drm/i915: Do not enable package C8 on unsupported hardware
drm/i915: Hold pc8 lock around toggling pc8.gpu_idle
drm/i915: encoder->get_config is no longer optional
drm/i915/tv: add ->get_config callback
drm/i915: restore the early forcewake cleanup
Partially revert "drm/i915: tune the RC6 threshold for stability"
drm/i915: flush cursors harder
i915: Use 120MHz LVDS SSC clock for gen5/gen6/gen7
x86/early quirk: use gen6 stolen detection for VLV
drm/i915/dp: set sink to power down mode on dp disable
Dave Airlie [Thu, 21 Nov 2013 08:42:19 +0000 (18:42 +1000)]
Merge branch 'drm-next-3.13' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
More fixes for radeon. This adds new queries for tiling on CIK, and
fixes a crash in handling acpi atif backlight events on CIK.
Some fixes for radeon for 3.13. Mostly CI stability fixes. I think
I've tracked down the stability problems with dpm on Trinity/Richland,
so I'm going to enable that by default now.
* 'drm-next-3.13' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: hook up backlight functions for CI and KV family.
drm/radeon/cik: Add macrotile mode array query
drm/radeon/cik: Return backend map information to userspace
drm/radeon: enable DPM by default in TN asics
drm/radeon: adjust TN dpm parameters for stability (v2)
drm/radeon: use a single doorbell for cik kms compute
drm/radeon/vm: don't attempt to update ptes if ib allocation fails
drm/radeon: disable CIK CP semaphores for now
drm/radeon: allow semaphore emission to fail
drm/radeon: add semaphore trace point
radeon: workaround pinning failure on low ram gpu
radeon/i2c: do not count reg index in number of i2c byte we are writing.
drm/radeon: cypress_dpm: Fix unused variable warning when CONFIG_ACPI=n
drm: radeon: ni_dpm: Fix unused variable warning when CONFIG_ACPI=n
Eric Seppanen [Wed, 20 Nov 2013 22:19:52 +0000 (14:19 -0800)]
iscsi-target: chap auth shouldn't match username with trailing garbage
In iSCSI negotiations with initiator CHAP enabled, usernames with
trailing garbage are permitted, because the string comparison only
checks the strlen of the configured username.
e.g. "usernameXXXXX" will be permitted to match "username".
Just check one more byte so the trailing null char is also matched.
Signed-off-by: Eric Seppanen <eric@purestorage.com> Cc: <stable@vger.kernel.org> #3.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Eric Seppanen [Wed, 20 Nov 2013 22:19:51 +0000 (14:19 -0800)]
iscsi-target: fix extract_param to handle buffer length corner case
extract_param() is called with max_length set to the total size of the
output buffer. It's not safe to allow a parameter length equal to the
buffer size as the terminating null would be written one byte past the
end of the output buffer.
Signed-off-by: Eric Seppanen <eric@purestorage.com> Cc: <stable@vger.kernel.org> #3.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Madalin Bucur [Wed, 20 Nov 2013 22:38:19 +0000 (16:38 -0600)]
net/phy: Add the autocross feature for forced links on VSC82x4
Add auto-MDI/MDI-X capability for forced (autonegotiation disabled)
10/100 Mbps speeds on Vitesse VSC82x4 PHYs. Exported previously static
function genphy_setup_forced() required by the new config_aneg handler
in the Vitesse PHY module.
Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com> Signed-off-by: Shruti Kanetkar <Shruti@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sandeep Singh [Wed, 20 Nov 2013 22:38:18 +0000 (16:38 -0600)]
net/phy: Add VSC8662 support
Vitesse VSC8662 is Dual Port 10/100/1000Base-T Phy
Its register set and features are similar to other Vitesse Phys.
Signed-off-by: Sandeep Singh <Sandeep@freescale.com> Signed-off-by: Andy Fleming <afleming@gmail.com> Signed-off-by: Shruti Kanetkar <Shruti@Freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
shaohui xie [Wed, 20 Nov 2013 22:38:17 +0000 (16:38 -0600)]
net/phy: Add VSC8574 support
The VSC8574 is a quad-port Gigabit Ethernet transceiver with four SerDes
interfaces for quad-port dual media capability.
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: Andy Fleming <afleming@gmail.com> Signed-off-by: Shruti Kanetkar <Shruti@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Fleming [Wed, 20 Nov 2013 22:38:16 +0000 (16:38 -0600)]
net/phy: Add VSC8234 support
Vitesse VSC8234 is quad port 10/100/1000BASE-T PHY
with SGMII and SERDES MAC interfaces.
Signed-off-by: Andy Fleming <afleming@gmail.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Shruti Kanetkar <Shruti@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage)
In that case it is probable that kernel code overwrote part of the
stack. So we should bail out loudly here.
The BUG_ON may be removed in future if we are sure all protocols are
conformant.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
net: rework recvmsg handler msg_name and msg_namelen logic
This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.
This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.
Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.
Also document these changes in include/linux/net.h as suggested by David
Miller.
Changes since RFC:
Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.
With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
msg->msg_name = NULL
".
This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.
Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.
Cc: David Miller <davem@davemloft.net> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Akinobu Mita [Mon, 18 Nov 2013 13:13:18 +0000 (22:13 +0900)]
btrfs: fix bio_size_ok() for max_sectors > 0xffff
The data type of max_sectors in queue settings is unsigned int. But
this value is stored to the local variable whose type is unsigned short
in bio_size_ok(). This can cause unexpected result when max_sectors >
0xffff.
Cc: Chris Mason <chris.mason@fusionio.com> Cc: linux-btrfs@vger.kernel.org Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Steven Rostedt [Fri, 15 Nov 2013 03:57:29 +0000 (22:57 -0500)]
btrfs: Use trace condition for get_extent tracepoint
Doing an if statement to test some condition to know if we should
trigger a tracepoint is pointless when tracing is disabled. This just
adds overhead and wastes a branch prediction. This is why the
TRACE_EVENT_CONDITION() was created. It places the check inside the jump
label so that the branch does not happen unless tracing is enabled.
That is, instead of doing:
if (em)
trace_btrfs_get_extent(root, em);
Which is basically this:
if (em)
if (static_key(trace_btrfs_get_extent)) {
Using a TRACE_EVENT_CONDITION() we can just do:
trace_btrfs_get_extent(root, em);
And the condition trace event will do:
if (static_key(trace_btrfs_get_extent)) {
if (em) {
...
The static key is a non conditional jump (or nop) that is faster than
having to check if em is NULL or not.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Miao Xie [Thu, 14 Nov 2013 09:33:21 +0000 (17:33 +0800)]
Btrfs: fix list delete warning when removing ordered root from the list
Commit b02441999efcc6152b87cd58e7970bb7843f76cf "Btrfs: don't wait for
the completion of all the ordered extents" introduced a bug that broke
the ordered root list:
WARNING: CPU: 1 PID: 7119 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98()
It is because we forgot to return the roots in the splice list to the
ordered list of the fs. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Stefan Behrens [Wed, 13 Nov 2013 16:19:08 +0000 (17:19 +0100)]
Btrfs: print bytenr instead of page pointer in check-int
The page pointer information was useless. The bytenr is what you
want when you search for submitted write bios.
Additionally, a new bit in the print mask is added that allows
to selectively enable the check-int submit_bio verbose mode. Before,
the global verbose mode had to be enabled leading to many million
useless lines in the kernel log.
And a comment is added that explains that LOG_BUF_SHIFT needs to
be set to a really high value.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Wang Shilong [Tue, 12 Nov 2013 11:32:04 +0000 (19:32 +0800)]
Btrfs: remove dead codes from ctree.h
These two functions are only stated but undefined.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Btrfs: don't wait for ordered data outside desired range
In btrfs_wait_ordered_range(), if we found an extent to the left
of the start of our desired wait range and the last byte of that
extent is 1 less than the desired range's start, we would would
wait for the IO completion of that extent unnecessarily.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
It's because that we don't do the right thing when checking if it's ok to
tell lockdep that we're trying to release the rwsem.
If the trans handle's type is TRANS_ATTACH, we won't acquire the freeze rwsem, but
as TRANS_ATTACH fits the check (trans < TRANS_JOIN_NOLOCK), we'll release the freeze
rwsem, which makes lockdep complains a lot.
Reported-by: Ma Jianpeng <majianpeng@gmail.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Liu Bo [Tue, 5 Nov 2013 03:45:53 +0000 (11:45 +0800)]
Btrfs: avoid heavy operations in btrfs_commit_super
The 'git blame' history shows that, the old transaction commit code has to do
twice to ensure roots are updated and we have to flush metadata and super block
manually, however, right now all of these can be handled well inside
the transaction commit code without extra efforts.
And the error handling part remains same with the current code, -- 'return to
caller once we get error'.
This saves us a transaction commit and a flush of super block, which are both
heavy operations according to ftrace output analysis.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Ilya Dryomov [Sun, 3 Nov 2013 17:06:40 +0000 (19:06 +0200)]
Btrfs: fix __btrfs_start_workers retval
__btrfs_start_workers returns 0 in case it raced with
btrfs_stop_workers and lost the race. This is wrong because worker in
this case is not allowed to start and is in fact destroyed. Return
-EINVAL instead.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Ilya Dryomov [Sun, 3 Nov 2013 17:06:38 +0000 (19:06 +0200)]
Btrfs: do not inc uncorrectable_errors counter on ro scrubs
Currently if we discover an error when scrubbing in ro mode we a)
blindly increment the uncorrectable_errors counter, and b) spam the
dmesg with the 'unable to fixup (regular) error at ...' message, even
though a) we haven't tried to determine if the error is correctable or
not, and b) we haven't tried to fixup anything. Fix this.
Cc: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Wed, 13 Nov 2013 01:54:09 +0000 (20:54 -0500)]
Btrfs: only drop modified extents if we logged the whole inode
If we fsync, seek and write, rename and then fsync again we will lose the
modified hole extent because the rename will drop all of the modified extents
since we didn't do the fast search. We need to only drop the modified extents
if we didn't do the fast search and we were logging the entire inode as we don't
need them anymore, otherwise this is being premature. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Tue, 12 Nov 2013 21:25:58 +0000 (16:25 -0500)]
Btrfs: make sure to copy everything if we rename
If we rename a file that is already in the log and we fsync again we will lose
the new name. This is because we just log the inode update and not the new ref.
To fix this we just need to check if we are logging the new name of the inode
and copy all the metadata instead of just updating the inode itself. With this
patch my testcase now passes. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Michael Neuling [Wed, 20 Nov 2013 05:18:54 +0000 (16:18 +1100)]
powerpc/signals: Mark VSX not saved with small contexts
The VSX MSR bit in the user context indicates if the context contains VSX
state. Currently we set this when the process has touched VSX at any stage.
Unfortunately, if the user has not provided enough space to save the VSX state,
we can't save it but we currently still set the MSR VSX bit.
This patch changes this to clear the MSR VSX bit when the user doesn't provide
enough space. This indicates that there is no valid VSX state in the user
context.
This is needed to support get/set/make/swapcontext for applications that use
VSX but only provide a small context. For example, getcontext in glibc
provides a smaller context since the VSX registers don't need to be saved over
the glibc function call. But since the program calling getcontext may have
used VSX, the kernel currently says the VSX state is valid when it's not. If
the returned context is then used in setcontext (ie. a small context without
VSX but with MSR VSX set), the kernel will refuse the context. This situation
has been reported by the glibc community.
Based on patch from Carlos O'Donell.
Tested-by: Haren Myneni <haren@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Wed, 20 Nov 2013 00:05:02 +0000 (11:05 +1100)]
powerpc/pseries: Fix SMP=n build of rng.c
In commit a489043 "Implement arch_get_random_long() based on H_RANDOM" I
broke the SMP=n build. We were getting plpar_wrappers.h via spinlock.h
which breaks when SMP=n.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Wed, 20 Nov 2013 00:05:01 +0000 (11:05 +1100)]
powerpc: Make cpu_to_chip_id() available when SMP=n
Up until now we have only used cpu_to_chip_id() in the topology code,
which is only used on SMP builds. However my recent commit a4da0d5
"Implement arch_get_random_long/int() for powernv" added a usage when
SMP=n, breaking the build.
Move cpu_to_chip_id() into prom.c so it is available for SMP=n builds.
We would move the extern to prom.h, but that breaks the include in
topology.h. Instead we leave it in smp.h, but move it out of the
CONFIG_SMP #ifdef. We also need to include asm/smp.h in rng.c, because
the linux version skips asm/smp.h on UP. What a mess.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Li Zhong [Tue, 19 Nov 2013 08:11:37 +0000 (16:11 +0800)]
powerpc/vio: Fix a dma_mask issue of vio
I encountered following issue:
[ 0.283035] ibmvscsi 30000015: couldn't initialize event pool
[ 5.688822] ibmvscsi: probe of 30000015 failed with error -1
which prevents the storage from being recognized, and the machine from
booting.
After some digging, it seems that it is caused by commit 4886c399da
as dma_mask pointer in viodev->dev is not set, so in
dma_set_mask_and_coherent(), dma_set_coherent_mask() is not called
because dma_set_mask(), which is dma_set_mask_pSeriesLP() returned EIO.
While before the commit, dma_set_coherent_mask() is always called.
I tried to replace dma_set_mask_and_coherent() with
dma_coerce_mask_and_coherent(), and the machine could boot again.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/platforms/wsp/wsp.c: In function ‘wsp_probe_devices’:
arch/powerpc/platforms/wsp/wsp.c:76:3: error: implicit declaration of function ‘of_address_to_resource’ [-Werror=implicit-function-declaration]
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 18 Nov 2013 03:55:28 +0000 (14:55 +1100)]
powerpc: ppc64 address space capped at 32TB, mmap randomisation disabled
Commit fba2369e6ceb (mm: use vm_unmapped_area() on powerpc architecture)
has a bug in slice_scan_available() where we compare an unsigned long
(high_slices) against a shifted int. As a result, comparisons against
the top 32 bits of high_slices (representing the top 32TB) always
returns 0 and the top of our mmap region is clamped at 32TB
This also breaks mmap randomisation since the randomised address is
always up near the top of the address space and it gets clamped down
to 32TB.
Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Michel Lespinasse <walken@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 18 Nov 2013 02:19:17 +0000 (13:19 +1100)]
powerpc: Only print PACATMSCRATCH in oops when TM is active
If TM is not active there is no need to print PACATMSCRATCH
so we can save ourselves a line.
Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Heiko Carstens [Thu, 14 Nov 2013 04:01:43 +0000 (15:01 +1100)]
powerpc: Fix __get_user_pages_fast() irq handling
__get_user_pages_fast() may be called with interrupts disabled (see e.g.
get_futex_key() in kernel/futex.c) and therefore should use local_irq_save()
and local_irq_restore() instead of local_irq_disable()/enable().
Gavin Shan [Tue, 12 Nov 2013 06:49:21 +0000 (14:49 +0800)]
powerpc/eeh: Enable PCI_COMMAND_MASTER for PCI bridges
On PHB3, we will fail to fetch IODA tables without PCI_COMMAND_MASTER
on PCI bridges. According to one experiment I had, the MSIx interrupts
didn't raise from the adapter without the bit applied to all upstream
PCI bridges including root port of the adapter. The patch forces to
have that bit enabled accordingly.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Wed, 20 Nov 2013 23:13:47 +0000 (15:13 -0800)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc LE updates from Ben Herrenschmidt:
"With my previous pull request I mentioned some remaining Little Endian
patches, notably support for our new ABI, which I was sitting on
making sure it was all finalized.
The toolchain folks confirmed it now, the new ABI is stable and merged
with gcc, so we are all good. Oh and we actually missed the actual
Kconfig switch for LE so here it is, along with a couple more bug
fixes.
I have more fixes but not related to LE so I'll send them as a
separate pull request tomorrow, let's get this one out of the way.
Note that this supports running user space binaries using the new ABI,
but the kernel itself still needs to be built with the old one. We'll
bring fixes for that after -rc1.
Here's Anton log that goes with this series:
This patch series adds support for the new ABI, LPAR support for
H_SET_MODE and finally adds a kconfig option and defconfig.
ABIv2 support was recently committed to binutils and gcc, and should
be merged into glibc soon. There are a number of very nice
improvements including the removal of function descriptors. Rusty's
kernel patches allow binaries of either ABI to work, easing the
transition"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Wrong DWARF CFI in the kernel vdso for little-endian / ELFv2
powerpc: Add pseries_le_defconfig
powerpc: Add CONFIG_CPU_LITTLE_ENDIAN kernel config option.
powerpc: Don't use ELFv2 ABI to build the kernel
powerpc: ELF2 binaries signal handling
powerpc: ELF2 binaries launched directly.
powerpc: Set eflags correctly for ELF ABIv2 core dumps.
powerpc: Add TIF_ELF2ABI flag.
pseries: Add H_SET_MODE to change exception endianness
powerpc/pseries: Fix endian issues in pseries EEH code
Will Deacon [Tue, 19 Nov 2013 14:46:17 +0000 (15:46 +0100)]
ARM: 7894/1: kconfig: select GENERIC_CLOCKEVENTS if HAVE_ARM_ARCH_TIMER
The ARM architected timer driver doesn't compile without
GENERIC_CLOCKEVENTS selected, so ensure that we select it when building
for a platform that has the timer.
Without this patch, mach-virt fails to build without something like
mach-vexpress also selected.
Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Will Deacon [Tue, 19 Nov 2013 14:46:11 +0000 (15:46 +0100)]
ARM: 7893/1: bitops: only emit .arch_extension mp if CONFIG_SMP
Uwe reported a build failure when targetting a NOMMU platform with my
recent prefetch changes:
arch/arm/lib/changebit.S: Assembler messages:
arch/arm/lib/changebit.S:15: Error: architectural extension `mp' is
not allowed for the current base architecture
This is due to use of the .arch_extension mp directive immediately prior
to an ALT_SMP(...) instruction. Whilst the ALT_SMP macro will expand to
nothing if !CONFIG_SMP, gas will still choke on the directive.
This patch fixes the issue by only emitting the sequence (including the
directive) if CONFIG_SMP=y.
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Yinghai Lu [Tue, 19 Nov 2013 00:02:45 +0000 (17:02 -0700)]
PCI: Remove duplicate pci_disable_device() from pcie_portdrv_remove()
The pcie_portdrv .probe() method calls pci_enable_device() once, in
pcie_port_device_register(), but the .remove() method calls
pci_disable_device() twice, in pcie_port_device_remove() and in
pcie_portdrv_remove().
That causes a "disabling already-disabled device" warning when removing a
PCIe port device. This happens all the time when removing Thunderbolt
devices, but is also easy to reproduce with, e.g.,
"echo 0000:00:1c.3 > /sys/bus/pci/drivers/pcieport/unbind"
This patch removes the disable from pcie_portdrv_remove().
[bhelgaas: changelog, tag for stable] Reported-by: David Bulkow <David.Bulkow@stratus.com> Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # v2.6.32+
Linus Torvalds [Wed, 20 Nov 2013 23:03:45 +0000 (15:03 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha updates from Matt Turner:
"It contains a few fixes and some work from Richard to make alpha
emulation under QEMU much more usable"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
alpha: Prevent a NULL ptr dereference in csum_partial_copy.
alpha: perf: fix out-of-bounds array access triggered from raw event
alpha: Use qemu+cserve provided high-res clock and alarm.
alpha: Switch to GENERIC_CLOCKEVENTS
alpha: Enable the rpcc clocksource for single processor
alpha: Reorganize rtc handling
alpha: Primitive support for CPU power down.
alpha: Allow HZ to be configured
alpha: Notice if we're being run under QEMU
alpha: Eliminate compiler warning from memset macro
Linus Torvalds [Wed, 20 Nov 2013 23:02:50 +0000 (15:02 -0800)]
Merge branch 'parisc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
- revert an access_ok() patch which broke 32bit userspace on 64bit
kernels
- avoid a gcc miscompilation in two internal pa_memcpy() functions by
not inlining those
- do not export the definition of SOCK_NONBLOCK via uapi header (fixes
build of audit package)
- depending on the fault type we now correctly report either SIGBUS or
SIGSEGV
- a small fix to not compare a size_t variable for < 0
* 'parisc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: size_t is unsigned, so comparison size < 0 doesn't make sense.
parisc: improve SIGBUS/SIGSEGV error reporting
parisc: break out SOCK_NONBLOCK define to own asm header file
parisc: do not inline pa_memcpy() internal functions
Revert "parisc: implement full version of access_ok()"
Linus Torvalds [Wed, 20 Nov 2013 23:02:22 +0000 (15:02 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32
Pull AVR32 updates from Hans-Christian Egtvedt.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
avr32: uapi: be sure of "_UAPI" prefix for all guard macros
avr32: add kprobe_ctlblk memory struct
avr32: fix out-of-range jump in large kernels
avr32: setup crt for early panic()
Linus Torvalds [Wed, 20 Nov 2013 22:51:37 +0000 (14:51 -0800)]
Merge tag 'squashfs-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next
Pull squashfs updates from Phillip Lougher:
"These patches optionally improve the multi-threading peformance of
Squashfs by adding parallel decompression, and direct decompression
into the page cache, eliminating an intermediate buffer (removing
memcpy overhead and lock contention)"
* tag 'squashfs-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next:
Squashfs: Check stream is not NULL in decompressor_multi.c
Squashfs: Directly decompress into the page cache for file data
Squashfs: Restructure squashfs_readpage()
Squashfs: Generalise paging handling in the decompressors
Squashfs: add multi-threaded decompression using percpu variable
squashfs: Enhance parallel I/O
Squashfs: Refactor decompressor interface and code
Al points out that while the commit *does* actually create a separate
slab for the page->ptl allocation, that slab is never actually used, and
the code continues to use kmalloc/kfree.
Damien Wyart points out that the original patch did have the conversion
to use kmem_cache_alloc/free, so it got lost somewhere on its way to me.
Revert the half-arsed attempt that didn't do anything. If we really do
want the special slab (remember: this is all relevant just for debug
builds, so it's not necessarily all that critical) we might as well redo
the patch fully.
Reported-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 20 Nov 2013 22:25:39 +0000 (14:25 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs bits and pieces from Al Viro:
"Assorted bits that got missed in the first pull request + fixes for a
couple of coredump regressions"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fold try_to_ascend() into the sole remaining caller
dcache.c: get rid of pointless macros
take read_seqbegin_or_lock() and friends to seqlock.h
consolidate simple ->d_delete() instances
gfs2: endianness misannotations
dump_emit(): use __kernel_write(), not vfs_write()
dump_align(): fix the dumb braino
Ulrich Weigand [Wed, 20 Nov 2013 20:38:05 +0000 (07:38 +1100)]
powerpc: Wrong DWARF CFI in the kernel vdso for little-endian / ELFv2
I've finally tracked down why my CR signal-unwind test case still
fails on little-endian. The problem turned to be that the kernel
installs a signal trampoline in the vDSO, and provides a DWARF CFI
record for that trampoline. This CFI describes the save location
for CR:
rsave (70, 38*RSIZE + (RSIZE - CRSIZE))
which is correct for big-endian, but points to the wrong word on
little-endian. This is wrong no matter which ABI.
In addition, for the ELFv2 ABI, we should not only provide a CFI
record for register 70 (cr2), but for all CR fields separately.
Strictly speaking, I guess this would mean providing two separate
vDSO images, one for ELFv1 processes and one for ELFv2 processes (or
maybe playing some tricks with conditional DWARF expressions).
However, having CFI records for the other CR fields in ELFv1 is not
actually wrong, they just will be ignored. So it seems the simplest
fix would be just to always provide CFI for all the fields.
Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Alistair Popple [Wed, 20 Nov 2013 11:15:04 +0000 (22:15 +1100)]
powerpc: Don't use ELFv2 ABI to build the kernel
The kernel doesn't build correctly using the ELFv2 ABI. This patch
ensures that the ELFv1 ABI is used when building a kernel with an
ELFv2 enabled compiler.
Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rusty Russell [Wed, 20 Nov 2013 11:15:03 +0000 (22:15 +1100)]
powerpc: ELF2 binaries signal handling
For the ELFv2 ABI, the hander is the entry point, not a function descriptor.
We also need to set up r12, and fortunately the fast_exception_return
exit path restores r12 for us so nothing else is required.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rusty Russell [Wed, 20 Nov 2013 11:15:02 +0000 (22:15 +1100)]
powerpc: ELF2 binaries launched directly.
No function descriptor, but we set r12 up and set TIF_RESTOREALL as it
normally isn't restored on return from syscall.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rusty Russell [Wed, 20 Nov 2013 11:15:01 +0000 (22:15 +1100)]
powerpc: Set eflags correctly for ELF ABIv2 core dumps.
We leave it at zero (though it could be 1) for old tasks.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rusty Russell [Wed, 20 Nov 2013 11:15:00 +0000 (22:15 +1100)]
powerpc: Add TIF_ELF2ABI flag.
Little endian ppc64 is getting an exciting new ABI. This is reflected
by the bottom two bits of e_flags in the ELF header:
0 == legacy binaries (v1 ABI)
1 == binaries using the old ABI (compiled with a new toolchain)
2 == binaries using the new ABI.
We store this in a thread flag, because we need to set it in core
dumps and for signal delivery. Our chief concern is that it doesn't
use function descriptors.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Wed, 20 Nov 2013 11:14:59 +0000 (22:14 +1100)]
pseries: Add H_SET_MODE to change exception endianness
On little endian builds call H_SET_MODE so exceptions have the
correct endianness. We need to reset the endian during kexec
so do that in the MMU hashtable clear callback.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Wed, 20 Nov 2013 21:25:04 +0000 (13:25 -0800)]
Merge tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI and power management updates from Rafael Wysocki:
- ACPI-based device hotplug fixes for issues introduced recently and a
fix for an older error code path bug in the ACPI PCI host bridge
driver
- Fix for recently broken OMAP cpufreq build from Viresh Kumar
- Fix for a recent hibernation regression related to s2disk
- Fix for a locking-related regression in the ACPI EC driver from
Puneet Kumar
- System suspend error code path fix related to runtime PM and runtime
PM documentation update from Ulf Hansson
- cpufreq's conservative governor fix from Xiaoguang Chen
- New processor IDs for intel_idle and turbostat and removal of an
obsolete Kconfig option from Len Brown
- New device IDs for the ACPI LPSS (Low-Power Subsystem) driver and
ACPI-based PCI hotplug (ACPIPHP) cleanup from Mika Westerberg
- Removal of several ACPI video DMI blacklist entries that are not
necessary any more from Aaron Lu
- Rework of the ACPI companion representation in struct device and code
cleanup related to that change from Rafael J Wysocki, Lan Tianyu and
Jarkko Nikula
- Fixes for assigning names to ACPI-enumerated I2C and SPI devices from
Jarkko Nikula
* tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (24 commits)
PCI / hotplug / ACPI: Drop unused acpiphp_debug declaration
ACPI / scan: Set flags.match_driver in acpi_bus_scan_fixed()
ACPI / PCI root: Clear driver_data before failing enumeration
ACPI / hotplug: Fix PCI host bridge hot removal
ACPI / hotplug: Fix acpi_bus_get_device() return value check
cpufreq: governor: Remove fossil comment in the cpufreq_governor_dbs()
ACPI / video: clean up DMI table for initial black screen problem
ACPI / EC: Ensure lock is acquired before accessing ec struct members
PM / Hibernate: Do not crash kernel in free_basic_memory_bitmaps()
ACPI / AC: Remove struct acpi_device pointer from struct acpi_ac
spi: Use stable dev_name for ACPI enumerated SPI slaves
i2c: Use stable dev_name for ACPI enumerated I2C slaves
ACPI: Provide acpi_dev_name accessor for struct acpi_device device name
ACPI / bind: Use (put|get)_device() on ACPI device objects too
ACPI: Eliminate the DEVICE_ACPI_HANDLE() macro
ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node
cpufreq: OMAP: Fix compilation error 'r & ret undeclared'
PM / Runtime: Fix error path for prepare
PM / Runtime: Update documentation around probe|remove|suspend
cpufreq: conservative: set requested_freq to policy max when it is over policy max
...
Linus Torvalds [Wed, 20 Nov 2013 21:20:24 +0000 (13:20 -0800)]
Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine changes from Vinod Koul:
"This brings for slave dmaengine:
- Change dma notification flag to DMA_COMPLETE from DMA_SUCCESS as
dmaengine can only transfer and not verify validaty of dma
transfers
- Bunch of fixes across drivers:
- cppi41 driver fixes from Daniel
- 8 channel freescale dma engine support and updated bindings from
Hongbo
- msx-dma fixes and cleanup by Markus
- DMAengine updates from Dan:
- Bartlomiej and Dan finalized a rework of the dma address unmap
implementation.
- In the course of testing 1/ a collection of enhancements to
dmatest fell out. Notably basic performance statistics, and
fixed / enhanced test control through new module parameters
'run', 'wait', 'noverify', and 'verbose'. Thanks to Andriy and
Linus [Walleij] for their review.
- Testing the raid related corner cases of 1/ triggered bugs in
the recently added 16-source operation support in the ioatdma
driver.
- Some minor fixes / cleanups to mv_xor and ioatdma"
* 'next' of git://git.infradead.org/users/vkoul/slave-dma: (99 commits)
dma: mv_xor: Fix mis-usage of mmio 'base' and 'high_base' registers
dma: mv_xor: Remove unneeded NULL address check
ioat: fix ioat3_irq_reinit
ioat: kill msix_single_vector support
raid6test: add new corner case for ioatdma driver
ioatdma: clean up sed pool kmem_cache
ioatdma: fix selection of 16 vs 8 source path
ioatdma: fix sed pool selection
ioatdma: Fix bug in selftest after removal of DMA_MEMSET.
dmatest: verbose mode
dmatest: convert to dmaengine_unmap_data
dmatest: add a 'wait' parameter
dmatest: add basic performance metrics
dmatest: add support for skipping verification and random data setup
dmatest: use pseudo random numbers
dmatest: support xor-only, or pq-only channels in tests
dmatest: restore ability to start test at module load and init
dmatest: cleanup redundant "dmatest: " prefixes
dmatest: replace stored results mechanism, with uniform messages
Revert "dmatest: append verify result to results"
...
Linus Torvalds [Wed, 20 Nov 2013 21:06:20 +0000 (13:06 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block IO fixes from Jens Axboe:
"Normally I'd defer my initial for-linus pull request until after the
merge window, but a race was uncovered in the virtio-blk conversion to
blk-mq that could cause hangs. So here's a small collection of fixes
for you to pull:
- The fix for the virtio-blk IO hang reported by Dave Chinner, from
Shaohua and myself.
- Add the Insert blktrace event for blk-mq. This makes 'btt' happy
when it is doing it's state transition analysis.
- Ensure that blk-mq has disk/partition stats enabled by default,
instead of making it opt-in.
- A fix for __bio_add_page() and large sector counts"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: add blktrace insert event trace
virtio-blk: virtqueue_kick() must be ordered with other virtqueue operations
blk-mq: ensure that we set REQ_IO_STAT so diskstats work
bio: fix argument of __bio_add_page() for max_sectors > 0xffff
Linus Torvalds [Wed, 20 Nov 2013 21:05:25 +0000 (13:05 -0800)]
Merge tag 'md/3.13' of git://neil.brown.name/md
Pull md update from Neil Brown:
"Mostly optimisations and obscure bug fixes.
- raid5 gets less lock contention
- raid1 gets less contention between normal-io and resync-io during
resync"
* tag 'md/3.13' of git://neil.brown.name/md:
md/raid5: Use conf->device_lock protect changing of multi-thread resources.
md/raid5: Before freeing old multi-thread worker, it should flush them.
md/raid5: For stripe with R5_ReadNoMerge, we replace REQ_FLUSH with REQ_NOMERGE.
UAPI: include <asm/byteorder.h> in linux/raid/md_p.h
raid1: Rewrite the implementation of iobarrier.
raid1: Add some macros to make code clearly.
raid1: Replace raise_barrier/lower_barrier with freeze_array/unfreeze_array when reconfiguring the array.
raid1: Add a field array_frozen to indicate whether raid in freeze state.
md: Convert use of typedef ctl_table to struct ctl_table
md/raid5: avoid deadlock when raid5 array has unack badblocks during md_stop_writes.
md: use MD_RECOVERY_INTR instead of kthread_should_stop in resync thread.
md: fix some places where mddev_lock return value is not checked.
raid5: Retry R5_ReadNoMerge flag when hit a read error.
raid5: relieve lock contention in get_active_stripe()
raid5: relieve lock contention in get_active_stripe()
wait: add wait_event_cmd()
md/raid5.c: add proper locking to error path of raid5_start_reshape.
md: fix calculation of stacking limits on level change.
raid5: Use slow_path to release stripe when mddev->thread is null
--------------------------- cut here -------------------------------
The reason is that when the bridge dev's address is changed, the
br_fdb_change_mac_address() will add new address in fdb, but when
the bridge was removed, the address entry in the fdb did not free,
the bridge_fdb_cache still has objects when destroy the cache, Fix
this by flushing the bridge address entry when removing the bridge.
v2: according to the Toshiaki Makita and Vlad's suggestion, I only
delete the vlan0 entry, it still have a leak here if the vlan id
is other number, so I need to call fdb_delete_by_port(br, NULL, 1)
to flush all entries whose dst is NULL for the bridge.
Suggested-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Suggested-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tried to fix a problem with VLAN devices and promiscuouse flag setting.
The issue was that VLAN device was setting a flag on an interface that
was down, thus resulting in bad promiscuity count.
This commit blocked flag propagation to any device that is currently
down.
fixed VLAN code to only propagate flags when the VLAN interface is up,
thus fixing the same issue as above, only localized to VLAN.
The problem we have now is that if we have create a complex stack
involving multiple software devices like bridges, bonds, and vlans,
then it is possible that the flags would not propagate properly to
the physical devices. A simple examle of the scenario is the
following:
eth0----> bond0 ----> bridge0 ---> vlan50
If bond0 or eth0 happen to be down at the time bond0 is added to
the bridge, then eth0 will never have promisc mode set which is
currently required for operation as part of the bridge. As a
result, packets with vlan50 will be dropped by the interface.
The only 2 devices that implement the special flag handling are
VLAN and DSA and they both have required code to prevent incorrect
flag propagation. As a result we can remove the generic solution
introduced in b6c40d68ff6498b7f63ddf97cf0aa818d748dee7 and leave
it to the individual devices to decide whether they will block
flag propagation or not.
Reported-by: Stefan Priebe <s.priebe@profihost.ag> Suggested-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>