]> git.kernelconcepts.de Git - karo-tx-linux.git/log
karo-tx-linux.git
7 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec...
David S. Miller [Fri, 17 Feb 2017 02:25:49 +0000 (21:25 -0500)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2017-02-16

1) Make struct xfrm_input_afinfo const, nothing writes to it.
   From Florian Westphal.

2) Remove all places that write to the afinfo policy backend
   and make the struct const then.
   From Florian Westphal.

3) Prepare for packet consuming gro callbacks and add
   ESP GRO handlers. ESP packets can be decapsulated
   at the GRO layer then. It saves a round through
   the stack for each ESP packet.

Please note that this has a merge coflict between commit

63fca65d0863 ("net: add confirm_neigh method to dst_ops")

from net-next and

3d7d25a68ea5 ("xfrm: policy: remove garbage_collect callback")
a2817d8b279b ("xfrm: policy: remove family field")

from ipsec-next.

The conflict can be solved as it is done in linux-next.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Fri, 17 Feb 2017 02:19:44 +0000 (21:19 -0500)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2017-02-16

This series contains updates to ixgbe only.

Tony updates the driver to advertise 2.5Gb and 5.0Gb if the adapter
supports it.

Stephen Hemminger renames our dcbnl_ops since it is global to
ixgbe_dcbnl_ops to avoid namespace issues.

Mark updates the driver version based on the recent changes.

Alex has the remainder of the changes, starting with consolidating
functions that represent logical steps in the receive process so we can
later update them more easily (and align with igb).  Modify the receive
path to only synchronize the length of the frame versus the entire buffer.
Provided performance improvements by adding support for
DMA_ATTR_SKIP_CPU_SYNC and DMA_ATTR_WEAK_ORDERING.  Also made additional
performance gains by batching the page count updates instead of doing
them one at a time.  Adjusted the receive path to use 3k buffers with
8k backing them in order to support build_skb with jumbo frames.  Made
additional driver improvements by using the length of the packet instead
of the DD status to determine if a new descriptor is ready to be
processed, which cuts down on reads.  To reduce code duplication, pulled
apart the receive path into separate functions.  Added support for
providing a buffer with headroom and tailroom to allow for shared info
for NET_SKB_PAD and NET_IP_ALIGN.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Fri, 17 Feb 2017 00:34:01 +0000 (19:34 -0500)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

7 years agocxgb4: Remove redundant code in t4_uld_clean_up()
Ganesh Goudar [Thu, 16 Feb 2017 06:57:15 +0000 (12:27 +0530)]
cxgb4: Remove redundant code in t4_uld_clean_up()

Remove variable rxq_info and also remove redundant assignment
to it.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocxgb4: Add new T5 and T6 pci device id's
Ganesh Goudar [Thu, 16 Feb 2017 06:55:52 +0000 (12:25 +0530)]
cxgb4: Add new T5 and T6 pci device id's

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocxgb4: Increase max number of tc u32 links
Arjun V [Thu, 16 Feb 2017 06:52:45 +0000 (12:22 +0530)]
cxgb4: Increase max number of tc u32 links

Make max number of supported tc u32 links equal to max number of filters
supported by hardware.

Signed-off-by: Arjun V <arjun@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'media/v4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Thu, 16 Feb 2017 18:22:41 +0000 (10:22 -0800)]
Merge tag 'media/v4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fix from Mauro Carvalho Chehab:
 "A regression fix that makes the Siano driver to work again after the
  CONFIG_VMAP_STACK change"

* tag 'media/v4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] siano: make it work again with CONFIG_VMAP_STACK

7 years agovfs: fix uninitialized flags in splice_to_pipe()
Miklos Szeredi [Thu, 16 Feb 2017 16:49:02 +0000 (17:49 +0100)]
vfs: fix uninitialized flags in splice_to_pipe()

Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the
unused part of the pipe ring buffer.  Previously splice_to_pipe() left
the flags value alone, which could result in incorrect behavior.

Uninitialized flags appears to have been there from the introduction of
the splice syscall.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # 2.6.17+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi...
Linus Torvalds [Thu, 16 Feb 2017 17:05:34 +0000 (09:05 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse fixes from Miklos Szeredi:
 "Fix a use after free bug introduced in 4.2 and using an uninitialized
  value introduced in 4.9"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix uninitialized flags in pipe_buffer
  fuse: fix use after free issue in fuse_dev_do_read()

7 years agoMerge tag 'pci-v4.10-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaa...
Linus Torvalds [Thu, 16 Feb 2017 17:03:37 +0000 (09:03 -0800)]
Merge tag 'pci-v4.10-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fix from Bjorn Helgaas:
 "Add back pcie_pme_remove() so we free the IRQ when removing PCIe port
  devices; previously the leaked IRQ caused an MSI BUG_ON"

* tag 'pci-v4.10-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI/PME: Restore pcie_pme_driver.remove

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 16 Feb 2017 16:37:18 +0000 (08:37 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) In order to avoid problems in the future, make cgroup bpf overriding
    explicit using BPF_F_ALLOW_OVERRIDE. From Alexei Staovoitov.

 2) LLC sets skb->sk without proper skb->destructor and this explodes,
    fix from Eric Dumazet.

 3) Make sure when we have an ipv4 mapped source address, the
    destination is either also an ipv4 mapped address or
    ipv6_addr_any(). Fix from Jonathan T. Leighton.

 4) Avoid packet loss in fec driver by programming the multicast filter
    more intelligently. From Rui Sousa.

 5) Handle multiple threads invoking fanout_add(), fix from Eric
    Dumazet.

 6) Since we can invoke the TCP input path in process context, without
    BH being disabled, we have to accomodate that in the locking of the
    TCP probe. Also from Eric Dumazet.

 7) Fix erroneous emission of NETEVENT_DELAY_PROBE_TIME_UPDATE when we
    aren't even updating that sysctl value. From Marcus Huewe.

 8) Fix endian bugs in ibmvnic driver, from Thomas Falcon.

[ This is the second version of the pull that reverts the nested
  rhashtable changes that looked a bit too scary for this late in the
  release  - Linus ]

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
  rhashtable: Revert nested table changes.
  ibmvnic: Fix endian errors in error reporting output
  ibmvnic: Fix endian error when requesting device capabilities
  net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification
  net: xilinx_emaclite: fix freezes due to unordered I/O
  net: xilinx_emaclite: fix receive buffer overflow
  bpf: kernel header files need to be copied into the tools directory
  tcp: tcp_probe: use spin_lock_bh()
  uapi: fix linux/if_pppol2tp.h userspace compilation errors
  packet: fix races in fanout_add()
  ibmvnic: Fix initial MTU settings
  net: ethernet: ti: cpsw: fix cpsw assignment in resume
  kcm: fix a null pointer dereference in kcm_sendmsg()
  net: fec: fix multicast filtering hardware setup
  ipv6: Handle IPv4-mapped src to in6addr_any dst.
  ipv6: Inhibit IPv4-mapped src address on the wire.
  net/mlx5e: Disable preemption when doing TC statistics upcall
  rhashtable: Add nested tables
  tipc: Fix tipc_sk_reinit race conditions
  gfs2: Use rhashtable walk interface in glock_hash_walk
  ...

7 years agofuse: fix uninitialized flags in pipe_buffer
Miklos Szeredi [Thu, 16 Feb 2017 14:08:20 +0000 (15:08 +0100)]
fuse: fix uninitialized flags in pipe_buffer

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: d82718e348fe ("fuse_dev_splice_read(): switch to add_to_pipe()")
Cc: <stable@vger.kernel.org> # 4.9+
7 years agoixgbe: Don't bother clearing buffer memory for descriptor rings
Alexander Duyck [Tue, 17 Jan 2017 16:37:29 +0000 (08:37 -0800)]
ixgbe: Don't bother clearing buffer memory for descriptor rings

This patch makes it so that we don't need to bother with clearing the
memory out for the descriptor rings.  The general idea is to only free
buffers associated with buffers in use which are located between the
next_to_clean and next_to_use or next_to_alloc values.  Everything outside
of those regions can be safely ignored since they should have no buffers
associated with them.

The advantage to doing things this way is that is should speed up bring-up
and tear-down of the rings.  Specifically we can avoid the 512 or more
cycles required to memset the rings in tear-down.  In the bring-up phase we
then clear the memory as a part of initialization.  The general idea is
that the clearing in initialization can act as a prefetch of sorts for the
buffer info structures so they are in the local CPU when we go to populate
them.  This should help to improve overall time needed to perform a
suspend/resume.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Add support for build_skb
Alexander Duyck [Tue, 17 Jan 2017 16:37:13 +0000 (08:37 -0800)]
ixgbe: Add support for build_skb

This patch adds build_skb support to the Rx path.  There are several
advantages to this change.

1.  It avoids the memcpy and skb->head allocation for small packets which
    improves performance by about 5% in my tests.
2.  It avoids the memcpy, skb->head allocation, and eth_get_headlen
    for larger packets improving performance by about 10% in my tests.
3.  For VXLAN packets it allows the full header to be in skb->data which
    improves the performance by as much as 30% in some of my tests.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Add private flag to control buffer mode
Alexander Duyck [Tue, 17 Jan 2017 16:37:03 +0000 (08:37 -0800)]
ixgbe: Add private flag to control buffer mode

Since there are potential drawbacks to the new Rx allocation approach I
thought it best to add a "chicken bit" so that we can turn the feature off
if in the event that a problem is found.

It also provides a means of validating the legacy Rx path in the event that
we are forced to fall back.  At some point in the future when we are
convinced we don't need it anymore we might be able to drop the legacy-rx
flag.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Add support for padding packet
Alexander Duyck [Tue, 17 Jan 2017 16:36:54 +0000 (08:36 -0800)]
ixgbe: Add support for padding packet

This patch adds support for providing a buffer with headroom and tailroom
to allow for shared info, NET_SKB_PAD, and NET_IP_ALIGN.  With this
combined with the DMA changes we can start using build_skb to build frames
around an incoming Rx buffer instead of having to memcpy the headers.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Break out Rx buffer page management
Alexander Duyck [Tue, 17 Jan 2017 16:36:45 +0000 (08:36 -0800)]
ixgbe: Break out Rx buffer page management

We are going to be expanding the number of Rx paths in the driver.  Instead
of duplicating all that code I am pulling it apart into separate functions
so that we don't have so much code duplication.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Use length to determine if descriptor is done
Alexander Duyck [Tue, 17 Jan 2017 16:36:28 +0000 (08:36 -0800)]
ixgbe: Use length to determine if descriptor is done

This change makes it so that we use the length of the packet instead of the
DD status bit to determine if a new descriptor is ready to be processed.
The obvious advantage is that it cuts down on reads as we don't really even
need the DD bit if going from a 0 to a non-zero value on size is enough to
inform us that the packet has been completed.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Make use of order 1 pages and 3K buffers independent of FCoE
Alexander Duyck [Tue, 17 Jan 2017 16:36:14 +0000 (08:36 -0800)]
ixgbe: Make use of order 1 pages and 3K buffers independent of FCoE

In order to support build_skb with jumbo frames it will be necessary to use
3K buffers for the Rx path with 8K pages backing them.  This is needed on
architectures that implement 4K pages because we can't support 2K buffers
plus padding in a 4K page.

In the case of systems that support page sizes larger than 4K the 3K
attribute will only be applied to FCoE as we can fall back to using just 2K
buffers and adding the padding.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Update code to better handle incrementing page count
Alexander Duyck [Tue, 17 Jan 2017 16:36:03 +0000 (08:36 -0800)]
ixgbe: Update code to better handle incrementing page count

Batch the page count updates instead of doing them one at a time.  By doing
this we can improve the overall performance as the atomic increment
operations can be expensive due to the fact that on x86 they are locked
operations which can cause stalls.  By doing bulk updates we can
consolidate the stall which should help to improve the overall receive
performance.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Update driver to make use of DMA attributes in Rx path
Alexander Duyck [Tue, 17 Jan 2017 16:35:54 +0000 (08:35 -0800)]
ixgbe: Update driver to make use of DMA attributes in Rx path

This patch adds support for DMA_ATTR_SKIP_CPU_SYNC and
DMA_ATTR_WEAK_ORDERING.  By enabling both of these for the Rx path we are
able to see performance improvements on architectures that implement either
one due to the fact that page mapping and unmapping only has to sync what
is actually being used instead of the entire buffer.  In addition by
enabling the weak ordering attribute enables a performance improvement for
architectures that can associate a memory ordering with a DMA buffer such
as Sparc.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Only DMA sync frame length
Alexander Duyck [Tue, 17 Jan 2017 16:35:44 +0000 (08:35 -0800)]
ixgbe: Only DMA sync frame length

On some platforms, syncing a buffer for DMA is expensive. Rather than
sync the whole 2K receive buffer, only synchronise the length of the
frame, which will typically be the MTU, or a much smaller TCP ACK.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Add function for checking to see if we can reuse page
Alexander Duyck [Tue, 17 Jan 2017 16:35:34 +0000 (08:35 -0800)]
ixgbe: Add function for checking to see if we can reuse page

This patch consolidates the code for the ixgbe driver so that it is more
inline with what is already in igb.  The general idea is to just
consolidate functions that represent logical steps in the Rx process so we
can later update them more easily.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Update version to reflect added functionality
Mark Rustad [Mon, 12 Dec 2016 23:08:13 +0000 (15:08 -0800)]
ixgbe: Update version to reflect added functionality

Update the driver version to reflect the new devices that it
supports.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: prefix Data Center Bridge ops struct
Stephen Hemminger [Mon, 21 Nov 2016 17:52:40 +0000 (09:52 -0800)]
ixgbe: prefix Data Center Bridge ops struct

Since dcbnl_ops is global, it should be prefixed by ixgbe_

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoixgbe: Support 2.5Gb and 5Gb speed
Tony Nguyen [Fri, 11 Nov 2016 00:01:33 +0000 (16:01 -0800)]
ixgbe: Support 2.5Gb and 5Gb speed

Though not advertised through ethtool, if the link partner advertises a
2.5Gb or 5Gb connection, and the adapter supports it, allow the speed to be
used.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agorhashtable: Revert nested table changes.
David S. Miller [Thu, 16 Feb 2017 03:29:51 +0000 (22:29 -0500)]
rhashtable: Revert nested table changes.

This reverts commits:

6a25478077d987edc5e2f880590a2bc5fcab4441
9dbbfb0ab6680c6a85609041011484e6658e7d3c
40137906c5f55c252194ef5834130383e639536f

It's too risky to put in this late in the release
cycle.  We'll put these changes into the next merge
window instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoibmvnic: Fix endian errors in error reporting output
Thomas Falcon [Wed, 15 Feb 2017 16:33:33 +0000 (10:33 -0600)]
ibmvnic: Fix endian errors in error reporting output

Error reports received from firmware were not being converted from
big endian values, leading to bogus error codes reported on little
endian systems.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoibmvnic: Fix endian error when requesting device capabilities
Thomas Falcon [Wed, 15 Feb 2017 16:32:11 +0000 (10:32 -0600)]
ibmvnic: Fix endian error when requesting device capabilities

When a vNIC client driver requests a faulty device setting, the
server returns an acceptable value for the client to request.
This 64 bit value was incorrectly being swapped as a 32 bit value,
resulting in loss of data. This patch corrects that by using
the 64 bit swap function.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoatm: idt77252, use setup_timer and mod_timer
Jan Koniarik [Wed, 15 Feb 2017 15:59:35 +0000 (16:59 +0100)]
atm: idt77252, use setup_timer and mod_timer

Stop accessing timer struct members directly and use setup_timer and
mod_timer helpers intended for that use. It makes the code cleaner and
will allow for easier change of the timer struct internals.

Signed-off-by: Jan Koniarik <jan.koniarik@trustica.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Chas Williams <3chas3@gmail.com>
Cc: <linux-atm-general@lists.sourceforge.net>
Cc: <netdev@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: acl: Use PBS type for forward action
Jiri Pirko [Wed, 15 Feb 2017 11:09:51 +0000 (12:09 +0100)]
mlxsw: acl: Use PBS type for forward action

Current behaviour of "mirred redirect" action (forward) offload is a bit
odd. For matched packets the action forwards them to the desired
destination, but it also lets the packet duplicates to go the original
way down (bridge, router, etc). That is more like "mirred mirror".
Fix this by using PBS type which behaves exactly like "mirred redirect".
Note that PBS does not support loopback mode.

Fixes: 4cda7d8d7098 ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'stmmac-misc'
David S. Miller [Wed, 15 Feb 2017 18:20:57 +0000 (13:20 -0500)]
Merge branch 'stmmac-misc'

Corentin Labbe says:

====================
stmmac: misc patchs

This is a follow up of my previous stmmac serie which address some comment
done in v2.
====================

Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: invert the logic for dumping regs
LABBE Corentin [Wed, 15 Feb 2017 09:46:45 +0000 (10:46 +0100)]
net: stmmac: invert the logic for dumping regs

It is easier to follow the logic by removing the not operator

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: reduce indentation by adding a continue
LABBE Corentin [Wed, 15 Feb 2017 09:46:44 +0000 (10:46 +0100)]
net: stmmac: reduce indentation by adding a continue

As suggested by Joe Perches, replacing the "if phydev" logic permit to
reduce indentation in the for loop.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: split the stmmac_adjust_link 10/100 case
LABBE Corentin [Wed, 15 Feb 2017 09:46:43 +0000 (10:46 +0100)]
net: stmmac: split the stmmac_adjust_link 10/100 case

The 10/100 case have too many ifcase.
This patch split it for removing an if.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: run stmmac_hw_fix_mac_speed when speed is valid
LABBE Corentin [Wed, 15 Feb 2017 09:46:42 +0000 (10:46 +0100)]
net: stmmac: run stmmac_hw_fix_mac_speed when speed is valid

This patch mutualise a bit by running stmmac_hw_fix_mac_speed() after
the switch in case of valid speed.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: set speed at SPEED_UNKNOWN in case of broken speed
LABBE Corentin [Wed, 15 Feb 2017 09:46:41 +0000 (10:46 +0100)]
net: stmmac: set speed at SPEED_UNKNOWN in case of broken speed

In case of invalid speed given, stmmac_adjust_link() still record it as
current speed.
This patch modify the default case to set speed as SPEED_UNKNOWN if not
10/100/1000.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: use SPEED_UNKNOWN/DUPLEX_UNKNOWN
LABBE Corentin [Wed, 15 Feb 2017 09:46:40 +0000 (10:46 +0100)]
net: stmmac: use SPEED_UNKNOWN/DUPLEX_UNKNOWN

It is better to use DUPLEX_UNKNOWN instead of just "-1".
Using 0 for an invalid speed is bad since 0 is a valid value for speed.
So this patch replace 0 by SPEED_UNKNOWN.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: likely is useless in occasional function
LABBE Corentin [Wed, 15 Feb 2017 09:46:39 +0000 (10:46 +0100)]
net: stmmac: likely is useless in occasional function

The stmmac_adjust_link() function is called too rarely for having
likely() macros being useful.
Just remove likely annotation in it.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: remove useless parenthesis
LABBE Corentin [Wed, 15 Feb 2017 09:46:38 +0000 (10:46 +0100)]
net: stmmac: remove useless parenthesis

This patch remove some useless parenthesis.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: aquantia: switch to pci_alloc_irq_vectors
Christoph Hellwig [Wed, 15 Feb 2017 07:38:47 +0000 (08:38 +0100)]
net: ethernet: aquantia: switch to pci_alloc_irq_vectors

pci_enable_msix has been long deprecated, but this driver adds a new
instance.  Convert it to pci_alloc_irq_vectors so that no new instance
of the deprecated function reaches mainline.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'qed-ptp'
David S. Miller [Wed, 15 Feb 2017 17:42:54 +0000 (12:42 -0500)]
Merge branch 'qed-ptp'

Yuval Mintz says:

====================
qed*: Add support for PTP

This patch series adds required changes for qed/qede drivers for
supporting the IEEE Precision Time Protocol (PTP).

Changes from previous versions:
v7: Fixed Kbuild robot warnings.

v6: Corrected broken loop iteration in previous version.
    Reduced approximation error of adjfreq.

v5: Removed two divisions from the adjust-frequency loop.
    Resulting logic would use 8 divisions [instead of 24].

v4: Remove the loop iteration for value '0' in the qed_ptp_hw_adjfreq()
    implementation.

v3: Use div_s64 for 64-bit divisions as do_div gives error for signed
    types.
    Incorporated review comments from Richard Cochran.
      - Clear timestamp resgisters as soon as timestamp is read.
      - Use shift operation in the place of 'divide by 16'.

v2: Use do_div for 64-bit divisions.
====================

Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoqede: Add driver support for PTP
Sudarsana Reddy Kalluru [Wed, 15 Feb 2017 08:24:11 +0000 (10:24 +0200)]
qede: Add driver support for PTP

This patch adds the driver support for,
  - Registering the ptp clock functionality with the OS.
  - Timestamping the Rx/Tx PTP packets.
  - Ethtool callbacks related to PTP.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoqed: Add infrastructure for PTP support
Sudarsana Reddy Kalluru [Wed, 15 Feb 2017 08:24:10 +0000 (10:24 +0200)]
qed: Add infrastructure for PTP support

The patch adds the required qed interfaces for configuring/reading
the PTP clock on the adapter.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocxgb4: Update proper netdev stats for rx drops
Ganesh Goudar [Wed, 15 Feb 2017 06:15:25 +0000 (11:45 +0530)]
cxgb4: Update proper netdev stats for rx drops

Count buffer group drops or truncates as rx drops rather than
rx errors in netdev stats.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: Arjun V <arjun@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoopenvswitch: Set internal device max mtu to ETH_MAX_MTU.
Jarno Rajahalme [Wed, 15 Feb 2017 05:16:28 +0000 (21:16 -0800)]
openvswitch: Set internal device max mtu to ETH_MAX_MTU.

Commit 91572088e3fd ("net: use core MTU range checking in core net
infra") changed the openvswitch internal device to use the core net
infra for controlling the MTU range, but failed to actually set the
max_mtu as described in the commit message, which now defaults to
ETH_DATA_LEN.

This patch fixes this by setting max_mtu to ETH_MAX_MTU after
ether_setup() call.

Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification
Marcus Huewe [Wed, 15 Feb 2017 00:00:36 +0000 (01:00 +0100)]
net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification

When setting a neigh related sysctl parameter, we always send a
NETEVENT_DELAY_PROBE_TIME_UPDATE netevent. For instance, when
executing

sysctl net.ipv6.neigh.wlp3s0.retrans_time_ms=2000

a NETEVENT_DELAY_PROBE_TIME_UPDATE netevent is generated.

This is caused by commit 2a4501ae18b5 ("neigh: Send a
notification when DELAY_PROBE_TIME changes"). According to the
commit's description, it was intended to generate such an event
when setting the "delay_first_probe_time" sysctl parameter.

In order to fix this, only generate this event when actually
setting the "delay_first_probe_time" sysctl parameter. This fix
should not have any unintended side-effects, because all but one
registered netevent callbacks check for other netevent event
types (the registered callbacks were obtained by grepping for
"register_netevent_notifier"). The only callback that uses the
NETEVENT_DELAY_PROBE_TIME_UPDATE event is
mlxsw_sp_router_netevent_event() (in
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c): in case
of this event, it only accesses the DELAY_PROBE_TIME of the
passed neigh_parms.

Fixes: 2a4501ae18b5 ("neigh: Send a notification when DELAY_PROBE_TIME changes")
Signed-off-by: Marcus Huewe <suse-tux@gmx.de>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosched: have stub for tcf_destroy_chain in case NET_CLS is not configured
Jiri Pirko [Wed, 15 Feb 2017 10:57:50 +0000 (11:57 +0100)]
sched: have stub for tcf_destroy_chain in case NET_CLS is not configured

This fixes broken build for !NET_CLS:

net/built-in.o: In function `fq_codel_destroy':
/home/sab/linux/net-next/net/sched/sch_fq_codel.c:468: undefined reference to `tcf_destroy_chain'

Fixes: cf1facda2f61 ("sched: move tcf_proto_destroy and tcf_destroy_chain helpers into cls_api")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: xilinx_emaclite: fix freezes due to unordered I/O
Anssi Hannula [Tue, 14 Feb 2017 17:11:45 +0000 (19:11 +0200)]
net: xilinx_emaclite: fix freezes due to unordered I/O

The xilinx_emaclite uses __raw_writel and __raw_readl for register
accesses. Those functions do not imply any kind of memory barriers and
they may be reordered.

The driver does not seem to take that into account, though, and the
driver does not satisfy the ordering requirements of the hardware.
For clear examples, see xemaclite_mdio_write() and xemaclite_mdio_read()
which try to set MDIO address before initiating the transaction.

I'm seeing system freezes with the driver with GCC 5.4 and current
Linux kernels on Zynq-7000 SoC immediately when trying to use the
interface.

In commit 123c1407af87 ("net: emaclite: Do not use microblaze and ppc
IO functions") the driver was switched from non-generic
in_be32/out_be32 (memory barriers, big endian) to
__raw_readl/__raw_writel (no memory barriers, native endian), so
apparently the device follows system endianness and the driver was
originally written with the assumption of memory barriers.

Rather than try to hunt for each case of missing barrier, just switch
the driver to use iowrite32/ioread32/iowrite32be/ioread32be depending
on endianness instead.

Tested on little-endian Zynq-7000 ARM SoC FPGA.

Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Fixes: 123c1407af87 ("net: emaclite: Do not use microblaze and ppc IO
functions")
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: xilinx_emaclite: fix receive buffer overflow
Anssi Hannula [Tue, 14 Feb 2017 17:11:44 +0000 (19:11 +0200)]
net: xilinx_emaclite: fix receive buffer overflow

xilinx_emaclite looks at the received data to try to determine the
Ethernet packet length but does not properly clamp it if
proto_type == ETH_P_IP or 1500 < proto_type <= 1518, causing a buffer
overflow and a panic via skb_panic() as the length exceeds the allocated
skb size.

Fix those cases.

Also add an additional unconditional check with WARN_ON() at the end.

Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Fixes: bb81b2ddfa19 ("net: add Xilinx emac lite device driver")
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: Rebuild bpf.o for any dependency update
Mickaël Salaün [Sat, 11 Feb 2017 22:20:23 +0000 (23:20 +0100)]
bpf: Rebuild bpf.o for any dependency update

This is needed to force a rebuild of bpf.o when one of its dependencies
(e.g. uapi/linux/bpf.h) is updated.

Add a phony target.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: Remove redundant ifdef
Mickaël Salaün [Sat, 11 Feb 2017 19:37:08 +0000 (20:37 +0100)]
bpf: Remove redundant ifdef

Remove a useless ifdef __NR_bpf as requested by Wang Nan.

Inline one-line static functions as it was in the bpf_sys.h file.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/r/828ab1ff-4dcf-53ff-c97b-074adb895006@huawei.com
Acked-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: do not use rwlock in fast path
Eric Dumazet [Thu, 9 Feb 2017 17:10:04 +0000 (09:10 -0800)]
mlx4: do not use rwlock in fast path

Using a reader-writer lock in fast path is silly, when we can
instead use RCU or a seqlock.

For mlx4 hwstamp clock, a seqlock is the way to go, removing
two atomic operations and false sharing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoPCI/PME: Restore pcie_pme_driver.remove
Yinghai Lu [Wed, 15 Feb 2017 05:17:48 +0000 (21:17 -0800)]
PCI/PME: Restore pcie_pme_driver.remove

In addition to making PME non-modular, d7def2040077 ("PCI/PME: Make
explicitly non-modular") removed the pcie_pme_driver .remove() method,
pcie_pme_remove().

pcie_pme_remove() freed the PME IRQ that was requested in pci_pme_probe().
The fact that we don't free the IRQ after d7def2040077 causes the following
crash when removing a PCIe port device via /sys:

  ------------[ cut here ]------------
  kernel BUG at drivers/pci/msi.c:370!
  invalid opcode: 0000 [#1] SMP
  Modules linked in:
  CPU: 1 PID: 14509 Comm: sh Tainted: G    W  4.8.0-rc1-yh-00012-gd29438d
  RIP: 0010:[<ffffffff9758bbf5>]  free_msi_irqs+0x65/0x190
  ...
  Call Trace:
   [<ffffffff9758cda4>] pci_disable_msi+0x34/0x40
   [<ffffffff97583817>] cleanup_service_irqs+0x27/0x30
   [<ffffffff97583e9a>] pcie_port_device_remove+0x2a/0x40
   [<ffffffff97584250>] pcie_portdrv_remove+0x40/0x50
   [<ffffffff97576d7b>] pci_device_remove+0x4b/0xc0
   [<ffffffff9785ebe6>] __device_release_driver+0xb6/0x150
   [<ffffffff9785eca5>] device_release_driver+0x25/0x40
   [<ffffffff975702e4>] pci_stop_bus_device+0x74/0xa0
   [<ffffffff975704ea>] pci_stop_and_remove_bus_device_locked+0x1a/0x30
   [<ffffffff97578810>] remove_store+0x50/0x70
   [<ffffffff9785a378>] dev_attr_store+0x18/0x30
   [<ffffffff97260b64>] sysfs_kf_write+0x44/0x60
   [<ffffffff9725feae>] kernfs_fop_write+0x10e/0x190
   [<ffffffff971e13f8>] __vfs_write+0x28/0x110
   [<ffffffff970b0fa4>] ? percpu_down_read+0x44/0x80
   [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0
   [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0
   [<ffffffff971e1f04>] vfs_write+0xc4/0x180
   [<ffffffff971e3089>] SyS_write+0x49/0xa0
   [<ffffffff97001a46>] do_syscall_64+0xa6/0x1b0
   [<ffffffff9819201e>] entry_SYSCALL64_slow_path+0x25/0x25
  ...
   RIP  [<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190
   RSP <ffff89ad3085bc48>
  ---[ end trace f4505e1dac5b95d3 ]---
  Segmentation fault

Restore pcie_pme_remove().

[bhelgaas: changelog]
Fixes: d7def2040077 ("PCI/PME: Make explicitly non-modular")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v4.9+
7 years agoesp: Add a software GRO codepath
Steffen Klassert [Wed, 15 Feb 2017 08:40:00 +0000 (09:40 +0100)]
esp: Add a software GRO codepath

This patch adds GRO ifrastructure and callbacks for ESP on
ipv4 and ipv6.

In case the GRO layer detects an ESP packet, the
esp{4,6}_gro_receive() function does a xfrm state lookup
and calls the xfrm input layer if it finds a matching state.
The packet will be decapsulated and reinjected it into layer 2.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
7 years agoxfrm: Extend the sec_path for IPsec offloading
Steffen Klassert [Wed, 15 Feb 2017 08:39:54 +0000 (09:39 +0100)]
xfrm: Extend the sec_path for IPsec offloading

We need to keep per packet offloading informations across
the layers. So we extend the sec_path to carry these for
the input and output offload codepath.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
7 years agofuse: fix use after free issue in fuse_dev_do_read()
Sahitya Tummala [Wed, 8 Feb 2017 15:00:56 +0000 (20:30 +0530)]
fuse: fix use after free issue in fuse_dev_do_read()

There is a potential race between fuse_dev_do_write()
and request_wait_answer() contexts as shown below:

TASK 1:
__fuse_request_send():
  |--spin_lock(&fiq->waitq.lock);
  |--queue_request();
  |--spin_unlock(&fiq->waitq.lock);
  |--request_wait_answer():
       |--if (test_bit(FR_SENT, &req->flags))
       <gets pre-empted after it is validated true>
                                   TASK 2:
                                   fuse_dev_do_write():
                                     |--clears bit FR_SENT,
                                     |--request_end():
                                        |--sets bit FR_FINISHED
                                        |--spin_lock(&fiq->waitq.lock);
                                        |--list_del_init(&req->intr_entry);
                                        |--spin_unlock(&fiq->waitq.lock);
                                        |--fuse_put_request();
       |--queue_interrupt();
       <request gets queued to interrupts list>
            |--wake_up_locked(&fiq->waitq);
       |--wait_event_freezable();
       <as FR_FINISHED is set, it returns and then
       the caller frees this request>

Now, the next fuse_dev_do_read(), see interrupts list is not empty
and then calls fuse_read_interrupt() which tries to access the request
which is already free'd and gets the below crash:

[11432.401266] Unable to handle kernel paging request at virtual address
6b6b6b6b6b6b6b6b
...
[11432.418518] Kernel BUG at ffffff80083720e0
[11432.456168] PC is at __list_del_entry+0x6c/0xc4
[11432.463573] LR is at fuse_dev_do_read+0x1ac/0x474
...
[11432.679999] [<ffffff80083720e0>] __list_del_entry+0x6c/0xc4
[11432.687794] [<ffffff80082c65e0>] fuse_dev_do_read+0x1ac/0x474
[11432.693180] [<ffffff80082c6b14>] fuse_dev_read+0x6c/0x78
[11432.699082] [<ffffff80081d5638>] __vfs_read+0xc0/0xe8
[11432.704459] [<ffffff80081d5efc>] vfs_read+0x90/0x108
[11432.709406] [<ffffff80081d67f0>] SyS_read+0x58/0x94

As FR_FINISHED bit is set before deleting the intr_entry with input
queue lock in request completion path, do the testing of this flag and
queueing atomically with the same lock in queue_interrupt().

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: fd22d62ed0c3 ("fuse: no fc->lock for iqueue parts")
Cc: <stable@vger.kernel.org> # 4.2+
7 years agoxfrm: Export xfrm_parse_spi.
Steffen Klassert [Wed, 15 Feb 2017 08:39:49 +0000 (09:39 +0100)]
xfrm: Export xfrm_parse_spi.

We need it in the ESP offload handlers, so export it.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
7 years agonet: Prepare gro for packet consuming gro callbacks
Steffen Klassert [Wed, 15 Feb 2017 08:39:44 +0000 (09:39 +0100)]
net: Prepare gro for packet consuming gro callbacks

The upcomming IPsec ESP gro callbacks will consume the skb,
so prepare for that.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
7 years agonet: Add a skb_gro_flush_final helper.
Steffen Klassert [Wed, 15 Feb 2017 08:39:39 +0000 (09:39 +0100)]
net: Add a skb_gro_flush_final helper.

Add a skb_gro_flush_final helper to prepare for  consuming
skbs in call_gro_receive. We will extend this helper to not
touch the skb if the skb is consumed by a gro callback with
a followup patch. We need this to handle the upcomming IPsec
ESP callbacks as they reinject the skb to the napi_gro_receive
asynchronous. The handler is used in all gro_receive functions
that can call the ESP gro handlers.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
7 years agoxfrm: Add a secpath_set helper.
Steffen Klassert [Wed, 15 Feb 2017 08:39:24 +0000 (09:39 +0100)]
xfrm: Add a secpath_set helper.

Add a new helper to set the secpath to the skb.
This avoids code duplication, as this is used
in multiple places.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
7 years agonet: ethernet: ti: cpsw: use var instead of func for usage count
Ivan Khoronzhuk [Tue, 14 Feb 2017 14:02:36 +0000 (16:02 +0200)]
net: ethernet: ti: cpsw: use var instead of func for usage count

The usage count function is based on ndev_running flag that is
updated before calling ndo_open/close, but if ndo is called in
another place, as with suspend/resume, the counter is not changed,
that breaks sus/resume. For common resource no difference which
device is using it, does matter only device count. So, replace
usage count function on var and inc and dec it in ndo_open/close.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: kernel header files need to be copied into the tools directory
Stephen Rothwell [Mon, 13 Feb 2017 21:22:20 +0000 (08:22 +1100)]
bpf: kernel header files need to be copied into the tools directory

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotcp: tcp_probe: use spin_lock_bh()
Eric Dumazet [Wed, 15 Feb 2017 01:11:14 +0000 (17:11 -0800)]
tcp: tcp_probe: use spin_lock_bh()

tcp_rcv_established() can now run in process context.

We need to disable BH while acquiring tcp probe spinlock,
or risk a deadlock.

Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Ricardo Nabinger Sanchez <rnsanchez@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agouapi: fix linux/if_pppol2tp.h userspace compilation errors
Dmitry V. Levin [Wed, 15 Feb 2017 02:23:26 +0000 (05:23 +0300)]
uapi: fix linux/if_pppol2tp.h userspace compilation errors

Because of <linux/libc-compat.h> interface limitations, <netinet/in.h>
provided by libc cannot be included after <linux/in.h>, therefore any
header that includes <netinet/in.h> cannot be included after <linux/in.h>.

Change uapi/linux/l2tp.h, the last uapi header that includes
<netinet/in.h>, to include <linux/in.h> and <linux/in6.h> instead of
<netinet/in.h> and use __SOCK_SIZE__ instead of sizeof(struct sockaddr)
the same way as uapi/linux/in.h does, to fix linux/if_pppol2tp.h userspace
compilation errors like this:

In file included from /usr/include/linux/l2tp.h:12:0,
                 from /usr/include/linux/if_pppol2tp.h:21,
/usr/include/netinet/in.h:31:8: error: redefinition of 'struct in_addr'

Fixes: 47c3e7783be4 ("net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years ago[media] siano: make it work again with CONFIG_VMAP_STACK
Mauro Carvalho Chehab [Tue, 14 Feb 2017 19:47:57 +0000 (17:47 -0200)]
[media] siano: make it work again with CONFIG_VMAP_STACK

Reported as a Kaffeine bug:
https://bugs.kde.org/show_bug.cgi?id=375811

The USB control messages require DMA to work. We cannot pass
a stack-allocated buffer, as it is not warranted that the
stack would be into a DMA enabled area.

On Kernel 4.9, the default is to not accept DMA on stack anymore
on x86 architecture. On other architectures, this has been a
requirement since Kernel 2.2. So, after this patch, this driver
should likely work fine on all archs.

Tested with USB ID 2040:5510: Hauppauge Windham

Cc: stable@vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
7 years agopacket: fix races in fanout_add()
Eric Dumazet [Tue, 14 Feb 2017 17:03:51 +0000 (09:03 -0800)]
packet: fix races in fanout_add()

Multiple threads can call fanout_add() at the same time.

We need to grab fanout_mutex earlier to avoid races that could
lead to one thread freeing po->rollover that was set by another thread.

Do the same in fanout_release(), for peace of mind, and to help us
finding lockdep issues earlier.

Fixes: dc99f600698d ("packet: Add fanout support.")
Fixes: 0648ab70afe6 ("packet: rollover prepare: per-socket state")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agopch_gbe: Omit private ndo_get_stats function
Tobias Klauser [Tue, 14 Feb 2017 16:47:12 +0000 (17:47 +0100)]
pch_gbe: Omit private ndo_get_stats function

pch_gbe_get_stats() just returns dev->stats so we can leave it out
altogether and let dev_get_stats() do the job.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: hip04: Omit private ndo_get_stats function
Tobias Klauser [Tue, 14 Feb 2017 14:10:06 +0000 (15:10 +0100)]
net: hip04: Omit private ndo_get_stats function

hip04_get_stats() just returns dev->stats so we can leave it
out altogether and let dev_get_stats() do the job.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: nla_memdup_cookie() can be static
Wei Yongjun [Tue, 14 Feb 2017 14:19:32 +0000 (14:19 +0000)]
net_sched: nla_memdup_cookie() can be static

Fixes the following sparse warning:

net/sched/act_api.c:532:5: warning:
 symbol 'nla_memdup_cookie' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoibmvnic: Fix initial MTU settings
Thomas Falcon [Tue, 14 Feb 2017 16:22:59 +0000 (10:22 -0600)]
ibmvnic: Fix initial MTU settings

In the current driver, the MTU is set to the maximum value
capable for the backing device. This decision turned out to
be a mistake as it led to confusion among users. The expected
initial MTU value used for other IBM vNIC capable operating
systems is 1500, with the maximum value (9000) reserved for
when Jumbo frames are enabled. This patch sets the MTU to
the default value for a net device.

It also corrects a discrepancy between MTU values received from
firmware, which includes the ethernet header length, and net
device MTU values.

Finally, it removes redundant min/max MTU assignments after device
initialization.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: ti: cpsw: fix cpsw assignment in resume
Ivan Khoronzhuk [Tue, 14 Feb 2017 12:42:15 +0000 (14:42 +0200)]
net: ethernet: ti: cpsw: fix cpsw assignment in resume

There is a copy-paste error, which hides breaking of resume
for CPSW driver: there was replaced netdev_priv() to ndev_to_cpsw(ndev)
in suspend, but left it unchanged in resume.

Fixes: 606f39939595a4d4540406bfc11f265b2036af6d
(ti: cpsw: move platform data and slaves info to cpsw_common)

Reported-by: Alexey Starikovskiy <AStarikovskiy@topcon.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: irda: au1k_ir: drop useless include
Manuel Lauss [Tue, 14 Feb 2017 12:08:04 +0000 (13:08 +0100)]
net: irda: au1k_ir: drop useless include

remove useless ioport.h include.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: irda: au1k_ir: remove unused timer
Manuel Lauss [Tue, 14 Feb 2017 12:08:03 +0000 (13:08 +0100)]
net: irda: au1k_ir: remove unused timer

remove the unused timer.  I suppose it was intended as a timeout
detector, but never properly implemented.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: reduce compiler warnings by adding fallthrough comments
Alexander Alemayhu [Mon, 13 Feb 2017 23:02:35 +0000 (00:02 +0100)]
bpf: reduce compiler warnings by adding fallthrough comments

Fixes the following warnings:

kernel/bpf/verifier.c: In function ‘may_access_direct_pkt_data’:
kernel/bpf/verifier.c:702:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (t == BPF_WRITE)
      ^
kernel/bpf/verifier.c:704:2: note: here
  case BPF_PROG_TYPE_SCHED_CLS:
  ^~~~
kernel/bpf/verifier.c: In function ‘reg_set_min_max_inv’:
kernel/bpf/verifier.c:2057:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
   true_reg->min_value = 0;
   ~~~~~~~~~~~~~~~~~~~~^~~
kernel/bpf/verifier.c:2058:2: note: here
  case BPF_JSGT:
  ^~~~
kernel/bpf/verifier.c:2068:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
   true_reg->min_value = 0;
   ~~~~~~~~~~~~~~~~~~~~^~~
kernel/bpf/verifier.c:2069:2: note: here
  case BPF_JSGE:
  ^~~~
kernel/bpf/verifier.c: In function ‘reg_set_min_max’:
kernel/bpf/verifier.c:2009:24: warning: this statement may fall through [-Wimplicit-fallthrough=]
   false_reg->min_value = 0;
   ~~~~~~~~~~~~~~~~~~~~~^~~
kernel/bpf/verifier.c:2010:2: note: here
  case BPF_JSGT:
  ^~~~
kernel/bpf/verifier.c:2019:24: warning: this statement may fall through [-Wimplicit-fallthrough=]
   false_reg->min_value = 0;
   ~~~~~~~~~~~~~~~~~~~~~^~~
kernel/bpf/verifier.c:2020:2: note: here
  case BPF_JSGE:
  ^~~~

Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agopcnet32: fix BNC/AUI port on AM79C970A
Ondrej Zary [Mon, 13 Feb 2017 22:45:47 +0000 (23:45 +0100)]
pcnet32: fix BNC/AUI port on AM79C970A

Even though the port autoselection is enabled by default on AM79C970A,
BNC/AUI port does not work because the link is always reported to be
down. The link state reported by the chip belongs only to the TP port
but the driver uses it regardless of the port used. The chip can't
detect BNC/AUI link state.

Disable port autoselection and use TP port by default to keep current
behavior (link detection works on TP port, BNC/AUI port does not work).

Implement ethtool autoneg, port and duplex configuration to allow
using the BNC/AUI port.

Report the TP link state only if the TP port is selected. When the
port autoselection is enabled or AUI port is selected, report the link
as always up.

Move pcnet32_suspend() and pcnet32_clr_suspend() functions to avoid
forward declarations.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agopcnet32: factor out pcnet32_clr_suspend()
Ondrej Zary [Mon, 13 Feb 2017 22:45:46 +0000 (23:45 +0100)]
pcnet32: factor out pcnet32_clr_suspend()

Move the code to clear SUSPEND flag to a separate function to simplify
code.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum: Change ipv6 unregistered mc table
Nogah Frankel [Mon, 13 Feb 2017 20:03:02 +0000 (21:03 +0100)]
mlxsw: spectrum: Change ipv6 unregistered mc table

Point back the unregister IPv6 mc table to the bc table.
It is done since IPv6 mcast snooping is not supported for Spectrum yet.

Reported-by: Jiri Pirko <jiri@mellanox.com>
Fixes: 71c365bdc439 ("mlxsw: spectrum: Separate bc and mc floods")
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Tested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agokcm: fix a null pointer dereference in kcm_sendmsg()
WANG Cong [Mon, 13 Feb 2017 19:13:16 +0000 (11:13 -0800)]
kcm: fix a null pointer dereference in kcm_sendmsg()

In commit 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()")
I tried to avoid skb allocation for 0-length case, but missed
a check for NULL pointer in the non EOR case.

Fixes: 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'sunvnet-driver-updates'
David S. Miller [Tue, 14 Feb 2017 18:04:11 +0000 (13:04 -0500)]
Merge branch 'sunvnet-driver-updates'

Shannon Nelson says:

====================
sunvnet driver updates

The sunvnet ldom virtual network driver was due for some updates and
a bugfix or two.  These patches address a few items left over from
last year's make-over.

v2:
 - changed memory barrier fix to use smp_wmb
 - put NETIF_F_SG back into the advertised ldmvsw hw_features

v3:
 - the sunvnet_common module doesn't need module_init or _exit

v4:
 - dropped the statistics patch
 - fixed up "default" tag for SUNVNET_COMMON
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoldmvsw: disable tso and gso for bridge operations
Shannon Nelson [Mon, 13 Feb 2017 18:57:04 +0000 (10:57 -0800)]
ldmvsw: disable tso and gso for bridge operations

The ldmvsw driver is specifically for supporting the ldom virtual
networking by running in the primary ldom and using the LDC to connect
the remaining ldoms to the outside world via a bridge.  With TSO and GSO
supported while connected the bridge, things tend to misbehave as seen
in our case by delayed packets, enough to begin triggering retransmits
and affecting overall throughput.  By turning off advertised support for
TSO and GSO we restore stable traffic flow through the bridge.

Orabug: 23293104

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoldmvsw: update and simplify version string
Shannon Nelson [Mon, 13 Feb 2017 18:57:03 +0000 (10:57 -0800)]
ldmvsw: update and simplify version string

New version and simplify the print code.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosunvnet: remove extra rcu_read_unlocks
Shannon Nelson [Mon, 13 Feb 2017 18:57:02 +0000 (10:57 -0800)]
sunvnet: remove extra rcu_read_unlocks

The RCU read lock is grabbed first thing in sunvnet_start_xmit_common()
so it always needs to be released.  This removes the conditional release
in the dropped packet error path and removes a couple of superfluous
calls in the middle of the code.

Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosunvnet: straighten up message event handling logic
Shannon Nelson [Mon, 13 Feb 2017 18:57:01 +0000 (10:57 -0800)]
sunvnet: straighten up message event handling logic

The use of gotos for handling the incoming events made this code
harder to read and support than it should be.  This patch straightens
out and clears up the logic.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosunvnet: add memory barrier before check for tx enable
Shannon Nelson [Mon, 13 Feb 2017 18:57:00 +0000 (10:57 -0800)]
sunvnet: add memory barrier before check for tx enable

In order to allow the underlying LDC and outstanding memory operations
to potentially catch up with the driver's Tx requests, add a memory
barrier before checking again for available tx descriptors.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosunvnet: update version and version printing
Shannon Nelson [Mon, 13 Feb 2017 18:56:59 +0000 (10:56 -0800)]
sunvnet: update version and version printing

There have been several changes since the first version of this code, so
we bump the version number.  While we're at it, we can simplify the
version printing a bit and drop a couple lines of code.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosunvnet: remove unused variable in maybe_tx_wakeup
Sowmini Varadhan [Mon, 13 Feb 2017 18:56:58 +0000 (10:56 -0800)]
sunvnet: remove unused variable in maybe_tx_wakeup

The vio_dring_state *dr variable is unused in maybe_tx_wakeup().
As the comments indicate, we call maybe_tx_wakeup() whenever we
get a STOPPED LDC message on the port. If the queue is stopped,
we want to wake it up so that we will send another START message
at the next TX and trigger the consumer to drain the dring.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosunvnet: make sunvnet common code dynamically loadable
Shannon Nelson [Mon, 13 Feb 2017 18:56:57 +0000 (10:56 -0800)]
sunvnet: make sunvnet common code dynamically loadable

When the sunvnet_common code was split out for use by both sunvnet
and the newer ldmvsw, it was made into a static kernel library, which
limits the usefulness of sunvnet and ldmvsw as loadables, since most
of the real work is being done in the shared code.  Also, this is
simply dead code in kernels that aren't running the LDoms.

This patch makes the sunvnet_common into a dynamically loadable
module and makes sunvnet and ldmvsw dependent on sunvnet_common.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'sfc-bogus-interrupt-mode-fallbacks'
David S. Miller [Tue, 14 Feb 2017 17:43:19 +0000 (12:43 -0500)]
Merge branch 'sfc-bogus-interrupt-mode-fallbacks'

Edward Cree says:

====================
sfc: prevent bogus interrupt-mode fallbacks

EF10 VFs only support MSI-X interrupts, not MSI or legacy.  This series
 stops the probe logic from trying to fallback to those if MSI-X interrupt
 probe fails.  It also prevents selecting them with the interrupt_mode
 module parameter.
This avoids producing messages like "failed to hook legacy IRQ 0" and "IRQ
 handler type mismatch for IRQ 0", and ensures that the relevant error
 (from the attempt to enable MSI-X) is reported to the caller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: only fall back to a lower interrupt mode if it is supported
Andrew Rybchenko [Mon, 13 Feb 2017 14:59:04 +0000 (14:59 +0000)]
sfc: only fall back to a lower interrupt mode if it is supported

If we fail to probe interrupts with our minimum mode, return that error.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosfc: MSI-X is the only interrupt mode for EF10 VFs
Andrew Rybchenko [Mon, 13 Feb 2017 14:57:39 +0000 (14:57 +0000)]
sfc: MSI-X is the only interrupt mode for EF10 VFs

Add min_interrupt_mode specification per NIC type.
It is a bit confusing because of "highest interrupt mode is less capable".

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bridge-fdb-minor-cleanup'
David S. Miller [Tue, 14 Feb 2017 17:41:04 +0000 (12:41 -0500)]
Merge branch 'bridge-fdb-minor-cleanup'

Nikolay Aleksandrov says:

====================
bridge: minor fdb cleanup

These patches aim to simplify the bridge fdb API a little by removing some
redundant functions and converting them into wrappers of a single function.
Also add proper lock checking to avoid future mistakes for the search
functions.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobridge: fdb: converge fdb_delete_by functions into one
Nikolay Aleksandrov [Mon, 13 Feb 2017 13:59:11 +0000 (14:59 +0100)]
bridge: fdb: converge fdb_delete_by functions into one

We can simplify the logic of entries pointing to the bridge by
converging the fdb_delete_by functions, this would allow us to use the
same function for both cases since the fdb's dst is set to NULL if it is
pointing to the bridge thus we can always check for a port match.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobridge: fdb: add proper lock checks in searching functions
Nikolay Aleksandrov [Mon, 13 Feb 2017 13:59:10 +0000 (14:59 +0100)]
bridge: fdb: add proper lock checks in searching functions

In order to avoid new errors add checks to br_fdb_find and fdb_find_rcu
functions. The first requires hash_lock, the second obviously RCU.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobridge: fdb: converge fdb searching functions into one
Nikolay Aleksandrov [Mon, 13 Feb 2017 13:59:09 +0000 (14:59 +0100)]
bridge: fdb: converge fdb searching functions into one

Before this patch we had 3 different fdb searching functions which was
confusing. This patch reduces all of them to one - fdb_find_rcu(), and
two flavors: br_fdb_find() which requires hash_lock and br_fdb_find_rcu
which requires RCU. This makes it clear what needs to be used, we also
remove two abusers of __br_fdb_get which called it under hash_lock and
replace them with br_fdb_find().

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: fec: fix multicast filtering hardware setup
Rui Sousa [Mon, 13 Feb 2017 02:01:25 +0000 (10:01 +0800)]
net: fec: fix multicast filtering hardware setup

Fix hardware setup of multicast address hash:
- Never clear the hardware hash (to avoid packet loss)
- Construct the hash register values in software and then write once
to hardware

Signed-off-by: Rui Sousa <rui.sousa@nxp.com>
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'ipv6-v4mapped'
David S. Miller [Tue, 14 Feb 2017 17:13:52 +0000 (12:13 -0500)]
Merge branch 'ipv6-v4mapped'

Jonathan T. Leighton says:

====================
IPv4-mapped on wire, :: dst address issue

Under some circumstances IPv6 datagrams are sent with IPv4-mapped IPv6
addresses as the source. Given an IPv6 socket bound to an IPv4-mapped
IPv6 address, and an IPv6 destination address, both TCP and UDP will
will send packets using the IPv4-mapped IPv6 address as the source. Per
RFC 6890 (Table 20), IPv4-mapped IPv6 source addresses are not allowed
in an IP datagram. The problem can be observed by attempting to
connect() either a TCP or UDP socket, or by using sendmsg() with a UDP
socket. The patch is intended to correct this issue for all socket
types.

linux follows the BSD convention that an IPv6 destination address
specified as in6addr_any is converted to the loopback address.
Currently, neither TCP nor UDP consider the possibility that the source
address is an IPv4-mapped IPv6 address, and assume that the appropriate
loopback address is ::1. The patch adds a check on whether or not the
source address is an IPv4-mapped IPv6 address and then sets the
destination address to either ::ffff:127.0.0.1 or ::1, as appropriate.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv6: Handle IPv4-mapped src to in6addr_any dst.
Jonathan T. Leighton [Sun, 12 Feb 2017 22:26:07 +0000 (17:26 -0500)]
ipv6: Handle IPv4-mapped src to in6addr_any dst.

This patch adds a check on the type of the source address for the case
where the destination address is in6addr_any. If the source is an
IPv4-mapped IPv6 source address, the destination is changed to
::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
is done in three locations to handle UDP calls to either connect() or
sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
handling an in6addr_any destination until very late, so the patch only
needs to handle the case where the source is an IPv4-mapped IPv6
address.

Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv6: Inhibit IPv4-mapped src address on the wire.
Jonathan T. Leighton [Sun, 12 Feb 2017 22:26:06 +0000 (17:26 -0500)]
ipv6: Inhibit IPv4-mapped src address on the wire.

This patch adds a check for the problematic case of an IPv4-mapped IPv6
source address and a destination address that is neither an IPv4-mapped
IPv6 address nor in6addr_any, and returns an appropriate error. The
check in done before returning from looking up the route.

Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5e: Disable preemption when doing TC statistics upcall
Or Gerlitz [Sun, 12 Feb 2017 09:21:31 +0000 (11:21 +0200)]
net/mlx5e: Disable preemption when doing TC statistics upcall

When called by HW offloading drivers, the TC action (e.g
net/sched/act_mirred.c) code uses this_cpu logic, e.g

 _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets)

per the kernel documention, preemption should be disabled, add that.

Before the fix, when running with CONFIG_PREEMPT set, we get a

BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793

asserion from the TC action (mirred) stats_update callback.

Fixes: aad7e08d39bd ('net/mlx5e: Hardware offloaded flower filter statistics support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>