Nir Tzachar <nir.tzachar@gmail.com> reported a warning when sending
fragments over loopback with NAT:
[ 6658.338121] WARNING: at net/ipv4/netfilter/nf_nat_standalone.c:89 nf_nat_fn+0x33/0x155()
The reason is that defragmentation is skipped for already tracked connections.
This is wrong in combination with NAT and ip_conntrack actually had some ifdefs
to avoid this behaviour when NAT is compiled in.
The entire "optimization" may seem a bit silly, for now simply restoring the
lost #ifdef is the easiest solution until we can come up with something better.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In some BIOSes, every _STA method call will send a notification again,
this cause freeze. And in some BIOSes, it appears _STA should be called
after _DCK. This tries to avoid calls _STA, and still keep the device
present check.
http://bugzilla.kernel.org/show_bug.cgi?id=10431
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make arch/sparc64/kernel/trampoline.S in 2.6.27.1 lock prom_entry_lock
when calling the PROM. This prevents a race condition that I observed
causing a hang on startup on a 12-CPU E4500.
I am not subscribed to this list, so please CC me on replies.
Signed-off-by: Andrea Shepard <andrea@persephoneslair.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I'm trying to move the powerpc math-emu code to use the include/math-emu bits.
In doing so I've been using TestFloat to see how good or bad we are
doing. For the most part the current math-emu code that PPC uses has
a number of issues that the code in include/math-emu seems to solve
(plus bugs we've had for ever that no one every realized).
Anyways, I've come across a case that we are flagging underflow and
inexact because we think we have a denormalized result from a double
precision divide:
The problem seems like we aren't normalizing the result and bumping the exp.
Now that I'm digging into this a bit I'm thinking my issue has to do with
the fix DaveM put in place from back in Aug 2007 (commit 405849610fd96b4f34cd1875c4c033228fea6c0f):
[MATH-EMU]: Fix underflow exception reporting.
2) we ended up rounding back up to normal (this is the case where
we set the exponent to 1 and set the fraction to zero), this
should set inexact too
...
Another example, "0x0.0000000000001p-1022 / 16.0", should signal both
inexact and underflow. The cpu implementations and ieee1754
literature is very clear about this. This is case #2 above.
Here is the distilled glibc test case from Jakub Jelinek which prompted that
commit:
More breakage :-), part of timestamps just were previously
overwritten.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Benjamin Thery tracked down a bug that explains many instances
of the error
unregister_netdevice: waiting for %s to become free. Usage count = %d
It turns out that netdev_run_todo can dead-lock with itself if
a second instance of it is run in a thread that will then free
a reference to the device waited on by the first instance.
The problem is really quite silly. We were trying to create
parallelism where none was required. As netdev_run_todo always
follows a RTNL section, and that todo tasks can only be added
with the RTNL held, by definition you should only need to wait
for the very ones that you've added and be done with it.
There is no need for a second mutex or spinlock.
This is exactly what the following patch does.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
DVB: s5h1411: Perform s5h1411 soft reset after tuning
If you instruct the tuner to change frequencies, it can take up to 2500ms to
get a demod lock. By performing a soft reset after the tuning call (which
is consistent with how the Pinnacle 801e Windows driver behaves), you get
a demod lock inside of 300ms
Currently not always an EV_SYN event is reported to userland
after the EV_SW SW_LID event has been sent. This is easy to verify
by using “input-events” from input-utils and just closing and opening
the lid.
Signed-off-by: Guillem Jover <guillem.jover@nokia.com> Acked-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
While Linux doesn't honor setuid on scripts. However, it mistakenly
behaves differently for file capabilities.
This patch fixes that behavior by making sure that get_file_caps()
begins with empty bprm->caps_*. That way when a script is loaded,
its bprm->caps_* may be filled when binfmt_misc calls prepare_binprm(),
but they will be cleared again when binfmt_elf calls prepare_binprm()
next to read the interpreter's file capabilities.
Signed-off-by: Serge Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If somebody sends an invalid beacon/probe response, that can trash the
whole BSS descriptor. The descriptor is, luckily, large enough so that
it cannot scribble past the end of it; it's well above 400 bytes long.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
__scm_destroy() walks the list of file descriptors in the scm_fp_list
pointed to by the scm_cookie argument.
Those, in turn, can close sockets and invoke __scm_destroy() again.
There is nothing which limits how deeply this can occur.
The idea for how to fix this is from Linus. Basically, we do all of
the fput()s at the top level by collecting all of the scm_fp_list
objects hit by an fput(). Inside of the initial __scm_destroy() we
keep running the list until it is empty.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The cell_edac driver is setting the edac_mode field of the csrow's to an
incorrect value, causing the sysfs show routine for that field to go out
of an array bound and Oopsing the kernel when used.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext[234]: Avoid printk floods in the face of directory corruption
Note: some people thinks this represents a security bug, since it
might make the system go away while it is printing a large number of
console messages, especially if a serial console is involved. Hence,
it has been assigned CVE-2008-3528, but it requires that the attacker
either has physical access to your machine to insert a USB disk with a
corrupted filesystem image (at which point why not just hit the power
button), or is otherwise able to convince the system administrator to
mount an arbitrary filesystem image (at which point why not just
include a setuid shell or world-writable hard disk device file or some
such). Me, I think they're just being silly. --tytso
It's OK to request the value of *any* GPIO; most GPIOs are bidirectional,
so configuring them as outputs just enables an output driver and doesn't
disable the input logic.
So the problem is that gpio_get_value_cansleep() isn't making the same
sanity check that gpio_get_value() does: making sure this GPIO isn't one
of the atypical "no input logic" cases.
Reported-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
NULL function pointers are very bad security wise. This one got caught by
kerneloops.org quite a few times, so it's happening in the field....
Fix is simple, check the function pointer for NULL, like 6 other places
in the same function are already doing.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Olaf Kirch noticed that the i915_set_status_page() function of the i915
kernel driver calls ioremap with an address offset that is supplied by
userspace via ioctl. The function zeroes the mapped memory via memset
and tells the hardware about the address. Turns out that access to that
ioctl is not restricted to root so users could probably exploit that to
do nasty things. We haven't tried to write actual exploit code though.
It only affects the Intel G33 series and newer.
Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
According to acpi spec , the objectes of _BCL and _BCM are required if
integrated LCD is present and supports brightness level and the _BQC is
the optional object. So the _BQC object will be ignored when the backlight
device is registered.
At the same time when there is no _BQC object, the current brightness will be
set to the maximum.
http://bugzilla.kernel.org/show_bug.cgi?id=10206
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On the Shuttle SN68PT, FAN_CTL2 is apparently not connected to a fan,
but to something else. One user has reported instant system power-off
when changing the PWM2 duty cycle, so we disable it.
I use the board name string as the trigger in case the same board is
ever used in other systems.
This closes lm-sensors ticket #2349:
pwmconfig causes a hard poweroff
http://www.lm-sensors.org/ticket/2349
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is loosely based on a patch by Jesse Barnes to check the user-space
PCI mappings though the sysfs interfaces. Quoting Jesse's original
explanation:
It's fairly common for applications to map PCI resources through sysfs.
However, with the current implementation, it's possible for an application
to map far more than the range corresponding to the resourceN file it
opened. This patch plugs that hole by checking the range at mmap time,
similar to what is done on platforms like sparc64 in their lower level
PCI remapping routines.
It was initially put together to help debug the e1000e NVRAM corruption
problem, since we initially thought an X driver might be walking past the
end of one of its mappings and clobbering the NVRAM. It now looks like
that's not the case, but doing the check is still important for obvious
reasons.
and this version of the patch differs in that it uses a helper function
to clarify the code, and does all the checks in pages (instead of bytes)
in order to avoid overflows when doing "<< PAGE_SHIFT" etc.
[cebbert@redhat.com: backport, changing WARN() to printk()]
It's possible for get_wchan() to dereference past task->stack + THREAD_SIZE
while iterating through instruction pointers if fp equals the upper boundary,
causing a kernel panic.
The ACPI FADT table includes an ASPM control bit. If the bit is set, do
not enable ASPM since it may indicate that the platform doesn't actually
support the feature.
Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is a buffer overflow in drivers/media/video/uvc/uvc_ctrl.c:
INFO: 0xf2c5ce08-0xf2c5ce0b. First byte 0xa1 instead of 0xcc
INFO: Allocated in uvc_query_v4l2_ctrl+0x3c/0x239 [uvcvideo] age=13 cpu=1 pid=4975
...
A fixed size 8-byte buffer is allocated, and a variable size field is read
into it; there is no particular bound on the size of the field (it is
dependent on hardware and configuration) and it can overflow [also
verified by inserting printk's.]
The patch attempts to size the buffer to the correctly.
[required to get the following two patches to apply]
Although the V4L2 spec states that the minimum and maximum fields may not be
valid for control types other than V4L2_CTRL_TYPE_INTEGER, it makes sense
to set the bounds to 0 and 1 for boolean controls instead of returning
uninitialized values.
This is debatable, but while we're debating it, let's disallow the
combination of splice and an O_APPEND destination.
It's not entirely clear what the semantics of O_APPEND should be, and
POSIX apparently expects pwrite() to ignore O_APPEND, for example. So
we could make up any semantics we want, including the old ones.
But Miklos convinced me that we should at least give it some thought,
and that accepting writes at arbitrary offsets is wrong at least for
IS_APPEND() files (which always have O_APPEND set, even if the reverse
isn't true: you can obviously have O_APPEND set on a regular file).
So disallow O_APPEND entirely for now. I doubt anybody cares, and this
way we have one less gray area to worry about.
The zr36067 driver is improperly declaring pixel format RGBP twice,
once as "16-bit RGB LE" and once as "16-bit RGB BE". The latter is
actually RGBR. Fix the code to properly map both pixel formats.
This happens because radio_open assumes that all present bttv devices
have a radio function. If a bttv device without radio and one with
radio are installed on the same system, and the one without radio is
registered first, then radio_open checks for the radio device number
of a bttv device that has no radio function, and this breaks. All we
have to do to fix it is to skip bttv devices without a radio function.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I recently bought 3 HGST P7K500-series 500GB SATA drives and
had trouble accessing the block right on the LBA28-LBA48 border.
Here's how it fails (same for all 3 drives):
# dd if=/dev/sdc bs=512 count=1 skip=268435455 > /dev/null
dd: reading `/dev/sdc': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.288033 seconds, 0.0 kB/s
# dmesg
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: BMDMA stat 0x25
ata1.00: cmd c8/00:08:f8:ff:ff/00:00:00:00:00/ef tag 0 dma 4096 in
res 51/04:08:f8:ff:ff/00:00:00:00:00/ef Emask 0x1 (device error)
ata1.00: status: { DRDY ERR }
ata1.00: error: { ABRT }
ata1.00: configured for UDMA/33
ata1: EH complete
...
After some investigations, it turned out this seems to be caused
by misinterpretation of the ATA specification on LBA28 access.
Following part is the code in question:
HGST drive (sometimes) fails with LBA28 access of {block = 0xfffffff,
n_block = 1}, and this behavior seems to be comformant. Other drives,
including other HGST drives are not that strict, through.
>From the ATA specification:
(http://www.t13.org/Documents/UploadedDocuments/project/d1410r3b-ATA-ATAPI-6.pdf)
8.15.29 Word (61:60): Total number of user addressable sectors
This field contains a value that is one greater than the total number
of user addressable sectors (see 6.2). The maximum value that shall
be placed in this field is 0FFFFFFFh.
So the driver shouldn't use the value of 0xfffffff for LBA28 request
as this exceeds maximum user addressable sector. The logical maximum
value for LBA28 is 0xffffffe.
The obvious fix is to cut "- 1" part, and the patch attached just do
that. I've been using the patched kernel for about a month now, and
the same fix is also floating on the net for some time. So I believe
this fix works reliably.
Just FYI, many Windows/Intel platform users also seems to be struck
by this, and HGST has issued a note pointing to Intel ICH8/9 driver.
ehc->i.action got accidentally overwritten to ATA_EH_HARD/SOFTRESET in
ata_eh_reset(). The original intention was to clear reset action
which wasn't selected. This can cause unexpected behavior when other
EH actions are scheduled together with reset. Fix it.
As an optimization, follow-up SRST used to be skipped if
classification wasn't requested even when hardreset requested it via
-EAGAIN. However, some hardresets can't wait for device readiness and
skipping SRST can cause timeout or other failures during revalidation.
Always perform follow-up SRST if hardreset returns -EAGAIN. This
makes reset paths more predictable and thus less error-prone.
While at it, move hardreset error checking such that it's done right
after hardreset is finished. This simplifies followup SRST condition
check a bit and makes the reset path easier to modify.
Because fbcon_set_all_vcs()->FBCON_SWAP() uses display->rotate == 0 instead
of fbcon_ops->rotate, and vc_resize() has no effect because it is called with
new_cols/rows == ->vc_cols/rows.
Tested on 2.6.26.5-45.fc9.x86_64, but
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git seems to
have the same problem.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A coding error present since b43legacy was incorporated into the
kernel has prevented the driver from using the rate-setting mechanism
of mac80211. The driver has been forced to remain at a 1 Mb/s rate.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When we do a seekdir() or equivalent, we usually end up doing a
FindFirst call and then call FindNext until we get to the offset that we
want. The problem is that when we call FindNext, the code usually
doesn't have the proper info (mostly, the filename of the entry from the
last search) to resume the search.
Add a "last_entry" field to the cifs_search_info that points to the last
entry in the search. We calculate this pointer by using the
LastNameOffset field from the search parms that are returned. We then
use that info to do a cifs_save_resume_key before we call CIFSFindNext.
This patch allows CIFS to reliably pass the "telldir" connectathon test.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
While working on the new version of the code for SCHED_SPORADIC I
noticed something strange in the present throttling mechanism. More
specifically in the throttling timer handler in sched_rt.c
(do_sched_rt_period_timer()) and in rt_rq_enqueue().
The problem is that, when unthrottling a runqueue, rt_rq_enqueue() only
asks for rescheduling if the runqueue has a sched_entity associated to
it (i.e., rt_rq->rt_se != NULL).
Now, if the runqueue is the root rq (which has a rt_se = NULL)
rescheduling does not take place, and it is delayed to some undefined
instant in the future.
This imply some random bandwidth usage by the RT tasks under throttling.
For instance, setting rt_runtime_us/rt_period_us = 950ms/1000ms an RT
task will get less than 95%. In our tests we got something varying
between 70% to 95%.
Using smaller time values, e.g., 95ms/100ms, things are even worse, and
I can see values also going down to 20-25%!!
The tests we performed are simply running 'yes' as a SCHED_FIFO task,
and checking the CPU usage with top, but we can investigate thoroughly
if you think it is needed.
Things go much better, for us, with the attached patch... Don't know if
it is the best approach, but it solved the issue for us.
Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Michael Trimarchi <trimarchimichael@yahoo.it> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The x86 implementation of early_ioremap has an off by one error. If we get
an object which ends on the first byte of a page we undermap by one page and
this causes a crash on boot with the ASUS P5QL whose DMI table happens to fit
this alignment.
Stefan Bader [Sat, 27 Sep 2008 15:07:30 +0000 (11:07 -0400)]
x86: Reserve FIRST_DEVICE_VECTOR in used_vectors bitmap.
Not in upstream above 2.6.27 due to change in the way this code works
(has been fixed differently there.)
Someone from the community found out, that after repeatedly unloading
and loading a device driver that uses MSI IRQs, the system eventually
assigned the vector initially reserved for IRQ0 to the device driver.
The reason for this is, that although IRQ0 is tied to the
FIRST_DEVICE_VECTOR when declaring the irq_vector table, the
corresponding bit in the used_vectors map is not set. So, if vectors are
released and assigned often enough, the vector will get assigned to
another interrupt. This happens more often with MSI interrupts as those
are exclusively using a vector.
Fix this by setting the bit for the FIRST_DEVICE_VECTOR in the bitmap.
When running a 31-bit ptrace, on either an s390 or s390x kernel,
reads and writes into a padding area in struct user_regs_struct32
will result in a kernel panic.
This is also known as CVE-2008-1514.
Test case available here:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/user-area-padding.c?cvsroot=systemtap
Steps to reproduce:
1) wget the above
2) gcc -o user-area-padding-31bit user-area-padding.c -Wall -ggdb2 -D_GNU_SOURCE -m31
3) ./user-area-padding-31bit
<panic>
Test status
-----------
Without patch, both s390 and s390x kernels panic. With patch, the test case,
as well as the gdb testsuite, pass without incident, padding area reads
returning zero, writes ignored.
Nb: original version returned -EINVAL on write attempts, which broke the
gdb test and made the test case slightly unhappy, Jan Kratochvil suggested
the change to return 0 on write attempts.
Signed-off-by: Jarod Wilson <jarod@redhat.com> Tested-by: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Moritz Muehlenhoff <jmm@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Balbir Singh [Sun, 5 Oct 2008 16:43:37 +0000 (17:43 +0100)]
mm owner: fix race between swapoff and exit
[Here's a backport of 2.6.27-rc8's 31a78f23bac0069004e69f98808b6988baccb6b6
to 2.6.26 or 2.6.26.5: I wouldn't trouble -stable for the (root only)
swapoff case which uncovered the bug, but the /proc/<pid>/<mmstats> case
is open to all, so I think worth plugging in the next 2.6.26-stable.
- Hugh]
There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on. The condition occurs when
try_to_unuse() runs in parallel with an exiting task. A similar race
can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats>
or ptrace or page migration.
CPU0 CPU1
try_to_unuse
looks at mm = task0->mm
increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
mmput(mm) decrements mm->mm_users
task0 freed
dereferencing mm->owner fails
The fix is to notify the subsystem via mm_owner_changed callback(),
if no new owner is found, by specifying the new task as NULL.
Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.
Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.
Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same. And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.
When userspace uses SIGIO notification and forgets to disable it before
closing file descriptor, rtc->async_queue contains stale pointer to struct
file. When user space enables again SIGIO notification in different
process, kernel dereferences this (poisoned) pointer and crashes.
So disable SIGIO notification on close.
Kernel panic:
(second run of qemu (requires echo 1024 > /sys/class/rtc/rtc0/max_user_freq))
Commit 22af89aa0c0b4012a7431114a340efd3665a7617 ("fbcon: replace mono_col
macro with static inline") changed the order of operations for computing
monochrome color values. This generates 0xffff000f instead of 0x0000000f
for a 4 bit monochrome color, leading to image corruption if it is passed
to cfb_imageblit or other similar functions. Fix it up.
Cc: Harvey Harrison <harvey.harrison@gmail.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Just like in the arch/sparc64/kernel/of_device.c code fix commit 071d7f4c3b411beae08d27656e958070c43b78b4 ("sparc64: Fix disappearing
PCI devices on e3500.") we have to check the OF device node name for
"pci" instead of relying upon the 'device_type' property being there
on all PCI bridges.
Tested by Meelis Roos, and confirmed to make the PCI QFE devices
reappear on the E3500 system.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The OF device layer builds properties by matching bus types and
applying 'range' properties as appropriate, up to the root.
The match for "PCI" busses is looking at the 'device_type' property,
and this does work %99 of the time.
But on an E3500 system with a PCI QFE card, the DEC 21153 bridge
sitting above the QFE network interface devices has a 'name' of "pci",
but it completely lacks a 'device_type' property. So we don't match
it as a PCI bus, and subsequently we end up with no resource values at
all for the devices sitting under that DEC bridge.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The previous patch in response to the recursive locking on IPsec
reception is broken as it tries to drop the BH socket lock while in
user context.
This patch fixes it by shrinking the section protected by the
socket lock to sock_queue_rcv_skb only. The only reason we added
the lock is for the accounting which happens in that function.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If INIT-ACK is received with SupportedExtensions parameter which
indicates that the peer does not support AUTH, the packet will be
silently ignore, and sctp_process_init() do cleanup all of the
transports in the association.
When T1-Init timer is expires, OOPS happen while we try to choose
a different init transport.
The solution is to only clean up the non-active transports, i.e
the ones that the peer added. However, that introduces a problem
with sctp_connectx(), because we don't mark the proper state for
the transports provided by the user. So, we'll simply mark
user-provided transports as ACTIVE. That will allow INIT
retransmissions to work properly in the sctp_connectx() context
and prevent the crash.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Do not enable peer features like addip and auth, if they
are administratively disabled localy. If the peer resports
that he supports something that we don't, neither end can
use it so enabling it is pointless. This solves a problem
when talking to a peer that has auth and addip enabled while
we do not. Found by Andrei Pelinescu-Onciul <andrei@iptel.org>.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We're never supposed to shrink the headroom or tailroom. In fact,
shrinking the headroom is a fatal action.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
/**
* nla_ok - check if the netlink attribute fits into the remaining bytes
* @nla: netlink attribute
* @remaining: number of bytes remaining in attribute stream
*/
static inline int nla_ok(const struct nlattr *nla, int remaining)
{
return remaining >= sizeof(*nla) &&
nla->nla_len >= sizeof(*nla) &&
nla->nla_len <= remaining;
}
It turns out that remaining can become negative due to alignment in
nla_next(). But GCC promotes "remaining" to unsigned in the test
against sizeof(*nla) above. Therefore the test succeeds, and the
nla_for_each_attr() may access memory outside the received buffer.
A short example illustrating this point is here:
#include <stdio.h>
main(void)
{
printf("%d\n", -1 >= sizeof(int));
}
...which prints "1".
This patch adds a cast in front of the sizeof so that GCC will make
a signed comparison and fix the illegal memory dereference. With the
patch applied, there is no kmemcheck report.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The reset_task function in the niu driver does not reset the tx and rx
buffers properly. This leads to panic on reset. This patch is a
modified implementation of the previously posted fix.
Signed-off-by: Santwona Behera <santwona.behera@sun.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes kernel bugzilla 11469: "TUN with 1024 neighbours:
ip6_dst_lookup_tail NULL crash"
dst->neighbour is not necessarily hooked up at this point
in the processing path, so blindly dereferencing it is
the wrong thing to do. This NULL check exists in other
similar paths and this case was just an oversight.
Also fix the completely wrong and confusing indentation
here while we're at it.
Based upon a patch by Evgeniy Polyakov.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ip6_dst_blackhole_ops.kmem_cachep is not expected to be NULL (i.e. to
be initialized) when dst_alloc() is called from ip6_dst_blackhole().
Otherwise, it results in the following (xfrm_larval_drop is now set to
1 by default):
Signed-off-by: Arnaud Ebalard <arno@natisbad.org> Acked-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
PCMCIA abuses dev->private_data in the probe methods. Unfortunately it
continues to abuse it after calling drv->probe() which leads to crashes and
other nasties (such as bogus probes of multifunction devices) giving errors like
pcmcia: registering new device pcmcia0.1
kernel: 0.1: GetNextTuple: No more items
Extract the passed data before calling the driver probe function that way
we don't blow up when the driver reuses dev->private_data as its right.
The issue of the endless reprogramming loop due to a too small
min_delta_ns was fixed with the previous updates of the clock events
code, but we had no information about the spread of this problem. I
added a WARN_ON to get automated information via kerneloops.org and to
get some direct reports, which allowed me to analyse the affected
machines.
The WARN_ON has served its purpose and would be annoying for a release
kernel. Remove it and just keep the information about the increase of
the min_delta_ns value.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We have a bug in the calculation of the next jiffie to trigger the RTC
synchronisation. The aim here is to run sync_cmos_clock() as close as
possible to the middle of a second. Which means we want this function to
be called less than or equal to half a jiffie away from when now.tv_nsec
equals 5e8 (500000000).
If this is not the case for a given call to the function, for this purpose
instead of updating the RTC we calculate the offset in nanoseconds to the
next point in time where now.tv_nsec will be equal 5e8. The calculated
offset is then converted to jiffies as these are the unit used by the
timer.
Hovewer timespec_to_jiffies() used here uses a ceil()-type rounding mode,
where the resulting value is rounded up. As a result the range of
now.tv_nsec when the timer will trigger is from 5e8 to 5e8 + TICK_NSEC
rather than the desired 5e8 - TICK_NSEC / 2 to 5e8 + TICK_NSEC / 2.
As a result if for example sync_cmos_clock() happens to be called at the
time when now.tv_nsec is between 5e8 + TICK_NSEC / 2 and 5e8 to 5e8 +
TICK_NSEC, it will simply be rescheduled HZ jiffies later, falling in the
same range of now.tv_nsec again. Similarly for cases offsetted by an
integer multiple of TICK_NSEC.
This change addresses the problem by subtracting TICK_NSEC / 2 from the
nanosecond offset to the next point in time where now.tv_nsec will be
equal 5e8, effectively shifting the following rounding in
timespec_to_jiffies() so that it produces a rounded-to-nearest result.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After fixing the u32 thinko I sill had occasional hickups on ATI chipsets
with small deltas. There seems to be a delay between writing the compare
register and the transffer to the internal register which triggers the
interrupt. Reading back the value makes sure, that it hit the internal
match register befor we compare against the counter value.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Until the C1E patches arrived there where no users of periodic broadcast
before switching to oneshot mode. Now we need to trigger a possible
waiter for a periodic broadcast when switching to oneshot mode.
Otherwise we can starve them for ever.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The minimum reprogramming delta was hardcoded in HPET ticks,
which is stupid as it does not work with faster running HPETs.
The C1E idle patches made this prominent on AMD/RS690 chipsets,
where the HPET runs with 25MHz. Set it to 5us which seems to be
a reasonable value and fixes the problems on the bug reporters
machines. We have a further sanity check now in the clock events,
which increases the delta when it is not sufficient.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Tested-by: Dmitry Nezhevenko <dion@inhex.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The C1E/HPET bug reports on AMDX2/RS690 systems where tracked down to a
too small value of the HPET minumum delta for programming an event.
The clockevents code needs to enforce an interrupt event on the clock event
device in some cases. The enforcement code was stupid and naive, as it just
added the minimum delta to the current time and tried to reprogram the device.
When the minimum delta is too small, then this loops forever.
Add a sanity check. Allow reprogramming to fail 3 times, then print a warning
and double the minimum delta value to make sure, that this does not happen again.
Use the same function for both tick-oneshot and tick-broadcast code.
While chasing the C1E/HPET bugreports I went through the clock events
code inch by inch and found that the broadcast device can be initialized
and shutdown multiple times. Multiple shutdowns are not critical, but
useless waste of time. Multiple initializations are simply broken. Another
CPU might have the device in use already after the first initialization and
the second init could just render it unusable again.
In tick_oneshot_setup we program the device to the given next_event,
but we do not check the return value. We need to make sure that the
device is programmed enforced so the interrupt handler engine starts
working. Split out the reprogramming function from tick_program_event()
and call it with the device, which was handed in to tick_setup_oneshot().
Set the force argument, so the devices is firing an interrupt.
The reprogramming of the periodic broadcast handler was broken,
when the first programming returned -ETIME. The clockevents code
stores the new expiry value in the clock events device next_event field
only when the programming time has not been elapsed yet. The loop in
question calculates the new expiry value from the next_event value
and therefor never increases.
There is a ordering related problem with clockevents code, due to which
clockevents_register_device() called after tickless/highres switch
will not work. The new clockevent ends up with clockevents_handle_noop as
event handler, resulting in no timer activity.
The problematic path seems to be
* old device already has hrtimer_interrupt as the event_handler
* new clockevent device registers with a higher rating
* tick_check_new_device() is called
* clockevents_exchange_device() gets called
* old->event_handler is set to clockevents_handle_noop
* tick_setup_device() is called for the new device
* which sets new->event_handler using the old->event_handler which is noop.
Change the ordering so that new device inherits the proper handler.
This does not have any issue in normal case as most likely all the clockevent
devices are setup before the highres switch. But, can potentially be affecting
some corner case where HPET force detect happens after the highres switch.
This was a problem with HPET in MSI mode code that we have been experimenting
with.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is happening because of an improper usage of strcmp() in the
e820 parsing code. The strcmp() always returns !0 and never resets the
value for e820.nr_map and returns an incorrect user-defined map.
When EC is in Polling mode, OS will check the EC status continually by using
the following source code:
clear_bit(EC_FLAGS_WAIT_GPE, &ec->flags);
while (time_before(jiffies, delay)) {
if (acpi_ec_check_status(ec, event))
return 0;
msleep(1);
}
But msleep is realized by the function of schedule_timeout. At the same time
although one process is already waken up by some events, it won't be scheduled
immediately. So maybe there exists the following phenomena:
a. The current jiffies is already after the predefined jiffies.
But before timeout happens, OS has no chance to check the EC
status again.
b. If preemptible schedule is enabled, maybe preempt schedule will happen
before checking loop. When the process is resumed again, maybe
timeout already happens, which means that OS has no chance to check
the EC status.
In such case maybe EC status is already what OS expects when timeout happens.
But OS has no chance to check the EC status and regards it as AE_TIME.
So it will be more appropriate that OS will try to check the EC status again
when timeout happens. If the EC status is what we expect, it won't be regarded
as timeout. Only when the EC status is not what we expect, it will be regarded
as timeout, which means that EC controller can't give a response in time.
There is a race with dirty page accounting where a page may not properly
be accounted for.
clear_page_dirty_for_io() calls page_mkclean; then TestClearPageDirty.
page_mkclean walks the rmaps for that page, and for each one it cleans and
write protects the pte if it was dirty. It uses page_check_address to
find the pte. That function has a shortcut to avoid the ptl if the pte is
not present. Unfortunately, the pte can be switched to not-present then
back to present by other code while holding the page table lock -- this
should not be a signal for page_mkclean to ignore that pte, because it may
be dirty.
For example, powerpc64's set_pte_at will clear a previously present pte
before setting it to the desired value. There may also be other code in
core mm or in arch which do similar things.
The consequence of the bug is loss of data integrity due to msync, and
loss of dirty page accounting accuracy. XIP's __xip_unmap could easily
also be unreliable (depending on the exact XIP locking scheme), which can
lead to data corruption.
Fix this by having an option to always take ptl to check the pte in
page_check_address.
It's possible to retain this optimization for page_referenced and
try_to_unmap.
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jared Hulbert <jaredeh@gmail.com> Cc: Carsten Otte <cotte@freenet.de> Cc: Hugh Dickins <hugh@veritas.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
> Now some warnings:
>
> ------------[ cut here ]------------
> WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 s
> mp_call_function_mask+0x194/0x1a0()
The usual problem: the suspend function when interrupts are
already disabled calls smp_call_function which is not allowed with
interrupt off. But at this point all the other CPUs should be already
down anyways, so it should be enough to just drop that.
This patch should fix that problem at least by fixing cpu hotplug&
suspend support.
The fdiv detection code writes s32 integer into
the boot_cpu_data.fdiv_bug.
However, the boot_cpu_data.fdiv_bug is only char (s8)
field so the detection overwrites already set fields for
other bugs, e.g. the f00f bug field.
Use local s32 variable to receive result.
This is a partial fix to Bugzilla #9928 - fixes wrong
information about the f00f bug (tested) and probably
for coma bug (I have no cpu to test this).
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove the rt2x00 singlethreaded workqueue and move
the link tuner and packet filter scheduled work to
the ieee80211_hw->workqueue again.
The only exception is the interface scheduled work
handler which uses the mac80211 interface iterator
under the RTNL lock. This work needs to be handled
on the kernel workqueue to prevent lockdep issues.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
vsmp_patch has been marked with __init ever since pvops, however,
apply_paravirt can be called during module load causing calls to
freed memory location.
Since apply_paravirt can only be called during bootup and module load,
mark vsmp patch with "__init_or_module"
The callers of sg_copy_buffer must disable interrupts before calling
it (since it uses kmap_atomic). Some callers use it on
interrupt-disabled code but some need to take the trouble to disable
interrupts just for this. No wonder they forget about it and we hit a
bug like:
http://bugzilla.kernel.org/show_bug.cgi?id=11529
James said that it might be better to disable interrupts inside the
function rather than risk the callers getting it wrong.
The ocfs2_stack_driver_request() function failed to increment the
refcount of an already-active stack. It only did the increment on the
first reference. Whoops.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Tested-by: Marcos Matsunaga <marcos.matsunaga@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>