git.kernelconcepts.de Git - karo-tx-linux.git/commitdiff
Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm
author     Linus Torvalds <torvalds@linux-foundation.org>
           Mon, 21 Apr 2008 22:40:55 +0000 (15:40 -0700)
committer  Linus Torvalds <torvalds@linux-foundation.org>
           Mon, 21 Apr 2008 22:40:55 +0000 (15:40 -0700)
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (212 commits)
  [ARM] pxa: Phycore pcm-990-specific code for the PXA270 Quick Capture driver
  [ARM] pxa: V4L2 soc_camera driver for PXA270
  [ARM] pxa: restrict availability of pxa2xx PCMCIA drivers
  [ARM] 5005/1: BAST: Fix kset_name initialiser
  [ARM] 4967/1: Adds functions to set clkout rate for Samsung S3C2410
  [ARM] 4988/1: Add GPIO lib support to the EP93xx
  [ARM] Add initial sparsemem support
  [ARM] pxa: initialise PXA devices before platform init code
  [ARM] 5002/1: tosa: add two more leds
  [ARM] 5004/1: Tosa: make several unreferenced structures static.
  [ARM] 5003/1: Shut up sparse warnings
  [ARM] 4977/2: soc - pxa2xx-ac97 - Add missing clk_enable()
  [ARM] 4976/1: zylonite: Configure GPIO for WM9713 IRQ line
  [ARM] 4974/1: Drop unused leds-tosa.
  [ARM] 4973/1: Tosa: use leds-gpio driver.
  [ARM] 4972/1: Tosa: convert scoop GPIOs usage to generic gpio code
  [ARM] 4971/1: pxaficp_ir: provide startup and shutdown hooks
  [ARM] pxa: lubbock: move mis-placed SPI info
  [ARM] 4970/1: tosa: correct gpio used for wake up.
  [ARM] 4966/1: magician: add MFP pin configuration
  ...

248 files changed:
Documentation/cpusets.txt
Documentation/feature-removal-schedule.txt
Documentation/kernel-parameters.txt
Documentation/prctl/disable-tsc-ctxt-sw-stress-test.c [new file with mode: 0644]
Documentation/prctl/disable-tsc-on-off-stress-test.c [new file with mode: 0644]
Documentation/prctl/disable-tsc-test.c [new file with mode: 0644]
Documentation/scheduler/sched-rt-group.txt
arch/sh/Kconfig
arch/sh/Kconfig.debug
arch/sh/Makefile
arch/sh/boards/renesas/migor/setup.c
arch/sh/boards/renesas/r7780rp/irq-r7780mp.c
arch/sh/boards/renesas/r7780rp/setup.c
arch/sh/boards/se/7721/Makefile [new file with mode: 0644]
arch/sh/boards/se/7721/irq.c [new file with mode: 0644]
arch/sh/boards/se/7721/setup.c [new file with mode: 0644]
arch/sh/boards/se/7722/setup.c
arch/sh/configs/se7721_defconfig [new file with mode: 0644]
arch/sh/kernel/cf-enabler.c
arch/sh/kernel/cpu/sh2a/Makefile
arch/sh/kernel/cpu/sh2a/probe.c
arch/sh/kernel/cpu/sh2a/setup-mxg.c [new file with mode: 0644]
arch/sh/kernel/cpu/sh4/probe.c
arch/sh/kernel/cpu/sh4a/Makefile
arch/sh/kernel/cpu/sh4a/setup-sh7722.c
arch/sh/kernel/cpu/sh4a/setup-sh7723.c [new file with mode: 0644]
arch/sh/kernel/cpu/sh4a/setup-sh7763.c
arch/sh/kernel/cpu/sh4a/setup-sh7770.c
arch/sh/kernel/setup.c
arch/sh/lib/clear_page.S
arch/sh/lib/copy_page.S
arch/sh/mm/cache-debugfs.c
arch/sh/mm/pmb.c
arch/sh/tools/mach-types
arch/x86/Kconfig
arch/x86/boot/a20.c
arch/x86/boot/apm.c
arch/x86/boot/bitops.h
arch/x86/boot/boot.h
arch/x86/boot/cmdline.c
arch/x86/boot/compressed/head_32.S
arch/x86/boot/compressed/head_64.S
arch/x86/boot/compressed/misc.c
arch/x86/boot/compressed/vmlinux_64.lds
arch/x86/boot/copy.S
arch/x86/boot/cpucheck.c
arch/x86/boot/edd.c
arch/x86/boot/install.sh
arch/x86/boot/main.c
arch/x86/boot/mca.c
arch/x86/boot/memory.c
arch/x86/boot/pm.c
arch/x86/boot/pmjump.S
arch/x86/boot/printf.c
arch/x86/boot/string.c
arch/x86/boot/tty.c
arch/x86/boot/version.c
arch/x86/boot/video-bios.c
arch/x86/boot/video-vesa.c
arch/x86/boot/video-vga.c
arch/x86/boot/video.c
arch/x86/boot/video.h
arch/x86/boot/voyager.c
arch/x86/kernel/Makefile
arch/x86/kernel/acpi/cstate.c
arch/x86/kernel/acpi/processor.c
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
arch/x86/kernel/cpu/cpufreq/p4-clockmod.c
arch/x86/kernel/cpu/cpufreq/powernow-k8.c
arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
arch/x86/kernel/cpu/cpufreq/speedstep-ich.c
arch/x86/kernel/cpu/intel_cacheinfo.c
arch/x86/kernel/cpu/mcheck/mce_amd_64.c
arch/x86/kernel/cpu/mcheck/therm_throt.c
arch/x86/kernel/e820_32.c
arch/x86/kernel/e820_64.c
arch/x86/kernel/efi.c
arch/x86/kernel/efi_64.c
arch/x86/kernel/entry_32.S
arch/x86/kernel/genx2apic_uv_x.c
arch/x86/kernel/head64.c
arch/x86/kernel/head_32.S
arch/x86/kernel/i387.c
arch/x86/kernel/io_apic_64.c
arch/x86/kernel/kgdb.c
arch/x86/kernel/microcode.c
arch/x86/kernel/nmi_32.c
arch/x86/kernel/nmi_64.c
arch/x86/kernel/pci-calgary_64.c
arch/x86/kernel/pci-dma.c [moved from arch/x86/kernel/pci-dma_64.c with 55% similarity]
arch/x86/kernel/pci-dma_32.c [deleted file]
arch/x86/kernel/pci-gart_64.c
arch/x86/kernel/pci-nommu.c [moved from arch/x86/kernel/pci-nommu_64.c with 77% similarity]
arch/x86/kernel/pci-swiotlb_64.c
arch/x86/kernel/process.c [new file with mode: 0644]
arch/x86/kernel/process_32.c
arch/x86/kernel/process_64.c
arch/x86/kernel/reboot.c
arch/x86/kernel/setup.c
arch/x86/kernel/setup64.c
arch/x86/kernel/setup_32.c
arch/x86/kernel/setup_64.c
arch/x86/kernel/smpboot.c
arch/x86/kernel/traps_32.c
arch/x86/kernel/traps_64.c
arch/x86/kernel/tsc_32.c
arch/x86/kernel/tsc_64.c
arch/x86/mach-visws/visws_apic.c
arch/x86/mach-voyager/voyager_basic.c
arch/x86/mach-voyager/voyager_cat.c
arch/x86/mach-voyager/voyager_smp.c
arch/x86/mach-voyager/voyager_thread.c
arch/x86/math-emu/fpu_entry.c
arch/x86/math-emu/fpu_system.h
arch/x86/math-emu/reg_ld_str.c
arch/x86/mm/discontig_32.c
arch/x86/mm/init_32.c
arch/x86/mm/init_64.c
arch/x86/mm/ioremap.c
arch/x86/mm/k8topology_64.c
arch/x86/mm/numa_64.c
arch/x86/mm/pgtable_32.c
arch/x86/mm/srat_64.c
arch/x86/oprofile/nmi_int.c
arch/x86/vdso/Makefile
arch/x86/video/fbdev.c
drivers/acpi/processor_throttling.c
drivers/base/cpu.c
drivers/base/node.c
drivers/base/topology.c
drivers/firmware/dcdbas.c
drivers/input/keyboard/Kconfig
drivers/input/keyboard/Makefile
drivers/input/keyboard/sh_keysc.c [new file with mode: 0644]
drivers/pci/pci-driver.c
drivers/pci/pci-sysfs.c
drivers/pci/probe.c
drivers/rtc/rtc-sh.c
drivers/serial/sh-sci.c
drivers/serial/sh-sci.h
fs/ext2/ioctl.c
fs/ext3/ioctl.c
fs/ext4/ioctl.c
fs/fat/file.c
fs/file_table.c
fs/hfsplus/ioctl.c
fs/inode.c
fs/jfs/ioctl.c
fs/namei.c
fs/namespace.c
fs/ncpfs/ioctl.c
fs/nfs/dir.c
fs/nfsd/nfs4proc.c
fs/nfsd/nfs4recover.c
fs/nfsd/nfs4state.c
fs/nfsd/vfs.c
fs/ocfs2/ioctl.c
fs/open.c
fs/reiserfs/ioctl.c
fs/super.c
fs/utimes.c
fs/xattr.c
fs/xfs/linux-2.6/xfs_ioctl.c
fs/xfs/linux-2.6/xfs_iops.c
fs/xfs/linux-2.6/xfs_lrw.c
include/asm-alpha/topology.h
include/asm-frv/topology.h
include/asm-generic/topology.h
include/asm-ia64/topology.h
include/asm-powerpc/topology.h
include/asm-sh/bugs.h
include/asm-sh/cpu-sh4/freq.h
include/asm-sh/cpu-sh4/rtc.h
include/asm-sh/migor.h [new file with mode: 0644]
include/asm-sh/processor.h
include/asm-sh/r7780rp.h
include/asm-sh/se7721.h [new file with mode: 0644]
include/asm-sh/se7722.h
include/asm-sh/sh_keysc.h [new file with mode: 0644]
include/asm-sh/system.h
include/asm-sh/topology.h
include/asm-sh/uaccess_32.h
include/asm-x86/boot.h
include/asm-x86/dma-mapping.h
include/asm-x86/dma-mapping_32.h [deleted file]
include/asm-x86/dma-mapping_64.h [deleted file]
include/asm-x86/e820_32.h
include/asm-x86/genapic_32.h
include/asm-x86/i387.h
include/asm-x86/numa_64.h
include/asm-x86/pci_64.h
include/asm-x86/processor.h
include/asm-x86/scatterlist.h
include/asm-x86/thread_info.h
include/asm-x86/thread_info_32.h
include/asm-x86/thread_info_64.h
include/asm-x86/topology.h
include/asm-x86/tsc.h
include/linux/bitmap.h
include/linux/cpumask.h
include/linux/cpuset.h
include/linux/efi.h
include/linux/file.h
include/linux/fs.h
include/linux/init_task.h
include/linux/irqflags.h
include/linux/ktime.h
include/linux/list.h
include/linux/mount.h
include/linux/prctl.h
include/linux/sched.h
include/linux/sysdev.h
include/linux/topology.h
init/Kconfig
init/main.c
ipc/mqueue.c
kernel/compat.c
kernel/cpu.c
kernel/cpuset.c
kernel/fork.c
kernel/irq/chip.c
kernel/kmod.c
kernel/kthread.c
kernel/latencytop.c
kernel/rcupreempt.c
kernel/rcutorture.c
kernel/sched.c
kernel/sched_debug.c
kernel/sched_fair.c
kernel/sched_features.h [new file with mode: 0644]
kernel/sched_rt.c
kernel/sched_stats.h
kernel/softirq.c
kernel/stop_machine.c
kernel/sys.c
kernel/sysctl.c
kernel/time/tick-sched.c
kernel/time/timekeeping.c
kernel/user.c
lib/Kconfig.debug
lib/bitmap.c
mm/allocpercpu.c
mm/page_alloc.c
mm/pdflush.c
mm/slab.c
mm/vmscan.c
net/sunrpc/svc.c
net/unix/af_unix.c

diff --git a/Documentation/cpusets.txt b/Documentation/cpusets.txt
index ad2bb3b3acc1792a2f8a1c2ccdb999c37aaf794c..aa854b9b18cda8de6fae047540a529b55ca81405 100644
@@ -8,6 +8,7 @@ Portions Copyright (c) 2004-2006 Silicon Graphics, Inc.
 Modified by Paul Jackson <pj@sgi.com>
 Modified by Christoph Lameter <clameter@sgi.com>
 Modified by Paul Menage <menage@google.com>
+Modified by Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
 
 CONTENTS:
 =========
@@ -20,7 +21,8 @@ CONTENTS:
   1.5 What is memory_pressure ?
   1.6 What is memory spread ?
   1.7 What is sched_load_balance ?
-  1.8 How do I use cpusets ?
+  1.8 What is sched_relax_domain_level ?
+  1.9 How do I use cpusets ?
 2. Usage Examples and Syntax
   2.1 Basic Usage
   2.2 Adding/removing cpus
@@ -497,7 +499,73 @@ the cpuset code to update these sched domains, it compares the new
 partition requested with the current, and updates its sched domains,
 removing the old and adding the new, for each change.
 
-1.8 How do I use cpusets ?
+
+1.8 What is sched_relax_domain_level ?
+--------------------------------------
+
+Within a sched domain, the scheduler migrates tasks in two ways: periodic
+load balancing on the tick, and at the time of certain scheduling events.
+
+When a task is woken up, the scheduler tries to move it to an idle CPU.
+For example, if task A running on CPU X activates another task B on the
+same CPU X, and if CPU Y is a sibling of X and is idle, the scheduler
+migrates task B to CPU Y so that task B can start on CPU Y without
+waiting for task A on CPU X.
+
+Likewise, if a CPU runs out of tasks in its runqueue, it tries to pull
+extra tasks from other busy CPUs to help them before it goes idle.
+
+Of course, finding movable tasks and/or idle CPUs has a search cost, so
+the scheduler might not search all CPUs in the domain every time.  In
+fact, on some architectures the search range for these events is limited
+to the same socket or node as the CPU, while the load balance on the
+tick searches all of them.
+
+For example, assume CPU Z is relatively far from CPU X.  Even if CPU Z
+is idle while CPU X and its siblings are busy, the scheduler cannot
+migrate the woken task B from X to Z since Z is outside its search
+range.  As a result, task B on CPU X has to wait for task A or for the
+load balance on the next tick.  For some applications in special
+situations, waiting one tick may be too long.
+
+The 'sched_relax_domain_level' file allows you to request a different
+search range.  It takes an integer value that indicates the size of the
+search range in levels, as follows; the initial value of -1 indicates
+that the cpuset has no request.
+
+  -1  : no request. use system default or follow request of others.
+   0  : no search.
+   1  : search siblings (hyperthreads in a core).
+   2  : search cores in a package.
+   3  : search cpus in a node [= system wide on non-NUMA system]
+ ( 4  : search nodes in a chunk of node [on NUMA system] )
+ ( 5~ : search system wide [on NUMA system])
+
+This file is per-cpuset and affects the sched domain the cpuset belongs
+to.  Therefore, if the flag 'sched_load_balance' of a cpuset is
+disabled, 'sched_relax_domain_level' has no effect since there is no
+sched domain belonging to the cpuset.
+
+If multiple cpusets overlap and hence form a single sched domain, the
+largest value among them is used.  Be careful: if one cpuset requests 0
+and the others request -1, then 0 is used.
+
+Note that modifying this file will have both good and bad effects, and
+whether they are acceptable depends on your situation.  Don't modify
+this file if you are not sure.
+
+If your situation is:
+ - The migration cost between CPUs can be assumed to be considerably
+   small (for you), due to your application's behaviour or special
+   hardware support for the CPU cache, etc.
+ - The search cost has no impact (for you), or you can make it small
+   enough, e.g. by keeping cpusets compact.
+ - Low latency is required even at the cost of cache hit rate, etc.
+then increasing 'sched_relax_domain_level' would benefit you.
+
+
+1.9 How do I use cpusets ?
 --------------------------
 
 In order to minimize the impact of cpusets on critical kernel
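As an illustration of how the sched_relax_domain_level file added above is driven from user space, here is a minimal C sketch. The /dev/cpuset mount point and the "rt-set" cpuset name are assumptions for the example, not part of the patch.

/*
 * Minimal sketch: request a wider wake-up search range for one cpuset.
 * Assumes the cpuset filesystem is mounted at /dev/cpuset and a cpuset
 * named "rt-set" already exists; both names are placeholders.
 */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	const char *path = "/dev/cpuset/rt-set/sched_relax_domain_level";
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return EXIT_FAILURE;
	}

	/* 2 = search all cores in the package on wake-up (see the level
	 * table above); writing -1 would restore "no request". */
	fprintf(f, "2\n");

	if (fclose(f) != 0) {
		perror(path);
		return EXIT_FAILURE;
	}
	return EXIT_SUCCESS;
}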
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index af0e9393bf684d62c70afcaa1c620709fceebb86..309c47b91598e4fba5d23eaec16234d560624895 100644
@@ -282,6 +282,13 @@ Why:       Not used in-tree. The current out-of-tree users used it to
        out-of-tree driver.
 Who:   Thomas Gleixner <tglx@linutronix.de>
 
+----------------------------
+
+What:  usedac i386 kernel parameter
+When:  2.6.27
+Why:   replaced by allowdac and no dac combination
+Who:   Glauber Costa <gcosta@redhat.com>
+
 ---------------------------
 
 What:  /sys/o2cb symlink
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 4b0f1ae31a4c152f4547050e2b3d8cc6b375bb31..f4839606988beb27b95ac651eb6d12d71bd13279 100644
@@ -1280,8 +1280,16 @@ and is between 256 and 4096 characters. It is defined in the file
        noexec          [IA-64]
 
        noexec          [X86-32,X86-64]
+                       On X86-32 available only on PAE configured kernels.
                        noexec=on: enable non-executable mappings (default)
-                       noexec=off: disable nn-executable mappings
+                       noexec=off: disable non-executable mappings
+
+       noexec32        [X86-64]
+                       This affects only 32-bit executables.
+                       noexec32=on: enable non-executable mappings (default)
+                               read doesn't imply executable mappings
+                       noexec32=off: disable non-executable mappings
+                               read implies executable mappings
 
        nofxsr          [BUGS=X86-32] Disables x86 floating point extended
                        register save and restore. The kernel will only save
diff --git a/Documentation/prctl/disable-tsc-ctxt-sw-stress-test.c b/Documentation/prctl/disable-tsc-ctxt-sw-stress-test.c
new file mode 100644
index 0000000..f8e8e95
--- /dev/null
@@ -0,0 +1,96 @@
+/*
+ * Tests for prctl(PR_GET_TSC, ...) / prctl(PR_SET_TSC, ...)
+ *
+ * Tests if the control register is updated correctly
+ * at context switches
+ *
+ * Warning: this test will cause a very high load for a few seconds
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <inttypes.h>
+#include <wait.h>
+
+
+#include <sys/prctl.h>
+#include <linux/prctl.h>
+
+/* Get/set the process' ability to use the timestamp counter instruction */
+#ifndef PR_GET_TSC
+#define PR_GET_TSC 25
+#define PR_SET_TSC 26
+# define PR_TSC_ENABLE         1   /* allow the use of the timestamp counter */
+# define PR_TSC_SIGSEGV                2   /* throw a SIGSEGV instead of reading the TSC */
+#endif
+
+uint64_t rdtsc() {
+uint32_t lo, hi;
+/* We cannot use "=A", since this would use %rax on x86_64 */
+__asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
+return (uint64_t)hi << 32 | lo;
+}
+
+void sigsegv_expect(int sig)
+{
+       /* */
+}
+
+void segvtask(void)
+{
+       if (prctl(PR_SET_TSC, PR_TSC_SIGSEGV) < 0)
+       {
+               perror("prctl");
+               exit(0);
+       }
+       signal(SIGSEGV, sigsegv_expect);
+       alarm(10);
+       rdtsc();
+       fprintf(stderr, "FATAL ERROR, rdtsc() succeeded while disabled\n");
+       exit(0);
+}
+
+
+void sigsegv_fail(int sig)
+{
+       fprintf(stderr, "FATAL ERROR, rdtsc() failed while enabled\n");
+       exit(0);
+}
+
+void rdtsctask(void)
+{
+       if (prctl(PR_SET_TSC, PR_TSC_ENABLE) < 0)
+       {
+               perror("prctl");
+               exit(0);
+       }
+       signal(SIGSEGV, sigsegv_fail);
+       alarm(10);
+       for(;;) rdtsc();
+}
+
+
+int main(int argc, char **argv)
+{
+       int n_tasks = 100, i;
+
+       fprintf(stderr, "[No further output means we're allright]\n");
+
+       for (i=0; i<n_tasks; i++)
+               if (fork() == 0)
+               {
+                       if (i & 1)
+                               segvtask();
+                       else
+                               rdtsctask();
+               }
+
+       for (i=0; i<n_tasks; i++)
+               wait(NULL);
+
+       exit(0);
+}
+
diff --git a/Documentation/prctl/disable-tsc-on-off-stress-test.c b/Documentation/prctl/disable-tsc-on-off-stress-test.c
new file mode 100644
index 0000000..1fcd914
--- /dev/null
@@ -0,0 +1,95 @@
+/*
+ * Tests for prctl(PR_GET_TSC, ...) / prctl(PR_SET_TSC, ...)
+ *
+ * Tests if the control register is updated correctly
+ * when set with prctl()
+ *
+ * Warning: this test will cause a very high load for a few seconds
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <inttypes.h>
+#include <wait.h>
+
+
+#include <sys/prctl.h>
+#include <linux/prctl.h>
+
+/* Get/set the process' ability to use the timestamp counter instruction */
+#ifndef PR_GET_TSC
+#define PR_GET_TSC 25
+#define PR_SET_TSC 26
+# define PR_TSC_ENABLE         1   /* allow the use of the timestamp counter */
+# define PR_TSC_SIGSEGV                2   /* throw a SIGSEGV instead of reading the TSC */
+#endif
+
+/* snippet from wikipedia :-) */
+
+uint64_t rdtsc() {
+uint32_t lo, hi;
+/* We cannot use "=A", since this would use %rax on x86_64 */
+__asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
+return (uint64_t)hi << 32 | lo;
+}
+
+int should_segv = 0;
+
+void sigsegv_cb(int sig)
+{
+       if (!should_segv)
+       {
+               fprintf(stderr, "FATAL ERROR, rdtsc() failed while enabled\n");
+               exit(0);
+       }
+       if (prctl(PR_SET_TSC, PR_TSC_ENABLE) < 0)
+       {
+               perror("prctl");
+               exit(0);
+       }
+       should_segv = 0;
+
+       rdtsc();
+}
+
+void task(void)
+{
+       signal(SIGSEGV, sigsegv_cb);
+       alarm(10);
+       for(;;)
+       {
+               rdtsc();
+               if (should_segv)
+               {
+                       fprintf(stderr, "FATAL ERROR, rdtsc() succeeded while disabled\n");
+                       exit(0);
+               }
+               if (prctl(PR_SET_TSC, PR_TSC_SIGSEGV) < 0)
+               {
+                       perror("prctl");
+                       exit(0);
+               }
+               should_segv = 1;
+       }
+}
+
+
+int main(int argc, char **argv)
+{
+       int n_tasks = 100, i;
+
+       fprintf(stderr, "[No further output means we're allright]\n");
+
+       for (i=0; i<n_tasks; i++)
+               if (fork() == 0)
+                       task();
+
+       for (i=0; i<n_tasks; i++)
+               wait(NULL);
+
+       exit(0);
+}
+
diff --git a/Documentation/prctl/disable-tsc-test.c b/Documentation/prctl/disable-tsc-test.c
new file mode 100644
index 0000000..843c81e
--- /dev/null
@@ -0,0 +1,94 @@
+/*
+ * Tests for prctl(PR_GET_TSC, ...) / prctl(PR_SET_TSC, ...)
+ *
+ * Basic test to test behaviour of PR_GET_TSC and PR_SET_TSC
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <inttypes.h>
+
+
+#include <sys/prctl.h>
+#include <linux/prctl.h>
+
+/* Get/set the process' ability to use the timestamp counter instruction */
+#ifndef PR_GET_TSC
+#define PR_GET_TSC 25
+#define PR_SET_TSC 26
+# define PR_TSC_ENABLE         1   /* allow the use of the timestamp counter */
+# define PR_TSC_SIGSEGV                2   /* throw a SIGSEGV instead of reading the TSC */
+#endif
+
+const char *tsc_names[] =
+{
+       [0] = "[not set]",
+       [PR_TSC_ENABLE] = "PR_TSC_ENABLE",
+       [PR_TSC_SIGSEGV] = "PR_TSC_SIGSEGV",
+};
+
+uint64_t rdtsc() {
+uint32_t lo, hi;
+/* We cannot use "=A", since this would use %rax on x86_64 */
+__asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
+return (uint64_t)hi << 32 | lo;
+}
+
+void sigsegv_cb(int sig)
+{
+       int tsc_val = 0;
+
+       printf("[ SIG_SEGV ]\n");
+       printf("prctl(PR_GET_TSC, &tsc_val); ");
+       fflush(stdout);
+
+       if ( prctl(PR_GET_TSC, &tsc_val) == -1)
+               perror("prctl");
+
+       printf("tsc_val == %s\n", tsc_names[tsc_val]);
+       printf("prctl(PR_SET_TSC, PR_TSC_ENABLE)\n");
+       fflush(stdout);
+       if ( prctl(PR_SET_TSC, PR_TSC_ENABLE) == -1)
+               perror("prctl");
+
+       printf("rdtsc() == ");
+}
+
+int main(int argc, char **argv)
+{
+       int tsc_val = 0;
+
+       signal(SIGSEGV, sigsegv_cb);
+
+       printf("rdtsc() == %llu\n", (unsigned long long)rdtsc());
+       printf("prctl(PR_GET_TSC, &tsc_val); ");
+       fflush(stdout);
+
+       if ( prctl(PR_GET_TSC, &tsc_val) == -1)
+               perror("prctl");
+
+       printf("tsc_val == %s\n", tsc_names[tsc_val]);
+       printf("rdtsc() == %llu\n", (unsigned long long)rdtsc());
+       printf("prctl(PR_SET_TSC, PR_TSC_ENABLE)\n");
+       fflush(stdout);
+
+       if ( prctl(PR_SET_TSC, PR_TSC_ENABLE) == -1)
+               perror("prctl");
+
+       printf("rdtsc() == %llu\n", (unsigned long long)rdtsc());
+       printf("prctl(PR_SET_TSC, PR_TSC_SIGSEGV)\n");
+       fflush(stdout);
+
+       if ( prctl(PR_SET_TSC, PR_TSC_SIGSEGV) == -1)
+               perror("prctl");
+
+       printf("rdtsc() == ");
+       fflush(stdout);
+       printf("%llu\n", (unsigned long long)rdtsc());
+       fflush(stdout);
+
+       exit(EXIT_SUCCESS);
+}
+
diff --git a/Documentation/scheduler/sched-rt-group.txt b/Documentation/scheduler/sched-rt-group.txt
index 1c6332f4543c350889eae9ba2b4c766270c1b65e..14f901f639ee3df948a5d4cf9b10b802c7e3f310 100644
+                               Real-Time group scheduling
+                               --------------------------
 
+CONTENTS
+========
 
-Real-Time group scheduling.
+1. Overview
+  1.1 The problem
+  1.2 The solution
+2. The interface
+  2.1 System-wide settings
+  2.2 Default behaviour
+  2.3 Basis for grouping tasks
+3. Future plans
 
-The problem space:
 
-In order to schedule multiple groups of realtime tasks each group must
-be assigned a fixed portion of the CPU time available. Without a minimum
-guarantee a realtime group can obviously fall short. A fuzzy upper limit
-is of no use since it cannot be relied upon. Which leaves us with just
-the single fixed portion.
+1. Overview
+===========
 
-CPU time is divided by means of specifying how much time can be spent
-running in a given period. Say a frame fixed realtime renderer must
-deliver 25 frames a second, which yields a period of 0.04s. Now say
-it will also have to play some music and respond to input, leaving it
-with around 80% for the graphics. We can then give this group a runtime
-of 0.8 * 0.04s = 0.032s.
 
-This way the graphics group will have a 0.04s period with a 0.032s runtime
-limit.
+1.1 The problem
+---------------
 
-Now if the audio thread needs to refill the DMA buffer every 0.005s, but
-needs only about 3% CPU time to do so, it can do with a 0.03 * 0.005s
-= 0.00015s.
+Realtime scheduling is all about determinism: a group has to be able to rely on
+the amount of bandwidth (e.g. CPU time) being constant. In order to schedule
+multiple groups of realtime tasks, each group must be assigned a fixed portion
+of the CPU time available.  Without a minimum guarantee a realtime group can
+obviously fall short. A fuzzy upper limit is of no use since it cannot be
+relied upon, which leaves us with just the single fixed portion.
 
+1.2 The solution
+----------------
 
-The Interface:
+CPU time is divided by means of specifying how much time can be spent running
+in a given period. We allocate this "run time" for each realtime group which
+the other realtime groups will not be permitted to use.
 
-system wide:
+Any time not allocated to a realtime group will be used to run normal priority
+tasks (SCHED_OTHER). Any allocated run time not used will also be picked up by
+SCHED_OTHER.
 
-/proc/sys/kernel/sched_rt_period_ms
-/proc/sys/kernel/sched_rt_runtime_us
+Let's consider an example: a frame fixed realtime renderer must deliver 25
+frames a second, which yields a period of 0.04s per frame. Now say it will also
+have to play some music and respond to input, leaving it with around 80% CPU
+time dedicated to the graphics. We can then give this group a run time of 0.8
+* 0.04s = 0.032s.
 
-CONFIG_FAIR_USER_SCHED
+This way the graphics group will have a 0.04s period with a 0.032s run time
+limit. Now if the audio thread needs to refill the DMA buffer every 0.005s, but
+needs only about 3% CPU time to do so, it can do with a 0.03 * 0.005s =
+0.00015s. So this group can be scheduled with a period of 0.005s and a run time
+of 0.00015s.
 
-/sys/kernel/uids/<uid>/cpu_rt_runtime_us
+The remaining CPU time will be used for user input and other tasks. Because
+realtime tasks have explicitly allocated the CPU time they need to perform
+their tasks, buffer underruns in the graphics or audio can be eliminated.
 
-or
+NOTE: the above example is not fully implemented as of yet (2.6.25). We still
+lack an EDF scheduler to make non-uniform periods usable.
 
-CONFIG_FAIR_CGROUP_SCHED
 
-/cgroup/<cgroup>/cpu.rt_runtime_us
+2. The Interface
+================
 
-[ time is specified in us because the interface is s32; this gives an
-  operating range of ~35m to 1us ]
 
-The period takes values in [ 1, INT_MAX ], runtime in [ -1, INT_MAX - 1 ].
+2.1 System wide settings
+------------------------
 
-A runtime of -1 specifies runtime == period, ie. no limit.
+The system wide settings are configured under the /proc virtual file system:
 
-New groups get the period from /proc/sys/kernel/sched_rt_period_us and
-a runtime of 0.
+/proc/sys/kernel/sched_rt_period_us:
+  The scheduling period that is equivalent to 100% CPU bandwidth
 
-Settings are constrained to:
+/proc/sys/kernel/sched_rt_runtime_us:
+  A global limit on how much time realtime scheduling may use.  Even without
+  CONFIG_RT_GROUP_SCHED enabled, this will limit time reserved to realtime
+  processes. With CONFIG_RT_GROUP_SCHED it signifies the total bandwidth
+  available to all realtime groups.
+
+  * Time is specified in us because the interface is s32. This gives an
+    operating range from 1us to about 35 minutes.
+  * sched_rt_period_us takes values from 1 to INT_MAX.
+  * sched_rt_runtime_us takes values from -1 to (INT_MAX - 1).
+  * A run time of -1 specifies runtime == period, ie. no limit.
+
+
+2.2 Default behaviour
+---------------------
+
+The default values are 1000000 (1s) for sched_rt_period_us and 950000 (0.95s)
+for sched_rt_runtime_us.  This leaves 0.05s to be used by SCHED_OTHER (non-RT
+tasks). These defaults were chosen so that a run-away realtime task will not
+lock up the machine but will leave a little time to recover it.  Setting
+runtime to -1 gives you the old behaviour back.
+
+By default all bandwidth is assigned to the root group and new groups get the
+period from /proc/sys/kernel/sched_rt_period_us and a run time of 0. If you
+want to assign bandwidth to another group, reduce the root group's bandwidth
+and assign some or all of the difference to another group.
+
+Realtime group scheduling means you have to assign a portion of total CPU
+bandwidth to the group before it will accept realtime tasks. Therefore you will
+not be able to run realtime tasks as any user other than root until you have
+done that, even if the user has the rights to run processes with realtime
+priority!
+
+
+2.3 Basis for grouping tasks
+----------------------------
+
+There are two compile-time settings for allocating CPU bandwidth. These are
+configured using the "Basis for grouping tasks" multiple choice menu under
+General setup > Group CPU Scheduler:
+
+a. CONFIG_USER_SCHED (aka "Basis for grouping tasks" =  "user id")
+
+This lets you use the virtual files under
+"/sys/kernel/uids/<uid>/cpu_rt_runtime_us" to control the CPU time reserved
+for each user.
+
+The other option is:
+
+b. CONFIG_CGROUP_SCHED (aka "Basis for grouping tasks" = "Control groups")
+
+This uses the /cgroup virtual file system and "/cgroup/<cgroup>/cpu.rt_runtime_us"
+to control the CPU time reserved for each control group instead.
+
+For more information on working with control groups, you should read
+Documentation/cgroups.txt as well.
+
+Group settings are checked against the following limits in order to keep the configuration
+schedulable:
 
    \Sum_{i} runtime_{i} / global_period <= global_runtime / global_period
 
-in order to keep the configuration schedulable.
+For now, this can be simplified to just the following (but see Future plans):
+
+   \Sum_{i} runtime_{i} <= global_runtime
+
+
+3. Future plans
+===============
+
+There is work in progress to make the scheduling period for each group
+("/sys/kernel/uids/<uid>/cpu_rt_period_us" or
+"/cgroup/<cgroup>/cpu.rt_period_us" respectively) configurable as well.
+
+The constraint on the period is that a subgroup must have a smaller or
+equal period to its parent. But realistically it's not very useful _yet_
+as it's prone to starvation without deadline scheduling.
+
+Consider two sibling groups A and B; both have 50% bandwidth, but A's
+period is twice the length of B's.
+
+* group A: period=100000us, runtime=10000us
+       - this runs for 0.01s once every 0.1s
+
+* group B: period= 50000us, runtime=10000us
+       - this runs for 0.01s twice every 0.1s (or once every 0.05 sec).
+
+This means that currently a while (1) loop in A will run for the full period of
+B and can starve B's tasks (assuming they are of lower priority) for a whole
+period.
+
+The next project will be SCHED_EDF (Earliest Deadline First scheduling) to bring
+full deadline scheduling to the Linux kernel. Deadline scheduling the above
+groups and treating the end of the period as a deadline will ensure that they
+both get their allocated time.
+
+Implementing SCHED_EDF might take a while to complete. Priority Inheritance is
+the biggest challenge, as the current Linux PI infrastructure is geared towards
+the limited static priority levels 0-139. With deadline scheduling you need to
+do deadline inheritance, since priority is inversely proportional to the
+deadline delta (deadline - now).
+
+This means the whole PI machinery will have to be reworked - and that is one of
+the most complex pieces of code we have.
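Following the bandwidth arithmetic and the root-group rule in section 2.2 above, the hedged C sketch below frees bandwidth in the root group and hands 80% of the default 1s period to one control group. It assumes CONFIG_CGROUP_SCHED, a cgroup mount at /cgroup and an existing group named "graphics"; the mount point and group name are illustrative, not taken from the patch.

/*
 * Hedged sketch: give one control group 80% of the realtime bandwidth, in
 * the spirit of the graphics example above.  Assumes CONFIG_CGROUP_SCHED,
 * the cgroup filesystem mounted at /cgroup and an existing group
 * "graphics".  Per-group periods are not configurable yet (see "Future
 * plans"), so the default 1s global period is left alone.
 */
#include <stdio.h>
#include <stdlib.h>

static int write_long(const char *path, long val)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return -1;
	}
	fprintf(f, "%ld\n", val);
	return fclose(f);	/* 0 on success */
}

int main(void)
{
	long global_runtime_us = 950000;	/* sched_rt_runtime_us default */
	long graphics_us = 800000;		/* 80% of the 1s global period */

	/*
	 * Free bandwidth in the root group first, then assign it to the
	 * child group; the sum of the runtimes must stay within the global
	 * runtime, per the constraint quoted above.
	 */
	if (write_long("/cgroup/cpu.rt_runtime_us",
		       global_runtime_us - graphics_us) ||
	    write_long("/cgroup/graphics/cpu.rt_runtime_us", graphics_us))
		return EXIT_FAILURE;

	return EXIT_SUCCESS;
}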
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 8d2cd1de57265c947a340396613dd2b83ff1e5a7..6a679c3e15e817d14adcee8d6771c9e3c89e5b30 100644
@@ -167,6 +167,12 @@ config CPU_SUBTYPE_SH7263
        select CPU_SH2A
        select CPU_HAS_FPU
 
+config CPU_SUBTYPE_MXG
+       bool "Support MX-G processor"
+       select CPU_SH2A
+       help
+         Select MX-G if running on an R8A03022BG part.
+
 # SH-3 Processor Support
 
 config CPU_SUBTYPE_SH7705
@@ -270,6 +276,15 @@ config CPU_SUBTYPE_SH4_202
 
 # SH-4A Processor Support
 
+config CPU_SUBTYPE_SH7723
+       bool "Support SH7723 processor"
+       select CPU_SH4A
+       select CPU_SHX2
+       select ARCH_SPARSEMEM_ENABLE
+       select SYS_SUPPORTS_NUMA
+       help
+         Select SH7723 if you have an SH-MobileR2 CPU.
+
 config CPU_SUBTYPE_SH7763
        bool "Support SH7763 processor"
        select CPU_SH4A
@@ -366,6 +381,14 @@ config SH_7619_SOLUTION_ENGINE
          Select 7619 SolutionEngine if configuring for a Hitachi SH7619
          evaluation board.
        
+config SH_7721_SOLUTION_ENGINE
+       bool "SolutionEngine7721"
+       select SOLUTION_ENGINE
+       depends on CPU_SUBTYPE_SH7721
+       help
+         Select 7721 SolutionEngine if configuring for a Hitachi SH7721
+         evaluation board.
+
 config SH_7722_SOLUTION_ENGINE
        bool "SolutionEngine7722"
        select SOLUTION_ENGINE
@@ -560,7 +583,7 @@ config SH_TMU
 config SH_CMT
        def_bool y
        prompt "CMT timer support"
-       depends on CPU_SH2
+       depends on CPU_SH2 && !CPU_SUBTYPE_MXG
        help
          This enables the use of the CMT as the system timer.
 
@@ -578,6 +601,7 @@ config SH_TIMER_IRQ
        default "86" if CPU_SUBTYPE_SH7619
        default "140" if CPU_SUBTYPE_SH7206
        default "142" if CPU_SUBTYPE_SH7203
+       default "238" if CPU_SUBTYPE_MXG
        default "16"
 
 config SH_PCLK_FREQ
@@ -585,10 +609,10 @@ config SH_PCLK_FREQ
        default "27000000" if CPU_SUBTYPE_SH7343
        default "31250000" if CPU_SUBTYPE_SH7619
        default "32000000" if CPU_SUBTYPE_SH7722
-       default "33333333" if CPU_SUBTYPE_SH7770 || \
+       default "33333333" if CPU_SUBTYPE_SH7770 || CPU_SUBTYPE_SH7723 || \
                              CPU_SUBTYPE_SH7760 || CPU_SUBTYPE_SH7705 || \
                              CPU_SUBTYPE_SH7203 || CPU_SUBTYPE_SH7206 || \
-                             CPU_SUBTYPE_SH7263
+                             CPU_SUBTYPE_SH7263 || CPU_SUBTYPE_MXG
        default "60000000" if CPU_SUBTYPE_SH7751 || CPU_SUBTYPE_SH7751R
        default "66000000" if CPU_SUBTYPE_SH4_202
        default "50000000"
diff --git a/arch/sh/Kconfig.debug b/arch/sh/Kconfig.debug
index 5dcb74b947a968ca0b18ddfbf7e1da8ed7958209..d9d28f9dd0db5fbd990a1262d07898017e73cc04 100644
@@ -29,16 +29,17 @@ config EARLY_SCIF_CONSOLE
 config EARLY_SCIF_CONSOLE_PORT
        hex
        depends on EARLY_SCIF_CONSOLE
-       default "0xffe00000" if CPU_SUBTYPE_SH7780 || CPU_SUBTYPE_SH7763
-       default "0xffe00000" if CPU_SUBTYPE_SH7722 || CPU_SUBTYPE_SH7366
-       default "0xffea0000" if CPU_SUBTYPE_SH7785
-       default "0xfffe8000" if CPU_SUBTYPE_SH7203
-       default "0xfffe9800" if CPU_SUBTYPE_SH7206 || CPU_SUBTYPE_SH7263
-       default "0xf8420000" if CPU_SUBTYPE_SH7619
        default "0xa4400000" if CPU_SUBTYPE_SH7712 || CPU_SUBTYPE_SH7705
        default "0xa4430000" if CPU_SUBTYPE_SH7720 || CPU_SUBTYPE_SH7721
+       default "0xf8420000" if CPU_SUBTYPE_SH7619
+       default "0xff804000" if CPU_SUBTYPE_MXG
        default "0xffc30000" if CPU_SUBTYPE_SHX3
+       default "0xffe00000" if CPU_SUBTYPE_SH7780 || CPU_SUBTYPE_SH7763 || \
+                               CPU_SUBTYPE_SH7722 || CPU_SUBTYPE_SH7366
        default "0xffe80000" if CPU_SH4
+       default "0xffea0000" if CPU_SUBTYPE_SH7785
+       default "0xfffe8000" if CPU_SUBTYPE_SH7203
+       default "0xfffe9800" if CPU_SUBTYPE_SH7206 || CPU_SUBTYPE_SH7263
        default "0x00000000"
 
 config EARLY_PRINTK
diff --git a/arch/sh/Makefile b/arch/sh/Makefile
index cffc92b1bf2e8d7236d29cdf599bd7ee07ecf553..bb06f83e6239c648e138a1d820f0d3e6dab7fa51 100644
@@ -107,6 +107,7 @@ machdir-$(CONFIG_SH_7722_SOLUTION_ENGINE)   += se/7722
 machdir-$(CONFIG_SH_7751_SOLUTION_ENGINE)      += se/7751
 machdir-$(CONFIG_SH_7780_SOLUTION_ENGINE)      += se/7780
 machdir-$(CONFIG_SH_7343_SOLUTION_ENGINE)      += se/7343
+machdir-$(CONFIG_SH_7721_SOLUTION_ENGINE)      += se/7721
 machdir-$(CONFIG_SH_HP6XX)                     += hp6xx
 machdir-$(CONFIG_SH_DREAMCAST)                 += dreamcast
 machdir-$(CONFIG_SH_MPC1211)                   += mpc1211
diff --git a/arch/sh/boards/renesas/migor/setup.c b/arch/sh/boards/renesas/migor/setup.c
index 21ab8c8fb590a9d2c7f41429895b2aa952e834ff..00d52a20d8a59425105f0307aa227d66b4871308 100644
 #include <linux/init.h>
 #include <linux/platform_device.h>
 #include <linux/interrupt.h>
+#include <linux/input.h>
+#include <linux/mtd/physmap.h>
+#include <linux/mtd/nand.h>
+#include <linux/i2c.h>
 #include <asm/machvec.h>
 #include <asm/io.h>
+#include <asm/sh_keysc.h>
+#include <asm/migor.h>
 
 /* Address     IRQ  Size  Bus  Description
  * 0x00000000       64MB  16   NOR Flash (SP29PL256N)
@@ -23,9 +29,9 @@
 
 static struct resource smc91x_eth_resources[] = {
        [0] = {
-               .name   = "smc91x-regs" ,
-               .start  = P2SEGADDR(0x10000300),
-               .end    = P2SEGADDR(0x1000030f),
+               .name   = "SMC91C111" ,
+               .start  = 0x10000300,
+               .end    = 0x1000030f,
                .flags  = IORESOURCE_MEM,
        },
        [1] = {
@@ -40,19 +46,202 @@ static struct platform_device smc91x_eth_device = {
        .resource       = smc91x_eth_resources,
 };
 
+static struct sh_keysc_info sh_keysc_info = {
+       .mode = SH_KEYSC_MODE_2, /* KEYOUT0->4, KEYIN1->5 */
+       .scan_timing = 3,
+       .delay = 5,
+       .keycodes = {
+               0, KEY_UP, KEY_DOWN, KEY_LEFT, KEY_RIGHT, KEY_ENTER,
+               0, KEY_F, KEY_C, KEY_D, KEY_H, KEY_1,
+               0, KEY_2, KEY_3, KEY_4, KEY_5, KEY_6,
+               0, KEY_7, KEY_8, KEY_9, KEY_S, KEY_0,
+               0, KEY_P, KEY_STOP, KEY_REWIND, KEY_PLAY, KEY_FASTFORWARD,
+       },
+};
+
+static struct resource sh_keysc_resources[] = {
+       [0] = {
+               .start  = 0x044b0000,
+               .end    = 0x044b000f,
+               .flags  = IORESOURCE_MEM,
+       },
+       [1] = {
+               .start  = 79,
+               .flags  = IORESOURCE_IRQ,
+       },
+};
+
+static struct platform_device sh_keysc_device = {
+       .name           = "sh_keysc",
+       .num_resources  = ARRAY_SIZE(sh_keysc_resources),
+       .resource       = sh_keysc_resources,
+       .dev    = {
+               .platform_data  = &sh_keysc_info,
+       },
+};
+
+static struct mtd_partition migor_nor_flash_partitions[] =
+{
+       {
+               .name = "uboot",
+               .offset = 0,
+               .size = (1 * 1024 * 1024),
+               .mask_flags = MTD_WRITEABLE,    /* Read-only */
+       },
+       {
+               .name = "rootfs",
+               .offset = MTDPART_OFS_APPEND,
+               .size = (15 * 1024 * 1024),
+       },
+       {
+               .name = "other",
+               .offset = MTDPART_OFS_APPEND,
+               .size = MTDPART_SIZ_FULL,
+       },
+};
+
+static struct physmap_flash_data migor_nor_flash_data = {
+       .width          = 2,
+       .parts          = migor_nor_flash_partitions,
+       .nr_parts       = ARRAY_SIZE(migor_nor_flash_partitions),
+};
+
+static struct resource migor_nor_flash_resources[] = {
+       [0] = {
+               .name           = "NOR Flash",
+               .start          = 0x00000000,
+               .end            = 0x03ffffff,
+               .flags          = IORESOURCE_MEM,
+       }
+};
+
+static struct platform_device migor_nor_flash_device = {
+       .name           = "physmap-flash",
+       .resource       = migor_nor_flash_resources,
+       .num_resources  = ARRAY_SIZE(migor_nor_flash_resources),
+       .dev            = {
+               .platform_data = &migor_nor_flash_data,
+       },
+};
+
+static struct mtd_partition migor_nand_flash_partitions[] = {
+       {
+               .name           = "nanddata1",
+               .offset         = 0x0,
+               .size           = 512 * 1024 * 1024,
+       },
+       {
+               .name           = "nanddata2",
+               .offset         = MTDPART_OFS_APPEND,
+               .size           = 512 * 1024 * 1024,
+       },
+};
+
+static void migor_nand_flash_cmd_ctl(struct mtd_info *mtd, int cmd,
+                                    unsigned int ctrl)
+{
+       struct nand_chip *chip = mtd->priv;
+
+       if (cmd == NAND_CMD_NONE)
+               return;
+
+       if (ctrl & NAND_CLE)
+               writeb(cmd, chip->IO_ADDR_W + 0x00400000);
+       else if (ctrl & NAND_ALE)
+               writeb(cmd, chip->IO_ADDR_W + 0x00800000);
+       else
+               writeb(cmd, chip->IO_ADDR_W);
+}
+
+static int migor_nand_flash_ready(struct mtd_info *mtd)
+{
+       return ctrl_inb(PORT_PADR) & 0x02; /* PTA1 */
+}
+
+struct platform_nand_data migor_nand_flash_data = {
+       .chip = {
+               .nr_chips = 1,
+               .partitions = migor_nand_flash_partitions,
+               .nr_partitions = ARRAY_SIZE(migor_nand_flash_partitions),
+               .chip_delay = 20,
+               .part_probe_types = (const char *[]) { "cmdlinepart", NULL },
+       },
+       .ctrl = {
+               .dev_ready = migor_nand_flash_ready,
+               .cmd_ctrl = migor_nand_flash_cmd_ctl,
+       },
+};
+
+static struct resource migor_nand_flash_resources[] = {
+       [0] = {
+               .name           = "NAND Flash",
+               .start          = 0x18000000,
+               .end            = 0x18ffffff,
+               .flags          = IORESOURCE_MEM,
+       },
+};
+
+static struct platform_device migor_nand_flash_device = {
+       .name           = "gen_nand",
+       .resource       = migor_nand_flash_resources,
+       .num_resources  = ARRAY_SIZE(migor_nand_flash_resources),
+       .dev            = {
+               .platform_data = &migor_nand_flash_data,
+       }
+};
+
 static struct platform_device *migor_devices[] __initdata = {
        &smc91x_eth_device,
+       &sh_keysc_device,
+       &migor_nor_flash_device,
+       &migor_nand_flash_device,
+};
+
+static struct i2c_board_info __initdata migor_i2c_devices[] = {
+       {
+               I2C_BOARD_INFO("rtc-rs5c372", 0x32),
+               .type   = "rs5c372b",
+       },
+       {
+               I2C_BOARD_INFO("migor_ts", 0x51),
+               .irq = 38, /* IRQ6 */
+       },
 };
 
 static int __init migor_devices_setup(void)
 {
+       i2c_register_board_info(0, migor_i2c_devices,
+                               ARRAY_SIZE(migor_i2c_devices));
        return platform_add_devices(migor_devices, ARRAY_SIZE(migor_devices));
 }
 __initcall(migor_devices_setup);
 
 static void __init migor_setup(char **cmdline_p)
 {
-       ctrl_outw(0x1000, 0xa4050110); /* Enable IRQ0 in PJCR */
+       /* SMC91C111 - Enable IRQ0 */
+       ctrl_outw(ctrl_inw(PORT_PJCR) & ~0x0003, PORT_PJCR);
+
+       /* KEYSC */
+       ctrl_outw(ctrl_inw(PORT_PYCR) & ~0x0fff, PORT_PYCR);
+       ctrl_outw(ctrl_inw(PORT_PZCR) & ~0x0ff0, PORT_PZCR);
+       ctrl_outw(ctrl_inw(PORT_PSELA) & ~0x4100, PORT_PSELA);
+       ctrl_outw(ctrl_inw(PORT_HIZCRA) & ~0x4000, PORT_HIZCRA);
+       ctrl_outw(ctrl_inw(PORT_HIZCRC) & ~0xc000, PORT_HIZCRC);
+       ctrl_outl(ctrl_inl(MSTPCR2) & ~0x00004000, MSTPCR2);
+
+       /* NAND Flash */
+       ctrl_outw(ctrl_inw(PORT_PXCR) & 0x0fff, PORT_PXCR);
+       ctrl_outl((ctrl_inl(BSC_CS6ABCR) & ~0x00000600) | 0x00000200,
+                 BSC_CS6ABCR);
+
+       /* I2C */
+       ctrl_outl(ctrl_inl(MSTPCR1) & ~0x00000200, MSTPCR1);
+
+       /* Touch Panel - Enable IRQ6 */
+       ctrl_outw(ctrl_inw(PORT_PZCR) & ~0xc, PORT_PZCR);
+       ctrl_outw((ctrl_inw(PORT_PSELA) | 0x8000), PORT_PSELA);
+       ctrl_outw((ctrl_inw(PORT_HIZCRC) & ~0x4000), PORT_HIZCRC);
 }
 
 static struct sh_machine_vector mv_migor __initmv = {
diff --git a/arch/sh/boards/renesas/r7780rp/irq-r7780mp.c b/arch/sh/boards/renesas/r7780rp/irq-r7780mp.c
index 1f8f073f27be94c85d6e3202e97c3391615e45e2..68f0ad1b637dd3335243e7e101761fe34cd08688 100644
@@ -18,31 +18,44 @@ enum {
        UNUSED = 0,
 
        /* board specific interrupt sources */
-       AX88796,          /* Ethernet controller */
-       CF,               /* Compact Flash */
-       PSW,              /* Push Switch */
-       EXT1,             /* EXT1n IRQ */
-       EXT4,             /* EXT4n IRQ */
+       CF,             /* Compact Flash */
+       TP,             /* Touch panel */
+       SCIF1,          /* FPGA SCIF1 */
+       SCIF0,          /* FPGA SCIF0 */
+       SMBUS,          /* SMBUS */
+       RTC,            /* RTC Alarm */
+       AX88796,        /* Ethernet controller */
+       PSW,            /* Push Switch */
+
+       /* external bus connector */
+       EXT1, EXT2, EXT4, EXT5, EXT6,
 };
 
 static struct intc_vect vectors[] __initdata = {
        INTC_IRQ(CF, IRQ_CF),
-       INTC_IRQ(PSW, IRQ_PSW),
+       INTC_IRQ(TP, IRQ_TP),
+       INTC_IRQ(SCIF1, IRQ_SCIF1),
+       INTC_IRQ(SCIF0, IRQ_SCIF0),
+       INTC_IRQ(SMBUS, IRQ_SMBUS),
+       INTC_IRQ(RTC, IRQ_RTC),
        INTC_IRQ(AX88796, IRQ_AX88796),
-       INTC_IRQ(EXT1, IRQ_EXT1),
-       INTC_IRQ(EXT4, IRQ_EXT4),
+       INTC_IRQ(PSW, IRQ_PSW),
+
+       INTC_IRQ(EXT1, IRQ_EXT1), INTC_IRQ(EXT2, IRQ_EXT2),
+       INTC_IRQ(EXT4, IRQ_EXT4), INTC_IRQ(EXT5, IRQ_EXT5),
+       INTC_IRQ(EXT6, IRQ_EXT6),
 };
 
 static struct intc_mask_reg mask_registers[] __initdata = {
        { 0xa4000000, 0, 16, /* IRLMSK */
-         { 0, 0, 0, 0, CF, 0, 0, 0,
-           0, 0, 0, EXT4, 0, EXT1, PSW, AX88796 } },
+         { SCIF0, SCIF1, RTC, 0, CF, 0, TP, SMBUS,
+           0, EXT6, EXT5, EXT4, EXT2, EXT1, PSW, AX88796 } },
 };
 
 static unsigned char irl2irq[HL_NR_IRL] __initdata = {
-       0, IRQ_CF, 0, 0,
-       0, 0, 0, 0,
-       0, IRQ_EXT4, 0, IRQ_EXT1,
+       0, IRQ_CF, IRQ_TP, IRQ_SCIF1,
+       IRQ_SCIF0, IRQ_SMBUS, IRQ_RTC, IRQ_EXT6,
+       IRQ_EXT5, IRQ_EXT4, IRQ_EXT2, IRQ_EXT1,
        0, IRQ_AX88796, IRQ_PSW,
 };
 
diff --git a/arch/sh/boards/renesas/r7780rp/setup.c b/arch/sh/boards/renesas/r7780rp/setup.c
index 2f68bea7890c64f6ac0ae3883b91ca9b6e72d6d3..a5c5e92365011bfd7f1ba4594b0fb26f1c116540 100644
@@ -4,7 +4,7 @@
  * Renesas Solutions Highlander Support.
  *
  * Copyright (C) 2002 Atom Create Engineering Co., Ltd.
- * Copyright (C) 2005 - 2007 Paul Mundt
+ * Copyright (C) 2005 - 2008 Paul Mundt
  *
  * This contains support for the R7780RP-1, R7780MP, and R7785RP
  * Highlander modules.
@@ -17,6 +17,7 @@
 #include <linux/platform_device.h>
 #include <linux/ata_platform.h>
 #include <linux/types.h>
+#include <linux/i2c.h>
 #include <net/ax88796.h>
 #include <asm/machvec.h>
 #include <asm/r7780rp.h>
@@ -176,11 +177,38 @@ static struct platform_device ax88796_device = {
        .resource       = ax88796_resources,
 };
 
+static struct resource smbus_resources[] = {
+       [0] = {
+               .start  = PA_SMCR,
+               .end    = PA_SMCR + 0x100 - 1,
+               .flags  = IORESOURCE_MEM,
+       },
+       [1] = {
+               .start  = IRQ_SMBUS,
+               .end    = IRQ_SMBUS,
+               .flags  = IORESOURCE_IRQ,
+       },
+};
+
+static struct platform_device smbus_device = {
+       .name           = "i2c-highlander",
+       .id             = 0,
+       .num_resources  = ARRAY_SIZE(smbus_resources),
+       .resource       = smbus_resources,
+};
+
+static struct i2c_board_info __initdata highlander_i2c_devices[] = {
+       {
+               I2C_BOARD_INFO("rtc-rs5c372", 0x32),
+               .type   = "r2025sd",
+       },
+};
 
 static struct platform_device *r7780rp_devices[] __initdata = {
        &r8a66597_usb_host_device,
        &m66592_usb_peripheral_device,
        &heartbeat_device,
+       &smbus_device,
 #ifndef CONFIG_SH_R7780RP
        &ax88796_device,
 #endif
@@ -199,12 +227,20 @@ static struct trapped_io cf_trapped_io = {
 
 static int __init r7780rp_devices_setup(void)
 {
+       int ret = 0;
+
 #ifndef CONFIG_SH_R7780RP
        if (register_trapped_io(&cf_trapped_io) == 0)
-               platform_device_register(&cf_ide_device);
+               ret |= platform_device_register(&cf_ide_device);
 #endif
-       return platform_add_devices(r7780rp_devices,
+
+       ret |= platform_add_devices(r7780rp_devices,
                                    ARRAY_SIZE(r7780rp_devices));
+
+       ret |= i2c_register_board_info(0, highlander_i2c_devices,
+                                      ARRAY_SIZE(highlander_i2c_devices));
+
+       return ret;
 }
 device_initcall(r7780rp_devices_setup);
 
diff --git a/arch/sh/boards/se/7721/Makefile b/arch/sh/boards/se/7721/Makefile
new file mode 100644
index 0000000..7f09030
--- /dev/null
@@ -0,0 +1 @@
+obj-y   := setup.o irq.o
diff --git a/arch/sh/boards/se/7721/irq.c b/arch/sh/boards/se/7721/irq.c
new file mode 100644
index 0000000..c4fdd62
--- /dev/null
@@ -0,0 +1,45 @@
+/*
+ * linux/arch/sh/boards/se/7721/irq.c
+ *
+ * Copyright (C) 2008  Renesas Solutions Corp.
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ */
+#include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <asm/se7721.h>
+
+enum {
+       UNUSED = 0,
+
+       /* board specific interrupt sources */
+       MRSHPC,
+};
+
+static struct intc_vect vectors[] __initdata = {
+       INTC_IRQ(MRSHPC, MRSHPC_IRQ0),
+};
+
+static struct intc_prio_reg prio_registers[] __initdata = {
+       { FPGA_ILSR6, 0, 8, 4, /* IRLMSK */
+         { 0, MRSHPC } },
+};
+
+static DECLARE_INTC_DESC(intc_desc, "SE7721", vectors,
+                        NULL, NULL, prio_registers, NULL);
+
+/*
+ * Initialize IRQ setting
+ */
+void __init init_se7721_IRQ(void)
+{
+       /* PPCR */
+       ctrl_outw(ctrl_inw(0xa4050118) & ~0x00ff, 0xa4050118);
+
+       register_intc_controller(&intc_desc);
+       intc_set_priority(MRSHPC_IRQ0, 0xf - MRSHPC_IRQ0);
+}
diff --git a/arch/sh/boards/se/7721/setup.c b/arch/sh/boards/se/7721/setup.c
new file mode 100644
index 0000000..1be3e92
--- /dev/null
@@ -0,0 +1,99 @@
+/*
+ * linux/arch/sh/boards/se/7721/setup.c
+ *
+ * Copyright (C) 2008 Renesas Solutions Corp.
+ *
+ * Hitachi UL SolutionEngine 7721 Support.
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ */
+#include <linux/init.h>
+#include <linux/platform_device.h>
+#include <asm/machvec.h>
+#include <asm/se7721.h>
+#include <asm/io.h>
+#include <asm/heartbeat.h>
+
+static unsigned char heartbeat_bit_pos[] = { 8, 9, 10, 11, 12, 13, 14, 15 };
+
+static struct heartbeat_data heartbeat_data = {
+       .bit_pos        = heartbeat_bit_pos,
+       .nr_bits        = ARRAY_SIZE(heartbeat_bit_pos),
+       .regsize        = 16,
+};
+
+static struct resource heartbeat_resources[] = {
+       [0] = {
+               .start  = PA_LED,
+               .end    = PA_LED,
+               .flags  = IORESOURCE_MEM,
+       },
+};
+
+static struct platform_device heartbeat_device = {
+       .name           = "heartbeat",
+       .id             = -1,
+       .dev    = {
+               .platform_data  = &heartbeat_data,
+       },
+       .num_resources  = ARRAY_SIZE(heartbeat_resources),
+       .resource       = heartbeat_resources,
+};
+
+static struct resource cf_ide_resources[] = {
+       [0] = {
+               .start  = PA_MRSHPC_IO + 0x1f0,
+               .end    = PA_MRSHPC_IO + 0x1f0 + 8 ,
+               .flags  = IORESOURCE_IO,
+       },
+       [1] = {
+               .start  = PA_MRSHPC_IO + 0x1f0 + 0x206,
+               .end    = PA_MRSHPC_IO + 0x1f0 + 8 + 0x206 + 8,
+               .flags  = IORESOURCE_IO,
+       },
+       [2] = {
+               .start  = MRSHPC_IRQ0,
+               .flags  = IORESOURCE_IRQ,
+       },
+};
+
+static struct platform_device cf_ide_device = {
+       .name           = "pata_platform",
+       .id             = -1,
+       .num_resources  = ARRAY_SIZE(cf_ide_resources),
+       .resource       = cf_ide_resources,
+};
+
+static struct platform_device *se7721_devices[] __initdata = {
+       &cf_ide_device,
+       &heartbeat_device
+};
+
+static int __init se7721_devices_setup(void)
+{
+       return platform_add_devices(se7721_devices,
+               ARRAY_SIZE(se7721_devices));
+}
+device_initcall(se7721_devices_setup);
+
+static void __init se7721_setup(char **cmdline_p)
+{
+       /* for USB */
+       ctrl_outw(0x0000, 0xA405010C);  /* PGCR */
+       ctrl_outw(0x0000, 0xA405010E);  /* PHCR */
+       ctrl_outw(0x00AA, 0xA4050118);  /* PPCR */
+       ctrl_outw(0x0000, 0xA4050124);  /* PSELA */
+}
+
+/*
+ * The Machine Vector
+ */
+struct sh_machine_vector mv_se7721 __initmv = {
+       .mv_name                = "Solution Engine 7721",
+       .mv_setup               = se7721_setup,
+       .mv_nr_irqs             = 109,
+       .mv_init_irq            = init_se7721_IRQ,
+};
diff --git a/arch/sh/boards/se/7722/setup.c b/arch/sh/boards/se/7722/setup.c
index b1a3d9d0172f50678c5add2c1388bb5094afbcf2..33f6ee71f8483f9c58dd736ed93c9237d54f2e33 100644
 #include <linux/init.h>
 #include <linux/platform_device.h>
 #include <linux/ata_platform.h>
+#include <linux/input.h>
 #include <asm/machvec.h>
 #include <asm/se7722.h>
 #include <asm/io.h>
 #include <asm/heartbeat.h>
+#include <asm/sh_keysc.h>
 
 /* Heartbeat */
 static struct heartbeat_data heartbeat_data = {
@@ -92,10 +94,47 @@ static struct platform_device cf_ide_device  = {
        .resource       = cf_ide_resources,
 };
 
+static struct sh_keysc_info sh_keysc_info = {
+       .mode = SH_KEYSC_MODE_1, /* KEYOUT0->5, KEYIN0->4 */
+       .scan_timing = 3,
+       .delay = 5,
+       .keycodes = { /* SW1 -> SW30 */
+               KEY_A, KEY_B, KEY_C, KEY_D, KEY_E,
+               KEY_F, KEY_G, KEY_H, KEY_I, KEY_J,
+               KEY_K, KEY_L, KEY_M, KEY_N, KEY_O,
+               KEY_P, KEY_Q, KEY_R, KEY_S, KEY_T,
+               KEY_U, KEY_V, KEY_W, KEY_X, KEY_Y,
+               KEY_Z,
+               KEY_HOME, KEY_SLEEP, KEY_WAKEUP, KEY_COFFEE, /* life */
+       },
+};
+
+static struct resource sh_keysc_resources[] = {
+       [0] = {
+               .start  = 0x044b0000,
+               .end    = 0x044b000f,
+               .flags  = IORESOURCE_MEM,
+       },
+       [1] = {
+               .start  = 79,
+               .flags  = IORESOURCE_IRQ,
+       },
+};
+
+static struct platform_device sh_keysc_device = {
+       .name           = "sh_keysc",
+       .num_resources  = ARRAY_SIZE(sh_keysc_resources),
+       .resource       = sh_keysc_resources,
+       .dev    = {
+               .platform_data  = &sh_keysc_info,
+       },
+};
+
 static struct platform_device *se7722_devices[] __initdata = {
        &heartbeat_device,
        &smc91x_eth_device,
        &cf_ide_device,
+       &sh_keysc_device,
 };
 
 static int __init se7722_devices_setup(void)
@@ -136,6 +175,8 @@ static void __init se7722_setup(char **cmdline_p)
        ctrl_outw(0x0A10, PORT_PSELA); /* BS,SHHID2 */
        ctrl_outw(0x0000, PORT_PYCR);
        ctrl_outw(0x0000, PORT_PZCR);
+       ctrl_outw(ctrl_inw(PORT_HIZCRA) & ~0x4000, PORT_HIZCRA);
+       ctrl_outw(ctrl_inw(PORT_HIZCRC) & ~0xc000, PORT_HIZCRC);
 }
 
 /*
diff --git a/arch/sh/configs/se7721_defconfig b/arch/sh/configs/se7721_defconfig
new file mode 100644
index 0000000..f3d4ca0
--- /dev/null
@@ -0,0 +1,1085 @@
+#
+# Automatically generated make config: don't edit
+# Linux kernel version: 2.6.25-rc5
+# Fri Mar 21 12:05:31 2008
+#
+CONFIG_SUPERH=y
+CONFIG_SUPERH32=y
+CONFIG_RWSEM_GENERIC_SPINLOCK=y
+CONFIG_GENERIC_FIND_NEXT_BIT=y
+CONFIG_GENERIC_HWEIGHT=y
+CONFIG_GENERIC_HARDIRQS=y
+CONFIG_GENERIC_IRQ_PROBE=y
+CONFIG_GENERIC_CALIBRATE_DELAY=y
+CONFIG_GENERIC_TIME=y
+CONFIG_GENERIC_CLOCKEVENTS=y
+CONFIG_STACKTRACE_SUPPORT=y
+CONFIG_LOCKDEP_SUPPORT=y
+# CONFIG_ARCH_HAS_ILOG2_U32 is not set
+# CONFIG_ARCH_HAS_ILOG2_U64 is not set
+CONFIG_ARCH_NO_VIRT_TO_BUS=y
+CONFIG_ARCH_SUPPORTS_AOUT=y
+CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
+
+#
+# General setup
+#
+CONFIG_EXPERIMENTAL=y
+CONFIG_BROKEN_ON_SMP=y
+CONFIG_INIT_ENV_ARG_LIMIT=32
+CONFIG_LOCALVERSION=""
+# CONFIG_LOCALVERSION_AUTO is not set
+# CONFIG_SWAP is not set
+CONFIG_SYSVIPC=y
+CONFIG_SYSVIPC_SYSCTL=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_BSD_PROCESS_ACCT=y
+# CONFIG_BSD_PROCESS_ACCT_V3 is not set
+# CONFIG_TASKSTATS is not set
+# CONFIG_AUDIT is not set
+# CONFIG_IKCONFIG is not set
+CONFIG_LOG_BUF_SHIFT=14
+# CONFIG_CGROUPS is not set
+CONFIG_GROUP_SCHED=y
+CONFIG_FAIR_GROUP_SCHED=y
+# CONFIG_RT_GROUP_SCHED is not set
+CONFIG_USER_SCHED=y
+# CONFIG_CGROUP_SCHED is not set
+CONFIG_SYSFS_DEPRECATED=y
+CONFIG_SYSFS_DEPRECATED_V2=y
+# CONFIG_RELAY is not set
+# CONFIG_NAMESPACES is not set
+# CONFIG_BLK_DEV_INITRD is not set
+# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
+CONFIG_SYSCTL=y
+CONFIG_EMBEDDED=y
+CONFIG_UID16=y
+CONFIG_SYSCTL_SYSCALL=y
+CONFIG_KALLSYMS=y
+CONFIG_KALLSYMS_ALL=y
+# CONFIG_KALLSYMS_EXTRA_PASS is not set
+CONFIG_HOTPLUG=y
+CONFIG_PRINTK=y
+# CONFIG_BUG is not set
+CONFIG_ELF_CORE=y
+CONFIG_COMPAT_BRK=y
+# CONFIG_BASE_FULL is not set
+CONFIG_FUTEX=y
+CONFIG_ANON_INODES=y
+CONFIG_EPOLL=y
+CONFIG_SIGNALFD=y
+CONFIG_TIMERFD=y
+CONFIG_EVENTFD=y
+# CONFIG_SHMEM is not set
+CONFIG_VM_EVENT_COUNTERS=y
+CONFIG_SLAB=y
+# CONFIG_SLUB is not set
+# CONFIG_SLOB is not set
+# CONFIG_PROFILING is not set
+# CONFIG_MARKERS is not set
+CONFIG_HAVE_OPROFILE=y
+# CONFIG_HAVE_KPROBES is not set
+# CONFIG_HAVE_KRETPROBES is not set
+CONFIG_PROC_PAGE_MONITOR=y
+CONFIG_SLABINFO=y
+CONFIG_RT_MUTEXES=y
+CONFIG_TINY_SHMEM=y
+CONFIG_BASE_SMALL=1
+CONFIG_MODULES=y
+# CONFIG_MODULE_UNLOAD is not set
+# CONFIG_MODVERSIONS is not set
+# CONFIG_MODULE_SRCVERSION_ALL is not set
+# CONFIG_KMOD is not set
+CONFIG_BLOCK=y
+# CONFIG_LBD is not set
+# CONFIG_BLK_DEV_IO_TRACE is not set
+# CONFIG_LSF is not set
+# CONFIG_BLK_DEV_BSG is not set
+
+#
+# IO Schedulers
+#
+CONFIG_IOSCHED_NOOP=y
+# CONFIG_IOSCHED_AS is not set
+# CONFIG_IOSCHED_DEADLINE is not set
+# CONFIG_IOSCHED_CFQ is not set
+# CONFIG_DEFAULT_AS is not set
+# CONFIG_DEFAULT_DEADLINE is not set
+# CONFIG_DEFAULT_CFQ is not set
+CONFIG_DEFAULT_NOOP=y
+CONFIG_DEFAULT_IOSCHED="noop"
+CONFIG_CLASSIC_RCU=y
+
+#
+# System type
+#
+CONFIG_CPU_SH3=y
+# CONFIG_CPU_SUBTYPE_SH7619 is not set
+# CONFIG_CPU_SUBTYPE_SH7203 is not set
+# CONFIG_CPU_SUBTYPE_SH7206 is not set
+# CONFIG_CPU_SUBTYPE_SH7263 is not set
+# CONFIG_CPU_SUBTYPE_MXG is not set
+# CONFIG_CPU_SUBTYPE_SH7705 is not set
+# CONFIG_CPU_SUBTYPE_SH7706 is not set
+# CONFIG_CPU_SUBTYPE_SH7707 is not set
+# CONFIG_CPU_SUBTYPE_SH7708 is not set
+# CONFIG_CPU_SUBTYPE_SH7709 is not set
+# CONFIG_CPU_SUBTYPE_SH7710 is not set
+# CONFIG_CPU_SUBTYPE_SH7712 is not set
+# CONFIG_CPU_SUBTYPE_SH7720 is not set
+CONFIG_CPU_SUBTYPE_SH7721=y
+# CONFIG_CPU_SUBTYPE_SH7750 is not set
+# CONFIG_CPU_SUBTYPE_SH7091 is not set
+# CONFIG_CPU_SUBTYPE_SH7750R is not set
+# CONFIG_CPU_SUBTYPE_SH7750S is not set
+# CONFIG_CPU_SUBTYPE_SH7751 is not set
+# CONFIG_CPU_SUBTYPE_SH7751R is not set
+# CONFIG_CPU_SUBTYPE_SH7760 is not set
+# CONFIG_CPU_SUBTYPE_SH4_202 is not set
+# CONFIG_CPU_SUBTYPE_SH7763 is not set
+# CONFIG_CPU_SUBTYPE_SH7770 is not set
+# CONFIG_CPU_SUBTYPE_SH7780 is not set
+# CONFIG_CPU_SUBTYPE_SH7785 is not set
+# CONFIG_CPU_SUBTYPE_SHX3 is not set
+# CONFIG_CPU_SUBTYPE_SH7343 is not set
+# CONFIG_CPU_SUBTYPE_SH7722 is not set
+# CONFIG_CPU_SUBTYPE_SH7366 is not set
+# CONFIG_CPU_SUBTYPE_SH5_101 is not set
+# CONFIG_CPU_SUBTYPE_SH5_103 is not set
+
+#
+# Memory management options
+#
+CONFIG_QUICKLIST=y
+CONFIG_MMU=y
+CONFIG_PAGE_OFFSET=0x80000000
+CONFIG_MEMORY_START=0x0c000000
+CONFIG_MEMORY_SIZE=0x02000000
+CONFIG_29BIT=y
+CONFIG_VSYSCALL=y
+CONFIG_ARCH_FLATMEM_ENABLE=y
+CONFIG_ARCH_SPARSEMEM_ENABLE=y
+CONFIG_ARCH_SPARSEMEM_DEFAULT=y
+CONFIG_MAX_ACTIVE_REGIONS=1
+CONFIG_ARCH_POPULATES_NODE_MAP=y
+CONFIG_ARCH_SELECT_MEMORY_MODEL=y
+CONFIG_PAGE_SIZE_4KB=y
+# CONFIG_PAGE_SIZE_8KB is not set
+# CONFIG_PAGE_SIZE_64KB is not set
+CONFIG_SELECT_MEMORY_MODEL=y
+CONFIG_FLATMEM_MANUAL=y
+# CONFIG_DISCONTIGMEM_MANUAL is not set
+# CONFIG_SPARSEMEM_MANUAL is not set
+CONFIG_FLATMEM=y
+CONFIG_FLAT_NODE_MEM_MAP=y
+CONFIG_SPARSEMEM_STATIC=y
+# CONFIG_SPARSEMEM_VMEMMAP_ENABLE is not set
+CONFIG_SPLIT_PTLOCK_CPUS=4
+# CONFIG_RESOURCES_64BIT is not set
+CONFIG_ZONE_DMA_FLAG=0
+CONFIG_NR_QUICK=2
+
+#
+# Cache configuration
+#
+# CONFIG_SH_DIRECT_MAPPED is not set
+CONFIG_CACHE_WRITEBACK=y
+# CONFIG_CACHE_WRITETHROUGH is not set
+# CONFIG_CACHE_OFF is not set
+
+#
+# Processor features
+#
+CONFIG_CPU_LITTLE_ENDIAN=y
+# CONFIG_CPU_BIG_ENDIAN is not set
+# CONFIG_SH_FPU_EMU is not set
+# CONFIG_SH_DSP is not set
+# CONFIG_SH_ADC is not set
+CONFIG_CPU_HAS_INTEVT=y
+CONFIG_CPU_HAS_SR_RB=y
+CONFIG_CPU_HAS_DSP=y
+
+#
+# Board support
+#
+CONFIG_SOLUTION_ENGINE=y
+CONFIG_SH_7721_SOLUTION_ENGINE=y
+
+#
+# Timer and clock configuration
+#
+CONFIG_SH_TMU=y
+CONFIG_SH_TIMER_IRQ=16
+CONFIG_SH_PCLK_FREQ=33333333
+# CONFIG_TICK_ONESHOT is not set
+# CONFIG_NO_HZ is not set
+# CONFIG_HIGH_RES_TIMERS is not set
+CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
+
+#
+# CPU Frequency scaling
+#
+# CONFIG_CPU_FREQ is not set
+
+#
+# DMA support
+#
+# CONFIG_SH_DMA is not set
+
+#
+# Companion Chips
+#
+
+#
+# Additional SuperH Device Drivers
+#
+CONFIG_HEARTBEAT=y
+# CONFIG_PUSH_SWITCH is not set
+
+#
+# Kernel features
+#
+# CONFIG_HZ_100 is not set
+CONFIG_HZ_250=y
+# CONFIG_HZ_300 is not set
+# CONFIG_HZ_1000 is not set
+CONFIG_HZ=250
+# CONFIG_SCHED_HRTICK is not set
+# CONFIG_KEXEC is not set
+# CONFIG_CRASH_DUMP is not set
+# CONFIG_PREEMPT_NONE is not set
+CONFIG_PREEMPT_VOLUNTARY=y
+# CONFIG_PREEMPT is not set
+CONFIG_GUSA=y
+# CONFIG_GUSA_RB is not set
+
+#
+# Boot options
+#
+CONFIG_ZERO_PAGE_OFFSET=0x00001000
+CONFIG_BOOT_LINK_OFFSET=0x00800000
+CONFIG_CMDLINE_BOOL=y
+CONFIG_CMDLINE="console=ttySC0,115200 root=/dev/sda2"
+
+#
+# Bus options
+#
+CONFIG_CF_ENABLER=y
+# CONFIG_CF_AREA5 is not set
+CONFIG_CF_AREA6=y
+CONFIG_CF_BASE_ADDR=0xb8000000
+# CONFIG_ARCH_SUPPORTS_MSI is not set
+# CONFIG_PCCARD is not set
+
+#
+# Executable file formats
+#
+CONFIG_BINFMT_ELF=y
+# CONFIG_BINFMT_MISC is not set
+
+#
+# Networking
+#
+CONFIG_NET=y
+
+#
+# Networking options
+#
+CONFIG_PACKET=y
+CONFIG_PACKET_MMAP=y
+CONFIG_UNIX=y
+CONFIG_XFRM=y
+# CONFIG_XFRM_USER is not set
+# CONFIG_XFRM_SUB_POLICY is not set
+# CONFIG_XFRM_MIGRATE is not set
+# CONFIG_XFRM_STATISTICS is not set
+CONFIG_NET_KEY=y
+# CONFIG_NET_KEY_MIGRATE is not set
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_ADVANCED_ROUTER=y
+CONFIG_ASK_IP_FIB_HASH=y
+# CONFIG_IP_FIB_TRIE is not set
+CONFIG_IP_FIB_HASH=y
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_ROUTE_MULTIPATH=y
+CONFIG_IP_ROUTE_VERBOSE=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+# CONFIG_IP_PNP_BOOTP is not set
+# CONFIG_IP_PNP_RARP is not set
+# CONFIG_NET_IPIP is not set
+# CONFIG_NET_IPGRE is not set
+CONFIG_IP_MROUTE=y
+CONFIG_IP_PIMSM_V1=y
+CONFIG_IP_PIMSM_V2=y
+# CONFIG_ARPD is not set
+CONFIG_SYN_COOKIES=y
+CONFIG_INET_AH=y
+CONFIG_INET_ESP=y
+CONFIG_INET_IPCOMP=y
+CONFIG_INET_XFRM_TUNNEL=y
+CONFIG_INET_TUNNEL=y
+CONFIG_INET_XFRM_MODE_TRANSPORT=y
+CONFIG_INET_XFRM_MODE_TUNNEL=y
+CONFIG_INET_XFRM_MODE_BEET=y
+# CONFIG_INET_LRO is not set
+# CONFIG_INET_DIAG is not set
+# CONFIG_TCP_CONG_ADVANCED is not set
+CONFIG_TCP_CONG_CUBIC=y
+CONFIG_DEFAULT_TCP_CONG="cubic"
+# CONFIG_TCP_MD5SIG is not set
+# CONFIG_IPV6 is not set
+# CONFIG_INET6_XFRM_TUNNEL is not set
+# CONFIG_INET6_TUNNEL is not set
+# CONFIG_NETWORK_SECMARK is not set
+# CONFIG_NETFILTER is not set
+# CONFIG_IP_DCCP is not set
+# CONFIG_IP_SCTP is not set
+# CONFIG_TIPC is not set
+# CONFIG_ATM is not set
+# CONFIG_BRIDGE is not set
+# CONFIG_VLAN_8021Q is not set
+# CONFIG_DECNET is not set
+# CONFIG_LLC2 is not set
+# CONFIG_IPX is not set
+# CONFIG_ATALK is not set
+# CONFIG_X25 is not set
+# CONFIG_LAPB is not set
+# CONFIG_ECONET is not set
+# CONFIG_WAN_ROUTER is not set
+CONFIG_NET_SCHED=y
+
+#
+# Queueing/Scheduling
+#
+CONFIG_NET_SCH_CBQ=y
+CONFIG_NET_SCH_HTB=y
+CONFIG_NET_SCH_HFSC=y
+CONFIG_NET_SCH_PRIO=y
+# CONFIG_NET_SCH_RR is not set
+CONFIG_NET_SCH_RED=y
+CONFIG_NET_SCH_SFQ=y
+CONFIG_NET_SCH_TEQL=y
+CONFIG_NET_SCH_TBF=y
+CONFIG_NET_SCH_GRED=y
+CONFIG_NET_SCH_DSMARK=y
+CONFIG_NET_SCH_NETEM=y
+
+#
+# Classification
+#
+CONFIG_NET_CLS=y
+# CONFIG_NET_CLS_BASIC is not set
+CONFIG_NET_CLS_TCINDEX=y
+CONFIG_NET_CLS_ROUTE4=y
+CONFIG_NET_CLS_ROUTE=y
+CONFIG_NET_CLS_FW=y
+# CONFIG_NET_CLS_U32 is not set
+# CONFIG_NET_CLS_RSVP is not set
+# CONFIG_NET_CLS_RSVP6 is not set
+# CONFIG_NET_CLS_FLOW is not set
+# CONFIG_NET_EMATCH is not set
+# CONFIG_NET_CLS_ACT is not set
+CONFIG_NET_CLS_IND=y
+CONFIG_NET_SCH_FIFO=y
+
+#
+# Network testing
+#
+# CONFIG_NET_PKTGEN is not set
+# CONFIG_HAMRADIO is not set
+# CONFIG_CAN is not set
+# CONFIG_IRDA is not set
+# CONFIG_BT is not set
+# CONFIG_AF_RXRPC is not set
+CONFIG_FIB_RULES=y
+
+#
+# Wireless
+#
+# CONFIG_CFG80211 is not set
+# CONFIG_WIRELESS_EXT is not set
+# CONFIG_MAC80211 is not set
+# CONFIG_IEEE80211 is not set
+# CONFIG_RFKILL is not set
+# CONFIG_NET_9P is not set
+
+#
+# Device Drivers
+#
+
+#
+# Generic Driver Options
+#
+CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
+CONFIG_STANDALONE=y
+CONFIG_PREVENT_FIRMWARE_BUILD=y
+CONFIG_FW_LOADER=y
+# CONFIG_DEBUG_DRIVER is not set
+# CONFIG_DEBUG_DEVRES is not set
+# CONFIG_SYS_HYPERVISOR is not set
+# CONFIG_CONNECTOR is not set
+CONFIG_MTD=y
+# CONFIG_MTD_DEBUG is not set
+CONFIG_MTD_CONCAT=y
+CONFIG_MTD_PARTITIONS=y
+# CONFIG_MTD_REDBOOT_PARTS is not set
+# CONFIG_MTD_CMDLINE_PARTS is not set
+
+#
+# User Modules And Translation Layers
+#
+CONFIG_MTD_CHAR=y
+CONFIG_MTD_BLKDEVS=y
+CONFIG_MTD_BLOCK=y
+# CONFIG_FTL is not set
+# CONFIG_NFTL is not set
+# CONFIG_INFTL is not set
+# CONFIG_RFD_FTL is not set
+# CONFIG_SSFDC is not set
+# CONFIG_MTD_OOPS is not set
+
+#
+# RAM/ROM/Flash chip drivers
+#
+CONFIG_MTD_CFI=y
+# CONFIG_MTD_JEDECPROBE is not set
+CONFIG_MTD_GEN_PROBE=y
+# CONFIG_MTD_CFI_ADV_OPTIONS is not set
+CONFIG_MTD_MAP_BANK_WIDTH_1=y
+CONFIG_MTD_MAP_BANK_WIDTH_2=y
+CONFIG_MTD_MAP_BANK_WIDTH_4=y
+# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
+# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
+# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set
+CONFIG_MTD_CFI_I1=y
+CONFIG_MTD_CFI_I2=y
+# CONFIG_MTD_CFI_I4 is not set
+# CONFIG_MTD_CFI_I8 is not set
+# CONFIG_MTD_CFI_INTELEXT is not set
+CONFIG_MTD_CFI_AMDSTD=y
+# CONFIG_MTD_CFI_STAA is not set
+CONFIG_MTD_CFI_UTIL=y
+# CONFIG_MTD_RAM is not set
+# CONFIG_MTD_ROM is not set
+# CONFIG_MTD_ABSENT is not set
+
+#
+# Mapping drivers for chip access
+#
+# CONFIG_MTD_COMPLEX_MAPPINGS is not set
+# CONFIG_MTD_PHYSMAP is not set
+# CONFIG_MTD_PLATRAM is not set
+
+#
+# Self-contained MTD device drivers
+#
+# CONFIG_MTD_SLRAM is not set
+# CONFIG_MTD_PHRAM is not set
+# CONFIG_MTD_MTDRAM is not set
+# CONFIG_MTD_BLOCK2MTD is not set
+
+#
+# Disk-On-Chip Device Drivers
+#
+# CONFIG_MTD_DOC2000 is not set
+# CONFIG_MTD_DOC2001 is not set
+# CONFIG_MTD_DOC2001PLUS is not set
+# CONFIG_MTD_NAND is not set
+# CONFIG_MTD_ONENAND is not set
+
+#
+# UBI - Unsorted block images
+#
+# CONFIG_MTD_UBI is not set
+# CONFIG_PARPORT is not set
+CONFIG_BLK_DEV=y
+# CONFIG_BLK_DEV_COW_COMMON is not set
+# CONFIG_BLK_DEV_LOOP is not set
+# CONFIG_BLK_DEV_NBD is not set
+# CONFIG_BLK_DEV_UB is not set
+# CONFIG_BLK_DEV_RAM is not set
+# CONFIG_CDROM_PKTCDVD is not set
+# CONFIG_ATA_OVER_ETH is not set
+CONFIG_MISC_DEVICES=y
+# CONFIG_EEPROM_93CX6 is not set
+# CONFIG_ENCLOSURE_SERVICES is not set
+CONFIG_HAVE_IDE=y
+# CONFIG_IDE is not set
+
+#
+# SCSI device support
+#
+# CONFIG_RAID_ATTRS is not set
+CONFIG_SCSI=y
+CONFIG_SCSI_DMA=y
+# CONFIG_SCSI_TGT is not set
+# CONFIG_SCSI_NETLINK is not set
+CONFIG_SCSI_PROC_FS=y
+
+#
+# SCSI support type (disk, tape, CD-ROM)
+#
+CONFIG_BLK_DEV_SD=y
+# CONFIG_CHR_DEV_ST is not set
+# CONFIG_CHR_DEV_OSST is not set
+# CONFIG_BLK_DEV_SR is not set
+# CONFIG_CHR_DEV_SG is not set
+# CONFIG_CHR_DEV_SCH is not set
+
+#
+# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
+#
+CONFIG_SCSI_MULTI_LUN=y
+# CONFIG_SCSI_CONSTANTS is not set
+# CONFIG_SCSI_LOGGING is not set
+# CONFIG_SCSI_SCAN_ASYNC is not set
+CONFIG_SCSI_WAIT_SCAN=m
+
+#
+# SCSI Transports
+#
+# CONFIG_SCSI_SPI_ATTRS is not set
+# CONFIG_SCSI_FC_ATTRS is not set
+# CONFIG_SCSI_ISCSI_ATTRS is not set
+# CONFIG_SCSI_SAS_LIBSAS is not set
+# CONFIG_SCSI_SRP_ATTRS is not set
+# CONFIG_SCSI_LOWLEVEL is not set
+CONFIG_ATA=y
+# CONFIG_ATA_NONSTANDARD is not set
+# CONFIG_SATA_MV is not set
+CONFIG_PATA_PLATFORM=y
+# CONFIG_MD is not set
+CONFIG_NETDEVICES=y
+# CONFIG_NETDEVICES_MULTIQUEUE is not set
+# CONFIG_DUMMY is not set
+# CONFIG_BONDING is not set
+# CONFIG_MACVLAN is not set
+# CONFIG_EQUALIZER is not set
+# CONFIG_TUN is not set
+# CONFIG_VETH is not set
+# CONFIG_NET_ETHERNET is not set
+CONFIG_NETDEV_1000=y
+# CONFIG_E1000E_ENABLED is not set
+CONFIG_NETDEV_10000=y
+
+#
+# Wireless LAN
+#
+# CONFIG_WLAN_PRE80211 is not set
+# CONFIG_WLAN_80211 is not set
+
+#
+# USB Network Adapters
+#
+# CONFIG_USB_CATC is not set
+# CONFIG_USB_KAWETH is not set
+# CONFIG_USB_PEGASUS is not set
+# CONFIG_USB_RTL8150 is not set
+# CONFIG_USB_USBNET is not set
+# CONFIG_WAN is not set
+# CONFIG_PPP is not set
+# CONFIG_SLIP is not set
+# CONFIG_NETCONSOLE is not set
+# CONFIG_NETPOLL is not set
+# CONFIG_NET_POLL_CONTROLLER is not set
+# CONFIG_ISDN is not set
+# CONFIG_PHONE is not set
+
+#
+# Input device support
+#
+CONFIG_INPUT=y
+# CONFIG_INPUT_FF_MEMLESS is not set
+# CONFIG_INPUT_POLLDEV is not set
+
+#
+# Userland interfaces
+#
+CONFIG_INPUT_MOUSEDEV=y
+CONFIG_INPUT_MOUSEDEV_PSAUX=y
+CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
+CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
+# CONFIG_INPUT_JOYDEV is not set
+CONFIG_INPUT_EVDEV=y
+# CONFIG_INPUT_EVBUG is not set
+
+#
+# Input Device Drivers
+#
+CONFIG_INPUT_KEYBOARD=y
+# CONFIG_KEYBOARD_ATKBD is not set
+# CONFIG_KEYBOARD_SUNKBD is not set
+# CONFIG_KEYBOARD_LKKBD is not set
+# CONFIG_KEYBOARD_XTKBD is not set
+# CONFIG_KEYBOARD_NEWTON is not set
+# CONFIG_KEYBOARD_STOWAWAY is not set
+# CONFIG_KEYBOARD_SH_KEYSC is not set
+CONFIG_INPUT_MOUSE=y
+# CONFIG_MOUSE_PS2 is not set
+# CONFIG_MOUSE_SERIAL is not set
+# CONFIG_MOUSE_APPLETOUCH is not set
+# CONFIG_MOUSE_VSXXXAA is not set
+# CONFIG_INPUT_JOYSTICK is not set
+# CONFIG_INPUT_TABLET is not set
+# CONFIG_INPUT_TOUCHSCREEN is not set
+# CONFIG_INPUT_MISC is not set
+
+#
+# Hardware I/O ports
+#
+# CONFIG_SERIO is not set
+# CONFIG_GAMEPORT is not set
+
+#
+# Character devices
+#
+# CONFIG_VT is not set
+# CONFIG_SERIAL_NONSTANDARD is not set
+
+#
+# Serial drivers
+#
+# CONFIG_SERIAL_8250 is not set
+
+#
+# Non-8250 serial port support
+#
+CONFIG_SERIAL_SH_SCI=y
+CONFIG_SERIAL_SH_SCI_NR_UARTS=2
+CONFIG_SERIAL_SH_SCI_CONSOLE=y
+CONFIG_SERIAL_CORE=y
+CONFIG_SERIAL_CORE_CONSOLE=y
+CONFIG_UNIX98_PTYS=y
+# CONFIG_LEGACY_PTYS is not set
+# CONFIG_IPMI_HANDLER is not set
+# CONFIG_HW_RANDOM is not set
+# CONFIG_R3964 is not set
+# CONFIG_RAW_DRIVER is not set
+# CONFIG_TCG_TPM is not set
+# CONFIG_I2C is not set
+
+#
+# SPI support
+#
+# CONFIG_SPI is not set
+# CONFIG_SPI_MASTER is not set
+# CONFIG_W1 is not set
+# CONFIG_POWER_SUPPLY is not set
+# CONFIG_HWMON is not set
+CONFIG_THERMAL=y
+# CONFIG_WATCHDOG is not set
+
+#
+# Sonics Silicon Backplane
+#
+CONFIG_SSB_POSSIBLE=y
+# CONFIG_SSB is not set
+
+#
+# Multifunction device drivers
+#
+# CONFIG_MFD_SM501 is not set
+
+#
+# Multimedia devices
+#
+# CONFIG_VIDEO_DEV is not set
+# CONFIG_DVB_CORE is not set
+# CONFIG_DAB is not set
+
+#
+# Graphics support
+#
+# CONFIG_VGASTATE is not set
+# CONFIG_VIDEO_OUTPUT_CONTROL is not set
+# CONFIG_FB is not set
+# CONFIG_BACKLIGHT_LCD_SUPPORT is not set
+
+#
+# Display device support
+#
+# CONFIG_DISPLAY_SUPPORT is not set
+
+#
+# Sound
+#
+# CONFIG_SOUND is not set
+CONFIG_HID_SUPPORT=y
+CONFIG_HID=y
+# CONFIG_HID_DEBUG is not set
+# CONFIG_HIDRAW is not set
+
+#
+# USB Input Devices
+#
+CONFIG_USB_HID=y
+# CONFIG_USB_HIDINPUT_POWERBOOK is not set
+# CONFIG_HID_FF is not set
+# CONFIG_USB_HIDDEV is not set
+CONFIG_USB_SUPPORT=y
+CONFIG_USB_ARCH_HAS_HCD=y
+CONFIG_USB_ARCH_HAS_OHCI=y
+# CONFIG_USB_ARCH_HAS_EHCI is not set
+CONFIG_USB=y
+# CONFIG_USB_DEBUG is not set
+# CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set
+
+#
+# Miscellaneous USB options
+#
+# CONFIG_USB_DEVICEFS is not set
+CONFIG_USB_DEVICE_CLASS=y
+# CONFIG_USB_DYNAMIC_MINORS is not set
+# CONFIG_USB_OTG is not set
+
+#
+# USB Host Controller Drivers
+#
+# CONFIG_USB_ISP116X_HCD is not set
+CONFIG_USB_OHCI_HCD=y
+# CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set
+# CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set
+CONFIG_USB_OHCI_LITTLE_ENDIAN=y
+# CONFIG_USB_SL811_HCD is not set
+# CONFIG_USB_R8A66597_HCD is not set
+
+#
+# USB Device Class drivers
+#
+# CONFIG_USB_ACM is not set
+# CONFIG_USB_PRINTER is not set
+
+#
+# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support'
+#
+
+#
+# may also be needed; see USB_STORAGE Help for more information
+#
+CONFIG_USB_STORAGE=y
+# CONFIG_USB_STORAGE_DEBUG is not set
+# CONFIG_USB_STORAGE_DATAFAB is not set
+# CONFIG_USB_STORAGE_FREECOM is not set
+# CONFIG_USB_STORAGE_ISD200 is not set
+# CONFIG_USB_STORAGE_DPCM is not set
+# CONFIG_USB_STORAGE_USBAT is not set
+# CONFIG_USB_STORAGE_SDDR09 is not set
+# CONFIG_USB_STORAGE_SDDR55 is not set
+# CONFIG_USB_STORAGE_JUMPSHOT is not set
+# CONFIG_USB_STORAGE_ALAUDA is not set
+# CONFIG_USB_STORAGE_ONETOUCH is not set
+# CONFIG_USB_STORAGE_KARMA is not set
+# CONFIG_USB_LIBUSUAL is not set
+
+#
+# USB Imaging devices
+#
+# CONFIG_USB_MDC800 is not set
+# CONFIG_USB_MICROTEK is not set
+CONFIG_USB_MON=y
+
+#
+# USB port drivers
+#
+# CONFIG_USB_SERIAL is not set
+
+#
+# USB Miscellaneous drivers
+#
+# CONFIG_USB_EMI62 is not set
+# CONFIG_USB_EMI26 is not set
+# CONFIG_USB_ADUTUX is not set
+# CONFIG_USB_AUERSWALD is not set
+# CONFIG_USB_RIO500 is not set
+# CONFIG_USB_LEGOTOWER is not set
+# CONFIG_USB_LCD is not set
+# CONFIG_USB_BERRY_CHARGE is not set
+# CONFIG_USB_LED is not set
+# CONFIG_USB_CYPRESS_CY7C63 is not set
+# CONFIG_USB_CYTHERM is not set
+# CONFIG_USB_PHIDGET is not set
+# CONFIG_USB_IDMOUSE is not set
+# CONFIG_USB_FTDI_ELAN is not set
+# CONFIG_USB_APPLEDISPLAY is not set
+# CONFIG_USB_LD is not set
+# CONFIG_USB_TRANCEVIBRATOR is not set
+# CONFIG_USB_IOWARRIOR is not set
+# CONFIG_USB_GADGET is not set
+# CONFIG_MMC is not set
+# CONFIG_MEMSTICK is not set
+CONFIG_NEW_LEDS=y
+CONFIG_LEDS_CLASS=y
+
+#
+# LED drivers
+#
+
+#
+# LED Triggers
+#
+CONFIG_LEDS_TRIGGERS=y
+# CONFIG_LEDS_TRIGGER_TIMER is not set
+# CONFIG_LEDS_TRIGGER_HEARTBEAT is not set
+# CONFIG_RTC_CLASS is not set
+
+#
+# Userspace I/O
+#
+# CONFIG_UIO is not set
+
+#
+# File systems
+#
+CONFIG_EXT2_FS=y
+CONFIG_EXT2_FS_XATTR=y
+CONFIG_EXT2_FS_POSIX_ACL=y
+CONFIG_EXT2_FS_SECURITY=y
+# CONFIG_EXT2_FS_XIP is not set
+CONFIG_EXT3_FS=y
+CONFIG_EXT3_FS_XATTR=y
+# CONFIG_EXT3_FS_POSIX_ACL is not set
+# CONFIG_EXT3_FS_SECURITY is not set
+# CONFIG_EXT4DEV_FS is not set
+CONFIG_JBD=y
+CONFIG_FS_MBCACHE=y
+# CONFIG_REISERFS_FS is not set
+# CONFIG_JFS_FS is not set
+CONFIG_FS_POSIX_ACL=y
+# CONFIG_XFS_FS is not set
+# CONFIG_GFS2_FS is not set
+# CONFIG_OCFS2_FS is not set
+# CONFIG_DNOTIFY is not set
+# CONFIG_INOTIFY is not set
+# CONFIG_QUOTA is not set
+# CONFIG_AUTOFS_FS is not set
+# CONFIG_AUTOFS4_FS is not set
+# CONFIG_FUSE_FS is not set
+
+#
+# CD-ROM/DVD Filesystems
+#
+# CONFIG_ISO9660_FS is not set
+# CONFIG_UDF_FS is not set
+
+#
+# DOS/FAT/NT Filesystems
+#
+CONFIG_FAT_FS=y
+CONFIG_MSDOS_FS=y
+CONFIG_VFAT_FS=y
+CONFIG_FAT_DEFAULT_CODEPAGE=437
+CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
+# CONFIG_NTFS_FS is not set
+
+#
+# Pseudo filesystems
+#
+CONFIG_PROC_FS=y
+# CONFIG_PROC_KCORE is not set
+CONFIG_PROC_SYSCTL=y
+CONFIG_SYSFS=y
+CONFIG_TMPFS=y
+# CONFIG_TMPFS_POSIX_ACL is not set
+# CONFIG_HUGETLBFS is not set
+# CONFIG_HUGETLB_PAGE is not set
+# CONFIG_CONFIGFS_FS is not set
+
+#
+# Miscellaneous filesystems
+#
+# CONFIG_ADFS_FS is not set
+# CONFIG_AFFS_FS is not set
+# CONFIG_HFS_FS is not set
+# CONFIG_HFSPLUS_FS is not set
+# CONFIG_BEFS_FS is not set
+# CONFIG_BFS_FS is not set
+# CONFIG_EFS_FS is not set
+CONFIG_JFFS2_FS=y
+CONFIG_JFFS2_FS_DEBUG=0
+CONFIG_JFFS2_FS_WRITEBUFFER=y
+# CONFIG_JFFS2_FS_WBUF_VERIFY is not set
+# CONFIG_JFFS2_SUMMARY is not set
+# CONFIG_JFFS2_FS_XATTR is not set
+# CONFIG_JFFS2_COMPRESSION_OPTIONS is not set
+CONFIG_JFFS2_ZLIB=y
+# CONFIG_JFFS2_LZO is not set
+CONFIG_JFFS2_RTIME=y
+# CONFIG_JFFS2_RUBIN is not set
+CONFIG_CRAMFS=y
+# CONFIG_VXFS_FS is not set
+# CONFIG_MINIX_FS is not set
+# CONFIG_HPFS_FS is not set
+# CONFIG_QNX4FS_FS is not set
+# CONFIG_ROMFS_FS is not set
+# CONFIG_SYSV_FS is not set
+# CONFIG_UFS_FS is not set
+# CONFIG_NETWORK_FILESYSTEMS is not set
+
+#
+# Partition Types
+#
+# CONFIG_PARTITION_ADVANCED is not set
+CONFIG_MSDOS_PARTITION=y
+CONFIG_NLS=y
+CONFIG_NLS_DEFAULT="iso8859-1"
+CONFIG_NLS_CODEPAGE_437=y
+# CONFIG_NLS_CODEPAGE_737 is not set
+# CONFIG_NLS_CODEPAGE_775 is not set
+# CONFIG_NLS_CODEPAGE_850 is not set
+# CONFIG_NLS_CODEPAGE_852 is not set
+# CONFIG_NLS_CODEPAGE_855 is not set
+# CONFIG_NLS_CODEPAGE_857 is not set
+# CONFIG_NLS_CODEPAGE_860 is not set
+# CONFIG_NLS_CODEPAGE_861 is not set
+# CONFIG_NLS_CODEPAGE_862 is not set
+# CONFIG_NLS_CODEPAGE_863 is not set
+# CONFIG_NLS_CODEPAGE_864 is not set
+# CONFIG_NLS_CODEPAGE_865 is not set
+# CONFIG_NLS_CODEPAGE_866 is not set
+# CONFIG_NLS_CODEPAGE_869 is not set
+# CONFIG_NLS_CODEPAGE_936 is not set
+# CONFIG_NLS_CODEPAGE_950 is not set
+CONFIG_NLS_CODEPAGE_932=y
+# CONFIG_NLS_CODEPAGE_949 is not set
+# CONFIG_NLS_CODEPAGE_874 is not set
+# CONFIG_NLS_ISO8859_8 is not set
+# CONFIG_NLS_CODEPAGE_1250 is not set
+# CONFIG_NLS_CODEPAGE_1251 is not set
+# CONFIG_NLS_ASCII is not set
+CONFIG_NLS_ISO8859_1=y
+# CONFIG_NLS_ISO8859_2 is not set
+# CONFIG_NLS_ISO8859_3 is not set
+# CONFIG_NLS_ISO8859_4 is not set
+# CONFIG_NLS_ISO8859_5 is not set
+# CONFIG_NLS_ISO8859_6 is not set
+# CONFIG_NLS_ISO8859_7 is not set
+# CONFIG_NLS_ISO8859_9 is not set
+# CONFIG_NLS_ISO8859_13 is not set
+# CONFIG_NLS_ISO8859_14 is not set
+# CONFIG_NLS_ISO8859_15 is not set
+# CONFIG_NLS_KOI8_R is not set
+# CONFIG_NLS_KOI8_U is not set
+# CONFIG_NLS_UTF8 is not set
+# CONFIG_DLM is not set
+
+#
+# Kernel hacking
+#
+CONFIG_TRACE_IRQFLAGS_SUPPORT=y
+# CONFIG_PRINTK_TIME is not set
+CONFIG_ENABLE_WARN_DEPRECATED=y
+CONFIG_ENABLE_MUST_CHECK=y
+# CONFIG_MAGIC_SYSRQ is not set
+# CONFIG_UNUSED_SYMBOLS is not set
+# CONFIG_DEBUG_FS is not set
+# CONFIG_HEADERS_CHECK is not set
+CONFIG_DEBUG_KERNEL=y
+# CONFIG_DEBUG_SHIRQ is not set
+# CONFIG_DETECT_SOFTLOCKUP is not set
+CONFIG_SCHED_DEBUG=y
+# CONFIG_SCHEDSTATS is not set
+# CONFIG_TIMER_STATS is not set
+# CONFIG_DEBUG_SLAB is not set
+# CONFIG_DEBUG_RT_MUTEXES is not set
+# CONFIG_RT_MUTEX_TESTER is not set
+# CONFIG_DEBUG_SPINLOCK is not set
+# CONFIG_DEBUG_MUTEXES is not set
+# CONFIG_DEBUG_LOCK_ALLOC is not set
+# CONFIG_PROVE_LOCKING is not set
+# CONFIG_LOCK_STAT is not set
+# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
+# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
+# CONFIG_DEBUG_KOBJECT is not set
+CONFIG_DEBUG_INFO=y
+# CONFIG_DEBUG_VM is not set
+# CONFIG_DEBUG_LIST is not set
+# CONFIG_DEBUG_SG is not set
+CONFIG_FRAME_POINTER=y
+# CONFIG_BOOT_PRINTK_DELAY is not set
+# CONFIG_RCU_TORTURE_TEST is not set
+# CONFIG_BACKTRACE_SELF_TEST is not set
+# CONFIG_FAULT_INJECTION is not set
+# CONFIG_SAMPLES is not set
+# CONFIG_SH_STANDARD_BIOS is not set
+# CONFIG_EARLY_SCIF_CONSOLE is not set
+# CONFIG_DEBUG_BOOTMEM is not set
+# CONFIG_DEBUG_STACKOVERFLOW is not set
+# CONFIG_DEBUG_STACK_USAGE is not set
+# CONFIG_4KSTACKS is not set
+# CONFIG_IRQSTACKS is not set
+# CONFIG_SH_KGDB is not set
+
+#
+# Security options
+#
+# CONFIG_KEYS is not set
+# CONFIG_SECURITY is not set
+# CONFIG_SECURITY_FILE_CAPABILITIES is not set
+CONFIG_CRYPTO=y
+CONFIG_CRYPTO_ALGAPI=y
+CONFIG_CRYPTO_AEAD=y
+CONFIG_CRYPTO_BLKCIPHER=y
+# CONFIG_CRYPTO_SEQIV is not set
+CONFIG_CRYPTO_HASH=y
+CONFIG_CRYPTO_MANAGER=y
+CONFIG_CRYPTO_HMAC=y
+# CONFIG_CRYPTO_XCBC is not set
+# CONFIG_CRYPTO_NULL is not set
+# CONFIG_CRYPTO_MD4 is not set
+CONFIG_CRYPTO_MD5=y
+CONFIG_CRYPTO_SHA1=y
+# CONFIG_CRYPTO_SHA256 is not set
+# CONFIG_CRYPTO_SHA512 is not set
+# CONFIG_CRYPTO_WP512 is not set
+# CONFIG_CRYPTO_TGR192 is not set
+# CONFIG_CRYPTO_GF128MUL is not set
+# CONFIG_CRYPTO_ECB is not set
+CONFIG_CRYPTO_CBC=y
+# CONFIG_CRYPTO_PCBC is not set
+# CONFIG_CRYPTO_LRW is not set
+# CONFIG_CRYPTO_XTS is not set
+# CONFIG_CRYPTO_CTR is not set
+# CONFIG_CRYPTO_GCM is not set
+# CONFIG_CRYPTO_CCM is not set
+# CONFIG_CRYPTO_CRYPTD is not set
+CONFIG_CRYPTO_DES=y
+# CONFIG_CRYPTO_FCRYPT is not set
+# CONFIG_CRYPTO_BLOWFISH is not set
+# CONFIG_CRYPTO_TWOFISH is not set
+# CONFIG_CRYPTO_SERPENT is not set
+# CONFIG_CRYPTO_AES is not set
+# CONFIG_CRYPTO_CAST5 is not set
+# CONFIG_CRYPTO_CAST6 is not set
+# CONFIG_CRYPTO_TEA is not set
+# CONFIG_CRYPTO_ARC4 is not set
+# CONFIG_CRYPTO_KHAZAD is not set
+# CONFIG_CRYPTO_ANUBIS is not set
+# CONFIG_CRYPTO_SEED is not set
+# CONFIG_CRYPTO_SALSA20 is not set
+CONFIG_CRYPTO_DEFLATE=y
+# CONFIG_CRYPTO_MICHAEL_MIC is not set
+# CONFIG_CRYPTO_CRC32C is not set
+# CONFIG_CRYPTO_CAMELLIA is not set
+# CONFIG_CRYPTO_TEST is not set
+CONFIG_CRYPTO_AUTHENC=y
+# CONFIG_CRYPTO_LZO is not set
+CONFIG_CRYPTO_HW=y
+
+#
+# Library routines
+#
+CONFIG_BITREVERSE=y
+CONFIG_CRC_CCITT=y
+# CONFIG_CRC16 is not set
+# CONFIG_CRC_ITU_T is not set
+CONFIG_CRC32=y
+# CONFIG_CRC7 is not set
+# CONFIG_LIBCRC32C is not set
+CONFIG_ZLIB_INFLATE=y
+CONFIG_ZLIB_DEFLATE=y
+CONFIG_PLIST=y
+CONFIG_HAS_IOMEM=y
+CONFIG_HAS_IOPORT=y
+CONFIG_HAS_DMA=y
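
The new se7721_defconfig ends here. Assuming the standard kconfig targets, it
would be selected with "make ARCH=sh se7721_defconfig" and then built as
usual; the board-specific options it turns on are CONFIG_CPU_SUBTYPE_SH7721
and CONFIG_SH_7721_SOLUTION_ENGINE above.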
index 1c3b99642e1c3df4d2d90b37f26a1482655c27d4..01ff4d05aab0303c38cdad719302db33fd904a59 100644 (file)
@@ -83,6 +83,8 @@ static int __init cf_init_default(void)
 #include <asm/se.h>
 #elif defined(CONFIG_SH_7722_SOLUTION_ENGINE)
 #include <asm/se7722.h>
+#elif defined(CONFIG_SH_7721_SOLUTION_ENGINE)
+#include <asm/se7721.h>
 #endif
 
 /*
@@ -99,7 +101,9 @@ static int __init cf_init_default(void)
  * 0xB0600000 : I/O
  */
 
-#if defined(CONFIG_SH_SOLUTION_ENGINE) || defined(CONFIG_SH_7722_SOLUTION_ENGINE) 
+#if defined(CONFIG_SH_SOLUTION_ENGINE) || \
+    defined(CONFIG_SH_7722_SOLUTION_ENGINE) || \
+    defined(CONFIG_SH_7721_SOLUTION_ENGINE)
 static int __init cf_init_se(void)
 {
        if ((ctrl_inw(MRSHPC_CSR) & 0x000c) != 0)
@@ -112,7 +116,7 @@ static int __init cf_init_se(void)
        }
 
        /*
-        *  PC-Card window open 
+        *  PC-Card window open
         *  flag == COMMON/ATTRIBUTE/IO
         */
        /* common window open */
@@ -122,7 +126,7 @@ static int __init cf_init_se(void)
                ctrl_outw(0x0b00, MRSHPC_MW0CR2);
        else
                /* common mode & bus width 16bit SWAP = 0*/
-               ctrl_outw(0x0300, MRSHPC_MW0CR2); 
+               ctrl_outw(0x0300, MRSHPC_MW0CR2);
 
        /* attribute window open */
        ctrl_outw(0x8a85, MRSHPC_MW1CR1);
@@ -155,10 +159,9 @@ static int __init cf_init_se(void)
 
 int __init cf_init(void)
 {
-       if( mach_is_se() || mach_is_7722se() ){
+       if (mach_is_se() || mach_is_7722se() || mach_is_7721se())
                return cf_init_se();
-       }
-       
+
        return cf_init_default();
 }
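
The cf-enabler changes extend the Solution Engine code path to the new 7721
board: the se7721 header is pulled in, the preprocessor guard gains
CONFIG_SH_7721_SOLUTION_ENGINE, and cf_init() now also checks
mach_is_7721se(), a predicate presumably generated from the 7721SE entry
added to arch/sh/tools/mach-types later in this diff. The remaining hunks in
this file are trailing-whitespace and brace-style cleanup only.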
 
index b279cdc3a23305856300299c51ffefad487b2cd6..7e2b90cfa7bf88492e08d253fb100bad5f261273 100644 (file)
@@ -8,6 +8,7 @@ common-y        += $(addprefix ../sh2/, ex.o entry.o)
 
 obj-$(CONFIG_SH_FPU)   += fpu.o
 
-obj-$(CONFIG_CPU_SUBTYPE_SH7206) += setup-sh7206.o clock-sh7206.o
-obj-$(CONFIG_CPU_SUBTYPE_SH7203) += setup-sh7203.o clock-sh7203.o
-obj-$(CONFIG_CPU_SUBTYPE_SH7263) += setup-sh7203.o clock-sh7203.o
+obj-$(CONFIG_CPU_SUBTYPE_SH7206)       += setup-sh7206.o clock-sh7206.o
+obj-$(CONFIG_CPU_SUBTYPE_SH7203)       += setup-sh7203.o clock-sh7203.o
+obj-$(CONFIG_CPU_SUBTYPE_SH7263)       += setup-sh7203.o clock-sh7203.o
+obj-$(CONFIG_CPU_SUBTYPE_MXG)          += setup-mxg.o clock-sh7206.o
index 6910e2664468cedb00cc16378483646158671e6d..6e79132f6f3047dc275bd912de3d95710fd2f8d1 100644 (file)
@@ -29,6 +29,9 @@ int __init detect_cpu_and_cache_system(void)
        boot_cpu_data.type                      = CPU_SH7206;
        /* While SH7206 has a DSP.. */
        boot_cpu_data.flags                     |= CPU_HAS_DSP;
+#elif defined(CONFIG_CPU_SUBTYPE_MXG)
+       boot_cpu_data.type                      = CPU_MXG;
+       boot_cpu_data.flags                     |= CPU_HAS_DSP;
 #endif
 
        boot_cpu_data.dcache.ways               = 4;
diff --git a/arch/sh/kernel/cpu/sh2a/setup-mxg.c b/arch/sh/kernel/cpu/sh2a/setup-mxg.c
new file mode 100644 (file)
index 0000000..e611d79
--- /dev/null
+++ b/arch/sh/kernel/cpu/sh2a/setup-mxg.c
@@ -0,0 +1,168 @@
+/*
+ * Renesas MX-G (R8A03022BG) Setup
+ *
+ *  Copyright (C) 2008  Paul Mundt
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ */
+#include <linux/platform_device.h>
+#include <linux/init.h>
+#include <linux/serial.h>
+#include <linux/serial_sci.h>
+
+enum {
+       UNUSED = 0,
+
+       /* interrupt sources */
+       IRQ0, IRQ1, IRQ2, IRQ3, IRQ4, IRQ5, IRQ6, IRQ7,
+       IRQ8, IRQ9, IRQ10, IRQ11, IRQ12, IRQ13, IRQ14, IRQ15,
+
+       PINT0, PINT1, PINT2, PINT3, PINT4, PINT5, PINT6, PINT7,
+
+       SINT8, SINT7, SINT6, SINT5, SINT4, SINT3, SINT2, SINT1,
+
+       SCIF0_BRI, SCIF0_ERI, SCIF0_RXI, SCIF0_TXI,
+       SCIF1_BRI, SCIF1_ERI, SCIF1_RXI, SCIF1_TXI,
+
+       MTU2_TGI0A, MTU2_TGI0B, MTU2_TGI0C, MTU2_TGI0D,
+       MTU2_TCI0V, MTU2_TGI0E, MTU2_TGI0F,
+       MTU2_TGI1A, MTU2_TGI1B, MTU2_TCI1V, MTU2_TCI1U,
+       MTU2_TGI2A, MTU2_TGI2B, MTU2_TCI2V, MTU2_TCI2U,
+       MTU2_TGI3A, MTU2_TGI3B, MTU2_TGI3C, MTU2_TGI3D, MTU2_TCI3V,
+       MTU2_TGI4A, MTU2_TGI4B, MTU2_TGI4C, MTU2_TGI4D, MTU2_TCI4V,
+       MTU2_TGI5U, MTU2_TGI5V, MTU2_TGI5W,
+
+       /* interrupt groups */
+       PINT, SCIF0, SCIF1,
+       MTU2_GROUP1, MTU2_GROUP2, MTU2_GROUP3, MTU2_GROUP4, MTU2_GROUP5
+};
+
+static struct intc_vect vectors[] __initdata = {
+       INTC_IRQ(IRQ0, 64), INTC_IRQ(IRQ1, 65),
+       INTC_IRQ(IRQ2, 66), INTC_IRQ(IRQ3, 67),
+       INTC_IRQ(IRQ4, 68), INTC_IRQ(IRQ5, 69),
+       INTC_IRQ(IRQ6, 70), INTC_IRQ(IRQ7, 71),
+       INTC_IRQ(IRQ8, 72), INTC_IRQ(IRQ9, 73),
+       INTC_IRQ(IRQ10, 74), INTC_IRQ(IRQ11, 75),
+       INTC_IRQ(IRQ12, 76), INTC_IRQ(IRQ13, 77),
+       INTC_IRQ(IRQ14, 78), INTC_IRQ(IRQ15, 79),
+
+       INTC_IRQ(PINT0, 80), INTC_IRQ(PINT1, 81),
+       INTC_IRQ(PINT2, 82), INTC_IRQ(PINT3, 83),
+       INTC_IRQ(PINT4, 84), INTC_IRQ(PINT5, 85),
+       INTC_IRQ(PINT6, 86), INTC_IRQ(PINT7, 87),
+
+       INTC_IRQ(SINT8, 94), INTC_IRQ(SINT7, 95),
+       INTC_IRQ(SINT6, 96), INTC_IRQ(SINT5, 97),
+       INTC_IRQ(SINT4, 98), INTC_IRQ(SINT3, 99),
+       INTC_IRQ(SINT2, 100), INTC_IRQ(SINT1, 101),
+
+       INTC_IRQ(SCIF0_RXI, 220), INTC_IRQ(SCIF0_TXI, 221),
+       INTC_IRQ(SCIF0_BRI, 222), INTC_IRQ(SCIF0_ERI, 223),
+       INTC_IRQ(SCIF1_RXI, 224), INTC_IRQ(SCIF1_TXI, 225),
+       INTC_IRQ(SCIF1_BRI, 226), INTC_IRQ(SCIF1_ERI, 227),
+
+       INTC_IRQ(MTU2_TGI0A, 228), INTC_IRQ(MTU2_TGI0B, 229),
+       INTC_IRQ(MTU2_TGI0C, 230), INTC_IRQ(MTU2_TGI0D, 231),
+       INTC_IRQ(MTU2_TCI0V, 232), INTC_IRQ(MTU2_TGI0E, 233),
+
+       INTC_IRQ(MTU2_TGI0F, 234), INTC_IRQ(MTU2_TGI1A, 235),
+       INTC_IRQ(MTU2_TGI1B, 236), INTC_IRQ(MTU2_TCI1V, 237),
+       INTC_IRQ(MTU2_TCI1U, 238), INTC_IRQ(MTU2_TGI2A, 239),
+
+       INTC_IRQ(MTU2_TGI2B, 240), INTC_IRQ(MTU2_TCI2V, 241),
+       INTC_IRQ(MTU2_TCI2U, 242), INTC_IRQ(MTU2_TGI3A, 243),
+
+       INTC_IRQ(MTU2_TGI3B, 244),
+       INTC_IRQ(MTU2_TGI3C, 245),
+
+       INTC_IRQ(MTU2_TGI3D, 246), INTC_IRQ(MTU2_TCI3V, 247),
+       INTC_IRQ(MTU2_TGI4A, 248), INTC_IRQ(MTU2_TGI4B, 249),
+       INTC_IRQ(MTU2_TGI4C, 250), INTC_IRQ(MTU2_TGI4D, 251),
+
+       INTC_IRQ(MTU2_TCI4V, 252), INTC_IRQ(MTU2_TGI5U, 253),
+       INTC_IRQ(MTU2_TGI5V, 254), INTC_IRQ(MTU2_TGI5W, 255),
+};
+
+static struct intc_group groups[] __initdata = {
+       INTC_GROUP(PINT, PINT0, PINT1, PINT2, PINT3,
+                  PINT4, PINT5, PINT6, PINT7),
+       INTC_GROUP(MTU2_GROUP1, MTU2_TGI0A, MTU2_TGI0B, MTU2_TGI0C, MTU2_TGI0D,
+                  MTU2_TCI0V, MTU2_TGI0E),
+       INTC_GROUP(MTU2_GROUP2, MTU2_TGI0F, MTU2_TGI1A, MTU2_TGI1B,
+                  MTU2_TCI1V, MTU2_TCI1U, MTU2_TGI2A),
+       INTC_GROUP(MTU2_GROUP3, MTU2_TGI2B, MTU2_TCI2V, MTU2_TCI2U,
+                  MTU2_TGI3A),
+       INTC_GROUP(MTU2_GROUP4, MTU2_TGI3D, MTU2_TCI3V, MTU2_TGI4A,
+                  MTU2_TGI4B, MTU2_TGI4C, MTU2_TGI4D),
+       INTC_GROUP(MTU2_GROUP5, MTU2_TCI4V, MTU2_TGI5U, MTU2_TGI5V, MTU2_TGI5W),
+       INTC_GROUP(SCIF0, SCIF0_BRI, SCIF0_ERI, SCIF0_RXI, SCIF0_TXI),
+       INTC_GROUP(SCIF1, SCIF1_BRI, SCIF1_ERI, SCIF1_RXI, SCIF1_TXI),
+};
+
+static struct intc_prio_reg prio_registers[] __initdata = {
+       { 0xfffd9418, 0, 16, 4, /* IPR01 */ { IRQ0, IRQ1, IRQ2, IRQ3 } },
+       { 0xfffd941a, 0, 16, 4, /* IPR02 */ { IRQ4, IRQ5, IRQ6, IRQ7 } },
+       { 0xfffd941c, 0, 16, 4, /* IPR03 */ { IRQ8, IRQ9, IRQ10, IRQ11 } },
+       { 0xfffd941e, 0, 16, 4, /* IPR04 */ { IRQ12, IRQ13, IRQ14, IRQ15 } },
+       { 0xfffd9420, 0, 16, 4, /* IPR05 */ { PINT, 0, 0, 0 } },
+       { 0xfffd9800, 0, 16, 4, /* IPR06 */ { } },
+       { 0xfffd9802, 0, 16, 4, /* IPR07 */ { } },
+       { 0xfffd9804, 0, 16, 4, /* IPR08 */ { } },
+       { 0xfffd9806, 0, 16, 4, /* IPR09 */ { } },
+       { 0xfffd9808, 0, 16, 4, /* IPR10 */ { } },
+       { 0xfffd980a, 0, 16, 4, /* IPR11 */ { } },
+       { 0xfffd980c, 0, 16, 4, /* IPR12 */ { } },
+       { 0xfffd980e, 0, 16, 4, /* IPR13 */ { } },
+       { 0xfffd9810, 0, 16, 4, /* IPR14 */ { 0, 0, 0, SCIF0 } },
+       { 0xfffd9812, 0, 16, 4, /* IPR15 */
+               { SCIF1, MTU2_GROUP1, MTU2_GROUP2, MTU2_GROUP3 } },
+       { 0xfffd9814, 0, 16, 4, /* IPR16 */
+               { MTU2_TGI3B, MTU2_TGI3C, MTU2_GROUP4, MTU2_GROUP5 } },
+};
+
+static struct intc_mask_reg mask_registers[] __initdata = {
+       { 0xfffd9408, 0, 16, /* PINTER */
+         { 0, 0, 0, 0, 0, 0, 0, 0,
+           PINT7, PINT6, PINT5, PINT4, PINT3, PINT2, PINT1, PINT0 } },
+};
+
+static DECLARE_INTC_DESC(intc_desc, "mxg", vectors, groups,
+                        mask_registers, prio_registers, NULL);
+
+static struct plat_sci_port sci_platform_data[] = {
+       {
+               .mapbase        = 0xff804000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCIF,
+               .irqs           = { 223, 220, 221, 222 },
+       }, {
+               .flags = 0,
+       }
+};
+
+static struct platform_device sci_device = {
+       .name           = "sh-sci",
+       .id             = -1,
+       .dev            = {
+               .platform_data  = sci_platform_data,
+       },
+};
+
+static struct platform_device *mxg_devices[] __initdata = {
+       &sci_device,
+};
+
+static int __init mxg_devices_setup(void)
+{
+       return platform_add_devices(mxg_devices,
+                                   ARRAY_SIZE(mxg_devices));
+}
+__initcall(mxg_devices_setup);
+
+void __init plat_irq_setup(void)
+{
+       register_intc_controller(&intc_desc);
+}
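
setup-mxg.c follows the same pattern as the other sh2a setup files: an intc
vector/group/priority description registered from plat_irq_setup(), plus a
single SCIF port. The irqs tuple { 223, 220, 221, 222 } appears to follow the
{ ERI, RXI, TXI, BRI } ordering expected by the sh-sci driver, matching the
SCIF0 vectors declared above (RXI 220, TXI 221, BRI 222, ERI 223).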
index 9e89984c4f1d3930d8d15bce210347806eeb7a7b..ebceb0dadff58fc70026f646fe507b48f29c6a2d 100644 (file)
@@ -53,7 +53,7 @@ int __init detect_cpu_and_cache_system(void)
        /*
         * Setup some generic flags we can probe on SH-4A parts
         */
-       if (((pvr >> 16) & 0xff) == 0x10) {
+       if (((pvr >> 24) & 0xff) == 0x10) {
                if ((cvr & 0x10000000) == 0)
                        boot_cpu_data.flags |= CPU_HAS_DSP;
 
@@ -126,17 +126,22 @@ int __init detect_cpu_and_cache_system(void)
                                          CPU_HAS_LLSC;
                break;
        case 0x3008:
-               if (prr == 0xa0 || prr == 0xa1) {
-                       boot_cpu_data.type = CPU_SH7722;
-                       boot_cpu_data.icache.ways = 4;
-                       boot_cpu_data.dcache.ways = 4;
-                       boot_cpu_data.flags |= CPU_HAS_LLSC;
-               }
-               else if (prr == 0x70) {
+               boot_cpu_data.icache.ways = 4;
+               boot_cpu_data.dcache.ways = 4;
+               boot_cpu_data.flags |= CPU_HAS_LLSC;
+
+               switch (prr) {
+               case 0x50:
+                       boot_cpu_data.type = CPU_SH7723;
+                       boot_cpu_data.flags |= CPU_HAS_FPU | CPU_HAS_L2_CACHE;
+                       break;
+               case 0x70:
                        boot_cpu_data.type = CPU_SH7366;
-                       boot_cpu_data.icache.ways = 4;
-                       boot_cpu_data.dcache.ways = 4;
-                       boot_cpu_data.flags |= CPU_HAS_LLSC;
+                       break;
+               case 0xa0:
+               case 0xa1:
+                       boot_cpu_data.type = CPU_SH7722;
+                       break;
                }
                break;
        case 0x4000:    /* 1st cut */
@@ -215,6 +220,12 @@ int __init detect_cpu_and_cache_system(void)
         * SH-4A's have an optional PIPT L2.
         */
        if (boot_cpu_data.flags & CPU_HAS_L2_CACHE) {
+               /* Bug if we can't decode the L2 info */
+               BUG_ON(!(cvr & 0xf));
+
+               /* Silicon and specifications have clearly never met.. */
+               cvr ^= 0xf;
+
                /*
                 * Size calculation is much more sensible
                 * than it is for the L1.
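
Two separate fixes are folded into this probe.c hunk. First, the SH-4A family
check now reads the top byte of the PVR rather than bits 23:16; a small
sketch of the corrected field extraction, with the helper name being
illustrative only:

	/* Assumption based on the fix above: the SH-4A family code occupies
	 * PVR bits 31:24, so it must be extracted with a 24-bit shift. */
	static inline unsigned int pvr_family(unsigned long pvr)
	{
		return (pvr >> 24) & 0xff;
	}

Second, the 0x3008 CVR case is restructured as a switch so the common
cache-ways/LLSC setup is shared, and the SH7723 (prr 0x50) is detected with
FPU and L2 cache flags. The added BUG_ON() and the cvr ^= 0xf fixup guard the
L2 way decoding that follows.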
index 5d890ac8e793559da9d8baddd552dde2314263f7..a880e7968750a63aa3aba1ae4ea3fcd477d06b82 100644 (file)
@@ -9,6 +9,7 @@ obj-$(CONFIG_CPU_SUBTYPE_SH7780)        += setup-sh7780.o
 obj-$(CONFIG_CPU_SUBTYPE_SH7785)       += setup-sh7785.o
 obj-$(CONFIG_CPU_SUBTYPE_SH7343)       += setup-sh7343.o
 obj-$(CONFIG_CPU_SUBTYPE_SH7722)       += setup-sh7722.o
+obj-$(CONFIG_CPU_SUBTYPE_SH7723)       += setup-sh7723.o
 obj-$(CONFIG_CPU_SUBTYPE_SH7366)       += setup-sh7366.o
 obj-$(CONFIG_CPU_SUBTYPE_SHX3)         += setup-shx3.o
 
@@ -22,6 +23,7 @@ clock-$(CONFIG_CPU_SUBTYPE_SH7780)    := clock-sh7780.o
 clock-$(CONFIG_CPU_SUBTYPE_SH7785)     := clock-sh7785.o
 clock-$(CONFIG_CPU_SUBTYPE_SH7343)     := clock-sh7343.o
 clock-$(CONFIG_CPU_SUBTYPE_SH7722)     := clock-sh7722.o
+clock-$(CONFIG_CPU_SUBTYPE_SH7723)     := clock-sh7722.o
 clock-$(CONFIG_CPU_SUBTYPE_SH7366)     := clock-sh7722.o
 clock-$(CONFIG_CPU_SUBTYPE_SHX3)       := clock-shx3.o
 
index b98b4bc93ec9df41bfd5b61ada70a65b3bd8ba99..069314037049d725f5da27a13d738a04a25d1853 100644 (file)
 
 static struct resource usbf_resources[] = {
        [0] = {
-               .name   = "m66592_udc",
-               .start  = 0xA4480000,
-               .end    = 0xA44800FF,
+               .name   = "USBF",
+               .start  = 0x04480000,
+               .end    = 0x044800FF,
                .flags  = IORESOURCE_MEM,
        },
        [1] = {
-               .name   = "m66592_udc",
                .start  = 65,
                .end    = 65,
                .flags  = IORESOURCE_IRQ,
@@ -40,6 +39,26 @@ static struct platform_device usbf_device = {
        .resource       = usbf_resources,
 };
 
+static struct resource iic_resources[] = {
+       [0] = {
+               .name   = "IIC",
+               .start  = 0x04470000,
+               .end    = 0x04470017,
+               .flags  = IORESOURCE_MEM,
+       },
+       [1] = {
+               .start  = 96,
+               .end    = 99,
+               .flags  = IORESOURCE_IRQ,
+       },
+};
+
+static struct platform_device iic_device = {
+       .name           = "i2c-sh_mobile",
+       .num_resources  = ARRAY_SIZE(iic_resources),
+       .resource       = iic_resources,
+};
+
 static struct plat_sci_port sci_platform_data[] = {
        {
                .mapbase        = 0xffe00000,
@@ -74,6 +93,7 @@ static struct platform_device sci_device = {
 
 static struct platform_device *sh7722_devices[] __initdata = {
        &usbf_device,
+       &iic_device,
        &sci_device,
 };
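
For the SH7722, the USBF resource block is renamed and its MEM resource moves
from the P2 alias 0xA4480000 to what looks like the corresponding physical
address 0x04480000 (on these parts P2 mirrors physical memory at
+0xA0000000), and a new i2c-sh_mobile platform device is registered covering
0x04470000-0x04470017 with IRQs 96-99.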
 
diff --git a/arch/sh/kernel/cpu/sh4a/setup-sh7723.c b/arch/sh/kernel/cpu/sh4a/setup-sh7723.c
new file mode 100644 (file)
index 0000000..16925cf
--- /dev/null
+++ b/arch/sh/kernel/cpu/sh4a/setup-sh7723.c
@@ -0,0 +1,300 @@
+/*
+ * SH7723 Setup
+ *
+ *  Copyright (C) 2008  Paul Mundt
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ */
+#include <linux/platform_device.h>
+#include <linux/init.h>
+#include <linux/serial.h>
+#include <linux/mm.h>
+#include <linux/serial_sci.h>
+#include <asm/mmzone.h>
+
+static struct plat_sci_port sci_platform_data[] = {
+       {
+               .mapbase        = 0xa4e30000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCI,
+               .irqs           = { 56, 56, 56, 56 },
+       },{
+               .mapbase        = 0xa4e40000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCI,
+               .irqs           = { 88, 88, 88, 88 },
+       },{
+               .mapbase        = 0xa4e50000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCI,
+               .irqs           = { 109, 109, 109, 109 },
+       }, {
+               .flags = 0,
+       }
+};
+
+static struct platform_device sci_device = {
+       .name           = "sh-sci",
+       .id             = -1,
+       .dev            = {
+               .platform_data  = sci_platform_data,
+       },
+};
+
+static struct resource rtc_resources[] = {
+       [0] = {
+               .start  = 0xa465fec0,
+               .end    = 0xa465fec0 + 0x58 - 1,
+               .flags  = IORESOURCE_IO,
+       },
+       [1] = {
+               /* Period IRQ */
+               .start  = 69,
+               .flags  = IORESOURCE_IRQ,
+       },
+       [2] = {
+               /* Carry IRQ */
+               .start  = 70,
+               .flags  = IORESOURCE_IRQ,
+       },
+       [3] = {
+               /* Alarm IRQ */
+               .start  = 68,
+               .flags  = IORESOURCE_IRQ,
+       },
+};
+
+static struct platform_device rtc_device = {
+       .name           = "sh-rtc",
+       .id             = -1,
+       .num_resources  = ARRAY_SIZE(rtc_resources),
+       .resource       = rtc_resources,
+};
+
+static struct platform_device *sh7723_devices[] __initdata = {
+       &sci_device,
+       &rtc_device,
+};
+
+static int __init sh7723_devices_setup(void)
+{
+       return platform_add_devices(sh7723_devices,
+                                   ARRAY_SIZE(sh7723_devices));
+}
+__initcall(sh7723_devices_setup);
+
+enum {
+       UNUSED=0,
+
+       /* interrupt sources */
+       IRQ0, IRQ1, IRQ2, IRQ3, IRQ4, IRQ5, IRQ6, IRQ7,
+       HUDI,
+       DMAC1A_DEI0,DMAC1A_DEI1,DMAC1A_DEI2,DMAC1A_DEI3,
+       _2DG_TRI,_2DG_INI,_2DG_CEI,
+       DMAC0A_DEI0,DMAC0A_DEI1,DMAC0A_DEI2,DMAC0A_DEI3,
+       VIO_CEUI,VIO_BEUI,VIO_VEU2HI,VIO_VOUI,
+       SCIFA_SCIFA0,
+       VPU_VPUI,
+       TPU_TPUI,
+       ADC_ADI,
+       USB_USI0,
+       RTC_ATI,RTC_PRI,RTC_CUI,
+       DMAC1B_DEI4,DMAC1B_DEI5,DMAC1B_DADERR,
+       DMAC0B_DEI4,DMAC0B_DEI5,DMAC0B_DADERR,
+       KEYSC_KEYI,
+       SCIF_SCIF0,SCIF_SCIF1,SCIF_SCIF2,
+       MSIOF_MSIOFI0,MSIOF_MSIOFI1,
+       SCIFA_SCIFA1,
+       FLCTL_FLSTEI,FLCTL_FLTENDI,FLCTL_FLTREQ0I,FLCTL_FLTREQ1I,
+       I2C_ALI,I2C_TACKI,I2C_WAITI,I2C_DTEI,
+       SDHI0_SDHII0,SDHI0_SDHII1,SDHI0_SDHII2,
+       CMT_CMTI,
+       TSIF_TSIFI,
+       SIU_SIUI,
+       SCIFA_SCIFA2,
+       TMU0_TUNI0, TMU0_TUNI1, TMU0_TUNI2,
+       IRDA_IRDAI,
+       ATAPI_ATAPII,
+       SDHI1_SDHII0,SDHI1_SDHII1,SDHI1_SDHII2,
+       VEU2H1_VEU2HI,
+       LCDC_LCDCI,
+       TMU1_TUNI0,TMU1_TUNI1,TMU1_TUNI2,
+
+       /* interrupt groups */
+       DMAC1A, DMAC0A, VIO, DMAC0B, FLCTL, I2C, _2DG,
+       SDHI1, RTC, DMAC1B, SDHI0,
+};
+
+static struct intc_vect vectors[] __initdata = {
+       INTC_VECT(IRQ0, 0x600), INTC_VECT(IRQ1, 0x620),
+       INTC_VECT(IRQ2, 0x640), INTC_VECT(IRQ3, 0x660),
+       INTC_VECT(IRQ4, 0x680), INTC_VECT(IRQ5, 0x6a0),
+       INTC_VECT(IRQ6, 0x6c0), INTC_VECT(IRQ7, 0x6e0),
+
+       INTC_VECT(DMAC1A_DEI0,0x700),
+       INTC_VECT(DMAC1A_DEI1,0x720),
+       INTC_VECT(DMAC1A_DEI2,0x740),
+       INTC_VECT(DMAC1A_DEI3,0x760),
+
+       INTC_VECT(_2DG_TRI, 0x780),
+       INTC_VECT(_2DG_INI, 0x7A0),
+       INTC_VECT(_2DG_CEI, 0x7C0),
+
+       INTC_VECT(DMAC0A_DEI0,0x800),
+       INTC_VECT(DMAC0A_DEI1,0x820),
+       INTC_VECT(DMAC0A_DEI2,0x840),
+       INTC_VECT(DMAC0A_DEI3,0x860),
+
+       INTC_VECT(VIO_CEUI,0x880),
+       INTC_VECT(VIO_BEUI,0x8A0),
+       INTC_VECT(VIO_VEU2HI,0x8C0),
+       INTC_VECT(VIO_VOUI,0x8E0),
+
+       INTC_VECT(SCIFA_SCIFA0,0x900),
+       INTC_VECT(VPU_VPUI,0x920),
+       INTC_VECT(TPU_TPUI,0x9A0),
+       INTC_VECT(ADC_ADI,0x9E0),
+       INTC_VECT(USB_USI0,0xA20),
+
+       INTC_VECT(RTC_ATI,0xA80),
+       INTC_VECT(RTC_PRI,0xAA0),
+       INTC_VECT(RTC_CUI,0xAC0),
+
+       INTC_VECT(DMAC1B_DEI4,0xB00),
+       INTC_VECT(DMAC1B_DEI5,0xB20),
+       INTC_VECT(DMAC1B_DADERR,0xB40),
+
+       INTC_VECT(DMAC0B_DEI4,0xB80),
+       INTC_VECT(DMAC0B_DEI5,0xBA0),
+       INTC_VECT(DMAC0B_DADERR,0xBC0),
+
+       INTC_VECT(KEYSC_KEYI,0xBE0),
+       INTC_VECT(SCIF_SCIF0,0xC00),
+       INTC_VECT(SCIF_SCIF1,0xC20),
+       INTC_VECT(SCIF_SCIF2,0xC40),
+       INTC_VECT(MSIOF_MSIOFI0,0xC80),
+       INTC_VECT(MSIOF_MSIOFI1,0xCA0),
+       INTC_VECT(SCIFA_SCIFA1,0xD00),
+
+       INTC_VECT(FLCTL_FLSTEI,0xD80),
+       INTC_VECT(FLCTL_FLTENDI,0xDA0),
+       INTC_VECT(FLCTL_FLTREQ0I,0xDC0),
+       INTC_VECT(FLCTL_FLTREQ1I,0xDE0),
+
+       INTC_VECT(I2C_ALI,0xE00),
+       INTC_VECT(I2C_TACKI,0xE20),
+       INTC_VECT(I2C_WAITI,0xE40),
+       INTC_VECT(I2C_DTEI,0xE60),
+
+       INTC_VECT(SDHI0_SDHII0,0xE80),
+       INTC_VECT(SDHI0_SDHII1,0xEA0),
+       INTC_VECT(SDHI0_SDHII2,0xEC0),
+
+       INTC_VECT(CMT_CMTI,0xF00),
+       INTC_VECT(TSIF_TSIFI,0xF20),
+       INTC_VECT(SIU_SIUI,0xF80),
+       INTC_VECT(SCIFA_SCIFA2,0xFA0),
+
+       INTC_VECT(TMU0_TUNI0,0x400),
+       INTC_VECT(TMU0_TUNI1,0x420),
+       INTC_VECT(TMU0_TUNI2,0x440),
+
+       INTC_VECT(IRDA_IRDAI,0x480),
+       INTC_VECT(ATAPI_ATAPII,0x4A0),
+
+       INTC_VECT(SDHI1_SDHII0,0x4E0),
+       INTC_VECT(SDHI1_SDHII1,0x500),
+       INTC_VECT(SDHI1_SDHII2,0x520),
+
+       INTC_VECT(VEU2H1_VEU2HI,0x560),
+       INTC_VECT(LCDC_LCDCI,0x580),
+
+       INTC_VECT(TMU1_TUNI0,0x920),
+       INTC_VECT(TMU1_TUNI1,0x940),
+       INTC_VECT(TMU1_TUNI2,0x960),
+
+};
+
+static struct intc_group groups[] __initdata = {
+       INTC_GROUP(DMAC1A,DMAC1A_DEI0,DMAC1A_DEI1,DMAC1A_DEI2,DMAC1A_DEI3),
+       INTC_GROUP(DMAC0A,DMAC0A_DEI0,DMAC0A_DEI1,DMAC0A_DEI2,DMAC0A_DEI3),
+       INTC_GROUP(VIO, VIO_CEUI,VIO_BEUI,VIO_VEU2HI,VIO_VOUI),
+       INTC_GROUP(DMAC0B, DMAC0B_DEI4,DMAC0B_DEI5,DMAC0B_DADERR),
+       INTC_GROUP(FLCTL,FLCTL_FLSTEI,FLCTL_FLTENDI,FLCTL_FLTREQ0I,FLCTL_FLTREQ1I),
+       INTC_GROUP(I2C,I2C_ALI,I2C_TACKI,I2C_WAITI,I2C_DTEI),
+       INTC_GROUP(_2DG, _2DG_TRI,_2DG_INI,_2DG_CEI),
+       INTC_GROUP(SDHI1, SDHI1_SDHII0,SDHI1_SDHII1,SDHI1_SDHII2),
+       INTC_GROUP(RTC, RTC_ATI,RTC_PRI,RTC_CUI),
+       INTC_GROUP(DMAC1B, DMAC1B_DEI4,DMAC1B_DEI5,DMAC1B_DADERR),
+       INTC_GROUP(SDHI0,SDHI0_SDHII0,SDHI0_SDHII1,SDHI0_SDHII2),
+};
+
+static struct intc_mask_reg mask_registers[] __initdata = {
+       { 0xa4080080, 0xa40800c0, 8, /* IMR0 / IMCR0 */
+         { 0,  TMU1_TUNI2,TMU1_TUNI1,TMU1_TUNI0,0,SDHI1_SDHII2,SDHI1_SDHII1,SDHI1_SDHII0} },
+       { 0xa4080084, 0xa40800c4, 8, /* IMR1 / IMCR1 */
+         { VIO_VOUI, VIO_VEU2HI,VIO_BEUI,VIO_CEUI,DMAC0A_DEI3,DMAC0A_DEI2,DMAC0A_DEI1,DMAC0A_DEI0 } },
+       { 0xa4080088, 0xa40800c8, 8, /* IMR2 / IMCR2 */
+         { 0, 0, 0, VPU_VPUI,0,0,0,SCIFA_SCIFA0 } },
+       { 0xa408008c, 0xa40800cc, 8, /* IMR3 / IMCR3 */
+         { DMAC1A_DEI3,DMAC1A_DEI2,DMAC1A_DEI1,DMAC1A_DEI0,0,0,0,IRDA_IRDAI } },
+       { 0xa4080090, 0xa40800d0, 8, /* IMR4 / IMCR4 */
+         { 0,TMU0_TUNI2,TMU0_TUNI1,TMU0_TUNI0,VEU2H1_VEU2HI,0,0,LCDC_LCDCI } },
+       { 0xa4080094, 0xa40800d4, 8, /* IMR5 / IMCR5 */
+         { KEYSC_KEYI,DMAC0B_DADERR,DMAC0B_DEI5,DMAC0B_DEI4,0,SCIF_SCIF2,SCIF_SCIF1,SCIF_SCIF0 } },
+       { 0xa4080098, 0xa40800d8, 8, /* IMR6 / IMCR6 */
+         { 0,0,0,SCIFA_SCIFA1,ADC_ADI,0,MSIOF_MSIOFI1,MSIOF_MSIOFI0 } },
+       { 0xa408009c, 0xa40800dc, 8, /* IMR7 / IMCR7 */
+         { I2C_DTEI, I2C_WAITI, I2C_TACKI, I2C_ALI,
+           FLCTL_FLTREQ1I, FLCTL_FLTREQ0I, FLCTL_FLTENDI, FLCTL_FLSTEI } },
+       { 0xa40800a0, 0xa40800e0, 8, /* IMR8 / IMCR8 */
+         { 0,SDHI0_SDHII2,SDHI0_SDHII1,SDHI0_SDHII0,0,0,SCIFA_SCIFA2,SIU_SIUI } },
+       { 0xa40800a4, 0xa40800e4, 8, /* IMR9 / IMCR9 */
+         { 0, 0, 0, CMT_CMTI, 0, 0, USB_USI0,0 } },
+       { 0xa40800a8, 0xa40800e8, 8, /* IMR10 / IMCR10 */
+         { 0, DMAC1B_DADERR,DMAC1B_DEI5,DMAC1B_DEI4,0,RTC_ATI,RTC_PRI,RTC_CUI } },
+       { 0xa40800ac, 0xa40800ec, 8, /* IMR11 / IMCR11 */
+         { 0,_2DG_CEI,_2DG_INI,_2DG_TRI,0,TPU_TPUI,0,TSIF_TSIFI } },
+       { 0xa40800b0, 0xa40800f0, 8, /* IMR12 / IMCR12 */
+         { 0,0,0,0,0,0,0,ATAPI_ATAPII } },
+       { 0xa4140044, 0xa4140064, 8, /* INTMSK00 / INTMSKCLR00 */
+         { IRQ0, IRQ1, IRQ2, IRQ3, IRQ4, IRQ5, IRQ6, IRQ7 } },
+};
+
+static struct intc_prio_reg prio_registers[] __initdata = {
+       { 0xa4080000, 0, 16, 4, /* IPRA */ { TMU0_TUNI0, TMU0_TUNI1, TMU0_TUNI2, IRDA_IRDAI } },
+       { 0xa4080004, 0, 16, 4, /* IPRB */ { VEU2H1_VEU2HI, LCDC_LCDCI, DMAC1A, 0} },
+       { 0xa4080008, 0, 16, 4, /* IPRC */ { TMU1_TUNI0, TMU1_TUNI1, TMU1_TUNI2, 0} },
+       { 0xa408000c, 0, 16, 4, /* IPRD */ { } },
+       { 0xa4080010, 0, 16, 4, /* IPRE */ { DMAC0A, VIO, SCIFA_SCIFA0, VPU_VPUI } },
+       { 0xa4080014, 0, 16, 4, /* IPRF */ { KEYSC_KEYI, DMAC0B, USB_USI0, CMT_CMTI } },
+       { 0xa4080018, 0, 16, 4, /* IPRG */ { SCIF_SCIF0, SCIF_SCIF1, SCIF_SCIF2,0 } },
+       { 0xa408001c, 0, 16, 4, /* IPRH */ { MSIOF_MSIOFI0,MSIOF_MSIOFI1, FLCTL, I2C } },
+       { 0xa4080020, 0, 16, 4, /* IPRI */ { SCIFA_SCIFA1,0,TSIF_TSIFI,_2DG } },
+       { 0xa4080024, 0, 16, 4, /* IPRJ */ { ADC_ADI,0,SIU_SIUI,SDHI1 } },
+       { 0xa4080028, 0, 16, 4, /* IPRK */ { RTC,DMAC1B,0,SDHI0 } },
+       { 0xa408002c, 0, 16, 4, /* IPRL */ { SCIFA_SCIFA2,0,TPU_TPUI,ATAPI_ATAPII } },
+       { 0xa4140010, 0, 32, 4, /* INTPRI00 */
+         { IRQ0, IRQ1, IRQ2, IRQ3, IRQ4, IRQ5, IRQ6, IRQ7 } },
+};
+
+static struct intc_sense_reg sense_registers[] __initdata = {
+       { 0xa414001c, 16, 2, /* ICR1 */
+         { IRQ0, IRQ1, IRQ2, IRQ3, IRQ4, IRQ5, IRQ6, IRQ7 } },
+};
+
+static DECLARE_INTC_DESC(intc_desc, "sh7723", vectors, groups,
+                        mask_registers, prio_registers, sense_registers);
+
+void __init plat_irq_setup(void)
+{
+       register_intc_controller(&intc_desc);
+}
+
+void __init plat_mem_setup(void)
+{
+       /* Register the URAM space as Node 1 */
+       setup_bootmem_node(1, 0x055f0000, 0x05610000);
+}
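
plat_mem_setup() registers the SH7723 on-chip URAM as memory node 1; the
range 0x055f0000-0x05610000 is 0x20000 bytes, i.e. 128 KiB.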
index 07c988dc9de6fa7bf07e7f315f294cc5e90453c0..ae2b22219f02dffa1c6b05e7520ef46cfb495357 100644 (file)
@@ -231,12 +231,6 @@ static struct intc_group groups[] __initdata = {
        INTC_GROUP(GPIO, GPIO_CH0, GPIO_CH1, GPIO_CH2, GPIO_CH3),
 };
 
-static struct intc_prio priorities[] __initdata = {
-       INTC_PRIO(SCIF0, 3),
-       INTC_PRIO(SCIF1, 3),
-       INTC_PRIO(SCIF2, 3),
-};
-
 static struct intc_mask_reg mask_registers[] __initdata = {
        { 0xffd40038, 0xffd4003c, 32, /* INT2MSKR / INT2MSKCR */
          { 0, 0, 0, 0, 0, 0, GPIO, 0,
@@ -270,11 +264,10 @@ static struct intc_prio_reg prio_registers[] __initdata = {
        { 0xffd400b4, 0, 32, 8, /* INT2PRI13 */ { 0, 0, STIF1, STIF0 } },
 };
 
-static DECLARE_INTC_DESC(intc_desc, "sh7763", vectors, groups, priorities,
+static DECLARE_INTC_DESC(intc_desc, "sh7763", vectors, groups,
                         mask_registers, prio_registers, NULL);
 
 /* Support for external interrupt pins in IRQ mode */
-
 static struct intc_vect irq_vectors[] __initdata = {
        INTC_VECT(IRQ0, 0x240), INTC_VECT(IRQ1, 0x280),
        INTC_VECT(IRQ2, 0x2c0), INTC_VECT(IRQ3, 0x300),
@@ -302,7 +295,6 @@ static DECLARE_INTC_DESC(intc_irq_desc, "sh7763-irq", irq_vectors,
                         irq_sense_registers);
 
 /* External interrupt pins in IRL mode */
-
 static struct intc_vect irl_vectors[] __initdata = {
        INTC_VECT(IRL_LLLL, 0x200), INTC_VECT(IRL_LLLH, 0x220),
        INTC_VECT(IRL_LLHL, 0x240), INTC_VECT(IRL_LLHH, 0x260),
index b9cec48b18088dc0b0239f23b120626d10c5abe3..b73578ee295d47c5335fcf8ec4b170b9b327ca77 100644 (file)
@@ -1,7 +1,7 @@
 /*
  * SH7770 Setup
  *
- *  Copyright (C) 2006  Paul Mundt
+ *  Copyright (C) 2006 - 2008  Paul Mundt
  *
  * This file is subject to the terms and conditions of the GNU General Public
  * License.  See the file "COPYING" in the main directory of this archive
@@ -28,6 +28,41 @@ static struct plat_sci_port sci_platform_data[] = {
                .flags          = UPF_BOOT_AUTOCONF,
                .type           = PORT_SCIF,
                .irqs           = { 63, 63, 63, 63 },
+       }, {
+               .mapbase        = 0xff926000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCIF,
+               .irqs           = { 64, 64, 64, 64 },
+       }, {
+               .mapbase        = 0xff927000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCIF,
+               .irqs           = { 65, 65, 65, 65 },
+       }, {
+               .mapbase        = 0xff928000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCIF,
+               .irqs           = { 66, 66, 66, 66 },
+       }, {
+               .mapbase        = 0xff929000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCIF,
+               .irqs           = { 67, 67, 67, 67 },
+       }, {
+               .mapbase        = 0xff92a000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCIF,
+               .irqs           = { 68, 68, 68, 68 },
+       }, {
+               .mapbase        = 0xff92b000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCIF,
+               .irqs           = { 69, 69, 69, 69 },
+       }, {
+               .mapbase        = 0xff92c000,
+               .flags          = UPF_BOOT_AUTOCONF,
+               .type           = PORT_SCIF,
+               .irqs           = { 70, 70, 70, 70 },
        }, {
                .flags = 0,
        }
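
setup-sh7770.c grows seven more SCIF ports (0xff926000 through 0xff92c000,
IRQs 64-70), still terminated by the usual .flags = 0 sentinel entry. The
number of ports actually registered is capped by
CONFIG_SERIAL_SH_SCI_NR_UARTS (set to 2 in the se7721 defconfig above), so a
board wanting all of these would need to raise that value.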
index ff4f54a47c0724942ca9f489de602d5884c287f9..284f66f1ebbeb812be6623c7111d820ccaa4ad01 100644 (file)
@@ -23,6 +23,8 @@
 #include <linux/kexec.h>
 #include <linux/module.h>
 #include <linux/smp.h>
+#include <linux/err.h>
+#include <linux/debugfs.h>
 #include <asm/uaccess.h>
 #include <asm/io.h>
 #include <asm/page.h>
@@ -333,6 +335,7 @@ static const char *cpu_name[] = {
        [CPU_SH7343]    = "SH7343",     [CPU_SH7785]    = "SH7785",
        [CPU_SH7722]    = "SH7722",     [CPU_SHX3]      = "SH-X3",
        [CPU_SH5_101]   = "SH5-101",    [CPU_SH5_103]   = "SH5-103",
+       [CPU_MXG]       = "MX-G",       [CPU_SH7723]    = "SH7723",
        [CPU_SH7366]    = "SH7366",     [CPU_SH_NONE]   = "Unknown"
 };
 
@@ -443,3 +446,15 @@ const struct seq_operations cpuinfo_op = {
        .show   = show_cpuinfo,
 };
 #endif /* CONFIG_PROC_FS */
+
+struct dentry *sh_debugfs_root;
+
+static int __init sh_debugfs_init(void)
+{
+       sh_debugfs_root = debugfs_create_dir("sh", NULL);
+       if (IS_ERR(sh_debugfs_root))
+               return PTR_ERR(sh_debugfs_root);
+
+       return 0;
+}
+arch_initcall(sh_debugfs_init);
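
sh_debugfs_init() creates a shared "sh" directory at the debugfs root from an
arch_initcall(); the cache-debugfs and PMB hunks below reparent their entries
under it. A minimal sketch of how another SH-specific consumer could do the
same, assuming <linux/debugfs.h>, <linux/err.h> and <linux/init.h>; the file
name and my_fops are placeholders, not real tree symbols:

	static const struct file_operations my_fops;	/* placeholder fops */

	static int __init myfeature_debugfs_init(void)
	{
		struct dentry *dentry;

		/* Parent the new file under the shared "sh" directory. */
		dentry = debugfs_create_file("myfeature", S_IRUGO,
					     sh_debugfs_root, NULL, &my_fops);
		if (IS_ERR(dentry))
			return PTR_ERR(dentry);

		return 0;
	}
	late_initcall(myfeature_debugfs_init);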
index 3539123fe5174b29fb95c1b64fcc3e46a3e688a3..8342bfbde64c23a1f1128785b1ed810ba4f46b86 100644 (file)
@@ -27,11 +27,11 @@ ENTRY(clear_page)
        mov     #0,r0
        !
 1:
-#if defined(CONFIG_CPU_SH3)
-       mov.l   r0,@r4
-#elif defined(CONFIG_CPU_SH4)
+#if defined(CONFIG_CPU_SH4)
        movca.l r0,@r4
        mov     r4,r1
+#else
+       mov.l   r0,@r4
 #endif
        add     #32,r4
        mov.l   r0,@-r4
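
In clear_page.S (and copy_page.S just below) the preprocessor test is
inverted: movca.l remains the SH-4 fast path, while every other CPU, not just
SH-3, now falls back to the plain mov.l store. Parts that previously matched
neither branch, such as SH-2, ended up with no store instruction at all.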
index e002b91c87526bf506757b0e9684fcb25d7ceee5..5d12e657be34e051b8ba2ab676cf472d58ffe77b 100644 (file)
@@ -41,11 +41,11 @@ ENTRY(copy_page)
        mov.l   @r11+,r5
        mov.l   @r11+,r6
        mov.l   @r11+,r7
-#if defined(CONFIG_CPU_SH3)
-       mov.l   r0,@r10
-#elif defined(CONFIG_CPU_SH4)
+#if defined(CONFIG_CPU_SH4)
        movca.l r0,@r10
        mov     r10,r0
+#else
+       mov.l   r0,@r10
 #endif
        add     #32,r10
        mov.l   r7,@-r10
index db6d950b6f5e2f768204982a4b2e2cc015429a68..c5b56d52b7d27cf7d87a6bd3e2d1e99b24f97bf2 100644 (file)
@@ -127,13 +127,13 @@ static int __init cache_debugfs_init(void)
 {
        struct dentry *dcache_dentry, *icache_dentry;
 
-       dcache_dentry = debugfs_create_file("dcache", S_IRUSR, NULL,
+       dcache_dentry = debugfs_create_file("dcache", S_IRUSR, sh_debugfs_root,
                                            (unsigned int *)CACHE_TYPE_DCACHE,
                                            &cache_debugfs_fops);
        if (IS_ERR(dcache_dentry))
                return PTR_ERR(dcache_dentry);
 
-       icache_dentry = debugfs_create_file("icache", S_IRUSR, NULL,
+       icache_dentry = debugfs_create_file("icache", S_IRUSR, sh_debugfs_root,
                                            (unsigned int *)CACHE_TYPE_ICACHE,
                                            &cache_debugfs_fops);
        if (IS_ERR(icache_dentry)) {
index ab81c602295f063906c6de58b398325b2ccd49cc..0b0ec6e047530bd1b7a4dde27088a004631755d5 100644 (file)
@@ -393,7 +393,7 @@ static int __init pmb_debugfs_init(void)
        struct dentry *dentry;
 
        dentry = debugfs_create_file("pmb", S_IFREG | S_IRUGO,
-                                    NULL, NULL, &pmb_debugfs_fops);
+                                    sh_debugfs_root, NULL, &pmb_debugfs_fops);
        if (IS_ERR(dentry))
                return PTR_ERR(dentry);
 
index d63b93da952d4e0eafa9e6df984bccdb6517881b..987c6682bf9906a51f120dc73cc478b00a07d8d8 100644 (file)
@@ -21,8 +21,9 @@ HD64465                       HD64465
 7206SE                 SH_7206_SOLUTION_ENGINE
 7343SE                 SH_7343_SOLUTION_ENGINE
 7619SE                 SH_7619_SOLUTION_ENGINE
-7722SE                 SH_7722_SOLUTION_ENGINE         
-7751SE                 SH_7751_SOLUTION_ENGINE         
+7721SE                 SH_7721_SOLUTION_ENGINE
+7722SE                 SH_7722_SOLUTION_ENGINE
+7751SE                 SH_7751_SOLUTION_ENGINE
 7780SE                 SH_7780_SOLUTION_ENGINE
 7751SYSTEMH            SH_7751_SYSTEMH
 HP6XX                  SH_HP6XX
index 2a59dbb28248b7da44c9695250e97bf68a05799c..87a693cf2bb79f6cebbd82368bb2da35e3b45cf9 100644 (file)
@@ -117,6 +117,9 @@ config ARCH_HAS_CPU_RELAX
 config HAVE_SETUP_PER_CPU_AREA
        def_bool X86_64 || (X86_SMP && !X86_VOYAGER)
 
+config HAVE_CPUMASK_OF_CPU_MAP
+       def_bool X86_64_SMP
+
 config ARCH_HIBERNATION_POSSIBLE
        def_bool y
        depends on !SMP || !X86_VOYAGER
@@ -903,6 +906,15 @@ config X86_64_ACPI_NUMA
        help
          Enable ACPI SRAT based node topology detection.
 
+# Some NUMA nodes have memory ranges that span
+# other nodes.  Even though a pfn is valid and
+# between a node's start and end pfns, it may not
+# reside on that node.  See memmap_init_zone()
+# for details.
+config NODES_SPAN_OTHER_NODES
+       def_bool y
+       depends on X86_64_ACPI_NUMA
+
 config NUMA_EMU
        bool "NUMA emulation"
        depends on X86_64 && NUMA
index 31348d054fca6129db01e2fd1e6f3600dc3038dd..90943f83e84d69ba1172ab745febe31a231f58fc 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/a20.c
- *
  * Enable A20 gate (return -1 on failure)
  */
 
index c117c7fb859c12bb13685f9fd213dd3e8facec2e..7aa6033001f9ace30ef1a4ffdb52bfd629a98f26 100644 (file)
@@ -12,8 +12,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/apm.c
- *
  * Get APM BIOS information
  */
 
index 8dcc8dc7db88c98a143a7361966ee6901b51f596..878e4b9940d9212ce581c5bea0ac518ae6bbf85f 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/bitops.h
- *
  * Very simple bitops for the boot code.
  */
 
index 09578070bfba94b2a66c750e6f0a1aafb25e9a1b..a34b9982c7cbcf928bd96e3b17955a789f226f57 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/boot.h
- *
  * Header file for the real-mode kernel code
  */
 
index 680408a0f46317c898d5e73a620b499e9a9cbdb2..a1d35634bce0097d6ea32110f4285d0c1a9225b8 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/cmdline.c
- *
  * Simple command-line parser for early boot.
  */
 
index 036e635f18a3f14391fa39033fb70d37f8cad8d2..ba7736cf2ec73e8977e447a8ab852d083f079c42 100644 (file)
@@ -130,7 +130,7 @@ relocated:
 /*
  * Setup the stack for the decompressor
  */
-       leal stack_end(%ebx), %esp
+       leal boot_stack_end(%ebx), %esp
 
 /*
  * Do the decompression, and jump to the new kernel..
@@ -142,8 +142,8 @@ relocated:
        pushl %eax      # input_len
        leal input_data(%ebx), %eax
        pushl %eax      # input_data
-       leal _end(%ebx), %eax
-       pushl %eax      # end of the image as third argument
+       leal boot_heap(%ebx), %eax
+       pushl %eax      # heap area as third argument
        pushl %esi      # real mode pointer as second arg
        call decompress_kernel
        addl $20, %esp
@@ -181,7 +181,10 @@ relocated:
        jmp *%ebp
 
 .bss
+/* Stack and heap for uncompression */
 .balign 4
-stack:
-       .fill 4096, 1, 0
-stack_end:
+boot_heap:
+       .fill BOOT_HEAP_SIZE, 1, 0
+boot_stack:
+       .fill BOOT_STACK_SIZE, 1, 0
+boot_stack_end:
index e8657b98c902a0773a1e9861fc4db6349f0bb909..d8819efac81dc4488ff59163172448710311cce0 100644 (file)
@@ -28,6 +28,7 @@
 #include <asm/segment.h>
 #include <asm/pgtable.h>
 #include <asm/page.h>
+#include <asm/boot.h>
 #include <asm/msr.h>
 #include <asm/asm-offsets.h>
 
@@ -62,7 +63,7 @@ startup_32:
        subl    $1b, %ebp
 
 /* setup a stack and make sure cpu supports long mode. */
-       movl    $user_stack_end, %eax
+       movl    $boot_stack_end, %eax
        addl    %ebp, %eax
        movl    %eax, %esp
 
@@ -243,9 +244,9 @@ ENTRY(startup_64)
 /* Copy the compressed kernel to the end of our buffer
  * where decompression in place becomes safe.
  */
-       leaq    _end(%rip), %r8
-       leaq    _end(%rbx), %r9
-       movq    $_end /* - $startup_32 */, %rcx
+       leaq    _end_before_pgt(%rip), %r8
+       leaq    _end_before_pgt(%rbx), %r9
+       movq    $_end_before_pgt /* - $startup_32 */, %rcx
 1:     subq    $8, %r8
        subq    $8, %r9
        movq    0(%r8), %rax
@@ -267,14 +268,14 @@ relocated:
  */
        xorq    %rax, %rax
        leaq    _edata(%rbx), %rdi
-       leaq    _end(%rbx), %rcx
+       leaq    _end_before_pgt(%rbx), %rcx
        subq    %rdi, %rcx
        cld
        rep
        stosb
 
        /* Setup the stack */
-       leaq    user_stack_end(%rip), %rsp
+       leaq    boot_stack_end(%rip), %rsp
 
        /* zero EFLAGS after setting rsp */
        pushq   $0
@@ -285,7 +286,7 @@ relocated:
  */
        pushq   %rsi                    # Save the real mode argument
        movq    %rsi, %rdi              # real mode address
-       leaq    _heap(%rip), %rsi       # _heap
+       leaq    boot_heap(%rip), %rsi   # malloc area for uncompression
        leaq    input_data(%rip), %rdx  # input_data
        movl    input_len(%rip), %eax
        movq    %rax, %rcx              # input_len
@@ -310,9 +311,12 @@ gdt:
        .quad   0x0080890000000000      /* TS descriptor */
        .quad   0x0000000000000000      /* TS continued */
 gdt_end:
-       .bss
-/* Stack for uncompression */
-       .balign 4
-user_stack:
-       .fill 4096,4,0
-user_stack_end:
+
+.bss
+/* Stack and heap for uncompression */
+.balign 4
+boot_heap:
+       .fill BOOT_HEAP_SIZE, 1, 0
+boot_stack:
+       .fill BOOT_STACK_SIZE, 1, 0
+boot_stack_end:
index dad4e699f5a326d41bd6c72d988517c32f0ec538..90456cee47c337b226a8874582440377c428a91e 100644 (file)
@@ -217,12 +217,6 @@ static void putstr(const char *);
 static memptr free_mem_ptr;
 static memptr free_mem_end_ptr;
 
-#ifdef CONFIG_X86_64
-#define HEAP_SIZE             0x7000
-#else
-#define HEAP_SIZE             0x4000
-#endif
-
 static char *vidmem;
 static int vidport;
 static int lines, cols;
@@ -449,7 +443,7 @@ asmlinkage void decompress_kernel(void *rmode, memptr heap,
 
        window = output;                /* Output buffer (Normally at 1M) */
        free_mem_ptr     = heap;        /* Heap */
-       free_mem_end_ptr = heap + HEAP_SIZE;
+       free_mem_end_ptr = heap + BOOT_HEAP_SIZE;
        inbuf  = input_data;            /* Input buffer */
        insize = input_len;
        inptr  = 0;
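
The HEAP_SIZE removed here moves into the shared <asm/boot.h> (newly included by head_64.S above) as BOOT_HEAP_SIZE, so misc.c and both assembly stubs size the decompression heap from a single definition; BOOT_STACK_SIZE is defined there as well, though its value is not visible in these hunks. A sketch of the heap part of that header, reusing the values from the #ifdef deleted in this hunk:

/* <asm/boot.h>, sketch of the heap sizing only */
#ifdef CONFIG_X86_64
#define BOOT_HEAP_SIZE	0x7000
#else
#define BOOT_HEAP_SIZE	0x4000
#endif
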
index 7e5c7209f6cc2f0b932c6deb3208b8c4d5583374..bef1ac891bce9987d2693cdc1ff601457af0d6c4 100644 (file)
@@ -39,10 +39,10 @@ SECTIONS
                *(.bss.*)
                *(COMMON)
                . = ALIGN(8);
-               _end = . ;
+               _end_before_pgt = . ;
                . = ALIGN(4096);
                pgtable = . ;
                . = . + 4096 * 6;
-               _heap = .;
+               _ebss = .;
        }
 }
index ef127e56a3cf7ca36fed6bd275edc690325857fd..ef50c84e8b4bd6a517417d84c9275240ae19dd0c 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/copy.S
- *
  * Memory copy routines
  */
 
index 2462c88689edd47ea939de1b2d54cb82c25b7a40..7804389ee0059eb8f4be589cccd24a2623f9567f 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/cpucheck.c
- *
  * Check for obligatory CPU features and abort if the features are not
  * present.  This code should be compilable as 16-, 32- or 64-bit
  * code, so be very careful with types and inline assembly.
index 8721dc46a0b618336b9908e0d855611f7beb6175..d84a48ece78503b10e8429981e44397b6755a454 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/edd.c
- *
  * Get EDD BIOS disk information
  */
 
index 88d77761d01bfbccba3dc9d970ede616b0900d52..8d60ee15dfd9b2e5e4ff3808976ea8143e94c05a 100644 (file)
@@ -1,7 +1,5 @@
 #!/bin/sh
 #
-# arch/i386/boot/install.sh
-#
 # This file is subject to the terms and conditions of the GNU General Public
 # License.  See the file "COPYING" in the main directory of this archive
 # for more details.
index 7828da5cfd07475376c7d4d3fc95282334b60a3d..77569a4a3be114aaca948d04b060ebd28c80c1a9 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/main.c
- *
  * Main module for the real-mode kernel code
  */
 
index 68222f2d4b670479fd7656f4036c848bfed7e5e8..911eaae5d696427c98fdad7f344a14bdc0340efc 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/mca.c
- *
  * Get the MCA system description table
  */
 
index e77d89f9e8aa23c13751bee268718e8ed7ee54a7..acad32eb4290861b1539553c1e8eb49f7ec82e92 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/memory.c
- *
  * Memory detection code
  */
 
index a93cb8bded4da529e76ee746874cc628808c767a..328956fdb59e79dc354e96f47e98c5cb5dff4a6b 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/pm.c
- *
  * Prepare the machine for transition to protected mode.
  */
 
index f5402d51f7c3d1ff534a009d54d438a3cfa54ec8..ab049d40a884d7ef44d32fb63e30c44ed6aedabf 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/pmjump.S
- *
  * The actual transition into protected mode
  */
 
index 7e7e890699be9acf34756c7179e1b928db3c0bb8..c1d00c0274c4888c203ac186b4f3930791669be4 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/printf.c
- *
  * Oh, it's a waste of space, but oh-so-yummy for debugging.  This
  * version of printf() does not include 64-bit support.  "Live with
  * it."
index 481a22097781953c088931b52021d9a5365b80b2..f94b7a0c2abf8e2477ec5589a54ba682872f680b 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/string.c
- *
  * Very basic string functions
  */
 
index f3f14bd2637191ca7a87faf8f69785f87dce2aef..0be77b39328afc957c8aa70b64bbe0641f477a7c 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/tty.c
- *
  * Very simple screen I/O
  * XXX: Probably should add very simple serial I/O?
  */
index c61462f7d9a75b636dae291e65924bdf39f3604b..2723d9b5ce432b4d229a9d3cbe47f9b8fdf08a2c 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/version.c
- *
  * Kernel version string
  */
 
index 39e247e96172a53c24aefbfe1c6a6beae89d01bc..49f26aaaebc8f2d58064cc1be981103742ab8524 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/video-bios.c
- *
  * Standard video BIOS modes
  *
  * We have two options for this; silent and scanned.
index 5d5a3f6e8b5ca31e05bac9e862c9067aaf1afd72..401ad998ad08d7cd63943a5b7a7ae893264b5e3f 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/video-vesa.c
- *
  * VESA text modes
  */
 
index 330d6589a2adf854e6e43836b6d4327fb2a072dc..40ecb8d7688c3d5c8a7587aa8fce8bb7ea06d76b 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/video-vga.c
- *
  * Common all-VGA modes
  */
 
index c1c47ba069ef7c98c9732f03d408a927bb2eb905..83598b23093aa31398db2d1c9a5f89eb30733357 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/video.c
- *
  * Select video mode
  */
 
index d69347f79e8e5b76e23233f7b64216023336e96f..ee63f5d14461517ef96b89a6c644ac6c3a60a2f6 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/video.h
- *
  * Header file for the real-mode video probing code
  */
 
index 6499e3239b4132213907ab9ff54d00be9219dd1b..433909d61e5cb2ef7a20f7386c0ad87c362b47c1 100644 (file)
@@ -9,8 +9,6 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * arch/i386/boot/voyager.c
- *
  * Get the Voyager config information
  */
 
index c3920ea8ac56f99020c2de9562b422c97ec2a0c3..90e092d0af0c639211932fc44a0f5dd959ec590b 100644 (file)
@@ -22,13 +22,14 @@ obj-y                       += setup_$(BITS).o i8259_$(BITS).o setup.o
 obj-$(CONFIG_X86_32)   += sys_i386_32.o i386_ksyms_32.o
 obj-$(CONFIG_X86_64)   += sys_x86_64.o x8664_ksyms_64.o
 obj-$(CONFIG_X86_64)   += syscall_64.o vsyscall_64.o setup64.o
-obj-y                  += pci-dma_$(BITS).o  bootflag.o e820_$(BITS).o
-obj-y                  += quirks.o i8237.o topology.o kdebugfs.o
-obj-y                  += alternative.o i8253.o
-obj-$(CONFIG_X86_64)   += pci-nommu_64.o bugs_64.o
+obj-y                  += bootflag.o e820_$(BITS).o
+obj-y                  += pci-dma.o quirks.o i8237.o topology.o kdebugfs.o
+obj-y                  += alternative.o i8253.o pci-nommu.o
+obj-$(CONFIG_X86_64)   += bugs_64.o
 obj-y                  += tsc_$(BITS).o io_delay.o rtc.o
 
 obj-$(CONFIG_X86_TRAMPOLINE)   += trampoline.o
+obj-y                          += process.o
 obj-y                          += i387.o
 obj-y                          += ptrace.o
 obj-y                          += ds.o
index 8ca3557a6d599c947a8b71314902d2d9470e8d48..c2502eb9aa8355488a7057602bfdfa71134785a7 100644 (file)
@@ -1,6 +1,4 @@
 /*
- * arch/i386/kernel/acpi/cstate.c
- *
  * Copyright (C) 2005 Intel Corporation
  *     Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
  *     - Added _PDC for SMP C-states on Intel CPUs
@@ -93,7 +91,7 @@ int acpi_processor_ffh_cstate_probe(unsigned int cpu,
 
        /* Make sure we are running on right CPU */
        saved_mask = current->cpus_allowed;
-       retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+       retval = set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
        if (retval)
                return -1;
 
@@ -130,7 +128,7 @@ int acpi_processor_ffh_cstate_probe(unsigned int cpu,
                 cx->address);
 
 out:
-       set_cpus_allowed(current, saved_mask);
+       set_cpus_allowed_ptr(current, &saved_mask);
        return retval;
 }
 EXPORT_SYMBOL_GPL(acpi_processor_ffh_cstate_probe);
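
The conversion in this hunk is the pattern repeated throughout the rest of this merge: set_cpus_allowed() took a cpumask_t by value, copying the whole mask, while set_cpus_allowed_ptr() takes a pointer to a mask the caller already owns. The usual save/pin/restore shape, as a sketch with a placeholder work function:

static int run_on_cpu(unsigned int cpu)
{
	cpumask_t saved_mask = current->cpus_allowed;
	int ret;

	/* Pin the task to the target CPU for the CPU-local access. */
	ret = set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
	if (ret)
		return ret;

	do_cpu_local_work();		/* placeholder for the per-CPU operation */

	/* Allow the task to run anywhere again. */
	set_cpus_allowed_ptr(current, &saved_mask);
	return 0;
}
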
index 324eb0cab19ceb7c8353f402d78884f139c6741d..de2d2e4ebad97217be93de05b32886cfa09d0526 100644 (file)
@@ -1,6 +1,4 @@
 /*
- * arch/i386/kernel/acpi/processor.c
- *
  * Copyright (C) 2005 Intel Corporation
  *     Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
  *     - Added _PDC for platforms with Intel CPUs
index a962dcb9c408518add7154ac9757ad2ed2c08f7c..e2d870de837c2ba73599a8f58c9408669446ae9c 100644 (file)
@@ -192,9 +192,9 @@ static void drv_read(struct drv_cmd *cmd)
        cpumask_t saved_mask = current->cpus_allowed;
        cmd->val = 0;
 
-       set_cpus_allowed(current, cmd->mask);
+       set_cpus_allowed_ptr(current, &cmd->mask);
        do_drv_read(cmd);
-       set_cpus_allowed(current, saved_mask);
+       set_cpus_allowed_ptr(current, &saved_mask);
 }
 
 static void drv_write(struct drv_cmd *cmd)
@@ -203,30 +203,30 @@ static void drv_write(struct drv_cmd *cmd)
        unsigned int i;
 
        for_each_cpu_mask(i, cmd->mask) {
-               set_cpus_allowed(current, cpumask_of_cpu(i));
+               set_cpus_allowed_ptr(current, &cpumask_of_cpu(i));
                do_drv_write(cmd);
        }
 
-       set_cpus_allowed(current, saved_mask);
+       set_cpus_allowed_ptr(current, &saved_mask);
        return;
 }
 
-static u32 get_cur_val(cpumask_t mask)
+static u32 get_cur_val(const cpumask_t *mask)
 {
        struct acpi_processor_performance *perf;
        struct drv_cmd cmd;
 
-       if (unlikely(cpus_empty(mask)))
+       if (unlikely(cpus_empty(*mask)))
                return 0;
 
-       switch (per_cpu(drv_data, first_cpu(mask))->cpu_feature) {
+       switch (per_cpu(drv_data, first_cpu(*mask))->cpu_feature) {
        case SYSTEM_INTEL_MSR_CAPABLE:
                cmd.type = SYSTEM_INTEL_MSR_CAPABLE;
                cmd.addr.msr.reg = MSR_IA32_PERF_STATUS;
                break;
        case SYSTEM_IO_CAPABLE:
                cmd.type = SYSTEM_IO_CAPABLE;
-               perf = per_cpu(drv_data, first_cpu(mask))->acpi_data;
+               perf = per_cpu(drv_data, first_cpu(*mask))->acpi_data;
                cmd.addr.io.port = perf->control_register.address;
                cmd.addr.io.bit_width = perf->control_register.bit_width;
                break;
@@ -234,7 +234,7 @@ static u32 get_cur_val(cpumask_t mask)
                return 0;
        }
 
-       cmd.mask = mask;
+       cmd.mask = *mask;
 
        drv_read(&cmd);
 
@@ -271,7 +271,7 @@ static unsigned int get_measured_perf(unsigned int cpu)
        unsigned int retval;
 
        saved_mask = current->cpus_allowed;
-       set_cpus_allowed(current, cpumask_of_cpu(cpu));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
        if (get_cpu() != cpu) {
                /* We were not able to run on requested processor */
                put_cpu();
@@ -329,7 +329,7 @@ static unsigned int get_measured_perf(unsigned int cpu)
        retval = per_cpu(drv_data, cpu)->max_freq * perf_percent / 100;
 
        put_cpu();
-       set_cpus_allowed(current, saved_mask);
+       set_cpus_allowed_ptr(current, &saved_mask);
 
        dprintk("cpu %d: performance percent %d\n", cpu, perf_percent);
        return retval;
@@ -347,13 +347,13 @@ static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
                return 0;
        }
 
-       freq = extract_freq(get_cur_val(cpumask_of_cpu(cpu)), data);
+       freq = extract_freq(get_cur_val(&cpumask_of_cpu(cpu)), data);
        dprintk("cur freq = %u\n", freq);
 
        return freq;
 }
 
-static unsigned int check_freqs(cpumask_t mask, unsigned int freq,
+static unsigned int check_freqs(const cpumask_t *mask, unsigned int freq,
                                struct acpi_cpufreq_data *data)
 {
        unsigned int cur_freq;
@@ -449,7 +449,7 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
        drv_write(&cmd);
 
        if (acpi_pstate_strict) {
-               if (!check_freqs(cmd.mask, freqs.new, data)) {
+               if (!check_freqs(&cmd.mask, freqs.new, data)) {
                        dprintk("acpi_cpufreq_target failed (%d)\n",
                                policy->cpu);
                        return -EAGAIN;
index 14791ec55cfd16798eb0366cb908b31d0dedf1e7..199e4e05e5dc1c7cb82cbe417864377cb399d2d8 100644 (file)
@@ -289,8 +289,8 @@ static int __init cpufreq_p4_init(void)
        if (c->x86_vendor != X86_VENDOR_INTEL)
                return -ENODEV;
 
-       if (!test_bit(X86_FEATURE_ACPI, c->x86_capability) ||
-               !test_bit(X86_FEATURE_ACC, c->x86_capability))
+       if (!test_cpu_cap(c, X86_FEATURE_ACPI) ||
+                               !test_cpu_cap(c, X86_FEATURE_ACC))
                return -ENODEV;
 
        ret = cpufreq_register_driver(&p4clockmod_driver);
index c99d59d8ef2ea46694ac8b3da1cb07de2714cf10..46d4034d9f379237feba3ad4bbe0a4689f1e0b9b 100644 (file)
@@ -478,12 +478,12 @@ static int core_voltage_post_transition(struct powernow_k8_data *data, u32 reqvi
 
 static int check_supported_cpu(unsigned int cpu)
 {
-       cpumask_t oldmask = CPU_MASK_ALL;
+       cpumask_t oldmask;
        u32 eax, ebx, ecx, edx;
        unsigned int rc = 0;
 
        oldmask = current->cpus_allowed;
-       set_cpus_allowed(current, cpumask_of_cpu(cpu));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
 
        if (smp_processor_id() != cpu) {
                printk(KERN_ERR PFX "limiting to cpu %u failed\n", cpu);
@@ -528,7 +528,7 @@ static int check_supported_cpu(unsigned int cpu)
        rc = 1;
 
 out:
-       set_cpus_allowed(current, oldmask);
+       set_cpus_allowed_ptr(current, &oldmask);
        return rc;
 }
 
@@ -1015,7 +1015,7 @@ static int transition_frequency_pstate(struct powernow_k8_data *data, unsigned i
 /* Driver entry point to switch to the target frequency */
 static int powernowk8_target(struct cpufreq_policy *pol, unsigned targfreq, unsigned relation)
 {
-       cpumask_t oldmask = CPU_MASK_ALL;
+       cpumask_t oldmask;
        struct powernow_k8_data *data = per_cpu(powernow_data, pol->cpu);
        u32 checkfid;
        u32 checkvid;
@@ -1030,7 +1030,7 @@ static int powernowk8_target(struct cpufreq_policy *pol, unsigned targfreq, unsi
 
        /* only run on specific CPU from here on */
        oldmask = current->cpus_allowed;
-       set_cpus_allowed(current, cpumask_of_cpu(pol->cpu));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(pol->cpu));
 
        if (smp_processor_id() != pol->cpu) {
                printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1085,7 +1085,7 @@ static int powernowk8_target(struct cpufreq_policy *pol, unsigned targfreq, unsi
        ret = 0;
 
 err_out:
-       set_cpus_allowed(current, oldmask);
+       set_cpus_allowed_ptr(current, &oldmask);
        return ret;
 }
 
@@ -1104,7 +1104,7 @@ static int powernowk8_verify(struct cpufreq_policy *pol)
 static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
 {
        struct powernow_k8_data *data;
-       cpumask_t oldmask = CPU_MASK_ALL;
+       cpumask_t oldmask;
        int rc;
 
        if (!cpu_online(pol->cpu))
@@ -1145,7 +1145,7 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
 
        /* only run on specific CPU from here on */
        oldmask = current->cpus_allowed;
-       set_cpus_allowed(current, cpumask_of_cpu(pol->cpu));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(pol->cpu));
 
        if (smp_processor_id() != pol->cpu) {
                printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1164,7 +1164,7 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
                fidvid_msr_init();
 
        /* run on any CPU again */
-       set_cpus_allowed(current, oldmask);
+       set_cpus_allowed_ptr(current, &oldmask);
 
        if (cpu_family == CPU_HW_PSTATE)
                pol->cpus = cpumask_of_cpu(pol->cpu);
@@ -1205,7 +1205,7 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
        return 0;
 
 err_out:
-       set_cpus_allowed(current, oldmask);
+       set_cpus_allowed_ptr(current, &oldmask);
        powernow_k8_cpu_exit_acpi(data);
 
        kfree(data);
@@ -1242,10 +1242,11 @@ static unsigned int powernowk8_get (unsigned int cpu)
        if (!data)
                return -EINVAL;
 
-       set_cpus_allowed(current, cpumask_of_cpu(cpu));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
        if (smp_processor_id() != cpu) {
-               printk(KERN_ERR PFX "limiting to CPU %d failed in powernowk8_get\n", cpu);
-               set_cpus_allowed(current, oldmask);
+               printk(KERN_ERR PFX
+                       "limiting to CPU %d failed in powernowk8_get\n", cpu);
+               set_cpus_allowed_ptr(current, &oldmask);
                return 0;
        }
 
@@ -1253,13 +1254,14 @@ static unsigned int powernowk8_get (unsigned int cpu)
                goto out;
 
        if (cpu_family == CPU_HW_PSTATE)
-               khz = find_khz_freq_from_pstate(data->powernow_table, data->currpstate);
+               khz = find_khz_freq_from_pstate(data->powernow_table,
+                                               data->currpstate);
        else
                khz = find_khz_freq_from_fid(data->currfid);
 
 
 out:
-       set_cpus_allowed(current, oldmask);
+       set_cpus_allowed_ptr(current, &oldmask);
        return khz;
 }
 
index 3031f119619212d40946ee2d0b297bf8a43bc576..908dd347c67ec3dc29aa5ab8b3dcbae9a12e5e37 100644 (file)
@@ -315,7 +315,7 @@ static unsigned int get_cur_freq(unsigned int cpu)
        cpumask_t saved_mask;
 
        saved_mask = current->cpus_allowed;
-       set_cpus_allowed(current, cpumask_of_cpu(cpu));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
        if (smp_processor_id() != cpu)
                return 0;
 
@@ -333,7 +333,7 @@ static unsigned int get_cur_freq(unsigned int cpu)
                clock_freq = extract_clock(l, cpu, 1);
        }
 
-       set_cpus_allowed(current, saved_mask);
+       set_cpus_allowed_ptr(current, &saved_mask);
        return clock_freq;
 }
 
@@ -487,7 +487,7 @@ static int centrino_target (struct cpufreq_policy *policy,
                else
                        cpu_set(j, set_mask);
 
-               set_cpus_allowed(current, set_mask);
+               set_cpus_allowed_ptr(current, &set_mask);
                preempt_disable();
                if (unlikely(!cpu_isset(smp_processor_id(), set_mask))) {
                        dprintk("couldn't limit to CPUs in this domain\n");
@@ -555,7 +555,8 @@ static int centrino_target (struct cpufreq_policy *policy,
 
                if (!cpus_empty(covered_cpus)) {
                        for_each_cpu_mask(j, covered_cpus) {
-                               set_cpus_allowed(current, cpumask_of_cpu(j));
+                               set_cpus_allowed_ptr(current,
+                                                    &cpumask_of_cpu(j));
                                wrmsr(MSR_IA32_PERF_CTL, oldmsr, h);
                        }
                }
@@ -569,12 +570,12 @@ static int centrino_target (struct cpufreq_policy *policy,
                        cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
                }
        }
-       set_cpus_allowed(current, saved_mask);
+       set_cpus_allowed_ptr(current, &saved_mask);
        return 0;
 
 migrate_end:
        preempt_enable();
-       set_cpus_allowed(current, saved_mask);
+       set_cpus_allowed_ptr(current, &saved_mask);
        return 0;
 }
 
index 14d68aa301eea923ebff083e7cfbc5ad42a1083d..1b50244b1fdfd892e016d015e922fc12130992e2 100644 (file)
@@ -229,22 +229,22 @@ static unsigned int speedstep_detect_chipset (void)
        return 0;
 }
 
-static unsigned int _speedstep_get(cpumask_t cpus)
+static unsigned int _speedstep_get(const cpumask_t *cpus)
 {
        unsigned int speed;
        cpumask_t cpus_allowed;
 
        cpus_allowed = current->cpus_allowed;
-       set_cpus_allowed(current, cpus);
+       set_cpus_allowed_ptr(current, cpus);
        speed = speedstep_get_processor_frequency(speedstep_processor);
-       set_cpus_allowed(current, cpus_allowed);
+       set_cpus_allowed_ptr(current, &cpus_allowed);
        dprintk("detected %u kHz as current frequency\n", speed);
        return speed;
 }
 
 static unsigned int speedstep_get(unsigned int cpu)
 {
-       return _speedstep_get(cpumask_of_cpu(cpu));
+       return _speedstep_get(&cpumask_of_cpu(cpu));
 }
 
 /**
@@ -267,7 +267,7 @@ static int speedstep_target (struct cpufreq_policy *policy,
        if (cpufreq_frequency_table_target(policy, &speedstep_freqs[0], target_freq, relation, &newstate))
                return -EINVAL;
 
-       freqs.old = _speedstep_get(policy->cpus);
+       freqs.old = _speedstep_get(&policy->cpus);
        freqs.new = speedstep_freqs[newstate].frequency;
        freqs.cpu = policy->cpu;
 
@@ -285,12 +285,12 @@ static int speedstep_target (struct cpufreq_policy *policy,
        }
 
        /* switch to physical CPU where state is to be changed */
-       set_cpus_allowed(current, policy->cpus);
+       set_cpus_allowed_ptr(current, &policy->cpus);
 
        speedstep_set_state(newstate);
 
        /* allow to be run on all CPUs */
-       set_cpus_allowed(current, cpus_allowed);
+       set_cpus_allowed_ptr(current, &cpus_allowed);
 
        for_each_cpu_mask(i, policy->cpus) {
                freqs.cpu = i;
@@ -326,7 +326,7 @@ static int speedstep_cpu_init(struct cpufreq_policy *policy)
 #endif
 
        cpus_allowed = current->cpus_allowed;
-       set_cpus_allowed(current, policy->cpus);
+       set_cpus_allowed_ptr(current, &policy->cpus);
 
        /* detect low and high frequency and transition latency */
        result = speedstep_get_freqs(speedstep_processor,
@@ -334,12 +334,12 @@ static int speedstep_cpu_init(struct cpufreq_policy *policy)
                                     &speedstep_freqs[SPEEDSTEP_HIGH].frequency,
                                     &policy->cpuinfo.transition_latency,
                                     &speedstep_set_state);
-       set_cpus_allowed(current, cpus_allowed);
+       set_cpus_allowed_ptr(current, &cpus_allowed);
        if (result)
                return result;
 
        /* get current speed setting */
-       speed = _speedstep_get(policy->cpus);
+       speed = _speedstep_get(&policy->cpus);
        if (!speed)
                return -EIO;
 
index 1b889860eb730fc3081b31d02fd304c923667711..26d615dcb1498ed7664b2483e3a68e74abc5ee71 100644 (file)
@@ -129,7 +129,7 @@ struct _cpuid4_info {
        union _cpuid4_leaf_ebx ebx;
        union _cpuid4_leaf_ecx ecx;
        unsigned long size;
-       cpumask_t shared_cpu_map;
+       cpumask_t shared_cpu_map;       /* future?: only cpus/node is needed */
 };
 
 unsigned short                 num_cache_leaves;
@@ -451,8 +451,8 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
 }
 
 /* pointer to _cpuid4_info array (for each cache leaf) */
-static struct _cpuid4_info *cpuid4_info[NR_CPUS];
-#define CPUID4_INFO_IDX(x,y)    (&((cpuid4_info[x])[y]))
+static DEFINE_PER_CPU(struct _cpuid4_info *, cpuid4_info);
+#define CPUID4_INFO_IDX(x, y)    (&((per_cpu(cpuid4_info, x))[y]))
 
 #ifdef CONFIG_SMP
 static void __cpuinit cache_shared_cpu_map_setup(unsigned int cpu, int index)
@@ -474,7 +474,7 @@ static void __cpuinit cache_shared_cpu_map_setup(unsigned int cpu, int index)
                        if (cpu_data(i).apicid >> index_msb ==
                            c->apicid >> index_msb) {
                                cpu_set(i, this_leaf->shared_cpu_map);
-                               if (i != cpu && cpuid4_info[i])  {
+                               if (i != cpu && per_cpu(cpuid4_info, i))  {
                                        sibling_leaf = CPUID4_INFO_IDX(i, index);
                                        cpu_set(cpu, sibling_leaf->shared_cpu_map);
                                }
@@ -505,8 +505,8 @@ static void __cpuinit free_cache_attributes(unsigned int cpu)
        for (i = 0; i < num_cache_leaves; i++)
                cache_remove_shared_cpu_map(cpu, i);
 
-       kfree(cpuid4_info[cpu]);
-       cpuid4_info[cpu] = NULL;
+       kfree(per_cpu(cpuid4_info, cpu));
+       per_cpu(cpuid4_info, cpu) = NULL;
 }
 
 static int __cpuinit detect_cache_attributes(unsigned int cpu)
@@ -519,13 +519,13 @@ static int __cpuinit detect_cache_attributes(unsigned int cpu)
        if (num_cache_leaves == 0)
                return -ENOENT;
 
-       cpuid4_info[cpu] = kzalloc(
+       per_cpu(cpuid4_info, cpu) = kzalloc(
            sizeof(struct _cpuid4_info) * num_cache_leaves, GFP_KERNEL);
-       if (cpuid4_info[cpu] == NULL)
+       if (per_cpu(cpuid4_info, cpu) == NULL)
                return -ENOMEM;
 
        oldmask = current->cpus_allowed;
-       retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+       retval = set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
        if (retval)
                goto out;
 
@@ -542,12 +542,12 @@ static int __cpuinit detect_cache_attributes(unsigned int cpu)
                }
                cache_shared_cpu_map_setup(cpu, j);
        }
-       set_cpus_allowed(current, oldmask);
+       set_cpus_allowed_ptr(current, &oldmask);
 
 out:
        if (retval) {
-               kfree(cpuid4_info[cpu]);
-               cpuid4_info[cpu] = NULL;
+               kfree(per_cpu(cpuid4_info, cpu));
+               per_cpu(cpuid4_info, cpu) = NULL;
        }
 
        return retval;
@@ -561,7 +561,7 @@ out:
 extern struct sysdev_class cpu_sysdev_class; /* from drivers/base/cpu.c */
 
 /* pointer to kobject for cpuX/cache */
-static struct kobject * cache_kobject[NR_CPUS];
+static DEFINE_PER_CPU(struct kobject *, cache_kobject);
 
 struct _index_kobject {
        struct kobject kobj;
@@ -570,8 +570,8 @@ struct _index_kobject {
 };
 
 /* pointer to array of kobjects for cpuX/cache/indexY */
-static struct _index_kobject *index_kobject[NR_CPUS];
-#define INDEX_KOBJECT_PTR(x,y)    (&((index_kobject[x])[y]))
+static DEFINE_PER_CPU(struct _index_kobject *, index_kobject);
+#define INDEX_KOBJECT_PTR(x, y)    (&((per_cpu(index_kobject, x))[y]))
 
 #define show_one_plus(file_name, object, val)                          \
 static ssize_t show_##file_name                                                \
@@ -591,11 +591,32 @@ static ssize_t show_size(struct _cpuid4_info *this_leaf, char *buf)
        return sprintf (buf, "%luK\n", this_leaf->size / 1024);
 }
 
-static ssize_t show_shared_cpu_map(struct _cpuid4_info *this_leaf, char *buf)
+static ssize_t show_shared_cpu_map_func(struct _cpuid4_info *this_leaf,
+                                       int type, char *buf)
 {
-       char mask_str[NR_CPUS];
-       cpumask_scnprintf(mask_str, NR_CPUS, this_leaf->shared_cpu_map);
-       return sprintf(buf, "%s\n", mask_str);
+       ptrdiff_t len = PTR_ALIGN(buf + PAGE_SIZE - 1, PAGE_SIZE) - buf;
+       int n = 0;
+
+       if (len > 1) {
+               cpumask_t *mask = &this_leaf->shared_cpu_map;
+
+               n = type?
+                       cpulist_scnprintf(buf, len-2, *mask):
+                       cpumask_scnprintf(buf, len-2, *mask);
+               buf[n++] = '\n';
+               buf[n] = '\0';
+       }
+       return n;
+}
+
+static inline ssize_t show_shared_cpu_map(struct _cpuid4_info *leaf, char *buf)
+{
+       return show_shared_cpu_map_func(leaf, 0, buf);
+}
+
+static inline ssize_t show_shared_cpu_list(struct _cpuid4_info *leaf, char *buf)
+{
+       return show_shared_cpu_map_func(leaf, 1, buf);
 }
 
 static ssize_t show_type(struct _cpuid4_info *this_leaf, char *buf) {
@@ -633,6 +654,7 @@ define_one_ro(ways_of_associativity);
 define_one_ro(number_of_sets);
 define_one_ro(size);
 define_one_ro(shared_cpu_map);
+define_one_ro(shared_cpu_list);
 
 static struct attribute * default_attrs[] = {
        &type.attr,
@@ -643,6 +665,7 @@ static struct attribute * default_attrs[] = {
        &number_of_sets.attr,
        &size.attr,
        &shared_cpu_map.attr,
+       &shared_cpu_list.attr,
        NULL
 };
 
@@ -684,10 +707,10 @@ static struct kobj_type ktype_percpu_entry = {
 
 static void __cpuinit cpuid4_cache_sysfs_exit(unsigned int cpu)
 {
-       kfree(cache_kobject[cpu]);
-       kfree(index_kobject[cpu]);
-       cache_kobject[cpu] = NULL;
-       index_kobject[cpu] = NULL;
+       kfree(per_cpu(cache_kobject, cpu));
+       kfree(per_cpu(index_kobject, cpu));
+       per_cpu(cache_kobject, cpu) = NULL;
+       per_cpu(index_kobject, cpu) = NULL;
        free_cache_attributes(cpu);
 }
 
@@ -703,13 +726,14 @@ static int __cpuinit cpuid4_cache_sysfs_init(unsigned int cpu)
                return err;
 
        /* Allocate all required memory */
-       cache_kobject[cpu] = kzalloc(sizeof(struct kobject), GFP_KERNEL);
-       if (unlikely(cache_kobject[cpu] == NULL))
+       per_cpu(cache_kobject, cpu) =
+               kzalloc(sizeof(struct kobject), GFP_KERNEL);
+       if (unlikely(per_cpu(cache_kobject, cpu) == NULL))
                goto err_out;
 
-       index_kobject[cpu] = kzalloc(
+       per_cpu(index_kobject, cpu) = kzalloc(
            sizeof(struct _index_kobject ) * num_cache_leaves, GFP_KERNEL);
-       if (unlikely(index_kobject[cpu] == NULL))
+       if (unlikely(per_cpu(index_kobject, cpu) == NULL))
                goto err_out;
 
        return 0;
@@ -733,7 +757,8 @@ static int __cpuinit cache_add_dev(struct sys_device * sys_dev)
        if (unlikely(retval < 0))
                return retval;
 
-       retval = kobject_init_and_add(cache_kobject[cpu], &ktype_percpu_entry,
+       retval = kobject_init_and_add(per_cpu(cache_kobject, cpu),
+                                     &ktype_percpu_entry,
                                      &sys_dev->kobj, "%s", "cache");
        if (retval < 0) {
                cpuid4_cache_sysfs_exit(cpu);
@@ -745,13 +770,14 @@ static int __cpuinit cache_add_dev(struct sys_device * sys_dev)
                this_object->cpu = cpu;
                this_object->index = i;
                retval = kobject_init_and_add(&(this_object->kobj),
-                                             &ktype_cache, cache_kobject[cpu],
+                                             &ktype_cache,
+                                             per_cpu(cache_kobject, cpu),
                                              "index%1lu", i);
                if (unlikely(retval)) {
                        for (j = 0; j < i; j++) {
                                kobject_put(&(INDEX_KOBJECT_PTR(cpu,j)->kobj));
                        }
-                       kobject_put(cache_kobject[cpu]);
+                       kobject_put(per_cpu(cache_kobject, cpu));
                        cpuid4_cache_sysfs_exit(cpu);
                        break;
                }
@@ -760,7 +786,7 @@ static int __cpuinit cache_add_dev(struct sys_device * sys_dev)
        if (!retval)
                cpu_set(cpu, cache_dev_map);
 
-       kobject_uevent(cache_kobject[cpu], KOBJ_ADD);
+       kobject_uevent(per_cpu(cache_kobject, cpu), KOBJ_ADD);
        return retval;
 }
 
@@ -769,7 +795,7 @@ static void __cpuinit cache_remove_dev(struct sys_device * sys_dev)
        unsigned int cpu = sys_dev->id;
        unsigned long i;
 
-       if (cpuid4_info[cpu] == NULL)
+       if (per_cpu(cpuid4_info, cpu) == NULL)
                return;
        if (!cpu_isset(cpu, cache_dev_map))
                return;
@@ -777,7 +803,7 @@ static void __cpuinit cache_remove_dev(struct sys_device * sys_dev)
 
        for (i = 0; i < num_cache_leaves; i++)
                kobject_put(&(INDEX_KOBJECT_PTR(cpu,i)->kobj));
-       kobject_put(cache_kobject[cpu]);
+       kobject_put(per_cpu(cache_kobject, cpu));
        cpuid4_cache_sysfs_exit(cpu);
 }
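
Besides the affinity change, this file replaces its NR_CPUS-sized static pointer arrays (cpuid4_info, cache_kobject, index_kobject) with per-CPU variables, so the bookkeeping scales with the CPUs actually present rather than the compile-time maximum. The shape of that conversion, reduced to a generic sketch:

struct foo { int data; };			/* stand-in for the real per-CPU record */

/* before: static struct foo *foo_info[NR_CPUS]; */
static DEFINE_PER_CPU(struct foo *, foo_info);

static int __cpuinit alloc_foo(unsigned int cpu)
{
	per_cpu(foo_info, cpu) = kzalloc(sizeof(struct foo), GFP_KERNEL);
	if (!per_cpu(foo_info, cpu))
		return -ENOMEM;
	return 0;
}
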
 
index 32671da8184e5e78c1e090d5d795606057f3cd37..7c9a813e11939b7e7fa583549d4c9affdde26476 100644 (file)
@@ -251,18 +251,18 @@ struct threshold_attr {
        ssize_t(*store) (struct threshold_block *, const char *, size_t count);
 };
 
-static cpumask_t affinity_set(unsigned int cpu)
+static void affinity_set(unsigned int cpu, cpumask_t *oldmask,
+                                          cpumask_t *newmask)
 {
-       cpumask_t oldmask = current->cpus_allowed;
-       cpumask_t newmask = CPU_MASK_NONE;
-       cpu_set(cpu, newmask);
-       set_cpus_allowed(current, newmask);
-       return oldmask;
+       *oldmask = current->cpus_allowed;
+       cpus_clear(*newmask);
+       cpu_set(cpu, *newmask);
+       set_cpus_allowed_ptr(current, newmask);
 }
 
-static void affinity_restore(cpumask_t oldmask)
+static void affinity_restore(const cpumask_t *oldmask)
 {
-       set_cpus_allowed(current, oldmask);
+       set_cpus_allowed_ptr(current, oldmask);
 }
 
 #define SHOW_FIELDS(name)                                           \
@@ -277,15 +277,15 @@ static ssize_t store_interrupt_enable(struct threshold_block *b,
                                      const char *buf, size_t count)
 {
        char *end;
-       cpumask_t oldmask;
+       cpumask_t oldmask, newmask;
        unsigned long new = simple_strtoul(buf, &end, 0);
        if (end == buf)
                return -EINVAL;
        b->interrupt_enable = !!new;
 
-       oldmask = affinity_set(b->cpu);
+       affinity_set(b->cpu, &oldmask, &newmask);
        threshold_restart_bank(b, 0, 0);
-       affinity_restore(oldmask);
+       affinity_restore(&oldmask);
 
        return end - buf;
 }
@@ -294,7 +294,7 @@ static ssize_t store_threshold_limit(struct threshold_block *b,
                                     const char *buf, size_t count)
 {
        char *end;
-       cpumask_t oldmask;
+       cpumask_t oldmask, newmask;
        u16 old;
        unsigned long new = simple_strtoul(buf, &end, 0);
        if (end == buf)
@@ -306,9 +306,9 @@ static ssize_t store_threshold_limit(struct threshold_block *b,
        old = b->threshold_limit;
        b->threshold_limit = new;
 
-       oldmask = affinity_set(b->cpu);
+       affinity_set(b->cpu, &oldmask, &newmask);
        threshold_restart_bank(b, 0, old);
-       affinity_restore(oldmask);
+       affinity_restore(&oldmask);
 
        return end - buf;
 }
@@ -316,10 +316,10 @@ static ssize_t store_threshold_limit(struct threshold_block *b,
 static ssize_t show_error_count(struct threshold_block *b, char *buf)
 {
        u32 high, low;
-       cpumask_t oldmask;
-       oldmask = affinity_set(b->cpu);
+       cpumask_t oldmask, newmask;
+       affinity_set(b->cpu, &oldmask, &newmask);
        rdmsr(b->address, low, high);
-       affinity_restore(oldmask);
+       affinity_restore(&oldmask);
        return sprintf(buf, "%x\n",
                       (high & 0xFFF) - (THRESHOLD_MAX - b->threshold_limit));
 }
@@ -327,10 +327,10 @@ static ssize_t show_error_count(struct threshold_block *b, char *buf)
 static ssize_t store_error_count(struct threshold_block *b,
                                 const char *buf, size_t count)
 {
-       cpumask_t oldmask;
-       oldmask = affinity_set(b->cpu);
+       cpumask_t oldmask, newmask;
+       affinity_set(b->cpu, &oldmask, &newmask);
        threshold_restart_bank(b, 1, 0);
-       affinity_restore(oldmask);
+       affinity_restore(&oldmask);
        return 1;
 }
 
@@ -468,7 +468,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
 {
        int i, err = 0;
        struct threshold_bank *b = NULL;
-       cpumask_t oldmask = CPU_MASK_NONE;
+       cpumask_t oldmask, newmask;
        char name[32];
 
        sprintf(name, "threshold_bank%i", bank);
@@ -519,10 +519,10 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
 
        per_cpu(threshold_banks, cpu)[bank] = b;
 
-       oldmask = affinity_set(cpu);
+       affinity_set(cpu, &oldmask, &newmask);
        err = allocate_threshold_blocks(cpu, bank, 0,
                                        MSR_IA32_MC0_MISC + bank * 4);
-       affinity_restore(oldmask);
+       affinity_restore(&oldmask);
 
        if (err)
                goto out_free;
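
affinity_set() no longer returns the previous cpumask_t by value; it fills two masks the caller provides, which keeps large cpumask_t copies out of return values. The resulting caller shape, as a sketch (poke_bank is a placeholder name):

static void poke_bank(struct threshold_block *b)
{
	cpumask_t oldmask, newmask;	/* caller owns the storage */

	affinity_set(b->cpu, &oldmask, &newmask);
	threshold_restart_bank(b, 0, 0);
	affinity_restore(&oldmask);
}
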
index 9b7e01daa1ca22c70830add7e46e68247f903899..1f4cc48c14c633be4e0a1ca62b1502616337b048 100644 (file)
@@ -1,5 +1,4 @@
 /*
- * linux/arch/i386/kernel/cpu/mcheck/therm_throt.c
  *
  * Thermal throttle event support code (such as syslog messaging and rate
  * limiting) that was factored out from x86_64 (mce_intel.c) and i386 (p4.c).
index 0240cd778365d12a05e3096e4b7492b510573e5a..ed733e7cf4e611c454f9ea622c9322919aa624b4 100644 (file)
@@ -475,7 +475,7 @@ int __init copy_e820_map(struct e820entry *biosmap, int nr_map)
 /*
  * Find the highest page frame number we have available
  */
-void __init find_max_pfn(void)
+void __init propagate_e820_map(void)
 {
        int i;
 
@@ -704,7 +704,7 @@ static int __init parse_memmap(char *arg)
                 * size before original memory map is
                 * reset.
                 */
-               find_max_pfn();
+               propagate_e820_map();
                saved_max_pfn = max_pfn;
 #endif
                e820.nr_map = 0;
index 7f6c0c85c8f65e0a37524a095b6ac5d7d86009b1..cbd42e51cb082d82b8f287eaa1c244913268d64c 100644 (file)
@@ -96,7 +96,7 @@ void __init early_res_to_bootmem(void)
 }
 
 /* Check for already reserved areas */
-static inline int
+static inline int __init
 bad_addr(unsigned long *addrp, unsigned long size, unsigned long align)
 {
        int i;
@@ -116,7 +116,7 @@ again:
 }
 
 /* Check for already reserved areas */
-static inline int
+static inline int __init
 bad_addr_size(unsigned long *addrp, unsigned long *sizep, unsigned long align)
 {
        int i;
index 759e02bec0708f955764907ec380e6eb737c6dfa..77d424cf68b38e6b919f55b6fd895436e887fd74 100644 (file)
@@ -383,6 +383,7 @@ static void __init runtime_code_page_mkexec(void)
 {
        efi_memory_desc_t *md;
        void *p;
+       u64 addr, npages;
 
        /* Make EFI runtime service code area executable */
        for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
@@ -391,7 +392,10 @@ static void __init runtime_code_page_mkexec(void)
                if (md->type != EFI_RUNTIME_SERVICES_CODE)
                        continue;
 
-               set_memory_x(md->virt_addr, md->num_pages);
+               addr = md->virt_addr;
+               npages = md->num_pages;
+               memrange_efi_to_native(&addr, &npages);
+               set_memory_x(addr, npages);
        }
 }
 
@@ -408,7 +412,7 @@ void __init efi_enter_virtual_mode(void)
        efi_memory_desc_t *md;
        efi_status_t status;
        unsigned long size;
-       u64 end, systab;
+       u64 end, systab, addr, npages;
        void *p, *va;
 
        efi.systab = NULL;
@@ -420,7 +424,7 @@ void __init efi_enter_virtual_mode(void)
                size = md->num_pages << EFI_PAGE_SHIFT;
                end = md->phys_addr + size;
 
-               if ((end >> PAGE_SHIFT) <= max_pfn_mapped)
+               if (PFN_UP(end) <= max_pfn_mapped)
                        va = __va(md->phys_addr);
                else
                        va = efi_ioremap(md->phys_addr, size);
@@ -433,8 +437,12 @@ void __init efi_enter_virtual_mode(void)
                        continue;
                }
 
-               if (!(md->attribute & EFI_MEMORY_WB))
-                       set_memory_uc(md->virt_addr, md->num_pages);
+               if (!(md->attribute & EFI_MEMORY_WB)) {
+                       addr = md->virt_addr;
+                       npages = md->num_pages;
+                       memrange_efi_to_native(&addr, &npages);
+                       set_memory_uc(addr, npages);
+               }
 
                systab = (u64) (unsigned long) efi_phys.systab;
                if (md->phys_addr <= systab && systab < end) {
index d143a1e76b301737c6fc34f2a0d6da656b787038..d0060fdcccac1658968db527f4c05d791b056e6b 100644 (file)
@@ -105,14 +105,14 @@ void __init efi_reserve_bootmem(void)
 
 void __iomem * __init efi_ioremap(unsigned long phys_addr, unsigned long size)
 {
-       static unsigned pages_mapped;
+       static unsigned pages_mapped __initdata;
        unsigned i, pages;
+       unsigned long offset;
 
-       /* phys_addr and size must be page aligned */
-       if ((phys_addr & ~PAGE_MASK) || (size & ~PAGE_MASK))
-               return NULL;
+       pages = PFN_UP(phys_addr + size) - PFN_DOWN(phys_addr);
+       offset = phys_addr & ~PAGE_MASK;
+       phys_addr &= PAGE_MASK;
 
-       pages = size >> PAGE_SHIFT;
        if (pages_mapped + pages > MAX_EFI_IO_PAGES)
                return NULL;
 
@@ -124,5 +124,5 @@ void __iomem * __init efi_ioremap(unsigned long phys_addr, unsigned long size)
        }
 
        return (void __iomem *)__fix_to_virt(FIX_EFI_IO_MAP_FIRST_PAGE - \
-                                            (pages_mapped - pages));
+                                            (pages_mapped - pages)) + offset;
 }
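
efi_ioremap() previously refused regions that were not page aligned; it now rounds the request out to whole pages and returns the mapping base plus the sub-page offset. The page arithmetic on its own, as a sketch:

static unsigned long span_pages(unsigned long phys_addr, unsigned long size)
{
	/* whole pages touched by [phys_addr, phys_addr + size) */
	return PFN_UP(phys_addr + size) - PFN_DOWN(phys_addr);
}

For example, a 2-byte region that straddles a page boundary spans two pages even though size >> PAGE_SHIFT is zero.
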
index 9ba49a26dff8a25c5ad0e1adfb9c175e7cb3639e..f0f8934fc30324fc29af50dc436db6f14b198a93 100644 (file)
@@ -1,5 +1,4 @@
 /*
- *  linux/arch/i386/entry.S
  *
  *  Copyright (C) 1991, 1992  Linus Torvalds
  */
index 5d77c9cd8e15c6782e4a0872d5f8e4e820509395..ebf13908a743b6cd8cbdc231d0758f2388b92243 100644 (file)
@@ -61,26 +61,31 @@ int uv_wakeup_secondary(int phys_apicid, unsigned int start_rip)
        val = (1UL << UVH_IPI_INT_SEND_SHFT) |
            (phys_apicid << UVH_IPI_INT_APIC_ID_SHFT) |
            (((long)start_rip << UVH_IPI_INT_VECTOR_SHFT) >> 12) |
-           (6 << UVH_IPI_INT_DELIVERY_MODE_SHFT);
+           APIC_DM_INIT;
+       uv_write_global_mmr64(nasid, UVH_IPI_INT, val);
+       mdelay(10);
+
+       val = (1UL << UVH_IPI_INT_SEND_SHFT) |
+           (phys_apicid << UVH_IPI_INT_APIC_ID_SHFT) |
+           (((long)start_rip << UVH_IPI_INT_VECTOR_SHFT) >> 12) |
+           APIC_DM_STARTUP;
        uv_write_global_mmr64(nasid, UVH_IPI_INT, val);
        return 0;
 }
 
 static void uv_send_IPI_one(int cpu, int vector)
 {
-       unsigned long val, apicid;
+       unsigned long val, apicid, lapicid;
        int nasid;
 
        apicid = per_cpu(x86_cpu_to_apicid, cpu); /* ZZZ - cache node-local ? */
+       lapicid = apicid & 0x3f;                /* ZZZ macro needed */
        nasid = uv_apicid_to_nasid(apicid);
        val =
-           (1UL << UVH_IPI_INT_SEND_SHFT) | (apicid <<
+           (1UL << UVH_IPI_INT_SEND_SHFT) | (lapicid <<
                                              UVH_IPI_INT_APIC_ID_SHFT) |
            (vector << UVH_IPI_INT_VECTOR_SHFT);
        uv_write_global_mmr64(nasid, UVH_IPI_INT, val);
-       printk(KERN_DEBUG
-            "UV: IPI to cpu %d, apicid 0x%lx, vec %d, nasid%d, val 0x%lx\n",
-            cpu, apicid, vector, nasid, val);
 }
 
 static void uv_send_IPI_mask(cpumask_t mask, int vector)
index d6d54faa84dfb1020ea838896a2aca8ee18bda06..993c767732564afe285e7ddc23228511108b7b11 100644 (file)
@@ -146,6 +146,7 @@ void __init x86_64_start_kernel(char * real_mode_data)
 
        reserve_early(__pa_symbol(&_text), __pa_symbol(&_end), "TEXT DATA BSS");
 
+#ifdef CONFIG_BLK_DEV_INITRD
        /* Reserve INITRD */
        if (boot_params.hdr.type_of_loader && boot_params.hdr.ramdisk_image) {
                unsigned long ramdisk_image = boot_params.hdr.ramdisk_image;
@@ -153,6 +154,7 @@ void __init x86_64_start_kernel(char * real_mode_data)
                unsigned long ramdisk_end   = ramdisk_image + ramdisk_size;
                reserve_early(ramdisk_image, ramdisk_end, "RAMDISK");
        }
+#endif
 
        reserve_ebda_region();
 
index 826988a6e964717e623350efec94e03db932c798..90f038af3adc326725cbdec61825e1b31f340ebb 100644 (file)
@@ -1,5 +1,4 @@
 /*
- *  linux/arch/i386/kernel/head.S -- the 32-bit startup code.
  *
  *  Copyright (C) 1991, 1992  Linus Torvalds
  *
index 8f8102d967b3f4111c58be6dccbd6e4192fc3864..db6839b53195e1a83d9186a76bf9a05562fc39e6 100644 (file)
 #endif
 
 static unsigned int            mxcsr_feature_mask __read_mostly = 0xffffffffu;
+unsigned int xstate_size;
+static struct i387_fxsave_struct fx_scratch __cpuinitdata;
 
-void mxcsr_feature_mask_init(void)
+void __cpuinit mxcsr_feature_mask_init(void)
 {
        unsigned long mask = 0;
 
        clts();
        if (cpu_has_fxsr) {
-               memset(&current->thread.i387.fxsave, 0,
-                      sizeof(struct i387_fxsave_struct));
-               asm volatile("fxsave %0" : : "m" (current->thread.i387.fxsave));
-               mask = current->thread.i387.fxsave.mxcsr_mask;
+               memset(&fx_scratch, 0, sizeof(struct i387_fxsave_struct));
+               asm volatile("fxsave %0" : : "m" (fx_scratch));
+               mask = fx_scratch.mxcsr_mask;
                if (mask == 0)
                        mask = 0x0000ffbf;
        }
@@ -53,6 +54,16 @@ void mxcsr_feature_mask_init(void)
        stts();
 }
 
+void __init init_thread_xstate(void)
+{
+       if (cpu_has_fxsr)
+               xstate_size = sizeof(struct i387_fxsave_struct);
+#ifdef CONFIG_X86_32
+       else
+               xstate_size = sizeof(struct i387_fsave_struct);
+#endif
+}
+
 #ifdef CONFIG_X86_64
 /*
  * Called at bootup to set up the initial FPU state that is later cloned
@@ -61,10 +72,6 @@ void mxcsr_feature_mask_init(void)
 void __cpuinit fpu_init(void)
 {
        unsigned long oldcr0 = read_cr0();
-       extern void __bad_fxsave_alignment(void);
-
-       if (offsetof(struct task_struct, thread.i387.fxsave) & 15)
-               __bad_fxsave_alignment();
 
        set_in_cr4(X86_CR4_OSFXSR);
        set_in_cr4(X86_CR4_OSXMMEXCPT);
@@ -84,32 +91,44 @@ void __cpuinit fpu_init(void)
  * value at reset if we support XMM instructions and then
  * remeber the current task has used the FPU.
  */
-void init_fpu(struct task_struct *tsk)
+int init_fpu(struct task_struct *tsk)
 {
        if (tsk_used_math(tsk)) {
                if (tsk == current)
                        unlazy_fpu(tsk);
-               return;
+               return 0;
+       }
+
+       /*
+        * Memory allocation at the first usage of the FPU and other state.
+        */
+       if (!tsk->thread.xstate) {
+               tsk->thread.xstate = kmem_cache_alloc(task_xstate_cachep,
+                                                     GFP_KERNEL);
+               if (!tsk->thread.xstate)
+                       return -ENOMEM;
        }
 
        if (cpu_has_fxsr) {
-               memset(&tsk->thread.i387.fxsave, 0,
-                      sizeof(struct i387_fxsave_struct));
-               tsk->thread.i387.fxsave.cwd = 0x37f;
+               struct i387_fxsave_struct *fx = &tsk->thread.xstate->fxsave;
+
+               memset(fx, 0, xstate_size);
+               fx->cwd = 0x37f;
                if (cpu_has_xmm)
-                       tsk->thread.i387.fxsave.mxcsr = MXCSR_DEFAULT;
+                       fx->mxcsr = MXCSR_DEFAULT;
        } else {
-               memset(&tsk->thread.i387.fsave, 0,
-                      sizeof(struct i387_fsave_struct));
-               tsk->thread.i387.fsave.cwd = 0xffff037fu;
-               tsk->thread.i387.fsave.swd = 0xffff0000u;
-               tsk->thread.i387.fsave.twd = 0xffffffffu;
-               tsk->thread.i387.fsave.fos = 0xffff0000u;
+               struct i387_fsave_struct *fp = &tsk->thread.xstate->fsave;
+               memset(fp, 0, xstate_size);
+               fp->cwd = 0xffff037fu;
+               fp->swd = 0xffff0000u;
+               fp->twd = 0xffffffffu;
+               fp->fos = 0xffff0000u;
        }
        /*
         * Only the device not available exception or ptrace can call init_fpu.
         */
        set_stopped_child_used_math(tsk);
+       return 0;
 }
 
 int fpregs_active(struct task_struct *target, const struct user_regset *regset)
@@ -126,13 +145,17 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
                unsigned int pos, unsigned int count,
                void *kbuf, void __user *ubuf)
 {
+       int ret;
+
        if (!cpu_has_fxsr)
                return -ENODEV;
 
-       init_fpu(target);
+       ret = init_fpu(target);
+       if (ret)
+               return ret;
 
        return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
-                                  &target->thread.i387.fxsave, 0, -1);
+                                  &target->thread.xstate->fxsave, 0, -1);
 }
 
 int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
@@ -144,16 +167,19 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
        if (!cpu_has_fxsr)
                return -ENODEV;
 
-       init_fpu(target);
+       ret = init_fpu(target);
+       if (ret)
+               return ret;
+
        set_stopped_child_used_math(target);
 
        ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
-                                &target->thread.i387.fxsave, 0, -1);
+                                &target->thread.xstate->fxsave, 0, -1);
 
        /*
         * mxcsr reserved bits must be masked to zero for security reasons.
         */
-       target->thread.i387.fxsave.mxcsr &= mxcsr_feature_mask;
+       target->thread.xstate->fxsave.mxcsr &= mxcsr_feature_mask;
 
        return ret;
 }
@@ -233,7 +259,7 @@ static inline u32 twd_fxsr_to_i387(struct i387_fxsave_struct *fxsave)
 static void
 convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
 {
-       struct i387_fxsave_struct *fxsave = &tsk->thread.i387.fxsave;
+       struct i387_fxsave_struct *fxsave = &tsk->thread.xstate->fxsave;
        struct _fpreg *to = (struct _fpreg *) &env->st_space[0];
        struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0];
        int i;
@@ -273,7 +299,7 @@ static void convert_to_fxsr(struct task_struct *tsk,
                            const struct user_i387_ia32_struct *env)
 
 {
-       struct i387_fxsave_struct *fxsave = &tsk->thread.i387.fxsave;
+       struct i387_fxsave_struct *fxsave = &tsk->thread.xstate->fxsave;
        struct _fpreg *from = (struct _fpreg *) &env->st_space[0];
        struct _fpxreg *to = (struct _fpxreg *) &fxsave->st_space[0];
        int i;
@@ -302,15 +328,19 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
               void *kbuf, void __user *ubuf)
 {
        struct user_i387_ia32_struct env;
+       int ret;
 
        if (!HAVE_HWFP)
                return fpregs_soft_get(target, regset, pos, count, kbuf, ubuf);
 
-       init_fpu(target);
+       ret = init_fpu(target);
+       if (ret)
+               return ret;
 
        if (!cpu_has_fxsr) {
                return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
-                                          &target->thread.i387.fsave, 0, -1);
+                                          &target->thread.xstate->fsave, 0,
+                                          -1);
        }
 
        if (kbuf && pos == 0 && count == sizeof(env)) {
@@ -333,12 +363,15 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
        if (!HAVE_HWFP)
                return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
 
-       init_fpu(target);
+       ret = init_fpu(target);
+       if (ret)
+               return ret;
+
        set_stopped_child_used_math(target);
 
        if (!cpu_has_fxsr) {
                return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
-                                         &target->thread.i387.fsave, 0, -1);
+                                         &target->thread.xstate->fsave, 0, -1);
        }
 
        if (pos > 0 || count < sizeof(env))
@@ -358,11 +391,11 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
 static inline int save_i387_fsave(struct _fpstate_ia32 __user *buf)
 {
        struct task_struct *tsk = current;
+       struct i387_fsave_struct *fp = &tsk->thread.xstate->fsave;
 
        unlazy_fpu(tsk);
-       tsk->thread.i387.fsave.status = tsk->thread.i387.fsave.swd;
-       if (__copy_to_user(buf, &tsk->thread.i387.fsave,
-                          sizeof(struct i387_fsave_struct)))
+       fp->status = fp->swd;
+       if (__copy_to_user(buf, fp, sizeof(struct i387_fsave_struct)))
                return -1;
        return 1;
 }
@@ -370,6 +403,7 @@ static inline int save_i387_fsave(struct _fpstate_ia32 __user *buf)
 static int save_i387_fxsave(struct _fpstate_ia32 __user *buf)
 {
        struct task_struct *tsk = current;
+       struct i387_fxsave_struct *fx = &tsk->thread.xstate->fxsave;
        struct user_i387_ia32_struct env;
        int err = 0;
 
@@ -379,12 +413,12 @@ static int save_i387_fxsave(struct _fpstate_ia32 __user *buf)
        if (__copy_to_user(buf, &env, sizeof(env)))
                return -1;
 
-       err |= __put_user(tsk->thread.i387.fxsave.swd, &buf->status);
+       err |= __put_user(fx->swd, &buf->status);
        err |= __put_user(X86_FXSR_MAGIC, &buf->magic);
        if (err)
                return -1;
 
-       if (__copy_to_user(&buf->_fxsr_env[0], &tsk->thread.i387.fxsave,
+       if (__copy_to_user(&buf->_fxsr_env[0], fx,
                           sizeof(struct i387_fxsave_struct)))
                return -1;
        return 1;
@@ -417,7 +451,7 @@ static inline int restore_i387_fsave(struct _fpstate_ia32 __user *buf)
        struct task_struct *tsk = current;
 
        clear_fpu(tsk);
-       return __copy_from_user(&tsk->thread.i387.fsave, buf,
+       return __copy_from_user(&tsk->thread.xstate->fsave, buf,
                                sizeof(struct i387_fsave_struct));
 }
 
@@ -428,10 +462,10 @@ static int restore_i387_fxsave(struct _fpstate_ia32 __user *buf)
        int err;
 
        clear_fpu(tsk);
-       err = __copy_from_user(&tsk->thread.i387.fxsave, &buf->_fxsr_env[0],
+       err = __copy_from_user(&tsk->thread.xstate->fxsave, &buf->_fxsr_env[0],
                               sizeof(struct i387_fxsave_struct));
        /* mxcsr reserved bits must be masked to zero for security reasons */
-       tsk->thread.i387.fxsave.mxcsr &= mxcsr_feature_mask;
+       tsk->thread.xstate->fxsave.mxcsr &= mxcsr_feature_mask;
        if (err || __copy_from_user(&env, buf, sizeof(env)))
                return 1;
        convert_to_fxsr(tsk, &env);
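The hunks above all sanitise a user-supplied MXCSR the same way: bits the CPU does not implement are cleared before the value can ever reach FXRSTOR, which would otherwise fault. A minimal sketch of that check (sanitize_mxcsr() is a hypothetical helper; the real mxcsr_feature_mask is probed from FXSAVE at boot, 0x0000ffbf being only the conservative default):

	/* sketch only -- mxcsr_feature_mask normally comes from the boot-time probe */
	#define MXCSR_DEFAULT_MASK	0x0000ffbf

	static unsigned int sanitize_mxcsr(unsigned int user_mxcsr)
	{
		/* clear reserved bits so a later FXRSTOR cannot fault */
		return user_mxcsr & MXCSR_DEFAULT_MASK;
	}
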
index b54464b26658227413cbbd17b810cf77ed9a14d6..9ba11d07920f794c71e11de98d51d8430e129cf2 100644 (file)
@@ -785,7 +785,7 @@ static void __clear_irq_vector(int irq)
                per_cpu(vector_irq, cpu)[vector] = -1;
 
        cfg->vector = 0;
-       cfg->domain = CPU_MASK_NONE;
+       cpus_clear(cfg->domain);
 }
 
 void __setup_vector_irq(int cpu)
index 24362ecf5f9a9006f76698a2e410a48312904ed5..f47f0eb886b8ddeab27ec8a8fd319801c45ef503 100644 (file)
 #include <asm/apicdef.h>
 #include <asm/system.h>
 
-#ifdef CONFIG_X86_32
-# include <mach_ipi.h>
-#else
-# include <asm/mach_apic.h>
-#endif
+#include <mach_ipi.h>
 
 /*
  * Put the error code here just in case the user cares:
index 25cf6dee4e56f74befff8a9be9d3261caba2b0ea..69729e38b78a2d83e7f7e799bd2ea37730305d18 100644 (file)
@@ -402,7 +402,7 @@ static int do_microcode_update (void)
 
                        if (!uci->valid)
                                continue;
-                       set_cpus_allowed(current, cpumask_of_cpu(cpu));
+                       set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
                        error = get_maching_microcode(new_mc, cpu);
                        if (error < 0)
                                goto out;
@@ -416,7 +416,7 @@ out:
                vfree(new_mc);
        if (cursor < 0)
                error = cursor;
-       set_cpus_allowed(current, old);
+       set_cpus_allowed_ptr(current, &old);
        return error;
 }
 
@@ -579,7 +579,7 @@ static int apply_microcode_check_cpu(int cpu)
                return 0;
 
        old = current->cpus_allowed;
-       set_cpus_allowed(current, cpumask_of_cpu(cpu));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
 
        /* Check if the microcode we have in memory matches the CPU */
        if (c->x86_vendor != X86_VENDOR_INTEL || c->x86 < 6 ||
@@ -610,7 +610,7 @@ static int apply_microcode_check_cpu(int cpu)
                        " sig=0x%x, pf=0x%x, rev=0x%x\n",
                        cpu, uci->sig, uci->pf, uci->rev);
 
-       set_cpus_allowed(current, old);
+       set_cpus_allowed_ptr(current, &old);
        return err;
 }
 
@@ -621,13 +621,13 @@ static void microcode_init_cpu(int cpu, int resume)
 
        old = current->cpus_allowed;
 
-       set_cpus_allowed(current, cpumask_of_cpu(cpu));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
        mutex_lock(&microcode_mutex);
        collect_cpu_info(cpu);
        if (uci->valid && system_state == SYSTEM_RUNNING && !resume)
                cpu_request_microcode(cpu);
        mutex_unlock(&microcode_mutex);
-       set_cpus_allowed(current, old);
+       set_cpus_allowed_ptr(current, &old);
 }
 
 static void microcode_fini_cpu(int cpu)
@@ -657,14 +657,14 @@ static ssize_t reload_store(struct sys_device *dev, const char *buf, size_t sz)
                old = current->cpus_allowed;
 
                get_online_cpus();
-               set_cpus_allowed(current, cpumask_of_cpu(cpu));
+               set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
 
                mutex_lock(&microcode_mutex);
                if (uci->valid)
                        err = cpu_request_microcode(cpu);
                mutex_unlock(&microcode_mutex);
                put_online_cpus();
-               set_cpus_allowed(current, old);
+               set_cpus_allowed_ptr(current, &old);
        }
        if (err)
                return err;
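All of the conversions above follow one idiom: remember the caller's affinity, pin to the CPU whose microcode is being touched, do the per-CPU work, then restore the old mask. The shape of it, as a sketch:

	cpumask_t old = current->cpus_allowed;

	set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
	/* ... per-CPU work, e.g. the microcode MSR accesses ... */
	set_cpus_allowed_ptr(current, &old);
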
index 8421d0ac6f2200fbf91dcae68aceba921ea49cbd..11b14bbaa61e6be6cfb1ffe7fd5b4c8f69fd71a7 100644 (file)
@@ -321,7 +321,8 @@ EXPORT_SYMBOL(touch_nmi_watchdog);
 
 extern void die_nmi(struct pt_regs *, const char *msg);
 
-__kprobes int nmi_watchdog_tick(struct pt_regs * regs, unsigned reason)
+notrace __kprobes int
+nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
 {
 
        /*
index 11f9130ac513e732fb6f9832d3af32c07bd4bd05..5a29ded994fa345fb67f92cc19b9fa9dd5606cf8 100644 (file)
@@ -313,7 +313,8 @@ void touch_nmi_watchdog(void)
 }
 EXPORT_SYMBOL(touch_nmi_watchdog);
 
-int __kprobes nmi_watchdog_tick(struct pt_regs * regs, unsigned reason)
+notrace __kprobes int
+nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
 {
        int sum;
        int touched = 0;
@@ -384,7 +385,8 @@ int __kprobes nmi_watchdog_tick(struct pt_regs * regs, unsigned reason)
 
 static unsigned ignore_nmis;
 
-asmlinkage __kprobes void do_nmi(struct pt_regs * regs, long error_code)
+asmlinkage notrace __kprobes void
+do_nmi(struct pt_regs *regs, long error_code)
 {
        nmi_enter();
        add_pda(__nmi_count,1);
index 1b5464c2434f2fbe115816a6603fc3434dd55b96..adb91e4b62dad3d6ef51188eb8e628b2024755c2 100644 (file)
@@ -470,10 +470,11 @@ error:
        return 0;
 }
 
-static dma_addr_t calgary_map_single(struct device *dev, void *vaddr,
+static dma_addr_t calgary_map_single(struct device *dev, phys_addr_t paddr,
        size_t size, int direction)
 {
        dma_addr_t dma_handle = bad_dma_address;
+       void *vaddr = phys_to_virt(paddr);
        unsigned long uaddr;
        unsigned int npages;
        struct iommu_table *tbl = find_iommu_table(dev);
similarity index 55%
rename from arch/x86/kernel/pci-dma_64.c
rename to arch/x86/kernel/pci-dma.c
index ada5a0604992d7788a3fda58ec82e44a2c662804..388b113a7d88ffbc0efaf52c348412af9e439a8c 100644 (file)
-/*
- * Dynamic DMA mapping support.
- */
-
-#include <linux/types.h>
-#include <linux/mm.h>
-#include <linux/string.h>
-#include <linux/pci.h>
-#include <linux/module.h>
+#include <linux/dma-mapping.h>
 #include <linux/dmar.h>
-#include <asm/io.h>
+#include <linux/bootmem.h>
+#include <linux/pci.h>
+
+#include <asm/proto.h>
+#include <asm/dma.h>
 #include <asm/gart.h>
 #include <asm/calgary.h>
 
-int iommu_merge __read_mostly = 0;
-
-dma_addr_t bad_dma_address __read_mostly;
-EXPORT_SYMBOL(bad_dma_address);
+int forbid_dac __read_mostly;
+EXPORT_SYMBOL(forbid_dac);
 
-/* This tells the BIO block layer to assume merging. Default to off
-   because we cannot guarantee merging later. */
-int iommu_bio_merge __read_mostly = 0;
-EXPORT_SYMBOL(iommu_bio_merge);
+const struct dma_mapping_ops *dma_ops;
+EXPORT_SYMBOL(dma_ops);
 
-static int iommu_sac_force __read_mostly = 0;
+int iommu_sac_force __read_mostly = 0;
 
-int no_iommu __read_mostly;
 #ifdef CONFIG_IOMMU_DEBUG
 int panic_on_overflow __read_mostly = 1;
 int force_iommu __read_mostly = 1;
 #else
 int panic_on_overflow __read_mostly = 0;
-int force_iommu __read_mostly= 0;
+int force_iommu __read_mostly = 0;
 #endif
 
+int iommu_merge __read_mostly = 0;
+
+int no_iommu __read_mostly;
 /* Set this to 1 if there is a HW IOMMU in the system */
 int iommu_detected __read_mostly = 0;
 
+/* This tells the BIO block layer to assume merging. Default to off
+   because we cannot guarantee merging later. */
+int iommu_bio_merge __read_mostly = 0;
+EXPORT_SYMBOL(iommu_bio_merge);
+
+dma_addr_t bad_dma_address __read_mostly = 0;
+EXPORT_SYMBOL(bad_dma_address);
+
 /* Dummy device used for NULL arguments (normally ISA). Better would
    be probably a smaller DMA mask, but this is bug-to-bug compatible
-   to i386. */
+   to older i386. */
 struct device fallback_dev = {
        .bus_id = "fallback device",
        .coherent_dma_mask = DMA_32BIT_MASK,
        .dma_mask = &fallback_dev.coherent_dma_mask,
 };
 
+int dma_set_mask(struct device *dev, u64 mask)
+{
+       if (!dev->dma_mask || !dma_supported(dev, mask))
+               return -EIO;
+
+       *dev->dma_mask = mask;
+
+       return 0;
+}
+EXPORT_SYMBOL(dma_set_mask);
+
+#ifdef CONFIG_X86_64
+static __initdata void *dma32_bootmem_ptr;
+static unsigned long dma32_bootmem_size __initdata = (128ULL<<20);
+
+static int __init parse_dma32_size_opt(char *p)
+{
+       if (!p)
+               return -EINVAL;
+       dma32_bootmem_size = memparse(p, &p);
+       return 0;
+}
+early_param("dma32_size", parse_dma32_size_opt);
+
+void __init dma32_reserve_bootmem(void)
+{
+       unsigned long size, align;
+       if (end_pfn <= MAX_DMA32_PFN)
+               return;
+
+       align = 64ULL<<20;
+       size = round_up(dma32_bootmem_size, align);
+       dma32_bootmem_ptr = __alloc_bootmem_nopanic(size, align,
+                                __pa(MAX_DMA_ADDRESS));
+       if (dma32_bootmem_ptr)
+               dma32_bootmem_size = size;
+       else
+               dma32_bootmem_size = 0;
+}
+static void __init dma32_free_bootmem(void)
+{
+       int node;
+
+       if (end_pfn <= MAX_DMA32_PFN)
+               return;
+
+       if (!dma32_bootmem_ptr)
+               return;
+
+       for_each_online_node(node)
+               free_bootmem_node(NODE_DATA(node), __pa(dma32_bootmem_ptr),
+                                 dma32_bootmem_size);
+
+       dma32_bootmem_ptr = NULL;
+       dma32_bootmem_size = 0;
+}
+
+void __init pci_iommu_alloc(void)
+{
+       /* free the range so iommu could get some range less than 4G */
+       dma32_free_bootmem();
+       /*
+        * The order of these functions is important for
+        * fall-back/fail-over reasons
+        */
+#ifdef CONFIG_GART_IOMMU
+       gart_iommu_hole_init();
+#endif
+
+#ifdef CONFIG_CALGARY_IOMMU
+       detect_calgary();
+#endif
+
+       detect_intel_iommu();
+
+#ifdef CONFIG_SWIOTLB
+       pci_swiotlb_init();
+#endif
+}
+#endif
+
+/*
+ * See <Documentation/x86_64/boot-options.txt> for the iommu kernel parameter
+ * documentation.
+ */
+static __init int iommu_setup(char *p)
+{
+       iommu_merge = 1;
+
+       if (!p)
+               return -EINVAL;
+
+       while (*p) {
+               if (!strncmp(p, "off", 3))
+                       no_iommu = 1;
+               /* gart_parse_options has more force support */
+               if (!strncmp(p, "force", 5))
+                       force_iommu = 1;
+               if (!strncmp(p, "noforce", 7)) {
+                       iommu_merge = 0;
+                       force_iommu = 0;
+               }
+
+               if (!strncmp(p, "biomerge", 8)) {
+                       iommu_bio_merge = 4096;
+                       iommu_merge = 1;
+                       force_iommu = 1;
+               }
+               if (!strncmp(p, "panic", 5))
+                       panic_on_overflow = 1;
+               if (!strncmp(p, "nopanic", 7))
+                       panic_on_overflow = 0;
+               if (!strncmp(p, "merge", 5)) {
+                       iommu_merge = 1;
+                       force_iommu = 1;
+               }
+               if (!strncmp(p, "nomerge", 7))
+                       iommu_merge = 0;
+               if (!strncmp(p, "forcesac", 8))
+                       iommu_sac_force = 1;
+               if (!strncmp(p, "allowdac", 8))
+                       forbid_dac = 0;
+               if (!strncmp(p, "nodac", 5))
+                       forbid_dac = -1;
+               if (!strncmp(p, "usedac", 6)) {
+                       forbid_dac = -1;
+                       return 1;
+               }
+#ifdef CONFIG_SWIOTLB
+               if (!strncmp(p, "soft", 4))
+                       swiotlb = 1;
+#endif
+
+#ifdef CONFIG_GART_IOMMU
+               gart_parse_options(p);
+#endif
+
+#ifdef CONFIG_CALGARY_IOMMU
+               if (!strncmp(p, "calgary", 7))
+                       use_calgary = 1;
+#endif /* CONFIG_CALGARY_IOMMU */
+
+               p += strcspn(p, ",");
+               if (*p == ',')
+                       ++p;
+       }
+       return 0;
+}
+early_param("iommu", iommu_setup);
+
+#ifdef CONFIG_X86_32
+int dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr,
+                               dma_addr_t device_addr, size_t size, int flags)
+{
+       void __iomem *mem_base = NULL;
+       int pages = size >> PAGE_SHIFT;
+       int bitmap_size = BITS_TO_LONGS(pages) * sizeof(long);
+
+       if ((flags & (DMA_MEMORY_MAP | DMA_MEMORY_IO)) == 0)
+               goto out;
+       if (!size)
+               goto out;
+       if (dev->dma_mem)
+               goto out;
+
+       /* FIXME: this routine just ignores DMA_MEMORY_INCLUDES_CHILDREN */
+
+       mem_base = ioremap(bus_addr, size);
+       if (!mem_base)
+               goto out;
+
+       dev->dma_mem = kzalloc(sizeof(struct dma_coherent_mem), GFP_KERNEL);
+       if (!dev->dma_mem)
+               goto out;
+       dev->dma_mem->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
+       if (!dev->dma_mem->bitmap)
+               goto free1_out;
+
+       dev->dma_mem->virt_base = mem_base;
+       dev->dma_mem->device_base = device_addr;
+       dev->dma_mem->size = pages;
+       dev->dma_mem->flags = flags;
+
+       if (flags & DMA_MEMORY_MAP)
+               return DMA_MEMORY_MAP;
+
+       return DMA_MEMORY_IO;
+
+ free1_out:
+       kfree(dev->dma_mem);
+ out:
+       if (mem_base)
+               iounmap(mem_base);
+       return 0;
+}
+EXPORT_SYMBOL(dma_declare_coherent_memory);
+
+void dma_release_declared_memory(struct device *dev)
+{
+       struct dma_coherent_mem *mem = dev->dma_mem;
+
+       if (!mem)
+               return;
+       dev->dma_mem = NULL;
+       iounmap(mem->virt_base);
+       kfree(mem->bitmap);
+       kfree(mem);
+}
+EXPORT_SYMBOL(dma_release_declared_memory);
+
+void *dma_mark_declared_memory_occupied(struct device *dev,
+                                       dma_addr_t device_addr, size_t size)
+{
+       struct dma_coherent_mem *mem = dev->dma_mem;
+       int pos, err;
+       int pages = (size + (device_addr & ~PAGE_MASK) + PAGE_SIZE - 1);
+
+       pages >>= PAGE_SHIFT;
+
+       if (!mem)
+               return ERR_PTR(-EINVAL);
+
+       pos = (device_addr - mem->device_base) >> PAGE_SHIFT;
+       err = bitmap_allocate_region(mem->bitmap, pos, get_order(pages));
+       if (err != 0)
+               return ERR_PTR(err);
+       return mem->virt_base + (pos << PAGE_SHIFT);
+}
+EXPORT_SYMBOL(dma_mark_declared_memory_occupied);
+
+static int dma_alloc_from_coherent_mem(struct device *dev, ssize_t size,
+                                      dma_addr_t *dma_handle, void **ret)
+{
+       struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
+       int order = get_order(size);
+
+       if (mem) {
+               int page = bitmap_find_free_region(mem->bitmap, mem->size,
+                                                    order);
+               if (page >= 0) {
+                       *dma_handle = mem->device_base + (page << PAGE_SHIFT);
+                       *ret = mem->virt_base + (page << PAGE_SHIFT);
+                       memset(*ret, 0, size);
+               }
+               if (mem->flags & DMA_MEMORY_EXCLUSIVE)
+                       *ret = NULL;
+       }
+       return (mem != NULL);
+}
+
+static int dma_release_coherent(struct device *dev, int order, void *vaddr)
+{
+       struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
+
+       if (mem && vaddr >= mem->virt_base && vaddr <
+                  (mem->virt_base + (mem->size << PAGE_SHIFT))) {
+               int page = (vaddr - mem->virt_base) >> PAGE_SHIFT;
+
+               bitmap_release_region(mem->bitmap, page, order);
+               return 1;
+       }
+       return 0;
+}
+#else
+#define dma_alloc_from_coherent_mem(dev, size, handle, ret) (0)
+#define dma_release_coherent(dev, order, vaddr) (0)
+#endif /* CONFIG_X86_32 */
+
+int dma_supported(struct device *dev, u64 mask)
+{
+#ifdef CONFIG_PCI
+       if (mask > 0xffffffff && forbid_dac > 0) {
+               printk(KERN_INFO "PCI: Disallowing DAC for device %s\n",
+                                dev->bus_id);
+               return 0;
+       }
+#endif
+
+       if (dma_ops->dma_supported)
+               return dma_ops->dma_supported(dev, mask);
+
+       /* Copied from i386. Doesn't make much sense, because it will
+          only work for pci_alloc_coherent.
+          The caller just has to use GFP_DMA in this case. */
+       if (mask < DMA_24BIT_MASK)
+               return 0;
+
+       /* Tell the device to use SAC when IOMMU force is on.  This
+          allows the driver to use cheaper accesses in some cases.
+
+          Problem with this is that if we overflow the IOMMU area and
+          return DAC as fallback address the device may not handle it
+          correctly.
+
+          As a special case some controllers have a 39bit address
+          mode that is as efficient as 32bit (aic79xx). Don't force
+          SAC for these.  Assume all masks <= 40 bits are of this
+          type. Normally this doesn't make any difference, but gives
+          more gentle handling of IOMMU overflow. */
+       if (iommu_sac_force && (mask >= DMA_40BIT_MASK)) {
+               printk(KERN_INFO "%s: Force SAC with mask %Lx\n",
+                                dev->bus_id, mask);
+               return 0;
+       }
+
+       return 1;
+}
+EXPORT_SYMBOL(dma_supported);
+
 /* Allocate DMA memory on node near device */
-noinline static void *
+noinline struct page *
 dma_alloc_pages(struct device *dev, gfp_t gfp, unsigned order)
 {
-       struct page *page;
        int node;
 
        node = dev_to_node(dev);
 
-       page = alloc_pages_node(node, gfp, order);
-       return page ? page_address(page) : NULL;
+       return alloc_pages_node(node, gfp, order);
 }
 
 /*
@@ -65,9 +374,16 @@ void *
 dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
                   gfp_t gfp)
 {
-       void *memory;
+       void *memory = NULL;
+       struct page *page;
        unsigned long dma_mask = 0;
-       u64 bus;
+       dma_addr_t bus;
+
+       /* ignore region specifiers */
+       gfp &= ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32);
+
+       if (dma_alloc_from_coherent_mem(dev, size, dma_handle, &memory))
+               return memory;
 
        if (!dev)
                dev = &fallback_dev;
@@ -82,26 +398,25 @@ dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
        /* Don't invoke OOM killer */
        gfp |= __GFP_NORETRY;
 
-       /* Kludge to make it bug-to-bug compatible with i386. i386
-          uses the normal dma_mask for alloc_coherent. */
-       dma_mask &= *dev->dma_mask;
-
+#ifdef CONFIG_X86_64
        /* Why <=? Even when the mask is smaller than 4GB it is often
           larger than 16MB and in this case we have a chance of
           finding fitting memory in the next higher zone first. If
           not retry with true GFP_DMA. -AK */
        if (dma_mask <= DMA_32BIT_MASK)
                gfp |= GFP_DMA32;
+#endif
 
  again:
-       memory = dma_alloc_pages(dev, gfp, get_order(size));
-       if (memory == NULL)
+       page = dma_alloc_pages(dev, gfp, get_order(size));
+       if (page == NULL)
                return NULL;
 
        {
                int high, mmu;
-               bus = virt_to_bus(memory);
-               high = (bus + size) >= dma_mask;
+               bus = page_to_phys(page);
+               memory = page_address(page);
+               high = (bus + size) >= dma_mask;
                mmu = high;
                if (force_iommu && !(gfp & GFP_DMA))
                        mmu = 1;
@@ -127,7 +442,7 @@ dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
 
                memset(memory, 0, size);
                if (!mmu) {
-                       *dma_handle = virt_to_bus(memory);
+                       *dma_handle = bus;
                        return memory;
                }
        }
@@ -139,7 +454,7 @@ dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
        }
 
        if (dma_ops->map_simple) {
-               *dma_handle = dma_ops->map_simple(dev, memory,
+               *dma_handle = dma_ops->map_simple(dev, virt_to_phys(memory),
                                              size,
                                              PCI_DMA_BIDIRECTIONAL);
                if (*dma_handle != bad_dma_address)
@@ -147,7 +462,8 @@ dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
        }
 
        if (panic_on_overflow)
-               panic("dma_alloc_coherent: IOMMU overflow by %lu bytes\n",size);
+               panic("dma_alloc_coherent: IOMMU overflow by %lu bytes\n",
+                     (unsigned long)size);
        free_pages((unsigned long)memory, get_order(size));
        return NULL;
 }
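The consumer side of this interface is unchanged by the rework; a driver still allocates and releases coherent memory as below (the pdev pointer and PAGE_SIZE length are illustrative only):

	dma_addr_t ring_dma;
	void *ring;

	ring = dma_alloc_coherent(&pdev->dev, PAGE_SIZE, &ring_dma, GFP_KERNEL);
	if (!ring)
		return -ENOMEM;
	/* hand ring_dma to the device, access the buffer through ring */
	dma_free_coherent(&pdev->dev, PAGE_SIZE, ring, ring_dma);
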
@@ -160,153 +476,16 @@ EXPORT_SYMBOL(dma_alloc_coherent);
 void dma_free_coherent(struct device *dev, size_t size,
                         void *vaddr, dma_addr_t bus)
 {
+       int order = get_order(size);
        WARN_ON(irqs_disabled());       /* for portability */
+       if (dma_release_coherent(dev, order, vaddr))
+               return;
        if (dma_ops->unmap_single)
                dma_ops->unmap_single(dev, bus, size, 0);
-       free_pages((unsigned long)vaddr, get_order(size));
+       free_pages((unsigned long)vaddr, order);
 }
 EXPORT_SYMBOL(dma_free_coherent);
 
-static int forbid_dac __read_mostly;
-
-int dma_supported(struct device *dev, u64 mask)
-{
-#ifdef CONFIG_PCI
-       if (mask > 0xffffffff && forbid_dac > 0) {
-
-
-
-               printk(KERN_INFO "PCI: Disallowing DAC for device %s\n", dev->bus_id);
-               return 0;
-       }
-#endif
-
-       if (dma_ops->dma_supported)
-               return dma_ops->dma_supported(dev, mask);
-
-       /* Copied from i386. Doesn't make much sense, because it will
-          only work for pci_alloc_coherent.
-          The caller just has to use GFP_DMA in this case. */
-        if (mask < DMA_24BIT_MASK)
-                return 0;
-
-       /* Tell the device to use SAC when IOMMU force is on.  This
-          allows the driver to use cheaper accesses in some cases.
-
-          Problem with this is that if we overflow the IOMMU area and
-          return DAC as fallback address the device may not handle it
-          correctly.
-
-          As a special case some controllers have a 39bit address
-          mode that is as efficient as 32bit (aic79xx). Don't force
-          SAC for these.  Assume all masks <= 40 bits are of this
-          type. Normally this doesn't make any difference, but gives
-          more gentle handling of IOMMU overflow. */
-       if (iommu_sac_force && (mask >= DMA_40BIT_MASK)) {
-               printk(KERN_INFO "%s: Force SAC with mask %Lx\n", dev->bus_id,mask);
-               return 0;
-       }
-
-       return 1;
-}
-EXPORT_SYMBOL(dma_supported);
-
-int dma_set_mask(struct device *dev, u64 mask)
-{
-       if (!dev->dma_mask || !dma_supported(dev, mask))
-               return -EIO;
-       *dev->dma_mask = mask;
-       return 0;
-}
-EXPORT_SYMBOL(dma_set_mask);
-
-/*
- * See <Documentation/x86_64/boot-options.txt> for the iommu kernel parameter
- * documentation.
- */
-static __init int iommu_setup(char *p)
-{
-       iommu_merge = 1;
-
-       if (!p)
-               return -EINVAL;
-
-       while (*p) {
-               if (!strncmp(p, "off", 3))
-                       no_iommu = 1;
-               /* gart_parse_options has more force support */
-               if (!strncmp(p, "force", 5))
-                       force_iommu = 1;
-               if (!strncmp(p, "noforce", 7)) {
-                       iommu_merge = 0;
-                       force_iommu = 0;
-               }
-
-               if (!strncmp(p, "biomerge", 8)) {
-                       iommu_bio_merge = 4096;
-                       iommu_merge = 1;
-                       force_iommu = 1;
-               }
-               if (!strncmp(p, "panic", 5))
-                       panic_on_overflow = 1;
-               if (!strncmp(p, "nopanic", 7))
-                       panic_on_overflow = 0;
-               if (!strncmp(p, "merge", 5)) {
-                       iommu_merge = 1;
-                       force_iommu = 1;
-               }
-               if (!strncmp(p, "nomerge", 7))
-                       iommu_merge = 0;
-               if (!strncmp(p, "forcesac", 8))
-                       iommu_sac_force = 1;
-               if (!strncmp(p, "allowdac", 8))
-                       forbid_dac = 0;
-               if (!strncmp(p, "nodac", 5))
-                       forbid_dac = -1;
-
-#ifdef CONFIG_SWIOTLB
-               if (!strncmp(p, "soft", 4))
-                       swiotlb = 1;
-#endif
-
-#ifdef CONFIG_GART_IOMMU
-               gart_parse_options(p);
-#endif
-
-#ifdef CONFIG_CALGARY_IOMMU
-               if (!strncmp(p, "calgary", 7))
-                       use_calgary = 1;
-#endif /* CONFIG_CALGARY_IOMMU */
-
-               p += strcspn(p, ",");
-               if (*p == ',')
-                       ++p;
-       }
-       return 0;
-}
-early_param("iommu", iommu_setup);
-
-void __init pci_iommu_alloc(void)
-{
-       /*
-        * The order of these functions is important for
-        * fall-back/fail-over reasons
-        */
-#ifdef CONFIG_GART_IOMMU
-       gart_iommu_hole_init();
-#endif
-
-#ifdef CONFIG_CALGARY_IOMMU
-       detect_calgary();
-#endif
-
-       detect_intel_iommu();
-
-#ifdef CONFIG_SWIOTLB
-       pci_swiotlb_init();
-#endif
-}
-
 static int __init pci_iommu_init(void)
 {
 #ifdef CONFIG_CALGARY_IOMMU
@@ -327,6 +506,8 @@ void pci_iommu_shutdown(void)
 {
        gart_iommu_shutdown();
 }
+/* Must execute after PCI subsystem */
+fs_initcall(pci_iommu_init);
 
 #ifdef CONFIG_PCI
 /* Many VIA bridges seem to corrupt data for DAC. Disable it here */
@@ -334,11 +515,10 @@ void pci_iommu_shutdown(void)
 static __devinit void via_no_dac(struct pci_dev *dev)
 {
        if ((dev->class >> 8) == PCI_CLASS_BRIDGE_PCI && forbid_dac == 0) {
-               printk(KERN_INFO "PCI: VIA PCI bridge detected. Disabling DAC.\n");
+               printk(KERN_INFO "PCI: VIA PCI bridge detected. "
+                                "Disabling DAC.\n");
                forbid_dac = 1;
        }
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_ANY_ID, via_no_dac);
 #endif
-/* Must execute after PCI subsystem */
-fs_initcall(pci_iommu_init);
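The iommu= parser carried over above is a plain left-to-right, comma-separated scan, so later options win over earlier ones. Example boot lines and the flags they set, as read from iommu_setup (dma32_size= goes through memparse, so K/M/G suffixes work):

	iommu=off               -> no_iommu = 1
	iommu=soft              -> swiotlb = 1                        (CONFIG_SWIOTLB)
	iommu=noforce,nomerge   -> force_iommu = 0, iommu_merge = 0
	iommu=panic,forcesac    -> panic_on_overflow = 1, iommu_sac_force = 1
	iommu=calgary           -> use_calgary = 1                    (CONFIG_CALGARY_IOMMU)
	dma32_size=256M         -> reserve 256 MB of low bootmem for the DMA32 zone
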
diff --git a/arch/x86/kernel/pci-dma_32.c b/arch/x86/kernel/pci-dma_32.c
deleted file mode 100644 (file)
index 5133032..0000000
+++ /dev/null
@@ -1,177 +0,0 @@
-/*
- * Dynamic DMA mapping support.
- *
- * On i386 there is no hardware dynamic DMA address translation,
- * so consistent alloc/free are merely page allocation/freeing.
- * The rest of the dynamic DMA mapping interface is implemented
- * in asm/pci.h.
- */
-
-#include <linux/types.h>
-#include <linux/mm.h>
-#include <linux/string.h>
-#include <linux/pci.h>
-#include <linux/module.h>
-#include <asm/io.h>
-
-struct dma_coherent_mem {
-       void            *virt_base;
-       u32             device_base;
-       int             size;
-       int             flags;
-       unsigned long   *bitmap;
-};
-
-void *dma_alloc_coherent(struct device *dev, size_t size,
-                          dma_addr_t *dma_handle, gfp_t gfp)
-{
-       void *ret;
-       struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
-       int order = get_order(size);
-       /* ignore region specifiers */
-       gfp &= ~(__GFP_DMA | __GFP_HIGHMEM);
-
-       if (mem) {
-               int page = bitmap_find_free_region(mem->bitmap, mem->size,
-                                                    order);
-               if (page >= 0) {
-                       *dma_handle = mem->device_base + (page << PAGE_SHIFT);
-                       ret = mem->virt_base + (page << PAGE_SHIFT);
-                       memset(ret, 0, size);
-                       return ret;
-               }
-               if (mem->flags & DMA_MEMORY_EXCLUSIVE)
-                       return NULL;
-       }
-
-       if (dev == NULL || (dev->coherent_dma_mask < 0xffffffff))
-               gfp |= GFP_DMA;
-
-       ret = (void *)__get_free_pages(gfp, order);
-
-       if (ret != NULL) {
-               memset(ret, 0, size);
-               *dma_handle = virt_to_phys(ret);
-       }
-       return ret;
-}
-EXPORT_SYMBOL(dma_alloc_coherent);
-
-void dma_free_coherent(struct device *dev, size_t size,
-                        void *vaddr, dma_addr_t dma_handle)
-{
-       struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
-       int order = get_order(size);
-
-       WARN_ON(irqs_disabled());       /* for portability */
-       if (mem && vaddr >= mem->virt_base && vaddr < (mem->virt_base + (mem->size << PAGE_SHIFT))) {
-               int page = (vaddr - mem->virt_base) >> PAGE_SHIFT;
-
-               bitmap_release_region(mem->bitmap, page, order);
-       } else
-               free_pages((unsigned long)vaddr, order);
-}
-EXPORT_SYMBOL(dma_free_coherent);
-
-int dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr,
-                               dma_addr_t device_addr, size_t size, int flags)
-{
-       void __iomem *mem_base = NULL;
-       int pages = size >> PAGE_SHIFT;
-       int bitmap_size = BITS_TO_LONGS(pages) * sizeof(long);
-
-       if ((flags & (DMA_MEMORY_MAP | DMA_MEMORY_IO)) == 0)
-               goto out;
-       if (!size)
-               goto out;
-       if (dev->dma_mem)
-               goto out;
-
-       /* FIXME: this routine just ignores DMA_MEMORY_INCLUDES_CHILDREN */
-
-       mem_base = ioremap(bus_addr, size);
-       if (!mem_base)
-               goto out;
-
-       dev->dma_mem = kzalloc(sizeof(struct dma_coherent_mem), GFP_KERNEL);
-       if (!dev->dma_mem)
-               goto out;
-       dev->dma_mem->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
-       if (!dev->dma_mem->bitmap)
-               goto free1_out;
-
-       dev->dma_mem->virt_base = mem_base;
-       dev->dma_mem->device_base = device_addr;
-       dev->dma_mem->size = pages;
-       dev->dma_mem->flags = flags;
-
-       if (flags & DMA_MEMORY_MAP)
-               return DMA_MEMORY_MAP;
-
-       return DMA_MEMORY_IO;
-
- free1_out:
-       kfree(dev->dma_mem);
- out:
-       if (mem_base)
-               iounmap(mem_base);
-       return 0;
-}
-EXPORT_SYMBOL(dma_declare_coherent_memory);
-
-void dma_release_declared_memory(struct device *dev)
-{
-       struct dma_coherent_mem *mem = dev->dma_mem;
-       
-       if(!mem)
-               return;
-       dev->dma_mem = NULL;
-       iounmap(mem->virt_base);
-       kfree(mem->bitmap);
-       kfree(mem);
-}
-EXPORT_SYMBOL(dma_release_declared_memory);
-
-void *dma_mark_declared_memory_occupied(struct device *dev,
-                                       dma_addr_t device_addr, size_t size)
-{
-       struct dma_coherent_mem *mem = dev->dma_mem;
-       int pages = (size + (device_addr & ~PAGE_MASK) + PAGE_SIZE - 1) >> PAGE_SHIFT;
-       int pos, err;
-
-       if (!mem)
-               return ERR_PTR(-EINVAL);
-
-       pos = (device_addr - mem->device_base) >> PAGE_SHIFT;
-       err = bitmap_allocate_region(mem->bitmap, pos, get_order(pages));
-       if (err != 0)
-               return ERR_PTR(err);
-       return mem->virt_base + (pos << PAGE_SHIFT);
-}
-EXPORT_SYMBOL(dma_mark_declared_memory_occupied);
-
-#ifdef CONFIG_PCI
-/* Many VIA bridges seem to corrupt data for DAC. Disable it here */
-
-int forbid_dac;
-EXPORT_SYMBOL(forbid_dac);
-
-static __devinit void via_no_dac(struct pci_dev *dev)
-{
-       if ((dev->class >> 8) == PCI_CLASS_BRIDGE_PCI && forbid_dac == 0) {
-               printk(KERN_INFO "PCI: VIA PCI bridge detected. Disabling DAC.\n");
-               forbid_dac = 1;
-       }
-}
-DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_ANY_ID, via_no_dac);
-
-static int check_iommu(char *s)
-{
-       if (!strcmp(s, "usedac")) {
-               forbid_dac = -1;
-               return 1;
-       }
-       return 0;
-}
-__setup("iommu=", check_iommu);
-#endif
index 700e4647dd30214ba33b90845f4121421515948d..c07455d1695f531843df8bf165ce5131bb1ca836 100644 (file)
@@ -264,9 +264,9 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
 }
 
 static dma_addr_t
-gart_map_simple(struct device *dev, char *buf, size_t size, int dir)
+gart_map_simple(struct device *dev, phys_addr_t paddr, size_t size, int dir)
 {
-       dma_addr_t map = dma_map_area(dev, virt_to_bus(buf), size, dir);
+       dma_addr_t map = dma_map_area(dev, paddr, size, dir);
 
        flush_gart();
 
@@ -275,18 +275,17 @@ gart_map_simple(struct device *dev, char *buf, size_t size, int dir)
 
 /* Map a single area into the IOMMU */
 static dma_addr_t
-gart_map_single(struct device *dev, void *addr, size_t size, int dir)
+gart_map_single(struct device *dev, phys_addr_t paddr, size_t size, int dir)
 {
-       unsigned long phys_mem, bus;
+       unsigned long bus;
 
        if (!dev)
                dev = &fallback_dev;
 
-       phys_mem = virt_to_phys(addr);
-       if (!need_iommu(dev, phys_mem, size))
-               return phys_mem;
+       if (!need_iommu(dev, paddr, size))
+               return paddr;
 
-       bus = gart_map_simple(dev, addr, size, dir);
+       bus = gart_map_simple(dev, paddr, size, dir);
 
        return bus;
 }
similarity index 77%
rename from arch/x86/kernel/pci-nommu_64.c
rename to arch/x86/kernel/pci-nommu.c
index ab08e1832228a74869bc6e7aeb517b0e29e8b044..aec43d56f49c57cedffaefad7205fbbfe9b984c2 100644 (file)
@@ -14,7 +14,7 @@
 static int
 check_addr(char *name, struct device *hwdev, dma_addr_t bus, size_t size)
 {
-        if (hwdev && bus + size > *hwdev->dma_mask) {
+       if (hwdev && bus + size > *hwdev->dma_mask) {
                if (*hwdev->dma_mask >= DMA_32BIT_MASK)
                        printk(KERN_ERR
                            "nommu_%s: overflow %Lx+%zu of device mask %Lx\n",
@@ -26,19 +26,17 @@ check_addr(char *name, struct device *hwdev, dma_addr_t bus, size_t size)
 }
 
 static dma_addr_t
-nommu_map_single(struct device *hwdev, void *ptr, size_t size,
+nommu_map_single(struct device *hwdev, phys_addr_t paddr, size_t size,
               int direction)
 {
-       dma_addr_t bus = virt_to_bus(ptr);
+       dma_addr_t bus = paddr;
+       WARN_ON(size == 0);
        if (!check_addr("map_single", hwdev, bus, size))
                                return bad_dma_address;
+       flush_write_buffers();
        return bus;
 }
 
-static void nommu_unmap_single(struct device *dev, dma_addr_t addr,size_t size,
-                       int direction)
-{
-}
 
 /* Map a set of buffers described by scatterlist in streaming
  * mode for DMA.  This is the scatter-gather version of the
@@ -61,30 +59,34 @@ static int nommu_map_sg(struct device *hwdev, struct scatterlist *sg,
        struct scatterlist *s;
        int i;
 
+       WARN_ON(nents == 0 || sg[0].length == 0);
+
        for_each_sg(sg, s, nents, i) {
                BUG_ON(!sg_page(s));
-               s->dma_address = virt_to_bus(sg_virt(s));
+               s->dma_address = sg_phys(s);
                if (!check_addr("map_sg", hwdev, s->dma_address, s->length))
                        return 0;
                s->dma_length = s->length;
        }
+       flush_write_buffers();
        return nents;
 }
 
-/* Unmap a set of streaming mode DMA translations.
- * Again, cpu read rules concerning calls here are the same as for
- * pci_unmap_single() above.
- */
-static void nommu_unmap_sg(struct device *dev, struct scatterlist *sg,
-                 int nents, int dir)
+/* Make sure we keep the same behaviour */
+static int nommu_mapping_error(dma_addr_t dma_addr)
 {
+#ifdef CONFIG_X86_32
+       return 0;
+#else
+       return (dma_addr == bad_dma_address);
+#endif
 }
 
+
 const struct dma_mapping_ops nommu_dma_ops = {
        .map_single = nommu_map_single,
-       .unmap_single = nommu_unmap_single,
        .map_sg = nommu_map_sg,
-       .unmap_sg = nommu_unmap_sg,
+       .mapping_error = nommu_mapping_error,
        .is_phys = 1,
 };
 
index 82a0a674a003f815b5d98cd540e56e26f4c332df..490da7f4b8d0dd8e29abf7ddd2e11b01fe19f196 100644 (file)
 
 int swiotlb __read_mostly;
 
+static dma_addr_t
+swiotlb_map_single_phys(struct device *hwdev, phys_addr_t paddr, size_t size,
+                       int direction)
+{
+       return swiotlb_map_single(hwdev, phys_to_virt(paddr), size, direction);
+}
+
 const struct dma_mapping_ops swiotlb_dma_ops = {
        .mapping_error = swiotlb_dma_mapping_error,
        .alloc_coherent = swiotlb_alloc_coherent,
        .free_coherent = swiotlb_free_coherent,
-       .map_single = swiotlb_map_single,
+       .map_single = swiotlb_map_single_phys,
        .unmap_single = swiotlb_unmap_single,
        .sync_single_for_cpu = swiotlb_sync_single_for_cpu,
        .sync_single_for_device = swiotlb_sync_single_for_device,
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
new file mode 100644 (file)
index 0000000..3004d71
--- /dev/null
@@ -0,0 +1,44 @@
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/smp.h>
+#include <linux/slab.h>
+#include <linux/sched.h>
+
+struct kmem_cache *task_xstate_cachep;
+
+int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
+{
+       *dst = *src;
+       if (src->thread.xstate) {
+               dst->thread.xstate = kmem_cache_alloc(task_xstate_cachep,
+                                                     GFP_KERNEL);
+               if (!dst->thread.xstate)
+                       return -ENOMEM;
+               WARN_ON((unsigned long)dst->thread.xstate & 15);
+               memcpy(dst->thread.xstate, src->thread.xstate, xstate_size);
+       }
+       return 0;
+}
+
+void free_thread_xstate(struct task_struct *tsk)
+{
+       if (tsk->thread.xstate) {
+               kmem_cache_free(task_xstate_cachep, tsk->thread.xstate);
+               tsk->thread.xstate = NULL;
+       }
+}
+
+void free_thread_info(struct thread_info *ti)
+{
+       free_thread_xstate(ti->task);
+       free_pages((unsigned long)ti, get_order(THREAD_SIZE));
+}
+
+void arch_task_cache_init(void)
+{
+        task_xstate_cachep =
+               kmem_cache_create("task_xstate", xstate_size,
+                                 __alignof__(union thread_xstate),
+                                 SLAB_PANIC, NULL);
+}
index 3903a8f2eb978f2d9c6676c8f1bc9813d749fae9..7adad088e373fbf4bf5523dcf70f22665331b105 100644 (file)
@@ -36,6 +36,7 @@
 #include <linux/personality.h>
 #include <linux/tick.h>
 #include <linux/percpu.h>
+#include <linux/prctl.h>
 
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -45,7 +46,6 @@
 #include <asm/processor.h>
 #include <asm/i387.h>
 #include <asm/desc.h>
-#include <asm/vm86.h>
 #ifdef CONFIG_MATH_EMULATION
 #include <asm/math_emu.h>
 #endif
@@ -521,14 +521,18 @@ start_thread(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp)
        regs->cs                = __USER_CS;
        regs->ip                = new_ip;
        regs->sp                = new_sp;
+       /*
+        * Free the old FP and other extended state
+        */
+       free_thread_xstate(current);
 }
 EXPORT_SYMBOL_GPL(start_thread);
 
-#ifdef CONFIG_SECCOMP
 static void hard_disable_TSC(void)
 {
        write_cr4(read_cr4() | X86_CR4_TSD);
 }
+
 void disable_TSC(void)
 {
        preempt_disable();
@@ -540,11 +544,47 @@ void disable_TSC(void)
                hard_disable_TSC();
        preempt_enable();
 }
+
 static void hard_enable_TSC(void)
 {
        write_cr4(read_cr4() & ~X86_CR4_TSD);
 }
-#endif /* CONFIG_SECCOMP */
+
+void enable_TSC(void)
+{
+       preempt_disable();
+       if (test_and_clear_thread_flag(TIF_NOTSC))
+               /*
+                * Must flip the CPU state synchronously with
+                * TIF_NOTSC in the current running context.
+                */
+               hard_enable_TSC();
+       preempt_enable();
+}
+
+int get_tsc_mode(unsigned long adr)
+{
+       unsigned int val;
+
+       if (test_thread_flag(TIF_NOTSC))
+               val = PR_TSC_SIGSEGV;
+       else
+               val = PR_TSC_ENABLE;
+
+       return put_user(val, (unsigned int __user *)adr);
+}
+
+int set_tsc_mode(unsigned int val)
+{
+       if (val == PR_TSC_SIGSEGV)
+               disable_TSC();
+       else if (val == PR_TSC_ENABLE)
+               enable_TSC();
+       else
+               return -EINVAL;
+
+       return 0;
+}
 
 static noinline void
 __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
@@ -578,7 +618,6 @@ __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
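get_tsc_mode()/set_tsc_mode() back the PR_GET_TSC/PR_SET_TSC prctl interface exercised by the Documentation/prctl test programs in this merge. A minimal userspace sketch (constants from <linux/prctl.h>; error handling elided):

	#include <stdio.h>
	#include <sys/prctl.h>
	#include <linux/prctl.h>

	int main(void)
	{
		int mode;

		/* forbid RDTSC in this task: a later rdtsc raises SIGSEGV */
		prctl(PR_SET_TSC, PR_TSC_SIGSEGV, 0, 0, 0);

		prctl(PR_GET_TSC, &mode, 0, 0, 0);
		printf("tsc mode: %s\n",
		       mode == PR_TSC_SIGSEGV ? "sigsegv" : "enable");
		return 0;
	}
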
                set_debugreg(next->debugreg7, 7);
        }
 
-#ifdef CONFIG_SECCOMP
        if (test_tsk_thread_flag(prev_p, TIF_NOTSC) ^
            test_tsk_thread_flag(next_p, TIF_NOTSC)) {
                /* prev and next are different */
@@ -587,7 +626,6 @@ __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
                else
                        hard_enable_TSC();
        }
-#endif
 
 #ifdef X86_BTS
        if (test_tsk_thread_flag(prev_p, TIF_BTS_TRACE_TS))
@@ -669,7 +707,7 @@ struct task_struct * __switch_to(struct task_struct *prev_p, struct task_struct
 
        /* we're going to use this soon, after a few expensive things */
        if (next_p->fpu_counter > 5)
-               prefetch(&next->i387.fxsave);
+               prefetch(next->xstate);
 
        /*
         * Reload esp0.
index e75ccc8a2b87b7e659e0ee4c8ad74459239b98bb..891af1a1b48a5cbf0ef303d9f1b5c5601b485869 100644 (file)
@@ -36,6 +36,7 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/tick.h>
+#include <linux/prctl.h>
 
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -532,9 +533,71 @@ start_thread(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp)
        regs->ss                = __USER_DS;
        regs->flags             = 0x200;
        set_fs(USER_DS);
+       /*
+        * Free the old FP and other extended state
+        */
+       free_thread_xstate(current);
 }
 EXPORT_SYMBOL_GPL(start_thread);
 
+static void hard_disable_TSC(void)
+{
+       write_cr4(read_cr4() | X86_CR4_TSD);
+}
+
+void disable_TSC(void)
+{
+       preempt_disable();
+       if (!test_and_set_thread_flag(TIF_NOTSC))
+               /*
+                * Must flip the CPU state synchronously with
+                * TIF_NOTSC in the current running context.
+                */
+               hard_disable_TSC();
+       preempt_enable();
+}
+
+static void hard_enable_TSC(void)
+{
+       write_cr4(read_cr4() & ~X86_CR4_TSD);
+}
+
+void enable_TSC(void)
+{
+       preempt_disable();
+       if (test_and_clear_thread_flag(TIF_NOTSC))
+               /*
+                * Must flip the CPU state synchronously with
+                * TIF_NOTSC in the current running context.
+                */
+               hard_enable_TSC();
+       preempt_enable();
+}
+
+int get_tsc_mode(unsigned long adr)
+{
+       unsigned int val;
+
+       if (test_thread_flag(TIF_NOTSC))
+               val = PR_TSC_SIGSEGV;
+       else
+               val = PR_TSC_ENABLE;
+
+       return put_user(val, (unsigned int __user *)adr);
+}
+
+int set_tsc_mode(unsigned int val)
+{
+       if (val == PR_TSC_SIGSEGV)
+               disable_TSC();
+       else if (val == PR_TSC_ENABLE)
+               enable_TSC();
+       else
+               return -EINVAL;
+
+       return 0;
+}
+
 /*
  * This special macro can be used to load a debugging register
  */
@@ -572,6 +635,15 @@ static inline void __switch_to_xtra(struct task_struct *prev_p,
                loaddebug(next, 7);
        }
 
+       if (test_tsk_thread_flag(prev_p, TIF_NOTSC) ^
+           test_tsk_thread_flag(next_p, TIF_NOTSC)) {
+               /* prev and next are different */
+               if (test_tsk_thread_flag(next_p, TIF_NOTSC))
+                       hard_disable_TSC();
+               else
+                       hard_enable_TSC();
+       }
+
        if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) {
                /*
                 * Copy the relevant range of the IO bitmap.
@@ -614,7 +686,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
        /* we're going to use this soon, after a few expensive things */
        if (next_p->fpu_counter>5)
-               prefetch(&next->i387.fxsave);
+               prefetch(next->xstate);
 
        /*
         * Reload esp0, LDT and the page table pointer:
index 9692202d3bfb62125c42f0e437f4a618354a9223..19c9386ac1187e4f9b25144437732dc45b11a98c 100644 (file)
@@ -420,7 +420,7 @@ static void native_machine_shutdown(void)
                reboot_cpu_id = smp_processor_id();
 
        /* Make certain I only run on the appropriate processor */
-       set_cpus_allowed(current, cpumask_of_cpu(reboot_cpu_id));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(reboot_cpu_id));
 
        /* O.K Now that I'm on the appropriate processor,
         * stop all of the others.
index ed157c90412e25a01076dd268133a8d4d86a9296..0d1f44ae6eea83b3ab32d17fe04a817227dd2acf 100644 (file)
@@ -54,6 +54,24 @@ static void __init setup_per_cpu_maps(void)
 #endif
 }
 
+#ifdef CONFIG_HAVE_CPUMASK_OF_CPU_MAP
+cpumask_t *cpumask_of_cpu_map __read_mostly;
+EXPORT_SYMBOL(cpumask_of_cpu_map);
+
+/* requires nr_cpu_ids to be initialized */
+static void __init setup_cpumask_of_cpu(void)
+{
+       int i;
+
+       /* alloc_bootmem zeroes memory */
+       cpumask_of_cpu_map = alloc_bootmem_low(sizeof(cpumask_t) * nr_cpu_ids);
+       for (i = 0; i < nr_cpu_ids; i++)
+               cpu_set(i, cpumask_of_cpu_map[i]);
+}
+#else
+static inline void setup_cpumask_of_cpu(void) { }
+#endif
+
 #ifdef CONFIG_X86_32
 /*
  * Great future not-so-futuristic plan: make i386 and x86_64 do it
@@ -70,7 +88,7 @@ EXPORT_SYMBOL(__per_cpu_offset);
  */
 void __init setup_per_cpu_areas(void)
 {
-       int i;
+       int i, highest_cpu = 0;
        unsigned long size;
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -104,10 +122,18 @@ void __init setup_per_cpu_areas(void)
                __per_cpu_offset[i] = ptr - __per_cpu_start;
 #endif
                memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start);
+
+               highest_cpu = i;
        }
 
+       nr_cpu_ids = highest_cpu + 1;
+       printk(KERN_DEBUG "NR_CPUS: %d, nr_cpu_ids: %d\n", NR_CPUS, nr_cpu_ids);
+
        /* Setup percpu data maps */
        setup_per_cpu_maps();
+
+       /* Setup cpumask_of_cpu map */
+       setup_cpumask_of_cpu();
 }
 
 #endif
index 9042fb0e36f54b8846fea6ddb8dfe37ce4db9ce2..aee0e8200777057699f8cd580c1add23684d780d 100644 (file)
@@ -74,8 +74,8 @@ int force_personality32 = 0;
 Control non executable heap for 32bit processes.
 To control the stack too use noexec=off
 
-on     PROT_READ does not imply PROT_EXEC for 32bit processes
-off    PROT_READ implies PROT_EXEC (default)
+on     PROT_READ does not imply PROT_EXEC for 32bit processes (default)
+off    PROT_READ implies PROT_EXEC
 */
 static int __init nonx32_setup(char *str)
 {
index 5b0bffb7fcc91d9aa237367a396dd1e5321aba6c..1c4799e687183f231f9a1c8c3248148bcc4aceae 100644 (file)
@@ -812,10 +812,10 @@ void __init setup_arch(char **cmdline_p)
                efi_init();
 
        /* update e820 for memory not covered by WB MTRRs */
-       find_max_pfn();
+       propagate_e820_map();
        mtrr_bp_init();
        if (mtrr_trim_uncached_memory(max_pfn))
-               find_max_pfn();
+               propagate_e820_map();
 
        max_low_pfn = setup_memory();
 
index 674ef3510cdfd66972a08f716e374a0d96884dea..6b8e11f0c15d79c699dec0bea0df7c73e2759c8e 100644 (file)
@@ -398,6 +398,8 @@ void __init setup_arch(char **cmdline_p)
 
        early_res_to_bootmem();
 
+       dma32_reserve_bootmem();
+
 #ifdef CONFIG_ACPI_SLEEP
        /*
         * Reserve low memory region for sleep support.
@@ -420,11 +422,14 @@ void __init setup_arch(char **cmdline_p)
                unsigned long end_of_mem    = end_pfn << PAGE_SHIFT;
 
                if (ramdisk_end <= end_of_mem) {
-                       reserve_bootmem_generic(ramdisk_image, ramdisk_size);
+                       /*
+                        * don't need to reserve again, already reserved early
+                        * in x86_64_start_kernel, and early_res_to_bootmem
+                        * convert that to reserved in bootmem
+                        */
                        initrd_start = ramdisk_image + PAGE_OFFSET;
                        initrd_end = initrd_start+ramdisk_size;
                } else {
-                       /* Assumes everything on node 0 */
                        free_bootmem(ramdisk_image, ramdisk_size);
                        printk(KERN_ERR "initrd extends beyond end of memory "
                               "(0x%08lx > 0x%08lx)\ndisabling initrd\n",
index e6abe8a49b1fa0b63cccf4b4e1ebfb03c6de8fde..6a925394bc7e646e70cd17a54fd6da0710a11e80 100644 (file)
@@ -61,6 +61,7 @@
 #include <asm/mtrr.h>
 #include <asm/nmi.h>
 #include <asm/vmi.h>
+#include <asm/genapic.h>
 #include <linux/mc146818rtc.h>
 
 #include <mach_apic.h>
@@ -677,6 +678,12 @@ wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
        unsigned long send_status, accept_status = 0;
        int maxlvt, num_starts, j;
 
+       if (get_uv_system_type() == UV_NON_UNIQUE_APIC) {
+               send_status = uv_wakeup_secondary(phys_apicid, start_eip);
+               atomic_set(&init_deasserted, 1);
+               return send_status;
+       }
+
        /*
         * Be paranoid about clearing APIC errors.
         */
@@ -918,16 +925,19 @@ do_rest:
 
        atomic_set(&init_deasserted, 0);
 
-       Dprintk("Setting warm reset code and vector.\n");
+       if (get_uv_system_type() != UV_NON_UNIQUE_APIC) {
 
-       store_NMI_vector(&nmi_high, &nmi_low);
+               Dprintk("Setting warm reset code and vector.\n");
 
-       smpboot_setup_warm_reset_vector(start_ip);
-       /*
-        * Be paranoid about clearing APIC errors.
-        */
-       apic_write(APIC_ESR, 0);
-       apic_read(APIC_ESR);
+               store_NMI_vector(&nmi_high, &nmi_low);
+
+               smpboot_setup_warm_reset_vector(start_ip);
+               /*
+                * Be paranoid about clearing APIC errors.
+                */
+               apic_write(APIC_ESR, 0);
+               apic_read(APIC_ESR);
+       }
 
        /*
         * Starting actual IPI sequence...
@@ -966,7 +976,8 @@ do_rest:
                        else
                                /* trampoline code not run */
                                printk(KERN_ERR "Not responding.\n");
-                       inquire_remote_apic(apicid);
+                       if (get_uv_system_type() != UV_NON_UNIQUE_APIC)
+                               inquire_remote_apic(apicid);
                }
        }
 
index 65791ca2824a289651017fb524485746908a804e..471e694d6713193baa5a25dac995f177ef34abf2 100644 (file)
@@ -681,7 +681,7 @@ gp_in_kernel:
        }
 }
 
-static __kprobes void
+static notrace __kprobes void
 mem_parity_error(unsigned char reason, struct pt_regs *regs)
 {
        printk(KERN_EMERG
@@ -707,7 +707,7 @@ mem_parity_error(unsigned char reason, struct pt_regs *regs)
        clear_mem_error(reason);
 }
 
-static __kprobes void
+static notrace __kprobes void
 io_check_error(unsigned char reason, struct pt_regs *regs)
 {
        unsigned long i;
@@ -727,7 +727,7 @@ io_check_error(unsigned char reason, struct pt_regs *regs)
        outb(reason, 0x61);
 }
 
-static __kprobes void
+static notrace __kprobes void
 unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
 {
        if (notify_die(DIE_NMIUNKNOWN, "nmi", regs, reason, 2, SIGINT) == NOTIFY_STOP)
@@ -755,7 +755,7 @@ unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
 
 static DEFINE_SPINLOCK(nmi_print_lock);
 
-void __kprobes die_nmi(struct pt_regs *regs, const char *msg)
+void notrace __kprobes die_nmi(struct pt_regs *regs, const char *msg)
 {
        if (notify_die(DIE_NMIWATCHDOG, msg, regs, 0, 2, SIGINT) == NOTIFY_STOP)
                return;
@@ -786,7 +786,7 @@ void __kprobes die_nmi(struct pt_regs *regs, const char *msg)
        do_exit(SIGSEGV);
 }
 
-static __kprobes void default_do_nmi(struct pt_regs *regs)
+static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
 {
        unsigned char reason = 0;
 
@@ -828,7 +828,7 @@ static __kprobes void default_do_nmi(struct pt_regs *regs)
 
 static int ignore_nmis;
 
-__kprobes void do_nmi(struct pt_regs *regs, long error_code)
+notrace __kprobes void do_nmi(struct pt_regs *regs, long error_code)
 {
        int cpu;
 
@@ -1148,9 +1148,22 @@ asmlinkage void math_state_restore(void)
        struct thread_info *thread = current_thread_info();
        struct task_struct *tsk = thread->task;
 
+       if (!tsk_used_math(tsk)) {
+               local_irq_enable();
+               /*
+                * does a slab alloc which can sleep
+                */
+               if (init_fpu(tsk)) {
+                       /*
+                        * ran out of memory!
+                        */
+                       do_group_exit(SIGKILL);
+                       return;
+               }
+               local_irq_disable();
+       }
+
        clts();                         /* Allow maths ops (or we recurse) */
-       if (!tsk_used_math(tsk))
-               init_fpu(tsk);
        restore_fpu(tsk);
        thread->status |= TS_USEDFPU;   /* So we fnsave on switch_to() */
        tsk->fpu_counter++;
@@ -1208,11 +1221,6 @@ void __init trap_init(void)
 #endif
        set_trap_gate(19, &simd_coprocessor_error);
 
-       /*
-        * Verify that the FXSAVE/FXRSTOR data will be 16-byte aligned.
-        * Generate a build-time error if the alignment is wrong.
-        */
-       BUILD_BUG_ON(offsetof(struct task_struct, thread.i387.fxsave) & 15);
        if (cpu_has_fxsr) {
                printk(KERN_INFO "Enabling fast FPU save and restore... ");
                set_in_cr4(X86_CR4_OSFXSR);
@@ -1233,6 +1241,7 @@ void __init trap_init(void)
 
        set_bit(SYSCALL_VECTOR, used_vectors);
 
+       init_thread_xstate();
        /*
         * Should be a barrier for any external CPU state:
         */
index 79aa6fc0815c9e75de806430c5660335bc1cc13e..adff76ea97c4732de4b7766272c8b6ad26ace478 100644 (file)
@@ -600,7 +600,8 @@ void die(const char * str, struct pt_regs * regs, long err)
        oops_end(flags, regs, SIGSEGV);
 }
 
-void __kprobes die_nmi(char *str, struct pt_regs *regs, int do_panic)
+notrace __kprobes void
+die_nmi(char *str, struct pt_regs *regs, int do_panic)
 {
        unsigned long flags;
 
@@ -772,7 +773,7 @@ asmlinkage void __kprobes do_general_protection(struct pt_regs * regs,
        die("general protection fault", regs, error_code);
 }
 
-static __kprobes void
+static notrace __kprobes void
 mem_parity_error(unsigned char reason, struct pt_regs * regs)
 {
        printk(KERN_EMERG "Uhhuh. NMI received for unknown reason %02x.\n",
@@ -796,7 +797,7 @@ mem_parity_error(unsigned char reason, struct pt_regs * regs)
        outb(reason, 0x61);
 }
 
-static __kprobes void
+static notrace __kprobes void
 io_check_error(unsigned char reason, struct pt_regs * regs)
 {
        printk("NMI: IOCK error (debug interrupt?)\n");
@@ -810,7 +811,7 @@ io_check_error(unsigned char reason, struct pt_regs * regs)
        outb(reason, 0x61);
 }
 
-static __kprobes void
+static notrace __kprobes void
 unknown_nmi_error(unsigned char reason, struct pt_regs * regs)
 {
        if (notify_die(DIE_NMIUNKNOWN, "nmi", regs, reason, 2, SIGINT) == NOTIFY_STOP)
@@ -827,7 +828,7 @@ unknown_nmi_error(unsigned char reason, struct pt_regs * regs)
 
 /* Runs on IST stack. This code must keep interrupts off all the time.
    Nested NMIs are prevented by the CPU. */
-asmlinkage __kprobes void default_do_nmi(struct pt_regs *regs)
+asmlinkage notrace __kprobes void default_do_nmi(struct pt_regs *regs)
 {
        unsigned char reason = 0;
        int cpu;
@@ -1123,11 +1124,24 @@ asmlinkage void __attribute__((weak)) mce_threshold_interrupt(void)
 asmlinkage void math_state_restore(void)
 {
        struct task_struct *me = current;
-       clts();                 /* Allow maths ops (or we recurse) */
 
-       if (!used_math())
-               init_fpu(me);
-       restore_fpu_checking(&me->thread.i387.fxsave);
+       if (!used_math()) {
+               local_irq_enable();
+               /*
+                * does a slab alloc which can sleep
+                */
+               if (init_fpu(me)) {
+                       /*
+                        * ran out of memory!
+                        */
+                       do_group_exit(SIGKILL);
+                       return;
+               }
+               local_irq_disable();
+       }
+
+       clts();                 /* Allow maths ops (or we recurse) */
+       restore_fpu_checking(&me->thread.xstate->fxsave);
        task_thread_info(me)->status |= TS_USEDFPU;
        me->fpu_counter++;
 }
@@ -1162,6 +1176,10 @@ void __init trap_init(void)
        set_system_gate(IA32_SYSCALL_VECTOR, ia32_syscall);
 #endif
        
+       /*
+        * initialize the per thread extended state:
+        */
+        init_thread_xstate();
        /*
         * Should be a barrier for any external CPU state.
         */
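
The math_state_restore() changes in the hunks above depend on the FPU/extended state now being allocated lazily, outside task_struct, the first time a task uses the FPU; init_fpu() can sleep in a slab allocation, which is why interrupts are enabled around it and the task is killed on failure. A minimal sketch of that lazy-allocation pattern follows; task_xstate_cachep and xstate_size are assumed names for a boot-time slab cache and state size, and this is an illustration, not the kernel's actual init_fpu():

	/*
	 * Sketch only: lazily allocate per-task extended FPU state.
	 * task_xstate_cachep and xstate_size are assumed to exist.
	 */
	#include <linux/sched.h>
	#include <linux/slab.h>
	#include <linux/string.h>

	extern struct kmem_cache *task_xstate_cachep;	/* assumed boot-time cache */
	extern unsigned int xstate_size;		/* assumed state size */

	static int init_fpu_sketch(struct task_struct *tsk)
	{
		if (tsk_used_math(tsk))
			return 0;			/* state already allocated */

		/* may sleep: caller must run with interrupts enabled */
		tsk->thread.xstate = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL);
		if (!tsk->thread.xstate)
			return -ENOMEM;			/* caller sends SIGKILL */

		memset(tsk->thread.xstate, 0, xstate_size);
		set_stopped_child_used_math(tsk);	/* mark the FPU state valid */
		return 0;
	}
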
index 3d7e6e9fa6c2e4eaf5bee5c78927f63ddfa49e51..e4790728b2244fc90c60ff755d17f32653bcbbab 100644 (file)
@@ -221,9 +221,9 @@ EXPORT_SYMBOL(recalibrate_cpu_khz);
  * if the CPU frequency is scaled, TSC-based delays will need a different
  * loops_per_jiffy value to function properly.
  */
-static unsigned int ref_freq = 0;
-static unsigned long loops_per_jiffy_ref = 0;
-static unsigned long cpu_khz_ref = 0;
+static unsigned int ref_freq;
+static unsigned long loops_per_jiffy_ref;
+static unsigned long cpu_khz_ref;
 
 static int
 time_cpufreq_notifier(struct notifier_block *nb, unsigned long val, void *data)
@@ -283,15 +283,28 @@ core_initcall(cpufreq_tsc);
 
 /* clock source code */
 
-static unsigned long current_tsc_khz = 0;
+static unsigned long current_tsc_khz;
+static struct clocksource clocksource_tsc;
 
+/*
+ * We compare the TSC to the cycle_last value in the clocksource
+ * structure to avoid a nasty time-warp issue. This can be observed in
+ * a very small window right after one CPU updated cycle_last under
+ * xtime lock and the other CPU reads a TSC value which is smaller
+ * than the cycle_last reference value due to a TSC which is slightly
+ * behind. This delta is nowhere else observable, but in that case it
+ * results in a forward time jump in the range of hours due to the
+ * unsigned delta calculation of the time keeping core code, which is
+ * necessary to support wrapping clocksources like pm timer.
+ */
 static cycle_t read_tsc(void)
 {
        cycle_t ret;
 
        rdtscll(ret);
 
-       return ret;
+       return ret >= clocksource_tsc.cycle_last ?
+               ret : clocksource_tsc.cycle_last;
 }
 
 static struct clocksource clocksource_tsc = {
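
The clamp against cycle_last matters because the timekeeping core computes the elapsed interval as an unsigned difference; if a slightly lagging TSC reads below the cycle_last reference, the subtraction wraps to an enormous positive delta and time jumps forward by hours. A small user-space illustration of the wrap and of the fix (plain C, not kernel code):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint64_t cycle_last = 1000000;	/* reference stored under xtime lock */
		uint64_t now = 999990;		/* TSC read on another CPU, slightly behind */

		uint64_t delta = now - cycle_last;	/* unsigned: wraps to ~2^64 */
		printf("raw delta     = %llu\n", (unsigned long long)delta);

		/* the fix: never return a readout below cycle_last */
		uint64_t clamped = now >= cycle_last ? now : cycle_last;
		printf("clamped delta = %llu\n", (unsigned long long)(clamped - cycle_last));
		return 0;
	}
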
index ceeba01e7f479a9dabc73906d5d2e243ff5e7926..fcc16e58609e17c42de7fae8a8309106df366423 100644 (file)
@@ -11,6 +11,7 @@
 #include <asm/hpet.h>
 #include <asm/timex.h>
 #include <asm/timer.h>
+#include <asm/vgtod.h>
 
 static int notsc __initdata = 0;
 
@@ -287,18 +288,34 @@ int __init notsc_setup(char *s)
 
 __setup("notsc", notsc_setup);
 
+static struct clocksource clocksource_tsc;
 
-/* clock source code: */
+/*
+ * We compare the TSC to the cycle_last value in the clocksource
+ * structure to avoid a nasty time-warp. This can be observed in a
+ * very small window right after one CPU updated cycle_last under
+ * xtime/vsyscall_gtod lock and the other CPU reads a TSC value which
+ * is smaller than the cycle_last reference value due to a TSC which
+ * is slightly behind. This delta is nowhere else observable, but in
+ * that case it results in a forward time jump in the range of hours
+ * due to the unsigned delta calculation of the time keeping core
+ * code, which is necessary to support wrapping clocksources like pm
+ * timer.
+ */
 static cycle_t read_tsc(void)
 {
        cycle_t ret = (cycle_t)get_cycles();
-       return ret;
+
+       return ret >= clocksource_tsc.cycle_last ?
+               ret : clocksource_tsc.cycle_last;
 }
 
 static cycle_t __vsyscall_fn vread_tsc(void)
 {
        cycle_t ret = (cycle_t)vget_cycles();
-       return ret;
+
+       return ret >= __vsyscall_gtod_data.clock.cycle_last ?
+               ret : __vsyscall_gtod_data.clock.cycle_last;
 }
 
 static struct clocksource clocksource_tsc = {
index 710faf71a650b11a991cdbc034cb21e31f0a98f1..cef9cb1d15accf6a6677efcfd76260dcd8990dca 100644 (file)
@@ -1,6 +1,4 @@
 /*
- *     linux/arch/i386/mach-visws/visws_apic.c
- *
  *     Copyright (C) 1999 Bent Hagemark, Ingo Molnar
  *
  *  SGI Visual Workstation interrupt controller
index 6a949e4edde8f17758fbb5fc9ff004eff0b4d286..46d6f8067690d9f37610718c9af1b924a3233e98 100644 (file)
@@ -2,8 +2,6 @@
  *
  * Author: J.E.J.Bottomley@HansenPartnership.com
  *
- * linux/arch/i386/kernel/voyager.c
- *
  * This file contains all the voyager specific routines for getting
  * initialisation of the architecture to function.  For additional
  * features see:
index 17a7904f75b19ce55b63a562158bc82c172cf677..ecab9fff0fd17579b43d550f8bae2e5eea98f22b 100644 (file)
@@ -4,8 +4,6 @@
  *
  * Author: J.E.J.Bottomley@HansenPartnership.com
  *
- * linux/arch/i386/kernel/voyager_cat.c
- *
  * This file contains all the logic for manipulating the CAT bus
  * in a level 5 machine.
  *
index be7235bf105d8ac7f2e77838e36988710fd51048..96f60c7cd124a141e7f4825bbbe555e8893d9ef3 100644 (file)
@@ -4,8 +4,6 @@
  *
  * Author: J.E.J.Bottomley@HansenPartnership.com
  *
- * linux/arch/i386/kernel/voyager_smp.c
- *
  * This file provides all the same external entries as smp.c but uses
  * the voyager hal to provide the functionality
  */
index c69c931818ed49b8d5d9326a50aefb8aa339ba9e..15464a20fb388ee56fb4c7f05de9e8ed34803be6 100644 (file)
@@ -4,8 +4,6 @@
  *
  * Author: J.E.J.Bottomley@HansenPartnership.com
  *
- * linux/arch/i386/kernel/voyager_thread.c
- *
  * This module provides the machine status monitor thread for the
  * voyager architecture.  This allows us to monitor the machine
  * environment (temp, voltage, fan function) and the front panel and
index 4bab3b14539242ceeddf3a1580617f71c9227710..6e38d877ea7725fb8d94d2c91a0ee7daab2585d1 100644 (file)
@@ -678,7 +678,7 @@ int fpregs_soft_set(struct task_struct *target,
                    unsigned int pos, unsigned int count,
                    const void *kbuf, const void __user *ubuf)
 {
-       struct i387_soft_struct *s387 = &target->thread.i387.soft;
+       struct i387_soft_struct *s387 = &target->thread.xstate->soft;
        void *space = s387->st_space;
        int ret;
        int offset, other, i, tags, regnr, tag, newtop;
@@ -730,7 +730,7 @@ int fpregs_soft_get(struct task_struct *target,
                    unsigned int pos, unsigned int count,
                    void *kbuf, void __user *ubuf)
 {
-       struct i387_soft_struct *s387 = &target->thread.i387.soft;
+       struct i387_soft_struct *s387 = &target->thread.xstate->soft;
        const void *space = s387->st_space;
        int ret;
        int offset = (S387->ftop & 7) * 10, other = 80 - offset;
index a3ae28c49dddad063de9c177f7d0f13480df3da5..13488fa153e0c81dacc9370edc1f0a2128d0babd 100644 (file)
@@ -35,8 +35,8 @@
 #define SEG_EXPAND_DOWN(s)     (((s).b & ((1 << 11) | (1 << 10))) \
                                 == (1 << 10))
 
-#define I387                   (current->thread.i387)
-#define FPU_info               (I387.soft.info)
+#define I387                   (current->thread.xstate)
+#define FPU_info               (I387->soft.info)
 
 #define FPU_CS                 (*(unsigned short *) &(FPU_info->___cs))
 #define FPU_SS                 (*(unsigned short *) &(FPU_info->___ss))
 #define FPU_EIP                        (FPU_info->___eip)
 #define FPU_ORIG_EIP           (FPU_info->___orig_eip)
 
-#define FPU_lookahead           (I387.soft.lookahead)
+#define FPU_lookahead           (I387->soft.lookahead)
 
 /* nz if ip_offset and cs_selector are not to be set for the current
    instruction. */
-#define no_ip_update           (*(u_char *)&(I387.soft.no_update))
-#define FPU_rm                 (*(u_char *)&(I387.soft.rm))
+#define no_ip_update           (*(u_char *)&(I387->soft.no_update))
+#define FPU_rm                 (*(u_char *)&(I387->soft.rm))
 
 /* Number of bytes of data which can be legally accessed by the current
    instruction. This only needs to hold a number <= 108, so a byte will do. */
-#define access_limit           (*(u_char *)&(I387.soft.alimit))
+#define access_limit           (*(u_char *)&(I387->soft.alimit))
 
-#define partial_status         (I387.soft.swd)
-#define control_word           (I387.soft.cwd)
-#define fpu_tag_word           (I387.soft.twd)
-#define registers              (I387.soft.st_space)
-#define top                    (I387.soft.ftop)
+#define partial_status         (I387->soft.swd)
+#define control_word           (I387->soft.cwd)
+#define fpu_tag_word           (I387->soft.twd)
+#define registers              (I387->soft.st_space)
+#define top                    (I387->soft.ftop)
 
-#define instruction_address    (*(struct address *)&I387.soft.fip)
-#define operand_address                (*(struct address *)&I387.soft.foo)
+#define instruction_address    (*(struct address *)&I387->soft.fip)
+#define operand_address                (*(struct address *)&I387->soft.foo)
 
 #define FPU_access_ok(x,y,z)   if ( !access_ok(x,y,z) ) \
                                math_abort(FPU_info,SIGSEGV)
index 02af772a24db24f12d2fbd17cfbd53fc5d22515b..d597fe7423c98441f7ed52f8bcb0df3bd45646a9 100644 (file)
@@ -1180,8 +1180,8 @@ u_char __user *fstenv(fpu_addr_modes addr_modes, u_char __user *d)
                control_word |= 0xffff0040;
                partial_status = status_word() | 0xffff0000;
                fpu_tag_word |= 0xffff0000;
-               I387.soft.fcs &= ~0xf8000000;
-               I387.soft.fos |= 0xffff0000;
+               I387->soft.fcs &= ~0xf8000000;
+               I387->soft.fos |= 0xffff0000;
 #endif /* PECULIAR_486 */
                if (__copy_to_user(d, &control_word, 7 * 4))
                        FPU_abort;
index eba0bbede7a6b8e05319c5148b8d8b8b715e0db3..18378850e25aab13c3149903b5c2d6c611ca10a9 100644 (file)
@@ -120,7 +120,7 @@ int __init get_memcfg_numa_flat(void)
        printk("NUMA - single node, flat memory mode\n");
 
        /* Run the memory configuration and find the top of memory. */
-       find_max_pfn();
+       propagate_e820_map();
        node_start_pfn[0] = 0;
        node_end_pfn[0] = max_pfn;
        memory_present(0, 0, max_pfn);
@@ -134,7 +134,7 @@ int __init get_memcfg_numa_flat(void)
 /*
  * Find the highest page frame number we have available for the node
  */
-static void __init find_max_pfn_node(int nid)
+static void __init propagate_e820_map_node(int nid)
 {
        if (node_end_pfn[nid] > max_pfn)
                node_end_pfn[nid] = max_pfn;
@@ -379,7 +379,7 @@ unsigned long __init setup_memory(void)
        printk("High memory starts at vaddr %08lx\n",
                        (ulong) pfn_to_kaddr(highstart_pfn));
        for_each_online_node(nid)
-               find_max_pfn_node(nid);
+               propagate_e820_map_node(nid);
 
        memset(NODE_DATA(0), 0, sizeof(struct pglist_data));
        NODE_DATA(0)->bdata = &node0_bdata;
index 1500dc8d63e4676586470d458722b71cad63ace1..9ec62da85fd79a75fc450b58c4ea54e4c77f857b 100644 (file)
@@ -1,5 +1,4 @@
 /*
- *  linux/arch/i386/mm/init.c
  *
  *  Copyright (C) 1995  Linus Torvalds
  *
index 1076097dcab22115f5de1e9b22afc1baf2129ab6..1ff7906a9a4dbc7afa6dfdd24f274e077cd26773 100644 (file)
@@ -47,9 +47,6 @@
 #include <asm/numa.h>
 #include <asm/cacheflush.h>
 
-const struct dma_mapping_ops *dma_ops;
-EXPORT_SYMBOL(dma_ops);
-
 static unsigned long dma_reserve __initdata;
 
 DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
index c590fd200e297e892d1552b8b575e2cf7b8604f7..3a4baf95e24d5a5cee6626a5b8789064e9dce564 100644 (file)
@@ -134,7 +134,7 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
 
        if (!phys_addr_valid(phys_addr)) {
                printk(KERN_WARNING "ioremap: invalid physical address %llx\n",
-                      phys_addr);
+                      (unsigned long long)phys_addr);
                WARN_ON_ONCE(1);
                return NULL;
        }
@@ -187,7 +187,8 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
                     new_prot_val == _PAGE_CACHE_WB)) {
                        pr_debug(
                "ioremap error for 0x%llx-0x%llx, requested 0x%lx, got 0x%lx\n",
-                               phys_addr, phys_addr + size,
+                               (unsigned long long)phys_addr,
+                               (unsigned long long)(phys_addr + size),
                                prot_val, new_prot_val);
                        free_memtype(phys_addr, phys_addr + size);
                        return NULL;
index 7a2ebce87df5dee511701d3f3d521d62e2d26dc8..86808e666f9c2aeea15492a8609400915f819a09 100644 (file)
@@ -164,7 +164,7 @@ int __init k8_scan_nodes(unsigned long start, unsigned long end)
        if (!found)
                return -1;
 
-       memnode_shift = compute_hash_shift(nodes, 8);
+       memnode_shift = compute_hash_shift(nodes, 8, NULL);
        if (memnode_shift < 0) {
                printk(KERN_ERR "No NUMA node hash function found. Contact maintainer\n");
                return -1;
index 2ea56f48f29b506c3a180bfd79b3bdb62bf5fd2c..9a6892200b271a1ebec5fe64da0adef13b54df78 100644 (file)
@@ -60,7 +60,7 @@ unsigned long __initdata nodemap_size;
  * -1 if node overlap or lost ram (shift too big)
  */
 static int __init populate_memnodemap(const struct bootnode *nodes,
-                                     int numnodes, int shift)
+                                     int numnodes, int shift, int *nodeids)
 {
        unsigned long addr, end;
        int i, res = -1;
@@ -76,7 +76,12 @@ static int __init populate_memnodemap(const struct bootnode *nodes,
                do {
                        if (memnodemap[addr >> shift] != NUMA_NO_NODE)
                                return -1;
-                       memnodemap[addr >> shift] = i;
+
+                       if (!nodeids)
+                               memnodemap[addr >> shift] = i;
+                       else
+                               memnodemap[addr >> shift] = nodeids[i];
+
                        addr += (1UL << shift);
                } while (addr < end);
                res = 1;
@@ -139,7 +144,8 @@ static int __init extract_lsb_from_nodes(const struct bootnode *nodes,
        return i;
 }
 
-int __init compute_hash_shift(struct bootnode *nodes, int numnodes)
+int __init compute_hash_shift(struct bootnode *nodes, int numnodes,
+                             int *nodeids)
 {
        int shift;
 
@@ -149,7 +155,7 @@ int __init compute_hash_shift(struct bootnode *nodes, int numnodes)
        printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n",
                shift);
 
-       if (populate_memnodemap(nodes, numnodes, shift) != 1) {
+       if (populate_memnodemap(nodes, numnodes, shift, nodeids) != 1) {
                printk(KERN_INFO "Your memory is not aligned you need to "
                       "rebuild your kernel with a bigger NODEMAPSIZE "
                       "shift=%d\n", shift);
@@ -380,9 +386,10 @@ static int __init split_nodes_by_size(struct bootnode *nodes, u64 *addr,
  * Sets up the system RAM area from start_pfn to end_pfn according to the
  * numa=fake command-line option.
  */
+static struct bootnode nodes[MAX_NUMNODES] __initdata;
+
 static int __init numa_emulation(unsigned long start_pfn, unsigned long end_pfn)
 {
-       struct bootnode nodes[MAX_NUMNODES];
        u64 size, addr = start_pfn << PAGE_SHIFT;
        u64 max_addr = end_pfn << PAGE_SHIFT;
        int num_nodes = 0, num = 0, coeff_flag, coeff = -1, i;
@@ -462,7 +469,7 @@ done:
                }
        }
 out:
-       memnode_shift = compute_hash_shift(nodes, num_nodes);
+       memnode_shift = compute_hash_shift(nodes, num_nodes, NULL);
        if (memnode_shift < 0) {
                memnode_shift = 0;
                printk(KERN_ERR "No NUMA hash function found.  NUMA emulation "
index 3165ec0672bd1855cb607c6864b83c810c29d729..6fb9e7c6893fd44afad45232355b47c867c7cf75 100644 (file)
@@ -1,7 +1,3 @@
-/*
- *  linux/arch/i386/mm/pgtable.c
- */
-
 #include <linux/sched.h>
 #include <linux/kernel.h>
 #include <linux/errno.h>
index 1bae9c855ceb8b56cce821129a21194a693f7c81..fb43d89f46f3c92b51e5eedc85be49340eabd765 100644 (file)
@@ -32,6 +32,10 @@ static struct bootnode nodes_add[MAX_NUMNODES];
 static int found_add_area __initdata;
 int hotadd_percent __initdata = 0;
 
+static int num_node_memblks __initdata;
+static struct bootnode node_memblk_range[NR_NODE_MEMBLKS] __initdata;
+static int memblk_nodeid[NR_NODE_MEMBLKS] __initdata;
+
 /* Too small nodes confuse the VM badly. Usually they result
    from BIOS bugs. */
 #define NODE_MIN_SIZE (4*1024*1024)
@@ -41,17 +45,17 @@ static __init int setup_node(int pxm)
        return acpi_map_pxm_to_node(pxm);
 }
 
-static __init int conflicting_nodes(unsigned long start, unsigned long end)
+static __init int conflicting_memblks(unsigned long start, unsigned long end)
 {
        int i;
-       for_each_node_mask(i, nodes_parsed) {
-               struct bootnode *nd = &nodes[i];
+       for (i = 0; i < num_node_memblks; i++) {
+               struct bootnode *nd = &node_memblk_range[i];
                if (nd->start == nd->end)
                        continue;
                if (nd->end > start && nd->start < end)
-                       return i;
+                       return memblk_nodeid[i];
                if (nd->end == end && nd->start == start)
-                       return i;
+                       return memblk_nodeid[i];
        }
        return -1;
 }
@@ -258,7 +262,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
                bad_srat();
                return;
        }
-       i = conflicting_nodes(start, end);
+       i = conflicting_memblks(start, end);
        if (i == node) {
                printk(KERN_WARNING
                "SRAT: Warning: PXM %d (%lx-%lx) overlaps with itself (%Lx-%Lx)\n",
@@ -283,10 +287,10 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
                        nd->end = end;
        }
 
-       printk(KERN_INFO "SRAT: Node %u PXM %u %Lx-%Lx\n", node, pxm,
-              nd->start, nd->end);
-       e820_register_active_regions(node, nd->start >> PAGE_SHIFT,
-                                               nd->end >> PAGE_SHIFT);
+       printk(KERN_INFO "SRAT: Node %u PXM %u %lx-%lx\n", node, pxm,
+              start, end);
+       e820_register_active_regions(node, start >> PAGE_SHIFT,
+                                    end >> PAGE_SHIFT);
        push_node_boundaries(node, nd->start >> PAGE_SHIFT,
                                                nd->end >> PAGE_SHIFT);
 
@@ -298,6 +302,11 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
                if ((nd->start | nd->end) == 0)
                        node_clear(node, nodes_parsed);
        }
+
+       node_memblk_range[num_node_memblks].start = start;
+       node_memblk_range[num_node_memblks].end = end;
+       memblk_nodeid[num_node_memblks] = node;
+       num_node_memblks++;
 }
 
 /* Sanity check to catch more bad SRATs (they are amazingly common).
@@ -368,7 +377,8 @@ int __init acpi_scan_nodes(unsigned long start, unsigned long end)
                return -1;
        }
 
-       memnode_shift = compute_hash_shift(nodes, MAX_NUMNODES);
+       memnode_shift = compute_hash_shift(node_memblk_range, num_node_memblks,
+                                          memblk_nodeid);
        if (memnode_shift < 0) {
                printk(KERN_ERR
                     "SRAT: No NUMA node hash function found. Contact maintainer\n");
index 1f11cf0a307f448e777f5a132e0de15ebc4ca7a3..cc48d3fde545c014dc6d4a34f07cc2578ea0f481 100644 (file)
@@ -23,8 +23,8 @@
 #include "op_x86_model.h"
 
 static struct op_x86_model_spec const *model;
-static struct op_msrs cpu_msrs[NR_CPUS];
-static unsigned long saved_lvtpc[NR_CPUS];
+static DEFINE_PER_CPU(struct op_msrs, cpu_msrs);
+static DEFINE_PER_CPU(unsigned long, saved_lvtpc);
 
 static int nmi_start(void);
 static void nmi_stop(void);
@@ -89,7 +89,7 @@ static int profile_exceptions_notify(struct notifier_block *self,
 
        switch (val) {
        case DIE_NMI:
-               if (model->check_ctrs(args->regs, &cpu_msrs[cpu]))
+               if (model->check_ctrs(args->regs, &per_cpu(cpu_msrs, cpu)))
                        ret = NOTIFY_STOP;
                break;
        default:
@@ -126,7 +126,7 @@ static void nmi_cpu_save_registers(struct op_msrs *msrs)
 static void nmi_save_registers(void *dummy)
 {
        int cpu = smp_processor_id();
-       struct op_msrs *msrs = &cpu_msrs[cpu];
+       struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);
        nmi_cpu_save_registers(msrs);
 }
 
@@ -134,10 +134,10 @@ static void free_msrs(void)
 {
        int i;
        for_each_possible_cpu(i) {
-               kfree(cpu_msrs[i].counters);
-               cpu_msrs[i].counters = NULL;
-               kfree(cpu_msrs[i].controls);
-               cpu_msrs[i].controls = NULL;
+               kfree(per_cpu(cpu_msrs, i).counters);
+               per_cpu(cpu_msrs, i).counters = NULL;
+               kfree(per_cpu(cpu_msrs, i).controls);
+               per_cpu(cpu_msrs, i).controls = NULL;
        }
 }
 
@@ -149,13 +149,15 @@ static int allocate_msrs(void)
 
        int i;
        for_each_possible_cpu(i) {
-               cpu_msrs[i].counters = kmalloc(counters_size, GFP_KERNEL);
-               if (!cpu_msrs[i].counters) {
+               per_cpu(cpu_msrs, i).counters = kmalloc(counters_size,
+                                                               GFP_KERNEL);
+               if (!per_cpu(cpu_msrs, i).counters) {
                        success = 0;
                        break;
                }
-               cpu_msrs[i].controls = kmalloc(controls_size, GFP_KERNEL);
-               if (!cpu_msrs[i].controls) {
+               per_cpu(cpu_msrs, i).controls = kmalloc(controls_size,
+                                                               GFP_KERNEL);
+               if (!per_cpu(cpu_msrs, i).controls) {
                        success = 0;
                        break;
                }
@@ -170,11 +172,11 @@ static int allocate_msrs(void)
 static void nmi_cpu_setup(void *dummy)
 {
        int cpu = smp_processor_id();
-       struct op_msrs *msrs = &cpu_msrs[cpu];
+       struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);
        spin_lock(&oprofilefs_lock);
        model->setup_ctrs(msrs);
        spin_unlock(&oprofilefs_lock);
-       saved_lvtpc[cpu] = apic_read(APIC_LVTPC);
+       per_cpu(saved_lvtpc, cpu) = apic_read(APIC_LVTPC);
        apic_write(APIC_LVTPC, APIC_DM_NMI);
 }
 
@@ -203,13 +205,15 @@ static int nmi_setup(void)
         */
 
        /* Assume saved/restored counters are the same on all CPUs */
-       model->fill_in_addresses(&cpu_msrs[0]);
+       model->fill_in_addresses(&per_cpu(cpu_msrs, 0));
        for_each_possible_cpu(cpu) {
                if (cpu != 0) {
-                       memcpy(cpu_msrs[cpu].counters, cpu_msrs[0].counters,
+                       memcpy(per_cpu(cpu_msrs, cpu).counters,
+                               per_cpu(cpu_msrs, 0).counters,
                                sizeof(struct op_msr) * model->num_counters);
 
-                       memcpy(cpu_msrs[cpu].controls, cpu_msrs[0].controls,
+                       memcpy(per_cpu(cpu_msrs, cpu).controls,
+                               per_cpu(cpu_msrs, 0).controls,
                                sizeof(struct op_msr) * model->num_controls);
                }
 
@@ -249,7 +253,7 @@ static void nmi_cpu_shutdown(void *dummy)
 {
        unsigned int v;
        int cpu = smp_processor_id();
-       struct op_msrs *msrs = &cpu_msrs[cpu];
+       struct op_msrs *msrs = &__get_cpu_var(cpu_msrs);
 
        /* restoring APIC_LVTPC can trigger an apic error because the delivery
         * mode and vector nr combination can be illegal. That's by design: on
@@ -258,23 +262,24 @@ static void nmi_cpu_shutdown(void *dummy)
         */
        v = apic_read(APIC_LVTERR);
        apic_write(APIC_LVTERR, v | APIC_LVT_MASKED);
-       apic_write(APIC_LVTPC, saved_lvtpc[cpu]);
+       apic_write(APIC_LVTPC, per_cpu(saved_lvtpc, cpu));
        apic_write(APIC_LVTERR, v);
        nmi_restore_registers(msrs);
 }
 
 static void nmi_shutdown(void)
 {
+       struct op_msrs *msrs = &__get_cpu_var(cpu_msrs);
        nmi_enabled = 0;
        on_each_cpu(nmi_cpu_shutdown, NULL, 0, 1);
        unregister_die_notifier(&profile_exceptions_nb);
-       model->shutdown(cpu_msrs);
+       model->shutdown(msrs);
        free_msrs();
 }
 
 static void nmi_cpu_start(void *dummy)
 {
-       struct op_msrs const *msrs = &cpu_msrs[smp_processor_id()];
+       struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
        model->start(msrs);
 }
 
@@ -286,7 +291,7 @@ static int nmi_start(void)
 
 static void nmi_cpu_stop(void *dummy)
 {
-       struct op_msrs const *msrs = &cpu_msrs[smp_processor_id()];
+       struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
        model->stop(msrs);
 }
 
index 17a6b057856b6f6759eb358d7060b5b682861440..b7ad9f89d21f8898dbf792de05b0dddd21f3a803 100644 (file)
@@ -37,7 +37,8 @@ $(obj)/%.so: OBJCOPYFLAGS := -S
 $(obj)/%.so: $(obj)/%.so.dbg FORCE
        $(call if_changed,objcopy)
 
-CFL := $(PROFILING) -mcmodel=small -fPIC -g0 -O2 -fasynchronous-unwind-tables -m64
+CFL := $(PROFILING) -mcmodel=small -fPIC -O2 -fasynchronous-unwind-tables -m64 \
+       $(filter -g%,$(KBUILD_CFLAGS))
 
 $(vobjs): KBUILD_CFLAGS += $(CFL)
 
index 48fb38d7d2c0ff695a60deedfb0d92139f84c21d..4db42bff8c603ee216345bba6143ad5f5efd3dec 100644 (file)
@@ -1,5 +1,4 @@
 /*
- * arch/i386/video/fbdev.c - i386 Framebuffer
  *
  * Copyright (C) 2007 Antonino Daplas <adaplas@gmail.com>
  *
index 1b8e592a82415a0772ebf1474acde406229236c9..0bba3a914e865562c51b11703fca695cfd0051a0 100644 (file)
@@ -838,10 +838,10 @@ static int acpi_processor_get_throttling(struct acpi_processor *pr)
         * Migrate task to the cpu pointed by pr.
         */
        saved_mask = current->cpus_allowed;
-       set_cpus_allowed(current, cpumask_of_cpu(pr->id));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(pr->id));
        ret = pr->throttling.acpi_processor_get_throttling(pr);
        /* restore the previous state */
-       set_cpus_allowed(current, saved_mask);
+       set_cpus_allowed_ptr(current, &saved_mask);
 
        return ret;
 }
@@ -1025,7 +1025,7 @@ int acpi_processor_set_throttling(struct acpi_processor *pr, int state)
         * it can be called only for the cpu pointed by pr.
         */
        if (p_throttling->shared_type == DOMAIN_COORD_TYPE_SW_ANY) {
-               set_cpus_allowed(current, cpumask_of_cpu(pr->id));
+               set_cpus_allowed_ptr(current, &cpumask_of_cpu(pr->id));
                ret = p_throttling->acpi_processor_set_throttling(pr,
                                                t_state.target_state);
        } else {
@@ -1056,7 +1056,7 @@ int acpi_processor_set_throttling(struct acpi_processor *pr, int state)
                                continue;
                        }
                        t_state.cpu = i;
-                       set_cpus_allowed(current, cpumask_of_cpu(i));
+                       set_cpus_allowed_ptr(current, &cpumask_of_cpu(i));
                        ret = match_pr->throttling.
                                acpi_processor_set_throttling(
                                match_pr, t_state.target_state);
@@ -1074,7 +1074,7 @@ int acpi_processor_set_throttling(struct acpi_processor *pr, int state)
                                                        &t_state);
        }
        /* restore the previous state */
-       set_cpus_allowed(current, saved_mask);
+       set_cpus_allowed_ptr(current, &saved_mask);
        return ret;
 }
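
set_cpus_allowed_ptr() takes a pointer to the cpumask instead of copying the whole mask by value, which matters as NR_CPUS grows. The hunks above all follow the same save/pin/restore pattern; a minimal sketch, with do_work_on_this_cpu() standing in for the per-CPU operation:

	#include <linux/cpumask.h>
	#include <linux/sched.h>

	extern int do_work_on_this_cpu(void);	/* hypothetical per-CPU operation */

	static int run_on_cpu(int cpu)
	{
		cpumask_t saved_mask = current->cpus_allowed;	/* remember old affinity */
		int ret;

		set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));	/* pin to one CPU */
		ret = do_work_on_this_cpu();
		set_cpus_allowed_ptr(current, &saved_mask);		/* restore affinity */
		return ret;
	}
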
 
index 499b003f92782dfc8259791d6f13b4a377f24f5b..2c76afff3b15060e723fcec52569e450a3c46265 100644 (file)
@@ -102,6 +102,51 @@ static ssize_t show_crash_notes(struct sys_device *dev, char *buf)
 static SYSDEV_ATTR(crash_notes, 0400, show_crash_notes, NULL);
 #endif
 
+/*
+ * Print cpu online, possible, present, and system maps
+ */
+static ssize_t print_cpus_map(char *buf, cpumask_t *map)
+{
+       int n = cpulist_scnprintf(buf, PAGE_SIZE-2, *map);
+
+       buf[n++] = '\n';
+       buf[n] = '\0';
+       return n;
+}
+
+#define        print_cpus_func(type) \
+static ssize_t print_cpus_##type(struct sysdev_class *class, char *buf)        \
+{                                                                      \
+       return print_cpus_map(buf, &cpu_##type##_map);                  \
+}                                                                      \
+struct sysdev_class_attribute attr_##type##_map =                      \
+       _SYSDEV_CLASS_ATTR(type, 0444, print_cpus_##type, NULL)
+
+print_cpus_func(online);
+print_cpus_func(possible);
+print_cpus_func(present);
+
+struct sysdev_class_attribute *cpu_state_attr[] = {
+       &attr_online_map,
+       &attr_possible_map,
+       &attr_present_map,
+};
+
+static int cpu_states_init(void)
+{
+       int i;
+       int err = 0;
+
+       for (i = 0;  i < ARRAY_SIZE(cpu_state_attr); i++) {
+               int ret;
+               ret = sysdev_class_create_file(&cpu_sysdev_class,
+                                               cpu_state_attr[i]);
+               if (!err)
+                       err = ret;
+       }
+       return err;
+}
+
 /*
  * register_cpu - Setup a sysfs device for a CPU.
  * @cpu - cpu->hotpluggable field set to 1 will generate a control file in
@@ -147,6 +192,9 @@ int __init cpu_dev_init(void)
        int err;
 
        err = sysdev_class_register(&cpu_sysdev_class);
+       if (!err)
+               err = cpu_states_init();
+
 #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
        if (!err)
                err = sched_create_sysfs_power_savings_entries(&cpu_sysdev_class);
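
print_cpus_func() stamps out one read-only sysdev class attribute per CPU map, so after cpu_states_init() the online, possible and present maps show up under /sys/devices/system/cpu/. Expanding the macro for "online" gives roughly the following (a sketch of the expansion, not extra source):

	static ssize_t print_cpus_online(struct sysdev_class *class, char *buf)
	{
		return print_cpus_map(buf, &cpu_online_map);
	}
	struct sysdev_class_attribute attr_online_map =
		_SYSDEV_CLASS_ATTR(online, 0444, print_cpus_online, NULL);
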
index e59861f18ce55616981e7ca9fbc5ad2a6698f04f..12fde2d03d695a112f02debd25c4fb897a4b937c 100644 (file)
@@ -19,21 +19,34 @@ static struct sysdev_class node_class = {
 };
 
 
-static ssize_t node_read_cpumap(struct sys_device * dev, char * buf)
+static ssize_t node_read_cpumap(struct sys_device *dev, int type, char *buf)
 {
        struct node *node_dev = to_node(dev);
-       cpumask_t mask = node_to_cpumask(node_dev->sysdev.id);
+       node_to_cpumask_ptr(mask, node_dev->sysdev.id);
        int len;
 
-       /* 2004/06/03: buf currently PAGE_SIZE, need > 1 char per 4 bits. */
-       BUILD_BUG_ON(MAX_NUMNODES/4 > PAGE_SIZE/2);
+       /* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
+       BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));
 
-       len = cpumask_scnprintf(buf, PAGE_SIZE-1, mask);
-       len += sprintf(buf + len, "\n");
+       len = type?
+               cpulist_scnprintf(buf, PAGE_SIZE-2, *mask):
+               cpumask_scnprintf(buf, PAGE_SIZE-2, *mask);
+       buf[len++] = '\n';
+       buf[len] = '\0';
        return len;
 }
 
-static SYSDEV_ATTR(cpumap, S_IRUGO, node_read_cpumap, NULL);
+static inline ssize_t node_read_cpumask(struct sys_device *dev, char *buf)
+{
+       return node_read_cpumap(dev, 0, buf);
+}
+static inline ssize_t node_read_cpulist(struct sys_device *dev, char *buf)
+{
+       return node_read_cpumap(dev, 1, buf);
+}
+
+static SYSDEV_ATTR(cpumap,  S_IRUGO, node_read_cpumask, NULL);
+static SYSDEV_ATTR(cpulist, S_IRUGO, node_read_cpulist, NULL);
 
 #define K(x) ((x) << (PAGE_SHIFT - 10))
 static ssize_t node_read_meminfo(struct sys_device * dev, char * buf)
@@ -149,6 +162,7 @@ int register_node(struct node *node, int num, struct node *parent)
 
        if (!error){
                sysdev_create_file(&node->sysdev, &attr_cpumap);
+               sysdev_create_file(&node->sysdev, &attr_cpulist);
                sysdev_create_file(&node->sysdev, &attr_meminfo);
                sysdev_create_file(&node->sysdev, &attr_numastat);
                sysdev_create_file(&node->sysdev, &attr_distance);
@@ -166,6 +180,7 @@ int register_node(struct node *node, int num, struct node *parent)
 void unregister_node(struct node *node)
 {
        sysdev_remove_file(&node->sysdev, &attr_cpumap);
+       sysdev_remove_file(&node->sysdev, &attr_cpulist);
        sysdev_remove_file(&node->sysdev, &attr_meminfo);
        sysdev_remove_file(&node->sysdev, &attr_numastat);
        sysdev_remove_file(&node->sysdev, &attr_distance);
index e1d3ad4db2f04355631efd8120946c07313789c8..fdf4044d2e74a90a24ff7b5e44a16efcab52c7a4 100644 (file)
@@ -40,15 +40,38 @@ static ssize_t show_##name(struct sys_device *dev, char *buf)       \
        return sprintf(buf, "%d\n", topology_##name(cpu));      \
 }
 
-#define define_siblings_show_func(name)                                        \
-static ssize_t show_##name(struct sys_device *dev, char *buf)          \
+static ssize_t show_cpumap(int type, cpumask_t *mask, char *buf)
+{
+       ptrdiff_t len = PTR_ALIGN(buf + PAGE_SIZE - 1, PAGE_SIZE) - buf;
+       int n = 0;
+
+       if (len > 1) {
+               n = type?
+                       cpulist_scnprintf(buf, len-2, *mask):
+                       cpumask_scnprintf(buf, len-2, *mask);
+               buf[n++] = '\n';
+               buf[n] = '\0';
+       }
+       return n;
+}
+
+#define define_siblings_show_map(name)                                 \
+static inline ssize_t show_##name(struct sys_device *dev, char *buf)   \
 {                                                                      \
-       ssize_t len = -1;                                               \
        unsigned int cpu = dev->id;                                     \
-       len = cpumask_scnprintf(buf, NR_CPUS+1, topology_##name(cpu));  \
-       return (len + sprintf(buf + len, "\n"));                        \
+       return show_cpumap(0, &(topology_##name(cpu)), buf);            \
 }
 
+#define define_siblings_show_list(name)                                        \
+static inline ssize_t show_##name##_list(struct sys_device *dev, char *buf) \
+{                                                                      \
+       unsigned int cpu = dev->id;                                     \
+       return show_cpumap(1, &(topology_##name(cpu)), buf);            \
+}
+
+#define define_siblings_show_func(name)                \
+       define_siblings_show_map(name); define_siblings_show_list(name)
+
 #ifdef topology_physical_package_id
 define_id_show_func(physical_package_id);
 define_one_ro(physical_package_id);
@@ -68,7 +91,9 @@ define_one_ro(core_id);
 #ifdef topology_thread_siblings
 define_siblings_show_func(thread_siblings);
 define_one_ro(thread_siblings);
-#define ref_thread_siblings_attr       &attr_thread_siblings.attr,
+define_one_ro(thread_siblings_list);
+#define ref_thread_siblings_attr       \
+               &attr_thread_siblings.attr, &attr_thread_siblings_list.attr,
 #else
 #define ref_thread_siblings_attr
 #endif
@@ -76,7 +101,9 @@ define_one_ro(thread_siblings);
 #ifdef topology_core_siblings
 define_siblings_show_func(core_siblings);
 define_one_ro(core_siblings);
-#define ref_core_siblings_attr         &attr_core_siblings.attr,
+define_one_ro(core_siblings_list);
+#define ref_core_siblings_attr         \
+               &attr_core_siblings.attr, &attr_core_siblings_list.attr,
 #else
 #define ref_core_siblings_attr
 #endif
index 1636806ec55e8432af872e6327ef02b297db795c..0ffef3b7c6ca1400526007b867aecf4680534086 100644 (file)
@@ -265,7 +265,7 @@ static int smi_request(struct smi_cmd *smi_cmd)
 
        /* SMI requires CPU 0 */
        old_mask = current->cpus_allowed;
-       set_cpus_allowed(current, cpumask_of_cpu(0));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu(0));
        if (smp_processor_id() != 0) {
                dev_dbg(&dcdbas_pdev->dev, "%s: failed to get CPU 0\n",
                        __FUNCTION__);
@@ -285,7 +285,7 @@ static int smi_request(struct smi_cmd *smi_cmd)
        );
 
 out:
-       set_cpus_allowed(current, old_mask);
+       set_cpus_allowed_ptr(current, &old_mask);
        return ret;
 }
 
index 8ea709be3306976244a35180b4ce15c2923a0cb0..efd70a9745910bfacd97e39b4fdebe451fa14ca4 100644 (file)
@@ -314,4 +314,13 @@ config KEYBOARD_BFIN
          To compile this driver as a module, choose M here: the
          module will be called bf54x-keys.
 
+config KEYBOARD_SH_KEYSC
+       tristate "SuperH KEYSC keypad support"
+       depends on SUPERH
+       help
+         Say Y here if you want to use a keypad attached to the KEYSC block
+         on SuperH processors such as sh7722 and sh7343.
+
+         To compile this driver as a module, choose M here: the
+         module will be called sh_keysc.
 endif
index e741f4031012c470d829abf122e9d1408d345b2e..0edc8f285d1cf57f21a93eb4da8e6fa279920a7a 100644 (file)
@@ -26,3 +26,4 @@ obj-$(CONFIG_KEYBOARD_HP6XX)          += jornada680_kbd.o
 obj-$(CONFIG_KEYBOARD_HP7XX)           += jornada720_kbd.o
 obj-$(CONFIG_KEYBOARD_MAPLE)           += maple_keyb.o
 obj-$(CONFIG_KEYBOARD_BFIN)            += bf54x-keys.o
+obj-$(CONFIG_KEYBOARD_SH_KEYSC)                += sh_keysc.o
diff --git a/drivers/input/keyboard/sh_keysc.c b/drivers/input/keyboard/sh_keysc.c
new file mode 100644 (file)
index 0000000..8486abc
--- /dev/null
@@ -0,0 +1,280 @@
+/*
+ * SuperH KEYSC Keypad Driver
+ *
+ * Copyright (C) 2008 Magnus Damm
+ *
+ * Based on gpio_keys.c, Copyright 2005 Phil Blundell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/delay.h>
+#include <linux/platform_device.h>
+#include <linux/input.h>
+#include <linux/io.h>
+#include <asm/sh_keysc.h>
+
+#define KYCR1_OFFS   0x00
+#define KYCR2_OFFS   0x04
+#define KYINDR_OFFS  0x08
+#define KYOUTDR_OFFS 0x0c
+
+#define KYCR2_IRQ_LEVEL    0x10
+#define KYCR2_IRQ_DISABLED 0x00
+
+static const struct {
+       unsigned char kymd, keyout, keyin;
+} sh_keysc_mode[] = {
+       [SH_KEYSC_MODE_1] = { 0, 6, 5 },
+       [SH_KEYSC_MODE_2] = { 1, 5, 6 },
+       [SH_KEYSC_MODE_3] = { 2, 4, 7 },
+};
+
+struct sh_keysc_priv {
+       void __iomem *iomem_base;
+       unsigned long last_keys;
+       struct input_dev *input;
+       struct sh_keysc_info pdata;
+};
+
+static irqreturn_t sh_keysc_isr(int irq, void *dev_id)
+{
+       struct platform_device *pdev = dev_id;
+       struct sh_keysc_priv *priv = platform_get_drvdata(pdev);
+       struct sh_keysc_info *pdata = &priv->pdata;
+       unsigned long keys, keys1, keys0, mask;
+       unsigned char keyin_set, tmp;
+       int i, k;
+
+       dev_dbg(&pdev->dev, "isr!\n");
+
+       keys1 = ~0;
+       keys0 = 0;
+
+       do {
+               keys = 0;
+               keyin_set = 0;
+
+               iowrite16(KYCR2_IRQ_DISABLED, priv->iomem_base + KYCR2_OFFS);
+
+               for (i = 0; i < sh_keysc_mode[pdata->mode].keyout; i++) {
+                       iowrite16(0xfff ^ (3 << (i * 2)),
+                                 priv->iomem_base + KYOUTDR_OFFS);
+                       udelay(pdata->delay);
+                       tmp = ioread16(priv->iomem_base + KYINDR_OFFS);
+                       keys |= tmp << (sh_keysc_mode[pdata->mode].keyin * i);
+                       tmp ^= (1 << sh_keysc_mode[pdata->mode].keyin) - 1;
+                       keyin_set |= tmp;
+               }
+
+               iowrite16(0, priv->iomem_base + KYOUTDR_OFFS);
+               iowrite16(KYCR2_IRQ_LEVEL | (keyin_set << 8),
+                         priv->iomem_base + KYCR2_OFFS);
+
+               keys ^= ~0;
+               keys &= (1 << (sh_keysc_mode[pdata->mode].keyin *
+                              sh_keysc_mode[pdata->mode].keyout)) - 1;
+               keys1 &= keys;
+               keys0 |= keys;
+
+               dev_dbg(&pdev->dev, "keys 0x%08lx\n", keys);
+
+       } while (ioread16(priv->iomem_base + KYCR2_OFFS) & 0x01);
+
+       dev_dbg(&pdev->dev, "last_keys 0x%08lx keys0 0x%08lx keys1 0x%08lx\n",
+               priv->last_keys, keys0, keys1);
+
+       for (i = 0; i < SH_KEYSC_MAXKEYS; i++) {
+               k = pdata->keycodes[i];
+               if (!k)
+                       continue;
+
+               mask = 1 << i;
+
+               if (!((priv->last_keys ^ keys0) & mask))
+                       continue;
+
+               if ((keys1 | keys0) & mask) {
+                       input_event(priv->input, EV_KEY, k, 1);
+                       priv->last_keys |= mask;
+               }
+
+               if (!(keys1 & mask)) {
+                       input_event(priv->input, EV_KEY, k, 0);
+                       priv->last_keys &= ~mask;
+               }
+
+       }
+       input_sync(priv->input);
+
+       return IRQ_HANDLED;
+}
+
+#define res_size(res) ((res)->end - (res)->start + 1)
+
+static int __devinit sh_keysc_probe(struct platform_device *pdev)
+{
+       struct sh_keysc_priv *priv;
+       struct sh_keysc_info *pdata;
+       struct resource *res;
+       struct input_dev *input;
+       int i, k;
+       int irq, error;
+
+       if (!pdev->dev.platform_data) {
+               dev_err(&pdev->dev, "no platform data defined\n");
+               error = -EINVAL;
+               goto err0;
+       }
+
+       error = -ENXIO;
+       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+       if (res == NULL) {
+               dev_err(&pdev->dev, "failed to get I/O memory\n");
+               goto err0;
+       }
+
+       irq = platform_get_irq(pdev, 0);
+       if (irq < 0) {
+               dev_err(&pdev->dev, "failed to get irq\n");
+               goto err0;
+       }
+
+       priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+       if (priv == NULL) {
+               dev_err(&pdev->dev, "failed to allocate driver data\n");
+               error = -ENOMEM;
+               goto err0;
+       }
+
+       platform_set_drvdata(pdev, priv);
+       memcpy(&priv->pdata, pdev->dev.platform_data, sizeof(priv->pdata));
+       pdata = &priv->pdata;
+
+       res = request_mem_region(res->start, res_size(res), pdev->name);
+       if (res == NULL) {
+               dev_err(&pdev->dev, "failed to request I/O memory\n");
+               error = -EBUSY;
+               goto err1;
+       }
+
+       priv->iomem_base = ioremap_nocache(res->start, res_size(res));
+       if (priv->iomem_base == NULL) {
+               dev_err(&pdev->dev, "failed to remap I/O memory\n");
+               error = -ENXIO;
+               goto err2;
+       }
+
+       priv->input = input_allocate_device();
+       if (!priv->input) {
+               dev_err(&pdev->dev, "failed to allocate input device\n");
+               error = -ENOMEM;
+               goto err3;
+       }
+
+       input = priv->input;
+       input->evbit[0] = BIT_MASK(EV_KEY);
+
+       input->name = pdev->name;
+       input->phys = "sh-keysc-keys/input0";
+       input->dev.parent = &pdev->dev;
+
+       input->id.bustype = BUS_HOST;
+       input->id.vendor = 0x0001;
+       input->id.product = 0x0001;
+       input->id.version = 0x0100;
+
+       error = request_irq(irq, sh_keysc_isr, 0, pdev->name, pdev);
+       if (error) {
+               dev_err(&pdev->dev, "failed to request IRQ\n");
+               goto err4;
+       }
+
+       for (i = 0; i < SH_KEYSC_MAXKEYS; i++) {
+               k = pdata->keycodes[i];
+               if (k)
+                       input_set_capability(input, EV_KEY, k);
+       }
+
+       error = input_register_device(input);
+       if (error) {
+               dev_err(&pdev->dev, "failed to register input device\n");
+               goto err5;
+       }
+
+       iowrite16((sh_keysc_mode[pdata->mode].kymd << 8) |
+                 pdata->scan_timing, priv->iomem_base + KYCR1_OFFS);
+       iowrite16(0, priv->iomem_base + KYOUTDR_OFFS);
+       iowrite16(KYCR2_IRQ_LEVEL, priv->iomem_base + KYCR2_OFFS);
+       return 0;
+ err5:
+       free_irq(irq, pdev);
+ err4:
+       input_free_device(input);
+ err3:
+       iounmap(priv->iomem_base);
+ err2:
+       release_mem_region(res->start, res_size(res));
+ err1:
+       platform_set_drvdata(pdev, NULL);
+       kfree(priv);
+ err0:
+       return error;
+}
+
+static int __devexit sh_keysc_remove(struct platform_device *pdev)
+{
+       struct sh_keysc_priv *priv = platform_get_drvdata(pdev);
+       struct resource *res;
+
+       iowrite16(KYCR2_IRQ_DISABLED, priv->iomem_base + KYCR2_OFFS);
+
+       input_unregister_device(priv->input);
+       free_irq(platform_get_irq(pdev, 0), pdev);
+       iounmap(priv->iomem_base);
+
+       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+       release_mem_region(res->start, res_size(res));
+
+       platform_set_drvdata(pdev, NULL);
+       kfree(priv);
+       return 0;
+}
+
+
+#define sh_keysc_suspend NULL
+#define sh_keysc_resume NULL
+
+struct platform_driver sh_keysc_device_driver = {
+       .probe          = sh_keysc_probe,
+       .remove         = __devexit_p(sh_keysc_remove),
+       .suspend        = sh_keysc_suspend,
+       .resume         = sh_keysc_resume,
+       .driver         = {
+               .name   = "sh_keysc",
+       }
+};
+
+static int __init sh_keysc_init(void)
+{
+       return platform_driver_register(&sh_keysc_device_driver);
+}
+
+static void __exit sh_keysc_exit(void)
+{
+       platform_driver_unregister(&sh_keysc_device_driver);
+}
+
+module_init(sh_keysc_init);
+module_exit(sh_keysc_exit);
+
+MODULE_AUTHOR("Magnus Damm");
+MODULE_DESCRIPTION("SuperH KEYSC Keypad Driver");
+MODULE_LICENSE("GPL");
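
The new driver takes everything it needs from platform data; judging from the fields used above (mode, scan_timing, delay, keycodes[]), a board file would register it roughly as below. This is a hedged sketch: the register base, IRQ number and key map are invented for illustration and must match the actual SoC and board.

	#include <linux/platform_device.h>
	#include <linux/input.h>
	#include <asm/sh_keysc.h>

	static struct sh_keysc_info example_keysc_info = {
		.mode		= SH_KEYSC_MODE_1,
		.scan_timing	= 3,
		.delay		= 5,
		.keycodes	= {
			KEY_UP, KEY_DOWN, KEY_LEFT, KEY_RIGHT, KEY_ENTER,
			/* remaining entries stay 0 (unused) */
		},
	};

	static struct resource example_keysc_resources[] = {
		[0] = {
			.start	= 0x044b0000,	/* KEYSC register block, example address */
			.end	= 0x044b000f,
			.flags	= IORESOURCE_MEM,
		},
		[1] = {
			.start	= 79,		/* KEYSC interrupt, example number */
			.end	= 79,
			.flags	= IORESOURCE_IRQ,
		},
	};

	static struct platform_device example_keysc_device = {
		.name		= "sh_keysc",	/* must match the driver name above */
		.num_resources	= ARRAY_SIZE(example_keysc_resources),
		.resource	= example_keysc_resources,
		.dev		= {
			.platform_data	= &example_keysc_info,
		},
	};
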
index e571c72e67531017f140ee6e323e89644f7320b0..e8d94fafc2804e74fddd9c66eee9fac29a11883b 100644 (file)
@@ -182,15 +182,18 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
        struct mempolicy *oldpol;
        cpumask_t oldmask = current->cpus_allowed;
        int node = pcibus_to_node(dev->bus);
-       if (node >= 0 && node_online(node))
-           set_cpus_allowed(current, node_to_cpumask(node));
+
+       if (node >= 0) {
+               node_to_cpumask_ptr(nodecpumask, node);
+               set_cpus_allowed_ptr(current, nodecpumask);
+       }
        /* And set default memory allocation policy */
        oldpol = current->mempolicy;
        current->mempolicy = NULL;      /* fall back to system default policy */
 #endif
        error = drv->probe(dev, id);
 #ifdef CONFIG_NUMA
-       set_cpus_allowed(current, oldmask);
+       set_cpus_allowed_ptr(current, &oldmask);
        current->mempolicy = oldpol;
 #endif
        return error;
index 8dcf1458aa2fafe6e0433efd1dfb515ce503271a..8d9d648daeba74123a6f9cf9c74434f903292bac 100644 (file)
@@ -73,8 +73,23 @@ static ssize_t local_cpus_show(struct device *dev,
 
        mask = pcibus_to_cpumask(to_pci_dev(dev)->bus);
        len = cpumask_scnprintf(buf, PAGE_SIZE-2, mask);
-       strcat(buf,"\n"); 
-       return 1+len;
+       buf[len++] = '\n';
+       buf[len] = '\0';
+       return len;
+}
+
+
+static ssize_t local_cpulist_show(struct device *dev,
+                       struct device_attribute *attr, char *buf)
+{
+       cpumask_t mask;
+       int len;
+
+       mask = pcibus_to_cpumask(to_pci_dev(dev)->bus);
+       len = cpulist_scnprintf(buf, PAGE_SIZE-2, mask);
+       buf[len++] = '\n';
+       buf[len] = '\0';
+       return len;
 }
 
 /* show resources */
@@ -201,6 +216,7 @@ struct device_attribute pci_dev_attrs[] = {
        __ATTR_RO(class),
        __ATTR_RO(irq),
        __ATTR_RO(local_cpus),
+       __ATTR_RO(local_cpulist),
        __ATTR_RO(modalias),
 #ifdef CONFIG_NUMA
        __ATTR_RO(numa_node),
index 2db2e4bb0d1ed6073b6ec9d3e0dec604868974ab..4b3011a23effa43f9053d2889430e43225bfb2ea 100644 (file)
@@ -82,6 +82,7 @@ void pci_remove_legacy_files(struct pci_bus *bus) { return; }
  * PCI Bus Class Devices
  */
 static ssize_t pci_bus_show_cpuaffinity(struct device *dev,
+                                       int type,
                                        struct device_attribute *attr,
                                        char *buf)
 {
@@ -89,12 +90,30 @@ static ssize_t pci_bus_show_cpuaffinity(struct device *dev,
        cpumask_t cpumask;
 
        cpumask = pcibus_to_cpumask(to_pci_bus(dev));
-       ret = cpumask_scnprintf(buf, PAGE_SIZE, cpumask);
-       if (ret < PAGE_SIZE)
-               buf[ret++] = '\n';
+       ret = type?
+               cpulist_scnprintf(buf, PAGE_SIZE-2, cpumask):
+               cpumask_scnprintf(buf, PAGE_SIZE-2, cpumask);
+       buf[ret++] = '\n';
+       buf[ret] = '\0';
        return ret;
 }
-DEVICE_ATTR(cpuaffinity, S_IRUGO, pci_bus_show_cpuaffinity, NULL);
+
+static ssize_t inline pci_bus_show_cpumaskaffinity(struct device *dev,
+                                       struct device_attribute *attr,
+                                       char *buf)
+{
+       return pci_bus_show_cpuaffinity(dev, 0, attr, buf);
+}
+
+static ssize_t inline pci_bus_show_cpulistaffinity(struct device *dev,
+                                       struct device_attribute *attr,
+                                       char *buf)
+{
+       return pci_bus_show_cpuaffinity(dev, 1, attr, buf);
+}
+
+DEVICE_ATTR(cpuaffinity,     S_IRUGO, pci_bus_show_cpumaskaffinity, NULL);
+DEVICE_ATTR(cpulistaffinity, S_IRUGO, pci_bus_show_cpulistaffinity, NULL);
 
 /*
  * PCI Bus Class
index 9e9caa5d7f5f78ad747576d176c2b8052e639f2f..c594b34c67679e3f1d7df1ba143854217bc38455 100644 (file)
@@ -1,8 +1,9 @@
 /*
  * SuperH On-Chip RTC Support
  *
- * Copyright (C) 2006, 2007  Paul Mundt
+ * Copyright (C) 2006, 2007, 2008  Paul Mundt
  * Copyright (C) 2006  Jamie Lenehan
+ * Copyright (C) 2008  Angelo Castello
  *
  * Based on the old arch/sh/kernel/cpu/rtc.c by:
  *
@@ -26,7 +27,7 @@
 #include <asm/rtc.h>
 
 #define DRV_NAME       "sh-rtc"
-#define DRV_VERSION    "0.1.6"
+#define DRV_VERSION    "0.2.0"
 
 #define RTC_REG(r)     ((r) * rtc_reg_size)
 
 /* ALARM Bits - or with BCD encoded value */
 #define AR_ENB         0x80    /* Enable for alarm cmp   */
 
+/* Period Bits */
+#define PF_HP          0x100   /* Enable Half Period to support 8,32,128Hz */
+#define PF_COUNT       0x200   /* Half periodic counter */
+#define PF_OXS         0x400   /* Periodic One x Second */
+#define PF_KOU         0x800   /* Kernel or User periodic request 1=kernel */
+#define PF_MASK                0xf00
+
 /* RCR1 Bits */
 #define RCR1_CF                0x80    /* Carry Flag             */
 #define RCR1_CIE       0x10    /* Carry Interrupt Enable */
@@ -84,33 +92,24 @@ struct sh_rtc {
        unsigned int alarm_irq, periodic_irq, carry_irq;
        struct rtc_device *rtc_dev;
        spinlock_t lock;
-       int rearm_aie;
        unsigned long capabilities;     /* See asm-sh/rtc.h for cap bits */
+       unsigned short periodic_freq;
 };
 
 static irqreturn_t sh_rtc_interrupt(int irq, void *dev_id)
 {
-       struct platform_device *pdev = to_platform_device(dev_id);
-       struct sh_rtc *rtc = platform_get_drvdata(pdev);
-       unsigned int tmp, events = 0;
+       struct sh_rtc *rtc = dev_id;
+       unsigned int tmp;
 
        spin_lock(&rtc->lock);
 
        tmp = readb(rtc->regbase + RCR1);
        tmp &= ~RCR1_CF;
-
-       if (rtc->rearm_aie) {
-               if (tmp & RCR1_AF)
-                       tmp &= ~RCR1_AF;        /* try to clear AF again */
-               else {
-                       tmp |= RCR1_AIE;        /* AF has cleared, rearm IRQ */
-                       rtc->rearm_aie = 0;
-               }
-       }
-
        writeb(tmp, rtc->regbase + RCR1);
 
-       rtc_update_irq(rtc->rtc_dev, 1, events);
+       /* Users have requested One x Second IRQ */
+       if (rtc->periodic_freq & PF_OXS)
+               rtc_update_irq(rtc->rtc_dev, 1, RTC_UF | RTC_IRQF);
 
        spin_unlock(&rtc->lock);
 
@@ -119,47 +118,48 @@ static irqreturn_t sh_rtc_interrupt(int irq, void *dev_id)
 
 static irqreturn_t sh_rtc_alarm(int irq, void *dev_id)
 {
-       struct platform_device *pdev = to_platform_device(dev_id);
-       struct sh_rtc *rtc = platform_get_drvdata(pdev);
-       unsigned int tmp, events = 0;
+       struct sh_rtc *rtc = dev_id;
+       unsigned int tmp;
 
        spin_lock(&rtc->lock);
 
        tmp = readb(rtc->regbase + RCR1);
-
-       /*
-        * If AF is set then the alarm has triggered. If we clear AF while
-        * the alarm time still matches the RTC time then AF will
-        * immediately be set again, and if AIE is enabled then the alarm
-        * interrupt will immediately be retrigger. So we clear AIE here
-        * and use rtc->rearm_aie so that the carry interrupt will keep
-        * trying to clear AF and once it stays cleared it'll re-enable
-        * AIE.
-        */
-       if (tmp & RCR1_AF) {
-               events |= RTC_AF | RTC_IRQF;
-
-               tmp &= ~(RCR1_AF|RCR1_AIE);
-
+       tmp &= ~(RCR1_AF | RCR1_AIE);
                writeb(tmp, rtc->regbase + RCR1);
 
-               rtc->rearm_aie = 1;
-
-               rtc_update_irq(rtc->rtc_dev, 1, events);
-       }
+       rtc_update_irq(rtc->rtc_dev, 1, RTC_AF | RTC_IRQF);
 
        spin_unlock(&rtc->lock);
+
        return IRQ_HANDLED;
 }
 
 static irqreturn_t sh_rtc_periodic(int irq, void *dev_id)
 {
-       struct platform_device *pdev = to_platform_device(dev_id);
-       struct sh_rtc *rtc = platform_get_drvdata(pdev);
+       struct sh_rtc *rtc = dev_id;
+       struct rtc_device *rtc_dev = rtc->rtc_dev;
+       unsigned int tmp;
 
        spin_lock(&rtc->lock);
 
-       rtc_update_irq(rtc->rtc_dev, 1, RTC_PF | RTC_IRQF);
+       tmp = readb(rtc->regbase + RCR2);
+       tmp &= ~RCR2_PEF;
+       writeb(tmp, rtc->regbase + RCR2);
+
+       /* If half period is enabled, skip one interrupt and notify on the next */
+       if ((rtc->periodic_freq & PF_HP) && (rtc->periodic_freq & PF_COUNT))
+               rtc->periodic_freq &= ~PF_COUNT;
+       else {
+               if (rtc->periodic_freq & PF_HP)
+                       rtc->periodic_freq |= PF_COUNT;
+               if (rtc->periodic_freq & PF_KOU) {
+                       spin_lock(&rtc_dev->irq_task_lock);
+                       if (rtc_dev->irq_task)
+                               rtc_dev->irq_task->func(rtc_dev->irq_task->private_data);
+                       spin_unlock(&rtc_dev->irq_task_lock);
+               } else
+                       rtc_update_irq(rtc->rtc_dev, 1, RTC_PF | RTC_IRQF);
+       }
 
        spin_unlock(&rtc->lock);
 
@@ -176,8 +176,8 @@ static inline void sh_rtc_setpie(struct device *dev, unsigned int enable)
        tmp = readb(rtc->regbase + RCR2);
 
        if (enable) {
-               tmp &= ~RCR2_PESMASK;
-               tmp |= RCR2_PEF | (2 << 4);
+               tmp &= ~RCR2_PEF;       /* Clear PEF */
+               tmp |= (rtc->periodic_freq & ~PF_HP);   /* Set PES2-0 */
        } else
                tmp &= ~(RCR2_PESMASK | RCR2_PEF);
 
@@ -186,82 +186,81 @@ static inline void sh_rtc_setpie(struct device *dev, unsigned int enable)
        spin_unlock_irq(&rtc->lock);
 }
 
-static inline void sh_rtc_setaie(struct device *dev, unsigned int enable)
+static inline int sh_rtc_setfreq(struct device *dev, unsigned int freq)
 {
        struct sh_rtc *rtc = dev_get_drvdata(dev);
-       unsigned int tmp;
+       int tmp, ret = 0;
 
        spin_lock_irq(&rtc->lock);
+       tmp = rtc->periodic_freq & PF_MASK;
 
-       tmp = readb(rtc->regbase + RCR1);
-
-       if (!enable) {
-               tmp &= ~RCR1_AIE;
-               rtc->rearm_aie = 0;
-       } else if (rtc->rearm_aie == 0)
-               tmp |= RCR1_AIE;
+       switch (freq) {
+       case 0:
+               rtc->periodic_freq = 0x00;
+               break;
+       case 1:
+               rtc->periodic_freq = 0x60;
+               break;
+       case 2:
+               rtc->periodic_freq = 0x50;
+               break;
+       case 4:
+               rtc->periodic_freq = 0x40;
+               break;
+       case 8:
+               rtc->periodic_freq = 0x30 | PF_HP;
+               break;
+       case 16:
+               rtc->periodic_freq = 0x30;
+               break;
+       case 32:
+               rtc->periodic_freq = 0x20 | PF_HP;
+               break;
+       case 64:
+               rtc->periodic_freq = 0x20;
+               break;
+       case 128:
+               rtc->periodic_freq = 0x10 | PF_HP;
+               break;
+       case 256:
+               rtc->periodic_freq = 0x10;
+               break;
+       default:
+               ret = -ENOTSUPP;
+       }
 
-       writeb(tmp, rtc->regbase + RCR1);
+       if (ret == 0) {
+               rtc->periodic_freq |= tmp;
+               rtc->rtc_dev->irq_freq = freq;
+       }
 
        spin_unlock_irq(&rtc->lock);
+       return ret;
 }
 
-static int sh_rtc_open(struct device *dev)
+static inline void sh_rtc_setaie(struct device *dev, unsigned int enable)
 {
        struct sh_rtc *rtc = dev_get_drvdata(dev);
        unsigned int tmp;
-       int ret;
-
-       tmp = readb(rtc->regbase + RCR1);
-       tmp &= ~RCR1_CF;
-       tmp |= RCR1_CIE;
-       writeb(tmp, rtc->regbase + RCR1);
 
-       ret = request_irq(rtc->periodic_irq, sh_rtc_periodic, IRQF_DISABLED,
-                         "sh-rtc period", dev);
-       if (unlikely(ret)) {
-               dev_err(dev, "request period IRQ failed with %d, IRQ %d\n",
-                       ret, rtc->periodic_irq);
-               return ret;
-       }
-
-       ret = request_irq(rtc->carry_irq, sh_rtc_interrupt, IRQF_DISABLED,
-                         "sh-rtc carry", dev);
-       if (unlikely(ret)) {
-               dev_err(dev, "request carry IRQ failed with %d, IRQ %d\n",
-                       ret, rtc->carry_irq);
-               free_irq(rtc->periodic_irq, dev);
-               goto err_bad_carry;
-       }
+       spin_lock_irq(&rtc->lock);
 
-       ret = request_irq(rtc->alarm_irq, sh_rtc_alarm, IRQF_DISABLED,
-                         "sh-rtc alarm", dev);
-       if (unlikely(ret)) {
-               dev_err(dev, "request alarm IRQ failed with %d, IRQ %d\n",
-                       ret, rtc->alarm_irq);
-               goto err_bad_alarm;
-       }
+       tmp = readb(rtc->regbase + RCR1);
 
-       return 0;
+       if (!enable)
+               tmp &= ~RCR1_AIE;
+       else
+               tmp |= RCR1_AIE;
 
-err_bad_alarm:
-       free_irq(rtc->carry_irq, dev);
-err_bad_carry:
-       free_irq(rtc->periodic_irq, dev);
+       writeb(tmp, rtc->regbase + RCR1);
 
-       return ret;
+       spin_unlock_irq(&rtc->lock);
 }
 
 static void sh_rtc_release(struct device *dev)
 {
-       struct sh_rtc *rtc = dev_get_drvdata(dev);
-
        sh_rtc_setpie(dev, 0);
        sh_rtc_setaie(dev, 0);
-
-       free_irq(rtc->periodic_irq, dev);
-       free_irq(rtc->carry_irq, dev);
-       free_irq(rtc->alarm_irq, dev);
 }
 
 static int sh_rtc_proc(struct device *dev, struct seq_file *seq)
@@ -270,31 +269,44 @@ static int sh_rtc_proc(struct device *dev, struct seq_file *seq)
        unsigned int tmp;
 
        tmp = readb(rtc->regbase + RCR1);
-       seq_printf(seq, "carry_IRQ\t: %s\n",
-                  (tmp & RCR1_CIE) ? "yes" : "no");
+       seq_printf(seq, "carry_IRQ\t: %s\n", (tmp & RCR1_CIE) ? "yes" : "no");
 
        tmp = readb(rtc->regbase + RCR2);
        seq_printf(seq, "periodic_IRQ\t: %s\n",
-                  (tmp & RCR2_PEF) ? "yes" : "no");
+                  (tmp & RCR2_PESMASK) ? "yes" : "no");
 
        return 0;
 }
 
 static int sh_rtc_ioctl(struct device *dev, unsigned int cmd, unsigned long arg)
 {
-       unsigned int ret = -ENOIOCTLCMD;
+       struct sh_rtc *rtc = dev_get_drvdata(dev);
+       unsigned int ret = 0;
 
        switch (cmd) {
        case RTC_PIE_OFF:
        case RTC_PIE_ON:
                sh_rtc_setpie(dev, cmd == RTC_PIE_ON);
-               ret = 0;
                break;
        case RTC_AIE_OFF:
        case RTC_AIE_ON:
                sh_rtc_setaie(dev, cmd == RTC_AIE_ON);
-               ret = 0;
                break;
+       case RTC_UIE_OFF:
+               rtc->periodic_freq &= ~PF_OXS;
+               break;
+       case RTC_UIE_ON:
+               rtc->periodic_freq |= PF_OXS;
+               break;
+       case RTC_IRQP_READ:
+               ret = put_user(rtc->rtc_dev->irq_freq,
+                              (unsigned long __user *)arg);
+               break;
+       case RTC_IRQP_SET:
+               ret = sh_rtc_setfreq(dev, arg);
+               break;
+       default:
+               ret = -ENOIOCTLCMD;
        }
 
        return ret;
@@ -421,7 +433,7 @@ static int sh_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *wkalrm)
 {
        struct platform_device *pdev = to_platform_device(dev);
        struct sh_rtc *rtc = platform_get_drvdata(pdev);
-       struct rtc_time* tm = &wkalrm->time;
+       struct rtc_time *tm = &wkalrm->time;
 
        spin_lock_irq(&rtc->lock);
 
@@ -452,7 +464,7 @@ static inline void sh_rtc_write_alarm_value(struct sh_rtc *rtc,
                writeb(BIN2BCD(value) | AR_ENB,  rtc->regbase + reg_off);
 }
 
-static int sh_rtc_check_alarm(struct rtc_time* tm)
+static int sh_rtc_check_alarm(struct rtc_time *tm)
 {
        /*
         * The original rtc says anything > 0xc0 is "don't care" or "match
@@ -503,11 +515,9 @@ static int sh_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *wkalrm)
 
        /* disable alarm interrupt and clear the alarm flag */
        rcr1 = readb(rtc->regbase + RCR1);
-       rcr1 &= ~(RCR1_AF|RCR1_AIE);
+       rcr1 &= ~(RCR1_AF | RCR1_AIE);
        writeb(rcr1, rtc->regbase + RCR1);
 
-       rtc->rearm_aie = 0;
-
        /* set alarm time */
        sh_rtc_write_alarm_value(rtc, tm->tm_sec,  RSECAR);
        sh_rtc_write_alarm_value(rtc, tm->tm_min,  RMINAR);
@@ -529,14 +539,34 @@ static int sh_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *wkalrm)
        return 0;
 }
 
+static int sh_rtc_irq_set_state(struct device *dev, int enabled)
+{
+       struct platform_device *pdev = to_platform_device(dev);
+       struct sh_rtc *rtc = platform_get_drvdata(pdev);
+
+       if (enabled) {
+               rtc->periodic_freq |= PF_KOU;
+               return sh_rtc_ioctl(dev, RTC_PIE_ON, 0);
+       } else {
+               rtc->periodic_freq &= ~PF_KOU;
+               return sh_rtc_ioctl(dev, RTC_PIE_OFF, 0);
+       }
+}
+
+static int sh_rtc_irq_set_freq(struct device *dev, int freq)
+{
+       return sh_rtc_ioctl(dev, RTC_IRQP_SET, freq);
+}
+
 static struct rtc_class_ops sh_rtc_ops = {
-       .open           = sh_rtc_open,
        .release        = sh_rtc_release,
        .ioctl          = sh_rtc_ioctl,
        .read_time      = sh_rtc_read_time,
        .set_time       = sh_rtc_set_time,
        .read_alarm     = sh_rtc_read_alarm,
        .set_alarm      = sh_rtc_set_alarm,
+       .irq_set_state  = sh_rtc_irq_set_state,
+       .irq_set_freq   = sh_rtc_irq_set_freq,
        .proc           = sh_rtc_proc,
 };
 
@@ -544,6 +574,7 @@ static int __devinit sh_rtc_probe(struct platform_device *pdev)
 {
        struct sh_rtc *rtc;
        struct resource *res;
+       unsigned int tmp;
        int ret = -ENOENT;
 
        rtc = kzalloc(sizeof(struct sh_rtc), GFP_KERNEL);
@@ -552,6 +583,7 @@ static int __devinit sh_rtc_probe(struct platform_device *pdev)
 
        spin_lock_init(&rtc->lock);
 
+       /* get periodic/carry/alarm irqs */
        rtc->periodic_irq = platform_get_irq(pdev, 0);
        if (unlikely(rtc->periodic_irq < 0)) {
                dev_err(&pdev->dev, "No IRQ for period\n");
@@ -608,8 +640,48 @@ static int __devinit sh_rtc_probe(struct platform_device *pdev)
                rtc->capabilities |= pinfo->capabilities;
        }
 
+       rtc->rtc_dev->max_user_freq = 256;
+       rtc->rtc_dev->irq_freq = 1;
+       rtc->periodic_freq = 0x60;
+
        platform_set_drvdata(pdev, rtc);
 
+       /* register periodic/carry/alarm irqs */
+       ret = request_irq(rtc->periodic_irq, sh_rtc_periodic, IRQF_DISABLED,
+                         "sh-rtc period", rtc);
+       if (unlikely(ret)) {
+               dev_err(&pdev->dev,
+                       "request period IRQ failed with %d, IRQ %d\n", ret,
+                       rtc->periodic_irq);
+               goto err_badmap;
+       }
+
+       ret = request_irq(rtc->carry_irq, sh_rtc_interrupt, IRQF_DISABLED,
+                         "sh-rtc carry", rtc);
+       if (unlikely(ret)) {
+               dev_err(&pdev->dev,
+                       "request carry IRQ failed with %d, IRQ %d\n", ret,
+                       rtc->carry_irq);
+               free_irq(rtc->periodic_irq, rtc);
+               goto err_badmap;
+       }
+
+       ret = request_irq(rtc->alarm_irq, sh_rtc_alarm, IRQF_DISABLED,
+                         "sh-rtc alarm", rtc);
+       if (unlikely(ret)) {
+               dev_err(&pdev->dev,
+                       "request alarm IRQ failed with %d, IRQ %d\n", ret,
+                       rtc->alarm_irq);
+               free_irq(rtc->carry_irq, rtc);
+               free_irq(rtc->periodic_irq, rtc);
+               goto err_badmap;
+       }
+
+       tmp = readb(rtc->regbase + RCR1);
+       tmp &= ~RCR1_CF;
+       tmp |= RCR1_CIE;
+       writeb(tmp, rtc->regbase + RCR1);
+
        return 0;
 
 err_badmap:
@@ -630,6 +702,10 @@ static int __devexit sh_rtc_remove(struct platform_device *pdev)
        sh_rtc_setpie(&pdev->dev, 0);
        sh_rtc_setaie(&pdev->dev, 0);
 
+       free_irq(rtc->carry_irq, rtc);
+       free_irq(rtc->periodic_irq, rtc);
+       free_irq(rtc->alarm_irq, rtc);
+
        release_resource(rtc->res);
 
        platform_set_drvdata(pdev, NULL);
@@ -662,6 +738,8 @@ module_exit(sh_rtc_exit);
 
 MODULE_DESCRIPTION("SuperH on-chip RTC driver");
 MODULE_VERSION(DRV_VERSION);
-MODULE_AUTHOR("Paul Mundt <lethal@linux-sh.org>, Jamie Lenehan <lenehan@twibble.org>");
+MODULE_AUTHOR("Paul Mundt <lethal@linux-sh.org>, "
+             "Jamie Lenehan <lenehan@twibble.org>, "
+             "Angelo Castello <angelo.castello@st.com>");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("platform:" DRV_NAME);
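The rtc-sh.c changes above route periodic interrupts through sh_rtc_setfreq(), which maps the requested rate onto the RCR2 PES2-0 bits and marks the rates the hardware cannot produce directly with PF_HP, so that sh_rtc_periodic() swallows every other interrupt. As a rough illustration only -- the /dev/rtc0 node and the 64 Hz rate are assumptions, not part of the patch -- the new code can be exercised from user space through the standard rtc character-device ioctls:

/*
 * Hypothetical test: request a 64 Hz periodic interrupt from the
 * SuperH RTC through the generic rtc chardev interface.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/rtc.h>

int main(void)
{
        unsigned long data;
        int i, fd = open("/dev/rtc0", O_RDONLY);

        if (fd < 0) {
                perror("open");
                return 1;
        }
        /* 64 Hz maps to periodic_freq 0x20 (PF_HP clear) in sh_rtc_setfreq() */
        if (ioctl(fd, RTC_IRQP_SET, 64) < 0 || ioctl(fd, RTC_PIE_ON, 0) < 0) {
                perror("ioctl");
                return 1;
        }
        for (i = 0; i < 64; i++)
                read(fd, &data, sizeof(data));  /* blocks until the next tick */
        ioctl(fd, RTC_PIE_OFF, 0);
        close(fd);
        return 0;
}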
index eff593080d4fe8358886fd02e88672704d21bfa8..c2ea5d4df44a07c5aab6ae01bb86b9a73cade4d0 100644 (file)
@@ -333,7 +333,6 @@ static void sci_init_pins_scif(struct uart_port *port, unsigned int cflag)
        }
        sci_out(port, SCFCR, fcr_val);
 }
-
 #elif defined(CONFIG_CPU_SH3)
 /* For SH7705, SH7706, SH7707, SH7709, SH7709A, SH7729 */
 static void sci_init_pins_scif(struct uart_port *port, unsigned int cflag)
@@ -384,6 +383,12 @@ static void sci_init_pins_scif(struct uart_port *port, unsigned int cflag)
 
        sci_out(port, SCFCR, fcr_val);
 }
+#elif defined(CONFIG_CPU_SUBTYPE_SH7723)
+static void sci_init_pins_scif(struct uart_port *port, unsigned int cflag)
+{
+       /* Nothing to do here.. */
+       sci_out(port, SCFCR, 0);
+}
 #else
 /* For SH7750 */
 static void sci_init_pins_scif(struct uart_port *port, unsigned int cflag)
index 01a9dd715f5d5bc26c3dae1fe0ce71e334ccac04..fa8700a968fc4be01018a1a916e0660fed6e6530 100644 (file)
@@ -1,20 +1,5 @@
-/* $Id: sh-sci.h,v 1.4 2004/02/19 16:43:56 lethal Exp $
- *
- *  linux/drivers/serial/sh-sci.h
- *
- *  SuperH on-chip serial module support.  (SCI with no FIFO / with FIFO)
- *  Copyright (C) 1999, 2000  Niibe Yutaka
- *  Copyright (C) 2000  Greg Banks
- *  Copyright (C) 2002, 2003  Paul Mundt
- *  Modified to support multiple serial ports. Stuart Menefy (May 2000).
- *  Modified to support SH7300(SH-Mobile) SCIF. Takashi Kusuda (Jun 2003).
- *  Modified to support H8/300 Series Yoshinori Sato (Feb 2004).
- *  Removed SH7300 support (Jul 2007).
- *  Modified to support SH7720 SCIF. Markus Brunner, Mark Jonas (Aug 2007).
- */
 #include <linux/serial_core.h>
 #include <asm/io.h>
-
 #include <asm/gpio.h>
 
 #if defined(CONFIG_H83007) || defined(CONFIG_H83068)
 # define SCSPTR0               SCPDR0
 # define SCIF_ORER             0x0001  /* overrun error bit */
 # define SCSCR_INIT(port)      0x0038  /* TIE=0,RIE=0,TE=1,RE=1,REIE=1 */
+#elif defined(CONFIG_CPU_SUBTYPE_SH7723)
+# define SCSPTR0                0xa4050160
+# define SCSPTR1                0xa405013e
+# define SCSPTR2                0xa4050160
+# define SCSPTR3                0xa405013e
+# define SCSPTR4                0xa4050128
+# define SCSPTR5                0xa4050128
+# define SCIF_ORER              0x0001  /* overrun error bit */
+# define SCSCR_INIT(port)       0x0038  /* TIE=0,RIE=0,TE=1,RE=1,REIE=1 */
 # define SCIF_ONLY
 #elif defined(CONFIG_CPU_SUBTYPE_SH4_202)
 # define SCSPTR2 0xffe80020 /* 16 bit SCIF */
                  h8_sci_offset, h8_sci_size) \
   CPU_SCI_FNS(name, h8_sci_offset, h8_sci_size)
 #define SCIF_FNS(name, sh3_scif_offset, sh3_scif_size, sh4_scif_offset, sh4_scif_size)
+#elif defined(CONFIG_CPU_SUBTYPE_SH7723)
+        #define SCIx_FNS(name, sh4_scifa_offset, sh4_scifa_size, sh4_scif_offset, sh4_scif_size) \
+                CPU_SCIx_FNS(name, sh4_scifa_offset, sh4_scifa_size, sh4_scif_offset, sh4_scif_size)
+        #define SCIF_FNS(name, sh4_scif_offset, sh4_scif_size) \
+                CPU_SCIF_FNS(name, sh4_scif_offset, sh4_scif_size)
 #else
 #define SCIx_FNS(name, sh3_sci_offset, sh3_sci_size, sh4_sci_offset, sh4_sci_size, \
                 sh3_scif_offset, sh3_scif_size, sh4_scif_offset, sh4_scif_size, \
@@ -419,6 +418,18 @@ SCIF_FNS(SCFDR,  0x1c, 16)
 SCIF_FNS(SCxTDR, 0x20,  8)
 SCIF_FNS(SCxRDR, 0x24,  8)
 SCIF_FNS(SCLSR,  0x24, 16)
+#elif defined(CONFIG_CPU_SUBTYPE_SH7723)
+SCIx_FNS(SCSMR,  0x00, 16, 0x00, 16)
+SCIx_FNS(SCBRR,  0x04,  8, 0x04,  8)
+SCIx_FNS(SCSCR,  0x08, 16, 0x08, 16)
+SCIx_FNS(SCxTDR, 0x20,  8, 0x0c,  8)
+SCIx_FNS(SCxSR,  0x14, 16, 0x10, 16)
+SCIx_FNS(SCxRDR, 0x24,  8, 0x14,  8)
+SCIF_FNS(SCTDSR, 0x0c,  8)
+SCIF_FNS(SCFER,  0x10, 16)
+SCIF_FNS(SCFCR,  0x18, 16)
+SCIF_FNS(SCFDR,  0x1c, 16)
+SCIF_FNS(SCLSR,  0x24, 16)
 #else
 /*      reg      SCI/SH3   SCI/SH4  SCIF/SH3   SCIF/SH4  SCI/H8*/
 /*      name     off  sz   off  sz   off  sz   off  sz   off  sz*/
@@ -589,6 +600,23 @@ static inline int sci_rxd_in(struct uart_port *port)
                return ctrl_inb(SCPDR0) & 0x0001 ? 1 : 0; /* SCIF0 */
        return 1;
 }
+#elif defined(CONFIG_CPU_SUBTYPE_SH7723)
+static inline int sci_rxd_in(struct uart_port *port)
+{
+        if (port->mapbase == 0xffe00000)
+                return ctrl_inb(SCSPTR0) & 0x0008 ? 1 : 0; /* SCIF0 */
+        if (port->mapbase == 0xffe10000)
+                return ctrl_inb(SCSPTR1) & 0x0020 ? 1 : 0; /* SCIF1 */
+        if (port->mapbase == 0xffe20000)
+                return ctrl_inb(SCSPTR2) & 0x0001 ? 1 : 0; /* SCIF2 */
+        if (port->mapbase == 0xa4e30000)
+                return ctrl_inb(SCSPTR3) & 0x0001 ? 1 : 0; /* SCIF3 */
+        if (port->mapbase == 0xa4e40000)
+                return ctrl_inb(SCSPTR4) & 0x0001 ? 1 : 0; /* SCIF4 */
+        if (port->mapbase == 0xa4e50000)
+                return ctrl_inb(SCSPTR5) & 0x0008 ? 1 : 0; /* SCIF5 */
+        return 1;
+}
 #elif defined(CONFIG_CPU_SUBTYPE_SH5_101) || defined(CONFIG_CPU_SUBTYPE_SH5_103)
 static inline int sci_rxd_in(struct uart_port *port)
 {
@@ -727,6 +755,8 @@ static inline int sci_rxd_in(struct uart_port *port)
       defined(CONFIG_CPU_SUBTYPE_SH7720) || \
       defined(CONFIG_CPU_SUBTYPE_SH7721)
 #define SCBRR_VALUE(bps, clk) (((clk*2)+16*bps)/(32*bps)-1)
+#elif defined(CONFIG_CPU_SUBTYPE_SH7723)
+#define SCBRR_VALUE(bps, clk) (((clk*2)+16*bps)/(16*bps)-1)
 #elif defined(__H8300H__) || defined(__H8300S__)
 #define SCBRR_VALUE(bps) (((CONFIG_CPU_CLOCK*1000/32)/bps)-1)
 #elif defined(CONFIG_SUPERH64)
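The SH7723 case added above computes SCBRR with a /16 divisor where the other SCIF variants use /32, so the same peripheral clock yields roughly twice the register value. A stand-alone sketch of the arithmetic, assuming a 33.333 MHz peripheral clock and 115200 bps (both values are illustrative, not taken from the patch):

/* Compare the SH7723 baud-rate formula with the generic SCIF one. */
#include <stdio.h>

#define SCBRR_SH7723(bps, clk)  (((clk) * 2 + 16 * (bps)) / (16 * (bps)) - 1)
#define SCBRR_GENERIC(bps, clk) (((clk) * 2 + 16 * (bps)) / (32 * (bps)) - 1)

int main(void)
{
        unsigned long clk = 33333333UL, bps = 115200UL;

        printf("SH7723:  SCBRR = %lu\n", SCBRR_SH7723(bps, clk));   /* 36 */
        printf("generic: SCBRR = %lu\n", SCBRR_GENERIC(bps, clk));  /* 17 */
        return 0;
}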
index b8ea11fee5c6fc36e059f9dc873564e82a464ab0..de876fa793e1d140386f8bd9693785943f706c09 100644 (file)
@@ -12,6 +12,7 @@
 #include <linux/time.h>
 #include <linux/sched.h>
 #include <linux/compat.h>
+#include <linux/mount.h>
 #include <linux/smp_lock.h>
 #include <asm/current.h>
 #include <asm/uaccess.h>
@@ -23,6 +24,7 @@ long ext2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
        struct ext2_inode_info *ei = EXT2_I(inode);
        unsigned int flags;
        unsigned short rsv_window_size;
+       int ret;
 
        ext2_debug ("cmd = %u, arg = %lu\n", cmd, arg);
 
@@ -34,14 +36,19 @@ long ext2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
        case EXT2_IOC_SETFLAGS: {
                unsigned int oldflags;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
+               ret = mnt_want_write(filp->f_path.mnt);
+               if (ret)
+                       return ret;
 
-               if (!is_owner_or_cap(inode))
-                       return -EACCES;
+               if (!is_owner_or_cap(inode)) {
+                       ret = -EACCES;
+                       goto setflags_out;
+               }
 
-               if (get_user(flags, (int __user *) arg))
-                       return -EFAULT;
+               if (get_user(flags, (int __user *) arg)) {
+                       ret = -EFAULT;
+                       goto setflags_out;
+               }
 
                if (!S_ISDIR(inode->i_mode))
                        flags &= ~EXT2_DIRSYNC_FL;
@@ -50,7 +57,8 @@ long ext2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
                /* Is it quota file? Do not allow user to mess with it */
                if (IS_NOQUOTA(inode)) {
                        mutex_unlock(&inode->i_mutex);
-                       return -EPERM;
+                       ret = -EPERM;
+                       goto setflags_out;
                }
                oldflags = ei->i_flags;
 
@@ -63,7 +71,8 @@ long ext2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
                if ((flags ^ oldflags) & (EXT2_APPEND_FL | EXT2_IMMUTABLE_FL)) {
                        if (!capable(CAP_LINUX_IMMUTABLE)) {
                                mutex_unlock(&inode->i_mutex);
-                               return -EPERM;
+                               ret = -EPERM;
+                               goto setflags_out;
                        }
                }
 
@@ -75,20 +84,26 @@ long ext2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
                ext2_set_inode_flags(inode);
                inode->i_ctime = CURRENT_TIME_SEC;
                mark_inode_dirty(inode);
-               return 0;
+setflags_out:
+               mnt_drop_write(filp->f_path.mnt);
+               return ret;
        }
        case EXT2_IOC_GETVERSION:
                return put_user(inode->i_generation, (int __user *) arg);
        case EXT2_IOC_SETVERSION:
                if (!is_owner_or_cap(inode))
                        return -EPERM;
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-               if (get_user(inode->i_generation, (int __user *) arg))
-                       return -EFAULT; 
-               inode->i_ctime = CURRENT_TIME_SEC;
-               mark_inode_dirty(inode);
-               return 0;
+               ret = mnt_want_write(filp->f_path.mnt);
+               if (ret)
+                       return ret;
+               if (get_user(inode->i_generation, (int __user *) arg)) {
+                       ret = -EFAULT;
+               } else {
+                       inode->i_ctime = CURRENT_TIME_SEC;
+                       mark_inode_dirty(inode);
+               }
+               mnt_drop_write(filp->f_path.mnt);
+               return ret;
        case EXT2_IOC_GETRSVSZ:
                if (test_opt(inode->i_sb, RESERVATION)
                        && S_ISREG(inode->i_mode)
@@ -102,15 +117,16 @@ long ext2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
                if (!test_opt(inode->i_sb, RESERVATION) ||!S_ISREG(inode->i_mode))
                        return -ENOTTY;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-
-               if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
+               if (!is_owner_or_cap(inode))
                        return -EACCES;
 
                if (get_user(rsv_window_size, (int __user *)arg))
                        return -EFAULT;
 
+               ret = mnt_want_write(filp->f_path.mnt);
+               if (ret)
+                       return ret;
+
                if (rsv_window_size > EXT2_MAX_RESERVE_BLOCKS)
                        rsv_window_size = EXT2_MAX_RESERVE_BLOCKS;
 
@@ -131,6 +147,7 @@ long ext2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
                        rsv->rsv_goal_size = rsv_window_size;
                }
                mutex_unlock(&ei->truncate_mutex);
+               mnt_drop_write(filp->f_path.mnt);
                return 0;
        }
        default:
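The ext2 conversion above is the template for the other filesystem ioctl changes in this merge: rather than rejecting read-only inodes up front with IS_RDONLY(), the handler takes a write reference on the vfsmount with mnt_want_write() before touching the inode and balances it with mnt_drop_write() on every exit path. A minimal sketch of that pattern for a hypothetical filesystem -- the myfs names and the MYFS_SETTABLE_FLAGS mask are made up, the flow mirrors the EXT2_IOC_SETFLAGS hunk above:

/* Sketch only: mnt_want_write()/mnt_drop_write() pairing in an ioctl. */
#include <linux/fs.h>
#include <linux/mount.h>
#include <asm/uaccess.h>

#define MYFS_SETTABLE_FLAGS 0x000000ff  /* hypothetical */

static long myfs_ioc_setflags(struct file *filp, unsigned int __user *argp)
{
        struct inode *inode = filp->f_path.dentry->d_inode;
        unsigned int flags;
        int ret;

        ret = mnt_want_write(filp->f_path.mnt);  /* fails on r/o mounts */
        if (ret)
                return ret;

        if (!is_owner_or_cap(inode)) {
                ret = -EACCES;
                goto out_drop;
        }
        if (get_user(flags, argp)) {
                ret = -EFAULT;
                goto out_drop;
        }
        if (flags & ~MYFS_SETTABLE_FLAGS) {
                ret = -EOPNOTSUPP;
                goto out_drop;
        }

        /* ... apply flags, update i_ctime, mark_inode_dirty(inode) ... */

out_drop:
        mnt_drop_write(filp->f_path.mnt);        /* balance every exit path */
        return ret;
}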
index 023a070f55f18fac29cfddafa08557085ad6b2fa..0d0c70151642faf026c98d53b4179c6395e1b645 100644 (file)
@@ -12,6 +12,7 @@
 #include <linux/capability.h>
 #include <linux/ext3_fs.h>
 #include <linux/ext3_jbd.h>
+#include <linux/mount.h>
 #include <linux/time.h>
 #include <linux/compat.h>
 #include <linux/smp_lock.h>
@@ -38,14 +39,19 @@ int ext3_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
                unsigned int oldflags;
                unsigned int jflag;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
 
-               if (!is_owner_or_cap(inode))
-                       return -EACCES;
+               if (!is_owner_or_cap(inode)) {
+                       err = -EACCES;
+                       goto flags_out;
+               }
 
-               if (get_user(flags, (int __user *) arg))
-                       return -EFAULT;
+               if (get_user(flags, (int __user *) arg)) {
+                       err = -EFAULT;
+                       goto flags_out;
+               }
 
                if (!S_ISDIR(inode->i_mode))
                        flags &= ~EXT3_DIRSYNC_FL;
@@ -54,7 +60,8 @@ int ext3_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
                /* Is it quota file? Do not allow user to mess with it */
                if (IS_NOQUOTA(inode)) {
                        mutex_unlock(&inode->i_mutex);
-                       return -EPERM;
+                       err = -EPERM;
+                       goto flags_out;
                }
                oldflags = ei->i_flags;
 
@@ -70,7 +77,8 @@ int ext3_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
                if ((flags ^ oldflags) & (EXT3_APPEND_FL | EXT3_IMMUTABLE_FL)) {
                        if (!capable(CAP_LINUX_IMMUTABLE)) {
                                mutex_unlock(&inode->i_mutex);
-                               return -EPERM;
+                               err = -EPERM;
+                               goto flags_out;
                        }
                }
 
@@ -81,7 +89,8 @@ int ext3_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
                if ((jflag ^ oldflags) & (EXT3_JOURNAL_DATA_FL)) {
                        if (!capable(CAP_SYS_RESOURCE)) {
                                mutex_unlock(&inode->i_mutex);
-                               return -EPERM;
+                               err = -EPERM;
+                               goto flags_out;
                        }
                }
 
@@ -89,7 +98,8 @@ int ext3_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
                handle = ext3_journal_start(inode, 1);
                if (IS_ERR(handle)) {
                        mutex_unlock(&inode->i_mutex);
-                       return PTR_ERR(handle);
+                       err = PTR_ERR(handle);
+                       goto flags_out;
                }
                if (IS_SYNC(inode))
                        handle->h_sync = 1;
@@ -115,6 +125,8 @@ flags_err:
                if ((jflag ^ oldflags) & (EXT3_JOURNAL_DATA_FL))
                        err = ext3_change_inode_journal_flag(inode, jflag);
                mutex_unlock(&inode->i_mutex);
+flags_out:
+               mnt_drop_write(filp->f_path.mnt);
                return err;
        }
        case EXT3_IOC_GETVERSION:
@@ -129,14 +141,18 @@ flags_err:
 
                if (!is_owner_or_cap(inode))
                        return -EPERM;
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-               if (get_user(generation, (int __user *) arg))
-                       return -EFAULT;
-
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
+               if (get_user(generation, (int __user *) arg)) {
+                       err = -EFAULT;
+                       goto setversion_out;
+               }
                handle = ext3_journal_start(inode, 1);
-               if (IS_ERR(handle))
-                       return PTR_ERR(handle);
+               if (IS_ERR(handle)) {
+                       err = PTR_ERR(handle);
+                       goto setversion_out;
+               }
                err = ext3_reserve_inode_write(handle, inode, &iloc);
                if (err == 0) {
                        inode->i_ctime = CURRENT_TIME_SEC;
@@ -144,6 +160,8 @@ flags_err:
                        err = ext3_mark_iloc_dirty(handle, inode, &iloc);
                }
                ext3_journal_stop(handle);
+setversion_out:
+               mnt_drop_write(filp->f_path.mnt);
                return err;
        }
 #ifdef CONFIG_JBD_DEBUG
@@ -179,18 +197,24 @@ flags_err:
                }
                return -ENOTTY;
        case EXT3_IOC_SETRSVSZ: {
+               int err;
 
                if (!test_opt(inode->i_sb, RESERVATION) ||!S_ISREG(inode->i_mode))
                        return -ENOTTY;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
 
-               if (!is_owner_or_cap(inode))
-                       return -EACCES;
+               if (!is_owner_or_cap(inode)) {
+                       err = -EACCES;
+                       goto setrsvsz_out;
+               }
 
-               if (get_user(rsv_window_size, (int __user *)arg))
-                       return -EFAULT;
+               if (get_user(rsv_window_size, (int __user *)arg)) {
+                       err = -EFAULT;
+                       goto setrsvsz_out;
+               }
 
                if (rsv_window_size > EXT3_MAX_RESERVE_BLOCKS)
                        rsv_window_size = EXT3_MAX_RESERVE_BLOCKS;
@@ -208,7 +232,9 @@ flags_err:
                        rsv->rsv_goal_size = rsv_window_size;
                }
                mutex_unlock(&ei->truncate_mutex);
-               return 0;
+setrsvsz_out:
+               mnt_drop_write(filp->f_path.mnt);
+               return err;
        }
        case EXT3_IOC_GROUP_EXTEND: {
                ext3_fsblk_t n_blocks_count;
@@ -218,17 +244,20 @@ flags_err:
                if (!capable(CAP_SYS_RESOURCE))
                        return -EPERM;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-
-               if (get_user(n_blocks_count, (__u32 __user *)arg))
-                       return -EFAULT;
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
 
+               if (get_user(n_blocks_count, (__u32 __user *)arg)) {
+                       err = -EFAULT;
+                       goto group_extend_out;
+               }
                err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
                journal_lock_updates(EXT3_SB(sb)->s_journal);
                journal_flush(EXT3_SB(sb)->s_journal);
                journal_unlock_updates(EXT3_SB(sb)->s_journal);
-
+group_extend_out:
+               mnt_drop_write(filp->f_path.mnt);
                return err;
        }
        case EXT3_IOC_GROUP_ADD: {
@@ -239,18 +268,22 @@ flags_err:
                if (!capable(CAP_SYS_RESOURCE))
                        return -EPERM;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
 
                if (copy_from_user(&input, (struct ext3_new_group_input __user *)arg,
-                               sizeof(input)))
-                       return -EFAULT;
+                               sizeof(input))) {
+                       err = -EFAULT;
+                       goto group_add_out;
+               }
 
                err = ext3_group_add(sb, &input);
                journal_lock_updates(EXT3_SB(sb)->s_journal);
                journal_flush(EXT3_SB(sb)->s_journal);
                journal_unlock_updates(EXT3_SB(sb)->s_journal);
-
+group_add_out:
+               mnt_drop_write(filp->f_path.mnt);
                return err;
        }
 
index 2ed7c37f897e79f0b9c6d6af3a5c2aad79fa7f59..25b13ede8086c606a4b1320c2a364889434c7b7c 100644 (file)
@@ -15,6 +15,7 @@
 #include <linux/time.h>
 #include <linux/compat.h>
 #include <linux/smp_lock.h>
+#include <linux/mount.h>
 #include <asm/uaccess.h>
 
 int ext4_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
@@ -38,24 +39,25 @@ int ext4_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
                unsigned int oldflags;
                unsigned int jflag;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-
                if (!is_owner_or_cap(inode))
                        return -EACCES;
 
                if (get_user(flags, (int __user *) arg))
                        return -EFAULT;
 
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
+
                if (!S_ISDIR(inode->i_mode))
                        flags &= ~EXT4_DIRSYNC_FL;
 
+               err = -EPERM;
                mutex_lock(&inode->i_mutex);
                /* Is it quota file? Do not allow user to mess with it */
-               if (IS_NOQUOTA(inode)) {
-                       mutex_unlock(&inode->i_mutex);
-                       return -EPERM;
-               }
+               if (IS_NOQUOTA(inode))
+                       goto flags_out;
+
                oldflags = ei->i_flags;
 
                /* The JOURNAL_DATA flag is modifiable only by root */
@@ -68,10 +70,8 @@ int ext4_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
                 * This test looks nicer. Thanks to Pauline Middelink
                 */
                if ((flags ^ oldflags) & (EXT4_APPEND_FL | EXT4_IMMUTABLE_FL)) {
-                       if (!capable(CAP_LINUX_IMMUTABLE)) {
-                               mutex_unlock(&inode->i_mutex);
-                               return -EPERM;
-                       }
+                       if (!capable(CAP_LINUX_IMMUTABLE))
+                               goto flags_out;
                }
 
                /*
@@ -79,17 +79,14 @@ int ext4_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
                 * the relevant capability.
                 */
                if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
-                       if (!capable(CAP_SYS_RESOURCE)) {
-                               mutex_unlock(&inode->i_mutex);
-                               return -EPERM;
-                       }
+                       if (!capable(CAP_SYS_RESOURCE))
+                               goto flags_out;
                }
 
-
                handle = ext4_journal_start(inode, 1);
                if (IS_ERR(handle)) {
-                       mutex_unlock(&inode->i_mutex);
-                       return PTR_ERR(handle);
+                       err = PTR_ERR(handle);
+                       goto flags_out;
                }
                if (IS_SYNC(inode))
                        handle->h_sync = 1;
@@ -107,14 +104,14 @@ int ext4_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
                err = ext4_mark_iloc_dirty(handle, inode, &iloc);
 flags_err:
                ext4_journal_stop(handle);
-               if (err) {
-                       mutex_unlock(&inode->i_mutex);
-                       return err;
-               }
+               if (err)
+                       goto flags_out;
 
                if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL))
                        err = ext4_change_inode_journal_flag(inode, jflag);
+flags_out:
                mutex_unlock(&inode->i_mutex);
+               mnt_drop_write(filp->f_path.mnt);
                return err;
        }
        case EXT4_IOC_GETVERSION:
@@ -129,14 +126,20 @@ flags_err:
 
                if (!is_owner_or_cap(inode))
                        return -EPERM;
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-               if (get_user(generation, (int __user *) arg))
-                       return -EFAULT;
+
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
+               if (get_user(generation, (int __user *) arg)) {
+                       err = -EFAULT;
+                       goto setversion_out;
+               }
 
                handle = ext4_journal_start(inode, 1);
-               if (IS_ERR(handle))
-                       return PTR_ERR(handle);
+               if (IS_ERR(handle)) {
+                       err = PTR_ERR(handle);
+                       goto setversion_out;
+               }
                err = ext4_reserve_inode_write(handle, inode, &iloc);
                if (err == 0) {
                        inode->i_ctime = ext4_current_time(inode);
@@ -144,6 +147,8 @@ flags_err:
                        err = ext4_mark_iloc_dirty(handle, inode, &iloc);
                }
                ext4_journal_stop(handle);
+setversion_out:
+               mnt_drop_write(filp->f_path.mnt);
                return err;
        }
 #ifdef CONFIG_JBD2_DEBUG
@@ -179,19 +184,21 @@ flags_err:
                }
                return -ENOTTY;
        case EXT4_IOC_SETRSVSZ: {
+               int err;
 
                if (!test_opt(inode->i_sb, RESERVATION) ||!S_ISREG(inode->i_mode))
                        return -ENOTTY;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-
                if (!is_owner_or_cap(inode))
                        return -EACCES;
 
                if (get_user(rsv_window_size, (int __user *)arg))
                        return -EFAULT;
 
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
+
                if (rsv_window_size > EXT4_MAX_RESERVE_BLOCKS)
                        rsv_window_size = EXT4_MAX_RESERVE_BLOCKS;
 
@@ -208,6 +215,7 @@ flags_err:
                        rsv->rsv_goal_size = rsv_window_size;
                }
                up_write(&ei->i_data_sem);
+               mnt_drop_write(filp->f_path.mnt);
                return 0;
        }
        case EXT4_IOC_GROUP_EXTEND: {
@@ -218,16 +226,18 @@ flags_err:
                if (!capable(CAP_SYS_RESOURCE))
                        return -EPERM;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-
                if (get_user(n_blocks_count, (__u32 __user *)arg))
                        return -EFAULT;
 
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
+
                err = ext4_group_extend(sb, EXT4_SB(sb)->s_es, n_blocks_count);
                jbd2_journal_lock_updates(EXT4_SB(sb)->s_journal);
                jbd2_journal_flush(EXT4_SB(sb)->s_journal);
                jbd2_journal_unlock_updates(EXT4_SB(sb)->s_journal);
+               mnt_drop_write(filp->f_path.mnt);
 
                return err;
        }
@@ -239,17 +249,19 @@ flags_err:
                if (!capable(CAP_SYS_RESOURCE))
                        return -EPERM;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-
                if (copy_from_user(&input, (struct ext4_new_group_input __user *)arg,
                                sizeof(input)))
                        return -EFAULT;
 
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
+
                err = ext4_group_add(sb, &input);
                jbd2_journal_lock_updates(EXT4_SB(sb)->s_journal);
                jbd2_journal_flush(EXT4_SB(sb)->s_journal);
                jbd2_journal_unlock_updates(EXT4_SB(sb)->s_journal);
+               mnt_drop_write(filp->f_path.mnt);
 
                return err;
        }
index c614175876e09316ea0a36566dc072fe6819169c..2a3bed96704148c1377537677c4c7d9edd9372a3 100644 (file)
@@ -8,6 +8,7 @@
 
 #include <linux/capability.h>
 #include <linux/module.h>
+#include <linux/mount.h>
 #include <linux/time.h>
 #include <linux/msdos_fs.h>
 #include <linux/smp_lock.h>
@@ -46,10 +47,9 @@ int fat_generic_ioctl(struct inode *inode, struct file *filp,
 
                mutex_lock(&inode->i_mutex);
 
-               if (IS_RDONLY(inode)) {
-                       err = -EROFS;
-                       goto up;
-               }
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       goto up_no_drop_write;
 
                /*
                 * ATTR_VOLUME and ATTR_DIR cannot be changed; this also
@@ -105,7 +105,9 @@ int fat_generic_ioctl(struct inode *inode, struct file *filp,
 
                MSDOS_I(inode)->i_attrs = attr & ATTR_UNUSED;
                mark_inode_dirty(inode);
-       up:
+up:
+               mnt_drop_write(filp->f_path.mnt);
+up_no_drop_write:
                mutex_unlock(&inode->i_mutex);
                return err;
        }
index 986ff4ed0a7cbf56d258b290635c32203284c565..7a0a9b8722513faae34fc0f145c06f241f2630f7 100644 (file)
@@ -42,6 +42,7 @@ static inline void file_free_rcu(struct rcu_head *head)
 static inline void file_free(struct file *f)
 {
        percpu_counter_dec(&nr_files);
+       file_check_state(f);
        call_rcu(&f->f_u.fu_rcuhead, file_free_rcu);
 }
 
@@ -199,6 +200,18 @@ int init_file(struct file *file, struct vfsmount *mnt, struct dentry *dentry,
        file->f_mapping = dentry->d_inode->i_mapping;
        file->f_mode = mode;
        file->f_op = fop;
+
+       /*
+        * These mounts don't really matter in practice
+        * for r/o bind mounts.  They aren't userspace-
+        * visible.  We do this for consistency, and so
+        * that we can do debugging checks at __fput()
+        */
+       if ((mode & FMODE_WRITE) && !special_file(dentry->d_inode->i_mode)) {
+               file_take_write(file);
+               error = mnt_want_write(mnt);
+               WARN_ON(error);
+       }
        return error;
 }
 EXPORT_SYMBOL(init_file);
@@ -211,6 +224,31 @@ void fput(struct file *file)
 
 EXPORT_SYMBOL(fput);
 
+/**
+ * drop_file_write_access - give up ability to write to a file
+ * @file: the file to which we will stop writing
+ *
+ * This is a central place which will give up the ability
+ * to write to @file, along with access to write through
+ * its vfsmount.
+ */
+void drop_file_write_access(struct file *file)
+{
+       struct vfsmount *mnt = file->f_path.mnt;
+       struct dentry *dentry = file->f_path.dentry;
+       struct inode *inode = dentry->d_inode;
+
+       put_write_access(inode);
+
+       if (special_file(inode->i_mode))
+               return;
+       if (file_check_writeable(file) != 0)
+               return;
+       mnt_drop_write(mnt);
+       file_release_write(file);
+}
+EXPORT_SYMBOL_GPL(drop_file_write_access);
+
 /* __fput is called from task context when aio completion releases the last
  * last use of a struct file *.  Do not use otherwise.
  */
@@ -236,10 +274,10 @@ void __fput(struct file *file)
        if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL))
                cdev_put(inode->i_cdev);
        fops_put(file->f_op);
-       if (file->f_mode & FMODE_WRITE)
-               put_write_access(inode);
        put_pid(file->f_owner.pid);
        file_kill(file);
+       if (file->f_mode & FMODE_WRITE)
+               drop_file_write_access(file);
        file->f_path.dentry = NULL;
        file->f_path.mnt = NULL;
        file_free(file);
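In the file_table.c hunks above, init_file() now takes a vfsmount write reference for every writable, non-special file it sets up, and __fput() releases it through the new drop_file_write_access(), so each mount's writer count stays accurate for the whole lifetime of a struct file. One user-visible guarantee this accounting supports -- sketched below with assumed paths, and it needs root -- is that a filesystem cannot be flipped read-only while something on it is still open for writing:

/* Sketch: remounting read-only should fail with EBUSY while fd is open. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mount.h>

int main(void)
{
        int fd = open("/mnt/test/scratch", O_WRONLY | O_CREAT, 0644);

        if (fd < 0) {
                perror("open");
                return 1;
        }
        if (mount("none", "/mnt/test", NULL, MS_REMOUNT | MS_RDONLY, NULL))
                printf("remount ro: %s (EBUSY expected while writer exists)\n",
                       strerror(errno));
        close(fd);
        return 0;
}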
index b60c0affbec58af68e45fd063692a933949ad457..f457d2ca51ab8e68a3e60cd9028620fa029f8f18 100644 (file)
@@ -14,6 +14,7 @@
 
 #include <linux/capability.h>
 #include <linux/fs.h>
+#include <linux/mount.h>
 #include <linux/sched.h>
 #include <linux/xattr.h>
 #include <asm/uaccess.h>
@@ -35,25 +36,32 @@ int hfsplus_ioctl(struct inode *inode, struct file *filp, unsigned int cmd,
                        flags |= FS_NODUMP_FL; /* EXT2_NODUMP_FL */
                return put_user(flags, (int __user *)arg);
        case HFSPLUS_IOC_EXT2_SETFLAGS: {
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-
-               if (!is_owner_or_cap(inode))
-                       return -EACCES;
-
-               if (get_user(flags, (int __user *)arg))
-                       return -EFAULT;
-
+               int err = 0;
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
+
+               if (!is_owner_or_cap(inode)) {
+                       err = -EACCES;
+                       goto setflags_out;
+               }
+               if (get_user(flags, (int __user *)arg)) {
+                       err = -EFAULT;
+                       goto setflags_out;
+               }
                if (flags & (FS_IMMUTABLE_FL|FS_APPEND_FL) ||
                    HFSPLUS_I(inode).rootflags & (HFSPLUS_FLG_IMMUTABLE|HFSPLUS_FLG_APPEND)) {
-                       if (!capable(CAP_LINUX_IMMUTABLE))
-                               return -EPERM;
+                       if (!capable(CAP_LINUX_IMMUTABLE)) {
+                               err = -EPERM;
+                               goto setflags_out;
+                       }
                }
 
                /* don't silently ignore unsupported ext2 flags */
-               if (flags & ~(FS_IMMUTABLE_FL|FS_APPEND_FL|FS_NODUMP_FL))
-                       return -EOPNOTSUPP;
-
+               if (flags & ~(FS_IMMUTABLE_FL|FS_APPEND_FL|FS_NODUMP_FL)) {
+                       err = -EOPNOTSUPP;
+                       goto setflags_out;
+               }
                if (flags & FS_IMMUTABLE_FL) { /* EXT2_IMMUTABLE_FL */
                        inode->i_flags |= S_IMMUTABLE;
                        HFSPLUS_I(inode).rootflags |= HFSPLUS_FLG_IMMUTABLE;
@@ -75,7 +83,9 @@ int hfsplus_ioctl(struct inode *inode, struct file *filp, unsigned int cmd,
 
                inode->i_ctime = CURRENT_TIME_SEC;
                mark_inode_dirty(inode);
-               return 0;
+setflags_out:
+               mnt_drop_write(filp->f_path.mnt);
+               return err;
        }
        default:
                return -ENOTTY;
index 53245ffcf93dc8a4dd4d852257e5cc2e52ed64d6..27ee1af50d02c6537febbaed5cf7bd60debc832d 100644 (file)
@@ -1199,42 +1199,37 @@ void touch_atime(struct vfsmount *mnt, struct dentry *dentry)
        struct inode *inode = dentry->d_inode;
        struct timespec now;
 
-       if (inode->i_flags & S_NOATIME)
+       if (mnt_want_write(mnt))
                return;
+       if (inode->i_flags & S_NOATIME)
+               goto out;
        if (IS_NOATIME(inode))
-               return;
+               goto out;
        if ((inode->i_sb->s_flags & MS_NODIRATIME) && S_ISDIR(inode->i_mode))
-               return;
+               goto out;
 
-       /*
-        * We may have a NULL vfsmount when coming from NFSD
-        */
-       if (mnt) {
-               if (mnt->mnt_flags & MNT_NOATIME)
-                       return;
-               if ((mnt->mnt_flags & MNT_NODIRATIME) && S_ISDIR(inode->i_mode))
-                       return;
-
-               if (mnt->mnt_flags & MNT_RELATIME) {
-                       /*
-                        * With relative atime, only update atime if the
-                        * previous atime is earlier than either the ctime or
-                        * mtime.
-                        */
-                       if (timespec_compare(&inode->i_mtime,
-                                               &inode->i_atime) < 0 &&
-                           timespec_compare(&inode->i_ctime,
-                                               &inode->i_atime) < 0)
-                               return;
-               }
+       if (mnt->mnt_flags & MNT_NOATIME)
+               goto out;
+       if ((mnt->mnt_flags & MNT_NODIRATIME) && S_ISDIR(inode->i_mode))
+               goto out;
+       if (mnt->mnt_flags & MNT_RELATIME) {
+               /*
+                * With relative atime, only update atime if the previous
+                * atime is earlier than either the ctime or mtime.
+                */
+               if (timespec_compare(&inode->i_mtime, &inode->i_atime) < 0 &&
+                   timespec_compare(&inode->i_ctime, &inode->i_atime) < 0)
+                       goto out;
        }
 
        now = current_fs_time(inode->i_sb);
        if (timespec_equal(&inode->i_atime, &now))
-               return;
+               goto out;
 
        inode->i_atime = now;
        mark_inode_dirty_sync(inode);
+out:
+       mnt_drop_write(mnt);
 }
 EXPORT_SYMBOL(touch_atime);
 
@@ -1255,10 +1250,13 @@ void file_update_time(struct file *file)
        struct inode *inode = file->f_path.dentry->d_inode;
        struct timespec now;
        int sync_it = 0;
+       int err;
 
        if (IS_NOCMTIME(inode))
                return;
-       if (IS_RDONLY(inode))
+
+       err = mnt_want_write(file->f_path.mnt);
+       if (err)
                return;
 
        now = current_fs_time(inode->i_sb);
@@ -1279,6 +1277,7 @@ void file_update_time(struct file *file)
 
        if (sync_it)
                mark_inode_dirty_sync(inode);
+       mnt_drop_write(file->f_path.mnt);
 }
 
 EXPORT_SYMBOL(file_update_time);
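The touch_atime() rewrite above takes the mount write reference first and quietly bails out when the mount refuses writers, but it keeps the relatime rule unchanged: with MNT_RELATIME the atime is refreshed only when it does not already post-date both the mtime and the ctime. A small user-space helper that mirrors that comparison (illustrative only, not kernel code):

/* Mirrors the kernel check: skip the update only if atime is newest. */
#include <stdbool.h>
#include <time.h>

static bool ts_before(const struct timespec *a, const struct timespec *b)
{
        return a->tv_sec < b->tv_sec ||
               (a->tv_sec == b->tv_sec && a->tv_nsec < b->tv_nsec);
}

static bool relatime_needs_update(const struct timespec *atime,
                                  const struct timespec *mtime,
                                  const struct timespec *ctime)
{
        return !(ts_before(mtime, atime) && ts_before(ctime, atime));
}

int main(void)
{
        struct timespec a = { 100, 0 }, m = { 99, 0 }, c = { 98, 0 };

        /* atime newer than both mtime and ctime -> no update under relatime */
        return relatime_needs_update(&a, &m, &c) ? 1 : 0;
}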
index a1f8e375ad2114f2e75669f7a817bdf20baa6fdc..afe222bf300fc90f0931492b45969c1d09fcc9bc 100644 (file)
@@ -8,6 +8,7 @@
 #include <linux/fs.h>
 #include <linux/ctype.h>
 #include <linux/capability.h>
+#include <linux/mount.h>
 #include <linux/time.h>
 #include <linux/sched.h>
 #include <asm/current.h>
@@ -65,23 +66,30 @@ long jfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
                return put_user(flags, (int __user *) arg);
        case JFS_IOC_SETFLAGS: {
                unsigned int oldflags;
+               int err;
 
-               if (IS_RDONLY(inode))
-                       return -EROFS;
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
 
-               if (!is_owner_or_cap(inode))
-                       return -EACCES;
-
-               if (get_user(flags, (int __user *) arg))
-                       return -EFAULT;
+               if (!is_owner_or_cap(inode)) {
+                       err = -EACCES;
+                       goto setflags_out;
+               }
+               if (get_user(flags, (int __user *) arg)) {
+                       err = -EFAULT;
+                       goto setflags_out;
+               }
 
                flags = jfs_map_ext2(flags, 1);
                if (!S_ISDIR(inode->i_mode))
                        flags &= ~JFS_DIRSYNC_FL;
 
                /* Is it quota file? Do not allow user to mess with it */
-               if (IS_NOQUOTA(inode))
-                       return -EPERM;
+               if (IS_NOQUOTA(inode)) {
+                       err = -EPERM;
+                       goto setflags_out;
+               }
 
                /* Lock against other parallel changes of flags */
                mutex_lock(&inode->i_mutex);
@@ -98,7 +106,8 @@ long jfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
                        (JFS_APPEND_FL | JFS_IMMUTABLE_FL))) {
                        if (!capable(CAP_LINUX_IMMUTABLE)) {
                                mutex_unlock(&inode->i_mutex);
-                               return -EPERM;
+                               err = -EPERM;
+                               goto setflags_out;
                        }
                }
 
@@ -110,7 +119,9 @@ long jfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
                mutex_unlock(&inode->i_mutex);
                inode->i_ctime = CURRENT_TIME_SEC;
                mark_inode_dirty(inode);
-               return 0;
+setflags_out:
+               mnt_drop_write(filp->f_path.mnt);
+               return err;
        }
        default:
                return -ENOTTY;
index 8cf9bb9c2fc0b0133d85f0f6ff0f3e35a9fb8e07..e179f71bfcb058df613f83cb1ff923d47ffc4618 100644 (file)
@@ -1623,8 +1623,7 @@ int may_open(struct nameidata *nd, int acc_mode, int flag)
                        return -EACCES;
 
                flag &= ~O_TRUNC;
-       } else if (IS_RDONLY(inode) && (acc_mode & MAY_WRITE))
-               return -EROFS;
+       }
 
        error = vfs_permission(nd, acc_mode);
        if (error)
@@ -1677,7 +1676,12 @@ int may_open(struct nameidata *nd, int acc_mode, int flag)
        return 0;
 }
 
-static int open_namei_create(struct nameidata *nd, struct path *path,
+/*
+ * Be careful about ever adding any more callers of this
+ * function.  Its flags must be in the namei format, not
+ * what gets passed to sys_open().
+ */
+static int __open_namei_create(struct nameidata *nd, struct path *path,
                                int flag, int mode)
 {
        int error;
@@ -1696,26 +1700,56 @@ static int open_namei_create(struct nameidata *nd, struct path *path,
 }
 
 /*
- *     open_namei()
+ * Note that while the flag value (low two bits) for sys_open means:
+ *     00 - read-only
+ *     01 - write-only
+ *     10 - read-write
+ *     11 - special
+ * it is changed into
+ *     00 - no permissions needed
+ *     01 - read-permission
+ *     10 - write-permission
+ *     11 - read-write
+ * for the internal routines (ie open_namei()/follow_link() etc)
+ * This is more logical, and also allows the 00 "no perm needed"
+ * to be used for symlinks (where the permissions are checked
+ * later).
  *
- * namei for open - this is in fact almost the whole open-routine.
- *
- * Note that the low bits of "flag" aren't the same as in the open
- * system call - they are 00 - no permissions needed
- *                       01 - read permission needed
- *                       10 - write permission needed
- *                       11 - read/write permissions needed
- * which is a lot more logical, and also allows the "no perm" needed
- * for symlinks (where the permissions are checked later).
- * SMP-safe
+*/
+static inline int open_to_namei_flags(int flag)
+{
+       if ((flag+1) & O_ACCMODE)
+               flag++;
+       return flag;
+}
+
+static int open_will_write_to_fs(int flag, struct inode *inode)
+{
+       /*
+        * We'll never write to the fs underlying
+        * a device file.
+        */
+       if (special_file(inode->i_mode))
+               return 0;
+       return (flag & O_TRUNC);
+}
+
+/*
+ * Note that the low bits of the passed in "open_flag"
+ * are not the same as in the local variable "flag". See
+ * open_to_namei_flags() for more details.
  */
-int open_namei(int dfd, const char *pathname, int flag,
-               int mode, struct nameidata *nd)
+struct file *do_filp_open(int dfd, const char *pathname,
+               int open_flag, int mode)
 {
+       struct file *filp;
+       struct nameidata nd;
        int acc_mode, error;
        struct path path;
        struct dentry *dir;
        int count = 0;
+       int will_write;
+       int flag = open_to_namei_flags(open_flag);
 
        acc_mode = ACC_MODE(flag);
 
@@ -1733,18 +1767,19 @@ int open_namei(int dfd, const char *pathname, int flag,
         */
        if (!(flag & O_CREAT)) {
                error = path_lookup_open(dfd, pathname, lookup_flags(flag),
-                                        nd, flag);
+                                        &nd, flag);
                if (error)
-                       return error;
+                       return ERR_PTR(error);
                goto ok;
        }
 
        /*
         * Create - we need to know the parent.
         */
-       error = path_lookup_create(dfd,pathname,LOOKUP_PARENT,nd,flag,mode);
+       error = path_lookup_create(dfd, pathname, LOOKUP_PARENT,
+                                  &nd, flag, mode);
        if (error)
-               return error;
+               return ERR_PTR(error);
 
        /*
         * We have the parent and last component. First of all, check
@@ -1752,14 +1787,14 @@ int open_namei(int dfd, const char *pathname, int flag,
         * will not do.
         */
        error = -EISDIR;
-       if (nd->last_type != LAST_NORM || nd->last.name[nd->last.len])
+       if (nd.last_type != LAST_NORM || nd.last.name[nd.last.len])
                goto exit;
 
-       dir = nd->path.dentry;
-       nd->flags &= ~LOOKUP_PARENT;
+       dir = nd.path.dentry;
+       nd.flags &= ~LOOKUP_PARENT;
        mutex_lock(&dir->d_inode->i_mutex);
-       path.dentry = lookup_hash(nd);
-       path.mnt = nd->path.mnt;
+       path.dentry = lookup_hash(&nd);
+       path.mnt = nd.path.mnt;
 
 do_last:
        error = PTR_ERR(path.dentry);
@@ -1768,18 +1803,31 @@ do_last:
                goto exit;
        }
 
-       if (IS_ERR(nd->intent.open.file)) {
-               mutex_unlock(&dir->d_inode->i_mutex);
-               error = PTR_ERR(nd->intent.open.file);
-               goto exit_dput;
+       if (IS_ERR(nd.intent.open.file)) {
+               error = PTR_ERR(nd.intent.open.file);
+               goto exit_mutex_unlock;
        }
 
        /* Negative dentry, just create the file */
        if (!path.dentry->d_inode) {
-               error = open_namei_create(nd, &path, flag, mode);
+               /*
+                * This write is needed to ensure that a
+                * ro->rw transition does not occur between
+                * the time when the file is created and when
+                * a permanent write count is taken through
+                * the 'struct file' in nameidata_to_filp().
+                */
+               error = mnt_want_write(nd.path.mnt);
                if (error)
+                       goto exit_mutex_unlock;
+               error = __open_namei_create(&nd, &path, flag, mode);
+               if (error) {
+                       mnt_drop_write(nd.path.mnt);
                        goto exit;
-               return 0;
+               }
+               filp = nameidata_to_filp(&nd, open_flag);
+               mnt_drop_write(nd.path.mnt);
+               return filp;
        }
 
        /*
@@ -1804,23 +1852,52 @@ do_last:
        if (path.dentry->d_inode->i_op && path.dentry->d_inode->i_op->follow_link)
                goto do_link;
 
-       path_to_nameidata(&path, nd);
+       path_to_nameidata(&path, &nd);
        error = -EISDIR;
        if (path.dentry->d_inode && S_ISDIR(path.dentry->d_inode->i_mode))
                goto exit;
 ok:
-       error = may_open(nd, acc_mode, flag);
-       if (error)
+       /*
+        * Consider:
+        * 1. may_open() truncates a file
+        * 2. a rw->ro mount transition occurs
+        * 3. nameidata_to_filp() fails due to
+        *    the ro mount.
+        * That would be inconsistent, and should
+        * be avoided. Taking this mnt write here
+        * ensures that (2) can not occur.
+        */
+       will_write = open_will_write_to_fs(flag, nd.path.dentry->d_inode);
+       if (will_write) {
+               error = mnt_want_write(nd.path.mnt);
+               if (error)
+                       goto exit;
+       }
+       error = may_open(&nd, acc_mode, flag);
+       if (error) {
+               if (will_write)
+                       mnt_drop_write(nd.path.mnt);
                goto exit;
-       return 0;
+       }
+       filp = nameidata_to_filp(&nd, open_flag);
+       /*
+        * It is now safe to drop the mnt write
+        * because the filp has had a write taken
+        * on its behalf.
+        */
+       if (will_write)
+               mnt_drop_write(nd.path.mnt);
+       return filp;
 
+exit_mutex_unlock:
+       mutex_unlock(&dir->d_inode->i_mutex);
 exit_dput:
-       path_put_conditional(&path, nd);
+       path_put_conditional(&path, &nd);
 exit:
-       if (!IS_ERR(nd->intent.open.file))
-               release_open_intent(nd);
-       path_put(&nd->path);
-       return error;
+       if (!IS_ERR(nd.intent.open.file))
+               release_open_intent(&nd);
+       path_put(&nd.path);
+       return ERR_PTR(error);
 
 do_link:
        error = -ELOOP;
@@ -1836,42 +1913,59 @@ do_link:
         * stored in nd->last.name and we will have to putname() it when we
         * are done. Procfs-like symlinks just set LAST_BIND.
         */
-       nd->flags |= LOOKUP_PARENT;
-       error = security_inode_follow_link(path.dentry, nd);
+       nd.flags |= LOOKUP_PARENT;
+       error = security_inode_follow_link(path.dentry, &nd);
        if (error)
                goto exit_dput;
-       error = __do_follow_link(&path, nd);
+       error = __do_follow_link(&path, &nd);
        if (error) {
                /* Does someone understand code flow here? Or it is only
                 * me so stupid? Anathema to whoever designed this non-sense
                 * with "intent.open".
                 */
-               release_open_intent(nd);
-               return error;
+               release_open_intent(&nd);
+               return ERR_PTR(error);
        }
-       nd->flags &= ~LOOKUP_PARENT;
-       if (nd->last_type == LAST_BIND)
+       nd.flags &= ~LOOKUP_PARENT;
+       if (nd.last_type == LAST_BIND)
                goto ok;
        error = -EISDIR;
-       if (nd->last_type != LAST_NORM)
+       if (nd.last_type != LAST_NORM)
                goto exit;
-       if (nd->last.name[nd->last.len]) {
-               __putname(nd->last.name);
+       if (nd.last.name[nd.last.len]) {
+               __putname(nd.last.name);
                goto exit;
        }
        error = -ELOOP;
        if (count++==32) {
-               __putname(nd->last.name);
+               __putname(nd.last.name);
                goto exit;
        }
-       dir = nd->path.dentry;
+       dir = nd.path.dentry;
        mutex_lock(&dir->d_inode->i_mutex);
-       path.dentry = lookup_hash(nd);
-       path.mnt = nd->path.mnt;
-       __putname(nd->last.name);
+       path.dentry = lookup_hash(&nd);
+       path.mnt = nd.path.mnt;
+       __putname(nd.last.name);
        goto do_last;
 }
 
+/**
+ * filp_open - open file and return file pointer
+ *
+ * @filename:  path to open
+ * @flags:     open flags as per the open(2) second argument
+ * @mode:      mode for the new file if O_CREAT is set, else ignored
+ *
+ * This is the helper to open a file from kernelspace if you really
+ * have to.  But in general you should not do this, so please move
+ * along, nothing to see here..
+ */
+struct file *filp_open(const char *filename, int flags, int mode)
+{
+       return do_filp_open(AT_FDCWD, filename, flags, mode);
+}
+EXPORT_SYMBOL(filp_open);
+
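For reference, a minimal sketch of how a kernel-space caller might use the filp_open() wrapper above; the path and the read step are purely illustrative:

	static int example_read_config(void)
	{
		struct file *filp;

		/* illustrative path; O_RDONLY, so no mount write count is taken */
		filp = filp_open("/etc/example.conf", O_RDONLY, 0);
		if (IS_ERR(filp))
			return PTR_ERR(filp);

		/* ... read the contents, e.g. via kernel_read() ... */

		return filp_close(filp, NULL);
	}
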
 /**
  * lookup_create - lookup a dentry, creating it if it doesn't exist
  * @nd: nameidata info
@@ -1945,6 +2039,23 @@ int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
        return error;
 }
 
+static int may_mknod(mode_t mode)
+{
+       switch (mode & S_IFMT) {
+       case S_IFREG:
+       case S_IFCHR:
+       case S_IFBLK:
+       case S_IFIFO:
+       case S_IFSOCK:
+       case 0: /* zero mode translates to S_IFREG */
+               return 0;
+       case S_IFDIR:
+               return -EPERM;
+       default:
+               return -EINVAL;
+       }
+}
+
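Illustrative expectations for the may_mknod() helper above, written as a hypothetical self-check; the BUG_ON()s only restate the switch cases:

	static void may_mknod_examples(void)
	{
		BUG_ON(may_mknod(S_IFREG | 0644) != 0);      /* regular files allowed */
		BUG_ON(may_mknod(0600) != 0);                /* zero type means S_IFREG */
		BUG_ON(may_mknod(S_IFCHR | 0600) != 0);      /* device nodes allowed */
		BUG_ON(may_mknod(S_IFDIR | 0755) != -EPERM); /* directories need mkdir() */
		BUG_ON(may_mknod(S_IFMT) != -EINVAL);        /* not a valid file type */
	}
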
 asmlinkage long sys_mknodat(int dfd, const char __user *filename, int mode,
                                unsigned dev)
 {
@@ -1963,12 +2074,19 @@ asmlinkage long sys_mknodat(int dfd, const char __user *filename, int mode,
        if (error)
                goto out;
        dentry = lookup_create(&nd, 0);
-       error = PTR_ERR(dentry);
-
+       if (IS_ERR(dentry)) {
+               error = PTR_ERR(dentry);
+               goto out_unlock;
+       }
        if (!IS_POSIXACL(nd.path.dentry->d_inode))
                mode &= ~current->fs->umask;
-       if (!IS_ERR(dentry)) {
-               switch (mode & S_IFMT) {
+       error = may_mknod(mode);
+       if (error)
+               goto out_dput;
+       error = mnt_want_write(nd.path.mnt);
+       if (error)
+               goto out_dput;
+       switch (mode & S_IFMT) {
                case 0: case S_IFREG:
                        error = vfs_create(nd.path.dentry->d_inode,dentry,mode,&nd);
                        break;
@@ -1979,14 +2097,11 @@ asmlinkage long sys_mknodat(int dfd, const char __user *filename, int mode,
                case S_IFIFO: case S_IFSOCK:
                        error = vfs_mknod(nd.path.dentry->d_inode,dentry,mode,0);
                        break;
-               case S_IFDIR:
-                       error = -EPERM;
-                       break;
-               default:
-                       error = -EINVAL;
-               }
-               dput(dentry);
        }
+       mnt_drop_write(nd.path.mnt);
+out_dput:
+       dput(dentry);
+out_unlock:
        mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
        path_put(&nd.path);
 out:
@@ -2044,7 +2159,12 @@ asmlinkage long sys_mkdirat(int dfd, const char __user *pathname, int mode)
 
        if (!IS_POSIXACL(nd.path.dentry->d_inode))
                mode &= ~current->fs->umask;
+       error = mnt_want_write(nd.path.mnt);
+       if (error)
+               goto out_dput;
        error = vfs_mkdir(nd.path.dentry->d_inode, dentry, mode);
+       mnt_drop_write(nd.path.mnt);
+out_dput:
        dput(dentry);
 out_unlock:
        mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
@@ -2151,7 +2271,12 @@ static long do_rmdir(int dfd, const char __user *pathname)
        error = PTR_ERR(dentry);
        if (IS_ERR(dentry))
                goto exit2;
+       error = mnt_want_write(nd.path.mnt);
+       if (error)
+               goto exit3;
        error = vfs_rmdir(nd.path.dentry->d_inode, dentry);
+       mnt_drop_write(nd.path.mnt);
+exit3:
        dput(dentry);
 exit2:
        mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
@@ -2232,7 +2357,11 @@ static long do_unlinkat(int dfd, const char __user *pathname)
                inode = dentry->d_inode;
                if (inode)
                        atomic_inc(&inode->i_count);
+               error = mnt_want_write(nd.path.mnt);
+               if (error)
+                       goto exit2;
                error = vfs_unlink(nd.path.dentry->d_inode, dentry);
+               mnt_drop_write(nd.path.mnt);
        exit2:
                dput(dentry);
        }
@@ -2313,7 +2442,12 @@ asmlinkage long sys_symlinkat(const char __user *oldname,
        if (IS_ERR(dentry))
                goto out_unlock;
 
+       error = mnt_want_write(nd.path.mnt);
+       if (error)
+               goto out_dput;
        error = vfs_symlink(nd.path.dentry->d_inode, dentry, from, S_IALLUGO);
+       mnt_drop_write(nd.path.mnt);
+out_dput:
        dput(dentry);
 out_unlock:
        mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
@@ -2408,7 +2542,12 @@ asmlinkage long sys_linkat(int olddfd, const char __user *oldname,
        error = PTR_ERR(new_dentry);
        if (IS_ERR(new_dentry))
                goto out_unlock;
+       error = mnt_want_write(nd.path.mnt);
+       if (error)
+               goto out_dput;
        error = vfs_link(old_nd.path.dentry, nd.path.dentry->d_inode, new_dentry);
+       mnt_drop_write(nd.path.mnt);
+out_dput:
        dput(new_dentry);
 out_unlock:
        mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
@@ -2634,8 +2773,12 @@ static int do_rename(int olddfd, const char *oldname,
        if (new_dentry == trap)
                goto exit5;
 
+       error = mnt_want_write(oldnd.path.mnt);
+       if (error)
+               goto exit5;
        error = vfs_rename(old_dir->d_inode, old_dentry,
                                   new_dir->d_inode, new_dentry);
+       mnt_drop_write(oldnd.path.mnt);
 exit5:
        dput(new_dentry);
 exit4:
index 94f026ec990ae24548eb02754f221b844abe0c2b..678f7ce060f2d69addc3af5110f6048b322dead7 100644 (file)
@@ -17,6 +17,7 @@
 #include <linux/quotaops.h>
 #include <linux/acct.h>
 #include <linux/capability.h>
+#include <linux/cpumask.h>
 #include <linux/module.h>
 #include <linux/sysfs.h>
 #include <linux/seq_file.h>
@@ -55,6 +56,8 @@ static inline unsigned long hash(struct vfsmount *mnt, struct dentry *dentry)
        return tmp & (HASH_SIZE - 1);
 }
 
+#define MNT_WRITER_UNDERFLOW_LIMIT -(1<<16)
+
 struct vfsmount *alloc_vfsmnt(const char *name)
 {
        struct vfsmount *mnt = kmem_cache_zalloc(mnt_cache, GFP_KERNEL);
@@ -68,6 +71,7 @@ struct vfsmount *alloc_vfsmnt(const char *name)
                INIT_LIST_HEAD(&mnt->mnt_share);
                INIT_LIST_HEAD(&mnt->mnt_slave_list);
                INIT_LIST_HEAD(&mnt->mnt_slave);
+               atomic_set(&mnt->__mnt_writers, 0);
                if (name) {
                        int size = strlen(name) + 1;
                        char *newname = kmalloc(size, GFP_KERNEL);
@@ -80,6 +84,263 @@ struct vfsmount *alloc_vfsmnt(const char *name)
        return mnt;
 }
 
+/*
+ * Most r/o checks on a fs are for operations that take
+ * discrete amounts of time, like a write() or unlink().
+ * We must keep track of when those operations start
+ * (for permission checks) and when they end, so that
+ * we can determine when writes are able to occur to
+ * a filesystem.
+ */
+/*
+ * __mnt_is_readonly: check whether a mount is read-only
+ * @mnt: the mount to check for its write status
+ *
+ * This shouldn't be used directly outside of the VFS.
+ * It does not guarantee that the filesystem will stay
+ * r/w, just that it is right *now*.  This can not and
+ * should not be used in place of IS_RDONLY(inode).
+ * mnt_want/drop_write() will _keep_ the filesystem
+ * r/w.
+ */
+int __mnt_is_readonly(struct vfsmount *mnt)
+{
+       if (mnt->mnt_flags & MNT_READONLY)
+               return 1;
+       if (mnt->mnt_sb->s_flags & MS_RDONLY)
+               return 1;
+       return 0;
+}
+EXPORT_SYMBOL_GPL(__mnt_is_readonly);
+
+struct mnt_writer {
+       /*
+        * If holding multiple instances of this lock, they
+        * must be ordered by cpu number.
+        */
+       spinlock_t lock;
+       struct lock_class_key lock_class; /* compiles out with !lockdep */
+       unsigned long count;
+       struct vfsmount *mnt;
+} ____cacheline_aligned_in_smp;
+static DEFINE_PER_CPU(struct mnt_writer, mnt_writers);
+
+static int __init init_mnt_writers(void)
+{
+       int cpu;
+       for_each_possible_cpu(cpu) {
+               struct mnt_writer *writer = &per_cpu(mnt_writers, cpu);
+               spin_lock_init(&writer->lock);
+               lockdep_set_class(&writer->lock, &writer->lock_class);
+               writer->count = 0;
+       }
+       return 0;
+}
+fs_initcall(init_mnt_writers);
+
+static void unlock_mnt_writers(void)
+{
+       int cpu;
+       struct mnt_writer *cpu_writer;
+
+       for_each_possible_cpu(cpu) {
+               cpu_writer = &per_cpu(mnt_writers, cpu);
+               spin_unlock(&cpu_writer->lock);
+       }
+}
+
+static inline void __clear_mnt_count(struct mnt_writer *cpu_writer)
+{
+       if (!cpu_writer->mnt)
+               return;
+       /*
+        * This is in case anyone ever leaves an invalid,
+        * old ->mnt and a count of 0.
+        */
+       if (!cpu_writer->count)
+               return;
+       atomic_add(cpu_writer->count, &cpu_writer->mnt->__mnt_writers);
+       cpu_writer->count = 0;
+}
+/*
+ * must hold cpu_writer->lock
+ */
+static inline void use_cpu_writer_for_mount(struct mnt_writer *cpu_writer,
+                                         struct vfsmount *mnt)
+{
+       if (cpu_writer->mnt == mnt)
+               return;
+       __clear_mnt_count(cpu_writer);
+       cpu_writer->mnt = mnt;
+}
+
+/*
+ * Most r/o checks on a fs are for operations that take
+ * discrete amounts of time, like a write() or unlink().
+ * We must keep track of when those operations start
+ * (for permission checks) and when they end, so that
+ * we can determine when writes are able to occur to
+ * a filesystem.
+ */
+/**
+ * mnt_want_write - get write access to a mount
+ * @mnt: the mount on which to take a write
+ *
+ * This tells the low-level filesystem that a write is
+ * about to be performed to it, and makes sure that
+ * writes are allowed before returning success.  When
+ * the write operation is finished, mnt_drop_write()
+ * must be called.  This is effectively a refcount.
+ */
+int mnt_want_write(struct vfsmount *mnt)
+{
+       int ret = 0;
+       struct mnt_writer *cpu_writer;
+
+       cpu_writer = &get_cpu_var(mnt_writers);
+       spin_lock(&cpu_writer->lock);
+       if (__mnt_is_readonly(mnt)) {
+               ret = -EROFS;
+               goto out;
+       }
+       use_cpu_writer_for_mount(cpu_writer, mnt);
+       cpu_writer->count++;
+out:
+       spin_unlock(&cpu_writer->lock);
+       put_cpu_var(mnt_writers);
+       return ret;
+}
+EXPORT_SYMBOL_GPL(mnt_want_write);
+
+static void lock_mnt_writers(void)
+{
+       int cpu;
+       struct mnt_writer *cpu_writer;
+
+       for_each_possible_cpu(cpu) {
+               cpu_writer = &per_cpu(mnt_writers, cpu);
+               spin_lock(&cpu_writer->lock);
+               __clear_mnt_count(cpu_writer);
+               cpu_writer->mnt = NULL;
+       }
+}
+
+/*
+ * These per-cpu write counts are not guaranteed to have
+ * matched increments and decrements on any given cpu.
+ * A file open()ed for write on one cpu and close()d on
+ * another cpu will imbalance this count.  Make sure it
+ * does not get too far out of whack.
+ */
+static void handle_write_count_underflow(struct vfsmount *mnt)
+{
+       if (atomic_read(&mnt->__mnt_writers) >=
+           MNT_WRITER_UNDERFLOW_LIMIT)
+               return;
+       /*
+        * It isn't necessary to hold all of the locks
+        * at the same time, but doing it this way makes
+        * us share a lot more code.
+        */
+       lock_mnt_writers();
+       /*
+        * vfsmount_lock is for mnt_flags.
+        */
+       spin_lock(&vfsmount_lock);
+       /*
+        * If coalescing the per-cpu writer counts did not
+        * get us back to a positive writer count, we have
+        * a bug.
+        */
+       if ((atomic_read(&mnt->__mnt_writers) < 0) &&
+           !(mnt->mnt_flags & MNT_IMBALANCED_WRITE_COUNT)) {
+               printk(KERN_DEBUG "leak detected on mount(%p) writers "
+                               "count: %d\n",
+                       mnt, atomic_read(&mnt->__mnt_writers));
+               WARN_ON(1);
+               /* use the flag to keep the dmesg spam down */
+               mnt->mnt_flags |= MNT_IMBALANCED_WRITE_COUNT;
+       }
+       spin_unlock(&vfsmount_lock);
+       unlock_mnt_writers();
+}
+
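A short worked example of the cross-cpu imbalance described in the comment above, assuming a single writer on a single mount:

	/*
	 *   mnt_want_write() on CPU0:  CPU0's per-cpu count becomes 1,
	 *                              __mnt_writers stays 0
	 *   mnt_drop_write() on CPU1:  CPU1's per-cpu count is 0, so
	 *                              __mnt_writers is decremented to -1
	 *                              (still far above the -(1<<16) limit,
	 *                              so handle_write_count_underflow()
	 *                              returns immediately)
	 *   a later lock_mnt_writers() -- e.g. from mnt_make_readonly() --
	 *   folds CPU0's count of 1 back in, restoring __mnt_writers to 0.
	 */
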
+/**
+ * mnt_drop_write - give up write access to a mount
+ * @mnt: the mount on which to give up write access
+ *
+ * Tells the low-level filesystem that we are done
+ * performing writes to it.  Must be matched with
+ * mnt_want_write() call above.
+ */
+void mnt_drop_write(struct vfsmount *mnt)
+{
+       int must_check_underflow = 0;
+       struct mnt_writer *cpu_writer;
+
+       cpu_writer = &get_cpu_var(mnt_writers);
+       spin_lock(&cpu_writer->lock);
+
+       use_cpu_writer_for_mount(cpu_writer, mnt);
+       if (cpu_writer->count > 0) {
+               cpu_writer->count--;
+       } else {
+               must_check_underflow = 1;
+               atomic_dec(&mnt->__mnt_writers);
+       }
+
+       spin_unlock(&cpu_writer->lock);
+       /*
+        * Logically, we could call this each time,
+        * but the __mnt_writers cacheline tends to
+        * be cold, which makes this expensive.
+        */
+       if (must_check_underflow)
+               handle_write_count_underflow(mnt);
+       /*
+        * This could be done right after the spinlock
+        * is taken because the spinlock keeps us on
+        * the cpu, and disables preemption.  However,
+        * putting it here bounds the amount that
+        * __mnt_writers can underflow.  Without it,
+        * we could theoretically wrap __mnt_writers.
+        */
+       put_cpu_var(mnt_writers);
+}
+EXPORT_SYMBOL_GPL(mnt_drop_write);
+
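The pattern the rest of this patch applies at every write-side call site; a minimal sketch, with vfs_mkdir() chosen only as an example operation:

	static int example_mkdir_on_mount(struct nameidata *nd,
					  struct dentry *dentry, int mode)
	{
		int error;

		/* fails with -EROFS if the mount is (or is becoming) read-only */
		error = mnt_want_write(nd->path.mnt);
		if (error)
			return error;
		error = vfs_mkdir(nd->path.dentry->d_inode, dentry, mode);
		/* always balance the count, whether or not the operation succeeded */
		mnt_drop_write(nd->path.mnt);
		return error;
	}
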
+static int mnt_make_readonly(struct vfsmount *mnt)
+{
+       int ret = 0;
+
+       lock_mnt_writers();
+       /*
+        * With all the locks held, this value is stable
+        */
+       if (atomic_read(&mnt->__mnt_writers) > 0) {
+               ret = -EBUSY;
+               goto out;
+       }
+       /*
+        * nobody can do a successful mnt_want_write() while we
+        * hold all of the per-cpu writer locks.
+        */
+       spin_lock(&vfsmount_lock);
+       if (!ret)
+               mnt->mnt_flags |= MNT_READONLY;
+       spin_unlock(&vfsmount_lock);
+out:
+       unlock_mnt_writers();
+       return ret;
+}
+
+static void __mnt_unmake_readonly(struct vfsmount *mnt)
+{
+       spin_lock(&vfsmount_lock);
+       mnt->mnt_flags &= ~MNT_READONLY;
+       spin_unlock(&vfsmount_lock);
+}
+
 int simple_set_mnt(struct vfsmount *mnt, struct super_block *sb)
 {
        mnt->mnt_sb = sb;
@@ -271,7 +532,36 @@ static struct vfsmount *clone_mnt(struct vfsmount *old, struct dentry *root,
 
 static inline void __mntput(struct vfsmount *mnt)
 {
+       int cpu;
        struct super_block *sb = mnt->mnt_sb;
+       /*
+        * We don't have to hold all of the locks at the
+        * same time here because we know that we're the
+        * last reference to mnt and that no new writers
+        * can come in.
+        */
+       for_each_possible_cpu(cpu) {
+               struct mnt_writer *cpu_writer = &per_cpu(mnt_writers, cpu);
+               if (cpu_writer->mnt != mnt)
+                       continue;
+               spin_lock(&cpu_writer->lock);
+               atomic_add(cpu_writer->count, &mnt->__mnt_writers);
+               cpu_writer->count = 0;
+               /*
+                * Might as well do this so that no one
+                * ever sees the pointer and expects
+                * it to be valid.
+                */
+               cpu_writer->mnt = NULL;
+               spin_unlock(&cpu_writer->lock);
+       }
+       /*
+        * This probably indicates that somebody messed
+        * up a mnt_want/drop_write() pair.  If this
+        * happens, the filesystem was probably unable
+        * to make r/w->r/o transitions.
+        */
+       WARN_ON(atomic_read(&mnt->__mnt_writers));
        dput(mnt->mnt_root);
        free_vfsmnt(mnt);
        deactivate_super(sb);
@@ -417,7 +707,7 @@ static int show_vfsmnt(struct seq_file *m, void *v)
                seq_putc(m, '.');
                mangle(m, mnt->mnt_sb->s_subtype);
        }
-       seq_puts(m, mnt->mnt_sb->s_flags & MS_RDONLY ? " ro" : " rw");
+       seq_puts(m, __mnt_is_readonly(mnt) ? " ro" : " rw");
        for (fs_infop = fs_info; fs_infop->flag; fs_infop++) {
                if (mnt->mnt_sb->s_flags & fs_infop->flag)
                        seq_puts(m, fs_infop->str);
@@ -1019,6 +1309,23 @@ out:
        return err;
 }
 
+static int change_mount_flags(struct vfsmount *mnt, int ms_flags)
+{
+       int error = 0;
+       int readonly_request = 0;
+
+       if (ms_flags & MS_RDONLY)
+               readonly_request = 1;
+       if (readonly_request == __mnt_is_readonly(mnt))
+               return 0;
+
+       if (readonly_request)
+               error = mnt_make_readonly(mnt);
+       else
+               __mnt_unmake_readonly(mnt);
+       return error;
+}
+
 /*
  * change filesystem flags. dir should be a physical root of filesystem.
  * If you've mounted a non-root directory somewhere and want to do remount
@@ -1041,7 +1348,10 @@ static noinline int do_remount(struct nameidata *nd, int flags, int mnt_flags,
                return -EINVAL;
 
        down_write(&sb->s_umount);
-       err = do_remount_sb(sb, flags, data, 0);
+       if (flags & MS_BIND)
+               err = change_mount_flags(nd->path.mnt, flags);
+       else
+               err = do_remount_sb(sb, flags, data, 0);
        if (!err)
                nd->path.mnt->mnt_flags = mnt_flags;
        up_write(&sb->s_umount);
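With change_mount_flags() wired into do_remount() above, userspace can now flip a bind mount read-only; a hypothetical example (the target path is made up, and the call fails with EBUSY while writers still hold the mount):

	#include <stdio.h>
	#include <sys/mount.h>

	int main(void)
	{
		/* roughly: mount -o remount,bind,ro /mnt/bindpoint */
		if (mount("none", "/mnt/bindpoint", NULL,
			  MS_REMOUNT | MS_BIND | MS_RDONLY, NULL)) {
			perror("remount");
			return 1;
		}
		return 0;
	}
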
@@ -1425,6 +1735,8 @@ long do_mount(char *dev_name, char *dir_name, char *type_page,
                mnt_flags |= MNT_NODIRATIME;
        if (flags & MS_RELATIME)
                mnt_flags |= MNT_RELATIME;
+       if (flags & MS_RDONLY)
+               mnt_flags |= MNT_READONLY;
 
        flags &= ~(MS_NOSUID | MS_NOEXEC | MS_NODEV | MS_ACTIVE |
                   MS_NOATIME | MS_NODIRATIME | MS_RELATIME| MS_KERNMOUNT);
index c67b4bdcf719d0e02e86272962f22156c1f5ff25..ad8f167e54bc5e6069526b7406d26ed72fddb2ec 100644 (file)
@@ -14,6 +14,7 @@
 #include <linux/ioctl.h>
 #include <linux/time.h>
 #include <linux/mm.h>
+#include <linux/mount.h>
 #include <linux/highuid.h>
 #include <linux/smp_lock.h>
 #include <linux/vmalloc.h>
@@ -261,7 +262,7 @@ ncp_get_charsets(struct ncp_server* server, struct ncp_nls_ioctl __user *arg)
 }
 #endif /* CONFIG_NCPFS_NLS */
 
-int ncp_ioctl(struct inode *inode, struct file *filp,
+static int __ncp_ioctl(struct inode *inode, struct file *filp,
              unsigned int cmd, unsigned long arg)
 {
        struct ncp_server *server = NCP_SERVER(inode);
@@ -822,6 +823,57 @@ outrel:
        return -EINVAL;
 }
 
+static int ncp_ioctl_need_write(unsigned int cmd)
+{
+       switch (cmd) {
+       case NCP_IOC_GET_FS_INFO:
+       case NCP_IOC_GET_FS_INFO_V2:
+       case NCP_IOC_NCPREQUEST:
+       case NCP_IOC_SETDENTRYTTL:
+       case NCP_IOC_SIGN_INIT:
+       case NCP_IOC_LOCKUNLOCK:
+       case NCP_IOC_SET_SIGN_WANTED:
+               return 1;
+       case NCP_IOC_GETOBJECTNAME:
+       case NCP_IOC_SETOBJECTNAME:
+       case NCP_IOC_GETPRIVATEDATA:
+       case NCP_IOC_SETPRIVATEDATA:
+       case NCP_IOC_SETCHARSETS:
+       case NCP_IOC_GETCHARSETS:
+       case NCP_IOC_CONN_LOGGED_IN:
+       case NCP_IOC_GETDENTRYTTL:
+       case NCP_IOC_GETMOUNTUID2:
+       case NCP_IOC_SIGN_WANTED:
+       case NCP_IOC_GETROOT:
+       case NCP_IOC_SETROOT:
+               return 0;
+       default:
+               /* unknown IOCTL command, assume write */
+               return 1;
+       }
+}
+
+int ncp_ioctl(struct inode *inode, struct file *filp,
+             unsigned int cmd, unsigned long arg)
+{
+       int ret;
+
+       if (ncp_ioctl_need_write(cmd)) {
+               /*
+                * inside the ioctl(), any failures which
+                * are because of file_permission() are
+                * -EACCES, so it seems consistent to keep
+                * that here.
+                */
+               if (mnt_want_write(filp->f_path.mnt))
+                       return -EACCES;
+       }
+       ret = __ncp_ioctl(inode, filp, cmd, arg);
+       if (ncp_ioctl_need_write(cmd))
+               mnt_drop_write(filp->f_path.mnt);
+       return ret;
+}
+
 #ifdef CONFIG_COMPAT
 long ncp_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
index 6cea7479c5b4d13136fb753eecc453f19dc43229..d9e30ac2798dc1d79ac46ad1a9b6b385252d4ab3 100644 (file)
@@ -967,7 +967,8 @@ static int is_atomic_open(struct inode *dir, struct nameidata *nd)
        if (nd->flags & LOOKUP_DIRECTORY)
                return 0;
        /* Are we trying to write to a read only partition? */
-       if (IS_RDONLY(dir) && (nd->intent.open.flags & (O_CREAT|O_TRUNC|FMODE_WRITE)))
+       if (__mnt_is_readonly(nd->path.mnt) &&
+           (nd->intent.open.flags & (O_CREAT|O_TRUNC|FMODE_WRITE)))
                return 0;
        return 1;
 }
index c593db047d8bbd51babb22bbacc765f65636ff68..c309c881bd4e4e7a88500ad4c757d8c378216d99 100644 (file)
@@ -658,14 +658,19 @@ nfsd4_setattr(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
                        return status;
                }
        }
+       status = mnt_want_write(cstate->current_fh.fh_export->ex_path.mnt);
+       if (status)
+               return status;
        status = nfs_ok;
        if (setattr->sa_acl != NULL)
                status = nfsd4_set_nfs4_acl(rqstp, &cstate->current_fh,
                                            setattr->sa_acl);
        if (status)
-               return status;
+               goto out;
        status = nfsd_setattr(rqstp, &cstate->current_fh, &setattr->sa_iattr,
                                0, (time_t)0);
+out:
+       mnt_drop_write(cstate->current_fh.fh_export->ex_path.mnt);
        return status;
 }
 
index 1ff90625860f712397e9251d328ac3708d43b5b8..145b3c877a27c222984f3671f29b004c84db9cfd 100644 (file)
@@ -46,6 +46,7 @@
 #include <linux/scatterlist.h>
 #include <linux/crypto.h>
 #include <linux/sched.h>
+#include <linux/mount.h>
 
 #define NFSDDBG_FACILITY                NFSDDBG_PROC
 
@@ -154,7 +155,11 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
                dprintk("NFSD: nfsd4_create_clid_dir: DIRECTORY EXISTS\n");
                goto out_put;
        }
+       status = mnt_want_write(rec_dir.path.mnt);
+       if (status)
+               goto out_put;
        status = vfs_mkdir(rec_dir.path.dentry->d_inode, dentry, S_IRWXU);
+       mnt_drop_write(rec_dir.path.mnt);
 out_put:
        dput(dentry);
 out_unlock:
@@ -313,12 +318,17 @@ nfsd4_remove_clid_dir(struct nfs4_client *clp)
        if (!rec_dir_init || !clp->cl_firststate)
                return;
 
+       status = mnt_want_write(rec_dir.path.mnt);
+       if (status)
+               goto out;
        clp->cl_firststate = 0;
        nfs4_save_user(&uid, &gid);
        status = nfsd4_unlink_clid_dir(clp->cl_recdir, HEXDIR_LEN-1);
        nfs4_reset_user(uid, gid);
        if (status == 0)
                nfsd4_sync_rec_dir();
+       mnt_drop_write(rec_dir.path.mnt);
+out:
        if (status)
                printk("NFSD: Failed to remove expired client state directory"
                                " %.*s\n", HEXDIR_LEN, clp->cl_recdir);
@@ -347,13 +357,17 @@ nfsd4_recdir_purge_old(void) {
 
        if (!rec_dir_init)
                return;
+       status = mnt_want_write(rec_dir.path.mnt);
+       if (status)
+               goto out;
        status = nfsd4_list_rec_dir(rec_dir.path.dentry, purge_old);
        if (status == 0)
                nfsd4_sync_rec_dir();
+       mnt_drop_write(rec_dir.path.mnt);
+out:
        if (status)
                printk("nfsd4: failed to purge old clients from recovery"
                        " directory %s\n", rec_dir.path.dentry->d_name.name);
-       return;
 }
 
 static int
index bcb97d8e8b8bd6791a6161eb216ecba76a6e712d..81a75f3081f434891ad5e39ad7514182853bd837 100644 (file)
@@ -41,6 +41,7 @@
 #include <linux/sunrpc/svc.h>
 #include <linux/nfsd/nfsd.h>
 #include <linux/nfsd/cache.h>
+#include <linux/file.h>
 #include <linux/mount.h>
 #include <linux/workqueue.h>
 #include <linux/smp_lock.h>
@@ -1239,7 +1240,7 @@ static inline void
 nfs4_file_downgrade(struct file *filp, unsigned int share_access)
 {
        if (share_access & NFS4_SHARE_ACCESS_WRITE) {
-               put_write_access(filp->f_path.dentry->d_inode);
+               drop_file_write_access(filp);
                filp->f_mode = (filp->f_mode | FMODE_READ) & ~FMODE_WRITE;
        }
 }
index 46f59d5365a0d49986512c54a25ecdc6daeb54f1..304bf5f643c944e5dae2607798310ff3299aff22 100644 (file)
@@ -1255,23 +1255,35 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
        err = 0;
        switch (type) {
        case S_IFREG:
+               host_err = mnt_want_write(fhp->fh_export->ex_path.mnt);
+               if (host_err)
+                       goto out_nfserr;
                host_err = vfs_create(dirp, dchild, iap->ia_mode, NULL);
                break;
        case S_IFDIR:
+               host_err = mnt_want_write(fhp->fh_export->ex_path.mnt);
+               if (host_err)
+                       goto out_nfserr;
                host_err = vfs_mkdir(dirp, dchild, iap->ia_mode);
                break;
        case S_IFCHR:
        case S_IFBLK:
        case S_IFIFO:
        case S_IFSOCK:
+               host_err = mnt_want_write(fhp->fh_export->ex_path.mnt);
+               if (host_err)
+                       goto out_nfserr;
                host_err = vfs_mknod(dirp, dchild, iap->ia_mode, rdev);
                break;
        default:
                printk("nfsd: bad file type %o in nfsd_create\n", type);
                host_err = -EINVAL;
+               goto out_nfserr;
        }
-       if (host_err < 0)
+       if (host_err < 0) {
+               mnt_drop_write(fhp->fh_export->ex_path.mnt);
                goto out_nfserr;
+       }
 
        if (EX_ISSYNC(fhp->fh_export)) {
                err = nfserrno(nfsd_sync_dir(dentry));
@@ -1282,6 +1294,7 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
        err2 = nfsd_create_setattr(rqstp, resfhp, iap);
        if (err2)
                err = err2;
+       mnt_drop_write(fhp->fh_export->ex_path.mnt);
        /*
         * Update the file handle to get the new inode info.
         */
@@ -1359,6 +1372,9 @@ nfsd_create_v3(struct svc_rqst *rqstp, struct svc_fh *fhp,
                v_atime = verifier[1]&0x7fffffff;
        }
        
+       host_err = mnt_want_write(fhp->fh_export->ex_path.mnt);
+       if (host_err)
+               goto out_nfserr;
        if (dchild->d_inode) {
                err = 0;
 
@@ -1390,12 +1406,15 @@ nfsd_create_v3(struct svc_rqst *rqstp, struct svc_fh *fhp,
                case NFS3_CREATE_GUARDED:
                        err = nfserr_exist;
                }
+               mnt_drop_write(fhp->fh_export->ex_path.mnt);
                goto out;
        }
 
        host_err = vfs_create(dirp, dchild, iap->ia_mode, NULL);
-       if (host_err < 0)
+       if (host_err < 0) {
+               mnt_drop_write(fhp->fh_export->ex_path.mnt);
                goto out_nfserr;
+       }
        if (created)
                *created = 1;
 
@@ -1420,6 +1439,7 @@ nfsd_create_v3(struct svc_rqst *rqstp, struct svc_fh *fhp,
        if (err2)
                err = err2;
 
+       mnt_drop_write(fhp->fh_export->ex_path.mnt);
        /*
         * Update the filehandle to get the new inode info.
         */
@@ -1522,6 +1542,10 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp,
        if (iap && (iap->ia_valid & ATTR_MODE))
                mode = iap->ia_mode & S_IALLUGO;
 
+       host_err = mnt_want_write(fhp->fh_export->ex_path.mnt);
+       if (host_err)
+               goto out_nfserr;
+
        if (unlikely(path[plen] != 0)) {
                char *path_alloced = kmalloc(plen+1, GFP_KERNEL);
                if (path_alloced == NULL)
@@ -1542,6 +1566,8 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp,
        err = nfserrno(host_err);
        fh_unlock(fhp);
 
+       mnt_drop_write(fhp->fh_export->ex_path.mnt);
+
        cerr = fh_compose(resfhp, fhp->fh_export, dnew, fhp);
        dput(dnew);
        if (err==0) err = cerr;
@@ -1592,6 +1618,11 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
        dold = tfhp->fh_dentry;
        dest = dold->d_inode;
 
+       host_err = mnt_want_write(tfhp->fh_export->ex_path.mnt);
+       if (host_err) {
+               err = nfserrno(host_err);
+               goto out_dput;
+       }
        host_err = vfs_link(dold, dirp, dnew);
        if (!host_err) {
                if (EX_ISSYNC(ffhp->fh_export)) {
@@ -1605,7 +1636,8 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
                else
                        err = nfserrno(host_err);
        }
-
+       mnt_drop_write(tfhp->fh_export->ex_path.mnt);
+out_dput:
        dput(dnew);
 out_unlock:
        fh_unlock(ffhp);
@@ -1678,13 +1710,20 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
        if (ndentry == trap)
                goto out_dput_new;
 
-#ifdef MSNFS
-       if ((ffhp->fh_export->ex_flags & NFSEXP_MSNFS) &&
+       if (svc_msnfs(ffhp) &&
                ((atomic_read(&odentry->d_count) > 1)
                 || (atomic_read(&ndentry->d_count) > 1))) {
                        host_err = -EPERM;
-       } else
-#endif
+                       goto out_dput_new;
+       }
+
+       host_err = -EXDEV;
+       if (ffhp->fh_export->ex_path.mnt != tfhp->fh_export->ex_path.mnt)
+               goto out_dput_new;
+       host_err = mnt_want_write(ffhp->fh_export->ex_path.mnt);
+       if (host_err)
+               goto out_dput_new;
+
        host_err = vfs_rename(fdir, odentry, tdir, ndentry);
        if (!host_err && EX_ISSYNC(tfhp->fh_export)) {
                host_err = nfsd_sync_dir(tdentry);
@@ -1692,6 +1731,8 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
                        host_err = nfsd_sync_dir(fdentry);
        }
 
+       mnt_drop_write(ffhp->fh_export->ex_path.mnt);
+
  out_dput_new:
        dput(ndentry);
  out_dput_old:
@@ -1750,6 +1791,10 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
        if (!type)
                type = rdentry->d_inode->i_mode & S_IFMT;
 
+       host_err = mnt_want_write(fhp->fh_export->ex_path.mnt);
+       if (host_err)
+               goto out_nfserr;
+
        if (type != S_IFDIR) { /* It's UNLINK */
 #ifdef MSNFS
                if ((fhp->fh_export->ex_flags & NFSEXP_MSNFS) &&
@@ -1765,10 +1810,12 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
        dput(rdentry);
 
        if (host_err)
-               goto out_nfserr;
+               goto out_drop;
        if (EX_ISSYNC(fhp->fh_export))
                host_err = nfsd_sync_dir(dentry);
 
+out_drop:
+       mnt_drop_write(fhp->fh_export->ex_path.mnt);
 out_nfserr:
        err = nfserrno(host_err);
 out:
@@ -1865,7 +1912,7 @@ nfsd_permission(struct svc_rqst *rqstp, struct svc_export *exp,
                inode->i_mode,
                IS_IMMUTABLE(inode)?    " immut" : "",
                IS_APPEND(inode)?       " append" : "",
-               IS_RDONLY(inode)?       " ro" : "");
+               __mnt_is_readonly(exp->ex_path.mnt)?    " ro" : "");
        dprintk("      owner %d/%d user %d/%d\n",
                inode->i_uid, inode->i_gid, current->fsuid, current->fsgid);
 #endif
@@ -1876,7 +1923,8 @@ nfsd_permission(struct svc_rqst *rqstp, struct svc_export *exp,
         */
        if (!(acc & MAY_LOCAL_ACCESS))
                if (acc & (MAY_WRITE | MAY_SATTR | MAY_TRUNC)) {
-                       if (exp_rdonly(rqstp, exp) || IS_RDONLY(inode))
+                       if (exp_rdonly(rqstp, exp) ||
+                           __mnt_is_readonly(exp->ex_path.mnt))
                                return nfserr_rofs;
                        if (/* (acc & MAY_WRITE) && */ IS_IMMUTABLE(inode))
                                return nfserr_perm;
@@ -2039,6 +2087,9 @@ nfsd_set_posix_acl(struct svc_fh *fhp, int type, struct posix_acl *acl)
        } else
                size = 0;
 
+       error = mnt_want_write(fhp->fh_export->ex_path.mnt);
+       if (error)
+               goto getout;
        if (size)
                error = vfs_setxattr(fhp->fh_dentry, name, value, size, 0);
        else {
@@ -2050,6 +2101,7 @@ nfsd_set_posix_acl(struct svc_fh *fhp, int type, struct posix_acl *acl)
                                error = 0;
                }
        }
+       mnt_drop_write(fhp->fh_export->ex_path.mnt);
 
 getout:
        kfree(value);
index b413166dd16340c0a159abbbf24efdd71d69ad2c..7b142f0ce995bd07a5116567a0b2b56e65b8dd1b 100644 (file)
@@ -60,10 +60,6 @@ static int ocfs2_set_inode_attr(struct inode *inode, unsigned flags,
                goto bail;
        }
 
-       status = -EROFS;
-       if (IS_RDONLY(inode))
-               goto bail_unlock;
-
        status = -EACCES;
        if (!is_owner_or_cap(inode))
                goto bail_unlock;
@@ -134,8 +130,13 @@ long ocfs2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
                if (get_user(flags, (int __user *) arg))
                        return -EFAULT;
 
-               return ocfs2_set_inode_attr(inode, flags,
+               status = mnt_want_write(filp->f_path.mnt);
+               if (status)
+                       return status;
+               status = ocfs2_set_inode_attr(inode, flags,
                        OCFS2_FL_MODIFIABLE);
+               mnt_drop_write(filp->f_path.mnt);
+               return status;
        case OCFS2_IOC_RESVSP:
        case OCFS2_IOC_RESVSP64:
        case OCFS2_IOC_UNRESVSP:
index 3fa4e4ffce4cb4ba69862703f90d289731809eab..b70e7666bb2c3f725a761b268b2370e5c151ba63 100644 (file)
--- a/fs/open.c
+++ b/fs/open.c
@@ -244,21 +244,21 @@ static long do_sys_truncate(const char __user * path, loff_t length)
        if (!S_ISREG(inode->i_mode))
                goto dput_and_out;
 
-       error = vfs_permission(&nd, MAY_WRITE);
+       error = mnt_want_write(nd.path.mnt);
        if (error)
                goto dput_and_out;
 
-       error = -EROFS;
-       if (IS_RDONLY(inode))
-               goto dput_and_out;
+       error = vfs_permission(&nd, MAY_WRITE);
+       if (error)
+               goto mnt_drop_write_and_out;
 
        error = -EPERM;
        if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
-               goto dput_and_out;
+               goto mnt_drop_write_and_out;
 
        error = get_write_access(inode);
        if (error)
-               goto dput_and_out;
+               goto mnt_drop_write_and_out;
 
        /*
         * Make sure that there are no leases.  get_write_access() protects
@@ -276,6 +276,8 @@ static long do_sys_truncate(const char __user * path, loff_t length)
 
 put_write_and_out:
        put_write_access(inode);
+mnt_drop_write_and_out:
+       mnt_drop_write(nd.path.mnt);
 dput_and_out:
        path_put(&nd.path);
 out:
@@ -457,8 +459,17 @@ asmlinkage long sys_faccessat(int dfd, const char __user *filename, int mode)
        if(res || !(mode & S_IWOTH) ||
           special_file(nd.path.dentry->d_inode->i_mode))
                goto out_path_release;
-
-       if(IS_RDONLY(nd.path.dentry->d_inode))
+       /*
+        * This is a rare case where using __mnt_is_readonly()
+        * is OK without a mnt_want/drop_write() pair.  Since
+        * no actual write to the fs is performed here, we do
+        * not need to telegraph that to anyone.
+        *
+        * By doing this, we accept that this access is
+        * inherently racy and know that the fs may change
+        * state before we even see this result.
+        */
+       if (__mnt_is_readonly(nd.path.mnt))
                res = -EROFS;
 
 out_path_release:
@@ -567,12 +578,12 @@ asmlinkage long sys_fchmod(unsigned int fd, mode_t mode)
 
        audit_inode(NULL, dentry);
 
-       err = -EROFS;
-       if (IS_RDONLY(inode))
+       err = mnt_want_write(file->f_path.mnt);
+       if (err)
                goto out_putf;
        err = -EPERM;
        if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
-               goto out_putf;
+               goto out_drop_write;
        mutex_lock(&inode->i_mutex);
        if (mode == (mode_t) -1)
                mode = inode->i_mode;
@@ -581,6 +592,8 @@ asmlinkage long sys_fchmod(unsigned int fd, mode_t mode)
        err = notify_change(dentry, &newattrs);
        mutex_unlock(&inode->i_mutex);
 
+out_drop_write:
+       mnt_drop_write(file->f_path.mnt);
 out_putf:
        fput(file);
 out:
@@ -600,13 +613,13 @@ asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
                goto out;
        inode = nd.path.dentry->d_inode;
 
-       error = -EROFS;
-       if (IS_RDONLY(inode))
+       error = mnt_want_write(nd.path.mnt);
+       if (error)
                goto dput_and_out;
 
        error = -EPERM;
        if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
-               goto dput_and_out;
+               goto out_drop_write;
 
        mutex_lock(&inode->i_mutex);
        if (mode == (mode_t) -1)
@@ -616,6 +629,8 @@ asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
        error = notify_change(nd.path.dentry, &newattrs);
        mutex_unlock(&inode->i_mutex);
 
+out_drop_write:
+       mnt_drop_write(nd.path.mnt);
 dput_and_out:
        path_put(&nd.path);
 out:
@@ -638,9 +653,6 @@ static int chown_common(struct dentry * dentry, uid_t user, gid_t group)
                printk(KERN_ERR "chown_common: NULL inode\n");
                goto out;
        }
-       error = -EROFS;
-       if (IS_RDONLY(inode))
-               goto out;
        error = -EPERM;
        if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
                goto out;
@@ -671,7 +683,12 @@ asmlinkage long sys_chown(const char __user * filename, uid_t user, gid_t group)
        error = user_path_walk(filename, &nd);
        if (error)
                goto out;
+       error = mnt_want_write(nd.path.mnt);
+       if (error)
+               goto out_release;
        error = chown_common(nd.path.dentry, user, group);
+       mnt_drop_write(nd.path.mnt);
+out_release:
        path_put(&nd.path);
 out:
        return error;
@@ -691,7 +708,12 @@ asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
        error = __user_walk_fd(dfd, filename, follow, &nd);
        if (error)
                goto out;
+       error = mnt_want_write(nd.path.mnt);
+       if (error)
+               goto out_release;
        error = chown_common(nd.path.dentry, user, group);
+       mnt_drop_write(nd.path.mnt);
+out_release:
        path_put(&nd.path);
 out:
        return error;
@@ -705,7 +727,12 @@ asmlinkage long sys_lchown(const char __user * filename, uid_t user, gid_t group
        error = user_path_walk_link(filename, &nd);
        if (error)
                goto out;
+       error = mnt_want_write(nd.path.mnt);
+       if (error)
+               goto out_release;
        error = chown_common(nd.path.dentry, user, group);
+       mnt_drop_write(nd.path.mnt);
+out_release:
        path_put(&nd.path);
 out:
        return error;
@@ -722,14 +749,48 @@ asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group)
        if (!file)
                goto out;
 
+       error = mnt_want_write(file->f_path.mnt);
+       if (error)
+               goto out_fput;
        dentry = file->f_path.dentry;
        audit_inode(NULL, dentry);
        error = chown_common(dentry, user, group);
+       mnt_drop_write(file->f_path.mnt);
+out_fput:
        fput(file);
 out:
        return error;
 }
 
+/*
+ * You have to be very careful that these write
+ * counts get cleaned up in error cases and
+ * upon __fput().  This should probably never
+ * be called outside of __dentry_open().
+ */
+static inline int __get_file_write_access(struct inode *inode,
+                                         struct vfsmount *mnt)
+{
+       int error;
+       error = get_write_access(inode);
+       if (error)
+               return error;
+       /*
+        * Do not take mount writer counts on
+        * special files since no writes to
+        * the mount itself will occur.
+        */
+       if (!special_file(inode->i_mode)) {
+               /*
+                * Balanced in __fput()
+                */
+               error = mnt_want_write(mnt);
+               if (error)
+                       put_write_access(inode);
+       }
+       return error;
+}
+
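The counts taken here live for the whole lifetime of the struct file; a rough sketch of the pairing, on the assumption that drop_file_write_access() (used elsewhere in this series) performs the balancing when the last reference is put:

	/*
	 *   __dentry_open(), FMODE_WRITE, not a special file:
	 *       get_write_access(inode);
	 *       mnt_want_write(mnt);
	 *       file_take_write(f);    -- marks f as holding a mount write
	 *
	 *   final fput():
	 *       drop_file_write_access(f) is assumed to undo all three:
	 *       put_write_access(), mnt_drop_write(), file_release_write().
	 */
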
 static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
                                        int flags, struct file *f,
                                        int (*open)(struct inode *, struct file *))
@@ -742,9 +803,11 @@ static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
                                FMODE_PREAD | FMODE_PWRITE;
        inode = dentry->d_inode;
        if (f->f_mode & FMODE_WRITE) {
-               error = get_write_access(inode);
+               error = __get_file_write_access(inode, mnt);
                if (error)
                        goto cleanup_file;
+               if (!special_file(inode->i_mode))
+                       file_take_write(f);
        }
 
        f->f_mapping = inode->i_mapping;
@@ -784,8 +847,19 @@ static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
 
 cleanup_all:
        fops_put(f->f_op);
-       if (f->f_mode & FMODE_WRITE)
+       if (f->f_mode & FMODE_WRITE) {
                put_write_access(inode);
+               if (!special_file(inode->i_mode)) {
+                       /*
+                        * We don't consider this a real
+                        * mnt_want/drop_write() pair
+                        * because it all happened right
+                        * here, so just reset the state.
+                        */
+                       file_reset_write(f);
+                       mnt_drop_write(mnt);
+               }
+       }
        file_kill(f);
        f->f_path.dentry = NULL;
        f->f_path.mnt = NULL;
@@ -796,43 +870,6 @@ cleanup_file:
        return ERR_PTR(error);
 }
 
-/*
- * Note that while the flag value (low two bits) for sys_open means:
- *     00 - read-only
- *     01 - write-only
- *     10 - read-write
- *     11 - special
- * it is changed into
- *     00 - no permissions needed
- *     01 - read-permission
- *     10 - write-permission
- *     11 - read-write
- * for the internal routines (ie open_namei()/follow_link() etc). 00 is
- * used by symlinks.
- */
-static struct file *do_filp_open(int dfd, const char *filename, int flags,
-                                int mode)
-{
-       int namei_flags, error;
-       struct nameidata nd;
-
-       namei_flags = flags;
-       if ((namei_flags+1) & O_ACCMODE)
-               namei_flags++;
-
-       error = open_namei(dfd, filename, namei_flags, mode, &nd);
-       if (!error)
-               return nameidata_to_filp(&nd, flags);
-
-       return ERR_PTR(error);
-}
-
-struct file *filp_open(const char *filename, int flags, int mode)
-{
-       return do_filp_open(AT_FDCWD, filename, flags, mode);
-}
-EXPORT_SYMBOL(filp_open);
-
 /**
  * lookup_instantiate_filp - instantiates the open intent filp
  * @nd: pointer to nameidata
index e0f0f098a523a8d1c3a41099ab76b411989a23a0..74363a7aacbcb64154c81b1f6bcb832e0fee8727 100644 (file)
@@ -4,6 +4,7 @@
 
 #include <linux/capability.h>
 #include <linux/fs.h>
+#include <linux/mount.h>
 #include <linux/reiserfs_fs.h>
 #include <linux/time.h>
 #include <asm/uaccess.h>
@@ -25,6 +26,7 @@ int reiserfs_ioctl(struct inode *inode, struct file *filp, unsigned int cmd,
                   unsigned long arg)
 {
        unsigned int flags;
+       int err = 0;
 
        switch (cmd) {
        case REISERFS_IOC_UNPACK:
@@ -48,50 +50,67 @@ int reiserfs_ioctl(struct inode *inode, struct file *filp, unsigned int cmd,
                        if (!reiserfs_attrs(inode->i_sb))
                                return -ENOTTY;
 
-                       if (IS_RDONLY(inode))
-                               return -EROFS;
+                       err = mnt_want_write(filp->f_path.mnt);
+                       if (err)
+                               return err;
 
-                       if (!is_owner_or_cap(inode))
-                               return -EPERM;
-
-                       if (get_user(flags, (int __user *)arg))
-                               return -EFAULT;
-
-                       /* Is it quota file? Do not allow user to mess with it. */
-                       if (IS_NOQUOTA(inode))
-                               return -EPERM;
+                       if (!is_owner_or_cap(inode)) {
+                               err = -EPERM;
+                               goto setflags_out;
+                       }
+                       if (get_user(flags, (int __user *)arg)) {
+                               err = -EFAULT;
+                               goto setflags_out;
+                       }
+                       /*
+                        * Is it quota file? Do not allow user to mess with it
+                        */
+                       if (IS_NOQUOTA(inode)) {
+                               err = -EPERM;
+                               goto setflags_out;
+                       }
                        if (((flags ^ REISERFS_I(inode)->
                              i_attrs) & (REISERFS_IMMUTABLE_FL |
                                          REISERFS_APPEND_FL))
-                           && !capable(CAP_LINUX_IMMUTABLE))
-                               return -EPERM;
-
+                           && !capable(CAP_LINUX_IMMUTABLE)) {
+                               err = -EPERM;
+                               goto setflags_out;
+                       }
                        if ((flags & REISERFS_NOTAIL_FL) &&
                            S_ISREG(inode->i_mode)) {
                                int result;
 
                                result = reiserfs_unpack(inode, filp);
-                               if (result)
-                                       return result;
+                               if (result) {
+                                       err = result;
+                                       goto setflags_out;
+                               }
                        }
                        sd_attrs_to_i_attrs(flags, inode);
                        REISERFS_I(inode)->i_attrs = flags;
                        inode->i_ctime = CURRENT_TIME_SEC;
                        mark_inode_dirty(inode);
-                       return 0;
+setflags_out:
+                       mnt_drop_write(filp->f_path.mnt);
+                       return err;
                }
        case REISERFS_IOC_GETVERSION:
                return put_user(inode->i_generation, (int __user *)arg);
        case REISERFS_IOC_SETVERSION:
                if (!is_owner_or_cap(inode))
                        return -EPERM;
-               if (IS_RDONLY(inode))
-                       return -EROFS;
-               if (get_user(inode->i_generation, (int __user *)arg))
-                       return -EFAULT;
+               err = mnt_want_write(filp->f_path.mnt);
+               if (err)
+                       return err;
+               if (get_user(inode->i_generation, (int __user *)arg)) {
+                       err = -EFAULT;
+                       goto setversion_out;
+               }
                inode->i_ctime = CURRENT_TIME_SEC;
                mark_inode_dirty(inode);
-               return 0;
+setversion_out:
+               mnt_drop_write(filp->f_path.mnt);
+               return err;
        default:
                return -ENOTTY;
        }
index 09008dbd264e731411c6da60def6648888cc6b52..1f8f05ede437067cadcfb00b77baa47bd09e9a5e 100644 (file)
@@ -37,6 +37,7 @@
 #include <linux/idr.h>
 #include <linux/kobject.h>
 #include <linux/mutex.h>
+#include <linux/file.h>
 #include <asm/uaccess.h>
 
 
@@ -567,10 +568,29 @@ static void mark_files_ro(struct super_block *sb)
 {
        struct file *f;
 
+retry:
        file_list_lock();
        list_for_each_entry(f, &sb->s_files, f_u.fu_list) {
-               if (S_ISREG(f->f_path.dentry->d_inode->i_mode) && file_count(f))
-                       f->f_mode &= ~FMODE_WRITE;
+               struct vfsmount *mnt;
+               if (!S_ISREG(f->f_path.dentry->d_inode->i_mode))
+                      continue;
+               if (!file_count(f))
+                       continue;
+               if (!(f->f_mode & FMODE_WRITE))
+                       continue;
+               f->f_mode &= ~FMODE_WRITE;
+               if (file_check_writeable(f) != 0)
+                       continue;
+               file_release_write(f);
+               mnt = mntget(f->f_path.mnt);
+               file_list_unlock();
+               /*
+                * This can sleep, so we can't hold
+                * the file_list_lock() spinlock.
+                */
+               mnt_drop_write(mnt);
+               mntput(mnt);
+               goto retry;
        }
        file_list_unlock();
 }
index b18da9c0b97f5f03185b13678ac0ca16932d9c70..a2bef77dc9c9878c3f93684acf86548f443d989f 100644 (file)
@@ -2,6 +2,7 @@
 #include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/linkage.h>
+#include <linux/mount.h>
 #include <linux/namei.h>
 #include <linux/sched.h>
 #include <linux/stat.h>
@@ -59,6 +60,7 @@ long do_utimes(int dfd, char __user *filename, struct timespec *times, int flags
        struct inode *inode;
        struct iattr newattrs;
        struct file *f = NULL;
+       struct vfsmount *mnt;
 
        error = -EINVAL;
        if (times && (!nsec_valid(times[0].tv_nsec) ||
@@ -79,18 +81,20 @@ long do_utimes(int dfd, char __user *filename, struct timespec *times, int flags
                if (!f)
                        goto out;
                dentry = f->f_path.dentry;
+               mnt = f->f_path.mnt;
        } else {
                error = __user_walk_fd(dfd, filename, (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW, &nd);
                if (error)
                        goto out;
 
                dentry = nd.path.dentry;
+               mnt = nd.path.mnt;
        }
 
        inode = dentry->d_inode;
 
-       error = -EROFS;
-       if (IS_RDONLY(inode))
+       error = mnt_want_write(mnt);
+       if (error)
                goto dput_and_out;
 
        /* Don't worry, the checks are done in inode_change_ok() */
@@ -98,7 +102,7 @@ long do_utimes(int dfd, char __user *filename, struct timespec *times, int flags
        if (times) {
                error = -EPERM;
                 if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
-                        goto dput_and_out;
+                       goto mnt_drop_write_and_out;
 
                if (times[0].tv_nsec == UTIME_OMIT)
                        newattrs.ia_valid &= ~ATTR_ATIME;
@@ -118,22 +122,24 @@ long do_utimes(int dfd, char __user *filename, struct timespec *times, int flags
        } else {
                error = -EACCES;
                 if (IS_IMMUTABLE(inode))
-                        goto dput_and_out;
+                       goto mnt_drop_write_and_out;
 
                if (!is_owner_or_cap(inode)) {
                        if (f) {
                                if (!(f->f_mode & FMODE_WRITE))
-                                       goto dput_and_out;
+                                       goto mnt_drop_write_and_out;
                        } else {
                                error = vfs_permission(&nd, MAY_WRITE);
                                if (error)
-                                       goto dput_and_out;
+                                       goto mnt_drop_write_and_out;
                        }
                }
        }
        mutex_lock(&inode->i_mutex);
        error = notify_change(dentry, &newattrs);
        mutex_unlock(&inode->i_mutex);
+mnt_drop_write_and_out:
+       mnt_drop_write(mnt);
 dput_and_out:
        if (f)
                fput(f);
index 3acab16154608724f5558458eea5cac2a00b0b5a..f7062da505d4b1a6d64fc5d686a98ed8cea5e7f4 100644 (file)
@@ -11,6 +11,7 @@
 #include <linux/slab.h>
 #include <linux/file.h>
 #include <linux/xattr.h>
+#include <linux/mount.h>
 #include <linux/namei.h>
 #include <linux/security.h>
 #include <linux/syscalls.h>
@@ -32,8 +33,6 @@ xattr_permission(struct inode *inode, const char *name, int mask)
         * filesystem  or on an immutable / append-only inode.
         */
        if (mask & MAY_WRITE) {
-               if (IS_RDONLY(inode))
-                       return -EROFS;
                if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
                        return -EPERM;
        }
@@ -262,7 +261,11 @@ sys_setxattr(char __user *path, char __user *name, void __user *value,
        error = user_path_walk(path, &nd);
        if (error)
                return error;
-       error = setxattr(nd.path.dentry, name, value, size, flags);
+       error = mnt_want_write(nd.path.mnt);
+       if (!error) {
+               error = setxattr(nd.path.dentry, name, value, size, flags);
+               mnt_drop_write(nd.path.mnt);
+       }
        path_put(&nd.path);
        return error;
 }
@@ -277,7 +280,11 @@ sys_lsetxattr(char __user *path, char __user *name, void __user *value,
        error = user_path_walk_link(path, &nd);
        if (error)
                return error;
-       error = setxattr(nd.path.dentry, name, value, size, flags);
+       error = mnt_want_write(nd.path.mnt);
+       if (!error) {
+               error = setxattr(nd.path.dentry, name, value, size, flags);
+               mnt_drop_write(nd.path.mnt);
+       }
        path_put(&nd.path);
        return error;
 }
@@ -295,7 +302,12 @@ sys_fsetxattr(int fd, char __user *name, void __user *value,
                return error;
        dentry = f->f_path.dentry;
        audit_inode(NULL, dentry);
-       error = setxattr(dentry, name, value, size, flags);
+       error = mnt_want_write(f->f_path.mnt);
+       if (!error) {
+               error = setxattr(dentry, name, value, size, flags);
+               mnt_drop_write(f->f_path.mnt);
+       }
+out_fput:
        fput(f);
        return error;
 }
@@ -482,7 +494,11 @@ sys_removexattr(char __user *path, char __user *name)
        error = user_path_walk(path, &nd);
        if (error)
                return error;
-       error = removexattr(nd.path.dentry, name);
+       error = mnt_want_write(nd.path.mnt);
+       if (!error) {
+               error = removexattr(nd.path.dentry, name);
+               mnt_drop_write(nd.path.mnt);
+       }
        path_put(&nd.path);
        return error;
 }
@@ -496,7 +512,11 @@ sys_lremovexattr(char __user *path, char __user *name)
        error = user_path_walk_link(path, &nd);
        if (error)
                return error;
-       error = removexattr(nd.path.dentry, name);
+       error = mnt_want_write(nd.path.mnt);
+       if (!error) {
+               error = removexattr(nd.path.dentry, name);
+               mnt_drop_write(nd.path.mnt);
+       }
        path_put(&nd.path);
        return error;
 }
@@ -513,7 +533,11 @@ sys_fremovexattr(int fd, char __user *name)
                return error;
        dentry = f->f_path.dentry;
        audit_inode(NULL, dentry);
-       error = removexattr(dentry, name);
+       error = mnt_want_write(f->f_path.mnt);
+       if (!error) {
+               error = removexattr(dentry, name);
+               mnt_drop_write(f->f_path.mnt);
+       }
        fput(f);
        return error;
 }
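
Each of the hunks above replaces an open-coded IS_RDONLY() test with the same
bracketing pattern: take a write reference on the vfsmount before modifying
the inode, drop it again afterwards, and let mnt_want_write() return -EROFS
for read-only mounts. A minimal sketch of a caller following that pattern
(the helper name and its surrounding logic are illustrative, not part of the
patch):

#include <linux/fs.h>
#include <linux/mount.h>

/* Illustrative only: bracket an attribute change with a write reference on
 * the mount so that read-only mounts are rejected with -EROFS up front. */
static int change_attr_on_mount(struct vfsmount *mnt, struct dentry *dentry,
				struct iattr *newattrs)
{
	int error;

	error = mnt_want_write(mnt);	/* -EROFS if the mount is read-only */
	if (error)
		return error;

	mutex_lock(&dentry->d_inode->i_mutex);
	error = notify_change(dentry, newattrs);
	mutex_unlock(&dentry->d_inode->i_mutex);

	mnt_drop_write(mnt);		/* balance the reference on every path */
	return error;
}

The property every hunk preserves is that each successful mnt_want_write()
is paired with exactly one mnt_drop_write() on every exit path.
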
index bf77597938564d1b498c10e90440ff3339495684..4ddb86b73c6b537034b09d2632c7d2061432a736 100644 (file)
@@ -535,8 +535,6 @@ xfs_attrmulti_attr_set(
        char                    *kbuf;
        int                     error = EFAULT;
 
-       if (IS_RDONLY(inode))
-               return -EROFS;
        if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
                return EPERM;
        if (len > XATTR_SIZE_MAX)
@@ -562,8 +560,6 @@ xfs_attrmulti_attr_remove(
        char                    *name,
        __uint32_t              flags)
 {
-       if (IS_RDONLY(inode))
-               return -EROFS;
        if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
                return EPERM;
        return xfs_attr_remove(XFS_I(inode), name, flags);
@@ -573,6 +569,7 @@ STATIC int
 xfs_attrmulti_by_handle(
        xfs_mount_t             *mp,
        void                    __user *arg,
+       struct file             *parfilp,
        struct inode            *parinode)
 {
        int                     error;
@@ -626,13 +623,21 @@ xfs_attrmulti_by_handle(
                                        &ops[i].am_length, ops[i].am_flags);
                        break;
                case ATTR_OP_SET:
+                       ops[i].am_error = mnt_want_write(parfilp->f_path.mnt);
+                       if (ops[i].am_error)
+                               break;
                        ops[i].am_error = xfs_attrmulti_attr_set(inode,
                                        attr_name, ops[i].am_attrvalue,
                                        ops[i].am_length, ops[i].am_flags);
+                       mnt_drop_write(parfilp->f_path.mnt);
                        break;
                case ATTR_OP_REMOVE:
+                       ops[i].am_error = mnt_want_write(parfilp->f_path.mnt);
+                       if (ops[i].am_error)
+                               break;
                        ops[i].am_error = xfs_attrmulti_attr_remove(inode,
                                        attr_name, ops[i].am_flags);
+                       mnt_drop_write(parfilp->f_path.mnt);
                        break;
                default:
                        ops[i].am_error = EINVAL;
@@ -1133,7 +1138,7 @@ xfs_ioctl(
                return xfs_attrlist_by_handle(mp, arg, inode);
 
        case XFS_IOC_ATTRMULTI_BY_HANDLE:
-               return xfs_attrmulti_by_handle(mp, arg, inode);
+               return xfs_attrmulti_by_handle(mp, arg, filp, inode);
 
        case XFS_IOC_SWAPEXT: {
                error = xfs_swapext((struct xfs_swapext __user *)arg);
index 0c958cf7775880ce0a8004d38caa424887387edd..a1237dad6430b28221be1dc033f6f0a4e599f2cf 100644 (file)
@@ -155,13 +155,6 @@ xfs_ichgtime_fast(
         */
        ASSERT((flags & XFS_ICHGTIME_ACC) == 0);
 
-       /*
-        * We're not supposed to change timestamps in readonly-mounted
-        * filesystems.  Throw it away if anyone asks us.
-        */
-       if (unlikely(IS_RDONLY(inode)))
-               return;
-
        if (flags & XFS_ICHGTIME_MOD) {
                tvp = &inode->i_mtime;
                ip->i_d.di_mtime.t_sec = (__int32_t)tvp->tv_sec;
index 21c0dbc74093b2bcf62971d58ff3a6b0168d27f1..1ebd8004469c1d3a70a0afc0f8a34f8400ea5011 100644 (file)
@@ -51,6 +51,7 @@
 #include "xfs_vnodeops.h"
 
 #include <linux/capability.h>
+#include <linux/mount.h>
 #include <linux/writeback.h>
 
 
@@ -670,10 +671,16 @@ start:
        if (new_size > xip->i_size)
                xip->i_new_size = new_size;
 
-       if (likely(!(ioflags & IO_INVIS))) {
+       /*
+        * We're not supposed to change timestamps in readonly-mounted
+        * filesystems.  Throw it away if anyone asks us.
+        */
+       if (likely(!(ioflags & IO_INVIS) &&
+                  !mnt_want_write(file->f_path.mnt))) {
                file_update_time(file);
                xfs_ichgtime_fast(xip, inode,
                                  XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+               mnt_drop_write(file->f_path.mnt);
        }
 
        /*
index 420ccde6b916a31f8d601d46e601a3062bfc658b..149532e162c4f4293decb64e8e73aab339c13521 100644 (file)
@@ -41,8 +41,7 @@ static inline cpumask_t node_to_cpumask(int node)
 
 #define pcibus_to_cpumask(bus) (cpu_online_map)
 
-#else /* CONFIG_NUMA */
-# include <asm-generic/topology.h>
 #endif /* !CONFIG_NUMA */
+# include <asm-generic/topology.h>
 
 #endif /* _ASM_ALPHA_TOPOLOGY_H */
index abe7298742ac23f59ffa65c0e3e51f3638a35b9d..94272435270552b4fe0b266c00796ec1aa4fbd9c 100644 (file)
@@ -5,10 +5,8 @@
 
 #error NUMA not supported yet
 
-#else /* !CONFIG_NUMA */
+#endif /* CONFIG_NUMA */
 
 #include <asm-generic/topology.h>
 
-#endif /* CONFIG_NUMA */
-
 #endif /* _ASM_TOPOLOGY_H */
index 342a2a0105c4c070f988d39afc90f46d52dca12d..a6aea79bca4f369b51f0e384e15bfabcdd7e13c5 100644 (file)
@@ -27,6 +27,8 @@
 #ifndef _ASM_GENERIC_TOPOLOGY_H
 #define _ASM_GENERIC_TOPOLOGY_H
 
+#ifndef        CONFIG_NUMA
+
 /* Other architectures wishing to use this simple topology API should fill
    in the below functions as appropriate in their own <asm/topology.h> file. */
 #ifndef cpu_to_node
                                )
 #endif
 
+#endif /* CONFIG_NUMA */
+
+/* returns pointer to cpumask for specified node */
+#ifndef node_to_cpumask_ptr
+
+#define        node_to_cpumask_ptr(v, node)                                    \
+               cpumask_t _##v = node_to_cpumask(node), *v = &_##v
+
+#define node_to_cpumask_ptr_next(v, node)                              \
+                         _##v = node_to_cpumask(node)
+#endif
+
 #endif /* _ASM_GENERIC_TOPOLOGY_H */
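
node_to_cpumask_ptr() and node_to_cpumask_ptr_next() give callers a
pointer-valued view of a node's cpumask: the generic fallback above hides a
local cpumask_t copy behind the pointer, while an architecture with a real
node_to_cpumask_map (see the x86 topology hunk later in this diff) can point
straight into the map. A sketch of the intended call style (the function is
illustrative, not from the patch):

#include <linux/cpumask.h>
#include <linux/topology.h>

/* Count the online CPUs on a node; node_to_cpumask_ptr() declares both the
 * backing copy (where one is needed) and the pointer 'mask' in a single
 * statement, so large NR_CPUS masks are not passed around by value. */
static int online_cpus_on_node(int node)
{
	int cpu, count = 0;
	node_to_cpumask_ptr(mask, node);

	for_each_cpu_mask(cpu, *mask)
		if (cpu_online(cpu))
			count++;
	return count;
}
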
index 2d67b72b18d07049d5ba0f46fea96369605a39ba..f2f72ef2a8974cc3496dc0fd92435626f708a7e8 100644 (file)
@@ -93,7 +93,7 @@ void build_cpu_to_node_map(void);
        .cache_nice_tries       = 2,                    \
        .busy_idx               = 3,                    \
        .idle_idx               = 2,                    \
-       .newidle_idx            = 0, /* unused */       \
+       .newidle_idx            = 2,                    \
        .wake_idx               = 1,                    \
        .forkexec_idx           = 1,                    \
        .flags                  = SD_LOAD_BALANCE       \
@@ -116,6 +116,11 @@ void build_cpu_to_node_map(void);
 #define smt_capable()                          (smp_num_siblings > 1)
 #endif
 
+#define pcibus_to_cpumask(bus) (pcibus_to_node(bus) == -1 ? \
+                                       CPU_MASK_ALL : \
+                                       node_to_cpumask(pcibus_to_node(bus)) \
+                               )
+
 #include <asm-generic/topology.h>
 
 #endif /* _ASM_IA64_TOPOLOGY_H */
index ca23b681ad058e7f984d9bffff2f1b71af52382e..100c6fbfc587058d889b657e61e2b180ae1b2e13 100644 (file)
@@ -96,11 +96,10 @@ static inline void sysfs_remove_device_from_node(struct sys_device *dev,
 {
 }
 
+#endif /* CONFIG_NUMA */
 
 #include <asm-generic/topology.h>
 
-#endif /* CONFIG_NUMA */
-
 #ifdef CONFIG_SMP
 #include <asm/cputable.h>
 #define smt_capable()          (cpu_has_feature(CPU_FTR_SMT))
index cfda7d5bf0262544ae61e2437b5ed7f297518d9e..121b2ecddfc35d46042677e1304913461937b71d 100644 (file)
@@ -25,7 +25,7 @@ static void __init check_bugs(void)
        case CPU_SH7619:
                *p++ = '2';
                break;
-       case CPU_SH7203 ... CPU_SH7263:
+       case CPU_SH7203 ... CPU_MXG:
                *p++ = '2';
                *p++ = 'a';
                break;
index ec028c649215542843871530de938f8ec2b98bce..da46e67ae26d422449e31f51c28b9ce43b3a8eaf 100644 (file)
 #ifndef __ASM_CPU_SH4_FREQ_H
 #define __ASM_CPU_SH4_FREQ_H
 
-#if defined(CONFIG_CPU_SUBTYPE_SH7722) || defined(CONFIG_CPU_SUBTYPE_SH7366)
+#if defined(CONFIG_CPU_SUBTYPE_SH7722) || \
+    defined(CONFIG_CPU_SUBTYPE_SH7723) || \
+    defined(CONFIG_CPU_SUBTYPE_SH7366)
 #define FRQCR                  0xa4150000
 #define VCLKCR                 0xa4150004
 #define SCLKACR                        0xa4150008
 #define SCLKBCR                        0xa415000c
-#if defined(CONFIG_CPU_SUBTYPE_SH7722)
 #define IrDACLKCR              0xa4150010
-#endif
 #elif defined(CONFIG_CPU_SUBTYPE_SH7763) || \
       defined(CONFIG_CPU_SUBTYPE_SH7780)
 #define        FRQCR                   0xffc80000
index f3d0f53275e491dfed87625fc87b32e7dc9fa485..25b1e6adfe8ca938a410fd7bea8b93f1f6c1aad8 100644 (file)
@@ -1,7 +1,12 @@
 #ifndef __ASM_SH_CPU_SH4_RTC_H
 #define __ASM_SH_CPU_SH4_RTC_H
 
+#ifdef CONFIG_CPU_SUBTYPE_SH7723
+#define rtc_reg_size           sizeof(u16)
+#else
 #define rtc_reg_size           sizeof(u32)
+#endif
+
 #define RTC_BIT_INVERTED       0x40    /* bug on SH7750, SH7750S */
 #define RTC_DEF_CAPABILITIES   RTC_CAP_4_DIGIT_YEAR
 
diff --git a/include/asm-sh/migor.h b/include/asm-sh/migor.h
new file mode 100644 (file)
index 0000000..2329363
--- /dev/null
@@ -0,0 +1,58 @@
+#ifndef __ASM_SH_MIGOR_H
+#define __ASM_SH_MIGOR_H
+
+/*
+ * linux/include/asm-sh/migor.h
+ *
+ * Copyright (C) 2008 Renesas Solutions
+ *
+ * Portions Copyright (C) 2007 Nobuhiro Iwamatsu
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License. See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ */
+#include <asm/addrspace.h>
+
+/* GPIO */
+#define MSTPCR0 0xa4150030
+#define MSTPCR1 0xa4150034
+#define MSTPCR2 0xa4150038
+
+#define PORT_PACR 0xa4050100
+#define PORT_PDCR 0xa4050106
+#define PORT_PECR 0xa4050108
+#define PORT_PHCR 0xa405010e
+#define PORT_PJCR 0xa4050110
+#define PORT_PKCR 0xa4050112
+#define PORT_PLCR 0xa4050114
+#define PORT_PMCR 0xa4050116
+#define PORT_PRCR 0xa405011c
+#define PORT_PWCR 0xa4050146
+#define PORT_PXCR 0xa4050148
+#define PORT_PYCR 0xa405014a
+#define PORT_PZCR 0xa405014c
+#define PORT_PADR 0xa4050120
+#define PORT_PWDR 0xa4050166
+
+#define PORT_HIZCRA 0xa4050158
+#define PORT_HIZCRC 0xa405015c
+
+#define PORT_MSELCRB 0xa4050182
+
+#define MSTPCR1 0xa4150034
+#define MSTPCR2 0xa4150038
+
+#define PORT_PSELA 0xa405014e
+#define PORT_PSELB 0xa4050150
+#define PORT_PSELC 0xa4050152
+#define PORT_PSELD 0xa4050154
+
+#define PORT_HIZCRA 0xa4050158
+#define PORT_HIZCRB 0xa405015a
+#define PORT_HIZCRC 0xa405015c
+
+#define BSC_CS6ABCR 0xfec1001c
+
+#endif /* __ASM_SH_MIGOR_H */
index ec707b98e5b91a88e9e21a6aa076490ae1ba6987..b7c7ce80f03e110007c75e2a1b32e4b2d2379cda 100644 (file)
@@ -16,7 +16,7 @@ enum cpu_type {
        CPU_SH7619,
 
        /* SH-2A types */
-       CPU_SH7203, CPU_SH7206, CPU_SH7263,
+       CPU_SH7203, CPU_SH7206, CPU_SH7263, CPU_MXG,
 
        /* SH-3 types */
        CPU_SH7705, CPU_SH7706, CPU_SH7707,
@@ -29,7 +29,8 @@ enum cpu_type {
        CPU_SH7760, CPU_SH4_202, CPU_SH4_501,
 
        /* SH-4A types */
-       CPU_SH7763, CPU_SH7770, CPU_SH7780, CPU_SH7781, CPU_SH7785, CPU_SHX3,
+       CPU_SH7763, CPU_SH7770, CPU_SH7780, CPU_SH7781, CPU_SH7785,
+       CPU_SH7723, CPU_SHX3,
 
        /* SH4AL-DSP types */
        CPU_SH7343, CPU_SH7722, CPU_SH7366,
index 1770460a4616e6a7980dc5f142a6fcb1066b8b71..a33838f23a6d879b1b443554f66c36b1d6e685a8 100644 (file)
 #define PA_SCSPTR1      (PA_BCR+0x0524) /* SCIF1 Serial Port control */
 #define PA_SCLSR1       (PA_BCR+0x0528) /* SCIF1 Line Status control */
 #define PA_SCRER1       (PA_BCR+0x052c) /* SCIF1 Serial Error control */
-#define PA_ICCR         (PA_BCR+0x0600) /* Serial control */
-#define PA_SAR          (PA_BCR+0x0602) /* Serial Slave control */
-#define PA_MDR          (PA_BCR+0x0604) /* Serial Mode control */
-#define PA_ADR1         (PA_BCR+0x0606) /* Serial Address1 control */
-#define PA_DAR1         (PA_BCR+0x0646) /* Serial Data1 control */
+#define PA_SMCR         (PA_BCR+0x0600) /* 2-wire Serial control */
+#define PA_SMSMADR      (PA_BCR+0x0602) /* 2-wire Serial Slave control */
+#define PA_SMMR         (PA_BCR+0x0604) /* 2-wire Serial Mode control */
+#define PA_SMSADR1      (PA_BCR+0x0606) /* 2-wire Serial Address1 control */
+#define PA_SMTRDR1      (PA_BCR+0x0646) /* 2-wire Serial Data1 control */
 #define PA_VERREG       (PA_BCR+0x0700) /* FPGA Version Register */
 #define PA_POFF         (PA_BCR+0x0800) /* System Power Off control */
 #define PA_PMR          (PA_BCR+0x0900) /*  */
 #define PA_SCFCR       (PA_BCR+0x040c) /* SCIF FIFO control */
 #define PA_SCFDR       (PA_BCR+0x040e) /* SCIF FIFO data control */
 #define PA_SCLSR       (PA_BCR+0x0412) /* SCIF Line Status control */
-#define PA_ICCR                (PA_BCR+0x0500) /* Serial control */
-#define PA_SAR         (PA_BCR+0x0502) /* Serial Slave control */
-#define PA_MDR         (PA_BCR+0x0504) /* Serial Mode control */
-#define PA_ADR1                (PA_BCR+0x0506) /* Serial Address1 control */
-#define PA_DAR1                (PA_BCR+0x0546) /* Serial Data1 control */
+#define PA_SMCR                (PA_BCR+0x0500) /* 2-wire Serial control */
+#define PA_SMSMADR     (PA_BCR+0x0502) /* 2-wire Serial Slave control */
+#define PA_SMMR                (PA_BCR+0x0504) /* 2-wire Serial Mode control */
+#define PA_SMSADR1     (PA_BCR+0x0506) /* 2-wire Serial Address1 control */
+#define PA_SMTRDR1     (PA_BCR+0x0546) /* 2-wire Serial Data1 control */
 #define PA_VERREG      (PA_BCR+0x0600) /* FPGA Version Register */
 
 #define PA_AX88796L    0xa5800400      /* AX88796L Area */
 #define IRQ_TP                 (HL_FPGA_IRQ_BASE + 12)
 #define IRQ_RTC                        (HL_FPGA_IRQ_BASE + 13)
 #define IRQ_TH_ALERT           (HL_FPGA_IRQ_BASE + 14)
+#define IRQ_SCIF0              (HL_FPGA_IRQ_BASE + 15)
+#define IRQ_SCIF1              (HL_FPGA_IRQ_BASE + 16)
 
 unsigned char *highlander_init_irq_r7780mp(void);
 unsigned char *highlander_init_irq_r7780rp(void);
diff --git a/include/asm-sh/se7721.h b/include/asm-sh/se7721.h
new file mode 100644 (file)
index 0000000..b957f60
--- /dev/null
@@ -0,0 +1,70 @@
+/*
+ * Copyright (C) 2008 Renesas Solutions Corp.
+ *
+ * Hitachi UL SolutionEngine 7721 Support.
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ */
+
+#ifndef __ASM_SH_SE7721_H
+#define __ASM_SH_SE7721_H
+#include <asm/addrspace.h>
+
+/* Box specific addresses. */
+#define SE_AREA0_WIDTH 2               /* Area0: 32bit */
+#define PA_ROM         0xa0000000      /* EPROM */
+#define PA_ROM_SIZE    0x00200000      /* EPROM size 2M byte */
+#define PA_FROM                0xa1000000      /* Flash-ROM */
+#define PA_FROM_SIZE   0x01000000      /* Flash-ROM size 16M byte */
+#define PA_EXT1                0xa4000000
+#define PA_EXT1_SIZE   0x04000000
+#define PA_SDRAM       0xaC000000      /* SDRAM(Area3) 64MB */
+#define PA_SDRAM_SIZE  0x04000000
+
+#define PA_EXT4                0xb0000000
+#define PA_EXT4_SIZE   0x04000000
+
+#define PA_PERIPHERAL  0xB8000000
+
+#define PA_PCIC                PA_PERIPHERAL
+#define PA_MRSHPC      (PA_PERIPHERAL + 0x003fffe0)
+#define PA_MRSHPC_MW1  (PA_PERIPHERAL + 0x00400000)
+#define PA_MRSHPC_MW2  (PA_PERIPHERAL + 0x00500000)
+#define PA_MRSHPC_IO   (PA_PERIPHERAL + 0x00600000)
+#define MRSHPC_OPTION  (PA_MRSHPC + 6)
+#define MRSHPC_CSR     (PA_MRSHPC + 8)
+#define MRSHPC_ISR     (PA_MRSHPC + 10)
+#define MRSHPC_ICR     (PA_MRSHPC + 12)
+#define MRSHPC_CPWCR   (PA_MRSHPC + 14)
+#define MRSHPC_MW0CR1  (PA_MRSHPC + 16)
+#define MRSHPC_MW1CR1  (PA_MRSHPC + 18)
+#define MRSHPC_IOWCR1  (PA_MRSHPC + 20)
+#define MRSHPC_MW0CR2  (PA_MRSHPC + 22)
+#define MRSHPC_MW1CR2  (PA_MRSHPC + 24)
+#define MRSHPC_IOWCR2  (PA_MRSHPC + 26)
+#define MRSHPC_CDCR    (PA_MRSHPC + 28)
+#define MRSHPC_PCIC_INFO       (PA_MRSHPC + 30)
+
+#define PA_LED         0xB6800000      /* 8bit LED */
+#define PA_FPGA                0xB7000000      /* FPGA base address */
+
+#define MRSHPC_IRQ0    10
+
+#define FPGA_ILSR1     (PA_FPGA + 0x02)
+#define FPGA_ILSR2     (PA_FPGA + 0x03)
+#define FPGA_ILSR3     (PA_FPGA + 0x04)
+#define FPGA_ILSR4     (PA_FPGA + 0x05)
+#define FPGA_ILSR5     (PA_FPGA + 0x06)
+#define FPGA_ILSR6     (PA_FPGA + 0x07)
+#define FPGA_ILSR7     (PA_FPGA + 0x08)
+#define FPGA_ILSR8     (PA_FPGA + 0x09)
+
+void init_se7721_IRQ(void);
+
+#define __IO_PREFIX            se7721
+#include <asm/io_generic.h>
+
+#endif  /* __ASM_SH_SE7721_H */
index e0e89fcb8388846ea3b228485bba339e622b051d..3690fe5857a4714d38f6cc1592cd820369bab01b 100644 (file)
@@ -77,6 +77,8 @@
 #define PORT_PSELA      0xA405014EUL
 #define PORT_PYCR       0xA405014AUL
 #define PORT_PZCR       0xA405014CUL
+#define PORT_HIZCRA     0xA4050158UL
+#define PORT_HIZCRC     0xA405015CUL
 
 /* IRQ */
 #define IRQ0_IRQ        32
diff --git a/include/asm-sh/sh_keysc.h b/include/asm-sh/sh_keysc.h
new file mode 100644 (file)
index 0000000..b5a4dd5
--- /dev/null
@@ -0,0 +1,13 @@
+#ifndef __ASM_KEYSC_H__
+#define __ASM_KEYSC_H__
+
+#define SH_KEYSC_MAXKEYS 30
+
+struct sh_keysc_info {
+       enum { SH_KEYSC_MODE_1, SH_KEYSC_MODE_2, SH_KEYSC_MODE_3 } mode;
+       int scan_timing; /* 0 -> 7, see KYCR1, SCN[2:0] */
+       int delay;
+       int keycodes[SH_KEYSC_MAXKEYS];
+};
+
+#endif /* __ASM_KEYSC_H__ */
index 5145aa2a0ce9de652304a7ed5b69ab9682797b34..e65b6b822cb3722731cf5d4d5bb21673050ac9bb 100644 (file)
@@ -146,6 +146,8 @@ extern unsigned int instruction_size(unsigned int insn);
 
 extern unsigned long cached_to_uncached;
 
+extern struct dentry *sh_debugfs_root;
+
 /* XXX
  * disable hlt during certain critical i/o operations
  */
index f402a3b1cfa48fd4b0d626d80ef3837305c4f41a..34cdb28e8f4429c2deb7ed7e96ec9e07c50bbd98 100644 (file)
@@ -16,7 +16,7 @@
        .cache_nice_tries       = 2,                    \
        .busy_idx               = 3,                    \
        .idle_idx               = 2,                    \
-       .newidle_idx            = 0,                    \
+       .newidle_idx            = 2,                    \
        .wake_idx               = 1,                    \
        .forkexec_idx           = 1,                    \
        .flags                  = SD_LOAD_BALANCE       \
index c0318b60889398e60c95d5632f9b10d071fbe72f..1e41fda74bd38a0ae459ba884b598b9ed78425ce 100644 (file)
@@ -55,13 +55,10 @@ static inline void set_fs(mm_segment_t s)
  * If we don't have an MMU (or if its disabled) the only thing we really have
  * to look out for is if the address resides somewhere outside of what
  * available RAM we have.
- *
- * TODO: This check could probably also stand to be restricted somewhat more..
- * though it still does the Right Thing(tm) for the time being.
  */
 static inline int __access_ok(unsigned long addr, unsigned long size)
 {
-       return ((addr >= memory_start) && ((addr + size) < memory_end));
+       return 1;
 }
 #else /* CONFIG_MMU */
 #define __addr_ok(addr) \
index ed8affbf96cb804f3a97ec46d610fdec19345a2b..2faed7ecb092a7893c89c573b7ac9870e060d8a7 100644 (file)
                                + (CONFIG_PHYSICAL_ALIGN - 1)) \
                                & ~(CONFIG_PHYSICAL_ALIGN - 1))
 
+#ifdef CONFIG_X86_64
+#define BOOT_HEAP_SIZE 0x7000
+#define BOOT_STACK_SIZE        0x4000
+#else
+#define BOOT_HEAP_SIZE 0x4000
+#define BOOT_STACK_SIZE        0x1000
+#endif
+
 #endif /* _ASM_BOOT_H */
index 58f790f4df5253fe80c3a9ed34b2b1d4bb91b4ef..a1a4dc7fe6ece75087cc33e33d3c60e2477cd586 100644 (file)
@@ -1,5 +1,237 @@
+#ifndef _ASM_DMA_MAPPING_H_
+#define _ASM_DMA_MAPPING_H_
+
+/*
+ * IOMMU interface. See Documentation/DMA-mapping.txt and DMA-API.txt for
+ * documentation.
+ */
+
+#include <linux/scatterlist.h>
+#include <asm/io.h>
+#include <asm/swiotlb.h>
+
+extern dma_addr_t bad_dma_address;
+extern int iommu_merge;
+extern struct device fallback_dev;
+extern int panic_on_overflow;
+extern int forbid_dac;
+extern int force_iommu;
+
+struct dma_mapping_ops {
+       int             (*mapping_error)(dma_addr_t dma_addr);
+       void*           (*alloc_coherent)(struct device *dev, size_t size,
+                               dma_addr_t *dma_handle, gfp_t gfp);
+       void            (*free_coherent)(struct device *dev, size_t size,
+                               void *vaddr, dma_addr_t dma_handle);
+       dma_addr_t      (*map_single)(struct device *hwdev, phys_addr_t ptr,
+                               size_t size, int direction);
+       /* like map_single, but doesn't check the device mask */
+       dma_addr_t      (*map_simple)(struct device *hwdev, phys_addr_t ptr,
+                               size_t size, int direction);
+       void            (*unmap_single)(struct device *dev, dma_addr_t addr,
+                               size_t size, int direction);
+       void            (*sync_single_for_cpu)(struct device *hwdev,
+                               dma_addr_t dma_handle, size_t size,
+                               int direction);
+       void            (*sync_single_for_device)(struct device *hwdev,
+                               dma_addr_t dma_handle, size_t size,
+                               int direction);
+       void            (*sync_single_range_for_cpu)(struct device *hwdev,
+                               dma_addr_t dma_handle, unsigned long offset,
+                               size_t size, int direction);
+       void            (*sync_single_range_for_device)(struct device *hwdev,
+                               dma_addr_t dma_handle, unsigned long offset,
+                               size_t size, int direction);
+       void            (*sync_sg_for_cpu)(struct device *hwdev,
+                               struct scatterlist *sg, int nelems,
+                               int direction);
+       void            (*sync_sg_for_device)(struct device *hwdev,
+                               struct scatterlist *sg, int nelems,
+                               int direction);
+       int             (*map_sg)(struct device *hwdev, struct scatterlist *sg,
+                               int nents, int direction);
+       void            (*unmap_sg)(struct device *hwdev,
+                               struct scatterlist *sg, int nents,
+                               int direction);
+       int             (*dma_supported)(struct device *hwdev, u64 mask);
+       int             is_phys;
+};
+
+extern const struct dma_mapping_ops *dma_ops;
+
+static inline int dma_mapping_error(dma_addr_t dma_addr)
+{
+       if (dma_ops->mapping_error)
+               return dma_ops->mapping_error(dma_addr);
+
+       return (dma_addr == bad_dma_address);
+}
+
+#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
+#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
+
+void *dma_alloc_coherent(struct device *dev, size_t size,
+                          dma_addr_t *dma_handle, gfp_t flag);
+
+void dma_free_coherent(struct device *dev, size_t size,
+                        void *vaddr, dma_addr_t dma_handle);
+
+
+extern int dma_supported(struct device *hwdev, u64 mask);
+extern int dma_set_mask(struct device *dev, u64 mask);
+
+static inline dma_addr_t
+dma_map_single(struct device *hwdev, void *ptr, size_t size,
+              int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       return dma_ops->map_single(hwdev, virt_to_phys(ptr), size, direction);
+}
+
+static inline void
+dma_unmap_single(struct device *dev, dma_addr_t addr, size_t size,
+                int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       if (dma_ops->unmap_single)
+               dma_ops->unmap_single(dev, addr, size, direction);
+}
+
+static inline int
+dma_map_sg(struct device *hwdev, struct scatterlist *sg,
+          int nents, int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       return dma_ops->map_sg(hwdev, sg, nents, direction);
+}
+
+static inline void
+dma_unmap_sg(struct device *hwdev, struct scatterlist *sg, int nents,
+            int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       if (dma_ops->unmap_sg)
+               dma_ops->unmap_sg(hwdev, sg, nents, direction);
+}
+
+static inline void
+dma_sync_single_for_cpu(struct device *hwdev, dma_addr_t dma_handle,
+                       size_t size, int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       if (dma_ops->sync_single_for_cpu)
+               dma_ops->sync_single_for_cpu(hwdev, dma_handle, size,
+                                            direction);
+       flush_write_buffers();
+}
+
+static inline void
+dma_sync_single_for_device(struct device *hwdev, dma_addr_t dma_handle,
+                          size_t size, int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       if (dma_ops->sync_single_for_device)
+               dma_ops->sync_single_for_device(hwdev, dma_handle, size,
+                                               direction);
+       flush_write_buffers();
+}
+
+static inline void
+dma_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dma_handle,
+                             unsigned long offset, size_t size, int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       if (dma_ops->sync_single_range_for_cpu)
+               dma_ops->sync_single_range_for_cpu(hwdev, dma_handle, offset,
+                                                  size, direction);
+
+       flush_write_buffers();
+}
+
+static inline void
+dma_sync_single_range_for_device(struct device *hwdev, dma_addr_t dma_handle,
+                                unsigned long offset, size_t size,
+                                int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       if (dma_ops->sync_single_range_for_device)
+               dma_ops->sync_single_range_for_device(hwdev, dma_handle,
+                                                     offset, size, direction);
+
+       flush_write_buffers();
+}
+
+static inline void
+dma_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg,
+                   int nelems, int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       if (dma_ops->sync_sg_for_cpu)
+               dma_ops->sync_sg_for_cpu(hwdev, sg, nelems, direction);
+       flush_write_buffers();
+}
+
+static inline void
+dma_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
+                      int nelems, int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       if (dma_ops->sync_sg_for_device)
+               dma_ops->sync_sg_for_device(hwdev, sg, nelems, direction);
+
+       flush_write_buffers();
+}
+
+static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
+                                     size_t offset, size_t size,
+                                     int direction)
+{
+       BUG_ON(!valid_dma_direction(direction));
+       return dma_ops->map_single(dev, page_to_phys(page)+offset,
+                                  size, direction);
+}
+
+static inline void dma_unmap_page(struct device *dev, dma_addr_t addr,
+                                 size_t size, int direction)
+{
+       dma_unmap_single(dev, addr, size, direction);
+}
+
+static inline void
+dma_cache_sync(struct device *dev, void *vaddr, size_t size,
+       enum dma_data_direction dir)
+{
+       flush_write_buffers();
+}
+
+static inline int dma_get_cache_alignment(void)
+{
+       /* no easy way to get cache size on all x86, so return the
+        * maximum possible, to be safe */
+       return boot_cpu_data.x86_clflush_size;
+}
+
+#define dma_is_consistent(d, h)        (1)
+
 #ifdef CONFIG_X86_32
-# include "dma-mapping_32.h"
-#else
-# include "dma-mapping_64.h"
+#  define ARCH_HAS_DMA_DECLARE_COHERENT_MEMORY
+struct dma_coherent_mem {
+       void            *virt_base;
+       u32             device_base;
+       int             size;
+       int             flags;
+       unsigned long   *bitmap;
+};
+
+extern int
+dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr,
+                           dma_addr_t device_addr, size_t size, int flags);
+
+extern void
+dma_release_declared_memory(struct device *dev);
+
+extern void *
+dma_mark_declared_memory_occupied(struct device *dev,
+                                 dma_addr_t device_addr, size_t size);
+#endif /* CONFIG_X86_32 */
 #endif
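
With the separate 32-bit and 64-bit variants folded into this one header,
drivers see a single dma_mapping_ops-backed API on both configurations. A
minimal streaming-mapping sketch against the declarations above (the device
and buffer handling are illustrative, not part of the patch):

#include <linux/dma-mapping.h>
#include <linux/device.h>

/* Map a kernel buffer for a device transfer of 'len' bytes, check the
 * mapping, and unmap it once the hardware is done with it. */
static int stream_buffer_to_device(struct device *dev, void *buf, size_t len)
{
	dma_addr_t handle;

	handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	if (dma_mapping_error(handle))	/* single-argument form, as declared above */
		return -EIO;

	/* ... program the device with 'handle' and wait for completion ... */

	dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);
	return 0;
}
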
diff --git a/include/asm-x86/dma-mapping_32.h b/include/asm-x86/dma-mapping_32.h
deleted file mode 100644 (file)
index 55f01bd..0000000
+++ /dev/null
@@ -1,187 +0,0 @@
-#ifndef _ASM_I386_DMA_MAPPING_H
-#define _ASM_I386_DMA_MAPPING_H
-
-#include <linux/mm.h>
-#include <linux/scatterlist.h>
-
-#include <asm/cache.h>
-#include <asm/io.h>
-#include <asm/bug.h>
-
-#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
-#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
-
-void *dma_alloc_coherent(struct device *dev, size_t size,
-                          dma_addr_t *dma_handle, gfp_t flag);
-
-void dma_free_coherent(struct device *dev, size_t size,
-                        void *vaddr, dma_addr_t dma_handle);
-
-static inline dma_addr_t
-dma_map_single(struct device *dev, void *ptr, size_t size,
-              enum dma_data_direction direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       WARN_ON(size == 0);
-       flush_write_buffers();
-       return virt_to_phys(ptr);
-}
-
-static inline void
-dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
-                enum dma_data_direction direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-}
-
-static inline int
-dma_map_sg(struct device *dev, struct scatterlist *sglist, int nents,
-          enum dma_data_direction direction)
-{
-       struct scatterlist *sg;
-       int i;
-
-       BUG_ON(!valid_dma_direction(direction));
-       WARN_ON(nents == 0 || sglist[0].length == 0);
-
-       for_each_sg(sglist, sg, nents, i) {
-               BUG_ON(!sg_page(sg));
-
-               sg->dma_address = sg_phys(sg);
-       }
-
-       flush_write_buffers();
-       return nents;
-}
-
-static inline dma_addr_t
-dma_map_page(struct device *dev, struct page *page, unsigned long offset,
-            size_t size, enum dma_data_direction direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       return page_to_phys(page) + offset;
-}
-
-static inline void
-dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
-              enum dma_data_direction direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-}
-
-
-static inline void
-dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
-            enum dma_data_direction direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-}
-
-static inline void
-dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, size_t size,
-                       enum dma_data_direction direction)
-{
-}
-
-static inline void
-dma_sync_single_for_device(struct device *dev, dma_addr_t dma_handle, size_t size,
-                       enum dma_data_direction direction)
-{
-       flush_write_buffers();
-}
-
-static inline void
-dma_sync_single_range_for_cpu(struct device *dev, dma_addr_t dma_handle,
-                             unsigned long offset, size_t size,
-                             enum dma_data_direction direction)
-{
-}
-
-static inline void
-dma_sync_single_range_for_device(struct device *dev, dma_addr_t dma_handle,
-                                unsigned long offset, size_t size,
-                                enum dma_data_direction direction)
-{
-       flush_write_buffers();
-}
-
-static inline void
-dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, int nelems,
-                   enum dma_data_direction direction)
-{
-}
-
-static inline void
-dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg, int nelems,
-                   enum dma_data_direction direction)
-{
-       flush_write_buffers();
-}
-
-static inline int
-dma_mapping_error(dma_addr_t dma_addr)
-{
-       return 0;
-}
-
-extern int forbid_dac;
-
-static inline int
-dma_supported(struct device *dev, u64 mask)
-{
-        /*
-         * we fall back to GFP_DMA when the mask isn't all 1s,
-         * so we can't guarantee allocations that must be
-         * within a tighter range than GFP_DMA..
-         */
-        if(mask < 0x00ffffff)
-                return 0;
-
-       /* Work around chipset bugs */
-       if (forbid_dac > 0 && mask > 0xffffffffULL)
-               return 0;
-
-       return 1;
-}
-
-static inline int
-dma_set_mask(struct device *dev, u64 mask)
-{
-       if(!dev->dma_mask || !dma_supported(dev, mask))
-               return -EIO;
-
-       *dev->dma_mask = mask;
-
-       return 0;
-}
-
-static inline int
-dma_get_cache_alignment(void)
-{
-       /* no easy way to get cache size on all x86, so return the
-        * maximum possible, to be safe */
-       return (1 << INTERNODE_CACHE_SHIFT);
-}
-
-#define dma_is_consistent(d, h)        (1)
-
-static inline void
-dma_cache_sync(struct device *dev, void *vaddr, size_t size,
-              enum dma_data_direction direction)
-{
-       flush_write_buffers();
-}
-
-#define ARCH_HAS_DMA_DECLARE_COHERENT_MEMORY
-extern int
-dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr,
-                           dma_addr_t device_addr, size_t size, int flags);
-
-extern void
-dma_release_declared_memory(struct device *dev);
-
-extern void *
-dma_mark_declared_memory_occupied(struct device *dev,
-                                 dma_addr_t device_addr, size_t size);
-
-#endif
diff --git a/include/asm-x86/dma-mapping_64.h b/include/asm-x86/dma-mapping_64.h
deleted file mode 100644 (file)
index ecd0f61..0000000
+++ /dev/null
@@ -1,202 +0,0 @@
-#ifndef _X8664_DMA_MAPPING_H
-#define _X8664_DMA_MAPPING_H 1
-
-/*
- * IOMMU interface. See Documentation/DMA-mapping.txt and DMA-API.txt for
- * documentation.
- */
-
-#include <linux/scatterlist.h>
-#include <asm/io.h>
-#include <asm/swiotlb.h>
-
-struct dma_mapping_ops {
-       int             (*mapping_error)(dma_addr_t dma_addr);
-       void*           (*alloc_coherent)(struct device *dev, size_t size,
-                                dma_addr_t *dma_handle, gfp_t gfp);
-       void            (*free_coherent)(struct device *dev, size_t size,
-                                void *vaddr, dma_addr_t dma_handle);
-       dma_addr_t      (*map_single)(struct device *hwdev, void *ptr,
-                                size_t size, int direction);
-       /* like map_single, but doesn't check the device mask */
-       dma_addr_t      (*map_simple)(struct device *hwdev, char *ptr,
-                                size_t size, int direction);
-       void            (*unmap_single)(struct device *dev, dma_addr_t addr,
-                               size_t size, int direction);
-       void            (*sync_single_for_cpu)(struct device *hwdev,
-                               dma_addr_t dma_handle, size_t size,
-                               int direction);
-       void            (*sync_single_for_device)(struct device *hwdev,
-                                dma_addr_t dma_handle, size_t size,
-                               int direction);
-       void            (*sync_single_range_for_cpu)(struct device *hwdev,
-                                dma_addr_t dma_handle, unsigned long offset,
-                               size_t size, int direction);
-       void            (*sync_single_range_for_device)(struct device *hwdev,
-                               dma_addr_t dma_handle, unsigned long offset,
-                               size_t size, int direction);
-       void            (*sync_sg_for_cpu)(struct device *hwdev,
-                                struct scatterlist *sg, int nelems,
-                               int direction);
-       void            (*sync_sg_for_device)(struct device *hwdev,
-                               struct scatterlist *sg, int nelems,
-                               int direction);
-       int             (*map_sg)(struct device *hwdev, struct scatterlist *sg,
-                               int nents, int direction);
-       void            (*unmap_sg)(struct device *hwdev,
-                               struct scatterlist *sg, int nents,
-                               int direction);
-       int             (*dma_supported)(struct device *hwdev, u64 mask);
-       int             is_phys;
-};
-
-extern dma_addr_t bad_dma_address;
-extern const struct dma_mapping_ops* dma_ops;
-extern int iommu_merge;
-
-static inline int dma_mapping_error(dma_addr_t dma_addr)
-{
-       if (dma_ops->mapping_error)
-               return dma_ops->mapping_error(dma_addr);
-
-       return (dma_addr == bad_dma_address);
-}
-
-#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
-#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
-
-#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
-#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
-
-extern void *dma_alloc_coherent(struct device *dev, size_t size,
-                               dma_addr_t *dma_handle, gfp_t gfp);
-extern void dma_free_coherent(struct device *dev, size_t size, void *vaddr,
-                             dma_addr_t dma_handle);
-
-static inline dma_addr_t
-dma_map_single(struct device *hwdev, void *ptr, size_t size,
-              int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       return dma_ops->map_single(hwdev, ptr, size, direction);
-}
-
-static inline void
-dma_unmap_single(struct device *dev, dma_addr_t addr,size_t size,
-                int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       dma_ops->unmap_single(dev, addr, size, direction);
-}
-
-#define dma_map_page(dev,page,offset,size,dir) \
-       dma_map_single((dev), page_address(page)+(offset), (size), (dir))
-
-#define dma_unmap_page dma_unmap_single
-
-static inline void
-dma_sync_single_for_cpu(struct device *hwdev, dma_addr_t dma_handle,
-                       size_t size, int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       if (dma_ops->sync_single_for_cpu)
-               dma_ops->sync_single_for_cpu(hwdev, dma_handle, size,
-                                            direction);
-       flush_write_buffers();
-}
-
-static inline void
-dma_sync_single_for_device(struct device *hwdev, dma_addr_t dma_handle,
-                          size_t size, int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       if (dma_ops->sync_single_for_device)
-               dma_ops->sync_single_for_device(hwdev, dma_handle, size,
-                                               direction);
-       flush_write_buffers();
-}
-
-static inline void
-dma_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dma_handle,
-                             unsigned long offset, size_t size, int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       if (dma_ops->sync_single_range_for_cpu) {
-               dma_ops->sync_single_range_for_cpu(hwdev, dma_handle, offset, size, direction);
-       }
-
-       flush_write_buffers();
-}
-
-static inline void
-dma_sync_single_range_for_device(struct device *hwdev, dma_addr_t dma_handle,
-                                unsigned long offset, size_t size, int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       if (dma_ops->sync_single_range_for_device)
-               dma_ops->sync_single_range_for_device(hwdev, dma_handle,
-                                                     offset, size, direction);
-
-       flush_write_buffers();
-}
-
-static inline void
-dma_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg,
-                   int nelems, int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       if (dma_ops->sync_sg_for_cpu)
-               dma_ops->sync_sg_for_cpu(hwdev, sg, nelems, direction);
-       flush_write_buffers();
-}
-
-static inline void
-dma_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
-                      int nelems, int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       if (dma_ops->sync_sg_for_device) {
-               dma_ops->sync_sg_for_device(hwdev, sg, nelems, direction);
-       }
-
-       flush_write_buffers();
-}
-
-static inline int
-dma_map_sg(struct device *hwdev, struct scatterlist *sg, int nents, int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       return dma_ops->map_sg(hwdev, sg, nents, direction);
-}
-
-static inline void
-dma_unmap_sg(struct device *hwdev, struct scatterlist *sg, int nents,
-            int direction)
-{
-       BUG_ON(!valid_dma_direction(direction));
-       dma_ops->unmap_sg(hwdev, sg, nents, direction);
-}
-
-extern int dma_supported(struct device *hwdev, u64 mask);
-
-/* same for gart, swiotlb, and nommu */
-static inline int dma_get_cache_alignment(void)
-{
-       return boot_cpu_data.x86_clflush_size;
-}
-
-#define dma_is_consistent(d, h) 1
-
-extern int dma_set_mask(struct device *dev, u64 mask);
-
-static inline void
-dma_cache_sync(struct device *dev, void *vaddr, size_t size,
-       enum dma_data_direction dir)
-{
-       flush_write_buffers();
-}
-
-extern struct device fallback_dev;
-extern int panic_on_overflow;
-
-#endif /* _X8664_DMA_MAPPING_H */
index 43b1a8bd4b349c15694cd4d490e793b8b40c340e..a9f7c6ec32bf7ecad8041a7db1af54c4f2a47393 100644 (file)
@@ -24,7 +24,7 @@ extern void update_e820(void);
 extern int e820_all_mapped(unsigned long start, unsigned long end,
                           unsigned type);
 extern int e820_any_mapped(u64 start, u64 end, unsigned type);
-extern void find_max_pfn(void);
+extern void propagate_e820_map(void);
 extern void register_bootmem_low_pages(unsigned long max_low_pfn);
 extern void add_memory_region(unsigned long long start,
                              unsigned long long size, int type);
index f1b96932746be9ec9de3fd6b37d3f6b49adaf9e3..b02ea6e17de8b6a097c722f28d2c726f5e39514d 100644 (file)
@@ -117,6 +117,7 @@ extern struct genapic *genapic;
 enum uv_system_type {UV_NONE, UV_LEGACY_APIC, UV_X2APIC, UV_NON_UNIQUE_APIC};
 #define get_uv_system_type()           UV_NONE
 #define is_uv_system()                 0
+#define uv_wakeup_secondary(a, b)      1
 
 
 #endif
index 54522b814f1c796e36f30c65632b30a35eb8d1a8..da2adb45f6e3949476ac0a4d2c1aa1cbb2024986 100644 (file)
@@ -21,8 +21,9 @@
 
 extern void fpu_init(void);
 extern void mxcsr_feature_mask_init(void);
-extern void init_fpu(struct task_struct *child);
+extern int init_fpu(struct task_struct *child);
 extern asmlinkage void math_state_restore(void);
+extern void init_thread_xstate(void);
 
 extern user_regset_active_fn fpregs_active, xfpregs_active;
 extern user_regset_get_fn fpregs_get, xfpregs_get, fpregs_soft_get;
@@ -117,24 +118,22 @@ static inline void __save_init_fpu(struct task_struct *tsk)
        /* Using "fxsaveq %0" would be the ideal choice, but is only supported
           starting with gas 2.16. */
        __asm__ __volatile__("fxsaveq %0"
-                            : "=m" (tsk->thread.i387.fxsave));
+                            : "=m" (tsk->thread.xstate->fxsave));
 #elif 0
        /* Using, as a workaround, the properly prefixed form below isn't
           accepted by any binutils version so far released, complaining that
           the same type of prefix is used twice if an extended register is
           needed for addressing (fix submitted to mainline 2005-11-21). */
        __asm__ __volatile__("rex64/fxsave %0"
-                            : "=m" (tsk->thread.i387.fxsave));
+                            : "=m" (tsk->thread.xstate->fxsave));
 #else
        /* This, however, we can work around by forcing the compiler to select
           an addressing mode that doesn't require extended registers. */
-       __asm__ __volatile__("rex64/fxsave %P2(%1)"
-                            : "=m" (tsk->thread.i387.fxsave)
-                            : "cdaSDb" (tsk),
-                               "i" (offsetof(__typeof__(*tsk),
-                                             thread.i387.fxsave)));
+       __asm__ __volatile__("rex64/fxsave (%1)"
+                            : "=m" (tsk->thread.xstate->fxsave)
+                            : "cdaSDb" (&tsk->thread.xstate->fxsave));
 #endif
-       clear_fpu_state(&tsk->thread.i387.fxsave);
+       clear_fpu_state(&tsk->thread.xstate->fxsave);
        task_thread_info(tsk)->status &= ~TS_USEDFPU;
 }
 
@@ -148,7 +147,7 @@ static inline int save_i387(struct _fpstate __user *buf)
        int err = 0;
 
        BUILD_BUG_ON(sizeof(struct user_i387_struct) !=
-                       sizeof(tsk->thread.i387.fxsave));
+                       sizeof(tsk->thread.xstate->fxsave));
 
        if ((unsigned long)buf % 16)
                printk("save_i387: bad fpstate %p\n", buf);
@@ -164,7 +163,7 @@ static inline int save_i387(struct _fpstate __user *buf)
                task_thread_info(tsk)->status &= ~TS_USEDFPU;
                stts();
        } else {
-               if (__copy_to_user(buf, &tsk->thread.i387.fxsave,
+               if (__copy_to_user(buf, &tsk->thread.xstate->fxsave,
                                   sizeof(struct i387_fxsave_struct)))
                        return -1;
        }
@@ -201,7 +200,7 @@ static inline void restore_fpu(struct task_struct *tsk)
                "nop ; frstor %1",
                "fxrstor %1",
                X86_FEATURE_FXSR,
-               "m" ((tsk)->thread.i387.fxsave));
+               "m" (tsk->thread.xstate->fxsave));
 }
 
 /* We need a safe address that is cheap to find and that is already
@@ -225,8 +224,8 @@ static inline void __save_init_fpu(struct task_struct *tsk)
                "fxsave %[fx]\n"
                "bt $7,%[fsw] ; jnc 1f ; fnclex\n1:",
                X86_FEATURE_FXSR,
-               [fx] "m" (tsk->thread.i387.fxsave),
-               [fsw] "m" (tsk->thread.i387.fxsave.swd) : "memory");
+               [fx] "m" (tsk->thread.xstate->fxsave),
+               [fsw] "m" (tsk->thread.xstate->fxsave.swd) : "memory");
        /* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception
           is pending.  Clear the x87 state here by setting it to fixed
           values. safe_address is a random variable that should be in L1 */
@@ -327,25 +326,25 @@ static inline void clear_fpu(struct task_struct *tsk)
 static inline unsigned short get_fpu_cwd(struct task_struct *tsk)
 {
        if (cpu_has_fxsr) {
-               return tsk->thread.i387.fxsave.cwd;
+               return tsk->thread.xstate->fxsave.cwd;
        } else {
-               return (unsigned short)tsk->thread.i387.fsave.cwd;
+               return (unsigned short)tsk->thread.xstate->fsave.cwd;
        }
 }
 
 static inline unsigned short get_fpu_swd(struct task_struct *tsk)
 {
        if (cpu_has_fxsr) {
-               return tsk->thread.i387.fxsave.swd;
+               return tsk->thread.xstate->fxsave.swd;
        } else {
-               return (unsigned short)tsk->thread.i387.fsave.swd;
+               return (unsigned short)tsk->thread.xstate->fsave.swd;
        }
 }
 
 static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
 {
        if (cpu_has_xmm) {
-               return tsk->thread.i387.fxsave.mxcsr;
+               return tsk->thread.xstate->fxsave.mxcsr;
        } else {
                return MXCSR_DEFAULT;
        }
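
Because the FPU/extended state now lives behind a pointer instead of inline
in thread_struct, duplicating a task has to allocate a per-task buffer from
task_xstate_cachep before copying the parent's contents. The real function
bodies are outside the hunks shown here, so the following is only a plausible
sketch built from the declarations above (xstate_size, task_xstate_cachep,
arch_dup_task_struct):

#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/string.h>

int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
{
	*dst = *src;			/* shallow copy shares the xstate pointer */

	if (src->thread.xstate) {
		dst->thread.xstate = kmem_cache_alloc(task_xstate_cachep,
						      GFP_KERNEL);
		if (!dst->thread.xstate)
			return -ENOMEM;
		memcpy(dst->thread.xstate, src->thread.xstate, xstate_size);
	}
	return 0;
}
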
index 32c22ae0709f1c13236b6fe31fb22734ba70a7cd..22e87c9f6a80551be4ed7108a87a83e4551a15ab 100644 (file)
@@ -9,7 +9,8 @@ struct bootnode {
        u64 end;
 };
 
-extern int compute_hash_shift(struct bootnode *nodes, int numnodes);
+extern int compute_hash_shift(struct bootnode *nodes, int numblks,
+                             int *nodeids);
 
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
 
index df867e5d80b197a220c8cd84967a8adf9b6c88db..f330234ffa5c568888aeabf92c1e7a49d91335dc 100644 (file)
@@ -22,6 +22,7 @@ extern int (*pci_config_read)(int seg, int bus, int dev, int fn,
 extern int (*pci_config_write)(int seg, int bus, int dev, int fn,
                               int reg, int len, u32 value);
 
+extern void dma32_reserve_bootmem(void);
 extern void pci_iommu_alloc(void);
 
 /* The PCI address space does equal the physical memory
index 6e26c7c717a23a15255869f2b864350faf21d597..e6bf92ddeb21ded7de69cff3f37120df56520291 100644 (file)
@@ -354,7 +354,7 @@ struct i387_soft_struct {
        u32                     entry_eip;
 };
 
-union i387_union {
+union thread_xstate {
        struct i387_fsave_struct        fsave;
        struct i387_fxsave_struct       fxsave;
        struct i387_soft_struct         soft;
@@ -365,6 +365,9 @@ DECLARE_PER_CPU(struct orig_ist, orig_ist);
 #endif
 
 extern void print_cpu_info(struct cpuinfo_x86 *);
+extern unsigned int xstate_size;
+extern void free_thread_xstate(struct task_struct *);
+extern struct kmem_cache *task_xstate_cachep;
 extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
 extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
 extern unsigned short num_cache_leaves;
@@ -397,8 +400,8 @@ struct thread_struct {
        unsigned long           cr2;
        unsigned long           trap_no;
        unsigned long           error_code;
-       /* Floating point info: */
-       union i387_union        i387 __attribute__((aligned(16)));;
+       /* floating point and extended processor state */
+       union thread_xstate     *xstate;
 #ifdef CONFIG_X86_32
        /* Virtual 86 mode info */
        struct vm86_struct __user *vm86_info;
@@ -918,4 +921,11 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
 
 #define KSTK_EIP(task)         (task_pt_regs(task)->ip)
 
+/* Get/set a process' ability to use the timestamp counter instruction */
+#define GET_TSC_CTL(adr)       get_tsc_mode((adr))
+#define SET_TSC_CTL(val)       set_tsc_mode((val))
+
+extern int get_tsc_mode(unsigned long adr);
+extern int set_tsc_mode(unsigned int val);
+
 #endif
index d13c197866d627daa572c6928034e639dc2b17f2..c0432061f81a0f0609e8b115e3c23e4ff6cdcbb4 100644 (file)
@@ -11,9 +11,7 @@ struct scatterlist {
        unsigned int    offset;
        unsigned int    length;
        dma_addr_t      dma_address;
-#ifdef CONFIG_X86_64
        unsigned int    dma_length;
-#endif
 };
 
 #define ARCH_HAS_SG_CHAIN
index d5fd12f2abdbb141668c205dd424de8c91de90e4..77244f17993f303d8e80e8af7b0f71e50e69f0a7 100644 (file)
@@ -1,5 +1,14 @@
+#ifndef _ASM_X86_THREAD_INFO_H
 #ifdef CONFIG_X86_32
 # include "thread_info_32.h"
 #else
 # include "thread_info_64.h"
 #endif
+
+#ifndef __ASSEMBLY__
+extern void arch_task_cache_init(void);
+extern void free_thread_info(struct thread_info *ti);
+extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
+#define arch_task_cache_init arch_task_cache_init
+#endif
+#endif /* _ASM_X86_THREAD_INFO_H */
index 4e053fa561a9c5e643336d10e68d835e8863e1b8..53185996209664c82588904ca3429844696ab9b2 100644 (file)
@@ -102,8 +102,6 @@ static inline struct thread_info *current_thread_info(void)
        __get_free_pages(GFP_KERNEL, get_order(THREAD_SIZE)))
 #endif
 
-#define free_thread_info(info) free_pages((unsigned long)(info), get_order(THREAD_SIZE))
-
 #else /* !__ASSEMBLY__ */
 
 /* how to get the thread information struct from ASM */
index 1e5c6f6152cd109ad2d3f19b2848a2846c0bfbe1..ed664e874decb873d83bad65ba2eb5a549dab69d 100644 (file)
@@ -85,8 +85,6 @@ static inline struct thread_info *stack_thread_info(void)
 #define alloc_thread_info(tsk)                                         \
        ((struct thread_info *)__get_free_pages(THREAD_FLAGS, THREAD_ORDER))
 
-#define free_thread_info(ti) free_pages((unsigned long) (ti), THREAD_ORDER)
-
 #else /* !__ASSEMBLY__ */
 
 /* how to get the thread information struct from ASM */
@@ -126,6 +124,7 @@ static inline struct thread_info *stack_thread_info(void)
 #define TIF_DEBUGCTLMSR                25      /* uses thread_struct.debugctlmsr */
 #define TIF_DS_AREA_MSR                26      /* uses thread_struct.ds_area_msr */
 #define TIF_BTS_TRACE_TS       27      /* record scheduling event timestamps */
+#define TIF_NOTSC              28      /* TSC is not accessible in userland */
 
 #define _TIF_SYSCALL_TRACE     (1 << TIF_SYSCALL_TRACE)
 #define _TIF_SIGPENDING                (1 << TIF_SIGPENDING)
@@ -147,6 +146,7 @@ static inline struct thread_info *stack_thread_info(void)
 #define _TIF_DEBUGCTLMSR       (1 << TIF_DEBUGCTLMSR)
 #define _TIF_DS_AREA_MSR       (1 << TIF_DS_AREA_MSR)
 #define _TIF_BTS_TRACE_TS      (1 << TIF_BTS_TRACE_TS)
+#define _TIF_NOTSC             (1 << TIF_NOTSC)
 
 /* work to do on interrupt/exception return */
 #define _TIF_WORK_MASK                                                 \
@@ -160,7 +160,7 @@ static inline struct thread_info *stack_thread_info(void)
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW                                                        \
-       (_TIF_IO_BITMAP|_TIF_DEBUGCTLMSR|_TIF_DS_AREA_MSR|_TIF_BTS_TRACE_TS)
+       (_TIF_IO_BITMAP|_TIF_DEBUGCTLMSR|_TIF_DS_AREA_MSR|_TIF_BTS_TRACE_TS|_TIF_NOTSC)
 #define _TIF_WORK_CTXSW_PREV _TIF_WORK_CTXSW
 #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW|_TIF_DEBUG)
 
index 81a29eb08ac4113f5a5f5d57752fef9b390ae153..22073268b4814ec97a7ddab151e01c1807eabcc8 100644 (file)
@@ -88,6 +88,17 @@ static inline int cpu_to_node(int cpu)
 #endif
        return per_cpu(x86_cpu_to_node_map, cpu);
 }
+
+#ifdef CONFIG_NUMA
+
+/* Returns a pointer to the cpumask of CPUs on Node 'node'. */
+#define node_to_cpumask_ptr(v, node)           \
+               cpumask_t *v = &(node_to_cpumask_map[node])
+
+#define node_to_cpumask_ptr_next(v, node)      \
+                          v = &(node_to_cpumask_map[node])
+#endif
+
 #endif /* CONFIG_X86_64 */
 
 /*
@@ -136,17 +147,13 @@ extern unsigned long node_remap_size[];
 
 # define SD_CACHE_NICE_TRIES   2
 # define SD_IDLE_IDX           2
-# define SD_NEWIDLE_IDX                0
+# define SD_NEWIDLE_IDX                2
 # define SD_FORKEXEC_IDX       1
 
 #endif
 
 /* sched_domains SD_NODE_INIT for NUMAQ machines */
 #define SD_NODE_INIT (struct sched_domain) {           \
-       .span                   = CPU_MASK_NONE,        \
-       .parent                 = NULL,                 \
-       .child                  = NULL,                 \
-       .groups                 = NULL,                 \
        .min_interval           = 8,                    \
        .max_interval           = 32,                   \
        .busy_factor            = 32,                   \
@@ -164,7 +171,6 @@ extern unsigned long node_remap_size[];
                                | SD_WAKE_BALANCE,      \
        .last_balance           = jiffies,              \
        .balance_interval       = 1,                    \
-       .nr_balance_failed      = 0,                    \
 }
 
 #ifdef CONFIG_X86_64_ACPI_NUMA
@@ -174,10 +180,10 @@ extern int __node_distance(int, int);
 
 #else /* CONFIG_NUMA */
 
-#include <asm-generic/topology.h>
-
 #endif
 
+#include <asm-generic/topology.h>
+
 extern cpumask_t cpu_coregroup_map(int cpu);
 
 #ifdef ENABLE_TOPO_DEFINES
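node_to_cpumask_ptr() hands callers a pointer into node_to_cpumask_map[] instead of copying a whole cpumask_t, which matters once NR_CPUS is large. A short usage sketch (do_something() is a made-up stand-in; the nr_cpus_node() hunk further down uses the same pattern):

    int cpu;

    node_to_cpumask_ptr(mask, node);        /* declares: cpumask_t *mask */
    for_each_cpu_mask(cpu, *mask)
            do_something(cpu);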
index d2d8eb5b55f532365f9277b5f776109626f7ed4b..0434bd8349a7456f27f8ca95c18b472e07c1ba8e 100644 (file)
@@ -18,6 +18,7 @@ extern unsigned int cpu_khz;
 extern unsigned int tsc_khz;
 
 extern void disable_TSC(void);
+extern void enable_TSC(void);
 
 static inline cycles_t get_cycles(void)
 {
index acad1105d94287af0cba3322e3368efe1b4f38fb..1dbe074f1c645fed22e55484f26cdd90cc325bfb 100644 (file)
@@ -108,6 +108,7 @@ extern int __bitmap_weight(const unsigned long *bitmap, int bits);
 
 extern int bitmap_scnprintf(char *buf, unsigned int len,
                        const unsigned long *src, int nbits);
+extern int bitmap_scnprintf_len(unsigned int len);
 extern int __bitmap_parse(const char *buf, unsigned int buflen, int is_user,
                        unsigned long *dst, int nbits);
 extern int bitmap_parse_user(const char __user *ubuf, unsigned int ulen,
index 7047f58306a7a24fba73076f255551c59866cc4e..259c8051155d9e34c783f9499b606ac780f8dd84 100644 (file)
@@ -222,8 +222,13 @@ int __next_cpu(int n, const cpumask_t *srcp);
 #define next_cpu(n, src)       ({ (void)(src); 1; })
 #endif
 
+#ifdef CONFIG_HAVE_CPUMASK_OF_CPU_MAP
+extern cpumask_t *cpumask_of_cpu_map;
+#define cpumask_of_cpu(cpu)    (cpumask_of_cpu_map[cpu])
+
+#else
 #define cpumask_of_cpu(cpu)                                            \
-({                                                                     \
+(*({                                                                   \
        typeof(_unused_cpumask_arg_) m;                                 \
        if (sizeof(m) == sizeof(unsigned long)) {                       \
                m.bits[0] = 1UL<<(cpu);                                 \
@@ -231,8 +236,9 @@ int __next_cpu(int n, const cpumask_t *srcp);
                cpus_clear(m);                                          \
                cpu_set((cpu), m);                                      \
        }                                                               \
-       m;                                                              \
-})
+       &m;                                                             \
+}))
+#endif
 
 #define CPU_MASK_LAST_WORD BITMAP_LAST_WORD_MASK(NR_CPUS)
 
@@ -243,6 +249,8 @@ int __next_cpu(int n, const cpumask_t *srcp);
        [BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD                 \
 } }
 
+#define CPU_MASK_ALL_PTR       (&CPU_MASK_ALL)
+
 #else
 
 #define CPU_MASK_ALL                                                   \
@@ -251,6 +259,10 @@ int __next_cpu(int n, const cpumask_t *srcp);
        [BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD                 \
 } }
 
+/* cpu_mask_all is in init/main.c */
+extern cpumask_t cpu_mask_all;
+#define CPU_MASK_ALL_PTR       (&cpu_mask_all)
+
 #endif
 
 #define CPU_MASK_NONE                                                  \
@@ -273,6 +285,13 @@ static inline int __cpumask_scnprintf(char *buf, int len,
        return bitmap_scnprintf(buf, len, srcp->bits, nbits);
 }
 
+#define cpumask_scnprintf_len(len) \
+                       __cpumask_scnprintf_len((len))
+static inline int __cpumask_scnprintf_len(int len)
+{
+       return bitmap_scnprintf_len(len);
+}
+
 #define cpumask_parse_user(ubuf, ulen, dst) \
                        __cpumask_parse_user((ubuf), (ulen), &(dst), NR_CPUS)
 static inline int __cpumask_parse_user(const char __user *buf, int len,
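Both variants of cpumask_of_cpu() now evaluate to a cpumask_t object whose address may be taken, so callers can pass per-cpu masks by pointer, e.g.

    set_cpus_allowed_ptr(p, &cpumask_of_cpu(cpu));

which is what the kernel/rcupdate.c hunk further down does with sched_setaffinity(); CPU_MASK_ALL_PTR serves the same purpose for the all-CPUs mask.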
index 0a26be353cb3763c8541ad3594bf6162f1a3b2ec..726761e2400365c1e5dc92d627dee8f2e06c837c 100644 (file)
@@ -20,8 +20,8 @@ extern int number_of_cpusets; /* How many cpusets are defined in system? */
 extern int cpuset_init_early(void);
 extern int cpuset_init(void);
 extern void cpuset_init_smp(void);
-extern cpumask_t cpuset_cpus_allowed(struct task_struct *p);
-extern cpumask_t cpuset_cpus_allowed_locked(struct task_struct *p);
+extern void cpuset_cpus_allowed(struct task_struct *p, cpumask_t *mask);
+extern void cpuset_cpus_allowed_locked(struct task_struct *p, cpumask_t *mask);
 extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
 #define cpuset_current_mems_allowed (current->mems_allowed)
 void cpuset_init_current_mems_allowed(void);
@@ -84,13 +84,14 @@ static inline int cpuset_init_early(void) { return 0; }
 static inline int cpuset_init(void) { return 0; }
 static inline void cpuset_init_smp(void) {}
 
-static inline cpumask_t cpuset_cpus_allowed(struct task_struct *p)
+static inline void cpuset_cpus_allowed(struct task_struct *p, cpumask_t *mask)
 {
-       return cpu_possible_map;
+       *mask = cpu_possible_map;
 }
-static inline cpumask_t cpuset_cpus_allowed_locked(struct task_struct *p)
+static inline void cpuset_cpus_allowed_locked(struct task_struct *p,
+                                                               cpumask_t *mask)
 {
-       return cpu_possible_map;
+       *mask = cpu_possible_map;
 }
 
 static inline nodemask_t cpuset_mems_allowed(struct task_struct *p)
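cpuset_cpus_allowed() and its _locked variant now fill a caller-supplied mask rather than returning a cpumask_t by value, keeping large masks out of return values on the stack. Call sites convert along these lines (sketch):

    cpumask_t mask;

    cpuset_cpus_allowed(p, &mask);          /* was: mask = cpuset_cpus_allowed(p); */
    set_cpus_allowed_ptr(p, &mask);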
index 14813b5958022bbc608430999729b6519a58a246..a5f359a7ad0ef84d2829872f904f50014abe71f9 100644 (file)
@@ -18,6 +18,7 @@
 #include <linux/proc_fs.h>
 #include <linux/rtc.h>
 #include <linux/ioport.h>
+#include <linux/pfn.h>
 
 #include <asm/page.h>
 #include <asm/system.h>
@@ -394,4 +395,10 @@ struct efi_generic_dev_path {
        u16 length;
 } __attribute ((packed));
 
+static inline void memrange_efi_to_native(u64 *addr, u64 *npages)
+{
+       *npages = PFN_UP(*addr + (*npages<<EFI_PAGE_SHIFT)) - PFN_DOWN(*addr);
+       *addr &= PAGE_MASK;
+}
+
 #endif /* _LINUX_EFI_H */
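memrange_efi_to_native() turns an (addr, npages) pair counted in EFI pages (EFI_PAGE_SHIFT, 4 KiB) into a native-page-aligned range. A worked example, assuming 4 KiB kernel pages:

    /* addr = 0x1800, npages = 2 EFI pages -> byte range [0x1800, 0x3800)  */
    /* npages = PFN_UP(0x1800 + (2 << 12)) - PFN_DOWN(0x1800) = 4 - 1 = 3  */
    /* addr  &= PAGE_MASK                                        -> 0x1000 */
    /* native range [0x1000, 0x4000) covers the original EFI range         */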
index 7239baac81a9e29dda405b16116762f260ed95f2..653477021e4c545b9f4dcc983587ceb26eb11eac 100644 (file)
@@ -61,6 +61,7 @@ extern struct kmem_cache *filp_cachep;
 
 extern void __fput(struct file *);
 extern void fput(struct file *);
+extern void drop_file_write_access(struct file *file);
 
 struct file_operations;
 struct vfsmount;
index b84b848431f24a61a37e2d8a535534fcf0681554..d1eeea669d2c7c1fc4b8f9d4d13bad2a90e3dab8 100644 (file)
@@ -776,6 +776,9 @@ static inline int ra_has_index(struct file_ra_state *ra, pgoff_t index)
                index <  ra->start + ra->size);
 }
 
+#define FILE_MNT_WRITE_TAKEN   1
+#define FILE_MNT_WRITE_RELEASED        2
+
 struct file {
        /*
         * fu_list becomes invalid after file_free is called and queued via
@@ -810,6 +813,9 @@ struct file {
        spinlock_t              f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
        struct address_space    *f_mapping;
+#ifdef CONFIG_DEBUG_WRITECOUNT
+       unsigned long f_mnt_write_state;
+#endif
 };
 extern spinlock_t files_lock;
 #define file_list_lock() spin_lock(&files_lock);
@@ -818,6 +824,49 @@ extern spinlock_t files_lock;
 #define get_file(x)    atomic_inc(&(x)->f_count)
 #define file_count(x)  atomic_read(&(x)->f_count)
 
+#ifdef CONFIG_DEBUG_WRITECOUNT
+static inline void file_take_write(struct file *f)
+{
+       WARN_ON(f->f_mnt_write_state != 0);
+       f->f_mnt_write_state = FILE_MNT_WRITE_TAKEN;
+}
+static inline void file_release_write(struct file *f)
+{
+       f->f_mnt_write_state |= FILE_MNT_WRITE_RELEASED;
+}
+static inline void file_reset_write(struct file *f)
+{
+       f->f_mnt_write_state = 0;
+}
+static inline void file_check_state(struct file *f)
+{
+       /*
+        * At this point, either both or neither of these bits
+        * should be set.
+        */
+       WARN_ON(f->f_mnt_write_state == FILE_MNT_WRITE_TAKEN);
+       WARN_ON(f->f_mnt_write_state == FILE_MNT_WRITE_RELEASED);
+}
+static inline int file_check_writeable(struct file *f)
+{
+       if (f->f_mnt_write_state == FILE_MNT_WRITE_TAKEN)
+               return 0;
+       printk(KERN_WARNING "writeable file with no "
+                           "mnt_want_write()\n");
+       WARN_ON(1);
+       return -EINVAL;
+}
+#else /* !CONFIG_DEBUG_WRITECOUNT */
+static inline void file_take_write(struct file *filp) {}
+static inline void file_release_write(struct file *filp) {}
+static inline void file_reset_write(struct file *filp) {}
+static inline void file_check_state(struct file *filp) {}
+static inline int file_check_writeable(struct file *filp)
+{
+       return 0;
+}
+#endif /* CONFIG_DEBUG_WRITECOUNT */
+
 #define        MAX_NON_LFS     ((1UL<<31) - 1)
 
 /* Page cache limit. The filesystems should put that into their s_maxbytes 
@@ -1735,7 +1784,8 @@ extern struct file *create_read_pipe(struct file *f);
 extern struct file *create_write_pipe(void);
 extern void free_write_pipe(struct file *);
 
-extern int open_namei(int dfd, const char *, int, int, struct nameidata *);
+extern struct file *do_filp_open(int dfd, const char *pathname,
+               int open_flag, int mode);
 extern int may_open(struct nameidata *, int, int);
 
 extern int kernel_read(struct file *, unsigned long, char *, unsigned long);
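The CONFIG_DEBUG_WRITECOUNT helpers record, per struct file, whether a writable file actually took and later released its mnt_want_write() reference; file_check_state() warns when only one of the two bits is set. Roughly how the hooks pair up over a file's lifetime (a sketch of the intended call sites in fs/, not the literal code):

    /* writable open, after mnt_want_write() succeeded */
    file_take_write(filp);                  /* state = FILE_MNT_WRITE_TAKEN     */

    /* write access dropped, e.g. via drop_file_write_access() */
    mnt_drop_write(filp->f_path.mnt);
    file_release_write(filp);               /* state |= FILE_MNT_WRITE_RELEASED */

    /* final fput */
    file_check_state(filp);                 /* both bits or neither, else WARN  */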
index 1f74e1d7415fe9042e2e71467b40e4a56184ffa8..37a6f5bc4a92ab5262fcb75e161b0de94114b98b 100644 (file)
@@ -151,6 +151,9 @@ extern struct group_info init_groups;
        .cpus_allowed   = CPU_MASK_ALL,                                 \
        .mm             = NULL,                                         \
        .active_mm      = &init_mm,                                     \
+       .se             = {                                             \
+               .group_node     = LIST_HEAD_INIT(tsk.se.group_node),    \
+       },                                                              \
        .rt             = {                                             \
                .run_list       = LIST_HEAD_INIT(tsk.rt.run_list),      \
                .time_slice     = HZ,                                   \
index 412e025bc5c7a6f290d6d665de94d6e6ecf808d4..e600c4e9b8c5b179eebdcaea3769f5bcba49f18a 100644 (file)
 
 #define irqs_disabled()                                                \
 ({                                                             \
-       unsigned long flags;                                    \
+       unsigned long _flags;                                   \
                                                                \
-       raw_local_save_flags(flags);                            \
-       raw_irqs_disabled_flags(flags);                         \
+       raw_local_save_flags(_flags);                           \
+       raw_irqs_disabled_flags(_flags);                        \
 })
 
 #define irqs_disabled_flags(flags)     raw_irqs_disabled_flags(flags)
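Renaming the statement-expression local to _flags stops it shadowing a 'flags' variable in the caller, since the macro body expands inline. The shadowing case it fixes looks like (sketch):

    unsigned long flags;

    local_irq_save(flags);
    WARN_ON(!irqs_disabled());      /* old macro declared its own 'flags' here */
    local_irq_restore(flags);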
index 2cd7fa73d1af88df51e6bb0daf2084dbf381ca0f..ce5983225be4e6430cadbff9cc5cbf448a66e647 100644 (file)
@@ -327,4 +327,10 @@ extern void ktime_get_ts(struct timespec *ts);
 /* Get the real (wall-) time in timespec format: */
 #define ktime_get_real_ts(ts)  getnstimeofday(ts)
 
+static inline ktime_t ns_to_ktime(u64 ns)
+{
+       static const ktime_t ktime_zero = { .tv64 = 0 };
+       return ktime_add_ns(ktime_zero, ns);
+}
+
 #endif
index 75ce2cb4ff6ebc92a8fcaa557a9bd7b6b9b9b1b6..dac16f99c70115ba4e4f1cee8b18f5f1b0f050c8 100644 (file)
@@ -631,31 +631,14 @@ static inline void list_splice_init_rcu(struct list_head *list,
  * as long as the traversal is guarded by rcu_read_lock().
  */
 #define list_for_each_rcu(pos, head) \
-       for (pos = (head)->next; \
-               prefetch(rcu_dereference(pos)->next), pos != (head); \
-               pos = pos->next)
+       for (pos = rcu_dereference((head)->next); \
+               prefetch(pos->next), pos != (head); \
+               pos = rcu_dereference(pos->next))
 
 #define __list_for_each_rcu(pos, head) \
-       for (pos = (head)->next; \
-               rcu_dereference(pos) != (head); \
-               pos = pos->next)
-
-/**
- * list_for_each_safe_rcu
- * @pos:       the &struct list_head to use as a loop cursor.
- * @n:         another &struct list_head to use as temporary storage
- * @head:      the head for your list.
- *
- * Iterate over an rcu-protected list, safe against removal of list entry.
- *
- * This list-traversal primitive may safely run concurrently with
- * the _rcu list-mutation primitives such as list_add_rcu()
- * as long as the traversal is guarded by rcu_read_lock().
- */
-#define list_for_each_safe_rcu(pos, n, head) \
-       for (pos = (head)->next; \
-               n = rcu_dereference(pos)->next, pos != (head); \
-               pos = n)
+       for (pos = rcu_dereference((head)->next); \
+               pos != (head); \
+               pos = rcu_dereference(pos->next))
 
 /**
  * list_for_each_entry_rcu     -       iterate over rcu list of given type
@@ -668,10 +651,9 @@ static inline void list_splice_init_rcu(struct list_head *list,
  * as long as the traversal is guarded by rcu_read_lock().
  */
 #define list_for_each_entry_rcu(pos, head, member) \
-       for (pos = list_entry((head)->next, typeof(*pos), member); \
-               prefetch(rcu_dereference(pos)->member.next), \
-                       &pos->member != (head); \
-               pos = list_entry(pos->member.next, typeof(*pos), member))
+       for (pos = list_entry(rcu_dereference((head)->next), typeof(*pos), member); \
+               prefetch(pos->member.next), &pos->member != (head); \
+               pos = list_entry(rcu_dereference(pos->member.next), typeof(*pos), member))
 
 
 /**
@@ -686,9 +668,9 @@ static inline void list_splice_init_rcu(struct list_head *list,
  * as long as the traversal is guarded by rcu_read_lock().
  */
 #define list_for_each_continue_rcu(pos, head) \
-       for ((pos) = (pos)->next; \
-               prefetch(rcu_dereference((pos))->next), (pos) != (head); \
-               (pos) = (pos)->next)
+       for ((pos) = rcu_dereference((pos)->next); \
+               prefetch((pos)->next), (pos) != (head); \
+               (pos) = rcu_dereference((pos)->next))
 
 /*
  * Double linked lists with a single pointer list head.
@@ -986,10 +968,10 @@ static inline void hlist_add_after_rcu(struct hlist_node *prev,
  * as long as the traversal is guarded by rcu_read_lock().
  */
 #define hlist_for_each_entry_rcu(tpos, pos, head, member)               \
-       for (pos = (head)->first;                                        \
-            rcu_dereference(pos) && ({ prefetch(pos->next); 1;}) &&     \
+       for (pos = rcu_dereference((head)->first);                       \
+               pos && ({ prefetch(pos->next); 1;}) &&                   \
                ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \
-            pos = pos->next)
+            pos = rcu_dereference(pos->next))
 
 #else
 #warning "don't include kernel headers in userspace"
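The rewritten iterators apply rcu_dereference() at every load of a next pointer, including the initial (head)->next fetch that the old forms read without it, and list_for_each_safe_rcu() is removed. Callers are unchanged; the usual pattern (my_list, its entry type and handle() are made-up names):

    rcu_read_lock();
    list_for_each_entry_rcu(pos, &my_list, list)
            handle(pos);            /* entries are only freed after a grace period */
    rcu_read_unlock();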
index 5ee2df217cdfbd05cad3ea32de6777db52c271c6..d6600e3f7e4579e6cb8f476bac437c2c035b2573 100644 (file)
@@ -14,6 +14,7 @@
 
 #include <linux/types.h>
 #include <linux/list.h>
+#include <linux/nodemask.h>
 #include <linux/spinlock.h>
 #include <asm/atomic.h>
 
@@ -28,8 +29,10 @@ struct mnt_namespace;
 #define MNT_NOATIME    0x08
 #define MNT_NODIRATIME 0x10
 #define MNT_RELATIME   0x20
+#define MNT_READONLY   0x40    /* does the user want this to be r/o? */
 
 #define MNT_SHRINKABLE 0x100
+#define MNT_IMBALANCED_WRITE_COUNT     0x200 /* just for debugging */
 
 #define MNT_SHARED     0x1000  /* if the vfsmount is a shared mount */
 #define MNT_UNBINDABLE 0x2000  /* if the vfsmount is a unbindable mount */
@@ -62,6 +65,11 @@ struct vfsmount {
        int mnt_expiry_mark;            /* true if marked for expiry */
        int mnt_pinned;
        int mnt_ghosts;
+       /*
+        * This value is not stable unless all of the mnt_writers[] spinlocks
+        * are held, and all mnt_writer[]s on this mount have 0 as their ->count
+        */
+       atomic_t __mnt_writers;
 };
 
 static inline struct vfsmount *mntget(struct vfsmount *mnt)
@@ -71,9 +79,12 @@ static inline struct vfsmount *mntget(struct vfsmount *mnt)
        return mnt;
 }
 
+extern int mnt_want_write(struct vfsmount *mnt);
+extern void mnt_drop_write(struct vfsmount *mnt);
 extern void mntput_no_expire(struct vfsmount *mnt);
 extern void mnt_pin(struct vfsmount *mnt);
 extern void mnt_unpin(struct vfsmount *mnt);
+extern int __mnt_is_readonly(struct vfsmount *mnt);
 
 static inline void mntput(struct vfsmount *mnt)
 {
index 3800639775aee4706734b1aa9bb3e55104fe4f3b..5c80b1939636ec556f775692aa5ed27aba5986b4 100644 (file)
 #define PR_CAPBSET_READ 23
 #define PR_CAPBSET_DROP 24
 
+/* Get/set the process' ability to use the timestamp counter instruction */
+#define PR_GET_TSC 25
+#define PR_SET_TSC 26
+# define PR_TSC_ENABLE         1       /* allow the use of the timestamp counter */
+# define PR_TSC_SIGSEGV                2       /* throw a SIGSEGV instead of reading the TSC */
+
 #endif /* _LINUX_PRCTL_H */
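From userspace the new pair is driven through prctl(2); with PR_TSC_SIGSEGV a later RDTSC delivers SIGSEGV instead of a value. A minimal sketch of a caller (error handling elided; older libc headers may not yet carry these constants):

    #include <sys/prctl.h>
    #include <linux/prctl.h>

    int tsc_mode;

    prctl(PR_GET_TSC, &tsc_mode);           /* current mode via the pointer arg */
    prctl(PR_SET_TSC, PR_TSC_SIGSEGV);      /* rdtsc now raises SIGSEGV */
    /* ... */
    prctl(PR_SET_TSC, PR_TSC_ENABLE);       /* allow rdtsc again */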
index 6a1e7afb099bc981f91a23ac9698e7244bab4aa0..be6914014c7045d0faf603406bfd586f404c9968 100644 (file)
@@ -704,6 +704,7 @@ enum cpu_idle_type {
 #define SD_POWERSAVINGS_BALANCE        256     /* Balance for power savings */
 #define SD_SHARE_PKG_RESOURCES 512     /* Domain members share cpu pkg resources */
 #define SD_SERIALIZE           1024    /* Only a single load balancing instance */
+#define SD_WAKE_IDLE_FAR       2048    /* Gain latency sacrificing cache hit */
 
 #define BALANCE_FOR_MC_POWER   \
        (sched_smt_power_savings ? SD_POWERSAVINGS_BALANCE : 0)
@@ -733,12 +734,31 @@ struct sched_group {
        u32 reciprocal_cpu_power;
 };
 
+enum sched_domain_level {
+       SD_LV_NONE = 0,
+       SD_LV_SIBLING,
+       SD_LV_MC,
+       SD_LV_CPU,
+       SD_LV_NODE,
+       SD_LV_ALLNODES,
+       SD_LV_MAX
+};
+
+struct sched_domain_attr {
+       int relax_domain_level;
+};
+
+#define SD_ATTR_INIT   (struct sched_domain_attr) {    \
+       .relax_domain_level = -1,                       \
+}
+
 struct sched_domain {
        /* These fields must be setup */
        struct sched_domain *parent;    /* top domain must be null terminated */
        struct sched_domain *child;     /* bottom domain must be null terminated */
        struct sched_group *groups;     /* the balancing groups of the domain */
        cpumask_t span;                 /* span of all CPUs in this domain */
+       int first_cpu;                  /* cache of the first cpu in this domain */
        unsigned long min_interval;     /* Minimum balance interval ms */
        unsigned long max_interval;     /* Maximum balance interval ms */
        unsigned int busy_factor;       /* less balancing by factor if busy */
@@ -750,6 +770,7 @@ struct sched_domain {
        unsigned int wake_idx;
        unsigned int forkexec_idx;
        int flags;                      /* See SD_* */
+       enum sched_domain_level level;
 
        /* Runtime fields. */
        unsigned long last_balance;     /* init to jiffies. units in jiffies */
@@ -789,7 +810,8 @@ struct sched_domain {
 #endif
 };
 
-extern void partition_sched_domains(int ndoms_new, cpumask_t *doms_new);
+extern void partition_sched_domains(int ndoms_new, cpumask_t *doms_new,
+                                   struct sched_domain_attr *dattr_new);
 extern int arch_reinit_sched_domains(void);
 
 #endif /* CONFIG_SMP */
@@ -889,7 +911,8 @@ struct sched_class {
        void (*set_curr_task) (struct rq *rq);
        void (*task_tick) (struct rq *rq, struct task_struct *p, int queued);
        void (*task_new) (struct rq *rq, struct task_struct *p);
-       void (*set_cpus_allowed)(struct task_struct *p, cpumask_t *newmask);
+       void (*set_cpus_allowed)(struct task_struct *p,
+                                const cpumask_t *newmask);
 
        void (*join_domain)(struct rq *rq);
        void (*leave_domain)(struct rq *rq);
@@ -923,6 +946,7 @@ struct load_weight {
 struct sched_entity {
        struct load_weight      load;           /* for load-balancing */
        struct rb_node          run_node;
+       struct list_head        group_node;
        unsigned int            on_rq;
 
        u64                     exec_start;
@@ -982,6 +1006,7 @@ struct sched_rt_entity {
        unsigned long timeout;
        int nr_cpus_allowed;
 
+       struct sched_rt_entity *back;
 #ifdef CONFIG_RT_GROUP_SCHED
        struct sched_rt_entity  *parent;
        /* rq on which this entity is (to be) queued: */
@@ -1502,15 +1527,21 @@ static inline void put_task_struct(struct task_struct *t)
 #define used_math() tsk_used_math(current)
 
 #ifdef CONFIG_SMP
-extern int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask);
+extern int set_cpus_allowed_ptr(struct task_struct *p,
+                               const cpumask_t *new_mask);
 #else
-static inline int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+static inline int set_cpus_allowed_ptr(struct task_struct *p,
+                                      const cpumask_t *new_mask)
 {
-       if (!cpu_isset(0, new_mask))
+       if (!cpu_isset(0, *new_mask))
                return -EINVAL;
        return 0;
 }
 #endif
+static inline int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+{
+       return set_cpus_allowed_ptr(p, &new_mask);
+}
 
 extern unsigned long long sched_clock(void);
 
@@ -1551,7 +1582,6 @@ static inline void wake_up_idle_cpu(int cpu) { }
 extern unsigned int sysctl_sched_latency;
 extern unsigned int sysctl_sched_min_granularity;
 extern unsigned int sysctl_sched_wakeup_granularity;
-extern unsigned int sysctl_sched_batch_wakeup_granularity;
 extern unsigned int sysctl_sched_child_runs_first;
 extern unsigned int sysctl_sched_features;
 extern unsigned int sysctl_sched_migration_cost;
@@ -1564,6 +1594,10 @@ int sched_nr_latency_handler(struct ctl_table *table, int write,
 extern unsigned int sysctl_sched_rt_period;
 extern int sysctl_sched_rt_runtime;
 
+int sched_rt_handler(struct ctl_table *table, int write,
+               struct file *filp, void __user *buffer, size_t *lenp,
+               loff_t *ppos);
+
 extern unsigned int sysctl_sched_compat_yield;
 
 #ifdef CONFIG_RT_MUTEXES
@@ -2031,7 +2065,7 @@ static inline void arch_pick_mmap_layout(struct mm_struct *mm)
 }
 #endif
 
-extern long sched_setaffinity(pid_t pid, cpumask_t new_mask);
+extern long sched_setaffinity(pid_t pid, const cpumask_t *new_mask);
 extern long sched_getaffinity(pid_t pid, cpumask_t *mask);
 
 extern int sched_mc_power_savings, sched_smt_power_savings;
@@ -2041,8 +2075,11 @@ extern void normalize_rt_tasks(void);
 #ifdef CONFIG_GROUP_SCHED
 
 extern struct task_group init_task_group;
+#ifdef CONFIG_USER_SCHED
+extern struct task_group root_task_group;
+#endif
 
-extern struct task_group *sched_create_group(void);
+extern struct task_group *sched_create_group(struct task_group *parent);
 extern void sched_destroy_group(struct task_group *tg);
 extern void sched_move_task(struct task_struct *tsk);
 #ifdef CONFIG_FAIR_GROUP_SCHED
@@ -2053,6 +2090,9 @@ extern unsigned long sched_group_shares(struct task_group *tg);
 extern int sched_group_set_rt_runtime(struct task_group *tg,
                                      long rt_runtime_us);
 extern long sched_group_rt_runtime(struct task_group *tg);
+extern int sched_group_set_rt_period(struct task_group *tg,
+                                     long rt_period_us);
+extern long sched_group_rt_period(struct task_group *tg);
 #endif
 #endif
 
index f752e73bf977e98381b95070096522c84e2ed5a1..f2767bc6b73517bea32d2e34ed362a8e61b794f1 100644 (file)
@@ -45,12 +45,16 @@ struct sysdev_class_attribute {
        ssize_t (*store)(struct sysdev_class *, const char *, size_t);
 };
 
-#define SYSDEV_CLASS_ATTR(_name,_mode,_show,_store)            \
-struct sysdev_class_attribute attr_##_name = {                         \
+#define _SYSDEV_CLASS_ATTR(_name,_mode,_show,_store)           \
+{                                                              \
        .attr = {.name = __stringify(_name), .mode = _mode },   \
        .show   = _show,                                        \
        .store  = _store,                                       \
-};
+}
+
+#define SYSDEV_CLASS_ATTR(_name,_mode,_show,_store)            \
+       struct sysdev_class_attribute attr_##_name =            \
+               _SYSDEV_CLASS_ATTR(_name,_mode,_show,_store)
 
 
 extern int sysdev_class_register(struct sysdev_class *);
@@ -100,15 +104,16 @@ struct sysdev_attribute {
 };
 
 
-#define _SYSDEV_ATTR(_name,_mode,_show,_store)                 \
+#define _SYSDEV_ATTR(_name, _mode, _show, _store)              \
 {                                                              \
        .attr = { .name = __stringify(_name), .mode = _mode },  \
        .show   = _show,                                        \
        .store  = _store,                                       \
 }
 
-#define SYSDEV_ATTR(_name,_mode,_show,_store)          \
-struct sysdev_attribute attr_##_name = _SYSDEV_ATTR(_name,_mode,_show,_store);
+#define SYSDEV_ATTR(_name, _mode, _show, _store)               \
+       struct sysdev_attribute attr_##_name =                  \
+               _SYSDEV_ATTR(_name, _mode, _show, _store);
 
 extern int sysdev_create_file(struct sys_device *, struct sysdev_attribute *);
 extern void sysdev_remove_file(struct sys_device *, struct sysdev_attribute *);
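Splitting the bare initializer out as _SYSDEV_ATTR()/_SYSDEV_CLASS_ATTR() lets callers build arrays of attributes rather than one named variable per attribute. A hedged sketch (names and handlers are made up):

    static struct sysdev_class_attribute foo_class_attrs[] = {
            _SYSDEV_CLASS_ATTR(alpha, 0444, alpha_show, NULL),
            _SYSDEV_CLASS_ATTR(beta,  0644, beta_show, beta_store),
    };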
index bd14f8b30f0998a874ecc33a220a673da5aae9c8..4bb7074a2c3a5d018ff13eee399d82a2be6c10e6 100644 (file)
 #endif
 
 #ifndef nr_cpus_node
-#define nr_cpus_node(node)                                                     \
-       ({                                                                      \
-               cpumask_t __tmp__;                                              \
-               __tmp__ = node_to_cpumask(node);                                \
-               cpus_weight(__tmp__);                                           \
+#define nr_cpus_node(node)                             \
+       ({                                              \
+               node_to_cpumask_ptr(__tmp__, node);     \
+               cpus_weight(*__tmp__);                  \
        })
 #endif
 
-#define for_each_node_with_cpus(node)                                          \
-       for_each_online_node(node)                                              \
+#define for_each_node_with_cpus(node)                  \
+       for_each_online_node(node)                      \
                if (nr_cpus_node(node))
 
 void arch_update_cpu_topology(void);
@@ -80,7 +79,9 @@ void arch_update_cpu_topology(void);
  * by defining their own arch-specific initializer in include/asm/topology.h.
  * A definition there will automagically override these default initializers
  * and allow arch-specific performance tuning of sched_domains.
+ * (Only non-zero and non-null fields need be specified.)
  */
+
 #ifdef CONFIG_SCHED_SMT
 /* MCD - Do we really need this?  It is always on if CONFIG_SCHED_SMT is,
  * so can't we drop this in favor of CONFIG_SCHED_SMT?
@@ -89,20 +90,10 @@ void arch_update_cpu_topology(void);
 /* Common values for SMT siblings */
 #ifndef SD_SIBLING_INIT
 #define SD_SIBLING_INIT (struct sched_domain) {                \
-       .span                   = CPU_MASK_NONE,        \
-       .parent                 = NULL,                 \
-       .child                  = NULL,                 \
-       .groups                 = NULL,                 \
        .min_interval           = 1,                    \
        .max_interval           = 2,                    \
        .busy_factor            = 64,                   \
        .imbalance_pct          = 110,                  \
-       .cache_nice_tries       = 0,                    \
-       .busy_idx               = 0,                    \
-       .idle_idx               = 0,                    \
-       .newidle_idx            = 0,                    \
-       .wake_idx               = 0,                    \
-       .forkexec_idx           = 0,                    \
        .flags                  = SD_LOAD_BALANCE       \
                                | SD_BALANCE_NEWIDLE    \
                                | SD_BALANCE_FORK       \
@@ -112,7 +103,6 @@ void arch_update_cpu_topology(void);
                                | SD_SHARE_CPUPOWER,    \
        .last_balance           = jiffies,              \
        .balance_interval       = 1,                    \
-       .nr_balance_failed      = 0,                    \
 }
 #endif
 #endif /* CONFIG_SCHED_SMT */
@@ -121,18 +111,12 @@ void arch_update_cpu_topology(void);
 /* Common values for MC siblings. for now mostly derived from SD_CPU_INIT */
 #ifndef SD_MC_INIT
 #define SD_MC_INIT (struct sched_domain) {             \
-       .span                   = CPU_MASK_NONE,        \
-       .parent                 = NULL,                 \
-       .child                  = NULL,                 \
-       .groups                 = NULL,                 \
        .min_interval           = 1,                    \
        .max_interval           = 4,                    \
        .busy_factor            = 64,                   \
        .imbalance_pct          = 125,                  \
        .cache_nice_tries       = 1,                    \
        .busy_idx               = 2,                    \
-       .idle_idx               = 0,                    \
-       .newidle_idx            = 0,                    \
        .wake_idx               = 1,                    \
        .forkexec_idx           = 1,                    \
        .flags                  = SD_LOAD_BALANCE       \
@@ -144,7 +128,6 @@ void arch_update_cpu_topology(void);
                                | BALANCE_FOR_MC_POWER, \
        .last_balance           = jiffies,              \
        .balance_interval       = 1,                    \
-       .nr_balance_failed      = 0,                    \
 }
 #endif
 #endif /* CONFIG_SCHED_MC */
@@ -152,10 +135,6 @@ void arch_update_cpu_topology(void);
 /* Common values for CPUs */
 #ifndef SD_CPU_INIT
 #define SD_CPU_INIT (struct sched_domain) {            \
-       .span                   = CPU_MASK_NONE,        \
-       .parent                 = NULL,                 \
-       .child                  = NULL,                 \
-       .groups                 = NULL,                 \
        .min_interval           = 1,                    \
        .max_interval           = 4,                    \
        .busy_factor            = 64,                   \
@@ -174,16 +153,11 @@ void arch_update_cpu_topology(void);
                                | BALANCE_FOR_PKG_POWER,\
        .last_balance           = jiffies,              \
        .balance_interval       = 1,                    \
-       .nr_balance_failed      = 0,                    \
 }
 #endif
 
 /* sched_domains SD_ALLNODES_INIT for NUMA machines */
 #define SD_ALLNODES_INIT (struct sched_domain) {       \
-       .span                   = CPU_MASK_NONE,        \
-       .parent                 = NULL,                 \
-       .child                  = NULL,                 \
-       .groups                 = NULL,                 \
        .min_interval           = 64,                   \
        .max_interval           = 64*num_online_cpus(), \
        .busy_factor            = 128,                  \
@@ -191,14 +165,10 @@ void arch_update_cpu_topology(void);
        .cache_nice_tries       = 1,                    \
        .busy_idx               = 3,                    \
        .idle_idx               = 3,                    \
-       .newidle_idx            = 0, /* unused */       \
-       .wake_idx               = 0, /* unused */       \
-       .forkexec_idx           = 0, /* unused */       \
        .flags                  = SD_LOAD_BALANCE       \
                                | SD_SERIALIZE, \
        .last_balance           = jiffies,              \
        .balance_interval       = 64,                   \
-       .nr_balance_failed      = 0,                    \
 }
 
 #ifdef CONFIG_NUMA
index 7fccf09bb95ad8adeb929829dc1bb624c84ffc82..ba3a389fab94dcef61341973de66ae12992892cf 100644 (file)
@@ -328,6 +328,13 @@ config RT_GROUP_SCHED
        depends on EXPERIMENTAL
        depends on GROUP_SCHED
        default n
+       help
+         This feature lets you explicitly allocate real CPU bandwidth
+         to users or control groups (depending on the "Basis for grouping tasks"
+         setting below.) If enabled, it will also make it impossible to
+         schedule realtime tasks for non-root users until you allocate
+         realtime bandwidth for them.
+         See Documentation/sched-rt-group.txt for more information.
 
 choice
        depends on GROUP_SCHED
index 99ce94930b09302f7af89f75c07d7daa6a7ef668..833a67df1f7e06c121c4b9b982893592ca7219df 100644 (file)
@@ -359,10 +359,31 @@ static void __init smp_init(void)
 #endif
 
 static inline void setup_per_cpu_areas(void) { }
+static inline void setup_nr_cpu_ids(void) { }
 static inline void smp_prepare_cpus(unsigned int maxcpus) { }
 
 #else
 
+#if NR_CPUS > BITS_PER_LONG
+cpumask_t cpu_mask_all __read_mostly = CPU_MASK_ALL;
+EXPORT_SYMBOL(cpu_mask_all);
+#endif
+
+/* Setup number of possible processor ids */
+int nr_cpu_ids __read_mostly = NR_CPUS;
+EXPORT_SYMBOL(nr_cpu_ids);
+
+/* An arch may set nr_cpu_ids earlier if needed, so this would be redundant */
+static void __init setup_nr_cpu_ids(void)
+{
+       int cpu, highest_cpu = 0;
+
+       for_each_possible_cpu(cpu)
+               highest_cpu = cpu;
+
+       nr_cpu_ids = highest_cpu + 1;
+}
+
 #ifndef CONFIG_HAVE_SETUP_PER_CPU_AREA
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 
@@ -537,6 +558,7 @@ asmlinkage void __init start_kernel(void)
        setup_command_line(command_line);
        unwind_setup();
        setup_per_cpu_areas();
+       setup_nr_cpu_ids();
        smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
 
        /*
@@ -811,7 +833,7 @@ static int __init kernel_init(void * unused)
        /*
         * init can run on any cpu.
         */
-       set_cpus_allowed(current, CPU_MASK_ALL);
+       set_cpus_allowed_ptr(current, CPU_MASK_ALL_PTR);
        /*
         * Tell the world that we're going to be the grim
         * reaper of innocent orphaned children.
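setup_nr_cpu_ids() trims nr_cpu_ids from the compile-time NR_CPUS down to the highest possible CPU id plus one, so loops bounded by nr_cpu_ids skip slots that can never come online. A worked example:

    /* NR_CPUS = 255, cpu_possible_map = { 0, 1, 2, 3 }:
     * for_each_possible_cpu() leaves highest_cpu = 3,
     * so nr_cpu_ids = 4 after setup_nr_cpu_ids().        */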
index 60f7a27f7a9e4ddeacb1eb95e1d867860b964f3c..94fd3b08fb77036d35ecd1a337785f1847fe1613 100644 (file)
@@ -598,6 +598,7 @@ static struct file *do_create(struct dentry *dir, struct dentry *dentry,
                        int oflag, mode_t mode, struct mq_attr __user *u_attr)
 {
        struct mq_attr attr;
+       struct file *result;
        int ret;
 
        if (u_attr) {
@@ -612,13 +613,24 @@ static struct file *do_create(struct dentry *dir, struct dentry *dentry,
        }
 
        mode &= ~current->fs->umask;
+       ret = mnt_want_write(mqueue_mnt);
+       if (ret)
+               goto out;
        ret = vfs_create(dir->d_inode, dentry, mode, NULL);
        dentry->d_fsdata = NULL;
        if (ret)
-               goto out;
-
-       return dentry_open(dentry, mqueue_mnt, oflag);
-
+               goto out_drop_write;
+
+       result = dentry_open(dentry, mqueue_mnt, oflag);
+       /*
+        * dentry_open() took a persistent mnt_want_write(),
+        * so we can now drop this one.
+        */
+       mnt_drop_write(mqueue_mnt);
+       return result;
+
+out_drop_write:
+       mnt_drop_write(mqueue_mnt);
 out:
        dput(dentry);
        mntput(mqueue_mnt);
@@ -742,8 +754,11 @@ asmlinkage long sys_mq_unlink(const char __user *u_name)
        inode = dentry->d_inode;
        if (inode)
                atomic_inc(&inode->i_count);
-
+       err = mnt_want_write(mqueue_mnt);
+       if (err)
+               goto out_err;
        err = vfs_unlink(dentry->d_parent->d_inode, dentry);
+       mnt_drop_write(mqueue_mnt);
 out_err:
        dput(dentry);
 
index 9c48abfcd4a528ced37ea31cc0315d03c7691e78..e1ef04870c2a12122fddfd3a712ed6e6d2a441ea 100644 (file)
@@ -445,7 +445,7 @@ asmlinkage long compat_sys_sched_setaffinity(compat_pid_t pid,
        if (retval)
                return retval;
 
-       return sched_setaffinity(pid, new_mask);
+       return sched_setaffinity(pid, &new_mask);
 }
 
 asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
index 2eff3f63abed6c91c2bbcb1c8d0eccbb97d54c7d..2011ad8d26973fae39a373aa4121ea2fd3f0b8d1 100644 (file)
@@ -232,9 +232,9 @@ static int _cpu_down(unsigned int cpu, int tasks_frozen)
 
        /* Ensure that we are not runnable on dying cpu */
        old_allowed = current->cpus_allowed;
-       tmp = CPU_MASK_ALL;
+       cpus_setall(tmp);
        cpu_clear(cpu, tmp);
-       set_cpus_allowed(current, tmp);
+       set_cpus_allowed_ptr(current, &tmp);
 
        p = __stop_machine_run(take_cpu_down, &tcd_param, cpu);
 
@@ -268,7 +268,7 @@ static int _cpu_down(unsigned int cpu, int tasks_frozen)
 out_thread:
        err = kthread_stop(p);
 out_allowed:
-       set_cpus_allowed(current, old_allowed);
+       set_cpus_allowed_ptr(current, &old_allowed);
 out_release:
        cpu_hotplug_done();
        return err;
index a1b61f414228ea031cc40ea9ea17ab95e264ceaf..8b35fbd8292f2d5b53b613eab630c67af87ec6f8 100644 (file)
@@ -98,6 +98,9 @@ struct cpuset {
        /* partition number for rebuild_sched_domains() */
        int pn;
 
+       /* for custom sched domain */
+       int relax_domain_level;
+
        /* used for walking a cpuset heirarchy */
        struct list_head stack_list;
 };
@@ -478,6 +481,16 @@ static int cpusets_overlap(struct cpuset *a, struct cpuset *b)
        return cpus_intersects(a->cpus_allowed, b->cpus_allowed);
 }
 
+static void
+update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)
+{
+       if (!dattr)
+               return;
+       if (dattr->relax_domain_level < c->relax_domain_level)
+               dattr->relax_domain_level = c->relax_domain_level;
+       return;
+}
+
 /*
  * rebuild_sched_domains()
  *
@@ -553,12 +566,14 @@ static void rebuild_sched_domains(void)
        int csn;                /* how many cpuset ptrs in csa so far */
        int i, j, k;            /* indices for partition finding loops */
        cpumask_t *doms;        /* resulting partition; i.e. sched domains */
+       struct sched_domain_attr *dattr;  /* attributes for custom domains */
        int ndoms;              /* number of sched domains in result */
        int nslot;              /* next empty doms[] cpumask_t slot */
 
        q = NULL;
        csa = NULL;
        doms = NULL;
+       dattr = NULL;
 
        /* Special case for the 99% of systems with one, full, sched domain */
        if (is_sched_load_balance(&top_cpuset)) {
@@ -566,6 +581,11 @@ static void rebuild_sched_domains(void)
                doms = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
                if (!doms)
                        goto rebuild;
+               dattr = kmalloc(sizeof(struct sched_domain_attr), GFP_KERNEL);
+               if (dattr) {
+                       *dattr = SD_ATTR_INIT;
+                       update_domain_attr(dattr, &top_cpuset);
+               }
                *doms = top_cpuset.cpus_allowed;
                goto rebuild;
        }
@@ -622,6 +642,7 @@ restart:
        doms = kmalloc(ndoms * sizeof(cpumask_t), GFP_KERNEL);
        if (!doms)
                goto rebuild;
+       dattr = kmalloc(ndoms * sizeof(struct sched_domain_attr), GFP_KERNEL);
 
        for (nslot = 0, i = 0; i < csn; i++) {
                struct cpuset *a = csa[i];
@@ -644,12 +665,15 @@ restart:
                        }
 
                        cpus_clear(*dp);
+                       if (dattr)
+                               *(dattr + nslot) = SD_ATTR_INIT;
                        for (j = i; j < csn; j++) {
                                struct cpuset *b = csa[j];
 
                                if (apn == b->pn) {
                                        cpus_or(*dp, *dp, b->cpus_allowed);
                                        b->pn = -1;
+                                       update_domain_attr(dattr, b);
                                }
                        }
                        nslot++;
@@ -660,7 +684,7 @@ restart:
 rebuild:
        /* Have scheduler rebuild sched domains */
        get_online_cpus();
-       partition_sched_domains(ndoms, doms);
+       partition_sched_domains(ndoms, doms, dattr);
        put_online_cpus();
 
 done:
@@ -668,6 +692,7 @@ done:
                kfifo_free(q);
        kfree(csa);
        /* Don't kfree(doms) -- partition_sched_domains() does that. */
+       /* Don't kfree(dattr) -- partition_sched_domains() does that. */
 }
 
 static inline int started_after_time(struct task_struct *t1,
@@ -729,7 +754,7 @@ int cpuset_test_cpumask(struct task_struct *tsk, struct cgroup_scanner *scan)
  */
 void cpuset_change_cpumask(struct task_struct *tsk, struct cgroup_scanner *scan)
 {
-       set_cpus_allowed(tsk, (cgroup_cs(scan->cg))->cpus_allowed);
+       set_cpus_allowed_ptr(tsk, &((cgroup_cs(scan->cg))->cpus_allowed));
 }
 
 /**
@@ -1011,6 +1036,21 @@ static int update_memory_pressure_enabled(struct cpuset *cs, char *buf)
        return 0;
 }
 
+static int update_relax_domain_level(struct cpuset *cs, char *buf)
+{
+       int val = simple_strtol(buf, NULL, 10);
+
+       if (val < 0)
+               val = -1;
+
+       if (val != cs->relax_domain_level) {
+               cs->relax_domain_level = val;
+               rebuild_sched_domains();
+       }
+
+       return 0;
+}
+
 /*
  * update_flag - read a 0 or a 1 in a file and update associated flag
  * bit:        the bit to update (CS_CPU_EXCLUSIVE, CS_MEM_EXCLUSIVE,
@@ -1178,7 +1218,7 @@ static void cpuset_attach(struct cgroup_subsys *ss,
 
        mutex_lock(&callback_mutex);
        guarantee_online_cpus(cs, &cpus);
-       set_cpus_allowed(tsk, cpus);
+       set_cpus_allowed_ptr(tsk, &cpus);
        mutex_unlock(&callback_mutex);
 
        from = oldcs->mems_allowed;
@@ -1202,6 +1242,7 @@ typedef enum {
        FILE_CPU_EXCLUSIVE,
        FILE_MEM_EXCLUSIVE,
        FILE_SCHED_LOAD_BALANCE,
+       FILE_SCHED_RELAX_DOMAIN_LEVEL,
        FILE_MEMORY_PRESSURE_ENABLED,
        FILE_MEMORY_PRESSURE,
        FILE_SPREAD_PAGE,
@@ -1256,6 +1297,9 @@ static ssize_t cpuset_common_file_write(struct cgroup *cont,
        case FILE_SCHED_LOAD_BALANCE:
                retval = update_flag(CS_SCHED_LOAD_BALANCE, cs, buffer);
                break;
+       case FILE_SCHED_RELAX_DOMAIN_LEVEL:
+               retval = update_relax_domain_level(cs, buffer);
+               break;
        case FILE_MEMORY_MIGRATE:
                retval = update_flag(CS_MEMORY_MIGRATE, cs, buffer);
                break;
@@ -1354,6 +1398,9 @@ static ssize_t cpuset_common_file_read(struct cgroup *cont,
        case FILE_SCHED_LOAD_BALANCE:
                *s++ = is_sched_load_balance(cs) ? '1' : '0';
                break;
+       case FILE_SCHED_RELAX_DOMAIN_LEVEL:
+               s += sprintf(s, "%d", cs->relax_domain_level);
+               break;
        case FILE_MEMORY_MIGRATE:
                *s++ = is_memory_migrate(cs) ? '1' : '0';
                break;
@@ -1424,6 +1471,13 @@ static struct cftype cft_sched_load_balance = {
        .private = FILE_SCHED_LOAD_BALANCE,
 };
 
+static struct cftype cft_sched_relax_domain_level = {
+       .name = "sched_relax_domain_level",
+       .read = cpuset_common_file_read,
+       .write = cpuset_common_file_write,
+       .private = FILE_SCHED_RELAX_DOMAIN_LEVEL,
+};
+
 static struct cftype cft_memory_migrate = {
        .name = "memory_migrate",
        .read = cpuset_common_file_read,
@@ -1475,6 +1529,9 @@ static int cpuset_populate(struct cgroup_subsys *ss, struct cgroup *cont)
                return err;
        if ((err = cgroup_add_file(cont, ss, &cft_sched_load_balance)) < 0)
                return err;
+       if ((err = cgroup_add_file(cont, ss,
+                                       &cft_sched_relax_domain_level)) < 0)
+               return err;
        if ((err = cgroup_add_file(cont, ss, &cft_memory_pressure)) < 0)
                return err;
        if ((err = cgroup_add_file(cont, ss, &cft_spread_page)) < 0)
@@ -1555,10 +1612,11 @@ static struct cgroup_subsys_state *cpuset_create(
        if (is_spread_slab(parent))
                set_bit(CS_SPREAD_SLAB, &cs->flags);
        set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
-       cs->cpus_allowed = CPU_MASK_NONE;
-       cs->mems_allowed = NODE_MASK_NONE;
+       cpus_clear(cs->cpus_allowed);
+       nodes_clear(cs->mems_allowed);
        cs->mems_generation = cpuset_mems_generation++;
        fmeter_init(&cs->fmeter);
+       cs->relax_domain_level = -1;
 
        cs->parent = parent;
        number_of_cpusets++;
@@ -1625,12 +1683,13 @@ int __init cpuset_init(void)
 {
        int err = 0;
 
-       top_cpuset.cpus_allowed = CPU_MASK_ALL;
-       top_cpuset.mems_allowed = NODE_MASK_ALL;
+       cpus_setall(top_cpuset.cpus_allowed);
+       nodes_setall(top_cpuset.mems_allowed);
 
        fmeter_init(&top_cpuset.fmeter);
        top_cpuset.mems_generation = cpuset_mems_generation++;
        set_bit(CS_SCHED_LOAD_BALANCE, &top_cpuset.flags);
+       top_cpuset.relax_domain_level = -1;
 
        err = register_filesystem(&cpuset_fs_type);
        if (err < 0)
@@ -1844,6 +1903,7 @@ void __init cpuset_init_smp(void)
 
  * cpuset_cpus_allowed - return cpus_allowed mask from a tasks cpuset.
  * @tsk: pointer to task_struct from which to obtain cpuset->cpus_allowed.
+ * @pmask: pointer to cpumask_t variable to receive cpus_allowed set.
  *
  * Description: Returns the cpumask_t cpus_allowed of the cpuset
  * attached to the specified @tsk.  Guaranteed to return some non-empty
@@ -1851,35 +1911,27 @@ void __init cpuset_init_smp(void)
  * tasks cpuset.
  **/
 
-cpumask_t cpuset_cpus_allowed(struct task_struct *tsk)
+void cpuset_cpus_allowed(struct task_struct *tsk, cpumask_t *pmask)
 {
-       cpumask_t mask;
-
        mutex_lock(&callback_mutex);
-       mask = cpuset_cpus_allowed_locked(tsk);
+       cpuset_cpus_allowed_locked(tsk, pmask);
        mutex_unlock(&callback_mutex);
-
-       return mask;
 }
 
 /**
  * cpuset_cpus_allowed_locked - return cpus_allowed mask from a tasks cpuset.
  * Must be called with callback_mutex held.
  **/
-cpumask_t cpuset_cpus_allowed_locked(struct task_struct *tsk)
+void cpuset_cpus_allowed_locked(struct task_struct *tsk, cpumask_t *pmask)
 {
-       cpumask_t mask;
-
        task_lock(tsk);
-       guarantee_online_cpus(task_cs(tsk), &mask);
+       guarantee_online_cpus(task_cs(tsk), pmask);
        task_unlock(tsk);
-
-       return mask;
 }
 
 void cpuset_init_current_mems_allowed(void)
 {
-       current->mems_allowed = NODE_MASK_ALL;
+       nodes_setall(current->mems_allowed);
 }
 
 /**
@@ -2261,8 +2313,16 @@ void cpuset_task_status_allowed(struct seq_file *m, struct task_struct *task)
        m->count += cpumask_scnprintf(m->buf + m->count, m->size - m->count,
                                        task->cpus_allowed);
        seq_printf(m, "\n");
+       seq_printf(m, "Cpus_allowed_list:\t");
+       m->count += cpulist_scnprintf(m->buf + m->count, m->size - m->count,
+                                       task->cpus_allowed);
+       seq_printf(m, "\n");
        seq_printf(m, "Mems_allowed:\t");
        m->count += nodemask_scnprintf(m->buf + m->count, m->size - m->count,
                                        task->mems_allowed);
        seq_printf(m, "\n");
+       seq_printf(m, "Mems_allowed_list:\t");
+       m->count += nodelist_scnprintf(m->buf + m->count, m->size - m->count,
+                                       task->mems_allowed);
+       seq_printf(m, "\n");
 }
index 9c042f901570e1b789d40fdbda111739c733cdb4..89fe414645e9b76aa777a0de7b5e9d6777814a0f 100644 (file)
@@ -132,6 +132,14 @@ void __put_task_struct(struct task_struct *tsk)
                free_task(tsk);
 }
 
+/*
+ * macro override instead of weak attribute alias, to workaround
+ * gcc 4.1.0 and 4.1.1 bugs with weak attribute and empty functions.
+ */
+#ifndef arch_task_cache_init
+#define arch_task_cache_init()
+#endif
+
 void __init fork_init(unsigned long mempages)
 {
 #ifndef __HAVE_ARCH_TASK_STRUCT_ALLOCATOR
@@ -144,6 +152,9 @@ void __init fork_init(unsigned long mempages)
                        ARCH_MIN_TASKALIGN, SLAB_PANIC, NULL);
 #endif
 
+       /* do the arch specific task caches init */
+       arch_task_cache_init();
+
        /*
         * The default maximum number of threads is set to a safe
         * value: the thread structures can take up at most half
@@ -163,6 +174,13 @@ void __init fork_init(unsigned long mempages)
                init_task.signal->rlim[RLIMIT_NPROC];
 }
 
+int __attribute__((weak)) arch_dup_task_struct(struct task_struct *dst,
+                                              struct task_struct *src)
+{
+       *dst = *src;
+       return 0;
+}
+
 static struct task_struct *dup_task_struct(struct task_struct *orig)
 {
        struct task_struct *tsk;
@@ -181,15 +199,15 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
                return NULL;
        }
 
-       *tsk = *orig;
+       err = arch_dup_task_struct(tsk, orig);
+       if (err)
+               goto out;
+
        tsk->stack = ti;
 
        err = prop_local_init_single(&tsk->dirties);
-       if (err) {
-               free_thread_info(ti);
-               free_task_struct(tsk);
-               return NULL;
-       }
+       if (err)
+               goto out;
 
        setup_thread_stack(tsk, orig);
 
@@ -205,6 +223,11 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
 #endif
        tsk->splice_pipe = NULL;
        return tsk;
+
+out:
+       free_thread_info(ti);
+       free_task_struct(tsk);
+       return NULL;
 }
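arch_dup_task_struct() is now a weak default that simply copies the task_struct, so an architecture can hook task duplication (the x86 thread_info.h hunk above declares its override). What an override looks like, as a sketch only; the helper here is hypothetical, not the real x86 body:

    int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
    {
            *dst = *src;                            /* keep the generic copy       */
            return hypothetical_fixup_state(dst);   /* then adjust arch-side state */
    }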
 
 #ifdef CONFIG_MMU
index fdb3fbe2b0c4cbb2e0512098c24a0a5a83f21c0d..964964baefa23c2a8bdee504aa053d078ba29ad9 100644 (file)
@@ -47,7 +47,7 @@ void dynamic_irq_init(unsigned int irq)
        desc->irq_count = 0;
        desc->irqs_unhandled = 0;
 #ifdef CONFIG_SMP
-       desc->affinity = CPU_MASK_ALL;
+       cpus_setall(desc->affinity);
 #endif
        spin_unlock_irqrestore(&desc->lock, flags);
 }
index 22be3ff3f363ac71a319deeb2f55fd0ccbe27c26..e2764047ec03ed123acd3546ff4357341302cabb 100644 (file)
@@ -165,7 +165,7 @@ static int ____call_usermodehelper(void *data)
        }
 
        /* We can run anywhere, unlike our parent keventd(). */
-       set_cpus_allowed(current, CPU_MASK_ALL);
+       set_cpus_allowed_ptr(current, CPU_MASK_ALL_PTR);
 
        /*
         * Our parent is keventd, which runs with elevated scheduling priority.
index 0ac887882f908502425deeebcda4c678812443cb..25241d6ec8cdec1c5c2e1a80d61dff4335f9db9a 100644 (file)
@@ -180,6 +180,7 @@ void kthread_bind(struct task_struct *k, unsigned int cpu)
        wait_task_inactive(k);
        set_task_cpu(k, cpu);
        k->cpus_allowed = cpumask_of_cpu(cpu);
+       k->rt.nr_cpus_allowed = 1;
 }
 EXPORT_SYMBOL(kthread_bind);
 
index b4e3c85abe74658d8df264080897f9bacf8b9278..7c74dab0d21b9fcd2235b507b5e450459e042ef6 100644 (file)
@@ -64,8 +64,8 @@ account_global_scheduler_latency(struct task_struct *tsk, struct latency_record
                return;
 
        for (i = 0; i < MAXLR; i++) {
-               int q;
-               int same = 1;
+               int q, same = 1;
+
                /* Nothing stored: */
                if (!latency_record[i].backtrace[0]) {
                        if (firstnonnull > i)
@@ -73,12 +73,15 @@ account_global_scheduler_latency(struct task_struct *tsk, struct latency_record
                        continue;
                }
                for (q = 0 ; q < LT_BACKTRACEDEPTH ; q++) {
-                       if (latency_record[i].backtrace[q] !=
-                               lat->backtrace[q])
+                       unsigned long record = lat->backtrace[q];
+
+                       if (latency_record[i].backtrace[q] != record) {
                                same = 0;
-                       if (same && lat->backtrace[q] == 0)
                                break;
-                       if (same && lat->backtrace[q] == ULONG_MAX)
+                       }
+
+                       /* 0 and ULONG_MAX entries mean end of backtrace: */
+                       if (record == 0 || record == ULONG_MAX)
                                break;
                }
                if (same) {
@@ -143,14 +146,18 @@ account_scheduler_latency(struct task_struct *tsk, int usecs, int inter)
        for (i = 0; i < LT_SAVECOUNT ; i++) {
                struct latency_record *mylat;
                int same = 1;
+
                mylat = &tsk->latency_record[i];
                for (q = 0 ; q < LT_BACKTRACEDEPTH ; q++) {
-                       if (mylat->backtrace[q] !=
-                               lat.backtrace[q])
+                       unsigned long record = lat.backtrace[q];
+
+                       if (mylat->backtrace[q] != record) {
                                same = 0;
-                       if (same && lat.backtrace[q] == 0)
                                break;
-                       if (same && lat.backtrace[q] == ULONG_MAX)
+                       }
+
+                       /* 0 and ULONG_MAX entries mean end of backtrace: */
+                       if (record == 0 || record == ULONG_MAX)
                                break;
                }
                if (same) {
index e9517014b57c100af5926165d2131992f0589401..e1cdf196a51507ae644c8a7369b599c5233f8604 100644 (file)
@@ -1007,10 +1007,10 @@ void __synchronize_sched(void)
        if (sched_getaffinity(0, &oldmask) < 0)
                oldmask = cpu_possible_map;
        for_each_online_cpu(cpu) {
-               sched_setaffinity(0, cpumask_of_cpu(cpu));
+               sched_setaffinity(0, &cpumask_of_cpu(cpu));
                schedule();
        }
-       sched_setaffinity(0, oldmask);
+       sched_setaffinity(0, &oldmask);
 }
 EXPORT_SYMBOL_GPL(__synchronize_sched);
 
index fd599829e72a5d49a5852272ba234be8c38accaf..47894f919d4ea2848263a3e1a5ec5218f27237e3 100644 (file)
@@ -723,9 +723,10 @@ static int rcu_idle_cpu;   /* Force all torture tasks off this CPU */
  */
 static void rcu_torture_shuffle_tasks(void)
 {
-       cpumask_t tmp_mask = CPU_MASK_ALL;
+       cpumask_t tmp_mask;
        int i;
 
+       cpus_setall(tmp_mask);
        get_online_cpus();
 
        /* No point in shuffling if there is only one online CPU (ex: UP) */
@@ -737,25 +738,27 @@ static void rcu_torture_shuffle_tasks(void)
        if (rcu_idle_cpu != -1)
                cpu_clear(rcu_idle_cpu, tmp_mask);
 
-       set_cpus_allowed(current, tmp_mask);
+       set_cpus_allowed_ptr(current, &tmp_mask);
 
        if (reader_tasks) {
                for (i = 0; i < nrealreaders; i++)
                        if (reader_tasks[i])
-                               set_cpus_allowed(reader_tasks[i], tmp_mask);
+                               set_cpus_allowed_ptr(reader_tasks[i],
+                                                    &tmp_mask);
        }
 
        if (fakewriter_tasks) {
                for (i = 0; i < nfakewriters; i++)
                        if (fakewriter_tasks[i])
-                               set_cpus_allowed(fakewriter_tasks[i], tmp_mask);
+                               set_cpus_allowed_ptr(fakewriter_tasks[i],
+                                                    &tmp_mask);
        }
 
        if (writer_task)
-               set_cpus_allowed(writer_task, tmp_mask);
+               set_cpus_allowed_ptr(writer_task, &tmp_mask);
 
        if (stats_task)
-               set_cpus_allowed(stats_task, tmp_mask);
+               set_cpus_allowed_ptr(stats_task, &tmp_mask);
 
        if (rcu_idle_cpu == -1)
                rcu_idle_cpu = num_online_cpus() - 1;
index 8dcdec6fe0fe0983f4a90e8d51e501871974d383..57ba7ea9b744558949a6f07bc3f4f11b6a94d483 100644 (file)
 #include <linux/unistd.h>
 #include <linux/pagemap.h>
 #include <linux/hrtimer.h>
+#include <linux/tick.h>
+#include <linux/bootmem.h>
+#include <linux/debugfs.h>
+#include <linux/ctype.h>
 
 #include <asm/tlb.h>
 #include <asm/irq_regs.h>
@@ -114,6 +118,11 @@ unsigned long long __attribute__((weak)) sched_clock(void)
  */
 #define DEF_TIMESLICE          (100 * HZ / 1000)
 
+/*
+ * single value that denotes runtime == period, i.e. unlimited time.
+ */
+#define RUNTIME_INF    ((u64)~0ULL)
+
 #ifdef CONFIG_SMP
 /*
  * Divide a load by a sched group cpu_power : (load / sg->__cpu_power)
@@ -155,6 +164,84 @@ struct rt_prio_array {
        struct list_head queue[MAX_RT_PRIO];
 };
 
+struct rt_bandwidth {
+       /* nests inside the rq lock: */
+       spinlock_t              rt_runtime_lock;
+       ktime_t                 rt_period;
+       u64                     rt_runtime;
+       struct hrtimer          rt_period_timer;
+};
+
+static struct rt_bandwidth def_rt_bandwidth;
+
+static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun);
+
+static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer)
+{
+       struct rt_bandwidth *rt_b =
+               container_of(timer, struct rt_bandwidth, rt_period_timer);
+       ktime_t now;
+       int overrun;
+       int idle = 0;
+
+       for (;;) {
+               now = hrtimer_cb_get_time(timer);
+               overrun = hrtimer_forward(timer, now, rt_b->rt_period);
+
+               if (!overrun)
+                       break;
+
+               idle = do_sched_rt_period_timer(rt_b, overrun);
+       }
+
+       return idle ? HRTIMER_NORESTART : HRTIMER_RESTART;
+}
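sched_rt_period_timer() above leans on hrtimer_forward(), which pushes the timer's expiry ahead by whole rt_period intervals and returns how many periods were missed; do_sched_rt_period_timer() is then told about that overrun. Conceptually (a standalone sketch with plain integers standing in for ktime_t; the real hrtimer_forward() does not loop like this):

/* Push *expires forward by whole periods until it lies after 'now';
 * the return value is the number of periods that were skipped. */
static unsigned long forward_periods(unsigned long long *expires,
                                     unsigned long long now,
                                     unsigned long long period)
{
        unsigned long overrun = 0;

        while (*expires <= now) {
                *expires += period;
                overrun++;
        }
        return overrun;
}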
+
+static
+void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime)
+{
+       rt_b->rt_period = ns_to_ktime(period);
+       rt_b->rt_runtime = runtime;
+
+       spin_lock_init(&rt_b->rt_runtime_lock);
+
+       hrtimer_init(&rt_b->rt_period_timer,
+                       CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+       rt_b->rt_period_timer.function = sched_rt_period_timer;
+       rt_b->rt_period_timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_SOFTIRQ;
+}
+
+static void start_rt_bandwidth(struct rt_bandwidth *rt_b)
+{
+       ktime_t now;
+
+       if (rt_b->rt_runtime == RUNTIME_INF)
+               return;
+
+       if (hrtimer_active(&rt_b->rt_period_timer))
+               return;
+
+       spin_lock(&rt_b->rt_runtime_lock);
+       for (;;) {
+               if (hrtimer_active(&rt_b->rt_period_timer))
+                       break;
+
+               now = hrtimer_cb_get_time(&rt_b->rt_period_timer);
+               hrtimer_forward(&rt_b->rt_period_timer, now, rt_b->rt_period);
+               hrtimer_start(&rt_b->rt_period_timer,
+                             rt_b->rt_period_timer.expires,
+                             HRTIMER_MODE_ABS);
+       }
+       spin_unlock(&rt_b->rt_runtime_lock);
+}
+
+#ifdef CONFIG_RT_GROUP_SCHED
+static void destroy_rt_bandwidth(struct rt_bandwidth *rt_b)
+{
+       hrtimer_cancel(&rt_b->rt_period_timer);
+}
+#endif
+
 #ifdef CONFIG_GROUP_SCHED
 
 #include <linux/cgroup.h>
@@ -181,29 +268,39 @@ struct task_group {
        struct sched_rt_entity **rt_se;
        struct rt_rq **rt_rq;
 
-       u64 rt_runtime;
+       struct rt_bandwidth rt_bandwidth;
 #endif
 
        struct rcu_head rcu;
        struct list_head list;
+
+       struct task_group *parent;
+       struct list_head siblings;
+       struct list_head children;
 };
 
+#ifdef CONFIG_USER_SCHED
+
+/*
+ * Root task group.
+ *     Every UID task group (including init_task_group aka UID-0) will
+ *     be a child to this group.
+ */
+struct task_group root_task_group;
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 /* Default task group's sched entity on each cpu */
 static DEFINE_PER_CPU(struct sched_entity, init_sched_entity);
 /* Default task group's cfs_rq on each cpu */
 static DEFINE_PER_CPU(struct cfs_rq, init_cfs_rq) ____cacheline_aligned_in_smp;
-
-static struct sched_entity *init_sched_entity_p[NR_CPUS];
-static struct cfs_rq *init_cfs_rq_p[NR_CPUS];
 #endif
 
 #ifdef CONFIG_RT_GROUP_SCHED
 static DEFINE_PER_CPU(struct sched_rt_entity, init_sched_rt_entity);
 static DEFINE_PER_CPU(struct rt_rq, init_rt_rq) ____cacheline_aligned_in_smp;
-
-static struct sched_rt_entity *init_sched_rt_entity_p[NR_CPUS];
-static struct rt_rq *init_rt_rq_p[NR_CPUS];
+#endif
+#else
+#define root_task_group init_task_group
 #endif
 
 /* task_group_lock serializes add/remove of task groups and also changes to
@@ -221,23 +318,15 @@ static DEFINE_MUTEX(doms_cur_mutex);
 # define INIT_TASK_GROUP_LOAD  NICE_0_LOAD
 #endif
 
+#define MIN_SHARES     2
+
 static int init_task_group_load = INIT_TASK_GROUP_LOAD;
 #endif
 
 /* Default task group.
  *     Every task in system belong to this group at bootup.
  */
-struct task_group init_task_group = {
-#ifdef CONFIG_FAIR_GROUP_SCHED
-       .se     = init_sched_entity_p,
-       .cfs_rq = init_cfs_rq_p,
-#endif
-
-#ifdef CONFIG_RT_GROUP_SCHED
-       .rt_se  = init_sched_rt_entity_p,
-       .rt_rq  = init_rt_rq_p,
-#endif
-};
+struct task_group init_task_group;
 
 /* return group to which a task belongs */
 static inline struct task_group *task_group(struct task_struct *p)
@@ -297,8 +386,12 @@ struct cfs_rq {
 
        struct rb_root tasks_timeline;
        struct rb_node *rb_leftmost;
-       struct rb_node *rb_load_balance_curr;
-       /* 'curr' points to currently running entity on this cfs_rq.
+
+       struct list_head tasks;
+       struct list_head *balance_iterator;
+
+       /*
+        * 'curr' points to currently running entity on this cfs_rq.
         * It is set to NULL otherwise (i.e when none are currently running).
         */
        struct sched_entity *curr, *next;
@@ -318,6 +411,43 @@ struct cfs_rq {
         */
        struct list_head leaf_cfs_rq_list;
        struct task_group *tg;  /* group that "owns" this runqueue */
+
+#ifdef CONFIG_SMP
+       unsigned long task_weight;
+       unsigned long shares;
+       /*
+        * We need space to build a sched_domain wide view of the full task
+        * group tree. To avoid depending on dynamic memory allocation
+        * during load balancing, we place this in the per-cpu task group
+        * hierarchy. This limits the load balancing to one instance per cpu,
+        * but more should not be needed anyway.
+        */
+       struct aggregate_struct {
+               /*
+                *   load = weight(cpus) * f(tg)
+                *
+                * Where f(tg) is the recursive weight fraction assigned to
+                * this group.
+                */
+               unsigned long load;
+
+               /*
+                * part of the group weight distributed to this span.
+                */
+               unsigned long shares;
+
+               /*
+                * The sum of all runqueue weights within this span.
+                */
+               unsigned long rq_weight;
+
+               /*
+                * Weight contributed by tasks; this is the part we can
+                * influence by moving tasks around.
+                */
+               unsigned long task_weight;
+       } aggregate;
+#endif
 #endif
 };
 
@@ -334,6 +464,9 @@ struct rt_rq {
 #endif
        int rt_throttled;
        u64 rt_time;
+       u64 rt_runtime;
+       /* Nests inside the rq lock: */
+       spinlock_t rt_runtime_lock;
 
 #ifdef CONFIG_RT_GROUP_SCHED
        unsigned long rt_nr_boosted;
@@ -396,6 +529,7 @@ struct rq {
        unsigned long cpu_load[CPU_LOAD_IDX_MAX];
        unsigned char idle_at_tick;
 #ifdef CONFIG_NO_HZ
+       unsigned long last_tick_seen;
        unsigned char in_nohz_recently;
 #endif
        /* capture load from *all* tasks on this cpu: */
@@ -405,8 +539,6 @@ struct rq {
 
        struct cfs_rq cfs;
        struct rt_rq rt;
-       u64 rt_period_expire;
-       int rt_throttled;
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
        /* list of leaf cfs_rq on this cpu: */
@@ -499,6 +631,32 @@ static inline int cpu_of(struct rq *rq)
 #endif
 }
 
+#ifdef CONFIG_NO_HZ
+static inline bool nohz_on(int cpu)
+{
+       return tick_get_tick_sched(cpu)->nohz_mode != NOHZ_MODE_INACTIVE;
+}
+
+static inline u64 max_skipped_ticks(struct rq *rq)
+{
+       return nohz_on(cpu_of(rq)) ? jiffies - rq->last_tick_seen + 2 : 1;
+}
+
+static inline void update_last_tick_seen(struct rq *rq)
+{
+       rq->last_tick_seen = jiffies;
+}
+#else
+static inline u64 max_skipped_ticks(struct rq *rq)
+{
+       return 1;
+}
+
+static inline void update_last_tick_seen(struct rq *rq)
+{
+}
+#endif
+
 /*
  * Update the per-runqueue clock, as finegrained as the platform can give
  * us, but without assuming monotonicity, etc.:
@@ -523,9 +681,12 @@ static void __update_rq_clock(struct rq *rq)
                /*
                 * Catch too large forward jumps too:
                 */
-               if (unlikely(clock + delta > rq->tick_timestamp + TICK_NSEC)) {
-                       if (clock < rq->tick_timestamp + TICK_NSEC)
-                               clock = rq->tick_timestamp + TICK_NSEC;
+               u64 max_jump = max_skipped_ticks(rq) * TICK_NSEC;
+               u64 max_time = rq->tick_timestamp + max_jump;
+
+               if (unlikely(clock + delta > max_time)) {
+                       if (clock < max_time)
+                               clock = max_time;
                        else
                                clock++;
                        rq->clock_overflows++;
@@ -561,23 +722,6 @@ static void update_rq_clock(struct rq *rq)
 #define task_rq(p)             cpu_rq(task_cpu(p))
 #define cpu_curr(cpu)          (cpu_rq(cpu)->curr)
 
-unsigned long rt_needs_cpu(int cpu)
-{
-       struct rq *rq = cpu_rq(cpu);
-       u64 delta;
-
-       if (!rq->rt_throttled)
-               return 0;
-
-       if (rq->clock > rq->rt_period_expire)
-               return 1;
-
-       delta = rq->rt_period_expire - rq->clock;
-       do_div(delta, NSEC_PER_SEC / HZ);
-
-       return (unsigned long)delta;
-}
-
 /*
  * Tunables that become constants when CONFIG_SCHED_DEBUG is off:
  */
@@ -590,22 +734,137 @@ unsigned long rt_needs_cpu(int cpu)
 /*
  * Debugging: various feature bits
  */
+
+#define SCHED_FEAT(name, enabled)      \
+       __SCHED_FEAT_##name ,
+
 enum {
-       SCHED_FEAT_NEW_FAIR_SLEEPERS    = 1,
-       SCHED_FEAT_WAKEUP_PREEMPT       = 2,
-       SCHED_FEAT_START_DEBIT          = 4,
-       SCHED_FEAT_HRTICK               = 8,
-       SCHED_FEAT_DOUBLE_TICK          = 16,
+#include "sched_features.h"
 };
 
+#undef SCHED_FEAT
+
+#define SCHED_FEAT(name, enabled)      \
+       (1UL << __SCHED_FEAT_##name) * enabled |
+
 const_debug unsigned int sysctl_sched_features =
-               SCHED_FEAT_NEW_FAIR_SLEEPERS    * 1 |
-               SCHED_FEAT_WAKEUP_PREEMPT       * 1 |
-               SCHED_FEAT_START_DEBIT          * 1 |
-               SCHED_FEAT_HRTICK               * 1 |
-               SCHED_FEAT_DOUBLE_TICK          * 0;
+#include "sched_features.h"
+       0;
+
+#undef SCHED_FEAT
+
+#ifdef CONFIG_SCHED_DEBUG
+#define SCHED_FEAT(name, enabled)      \
+       #name ,
 
-#define sched_feat(x) (sysctl_sched_features & SCHED_FEAT_##x)
+__read_mostly char *sched_feat_names[] = {
+#include "sched_features.h"
+       NULL
+};
+
+#undef SCHED_FEAT
+
+int sched_feat_open(struct inode *inode, struct file *filp)
+{
+       filp->private_data = inode->i_private;
+       return 0;
+}
+
+static ssize_t
+sched_feat_read(struct file *filp, char __user *ubuf,
+               size_t cnt, loff_t *ppos)
+{
+       char *buf;
+       int r = 0;
+       int len = 0;
+       int i;
+
+       for (i = 0; sched_feat_names[i]; i++) {
+               len += strlen(sched_feat_names[i]);
+               len += 4;
+       }
+
+       buf = kmalloc(len + 2, GFP_KERNEL);
+       if (!buf)
+               return -ENOMEM;
+
+       for (i = 0; sched_feat_names[i]; i++) {
+               if (sysctl_sched_features & (1UL << i))
+                       r += sprintf(buf + r, "%s ", sched_feat_names[i]);
+               else
+                       r += sprintf(buf + r, "NO_%s ", sched_feat_names[i]);
+       }
+
+       r += sprintf(buf + r, "\n");
+       WARN_ON(r >= len + 2);
+
+       r = simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+
+       kfree(buf);
+
+       return r;
+}
+
+static ssize_t
+sched_feat_write(struct file *filp, const char __user *ubuf,
+               size_t cnt, loff_t *ppos)
+{
+       char buf[64];
+       char *cmp = buf;
+       int neg = 0;
+       int i;
+
+       if (cnt > 63)
+               cnt = 63;
+
+       if (copy_from_user(&buf, ubuf, cnt))
+               return -EFAULT;
+
+       buf[cnt] = 0;
+
+       if (strncmp(buf, "NO_", 3) == 0) {
+               neg = 1;
+               cmp += 3;
+       }
+
+       for (i = 0; sched_feat_names[i]; i++) {
+               int len = strlen(sched_feat_names[i]);
+
+               if (strncmp(cmp, sched_feat_names[i], len) == 0) {
+                       if (neg)
+                               sysctl_sched_features &= ~(1UL << i);
+                       else
+                               sysctl_sched_features |= (1UL << i);
+                       break;
+               }
+       }
+
+       if (!sched_feat_names[i])
+               return -EINVAL;
+
+       filp->f_pos += cnt;
+
+       return cnt;
+}
+
+static struct file_operations sched_feat_fops = {
+       .open   = sched_feat_open,
+       .read   = sched_feat_read,
+       .write  = sched_feat_write,
+};
+
+static __init int sched_init_debug(void)
+{
+       debugfs_create_file("sched_features", 0644, NULL, NULL,
+                       &sched_feat_fops);
+
+       return 0;
+}
+late_initcall(sched_init_debug);
+
+#endif
+
+#define sched_feat(x) (sysctl_sched_features & (1UL << __SCHED_FEAT_##x))
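The feature flags are now generated from a single list in sched_features.h by redefining SCHED_FEAT() three times: once for the enum of bit positions, once for the default sysctl_sched_features mask, and once (under CONFIG_SCHED_DEBUG) for the name table behind the debugfs sched_features file, where writing a name sets its bit and writing it with a "NO_" prefix clears it. A self-contained illustration of the same X-macro pattern (the FEATURES list and names here are invented, not the real sched_features.h):

#include <stddef.h>

#define FEATURES(F)                 \
        F(NEW_FAIR_SLEEPERS, 1)     \
        F(HRTICK, 1)                \
        F(DOUBLE_TICK, 0)

/* 1st expansion: an enum giving each feature a bit index */
#define F_ENUM(name, enabled)   __FEAT_##name,
enum { FEATURES(F_ENUM) __FEAT_NR };
#undef F_ENUM

/* 2nd expansion: the default feature mask */
#define F_MASK(name, enabled)   (1UL << __FEAT_##name) * (enabled) |
static const unsigned long default_features = FEATURES(F_MASK) 0;
#undef F_MASK

/* 3rd expansion: the name table used for printing and parsing */
#define F_NAME(name, enabled)   #name,
static const char *feature_names[] = { FEATURES(F_NAME) NULL };
#undef F_NAME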
 
 /*
  * Number of tasks to iterate in a single balance run.
@@ -627,16 +886,52 @@ static __read_mostly int scheduler_running;
  */
 int sysctl_sched_rt_runtime = 950000;
 
-/*
- * single value that denotes runtime == period, ie unlimited time.
- */
-#define RUNTIME_INF    ((u64)~0ULL)
+static inline u64 global_rt_period(void)
+{
+       return (u64)sysctl_sched_rt_period * NSEC_PER_USEC;
+}
+
+static inline u64 global_rt_runtime(void)
+{
+       if (sysctl_sched_rt_period < 0)
+               return RUNTIME_INF;
+
+       return (u64)sysctl_sched_rt_runtime * NSEC_PER_USEC;
+}
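To make the units concrete: sysctl_sched_rt_runtime is kept in microseconds (950000 by default, as shown above), so against a one second period, which is the usual sched_rt_period default although it is not visible in this hunk, global_rt_runtime()/global_rt_period() leaves realtime tasks 950000/1000000 = 95% of each period, with RUNTIME_INF marking the unlimited case.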
+
+static const unsigned long long time_sync_thresh = 100000;
+
+static DEFINE_PER_CPU(unsigned long long, time_offset);
+static DEFINE_PER_CPU(unsigned long long, prev_cpu_time);
 
 /*
- * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
- * clock constructed from sched_clock():
+ * Global lock which we take every now and then to synchronize
+ * the CPUs' time. This method is not warp-safe, but it's good
+ * enough to synchronize slowly diverging time sources and thus
+ * it's good enough for tracing:
  */
-unsigned long long cpu_clock(int cpu)
+static DEFINE_SPINLOCK(time_sync_lock);
+static unsigned long long prev_global_time;
+
+static unsigned long long __sync_cpu_clock(cycles_t time, int cpu)
+{
+       unsigned long flags;
+
+       spin_lock_irqsave(&time_sync_lock, flags);
+
+       if (time < prev_global_time) {
+               per_cpu(time_offset, cpu) += prev_global_time - time;
+               time = prev_global_time;
+       } else {
+               prev_global_time = time;
+       }
+
+       spin_unlock_irqrestore(&time_sync_lock, flags);
+
+       return time;
+}
+
+static unsigned long long __cpu_clock(int cpu)
 {
        unsigned long long now;
        unsigned long flags;
@@ -657,6 +952,24 @@ unsigned long long cpu_clock(int cpu)
 
        return now;
 }
+
+/*
+ * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
+ * clock constructed from sched_clock():
+ */
+unsigned long long cpu_clock(int cpu)
+{
+       unsigned long long prev_cpu_time, time, delta_time;
+
+       prev_cpu_time = per_cpu(prev_cpu_time, cpu);
+       time = __cpu_clock(cpu) + per_cpu(time_offset, cpu);
+       delta_time = time-prev_cpu_time;
+
+       if (unlikely(delta_time > time_sync_thresh))
+               time = __sync_cpu_clock(time, cpu);
+
+       return time;
+}
 EXPORT_SYMBOL_GPL(cpu_clock);
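cpu_clock() now layers a per-cpu offset on top of the raw per-cpu clock: whenever a reading has drifted more than time_sync_thresh from the previous one, __sync_cpu_clock() compares it against the last globally recorded time and bumps the lagging CPU's offset so values never appear to jump backwards across CPUs. A standalone sketch of that correction (locking and the threshold check omitted; names are illustrative):

#define TOY_NR_CPUS 4

static unsigned long long toy_prev_global_time;
static unsigned long long toy_time_offset[TOY_NR_CPUS];

static unsigned long long toy_sync_cpu_clock(unsigned long long t, int cpu)
{
        if (t < toy_prev_global_time) {
                /* this CPU lags: remember the gap and report the global
                 * time instead, so time never runs backwards */
                toy_time_offset[cpu] += toy_prev_global_time - t;
                t = toy_prev_global_time;
        } else {
                toy_prev_global_time = t;
        }
        return t;
}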
 
 #ifndef prepare_arch_switch
@@ -1116,6 +1429,9 @@ static void __resched_task(struct task_struct *p, int tif_bit)
  */
 #define SRR(x, y) (((x) + (1UL << ((y) - 1))) >> (y))
 
+/*
+ * delta *= weight / lw
+ */
 static unsigned long
 calc_delta_mine(unsigned long delta_exec, unsigned long weight,
                struct load_weight *lw)
@@ -1138,12 +1454,6 @@ calc_delta_mine(unsigned long delta_exec, unsigned long weight,
        return (unsigned long)min(tmp, (u64)(unsigned long)LONG_MAX);
 }
 
-static inline unsigned long
-calc_delta_fair(unsigned long delta_exec, struct load_weight *lw)
-{
-       return calc_delta_mine(delta_exec, NICE_0_LOAD, lw);
-}
-
 static inline void update_load_add(struct load_weight *lw, unsigned long inc)
 {
        lw->weight += inc;
@@ -1241,11 +1551,390 @@ static void cpuacct_charge(struct task_struct *tsk, u64 cputime);
 static inline void cpuacct_charge(struct task_struct *tsk, u64 cputime) {}
 #endif
 
+static inline void inc_cpu_load(struct rq *rq, unsigned long load)
+{
+       update_load_add(&rq->load, load);
+}
+
+static inline void dec_cpu_load(struct rq *rq, unsigned long load)
+{
+       update_load_sub(&rq->load, load);
+}
+
 #ifdef CONFIG_SMP
 static unsigned long source_load(int cpu, int type);
 static unsigned long target_load(int cpu, int type);
 static unsigned long cpu_avg_load_per_task(int cpu);
 static int task_hot(struct task_struct *p, u64 now, struct sched_domain *sd);
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+
+/*
+ * Group load balancing.
+ *
+ * We calculate a few balance domain wide aggregate numbers; load and weight.
+ * Given the pictures below, and assuming each item has equal weight:
+ *
+ *         root          1 - thread
+ *         / | \         A - group
+ *        A  1  B
+ *       /|\   / \
+ *      C 2 D 3   4
+ *      |   |
+ *      5   6
+ *
+ * load:
+ *    A and B get 1/3-rd of the total load. C and D get 1/3-rd of A's 1/3-rd,
+ *    which equals 1/9-th of the total load.
+ *
+ * shares:
+ *    The weight of this group on the selected cpus.
+ *
+ * rq_weight:
+ *    Direct sum of the rq weights of all the cpus in the span, e.g. A would get 3 while
+ *    B would get 2.
+ *
+ * task_weight:
+ *    Part of the rq_weight contributed by tasks; all groups except B would
+ *    get 1, B gets 2.
+ */
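Reading the example tree above with equal weights: A, B and thread 1 each carry 1/3 of the root's load, and C, D and thread 2 each carry a third of A's third, i.e. 1/9 of the total. A toy recursive model of that fraction (hypothetical structures, nothing like the real task_group):

struct toy_group {
        struct toy_group *parent;
        int nr_children;        /* equal-weight children (groups or threads) */
};

/* root -> A -> C yields 1.0 -> 1/3 -> 1/9 when every level has 3 children */
static double toy_load_fraction(const struct toy_group *tg)
{
        if (!tg->parent)
                return 1.0;     /* the root carries the whole load */
        /* equal-weight children split the parent's fraction evenly */
        return toy_load_fraction(tg->parent) / tg->parent->nr_children;
}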
+
+static inline struct aggregate_struct *
+aggregate(struct task_group *tg, struct sched_domain *sd)
+{
+       return &tg->cfs_rq[sd->first_cpu]->aggregate;
+}
+
+typedef void (*aggregate_func)(struct task_group *, struct sched_domain *);
+
+/*
+ * Iterate the full tree, calling @down when first entering a node and @up when
+ * leaving it for the final time.
+ */
+static
+void aggregate_walk_tree(aggregate_func down, aggregate_func up,
+                        struct sched_domain *sd)
+{
+       struct task_group *parent, *child;
+
+       rcu_read_lock();
+       parent = &root_task_group;
+down:
+       (*down)(parent, sd);
+       list_for_each_entry_rcu(child, &parent->children, siblings) {
+               parent = child;
+               goto down;
+
+up:
+               continue;
+       }
+       (*up)(parent, sd);
+
+       child = parent;
+       parent = parent->parent;
+       if (parent)
+               goto up;
+       rcu_read_unlock();
+}
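aggregate_walk_tree() above does a depth-first walk without recursion, using gotos to re-enter the list iteration when climbing back up, presumably to keep kernel stack usage flat. The same visiting order written recursively, as a plain sketch with toy node types:

struct toy_node {
        struct toy_node *first_child;
        struct toy_node *next_sibling;
};

/* call 'down' when first entering a node and 'up' when leaving it for
 * the final time, exactly the ordering the goto-based walk produces */
static void toy_walk(struct toy_node *n,
                     void (*down)(struct toy_node *),
                     void (*up)(struct toy_node *))
{
        struct toy_node *child;

        down(n);
        for (child = n->first_child; child; child = child->next_sibling)
                toy_walk(child, down, up);
        up(n);
}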
+
+/*
+ * Calculate the aggregate runqueue weight.
+ */
+static
+void aggregate_group_weight(struct task_group *tg, struct sched_domain *sd)
+{
+       unsigned long rq_weight = 0;
+       unsigned long task_weight = 0;
+       int i;
+
+       for_each_cpu_mask(i, sd->span) {
+               rq_weight += tg->cfs_rq[i]->load.weight;
+               task_weight += tg->cfs_rq[i]->task_weight;
+       }
+
+       aggregate(tg, sd)->rq_weight = rq_weight;
+       aggregate(tg, sd)->task_weight = task_weight;
+}
+
+/*
+ * Redistribute tg->shares amongst all tg->cfs_rq[]s.
+ */
+static void __aggregate_redistribute_shares(struct task_group *tg)
+{
+       int i, max_cpu = smp_processor_id();
+       unsigned long rq_weight = 0;
+       unsigned long shares, max_shares = 0, shares_rem = tg->shares;
+
+       for_each_possible_cpu(i)
+               rq_weight += tg->cfs_rq[i]->load.weight;
+
+       for_each_possible_cpu(i) {
+               /*
+                * divide shares proportional to the rq_weights.
+                */
+               shares = tg->shares * tg->cfs_rq[i]->load.weight;
+               shares /= rq_weight + 1;
+
+               tg->cfs_rq[i]->shares = shares;
+
+               if (shares > max_shares) {
+                       max_shares = shares;
+                       max_cpu = i;
+               }
+               shares_rem -= shares;
+       }
+
+       /*
+        * Ensure it all adds up to tg->shares; we can lose a few
+        * due to rounding down when computing the per-cpu shares.
+        */
+       if (shares_rem)
+               tg->cfs_rq[max_cpu]->shares += shares_rem;
+}
+
+/*
+ * Compute the weight of this group on the given cpus.
+ */
+static
+void aggregate_group_shares(struct task_group *tg, struct sched_domain *sd)
+{
+       unsigned long shares = 0;
+       int i;
+
+again:
+       for_each_cpu_mask(i, sd->span)
+               shares += tg->cfs_rq[i]->shares;
+
+       /*
+        * When the span doesn't have any shares assigned, but does have
+        * tasks to run, do a machine-wide rebalance (should be rare).
+        */
+       if (unlikely(!shares && aggregate(tg, sd)->rq_weight)) {
+               __aggregate_redistribute_shares(tg);
+               goto again;
+       }
+
+       aggregate(tg, sd)->shares = shares;
+}
+
+/*
+ * Compute the load fraction assigned to this group, relies on the aggregate
+ * weight and this group's parent's load, i.e. top-down.
+ */
+static
+void aggregate_group_load(struct task_group *tg, struct sched_domain *sd)
+{
+       unsigned long load;
+
+       if (!tg->parent) {
+               int i;
+
+               load = 0;
+               for_each_cpu_mask(i, sd->span)
+                       load += cpu_rq(i)->load.weight;
+
+       } else {
+               load = aggregate(tg->parent, sd)->load;
+
+               /*
+                * shares is our weight in the parent's rq so
+                * shares/parent->rq_weight gives our fraction of the load
+                */
+               load *= aggregate(tg, sd)->shares;
+               load /= aggregate(tg->parent, sd)->rq_weight + 1;
+       }
+
+       aggregate(tg, sd)->load = load;
+}
+
+static void __set_se_shares(struct sched_entity *se, unsigned long shares);
+
+/*
+ * Calculate and set the cpu's group shares.
+ */
+static void
+__update_group_shares_cpu(struct task_group *tg, struct sched_domain *sd,
+                         int tcpu)
+{
+       int boost = 0;
+       unsigned long shares;
+       unsigned long rq_weight;
+
+       if (!tg->se[tcpu])
+               return;
+
+       rq_weight = tg->cfs_rq[tcpu]->load.weight;
+
+       /*
+        * If there are currently no tasks on the cpu, pretend there is one of
+        * average load so that when a new task gets to run here it will not
+        * get delayed by group starvation.
+        */
+       if (!rq_weight) {
+               boost = 1;
+               rq_weight = NICE_0_LOAD;
+       }
+
+       /*
+        *           \Sum shares * rq_weight
+        * shares =  -----------------------
+        *               \Sum rq_weight
+        *
+        */
+       shares = aggregate(tg, sd)->shares * rq_weight;
+       shares /= aggregate(tg, sd)->rq_weight + 1;
+
+       /*
+        * record the actual number of shares, not the boosted amount.
+        */
+       tg->cfs_rq[tcpu]->shares = boost ? 0 : shares;
+
+       if (shares < MIN_SHARES)
+               shares = MIN_SHARES;
+
+       __set_se_shares(tg->se[tcpu], shares);
+}
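Worked through with example numbers: if the group holds 1024 aggregate shares in this domain and the runqueues in the span weigh 2048 in total, a cpu whose runqueue weighs 512 receives roughly 1024 * 512 / 2048 = 256 shares (the +1 in the divisor only guards against dividing by zero). A cpu with no runnable tasks is computed as if it carried NICE_0_LOAD (the boost case) but records 0 shares, and any result below MIN_SHARES is raised to 2 before being applied.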
+
+/*
+ * Re-adjust the weights on the cpu the task came from and on the cpu the
+ * task went to.
+ */
+static void
+__move_group_shares(struct task_group *tg, struct sched_domain *sd,
+                   int scpu, int dcpu)
+{
+       unsigned long shares;
+
+       shares = tg->cfs_rq[scpu]->shares + tg->cfs_rq[dcpu]->shares;
+
+       __update_group_shares_cpu(tg, sd, scpu);
+       __update_group_shares_cpu(tg, sd, dcpu);
+
+       /*
+        * ensure we never lose shares due to rounding errors in the
+        * above redistribution.
+        */
+       shares -= tg->cfs_rq[scpu]->shares + tg->cfs_rq[dcpu]->shares;
+       if (shares)
+               tg->cfs_rq[dcpu]->shares += shares;
+}
+
+/*
+ * Because changing a group's shares changes the weight of the super-group,
+ * we need to walk up the tree and change all shares until we hit the root.
+ */
+static void
+move_group_shares(struct task_group *tg, struct sched_domain *sd,
+                 int scpu, int dcpu)
+{
+       while (tg) {
+               __move_group_shares(tg, sd, scpu, dcpu);
+               tg = tg->parent;
+       }
+}
+
+static
+void aggregate_group_set_shares(struct task_group *tg, struct sched_domain *sd)
+{
+       unsigned long shares = aggregate(tg, sd)->shares;
+       int i;
+
+       for_each_cpu_mask(i, sd->span) {
+               struct rq *rq = cpu_rq(i);
+               unsigned long flags;
+
+               spin_lock_irqsave(&rq->lock, flags);
+               __update_group_shares_cpu(tg, sd, i);
+               spin_unlock_irqrestore(&rq->lock, flags);
+       }
+
+       aggregate_group_shares(tg, sd);
+
+       /*
+        * ensure we never lose shares due to rounding errors in the
+        * above redistribution.
+        */
+       shares -= aggregate(tg, sd)->shares;
+       if (shares) {
+               tg->cfs_rq[sd->first_cpu]->shares += shares;
+               aggregate(tg, sd)->shares += shares;
+       }
+}
+
+/*
+ * Calculate the accumulative weight and recursive load of each task group
+ * while walking down the tree.
+ */
+static
+void aggregate_get_down(struct task_group *tg, struct sched_domain *sd)
+{
+       aggregate_group_weight(tg, sd);
+       aggregate_group_shares(tg, sd);
+       aggregate_group_load(tg, sd);
+}
+
+/*
+ * Rebalance the cpu shares while walking back up the tree.
+ */
+static
+void aggregate_get_up(struct task_group *tg, struct sched_domain *sd)
+{
+       aggregate_group_set_shares(tg, sd);
+}
+
+static DEFINE_PER_CPU(spinlock_t, aggregate_lock);
+
+static void __init init_aggregate(void)
+{
+       int i;
+
+       for_each_possible_cpu(i)
+               spin_lock_init(&per_cpu(aggregate_lock, i));
+}
+
+static int get_aggregate(struct sched_domain *sd)
+{
+       if (!spin_trylock(&per_cpu(aggregate_lock, sd->first_cpu)))
+               return 0;
+
+       aggregate_walk_tree(aggregate_get_down, aggregate_get_up, sd);
+       return 1;
+}
+
+static void put_aggregate(struct sched_domain *sd)
+{
+       spin_unlock(&per_cpu(aggregate_lock, sd->first_cpu));
+}
+
+static void cfs_rq_set_shares(struct cfs_rq *cfs_rq, unsigned long shares)
+{
+       cfs_rq->shares = shares;
+}
+
+#else
+
+static inline void init_aggregate(void)
+{
+}
+
+static inline int get_aggregate(struct sched_domain *sd)
+{
+       return 0;
+}
+
+static inline void put_aggregate(struct sched_domain *sd)
+{
+}
+#endif
+
+#else /* CONFIG_SMP */
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+static void cfs_rq_set_shares(struct cfs_rq *cfs_rq, unsigned long shares)
+{
+}
+#endif
+
 #endif /* CONFIG_SMP */
 
 #include "sched_stats.h"
@@ -1258,26 +1947,14 @@ static int task_hot(struct task_struct *p, u64 now, struct sched_domain *sd);
 
 #define sched_class_highest (&rt_sched_class)
 
-static inline void inc_load(struct rq *rq, const struct task_struct *p)
-{
-       update_load_add(&rq->load, p->se.load.weight);
-}
-
-static inline void dec_load(struct rq *rq, const struct task_struct *p)
-{
-       update_load_sub(&rq->load, p->se.load.weight);
-}
-
-static void inc_nr_running(struct task_struct *p, struct rq *rq)
+static void inc_nr_running(struct rq *rq)
 {
        rq->nr_running++;
-       inc_load(rq, p);
 }
 
-static void dec_nr_running(struct task_struct *p, struct rq *rq)
+static void dec_nr_running(struct rq *rq)
 {
        rq->nr_running--;
-       dec_load(rq, p);
 }
 
 static void set_load_weight(struct task_struct *p)
@@ -1369,7 +2046,7 @@ static void activate_task(struct rq *rq, struct task_struct *p, int wakeup)
                rq->nr_uninterruptible--;
 
        enqueue_task(rq, p, wakeup);
-       inc_nr_running(p, rq);
+       inc_nr_running(rq);
 }
 
 /*
@@ -1381,7 +2058,7 @@ static void deactivate_task(struct rq *rq, struct task_struct *p, int sleep)
                rq->nr_uninterruptible++;
 
        dequeue_task(rq, p, sleep);
-       dec_nr_running(p, rq);
+       dec_nr_running(rq);
 }
 
 /**
@@ -1438,7 +2115,7 @@ task_hot(struct task_struct *p, u64 now, struct sched_domain *sd)
        /*
         * Buddy candidates are cache hot:
         */
-       if (&p->se == cfs_rq_of(&p->se)->next)
+       if (sched_feat(CACHE_HOT_BUDDY) && (&p->se == cfs_rq_of(&p->se)->next))
                return 1;
 
        if (p->sched_class != &fair_sched_class)
@@ -1728,17 +2405,17 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
  * find_idlest_cpu - find the idlest cpu among the cpus in group.
  */
 static int
-find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
+find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu,
+               cpumask_t *tmp)
 {
-       cpumask_t tmp;
        unsigned long load, min_load = ULONG_MAX;
        int idlest = -1;
        int i;
 
        /* Traverse only the allowed CPUs */
-       cpus_and(tmp, group->cpumask, p->cpus_allowed);
+       cpus_and(*tmp, group->cpumask, p->cpus_allowed);
 
-       for_each_cpu_mask(i, tmp) {
+       for_each_cpu_mask(i, *tmp) {
                load = weighted_cpuload(i);
 
                if (load < min_load || (load == min_load && i == this_cpu)) {
@@ -1777,7 +2454,7 @@ static int sched_balance_self(int cpu, int flag)
        }
 
        while (sd) {
-               cpumask_t span;
+               cpumask_t span, tmpmask;
                struct sched_group *group;
                int new_cpu, weight;
 
@@ -1793,7 +2470,7 @@ static int sched_balance_self(int cpu, int flag)
                        continue;
                }
 
-               new_cpu = find_idlest_cpu(group, t, cpu);
+               new_cpu = find_idlest_cpu(group, t, cpu, &tmpmask);
                if (new_cpu == -1 || new_cpu == cpu) {
                        /* Now try balancing at a lower domain level of cpu */
                        sd = sd->child;
@@ -1839,6 +2516,9 @@ static int try_to_wake_up(struct task_struct *p, unsigned int state, int sync)
        long old_state;
        struct rq *rq;
 
+       if (!sched_feat(SYNC_WAKEUPS))
+               sync = 0;
+
        smp_wmb();
        rq = task_rq_lock(p, &flags);
        old_state = p->state;
@@ -1955,6 +2635,7 @@ static void __sched_fork(struct task_struct *p)
 
        INIT_LIST_HEAD(&p->rt.run_list);
        p->se.on_rq = 0;
+       INIT_LIST_HEAD(&p->se.group_node);
 
 #ifdef CONFIG_PREEMPT_NOTIFIERS
        INIT_HLIST_HEAD(&p->preempt_notifiers);
@@ -2030,7 +2711,7 @@ void wake_up_new_task(struct task_struct *p, unsigned long clone_flags)
                 * management (if any):
                 */
                p->sched_class->task_new(rq, p);
-               inc_nr_running(p, rq);
+               inc_nr_running(rq);
        }
        check_preempt_curr(rq, p);
 #ifdef CONFIG_SMP
@@ -2674,7 +3355,7 @@ static int move_one_task(struct rq *this_rq, int this_cpu, struct rq *busiest,
 static struct sched_group *
 find_busiest_group(struct sched_domain *sd, int this_cpu,
                   unsigned long *imbalance, enum cpu_idle_type idle,
-                  int *sd_idle, cpumask_t *cpus, int *balance)
+                  int *sd_idle, const cpumask_t *cpus, int *balance)
 {
        struct sched_group *busiest = NULL, *this = NULL, *group = sd->groups;
        unsigned long max_load, avg_load, total_load, this_load, total_pwr;
@@ -2975,7 +3656,7 @@ ret:
  */
 static struct rq *
 find_busiest_queue(struct sched_group *group, enum cpu_idle_type idle,
-                  unsigned long imbalance, cpumask_t *cpus)
+                  unsigned long imbalance, const cpumask_t *cpus)
 {
        struct rq *busiest = NULL, *rq;
        unsigned long max_load = 0;
@@ -3014,14 +3695,18 @@ find_busiest_queue(struct sched_group *group, enum cpu_idle_type idle,
  */
 static int load_balance(int this_cpu, struct rq *this_rq,
                        struct sched_domain *sd, enum cpu_idle_type idle,
-                       int *balance)
+                       int *balance, cpumask_t *cpus)
 {
        int ld_moved, all_pinned = 0, active_balance = 0, sd_idle = 0;
        struct sched_group *group;
        unsigned long imbalance;
        struct rq *busiest;
-       cpumask_t cpus = CPU_MASK_ALL;
        unsigned long flags;
+       int unlock_aggregate;
+
+       cpus_setall(*cpus);
+
+       unlock_aggregate = get_aggregate(sd);
 
        /*
         * When power savings policy is enabled for the parent domain, idle
@@ -3037,7 +3722,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 
 redo:
        group = find_busiest_group(sd, this_cpu, &imbalance, idle, &sd_idle,
-                                  &cpus, balance);
+                                  cpus, balance);
 
        if (*balance == 0)
                goto out_balanced;
@@ -3047,7 +3732,7 @@ redo:
                goto out_balanced;
        }
 
-       busiest = find_busiest_queue(group, idle, imbalance, &cpus);
+       busiest = find_busiest_queue(group, idle, imbalance, cpus);
        if (!busiest) {
                schedstat_inc(sd, lb_nobusyq[idle]);
                goto out_balanced;
@@ -3080,8 +3765,8 @@ redo:
 
                /* All tasks on this runqueue were pinned by CPU affinity */
                if (unlikely(all_pinned)) {
-                       cpu_clear(cpu_of(busiest), cpus);
-                       if (!cpus_empty(cpus))
+                       cpu_clear(cpu_of(busiest), *cpus);
+                       if (!cpus_empty(*cpus))
                                goto redo;
                        goto out_balanced;
                }
@@ -3138,8 +3823,9 @@ redo:
 
        if (!ld_moved && !sd_idle && sd->flags & SD_SHARE_CPUPOWER &&
            !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
-               return -1;
-       return ld_moved;
+               ld_moved = -1;
+
+       goto out;
 
 out_balanced:
        schedstat_inc(sd, lb_balanced[idle]);
@@ -3154,8 +3840,13 @@ out_one_pinned:
 
        if (!sd_idle && sd->flags & SD_SHARE_CPUPOWER &&
            !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
-               return -1;
-       return 0;
+               ld_moved = -1;
+       else
+               ld_moved = 0;
+out:
+       if (unlock_aggregate)
+               put_aggregate(sd);
+       return ld_moved;
 }
 
 /*
@@ -3166,7 +3857,8 @@ out_one_pinned:
  * this_rq is locked.
  */
 static int
-load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
+load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd,
+                       cpumask_t *cpus)
 {
        struct sched_group *group;
        struct rq *busiest = NULL;
@@ -3174,7 +3866,8 @@ load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
        int ld_moved = 0;
        int sd_idle = 0;
        int all_pinned = 0;
-       cpumask_t cpus = CPU_MASK_ALL;
+
+       cpus_setall(*cpus);
 
        /*
         * When power savings policy is enabled for the parent domain, idle
@@ -3189,14 +3882,13 @@ load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
        schedstat_inc(sd, lb_count[CPU_NEWLY_IDLE]);
 redo:
        group = find_busiest_group(sd, this_cpu, &imbalance, CPU_NEWLY_IDLE,
-                                  &sd_idle, &cpus, NULL);
+                                  &sd_idle, cpus, NULL);
        if (!group) {
                schedstat_inc(sd, lb_nobusyg[CPU_NEWLY_IDLE]);
                goto out_balanced;
        }
 
-       busiest = find_busiest_queue(group, CPU_NEWLY_IDLE, imbalance,
-                               &cpus);
+       busiest = find_busiest_queue(group, CPU_NEWLY_IDLE, imbalance, cpus);
        if (!busiest) {
                schedstat_inc(sd, lb_nobusyq[CPU_NEWLY_IDLE]);
                goto out_balanced;
@@ -3218,8 +3910,8 @@ redo:
                spin_unlock(&busiest->lock);
 
                if (unlikely(all_pinned)) {
-                       cpu_clear(cpu_of(busiest), cpus);
-                       if (!cpus_empty(cpus))
+                       cpu_clear(cpu_of(busiest), *cpus);
+                       if (!cpus_empty(*cpus))
                                goto redo;
                }
        }
@@ -3253,6 +3945,7 @@ static void idle_balance(int this_cpu, struct rq *this_rq)
        struct sched_domain *sd;
        int pulled_task = -1;
        unsigned long next_balance = jiffies + HZ;
+       cpumask_t tmpmask;
 
        for_each_domain(this_cpu, sd) {
                unsigned long interval;
@@ -3262,8 +3955,8 @@ static void idle_balance(int this_cpu, struct rq *this_rq)
 
                if (sd->flags & SD_BALANCE_NEWIDLE)
                        /* If we've pulled tasks over stop searching: */
-                       pulled_task = load_balance_newidle(this_cpu,
-                                                               this_rq, sd);
+                       pulled_task = load_balance_newidle(this_cpu, this_rq,
+                                                          sd, &tmpmask);
 
                interval = msecs_to_jiffies(sd->balance_interval);
                if (time_after(next_balance, sd->last_balance + interval))
@@ -3422,6 +4115,7 @@ static void rebalance_domains(int cpu, enum cpu_idle_type idle)
        /* Earliest time when we have to do rebalance again */
        unsigned long next_balance = jiffies + 60*HZ;
        int update_next_balance = 0;
+       cpumask_t tmp;
 
        for_each_domain(cpu, sd) {
                if (!(sd->flags & SD_LOAD_BALANCE))
@@ -3445,7 +4139,7 @@ static void rebalance_domains(int cpu, enum cpu_idle_type idle)
                }
 
                if (time_after_eq(jiffies, sd->last_balance + interval)) {
-                       if (load_balance(cpu, rq, sd, idle, &balance)) {
+                       if (load_balance(cpu, rq, sd, idle, &balance, &tmp)) {
                                /*
                                 * We've pulled tasks over so either we're no
                                 * longer idle, or one of our SMT siblings is
@@ -3561,7 +4255,7 @@ static inline void trigger_load_balance(struct rq *rq, int cpu)
                         */
                        int ilb = first_cpu(nohz.cpu_mask);
 
-                       if (ilb != NR_CPUS)
+                       if (ilb < nr_cpu_ids)
                                resched_cpu(ilb);
                }
        }
@@ -3765,9 +4459,9 @@ void scheduler_tick(void)
                rq->clock_underflows++;
        }
        rq->tick_timestamp = rq->clock;
+       update_last_tick_seen(rq);
        update_cpu_load(rq);
        curr->sched_class->task_tick(rq, curr, 0);
-       update_sched_rt_period(rq);
        spin_unlock(&rq->lock);
 
 #ifdef CONFIG_SMP
@@ -4367,10 +5061,8 @@ void set_user_nice(struct task_struct *p, long nice)
                goto out_unlock;
        }
        on_rq = p->se.on_rq;
-       if (on_rq) {
+       if (on_rq)
                dequeue_task(rq, p, 0);
-               dec_load(rq, p);
-       }
 
        p->static_prio = NICE_TO_PRIO(nice);
        set_load_weight(p);
@@ -4380,7 +5072,6 @@ void set_user_nice(struct task_struct *p, long nice)
 
        if (on_rq) {
                enqueue_task(rq, p, 0);
-               inc_load(rq, p);
                /*
                 * If the task increased its priority or is running and
                 * lowered its priority, then reschedule its CPU:
@@ -4602,7 +5293,7 @@ recheck:
         * Do not allow realtime tasks into groups that have no runtime
         * assigned.
         */
-       if (rt_policy(policy) && task_group(p)->rt_runtime == 0)
+       if (rt_policy(policy) && task_group(p)->rt_bandwidth.rt_runtime == 0)
                return -EPERM;
 #endif
 
@@ -4764,9 +5455,10 @@ out_unlock:
        return retval;
 }
 
-long sched_setaffinity(pid_t pid, cpumask_t new_mask)
+long sched_setaffinity(pid_t pid, const cpumask_t *in_mask)
 {
        cpumask_t cpus_allowed;
+       cpumask_t new_mask = *in_mask;
        struct task_struct *p;
        int retval;
 
@@ -4797,13 +5489,13 @@ long sched_setaffinity(pid_t pid, cpumask_t new_mask)
        if (retval)
                goto out_unlock;
 
-       cpus_allowed = cpuset_cpus_allowed(p);
+       cpuset_cpus_allowed(p, &cpus_allowed);
        cpus_and(new_mask, new_mask, cpus_allowed);
  again:
-       retval = set_cpus_allowed(p, new_mask);
+       retval = set_cpus_allowed_ptr(p, &new_mask);
 
        if (!retval) {
-               cpus_allowed = cpuset_cpus_allowed(p);
+               cpuset_cpus_allowed(p, &cpus_allowed);
                if (!cpus_subset(new_mask, cpus_allowed)) {
                        /*
                         * We must have raced with a concurrent cpuset
@@ -4847,7 +5539,7 @@ asmlinkage long sys_sched_setaffinity(pid_t pid, unsigned int len,
        if (retval)
                return retval;
 
-       return sched_setaffinity(pid, new_mask);
+       return sched_setaffinity(pid, &new_mask);
 }
 
 /*
@@ -5309,7 +6001,6 @@ static inline void sched_init_granularity(void)
                sysctl_sched_latency = limit;
 
        sysctl_sched_wakeup_granularity *= factor;
-       sysctl_sched_batch_wakeup_granularity *= factor;
 }
 
 #ifdef CONFIG_SMP
@@ -5338,7 +6029,7 @@ static inline void sched_init_granularity(void)
  * task must not exit() & deallocate itself prematurely. The
  * call is not atomic; no spinlocks may be held.
  */
-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
 {
        struct migration_req req;
        unsigned long flags;
@@ -5346,23 +6037,23 @@ int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
        int ret = 0;
 
        rq = task_rq_lock(p, &flags);
-       if (!cpus_intersects(new_mask, cpu_online_map)) {
+       if (!cpus_intersects(*new_mask, cpu_online_map)) {
                ret = -EINVAL;
                goto out;
        }
 
        if (p->sched_class->set_cpus_allowed)
-               p->sched_class->set_cpus_allowed(p, &new_mask);
+               p->sched_class->set_cpus_allowed(p, new_mask);
        else {
-               p->cpus_allowed = new_mask;
-               p->rt.nr_cpus_allowed = cpus_weight(new_mask);
+               p->cpus_allowed = *new_mask;
+               p->rt.nr_cpus_allowed = cpus_weight(*new_mask);
        }
 
        /* Can the task run on the task's current CPU? If so, we're done */
-       if (cpu_isset(task_cpu(p), new_mask))
+       if (cpu_isset(task_cpu(p), *new_mask))
                goto out;
 
-       if (migrate_task(p, any_online_cpu(new_mask), &req)) {
+       if (migrate_task(p, any_online_cpu(*new_mask), &req)) {
                /* Need help from migration thread: drop lock and wait. */
                task_rq_unlock(rq, &flags);
                wake_up_process(rq->migration_thread);
@@ -5375,7 +6066,7 @@ out:
 
        return ret;
 }
-EXPORT_SYMBOL_GPL(set_cpus_allowed);
+EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
 
 /*
  * Move (not current) task off this cpu, onto dest cpu. We're doing
@@ -5513,12 +6204,14 @@ static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p)
                dest_cpu = any_online_cpu(mask);
 
                /* On any allowed CPU? */
-               if (dest_cpu == NR_CPUS)
+               if (dest_cpu >= nr_cpu_ids)
                        dest_cpu = any_online_cpu(p->cpus_allowed);
 
                /* No more Mr. Nice Guy. */
-               if (dest_cpu == NR_CPUS) {
-                       cpumask_t cpus_allowed = cpuset_cpus_allowed_locked(p);
+               if (dest_cpu >= nr_cpu_ids) {
+                       cpumask_t cpus_allowed;
+
+                       cpuset_cpus_allowed_locked(p, &cpus_allowed);
                        /*
                         * Try to stay on the same cpuset, where the
                         * current cpuset may be a subset of all cpus.
@@ -5554,7 +6247,7 @@ static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p)
  */
 static void migrate_nr_uninterruptible(struct rq *rq_src)
 {
-       struct rq *rq_dest = cpu_rq(any_online_cpu(CPU_MASK_ALL));
+       struct rq *rq_dest = cpu_rq(any_online_cpu(*CPU_MASK_ALL_PTR));
        unsigned long flags;
 
        local_irq_save(flags);
@@ -5966,20 +6659,16 @@ void __init migration_init(void)
 
 #ifdef CONFIG_SMP
 
-/* Number of possible processor ids */
-int nr_cpu_ids __read_mostly = NR_CPUS;
-EXPORT_SYMBOL(nr_cpu_ids);
-
 #ifdef CONFIG_SCHED_DEBUG
 
-static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level)
+static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
+                                 cpumask_t *groupmask)
 {
        struct sched_group *group = sd->groups;
-       cpumask_t groupmask;
-       char str[NR_CPUS];
+       char str[256];
 
-       cpumask_scnprintf(str, NR_CPUS, sd->span);
-       cpus_clear(groupmask);
+       cpulist_scnprintf(str, sizeof(str), sd->span);
+       cpus_clear(*groupmask);
 
        printk(KERN_DEBUG "%*s domain %d: ", level, "", level);
 
@@ -6023,25 +6712,25 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level)
                        break;
                }
 
-               if (cpus_intersects(groupmask, group->cpumask)) {
+               if (cpus_intersects(*groupmask, group->cpumask)) {
                        printk(KERN_CONT "\n");
                        printk(KERN_ERR "ERROR: repeated CPUs\n");
                        break;
                }
 
-               cpus_or(groupmask, groupmask, group->cpumask);
+               cpus_or(*groupmask, *groupmask, group->cpumask);
 
-               cpumask_scnprintf(str, NR_CPUS, group->cpumask);
+               cpulist_scnprintf(str, sizeof(str), group->cpumask);
                printk(KERN_CONT " %s", str);
 
                group = group->next;
        } while (group != sd->groups);
        printk(KERN_CONT "\n");
 
-       if (!cpus_equal(sd->span, groupmask))
+       if (!cpus_equal(sd->span, *groupmask))
                printk(KERN_ERR "ERROR: groups don't span domain->span\n");
 
-       if (sd->parent && !cpus_subset(groupmask, sd->parent->span))
+       if (sd->parent && !cpus_subset(*groupmask, sd->parent->span))
                printk(KERN_ERR "ERROR: parent span is not a superset "
                        "of domain->span\n");
        return 0;
@@ -6049,6 +6738,7 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level)
 
 static void sched_domain_debug(struct sched_domain *sd, int cpu)
 {
+       cpumask_t *groupmask;
        int level = 0;
 
        if (!sd) {
@@ -6058,14 +6748,21 @@ static void sched_domain_debug(struct sched_domain *sd, int cpu)
 
        printk(KERN_DEBUG "CPU%d attaching sched-domain:\n", cpu);
 
+       groupmask = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
+       if (!groupmask) {
+               printk(KERN_DEBUG "Cannot load-balance (out of memory)\n");
+               return;
+       }
+
        for (;;) {
-               if (sched_domain_debug_one(sd, cpu, level))
+               if (sched_domain_debug_one(sd, cpu, level, groupmask))
                        break;
                level++;
                sd = sd->parent;
                if (!sd)
                        break;
        }
+       kfree(groupmask);
 }
 #else
 # define sched_domain_debug(sd, cpu) do { } while (0)
@@ -6253,30 +6950,33 @@ __setup("isolcpus=", isolated_cpu_setup);
  * and ->cpu_power to 0.
  */
 static void
-init_sched_build_groups(cpumask_t span, const cpumask_t *cpu_map,
+init_sched_build_groups(const cpumask_t *span, const cpumask_t *cpu_map,
                        int (*group_fn)(int cpu, const cpumask_t *cpu_map,
-                                       struct sched_group **sg))
+                                       struct sched_group **sg,
+                                       cpumask_t *tmpmask),
+                       cpumask_t *covered, cpumask_t *tmpmask)
 {
        struct sched_group *first = NULL, *last = NULL;
-       cpumask_t covered = CPU_MASK_NONE;
        int i;
 
-       for_each_cpu_mask(i, span) {
+       cpus_clear(*covered);
+
+       for_each_cpu_mask(i, *span) {
                struct sched_group *sg;
-               int group = group_fn(i, cpu_map, &sg);
+               int group = group_fn(i, cpu_map, &sg, tmpmask);
                int j;
 
-               if (cpu_isset(i, covered))
+               if (cpu_isset(i, *covered))
                        continue;
 
-               sg->cpumask = CPU_MASK_NONE;
+               cpus_clear(sg->cpumask);
                sg->__cpu_power = 0;
 
-               for_each_cpu_mask(j, span) {
-                       if (group_fn(j, cpu_map, NULL) != group)
+               for_each_cpu_mask(j, *span) {
+                       if (group_fn(j, cpu_map, NULL, tmpmask) != group)
                                continue;
 
-                       cpu_set(j, covered);
+                       cpu_set(j, *covered);
                        cpu_set(j, sg->cpumask);
                }
                if (!first)
@@ -6302,7 +7002,7 @@ init_sched_build_groups(cpumask_t span, const cpumask_t *cpu_map,
  *
  * Should use nodemask_t.
  */
-static int find_next_best_node(int node, unsigned long *used_nodes)
+static int find_next_best_node(int node, nodemask_t *used_nodes)
 {
        int i, n, val, min_val, best_node = 0;
 
@@ -6316,7 +7016,7 @@ static int find_next_best_node(int node, unsigned long *used_nodes)
                        continue;
 
                /* Skip already used nodes */
-               if (test_bit(n, used_nodes))
+               if (node_isset(n, *used_nodes))
                        continue;
 
                /* Simple min distance search */
@@ -6328,40 +7028,36 @@ static int find_next_best_node(int node, unsigned long *used_nodes)
                }
        }
 
-       set_bit(best_node, used_nodes);
+       node_set(best_node, *used_nodes);
        return best_node;
 }
 
 /**
  * sched_domain_node_span - get a cpumask for a node's sched_domain
  * @node: node whose cpumask we're constructing
- * @size: number of nodes to include in this span
  *
  * Given a node, construct a good cpumask for its sched_domain to span. It
  * should be one that prevents unnecessary balancing, but also spreads tasks
  * out optimally.
  */
-static cpumask_t sched_domain_node_span(int node)
+static void sched_domain_node_span(int node, cpumask_t *span)
 {
-       DECLARE_BITMAP(used_nodes, MAX_NUMNODES);
-       cpumask_t span, nodemask;
+       nodemask_t used_nodes;
+       node_to_cpumask_ptr(nodemask, node);
        int i;
 
-       cpus_clear(span);
-       bitmap_zero(used_nodes, MAX_NUMNODES);
+       cpus_clear(*span);
+       nodes_clear(used_nodes);
 
-       nodemask = node_to_cpumask(node);
-       cpus_or(span, span, nodemask);
-       set_bit(node, used_nodes);
+       cpus_or(*span, *span, *nodemask);
+       node_set(node, used_nodes);
 
        for (i = 1; i < SD_NODES_PER_DOMAIN; i++) {
-               int next_node = find_next_best_node(node, used_nodes);
+               int next_node = find_next_best_node(node, &used_nodes);
 
-               nodemask = node_to_cpumask(next_node);
-               cpus_or(span, span, nodemask);
+               node_to_cpumask_ptr_next(nodemask, next_node);
+               cpus_or(*span, *span, *nodemask);
        }
-
-       return span;
 }
 #endif
 
@@ -6375,7 +7071,8 @@ static DEFINE_PER_CPU(struct sched_domain, cpu_domains);
 static DEFINE_PER_CPU(struct sched_group, sched_group_cpus);
 
 static int
-cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+                cpumask_t *unused)
 {
        if (sg)
                *sg = &per_cpu(sched_group_cpus, cpu);
@@ -6393,19 +7090,22 @@ static DEFINE_PER_CPU(struct sched_group, sched_group_core);
 
 #if defined(CONFIG_SCHED_MC) && defined(CONFIG_SCHED_SMT)
 static int
-cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+                 cpumask_t *mask)
 {
        int group;
-       cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
-       cpus_and(mask, mask, *cpu_map);
-       group = first_cpu(mask);
+
+       *mask = per_cpu(cpu_sibling_map, cpu);
+       cpus_and(*mask, *mask, *cpu_map);
+       group = first_cpu(*mask);
        if (sg)
                *sg = &per_cpu(sched_group_core, group);
        return group;
 }
 #elif defined(CONFIG_SCHED_MC)
 static int
-cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+                 cpumask_t *unused)
 {
        if (sg)
                *sg = &per_cpu(sched_group_core, cpu);
@@ -6417,17 +7117,18 @@ static DEFINE_PER_CPU(struct sched_domain, phys_domains);
 static DEFINE_PER_CPU(struct sched_group, sched_group_phys);
 
 static int
-cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+                 cpumask_t *mask)
 {
        int group;
 #ifdef CONFIG_SCHED_MC
-       cpumask_t mask = cpu_coregroup_map(cpu);
-       cpus_and(mask, mask, *cpu_map);
-       group = first_cpu(mask);
+       *mask = cpu_coregroup_map(cpu);
+       cpus_and(*mask, *mask, *cpu_map);
+       group = first_cpu(*mask);
 #elif defined(CONFIG_SCHED_SMT)
-       cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
-       cpus_and(mask, mask, *cpu_map);
-       group = first_cpu(mask);
+       *mask = per_cpu(cpu_sibling_map, cpu);
+       cpus_and(*mask, *mask, *cpu_map);
+       group = first_cpu(*mask);
 #else
        group = cpu;
 #endif
@@ -6443,19 +7144,19 @@ cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
  * gets dynamically allocated.
  */
 static DEFINE_PER_CPU(struct sched_domain, node_domains);
-static struct sched_group **sched_group_nodes_bycpu[NR_CPUS];
+static struct sched_group ***sched_group_nodes_bycpu;
 
 static DEFINE_PER_CPU(struct sched_domain, allnodes_domains);
 static DEFINE_PER_CPU(struct sched_group, sched_group_allnodes);
 
 static int cpu_to_allnodes_group(int cpu, const cpumask_t *cpu_map,
-                                struct sched_group **sg)
+                                struct sched_group **sg, cpumask_t *nodemask)
 {
-       cpumask_t nodemask = node_to_cpumask(cpu_to_node(cpu));
        int group;
 
-       cpus_and(nodemask, nodemask, *cpu_map);
-       group = first_cpu(nodemask);
+       *nodemask = node_to_cpumask(cpu_to_node(cpu));
+       cpus_and(*nodemask, *nodemask, *cpu_map);
+       group = first_cpu(*nodemask);
 
        if (sg)
                *sg = &per_cpu(sched_group_allnodes, group);
@@ -6491,7 +7192,7 @@ static void init_numa_sched_groups_power(struct sched_group *group_head)
 
 #ifdef CONFIG_NUMA
 /* Free memory allocated for various sched_group structures */
-static void free_sched_groups(const cpumask_t *cpu_map)
+static void free_sched_groups(const cpumask_t *cpu_map, cpumask_t *nodemask)
 {
        int cpu, i;
 
@@ -6503,11 +7204,11 @@ static void free_sched_groups(const cpumask_t *cpu_map)
                        continue;
 
                for (i = 0; i < MAX_NUMNODES; i++) {
-                       cpumask_t nodemask = node_to_cpumask(i);
                        struct sched_group *oldsg, *sg = sched_group_nodes[i];
 
-                       cpus_and(nodemask, nodemask, *cpu_map);
-                       if (cpus_empty(nodemask))
+                       *nodemask = node_to_cpumask(i);
+                       cpus_and(*nodemask, *nodemask, *cpu_map);
+                       if (cpus_empty(*nodemask))
                                continue;
 
                        if (sg == NULL)
@@ -6525,7 +7226,7 @@ next_sg:
        }
 }
 #else
-static void free_sched_groups(const cpumask_t *cpu_map)
+static void free_sched_groups(const cpumask_t *cpu_map, cpumask_t *nodemask)
 {
 }
 #endif
@@ -6572,24 +7273,117 @@ static void init_sched_groups_power(int cpu, struct sched_domain *sd)
                return;
        }
 
-       /*
-        * add cpu_power of each child group to this groups cpu_power
-        */
-       group = child->groups;
-       do {
-               sg_inc_cpu_power(sd->groups, group->__cpu_power);
-               group = group->next;
-       } while (group != child->groups);
+       /*
+        * add cpu_power of each child group to this group's cpu_power
+        */
+       group = child->groups;
+       do {
+               sg_inc_cpu_power(sd->groups, group->__cpu_power);
+               group = group->next;
+       } while (group != child->groups);
+}
+
+/*
+ * Initializers for schedule domains
+ * Non-inlined to reduce accumulated stack pressure in build_sched_domains()
+ */
+
+#define        SD_INIT(sd, type)       sd_init_##type(sd)
+#define SD_INIT_FUNC(type)     \
+static noinline void sd_init_##type(struct sched_domain *sd)   \
+{                                                              \
+       memset(sd, 0, sizeof(*sd));                             \
+       *sd = SD_##type##_INIT;                                 \
+       sd->level = SD_LV_##type;                               \
+}
+
+SD_INIT_FUNC(CPU)
+#ifdef CONFIG_NUMA
+ SD_INIT_FUNC(ALLNODES)
+ SD_INIT_FUNC(NODE)
+#endif
+#ifdef CONFIG_SCHED_SMT
+ SD_INIT_FUNC(SIBLING)
+#endif
+#ifdef CONFIG_SCHED_MC
+ SD_INIT_FUNC(MC)
+#endif
+
+/*
+ * To minimize stack usage kmalloc room for cpumasks and share the
+ * space as the usage in build_sched_domains() dictates.  Used only
+ * if the amount of space is significant.
+ */
+struct allmasks {
+       cpumask_t tmpmask;                      /* make this one first */
+       union {
+               cpumask_t nodemask;
+               cpumask_t this_sibling_map;
+               cpumask_t this_core_map;
+       };
+       cpumask_t send_covered;
+
+#ifdef CONFIG_NUMA
+       cpumask_t domainspan;
+       cpumask_t covered;
+       cpumask_t notcovered;
+#endif
+};
+
+#if    NR_CPUS > 128
+#define        SCHED_CPUMASK_ALLOC             1
+#define        SCHED_CPUMASK_FREE(v)           kfree(v)
+#define        SCHED_CPUMASK_DECLARE(v)        struct allmasks *v
+#else
+#define        SCHED_CPUMASK_ALLOC             0
+#define        SCHED_CPUMASK_FREE(v)
+#define        SCHED_CPUMASK_DECLARE(v)        struct allmasks _v, *v = &_v
+#endif
+
+#define        SCHED_CPUMASK_VAR(v, a)         cpumask_t *v = (cpumask_t *) \
+                       ((unsigned long)(a) + offsetof(struct allmasks, v))
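
The SCHED_CPUMASK_VAR() macro above resolves a named member of struct allmasks by adding its offsetof() to the base pointer, so the same source line works whether allmasks was kmalloc'd (NR_CPUS > 128) or declared on the stack. A minimal user-space sketch of that pointer arithmetic, with a stand-in cpumask_t and a cut-down struct (types here are assumptions, for illustration only):

#include <stdio.h>
#include <stddef.h>

typedef struct { unsigned long bits[4]; } cpumask_t;   /* stand-in type */

struct allmasks {
        cpumask_t tmpmask;
        cpumask_t nodemask;
        cpumask_t send_covered;
};

#define SCHED_CPUMASK_VAR(v, a) cpumask_t *v = (cpumask_t *) \
                ((unsigned long)(a) + offsetof(struct allmasks, v))

int main(void)
{
        struct allmasks storage, *allmasks = &storage;
        SCHED_CPUMASK_VAR(nodemask, allmasks);

        /* nodemask now aliases storage.nodemask */
        printf("%d\n", nodemask == &storage.nodemask);  /* prints 1 */
        return 0;
}
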
+
+static int default_relax_domain_level = -1;
+
+static int __init setup_relax_domain_level(char *str)
+{
+       default_relax_domain_level = simple_strtoul(str, NULL, 0);
+       return 1;
+}
+__setup("relax_domain_level=", setup_relax_domain_level);
+
+static void set_domain_attribute(struct sched_domain *sd,
+                                struct sched_domain_attr *attr)
+{
+       int request;
+
+       if (!attr || attr->relax_domain_level < 0) {
+               if (default_relax_domain_level < 0)
+                       return;
+               else
+                       request = default_relax_domain_level;
+       } else
+               request = attr->relax_domain_level;
+       if (request < sd->level) {
+               /* turn off idle balance on this domain */
+               sd->flags &= ~(SD_WAKE_IDLE|SD_BALANCE_NEWIDLE);
+       } else {
+               /* turn on idle balance on this domain */
+               sd->flags |= (SD_WAKE_IDLE_FAR|SD_BALANCE_NEWIDLE);
+       }
 }
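
set_domain_attribute() compares the requested relax level against the domain's own level: domains whose level exceeds the request have SD_WAKE_IDLE and SD_BALANCE_NEWIDLE cleared, the remaining (narrower) domains get idle balancing turned on, and the default can be set at boot with relax_domain_level=. The sketch below walks a hypothetical SIBLING/MC/CPU/NODE stack with an assumed SD_LV_* ordering to show which levels keep newidle balancing for a request of 1:

#include <stdio.h>

/* Assumed ordering, standing in for the SD_LV_* values used by SD_INIT(). */
enum { LV_SIBLING = 1, LV_MC, LV_CPU, LV_NODE };

int main(void)
{
        const char *name[] = { "", "SIBLING", "MC", "CPU", "NODE" };
        int request = 1;        /* e.g. relax_domain_level=1 on the command line */
        int level;

        for (level = LV_SIBLING; level <= LV_NODE; level++)
                printf("%-7s newidle balance %s\n", name[level],
                       request < level ? "off" : "on");
        return 0;
}
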
 
 /*
  * Build sched domains for a given set of cpus and attach the sched domains
  * to the individual cpus
  */
-static int build_sched_domains(const cpumask_t *cpu_map)
+static int __build_sched_domains(const cpumask_t *cpu_map,
+                                struct sched_domain_attr *attr)
 {
        int i;
        struct root_domain *rd;
+       SCHED_CPUMASK_DECLARE(allmasks);
+       cpumask_t *tmpmask;
 #ifdef CONFIG_NUMA
        struct sched_group **sched_group_nodes = NULL;
        int sd_allnodes = 0;
@@ -6603,39 +7397,65 @@ static int build_sched_domains(const cpumask_t *cpu_map)
                printk(KERN_WARNING "Can not alloc sched group node list\n");
                return -ENOMEM;
        }
-       sched_group_nodes_bycpu[first_cpu(*cpu_map)] = sched_group_nodes;
 #endif
 
        rd = alloc_rootdomain();
        if (!rd) {
                printk(KERN_WARNING "Cannot alloc root domain\n");
+#ifdef CONFIG_NUMA
+               kfree(sched_group_nodes);
+#endif
+               return -ENOMEM;
+       }
+
+#if SCHED_CPUMASK_ALLOC
+       /* get space for all scratch cpumask variables */
+       allmasks = kmalloc(sizeof(*allmasks), GFP_KERNEL);
+       if (!allmasks) {
+               printk(KERN_WARNING "Cannot alloc cpumask array\n");
+               kfree(rd);
+#ifdef CONFIG_NUMA
+               kfree(sched_group_nodes);
+#endif
                return -ENOMEM;
        }
+#endif
+       tmpmask = (cpumask_t *)allmasks;
+
+
+#ifdef CONFIG_NUMA
+       sched_group_nodes_bycpu[first_cpu(*cpu_map)] = sched_group_nodes;
+#endif
 
        /*
         * Set up domains for cpus specified by the cpu_map.
         */
        for_each_cpu_mask(i, *cpu_map) {
                struct sched_domain *sd = NULL, *p;
-               cpumask_t nodemask = node_to_cpumask(cpu_to_node(i));
+               SCHED_CPUMASK_VAR(nodemask, allmasks);
 
-               cpus_and(nodemask, nodemask, *cpu_map);
+               *nodemask = node_to_cpumask(cpu_to_node(i));
+               cpus_and(*nodemask, *nodemask, *cpu_map);
 
 #ifdef CONFIG_NUMA
                if (cpus_weight(*cpu_map) >
-                               SD_NODES_PER_DOMAIN*cpus_weight(nodemask)) {
+                               SD_NODES_PER_DOMAIN*cpus_weight(*nodemask)) {
                        sd = &per_cpu(allnodes_domains, i);
-                       *sd = SD_ALLNODES_INIT;
+                       SD_INIT(sd, ALLNODES);
+                       set_domain_attribute(sd, attr);
                        sd->span = *cpu_map;
-                       cpu_to_allnodes_group(i, cpu_map, &sd->groups);
+                       sd->first_cpu = first_cpu(sd->span);
+                       cpu_to_allnodes_group(i, cpu_map, &sd->groups, tmpmask);
                        p = sd;
                        sd_allnodes = 1;
                } else
                        p = NULL;
 
                sd = &per_cpu(node_domains, i);
-               *sd = SD_NODE_INIT;
-               sd->span = sched_domain_node_span(cpu_to_node(i));
+               SD_INIT(sd, NODE);
+               set_domain_attribute(sd, attr);
+               sched_domain_node_span(cpu_to_node(i), &sd->span);
+               sd->first_cpu = first_cpu(sd->span);
                sd->parent = p;
                if (p)
                        p->child = sd;
@@ -6644,94 +7464,120 @@ static int build_sched_domains(const cpumask_t *cpu_map)
 
                p = sd;
                sd = &per_cpu(phys_domains, i);
-               *sd = SD_CPU_INIT;
-               sd->span = nodemask;
+               SD_INIT(sd, CPU);
+               set_domain_attribute(sd, attr);
+               sd->span = *nodemask;
+               sd->first_cpu = first_cpu(sd->span);
                sd->parent = p;
                if (p)
                        p->child = sd;
-               cpu_to_phys_group(i, cpu_map, &sd->groups);
+               cpu_to_phys_group(i, cpu_map, &sd->groups, tmpmask);
 
 #ifdef CONFIG_SCHED_MC
                p = sd;
                sd = &per_cpu(core_domains, i);
-               *sd = SD_MC_INIT;
+               SD_INIT(sd, MC);
+               set_domain_attribute(sd, attr);
                sd->span = cpu_coregroup_map(i);
+               sd->first_cpu = first_cpu(sd->span);
                cpus_and(sd->span, sd->span, *cpu_map);
                sd->parent = p;
                p->child = sd;
-               cpu_to_core_group(i, cpu_map, &sd->groups);
+               cpu_to_core_group(i, cpu_map, &sd->groups, tmpmask);
 #endif
 
 #ifdef CONFIG_SCHED_SMT
                p = sd;
                sd = &per_cpu(cpu_domains, i);
-               *sd = SD_SIBLING_INIT;
+               SD_INIT(sd, SIBLING);
+               set_domain_attribute(sd, attr);
                sd->span = per_cpu(cpu_sibling_map, i);
+               sd->first_cpu = first_cpu(sd->span);
                cpus_and(sd->span, sd->span, *cpu_map);
                sd->parent = p;
                p->child = sd;
-               cpu_to_cpu_group(i, cpu_map, &sd->groups);
+               cpu_to_cpu_group(i, cpu_map, &sd->groups, tmpmask);
 #endif
        }
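
The per-cpu loop above stacks domains from the NUMA levels down to SMT siblings, linking each newly initialised, narrower domain below the previous one (sd->parent = p; p->child = sd), so every CPU ends up with a NODE to CPU to MC to SIBLING chain when all the config options are enabled. A compact illustration of that linking with mock domain structures (ordering assumed to match the #ifdef blocks above):

#include <stdio.h>

struct dom { const char *name; struct dom *parent, *child; };

/* attach a narrower domain below the one built just before it */
static struct dom *attach(struct dom *sd, struct dom *p)
{
        sd->parent = p;
        if (p)
                p->child = sd;
        return sd;
}

int main(void)
{
        struct dom node = { "NODE" }, cpu = { "CPU" },
                   mc = { "MC" }, sib = { "SIBLING" };
        struct dom *p = NULL, *sd;

        p = attach(&node, p);           /* widest span            */
        p = attach(&cpu, p);
        p = attach(&mc, p);
        sd = attach(&sib, p);           /* base domain of the cpu */

        for (; sd; sd = sd->parent)
                printf("%s%s", sd->name, sd->parent ? " -> " : "\n");
        return 0;
}
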
 
 #ifdef CONFIG_SCHED_SMT
        /* Set up CPU (sibling) groups */
        for_each_cpu_mask(i, *cpu_map) {
-               cpumask_t this_sibling_map = per_cpu(cpu_sibling_map, i);
-               cpus_and(this_sibling_map, this_sibling_map, *cpu_map);
-               if (i != first_cpu(this_sibling_map))
+               SCHED_CPUMASK_VAR(this_sibling_map, allmasks);
+               SCHED_CPUMASK_VAR(send_covered, allmasks);
+
+               *this_sibling_map = per_cpu(cpu_sibling_map, i);
+               cpus_and(*this_sibling_map, *this_sibling_map, *cpu_map);
+               if (i != first_cpu(*this_sibling_map))
                        continue;
 
                init_sched_build_groups(this_sibling_map, cpu_map,
-                                       &cpu_to_cpu_group);
+                                       &cpu_to_cpu_group,
+                                       send_covered, tmpmask);
        }
 #endif
 
 #ifdef CONFIG_SCHED_MC
        /* Set up multi-core groups */
        for_each_cpu_mask(i, *cpu_map) {
-               cpumask_t this_core_map = cpu_coregroup_map(i);
-               cpus_and(this_core_map, this_core_map, *cpu_map);
-               if (i != first_cpu(this_core_map))
+               SCHED_CPUMASK_VAR(this_core_map, allmasks);
+               SCHED_CPUMASK_VAR(send_covered, allmasks);
+
+               *this_core_map = cpu_coregroup_map(i);
+               cpus_and(*this_core_map, *this_core_map, *cpu_map);
+               if (i != first_cpu(*this_core_map))
                        continue;
+
                init_sched_build_groups(this_core_map, cpu_map,
-                                       &cpu_to_core_group);
+                                       &cpu_to_core_group,
+                                       send_covered, tmpmask);
        }
 #endif
 
        /* Set up physical groups */
        for (i = 0; i < MAX_NUMNODES; i++) {
-               cpumask_t nodemask = node_to_cpumask(i);
+               SCHED_CPUMASK_VAR(nodemask, allmasks);
+               SCHED_CPUMASK_VAR(send_covered, allmasks);
 
-               cpus_and(nodemask, nodemask, *cpu_map);
-               if (cpus_empty(nodemask))
+               *nodemask = node_to_cpumask(i);
+               cpus_and(*nodemask, *nodemask, *cpu_map);
+               if (cpus_empty(*nodemask))
                        continue;
 
-               init_sched_build_groups(nodemask, cpu_map, &cpu_to_phys_group);
+               init_sched_build_groups(nodemask, cpu_map,
+                                       &cpu_to_phys_group,
+                                       send_covered, tmpmask);
        }
 
 #ifdef CONFIG_NUMA
        /* Set up node groups */
-       if (sd_allnodes)
-               init_sched_build_groups(*cpu_map, cpu_map,
-                                       &cpu_to_allnodes_group);
+       if (sd_allnodes) {
+               SCHED_CPUMASK_VAR(send_covered, allmasks);
+
+               init_sched_build_groups(cpu_map, cpu_map,
+                                       &cpu_to_allnodes_group,
+                                       send_covered, tmpmask);
+       }
 
        for (i = 0; i < MAX_NUMNODES; i++) {
                /* Set up node groups */
                struct sched_group *sg, *prev;
-               cpumask_t nodemask = node_to_cpumask(i);
-               cpumask_t domainspan;
-               cpumask_t covered = CPU_MASK_NONE;
+               SCHED_CPUMASK_VAR(nodemask, allmasks);
+               SCHED_CPUMASK_VAR(domainspan, allmasks);
+               SCHED_CPUMASK_VAR(covered, allmasks);
                int j;
 
-               cpus_and(nodemask, nodemask, *cpu_map);
-               if (cpus_empty(nodemask)) {
+               *nodemask = node_to_cpumask(i);
+               cpus_clear(*covered);
+
+               cpus_and(*nodemask, *nodemask, *cpu_map);
+               if (cpus_empty(*nodemask)) {
                        sched_group_nodes[i] = NULL;
                        continue;
                }
 
-               domainspan = sched_domain_node_span(i);
-               cpus_and(domainspan, domainspan, *cpu_map);
+               sched_domain_node_span(i, domainspan);
+               cpus_and(*domainspan, *domainspan, *cpu_map);
 
                sg = kmalloc_node(sizeof(struct sched_group), GFP_KERNEL, i);
                if (!sg) {
@@ -6740,31 +7586,31 @@ static int build_sched_domains(const cpumask_t *cpu_map)
                        goto error;
                }
                sched_group_nodes[i] = sg;
-               for_each_cpu_mask(j, nodemask) {
+               for_each_cpu_mask(j, *nodemask) {
                        struct sched_domain *sd;
 
                        sd = &per_cpu(node_domains, j);
                        sd->groups = sg;
                }
                sg->__cpu_power = 0;
-               sg->cpumask = nodemask;
+               sg->cpumask = *nodemask;
                sg->next = sg;
-               cpus_or(covered, covered, nodemask);
+               cpus_or(*covered, *covered, *nodemask);
                prev = sg;
 
                for (j = 0; j < MAX_NUMNODES; j++) {
-                       cpumask_t tmp, notcovered;
+                       SCHED_CPUMASK_VAR(notcovered, allmasks);
                        int n = (i + j) % MAX_NUMNODES;
+                       node_to_cpumask_ptr(pnodemask, n);
 
-                       cpus_complement(notcovered, covered);
-                       cpus_and(tmp, notcovered, *cpu_map);
-                       cpus_and(tmp, tmp, domainspan);
-                       if (cpus_empty(tmp))
+                       cpus_complement(*notcovered, *covered);
+                       cpus_and(*tmpmask, *notcovered, *cpu_map);
+                       cpus_and(*tmpmask, *tmpmask, *domainspan);
+                       if (cpus_empty(*tmpmask))
                                break;
 
-                       nodemask = node_to_cpumask(n);
-                       cpus_and(tmp, tmp, nodemask);
-                       if (cpus_empty(tmp))
+                       cpus_and(*tmpmask, *tmpmask, *pnodemask);
+                       if (cpus_empty(*tmpmask))
                                continue;
 
                        sg = kmalloc_node(sizeof(struct sched_group),
@@ -6775,9 +7621,9 @@ static int build_sched_domains(const cpumask_t *cpu_map)
                                goto error;
                        }
                        sg->__cpu_power = 0;
-                       sg->cpumask = tmp;
+                       sg->cpumask = *tmpmask;
                        sg->next = prev->next;
-                       cpus_or(covered, covered, tmp);
+                       cpus_or(*covered, *covered, *tmpmask);
                        prev->next = sg;
                        prev = sg;
                }
@@ -6813,7 +7659,8 @@ static int build_sched_domains(const cpumask_t *cpu_map)
        if (sd_allnodes) {
                struct sched_group *sg;
 
-               cpu_to_allnodes_group(first_cpu(*cpu_map), cpu_map, &sg);
+               cpu_to_allnodes_group(first_cpu(*cpu_map), cpu_map, &sg,
+                                                               tmpmask);
                init_numa_sched_groups_power(sg);
        }
 #endif
@@ -6831,17 +7678,26 @@ static int build_sched_domains(const cpumask_t *cpu_map)
                cpu_attach_domain(sd, rd, i);
        }
 
+       SCHED_CPUMASK_FREE((void *)allmasks);
        return 0;
 
 #ifdef CONFIG_NUMA
 error:
-       free_sched_groups(cpu_map);
+       free_sched_groups(cpu_map, tmpmask);
+       SCHED_CPUMASK_FREE((void *)allmasks);
        return -ENOMEM;
 #endif
 }
 
+static int build_sched_domains(const cpumask_t *cpu_map)
+{
+       return __build_sched_domains(cpu_map, NULL);
+}
+
 static cpumask_t *doms_cur;    /* current sched domains */
 static int ndoms_cur;          /* number of sched domains in 'doms_cur' */
+static struct sched_domain_attr *dattr_cur;    /* attribues of custom domains
+                                                  in 'doms_cur' */
 
 /*
  * Special case: If a kmalloc of a doms_cur partition (array of
@@ -6869,15 +7725,17 @@ static int arch_init_sched_domains(const cpumask_t *cpu_map)
        if (!doms_cur)
                doms_cur = &fallback_doms;
        cpus_andnot(*doms_cur, *cpu_map, cpu_isolated_map);
+       dattr_cur = NULL;
        err = build_sched_domains(doms_cur);
        register_sched_domain_sysctl();
 
        return err;
 }
 
-static void arch_destroy_sched_domains(const cpumask_t *cpu_map)
+static void arch_destroy_sched_domains(const cpumask_t *cpu_map,
+                                      cpumask_t *tmpmask)
 {
-       free_sched_groups(cpu_map);
+       free_sched_groups(cpu_map, tmpmask);
 }
 
 /*
@@ -6886,6 +7744,7 @@ static void arch_destroy_sched_domains(const cpumask_t *cpu_map)
  */
 static void detach_destroy_domains(const cpumask_t *cpu_map)
 {
+       cpumask_t tmpmask;
        int i;
 
        unregister_sched_domain_sysctl();
@@ -6893,7 +7752,23 @@ static void detach_destroy_domains(const cpumask_t *cpu_map)
        for_each_cpu_mask(i, *cpu_map)
                cpu_attach_domain(NULL, &def_root_domain, i);
        synchronize_sched();
-       arch_destroy_sched_domains(cpu_map);
+       arch_destroy_sched_domains(cpu_map, &tmpmask);
+}
+
+/* handle null as "default" */
+static int dattrs_equal(struct sched_domain_attr *cur, int idx_cur,
+                       struct sched_domain_attr *new, int idx_new)
+{
+       struct sched_domain_attr tmp;
+
+       /* fast path */
+       if (!new && !cur)
+               return 1;
+
+       tmp = SD_ATTR_INIT;
+       return !memcmp(cur ? (cur + idx_cur) : &tmp,
+                       new ? (new + idx_new) : &tmp,
+                       sizeof(struct sched_domain_attr));
 }
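
dattrs_equal() treats a NULL attribute array as "all defaults", so a domain built with no attributes compares equal to one built with an explicit default entry and does not get torn down and rebuilt. A self-contained rendition of that comparison (struct layout and SD_ATTR_INIT value are assumed here for illustration):

#include <stdio.h>
#include <string.h>

struct sched_domain_attr { int relax_domain_level; };
#define SD_ATTR_INIT (struct sched_domain_attr){ .relax_domain_level = -1 }

static int dattrs_equal(struct sched_domain_attr *cur, int idx_cur,
                        struct sched_domain_attr *new, int idx_new)
{
        struct sched_domain_attr tmp;

        if (!new && !cur)
                return 1;

        tmp = SD_ATTR_INIT;
        return !memcmp(cur ? (cur + idx_cur) : &tmp,
                       new ? (new + idx_new) : &tmp,
                       sizeof(struct sched_domain_attr));
}

int main(void)
{
        struct sched_domain_attr one[1] = { { .relax_domain_level = -1 } };

        printf("%d\n", dattrs_equal(NULL, 0, one, 0));  /* 1: default == -1 */
        one[0].relax_domain_level = 2;
        printf("%d\n", dattrs_equal(NULL, 0, one, 0));  /* 0: attrs differ  */
        return 0;
}
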
 
 /*
@@ -6917,7 +7792,8 @@ static void detach_destroy_domains(const cpumask_t *cpu_map)
  *
  * Call with hotplug lock held
  */
-void partition_sched_domains(int ndoms_new, cpumask_t *doms_new)
+void partition_sched_domains(int ndoms_new, cpumask_t *doms_new,
+                            struct sched_domain_attr *dattr_new)
 {
        int i, j;
 
@@ -6930,12 +7806,14 @@ void partition_sched_domains(int ndoms_new, cpumask_t *doms_new)
                ndoms_new = 1;
                doms_new = &fallback_doms;
                cpus_andnot(doms_new[0], cpu_online_map, cpu_isolated_map);
+               dattr_new = NULL;
        }
 
        /* Destroy deleted domains */
        for (i = 0; i < ndoms_cur; i++) {
                for (j = 0; j < ndoms_new; j++) {
-                       if (cpus_equal(doms_cur[i], doms_new[j]))
+                       if (cpus_equal(doms_cur[i], doms_new[j])
+                           && dattrs_equal(dattr_cur, i, dattr_new, j))
                                goto match1;
                }
                /* no match - a current sched domain not in new doms_new[] */
@@ -6947,11 +7825,13 @@ match1:
        /* Build new domains */
        for (i = 0; i < ndoms_new; i++) {
                for (j = 0; j < ndoms_cur; j++) {
-                       if (cpus_equal(doms_new[i], doms_cur[j]))
+                       if (cpus_equal(doms_new[i], doms_cur[j])
+                           && dattrs_equal(dattr_new, i, dattr_cur, j))
                                goto match2;
                }
                /* no match - add a new doms_new */
-               build_sched_domains(doms_new + i);
+               __build_sched_domains(doms_new + i,
+                                       dattr_new ? dattr_new + i : NULL);
 match2:
                ;
        }
@@ -6959,7 +7839,9 @@ match2:
        /* Remember the new sched domains */
        if (doms_cur != &fallback_doms)
                kfree(doms_cur);
+       kfree(dattr_cur);       /* kfree(NULL) is safe */
        doms_cur = doms_new;
+       dattr_cur = dattr_new;
        ndoms_cur = ndoms_new;
 
        register_sched_domain_sysctl();
@@ -7086,6 +7968,11 @@ void __init sched_init_smp(void)
 {
        cpumask_t non_isolated_cpus;
 
+#if defined(CONFIG_NUMA)
+       sched_group_nodes_bycpu = kzalloc(nr_cpu_ids * sizeof(void **),
+                                                               GFP_KERNEL);
+       BUG_ON(sched_group_nodes_bycpu == NULL);
+#endif
        get_online_cpus();
        arch_init_sched_domains(&cpu_online_map);
        cpus_andnot(non_isolated_cpus, cpu_possible_map, cpu_isolated_map);
@@ -7096,13 +7983,18 @@ void __init sched_init_smp(void)
        hotcpu_notifier(update_sched_domains, 0);
 
        /* Move init over to a non-isolated CPU */
-       if (set_cpus_allowed(current, non_isolated_cpus) < 0)
+       if (set_cpus_allowed_ptr(current, &non_isolated_cpus) < 0)
                BUG();
        sched_init_granularity();
 }
 #else
 void __init sched_init_smp(void)
 {
+#if defined(CONFIG_NUMA)
+       sched_group_nodes_bycpu = kzalloc(nr_cpu_ids * sizeof(void **),
+                                                               GFP_KERNEL);
+       BUG_ON(sched_group_nodes_bycpu == NULL);
+#endif
        sched_init_granularity();
 }
 #endif /* CONFIG_SMP */
@@ -7117,6 +8009,7 @@ int in_sched_functions(unsigned long addr)
 static void init_cfs_rq(struct cfs_rq *cfs_rq, struct rq *rq)
 {
        cfs_rq->tasks_timeline = RB_ROOT;
+       INIT_LIST_HEAD(&cfs_rq->tasks);
 #ifdef CONFIG_FAIR_GROUP_SCHED
        cfs_rq->rq = rq;
 #endif
@@ -7146,6 +8039,8 @@ static void init_rt_rq(struct rt_rq *rt_rq, struct rq *rq)
 
        rt_rq->rt_time = 0;
        rt_rq->rt_throttled = 0;
+       rt_rq->rt_runtime = 0;
+       spin_lock_init(&rt_rq->rt_runtime_lock);
 
 #ifdef CONFIG_RT_GROUP_SCHED
        rt_rq->rt_nr_boosted = 0;
@@ -7154,10 +8049,11 @@ static void init_rt_rq(struct rt_rq *rt_rq, struct rq *rq)
 }
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-static void init_tg_cfs_entry(struct rq *rq, struct task_group *tg,
-               struct cfs_rq *cfs_rq, struct sched_entity *se,
-               int cpu, int add)
+static void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
+                               struct sched_entity *se, int cpu, int add,
+                               struct sched_entity *parent)
 {
+       struct rq *rq = cpu_rq(cpu);
        tg->cfs_rq[cpu] = cfs_rq;
        init_cfs_rq(cfs_rq, rq);
        cfs_rq->tg = tg;
@@ -7165,45 +8061,132 @@ static void init_tg_cfs_entry(struct rq *rq, struct task_group *tg,
                list_add(&cfs_rq->leaf_cfs_rq_list, &rq->leaf_cfs_rq_list);
 
        tg->se[cpu] = se;
-       se->cfs_rq = &rq->cfs;
+       /* se could be NULL for init_task_group */
+       if (!se)
+               return;
+
+       if (!parent)
+               se->cfs_rq = &rq->cfs;
+       else
+               se->cfs_rq = parent->my_q;
+
        se->my_q = cfs_rq;
        se->load.weight = tg->shares;
        se->load.inv_weight = div64_64(1ULL<<32, se->load.weight);
-       se->parent = NULL;
+       se->parent = parent;
 }
 #endif
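
With the new parent argument, a group's per-cpu sched_entity is queued on its parent group's own runqueue (parent->my_q) instead of always on rq->cfs, which is what allows task-group trees deeper than one level. A toy sketch of that wiring with mock types (names and fields trimmed down, purely illustrative):

#include <stdio.h>

struct cfs_rq;
struct sched_entity { struct cfs_rq *cfs_rq; struct cfs_rq *my_q; };
struct cfs_rq { const char *name; };

/* cut-down version of the linkage done by init_tg_cfs_entry() */
static void link(struct sched_entity *se, struct cfs_rq *own_q,
                 struct cfs_rq *root, struct sched_entity *parent)
{
        se->cfs_rq = parent ? parent->my_q : root;
        se->my_q = own_q;
}

int main(void)
{
        struct cfs_rq root = { "rq->cfs" }, aq = { "A->my_q" }, bq = { "B->my_q" };
        struct sched_entity a, b;       /* group A and its child group A/B */

        link(&a, &aq, &root, NULL);     /* A hangs off the root cfs_rq     */
        link(&b, &bq, &root, &a);       /* B is queued on A's own runqueue */

        printf("A on %s, B on %s\n", a.cfs_rq->name, b.cfs_rq->name);
        return 0;
}
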
 
 #ifdef CONFIG_RT_GROUP_SCHED
-static void init_tg_rt_entry(struct rq *rq, struct task_group *tg,
-               struct rt_rq *rt_rq, struct sched_rt_entity *rt_se,
-               int cpu, int add)
+static void init_tg_rt_entry(struct task_group *tg, struct rt_rq *rt_rq,
+               struct sched_rt_entity *rt_se, int cpu, int add,
+               struct sched_rt_entity *parent)
 {
+       struct rq *rq = cpu_rq(cpu);
+
        tg->rt_rq[cpu] = rt_rq;
        init_rt_rq(rt_rq, rq);
        rt_rq->tg = tg;
        rt_rq->rt_se = rt_se;
+       rt_rq->rt_runtime = tg->rt_bandwidth.rt_runtime;
        if (add)
                list_add(&rt_rq->leaf_rt_rq_list, &rq->leaf_rt_rq_list);
 
        tg->rt_se[cpu] = rt_se;
+       if (!rt_se)
+               return;
+
+       if (!parent)
+               rt_se->rt_rq = &rq->rt;
+       else
+               rt_se->rt_rq = parent->my_q;
+
-       rt_se->rt_rq = &rq->rt;
        rt_se->my_q = rt_rq;
-       rt_se->parent = NULL;
+       rt_se->parent = parent;
        INIT_LIST_HEAD(&rt_se->run_list);
 }
 #endif
 
 void __init sched_init(void)
 {
-       int highest_cpu = 0;
        int i, j;
+       unsigned long alloc_size = 0, ptr;
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+       alloc_size += 2 * nr_cpu_ids * sizeof(void **);
+#endif
+#ifdef CONFIG_RT_GROUP_SCHED
+       alloc_size += 2 * nr_cpu_ids * sizeof(void **);
+#endif
+#ifdef CONFIG_USER_SCHED
+       alloc_size *= 2;
+#endif
+       /*
+        * As sched_init() is called before page_alloc is setup,
+        * we use alloc_bootmem().
+        */
+       if (alloc_size) {
+               ptr = (unsigned long)alloc_bootmem_low(alloc_size);
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+               init_task_group.se = (struct sched_entity **)ptr;
+               ptr += nr_cpu_ids * sizeof(void **);
+
+               init_task_group.cfs_rq = (struct cfs_rq **)ptr;
+               ptr += nr_cpu_ids * sizeof(void **);
+
+#ifdef CONFIG_USER_SCHED
+               root_task_group.se = (struct sched_entity **)ptr;
+               ptr += nr_cpu_ids * sizeof(void **);
+
+               root_task_group.cfs_rq = (struct cfs_rq **)ptr;
+               ptr += nr_cpu_ids * sizeof(void **);
+#endif
+#endif
+#ifdef CONFIG_RT_GROUP_SCHED
+               init_task_group.rt_se = (struct sched_rt_entity **)ptr;
+               ptr += nr_cpu_ids * sizeof(void **);
+
+               init_task_group.rt_rq = (struct rt_rq **)ptr;
+               ptr += nr_cpu_ids * sizeof(void **);
+
+#ifdef CONFIG_USER_SCHED
+               root_task_group.rt_se = (struct sched_rt_entity **)ptr;
+               ptr += nr_cpu_ids * sizeof(void **);
+
+               root_task_group.rt_rq = (struct rt_rq **)ptr;
+               ptr += nr_cpu_ids * sizeof(void **);
+#endif
+#endif
+       }
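
The block above sizes one bootmem allocation up front and then carves it into the per-cpu pointer arrays (se, cfs_rq, rt_se, rt_rq) by walking a cursor through it, instead of issuing several tiny early allocations. The same pattern in a user-space sketch, with malloc standing in for alloc_bootmem_low() and only two arrays carved out:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        int nr_cpu_ids = 4;
        unsigned long alloc_size = 0, ptr;
        void **se, **cfs_rq;

        alloc_size += 2 * nr_cpu_ids * sizeof(void **);

        /* one backing allocation, then hand out consecutive slices of it */
        ptr = (unsigned long)calloc(1, alloc_size);
        if (!ptr)
                return 1;

        se = (void **)ptr;
        ptr += nr_cpu_ids * sizeof(void **);

        cfs_rq = (void **)ptr;
        ptr += nr_cpu_ids * sizeof(void **);

        printf("%p %p\n", (void *)se, (void *)cfs_rq);  /* adjacent slices */
        free(se);
        return 0;
}
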
 
 #ifdef CONFIG_SMP
+       init_aggregate();
        init_defrootdomain();
 #endif
 
+       init_rt_bandwidth(&def_rt_bandwidth,
+                       global_rt_period(), global_rt_runtime());
+
+#ifdef CONFIG_RT_GROUP_SCHED
+       init_rt_bandwidth(&init_task_group.rt_bandwidth,
+                       global_rt_period(), global_rt_runtime());
+#ifdef CONFIG_USER_SCHED
+       init_rt_bandwidth(&root_task_group.rt_bandwidth,
+                       global_rt_period(), RUNTIME_INF);
+#endif
+#endif
+
 #ifdef CONFIG_GROUP_SCHED
        list_add(&init_task_group.list, &task_groups);
+       INIT_LIST_HEAD(&init_task_group.children);
+
+#ifdef CONFIG_USER_SCHED
+       INIT_LIST_HEAD(&root_task_group.children);
+       init_task_group.parent = &root_task_group;
+       list_add(&init_task_group.siblings, &root_task_group.children);
+#endif
 #endif
 
        for_each_possible_cpu(i) {
@@ -7214,26 +8197,68 @@ void __init sched_init(void)
                lockdep_set_class(&rq->lock, &rq->rq_lock_key);
                rq->nr_running = 0;
                rq->clock = 1;
+               update_last_tick_seen(rq);
                init_cfs_rq(&rq->cfs, rq);
                init_rt_rq(&rq->rt, rq);
 #ifdef CONFIG_FAIR_GROUP_SCHED
                init_task_group.shares = init_task_group_load;
                INIT_LIST_HEAD(&rq->leaf_cfs_rq_list);
-               init_tg_cfs_entry(rq, &init_task_group,
+#ifdef CONFIG_CGROUP_SCHED
+               /*
+                * How much cpu bandwidth does init_task_group get?
+                *
+                * In case of task-groups formed thr' the cgroup filesystem, it
+                * gets 100% of the cpu resources in the system. This overall
+                * system cpu resource is divided among the tasks of
+                * init_task_group and its child task-groups in a fair manner,
+                * based on each entity's (task or task-group's) weight
+                * (se->load.weight).
+                *
+                * In other words, if init_task_group has 10 tasks of weight
+                * 1024 and two child groups A0 and A1 (of weight 1024 each),
+                * then A0's share of the cpu resource is:
+                *
+                *      A0's bandwidth = 1024 / (10*1024 + 1024 + 1024) = 8.33%
+                *
+                * We achieve this by letting init_task_group's tasks sit
+                * directly in rq->cfs (i.e init_task_group->se[] = NULL).
+                */
+               init_tg_cfs_entry(&init_task_group, &rq->cfs, NULL, i, 1, NULL);
+#elif defined CONFIG_USER_SCHED
+               root_task_group.shares = NICE_0_LOAD;
+               init_tg_cfs_entry(&root_task_group, &rq->cfs, NULL, i, 0, NULL);
+               /*
+                * In case of task-groups formed thr' the user id of tasks,
+                * init_task_group represents tasks belonging to root user.
+                * Hence it forms a sibling of all subsequent groups formed.
+                * In this case, init_task_group gets only a fraction of overall
+                * system cpu resource, based on the weight assigned to root
+                * user's cpu share (INIT_TASK_GROUP_LOAD). This is accomplished
+                * by letting tasks of init_task_group sit in a separate cfs_rq
+                * (init_cfs_rq) and having one entity represent this group of
+                * tasks in rq->cfs (i.e init_task_group->se[] != NULL).
+                */
+               init_tg_cfs_entry(&init_task_group,
                                &per_cpu(init_cfs_rq, i),
-                               &per_cpu(init_sched_entity, i), i, 1);
+                               &per_cpu(init_sched_entity, i), i, 1,
+                               root_task_group.se[i]);
 
 #endif
+#endif /* CONFIG_FAIR_GROUP_SCHED */
+
+               rq->rt.rt_runtime = def_rt_bandwidth.rt_runtime;
 #ifdef CONFIG_RT_GROUP_SCHED
-               init_task_group.rt_runtime =
-                       sysctl_sched_rt_runtime * NSEC_PER_USEC;
                INIT_LIST_HEAD(&rq->leaf_rt_rq_list);
-               init_tg_rt_entry(rq, &init_task_group,
+#ifdef CONFIG_CGROUP_SCHED
+               init_tg_rt_entry(&init_task_group, &rq->rt, NULL, i, 1, NULL);
+#elif defined CONFIG_USER_SCHED
+               init_tg_rt_entry(&root_task_group, &rq->rt, NULL, i, 0, NULL);
+               init_tg_rt_entry(&init_task_group,
                                &per_cpu(init_rt_rq, i),
-                               &per_cpu(init_sched_rt_entity, i), i, 1);
+                               &per_cpu(init_sched_rt_entity, i), i, 1,
+                               root_task_group.rt_se[i]);
+#endif
 #endif
-               rq->rt_period_expire = 0;
-               rq->rt_throttled = 0;
 
                for (j = 0; j < CPU_LOAD_IDX_MAX; j++)
                        rq->cpu_load[j] = 0;
@@ -7250,7 +8275,6 @@ void __init sched_init(void)
 #endif
                init_rq_hrtick(rq);
                atomic_set(&rq->nr_iowait, 0);
-               highest_cpu = i;
        }
 
        set_load_weight(&init_task);
@@ -7260,7 +8284,6 @@ void __init sched_init(void)
 #endif
 
 #ifdef CONFIG_SMP
-       nr_cpu_ids = highest_cpu + 1;
        open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL);
 #endif
 
@@ -7419,8 +8442,6 @@ void set_curr_task(int cpu, struct task_struct *p)
 
 #endif
 
-#ifdef CONFIG_GROUP_SCHED
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
 static void free_fair_sched_group(struct task_group *tg)
 {
@@ -7437,17 +8458,18 @@ static void free_fair_sched_group(struct task_group *tg)
        kfree(tg->se);
 }
 
-static int alloc_fair_sched_group(struct task_group *tg)
+static
+int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
 {
        struct cfs_rq *cfs_rq;
-       struct sched_entity *se;
+       struct sched_entity *se, *parent_se;
        struct rq *rq;
        int i;
 
-       tg->cfs_rq = kzalloc(sizeof(cfs_rq) * NR_CPUS, GFP_KERNEL);
+       tg->cfs_rq = kzalloc(sizeof(cfs_rq) * nr_cpu_ids, GFP_KERNEL);
        if (!tg->cfs_rq)
                goto err;
-       tg->se = kzalloc(sizeof(se) * NR_CPUS, GFP_KERNEL);
+       tg->se = kzalloc(sizeof(se) * nr_cpu_ids, GFP_KERNEL);
        if (!tg->se)
                goto err;
 
@@ -7466,7 +8488,8 @@ static int alloc_fair_sched_group(struct task_group *tg)
                if (!se)
                        goto err;
 
-               init_tg_cfs_entry(rq, tg, cfs_rq, se, i, 0);
+               parent_se = parent ? parent->se[i] : NULL;
+               init_tg_cfs_entry(tg, cfs_rq, se, i, 0, parent_se);
        }
 
        return 1;
@@ -7490,7 +8513,8 @@ static inline void free_fair_sched_group(struct task_group *tg)
 {
 }
 
-static inline int alloc_fair_sched_group(struct task_group *tg)
+static inline
+int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
 {
        return 1;
 }
@@ -7509,6 +8533,8 @@ static void free_rt_sched_group(struct task_group *tg)
 {
        int i;
 
+       destroy_rt_bandwidth(&tg->rt_bandwidth);
+
        for_each_possible_cpu(i) {
                if (tg->rt_rq)
                        kfree(tg->rt_rq[i]);
@@ -7520,21 +8546,23 @@ static void free_rt_sched_group(struct task_group *tg)
        kfree(tg->rt_se);
 }
 
-static int alloc_rt_sched_group(struct task_group *tg)
+static
+int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
 {
        struct rt_rq *rt_rq;
-       struct sched_rt_entity *rt_se;
+       struct sched_rt_entity *rt_se, *parent_se;
        struct rq *rq;
        int i;
 
-       tg->rt_rq = kzalloc(sizeof(rt_rq) * NR_CPUS, GFP_KERNEL);
+       tg->rt_rq = kzalloc(sizeof(rt_rq) * nr_cpu_ids, GFP_KERNEL);
        if (!tg->rt_rq)
                goto err;
-       tg->rt_se = kzalloc(sizeof(rt_se) * NR_CPUS, GFP_KERNEL);
+       tg->rt_se = kzalloc(sizeof(rt_se) * nr_cpu_ids, GFP_KERNEL);
        if (!tg->rt_se)
                goto err;
 
-       tg->rt_runtime = 0;
+       init_rt_bandwidth(&tg->rt_bandwidth,
+                       ktime_to_ns(def_rt_bandwidth.rt_period), 0);
 
        for_each_possible_cpu(i) {
                rq = cpu_rq(i);
@@ -7549,7 +8577,8 @@ static int alloc_rt_sched_group(struct task_group *tg)
                if (!rt_se)
                        goto err;
 
-               init_tg_rt_entry(rq, tg, rt_rq, rt_se, i, 0);
+               parent_se = parent ? parent->rt_se[i] : NULL;
+               init_tg_rt_entry(tg, rt_rq, rt_se, i, 0, parent_se);
        }
 
        return 1;
@@ -7573,7 +8602,8 @@ static inline void free_rt_sched_group(struct task_group *tg)
 {
 }
 
-static inline int alloc_rt_sched_group(struct task_group *tg)
+static inline
+int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
 {
        return 1;
 }
@@ -7587,6 +8617,7 @@ static inline void unregister_rt_sched_group(struct task_group *tg, int cpu)
 }
 #endif
 
+#ifdef CONFIG_GROUP_SCHED
 static void free_sched_group(struct task_group *tg)
 {
        free_fair_sched_group(tg);
@@ -7595,7 +8626,7 @@ static void free_sched_group(struct task_group *tg)
 }
 
 /* allocate runqueue etc for a new task group */
-struct task_group *sched_create_group(void)
+struct task_group *sched_create_group(struct task_group *parent)
 {
        struct task_group *tg;
        unsigned long flags;
@@ -7605,10 +8636,10 @@ struct task_group *sched_create_group(void)
        if (!tg)
                return ERR_PTR(-ENOMEM);
 
-       if (!alloc_fair_sched_group(tg))
+       if (!alloc_fair_sched_group(tg, parent))
                goto err;
 
-       if (!alloc_rt_sched_group(tg))
+       if (!alloc_rt_sched_group(tg, parent))
                goto err;
 
        spin_lock_irqsave(&task_group_lock, flags);
@@ -7617,6 +8648,12 @@ struct task_group *sched_create_group(void)
                register_rt_sched_group(tg, i);
        }
        list_add_rcu(&tg->list, &task_groups);
+
+       WARN_ON(!parent); /* root should already exist */
+
+       tg->parent = parent;
+       list_add_rcu(&tg->siblings, &parent->children);
+       INIT_LIST_HEAD(&tg->children);
        spin_unlock_irqrestore(&task_group_lock, flags);
 
        return tg;
@@ -7645,6 +8682,7 @@ void sched_destroy_group(struct task_group *tg)
                unregister_rt_sched_group(tg, i);
        }
        list_del_rcu(&tg->list);
+       list_del_rcu(&tg->siblings);
        spin_unlock_irqrestore(&task_group_lock, flags);
 
        /* wait for possible concurrent references to cfs_rqs complete */
@@ -7688,16 +8726,14 @@ void sched_move_task(struct task_struct *tsk)
 
        task_rq_unlock(rq, &flags);
 }
+#endif
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-static void set_se_shares(struct sched_entity *se, unsigned long shares)
+static void __set_se_shares(struct sched_entity *se, unsigned long shares)
 {
        struct cfs_rq *cfs_rq = se->cfs_rq;
-       struct rq *rq = cfs_rq->rq;
        int on_rq;
 
-       spin_lock_irq(&rq->lock);
-
        on_rq = se->on_rq;
        if (on_rq)
                dequeue_entity(cfs_rq, se, 0);
@@ -7707,8 +8743,17 @@ static void set_se_shares(struct sched_entity *se, unsigned long shares)
 
        if (on_rq)
                enqueue_entity(cfs_rq, se, 0);
+}
 
-       spin_unlock_irq(&rq->lock);
+static void set_se_shares(struct sched_entity *se, unsigned long shares)
+{
+       struct cfs_rq *cfs_rq = se->cfs_rq;
+       struct rq *rq = cfs_rq->rq;
+       unsigned long flags;
+
+       spin_lock_irqsave(&rq->lock, flags);
+       __set_se_shares(se, shares);
+       spin_unlock_irqrestore(&rq->lock, flags);
 }
 
 static DEFINE_MUTEX(shares_mutex);
@@ -7718,13 +8763,19 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares)
        int i;
        unsigned long flags;
 
+       /*
+        * We can't change the weight of the root cgroup.
+        */
+       if (!tg->se[0])
+               return -EINVAL;
+
        /*
         * A weight of 0 or 1 can cause arithmetics problems.
         * (The default weight is 1024 - so there's no practical
         *  limitation from this.)
         */
-       if (shares < 2)
-               shares = 2;
+       if (shares < MIN_SHARES)
+               shares = MIN_SHARES;
 
        mutex_lock(&shares_mutex);
        if (tg->shares == shares)
@@ -7733,6 +8784,7 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares)
        spin_lock_irqsave(&task_group_lock, flags);
        for_each_possible_cpu(i)
                unregister_fair_sched_group(tg, i);
+       list_del_rcu(&tg->siblings);
        spin_unlock_irqrestore(&task_group_lock, flags);
 
        /* wait for any ongoing reference to this group to finish */
@@ -7743,8 +8795,13 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares)
         * w/o tripping rebalance_share or load_balance_fair.
         */
        tg->shares = shares;
-       for_each_possible_cpu(i)
-               set_se_shares(tg->se[i], shares);
+       for_each_possible_cpu(i) {
+               /*
+                * force a rebalance
+                */
+               cfs_rq_set_shares(tg->cfs_rq[i], 0);
+               set_se_shares(tg->se[i], shares/nr_cpu_ids);
+       }
 
        /*
         * Enable load balance activity on this group, by inserting it back on
@@ -7753,6 +8810,7 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares)
        spin_lock_irqsave(&task_group_lock, flags);
        for_each_possible_cpu(i)
                register_fair_sched_group(tg, i);
+       list_add_rcu(&tg->siblings, &tg->parent->children);
        spin_unlock_irqrestore(&task_group_lock, flags);
 done:
        mutex_unlock(&shares_mutex);
@@ -7779,26 +8837,58 @@ static unsigned long to_ratio(u64 period, u64 runtime)
        return div64_64(runtime << 16, period);
 }
 
+#ifdef CONFIG_CGROUP_SCHED
+static int __rt_schedulable(struct task_group *tg, u64 period, u64 runtime)
+{
+       struct task_group *tgi, *parent = tg->parent;
+       unsigned long total = 0;
+
+       if (!parent) {
+               if (global_rt_period() < period)
+                       return 0;
+
+               return to_ratio(period, runtime) <
+                       to_ratio(global_rt_period(), global_rt_runtime());
+       }
+
+       if (ktime_to_ns(parent->rt_bandwidth.rt_period) < period)
+               return 0;
+
+       rcu_read_lock();
+       list_for_each_entry_rcu(tgi, &parent->children, siblings) {
+               if (tgi == tg)
+                       continue;
+
+               total += to_ratio(ktime_to_ns(tgi->rt_bandwidth.rt_period),
+                               tgi->rt_bandwidth.rt_runtime);
+       }
+       rcu_read_unlock();
+
+       return total + to_ratio(period, runtime) <
+               to_ratio(ktime_to_ns(parent->rt_bandwidth.rt_period),
+                               parent->rt_bandwidth.rt_runtime);
+}
+#elif defined CONFIG_USER_SCHED
 static int __rt_schedulable(struct task_group *tg, u64 period, u64 runtime)
 {
        struct task_group *tgi;
        unsigned long total = 0;
        unsigned long global_ratio =
-               to_ratio(sysctl_sched_rt_period,
-                        sysctl_sched_rt_runtime < 0 ?
-                               RUNTIME_INF : sysctl_sched_rt_runtime);
+               to_ratio(global_rt_period(), global_rt_runtime());
 
        rcu_read_lock();
        list_for_each_entry_rcu(tgi, &task_groups, list) {
                if (tgi == tg)
                        continue;
 
-               total += to_ratio(period, tgi->rt_runtime);
+               total += to_ratio(ktime_to_ns(tgi->rt_bandwidth.rt_period),
+                               tgi->rt_bandwidth.rt_runtime);
        }
        rcu_read_unlock();
 
        return total + to_ratio(period, runtime) < global_ratio;
 }
+#endif
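
to_ratio() expresses runtime/period as a 16-bit fixed-point fraction, so __rt_schedulable() can sum the ratios of a group's siblings and require that the sum plus the new request stays below the parent's own ratio. A rough user-space rendition of that admission check (plain 64-bit division standing in for div64_64; the RUNTIME_INF case is left out):

#include <stdio.h>
#include <stdint.h>

static unsigned long to_ratio(uint64_t period, uint64_t runtime)
{
        return (unsigned long)((runtime << 16) / period);
}

int main(void)
{
        uint64_t sec = 1000000000ULL;   /* 1 s period everywhere          */
        unsigned long parent = to_ratio(sec, 500000000ULL);   /* 50% cap  */
        unsigned long siblings = to_ratio(sec, 200000000ULL)  /* 20% + 20% */
                               + to_ratio(sec, 200000000ULL);

        /* admitting another 5% still fits, another 15% would not */
        printf("%d\n", siblings + to_ratio(sec, 50000000ULL)  < parent); /* 1 */
        printf("%d\n", siblings + to_ratio(sec, 150000000ULL) < parent); /* 0 */
        return 0;
}
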
 
 /* Must be called with tasklist_lock held */
 static inline int tg_has_rt_tasks(struct task_group *tg)
@@ -7811,19 +8901,14 @@ static inline int tg_has_rt_tasks(struct task_group *tg)
        return 0;
 }
 
-int sched_group_set_rt_runtime(struct task_group *tg, long rt_runtime_us)
+static int tg_set_bandwidth(struct task_group *tg,
+               u64 rt_period, u64 rt_runtime)
 {
-       u64 rt_runtime, rt_period;
-       int err = 0;
-
-       rt_period = (u64)sysctl_sched_rt_period * NSEC_PER_USEC;
-       rt_runtime = (u64)rt_runtime_us * NSEC_PER_USEC;
-       if (rt_runtime_us == -1)
-               rt_runtime = RUNTIME_INF;
+       int i, err = 0;
 
        mutex_lock(&rt_constraints_mutex);
        read_lock(&tasklist_lock);
-       if (rt_runtime_us == 0 && tg_has_rt_tasks(tg)) {
+       if (rt_runtime == 0 && tg_has_rt_tasks(tg)) {
                err = -EBUSY;
                goto unlock;
        }
@@ -7831,7 +8916,19 @@ int sched_group_set_rt_runtime(struct task_group *tg, long rt_runtime_us)
                err = -EINVAL;
                goto unlock;
        }
-       tg->rt_runtime = rt_runtime;
+
+       spin_lock_irq(&tg->rt_bandwidth.rt_runtime_lock);
+       tg->rt_bandwidth.rt_period = ns_to_ktime(rt_period);
+       tg->rt_bandwidth.rt_runtime = rt_runtime;
+
+       for_each_possible_cpu(i) {
+               struct rt_rq *rt_rq = tg->rt_rq[i];
+
+               spin_lock(&rt_rq->rt_runtime_lock);
+               rt_rq->rt_runtime = rt_runtime;
+               spin_unlock(&rt_rq->rt_runtime_lock);
+       }
+       spin_unlock_irq(&tg->rt_bandwidth.rt_runtime_lock);
  unlock:
        read_unlock(&tasklist_lock);
        mutex_unlock(&rt_constraints_mutex);
@@ -7839,19 +8936,109 @@ int sched_group_set_rt_runtime(struct task_group *tg, long rt_runtime_us)
        return err;
 }
 
+int sched_group_set_rt_runtime(struct task_group *tg, long rt_runtime_us)
+{
+       u64 rt_runtime, rt_period;
+
+       rt_period = ktime_to_ns(tg->rt_bandwidth.rt_period);
+       rt_runtime = (u64)rt_runtime_us * NSEC_PER_USEC;
+       if (rt_runtime_us < 0)
+               rt_runtime = RUNTIME_INF;
+
+       return tg_set_bandwidth(tg, rt_period, rt_runtime);
+}
+
 long sched_group_rt_runtime(struct task_group *tg)
 {
        u64 rt_runtime_us;
 
-       if (tg->rt_runtime == RUNTIME_INF)
+       if (tg->rt_bandwidth.rt_runtime == RUNTIME_INF)
                return -1;
 
-       rt_runtime_us = tg->rt_runtime;
+       rt_runtime_us = tg->rt_bandwidth.rt_runtime;
        do_div(rt_runtime_us, NSEC_PER_USEC);
        return rt_runtime_us;
 }
+
+int sched_group_set_rt_period(struct task_group *tg, long rt_period_us)
+{
+       u64 rt_runtime, rt_period;
+
+       rt_period = (u64)rt_period_us * NSEC_PER_USEC;
+       rt_runtime = tg->rt_bandwidth.rt_runtime;
+
+       return tg_set_bandwidth(tg, rt_period, rt_runtime);
+}
+
+long sched_group_rt_period(struct task_group *tg)
+{
+       u64 rt_period_us;
+
+       rt_period_us = ktime_to_ns(tg->rt_bandwidth.rt_period);
+       do_div(rt_period_us, NSEC_PER_USEC);
+       return rt_period_us;
+}
+
+static int sched_rt_global_constraints(void)
+{
+       int ret = 0;
+
+       mutex_lock(&rt_constraints_mutex);
+       if (!__rt_schedulable(NULL, 1, 0))
+               ret = -EINVAL;
+       mutex_unlock(&rt_constraints_mutex);
+
+       return ret;
+}
+#else
+static int sched_rt_global_constraints(void)
+{
+       unsigned long flags;
+       int i;
+
+       spin_lock_irqsave(&def_rt_bandwidth.rt_runtime_lock, flags);
+       for_each_possible_cpu(i) {
+               struct rt_rq *rt_rq = &cpu_rq(i)->rt;
+
+               spin_lock(&rt_rq->rt_runtime_lock);
+               rt_rq->rt_runtime = global_rt_runtime();
+               spin_unlock(&rt_rq->rt_runtime_lock);
+       }
+       spin_unlock_irqrestore(&def_rt_bandwidth.rt_runtime_lock, flags);
+
+       return 0;
+}
 #endif
-#endif /* CONFIG_GROUP_SCHED */
+
+int sched_rt_handler(struct ctl_table *table, int write,
+               struct file *filp, void __user *buffer, size_t *lenp,
+               loff_t *ppos)
+{
+       int ret;
+       int old_period, old_runtime;
+       static DEFINE_MUTEX(mutex);
+
+       mutex_lock(&mutex);
+       old_period = sysctl_sched_rt_period;
+       old_runtime = sysctl_sched_rt_runtime;
+
+       ret = proc_dointvec(table, write, filp, buffer, lenp, ppos);
+
+       if (!ret && write) {
+               ret = sched_rt_global_constraints();
+               if (ret) {
+                       sysctl_sched_rt_period = old_period;
+                       sysctl_sched_rt_runtime = old_runtime;
+               } else {
+                       def_rt_bandwidth.rt_runtime = global_rt_runtime();
+                       def_rt_bandwidth.rt_period =
+                               ns_to_ktime(global_rt_period());
+               }
+       }
+       mutex_unlock(&mutex);
+
+       return ret;
+}
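
sched_rt_handler() backs the sched_rt_period_us and sched_rt_runtime_us sysctls: it snapshots the old values, lets proc_dointvec() store the new ones, re-runs the global admission check, and restores the snapshot if the check fails, otherwise it refreshes def_rt_bandwidth. As in sched_group_set_rt_runtime() above, a negative runtime is treated as unlimited. The validate-or-roll-back shape in isolation (illustrative only, with a dummy validator):

#include <stdio.h>

static int period = 1000000, runtime = 950000;  /* current values, in us */

static int constraints_ok(void)
{
        return runtime <= period;       /* stand-in for the real check */
}

static int set_runtime(int new_runtime)
{
        int old = runtime;

        runtime = new_runtime;          /* tentative write              */
        if (!constraints_ok()) {
                runtime = old;          /* reject: restore the snapshot */
                return -1;
        }
        return 0;                       /* accept: propagate elsewhere  */
}

int main(void)
{
        printf("%d %d\n", set_runtime(2000000), runtime);  /* -1 950000 */
        printf("%d %d\n", set_runtime(800000), runtime);   /*  0 800000 */
        return 0;
}
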
 
 #ifdef CONFIG_CGROUP_SCHED
 
@@ -7865,7 +9052,7 @@ static inline struct task_group *cgroup_tg(struct cgroup *cgrp)
 static struct cgroup_subsys_state *
 cpu_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cgrp)
 {
-       struct task_group *tg;
+       struct task_group *tg, *parent;
 
        if (!cgrp->parent) {
                /* This is early initialization for the top cgroup */
@@ -7873,11 +9060,8 @@ cpu_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cgrp)
                return &init_task_group.css;
        }
 
-       /* we support only 1-level deep hierarchical scheduler atm */
-       if (cgrp->parent->parent)
-               return ERR_PTR(-EINVAL);
-
-       tg = sched_create_group();
+       parent = cgroup_tg(cgrp->parent);
+       tg = sched_create_group(parent);
        if (IS_ERR(tg))
                return ERR_PTR(-ENOMEM);
 
@@ -7901,7 +9085,7 @@ cpu_cgroup_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 {
 #ifdef CONFIG_RT_GROUP_SCHED
        /* Don't accept realtime tasks when there is no way for them to run */
-       if (rt_task(tsk) && cgroup_tg(cgrp)->rt_runtime == 0)
+       if (rt_task(tsk) && cgroup_tg(cgrp)->rt_bandwidth.rt_runtime == 0)
                return -EINVAL;
 #else
        /* We don't support RT-tasks being in separate groups */
@@ -7935,7 +9119,7 @@ static u64 cpu_shares_read_uint(struct cgroup *cgrp, struct cftype *cft)
 #endif
 
 #ifdef CONFIG_RT_GROUP_SCHED
-static int cpu_rt_runtime_write(struct cgroup *cgrp, struct cftype *cft,
+static ssize_t cpu_rt_runtime_write(struct cgroup *cgrp, struct cftype *cft,
                                struct file *file,
                                const char __user *userbuf,
                                size_t nbytes, loff_t *unused_ppos)
@@ -7979,6 +9163,17 @@ static ssize_t cpu_rt_runtime_read(struct cgroup *cgrp, struct cftype *cft,
 
        return simple_read_from_buffer(buf, nbytes, ppos, tmp, len);
 }
+
+static int cpu_rt_period_write_uint(struct cgroup *cgrp, struct cftype *cftype,
+               u64 rt_period_us)
+{
+       return sched_group_set_rt_period(cgroup_tg(cgrp), rt_period_us);
+}
+
+static u64 cpu_rt_period_read_uint(struct cgroup *cgrp, struct cftype *cft)
+{
+       return sched_group_rt_period(cgroup_tg(cgrp));
+}
 #endif
 
 static struct cftype cpu_files[] = {
@@ -7995,6 +9190,11 @@ static struct cftype cpu_files[] = {
                .read = cpu_rt_runtime_read,
                .write = cpu_rt_runtime_write,
        },
+       {
+               .name = "rt_period_us",
+               .read_uint = cpu_rt_period_read_uint,
+               .write_uint = cpu_rt_period_write_uint,
+       },
 #endif
 };
 
@@ -8035,9 +9235,9 @@ struct cpuacct {
 struct cgroup_subsys cpuacct_subsys;
 
 /* return cpu accounting group corresponding to this container */
-static inline struct cpuacct *cgroup_ca(struct cgroup *cont)
+static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
 {
-       return container_of(cgroup_subsys_state(cont, cpuacct_subsys_id),
+       return container_of(cgroup_subsys_state(cgrp, cpuacct_subsys_id),
                            struct cpuacct, css);
 }
 
@@ -8050,7 +9250,7 @@ static inline struct cpuacct *task_ca(struct task_struct *tsk)
 
 /* create a new cpu accounting group */
 static struct cgroup_subsys_state *cpuacct_create(
-       struct cgroup_subsys *ss, struct cgroup *cont)
+       struct cgroup_subsys *ss, struct cgroup *cgrp)
 {
        struct cpuacct *ca = kzalloc(sizeof(*ca), GFP_KERNEL);
 
@@ -8068,18 +9268,18 @@ static struct cgroup_subsys_state *cpuacct_create(
 
 /* destroy an existing cpu accounting group */
 static void
-cpuacct_destroy(struct cgroup_subsys *ss, struct cgroup *cont)
+cpuacct_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
 {
-       struct cpuacct *ca = cgroup_ca(cont);
+       struct cpuacct *ca = cgroup_ca(cgrp);
 
        free_percpu(ca->cpuusage);
        kfree(ca);
 }
 
 /* return total cpu usage (in nanoseconds) of a group */
-static u64 cpuusage_read(struct cgroup *cont, struct cftype *cft)
+static u64 cpuusage_read(struct cgroup *cgrp, struct cftype *cft)
 {
-       struct cpuacct *ca = cgroup_ca(cont);
+       struct cpuacct *ca = cgroup_ca(cgrp);
        u64 totalcpuusage = 0;
        int i;
 
@@ -8098,16 +9298,40 @@ static u64 cpuusage_read(struct cgroup *cont, struct cftype *cft)
        return totalcpuusage;
 }
 
+static int cpuusage_write(struct cgroup *cgrp, struct cftype *cftype,
+                                                               u64 reset)
+{
+       struct cpuacct *ca = cgroup_ca(cgrp);
+       int err = 0;
+       int i;
+
+       if (reset) {
+               err = -EINVAL;
+               goto out;
+       }
+
+       for_each_possible_cpu(i) {
+               u64 *cpuusage = percpu_ptr(ca->cpuusage, i);
+
+               spin_lock_irq(&cpu_rq(i)->lock);
+               *cpuusage = 0;
+               spin_unlock_irq(&cpu_rq(i)->lock);
+       }
+out:
+       return err;
+}
+
 static struct cftype files[] = {
        {
                .name = "usage",
                .read_uint = cpuusage_read,
+               .write_uint = cpuusage_write,
        },
 };
 
-static int cpuacct_populate(struct cgroup_subsys *ss, struct cgroup *cont)
+static int cpuacct_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
 {
-       return cgroup_add_files(cont, ss, files, ARRAY_SIZE(files));
+       return cgroup_add_files(cgrp, ss, files, ARRAY_SIZE(files));
 }
 
 /*
index ef358ba0768353cfd2e8bde743fa4901ab9141a7..f3f4af4b8b0fb8ffa4e24da75f5982358c44cc3e 100644 (file)
@@ -67,14 +67,24 @@ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
                (long long)(p->nvcsw + p->nivcsw),
                p->prio);
 #ifdef CONFIG_SCHEDSTATS
-       SEQ_printf(m, "%9Ld.%06ld %9Ld.%06ld %9Ld.%06ld\n",
+       SEQ_printf(m, "%9Ld.%06ld %9Ld.%06ld %9Ld.%06ld",
                SPLIT_NS(p->se.vruntime),
                SPLIT_NS(p->se.sum_exec_runtime),
                SPLIT_NS(p->se.sum_sleep_runtime));
 #else
-       SEQ_printf(m, "%15Ld %15Ld %15Ld.%06ld %15Ld.%06ld %15Ld.%06ld\n",
+       SEQ_printf(m, "%15Ld %15Ld %15Ld.%06ld %15Ld.%06ld %15Ld.%06ld",
                0LL, 0LL, 0LL, 0L, 0LL, 0L, 0LL, 0L);
 #endif
+
+#ifdef CONFIG_CGROUP_SCHED
+       {
+               char path[64];
+
+               cgroup_path(task_group(p)->css.cgroup, path, sizeof(path));
+               SEQ_printf(m, " %s", path);
+       }
+#endif
+       SEQ_printf(m, "\n");
 }
 
 static void print_rq(struct seq_file *m, struct rq *rq, int rq_cpu)
@@ -109,7 +119,21 @@ void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
        struct sched_entity *last;
        unsigned long flags;
 
-       SEQ_printf(m, "\ncfs_rq\n");
+#if !defined(CONFIG_CGROUP_SCHED) || !defined(CONFIG_USER_SCHED)
+       SEQ_printf(m, "\ncfs_rq[%d]:\n", cpu);
+#else
+       char path[128] = "";
+       struct cgroup *cgroup = NULL;
+       struct task_group *tg = cfs_rq->tg;
+
+       if (tg)
+               cgroup = tg->css.cgroup;
+
+       if (cgroup)
+               cgroup_path(cgroup, path, sizeof(path));
+
+       SEQ_printf(m, "\ncfs_rq[%d]:%s\n", cpu, path);
+#endif
 
        SEQ_printf(m, "  .%-30s: %Ld.%06ld\n", "exec_clock",
                        SPLIT_NS(cfs_rq->exec_clock));
@@ -143,6 +167,11 @@ void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
 #endif
        SEQ_printf(m, "  .%-30s: %ld\n", "nr_spread_over",
                        cfs_rq->nr_spread_over);
+#ifdef CONFIG_FAIR_GROUP_SCHED
+#ifdef CONFIG_SMP
+       SEQ_printf(m, "  .%-30s: %lu\n", "shares", cfs_rq->shares);
+#endif
+#endif
 }
 
 static void print_cpu(struct seq_file *m, int cpu)
@@ -214,7 +243,6 @@ static int sched_debug_show(struct seq_file *m, void *v)
        PN(sysctl_sched_latency);
        PN(sysctl_sched_min_granularity);
        PN(sysctl_sched_wakeup_granularity);
-       PN(sysctl_sched_batch_wakeup_granularity);
        PN(sysctl_sched_child_runs_first);
        P(sysctl_sched_features);
 #undef PN
index 0080968d3e4a88e883a7905f409e22013abc8d55..89fa32b4edf27d500c3d6c644596ffc4afb881f5 100644 (file)
@@ -61,25 +61,15 @@ const_debug unsigned int sysctl_sched_child_runs_first = 1;
  */
 unsigned int __read_mostly sysctl_sched_compat_yield;
 
-/*
- * SCHED_BATCH wake-up granularity.
- * (default: 10 msec * (1 + ilog(ncpus)), units: nanoseconds)
- *
- * This option delays the preemption effects of decoupled workloads
- * and reduces their over-scheduling. Synchronous workloads will still
- * have immediate wakeup/sleep latencies.
- */
-unsigned int sysctl_sched_batch_wakeup_granularity = 10000000UL;
-
 /*
  * SCHED_OTHER wake-up granularity.
- * (default: 5 msec * (1 + ilog(ncpus)), units: nanoseconds)
+ * (default: 10 msec * (1 + ilog(ncpus)), units: nanoseconds)
  *
  * This option delays the preemption effects of decoupled workloads
  * and reduces their over-scheduling. Synchronous workloads will still
  * have immediate wakeup/sleep latencies.
  */
-unsigned int sysctl_sched_wakeup_granularity = 5000000UL;
+unsigned int sysctl_sched_wakeup_granularity = 10000000UL;
 
 const_debug unsigned int sysctl_sched_migration_cost = 500000UL;
 
@@ -87,6 +77,11 @@ const_debug unsigned int sysctl_sched_migration_cost = 500000UL;
  * CFS operations on generic schedulable entities:
  */
 
+static inline struct task_struct *task_of(struct sched_entity *se)
+{
+       return container_of(se, struct task_struct, se);
+}
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 
 /* cpu runqueue to which this cfs_rq is attached */
@@ -98,6 +93,54 @@ static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
 /* An entity is a task if it doesn't "own" a runqueue */
 #define entity_is_task(se)     (!se->my_q)
 
+/* Walk up scheduling entities hierarchy */
+#define for_each_sched_entity(se) \
+               for (; se; se = se->parent)
+
+static inline struct cfs_rq *task_cfs_rq(struct task_struct *p)
+{
+       return p->se.cfs_rq;
+}
+
+/* runqueue on which this entity is (to be) queued */
+static inline struct cfs_rq *cfs_rq_of(struct sched_entity *se)
+{
+       return se->cfs_rq;
+}
+
+/* runqueue "owned" by this group */
+static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
+{
+       return grp->my_q;
+}
+
+/* Given a group's cfs_rq on one cpu, return its corresponding cfs_rq on
+ * another cpu ('this_cpu')
+ */
+static inline struct cfs_rq *cpu_cfs_rq(struct cfs_rq *cfs_rq, int this_cpu)
+{
+       return cfs_rq->tg->cfs_rq[this_cpu];
+}
+
+/* Iterate thr' all leaf cfs_rq's on a runqueue */
+#define for_each_leaf_cfs_rq(rq, cfs_rq) \
+       list_for_each_entry_rcu(cfs_rq, &rq->leaf_cfs_rq_list, leaf_cfs_rq_list)
+
+/* Do the two (enqueued) entities belong to the same group ? */
+static inline int
+is_same_group(struct sched_entity *se, struct sched_entity *pse)
+{
+       if (se->cfs_rq == pse->cfs_rq)
+               return 1;
+
+       return 0;
+}
+
+static inline struct sched_entity *parent_entity(struct sched_entity *se)
+{
+       return se->parent;
+}
+
 #else  /* CONFIG_FAIR_GROUP_SCHED */
 
 static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
@@ -107,13 +150,49 @@ static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
 
 #define entity_is_task(se)     1
 
-#endif /* CONFIG_FAIR_GROUP_SCHED */
+#define for_each_sched_entity(se) \
+               for (; se; se = NULL)
 
-static inline struct task_struct *task_of(struct sched_entity *se)
+static inline struct cfs_rq *task_cfs_rq(struct task_struct *p)
 {
-       return container_of(se, struct task_struct, se);
+       return &task_rq(p)->cfs;
+}
+
+static inline struct cfs_rq *cfs_rq_of(struct sched_entity *se)
+{
+       struct task_struct *p = task_of(se);
+       struct rq *rq = task_rq(p);
+
+       return &rq->cfs;
+}
+
+/* runqueue "owned" by this group */
+static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
+{
+       return NULL;
+}
+
+static inline struct cfs_rq *cpu_cfs_rq(struct cfs_rq *cfs_rq, int this_cpu)
+{
+       return &cpu_rq(this_cpu)->cfs;
+}
+
+#define for_each_leaf_cfs_rq(rq, cfs_rq) \
+               for (cfs_rq = &rq->cfs; cfs_rq; cfs_rq = NULL)
+
+static inline int
+is_same_group(struct sched_entity *se, struct sched_entity *pse)
+{
+       return 1;
+}
+
+static inline struct sched_entity *parent_entity(struct sched_entity *se)
+{
+       return NULL;
 }
 
+#endif /* CONFIG_FAIR_GROUP_SCHED */
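Editor's note: the two halves guarded by CONFIG_FAIR_GROUP_SCHED above differ mainly in what for_each_sched_entity() walks. With group scheduling it climbs the se->parent chain through nested groups; without it, the loop body runs exactly once. A toy, userspace-only sketch of that upward walk (illustrative types and names, not the kernel's):

#include <stdio.h>

struct entity {
        struct entity *parent;          /* NULL at the top level */
        const char *name;
};

/* same shape as for_each_sched_entity() in the group-scheduling case */
#define for_each_entity(e) \
        for (; (e); (e) = (e)->parent)

int main(void)
{
        struct entity root  = { NULL,   "root group" };
        struct entity group = { &root,  "child group" };
        struct entity task  = { &group, "task" };
        struct entity *e = &task;
        int depth = 0;

        /* walking up visits the task first, then each enclosing group */
        for_each_entity(e) {
                printf("level %d: %s\n", depth, e->name);
                depth++;
        }
        printf("hierarchy depth: %d\n", depth);
        return 0;
}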
+
 
 /**************************************************************
  * Scheduling class tree data structure manipulation methods:
@@ -254,6 +333,34 @@ int sched_nr_latency_handler(struct ctl_table *table, int write,
 }
 #endif
 
+/*
+ * delta *= w / rw
+ */
+static inline unsigned long
+calc_delta_weight(unsigned long delta, struct sched_entity *se)
+{
+       for_each_sched_entity(se) {
+               delta = calc_delta_mine(delta,
+                               se->load.weight, &cfs_rq_of(se)->load);
+       }
+
+       return delta;
+}
+
+/*
+ * delta *= rw / w
+ */
+static inline unsigned long
+calc_delta_fair(unsigned long delta, struct sched_entity *se)
+{
+       for_each_sched_entity(se) {
+               delta = calc_delta_mine(delta,
+                               cfs_rq_of(se)->load.weight, &se->load);
+       }
+
+       return delta;
+}
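Editor's note: calc_delta_weight() scales a period down by the entity's share of its runqueue (w/rw at every level of the hierarchy), and calc_delta_fair() applies the inverse ratio (rw/w), which is how __update_curr() later in this file turns executed wall-clock time into vruntime. A plain floating-point sketch of that arithmetic for a single level; the kernel's calc_delta_mine() does the same thing in fixed point with cached inverse weights:

#include <stdio.h>

/* delta scaled by weight/total, i.e. calc_delta_weight() for one level */
static double delta_weight(double delta, double w, double rw)
{
        return delta * w / rw;
}

/* delta scaled by total/weight, i.e. calc_delta_fair() for one level */
static double delta_fair(double delta, double w, double rw)
{
        return delta * rw / w;
}

int main(void)
{
        double period_ns = 20000000.0;  /* 20 ms scheduling period */
        double nice0 = 1024.0;          /* NICE_0_LOAD */
        double rq_weight = 3 * nice0;   /* three nice-0 tasks queued */

        /* each task gets a third of the period... */
        printf("wall-clock slice: %.0f ns\n",
               delta_weight(period_ns, nice0, rq_weight));
        /* ...and, per the rw/w formula above, its vruntime advances 3x wall time */
        printf("vruntime per 1 ms of runtime: %.0f ns\n",
               delta_fair(1000000.0, nice0, rq_weight));
        return 0;
}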
+
 /*
  * The idea is to set a period in which each task runs once.
  *
@@ -283,29 +390,54 @@ static u64 __sched_period(unsigned long nr_running)
  */
 static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-       return calc_delta_mine(__sched_period(cfs_rq->nr_running),
-                              se->load.weight, &cfs_rq->load);
+       return calc_delta_weight(__sched_period(cfs_rq->nr_running), se);
 }
 
 /*
- * We calculate the vruntime slice.
+ * We calculate the vruntime slice of a to be inserted task
  *
- * vs = s/w = p/rw
+ * vs = s*rw/w = p
  */
-static u64 __sched_vslice(unsigned long rq_weight, unsigned long nr_running)
+static u64 sched_vslice_add(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-       u64 vslice = __sched_period(nr_running);
+       unsigned long nr_running = cfs_rq->nr_running;
 
-       vslice *= NICE_0_LOAD;
-       do_div(vslice, rq_weight);
+       if (!se->on_rq)
+               nr_running++;
 
-       return vslice;
+       return __sched_period(nr_running);
 }
 
-static u64 sched_vslice_add(struct cfs_rq *cfs_rq, struct sched_entity *se)
+/*
+ * The goal of calc_delta_asym() is to be asymmetric around NICE_0_LOAD, in
+ * that it favours >=0 over <0.
+ *
+ *   -20         |
+ *               |
+ *     0 --------+-------
+ *             .'
+ *    19     .'
+ *
+ */
+static unsigned long
+calc_delta_asym(unsigned long delta, struct sched_entity *se)
 {
-       return __sched_vslice(cfs_rq->load.weight + se->load.weight,
-                       cfs_rq->nr_running + 1);
+       struct load_weight lw = {
+               .weight = NICE_0_LOAD,
+               .inv_weight = 1UL << (WMULT_SHIFT-NICE_0_SHIFT)
+       };
+
+       for_each_sched_entity(se) {
+               struct load_weight *se_lw = &se->load;
+
+               if (se->load.weight < NICE_0_LOAD)
+                       se_lw = &lw;
+
+               delta = calc_delta_mine(delta,
+                               cfs_rq_of(se)->load.weight, se_lw);
+       }
+
+       return delta;
 }
 
 /*
@@ -322,11 +454,7 @@ __update_curr(struct cfs_rq *cfs_rq, struct sched_entity *curr,
 
        curr->sum_exec_runtime += delta_exec;
        schedstat_add(cfs_rq, exec_clock, delta_exec);
-       delta_exec_weighted = delta_exec;
-       if (unlikely(curr->load.weight != NICE_0_LOAD)) {
-               delta_exec_weighted = calc_delta_fair(delta_exec_weighted,
-                                                       &curr->load);
-       }
+       delta_exec_weighted = calc_delta_fair(delta_exec, curr);
        curr->vruntime += delta_exec_weighted;
 }
 
@@ -413,20 +541,43 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
  * Scheduling class queueing methods:
  */
 
+#if defined CONFIG_SMP && defined CONFIG_FAIR_GROUP_SCHED
+static void
+add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
+{
+       cfs_rq->task_weight += weight;
+}
+#else
+static inline void
+add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
+{
+}
+#endif
+
 static void
 account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
        update_load_add(&cfs_rq->load, se->load.weight);
+       if (!parent_entity(se))
+               inc_cpu_load(rq_of(cfs_rq), se->load.weight);
+       if (entity_is_task(se))
+               add_cfs_task_weight(cfs_rq, se->load.weight);
        cfs_rq->nr_running++;
        se->on_rq = 1;
+       list_add(&se->group_node, &cfs_rq->tasks);
 }
 
 static void
 account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
        update_load_sub(&cfs_rq->load, se->load.weight);
+       if (!parent_entity(se))
+               dec_cpu_load(rq_of(cfs_rq), se->load.weight);
+       if (entity_is_task(se))
+               add_cfs_task_weight(cfs_rq, -se->load.weight);
        cfs_rq->nr_running--;
        se->on_rq = 0;
+       list_del_init(&se->group_node);
 }
 
 static void enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
@@ -510,8 +661,12 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 
        if (!initial) {
                /* sleeps upto a single latency don't count. */
-               if (sched_feat(NEW_FAIR_SLEEPERS))
-                       vruntime -= sysctl_sched_latency;
+               if (sched_feat(NEW_FAIR_SLEEPERS)) {
+                       if (sched_feat(NORMALIZED_SLEEPER))
+                               vruntime -= calc_delta_weight(sysctl_sched_latency, se);
+                       else
+                               vruntime -= sysctl_sched_latency;
+               }
 
                /* ensure we never gain time by being placed backwards. */
                vruntime = max_vruntime(se->vruntime, vruntime);
@@ -627,20 +782,16 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
        se->prev_sum_exec_runtime = se->sum_exec_runtime;
 }
 
+static int
+wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
+
 static struct sched_entity *
 pick_next(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-       s64 diff, gran;
-
        if (!cfs_rq->next)
                return se;
 
-       diff = cfs_rq->next->vruntime - se->vruntime;
-       if (diff < 0)
-               return se;
-
-       gran = calc_delta_fair(sysctl_sched_wakeup_granularity, &cfs_rq->load);
-       if (diff > gran)
+       if (wakeup_preempt_entity(cfs_rq->next, se) != 0)
                return se;
 
        return cfs_rq->next;
@@ -708,101 +859,6 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
  * CFS operations on tasks:
  */
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-
-/* Walk up scheduling entities hierarchy */
-#define for_each_sched_entity(se) \
-               for (; se; se = se->parent)
-
-static inline struct cfs_rq *task_cfs_rq(struct task_struct *p)
-{
-       return p->se.cfs_rq;
-}
-
-/* runqueue on which this entity is (to be) queued */
-static inline struct cfs_rq *cfs_rq_of(struct sched_entity *se)
-{
-       return se->cfs_rq;
-}
-
-/* runqueue "owned" by this group */
-static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
-{
-       return grp->my_q;
-}
-
-/* Given a group's cfs_rq on one cpu, return its corresponding cfs_rq on
- * another cpu ('this_cpu')
- */
-static inline struct cfs_rq *cpu_cfs_rq(struct cfs_rq *cfs_rq, int this_cpu)
-{
-       return cfs_rq->tg->cfs_rq[this_cpu];
-}
-
-/* Iterate thr' all leaf cfs_rq's on a runqueue */
-#define for_each_leaf_cfs_rq(rq, cfs_rq) \
-       list_for_each_entry_rcu(cfs_rq, &rq->leaf_cfs_rq_list, leaf_cfs_rq_list)
-
-/* Do the two (enqueued) entities belong to the same group ? */
-static inline int
-is_same_group(struct sched_entity *se, struct sched_entity *pse)
-{
-       if (se->cfs_rq == pse->cfs_rq)
-               return 1;
-
-       return 0;
-}
-
-static inline struct sched_entity *parent_entity(struct sched_entity *se)
-{
-       return se->parent;
-}
-
-#else  /* CONFIG_FAIR_GROUP_SCHED */
-
-#define for_each_sched_entity(se) \
-               for (; se; se = NULL)
-
-static inline struct cfs_rq *task_cfs_rq(struct task_struct *p)
-{
-       return &task_rq(p)->cfs;
-}
-
-static inline struct cfs_rq *cfs_rq_of(struct sched_entity *se)
-{
-       struct task_struct *p = task_of(se);
-       struct rq *rq = task_rq(p);
-
-       return &rq->cfs;
-}
-
-/* runqueue "owned" by this group */
-static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
-{
-       return NULL;
-}
-
-static inline struct cfs_rq *cpu_cfs_rq(struct cfs_rq *cfs_rq, int this_cpu)
-{
-       return &cpu_rq(this_cpu)->cfs;
-}
-
-#define for_each_leaf_cfs_rq(rq, cfs_rq) \
-               for (cfs_rq = &rq->cfs; cfs_rq; cfs_rq = NULL)
-
-static inline int
-is_same_group(struct sched_entity *se, struct sched_entity *pse)
-{
-       return 1;
-}
-
-static inline struct sched_entity *parent_entity(struct sched_entity *se)
-{
-       return NULL;
-}
-
-#endif /* CONFIG_FAIR_GROUP_SCHED */
-
 #ifdef CONFIG_SCHED_HRTICK
 static void hrtick_start_fair(struct rq *rq, struct task_struct *p)
 {
@@ -916,7 +972,7 @@ static void yield_task_fair(struct rq *rq)
        /*
         * Already in the rightmost position?
         */
-       if (unlikely(rightmost->vruntime < se->vruntime))
+       if (unlikely(!rightmost || rightmost->vruntime < se->vruntime))
                return;
 
        /*
@@ -955,7 +1011,9 @@ static int wake_idle(int cpu, struct task_struct *p)
                return cpu;
 
        for_each_domain(cpu, sd) {
-               if (sd->flags & SD_WAKE_IDLE) {
+               if ((sd->flags & SD_WAKE_IDLE)
+                   || ((sd->flags & SD_WAKE_IDLE_FAR)
+                       && !task_hot(p, task_rq(p)->clock, sd))) {
                        cpus_and(tmp, sd->span, p->cpus_allowed);
                        for_each_cpu_mask(i, tmp) {
                                if (idle_cpu(i)) {
@@ -1099,6 +1157,58 @@ out:
 }
 #endif /* CONFIG_SMP */
 
+static unsigned long wakeup_gran(struct sched_entity *se)
+{
+       unsigned long gran = sysctl_sched_wakeup_granularity;
+
+       /*
+        * More easily preempt - nice tasks, while not making it harder for
+        * + nice tasks.
+        */
+       gran = calc_delta_asym(sysctl_sched_wakeup_granularity, se);
+
+       return gran;
+}
+
+/*
+ * Should 'se' preempt 'curr'.
+ *
+ *             |s1
+ *        |s2
+ *   |s3
+ *         g
+ *      |<--->|c
+ *
+ *  w(c, s1) = -1
+ *  w(c, s2) =  0
+ *  w(c, s3) =  1
+ *
+ */
+static int
+wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se)
+{
+       s64 gran, vdiff = curr->vruntime - se->vruntime;
+
+       if (vdiff < 0)
+               return -1;
+
+       gran = wakeup_gran(curr);
+       if (vdiff > gran)
+               return 1;
+
+       return 0;
+}
+
+/* return depth at which a sched entity is present in the hierarchy */
+static inline int depth_se(struct sched_entity *se)
+{
+       int depth = 0;
+
+       for_each_sched_entity(se)
+               depth++;
+
+       return depth;
+}
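Editor's note: wakeup_gran() scales the granularity through calc_delta_asym(), and wakeup_preempt_entity() then reduces the preemption decision to a three-way comparison of vruntimes against that granule, exactly as the s1/s2/s3 diagram shows. A userspace sketch of the comparison with concrete numbers (plain long long instead of the kernel's s64):

#include <stdio.h>

/*
 * 'se' may preempt 'curr' only when its vruntime is smaller by more than
 * the wakeup granularity; within the window, or if it is ahead, leave curr.
 */
static int wakeup_preempt(long long curr_vruntime, long long se_vruntime,
                          long long gran)
{
        long long vdiff = curr_vruntime - se_vruntime;

        if (vdiff < 0)
                return -1;      /* se is ahead of curr: never preempt */
        if (vdiff > gran)
                return 1;       /* se lags by more than one granule: preempt */
        return 0;               /* within the granularity window */
}

int main(void)
{
        long long gran = 10000000;                                  /* 10 ms */

        printf("%d\n", wakeup_preempt(50000000, 65000000, gran));   /* -1 */
        printf("%d\n", wakeup_preempt(50000000, 48000000, gran));   /*  0 */
        printf("%d\n", wakeup_preempt(50000000, 30000000, gran));   /*  1 */
        return 0;
}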
 
 /*
  * Preempt the current task with a newly woken task if needed:
@@ -1108,7 +1218,7 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p)
        struct task_struct *curr = rq->curr;
        struct cfs_rq *cfs_rq = task_cfs_rq(curr);
        struct sched_entity *se = &curr->se, *pse = &p->se;
-       unsigned long gran;
+       int se_depth, pse_depth;
 
        if (unlikely(rt_prio(p->prio))) {
                update_rq_clock(rq);
@@ -1133,20 +1243,33 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p)
        if (!sched_feat(WAKEUP_PREEMPT))
                return;
 
-       while (!is_same_group(se, pse)) {
+       /*
+        * preemption test can be made between sibling entities who are in the
+        * same cfs_rq i.e who have a common parent. Walk up the hierarchy of
+        * both tasks until we find their ancestors who are siblings of common
+        * parent.
+        */
+
+       /* First walk up until both entities are at same depth */
+       se_depth = depth_se(se);
+       pse_depth = depth_se(pse);
+
+       while (se_depth > pse_depth) {
+               se_depth--;
                se = parent_entity(se);
+       }
+
+       while (pse_depth > se_depth) {
+               pse_depth--;
                pse = parent_entity(pse);
        }
 
-       gran = sysctl_sched_wakeup_granularity;
-       /*
-        * More easily preempt - nice tasks, while not making
-        * it harder for + nice tasks.
-        */
-       if (unlikely(se->load.weight > NICE_0_LOAD))
-               gran = calc_delta_fair(gran, &se->load);
+       while (!is_same_group(se, pse)) {
+               se = parent_entity(se);
+               pse = parent_entity(pse);
+       }
 
-       if (pse->vruntime + gran < se->vruntime)
+       if (wakeup_preempt_entity(se, pse) == 1)
                resched_task(curr);
 }
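Editor's note: because the preemption test is only meaningful between entities queued on the same cfs_rq, the rewritten check_preempt_wakeup() first computes both depths, walks the deeper entity up until the depths match, and then walks both up in lockstep until they share a group. The ancestor-matching idea in isolation (toy node type, not the kernel structures; "same group" is approximated here as "same parent"):

#include <stdio.h>
#include <stddef.h>

struct node {
        struct node *parent;
        const char *name;
};

static int depth_of(const struct node *n)
{
        int depth = 0;

        for (; n; n = n->parent)
                depth++;
        return depth;
}

/* walk both nodes up until they hang off the same parent */
static void equalize(const struct node **a, const struct node **b)
{
        int da = depth_of(*a), db = depth_of(*b);

        while (da > db) { *a = (*a)->parent; da--; }
        while (db > da) { *b = (*b)->parent; db--; }
        while ((*a)->parent != (*b)->parent) {
                *a = (*a)->parent;
                *b = (*b)->parent;
        }
}

int main(void)
{
        struct node root = { NULL,  "root" };
        struct node g1   = { &root, "group1" };
        struct node g2   = { &root, "group2" };
        struct node t1   = { &g1,   "task1" };
        struct node t2   = { &g2,   "task2" };
        const struct node *a = &t1, *b = &t2;

        equalize(&a, &b);
        printf("compare %s against %s\n", a->name, b->name);  /* group1 vs group2 */
        return 0;
}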
 
@@ -1197,15 +1320,27 @@ static void put_prev_task_fair(struct rq *rq, struct task_struct *prev)
  * the current task:
  */
 static struct task_struct *
-__load_balance_iterator(struct cfs_rq *cfs_rq, struct rb_node *curr)
+__load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
 {
-       struct task_struct *p;
+       struct task_struct *p = NULL;
+       struct sched_entity *se;
+
+       if (next == &cfs_rq->tasks)
+               return NULL;
+
+       /* Skip over entities that are not tasks */
+       do {
+               se = list_entry(next, struct sched_entity, group_node);
+               next = next->next;
+       } while (next != &cfs_rq->tasks && !entity_is_task(se));
 
-       if (!curr)
+       if (next == &cfs_rq->tasks)
                return NULL;
 
-       p = rb_entry(curr, struct task_struct, se.run_node);
-       cfs_rq->rb_load_balance_curr = rb_next(curr);
+       cfs_rq->balance_iterator = next;
+
+       if (entity_is_task(se))
+               p = task_of(se);
 
        return p;
 }
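Editor's note: the load-balance iterator now walks cfs_rq->tasks, a plain linked list that can contain group entities as well as tasks, so it has to skip the non-task entries as it advances. The skip-while-iterating pattern on its own, with illustrative types:

#include <stdio.h>
#include <stddef.h>

struct entry {
        struct entry *next;
        int is_task;            /* 0 = group entity, 1 = task */
        const char *name;
};

/* return the first task at or after 'e', or NULL when the list is exhausted */
static struct entry *next_task(struct entry *e)
{
        while (e && !e->is_task)
                e = e->next;
        return e;
}

int main(void)
{
        struct entry t2 = { NULL, 1, "task B" };
        struct entry g1 = { &t2,  0, "group"  };
        struct entry t1 = { &g1,  1, "task A" };
        struct entry *e;

        /* iterate tasks only, the way the balance iterator skips group entities */
        for (e = next_task(&t1); e; e = next_task(e->next))
                printf("balance candidate: %s\n", e->name);
        return 0;
}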
@@ -1214,85 +1349,100 @@ static struct task_struct *load_balance_start_fair(void *arg)
 {
        struct cfs_rq *cfs_rq = arg;
 
-       return __load_balance_iterator(cfs_rq, first_fair(cfs_rq));
+       return __load_balance_iterator(cfs_rq, cfs_rq->tasks.next);
 }
 
 static struct task_struct *load_balance_next_fair(void *arg)
 {
        struct cfs_rq *cfs_rq = arg;
 
-       return __load_balance_iterator(cfs_rq, cfs_rq->rb_load_balance_curr);
+       return __load_balance_iterator(cfs_rq, cfs_rq->balance_iterator);
 }
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-static int cfs_rq_best_prio(struct cfs_rq *cfs_rq)
+static unsigned long
+__load_balance_fair(struct rq *this_rq, int this_cpu, struct rq *busiest,
+               unsigned long max_load_move, struct sched_domain *sd,
+               enum cpu_idle_type idle, int *all_pinned, int *this_best_prio,
+               struct cfs_rq *cfs_rq)
 {
-       struct sched_entity *curr;
-       struct task_struct *p;
-
-       if (!cfs_rq->nr_running || !first_fair(cfs_rq))
-               return MAX_PRIO;
-
-       curr = cfs_rq->curr;
-       if (!curr)
-               curr = __pick_next_entity(cfs_rq);
+       struct rq_iterator cfs_rq_iterator;
 
-       p = task_of(curr);
+       cfs_rq_iterator.start = load_balance_start_fair;
+       cfs_rq_iterator.next = load_balance_next_fair;
+       cfs_rq_iterator.arg = cfs_rq;
 
-       return p->prio;
+       return balance_tasks(this_rq, this_cpu, busiest,
+                       max_load_move, sd, idle, all_pinned,
+                       this_best_prio, &cfs_rq_iterator);
 }
-#endif
 
+#ifdef CONFIG_FAIR_GROUP_SCHED
 static unsigned long
 load_balance_fair(struct rq *this_rq, int this_cpu, struct rq *busiest,
                  unsigned long max_load_move,
                  struct sched_domain *sd, enum cpu_idle_type idle,
                  int *all_pinned, int *this_best_prio)
 {
-       struct cfs_rq *busy_cfs_rq;
        long rem_load_move = max_load_move;
-       struct rq_iterator cfs_rq_iterator;
-
-       cfs_rq_iterator.start = load_balance_start_fair;
-       cfs_rq_iterator.next = load_balance_next_fair;
+       int busiest_cpu = cpu_of(busiest);
+       struct task_group *tg;
 
-       for_each_leaf_cfs_rq(busiest, busy_cfs_rq) {
-#ifdef CONFIG_FAIR_GROUP_SCHED
-               struct cfs_rq *this_cfs_rq;
+       rcu_read_lock();
+       list_for_each_entry(tg, &task_groups, list) {
                long imbalance;
-               unsigned long maxload;
+               unsigned long this_weight, busiest_weight;
+               long rem_load, max_load, moved_load;
+
+               /*
+                * empty group
+                */
+               if (!aggregate(tg, sd)->task_weight)
+                       continue;
+
+               rem_load = rem_load_move * aggregate(tg, sd)->rq_weight;
+               rem_load /= aggregate(tg, sd)->load + 1;
+
+               this_weight = tg->cfs_rq[this_cpu]->task_weight;
+               busiest_weight = tg->cfs_rq[busiest_cpu]->task_weight;
+
+               imbalance = (busiest_weight - this_weight) / 2;
 
-               this_cfs_rq = cpu_cfs_rq(busy_cfs_rq, this_cpu);
+               if (imbalance < 0)
+                       imbalance = busiest_weight;
 
-               imbalance = busy_cfs_rq->load.weight - this_cfs_rq->load.weight;
-               /* Don't pull if this_cfs_rq has more load than busy_cfs_rq */
-               if (imbalance <= 0)
+               max_load = max(rem_load, imbalance);
+               moved_load = __load_balance_fair(this_rq, this_cpu, busiest,
+                               max_load, sd, idle, all_pinned, this_best_prio,
+                               tg->cfs_rq[busiest_cpu]);
+
+               if (!moved_load)
                        continue;
 
-               /* Don't pull more than imbalance/2 */
-               imbalance /= 2;
-               maxload = min(rem_load_move, imbalance);
+               move_group_shares(tg, sd, busiest_cpu, this_cpu);
 
-               *this_best_prio = cfs_rq_best_prio(this_cfs_rq);
-#else
-# define maxload rem_load_move
-#endif
-               /*
-                * pass busy_cfs_rq argument into
-                * load_balance_[start|next]_fair iterators
-                */
-               cfs_rq_iterator.arg = busy_cfs_rq;
-               rem_load_move -= balance_tasks(this_rq, this_cpu, busiest,
-                                              maxload, sd, idle, all_pinned,
-                                              this_best_prio,
-                                              &cfs_rq_iterator);
+               moved_load *= aggregate(tg, sd)->load;
+               moved_load /= aggregate(tg, sd)->rq_weight + 1;
 
-               if (rem_load_move <= 0)
+               rem_load_move -= moved_load;
+               if (rem_load_move < 0)
                        break;
        }
+       rcu_read_unlock();
 
        return max_load_move - rem_load_move;
 }
+#else
+static unsigned long
+load_balance_fair(struct rq *this_rq, int this_cpu, struct rq *busiest,
+                 unsigned long max_load_move,
+                 struct sched_domain *sd, enum cpu_idle_type idle,
+                 int *all_pinned, int *this_best_prio)
+{
+       return __load_balance_fair(this_rq, this_cpu, busiest,
+                       max_load_move, sd, idle, all_pinned,
+                       this_best_prio, &busiest->cfs);
+}
+#endif
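Editor's note: roughly, the group-aware path above converts the remaining move budget into each group's own task-weight units (scaling by rq_weight / load), lets the per-cfs_rq iterator move tasks, and converts the moved amount back (load / rq_weight) before charging the global budget. The aggregate() bookkeeping itself is defined elsewhere in this series; the sketch below only walks through that round-trip scaling with made-up numbers:

#include <stdio.h>

int main(void)
{
        /* illustrative, arbitrary numbers - not taken from a real run */
        long rem_load_move = 2048;   /* remaining budget, in load units */
        long rq_weight     = 1024;   /* group's queued task weight */
        long load          = 4096;   /* group's contribution to cpu load */

        /* budget translated into the group's task-weight units */
        long rem_load = rem_load_move * rq_weight / (load + 1);

        /* pretend the iterator managed to move half of that */
        long moved = rem_load / 2;

        /* translate back into load units before charging the global budget */
        long moved_load = moved * load / (rq_weight + 1);

        printf("per-group target: %ld, moved: %ld, charged: %ld\n",
               rem_load, moved, moved_load);
        return 0;
}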
 
 static int
 move_one_task_fair(struct rq *this_rq, int this_cpu, struct rq *busiest,
@@ -1461,16 +1611,40 @@ static const struct sched_class fair_sched_class = {
 };
 
 #ifdef CONFIG_SCHED_DEBUG
+static void
+print_cfs_rq_tasks(struct seq_file *m, struct cfs_rq *cfs_rq, int depth)
+{
+       struct sched_entity *se;
+
+       if (!cfs_rq)
+               return;
+
+       list_for_each_entry_rcu(se, &cfs_rq->tasks, group_node) {
+               int i;
+
+               for (i = depth; i; i--)
+                       seq_puts(m, "  ");
+
+               seq_printf(m, "%lu %s %lu\n",
+                               se->load.weight,
+                               entity_is_task(se) ? "T" : "G",
+                               calc_delta_weight(SCHED_LOAD_SCALE, se)
+                               );
+               if (!entity_is_task(se))
+                       print_cfs_rq_tasks(m, group_cfs_rq(se), depth + 1);
+       }
+}
+
 static void print_cfs_stats(struct seq_file *m, int cpu)
 {
        struct cfs_rq *cfs_rq;
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-       print_cfs_rq(m, cpu, &cpu_rq(cpu)->cfs);
-#endif
        rcu_read_lock();
        for_each_leaf_cfs_rq(cpu_rq(cpu), cfs_rq)
                print_cfs_rq(m, cpu, cfs_rq);
+
+       seq_printf(m, "\nWeight tree:\n");
+       print_cfs_rq_tasks(m, &cpu_rq(cpu)->cfs, 1);
        rcu_read_unlock();
 }
 #endif
diff --git a/kernel/sched_features.h b/kernel/sched_features.h
new file mode 100644 (file)
index 0000000..1c7283c
--- /dev/null
@@ -0,0 +1,10 @@
+SCHED_FEAT(NEW_FAIR_SLEEPERS, 1)
+SCHED_FEAT(WAKEUP_PREEMPT, 1)
+SCHED_FEAT(START_DEBIT, 1)
+SCHED_FEAT(AFFINE_WAKEUPS, 1)
+SCHED_FEAT(CACHE_HOT_BUDDY, 1)
+SCHED_FEAT(SYNC_WAKEUPS, 1)
+SCHED_FEAT(HRTICK, 1)
+SCHED_FEAT(DOUBLE_TICK, 0)
+SCHED_FEAT(NORMALIZED_SLEEPER, 1)
+SCHED_FEAT(DEADLINE, 1)
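Editor's note: sched_features.h is an x-macro list; it deliberately has no include guards or definitions of its own, and each consumer defines SCHED_FEAT() before pulling it in to stamp out an enum, a default bitmask, or a name table. The kernel-side consumer is elsewhere in this series; the userspace sketch below only illustrates the pattern, with a stand-in list macro instead of the header:

#include <stdio.h>

/* stand-in for sched_features.h: one SCHED_FEAT(name, default) per line */
#define FEATURE_LIST \
        SCHED_FEAT(NEW_FAIR_SLEEPERS, 1) \
        SCHED_FEAT(WAKEUP_PREEMPT, 1) \
        SCHED_FEAT(DOUBLE_TICK, 0)

/* first expansion: an enum of bit positions */
enum {
#define SCHED_FEAT(name, enabled) FEAT_##name,
        FEATURE_LIST
#undef SCHED_FEAT
        NR_FEATURES
};

/* second expansion: the default bitmask built from the per-feature defaults */
static const unsigned int default_features =
#define SCHED_FEAT(name, enabled) (enabled << FEAT_##name) |
        FEATURE_LIST
#undef SCHED_FEAT
        0;

/* third expansion: printable names */
static const char *feature_names[] = {
#define SCHED_FEAT(name, enabled) #name,
        FEATURE_LIST
#undef SCHED_FEAT
};

int main(void)
{
        int i;

        for (i = 0; i < NR_FEATURES; i++)
                printf("%-20s %s\n", feature_names[i],
                       (default_features & (1u << i)) ? "on" : "off");
        return 0;
}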
index 0a6d2e516420516cb1d0c35a5345db1f356f71ab..c2730a5a4f056c04526cb56008b772fe26712072 100644 (file)
@@ -62,7 +62,12 @@ static inline u64 sched_rt_runtime(struct rt_rq *rt_rq)
        if (!rt_rq->tg)
                return RUNTIME_INF;
 
-       return rt_rq->tg->rt_runtime;
+       return rt_rq->rt_runtime;
+}
+
+static inline u64 sched_rt_period(struct rt_rq *rt_rq)
+{
+       return ktime_to_ns(rt_rq->tg->rt_bandwidth.rt_period);
 }
 
 #define for_each_leaf_rt_rq(rt_rq, rq) \
@@ -127,14 +132,39 @@ static int rt_se_boosted(struct sched_rt_entity *rt_se)
        return p->prio != p->normal_prio;
 }
 
+#ifdef CONFIG_SMP
+static inline cpumask_t sched_rt_period_mask(void)
+{
+       return cpu_rq(smp_processor_id())->rd->span;
+}
+#else
+static inline cpumask_t sched_rt_period_mask(void)
+{
+       return cpu_online_map;
+}
+#endif
+
+static inline
+struct rt_rq *sched_rt_period_rt_rq(struct rt_bandwidth *rt_b, int cpu)
+{
+       return container_of(rt_b, struct task_group, rt_bandwidth)->rt_rq[cpu];
+}
+
+static inline struct rt_bandwidth *sched_rt_bandwidth(struct rt_rq *rt_rq)
+{
+       return &rt_rq->tg->rt_bandwidth;
+}
+
 #else
 
 static inline u64 sched_rt_runtime(struct rt_rq *rt_rq)
 {
-       if (sysctl_sched_rt_runtime == -1)
-               return RUNTIME_INF;
+       return rt_rq->rt_runtime;
+}
 
-       return (u64)sysctl_sched_rt_runtime * NSEC_PER_USEC;
+static inline u64 sched_rt_period(struct rt_rq *rt_rq)
+{
+       return ktime_to_ns(def_rt_bandwidth.rt_period);
 }
 
 #define for_each_leaf_rt_rq(rt_rq, rq) \
@@ -173,6 +203,102 @@ static inline int rt_rq_throttled(struct rt_rq *rt_rq)
 {
        return rt_rq->rt_throttled;
 }
+
+static inline cpumask_t sched_rt_period_mask(void)
+{
+       return cpu_online_map;
+}
+
+static inline
+struct rt_rq *sched_rt_period_rt_rq(struct rt_bandwidth *rt_b, int cpu)
+{
+       return &cpu_rq(cpu)->rt;
+}
+
+static inline struct rt_bandwidth *sched_rt_bandwidth(struct rt_rq *rt_rq)
+{
+       return &def_rt_bandwidth;
+}
+
+#endif
+
+static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
+{
+       int i, idle = 1;
+       cpumask_t span;
+
+       if (rt_b->rt_runtime == RUNTIME_INF)
+               return 1;
+
+       span = sched_rt_period_mask();
+       for_each_cpu_mask(i, span) {
+               int enqueue = 0;
+               struct rt_rq *rt_rq = sched_rt_period_rt_rq(rt_b, i);
+               struct rq *rq = rq_of_rt_rq(rt_rq);
+
+               spin_lock(&rq->lock);
+               if (rt_rq->rt_time) {
+                       u64 runtime;
+
+                       spin_lock(&rt_rq->rt_runtime_lock);
+                       runtime = rt_rq->rt_runtime;
+                       rt_rq->rt_time -= min(rt_rq->rt_time, overrun*runtime);
+                       if (rt_rq->rt_throttled && rt_rq->rt_time < runtime) {
+                               rt_rq->rt_throttled = 0;
+                               enqueue = 1;
+                       }
+                       if (rt_rq->rt_time || rt_rq->rt_nr_running)
+                               idle = 0;
+                       spin_unlock(&rt_rq->rt_runtime_lock);
+               }
+
+               if (enqueue)
+                       sched_rt_rq_enqueue(rt_rq);
+               spin_unlock(&rq->lock);
+       }
+
+       return idle;
+}
+
+#ifdef CONFIG_SMP
+static int balance_runtime(struct rt_rq *rt_rq)
+{
+       struct rt_bandwidth *rt_b = sched_rt_bandwidth(rt_rq);
+       struct root_domain *rd = cpu_rq(smp_processor_id())->rd;
+       int i, weight, more = 0;
+       u64 rt_period;
+
+       weight = cpus_weight(rd->span);
+
+       spin_lock(&rt_b->rt_runtime_lock);
+       rt_period = ktime_to_ns(rt_b->rt_period);
+       for_each_cpu_mask(i, rd->span) {
+               struct rt_rq *iter = sched_rt_period_rt_rq(rt_b, i);
+               s64 diff;
+
+               if (iter == rt_rq)
+                       continue;
+
+               spin_lock(&iter->rt_runtime_lock);
+               diff = iter->rt_runtime - iter->rt_time;
+               if (diff > 0) {
+                       do_div(diff, weight);
+                       if (rt_rq->rt_runtime + diff > rt_period)
+                               diff = rt_period - rt_rq->rt_runtime;
+                       iter->rt_runtime -= diff;
+                       rt_rq->rt_runtime += diff;
+                       more = 1;
+                       if (rt_rq->rt_runtime == rt_period) {
+                               spin_unlock(&iter->rt_runtime_lock);
+                               break;
+                       }
+               }
+               spin_unlock(&iter->rt_runtime_lock);
+       }
+       spin_unlock(&rt_b->rt_runtime_lock);
+
+       return more;
+}
 #endif
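Editor's note: balance_runtime() lets a throttled rt_rq borrow unused budget from the other CPUs in its root domain: each donor contributes its spare runtime divided by the domain weight, capped so the borrower never exceeds the period. A standalone numeric sketch of that redistribution (plain arrays instead of rt_rq structures, arbitrary numbers):

#include <stdio.h>

#define NCPUS 4

int main(void)
{
        /* per-cpu RT budget and consumed time, in microseconds (illustrative) */
        long long runtime[NCPUS] = { 950, 950, 950, 950 };
        long long rt_time[NCPUS] = { 990, 100, 200, 300 };  /* cpu0 overran */
        long long period = 1000;
        int borrower = 0, i;

        for (i = 0; i < NCPUS; i++) {
                long long diff;

                if (i == borrower)
                        continue;

                /* spare budget on this donor cpu */
                diff = runtime[i] - rt_time[i];
                if (diff <= 0)
                        continue;

                /* take an equal share, never pushing the borrower past the period */
                diff /= NCPUS;
                if (runtime[borrower] + diff > period)
                        diff = period - runtime[borrower];

                runtime[i] -= diff;
                runtime[borrower] += diff;
                if (runtime[borrower] == period)
                        break;
        }

        for (i = 0; i < NCPUS; i++)
                printf("cpu%d runtime %lld\n", i, runtime[i]);
        return 0;
}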
 
 static inline int rt_se_prio(struct sched_rt_entity *rt_se)
@@ -197,12 +323,24 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
        if (rt_rq->rt_throttled)
                return rt_rq_throttled(rt_rq);
 
+       if (sched_rt_runtime(rt_rq) >= sched_rt_period(rt_rq))
+               return 0;
+
+#ifdef CONFIG_SMP
        if (rt_rq->rt_time > runtime) {
-               struct rq *rq = rq_of_rt_rq(rt_rq);
+               int more;
 
-               rq->rt_throttled = 1;
-               rt_rq->rt_throttled = 1;
+               spin_unlock(&rt_rq->rt_runtime_lock);
+               more = balance_runtime(rt_rq);
+               spin_lock(&rt_rq->rt_runtime_lock);
 
+               if (more)
+                       runtime = sched_rt_runtime(rt_rq);
+       }
+#endif
+
+       if (rt_rq->rt_time > runtime) {
+               rt_rq->rt_throttled = 1;
                if (rt_rq_throttled(rt_rq)) {
                        sched_rt_rq_dequeue(rt_rq);
                        return 1;
@@ -212,29 +350,6 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
        return 0;
 }
 
-static void update_sched_rt_period(struct rq *rq)
-{
-       struct rt_rq *rt_rq;
-       u64 period;
-
-       while (rq->clock > rq->rt_period_expire) {
-               period = (u64)sysctl_sched_rt_period * NSEC_PER_USEC;
-               rq->rt_period_expire += period;
-
-               for_each_leaf_rt_rq(rt_rq, rq) {
-                       u64 runtime = sched_rt_runtime(rt_rq);
-
-                       rt_rq->rt_time -= min(rt_rq->rt_time, runtime);
-                       if (rt_rq->rt_throttled && rt_rq->rt_time < runtime) {
-                               rt_rq->rt_throttled = 0;
-                               sched_rt_rq_enqueue(rt_rq);
-                       }
-               }
-
-               rq->rt_throttled = 0;
-       }
-}
-
 /*
  * Update the current task's runtime statistics. Skip current tasks that
  * are not in our scheduling class.
@@ -259,9 +374,15 @@ static void update_curr_rt(struct rq *rq)
        curr->se.exec_start = rq->clock;
        cpuacct_charge(curr, delta_exec);
 
-       rt_rq->rt_time += delta_exec;
-       if (sched_rt_runtime_exceeded(rt_rq))
-               resched_task(curr);
+       for_each_sched_rt_entity(rt_se) {
+               rt_rq = rt_rq_of_se(rt_se);
+
+               spin_lock(&rt_rq->rt_runtime_lock);
+               rt_rq->rt_time += delta_exec;
+               if (sched_rt_runtime_exceeded(rt_rq))
+                       resched_task(curr);
+               spin_unlock(&rt_rq->rt_runtime_lock);
+       }
 }
 
 static inline
@@ -284,6 +405,11 @@ void inc_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq)
 #ifdef CONFIG_RT_GROUP_SCHED
        if (rt_se_boosted(rt_se))
                rt_rq->rt_nr_boosted++;
+
+       if (rt_rq->tg)
+               start_rt_bandwidth(&rt_rq->tg->rt_bandwidth);
+#else
+       start_rt_bandwidth(&def_rt_bandwidth);
 #endif
 }
 
@@ -353,27 +479,21 @@ static void dequeue_rt_entity(struct sched_rt_entity *rt_se)
 /*
  * Because the prio of an upper entry depends on the lower
  * entries, we must remove entries top - down.
- *
- * XXX: O(1/2 h^2) because we can only walk up, not down the chain.
- *      doesn't matter much for now, as h=2 for GROUP_SCHED.
  */
 static void dequeue_rt_stack(struct task_struct *p)
 {
-       struct sched_rt_entity *rt_se, *top_se;
+       struct sched_rt_entity *rt_se, *back = NULL;
 
-       /*
-        * dequeue all, top - down.
-        */
-       do {
-               rt_se = &p->rt;
-               top_se = NULL;
-               for_each_sched_rt_entity(rt_se) {
-                       if (on_rt_rq(rt_se))
-                               top_se = rt_se;
-               }
-               if (top_se)
-                       dequeue_rt_entity(top_se);
-       } while (top_se);
+       rt_se = &p->rt;
+       for_each_sched_rt_entity(rt_se) {
+               rt_se->back = back;
+               back = rt_se;
+       }
+
+       for (rt_se = back; rt_se; rt_se = rt_se->back) {
+               if (on_rt_rq(rt_se))
+                       dequeue_rt_entity(rt_se);
+       }
 }
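Editor's note: the old dequeue_rt_stack() repeatedly rescanned the chain to find the topmost queued entity, which is why the removed comment called it O(1/2 h^2). The new version walks up once, threading a back pointer through the entities, and then follows those back pointers to dequeue top-down in a single pass. The reversal trick in isolation (toy types, userspace only):

#include <stdio.h>
#include <stddef.h>

struct ent {
        struct ent *parent;     /* bottom-up link that already exists */
        struct ent *back;       /* top-down link threaded on the fly */
        const char *name;
};

int main(void)
{
        struct ent top  = { NULL, NULL, "top group" };
        struct ent mid  = { &top, NULL, "child group" };
        struct ent leaf = { &mid, NULL, "task" };
        struct ent *e, *back = NULL;

        /* first pass: walk up, recording the reverse order in ->back */
        for (e = &leaf; e; e = e->parent) {
                e->back = back;
                back = e;
        }

        /* second pass: 'back' is now the topmost entity; visit top-down */
        for (e = back; e; e = e->back)
                printf("dequeue %s\n", e->name);
        return 0;
}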
 
 /*
@@ -393,6 +513,8 @@ static void enqueue_task_rt(struct rq *rq, struct task_struct *p, int wakeup)
         */
        for_each_sched_rt_entity(rt_se)
                enqueue_rt_entity(rt_se);
+
+       inc_cpu_load(rq, p->se.load.weight);
 }
 
 static void dequeue_task_rt(struct rq *rq, struct task_struct *p, int sleep)
@@ -412,6 +534,8 @@ static void dequeue_task_rt(struct rq *rq, struct task_struct *p, int sleep)
                if (rt_rq && rt_rq->rt_nr_running)
                        enqueue_rt_entity(rt_se);
        }
+
+       dec_cpu_load(rq, p->se.load.weight);
 }
 
 /*
@@ -1001,7 +1125,8 @@ move_one_task_rt(struct rq *this_rq, int this_cpu, struct rq *busiest,
        return 0;
 }
 
-static void set_cpus_allowed_rt(struct task_struct *p, cpumask_t *new_mask)
+static void set_cpus_allowed_rt(struct task_struct *p,
+                               const cpumask_t *new_mask)
 {
        int weight = cpus_weight(*new_mask);
 
index 5b32433e7ee5719cb061956e53ec4ef2a242d903..5bae2e0c3ff293b665e0428ce2942c6e9ac1db0b 100644 (file)
@@ -9,6 +9,11 @@
 static int show_schedstat(struct seq_file *seq, void *v)
 {
        int cpu;
+       int mask_len = NR_CPUS/32 * 9;
+       char *mask_str = kmalloc(mask_len, GFP_KERNEL);
+
+       if (mask_str == NULL)
+               return -ENOMEM;
 
        seq_printf(seq, "version %d\n", SCHEDSTAT_VERSION);
        seq_printf(seq, "timestamp %lu\n", jiffies);
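Editor's note: the new mask_len sizing replaces the old on-stack char mask_str[NR_CPUS] with one heap buffer. Each 32-bit chunk of a cpumask prints as 8 hex digits plus a ',' separator (or the final NUL), hence NR_CPUS/32 * 9 bytes. The arithmetic, spelled out as a throwaway sketch rather than kernel code:

#include <stdio.h>

int main(void)
{
        const int chunk_bits = 32;      /* cpumask is printed 32 bits at a time */
        const int chars_per_chunk = 9;  /* 8 hex digits + ',' or the final NUL */
        int nr_cpus;

        for (nr_cpus = 32; nr_cpus <= 4096; nr_cpus *= 4)
                printf("NR_CPUS=%4d -> mask buffer %4d bytes\n",
                       nr_cpus, nr_cpus / chunk_bits * chars_per_chunk);
        return 0;
}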
@@ -36,9 +41,8 @@ static int show_schedstat(struct seq_file *seq, void *v)
                preempt_disable();
                for_each_domain(cpu, sd) {
                        enum cpu_idle_type itype;
-                       char mask_str[NR_CPUS];
 
-                       cpumask_scnprintf(mask_str, NR_CPUS, sd->span);
+                       cpumask_scnprintf(mask_str, mask_len, sd->span);
                        seq_printf(seq, "domain%d %s", dcount++, mask_str);
                        for (itype = CPU_IDLE; itype < CPU_MAX_IDLE_TYPES;
                                        itype++) {
index 31e9f2a4792847388b524d313bc389bd8cd4cf20..3c44956ee7e2312d30f28bc68a5b9825a7da657a 100644 (file)
@@ -356,7 +356,8 @@ void open_softirq(int nr, void (*action)(struct softirq_action*), void *data)
 /* Tasklets */
 struct tasklet_head
 {
-       struct tasklet_struct *list;
+       struct tasklet_struct *head;
+       struct tasklet_struct **tail;
 };
 
 /* Some compilers disobey section attribute on statics when not
@@ -369,8 +370,9 @@ void __tasklet_schedule(struct tasklet_struct *t)
        unsigned long flags;
 
        local_irq_save(flags);
-       t->next = __get_cpu_var(tasklet_vec).list;
-       __get_cpu_var(tasklet_vec).list = t;
+       t->next = NULL;
+       *__get_cpu_var(tasklet_vec).tail = t;
+       __get_cpu_var(tasklet_vec).tail = &(t->next);
        raise_softirq_irqoff(TASKLET_SOFTIRQ);
        local_irq_restore(flags);
 }
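Editor's note: the tasklet list grows a tail pointer (a pointer to the last ->next slot), so scheduling a tasklet becomes an O(1) append at the end instead of a push onto the head, and the per-CPU lists can later be spliced in order during CPU hotplug (see takeover_tasklets() below). The head/tail technique on its own, outside the per-cpu and irq-disable machinery:

#include <stdio.h>
#include <stddef.h>

struct item {
        struct item *next;
        const char *name;
};

struct list {
        struct item *head;
        struct item **tail;     /* points at head, or at the last ->next */
};

static void list_init(struct list *l)
{
        l->head = NULL;
        l->tail = &l->head;
}

/* O(1) append, mirroring __tasklet_schedule()'s tail manipulation */
static void list_append(struct list *l, struct item *it)
{
        it->next = NULL;
        *l->tail = it;
        l->tail = &it->next;
}

int main(void)
{
        struct list l;
        struct item a = { NULL, "a" }, b = { NULL, "b" }, c = { NULL, "c" };
        struct item *it;

        list_init(&l);
        list_append(&l, &a);
        list_append(&l, &b);
        list_append(&l, &c);

        for (it = l.head; it; it = it->next)
                printf("%s\n", it->name);   /* FIFO order: a b c */
        return 0;
}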
@@ -382,8 +384,9 @@ void __tasklet_hi_schedule(struct tasklet_struct *t)
        unsigned long flags;
 
        local_irq_save(flags);
-       t->next = __get_cpu_var(tasklet_hi_vec).list;
-       __get_cpu_var(tasklet_hi_vec).list = t;
+       t->next = NULL;
+       *__get_cpu_var(tasklet_hi_vec).tail = t;
+       __get_cpu_var(tasklet_hi_vec).tail = &(t->next);
        raise_softirq_irqoff(HI_SOFTIRQ);
        local_irq_restore(flags);
 }
@@ -395,8 +398,9 @@ static void tasklet_action(struct softirq_action *a)
        struct tasklet_struct *list;
 
        local_irq_disable();
-       list = __get_cpu_var(tasklet_vec).list;
-       __get_cpu_var(tasklet_vec).list = NULL;
+       list = __get_cpu_var(tasklet_vec).head;
+       __get_cpu_var(tasklet_vec).head = NULL;
+       __get_cpu_var(tasklet_vec).tail = &__get_cpu_var(tasklet_vec).head;
        local_irq_enable();
 
        while (list) {
@@ -416,8 +420,9 @@ static void tasklet_action(struct softirq_action *a)
                }
 
                local_irq_disable();
-               t->next = __get_cpu_var(tasklet_vec).list;
-               __get_cpu_var(tasklet_vec).list = t;
+               t->next = NULL;
+               *__get_cpu_var(tasklet_vec).tail = t;
+               __get_cpu_var(tasklet_vec).tail = &(t->next);
                __raise_softirq_irqoff(TASKLET_SOFTIRQ);
                local_irq_enable();
        }
@@ -428,8 +433,9 @@ static void tasklet_hi_action(struct softirq_action *a)
        struct tasklet_struct *list;
 
        local_irq_disable();
-       list = __get_cpu_var(tasklet_hi_vec).list;
-       __get_cpu_var(tasklet_hi_vec).list = NULL;
+       list = __get_cpu_var(tasklet_hi_vec).head;
+       __get_cpu_var(tasklet_hi_vec).head = NULL;
+       __get_cpu_var(tasklet_hi_vec).tail = &__get_cpu_var(tasklet_hi_vec).head;
        local_irq_enable();
 
        while (list) {
@@ -449,8 +455,9 @@ static void tasklet_hi_action(struct softirq_action *a)
                }
 
                local_irq_disable();
-               t->next = __get_cpu_var(tasklet_hi_vec).list;
-               __get_cpu_var(tasklet_hi_vec).list = t;
+               t->next = NULL;
+               *__get_cpu_var(tasklet_hi_vec).tail = t;
+               __get_cpu_var(tasklet_hi_vec).tail = &(t->next);
                __raise_softirq_irqoff(HI_SOFTIRQ);
                local_irq_enable();
        }
@@ -487,6 +494,15 @@ EXPORT_SYMBOL(tasklet_kill);
 
 void __init softirq_init(void)
 {
+       int cpu;
+
+       for_each_possible_cpu(cpu) {
+               per_cpu(tasklet_vec, cpu).tail =
+                       &per_cpu(tasklet_vec, cpu).head;
+               per_cpu(tasklet_hi_vec, cpu).tail =
+                       &per_cpu(tasklet_hi_vec, cpu).head;
+       }
+
        open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
        open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
 }
@@ -555,9 +571,12 @@ void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu)
                return;
 
        /* CPU is dead, so no lock needed. */
-       for (i = &per_cpu(tasklet_vec, cpu).list; *i; i = &(*i)->next) {
+       for (i = &per_cpu(tasklet_vec, cpu).head; *i; i = &(*i)->next) {
                if (*i == t) {
                        *i = t->next;
+                       /* If this was the tail element, move the tail ptr */
+                       if (*i == NULL)
+                               per_cpu(tasklet_vec, cpu).tail = i;
                        return;
                }
        }
@@ -566,20 +585,20 @@ void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu)
 
 static void takeover_tasklets(unsigned int cpu)
 {
-       struct tasklet_struct **i;
-
        /* CPU is dead, so no lock needed. */
        local_irq_disable();
 
        /* Find end, append list for that CPU. */
-       for (i = &__get_cpu_var(tasklet_vec).list; *i; i = &(*i)->next);
-       *i = per_cpu(tasklet_vec, cpu).list;
-       per_cpu(tasklet_vec, cpu).list = NULL;
+       *__get_cpu_var(tasklet_vec).tail = per_cpu(tasklet_vec, cpu).head;
+       __get_cpu_var(tasklet_vec).tail = per_cpu(tasklet_vec, cpu).tail;
+       per_cpu(tasklet_vec, cpu).head = NULL;
+       per_cpu(tasklet_vec, cpu).tail = &per_cpu(tasklet_vec, cpu).head;
        raise_softirq_irqoff(TASKLET_SOFTIRQ);
 
-       for (i = &__get_cpu_var(tasklet_hi_vec).list; *i; i = &(*i)->next);
-       *i = per_cpu(tasklet_hi_vec, cpu).list;
-       per_cpu(tasklet_hi_vec, cpu).list = NULL;
+       *__get_cpu_var(tasklet_hi_vec).tail = per_cpu(tasklet_hi_vec, cpu).head;
+       __get_cpu_var(tasklet_hi_vec).tail = per_cpu(tasklet_hi_vec, cpu).tail;
+       per_cpu(tasklet_hi_vec, cpu).head = NULL;
+       per_cpu(tasklet_hi_vec, cpu).tail = &per_cpu(tasklet_hi_vec, cpu).head;
        raise_softirq_irqoff(HI_SOFTIRQ);
 
        local_irq_enable();
index 6f4e0e13f70c337c531be43b0d3cc5296c972c71..e1b2a5b1b105452f3f741fa285c51358075cfb26 100644 (file)
@@ -35,7 +35,7 @@ static int stopmachine(void *cpu)
        int irqs_disabled = 0;
        int prepared = 0;
 
-       set_cpus_allowed(current, cpumask_of_cpu((int)(long)cpu));
+       set_cpus_allowed_ptr(current, &cpumask_of_cpu((int)(long)cpu));
 
        /* Ack: we are alive */
        smp_mb(); /* Theoretically the ack = 0 might not be on this CPU yet. */
index a626116af5db96b58f47434eb47c6af23da6535e..6a0cc71ee88d61e1afdd47e960a841515dad8a3c 100644 (file)
 #ifndef SET_ENDIAN
 # define SET_ENDIAN(a,b)       (-EINVAL)
 #endif
+#ifndef GET_TSC_CTL
+# define GET_TSC_CTL(a)                (-EINVAL)
+#endif
+#ifndef SET_TSC_CTL
+# define SET_TSC_CTL(a)                (-EINVAL)
+#endif
 
 /*
  * this is where the system-wide overflow UID and GID are defined, for
@@ -1737,7 +1743,12 @@ asmlinkage long sys_prctl(int option, unsigned long arg2, unsigned long arg3,
 #else
                        return -EINVAL;
 #endif
-
+               case PR_GET_TSC:
+                       error = GET_TSC_CTL(arg2);
+                       break;
+               case PR_SET_TSC:
+                       error = SET_TSC_CTL(arg2);
+                       break;
                default:
                        error = -EINVAL;
                        break;
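Editor's note: PR_GET_TSC and PR_SET_TSC expose per-process control over the rdtsc instruction on x86. A minimal userspace probe of the interface is sketched below; it assumes an x86 kernel with this support and that <sys/prctl.h> exposes PR_GET_TSC, PR_SET_TSC, PR_TSC_ENABLE and PR_TSC_SIGSEGV (otherwise pull them from <linux/prctl.h>):

#include <stdio.h>
#include <sys/prctl.h>

int main(void)
{
        int tsc_state = 0;

        /* read the current per-process TSC policy */
        if (prctl(PR_GET_TSC, &tsc_state) != 0) {
                perror("PR_GET_TSC");
                return 1;
        }
        printf("rdtsc is currently %s\n",
               tsc_state == PR_TSC_ENABLE ? "enabled" : "trapping (SIGSEGV)");

        /* make rdtsc fault for this process; a later rdtsc would raise SIGSEGV */
        if (prctl(PR_SET_TSC, PR_TSC_SIGSEGV) != 0) {
                perror("PR_SET_TSC");
                return 1;
        }

        /* restore the default before exiting */
        prctl(PR_SET_TSC, PR_TSC_ENABLE);
        return 0;
}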
index b2a2d6889babc898794e7c7fb7907ef59a3b487b..fd3364827ccf0a838c79fe240016c93cedd508e4 100644 (file)
@@ -268,17 +268,6 @@ static struct ctl_table kern_table[] = {
                .extra1         = &min_wakeup_granularity_ns,
                .extra2         = &max_wakeup_granularity_ns,
        },
-       {
-               .ctl_name       = CTL_UNNUMBERED,
-               .procname       = "sched_batch_wakeup_granularity_ns",
-               .data           = &sysctl_sched_batch_wakeup_granularity,
-               .maxlen         = sizeof(unsigned int),
-               .mode           = 0644,
-               .proc_handler   = &proc_dointvec_minmax,
-               .strategy       = &sysctl_intvec,
-               .extra1         = &min_wakeup_granularity_ns,
-               .extra2         = &max_wakeup_granularity_ns,
-       },
        {
                .ctl_name       = CTL_UNNUMBERED,
                .procname       = "sched_child_runs_first",
@@ -318,7 +307,7 @@ static struct ctl_table kern_table[] = {
                .data           = &sysctl_sched_rt_period,
                .maxlen         = sizeof(unsigned int),
                .mode           = 0644,
-               .proc_handler   = &proc_dointvec,
+               .proc_handler   = &sched_rt_handler,
        },
        {
                .ctl_name       = CTL_UNNUMBERED,
@@ -326,7 +315,7 @@ static struct ctl_table kern_table[] = {
                .data           = &sysctl_sched_rt_runtime,
                .maxlen         = sizeof(int),
                .mode           = 0644,
-               .proc_handler   = &proc_dointvec,
+               .proc_handler   = &sched_rt_handler,
        },
        {
                .ctl_name       = CTL_UNNUMBERED,
index 69dba0c71727b70828fbd9b3122392feba4c2f3c..d358d4e3a95806c2ff71544947dee234809fdb88 100644 (file)
@@ -191,7 +191,6 @@ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 void tick_nohz_stop_sched_tick(void)
 {
        unsigned long seq, last_jiffies, next_jiffies, delta_jiffies, flags;
-       unsigned long rt_jiffies;
        struct tick_sched *ts;
        ktime_t last_update, expires, now;
        struct clock_event_device *dev = __get_cpu_var(tick_cpu_device).evtdev;
@@ -243,10 +242,6 @@ void tick_nohz_stop_sched_tick(void)
        next_jiffies = get_next_timer_interrupt(last_jiffies);
        delta_jiffies = next_jiffies - last_jiffies;
 
-       rt_jiffies = rt_needs_cpu(cpu);
-       if (rt_jiffies && rt_jiffies < delta_jiffies)
-               delta_jiffies = rt_jiffies;
-
        if (rcu_needs_cpu(cpu))
                delta_jiffies = 1;
        /*
index a3fa587c350c598063a17c903beeee99fe0f47e0..2d6087c7cf9820fb4a16c43fdd75ed9f33d16bca 100644 (file)
@@ -178,6 +178,7 @@ static void change_clocksource(void)
        if (clock == new)
                return;
 
+       new->cycle_last = 0;
        now = clocksource_read(new);
        nsec =  __get_nsec_offset();
        timespec_add_ns(&xtime, nsec);
@@ -295,6 +296,7 @@ static int timekeeping_resume(struct sys_device *dev)
        timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
        update_xtime_cache(0);
        /* re-base the last cycle value */
+       clock->cycle_last = 0;
        clock->cycle_last = clocksource_read(clock);
        clock->error = 0;
        timekeeping_suspended = 0;
index 7132022a040cc764b1c8cdd53744e7398e05502d..debce602bfddd9f117ec5fb98ad0023176eb7f7b 100644 (file)
@@ -101,7 +101,7 @@ static int sched_create_user(struct user_struct *up)
 {
        int rc = 0;
 
-       up->tg = sched_create_group();
+       up->tg = sched_create_group(&root_task_group);
        if (IS_ERR(up->tg))
                rc = -ENOMEM;
 
@@ -193,6 +193,33 @@ static ssize_t cpu_rt_runtime_store(struct kobject *kobj,
 
 static struct kobj_attribute cpu_rt_runtime_attr =
        __ATTR(cpu_rt_runtime, 0644, cpu_rt_runtime_show, cpu_rt_runtime_store);
+
+static ssize_t cpu_rt_period_show(struct kobject *kobj,
+                                  struct kobj_attribute *attr,
+                                  char *buf)
+{
+       struct user_struct *up = container_of(kobj, struct user_struct, kobj);
+
+       return sprintf(buf, "%lu\n", sched_group_rt_period(up->tg));
+}
+
+static ssize_t cpu_rt_period_store(struct kobject *kobj,
+                                   struct kobj_attribute *attr,
+                                   const char *buf, size_t size)
+{
+       struct user_struct *up = container_of(kobj, struct user_struct, kobj);
+       unsigned long rt_period;
+       int rc;
+
+       sscanf(buf, "%lu", &rt_period);
+
+       rc = sched_group_set_rt_period(up->tg, rt_period);
+
+       return (rc ? rc : size);
+}
+
+static struct kobj_attribute cpu_rt_period_attr =
+       __ATTR(cpu_rt_period, 0644, cpu_rt_period_show, cpu_rt_period_store);
 #endif
 
 /* default attributes per uid directory */
@@ -202,6 +229,7 @@ static struct attribute *uids_attributes[] = {
 #endif
 #ifdef CONFIG_RT_GROUP_SCHED
        &cpu_rt_runtime_attr.attr,
+       &cpu_rt_period_attr.attr,
 #endif
        NULL
 };
index 95de3102bc87c61afe3289b117ffc16099a8df16..623ef24c23812894c1617f0db781fb3065d6bdf2 100644 (file)
@@ -427,6 +427,16 @@ config DEBUG_VM
 
          If unsure, say N.
 
+config DEBUG_WRITECOUNT
+       bool "Debug filesystem writers count"
+       depends on DEBUG_KERNEL
+       help
+         Enable this to catch wrong use of the writers count in struct
+         vfsmount.  This will increase the size of each file struct by
+         32 bits.
+
+         If unsure, say N.
+
 config DEBUG_LIST
        bool "Debug linked list manipulation"
        depends on DEBUG_KERNEL
index 2c9242e3fed01ca348b4171db2c0d89333fa357e..a6939e18d7bb3bc4f8d0caba8004e6b075bd3983 100644 (file)
@@ -315,6 +315,22 @@ int bitmap_scnprintf(char *buf, unsigned int buflen,
 }
 EXPORT_SYMBOL(bitmap_scnprintf);
 
+/**
+ * bitmap_scnprintf_len - return buffer length needed to convert
+ * bitmap to an ASCII hex string.
+ * @len: number of bits to be converted
+ */
+int bitmap_scnprintf_len(unsigned int len)
+{
+       /* we need 9 chars per word for 32 bit words (8 hexdigits + sep/null) */
+       int bitslen = ALIGN(len, CHUNKSZ);
+       int wordlen = CHUNKSZ / 4;
+       int buflen = (bitslen / wordlen) * (wordlen + 1) * sizeof(char);
+
+       return buflen;
+}
+EXPORT_SYMBOL(bitmap_scnprintf_len);
+
 /**
  * __bitmap_parse - convert an ASCII hex string into a bitmap.
  * @buf: pointer to buffer containing string.
index b0012e27fea8796da01cfb2172d1d5930c60d22f..f4026bae6eedadb9e077e3500480fc47feb25624 100644 (file)
@@ -82,9 +82,10 @@ EXPORT_SYMBOL_GPL(percpu_populate);
 int __percpu_populate_mask(void *__pdata, size_t size, gfp_t gfp,
                           cpumask_t *mask)
 {
-       cpumask_t populated = CPU_MASK_NONE;
+       cpumask_t populated;
        int cpu;
 
+       cpus_clear(populated);
        for_each_cpu_mask(cpu, *mask)
                if (unlikely(!percpu_populate(__pdata, size, gfp, cpu))) {
                        __percpu_depopulate_mask(__pdata, &populated);
index 402a504f12283f23cb510ec9ebfce4d0d52eea29..32e796af12a16c756b70051426195e8983227cdc 100644 (file)
@@ -2029,6 +2029,7 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
        int n, val;
        int min_val = INT_MAX;
        int best_node = -1;
+       node_to_cpumask_ptr(tmp, 0);
 
        /* Use the local node if we haven't already */
        if (!node_isset(node, *used_node_mask)) {
@@ -2037,7 +2038,6 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
        }
 
        for_each_node_state(n, N_HIGH_MEMORY) {
-               cpumask_t tmp;
 
                /* Don't want a node to appear more than once */
                if (node_isset(n, *used_node_mask))
@@ -2050,8 +2050,8 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
                val += (n < node);
 
                /* Give preference to headless and unused nodes */
-               tmp = node_to_cpumask(n);
-               if (!cpus_empty(tmp))
+               node_to_cpumask_ptr_next(tmp, n);
+               if (!cpus_empty(*tmp))
                        val += PENALTY_FOR_NODE_WITH_CPUS;
 
                /* Slight preference for less loaded node */
index 8f6ee073c0e3f44a14a40fe1351006b1246e4523..0ceacff56457d10c607a36a51da040af806465d4 100644 (file)
@@ -187,8 +187,8 @@ static int pdflush(void *dummy)
         * This is needed as pdflush's are dynamically created and destroyed.
         * The boottime pdflush's are easily placed w/o these 2 lines.
         */
-       cpus_allowed = cpuset_cpus_allowed(current);
-       set_cpus_allowed(current, cpus_allowed);
+       cpuset_cpus_allowed(current, &cpus_allowed);
+       set_cpus_allowed_ptr(current, &cpus_allowed);
 
        return __pdflush(&my_work);
 }
index 04b308c3bc547f72762521fa5a45170713546e41..03927cb5ec9e119ca06cf337d69de017303d82ba 100644 (file)
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1160,14 +1160,13 @@ static void __cpuinit cpuup_canceled(long cpu)
        struct kmem_cache *cachep;
        struct kmem_list3 *l3 = NULL;
        int node = cpu_to_node(cpu);
+       node_to_cpumask_ptr(mask, node);
 
        list_for_each_entry(cachep, &cache_chain, next) {
                struct array_cache *nc;
                struct array_cache *shared;
                struct array_cache **alien;
-               cpumask_t mask;
 
-               mask = node_to_cpumask(node);
                /* cpu is dead; no one can alloc from it. */
                nc = cachep->array[cpu];
                cachep->array[cpu] = NULL;
@@ -1183,7 +1182,7 @@ static void __cpuinit cpuup_canceled(long cpu)
                if (nc)
                        free_block(cachep, nc->entry, nc->avail, node);
 
-               if (!cpus_empty(mask)) {
+               if (!cpus_empty(*mask)) {
                        spin_unlock_irq(&l3->list_lock);
                        goto free_array_cache;
                }
index 4046434046e68b0a45864c019a095fab088420ee..f80a5b7c057ffc387ba87e50b038f8e3756e1084 100644 (file)
@@ -1647,11 +1647,10 @@ static int kswapd(void *p)
        struct reclaim_state reclaim_state = {
                .reclaimed_slab = 0,
        };
-       cpumask_t cpumask;
+       node_to_cpumask_ptr(cpumask, pgdat->node_id);
 
-       cpumask = node_to_cpumask(pgdat->node_id);
-       if (!cpus_empty(cpumask))
-               set_cpus_allowed(tsk, cpumask);
+       if (!cpus_empty(*cpumask))
+               set_cpus_allowed_ptr(tsk, cpumask);
        current->reclaim_state = &reclaim_state;
 
        /*
@@ -1880,17 +1879,16 @@ out:
 static int __devinit cpu_callback(struct notifier_block *nfb,
                                  unsigned long action, void *hcpu)
 {
-       pg_data_t *pgdat;
-       cpumask_t mask;
        int nid;
 
        if (action == CPU_ONLINE || action == CPU_ONLINE_FROZEN) {
                for_each_node_state(nid, N_HIGH_MEMORY) {
-                       pgdat = NODE_DATA(nid);
-                       mask = node_to_cpumask(pgdat->node_id);
-                       if (any_online_cpu(mask) != NR_CPUS)
+                       pg_data_t *pgdat = NODE_DATA(nid);
+                       node_to_cpumask_ptr(mask, pgdat->node_id);
+
+                       if (any_online_cpu(*mask) < nr_cpu_ids)
                                /* One of our CPUs online: restore mask */
-                               set_cpus_allowed(pgdat->kswapd, mask);
+                               set_cpus_allowed_ptr(pgdat->kswapd, mask);
                }
        }
        return NOTIFY_OK;
index a290e1523297783da4e491ad547e17cdeb5da675..090af78d68b5a4d010757b7d725721a1db3141fc 100644 (file)
@@ -301,7 +301,6 @@ static inline int
 svc_pool_map_set_cpumask(unsigned int pidx, cpumask_t *oldmask)
 {
        struct svc_pool_map *m = &svc_pool_map;
-       unsigned int node; /* or cpu */
 
        /*
         * The caller checks for sv_nrpools > 1, which
@@ -314,16 +313,23 @@ svc_pool_map_set_cpumask(unsigned int pidx, cpumask_t *oldmask)
        default:
                return 0;
        case SVC_POOL_PERCPU:
-               node = m->pool_to[pidx];
+       {
+               unsigned int cpu = m->pool_to[pidx];
+
                *oldmask = current->cpus_allowed;
-               set_cpus_allowed(current, cpumask_of_cpu(node));
+               set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
                return 1;
+       }
        case SVC_POOL_PERNODE:
-               node = m->pool_to[pidx];
+       {
+               unsigned int node = m->pool_to[pidx];
+               node_to_cpumask_ptr(nodecpumask, node);
+
                *oldmask = current->cpus_allowed;
-               set_cpus_allowed(current, node_to_cpumask(node));
+               set_cpus_allowed_ptr(current, nodecpumask);
                return 1;
        }
+       }
 }
 
 /*
index 2851d0d15048e8d958668f2eaa9129220f274010..1454afcc06c48e962298ac9934b3dfbe35965dcc 100644 (file)
@@ -819,7 +819,11 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
                 */
                mode = S_IFSOCK |
                       (SOCK_INODE(sock)->i_mode & ~current->fs->umask);
+               err = mnt_want_write(nd.path.mnt);
+               if (err)
+                       goto out_mknod_dput;
                err = vfs_mknod(nd.path.dentry->d_inode, dentry, mode, 0);
+               mnt_drop_write(nd.path.mnt);
                if (err)
                        goto out_mknod_dput;
                mutex_unlock(&nd.path.dentry->d_inode->i_mutex);