mirror of
https://github.com/torvalds/linux.git
synced 2026-03-08 01:04:41 +01:00
RCU changes for v7.0
RCU Tasks Trace:

Re-implement RCU Tasks Trace in terms of SRCU-fast. Not only does the
reimplementation save more than 500 lines of code, it also enables a new
set of APIs, rcu_read_{,un}lock_tasks_trace(). Compared to the previous
rcu_read_{,un}lock_trace(), the new API avoids task_struct accesses
thanks to the SRCU-fast semantics. As a result, the old
rcu_read_{,un}lock_trace() API is now deprecated.

RCU Torture Test:

- Multiple improvements to kvm-series.sh (parallel runs and
  progress-reporting metrics)
- Add context checks to rcu_torture_timer().
- Make config2csv.sh properly handle comments in .boot files.
- Include the commit description in testid.txt.

Miscellaneous RCU changes:

- Reduce synchronize_rcu() latency by reporting the GP kthread's CPU QS early.
- Use suitable gfp_flags for init_srcu_struct_nodes().
- Fix rcu_read_unlock() deadloop due to softirq.
- Correctly compute the probability of invoking ->exp_current() in rcutorture.
- Make expedited RCU CPU stall warnings detect stall-end races.

RCU nocb:

- Remove the unnecessary WakeOvfIsDeferred wake path and dead callback
  overload handling.
- Extract an nocb_defer_wakeup_cancel() helper.
-----BEGIN PGP SIGNATURE-----
iQFFBAABCAAvFiEEj5IosQTPz8XU1wRHSXnow7UH+rgFAmmARZERHGJvcXVuQGtl
cm5lbC5vcmcACgkQSXnow7UH+rh8SAf+PDIBWAkdbGgs32EfgpFY42RB4CWygH47
YRup/M3+nU0JcBzNnona1srpHBXRySBJQbvRbsOdlM45VoNQ2wPjig/3vFVRUKYx
uqj9Tze00DS74IIGESoTGp0amZde9SS9JakNRoEfTr+Zpj8N6LFERQw0ywUwjR5b
RR6bz7q05TAl3u2BYUAgNdnf3VWWTmj4WYwArlQ+qRFAyGN+TVj8Ezra6+K5TJ7u
SQYrf7WmRGOhHbVVolvVEOVdACccI8dFl3ebJVE2Ky0gp1o3BLPkcDLJ6gBdTCoE
rRrbnkeqs5V7tOkPFDBeUhLPrm1QxrdEDxUQFWjSApbv161sx7AOZA==
=O9QQ
-----END PGP SIGNATURE-----
Merge tag 'rcu.release.v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Boqun Feng:

 - RCU Tasks Trace:

   Re-implement RCU Tasks Trace in terms of SRCU-fast. Not only does
   the reimplementation save more than 500 lines of code, it also
   enables a new set of APIs, rcu_read_{,un}lock_tasks_trace().
   Compared to the previous rcu_read_{,un}lock_trace(), the new API
   avoids task_struct accesses thanks to the SRCU-fast semantics. As a
   result, the old rcu_read_{,un}lock_trace() API is now deprecated.

 - RCU Torture Test:
     - Multiple improvements to kvm-series.sh (parallel runs and
       progress-reporting metrics)
     - Add context checks to rcu_torture_timer()
     - Make config2csv.sh properly handle comments in .boot files
     - Include the commit description in testid.txt

 - Miscellaneous RCU changes:
     - Reduce synchronize_rcu() latency by reporting the GP kthread's
       CPU QS early
     - Use suitable gfp_flags for init_srcu_struct_nodes()
     - Fix rcu_read_unlock() deadloop due to softirq
     - Correctly compute the probability of invoking ->exp_current()
       in rcutorture
     - Make expedited RCU CPU stall warnings detect stall-end races

 - RCU nocb:
     - Remove the unnecessary WakeOvfIsDeferred wake path and dead
       callback overload handling
     - Extract an nocb_defer_wakeup_cancel() helper
* tag 'rcu.release.v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (25 commits)
rcu/nocb: Extract nocb_defer_wakeup_cancel() helper
rcu/nocb: Remove dead callback overload handling
rcu/nocb: Remove unnecessary WakeOvfIsDeferred wake path
rcu: Reduce synchronize_rcu() latency by reporting GP kthread's CPU QS early
srcu: Use suitable gfp_flags for the init_srcu_struct_nodes()
rcu: Fix rcu_read_unlock() deadloop due to softirq
rcutorture: Correctly compute probability to invoke ->exp_current()
rcu: Make expedited RCU CPU stall warnings detect stall-end races
rcutorture: Add --kill-previous option to terminate previous kvm.sh runs
rcutorture: Prevent concurrent kvm.sh runs on same source tree
torture: Include commit discription in testid.txt
torture: Make config2csv.sh properly handle comments in .boot files
torture: Make kvm-series.sh give run numbers and totals
torture: Make kvm-series.sh give build numbers and totals
torture: Parallelize kvm-series.sh guest-OS execution
rcutorture: Add context checks to rcu_torture_timer()
rcutorture: Test rcu_tasks_trace_expedite_current()
srcu: Create an rcu_tasks_trace_expedite_current() function
checkpatch: Deprecate rcu_read_{,un}lock_trace()
rcu: Update Requirements.rst for RCU Tasks Trace
...
This commit is contained in commit ef852baaf6.
27 changed files with 459 additions and 932 deletions.
@@ -2780,12 +2780,12 @@ Tasks Trace RCU
 ~~~~~~~~~~~~~~~

 Some forms of tracing need to sleep in readers, but cannot tolerate
-SRCU's read-side overhead, which includes a full memory barrier in both
-srcu_read_lock() and srcu_read_unlock(). This need is handled by a
-Tasks Trace RCU that uses scheduler locking and IPIs to synchronize with
-readers. Real-time systems that cannot tolerate IPIs may build their
-kernels with ``CONFIG_TASKS_TRACE_RCU_READ_MB=y``, which avoids the IPIs at
-the expense of adding full memory barriers to the read-side primitives.
+SRCU's read-side overhead, which includes a full memory barrier in
+both srcu_read_lock() and srcu_read_unlock(). This need is handled by
+a Tasks Trace RCU API implemented as thin wrappers around SRCU-fast,
+which avoids the read-side memory barriers, at least for architectures
+that apply noinstr to kernel entry/exit code (or that build with
+``CONFIG_TASKS_TRACE_RCU_NO_MB=y``).

 The tasks-trace-RCU API is also reasonably compact,
 consisting of rcu_read_lock_trace(), rcu_read_unlock_trace(),
@@ -6289,13 +6289,6 @@ Kernel parameters
 			dynamically) adjusted. This parameter is intended
 			for use in testing.

-	rcupdate.rcu_task_ipi_delay= [KNL]
-			Set time in jiffies during which RCU tasks will
-			avoid sending IPIs, starting with the beginning
-			of a given grace period. Setting a large
-			number avoids disturbing real-time workloads,
-			but lengthens grace periods.
-
 	rcupdate.rcu_task_lazy_lim= [KNL]
 			Number of callbacks on a given CPU that will
 			cancel laziness on that CPU. Use -1 to disable
@@ -6339,14 +6332,6 @@ Kernel parameters
 			of zero will disable batching. Batching is
 			always disabled for synchronize_rcu_tasks().

-	rcupdate.rcu_tasks_trace_lazy_ms= [KNL]
-			Set timeout in milliseconds RCU Tasks
-			Trace asynchronous callback batching for
-			call_rcu_tasks_trace(). A negative value
-			will take the default. A value of zero will
-			disable batching. Batching is always disabled
-			for synchronize_rcu_tasks_trace().
-
 	rcupdate.rcu_self_test= [KNL]
 			Run the RCU early boot self tests
@@ -175,36 +175,7 @@ void rcu_tasks_torture_stats_print(char *tt, char *tf);
 # define synchronize_rcu_tasks synchronize_rcu
 # endif

-# ifdef CONFIG_TASKS_TRACE_RCU
-// Bits for ->trc_reader_special.b.need_qs field.
-#define TRC_NEED_QS		0x1  // Task needs a quiescent state.
-#define TRC_NEED_QS_CHECKED	0x2  // Task has been checked for needing quiescent state.
-
-u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new);
-void rcu_tasks_trace_qs_blkd(struct task_struct *t);
-
-# define rcu_tasks_trace_qs(t)							\
-	do {									\
-		int ___rttq_nesting = READ_ONCE((t)->trc_reader_nesting);	\
-										\
-		if (unlikely(READ_ONCE((t)->trc_reader_special.b.need_qs) == TRC_NEED_QS) && \
-		    likely(!___rttq_nesting)) {					\
-			rcu_trc_cmpxchg_need_qs((t), TRC_NEED_QS, TRC_NEED_QS_CHECKED); \
-		} else if (___rttq_nesting && ___rttq_nesting != INT_MIN &&	\
-			   !READ_ONCE((t)->trc_reader_special.b.blocked)) {	\
-			rcu_tasks_trace_qs_blkd(t);				\
-		}								\
-	} while (0)
-void rcu_tasks_trace_torture_stats_print(char *tt, char *tf);
-# else
-# define rcu_tasks_trace_qs(t) do { } while (0)
-# endif
-
-#define rcu_tasks_qs(t, preempt)					\
-	do {								\
-		rcu_tasks_classic_qs((t), (preempt));			\
-		rcu_tasks_trace_qs(t);					\
-	} while (0)
+#define rcu_tasks_qs(t, preempt) rcu_tasks_classic_qs((t), (preempt))

 # ifdef CONFIG_TASKS_RUDE_RCU
 void synchronize_rcu_tasks_rude(void);
@@ -12,27 +12,74 @@
 #include <linux/rcupdate.h>
 #include <linux/cleanup.h>

-extern struct lockdep_map rcu_trace_lock_map;
+#ifdef CONFIG_TASKS_TRACE_RCU
+extern struct srcu_struct rcu_tasks_trace_srcu_struct;
+#endif // #ifdef CONFIG_TASKS_TRACE_RCU

-#ifdef CONFIG_DEBUG_LOCK_ALLOC
+#if defined(CONFIG_DEBUG_LOCK_ALLOC) && defined(CONFIG_TASKS_TRACE_RCU)

 static inline int rcu_read_lock_trace_held(void)
 {
-	return lock_is_held(&rcu_trace_lock_map);
+	return srcu_read_lock_held(&rcu_tasks_trace_srcu_struct);
 }

-#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+#else // #if defined(CONFIG_DEBUG_LOCK_ALLOC) && defined(CONFIG_TASKS_TRACE_RCU)

 static inline int rcu_read_lock_trace_held(void)
 {
	return 1;
 }

-#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+#endif // #else // #if defined(CONFIG_DEBUG_LOCK_ALLOC) && defined(CONFIG_TASKS_TRACE_RCU)

 #ifdef CONFIG_TASKS_TRACE_RCU

-void rcu_read_unlock_trace_special(struct task_struct *t);
+/**
+ * rcu_read_lock_tasks_trace - mark beginning of RCU-trace read-side critical section
+ *
+ * When synchronize_rcu_tasks_trace() is invoked by one task, then that
+ * task is guaranteed to block until all other tasks exit their read-side
+ * critical sections. Similarly, if call_rcu_trace() is invoked on one
+ * task while other tasks are within RCU read-side critical sections,
+ * invocation of the corresponding RCU callback is deferred until after
+ * all the other tasks exit their critical sections.
+ *
+ * For more details, please see the documentation for
+ * srcu_read_lock_fast(). For a description of how implicit RCU
+ * readers provide the needed ordering for architectures defining the
+ * ARCH_WANTS_NO_INSTR Kconfig option (and thus promising never to trace
+ * code where RCU is not watching), please see the __srcu_read_lock_fast()
+ * (non-kerneldoc) header comment. Otherwise, the smp_mb() below provides
+ * the needed ordering.
+ */
+static inline struct srcu_ctr __percpu *rcu_read_lock_tasks_trace(void)
+{
+	struct srcu_ctr __percpu *ret = __srcu_read_lock_fast(&rcu_tasks_trace_srcu_struct);
+
+	rcu_try_lock_acquire(&rcu_tasks_trace_srcu_struct.dep_map);
+	if (!IS_ENABLED(CONFIG_TASKS_TRACE_RCU_NO_MB))
+		smp_mb(); // Provide ordering on noinstr-incomplete architectures.
+	return ret;
+}
+
+/**
+ * rcu_read_unlock_tasks_trace - mark end of RCU-trace read-side critical section
+ * @scp: return value from corresponding rcu_read_lock_tasks_trace().
+ *
+ * Pairs with the preceding call to rcu_read_lock_tasks_trace() that
+ * returned the value passed in via scp.
+ *
+ * For more details, please see the documentation for rcu_read_unlock().
+ * For memory-ordering information, please see the header comment for the
+ * rcu_read_lock_tasks_trace() function.
+ */
+static inline void rcu_read_unlock_tasks_trace(struct srcu_ctr __percpu *scp)
+{
+	if (!IS_ENABLED(CONFIG_TASKS_TRACE_RCU_NO_MB))
+		smp_mb(); // Provide ordering on noinstr-incomplete architectures.
+	__srcu_read_unlock_fast(&rcu_tasks_trace_srcu_struct, scp);
+	srcu_lock_release(&rcu_tasks_trace_srcu_struct.dep_map);
+}

 /**
  * rcu_read_lock_trace - mark beginning of RCU-trace read-side critical section
@@ -50,12 +97,15 @@ static inline void rcu_read_lock_trace(void)
 {
 	struct task_struct *t = current;

-	WRITE_ONCE(t->trc_reader_nesting, READ_ONCE(t->trc_reader_nesting) + 1);
-	barrier();
-	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) &&
-	    t->trc_reader_special.b.need_mb)
-		smp_mb(); // Pairs with update-side barriers
-	rcu_lock_acquire(&rcu_trace_lock_map);
+	rcu_try_lock_acquire(&rcu_tasks_trace_srcu_struct.dep_map);
+	if (t->trc_reader_nesting++) {
+		// In case we interrupted a Tasks Trace RCU reader.
+		return;
+	}
+	barrier(); // nesting before scp to protect against interrupt handler.
+	t->trc_reader_scp = __srcu_read_lock_fast(&rcu_tasks_trace_srcu_struct);
+	if (!IS_ENABLED(CONFIG_TASKS_TRACE_RCU_NO_MB))
+		smp_mb(); // Placeholder for more selective ordering
 }

 /**
@@ -69,26 +119,88 @@ static inline void rcu_read_lock_trace(void)
  */
 static inline void rcu_read_unlock_trace(void)
 {
-	int nesting;
+	struct srcu_ctr __percpu *scp;
 	struct task_struct *t = current;

-	rcu_lock_release(&rcu_trace_lock_map);
-	nesting = READ_ONCE(t->trc_reader_nesting) - 1;
-	barrier(); // Critical section before disabling.
-	// Disable IPI-based setting of .need_qs.
-	WRITE_ONCE(t->trc_reader_nesting, INT_MIN + nesting);
-	if (likely(!READ_ONCE(t->trc_reader_special.s)) || nesting) {
-		WRITE_ONCE(t->trc_reader_nesting, nesting);
-		return;  // We assume shallow reader nesting.
+	scp = t->trc_reader_scp;
+	barrier(); // scp before nesting to protect against interrupt handler.
+	if (!--t->trc_reader_nesting) {
+		if (!IS_ENABLED(CONFIG_TASKS_TRACE_RCU_NO_MB))
+			smp_mb(); // Placeholder for more selective ordering
+		__srcu_read_unlock_fast(&rcu_tasks_trace_srcu_struct, scp);
 	}
-	WARN_ON_ONCE(nesting != 0);
-	rcu_read_unlock_trace_special(t);
+	srcu_lock_release(&rcu_tasks_trace_srcu_struct.dep_map);
 }

-void call_rcu_tasks_trace(struct rcu_head *rhp, rcu_callback_t func);
-void synchronize_rcu_tasks_trace(void);
-void rcu_barrier_tasks_trace(void);
-struct task_struct *get_rcu_tasks_trace_gp_kthread(void);
+/**
+ * call_rcu_tasks_trace() - Queue a callback trace task-based grace period
+ * @rhp: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a trace rcu-tasks
+ * grace period elapses, in other words after all currently executing
+ * trace rcu-tasks read-side critical sections have completed. These
+ * read-side critical sections are delimited by calls to rcu_read_lock_trace()
+ * and rcu_read_unlock_trace().
+ *
+ * See the description of call_rcu() for more detailed information on
+ * memory ordering guarantees.
+ */
+static inline void call_rcu_tasks_trace(struct rcu_head *rhp, rcu_callback_t func)
+{
+	call_srcu(&rcu_tasks_trace_srcu_struct, rhp, func);
+}
+
+/**
+ * synchronize_rcu_tasks_trace - wait for a trace rcu-tasks grace period
+ *
+ * Control will return to the caller some time after a trace rcu-tasks
+ * grace period has elapsed, in other words after all currently executing
+ * trace rcu-tasks read-side critical sections have completed. These read-side
+ * critical sections are delimited by calls to rcu_read_lock_trace()
+ * and rcu_read_unlock_trace().
+ *
+ * This is a very specialized primitive, intended only for a few uses in
+ * tracing and other situations requiring manipulation of function preambles
+ * and profiling hooks. The synchronize_rcu_tasks_trace() function is not
+ * (yet) intended for heavy use from multiple CPUs.
+ *
+ * See the description of synchronize_rcu() for more detailed information
+ * on memory ordering guarantees.
+ */
+static inline void synchronize_rcu_tasks_trace(void)
+{
+	synchronize_srcu(&rcu_tasks_trace_srcu_struct);
+}
+
+/**
+ * rcu_barrier_tasks_trace - Wait for in-flight call_rcu_tasks_trace() callbacks.
+ *
+ * Note that rcu_barrier_tasks_trace() is not obligated to actually wait,
+ * for example, if there are no pending callbacks.
+ */
+static inline void rcu_barrier_tasks_trace(void)
+{
+	srcu_barrier(&rcu_tasks_trace_srcu_struct);
+}
+
+/**
+ * rcu_tasks_trace_expedite_current - Expedite the current Tasks Trace RCU grace period
+ *
+ * Cause the current Tasks Trace RCU grace period to become expedited.
+ * The grace period following the current one might also be expedited.
+ * If there is no current grace period, one might be created. If the
+ * current grace period is currently sleeping, that sleep will complete
+ * before expediting will take effect.
+ */
+static inline void rcu_tasks_trace_expedite_current(void)
+{
+	srcu_expedite_current(&rcu_tasks_trace_srcu_struct);
+}
+
+// Placeholders to enable stepwise transition.
+void __init rcu_tasks_trace_suppress_unused(void);

 #else
 /*
  * The BPF JIT forms these addresses even when it doesn't call these
@@ -945,11 +945,7 @@ struct task_struct {

 #ifdef CONFIG_TASKS_TRACE_RCU
 	int				trc_reader_nesting;
-	int				trc_ipi_to_cpu;
-	union rcu_special		trc_reader_special;
-	struct list_head		trc_holdout_list;
-	struct list_head		trc_blkd_node;
-	int				trc_blkd_cpu;
+	struct srcu_ctr __percpu	*trc_reader_scp;
 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */

 	struct sched_info		sched_info;
@@ -195,9 +195,6 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
 #endif
 #ifdef CONFIG_TASKS_TRACE_RCU
 	.trc_reader_nesting = 0,
-	.trc_reader_special.s = 0,
-	.trc_holdout_list = LIST_HEAD_INIT(init_task.trc_holdout_list),
-	.trc_blkd_node = LIST_HEAD_INIT(init_task.trc_blkd_node),
 #endif
 #ifdef CONFIG_CPUSETS
 	.mems_allowed_seq = SEQCNT_SPINLOCK_ZERO(init_task.mems_allowed_seq,
@@ -54,24 +54,6 @@ static __always_inline void rcu_task_enter(void)
 #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
 }

-/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
-static __always_inline void rcu_task_trace_heavyweight_enter(void)
-{
-#ifdef CONFIG_TASKS_TRACE_RCU
-	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
-		current->trc_reader_special.b.need_mb = true;
-#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
-}
-
-/* Turn off heavyweight RCU tasks trace readers on kernel entry. */
-static __always_inline void rcu_task_trace_heavyweight_exit(void)
-{
-#ifdef CONFIG_TASKS_TRACE_RCU
-	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
-		current->trc_reader_special.b.need_mb = false;
-#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
-}
-
 /*
  * Record entry into an extended quiescent state. This is only to be
  * called when not already in an extended quiescent state, that is,
@@ -85,7 +67,6 @@ static noinstr void ct_kernel_exit_state(int offset)
	 * critical sections, and we also must force ordering with the
	 * next idle sojourn.
	 */
-	rcu_task_trace_heavyweight_enter(); // Before CT state update!
	// RCU is still watching. Better not be in extended quiescent state!
	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !rcu_is_watching_curr_cpu());
	(void)ct_state_inc(offset);
@@ -108,7 +89,6 @@ static noinstr void ct_kernel_enter_state(int offset)
	 */
	seq = ct_state_inc(offset);
	// RCU is now watching. Better not be in an extended quiescent state!
-	rcu_task_trace_heavyweight_exit(); // After CT state update!
	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & CT_RCU_WATCHING));
 }
@@ -1828,9 +1828,6 @@ static inline void rcu_copy_process(struct task_struct *p)
 #endif /* #ifdef CONFIG_TASKS_RCU */
 #ifdef CONFIG_TASKS_TRACE_RCU
	p->trc_reader_nesting = 0;
-	p->trc_reader_special.s = 0;
-	INIT_LIST_HEAD(&p->trc_holdout_list);
-	INIT_LIST_HEAD(&p->trc_blkd_node);
 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
 }
@@ -82,7 +82,7 @@ config NEED_SRCU_NMI_SAFE
	def_bool HAVE_NMI && !ARCH_HAS_NMI_SAFE_THIS_CPU_OPS && !TINY_SRCU

 config TASKS_RCU_GENERIC
-	def_bool TASKS_RCU || TASKS_RUDE_RCU || TASKS_TRACE_RCU
+	def_bool TASKS_RCU || TASKS_RUDE_RCU
	help
	  This option enables generic infrastructure code supporting
	  task-based RCU implementations. Not for manual selection.
@@ -142,6 +142,29 @@ config TASKS_TRACE_RCU
	default n
	select IRQ_WORK

+config TASKS_TRACE_RCU_NO_MB
+	bool "Override RCU Tasks Trace inclusion of read-side memory barriers"
+	depends on RCU_EXPERT && TASKS_TRACE_RCU
+	default ARCH_WANTS_NO_INSTR
+	help
+	  This option prevents the use of read-side memory barriers in
+	  rcu_read_lock_tasks_trace() and rcu_read_unlock_tasks_trace()
+	  even in kernels built with CONFIG_ARCH_WANTS_NO_INSTR=n, that is,
+	  in kernels that do not have noinstr set up in entry/exit code.
+	  By setting this option, you are promising to carefully review
+	  use of ftrace, BPF, and friends to ensure that no tracing
+	  operation is attached to a function that runs in that portion
+	  of the entry/exit code that RCU does not watch, that is,
+	  where rcu_is_watching() returns false. Alternatively, you
+	  might choose to never remove traces except by rebooting.
+
+	  Those wishing to disable read-side memory barriers for an entire
+	  architecture can select this Kconfig option, hence the polarity.
+
+	  Say Y here if you need speed and will review use of tracing.
+	  Say N here for certain esoteric testing of RCU itself.
+	  Take the default if you are unsure.
+
 config RCU_STALL_COMMON
	def_bool TREE_RCU
	help
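Per the new Kconfig help text above, a kernel that wants the barrier-free readers on an architecture without ARCH_WANTS_NO_INSTR would, assuming the reviewer discipline the help text demands, carry a .config fragment along these lines (a sketch, not a recommendation):

```
CONFIG_RCU_EXPERT=y
CONFIG_TASKS_TRACE_RCU=y
CONFIG_TASKS_TRACE_RCU_NO_MB=y
```

Both RCU_EXPERT and TASKS_TRACE_RCU are required because of the option's `depends on` line; on ARCH_WANTS_NO_INSTR architectures the NO_MB setting is already the default.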
@@ -313,24 +336,6 @@ config RCU_NOCB_CPU_CB_BOOST
	  Say Y here if you want to set RT priority for offloading kthreads.
	  Say N here if you are building a !PREEMPT_RT kernel and are unsure.

-config TASKS_TRACE_RCU_READ_MB
-	bool "Tasks Trace RCU readers use memory barriers in user and idle"
-	depends on RCU_EXPERT && TASKS_TRACE_RCU
-	default PREEMPT_RT || NR_CPUS < 8
-	help
-	  Use this option to further reduce the number of IPIs sent
-	  to CPUs executing in userspace or idle during tasks trace
-	  RCU grace periods. Given that a reasonable setting of
-	  the rcupdate.rcu_task_ipi_delay kernel boot parameter
-	  eliminates such IPIs for many workloads, proper setting
-	  of this Kconfig option is important mostly for aggressive
-	  real-time installations and for battery-powered devices,
-	  hence the default chosen above.
-
-	  Say Y here if you hate IPIs.
-	  Say N here if you hate read-side memory barriers.
-	  Take the default if you are unsure.
-
 config RCU_LAZY
	bool "RCU callback lazy invocation functionality"
	depends on RCU_NOCB_CPU
@@ -544,10 +544,6 @@ struct task_struct *get_rcu_tasks_rude_gp_kthread(void);
 void rcu_tasks_rude_get_gp_data(int *flags, unsigned long *gp_seq);
 #endif // # ifdef CONFIG_TASKS_RUDE_RCU

-#ifdef CONFIG_TASKS_TRACE_RCU
-void rcu_tasks_trace_get_gp_data(int *flags, unsigned long *gp_seq);
-#endif
-
 #ifdef CONFIG_TASKS_RCU_GENERIC
 void tasks_cblist_init_generic(void);
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
@@ -673,11 +669,6 @@ void show_rcu_tasks_rude_gp_kthread(void);
 #else
 static inline void show_rcu_tasks_rude_gp_kthread(void) {}
 #endif
-#if !defined(CONFIG_TINY_RCU) && defined(CONFIG_TASKS_TRACE_RCU)
-void show_rcu_tasks_trace_gp_kthread(void);
-#else
-static inline void show_rcu_tasks_trace_gp_kthread(void) {}
-#endif

 #ifdef CONFIG_TINY_RCU
 static inline bool rcu_cpu_beenfullyonline(int cpu) { return true; }
@@ -400,11 +400,6 @@ static void tasks_trace_scale_read_unlock(int idx)
	rcu_read_unlock_trace();
 }

-static void rcu_tasks_trace_scale_stats(void)
-{
-	rcu_tasks_trace_torture_stats_print(scale_type, SCALE_FLAG);
-}
-
 static struct rcu_scale_ops tasks_tracing_ops = {
	.ptype		= RCU_TASKS_FLAVOR,
	.init		= rcu_sync_scale_init,
@@ -416,8 +411,6 @@ static struct rcu_scale_ops tasks_tracing_ops = {
	.gp_barrier	= rcu_barrier_tasks_trace,
	.sync		= synchronize_rcu_tasks_trace,
	.exp_sync	= synchronize_rcu_tasks_trace,
-	.rso_gp_kthread	= get_rcu_tasks_trace_gp_kthread,
-	.stats		= IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_trace_scale_stats,
	.name		= "tasks-tracing"
 };
@@ -1178,10 +1178,9 @@ static struct rcu_torture_ops tasks_tracing_ops = {
	.deferred_free	= rcu_tasks_tracing_torture_deferred_free,
	.sync		= synchronize_rcu_tasks_trace,
	.exp_sync	= synchronize_rcu_tasks_trace,
+	.exp_current	= rcu_tasks_trace_expedite_current,
	.call		= call_rcu_tasks_trace,
	.cb_barrier	= rcu_barrier_tasks_trace,
-	.gp_kthread_dbg	= show_rcu_tasks_trace_gp_kthread,
-	.get_gp_data	= rcu_tasks_trace_get_gp_data,
	.cbflood_max	= 50000,
	.irq_capable	= 1,
	.slow_gps	= 1,
@@ -1750,7 +1749,7 @@ rcu_torture_writer(void *arg)
				ulo[i] = cur_ops->get_comp_state();
			gp_snap = cur_ops->start_gp_poll();
			rcu_torture_writer_state = RTWS_POLL_WAIT;
-			if (cur_ops->exp_current && !torture_random(&rand) % 0xff)
+			if (cur_ops->exp_current && !(torture_random(&rand) & 0xff))
				cur_ops->exp_current();
			while (!cur_ops->poll_gp_state(gp_snap)) {
				gp_snap1 = cur_ops->get_gp_state();
@@ -1772,7 +1771,7 @@ rcu_torture_writer(void *arg)
				cur_ops->get_comp_state_full(&rgo[i]);
			cur_ops->start_gp_poll_full(&gp_snap_full);
			rcu_torture_writer_state = RTWS_POLL_WAIT_FULL;
-			if (cur_ops->exp_current && !torture_random(&rand) % 0xff)
+			if (cur_ops->exp_current && !(torture_random(&rand) & 0xff))
				cur_ops->exp_current();
			while (!cur_ops->poll_gp_state_full(&gp_snap_full)) {
				cur_ops->get_gp_state_full(&gp_snap1_full);
@@ -2455,6 +2454,9 @@ static DEFINE_TORTURE_RANDOM_PERCPU(rcu_torture_timer_rand);
  */
 static void rcu_torture_timer(struct timer_list *unused)
 {
+	WARN_ON_ONCE(!in_serving_softirq());
+	WARN_ON_ONCE(in_hardirq());
+	WARN_ON_ONCE(in_nmi());
	atomic_long_inc(&n_rcu_torture_timers);
	(void)rcu_torture_one_read(this_cpu_ptr(&rcu_torture_timer_rand), -1);
@@ -262,7 +262,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
	ssp->srcu_sup->srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL;
	ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
	if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
-		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC))
+		if (!init_srcu_struct_nodes(ssp, is_static ? GFP_ATOMIC : GFP_KERNEL))
			goto err_free_sda;
		WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
	}
@@ -161,11 +161,6 @@ static void tasks_rcu_exit_srcu_stall(struct timer_list *unused);
 static DEFINE_TIMER(tasks_rcu_exit_srcu_stall_timer, tasks_rcu_exit_srcu_stall);
 #endif

-/* Avoid IPIing CPUs early in the grace period. */
-#define RCU_TASK_IPI_DELAY (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) ? HZ / 2 : 0)
-static int rcu_task_ipi_delay __read_mostly = RCU_TASK_IPI_DELAY;
-module_param(rcu_task_ipi_delay, int, 0644);
-
 /* Control stall timeouts. Disable with <= 0, otherwise jiffies till stall. */
 #define RCU_TASK_BOOT_STALL_TIMEOUT (HZ * 30)
 #define RCU_TASK_STALL_TIMEOUT (HZ * 60 * 10)
@@ -718,7 +713,6 @@ static void __init rcu_tasks_bootup_oddness(void)
 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
 }

-
 /* Dump out rcutorture-relevant state common to all RCU-tasks flavors. */
 static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s)
 {
@@ -801,9 +795,7 @@ static void rcu_tasks_torture_stats_print_generic(struct rcu_tasks *rtp, char *t

 #endif // #ifndef CONFIG_TINY_RCU

-static void exit_tasks_rcu_finish_trace(struct task_struct *t);
-
-#if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)
+#if defined(CONFIG_TASKS_RCU)

 ////////////////////////////////////////////////////////////////////////
 //
@@ -898,7 +890,7 @@ static void rcu_tasks_wait_gp(struct rcu_tasks *rtp)
	rtp->postgp_func(rtp);
 }

-#endif /* #if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU) */
+#endif /* #if defined(CONFIG_TASKS_RCU) */

 #ifdef CONFIG_TASKS_RCU
@@ -1322,13 +1314,11 @@ void exit_tasks_rcu_finish(void)
	raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
	list_del_init(&t->rcu_tasks_exit_list);
	raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
-
-	exit_tasks_rcu_finish_trace(t);
 }

 #else /* #ifdef CONFIG_TASKS_RCU */
 void exit_tasks_rcu_start(void) { }
-void exit_tasks_rcu_finish(void) { exit_tasks_rcu_finish_trace(current); }
+void exit_tasks_rcu_finish(void) { }
 #endif /* #else #ifdef CONFIG_TASKS_RCU */

 #ifdef CONFIG_TASKS_RUDE_RCU
@ -1449,682 +1439,11 @@ EXPORT_SYMBOL_GPL(rcu_tasks_rude_get_gp_data);
|
|||
|
||||
#endif /* #ifdef CONFIG_TASKS_RUDE_RCU */
|
||||
|
||||
////////////////////////////////////////////////////////////////////////
//
// Tracing variant of Tasks RCU.  This variant is designed to be used
// to protect tracing hooks, including those of BPF.  This variant
// therefore:
//
// 1.	Has explicit read-side markers to allow finite grace periods
//	in the face of in-kernel loops for PREEMPT=n builds.
//
// 2.	Protects code in the idle loop, exception entry/exit, and
//	CPU-hotplug code paths, similar to the capabilities of SRCU.
//
// 3.	Avoids expensive read-side instructions, having overhead similar
//	to that of Preemptible RCU.
//
// There are of course downsides.  For example, the grace-period code
// can send IPIs to CPUs, even when those CPUs are in the idle loop or
// in nohz_full userspace.  If needed, these downsides can be at least
// partially remedied.
//
// Perhaps most important, this variant of RCU does not affect the vanilla
// flavors, rcu_preempt and rcu_sched.  The fact that RCU Tasks Trace
// readers can operate from idle, offline, and exception entry/exit in no
// way allows rcu_preempt and rcu_sched readers to also do so.
//
// The implementation uses rcu_tasks_wait_gp(), which relies on function
// pointers in the rcu_tasks structure.  The rcu_spawn_tasks_trace_kthread()
// function sets these function pointers up so that rcu_tasks_wait_gp()
// invokes these functions in this order:
//
// rcu_tasks_trace_pregp_step():
//	Disables CPU hotplug, adds all currently executing tasks to the
//	holdout list, then checks the state of all tasks that blocked
//	or were preempted within their current RCU Tasks Trace read-side
//	critical section, adding them to the holdout list if appropriate.
//	Finally, this function re-enables CPU hotplug.
// The ->pertask_func() pointer is NULL, so there is no per-task processing.
// rcu_tasks_trace_postscan():
//	Invokes synchronize_rcu() to wait for late-stage exiting tasks
//	to finish exiting.
// check_all_holdout_tasks_trace(), repeatedly until holdout list is empty:
//	Scans the holdout list, attempting to identify a quiescent state
//	for each task on the list.  If there is a quiescent state, the
//	corresponding task is removed from the holdout list.  Once this
//	list is empty, the grace period has completed.
// rcu_tasks_trace_postgp():
//	Provides the needed full memory barrier and does debug checks.
//
// The exit_tasks_rcu_finish_trace() synchronizes with exiting tasks.
//
// Pre-grace-period update-side code is ordered before the grace period
// via the ->cbs_lock and barriers in rcu_tasks_kthread().  Pre-grace-period
// read-side code is ordered before the grace period by atomic operations
// on .b.need_qs flag of each task involved in this process, or by scheduler
// context-switch ordering (for locked-down non-running readers).
// The lockdep state must be outside of #ifdef to be useful.
#ifdef CONFIG_DEBUG_LOCK_ALLOC
static struct lock_class_key rcu_lock_trace_key;
struct lockdep_map rcu_trace_lock_map =
	STATIC_LOCKDEP_MAP_INIT("rcu_read_lock_trace", &rcu_lock_trace_key);
EXPORT_SYMBOL_GPL(rcu_trace_lock_map);
#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */

#ifdef CONFIG_TASKS_TRACE_RCU

// Record outstanding IPIs to each CPU.  No point in sending two...
static DEFINE_PER_CPU(bool, trc_ipi_to_cpu);

// The number of detections of task quiescent state relying on
// heavyweight readers executing explicit memory barriers.
static unsigned long n_heavy_reader_attempts;
static unsigned long n_heavy_reader_updates;
static unsigned long n_heavy_reader_ofl_updates;
static unsigned long n_trc_holdouts;

void call_rcu_tasks_trace(struct rcu_head *rhp, rcu_callback_t func);
DEFINE_RCU_TASKS(rcu_tasks_trace, rcu_tasks_wait_gp, call_rcu_tasks_trace,
		 "RCU Tasks Trace");
/* Load from ->trc_reader_special.b.need_qs with proper ordering. */
static u8 rcu_ld_need_qs(struct task_struct *t)
{
	smp_mb(); // Enforce full grace-period ordering.
	return smp_load_acquire(&t->trc_reader_special.b.need_qs);
}

/* Store to ->trc_reader_special.b.need_qs with proper ordering. */
static void rcu_st_need_qs(struct task_struct *t, u8 v)
{
	smp_store_release(&t->trc_reader_special.b.need_qs, v);
	smp_mb(); // Enforce full grace-period ordering.
}

/*
 * Do a cmpxchg() on ->trc_reader_special.b.need_qs, allowing for
 * the four-byte operand-size restriction of some platforms.
 *
 * Returns the old value, which is often ignored.
 */
u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new)
{
	return cmpxchg(&t->trc_reader_special.b.need_qs, old, new);
}
EXPORT_SYMBOL_GPL(rcu_trc_cmpxchg_need_qs);
/*
 * If we are the last reader, signal the grace-period kthread.
 * Also remove from the per-CPU list of blocked tasks.
 */
void rcu_read_unlock_trace_special(struct task_struct *t)
{
	unsigned long flags;
	struct rcu_tasks_percpu *rtpcp;
	union rcu_special trs;

	// Open-coded full-word version of rcu_ld_need_qs().
	smp_mb(); // Enforce full grace-period ordering.
	trs = smp_load_acquire(&t->trc_reader_special);

	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) && t->trc_reader_special.b.need_mb)
		smp_mb(); // Pairs with update-side barriers.
	// Update .need_qs before ->trc_reader_nesting for irq/NMI handlers.
	if (trs.b.need_qs == (TRC_NEED_QS_CHECKED | TRC_NEED_QS)) {
		u8 result = rcu_trc_cmpxchg_need_qs(t, TRC_NEED_QS_CHECKED | TRC_NEED_QS,
						    TRC_NEED_QS_CHECKED);

		WARN_ONCE(result != trs.b.need_qs, "%s: result = %d", __func__, result);
	}
	if (trs.b.blocked) {
		rtpcp = per_cpu_ptr(rcu_tasks_trace.rtpcpu, t->trc_blkd_cpu);
		raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
		list_del_init(&t->trc_blkd_node);
		WRITE_ONCE(t->trc_reader_special.b.blocked, false);
		raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
	}
	WRITE_ONCE(t->trc_reader_nesting, 0);
}
EXPORT_SYMBOL_GPL(rcu_read_unlock_trace_special);
/* Add a newly blocked reader task to its CPU's list. */
void rcu_tasks_trace_qs_blkd(struct task_struct *t)
{
	unsigned long flags;
	struct rcu_tasks_percpu *rtpcp;

	local_irq_save(flags);
	rtpcp = this_cpu_ptr(rcu_tasks_trace.rtpcpu);
	raw_spin_lock_rcu_node(rtpcp); // irqs already disabled
	t->trc_blkd_cpu = smp_processor_id();
	if (!rtpcp->rtp_blkd_tasks.next)
		INIT_LIST_HEAD(&rtpcp->rtp_blkd_tasks);
	list_add(&t->trc_blkd_node, &rtpcp->rtp_blkd_tasks);
	WRITE_ONCE(t->trc_reader_special.b.blocked, true);
	raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
}
EXPORT_SYMBOL_GPL(rcu_tasks_trace_qs_blkd);
/* Add a task to the holdout list, if it is not already on the list. */
static void trc_add_holdout(struct task_struct *t, struct list_head *bhp)
{
	if (list_empty(&t->trc_holdout_list)) {
		get_task_struct(t);
		list_add(&t->trc_holdout_list, bhp);
		n_trc_holdouts++;
	}
}

/* Remove a task from the holdout list, if it is in fact present. */
static void trc_del_holdout(struct task_struct *t)
{
	if (!list_empty(&t->trc_holdout_list)) {
		list_del_init(&t->trc_holdout_list);
		put_task_struct(t);
		n_trc_holdouts--;
	}
}
/* IPI handler to check task state. */
static void trc_read_check_handler(void *t_in)
{
	int nesting;
	struct task_struct *t = current;
	struct task_struct *texp = t_in;

	// If the task is no longer running on this CPU, leave.
	if (unlikely(texp != t))
		goto reset_ipi; // Already on holdout list, so will check later.

	// If the task is not in a read-side critical section, and
	// if this is the last reader, awaken the grace-period kthread.
	nesting = READ_ONCE(t->trc_reader_nesting);
	if (likely(!nesting)) {
		rcu_trc_cmpxchg_need_qs(t, 0, TRC_NEED_QS_CHECKED);
		goto reset_ipi;
	}
	// If we are racing with an rcu_read_unlock_trace(), try again later.
	if (unlikely(nesting < 0))
		goto reset_ipi;

	// Get here if the task is in a read-side critical section.
	// Set its state so that it will update state for the grace-period
	// kthread upon exit from that critical section.
	rcu_trc_cmpxchg_need_qs(t, 0, TRC_NEED_QS | TRC_NEED_QS_CHECKED);

reset_ipi:
	// Allow future IPIs to be sent on CPU and for task.
	// Also order this IPI handler against any later manipulations of
	// the intended task.
	smp_store_release(per_cpu_ptr(&trc_ipi_to_cpu, smp_processor_id()), false); // ^^^
	smp_store_release(&texp->trc_ipi_to_cpu, -1); // ^^^
}
/* Callback function for scheduler to check locked-down task. */
static int trc_inspect_reader(struct task_struct *t, void *bhp_in)
{
	struct list_head *bhp = bhp_in;
	int cpu = task_cpu(t);
	int nesting;
	bool ofl = cpu_is_offline(cpu);

	if (task_curr(t) && !ofl) {
		// If no chance of heavyweight readers, do it the hard way.
		if (!IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
			return -EINVAL;

		// If heavyweight readers are enabled on the remote task,
		// we can inspect its state despite it currently running.
		// However, we cannot safely change its state.
		n_heavy_reader_attempts++;
		// Check for "running" idle tasks on offline CPUs.
		if (!rcu_watching_zero_in_eqs(cpu, &t->trc_reader_nesting))
			return -EINVAL; // No quiescent state, do it the hard way.
		n_heavy_reader_updates++;
		nesting = 0;
	} else {
		// The task is not running, so C-language access is safe.
		nesting = t->trc_reader_nesting;
		WARN_ON_ONCE(ofl && task_curr(t) && (t != idle_task(task_cpu(t))));
		if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) && ofl)
			n_heavy_reader_ofl_updates++;
	}

	// If not exiting a read-side critical section, mark as checked
	// so that the grace-period kthread will remove it from the
	// holdout list.
	if (!nesting) {
		rcu_trc_cmpxchg_need_qs(t, 0, TRC_NEED_QS_CHECKED);
		return 0; // In QS, so done.
	}
	if (nesting < 0)
		return -EINVAL; // Reader transitioning, try again later.

	// The task is in a read-side critical section, so set up its
	// state so that it will update state upon exit from that critical
	// section.
	if (!rcu_trc_cmpxchg_need_qs(t, 0, TRC_NEED_QS | TRC_NEED_QS_CHECKED))
		trc_add_holdout(t, bhp);
	return 0;
}
/* Attempt to extract the state for the specified task. */
static void trc_wait_for_one_reader(struct task_struct *t,
				    struct list_head *bhp)
{
	int cpu;

	// If a previous IPI is still in flight, let it complete.
	if (smp_load_acquire(&t->trc_ipi_to_cpu) != -1) // Order IPI
		return;

	// The current task had better be in a quiescent state.
	if (t == current) {
		rcu_trc_cmpxchg_need_qs(t, 0, TRC_NEED_QS_CHECKED);
		WARN_ON_ONCE(READ_ONCE(t->trc_reader_nesting));
		return;
	}

	// Attempt to nail down the task for inspection.
	get_task_struct(t);
	if (!task_call_func(t, trc_inspect_reader, bhp)) {
		put_task_struct(t);
		return;
	}
	put_task_struct(t);

	// If this task is not yet on the holdout list, then we are in
	// an RCU read-side critical section.  Otherwise, the invocation of
	// trc_add_holdout() that added it to the list did the necessary
	// get_task_struct().  Either way, the task cannot be freed out
	// from under this code.

	// If the task is currently running, send an IPI; either way, add it to the list.
	trc_add_holdout(t, bhp);
	if (task_curr(t) &&
	    time_after(jiffies + 1, rcu_tasks_trace.gp_start + rcu_task_ipi_delay)) {
		// The task is currently running, so try IPIing it.
		cpu = task_cpu(t);

		// If there is already an IPI outstanding, let it happen.
		if (per_cpu(trc_ipi_to_cpu, cpu) || t->trc_ipi_to_cpu >= 0)
			return;

		per_cpu(trc_ipi_to_cpu, cpu) = true;
		t->trc_ipi_to_cpu = cpu;
		rcu_tasks_trace.n_ipis++;
		if (smp_call_function_single(cpu, trc_read_check_handler, t, 0)) {
			// Just in case there is some other reason for
			// failure than the target CPU being offline.
			WARN_ONCE(1, "%s(): smp_call_function_single() failed for CPU: %d\n",
				  __func__, cpu);
			rcu_tasks_trace.n_ipis_fails++;
			per_cpu(trc_ipi_to_cpu, cpu) = false;
			t->trc_ipi_to_cpu = -1;
		}
	}
}
/*
 * Initialize for first-round processing for the specified task.
 * Return false if task is NULL or already taken care of, true otherwise.
 */
static bool rcu_tasks_trace_pertask_prep(struct task_struct *t, bool notself)
{
	// During early boot when there is only the one boot CPU, there
	// is no idle task for the other CPUs.  Also, the grace-period
	// kthread is always in a quiescent state.  In addition, just return
	// if this task is already on the list.
	if (unlikely(t == NULL) || (t == current && notself) || !list_empty(&t->trc_holdout_list))
		return false;

	rcu_st_need_qs(t, 0);
	t->trc_ipi_to_cpu = -1;
	return true;
}

/* Do first-round processing for the specified task. */
static void rcu_tasks_trace_pertask(struct task_struct *t, struct list_head *hop)
{
	if (rcu_tasks_trace_pertask_prep(t, true))
		trc_wait_for_one_reader(t, hop);
}
/* Initialize for a new RCU-tasks-trace grace period. */
static void rcu_tasks_trace_pregp_step(struct list_head *hop)
{
	LIST_HEAD(blkd_tasks);
	int cpu;
	unsigned long flags;
	struct rcu_tasks_percpu *rtpcp;
	struct task_struct *t;

	// There shouldn't be any old IPIs, but...
	for_each_possible_cpu(cpu)
		WARN_ON_ONCE(per_cpu(trc_ipi_to_cpu, cpu));

	// Disable CPU hotplug across the CPU scan for the benefit of
	// any IPIs that might be needed.  This also waits for all readers
	// in CPU-hotplug code paths.
	cpus_read_lock();

	// These rcu_tasks_trace_pertask_prep() calls are serialized to
	// allow safe access to the hop list.
	for_each_online_cpu(cpu) {
		rcu_read_lock();
		// Note that cpu_curr_snapshot() picks up the target
		// CPU's current task while its runqueue is locked with
		// an smp_mb__after_spinlock().  This ensures that either
		// the grace-period kthread will see that task's read-side
		// critical section or the task will see the updater's pre-GP
		// accesses.  The trailing smp_mb() in cpu_curr_snapshot()
		// does not currently play a role other than simplify
		// that function's ordering semantics.  If these simplified
		// ordering semantics continue to be redundant, that smp_mb()
		// might be removed.
		t = cpu_curr_snapshot(cpu);
		if (rcu_tasks_trace_pertask_prep(t, true))
			trc_add_holdout(t, hop);
		rcu_read_unlock();
		cond_resched_tasks_rcu_qs();
	}

	// Only after all running tasks have been accounted for is it
	// safe to take care of the tasks that have blocked within their
	// current RCU tasks trace read-side critical section.
	for_each_possible_cpu(cpu) {
		rtpcp = per_cpu_ptr(rcu_tasks_trace.rtpcpu, cpu);
		raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
		list_splice_init(&rtpcp->rtp_blkd_tasks, &blkd_tasks);
		while (!list_empty(&blkd_tasks)) {
			rcu_read_lock();
			t = list_first_entry(&blkd_tasks, struct task_struct, trc_blkd_node);
			list_del_init(&t->trc_blkd_node);
			list_add(&t->trc_blkd_node, &rtpcp->rtp_blkd_tasks);
			raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
			rcu_tasks_trace_pertask(t, hop);
			rcu_read_unlock();
			raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
		}
		raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
		cond_resched_tasks_rcu_qs();
	}

	// Re-enable CPU hotplug now that the holdout list is populated.
	cpus_read_unlock();
}
/*
 * Do intermediate processing between task and holdout scans.
 */
static void rcu_tasks_trace_postscan(struct list_head *hop)
{
	// Wait for late-stage exiting tasks to finish exiting.
	// These might have passed the call to exit_tasks_rcu_finish().

	// If you remove the following line, update rcu_trace_implies_rcu_gp()!!!
	synchronize_rcu();
	// Any tasks that exit after this point will set
	// TRC_NEED_QS_CHECKED in ->trc_reader_special.b.need_qs.
}
/* Communicate task state back to the RCU tasks trace stall warning request. */
struct trc_stall_chk_rdr {
	int nesting;
	int ipi_to_cpu;
	u8 needqs;
};

static int trc_check_slow_task(struct task_struct *t, void *arg)
{
	struct trc_stall_chk_rdr *trc_rdrp = arg;

	if (task_curr(t) && cpu_online(task_cpu(t)))
		return false; // It is running, so decline to inspect it.
	trc_rdrp->nesting = READ_ONCE(t->trc_reader_nesting);
	trc_rdrp->ipi_to_cpu = READ_ONCE(t->trc_ipi_to_cpu);
	trc_rdrp->needqs = rcu_ld_need_qs(t);
	return true;
}
/* Show the state of a task stalling the current RCU tasks trace GP. */
static void show_stalled_task_trace(struct task_struct *t, bool *firstreport)
{
	int cpu;
	struct trc_stall_chk_rdr trc_rdr;
	bool is_idle_tsk = is_idle_task(t);

	if (*firstreport) {
		pr_err("INFO: rcu_tasks_trace detected stalls on tasks:\n");
		*firstreport = false;
	}
	cpu = task_cpu(t);
	if (!task_call_func(t, trc_check_slow_task, &trc_rdr))
		pr_alert("P%d: %c%c\n",
			 t->pid,
			 ".I"[t->trc_ipi_to_cpu >= 0],
			 ".i"[is_idle_tsk]);
	else
		pr_alert("P%d: %c%c%c%c nesting: %d%c%c cpu: %d%s\n",
			 t->pid,
			 ".I"[trc_rdr.ipi_to_cpu >= 0],
			 ".i"[is_idle_tsk],
			 ".N"[cpu >= 0 && tick_nohz_full_cpu(cpu)],
			 ".B"[!!data_race(t->trc_reader_special.b.blocked)],
			 trc_rdr.nesting,
			 " !CN"[trc_rdr.needqs & 0x3],
			 " ?"[trc_rdr.needqs > 0x3],
			 cpu, cpu_online(cpu) ? "" : "(offline)");
	sched_show_task(t);
}
/* List stalled IPIs for RCU tasks trace. */
static void show_stalled_ipi_trace(void)
{
	int cpu;

	for_each_possible_cpu(cpu)
		if (per_cpu(trc_ipi_to_cpu, cpu))
			pr_alert("\tIPI outstanding to CPU %d\n", cpu);
}
/* Do one scan of the holdout list. */
static void check_all_holdout_tasks_trace(struct list_head *hop,
					  bool needreport, bool *firstreport)
{
	struct task_struct *g, *t;

	// Disable CPU hotplug across the holdout list scan for IPIs.
	cpus_read_lock();

	list_for_each_entry_safe(t, g, hop, trc_holdout_list) {
		// If safe and needed, try to check the current task.
		if (READ_ONCE(t->trc_ipi_to_cpu) == -1 &&
		    !(rcu_ld_need_qs(t) & TRC_NEED_QS_CHECKED))
			trc_wait_for_one_reader(t, hop);

		// If check succeeded, remove this task from the list.
		if (smp_load_acquire(&t->trc_ipi_to_cpu) == -1 &&
		    rcu_ld_need_qs(t) == TRC_NEED_QS_CHECKED)
			trc_del_holdout(t);
		else if (needreport)
			show_stalled_task_trace(t, firstreport);
		cond_resched_tasks_rcu_qs();
	}

	// Re-enable CPU hotplug now that the holdout list scan has completed.
	cpus_read_unlock();

	if (needreport) {
		if (*firstreport)
			pr_err("INFO: rcu_tasks_trace detected stalls? (Late IPI?)\n");
		show_stalled_ipi_trace();
	}
}
static void rcu_tasks_trace_empty_fn(void *unused)
{
}

/* Wait for grace period to complete and provide ordering. */
static void rcu_tasks_trace_postgp(struct rcu_tasks *rtp)
{
	int cpu;

	// Wait for any lingering IPI handlers to complete.  Note that
	// if a CPU has gone offline or transitioned to userspace in the
	// meantime, all IPI handlers should have been drained beforehand.
	// Yes, this assumes that CPUs process IPIs in order.  If that ever
	// changes, there will need to be a recheck and/or timed wait.
	for_each_online_cpu(cpu)
		if (WARN_ON_ONCE(smp_load_acquire(per_cpu_ptr(&trc_ipi_to_cpu, cpu))))
			smp_call_function_single(cpu, rcu_tasks_trace_empty_fn, NULL, 1);

	smp_mb(); // Caller's code must be ordered after wakeup.
		  // Pairs with pretty much every ordering primitive.
}
/* Report any needed quiescent state for this exiting task. */
static void exit_tasks_rcu_finish_trace(struct task_struct *t)
{
	union rcu_special trs = READ_ONCE(t->trc_reader_special);

	rcu_trc_cmpxchg_need_qs(t, 0, TRC_NEED_QS_CHECKED);
	WARN_ON_ONCE(READ_ONCE(t->trc_reader_nesting));
	if (WARN_ON_ONCE(rcu_ld_need_qs(t) & TRC_NEED_QS || trs.b.blocked))
		rcu_read_unlock_trace_special(t);
	else
		WRITE_ONCE(t->trc_reader_nesting, 0);
}
/**
 * call_rcu_tasks_trace() - Queue a callback trace task-based grace period
 * @rhp: structure to be used for queueing the RCU updates.
 * @func: actual callback function to be invoked after the grace period
 *
 * The callback function will be invoked some time after a trace rcu-tasks
 * grace period elapses, in other words after all currently executing
 * trace rcu-tasks read-side critical sections have completed. These
 * read-side critical sections are delimited by calls to rcu_read_lock_trace()
 * and rcu_read_unlock_trace().
 *
 * See the description of call_rcu() for more detailed information on
 * memory ordering guarantees.
 */
void call_rcu_tasks_trace(struct rcu_head *rhp, rcu_callback_t func)
{
	call_rcu_tasks_generic(rhp, func, &rcu_tasks_trace);
}
EXPORT_SYMBOL_GPL(call_rcu_tasks_trace);
/**
 * synchronize_rcu_tasks_trace - wait for a trace rcu-tasks grace period
 *
 * Control will return to the caller some time after a trace rcu-tasks
 * grace period has elapsed, in other words after all currently executing
 * trace rcu-tasks read-side critical sections have elapsed. These read-side
 * critical sections are delimited by calls to rcu_read_lock_trace()
 * and rcu_read_unlock_trace().
 *
 * This is a very specialized primitive, intended only for a few uses in
 * tracing and other situations requiring manipulation of function preambles
 * and profiling hooks. The synchronize_rcu_tasks_trace() function is not
 * (yet) intended for heavy use from multiple CPUs.
 *
 * See the description of synchronize_rcu() for more detailed information
 * on memory ordering guarantees.
 */
void synchronize_rcu_tasks_trace(void)
{
	RCU_LOCKDEP_WARN(lock_is_held(&rcu_trace_lock_map), "Illegal synchronize_rcu_tasks_trace() in RCU Tasks Trace read-side critical section");
	synchronize_rcu_tasks_generic(&rcu_tasks_trace);
}
EXPORT_SYMBOL_GPL(synchronize_rcu_tasks_trace);
/**
 * rcu_barrier_tasks_trace - Wait for in-flight call_rcu_tasks_trace() callbacks.
 *
 * Although the current implementation is guaranteed to wait, it is not
 * obligated to, for example, if there are no pending callbacks.
 */
void rcu_barrier_tasks_trace(void)
{
	rcu_barrier_tasks_generic(&rcu_tasks_trace);
}
EXPORT_SYMBOL_GPL(rcu_barrier_tasks_trace);
int rcu_tasks_trace_lazy_ms = -1;
module_param(rcu_tasks_trace_lazy_ms, int, 0444);

static int __init rcu_spawn_tasks_trace_kthread(void)
{
	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB)) {
		rcu_tasks_trace.gp_sleep = HZ / 10;
		rcu_tasks_trace.init_fract = HZ / 10;
	} else {
		rcu_tasks_trace.gp_sleep = HZ / 200;
		if (rcu_tasks_trace.gp_sleep <= 0)
			rcu_tasks_trace.gp_sleep = 1;
		rcu_tasks_trace.init_fract = HZ / 200;
		if (rcu_tasks_trace.init_fract <= 0)
			rcu_tasks_trace.init_fract = 1;
	}
	if (rcu_tasks_trace_lazy_ms >= 0)
		rcu_tasks_trace.lazy_jiffies = msecs_to_jiffies(rcu_tasks_trace_lazy_ms);
	rcu_tasks_trace.pregp_func = rcu_tasks_trace_pregp_step;
	rcu_tasks_trace.postscan_func = rcu_tasks_trace_postscan;
	rcu_tasks_trace.holdouts_func = check_all_holdout_tasks_trace;
	rcu_tasks_trace.postgp_func = rcu_tasks_trace_postgp;
	rcu_spawn_tasks_kthread_generic(&rcu_tasks_trace);
	return 0;
}
#if !defined(CONFIG_TINY_RCU)
void show_rcu_tasks_trace_gp_kthread(void)
{
	char buf[64];

	snprintf(buf, sizeof(buf), "N%lu h:%lu/%lu/%lu",
		 data_race(n_trc_holdouts),
		 data_race(n_heavy_reader_ofl_updates),
		 data_race(n_heavy_reader_updates),
		 data_race(n_heavy_reader_attempts));
	show_rcu_tasks_generic_gp_kthread(&rcu_tasks_trace, buf);
}
EXPORT_SYMBOL_GPL(show_rcu_tasks_trace_gp_kthread);

void rcu_tasks_trace_torture_stats_print(char *tt, char *tf)
{
	rcu_tasks_torture_stats_print_generic(&rcu_tasks_trace, tt, tf, "");
}
EXPORT_SYMBOL_GPL(rcu_tasks_trace_torture_stats_print);
#endif // !defined(CONFIG_TINY_RCU)
struct task_struct *get_rcu_tasks_trace_gp_kthread(void)
{
	return rcu_tasks_trace.kthread_ptr;
}
EXPORT_SYMBOL_GPL(get_rcu_tasks_trace_gp_kthread);

void rcu_tasks_trace_get_gp_data(int *flags, unsigned long *gp_seq)
{
	*flags = 0;
	*gp_seq = rcu_seq_current(&rcu_tasks_trace.tasks_gp_seq);
}
EXPORT_SYMBOL_GPL(rcu_tasks_trace_get_gp_data);

#else /* #ifdef CONFIG_TASKS_TRACE_RCU */
static void exit_tasks_rcu_finish_trace(struct task_struct *t) { }
#endif /* #else #ifdef CONFIG_TASKS_TRACE_RCU */

#ifndef CONFIG_TINY_RCU
void show_rcu_tasks_gp_kthreads(void)
{
	show_rcu_tasks_classic_gp_kthread();
	show_rcu_tasks_rude_gp_kthread();
	show_rcu_tasks_trace_gp_kthread();
}
#endif /* #ifndef CONFIG_TINY_RCU */
@@ -2251,10 +1570,6 @@ void __init tasks_cblist_init_generic(void)
#ifdef CONFIG_TASKS_RUDE_RCU
	cblist_init_generic(&rcu_tasks_rude);
#endif

#ifdef CONFIG_TASKS_TRACE_RCU
	cblist_init_generic(&rcu_tasks_trace);
#endif
}

static int __init rcu_init_tasks_generic(void)

@@ -2267,10 +1582,6 @@ static int __init rcu_init_tasks_generic(void)
	rcu_spawn_tasks_rude_kthread();
#endif

#ifdef CONFIG_TASKS_TRACE_RCU
	rcu_spawn_tasks_trace_kthread();
#endif

	// Run the self-tests.
	rcu_tasks_initiate_self_tests();

@@ -2281,3 +1592,16 @@ core_initcall(rcu_init_tasks_generic);
#else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
static inline void rcu_tasks_bootup_oddness(void) {}
#endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */

#ifdef CONFIG_TASKS_TRACE_RCU

////////////////////////////////////////////////////////////////////////
//
// Tracing variant of Tasks RCU.  This variant is designed to be used
// to protect tracing hooks, including those of BPF.  This variant
// is implemented via a straightforward mapping onto SRCU-fast.

DEFINE_SRCU_FAST(rcu_tasks_trace_srcu_struct);
EXPORT_SYMBOL_GPL(rcu_tasks_trace_srcu_struct);

#endif /* #else #ifdef CONFIG_TASKS_TRACE_RCU */
@@ -160,6 +160,7 @@ static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp,
			      unsigned long gps, unsigned long flags);
static void invoke_rcu_core(void);
static void rcu_report_exp_rdp(struct rcu_data *rdp);
static void rcu_report_qs_rdp(struct rcu_data *rdp);
static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
static bool rcu_rdp_cpu_online(struct rcu_data *rdp);
@@ -1983,6 +1984,17 @@ static noinline_for_stack bool rcu_gp_init(void)
	if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD))
		on_each_cpu(rcu_strict_gp_boundary, NULL, 0);

	/*
	 * Immediately report QS for the GP kthread's CPU.  The GP kthread
	 * cannot be in an RCU read-side critical section while running
	 * the FQS scan.  This eliminates the need for a second FQS wait
	 * when all CPUs are idle.
	 */
	preempt_disable();
	rcu_qs();
	rcu_report_qs_rdp(this_cpu_ptr(&rcu_data));
	preempt_enable();

	return true;
}
@@ -3769,7 +3781,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
	}
	rcu_nocb_unlock(rdp);
	if (wake_nocb)
		wake_nocb_gp(rdp, false);
		wake_nocb_gp(rdp);
	smp_store_release(&rdp->barrier_seq_snap, gseq);
}
@@ -203,7 +203,7 @@ struct rcu_data {
					/* during and after the last grace */
					/* period it is aware of. */
	struct irq_work defer_qs_iw;	/* Obtain later scheduler attention. */
	int defer_qs_iw_pending;	/* Scheduler attention pending? */
	int defer_qs_pending;		/* irqwork or softirq pending? */
	struct work_struct strict_work;	/* Schedule readers for strict GPs. */

	/* 2) batch handling */
@@ -301,7 +301,6 @@ struct rcu_data {
#define RCU_NOCB_WAKE_BYPASS	1
#define RCU_NOCB_WAKE_LAZY	2
#define RCU_NOCB_WAKE		3
#define RCU_NOCB_WAKE_FORCE	4

#define RCU_JIFFIES_TILL_FORCE_QS (1 + (HZ > 250) + (HZ > 500))
					/* For jiffies_till_first_fqs and */
@@ -500,7 +499,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp);
static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
static void rcu_init_one_nocb(struct rcu_node *rnp);
static bool wake_nocb_gp(struct rcu_data *rdp, bool force);
static bool wake_nocb_gp(struct rcu_data *rdp);
static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
				  unsigned long j, bool lazy);
static void call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *head,
@@ -589,7 +589,12 @@ static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigne
	pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
		j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask),
		".T"[!!data_race(rnp_root->exp_tasks)]);
	if (ndetected) {
	if (!ndetected) {
		// This is invoked from the grace-period worker, so
		// a new grace period cannot have started.  And if this
		// worker were stalled, we would not get here.  ;-)
		pr_err("INFO: Expedited stall ended before state dump start\n");
	} else {
		pr_err("blocking rcu_node structures (internal RCU debug):");
		rcu_for_each_node_breadth_first(rnp) {
			if (rnp == rnp_root)
@@ -190,9 +190,18 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
	init_swait_queue_head(&rnp->nocb_gp_wq[1]);
}

/* Clear any pending deferred wakeup timer (nocb_gp_lock must be held). */
static void nocb_defer_wakeup_cancel(struct rcu_data *rdp_gp)
{
	if (rdp_gp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
		WRITE_ONCE(rdp_gp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
		timer_delete(&rdp_gp->nocb_timer);
	}
}

static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
			   struct rcu_data *rdp,
			   bool force, unsigned long flags)
			   unsigned long flags)
	__releases(rdp_gp->nocb_gp_lock)
{
	bool needwake = false;
@ -204,12 +213,9 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
|
|||
return false;
|
||||
}
|
||||
|
||||
if (rdp_gp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
|
||||
WRITE_ONCE(rdp_gp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
|
||||
timer_delete(&rdp_gp->nocb_timer);
|
||||
}
|
||||
nocb_defer_wakeup_cancel(rdp_gp);
|
||||
|
||||
if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) {
|
||||
if (READ_ONCE(rdp_gp->nocb_gp_sleep)) {
|
||||
WRITE_ONCE(rdp_gp->nocb_gp_sleep, false);
|
||||
needwake = true;
|
||||
}
|
||||
|
|
@ -225,13 +231,13 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
|
|||
/*
|
||||
* Kick the GP kthread for this NOCB group.
|
||||
*/
|
||||
static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
|
||||
static bool wake_nocb_gp(struct rcu_data *rdp)
|
||||
{
|
||||
unsigned long flags;
|
||||
struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
|
||||
|
||||
raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
|
||||
return __wake_nocb_gp(rdp_gp, rdp, force, flags);
|
||||
return __wake_nocb_gp(rdp_gp, rdp, flags);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_RCU_LAZY
|
||||
|
|
@ -518,22 +524,17 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
|
|||
}
|
||||
|
||||
/*
|
||||
* Awaken the no-CBs grace-period kthread if needed, either due to it
|
||||
* legitimately being asleep or due to overload conditions.
|
||||
*
|
||||
* If warranted, also wake up the kthread servicing this CPUs queues.
|
||||
* Awaken the no-CBs grace-period kthread if needed due to it legitimately
|
||||
* being asleep.
|
||||
*/
|
||||
static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
|
||||
unsigned long flags)
|
||||
__releases(rdp->nocb_lock)
|
||||
{
|
||||
long bypass_len;
|
||||
unsigned long cur_gp_seq;
|
||||
unsigned long j;
|
||||
long lazy_len;
|
||||
long len;
|
||||
struct task_struct *t;
|
||||
struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
|
||||
|
||||
// If we are being polled or there is no kthread, just leave.
|
||||
t = READ_ONCE(rdp->nocb_gp_kthread);
|
||||
|
|
@ -549,47 +550,26 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
|
|||
lazy_len = READ_ONCE(rdp->lazy_len);
|
||||
if (was_alldone) {
|
||||
rdp->qlen_last_fqs_check = len;
|
||||
rcu_nocb_unlock(rdp);
|
||||
// Only lazy CBs in bypass list
|
||||
if (lazy_len && bypass_len == lazy_len) {
|
||||
rcu_nocb_unlock(rdp);
|
||||
wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_LAZY,
|
||||
TPS("WakeLazy"));
|
||||
} else if (!irqs_disabled_flags(flags)) {
|
||||
/* ... if queue was empty ... */
|
||||
rcu_nocb_unlock(rdp);
|
||||
wake_nocb_gp(rdp, false);
|
||||
wake_nocb_gp(rdp);
|
||||
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
|
||||
TPS("WakeEmpty"));
|
||||
} else {
|
||||
rcu_nocb_unlock(rdp);
|
||||
wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE,
|
||||
TPS("WakeEmptyIsDeferred"));
|
||||
}
|
||||
} else if (len > rdp->qlen_last_fqs_check + qhimark) {
|
||||
/* ... or if many callbacks queued. */
|
||||
rdp->qlen_last_fqs_check = len;
|
||||
j = jiffies;
|
||||
if (j != rdp->nocb_gp_adv_time &&
|
||||
rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
|
||||
rcu_seq_done(&rdp->mynode->gp_seq, cur_gp_seq)) {
|
||||
rcu_advance_cbs_nowake(rdp->mynode, rdp);
|
||||
rdp->nocb_gp_adv_time = j;
|
||||
}
|
||||
smp_mb(); /* Enqueue before timer_pending(). */
|
||||
if ((rdp->nocb_cb_sleep ||
|
||||
!rcu_segcblist_ready_cbs(&rdp->cblist)) &&
|
||||
!timer_pending(&rdp_gp->nocb_timer)) {
|
||||
rcu_nocb_unlock(rdp);
|
||||
wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_FORCE,
|
||||
TPS("WakeOvfIsDeferred"));
|
||||
} else {
|
||||
rcu_nocb_unlock(rdp);
|
||||
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
|
||||
}
|
||||
} else {
|
||||
rcu_nocb_unlock(rdp);
|
||||
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
|
||||
|
||||
return;
|
||||
}
|
||||
|
||||
rcu_nocb_unlock(rdp);
|
||||
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
|
||||
}
|
||||
|
||||
static void call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *head,
|
||||
|
|
@ -814,10 +794,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
|
|||
if (rdp_toggling)
|
||||
my_rdp->nocb_toggling_rdp = NULL;
|
||||
|
||||
if (my_rdp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
|
||||
WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
|
||||
timer_delete(&my_rdp->nocb_timer);
|
||||
}
|
||||
nocb_defer_wakeup_cancel(my_rdp);
|
||||
WRITE_ONCE(my_rdp->nocb_gp_sleep, true);
|
||||
raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
|
||||
} else {
|
||||
|
|
@ -966,7 +943,6 @@ static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp_gp,
|
|||
unsigned long flags)
|
||||
__releases(rdp_gp->nocb_gp_lock)
|
||||
{
|
||||
int ndw;
|
||||
int ret;
|
||||
|
||||
if (!rcu_nocb_need_deferred_wakeup(rdp_gp, level)) {
|
||||
|
|
@ -974,8 +950,7 @@ static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp_gp,
|
|||
return false;
|
||||
}
|
||||
|
||||
ndw = rdp_gp->nocb_defer_wakeup;
|
||||
ret = __wake_nocb_gp(rdp_gp, rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
|
||||
ret = __wake_nocb_gp(rdp_gp, rdp, flags);
|
||||
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake"));
|
||||
|
||||
return ret;
|
||||
|
|
@ -991,7 +966,6 @@ static void do_nocb_deferred_wakeup_timer(struct timer_list *t)
|
|||
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Timer"));
|
||||
|
||||
raw_spin_lock_irqsave(&rdp->nocb_gp_lock, flags);
|
||||
smp_mb__after_spinlock(); /* Timer expire before wakeup. */
|
||||
do_nocb_deferred_wakeup_common(rdp, rdp, RCU_NOCB_WAKE_BYPASS, flags);
|
||||
}
|
||||
|
||||
|
|
@ -1272,7 +1246,7 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
|
|||
}
|
||||
rcu_nocb_try_flush_bypass(rdp, jiffies);
|
||||
rcu_nocb_unlock_irqrestore(rdp, flags);
|
||||
wake_nocb_gp(rdp, false);
|
||||
wake_nocb_gp(rdp);
|
||||
sc->nr_to_scan -= _count;
|
||||
count += _count;
|
||||
if (sc->nr_to_scan <= 0)
|
||||
|
|
@ -1657,7 +1631,7 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
|
|||
{
|
||||
}
|
||||
|
||||
static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
|
||||
static bool wake_nocb_gp(struct rcu_data *rdp)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
|
|
|||
|
|
@@ -487,8 +487,8 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
 	union rcu_special special;
 
 	rdp = this_cpu_ptr(&rcu_data);
-	if (rdp->defer_qs_iw_pending == DEFER_QS_PENDING)
-		rdp->defer_qs_iw_pending = DEFER_QS_IDLE;
+	if (rdp->defer_qs_pending == DEFER_QS_PENDING)
+		rdp->defer_qs_pending = DEFER_QS_IDLE;
 
 	/*
 	 * If RCU core is waiting for this CPU to exit its critical section,
@@ -645,7 +645,7 @@ static void rcu_preempt_deferred_qs_handler(struct irq_work *iwp)
 	 * 5. Deferred QS reporting does not happen.
 	 */
 	if (rcu_preempt_depth() > 0)
-		WRITE_ONCE(rdp->defer_qs_iw_pending, DEFER_QS_IDLE);
+		WRITE_ONCE(rdp->defer_qs_pending, DEFER_QS_IDLE);
 }
 
 /*
@@ -747,7 +747,10 @@ static void rcu_read_unlock_special(struct task_struct *t)
 			// Using softirq, safe to awaken, and either the
 			// wakeup is free or there is either an expedited
 			// GP in flight or a potential need to deboost.
-			raise_softirq_irqoff(RCU_SOFTIRQ);
+			if (rdp->defer_qs_pending != DEFER_QS_PENDING) {
+				rdp->defer_qs_pending = DEFER_QS_PENDING;
+				raise_softirq_irqoff(RCU_SOFTIRQ);
+			}
 		} else {
 			// Enabling BH or preempt does reschedule, so...
 			// Also if no expediting and no possible deboosting,
@@ -755,11 +758,11 @@ static void rcu_read_unlock_special(struct task_struct *t)
 			// tick enabled.
 			set_need_resched_current();
 			if (IS_ENABLED(CONFIG_IRQ_WORK) && irqs_were_disabled &&
-			    needs_exp && rdp->defer_qs_iw_pending != DEFER_QS_PENDING &&
+			    needs_exp && rdp->defer_qs_pending != DEFER_QS_PENDING &&
 			    cpu_online(rdp->cpu)) {
 				// Get scheduler to re-evaluate and call hooks.
 				// If !IRQ_WORK, FQS scan will eventually IPI.
-				rdp->defer_qs_iw_pending = DEFER_QS_PENDING;
+				rdp->defer_qs_pending = DEFER_QS_PENDING;
 				irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
 			}
 		}
@@ -863,7 +863,9 @@ our %deprecated_apis = (
 #These should be enough to drive away new IDR users
 	"DEFINE_IDR" => "DEFINE_XARRAY",
 	"idr_init" => "xa_init",
-	"idr_init_base" => "xa_init_flags"
+	"idr_init_base" => "xa_init_flags",
+	"rcu_read_lock_trace" => "rcu_read_lock_tasks_trace",
+	"rcu_read_unlock_trace" => "rcu_read_unlock_tasks_trace",
 );
 
 #Create a search pattern for all these strings to speed up a loop below
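The checkpatch.pl hunk above adds the old Tasks Trace read-side API names to the `%deprecated_apis` table; checkpatch then folds all of the table's keys into a single search pattern for one scanning pass. A rough shell analogue of that idea (the variable names and sample input below are invented for illustration; checkpatch's actual Perl differs):

```shell
#!/bin/sh
# Fold a deprecated-API list into one alternation and grep code for it.
# Names and sample lines are illustrative, not checkpatch's exact code.
deprecated="rcu_read_lock_trace rcu_read_unlock_trace idr_init"
pattern="`echo $deprecated | tr ' ' '|'`"
printf '%s\n' 'rcu_read_lock_trace();' 'synchronize_rcu();' |
	grep -E "($pattern)"
```

Only the line using the deprecated API is reported; building the pattern once avoids looping over the table for every input line.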
@@ -3,3 +3,4 @@ initrd
 b[0-9]*
 res
 *.swp
+.kvm.sh.lock
@@ -42,7 +42,7 @@ do
 	grep -v '^#' < $i | grep -v '^ *$' > $T/p
 	if test -r $i.boot
 	then
-		tr -s ' ' '\012' < $i.boot | grep -v '^#' >> $T/p
+		sed -e 's/#.*$//' < $i.boot | tr -s ' ' '\012' >> $T/p
 	fi
 	sed -e 's/^[^=]*$/&=?/' < $T/p |
 	sed -e 's/^\([^=]*\)=\(.*\)$/\tp["\1:'"$i"'"] = "\2";\n\tc["\1"] = 1;/' >> $T/p.awk
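The config2csv.sh hunk above is the "properly handle comments in .boot files" change from the merge summary: `grep -v '^#'` drops only lines that *start* with `#`, so a comment trailing a boot parameter survived and polluted the parameter list, while `sed -e 's/#.*$//'` strips the comment wherever it starts. A small demonstration (the sample .boot line is invented):

```shell
#!/bin/sh
# Show the difference between the old grep filter and the new sed one
# on a boot line with a trailing comment (sample line invented).
line='rcutorture.onoff_interval=200 # hotplug every 200 jiffies'
echo "$line" | grep -v '^#'		# old: trailing comment survives
echo "$line" | sed -e 's/#.*$//'	# new: trailing comment stripped
```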
@@ -15,7 +15,7 @@
 # This script is intended to replace kvm-check-branches.sh by providing
 # ease of use and faster execution.
 
-T="`mktemp -d ${TMPDIR-/tmp}/kvm-series.sh.XXXXXX`"
+T="`mktemp -d ${TMPDIR-/tmp}/kvm-series.sh.XXXXXX`"; export T
 trap 'rm -rf $T' 0
 
 scriptname=$0
@@ -32,6 +32,7 @@ then
 	echo "$0: Repetition ('*') not allowed in config list."
 	exit 1
 fi
+config_list_len="`echo ${config_list} | wc -w | awk '{ print $1; }'`"
 
 commit_list="${2}"
 if test -z "${commit_list}"
@@ -47,70 +48,209 @@ then
 	exit 2
 fi
 sha1_list=`cat $T/commits`
+sha1_list_len="`echo ${sha1_list} | wc -w | awk '{ print $1; }'`"
 
 shift
 shift
 
 RCUTORTURE="`pwd`/tools/testing/selftests/rcutorture"; export RCUTORTURE
 PATH=${RCUTORTURE}/bin:$PATH; export PATH
+RES="${RCUTORTURE}/res"; export RES
 . functions.sh
 
 ret=0
-nfail=0
+nbuildfail=0
+nrunfail=0
 nsuccess=0
-faillist=
+ncpus=0
+buildfaillist=
+runfaillist=
 successlist=
 cursha1="`git rev-parse --abbrev-ref HEAD`"
 ds="`date +%Y.%m.%d-%H.%M.%S`-series"
+DS="${RES}/${ds}"; export DS
 startdate="`date`"
 starttime="`get_starttime`"
 
 echo " --- " $scriptname $args | tee -a $T/log
+echo " --- Results directory: " $ds | tee -a $T/log
 
+# Do all builds.  Iterate through commits within a given scenario
+# because builds normally go faster from one commit to the next within a
+# given scenario.  In contrast, switching scenarios on each rebuild will
+# often force a full rebuild due to Kconfig differences, for example,
+# turning preemption on and off.  Defer actual runs in order to run
+# lots of them concurrently on large systems.
+touch $T/torunlist
+n2build="$((config_list_len*sha1_list_len))"
+nbuilt=0
 for config in ${config_list}
 do
+	sha_n=0
 	for sha in ${sha1_list}
 	do
-		echo Starting ${config}/${sha1} at `date` | tee -a $T/log
-		git checkout "${sha}"
-		time tools/testing/selftests/rcutorture/bin/kvm.sh --configs "$config" --datestamp "$ds/${config}/${sha1}" --duration 1 "$@"
+		sha1=${sha_n}.${sha} # Enable "sort -k1nr" to list commits in order.
+		echo
+		echo Starting ${config}/${sha1} "($((nbuilt+1)) of ${n2build})" at `date` | tee -a $T/log
+		git checkout --detach "${sha}"
+		tools/testing/selftests/rcutorture/bin/kvm.sh --configs "$config" --datestamp "$ds/${config}/${sha1}" --duration 1 --build-only --trust-make "$@"
 		curret=$?
 		if test "${curret}" -ne 0
 		then
-			nfail=$((nfail+1))
-			faillist="$faillist ${config}/${sha1}(${curret})"
+			nbuildfail=$((nbuildfail+1))
+			buildfaillist="$buildfaillist ${config}/${sha1}(${curret})"
 		else
-			nsuccess=$((nsuccess+1))
-			successlist="$successlist ${config}/${sha1}"
-			# Successful run, so remove large files.
-			rm -f ${RCUTORTURE}/$ds/${config}/${sha1}/{vmlinux,bzImage,System.map,Module.symvers}
+			batchncpus="`grep -v "^# cpus=" "${DS}/${config}/${sha1}/batches" | awk '{ sum += $3 } END { print sum }'`"
+			echo run_one_qemu ${sha_n} ${config}/${sha1} ${batchncpus} >> $T/torunlist
+			if test "${ncpus}" -eq 0
+			then
+				ncpus="`grep "^# cpus=" "${DS}/${config}/${sha1}/batches" | sed -e 's/^# cpus=//'`"
+				case "${ncpus}" in
+				^[0-9]*$)
+					;;
+				*)
+					ncpus=0
+					;;
+				esac
+			fi
 		fi
 		if test "${ret}" -eq 0
 		then
 			ret=${curret}
 		fi
+		sha_n=$((sha_n+1))
+		nbuilt=$((nbuilt+1))
	done
 done
 
+# If the user did not specify the number of CPUs, use them all.
+if test "${ncpus}" -eq 0
+then
+	ncpus="`identify_qemu_vcpus`"
+fi
+
+cpusused=0
+touch $T/successlistfile
+touch $T/faillistfile
+n2run="`wc -l $T/torunlist | awk '{ print $1; }'`"
+nrun=0
+
+# do_run_one_qemu ds resultsdir qemu_curout
+#
+# Start the specified qemu run and record its success or failure.
+do_run_one_qemu () {
+	local ret
+	local ds="$1"
+	local resultsdir="$2"
+	local qemu_curout="$3"
+
+	tools/testing/selftests/rcutorture/bin/kvm-again.sh "${DS}/${resultsdir}" --link inplace-force > ${qemu_curout} 2>&1
+	ret=$?
+	if test "${ret}" -eq 0
+	then
+		echo ${resultsdir} >> $T/successlistfile
+		# Successful run, so remove large files.
+		rm -f ${DS}/${resultsdir}/{vmlinux,bzImage,System.map,Module.symvers}
+	else
+		echo "${resultsdir}(${ret})" >> $T/faillistfile
+	fi
+}
+
+# cleanup_qemu_batch batchncpus
+#
+# Update success and failure lists, files, and counts at the end of
+# a batch.
+cleanup_qemu_batch () {
+	local batchncpus="$1"
+
+	echo Waiting, cpusused=${cpusused}, ncpus=${ncpus} `date` | tee -a $T/log
+	wait
+	cpusused="${batchncpus}"
+	nsuccessbatch="`wc -l $T/successlistfile | awk '{ print $1 }'`"
+	nsuccess=$((nsuccess+nsuccessbatch))
+	successlist="$successlist `cat $T/successlistfile`"
+	rm $T/successlistfile
+	touch $T/successlistfile
+	nfailbatch="`wc -l $T/faillistfile | awk '{ print $1 }'`"
+	nrunfail=$((nrunfail+nfailbatch))
+	runfaillist="$runfaillist `cat $T/faillistfile`"
+	rm $T/faillistfile
+	touch $T/faillistfile
+}
+
+# run_one_qemu sha_n config/sha1 batchncpus
+#
+# Launch into the background the sha_n-th qemu job whose results directory
+# is config/sha1 and which uses batchncpus CPUs.  Once we reach a job that
+# would overflow the number of available CPUs, wait for the previous jobs
+# to complete and record their results.
+run_one_qemu () {
+	local sha_n="$1"
+	local config_sha1="$2"
+	local batchncpus="$3"
+	local qemu_curout
+
+	cpusused=$((cpusused+batchncpus))
+	if test "${cpusused}" -gt $ncpus
+	then
+		cleanup_qemu_batch "${batchncpus}"
+	fi
+	echo Starting ${config_sha1} using ${batchncpus} CPUs "($((nrun+1)) of ${n2run})" `date`
+	qemu_curout="${DS}/${config_sha1}/qemu-series"
+	do_run_one_qemu "$ds" "${config_sha1}" ${qemu_curout} &
+	nrun="$((nrun+1))"
+}
+
+# Re-ordering the runs will mess up the affinity chosen at build time
+# (among other things, over-using CPU 0), so suppress it.
+TORTURE_NO_AFFINITY="no-affinity"; export TORTURE_NO_AFFINITY
+
+# Run the kernels (if any) that built correctly.
+echo | tee -a $T/log # Put a blank line between build and run messages.
+. $T/torunlist
+cleanup_qemu_batch "${batchncpus}"
+
 # Get back to initial checkout/SHA-1.
 git checkout "${cursha1}"
 
-echo ${nsuccess} SUCCESSES: | tee -a $T/log
-echo ${successlist} | fmt | tee -a $T/log
-echo | tee -a $T/log
-echo ${nfail} FAILURES: | tee -a $T/log
-echo ${faillist} | fmt | tee -a $T/log
-if test -n "${faillist}"
+# Throw away leading and trailing space characters for fmt.
+successlist="`echo ${successlist} | sed -e 's/^ *//' -e 's/ *$//'`"
+buildfaillist="`echo ${buildfaillist} | sed -e 's/^ *//' -e 's/ *$//'`"
+runfaillist="`echo ${runfaillist} | sed -e 's/^ *//' -e 's/ *$//'`"
+
+# Print lists of successes, build failures, and run failures, if any.
+if test "${nsuccess}" -gt 0
 then
 	echo | tee -a $T/log
-	echo Failures across commits: | tee -a $T/log
-	echo ${faillist} | tr ' ' '\012' | sed -e 's,^[^/]*/,,' -e 's/([0-9]*)//' |
+	echo ${nsuccess} SUCCESSES: | tee -a $T/log
+	echo ${successlist} | fmt | tee -a $T/log
+fi
+if test "${nbuildfail}" -gt 0
+then
+	echo | tee -a $T/log
+	echo ${nbuildfail} BUILD FAILURES: | tee -a $T/log
+	echo ${buildfaillist} | fmt | tee -a $T/log
+fi
+if test "${nrunfail}" -gt 0
+then
+	echo | tee -a $T/log
+	echo ${nrunfail} RUN FAILURES: | tee -a $T/log
+	echo ${runfaillist} | fmt | tee -a $T/log
+fi
+
+# If there were build or runtime failures, map them to commits.
+if test "${nbuildfail}" -gt 0 || test "${nrunfail}" -gt 0
+then
+	echo | tee -a $T/log
+	echo Build failures across commits: | tee -a $T/log
+	echo ${buildfaillist} | tr ' ' '\012' | sed -e 's,^[^/]*/,,' -e 's/([0-9]*)//' |
 		sort | uniq -c | sort -k2n | tee -a $T/log
 fi
 
+# Print run summary.
 echo | tee -a $T/log
 echo Started at $startdate, ended at `date`, duration `get_starttime_duration $starttime`. | tee -a $T/log
-echo Summary: Successes: ${nsuccess} Failures: ${nfail} | tee -a $T/log
-cp $T/log tools/testing/selftests/rcutorture/res/${ds}
+echo Summary: Successes: ${nsuccess} " "Build Failures: ${nbuildfail} " "Runtime Failures: ${nrunfail}| tee -a $T/log
+cp $T/log ${DS}
 
 exit "${ret}"
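The run phase that kvm-series.sh gains above (the "parallel run" improvement from the merge summary) launches background qemu jobs until the next one would exceed the CPU budget, then waits for the whole in-flight batch before continuing. A toy model of that batching logic (the job sizes and the 4-CPU budget below are invented; the real script derives them from the `batches` files and `identify_qemu_vcpus`):

```shell
#!/bin/sh
# Toy model of kvm-series.sh run batching: add each job's CPUs to
# cpusused, and when the next job would overflow the budget, wait for
# the in-flight batch first.  Sizes and budget are invented.
ncpus=4
cpusused=0
batches=0
run_one () {	# run_one job-cpus
	if test $((cpusused + $1)) -gt $ncpus
	then
		wait			# drain the current batch
		batches=$((batches + 1))
		cpusused=0
	fi
	cpusused=$((cpusused + $1))
	true &				# stand-in for a backgrounded qemu run
}
for jobcpus in 2 2 2 2		# four 2-CPU jobs against a 4-CPU budget
do
	run_one $jobcpus
done
wait				# drain the final batch
batches=$((batches + 1))
echo "ran 4 jobs in $batches batches"
```

With a 4-CPU budget, the four 2-CPU jobs run as two batches of two, which is the concurrency win over running each commit's kernel serially.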
@@ -80,6 +80,7 @@ usage () {
 	echo " --kasan"
 	echo " --kconfig Kconfig-options"
 	echo " --kcsan"
+	echo " --kill-previous"
 	echo " --kmake-arg kernel-make-arguments"
 	echo " --mac nn:nn:nn:nn:nn:nn"
 	echo " --memory megabytes|nnnG"
@@ -206,6 +207,9 @@ do
 	--kcsan)
 		TORTURE_KCONFIG_KCSAN_ARG="$debuginfo CONFIG_KCSAN=y CONFIG_KCSAN_STRICT=y CONFIG_KCSAN_REPORT_ONCE_IN_MS=100000 CONFIG_KCSAN_VERBOSE=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y"; export TORTURE_KCONFIG_KCSAN_ARG
 		;;
+	--kill-previous)
+		TORTURE_KILL_PREVIOUS=1
+		;;
 	--kmake-arg|--kmake-args)
 		checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
 		TORTURE_KMAKE_ARG="`echo "$TORTURE_KMAKE_ARG $2" | sed -e 's/^ *//' -e 's/ *$//'`"
@@ -275,6 +279,42 @@ do
 	shift
 done
 
+# Prevent concurrent kvm.sh runs on the same source tree.  The flock
+# is automatically released when the script exits, even if killed.
+TORTURE_LOCK="$RCUTORTURE/.kvm.sh.lock"
+
+# Terminate any processes holding the lock file, if requested.
+if test -n "$TORTURE_KILL_PREVIOUS"
+then
+	if test -e "$TORTURE_LOCK"
+	then
+		echo "Killing processes holding $TORTURE_LOCK..."
+		if fuser -k "$TORTURE_LOCK" >/dev/null 2>&1
+		then
+			sleep 2
+			echo "Previous kvm.sh processes killed."
+		else
+			echo "No processes were holding the lock."
+		fi
+	else
+		echo "No lock file exists, nothing to kill."
+	fi
+fi
+
+if test -z "$dryrun"
+then
+	# Create a file descriptor and flock it, so that when kvm.sh (and its
+	# children) exit, the flock is released by the kernel automatically.
+	exec 9>"$TORTURE_LOCK"
+	if ! flock -n 9
+	then
+		echo "ERROR: Another kvm.sh instance is already running on this tree."
+		echo " Lock file: $TORTURE_LOCK"
+		echo " To run kvm.sh, kill all existing kvm.sh runs first (--kill-previous)."
+		exit 1
+	fi
+fi
+
 if test -n "$dryrun" || test -z "$TORTURE_INITRD" || tools/testing/selftests/rcutorture/bin/mkinitrd.sh
 then
 	:
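The kvm.sh hunk above serializes runs on a source tree by opening file descriptor 9 on a lock file and taking a non-blocking exclusive `flock` on it; because the lock lives on the fd, the kernel drops it when the script and its children exit, even if they are killed. A minimal standalone sketch of the same pattern (the lock path here is invented):

```shell
#!/bin/sh
# Minimal sketch of the fd-9 flock pattern from the hunk above.
# The lock path is invented; the mechanism is the same: open fd 9 on
# the lock file, take a non-blocking exclusive lock, and rely on the
# kernel to release it when the process exits.
LOCK="${TMPDIR-/tmp}/flock-demo.$$.lock"
exec 9>"$LOCK"
if ! flock -n 9
then
	echo "another instance holds the lock"
	exit 1
fi
echo "lock acquired"
# While fd 9 stays open, a second non-blocking attempt fails at once:
flock -n "$LOCK" -c 'echo should not print' || echo "second instance refused"
rm -f "$LOCK"
```

Holding the lock on an fd rather than, say, creating a pidfile means there is no stale-lock cleanup problem: a crashed run leaves the file behind, but not the lock.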
@@ -18,7 +18,7 @@ fi
 echo Build directory: `pwd` > ${resdir}/testid.txt
 if test -d .git
 then
-	echo Current commit: `git rev-parse HEAD` >> ${resdir}/testid.txt
+	echo Current commit: `git show --oneline --no-patch HEAD` >> ${resdir}/testid.txt
 	echo >> ${resdir}/testid.txt
 	echo ' ---' Output of "'"git status"'": >> ${resdir}/testid.txt
 	git status >> ${resdir}/testid.txt
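This is the "include commit description in testid.txt" change from the merge summary: `git rev-parse HEAD` records only the bare SHA-1, while `git show --oneline --no-patch HEAD` records the abbreviated SHA-1 plus the commit subject, making a results directory self-describing. Demonstrated against a throwaway repository (all paths and identities below are invented):

```shell
#!/bin/sh
# Compare the two commands in a throwaway repository.
set -e
repo="`mktemp -d "${TMPDIR-/tmp}/testid-demo.XXXXXX"`"
trap 'rm -rf "$repo"' 0
cd "$repo"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
	commit -q --allow-empty -m "demo: record commit description"
git rev-parse HEAD			# bare 40-character SHA-1
git show --oneline --no-patch HEAD	# abbreviated SHA-1 plus subject
```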
@@ -10,5 +10,4 @@ CONFIG_PROVE_LOCKING=n
 #CHECK#CONFIG_PROVE_RCU=n
 CONFIG_FORCE_TASKS_TRACE_RCU=y
 #CHECK#CONFIG_TASKS_TRACE_RCU=y
-CONFIG_TASKS_TRACE_RCU_READ_MB=y
 CONFIG_RCU_EXPERT=y
@@ -9,6 +9,5 @@ CONFIG_PROVE_LOCKING=y
 #CHECK#CONFIG_PROVE_RCU=y
 CONFIG_FORCE_TASKS_TRACE_RCU=y
 #CHECK#CONFIG_TASKS_TRACE_RCU=y
-CONFIG_TASKS_TRACE_RCU_READ_MB=n
 CONFIG_RCU_EXPERT=y
 CONFIG_DEBUG_OBJECTS=y