書名： Mastering Linux Kernel Development
作者名： Raghu Bharadwaj
本章字數： 727字
更新時間： 2021-07-08 09:47:22

Linux process scheduler design

Linux, which was primarily developed for desktop systems, has unassumingly evolved into a multi-dimensional operating system with its usage spread across embedded devices, mainframes, and supercomputers to room-sized servers. It has also seamlessly accommodated the ever-evolving diverse computing platforms such as SMP, virtualization, and real-time systems. The diversity of these platforms is brought forth by the kind of processes that run on these systems. For instance, a highly interactive desktop system may run processes that are I/O bound, and a real-time system thrives on deterministic processes. Every kind of process thus calls for a different kind of heuristic when it needs to be fairly scheduled, as a CPU-intensive process may require more CPU time than a normal process, and a real-time process would require deterministic execution. Linux, which caters to a wide spectrum of systems, is thus confronted with addressing the varying scheduling challenges that come along when managing these diverse processes.

The intrinsic design of Linux's process scheduler elegantly and deftly handles this challenge by adopting a simple two-layered model, with its first layer, the Generic Scheduler, defining abstract operations that serve as entry functions for the scheduler, and the second layer, the scheduling class, implementing the actual scheduling operations, where each class is dedicated to handling the scheduling heuristics of a particular kind of process. This model enables the generic scheduler to remain abstracted from the implementation details of every scheduler class. For instance, normal processes (I/O bound) can be handled by one class, and processes that require deterministic execution, such as real-time processes, can be handled by another class. This architecture also enables adding a new scheduling class seamlessly. The previous figure depicts the layered design of the process scheduler.

The generic scheduler defines abstract interfaces through a structure called sched_class:

struct sched_class {
    const struct sched_class *next;

     void (*enqueue_task) (struct rq *rq, struct task_struct *p, int flags);
   void (*dequeue_task) (struct rq *rq, struct task_struct *p, int flags);
   void (*yield_task) (struct rq *rq);
       bool (*yield_to_task) (struct rq *rq, struct task_struct *p, bool preempt);

 void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags);

       /*
         * It is the responsibility of the pick_next_task() method that will
       * return the next task to call put_prev_task() on the @prev task or
  * something equivalent.
   *
         * May return RETRY_TASK when it finds a higher prio class has runnable
    * tasks.
  */
       struct task_struct * (*pick_next_task) (struct rq *rq,
                                            struct task_struct *prev,
                                         struct rq_flags *rf);
     void (*put_prev_task) (struct rq *rq, struct task_struct *p);

#ifdef CONFIG_SMP
        int  (*select_task_rq)(struct task_struct *p, int task_cpu, int sd_flag, int flags);
      void (*migrate_task_rq)(struct task_struct *p);

     void (*task_woken) (struct rq *this_rq, struct task_struct *task);

  void (*set_cpus_allowed)(struct task_struct *p,
                            const struct cpumask *newmask);

    void (*rq_online)(struct rq *rq);
 void (*rq_offline)(struct rq *rq);
#endif

      void (*set_curr_task) (struct rq *rq);
    void (*task_tick) (struct rq *rq, struct task_struct *p, int queued);
     void (*task_fork) (struct task_struct *p);
        void (*task_dead) (struct task_struct *p);

  /*
         * The switched_from() call is allowed to drop rq->lock, therefore we
   * cannot assume the switched_from/switched_to pair is serialized by
        * rq->lock. They are however serialized by p->pi_lock.
      */
       void (*switched_from) (struct rq *this_rq, struct task_struct *task);
     void (*switched_to) (struct rq *this_rq, struct task_struct *task);
       void (*prio_changed) (struct rq *this_rq, struct task_struct *task,
                            int oldprio);

  unsigned int (*get_rr_interval) (struct rq *rq,
                                    struct task_struct *task);

 void (*update_curr) (struct rq *rq);

#define TASK_SET_GROUP  0
#define TASK_MOVE_GROUP  1

#ifdef CONFIG_FAIR_GROUP_SCHED
       void (*task_change_group) (struct task_struct *p, int type);
#endif
};

Every scheduler class implements operations as defined in the sched_class structure. As of the 4.12.x kernel, there are three scheduling classes: the Completely Fair Scheduling (CFS) class , Real-Time Scheduling class, and Deadline Scheduling class, with each class handling processes with specific scheduling requirements. The following code snippets show how each class populates its operations as per the sched_class structure.

CFS class:

const struct sched_class fair_sched_class = {
         .next                   = &idle_sched_class,
         .enqueue_task           = enqueue_task_fair,
         .dequeue_task           = dequeue_task_fair,
         .yield_task             = yield_task_fair,
         .yield_to_task          = yield_to_task_fair,
 
         .check_preempt_curr     = check_preempt_wakeup,
 
         .pick_next_task         = pick_next_task_fair,
         .put_prev_task          = put_prev_task_fair,
....
}

Real-Time Scheduling class:

const struct sched_class rt_sched_class = {
         .next                   = &fair_sched_class,
         .enqueue_task           = enqueue_task_rt,
         .dequeue_task           = dequeue_task_rt,
         .yield_task             = yield_task_rt,
 
         .check_preempt_curr     = check_preempt_curr_rt,
 
         .pick_next_task         = pick_next_task_rt,
         .put_prev_task          = put_prev_task_rt,
....
}

Deadline Scheduling class:

const struct sched_class dl_sched_class = {
         .next                   = &rt_sched_class,
         .enqueue_task           = enqueue_task_dl,
         .dequeue_task           = dequeue_task_dl,
         .yield_task             = yield_task_dl,
 
         .check_preempt_curr     = check_preempt_curr_dl,
 
         .pick_next_task         = pick_next_task_dl,
         .put_prev_task          = put_prev_task_dl,
....
}

官术网_书友最值得收藏!

Mastering Linux Kernel Development

Linux process scheduler design