Preemptive Task Scheduler on STM32 (Bare-Metal)

Overview

While working on embedded systems, I wanted to understand how multiple tasks actually run “at the same time” on a microcontroller.

Instead of using an RTOS, I decided to build a small scheduler from scratch on an STM32. The goal was simple: manage multiple independent tasks, switch between them efficiently, and handle timing without blocking the CPU.

This project focuses on the core ideas behind operating systems: context switching, task scheduling, and time-based execution.


What This System Does

The scheduler allows multiple tasks to run concurrently using time slicing.

Each task:

  • Runs independently
  • Can perform hardware operations (like LED control)
  • Can delay itself without blocking other tasks
  • Automatically resumes after the delay expires

The system uses:

  • SysTick → for timekeeping (1 ms tick)
  • PendSV → for context switching between tasks
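To make the 1 ms tick concrete, here is a minimal sketch of how the SysTick reload value could be derived. The helper name systick_reload and the 16 MHz core clock used in the usage note are assumptions for illustration, not values taken from the project. The 24-bit limit on the reload register is a property of the Cortex-M SysTick timer.

```c
#include <stdint.h>

/* The SysTick reload register is 24 bits wide on Cortex-M. */
#define SYSTICK_MAX_RELOAD 0x00FFFFFFu

/*
 * Hypothetical helper: compute the SysTick reload value for a given
 * core clock and tick rate. The counter counts from the reload value
 * down to 0, so N cycles per tick needs a reload value of N - 1.
 * Returns 0 if the request cannot be represented.
 */
uint32_t systick_reload(uint32_t core_clock_hz, uint32_t tick_hz)
{
    if (tick_hz == 0u) {
        return 0u;
    }

    uint32_t cycles_per_tick = core_clock_hz / tick_hz;

    if (cycles_per_tick < 2u || (cycles_per_tick - 1u) > SYSTICK_MAX_RELOAD) {
        return 0u;
    }
    return cycles_per_tick - 1u;
}
```

For an assumed 16 MHz core clock, systick_reload(16000000u, 1000u) gives 15999. If CMSIS is available, SysTick_Config(SystemCoreClock / 1000) performs the same subtraction internally.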

Architecture

At the core, the system uses a simple round-robin scheduler.

Each task is represented by a Task Control Block (TCB) that stores:

  • Process Stack Pointer (PSP)
  • Task state (READY or BLOCKED)
  • Delay timing (tick count)
  • Pointer to the task function
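The four fields above map naturally onto a small C struct. The following is a sketch under assumptions, not the project's actual code: the names tcb_t, task_create, example_task, and MAX_TASKS are all hypothetical, and the table size of four simply matches the LED demo.

```c
#include <stdint.h>

#define MAX_TASKS 4u   /* assumed: one slot per LED task in the demo */

typedef enum {
    TASK_READY = 0,
    TASK_BLOCKED
} task_state_t;

typedef void (*task_fn_t)(void);

/* Task Control Block: one per task. */
typedef struct {
    uint32_t    *psp;        /* saved Process Stack Pointer */
    task_state_t state;      /* READY or BLOCKED */
    uint32_t     wake_tick;  /* tick at which a delayed task becomes READY */
    task_fn_t    entry;      /* task function */
} tcb_t;

tcb_t    g_tasks[MAX_TASKS];
uint32_t g_task_count;

/* Hypothetical task body: a real task would loop forever doing work. */
void example_task(void)
{
    for (;;) {
    }
}

/*
 * Register a task in the table. Returns the task index, or -1 if the
 * table is full or the entry pointer is null. In a real system the
 * stored PSP would point at a prepared initial stack frame rather
 * than at the raw top of the stack.
 */
int task_create(task_fn_t entry, uint32_t *stack_top)
{
    if (g_task_count >= MAX_TASKS || entry == 0) {
        return -1;
    }

    tcb_t *t = &g_tasks[g_task_count];
    t->psp       = stack_top;
    t->state     = TASK_READY;
    t->wake_tick = 0u;
    t->entry     = entry;

    return (int)g_task_count++;
}
```

Keeping the TCBs in a fixed array makes round-robin selection a matter of stepping through indices, at the cost of a compile-time limit on the number of tasks.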

The design uses two stack pointers:

  • MSP (Main Stack Pointer) → used by the scheduler and interrupt handlers
  • PSP (Process Stack Pointer) → used by individual tasks

Each task gets its own dedicated stack in SRAM, so each task's call frames and local variables stay separate from those of the other tasks.

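Context switching is handled manually inside the PendSV handler. On exception entry, the Cortex-M hardware automatically pushes R0-R3, R12, LR, PC, and xPSR onto the running task's stack; the handler then saves the remaining core registers (R4-R11) and restores them for the next task.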
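Because the first switch into a task "restores" a context that was never saved, each task stack has to be pre-filled with a dummy frame. Below is a minimal sketch of that initialization. The names task_stack_init and task_exit_trap are hypothetical, and the position of R4-R11 below the hardware frame assumes the handler saves them with a descending store, which may differ from the project's handler.

```c
#include <stdint.h>

#define INITIAL_XPSR 0x01000000u  /* only the Thumb bit (bit 24) set */

typedef void (*task_fn_t)(void);

/* Hypothetical trap: where a task ends up if its function ever returns. */
void task_exit_trap(void)
{
    for (;;) {
    }
}

/*
 * Build an initial frame so the first context restore "returns" into
 * entry(). Layout from the returned pointer upward:
 *   [0..7]  R4-R11              (software-saved, assumed order)
 *   [8..12] R0, R1, R2, R3, R12 (hardware-stacked)
 *   [13]    LR   [14] PC   [15] xPSR
 */
uint32_t *task_stack_init(uint32_t *stack_top, task_fn_t entry)
{
    /* Align the starting point down to 8 bytes. */
    uint32_t *sp = (uint32_t *)((uintptr_t)stack_top & ~(uintptr_t)7u);
    int i;

    *(--sp) = INITIAL_XPSR;
    /* PC: entry address with bit 0 cleared; Thumb state comes from xPSR. */
    *(--sp) = (uint32_t)(uintptr_t)entry & ~1u;
    /* LR: return address used if the task function returns. */
    *(--sp) = (uint32_t)(uintptr_t)task_exit_trap;

    for (i = 0; i < 5; i++) {
        *(--sp) = 0u;   /* R12, R3, R2, R1, R0 */
    }
    for (i = 0; i < 8; i++) {
        *(--sp) = 0u;   /* R11 down to R4 */
    }

    return sp;          /* value to store as the task's initial PSP */
}
```

The xPSR value matters more than it looks: Cortex-M only executes Thumb code, so a frame with bit 24 clear faults as soon as the task starts.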


Execution Flow

The system runs on a fixed time base using SysTick:

  1. SysTick interrupt fires every 1 ms
  2. Global tick counter is incremented
  3. Blocked tasks are checked and unblocked if their delay has expired
  4. PendSV is set pending; it typically runs as soon as the SysTick handler returns
  5. Current task context is saved
  6. Next ready task is selected
  7. Next task context is restored and execution resumes
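Steps 1 to 4 and step 6 can be modeled in plain C. This is a host-side sketch, not the project's handler code: the names on_systick, pick_next_task, and g_switch_pending are hypothetical, and the flag stands in for setting the PendSV pending bit on real hardware. Steps 5 and 7 (saving and restoring registers) need assembly and are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_TASKS 4u

typedef enum { TASK_READY = 0, TASK_BLOCKED } task_state_t;

typedef struct {
    task_state_t state;
    uint32_t     wake_tick;
} task_t;

task_t   g_tasks[NUM_TASKS];
uint32_t g_tick;
uint32_t g_current;
bool     g_switch_pending;   /* assumption: models the PendSV pending bit */

/* Steps 1-4: count the tick, wake expired tasks, request a switch. */
void on_systick(void)
{
    uint32_t i;

    g_tick++;

    for (i = 0; i < NUM_TASKS; i++) {
        /* Signed difference keeps the check valid across tick wraparound. */
        if (g_tasks[i].state == TASK_BLOCKED &&
            (int32_t)(g_tick - g_tasks[i].wake_tick) >= 0) {
            g_tasks[i].state = TASK_READY;
        }
    }

    g_switch_pending = true;
}

/*
 * Step 6: round-robin selection. Start with the task after the current
 * one and wrap around; the current task is considered last. If nothing
 * is READY this sketch stays on the current task, whereas a real system
 * would usually switch to an idle task.
 */
uint32_t pick_next_task(void)
{
    uint32_t step;

    for (step = 1u; step <= NUM_TASKS; step++) {
        uint32_t idx = (g_current + step) % NUM_TASKS;
        if (g_tasks[idx].state == TASK_READY) {
            return idx;
        }
    }
    return g_current;
}
```

Starting the search at the task after the current one is what makes this round-robin: every READY task gets a turn before the current task runs again.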

This cycle repeats on every tick, so no single task can hold the processor indefinitely.


Task Scheduling and Delay Mechanism

Tasks use a tick-based delay system instead of busy-waiting.

When a task calls delay:

  • The task records its wake-up time (current_tick + delay)
  • The task marks itself as BLOCKED
  • The scheduler immediately switches to another task

Once the system tick reaches the wake-up time:

  • The task is moved back to READY state
  • It runs again when the round-robin scheduler next reaches it

Because a blocked task consumes no CPU time, the processor is spent only on tasks that have work to do.
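The delay call and the wake-up check can be sketched as follows. The names task_delay, delay_expired, and g_switch_pending are hypothetical, and the flag again stands in for pending PendSV. The signed-difference comparison is one common way to make the check survive tick counter wraparound; it is shown as an illustration and is not necessarily the comparison used in the project.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { TASK_READY = 0, TASK_BLOCKED } task_state_t;

typedef struct {
    task_state_t state;
    uint32_t     wake_tick;
} task_t;

volatile uint32_t g_tick;    /* incremented once per SysTick interrupt */
bool     g_switch_pending;   /* assumption: models the PendSV pending bit */

/*
 * True once 'now' has reached or passed 'wake_tick'. Comparing the
 * signed difference (instead of now == wake_tick or now >= wake_tick)
 * still works if a tick is missed or the 32-bit counter wraps, as long
 * as delays stay below 2^31 ticks.
 */
bool delay_expired(uint32_t now, uint32_t wake_tick)
{
    return (int32_t)(now - wake_tick) >= 0;
}

/*
 * Block the calling task for 'ticks' ticks and request a switch right
 * away, so the task does not keep running until the next SysTick.
 * On real hardware these writes would typically be protected from the
 * SysTick interrupt, for example by briefly disabling interrupts.
 */
void task_delay(task_t *self, uint32_t ticks)
{
    self->wake_tick = g_tick + ticks;   /* unsigned add wraps naturally */
    self->state     = TASK_BLOCKED;
    g_switch_pending = true;
}
```

With a 1 ms tick, a 32-bit counter wraps after roughly 49.7 days, which is why the wraparound case is worth handling even in a small demo.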


Demonstration

To test the scheduler, multiple LED tasks were created:

  • Green LED → toggles every 1 second
  • Orange LED → toggles every 500 ms
  • Red LED → toggles every 250 ms
  • Blue LED → toggles every 125 ms

All LEDs run at different frequencies simultaneously, demonstrating correct scheduling and timing behavior.
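The expected toggle counts can be checked with a small host-side simulation. This is an idealized model, not the firmware: it assumes a 1 ms tick, ignores task execution time and scheduling latency, and assumes each LED first toggles one full period after start. The names led_task_t and simulate_ticks are hypothetical.

```c
#include <stdint.h>

#define NUM_LEDS 4u

typedef struct {
    const char *name;
    uint32_t    period_ticks;  /* toggle period in 1 ms ticks */
    uint32_t    wake_tick;     /* next tick at which the LED toggles */
    uint32_t    toggles;       /* toggles counted so far */
} led_task_t;

led_task_t g_leds[NUM_LEDS] = {
    { "green",  1000u, 1000u, 0u },
    { "orange",  500u,  500u, 0u },
    { "red",     250u,  250u, 0u },
    { "blue",    125u,  125u, 0u },
};

/* Advance the model by n_ticks and count each LED's toggles. */
void simulate_ticks(uint32_t n_ticks)
{
    uint32_t tick;
    uint32_t i;

    for (tick = 1u; tick <= n_ticks; tick++) {
        for (i = 0; i < NUM_LEDS; i++) {
            led_task_t *led = &g_leds[i];

            if ((int32_t)(tick - led->wake_tick) >= 0) {
                led->toggles++;
                led->wake_tick += led->period_ticks;  /* "delay" again */
            }
        }
    }
}
```

After simulate_ticks(1000), the model counts 1 toggle for green, 2 for orange, 4 for red, and 8 for blue, which is the 1:2:4:8 ratio the LEDs should show on the board.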


Gallery

Here are some snapshots from the project showing the scheduler in action and the development process.

  • Multiple LEDs running at different frequencies simultaneously
  • Hardware setup on STM32 board
  • Debugging and testing during development

The LED behavior verifies that tasks are running independently and are being scheduled correctly without blocking each other.


Challenges Faced

Building this from scratch exposed several low-level issues:

  • Tasks kept executing after calling delay because the call did not trigger an immediate context switch
  • LED flickering caused by timing misalignment and scheduler behavior
  • Incorrect unblock logic due to strict tick comparison
  • Debugging register save/restore in context switching
  • Proper initialization of task stacks with dummy exception frames

Fixing these required a deeper understanding of:

  • ARM Cortex-M exception handling
  • Stack frame structure
  • Interrupt-driven execution

What I Learned

  • How context switching works internally in an RTOS
  • Difference between MSP and PSP and why both are needed
  • How SysTick and PendSV are used for scheduling
  • Designing a non-blocking delay system
  • Debugging timing and concurrency issues in bare-metal systems

Tech Stack

  • MCU: STM32 (ARM Cortex-M)
  • Language: C (bare-metal)
  • Development: STM32CubeIDE
  • Debugging: SWD

This project helped me understand the fundamentals of how operating systems manage execution on constrained hardware, and gave me hands-on experience with low-level scheduling and context switching.
