To RTOS or not to RTOS?
In the embedded world, whether or not to use a Real-Time Operating System (RTOS) is a recurring question among engineers. The answers found online are usually opinions, offered without metrics or any scientific support. They typically list the advantages or disadvantages of an RTOS over classic round-robin systems. Engineers, however, prefer evidence to heuristics. I will try to answer the question the way I answered it for myself, and I believe this small guide will help you decide whether an RTOS is worth the effort.
Requirements of Embedded Systems
Embedded systems are often called real-time systems, and the terms are used interchangeably, although they are not always the same thing: some embedded systems are not real-time. Think, for example, of a thermometer. It contains an embedded processor, but failing to measure the temperature on time has no real impact.
One misconception is that real-time means very fast. This is misleading: real-time means deterministic. When an event happens, the system must respond within a time limit, and that limit varies with the application. In addition, if several events are processed and all of them have real-time requirements, every event must be processed within its own deadline.
As you can see, we stress the property of time. Embedded systems may have other restrictions as well, such as memory footprint, but time is the resource we need to analyze here, and it is the one that RTOSes manage.
Round-robin systems come in two varieties: fixed execution and scheduled.
A fixed-execution system is hard coded: the main loop calls a list of functions (tasks), each of which processes its own events.
Obviously the tasks should not block while waiting; they should return instead. This may add some complexity, but in general it is probably the easiest way to build the system.
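To make the fixed-execution idea concrete, here is a minimal sketch in C. The task names (uart_task, timer_task), the event flags and the counters are hypothetical and exist only for illustration; in real firmware the flags would typically be set from interrupt handlers and run_one_pass() would be called forever from main().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event flags; in firmware these would be set by interrupts. */
static volatile bool uart_event;
static volatile bool timer_event;

/* Counters that make the effect of each task observable. */
static unsigned uart_handled;
static unsigned timer_handled;

/* Each task checks for its own event and returns at once if there is none. */
void uart_task(void)
{
    if (!uart_event) {
        return;                 /* nothing pending: never block, just return */
    }
    uart_event = false;
    uart_handled++;
}

void timer_task(void)
{
    if (!timer_event) {
        return;
    }
    timer_event = false;
    timer_handled++;
}

/* The hard-coded task list, always called in the same order. */
typedef void (*task_fn)(void);
static const task_fn task_table[] = { uart_task, timer_task };

/* One pass of the main loop; firmware would wrap this in for (;;). */
void run_one_pass(void)
{
    for (size_t i = 0; i < sizeof task_table / sizeof task_table[0]; i++) {
        task_table[i]();
    }
}
```

Note that the order of execution is fixed by the table, so a pending event may have to wait for every other task in the list to run first.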
In the scheduled variety, each function is executed on demand. A top-level scheduler observes which tasks have an event to process. Each task also has its own deadline. The scheduler then executes the pending tasks in the best possible sequence, according to their priority. This adds complexity in exchange for better schedulability. Each time a task finishes, the scheduler runs and decides which task will run next.
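A scheduled, non-pre-emptive dispatcher could look roughly like the sketch below. The task_t layout, the "lower number means more urgent" priority convention and the demo tasks are my own assumptions for illustration, not the design of any particular system. A deadline-driven variant would compare deadlines instead of priorities in pick_next().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    void (*run)(void);      /* task body; must return, never block */
    volatile bool ready;    /* set when the task has an event pending */
    uint8_t priority;       /* assumed convention: lower value = more urgent */
} task_t;

/* Return the index of the most urgent ready task, or -1 if none is ready. */
int pick_next(const task_t *tasks, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!tasks[i].ready) {
            continue;
        }
        if (best < 0 || tasks[i].priority < tasks[best].priority) {
            best = (int)i;
        }
    }
    return best;
}

/* One scheduling decision: run the chosen task until it returns. */
int schedule_once(task_t *tasks, size_t count)
{
    int next = pick_next(tasks, count);
    if (next >= 0) {
        tasks[next].ready = false;  /* consume the pending event */
        tasks[next].run();          /* runs to completion: no pre-emption */
    }
    return next;
}

/* Hypothetical demo tasks that only count how often they ran. */
static unsigned slow_runs;
static unsigned fast_runs;
void demo_slow(void) { slow_runs++; }
void demo_fast(void) { fast_runs++; }
```

The main loop then simply calls schedule_once() forever. The scheduler only gets control when a task returns, which is exactly the limitation discussed next.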
This simple approach has a problem. What if one of the tasks needs so much processing time that, by the time it returns to the main loop or scheduler, another task has already missed its deadline? The maximum response time in this case is determined by the longest execution (until the CPU is released) among all the tasks. On average you may get a better response time than with plain round-robin execution, but an average value cannot be used for real-time systems.
What is RTOS?
An RTOS tries to solve the schedulability problem by pre-empting tasks. Each task is written as if it runs alone in the system, without interruption. The scheduler pre-empts each task according to some rules and returns to the same point to continue once all higher-priority events have been processed. Although this may seem complex, it is not that difficult to implement. Looking at the code of some RTOSes will help you understand the logic behind it. Beware, though, that many RTOSes are not code friendly, i.e. their code is not written to be read. For me, readability is one of the criteria for choosing an RTOS. Even if the RTOS is not perfect, if you have the source and it is readable, you can fix or improve things. If the code is unreadable, changing it is far more difficult.
In fact, even round-robin systems do pre-emption! Can you guess how? Interrupts. Interrupts do exactly the same thing; however, they are prioritized in hardware and limited in number. An RTOS thus adds one more layer of pre-emption, this time in software.
So why doesn't everyone use an RTOS? There are many reasons.
First, there is complexity. Using an RTOS may require additional code (mutexes for protection, semaphores, messages, etc.) that was not needed before.
Second, the question of which RTOS to choose is not simple. Some are free; some require expensive licenses; others have a large footprint or do not provide source code.
Third, there are resources. What if your memory budget demands a very small footprint, or your time constraints prohibit such a system? An RTOS may simply not fit.
And the list goes on.
And that’s how the question posed at the beginning arises. It seems there is no specific engineering parameter that pinpoints whether we really need an RTOS. But let’s go back to first principles. What do all these systems try to do? Share the CPU time. So is an RTOS better at schedulability than the non-pre-emptive alternatives?
Actually, there is a very good method that helps with schedulability in general. It is called the Rate Monotonic Approach (RMA). It analyzes a system to check whether its tasks can be scheduled. The inputs are parameters such as event periods, sporadic events and deadlines, from which it can be derived mathematically whether the system is schedulable. The approach works with fixed-priority schemes, in both pre-emptive and non-pre-emptive systems.
The methodology, then, is to estimate each task's worst-case execution time, gather all the deadlines, fill in the matrices and find out whether the specific system is schedulable. Analyzing the round-robin system first tells you whether it will work or whether you are stressing the system.
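As a first, quick check of this kind, one can compute the total CPU utilization and apply a sufficient schedulability test. The sketch below uses the hyperbolic bound (the product of (C_i/T_i + 1) over all tasks must not exceed 2), which I have swapped in here because it needs no math library; the classic Liu and Layland utilization bound serves the same purpose. It assumes independent periodic tasks, pre-emptive rate-monotonic priorities and deadlines equal to periods. The type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    double wcet;    /* worst-case execution time C_i */
    double period;  /* period T_i, same unit as wcet; deadline assumed == period */
} periodic_task_t;

/* Total CPU utilization U = sum of C_i / T_i. */
double total_utilization(const periodic_task_t *tasks, size_t count)
{
    double u = 0.0;
    for (size_t i = 0; i < count; i++) {
        u += tasks[i].wcet / tasks[i].period;
    }
    return u;
}

/* Sufficient test: true means schedulable under the stated assumptions;
 * false is inconclusive and calls for a more exact analysis. */
bool passes_hyperbolic_bound(const periodic_task_t *tasks, size_t count)
{
    double product = 1.0;
    for (size_t i = 0; i < count; i++) {
        product *= tasks[i].wcet / tasks[i].period + 1.0;
    }
    return product <= 2.0;
}
```

A utilization above 1 means the CPU is overloaded no matter how the tasks are scheduled. A failed bound with utilization below 1 only means that this quick test cannot decide.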
RMA shows that a pre-emptive system schedules better. Thus, if your round-robin system fails to be schedulable, you should try an RTOS. Of course, you have to add the context-switching time and any other overheads. If the system is then schedulable (with a safe margin), an RTOS is the solution for the given hardware. It may be that neither solution works, in which case you probably need to upgrade the hardware.
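The comparison can be sketched with the standard fixed-priority response-time recurrence, R = C_i + B_i + sum over higher-priority tasks j of ceil(R / T_j) * C_j, iterated until it stops changing. In the sketch below the names, the per-job overhead parameter (standing in for context-switch cost) and the microsecond units are my own assumptions. Passing the longest lower-priority execution time as the blocking term is only a rough model of a run-to-completion system, not a full non-pre-emptive analysis.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t wcet;    /* worst-case execution time, in microseconds */
    uint32_t period;  /* period in microseconds; deadline assumed == period */
} rta_task_t;

/* tasks[] is sorted by priority, index 0 being the highest.
 * blocking: longest time a lower-priority task can hold the CPU.
 * overhead: assumed per-job scheduling cost added to every job.
 * Returns true and writes the response time if the deadline is met. */
bool response_time(const rta_task_t *tasks, size_t i,
                   uint32_t blocking, uint32_t overhead, uint32_t *out)
{
    uint32_t r = tasks[i].wcet + overhead + blocking;
    for (;;) {
        uint32_t next = tasks[i].wcet + overhead + blocking;
        for (size_t j = 0; j < i; j++) {
            /* number of releases of task j within r, rounded up */
            uint32_t releases = (r + tasks[j].period - 1) / tasks[j].period;
            next += releases * (tasks[j].wcet + overhead);
        }
        if (next > tasks[i].period) {
            return false;           /* deadline missed */
        }
        if (next == r) {
            *out = r;               /* fixed point reached */
            return true;
        }
        r = next;
    }
}
```

For a hypothetical set of three tasks with (wcet, period) of (1000, 4000), (2000, 8000) and (3500, 20000) microseconds, the pre-emptive case with no overhead gives response times of 1000, 3000 and 7500. If the highest-priority task can be blocked for 3500 microseconds by the slowest one, as in a run-to-completion system, it misses its 4000 microsecond deadline.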
Fixed priority is not the best scheduling method, but it is predictable. Earliest Deadline First (EDF) priorities are better, but RMA cannot tell how much better. Thus an EDF system will work if the fixed-priority system is also schedulable.
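For reference, the classic utilization bounds give a feel for the difference. They hold under idealized assumptions that are mine, not part of the analysis above: independent periodic tasks, pre-emptive scheduling, deadlines equal to periods and no overheads. Here C_i is the worst-case execution time and T_i the period of task i.

```latex
\begin{align*}
U &= \sum_{i=1}^{n} \frac{C_i}{T_i} \\
\text{Fixed priority (rate monotonic), sufficient:} \quad
U &\le n\left(2^{1/n} - 1\right), \qquad
\lim_{n \to \infty} n\left(2^{1/n} - 1\right) = \ln 2 \approx 0.693 \\
\text{Earliest deadline first, necessary and sufficient:} \quad
U &\le 1
\end{align*}
```

The fixed-priority bound is only sufficient, so a task set above it may still be schedulable; that has to be checked case by case.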
Of course, there are other parameters to consider, such as cost and memory footprint, but the fact is that if you need an RTOS, the question moves from “To RTOS or not to RTOS?” to “Which RTOS?”, which is a whole new story.
Examples and Experience
During my career I rarely needed an RTOS. Classic round-robin systems fit the bill very well. That is how AVRILOS came along over the years: it started on the 8051 in the ’90s and was ported in assembly to the AVR in 2000. A couple of years later, AVRILOS was rewritten in C for the AVR.
A case where I should have used an RTOS
Looking back, I can see one project where an RTOS was necessary and I failed to see it at the time. The project was a cash register. I wrote the OS, which performed dynamic scheduling with earliest-deadline-first priority but without pre-emption. The main application, which I wrote with a colleague, had to be split into sub-states and return to the main loop within the 2 ms slot allocated to every task. This made development tedious. The funny thing is that we had implemented single-task pre-emption in case a task missed its deadline. It was only a safeguard and was not really used, but the pre-emption mechanism was there. Things got worse when I had to implement the 10.4-digit BCD division, which by itself took longer than 2 ms on our Z80 core. Had we chosen to use an RTOS (or build our own), the whole system could have worked without any of the state/sub-state coding on our side. Although we had memory limitations, the extra code needed to split the main application would probably have paid for the RTOS itself, with a cleaner implementation.
A case where an RTOS would not fit
In another instance we were building the MAC layer of 802.11a/b/g with smart antennas. Our core was an ARM9 with tightly coupled memory (TCM) running at 80 MHz. The external bus ran at half that frequency, because the connected logic (FPGA) could not go faster. At that point I wondered whether an RTOS would be beneficial. The system had to respond within 2 µs of an event, which meant processing the received packet and preparing a response. With some system pipelining we could stretch this to 4-6 µs. When I tried an RTOS to measure the overhead, I found that the penalty was about 2 µs. That time cost was far too high, so I used the classic round-robin approach, which matched the system's needs well.
As you can see from these examples, the decision whether to use an RTOS did not depend on system complexity or execution speed, but on schedulability.
The question of whether to use an RTOS can largely be answered by schedulability. If the system can be scheduled safely without an RTOS, you do not need one. If not, an RTOS is the way to go. Of course there can be other reasons for the decision, such as future expansion or ready-made stacks, but these go beyond the basic principles of the decision. You can use the RMA method to provide the criteria for your decision.
Meeting Deadlines in Hard Real-Time Systems: The Rate Monotonic Approach, by Loic P. Briand and Daniel M. Roy, ISBN 0-8186-7406-7, IEEE Computer Society
AVRILOS: A simple OS for AVR microcontrollers