When an event is detected, a predictable response must be applied. Cost, while important, is not the top priority. Trade-offs are frequently made in favor of the critical aspects of the application, but designers still have to be conscious of system costs in order to maintain a workable balance. Commercial Off-the-Shelf (COTS) solutions are readily available and help keep costs under control; however, these COTS products are not shipped in high volumes, nor do they use the same components as personal computers.
For these reasons, their unit costs are higher than those of consumer-grade products.
Many applications outside military markets need critical embedded systems. Intelligent highways, rail transportation, industrial controls, medical and scientific equipment, space exploration, aviation, and many other applications fit the description of critical embedded systems. Trains continue to be a major form of freight and passenger transportation.
Nations around the world have automatic train operation (ATO) and railway operation systems with components on trains and at the wayside. Positive train control systems use technology capable of preventing train-to-train collisions. Rail safety acts have mandated the widespread installation of these systems.
Rail traffic management systems enhance interoperability and signaling throughout train control and command systems. High-speed rail makes it even more critical to improve the computing capabilities used in these systems.
The Large Hadron Collider (LHC) at CERN, used by physicists to study the smallest known particles, depends heavily on critical embedded systems for the control of the physics experiments conducted at the laboratory. Controlling the beams requires the performance and reliability found in a critical embedded system.
Synthetic Aperture Radar (SAR) systems are used for environmental monitoring, Earth-resource mapping, and military systems that require broad-area imaging at high resolutions. The imagery must often be acquired in inclement weather, and at night as well as during the day. SAR systems provide information that is critical to the safety and success of many military missions.
The best, most reliable hardware in the world is only as strong as the software that runs on the platform. While hardware failures are relatively easy to spot, software failures are not. Investigators will often state that a glitch was reported as the problem that led to a catastrophic failure.
Demand for safety-critical computer systems is increasing not only in air and ground transportation, but also in nuclear physics and critical industrial environments. Even with the best-designed hardware, how do you remove the glitches? Dewar suggests that a three-step process can help tremendously.
Step 1: Write a set of rigorous high-level requirements that can be understood thoroughly.
Step 2: Derive detailed requirements that can lead to well-written code.
Step 3: Use the detailed requirements to generate tests that check the code against the requirements.
He also adds that the single biggest change that could be made is to eliminate the tolerance for bad software. Making systems reliable is a consensus position. Both the hardware and the software teams need to be aligned on the total system design.
Fault detection and removal: the use of verification and validation techniques that effectively help to detect and remove faults before the system is used.
Fault tolerance: techniques that ensure that faults in a system do not result in system errors, or that system errors do not result in system failures.
The essential feature of safety-critical systems is that system operation is always safe. Such systems should never harm people or the system's environment, even if the system fails. Safety-critical software falls into two classes:
Primary safety-critical software: this software is usually embedded as a controller in a system.
Secondary safety-critical software: this is software that can indirectly result in injury.
For example, if software used for design has a fault, it can cause the designed system to malfunction, and this may result in injury to people. Safe operation can be achieved in the following ways:
Hazard avoidance: the system is designed so that hazards are avoided. For example, a safe cutting system may be equipped with two control buttons that have to be operated with separate hands.
Hazard detection and removal: the system is designed so that hazards are detected and removed before they result in an accident.
For example, pressure control in a chemical reactor system can detect excessive pressure and reduce it before an explosion occurs.
Damage limitation: these systems include functionality that minimizes the effects of an accident, for example automatic fire extinguisher systems.
Security has become an increasingly important attribute of systems connected to the Internet. Internet connections provide additional system functionality, but they also allow systems to be attacked by people with hostile intentions. Security is a system attribute that reflects the ability of the system to protect itself against accidental or deliberate external attacks.
Security is an important attribute for all critical systems, including systems for electronic commerce and military systems. Examples of attacks include viruses, unauthorized use of system services and data, and unauthorized modification of the system. Without a reasonable level of security, the availability, reliability, and safety of the system may be compromised if external attacks damage the system.
An external attack may cause three types of damage:
Denial of service: the system is forced into a state where its normal services become unavailable.
Corruption of programs or data: the software components of the system are damaged, affecting the reliability and safety of the system.
Disclosure of confidential information: confidential information managed by the system is exposed to unauthorized people as a consequence of the external attack.
Approaches to ensuring security include the following:
Vulnerability avoidance: the system is designed not to be vulnerable.
For example, a system that is not connected to the Internet cannot be attacked over the Internet.
Attack detection and neutralization: the system is designed so that it detects and removes vulnerabilities before any damage occurs. An example of vulnerability detection and removal is the use of a virus checker to remove infected files.
Exposure limitation: the consequences of an attack are minimized. An example of exposure limitation is taking regular system backups.
Owing to rapid progress in computer technology, improved software development methods, better programming languages, and effective quality management, the dependability of software has improved significantly in the last two decades.
In system development, special development techniques may be used to ensure that the system is safe, secure, and reliable. Three complementary approaches can be used to develop dependable software:
Fault avoidance: the design and implementation process is used to minimize programming errors and thus the number of faults in a program.
Fault detection: the verification and validation processes are designed to discover and remove faults in a program before it is deployed for operational use.
Fault tolerance: the system is designed so that faults or unexpected system behaviour during execution are detected and managed in such a way that system failure does not occur.
Redundancy and diversity are fundamental to the achievement of dependability in any system. Examples of redundancy are components of critical systems that replicate the functionality of other components, or an additional checking mechanism that is added to the system but is not strictly necessary for its basic operation. Faults can therefore be detected before they cause failures, and the system may be able to continue operating if individual components fail.
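As a rough illustration of redundancy, the sketch below runs the same computation on several replicated channels and accepts a result only when a strict majority of them agree. The function name `vote`, the three example channels, and the majority rule are assumptions made for this example; they are not taken from any particular system.

```python
from collections import Counter
from typing import Callable, Sequence


def good_channel(x: int) -> int:
    """A correctly working channel (hypothetical computation: doubling)."""
    return 2 * x


def stuck_channel(x: int) -> int:
    """Simulated fault: the channel always reports 0."""
    return 0


def crashed_channel(x: int) -> int:
    """Simulated fault: the channel fails outright."""
    raise RuntimeError("channel down")


def vote(channels: Sequence[Callable[[int], int]], x: int) -> int:
    """Run every redundant channel and return the strict-majority result."""
    outputs = []
    for channel in channels:
        try:
            outputs.append(channel(x))
        except Exception:
            # A failed channel simply contributes no vote.
            continue
    if not outputs:
        raise RuntimeError("all channels failed")
    value, count = Counter(outputs).most_common(1)[0]
    # Require agreement from more than half of ALL channels, not just
    # the ones that happened to respond.
    if count * 2 <= len(channels):
        raise RuntimeError("no majority among redundant channels")
    return value
```

With three channels, a single stuck or crashed channel is outvoted by the other two. If all channels shared the same defect, voting would not help, which is the situation that diversity is meant to address.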
If the redundant components are not identical to the components they back up, that is, if they are diverse, a common failure in identical replicated components will not result in a complete system failure.
Software engineering research aims to develop tools, techniques, and methodologies that lead to the production of fault-free software. Fault-free software is software that exactly meets its specification.
Of course, this does not mean that the software will never fail. There may be errors in the specification that are reflected in the software, or users may misunderstand or misuse the software system. To develop fault-free software, the following software engineering techniques must be used:
Dependable software processes: the use of a dependable software process with appropriate verification and validation activities can minimize the number of faults in a program and detect those that do slip through.
Quality management: the software development organization must have a development culture in which quality drives the software process. Design and development standards should be established that support the development of fault-free programs.
Formal specification: there must be a precise system specification that defines the system to be implemented.
Static verification: static verification techniques, such as the use of static analysers, can find anomalous program features that could be faults.
Strong typing: a strongly typed programming language such as Java must be used for development. With strong typing, the language compiler can detect many programming errors.
Safe programming: some programming language constructs are more complex and error-prone than others. Safe programming means avoiding, or at least minimizing, the use of these constructs.
Protected information: design and implementation processes based on information hiding and encapsulation should be followed. Object-oriented languages such as Java support this approach.
Although developing fault-free software by applying these techniques is possible, it is economically disadvantageous.
The cost of finding and removing the remaining faults rises exponentially as faults in the program are discovered and removed. As the software becomes more dependable, more tests are needed to find fewer and fewer faults.
A fault-tolerant system can continue operating even when some of its parts are faulty or unreliable. The fault-tolerance mechanisms in the system ensure that these faults do not cause system failure.
Where system failure could cause a catastrophic accident, or where a loss of system operation would cause large economic losses, it is necessary to develop a fault-tolerant system. Fault tolerance involves the following complementary activities:
Fault detection: the system must detect a fault that could cause a system failure. Generally, this is based on checking the consistency of the system state.
Damage assessment: the parts of the system state that have been affected by the fault must be identified.
Fault recovery: the system must restore its state to a known safe state, either by correcting the damaged state or by rolling back to a previously known safe state.
The first stage in ensuring fault tolerance is to detect that a fault either has occurred or will occur unless some action is taken immediately.
To achieve this, illegal values of state variables must be recognized. It is therefore necessary to define state constraints, that is, the conditions that must always hold for all legal states. If any of these predicates is false, a fault has occurred.
Damage assessment involves analyzing the system state to estimate the extent of the state corruption. The role of the damage assessment procedures is not to recover from the fault but to determine which parts of the state space have been affected by it.
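The idea of state constraints can be sketched as a table of named predicates over the system state. Everything concrete below (the `pressure_kpa` and `valve_open` variables, the numeric limits, and the constraint names) is a hypothetical example for an imaginary pressure controller, not something specified in the text. Fault detection asks whether any predicate is false; a very simple form of damage assessment reports which ones.

```python
from typing import Callable, Dict, List

State = Dict[str, float]

# Hypothetical constraints that must hold in every legal state.
CONSTRAINTS: Dict[str, Callable[[State], bool]] = {
    # Pressure reading must be physically plausible.
    "pressure_in_range": lambda s: 0.0 <= s["pressure_kpa"] <= 800.0,
    # Valve opening is a fraction between fully closed and fully open.
    "valve_fraction": lambda s: 0.0 <= s["valve_open"] <= 1.0,
    # At high pressure the relief valve must not be fully closed.
    "relief_open_when_high": lambda s: s["pressure_kpa"] < 700.0
    or s["valve_open"] > 0.0,
}


def violated_constraints(state: State) -> List[str]:
    """Damage assessment: names of the constraints the state breaks."""
    return [name for name, holds in CONSTRAINTS.items() if not holds(state)]


def fault_detected(state: State) -> bool:
    """Fault detection: true if any state constraint is false."""
    return bool(violated_constraints(state))
```

For instance, a state with `pressure_kpa` of 750 and a fully closed valve satisfies the two range checks but breaks `relief_open_when_high`, so the fault is detected and localized to the valve logic.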
Damage can only be assessed if it is possible to apply some validity function that checks whether the state is consistent.
The purpose of the fault recovery process is to modify the state of the system so that the effects of the fault are eliminated or reduced. The system can then continue to operate, perhaps in some degraded form. Forward recovery tries to correct the damaged system state and to create the intended state.
Forward recovery is only possible where the state information includes built-in redundancy. Backward recovery restores the system state to a known correct state. For example, most database systems include backward error recovery. When a user starts a database operation, a transaction is initiated. The changes made during that transaction are not immediately incorporated in the database. The database is only updated after the transaction has finished and no problems have been detected.
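The transaction behaviour described here can be imitated in a few lines. This is only a sketch of the idea, assuming an in-memory dictionary as the "database"; the `Store` class and `transaction` helper are hypothetical names, and real database systems use far more elaborate mechanisms such as logging.

```python
import copy
from contextlib import contextmanager


class Store:
    """A toy in-memory 'database' holding a dictionary of values."""

    def __init__(self, data):
        self.data = dict(data)


@contextmanager
def transaction(store):
    """Let the caller modify a working copy; commit only on success."""
    working = copy.deepcopy(store.data)
    # If the body of the `with` block raises, the exception surfaces at
    # this yield, the commit below is skipped, and store.data keeps its
    # last known correct state (backward recovery).
    yield working
    store.data = working


store = Store({"balance": 100})

# Successful transaction: the change is committed.
with transaction(store) as account:
    account["balance"] -= 30

# Failed transaction: the change is discarded.
try:
    with transaction(store) as account:
        account["balance"] -= 500
        if account["balance"] < 0:
            raise ValueError("overdrawn")
except ValueError:
    pass
```

After both transactions, `store.data` still holds the state committed by the first one, because the second never reached its commit point.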
If the transaction fails, the database is not updated.
Real-time embedded systems differ significantly from other types of software systems. Their correct operation depends on the system responding to events within a short time interval. A real-time system can be briefly defined as follows:
A real-time system is a software system where the correct operation of the system depends on the results produced by the system and the time at which these results are produced. Timely response is an important factor in all embedded systems but, in some cases, very fast response is not necessary.
A real-time system must produce a corresponding response for each input stimulus. The behaviour of a real-time system can therefore be defined by listing the stimuli received by the system, the associated responses, and the time at which each response must be produced. Stimuli fall into two classes: periodic and aperiodic. The responses of the system are transmitted to actuators that may control some equipment.
Aperiodic stimuli may be generated either by the actuators or by sensors. This sensor-system-actuator model of an embedded real-time system is illustrated in the figure.
A real-time system must be able to respond to stimuli that occur at different times. Its architecture should therefore be designed so that, as soon as a stimulus is received, control is transferred to the correct handler. This cannot be achieved using sequential programs.
Consequently, real-time systems are normally designed as a set of concurrent, cooperating processes. To manage these concurrent processes, most real-time systems include a real-time operating system. The stimulus-response model of a real-time system consists of three kinds of process: a sensor management process for each type of sensor, computational processes that compute the required response to the stimuli received by the system, and actuator control processes that manage the operation of the actuators. This model enables rapid collection of data from the sensors and allows the computational processing and actuator responses to be carried out later.
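The three kinds of process can be sketched with threads connected by queues. This is an illustrative structure only: Python threads give no real-time guarantees, and the sample values, the threshold, and the `OPEN_VALVE` / `HOLD` commands are hypothetical. The point is the decoupling: the sensor process hands data off quickly, and computation and actuation proceed at their own pace.

```python
import queue
import threading

STOP = object()  # sentinel that tells the next stage to shut down


def sensor_process(samples, raw_q):
    """Collect readings and pass them on without doing any computation."""
    for sample in samples:
        raw_q.put(sample)
    raw_q.put(STOP)


def compute_process(raw_q, cmd_q, threshold):
    """Turn each stimulus into the response the actuator should carry out."""
    while True:
        sample = raw_q.get()
        if sample is STOP:
            cmd_q.put(STOP)
            return
        cmd_q.put("OPEN_VALVE" if sample > threshold else "HOLD")


def actuator_process(cmd_q, log):
    """Apply commands; here they are only recorded in a list."""
    while True:
        command = cmd_q.get()
        if command is STOP:
            return
        log.append(command)


def run_pipeline(samples, threshold=50.0):
    """Start the three processes, wait for them, and return the command log."""
    raw_q, cmd_q, log = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=sensor_process, args=(samples, raw_q)),
        threading.Thread(target=compute_process, args=(raw_q, cmd_q, threshold)),
        threading.Thread(target=actuator_process, args=(cmd_q, log)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log
```

Because each queue has a single producer and a single consumer, commands reach the actuator in the same order as the samples that caused them.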
When designing a real-time system, it is first necessary to decide which system capabilities are to be implemented in software and which in hardware.
The design process for real-time software then focuses on the stimuli rather than on objects and functions. The design process has a number of overlapping stages:
Aggregation of the stimulus and response processing into a number of concurrent processes. It is usual in real-time systems design to associate a concurrent process with each class of stimulus and response, as shown in the figure.
Processes must be coordinated in a real-time system.
Process coordination mechanisms ensure mutual exclusion when processes access shared resources. Once the process architecture has been designed and the scheduling policy has been decided, it should be checked that the system will meet its timing requirements.
Timing constraints or other requirements often mean that some system functions, such as signal processing, should be implemented in hardware rather than in software.
Hardware components can provide better performance than the equivalent software.
Real-time systems have to respond to events occurring at irregular intervals. These stimuli often cause the system to move to a new state. For this reason, state machine models are often used to model real-time systems, and they are an effective way to represent the design of such a system. The UML supports the development of state models based on statecharts.
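A state machine model can be written down directly as a transition table. The states (`idle`, `running`, `fault`) and stimuli (`start`, `stop`, `overheat`, `reset`) below are hypothetical, chosen only to illustrate the technique; in this sketch a stimulus with no entry in the table is assumed to leave the state unchanged.

```python
from typing import Dict, Iterable, Tuple

# (current state, stimulus) -> next state, for a hypothetical controller.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("idle", "start"): "running",
    ("running", "stop"): "idle",
    ("running", "overheat"): "fault",
    ("fault", "reset"): "idle",
}


def next_state(state: str, stimulus: str) -> str:
    """Return the state reached when `stimulus` arrives in `state`."""
    # Stimuli that are not meaningful in the current state are ignored.
    return TRANSITIONS.get((state, stimulus), state)


def run(stimuli: Iterable[str], state: str = "idle") -> str:
    """Feed a sequence of stimuli to the machine and return the final state."""
    for stimulus in stimuli:
        state = next_state(state, stimulus)
    return state
```

Keeping the transitions in a single table makes it easy to review which stimuli are handled in each state, for example that `start` has no effect while the machine is in `fault`.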
A state model of a system assumes that, at any time, the system is in one of a number of possible states. When a stimulus is received, it may cause a transition to a different state.
Most embedded systems have real-time performance constraints, which means they have to work in conjunction with a real-time operating system (RTOS). A real-time operating system guarantees a certain capability within a specified time constraint. It manages processes and resource allocation in a real-time system.