Department of Computer Engineering (CE)
Iran University of Science and Technology (IUST)

Course Webpage:

Dependable Software Design

Instructor: M. Abdollahi Azgomi

Course Objective:

The objective of this course is to familiarize students with the concepts of software dependability and the issues related to it. To this end, the course first reviews system dependability and fault-tolerant computing. It then presents the main topics of software fault tolerance. To gain a deeper understanding and practical experience, students are guided through exercises and research projects in this area.


Main Books:

     E. Dubrova, Fault-Tolerant Design: An Introduction, Kluwer Academic Publishers (2005)

     M.R. Lyu, Software Fault Tolerance, John Wiley & Sons (2005)

Other Books:

     L.L. Pullum, Software Fault Tolerance: Techniques and Implementation, Artech House, Norwood (2001)

     B.W. Johnson, Design and Analysis of Fault-Tolerant Digital Systems, Addison-Wesley (1989)

     D. K. Pradhan (ed.), Fault-Tolerant Computer System Design, Prentice-Hall, 1st ed. (1996)

     M. Xie, Y.-S. Dai and K.-L. Poh, Computing System Reliability: Models and Analysis, Kluwer Academic Publishers (2004)

     D. Crowe (ed.), Design for Reliability, CRC Press (2001)

     M. L. Shooman, Reliability of Computer Systems and Networks: Fault Tolerance, Analysis and Design, Wiley Interscience (2002)

     J.-C. Geffroy and M. Gilles, Design of Dependable Computing Systems, Kluwer Academic Publishers (2002)



INTRODUCTION [Dubrova, Ch. 1]

HARDWARE REDUNDANCY [Dubrova, Ch. 4]

SOFTWARE REDUNDANCY [Dubrova, Ch. 7]

From Lyu's Book:

     Chapter 1. The Evolution of the Recovery Block Concept

     Chapter 2. The Methodology of N-Version Programming

     Chapter 3. Architectural Issues in Software Fault Tolerance

     Chapter 4. Exception Handling and Tolerance of Software Faults

     Chapter 5. Dependability Modeling for Fault-Tolerant Software and Systems

     Chapter 6. Analyses Using Stochastic Reward Nets

     Chapter 7. Checkpointing and the Modeling of Program Execution Time

     Chapter 8. The Distributed Recovery Block Scheme

     Chapter 9. Software Fault Tolerance by Design Diversity

     Chapter 10. Software Fault Tolerance in the Application Layer

     Chapter 11. Software Fault Tolerance in Computer Operating Systems

     Chapter 12. The Cost Effectiveness for Telecommunication Service Dependability

     Chapter 13. Software Fault Insertion Testing for Fault Tolerance

