Distributed Systems Introduction Venus Samawi Isra University 1 Content • Distributed Computing & Distributed System • Advantages and Disadvantages of Distributed Systems • Characteristics of Distributed Systems • Examples of distributed systems • Resource Sharing & The Web • WWW 2 Distributed Computing & Distributed System • Early computing was performed on a single processor. Uni processor computing can be called centralized computing. • Twoadvances in technology began to change the situation • Thedevelopment of powerful microprocessors • Theinvention of high-speed computer networks • Distributed computing is computing performed in a distributed system. • ADistributed system is: • Acollection of independent computers, interconnected via a network, capable of collaborating on a task, • One in which components located at networked computers communicate and coordinate their actions only by message passing, • Acollection of independent computers, that appears to the users of the system as a single computer. 3 Advantages of Distributed Systems • The motivation for constructing and using distributed systems stems from a desire to share resources. • Economics: distributed systems allow the pooling of resources, including • CPUcycles, data storage, input/output devices, and services. • Distributed systems have a better price/performance ratio • Concurrency: we can solve the problem more quickly using several processors concurrently. • Some applications are inherently distributed : In some problems the most natural solution is to use separate parallel processes to perform the subtasks of the given problem. • Computer Supported Cooperative Work 4 Advantages of Distributed Systems (Cont.) • Communication: make human-to-human communication easier, for example by electronic mail. • Reliability: a distributed system allow replication of resources and/or services, thus reducing service outage due to failures. • Incremental growth: computing power can be added in small • increments. 5 Disadvantages of Distributed Computing • Multiple Points of Failures: the failure of one or more participating computers, or one or more network links, can spell trouble. • Security Concerns: In a distributed system, there are more opportunities for unauthorized attack. 6 Characteristics of Distributed Systems • Concurrency of components (or Parallel activities) • Autonomous components executing concurrent tasks • Lack of a global clock (only limited precision for processes to synchronize) • No global state: No single process can have knowledge of the Current global state of the system • Independent failures of components 7 Examples of distributed systems • The following examples are based on familiar and widely used computer networks: the Internet, intranets and the emerging technology of networks based on mobile devices. • Internet: A huge interconnected collection of computer networks of many different types. • The Internet is a vast interconnected collection of computer networks. • Programs running on computers connected to it interact by passing messages. • It is a "network of networks" that consists of millions of private and public, academic, business, and government networks of local to global scope that are linked by copper wires, fiber-optic cables, wireless connections, and other technologies. • It enables users to make use of services such as the World Wide Web, e-mail and file transfer. 8 Examples of Distributed Systems—Cont. • Intranet: A portion of the Internet (a network of computers and workstations within an organization) managed by an organization that can be configured to enforce local security policies. • Composed of several LANs linked by backbone connections. • Isolated from the Internet via a protective device (a firewall). • Mobile and ubiquitous computing: the integration of small and portable computing devices into distributed systems. These devices include: Laptop computers, Handheld devices (PDAs, mobile phones, pagers, video cameras and digital cameras), smart watches, devices embedded in appliances such as cars, washing machines. • Ubiquitous is intended to mean that small computing devices will eventually become so pervasive in everyday objects that they are scarcely noticed. 9 Internet • It enables users to use services such as WWW, email, file transfer, Multimedia services, etc. • Programs running on the computers connected to the Internet interact by passing messages. • The design and construction of the Internet protocols enables a program running anywhere to address messages to programs anywhere else. Note: Some times Web is incorrectly used to mean the Internet. • Internet Service Providers (ISPs) are companies that provide modem links and other types of connection to users and small organizations, enabling them to • Access services in the internet as well as • Providing local services such as email and web hosting 10 11 Intranets 12 13 Mobile and Ubiquitous Computing 14 • Gateways adjust traffic between dissimilar networks, while routers adjust traffic between similar networks • Since TCP/IP is the main Internet protocol, a router could be used to connect the network to the Internet. 15 Resource Sharing & The Web 16 Resource Sharing & The Web (cont.) • AService is a distinct part of a computer system that manages a collection of related resources and presents their functionality to users and applications. • For example, shared files are accessed through a file service. • The service access is via the set of operations that it exports. • e.g. A file service provides read, write and delete operations on files. • Server refers to a process (running program) on a network computer that accepts requests from programs running on other computers to perform a service and responds appropriately. • The requesting processes are referred to as a clients. 17 Resource Sharing & The Web (cont.) • Requests are sent in massages from clients to a server • Replies are sent in massages from the server to the clients. The client invokes an operation upon the server means, the client sends a request for an operation to be carried out by a server. • Remote invocation: A complete interaction between a client and server, from the point when the client sends the request until receiving the server’s response, • Although in everyday parlance the terms ‘client’ and ‘server’ refer to the computers themselves. • In this topic, the terms ‘client’ and ‘server’ refer to processes rather 18 than the computers that they execute upon. Resource Sharing & The Web (cont.) • In a distributed system written in an object-oriented language, • Resources may be encapsulated as objects and accessed by client objects, in which case we speak of a client object invoking a method upon a server object. • An executing web browser is an example of a client. • The web browser communicates with a web server, to request web pages from it. 19 The World Wide Web (WWW) 20 Web servers and web browsers 21 Distributed Systems Architecture & Challenges Venus Samawi Isra University 1 Content • Networks vs. Distributed Systems • Architectures of Distributed Systems • How to characterize a distributed system? • Distributed Systems: Challenges 2 Networks vs. Distributed Systems • Networks: A media for interconnecting local and wide area computers and exchange messages based on protocols. • Network entities are visible and they are explicitly addressed (IP address). • Distributed System: existence of multiple autonomous computers is transparent • However, • many problems (e.g., openness, reliability) in common, but at different levels. • Networks focuses on packets, routing, etc., whereas distributed systems focus on applications. • Every distributed system relies on services provided by a computer network. Distributed Systems Computer Networks Architectures of Distributed Systems • Multiprocessor systems • Shared memory • Tightly coupled system • Easier to program • Bus-based interconnection network • E.g. SMPs (symmetric multiprocessors) with two or more CPUs • Multicomputer systems / Clusters • No shared memory • Homogeneous in hard- and software • Massively Parallel Processors (MPP) • Loosely coupled system • PC/Workstation clusters(each has its own memory And CPU. • High-speed networks/switches-based connection. How to characterize a distributed system? • Computers in distributed systems may be on separate continents, in the same building, or the same room. DSs have the following consequences: • Concurrency – each system is autonomous. • Carry out tasks independently • Tasks coordinate their actions by exchanging messages. • Heterogeneity • No global clock • Independent Failures • Prime motivation: to share resources 6 Selected application domains & Associated networked applications Finance and commerce eCommercee.g. Amazon and eBay, PayPal, online banking and trading The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebookand Twitter. Creative industries and entertainment Online gaming, music and film in the home, user generated content, e.g. YouTube, Flickr Healthcare Health informatics, on online patient records, monitoring patients. Education e-learning, virtual learning environments; distance learning. e.g., Coursera Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth Science and Engineering Cloud computing as an enabling technology for collaboration between scientists (LHC, LIGO) Environmental management Sensor networks to monitor earthquakes, floods or tsunamis (Bureau of Meteorology flood warning system) Business Example –Challenges • What if • Your customer uses a completely different hardware? (PC, MAC, iPad, Mobile…) • … a different operating system? (Windows, Unix,…) • … a different way of representing data? (ASCII,…) • Heterogeneity • Or • You want to move your business and computers to the Caribbean (because of the weather or low tax)? • Your client moves to the Caribbean (more likely)? • Distribution transparency Middleware – software layer providing: • masking heterogeneity of: * underlying networks * hardware * operating systems Business Example –Challenges What if • Two customers want to order the same item at the same time? • Concurrency • Or • The database with your inventory information crashes? • Your customer’s computer crashes in the middle of an order? • Fault tolerance What if • Someone tries to break into your system to steal data? • … sniffs for information? • … your customer orders something and doesn’t accept the delivery saying he didn’t? • Security • Or • You are so successful that millions of people are visiting your online store at the same time? • Scalability Distributed Systems: Overview of Challenges • Heterogeneity • Heterogeneous components must be able to interoperate • Distribution transparency • Distribution should be hidden from the user as much as possible • Fault tolerance • Failure of a component (partial failure) should not result in failure of the whole system • Scalability • System should work efficiently with an increasing number of users • System performance should increase with inclusion of additional resources • Concurrency • Shared access to resources must be possible • Openness • Interfaces should be publicly available to ease inclusion of new components • Security • The system should only be used in the way intended Distributed Systems: Overview of Challenges • Heterogeneous components must be able to interoperate across different: • Operating systems • Hardware architectures • Communication architectures • Programming languages • Software interfaces • Security measures • Information representation Distribution Transparency I • To hide from the user and the application programmer the separation/distribution of components, so that the system is perceived as a whole rather than a collection of independent components. Forms of transparencies: • Access transparency • Access to local or remote resources is identical • E.g. Network File System / Dropbox • Location transparency • Access without knowledge of location • E.g. separation of domain name from machine address. • Failure transparency • Tasks can be completed despite failures • E.g. message retransmission, failure of a Web server node should not bring down the website. Distribution Transparency II • Replication transparency • Access to replicated resources as if there was just one. • provide enhanced reliability and performance without knowledge of the replicas by users or application programmers. • Migration (mobility/relocation) transparency • Allow the movement of resources and clients within a system without affecting the operation of users or applications. • E.g. switching from one name server to another at runtime; migration of an agent/process from one node to another. Distribution Transparency III • Concurrency transparency • A process should not notice that there are other sharing the same resources • Performance transparency: • Allows the system to be reconfigured to improve performance as loads vary • E.g., dynamic addition/deletion of components, switching from linear structures to hierarchical structures when the number of users increase • Scaling transparency: • Allows the system and applications to expand in scale without changes in the system structure or the application algorithms. • Application level transparencies: • Persistence transparency • Masks the deactivation and reactivation of an object • Transaction transparency • Hides the coordination required to satisfy the transactional properties of operations Fault Tolerance • Failure: an offered service no longer complies with its specification (e.g., no longer available or very slow to be usable) • Fault: cause of a failure (e.g. crash of a component) • Fault tolerance: no failure despite faults i.e., programmed to handle failures and hides them from users. Fault Tolerance Mechanisms • Fault detection Checksums, … • Fault masking Retransmission of corrupted messages, redundancy, … • Fault toleration Exception handling, timeouts,… • Fault recovery Rollback mechanisms,… Scalability • System should work efficiently at many different scales, ranging from a small Intranet to the Internet • Remains effective when there is a significant increase in the number of resources and the number of users • Challenges of designing scalable distributed systems: • Cost of physical resources • Cost should linearly increase with system size • Preventing software resources running out: • Numbers used to represent Internet addresses • Avoiding performance bottlenecks Concurrency • Provide and manage concurrent access to shared resources: • Fair scheduling • Preserve dependencies (e.g. distributed transactions -- buy a book using Credit card, make sure user has sufficient funds prior to finalizing order ) • Avoid deadlocks Openness and Interoperability Chrome (Google) Client1 in C Client in Python IE (Microsoft) Server in Java • Open system: A system that implements sufficient open specifications for: • Interfaces, • Services, and • supporting formats to • Enable properly engineered applications software to be ported across a wide range of systems with minimal changes, • Interoperate with other applications on local and remote systems, and • Interact with users in a style which facilitates user portability Security • Resources are accessible to authorized users and used in the way they are intended • Confidentiality • Protection against disclosure to unauthorized individual information • E.g. ACLs (access control lists) to provide authorized access to information • Integrity • Protection against alteration or corruption • E.g. changing the account number or amount value in a money order • Availability • Protection against denial of service (DoS) attacks to the resources. • E.g. denial of service (DoS) attacks • Non-repudiation • Proof of sending / receiving an information (E.g. digital signature) Security Mechanisms • Encryption (E.g. Blowfish, RSA) • Authentication (E.g. password, public key authentication) • Authorization (E.g. access control lists) Distributed Systems System Model Venus Samawi Isra University 1 Content • Physical Models: • Three Generations of DS: Early, Internet-Scale, Contemporary • Architectural Models • Software Layers • System Architectures • Client-Server • Clients and a Single Server, Multiple Servers, Proxy Servers with Caches, Peer Model • Alternative Client-Sever models driven by: • Mobile code, mobile agents, network computers, thin clients, mobile devices, and spontaneous networking • Design Challenges/Requirements • Fundamental Models – formal description • Interaction, failure, and security models. • Summary 2 Introduction • Distributed systems should be designed to function correctly in ALL circumstances/scenarios. • Distributed system models helps in… • ..classifying • ..identifying • ..crafting and understanding different implementations their weaknesses and their strengths new systems outs of pre-validated building blocks • We will study distributed system models from different perspectives • Structure, organization, and placement of components • Interactions • Fundamental properties of systems Models Physical, Architectural, and Fundamental Models Characterization: Challenges (Difficulties and Threats) • Widely varying models of use • High variation of workload, partial disconnection of components, or poor connection. • Wide range of system environments • Heterogeneous hardware, operating systems, network, and performance. • Internal problems • Non synchronized clocks, conflicting updates, various hardware and software failures. • External threats • Attacks on data integrity, secrecy, and denial of service. Characterization: Dealing with Challenges • Observations • Widely varying models of use • The structure and the organization of systems allow for distribution of workloads, redundant services, and high availability. • Wide range of system environments • A flexible and modular structure allows for implementing different solutions for different hardware, OS, and networks. • Internal problems • The relationship between components and the patterns of interaction can resolve concurrency issues, • Structure and organization of component can support failover mechanisms. • External threats • Security has to be built into the infrastructure and it is fundamental for shaping the relationship between components. Physical Models • A representation of the underlying H/W elements of a DS that abstracts away specific details of the computer/networking technologies. • Baseline physical model – minimal physical model of a distributed system as an extensible set of computer nodes interconnected by a computer network for the required passing of messages Three Generations of DSs (Distributed Systems) • Three generations of distributed systems 1- Early distributed systems • 10 and 100 nodes interconnected by a local area network • limited Internet connectivity • supported a small range of services e.g. • shared local printers • File servers • email • file transfer across the Internet Three Generations of DSs(Distributed Systems) 2- Internet-scale distributed systems • Extensible set of nodes interconnected by a network of networks (the Internet) 3- Contemporary DS with hundreds of thousands nodes + emergence of: • Mobile computing • laptops or smart phones may move from location to location – need for added capabilities (service discovery; support for spontaneous interoperation) • Ubiquitous computing • Computers are embedded everywhere • Cloud computing • pools of nodes that together provide a given service Architectural model • An Architectural model of a distributed system is concerned with the placement of its parts and relationship between them. Examples: • Client-Server (CS) and Peer Process models. • CS can be modified by: • The partitioning of data/replication at cooperative servers(performance and reliability reasons) • The caching of data by proxy servers or clients • The use of mobile code and mobile agents • The requirements to add or remove mobile devices. Fundamental Models • Fundamental Models are concerned with a formal description of the properties that are common in all of the architectural models • Models addressing time synchronization, message delays, failures, security issues are addressed as: • Interaction Model– deals with performance and the difficulty of setting of time limits in a distributed system. • Failure Model– specification of the faults that can be exhibited by processes • Security Model– discusses possible threats to processes and communication channels. Architectural Models 13 Architectural Models Architectural Elements • What are the entities that are communicating in the distributed system? • How do they communicate, or, more specifically, what communication paradigm (model) is used? •What (potentially changing) roles and responsibilities do they have in the overall architecture? • How are they mapped on to the physical distributed infrastructure (what is their placement)? Architectural Models –Cont. • The architecture of a system is its structure in terms of separately specified components. • Its goal is to meet present and likely future demands(What might suit 10 users is terrible for 10,000 users). • Major concerns are making the system reliable, manageable, adaptable, and cost effective. • Architectural Model: • Simplifies and abstracts the functions of individual components • The placement of the components across a network of computers – define patterns for the distribution of data and workloads. • The interrelationship between the components – i.e., functional roles and the patterns of communication between them. • A communication pattern is a pattern on messages exchanged in a distributed computation. • An input pattern is a vector made up of the input parameters of the processes involved in a distributed computation. Architectural Models –Cont. • Architectural Model - simplifies and abstracts the functions of individual components: • An initial simplification is achieved by classifying processes as: • Server processes • Client processes client server • Peer processes peer • Cooperate and communicate in a symmetric manner to perform a task. peer Software Architecture &Layers • The term software architecture referred: • Originally to the structure of software as layers or modules in a single computer. • More recently in terms of services offered and requested between processes in the same or different computers. • Breaking up the complexity of systems by designing them through layers and services • Layer: a group of related functional components • Service: functionality provided to the next layer. Layer N … Layer 2 (services offered to above layer) Layer 1 Software and hardware service layers in distributed systems Applications, services Computer and network hardware Platform Operating system Middleware Platform • The lowest hardware and software layers are often referred to as a platform for distributed systems and applications. • These low-level layers provide services to the layers above them, which are implemented independently in each computer. • Major Examples • Intel x86/Windows • Intel x86/Linux • Intel x86/Solaris • PowerPC/MacOS • iPhone/iOS • Samsung Galaxy/Android Middleware • A layer of software whose purpose is to mask heterogeneity present in distributed systems and to provide a convenient programming model to application developers. • Major Examples: • Sun RPC (Remote Procedure Call) • OMGCORBA (Common Object Request Broker Architecture) • Microsoft D-COM (Distributed Components Object Model) • Sun Java RMI (Remote Method Invocation) • Modern Middleware Examples: • ManjrasoftAneka– for Cloud computing • IBM WebSphere • Microsoft .NET • Sun J2EE • Google AppEngine • Microsoft Azure System Architecture • The most evident aspect of DS design is the division of responsibilities between • System components (applications, servers, and other processes) and • The placement of the components on computers in the network. • It has major implication for: • Performance, reliability, and security of the resulting system. Client-Server Basic Model: Clients invoke individual servers Client invocation result Client invocation Server Key: result Process: Server Computer: • Client processes interact with individual server processes in a separate computer in order to access data or resource. The server in turn may use services of other servers. • Example: • A Web Server is often a client of file server. • Browser  search engine -> crawlers  other web servers. Client-Server Architecture Types (Tier arch compliments layer architecture) • Two-tier model (classic) client server • Three-tier (when the server, becomes a client) client Server/client server • Multi-tier (cascade model) client Server/client server Server/client server Clients and Servers • General interaction between a client and a server. A service provided by multiple servers Service Client Server Client Server Server • Services may be implemented as several server processes in separate host computers. • Example: Cluster based Web servers and apps such as Google, parallel databases Oracle Proxy servers (replication transparency) and caches: Web proxy server Client Client • A cache is a store of recently used data. Web Proxy server server Web server Peer Processes: A distributed application based on peer processes Peer 2 Peer 1 Application Sharable objects Application Peer 3 Application Peer 4 Application Peers 5 .... N • All of the processes play similar roles, interacting cooperatively as peers to perform distributed activities or computations without distinction between clients and servers. E.g., music sharing systems Napster, Gnutella, Kaza, BitTorrent. P2P with a Centralized Index Server (e.g. Napster Architecture) peer peer peer peer peer peer peer Variants of Client Sever Model: Mobile code and Web applets a) client request results in the downloading of applet code Client Applet code b) client interacts with the applet Client Applet Web server Web server An applet is any small application which, • performs one specific task that • runs within the scope of a dedicated widget engine (software platform on which desktop or web widgets run ) or a larger program, often as a plug-in • Applets downloaded to clients give good interactive response • Mobile codes such as Applets are potential security threat, therefore the browser gives applets limited access to local resources (e.g. NO access to local/user file system). Variants of Client Sever Model: Mobile Agents • A running program (code and data) that travels from one computer to another in a network carrying out an autonomous task, usually on behalf of some other process • advantages: flexibility, savings in communications cost • virtual markets, software maintain on the computers within an organisation. • Potential security threat to the resources in computers they visit. The environment receiving agent should decide which of the local resource to allow. (e.g., crawlers and web servers). • Agents themselves can be vulnerable – they may not be able to complete task if they are refused access. • Example technology: • Java Agent Development Framework (JADE) Thin clients and compute servers Network computer or PC Thin Client Compute server network Application Process • Network computer: download OS and applications from the network and run on a desktop (solve up-gradation problem) at runtime. • Thin clients: work by connecting remotely to a server-based computing environment where most applications, sensitive data, and memory, are stored. • The server does most of the work, which can include launching software programs, performing calculations, and storing data. • Windows-based UI on the user machine and application execution on a remote computer. E.g, X-11 system. Mobile devices and spontaneous networking [3rd Generation Distributed System] • The world is increasingly populated by small and portable computing devices. • W-LAN needs to handle constantly changing heterogeneous, roaming devices • Need to provide discovery services: (1) registration service to enable servers to publish their services and (2) lookup service to allow clients to discover services that meet their requirements. Summary -Models and Implications • The use of CS (Client-Server) has impact on the software architecture followed: • Distribution of responsibilities • Synchronization mechanisms between client and server • Admissible types of requests/responses • Basic CS model, responsibility is statically allocated. • File server is responsible for file, not for web pages. • Peer process model, responsibility is dynamically allocated: • In fully decentralized music file sharing system, search process may be delegated to different peers at runtime. Design Requirements/Challenges of Distributed Systems • Performance Issues • Responsiveness • Support interactive clients • Use caching and replication • Throughput • Load balancing and timeliness • Quality of Service: • Reliability • Security • Adaptive performance. • Dependability issues: • Correctness, security, and fault tolerance • Dependable applications continue to work in the presence of faults in hardware, software, and networks. Presentation Outline • Introduction • Architectural Models • Software Layers • System Architectures • Client-Server • Clients and a Single Sever, Multiple Servers, Proxy Servers with Caches, Peer Model • Alternative Client-Sever models driven by: • Mobile code, mobile agents, network computers, thin clients, mobile devices and spontaneous networking • Design Challenges/Requirements • Fundamental Models –formal description • Interaction, Failure, and Security models. • Summary Fundamental Models at Glance • Fundamental Models are concerned with a formal description of the properties that are common in all of the architectural models • All architectural models are composed of processes that communicate with each other by sending messages over a computer networks. • Models addressing time synchronization, message delays, failures, security issues are addressed as: • Interaction Model – deals with performance and the difficulty of setting of time limits in a distributed system. • Failure Model – specification of the faults that can be exhibited by processes • Security Model – discusses possible threats to processes and communication channels. Interaction Model • Computation occurs within processes; • The processes interact by passing messages, resulting in: • Communication (information flow) • Coordination (synchronization and ordering of activities) between processes. • Two significant factors affecting interacting processes in a distributed system are: • Communication performance is often a limiting characteristic. • It is impossible to maintain a single global notion of time. Interaction Model: Performance of Communication Channel • Communication over a computer network has performance characteristics: • Latency: • A delay between the start of a message’s transmission from one process to the beginning of reception by another. • Bandwidth: • the total amount of information that can be transmitted over in a given time. • Communication channels using the same network, have to share the available bandwidth. • Jitter • The variation in the time taken to deliver a series of messages. It is very relevant to multimedia data. • Is the variation in the time between data packets arriving, caused by • network congestion, or • route changes. • The longer data packets take to transmit, the more jitter affects audio quality. The standard jitter measurement is in milliseconds (ms) Interaction Model: Computer clocks and timing events • Each computer in a DS has its own internal clock, which can be used by local processes to obtain the value of the current time. • Therefore, two processes running on different computers can associate timestamp with their events. • However, even if two processes read their clocks at the same time, their local clocks may supply different time. • This is because computer clock drifts from perfect time and their drift rates differ from one another. • Even if the clocks on all the computers in a DS are set to the same time initially, their clocks would eventually vary quite significantly unless corrections are applied. Interaction Model: Two variants of the interaction model • In a DS it is hard to set time limits on the time taken for process execution, message delivery or clock drift. • Synchronous DS – hard to achieve: • The time taken to execute a step of a process has known lower and upper bounds. • Each message transmitted over a channel is received within a known bounded time. • Each process has a local clock whose drift rate from real time has known bound. • Asynchronous DS: There is NO bounds on: • Process execution speeds • Message transmission delays • Clock drift rates. Interaction Model: Event Ordering • In many DS applications we are interested in knowing whether an event occurred • before, • after, or • concurrently with another event at other processes. • The execution of a system can be described in terms of events and their ordering despite the lack of accurate clocks. Failure Model • In a DS, both processes and communication channels may fail – i.e., they may depart from what is considered to be correct or desirable behavior (define and classify the faults). • Types of failures: • Omission Failure(refer to cases where the process or communication channel fails to perform a requested action) • Arbitrary Failure(where any type of error can occur. Corrupt data, unexpected responses) • Timing Failure(related to synchronous messages, where a set bound [clock drift, message ack, process execution time] exceeds defined bounds) Processes and channels process p send process q m receive Communication channel Outgoing message buffer Incoming message buffer • Communication channel produces an omission failure if it does not transport a message from “p”s outgoing message buffer to “q”’s incoming message buffer. This is known as “dropping messages” and is generally caused by • a lack of buffer space at the receiver or at gateway or by a network transmission error. Omission and arbitrary failures Class of failure Affects Description Fail-stop Process Process halts and remains halted. Other processes may detectthis state. Crash Process Process halts and remains halted. Other processes may not be able to detect this state. Omission Channel A message inserted in an outgoing message buffer never arrivesat the other end’s incoming message buffer. Send-omission Process A process completes a send, but the message is not put in its outgoing message buffer. Receive-omission Process A message is put in a process’s incoming message buffer, but that process does not receive it. Arbitrary (Byzantine) Process or channel Process/channel exhibits arbitrary behaviour: it may send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step. Timing failures Class of Failure Affects Description Clock Process Process’s local clock exceeds the bounds on its rate of drift from real time. Performance Process Process exceeds the bounds on the interval between two steps. Performance Channel A message’s transmission takes longer than the stated bound. Masking Failures • It is possible to construct reliable services from components that exhibit failures. • For example, multiple servers that hold replicas of data can continue to provide a service when one of them crashes. • A knowledge of failure characteristics of a component can enable a new service to be designed to mask the failure of the components on which it depends: • Checksums are used to mask corrupted messages. Security Model • The security of a DS can be achieved by securing the processes and the channels used in their interactions and by protecting the objects that they encapsulate against unauthorized access. Protecting Objects: Objects and principals invocation Client Access rights result Principal (user) Netw ork Server Principal (server) Object • Use “access rights” that define who is allowed to perform operation on a object. • The server should verify the identity of the principal (user) behind each operation and checking that they have sufficient access rights to perform the requested operation on the particular object, rejecting those who do not. The enemy Copy of m Process p The enemy m m’ Communication channel Process q • To model security threats, we postulate an enemy that is capable of sending any process or reading/copying message between a pair of processes • Threats form a potential enemy: threats to processes, threats to communication channels, and denial of service. Defeating security threats: Secure channels Principal A Process p Secure channel Principal B Process q • Encryption and authentication are use to build secure channels. • Each of the processes knows the identity of the principal on whose behalf the other process is executing and can check their access rights before performing an operation. Summary • Most DSs are arranged accordingly to one of a variety of architectural models: • Client-Server • Clients and a Single Sever, Multiple Servers, Proxy Servers with Cache, Peer Model • Alternative Client-Sever models driven by: • Mobile code, mobile agents, network computers, thin clients, mobile devices and spontaneous networking • Fundamental Models – formal description • Interaction, failure, and security models. • The concepts discussed in the module play an important role while architecting DS and apps. Distributed Systems Communication in Distributed Systems Venus Samawi Isra University 1 Content • Network layers of the model • Summary of OSI layers • TCP/IP and OSI model • The interaction between layers in the OSI model • An exchange using the OSI model • Port Addresses • Computer Networks • TCP VsUDP • Software and hardware service layers in distributed systems • Operating systems (OS) • Network OS • Distributed OS • Middleware 2 In previous lectures • So far, We have discussed the communication in distributed system form internet protocol’s point of view by studying the layered implementation. • We are going to study the communication in distributed system at higher level (processes and application) Figure 2.2 Seven layers of the OSI model 2.4 Figure 2.15 Summary of layers 2.5 TCP/IP and OSI model 2.6 The interaction between layers in the OSI model 2.7 An exchange using the OSI model 2.8 Port addresses 2.9 Port Address Translation (PAT) • Port Address Translation (PAT) is an extension of Network Address Translation (NAT) • permits multiple devices on a LAN to be mapped to a single public IP address to conserve IP addresses. • PAT is similar to port forwarding except that an incoming packet with destination port (external port) is translated to a packet different destination port (an internal port). • The Internet Service Provider (ISP) assigns a single IP address to the edge device. • When a computer logs on to the Internet, this device assigns the client a port number that is appended to the internal IP address, giving the computer a unique IP address. • If another computer logs on the Internet, this device assigns it the same public IP address, but a different port number. • Although both computers are sharing the same public IP address, this device knows which computer to send its packets, because the device uses the port numbers to assign the packets the unique internal IP address of the computers. Computer Networks Difference: Connection oriented and Connectionless service 1) In connection oriented service authentication is needed, while connectionless service does not need any authentication. 2) Connection oriented protocol makes a connection and checks whether message is received or not and sends again if an error occurs, while connectionless service protocol does not guarantees a message delivery. 2) Connection oriented service is more reliable than connectionless service. Software and hardware service layers in distributed systems Applications, services Computer and network hardware Platform Operating system Middleware What is an Operating System? • A program that acts as an intermediary between a user of a computer and the computer hardware • Operating system goals: • Execute user programs and make solving user problems easier • Make the computer system convenient to use • Use the computer hardware in an efficient manner Computer System Structure Computer system can be divided into four components: • Hardware –provides basic computing resources • CPU, memory, I/O devices • Operating system • Controls and coordinates use of hardware among various applications and users • Application programs –define the ways in which the system resources are used to solve the computing problems of the users • Word processors, compilers, web browsers, database systems, video games • Users • People, machines, other computers Operating System Supporting In computing, a system call (commonly abbreviated to syscall) is the programmatic way in which a computer program requests a service from the kernel of the operating system on which it is executed. Distributed OS Distributed OS Cont Network OS Vs Distributed System OS The main difference between these two operating systems (Network Operating System and Distributed Operating System) is that • In network operating system each node or system can have its own operating system • Indistribute operating system each node or system have same operating system which is opposite to the network operating system. Network OS Vs Distributed System OS Cont. Network OS • Main objective is to provide the local services to remote client. • Communication takes place on the basis of files. • more scalable than Distributed Operating System. • Limited fault tolerance • High Rate of autonomy • Ease of implementation • All nodes can have different operating system. Distributed System OS • Main objective is to manage the hardware resources. • Communication takes place on the basis of messages and shared memory. • Less scalable than network Operating System. • High fault tolerance • Limited Rate of autonomy • More complicated implementation • All nodes have same operating system. Distributed Systems Inter-process Communication (IPC) Venus Samawi Isra University 1 Content • Interprocess Communication (IPC) • Message passing • Buffer 2 InterprocessCommunication Process Cooperation • There are several reasons to provide process cooperation environment • Information sharing • Computation speed up: If we want a particular task to run faster, we must break it into subtasks, each of which will be executing in parallel with the others. • Modularity: dividing the system functions into separate processes or threads • Convenience: Even an individual user may work on many tasks at the same time. For instance, a user may be editing, printing, and compiling in parallel. 4 Models for InterprocessCommunication Models for interprocess communication could be through: • Shared memory • Message passing Message -passing Systems 5 Message Size 6 7 Naming Communication between processes either Direct or Indirect 8 9 Example 10 Message passing (Part2) Communication link must exists between processes 11 12 Buffer Size Vs Blocking and non-Boloking 13 Distributed Systems Inter-process Communication (IPC) Venus Samawi Isra University 1 Content • Interprocess Communication (IPC) • Sockets • RPC • RMI 2 Middleware layers Instructor’s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012 Sockets (used for Communication in Client -Server systems) 4 Sockets 5 Sockets (Cont.) 6 Sockets and ports socket client Internet address = 138.37.94.248 any port agreed port message other ports socket server Internet address = 138.37.88.249 Instructor’s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012 • Adaemonis a long-running background process that answers requests for services. • In multitasking computer operating systems, a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user. • Astub in distributed computing is a piece of code that converts parameters passed between client and server during a remote procedure call (RPC). • The main idea of an RPC is to allow a local computer (client) to remotely call procedures on a different computer (server). • "marshalling" refers to the process of converting the data or the objects into a byte stream, • "unmarshalling" is the reverse process of converting the byte-stream back to their original data or object. • Adaemonis a long-running background process that answers requests for services. • In multitasking computer operating systems, a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user. • Binding information includes the protocol that clients use to communicate with the site, the site's IP address, the port number, and a host header. The element contains two attributes to configure the binding information: bindingInformation and protocol RPC RMIVS RPC RPC does not provide any security. Although it provides client-level security. 18 Key Differences Between RPC and RMI • RPC supports procedural programming paradigms thus is C based, while RMI supports object-oriented programming paradigms and is java based. • The parameters passed to remote procedures in RPC are the ordinary data structures. On the contrary, RMI transits objects as a parameter to the remote method. • RPCcanbeconsidered as: • Theolder version of RMI, • it is used in the programming languages that support procedural programming, • it can only use pass by value method. As against, RMI facility is devised based on modern programming approach, which could use • passbyvalue or reference. • Anotheradvantage of RMIis that the parameters passed by reference can be changed. • RPCprotocol generates more overheads than RMI. • The parameters passed in RPC must be “in-out” which means that the value passed to the procedure and the output value must have the same datatypes. In contrast, there is no compulsion of passing “in-out” parameters in RMI. • In RPC, references could not be probable because the two processes have the distinct address space, but it is possible in case of RMI. 19 Parameter Passing in RPC • Functions in an application that runs in a single process may collaborate via parameters and/or global variables. • Functions in an application that runs in multiple processes on the same host may collaborate via message passing and/or non-distributed shared memory • In RPC, passing parameters is the only way that , clients and servers share information • Parameters that are passed by value are fairly simple to handle • The client stub copies the value from the client and packages into a network message Ex: • Consider a remote procedure, sum(i, j), which takes two integer parameters and returns their arithmetic sum. • The client stub takes its two parameters and puts them in a message, and Puts the name or number of the procedure to be called in the message. • When the message arrives at the server, the stub examines the message to see which procedure is needed, and then makes the appropriate call. •When the server has finished execution, it takes the result provided by the server and packs it into a message. This message is sent back to the client stub, which unpacks it and returns the value to the client procedure 20 RPC: Parameters Passed by Reference • Parameters passed by reference are much harder: • For example distributed systems with distributed shared-memory mechanisms can allow passing of parameters by reference. • A pointer is meaningful only within the address space of the process in which it is being used. Suppose there are two parameters to be passed, if the second parameter is the address of the buffer which is 1000 on the client, one cannot just pass the number 1000 to the server and expect it to work. Address 1000 on the server might be in the middle of the program text. For that, call by reference is not practical in RPC and massage passing is used. 21 Remote Method Invocation (RMI) • Remote Method Invocation (developed in 1980’s) allows remote method calls, at which objects in different programs can communicate • RMI is based on RPC (RMI is Java’s implementation of RPC). • It is object-oriented version of RPC • Methods calls appear same as those in same program. • RMI performs networking and marshaling of data (converting the data or the objects into a byte-stream) • Interface definition language is required to describe functions. • RMI is an approach that provides remote communication between the application using two objects: • stub 22 • skeleton Stub Whenastub'smethodisinvoked, itdoes thefollowing: • initiates aconnectionwith the remote JVMcontainingtheremoteobject, •marshals (writes and transmits) the parameterstotheremoteJVM, •waits for the result of the method invocation, •Unmarshals(reads) thereturnvalueor exceptionreturned,and • returnsthevaluetothecaller. Skeleton Whenaskeletonreceivesanincomingmethod invocationitdoesthefollowing: • unmarshals (reads) the parameters for the remotemethod, • invokes themethod on the actual remote objectimplementation,and •marshals (writes and transmits) the result (returnvalueorexception)tothecaller. RMICont. ThecommondifferencebetweenRPCandRMIisthat • RPConlysupportsproceduralprogrammingwhereasRMIsupportsobject orientedprogramming. • Client calls the local method (stub) to perform this operation • The stub on computer client calls RMI Registry to know whether thatmethod exists and the correct way to call it • Client stub calls this method on server • Server skeleton receives this request and communicate with the software on server, when it receives the response, sends this response to the client stub. Client RMISystem Layers The RMI system consists of 3 layers: • The stub/Skeleton layer • Client side stubs (proxies) • Server side skeletons • The Remote Reference Layer • Transport Layer RMI Architecture Remote Reference layer • The remote reference layer deals with the lower-level transport interface. • It is also responsible for carrying out a specific remote reference protocol which is independent of the client stubs and server skeletons. • Each remote object implementation chooses its own remote reference subclass that operates on its behalf. • Various invocation protocols can be carried out at this layer. Examples are: • Unicast point-to-point invocation. • Invocation to replicated object groups. • Support for a specific replication strategy. • Support for a persistent reference to the remote object (enabling activation of the remote object). • Reconnection strategies (if remote object becomes inaccessible). 27 Remote Reference layer (Cont.) The remote reference layer has two cooperating components: • the client-side • server-side components. The client-side component • contains information specific to the remote server (or servers, if the remote reference is to a replicated object) • communicates via the transport to the server-side component. During each method invocation, the client and server-side components perform the specific remote reference semantics. • For example, If a remote object is part of a replicated object, the client-side component can forward the invocation to each replica rather than just a single remote object. 28 Transport Layer The transport layer of the RMI system is responsible for: • Setting up connections to remote address spaces. • Managing connections. • Monitoring connection "liveness." • Listening for incoming calls. • Maintaining a table of remote objects that reside in the address space. • Setting up a connection for an incoming call. • Locating the dispatcher (لسرملا) for the target of the remote call and passing the connection to this dispatcher. Transport Layer (Cont.) • The concrete representation of a remote object reference consists of • Anendpoint • Anobject identifier. This representation is called a live reference. • Given a live reference for a remote object, • a transport can use the endpoint to set up a connection to the address space in which the remote object resides. • Onthe server side, the transport uses the object identifier to look up the target of the remote call. The transport for the RMI system consists of four basic abstractions: • Anendpoint: is the abstraction used to denote an address space or Java virtual machine. • In the implementation, an endpoint can be mapped to its transport. That is, given an endpoint, a specific transport instance can be obtained. • Achannel: is the abstraction for a conduit ( ) هانق between two address spaces. • it is responsible for managing connections between the local address space and the remote address space for which it is a channel. • Aconnection: is the abstraction for transferring data (performing input/output). • The transport abstraction manages channels. 30 • Each channel is a virtual connection between two address spaces.