Open Cloud Platforms & Cloud Services

Academic year: 2022


Zhonghong Ou

Post-doc researcher

Data Communications Software (DCS) Lab,

Department of Computer Science and Engineering, Aalto University

Aalto University

Zhonghong Ou 26/09/2014

(2)

Cloud technology videos

• http://www.youtube.com/watch?v=txvGNDnKNWw&feature=related

• http://www.youtube.com/watch?v=QJncFirhjPg


(3)

Open cloud platforms

• Eucalyptus

• Open Cirrus

• OpenNebula

• Apache CloudStack

• OpenStack

• …

• Apache Spark


(4)

OpenStack

Image source: http://www.openstack.org/brand/openstack-logo/logo-download/

(5)

What is OpenStack?

• OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed through a dashboard that gives administrators control while empowering their users to provision resources through a web interface.

Source: http://www.openstack.org/software/

(6)

A bit of detail

• A pilot project launched by Rackspace and NASA in July 2010.

• To avoid “vendor lock-in”, the Open Cloud Computing Interface (OCCI) emerged as a standard that addresses the problem by defining standards for interoperability, portability, and integration.

• OpenStack was launched as an independent implementation of OCCI and offers a flexible and accommodating cloud service.

• OpenStack has gained momentum because major vendors take part in the project (AT&T, IBM, HP, Red Hat, Cisco, Dell, etc.).

• Supports the Xen, KVM, and VMware hypervisors underneath.

Source: http://getcloudify.org/2014/07/10/what-is-openstack-tutorial.html

(7)

Components

Component name: Description

Compute (Nova): Allows the user to create and manage virtual servers using machine images. It is the brain of the cloud: OpenStack Compute provisions and manages large networks of virtual machines.

Networking (Neutron): Pluggable, scalable, and API-driven system for managing networks (VLANs, IP addresses, firewalls, etc.).

Block Storage (Cinder): Provides persistent block storage to running instances.

Object Storage (Swift): Stores and retrieves unstructured data objects through HTTP-based APIs; fault tolerant due to its data replication and scale-out architecture.

Image Service (Glance): Provides discovery, registration, and delivery services for disk and server images.

Identity Service (Keystone): Provides a central directory of users mapped to the OpenStack services, and an authentication and authorization service for the other services.

Dashboard (Horizon): Provides a web-based portal for interacting with all the underlying services.

Telemetry Service (Ceilometer): Monitors the usage of cloud services and determines billing accordingly.

Orchestration (Heat): Manages multiple cloud applications through an OpenStack-native REST API and a CloudFormation-compatible Query API.

Database as a Service (Trove): Allows users to quickly and easily use the features of a relational database without the burden of handling complex administrative tasks.

Messaging as a Service (Marconi): Cloud messaging and notification service for developers building applications on top of OpenStack.

(8)

Example flow (1/5)

Source: http://getcloudify.org/2014/07/18/openstack-wiki-open-cloud.html

(9)

Example flow (2/5)


(10)

Example flow (3/5)


(11)

Example flow (4/5)


(12)

Example flow (5/5)

After getting the image, Nova mounts it on a VM host. During the boot process, the VM requests an IP address from Neutron (DHCP component).

(13)

Structure

Source: http://de.wikipedia.org/wiki/OpenStack5

(14)

OpenStack Compute (Nova)

• Component-based architecture, enabling new features to be added quickly;

• Fault tolerant and recoverable, with API compatibility with systems such as Amazon EC2;

• Built on a messaging architecture: all of its components can typically run on several servers and communicate with each other through a message queue;

• Nova and its components share a centralized SQL-based database; for larger deployments an aggregation system will be in place to manage the data across multiple data stores;

• Supports the KVM, XenServer, and Linux Containers (LXC) virtualization technologies;

• Supports ARM, x86, and other hardware architectures.

Source: http://blog.flux7.com/blogs/openstack/tutorial-what-is-nova-and-how-to-install-use-it-openstack

(15)

Nova Components

DB: SQL database for storing data

Web Dashboard: External component that communicates with the API

API: Component that receives HTTP requests and communicates with the other components through the queue or HTTP

Auth Manager: A Python class used by all components to communicate with the backend DB or LDAP. This component is also responsible for users, projects, and roles.

Object Store: Replicates the S3 API, allowing storage and retrieval of images

Scheduler: Selects an appropriate host for each VM

Network: Responsible for IP forwarding, bridges, and VLANs

Compute: Controls the communication between the hypervisor and VMs

Source: http://blog.flux7.com/blogs/openstack/tutorial-what-is-nova-and-how-to-install-use-it-openstack
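
The Scheduler's job of matching VMs to hosts can be illustrated with a toy example. This is only a sketch under stated assumptions, not Nova's actual scheduler: the host records, the field names, and the "most free RAM wins" rule are all hypothetical.

```python
def pick_host(hosts, vcpus, ram_mb):
    """Return the name of the fitting host with the most free RAM, or None.

    Hypothetical rule: first filter out hosts that cannot fit the request,
    then prefer the host with the most free memory.
    """
    candidates = [
        h for h in hosts
        if h["free_vcpus"] >= vcpus and h["free_ram_mb"] >= ram_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda h: h["free_ram_mb"])["name"]


# Hypothetical compute hosts and their free capacity.
hosts = [
    {"name": "node-a", "free_vcpus": 2, "free_ram_mb": 4096},
    {"name": "node-b", "free_vcpus": 8, "free_ram_mb": 16384},
    {"name": "node-c", "free_vcpus": 1, "free_ram_mb": 32768},
]

chosen = pick_host(hosts, vcpus=2, ram_mb=2048)    # node-c lacks vCPUs
too_big = pick_host(hosts, vcpus=16, ram_mb=1024)  # no host fits
```

Separating the filtering step from the ranking step keeps each rule easy to swap out.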

(16)

Example Nova configurations (1/2)


Source: https://wiki.openstack.org/wiki/UnderstandingFlatNetworking

(17)

Example Nova configurations (2/2)


Source: https://wiki.openstack.org/wiki/UnderstandingFlatNetworking

(18)

OpenStack Networking (Neutron)

• Pluggable, scalable, and API-driven system for managing networks and IP addresses;

• Provides a variety of network services, ranging from L3 forwarding and NAT to load balancing, edge firewalls, and IPsec VPN;

• Manages software-defined networking (SDN) and can be configured for advanced virtual network topologies, such as per-tenant private networks;

• Its object abstractions include networks, subnets, and routers. Each has functionality that mimics its physical counterpart: networks contain subnets, and routers route traffic between different subnets and networks.

Source: http://blog.flux7.com/blogs/openstack/tutorial-what-is-neutron-how-to-install-and-use-it
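
The three object abstractions can be modelled in a few lines. The sketch below is an illustration only and does not use the Neutron API: the class names, the `can_route` helper, and the addresses are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from ipaddress import IPv4Address, IPv4Network


@dataclass
class Subnet:
    name: str
    cidr: IPv4Network


@dataclass
class Network:
    # A network contains subnets, as on the slide.
    name: str
    subnets: list = field(default_factory=list)


@dataclass
class Router:
    # A router has interfaces on the subnets it connects.
    name: str
    interfaces: list = field(default_factory=list)

    def can_route(self, src: str, dst: str) -> bool:
        """True if both addresses belong to subnets attached to this router."""
        def attached(addr: str) -> bool:
            ip = IPv4Address(addr)
            return any(ip in subnet.cidr for subnet in self.interfaces)
        return attached(src) and attached(dst)


# Hypothetical tenant topology: two subnets joined by one router.
web = Subnet("web", IPv4Network("10.0.1.0/24"))
db = Subnet("db", IPv4Network("10.0.2.0/24"))
tenant_net = Network("tenant-net", [web, db])
router = Router("r1", [web, db])

reachable = router.can_route("10.0.1.5", "10.0.2.9")
unreachable = router.can_route("10.0.1.5", "192.168.0.1")
```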

(19)

Neutron setup

The external network represents a network that is accessible from outside the OpenStack installation. IP addresses on the Neutron external network are accessible by anyone outside the network, and DHCP is disabled.

Internal networks are software-defined networks that connect directly to VMs.

– Only VMs on a given internal network, or those on subnets connected through interfaces to a similar router, can access VMs directly connected to that network.

– For an outside network to reach the VMs, and vice versa, routers are required between them.

• Supports security groups that enable administrators to define firewall rules in groups.

• Firewall-as-a-Service (FWaaS) and Load-Balancing-as-a-Service (LBaaS) plug-ins are available.

Source: http://blog.flux7.com/blogs/openstack/tutorial-what-is-neutron-how-to-install-and-use-it

(20)

Neutron setup example


Source: http://blog.flux7.com/blogs/openstack/tutorial-what-is-neutron-how-to-install-and-use-it

(21)

OpenStack Storage

• Block Storage

– Cinder

• Object Storage

– Swift


(22)

OpenStack Cinder

• Persistent block-level storage devices for use with OpenStack compute instances;

• Manages the creation, attachment, and detachment of block devices to servers;

• Provides unified storage support for numerous other storage platforms, including Ceph, NetApp, Nexenta, SolidFire, and Zadara;

• Provides snapshot management for backing up data stored on block storage volumes; snapshots can be restored or used to create new block storage volumes.

Source: http://blog.flux7.com/blogs/openstack/tutorial-what-is-cinder-and-how-to-install-and-use-it

(23)

Cinder components

• cinder-api: Accepts API requests and routes them to cinder-volume for action.

• cinder-volume: Responds to requests to read from and write to the Block Storage database to maintain state, interacts with other processes (such as cinder-scheduler) through a message queue, and acts directly on the hardware or software that provides block storage. It can interact with a variety of storage providers through a driver architecture.

• cinder-scheduler: Picks the optimal block storage provider node on which to create the volume.

• Messaging queue: Routes information between the Block Storage Service processes.

Source: http://blog.flux7.com/blogs/openstack/tutorial-what-is-cinder-and-how-to-install-and-use-it
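
How these processes hand work to each other through a message queue can be sketched as follows. This is a toy model, not Cinder code: the function bodies, the backend names, and the "most free capacity" rule are assumptions made for illustration, and in-process `queue.Queue` objects stand in for the real message broker.

```python
import queue


def cinder_api(request, scheduler_queue):
    """Accept a create-volume request and enqueue it for the scheduler."""
    scheduler_queue.put(request)


def cinder_scheduler(scheduler_queue, volume_queue, free_gb):
    """Pick the backend with the most free capacity and forward the request."""
    request = scheduler_queue.get()
    fitting = {
        name: free for name, free in free_gb.items()
        if free >= request["size_gb"]
    }
    if not fitting:
        raise RuntimeError("no backend has enough free capacity")
    backend = max(fitting, key=fitting.get)
    volume_queue.put((backend, request))


def cinder_volume(volume_queue, free_gb):
    """Create the volume on the chosen backend and record the new state."""
    backend, request = volume_queue.get()
    free_gb[backend] -= request["size_gb"]
    return {"name": request["name"], "backend": backend, "status": "available"}


free_gb = {"ceph-1": 500, "netapp-1": 200}  # hypothetical backends, free GB
scheduler_queue, volume_queue = queue.Queue(), queue.Queue()

cinder_api({"name": "vol-1", "size_gb": 100}, scheduler_queue)
cinder_scheduler(scheduler_queue, volume_queue, free_gb)
volume = cinder_volume(volume_queue, free_gb)
```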

(24)

Cinder example

Source: http://blog.flux7.com/blogs/openstack/tutorial-what-is-cinder-and-how-to-install-and-use-it

(25)

OpenStack Swift

• Object storage system provided under the Apache 2 open source license;

• Powers some of the largest object storage clouds, including Rackspace Cloud Files, the HP Cloud, IBM Softlayer Cloud, and countless private object storage clusters;

• Like Amazon S3, has an eventual consistency architecture;

– This contrasts with the strong consistency of filesystems and block storage

• All objects, or files, stored in Swift have a URL;

• Applications store and retrieve data in Swift via an industry-standard RESTful HTTP API.

Source: https://swiftstack.com/openstack-swift/architecture/
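
Because every object has a URL, a client can address it directly. The sketch below builds such a URL using the `/v1/{account}/{container}/{object}` path layout of the Swift API; the endpoint, account, container, and object names are made-up examples, and the helper name is hypothetical.

```python
from urllib.parse import quote


def swift_object_url(endpoint, account, container, obj):
    """Build the URL of an object as <endpoint>/v1/<account>/<container>/<object>.

    Account and container are single path segments, so every reserved
    character is percent-encoded. Slashes in the object name are kept,
    since object names may contain them.
    """
    account_part = quote(account, safe="")
    container_part = quote(container, safe="")
    object_part = quote(obj)  # default safe="/" keeps slashes
    return f"{endpoint.rstrip('/')}/v1/{account_part}/{container_part}/{object_part}"


url = swift_object_url(
    "https://swift.example.com/",  # hypothetical endpoint
    "AUTH_demo",                   # hypothetical account
    "photos",
    "2014/cat 1.jpg",
)
```

An HTTP GET on such a URL (with a valid auth token) retrieves the object, PUT stores it, and DELETE removes it.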

(26)

Swift Overview

-Server processes

• Proxy server

– Responsible for tying together the rest of the Swift architecture

– Looks up the location of the account, container, or object in the ring and routes the request accordingly

– Handles failure cases

• Account server

– Responsible for listings of containers

• Container server

– Handles listings of objects

– It doesn’t know where those objects are, just which objects are in a specific container

• Object server

– Simple blob storage server that can store, retrieve, and delete objects stored on local devices

– Objects are stored as binary files on the filesystem with metadata stored in the file’s extended attributes (xattrs)
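
The proxy server's ring lookup can be approximated with a simplified sketch. Swift does hash the object path and use the top bits of the hash to choose a partition, but everything else here is an assumption for illustration: the tiny partition power, the device names, and the round-robin replica placement are hypothetical, and the real ring also mixes in a cluster-specific hash prefix/suffix and uses a precomputed partition-to-device table.

```python
import hashlib
import struct

PART_POWER = 4   # 2**4 = 16 partitions; real clusters use far more
REPLICAS = 3
DEVICES = ["node1/sdb", "node2/sdb", "node3/sdb", "node4/sdb"]  # hypothetical


def partition_for(account, container=None, obj=None):
    """Map an account/container/object path to a partition number."""
    parts = [p for p in (account, container, obj) if p is not None]
    path = "/" + "/".join(parts)
    digest = hashlib.md5(path.encode("utf-8")).digest()
    top32 = struct.unpack_from(">I", digest)[0]  # first 4 bytes, big-endian
    return top32 >> (32 - PART_POWER)            # keep the top PART_POWER bits


def devices_for(partition):
    """Toy placement: REPLICAS consecutive devices, wrapping around."""
    return [DEVICES[(partition + i) % len(DEVICES)] for i in range(REPLICAS)]


part = partition_for("AUTH_demo", "photos", "cat.jpg")
replica_devices = devices_for(part)
```

The same path always hashes to the same partition, so any proxy can locate the data without consulting a central directory.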


(27)

Swift Overview

- Consistency services

• Auditors

– Run in the background on every storage node and continually scan the disks to ensure that the stored data has not suffered any bit-rot or file system corruption. There are account auditors, container auditors, and object auditors, each supporting its corresponding server process;

– If an error is found, the auditor moves the corrupted object to a quarantine area.

• Replicators

– Account, container, and object replicator processes run in the background on all nodes that are running the corresponding services;

– Each replicator continuously examines its local node and compares the accounts, containers, or objects against the copies on other nodes in the cluster;

– If one of the other nodes has an old or missing copy, the replicator sends a copy of its local data to that node (push only, no pull);

– Handle object and container deletions.

• Object deletion starts by creating a zero-byte tombstone file that is the latest version of the object. This version is then replicated to the other nodes and the object is removed from the entire system.

• Container deletion can only happen with an empty container. It will be marked as deleted and the replicators push this version out.
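
The push-only replication and tombstone-based deletion described above can be sketched with plain dictionaries. This is a toy model rather than Swift's replicator: nodes are dicts, versions are integer timestamps, and all function names are hypothetical.

```python
TOMBSTONE = object()  # stands in for the zero-byte tombstone file


def put(node, name, data, ts):
    """Store a new version of an object on one node."""
    node[name] = (ts, data)


def delete(node, name, ts):
    """Deleting writes a tombstone as the latest version of the object."""
    node[name] = (ts, TOMBSTONE)


def replicate(local, remotes):
    """Push local records to remotes whose copy is missing or older.

    Push only: nothing is ever pulled from a remote into `local`.
    Returns the number of records pushed.
    """
    pushed = 0
    for name, (ts, data) in local.items():
        for remote in remotes:
            if name not in remote or remote[name][0] < ts:
                remote[name] = (ts, data)
                pushed += 1
    return pushed


def read(node, name):
    """Return the object's data, or None if it is absent or deleted."""
    record = node.get(name)
    if record is None or record[1] is TOMBSTONE:
        return None
    return record[1]


a, b, c = {}, {}, {}
put(a, "photo.jpg", b"v1", ts=1)
first_push = replicate(a, [b, c])      # a pushes the object to b and c
delete(b, "photo.jpg", ts=2)           # deletion arrives at node b
second_push = replicate(b, [a, c])     # b pushes the tombstone out
third_push = replicate(a, [b, c])      # everything is already in sync
```

Until the tombstone has been pushed everywhere, some nodes still serve the old data, which is the eventual consistency mentioned earlier.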


(28)

Open Cirrus

-A Global Cloud Computing Testbed


(29)

Motivation

• Applications researchers in areas such as machine learning and scientific computing can get access to large-scale cluster resources, e.g. data centers provided by Amazon, Microsoft, Yahoo!, Google, and IBM.

• System researchers, who are developing the techniques and software infrastructure to support cloud computing, still have trouble obtaining low-level access to such resources.

• Open Cirrus aims to address this problem by providing a single testbed based on a range of heterogeneous distributed data centers for systems, applications and services.

(30)

Participants


(31)

Geo-distribution


(32)

High-level architectural choices

• Systems versus application-only research.

– Open Cirrus enables research using physical machines in addition to virtualized machines.

• Federated versus unified sites

– Open Cirrus federates numerous sites with various hardware, services, and tools, in contrast to a unified architecture such as PlanetLab.

• Data-center focus versus centralized homogeneous infrastructure.

– Open Cirrus revolves around multiple data centers, compared to a centralized approach such as Emulab.


(33)

Service stack architecture

PXE: Preboot Execution Environment

IPMI: Intelligent Platform Management Interface


(34)

Service stack architecture

-Zoni

• Responsible for managing physical resources in the cluster and crucial to providing users with bare-metal server access to conduct system research.

• Provides five key functions:

• allocation of server nodes;

• isolation of node groups, called domains;

• provisioning of key software in a domain;

• out-of-band server management; and

• debugging of allocated nodes.


(35)

Service stack architecture

-Primary domain services

• To support users working with very large data sets, a cluster storage system, in particular the Hadoop distributed file system (HDFS), is used to aggregate the storage of all the nodes in the domain.

• To support a diverse set of user needs, the recommended primary domain services include a virtual machine management (VMM) layer, which provides a convenient mechanism for allocating resources to various users and services.

– Hadoop

– Maui + Torque

– MPI

• Different sites may select any VMM service as long as it supports the EC2 interface from Amazon Web Services (AWS).

– Tashi

– Eucalyptus

• Data Location Service (DLS)

– A clearinghouse for data location information, independent of the storage mechanism.

• Resource Telemetry Service (RTS)

– Provides a means to obtain an abstract distance measure between two location identifiers.


Data location service (DLS)

(36)

Tashi

• The Tashi project aims to build a software infrastructure for cloud computing on massive Internet-scale datasets (what is called Big Data). The idea is to build a cluster management system that enables the Big Data stored in a cluster or data center to be accessed, shared, manipulated, and computed on by remote users in a convenient, efficient, and safe manner.

• While Tashi is similar to other systems that manage logical clusters of VMs, it was developed to support research in coscheduling computation, storage, and power.

http://incubator.apache.org/tashi/

(37)

Example service in Open Cirrus


(38)

Service stack architecture

-Site utility services

• A monitoring service (such as Ganglia) not only enables the site administrator to monitor the cluster’s health but also facilitates the collection of cluster operational data that may inform future research projects.

• Some conventional network file system storage is convenient for storing user scripts, small data sets, and small output files.

• Site utilities also include facilities for tracking resources consumed by users and managing the cluster’s power consumption.


(39)

Basic characteristics of the current Open Cirrus sites

Approximately 100 research projects at 10 sites use Open Cirrus at the systems and applications levels.

(40)

Open Cirrus economic model

• Single site

– Suppose a medium-sized company needs the same resources as the UIUC Open Cirrus site: 128 servers (1,024 cores) and 524 Tbytes of storage.

– AWS rates: US$0.12 per GiB/month and $0.10 per CPU-hour.

– Renting a cloud:

• Monthly storage cost: 524 × 1,000 × $0.12 = $62,880

• Total monthly cost: $62,880 + 1,024 × 24 × 30 × $0.10 = $136,608

– Owning a cloud:

• Amortized monthly costs: hardware (45%) + power (40%) + network (15%)

• Service lifetime: M months

• Monthly storage cost (assuming $300 1-Tbyte disks, scaled up for power and networking): 524 × $300/0.45/M = $349,333/M

• Total monthly cost: $700,000/0.45/M + $7,500 = $1,555,555/M + $7,500

– The break-even point

• Storage: $349,333/M < $62,880, or M > 5.55 months

• Overall: $1,555,555/M + $7,500 < $136,608, or M > 12 months.

Conclusion:

If the service runs for more than 12 months, owning the cloud infrastructure is preferable to renting it.

Similarly, it is better to own storage if you use it for more than 6 months.
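
The arithmetic on this slide can be reproduced in a few lines. The sketch below only restates the slide's figures; the variable names are ours, and the $700,000 hardware figure and the $7,500 fixed monthly cost are taken from the slide as given.

```python
CORES = 1024
STORAGE_TB = 524

# Renting (AWS rates from the slide).
rent_storage = STORAGE_TB * 1000 * 0.12          # $ per month
rent_compute = CORES * 24 * 30 * 0.10            # $ per month
rent_total = rent_storage + rent_compute

# Owning: hardware is 45% of amortized cost, so divide by 0.45
# to scale the hardware price up for power and networking.
own_storage_capex = STORAGE_TB * 300 / 0.45      # $, spread over M months
own_total_capex = 700_000 / 0.45                 # $, spread over M months
own_fixed_monthly = 7_500                        # $ per month

# Owning wins once capex / M (+ fixed) drops below the monthly rent.
storage_break_even = own_storage_capex / rent_storage
overall_break_even = own_total_capex / (rent_total - own_fixed_monthly)

print(f"rent: ${rent_total:,.0f}/month")
print(f"storage break-even: {storage_break_even:.2f} months")
print(f"overall break-even: {overall_break_even:.2f} months")
```

Both break-even values agree with the slide's "M > 5.55 months" and "M > 12 months".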

(41)

Open Cirrus economic model (cont'd)

• Single site

– Underutilization

– With X percent resource utilization, the break-even time becomes 12 × 100/X months.

– Given the typical hardware lifetime of 36 months, ownership breaks even when 12 × 100/X < 36, or X > 33.3%.

Conclusion:

Even at the current 20% CPU utilization rates observed in industry, storage utilization greater than 47% would make ownership preferable, as storage and CPU account evenly for costs.
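
The 47% figure can be recovered from the 33.3% threshold under one assumption that the slide only implies: if storage and CPU account evenly for costs, overall utilization is the plain average of CPU and storage utilization. The function names below are ours.

```python
def break_even_months(utilization_pct, full_use_months=12):
    """Break-even time when only utilization_pct percent of resources is used."""
    return full_use_months * 100 / utilization_pct


def min_utilization_pct(lifetime_months=36, full_use_months=12):
    """Lowest overall utilization at which owning pays off within the lifetime."""
    return full_use_months * 100 / lifetime_months


def min_storage_utilization_pct(cpu_pct, lifetime_months=36):
    """Storage utilization needed, given CPU utilization.

    Assumption: overall utilization = (cpu + storage) / 2, because the
    slide says storage and CPU account evenly for costs.
    """
    return 2 * min_utilization_pct(lifetime_months) - cpu_pct


threshold = min_utilization_pct()                 # about 33.3
storage_needed = min_storage_utilization_pct(20)  # about 46.7, i.e. roughly 47
```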

(42)

Open Cirrus economic model (cont'd)

-Federated sites

Costs incurred by a single under-provisioned cloud for three options: offloading only to Amazon Web Services (existing data center), offloading to five federated clouds (Open Cirrus 6) and AWS, and offloading to 49 federated clouds (Open Cirrus 50) and AWS.

(43)

Comparison of cloud computing testbeds

(44)

Cloud services

• Infrastructure-as-a-Service (IaaS)

– Amazon Web Services (AWS)

– Microsoft Azure

– Rackspace

– Google Compute Engine

• Platform-as-a-Service (PaaS)

– Google App Engine

– Microsoft Azure

• Software-as-a-Service (SaaS)

– Google Apps

– Salesforce

– 37Signals

– ZOHO

• Cloud storage

– Box.net

– MobileMe (Apple)

– Ovi store (Nokia)

– Dropbox

– Google Drive

• Cloud appliances

– Pogoplug

– Ctera

– Tonidoplug

(45)

Cloud service


(46)

Public cloud service (AWS)

Infrastructure as a Service (IaaS)

• Amazon is one of the biggest public cloud providers.

• Offers a broad array of cloud computing services, called Amazon Web Services (AWS), including:

– Amazon S3 (Simple Storage Service): cloud storage

– Amazon EC2 (Elastic Compute Cloud): cloud computing

– Amazon VPC (Virtual Private Cloud): secure bridge between private cloud and public cloud

– Amazon Elastic MapReduce: processing data-intensive tasks

– Amazon CloudFront: content delivery

– Amazon RDS (Relational Database Service): cloud database

– Amazon SNS (Simple Notification Service): cloud notification

• A basic Linux server starts at $0.085 per hour and a Windows server at $0.12 per hour.

• Amazon S3 storage costs $0.15 per GB/month.

• For very large volumes of data, Amazon offers its Import/Export service, which allows a USB or SATA drive to be shipped securely to Amazon so that the data can be copied into the cloud. The data should be encrypted before shipping to keep it secure.

(47)

IaaS-Rackspace

• Has been in the hosting business since 1998 and has 9 data centers throughout the world.

• Its cloud services include:

– Cloudserver

• A cloud service similar to Amazon EC2.

• Starting at $0.015/hour ($10.95/month).

– Cloudsites

• Hosts scalable and reliable websites.

• Starting at $149/month.

– Cloudfiles

• Provides unlimited file storage and hosting.

• A cloud storage service similar to Amazon S3.

• Starting at $0.15/GB.

(48)

Public cloud service (Windows Azure)

Platform as a Service (PaaS)

Windows Azure provides what’s commonly called Platform as a Service (PaaS).

– It provides a platform that lets customers run applications without worrying about administering the environment they run in.

A simple Windows server on Microsoft’s cloud costs $0.12 per hour. Storage, as with Amazon, costs $0.15 per GB/month.

http://www.microsoft.com/windowsazure/resources/default.aspx?pmc=NO-CARE-01

(49)

Windows Azure


(50)

Windows Azure (Compute)


(51)

Windows Azure (Storage)


(52)

Windows Azure AppFabric


(53)

Windows Azure AppFabric (Service Bus)


(54)

Windows Azure AppFabric (Access Control)


(55)

Windows Azure AppFabric (Caching)


(56)

SQL Azure


(57)

SQL Azure (SQL Azure Database)


(58)

SQL Azure (SQL Azure Data Sync)


(59)

Windows Azure Marketplace


(60)

Windows Azure Marketplace (DataMarket)


(61)

PaaS-Google App Engine

• Google App Engine (Platform-as-a-Service)

– Enables you to build and host web apps on the same systems that power Google applications.

– The sandbox isolates your application in its own secure, reliable environment.

– Each application costs $8 per user per month, up to a maximum of $1000 a month.

– SDK for Java, SDK for Python, plugin for Eclipse.

(62)

Cloud services-Cloud applications (Software as a Service, SaaS)

• Google

– The Google Apps suite for business provides email, calendaring, documents, and other software for $50 per user per year.

• It eliminates the need to install and maintain office applications such as Outlook, Excel, PowerPoint, and Word, and provides online storage for emails and files.

(63)

Cloud services-Cloud applications (Salesforce)

• Provides cloud Customer Relationship Management (CRM) software solutions.

• Sales Cloud

– Has a number of features, including a customer database, sales lead tools, workflow, integration with desktop applications (like Office), search tools, reporting, and access to other cloud applications.

– Is priced at $5 to $250 per user per month, depending on the features selected.

• Service Cloud

– Includes customer trouble reporting and management tools, integration to social media sites like Twitter and Facebook, and other services to care for your customers.

– Is priced at $65 to $265 per user per month, depending on the features selected.

• Chatter

– Updates on people, groups, documents, and your application data come straight to you in your real-time feeds.

• Force.com

– Gives developers a platform to create rich, collaborative custom apps fast, without buying hardware or installing software.

(64)

Salesforce (cont'd)

(65)

Cloud services-Cloud applications (37Signals)

• The “sole investor” in 37Signals is Jeff Bezos of Amazon.com.

• Basecamp

– Project management and collaboration.

– Is priced from $49/month to $149/month.

• Highrise

– Contact and customer management.

– Is priced from $24/month to $99/month.

• Backpack

– Share information with the team, internal communication.

– Is priced from $24/month to $149/month.

• Campfire

– Team collaboration with real time chat.

– Is priced from $12/month to $99/month.


(66)

Cloud services-Cloud applications (ZOHO, SaaS)

• Zoho provides a wide, integrated portfolio of rich online applications for businesses.

• Services include:


(67)

ZOHO (cont'd) Screenshot

(68)

Cloud services-Storage

• Box.net

– Cloud service that provides online access to all your files and content. Besides storing files and data, it offers other services, including document management, project management, and FTP and other file transfers.

– Billing

• Free personal option with 1 GB of online storage;

• 10 GB for $9.99/month;

• 15 GB for $19.99/month;

• Business: $15/month/user for storing up to 15 GB per user.

• Apple’s MobileMe

– Cloud computing product.

– Stores your email, contacts, and calendar and syncs them to your iPhone, PC, and iPad.

– Free 60-day trial; after that, MobileMe costs $99/year.

• Ovi Store

• Dropbox

• Google Drive


(69)

Cloud appliances (Pogoplug)

• You connect one or more USB drives to the physical Pogoplug device. The Pogoplug software lets you access the files from anywhere, either on your own network or over the Internet, with either a PC or a PDA.

• Products include:

– Pogoplug

• £99 / 99€

– Pogoplug Pro

• $99

• Currently available only in the U.S.

– Pogoplug Biz

• $299/£249/€249

• Share massive amounts of content with clients and co-workers.

(70)

Cloud appliances (CTERA)

• Provides storage and data protection for SMBs (Small and Medium Businesses) and enterprise branch offices by combining cloud storage services with on-premises storage appliances.

• Has access to more than 20,000 VARs (Value-Added Resellers) and MSPs (Managed Service Providers).

• Products include:

– CloudPlug

• Converts any external USB/eSATA drive into Network Attached Storage with secure cloud backup, remote access, and collaboration services, and lets you share and synchronize files on your local network.

• Approximately $200.

– CTERA C200

• Cloud Attached Storage appliance.

• Data is synchronized between individual PCs on the network and the C200 drives, then backed up using CTERA's integrated online backup service.

• Approximately $371.

– CTERA C400

• Offers up to 8 TB of local storage space, with RAID 5/6 capability and four hot-swappable drive bays.

• Retails for $1,499.

(71)

Cloud appliances (TonidoPlug)

• TonidoPlug is a tiny, low-power, low-cost personal home server and NAS device powered by Tonido® software that lets you access your files, music, and media from anywhere via a web browser.

• Like the Pogoplug and CTERA C200, it is a physical device and requires you to supply and connect a USB drive for storage.

• Runs embedded Ubuntu Jaunty Linux on a GHz ARM processor.

• Price: $99.

(72)

Future of Cloud computing

• According to MarketsandMarkets, the cloud computing market will grow from $37.8 billion in 2010 to $121.1 billion in 2015, a CAGR of 26.2% from 2010 to 2015.

• Intel’s cloud 2015 vision

– Federated

– Automated

– Client-aware
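
The compound annual growth rate (CAGR) quoted in the MarketsandMarkets forecast on this slide can be checked from the two market sizes. The helper name below is ours; the formula is the standard CAGR definition.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# $37.8 billion in 2010 growing to $121.1 billion in 2015 (5 years).
growth = cagr(37.8, 121.1, years=5)
print(f"CAGR: {growth:.1%}")
```

The result rounds to the 26.2% stated on the slide.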


(73)

Future of cloud computing (cont'd)

(74)

References

• Virtualization for dummies. Bernard Golden. Wiley Publishing, Inc. ISBN: 978-0-470-14831-0.

• Wikipedia.

• http://www.smallcloudbuilder.com/everything-else/articles/123-introduction-to-small-cloud-computing-part-2

• David Chappell & Associates. Introducing the Windows Azure platform.
