AWS - Certified Solutions Architect (NOTES)

AWS Certified Solutions Architect – Associate Study Notes

I'm currently studying to sit the AWS Solutions Architect Associate certification. To do this I'm (a) going through the exam blueprint, (b) writing blogs on my AWS free-tier lab, and (c) watching the course videos. I bought the "Associate Bundle", so I plan on taking all 3 associate-level courses eventually. I've decided to consolidate the past 5 articles into 1 large article for ease of searching (and so that I'm not updating 5 separate articles while I continue to study):

  • AWS Global Infrastructure

    • 12 Regions & 33 AZs, 5 more Regions & 11 more AZs coming throughout the next year
    • Region = 2 or more AZs
    • AZ = DataCenter
    • Edge Location = CDN End Points for CloudFront
  • Networking

    • VPC = Virtual Private Cloud (essentially your virtual datacenter)
    • Direct Connect = connecting to AWS w/out using Internet connection
    • Route53 = DNS service (port 53… duh)
  • Compute

    • EC2 = virtual server
    • EC2 Container Service = EC2 with Docker
    • Elastic Beanstalk = Service for deploying web applications and services. "AWS for beginners"
    • Lambda = "Most powerful/revolutionary service". Run code w/out servers. Pay for execution time, only charged when code is executed.
  • Storage

    • S3 = Object based storage, a place to store flat files in the cloud
    • CloudFront = CDN (content delivery network), local caching of content
    • Glacier = long term backup, 3-5 hours to retrieve data
    • EFS = NAS in the cloud, file-level storage (in preview)
    • Snowball = Import/Export service. For moving large amounts of data in/out of AWS. They ship you a physical suitcase of disks :)
    • Storage Gateway = VM that you run locally that replicates data from local datacenter to AWS
  • Databases

    • RDS = SQL Server, Aurora, Oracle, PostgreSQL, MySQL, MariaDB
    • DynamoDB = NoSQL
    • Elasticache = Caching DB services in cloud to relieve stress on RDS for high I/O environments
    • Redshift = Data warehousing service. Great performance
    • DMS = Database Migration Services. How to migrate/convert local DBs into AWS
  • Analytics

    • EMR = Elastic MapReduce. A way of processing big data; managed web service for Hadoop clusters
    • Data Pipeline = moving data from one service to another
    • Elasticsearch = managed service to deploy/operate a search engine in the cloud
    • Kinesis = managed service platform for real-time streaming of big data. Web apps, mobile devices & wearables generate huge amounts of streaming data; use Kinesis to digest it
    • Machine Learning = for use by developers to work with machine learning (not in test)
    • QuickSight = Business Intelligence service (not in test)

  • Security & Identity

  • Management Tools

  • Application Svcs

    • API Gateway = not in test
    • AppStream = AWS version of XenApp
    • CloudSearch = Managed search solution
    • Elastic Transcoder = Media transcoding service, change media files from source format to destination format
    • SES = Simple Email Service = send/receive emails
    • SQS = Simple Queue Service, a way of decoupling infrastructure
    • SWF = Simple WorkFlow Service
  • Dev Tools (not in test)

    • CodeCommit = "Github"
    • CodeDeploy = automates code deployment
    • CodePipeline = build, test, deploy code
  • Mobile Svcs (not in test, except for SNS)

    • Mobile Hub = test mobile apps
    • Cognito = save mobile user data in AWS cloud
    • Device Farm = test against real smartphones & tablets in AWS cloud
    • Mobile Analytics = collect, visualize & understand app usage data
    • SNS = big topic in exam, Simple Notification Service. Way to send notifications from cloud
  • Enterprise Applications

    • WorkSpaces = VDI

      • Replaces Windows PC in the cloud (PCoIP)

      • Runs Windows 7, provided by Windows Server 2008 R2

      • Are persistent (EBS)

      • All data on D drive backed up every 12 hours

      • Do not need an AWS account to login to workspaces

      • Don't need an existing AD domain, can use free client app

      • Can integrate with existing AD domain

      • By default:

        • Users can personalize their WorkSpaces with wallpaper, icons, shortcuts, etc..
        • Users have local admin access to install apps
    • WorkDocs = DropBox for enterprise

    • WorkMail = Exchange

  • IoT

    • Internet of Things = not in test

**Identity Access Management (IAM)**

  • Central control of AWS account
  • Share access
  • Granular permissions of accounts/groups/roles/policies
  • Identity Federation (AD, Facebook, LinkedIn, etc…)
  • MFA = Multi Factor Authentication
  • Temp access for users/devices/services
  • Pwd rotation policy highly customizable
  • Policies = JSON key/value pairs
  • IAM is universal, applies to all regions consistently
  • New Users have no permissions when 1st created
  • New Users are assigned an access key ID & secret access key when first created, only viewable once so download it & secure!
  • Always setup MFA on root
  • Integrated with AWS marketplace
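
The notes above say policies are JSON key/value documents. A minimal, hypothetical policy (the bucket name and action list are made up purely for illustration) can be built and serialized like this:

```python
import json

# Hypothetical IAM policy document: plain JSON key/value pairs
# granting read-only access to a single (made-up) S3 bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
        }
    ],
}

# Serialize to the JSON text you would paste into the IAM console.
policy_json = json.dumps(policy, indent=2)
```

The `"Version": "2012-10-17"` string is the policy-language version, not a date you choose.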

**S3**

  • Secure, durable, highly scalable object storage. "Unlimited storage". A hard drive in the cloud.

  • Object based NOT block based storage (no OS or DBs -> that's Elastic Block Storage (EBS)). i.e. allows you to upload files

  • 0 bytes to 5 TB file size

  • Files are stored in buckets

  • S3 is a universal namespace; each bucket name must be globally unique

  • EXAM Tips

    • Read after Write consistency for PUTS of new Objects
    • Eventual consistency for overwrite PUTS and DELETES as it can take time to propagate
  • S3 = Object based. Objects consist of the following:

    • Key = name of the object
    • Value = the data
    • Version ID (for versioning)
    • Metadata (tags)
    • Subresources
    • Access Control Lists (ACLs)
  • 99.99% availability

  • 99.999999999% durability

  • Tiered storage

  • Lifecycle mgmt.

    • Can be used in conjunction with versioning

    • Can be applied to both current & previous versions

    • Actions:

      • Transition to S3-IA (minimum 128KB object size & 30 days after creation)
      • Archive to Glacier (30 days after S3-IA, if relevant)
  • Encryption, ACLs & Bucket Policies

  • Storage Tiers

  • Versioning

    • Stores all versions of an object (including all writes and deletes)
    • Great backup tool
    • Cannot disable versioning once enabled, but you can suspend
    • Integrates with lifecycle rules
    • Can use MFA delete capability, so that you can't delete without MFA
    • Cross Region Replication requires versioning – only applies to files manipulated after CRR is turned on
    • Can take up a LOT of space on files that change a lot (because it stores each changed version)
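
The space blow-up in that last bullet follows directly from versioning storing full copies, not deltas. A quick sanity check, with illustrative sizes:

```python
def total_versioned_bytes(version_sizes):
    """S3 versioning keeps every version of an object in full, so the
    storage footprint for one key is the sum of all version sizes."""
    return sum(version_sizes)

# Illustration: a 100 MB file saved once and overwritten 9 times
# stores 10 full copies -- versioning is not delta-based.
MB = 1024 * 1024
versions = [100 * MB] * 10
footprint = total_versioned_bytes(versions)
```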

**S3 – Security & Encryption**

**CloudFront – CDN (Content Delivery Network)**

**Storage Gateway**

**Snowball (Import/Export) – 2 Types:**

  • Import/Export Disk

    • You ship your disks to the AWS site of your choice
    • Import into S3, Glacier, or EBS
    • Export from S3
  • Import/Export Snowball

    • Available in US, EU (Ireland) & APAC (Sydney)
    • 50TB or 80TB models available
    • 256-bit encryption
    • TPM ensures chain-of-custody
    • Import into S3 only
    • Export from S3

**S3 Transfer Acceleration (probably not in exam yet)**

  • Uses the Edge Network to accelerate uploads to your S3 bucket
  • Better performance the further you are away from your bucket
  • Incurs an additional fee

**EC2 (Elastic Compute Cloud)** – "A web service that provides resizable compute capacity in the cloud. Reduces the time required to obtain & boot new server instances to minutes, allowing the ability to quickly scale capacity both up and down."

Pricing models:

  • On Demand – pay fixed rate by the hour with no commitment

    • Best for burst need servers & unpredictable workloads that cannot be interrupted
    • For users that want flexibility of EC2 w/out up-front payments or long-term commitment
    • Test/Dev for apps running on EC2 for the 1st time.
    • Supplement reserved instance servers (for extra temporary server load)
  • Reserved – 1 or 3 year term. Discount compared to On Demand, the longer your contract, the more you save.

  • Spot – Allows you to bid for whatever price you want to pay for instance capacity (by hour).

    • When your bid = spot price, you get a server
    • When the spot price exceeds your bid, you lose the server (with a two-minute termination warning)
    • Best used for grid computing where instances are disposable & applications have flexible start/stop times
    • If spot instance is terminated by EC2, you don't get charged for partial hour of usage. If you terminate, you'll get charged for the full hour.
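
That last billing rule can be sketched as a tiny helper. This is a simplification of the hourly spot billing described above, not AWS's actual billing code:

```python
import math

def spot_billed_hours(hours_used, terminated_by):
    """Hourly spot billing rule from the notes: if AWS terminates the
    instance (spot price rose above your bid), the final partial hour
    is free; if you terminate it yourself, you pay the full hour."""
    if terminated_by == "aws":
        return math.floor(hours_used)  # partial hour not charged
    return math.ceil(hours_used)       # rounded up to the full hour
```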

EC2 Instance Types:

**EBS (Elastic Block Storage)** – Storage volumes that are attached to EC2 instances (think VMDKs)

**Know how to create a VPC from memory for the exam!**

  • When creating an AMI, on Step 4 (Add Storage), "Delete on Termination" is checked by default and the volume is not encrypted by default (and, separately, termination protection is turned off by default).

  • On an EBS-backed instance, the default action is for the root EBS vol to be deleted when the instance is terminated.
  • Root volumes cannot be encrypted by default; you'll need a 3rd-party tool (BitLocker, etc.) to encrypt root vols.

**Security Group Basics:**

**Volumes vs Snapshots**

  • Volume

    • A volume is a virtual hard disk (think VMDK)
    • Volumes exist on EBS
    • If you take a snapshot of a volume, this will store that volume on S3
  • Snapshot

Go into EC2 -> Volumes -> Create Volume (make sure it's in the same AZ as your server!) -> Actions -> Attach Volume.

Use `lsblk` to view disks and confirm the new volume attached.

Use `file -s /dev/xvdf` to make sure it's clean.

Use `mkfs -t ext4 /dev/xvdf` to make a file system, then `mkdir /fileserver` to create a directory, and `mount /dev/xvdf /fileserver` to mount it.

**Volumes vs Snapshots – Security**

**RAID, Volumes & Snapshots**

  • RAID = Redundant Array of Independent Disks

    • RAID 0 – Striped, no redundancy, good performance
    • RAID 1 – mirrored, redundancy
    • RAID 5 – good for reads, bad for writes, AWS does not recommend ever putting RAID 5's on EBS
    • RAID 10 – Striped & Mirrored, good redundancy, good performance
  • Why create a RAID in AWS?

    • Not getting Disk I/O that you require from GP2 or IO1 on a single volume.
  • How do you snap a RAID array?

**Create an AMI (Amazon Machine Image)**

AMI Types (EBS vs Instance Store)

**Elastic Load Balancers (ELB)**

**CloudWatch – Performance Monitoring Service**

  • Standard monitoring = 5 minutes

  • Detailed monitoring = 1 minute

  • Monitors the hypervisor, NOT the guest OS

  • Dashboards – create/configure widgets to monitor your environment

  • Alarms – notify when a given threshold is hit

  • Events – automatically respond to state changes in your AWS resources

  • Logs – aggregate, monitor & store logs. Agent installed onto EC2 instances

**AWS Command Line**

  • You can only assign a role to an EC2 instance during its creation!

  • AWS command line preinstalled on the AWS AMI

  • Commands:

    • aws configure

      • Input your access key ID, secret access key, default region name & output format (I just hit enter)
    • aws s3 help

      • Make Bucket = mb
      • Remove Bucket = rb


  • If you use roles, you don't have to store your credentials on your EC2 instance (which is a security risk)

**IAM – Roles**

  • Roles can only be assigned to an EC2 instance when you are launching it.
  • Roles are more secure than storing access keys on individual EC2 instances
  • Roles are easier to manage
  • They are universal, can be used in any region/AZ
  • Useful for:

**Bash Scripting**


**Instance Metadata**


**Auto Scaling Groups**

  • Have to have a launch configuration to have an auto scaling group
  • Can create rules to spin-up and/or shut down instances based on monitor triggers
  • Deleting an auto scaling group will automatically delete any instances it created

**EC2 Placement Groups**

  • A logical grouping of instances within a single AZ.

  • Enables applications to participate in a low-latency, 10 Gbps network

  • Recommended for apps that benefit from low latency networks, high network throughput, or both

    • Grid computing
    • Hadoop clusters
  • Name must be unique within your AWS account

  • Only certain types of instances can be launched in a placement group

    • Compute Optimized
    • GPU
    • Memory Optimized
    • Storage Optimized
  • AWS recommends homogenous instances within a placement group (size & family)

  • Can't merge placement groups

  • Can't move an existing instances into a placement group. You can create an AMI from your existing instance THEN launch a new instance from that AMI into a placement group… if you really wanted to.

**Elastic File System (not in exam yet)**

  • File storage for EC2 instances
  • Elastic capacity
  • Can mount multiple EC2 instances to 1 EFS "volume"
  • Supports NFSv4 & thousands of connections
  • Only pay for the storage you use (don't need to pre-provision)
  • Scales up to PBs
  • Data is stored across multiple AZs within a region
  • Read after write consistency
  • File-based storage (not block storage like EBS)

**Lambda Concepts**

**Route53 (DNS)**

  • IPv6 ranges supported as of December 1st, 2016.

  • Alias records work like CNAME records

    • Used to map resource record sets in your hosted zone to ELB, CloudFront distributions, or S3 buckets that are configured as websites.
    • Difference – a CNAME can't be used for naked domain names (i.e. w/out "www"), you can with A record or Alias.
    • Automatically recognizes changes in the record sets
  • ELBs don't have a pre-defined IPv4 address, resolved using DNS

    • This can be an issue because naked domain names need an IP address.
    • Hence the need for Alias records
  • Given a choice, always choose an Alias record because you won't incur additional charges (as you would with a CNAME)

DNS Routing Policies:

  • Simple

    • Default when you create a new record set
    • Most commonly used when you have a single resource that performs a given function (i.e. 1 webserver)
    • No built-in intelligence
  • Weighted

    • Split traffic based on weighted assignments (10% to X, 90% to Y)
    • Different regions, ELBs, AZs, etc.
    • Commonly used when testing a new website & you only want a small subset to see the new site
  • Latency

    • Route traffic based on lowest network latency for your end user
    • Need to create a latency resource record set for the EC2 or ELB resource in each region you want participating.
    • Great for improving global page load times
  • Failover

    • Used when you want to create an active/passive set up.
    • Route53 will monitor health of primary site using a health check (which monitors your end points)
  • Geolocation

    • You choose where traffic will be sent based on the location of your users
    • Ex. All EU users get routed to servers w/ local language and prices in Euros
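
Of the policies above, Weighted is the easiest to model: conceptually, each record wins a lookup with probability weight/total. A sketch (the record labels are made up, and real Route53 also factors in health checks):

```python
import random

def weighted_choice(weights, rng=random.random):
    """Conceptual model of a weighted routing policy: each record is
    returned with probability weight / total_weight.
    `weights` maps a record label to its relative weight."""
    total = sum(weights.values())
    roll = rng() * total
    cumulative = 0
    for name, weight in weights.items():
        cumulative += weight
        if roll < cumulative:
            return name
    return name  # floating-point guard when roll lands exactly on total

# The "new website" example above: 10% of lookups to the new site.
records = {"new-site": 10, "old-site": 90}
```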

**Databases**

  • RDS – relational databases have been around since the 70s. Database: tables, rows, fields (columns) -> think spreadsheet

  • DynamoDB – non-relational database (NoSQL)

    • Database:

      • Collection = Table
      • Document = Row
      • Key/Value pairs = Fields
  • ElastiCache

  • Redshift (data warehousing)

    • OLAP
    • Used for BI. Cognos, Jaspersoft, SAP Netweaver
    • Used to pull in large & complex data sets. Usually used to do queries on data.
  • DMS (database migration services)

    • Migrate your prod DB into AWS

    • AWS manages all the complexities of migration like data type transformation, compression & parallel xfer

    • Schema conversion tool:

      • Convert source DB to a different target DB (Oracle -> Aurora, etc…)
  • Backups, Multi-AZ & Read Replicas

    • Backups (2 types):

      • Automated

      • DB Snapshots

        • Done manually (user initiated), full backup
        • Stored even after you delete the original RDS instance, until you explicitly delete them
        • When you restore either automated or snap, the restored version will be a new RDS instance with a new endpoint
    • Encryption

      • At rest is supported for MySQL, Oracle, SQL Server, PostgreSQL & MariaDB
      • Done using AWS KMS
      • Once your RDS instance is encrypted at rest – underlying storage, backups, read replicas and snaps are also encrypted
      • Turning on encryption for an existing instance isn't supported… create a new encrypted instance & migrate data to it
    • Multi-AZ

      • Primary RDS instance uses synchronous replication to an RDS in a diff AZ.

      • Automatic failover, same DNS point, AWS handles replication

      • Disaster Recovery only, not performance improvement

      • Only in:

        • SQL Server
        • Oracle
        • MySQL Server
        • PostgreSQL
        • MariaDB
    • Read Replica

    • DynamoDB vs RDS

      • DynamoDB offers "push button" scaling -> scale DB on the fly with no downtime
      • RDS isn't as easy -> usually need to create bigger instance size manually or add a read replica
  • DynamoDB

  • Redshift

    • Fast (10 times faster), fully managed petabyte-scale data warehouse service

    • Can start small for $0.25 per hour with no commitments & scale up to PB or more for $1,000 per TB per year.

    • OLAP transactions

    • Data warehousing DBs use a different type of architecture, both from a DB perspective & at the infrastructure layer.

    • 2 Configurations:

      • Single node (160Gb)

      • Multi-node

        • Leader Node (manages client connections and receives queries)
        • Compute Node (store data & perform queries and computations). Up to 128 Compute Nodes
    • Columnar Data Storage – instead of rows, redshift organizes data by column

      • Only columns involved in the queries are processed
      • Columnar data is stored sequentially on the storage media
      • Block size of 1MB for columnar storage
      • Therefore requires far fewer I/Os, greatly improving performance
    • Advanced Compression

      • Columnar data can be compressed much better than row based data
      • Redshift automatically samples data & chooses the best compression scheme
    • Massively Parallel Processing (MPP):

      • Automatically distributes data & query load across all nodes & newly added nodes
    • Pricing:

      • Compute Node Hours

      • Backup

      • Data Transfer

    • Security

    • Only available in 1 AZ

      • Can restore snaps to new AZs in the event of an outage
    • Good choice if mgmt. runs lots of OLAP transactions & it's stressing the DB

    • Think Business Intelligence (BI)

  • Elasticache

    • Caches things – if your app is constantly going to a DB to pull the same data over and over, you can cache it for faster performance

    • Used to improve latency and throughput for read-heavy app workloads (social networks, gaming, media sharing) or compute heavy workloads (recommendation engine)

    • Improves application performance by storing critical pieces of data in mem for low-latency access.

    • Types of elasticache

      • Memcached

        • Widely adopted mem object caching system.
      • Redis

        • Open source in-mem key/value store.
      • Supports master/slave replication & multi-AZ to achieve cross AZ redundancy

    • Good choice if your DB is read heavy & not prone to frequent changing

  • Aurora

    • MySQL compatible RDS DB engine

    • Speed & availability of commercial DBs

    • Simplicity & cost-effectiveness of open source DBs

    • 5x better performance than MySQL @ 1/10th the price of commercial DB w/ similar performance & availability

    • Big challenge to Oracle

    • Scaling capabilities:

      • Start w/ 10GB, scales in 10GB increments up to 64TB
      • Compute scales up to 32 vCPUs & 244GB of mem
      • 2 copies of DB in each AZ w/ a min of 3 AZs (6 copies of data)
      • Can handle loss of 2 copies w/out affecting write availability
      • Can handle loss of 3 copies w/out affecting read availability
      • Storage is self-healing. Blocks & disks are constantly scanned & repaired
    • Replica features:

      • Aurora Replicas (currently 15)
      • MySQL read replicas (currently 5)
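
Redshift's columnar-storage claim earlier (far fewer I/Os because only the queried columns are read) can be sanity-checked with a back-of-envelope model; the table dimensions here are made up for illustration:

```python
def bytes_scanned(rows, col_widths, query_cols, columnar):
    """Rough I/O model: a row store must read every column of every
    row, while a column store reads only the queried columns."""
    if columnar:
        return rows * sum(col_widths[c] for c in query_cols)
    return rows * sum(col_widths.values())

# Hypothetical 1M-row table with ten 4-byte columns; the query uses two.
widths = {f"col{i}": 4 for i in range(10)}
row_io = bytes_scanned(1_000_000, widths, ["col0", "col1"], columnar=False)
col_io = bytes_scanned(1_000_000, widths, ["col0", "col1"], columnar=True)
```

With these numbers the column store scans 5x fewer bytes, before compression makes the gap even wider.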

**VPC (Virtual Private Cloud)**

  • **For the exam know how to build a custom VPC from memory**

  • VPC = think of it as a virtual datacenter

    • By default you are allowed 5 VPCs per region
    • Logically isolated section of AWS where you can launch AWS resources in a virtual network of your own definition
    • You control the network environment: IP address range, subnets, routing tables, gateways, etc.
  • Default VPC vs Custom VPC

    • Default is user friendly, can deploy instances right away
    • All subnets in the default VPC have an internet gateway attached
    • Each EC2 instance has both a public & private IP address
    • If you delete the default VPC, you have to call AWS to get it back
  • VPC Peering

    • Connect 1 VPC to another VPC via a direct network route using private IP addresses
    • Instances behave as if they were on the same private network
    • You can peer VPCs with other AWS accounts & with other VPCs in the same account, **within a single region**
    • AWS uses the existing infrastructure of a VPC to create a VPC peering connection
    • It is not a gateway or a VPN connection
    • It does not rely on a separate piece of hardware
    • No single point of failure for communication & no bandwidth bottleneck
    • Peering is done in a star configuration: VPC A <-> VPC B <-> VPC C means A cannot talk to C unless you connect them directly (no transitive peering)
    • **Peers cannot have matching or overlapping CIDR blocks**
  • By default when you create a VPC it will automatically create a route table

  • If you choose dedicated tenancy for your VPC, any instances you create in that VPC will also be dedicated

  • 1 subnet = 1 AZ; you cannot have a subnet cross AZs

  • Don't forget to add an internet gateway

    • 1 IGW per VPC
    • Need to attach the IGW after you create it
  • Need to create an internet route table if you want the VPC to communicate in/out
    **

  • Once you've created your IGW, any subnet associations you make to its route table will be internet accessible

  • A security group can span multiple AZs within a VPC, whereas a subnet cannot cross AZs

  • VPC Flow Logs:

**Network Address Translation (NAT)**

  • Allows instances that do not have internet access the ability to reach the internet via a NAT instance
  • Create a security group
  • Allow inbound & outbound on HTTP and HTTPS
  • Provision the NAT instance inside the public subnet
  • **On a NAT instance, you need to change the source/destination check to disabled**
  • Set up a route on the private subnet to route through the NAT instance

**Access Control Lists (ACLs)**

  • A numbered list of rules, evaluated in order, lowest rule number first
  • Applies network access rules across the entire subnet
  • Can override security groups – a DENY in the ACL blocks traffic before it ever reaches a security group
  • Acts as a basic firewall
  • A VPC automatically comes with a default ACL
  • When you create a new ACL, by default everything is DENY
  • Only one ACL per subnet, but many subnets can have the same ACL
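
The ordered, lowest-number-first evaluation can be sketched like this (a deliberate simplification that matches on destination port only; real NACL rules also match protocol, CIDR and direction):

```python
def evaluate_acl(rules, port):
    """Network ACL evaluation sketch: rules are checked in ascending
    rule-number order and the first match wins; anything unmatched
    falls through to the implicit catch-all DENY.
    `rules` is a list of (rule_number, port_or_None_for_any, action)."""
    for _, match_port, action in sorted(rules, key=lambda r: r[0]):
        if match_port is None or match_port == port:
            return action
    return "DENY"  # implicit default

rules = [
    (100, 80, "ALLOW"),   # allow HTTP
    (200, 80, "DENY"),    # never reached -- rule 100 matched first
    (300, None, "DENY"),  # deny everything else
]
```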
    **

**Application Services**

**SQS – most important service going into the exam**

  • Read the SQS FAQ for the exam

  • A distributed message queueing service that sits between a "producer" and a "consumer" to quickly and reliably cache messages

  • Allows you to decouple the components of an app so that they can run independently

  • Eases message management between components

  • Any component can later retrieve the queued message using the SQS API

  • A queue resolves issues if:

    • The producer is producing work faster than the consumer is processing it
    • The producer or consumer are only intermittently connected to the network
  • Ensures delivery of each message at least once

  • Supports multiple writers and readers on the same queue

  • Can apply autoscaling to SQS

  • A single queue can be used by many app components with no need for those components to coordinate amongst themselves to share the queue

  • SQS does NOT guarantee first in, first out (FIFO) delivery of messages

    • If you want this, you need to place sequencing information in each message so that you can reorder the messages after they come out of the queue
  • SQS is a pull-based system

  • Visibility timeout: 30 seconds by default, configurable up to a maximum of 12 hours

  • Engineered to provide "at least once" delivery of msgs, but you should design your app so that processing a message more than once won't create an error

  • Messages can contain up to 256KB of text in any format

  • Billed in 64KB "chunks" – a 256KB msg will be 4 x 64KB "chunks"

  • 1st 1 million SQS requests per month are free

  • $0.50 per 1 million requests per month thereafter

  • A single request can have from 1 to 10 messages, up to a max total payload of 256KB

  • Each 64KB 'chunk' of payload is billed as 1 request

    • Ex: 1 API call with a 256KB payload is billed as 4 requests
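
That 64KB-chunk billing works out as a one-liner; `billed_requests` is an illustrative helper for the rule in the notes, not an AWS API:

```python
import math

def billed_requests(payload_bytes):
    """SQS billing rule from the notes: each 64KB chunk of payload in
    a request is billed as one request, so a full 256KB payload costs
    the same as four small requests."""
    chunk = 64 * 1024
    return max(1, math.ceil(payload_bytes / chunk))
```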

**SWF – Simple Workflow Service**

**SNS – Simple Notification Service**

**Elastic Transcoder**

**White Paper Breakdown:**

**Overview of AWS:**

What is cloud computing? On-demand delivery of IT resources and apps via the Internet w/ pay-as-you-go pricing. The cloud provider maintains the network-connected hardware while you provision and use what you need via a web application.

6 Advantages of Cloud:

  1. Trade capex for "variable expense"
  2. Benefit from economies of scale
  3. Stop guessing about capacity
  4. Increase speed & agility
  5. Stop spending money running & maintaining datacenters
  6. Go global in minutes

**Overview of Security Processes:**

  • State of the art electronic surveillance and multi factor access control systems
  • Staffed 24×7 by security guards
  • Access is least privilege based

Shared Security Model – AWS is responsible for securing the underlying infrastructure. YOU are responsible for anything you put in the cloud or connect to the cloud.

AWS responsibilities:

  • Infrastructure (hardware, software, networking, facilities)
  • Security configuration of its managed services (DynamoDB, RDS, Redshift, Elastic MapReduce, WorkSpaces)

Customer responsibilities:

  • IAAS – EC2, VPC, S3
  • Managed services – Amazon is responsible for patching, AV etc… but YOU are responsible for account mgmt. and user access. Recommended that MFA is implemented, SSL/TLS is used for communication, & API/user activity is logged using CloudTrail

Storage Decommissioning:

  • AWS uses NIST 800-88 to destroy data. All decommed magnetic storage devices are degaussed and physically destroyed.

Network Security:

  • Transmission Protection – use HTTPS (SSL/TLS)
  • For customers who need additional layers of network security, AWS provides VPCs & the ability to use an IPSec VPN between their datacenter & the VPC
  • Amazon Corporate Segregation – AWS production network is segregated from the Amazon corporate network by a means of a complex set of network security/segregation devices
  • DDoS mitigation
  • Prevent Man in the middle attacks (MITM)
  • Prevent IP Spoofing – the AWS controlled, host-based firewall will not permit an instance to send traffic with a source IP or MAC other than its own.
  • Prevent Port Scanning – unauthorized port scans are a violation of the T&Cs. You must request a vulnerability scan in advance
  • Prevent Packet Sniffing by other tenants

AWS Credentials

  • Passwords
  • MFA
  • Access Keys
  • Key Pairs
  • X.509 certs

AWS Trusted Advisor

Instance Isolation

  • Instances on same physical machine are isolated from each other via the Xen hypervisor.

  • The AWS firewall resides within the hypervisor layer, between the physical network interface & the instance's virtual interface.

    • All packets must pass through this firewall
  • Physical RAM is separated using similar mechanisms

  • Customer instances have no access to raw disk devices, only virtual disks

  • AWS proprietary disk virtualization layer resets every block of storage used by the customer

    • Ensures customer X data isn't exposed to customer Y
  • Mem allocated to guest is scrubbed (zeroed out) by hypervisor when it becomes unprovisioned

    • Mem not returned to pool of free mem until scrubbing is complete
  • Guest OS

    • Instances are completely controlled by customer. AWS does not have any access rights or back doors to guest OSes
    • AWS provides the ability to encrypt EBS volumes & their snapshots with AES-256
  • Firewall:

    • EC2 provides a complete firewall solution. By default inbound is DENY-ALL
  • ELB – SSL Termination on the load balancer is supported

    • Allows you to ID the originating IP address of a client connecting to your servers, whether you are using HTTPS or TCP load balancing
  • Direct Connect:

    • Slower to provision than a VPN because it's a physical connection

    • Bypass ISPs in your network path (if you don't want traffic to traverse Internet)

    • Procure rack space within the facility housing the AWS Direct Connect location & deploy your equipment nearby.

    • Connect this equipment to AWS Direct Connect using a cross-connect

    • Use VLANs (802.1q) to use 1 connection to access both public (S3) and private (EC2 in a VPC) AWS resources

    • Available in

      • 10Gbps
      • 1Gbps
      • Sub 1Gbps groups purchased through AWS Direct Connect Partners

Risk and Compliance:

  • SOC 1/SSAE 16/ISAE 3402
  • SOC2
  • SOC3
  • FISMA, DIACAP, & FedRAMP
  • PCI DSS Level 1 – can take credit card information with PCI compliance (your software needs to be compliant too)
  • ISO 27001
  • ISO 9001
  • ITAR
  • FIPS 140-2
  • HIPAA
  • Cloud Security Alliance (CSA)
  • Motion Picture Association of America (MPAA)

AWS Platform:


**Storage Options in the Cloud: (2 docs?)**

**Architecting for the Cloud – Best Practices:**

Business Benefits:

  • Almost 0 upfront infrastructure investment
  • JIT infrastructure
  • More efficient resource utilization
  • Usage-based pricing
  • Reduced time to market

Technical Benefits:

  • Automation – "Scriptable infrastructure"
  • Auto-scaling
  • Proactive scaling
  • More efficient dev lifecycle
  • Improved testability
  • DR/BC baked in
  • "Overflow" traffic to the cloud

Design for Failure:

  • Assume that hardware will fail & outages will occur
  • Assume that you will be overloaded with requests
  • By being a pessimist, you think about recovery strategies during design time, which helps you design an overall better system

Decouple your components:

  • Build components that do not have tight dependencies so that if 1 component dies/sleeps/is busy, the other components are built so as to continue work as if no failure is happening.
  • If you see decoupling in exam, think SQS
  • WebServer – SQS – AppServer – SQS – DBServer

Implement Elasticity:

  • Proactive Cyclic Scaling – periodic scaling that occurs @ fixed intervals (daily, weekly, monthly, quarterly) i.e. "Payroll Monday"
  • Proactive Event Scaling – when you are expecting a big surge of traffic (Black Friday, new product launch, marketing campaign)
  • Auto-scaling based on demand – Create triggers in monitoring to scale up/down resources

Secure Your Application:

  • Only have the ports open to/from your various stacks to allow communication, no more (duh)

Consolidated Billing

  • 1 paying account for all linked accounts in an org
  • Paying account gets 1 monthly bill
  • Paying account cannot access resources of the linked accounts
  • All linked accounts are independent of each other
  • 20 linked accounts for consolidated billing (soft limit)
  • Easy to track charges & allocate costs
  • Volume pricing discount, resources of all your linked accounts are added up for discounts
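
The volume-discount effect of consolidation can be shown with made-up tier prices (these are not real AWS rates, just a hypothetical two-tier schedule):

```python
def tiered_cost(gb, tiers):
    """Apply tiered pricing: `tiers` is a list of (gb_in_tier, price_per_gb)
    consumed in order; the last tier should be large enough to cover
    any remaining usage."""
    cost, remaining = 0.0, gb
    for tier_gb, price in tiers:
        take = min(remaining, tier_gb)
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    return cost

# Hypothetical tiers: first 50 GB at $0.03/GB, everything after at $0.02/GB.
tiers = [(50, 0.03), (float("inf"), 0.02)]

# Two linked accounts using 40 GB each: billed separately, both stay in
# the expensive tier; consolidated, the combined 80 GB reaches the
# cheaper tier and the total bill drops.
separate = tiered_cost(40, tiers) + tiered_cost(40, tiers)
combined = tiered_cost(80, tiers)
```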

Resource Groups & Tagging

Active Directory Integration:

  • User browses to ADFS URL
  • User authenticates against AD
  • User receives a SAML assertion
  • User's browser posts the SAML assertion to the AWS sign-in endpoint for SAML
  • User's browser receives the sign-in URL and is redirected to the console
