AWS - Certified Solutions Architect (NOTES)
AWS Certified Solutions Architect – Associate Study Notes
I'm currently studying to sit the AWS Solutions Architect Associate certification. To do this I'm (a) going through the exam blueprint, (b) writing blogs on my AWS free-tier lab, and (c) watching the excellent course videos. I bought the "Associate Bundle", so I plan on taking all 3 associate-level courses eventually. I've decided to consolidate the past 5 articles into 1 large article for ease of searching (and so that I'm not updating 5 separate articles while I continue to study):
-
AWS Global Infrastructure
- 12 Regions & 33 AZs, 5 more Regions & 11 more AZs coming throughout the next year
- Region = 2 or more AZs
- AZ = DataCenter
- Edge Location = CDN End Points for CloudFront
-
Networking
- VPC = Virtual Private Cloud (essentially your virtual datacenter)
- Direct Connect = connecting to AWS w/out using Internet connection
- Route53 = DNS service (port 53… duh)
-
Compute
- EC2 = virtual server
- EC2 Container Service = EC2 with Docker
- Elastic Beanstalk = Service for deploying web applications and services. "AWS for beginners"
- Lambda = "Most powerful/revolutionary service". Run code w/out servers. Pay for execution time, only charged when code is executed.
-
Storage
- S3 = Object based storage, a place to store flat files in the cloud
- CloudFront = CDN (content delivery network), local caching of content
- Glacier = long term backup, 3-5 hours to retrieve data
- EFS = NAS in the cloud, file-level storage over NFS (in preview)
- Snowball = Import/Export service. For moving large amounts of data in/out of AWS. They ship you a physical suitcase of disks :)
- Storage Gateway = VM that you run locally that replicates data from local datacenter to AWS
-
Databases
- RDS = SQL, Aurora, Oracle, PostgreSQL, MySQL, MariaDB
- DynamoDB = NoSQL
- Elasticache = Caching DB services in cloud to relieve stress on RDS for high I/O environments
- Redshift = Data warehousing service. Great performance
- DMS = Database Migration Services. How to migrate/convert local DBs into AWS
-
Analytics
- EMR = Elastic MapReduce. A way of processing big data; a managed web service for Hadoop clusters
- Data Pipeline = moving data from one service to another
- Elasticsearch Service = Managed service to deploy/operate a search engine in the cloud
- Kinesis = managed service platform for real-time streaming of big data. Web apps, mobile devices, and wearables generate huge amounts of streaming data; use Kinesis to ingest it
- Machine Learning = for use by developers to work with machine learning… (not in test)
- QuickSight = Business Intelligence service (not in test)
Security & Identity
-
Management Tools
-
Application Svcs
- API Gateway = not in test
- AppStream = AWS version of XenApp
- CloudSearch = Managed search solution
- Elastic Transcoder = Media transcoding service, change media files from source format to destination format
- SES = Simple Email Service = send/receive emails
- SQS = Simple Queue Service, a way of decoupling infrastructure
- SWF = Simple WorkFlow Service
-
Dev Tools (not in test)
- CodeCommit = "Github"
- CodeDeploy = automates code deployment
- CodePipeline = build, test, deploy code
-
Mobile Svcs (not in test, except for SNS)
- Mobile Hub = test mobile apps
- Cognito = save mobile user data in AWS cloud
- Device Farm = test against real smartphones & tablets in AWS cloud
- Mobile Analytics =
- SNS = big topic in exam, Simple Notification Service. Way to send notifications from cloud
-
Enterprise Applications
- WorkSpaces = VDI
  - Replaces a Windows PC in the cloud (PCoIP)
  - Runs Windows 7, provided by Windows Server 2008 R2
  - Are persistent (EBS)
  - All data on the D: drive is backed up every 12 hours
  - Do not need an AWS account to login to WorkSpaces
  - Don't need an existing AD domain, can use the free client app
  - Can integrate with an existing AD domain
  - By default:
    - Users can personalize their WorkSpaces with wallpaper, icons, shortcuts, etc.
    - Users have local admin access to install apps
- WorkDocs = DropBox for the enterprise
- WorkMail = Exchange
IoT
- Internet of Things = not in test
**Identity Access Management (IAM)**
- Central control of AWS account
- Share access
- Granular permissions of accounts/groups/roles/policies
- Identity Federation (AD, Facebook, LinkedIn, etc…)
- MFA = Multi Factor Authentication
- Temp access for users/devices/services
- Pwd rotation policy highly customizable
- Policies = JSON key/value pairs
- IAM is universal, applies to all regions consistently
- New Users have no permissions when 1st created
- New Users are assigned an access key ID & secret access key when first created, only viewable once so download it & secure!
- Always setup MFA on root
- Integrated with AWS marketplace
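
Since policies are JSON documents, here is a minimal sketch of attaching an inline policy to a user with the CLI. The user name, policy name, and bucket are placeholders invented for illustration:

```bash
# Hypothetical example: grant a user read-only access to one S3 bucket.
# "alice", "s3-read-only" and "my-example-bucket" are placeholder names.
aws iam put-user-policy \
  --user-name alice \
  --policy-name s3-read-only \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
          "arn:aws:s3:::my-example-bucket",
          "arn:aws:s3:::my-example-bucket/*"
        ]
      }
    ]
  }'
```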
**S3**
- Secure, durable, highly scalable object storage. "Unlimited storage". A hard drive in the cloud.
- Object based, NOT block based storage (no OS or DBs -> that's Elastic Block Storage (EBS)). i.e. it allows you to upload files
- 0 bytes to 5TB file size
- Files are stored in buckets
- S3 is a universal namespace, so each bucket name must be unique
- EXAM Tips:
  - Read after Write consistency for PUTS of new objects
  - Eventual consistency for overwrite PUTS and DELETES, as it can take time to propagate
- S3 = Object based. Objects consist of the following:
  - Key = name of the object
  - Value = the data
  - Version ID (for versioning)
  - Metadata (tags)
  - Subresources
  - Access Control Lists (ACLs)
- 99.99% availability
- 99.999999999% (11 nines) durability
- Tiered storage
- Lifecycle management
  - Can be used in conjunction with versioning
  - Can be applied to both current & previous versions
  - Actions:
    - Transition to S3-IA (128KB & 30 days after creation)
    - Archive to Glacier (30 days after S3-IA, if relevant)
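
A sketch of those two lifecycle actions expressed with the CLI. The bucket name and rule ID are placeholders, and the JSON follows the s3api lifecycle format:

```bash
# Hypothetical example: move objects to S3-IA after 30 days, then Glacier 30 days later.
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-example-bucket \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "tier-then-archive",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},
        "Transitions": [
          {"Days": 30, "StorageClass": "STANDARD_IA"},
          {"Days": 60, "StorageClass": "GLACIER"}
        ]
      }
    ]
  }'
```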
- Encryption, ACLs & Bucket Policies
- Storage Tiers
- Versioning
  - Stores all versions of an object (including all writes and deletes)
  - Great backup tool
  - Cannot disable versioning once enabled, but you can suspend it
  - Integrates with lifecycle rules
  - Can use the MFA Delete capability, so that versions can't be deleted without MFA
  - Cross Region Replication requires versioning – it only applies to files manipulated after CRR is turned on
  - Can take up a LOT of space for files that change a lot (because it stores each changed version)
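
Enabling (or suspending) versioning from the CLI, with a placeholder bucket name:

```bash
# Turn on versioning for a bucket.
aws s3api put-bucket-versioning \
  --bucket my-example-bucket \
  --versioning-configuration Status=Enabled

# You can never fully disable it again, only suspend it:
aws s3api put-bucket-versioning \
  --bucket my-example-bucket \
  --versioning-configuration Status=Suspended
```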
**S3 – Security & Encryption**

**CloudFront – CDN (Content Delivery Network)**

**Storage Gateway**

**Snowball (Import/Export) – 2 Types:**
- Import/Export Disk
  - You ship your disks to an AWS site of your choice
  - Import into S3, Glacier, or EBS
  - Export from S3
- Import/Export Snowball
  - Available in US, EU (Ireland) & APAC (Sydney)
  - 50TB or 80TB models available
  - 256-bit encryption
  - TPM ensures chain-of-custody
  - Import into S3 only
  - Export from S3

**S3 Transfer Acceleration (probably not in exam yet)**

- Uses the Edge Network to accelerate uploads to your S3 bucket
- Better performance the further you are from your bucket
- Incurs an additional fee
**EC2 (Elastic Compute Cloud)** – "A web service that provides resizable compute capacity in the cloud. Reduces the time required to obtain & boot new server instances to minutes, allowing the ability to quickly scale capacity both up and down."
Pricing models:
- On Demand – pay a fixed rate by the hour with no commitment
  - Best for burst-need servers & unpredictable workloads that cannot be interrupted
  - For users that want the flexibility of EC2 w/out up-front payments or long-term commitment
  - Test/Dev for apps running on EC2 for the 1st time
  - Supplement reserved instance servers (for extra temporary server load)
- Reserved – 1 or 3 year term. Discount compared to On Demand; the longer your contract, the more you save.
- Spot – Allows you to bid whatever price you want to pay for instance capacity (by the hour)
  - When your bid meets or exceeds the spot price, you get a server
  - When the spot price exceeds your bid, you lose the server (with a 2-minute interruption notice)
  - Best used for grid computing where instances are disposable & applications have flexible start/stop times
  - If a spot instance is terminated by EC2, you don't get charged for the partial hour of usage. If you terminate it yourself, you'll get charged for the full hour.
EC2 Instance Types:
**EBS (Elastic Block Storage)** – Storage volumes that are attached to EC2 instances (think VMDKs)

**Know how to create a VPC from memory for the exam!**
- When creating an AMI, on Step 4 (Add Storage), "Delete on Termination" is checked and encryption is off by default; termination protection on the instance itself is also turned off by default
- On an EBS-backed instance, the default action is for the root EBS volume to be deleted when the instance is terminated
- Root volumes cannot be encrypted by default; you'll need a 3rd party tool (BitLocker, etc.) to encrypt root volumes
**Security Group Basics:**

**Volumes vs Snapshots**
- Volume
  - A volume is a virtual hard disk (think VMDK)
  - Volumes exist on EBS
  - If you take a snapshot of a volume, this will store that volume on S3
- Snapshot
Go into EC2 -> Volumes -> Create Volume (make sure it's in the same AZ as your server!) -> Actions -> Attach Volume.
Use _lsblk_ to view disks and confirm the new volume is attached.
Use _file -s /dev/xvdf_ to make sure it's clean (it reports "data" if there is no filesystem).
Use _mkfs -t ext4 /dev/xvdf_ to make the file system, then _mkdir /fileserver_ to create a directory, & _mount /dev/xvdf /fileserver_ to mount it.
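
Putting those commands together as one runnable sequence (the device name /dev/xvdf and mount point /fileserver follow the notes above; run as root on the instance after attaching the volume in the console):

```bash
#!/bin/bash
# Volume has already been attached in the console (same AZ as the instance!).
lsblk                          # confirm the new disk (e.g. xvdf) is visible
file -s /dev/xvdf              # "data" means the volume is empty / has no filesystem
mkfs -t ext4 /dev/xvdf         # create an ext4 filesystem
mkdir /fileserver              # create the mount point
mount /dev/xvdf /fileserver    # mount the volume
df -h /fileserver              # verify it is mounted
```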
**Volumes vs Snapshots – Security**

**RAID, Volumes & Snapshots**
- RAID = Redundant Array of Independent Disks
  - RAID 0 – striped, no redundancy, good performance
  - RAID 1 – mirrored, redundancy
  - RAID 5 – good for reads, bad for writes; AWS does not recommend ever putting RAID 5 on EBS
  - RAID 10 – striped & mirrored, good redundancy, good performance
- Why create a RAID in AWS?
  - You're not getting the disk I/O that you require from GP2 or IO1 on a single volume
- How do you snapshot a RAID array?
**Create an AMI (Amazon Machine Image)**

**AMI Types (EBS vs Instance Store)**

**Elastic Load Balancers (ELB)**

**CloudWatch – Performance Monitoring Service**
- Standard monitoring = 5 minutes
- Detailed monitoring = 1 minute
- Monitors the hypervisor, NOT the guest OS
- Dashboards – create/configure widgets to monitor your environment
- Alarms – notify you when a given threshold is hit
- Events – automatically respond to state changes in your AWS resources
- Logs – aggregate, monitor & store logs. Agent installed onto EC2 instances
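
A sketch of creating an alarm from the CLI. The instance ID and SNS topic ARN are placeholders:

```bash
# Hypothetical example: alarm when average CPU > 80% for two 5-minute periods.
aws cloudwatch put-metric-alarm \
  --alarm-name cpu-high \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts
```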
**AWS Command Line**
- You can only assign a role to an EC2 instance during its creation!
- The AWS command line is preinstalled on the Amazon Linux AMI
- Commands:
  - aws configure
    - Input the access key, secret access key, default region name & output format (I just hit enter)
  - aws s3 help
    - Make Bucket = mb
    - Remove Bucket = rb
- If you use roles, you don't have to store your credentials on your EC2 instance (which is a security risk)
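
For example (the bucket name is a placeholder):

```bash
aws configure                                 # prompts for access key, secret key, region, output format
aws s3 mb s3://my-example-bucket              # make bucket
aws s3 ls                                     # list buckets
aws s3 cp hello.txt s3://my-example-bucket/   # upload a file
aws s3 rb s3://my-example-bucket --force      # remove bucket (and its contents)
```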
**IAM – Roles**
- Roles can only be assigned to an EC2 instance when you are launching it.
- Roles are more secure than storing access keys on individual EC2 instances
- Roles are easier to manage
- They are universal, can be used in any region/AZ
- Useful for:
**Bash Scripting**
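
A minimal sketch of the kind of bootstrap (user data) script covered here, assuming an Amazon Linux instance; the web-server setup is illustrative only:

```bash
#!/bin/bash
# Hypothetical bootstrap script passed as EC2 user data on an Amazon Linux instance.
yum update -y
yum install httpd -y
service httpd start
chkconfig httpd on
echo "<html><h1>Hello from $(hostname -f)</h1></html>" > /var/www/html/index.html
```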
**Instance Metadata**
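
Instance metadata is queried from the link-local address inside the instance, e.g.:

```bash
curl http://169.254.169.254/latest/meta-data/             # list available metadata categories
curl http://169.254.169.254/latest/meta-data/public-ipv4  # this instance's public IP
curl http://169.254.169.254/latest/user-data              # the bootstrap script, if any
```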
**Auto Scaling Groups**
- Have to have a launch configuration to have an auto scaling group
- Can create rules to spin-up and/or shut down instances based on monitor triggers
- Deleting an auto scaling group will automatically delete any instances it created
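
A sketch of the launch configuration plus group from the CLI. The names, AMI ID and AZs are placeholders:

```bash
# Hypothetical example: a launch configuration and an auto scaling group built from it.
aws autoscaling create-launch-configuration \
  --launch-configuration-name web-lc \
  --image-id ami-0123456789abcdef0 \
  --instance-type t2.micro

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name web-asg \
  --launch-configuration-name web-lc \
  --min-size 1 --max-size 3 --desired-capacity 2 \
  --availability-zones us-east-1a us-east-1b
```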
**EC2 Placement Groups**
- A logical grouping of instances within a single AZ
- Enables applications to participate in a low-latency, 10 Gbps network
- Recommended for apps that benefit from low-latency networks, high network throughput, or both
  - Grid computing
  - Hadoop clusters
- Name must be unique within your AWS account
- Only certain types of instances can be launched in a placement group:
  - Compute Optimized
  - GPU
  - Memory Optimized
  - Storage Optimized
- AWS recommends homogenous instances within a placement group (size & family)
- Can't merge placement groups
- Can't move an existing instance into a placement group. You can create an AMI from your existing instance THEN launch a new instance from that AMI into a placement group… if you really wanted to.
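
A sketch of creating a placement group and launching into it. The group name, AMI ID and instance type are placeholders:

```bash
# Hypothetical example: create a cluster placement group, then launch an instance into it.
aws ec2 create-placement-group --group-name my-cluster-pg --strategy cluster
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type c4.large \
  --placement GroupName=my-cluster-pg
```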
**Elastic File System (not in exam yet)**
- File storage for EC2 instances
- Elastic capacity
- Can mount multiple EC2 instances to 1 EFS "volume"
- Supports NFSv4 & thousands of connections
- Only pay for the storage you use (don't need to pre-provision)
- Scales up to PBs
- Data is stored across multiple AZs within a region
- Read after write consistency
- File-based storage (NFS), not block based
**Lambda concepts**
**Route53 (DNS)**

- IPv6 ranges supported as of December 1st, 2016
- Alias records work like CNAME records
  - Used to map resource record sets in your hosted zone to ELBs, CloudFront distributions, or S3 buckets that are configured as websites
  - Difference – a CNAME can't be used for naked domain names (i.e. w/out "www"); you can with an A record or Alias
  - Automatically recognizes changes in the record sets
- ELBs don't have a pre-defined IPv4 address; you resolve them using DNS
  - This can be an issue because naked domain names need an IP address
  - Hence the need for Alias records
- Given a choice, always choose an Alias record because you won't incur additional charges (as you would with a CNAME)
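
A sketch of creating an Alias record that points a naked domain at an ELB. The hosted zone IDs and DNS names are placeholders (every ELB region has its own canonical hosted zone ID):

```bash
# Hypothetical example: alias the zone apex "example.com" to a classic ELB.
aws route53 change-resource-record-sets \
  --hosted-zone-id Z1EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "example.com.",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "my-elb-1234567890.us-east-1.elb.amazonaws.com.",
          "EvaluateTargetHealth": false
        }
      }
    }]
  }'
```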
DNS Routing Policies:
- Simple
  - Default when you create a new record set
  - Most commonly used when you have a single resource that performs a given function (i.e. 1 webserver)
  - No built-in intelligence
- Weighted
  - Split traffic based on weighted assignments (10% to X, 90% to Y)
  - Different regions, ELBs, AZs, etc.
  - Commonly used when testing a new website & you only want a small subset of users to see the new site
- Latency
  - Route traffic based on the lowest network latency for your end user
  - Need to create a latency resource record set for the EC2 or ELB resource in each region you want participating
  - Great for improving global page load times
- Failover
  - Used when you want to create an active/passive setup
  - Route53 will monitor the health of the primary site using a health check (which monitors your endpoints)
- Geolocation
  - You choose where traffic will be sent based on the location of your users
  - Ex. all EU users get routed to servers w/ local language and prices in Euros
**Databases**
- RDS – relational databases, which have been around since the 70s. Database: tables, rows, fields (columns) -> think spreadsheet
- DynamoDB – non-relational database (NoSQL)
  - Database:
    - Collection = Table
    - Document = Row
    - Key/Value pairs = Fields
- ElastiCache
- Redshift (data warehousing)
  - OLAP
  - Used for BI: Cognos, Jaspersoft, SAP NetWeaver
  - Used to pull in large & complex data sets. Usually used to run queries on data
- DMS (Database Migration Service)
  - Migrate your prod DB into AWS
  - AWS manages all the complexities of migration like data type transformation, compression & parallel transfer
  - Schema Conversion Tool:
    - Convert a source DB to a different target DB (Oracle -> Aurora, etc…)
- Backups, Multi-AZ & Read Replicas
- Backups (2 types):
  - Automated
  - DB Snapshots
    - Done manually (user initiated), full backup
    - Stored even after you delete the original RDS instance, until you explicitly delete them
  - When you restore either an automated backup or a snapshot, the restored version will be a new RDS instance with a new endpoint
- Encryption
  - At rest is supported for MySQL, Oracle, SQL Server, PostgreSQL & MariaDB
  - Done using AWS KMS
  - Once your RDS instance is encrypted at rest, the underlying storage, backups, read replicas and snapshots are also encrypted
  - Turning on encryption for an existing instance isn't supported… create a new encrypted instance & migrate the data to it
- Multi-AZ
  - The primary RDS instance uses synchronous replication to an RDS instance in a different AZ
  - Automatic failover, same DNS endpoint, AWS handles replication
  - Disaster Recovery only, not a performance improvement
  - Only available for:
    - SQL Server
    - Oracle
    - MySQL
    - PostgreSQL
    - MariaDB
- Read Replica
- DynamoDB vs RDS
  - DynamoDB offers "push button" scaling -> scale the DB on the fly with no downtime
  - RDS isn't as easy -> you usually need to change to a bigger instance size manually or add a read replica
- DynamoDB
- Redshift
  - Fast (10 times faster), fully managed petabyte-scale data warehouse service
  - Can start small for $0.25 per hour with no commitments & scale up to a PB or more for $1,000 per TB per year
  - OLAP transactions
  - Data warehousing DBs use a different type of architecture from both a DB perspective & infrastructure layer
  - 2 Configurations:
    - Single node (160GB)
    - Multi-node
      - Leader Node (manages client connections and receives queries)
      - Compute Nodes (store data & perform queries and computations). Up to 128 Compute Nodes
  - Columnar Data Storage – instead of rows, Redshift organizes data by column
    - Only the columns involved in the queries are processed
    - Columnar data is stored sequentially on the storage media
    - Block size of 1MB for columnar storage
    - Therefore requires far fewer I/Os, greatly improving performance
  - Advanced Compression
    - Columnar data can be compressed much better than row-based data
    - Redshift automatically samples data & chooses the best compression scheme
  - Massively Parallel Processing (MPP):
    - Automatically distributes data & query load across all nodes & newly added nodes
  - Pricing:
    - Compute Node Hours
    - Backup
    - Data Transfer
  - Security
  - Only available in 1 AZ
    - Can restore snapshots to a new AZ in the event of an outage
  - Good choice if management runs lots of OLAP transactions & it's stressing the DB
  - Think Business Intelligence (BI)
- ElastiCache
  - Caches things – if your app is constantly going to a DB to pull the same data over and over, you can cache it for faster performance
  - Used to improve latency and throughput for read-heavy app workloads (social networks, gaming, media sharing) or compute-heavy workloads (recommendation engines)
  - Improves application performance by storing critical pieces of data in memory for low-latency access
  - Types of ElastiCache:
    - Memcached
      - Widely adopted in-memory object caching system
    - Redis
      - Open-source in-memory key/value store
      - Supports master/slave replication & Multi-AZ to achieve cross-AZ redundancy
  - Good choice if your DB is read heavy & not prone to frequent changes
- Aurora
  - MySQL-compatible RDS DB engine
  - Speed & availability of commercial DBs
  - Simplicity & cost-effectiveness of open source DBs
  - 5x better performance than MySQL @ 1/10th the price of a commercial DB w/ similar performance & availability
  - Big challenge to Oracle
  - Scaling capabilities:
    - Start w/ 10GB, scales in 10GB increments up to 64TB
    - Compute scales up to 32 vCPUs & 244GB of memory
    - 2 copies of the DB in each AZ w/ a minimum of 3 AZs (6 copies of data)
    - Can handle loss of 2 copies w/out affecting write availability
    - Can handle loss of 3 copies w/out affecting read availability
    - Storage is self-healing. Blocks & disks are continuously scanned & repaired
  - Replica features:
    - Aurora Replicas (currently 15)
    - MySQL read replicas (currently 5)
**VPC (Virtual Private Cloud)**

- **For the exam, know how to build a custom VPC from memory**
- VPC = think of it as a Virtual Datacenter
  - By default you are allowed 5 VPCs per region
  - Logically isolated section of AWS where you can launch AWS resources in a virtual network of your own definition
  - You control the network environment: IP address range, subnets, routing tables, gateways, etc.
- Default VPC vs Custom VPC
  - Default is user friendly, you can deploy instances right away
  - All subnets in the default VPC have an internet gateway attached
  - Each EC2 instance has both a public & private IP address
  - If you delete the default VPC, you have to call AWS to get it back
- VPC Peering
  - Connect 1 VPC to another VPC via a direct network route using private IP addresses
  - Instances behave as if they were on the same private network
  - You can peer VPCs with other AWS accounts & with other VPCs in the same account, within a single region
  - AWS uses the existing infrastructure of a VPC to create a VPC peering connection
  - It is not a gateway or a VPN connection, and it does not rely on a separate piece of hardware
  - No single point of failure for communication or bandwidth bottleneck
  - Peering is done in a star configuration: VPC A <-> VPC B <-> VPC C means A cannot talk to C unless you connect them directly (no transitive peering)
  - Peers cannot have matching or overlapping CIDR blocks
- By default, when you create a VPC it will automatically create a route table
- If you choose dedicated tenancy for your VPC, any instances you create in that VPC will also be dedicated
- 1 subnet = 1 AZ; you cannot have a subnet cross AZs
- Don't forget to add an internet gateway
  - 1 IGW per VPC
  - Need to attach the IGW after you create it
- Need to create an internet route table if you want the VPC to communicate in/out
- Once you've created your IGW, any subnet associations you make to its route table will be internet accessible
- A security group can stretch across multiple AZs, where a subnet cannot
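
A CLI sketch of the "build it from memory" flow. The CIDR blocks and IDs are placeholders; each command returns the ID you feed into the next one:

```bash
# Hypothetical example: VPC, one public subnet, IGW, and a public route table.
aws ec2 create-vpc --cidr-block 10.0.0.0/16                       # returns vpc-xxxx
aws ec2 create-subnet --vpc-id vpc-xxxx --cidr-block 10.0.1.0/24 \
  --availability-zone us-east-1a                                  # returns subnet-xxxx
aws ec2 create-internet-gateway                                   # returns igw-xxxx
aws ec2 attach-internet-gateway --internet-gateway-id igw-xxxx --vpc-id vpc-xxxx
aws ec2 create-route-table --vpc-id vpc-xxxx                      # returns rtb-xxxx
aws ec2 create-route --route-table-id rtb-xxxx \
  --destination-cidr-block 0.0.0.0/0 --gateway-id igw-xxxx
aws ec2 associate-route-table --route-table-id rtb-xxxx --subnet-id subnet-xxxx
```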
- VPC Flow Logs:
**Network Address Translation (NAT)**

- Allows instances that do not have internet access to reach the internet via a NAT instance
  - Create a security group
  - Allow inbound & outbound on HTTP and HTTPS
  - Provision the NAT instance inside the public subnet
  - On a NAT instance, you need to change the source/destination check to disabled (see the CLI example below)
  - Set up a route on the private subnet to route through the NAT instance
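
Disabling the source/destination check can also be done from the CLI (the instance ID is a placeholder):

```bash
aws ec2 modify-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --no-source-dest-check
```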
**Access Control Lists (ACLs)**

- A numbered list of rules (evaluated in order, lowest number first)
- Network access rules applied across the entire subnet
- Overrules security groups
- Acts as a basic firewall
- A VPC automatically comes with a default ACL
- When you create a new ACL, by default everything is DENY
- Only one ACL per subnet, but many subnets can share the same ACL
**Application Services**

**SQS – most important service going into the exam**
- Read the SQS FAQ for the exam
- A distributed message queueing service that sits between a "producer" and a "consumer" to quickly and reliably cache messages
- Allows you to decouple the components of an app so that they can run independently
- Eases message management between components
- Any component can later retrieve the queued message using the SQS API
- A queue resolves issues if:
  - The producer is producing work faster than the consumer can process it
  - The producer or consumer are only intermittently connected to the network
- Ensures delivery of each message at least once
- Supports multiple writers and readers on the same queue
- Can apply auto scaling to SQS
- A single queue can be used by many app components with no need for those components to coordinate amongst themselves to share the queue
- SQS does NOT guarantee first in, first out (FIFO) delivery of messages
  - If you want this, you need to place sequencing information in each message so that you can reorder the messages after they come out of the queue
- SQS is a pull-based system
- Visibility timeout: maximum of 12 hours
- Engineered to provide "at least once" delivery of messages, but you should design your app so that processing a message more than once won't create an error
- Messages can contain up to 256KB of text in any format
- Billed in 64KB "chunks" – a 256KB message will be 4 x 64KB "chunks"
- 1st 1 million SQS requests per month are free
- $0.50 per 1 million requests per month thereafter
- A single request can have from 1 to 10 messages, up to a max total payload of 256KB
- Each 64KB "chunk" of payload is billed as 1 request
  - Ex: 1 API call with a 256KB payload is billed as 4 requests
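
A sketch of the basic produce/consume cycle with the CLI. The queue name, account ID and region are placeholders:

```bash
# Hypothetical example: create a queue, send a message, poll it, then delete it.
aws sqs create-queue --queue-name my-work-queue          # returns the QueueUrl
QUEUE_URL=https://sqs.us-east-1.amazonaws.com/123456789012/my-work-queue

# Producer pushes a message onto the queue.
aws sqs send-message --queue-url "$QUEUE_URL" --message-body '{"job": 42}'

# Consumer pulls the message (long polling for up to 10 seconds)...
aws sqs receive-message --queue-url "$QUEUE_URL" --wait-time-seconds 10
# ...and must delete it using the ReceiptHandle returned above,
# otherwise it reappears after the visibility timeout.
aws sqs delete-message --queue-url "$QUEUE_URL" --receipt-handle "<ReceiptHandle>"
```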
**SWF – Simple Workflow Service**

**SNS – Simple Notification Service**

**Elastic Transcoder**

**White Paper Breakdown:**

**Overview of AWS:**
What is cloud computing? On-demand delivery of IT resources and apps via the Internet w/ pay-as-you-go pricing. Cloud providers maintain the network-connected hardware while you provision and use what you need via web applications.
6 Advantages of Cloud:
- Trade capex for "variable expense"
- Benefit from economies of scale
- Stop guessing about capacity
- Increase speed & agility
- Stop spending money running & maintaining datacenters
- Go global in minutes
**Overview of Security Processes:**
- State of the art electronic surveillance and multi factor access control systems
- Staffed 24×7 by security guards
- Access is least privilege based
Shared Security Model – AWS is responsible for securing the underlying infrastructure. YOU are responsible for anything you put in the cloud or connect to the cloud.
AWS responsibilities:
- Infrastructure (hardware, software, networking, facilities)
- Security configuration of its managed services (DynamoDB, RDS, Redshift, Elastic MapReduce, WorkSpaces)
Customer responsibilities:
- IaaS – EC2, VPC, S3
- Managed services – Amazon is responsible for patching, AV, etc… but YOU are responsible for account management and user access. It is recommended that MFA is implemented, SSL/TLS is used for communication, & API/user activity is logged using CloudTrail
Storage Decommissioning:
- AWS uses NIST 800-88 to destroy data. All decommed magnetic storage devices are degaussed and physically destroyed.
Network Security:
- Transmission Protection – Use HTTPS using SSL
- For customers who need additional layers of network security, AWS provides VPCs & the ability to use an IPSec VPN between their datacenter & the VPC
- Amazon Corporate Segregation – AWS production network is segregated from the Amazon corporate network by a means of a complex set of network security/segregation devices
- DDoS mitigation
- Prevent Man in the middle attacks (MITM)
- Prevent IP Spoofing – the AWS controlled, host-based firewall will not permit an instance to send traffic with a source IP or MAC other than its own.
- Prevent Port Scanning – Unauthorized port scans are a violation of the T&Cs. You must request permission for a vulnerability scan in advance
- Prevent Packet Sniffing by other tenants
AWS Credentials
- Passwords
- MFA
- Access Keys
- Key Pairs
- X.509 certs
AWS Trusted Advisor
Instance Isolation
- Instances on the same physical machine are isolated from each other via the Xen hypervisor
- The AWS firewall resides within the hypervisor layer, between the physical network interface & the instance's virtual interface
  - All packets must pass through this firewall
- Physical RAM is separated using similar mechanisms
- Customer instances have no access to raw disk devices, only virtual disks
- AWS' proprietary disk virtualization layer resets every block of storage used by the customer
  - Ensures customer X's data isn't exposed to customer Y
- Memory allocated to a guest is scrubbed (zeroed out) by the hypervisor when it becomes unprovisioned
  - Memory is not returned to the pool of free memory until scrubbing is complete
- Guest OS
  - Instances are completely controlled by the customer. AWS does not have any access rights or back doors into guest OSes
  - AWS provides the ability to encrypt EBS volumes & their snapshots with AES-256
- Firewall:
  - EC2 provides a complete firewall solution. By default inbound is DENY ALL
- ELB – SSL termination on the load balancer is supported
  - Allows you to identify the originating IP address of a client connecting to your servers, whether you are using HTTPS or TCP load balancing
- Direct Connect:
  - Slower to provision than a VPN because it's a physical connection
  - Bypasses ISPs in your network path (if you don't want traffic to traverse the Internet)
  - Procure rack space within the facility housing the AWS Direct Connect location & deploy your equipment nearby
  - Connect this equipment to AWS Direct Connect using a cross-connect
  - Use VLANs (802.1q) to use 1 connection to access both public (S3) and private (EC2 in a VPC) AWS resources
  - Available in:
    - 10Gbps
    - 1Gbps
    - Sub-1Gbps connections purchased through AWS Direct Connect Partners
- Risk and Compliance:
  - SOC 1/SSAE 16/ISAE 3402
  - SOC 2
  - SOC 3
  - FISMA, DIACAP, & FedRAMP
  - PCI DSS Level 1 ← can take credit card information with PCI compliance (the software needs to be compliant too)
  - ISO 27001
  - ISO 9001
  - ITAR
  - FIPS 140-2
  - HIPAA
  - Cloud Security Alliance (CSA)
  - Motion Picture Association of America (MPAA)
AWS Platform:

**Storage Options in the Cloud: (2 docs?)**

**Architecting for the Cloud – Best Practices:**
Business Benefits:
- Almost 0 upfront infrastructure investment
- JIT infrastructure
- More efficient resource utilization
- Usage-based pricing
- Reduced time to market
Technical Benefits:
- Automation – "Scriptable infrastructure"
- Auto-scaling
- Proactive scaling
- More efficient dev lifecycle
- Improved testability
- DR/BC baked in
- "Overflow" traffic to the cloud
Design for Failure:
- Assume that hardware will fail & outages will occur
- Assume that you will be overloaded with requests
- By being a pessimist, you think about recovery strategies during design time, which helps you design an overall better system
Decouple your components:
- Build components that do not have tight dependencies so that if 1 component dies/sleeps/is busy, the other components are built so as to continue work as if no failure is happening.
- If you see decoupling in exam, think SQS
- WebServer – SQS – AppServer – SQS – DBServer
Implement Elasticity:
- Proactive Cyclic Scaling – periodic scaling that occurs @ fixed intervals (daily, weekly, monthly, quarterly) i.e. "Payroll Monday"
- Proactive Event Scaling – when you are expecting a big surge of traffic (Black Friday, new product launch, marketing campaign)
- Auto-scaling based on demand – Create triggers in monitoring to scale up/down resources
Secure Your Application:
- Only have the ports open to/from your various stacks to allow communication, no more (duh)
Consolidated Billing
- 1 paying account for all linked accounts in an org
- Paying account gets 1 monthly bill
- Paying account cannot access resources of the linked accounts
- All linked accounts are independent of each other
- 20 linked accounts for consolidated billing (soft limit)
- Easy to track charges & allocate costs
- Volume pricing discount, resources of all your linked accounts are added up for discounts
Resource Groups & Tagging
Active Directory Integration:
- User browses to ADFS URL
- User authenticates against AD
- User receives a SAML assertion
- User's browser posts the SAML assertion to the AWS sign-in endpoint for SAML
- User's browser receives the sign-in URL and is redirected to the console