AWS Certified Solution Architect Professional

Exam guide https://d1.awsstatic.com/training-and-certification/docs-sa-pro/AWS-Certified-Solutions-Architect-Professional_Exam-Guide.pdf

Experience:

Test Tips to remember:

  • AMIs in cloud - an indicator that you need “Application Migration Service”
  • streaming and intermediary Lambda - Kinesis Data Streams does not offer plug-and-play integration with an intermediary Lambda function as Firehose does.
  • streaming and S3 as target - Kinesis Data Streams cannot write its output directly to S3. Firehose can.
  • streaming to target - Firehose is preferred for natively supported targets:
    • Simple Storage Service (Amazon S3),
    • Amazon Redshift,
    • Amazon OpenSearch Service,
    • Amazon OpenSearch Serverless,
    • Splunk,
    • and any custom HTTP endpoint or HTTP endpoints owned by supported third-party service providers,
      • including Datadog,
      • Dynatrace,
      • LogicMonitor,
      • MongoDB,
      • New Relic,
      • Coralogix, and
      • Elastic.
  • When connecting to S3 from on-prem, DON'T use gateway endpoints for Amazon S3. They are only reachable from inside the VPC; use an interface endpoint instead.
  • for migrating an Aurora DB to another AWS account with minimal downtime - use AWS Database Migration Service (DMS) replication

Example questions:

Cheat Sheet, important remarks :

https://www.dropbox.com/scl/fi/j3p79amqddxg0f1x11xax/AWS-Architect-Professional-Master-Cheat-Sheet.pdf?rlkey=2ruf14noczobkpz3mzkoye40b&dl=0

Domains in Exam:

  1. Design for organizational complexity
    26%
  2. Design for new solutions
    29%
  3. Continuously improve existing solutions
    25%
  4. Accelerate workload migration and modernization
    20%

Notes

  • command to import data from S3 into Redshift is COPY
  • Autoscaling group
  • A Route Table
  • Network Access Control Lists ( NACLs)
    • a firewall for subnets. All traffic entering or leaving a subnet is checked against the NACL rules to decide whether it is allowed.
    • use NACLs to block attackers by IP and port.
    • NACLs operate at subnet level.
    • in security group rules (not NACLs, which only accept CIDR ranges), avoid using the subnet CIDR as the source, as this opens communication from every other EC2 instance in the subnet. Reference security group IDs instead.
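One way to picture NACL evaluation is as an ordered rule list: rules are checked in ascending rule-number order, the first match decides, and anything unmatched is denied. The sketch below is a toy model under that assumption; `Rule` and `evaluate` are hypothetical names for illustration, not an AWS API.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    number: int   # lower numbers are evaluated first
    cidr: str     # source range, e.g. "203.0.113.0/24"
    port: int     # single destination port, kept simple for the sketch
    action: str   # "allow" or "deny"


def evaluate(rules, source_ip, port):
    """Return the action of the first matching rule, else the implicit deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if ip_address(source_ip) in ip_network(rule.cidr) and rule.port == port:
            return rule.action
    return "deny"


rules = [
    Rule(100, "203.0.113.0/24", 443, "deny"),   # block a known attacker range
    Rule(200, "0.0.0.0/0", 443, "allow"),       # everyone else may use HTTPS
]
```

Because the deny rule has the lower number, a request from the blocked range never reaches the broader allow rule.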
  • AWS Security Groups
    • can't block by port; rules can only allow.
    • have an implicit deny, but can't explicitly DENY IPs.
    • are stateful and use connection tracking: once a connection is permitted by a security group rule, its return traffic is allowed automatically.
  • AWS detective can collect data for ONE region, multiple accounts. (it scans CloudTrail, VPC Flow Logs, Amazon GuardDuty) https://alfrepo.github.io/blog/notes/article000027/
    • GuardDuty + Detective - can analyze malicious behaviour of IAM roles, cross-accounts
    • GuardDuty - It monitors for activity such as unusual API calls or potentially unauthorized deployments that indicate a possible account compromise however, it does not check if your EC2 instances are using an approved AMI or not
  • AWS Macie - detects sensitive data in S3 via machine-learning.
    • Macie cant scan CodeCommit.
  • AWS Global Accelerator - the buzzword “optimize the path from your users to your applications” points to Global Accelerator, not CloudFront
    • AWS Global Accelerator has the following types of endpoints only - Network Load Balancers, Application Load Balancers, Amazon EC2 instances, or Elastic IP addresses.
    • AWS Global Accelerator CANT point to CloudFront. But CloudFront CAN point to GlobalAccelerator.
    • global router, with static IP, which can be pointed to resources in different AWS-accounts and AWS-regions.
    • supports custom routing logic via a custom routing accelerator. Custom routing accelerators currently support only VPC subnet endpoints.

      Use-case: custom routing logic required for a game server, where you assign multiple players to a single session on a game server based on factors such as geographic location, player skill, and a few more configurable parameters.
    • Failover Mechanism: Global accelerator uses health checks to monitor the health of endpoints and automatically reroutes traffic from unhealthy endpoints to healthy ones within the same or different AWS regions.
    • Layer: Operates at the network (Layer 4) level.
      • for comparison Route53 operates on Layer 7 - DNS, but also implements failover.
    • Global Accelerator endpoints (listing included region endpoints to which to redirect) Can be added as CloudFront origin, e.g. to look for the closest location for affected users.
  • AWS (Elastic) Disaster Recovery (DRS) - can also recover to on-prem. Via agents, by capturing block-storage-changes live. Zero RecoveryPointObjective RPO - is the buzzword for DRS.
  • AWS Compute Optimizer - proposes improvements for EC2, EBS
  • Carrier gateways - does NAT in WaveLength-VPC subnets for 5G. Similar to how an internet gateway functions in a Region.
  • Step functions
    • Standard workflow - exactly-once execution
    • Express Synchronous - at-most-once execution (5 min max., then cancelled)
    • Express Asynchronous - at-least-once execution (5 min max., then cancelled)
  • AWS Transfer Family - supports SFTP
  • TimeStream - has 2 storage layers -
  • S3
    • S3 bucket keys - cache KMS keys, reduce encryption cost.
    • S3 Replication Time Control is designed to replicate 99.99% of objects within 15 minutes after upload, with the majority of those new objects replicated in seconds.
  • Kinesis Video - service for live video ingestion. Supports protocols
  • AWS IoT - FleetWise - collect and transfer vehicles IoT data. Offers vehicle specific functions
    • real-time tracking
    • predictive maintenance
    • route optimization
  • AWS IoT Analytics -
    • needs Lambda as glue, between Kinesis Data Streams, Firehose.
  • AWS IoT Sitewise -
    • Use CloudFormation to manage resources
  • AWS ELB - Load Balancers
    • Connection Draining on ELB - in auto scaling group lets ELB complete all outstanding HTTP requests before removing the instance.
  • Network Load Balancer -
    • can offload TLS
    • provides support for PrivateLink and
    • a static IP address per Availability Zone,
    • can point to your Application Load Balancer (which cant do PrivateLink or static IP).
  • AWS Managed Blockchain.
    • Supports Networks
      • Ethereum - public network
      • Hyperledger Fabric - private blockchain network
    • For inviting a new member to be part of the blockchain network, a proposal has to be created. All existing member accounts vote on this proposal and based upon approval, an invitation is sent to a new account to join the network.
    • Amazon Managed Blockchain creates an endpoint to communicate with the Hyperledger Fabric resources. To access these endpoints, the VPC Privatelink endpoint needs to be created in the account.
    • For multiple members in the AWS account, a single VPC PrivateLink endpoint has to be created.
  • Backup
    • Disaster Recovery Service DRS - supports low RTO and RPO with agents. Replicates storage. Logs the progress to CloudWatch.
    • CloudEndure - managed service. AWS company. Replicates on-prem workloads into the cloud.
  • Wavelength Zones - a VPC extension for low latency 5G communication.
    • the EC2 instances in 2 Wavelength Zones A, B can't communicate with each other. To get communication between 2 Wavelength Zones you need :
  • CloudHSM - Cloud Hardware Security Module managed in the cloud. Reason is compliance with “FIPS 140-2 Level 3”
  • EBS classes
    • EC2 instance store
    • Provisioned IOPS SSD (io1/io2) - for NoSQL DBs, relational DBs
      • Multi-Attach feature - where a volume is attached to multiple EC2 instances - is only available for Provisioned IOPS SSD volumes
    • Throughput Optimized HDD (st1) - for data warehousing, log analysis
    • Cold HDD (sc1) - for data scanned only a few times a day
  • aws billing and management console
    • contains “cost allocation tags” like “aws:createdBy”, which can be used to allocate cost to teams, departments etc. Just activate them in Cost Allocation Tags > AWS generated cost allocation tags. Then they will also show up in cost reports.
  • AWS Budgets - can alert when a budget is spent or forecasted to be spent.
    • Alerts can trigger Actions
    • Actions can be used, to attach SCP, IAM policy, stop RDS or EC2 in region.
  • SQS
    • FIFO queue preserves order
    • Standard queue does not preserve order.
    • deduplication ID ensures that the same message is not delivered multiple times. When a producer sends a message to a FIFO queue, SQS compares its deduplication ID with those of messages already accepted during the deduplication interval and drops duplicates.
    • an SQS queue can't be converted into another type.
    • Dead letter queue - is a second queue, of the SAME TYPE
    • the SQS WORKER can adjust the “visibility timeout” of a message, e.g. by looking at the message headers, using
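The deduplication behaviour of a FIFO queue can be sketched with a small in-memory model. This is a toy stand-in, not the SQS API: `ToyFifoQueue`, its `send` method and the explicit `now` timestamps are hypothetical, and the 5-minute window is the FIFO deduplication interval.

```python
DEDUP_WINDOW_SECONDS = 300  # FIFO queues deduplicate within a 5-minute interval


class ToyFifoQueue:
    """In-memory stand-in for a FIFO queue; not the SQS API."""

    def __init__(self):
        self.messages = []   # accepted bodies, in arrival order
        self._seen = {}      # deduplication ID -> time it was first accepted

    def send(self, body, dedup_id, now):
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False     # duplicate inside the window: not enqueued again
        self._seen[dedup_id] = now
        self.messages.append(body)
        return True


queue = ToyFifoQueue()
results = [
    queue.send("order-1", "id-1", now=0),
    queue.send("order-1", "id-1", now=10),    # retry of the same message
    queue.send("order-2", "id-2", now=20),
    queue.send("order-1", "id-1", now=400),   # window has expired
]
```

The retry at `now=10` is dropped, while the same ID is accepted again once the window has passed, and the order of accepted messages is preserved.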
  • Network Address Translation ( NAT gateway) - ElasticIP CAN be assigned to a NAT.
  • Elastic Load Balancer - does not route outbound traffic, only inbound.
  • Internet Gateway in VPC (IGW) - when traffic from public EC2 instances with public IPs is routed through the IGW, the instances keep their public IPs.
  • egress-only internet gateway
    • prevents inbound connections from the web
    • works only with IPv6
    • with IPv4 you would use NAT-Gateway
  • X-Ray - tracing system, to find root causes of bugs
    • trace - Tracing works by injecting a unique identifier into a request, and including that identifier whenever a segment related to that request is recorded. AWS X-Ray groups segments with the same identifier, forming a trace. If your system is fully instrumented, a trace can describe the whole lifecycle of a single request.
    • segments / subsegments - A single component may divide a segment into subsegments, in a similar way to a method calling other methods before returning.
    • groups - query, which may express source and target of a flow and the result https://youtu.be/5MQkX57eTh8?si=Gam7jX1UrIVZDnMk&t=2426
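The grouping step described above (segments that share an identifier form one trace) can be sketched in a few lines. The record layout with `trace_id` and `name` keys is an assumption made for illustration and is not the X-Ray segment document format.

```python
from collections import defaultdict


def group_into_traces(segments):
    """Group segment records by trace ID, preserving arrival order."""
    traces = defaultdict(list)
    for segment in segments:
        traces[segment["trace_id"]].append(segment["name"])
    return dict(traces)


segments = [
    {"trace_id": "t-1", "name": "api-gateway"},
    {"trace_id": "t-2", "name": "api-gateway"},
    {"trace_id": "t-1", "name": "lambda"},
    {"trace_id": "t-1", "name": "dynamodb"},
]
traces = group_into_traces(segments)
```

If every component records its segment with the propagated identifier, `traces["t-1"]` lists the full path of that single request.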
  • AWS VPN
    • when using dynamic routing, you need a BGP ASN: a Border Gateway Protocol (BGP) Autonomous System Number (ASN).
  • A customer gateway - is a resource that you create in AWS that represents the customer gateway device in your on-premises network.
    • is probably NOT highly available, as it sits on the customer side
  • A Virtual Private Gateway alias virtual gateway
  • Transit Gateway - also Transit Gateway can be the receiving gateway on AWS
  • nesting application-CloudFormation-stack, as substack of network-CF-stack should require network-level permissions in IAM policy
    • restricting application to concrete VPC - would require resource-level permission
      • single connection
      • with a virtual private gateway
      • with a transit gateway
      • Site-to-Site VPN connection with AWS Direct Connect
      • Private IP Site-to-Site VPN connection with AWS Direct Connect
    • only ONE Virtual Gateway (VGW) can be attached per VPC
    • Each connection requires a Virtual Interface (VIF)
  • AWS Placement Groups - use to group EC2 machines, to achieve minimal latency between high-performance apps that rely on low latency.
    • adding the instance to a group
      • stop the instance, add to placementgroup, start again.
        • At start it is decided, if enough resources are available to fill the placement group.
    • placement strategy
      • Cluster – Packs instances close together inside an Availability Zone. This strategy enables workloads to achieve the low-latency network performance necessary for tightly-coupled node-to-node communication that is typical of high-performance computing (HPC) applications.
      • Partition – Spreads your instances across logical partitions such that groups of instances in one partition do not share the underlying hardware with groups of instances in different partitions. This strategy is typically used by large distributed and replicated workloads, such as Hadoop, Cassandra, and Kafka.
      • Spread – Strictly places a small group of instances across distinct underlying hardware to reduce correlated failures.
    • When you create a VPC endpoint service, AWS generates endpoint-specific DNS hostnames that you can use to communicate with the service. These names include the VPC endpoint ID, the Availability Zone name and Region Name, for example, vpce-1234-abcdev-us-east-1.vpce-svc-123345.us-east-1.vpce.amazonaws.com By default, your consumers access the service with that DNS name
    • Interface endpoints - direct private connection, from a private subnet, to a multitude of services https://repost.aws/questions/QUIEQI_gnBSziGWFznpbBv0g/should-i-use-a-an-interface-vpc-endpoint-or-a-gateway-vpc-endpoint
    • Gateway endpoints - Free in-VPC private traffic - to a AWS service. To make Dynamo, S3 reachable from private subnets. Only supported are
      • dynamoDB
      • S3
    • The main difference between gateway and interface VPC endpoints is how traffic is routed to the AWS service

      With a gateway endpoint, traffic is routed from your VPC to the service using AWS's private network. Route tables in your VPC are configured with a prefix list that routes matching traffic to the gateway endpoint resource.

      Interface endpoints extend this by creating a network interface within your VPC that represents the service endpoint. Traffic is routed directly to this interface's private IP address without going over AWS's network.

      Some key differences:
      • Gateway endpoints rely on VPC prefix-list routing,
        interface endpoints use direct routing to a private IP.
      • Interface endpoints support PrivateLink, allowing integration of both AWS and external services.
        Gateway endpoints only support certain AWS services like S3 and DynamoDB.
      • Interface endpoints - are available as a private-IP in VPC. Communicating instances can use PRIVATE IPs, do not need PUBLIC IPs
      • Interface endpoints can work across VPCs and regions using PrivateLink, gateway endpoints are limited to a single VPC.
      • Interface endpoints - do scale independently from other VPC resources since they use separate network interfaces.
        Gateway endpoints share scaling limits with the VPC.
      • VPC-Service endpoints
        • unidirectional connection
        • on OSI layer 4, via NLB
  • VPC service endpoint (PrivateLink) vs VPC peering
  • VPC Lattice - virtual network, which enforces AUTHORIZATION on network-level. It expresses which lattice-service can contact which other lattice service. https://youtu.be/zQk9AIPVdXs?si=TdsLDViGeTfa1gz4&t=2007
  • AWS PrivateLink exposed - consumption of private services - requires an Interface (not Gateway) VPC-endpoint.
  • AWS Systems manager - is an Operations console
    • Automations. like “Runbooks” with visual rules-editor. Step-function-like UI for OPS people.
    • Document. Same as “command/script”.
    • State Manager stores Association. Takes a Document. Sets CRON for the document (command)
    • Distributor - lets you package your own agent software
    • Patch Manager - updates patch level for machine. For free.
    • Can generate System Manager Patch compliance report
    • reference Systems Manager properties - by name of property from RDS environment variable
  • AWS Systems Manager > State Manager
    • configuration management, like “Ansible”
    • ensure that the instances are bootstrapped with specific software at startup
    • The following list describes the types of tasks you can perform with State Manager:
      • Bootstrap instances with specific software at start-up
      • Download and update agents on a defined schedule, including SSM Agent
      • Configure network settings
      • Join instances to a Windows domain (Windows instances only)
      • Patch instances with software updates throughout their lifecycle
      • Run scripts on Linux and Windows managed instances throughout their lifecycle
    • AWS Systems Manager > SESSION Manager
      • shell access to instances, like an SSH console, but without opening inbound port 22 or managing SSH keys
  • VPC Sharing
  • in a VPC subnet - the first 4 addresses and the last address are reserved (network address, VPC router, DNS, future use, and broadcast)
  • CIDR
    • 20.0.0.0/32 - indicates 1 IP “20.0.0.0”
    • 20.0.0.0/24 - indicates block - from 20.0.0.0 to 20.0.0.255. The last octet (*.*.*.00000000, 8 bits) gives 256 IPs
    • 20.0.0.0/25 - indicates block - from 20.0.0.0 to 20.0.0.127 (256 is the whole block, half of 256 is 128 addresses)
    • 20.0.0.128/25 - indicates block - from 20.0.0.128 to 20.0.0.255
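The block boundaries can be checked with Python's standard `ipaddress` module. This is a small sketch; the `describe` helper is a name introduced here for illustration.

```python
from ipaddress import ip_network


def describe(cidr):
    """Return (first address, last address, total addresses) for a CIDR block."""
    net = ip_network(cidr)
    return str(net[0]), str(net[-1]), net.num_addresses


blocks = {
    cidr: describe(cidr)
    for cidr in ("20.0.0.0/32", "20.0.0.0/24", "20.0.0.0/25", "20.0.0.128/25")
}
```

Each extra prefix bit halves the block: the /24 holds 256 addresses, and the two /25 blocks split it into a lower and an upper half of 128 addresses each.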
  • AWS RDS
    • RDS reserved instances - require same RDS class
    • RDS read replica creation. Happens via separate “Actions” submenu on active master.
    • RDS read replica - can not happen via SSL endpoint. Instead it works via IPSEC VPN tunnel. Using MySQL Native Protocol like BinLog
    • RDS changing the type of DB - causes the instance to reboot. Apply during the maintenance window
    • RDS deployment in multi-AZ, when failover from primary to secondary happens, then CNAME is pointed to secondary https://aws.amazon.com/blogs/aws/amazon-rds-multi-az-deployment/
    • To access your Amazon S3 on Outposts bucket, you must create and configure an access point
    • Access points are named network endpoints that are attached to buckets that you can use to perform Amazon S3 object operations, such as GetObject and PutObject.
    • AWS App Runner - goes from source code or a container image directly to a scalable and secure web application in the AWS Cloud
    • automatic deployments each time a commit is pushed to the code repository or a new container image version is pushed to the image repository.
  • AWS CloudWatch > Synthetic Canary - its “outside-in” monitoring. Scriptable, similar to “Selenium”. user-perspective monitoring and alerting, via scheduled lambda, logging to CloudWatch, alarms
    • Can record screenshots
    • Can send alarms
    • Route53 - also has health checks, but it is not that close to alarms etc.
  • AWS CloudWatch Evidently - A/B support and associated statistics
  • AWS Aurora
    • cross AZ - fails over and keeps data consistent, repairs blocks automatically
    • 2 deployment types
      • serverless - a cluster whose capacity is scaled automatically according to the specified minimum and maximum capacity values
      • provisioned DB cluster - capacity is managed manually by creating DB instances: a single primary DB instance (writer) and multiple Aurora Read-Replicas
    • Auto-scaling - works only for read replicas, NOT for the master/writer instance.
    • Multi-master - 2 Master instances. With same storage attached.

  • RDS Global Databases - means there are async secondary clusters cross regions
  • Cross Region replication - means there are read replica installed in other regions
    • Cross-region read replication happens asynchronously, using MySQL protocol
    • Aurora MySQL can do cross-region Aurora Replicas
    • Aurora Postgres - can NOT cross-replicate between regions
  • Failover (when region fails) or Switchover (when planned) happens in seconds but surprisingly manually
  • Each Aurora DB cluster has one cluster endpoint and one primary DB instance. A cluster endpoint (or writer endpoint) for an Aurora DB cluster connects to the current primary DB instance for that DB cluster. This cluster-endpoint is the only one that can perform write operations such as DDL statements. Because of this, the cluster endpoint is the one that you connect to when you first set up a cluster or when your cluster only contains a single DB instance.
    • After detaching the read-replica in another region - the endpoint of the new Write-cluster in a new region after the failover - changes. Application must know the new endpoint, or cross regional DNS must be in place.
  • EC2
  • Ephemeral EC2 instance store volumes - don't support snapshots. Only EBS does.
  • EC2 - AMIs from ephemeral instance store volumes are created via
    • ec2-bundle-vol
    • ec2-upload-bundle, to
    • ec2 register-image.
  • Ec2 - AMIs from EBS-volumes are created via
  • AWS Route53
  • health checkers
    • Route 53 aggregates the data from the health checkers, and if more than 18% of health checkers report that an endpoint is healthy, Route 53 considers it healthy.
    • does NOT validate TLS certificates
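The aggregation rule quoted above can be written as a one-line threshold check. This is only a sketch of that rule as stated in these notes; `endpoint_is_healthy` and the list-of-booleans input are hypothetical and not part of any Route 53 API.

```python
HEALTHY_THRESHOLD = 0.18  # fraction of checkers quoted in the note above


def endpoint_is_healthy(checker_reports):
    """checker_reports: list of booleans, one per health checker."""
    if not checker_reports:
        return False  # assumption for this sketch: no reports means unhealthy
    healthy_fraction = sum(checker_reports) / len(checker_reports)
    return healthy_fraction > HEALTHY_THRESHOLD
```

With 16 checkers, three healthy reports (18.75%) are enough to keep the endpoint healthy, while two (12.5%) are not.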
  • when pointing to ELB, for high availability use an ALIAS record.
    • Automated Updates: Alias records are automatically updated by Route 53 when the underlying AWS resource
  • Route 53 health checkers are outside the VPC. To check the health of an endpoint within a VPC by IP address, you must assign a public IP address to an instance in the VPC.
  • You can create a CloudWatch metric, associate an alarm with the metric, and then create a health check that is based on the data stream for the alarm. (for private subnets)
  • supports Domain Name System Security Extensions (DNSSEC) - signing of DNS records via a chain of trust, to avoid DNS poisoning or DNS spoofing
  • AWS Route53 failover - via health checks
  • use for complete system failures
  • don't use for TEMPORARY errors. The health checks may not register a failure, e.g. if the 502 errors are sporadic and the system is generally operational, so the failover might not be triggered. Use CloudFront instead for sporadic errors.
    • for sporadic failure use CloudFront custom error response
  • AWS Route53 resolver - DNS resolver
  • INbound / Outbound
  • VPC and AZ specific.
  • AWS Route53 - routing
  • Simple routing policy – Use for a single resource that performs a given function for your domain, for example, a web server that serves content for the example.com website.
  • Failover routing policy – Use when you want to configure active-passive failover.
  • Geolocation routing policy – Use when you want to route traffic based on the location of your users.
  • Geoproximity routing policy – Use when you want to route traffic based on the location of your resources and, optionally, shift traffic from resources in one location to resources in another. supports “bias” for a region, affecting proximity calculation.
  • Latency routing policy – Use when you have resources in multiple AWS Regions and you want to route traffic to the region that provides the best latency.
  • Multivalue answer routing policy – Use when you want Route 53 to respond to DNS queries with up to eight healthy records selected at random.
  • Weighted routing policy – Use to route traffic to multiple resources in proportions that you specify.
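For the weighted policy, the share of responses a record receives is its weight divided by the sum of all weights. A minimal sketch of that arithmetic follows; `traffic_shares` and the record names are hypothetical.

```python
def traffic_shares(weights):
    """Map record name -> expected fraction of DNS responses."""
    total = sum(weights.values())
    if total == 0:
        # with all weights at zero, traffic is split evenly across records
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


shares = traffic_shares({"blue": 90, "green": 10})
```

A 90/10 split like this is a common way to send a small slice of traffic to a new deployment before shifting the rest.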
  • AWS Database Migration Service (DMS)
  • With AWS DMS, you can discover your source data stores, convert your source schemas, and migrate your data.
  • At a basic level, AWS DMS is a server in the AWS Cloud that runs replication software. You create a source and target connection to tell AWS DMS where to extract data from and where to load it.
  • AWS “Schema Conversion Tool” SCT - tool to convert schemas from source to target DB on AWS.
  • AWS Route53 - PRIVATE hosted zone
  • makes locally resolvable domains.
  • must be explicitly associated with a VPC
  • associate private hosted zone with VPC in another account . Can only be done programmatically, not via console
  • AWS Data exchange service
  • data market, for data sets
  • Batch processing - high volume repetitive data jobs
  • AWS Batch
    • any type of async task, which works as docker container
  • Extract Transform Load (ETL) service - when complex data transformation is required
  • AWS Glue
  • AWS Data Pipeline
    • used in combination with Amazon EMR for analytics.
    • Managed ETL. Automates data movement between compute (EC2) and storage services.
  • Big Data
  • AWS EMR - Amazon Elastic MapReduce
    • Big data tool. Hadoop, Spark, other open source tools under the hood.
    • Use EMR when you need to
      • reduce cost
      • need more control over the underlying data processing infrastructure *
  • AWS Kinesis
  • checkpointing - remembering the last state of stream, to continue from there.
    • writes state in DynamoDB
    • each application into an own Dynamo table, in order not to share state
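Checkpointing can be sketched as "store the last processed sequence number per application and shard, and skip anything at or below it after a restart". The class below is an in-memory stand-in for the per-application DynamoDB table; all names are hypothetical and this is not a Kinesis client API.

```python
class ToyCheckpointStore:
    """Stand-in for the per-application checkpoint table."""

    def __init__(self):
        self._tables = {}  # application name -> {shard ID -> sequence number}

    def checkpoint(self, application, shard_id, sequence_number):
        self._tables.setdefault(application, {})[shard_id] = sequence_number

    def resume_after(self, application, shard_id):
        """Last checkpointed sequence number, or None to start from the beginning."""
        return self._tables.get(application, {}).get(shard_id)


def process(records, store, application, shard_id):
    """Process only records newer than the checkpoint, then advance it."""
    last = store.resume_after(application, shard_id)
    handled = []
    for sequence_number, payload in records:
        if last is not None and sequence_number <= last:
            continue  # already processed before a restart
        handled.append(payload)
        store.checkpoint(application, shard_id, sequence_number)
    return handled
```

Because state is keyed by application, two consumers reading the same shard keep independent positions, which is the reason for one table per application.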
  • AWS Kinesis Data Firehose
  • near-realtime
  • integrates with:
    • Simple Storage Service (Amazon S3),
    • Amazon Redshift,
    • Amazon OpenSearch Service,
    • Amazon OpenSearch Serverless,
    • Splunk,
    • and any custom HTTP endpoint or HTTP endpoints owned by supported third-party service providers,
    • including Datadog,
    • Dynatrace,
    • LogicMonitor,
    • MongoDB,
    • New Relic,
    • Coralogix, and
    • Elastic.
  • does not integrate, e.g., with IoT Analytics
  • Kinesis Data Stream
  • real time
  • default retention period covers 24 hours
  • for additional fee - up to a year
  • Limitations
    • Kinesis Data Streams cannot directly write the output to S3.
    • In addition, KDS does not offer plug-and-play integration with an intermediary Lambda function as Firehose does.
  • AWS Redshift
  • supports audit logs, encrypted with SSE-S3 (aws managed key). No custom keys SSE-KMS etc.
  • Amazon Redshift Spectrum - is a distributed data warehouse service that uses serverless architecture and can be used to query data stored in Amazon S3 without having to load it into Redshift tables.
    • comparable to “AWS Athena”
  • AWS Audit Manager - a service for running audits and collecting evidence https://youtu.be/iq4AAUMVCWg?si=RQpLq0HhCXWmzWW2&t=965
  • Audit manager - is a regional service
  • AWS Network Firewall - does traffic inspection.
    • AWS Network Firewall can be used to inspect and control traffic between VPCs or between subnets in the same VPC. With VPC routing enhancements, AWS Network Firewall can be inserted between 2 subnets in the same VPC. - There is no need to place subnets in two separate VPCs and use the AWS Transit gateway to forward traffic to AWS Network Firewall for inspection.
    • Indeed, cross-many VPCs and for centralized firewall rule management, you can use deployment model with transit-VPC
    • and routing traffic to a Network Load Balancer in firewall-VPC, with network firewall rules
    • and routing traffic through a central egress-VPC, after it was inspected
  • AWS inspector - detects vulnerabilities in infrastructure.
  • Amazon Connect - is an omnichannel cloud contact center service that enables you to deliver personalized customer service experiences across voice, chat, and other channels. It is a fully managed service that provides a complete contact center solution, from agent management to customer engagement.
  • Service-linked roles - enable other AWS services to integrate with AWS Organizations and can't be restricted by SCPs
  • IAM Resource-based policy - attached to a resource, e.g. in bucket-policy of “s3bucketA”
  • express which identity is allowed to access the resource (“s3bucketA”)
  • use, whenever “requirement to copy information” is mentioned, as 1 principal can access both: shared and own resources
  • Direct Connect - using a dedicated AWS network. Not Internet.
  • can connect on-prem network AND cloud
  • provides three types of virtual interfaces
    • Private virtual interface: A private virtual interface should be used to access an Amazon VPC using private IP addresses.
    • Public virtual interface: A public virtual interface can access all AWS public services using public IP addresses.
    • Transit virtual interface: A transit virtual interface should be used to access one or more Amazon VPC Transit Gateways associated with Direct Connect gateways. You can use transit virtual interfaces with 1/2/5/10 Gbps AWS Direct Connect connections.
  • can use a VPN connection as a fallback for DirectConnect
  • can connect VPCs
    • also in different regions
    • there are Direct Connect Gateways, which connect multiple VPCs in an account.
    • each VPC gets a Virtual Private Gateway as a connector (small “lock” icons)
    • each cloud gets a virtual interface (VIF), which in AWS Direct Connect is a logical interface that represents a physical port on a Direct Connect gateway. VIFs are used to establish connections between your on-premises network and AWS. There are two types of VIFs:
      • Private VIF: A private VIF is used to connect to a private virtual private cloud (VPC) in the AWS Cloud. This type of VIF can be used to connect to AWS resources that are private, such as Amazon Relational Database Service (RDS) instances and Amazon Elastic Container Service (ECS) clusters.
      • Public VIF: A public VIF is used to reach AWS public service endpoints over public IP addresses (not the general internet), such as Amazon S3.
      • if you want to ENCRYPT the traffic flowing through Direct Connect, you will need to use the PUBLIC virtual interface of DirectConnect to create a VPN connection that will allow access to AWS services such as S3, EC2, and other services.
  • link aggregation group (LAG) is just a logical interface that uses the Link Aggregation Control Protocol (LACP) to aggregate multiple connections at a single AWS Direct Connect endpoint, allowing you to treat them as a single, managed connection.
  • AWS OpsWorks - configuration management with Chef and Puppet.
  • Deprecated
  • AWS LightSail
  • 1 click Deploys some popular apps like “WordPress” and allows snapshots, restore etc.
  • AWS CloudFront - can use Lambda@Edge for redirects by country. Just by looking at cloudfront-headers. https://aws.plainenglish.io/lambda-edge-to-redirect-users-based-on-their-location-f72f89d7ff38
  • AWS CloudFront - can do “field-level encryption” with asymmetric key encryption. Last app decrypts the message https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/field-level-encryption.html
  • AWS CloudFront - signed cookies / signed URLs.
  • “serving the dynamic files” with CloudFront is a bad smell and mostly wrong
  • signed URL - is like S3 signed URL
  • signed cookie - stores the authorization in a cookie, it expires and is applicable to multiple URLs
  • geographic restrictions - are explicitly available on CloudFront, not Route53
  • to restrict access to ALB (dynamic content) to happen via CloudFront only or for CUSTOM origin (non S3):
    • Use CloudFront to add a custom header to all origin requests.
    • Using AWS Web Application Firewall (WAF), create a web rule that denies all requests without this custom header. See https://youtu.be/ZHoFUrxNbrg?si=Fu6QLAFx4aEJ378O&t=285
    • Associate the web ACL to the Application Load Balancer.
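The logic behind the three steps above is a shared-secret check: CloudFront adds a header, and anything arriving at the origin without it is rejected. The sketch below shows only that decision; the header name `x-origin-verify` and the `allow_request` helper are hypothetical, and in the real setup the WAF rule performs this match.

```python
import hmac

HEADER_NAME = "x-origin-verify"  # hypothetical header name chosen for this sketch


def allow_request(headers, expected_secret):
    """Allow only requests carrying the shared secret that CloudFront adds."""
    supplied = headers.get(HEADER_NAME, "")
    # compare_digest avoids leaking the secret through timing differences
    return hmac.compare_digest(supplied.encode(), expected_secret.encode())
```

A client that calls the load balancer directly does not know the secret value, so its requests fail the check even if it guesses the header name.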
  • AWS CloudSearch
  • OpenSearch too
  • AWS Application Migration Service (MGN) is a highly automated lift-and-shift (rehost) solution, with an AWS replication agent. It enables companies to lift and shift a large number of physical, virtual, or cloud servers without compatibility issues, performance disruption, or long cutover windows. MGN replicates source servers into your AWS account. When you’re ready, it automatically converts and launches your servers on AWS so you can quickly benefit from the cost savings, productivity, resilience, and agility of the Cloud. Once your applications are running on AWS, you can leverage AWS services and capabilities to quickly and easily re-platform or refactor those applications – which makes lift-and-shift a fast route to modernization.
  • AWS Shield

AWS Shield protects the OSI model’s infrastructure layers (Layer 3 Network, Layer 4 Transport)

AWS Shield is a managed Distributed Denial of Service (DDoS) protection service, whereas AWS WAF is an application-layer firewall that controls access via Web ACLs.

See https://cmakkaya.medium.com/how-to-secure-our-resources-from-doos-attacks-with-aws-waf-shield-5307c85cb476

Shield “Standard” - AWS reacts to DDoS attacks

Shield “Advanced” - AWS reacts to DDoS attacks and provides a 24×7 team and reports.

  • WAF Web Application Firewall
    • AWS WAF is a web application firewall that lets you monitor the HTTP and HTTPS requests that are forwarded to an
      • Amazon CloudFront distribution,
      • Amazon API Gateway REST API,
      • an Application Load Balancer,
      • or an AWS AppSync GraphQL API.
    • You can use AWS WAF to protect the following regional resource types:
      • Amazon API Gateway REST API
      • Application Load Balancer
      • AWS AppSync GraphQL API
      • Amazon Cognito user pool
      • AWS App Runner service
      • AWS Verified Access instance
    • WAF can block by
      • IP address of origin of the request
      • Country of origin of the request
      • String match or regular expression (regex) match in a part of the request
      • Size of a particular part of the request
      • Detection of malicious SQL code or scripting

In contrast to application firewalls like WAF, network firewalls operate at Layer 3 (Network) and Layer 4 (Transport) and only understand the

  • source IP Address,
  • port, and
  • protocol.

AWS Security Groups are a great example of this.

WAF is an access control list (ACL) based application firewall and works on OSI layer 7 (Application).

This means it understands higher-level protocols such as an

  • HTTP(S) request, including its
    • headers,
    • body,
    • method, and
    • URL
  • WAF interacts with
    • CloudFront distributions,
    • application load balancers,
    • AppSync GraphQL APIs, and
    • API Gateway REST APIs.

A WAF can be configured to detect traffic from the following:

  • specific IPs;
  • IP ranges or country of origin;
  • content patterns in request bodies, headers and cookies;
  • SQL injection attacks;
  • cross-site scripting; and
  • IPs exceeding rate-based rules

When incoming traffic matches any of the configured rules, WAF can reject requests, return custom responses or simply create metrics to monitor applicable requests.

  • EFS
    • use EFS instead of S3 to reuse static data when auto-scaling dynamically
      • The default General Purpose performance mode is ideal for latency-sensitive use cases, like web serving environments, content management systems, home directories, and general file serving.
      • File systems in the Max I/O mode can scale to higher levels of aggregate throughput and operations per second with a tradeoff of slightly higher latencies for file metadata operations.
      • Using the default Bursting Throughput mode, throughput scales as your file system grows.
      • Using Provisioned Throughput mode, you can increase the Provisioned Throughput of your file system as often as you want. You can decrease your file system throughput in Provisioned Throughput mode as long as it’s been more than 24 hours since the last decrease.
      • can have provisioned throughput. Limit 50,000 - 100,000 IOPS depending on region
  • Custom identity broker on-premises - is allowed to call STS, to authorize the calls to the AWS API
  • Lambda@Edge - allows 10,000 requests per second
  • Lambda - provides burst concurrency of around 500-3000 requests per second (depends on region)
  • ALB application load balancer
    • ALB can indeed use certificates stored in the IAM service
    • SSL TLS - Server Name Indication SNI - SNI Custom SSL relies on the SNI extension of the Transport Layer Security protocol, which allows multiple domains to serve SSL traffic over the same IP address by including the hostname to which the viewers are trying to connect. You can host multiple TLS-secured applications, each with its own TLS certificate, behind a single load balancer. In order to use SNI, all you need to do is bind multiple certificates to the same secure listener on your load balancer. ALB will automatically choose the optimal TLS certificate for each client. These features are provided at no additional charge.
  • You can use your own SSL certificates with Amazon CloudFront at no additional charge with Server Name Indication (SNI) Custom SSL. Most modern browsers support SNI and provide an efficient way to deliver content over HTTPS using your own domain and SSL certificate. Amazon CloudFront delivers your content from each edge location and offers the same security as the Dedicated IP Custom SSL feature.
    • For viewers that do not support SNI, the alternative is Dedicated IP Custom SSL: create a new CloudFront web distribution and configure it to serve HTTPS requests using dedicated IP addresses, which associates your alternate domain names with a dedicated IP address in each CloudFront edge location
  • Elastic Beanstalk - can NOT deploy to on-prem.
  • CodeDeploy - CAN deploy on-prem
  • ECS and Docker
    • If the network mode is set to none, the task’s containers do not have external connectivity, and port mappings can’t be specified in the container definition.
    • If the network mode is bridge, the task utilizes Docker’s built-in virtual network which runs inside each container instance.
    • If the network mode is host, the task bypasses Docker’s built-in virtual network and maps container ports directly to the EC2 instance’s network interface. In this mode, you can’t run multiple instantiations of the same task on a single container instance when port mappings are used.
    • If the network mode is awsvpc, the task is allocated an elastic network interface, and you must specify a NetworkConfiguration when you create a service or run a task with the task definition. When you use this network mode in your task definitions, every task that is launched from that task definition gets its own elastic network interface (ENI) and a primary private IP address. The task networking feature simplifies container networking and gives you more control over how containerized applications communicate with each other and other services within your VPCs.
      • Task network mode “awsvpc” also provides greater security for your containers by allowing you to use security groups and network monitoring tools at a more granular level within your tasks. Because each task gets its own ENI, you can also take advantage of other Amazon EC2 networking features like VPC Flow Logs so that you can monitor traffic to and from your tasks.
      • Additionally, containers that belong to the same task can communicate over the localhost interface. A task can only have one ENI associated with it at a given time.
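The constraints of the four network modes can be captured in a small pre-flight check. This is a sketch under stated assumptions: the dict keys follow the shape of an ECS task definition (`networkMode`, `containerDefinitions`, `portMappings`), the function `check_task_networking` is a hypothetical helper, and `bridge` is assumed as the fallback when no mode is given. It only encodes the two hard rules from the notes above.

```python
from typing import List, Optional

VALID_MODES = {"none", "bridge", "host", "awsvpc"}


def check_task_networking(task_def: dict,
                          network_configuration: Optional[dict] = None) -> List[str]:
    """Return a list of problems found in the task definition's networking setup."""
    problems = []
    mode = task_def.get("networkMode", "bridge")  # assumption: bridge as fallback
    if mode not in VALID_MODES:
        problems.append(f"unknown network mode: {mode}")

    containers = task_def.get("containerDefinitions", [])
    has_port_mappings = any(c.get("portMappings") for c in containers)

    # 'none' gives containers no external connectivity, so port mappings are invalid.
    if mode == "none" and has_port_mappings:
        problems.append("network mode 'none' does not allow port mappings")

    # 'awsvpc' attaches an ENI per task, which needs subnets/security groups.
    if mode == "awsvpc" and not network_configuration:
        problems.append("network mode 'awsvpc' requires a network configuration")

    return problems


task = {
    "networkMode": "awsvpc",
    "containerDefinitions": [{"name": "web", "portMappings": [{"containerPort": 80}]}],
}
print(check_task_networking(task))
print(check_task_networking(task, {"awsvpcConfiguration": {"subnets": ["subnet-hypothetical"]}}))
```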
  • API Gateway
  • Amazon Lex - is a service for building conversational interfaces into any application using voice and text. Amazon Lex provides the advanced deep learning functionalities of automatic speech recognition (ASR) for converting speech to text and natural language understanding (NLU) to recognize the intent of the text, enabling you to build applications with highly engaging user experiences and lifelike conversational interactions.
  • Amazon Connect - a way to set up call centers
  • With Amazon Lex, the same deep learning technologies that power Amazon Alexa are now available to any developer, enabling you to quickly and easily build sophisticated, natural language conversational bots (“chatbots”).
  • AWS Certificate Manager (ACM)
    • can request wildcard certificates / or certificates for multiple domains
      • CloudFront can use multi-domain certificates, using the TLS SNI extension
  • VPC peering
    • bidirectional connection
    • A static route is a predefined route that is manually configured in the routing table of a VPC. This route specifies the target network, subnet, and next hop for traffic destined for a particular IP address range. Static routes are useful for scenarios where the network topology or destination IP addresses are known and unlikely to change.
    • AWS currently does not support unicast reverse path forwarding in VPC peering connections that checks the source IP of packets and routes reply packets back to the source. You still need to configure static routes back from the peered VPC
    • A dynamic route is usually used for Direct Connect connections or Site-to-Site VPNs. This route is discovered through the Border Gateway Protocol (BGP), which is a protocol that routers use to exchange routing information. Dynamic routes are useful for scenarios where the network topology or destination IP addresses are not known or may change frequently.
  • AWS AppSync - GraphQL server on AWS.
  • AWS API Gateway -
    • WebSockets functionality - allows the backend to push messages to connected clients via the @connections API
  • AWS Storage Gateway
    • Challenge-Handshake Authentication Protocol (CHAP) - protects from unauthenticated clients, e.g. from IP spoofing
  • Deployment strategies (v1 = current version, v2 = new version)
    • All at once: Replace all v1 with v2 at the same time. Failure not handled.
    • Canary: A v2 is deployed and observed. If successful, all remaining v2 instances are deployed immediately.
    • Rolling: Replace v1 instances with v2 instances one at a time. Watch for failures.
    • Rolling with batch: Create some new v2 instances. If successful, roll out on v1 instances. When all are v2 instances, scale back to original size.
    • Immutable: Don't change v1 instances. Create same number of v2 instances. Wait for success, then stop v1 instances.
    • Blue/green: Instead of operating on an environment in-place, create a new environment (network, etc.) fully provisioned with v2, and switch over when ready.
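To make the difference between "all at once" and "rolling" concrete, here is a toy simulation. It is only an illustration of the two strategies as described above, not how any AWS service implements them; the fleet is a plain list of version labels and `healthy` is a hypothetical health-check callback.

```python
from typing import Callable, List


def all_at_once(fleet: List[str]) -> List[str]:
    """Replace every instance with v2 in one step; failures are not handled."""
    return ["v2"] * len(fleet)


def rolling(fleet: List[str], batch_size: int,
            healthy: Callable[[int], bool]) -> List[str]:
    """Replace v1 with v2 batch by batch and stop at the first unhealthy batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    result = list(fleet)
    for start in range(0, len(result), batch_size):
        end = min(start + batch_size, len(result))
        for index in range(start, end):
            result[index] = "v2"
        # Watch for failures before touching the next batch.
        if not all(healthy(index) for index in range(start, end)):
            break  # remaining v1 instances stay untouched
    return result


fleet = ["v1", "v1", "v1", "v1"]
print(all_at_once(fleet))
print(rolling(fleet, batch_size=1, healthy=lambda index: index != 1))
```

In the rolling run the second instance fails its check, so the rollout stops and the last two instances keep serving v1.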
  • AWS FSx
    • Windows File Server (SMB, not NFS)
    • OpenZFS
    • Lustre
    • ONTAP
  • IAM managed policies (job functions)
    • AdministratorAccess
      • all services
    • PowerUserAccess
      • PowerUserAccess = AdministratorAccess - IAM
      • read Organizations permissions
        • to view information about the user’s organization, including the master account email and organization limitations.
      • deny IAM
    • SystemAdministrator
      • deny organization
      • deny IAM
  • IAM Identity Center - service to federate third-party services and providers as well as custom applications
    • Requirement: Users in your self-managed Active Directory (AD) can also have SSO access to AWS accounts and cloud applications in the AWS access portal.
    • Set up a two-way forest trust relationship between the AWS Directory service and the company Active Directory to allow users to use their corporate credentials when logging in to AWS.
  • AWS SWF - predecessor of “AWS Step Functions”
  • AWS Migration Hub
    • actually an organization dashboard to track migration.
    • enable the Data Exploration option from the Migration Hub console,
    • data collected by AWS Application Discovery Service (ADS) agents is automatically stored in an Amazon S3 bucket that is created for you, and the data is updated automatically.
      • query it via Amazon Athena and run pre-defined queries to export the utilization data and network connection data for all your servers
    • AWS Migration Hub can generate Amazon EC2 instance sizing, using data from Discovery Service.
  • AWS Resource Access Manager (RAM) is a service that allows you to centrally manage access to your AWS resources.
    • shared VPC subnets - default limit is around 100 subnets
      • For comparison - a transit gateway can connect 5000 VPCs (prefer this one)
      • For comparison - a VPC peering can connect 50 VPCs
    • SCPs don’t work in RAM.
      • when sharing e.g. subnets of a VPC using AWS RAM, it’s AWS RAM that manages the permissions. And the accounts the subnets are shared with do not have management permissions for the VPC by default.
      • Therefore, denying VPC management actions via a Service Control Policy (SCP) would be unnecessary in this scenario.
      • Example: the managed permissions of a shared subnet. Note that you can’t create your own permissions for resources of type subnet - customer managed permissions are not allowed.
  • AWS Service Catalog -
  • VPC
    • Subnet CIDR blocks must be between /16 and /28; anything smaller than /28 is not allowed
    • private subnets -
      • resources here, to reach the internet - need a route to a NAT gateway in a public subnet
    • public subnets -
      • even if in a public subnet,
      • even if routed to an internet gateway,
      • resources without public IPs and without NAT routes - won’t be able to communicate via the internet
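The /28 limit on subnet size can be checked with the standard `ipaddress` module. A minimal sketch for IPv4 only, assuming the usual rule that AWS reserves five addresses in every subnet (the first four and the last one); `usable_addresses` is a hypothetical helper name.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses plus the last one


def usable_addresses(cidr: str) -> int:
    """Return the number of assignable IPv4 addresses in a VPC subnet."""
    network = ipaddress.ip_network(cidr, strict=True)
    if not 16 <= network.prefixlen <= 28:
        raise ValueError(f"{cidr}: subnet prefix must be between /16 and /28")
    return network.num_addresses - AWS_RESERVED_PER_SUBNET


print(usable_addresses("10.0.0.0/28"))  # smallest allowed subnet: 16 - 5 = 11
print(usable_addresses("10.0.1.0/24"))  # 256 - 5 = 251
```

A /29 would hold only 8 addresses, of which 5 are reserved, which is why nothing smaller than /28 is accepted.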
  • SES - uses credentials, not IAM roles
  • AWS RDS proxy - pools connections, saas
    • makes your app resilient to database failovers. When the original DB instance becomes unavailable, RDS Proxy connects to the standby database without dropping idle application connections.
    • a failover therefore does not drop the application’s connections; RDS Proxy routes requests to the new DB instance as soon as it is available, so write operations can continue with only a short pause.
  • Secrets Manager
    • reference from RDS by ARN of secret
  • “AWS Resource group” in “AWS Cost and Usage Reports”
    • enable in the management account to get central Cost and Usage Reports across accounts
  • AWS Pinpoint
    • multichannel communication platform
  • AWS Cost and Usage Report CUR
  • Virtual Tape Shelf - VTS
    • backed by S3 or glacier, but a bit more expensive
    • do prefer pure Glacier or deep archive for cost optimization
    • Gateway Load Balancer
      • mainly to achieve high availability, when communicating between “virtual appliances” - that’s the buzzword you need to look for
        • e.g. between Intrusion prevention systems
        • e.g. between Firewalls
      • operates on “IP level”
      • operates on combination of OSI layer 3 + 4
    • Machine learning services
      • Amazon Comprehend: Analyzes text to identify topics, entities, sentiment, and other relevant information. It is not designed for text recognition; it is used to extract insights from documents. For example, it can detect how customers feel about products via its DetectSentiment API.
      • Amazon Forecast: Generates predictions about future events based on historical data and machine learning models.
      • Amazon Fraud Detector: Helps businesses detect and prevent fraud in real time.
      • Amazon Kendra: Creates a knowledge base from unstructured text data like documents and emails.
        • Search queries in natural language, like “where can I get tested for COVID19”, usable on websites and applications.
      • Amazon Lex: Builds natural language chatbots that interact with customers in their own language.
      • Amazon Personalize: Recommends products, content, and more to your customers based on their past behavior and preferences.
      • Amazon Polly: Converts text into lifelike speech with a variety of voices and accents.
      • Amazon Rekognition: Analyzes images and videos to detect objects, people, and activities.
        • for face detection - RecognizeCelebrities, DetectFaces APIs exist.
        • how to use
        • Input (for real-time) via Kinesis VIDEO stream. Create the video stream.
          • Fill Video stream. If data comes as an AVI, then you must convert it fragment by fragment - e.g. via Java SDK “PutMedia”
          • Processing: happens via StreamProcessor which is created by passing
            • SOURCE: Kinesis-VIDEO-Stream,
            • OUTPUT: Kinesis-DATA-Stream,
              • the face collection for recognition
        • Output is generated into Kinesis DATA Stream and consumed by data-stream-consumer.
        • Can work with data on S3.
        • Can recognize objects via “StartLabelDetection”
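The stream processor wiring described above (Kinesis video stream in, Kinesis data stream out, plus a face collection) can be sketched as the request parameters for Rekognition's `CreateStreamProcessor` operation. This is a sketch, not a tested deployment: all ARNs, the processor name and the collection id are hypothetical placeholders, the match threshold of 80 is an arbitrary assumption, and the IAM role parameter is not mentioned in the notes but is assumed here because the service needs permission to read and write the streams.

```python
def stream_processor_request(name: str,
                             video_stream_arn: str,
                             data_stream_arn: str,
                             collection_id: str,
                             role_arn: str,
                             face_match_threshold: float = 80.0) -> dict:
    """Assemble keyword arguments for a Rekognition face-search stream processor."""
    return {
        "Name": name,
        # SOURCE: the Kinesis VIDEO stream that PutMedia fills fragment by fragment
        "Input": {"KinesisVideoStream": {"Arn": video_stream_arn}},
        # OUTPUT: the Kinesis DATA stream that a consumer reads results from
        "Output": {"KinesisDataStream": {"Arn": data_stream_arn}},
        # The face collection used for recognition
        "Settings": {
            "FaceSearch": {
                "CollectionId": collection_id,
                "FaceMatchThreshold": face_match_threshold,
            }
        },
        "RoleArn": role_arn,  # assumption: role allowing access to both streams
    }


request = stream_processor_request(
    name="hypothetical-processor",
    video_stream_arn="arn:aws:kinesisvideo:eu-west-1:111122223333:stream/hypothetical/0",
    data_stream_arn="arn:aws:kinesis:eu-west-1:111122223333:stream/hypothetical",
    collection_id="hypothetical-collection",
    role_arn="arn:aws:iam::111122223333:role/hypothetical-role",
)
print(sorted(request))
```

With boto3 installed and credentials configured, the dict could be passed as `boto3.client("rekognition").create_stream_processor(**request)`, followed by `start_stream_processor(Name=...)`.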
      • Amazon SageMaker: Provides an end-to-end machine learning platform that includes tools for data preparation, model training, and deployment.
      • Amazon Textract: OCR. Extracts text and data from scanned documents and images.
      • Amazon Transcribe: Converts speech to text in real time or at scale. It is a transcription service, also for video.
      • Amazon Translate: Translates text between more than 200 languages.
  • AWS Backup - backup strategies
    • RDS - Snapshot
      • automated backups can happen once a day only
      • snapshot restore - takes 30 minutes to complete
      • feasible to use for RPO / RTO of hours
    • AWS Backup - backup service
      • backups can be taken once every 12 hours
      • snapshot restore - takes 30 minutes to complete
      • feasible for cold data
    • AWS EC2 Snapshots
      • snapshot restore - takes 30 minutes to complete
      • feasible to use for RPO of day / RTO 30 Minutes
      • feasible for stateless web servers
    • AWS Aurora - read replication to another region
      • takes 5 minutes to promote the read replica to master (will restart the DB)
      • feasible to use for RPO / RTO of 5 Minutes
  • DDoS attack mitigation services. Use these against DDoS.
    • CloudFront - supports AWS WAF - the web firewall
      • When you use AWS WAF on Amazon CloudFront, your rules run in all AWS Edge Locations, located around the world close to your end users. Blocked requests are stopped before they reach your web servers.
    • Route53 - designed to withstand DNS query floods
      • Protection against DDoS attacks such as SYN/ACK floods, UDP floods, and reflection attacks is built into both Route 53 and CloudFront
  • AWS S3
    • S3 Standard
      • supports S3 Select - on text and compressed formats (GZIP)
    • Glacier
      • supports Glacier select - on text formats only: https://aws.amazon.com/blogs/aws/s3-glacier-select/
      • Instant retrieval - 32% cost of standard, immediately available (ms) - to restore about once per quarter
      • Flexible retrieval - 90% cost of instant, available in 12h - to restore 1-2 times per year
      • Deep Archive - 25% cost of flexible, available in 12h - 48h - to restore less than once per year
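The percentages above are chained (each tier is priced relative to the previous one), so it helps to convert them into a share of the S3 Standard price. A small worked calculation using only the rough ratios from these notes, not actual AWS pricing:

```python
from fractions import Fraction


def relative_storage_costs() -> dict:
    """Express each Glacier tier as a fraction of the S3 Standard storage price."""
    standard = Fraction(1)
    instant = standard * Fraction(32, 100)      # 32% of Standard
    flexible = instant * Fraction(90, 100)      # 90% of Instant Retrieval
    deep_archive = flexible * Fraction(25, 100) # 25% of Flexible Retrieval
    return {
        "standard": standard,
        "instant_retrieval": instant,
        "flexible_retrieval": flexible,
        "deep_archive": deep_archive,
    }


for tier, share in relative_storage_costs().items():
    print(f"{tier}: {float(share):.1%} of Standard")
```

By these ratios Flexible Retrieval lands at roughly 29% and Deep Archive at roughly 7% of the Standard price.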
  • AWS Athena
    • supports - formats such as CSV, JSON, ORC, Avro, or Parquet.
      • ORC - Stored as columns and compressed
    • supports Hive-style partitioning for efficiency, independent of encoding: s3://bucket/year=YYYY/month=MM/day=DD/hour=HH
    • for efficiency: encode with compressed, columnar format, partition, with “hive partitioning”
    • Bucketing in Athena
      • If you are familiar with data partitioning, then you can understand buckets as a form of Hash partitioning. A table can be bucketed on one or more columns into a fixed number of buckets. These buckets are stored on S3 and reduce data scans in such a way that only the concerned bucket is scanned when querying the data on the bucketed column, which can dramatically reduce the number of rows of data to read.
      • works wonders for reducing data scans (read: money) when used effectively
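Both ideas, Hive-style partition paths and hash bucketing, can be illustrated in a few lines. This is a sketch with hypothetical names (`partitioned_key`, `bucket_for`, the `logs` prefix and the file name). The CRC32 hash is only a stand-in to show the principle; the engine uses its own hash function, so the bucket numbers here will not match real bucketed tables.

```python
import zlib
from datetime import datetime


def partitioned_key(prefix: str, ts: datetime, filename: str) -> str:
    """Build an S3 object key in the year=/month=/day=/hour= partition layout."""
    return (f"{prefix.rstrip('/')}/year={ts:%Y}/month={ts:%m}"
            f"/day={ts:%d}/hour={ts:%H}/{filename}")


def bucket_for(value: str, num_buckets: int) -> int:
    """Assign a column value to one of a fixed number of buckets by hashing it."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    return zlib.crc32(value.encode("utf-8")) % num_buckets


key = partitioned_key("logs", datetime(2024, 2, 3, 12, 21), "part-0000.parquet")
print(key)  # logs/year=2024/month=02/day=03/hour=12/part-0000.parquet
print(bucket_for("customer-42", 16) == bucket_for("customer-42", 16))  # same value, same bucket
```

A query filtered on year, month, day and hour only has to read objects under the matching prefix, and a query filtered on the bucketed column only has to read the one bucket the value hashes to.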
  • AWS CloudFormation
certification/awscertifiedsolutionarchitecprofessional.txt · Last modified: 2024/02/03 12:21 by skipidar