1. Amazon SQS supports an unlimited number of queues and up to 120,000 in-flight messages per queue for each user. Be aware that Amazon SQS automatically deletes messages that have been in a queue longer than the maximum retention period of 14 days.
  2. The user can configure SQS to decouple the call between the EC2 application and S3, so the application does not block while waiting for S3 to return the data.
  3. Auto Scaling enables you to follow the demand curve for your applications closely, reducing the need to manually provision Amazon EC2 capacity in advance. For example, you can set a condition to add new Amazon EC2 instances in increments to the Auto Scaling group when the average utilisation of your Amazon EC2 fleet is high; and similarly, you can set a condition to remove instances in the same increments when CPU utilisation is low. If you have predictable load changes, you can set a schedule through Auto Scaling to plan your scaling activities. You can use Amazon CloudWatch to send alarms to trigger scaling activities and Elastic Load Balancing to help distribute traffic to your instances within Auto Scaling groups. Auto Scaling enables you to run your Amazon EC2 fleet at optimal utilisation.
  4. Amazon S3 provides four different access control mechanisms: AWS Identity and Access Management (IAM) policies, Access Control Lists (ACLs), bucket policies, and query string authentication.
  5. Amazon S3 bucket policies can be used to add or deny permissions across some or all of the objects within a single bucket.
  6. With Query string authentication, you have the ability to share Amazon S3 objects through URLs that are valid for a specified period of time.
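    As an illustration, here is a minimal boto3 sketch of query string authentication; the bucket and key names are placeholders, and generate_presigned_url returns a time-limited URL that can be shared for GET access until it expires.

      import boto3

      s3 = boto3.client("s3")

      # Generate a URL that is valid for one hour (3600 seconds).
      # "my-bucket" and "reports/summary.pdf" are hypothetical names.
      url = s3.generate_presigned_url(
          "get_object",
          Params={"Bucket": "my-bucket", "Key": "reports/summary.pdf"},
          ExpiresIn=3600,
      )
      print(url)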
  7. The user can receive notifications via SNS if notifications were configured when the Auto Scaling group was created.
  8. In Elastic Load Balancing, a health check configuration uses information such as protocol, ping port, ping path (URL), response timeout period, and health check interval to determine the health state of the instances registered with the load balancer. Currently, HTTP on port 80 is the default health check.
  9. Security groups act as a firewall for associated Amazon EC2 instances, controlling both inbound and outbound traffic at the instance level. Security groups are stateful: return traffic is automatically allowed, regardless of any rules.
  10. Network access control lists (ACLs) act as a firewall for associated subnets, controlling both inbound and outbound traffic at the subnet level. Network ACLs are stateless: return traffic must be explicitly allowed by rules.
  11. Amazon Glacier supports various vault operations.
    1. A vault inventory refers to the list of archives in a vault.
    2. Downloading a vault inventory is an asynchronous operation. 
    3. Given the asynchronous nature of the job, you can use Amazon Simple Notification Service (Amazon SNS) notifications to notify you when the job completes
    4. Amazon Glacier prepares an inventory for each vault periodically, every 24 hours. If there have been no archive additions or deletions to the vault since the last inventory, the inventory date is not updated.
  12. Within Amazon EC2, when using a Linux instance, the device name /dev/sda1 is reserved for the root device. Another device name, /dev/xvda, is also reserved for certain Linux root devices.
  13. Each Amazon EBS snapshot has a createVolumePermission attribute that you can set to one or more AWS account IDs to share the snapshot with those AWS accounts. To allow several AWS accounts to use a particular EBS snapshot, add those account IDs to the snapshot’s createVolumePermission attribute.
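    A hedged boto3 sketch of sharing a snapshot this way (the snapshot ID and account IDs are placeholders):

      import boto3

      ec2 = boto3.client("ec2")

      # Add two account IDs to the snapshot's createVolumePermission attribute
      # so those accounts can create volumes from it.
      ec2.modify_snapshot_attribute(
          SnapshotId="snap-0123456789abcdef0",
          Attribute="createVolumePermission",
          OperationType="add",
          UserIds=["111122223333", "444455556666"],
      )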
  14. A Classic Load Balancer routes each request independently to the registered instance with the smallest load. However, you can use the sticky session feature (also known as session affinity), which enables the load balancer to bind a user’s session to a specific instance. This ensures that all requests from the user during the session are sent to the same instance.
  15. EMR monitoring choices:
    1. Hadoop Web Interfaces
      Every cluster publishes a set of web interfaces on the master node that contain information about the cluster. You can access these web pages by using an SSH tunnel to connect to them on the master node.
    2. CloudWatch Metrics
      Every cluster reports metrics to CloudWatch. CloudWatch is a web service that tracks metrics, and which you can use to set alarms on those metrics.
    3. Ganglia
      Ganglia is a cluster monitoring tool. To have this available, you have to install Ganglia on the cluster when you launch it. After you’ve done so, you can monitor the cluster as it runs by using an SSH tunnel to connect to the Ganglia UI running on the master node.
  16. The Multi AZ feature allows the user to achieve High Availability. For Multi AZ, Amazon RDS automatically provisions and maintains a synchronous “standby” replica in a different Availability Zone.
  17. By default, all accounts are limited to 5 Elastic IP addresses per region. If you need more than 5 Elastic IP addresses, AWS asks that you apply for your limit to be raised. They will ask you to think through your use case and help them understand your need for additional addresses.
  18. When the user account has reached the maximum number of EC2 instances, it will not be allowed to launch an instance, and AWS will throw an ‘InstanceLimitExceeded’ error. For all other reasons, such as “AMI is missing part”, “Corrupt Snapshot” or “Volume limit has been reached”, it will launch an EC2 instance and then terminate it.
  19. A VPC security group controls access to DB instances and EC2 instances inside a VPC. Amazon RDS uses VPC security groups only for DB instances launched by recently created AWS accounts.
  20. CloudFormation: If any of the services fails to launch, CloudFormation will roll back all the changes and terminate or delete all the created resources.
  21. When modifying EBS snapshot permissions with AWS Console, one of the options is to make the snapshot public or not. However, snapshots with AWS Marketplace product codes CANNOT be made public.
  22. Amazon EBS replication is stored within the same availability zone, not across multiple zones; therefore, it is highly recommended that you conduct regular snapshots to Amazon S3 for long-term data durability.
  23. For customers who have architected complex transactional databases using EBS, it is recommended that backups to Amazon S3 be performed through the database management system so that distributed transactions and logs can be checkpointed.
  24. AWS Import/Export supports: 
    1. Import to Amazon S3
    2. Import to Amazon EBS
    3. Import to Amazon Glacier
    4. Export from Amazon S3 (only)
  25. When you enable connection draining, you can specify a maximum time for the load balancer to keep connections alive before reporting the instance as deregistered. The maximum timeout value can be set between 1 and 3,600 seconds (the default is 300 seconds).
  26. Amazon SNS makes it simple and cost-effective to push to mobile devices, such as iPhone, iPad, Android, Kindle Fire, and internet connected smart devices, as well as pushing to other distributed services.
  27. In relation to AWS CloudHSM, High-availability (HA) recovery is hands-off resumption by failed HA group members. Prior to the introduction of this function, the HA feature provided redundancy and performance, but required that a failed/lost group member be manually reinstated.
  28. AWS generates a separate unique encryption key for each Amazon Glacier archive and encrypts the archive using AES-256. Each encryption key is then itself encrypted using AES-256 with a master key that is stored in a secure location.
  29. Instances that you launch into a default subnet receive both a public IP address and a private IP address.
  30. Instances that you launch into a non-default subnet in a default VPC don’t receive a public IP address or a DNS hostname. You can change your subnet’s default public IP addressing behaviour.
  31. When you create or modify your DB Instance to run as a Multi-AZ deployment, Amazon RDS automatically provisions and maintains a synchronous “standby” replica in a different Availability Zone. 
    Updates to your DB Instance are synchronously replicated across Availability Zones to the standby in order to keep both in sync and protect your latest database updates against DB Instance failure. 
  32. To determine your instance’s public IP address from within the instance, you can use instance metadata at http://169.254.169.254/latest/meta-data/
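    A small sketch of reading instance metadata from within the instance (it only works when run on EC2, and it assumes the original unauthenticated metadata endpoint):

      from urllib.request import urlopen

      BASE = "http://169.254.169.254/latest/meta-data/"

      # Fetch the instance ID and the public IPv4 address from the metadata service.
      instance_id = urlopen(BASE + "instance-id").read().decode()
      public_ip = urlopen(BASE + "public-ipv4").read().decode()
      print(instance_id, public_ip)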
  33. You can’t attach an EBS volume to multiple EC2 instances. This is because it is equivalent to using a single hard drive with many computers at the same time.
  34. RAID 5 and RAID 6 are not recommended for Amazon EBS because the parity write operations of these RAID modes consume some of the IOPS available to your volumes.
  35. New Amazon SES users start in the Amazon SES sandbox, a test environment that has a sending quota of 1,000 emails per 24-hour period, at a maximum rate of 1 email per second.
  36. SES Sending limits are based on recipients rather than on messages. 
  37. Every Amazon SES sender has a unique set of sending limits, which are calculated by Amazon SES on an ongoing basis.
  38. The Elastic Load Balancer connection draining feature causes the load balancer to stop sending new requests to the back-end instances when the instances are deregistering or become unhealthy, while ensuring that in-flight requests continue to be served. Max connection draining time is 1 hour (3600 seconds).
  39. Resource-based permissions are supported by Amazon S3, Amazon SNS, Amazon SQS, Amazon Glacier.
  40. Amazon DynamoDB integrates with AWS Identity and Access Management (IAM). You can use AWS IAM to grant access to Amazon DynamoDB resources and API actions. To do this, you first write an AWS IAM policy, which is a document that explicitly lists the permissions you want to grant. You then attach that policy to an AWS IAM user or role.
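    For illustration, a hedged boto3 sketch that attaches an inline policy granting read-only access to one (hypothetical) DynamoDB table to an existing IAM user:

      import json
      import boto3

      iam = boto3.client("iam")

      # Policy document listing the permissions to grant; the table ARN and
      # account ID are placeholders.
      policy = {
          "Version": "2012-10-17",
          "Statement": [{
              "Effect": "Allow",
              "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:Scan"],
              "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/Orders",
          }],
      }

      # Attach the policy inline to the (hypothetical) user "analytics-reader".
      iam.put_user_policy(
          UserName="analytics-reader",
          PolicyName="OrdersTableReadOnly",
          PolicyDocument=json.dumps(policy),
      )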
  41. Every CloudFront web distribution must be associated either with the default CloudFront certificate or with a custom SSL certificate. Before you can delete an SSL certificate, you need to either rotate SSL certificates (replace the current custom SSL certificate with another custom SSL certificate) or revert from using a custom SSL certificate to using the default CloudFront certificate.
  42. You can’t use IAM to control access to CloudWatch data for specific resources.
  43. FGAC can benefit any application that tracks information in a DynamoDB table, where the end user (or application client acting on behalf of an end user) wants to read or modify the table directly, without a middle-tier service.
  44. The core components of DynamoDB are
    1. “Table”, a collection of Items
    2. “Items”, with Keys and one or more Attribute;
    3. “Attribute”, with Name and Value
  45. The AWS CloudHSM service defines a resource known as a high-availability (HA) partition group, which is a virtual partition that represents a group of partitions, typically distributed between several physical HSMs for high-availability. 
  46. You are billed per second, with a one-minute minimum, for On-Demand, Spot, and Reserved Instances while your EC2 instance is in a running state, provided the instance runs Linux (with some exceptions). Windows instances are billed per hour.
  47. Virtual tape shelf is backed by Amazon Glacier whereas virtual tape library is backed by Amazon S3.
  48. Amazon CloudFront billing is mainly affected by
    1. Data Transfer Out
    2. Edge Location Traffic Distribution
    3. Invalidation Requests
    4. HTTP/HTTPS Requests
    5. Dedicated IP SSL Certificates
  49. You can create an Auto Scaling group directly from an EC2 instance. When you use this feature, Auto Scaling automatically creates a launch configuration for you as well.
  50. In AWS CloudHSM, you can perform a remote backup/restore of a Luna SA partition if you have purchased a Luna Backup HSM. 
  51. A VPC can span several Availability Zones. In contrast, a subnet must reside within a single Availability Zone.
  52. You grant AWS Lambda permission to access a DynamoDB Stream using an IAM role known as the “execution role”. 
  53. You can assign tags only to resources that already exist. You can’t terminate, stop, or delete a resource based solely on its tags; you must specify the resource identifier.
  54. The different cluster states of an Amazon EMR cluster are listed below.
    1. STARTING – The cluster provisions, starts, and configures EC2 instances.
    2. BOOTSTRAPPING – Bootstrap actions are being executed on the cluster.
    3. RUNNING – A step for the cluster is currently being run.
    4. WAITING – The cluster is currently active, but has no steps to run.
    5. TERMINATING – The cluster is in the process of shutting down.
    6. TERMINATED – The cluster was shut down without error.
    7. TERMINATED_WITH_ERRORS – The cluster was shut down with errors.
  55. When you create a snapshot of a Throughput Optimised HDD (st1) or Cold HDD (sc1) volume, performance may drop as far as the volume’s baseline value while the snapshot is in progress. This behaviour is specific to these volume types. 
  56. Bucket names must be globally unique, regardless of the AWS region in which you create the bucket, and they must be DNS-compliant.
  57. Bucket names must be at least 3 and no more than 63 characters long.
  58. Bucket names can contain lowercase letters, numbers, periods, and/or hyphens. Each label must start and end with a lowercase letter or a number.
  59. Bucket names must not be formatted as an IP address (e.g., 192.168.1.1).
  60. The URL of any S3 object follows this template: https://s3-<region>.amazonaws.com/<bucket-name>/<object-path><object-name>
  61. DDOS attack: The attack surface is composed of the different Internet entry points that allow access to your application. The strategy to minimise the attack surface area is to
    1. Reduce the number of necessary Internet entry points
    2. Eliminate non-critical Internet entry points
    3. Separate end user traffic from management traffic
    4. Obfuscate necessary Internet entry points to the level that untrusted end users cannot access them, and
    5. Decouple Internet entry points to minimise the effects of attacks. This strategy can be accomplished with Amazon VPC.
  62. Amazon RDS read replicas provide enhanced performance and durability for Amazon RDS instances. This replication feature makes it easy to scale out elastically beyond the capacity constraints of a single Amazon RDS instance for read-heavy database workloads. You can create one or more replicas of a given source Amazon RDS instance and serve high-volume application read traffic from multiple copies of your data, thereby increasing aggregate read throughput.
  63. An AWS instance profile is a container for an AWS Identity and Access Management (IAM) role that you can use to pass role information to an Amazon EC2 instance when the instance starts. The IAM role should have a policy attached that allows access only to the AWS Cloud services necessary to perform its function.
  64.  To create an Availability Zone-independent architecture, create a NAT gateway in each Availability Zone and configure your routing to ensure that resources use the NAT gateway in the same Availability Zone.
  65. DynamoDB: You can create a maximum of 5 global secondary indexes per table.
  66. Elastic Load Balancing supports the Server Order Preference option for negotiating connections between a client and a load balancer. During the SSL connection negotiation process, the client and the load balancer present a list of ciphers and protocols that they each support, in order of preference. By default, the first cipher on the client’s list that matches any one of the load balancer’s ciphers is selected for the SSL connection. If the load balancer is configured to support Server Order Preference, then the load balancer selects the first cipher in its list that is in the client’s list of ciphers. This ensures that the load balancer determines which cipher is used for SSL connection. If you do not enable Server Order Preference, the order of ciphers presented by the client is used to negotiate connections between the client and the load balancer.
  67. Amazon WorkSpaces uses PCoIP, which provides an interactive video stream without transmitting actual data.
  68. Architect for high availability: Distributing applications across multiple Availability Zones provides the ability to remain resilient in the face of most failure modes, including natural disasters or system failures.
  69. Amazon DynamoDB does not have a server-side feature to encrypt items within a table. You need to use a solution outside of DynamoDB such as a client-side library to encrypt items before storing them, or a key management service like AWS Key Management Service to manage keys that are used to encrypt items before storing them in DynamoDB.
  70. Amazon EC2 roles must be assigned a policy. 
  71. Integrating IAM roles with Active Directory involves federation between Active Directory and IAM via SAML.
  72. DynamoDB: You can have multiple local secondary indexes, and they must be created at the same time the table is created. You can create multiple global secondary indexes associated with a table at any time.
  73. The Auto Scaling cool-down period is a configurable setting for your Auto Scaling group that helps ensure that Auto Scaling doesn’t launch or terminate additional instances before the previous scaling activity takes effect. After the Auto Scaling group dynamically scales using a simple scaling policy, Auto Scaling waits for the cool-down period to complete before resuming scaling activities.
  74. Amazon DynamoDB supports Query operations that require an input value and Scan operations that do not require an input value when retrieving data from a table. #DynamoDB
  75. Data is copied asynchronously from the source database to the Read Replica. #RDS
  76. The supported notification protocols for SNS are HTTP, HTTPS, Amazon SQS, Email, Short Message Service (SMS), and AWS Lambda. #SNS
  77. Delay queues make messages unavailable upon arrival to the queue and visibility timeouts make messages unavailable after being retrieved from the queue. #SQS
  78. You must provide the receipt handle for the message in order to delete it from a queue. #SQS
  79. VPC subnets do not span availability zones. One subnet equals one availability zone. #VPC
  80. The bursty nature of the IO (~3000 IOPS) makes the General Purpose SSD the more cost-effective choice. #IO #EBS
  81. The instance type defines the virtual hardware (CPU, Memory) allocated to the instance. #EC2
  82. Identity federation is based on temporary security tokens. Access cannot be granted directly to external identities, nor can they be added to IAM groups. #IAM
  83. Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service are the three possible destinations for Amazon Kinesis Firehose data. #KinesisFirehose
  84. Amazon EMR is an excellent choice to analyze data at rest, such as logs stored on Amazon S3. #AmazonEMR
  85. Amazon Kinesis Streams can analyze streams of data in real time. #KinesisStreams
  86. Gateway Stored Volumes replicate all data to the cloud asynchronously while storing it all locally to reduce latency. #StorageGateway
  87. Gateway Cached Volumes provide an iSCSI interface to block storage, but store all data in Amazon S3 for durability and retain only the recently used data locally. #StorageGateway
  88. Amazon CloudFront can use any HTTP/S source as an origin, whether on AWS or on-premises. #Cloudfront
  89. AWS Config tracks the configuration of your AWS infrastructure and does not monitor its health. #AWSConfig
  90. Amazon Glacier objects are retrieved online (not shipped) and are available in three to five hours. #Glacier
  91. You must enable versioning before you can enable cross-region replication, and Amazon S3 must have IAM permissions to perform the replication. #S3
  92. Lifecycle rules migrate data from one storage class to another, not from one bucket to another. #S3
  93. Amazon CloudWatch metric data is kept for 2 weeks. #CloudWatch
  94. Query is the most efficient operation to find a single item in a large table. #DynamoDB
  95. Amazon SES can also be used to receive messages and deliver them to an Amazon S3 bucket, call custom code via an AWS Lambda function, or publish notifications to Amazon SNS. #SES
  96. Resources aren’t replicated across regions unless organizations choose to do so. #General
  97. In Amazon S3, you GET an object or PUT an object, operating on the whole object at once, instead of incrementally updating portions of the object as you would with a file. #S3
  98. You can’t “mount” a bucket, “open” an object, install an operating system on Amazon S3, or run a database on S3. #S3
  99. Amazon Elastic File System (Amazon EFS) provides network-attached shared file storage (NAS storage) using the NFS v4 protocol.
  100. S3 bucket objects – User metadata is optional, and it can only be specified at the time an object is created.
  101. The combination of bucket, key, and optional version ID uniquely identifies an Amazon S3 object.
  102. For PUTs to existing objects (object overwrite to an existing key) and for object DELETEs, Amazon S3 provides eventual consistency.
  104. Amazon S3 bucket policies are the recommended access control mechanism for Amazon S3 and provide much finer-grained control.
  105. Versioning is turned on at the bucket level. Once enabled, versioning cannot be removed from a bucket; it can only be suspended.
  106. MFA Delete can only be enabled by the root account
  107. Multipart upload is a three-step process: initiation, uploading the parts, and completion (or abort).
  108. Cross-region replication is a feature of Amazon S3 that allows you to asynchronously replicate all new objects in the source bucket in one AWS region to a target bucket in another region. To enable cross-region replication, versioning must be turned on for both source and destination buckets, and you must use an IAM policy to give Amazon S3 permission to replicate objects on your behalf.
  109. In Amazon Glacier, data is stored in archives. An archive can contain up to 40TB of data, and you can have an unlimited number of archives.
  110. Glacier Vaults are containers for archives. Each AWS account can have up to 1,000 vaults.
  111. Amazon Glacier supports 40TB archives versus 5TB objects in Amazon S3.
  112. Archives in Amazon Glacier are identified by system-generated archive IDs, while Amazon S3 lets you use “friendly” key names.
  113. Amazon Glacier archives are automatically encrypted, while encryption at rest is optional in Amazon S3.
  114. Versioning and MFA Delete can be used to protect against accidental deletion in S3 buckets.
  115. Multipart upload can be used to upload large objects, and Range GETs can be used to download portions of an Amazon S3 object or Amazon Glacier archive.
  116. Amazon S3 event notifications can be used to send an Amazon SQS or Amazon SNS message or to trigger an AWS Lambda function when an object is created or deleted.
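    A minimal boto3 sketch of such a notification, assuming a bucket and a Lambda function that already allows S3 to invoke it (both names are placeholders):

      import boto3

      s3 = boto3.client("s3")

      # Invoke a Lambda function whenever any object is created in the bucket.
      s3.put_bucket_notification_configuration(
          Bucket="my-bucket",
          NotificationConfiguration={
              "LambdaFunctionConfigurations": [{
                  "LambdaFunctionArn": "arn:aws:lambda:us-east-1:111122223333:function:process-upload",
                  "Events": ["s3:ObjectCreated:*"],
              }],
          },
      )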
  117. Amazon Glacier vaults can be locked for compliance purposes.
  118. S3 – Use ACLs, Amazon S3 bucket policies, and AWS IAM policies for access control.
  119. CloudFront – Use pre-signed URLs for time-limited download access.
  120. RRS offers lower durability at lower cost for easily replicated data.
  121. Lifecycle configuration rules define actions to transition objects from one storage class to another based on time.
  122. Data is stored in encrypted archives that can be as large as 40TB.
  123. Enhanced networking is available only for instances launched in an Amazon Virtual Private Cloud (Amazon VPC).
  124. The Amazon Machine Image (AMI) defines the initial software that will be on an instance when it is launched.
  125. Security groups allow you to control traffic based on port, protocol, and source.
  126. EC2-Classic security groups control incoming instance traffic only.
  127. VPC security groups control both incoming and outgoing instance traffic.
  128. Every instance must have at least one security group but can have more.
  129. A security group is default deny; that is, it does not allow any traffic that is not explicitly allowed by a security group rule.
  130. A security group is a stateful firewall; that is, an outgoing message is remembered so that the response is allowed through the security group without an explicit inbound rule being required.
  131. Security groups are applied at the instance level, as opposed to a traditional on-premises firewall that protects at the perimeter.
  132. UserData is stored with the instance and is not encrypted, so it is important to not include any secrets such as passwords or keys in the UserData.
  133. Outside of an Amazon VPC (called EC2-Classic), the association of the security groups cannot be changed after launch.
  134. In order to prevent termination via the AWS Management Console, CLI, or API, termination protection can be enabled for an instance. While enabled, calls to terminate the instance will fail until termination protection is disabled. It does not prevent termination triggered by an OS shutdown command, termination from an Auto Scaling group, or termination of a Spot Instance due to Spot price changes.
  135. Amazon VPC is the networking layer for Amazon Elastic Compute Cloud (Amazon EC2).
  136. Default Amazon VPCs contain one public subnet in every Availability Zone within the region, with a netmask of /20.
  137. Each route table contains a default route called the local route, which enables communication within the Amazon VPC, and this route cannot be modified or removed.
  138. An Amazon VPC may have multiple peering connections, and peering is a one-to-one relationship between Amazon VPCs, meaning two Amazon VPCs cannot have two peering agreements between them.
  139. SG: You can specify allow rules, but not deny rules. This is an important difference between security groups and ACLs.
  140. Instances associated with the same security group can’t talk to each other unless you add rules allowing it (with the exception being the default security group).
  141. You can change the security groups with which an instance is associated after launch, and the changes will take effect immediately.
  142. A network access control list (ACL) is another layer of security that acts as a stateless firewall on a subnet level.
  143. A network ACL is a numbered list of rules that AWS evaluates in order, starting with the lowest numbered rule
  144. Every subnet must be associated with a network ACL.
  145. For common use cases, AWS recommends that you use a NAT gateway instead of a NAT instance.
  146. The VPG is the AWS end of the VPN tunnel. The CGW is a hardware or software application on the customer’s side of the VPN tunnel.
  147. You must initiate the VPN tunnel from the CGW to the VPG.
  148. A public subnet is one in which the associated route table directs the subnet’s traffic to the Amazon VPC’s IGW.
  149. A private subnet is one in which the associated route table does not direct the subnet’s traffic to the Amazon VPC’s IGW.
  150. A VPN-only subnet is one in which the associated route table directs the subnet’s traffic to the Amazon VPC’s VPG and does not have a route to the IGW.
  151. An IGW provides a target in your Amazon VPC route tables for Internet-routable traffic, and it performs network address translation for instances that have been assigned public IP addresses.
  152. In order for you to assign your own domain name to your instances, you create a custom DHCP option set and assign it to your Amazon VPC.
  153. An Amazon VPC endpoint enables you to create a private connection between your Amazon VPC and another AWS service without requiring access over the Internet or through a NAT instance, VPN connection, or AWS Direct Connect. Endpoints support services within the region only.
  154. You can create an Amazon VPC peering connection between your own Amazon VPCs or with an Amazon VPC in another AWS account within a single region.
  155. A NAT instance is a customer-managed instance.
  156. A NAT gateway is an AWS-managed service 
  157. Transitive peering is not supported, and peering is only available between Amazon VPCs within the same region.
  158. The VPN connection must be initiated from the CGW side, and the connection consists of two IPSec tunnels.
  159. Amazon CloudWatch is a service that monitors AWS Cloud resources and applications running on AWS. It collects and tracks metrics, collects and monitors log files, and sets alarms. Amazon CloudWatch has a basic level of monitoring for no cost and a more detailed level of monitoring for an additional cost.
  160. Elastic Load Balancing supports routing and load balancing of Hypertext Transfer Protocol (HTTP), Hypertext Transfer Protocol Secure (HTTPS), Transmission Control Protocol (TCP), and Secure Sockets Layer (SSL) traffic to Amazon EC2 instances.
  161. Because Elastic Load Balancing is a managed service, it scales in and out automatically to meet the demands of increased application traffic, and it is highly available within a region.
  162. Elastic Load Balancing also supports integrated certificate management and SSL termination.
  163. Elastic Load Balancing in Amazon VPC supports IPv4 addresses only.
  164. Elastic Load Balancing in EC2-Classic supports both IPv4 and IPv6 addresses.
  165. You can use internal load balancers to route traffic to your Amazon EC2 instances in VPCs with private subnets.
  166. In order to use SSL, you must install an SSL certificate on the load balancer that it uses to terminate the connection and then decrypt requests from clients before sending requests to the back-end Amazon EC2 instances. You can optionally choose to enable authentication on your back-end instances.
  167. Elastic Load Balancing does not support Server Name Indication (SNI) on your load balancer.
  168. If you want to host multiple websites on a fleet of Amazon EC2 instances behind Elastic Load Balancing with a single SSL certificate, you will need to add a Subject Alternative Name (SAN) for each website to the certificate 
  169. Every load balancer must have one or more listeners configured.
  170. Every listener is configured with a protocol and a port (client to load balancer) for a front-end connection and a protocol and a port for the back-end (load balancer to Amazon EC2 instance) connection.
  171. Elastic Load Balancing supports the following protocols: HTTP HTTPS TCP SSL 
  172. Elastic Load Balancing supports protocols operating at two different Open System Interconnection (OSI) layers. Layer 4 & 7.
  173. Elastic Load Balancing allows you to configure many aspects of the load balancer, including idle connection timeout, cross-zone load balancing, connection draining, proxy protocol, sticky sessions, and health checks.
  174. For each request that a client makes through a load balancer, the load balancer maintains two connections. One connection is with the client and the other connection is to the back-end instance.
  175. By default, Elastic Load Balancing sets the idle timeout to 60 seconds for both connections.
  176. Keep-alive, when enabled, allows the load balancer to reuse connections to your back-end instance, which reduces CPU utilization.
  177. To ensure that the load balancer is responsible for closing the connections to your back-end instance, make sure that the value you set for the keep-alive time is greater than the idle timeout setting on your load balancer.
  178. You should enable connection draining to ensure that the load balancer stops sending requests to instances that are deregistering or unhealthy, while keeping the existing connections open. This enables the load balancer to complete in-flight requests made to these instances.
  179. Draining timeout between 1 and 3,600 seconds (the default is 300 seconds). When the maximum time limit is reached, the load balancer forcibly closes connections to the deregistering instance.
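    A hedged boto3 sketch of enabling connection draining on a Classic Load Balancer (the load balancer name is a placeholder):

      import boto3

      elb = boto3.client("elb")  # Classic Load Balancer API

      # Enable connection draining with a 300-second timeout.
      elb.modify_load_balancer_attributes(
          LoadBalancerName="web-clb",
          LoadBalancerAttributes={
              "ConnectionDraining": {"Enabled": True, "Timeout": 300},
          },
      )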
  180. If you enable Proxy Protocol, a human-readable header is added to the request header with connection information such as the source IP address, destination IP address, and port numbers. The header is then sent to the back-end instance as part of the request.
  181. Sticky session feature (also known as session affinity) enables the load balancer to bind a user’s session to a specific instance.
  182. Elastic Load Balancing creates a cookie named AWSELB that is used to map the session to the instance.
  183. A health check is a ping, a connection attempt, or a page that is checked periodically.
  184. Amazon CloudWatch supports multiple types of actions such as sending a notification to an Amazon Simple Notification Service (Amazon SNS) topic or executing an Auto Scaling policy.
  185. Amazon CloudWatch supports an Application Programming Interface (API) that allows programs and scripts to PUT metrics into Amazon CloudWatch as name-value pairs that can then be used to create events and trigger alarms in the same manner as the default Amazon CloudWatch metrics.
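    For example, a minimal boto3 sketch that PUTs a custom metric value; the namespace and metric name are made up for illustration, and alarms can then be defined on this metric just as on the default metrics:

      import boto3

      cloudwatch = boto3.client("cloudwatch")

      # Publish one data point for a custom metric.
      cloudwatch.put_metric_data(
          Namespace="MyApp",
          MetricData=[{
              "MetricName": "QueueDepth",
              "Value": 42,
              "Unit": "Count",
          }],
      )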
  186. A CloudWatch Logs agent is available that provides an automated way to send log data to CloudWatch Logs for Amazon EC2 instances running Amazon Linux or Ubuntu. You can use the Amazon CloudWatch Logs agent installer on an existing Amazon EC2 instance to install and configure the CloudWatch Logs agent.
  187. Each AWS account is limited to 5,000 alarms per AWS account, and metrics data is retained for two weeks by default.
  188. Auto scaling types
    1. Maintain Current Instance Levels 
    2. Manual Scaling 
    3. Scheduled scaling means that scaling actions are performed automatically as a function of time and date.
    4. Dynamic scaling lets you define parameters that control the Auto Scaling process in a scaling policy.
  189. Auto Scaling has several components that need to be configured to work properly: a launch configuration, an Auto Scaling group, and an optional scaling policy.
  190. A launch configuration is the template that Auto Scaling uses to create new instances, and it is composed of the configuration name, Amazon Machine Image (AMI), Amazon EC2 instance type, security group, and instance key pair.
  191. The default limit for launch configurations is 100 per region. If you exceed this limit, the call to create-launch-configuration will fail.
  192. The aws autoscaling describe-account-limits CLI command (the DescribeAccountLimits API) returns the Auto Scaling limits for your account.
  193. Auto Scaling may cause you to reach limits of other services, such as the default number of Amazon EC2 instances you can currently launch within a region, which is 20.
  194. A launch configuration can reference On-Demand Instances or Spot Instances, but not both; for Spot Instances you specify a bid price (e.g. --spot-price “0.15”).
  195. Auto Scaling instance protection prevents instances from being terminated during scale-in events. When a CloudWatch alarm triggers a scale-in, Auto Scaling terminates only those instances in the group that do not have instance protection enabled. Instance protection does not, however, prevent a Spot Instance from being terminated when the market price exceeds the bid price.
  196. The AWS Trusted Advisor provides best practices (or checks) in four categories:
    1. Cost Optimization
    2. Security
    3. Fault tolerance
    4. Performance improvement.
  197. Using Amazon CloudWatch alarm actions, you can create alarms that automatically stop, terminate, reboot, or recover your Amazon Elastic Compute Cloud (Amazon EC2) instances. You can use the stop or terminate actions to help you save money when you no longer need an instance to be running.
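    A hedged boto3 sketch of an alarm that stops an idle instance; the instance ID is a placeholder, and the "arn:aws:automate:<region>:ec2:stop" action ARN shown follows the documented form for EC2 alarm actions:

      import boto3

      cloudwatch = boto3.client("cloudwatch")

      # Stop the instance when average CPU stays below 5% for one hour
      # (twelve consecutive 5-minute periods).
      cloudwatch.put_metric_alarm(
          AlarmName="stop-idle-instance",
          Namespace="AWS/EC2",
          MetricName="CPUUtilization",
          Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
          Statistic="Average",
          Period=300,
          EvaluationPeriods=12,
          Threshold=5.0,
          ComparisonOperator="LessThanThreshold",
          AlarmActions=["arn:aws:automate:us-east-1:ec2:stop"],
      )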
  198. Since AWS is a public cloud, any application hosted on EC2 is prone to attack. It is therefore extremely important for a user to set up a proper security mechanism on the EC2 instances. A few of the security measures are listed below:
    1. Always keep the OS updated with the latest patch
    2. Always create separate users within the OS for anyone who needs to connect to the EC2 instances; create keys for each user and disable their passwords
    3. Create a procedure by which the admin can revoke a user’s access once their work on the EC2 instance is complete
    4. Lock down unnecessary ports
    5. Audit any proprietary applications that the user may be running on the EC2 instance
    6. Provide temporary escalated privileges, such as sudo for users who need to perform occasional privileged tasks
  199. A recommended best practice is to scale out quickly and scale in slowly so you can respond to bursts or spikes but avoid inadvertently terminating Amazon EC2 instances too quickly, only having to launch more Amazon EC2 instances if the burst is sustained.
  200. IAM is not an identity store/authorization system for your applications; it controls access to AWS resources, not permissions within your application.
  201. If you are working with a mobile app, consider Amazon Cognito for identity management for mobile applications.
  202. A principal is an IAM entity that is allowed to interact with AWS resources. A principal can be permanent or temporary, and it can represent a human or an application.
  203. There are three types of principals: root users, IAM users, and roles / temporary security tokens.
  204. Roles are used to grant specific privileges to specific actors for a set duration of time.
  205. When one of these actors assumes a role, AWS provides the actor with a temporary security token from the AWS Security Token Service (STS) that the actor can use to access AWS Cloud services.
  206. The range of a temporary security token lifetime is 15 minutes to 36 hours.
  207. Common IAM role use cases:
    1. Amazon EC2 Roles— Granting permissions to applications running on an Amazon EC2 instance.
    2. Cross-Account Access— Granting permissions to users from other AWS accounts, whether you control those accounts or not.
    3. Federation— Granting permissions to users authenticated by a trusted external system.
  208. IAM can integrate with two different types of outside identity providers: web identity providers (e.g. Facebook) and internal/enterprise identity providers (e.g. Active Directory).
  209. A policy is a JSON document that fully defines a set of permissions to access and manipulate AWS resources.
  210. The security risk of any credential increases with the age of the credential. To this end, it is a security best practice to rotate access keys associated with your IAM users. IAM facilitates this process by allowing two active access keys at a time.
  211. If an AssumeRole call includes a role and a policy, the policy cannot expand the privileges of the role.
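    To illustrate, a hedged boto3 sketch of AssumeRole with a session policy; the role ARN and bucket are placeholders, and the inline policy can only narrow what the role already allows:

      import json
      import boto3

      sts = boto3.client("sts")

      # Assume the role for one hour, further restricted to reading one bucket.
      creds = sts.assume_role(
          RoleArn="arn:aws:iam::111122223333:role/ReportReader",
          RoleSessionName="nightly-report-job",
          DurationSeconds=3600,
          Policy=json.dumps({
              "Version": "2012-10-17",
              "Statement": [{
                  "Effect": "Allow",
                  "Action": "s3:GetObject",
                  "Resource": "arn:aws:s3:::report-bucket/*",
              }],
          }),
      )["Credentials"]
      # creds now holds a temporary AccessKeyId, SecretAccessKey, and SessionToken.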
  212. Common use cases for IAM roles include federating identities from external IdPs, assigning privileges to an Amazon EC2 instance where they can be assumed by applications running on the instance, and cross-account access.
  213. The three principals that can authenticate and interact with AWS resources are the root user, IAM users, and roles.
  214. Amazon RDS does not provide shell access to Database (DB) Instances, and it restricts access to certain system procedures and tables that require advanced privileges.
  215. Existing DB Instances can be changed or resized using the ModifyDBInstance API.
  216. A DB parameter group acts as a container for engine configuration values that can be applied to one or more DB Instances.
  217. A DB option group acts as a container for engine features, which is empty by default.
  218. Amazon RDS MySQL supports Multi-AZ deployments for high availability and read replicas for horizontal scaling.
  219. AWS offers two licensing models: License Included and Bring Your Own License (BYOL).
  220. When you first create an Amazon Aurora instance, you create a DB cluster. A DB cluster has one or more instances and includes a cluster volume that manages the data for those instances. Each DB cluster can have up to 15 Amazon Aurora Replicas in addition to the primary instance.
  221. Amazon RDS supports three storage types: Magnetic, General Purpose (Solid State Drive [SSD]), and Provisioned IOPS (SSD).
  222. RPO is defined as the maximum period of data loss that is acceptable 
  223. RTO is defined as the maximum amount of downtime that is permitted to recover from backup and to resume processing.
  224. When you delete a DB Instance, all automated backup snapshots are deleted and cannot be recovered. Manual snapshots, however, are not deleted.
  225. Automated backups are kept for a configurable number of days, called the backup retention period.
  226. You cannot restore from a DB snapshot to an existing DB Instance; a new DB Instance is created when you restore.
  227. Multi-AZ deployments are available for all types of Amazon RDS database engines.
  228. Amazon RDS automatically replicates the data from the master database or primary instance to the slave database or secondary instance using synchronous replication.
  229. Amazon RDS will automatically fail over to the standby instance without user intervention. The DNS name remains the same, but the Amazon RDS service changes the CNAME to point to the standby.
  230. Failover between the primary and the secondary instance is fast, and the time automatic failover takes to complete is typically one to two minutes.
  231. Each database instance can scale from 5GB up to 6TB in provisioned storage depending on the storage type and engine.
  232. Read replicas are currently supported in Amazon RDS for MySQL, PostgreSQL, MariaDB, and Amazon Aurora.
  233. Updates made to the source DB Instance are asynchronously copied to the read replica. You can reduce the load on your source DB Instance by routing read queries from your applications to the read replica.
  234. Before you can deploy into an Amazon VPC, you must first create a DB subnet group that predefines which subnets are available for Amazon RDS deployments.
  235. The key component of an Amazon Redshift data warehouse is a cluster. A cluster is composed of a leader node and one or more compute nodes. The Dense Compute node types support clusters up to 326TB using fast SSDs, while the Dense Storage nodes support clusters up to 2PB using large magnetic disks. The number of slices per node depends on the node size of the cluster and typically varies between 2 and 16.
  236. Whenever you perform a resize operation, Amazon Redshift will create a new cluster and migrate data from the old cluster to the new one. During a resize operation, the database will become read-only until the operation is finished.
  237. The data distribution style that you select for your database has a big impact on query performance, storage requirements, data loading, and maintenance. When creating a table, you can choose between one of three distribution styles: EVEN, KEY, or ALL.
  238. EVEN distribution is the default option and results in the data being distributed across the slices in a uniform fashion regardless of the data.
  239. KEY distribution: With KEY distribution, the rows are distributed according to the values in one column. The leader node will store matching values close together and increase query performance for joins.
  240. With ALL, a full copy of the entire table is distributed to every node. This is useful for lookup tables and other large tables that are not updated frequently.
  241. The sort keys for a table can be either compound or interleaved. A compound sort key is more efficient when query predicates use a prefix, which is a subset of the sort key columns in order. An interleaved sort key gives equal weight to each column in the sort key, so query predicates can use any subset of the columns that make up the sort key, in any order.
  242. Amazon Redshift supports standard SQL commands like INSERT and UPDATE to create and modify records in a table.
  243. A COPY command can load data into a table in the most efficient manner, and it supports multiple types of input data sources.
  244. Data can also be exported out of Amazon Redshift using the UNLOAD command. This command can be used to generate delimited text files and store them in Amazon S3.
  245. For large Amazon Redshift clusters supporting many users, you can configure Workload Management (WLM) to queue and prioritize queries. WLM allows you to define multiple queues and set the concurrency level for each queue.
  246. Amazon DynamoDB supports two types of primary keys, a simple partition key or a composite partition key and sort key, and this configuration cannot be changed after a table has been created.
  247. It is possible for two items to have the same partition key value, but those two items must have different sort key values.
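    A small boto3 sketch of querying such a composite key; the table and attribute names ("Orders", "CustomerId", "OrderDate") are hypothetical:

      import boto3
      from boto3.dynamodb.conditions import Key

      table = boto3.resource("dynamodb").Table("Orders")

      # Return every item with the same partition key whose sort key
      # starts with "2017-".
      response = table.query(
          KeyConditionExpression=Key("CustomerId").eq("C-1001")
          & Key("OrderDate").begins_with("2017-")
      )
      for item in response["Items"]:
          print(item)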
  248. You can create or delete a global secondary index on a table at any time. #DynamoDB
  249. You can only create a local secondary index when you create a table. #DynamoDB
  250. Amazon DynamoDB Streams makes it easy to get a list of item modifications for the last 24-hour period. Stream records are organized into groups, also referred to as shards. Shards live for a maximum of 24 hours and, with fluctuating load levels, could be split one or more times before they are eventually closed.
  251. To build an application that reads from a shard, it is recommended to use the Amazon DynamoDB Streams Kinesis Adapter.
  252. Amazon SQS ensures delivery of each message at least once and supports multiple readers and writers interacting with the same queue.
  253. Although most of the time each message will be delivered to your application exactly once, you should design your system to be idempotent (that is, it must not be adversely affected if it processes the same message more than once).
  254. Delay queues allow you to postpone the delivery of new messages in a queue for a specific number of seconds.
  255. To create a delay queue, use CreateQueue and set the DelaySeconds attribute to any value between 0 and 900 (15 minutes). The default value for DelaySeconds is 0.
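    A minimal boto3 sketch of creating a delay queue (the queue name is a placeholder):

      import boto3

      sqs = boto3.client("sqs")

      # Every message sent to this queue is hidden for 120 seconds
      # before consumers can receive it.
      queue_url = sqs.create_queue(
          QueueName="order-events-delayed",
          Attributes={"DelaySeconds": "120"},
      )["QueueUrl"]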
  256. When a message is in the queue but is neither delayed nor in a visibility timeout, it is considered to be “in flight.”
  257. You can have up to 120,000 messages in flight at any given time. 
  258. Amazon SQS supports up to 12 hours’ maximum visibility timeout.
  259. Amazon SQS uses three identifiers that you need to be familiar with: queue URLs, message IDs, and receipt handles.
  260. To delete a message, you need the message’s receipt handle instead of the message ID.
  261. The maximum length of a message ID is 100 characters. 
  262. The maximum length of a receipt handle is 1,024 characters. 
  263. Each message can have up to 10 attributes.
  264. If there is no message in the queue, then the call will wait up to WaitTimeSeconds for a message to appear before returning.
  265. Long polling drastically reduces the amount of load on your client.
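    As a sketch, long polling and receipt-handle deletion with boto3 look roughly like this; the queue URL and the process() helper are hypothetical:

      import boto3

      sqs = boto3.client("sqs")
      queue_url = "https://sqs.us-east-1.amazonaws.com/111122223333/order-events"

      # Wait up to 20 seconds for a message instead of returning immediately,
      # then delete it using its receipt handle (not its message ID).
      resp = sqs.receive_message(QueueUrl=queue_url, WaitTimeSeconds=20, MaxNumberOfMessages=1)
      for msg in resp.get("Messages", []):
          process(msg["Body"])  # hypothetical application logic
          sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])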
  266. A dead letter queue is a queue that other (source) queues can target to send messages that for some reason could not be successfully processed.
  267. You can create a dead letter queue from the Amazon SQS API and the Amazon SQS console.
  268. Amazon SQS Access Control allows you to assign policies to queues that grant specific interactions to other accounts without that account having to assume IAM roles from your account.
  269. Amazon SQS does not return success to a SendMessage API call until the message is durably stored in Amazon SQS.
  270. In Amazon SWF, a task represents a logical unit of work that is performed by a component of your application.
  271. When using Amazon SWF, you implement workers to perform tasks. These workers can run either on cloud infrastructure, such as Amazon EC2, or on your own premises.
  272. Using Amazon SWF, you can implement distributed, asynchronous applications as workflows.
  273. Workflows coordinate and manage the execution of activities that can be run asynchronously across multiple computing devices and that can feature both sequential and parallel processing.
  274. Domains provide a way of scoping Amazon SWF resources within your AWS account. You must specify a domain for all the components of a workflow. It is possible to have more than one workflow in a domain.
  275. Workflows in different domains cannot interact with one another.
  276. Amazon SWF consists of a number of different types of programmatic features known as actors.
  277. Actors communicate with Amazon SWF through its API.
  278. A workflow starter is any application that can initiate workflow executions. An activity worker is a single computer process (or thread) that performs the activity tasks in your workflow. The logic that coordinates the tasks in a workflow is called the decider.
  279. Amazon SWF provides activity workers and deciders with work assignments, given as one of three types of tasks: activity tasks, AWS Lambda tasks, and decision tasks.
  280. An AWS Lambda task is similar to an activity task, but executes an AWS Lambda function instead of a traditional Amazon SWF activity.
  281. The decision task contains the current workflow history.
  282. Amazon SWF schedules a decision task when the workflow starts and whenever the state of the workflow changes, such as when an activity task completes.
  283. Scheduling a task creates the task list if it doesn’t already exist.
  284. Deciders and activity workers communicate with Amazon SWF using long polling.
  285. Registered workflow type is identified by its domain, name, and version.
  286. Workflow types are specified in the call to RegisterWorkflowType.
  287. Activity types are specified in the call to RegisterActivityType.
  288. Amazon SNS is a web service for mobile and enterprise messaging that enables you to set up, operate, and send notifications.
  289. Amazon SNS follows the publish-subscribe (pub-sub) messaging paradigm.
  290. A fanout scenario is when an Amazon SNS message is sent to a topic and then replicated and pushed to multiple Amazon SQS queues, HTTP endpoints, or email addresses. This allows for parallel asynchronous processing.
  291. Push email and text messaging are two ways to transmit messages to individuals or groups via email and/or SMS.
  292. Visibility timeout is a period of time during which Amazon SQS prevents other components from receiving and processing a message because another component is already processing it. By default, the message visibility timeout is set to 30 seconds, and the maximum that it can be is 12 hours.
  293. Long polling allows your Amazon SQS client to poll an Amazon SQS queue; if nothing is there, ReceiveMessage waits between 1 and 20 seconds for a message to appear before returning.
  294. You can use the following protocols with Amazon SNS: HTTP, HTTPS, SMS, email, email-JSON, Amazon SQS, and AWS Lambda.
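    A hedged boto3 sketch of a topic with two subscriptions; the email address and queue ARN are placeholders, and subscribing an SQS queue additionally requires a queue policy that allows SNS to send to it:

      import boto3

      sns = boto3.client("sns")

      topic_arn = sns.create_topic(Name="order-alerts")["TopicArn"]

      # One publish fans out to every confirmed subscriber.
      sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="ops@example.com")
      sns.subscribe(
          TopicArn=topic_arn,
          Protocol="sqs",
          Endpoint="arn:aws:sqs:us-east-1:111122223333:order-events",
      )
      sns.publish(TopicArn=topic_arn, Subject="New order", Message="Order C-1001 received")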
  295. Amazon Route 53 is an authoritative DNS system. An authoritative DNS system provides an update mechanism that developers use to manage their public DNS names.
  296. Name servers can be authoritative, meaning that they give answers to queries about domains under their control.
  297. A zone file is a simple text file that contains the mappings between domain names and IP addresses. This is how a DNS server finally identifies which IP address should be contacted when a user requests a certain domain name.
  298. A Start of Authority (SOA) record is mandatory in all zone files,
  299. The A record is used to map a host to an IPv4 IP address, while AAAA records are used to map a host to an IPv6 address.
  300. The MX record should point to a host defined by an A or AAAA record and not one defined by a CNAME.
  301. Pointer (PTR) record is essentially the reverse of an A record. PTR records map an IP address to a DNS name, and they are mainly used to check if the server name is associated with the IP address from where the connection was initiated.
  302. Use an alias record, not a CNAME, for the zone apex of your hosted zone. CNAME records are not allowed at the zone apex in Amazon Route 53.
  303. Routing policy options are simple, weighted, latency-based, failover, and geolocation.
  304. Note that you can’t create failover resource record sets for private hosted zones.
  305. Geolocation routing: If you don’t create a default resource record set, Amazon Route 53 returns a “no answer” response for queries from those locations.
  306. You cannot create two geolocation resource record sets that specify the same geographic location.
  307. Memcached is a simple-to-use in-memory key/ value store that can be used to store arbitrary types of data.
  308. Redis is a flexible in-memory data structure store that can be used as a cache, database, or even as a message broker.
  309. Redis clusters  can support up to five read replicas to offload read requests.
  310. Some of the key actions an administrator can perform include CreateCacheCluster, ModifyCacheCluster, or DeleteCacheCluster. Redis clusters also support CreateReplicationGroup and CreateSnapshot actions, among others.
  311. Use Memcached when you need a simple, in-memory object store that can be easily partitioned and scaled horizontally.
  312. Use Redis when you need to back up and restore your data, need many clones or read replicas, or are looking for advanced functionality like sort and rank or leaderboards that Redis natively supports.
  313. Amazon CloudFront is optimized to work with other AWS cloud services as the origin server, including Amazon S3 buckets, Amazon S3 static websites, Amazon Elastic Compute Cloud (Amazon EC2), and Elastic Load Balancing.
  314. By default, objects expire from CloudFront cache after 24 hours.
  315. Cache behaviors are applied in order; if a request does not match the first path pattern, it drops down to the next path pattern. Normally the last path pattern specified is * to match all files.
  316. Signed URLs: Use URLs that are valid only between certain times and optionally from certain IP addresses.
  317. Signed Cookies Require authentication via public and private key pairs.
  318. Origin Access Identities (OAI) Restrict access to an Amazon S3 bucket only to a special Amazon CloudFront user associated with your distribution. This is the easiest way to ensure that content in a bucket is only accessed by Amazon CloudFront.
  319. If all or most requests come from a single location and/or via a corporate VPN, don’t use CloudFront.
  320. AWS Storage Gateway is a service connecting an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization’s on-premises IT environment and AWS storage infrastructure.
  321. Gateway-Cached volumes allow you to expand your local storage capacity into Amazon S3. While each volume is limited to a maximum size of 32TB, a single gateway can support up to 32 volumes for a maximum storage of 1 PB. 
  322. Gateway-Stored volumes allow you to store your data on your on-premises storage and asynchronously back up that data to Amazon S3 (512TB maximum: 32 volumes of up to 16TB each).
  323. When your tape software ejects a tape, it is archived on a Virtual Tape Shelf (VTS) and stored in Amazon Glacier.
  324. You’re allowed 1 VTS per AWS region, but multiple gateways in the same region can share a VTS.
  325. Simple AD is a Microsoft Active Directory-compatible directory from AWS Directory Service that is powered by Samba 4. Note that you cannot set up trust relationships between Simple AD and other Active Directory domains.
  326. AD Connector is a proxy service for connecting your on-premises Microsoft Active Directory to the AWS cloud without requiring complex directory synchronization or the cost and complexity of hosting a federation infrastructure. You can also use AD Connector to enable MFA by integrating it with your existing Remote Authentication Dial-Up Service (RADIUS)-based MFA infrastructure to provide an additional layer of security when users access AWS applications.
  327. Microsoft AD (AWS Directory Service for Microsoft Active Directory) is your best choice if you have more than 5,000 users and need a trust relationship set up between an AWS-hosted directory and your on-premises directories.
  328. In most cases, Simple AD is the least expensive option and your best choice if you have 5,000 or fewer users and don’t need the more advanced Microsoft Active Directory features.
  329. CMKs can never leave AWS KMS unencrypted, but data keys can leave the service unencrypted.
  330. All AWS KMS cryptographic operations accept an optional key / value map of additional contextual information called an encryption context.
  331. The specified context must be the same for both the encrypt and decrypt operations or decryption will not succeed.
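    A small boto3 sketch of an encryption context in practice; the key alias and plaintext are placeholders, and the same context map must be supplied on decrypt:

      import boto3

      kms = boto3.client("kms")
      context = {"department": "payments"}  # arbitrary key/value pairs

      ciphertext = kms.encrypt(
          KeyId="alias/my-app-key",
          Plaintext=b"card-token-123",
          EncryptionContext=context,
      )["CiphertextBlob"]

      # Decryption fails unless the identical encryption context is provided.
      plaintext = kms.decrypt(
          CiphertextBlob=ciphertext,
          EncryptionContext=context,
      )["Plaintext"]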
  332. Symmetric encryption algorithms require that the same key be used for both encrypting and decrypting the data.
  333. When you create a trail that applies to all AWS regions, AWS CloudTrail creates the same trail in each region.
  334. Amazon Kinesis Firehose receives stream data and stores it in Amazon S3, Amazon Redshift, or Amazon Elasticsearch Service.
  335. Amazon Kinesis Streams enable you to collect and process large streams of data records in real time.
  336. When a cluster (such as an Amazon EMR cluster) is shut down, instance storage is lost and the data does not persist.
  337. HDFS can also make use of Amazon EBS storage, trading the cost effectiveness of instance storage for the ability to shut down a cluster without losing data.
  338. AWS Data Pipeline is best for regular batch processes.
  339. Use Amazon Kinesis for data streams.
  340. AWS Snowball uses Amazon-provided shippable storage appliances shipped through UPS.
  341. The AWS Snowball is its own shipping container, and the shipping label is an E Ink display that automatically shows the correct address when the AWS Snowball is ready to ship. You can drop it off with UPS, no box required.
  342. AWS Import/Export Disk transfers data directly onto and off of storage devices you own using the Amazon high-speed internal network.
  343. AWS Import/Export Disk has an upper limit of 16 TB.
  344. AWS OpsWorks is a configuration management service that helps you configure and operate applications using Chef.
  345. The stack is the core AWS OpsWorks component. It is basically a container for AWS resources— Amazon EC2 instances, Amazon RDS database instances, and so on— that have a common purpose and make sense to be logically managed together.
  346. You can use AWS OpsWorks or IAM to manage user permissions. Note that the two options are not mutually exclusive; it is sometimes desirable to use both.
  347. You define the elements of a stack by adding one or more layers. A layer represents a set of resources that serve a particular purpose, such as load balancing, web applications, or hosting a database server.
  348. Layers depend on Chef recipes to handle tasks such as installing packages on instances, deploying applications, and running scripts.
  349. When you deploy an app, AWS OpsWorks triggers a Deploy event, which runs the Deploy recipes on the stack’s instances.
  350. AWS CloudFormation is a service that helps you model and set up your AWS resources so that you can spend less time managing those resources and more time focusing on your applications that run in AWS.
  351. A template is a text file whose format complies with the JSON standard. AWS CloudFormation uses these templates as blueprints for building your AWS resources.
  352. If stack creation fails, AWS CloudFormation rolls back your changes by deleting the resources that it created.
  353. You can use template parameters to tune the settings and thresholds in each region separately and still be sure that the application is deployed consistently across the regions.
  354. To update a stack, create a change set by submitting a modified version of the original stack template, different input parameter values, or both. AWS CloudFormation compares the modified template with the original template and generates a change set. The change set lists the proposed changes. After reviewing the changes, you can execute the change set to update your stack (see the boto3 sketch after this list).
  355. If you want to delete a stack but still retain some resources in that stack, you can use a deletion policy to retain those resources. If a resource has no deletion policy, AWS CloudFormation deletes the resource by default.
  356. AWS Elastic Beanstalk is the fastest and simplest way to get an application up and running on AWS. Developers can simply upload their application code, and the service automatically handles all of the details, such as resource provisioning, load balancing, Auto Scaling, and monitoring.
  357. An application version refers to a specific, labeled iteration of deployable code for a web application.
  358. An environment is an application version that is deployed onto AWS resources.
  359. Each environment runs only a single application version at a time.
  360. An environment configuration identifies a collection of parameters and settings that define how an environment and its associated resources behave. When an environment’s configuration settings are updated, AWS Elastic Beanstalk automatically applies the changes to existing resources or deletes and deploys new resources depending on the type of change.
  361. When an AWS Elastic Beanstalk environment is launched, the environment tier, platform, and environment type are specified.
  362. An environment tier whose web application processes web requests is known as a web server tier.
  363. An environment tier whose application runs background jobs is known as a worker tier.
  364. AWS Trusted Advisor draws upon best practices learned from the aggregated operational history of serving over a million AWS customers.
  365. AWS Config is a fully managed service that provides you with an AWS resource inventory, configuration history, and configuration change notifications to enable security and governance.
  366. AWS Config will generate configuration items when the configuration of a resource changes, and it maintains historical records of the configuration items of your resources from the time you start the configuration recorder.
  367. AWS has strategically placed a limited number of access points to the cloud to allow for a more comprehensive monitoring of inbound and outbound communications and network traffic.
  368. It is not possible for a virtual instance running in promiscuous mode to receive or “sniff” traffic that is intended for a different virtual instance.
  369. Attacks such as Address Resolution Protocol (ARP) cache poisoning do not work within Amazon EC2 and Amazon VPC.
  370. The AWS IAM API enables you to rotate the access keys of your AWS account as well as those of IAM users (see the boto3 sketch after this list).
  371. AWS passwords can be up to 128 characters long and contain special characters, giving you the ability to create very strong passwords.
  372. Not only does the signing process help protect message integrity by preventing tampering with the request while it is in transit, but it also helps protect against potential replay attacks.
  373. A request must reach AWS within 15 minutes of the timestamp in the request. Otherwise, AWS denies the request.
  374. Signature Version 4 provides an additional measure of protection over previous versions by requiring that you sign the message using a key derived from your secret access key instead of using the secret access key itself.
  375. When you create an IAM role using the AWS Management Console, the console creates an instance profile automatically and gives it the same name as the role to which it corresponds.
  376. When you use the AWS CLI, API, or an AWS SDK to create a role, you create the role and instance profile as separate actions, and you might give them different names.
  377. To launch an instance with an IAM role, you specify the name of its instance profile.
  378. Amazon CloudFront key pairs can be created only by the root account and cannot be created by IAM users.
  379. For IAM users, you must create the X.509 certificate (signing certificate) by using third-party software.
  380. You will also need an X.509 certificate to create a customized Linux AMI for Amazon EC2 instances. The certificate is only required to create an instance store-backed AMI (as opposed to an Amazon Elastic Block Store [Amazon EBS]-backed AMI).
  381. CloudTrail log file integrity validation: This feature is built using industry-standard algorithms: SHA-256 for hashing and SHA-256 with RSA for digital signing. This makes it computationally infeasible to modify, delete, or forge AWS CloudTrail log files without detection.
  382. Amazon EC2 currently uses a highly customized version of the Xen hypervisor, taking advantage of paravirtualization (in the case of Linux guests).
  383. The host OS executes in Ring 0. However, instead of executing in Ring 0 as most OSs do, the guest OS runs in the lesser-privileged Ring 1, and applications run in the least-privileged Ring 3.
  384. The AWS firewall resides within the hypervisor layer.
  385. Host Operating System Administrators with a business need to access the management plane are required to use MFA to gain access to purpose-built administration hosts.
  386. Amazon EC2 provides a mandatory inbound firewall that is configured in a default deny-all mode; Amazon EC2 customers must explicitly open the ports needed to allow inbound traffic.
  387. Amazon EBS replication is stored within the same Availability Zone, not across multiple zones; therefore, it is highly recommended that you conduct regular snapshots to Amazon S3 for long-term data durability.
  388. It is recommended that RDS backups to Amazon S3 be performed through the database management system so that distributed transactions and logs can be checkpointed.
  389. Security groups: The default group enables inbound communication from other members of the same group and outbound communication to any destination.
  390. With ACLs, you can only grant other AWS accounts (not specific users) access to your Amazon S3 resources.
  391. With bucket policies, you can grant users within your AWS account or other AWS accounts access to your Amazon S3 resources.
  392. Amazon Glacier stores files as archives within vaults.
  393. You can store an unlimited number of archives in a single vault and can create up to 1,000 vaults per region. Each archive can contain up to 40 TB of data.
  394. All network traffic entering or exiting your Amazon VPC via your IPsec VPN connection can be inspected by your on-premises security infrastructure, including network firewalls and intrusion detection systems.
  395. If you require your MySQL data to be encrypted while at rest in the database, your application must manage the encryption and decryption of data.
  396. When an Amazon RDS DB Instance deletion API (DeleteDBInstance) is run, the DB Instance is marked for deletion.
  397. To increase performance, Amazon Redshift uses techniques such as columnar storage, data compression, and zone maps to reduce the amount of I/O needed to perform queries. It also has a Massively Parallel Processing (MPP) architecture, parallelizing and distributing SQL operations to take advantage of all available resources.
  398. In Amazon Redshift, you grant database user permissions on a per-cluster basis instead of on a per-table basis. However, users can see data only in the table rows that were generated by their own activities; rows generated by other users are not visible to them.
  399. Amazon Redshift stores your snapshots for a user-defined period, which can be from 1 to 35 days.
  400. Amazon Redshift uses a four-tier, key-based architecture for encryption. These keys consist of data encryption keys, a database key, a cluster key, and a master key.
  401. Forward Secrecy uses session keys that are ephemeral and not stored anywhere, which prevents the decoding of captured data by unauthorized third parties, even if the secret long-term key itself is compromised.
  402. Using the Amazon ElastiCache service, you create a Cache Cluster, which is a collection of one or more Cache Nodes, each running an instance of the Memcached service.
  403. To allow network access to your Cache Cluster, create a Cache Security Group and use the Authorize Cache Security Group Ingress API or CLI command to authorize the desired Amazon EC2 security group (which in turn specifies the Amazon EC2 instances allowed); see the boto3 sketch after this list.
  404. IP-range based access control is currently not enabled for Cache Clusters.
  405. All clients to a Cache Cluster must be within the Amazon EC2 network, and authorized via Cache Security Groups.
  406. When launching job flows on your behalf, Amazon EMR sets up two Amazon EC2 security groups: one for the master nodes and another for the slaves.
  407. Amazon Kinesis is a managed service designed to handle real-time streaming of big data.
  408. Federated users are users (or applications) who do not have AWS accounts. With roles, you can give them access to your AWS resources for a limited amount of time.
  409. To begin using Amazon Cognito, you create an identity pool through the Amazon Cognito console. The identity pool is a store of user identity information that is specific to your AWS account.
  410. By default, Amazon Cognito creates a new role with limited permissions; end users only have access to the Amazon Cognito Sync service and Amazon Mobile Analytics.
  411. Data stored in Amazon EBS volumes is redundantly stored in multiple physical locations within the same Availability Zone as part of normal operation of that service and at no additional charge.
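
Notes 330 and 331 above describe the AWS KMS encryption context. Below is a minimal boto3 sketch of that behaviour; the key alias, the context keys, and the payload are illustrative values, not anything taken from the notes.

import boto3

kms = boto3.client("kms")

# Hypothetical encryption context: any key/value map can be used,
# but exactly the same map must be supplied again at decrypt time.
context = {"department": "finance", "purpose": "monthly-report"}

encrypted = kms.encrypt(
    KeyId="alias/example-app-key",   # illustrative CMK alias
    Plaintext=b"secret payload",
    EncryptionContext=context,
)

# Decryption succeeds only because the same context is passed back in;
# a different or missing context would cause the call to fail.
decrypted = kms.decrypt(
    CiphertextBlob=encrypted["CiphertextBlob"],
    EncryptionContext=context,
)
assert decrypted["Plaintext"] == b"secret payload"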
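
Note 354 describes updating a stack through a change set. A hedged boto3 sketch of that flow follows; the stack name, change set name, and template file are placeholders.

import boto3

cfn = boto3.client("cloudformation")

# Submit a modified template as a change set (names are placeholders).
with open("modified-template.json") as f:
    cfn.create_change_set(
        StackName="example-web-stack",
        ChangeSetName="widen-instance-type",
        TemplateBody=f.read(),
    )

# In practice, wait until the change set reaches CREATE_COMPLETE,
# then review the proposed changes before executing anything.
details = cfn.describe_change_set(
    StackName="example-web-stack",
    ChangeSetName="widen-instance-type",
)
for change in details["Changes"]:
    rc = change["ResourceChange"]
    print(rc["Action"], rc["LogicalResourceId"])

# Execute the change set to apply the update to the stack.
cfn.execute_change_set(
    StackName="example-web-stack",
    ChangeSetName="widen-instance-type",
)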
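
Note 370 mentions rotating access keys via the IAM API. A sketch of a typical rotation sequence with boto3 is below; the user name and the old key ID are placeholders.

import boto3

iam = boto3.client("iam")
user = "example-user"              # placeholder IAM user
old_key_id = "AKIAOLDKEYEXAMPLE"   # placeholder key being rotated out

# 1. Create a second access key and switch the application over to it.
new_key = iam.create_access_key(UserName=user)["AccessKey"]
print("New key:", new_key["AccessKeyId"])

# 2. Deactivate (rather than delete) the old key so you can roll back.
iam.update_access_key(UserName=user, AccessKeyId=old_key_id, Status="Inactive")

# 3. Once nothing uses the old key any more, delete it.
iam.delete_access_key(UserName=user, AccessKeyId=old_key_id)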
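
Notes 403 to 405 describe authorising EC2 security groups against a Cache Security Group. A boto3 sketch of that (pre-VPC) model is below; the group names and the account ID are illustrative.

import boto3

elasticache = boto3.client("elasticache")

# Create a cache security group (applies to clusters launched outside a VPC).
elasticache.create_cache_security_group(
    CacheSecurityGroupName="example-cache-sg",
    Description="Allows the web tier to reach Memcached",
)

# Authorise an EC2 security group: instances in that group can then
# connect to any Cache Cluster associated with this cache security group.
elasticache.authorize_cache_security_group_ingress(
    CacheSecurityGroupName="example-cache-sg",
    EC2SecurityGroupName="web-tier-sg",        # illustrative EC2 security group
    EC2SecurityGroupOwnerId="123456789012",    # illustrative AWS account ID
)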

Should you build, rent or buy your software?

The target audience of this article is any company that wants to sell its products online.

Not too long ago, the debate was whether to build or buy, but with the advent of cloud computing, the question that makes more sense now is: build, rent or buy? Let's clarify these definitions.

Build
You build your own software, reuse open source libraries, and deploy your solution on either open source or paid servers. You write every line of code that is specific to your business/website. The servers/infrastructure are outside the remit of this article.

Rent
You rent software developed by someone else. You don’t write a single line of code. Blogger & Shopify are two examples of this kind of software.

Buy
You buy software written by a big, reputed company, e.g. Oracle, IBM, etc. COTS (Commercial Off The Shelf) is often the term used for this kind of software. You then hire consultants or retrain your own staff to customise the COTS solution to fit your needs. There is an element of Build in this, but nowhere near as elaborate as the ‘Build’ option in itself.

Companies move in and out of these paradigms at fairly regular intervals (~5-10 years). Whatever approach a company plans to take, it is one of the most expensive decisions it will make and will keep impacting the company's bottom line for many years to come. So, it is imperative that it is discussed properly.

Is there a right approach? Or, in other words, can we have some standard rules by which this decision can be made? I am sure there are many views out there on this topic, so here is one more to add to that mix.

So, what is the problem? Why is it so hard to build software, as opposed to, let's say, cars? The car assembly line pumps out car after car with identical quality. Why is it impossible to apply the same principle to software?

Before we attempt to come up with a framework for evaluating those choices, let's look into some definitions which have a great deal of impact on those choices.

Software
No two pieces of software are identical. You can drive any Ford Fiesta or Porsche (if you prefer) in the world, and they would be identical to each other. You don’t have to learn or unlearn anything to switch cars, which makes it easy & predictable to drive any car and minimises any risk introduced by the switch.

Software Developer
No two software developers are identical. One of the most undervalued entities in the software industry is a ‘Good software developer’.
What/who is a good software developer? If I had to answer that in one line, I would say that a good software developer is someone who spends more time writing new software than fixing defects in the code developed by him/her.
Any software developed at any point in time will have defects. However, not all defects are the same, and it is important to understand the subtle differences between them if we are to fully appreciate a developer.
A technical defect (e.g., a NullPointerException) is purely a developer’s fault, whereas a behavioural (i.e., functional) defect is not. The responsibility in the second case has to lie with the person responsible for finalising and clarifying business requirements. If you have vague requirements, you will have behavioural defects, which will only come to light once the system is put to use (either at acceptance testing or with real customers). Too late. As they say, the cost of fixing defects increases exponentially with passing time. [1]

In my view, a good developer is worth 10 average developers, and infinitely more valuable than bad developers. A bad developer will set you back on your business plans, and the tangible cost of that can never be fully worked out in terms of $/£.

Okay, that's all good, but as a business, how does it help me make my decision? What do I need to do to make use of all that information and chart a course for my eCommerce venture?

Let's look into each type of business:

Small player, offline only
This category includes the SME player (< $500 million annual turnover) which has a brick-and-mortar structure in place but no online presence, i.e. it is starting afresh to sell its wares online.

Selling online is not just about the technology but also the business processes that go along with it. You need not only to establish stable technology to successfully cater to visitors, but also to ensure that your business processes are geared to meet the online customer and understand their nuances. Technology keeps changing, but the business process, i.e. the roles and responsibilities of a business user, is something that stays more or less the same over a period of time.

As a small player starting out in the eCommerce venture, you should aim to establish your business processes first. Also, since you are new to the technology field, you should go for a low-cost option in the short term with minimal impact on your existing business. Given these constraints, the option that best fits your needs is the Rent option. You can rent software which has already been developed by someone else and push your own product data set through it. This will give you a feel for the online space and, at the same time, keep your capital expenditure on the lower side. While you are using a stable technology (which you did not have to spend time developing), you can take time to develop & streamline your business processes.
This should give you a couple of years to establish yourself in the online space (or not). And once you have tested the waters, you can get adventurous or not, as the case may be.

Small player, online/offline
This category includes SME players which have offline shops as well as an online presence. If you are a small player and have been operating in the online space for some time, then you must have had time to establish your business processes. So, let's assume that you have half-decent business processes in place. The reason you are looking to change the technology could be that your current technology has become out of date, or you simply want to grow beyond your current customer base, which your current solution is unable to support.
Either way, you need to understand your current technology team to be able to make a decision.

If you have a great technology team, then you should go for the Build option using an API-first model [2]. You could start with a single API service, e.g. a ProductService, and ensure that this service works before applying the same principle to other services (see the sketch at the end of this section).
On the other hand, if you don’t have high confidence in your current technical team, then you could go for the Rent/Buy option. Consider whether you are able to invest in your team and improve its overall quality through training or new hires.
If your business processes are complex and you have a lot of customisations, then Buy is the only realistic option. With the Rent option, you will eventually hit a roadblock where you are unable to do things that you need for your business, and by that time it will be too late. So bite the bullet and buy. Or else simplify your business processes (if that works).
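
To make the API-first starting point above a little more concrete, here is a minimal sketch of what a first ProductService might look like, assuming Python and Flask; the endpoints and fields are purely illustrative. Once a service like this is stable and consumed by the storefront, the same pattern can be repeated for orders, customers, and so on.

from flask import Flask, abort, jsonify

app = Flask(__name__)

# Illustrative in-memory catalogue; a real service would sit on a database.
PRODUCTS = {
    "sku-001": {"name": "Brown bag", "price": 12.99},
    "sku-002": {"name": "Blue scarf", "price": 8.50},
}

@app.route("/products", methods=["GET"])
def list_products():
    return jsonify(PRODUCTS)

@app.route("/products/<sku>", methods=["GET"])
def get_product(sku):
    product = PRODUCTS.get(sku)
    if product is None:
        abort(404)
    return jsonify(product)

if __name__ == "__main__":
    app.run(port=8080)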

Big player, offline only
If you are a big player with only an offline presence & no online presence (unlikely but possible), then you should invest in hiring the best technical team and go for the Build option using an API-first architecture.
The most important thing for you is that your solutions are stable and scalable, and only a great technical team can ensure that.
You can also consider the Buy option, but in my opinion its benefits do not outweigh the lock-in it introduces, given the pace of advances being made in technology across the software stack.

Big player, online/offline
TODO

References

[1] http://www.agilemodeling.com/essays/costOfChange.htm
[2] http://martinfowler.com/bliki/MonolithFirst.html

Gaffer, spy software used by the UK Government, has been open sourced.

If you have a GitHub account, you can fork the project from the link below:

https://github.com/GovernmentCommunicationsHeadquarters/Gaffer/

 

Gaffer is essentially a framework for processing large amounts of interconnected data and building graphs based on that data. These graphs can in turn be used to analyse the relationships between those objects.

(AN = alphanumeric)

For indexing purposes, Oracle Commerce divides characters into 3 categories:

  • AN characters
  • Non-AN characters (as configured in special chars config)
  • Other non-AN chars not configured in the above.

Case folding
Oracle Commerce search operations are not case-sensitive, so alpha chars are always included in lowercase form at the time of indexing.

Indexing of non-AN chars
Non-AN chars which are not specified as searchable in the config are treated differently depending upon whether they are considered punctuation or symbols.

  • Punctuation chars are treated as white space. The following are punctuation chars:
    ! @ # & ( ) - [ { } ] : ; ‘ , ? / *
    In a multi word search with words separated by punctuation, the word order is preserved as if it was a phrase search.
  • Symbols are also treated as white space, but word order is not preserved.
    If a symbol character is next to a punctuation character, it is ignored. 
    The following are symbol characters
    ` ~ $ ^ + = < > "

Oracle Commerce Search supports the following search modes:

 

  • MatchAll
    MatchAll means all keywords specified by the user should be present in the record, e.g. if a user searches for brown bag, the result should have both brown AND bag in the text.
  • MatchPartial
    Matches partial keywords, e.g. brown bag will return all records which have EITHER brown OR bag in their properties. In addition, the settings specified in the ‘match at least’ and ‘omit at most’ parameters must also be satisfied.
  • MatchAny
    Matches any of the terms in the search query string.
  • MatchAllAny
    The engine first tries MatchAll, and if no results are found, it falls back to MatchAny.
  • MatchAllPartial
    First, the engine tries to find results for MatchAll. If no results are found, then MatchPartial is used.
  • MatchPartialMax
    First, the engine tries to find results for MatchAll. If no results exist, the search is executed with one less term than the original, and so on, until results are found. This mode is subject to ‘match at least’ and ‘omit at most’.
  • MatchBoolean
    Allows users to specify complex expressions.

    ——————————————————————-

Ntx
Query parameter to specify the match mode, e.g. Ntx=mode+matchall (see the example below)

Dx
Query parameter to use in dimension search
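
As a small illustration of how Ntx is passed on the query string, here is a sketch of building a search URL; the host and the Ntk/Ntt parameters are assumptions for the example, and only Ntx=mode+matchall comes from the notes above.

# Sketch only: the host and the Ntk/Ntt parameters are assumed for
# illustration; Ntx=mode+matchall is the match-mode setting described above.
base = "http://store.example.com/browse"
query = "Ntk=All&Ntt=brown+bag&Ntx=mode+matchall"
search_url = f"{base}?{query}"
print(search_url)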

——————————————————————-

Black Friday is the day following Thanksgiving Day in the United States (the fourth Thursday of November). Since the early 2000s, it has been regarded as the beginning of the Christmas shopping season in the US, and most major retailers open very early (and more recently during overnight hours) and offer promotional sales. Black Friday is not an official holiday, but California and some other states observe “The Day After Thanksgiving” as a holiday for state government employees, sometimes in lieu of another federal holiday such as Columbus Day. Many non-retail employees and schools have both Thanksgiving and the following Friday off, which, along with the following regular weekend, makes it a four-day weekend, thereby increasing the number of potential shoppers. It has routinely been the busiest shopping day of the year since 2005, although news reports, which at that time were inaccurate, have described it as the busiest shopping day of the year for a much longer period of time. Similar stories resurface year upon year at this time, portraying hysteria and shortage of stock, creating a state of positive feedback.

Where the mind is without fear and the head is held high,
Where knowledge is free,
Where the world has not been broken up into fragments,
by narrow domestic walls,
Where words come out from the depth of truth,
Where tireless striving stretches its arms towards perfection,
Where the clear stream of reason has not lost its way,
Into the dreary desert sand of dead habit,
Where the mind is led forward by thee,
Into ever-widening thought and action,
Into that heaven of freedom, my father, let my country awake.

This post was created before the elections (April 2014) but was left unfinished. With all the hustle & bustle, it is probably worth putting the views below out in the open.

This is my take on the pros and cons of AAP contesting a large number of Lok Sabha (LS) seats. Please note that the views expressed are my own. I am not affiliated to any political party (but I remain optimistic about AAP as a force for change).

Pros

Exposure to maximum voter base

By contesting a large number of seats, AAP has ensured that it gets as much exposure and visibility as possible. Had it contested a smaller number of seats, the exposure would have been less.

It can be argued that, had it contested a smaller number of seats, it could have been more effective.

However, contesting a large number of seats is probably very much in rhythm with AAP’s philosophy of “This is YOUR election, not MINE”. Also, contesting a large number of seats has tested the waters at all levels, e.g. in Punjab, where AAP is seeing a vast amount of support. Had it picked only a few seats to contest, say 100, Punjab may or may not have figured in that list.

Each LS seat has the same value

The seat of Varanasi carries the same weight as Amritsar in LS, whereas in reality it may be much easier to contest and win Amritsar than Varanasi.

Fast track to becoming a ‘national’ party

Cons

Budget

Less control

Difficulty in identifying like-minded personnel

 

March 04, 2015

Corruption in community centres:

AAP MLA Alka Lamba exposes the scandal in the running of the Community Centres. Good work!


Corruption in MCD & PWD

AAP MLA Alka Lamba exposes the ‘open’ corruption in MCD & PWD.


Corruption in Housing societies

Housing societies to be brought under the purview of RTI.

http://timesofindia.indiatimes.com/city/delhi/Delhi-RTI-lens-likely-on-housing-societies/articleshow/46449989.cms


The gamification problem below was provided as part of the Gamification course conducted by the very honourable Mr. Kevin Werbach. Many thanks for conducting the course; it was thoroughly enjoyable.

Gamification Problem

You are approached by Rashmi Horenstein, the CEO of ShareAll, a prominent company in the hot collaborative consumption space. (If you aren’t familiar with the concept, some good resources are CollaborativeConsumption.com and the March 9, 2013 cover story in the Economist.) She knows you are one of the top experts on gamification, which she has heard can revolutionize business. She asks you to present a proposal for a gamified system to take her business to the next level.

ShareAll’s mission is to make shared use of products and services as common as individual purchases. It follows the path of companies such as AirBnB, Buzzcar, and Uber, which allow sharing of particular products (cars, housing, etc). ShareAll’s patented technology makes it easy for consumers and business to share any product or service. ShareAll has also developed a global virtual currency, called Shares, which can be used to purchase access to any asset in the system. Shares can be exchanged for real money, and users can generate more Shares by sharing items or volunteering their time to complete tasks for others.

ShareAll charges a small transaction fee whenever Shares are generated, traded, or spent. Therefore, the more activity, the more money ShareAll makes. Horenstein tells you that she cares about the social benefits of sustainability. However, ShareAll is a for-profit company, with investments and partnerships from some of the world’s largest corporations, so profits matter. Horenstein believes gamification could significantly help ShareAll’s business. She is eager to read your ideas.

Gamification Solution Ideas

Please note that the ideas below are my own & should not be copied without due permission.

Business objectives

  • Build a user base 
    As the company wants to become a household name, it is essential that the company builds up a BIG user base. In addition, the business model of the company necessitates that the company has as big a user base as possible. Therefore it should be the first big item on the list of business objectives.
  • Establish credibility 
    In addition to establishing a user base, the company needs to ensure that it is seen as a credible company and does not have trust issues. Trust and an increasing user base should complement each other, wherein the more trusted the brand becomes, the more people use it, and vice versa.
  • Social sharing
    Since the company wants to become a household name, it is imperative that it gets in front of as many ‘eyeballs’ as possible. Therefore, encouraging social behaviour should be one of the business objectives.

Target behaviors

  • Share on social networks, i.e. Facebook, Twitter, et al
    We want customers to share as much of their “ShareAll activity” on social networks as possible. This is one of the key behaviours expected of our Players.
  • Reasoning for not using sharing for a specific product type or service type
    As sharing products/services (not Facebook sharing) is the key underlying factor, we want Players to reveal their reasons for not sharing certain items/services. This will help us spot a pattern for a certain product/service, if one exists.
  • Inventory building
    We expect our Players to put up products/services for sharing, thus building up the inventory for the ShareAll system. This behaviour should be encouraged as much as possible, as the inventory is the heart & soul of the system. 

Player descriptions

  • The Owner/Sharer
    This player type represents the section of players, who own items, and put them up for sharing with others. In terms of services, these are Players who seek services from others. 
  • The User
    This player type represents the section of players who will be users of the products, and not necessarily the owners. Also, they will be keen to offer services to others.
  • The Money maker
    This category represents Players whose objective is to make money out of the system, i.e. exchange virtual Shares for real money. Among others, some housewives/househusbands could probably fall into this category.
  • The Ethical sharer
    This category represents Players who share because of ethical reasons i.e. help others save money, if possible etc.
  • The compulsive sharer/user
    This category represents people who use the system because of their compulsions to do the activity itself. They are not necessarily interested in the money aspect or the ethical aspect, but purely the game.

Activity loops

  • Shares – Global currency
    A user would be able to earn shares by using the system i.e. when the user puts a product up for sharing with others, he will get certain shares. He also earns shares every time his service/product is ‘consumed’ by others.
  • Product / service quality
    Each share will contribute towards the overall quality of the product / service being offered. i.e. a 5-star system can be devised to help users gauge the relative quality of the item (product/service) vis-a-vis other competing items. When the person using the product/service has availed the benefit, he would have to provide 1-5 stars, depending upon his experience. Based on this, the overall quality of the product/service can be derived.
  • Product/service use feedback
    Every time a product / service is used, the owner of the item would also provide feedback to establish the quality of returned items / services. e.g. if the user damaged the product, then that should be factored in the feedback.
    For this purpose, a user should have a feedback rating assigned as well. This will be similar in concept to eBay seller rating.
  • Encouraging site visits
    In order to encourage site visits and make the users stick around, we should also think about giving some shares to the ‘browsers’. This is similar in concept to the points that Samsung gives to users for staying on the site. The longer & more frequent the visits, the better the business.
  • Encouraging sharing
    Users can earn badges by using ShareAll at certain levels. This will appeal to those people who are badge-lovers. At every X level, you will get a badge, i.e. after your item has been shared 10 times, or you have used the ShareAll service 10 times, you accumulate badges, etc. Also, the real value of badges would be that, with every badge, the number of Virtual Shares you get for sharing/using an item/service increases (maybe marginally, but nonetheless).

Fun
In order to make this process fun, we should think about utilising intrinsic as well as extrinsic motivations, e.g. if a certain individual is an expert in a service, and other users need that service, then that should be portrayed as helping others, not just earning Virtual Shares. In addition, the user's upfront preferences and his activity on the site should be used to target activities for individuals.

  • Share your experience
    Since you are the expert, help those who need this service. If a person is a real expert (and likes helping people), then he/she should get satisfaction out of this.
  • Sharing to maximise Virtual Shares
    People who get motivated by money, should be targeted in that way e.g. Get 10 more Virtual Shares to go to the next level.
  • Connecters
    One of the fun items which can be introduced is a connecting game, wherein a third person is asked to connect the buyer & seller and can earn points for doing so. This would cater to the situation wherein the seller has the product and the buyer wants the product, but neither of them has the time to browse and connect the dots.

Tooling

  • Web / Mobile App 
    The website / mobile app will allow the user to do the following
    – Register for the ShareAll system
    – View your status i.e. the number of shares you hold, and your sell/purchase history
    – View the current items/services available
    – Register interest in a service/product
    – Be able to bid below the item/service’s prescribed ‘share value’
    The company should look at streamlining the process workflows and systems so the user can interact with the ShareAll system irrespective of the platform (desktop, mobile, smartphone, etc.), i.e. a truly cross-platform system.
  • Videos
    In order to make users familiar with the workflow, short videos describing the workflow of sharing products and services should be utilised. The videos can revolve around the target player personas, wherein each video is focussed on a specific persona.
  • Insurance
    For products with some value, the company should look to get insurance. This will alleviate the fears of those individuals who would otherwise not go for sharing those products/services.

If ShareAll follows the above-mentioned steps, it will not only help them become a household name in the collaboration space, but also create a valuable company which will be the envy of all.