Kinesis
What is streaming data?
Streaming data is data that is generated continuously by thousands of data sources, which typically send in their data records simultaneously and in small sizes (on the order of kilobytes).
Examples: purchases from online stores (Amazon), stock prices, game data, social network data, geospatial data (Uber), IoT sensor data.
What is Kinesis?
Amazon Kinesis is a platform on AWS that you send your streaming data to.
Kinesis makes it easy to load and analyze streaming data, and also lets you build your own custom applications for your business needs.
Types of Kinesis
- Kinesis Streams
- Kinesis Firehose
- Kinesis Analytics

- You have all your devices (EC2, mobile phone, laptop, IoT, etc.).
- These are your data Producers.
- They stream their data to Kinesis Streams, which is a place to store this data.
- By default, data is stored for 24 hours, but retention can be extended to up to 7 days (newer long-term retention settings allow up to 365 days).
- Data is contained in shards and you might have shards for different purposes.
- EC2 instances (Consumers) can analyze this data stored in the shards.
- Once your EC2 instances have analyzed this data, they store the results in DynamoDB, S3, EMR, Redshift, etc.
- Kinesis Streams consist of shards:
- Each shard supports up to 5 transactions per second for reads, up to a maximum total data read rate of 2 MB per second, and up to 1,000 records per second for writes, up to a maximum total data write rate of 1 MB per second (including partition keys).
- The data capacity of your stream is a function of the number of shards that you specify for the stream.
- The total capacity of the stream is the sum of the capacities of its shards.
- For the exam, remember: if you hear about shards, think Kinesis Streams, as only Kinesis Streams have shards.
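
The per-shard limits above can be turned into a rough sizing calculation. The following is a minimal sketch in Python, not an AWS API: the function names `stream_capacity` and `shards_needed` are hypothetical helpers introduced here, and the constants are simply the per-shard figures from the notes.

```python
import math

# Per-shard limits as listed in the notes above.
SHARD_WRITE_MB_PER_SEC = 1.0
SHARD_WRITE_RECORDS_PER_SEC = 1000
SHARD_READ_MB_PER_SEC = 2.0


def stream_capacity(shard_count):
    """Total stream capacity is the sum of the capacities of its shards."""
    return {
        "write_mb_per_sec": shard_count * SHARD_WRITE_MB_PER_SEC,
        "write_records_per_sec": shard_count * SHARD_WRITE_RECORDS_PER_SEC,
        "read_mb_per_sec": shard_count * SHARD_READ_MB_PER_SEC,
    }


def shards_needed(write_mb_per_sec, write_records_per_sec, read_mb_per_sec):
    """Smallest shard count that satisfies all three limits (hypothetical helper)."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_sec / SHARD_WRITE_MB_PER_SEC),
        math.ceil(write_records_per_sec / SHARD_WRITE_RECORDS_PER_SEC),
        math.ceil(read_mb_per_sec / SHARD_READ_MB_PER_SEC),
    )
```

For example, a workload that writes 3.5 MB and 2,500 records per second and reads 5 MB per second is bounded by the write throughput, so `shards_needed(3.5, 2500, 5)` gives 4 shards.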

- Our data producers send their data to Kinesis Firehose.
- Inside Kinesis Firehose there is no persistent storage; data is processed as it comes in.
- It is optional to have Lambda functions inside your Kinesis Firehose.
- As soon as data comes in, it triggers a Lambda function that does something with the data and then outputs it somewhere safe, such as an S3 bucket.
- If you want to output your data to Redshift, it must first be written to S3, and from S3 it is then loaded into Redshift.
- Firehose can deliver data to Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, and Splunk.
- Difference between Kinesis Firehose and Kinesis Streams: Streams have shards and your data is persisted, whereas Firehose has no data persistence, so you have to do something with the data on the fly.
- Kinesis Analytics works with Kinesis Streams and Kinesis Firehose.
- It can analyze data on the fly in either service and then store the results in S3, Redshift, or an Elasticsearch cluster.
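
The optional Lambda step in Kinesis Firehose mentioned above can be sketched as follows. This is a minimal illustration, assuming the incoming records are JSON: Firehose passes the function a batch of base64-encoded records, and the function returns each one with the same `recordId`, a `result` status, and the transformed data. The `processed` field added here is a hypothetical transformation chosen only for the example.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each incoming Firehose record and hand it back for delivery."""
    output = []
    for record in event["records"]:
        try:
            # Record data arrives base64-encoded; this sketch assumes JSON objects.
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical transformation: tag each record as processed.
            payload["processed"] = True
            line = json.dumps(payload) + "\n"
            data = base64.b64encode(line.encode("utf-8")).decode("utf-8")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except (ValueError, TypeError):
            # Not a JSON object: return the original data and flag the failure.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Records marked `Ok` continue on to the configured destination, such as an S3 bucket. A record can also be returned with the result `Dropped` if it should be discarded intentionally.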