[cs615asa] AWS Data Event Summary

Tiankai Xie txie4 at stevens.edu
Mon May 1 11:17:49 EDT 2017


Hello everyone,

I would like to share a summary of one of the sessions from an event I 
attended: AWS Optimizing Storage for Big Data/Analytics Workloads.

As you may know, AWS has a series of products that handle data for 
different purposes (EMR for data computing, Redshift for data warehousing, 
Kinesis for streaming, DynamoDB for NoSQL, Elasticsearch for search, etc.). 
How the data is stored should therefore be taken into consideration for better 
performance. AWS also provides several storage options: S3 (Multi-tenant 
Key-store Native API), EFS (Shared/Distributed POSIX NFS/SMB), and EBS 
(Single-tenant Block-store). 

This session focused on EBS (Elastic Block Store). 

EBS provides 4 volume types: gp2, io1, st1 and sc1. Which volume type to 
choose depends on what matters more for your workload: IOPS (input/output 
operations per second) or throughput. If you need better IOPS, the SSD-based 
volumes (gp2, io1) are good choices. If throughput matters more, the 
HDD-based volumes (st1, sc1) are the ones to use. There are further options 
if you have more demanding performance requirements. For example, for 
extremely high IOPS or latency below 1 ms, you should consider i2 instances 
rather than an EBS volume type (the slides linked below may be more 
intuitive).
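
The selection rule above can be sketched in a few lines of Python. This is 
only an illustration of the rule of thumb from the talk, not an official 
sizing tool; the function name, the "priority" argument and the 
"needs_sub_ms_latency" flag are made up for this example.

```python
from typing import List

# Volume families as summarized above: SSD-backed types for IOPS,
# HDD-backed types for throughput.
EBS_VOLUME_FAMILIES = {
    "iops": ["gp2", "io1"],
    "throughput": ["st1", "sc1"],
}


def suggest_volume_types(priority: str, needs_sub_ms_latency: bool = False) -> List[str]:
    """Return candidate storage choices for a workload (hypothetical helper).

    priority: "iops" or "throughput", whichever matters more.
    needs_sub_ms_latency: True for extremely high IOPS or latency below 1 ms,
    where the talk suggested i2 instances instead of an EBS volume type.
    """
    if needs_sub_ms_latency:
        return ["i2"]
    key = priority.lower()
    if key not in EBS_VOLUME_FAMILIES:
        raise ValueError("priority must be 'iops' or 'throughput'")
    return list(EBS_VOLUME_FAMILIES[key])


if __name__ == "__main__":
    print(suggest_volume_types("iops"))
    print(suggest_volume_types("throughput"))
    print(suggest_volume_types("iops", needs_sub_ms_latency=True))
```

Choosing within a family (gp2 vs. io1, or st1 vs. sc1) still depends on 
workload details that this sketch does not model.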

To summarize which storage backs which kind of workload: S3 is for Big 
Data & Analytics, and EBS is used for Data Warehouses, Search & Indexing, 
Transactional & NoSQL Databases and Streaming data. EBS data is in turn 
ultimately managed with S3 (EBS snapshots, for example, are stored in S3).

A more interesting piece of news on this topic is EBS Elastic Volumes, a new 
feature in 2017. Elastic Volumes lets you modify the configuration of live 
volumes attached to instances in a simple, flexible, non-disruptive and 
automated way. With this feature we can dynamically increase the size, tune 
the performance and change the type of existing and new current-generation 
volumes. However, the feature currently has some limitations. For example, 
you can modify a volume only once every 6 hours, it is supported only for 
current-generation volumes (gp2/io1/st1/sc1), and live changes are supported 
only for volumes attached to current-generation instances.
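
As a rough sketch of how a script might respect these limitations before 
requesting a change, the Python below checks the 6-hour window and the volume 
type, then builds a request dictionary. The helper names, the volume ID and 
the idea of tracking the last modification time yourself are assumptions for 
illustration; only the dictionary keys (VolumeId, Size, VolumeType, Iops) 
follow the parameters of the EC2 ModifyVolume call.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Optional

# Limits as stated in the talk summary above.
MODIFY_COOLDOWN = timedelta(hours=6)
CURRENT_GEN_TYPES = {"gp2", "io1", "st1", "sc1"}


def can_modify(last_modified: Optional[datetime], now: datetime) -> bool:
    """True if the volume was never modified or the 6-hour window has passed."""
    return last_modified is None or now - last_modified >= MODIFY_COOLDOWN


def build_modify_request(
    volume_id: str,
    size_gib: Optional[int] = None,
    volume_type: Optional[str] = None,
    iops: Optional[int] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a volume modification (hypothetical helper)."""
    if volume_type is not None and volume_type not in CURRENT_GEN_TYPES:
        raise ValueError("only current-generation volume types are supported")
    request: Dict[str, Any] = {"VolumeId": volume_id}
    if size_gib is not None:
        request["Size"] = size_gib
    if volume_type is not None:
        request["VolumeType"] = volume_type
    if iops is not None:
        request["Iops"] = iops
    if len(request) == 1:
        raise ValueError("nothing to change")
    return request


if __name__ == "__main__":
    # "vol-0123456789abcdef0" is a placeholder ID, not a real volume.
    last = datetime(2017, 5, 1, 0, 0)
    now = datetime(2017, 5, 1, 7, 0)
    if can_modify(last, now):
        print(build_modify_request("vol-0123456789abcdef0", size_gib=200, volume_type="gp2"))
```

With boto3 installed and credentials configured, the resulting dictionary 
could be passed along as boto3.client("ec2").modify_volume(**request); that 
call is left out here so the sketch runs without an AWS account.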

Finally, I think AWS is a powerful platform, and more and more features will 
be added to improve computing performance. There is a lot to learn in this 
field, so I hope to keep learning and to keep you updated.

Thanks for reading!

-Tiankai Xie (txie4)

Links for “AWS Storage Day”:

Optimizing Storage for Big Data/Analytics Workloads: 
https://www.slideshare.net/AmazonWebServices/optimizing-storage-for-big-data-analytics-workloads

Migrating Large Scale Data Sets to the Cloud: 
https://www.slideshare.net/AmazonWebServices/migrating-large-scale-data-sets-to-the-cloud

Best Practices for Building a Data Lake on AWS: 
https://www.slideshare.net/AmazonWebServices/best-practices-for-building-a-data-lake-on-aws

Supercharging the Value of Your Data with S3: 
https://www.slideshare.net/AmazonWebServices/supercharging-the-value-of-your-data-with-amazon-s3

Strategic Uses for Cost Efficient Long-Term Cloud Storage: 
https://www.slideshare.net/AmazonWebServices/strategic-uses-for-cost-efficient-longterm-cloud-storage

Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFS: 
https://www.slideshare.net/AmazonWebServices/simple-scalable-and-highly-durable-nas-in-the-cloud-amazon-efs
