[cs615asa] Yanqiao Zhan's HWN

Sat Apr 11 12:22:47 EDT 2015

*Assignment N CS615*

Yanqiao Zhan Stevens Institute of Technology

On April 8, I went to Manhattan to attend a meetup, “Using In-Memory,
Data-Parallel Computing for Operational Intelligence”. The speaker was
Dr. William L. Bain, the founder and CEO of ScaleOut Software, who is an
expert in handling data. The meeting was held at 99 Madison Ave, the office
of ThoughtWorks, an energetic and warm place.

The meetup attracted me for two reasons. a) As data volumes keep growing,
how should we store data so that I/O stays efficient? Traditional databases
and hard disk storage can’t keep up with the sharply increased demand.
b) How should a parallel computing architecture be designed? As system
administrators, we may need to design and implement big data platforms
that support high-speed data reads and writes. I’m sure that big data
platforms and data centers will become an essential topic in the system
administration field.

From this event, I learned why in-memory and parallel computing are
necessary, and why operational data needs faster, more scalable processing.
Suppose we need to deal with several TB or PB of data, and the data must be
processed in a very short time. How should the data be organized
efficiently? In Dr. Bain’s design, some data is stored in memory as an
in-memory data grid (IMDG), and other data is stored in disk-based
persistent storage. In memory, the data is organized with an
object-oriented model, and each data object is accessed by a unique key.
According to the talk, network and disk I/O delays can be almost
eliminated this way, so read/write throughput can scale almost linearly.
In short, an IMDG has the following benefits: faster access time, scalable
throughput, high availability, shared access to data, global data access
and fast data analysis.
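
To make the key-based access described above more concrete, here is a
minimal sketch in Java. It is not the ScaleOut API and not the code shown
in the talk; the class ToyDataGrid, its methods, and the keys are all
hypothetical names chosen for illustration. It assumes a single process,
with each "partition" standing in for the memory of one grid server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy in-memory data grid (hypothetical): objects live in memory and are found by unique key. */
final class ToyDataGrid<V> {
    // Each map stands in for the memory of one grid server (an assumption of this sketch).
    private final List<Map<String, V>> partitions = new ArrayList<>();

    ToyDataGrid(int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("need at least one partition");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ConcurrentHashMap<>());
        }
    }

    // The key's hash decides which partition owns the object.
    private Map<String, V> partitionFor(String key) {
        return partitions.get(Math.floorMod(key.hashCode(), partitions.size()));
    }

    void put(String key, V value) {
        partitionFor(key).put(key, value);
    }

    // Returns null when no object is stored under the key.
    V get(String key) {
        return partitionFor(key).get(key);
    }

    // Total number of objects across all partitions.
    int size() {
        int total = 0;
        for (Map<String, V> partition : partitions) {
            total += partition.size();
        }
        return total;
    }

    public static void main(String[] args) {
        ToyDataGrid<Integer> grid = new ToyDataGrid<>(4);
        grid.put("cart:alice", 3);   // hypothetical key: items in a shopping cart
        grid.put("cart:bob", 1);
        System.out.println(grid.get("cart:alice")); // prints 3
    }
}
```

Because every lookup goes straight from the key to one partition, no
partition has to be scanned, and adding partitions spreads both the data
and the requests.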

Another important part is parallel computing. With the help of parallel
computing, several data grids can be processed at the same time. After each
thread finishes its computation, a master thread merges the results. This
is the same idea as in the MapReduce framework.
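
The compute-then-merge pattern described above can be sketched in Java as
follows. This is only an illustration under my own assumptions, not the
code from the talk: the class PartitionedSum and the method parallelSum are
hypothetical names, each inner list stands in for the data held by one
grid, and the "analysis" is just a sum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical example: one task per partition, then the caller merges the partial results. */
final class PartitionedSum {

    static long parallelSum(List<List<Integer>> partitions) {
        if (partitions.isEmpty()) {
            return 0L;
        }
        ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
        try {
            // "Map" step: each task only reads its own partition.
            List<Callable<Long>> tasks = new ArrayList<>();
            for (List<Integer> partition : partitions) {
                tasks.add(() -> {
                    long partial = 0;
                    for (int value : partition) {
                        partial += value;
                    }
                    return partial;
                });
            }
            // "Reduce" step: the calling (master) thread merges the partial sums.
            long total = 0;
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(10, 20),
                Arrays.asList(100));
        System.out.println(parallelSum(partitions)); // prints 136
    }
}
```

The worker tasks share no mutable state, so the only coordination point is
the final merge on the calling thread.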

Furthermore, the following features made this meeting more successful: a)
The speaker used almost entirely non-technical language to describe the
computation model; b) The lecture introduced many different scenarios, such
as online shopping, cable TV tracking, and stock exchanges, all of which
need to process large amounts of live data; c) The lecture provided some
Java code to show how to implement the analysis.

Although Dr. Bain’s talk lasted only 45 minutes and he didn’t go into more
technical detail, his design ideas inspired me a lot. In the future, I will
join more meetups to broaden my horizons.

Relevant Links:

http://www.meetup.com/mysqlnyc/events/221369970/

https://www.youtube.com/watch?v=52smTmprT7w (The same presentation, given
by Dr. Bain in Spain in 2014)

https://www.youtube.com/watch?v=H6OFzdIEy-g (What is operational
intelligence)

https://docs.google.com/a/stevens.edu/file/d/0B8QCZ3jIFMVla05UN3dVd3JpZ3c/edit
(Slides)