[cs615asa] HW#N: Attended a relevant Meetup/Talk/community event

Tue Apr 12 22:49:39 EDT 2016

Hi All,

Please find below a summary of the meet-up that I attended:

********************************** Talk Events at MediaMath with respect to
Operations Platform, held on Feb 22, 2016
*************************************

The speakers shared important points to consider when working in
Operations or as a sysadmin.
I have summarized them briefly below:

Management:
- How should the operations team maintain business continuity?
- How can customers be retained over the long term?
- Code review and policy
- What kinds of services will you provide on the network?
- Overcome dependency on individual resources; that is, no task should
depend on a single resource, so everyone must know each and every task.

Degree of Reliability:
- Scalable design with high availability
- Analyze how much downtime is acceptable. For most users, the answer
would be zero.
- Vital components of the network, such as file servers, should have fault
tolerance built in from the bottom up.
- Set up disk arrays using RAID techniques to protect data in the event of
a disk failure.
- If a link between two offices needs to be up 100% of the time, plan for
multiple links between the two sites to provide backup capability.
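
To put rough numbers on the downtime and multiple-links points above, here
is a small Python sketch. It assumes the links fail independently of each
other (often not quite true in practice, e.g. two links sharing one
conduit), and the function names are my own, not something from the talk.

```python
def combined_availability(link_availability: float, link_count: int) -> float:
    """Availability of `link_count` redundant links, assuming independent failures.

    The site is unreachable only if every link is down at the same time.
    """
    if not 0.0 <= link_availability <= 1.0:
        raise ValueError("link_availability must be between 0 and 1")
    if link_count < 1:
        raise ValueError("link_count must be at least 1")
    return 1.0 - (1.0 - link_availability) ** link_count


def downtime_hours_per_year(availability: float) -> float:
    """Expected downtime in hours over a 365-day year."""
    return (1.0 - availability) * 365 * 24
```

For example, a single link that is up 99% of the time is expected to be
down about 87.6 hours a year, while two such independent links together
are expected to be down under one hour a year.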

Reactive vs Proactive Approaches:
Reactive (Events that Occur at failure):
- Many events occur before a system failure happens, and eventually a
configured alert is raised notifying you that the system is down.
- If a failure occurs, the operations team is responsible for debugging the
root cause and origin of the failure, fixing it, and then documenting
the steps they took to resolve it.

Proactive (Events that Occur Before failure):
- Here the alerts will most likely come from proactive monitoring, telling
you that there are component failures that need attention but have not yet
affected overall application availability, e.g. a power supply or disk
failure in a server.
- Troubleshoot in a testing environment.
- Discovery and Monitoring - Proactive action to avoid any failure
- Centralized statistical views of the overall system - e.g. Viewpoint
- Analysis and Graphical representation of metric data.
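
As a rough illustration of the proactive idea above, a check can raise a
warning while a component is degraded but still working, before users are
affected. The Python sketch below assumes disk usage as the metric; the
function names and the 80%/95% thresholds are hypothetical and were not
part of the talk.

```python
import shutil


def classify_usage(used_bytes, total_bytes, warn=0.80, critical=0.95):
    """Return 'ok', 'warning', or 'critical' for a disk usage ratio.

    The default thresholds are illustrative assumptions only.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    ratio = used_bytes / total_bytes
    if ratio >= critical:
        return "critical"
    if ratio >= warn:
        return "warning"
    return "ok"


def check_disk(path="/"):
    """Classify current usage of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free
    return classify_usage(usage.used, usage.total)
```

A "warning" result is the proactive case: nothing is down yet, but someone
should look at it before it becomes a reactive incident.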

DevOps Terminology:
- The term DevOps is well known nowadays, but the job of a DevOps engineer
is neither purely dev nor purely ops; it is something in the middle.
- A DevOps engineer should understand both development and operational
tasks; they also do code review and operational inspection.
- The future systems administrator is more about the system and less about
the administration.

Automation:
Yes, you heard it right, AUTOMATION ... it is now the new mantra for
DevOps. But hold on, be cautious in the following scenarios:
Don't automate what you don't understand.
Don't automate what you cannot validate.

Tools to Automate:
We are lucky to have many tools available for automating processes and
configuration, which saves time by reducing human effort. The tools are as
follows:
Ansible (agentless), Python
Chef and Puppet use an agent to talk to a master (having an agent is not
ideal, so Ansible is better: it just pushes changes out)
Program in a modular fashion for code reusability.

I am pretty sure that many of you schedule your jobs with cron, but don't
be surprised: it has many disadvantages, and some of them are invisible,
but they exist.
Sometimes cron doesn't run a job, so no action is performed, and the user
is never notified.
Often a dependent cron job starts even though the job it depends on has
not finished.
It is very hard to debug when things go wrong, whereas Jenkins gives us
all sorts of visibility into the task, options to chain things together,
etc.
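
To make the dependency and silent-failure problems concrete, here is a
minimal Python sketch of a wrapper that a scheduled job could be run
through. It is not how Jenkins works and was not shown at the talk; the
marker-file convention and the name run_job are assumptions of mine.

```python
import subprocess
import sys
from pathlib import Path


def run_job(command, done_marker, requires=None):
    """Run `command` (a list of arguments) and return True on success.

    done_marker: file created only when the command exits with status 0.
    requires:    marker file of a job that must have finished first
                 (hypothetical convention, not a cron feature).
    """
    done_marker = Path(done_marker)
    if requires is not None and not Path(requires).exists():
        # Refuse to start while the job we depend on is unfinished.
        print(f"SKIPPED: prerequisite {requires} has not finished", file=sys.stderr)
        return False
    done_marker.unlink(missing_ok=True)  # clear a stale marker (Python 3.8+)
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Report the failure instead of failing silently.
        print(f"FAILED ({result.returncode}): {result.stderr.strip()}", file=sys.stderr)
        return False
    done_marker.touch()
    return True
```

The second job would be called with requires set to the first job's
marker, so it is skipped, with a message, whenever the first job has not
completed.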

Backup:
Now, coming to one of the most important aspects: data storage and backup.
Whatever we do, we do for data, but what if the data is lost? Could
anything help with this? Yes, you got it right ... BACKUP
Many of us think that having RAID means we have a data backup, but that is
not the case: "RAID has nothing to do with backup"
A sysadmin has to think about and analyze how often the data needs to be
backed up; backups can be scheduled weekly, monthly, or incrementally,
depending on the importance of the data.
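
An incremental backup copies only what changed since the previous run. Here
is a minimal Python sketch of that selection step, assuming file
modification time is a good enough change signal; the function name is
hypothetical, and a real backup tool would also handle deleted files and
actually copy the data somewhere safe.

```python
from pathlib import Path


def files_changed_since(root, last_backup_time):
    """Return files under `root` modified after `last_backup_time`.

    last_backup_time is in seconds since the epoch, e.g. the value of
    time.time() recorded when the previous backup finished.
    """
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime > last_backup_time
    )
```

Passing 0 as last_backup_time selects every file, which corresponds to a
full backup.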

******************* Write policy with proper documentation which is the
most important part (Can be used for future references)
*************************

Why did I choose it?
It was closely related to my areas of interest, i.e. systems and DevOps.

What I learned from it:
After attending this meetup I learned about the disadvantages of cron, the
importance of backups, various tools used for automation, and the
responsibilities of a system admin.

Thanks,
Bipin Pandey.