System

Slides: 63

Provided by: liuta

Transcript and Presenter's Notes

1
System Network Administration

Chapter 3 Service
By Chang-Sheng Chen (200803011)

2
Contents of Chapter 3

3.1 The Basics
3.1.1 Customer Requirements
3.1.2 Operational Requirements
3.1.3 Open Architecture
3.1.4 Simplicity
3.1.5 Vendor Relation
3.1.6 Machine Independence
3.1.7 Environment
3.1.8 Restricted Access
3.1.9 Reliability

3.1.10 Single or Multiple Servers
3.1.11 Centralization and Standards
3.1.12 Performance
3.1.13 Monitoring
3.1.14 Service Rollout
3.2 The Icing
3.2.1 Dedicated Machine
3.2.2 Full Redundancy
3.3 Conclusion

3
The Basics

The most important thing to consider at all
stages of design and deployment is the customers'
requirements.
Talk to the customers and find out what their
needs and expectations are for the services.
Then, build a list of other requirements that are
only visible to the SA team.
Focus on what, rather than how.
Services should be built on server-class machines
that are kept in a suitable environment.

4
The Basics (cont.)

Access to server machines should be restricted to
SAs for reasons of reliability and security.
An SA has several decisions to make when building
a service.
Choosing vendors and products (software,
hardware)
Reliability, performance, etc.

5
The Basics (cont.)

Most services rely on other services.
Understanding in detail how a service works will
give you insight into the service on which it
relies.
For example, almost every service relies on name
service (DNS). DNS relies on network, and
therefore, anything that relies on DNS also
relies on network.
A service should be built to be as simple as possible,
with as few dependencies as possible, to increase
reliability and make it easier to support and
maintain.
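
The dependency chain described on this slide can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical dependency map: only DNS and the network come from the slide, and the other service names are made up. It computes everything a service relies on, directly or indirectly.

```python
# Hypothetical dependency map: each service lists what it relies on directly.
# Only "dns" -> "network" comes from the slide; the rest is illustrative.
DEPENDS_ON = {
    "webmail": ["mail", "dns"],
    "mail": ["dns"],
    "dns": ["network"],
    "network": [],
}


def all_dependencies(service, depends_on):
    """Return every service that `service` relies on, directly or indirectly."""
    seen = set()
    stack = list(depends_on.get(service, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(depends_on.get(dep, []))
    return seen


print(sorted(all_dependencies("webmail", DEPENDS_ON)))
```

The fewer entries such a set contains, the fewer things can break the service.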

6
The Basics (cont.)

Another method of easing support and maintenance
is to use standard hardware/software, standard
configurations and have documentation in a
standard location.
A key part of implementing any new service is to
make it independent of the particular machine
that it runs on.

7
3.1.1 Customer Requirements

When building a new service, you should always
start with the customer requirements.
Gathering the customer requirements
There are very few services that do not have
customer requirements.
DNS, authentication services, etc.
A Service Level Agreement (SLA)
An SLA enumerates the services that will be
provided and the level of support they receive.

8
Service Level Agreement(cont.)

A Service Level Agreement (SLA)
An SLA enumerates the services that will be
provided and the level of support they receive.
It typically categorizes problems by severity and
commits to response times for each category.
The SLA usually defines an escalation process
that increases the severity of a problem if it
has not been resolved after a specified time and
calls for managers to get involved if problems
are getting out of hand.
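
The escalation rule can be sketched as follows. The severity levels, response times, and the eight-hour escalation interval are all hypothetical values chosen for illustration; a real SLA would define its own.

```python
from datetime import timedelta

# Hypothetical severity categories and response-time commitments.
RESPONSE_TIME = {
    1: timedelta(hours=1),   # severity 1 is the most severe
    2: timedelta(hours=4),
    3: timedelta(days=1),
}
ESCALATE_AFTER = timedelta(hours=8)  # assumed escalation interval


def current_severity(initial, open_for):
    """Raise severity one level per elapsed escalation interval (1 is highest)."""
    steps = open_for // ESCALATE_AFTER
    return max(1, initial - steps)


def managers_involved(initial, open_for):
    """Assume managers are called in once escalation would go past severity 1."""
    steps = open_for // ESCALATE_AFTER
    return initial - steps < 1
```

For example, a severity 3 problem that has been open for 20 hours is treated as severity 1, and after 25 hours managers get involved.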

9
Service Level Agreement (SLA)

The SLA process is a forum for the SAs to
understand the customers' expectations and to set
them appropriately, so that the customers can
understand what is and is NOT possible and why.
It is a tool to plan what resources will be
required.
The SLA should document the customers'
requirements and set realistic goals for the SA
teams in terms of features, availability,
performance, and support.
It should document future needs and capacity so
that all parties will understand the growth plans.

10
3.1.2 Operational Requirements

The SA team may have other requirements for the
new service that are not immediately visible to
the customers.
These include the administrative interface, whether
the service interoperates with other existing
services, and whether it can be integrated with
central services such as authentication or
directory services.
SAs also need to consider how the service scales.
A related consideration is the upgrade path for
the service.
The level of reliability
Network performance issues
Monitoring issues (availability, performance,
etc.)
Budget issues

11
Operational Requirements (cont.)

Questions about an upgrade process
Does it involve an interruption of service?
Does it involve touching every desktop?
Is it possible to roll out the upgrade slowly, to
test it on a few willing people before inflicting
it on the whole organization?
Try to design the service so that upgrades are
easy, can be performed without service
interruption, don't require touching the
desktops, and can be rolled out slowly.

12
3.1.3 Open Architecture

Whenever possible, a new service should be built
around open protocols and file formats.
Any service with an open architecture can be more
easily integrated with other services that follow
the same standards.
The business case for using open protocols is
simple:
it lets you build better services because you can
select the best server and the best client, rather
than being forced to pick, for example, the best
client and then getting stuck with a less than
optimal server.

13
Open architecture (cont.)-The ability to
decouple the client and server selections

A better way is to select protocols based on open
standards and permit each side (i.e., client and
server) to select its own software.
Customers are free to choose the software that
best fits their own needs, biases, and even
platforms.
SAs can independently choose a server solution
based on their needs for reliability,
scalability, and manageability.
The SAs can now choose between competing server
products, rather than being locked into the
(potentially difficult-to-manage) server software
and platform required for a particular client
application.
Open protocols provide a level playing field that
inspires competition between vendors, which
benefits you.

14
Open architecture (cont.)

Open protocols and file formats are typically quite
static (or change only in upward-compatible ways)
and widely supported,
giving you the maximum product choice and maximum
chance of reliable, interoperable products.
The other benefit of using open systems is that
you don't require a gateway to the rest of the world.
Gateways are additional services that require
capacity planning, engineering, monitoring, and
everything else mentioned in this chapter.
Case Study
Hazards of Proprietary Email Software
Selection was based primarily on the client user
interface and features (e.g., graphical user
interface), with no concern for server management,
reliability, and scalability
All messages from all users in a single large
file
Protocol Gateways Reduce Reliability
Microsoft Exchange Server

15
3.1.4 Simplicity

When architecting a new service, simplicity
should be your foremost consideration.
The simplest solution that satisfies all the
requirements will be the most reliable, easiest
to maintain, easiest to expand, and easiest to
integrate with other systems.
As the system grows, it will become complex.
Therefore, starting out as simple as possible
delays the day when a system has become too
complex.
Sometimes, one or two requirements from the
customer or SAs may add considerably to the
complexity of the system.
Reevaluate the importance of these requirements.
These requirements could be met, but at a cost to
reliability, support levels, and on-going
maintenance.

16
3.1.5 Vendor Relations

When choosing hardware and software for a
service, you should be able to talk to sales
engineers from your vendors to get advice on the
best configuration for your application.
Hardware vendors sometimes have product
configurations that are tuned for particular
applications, such as database or web servers.
If there is more than one server vendor in your
environment, and more than one of your vendors
has an appropriate product, you should use the
situation to your advantage.
Get those vendors bidding against each other:
get the same price for more performance,
reliability, or scalability, or
get a better price and be able to invest the
surplus.
Even if you know which vendor you will choose,
don't let them know that you have decided until
you are convinced that you have the best deal
possible.

17
3.1.5 Vendor Relations (cont.)

When choosing a vendor, particularly for a software
product, it is important for you to understand
the direction in which the vendor is taking the
product.
For key central services, such as authentication
or directory services, it is essential to stay in
touch with the product direction, or you may
suddenly discover that the vendor no longer
supports your platform.
If possible, try to stick to vendors who develop
the product primarily on the platform you use,
rather than porting it to that platform.
Benefits: fewer bugs, new features first,
and better support.

18
3.1.6 Machine Independence

For name-based services (Ch.6 Name Service)
Clients should always access a service using a
generic name that is based on the function of the
service.
E.g., smtp.nctu.edu.tw, pop3.nctu.edu.tw
The machine should never have a primary machine
name that is function-based,
because ultimately the function may need to move
to another machine. For example,
Primary name: DcMg.nctu.edu.tw
Alias (service) name: smtp.cc.nctu.edu.tw
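
A small sketch of why the alias matters. It models name records as a plain dictionary rather than querying a real DNS server, and the replacement host name newhost.nctu.edu.tw is hypothetical.

```python
# Simplified model of alias records: a function-based alias points at a
# machine's primary name. This is a dictionary, not a real DNS lookup.
aliases = {"smtp.cc.nctu.edu.tw": "DcMg.nctu.edu.tw"}


def resolve(name, aliases):
    """Follow aliases until a primary machine name is reached."""
    seen = set()
    while name in aliases:
        if name in seen:
            raise ValueError(f"alias loop at {name}")
        seen.add(name)
        name = aliases[name]
    return name


# Moving the service means repointing the alias; clients keep the same name.
before = resolve("smtp.cc.nctu.edu.tw", aliases)
aliases["smtp.cc.nctu.edu.tw"] = "newhost.nctu.edu.tw"  # hypothetical new machine
after = resolve("smtp.cc.nctu.edu.tw", aliases)
```

Clients configured with the service name need no change when the service moves; only the alias record is updated.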

19
3.1.6 Machine Independence (cont.)

For IP address based services,
we could also use some techniques (such as layer
4 switching) to give the machine that the service
runs on multiple virtual IP addresses in addition
to the primary real IP address.
Then the virtual address and the service can be
moved to another machine relatively easily.

20
3.1.7 Environment

A fundamental piece of building a service is
providing a reasonably high level of
availability, which means placing all the
equipment associated with that service into a
data center (cf. Ch.17).
A data center provides protected power, plenty of
cooling, controlled humidity (vital in dry or
damp climates), fire suppression, and a secure
location where the machine should be free from
accidental damage or disconnection.
In addition, a server often needs much higher-speed
network connections (e.g., high-speed links, more
interfaces) than its clients because it needs to
be able to communicate at reasonable speeds with
many clients simultaneously.
High-speed network cabling and hardware typically
are expensive to deploy.

21
Environment (cont.)

None of the components of the service should rely
on anything that runs on a machine that is not
located in the data center.
The service is only as reliable as the weakest
link in the chain of the components that need to
be working for the service to be available.
If a component does rely on such a machine, find a
way to change the situation:
Move the machine into a data center
Replicate that service onto a data center machine
Remove the dependency on the less reliable
machine
Case Study
Hazards of servers relying on Non-servers
NFS automount

22
3.1.8 Restricted Access

Restricting server access to the SA team from the
beginning is the best approach to ensure
reliability and expected performance levels.
There should be no reason for anyone to log in to
a server other than an SA performing
administrative work on the server.
The fewer people who log in to a machine, the
more stable it is.
If a customer can log in to a particular server
and becomes accustomed to doing so, he probably
will start running other jobs on it that take CPU
and I/O cycles away from the services, without
realizing that he is adversely affecting the
service.
E.g., NFS server

23
3.1.9 Reliability

If you have redundant hardware available, use it
as effectively as you can.
The single most effective way to make a service
as reliable as possible is to make it as simple
as possible.
Find the simplest solution that meets all the
requirements.
When you are building a service at a central
location that will be accessed from remote sites,
it is particularly important to take network
topology into account.
If connectivity to the main site is down, can the
service still be available to remote sites?
Some, yes: (stale) name service, authentication
service
Others, no: database or file service

24
3.1.10 Single or Multiple Servers

Independent services (or daemons) should always
be on separate machines, if cost and
staffing levels permit.
However, if the service that you are building is
actually composed of more than one new
application or daemon and the communication
between those components is over a network, you
need to consider whether to put all of the
components on one machine or to split them across
many machines.
E.g., a website with a database, a mail system
with many filtering mechanisms (e.g., anti-spam,
anti-virus, etc.)
The choice may be determined by security,
performance, or scaling concerns.

25
Single or Multiple Servers (cont.)

In other cases, one of the components will
initially only be used for this one application,
but may later be used by other applications.
E.g.,
calendar service → LDAP server (initially)
mail service → LDAP server (later)
If a service, such as LDAP, may be used by other
services in the future, it should be placed on
dedicated machines,
so that the calendar service can be upgraded and
patched independently of the (ultimately more
critical) LDAP servers.

26
Single or Multiple Servers (cont.)

Sometimes, two applications or daemons may be
completely tied together and will never be used
apart from each other.
In this situation, it makes sense to put them
both on the same machine.
E.g., mail server and DNS caching server
Video streaming server:
encoding and streaming server

27
3.1.11 Centralization and Standards

An element of building a service is centralizing
the tools, applications, and services that your
customers need.
Centralization means that the tools,
applications, and services are primarily managed
by one central group of SAs on a single central
set of servers.
Support for these services is provided by a
central helpdesk.
Centralizing services and building them in
standard ways make them easier to support and
lower training costs.
The service should be designed and documented in
some consistent way, so that the SA answering the
support call knows where to find everything and
thus can respond more quickly.

28
Centralization and Standards (cont.)

Centralization does not preclude centralizing on
regional or organizational boundaries, particularly
if each region or organization has its own
support staff.
Some services, such as e-mail, authentication
services and networks, are part of the
infrastructure and need to be centralized.
For large sites, these services can be built with
a central core that feeds information to and from
distributed regional and organizational systems.
Other services, such as file services and CPU
farms, are more naturally centralized around
departmental boundaries.

29
3.1.12 Performance

From a customer's view, two things are important
in any service:
Does it work? and Is it fast?
When designing a service, you need to pay
attention to its performance characteristics,
even though there may be many other difficult
technical challenges to overcome.
Performance expectations increase constantly as
networks, graphics, and processors get faster.
To build a service that performs well, you need
to understand how it works and perhaps look at
ways of splitting it effectively across multiple
machines.

30
3.1.12 Performance (cont.)

Performance that is acceptable now may not be
acceptable six months or a year from now.
You also need to consider how to scale the
performance of the system as usage and
expectations rise above what the initial system
can do.

31
3.1.12 Performance (cont.)

When choosing the servers that run the service,
consider how the service works.
A lot of disk I/O?
More disk reads than writes (or vice versa)?
Keeping large tables of data in main memory?
Lots of fast memory and larger memory caches
A network-based service that sends large amounts
of data to clients or between servers?
Multiple dedicated servers with high-speed
interfaces, clusters of servers, etc.

32
Performance (cont.)

Case Study
Bad capacity planning makes a bad first
impression
Performance at remote sites (i.e., over wide area
links)
Web site (e.g., different content for modem, T1,
high-speed links, etc.)
Solution: proxy server (HTTP accelerator)
Handset windows vs. computer windows

33
Performance at remote sites

Performance of the service for remote sites may
also be an issue.
In some cases, quality of service or intelligent
queuing mechanisms can be sufficient to make
performance acceptable.
E.g., mail relays/forwarders, web proxies, etc.
In others, you may need to look at ways of
reducing the network traffic.
Different content on a web system for different
speed of links (e.g., text-only versions for
low-speed links (modem, T1) and graphical
versions for high-speed links, etc.)

34
3.1.13 Monitoring (Ch.24)

A service is not complete and cannot be called a
service unless it is being monitored for
availability, problems, and performance and there
are capacity planning mechanisms in place.
The helpdesk, or front-line support group, must
be automatically alerted to problems with the
service so that they can start fixing them before
too many people are affected by these problems.
Likewise, the SA group should monitor the service
on an ongoing basis from a capacity planning
standpoint.
E.g., network bandwidth, server performance,
transaction rates, license and physical device
availability, etc.
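
A minimal sketch of the capacity-planning side of monitoring. The metric names and threshold values are hypothetical, and a real deployment would feed measured samples into a check like this and route the result to the helpdesk.

```python
# Hypothetical capacity thresholds; real limits depend on the site.
THRESHOLDS = {
    "bandwidth_util": 0.80,    # fraction of link capacity in use
    "cpu_util": 0.85,          # fraction of CPU in use
    "licenses_in_use": 0.90,   # fraction of licenses checked out
}


def capacity_alerts(sample, thresholds=THRESHOLDS):
    """Return the names of metrics in `sample` that exceed their threshold."""
    return sorted(
        name for name, limit in thresholds.items()
        if sample.get(name, 0.0) > limit
    )


print(capacity_alerts({"bandwidth_util": 0.92, "cpu_util": 0.40}))
```

Metrics missing from a sample are treated as zero here; a stricter check might flag missing data as a problem in its own right.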

35
Monitoring Example: Statistics for
mail.nctu.edu.tw
(Slides 35 to 38 show monitoring charts that are
not reproduced in this transcript.)

39
3.1.14 Service Rollout

Make sure the customers' first impressions are
positive.
The rollout and the customers' first experiences
with the service will color the way that they
view the service in the future.
One of the key pieces of making a good impression
is having all of the documentation available, the
helpdesk familiar with and trained on the new
service, and all the support procedures in place.
There is nothing worse than having a problem with
a new application and finding out that no one
seems to know anything about it when you look for
help.

40
3.1.14 Service Rollout (cont.)

The rollout also includes building and testing a
mechanism to install new software and
configuration settings that are needed on each
desktop.
One-some-many technique
One → Some → Many
Ideally, no new desktop software or configuration
should be required for the service, because that
is less disruptive for your customers and reduces
maintenance,
but installing new client software on the
desktops is frequently necessary.
E.g., enabling the IEEE 802.1X authentication scheme,
web browser (IE vs. Firefox)
New Trend
Example: SSL VPN vs. PPTP VPN
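
The one-some-many technique named on this slide can be sketched as a staged loop. The `install` and `verify` callables and the stage size of five are hypothetical placeholders for whatever deployment mechanism a site actually uses.

```python
def rollout_stages(hosts, some=5):
    """Split hosts into one-some-many stages: 1 host, then `some`, then the rest."""
    hosts = list(hosts)
    return [s for s in (hosts[:1], hosts[1:1 + some], hosts[1 + some:]) if s]


def roll_out(hosts, install, verify, some=5):
    """Install stage by stage; stop before the next stage if any host fails."""
    done = []
    for stage in rollout_stages(hosts, some):
        for host in stage:
            install(host)
            done.append(host)
        if not all(verify(h) for h in stage):
            break
    return done
```

A failure found in the "one" or "some" stage affects only a handful of desktops, which is the point of rolling out slowly.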

41
3.2 The Icing

3.2.1 Dedicated Machine
3.2.2 Full Redundancy
E.g., name service, authentication services
Primary vs. secondary (duplicate) set of servers
Fail-over, backup
Tightly coupled vs. loosely coupled servers
Load-sharing, performance-increasing

42
3.2.1 Dedicated Machine

Having dedicated machines for each service
More reliable
Debugging easier when there are reliability
problems
Outages more limited in scope
Upgrades and capacity planning much easier

43
Dedicated Machine (cont.)

Sites that grow from a small company to a larger
one generally end up with one central
administrative machine.
Eventually, this machine will have to be split up
and the services spread across many servers
because of increased load.
IP address dependencies are the most difficult to
deal with when splitting services from one
machine to many.
Name service (e.g., DNS, NIS), security service
(e.g., router or firewall rules), etc.

44
3.2.2 Full Redundancy

Consider which services it would most benefit your
customers to have completely redundant, and
start there.
Name service and authentication services are
typically the first services to have full
redundancy.
This is because they are designed for secondary
servers and because they are so critical.
Other critical services, such as e-mail,
printing, and networks, tend to be considered
much later because they are more complicated or
more expensive to make completely redundant.
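
The benefit of redundancy can be estimated with a short calculation. This sketch assumes that server failures are independent and that failover is instantaneous, which real systems only approximate; the 0.99 figure is hypothetical.

```python
def redundant_availability(single, copies):
    """Availability of `copies` interchangeable servers, any one of which suffices.

    Assumes failures are independent and failover is instantaneous.
    """
    return 1.0 - (1.0 - single) ** copies


print(redundant_availability(0.99, 2))
```

Under these assumptions, a second server with 0.99 availability lifts the pair to roughly 0.9999, because both must be down at once for the service to fail.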

45
Full Redundancy

Another benefit of full redundancy
It makes upgrade procedures easier.
A rolling update can be performed
Case Study Design Email services for Reliability
Incoming mail path vs. Outgoing mail path
Mail relays vs. mail routing hosts
Mail delivery hosts
Firewall

46
Appendix

Background - Internet Applications
Networking Troubleshooting Process
Case Study
E-mail system operations and design
considerations
Security events

47
Background - Internet Applications
48
Truth Depends on Interpretation (e.g., anti-spam
or anti-virus mail filtering)

[Diagram: a message from MTA0 is evaluated by two
receivers with different filters, H1(msg) and
H2(msg). The outcomes shown are Accept (into the
mail spool) and Discard.
Nodes: MTA0, MTA1 (or MUA1), MTA2 (or MUA1).]

MTA: Mail Transfer Agent
MUA: Mail User Agent

49
E-mail (remainder of the slide title was lost in
extraction)

[Diagram components: Internet, firewall, incoming
SMTP gateway farm, mail filtering (BL/GL/WL,
auto-learn), bouncing server, mail spool server,
SMTPauth, outgoing SMTP gateway farm.]

50
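Incoming Flow of a Typical Mail System

[Diagram: mail arrives from the Internet at the MTA
(sendmail), is handed to the LDA (procmail), and is
stored in the mail spool. The POP3/IMAP server
serves the spool to the MUA (Netscape, MS Outlook,
etc.) on the user PC, which has its own user mail
storage and anti-virus programs.]
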
51
Generic E-mail Transmission Path

[Diagram: source, SMTP, outgoing SMTP gateway,
firewall and filtering, then firewall and
filtering, incoming SMTP gateway, SMTP and
POP3/IMAP, destination. The hops are numbered 1
to 6.]

52
A Hybrid Model for Anti-spam: Generic Mail
Filtering

[Diagram: mail from the client passes through
generic mail filtering in numbered stages (1) to
(4). Components shown: white list, black list,
grey list, automatic spam learning, and the mail
spool. Outcomes shown: pass, fail (reject or
bounce), fail temporarily, discard, and update.]
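
One way to read the hybrid model is as a chain of checks. The sketch below is an assumption about how the stages fit together, not a description of the system on the slide: the order of checks, the mapping of stage numbers to lists, the score threshold of 16, and keying the grey list on the sender alone are all simplifications made for illustration.

```python
def classify(sender, score, whitelist, blacklist, seen_before, spam_threshold=16):
    """Return one of: 'pass', 'reject', 'tempfail', 'discard'.

    Assumed stage order: white list, black list, grey list, spam score.
    """
    if sender in whitelist:
        return "pass"                  # white list: accept immediately
    if sender in blacklist:
        return "reject"                # black list: fail
    if sender not in seen_before:
        seen_before.add(sender)
        return "tempfail"              # grey list: fail temporarily, expect a retry
    if score >= spam_threshold:
        blacklist.add(sender)          # learning step: update the black list
        return "discard"
    return "pass"
```

A real greylisting implementation would normally key on more than the sender (for example the connecting IP address and recipient) and would enforce a minimum retry delay.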
53
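Sample anti-spam statistics for mail.TN.edu.tw
(http://ms2.tn.edu.tw/report/day/)

[Chart labels: All Msg., SpamAssassin, ClamAV,
Greylist, Virus, Rejected, Passed SpamLevel (6-15),
Passed, Blocked SpamLevel > 16. The value labels
(25, 27, 5, 3, 17, 2, 73) cannot be matched to
categories in this transcript.]
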
54
Networking Troubleshooting Process

[Diagram: a client and two sites, a and b. Each
site has SMTP filtering (SMTP_a, SMTP_b),
router/switch filtering (Router_a, Router_b), and
DNS filtering (DNS_a, DNS_b).]

55
Port-scanning summary on DNS servers of neighbor
sites
56
DNS server (slide title partly lost): sample scenario

[The body of this slide was in Chinese and did not
survive extraction. Surviving terms: 2000, DNS
servers, server-A, security hole, abuse,
postmaster, root mail, e-mail, router, DNS.]

57
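Multiple outgoing paths and distributed DNS

[Diagram: a two-layer design (Layer-1, Layer-2)
with uplinks to the Internet through ISP-1 and
ISP-2. Elements shown: DNS server farm, DNS
server, caching-only server, ordinary client, the
.com and .arpa zones, and other services (SMTP,
www, proxy).]
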
58
Traffic Amplifying Attacks via DNS Zone Transfer

[Diagram: attacker A sends zone transfer queries
Q(1) ... Q(n) to DNS servers D1, D2, ..., Dn, where
n is some large number. The responses R(1) ... R(n)
converge on the attacked site V (victim).]

59
Common Terms

Reliability (from Wikipedia)
In general, reliability (systemic def.) is the
ability of a person or a system to perform and
maintain its functions in routine circumstances,
as well as hostile or unexpected circumstances.
The IEEE defines it as ". . . the ability of a
system or component to perform its required
functions under stated conditions for a specified
period of time."

60
Common Terms

In telecommunications and reliability theory, the
term availability has the following meanings:
1. Simply put, availability is the proportion of
time a system is in a functioning condition.
Note 1: The conditions determining operability
and committability must be specified.
Note 2: Expressed mathematically, availability is
1 minus the unavailability.

61
Common Terms

In telecommunications and reliability theory, the
term availability has the following meanings:
2. The ratio of (a) the total time a functional
unit is capable of being used during a given
interval to (b) the length of the interval.
Note 1: An example of availability is 100/168 if
the unit is capable of being used for 100 hours
in a week.
Note 2: Typical availability objectives are
specified either in decimal fractions, such as
0.9998, or sometimes in a logarithmic unit called
nines, which corresponds roughly to a number of
nines following the decimal point, such as "five
nines" for 0.99999 reliability.
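
The ratio and the "nines" unit from the notes above can be worked through in a few lines. The function names are illustrative, and the downtime helper assumes a 365-day year.

```python
import math


def availability(usable_hours, interval_hours):
    """Ratio of usable time to the length of the interval."""
    return usable_hours / interval_hours


def nines(avail):
    """Availability expressed in 'nines': -log10 of the unavailability."""
    return -math.log10(1.0 - avail)


def downtime_per_year_minutes(avail):
    """Expected downtime in a 365-day year, in minutes."""
    return (1.0 - avail) * 365 * 24 * 60


print(availability(100, 168))
print(nines(0.99999))
print(downtime_per_year_minutes(0.99999))
```

A unit usable for 100 hours in a 168-hour week has an availability of about 0.595, while "five nines" leaves a little over five minutes of downtime per year.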

62
Definition of availability

Barlow and Proschan (1975) define availability of
a repairable system as "the probability that the
system is operating at a specified time t."
Representation
The simplest representation for availability
is as a ratio of the expected value of the uptime
of a system to the aggregate of the expected
values of up and down time, or
$$A = \frac{E[\mathrm{Uptime}]}{E[\mathrm{Uptime}] + E[\mathrm{Downtime}]}$$
