Title: ESnet RADIUS Authentication Fabric
1ESnet RADIUS Authentication Fabric
- Michael Helm
- 19 January 2005
2ESnet RADIUS Authentication Fabric
- Report on RADIUS-OTP feasibility study
- ESSC commissioned Apr 2004
- ESnet and collaborators: May-Oct 2004
- ESnet RAF Whitepaper: product of study
- Proposed service
- Section-by-section
- Note: ESnet RAF Progress Report
- Discuss project next steps: project proposal
- Pilot production service
- Additional R&D
- New opportunities: FWNA, wireless roaming
3RADIUS OTP Feasibility Study: Summary
- April: project proposal
- http://www.doegrids.org/CA/Research/GIRAF.doc
- Evaluated two OTP vendors
- SecurID, Cryptocard
- 3 Strong Sites
- NERSC, ORNL, ESnet
- Applications: Apache and sshd
- RADIUS appliance vendor: Infoblox
- Results were very favorable
4RADIUS OTP Next Steps: Project Proposal
- Advance to Pilot
- Build the core edge sites
- Put an application into production
- R&D needed
- RADIUS has many issues
- Applications have some issues
- Kerberos and RADIUS: can they play together?
- New opportunities
- 802.1x, EAP-TLS, and more
- Wireless / roaming initiatives
- In order to support this, we must buy additional
equipment and add staff.
5ESnet RAF Whitepaper Plan
- http://www.es.net/raf
- Background
- Architecture
- Applications and Services
- Outstanding Engineering Issues
- Federation
- Future Work / New Opportunities
- Team
6Background
- DOE Lab Computing Facilities Hacked
- Panel discussion at Apr 2004 ESSC
- OTP and RADIUS efforts commissioned
- Why OTP?
- NOPS: NERSC-founded grassroots effort
- RADIUS Authentication Fabric
- ESnet response to OTP initiatives
7DOE Lab Computing Facilities Hacked
- Widespread hacking incidents early 2004
- Crossover between DOE labs, NSF labs, educational institutions, other collaborations
- Long lifetime, reusable passwords
- No amount of self-protection adequate
- Privilege escalation
- Attack unpatchable services
- Platform for further attacks
- Problems continued through 2004
8Why One Time Passwords?
- OTP tokens limit effectiveness of sniffing
- OTP breaks up domino effect noted in recent hacks
- Additional protection for unsecured, commercial environments (kiosks, shared / home computers, etc.)
- Tokens and commercial services
- Improve the user experience over S/KEY
- Reduce / limit some threats related to management of the one-time password list
- Tokens (and OTP in general) add some additional, DoS-type threats
9NOPS
- NERSC and friends organized to respond to hacks and protect distributed collaborations
- How would a large-scale OTP deployment actually be accomplished?
- What product or products would be used?
- What would happen to the applications and services?
- Requirements document
- The multiple token catastrophe
10What is the Multiple Token Catastrophe?
- Assume a large physics collaboration. A scientist at LBNL needs to move data from an experiment at FNAL to an archiving service at ANL, and then run a distributed simulation from compute centers at NERSC and NCSA.
- How many different OTP hardware tokens does she need to use to get her work done?
- <insert picture of DOE Bandolier here>
- The answer should be: she needs only ONE
11RADIUS Authentication Fabric
- Solution to the OTP interoperability problem
- Solution to the Multiple token catastrophe
- But does not exclude other solutions!
- App can require particular token
- Sites and vendors can use proprietary interoperability capabilities transparently
- What is the RAF?
- Deploy a hierarchy of RADIUS servers
- Edge (site) RADIUS servers support applications
- Edge RADIUS forwards to ESnet RADIUS core
- ESnet RADIUS core dispatches to site back end authentication service
- Let's look at an architectural picture and work through an example.
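The edge-to-core-to-back-end hierarchy above can be sketched in a few lines of Python. This is only an illustrative model, not RAF code: the class names `CoreServer` and `EdgeServer`, the `Backend` callable, and the fixed passcodes are all assumptions, and no real RADIUS packets are exchanged.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical back-end check: (user, otp) -> True on success.
Backend = Callable[[str, str], bool]


@dataclass
class CoreServer:
    """Stand-in for the ESnet RADIUS core: maps a realm to a site back end."""
    backends: Dict[str, Backend] = field(default_factory=dict)

    def authenticate(self, identity: str, otp: str) -> bool:
        # Split "user@realm" at the last "@".
        user, _, realm = identity.rpartition("@")
        backend = self.backends.get(realm)
        if not user or backend is None:
            return False  # malformed identity or unknown realm: reject
        return backend(user, otp)  # dispatch to the site back end


@dataclass
class EdgeServer:
    """Stand-in for a site (edge) RADIUS server that proxies to the core."""
    site: str
    core: CoreServer

    def authenticate(self, identity: str, otp: str) -> bool:
        # In this sketch the edge forwards every query to the core.
        return self.core.authenticate(identity, otp)


def fixed_backend(valid: Dict[str, str]) -> Backend:
    """Toy back end that accepts one fixed passcode per user (demo only)."""
    return lambda user, otp: valid.get(user) == otp


core = CoreServer({
    "es.net": fixed_backend({"mike": "123456"}),
    "nersc.gov": fixed_backend({"joe.doakes": "654321"}),
})
edge = EdgeServer(site="ornl.gov", core=core)
print(edge.authenticate("joe.doakes@nersc.gov", "654321"))
```

An application at the edge site only ever talks to its local edge server; which back end finally checks the token is decided by the realm part of the name.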
12How Does the RAF Work?
[Figure: RAF architecture diagram; surviving label: All RAF Realms]
13Architecture
- Fill in: explain protocols / jargon
- One-Time Password Technology
- RADIUS
- RADIUS Naming Convention
- RAF Terminology
- RAF Core: RADIUS components @ ESnet
- RAF Edge: RADIUS @ sites or
- Fabric: Core, Edge, and RADIUS clients
14One-Time Password Technology (OTP)
- Account is static
- Password is dynamic
- Tied to a non-reversible algorithm (see S/KEY RFC)
- Eliminating reusable passwords eliminates classes of threats
- OTP can be retrofitted into legacy apps
- Looks like ordinary password dialogue
- App must be able to outsource Auth
- This is done through PAM (Pluggable Authentication Module)
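To make "static account, dynamic password, non-reversible algorithm" concrete, here is a minimal hash-chain sketch in the spirit of S/KEY. It is an assumption-laden illustration: SHA-256 is swapped in for the hash that S/KEY actually specifies, there is no seed or challenge handling, and the names `hash_chain` and `Verifier` are invented for this example.

```python
import hashlib
import hmac


def hash_chain(seed: bytes, n: int) -> bytes:
    """Apply the one-way hash n times to the seed."""
    value = seed
    for _ in range(n):
        value = hashlib.sha256(value).digest()
    return value


class Verifier:
    """Server side: stores only the last accepted value, never the seed."""

    def __init__(self, seed: bytes, n: int) -> None:
        self.last = hash_chain(seed, n)

    def check(self, otp: bytes) -> bool:
        # A valid password hashes forward to the stored value.
        if hmac.compare_digest(hashlib.sha256(otp).digest(), self.last):
            self.last = otp  # move one step down the chain
            return True
        return False


seed = b"user secret"
server = Verifier(seed, 5)
first = hash_chain(seed, 4)   # first one-time password
print(server.check(first))    # accepted once
print(server.check(first))    # a replayed (sniffed) password is rejected
```

Because the hash cannot be run backwards, a sniffed password does not reveal the next one, which is the property the slides rely on.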
15One-Time Password Technology (Vendors)
- Three principal vendors in DOE labs
- RSA SecurID: market leader, 4 sites
- Proprietary, time- and sequence-based algorithm
- Cryptocard: FNAL, others experimenting
- Sequence based; open source (?)
- Secure Computing SafeWord: LBNL
- Sequence based
- Non interoperable!
16RADIUS
- RFC 2865: http://www.ietf.org/rfc/rfc2865.txt
- About 10 years of development history
- Wide support in industry
- Built-in features support our needs
- Proxying: RADIUS-RADIUS and RADIUS-OTP back end
- Adequate extensibility and security and other features
- RADIUS realm name used to organize RADIUS features
- This is just a fancy way to describe how we divide up our hierarchy and decide where to forward auth queries
- Client-server model
- NAS (Network Access Server): RADIUS client (PAM)
- Naming convention
17RADIUS - Infoblox
- Appliance: Infoblox
- Soon: FreeRADIUS based
- http://www.infoblox.com/products/radiusone_overview.cfm
- Dedicated Hardware
- Minimal Ports
- No User Accounts
- High Availability
- Geographical dispersion
18RADIUS Naming Convention
- RADIUS RFC defines names
- RADIUS RFC mentions realms, but does not define realms
- One widespread convention for this is
- name@Radius-realm, e.g. joe@admin or jane@biguni.edu
- We will use our top-level domains
- mike@es.net or joe.doakes@nersc.gov
- Caveat
- RADIUS user names are case-sensitive!
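A short sketch of the name@realm convention, including the case-sensitivity caveat. The `Identity` type and `parse_identity` function are hypothetical names for illustration; a real RADIUS server applies its own configured realm rules.

```python
from typing import NamedTuple, Optional


class Identity(NamedTuple):
    user: str
    realm: Optional[str]  # None when the name carries no realm


def parse_identity(name: str) -> Identity:
    """Split "user@realm" at the last "@"; no case folding is applied."""
    user, sep, realm = name.rpartition("@")
    if not sep:
        return Identity(name, None)
    return Identity(user, realm)


print(parse_identity("joe.doakes@nersc.gov"))
# The caveat in practice: these two names are different users.
print(parse_identity("Mike@es.net") == parse_identity("mike@es.net"))
```

Keeping the string exactly as typed is deliberate here: normalizing case at one hop but not another would make the same person look like two different accounts.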
19RAF Terminology
- RAF Core: RADIUS servers operated by ESnet
- Authentication routers: RADIUS proxy
- Back end database
- RAF Edge: RADIUS servers at sites or
- RADIUS client and PAM
- Usually identical: most services of interest use a RADIUS client in PAM
- (Except wireless)
20Applications and Services
- Applications consumed a lot of our time
- The ESnet RAF Progress Report documents much of this work.
- From the RADIUS operation point of view, the OTP back end services are another application.
- ESnet set up several RSA demo services in 2004 for its own use. We did not have the resources to set up other vendors' products, but relied on NERSC and ORNL to help us with additional instances and an alternate vendor.
- Applications and PAM
- PAM widely used in modern UNIX
- Semi-standardized: capabilities vary
- Multi-layered API: many capabilities
- GIRAF (see picture next)
- We focused on these 2 widely used applications
- SSHD
- Apache
21Grid Integrated RAF: GIRAF
[Figure: GIRAF flow. Components: ESnet Root CA, On-Demand CA (SIPS), OTP, RADIUS Authentication Fabric, Grid App, MyProxy, GridLogon. Numbered steps: 0. Sign subordinate CA; 1. Token authentication, release proxy cert; 2. Prime account; 3. RADIUS auth query; 4. OTP back end authentication.]
22GIRAF
- Looks complicated, but it can be REAL simple
- openssl req, a simple RADIUS client, a signing key pair, and a name policy
- Or
- NCSA Grid Logon proposal (Fusion Grid, NERSC)
- Or
- Infoblox RADIUS server!
- Very important app: protects Grid investments
23Outstanding Engineering Issues
- RAF Core
- The set of RADIUS servers operated by ESnet
- Route authentication requests
- RADIUS Operations and Security
- Not ESnet-specific issues, but somebody needs to work on these
- OTP
- Applications
- SETA: Secure, Extensible, Token Authentication (NB 15 Dec 2004)
- Grid Integrated RADIUS Authentication Fabric
- Web server client authentication
- Ssh
- Large scale file transfers
- Firewalls and VPNs
- Client security
24Applications
- ESnet should pick ONE (but see later)
- Volunteers or broader project for others
- Everything needs work, mostly PAM
- GIRAF: NCSA/NMI support?
- SETA: NERSC proposal
- Batch jobs
- Kerberos support
25RAF Core
- Reliability, reliability, reliability
- Multiple instances
- WAN high-availability
- Presentation: round robin or individual
- We need to understand failure modes better
- Simpler is better
- Where does the core end?
- Doesn't matter now: we need configuration solutions for edge or site RADIUS
26RADIUS Operations and Security
- RADIUS security shortcomings
- OTP bypasses some of these
- Shared secret: RADIUS to RADIUS to RADIUS client
- Lack of confidentiality of transactions
- Need to secure admin interfaces
- Absolutely must protect against man-in-the-middle hijacking, etc.
- Deploy VPN or IPSec to support service
- Look at EAP/802.1x for person
- RADIUS Operations and Security
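To show what the shared secret actually does, the sketch below follows the User-Password hiding scheme described in RFC 2865, section 5.2: the password is XORed block by block with an MD5 keystream derived from the shared secret and the 16-octet Request Authenticator. The function names `hide_password` and `reveal_password` are illustrative, and this is not a full RADIUS client; note that only this one attribute is hidden, so other attributes still travel in the clear.

```python
import hashlib
import os


def hide_password(password: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Hide a User-Password value as in RFC 2865 section 5.2."""
    if len(authenticator) != 16:
        raise ValueError("Request Authenticator must be 16 octets")
    if len(password) > 128:
        raise ValueError("password longer than 128 octets")
    # Pad with NULs to a non-zero multiple of 16 octets.
    size = max(16, -(-len(password) // 16) * 16)
    padded = password.ljust(size, b"\x00")
    hidden = b""
    prev = authenticator
    for i in range(0, size, 16):
        keystream = hashlib.md5(secret + prev).digest()
        block = bytes(p ^ k for p, k in zip(padded[i:i + 16], keystream))
        hidden += block
        prev = block  # the next block is chained on this ciphertext block
    return hidden


def reveal_password(hidden: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Inverse operation, as performed by the receiving RADIUS server."""
    plain = b""
    prev = authenticator
    for i in range(0, len(hidden), 16):
        keystream = hashlib.md5(secret + prev).digest()
        block = hidden[i:i + 16]
        plain += bytes(c ^ k for c, k in zip(block, keystream))
        prev = block
    return plain.rstrip(b"\x00")


request_authenticator = os.urandom(16)
wire = hide_password(b"123456", b"shared-secret", request_authenticator)
print(reveal_password(wire, b"shared-secret", request_authenticator))
```

Everything rests on the one secret shared by each pair of hops, which is why the slides call for VPN or IPSec underneath and for careful shared secret management.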
27OTP Engineering Issues
- Cryptocard receives better reviews, but
- Essential to support RSA SecurID
- ESnet needs access to both technologies
- Quality control issues
- Error recovery
- Lost back end server
- Reporting synchronization errors
- Documents needed
- OTP Service Best Practices guides and other
security analysis
28Federation
- Layer 8 issues trump technical issues
- RAF Federation document in draft
- http://www.es.net/raf/DOE%20OTP%20federation_v2.doc
- Federation governance: based on GGF template
- RAF-specific issues
- Types of authentication permitted
- VPN or IPSec management
- Token management: replacement, resynchronization, etc.
- RADIUS shared secret management
- RADIUS configurations
- RADIUS replication
- Realm naming practices
- DISCLOSURE by participants (sites) is essential
- How should this federation be governed/populated?
29Team: RAF/NOPS/Globus
- ESnet: Tony Genovese, Michael Helm, Roberto Morelli, Dhivakaran Muruganantham, John Webster
- Infoblox: Edwin Menor, Andy Zindel
- LBNL: Olivier Chevassut
- NERSC: Stephen Chan, Eli Dart
- ORNL: Tom Barron, Sue Willoughby
- ANL: Remy Evard, Gene Rakow, Craig Stacey; Frank Siebenlist (Globus)
- PNNL: Craig Gorenson
- NCSA: Jim Basney, Von Welch
30Future Work / New Opportunities
- KERBEROS-RADIUS interoperability
- NERSC's SETA
- FNAL
- Use cases
- DIAMETER
- IETF (partial?) replacement for RADIUS
- Wireless: FWNA
- Wireless roaming initiatives
- September 2004: GGF-12
31KERBEROS-RADIUS
- Motivated by FNAL situation
- Cryptocard tightly integrated with KDC
- Predates this effort
- How can we integrate them?
- Use case (hypothetical)
- FNAL scientist wants to use service at LBNL (no Kerberos)
- Get OK at LBNL from FNAL KDC
- LBNL scientist wants to use FNAL service (Kerberos)
- Get TGT at FNAL from LBNL RADIUS OK
- FNAL scientist wants to use NERSC service (K2K)
- Get forwarded TGT (Kerberos inter-realm) from FNAL: OK?
- Don't have any of this: can EAP do this?
- Can we find EAP-GSS or something similar?
32DIAMETER
- DIAMETER: IRTF/IETF replacement for RADIUS (RFC 3588)
- TCP instead of UDP
- Mandatory IPSec support
- In practice, TLS too, i.e. most security is in hands of security protocols, not DIAMETER
- Dynamic discovery
- Better proxy and roaming support
- Universal support is some years away
- However: wireless initiatives may need DIAMETER in the core, more like a true authentication router
33Federated Wireless Network Authentication
- First contact: Eduroam, Sep 2004
- TERENA initiative to support roaming in European NREN community
- Eduroam contacts led us to I2 SALSA-Netauth
- FWNA subgroup: milestone 0 of similar project
- We all have the same RADIUS architecture
- Eduroam is 6-12 mos ahead of ESnet technologically
- SALSA, Eduroam, and ESnet RAF all have some orthogonal components
34FWNA
- http://wiki.netcom.duke.edu/twiki/bin/view/NetAuth/WebHome
- Analysis and proposal toward a pilot and eventual implementation to support network access for visiting scholars among federated institutions.
- Subgroup of Internet 2 SALSA-Netauth
35FWNA (Project outline)
36Eduroam
- TERENA TF-Mobility
- New site: http://www.eduroam.org
37Project plan
- Build on feasibility study
- Pilot: 3 core, 3 edge sites (+ others)
- Engineering studies
- Data replication
- Contingency: RSA/SecurID support
- Federation build out
- Application support: selected apps
- One internal application
38What Will It Take?
- 6 Infoblox HA pairs: 12 units at 12k each
- 150K
- 3 core pairs + 3 edge pairs
- Support: misc. servers, 20k
- Travel: training/conferences, 20k
- 3 FTE (750K)
- Developer: 1.25 FTE
- Engineering cases support
- Replication services
- Selected application development
- Deployment: 1.25 FTE
- RADIUS configuration management
- VPN / IPSec management
- Support
- Federation: 0.50 FTE
- National team coordination
- Outreach
- Contingency: need access to SecurID for 1 year, cost unknown
- Ongoing: expect about 0.5 FTE plus Federation 0.5 FTE indefinitely
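The figures on this slide can be cross-checked with a little arithmetic. The per-FTE cost and the grand total below are derived from the slide's numbers and are not stated in the source; the SecurID contingency is left out because its cost is unknown.

```python
# All amounts in thousands of dollars (K), taken from the slide.
ha_pairs = 3 + 3                 # 3 core pairs + 3 edge pairs
units = ha_pairs * 2             # each HA pair is two appliances
hardware_k = units * 12          # 12 units at 12k each

fte = {"developer": 1.25, "deployment": 1.25, "federation": 0.50}
total_fte = sum(fte.values())
cost_per_fte_k = 750 / total_fte  # implied by "3 FTE (750K)"

# The slide budgets 150K for hardware, slightly above the 144K unit cost.
total_k = 150 + 20 + 20 + 750    # hardware + servers + travel + staff

print(units, hardware_k, total_fte, cost_per_fte_k, total_k)
```

The three staffing lines do add up to the 3 FTE quoted, and the 150K hardware line leaves a small margin over twelve units at list price.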
39What Will It Take? (2)
- New initiatives: after 1st year
- Eduroam / I2 FWNA: 1.5 FTE
- 1.0 Deployment
- 0.5 Federation burden
- DIAMETER
- Initial deployment cost
- Reduce maintenance as IPSec and discovery simplify
- KERBEROS
- 0.25 FTE: feasibility and implementation estimate
40RAF Pilot
[Figure: RAF pilot topology. Labels: ESnet RAF Federation; External Hierarchy; Lab1, Lab2, Lab3, Lab4, each with an OTP Service; FreeRADIUS; RAF Realms; All RAF Realms. Legend: R1 = master; R2, R3 = slaves; rE = edge.]
41RAF Pilot engineering
- Replication
- Presentation: is the core round robin, or distinct nodes?
- Default and filtering: new info from RADIATOR and Infoblox
- Error conditions and reporting
- VPN/IPSEC
- Lights out / Colocation configuration
- Edge site / customer configuration
42Applications
- What is the most useful application?
- sshd
- ESnet application: roaming support?
- Complication: trust domains
- GIRAF: NERSC, Fusion Grid
43Federation
- Build out
- Rules / policies
- Role of ESSC in oversight / reporting
44Conclusion
- Discuss
- Level of Commitment
- Direction
- Project plan / management
- http://www.es.net/raf