Title: Advanced Active Directory Design and Troubleshooting
1Advanced Active Directory Design and Troubleshooting
- Ed Whittington
- Principal Software Engineer
- HP Business Critical Call Center
- Oct. 06, 2002
2Topics
- Troubleshooting Basics
- Troubleshooting Tools
- DNS Troubleshooting
- Troubleshooting Replication
- Troubleshooting DCPromo
- Troubleshooting FRS Replication and DFS
- Troubleshooting Group Policy
- Troubleshooting in .NET
3Troubleshooting Basics
4Basic Troubleshooting Steps
- Define the problem (make sure there is one)
- What's failing?
- Client authentication and security
- Group policy application.
- Replication.
- Name resolution.
- Errors and warnings in event logs.
- FRS/DFS
- Application
- How is the problem replicated?
- One or multiple machines?
- Narrow the variables
5Basic Troubleshooting Steps
- MPSReports_DS (from HP or Microsoft)
- Get the Log files
- Event logs
- http://www.eventid.net
- %windir%\debug\usermode\Userenv.log
- %windir%\debug\DCPromo.log
- Turn on Verbose Logging
- Run NetDiag, DCDiag (verbose)
- Get status report from Replication Monitor.
6Basic Troubleshooting Steps
- Check DNS.
- Resolver on ALL computers.
- Name Server Properties (forwarding, etc.).
- Monitoring tab: test name resolution.
- Nslookup, ping to test name resolution.
- Ping SRV records.
- Check Replication.
- Force replication.
- Identify who isn't replicating to whom.
- Outbound vs. inbound.
7Basic Troubleshooting Steps
- If all else fails, try demoting.
- Really cleans up a lot of problems if the problem is isolated to one DC.
- If replication isn't working, demotion won't work.
- Reinstall to remove the AD, then clean up AD
- Ntdsutil to remove server object.
- Delete server object from Sites and Services.
- Delete FRS server object from System container.
- Can manually demote a DC.
8Manual Demotion of a DC
- HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ProductOptions
- ProductType
- ServerNT (when the computer is a Member Server)
- LanManNT (when the computer is a Domain Controller)
- Change from LanManNT to ServerNT
- It's now a dirty member server
- Clean server objects from the AD (Ntdsutil)
- Clean up the disk and Registry
- Create new Forward Lookup Zone: Bogus.com
- Run DCPromo: create new forest for Bogus.com
- Demote and eliminate Bogus.com
- Wait for Replication
- Promote back into domain; use same name if desired
- Tool in Windows .NET
9Troubleshooting Tools
Gathering Information
10Netdiag.exe
- NETDIAG.EXE
- /v - verbose; always turn this on.
- /l - log; writes netdiag.log to default directory.
- /d:<domain name> - finds a DC in that domain.
- /test: - runs only specified tests.
- /skip: - skips specified tests.
- Can't execute remotely.
- C:\>netdiag /v /l
11Netdiag.exe
- Domain Controller Discovery
- Bindings, IP address, Default Gateway tests
- DNS tests
- NBTstat and WINS ping
- Netstat
- Route
- Trust
- Kerberos
12Dcdiag.exe
- DCdiag /v
- Domain controller functions of netdiag
- More domain-specific
- FSMO roles
- Connectivity
- Replications
- Domain controller locator
- Intersite health
- Topology integrity
13Nltest.exe
- /server:<servername> - Sets default server
- /dsgetdc:<domainname> - Dsgetdcname API
- /gc /timeserv /ldap
- /dclist:<domainname> - Lists DCs in domain
- /parentdomain - Lists parent domain
- /dsgetsite - Lists site of server
- /dsgetsitecov - Lists DC covering site
- /dcname:<domainname> - Lists PDC for domain
- /dcpromo - Tests potential success of DCPromo
- /whowill:<domain> <user> - Returns name of DC that will authenticate user
14Netdom.exe
- /join
- /add
- /reset
- /resetpwd
- /query FSMO
- /trust
15NTDSUtil
- Built-in utility.
- Directly accesses Active Directory.
- Authoritative Restore.
- Can restore an older version of the AD and force it on all DCs to correct a variety of problems.
- Entire AD or single tree.
- Can't restore the schema.
- FSMO Roles.
- List, Transfer, Seize roles.
- Better than UI: can manipulate all roles in forest and all domains from one utility.
16NTDSUtil
- Metadata Cleanup
- Delete orphaned objects.
- Servers
- Domains
- The UI can and will lie to you! Don't trust it.
- Useful tool for listing contents of the AD
- Sites, domains, servers, FSMO role holders.
- Domains in site.
- Servers in domain, servers in site.
- Q216364, Q216498, Q230306
17Gpresult.exe
- Run on client
- Returns
- Security group membership
- User and Computer policy info
- GPOs applied to each
- Registry settings set in the GPO
- Client-side extensions set
- Scripts applied
- Remember
- Policy is cached: reboot / log in to clear
- Note who the authenticating server is
- Environment variable: %LOGONSERVER%
- Much Improved in .NET!
18GPOtool.exe
- Run on domain controller.
- Returns
- Analysis of all GPOs in domain.
- GUID and friendly name of all GPOs.
- DS and Sysvol versions.
- Errors encountered.
- Good group policy troubleshooting tool.
- May take a long time to process (GPOs)
19ADSIedit.exe
- GUI much like Users and Computers snap-in with Advanced Features.
- Graphical view of AD.
- Like LDP.exe but
- Easier to browse.
- Can modify attribute values
- Don't confuse with Users and Computers!
20LDP.exe
- Takes time to set up
- Connect
- Bind
- View Tree
- Enter DN to start (blank for default)
- Exposes attributes quickly, easy to see.
- Faster than ADSIedit: no GUI to traverse.
- LDAP searches.
- Can delete and modify, but not as easy as ADSIedit.
- Can execute remotely.
21DCPromo.log, DCPromoui.log
- Located in %systemroot%\debug.
- Logged every time dcpromo runs.
- DCPromo.log
- Shorter.
- Appended (read bottom up).
- DCPromoUI.log and DCPromoUI.xxxx.log
- Results of what is seen in the UI; longer.
- Find results of DsGetDcName, DNS query, time service sync, authentication, replication, site info.
- Error (0x0) = success, no error.
- Error reporting differs: read both logs.
22Userenv.log
- Located in %systemroot%\debug\usermode
- User environment info
- Group policy (registry)
- Client side extensions
- Scripts
- Security
- Increase verbose logging (Q221833)
- Take time to read and study it, and you may be surprised at what you can find!
23Additional User Mode Logs
- Client-side extensions
- Registry: see Q216357
- HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\GPExtensions
- Errors created in %windir%\debug\usermode
- Named after the .dll
- Scripts: gptext.dll, gptext.log
- Folder Redirection: fdeploy.dll, fdeploy.log
- Security: scecli.dll, winlogon.log
- Q245422
- Produced automatically on error (except winlogon.log)
- Check the usermode directory for these files
- Invaluable in debugging. Use them!
24Client Side Extensions (registry)
25Windows .NET Troubleshooting Tools
26Remote Desktop Resource Redirection
- Client resources available when using Terminal Services Remote Desktop
- File System: local drives and network drives on the local machine are available on the remote machine
- Audio: audio streams such as .wav and .mp3 files can be played through the client sound system.
- Port: applications have access to the serial and parallel ports
- Printer: the default local or network printer on the client becomes the default printing device for the Remote Desktop.
- Clipboard: the Remote Desktop and client computer share a clipboard
- Terminal Services Virtual Channel: Application Programming Interfaces (APIs) are provided to extend client resource redirection for custom applications.
27WMI
- Computer management
- Active Directory
- Provider: MicrosoftActiveDirectory
- Classes
- Replication - see replprov.mof in %windir%\system32
- Trust health
- Provider: MicrosoftHealthMonitor
- Classes: see system32\wbem\trusthm.mof
- DNS
- Provider: MicrosoftDNS
- Classes: system32\wbem\dnsprov.mof
- Cluster
- MSCluster
- Also look in CIM Studio in MSDN
28WMIC Sample Commands
- Look in %windir%\system32\wbem .mof files for names of providers, classes, etc.
- Active Directory
- Provider: MicrosoftActiveDirectory
- wmic /namespace:\\root\microsoftactivedirectory path msad_replneighbor
- (shows replication partners)
- wmic /namespace:\\root\rsop\user path RSOP_GPO
- (lists GPOs with User settings)
29Admin Tool Improvements
- Users and Computers snap-in
- Drag and drop.
- Multi-select and edit user objects.
- Heavily revised object picker.
- Users and Computers, Sites and Services, DNS snap-ins
- Saved queries.
- Viewing saved DS, DNS, FRS event logs on non-DCs!
- .NET Adminpak (only on XP)
30Command Line Tools
- GPresult
- Enhanced reporting
- DCDiag
- dcdiag /test:DCPromo
- Repadmin: enhanced reporting
- Netdom computername: for DC rename
- Others
- Shipped on
- Service Pack 2 CD (install manually)
- .NET Server, AdvSvr CD
31Windows .NET Improvement to NTDSUtil
- Change offline DS Repair Mode password while online!
- NTDSUtil
- Set DSRM Password (main menu)
- Increases server up-time: limited by password change interval in Win2K.
- (Had to reboot to DS Repair mode to change.)
- Q223301 (Win2K limit)
- Cool error message!
- Setting password failed.
- WIN32 Error Code: 0x6ba
- Error Message: The RPC server is unavailable.
- See Microsoft Knowledge Base article Q271641 at http://support.microsoft.com for more information.
32Errors in Windows .NET: Kinder, Gentler and Report to Microsoft
33Active Directory Load Balancing Tool
- Does the job of branch office deployment.
- KCC chooses BHS for connection objects; it keeps choosing the same one.
- Tool allows you to spread the load to other DCs in the site (that have that NC).
- ADLB tool modifies the hub DCs' replication schedules to spread the load out over time.
- Generates a log like replmon's status log.
- For deployments with hundreds of branch offices all replicating to a single hub.
- Tool has no benefit for sites with only one DC per domain.
34Future Graphical Replication Monitoring Tool
- Very much like Age of Directories
- Ability to make configuration changes
- Not in .NET - maybe Longhorn or Blackcomb?
35Troubleshooting DNS
36DNS Resolver Configuration
- Win2K clients, servers point to Win2K DNS Name Server that is SOA for their zone.
- Don't point to ISP, other internal NS.
- (even as additional.)
- Keep it simple.
- Win2K Name Servers forward to ISP or internal name server hosting registered domain.
37DNS Name Server Configuration Basics
- Dynamic updates: Yes.
- Active Directory Integrated Zone
- Select one Primary
- All other ADI Primary NS point to it for DNS
- Win2k Name Servers can
- Forward to ISP or Internal NS.
- Use root hints (or modify root hints).
- Reverse Lookup Zones NOT required
- Needed only for tools - NSLookup
38ADI Primary and Standard Secondary mixed zone
- Only a DC can host an ADI primary zone
- Member Servers can host Secondary zone
- Synch off of an ADI Primary
(Diagram: three ADI Primary name servers and two Secondary name servers.)
39DNS Case Study
(Diagram: corp.net with na.corp.net, sa.corp.net and eu.corp.net. Labels: Forwarding, Zone xfers, Secondary zones for sa.corp.net and eu.corp.net.)
40DNS Case Study
(Diagram: corp.net, na.corp.net, sa.corp.net and eu.corp.net; eu.corp.net needs to find na.corp.net.)
41With Conditional Forwarding Feature in Windows .NET Server
(Diagram: corp.net, na.corp.net, sa.corp.net and eu.corp.net; query labeled "find na.corp.net".)
42Problem: SRV records only in Root domain
(Diagram: location of SRV, PDC, GC and CNAME records. Zones w2k.net and corp.com; child domains EU.w2k.net and NA.w2k.net; links labeled Zone Xfer and Forwarder.)
43Solution: Delegate _msdcs zone
(Diagram: location of SRV, PDC, GC and CNAME records. corp.com with _msdcs, _tcp, _sites, _udp; w2k.net with _msdcs; child domains EU.w2k.net and NA.w2k.net; links labeled Delegation and Forwarder.)
44DNS Hotfix
- Symptom: Replication breaks
- Configuration: Using Secondary Zones for root _msdcs at child domains.
- Problem: Serial Number of Secondary zone is higher than the primary; zone transfers stop.
- Hotfix: Q304653
- "The Serial Number Is Decremented in DNS When You Reboot"
- Solved in .NET
45DNS Troubleshooting Basics
- Check DNS event log (and others).
- Check Location of DNS servers.
- Usually want Name Server in remote sites.
- Check population of SRV records.
- _msdcs _tcp _udp _sites
- Need Kerberos, LDAP records for each DC.
- Correct address, etc.
- Can delete, repopulate by restarting netlogon.
- Check delegations: correct names, IP.
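As a rough aid for the "check population of SRV records" step, the sketch below builds the locator record names a DC normally registers, so each can be fed to NSLookup. The function name and parameters are illustrative, not part of any tool; the PDC record is only registered by the PDC emulator, and the GC record lives under the forest root.

```python
def expected_srv_records(domain, forest_root, site=None):
    """Return DNS names of locator SRV records to look for (illustrative list)."""
    names = [
        f"_ldap._tcp.{domain}",
        f"_kerberos._tcp.{domain}",
        f"_kerberos._udp.{domain}",
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_kerberos._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.pdc._msdcs.{domain}",      # PDC emulator only
        f"_ldap._tcp.gc._msdcs.{forest_root}",  # global catalogs, forest root zone
    ]
    if site:
        # Site-specific records are what keep clients site-aware.
        names.append(f"_ldap._tcp.{site}._sites.{domain}")
        names.append(f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    return names


if __name__ == "__main__":
    for name in expected_srv_records("corp.net", "corp.net", site="Seattle"):
        print(f"nslookup -type=SRV {name}")
```

Each printed line can be run by hand against the name server under test; a missing answer points at the record to repopulate by restarting netlogon.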
46DNS Troubleshooting Basics
- Use of Active Directory Integrated (ADI) zones.
- Put standard secondary zones on member servers.
- Can clear problems by switching to Standard Primary.
- Ping DC by SRV record
- ping <guid>.site._msdcs.compaq.com.
- Clear the server cache.
- Negative caching problems.
- Test: Server Properties, Monitoring tab.
- Test: ping names, NSLookup.
47Troubleshooting AD Replication
48Replication Troubleshooting Tools
- Event logs: Directory Services, System
- Sites and Services snap-in
- Age of Directories (AOD) HP
- Replication Monitor
- Aelita Event Admin
- NetPro Directory Analyzer
- Command Line (Support Tools, Resource Kit)
- DCdiag, Netdiag
- Repadmin.exe
49Event Logs for Replication Troubleshooting
- Directory Services Log
- 5778 - Subnets not mapped.
- Will break clients' site awareness.
- 1311 - serious - Not enough connectivity.
- Connectivity, traffic issue.
- Sites with DCs and no site links.
- Site topology incorrectly defined.
- DNS Lookup failure.
- 1722 - RPC Server is unavailable.
- Physical connectivity.
- DNS.
50Event Logs for Replication Troubleshooting
- System Log
- Netlogon errors
- Authentication
- Trusts
- Secure channel
- w32Time errors
- Kerberos authentication required for replication
- DCs must be no more than five minutes out of sync.
- Watch time zones!
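The five-minute rule and the time zone warning can be illustrated with a small check. This is a minimal sketch, assuming you have already collected each DC's clock reading as a timezone-aware timestamp; the function name and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)  # Kerberos tolerance quoted on the slide


def within_kerberos_skew(a, b, max_skew=MAX_SKEW):
    """Compare two timezone-aware clock readings as absolute instants."""
    if a.tzinfo is None or b.tzinfo is None:
        # Naive timestamps hide the time zone, which is exactly the trap.
        raise ValueError("timestamps must be timezone-aware")
    return abs(a - b) <= max_skew


if __name__ == "__main__":
    pacific = timezone(timedelta(hours=-8))
    dc1 = datetime(2002, 10, 6, 9, 0, tzinfo=pacific)        # 17:00 UTC
    dc2 = datetime(2002, 10, 6, 17, 3, tzinfo=timezone.utc)  # 17:03 UTC
    print(within_kerberos_skew(dc1, dc2))
```

The wall-clock readings differ by eight hours, but the instants are three minutes apart, so the pair is in tolerance. A DC set to the right wall-clock time in the wrong zone fails the same check.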
51Sites and Services Snap-in
- Check for duplicate connection objects.
- KCC generating >1 connection between 2 DCs.
- Delete all connections and select the Check Replication Topology option to regenerate them.
- If they come back, find out why.
- Usually a DNS problem.
- Breaks FRS and AD replication.
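Spotting duplicates by eye gets tedious with many DCs. The sketch below assumes you have exported the connection objects as (source DC, destination DC) pairs, for example by copying them out of Sites and Services; the input format and function name are assumptions.

```python
from collections import Counter


def duplicate_connections(connections):
    """Return the (source, destination) pairs that occur more than once."""
    counts = Counter(connections)
    return sorted(pair for pair, seen in counts.items() if seen > 1)


if __name__ == "__main__":
    exported = [
        ("DC1", "DC2"),
        ("DC2", "DC1"),  # the opposite direction is a separate connection
        ("DC1", "DC2"),  # second inbound connection to DC2 from DC1
    ]
    print(duplicate_connections(exported))
```

Connections are directional, so DC1 to DC2 and DC2 to DC1 are not counted as duplicates of each other.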
52Sites and Services Snap-in
- Check for sites with no DCs
- OK to have a site with no servers if you plan it that way.
- If there should be a server in that site, find it and move it there.
- Make sure all subnets are mapped to correct sites.
- Keep up on IP addressing changes.
53Sites and Services Snap-in
- Make sure site links are correct.
- Link correct sites per design (need a drawing).
- Cost, schedule, replication frequency.
- Force replication between DCs.
- All connections are inbound.
- Use check replication topology.
- Create new site, user named for the DC.
- Checks Configuration NC and Domain NC.
- Force Replication Between Replication Partners.
- On DC1 from DC2 and on DC2 from DC1.
54Sites and Services Snap-in
- Validate inbound, outbound replication on all DCs.
- Create new site, user named for the DC.
- Checks Configuration NC and Domain NC.
- Wait for replication (don't force it).
- Check each DC for copy of these users, sites.
(Diagram: DC1, DC2 and DC3 with the User and Site objects found on each. One DC holds the objects for DC1 and DC3, one holds DC1, DC2 and DC3, and one holds DC2 and DC3.)
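The marker-object test above reduces to a small bookkeeping exercise once you have noted which markers each DC holds. The sketch below is a minimal illustration; the data structure, the function name and the assignment of markers to DCs in the example are assumptions, since the diagram does not say which DC holds which set.

```python
def missing_replication(seen):
    """List (origin, holder) pairs where holder lacks origin's marker object.

    `seen` maps each DC name to the set of marker names found on that DC.
    """
    dcs = sorted(seen)
    gaps = []
    for holder in dcs:
        for origin in dcs:
            if origin not in seen[holder]:
                gaps.append((origin, holder))
    return gaps


if __name__ == "__main__":
    observed = {
        "DC1": {"DC1", "DC3"},
        "DC2": {"DC1", "DC2", "DC3"},
        "DC3": {"DC2", "DC3"},
    }
    for origin, holder in missing_replication(observed):
        print(f"{holder} has not received changes originated on {origin}")
```

A gap only says the change has not arrived by any path; the broken link may be a hop in between, so follow up with Repadmin /showreps on the holder.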
55Check Cname DNS Records
- In root _msdcs zone (only), alias record mapping DC's FQDN to its server GUID.
- Only one record.
- Delete duplicates.
- Match GUID in alias record to GUID reported by Repadmin /showreps.
- If in doubt, delete the DC's alias record(s) and restart netlogon on the broken DC to re-register.
56Age Of Directories Tool - Demo
- If interested, contact me ed.whittingtonn_at_HP.com
57Replication Monitor
- Status report (replication health report)
- List of all GCs, BHS, Trusts
- List of all replication errors on all DCs in domain
- Changes not replicated
- Replication partners
- Force push/pull replication
- Meta-data
- Group Policy Object status
- FSMO validation
- Inbound connections (including reason)
58 Replication Monitor
59Command-Line Utilities
- RepAdmin
- In Support Tools.
- Perhaps the most useful tool for troubleshooting replication.
- /showreps - lists inbound, outbound connections.
- Only one to list outbound connections.
- Lists Server GUID (used for replication).
- Lists successful replication messages.
- Lists replication errors.
- Lists replication partner used to replicate every naming context, inbound and outbound.
60NTDS Diagnostic Logging
- HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics
- Set value 0-5
- 0 = off, 5 = very verbose
- Start with 3
- Reported in Event log
- Important Values
- 1 Knowledge Consistency Checker
- 13 Name Resolution
- 5 Replication Events
- 8 Directory Access
- 9 Internal Processing
- 18 Global Catalog
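Setting several diagnostic values by hand is error-prone, so one option is to generate a .reg file and review it before importing. The sketch below only builds the text; the helper name is hypothetical, and the value names are the ones listed above.

```python
DIAG_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics"


def diagnostics_reg(levels):
    """Build .reg file text that sets NTDS diagnostic logging levels (0-5)."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{DIAG_KEY}]"]
    for name, level in levels.items():
        if not 0 <= level <= 5:
            raise ValueError(f"level for {name!r} must be between 0 and 5")
        # REG_DWORD values are written as eight hex digits.
        lines.append(f'"{name}"=dword:{level:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(diagnostics_reg({
        "1 Knowledge Consistency Checker": 3,
        "5 Replication Events": 3,
    }))
```

Generating the same file with every level set to 0 gives a quick way to turn logging back off once the problem is found.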
61Things that break Replication (or indicate that it's broken)
- Duplicate connection objects
- Orphaned objects
- Esp. DC objects, caused by a DC being removed from the domain without successful DCPromo.
- Garbage Collection initiated manually before all DCs and GCs are fully replicated.
- Reported in event logs.
62Things that break Replication (or indicate that it's broken)
- DC unavailable
- Down
- Name Resolution
- Network problem
- DNS misconfigured
- TCP/IP addresses change
- Delegation
- Client resolver configuration (including name servers)
- DHCP scope configuration for DNS registration
- Failure to contact a DNS server (for SRV records)
63Things that break Replication (or indicate that it's broken)
- KCC doesn't do its job
- Routes around inaccessible DCs by creating duplicate connection objects.
- When DCs come back on line, KCC should clean up the duplicate connection objects.
- Usually doesn't
- Causes replication errors.
- Events in the DS Log.
- Need to clean them up manually.
64Lingering Object Behavior
65Object Deletions
- Deleted objects turn into tombstones
- Tombstones replicated to other DCs
- This is how replication partners learn that an object was deleted
- Tombstones purged from local database after tombstone lifetime has expired
- AD: 60 days, adjustable (2 days minimum)
- Sysvol: 60 days
- If tombstone does not replicate to a DC, object deletion is not replicated
- Object not deleted on this DC
- Object is now a Lingering Object
- Can be on DC or GC
- Rule: tombstone lifetime
- Max time DC can be disconnected
- Max lifetime of backup tape
66Lingering Objects Scenarios
- Deleted object re-appears on all domain controllers in a domain and on all GCs
- Deleted account does not disappear from Exchange GAL
- Object was moved between domains and disconnected GC is brought online
- Replication error on GC when new object is created
- Lingering object still holds attribute where uniqueness is enforced (samAccountName)
- Exchange cannot create mailbox because object already exists
67Why does this Happen????
- DCs disconnected for more than tombstone lifetime
- Left in storage room for long time
- Replication failures
- E.g., bridgehead servers overloaded, no monitoring in place
- WAN connections down for a long time
- Tombstone lifetime abuse
- Somebody changed time on a DC to garbage collect an object
- Tombstone lifetime was changed to garbage collect objects on single servers
- Can this be avoided?
- YES, monitor KCC topology and replication
- Do not set tombstone lifetime to less than 60 days
- DCs offline > tombstone lifetime must be re-promoted
68Lingering Objects: Strict vs. Loose Replication Behavior
- Replication Behavior
- Defines how DC reacts if an update for an object is replicated in, and the object does not exist on DC
- Loose Behavior
- DC requests full copy from replication source
- Logs event ID 1388
- Strict Behavior
- DC stops replication from offending replication source
- Logs error code 8240 (ERROR_DS_NO_SUCH_OBJECT) embedded in event ID 1084
- Requires logging level 1
- Behavior can be set via registry key
- HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\NTDS\Parameters\Strict Replication Consistency
- Introduced in Q314282
69Deleting Lingering Objects
- If found on a DC
- In loose behavior: delete the object via Users and Computers
- In strict behavior: follow procedures outlined in Q314282
- On GC (in read-only NC)
- Object cannot be changed or deleted on GC
- Solution 1: delete object on writeable replica (if possible)
- Solution 2: use ldp to delete the object on the GC
- Support to remove lingering objects from GC added in Q314282
- Follow procedures outlined in Q314282
- You might have to set loose behavior temporarily
70Best Practice Recommendations
- DC has not replicated for more than 60 days
- Tombstone lifetime default (60 days)
- Do not replicate, re-install OS
- Tombstone lifetime adjusted to > 60 days
- 60 days < time DC disconnected < tombstone lifetime
- Re-connect DC, restore sysvol
- Time DC disconnected > tombstone lifetime
- Do not replicate, re-install OS
- If you have to disconnect a DC
- Make sure that it replicates successfully before you take it off-line
- New deployments
- Add registry key to enforce strict replication behavior at DC OS installation time
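The disconnected-DC rules above can be written as one small decision function. This is a sketch of the slide's logic only: the function name is hypothetical, the "re-connect DC" outcome for short outages is implied rather than stated, and how to treat an outage of exactly 60 days or exactly the tombstone lifetime is an assumption.

```python
SYSVOL_TOMBSTONE_DAYS = 60  # Sysvol tombstone lifetime quoted earlier in the deck


def reconnect_advice(days_disconnected, tombstone_lifetime_days=60):
    """Suggest what to do with a DC that has been offline for some days."""
    if days_disconnected > tombstone_lifetime_days:
        # AD tombstones may already be purged: lingering objects are possible.
        return "do not replicate; re-install OS"
    if days_disconnected > SYSVOL_TOMBSTONE_DAYS:
        # Only reachable when the AD tombstone lifetime was raised above 60.
        return "re-connect DC; restore sysvol"
    return "re-connect DC"  # implied by the slide, not stated


if __name__ == "__main__":
    print(reconnect_advice(75))                              # default lifetime
    print(reconnect_advice(75, tombstone_lifetime_days=90))  # adjusted lifetime
```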
71More Best Practice Recommendations
- Existing deployments
- Default setting: loose replication (even on SP3)
- Goal: get to strict mode asap
- Set registry key to strict mode on all DCs
- Watch event logs on DCs
- If you get many replication errors on single DCs, re-promote DC
- For small number of replication errors, clean up the DC
- Delete lingering objects if necessary
- Follow procedures outlined in Q314282
- If you were monitoring
- Then don't worry, you won't see any replication errors
- Don't lower tombstone lifetime to less than 60 days
- Monitor!
72Lingering Object Fix
- Q317097 (good instructions)
- HKLM\System\CurrentControlSet\Services\NTDS\Parameters
- Add Value Name: Correct Missing Object
- Data Type: REG_DWORD
- Value: 1 (tight)
- 0 (loose)
- Allows or restricts AD replication when lingering objects are discovered.
- Tight: when you want to know.
- Loose: to inventory and remove the objects.
73Value Level Replication
- WNT: Object replication
- Change to attribute or value
- W2K: Attribute-level replication
- Better than NT (more efficient)
- Change to attribute replicates attribute
- Change to value replicates attribute
- Problem: Multi-valued attributes
- Group: attribute
- Member: value
- Change member: replicates attribute with all members
- Impacts network traffic
- Limit (per Microsoft) of 5,000 users/group
- .NET: Value-level replication
- Replicates values, not attributes
- Eliminates 5,000 user/group limit
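The difference in traffic is easy to see with a toy count of how many member values travel when a group changes. This is an illustrative model only, not a measurement of real replication traffic; the function and its parameters are hypothetical.

```python
def values_replicated(member_count, members_changed, value_level):
    """Count member values sent for one group change.

    member_count is the group size after the change.
    """
    if members_changed == 0:
        return 0
    if value_level:
        return members_changed  # only the changed values are sent
    return member_count         # the whole member attribute is sent


if __name__ == "__main__":
    print(values_replicated(5000, 1, value_level=False))  # attribute-level
    print(values_replicated(5000, 1, value_level=True))   # value-level
```

Adding one user to a 5,000-member group sends all 5,000 values under attribute-level replication and a single value under value-level replication.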
74Domain Limit
- There is a limit of about 800 child domains to a single parent
- Child domains are unlinked, multi-valued attribute stored in the crossref attribute of the domain object
- Jet database limits the data that can be stored. No way to patch; must change Jet
- Might be improved in Longhorn (not Whistler)
75Domain Limit
- One customer got to 900 domains
- Replication failed
- Authentication failed
- Mission critical application failed
- Temporary Repair
- Demote all domains in reverse order of creation to return to 800
- Fixed Replication
- Solution
- Redesigned and redeployed to a single domain
76DCPromo Troubleshooting
77DCPromo Basics
- First Test of
- DNS registration and resolution.
- LDAP query and response.
- Kerberos authentication.
- Active Directory replication.
- FRS replication.
- Application of group policy.
- Validation and Flow
- Chapter 2, Active Directory Data Storage, in the Windows 2000 Resource Kit
78DCPromo Logs
- %windir%\debug
- Dcpromo.log
- Dcpromoui.log
- Dcpromoui.xxx.log
- Set verbosity on dcpromoui.log
- HKLM\Software\Microsoft\Windows\CurrentVersion\AdminDebug
- Values: DCpromo and DCPromoui
- Data
- 380001: Default
- 0xFF003: full file and debugger logging output
- 0xFF001: maximum detail to DCPromoui.log
79DCPromo Phases
- Initialization
- UI Input
- DNS Name resolution
- LDAP Query/resp
- Kerberos Authentication
- AD Replication
- FRS Replication
- Wrap Up
- Apply policy
- Upgrade Trusts
- Publish new DC in the DS
80Initialization Phase
- Authorization error
- Enterprise Admin required to create new domain (or to remove the last one).
- Domain Admin required to add replica DC (or demote a replica).
- Can't find DNS with Dynamic Updates.
- Prompt to let DCPromo configure DNS.
- Creating domain.
- Answer NO!
- Replicas, Child: must find DNS server to locate a sourcing DC.
81Errors Creating the Computer Account
- Need privileges to create the account.
- First creates the account, puts it in domain/computers container.
- Then puts it in domain controllers OU.
- Source DC identified in DCPromo logs.
82DCPromo Initialization Checklist
- Privileges required
- Enterprise Admin if creating new domain.
- Domain Admin if creating a replica.
- System time configured properly
- Kerberos requires sync within five minutes.
- All parent, child domain DCs.
- Sufficient free disk space.
- 850 MB
- Domain Naming Master FSMO required if creating new domain.
83DCPromo Initialization Checklist
- Everyone or Enterprise DC group has Access this computer from network
- Enterprise DC group rights
- Manage Replication Topology.
- Replicating Directory Changes.
- Replication Synchronization.
- Sourcing DC
- Security policy applied.
- Enable Computer and user account to be trusted for delegation.
84DCPromo Initialization Checklist
- Target DC has valid Kerberos tickets.
- Kerbtray.exe utility from Resource Kit.
- GC must be contacted.
- Nltest /dsgetdc:compaq.com /GC
- Able to contact a functional existing DC.
- Uses UDP (watch for firewall issues).
- Can use TCP, but it's a Microsoft secret!
- Use Ping, NLTest, Nslookup to find a DC.
85If Source DC not Reachable...
- See if one responds.
- Ping FQDN of domain (ping compaq.com).
- NLTest /dsgetdc:compaq.com /ds
- Other: /gc /pdc /timeserv
- Check site mapping for this computer.
- Nltest /server:<name> /dsgetsite
- Check Dcpromoui.log to see source.
- Force DCPromo to use a specific source
- Q224390
- Turn off Netlogon on other DCs.
- Join the server to the domain, then DCPromo.
86Info to Collect for Debug
- Netdiag /v
- Problem DC
- Source DC (see dcpromo.log)
- DCDiag /v
- Source DC
- Replication working? (other DC in site)
87AD FRS Replication Phases
- Initially inbound connection created to replicate from source DC.
- Machine acct (DC1) moved to DC OU.
- UserAccountControl attribute set
- 4096 (1000 hex): Workstation/Server
- 532480 (82000 hex): DC
- Account is moved.
- Error: DC1 not found, access denied, etc.
- Credentials of account running Dcpromo
- Source must have computer object.
- Source must have security policy applied to itself.
- Q250874
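The two UserAccountControl values above are bit masks, and decoding them shows why 532480 marks a DC. The sketch below covers only the three flags needed for these values; the helper name is illustrative.

```python
# Subset of the documented userAccountControl flag bits.
UAC_FLAGS = {
    0x1000: "WORKSTATION_TRUST_ACCOUNT",
    0x2000: "SERVER_TRUST_ACCOUNT",
    0x80000: "TRUSTED_FOR_DELEGATION",
}


def decode_uac(value):
    """Return the names of the known flags set in a userAccountControl value."""
    names = [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]
    known = 0
    for bit in UAC_FLAGS:
        known |= bit
    leftover = value & ~known
    if leftover:
        names.append(f"other bits 0x{leftover:x}")  # flags outside this subset
    return names


if __name__ == "__main__":
    print(4096, decode_uac(4096))
    print(532480, decode_uac(532480))
```

A computer account that still reads 4096 on the source DC after promotion has not been flagged as a domain controller yet, which ties in with the missing Sysvol share symptoms.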
88AD FRS Replication Phases
- After first reboot
- Outbound connection created.
- AD changes for new DC replicated to source.
- Including UserAccountControl attribute.
- Server (Replication) object.
- Replicated to other DCs.
- Sysvol is populated (policies copied to new DC).
- Sysvol and Netlogon Shares created.
89Troubleshooting Missing Sysvol, Netlogon Shares
- Outbound connection failed
- Look in Sites and Services or Repadmin
- UserAccountControl still 4096 on source
- Q257338: good, but
- Build manual outbound connection
- Force KCC to Check Replication Topology
- Check UDP traffic if in a remote site.
90Missing Sysvol and Netlogon Shares
- Create replication links manually, then force replication
- Repadmin /add (adds outbound link)
- Repadmin /sync (forces replication)
- Can't create the shares manually. When replication is fixed, they'll get created.
91Tracking Down a GUID
- Problem: GUID referenced in event log. What is it?
- Solution (Q216359)
- LDP search for the GUID
- Search.vbs in Support tools
- Orphaned Object (will kill replication)
- Turn up NTDS diagnostic logging
- Internal processing
- Replication
- Find object (GUID) in event logs
- Delete it via LDP
92DCPromo Improvements in Windows .NET
93Install From Media (IFM)
- Source Replica AD from Media in DCPromo
- GCs or DCs (Replica only).
- No initial replication from a DC.
- Faster (no searching for a DC).
- Less network impact (No full sync on the WAN).
- Easy branch office installation.
- After initial load, replicates changes.
- Network connectivity still required.
- Unattended Answer File Support
- ReplicateFromMedia
- ReplicationSourcePath
94Install From Media (IFM)
- Unattended Answer File Support
- ReplicateFromMedia
- ReplicationSourcePath
- Media must be local drive.
- Media useful life < 60 days.
- How? Use Backup Files/Media
- Create first DC in domain.
- Back up DC.
- Restore to Media (local disk, CD, ...).
- C:\>dcpromo /adv
- Wizard produces an additional screen
95(No Transcript)
96DCPromo Answer File
- See Q223757
- [Unattended]
- Unattendmode=fullunattended
- [DCINSTALL]
- UserName=administrator
- Password=Password3
- UserDomain=corp.net
- DatabasePath=c:\windows\ntds
- LogPath=c:\windows\ntds
- SYSVOLPath=c:\windows\sysvol
- SafeModeAdminPassword=Password2
- CriticalReplicationOnly
- SiteName=Seattle
- ReplicaOrNewDomain=Replica
- ReplicaDomainDNSName=corp.net
- ReplicationSourceDC= (! Leave this blank for IFM)
- ReplicateFromMedia=yes
- ReplicationSourcePath=e:\DSrestore
- RebootOnSuccess=yes
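For branch office rollouts it can help to generate one answer file per site rather than edit copies by hand. The sketch below renders the keys shown on the slide with Python's configparser; the function name and parameters are hypothetical, and whether DCPromo accepts every key exactly as rendered should be checked against Q223757.

```python
import configparser
import io


def build_ifm_answer_file(domain, site, media_path, admin_password, dsrm_password):
    """Render a DCPromo answer file for an Install From Media replica."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.optionxform = str  # keep the key capitalization used on the slide
    parser["Unattended"] = {"Unattendmode": "fullunattended"}
    parser["DCINSTALL"] = {
        "UserName": "administrator",
        "Password": admin_password,
        "UserDomain": domain,
        "DatabasePath": r"c:\windows\ntds",
        "LogPath": r"c:\windows\ntds",
        "SYSVOLPath": r"c:\windows\sysvol",
        "SafeModeAdminPassword": dsrm_password,
        "CriticalReplicationOnly": None,  # listed without a value on the slide
        "SiteName": site,
        "ReplicaOrNewDomain": "Replica",
        "ReplicaDomainDNSName": domain,
        "ReplicationSourceDC": "",        # left blank for IFM
        "ReplicateFromMedia": "yes",
        "ReplicationSourcePath": media_path,
        "RebootOnSuccess": "yes",
    }
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_ifm_answer_file("corp.net", "Seattle", r"e:\DSrestore",
                                "Password3", "Password2"))
```

The passwords are the placeholder values from the slide; a real answer file holds credentials in clear text and should be deleted after use.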
97File Replication Service (FRS) Basics
98FRS Background
- File Replication Service
- Replicates file system portion of policy
- Optional replication engine for DFS
- Concepts
- Challenges
- Journal wraps
- Staging File backlog
- Reconciliation / Morphed Directories
99Concepts
- Objects in DS
- Members, Subscribers, Conn. objects, filters
- Depends on AD replication
- Determines partners and schedule
- NTFS USN Journal
- Used by FRS to track changes to NTFS volumes
- Staging File and Directory
- Rename safe
- Compression support
- Database
- Record of incoming, outgoing existing files
100File Replication Service (FRS)
- Replaces NT 3.x/4.0 LMREPL service
- Replicates System Policy, Group Policy, DFS
- Group policy templates
- Ntconfig.pol, logon scripts for down-level clients
- NETLOGON share
- DFS share contents
- Multi-threaded replication engine
- Replicate different files to different computers simultaneously.
101Terminology
- Computer A and B replicate DFS/SYSVOL
- B is computer A's outbound partner
- A is B's inbound partner.
- A is B's upstream partner
- Changes flow downstream to B
(Diagram: replication flows downstream from Computer A, B's inbound partner, to Computer B, A's outbound partner.)
102Basic Operation
103File and Folder Filters
- Excluded from FRS Replication
- Computer specific EFS files/folders
- File names beginning with ~
- Files with .bak or .tmp extensions
- NTFS Mount Points
- Reparse points
- Configurable for DFS shares
104The Replication Process
AD Object version updated
DC1
\winnt\sysvol\sysvol\compaq.com\policies
\winnt\sysvol\staging\domain
\winnt\sysvol\staging areas\compaq.com
Notify Partners
105The Replication Process
DC2
Pull
Sysvol version of GPO updated
DC1
\winnt\sysvol\sysvol\DO_NOT_REMOVE_ntfrs_PreInstall_Domain
\winnt\sysvol\sysvol\compaq.com\policies
106FRS Replication
- Observe File Replication Process
- Edit a group policy: modify and save it.
- Copy of changed file goes to staging and staging areas directories.
- Copied to staging/staging areas directories on other DCs.
- Moved to sysvol\sysvol directory on the DC.
- Group policy file is updated.
107Distributed File System (DFS)
108DFS Basics
- Domain-based (Win2K) vs Standalone (NT)
- Root
- Must be on a DC.
- Contains PKT.
- DFS service.
- Replica
- PKT from DC, stored locally.
- DC or Member Server.
- FRS Replicates Data between DCs
- Member servers: DFS replicates data to share via DFS service.
- Site aware (clients locate closest DFS replica)
109The DFS Replication Process
DC1 - Root
DFS service
FRS
SVR1 Replica
SVR2 Replica
DC2 Replica
110DFS Troubleshooting
- Symptom: Shared folders not in sync.
- Make sure DFS service is started on all servers and DCs.
- Make sure AD Replication is working.
- Make sure FRS is working.
- DFSUtil.exe.
- Watch for applications that keep files open.
- Anti-virus.
- Defragmenters.
111FRS Troubleshooting Techniques
112Basics
- Remember
- You MUST install latest service pack and hot fix.
- Post SP2 (SP3) hot fix: Q307319
- Don't go any further until this is installed.
- Multi-master characteristics: replicates changes (and problems) quickly. Turn off the FRS service to get control.
- FRS depends on AD replication, which depends on DNS.
113Diagnostic Tools
- Event Viewer: FRS log, DS log
- NTFRSutl.exe
- /outlog outbound logs
- /inlog inbound logs
- /ds directory service
- NTFRSxxx.log in \winnt\debug
- NTFRS Health Check utility
- HP, Microsoft
- Netdiag, DCDiag
- AD replication tools
114FRS Replication
- What happens if it breaks?
- Changes not replicated to all DCs, resulting in inconsistent AD
- Group policy gets out of sync and may not get applied.
- GPOTool: version mismatch
- Logon scripts don't get applied.
- DFS shares out of sync.
115FRS Replication
- How to tell if it's broken
- Events in FRS log
- Event 1000, 1001 in app log every five minutes.
- Files backed up in staging areas
- Get size of staging directories (MB).
- Get date of oldest file (how long it has been broken).
- Group Policy not applied (new changes)
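The two staging-area checks above (total size in MB, date of the oldest file) can be scripted. The following is a minimal sketch, not a tool from the presentation; the function name `staging_stats` is hypothetical, and the staging path on a real DC has to be supplied by the administrator.

```python
import os
import tempfile
import time


def staging_stats(path):
    """Return (total size in MB, mtime of the oldest file or None) for a directory tree."""
    total_bytes = 0
    oldest = None
    for root, _dirs, files in os.walk(path):
        for name in files:
            info = os.stat(os.path.join(root, name))
            total_bytes += info.st_size
            if oldest is None or info.st_mtime < oldest:
                oldest = info.st_mtime
    return total_bytes / (1024 * 1024), oldest


if __name__ == "__main__":
    # Demo against a throwaway directory; on a DC, pass the real staging path instead.
    with tempfile.TemporaryDirectory() as demo:
        with open(os.path.join(demo, "example.tmp"), "wb") as fh:
            fh.write(b"x" * 2048)
        size_mb, oldest = staging_stats(demo)
        age_hours = (time.time() - oldest) / 3600 if oldest else 0.0
        print(f"{size_mb:.3f} MB staged, oldest file is {age_hours:.1f} hours old")
```

A staging area that keeps growing, or whose oldest file is days old, suggests how long replication has been stalled.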
116Replication Problems
- Ensure DNS is working.
- DNS Lookup Failures in events (description).
- Ping, Nslookup to resolve names.
- Domain name
- DC, Server names
- Ensure AD Replication is working.
- Create New Objects and see if they replicate.
- Repadmin /showreps and /showconn
- DS Event Log
- DCDiag
117Replication Problems
- Staging Areas should have no files
- Common FRS problem.
- Check size of dir, date of files.
- Ensure FRS is working.
- Create text file on each DC, named for the DC.
- Put it in \winnt\sysvol\sysvol\<domain name>.
- All DCs should have a copy of every DC's text file.
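The text-file test above amounts to a set comparison: every DC should hold one file per DC. The sketch below assumes the files are named `DCNAME.txt` and that the directory listings have already been collected into a dictionary; both the naming and the function `missing_canaries` are illustrative assumptions.

```python
def missing_canaries(seen_by_dc):
    """Report which per-DC text files have not replicated to which DCs.

    seen_by_dc maps a DC name to the set of file names found in its Sysvol
    domain folder. One file named 'DCNAME.txt' per DC is expected (assumed naming).
    Returns {dc: sorted list of missing files}, only for DCs missing something.
    """
    expected = {f"{dc}.txt" for dc in seen_by_dc}
    report = {}
    for dc, seen in seen_by_dc.items():
        missing = sorted(expected - set(seen))
        if missing:
            report[dc] = missing
    return report


if __name__ == "__main__":
    observed = {
        "DC1": {"DC1.txt", "DC2.txt", "DC3.txt"},
        "DC2": {"DC1.txt", "DC2.txt"},  # DC3's file never arrived here
        "DC3": {"DC1.txt", "DC2.txt", "DC3.txt"},
    }
    print(missing_canaries(observed))  # {'DC2': ['DC3.txt']}
```

A DC that is missing another DC's file points at a broken inbound path to it; a DC whose own file is missing everywhere else points at a broken outbound path.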
118Replication Problems
- FRS Event Log
- 13508: Normal, but watch them
- 13509: Success after having 13508s
- 13514: Sysvol share not created; FRS is preventing the computer from becoming a DC
- 13553, 13554: FRS successfully added computer to replica set (DCPromo successful)
- 13557: Duplicate connection objects
- 13522: Staging area full (Q264822)
- Lots of KB articles: search for FRS and Event
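The event IDs above lend themselves to a small lookup table, plus a check for the one pattern the slide calls out: a 13508 that is never followed by a 13509. This is a sketch only; the descriptions are taken from the list above, and how the event IDs are read from the FRS event log is left out (the helper `unresolved_13508` is hypothetical and just takes a list of IDs, oldest first).

```python
# Descriptions copied from the slide; not an exhaustive list of FRS events.
FRS_EVENTS = {
    13508: "Normal, but watch them",
    13509: "Success after having 13508s",
    13514: "Sysvol share not created; FRS is preventing the computer from becoming a DC",
    13553: "FRS successfully added computer to replica set (DCPromo successful)",
    13554: "FRS successfully added computer to replica set (DCPromo successful)",
    13557: "Duplicate connection objects",
    13522: "Staging area full (Q264822)",
}


def unresolved_13508(event_ids):
    """Return True if the sequence (oldest first) ends with a 13508 not yet cleared by a 13509."""
    pending = False
    for event_id in event_ids:
        if event_id == 13508:
            pending = True
        elif event_id == 13509:
            pending = False
    return pending


if __name__ == "__main__":
    history = [13508, 13509, 13508]
    for event_id in history:
        print(event_id, FRS_EVENTS.get(event_id, "See the KB for this event ID"))
    print("Still waiting for a 13509:", unresolved_13508(history))
```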
119Interpreting the Logs: NTFRS_000x.log
- \WINNT\DEBUG
- Identify errors, warning messages and milestone events in the log files
- Very difficult to interpret
120NTFRSutl.exe
- Ntfrsutl inlog: lists inbound log
- Ntfrsutl outlog: lists outbound log
- Ntfrsutl sets: lists replica sets
- Ntfrsutl ds: FRS's view of the DS
- Can execute remotely
- Ntfrsutl sets DC1
121Group Policy Troubleshooting
122Group Policy Troubleshooting Basics
- Policy isn't getting applied
- Set something easy: Admin Templates
- User Settings: log off/on
- Computer Settings: reboot
- Client-side extensions act as separate policies; debug separately from Admin Templates
- Folder Redirection
- Scripts
- Disk Quotas
- Security
- IE Branding
- EFS Recovery
- IPSec
- Application Management
123Group Policy Troubleshooting Basics
- Policy applied, but settings not effective.
- Userenv.log (verbose): Q221833
- Set diagnostic logging: Q186454
- HKLM\Software\Microsoft\Windows NT\CurrentVersion\Diagnostics
- Value: RunDiagnosticLoggingGroupPolicy
- Value Type: REG_DWORD
- Value Data: 3 (value 0-5; 0 = off)
- Change one setting in GPO
- Log off/on or reboot
- Verbose info in Application log
- Lists all registry settings applied to user
- Turn it off afterward: it fills the event log fast!
124Gpresult.exe
- Resource Kit command-line utility.
- Reports applied policy for user, computer.
- DN
- Security groups
- Verbose mode: gpresult /v
- Registry settings
- Computer Client-side extensions.
- WATCH
- Logon server.
- Cached policy on client may mask solution.
- Refresh policy: make sure it's applied.
125GPOtool
- Resource Kit command-line utility.
- Run on DC only.
- Version Comparison AD vs. Sysvol.
- AD version set immediately on change.
- Sysvol version set after FRS Replication.
- Friendly name/GUID association
- Policy 08FAB736-9628-41D5-B5A8-37A0F98D7E43
- Policy OK
- Details
- -------------------------------------------------------------
- DC: Qtest-DC2.qtest.cpqcorp.net
- Friendly name: Folder Redirection Policy
126Solving Version Mismatch
- Small mismatch is normal.
- After change until FRS Replication completes.
- Be patient; see if it resolves.
- Big mismatch is bad.
- Prevents application of policy.
- Unreplicated changes.
- Manually set FRS (Sysvol) version to the AD version.
- %windir%\sysvol\sysvol\<domain>\policies\guid\gpt.ini
- Will lose changes.
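To make the AD-versus-Sysvol comparison concrete, the sketch below reads the `Version=` entry from the text of a gpt.ini file and compares it with an AD version number supplied by the caller. Two things are assumptions not stated on the slides: that gpt.ini keeps the value under a `[General]` section, and that the number packs the user-settings version in the upper 16 bits and the computer-settings version in the lower 16 bits. The function names are hypothetical.

```python
import configparser


def sysvol_version(gpt_ini_text):
    """Extract the Version value from the text of a gpt.ini file (assumes a [General] section)."""
    parser = configparser.ConfigParser()
    parser.read_string(gpt_ini_text)
    return parser.getint("General", "Version")


def split_version(version):
    """Split a GPO version number into (user version, computer version); assumed 16/16-bit layout."""
    return version >> 16, version & 0xFFFF


def compare_versions(ad_version, gpt_ini_text):
    """Compare the AD version number with the one stored in Sysvol's gpt.ini."""
    sysvol = sysvol_version(gpt_ini_text)
    return {
        "ad": split_version(ad_version),
        "sysvol": split_version(sysvol),
        "match": ad_version == sysvol,
    }


if __name__ == "__main__":
    sample = "[General]\nVersion=65539\n"
    print(compare_versions(65540, sample))
```

A difference of one or two right after an edit matches the "small mismatch" case; a large or persistent gap is the case that calls for the manual fix.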
127Resetting Default Domain Policy or Default DC Policy
- These policies always have the same GUID.
- Default Domain: 31B2F340-016D-11D2-945F-00C04FB984F9
- Default DC: 6AC1786C-016F-11D2-945F-00C04FB984F9
- Changes are a mess; need to restore default.
- To restore security defaults only, import the BasicDC.inf template (Q258595).
- If settings are hosed, copy an original copy of the policy to winnt\sysvol\sysvol\<domain>\policies.
- Copying policies is only supported for these two cases.
- Others will have different GUIDs.
- Can't copy other policies from one forest to another for debug.
128How to copy the Default Domain and Default DC policy
- Get a copy of a clean, default policy folder.
- Restore the policy folder (GUID) from backup.
- Create new domain and copy the GUID folder from that machine.
- Don't zip it.
- Delete existing policy.
- Wait for replication.
- Copy new policy folder to winnt\sysvol\sysvol\<domain>\policies.
- Wait for replication.
- Run GPOtool to make sure it shows up on all DCs.
129Unable to Edit Group Policy
- Group policy changed on PDC by default.
- If PDC is not available:
- Dialog: change on any DC, current DC or not.
- Error: Unable to contact Domain (no DC).
- Solution: Transfer or seize the PDC role to another DC.
- Can set policy to NOT use PDC. Don't!
130Using Userenv.log to solve Group Policy problems
- Turn on verbose logging: Q221833
- Interpreting group policy information in userenv.log
131Debugging Logon Scripts (script doesn't apply)
- Configure it via group policy snap-in.
- Make sure policy is applied.
- Set a desktop setting.
- Use Gpresult /v.
- Enable verbose logging for Userenv.log.
- Turn on "Run logon scripts visible".
- Create a simple logon script as a .bat file to make sure it's not the script failing.
- Example: Using Userenv.log to find script errors.
132Can't find FSMO Role Holder
- Problem: Operation trying to contact a FSMO role holder (PDC Emulator or another role)
- Can ping by name; seems to be OK
- Operation can't find it
- Solution
- Find out who has that role
- netdom query fsmo
- (returns a quick list)
- Transfer the role to a local DC
133Group Policy Refresh Anomaly
- Users complain of a 5-25 second hang intermittently in any application: Outlook, Word, 3rd party apps. Keystrokes are buffered and they can continue to work
- Noticed direct correlation between the 1704 events (GP Refresh) and the hang.
- Changed refresh interval via group policy and the frequency of the hang changed.
134Group Policy Refresh Anomaly
- Cause: SceCli applies group policy every 16 hrs (default) if no GPO changes have occurred. (DCs are every 5 minutes)
- Broadcasts WM_SETTINGCHANGE to all top level windows
- Wakes up sleeping processes, causing massive paging in/out of memory, causing hangs
- More pronounced on slower computers
- Solution: Configure Policy Refresh Interval in Group Policy so refresh occurs every 12 hrs at midnight/noon so users don't notice it.
135Account Lockout
- Background
- Finding locked out user accounts
- Client Bugs and Fixes
- Server Bugs and Fixes
- Resolution and Futures
136Lockout Reasons and Options
- Prevent spoofing or hijacking account
- Optional event logging in Audit Policy
- Account Lockout Options
- Timed lockout
- Account enabled after admin defined time
- Hard lockout
- Account disabled until reset by admin
- Lockout policy defined in group policy
- Single lockout and password policy per domain
- Location: default domain policy
137Account Lockout on DCs
- Each DC records the number of bad password attempts
- BDC checks PDC for latest password
- All bad password attempts seen by PDC
- PDC always 1st to lock out account
- PDC urgently replicates lockout when threshold reached
- Bad password attempts not replicated by DC
- BadPasswordCount reset to 0 on 1st good password
138PDC chaining operations
- If BDC fails authentication with
- STATUS_WRONG_PASSWORD
- STATUS_PASSWORD_EXPIRED
- STATUS_PASSWORD_MUST_CHANGE
- STATUS_ACCOUNT_LOCKED_OUT
- Referred to as BadPasswordStatus
- BDC chains authentication to PDC
- Return status from PDC if status is success or one listed above
- Otherwise, ignore PDC status and use local status
- Exception to PDC chaining
- AvoidPDCOnWan enabled and PDC in remote site (Q225511)
- 10 BadPasswordStatus events logged in 10 minutes
- NegativeCache enhancement: Q263821
- Cache reset after good password entered
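The chaining rules above can be read as a small decision function. The sketch below models only what the slide states: chain to the PDC when the local result is one of the BadPasswordStatus values, trust the PDC's answer only if it is success or one of those same values, and skip chaining when AvoidPDCOnWan is enabled and the PDC is in a remote site. The negative cache (10 events in 10 minutes) is not modeled, and the function and parameter names are hypothetical.

```python
BAD_PASSWORD_STATUS = {
    "STATUS_WRONG_PASSWORD",
    "STATUS_PASSWORD_EXPIRED",
    "STATUS_PASSWORD_MUST_CHANGE",
    "STATUS_ACCOUNT_LOCKED_OUT",
}
SUCCESS = "STATUS_SUCCESS"


def final_status(local_status, ask_pdc, avoid_pdc_on_wan=False, pdc_in_remote_site=False):
    """Return the status the authenticating DC reports to the client.

    ask_pdc is a callable that performs the chained authentication and
    returns the PDC's status string; it is only called when chaining applies.
    """
    if local_status not in BAD_PASSWORD_STATUS:
        return local_status  # nothing to double-check with the PDC
    if avoid_pdc_on_wan and pdc_in_remote_site:
        return local_status  # exception: do not chain across the WAN
    pdc_status = ask_pdc()
    if pdc_status == SUCCESS or pdc_status in BAD_PASSWORD_STATUS:
        return pdc_status
    return local_status  # any other PDC answer is ignored


if __name__ == "__main__":
    # The user just changed a password and the local DC has not replicated it yet.
    print(final_status("STATUS_WRONG_PASSWORD", lambda: SUCCESS))
```

The demo case shows why chaining exists: a recent password change known only to the PDC still lets the logon succeed.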
139Troubleshooting account lockouts
- Your goal: Answer the 4 Ws
- Who, Where, When and Why
- Environment setup
- Enable Auditing in domain policy
- Account Logon Events: Failure
- Account Management: Success
- Logon Events: Failure
- Security Event log on DCs: 10K events over-write
- Enable netlogon logging (NTLM clients)
- NLTEST /DBFLAG:0x2080FFFF (no reboot)
- Enable Kerberos logging
- Q262177: Kerberos logging (Kerberos clients)
140Account Lockout: Where
- DC Resources
- NTLM Clients
- Search DC and client NETLOGON.LOG for lockouts
- 0xC000006A: bad passwords
- 0xC0000234: account lockout
- NTLM and Kerberos Clients
- Search DS Event Logs
- Q230254, Q299475, Q273499 and Q301677 for description
- 644: NTLM and Kerberos lockout event
- 675: Kerberos bad password
- 681: NTLM bad password
- 529: Failed logon
- 531: Account disabled
- Tools
- EVENTCOMB
- AL.EXE
- NETMON.EXE
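Searching NETLOGON.LOG for the two status codes above is easy to script when EVENTCOMB or AL.EXE is not at hand. The sketch below only counts occurrences of the codes named on the slide; it makes no assumption about the rest of the log line layout, and the sample lines in the demo are made up for illustration, not real log output.

```python
import re
from collections import Counter

# Status codes from the slide, keyed in lowercase for case-insensitive matching.
STATUS_CODES = {
    "0xc000006a": "bad password",
    "0xc0000234": "account lockout",
}
_CODE_RE = re.compile("|".join(STATUS_CODES), re.IGNORECASE)


def count_lockout_codes(lines):
    """Count how often each known status code appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for match in _CODE_RE.findall(line):
            counts[STATUS_CODES[match.lower()]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; real NETLOGON.LOG entries will look different.
    sample = [
        "logon attempt for EXAMPLE\\user1 returned 0xC000006A",
        "logon attempt for EXAMPLE\\user1 returned 0xC000006A",
        "logon attempt for EXAMPLE\\user1 returned 0xC0000234",
    ]
    for meaning, count in count_lockout_codes(sample).items():
        print(f"{meaning}: {count}")
```

To scan a real file, pass an open file object, for example `count_lockout_codes(open(path, errors="replace"))`, where `path` points at the log copied from the DC or client.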
141EVENTCOMB
142AL.EXE
143Account Lockout: Why
- Attack, Pilot Error or Bug
- Wrong password entered, misconfigured service account
- Scenario
- Account type: user, computer or service account
- Lockout trigger? (logon, drive access, following p/w change)
- Drill down: look at time of day, pattern, frequency
- Process related lockouts
- Structured pattern
- Logged when users not present
- Look for
- common services, applications, client configuration
- User related lockouts
- Random pattern
- Fewer events logged
- Look at
- shortcuts, mapped drives, logon scripts, applications
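One rough way to tell a "structured" pattern from a "random" one is to look at how regular the gaps between bad-password events are. The sketch below is a heuristic of our own, not a method from the presentation: it computes the coefficient of variation of the gaps and calls the pattern structured when the gaps are nearly constant. The threshold of 0.25 is an arbitrary assumption to tune against real data.

```python
from statistics import mean, pstdev


def classify_pattern(timestamps, cv_threshold=0.25):
    """Classify sorted event times (in seconds) as 'structured', 'random' or 'unknown'.

    'structured' means near-constant spacing, as a service or scheduled process
    would produce; 'random' means irregular spacing, as a person would produce.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return "unknown"  # too few events to judge
    variation = pstdev(gaps) / mean(gaps)
    return "structured" if variation <= cv_threshold else "random"


if __name__ == "__main__":
    every_five_minutes = [0, 300, 600, 900, 1200]
    irregular = [0, 10, 500, 520, 4000]
    print(classify_pattern(every_five_minutes))
    print(classify_pattern(irregular))
```

A structured result, especially for events logged when the user is not present, points toward services, applications or client configuration; a random result points toward shortcuts, mapped drives, logon scripts or user error.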
144Account Lockout: Client
- Win9X
- Q278558: Access denied to a mapped drive after disconnect
- Q272594: Client can't log on after log off w/o reboot
- Q293793: VREDIR loses file tracking structures
- Q271496: One unsuccessful logon attempt triggers lockout (13)
- Net use dsgetdc logon attempt.
- Q266772: Logon fails if Unicode string password to NTLM SSPI
- DS Client on Win95, Windows 98, 98 Second Ed
- DSCLIENT MUST be installed before any hotfixes!
- Q301344, Q283261
- DS Client lets Win98 account lockout fixes work on Win95
- Win2K
- Q275508: User locked when accessing home dir after changing p/w
- Hotfix or SP2
- Windows XP
- None
145Account Lockout: Server Fixes
- Read server side KB articles
- Q287639: Win9x Clients Locked Out after unlock
- MSV1 package does password check against BDC with