Title: Full-Text Indexing And Search Using Site Server And Exchange
1. Full-Text Indexing And Search Using Site Server And Exchange
Tom Rizzo, thomriz@microsoft.com
5-402
3. This Session Covers
- Scenarios
- Overview
- Site Server search
- Exchange indexing overview
- Exchange deployment steps
- Developing a custom search solution
- Common errors and limitations
- Additional resources
4. Where To Get More Information
- SBN: http://www.microsoft.com/workshop/server/siteserver/siteexch/exchss.asp
- Site Server Update Web site: http://www.microsoft.com/siteserver/update
- Search Performance report
- Search deployment whitepaper (tips and tricks)
- Database indexing MSDN article
- SDK update
- Evaluation guide
6. Scenarios
- Custom Helpdesk application
- Support site for internal IT apps
- Public folder for user questions
- FAQ auto-generated as questions get answered
- KnowledgeBase
- Repository for best practices
- Rich search allows people to find information easily
- Custom forms allow users to quickly enter data into application
7. Powerful, Multi-Store Search: Microsoft Site Server
- Powerful
- Full text and property search
- Security integration for Microsoft Exchange and NTFS
- Site vocabulary and tagging
- Automatic language detection
- Extensive
- Single query across multiple stores
- HTTP, file system, Microsoft Exchange public folders, databases
8. Industrial Strength Solution: Microsoft Site Server
- Scalable
- Distributed indexing
- Replicated query servers
- Incremental crawl reduces Net traffic
- Administration
- MMC, Web-based, command line, and scriptable administration
9. Exchange Indexing Features
- Index and search Public Folders
- Search properties, text, attachments
- Index and search custom properties
- Integrated with Exchange security
- Full and incremental crawl
- Search using web browser
- Read search result messages using OWA or Outlook
10. Search Architecture
11. Exchange Search Deployment Checklist
- Capacity planning
- Configure Site Server accounts
- Configure Site Server for Exchange crawling
- Index and search Exchange properties
12. Scalability And Performance
- Reference HW: Dual P6/200, 128MB
- Little/no benefit from quad proc
- Limits
- 5M docs/catalog max, 32 catalogs max
- Crawl speed: 10 messages/sec (30KB/sec)
- Query rate: 11-15 Q/s (25-55 Q/s without hit count)
- Want physical memory > property store
- Factors: hit count, result set size, Exchange security
- Non-factors: query complexity, catalog size (as long as property store fits in RAM)
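To put the crawl-speed figure in context, the following Python sketch estimates how long a full crawl might take. It is a back-of-the-envelope calculation only: the function name is hypothetical, and it assumes the quoted 10 messages/sec holds constant for the whole crawl, which real deployments may not achieve.

```python
# Rough full-crawl duration estimate from the quoted crawl speed.
# Assumption (not from the session): throughput stays constant at the
# reference-hardware figure for the entire crawl.

CRAWL_MESSAGES_PER_SEC = 10  # crawl speed quoted for the reference hardware


def full_crawl_hours(message_count: int,
                     messages_per_sec: float = CRAWL_MESSAGES_PER_SEC) -> float:
    """Return the estimated hours needed to crawl message_count messages."""
    if message_count < 0:
        raise ValueError("message_count must be non-negative")
    if messages_per_sec <= 0:
        raise ValueError("messages_per_sec must be positive")
    return message_count / messages_per_sec / 3600.0


if __name__ == "__main__":
    for count in (100_000, 288_000, 1_000_000):
        print(f"{count:>9,} messages: about {full_crawl_hours(count):.1f} hours")
```

Under this assumption, an 8-hour window corresponds to about 288,000 messages in a full crawl.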
13. Hit Count, Query Complexity
14. Result Count, Security
15. Catalog Size, Hit Count
16. Capacity Planning
- How many machines?
- Criteria: crawl overnight (8 hrs), user queries
- 1 machine for up to 700k docs, 40k users
- Add machines for
- more documents
- more users
- fault tolerance
- How much RAM?
- Criteria: Fit property store in physical RAM
- < 150K docs: 128MB RAM
- < 1M docs: 256MB RAM
- > 1M docs: 512MB RAM
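The sizing rules above can be captured in a small lookup, sketched here in Python. The function names are hypothetical. Two assumptions go beyond the slide: a catalog of exactly 1M documents is placed in the 512MB tier, and machine count is assumed to scale linearly with documents and users (the slide only says to add machines for more of either). Fault tolerance is not modelled.

```python
import math

DOCS_PER_MACHINE = 700_000   # "1 machine for up to 700k docs"
USERS_PER_MACHINE = 40_000   # "... 40k users"


def recommended_ram_mb(doc_count: int) -> int:
    """Map a document count to the RAM tier listed on the slide."""
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    if doc_count < 150_000:
        return 128
    if doc_count < 1_000_000:
        return 256
    return 512  # assumption: exactly 1M docs falls in the top tier


def estimated_machines(doc_count: int, user_count: int) -> int:
    """Estimate machine count, assuming linear scaling (an assumption)."""
    by_docs = math.ceil(doc_count / DOCS_PER_MACHINE)
    by_users = math.ceil(user_count / USERS_PER_MACHINE)
    return max(1, by_docs, by_users)


if __name__ == "__main__":
    docs, users = 900_000, 25_000
    print(recommended_ram_mb(docs), "MB RAM,",
          estimated_machines(docs, users), "machine(s)")
```

Treat the machine estimate as a starting point for planning rather than a guarantee; the session's own figures were measured on the reference hardware.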
17. Configuration - Accounts
- Set Content Access account on Build Server
- Configure through Catalog Build Server Properties -> Accounts
- Need Admin privilege on the Configuration node of Exchange Server
- Set Search service to run with same account
- Configure through Control Panel -> Services
- By default, Site Server Search Service runs as SYSTEM
- Must have the same privileges as above
- Must have admin privileges on local machine
18. Configuration - Accounts
- Search query pages must have anonymous access turned off
- Diagnostic information can be found in
- Gatherer log
- Event viewer
- Query error return codes
19. Configuration: Exchange Settings
- Set Exchange Server configuration in Server properties (under Search MMC node)
- Exchange Server Name
- Outlook Web Access Server Name (optional)
- Exchange Server Site Name
- Exchange Server Organization Name
- If settings are incorrect, diagnostic information is in the gatherer log, event viewer and query error strings
- Must be set on the Catalog Build Server and Catalog Search Server if distributing indexes
20. What About Replication?
- If your PFs are replicated, Site Server can access your PFs as long as
- They are homed in the Site you specify for Site Server
- You use Public Folder Affinity for PFs in other sites
- Make sure Search has administrative access to all PFs you're crawling
- Increase timeout period if latency is a problem
21. Build And Search An Exchange Catalog
- Create a new Catalog Definition using wizard
- Do a full build
- Verify success by searching using the default search page
22. Exchange Properties
- Mapped to standard properties
- DocTitle (Subject)
- DocAuthor (From)
- FileWrite (SendDate)
- Description (300 chars from message)
- Additional Exchange properties
- MessageClass
- MessageDisplayName
- MessageDisplayCC
- MessageFolderName
- MessageFolderPath
23. Exchange Custom Properties
- Custom properties work the same way as HTML Meta Tags
- Available in the Meta property set
- Searchable by default (not retrievable)
- Text, date/time, and number supported
- Text properties can automatically be searched
- Use Meta_<Name> (e.g. Meta_myprop)
- For Custom Props with spaces use Meta_<Name>
- E.g., Meta_Total Expense
24. Custom Properties: Dates And Numbers
- For date/time and number custom properties
- Edit DefineColumns.txt
- Default location: c:\Microsoft Site Server\Data\Search\Config
- Add entry corresponding to custom property in the Meta property set
- Use appropriate type
- Integer: DBTYPE_I4
- Date: VT_FILETIME
- Refer to documentation on Meta properties
25. Demo: Building A Custom Search ASP Application
- Using the capabilities of Site Server, you can create rich, custom search applications
- These applications can search both standard props and your custom props in Exchange
26. Demo: Accessing Searches
- Outlook Search techniques
- HTML Page self-hosted in Outlook
- Outlook Form driving HTML page
- Custom Outlook Forms
- DCOM connection to Site Server to enable native Outlook search
27. Specifying The Catalog
<input type=hidden name="c6" value="@MessageFolder">
<input type=hidden name="q6" size=25 maxlength=50 value="<%= q6 %>">
<select name="ct" value="ct">
<option value="Discussion,ProductKnowledge,Business,list_Servers">All Indexed Folders
<option value="Discussion">\Discussion
28. String Custom Properties
<td><input type=hidden name="c7" value="@META_Type">
<select name="q7">
29. Date Properties
<input type=hidden name="c3" value="@filewrite">
<input type=hidden name="o3" value=">">
<select name="q3">
<option value="" <% if Request("q3")="" then %>selected<% end if %>>
<option value="-1d" <% if q3="-1d" then %>selected<% end if %>>day
<option value="-1w" <% if q3="-1w" then %>selected<% end if %>>week
<option value="-1m" <% if q3="-1m" then %>selected<% end if %>>month
<option value="-1y" <% if q3="-1y" then %>selected<% end if %>>year
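The form fields on the last three slides follow a pattern: `ct` carries a comma-separated catalog list, and each search clause uses a numbered group of fields (`cN` for the column, an optional `oN` for the operator, `qN` for the value). The Python sketch below shows what the submitted query string would look like. It is an illustration only, not part of the session's ASP code, and the function name and clause layout are assumptions.

```python
from urllib.parse import urlencode, parse_qs


def build_search_query(catalogs, clauses):
    """Build a URL-encoded query string from catalogs and numbered clauses.

    catalogs: list of catalog names, joined with commas into "ct".
    clauses:  dict mapping a clause number N to (column, operator, value);
              operator may be None when the form has no oN field.
    """
    params = [("ct", ",".join(catalogs))]
    for index in sorted(clauses):
        column, operator, value = clauses[index]
        params.append((f"c{index}", column))
        if operator is not None:
            params.append((f"o{index}", operator))
        params.append((f"q{index}", value))
    return urlencode(params)


if __name__ == "__main__":
    query_string = build_search_query(
        ["Discussion", "ProductKnowledge"],
        {
            3: ("@filewrite", ">", "-1w"),    # modified within the last week
            7: ("@META_Type", None, "FAQ"),   # hypothetical custom property value
        },
    )
    print(query_string)
    print(parse_qs(query_string))  # decoded view of the same parameters
```

Building the string with `urlencode` keeps characters such as `@`, `>` and `,` safely escaped, which matters once custom property names or values contain spaces.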
30. Returning Results: OWA Or Outlook
<% if ExchangeViewer="both" or ExchangeViewer="outlook" then %>
<object id="Exciol" height=0 width=0
CLASSID="CLSID:DAFD7A40-73FF-11D1-A811-00AA006EAC9D"
CODEBASE="/siteserver/knowledge/search/controls/exciol.ocx#version=5,5,2148,0"
TYPE="application/x-oleobject">
</object>
31. Set Query Object
<%
' Set query object properties.
set Q = Server.CreateObject("MSSearch.Query")
' Define all required query properties: the query itself, and all columns
' that will be used in the results.
Q.SetQueryFromURL(Request.QueryString)
Q.Catalog = request("ct")
Q.OptimizeFor = "nohitcount,performance"
Q.MaxRecords = 10
Q.Columns = "DocAuthor, DocTitle, DocAddress, Description, Size, FileWrite"
%>
32. Create The Recordset
<%
' Create the recordset holding the search results.
on error resume next
set RS = Q.CreateRecordSet("sequential")
if err <> 0 then
  createerror = err.description
end if
%>
33. Scroll Through The RS
<%
' Set up loop to iterate through results.
Do while not RS.EOF
  Response.Write RS("Description")
  ' Increment the results.
  RS.MoveNext
  RecordNum = RecordNum + 1
Loop
%>
34. Common Errors
- Specifying the wrong Site or Organization
- The crawling/searching account must be in the administrators group on Site Server computer
- Content access account must have Admin rights on Exchange Server Configuration node
- Search Service must be manually configured to run as a user with admin privilege, not SYSTEM (which is the default)
35. Troubleshooting Failed Crawls
- Start with Event Viewer (Application log)
- Next, look at Gatherer Log (from MMC)
- Logs per-item failures by default
- Can turn on success logging and rules logging
- Confirm network connectivity to content
- Ensure that the connectivity is checked for the default content access accounts
- Validate permissions of content access account
36. Other Sessions
- Noam Topaz's Notes session later today! He'll have CDs with source for the COM Add-in!
37. Questions?