Title: Vivisimo Federal
2 About Vivisimo Inc.
- Enterprise software company founded June 2000
- Carnegie Mellon computer science dept spinoff
- Founding technology is document clustering
- Award-winning web-search site Vivisimo.com
- Showcases our clustering and meta (federated) search products
- Best meta-search site 2001-2002 (Search Engine Watch)
- 2nd place this year, but powered #1 Dogpile.com
- Main funding: National Science Foundation SBIR
- Also, project started under NSF funding at CMU in 1998
3 Problem: Information Overload & Overlook
- Information Overload and Information Overlook
- Look for information -> get too much back
- Most people handle overload by overlooking most information
- Overlooking information has a business/user cost
- Employees fail to find needed information
- Citizens don't solve their problems by themselves, online
- Publishers lose potential readership
- Web search providers lose potential click-throughs
- Users miss out on unexpected discoveries or opportunities
4 How to Alleviate Information Overlook?
- Decrease the available information
- Purge the obsolete
- Censor the worthless
- Make people
- Smarter
- Work harder
- Read faster
- More efficient! (but how?)
- Best ROI: Provide organized information!
5 How to Provide Organized Information?
- Manually tag (classify, index) all content?
- "We have no process for consistently tagging our content. We have 50 different business units. People in one unit do a great job, but others do not use tags at all." (Forrester Report)
- Forrester says $4 per page to make a controlled vocabulary
- $50 per document to manually tag (large pharma)
- Tags tend to be broad and bland (one size fits all)
- -> Tagging is costly and leads to bland results
6 On-the-Fly Document Clustering
- Cluster the top 200-500 search results at the output
- Uses title, snippet, and (optionally) meta-tags
- Works with Autonomy, Convera, FAST, Google, SharePoint, Ultraseek, Verity, etc.
- Folder descriptions are as fresh as your content
- No need to develop/maintain a taxonomy
- Long History of Research
- Industry: IBM, Xerox, Verity, Microsoft
- MIT, Stanford, Carnegie Mellon, etc.
- What have they missed?
- Quality of description is the most important
- Cluster Descriptions Should Be
- Concise, accurate, natural, distinctive
- Can't optimize these in the traditional way
- Instead, invent a great heuristic algorithm
- Add lots of domain knowledge (English, German, etc.; specific verticals)
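
The on-the-fly idea on this slide (group the top search results using only their titles and snippets, and label each folder from the results themselves) can be sketched in a few lines. This is not Vivisimo's heuristic algorithm, which the slides do not describe; it is a deliberately naive stand-in that builds one folder per shared term. The names `cluster_results`, `terms`, and `STOPWORDS`, and the parameters `max_folders` and `min_size`, are hypothetical.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not a product default).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "is", "with"}


def terms(text):
    """Return lowercased word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def cluster_results(results, query, max_folders=5, min_size=2):
    """Group search results into labelled folders by shared non-query terms.

    results: list of (title, snippet) tuples, e.g. the top 200-500 hits.
    Returns a list of (label, [result indices]), largest folders first.
    A result may appear in more than one folder.
    """
    query_terms = set(terms(query))
    docs_by_term = defaultdict(set)
    for i, (title, snippet) in enumerate(results):
        for t in set(terms(title + " " + snippet)) - query_terms:
            docs_by_term[t].add(i)
    # Keep terms shared by several results, but not by every result.
    candidates = [(t, ids) for t, ids in docs_by_term.items()
                  if min_size <= len(ids) < len(results)]
    # Largest folders first; ties broken alphabetically for stable output.
    candidates.sort(key=lambda c: (-len(c[1]), c[0]))
    return [(t, sorted(ids)) for t, ids in candidates[:max_folders]]


if __name__ == "__main__":
    hits = [
        ("Jaguar cars for sale", "Buy a used jaguar car"),
        ("Jaguar habitat", "The jaguar is a big cat"),
        ("Jaguar cat facts", "Big cat of the Americas"),
        ("Jaguar car dealers", "New car models"),
    ]
    for label, members in cluster_results(hits, "jaguar"):
        print(label, members)
```

Single-term labels like these are exactly the "broad and bland" descriptions the slide warns about; producing concise, natural, distinctive labels is the hard part that this sketch leaves out.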
7 Screenshots of Three Demo Govt Sites
- City of San Diego
- sannet.gov
- http://vivisimo.com/demos/SanDiego.html
- DOE National Energy Technology Laboratory
- netl.doe.gov
- http://vivisimo.com/demos/NETL.html
- Centers for Medicare and Medicaid Services
- cms.hhs.gov
- http://vivisimo.com/demos/CMS.html
12 Modeling the Value of Topic Clusters
- Intuition
- Lots of wasted effort if information is disorganized
- View few results before exhausting your patience
- Modeling Assumptions
- User spends 12 min before giving up or moving on
- Eye skips over search results or folders sequentially
- 1,000 employees at $60 per hour
- 2 searches per employee/day
- 10 minutes to solve problem elsewhere when search fails
- Folders let you see 11 docs in detail vs. 6 for ranked lists
- Conclusion: savings of $1M per year (white paper)
- Fortune 1000 customer experimental tests
- Combination of meta-search and clustering
- Minutes-to-solve: 1.9 versus 3.7
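
The arithmetic behind a savings figure of this kind can be sketched as below. The white paper's actual model is not reproduced in these slides, so this is only a back-of-the-envelope version. It takes the slide's inputs (1,000 employees, 60 per hour read as dollars, 2 searches per employee per day, 10 minutes lost when a search fails) and adds two assumptions that are not in the source: 250 working days per year, and a free parameter `failure_rate_drop` for how much clustering reduces the share of failed searches. It ignores the 11-versus-6 documents figure. All function and parameter names are hypothetical.

```python
def annual_savings(failure_rate_drop, employees=1000, hourly_cost=60.0,
                   searches_per_day=2, fallback_minutes=10.0, work_days=250):
    """Yearly cost avoided when fewer searches fail.

    failure_rate_drop: reduction in the fraction of failed searches
    (0.05 means five percentage points fewer failures). This value and
    work_days are assumptions, not figures from the slides.
    """
    searches_per_year = employees * searches_per_day * work_days
    # Cost of one failed search: time spent solving the problem elsewhere.
    cost_per_failure = hourly_cost * fallback_minutes / 60.0
    return searches_per_year * failure_rate_drop * cost_per_failure


def required_drop(target_savings, **model_inputs):
    """Failure-rate reduction needed to reach a target yearly saving."""
    return target_savings / annual_savings(1.0, **model_inputs)


if __name__ == "__main__":
    print(f"Savings at a 0.20 drop: {annual_savings(0.20):,.0f} per year")
    print(f"Drop needed for 1M:     {required_drop(1_000_000):.2f}")
```

Under these assumptions each failed search costs 10 (ten minutes at 60 per hour) and there are 500,000 searches a year, so a 1M saving corresponds to roughly 20 percentage points fewer failed searches. A different working-days figure changes that number proportionally.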
13 Customer Highlight
- Cisco Systems Technical Assistance Center
- Several thousand support engineers
- Search platform is Google Search Appliance
- Had evaluated taxonomy-building products
- Cisco has a complex, rapidly evolving product line
- Effort involved didn't scale with the amount of information
- Rejected the taxonomy-building approach
- Problems & Challenges
- Lack of organized information
- Ranking well is hard to do on an intranet
- Separate collections couldn't be cross-searched
- How to leverage rich meta-data?
- Author, date, product version, etc.
- Solutions
- Content Integrator meta-searches Google collections
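
Meta-searching several separate collections, as described above, amounts to sending one query to each collection and merging the ranked lists into a single list. The sketch below shows one simple way to do that: round-robin interleaving with de-duplication by URL. It is not the Content Integrator's method and calls no real Google Search Appliance API; each collection is represented by a hypothetical search function taking `(query, limit)`, and `meta_search` and `per_source` are illustrative names.

```python
from itertools import zip_longest


def meta_search(query, collections, per_source=10):
    """Query every collection and interleave the ranked results.

    collections: dict mapping a collection name to a search function that
    takes (query, limit) and returns a ranked list of (url, title) pairs.
    Returns one merged list of dicts, keeping the first copy of each URL.
    """
    ranked_lists = [
        [(name, url, title) for url, title in search(query, per_source)]
        for name, search in collections.items()
    ]
    merged, seen = [], set()
    # Take rank 1 from every source, then rank 2, and so on.
    for tier in zip_longest(*ranked_lists):
        for hit in tier:
            if hit is None:  # this source has run out of results
                continue
            name, url, title = hit
            if url not in seen:
                seen.add(url)
                merged.append({"source": name, "url": url, "title": title})
    return merged


if __name__ == "__main__":
    # Stand-in collections with canned results (hypothetical data).
    docs = lambda q, n: [("u1", "Install guide"), ("u2", "Release notes")][:n]
    cases = lambda q, n: [("u2", "Release notes"), ("u3", "Known issue")][:n]
    for hit in meta_search("router upgrade", {"docs": docs, "cases": cases}):
        print(hit["source"], hit["url"], hit["title"])
```

The merged list keeps a `source` field on every hit, so metadata such as author, date, or product version could be carried along in the same way and used for clustering afterwards.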
14 Multiple Sources, Multiple Searches and Multiple Lists
15 A New Standard, A Better Way
16 How the World is Now
17 Example of What Vivisimo Makes Possible
18 Value Proposition of Clustering
- End-users
- Easy access to useful but low-ranked results
- Learn at a glance the types of available information
- See results in context of similar results
- Agency
- More efficient employees/citizen support operations
- Citizens become more engaged with your content
- IT Dept
- Quick installation on any search engine/database
- Overlays search; is non-invasive; no maintenance
- No need to train users
19 What We Ask Potential Customers
- What search engines do you use?
- Are you happy with it/them? If so, proceed.
- If not, ask our opinion and/or inquire about our partnerships.
- Are you ready for clustering?
- Can your search engine deliver 200 results acceptably fast?
- Are the descriptions of acceptable quality? Send a screenshot!
- Do a pilot (2-4 weeks)
- Customize with stopwords and thesaurus (optional)
- Takes minutes or days
20 Customer Quotes
21 Vivisimo.com
- Web search demo at Vivisimo.com showcases our technology
- 7M searches monthly
- 22M page views
- International recognition
- Powered the #1 meta-search engine 3 years running (Search Engine Watch Awards)
- "Better than Google?" (CNN.com)
- Analyst Choice Award (eWeek Labs)
- Named Top 100 Company (KMWorld and eContent Magazines)
- Site of the Week (PC Magazine)
- Major Analyst Coverage: Delphi, Forrester, Outsell, Seybold and IDC
22 Vivisimo Federal
- Vivisimo on GSA Schedule through Onix Networking Corp
- Onix Networking is a Google Search Appliance reseller
- Work with several DC systems integrators and IT consultants
- Several clients (govt intelligence) plus ongoing pilots
23 Conclusion
- Information Overlook imposes high opportunity costs
- Alleviate by showing organized info
- Clustering into folders is a natural approach
- Hard to do well but now a solved problem
- Overlays any search engine (Google, Autonomy, Verity, etc.)
- Organizes info at delivery time rather than creation time
- No taxonomy-building headaches
Info: http://vivisimo.com/gov
FirstGov clustering demo: http://vivisimo.com/firstgov
Raul Valdes-Perez (valdes_at_vivisimo.com)