Title: Fun with Google
1 Fun with Google
Part 1: Power searches and reconnaissance
2 I'm feeling lucky
- The Google interface
- Preferences
- Cool stuff
- Power searching
3 Classic interface
4 Custom interface
5 Language prefs
6 Google in H4x0r
7 Language
- A proxy server can be used to hide your location and identity while surfing the Web
- Google sets its default language to match the country where the proxy is
- If your language settings change inexplicably, check your proxy settings
- You can manipulate the language manually by fiddling directly with the URL
8 Google Scholar
9 Google University Search
10 Google Groups
11 Google freeware
- Web Accelerator
- Google Earth
- Picasa
- Etc.
12 Golden rules of searching
- Google is case-insensitive
- Except for the Boolean operator OR, which must be written in uppercase
- Wildcards are not handled normally
- A * is nothing more than a single word in a search phrase and provides no additional stemming
- Google stems automatically
- It tries to expand or contract words automatically, which can lead to unpredictable results
13 Golden rules of searching
- Google ignores stop words
- Who, where, what, the, a, an
- Except when you search on them individually
- Or when you put quotes around search phrase
- Or when you force it to use all terms
- Largest possible search?
- Google limits you to a 10-word query
- Get around this by using wildcards for stop words, as in the example below
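- For illustration (this phrase is made up, not from the slides): the 12-word phrase "the quick brown fox jumps over the lazy dog and runs away" exceeds the limit, but "* quick brown fox jumps over * lazy dog * runs away" counts only 9 words, because the * wildcards stand in for the stop words without counting against the 10-word cap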
14 Boolean operators
- Google automatically ANDs all search terms
- Spice things up with:
- OR ( | )
- NOT ( - )
- Google evaluates these from left to right
- Search terms don't even have to be syntactically correct in terms of Boolean logic
15 Search example
- What does the following search term do?
- intext:password | passcode intext:username | userid | user filetype:xls
- Locates all pages that have either password or passcode in their text. Then, from these, it shows only pages that have username, userid, or user. From these, it shows only .XLS files.
- Google is not confused by the lousy syntax or lack of parentheses.
16 URL queries
- Everything that can be done through the search box can be done by manually entering a URL
- The only required parameter is q (query)
- www.google.com/search?q=foo
- String together parameters with &
- www.google.com/search?q=foo&hl=en
- (Specifies a query on foo and a language of English; see the command-line sketch below)
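- A minimal command-line sketch (not from the slides; the num parameter and the browser-like User-Agent string are illustrative assumptions): the same URL query can be issued with curl and saved for offline grepping:
  curl -s -A "Mozilla/5.0" "http://www.google.com/search?q=foo&hl=en&num=10" -o results.html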
17 Some advanced operators
- intitle: search text within the title of a page
- URL: as_occt=title
- inurl: search text within a given URL. Allows you to search for specific directories or folders
- URL: as_occt=url
- filetype: search for pages with a particular file extension
- URL: as_ft=i&as_filetype=<some file extension>
- site: search only within the specified sites. Must be a valid top-level domain name
- URL: as_dt=i&as_sitesearch=<some domain>
18 Some advanced operators
- link: search for pages that link to other pages. Must be correct URL syntax; if invalid link syntax is provided, Google treats it like a phrase search
- URL: as_lq=
- daterange: search for pages published within a certain date range. Uses Julian dates or 3 mo, 6 mo, yr.
- as_qdr=m6 (searches the past six months)
- numrange: search for numbers within a range from low to high. E.g., numrange:99-101 will find 100. Alternatively, use 99..101
- URL: as_nlo=<low num>&as_nhi=<high num>
- Note: Google ignores $ and , (makes searching easier)
- (See the example below for how these URL parameters line up with the search-box operators)
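- A hedged illustration (the query terms are made up): these are two equivalent ways to restrict a search for admin to page titles on usf.edu, one in the search box and one as URL parameters:
  intitle:admin site:usf.edu
  www.google.com/search?q=admin&as_occt=title&as_dt=i&as_sitesearch=usf.edu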
19 Advanced operators
- cache: use Google's cached link of the results page. Passing an invalid URL as a parameter to cache will submit the query as a phrase search.
- URL:
- info: shows summary information for a site and provides links to other Google searches that might pertain to the site. Same as supplying the URL as a search query.
- related: shows sites Google thinks are similar.
- URL: as_rq=
20 Google Groups operators
- author: find a Usenet author
- group: find a Usenet group
- msgid: find a Usenet message ID
- insubject: find Usenet subject lines (similar to intitle)
- These are useful for finding people, NNTP servers, etc.
21 Hacking Google
- Try to explore how commands work together
- Try to find out why stuff works the way it does
- E.g., why does the following return > 0 hits?
- (filetype:pdf | filetype:xls) -inurl:pdf -inurl:xls
22 Surfing anonymously
- People who want to surf anonymously usually use a Web proxy
- Go to samair.ru/proxy and find a willing, open proxy, then change your browser configs (see the command-line sketch below)
- E.g., proxy to 195.205.195.131:80 (Poland)
- Check it via http://www.all-nettools.com/toolbox
- Resets the Google search page to Polish
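- A hedged command-line sketch (the proxy address is the one above; whether it still answers is not assumed): most Unix tools honor the http_proxy variable, so the same effect as the browser change is:
  export http_proxy="http://195.205.195.131:80"
  curl -s "http://www.google.com/" -o google_via_proxy.html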
23 Google searches for proxies
- inurl:"nph-proxy.cgi" "Start browsing through this CGI-based proxy"
- E.g., http://www.netshaq.com/cgiproxy/nph-proxy.cgi/011100A/
- "this proxy is working fine!" "enter *" "URL***" * visit
- E.g., http://web.archive.org/web/20050922222155/http://davegoorox.c-f-h.com/cgiproxy/nph-proxy.cgi/000100A/http/news.google.com/webhp?hl=en&tab=nw&ned=us&q=
24 Caching anonymously
- Caching is a good way to see Web content without leaving an entry in their log, right?
- Not necessarily: Google still tries to download images, which creates a connection from you to the server.
- The cached text-only version will let you see the page (sans images) anonymously
- Get there by copying the URL from the Google cache and appending &strip=1 to the end (see the example below)
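- A hedged illustration (the cached URL is made up; real cache URLs vary): if the cached link is
  http://www.google.com/search?q=cache:www.example.com/page.html
  then the text-only, image-free version is
  http://www.google.com/search?q=cache:www.example.com/page.html&strip=1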
25 Using Google as a proxy
- Use Google as a transparent proxy server via its translation service
- Translate English to English
- http://www.google.com/translate?u=http%3A%2F%2Fwww.google.com&langpair=en%7Cen&hl=en&ie=Unknown&oe=ASCII
- Doh! It's a transparent proxy, so the Web server can still see your IP address. Oh well.
26 Finding Web server versions
- It might be useful to get info on server types and versions
- E.g., "Microsoft-IIS/6.0" intitle:index.of
- E.g., "Apache/2.0.52 server at" intitle:index.of
- E.g., intitle:Test.Page.for.Apache it.worked!
- Returns a list of sites running Apache 1.2.6 with a default home page.
27 Traversing directories
- Look for Index directories
- intitle:index.of inurl:/admin/
- Or, try incremental substitution of URLs (a.k.a. fuzzing); see the sketch below
- /docs/bulletin/1.xls could be modified to /docs/bulletin/2.xls even if Google didn't return that file in its search
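- A minimal fuzzing sketch (not from the slides; the host and path are hypothetical): step through numbered file names and report which ones the server actually serves:
  for i in $(seq 1 20); do
    code=$(curl -s -o /dev/null -w "%{http_code}" "http://www.example.com/docs/bulletin/$i.xls")
    [ "$code" = "200" ] && echo "found: /docs/bulletin/$i.xls"
  done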
28 Finding PHP source
- A PHP script executes on the server and presents HTML to your browser. You can't do a View Source and see the script.
- However, Web servers aren't too sure what to do with a foo.php.bak file. They treat it as text.
- Search for backup copies of Web files
- inurl:backup intitle:index.of inurl:admin php
29 Recon: finding stuff about people
- Intranets
- inurl:intranet intitle:"human resources"
- inurl:intranet intitle:"employee login"
- Help desks
- inurl:intranet help.desk helpdesk
- Email on the Web
- filetype:mbx intext:Subject
- filetype:pst inurl:pst (inbox | contacts)
30 Recon: finding stuff about people
- Windows registry files on the Web!
- filetype:reg reg intext:"internet account manager"
- A million other ways
- filetype:xls inurl:email.xls
- inurl:email filetype:mdb
- (filetype:mail | filetype:eml | filetype:pst | filetype:mbx) intext:password|subject
31 Recon: finding stuff about people
- Full emails
- filetype:eml eml intext:"Subject" intext:"From" 2005
- Buddy lists
- filetype:blt buddylist
- Résumés
- "phone * * *" "address *" "e-mail" intitle:"curriculum vitae"
- Including SSNs? Yes.
32 Recon: finding stuff about people
33 Site crawling
- All domain names, different ways
- site:www.usf.edu returns 10 thousand pages
- site:usf.edu returns 2.8 million pages
- site:usf.edu -site:www.usf.edu returns 2.9 million pages
- site:www.usf.edu -site:usf.edu returns nada
34 Scraping domain names with shell script
- trIpl3-H>
- trIpl3-H> lynx -dump "http://www.google.com/search?q=site:usf.edu+-www.usf.edu&num=100" > sites.txt
- trIpl3-H>
- trIpl3-H> sed -n 's/\. http:\/\/alpha.usf.edu\// /p' sitejunk.txt >> sites.out
- trIpl3-H>
- trIpl3-H>
- trIpl3-H>
- (A cleaned-up version of this pipeline is sketched below)
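- A hedged, cleaned-up sketch of the same idea (the grep/sed patterns and the sort -u step are assumptions, not from the slide): pull one page of results and extract anything that looks like a *.usf.edu host name:
  lynx -dump "http://www.google.com/search?q=site:usf.edu+-www.usf.edu&num=100" \
    | grep -o 'http://[A-Za-z0-9.-]*\.usf\.edu' \
    | sed 's|http://||' | sort -u >> sites.out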
35 Scraping domain names with shell script
library.arts.usf.edu listserv.admin.usf.edu mailman.acomp.usf.edu modis.marine.usf.edu my.usf.edu nbrti.cutr.usf.edu nosferatu.cas.usf.edu planet.blog.usf.edu publichealth.usf.edu rarediseasesnetwork.epi.usf.edu tapestry.usf.edu usfweb.usf.edu usfweb2.usf.edu w3.usf.edu web.lib.usf.edu web.usf.edu web1.cas.usf.edu www.acomp.usf.edu www.career.usf.edu
- anchin.coedu.usf.edu
- catalog.grad.usf.edu
- ce.eng.usf.edu
- cedr.coba.usf.edu
- chuma.cas.usf.edu
- comps.marine.usf.edu
- etc.usf.edu
- facts004.facts.usf.edu
- fcit.coedu.usf.edu
- fcit.usf.edu
- ftp://modis.marine.usf.edu
- hsc.usf.edu
- https://hsccf.hsc.usf.edu
- https://security.usf.edu
- isis.fastmail.usf.edu
- www.cas.usf.edu
- www.coba.usf.edu
- www.coedu.usf.edu
- www.ctr.usf.edu
- www.eng.usf.edu
- www.flsummit.usf.edu
- www.fmhi.usf.edu
- www.marine.usf.edu
- www.moffitt.usf.edu
- www.nelson.usf.edu
- www.plantatlas.usf.edu
- www.registrar.usf.edu
- www.research.usf.edu
- www.reserv.usf.edu
- www.safetyflorida.usf.edu
- www.sarasota.usf.edu
- www.stpt.usf.edu
- www.ugs.usf.edu
- www.usfpd.usf.edu
36 Using the Google API
- Check out http://www.google.com/apis
- Google allows up to 1,000 API queries per day.
- Cool Perl script for scraping domain names at www.sensepost.com: dns-mine.pl
- By using combos of site, web, link, about, etc., it can find a lot more than the previous example
- Perl scripts for the Bi-Directional Link Extractor (BiLE) and BiLE-Weight are also available.
- BiLE grabs links to sites using the Google link query
- BiLE-Weight calculates the relevance of links
37 Remote anonymous scanning with NQT
- Google query: filetype:php inurl:nqt intext:"Network Query Tool"
- Network Query Tool allows:
- Resolve/Reverse Lookup
- Get DNS Records
- Whois
- Check port
- Ping host
- Traceroute
- The NQT form also accepts input from XSS, but it is still unpatched at this point!
- Using a proxy, perform an anonymous scan via the Web
- Even worse, an attacker can scan the internal hosts of networks hosting NQT
38 Other portscanning
- Find PHP port scanner
- inurl:portscan.php "from Port" | "Port Range"
- Find server status tool
- "server status" "enter domain below"
39 Other portscanning
40 Finding network reports
- Find Looking Glass router info
- "Looking Glass" (inurl:"lg/" | inurl:lookingglass)
- Find Visio network drawings
- filetype:vsd vsd network
- Find CGI bin server info
- inurl:fcgi-bin/echo
41 Finding network reports
42 Default pages
- You've got to be kidding!
- intitle:"OfficeConnect Wireless 11g Access Point" "Checking your browser"
43 Finding exploit code
- Find the latest and greatest
- intitle:"index of (hack | sploit | exploit | 0day)" modified 2005
- Google says it can't add a date modifier, but I can do it manually with as_qdr=m3
- Another way:
- include <stdio.h> Usage exploit
44 Finding vulnerable targets
- Read up on exploits in Bugtraq. They usually tell the version number of the vulnerable product.
- Then, use Google to search for "powered by"
- E.g., "Powered by CubeCart 2.0.1"
- E.g., "Powered by CuteNews v1.3.1"
- Etc.
45 Webcams
- "Blogs and message forums buzzed this week with the discovery that a pair of simple Google searches permits access to well over 1,000 unprotected surveillance cameras around the world -- apparently without their owners' knowledge."
- SecurityFocus, Jan. 7, 2005
46 Webcams
- Thousands of webcams used for surveillance
- inurl:"ViewerFrame?Mode="
- inurl:"MultiCameraFrame?Mode="
- inurl:"view/index.shtml"
- inurl:"axis-cgi/mjpg"
- intitle:"toshiba network camera - User Login"
- intitle:"NetCam Live Image" -.edu -.gov
- camera linksys inurl:main.cgi
47 More junk
- Open mail relays (spam, anyone?)
- inurl:xccdonts.asp
- Finger
- inurl:/cgi-bin/finger? "In real life"
- Passwords
- !Host=*.* intext:enc_UserPassword=* ext:pcf
- "AutoCreate=TRUE password=*"
48 So much to search, so little time
- Check out the Google Hacking Database (GHDB): http://johnny.ihackstuff.com
49 OK, one more
- Search on Homeseer web control
50 How not to be a Google victim
- Consider removing your site from Google's index.
- "Please have the webmaster for the page in question contact us with proof that he/she is indeed the webmaster. This proof must be in the form of a root level page on the site in question, requesting removal from Google. Once we receive the URL that corresponds with this root level page, we will remove the offending page from our index."
- To remove individual pages from Google's index:
- See http://www.google.com/remove.html
51 How not to be a Google victim
- Use a robots.txt file
- Web crawlers are supposed to follow the robots exclusion standard specified at http://www.robotstxt.org/wc/norobots.html
- The quick way to prevent search robots from crawling your site is to put these two lines into the /robots.txt file on your Web server (a more selective example follows below):
- User-agent: *
- Disallow: /
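- A slightly more selective sketch (the directory names are hypothetical): robots.txt can also exclude only particular areas rather than the whole site:
  # keep crawlers out of the admin and backup areas only
  User-agent: *
  Disallow: /admin/
  Disallow: /backup/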
52 Questions