MAGAZINWEB

May 9, 2008

search engines

Search engine robots and others

The following table lists the search engines that spider the web, the IP addresses that they use, and the robot names they send out to visit your site. Version numbers are usually part of the robot names, but they are omitted here except where a version implies a visit from a different IP address or (as with Inktomi) a different search engine.

Often multiple IP addresses are used, in which case we just give a flavour of the names or numbers. Inktomi is a company that offers search engine technology, which is used by a number of sites (e.g. www.snap.com and www.hotbot.com).

Wherever <nn> appears, it indicates that a number of different digits may be used in that position.
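As a rough illustration of the <nn> convention, the sketch below turns a robot-name template from the table into a regular expression. It assumes that <nn> (and the <n> variant used further down) stands for one or more decimal digits, which is an interpretation rather than something the original article spells out. The function name template_to_regex is made up for this example.

```python
import re


def template_to_regex(template: str) -> re.Pattern:
    """Build a regex from a robot-name template such as 'Tv<nn>_Merc_resh_26_1_D-1.0'.

    Assumption: every <n>, <nn>, ... placeholder stands for one or more digits.
    """
    # Split on the placeholders but keep them in the result list.
    parts = re.split(r"(<n+>)", template)
    pattern = "".join(
        r"\d+" if re.fullmatch(r"<n+>", part) else re.escape(part)
        for part in parts
    )
    return re.compile(pattern)


altavista = template_to_regex("Tv<nn>_Merc_resh_26_1_D-1.0")
print(bool(altavista.search("Tv12_Merc_resh_26_1_D-1.0")))  # True
print(bool(altavista.search("Scooter")))                    # False
```

Literal parts of the name are escaped, so the dots and hyphens in a robot name only match themselves.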


(Since I have noticed that Google has been acting up lately, in the sense that after it publishes one of your pages it deletes the previous one, I thought it might be better for bloggers to publish their material using other search engines. This page is part of another, protected article. You can see it in full at this address: Search engine robots)

(Warning! This is a paste from a protected article. You can see it here: Search engine robots)


Home page/search engine Robot identifier
www.abacho.com AbachoBOT
www.abcdatos.com abcdatos_botlink
http://www.abcdatos.com/botlink/
www.aesop.com AESOP_com_SpiderMan
www.ah-ha.com ah-ha.com crawler (crawler@ah-ha.com)
www.alexa.com ia_archiver
www.altavista.com Scooter
Mercator
Scooter2_Mercator_3-1.0
roach.smo.av.com-1.0
Tv<nn>_Merc_resh_26_1_D-1.0
www.altavista.co.uk AltaVista-Intranet
jan.gelin@av.com
www.alltheweb.com FAST-WebCrawler
crawler@fast.no
www.fast.no/faq/faqfastwebsearch/faqfastwebcrawler.html
Wget
www.acoon.de Acoon Robot
www.antisearch.net antibot
www.atomz.com Atomz
www.axmo.com AxmoRobot
www.buscaplus.com Buscaplus Robi
http://www.buscaplus.com/robi/
www.canseek.ca CanSeek/
support@canseek.ca
www.christcrawler.com/search.cfm ChristCRAWLER
http://www.christcrawler.com/
www.clush.com Clushbot
http://www.clush.com/bot.html
www.crawler.de Crawler
admin@crawler.de
www.daadle.com DaAdLe.com ROBOT/
www.daum.net RaBot
Agent-admin/ phortse@hanmail.net
contact/jylee@kies.co.kr
RaBot
Agent-admin/ webmaster@kisco.go.kr
www.en.deepindex.com DeepIndex
www.ditto.com DittoSpyder
domanova.co.uk Jack
www.earthcom.info EARTHCOM.info
www.entireweb.com Speedy Spider
www.excite.com ArchitextSpider
(excite) ArchitectSpider
www.eurip.com EuripBot
www.euroseek.net Arachnoidea
arachnoidea@euroseek.net
www.ezresults.com EZResult
www.fastsearch.net Fast PartnerSite Crawler
FAST Data Search Crawler
FAST Data Search Document Retriever
www.fireball.de KIT-Fireball
http://france.misesajour.com/ france.misesajour.com
www.fybersearch.com FyberSearch
www.galaxy.com GalaxyBot
http://www.galaxy.com/galaxybot.html
www.geckobot.com geckobot
www.gendoor.com
(Genealogical Search Engine)
GenCrawler
www.geona.com GeonaBot
www.getrax.com getRAX
www.google.com Googlebot
googlebot@googlebot.com
http://googlebot.com/
www.goo.ne.jp moget/2.0
moget@goo.ne.jp
www.girafa.com Aranha
(inktomi) Slurp.so/1.0
slurp@inktomi.com
(inktomi) Slurp/2.0j
slurp@inktomi.com
www.inktomisearch.com
(inktomi) Slurp/2.0-KiteHourly
slurp@inktomi.com;
www.inktomi.com/slurp.html
(inktomi) Slurp/2.0-OwlWeekly
spider@aeneid.com
www.inktomi.com/slurp.html
(inktomi) Slurp/3.0-AU
slurp@inktomi.com
http://hoppa.com/
(need V5 browsers to view)
Toutatis 2.5-2
www.hubat.com Hubater
www.almaden.ibm.com
(research centre)
http://www.almaden.ibm.com/cs/crawler
www.iltrovatore.it IlTrovatore-Setaccio
www.incywincy.com IncyWincy
www.infoseek.com UltraSeek
InfoSeek Sidewinder
www.intags.de Mole2/1.0
webmaster@intags.de
http://mp3bot.de/ MP3Bot
www.ip3000.com C-PBWF-ip3000.com-crawler
ip3000.com-crawler

www.istarthere.com http://www.istarthere.com
spider@istarthere.com
www.knowledge.com Knowledge.com/
www.kuloko.com kuloko-bot/0.2
www.lexis-nexis.com LNSpiderguy
www.linknz.co.nz Linknzbot
www.look.com lookbot
www.looksmart.com MantraAgent
www.loopimprovements.com
(see also www.incywincy.com)
NetResearchServer
www.loopimprovements.com/robot.html
www.lycos.com Lycos_Spider_(T-Rex)
www.joocer.com JoocerBot
www.mirago.co.uk HenryTheMiragoRobot
www.mojeek.com MojeekBot
www.mozdex.com mozDex/
http://search.msn.com/ MSNBOT/0.1
http://search.msn.com/msnbot.htm
www.navadoo.com Navadoo Crawler
www.northernlight.com Gulliver
www.objectssearch.com ObjectsSearch/0.01
http://szukaj.onet.pl/ OnetSzukaj/
www.picosearch.com PicoSearch/
www.portaljuice.com PJspider
www.powerinter.net
but it won’t let us in :-(
DIIbot
http://navi.ocn.ne.jp/ nttdirectory_robot
super-robot@super.navi.ocn.ne.jp
griffon
griffon@super.navi.ocn.ne.jp
www.maxbot.com Spider/maxbot.com
admin@maxbot.com
??? gazz/1.0
gazz@nttrd.com
www.nationaldirectory.com NationalDirectory-SuperSpider
www.naver.com dloader(NaverRobot)/
dumrobo(NaverRobot)/
www.noxtrum.com noxtrumbot/
www.openfind.com
(Chinese language)
Openfind piranha,Shark
robot-response@openfind.com.tw
Openbot/
www.picsearch.org psbot
www.picsearch.org/bot.html
www.pinpoint.com CrawlerBoy Pinpoint.com
www.petersnews.com user<n>.ip3000.com
www.qweery.nl QweeryBot
http://qweerybot.qweery.com
www.vestris.com/alkaline AlkalineBOT
www.rambler.ru StackRambler/
www.seznam.cz SeznamBot
www.search-10.com Search-10
www.searchhippo.com Fluffy the spider
info@searchhippo.com
www.scrubtheweb.com Scrubby/
www.singingfish.com asterias
www.speedfind.de speedfind ramBot xtreme
www.s.u-tokyo.ac.jp Kototoi/0.1
www.searchbyusa.com SearchByUsa
www.searchspider.com Searchspider/
www.sightquest.com SightQuestBot/
http://www.sightquest.com/bot.htm
www.spidermonkey.ca Spider_Monkey/
www.surfnomore.com Surfnomore Spider v1.1
www.supersnooper.com Robot@SuperSnooper.Com
www.teoma.com teoma_agent1
teoma_admin@hawkholdings.com
http://mapper.teradex.com Teradex_Mapper
mapper@teradex.com
www.travel-finder.com ESISmartSpider
www.traficdublu.ro Spider TraficDublu
www.tutorgig.com Tutorial Crawler
http://www.tutorgig.com/crawler
www.updated.com updated/0.1beta
crawler@updated.com
www.uksearcher.co.uk UK Searcher Spider
www.vivante.com
(coming soon)
Vivante Link Checker
www.walhello.com appie
www.websmostlinked.com Nazilla
www.webwombat.com.au www.WebWombat.com.au
www.webseek.de marvin/infoseek
marvin-team@webseek.de
www.webtop.com MuscatFerret
www.whizbanglabs.com WhizBang! Lab
www.wisenut.com ZyBorg
(info@WISEnut.com)
www.wire.co.uk WIRE WebRefiner:
webrefiner@wire.co.uk
www.worldsearchcenter.com WSCbot
www.yandex.com Yandex
www.yellowpet.com
pet-based search engine
Yellopet-Spider
www.yelo.no Findexa Crawler
www.yourbettersearch.com YBSbot search engine indexer
<client sites> libwww-perl
http://verno.ueda.info.waseda.ac.jp/
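
One way to use a list like this is to check the User-Agent string of an incoming request against the robot identifiers. The sketch below copies a handful of entries from the table above. The name identify_robot and the sample User-Agent strings are only illustrative, and the case-insensitive substring match is an assumption about how these robots report themselves, not something the original article specifies.

```python
# A few (robot identifier -> home page) pairs taken from the table above.
KNOWN_ROBOTS = {
    "Googlebot": "www.google.com",
    "Scooter": "www.altavista.com",
    "Slurp": "(inktomi)",
    "ia_archiver": "www.alexa.com",
    "FAST-WebCrawler": "www.alltheweb.com",
    "MSNBOT": "http://search.msn.com/",
}


def identify_robot(user_agent):
    """Return (identifier, home page) for the first known robot found, else None."""
    ua = user_agent.lower()
    for identifier, home_page in KNOWN_ROBOTS.items():
        # Case-insensitive substring match, so version suffixes are ignored.
        if identifier.lower() in ua:
            return identifier, home_page
    return None


# Illustrative User-Agent strings, not taken from real logs.
print(identify_robot("Googlebot/2.1"))                       # ('Googlebot', 'www.google.com')
print(identify_robot("Mozilla/4.0 (compatible; MSIE 6.0)"))  # None
```

Because version numbers are left out of the identifiers, the same entry covers strings such as Slurp/2.0j and Slurp/3.0-AU.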

Browsers

Most browsers identify themselves with a string that begins “Mozilla…”. I’ve chosen not to document those (as yet). Here are a few of the rarer browser identifiers that I’ve seen.

Browser identifier Information
AmigaVoyager http://v3.vapor.com/
Voyager browser for the Amiga
xChaos_Arachne http://browser.arachne.cz/
(DOS-compatible browser. Linux version under development)
IBrowse www.hisoft.co.uk (search for IBrowse)
Amiga-based browser
ICab www.icab.de/index.html
(Macintosh-only)
JustView http://www3.justsystem.co.jp/download/justview/3.01win1a.html
(I think this is a browser. Site is in Japanese)
KMeleon http://kmeleon.sourceforge.net/
(Light browser based on the Mozilla code base)
Konqueror www.konqueror.org/konq-browser.html
(Linux KDE browser)
Lynx http://lynx.browser.org/
(Cross-platform text based browser)
OmniWeb www.omnigroup.com/products/omniweb/
(Macintosh-only)
Opera www.opera.com
(Cross-platform, small, efficient and standards-led browser)
Plucker www.plkr.org/index.pl/faq#1.1
(Palm handhelds. Written in Python)
pwWebSpeak www.prodworks.com/issound/catalog/catalog_pwwebspeak.html
Audio Browser
QWeb http://sunsite.auc.dk/qweb/ (Linux browser)
(see also http://browswerwatch.internet.com/news/story/qweb8.html)
retawq http://retawq.sourceforge.net/
Text-based browser for text terminals. Runs under Linux
SlimBrowser www.flashpeak.com/sbrowser/sbrowser.htm
Freeware tabbed browser
Sleipnir http://sleipnir.pos.to/software/sleipnir/index.html (Japanese)
Japanese browser, apparently with an English version available.
VMS_Mosaic http://vaxa.wvnet.edu/vmswww/vms_mosaic.html
(OpenVMS only version of Mosaic, a pre-Netscape browser)
WannaBe http://mindstory.com/wb2/
(Macintosh text-only browser)
w3m http://w3m.sourceforge.net/
(text-based browser)
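
To separate the usual "Mozilla…" strings from the rarer identifiers listed above, a simple prefix check may be enough. This is a minimal sketch: the function name classify_browser and the sample strings are hypothetical, and it assumes that the identifier appears at the very start of the User-Agent string.

```python
# Rarer browser identifiers from the list above.
RARE_BROWSERS = (
    "AmigaVoyager", "xChaos_Arachne", "IBrowse", "ICab", "JustView",
    "KMeleon", "Konqueror", "Lynx", "OmniWeb", "Opera", "Plucker",
    "pwWebSpeak", "QWeb", "retawq", "SlimBrowser", "Sleipnir",
    "VMS_Mosaic", "WannaBe", "w3m",
)


def classify_browser(user_agent):
    """Return 'Mozilla-style', the name of a listed rare browser, or 'unknown'."""
    if user_agent.startswith("Mozilla"):
        return "Mozilla-style"
    lowered = user_agent.lower()
    for name in RARE_BROWSERS:
        # Assumption: the identifier is the first thing in the string.
        if lowered.startswith(name.lower()):
            return name
    return "unknown"


# Illustrative strings only.
print(classify_browser("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))  # Mozilla-style
print(classify_browser("w3m/0.5.1"))                                            # w3m
```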
