How do I block junk spiders from crawling my website?

Echo · Updated 2024-09-20 · 795 views

My site gets hit by a lot of junk spiders, which wastes bandwidth. How can I block them?



Tags: spider

You can reject these clients in Nginx by matching on the User-Agent header:

    # Block junk search-engine spiders
    if ($http_user_agent ~* "CheckMarkNetwork|Synapse|Nimbostratus-Bot|Dark|scraper|LMAO|Hakai|Gemini|Wappalyzer|masscan|crawler4j|Mappy|Center|eright|aiohttp|MauiBot|Crawler|researchscan|Dispatch|AlphaBot|Census|ips-agent|NetcraftSurveyAgent|ToutiaoSpider|EasyHttp|Iframely|sysscan|fasthttp|muhstik|DeuSu|mstshash|HTTP_Request|ExtLinksBot|package|SafeDNSBot|CPython|SiteExplorer|SSH|MegaIndex|BUbiNG|CCBot|NetTrack|Digincore|aiHitBot|SurdotlyBot|null|SemrushBot|Test|Copied|ltx71|Nmap|DotBot|AdsBot|InetURL|Pcore-HTTP|PocketParser|Wotbox|newspaper|DnyzBot|redback|PiplBot|SMTBot|WinHTTP|Auto Spider 1.0|GrabNet|TurnitinBot|Go-Ahead-Got-It|Download Demon|Go!Zilla|GetWeb!|GetRight|libwww-perl|Cliqzbot|MailChimp|Dataprovider|XoviBot|linkdexbot|SeznamBot|Qwantify|spbot|evc-batch|zgrab|Go-http-client|FeedDemon|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|EasouSpider|LinkpadBot|Ezooms|BLEBOT|petalbot|ZoominfoBot|IndeedBot|Buck|SEOkicks") {
        return 403;
    }

    # Block scanner and script clients
    if ($http_user_agent ~* "crawl|curb|git|Wtrace|Scrapy") {
        return 403;
    }

Nginx usage: save the code above as a file in the nginx conf directory, e.g. denybot.conf, then add "include denybot.conf;" inside the server block of your site configuration and reload nginx.
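For reference, here is a minimal sketch of where the include line sits. The path /etc/nginx/denybot.conf, the domain example.com, and the web root /var/www/html are all placeholder assumptions; adjust them to your own layout:

    server {
        listen 80;
        server_name example.com;            # placeholder domain

        # Pull in the user-agent blocklist defined above.
        # Assumed save location; point this at wherever you put denybot.conf.
        include /etc/nginx/denybot.conf;

        location / {
            root /var/www/html;             # assumed web root
            index index.html;
        }
    }

After reloading nginx (nginx -s reload), you can check that the rules took effect by sending a request with a spoofed user agent from the list, e.g. curl -I -A "SemrushBot" http://example.com/ should come back 403 Forbidden, while a normal browser user agent still gets 200.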
