Major search engines like Bing have introduced a spam filtering mechanism that identifies spam by applying a set of rules. The change has affected almost 3% of all search queries. The filter targets keyword stuffing in the URL or domain name, a tactic some websites use to boost their ranking in the search engine. According to Bing, it is important to address URL keyword stuffing and reduce spam because URLs play a significant role in SERP presence. When keyword-stuffed spam URLs appear on the results page, they entice visitors to click on them by appearing to be the best match for their queries.
Bing Index Quality has investigated several aspects associated with keyword stuffing in the domain and URL, including the number of hosts, the number of words in the host/domain name and path, the size of the website, and other factors. Such spam URLs tend to attract attention and increase the click-through rate for their websites, which makes URL keyword stuffing a black-hat SEO technique. However, the search engine has not unveiled specific details about the algorithm used for spam filtering, because spammers could use that information to develop better tactics for evading detection.
Typically, the search engine's spam filtering mechanism looks at several aspects, including:
– Domain/hostname that contains lexicons/pattern combinations (e.g. product name, year, event)
– URL squatting
– Quality of page/website content and other signals of popularity
– Percentage of the site cluster that consists of top-searched host/domain name keywords
– Host/domain/path keyword co-occurrence, which includes unigrams and bigrams
Considering these aspects, the search engine detects spam by weighing factors such as the size of the website, the number of hosts, the number of words in the host or domain name and path, lexicon combinations and patterns in host or domain names, and the content quality of the website or page. The algorithm applies a set of rules to identify potential and obvious spam in the URL or domain name and immediately blocks it from appearing in the SERP.
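Since Bing has not published its rules, the following Python sketch is only an illustration of how URL-level signals of this kind could be computed. The keyword list, the thresholds, and the function names (`url_features`, `looks_stuffed`) are all hypothetical, and the host parsing is deliberately crude (it simply drops the last label of the hostname as the TLD).

```python
import re
from urllib.parse import urlparse

# Hypothetical set of heavily searched keywords; a real system would
# presumably derive something like this from query logs.
TOP_KEYWORDS = {"cheap", "buy", "best", "online", "free", "deals"}


def tokens(text):
    """Split a host or path into lowercase alphanumeric word tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def url_features(url):
    """Compute simple word-count and keyword signals for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Crude assumption: the last dot-separated label is the TLD.
    host_words = tokens(" ".join(host.split(".")[:-1]))
    path_words = tokens(parsed.path)
    words = host_words + path_words
    bigrams = list(zip(words, words[1:]))
    keyword_hits = sum(1 for w in words if w in TOP_KEYWORDS)
    return {
        "host_word_count": len(host_words),
        "path_word_count": len(path_words),
        # Share of unigrams that are top-searched keywords.
        "keyword_ratio": keyword_hits / len(words) if words else 0.0,
        # Adjacent word pairs in which both words are keywords.
        "keyword_bigrams": sum(
            1 for a, b in bigrams if a in TOP_KEYWORDS and b in TOP_KEYWORDS
        ),
    }


def looks_stuffed(url, max_host_words=3, max_ratio=0.5):
    """Toy rule: many words in the host AND a high keyword share."""
    f = url_features(url)
    return f["host_word_count"] > max_host_words and f["keyword_ratio"] >= max_ratio
```

With these made-up thresholds, `looks_stuffed("http://cheap-best-buy-online-deals.example/free-deals")` returns `True`, while `looks_stuffed("https://www.example.com/about")` returns `False`. A production filter would combine many more signals, including content quality and popularity, rather than rely on a single rule.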
To improve the efficiency of spam filtering, Bing clusters websites based on factors such as domain and owner, and then looks for the signal patterns listed above within similar clusters. This procedure improves the accuracy of spam detection because many spammers build hundreds of similar-looking websites.
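The clustering idea can be sketched in a few lines of Python. This is a minimal illustration, not Bing's method: it assumes each site is a dictionary with a hypothetical `owner` field and a precomputed boolean `stuffed` flag, and it uses arbitrary thresholds for cluster size and the share of flagged sites.

```python
from collections import defaultdict


def cluster_by_owner(sites):
    """Group site records by their (hypothetical) 'owner' field."""
    clusters = defaultdict(list)
    for site in sites:
        clusters[site["owner"]].append(site)
    return dict(clusters)


def suspicious_clusters(sites, min_size=3, min_share=0.6):
    """Return owners whose clusters are large and mostly keyword-stuffed."""
    flagged = {}
    for owner, members in cluster_by_owner(sites).items():
        # Fraction of the cluster already marked as keyword-stuffed.
        share = sum(1 for s in members if s["stuffed"]) / len(members)
        if len(members) >= min_size and share >= min_share:
            flagged[owner] = share
    return flagged


sites = [
    {"host": "cheap-deals-one.example", "owner": "A", "stuffed": True},
    {"host": "cheap-deals-two.example", "owner": "A", "stuffed": True},
    {"host": "cheap-deals-three.example", "owner": "A", "stuffed": True},
    {"host": "a-blog.example", "owner": "A", "stuffed": False},
    {"host": "news.example", "owner": "B", "stuffed": False},
]
print(suspicious_clusters(sites))
```

Looking at the cluster as a whole means that one borderline URL is judged in the context of its siblings, which is the accuracy benefit the paragraph above describes.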