@adamsdesk You know all the “European search engines” that turn out to be rebranded versions of Microsoft Bing? Well, @Mojeek seems to be the real deal, with their own crawlers, index, and rankings. You definitely get different results than with Bing or Google, and I believe there’s value in that. Mojeek isn’t good for searching for recent news and events, but works fine for general queries. (Except for the frequent 403 Forbidden errors.)
@Mojeek I don’t remember any concrete queries, but I get the errors roughly every four or five searches. Reloading doesn’t help. Other searches often fail too if I try them right after running into an HTTP 403. Maybe include a captcha or something on the 403 pages to bypass the blocking. Or a “Report blocked query” link to gather more details. You’re saying it has to do with the queries? It’s not an over-aggressive WAF or rate-limiting?
@Mojeek Firefox via OpenSearch. Why are you returning a 403, though? Decoding this encoding server-side should still turn it into a space, using the URL decoder built into just about any programming language.
@Mojeek The query works fine with the %20 encoding when I remove the quotation marks (whether literal " or %22) from the query.
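To illustrate the decoding point made above, here is a minimal sketch using Python's standard `urllib.parse`. The sample queries and the `decode_query` helper are made up for illustration; the actual queries from this thread are not known, and nothing here reflects how Mojeek handles requests.

```python
from urllib.parse import unquote_plus


def decode_query(raw: str) -> str:
    """Decode a percent-encoded query value.

    Hypothetical helper for illustration only. unquote_plus also treats
    '+' as a space, which matches form-style query strings.
    """
    return unquote_plus(raw)


# Made-up sample queries (assumption: not taken from the thread).
samples = [
    "hello%20world",          # space encoded as %20
    "hello+world",            # space encoded as +
    "%22hello%20world%22",    # quotes encoded as %22
    '"hello%20world"',        # literal quotes, %20 for the space
]

for raw in samples:
    print(repr(raw), "->", repr(decode_query(raw)))
```

All four variants decode to the same words, with the last two keeping their quotation marks, so the encoding style alone should not change what the server sees as the query.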
@da Our mistake; it's not just the %20 encoding, it's the use of quotes, which is hitting our bot blocker for some queries. We're constantly working to update our ability to block bots and are looking at better ways of doing this going forwards. Also, we've not witnessed Firefox encoding that way before, and we use it with the OpenSearch plugin.
@Mojeek Add an “I’m a human! Bypass the blocking for this query (one bypass per IP/hour).” link on the blocker page. That way, you’ll also get feedback about false positives. Add a captcha if absolutely necessary.
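A rough sketch of what the “one bypass per IP/hour” rule suggested above could look like. Everything here is hypothetical: the `BypassLimiter` class, its method name, and the in-memory storage are assumptions for illustration, not anything Mojeek has described.

```python
import time
from typing import Dict, Optional

BYPASS_WINDOW_SECONDS = 3600.0  # one hour, per the suggestion above


class BypassLimiter:
    """Hypothetical in-memory limiter: one bypass per IP per window."""

    def __init__(self, window: float = BYPASS_WINDOW_SECONDS) -> None:
        self.window = window
        # Maps an IP address to the time its last bypass was granted.
        self._last_bypass: Dict[str, float] = {}

    def try_bypass(self, ip: str, now: Optional[float] = None) -> bool:
        """Return True and record the bypass if this IP has none in the window."""
        current = time.monotonic() if now is None else now
        last = self._last_bypass.get(ip)
        if last is not None and current - last < self.window:
            return False  # already used a bypass within the window
        self._last_bypass[ip] = current
        return True


limiter = BypassLimiter()
print(limiter.try_bypass("203.0.113.7", now=0.0))     # first request in the window
print(limiter.try_bypass("203.0.113.7", now=1800.0))  # 30 minutes later, same IP
```

Each granted bypass could also be logged together with the blocked query, which would give the false-positive feedback mentioned above. A real deployment would need shared storage and expiry of old entries, both of which this sketch leaves out.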
@da Cheers for the suggestion; it's something that we are definitely going to look into. If you've got anything else, then we're all ears!