Roll your own search engine?

rcjordan:
Would be particularly nice if there were a way to generate semi-curated static pages from the SERPs.  I see some highly specialized sites (food safety, medical research) which I think are using meta-search and selected RSS feed search as source material.
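
One way the "semi-curated static page" idea could be sketched: take an RSS search feed, keep only items whose links point at a hand-picked list of domains, and write the survivors out as plain HTML. Everything named here (the TRUSTED_DOMAINS list, the sample feed, the function names) is an assumption for illustration, not how any of the sites mentioned actually work.

```python
import html
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical curated allow-list; a real site would maintain its own.
TRUSTED_DOMAINS = {"example.org", "example.edu"}

# Made-up RSS 2.0 feed standing in for a search-results feed.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Search feed</title>
<item><title>Recall notice</title><link>https://www.example.org/recall</link></item>
<item><title>Spammy listicle</title><link>https://spam.example.com/top10</link></item>
</channel></rss>"""


def curated_items(rss_xml, trusted=TRUSTED_DOMAINS):
    """Return (title, link) pairs whose link host is on the allow-list."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        host = urlparse(link).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        if host in trusted:
            items.append((title, link))
    return items


def render_page(heading, items):
    """Render the curated items as a small static HTML fragment."""
    rows = "\n".join(
        f'  <li><a href="{html.escape(link, quote=True)}">{html.escape(title)}</a></li>'
        for title, link in items
    )
    return f"<h1>{html.escape(heading)}</h1>\n<ul>\n{rows}\n</ul>\n"


if __name__ == "__main__":
    print(render_page("Food safety", curated_items(SAMPLE_FEED)))
```

The curation step is just the allow-list here; a human editor reviewing the filtered items before publishing would be the "semi" in semi-curated.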

Brad:
The truth is my capabilities end around HTML 3.  I'd love to make a search engine but it's beyond me.

I look at these posts as Idea Virus posts: maybe somebody with more money or more skill will come along, think about this, and follow through.  The aim of this one was to point out some off-the-shelf resources that seem to be lying around, plus a few thoughts about what they might be used for.

My second goal for the DIY search service, which I didn't explain very well, is that it is part of "Brad's Ongoing Guerrilla Insurgency Against Google" (BOGIAG*).

Webmasters helped Google get started by putting Google search boxes on their websites.  My thought was: "What if every blogger/webmaster put up a search box for a non-Google engine?"  Then, progressing, "What if every webmaster could put together their own modular search engine, with few being exactly alike (combined spider, directory, RSS engine?), and put them on their websites?"  Then, "Well, what if one could provide a service to make it easy for webmasters to put together a modular search engine, a bit like Rollyo, a bit like Eurekster, but better?  And would anyone use it?"

Hence the post.

>>size of the database

I think you are right, LM.  I'm in the middle of a three-week test of Mojeek.com as my default engine.  On long multi-word queries it sometimes fails, but it keeps surprising me.  With DuckDuckGo and Bing-based engines, I pretty much know which trusted sites Bing will bring up for reviews and best-of tech lists.  But with Mojeek, I'm getting some real gems out of what would be "the long tail" on a major search engine.  I'm kinda amazed.

Aside

*BOGIAG is utilizing lots of tiny elements to get around Google as Gatekeeper, mainly for blogs.  These include: Indieweb.org elements like webmentions, syndication to social media for traffic, RSS, old-time blogrolls, curated micro-directories, maybe webrings, site searches to link several of our domains together, search boxes from anyone other than Google, search feeds, maybe one exclusive subject category on our blogs that Google and only Google is excluded from in robots.txt, etc.  Anything that is cheap, easy, off the shelf, low risk and kinda fun.  Sounds a little bit crazy to us, but to the younger set not so crazy.  They like the idea of reviving a retro-web of many search engines, many directories, blogrolls - anything to break the monopolies.
/Aside
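
The "Google and only Google is excluded" idea from the aside could look roughly like the robots.txt below, checked here with Python's standard urllib.robotparser. The /indie-only/ path, the example.com URLs and the SomeOtherBot name are assumptions made up for the sketch; Googlebot is the user-agent token Google's crawler identifies with.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot is kept out of one category,
# every other crawler may fetch everything.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /indie-only/

User-agent: *
Disallow:
"""


def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SomeOtherBot"):  # second name is made up
        for path in ("/indie-only/post-1", "/blog/post-2"):
            url = "https://example.com" + path
            print(agent, path, allowed(agent, url))
```

robots.txt is advisory, so this only works against crawlers that choose to honour it.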

BoL:
>Mojeek

Bit OT but a little insight:
https://www.mojeek.co.uk/search?q=scotland&moo=0 - default
https://www.mojeek.co.uk/search?q=scotland&moo=1 - on-page factors only
https://www.mojeek.co.uk/search?q=scotland&moo=2 - off-page factors only
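
For comparing other queries the same way, a small helper can build the three URLs. This is only a sketch: the base URL and the `moo` values are taken from the links above, the labels are the ones given there, and the function name is made up.

```python
from urllib.parse import urlencode

MOJEEK_SEARCH = "https://www.mojeek.co.uk/search"

# Meaning of each `moo` value as described in the post above.
MOO_MODES = {
    0: "default",
    1: "on-page factors only",
    2: "off-page factors only",
}


def moo_urls(query):
    """Map each mode label to a search URL for `query`."""
    return {
        label: f"{MOJEEK_SEARCH}?{urlencode({'q': query, 'moo': mode})}"
        for mode, label in MOO_MODES.items()
    }


if __name__ == "__main__":
    for label, url in moo_urls("scotland").items():
        print(f"{url} - {label}")
```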

Site search and custom ranking factors are two things more or less available (I've seen a page that spits out stats on dozens of ranking factors; you can change the weights via URL). Gauging/funding those kinds of things is something that may get looked into. Certainly, mounting a challenge in pure organic search on an international scale is a big task.

Brad:
>Mojeek

First, it takes a lot of guts to rely only on your own index and algo for a search engine in 2018.  I admire Mojeek for their moxie.  Qwant may be making their own index but they have Bing to do the heavy lifting right now.  Mojeek is out there all alone with only their own resources.

Second, thank you for those examples.  It makes the role of those factors so plain in contrast.

I've become a bit of a Mojeek fan since you tipped us off to it a couple of months ago.  Their algo is pretty good too.

Remember those Parallel search forms we used to have years ago?  The ones that showed Google, Yahoo, ATW results side by side for comparison on the same search.  I'd love to have one of those today for DDG, Startpage, Mojeek.

BoL:
Indeed. I spoke to Marc (the Mojeek founder) in 2006 on WmW before it got started, just talking about algos and ranking ideas. It's such a huge task that you'd think anyone trying it on their own would give up, but he's done an amazing job.

The parallel search forms wouldn't be too hard to code up with some iframes and JavaScript. Interesting that Startpage uses POST vars, which makes their SERPs harder to link to.
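
A rough sketch of the side-by-side idea, swapping the client-side JavaScript for a small Python script that writes out a static page of iframes. The URL templates are the engines' ordinary GET search URLs; Startpage is left out because of the POST issue just mentioned. Note that some engines may send headers that forbid being framed, in which case a real version would have to open the results in separate windows or link out instead.

```python
import html
from urllib.parse import quote_plus

# GET-style search URL templates, one per engine to compare.
ENGINES = {
    "DuckDuckGo": "https://duckduckgo.com/?q={query}",
    "Mojeek": "https://www.mojeek.com/search?q={query}",
}


def parallel_page(query, engines=ENGINES):
    """Return an HTML page showing each engine's results in its own iframe."""
    encoded = quote_plus(query)
    width = 100 // len(engines)  # split the window evenly between engines
    frames = "\n".join(
        f'<iframe title="{html.escape(name)}" '
        f'src="{html.escape(template.format(query=encoded))}" '
        f'style="width:{width}%;height:90vh;border:0;float:left"></iframe>'
        for name, template in engines.items()
    )
    return f"<!doctype html>\n<title>{html.escape(query)}</title>\n{frames}\n"


if __name__ == "__main__":
    with open("parallel.html", "w", encoding="utf-8") as fh:
        fh.write(parallel_page("static site search"))
```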

Also, maybe you'd be interested in yacy.net if you've not seen it.

Do pass on your experiences with Mojeek and your trial via the feedback option; it's invaluable to hear about real-world experience.
