https://search.marginalia.nu
The Marginalia search engine specializes in text heavy, non-commercial web pages. It has several different algorithms you can try.
https://twitter.com/MarginaliaNu/status/1583464144686104576
Quote: But get this: Marginalia now indexes 106 million documents! Off a single PC. This is kinda bonkers. Previous record was barely above 60 million. Turns out modern computers are kinda powerful.
Crawling took 2 weeks. The index is 1.1 Tb.
>specializes in text heavy, non-commercial
I wonder how they differentiate between commercial & non-commercial? Graphics? Logos? Topics? Keywords? All of the above?
>differentiate
Dunno. There is some human screening going on. I suspect the number of ads plays a part, and probably graphics too.