Author Topic: Common Search: large-scale, nonprofit search engine

bill

  • Devil's Avocado
  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1286
  • Avast!
Common Search: large-scale, nonprofit search engine
« on: August 08, 2016, 06:09:50 AM »
https://about.commonsearch.org/faq

Noticed this search engine project because they just made a Tor node available.
https://about.commonsearch.org/2016/08/state-of-common-search-august-2016/

They are only using publicly available data sets, mostly Common Crawl.

BoL

  • Inner Core
  • Hero Member
  • *
  • Posts: 1209
Re: Common Search: large-scale, nonprofit search engine
« Reply #1 on: August 08, 2016, 10:22:52 AM »
I'm cynical, mainly because the Common Crawl dataset is only in the low billions of pages. I think the coverage would be too low regardless of the quality of the algo.

Brad

  • Inner Core
  • Hero Member
  • *
  • Posts: 4154
  • What, me worry?
Re: Common Search: large-scale, nonprofit search engine
« Reply #2 on: August 08, 2016, 12:42:16 PM »
At least somebody is spidering the web besides Bing and Google.  I guess we will wait and see.

rcjordan

  • I'm consulting the authorities on the subject
  • Global Moderator
  • Hero Member
  • *****
  • Posts: 16345
  • Debbie says...
Re: Common Search: large-scale, nonprofit search engine
« Reply #3 on: August 08, 2016, 02:33:35 PM »
>somebody

They ain't got G maps or Waze.  Anecdotally, I see more & more eyeballs drawn in servitude to G because they are dependent upon these.

Mackin USA

  • Inner Core
  • Hero Member
  • *
  • Posts: 2905
  • Abstract Artist
Re: Common Search: large-scale, nonprofit search engine
« Reply #4 on: August 08, 2016, 03:26:31 PM »
I tried to seed a site to them and it DID NOT WORK

whatever :D
Mr. Mackin


littleman

  • Administrator
  • Hero Member
  • *****
  • Posts: 6552
Re: Common Search: large-scale, nonprofit search engine
« Reply #6 on: August 08, 2016, 05:23:13 PM »
BoL, well, they have to start somewhere. In their FAQ they say that the search is still incomplete. I think there is definitely room, with the market as big as it is and so few players these days. The few tests I did seemed not too bad.

An interesting thing about the SE business model is that server infrastructure is much more a function of the number of users than of the size of the data being queried. So, in theory, a good search engine could be started on a budget, and the hardware could then be expanded as the user base grows.

BoL

  • Inner Core
  • Hero Member
  • *
  • Posts: 1209
Re: Common Search: large-scale, nonprofit search engine
« Reply #7 on: August 08, 2016, 07:19:40 PM »
There's definitely room, littleman, and I'm all for them trying to make it happen. Perhaps if the Common Crawl dataset were to grow a lot, it would be able to satisfy more queries. One major problem it's going to have is freshness, but obviously they can keep knocking down those barriers to entry and keep improving. I'm not even sure how many 'pages' are on the web nowadays, but it seems like their crawl data needs to grow by 10x to 100x.

Majestic's historical index is sitting at 865 billion.

A member from WmW I've spoken to a bit built his own, mojeek.co.uk. They have a similar philosophy to DDG but, IIRC, are just focusing on the UK.