Difference between revisions of "Open Search"

From Hack Sphere Labs Wiki
(Created page with "I really think that decentralized search will be the future. Reasons: *More Freedom *DCMA Requests PFFFTT? (Trying getting 300 million people to remove a result) *Content is own...")
 
 
=Star Trek Computer=
*http://fogbeam.blogspot.com/2013/05/why-star-trek-computer-will-be-open.html

=Elastic Search=
*http://www.elasticsearch.org/videos/to-infinity-and-beyond/

=Decentralized Search=

I really think that decentralized search will be the future.

Reasons:
*More Freedom
*DMCA requests? Pfft. (Try getting 300 million people to remove a result.)
*Content is owned by no one. This means that the content is not a lie.
*When you know what is under the hood, you know what you are getting and how to change it.
*I like the fact that it will confuse ISPs when every user's computer is maxing out the "unlimited" bandwidth that they purchase every month.
*You can index anything and everything if you want. Facebook, Twitter, etc. can't block the entire internet.

Finally, someone has released a decentralized search engine, YaCy:
*http://yacy.net/en/index.html
*http://www.yacy-websuche.de/wiki/index.php/Dev:GITAccess

==Resources and Links==
*http://www.opensearch.org/Home

==Custom Search Engine==

The original Google paper describing its search engine is a great starting point for building a search engine. The document retrieval and storage processes are easier to build today than they were back then.

*http://doc.scrapy.org/en/latest/intro/install.html
*http://readthedocs.org/docs/scrapy/en/0.12/intro/tutorial.html

The indexing and index storage could start out the same as in existing engines, but in the end they should be different. Right now all search engines are much alike: they use similar methods to index information. This is where the true experiment comes in.

Here is a reference to a crawl data set you can just download, from the Common Crawl Foundation: http://search.slashdot.org/story/11/11/15/0057200/common-crawl-foundation-providing-data-for-search-researchers

===Algorithm Experiments===

*Image Matching?
*IP relationships?
*Last updated
*Links
*Metadata
*Features of the site (Web 2.0-ish?)
*Information Category (Forums, Blogs, Feeds)

=Notes=
*http://www.sphider.eu

=The Plan=
*Get the Google paper
*Pull the objects out of it
*Create the objects
*Import them onto a local (virtual) server
*Set up a server on a decently fast consumer internet connection to test
*Move to a dedicated box

Latest revision as of 08:40, 23 May 2013

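Star Trek Computer

  • http://fogbeam.blogspot.com/2013/05/why-star-trek-computer-will-be-open.html

Elastic Search

  • http://www.elasticsearch.org/videos/to-infinity-and-beyond/
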
Decentralized Search

I really think that decentralized search will be the future. Reasons:

  • More Freedom
  • DMCA requests? Pfft. (Try getting 300 million people to remove a result.)
  • Content is owned by no one. This means that the content is not a lie.
  • When you know what is under the hood, you know what you are getting and how to change it.
  • I like the fact that it will confuse ISPs when every user's computer is maxing out the "unlimited" bandwidth that they purchase every month.
  • You can index anything and everything if you want. Facebook, Twitter, etc. can't block the entire internet.


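Finally, someone has released a decentralized search engine, YaCy:

  • http://yacy.net/en/index.html
  • http://www.yacy-websuche.de/wiki/index.php/Dev:GITAccess

Resources and Links

  • http://www.opensearch.org/Home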

Custom Search Engine

The original Google paper describing its search engine is a great starting point for building a search engine. The document retrieval and storage processes are easier to build today than they were back then.

  • http://doc.scrapy.org/en/latest/intro/install.html
  • http://readthedocs.org/docs/scrapy/en/0.12/intro/tutorial.html
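
As a rough illustration of how little code the retrieval step needs today, here is a minimal sketch that pulls the outgoing links from one page using only the Python standard library. A framework such as Scrapy does this far more robustly. The names `LinkExtractor` and `extract_links` and the sample HTML are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the outgoing links of one page, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content; a real crawler would fetch this over HTTP.
SAMPLE_HTML = '<p><a href="/about">About</a> <a href="http://example.org/x">X</a></p>'
links = extract_links("http://example.com/index.html", SAMPLE_HTML)
```

A crawler would push each extracted link onto a queue of pages still to fetch and store the page text for the indexing step.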

The indexing and index storage could start out the same as in existing engines, but in the end they should be different. Right now all search engines are much alike: they use similar methods to index information. This is where the true experiment comes in.
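
The common baseline that an experiment would start from is probably an inverted index: a map from each term to the documents that contain it. The sketch below is a toy version under that assumption; the function names and the three sample documents are invented for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND search)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical toy corpus.
DOCS = {
    1: "Decentralized search will be the future",
    2: "Elastic Search is a search server",
    3: "The future of the open web",
}
index = build_index(DOCS)
```

With this corpus, `search(index, "search future")` matches only document 1. Anything that makes a new engine different would replace or extend `build_index` and the way results are ordered.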

Here is a reference to a crawl data set you can just download, from the Common Crawl Foundation: http://search.slashdot.org/story/11/11/15/0057200/common-crawl-foundation-providing-data-for-search-researchers


Algorithm Experiments

  • Image Matching?
  • IP relationships?
  • Last updated
  • Links
  • Metadata
  • Features of the site (Web 2.0-ish?)
  • Information Category (Forums, Blogs, Feeds)
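
One possible way to run experiments with signals like these is to turn each one into a number between 0 and 1 and combine them with adjustable weights. The sketch below does this for three of the signals (last updated, links, metadata). The formulas, the default half-life and saturation values, the weights, and all field names are assumptions chosen for illustration, not a recommended ranking function.

```python
import math
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    days_since_update: float  # "Last updated" signal
    inbound_links: int        # "Links" signal
    has_metadata: bool        # "Metadata" signal


def freshness(page, half_life_days=30.0):
    """Exponential decay: 1.0 for a page updated today, 0.5 after one half-life."""
    return 0.5 ** (page.days_since_update / half_life_days)


def link_score(page, saturation=1000):
    """Log-scaled inbound link count, capped at 1.0."""
    return min(1.0, math.log1p(page.inbound_links) / math.log1p(saturation))


def score(page, weights):
    """Weighted sum of the individual signals; the weights are the experiment knobs."""
    return (
        weights["fresh"] * freshness(page)
        + weights["links"] * link_score(page)
        + weights["meta"] * (1.0 if page.has_metadata else 0.0)
    )


def rank(pages, weights):
    """Return pages ordered from highest to lowest score."""
    return sorted(pages, key=lambda p: score(p, weights), reverse=True)


# Hypothetical pages and weights, for illustration only.
PAGES = [
    Page("http://example.com/old-popular", 365.0, 900, True),
    Page("http://example.com/new-obscure", 1.0, 2, False),
]
by_links = rank(PAGES, {"fresh": 0.1, "links": 0.8, "meta": 0.1})
by_fresh = rank(PAGES, {"fresh": 0.9, "links": 0.1, "meta": 0.0})
```

Changing the weights flips the order of the two sample pages, which is the kind of comparison an algorithm experiment would make on real crawl data.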

Notes

  • http://www.sphider.eu

The Plan

  • Get the Google paper
  • Pull the objects out of it
  • Create the objects
  • Import them onto a local (virtual) server
  • Set up a server on a decently fast consumer internet connection to test
  • Move to a dedicated box