Eternal Lands Official Forums
The_Piper

Search engine with a de-centralized index


Looking for an alternative to Google, Bing, or whatever?

 

One interesting thing is YaCy (Yet another Cyberspace).

http://yacy.net/en/index.html

 

The advantage of YaCy is that it has no central search index, unlike Google, Bing, and other search engines, so there is nothing that authorities can easily censor, manipulate, or block.

 

Instead of a central search index, there are many local indexes that users run on their own computers, connected via P2P technology.

 

How to run YaCy:

 

Download and unpack the software on your computer or server, then run startYACY.sh (or startYACY.bat on Windows) to start up the search engine, web crawler, web interface and so on.

 

(To stop it, run stopYACY.sh or stopYACY.bat, respectively.)
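
If you want to script the start and stop steps, here is a minimal Python sketch that only picks the right script name for the operating system. The function name yacy_script and the yacy_dir argument are my own assumptions for illustration; only the four script names come from the description above.

```python
import platform
from pathlib import Path
from typing import Optional


def yacy_script(action: str, yacy_dir: str, system: Optional[str] = None) -> Path:
    """Return the path of the start or stop script inside yacy_dir.

    action   -- "start" or "stop"
    yacy_dir -- directory where YaCy was unpacked (assumed, adjust to your setup)
    system   -- OS name as returned by platform.system(); detected if omitted
    """
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    system = system or platform.system()
    # Windows uses the .bat scripts, everything else the .sh scripts.
    suffix = ".bat" if system == "Windows" else ".sh"
    return Path(yacy_dir) / f"{action}YACY{suffix}"
```

The returned path can then be run from a terminal, or handed to subprocess.run if you prefer to stay in Python.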

 

Now you have a running search engine with a local search index on your computer, which connects via p2p technology to other servers in the YaCy network.

 

Start a browser and open http://localhost:8090

 

(If you are running YaCy on a server, use the server's hostname or IP address instead of localhost.)
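
To check from a script whether the web interface answers, something like the following should do. It is only a sketch using Python's standard library; the helper names interface_url and is_reachable are made up for this example, and the port 8090 is the one mentioned above.

```python
import urllib.error
import urllib.request


def interface_url(host: str = "localhost", port: int = 8090) -> str:
    """Build the address of the YaCy web interface."""
    return f"http://{host}:{port}/"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if something answers HTTP requests at url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status
        # (for example because it wants a login first).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, unknown host and similar.
        return False
```

Calling is_reachable(interface_url()) a few seconds after the start script has run tells you whether the interface is up yet.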

 

Now you see the interface of your local search engine, which will search your local search index and ask via p2p for more search results from other YaCy search engines, run by other users.
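
To make the idea of "local index plus asking other peers" a bit more concrete, here is a toy model in Python. It is NOT how YaCy actually talks to other peers; the Peer class, its methods, and the example URLs are all invented for illustration.

```python
from collections import defaultdict


class Peer:
    """A toy peer with its own small index (word -> set of URLs)."""

    def __init__(self, name):
        self.name = name
        self.index = defaultdict(set)
        self.neighbours = []  # other Peer objects this peer can ask

    def add_page(self, url, text):
        """Store every word of text in the local index."""
        for word in text.lower().split():
            self.index[word].add(url)

    def local_search(self, word):
        """Look only in this peer's own index."""
        return set(self.index.get(word.lower(), set()))

    def search(self, word):
        """Search locally, then merge in what the neighbours know."""
        results = self.local_search(word)
        for peer in self.neighbours:
            results |= peer.local_search(word)
        return results


# Two peers, each of which indexed one (made-up) page.
mine = Peer("mine")
other = Peer("other")
mine.neighbours.append(other)
mine.add_page("http://example.org/a", "decentralized search engine")
other.add_page("http://example.net/b", "peer to peer search")
```

Here mine.local_search("peer") finds nothing, but mine.search("peer") returns the page that only the other peer has indexed, which is the whole point of connecting the local indexes.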

 

Nice, isn't it? :)

 

But how do you fill your local search index?

 

At the top left of the page is an "Administration" button.

 

Click on it.

 

You will then see the YaCy Administration screen.

 

(If you are asked for a password for the admin account, go to your YaCy directory, switch to the bin directory, and run passwd.sh followed by your desired password, for example: passwd.sh myyacypassword. Then log in with the account name "admin" and the password you set with passwd.sh.)

 

On the left side of the screen is the menu entry "Index production" / "Crawler/Harvester"; click on it.

 

On the next screen you can enter the URL of a website that the web crawler should download and store in your local search index.

 

To see what your local web crawler is actually doing, click on "Index production" / "Creation Monitor"

 

To return to the search interface and search for stuff, click on "Search Interface" / "Web search".

 

That's what I've found out about YaCy this evening.

 

It would be nice if more people joined this project by running the software, increasing the number of local search indexes.

 

Have fun with it!

 

Piper

Edited by The_Piper


Interesting. The only thing that worries me is having to download and install something from an unknown source.

Currently, I use duckduckgo and ixquick search engines which supposedly don't log and save any user information or search queries/results.


Not only that, but if this becomes popular, just think of how many additional search engines will be crawling the web and hitting your site. Google at least has mechanisms in place to make sure its bots don't hit you too often; with this you could get hit constantly by thousands!


with this you could get hit constantly by thousands!

 

"constantly hit by thousands" is a little bit extreme.

 

As far as I've found out, by default you have to start the crawler manually and tell it which websites to crawl and index.

 

If you want, you can set a time interval for each website after which the crawler revisits it.

 

But that must be done manually for each website, so by default the crawler fetches the website once and that's it.
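
The "once by default, again only if you set a period" behaviour can be written down as a small rule. This is a sketch of that rule as described in this post, not code taken from YaCy; the function and parameter names are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def crawl_is_due(last_crawl: Optional[datetime],
                 revisit_every: Optional[timedelta],
                 now: datetime) -> bool:
    """Decide whether a website should be fetched (again).

    last_crawl    -- when the site was last fetched, None if never
    revisit_every -- revisit period set by the user, None if not set
    """
    if last_crawl is None:
        return True   # never fetched, so the first crawl is due
    if revisit_every is None:
        return False  # no period set: fetched once and that's it
    return now - last_crawl >= revisit_every
```

With no period set, a site that was crawled a month ago is simply left alone; with a period of seven days it becomes due again after a week.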

 

Piper


Think of how many people would be indexing Wikipedia every day, though.

Indexing Wikipedia makes no sense, because the crawler only follows links on web pages, and there are only a few on the Wikipedia front page.

 

Piper


The Wikipedia main page has links to pages with lots of other links, and those eventually get to articles, and then you have "see also" links and links to the other language sites. You can find a lot just by following links from the main page.
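
How far link-following gets you from a single start page can be tried out on a tiny link graph. The sketch below is a plain breadth-first search with a depth limit; the page names in the links dictionary are invented and have nothing to do with the real Wikipedia link structure.

```python
from collections import deque


def reachable(links, start, max_depth):
    """Return all pages reachable from start within max_depth link hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen


# Invented link graph: page -> pages it links to.
links = {
    "Main": ["Portal:Science", "Portal:History"],
    "Portal:Science": ["Physics", "Chemistry"],
    "Portal:History": ["Rome"],
    "Physics": ["Chemistry", "Main"],
    "Orphan": ["Main"],
}
```

With one hop you get 3 pages, with two hops already 6, so a few links on the start page go a long way. The page "Orphan" is never found, however deep you go, because nothing links to it.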


You can find a lot just by following links from the main page.

 

True, but you won't be able to crawl the whole of Wikipedia just by starting at the main page.

 

But I get your point.

 

On the other hand, YaCy (and other search engines like it) only makes sense if people index a variety of sites and not everybody focuses on Wikipedia and the like.

 

You can also limit which pages the crawler fetches, for example only pages from the same domain. de.wikipedia.org is the German Wikipedia; if the crawler is told not to follow external links, en.wikipedia.org, the English one, will not be crawled.
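
The same-domain restriction boils down to comparing host names. A minimal sketch with Python's standard library, assuming the rule is simply "same host as the start URL" (how YaCy decides this internally is not something I have checked); same_host is a made-up helper name.

```python
from urllib.parse import urlsplit


def same_host(url, start_url):
    """True if url lives on the same host as the crawl's start URL."""
    return urlsplit(url).hostname == urlsplit(start_url).hostname


start = "https://de.wikipedia.org/"
stays = same_host("https://de.wikipedia.org/wiki/Berlin", start)
leaves = same_host("https://en.wikipedia.org/wiki/Berlin", start)
```

Here stays is True and leaves is False, so a crawl started on the German Wikipedia would skip the link to the English one.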

 

Furthermore, you can limit the number of pages the crawler fetches from one domain.
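
A per-domain page limit is basically a counter per host. The following is a hypothetical sketch of such a limit, not YaCy's implementation; the class name PageBudget and its method are invented.

```python
from collections import Counter
from urllib.parse import urlsplit


class PageBudget:
    """Allow at most max_pages_per_host fetches for each host."""

    def __init__(self, max_pages_per_host):
        self.max_pages_per_host = max_pages_per_host
        self.fetched = Counter()  # host -> number of pages allowed so far

    def allow(self, url):
        """Return True and count the page if the host is under its limit."""
        host = urlsplit(url).hostname
        if self.fetched[host] >= self.max_pages_per_host:
            return False
        self.fetched[host] += 1
        return True
```

A crawler loop would call allow(url) before each download and skip the URL when it returns False, so one big site cannot eat the whole crawl.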

 

OK, if someone misconfigures the crawler, they can really cause some kind of DDoS, but you can do that too by abusing wget, without YaCy.
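
A common way for a crawler to avoid hammering a site is to enforce a minimum delay between two requests to the same host. Here is a minimal sketch of that idea; it is a generic technique, not a description of what YaCy or wget do, and the class and method names are invented. Times are plain seconds (for example from time.monotonic()).

```python
class PolitenessGate:
    """Enforce a minimum delay between requests to the same host."""

    def __init__(self, min_delay_seconds):
        self.min_delay = min_delay_seconds
        self.next_allowed = {}  # host -> earliest time of the next request

    def seconds_to_wait(self, host, now):
        """How long the crawler should still wait before hitting host."""
        return max(0.0, self.next_allowed.get(host, now) - now)

    def record_request(self, host, now):
        """Note that host was just requested at time now."""
        self.next_allowed[host] = now + self.min_delay
```

Before each download the crawler would sleep for seconds_to_wait(host, now) and call record_request afterwards; other hosts are not delayed by this.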

 

Piper


The other part of my point is that there are stupid people out there who will misconfigure it and then do dumb things. The example is a worst case, but unfortunately it is within the realm of possibility.


Interesting. The only thing that worries me is having to download and install something from an unknown source.

Oh, c'mon, that argument is pretty lame. By running a Linux distribution you download stuff from unknown sources every day, and by accepting or trusting some stupid certificates every day you trust things from unknown sources.

 

You don't even have a chance to judge whether stuff is harmful or harmless, trustworthy or not, and neither do I, nor nearly 99.9999999% of mankind :P

 

Currently, I use duckduckgo and ixquick search engines which supposedly don't log and save any user information or search queries/results.

I tried duckduckgo too, but got some strange, unexpected results when I was looking for some harmless things, like travel tips and such. But yes, it's a nice search engine too (I haven't tried ixquick so far).

 

But they tell you that they don't log and save user data, the same way Google and Micro$oft tell you that they don't collaborate with the NSA.

 

Who do you believe, who can you trust?

 

Anyway, YaCy has around 1000 supporters who run the software, and the YaCy project was started and is run by a member of the www.heise.de forum, the most evil German computer forum, full of trolls, maniacs and paranoids. If the software really were malware, at least one of them would have been paranoid enough to monitor what YaCy is doing, and if it were malware or misbehaved, would have posted about it there. That hasn't happened so far :P

 

Piper


Google is a bit aggressive these days. I updated my phone to Android 4.4 (kitkat) yesterday and I found google to be getting really tricksy with the privacy invasion stuff.

 

Phone: We need to enable search history for optimal user experience.

Me: hmmm. no thank you!

Phone: Press Yes to upload your photos and share them.

Me: No (and I press No).

Phone: Ok. You have chosen to upload your photos later.

Me: Noooooooooo. I did not say later. I said NO!

Phone: Ok, maybe later.

Edited by hussam


Google is a bit aggressive these days. I updated my phone to Android 4.4 (kitkat) yesterday and I found google to be getting really tricksy with the privacy invasion stuff.

 

Phone: We need to enable search history for optimal user experience.

Me: hmmm. no thank you!

Phone: Press Yes to upload your photos and share them.

Me: No (and I press No).

Phone: Ok. You have chosen to upload your photos later.

Me: Noooooooooo. I did not say later. I said NO!

Phone: Ok, maybe later.

 

See?

 

The NSA wants you. Or, at least, your data.

 

YOU.

 

YOU!!

 

*Points at hussam with an evil grin*

 

Maybe I should try to update my Android phone too and find out what happens then :D

 

Piper

 

Who do you believe, who can you trust?

 

 

Lavabit? ;)

 

Trustworthy company, but unfortunately offline :)

 

Piper

