It seems to me that we actually have a potential alternative to the big search engines in YaCy. It's not perfect, but it's open and it's distributed, so if you think a site that should be indexed is missing, you can point the crawler at it yourself.
Another neat thing you can do with YaCy is set it as your proxy, so all your traffic goes through it and it automatically crawls any pages you visit. Even if you don't want to do that, you can point it at your favorite websites (particularly ones you don't think are getting views through sites like Google) and it'll start crawling them. You can also set it to follow links and keep crawling outward across the Internet.
The most important thing right now is that I only see between 100 and 300 nodes on the network at any one time. This suggests to me that it wouldn't take many new users to give the system a real boost. Add another 300 nodes with people actively adding stuff that's not "allowed" on Google et al., and it becomes a massively useful engine. I've only been running and crawling for one week at very low utilization, and I've got 2.6 million pages crawled, which is 0.13% of all the pages available on YaCy. Imagine how quickly a few hundred of us could fill the engine with all kinds of cool content, and since it's distributed, it wouldn't be something any one person can break.
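For anyone who wants to check the math, here's a rough back-of-the-envelope sketch in Python. The page count, the percentage, the one-week window and the 300 extra nodes are the figures from this post; the function name and the assumption that every new node crawls at my rate with no overlap between nodes are mine, so treat the projection as an optimistic upper bound rather than a prediction.

```python
def implied_total_pages(pages: int, share_percent: float) -> float:
    """Total index size implied by one node's page count and its share of the index."""
    return pages / (share_percent / 100.0)


# Figures quoted in the post.
my_pages = 2_600_000        # pages crawled by one node in one week
my_share_percent = 0.13     # that node's share of all pages on YaCy
extra_nodes = 300           # hypothetical new nodes joining

total_pages = implied_total_pages(my_pages, my_share_percent)

# Assumption (not from any YaCy documentation): each new node crawls at the
# same weekly rate as mine and no two nodes crawl the same page.
pages_added_per_week = extra_nodes * my_pages
weeks_to_double = total_pages / pages_added_per_week

print(f"Implied index size: {total_pages:,.0f} pages")
print(f"Added per week by {extra_nodes} nodes: {pages_added_per_week:,} pages")
print(f"Weeks to double the index: {weeks_to_double:.1f}")
```

In practice nodes will crawl some of the same pages, so the real growth would be slower than this.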
And for people who like the idea but don't want to give up their other search engines, YaCy is an option in searx as well, so it can be just one of the many search engines your search engine is querying.