
Open Source Web Crawling is About Ten to Fifteen Years Behind Google


PurpleSkyz

Date: August 31, 2019 Author: Nwo Report

Source: Brian Wang
 
In 1999, it took Google one month to crawl and build an index of about 50 million pages. In 2012, the same task was accomplished in less than one minute. A month is about 43,000 minutes, so the 2012 capability is on the order of 50,000 times faster. That is somewhat better than doubling the speed every year over the 13 years from 1999 to 2012.
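The speedup arithmetic above can be checked with a few lines of Python. This is only a rough sketch: the 30-day month and the choice to treat "less than one minute" as exactly one minute are assumptions made here, not figures from the article, and the function names are illustrative.

```python
import math

# Assumption: a 30-day month; the article only says "one month".
MINUTES_PER_MONTH = 30 * 24 * 60


def speedup(old_minutes: float, new_minutes: float) -> float:
    """How many times faster the new run is than the old one."""
    return old_minutes / new_minutes


def annual_factor(total_speedup: float, years: int) -> float:
    """Constant yearly improvement factor that compounds to total_speedup."""
    return total_speedup ** (1.0 / years)


# Assumption: "less than one minute" is taken as exactly one minute,
# so the real speedup is at least this large.
factor = speedup(MINUTES_PER_MONTH, 1.0)

print(f"speedup: {factor:,.0f}x")
print(f"equivalent doublings: {math.log2(factor):.1f}")
print(f"yearly factor over 13 years: {annual_factor(factor, 13):.2f}")
```

A yearly factor above 2 means the improvement outpaced a simple doubling each year.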
In 2016, a new open-source web crawler, BUbiNG, was announced that can crawl around 12,000 pages per second on a relatively slow connection. That could be about 1 billion pages per day. The pricing is about $40 per day. There is an arXiv article from 2016 (BUbiNG: Massive Crawling for the Masses). This is about the capability that Google had ten to fifteen years ago.
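To see where the "1 billion pages per day" figure comes from, here is a minimal sketch that multiplies out the quoted rate. It assumes the crawler sustains 12,000 pages per second for a full 24 hours and that the $40 per day covers the whole run; the cost-per-million-pages figure is a derived illustration, not a number from the article.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_day(pages_per_second: float) -> float:
    """Pages crawled in 24 hours at a constant rate (an idealization)."""
    return pages_per_second * SECONDS_PER_DAY


def cost_per_million_pages(daily_cost_usd: float, pages_per_second: float) -> float:
    """Daily cost spread evenly over each million pages crawled that day."""
    millions = pages_per_day(pages_per_second) / 1_000_000
    return daily_cost_usd / millions


daily_pages = pages_per_day(12_000)
print(f"pages per day: {daily_pages:,.0f}")
print(f"USD per million pages: {cost_per_million_pages(40.0, 12_000):.4f}")
```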
BUbiNG is available on GitHub.
On a 64-core, 64 GB workstation it can download hundreds of millions of pages at more than 10,000 pages per second while respecting politeness both by host and by IP, and analyzing, compressing and storing more than 160 MB/s of data.
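"Politeness both by host and by IP" means a crawler waits between requests to the same hostname and also between requests to the same IP address, since many hostnames can share one server. The sketch below illustrates that idea only. It is not BUbiNG's implementation; the class name, method names and delay values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PolitenessGate:
    """Toy gate that enforces a minimum gap per host and per IP."""

    host_delay: float = 2.0  # hypothetical: seconds between hits on one host
    ip_delay: float = 0.5    # hypothetical: seconds between hits on one IP
    _next_host: Dict[str, float] = field(default_factory=dict)
    _next_ip: Dict[str, float] = field(default_factory=dict)

    def ready_at(self, host: str, ip: str) -> float:
        """Earliest time at which both the host and the IP may be hit again."""
        return max(self._next_host.get(host, 0.0), self._next_ip.get(ip, 0.0))

    def try_acquire(self, host: str, ip: str, now: float) -> bool:
        """Return True and record the fetch if both limits allow it at `now`."""
        if now < self.ready_at(host, ip):
            return False
        self._next_host[host] = now + self.host_delay
        self._next_ip[ip] = now + self.ip_delay
        return True


gate = PolitenessGate()
# Two hostnames served from the same (documentation-range) IP address.
print(gate.try_acquire("a.example", "192.0.2.1", now=0.0))  # first fetch
print(gate.try_acquire("b.example", "192.0.2.1", now=0.2))  # same IP, too soon
print(gate.try_acquire("b.example", "192.0.2.1", now=0.6))  # IP gap has passed
print(gate.try_acquire("a.example", "192.0.2.1", now=1.0))  # host gap not passed
```

Passing the time in as `now` instead of reading a clock keeps the sketch deterministic. A real crawler would also need a work queue that reschedules refused URLs instead of dropping them.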
A 10-terabyte hard drive costs about $200. At 160 MB/s, one hour of crawling stores roughly 576 GB, so such a drive would hold about 17 hours of crawling.
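The storage estimate can be checked against the 160 MB/s figure quoted above. This is a minimal sketch, assuming decimal units (1 TB = 1,000,000 MB) and a sustained write rate of exactly 160 MB/s. Both are simplifications, and the function names are illustrative.

```python
def hours_of_storage(capacity_tb: float, write_mb_per_s: float) -> float:
    """Hours until a drive fills at a constant write rate (decimal units)."""
    capacity_mb = capacity_tb * 1_000_000  # assumption: 1 TB = 1,000,000 MB
    seconds = capacity_mb / write_mb_per_s
    return seconds / 3600


def gb_per_hour(write_mb_per_s: float) -> float:
    """Data written in one hour, in decimal gigabytes."""
    return write_mb_per_s * 3600 / 1000


def drive_cost_per_hour(drive_cost_usd: float, capacity_tb: float,
                        write_mb_per_s: float) -> float:
    """Drive cost spread over the hours of crawl output it can hold."""
    return drive_cost_usd / hours_of_storage(capacity_tb, write_mb_per_s)


print(f"GB written per hour: {gb_per_hour(160):.0f}")
print(f"hours to fill 10 TB: {hours_of_storage(10, 160):.1f}")
print(f"USD of disk per crawl hour: {drive_cost_per_hour(200, 10, 160):.2f}")
```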

Thanks to: https://nworeport.me



  
