Sections 2 and 3 outlined the concept of a
parallel Web crawler and query system. Extreme data mining of the
World Wide Web requires state-of-the-art hardware and software design.
One obvious constraint is bandwidth.
Figure 3
shows the exponential growth of the commercial and the
experimental transmission
capacity in Gbit/s over time, compared with the increase in microprocessor
performance.
They show the same rate of increase. However, there is a time lag of approximately seven
years between commercial and
experimental capacity. The estimated increase in Internet traffic
significantly exceeds the performance increase
of photonic and computing technologies. Because of the distributed,
non-symmetrical nature of the WWW, the obvious
bandwidth problem may be eased to some extent with a parallel approach that
draws on considerable
computing power (such as the one taken by SwissTx, as outlined above).
The authors foresee the widespread use of Web robots and
intelligent agents, especially in business, to provide a wealth of
information relevant to e-commerce. New technologies such as Jini will promote
the spread of the Internet to virtually everywhere.
The next challenge will be to make these different kinds of robots
communicate with each other.
Web archives will give mankind a unique opportunity to monitor, keep, protect and archive expressions of all the diverse human cultures. Human culture may be our most precious asset. It might be better engraved in stone, to survive all potential crises.
``I know not with what weapons World War III will be fought, but World War IV will be fought with sticks and stones.''
-A. Einstein.
The well-known novel by Isaac Asimov, ``Foundation and Earth'', presents another pessimistic view of the future. In the novel, a small group of explorers searches for the planet Earth, the origin of humanity. Earth is not mentioned in any existing archives. During their quest the group arrives on the planet Aurora, once one of the biggest and oldest centres of civilisation. They find no trace of any electronics or robots, once the pride and wealth of Aurora. The only traces they find are those engraved in stone.
However, the authors do not share such a pessimistic outlook on mankind's future. Fortunately, real life gives plenty of reasons for optimism and also room for surprise. For example, in 1903, the Polish ethnographer Bronislaw Pilsudski recorded the biggest and most important archive on the culture of the Ainu, an indigenous people of Japan [41]. These recordings were lost and then found again in 1930, but there was no suitable device left to read them. It took another 50 years until Japanese engineers constructed, in the 1980s, a special laser reader for them. In this way, the Ainu people regained access to the archive of their cultural heritage, which led to an amazingly fast cultural revival.
The authors hope that future Web archives will not be needed to revive nearly extinct local cultures after a period of rapid globalization, but that they will enrich the cultural lives of all people on this planet. If organized by a multitude of independent organizations (such as www.archive.org), Web archives will most certainly be sources of cultural enrichment. They will help mankind to better understand the diverse world, and to take advantage of the most democratic medium, the Internet. The idea of comprehensive, long-term Web archives is as challenging as it is promising, and it offers a unique chance that we should not miss.