Data mining combines advanced statistical and computational approaches to explore massive amounts of data. The ultimate aim of data mining is to discover patterns and relationships within a given pool of information.
Members of the Approximity team contributed to the following publications:
The WEB archives: A time-machine in your pocket!
Chavez-Demoulin V.C., A.S.A. Roehrl, R.A. Roehrl and A. Weinberg
Internet Archive Colloquium 2000, editor K. Bollacker, www.archive.org, 2000
Taking an interdisciplinary approach, the authors discuss both the technical issues of creating archives of the World Wide Web (as suggested at www.archive.org) and the possible socio-political relevance of such archives in the future. As the Internet becomes the Ever- and Everywherenet, the Web archives may become a memory of mankind, a sort of time-machine for going back into the past. The authors present the hardware and software concepts, and an initial analysis, of a highly scalable and extendable approach to archiving a fully queryable copy of the ever-changing Web.
World Wide Web Robot for Extreme Datamining with Swiss-Tx Supercomputers
Armin Roehrl, Martin Frey, Alexander Roehrl
Interim Report IR-99-20, International Institute for Applied Systems Analysis
This paper discusses the software and hardware issues of designing a highly parallel robot for extreme datamining on the Internet. As a sample application, a World Wide Web server count experiment for Switzerland and Thailand is presented. Our platform of choice is the SwissTx, a supercomputer built from commodity components that runs NT and COMPAQ Tru64 Unix. The hardware and software of this machine are discussed and benchmark results are presented. They show that NT is a feasible choice even under the given extreme conditions. By using statistical modelling to optimize the search process, the inevitable bandwidth problem is, to some extent, reduced to a computation problem. We suggest that our approach to Web robots is a robust bet for a multitude of future Internet applications, which might lead to large-scale and cost-efficient use of Web robots.
Chavez-Demoulin V., Jarvis S.A., Perera R., Roehrl A.S.A., Schmiedl S.W., Sondergaard M.P.
Between Data Science and Applied Data Analysis; Proceedings of the 26th Annual Conference of the Gesellschaft für Klassifikation e.V.; Springer-Verlag, 2003, pp. 387-394.
In recent years there have been a number of developments in the datamining techniques used in the analysis of terabyte-sized logfiles resulting from Internet-based applications. The information that these techniques provide allows knowledge engineers to rapidly direct business decisions. Current datamining methods, however, are generally efficient only when the information contained in the logfiles is close to the average. When non-standard logfiles (extreme data) are studied, these methods therefore provide unrealistic and erroneous results. Non-standard logfiles often have a large bearing on the analysis of web applications; the information they provide can affect new or even well-established services. In this paper, aspects of the recent Extreme Value Theory methodology are discussed. Particular emphasis is placed on its application: a unique toolkit is provided with which to describe, understand and predict the non-standard fluctuations discovered in real-life Internet-sourced log data.
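As a rough illustration of the kind of tail analysis that Extreme Value Theory makes possible, the sketch below applies the standard peaks-over-threshold idea to synthetic data. It is not the toolkit described in the paper. The function names, the 95% threshold, the method-of-moments fit and the simulated "response times" are all assumptions chosen for brevity; the moment fit is only reasonable for tails that are not too heavy (shape parameter below 1/2).

```python
import math
import random
import statistics


def fit_gpd_moments(exceedances):
    """Method-of-moments estimates (xi, sigma) of a generalized Pareto
    distribution fitted to the amounts by which values exceed a threshold.
    Hypothetical helper for illustration only."""
    mean = statistics.fmean(exceedances)
    var = statistics.variance(exceedances)
    ratio = mean * mean / var
    xi = 0.5 * (1.0 - ratio)            # shape: larger means a heavier tail
    sigma = 0.5 * mean * (1.0 + ratio)  # scale
    return xi, sigma


def tail_quantile(data, threshold, p):
    """Estimate the p-quantile (p close to 1) from the fitted tail
    instead of from the bulk of the data."""
    exceedances = [x - threshold for x in data if x > threshold]
    xi, sigma = fit_gpd_moments(exceedances)
    # Target tail probability rescaled by the fraction of points above the threshold.
    scaled = len(data) / len(exceedances) * (1.0 - p)
    if abs(xi) < 1e-9:
        # Exponential-tail limit of the formula as xi tends to 0.
        return threshold - sigma * math.log(scaled)
    return threshold + sigma / xi * (scaled ** (-xi) - 1.0)


if __name__ == "__main__":
    # Synthetic stand-in for per-request response times taken from a logfile.
    rng = random.Random(42)
    times = [rng.expovariate(1.0) for _ in range(20000)]
    u = sorted(times)[int(0.95 * len(times))]  # assumed threshold: empirical 95% point
    print("threshold:", u)
    print("estimated 99.9% quantile:", tail_quantile(times, u, 0.999))
```

The point of the design is that only the observations above the threshold enter the fit, so the estimate describes the extreme values rather than the average behaviour of the log data.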
Unter Verdacht: Datamining mit R (Under Suspicion: Datamining with R)
Armin Roehrl, Stefan Schmiedl
Linux-Enterprise 3/2002 (in German)
In the first part we describe R, a (statistical) programming language based on S. We then give a short introduction to datamining and use practical examples to show why R can save a great deal of time in this work. Since we cannot explain the mathematical and statistical background in depth here, we refer the reader to the relevant specialist literature for the fundamentals. Our aim is to give the reader an idea of what datamining is.