| Six Degrees of Wikipedia
|
|
28 May 08 |
|
|
Ever heard of the game Six Degrees of Kevin Bacon? If you haven’t, it
works like this: Every actor gets a Kevin Bacon number. Kevin Bacon has a
Kevin Bacon number of 0, actors who were in a movie with Kevin Bacon get a
Kevin Bacon number of 1, actors who were in a movie with someone who has a
Kevin Bacon number of 1 get a 2, and so on. (Everybody always gets the smallest
number possible, so if you were in a film with two people, one with a 4 and
one with a 6, your Kevin Bacon number would be 5.)
The same idea can be applied to the articles of Wikipedia. Instead of taking
"in the same film" as the relation, you can take "is linked
to by". We’ll call the "Kevin Bacon number" from one
article to another the "distance" between them. It’s then
possible to work out the "closeness" of an article in Wikipedia
as its average distance to any other article. I wanted to find the centre
of Wikipedia, that is, the article that is closest to all other articles
(has minimum closeness).
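To make the definitions concrete, here is a minimal sketch of distance, closeness and centre on a tiny made-up link graph. The article names, the `links` dictionary and the function names are all hypothetical, and this is not the code behind the project linked below. It assumes that `links[a]` lists the articles that `a` links to, and that every article can reach every other one (unreachable articles are simply left out of the average here).

```python
from collections import deque

# Hypothetical toy graph: links[a] lists the articles that a links to.
links = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

def distances_from(graph, start):
    """Breadth-first search: smallest number of links from start to each reachable article."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        article = queue.popleft()
        for neighbour in graph.get(article, ()):
            if neighbour not in dist:  # first visit is the shortest path
                dist[neighbour] = dist[article] + 1
                queue.append(neighbour)
    return dist

def closeness(graph, article):
    """Average distance from article to every other article it can reach."""
    dist = distances_from(graph, article)
    others = [d for target, d in dist.items() if target != article]
    return sum(others) / len(others) if others else float("inf")

def centre(graph):
    """The article with the minimum closeness."""
    return min(graph, key=lambda a: closeness(graph, a))

print(centre(links))
```

Breadth-first search gives the "smallest number possible" rule for free, since an article is numbered the first time it is reached. On the real Wikipedia graph this takes one search per article, so a full run is expensive.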
www.netsoc.tcd.ie/~mu/wiki
|
| R Graph Gallery
|
|
20 Jun 06 |
|
|
I came across this useful posting by Gregor Gorjanc on the r-help mailing list.
- R graphical manuals (this is an awesome page, as it has all the help pages of
all packages on CRAN, and probably even more, with all graphics examples
displayed: more than 8000 images!)
bg9.imslab.co.jp/Rhelp/
This is a very nice addition to the already existing R graph and movie
galleries:
addictedtor.free.fr/graphiques/
addictedtor.free.fr/movies/
|
| Estraier 1.2.26
|
|
06 Feb 05 |
|
|
Estraier is a full-text search system for personal use. Its principal
purpose is to realize a full-text search system for a Web site. It
functions similarly to Google, but for a personal Web site or sites in an
intranet. It has fast searching, conspicuous results, relational document
search, the ability to handle Japanese text, and support for handling a
large number of documents. Installation is easy.
Changes: A plug-in to show spelling alternation of the search phrase was
added. A bug in the search server was fixed.
estraier.sf.net
|
| 10x10: 100 Words and Pictures that Define the Time
|
|
04 Dec 04 |
|
|
A big thx to Sven C. Koehler for the link.
Every hour, 10x10 scans the RSS feeds of several leading international news
sources, and performs an elaborate process of weighted linguistic analysis
on the text contained in their top news stories. After this process,
conclusions are automatically drawn about the hour’s most important
words. The top 100 words are chosen, along with 100 corresponding images,
culled from the source news stories. At the end of each day, month, and
year, 10x10 looks back through its archives to conclude the top 100 words
for the given time period. In this way, a constantly evolving record of our
world is formed, based on prominent world events, without any human input.
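The description above does not say how the weighting actually works, so the following is only a rough sketch of the general idea: score each word by the weight of the stories it appears in and keep the top ones. The stop-word list, the sample headlines and their weights, and the function name `top_words` are all assumptions for illustration, not 10x10's method.

```python
import re
from collections import Counter

# Assumed stop-word list; a real one would be far longer.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "on", "for", "is"}

def top_words(headlines, n=100):
    """headlines: iterable of (text, weight) pairs. Returns the n highest-scoring words."""
    scores = Counter()
    for text, weight in headlines:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                scores[word] += weight  # each occurrence adds the story's weight
    return [word for word, _ in scores.most_common(n)]

# Hypothetical headlines with made-up weights.
sample = [
    ("Election results announced in the capital", 2.0),
    ("Storm hits the coast", 1.0),
    ("Election turnout sets record", 1.5),
]
print(top_words(sample, 3))
```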
link
|
| Estraier 1.2.25
|
|
14 Nov 04 |
|
|
Estraier is a full-text search system for personal use. Its principal
purpose is to realize a full-text search system for a Web site. It
functions similarly to Google, but for a personal Web site or sites in an
intranet. It has fast searching, conspicuous results, relational document
search, the ability to handle Japanese text, and support for handling a
large number of documents. Installation is easy.
Changes: The search server was enhanced. The logging format was changed.
Accuracy of document clustering was improved. The building configuration
was enhanced, and now Mac OS X 10.3 is supported.
link
|
| R 2.0 is out
|
|
07 Oct 04 |
|
|
R is a language and environment for statistical computing and graphics. It
is similar to S, which was developed at Bell Laboratories by John Chambers
et al. It provides a wide variety of statistical and graphical techniques
(linear and nonlinear modelling, statistical tests, time series analysis,
classification, clustering, etc.). R is designed as a true computer
language with control-flow constructions for iteration and alternation, and
it allows users to add additional functionality by defining new functions.
For computationally intensive tasks, Fortran and C code can be linked and
called at run time.
Changes: Many things have changed since 1.0. The R language has acquired
namespaces, exception handling constructs, formal methods and classes, much
improved garbage collection, generalized I/O via connection objects, and
considerable improvements in the graphics area. The user workspace has been
reorganized, and so has the set of packages that ship with R. Several
"recommended packages" deemed indispensable in a statistical
system are bundled. In addition, there has been a large number of more
specific new functions, tweaks, and bugfixes.
www.r-project.org
|
| Root: An Object-Oriented Data Analysis Framework
|
|
25 Sep 04 |
|
Sven C. Koehler, our hard-coding dataminer, sent me an email while his
code was probably exploring the DNA of some beauty. I wonder whether it was
the beauty the ROOT team uses in their logo? Hey, just because of the logo,
one ought to give ROOT a try.
What impressed me:
http://root.cern.ch/root/Mission.html
``We started the ROOT project in the context of the NA49 experiment at
CERN. NA49 generates an impressive amount of data, about 10 Terabytes
of raw data per run.'';
``Thanks to the builtin CINT C++ interpreter the command language,
the scripting, or macro, language and the programming language are
all C++. The interpreter allows for fast prototyping of the macros
since it removes the time consuming compile/link cycle. It also
provides a good environment to learn C++. If more performance is
needed the interactively developed macros can be compiled using a
C++ compiler.'';
http://root.cern.ch/root/Architecture.html
``The backbone of the ROOT architecture is a layered class
hierarchy with, currently, around 310 classes grouped in about 24
frameworks divided in 14 categories. This hierarchy is organized in
a mostly single-rooted class library, that is, most of the classes
inherit from a common base class TObject. While this organization
is not very popular in C++, it has proven to be well suited for our
needs (and indeed for almost all successful class libraries: Java,
Smalltalk, MFC, etc)''.
|
|
|