# Map each cluster (a set of mutually reachable pages) to its pages
# and each page's total distance to the rest of the cluster.
my %clusters;
foreach my $page (keys %forward) {
    # Breadth-first search from $page, following links in both directions.
    my %distances = ($page => 0);
    my %visited   = ($page => 1);
    my @queue     = ($page);
    while (@queue) {
        my $current = shift @queue;
        # Forward links first, then backlinks; pages with no links contribute nothing.
        foreach my $link (@{ $forward{$current} || [] }, @{ $back{$current} || [] }) {
            next if $visited{$link};
            $visited{$link} = 1;
            $distances{$link} = $distances{$current} + 1;
            push @queue, $link;
        }
    }

    # Sum the distances from $page to every page it can reach.
    my $distance = 0;
    $distance += $_ for values %distances;

    # Ignore orphans
    next if !$distance;

    # Pages that reach the same set of pages share a cluster key.
    my $cluster = join '|', sort keys %distances;
    $clusters{$cluster}{$page} = $distance;
}
-- SunirShah
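The script above stops once %clusters is filled in. One way to read the result, sketched here as an assumption rather than as part of the original script, is to list each cluster's pages from smallest to largest total distance, so the most central pages come first. The sample data and the rank_cluster helper are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample in the same shape as %clusters above:
# cluster key => { page => summed distance to every other page }.
my %clusters = (
    'PageA|PageB|PageC' => { PageA => 3, PageB => 2, PageC => 3 },
    'PageX|PageY'       => { PageX => 1, PageY => 1 },
);

# Return a cluster's pages ordered from most to least central
# (smallest total distance first, ties broken by page name).
sub rank_cluster {
    my ($members) = @_;
    return sort { $members->{$a} <=> $members->{$b} or $a cmp $b } keys %$members;
}

foreach my $cluster (sort keys %clusters) {
    my @ranked = rank_cluster($clusters{$cluster});
    print "Cluster: $cluster\n";
    printf "  %-10s %d\n", $_, $clusters{$cluster}{$_} for @ranked;
}
```

Sorting by raw total distance only compares pages within one cluster; comparing across clusters of different sizes would need some normalisation, which this sketch does not attempt.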
What a great idea! I'm really impressed by all the IndexingSchemes you have made for this site -- thanks a lot. I am sure that I will use a few of them to find central content after I have finished my own protracted breadth-first-search personal exploration of the site.
I was wondering whether the data on the wiki that your scripts use is publicly available? As in, is there a place on the site where I can get something like a %forward hashtable?
See the LinkDatabase for the data source. And no, there are no options as it is run offline, but you can modify the script if you'd like. -- SunirShah
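For anyone modifying the script, here is a rough sketch of how %forward and %back might be built from a link dump. It ASSUMES a plain-text format with one page per line, the page name followed by the pages it links to, separated by whitespace; check the actual LinkDatabase output before relying on that. The build_link_maps helper and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Parse lines of the ASSUMED form "Page Link1 Link2 ..." into
# forward and backward adjacency lists (hashes of array references).
sub build_link_maps {
    my (@lines) = @_;
    my (%forward, %back);
    foreach my $line (@lines) {
        my ($page, @links) = split ' ', $line;
        next unless defined $page;    # skip blank lines
        $forward{$page} ||= [];       # keep pages that link nowhere
        foreach my $link (@links) {
            push @{ $forward{$page} }, $link;
            push @{ $back{$link} },    $page;
        }
    }
    return (\%forward, \%back);
}

# Hypothetical sample data, not real LinkDatabase output.
my ($forward, $back) = build_link_maps(
    'PageA PageB PageC',
    'PageB PageC',
    'PageC',
);
print "$_ -> @{ $forward->{$_} }\n" for sort keys %$forward;
print "$_ <- @{ $back->{$_} }\n"    for sort keys %$back;
```

The helper returns references, so a script that expects the plain hashes would copy them first, for example with `my %forward = %$forward;`.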
Thanks. Wow, the LinkDatabase is sweet! -- bs