```xml
<!-- The client -->
<dependency>
  <groupId>io.prometheus</groupId>
  <artifactId>simpleclient</artifactId>
  <version>0.9.0</version>
</dependency>
<!-- Hotspot JVM metrics -->
<dependency>
  <groupId>io.prometheus</groupId>
  <artifactId>simpleclient_hotspot</artifactId>
  <version>0.9.0</version>
</dependency>
<!-- Exposition HTTPServer -->
<dependency>
  <groupId>io.prometheus</groupId>
  <artifactId>simpleclient_httpserver</artifactId>
  <version>0.9.0</version>
</dependency>
<!-- Pushgateway exposition -->
<dependency>
  <groupId>io.prometheus</groupId>
  <artifactId>simpleclient_pushgateway</artifactId>
  <version>0.9.0</version>
</dependency>
```
```java
import io.prometheus.client.Counter;

class YourClass {
  static final Counter requests = Counter.build()
      .name("requests_total").help("Total requests.").register();

  void processRequest() {
    requests.inc();
    // Your code here.
  }
}
```
```java
class YourClass {
  static final Gauge inprogressRequests = Gauge.build()
      .name("inprogress_requests").help("Inprogress requests.").register();

  void processRequest() {
    inprogressRequests.inc();
    // Your code here.
    inprogressRequests.dec();
  }
}
```

There are utilities for common use cases:

```java
gauge.setToCurrentTime(); // Set to current unixtime.
```

As an advanced use case, a
Gauge can also take its value from a callback by using the
setChild()
method. Keep in mind that the default inc(), dec() and set() methods on Gauge take care of thread safety, so
when using this approach ensure the value you are reporting accounts for concurrency.
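A minimal sketch of the callback approach described above; the queue, class name, and metric name are illustrative, not from the original text:

```java
import io.prometheus.client.Gauge;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueMetrics {
  static final Gauge queueDepth = Gauge.build()
      .name("queue_depth").help("Current queue depth.").register();

  // The gauge reads the queue size at collection time instead of being
  // updated explicitly with inc()/dec()/set().
  static void instrument(final BlockingQueue<?> queue) {
    queueDepth.setChild(new Gauge.Child() {
      @Override
      public double get() {
        return queue.size();
      }
    });
  }
}
```

Because the value is computed at scrape time, the thread-safety guarantees of the built-in `inc()`, `dec()` and `set()` methods do not apply; the callback itself must be safe to call concurrently.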
```java
class YourClass {
  static final Summary receivedBytes = Summary.build()
      .name("requests_size_bytes").help("Request size in bytes.").register();
  static final Summary requestLatency = Summary.build()
      .name("requests_latency_seconds").help("Request latency in seconds.").register();

  void processRequest(Request req) {
    Summary.Timer requestTimer = requestLatency.startTimer();
    try {
      // Your code here.
    } finally {
      receivedBytes.observe(req.size());
      requestTimer.observeDuration();
    }
  }
}
```

There are utilities for timing code and support for quantiles. Essentially quantiles aren't aggregatable and add some client overhead for the calculation.
```java
class YourClass {
  static final Summary requestLatency = Summary.build()
      .quantile(0.5, 0.05)  // Add 50th percentile (= median) with 5% tolerated error
      .quantile(0.9, 0.01)  // Add 90th percentile with 1% tolerated error
      .name("requests_latency_seconds").help("Request latency in seconds.").register();

  void processRequest(Request req) {
    requestLatency.time(new Runnable() {
      public void run() {
        // Your code here.
      }
    });

    // Or the Java 8 lambda equivalent
    requestLatency.time(() -> {
      // Your code here.
    });
  }
}
```
```java
class YourClass {
  static final Histogram requestLatency = Histogram.build()
      .name("requests_latency_seconds").help("Request latency in seconds.").register();

  void processRequest(Request req) {
    Histogram.Timer requestTimer = requestLatency.startTimer();
    try {
      // Your code here.
    } finally {
      requestTimer.observeDuration();
    }
  }
}
```

The default buckets are intended to cover a typical web/rpc request from milliseconds to seconds. They can be overridden with the buckets() method on the Histogram.Builder.
There are utilities for timing code:
```java
class YourClass {
  static final Histogram requestLatency = Histogram.build()
      .name("requests_latency_seconds").help("Request latency in seconds.").register();

  void processRequest(Request req) {
    requestLatency.time(new Runnable() {
      public void run() {
        // Your code here.
      }
    });

    // Or the Java 8 lambda equivalent
    requestLatency.time(() -> {
      // Your code here.
    });
  }
}
```
```java
class YourClass {
  static final Counter requests = Counter.build()
      .name("my_library_requests_total").help("Total requests.")
      .labelNames("method").register();

  void processGetRequest() {
    requests.labels("get").inc();
    // Your code here.
  }
}
```
Metrics are typically registered via a static final class variable, as is common with loggers:

```java
static final Counter requests = Counter.build()
    .name("my_library_requests_total").help("Total requests.")
    .labelNames("path").register();
```

Using the default registry with variables that are static is ideal, since registering a metric with the same name twice is not allowed and the default registry is itself static. You can think of registering a metric more like registering a definition (as in the TYPE and HELP sections). The metric 'definition' internally holds the samples that are reported and pulled out by Prometheus. Here is an example of registering a metric that has no labels.
```java
class YourClass {
  static final Gauge activeTransactions = Gauge.build()
      .name("my_library_transactions_active")
      .help("Active transactions.")
      .register();

  void processThatCalculates(String key) {
    activeTransactions.inc();
    try {
      // Perform work.
    } finally {
      activeTransactions.dec();
    }
  }
}
```

To create timeseries with labels, include labelNames() with the builder. The labels() method looks up or creates the corresponding labelled timeseries. You might also consider storing the labelled timeseries as an instance variable where appropriate. It is thread safe and can be used multiple times, which can help performance.
```java
class YourClass {
  static final Counter calculationsCounter = Counter.build()
      .name("my_library_calculations_total").help("Total calls.")
      .labelNames("key").register();

  void processThatCalculates(String key) {
    calculationsCounter.labels(key).inc();
    // Run calculations.
  }
}
```
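Where the label value is fixed, the labelled child can be looked up once and stored as an instance variable, as suggested above. A minimal sketch; the class, field, and metric names are illustrative:

```java
import io.prometheus.client.Counter;

class FooHandler {
  static final Counter requests = Counter.build()
      .name("my_handler_requests_total").help("Total requests.")
      .labelNames("path").register();

  // Look up the labelled timeseries once; the Counter.Child is thread safe
  // and reusing it skips the label lookup on every increment.
  private final Counter.Child fooRequests = requests.labels("/foo");

  void handleFoo() {
    fooRequests.inc();
  }
}
```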
The library also ships collectors for Hotspot JVM metrics; use DefaultExports to conveniently register them:

```java
DefaultExports.initialize();
```
To register the logback collector at root level:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <include resource="org/springframework/boot/logging/logback/base.xml"/>
  <appender name="METRICS" class="io.prometheus.client.logback.InstrumentedAppender" />
  <root level="INFO">
    <appender-ref ref="METRICS" />
  </root>
</configuration>
```

To register the log4j collector at root level:
```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <appender name="METRICS" class="io.prometheus.client.log4j.InstrumentedAppender"/>
  <root>
    <priority value="info" />
    <appender-ref ref="METRICS" />
  </root>
</log4j:configuration>
```

To register the log4j2 collector at root level:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration packages="io.prometheus.client.log4j2">
  <Appenders>
    <Prometheus name="METRICS"/>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="METRICS"/>
    </Root>
  </Loggers>
</Configuration>
```
To collect metrics from a Guava cache, call recordStats() when building the cache, and add the cache to the registered collector:

```java
CacheMetricsCollector cacheMetrics = new CacheMetricsCollector().register();
Cache<String, String> cache = CacheBuilder.newBuilder().recordStats().build();
cacheMetrics.addCache("myCacheLabel", cache);
```

The Caffeine equivalent is nearly identical. Again, be certain to call recordStats() when building the cache so that metrics are collected:

```java
CacheMetricsCollector cacheMetrics = new CacheMetricsCollector().register();
Cache<String, String> cache = Caffeine.newBuilder().recordStats().build();
cacheMetrics.addCache("myCacheLabel", cache);
```
There is a collector for Hibernate that monitors SessionFactory instances.

If you want to collect metrics from a single SessionFactory, you can register the collector like this:

```java
new HibernateStatisticsCollector(sessionFactory, "myapp").register();
```

In some situations you may want to collect metrics from multiple factories. In this case just call add() on the collector for each of them:

```java
new HibernateStatisticsCollector()
    .add(sessionFactory1, "myapp1")
    .add(sessionFactory2, "myapp2")
    .register();
```

If you are using Hibernate in a JPA environment and only have access to the EntityManager or EntityManagerFactory, you can use this code to access the underlying SessionFactory:

```java
SessionFactory sessionFactory =
    entityManagerFactory.unwrap(SessionFactory.class);
```
You can collect metrics from a Jetty server by attaching a StatisticsHandler and registering the collector:

```java
// Configure StatisticsHandler.
StatisticsHandler stats = new StatisticsHandler();
stats.setHandler(server.getHandler());
server.setHandler(stats);
// Register collector.
new JettyStatisticsCollector(stats).register();
```

You can also collect QueuedThreadPool metrics. If there is a single QueuedThreadPool to keep track of, use the following:

```java
new QueuedThreadPoolStatisticsCollector(queuedThreadPool, "myapp").register();
```

If you want to collect metrics from multiple QueuedThreadPool instances, you can do it like this:

```java
new QueuedThreadPoolStatisticsCollector()
    .add(queuedThreadPool1, "myapp1")
    .add(queuedThreadPool2, "myapp2")
    .register();
```
The metric-name init parameter is required, and is the name of the metric Prometheus will expose for the timing metrics. Help text via the help init parameter is not required, although it is highly recommended. The number of buckets is overridable, and can be configured by passing a comma-separated string of doubles as the buckets init parameter. The granularity of path measuring is also configurable, via the path-components init parameter. By default, the servlet filter will record each path separately, but by setting an integer here, you can tell the filter to only record up to the Nth slash. That is, all requests with more than N "/" characters in the servlet URI path will be measured in the same bucket and you will lose that granularity.
The code below is an example of the XML configuration for the filter. You will need to place this code (substituting your own values) in your webapp/WEB-INF/web.xml file.
```xml
<filter>
  <filter-name>prometheusFilter</filter-name>
  <filter-class>io.prometheus.client.filter.MetricsFilter</filter-class>
  <init-param>
    <param-name>metric-name</param-name>
    <param-value>webapp_metrics_filter</param-value>
  </init-param>
  <init-param>
    <param-name>help</param-name>
    <param-value>This is the help for your metrics filter</param-value>
  </init-param>
  <init-param>
    <param-name>buckets</param-name>
    <param-value>0.005,0.01,0.025,0.05,0.075,0.1,0.25,0.5,0.75,1,2.5,5,7.5,10</param-value>
  </init-param>
  <!-- Optionally override path components; anything less than 1 (1 is the default)
       means full granularity -->
  <init-param>
    <param-name>path-components</param-name>
    <param-value>1</param-value>
  </init-param>
</filter>

<!-- You will most likely want this to be the first filter in the chain
     (therefore the first <filter-mapping> in the web.xml file), so that
     you can get the most accurate measurement of latency. -->
<filter-mapping>
  <filter-name>prometheusFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
```

Additionally, you can instantiate your servlet filter directly in Java code. To do this, you just need to call the non-empty constructor. The first parameter, the metric name, is required. The second, help, is optional but highly recommended. The last two (path-components and buckets) are optional and will default sensibly if omitted.
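A sketch of the programmatic equivalent, assuming the MetricsFilter constructor takes the parameters in the order described above (metric name, help, path-components, buckets); the values here mirror the XML example and are illustrative:

```java
import io.prometheus.client.filter.MetricsFilter;

public class FilterSetup {
  public static MetricsFilter buildFilter() {
    return new MetricsFilter(
        "webapp_metrics_filter",                        // metric name (required)
        "This is the help for your metrics filter",     // help (recommended)
        1,                                              // path-components granularity
        null);                                          // null buckets = library defaults
  }
}
```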
Include simpleclient_spring_web as a dependency, annotate a configuration class with @EnablePrometheusTiming, then annotate your Spring components as such:
```java
@Controller
public class MyController {
  @RequestMapping("/")
  @PrometheusTimeMethod(name = "my_controller_path_duration_seconds", help = "Some helpful info here")
  public Object handleMain() {
    // Do something
  }
}
```
Metrics can be exposed over HTTP with the built-in HTTPServer:

```java
HTTPServer server = new HTTPServer(1234);
```

To add Prometheus exposition to an existing HTTP server using servlets, see the MetricsServlet.
It also serves as a simple example of how to write a custom endpoint.
To expose the metrics used in your code, you would add the Prometheus servlet to your Jetty server:
```java
Server server = new Server(1234);
ServletContextHandler context = new ServletContextHandler();
context.setContextPath("/");
server.setHandler(context);
context.addServlet(new ServletHolder(new MetricsServlet()), "/metrics");
```

All HTTP exposition integrations support restricting which time series to return using ?name[]= URL parameters. Due to implementation limitations, this may have false negatives.
Metrics can be pushed to a Pushgateway, for example from a batch job:

```java
void executeBatchJob() throws Exception {
  CollectorRegistry registry = new CollectorRegistry();
  Gauge duration = Gauge.build()
      .name("my_batch_job_duration_seconds")
      .help("Duration of my batch job in seconds.").register(registry);
  Gauge.Timer durationTimer = duration.startTimer();
  try {
    // Your code here.

    // This is only added to the registry after success,
    // so that a previous success in the Pushgateway isn't overwritten on failure.
    Gauge lastSuccess = Gauge.build()
        .name("my_batch_job_last_success")
        .help("Last time my batch job succeeded, in unixtime.").register(registry);
    lastSuccess.setToCurrentTime();
  } finally {
    durationTimer.setDuration();
    PushGateway pg = new PushGateway("127.0.0.1:9091");
    pg.pushAdd(registry, "my_batch_job");
  }
}
```

A separate registry is used, as the default registry may contain other metrics such as those from the Process Collector. See the Pushgateway documentation for more information.
Basic authentication for the Pushgateway connection can be configured with a BasicAuthHttpConnectionFactory:

```java
PushGateway pushgateway = new PushGateway("127.0.0.1:9091");
pushgateway.setConnectionFactory(new BasicAuthHttpConnectionFactory("my_user", "my_password"));
```
For other connection customizations, you can supply your own HttpConnectionFactory:

```java
PushGateway pushgateway = new PushGateway("127.0.0.1:9091");
pushgateway.setConnectionFactory(new MyHttpConnectionFactory());
```

where

```java
class MyHttpConnectionFactory implements HttpConnectionFactory {
  @Override
  public HttpURLConnection create(String url) throws IOException {
    HttpURLConnection connection =
        (HttpURLConnection) new URL(url).openConnection();
    // Add any connection preparation logic you need.
    return connection;
  }
}
```
```java
Graphite g = new Graphite("localhost", 2003);
// Push the default registry once.
g.push(CollectorRegistry.defaultRegistry);

// Push the default registry every 60 seconds.
Thread thread = g.start(CollectorRegistry.defaultRegistry, 60);
// Stop pushing.
thread.interrupt();
thread.join();
```
A custom collector extends the Collector class and builds its samples on each collect() call:

```java
class YourCustomCollector extends Collector {
  public List<MetricFamilySamples> collect() {
    List<MetricFamilySamples> mfs = new ArrayList<MetricFamilySamples>();

    // With no labels.
    mfs.add(new GaugeMetricFamily("my_gauge", "help", 42));

    // With labels.
    GaugeMetricFamily labeledGauge = new GaugeMetricFamily("my_other_gauge", "help",
        Arrays.asList("labelname"));
    labeledGauge.addMetric(Arrays.asList("foo"), 4);
    labeledGauge.addMetric(Arrays.asList("bar"), 5);
    mfs.add(labeledGauge);

    return mfs;
  }
}

// Registration
static final YourCustomCollector requests = new YourCustomCollector().register();
```

SummaryMetricFamily works similarly.
A collector may implement a describe method which returns metrics in the same
format as collect (though you don't have to include the samples). This is
used to predetermine the names of time series a CollectorRegistry exposes and
thus to detect collisions and duplicate registrations.
Usually custom collectors do not have to implement describe. If describe is not implemented and the CollectorRegistry was created with autoDescribe=true (which is the case for the default registry), then collect will be called at registration time instead of describe. If this could cause problems, either implement a proper describe, or if that's not practical have describe return an empty list.
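A minimal sketch of a collector with a describe implementation; in the Java client this is done by implementing the Collector.Describable interface. The readCurrentValue() method stands in for a hypothetical expensive read that should not run at registration time:

```java
import io.prometheus.client.Collector;
import io.prometheus.client.GaugeMetricFamily;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MyCollector extends Collector implements Collector.Describable {
  public List<MetricFamilySamples> collect() {
    List<MetricFamilySamples> mfs = new ArrayList<MetricFamilySamples>();
    mfs.add(new GaugeMetricFamily("my_gauge", "help", readCurrentValue()));
    return mfs;
  }

  // describe returns the metric names without samples, so registering this
  // collector does not need to trigger a potentially expensive collect().
  public List<MetricFamilySamples> describe() {
    return Collections.<MetricFamilySamples>singletonList(
        new GaugeMetricFamily("my_gauge", "help", Collections.<String>emptyList()));
  }

  private double readCurrentValue() {
    return 42;  // Hypothetical expensive read.
  }
}
```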
Dropwizard metrics can be exported by registering a DropwizardExports for the Dropwizard MetricRegistry:

```java
// Dropwizard MetricRegistry
MetricRegistry metricRegistry = new MetricRegistry();
new DropwizardExports(metricRegistry).register();
```

By default, Dropwizard metrics are translated to Prometheus samples by sanitizing their names, i.e. replacing unsupported characters with _. For example:
Dropwizard metric name:
org.company.controller.save.status.400
Prometheus metric:
org_company_controller_save_status_400
It is also possible to add custom labels and names to newly created Samples by using a CustomMappingSampleBuilder with custom MapperConfigs:
```java
// Dropwizard MetricRegistry
MetricRegistry metricRegistry = new MetricRegistry();

MapperConfig config = new MapperConfig();
// The match field in MapperConfig is a simplified glob expression
// that only allows the * wildcard.
config.setMatch("org.company.controller.*.status.*");
// The new Sample's template name.
config.setName("org.company.controller");

Map<String, String> labels = new HashMap<String, String>();
// ... more configs
// Labels to be extracted from the metric. Key=label name. Value=label template.
labels.put("name", "${0}");
labels.put("status", "${1}");
config.setLabels(labels);

SampleBuilder sampleBuilder = new CustomMappingSampleBuilder(Arrays.asList(config));
new DropwizardExports(metricRegistry, sampleBuilder).register();
```

When a new metric comes to the collector, the MapperConfigs are scanned to find the first one that matches the incoming metric name. The name set in the configuration will be used and labels will be extracted. Using the CustomMappingSampleBuilder in the previous example leads to the following result:
Dropwizard metric name:
org.company.controller.save.status.400
Prometheus metric:
org_company_controller{name="save",status="400"}
Templates with placeholders can be used both as names and as label values. Placeholders have the ${n} format, where n is the zero-based index of the Dropwizard metric name wildcard group to extract.