Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
[ASPLOS 2019] PUMA-simulator provides a detailed simulation model of a dataflow architecture built with NVM (non-volatile memory), and runs ML models compiled using the puma compiler.