Καλησπέρα,
το επόμενο syscussion θα πραγματοποιηθεί Παρασκευή 2/2 στη 1μμ. Μετά από λαϊκή απαίτηση να *μην αφορά* optimization πρόβλημα, το paper που θα παρουσιάσω είναι το εξής:
*Title:* MacroBase: Prioritizing Attention in Fast Data *Authors:* Peter Bailis, Edward Gan, Samuel Madden, Deepak Narayanan, Kexin Rong, Sahaana Suri *Abstract:* As data volumes continue to rise, manual inspection is becoming increasingly untenable. In response, we present MacroBase, a data analytics engine that prioritizes end-user attention in high-volume fast data streams. MacroBase enables efficient, accurate, and modular analyses that highlight and aggregate important and unusual behavior, acting as a search engine for fast data. MacroBase is able to deliver order-of-magnitude speedups over alternatives by optimizing the combination of explanation and classification tasks and by leveraging a new reservoir sampler and heavy-hitters sketch specialized for fast data streams. As a result, MacroBase delivers accurate results at speeds of up to 2M events per second per query on a single core. The system has delivered meaningful results in production, including at a telematics company monitoring hundreds of thousands of vehicles.
Δημιοσιευμένο στο SIGMOD 2017.
Γιάννης
syscussion@lists.cslab.ece.ntua.gr