We recently submitted an abstract for a presentation on Lumify at the 2014 Hadoop Summit in San Jose. I figured it wouldn't hurt to post it here on our blog as a shorter version of our earlier article, "What is Lumify?"

The process of selecting presentations for the Hadoop Summit includes a public voting period. We'll let you know when that vote is open for you to show your support for Lumify.


Proposed Session Title: Lumify: open source big data analysis and visualization

Session Track: Hadoop for Business Applications and Development

Session Focus: Mostly Technical/Some Business

Speakers: Jeff Kunkle & Charlie Greenbacker


Today, analysts in most organizations struggle to derive actionable insights from the large volumes of diverse data flowing through the enterprise. Lumify was created to help tackle this problem in an open, non-proprietary way. Lumify is an open source platform for big data analysis and visualization.

Utilizing both Hadoop and Storm, it ingests and integrates virtually any kind of data, from unstructured text documents and structured datasets, to images and video. Several open source analytic tools (including Tika, OpenNLP, CLAVIN, OpenCV, and ElasticSearch) are used to enrich the data, increase its discoverability, and automatically uncover hidden connections. All information is stored in a secure graph database implemented on top of Accumulo to support cell-level security of all data and metadata elements.

A modern, browser-based user interface enables analysts to explore and manipulate their data, discovering subtle relationships and drawing critical new insights. In addition to full-text search, geospatial mapping, and multimedia processing, Lumify features a powerful graph visualization supporting sophisticated link analysis and complex knowledge representation.

This presentation will describe how Lumify works and how it's used for investigative analysis of large and diverse datasets, and it will include a live demo.