THE SEARCH DISRUPTION

YOTTAASYS BLOG

Elasticsearch vs Splunk vs Hunk vs Solr

Two developments mark this decade. The first is the emergence of disruptive business models, such as e-commerce models for retail, travel, transport and services. The second is the emergence of disruptive technologies, such as search and analytics technologies. Both present opportunities to do business more cheaply and with faster implementation times, while at the same time posing a serious threat to old business models and software licensing models. In this article we look at four technologies that have been at the forefront of the evolution of search technology.

I have divided this article into two sections. The first section introduces these four technologies, and the second presents some summary-level statistics that can guide CIOs and Chief Architects in choosing the most appropriate platform.

Elasticsearch (ELK): Elasticsearch is usually deployed as part of the ELK stack (Elasticsearch, Logstash and Kibana), one of the most innovative and disruptive technologies in the internal site search space. The three components work in tandem to provide an easy-to-use platform for search and search-based analytics. Logstash can be thought of as a new-age ETL engine that collects and processes your logs, Elasticsearch is the back-end data store in which they are stored and queried, and Kibana is the front-end reporting tool.

Elasticsearch is a standalone database server, written in Java, that takes in data and stores it in a format optimized for language-based searches. Working with it is convenient because it exposes an HTTP/JSON interface. It is also easily scalable and supports clustering.
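
To make the HTTP/JSON point concrete, here is a minimal sketch in Python that builds a search request without sending it. The host and port, the index name "weblogs" and the field name "message" are assumptions made up for illustration; the `_search` endpoint and the `match` query are part of Elasticsearch's query DSL.

```python
import json
import urllib.request


def build_search_request(base_url, index, field, text):
    """Build (but do not send) an HTTP search request with a JSON body."""
    # A simple full-text "match" query on one field.
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local node, index and field names.
request = build_search_request(
    "http://localhost:9200", "weblogs", "message", "timeout error"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it to a running node; the sketch stops short of that so it can be run without a cluster.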

All in all, Elasticsearch uses Lucene's algorithms for matching text and storing indexes. On top of Lucene's search implementation it adds useful and concise APIs, scalability and operational tools.

Lucene is a great tool, but it is tiresome to use directly, whereas Elasticsearch adds features for scaling past a single machine. Elasticsearch provides a simpler and more intuitive API than the Lucene Java API. More importantly, it provides a framework that makes scaling across machines and data centers relatively simple.
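
The core idea behind Lucene-style text matching is the inverted index: a map from each term to the documents that contain it. The toy sketch below illustrates that idea only. It is not Lucene's implementation, which also handles analysis, relevance scoring and on-disk storage, and the function names and sample documents are invented for this example.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return sorted(matches)


# Invented sample documents, keyed by document id.
docs = {
    1: "disk failure on node a",
    2: "login failure for user bob",
    3: "disk usage normal",
}
index = build_inverted_index(docs)
print(search(index, "disk failure"))  # [1]
print(search(index, "failure"))       # [1, 2]
```

Because a lookup touches only the entries for the query terms, search time depends on how many documents match rather than on the size of the whole collection.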

Logstash: Logstash is a tool for receiving, processing and outputting logs. At the time of writing, Logstash has active plug-ins for 165 log sources and can generally handle the most popular log formats, such as system logs, web server logs, error logs and application logs.
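
The receive, process and output steps can be sketched in a few lines of Python. This is an illustration of the concept rather than Logstash itself, which is configured through its own pipeline files and plug-ins. The regular expression targets the Common Log Format used by many web servers, and the sample line and field names are assumptions chosen for this example.

```python
import json
import re

# Pattern for one line in the Common Log Format.
CLF_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Turn one raw access-log line into a structured event, or None."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Convert numeric fields so they can be aggregated later.
    event["status"] = int(event["status"])
    event["size"] = 0 if event["size"] == "-" else int(event["size"])
    return event


# Receive: a made-up sample line. Process: parse it. Output: emit JSON.
sample = '127.0.0.1 - - [11/Jun/2015:10:15:32 +0530] "GET /index.html HTTP/1.1" 200 2326'
event = parse_line(sample)
print(json.dumps(event, sort_keys=True))
```

A structured event of this kind is what a back-end store such as Elasticsearch would then index.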

Kibana: Kibana is the front-end analytics and reporting tier of the ELK stack. Any data stored in the Elasticsearch indices can be analyzed and visualized using Kibana. The latest version of Kibana provides most of the standard analytical features, such as drill-downs, a variety of chart and map formats, and the ability to export into various formats. In a nutshell, it is a very capable analytical tool that is open source and free.

Splunk is used for searching, monitoring and analyzing machine-generated big data through a web interface. The product captures, indexes and correlates real-time data in a searchable archive, from which it can create charts, reports, notifications, dashboards and visualizations.

Splunk was created to track and analyze machine data from a variety of computerized systems. Since then, it has been used for many data types, for example machine data, web logs, data files and social media data.

Hunk is a tool for reading data from NoSQL and Hadoop systems, and it builds heavily on the Splunk foundation. You can archive data from Splunk to Hadoop, saving considerably compared with expensive storage area networks, and run queries across data in Hadoop, NoSQL stores and Splunk Enterprise or Splunk Cloud. Hunk accesses data in remote Hadoop clusters through virtual indexes and lets you use the Search Processing Language (SPL) to analyze that data using the capabilities of the NoSQL and Hadoop data stores.

Solr, or Apache Solr, is an open source search platform built on a Java library called Lucene. Solr is a popular search platform for websites because it can index and search multiple sites. It is also popular for enterprise search because it can index and search documents and email attachments. Solr currently provides features such as automatic sharding, replication and SQL-like queries, while some of its more intriguing query capabilities include faceting, grouping, highlighting, spellcheck, autocomplete and geospatial support.
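
As an illustration of faceting, the sketch below builds the URL for a Solr query that asks for result counts grouped by one field. The base URL, the collection name "products" and the field name "brand" are hypothetical; `q`, `rows`, `facet`, `facet.field` and `wt` are standard Solr request parameters on the `/select` handler.

```python
from urllib.parse import urlencode


def build_facet_url(base_url, collection, query, facet_field, rows=10):
    """Build a Solr /select URL that requests facet counts for one field."""
    params = [
        ("q", query),                  # the search query itself
        ("rows", rows),                # number of documents to return
        ("facet", "true"),             # switch faceting on
        ("facet.field", facet_field),  # field to count values for
        ("wt", "json"),                # ask for a JSON response
    ]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical local Solr instance, collection and field.
url = build_facet_url("http://localhost:8983/solr", "products", "laptop", "brand")
print(url)
```

Against a real collection, the response would contain the matching documents plus a count of matches per brand, which is the kind of data that typically drives "filter by" sidebars on e-commerce sites.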

Conclusion: Of all the latest search platforms, Elasticsearch clearly stands out. Its ability to scale, its full-stack solution, the availability of seasoned consultants to implement ELK and its zero software licensing cost make it a fabulous offer. It will be interesting to watch how licensed search software providers handle the march of Elasticsearch.

