Elasticsearch Roadmap

  • Roadmap: https://roadmap.sh/elasticsearch

1. Introduction

  • 1.1 What is Elasticsearch
  • 1.2 Search Engines vs Relational DBs
  • 1.3 The ELK Stack
  • 1.4 Elasticsearch Use Cases

1.5 Pre-requisites

  • 1.5.1 JSON
  • 1.5.2 REST API Basics

1.6 Environment Setup

  • 1.6.1 Running with Docker
  • 1.6.2 Elastic Cloud
  • 1.6.3 Kibana Console

2. Core Architecture

2.1 Logical Concepts

  • 2.1.1 Cluster (System)
  • 2.1.2 Node (Instance)
  • 2.1.3 Index (Database)
  • 2.1.4 Document (Row)
  • 2.1.5 ID (Primary Key)

2.2 Physical Layout

  • 2.2.1 Master-Eligible Nodes
  • 2.2.2 Data Nodes
  • 2.2.3 Coordinating Nodes

2.3 Sharding & Scaling

  • 2.3.1 Primary Shards
  • 2.3.2 Replica Shards
  • 2.3.3 The "Split Brain" Problem

3. Data Modelling

3.1 Mappings

  • 3.1.1 Explicit
  • 3.1.2 Dynamic
  • 3.1.3 Mapping Explosion

3.2 Data Types

3.2.1 Core Data Types

  • 3.2.1.1 Numeric
  • 3.2.1.2 Boolean
  • 3.2.1.3 Dates
  • 3.2.1.4 Geo Points

3.2.2 Text vs Keyword

  • 3.2.2.1 Text
  • 3.2.2.2 Keyword

3.2.3 Advanced Types

  • 3.2.3.1 Object
  • 3.2.3.2 Nested
  • 3.2.3.3 Flattened

4. Data Ingestion

4.1 CRUD Operations

  • 4.1.1 Create Index
  • 4.1.2 Index Document
  • 4.1.3 Delete Index
  • 4.1.4 Get Document
  • 4.1.5 Update Document
  • 4.1.6 Delete Documents

4.2 Bulk Operations

  • 4.2.1 Bulk Index
  • 4.2.2 Optimizing Bulk Indexing

4.3 Migrations & Repair

  • 4.3.1 Reindex
  • 4.3.2 Update by Query
  • 4.3.3 Delete by Query

5. Search Fundamentals

5.1 Query Languages

  • 5.1.1 Query DSL
  • 5.1.2 ES|QL
  • 5.1.3 EQL
  • 5.1.4 SQL
  • 5.1.5 KQL
  • 5.1.6 Lucene

5.2 Search Contexts

  • 5.2.1 Query
  • 5.2.2 Filter

5.3 Leaf vs Compound Queries

5.3.1 Leaf Queries

  • 5.3.1.1 Match Query
  • 5.3.1.2 Term Query
  • 5.3.1.3 Range Query
  • 5.3.1.4 Exists Query
  • 5.3.1.5 IDs Query
  • 5.3.1.6 Prefix Query
  • 5.3.1.7 Wildcard Query

5.3.2 Bool Queries (Compound Queries)

  • 5.3.2.1 must
  • 5.3.2.2 should
  • 5.3.2.3 filter
  • 5.3.2.4 must_not

5.4 Controlling Search Results

  • 5.4.1 Pagination
  • 5.4.2 Source Filtering
  • 5.4.3 Sorting
  • 5.4.4 Highlighting

6. How Search Works

  • 6.1 The Inverted Index
  • 6.2 Doc Values
  • 6.3 fielddata

7. Text Analysis

7.1 Search Analyzer

  • 7.1.1 The Analyze API
  • 7.1.2 Standard Analyzer
  • 7.1.3 Custom Analyzers

8. Aggregations

8.1 Metric Aggregations

  • 8.1.1 Value Count
  • 8.1.2 Cardinality
  • 8.1.3 Avg / Sum / Min / Max
  • 8.1.4 Stats / Extended Stats

8.2 Bucket Aggregations

  • 8.2.1 Terms
  • 8.2.2 Range / Date Range
  • 8.2.3 Histogram
  • 8.2.4 Filter Aggregations

8.3 Advanced Aggregations

  • 8.3.1 Nested Aggregations
  • 8.3.2 Pipeline Aggregations

9. Transformations

  • 9.1 Transform API
  • 9.2 Pivot
  • 9.3 Latest

10. Relevance & Tuning

  • 10.1 Document Scoring
  • 10.2 Understanding Similarity
  • 10.3 BM25 Algorithm
  • 10.4 Improve Query Precision
  • 10.5 Boosting Queries
  • 10.6 Function Score Query
  • 10.7 Match Phrase Query
  • 10.8 Synonym Graph

11. Production

11.1 Cluster Management

  • 11.1.1 CAT API
  • 11.1.2 Segment Merging
  • 11.1.3 Cluster Monitoring
  • 11.1.4 Cross-cluster Replication
  • 11.1.5 Autoscaling

11.2 Data Life Cycle

  • 11.2.1 ILM
  • 11.2.2 Rollover Policies

11.3 Data Safety

  • 11.3.1 Data Tiers
  • 11.3.2 Snapshots & Restore
  • 11.3.3 SLM

11.4 Security

  • 11.4.1 Authentication
  • 11.4.2 Roles & Users
  • 11.4.3 API Keys

12. Advanced Features

  • 12.1 AI-Powered Search
  • 12.2 Vector Search
  • 12.3 Semantic Search
  • 12.4 Hybrid Search