Elasticsearch Roadmap¶
- Roadmap: https://roadmap.sh/elasticsearch
1. Introduction¶
- 1.1 What is Elasticsearch
- 1.2 Search Engines vs Relational DBs
- 1.3 The ELK Stack
- 1.4 Elasticsearch Usecases
1.5 Pre-requisites¶
- 1.5.1 JSON
- 1.5.2 REST API Basics
1.6 Environment Setup¶
- 1.6.1 Running with Docker
- 1.6.2 Elastic Cloud
- 1.6.3 Kibana Console
2. Core Architecture¶
2.1 Logical Concepts¶
- 2.1.1 Cluster (System)
- 2.1.2 Node (Instance)
- 2.1.3 Index (Database)
- 2.1.4 Document (Row)
- 2.1.5 ID (Primary Key)
2.2 Physical Layout¶
- 2.2.1 Master-Elegible Nodes
- 2.2.2 Data Nodes
- 2.2.3 Coordinating Nodes
2.3 Sharding & Scaling¶
- 2.3.1 Primary Shards
- 2.3.2 Replica Shards
- 2.3.3 The "Split Brain" Problem
3. Data Modelling¶
3.1 Mappings¶
- 3.1.1 Explicit
- 3.1.2 Dynamic
- 3.1.3 Mapping Explosion
3.2 Data Types¶
3.2.1 Code Data Types¶
- 3.2.1.1 Numeric
- 3.2.1.2 Boolean
- 3.2.1.3 Dates
- 3.2.1.4 Geo Points
3.2.2 Text vs Keyword¶
- 3.2.2.1 Text
- 3.2.2.2 Keyword
3.2.3 Advanced Types¶
- 3.2.3.1 Object
- 3.2.3.2 Nested
- 3.2.3.3 Flattened
4. Data Ingestion¶
4.1 CRUD Operations¶
- 4.1.1 Create Index
- 4.1.2 Index Document
- 4.1.3 Delete Index
- 4.1.4 Get Document
- 4.1.5 Update Document
- 4.1.6 Delete Documents
4.2 Bulk Operations¶
- 4.2.1 Bulk index
- 4.2.2 Optimizing Bulk Indexing
4.3 Migrations & Repair¶
- 4.3.1 Bulk index
- 4.3.2 Update by Query
- 4.3.3 Delete by Query
5. Search Fundamentals¶
5.1 Query Languages¶
- 5.1.1 Query DSL
- 5.1.2 ES|QL
- 5.1.3 EQL
- 5.1.4 SQL
- 5.1.5 KQL
- 5.1.6 Lucene
5.2 Search Contexts¶
- 5.2.1 Query
- 5.2.2 Filter
5.3 Leaf vs Compound Queries¶
5.3.1 Leaf Queries¶
- 5.3.1.1 Match Query
- 5.3.1.2 Term Query
- 5.3.1.3 Range Query
- 5.3.1.4 Exists Query
- 5.3.1.5 ID Query
- 5.3.1.6 Prefix Query
- 5.3.1.7 Wildcard Query
5.3.2 Bool Queries (Compound Queries)¶
- 5.3.2.1 must
- 5.3.2.2 should
- 5.3.2.3 filter
- 5.3.2.4 must_not
5.4 Controlling Search Results¶
- 5.4.1 Pagination
- 5.4.2 Source Filtering
- 5.4.3 Sorting
- 5.4.4 Highlighting
6. How Search Works¶
- 6.1 The Inverted Index
- 6.2 Doc values
- 6.3 fielddata
7. Text Analysis¶
7.1 Search Analyzer¶
- 7.1.1 The Analyze API
- 7.1.2 Standard Analyzer
- 7.1.3 Custom Analyzers
8. Aggregations¶
8.1 Metric Aggregations¶
- 8.1.1 Value Count
- 8.1.2 Cardinality
- 8.1.3 Avg / Sum / Min / Max
- 8.1.4 Stats / Extended Stats
8.2 Bulk Aggregations¶
- 8.2.1 Terms
- 8.2.2 Range / Date Range
- 8.2.3 Histogram
- 8.2.4 Filter Aggregations
8.3 Advanced Aggregations¶
- 8.3.1 Nested Aggregations
- 8.3.2 Pipeline Aggregations
9. Transformations¶
- 9.1 Transform API
- 9.2 Pivot
- 9.3 Latest
10. Relevance & Tuning¶
- 10.1 Document Scoring
- 10.2 Understanding Similarity
- 10.3 BM25 algorithm
- 10.4 Improve Query Precision
- 10.5 Boosting Queries
- 10.6 Function Score Query
- 10.7 Match Phrase Query
- 10.8 Synonyms Graph
11. Production¶
11.1 Cluster Management¶
- 11.1.1 CAT API
- 11.1.2 Segment Merging
- 11.1.3 Cluster Monitoring
- 11.1.4 Cross-cluster Replication
- 11.1.5 Autoscaling
11.2 Data Life Cycle¶
- 11.2.1 ILM
- 11.2.2 Rollover Policies
11.3 Data Safety¶
- 11.3.1 Data Tiers
- 11.3.2 Snapshots & restore
- 11.3.3 SLM
11.4 Security¶
- 11.4.1 Authentication
- 11.4.2 Roles & Users
- 11.4.3 API Keys
12. Advanced Features¶
- 12.1 AI-Powered Search
- 12.2 Vector Search
- 12.3 Semantic Search
- 12.4 Hybrid Search