When an Index Automatically Becomes a Data Stream in Elasticsearch
In Elasticsearch, Data Streams are a way to organize continuous time-based data (logs, metrics, events) so that Elasticsearch can automatically manage the physical indices and handle rollover. Not every index is a data stream—it depends on how I configure Filebeat and index templates.
Common Scenario with Filebeat
In my first setup (Elasticsearch / Kibana – Export logs 1/2), I configured Filebeat like this:
output.elasticsearch:
  hosts: ["https://elasticsearch.devops-db.internal:9200"]
  index: "serviceexample-logs-%{+yyyy.MM.dd}"
- I did not create an index template manually.
- Filebeat applied Elastic’s default templates, which already include "data_stream": { "hidden": false }.
- As a result, each day created a separate data stream, with backing index .ds-serviceexample-logs-YYYY.MM.DD-000001.

Even though the index name had a date suffix, Elasticsearch interpreted the template as a data stream, producing multiple daily data streams instead of plain indices.
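A quick way to confirm this behavior from Kibana Dev Tools is to list the data streams and their hidden backing indices. The requests below assume the naming from my setup, so treat them as illustrative rather than exact:

# List every data stream created by the daily index pattern
GET _data_stream/serviceexample-logs-*

# Backing indices are hidden, so expand wildcards to see them
GET _cat/indices/.ds-serviceexample-logs-*?v&expand_wildcards=all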
Manual Data Stream Scenario
In my second post (Elasticsearch / Kibana – Export logs 2/2), I also tried creating a data stream explicitly:
PUT /_data_stream/serviceexample-custom
With an index template that includes "data_stream": {} (the full template is sketched after this list):
- I got a single logical data stream (serviceexample-custom).
- Elasticsearch automatically managed the backing indices (.ds-serviceexample-custom-2025.09.24-000001, .ds-serviceexample-custom-2025.09.25-000002, etc.).
- When writing or querying, I always used the logical data stream name, without worrying about dates or rollover.
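For reference, the index template behind this looked roughly like the sketch below; the template name, pattern, and settings are placeholders rather than required values:

# "data_stream": {} marks matching names as data streams instead of plain indices
PUT _index_template/serviceexample-custom-template
{
  "index_patterns": ["serviceexample-custom*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": { "number_of_shards": 1 }
  }
}

# Create the data stream explicitly (it would also be created on the first write)
PUT /_data_stream/serviceexample-custom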
Plain Index Scenario
If I create an index template manually and do not include "data_stream": {}:
- Any index I create with index: "my-index-%{+yyyy.MM.dd}" becomes a plain index, not a data stream (see the sketch after this list).
- Each daily index is independent.
- I have to manage rollover and ILM manually if I want automated rotation.
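As a sketch, such a template simply omits the data_stream block (the template name and settings here are illustrative):

# No "data_stream" block: indices matching the pattern stay plain indices
PUT _index_template/serviceexample-plain-template
{
  "index_patterns": ["serviceexample-index-*"],
  "priority": 500,
  "template": {
    "settings": { "number_of_shards": 1 }
  }
}

Note that ILM rollover on plain indices additionally needs a write alias (index.lifecycle.rollover_alias), which is part of why the data stream variants are simpler to operate.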
Single Data Stream: serviceexample-custom
----------------------------------------
serviceexample-custom (Logical Data Stream)
│
├─ .ds-serviceexample-custom-2025.09.24-000001
├─ .ds-serviceexample-custom-2025.09.25-000002
└─ .ds-serviceexample-custom-2025.09.26-000003
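Everything goes through the logical name at the top of that tree. A minimal write-and-query sketch, with example field values:

# Append a document; data streams are append-only, so the default create operation is used
POST serviceexample-custom/_doc
{
  "@timestamp": "2025-09-26T10:15:00Z",
  "message": "example log line"
}

# Search the logical name; Elasticsearch fans the query out to all backing indices
GET serviceexample-custom/_search
{
  "query": { "match": { "message": "example" } }
}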
Automatic Daily Data Streams: serviceexample-logs
-------------------------------------------------
serviceexample-logs-2025.09.24
└─ .ds-serviceexample-logs-2025.09.24-000001
serviceexample-logs-2025.09.25
└─ .ds-serviceexample-logs-2025.09.25-000001
Plain Indices
-------------
serviceexample-index-2025.09.24
serviceexample-index-2025.09.25
Key Takeaways:
- Automatic Data Stream: In my setup, Filebeat + default template → each day created a separate data stream.
- Manual Data Stream: I created it explicitly → a single logical data stream with internal rollover.
- Plain Index: created without "data_stream": {} → independent daily indices, manual rollover.
Comparing the two data stream approaches in more detail:
1. Single Data Stream (manual)
Advantages:
- Single logical name: I always read or write to serviceexample-custom, without worrying about dates.
- Internal rollover managed: Elasticsearch automatically creates backing indices as needed.
- Cleaner view: In Kibana, I see only one data stream, even if there are multiple backing indices.
- More control: I can configure ILM, retention, and rollover centrally for the data stream.
Disadvantages:
- ILM dependency: If I don’t configure lifecycle management properly, backing indices can grow indefinitely (a minimal policy sketch follows this list).
- Less visibility per day: Each backing index is hidden behind the data stream, so for daily auditing or troubleshooting I need to check the backing indices individually.
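A minimal ILM sketch along those lines, with a policy name (serviceexample-ilm) and thresholds I picked purely for illustration:

# Roll over the write index daily or at 10 GB, delete backing indices after 30 days
PUT _ilm/policy/serviceexample-ilm
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d", "max_primary_shard_size": "10gb" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}

# Reference the policy from the data stream's index template
PUT _index_template/serviceexample-custom-template
{
  "index_patterns": ["serviceexample-custom*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": { "index.lifecycle.name": "serviceexample-ilm" }
  }
}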
2. Automatic Daily Data Streams (Filebeat default)
Advantages:
- Quick setup: Filebeat automatically creates a daily data stream without complex templates.
- Natural daily separation: Each daily data stream has its own backing index, making auditing or deleting logs per day easier.
- No need to worry about rollover: Each day is a new data stream, avoiding a single large index.
Disadvantages:
- More visual clutter: In Kibana, multiple daily data streams appear, which can be confusing.
- Harder for long-term aggregations: Queries across multiple days require combining multiple data streams (for example with a wildcard, as sketched below).
- Less flexible: There’s no single data stream managing rollover centrally, so configuring ILM is less straightforward.
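Two sketches, assuming the daily naming shown above: a cross-day query using a wildcard, and the per-day cleanup mentioned under advantages:

# Aggregate across days by targeting the daily data streams with a wildcard
GET serviceexample-logs-*/_search
{
  "query": {
    "range": { "@timestamp": { "gte": "2025-09-24", "lte": "2025-09-26" } }
  }
}

# Drop a single day by deleting that day's data stream and its backing indices
DELETE _data_stream/serviceexample-logs-2025.09.24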
One-sentence summary of each:
- Single Data Stream: cleaner, centralized, good for continuous production with centralized rollover.
- Automatic Daily Data Streams: fast, simple, good for daily auditing, but creates many streams and can complicate aggregated queries.