When an Index Automatically Becomes a Data Stream in Elasticsearch

In Elasticsearch, Data Streams are a way to organize continuous time-based data (logs, metrics, events) so that Elasticsearch can automatically manage the physical indices and handle rollover. Not every index is a data stream: it depends on how I configure Filebeat and index templates.

Common Scenario with Filebeat

In my first setup (Elasticsearch / Kibana – Export logs 1/2), I configured Filebeat like this:

output.elasticsearch:
  hosts: ["https://elasticsearch.devops-db.internal:9200"]
  index: "serviceexample-logs-%{+yyyy.MM.dd}"

  • I did not create an index template manually.
  • Filebeat applied Elastic’s default templates, which already include "data_stream": { "hidden": false }.
  • As a result, each day created a separate data stream, with backing index .ds-serviceexample-logs-YYYY.MM.DD-000001.

Even though the index had a date suffix, Elasticsearch interpreted the template as a data stream, producing multiple daily data streams instead of plain indices.
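To verify this behavior, the data streams Filebeat created can be listed directly (the wildcard matches the daily naming above):

GET /_data_stream/serviceexample-logs-*

The response lists each daily stream together with its .ds-* backing index.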


Manual Data Stream Scenario

In my second post (Elasticsearch / Kibana – Export logs 2/2), I also tried creating a data stream explicitly:

PUT /_data_stream/serviceexample-custom

Together with an index template including "data_stream": {} (the template must already exist when the data stream is created):

  • I got a single logical data stream (serviceexample-custom).
  • Elasticsearch automatically managed the backing indices (.ds-serviceexample-custom-2025.09.24-000001, .ds-serviceexample-custom-2025.09.25-000002, etc.).
  • When writing or querying, I always used the logical data stream name, without worrying about dates or rollover.
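As a sketch, the index template behind this setup could look like the following (the template name and priority are illustrative, not taken from the original posts):

PUT /_index_template/serviceexample-custom-template
{
  "index_patterns": ["serviceexample-custom*"],
  "data_stream": {},
  "priority": 500
}

Elasticsearch requires a matching template with the data_stream object before PUT /_data_stream/serviceexample-custom succeeds; alternatively, the first document indexed to a matching name auto-creates the stream once the template is in place.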

Plain Index Scenario

If I create an index template manually and do not include "data_stream": {}:

  • Any index I create with index: "my-index-%{+yyyy.MM.dd}" becomes a plain index, not a data stream.
  • Each daily index is independent.
  • I have to manage rollover and ILM manually if I want automated rotation.
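A minimal manual template of this kind, without the data_stream object, might look like this (pattern and settings are illustrative):

PUT /_index_template/serviceexample-index-template
{
  "index_patterns": ["serviceexample-index-*"],
  "template": {
    "settings": { "number_of_shards": 1 }
  }
}

Indices created under this pattern stay regular indices, and rotation or lifecycle policies have to be wired up explicitly.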

Single Data Stream: serviceexample-custom
----------------------------------------
serviceexample-custom  (Logical Data Stream)

├─ .ds-serviceexample-custom-2025.09.24-000001
├─ .ds-serviceexample-custom-2025.09.25-000002
└─ .ds-serviceexample-custom-2025.09.26-000003

Automatic Daily Data Streams: serviceexample-logs
-------------------------------------------------
serviceexample-logs-2025.09.24
└─ .ds-serviceexample-logs-2025.09.24-000001

serviceexample-logs-2025.09.25
└─ .ds-serviceexample-logs-2025.09.25-000001

Plain Indices
-------------
serviceexample-index-2025.09.24
serviceexample-index-2025.09.25

Key Takeaways:

  1. Automatic Data Stream: In my setup, Filebeat + default template → each day created a separate data stream.
  2. Manual Data Stream: I created it explicitly → a single logical data stream with internal rollover.
  3. Plain Index: I created without "data_stream": {} → independent daily indices, manual rollover.

So in short:

1. Single Data Stream (manual)

Advantages:

  • Single logical name: I always read or write to serviceexample-custom, without worrying about dates.
  • Internal rollover managed: Elasticsearch automatically creates backing indices as needed.
  • Cleaner view: In Kibana, I see only one data stream, even if there are multiple backing indices.
  • More control: I can configure ILM, retention, and rollover centrally for the data stream.
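For that centralized control, a lifecycle policy can be attached through the data stream's template; a sketch with illustrative thresholds:

PUT /_ilm/policy/serviceexample-custom-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d", "max_primary_shard_size": "50gb" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}

Referencing the policy as index.lifecycle.name in the index template's settings makes every backing index of the data stream inherit it.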

Disadvantages:

  • ILM dependency: If I don’t configure lifecycle management properly, backing indices can grow indefinitely.
  • Less visibility per day: Each backing index is hidden behind the data stream, so for daily auditing or troubleshooting I need to check the backing indices individually.

2. Automatic Daily Data Streams (Filebeat default)

Advantages:

  • Quick setup: Filebeat automatically creates a daily data stream without complex templates.
  • Natural daily separation: Each daily data stream has its own backing index, making auditing or deleting logs per day easier.
  • No need to worry about rollover: Each day is a new data stream, avoiding a single large index.

Disadvantages:

  • More visual clutter: In Kibana, multiple daily data streams appear, which can be confusing.
  • Harder for long-term aggregations: Queries across multiple days require combining multiple data streams.
  • Less flexible: There’s no single data stream managing rollover centrally, so configuring ILM is less straightforward.
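Cross-day queries are still possible with a wildcard over the daily stream names; a sketch (the time range is illustrative):

GET /serviceexample-logs-*/_search
{
  "query": {
    "range": { "@timestamp": { "gte": "now-7d" } }
  }
}

This works, but each query now fans out across many small streams instead of the backing indices of a single stream.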

Summary, one sentence each:

  • Single Data Stream: cleaner, centralized, good for continuous production with centralized rollover.
  • Automatic Daily Data Streams: fast, simple, good for daily auditing, but creates many streams and can complicate aggregated queries.