Show HN: Oodle – Unified Debugging with OpenSearch and Grafana

blog.oodle.ai

8 points by kirankgollu 8 hours ago

Hey HN -

Kiran and Vijay here.

We built Oodle after seeing teams forced to choose between low cost, great experience, zero ops, and no lock-in. Even with premium tools, debugging still meant switching between Grafana for metrics, OpenSearch for logs, and Jaeger or Tempo for traces - copying timestamps, losing context, and burning time during incidents.

So we decided to rethink observability from first principles - in both architecture and experience.

Architecturally, we borrowed ideas from Snowflake: we separated storage and compute so each can scale independently. All telemetry - metrics, logs, and traces - is stored on S3 in a custom columnar format, while serverless compute scales on demand. The result is 3–5× lower cost, massive scale, and zero operational overhead, with full compatibility with Grafana dashboards, PromQL, and OpenSearch queries.
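To make the storage/compute split concrete, here's a toy sketch of an hour-partitioned layout on object storage. The key scheme, names, and file extension are invented for illustration - Oodle's actual columnar format is custom:

```python
from datetime import datetime, timezone

def object_key(tenant: str, signal: str, unix_ts: int) -> str:
    # Hypothetical hour-partitioned S3 key; the real layout is Oodle's own.
    hour = datetime.fromtimestamp(unix_ts, tz=timezone.utc).strftime("%Y/%m/%d/%H")
    return f"{tenant}/{signal}/{hour}/part-0000.col"

# A query over a time range only reads the hour prefixes it covers, and
# stateless compute (e.g. serverless functions) can fan out over them -
# so storage and query capacity scale independently.
print(object_key("acme", "metrics", 1_700_000_000))
```

The point of a layout like this is that adding query capacity never requires rebalancing storage, and vice versa.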

On the experience side, Oodle unifies everything you already use. It works with your existing Grafana and OpenSearch setup, but when an alert fires, Oodle automatically correlates metrics, logs, and traces in one view - showing the latency spike, the related logs, and the exact service that caused it.

It’s already in production at SaaS, fintech, and healthcare companies, processing 10 TB+ of logs and 50M+ time series per day.

We’ve both spent years building large-scale data systems. Vijay worked on Rubrik’s petabyte-scale file system on object storage, and I helped build AWS S3 and DynamoDB before leading Rubrik’s cloud platform. Oodle applies the same design principles to observability.

You can try a live OpenTelemetry demo in < 5 minutes (no signup needed): https://play.oodle.ai/settings?isUnifiedExperienceTourModalO...

or watch a short product walkthrough here: https://www.youtube.com/watch?v=wdYWDG3dRkU

Would love feedback - what’s your biggest observability pain today: cost, debuggability, or lock-in?

kbavishi90 an hour ago

Curious about how you handle metrics with ephemeral labels? We have a bunch of serverless workloads emitting a lot of short-lived metrics, and this causes big cardinality spikes because the labels are ephemeral.

Do you support storing such high-cardinality metrics on disk alone, and not in the in-memory index?

  • kirankgollu an hour ago

    Our approach is that you shouldn't need to worry about this - Oodle is built to handle any scale, including ephemeral metrics.

    Secondly, billing only counts the unique time series seen within each hour. This should help us handle ephemeral metrics a lot better than many disk-based data stores.
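    A toy version of that hourly-unique counting (my sketch to show the idea, not Oodle's billing code):

```python
from collections import defaultdict

def billable_series_per_hour(samples):
    """samples: (unix_ts, labels) pairs, labels as a frozenset of (k, v)."""
    per_hour = defaultdict(set)
    for ts, labels in samples:
        per_hour[ts // 3600].add(labels)
    return {hour: len(series) for hour, series in per_hour.items()}

# An ephemeral series that lives for a few minutes shows up in one hour
# bucket, instead of inflating a global, ever-growing series count.
s1 = frozenset({("fn", "resize"), ("instance", "a1")})
s2 = frozenset({("fn", "resize"), ("instance", "b2")})
counts = billable_series_per_hour([(10, s1), (60, s1), (3700, s2)])
print(counts)  # {0: 1, 1: 1}
```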

    Finally, we do support high cardinality, since the underlying datastore is columnar. Existing customers are sending on the order of a few million series per metric.

    re: storing high-cardinality metrics on disk vs. in the in-memory index, Vijay will comment shortly.

    • mvijaykarthik a minute ago

      1. We store high-cardinality metrics in object storage and keep only the most recent data in memory. We use serverless functions (Lambdas) to query these objects, which helps us scale.

      2. We do not have a global index - each hour of metrics has its own index. This helps with ephemeral labels, since the size of each index stays small.
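      A rough sketch of what a per-hour (rather than global) inverted index looks like - names and structure invented here, not Oodle's code:

```python
from collections import defaultdict

class PerHourIndex:
    """Toy inverted index sharded by hour: an ephemeral label only
    bloats the buckets for the hours it was actually seen in."""
    def __init__(self):
        # hour -> (label, value) -> set of series ids
        self.hours = defaultdict(lambda: defaultdict(set))

    def add(self, unix_ts, series_id, labels):
        bucket = self.hours[unix_ts // 3600]
        for kv in labels.items():
            bucket[kv].add(series_id)

    def lookup(self, start_ts, end_ts, kv):
        found = set()
        for hour in range(start_ts // 3600, end_ts // 3600 + 1):
            if hour in self.hours:  # avoid creating empty buckets
                found |= self.hours[hour].get(kv, set())
        return found

idx = PerHourIndex()
idx.add(100, "s1", {"fn": "resize"})   # lands in hour bucket 0
idx.add(4000, "s2", {"fn": "resize"})  # lands in hour bucket 1
print(idx.lookup(0, 3599, ("fn", "resize")))  # {'s1'}
```

      A query only touches the hour buckets in its time range, so a cardinality spike in one hour never slows down lookups in other hours.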