Friday, August 22, 2025

How Airties achieved scalability and cost-efficiency by shifting from Kafka to Amazon Kinesis Data Streams


This post was cowritten with Steven Aerts and Reza Radmehr from Airties.

Airties is a wireless networking company that provides AI-driven solutions for improving home connectivity. Founded in 2004, Airties specializes in developing software and hardware for wireless home networking, including Wi-Fi mesh systems, extenders, and routers. Its flagship software as a service (SaaS) product, Airties Home, is an AI-driven platform designed to automate customer experience management for home connectivity, offering proactive customer care, network optimization, and real-time insights. By using AWS managed services, Airties can focus on its core mission: improving home Wi-Fi experiences through automated optimization and proactive issue resolution. This includes minimizing network downtime, enabling faster diagnostics for troubleshooting, and improving overall Wi-Fi quality. The solution has significantly reduced both the frequency of help desk calls and the average call duration, improving customer satisfaction and lowering operational costs for Airties while delivering better service quality to its customers and their end users.

In 2023, Airties began a strategic migration from Apache Kafka running on Amazon Elastic Compute Cloud (Amazon EC2) to Amazon Kinesis Data Streams. Before the migration, Airties operated multiple fixed-size Kafka clusters, each deployed in a single Availability Zone to minimize cross-AZ traffic costs. Although this architecture served its purpose, it required constant monitoring and manual scaling to handle varying data loads. The transition to Kinesis Data Streams marked a significant step in their cloud optimization journey, enabling truly serverless operations with automatic scaling. The migration substantially reduced infrastructure costs while improving system reliability and eliminating the need for manual cluster management and capacity planning.

This post explores the strategies the Airties team employed during this transformation, the challenges they overcame, and how they achieved a more efficient, scalable, and maintenance-free streaming infrastructure.

Kafka use cases for Airties workloads

Airties continuously ingests data from tens of millions of access points (such as modems and routers) using AWS IoT Core. Before the transition, these messages were queued and stored in multiple siloed Kafka clusters, with each cluster deployed in a separate Availability Zone to minimize cross-AZ traffic costs. This fragmented architecture created several operational challenges. The segmented data storage required complex extract, transform, and load (ETL) processes to consolidate information across clusters, increasing the time to derive meaningful insights. The collected data serves several critical purposes, from real-time monitoring and reactive troubleshooting to predictive maintenance and historical analysis. However, the siloed storage made cross-cluster analytics particularly difficult and delayed the identification of network-wide patterns and trends.

The data processing architecture at Airties served two distinct use cases. The first was a traditional streaming pattern, with a batch reader processing data in bulk for analytical purposes. The second used Kafka as a queryable data store, a pattern that, though unconventional, has become increasingly common in large-scale data architectures.

For this second use case, Airties needed fast access to historical device data when troubleshooting customer issues or analyzing specific network events. This was implemented by maintaining a mapping of data points to their Kafka offsets in a database. When customer support or analytics teams needed specific historical data, they could quickly locate and fetch the exact records from high-retention Kafka topics using the stored offsets. This approach eliminated the need for a separate database system while maintaining fast access to historical data.

To handle the massive scale of operations, this solution was horizontally scaled across dozens of Kafka clusters, with each cluster responsible for roughly 25 TB of data.
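The offset-index pattern described above can be sketched in Python. This is illustrative only: a plain dict stands in for the database, and all names (`record_offset`, `locate`, the device IDs) are assumptions, not Airties's actual code. The final function shows how a stored coordinate would be turned into a single Kafka read with the `kafka-python` client.

```python
from typing import NamedTuple, Optional

class KafkaCoordinates(NamedTuple):
    """Where one historical data point lives inside Kafka."""
    topic: str
    partition: int
    offset: int

# In production this index lived in a database; a dict stands in here.
offset_index = {}

def record_offset(device_id: str, timestamp: int,
                  topic: str, partition: int, offset: int) -> None:
    """Called on the write path after each record is acknowledged by Kafka."""
    offset_index[(device_id, timestamp)] = KafkaCoordinates(topic, partition, offset)

def locate(device_id: str, timestamp: int) -> Optional[KafkaCoordinates]:
    """Find where a historical data point is stored, if it was indexed."""
    return offset_index.get((device_id, timestamp))

def fetch_from_kafka(coords: KafkaCoordinates) -> bytes:
    """Seek to the stored offset and read exactly one record (needs kafka-python)."""
    from kafka import KafkaConsumer, TopicPartition  # imported lazily; not stdlib
    consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
    tp = TopicPartition(coords.topic, coords.partition)
    consumer.assign([tp])          # bypass consumer groups: we want one exact record
    consumer.seek(tp, coords.offset)
    return next(iter(consumer)).value
```

Because the topics had long retention, a seek plus a single fetch was enough to serve a support query without any secondary data store.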

The following diagram illustrates the previous Kafka-based architecture.

Challenges with the Kafka-based architecture

At Airties, managing and scaling Kafka clusters presented several challenges that kept the organization from focusing on delivering business value:

  • Operational overhead: Maintaining and monitoring Kafka clusters required significant manual effort. Tasks such as managing cluster upgrades, handling hardware failures and rotation, and conducting load testing constantly demanded engineering attention, pulling the team away from core business functions and value-adding activities.
  • Scaling complexity: Scaling Kafka clusters involved multiple manual steps that created operational burden for the cloud team, including configuring new brokers, rebalancing partitions across nodes, and ensuring proper data distribution, all while maintaining system stability. As data volume and throughput requirements fluctuated, scaling typically meant adding or removing entire Kafka clusters, a complex and time-consuming process for the Airties team.
  • Right-sizing cluster capacity: The static nature of Kafka clusters created a "one-size-fits-none" situation. For large-scale deployments with high data volumes and throughput requirements, adding new clusters required significant manual work, including capacity planning, broker configuration, and partition rebalancing, making it inefficient for dynamic scaling needs. Conversely, for smaller deployments, the standard cluster size was oversized, leading to wasted resources and unnecessary costs.

How the new architecture addresses these challenges

The Airties team needed a scalable, high-performance, and cost-effective solution for real-time data processing that would scale seamlessly with growing data volumes. Data durability was a critical requirement, because losing device telemetry data would create permanent gaps in customer analytics and historical troubleshooting capabilities. Although temporary delays in data access could be tolerated, the loss of any device data point was unacceptable for maintaining service quality and customer support effectiveness.

To address these challenges, Airties implemented two different approaches for the two scenarios.

The first use case was real-time data streaming with Kinesis Data Streams. Airties replaced Kafka with Kinesis Data Streams to handle the continuous ingestion and processing of telemetry data from tens of millions of endpoints. This shift provided significant advantages:

  • Auto scaling capabilities: Kinesis Data Streams can be scaled through simple API calls, removing the need for complex configurations and manual intervention.
  • Stream isolation: Each stream operates independently, meaning scaling operations on one stream have no impact on others. This removed the risks associated with cluster-wide changes in the previous Kafka setup.
  • Dynamic shard management: Unlike Kafka, where changing the number of partitions requires creating a new topic, Kinesis Data Streams allows adding or removing shards dynamically without losing message ordering within a partition.
  • Application Auto Scaling: Airties implemented AWS Application Auto Scaling with Kinesis Data Streams, allowing the system to automatically adjust the number of shards based on actual usage patterns and throughput requirements.

These features allowed Airties to manage resources efficiently, optimizing costs during periods of lower activity while seamlessly scaling up to handle peak loads.
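Resharding a stream really is a single API call. As a minimal sketch (not Airties's actual scaling policy), the helper below sizes a stream from an observed traffic peak against the default per-shard write limits of Kinesis Data Streams (1 MB/s and 1,000 records/s per shard); the 25% headroom factor is an illustrative assumption, and the `UpdateShardCount` call is deferred to a function that requires boto3.

```python
import math

# Default per-shard write limits for Kinesis Data Streams.
SHARD_WRITE_BYTES_PER_SEC = 1_000_000
SHARD_WRITE_RECORDS_PER_SEC = 1_000

def target_shard_count(peak_bytes_per_sec: float,
                       peak_records_per_sec: float,
                       headroom: float = 1.25) -> int:
    """Shards needed to absorb the observed peak, with headroom for bursts."""
    by_bytes = peak_bytes_per_sec * headroom / SHARD_WRITE_BYTES_PER_SEC
    by_records = peak_records_per_sec * headroom / SHARD_WRITE_RECORDS_PER_SEC
    return max(1, math.ceil(max(by_bytes, by_records)))

def rescale(stream_name: str,
            peak_bytes_per_sec: float,
            peak_records_per_sec: float) -> None:
    """Apply the computed target via the UpdateShardCount API (needs boto3)."""
    import boto3  # imported lazily; only needed when actually calling AWS
    boto3.client("kinesis").update_shard_count(
        StreamName=stream_name,
        TargetShardCount=target_shard_count(peak_bytes_per_sec,
                                            peak_records_per_sec),
        ScalingType="UNIFORM_SCALING",
    )
```

For example, a 3 MB/s peak at 500 records/s is byte-bound and sizes to 4 shards with the assumed headroom; record-heavy workloads with small payloads would be record-bound instead.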

For providing on-demand access to historical device data, Airties implemented a decoupled architecture that separates streaming, storage, and data access concerns. This replaced the previous solution, in which historical data was stored directly in Kafka topics. The new architecture consists of several key components working together:

  • Data collection and processing: The architecture begins with a consumer application that processes data from Kinesis Data Streams. This application analyzes the data and makes it available for detailed historical analysis. The analysis results are written to Amazon Data Firehose, which buffers the data and periodically writes it to Amazon Simple Storage Service (Amazon S3), where it can later be picked up by Amazon EMR. This path is optimized for efficient storage and bulk reads from Amazon S3 by Amazon EMR. For raw data storage, multiple raw data samples are batched together into bulk files, which are stored under a separate Amazon S3 path optimized for storage efficiency and for fetching raw data using Amazon S3 range queries.
  • Indexing and metadata management: To enable fast data retrieval, the architecture implements a sophisticated indexing system. For each record in the uploaded bulk files, two key pieces of information are recorded in an Amazon DynamoDB table: the Amazon S3 location (bucket and key) where the bulk file was written, and the sequence number of the corresponding data record in the Kinesis data stream. This indexing strategy provides low-latency access to specific data points, efficient querying for both real-time and historical data, automatic scaling to handle growing data volumes, and high availability for metadata lookups.
  • Ad hoc data retrieval: When specific historical data needs to be accessed, the system follows an efficient retrieval process. First, the application queries the DynamoDB table using the relevant identifiers. The query returns the exact Amazon S3 location and offset where the required data is stored. The application then fetches the specific data directly from Amazon S3 using range queries. This approach enables fast access to historical data points, minimal data transfer costs by retrieving only the needed data, efficient troubleshooting and analysis workflows, and reduced latency for customer support operations.
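The retrieval path can be sketched as follows. The `BulkFileLocation` shape and its field names are assumptions for illustration, standing in for the attributes stored in the DynamoDB index; the key fact is that an S3 `GetObject` with a `Range` header pulls back only the bytes of the one record needed.

```python
from typing import NamedTuple

class BulkFileLocation(NamedTuple):
    """What the DynamoDB index returns for one historical record (illustrative)."""
    bucket: str
    key: str
    offset: int   # byte offset of the record inside the bulk file
    length: int   # record length in bytes

def range_header(loc: BulkFileLocation) -> str:
    """HTTP Range value selecting exactly one record (byte range is inclusive)."""
    return f"bytes={loc.offset}-{loc.offset + loc.length - 1}"

def fetch_record(loc: BulkFileLocation) -> bytes:
    """Ranged S3 GET for a single historical record (needs boto3)."""
    import boto3  # imported lazily; only needed when actually calling AWS
    resp = boto3.client("s3").get_object(
        Bucket=loc.bucket,
        Key=loc.key,
        Range=range_header(loc),
    )
    return resp["Body"].read()
```

A record at byte 4096 with length 512, for instance, becomes `Range: bytes=4096-4607`, so support tooling transfers half a kilobyte instead of the whole multi-gigabyte bulk file.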

This decoupled architecture uses the strengths of each AWS service: Kinesis Data Streams provides scalable and reliable real-time data streaming, Amazon S3 delivers durable and cost-effective object storage for raw data, and DynamoDB enables fast and flexible storage of metadata and indexes. By separating streaming from storage and using each service for its specific strengths, Airties created a more cost-effective and scalable solution for ad hoc data access, aligning each component with its optimal AWS service. The new architecture not only improved data access performance but also significantly reduced operational complexity. Instead of managing Kafka topics for historical data storage, Airties now benefits from fully managed AWS services that automatically handle scaling, durability, and availability. This approach has proven particularly valuable for customer support scenarios, where fast access to historical device data is crucial for resolving issues efficiently.

Solution overview

Airties's new architecture includes several critical components: efficient data ingestion, indexing with AWS Lambda functions, optimized data aggregation and processing, and comprehensive monitoring and management using Amazon CloudWatch. The following diagram illustrates this architecture.

The new architecture consists of the following key stages:

  • Data collection and storage: The data journey begins with Kinesis Data Streams, which ingests real-time data from millions of access points. This streaming data is then processed by a consumer application that batches the data into bulk files (known as briefcase files) for efficient storage in Amazon S3. This stream-batch-store approach minimizes write operations and reduces overall costs, while providing data durability through the built-in replication of Amazon S3. Once the data is in Amazon S3, it is readily available for both immediate processing and long-term analysis. The processing pipeline continues with aggregators that read data from Amazon S3, process it, and store aggregated results back in Amazon S3. By integrating AWS Glue for ETL operations and Amazon Athena for SQL-based querying, Airties can process large volumes of data efficiently and generate insights quickly and cost-effectively.
  • Data aggregation and bulk file creation: The aggregators play a crucial role in initial data processing. They aggregate the incoming data based on predefined criteria and create bulk files. This aggregation reduces the volume of data that needs to be processed in subsequent steps, optimizing the overall workflow. The aggregators then write these bulk files directly to Amazon S3.
  • Indexing: Upon successful upload of a bulk file to Amazon S3, the aggregator writes an index entry for the bulk file to an Amazon DynamoDB table. This indexing mechanism enables efficient retrieval of data based on device IDs and timestamps, facilitating quick access to relevant data using S3 range queries on the bulk files.
  • Further processing and analysis: The bulk files stored in Amazon S3 are in a format optimized for querying and analysis. These files can be further processed using AWS Glue and analyzed using Athena, allowing complex queries and in-depth data exploration without additional data transformation steps.
  • Monitoring and management: To maintain the reliability and performance of the Kafka-less architecture, comprehensive monitoring and management practices were implemented. CloudWatch provides real-time monitoring of system performance and resource utilization, allowing proactive management of potential issues. Additionally, automated alerts and notifications ensure that anomalies are promptly addressed.
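The aggregation and indexing stages above can be sketched together: pack records into one bulk file while remembering each record's byte position, upload the file, then write one index row per record. The record layout, field names, and `write_index` helper are illustrative assumptions, not Airties's actual implementation.

```python
from typing import List, Tuple

def build_bulk_file(records: List[Tuple[str, bytes]]) -> Tuple[bytes, list]:
    """Concatenate raw (device_id, payload) records into one bulk ("briefcase")
    file. Returns the file bytes plus per-record index entries recording where
    each payload landed, so it can later be fetched with an S3 range query."""
    blob = bytearray()
    index_entries = []
    for device_id, payload in records:
        index_entries.append({
            "device_id": device_id,
            "offset": len(blob),     # byte position inside the bulk file
            "length": len(payload),
        })
        blob.extend(payload)
    return bytes(blob), index_entries

def write_index(table_name: str, bucket: str, key: str, index_entries: list) -> None:
    """After the S3 upload succeeds, record every entry in DynamoDB (needs boto3)."""
    import boto3  # imported lazily; only needed when actually calling AWS
    table = boto3.resource("dynamodb").Table(table_name)
    with table.batch_writer() as batch:  # batches puts into BatchWriteItem calls
        for entry in index_entries:
            batch.put_item(Item={**entry, "bucket": bucket, "s3_key": key})
```

Writing the index only after the upload succeeds keeps the invariant that every DynamoDB entry points at bytes that already exist in Amazon S3.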

Results and benefits

The transition to the new architecture yielded significant benefits for Airties:

  • Scalability and performance: The new architecture allows Airties to scale seamlessly with growing data volumes. The ability to scale reader and writer operations independently has reduced performance impacts during high-demand periods, a significant improvement over the previous Kafka-based system, where scaling often required complex reconfiguration and could affect the entire cluster. With Kinesis Data Streams, Airties can now handle peak loads effortlessly while optimizing resource utilization during quieter periods.
  • Reliability and fault tolerance: By using AWS managed services, Airties has significantly reduced system latency and improved overall uptime. The automated data replication and recovery processes of Kinesis Data Streams provide enhanced data durability, a critical requirement for Airties's operations. The improved availability means Airties can offer more reliable services to its customers, minimizing disruptions and improving the overall quality of its home connectivity solutions.
  • Operational efficiency: The new architecture has dramatically reduced the need for manual intervention in capacity management. This has freed up valuable engineering resources, allowing the team to focus on delivering business value rather than managing infrastructure. The simplified operational model has increased the team's productivity, enabling faster innovation and quicker responses to customer needs. The reduction in operational overhead has also led to faster deployment cycles and more frequent feature releases, improving Airties's competitiveness in the market.
  • Environmental impact and sustainability: The transition to a serverless architecture delivered significant environmental benefits, achieving a remarkable 40% reduction in energy consumption. This decrease was achieved by eliminating constantly running EC2 instances in favor of more efficient managed AWS services. The improvement in energy efficiency aligns with Airties's commitment to environmental sustainability and establishes them as an environmentally responsible leader in the tech industry.
  • Cost optimization: The financial benefits of transitioning to a Kafka-less architecture are clearly demonstrated by AWS Cost Explorer data. The total cost breakdown across all relevant services from January to July covers EC2 instances, DynamoDB, other Amazon EC2 costs, Kinesis Data Streams, Amazon S3, and Amazon Data Firehose. The most notable change was a 33% reduction in total monthly infrastructure costs (compared to the January baseline), achieved primarily through a significant decrease in Amazon EC2-related costs as the migration progressed, the elimination of dedicated Kafka infrastructure, and efficient use of the AWS pay-as-you-go model. Although new costs were introduced for the managed services (DynamoDB, Kinesis Data Streams, Amazon Data Firehose, Amazon S3), overall monthly AWS costs maintained a clear downward trend. With these cost savings, Airties can offer more competitive pricing to its customers. The following diagram shows the monthly cost breakdown during the transition.

Conclusion

The transition to the new architecture with Kinesis Data Streams marks a significant milestone in Airties's journey toward operational excellence and sustainability. These initiatives have not only enhanced system performance and scalability, but have also delivered substantial cost savings (33%) and energy-efficiency gains (40%). By using advanced technologies and innovative solutions on AWS, the Airties team continues to set the benchmark for efficient, reliable, and sustainable operations while paving the way for a sustainable future. To explore how you can modernize your streaming architecture with AWS, see the Kinesis Data Streams documentation and watch the re:Invent session on serverless data streaming with Kinesis Data Streams and AWS Lambda.


About the Authors

Steven Aerts is a principal software engineer at Airties, where his team is responsible for ingesting, processing, and analyzing the data of tens of millions of homes to improve their Wi-Fi experience. He has spoken at conferences such as Devoxx and AWS Summit Dubai, and is an open source contributor.

Reza Radmehr is a Senior Leader of Cloud Infrastructure and Operations at Airties, where he leads AWS infrastructure design, DevOps and SRE automation, and FinOps practices. He focuses on building scalable, cost-efficient, and reliable systems, driving operational excellence through smart, data-driven cloud strategies. He is passionate about combining financial insight with technical innovation to improve performance and efficiency at scale.

Ramazan Ginkaya is a Senior Technical Account Manager at AWS with over 17 years of experience in IT, telecommunications, and cloud computing. He is a passionate problem-solver, providing technical guidance to AWS customers to help them achieve operational excellence and maximize the value of cloud computing.
