Build applications using Kafka clients

Overview

Migrating foundational services such as databases and message queues is challenging: it requires coordination among multiple teams and carries significant risk. Moving to a new product usually means modifying code and migrating data. As Pulsar gains popularity, more users are drawn to its features, but the high cost of migration keeps many of them from adopting it.

The Kafka protocol handler, also known as KoP (Kafka on Pulsar), adds native Kafka protocol support to Pulsar. It aims to let Kafka users migrate to Pulsar without code changes and to let Kafka applications use Pulsar's features.

Rapid Horizontal Scalability

Pulsar separates the serving layer (brokers) from the storage layer (Apache BookKeeper). The two layers scale independently, so a sudden surge in traffic can be handled by scaling out brokers in seconds. The broker count can also be reduced during periods of low traffic. Because brokers do not store data, scaling in either direction avoids unnecessary data replication, which helps keep costs down. In Kafka, brokers both serve traffic and store messages, which can limit scalability. With KoP, Pulsar brokers speak the Kafka protocol, so Kafka users get the same scalability.

Automatic Load Balancing

Because storage and compute are separated, load balancing in Pulsar is efficient and cost-effective. Rebalancing typically completes within seconds, causes minimal disruption to clients, and avoids redundant data replication. It requires almost no manual intervention. KoP makes this capability fully available to Kafka users.

Multi-tenancy

Pulsar was created from the ground up as a multi-tenant system. To support multi-tenancy, Pulsar has a concept of tenants. Tenants can be spread across clusters, and each can have its own authentication and authorization scheme. Tenants are also the administrative unit at which storage quotas, message TTL, and isolation policies are managed. KoP does more than expose Kafka topics: by using multi-level topic names, it provides a multi-tenant service. You can therefore manage Kafka topics with Pulsar's multi-tenant management capabilities.
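The multi-level naming can be illustrated with a small sketch. It assumes the convention used by open-source KoP, in which a short Kafka topic name is resolved against a default tenant and namespace, and a fully qualified name has the form persistent://tenant/namespace/topic. The helper `resolve_topic` and the defaults `public` and `default` are hypothetical and only serve the illustration; they are not part of any KoP or Pulsar API.

```python
# Illustrative only: `resolve_topic` is a hypothetical helper, not a KoP API.
PERSISTENT_PREFIX = "persistent://"


def resolve_topic(kafka_topic, default_tenant="public", default_namespace="default"):
    """Map a topic name used by a Kafka client to a fully qualified Pulsar topic name."""
    if kafka_topic.startswith(PERSISTENT_PREFIX):
        # Already fully qualified: use it as-is.
        return kafka_topic
    parts = kafka_topic.split("/")
    if not all(parts):
        raise ValueError(f"unsupported topic name: {kafka_topic!r}")
    if len(parts) == 1:
        # Short name: fall back to the assumed default tenant and namespace.
        return f"{PERSISTENT_PREFIX}{default_tenant}/{default_namespace}/{kafka_topic}"
    if len(parts) == 3:
        # tenant/namespace/topic: only the scheme prefix is missing.
        return f"{PERSISTENT_PREFIX}{kafka_topic}"
    raise ValueError(f"unsupported topic name: {kafka_topic!r}")


print(resolve_topic("orders"))               # persistent://public/default/orders
print(resolve_topic("acme/billing/orders"))  # persistent://acme/billing/orders
```

Under these assumptions, two teams can both use a topic called orders without colliding, because each name resolves into a different tenant and namespace.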

Infinite data retention

In Kafka, a topic can have multiple partitions, but the data of each partition is bound to a fixed set of brokers determined by its replicas. If these brokers reach their storage capacity limit, you have to add more partitions or expand the storage capacity, and both operations complicate operation, maintenance, and use. Pulsar, in contrast, does not bind the data of a partition to a specific device. Instead, it divides the data into segments that can be stored on different bookies. As a result, the capacity of a single storage node does not limit the size of a topic or partition, and users only need to care about the total storage capacity. Kafka topics exposed through KoP have the same capability.
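The effect of segment-based storage can be shown with a toy model. This is a deliberately simplified sketch, not how BookKeeper actually chooses storage nodes: the function `place_segments`, the bookie names, and the "most free space first" rule are all assumptions made for illustration.

```python
# Toy model: place_segments and its placement rule are illustrative assumptions,
# not BookKeeper's real placement policy.
def place_segments(segment_sizes, bookie_free_space, replicas=1):
    """Assign each segment to `replicas` bookies, preferring those with the most free space."""
    free = dict(bookie_free_space)
    placement = []
    for size in segment_sizes:
        # Pick the bookies with the most free space; break ties by name.
        candidates = sorted(free, key=lambda name: (-free[name], name))[:replicas]
        if len(candidates) < replicas or any(free[name] < size for name in candidates):
            raise RuntimeError("cluster is out of storage capacity")
        for name in candidates:
            free[name] -= size
        placement.append(candidates)
    return placement, free


# One partition made of six 40 GB segments on three bookies with 100 GB each.
bookies = {"bookie-1": 100, "bookie-2": 100, "bookie-3": 100}
placement, free = place_segments([40] * 6, bookies)
print(placement)
print(free)
```

In this model the partition holds 240 GB even though no single bookie can store more than 100 GB, which is the property the paragraph above describes: only the total capacity of the cluster matters.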

Tiered Storage

Tiered storage in Pulsar moves long-term cold data to cheaper storage such as AWS S3 or GCS, which significantly reduces the cost of storing large amounts of historical data. The Kafka service provided by KoP has the same capability, which lowers long-term storage costs for Kafka users.

Serverless event processing with Pulsar Functions

Pulsar Functions perform simple computations on messages before routing them to consumers. These Lambda-style functions are designed for and integrated with Pulsar. The framework runs on your Pulsar cluster and handles the underlying details of sending and receiving messages, so you only need to focus on the business logic. It simplifies deployment and operations, makes troubleshooting easier, and supports serverless computing on Kubernetes. With KoP, Pulsar Functions can process the data written by Kafka producers.

The features listed above are not the whole picture. High data durability, strong consistency, low write latency, and geo-replication are also available to Kafka users through KoP.
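As a rough sketch of what "only the business logic" means, a function written in the language-native Python style is a plain function that receives a message payload and returns the output. The name `process` follows the convention used in Pulsar's own examples. The normalization logic, and the assumption that the input topic carries UTF-8 text written by a Kafka producer, are illustrative only.

```python
# Sketch of a language-native Python function for Pulsar Functions.
# The business logic (trim and upper-case) is an illustrative assumption.
def process(input):
    """Normalize a text payload: decode bytes if needed, trim whitespace, upper-case."""
    text = input.decode("utf-8") if isinstance(input, bytes) else input
    return text.strip().upper()


# Local check, no cluster required.
print(process(b"  order created "))  # ORDER CREATED
```

The function contains no producer or consumer code; the framework is responsible for reading from the input topic and publishing the return value.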

In addition, KoP makes Kafka clients and Pulsar clients interoperable: each can read data written by the other. This is particularly useful with the subscription types that Pulsar supports, because data written by Kafka clients can be consumed by Pulsar consumers, which makes traditional message queue use cases straightforward to implement. Doing so requires the Pulsar Consumer API on the consuming side.

KoP is not a replacement for Pulsar. Over time, you will likely want to use Pulsar's own API and features, such as flexible consumption through subscription types, individual message acknowledgment, delayed messages, message chunking, dead letter queues (DLQ), and non-persistent topics. KoP helps you migrate to Pulsar smoothly and with low risk, and the interoperability between Kafka clients and Pulsar clients gives you greater flexibility during migration.

KSN vs. KoP: why you need KSN

KSN (Kafka on StreamNative) is built on open-source KoP and adds a cloud-native experience and enterprise-grade features that improve developer productivity and operational efficiency.

Unrestricted Developer Productivity

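| Feature | KSN | KoP |
| --- | --- | --- |
| Pub/Sub | YES | YES |
| Serverless Functions | YES | - |
| IO Connectors | YES | - |
| Unified Schema Registry | Coming Soon | - |
| Schema Registry | YES | YES |
| Transactions | YES | YES |
| Compacted Topic | YES | |
| KStreams Integration | YES | |
| KSqlDB Integration | YES | |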

Production-stage Prerequisites

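| Feature | KSN | KoP |
| --- | --- | --- |
| OAuth Authentication | YES | - |
| Schema Registry OAuth Authentication | YES | - |
| TLS | YES | YES |
| Authorization | YES | YES |
| 99.95% Uptime | YES | - |
| Multi-AZ / Multi-Region Clusters | YES | - |
| Geo-replication | YES | - |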

Deployments and Efficient Operations

| Feature | KSN | KoP |
| --- | --- | --- |
| Hosted Cloud / BYOC | YES | - |
| Private Cloud | YES | - |
| On-prem | YES | YES |
| Auto-scaling | YES | - |
| Cloud Console / UI | YES | - |

Committer-driven Expertise

| Feature | KSN | KoP |
| --- | --- | --- |
| 24x7x365 Expert Support | YES | - |
| Professional Services | YES | - |
| Education | YES | YES |

You can use KSN to migrate existing Kafka applications and services to Pulsar without modifying the code. See the full list of supported Kafka clients.

Get started

To set up a Pulsar cluster with the Kafka protocol enabled and configure a Kafka client to produce and consume messages, see QuickStart.

Kafka client page wizard

To help you set up Kafka client libraries and tools after provisioning your cluster, StreamNative Console provides a step-by-step wizard that walks you through the basic setup and configuration: selecting or creating service accounts, downloading key files or tokens, installing client libraries, and generating sample code to run.

To get started with the Kafka client setup wizard, follow these steps.

  1. On the left navigation pane of StreamNative Console, in the Admin section, click Kafka Clients.

    [Animation: Kafka client setup process through the wizard]

  2. Follow the wizard to generate the sample code you need for connecting to your Pulsar cluster.

You can then copy, paste, and run the generated sample code to produce and consume messages.
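The code that the wizard generates is specific to your cluster and should be preferred over anything shown here. Purely as a sketch of the kind of settings a Kafka client typically needs, the following builds a configuration dictionary using librdkafka-style property names. It assumes SASL/PLAIN over TLS, with the tenant and namespace as the user name and the token passed as `token:<value>` in the password, which is the convention in open-source KoP. The helper name, host name, port, and token are placeholders.

```python
# Hypothetical helper: the values below are placeholders, not real endpoints or credentials.
def kafka_client_config(bootstrap_servers, tenant, namespace, token):
    """Build a librdkafka-style client configuration for a token-authenticated cluster."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",         # SASL authentication over TLS
        "sasl.mechanism": "PLAIN",
        "sasl.username": f"{tenant}/{namespace}",  # assumed KoP convention
        "sasl.password": f"token:{token}",         # assumed KoP convention
    }


config = kafka_client_config("your-cluster.example.com:9093", "public", "default", "YOUR_TOKEN")
# Print the property names only, so the token is not echoed to the terminal.
print(sorted(config))
```

A dictionary of this shape could be passed to a librdkafka-based client, for example `confluent_kafka.Producer(config)`, once the placeholders are replaced with the values that the wizard provides.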
