CLAUDE.md - StreamNative Connect Documentation
This file provides guidance to Claude Code (claude.ai/code) when working with StreamNative connector documentation.
Purpose
The connect/ directory contains documentation for data connectors that enable integration between StreamNative/Pulsar and external systems, supporting both Pulsar IO and Kafka Connect frameworks.
Directory Structure
- connectors/: Individual connector documentation
  - Each connector has its own directory (e.g., google-bigquery-sink/, snowflake-sink/)
  - Contains /current/ subdirectory for latest version
- overview.mdx: General connector concepts and architecture
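As a rough illustration of the layout above, the following sketch lists connector directories that lack a current/ subdirectory. The function name `missing_current_dirs` and its `root` argument are hypothetical helpers introduced here, not part of the repository.

```python
from pathlib import Path


def missing_current_dirs(root: Path) -> list[str]:
    """Return names of directories under root/connectors that have no current/ subdirectory."""
    connectors = root / "connectors"
    if not connectors.is_dir():
        # Nothing to check if the connectors/ directory itself is absent.
        return []
    return sorted(
        entry.name
        for entry in connectors.iterdir()
        if entry.is_dir() and not (entry / "current").is_dir()
    )
```

Calling `missing_current_dirs(Path("connect"))` from the repository root would be one way to use it, assuming the directory is checked out under that name.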
Source Code Repository Mappings
StreamNative Connectors
- pulsar-io-bigquery (source_code_refs/pulsar-io-bigquery/): Google BigQuery connector
  - Sink connector for writing to BigQuery
  - Source connector for reading from BigQuery
  - Configuration in conf/pulsar-io-bigquery.yaml
  - Documentation in docs/ directory
- pulsar-io-snowflake-streaming (source_code_refs/pulsar-io-snowflake-streaming/): Snowflake streaming connector
  - High-performance sink for Snowflake data warehouse
  - Uses Snowflake Streaming API
  - Configuration examples in conf/ directory
Kafka Connect Support
- ksn (source_code_refs/ksn/): Kafka Connect runtime support
  - Implements Kafka Connect API compatibility
  - Enables running Kafka Connect connectors on Pulsar
  - Key directories:
    - kafka-connect/: Connect runtime implementation
    - docs/: Architecture and configuration guides
- sn-operator (source_code_refs/sn-operator/): Kafka Connect deployment
  - kafkaconnect_controller.go: Manages Kafka Connect clusters
  - Handles connector lifecycle in Kubernetes
Documentation Patterns
Connector Documentation Structure
Each connector typically includes:
- Overview: What the connector does and use cases
- Prerequisites: Required accounts, permissions, dependencies
- Installation: How to deploy the connector
- Configuration: Detailed parameter reference
- Usage Examples: Step-by-step guides
- Schema Support: Data format and schema evolution
- Monitoring: Metrics and health checks
- Troubleshooting: Common issues and solutions
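The section list above can be turned into a simple completeness check. This is a minimal sketch that assumes each section appears as a Markdown heading with exactly the title listed; real connector pages may name or split sections differently, and the helper `missing_sections` is hypothetical.

```python
REQUIRED_SECTIONS = [
    "Overview",
    "Prerequisites",
    "Installation",
    "Configuration",
    "Usage Examples",
    "Schema Support",
    "Monitoring",
    "Troubleshooting",
]


def missing_sections(doc_text: str) -> list[str]:
    """Return required section names that have no matching Markdown heading in doc_text."""
    headings = set()
    for line in doc_text.splitlines():
        stripped = line.strip()
        # Treat any line starting with '#' as a heading; lines inside code
        # fences are not excluded, which is a known limitation of this sketch.
        if stripped.startswith("#"):
            headings.add(stripped.lstrip("#").strip().lower())
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]
```

An empty result means every expected section heading was found.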
Configuration Documentation
- Required Parameters: Clearly marked with descriptions
- Optional Parameters: Default values and use cases
- Security Parameters: Authentication and encryption options
- Performance Parameters: Throughput and batching settings
- Example Configurations: Common scenarios
SNIP References
Check source_code_refs/snip/proposals/ for design documents related to:
- BigQuery connector (SNIP-39)
- Snowflake connector (SNIP-49)
- Kafka Connect support (SNIP-130, SNIP-134)
- Connector secrets management (SNIP-107)
- Connector UI enhancements (SNIP-109)
- SQS sink improvements (SNIP-117)
- Cloud storage package management (SNIP-132)
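For quick lookups, the topic-to-SNIP mapping above can be kept as a small table in code. This sketch only records the numbers listed here; it makes no assumption about file names inside the proposals directory, and `snips_for` is a hypothetical helper.

```python
SNIP_TOPICS = {
    "BigQuery connector": [39],
    "Snowflake connector": [49],
    "Kafka Connect support": [130, 134],
    "Connector secrets management": [107],
    "Connector UI enhancements": [109],
    "SQS sink improvements": [117],
    "Cloud storage package management": [132],
}


def snips_for(keyword: str) -> list[int]:
    """Return sorted SNIP numbers whose topic contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return sorted(
        number
        for topic, numbers in SNIP_TOPICS.items()
        if needle in topic.lower()
        for number in numbers
    )
```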
Common Tasks
Adding New Connector Documentation
- Create directory under connectors/{connector-name}/current/
- Follow standard structure (overview, config, usage, etc.)
- Include architecture diagrams showing data flow
- Add to navigation in docs.json
- Update connector index/overview pages
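The first two steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the page file names in `PLACEHOLDER_PAGES` are hypothetical (the source only says "overview, config, usage, etc."), and the navigation and index updates are left as manual steps because the docs.json layout is not described here.

```python
from pathlib import Path

# Hypothetical page names; adjust to whatever the existing connectors use.
PLACEHOLDER_PAGES = ["overview.mdx", "config.mdx", "usage.mdx"]


def scaffold_connector_docs(root: Path, connector_name: str) -> Path:
    """Create connectors/{connector_name}/current/ under root with placeholder pages."""
    target = root / "connectors" / connector_name / "current"
    target.mkdir(parents=True, exist_ok=True)
    for page in PLACEHOLDER_PAGES:
        path = target / page
        # Never overwrite a page that someone has already written.
        if not path.exists():
            path.write_text(f"TODO: write {page} for {connector_name}\n")
    return target
```

Running it twice is safe: existing pages are left untouched.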
Documenting Connector Configuration
- List all configuration parameters
- Group by category (connection, authentication, performance)
- Include validation rules and constraints
- Show example values and use cases
- Document environment variable alternatives
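One way to keep a parameter reference consistent with the checklist above is to describe each parameter as structured data and render the tables from it. The sketch below is illustrative only: `ConfigParam`, `render_reference`, and the parameter names in the usage note are hypothetical placeholders, not options of any real connector.

```python
from dataclasses import dataclass


@dataclass
class ConfigParam:
    name: str
    category: str  # e.g. "connection", "authentication", "performance"
    required: bool
    description: str
    default: str = ""
    constraint: str = ""  # validation rule, e.g. "> 0"
    example: str = ""
    env_var: str = ""  # environment variable alternative, if any


def render_reference(params: list[ConfigParam]) -> str:
    """Group parameters by category and render one Markdown table per group."""
    groups: dict[str, list[ConfigParam]] = {}
    for param in params:
        groups.setdefault(param.category, []).append(param)

    lines: list[str] = []
    for category, members in groups.items():
        lines.append(f"### {category.capitalize()}")
        lines.append("| Name | Required | Default | Constraint | Example | Env var | Description |")
        lines.append("| --- | --- | --- | --- | --- | --- | --- |")
        for p in members:
            lines.append(
                f"| {p.name} | {'yes' if p.required else 'no'} | {p.default or '-'} "
                f"| {p.constraint or '-'} | {p.example or '-'} | {p.env_var or '-'} "
                f"| {p.description} |"
            )
        lines.append("")
    return "\n".join(lines)
```

For example, `render_reference([ConfigParam("exampleBatchSize", "performance", False, "Placeholder batch size", default="100", constraint="> 0")])` yields a single "Performance" table with one row.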
Cloud vs Self-Hosted Deployment
- Cloud: Console UI deployment steps
- Cloud: API/CLI deployment examples
- Private Cloud: Kubernetes CRD examples
- Standalone: Docker and binary deployment
Performance Tuning Guides
- Batch size optimization
- Parallelism settings
- Memory allocation
- Network timeout configuration
- Error handling and retry policies
Connector Categories
Source Connectors
- Database CDC (Debezium MySQL, PostgreSQL, MongoDB)
- Message queues (Kafka, RabbitMQ, AWS SQS)
- Cloud storage (S3, GCS, Azure Blob)
- Streaming platforms (Kinesis, EventBridge)
Sink Connectors
- Data warehouses (BigQuery, Snowflake)
- Databases (Cassandra, MongoDB, JDBC)
- Search engines (Elasticsearch)
- Message queues (Kafka, SQS)
- Cloud storage (S3, GCS, Azure Blob)
- Analytics (InfluxDB, Pinecone)
Kafka Connect Connectors
- Kafka Connect compatible connectors
- Deployed via Kafka Connect runtime
- Configured using Connect REST API
- Support for SMTs (Single Message Transforms)
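To make the REST-based configuration concrete, the sketch below builds (but does not send) a request in the shape of the Apache Kafka Connect REST API, `PUT /connectors/{name}/config`, including one SMT. Several things here are assumptions: the base URL, the connector name, the topic, and the file path are placeholders, and whether the StreamNative runtime exposes this exact endpoint or ships the example `FileStreamSinkConnector` should be checked against the ksn docs.

```python
import json
import urllib.request
from urllib.parse import quote


def build_connector_config_request(base_url: str, name: str, config: dict) -> urllib.request.Request:
    """Build a PUT /connectors/{name}/config request without sending it."""
    url = f"{base_url.rstrip('/')}/connectors/{quote(name, safe='')}/config"
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder values throughout; only the key names follow Apache Kafka Connect.
config = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "tasks.max": "1",
    "topics": "example-topic",
    "file": "/tmp/example-sink.txt",
    # SMT that adds a static field to each record value.
    "transforms": "addSource",
    "transforms.addSource.type": "org.apache.kafka.connect.transforms.InsertField$Value",
    "transforms.addSource.static.field": "source",
    "transforms.addSource.static.value": "example",
}

request = build_connector_config_request("http://localhost:8083", "example-file-sink", config)
```

Passing `request` to `urllib.request.urlopen` would submit it; that step is omitted so the example has no network side effects.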
Important Considerations
Exactly-Once Semantics
- Which connectors support exactly-once delivery
- Configuration requirements for guarantees
- Performance implications
- Failure recovery behavior
Schema Management
- Schema registry integration
- Schema evolution support
- Data format conversions
- Compatibility between source and sink
Security
- Authentication methods by connector
- Encryption in transit and at rest
- Secret management best practices
- Network security requirements
Monitoring and Operations
- Metrics exposed by connectors
- Health check endpoints
- Log aggregation
- Alert configuration
- Scaling considerations
Cross-References
- Cloud docs for connector deployment UI
- Private Cloud docs for CRD-based deployment
- Clients docs for producer/consumer patterns
- API docs for connector management endpoints