275 lines
12 KiB
Markdown
275 lines
12 KiB
Markdown
# 📨 Messaging a streaming platformy
|
||
|
||
## Přehled platformem
|
||
|
||
| Platforma | Typ | Jazyk | Protokol | Persistence | Use case |
|
||
|-----------|-----|-------|----------|-------------|----------|
|
||
| **Apache Kafka** | Distributed event store | Java/Scala | Binary (TCP) | Disk (log) | Event streaming, data pipeline, log aggregation |
|
||
| **RabbitMQ** | Message broker | Erlang | AMQP 0-9-1, MQTT, STOMP | Disk / RAM | Aplikační messaging, task queue, RPC |
|
||
| **Apache Pulsar** | Distributed messaging + streaming | Java | Binary (TCP) + REST | Disk (segmented log) | Streaming + queue v jednom, multi-tenant |
|
||
| **NATS** | Lightweight messaging | Go | NATS protocol (TCP) | Memory / JetStream (disk) | Microservices, IoT, edge, low-latency |
|
||
| **AWS SQS** | Managed queue | — | HTTPS | Managed | Decoupling services, serverless |
|
||
| **AWS SNS** | Managed pub/sub | — | HTTPS, SQS, Lambda, email | Managed | Push notifications, fanout |
|
||
| **Azure Service Bus** | Managed messaging | — | AMQP, HTTPS | Managed | Enterprise messaging, sessions, transactions |
|
||
| **Google Pub/Sub** | Managed streaming | — | gRPC, REST | Managed | Event-driven, data pipeline |
|
||
| **Red Hat AMQ 7** (Artemis) | Message broker | Java | AMQP, MQTT, STOMP, OpenWire | Disk | Enterprise, JMS, high-availability |
|
||
| **Oracle Service Bus (OSB)** | Enterprise ESB | Java | HTTP/S, JMS, SOAP, REST, MQ, FTP, AQ | Managed (WebLogic) | Enterprise integration, SOA, protocol mediation, routing |
|
||
|
||
---
|
||
|
||
## Detail platformem
|
||
|
||
### Apache Kafka
|
||
|
||
**Architektura:**
|
||
|
||
```
|
||
Producer ──► Topic ──► Partition ──► Consumer Group
|
||
│
|
||
├── Partition 0 (Leader) ──► Broker 1
|
||
├── Partition 1 (Follower) ──► Broker 2
|
||
└── Partition 2 (Follower) ──► Broker 3
|
||
```
|
||
|
||
| Koncept | Popis |
|
||
|---------|-------|
|
||
| **Topic** | Logická kategorie zpráv |
|
||
| **Partition** | Append-only log, ordered sequence of messages |
|
||
| **Broker** | Server v Kafka clusteru |
|
||
| **Producer** | Publikuje zprávy do topicu |
|
||
| **Consumer** | Čte zprávy z partition (v rámci consumer group) |
|
||
| **Consumer Group** | Skupina consumerů sdílejících čtení topicu |
|
||
| **Offset** | Pozice v partition (sledovaná consumerem) |
|
||
| **KRaft** | Controller quorum (nahrazuje Zookeeper od Kafka 3.x) |
|
||
|
||
**Replikace a HA:**
|
||
|
||
| Parametr | Hodnota |
|
||
|----------|---------|
|
||
| Replication factor | 2–3 (typicky 3 pro produkci) |
|
||
| ISR (In-Sync Replicas) | Počet replik, které drží krok s leaderem |
|
||
| Min ISR | Minimální počet ISR pro potvrzení zápisu (acks=all) |
|
||
| acks=0 | Fire-and-forget (nejrychlejší, možná ztráta dat) |
|
||
| acks=1 | Zápis potvrzen leaderem (kompromis) |
|
||
| acks=all | Zápis potvrzen všemi ISR (nejbezpečnější) |
|
||
| Leader failover | Automatický výběr nového leadera z ISR |
|
||
|
||
**Důležité konfigurace:**
|
||
|
||
```properties
|
||
# Produkce
|
||
replication.factor=3
|
||
min.insync.replicas=2
|
||
default.replication.factor=3
|
||
|
||
# Retention
|
||
log.retention.hours=168 # 7 dní
|
||
log.retention.bytes=-1 # neomezeno (nebo limit)
|
||
log.segment.bytes=1073741824 # 1 GB per segment
|
||
|
||
# Performance
|
||
num.partitions=3 # podle potřeb (scale-out)
|
||
compression.type=snappy # (snappy, gzip, lz4, zstd)
|
||
```
|
||
|
||
**Partitioning strategies:**
|
||
|
||
| Strategy | Klíč | Výhoda | Nevýhoda |
|
||
|----------|------|--------|----------|
|
||
| Round-robin | null | Rovnoměrné rozložení | Ztráta pořadí per klíč |
|
||
| Key-based | user_id, order_id | Zprávy se stejným klíčem → stejná partition | Nerovnoměrné rozložení (hot keys) |
|
||
| Custom partitioner | Vlastní logika | Optimalizace per use case | Složitější na údržbu |
|
||
|
||
### RabbitMQ
|
||
|
||
**Architektura:**
|
||
|
||
```
|
||
Producer ──► Exchange ──► Binding ──► Queue ──► Consumer
|
||
│
|
||
┌───────────┼───────────┐
|
||
▼ ▼ ▼
|
||
Direct Topic Fanout
|
||
Exchange Exchange Exchange
|
||
```
|
||
|
||
| Koncept | Popis |
|
||
|---------|-------|
|
||
| **Exchange** | Přijímá zprávy od producera, routuje do queue |
|
||
| **Binding** | Vazba exchange → queue s routing key |
|
||
| **Queue** | FIFO fronta zpráv (consumer čte) |
|
||
| **Virtual Host (vhost)** | Izolace tenantů v rámci jednoho clusteru |
|
||
| **Publisher Confirm** | Potvrzení že broker zprávu přijal |
|
||
| **Consumer Ack** | Potvrzení že consumer zprávu zpracoval |
|
||
|
||
**Exchange typy:**
|
||
|
||
| Typ | Routing | Use case |
|
||
|-----|---------|----------|
|
||
| **Direct** | routing_key = binding_key | Task queue, point-to-point |
|
||
| **Topic** | routing_key match binding pattern (wildcard `*`, `#`) | Pub/sub s filtrováním |
|
||
| **Fanout** | Všem bindovaným queue | Broadcast, event notification |
|
||
| **Headers** | AMQP headers match | Komplexní routing (není závislý na routing key) |
|
||
|
||
**Queue typy:**
|
||
|
||
```properties
|
||
# Classic Queue (deprecated v produkci)
|
||
x-queue-type: classic
|
||
|
||
# Quorum Queue (doporučeno pro produkci)
|
||
x-queue-type: quorum
|
||
x-quorum-initial-group-size: 3
|
||
x-dead-letter-exchange: dlx
|
||
|
||
# Stream Queue (pro large backlogs)
|
||
x-queue-type: stream
|
||
x-max-length-bytes: 1073741824
|
||
```
|
||
|
||
**HA a clustering:**
|
||
|
||
| Režim | Popis | Use case |
|
||
|-------|-------|----------|
|
||
| **Quorum Queues** | Raft-based replikace (3–5 node), auto failover | Produkce, HA messaging |
|
||
| **Federation** | Async forwarding zpráv mezi nezávislými RabbitMQ clustery | Multi-region, DR |
|
||
| **Shovel** | Point-to-point forwarding zpráv (Federation na úrovni queue) | Migrace, specifický routing |
|
||
| **Warm Standby (DR)** | Druhý cluster, start až při failoveru | Cold DR |
|
||
|
||
### Apache Pulsar
|
||
|
||
**Unikátní architektura (compute/storage separation):**
|
||
|
||
```
|
||
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
|
||
│ Producer │ │ Consumer │ │ Consumer │
|
||
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
|
||
│ │ │
|
||
┌──────▼───────────────────▼───────────────────▼──────┐
|
||
│ Broker (stateless) │
|
||
│ Subscription: Exclusive / Shared / Failover │
|
||
└──────────────────────┬──────────────────────────────┘
|
||
│
|
||
┌──────────────────────▼──────────────────────────────┐
|
||
│ BookKeeper (stateful storage) │
|
||
│ ├── Bookie 1 ├── Bookie 2 ├── Bookie 3 ├── ... │
|
||
│ └── Ledger (append-only, segmented log) │
|
||
└─────────────────────────────────────────────────────┘
|
||
```
|
||
|
||
| Koncept | Popis |
|
||
|---------|-------|
|
||
| **Topic** | Logická kategorie (partitioned nebo non-partitioned) |
|
||
| **Subscription** | Způsob doručení (Exclusive, Shared, Failover, Key_Shared) |
|
||
| **Ledger** | Storage unit v BookKeeper (append-only) |
|
||
| **Bookie** | Storage node (BookKeeper) |
|
||
| **Managed Ledger** | Segmentovaný log s cache a retention |
|
||
|
||
**Výhody oproti Kafce:**
|
||
- Compute/storage separation — nezávislé škálování
|
||
- Geo-replication built-in (nativní)
|
||
- Multi-tenant (namespaces, isolation)
|
||
- TTL, retry, dead letter topic (built-in)
|
||
- Read-at-least-once / effectively-once
|
||
|
||
### NATS
|
||
|
||
| Feature | Popis |
|
||
|---------|-------|
|
||
| **Core NATS** | Pub/sub, request-reply, < 1 ms latence |
|
||
| **JetStream** | Persistence, exactly-once, key-value store, object store |
|
||
| **Leaf nodes** | Hierarchické propojení clusterů |
|
||
| **Super-cluster** | Multi-region clustering (global) |
|
||
|
||
**Use case:** IoT, edge computing, microservices communication, low-latency messaging.
|
||
|
||
### Oracle Service Bus (OSB)
|
||
|
||
Součást Oracle SOA Suite, běží na WebLogic Serveru. Enterprise service bus pro integraci v Oracle-heavy prostředích.
|
||
|
||
| Koncept | Popis |
|
||
|---------|-------|
|
||
| **Proxy Service** | Vstupní endpoint (HTTP, JMS, MQ, SOAP, REST) |
|
||
| **Business Service** | Cílový backend service |
|
||
| **Pipeline** | Message processing — routing, transformation, validation |
|
||
| **Split-Join** | Parallel/sequential orchestration více služeb |
|
||
| **Reporting** | Message tracking, SLA monitoring |
|
||
|
||
**Klíčové vlastnosti:**
|
||
- **Protocol mediation** — překlad mezi SOAP/REST/JMS/MQ/FTP
|
||
- **Message transformation** — XSLT, XQuery, MFL (neXML)
|
||
- **Throttling, SLA, alerting** — built-in
|
||
- **Oracle AQ (Advanced Queuing)** — integrace s Oracle DB frontami
|
||
- **XPath, XQuery, XSLT 2.0/3.0** — nativní podpora
|
||
- **Error handling** — fault policies, error queues, retry
|
||
|
||
**Use case:** Enterprise SOA, Oracle DB → Kafka bridging, legacy mainframe wrapping, B2B integration.
|
||
|
||
**Alternativy:** IBM Integration Bus (IIB), MuleSoft Anypoint, WSO2 EI, Apache Camel / ServiceMix.
|
||
|
||
---
|
||
|
||
## Srovnání platformem
|
||
|
||
### Výkon a škálování
|
||
|
||
| Platforma | Max throughput | Latence (P99) | Počet zpráv/s (1 broker) | Škálování |
|
||
|-----------|--------------|---------------|-------------------------|-----------|
|
||
| **Kafka** | > 1 GB/s | 2–10 ms | ~1 000 000 | Partitions (horizontální) |
|
||
| **Pulsar** | > 1 GB/s | 5–15 ms | ~1 000 000 | Brokers + Bookies |
|
||
| **RabbitMQ** | ~100 MB/s | < 1 ms (RAM) | ~100 000 | Clustering (node) |
|
||
| **NATS** | > 10 GB/s | < 0,5 ms | ~10 000 000 | Clustering + Leaf nodes |
|
||
| **OSB** | < 1 GB/s | 10–100 ms | ~10 000 | Vertikální (WebLogic cluster)
|
||
|
||
### Delivery guarantees
|
||
|
||
| Platforma | At most once | At least once | Exactly once | Ordering |
|
||
|-----------|-------------|---------------|-------------|----------|
|
||
| **Kafka** | Ano | Ano (acks=all + min.insync) | Ano (idempotent + transactional) | Per partition |
|
||
| **Pulsar** | Ano | Ano | Ano (dedup + transactional) | Per partition |
|
||
| **RabbitMQ** | Ano | Ano (Publisher Confirm + Consumer Ack) | Omezeně | Per queue |
|
||
| **NATS** | Ano | Ano (JetStream) | Omezeně | Per subject |
|
||
| **OSB** | Ano | Ano (XA transactions, exactly-once delivery) | Ano (XA + WS-AT) | Per pipeline |
|
||
|
||
### Kdy co použít
|
||
|
||
| Use case | Doporučená platforma | Zdůvodnění |
|
||
|----------|---------------------|------------|
|
||
| **Event sourcing / audit log** | Kafka, Pulsar | Append-only log, high throughput, replay |
|
||
| **CDC (Change Data Capture)** | Kafka (Kafka Connect + Debezium) | Ekosystém konektorů |
|
||
| **Task queue (job processing)** | RabbitMQ, SQS | Dead letter, retry, priority, scheduling |
|
||
| **API messaging / microservices** | NATS, RabbitMQ | Nízká latence, jednoduchost |
|
||
| **Data pipeline (ETL)** | Kafka (KSQL, Kafka Streams) | Stream processing v platformě |
|
||
| **IoT / Edge** | NATS, MQTT (RabbitMQ) | Lightweight, leaf nodes |
|
||
| **Enterprise SOA / EAI** | OSB, IBM IIB, MuleSoft | Protocol mediation, XA, B2B, legacy wrapping |
|
||
| **Multi-tenant cloud** | Pulsar | Nativní multi-tenant, geo-replication |
|
||
| **Serverless / event-driven** | SQS/SNS, Pub/Sub | Managed, auto-scaling |
|
||
|
||
---
|
||
|
||
## DR a vysoká dostupnost
|
||
|
||
Viz [DATACENTERS.md](DATACENTERS.md) — sekce "Vliv jednotlivých technologií na výběr DC topologie" pro detail DR mapping per platforma.
|
||
|
||
### Best practices
|
||
|
||
- **Neztrať zprávu v queue** — preferovat aknowledge-based consumption (ne auto-ack)
|
||
- **Dead letter queue** — každá hlavní queue má DLQ pro nedoručitelné zprávy
|
||
- **Monitoring lag** — consumer lag je klíčová metrika (Kafka: `kafka.consumer:consumer_lag`)
|
||
- **Idempotentní consumer** — stejná zpráva může být doručena dvakrát
|
||
- **Retry s backoff** — exponenciální backoff při selhání zpracování
|
||
- **Schema registry** — vyhnout se deserialization errors (Avro, Protobuf, JSON Schema)
|
||
- **Šifrování** — TLS in transit, encryption at rest (Kafka: cluster-side + topic-level)
|
||
|
||
---
|
||
|
||
## Související
|
||
|
||
- [DATACENTERS.md](DATACENTERS.md) — DR topologie, per-platforma mapping
|
||
- [CLOUD.md](CLOUD.md) — managed messaging (SQS, SNS, Service Bus, Pub/Sub)
|
||
|
||
## Zdroje
|
||
|
||
Odkazy, knihy a standardy: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||
|
||
*Poslední revize: 2026-06-12* |