Big Industries Academy
Exploring Messaging and Streaming Technologies Part 5: Google Pub/Sub
This is the fifth article in a series of articles by Francine Anestis, where she looks at different Messaging and Streaming technologies. In this piece, she delves into the world of Google Pub/Sub, uncovering its Key Features, Architecture, Use Cases, Strengths and Weaknesses, Cost, and Maturity Level.
Google Cloud Pub/Sub is a fully managed, real-time messaging service that allows you to send and receive messages between independent applications. It provides durable message storage and delivery at any scale, with high availability and consistent performance. Designed for building event-driven systems and real-time analytics, Pub/Sub decouples senders and receivers and offers reliable, scalable, and secure messaging for cloud-native applications.
Key Features
- Low Latency: Google Cloud Pub/Sub maintains low and predictable latency, with end-to-end message delivery typically taking around 100 milliseconds (for example, roughly 100 ms at a throughput of 1 MB/s). This makes it suitable for real-time and near-real-time applications where timely message delivery is critical.
- Google Cloud Ecosystem: Strong integration with Google Cloud services and third-party tools.
- Scalability: Scales horizontally and automatically to handle millions of messages per second.
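- Message Retention: Retains unacknowledged messages in a subscription for 7 days by default; topics can additionally be configured to retain messages for up to 31 days.
- Exactly Once Delivery: Delivers messages at least once by default, and offers exactly-once delivery as an option that can be enabled on pull subscriptions.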
- Push and Pull: Supports both push and pull delivery models for different application needs.
- Message Ordering: Provides optional message ordering. When ordering is enabled on a subscription, messages published with the same ordering key are delivered in the order they were published.
- Encryption: Encrypts messages in transit and at rest.
- IAM Integration: Integrates with Google Cloud IAM for fine-grained access control.
- Cloud Monitoring: Integration with Google Cloud Monitoring for comprehensive monitoring and logging.
- Service Level Agreements (SLAs): Offers enterprise-grade SLAs for message delivery and availability.
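Because Pub/Sub's default guarantee is at-least-once delivery, a subscriber can occasionally receive the same message more than once, so handlers are usually written to be idempotent. The sketch below illustrates that idea in plain Python; it is not Pub/Sub client code. The `Message` and `IdempotentHandler` names and the in-memory set of seen IDs are assumptions made for this example, and a real system would likely keep the seen IDs in a durable store.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for a delivered message."""
    message_id: str
    data: bytes


@dataclass
class IdempotentHandler:
    """Applies the side effect of each message ID at most once."""
    seen_ids: set = field(default_factory=set)
    processed: list = field(default_factory=list)

    def handle(self, message: Message) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message.message_id in self.seen_ids:
            return False  # redelivery: safe to acknowledge and skip
        self.seen_ids.add(message.message_id)
        self.processed.append(message.data)  # the "side effect" in this toy example
        return True


handler = IdempotentHandler()
deliveries = [
    Message("1", b"order-created"),
    Message("2", b"order-paid"),
    Message("1", b"order-created"),  # simulated redelivery of message 1
]
results = [handler.handle(m) for m in deliveries]
```

Here the third delivery is recognised as a repeat and skipped, so the handler's side effect runs only twice.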
Architecture
Google Pub/Sub architecture typically involves the following components:
- Publishers: Applications or services that send messages to Pub/Sub topics.
- Topics: Named resources to which messages are sent by publishers.
- Subscriptions: Named resources representing the stream of messages from a single, specific topic, to be delivered to the subscribing application.
- Subscribers: Applications or services that receive messages from subscriptions.
- Message Storage: Ensures reliable and durable storage of messages until they are delivered to subscribers.
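The relationship between these components can be sketched with a small in-memory model. This is a toy illustration only, not the Pub/Sub service or its client library: the `Topic` and `Subscription` classes and their method names are assumptions chosen for the example. It shows that each subscription receives its own copy of every published message and holds it until the subscriber acknowledges it.

```python
from collections import deque


class Subscription:
    """Toy subscription: keeps its own backlog of messages for one topic."""

    def __init__(self, name):
        self.name = name
        self._pending = deque()   # messages waiting to be pulled
        self._outstanding = {}    # ack_id -> message awaiting acknowledgement
        self._next_ack_id = 0

    def _enqueue(self, data):
        self._pending.append(data)

    def pull(self):
        """Return (ack_id, data) for the next message, or None if the backlog is empty."""
        if not self._pending:
            return None
        data = self._pending.popleft()
        ack_id = self._next_ack_id
        self._next_ack_id += 1
        self._outstanding[ack_id] = data
        return ack_id, data

    def ack(self, ack_id):
        """Acknowledge a message so it is dropped from storage."""
        self._outstanding.pop(ack_id, None)

    def nack(self, ack_id):
        """Put an unacknowledged message back in the backlog for redelivery."""
        data = self._outstanding.pop(ack_id, None)
        if data is not None:
            self._pending.append(data)


class Topic:
    """Toy topic: fans each published message out to every attached subscription."""

    def __init__(self, name):
        self.name = name
        self._subscriptions = []

    def subscribe(self, subscription_name):
        subscription = Subscription(subscription_name)
        self._subscriptions.append(subscription)
        return subscription

    def publish(self, data):
        for subscription in self._subscriptions:
            subscription._enqueue(data)


topic = Topic("orders")
billing = topic.subscribe("billing")
shipping = topic.subscribe("shipping")
topic.publish(b"order-42")

ack_id, data = billing.pull()
billing.ack(ack_id)
```

After this runs, the billing subscriber has consumed and acknowledged its copy, while the shipping subscription still holds its own copy of the same message, which is what decouples the two consumers from each other and from the publisher.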
Use Cases
- Event-Driven Architectures: Decoupling microservices and enabling communication through events.
- Real-Time Analytics: Collecting and analyzing streaming data in real-time for operational insights.
- Data Integration: Integrating various data sources and sinks for real-time data pipelines.
- IoT Data Processing: Handling data ingestion and processing from IoT devices.
- Log Aggregation: Centralizing and processing logs from distributed systems for monitoring and analysis.
Strengths
- Fully Managed Service: Google Cloud Pub/Sub is designed as a fully managed, serverless service that abstracts away the underlying infrastructure and its management. Unlike a self-managed Kafka cluster, there is no orchestration component equivalent to ZooKeeper for you to operate.
- Scalability: Automatically scales to handle large volumes of messages with low latency.
- Security: Provides robust security features, including encryption and IAM (Identity and Access Management) integration.
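- Protocol Interoperability: Google Cloud Pub/Sub exposes both a REST (HTTP/JSON) API and a gRPC API, with client libraries for common languages built on top of them. Pub/Sub Lite is a separate service with its own API rather than an additional protocol.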
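As an illustration of the REST interface, the sketch below builds the URL and JSON body for a `topics.publish` call without sending anything. The project ID, topic ID, and the `build_publish_request` helper are placeholders invented for this example; actually sending the request would also require an OAuth 2.0 access token in the `Authorization` header, which is left out here.

```python
import base64
import json


def build_publish_request(project_id, topic_id, payload, attributes=None):
    """Return (url, json_body) for a Pub/Sub REST publish call; nothing is sent."""
    url = (
        "https://pubsub.googleapis.com/v1/"
        f"projects/{project_id}/topics/{topic_id}:publish"
    )
    # The REST API expects the message payload as a base64-encoded string.
    message = {"data": base64.b64encode(payload).decode("ascii")}
    if attributes:
        message["attributes"] = dict(attributes)
    body = json.dumps({"messages": [message]})
    return url, body


# "my-project" and "my-topic" are hypothetical identifiers.
url, body = build_publish_request(
    "my-project", "my-topic", b"hello", {"source": "demo"}
)
```

The same publish operation is available over gRPC, which is what the official client libraries typically use under the hood.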
Weaknesses
- Cost: Costs can accumulate with high message throughput and long retention periods.
- Complexity: Although the architecture is quite straightforward, it may require careful planning and design to optimize performance and cost.
- Vendor Lock-In: Dependency on Google Cloud may limit flexibility in multi-cloud environments.
Cost
Google Pub/Sub pricing is based on several factors:
- Message Ingestion: Charged by the volume of data published to a topic.
- Message Delivery: Charged by the volume of data delivered to subscribers.
- Data Storage: Charged per GiB-month for retained messages.
- Egress Traffic: Charges apply for data transfer out of Google Cloud.
To estimate the costs based on your requirements, the Google Cloud Pricing Calculator can be used.
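For a rough back-of-the-envelope figure before turning to the calculator, the four pricing factors can be combined as usage multiplied by unit rate. The sketch below assumes the factors simply add up and ignores free tiers and regional differences. The rates in the example are made-up placeholders, not Google Cloud prices, and the `Rates`, `Usage`, and `estimate_monthly_cost` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Placeholder unit prices; NOT real Google Cloud prices."""
    ingestion: float
    delivery: float
    storage: float
    egress: float


@dataclass(frozen=True)
class Usage:
    """Monthly usage, expressed in the same units the rates are quoted in."""
    ingestion: float
    delivery: float
    storage: float
    egress: float


def estimate_monthly_cost(usage, rates):
    """Return a per-factor cost breakdown plus the total."""
    breakdown = {
        "ingestion": usage.ingestion * rates.ingestion,
        "delivery": usage.delivery * rates.delivery,
        "storage": usage.storage * rates.storage,
        "egress": usage.egress * rates.egress,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Illustrative numbers only: three subscribers, so delivery is 3x ingestion.
rates = Rates(ingestion=2.0, delivery=2.0, storage=0.5, egress=1.0)
usage = Usage(ingestion=10, delivery=30, storage=4, egress=6)
breakdown = estimate_monthly_cost(usage, rates)
```

Even with placeholder rates, the breakdown makes one point visible: delivery cost grows with the number of subscriptions, since each subscription receives its own copy of every message.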
Maturity Level
- Mature (2015): Google Pub/Sub is a mature service that has been widely adopted in the industry for various messaging and streaming use cases. It is part of Google Cloud's core services, benefiting from continuous improvements and integrations with other Google Cloud services.
Conclusion
Google Cloud Pub/Sub is a powerful, fully managed messaging and streaming service that provides robust, scalable, and secure message delivery for cloud-native applications. With its low-latency performance, high reliability, and seamless integration with other Google Cloud services, Pub/Sub is well-suited for building event-driven architectures, real-time analytics, and IoT applications. While it offers many strengths, including scalability and security, organizations should consider the potential costs and complexity involved. Overall, Google Pub/Sub is a valuable tool for modern, data-driven applications that require reliable and scalable messaging capabilities.
Francine Anestis
With both my diploma thesis and my internship focused on ETL, Analysis and Forecasting of Big Streaming Data, I am keen on learning more and immersing myself in Data Engineering and the Data Space in general. Building data pipelines and using Kafka, databases and algorithms captivated me during my studies as an Electrical and Computer Engineer, and as a result I decided to dedicate myself to Data Engineering. I am very excited to be starting my learning and career path at Big Industries. Regarding my skills, if I had to choose one programming language and one platform, I would say that Python and Kafka are my strongest assets, but I am looking forward to extending that list.