What is ProtoBuf? Protocol Buffers Explained

When performance is critical, text-based formats like JSON and XML can be too slow to parse and too large on the wire. Enter Protocol Buffers (Protobuf), Google's high-performance binary serialization format.

Introduction

Protocol Buffers (often called Protobuf) is a method of serializing structured data. It is useful when programs need to communicate with each other over a network, or when data needs to be stored compactly.

Unlike JSON and XML, Protobuf is a binary format, so an encoded message is not directly human-readable. You define the structure of your data once, in a schema. A compiler then generates source code from that schema, and you use the generated code to write and read your structured data to and from a variety of data streams, in a variety of languages.

How it works

  1. Define Schema (.proto): You define the data structure in a `.proto` file.
  2. Compile: You run the `protoc` compiler to generate data access classes in your preferred language (C++, Java, Python, Go, etc.).
  3. Serialize/Deserialize: Use the generated code to serialize objects to bytes (for network transmission) or deserialize bytes back to objects.

Example .proto file

person.proto
syntax = "proto3";

message Person {
  string name = 1;
  int32 id = 2;
  string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    string number = 1;
    PhoneType type = 2;
  }

  repeated PhoneNumber phones = 4;
}
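
To make the binary format more concrete, here is a minimal hand-written sketch of how the three scalar fields of `Person` (`name`, `id`, `email`) are laid out on the wire. It is not the generated code that `protoc` produces, and real projects should use that generated code instead. The helper names (`encode_varint`, `encode_person`, `decode_person`) are made up for this illustration, the nested `phones` field is skipped, and negative `id` values are not handled.

```python
# Illustrative sketch only: hand-rolled encoding of Person's scalar fields.
# All function names here are hypothetical, not part of any Protobuf library.

WIRE_VARINT = 0   # wire type used for int32, enums, bools
WIRE_LEN = 2      # wire type used for strings, bytes, nested messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_person(name: str, person_id: int, email: str) -> bytes:
    """Encode fields 1 (name), 2 (id), 3 (email); defaults are omitted."""
    parts = []
    for number, text in ((1, name), (3, email)):
        if text:
            raw = text.encode("utf-8")
            tag = encode_varint((number << 3) | WIRE_LEN)
            parts.append((number, tag + encode_varint(len(raw)) + raw))
    if person_id:
        tag = encode_varint((2 << 3) | WIRE_VARINT)
        parts.append((2, tag + encode_varint(person_id)))
    # Emit fields in field-number order.
    return b"".join(chunk for _, chunk in sorted(parts))


def decode_person(data: bytes) -> dict:
    """Decode the bytes back into a dict, ignoring fields this sketch does not know."""
    person = {"name": "", "id": 0, "email": ""}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == WIRE_VARINT:
            value, pos = decode_varint(data, pos)
            if number == 2:
                person["id"] = value
        elif wire_type == WIRE_LEN:
            length, pos = decode_varint(data, pos)
            raw = data[pos:pos + length]
            pos += length
            if number == 1:
                person["name"] = raw.decode("utf-8")
            elif number == 3:
                person["email"] = raw.decode("utf-8")
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
    return person


payload = encode_person("Ada", 150, "ada@example.com")
print(payload.hex())
print(decode_person(payload))
```

Each field is written as a tag (the field number from the schema combined with a wire type) followed by the value. Field names such as `name` never appear in the payload, which is why the numbers assigned in the `.proto` file must stay stable once messages are in use.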

Benefits over JSON

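  • Size: Messages carry numeric field tags instead of repeated field names, and integers use a compact variable-length encoding, so payloads are smaller. The often-quoted figure of 3-10x smaller comes from Google's comparison with XML; savings over JSON depend on the data and are usually more modest.
  • Speed: Generated code reads and writes binary data directly, without text parsing. The often-quoted 20-100x speedup also comes from Google's comparison with XML, so benchmark against JSON with your own payloads.
  • Type Safety: Every field has a declared type in the schema and the generated code enforces it, reducing bugs caused by mismatched or misspelled fields.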
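
As a rough illustration of the size benefit, the sketch below encodes one small record both as compact JSON and as Protobuf wire bytes, then compares the lengths. The wire encoding is written by hand so the example runs without the Protobuf library; the helper names (`varint`, `string_field`, `int_field`) and the sample record are assumptions made for this example. Treat the result as one data point, not a benchmark: the ratio varies a lot with the shape of the data.

```python
import json

# Illustrative sketch only: hypothetical helpers that mimic the Protobuf
# wire format for non-negative integers and UTF-8 strings.


def varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def string_field(number: int, text: str) -> bytes:
    """Tag (wire type 2), byte length, then the UTF-8 bytes."""
    raw = text.encode("utf-8")
    return varint((number << 3) | 2) + varint(len(raw)) + raw


def int_field(number: int, value: int) -> bytes:
    """Tag (wire type 0), then the value as a varint."""
    return varint(number << 3) + varint(value)


record = {"name": "Ada", "id": 150, "email": "ada@example.com"}

# Compact JSON: no spaces after separators.
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

# Same record using field numbers 1, 2, 3 from the Person schema.
proto_bytes = (
    string_field(1, record["name"])
    + int_field(2, record["id"])
    + string_field(3, record["email"])
)

json_size = len(json_bytes)
proto_size = len(proto_bytes)
print(f"JSON: {json_size} bytes, Protobuf: {proto_size} bytes")
print(f"ratio: {json_size / proto_size:.2f}x")
```

For a string-heavy record like this one, most of the bytes are the string contents themselves, which Protobuf stores as plain UTF-8. The saving comes mainly from dropping field names and punctuation. Records dominated by numbers or repeated fields tend to shrink more.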

Use Cases

  • gRPC: Protobuf is the default serialization format for gRPC, a high-performance RPC framework.
  • Microservices: Ideal for internal communication between microservices where bandwidth and latency are critical.
  • Mobile Apps: Reduces data usage and parsing time on mobile devices.

Tools