Day 36/100

Designing Data-Intensive Applications [Book Highlights]

[Part I : Chapter IV] Encoding and Evolution

Avro

  • To maintain forward and backward compatibility its important to add or remove a field that has some default value.
  • In case of Avro, You can only use null as a default value if it is one of the branches of the union.
  • Changing the datatype of a field is possible, provided that Avro can convert the type
  • Changing a field name is very tricky. reader’s schema can contain aliases for field names, so it can match an old writer’s schema field names against the aliases
  • So, Changing a field name is (possibly) backward compatible but not forward compatible.
  • The simplest solution is to include a version number at the beginning of every encoded record, and to keep a list of schema versions in your database
  • A database of schema versions is a useful thing to have.

Dataflow Through Databases

Dataflow Through Services: REST and RPC

Message-Passing Dataflow

  • message brokers are used as follows: one process sends a message to a named queue or topic, and the broker ensures that the message is delivered to one or more consumers of or subscribers to that queue or topic
  • There can be many producers and many consumers on the same topic.
  • Message brokers typically don’t enforce any particular data model a message is just a sequence of bytes with some metadata

Distributed actor frameworks

  • The actor model is a programming model for concurrency in a single process
  • logic is encapsulated in actors. Each actor typically represents one client or entity, it may have some local state, and it communicates with other actors by sending and receiving asynchronous messages.

Summery

  • Databases, where the process writing to the database encodes the data and the process reading from the database decodes it
  • RPC and REST APIs, where the client encodes a request, the server decodes the request and encodes a response, and the client finally decodes the response
  • Asynchronous message passing (using message brokers or actors), where nodes communicate by sending each other messages that are encoded by the sender and decoded by the recipient
  • with a bit of care, backward/forward compatibility and rolling upgrades are quite achievable