Sep 24, 2018

[Golang] Protoactor-go 101: How actors communicate with each other

Designing an actor-based program is all about dividing tasks into smaller pieces. Fine-grained actors concentrate on their own tasks, collaborate with other actors and accomplish a big task as a whole. Hence mastering the actors' communication mechanism and modeling well-defined messages are always the keys to designing an actor system. This article describes protoactor-go's actor categories, their messaging methods and how those methods differ in referencing sender actors.
See my previous article, [Golang] Protoactor-go 101: Introduction to golang's actor model implementation, for protoactor-go's basic concepts and terms.

TL;DR

While there are several kinds of actors, they share a unified interface to communicate with each other. Various methods are provided for this communication, but always use Request() when the recipient actor needs to know who the sender is. When that is not an option, include the sender actor's actor.PID in the message itself.

Example code

Example code that covers all communication means for all actor implementations is located at github.com/oklahomer/protoactor-go-sender-example. Minimal examples are introduced in this article for illustration; visit this repository for comprehensive examples.

Premise: Three major kinds of actors

protoactor-go comes with three kinds of actors: local, remote and cluster grain.

  • Local ... Actors located in the same process.
  • Remote ... Actors located in different processes or on different servers. An actor is considered "local" when addressed from within the same process, and "remote" when addressed across a network. Because a message is sent over a network, message serialization is required; protoactor-go uses Protocol Buffers for this task.
  • Cluster grain ... A kind of remote actor whose lifecycle and other complexity are taken care of by the protoactor-go library. Cluster topology is managed by Consul, which tracks the cluster membership and the availability of each node, and a grain can be addressed over a network.
Thanks to location transparency, an actor can communicate with other actors in the same way without worrying about where the recipient actors are located. In addition to those basic communication means, a cluster grain has an extra mechanism to provide an RPC-based interface.
Each actor is encapsulated in an actor.PID instance, so developers communicate with actors via the methods this actor.PID provides. (actor.Context also provides equivalent methods, but these can be considered wrappers for actor.PID's corresponding methods.) One important thing to remember is that the above actors are not the only entities encapsulated in actor.PIDs. As a matter of fact, any actor.Process implementation, including a mailbox, the Future mechanism and others, is also encapsulated in an actor.PID. This may be familiar to those with an Erlang background. Understanding this becomes vital when one tries to refer to the message sender. The rest of this article describes each messaging method and how a recipient actor can refer to the sending actor.

Communication methods

Below are the common communication methods -- Tell(), Request() and RequestFuture() -- and the RPC-based method for cluster grains. The examples in this article all demonstrate local actor messaging because local and remote actors share a common messaging interface. Visit my example repository for all messaging implementations of local, remote and cluster grain actors.

Tell() tells nothing about the sender 

To send a message to an actor, one may call actor.PID's Tell() method. When a message is sent from outside of an actor system by calling PID.Tell(), the recipient actor fails to refer to the sending actor with Context.Sender(). This is pretty obvious: because the message is sent from the outside, there is no such thing as a sending actor. Below is an example:
package main

import (
 "github.com/AsynkronIT/protoactor-go/actor"
 "time"
)

type ping struct{}

type pong struct{}

func main() {
 props := actor.FromFunc(func(ctx actor.Context) {
  switch ctx.Message().(type) {
  case *ping:
   // This fails to get sender
   // because the message came
   // from outside of actor system
   //
   // Below execution leads to dead letter
   // 2018/09/14 22:40:02 [ACTOR] [DeadLetter] pid="nil" message=&{} sender="nil"
   ctx.Respond(&pong{})

   // Below execution causes a panic since Sender() returns nil.
   // Actor crashes and that causes supervisor to restart this failing actor.
   // 2018/09/14 22:40:02 [MAILBOX] [ACTOR] Recovering actor="nonhost/$1" reason="runtime error: invalid memory address or nil pointer dereference" stacktrace="github.com/AsynkronIT/protoactor-go/actor.(*PID).ref:26"
   // 2018/09/14 22:40:02 [ACTOR] [SUPERVISION] actor="nonhost/$1" directive="RestartDirective" reason="runtime error: invalid memory address or nil pointer dereference"
   ctx.Sender().Tell(&pong{})

  }
 })

 pid := actor.Spawn(props)
 pid.Tell(&ping{})

 time.Sleep(1 * time.Second) // Just to make sure system ends after actor execution
}
In the above example, a message is directly sent to an actor from outside of the actor system, so the recipient actor fails to refer to the sending actor. With Akka, this behavior is similar to setting ActorRef#noSender as the second argument of ActorRef#tell -- when the recipient tries to respond, the message goes to the dead letter mailbox.

When a message is sent from one actor to another, there indeed is a sender-recipient relationship. The recipient actor's contextual information, actor.Context, appears to provide such information for us. Below is an example that tries to refer to the sender actor with actor.Context:
package main

import (
 "github.com/AsynkronIT/protoactor-go/actor"
 "log"
 "time"
)

type pong struct {
}

type ping struct {
}

type pingActor struct {
 pongPid *actor.PID
}

func (p *pingActor) Receive(ctx actor.Context) {
 switch ctx.Message().(type) {
 case struct{}:
  // Below does not set ctx.Self() as sender,
  // and hence the recipient has no knowledge of the sender
  // even though the message is sent from another actor via actor.Context.
  //
  ctx.Tell(p.pongPid, &ping{})

 case *pong:
  log.Print("Received pong message")

 }
}

func main() {
 pongProps := actor.FromFunc(func(ctx actor.Context) {
  switch ctx.Message().(type) {
  case *ping:
   log.Print("Received ping message")

   // 2018/09/15 02:01:27 [ACTOR] [DeadLetter] pid="nil" message=&{} sender="nil"
   ctx.Respond(&pong{})

   // 2018/09/15 02:01:27 [MAILBOX] [ACTOR] Recovering actor="nonhost/$1" reason="runtime error: invalid memory address or nil pointer dereference" stacktrace="github.com/AsynkronIT/protoactor-go/actor.(*PID).ref:26"
   // 2018/09/15 02:01:27 [ACTOR] [SUPERVISION] actor="nonhost/$1" directive="RestartDirective" reason="runtime error: invalid memory address or nil pointer dereference"
   ctx.Sender().Tell(&pong{})

  default:

  }
 })
 pongPid := actor.Spawn(pongProps)

 pingProps := actor.FromProducer(func() actor.Actor {
  return &pingActor{
   pongPid: pongPid,
  }
 })
 pingPid := actor.Spawn(pingProps)
 pingPid.Tell(struct{}{})
 time.Sleep(1 * time.Second) // Just to make sure system ends after actor execution
}
However, the recipient fails to refer to the sender actor in the same way it failed in the previous example. This may seem odd, but let us take a look at actor.Context's implementation. A call to Context.Tell() is proxied to Context.sendUserMessage(), where the message is stuffed into an actor.MessageEnvelope with a nil Sender field as below:
func (ctx *localContext) Tell(pid *PID, message interface{}) {
 ctx.sendUserMessage(pid, message)
}

func (ctx *localContext) sendUserMessage(pid *PID, message interface{}) {
 if ctx.outboundMiddleware != nil {
  if env, ok := message.(*MessageEnvelope); ok {
   ctx.outboundMiddleware(ctx, pid, env)
  } else {
   ctx.outboundMiddleware(ctx, pid, &MessageEnvelope{
    Header:  nil,
    Message: message,
    Sender:  nil,
   })
  }
 } else {
  pid.ref().SendUserMessage(pid, message)
 }
}
That is why a recipient cannot refer to the sender even though the messaging occurs between two actors and such contextual information seems to be available. The above code fragment suggests that passing an actor.MessageEnvelope with a pre-filled Sender field should identify the sending actor to the recipient. This actually works because all of actor.MessageEnvelope's fields are public and accessible, but it is a cumbersome job. There should be an easier way.
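
For the record, below is a minimal sketch of that manual approach, reusing the ping/pong types from the example above; the envelope fields simply mirror those shown in sendUserMessage():
// Inside the ping actor, hand-craft the envelope and pre-fill its Sender field
// so that the recipient's Context.Sender() returns this actor's PID.
func (p *pingActor) Receive(ctx actor.Context) {
 switch ctx.Message().(type) {
 case struct{}:
  ctx.Tell(p.pongPid, &actor.MessageEnvelope{
   Header:  nil,
   Message: &ping{},
   Sender:  ctx.Self(),
  })

 case *pong:
  log.Print("Received pong message")

 }
}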

Request() lets a recipient refer to the sender

A second messaging method is Request(). This lets developers set who the sender actor is, and the recipient actor can reply to the sender by calling Context.Respond() or Context.Sender().Tell(). Below is the method signature.
// Request sends a messages asynchronously to the PID. The actor may send a response back via respondTo, which is
// available to the receiving actor via Context.Sender
func (pid *PID) Request(message interface{}, respondTo *PID) {
 env := &MessageEnvelope{
  Message: message,
  Header:  nil,
  Sender:  respondTo,
 }
 pid.ref().SendUserMessage(pid, env)
}
The above signature may look more like Akka's ActorRef#tell than Tell() does, in that a developer can set a sender actor -- more precisely a sending actor.PID in this case -- as the second argument. An actor.PID and an actor.Context both have a Request() method, and they behave equivalently as the example below describes:
package main

import (
 "github.com/AsynkronIT/protoactor-go/actor"
 "log"
 "time"
)

type pong struct {
}

type ping struct {
}

type pingActor struct {
 pongPid *actor.PID
}

func (p *pingActor) Receive(ctx actor.Context) {
 switch ctx.Message().(type) {
 case struct{}:
  // Below both send a message with sender information
  ctx.Request(p.pongPid, &ping{})
  p.pongPid.Request(&ping{}, ctx.Self())

 case *pong:
  log.Print("Received pong message")

 }
}

func main() {
 pongProps := actor.FromFunc(func(ctx actor.Context) {
  switch ctx.Message().(type) {
  case *ping:
   log.Print("Received ping message")

   // Below both work
   ctx.Respond(&pong{})
   ctx.Sender().Tell(&pong{})

  default:

  }
 })
 pongPid := actor.Spawn(pongProps)

 pingProps := actor.FromProducer(func() actor.Actor {
  return &pingActor{
   pongPid: pongPid,
  }
 })
 pingPid := actor.Spawn(pingProps)
 pingPid.Tell(struct{}{})
 time.Sleep(1 * time.Second) // Just to make sure system ends after actor execution
}

This not only works for a request-response model, but also works to propagate the sending actor's identity to subsequent actor calls.
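
For example, an intermediary actor can propagate the original sender by passing Context.Sender() as the respondTo argument, so the final recipient replies directly to the original requester. Below is a minimal sketch with a hypothetical forwarding actor placed between the ping and pong actors of the example above:
type forwardActor struct {
 pongPid *actor.PID
}

func (f *forwardActor) Receive(ctx actor.Context) {
 switch msg := ctx.Message().(type) {
 case *ping:
  // Pass the original sender instead of this actor's own PID
  // so the pong actor's Context.Respond() reaches the original requester.
  f.pongPid.Request(msg, ctx.Sender())
 }
}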

RequestFuture() only has its future

The last method is RequestFuture(). This can be used as an extension of Request() where an actor.Future is returned to the requester. However, its behavior differs slightly but significantly when the recipient actor refers to Context.Sender() and treats it as a reference to the sender actor. Below is a simple example that demonstrates a regular request-response model:
package main

import (
 "github.com/AsynkronIT/protoactor-go/actor"
 "log"
 "time"
)

type pong struct {
}

type ping struct {
}

type pingActor struct {
 pongPid *actor.PID
}

func (p *pingActor) Receive(ctx actor.Context) {
 switch ctx.Message().(type) {
 case struct{}:
  // Below both work.
  //
  //future := p.pongPid.RequestFuture(&ping{}, time.Second)
  future := ctx.RequestFuture(p.pongPid, &ping{}, time.Second)
  result, err := future.Result()
  if err != nil {
   log.Print(err.Error())
   return
  }
  log.Printf("Received %#v", result)

 case *pong:
  // Never comes here.
  // When the pong actor responds to the sender,
  // the sender is not a ping actor but a future process.
  log.Print("Received pong message")

 }
}

func main() {
 pongProps := actor.FromFunc(func(ctx actor.Context) {
  switch ctx.Message().(type) {
  case *ping:
   log.Print("Received ping message")
   // Below both work in this example, but their behavior slightly differ.
   // ctx.Sender().Tell() panics and recovers if the sender is nil;
   // while ctx.Respond() checks the presence of sender and redirects the message to dead letter process
   // when sender is absent.
   //
   //ctx.Sender().Tell(&pong{})
   ctx.Respond(&pong{})

  default:

  }
 })
 pongPid := actor.Spawn(pongProps)

 pingProps := actor.FromProducer(func() actor.Actor {
  return &pingActor{
   pongPid: pongPid,
  }
 })
 pingPid := actor.Spawn(pingProps)
 pingPid.Tell(struct{}{})
 time.Sleep(1 * time.Second) // Just to make sure system ends after actor execution
}
Now the below example demonstrates how Request() and RequestFuture() behave differently when Context.Sender() or Context.Respond() is called to refer to the sender actor's actor.PID. The code structure is almost the same as the previous example, except that this one tries to send back multiple messages to the sender actor.
package main

import (
 "github.com/AsynkronIT/protoactor-go/actor"
 "log"
 "time"
)

type pong struct {
}

type ping struct {
}

type pingActor struct {
 pongPid *actor.PID
}

func (p *pingActor) Receive(ctx actor.Context) {
 switch ctx.Message().(type) {
 case struct{}:
  // Below both work.
  //
  //future := p.pongPid.RequestFuture(&ping{}, time.Second)
  future := ctx.RequestFuture(p.pongPid, &ping{}, time.Second)
  result, err := future.Result()
  if err != nil {
   log.Print(err.Error())
   return
  }
  log.Printf("Received %#v", result)

 case *pong:
  // Never comes here.
  // When the pong actor responds to the sender,
  // the sender is not a ping actor but a future process.
  log.Print("Received pong message")

 }
}

func main() {
 pongProps := actor.FromFunc(func(ctx actor.Context) {
  switch ctx.Message().(type) {
  case *ping:
   log.Print("Received ping message")
   // Below both work in this example, but their behavior slightly differ.
   // ctx.Sender().Tell() panics and recovers if the sender is nil;
   // while ctx.Respond() checks the presence of sender and redirects the message to dead letter process
   // when sender is absent.
   //
   //ctx.Sender().Tell(&pong{})
   ctx.Respond(&pong{})

   // Take a look at the id field.
   // 2018/09/23 10:58:53 &actor.PID{Address:"nonhost", Id:"future$3", p:(*actor.Process)(0xc4200ea010)}
   log.Printf("%#v", ctx.Sender())

    // Below all fail because the sender PID does not represent the sender actor
    // but the sending Future process, and the Future process ends when the first payload is returned.
   ctx.Sender().Tell(&pong{})
   ctx.Respond(&pong{})
   ctx.Sender().Tell(&pong{})
   ctx.Respond(&pong{})
   ctx.Sender().Tell(&pong{})
   ctx.Respond(&pong{})
   ctx.Sender().Tell(&pong{})
   ctx.Respond(&pong{})

  default:

  }
 })
 pongPid := actor.Spawn(pongProps)

 pingProps := actor.FromProducer(func() actor.Actor {
  return &pingActor{
   pongPid: pongPid,
  }
 })
 pingPid := actor.Spawn(pingProps)
 pingPid.Tell(struct{}{})
 time.Sleep(1 * time.Second) // Just to make sure system ends after actor execution
}
Remember, as briefly introduced in the "Premise" section, an actor.PID not only encapsulates an actor.Actor instance but also any actor.Process implementation. The concepts of "process" and its representation, PID, are quite similar to those of Erlang in this way. With that said, let us take a closer look at how the above example behaves under the hood. First, two actor processes are explicitly created by the developer: pingPid and pongPid. When pingPid sends a message to pongPid, another process is implicitly created by protoactor-go: that of actor.Future. And this actor.Future process is set as the sender PID when the communication takes place.
func (ctx *localContext) RequestFuture(pid *PID, message interface{}, timeout time.Duration) *Future {
 future := NewFuture(timeout)
 env := &MessageEnvelope{
  Header:  nil,
  Message: message,
  Sender:  future.PID(),
 }
 ctx.sendUserMessage(pid, env)

 return future
}
When the recipient actor's process, pongPid, receives the message and responds to the sender, the "sender" is not actually pingPid but the actor.Future's process. After one message is sent back, the actor.Future process ends, and therefore the subsequent calls to Context.Respond() or Context.Sender() from pongPid fail to refer to the sender. So when passing the sender actor's PID is vital for the recipient's task, use Request() or include the sender actor's actor.PID in the message so the recipient can refer to the sender actor for sure.
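
Below is a minimal sketch of the latter approach, where the sender's actor.PID travels inside the message itself so the recipient can reply any number of times regardless of the Future's lifetime. The ReplyTo field is a naming of my own, not part of protoactor-go:
// The sender embeds its own PID in the message payload.
type ping struct {
 ReplyTo *actor.PID
}

// Sender side: ctx.RequestFuture(p.pongPid, &ping{ReplyTo: ctx.Self()}, time.Second)

// Recipient side
pongProps := actor.FromFunc(func(ctx actor.Context) {
 switch msg := ctx.Message().(type) {
 case *ping:
  ctx.Respond(&pong{}) // Completes the Future held by the sender

  // Subsequent replies go straight to the sender actor's PID,
  // which stays valid after the Future process has finished.
  msg.ReplyTo.Tell(&pong{})
  msg.ReplyTo.Tell(&pong{})
 }
})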

Cluster grain's unique RPC-based messaging

Actors can communicate with cluster grains just as they communicate with remote actors. In fact, protoactor-go's cluster mechanism is implemented on top of the actor.remote implementation. However, this cluster mechanism adopts the idea of Microsoft Orleans, where the actor lifecycle and other major tasks are managed by the actor framework to ease the developer's work. This effort includes the introduction of a handy RPC-based communication protocol. Communication with cluster grains still uses Protocol Buffers for serialization and deserialization, but goes a bit further by providing a wrapper for gRPC service calls.
Using the gograin protoc plugin, code is generated for gRPC services. This code provides an actor.Actor implementation whose Receive() receives a message from another actor, deserializes it and calls a corresponding method depending on the incoming message type. Developers only have to implement a method for each gRPC service; the return value of the implemented method is returned to the sender actor. One thing to notice is that this remote gRPC call is implemented with RequestFuture() under the hood, so when the method refers to the sender via Context.Sender(), the returned actor.PID is not a representation of the sender actor but of an actor.Future. The example contains a relatively large amount of code, so visit my example repository for details. The directory layout is as below:

  • messages ... This includes messages shared by sender and recipient actors. protos_protoactor.go contains the code generated by the gograin protoc plugin, which is used for the gRPC-based communication.
  • cluster-ping-grpc and cluster-pong-grpc ... These provide implementations for the ping actor and pong actor, which communicate over the gRPC-based protocol.
  • cluster-ping-future, cluster-ping-request, cluster-ping-tell and cluster-pong ... These are examples that communicate with the actor.remote implementation without the gRPC service.

Conclusion

While there are several kinds of actors, those actors have unified ways to communicate with other actors no matter where they are located. However, because an actor.PID represents not only an actor process but any actor.Process implementation, extra work may be required for a recipient actor to refer to the sender actor, since the actor.PID returned by Context.Sender() is not necessarily a representation of the sender actor. To ensure that the recipient actor can refer to the sender actor, include the sender actor's PID in the message or use Request(). Visit github.com/oklahomer/protoactor-go-sender-example for more comprehensive examples.

Jul 22, 2018

[Golang] Protoactor-go 101: Introduction to golang's actor model implementation

A year has passed since I officially launched go-sarah. While this bot framework has been a great help with my ChatOps, I found myself becoming more and more interested in designing a chat system as a whole; not just a text-based communication tool or its variations, but a customizable event aggregation system that provides and consumes any conceivable event, varying from virtual to real-life. In the course of its server-side design, Golang's actor model implementation, protoactor-go, seemed like a good option. However, protoactor-go is still in its beta phase and has little documentation at this point in time. This article describes what I have learned about this product. The basics of the actor model are not covered here, but for those who are interested, my previous post "Yet another Akka introduction for dummies" might be of help.
Unless otherwise noted, this introduction is based on the latest version as of 2018-07-21.

Terms, Concepts, and Common Types

  • Message ... With the nature of the actor model, a message plays an important part in letting actors interact with each other. Messages internally fall into two categories:
    • User message ... Messages defined by developers for actor interaction.
    • System message ... Messages defined by protoactor-go for internal use that mainly handles the actor life cycle.
  • PID ... actor.PID is a container that combines a unique identifier, an address and a reference to actor.Process. Since this provides interfaces for others to interact with the underlying actor, it can be seen as an actor reference for those familiar with Akka, or simply a Pid for those familiar with Erlang. However, it is very important to remember that an actor process is not the only entity that a PID encapsulates.
  • Process ... actor.Process defines a common interface that all interacting "processes" must implement. In this project, the concepts of process and PID are quite similar to those of Erlang. Understanding that a PID is not necessarily a representation of an actor process is vital when referring to the message sender in an actor's context. This distinction and its importance are described in the follow-up article, [Golang] Protoactor-go 101: How actors communicate with each other. Its implementation varies depending on each role:
    • Router ... router.process receives a message and broadcasts it to all subordinating actors, "routees." 
    • Local process ... actor.localProcess has a reference to a mailbox. On message reception, this enqueues the message to its mailbox so the actor can receive it for further processing.
    • Remote process ... In contrast to a local process, this represents an actor that exists in a remote environment. On message reception, this serializes the message and sends it to the destination host.
    • Guardian process ... When a developer passes a "guardian" supervisor strategy to the actor constructor, a parent actor is created with this supervisor strategy along with the actor itself. This parent "guardian" actor takes care of the child actor's uncontrollable state. This is effective when the constructed actor is a "root actor" -- an actor without a parent actor -- but customized supervision is still required. When multiple actor constructions contain the same settings for guardian supervision, only one guardian actor is created and it becomes the parent of all actors with the same settings.
    • Future process ... actor.futureProcess provides some dedicated features for Future-related tasks.
    • Dead letter process ... actor.deadLetterProcess provides features to handle "dead letters." A dead letter is a message that failed to reach its target because, for example, the target actor did not exist or was already stopped. This dead letter process publishes an actor.DeadLetterEvent to the event stream, so a developer can detect the dead letter by subscribing to the event via eventstream.Subscribe().
  • Mailbox ... This works as a queue that receives incoming messages, stores them temporarily and passes them to its coupled actor when the actor is ready for message execution. The actor receives messages one at a time, executes its task and alters its state if necessary. A mailbox implements the mailbox.Inbound interface.
    • Default mailbox ... mailbox.defaultMailbox not only receives incoming messages as a mailbox.Inbound implementation, but also coordinates the actor invocation schedule with its mailbox.Dispatcher implementation entity. This mailbox also contains a mailbox.MessageInvoker implementation as its entity, whose methods are called by mailbox.Dispatcher for actor invocation. actor.localContext implements mailbox.MessageInvoker.
  • Context ... This is equivalent to Akka's ActorContext. This contains contextual information and contextual methods for the underlying actor, such as:
    • References to watching actors and methods to watch/unwatch other actors
    • A reference to the actor that sent the message currently being processed and a method to access it
    • Methods to pass a message to another actor
    • etc...
  • Middleware ... Zero or more pre-registered procedures can be executed around actor invocation, which enables an AOP-like approach to modify behavior.
    • Inbound middleware ... actor.InboundMiddleware is a middleware that is executed on message reception. A developer may register one or more middleware via Props.WithMiddleware().
    • Outbound middleware ... actor.OutboundMiddleware is a middleware that is executed on message sending. A developer may register one or more middleware via Props.WithOutboundMiddleware().
  • Router ... The router sub-package provides a series of mechanisms that route a given message to one or more of its routees.
    • Broadcast router ... Broadcasts a given message to all of its routee actors.
    • Round robin router ... Sends a given message to one of its routee actors chosen in a round-robin manner.
    • Random router ... Sends a given message to a randomly chosen routee actor.
  • Event Stream ... eventstream.EventStream is a mechanism to publish and subscribe to events, where an event is an empty interface, interface{}, so a developer can technically publish and subscribe to any desired event. By default, an instance of eventstream.EventStream is cached at the package level and is used to publish and subscribe to events such as dead letter messages. A short sketch of dead letter detection follows this list.
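
Below is a minimal sketch of dead letter detection via the event stream. It assumes the package-level eventstream.Subscribe()/Unsubscribe() and the actor.DeadLetterEvent of the version covered here:
package main

import (
 "log"
 "time"

 "github.com/AsynkronIT/protoactor-go/actor"
 "github.com/AsynkronIT/protoactor-go/eventstream"
)

func main() {
 // Subscribe to every published event and pick dead letters out of the stream.
 sub := eventstream.Subscribe(func(event interface{}) {
  if deadLetter, ok := event.(*actor.DeadLetterEvent); ok {
   log.Printf("dead letter to %v: %#v", deadLetter.PID, deadLetter.Message)
  }
 })
 defer eventstream.Unsubscribe(sub)

 // A message to an already stopped actor ends up as a dead letter.
 pid := actor.Spawn(actor.FromFunc(func(ctx actor.Context) {}))
 pid.Stop()
 time.Sleep(100 * time.Millisecond)
 pid.Tell("to nobody")

 time.Sleep(1 * time.Second) // Give the subscriber time to receive the event
}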

Actor Construction

To construct a new actor and acquire a reference to it, a developer can feed an actor.Props to actor.Spawn or actor.SpawnNamed. The struct called actor.Props is a set of configurations for actor construction. actor.Props can be initialized with the helper functions listed below:
  • actor.FromProducer() ... Pass a function that returns an actor.Actor implementation. This returns a pointer to actor.Props, which contains a set of configurations for actor construction.
  • actor.FromFunc() ... Pass a function that satisfies the actor.ActorFunc type, which receives exactly the same argument as Actor.Receive(). This is a handy wrapper of actor.FromProducer().
  • actor.FromSpawnFunc() ... Pass a function that satisfies the actor.SpawnFunc type. On actor construction, this function is called with a series of arguments containing an id, actor.Props and the parent PID to construct a new actor. When this function is not set, actor.DefaultSpawner is used.
  • actor.FromInstance() ... Deprecated.
Additional configuration can be added via its setter methods with the "With" prefix. See the example code; a minimal local sketch is also shown below.
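
As a minimal local sketch -- assuming actor.FromFunc and the Props.WithMiddleware setter mentioned in the Middleware section above -- construction typically looks like this:
// Build Props from a plain function and decorate it with an inbound middleware.
props := actor.FromFunc(func(ctx actor.Context) {
 switch ctx.Message().(type) {
 case *actor.Started:
  log.Print("actor started")
 }
}).WithMiddleware(func(next actor.ActorFunc) actor.ActorFunc {
 // Log every incoming message before handing it over to the actor.
 return func(ctx actor.Context) {
  log.Printf("received %#v", ctx.Message())
  next(ctx)
 }
})

pid := actor.Spawn(props)
pid.Tell("hello")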

Spawner -- Construct actor and initiate its life cycle

A developer feeds a prepared actor.Props to actor.Spawn() or actor.SpawnNamed(), depending on the requirement, to initialize an actor, its context, and its mailbox. In either construction flow, Props.spawn() is called. To alter this spawning behavior, an alternative function can be set with actor.FromSpawnFunc() or Props.WithSpawnFunc() to override the default behavior. When none is set, actor.DefaultSpawner is used. Its behavior is as below:
  • The default spawner creates an instance of actor.localProcess, which is an actor.Process implementation.
  • Add the instance to actor.ProcessRegistry.
    • The registry returns an error if the given id is already registered.
  • Create a new actor.localContext, which is an actor.Context implementation. This stores all contextual data.
  • A mailbox is created for the context. To modify the mailbox's behavior, use Props.WithDispatcher() and Props.WithMailbox().
  • The created mailbox is stored in the actor.localProcess instance.
  • A pointer to the process is set to actor.PID's field.
  • actor.localContext also has a reference to the actor.PID as "self."
  • Start the mailbox.
  • Enqueue a startedMessage -- an instance of actor.Started -- into the mailbox as a system message.
When construction is done and the actor life cycle is successfully started, the actor.PID for the new actor is returned.

Child Actor construction

With the actor construction procedure introduced above, a developer can create any "root actor," an actor with no parent. To achieve a hierarchical actor system, use actor.Context's Spawn() or SpawnNamed() method. Those methods work similarly to actor.Spawn() and actor.SpawnNamed(), but the single and biggest difference is that they create a parent-child relationship between the spawning actor and the newly created actor. They work as below, and a minimal sketch follows the list:

  • Check if Props.guardianStrategy is set
    • If set, it panics. Because the calling actor is going to be the parent and is obligated to be a supervisor, there is no need to set one. This strategy exists to create a parent actor for customized supervision, as introduced in the first section.
  • Call Props.spawn()
    • The ID has the form of {parent-id}/{child-id}
    • The spawning actor's own PID is set as the parent of the new actor
  • Add the created actor's actor.PID to its children
  • Start watching the created actor.PID to subscribe to its life cycle events
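
Below is a minimal sketch of such child construction with Context.Spawn(); the behavior of the actors themselves is left empty for brevity:
childProps := actor.FromFunc(func(ctx actor.Context) {
 // Child actor's behavior goes here.
})

parentProps := actor.FromFunc(func(ctx actor.Context) {
 switch ctx.Message().(type) {
 case *actor.Started:
  // The child ID takes the form of {parent-id}/{child-id},
  // and this actor automatically becomes its parent and supervisor.
  child := ctx.Spawn(childProps)
  child.Tell("hello")
 }
})

actor.Spawn(parentProps)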

Supervisor Strategy

It is a parent actor's responsibility to take care of its child actors' exceptional states. When a child actor can no longer control its state, based on the "let-it-crash" philosophy, the child actor notifies its parent of the situation by calling panic(). The parent actor receives such a notification with recover() and decides how to treat the failing actor. This decision is made by a customizable actor.SupervisorStrategy. When no strategy is explicitly set by a developer, actor.defaultSupervisorStrategy is set on actor construction.
The supervision flow is as follows, and a sketch of configuring a custom strategy comes after the list:
  • A mailbox passes a message to Actor.Receive() via the target actor context's localContext.InvokeUserMessage().
  • In Actor.Receive(), the actor calls panic().
  • The caller mailbox catches such an uncontrollable state with recover().
  • The mailbox calls localContext.EscalateFailure(), where localContext is that of the failing actor.
    • In localContext.EscalateFailure(), this tells itself to suspend any incoming message until recovery is done.
    • Create an actor.Failure instance that holds the failing reason and other statistical information, where "reason" is the argument passed to panic().
    • Judge whether the failing actor has any parent
      • If none is found, the failing actor is the "root actor," so the actor.Failure is passed to actor.handleRootFailure().
      • If found, this passes the actor.Failure to the parent's PID.sendSystemMessage() to notify it of the failing state
        • The message is enqueued to the parent actor's mailbox
        • The parent's mailbox calls its localContext.InvokeSystemMessage().
        • The actor.Failure is passed to localContext.handleFailure().
        • If the actor.Actor entity itself implements actor.SupervisorStrategy, its HandleFailure() is called.
        • If not, its supervisor entity's HandleFailure() is called.
        • In HandleFailure(), decide the recovery policy and call localContext.(ResumeChildren|RestartChildren|StopChildren|EscalateFailure).
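
Below is a minimal sketch of setting a customized strategy, assuming actor.NewOneForOneStrategy and Props.WithSupervisor of the version covered here:
// Restart the failing child up to 10 times within one minute.
decider := func(reason interface{}) actor.Directive {
 log.Printf("child failed: %v", reason)
 return actor.RestartDirective
}
strategy := actor.NewOneForOneStrategy(10, 1*time.Minute, decider)

props := actor.FromFunc(func(ctx actor.Context) {
 // Children spawned by this actor are supervised with the strategy above.
}).WithSupervisor(strategy)

actor.Spawn(props)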

Upcoming Interface Change

A huge interface change is expected according to the issue "Design / API Changes upcoming."

Further Readings

See below articles for more information:

Aug 19, 2017

[Golang] Introducing go-sarah: simple yet highly customizable bot framework

As mentioned in the latest blog post, I created a new bot framework: go-sarah. This article introduces its notable features and overall architecture along with some sample code. Upcoming articles will focus on the details of each specific aspect.

Notable features

User's Conversational Context

In this project, a user's conversational context is referred to as "user context"; it stores the user's previous states and defines what function should be executed on the following input. While a typical bot implementation is somewhat "stateless" and hence user-and-bot interaction does not consider previous states, Sarah natively supports the idea of this conversational context. Its aim is to let users provide information as they send messages and gradually build up a complex set of arguments to be passed.

For example, instead of obligating the user to input a long, confusing text such as ".todo Fix Sarah's issue #123 by 2017-04-15 12:00:00" at once, let the user build up the arguments in a conversational manner as in the image below:


Live Configuration Update

When the configuration file for a command is updated, Sarah automatically detects the event and re-builds the command or scheduled task in a thread-safe manner so the next execution of that command/task appropriately reflects the new configuration values.

See the usage of CommandPropsBuilder and ScheduledTaskPropsBuilder for details.

Concurrent Execution by Default

Developers may implement their own bot by a) implementing the sarah.Bot interface or b) implementing sarah.Adapter and passing it to sarah.NewBot() to get an instance of the default Bot implementation.

Either way, a component called sarah.Runner takes care of Command execution against the given user input. This sarah.Runner dispatches tasks to its internal workers, which means developers do not have to make extra effort to handle a flood of incoming messages.

Alerting Mechanism

When a bot confronts a critical situation and can not continue its operation or recover, Sarah's alerting mechanism sends an alert to the administrator. Zero or more sarah.Alerter implementations can be registered to send alerts to the desired destinations.

Higher Customizability

For higher customizability, Sarah is composed of fine-grained components that each have one domain to serve: sarah.Alerter is responsible for sending the bot's critical state to the administrator, workers.Worker is responsible for executing a given job in a panic-proof manner, and so on. Each component comes with an interface and a default implementation, so developers may change Sarah's behavior by implementing the corresponding component's interface and replacing the default implementation.

Overall Architecture

Below illustrates some major components.


Runner

Runner is the core of Sarah; it manages other components' lifecycles, handles concurrent job execution with internal workers, watches configuration file changes, re-configures commands/tasks on file changes, executes scheduled tasks, and most importantly makes Sarah come alive.

Runner may take multiple Bot implementations to run multiple Bots in a single process, so resources such as workers and memory space can be shared.

Bot / Adapter

The Bot interface is responsible for the actual interaction with chat services such as Slack, LINE, gitter, etc.

Bot receives messages from chat services, sees if the sending user is in the middle of a user context, searches for the corresponding Command, executes the Command, and sends a response back to the chat service.

An important thing to be aware of is that, once Bot receives a message from the chat service, it sends the input to Runner via a designated channel. Runner then dispatches a job to an internal worker, which calls Bot.Respond and sends a response via Bot.SendMessage. In other words, after sending the input via the channel, things are done concurrently without any additional work. Change the worker configuration to throttle the number of concurrent executions -- this may also impact the number of concurrent HTTP requests against the chat service provider.

DefaultBot

Technically, Bot is just an interface. So, if desired, developers can create their own Bot implementations to interact with their preferred chat services. However, most Bots have similar functionalities, and it is truly cumbersome to implement one for every chat service of choice.

Therefore defaultBot is already predefined. This can be initialized via sarah.NewBot.

Adapter

sarah.NewBot takes multiple arguments: an Adapter implementation and an arbitrary number of sarah.DefaultBotOptions as functional options. This Adapter becomes a bridge between defaultBot and the chat service. DefaultBot takes care of finding the corresponding Command for a given input, handling stored user contexts, and other miscellaneous tasks; Adapter takes care of connecting/requesting to and messaging with the chat service.

package main

import (
        "github.com/oklahomer/go-sarah"
        "github.com/oklahomer/go-sarah/slack"
        "gopkg.in/yaml.v2"
        "io/ioutil"
)

func main() {
        // Setup slack bot.
        // Any Bot implementation can be fed to Runner.RegisterBot(), but for convenience slack and gitter adapters are predefined.
        // sarah.NewBot takes adapter and returns defaultBot instance, which satisfies Bot interface.
        configBuf, _ := ioutil.ReadFile("/path/to/adapter/config.yaml")
        slackConfig := slack.NewConfig() // config struct is returned with default settings.
        yaml.Unmarshal(configBuf, slackConfig)
        slackAdapter, _ := slack.NewAdapter(slackConfig)
        sarah.NewBot(slackAdapter)
}

Command

The Command interface represents a plugin that receives user input and returns a response. Command.Match is called against the user input in Bot.Respond. If it returns true, then the command is considered to "correspond to the user input," and hence its Execute method is called.

Any struct that satisfies the Command interface can be fed to Bot.AppendCommand as a command. CommandPropsBuilder is provided to easily implement the Command interface on the fly:

Simple Command

There are several ways to set up Commands:
  • Define a struct that implements the Command interface. Pass its instance to Bot.AppendCommand.
  • Use CommandPropsBuilder to construct a non-contradicting set of arguments, and pass this to Runner. Runner internally builds a command, and re-builds it when a configuration struct is present and the corresponding configuration file is updated.
Below are several ways to set up CommandProps with CommandPropsBuilder for different customizations.
// In separate plugin file such as echo/command.go
// Export some pre-build command props
package echo

import (
 "github.com/oklahomer/go-sarah"
 "github.com/oklahomer/go-sarah/slack"
 "golang.org/x/net/context"
 "regexp"
)

// CommandProps is a set of configuration options that can be and should be treated as one in logical perspective.
// This can be fed to Runner to build Command on the fly.
// CommandProps is re-used when command is re-built due to configuration file update.
var matchPattern = regexp.MustCompile(`^\.echo`)
var SlackProps = sarah.NewCommandPropsBuilder().
        BotType(slack.SLACK).
        Identifier("echo").
        MatchPattern(matchPattern).
        Func(func(_ context.Context, input sarah.Input) (*sarah.CommandResponse, error) {
                // ".echo foo" to "foo"
                return slack.NewStringResponse(sarah.StripMessage(matchPattern, input.Message())), nil
        }).
        InputExample(".echo knock knock").
        MustBuild()

// To have complex checking logic, MatchFunc can be used instead of MatchPattern.
var CustomizedProps = sarah.NewCommandPropsBuilder().
        MatchFunc(func(input sarah.Input) bool {
                // Check against input.Message(), input.SenderKey(), and input.SentAt()
                // to see if particular user is sending particular message in particular time range
                return false
        }).
        // Call some other setter methods to do the rest.
        MustBuild()

// Configurable is a helper function that returns CommandProps built with given CommandConfig.
// CommandConfig can be first configured manually or from YAML/JSON file, and then fed to this function.
// Returned CommandProps can be fed to Runner and when configuration file is updated,
// Runner detects the change and re-build the Command with updated configuration struct.
func Configurable(config sarah.CommandConfig) *sarah.CommandProps {
        return sarah.NewCommandPropsBuilder().
                ConfigurableFunc(config, func(_ context.Context, input sarah.Input, conf sarah.CommandConfig) (*sarah.CommandResponse, error) {
                        return nil, nil
                }).
                // Call some other setter methods to do the rest.
                MustBuild()
}

Reconfigurable Command

With CommandPropsBuilder.ConfigurableFunc, a desired configuration struct may be added. This configuration struct is passed on command execution as the 3rd argument. Runner watches for changes in the configuration files' directory, and if a configuration file is updated, the corresponding command is built again.

To let Runner supervise file change events, set sarah.Config.PluginConfigRoot. The internal directory watcher supervises sarah.Config.PluginConfigRoot + "/" + BotType + "/" as the Bot's configuration directory. When any file under that directory is updated, Runner searches for the corresponding CommandProps based on the assumption that the file name is equivalent to CommandProps.identifier + ".(yaml|yml|json)". If a corresponding CommandProps exists, Runner rebuilds the Command with the latest configuration values and replaces the old one.
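
A minimal sketch, assuming sarah.NewConfig() returns a configuration struct pre-filled with default values:
config := sarah.NewConfig()
// With BotType "slack," files such as /path/to/config/slack/echo.yaml are watched,
// and the "echo" command is re-built whenever echo.yaml is updated.
config.PluginConfigRoot = "/path/to/config"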

Scheduled Task

While commands are a set of functions that respond to user input, scheduled tasks are those that run on a schedule, e.g. say "Good morning, sir!" at 7:00 a.m. every day, search a database and send "today's chores list" to each specific room, and so on.

A ScheduledTask implementation can be fed to Runner.RegisterScheduledTask. When Runner.Run is called, the clock starts to tick and the scheduled tasks become active; tasks are executed as scheduled, and results are sent to the chat service via Bot.SendMessage.

Simple Scheduled Task

Technically, any struct that satisfies the ScheduledTask interface can be treated as a scheduled task, but a builder is provided to construct a ScheduledTask on the fly.
package foo

import (
 "github.com/oklahomer/go-sarah"
 "github.com/oklahomer/go-sarah/slack"
 "github.com/oklahomer/golack/slackobject"
 "golang.org/x/net/context"
)

// TaskProps is a set of configuration options that can be and should be treated as one in logical perspective.
// This can be fed to Runner to build ScheduledTask on the fly.
// ScheduledTaskProps is re-used when the task is re-built due to a configuration file update.
var TaskProps = sarah.NewScheduledTaskPropsBuilder().
        BotType(slack.SLACK).
        Identifier("greeting").
        Func(func(_ context.Context) ([]*sarah.ScheduledTaskResult, error) {
                return []*sarah.ScheduledTaskResult{
                        {
                                Content:     "Howdy!!",
                                Destination: slackobject.ChannelID("XXX"),
                        },
                }, nil
        }).
        Schedule("@everyday").
        MustBuild()

Reconfigurable Scheduled Task

With ScheduledTaskPropsBuilder.ConfigurableFunc, a desired configuration struct may be added. This configuration struct is passed on task execution as the 2nd argument. Runner watches for changes in the configuration files' directory, and if a configuration file is updated, the corresponding task is built and scheduled again.

To let Runner supervise file change events, set sarah.Config.PluginConfigRoot. The internal directory watcher supervises sarah.Config.PluginConfigRoot + "/" + BotType + "/" as the Bot's configuration directory. When any file under that directory is updated, Runner searches for the corresponding ScheduledTaskProps based on the assumption that the file name is equivalent to ScheduledTaskProps.identifier + ".(yaml|yml|json)". If a corresponding ScheduledTaskProps exists, Runner rebuilds the ScheduledTask with the latest configuration values and replaces the old one.

UserContextStorage

As described in "Notable Features," Sarah stores user's current state when Command's response expects user to send series of messages with extra supplemental information. UserContextStorage is where the state is stored. Developers may store state into desired storage by implementing UserContextStorage interface. Two implementations are currently provided by author:

Store in Process Memory Space

defaultUserContextStorage is a UserContextStorage implementation that stores ContextualFunc, a function to be executed on the next user input, in the same memory space the process is running in. Under the hood, this storage is simply a map where the key is a user identifier and the value is a ContextualFunc. This ContextualFunc can be any function that satisfies the ContextualFunc type, including an instance method or an anonymous function. However, it is recommended to use an anonymous function since variables declared in the enclosing method call can be casually referenced in its scope.

Store in External KVS

go-sarah-rediscontext stores a combination of a function identifier and serializable arguments in Redis. This is extremely effective when multiple Bot processes run and the user context must be shared among them.
e.g. A chat platform such as LINE sends HTTP requests to the Bot on every user input, where the Bot may consist of multiple servers/processes to balance those requests.

Alerter

When a registered Bot encounters a critical situation and requires the administrator's direct attention, Runner sends an alert message as configured with Alerter. A LINE alerter is provided by default, but anything that satisfies the Alerter interface can be registered as an Alerter. A developer may add multiple Alerter implementations via Runner.RegisterAlerter, and it is recommended to register several of them to survive a malfunctioning alerting channel and make sure the administrator notices the critical state.

Bot/Adapter may send a BotNonContinuableError via the error channel to notify Runner of a critical state, e.g. when the Adapter can not connect to the chat service provider after a reasonable number of retries.

Getting Started

That is pretty much everything developers should know before getting started. To see working example code, visit https://github.com/oklahomer/go-sarah/tree/master/examples. For more details, make sure to follow the upcoming blog posts.

P.S. Stars on the go-sarah project are always welcome :)

Aug 6, 2017

Parenting software engineer

It was a cold day for spring when my wife gave birth to a beautiful baby girl, Sarah. Despite the snowy weather, Sarah was sleeping peacefully in her mother's arms. Overwhelmed with a grateful feeling after watching the faces of the newborn and her mother, I realized a passion to give birth to something was evolving in me. Giving birth is the most beautiful and creative act, allowed only for females, that weaves a rich tapestry of life, so I, as a male software engineer, wanted the closest experience to this I could have. That was the moment I decided to start a new project.


I named this project Sarah. This project would not only be a good memento of my daughter's birth, but also a good memory of our growth. Once a software engineer stops growing as one, he can easily be left behind in this rapidly growing industry. This fact frightened me all the time. I needed to grow as much as my daughter did. However, many parents complained that having kids gave them less time to work on what they wanted to, and that there was nothing they could do about it. This was a reasonable complaint I was not going to agree with. Having a daughter must be something that enriches my life, not something that burdens me. If there is someone to blame, that should be me, not my daughter. Working on a new project that focuses on a new area of interest should help me grow as a software engineer.

For this project, I chose a customizable chatbot framework as the theme. It was 2015 and creating chatbots was becoming a new trend. From a technical perspective, creating a chatbot framework involves skills such as the following:
  1. having a better design that clearly separates the abstraction from the implementation layer
  2. having a better understanding of multiple communication protocols depending on what chat service to adapt to.
They captured me as promising challenges that would bring me to the next level.

I started implementing Sarah with Python 3.5. At that time, the official announcement of the PEP 484 release was around the corner and PyCharm was working on adopting this type hinting feature. While learning Python, I found a package named abc that could be used to define abstract base classes. I thought a combination of type hinting and abc could provide a well-structured architecture. Decorators were also a good solution to minimize a plugin function's specification by wrapping its core logic with the actual messaging logic. However, it became obvious that I took type hinting too seriously. Instead of passing around an arbitrary dictionary as a function argument, I preferred to define a designated class to represent a particular object and pass its instance. I even implemented a base class called ValueObject to provide immutable objects. Passing those objects among public interfaces could be a good idea in terms of unambiguity, but I did the same for private methods. At this point Python's flexibility was lost and my code became an inferior Java.

A few months later I redesigned this project and started implementing it in Golang. I found learning Golang to be a joyful experience. The previous Python codebase not only gave me a better understanding of the whole picture, it also presented some hidden requirements that I had missed last time. To fulfill those requirements, I added another layer called Runner at the bottom. Adapter focuses on connecting to the designated chat service; Runner focuses on coordinating and supervising other components. Thanks to this newly added component, the other components' implementations became simpler and more focused. As described in its repository, Sarah is now composed of fine-grained components and interfaces, which make it easier to replace the pre-defined default behavior with a customized implementation.


As of July 4th, 2017, Sarah is no longer pre-alpha and is now listed on awesome-go. While I am proud of what I have achieved, I must admit that this is not the end of our journey. All this time, working on Sarah was not just coding. As a matter of fact, coding in my private time was the last thing I could do as a parent. That frustrated me from time to time. But I also knew we were going to have less and less time to spend together as my daughter grew up; she would make friends in school, spend time with them, get a boyfriend, go to college, and eventually leave home. This project taught me an important lesson: our time is always limited and we need to make a continuing effort to spend it wisely. I will continue to work on Sarah, but I am sure the actual Sarah, my daughter, always has the higher priority. I am her father. I always am.

[EDIT] FYI, this project's design philosophy, detailed specs, and the knowledge I have gained will be introduced in following blog posts. Until then, its GitHub repository should help.

Dec 30, 2016

[IntelliJ] Switch focus back to editor while keep the embedded terminal open

TL;DR
IntelliJ IDEA 13's default shortcut, ⌥F12, closes the embedded terminal when switching focus to the editor window. Hitting ⌘2 twice works to switch focus to the editor while keeping the terminal open.

Software engineers often need to open both an editor and a terminal at the same time so they can code in the editor and tail logs, display git-grep's result, or show whatever they want beside their editor. It is pretty handy because you can focus on the coding task but still keep an eye on server logs, or open a particular log file in the terminal on the right side of your monitor and apply changes to code on the left side. It is even handier if both can be displayed within IntelliJ IDEA, since pre-defined or user-defined shortcut keys allow engineers to switch from and to the terminal when needed, while both editor and terminal are displayed and aligned in a more organized manner than opening both Terminal.app and IntelliJ IDEA and toggling between them. It is an Integrated Development Environment, after all.

The first thing one may notice is the pre-defined shortcut to toggle between terminal and editor: ⌥F12. When focus is on the editor window, this opens the embedded terminal and switches focus to it; when focus is on the embedded terminal, it switches focus to the editor window. This, however, has one problem: the embedded terminal closes when focus is switched to the editor. There actually are some modes that define the behaviour of the terminal window, and programmers can choose one or a combination of those modes depending on their preference, but the terminal window still closes with this shortcut. Hence the term, "toggle."

So how can we switch focus back to the editor and still keep the embedded terminal open? If there is no pre-defined shortcut, what is the best workaround? There were some workarounds introduced on the web, including defining macros/shortcuts, and the simplest yet least disruptive way was found on stackoverflow.com. On this question, Andrey introduces the idea of switching the focus to some other tool window with pre-defined shortcut keys; dev shows a comprehensive answer. Andrey's approach seems simple yet effective. Since the ⌥F12 shortcut closes the terminal window on "toggle," just use another shortcut to "switch" focus to a different window such as "Favorites" with ⌘2. At this point, the state is just the same as if you hit ⌘2 from the editor and switched focus to the Favorites window. Then hitting the esc key or another ⌘2 switches focus back to the editor. The only requirement is to check "Docked mode" and "Pinned mode" for the terminal window.

Note that the embedded terminal is a bit different from other tool windows since it is a terminal and consumes most key inputs, including esc and other ⌘-related inputs. As long as there is no pre-defined shortcut to just switch -- not toggle -- focus from and to the terminal, the workaround introduced above is required. There is a request on youtrack that asks for a shortcut doing exactly what is discussed in this article, so until that is implemented or declined, the simple workaround without a macro assignment should be enough.

Jan 21, 2016

Yet another Akka introduction for dummies

It has been 5+ years since the initial launch of the Akka toolkit. You can find many articles that cover what it is all about and why we use it, so this article does not discuss those areas. Instead I am going to introduce what I wanted to know and what I should have known when getting started with Akka actors.


Akka has well-maintained, comprehensive documentation for both Java and Scala implementations. The only flaw I can think of is that it is so huge that you can easily get lost in it. Or you have little time to work on it, so you just skip reading. Trust me, I was one of those. Then you google and find articles that cover areas of your interest. It is nice that many developers are working on the same OSS and sharing their knowledge. The difficult part is that, as with any other product, it takes time to capture the whole picture, and those how-to fragments you find can not fully support you unless you have the whole picture.

So what I am going to do here is summarize the basics, explain each of them with some references to the official documentation, and then share some practices that I have learned the hard way. Unless otherwise stated, I used the Java implementation of version 2.4.0.

Summary

  • Akka actor is only created by other actor
    • Hence the parent-child relationship
    • You can not initialize it outside of actor system
      • Actor instance is hidden under ActorRef so methods can not be called directly
      • Test may require some work-around
    • Parent supervises its children
  • When an actor throws an exception, the actor may be restarted
    • Supervisor (the parent actor) defines the recovery policy (Supervisor Strategy)
      • One-for-one v.s. all-for-one strategy
      • Options: resume, stop, restart, or escalate
      • Maximum retry count and timeout settings are customizable
    • Under the ActorRef, the old actor is replaced with the new one
  • To implement stateful actor, use UntypedPersistentActor
    • Otherwise, state is lost on actor restart
    • Variety of storage plugins are provided
  • Keep each actor simple
    • Do not let one actor do too much
    • Consider adding a supervisor layer to manage different kinds of actors with different strategy

Terms and concepts

Parental Supervision

The first rule you MUST remember about the actor life cycle is that an actor can only be created by another actor; the created actor is called a "child" and is "supervised" by the creating actor, its "parent." Then who creates the top-level actor? The document says the "top-level actor is provided by the library," and some say the top-level actor is also supervised by an imaginary actor. This family tree is described in a file-path-like hierarchy resembling a file system.



The root guardian is the one I described as the "top-level" actor. It creates and supervises two special actors as its children: the user guardian and the system guardian. Since this tree is described in a file-path-like hierarchy, the root guardian has the path "/" while the user guardian and system guardian have "/user" and "/system" accordingly. User-defined actors all belong to the user guardian, so your actors are accessible with a path like "/user/dad/kid/grand_kid."

As described above, all actors belong to their parents. In other words, you cannot initialize your actor outside of the actor system, which makes your tests a bit troublesome. If you try to create your actor directly, you will most likely get an error saying "You cannot create an instance of [XXX] explicitly using the constructor (new)." Without a supervising parent, a child actor cannot exist. So spy(new MyActor()) with Mockito will not work as you expect. For a detailed example, see the code fragments below.

Here is one more thing to know about testing. Usually your actor is hidden under an ActorRef instance and you cannot call the actor's methods from outside, which makes unit testing difficult. In that case you can use TestActorRef to get the underlying actor with TestActorRef#underlyingActor.
// Create a TestActorRef to access the underlying actor instance in tests.
Props props = Props.create(MyActor.class, () -> new MyActor());
TestActorRef<MyActor> testActorRef = TestActorRef.create(actorSystem, props, "my_actor");
// This is the actual actor. You can call its methods directly.
MyActor myActor = testActorRef.underlyingActor();

// If you must do spy(new MyActor()) or equivalent, set up the spy inside the Props factory.
Props spiedProps = Props.create(MyActor.class, () -> {
    MyActor spiedActor = spy(new MyActor());

    // BEWARE: preStart() is called on actor creation,
    // so doing spy(testActorRef.underlyingActor()) after TestActorRef#create()
    // is too late to mock preStart().
    doThrow(new Exception()).when(spiedActor).preStart();

    return spiedActor;
});

Supervisor Strategy

In the previous section we covered how actors are created and who is responsible for supervision. This section introduces how you can specify the supervising behaviour. Akka employs a "let-it-crash" philosophy: an actor throws an exception when it can no longer proceed with its task, and its supervisor takes responsibility for recovery. When the parent actor cannot handle the recovery itself, it may escalate the task to its own parent. So your actors can stay small and concentrate on their tasks.

Defining Strategy

By default Akka provides two different strategies: one-for-one and all-for-one. With the one-for-one strategy, the failing actor is the only subject of recovery; the all-for-one strategy takes all children, including the failing one, as subjects of recovery. If no strategy is set, one-for-one is used.

Defining a strategy is straightforward, and the code fragment below describes pretty much everything.
public class MyActor extends UntypedActor {
    private static SupervisorStrategy strategy = new OneForOneStrategy(10, Duration.create("1 minute"), t -> {
        // http://doc.akka.io/docs/akka/snapshot/java/fault-tolerance.html#Default_Supervisor_Strategy
        if (t instanceof ActorInitializationException) {
            return stop();
        } else if (t instanceof ActorKilledException) {
            return stop();
        } else if (t instanceof Exception) {
            return restart();
        }

        return escalate();
    });

    @Override
    public SupervisorStrategy supervisorStrategy() {
        return strategy;
    }

    @Override
    public void onReceive(Object o) throws Exception {
    }
}
The above code defines the strategy as follows:
  • The failing actor is the only subject of recovery. (One-for-one strategy)
  • Retries 10 times within a time window of 1 minute. (The Duration instance)
  • The failing actor stops when ActorInitializationException or ActorKilledException is thrown.
  • The failing actor restarts when any other Exception is thrown.
  • The supervisor escalates the failure when any other Throwable is thrown.
This setting is actually the default strategy that is applied when you do not specify any. There is one thing you really need to know about supervision. As you can see, you can only have one strategy setting for each supervising actor. It is possible to define how a given supervisor reacts to each exception type, but you can still have only one Duration and retry setting. So again, you will want to divide your actors into small pieces, for example by adding one additional supervisor layer in the middle.
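To make the last point concrete, here is a minimal sketch of such a middle supervisor layer: a thin actor whose only job is to create one kind of worker and apply a dedicated strategy to it, so the top-level supervisor can keep its own strategy simple. DownloadSupervisor and DownloadWorker are made-up names, and the retry settings are only examples.
public class DownloadSupervisor extends UntypedActor {
    // This middle layer applies its own strategy to one kind of worker only.
    private static final SupervisorStrategy STRATEGY =
            new OneForOneStrategy(3, Duration.create("30 seconds"), t -> {
                if (t instanceof Exception) {
                    return restart();
                }
                return escalate();
            });

    private final ActorRef worker =
            getContext().actorOf(Props.create(DownloadWorker.class), "download_worker");

    @Override
    public SupervisorStrategy supervisorStrategy() {
        return STRATEGY;
    }

    @Override
    public void onReceive(Object message) throws Exception {
        // Simply pass the work on to the supervised worker.
        worker.forward(message, getContext());
    }
}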

Supervisor's Directive Options

Then let us take a closer look at the directive options each supervisor can choose from: restart, resume, stop, and escalate.

Restart

When the supervisor decides to restart a failing actor, the actor system follows the steps below, as described in "What Restarting Means."
  1. suspend the actor (which means that it will not process normal messages until resumed), and recursively suspend all children
  2. call the old instance’s preRestart hook (defaults to sending termination requests to all children and calling postStop)
  3. wait for all children which were requested to terminate (using context.stop()) during preRestart to actually terminate; this—like all actor operations—is non-blocking, the termination notice from the last killed child will effect the progression to the next step
  4. create new actor instance by invoking the originally provided factory again
  5. invoke postRestart on the new instance (which by default also calls preStart)
  6. send restart request to all children which were not killed in step 3; restarted children will follow the same process recursively, from step 2
  7. resume the actor
Note that you can optionally stop one or more child actors during preRestart (steps 2 and 3). In step 6, those children that were not explicitly terminated will be restarted.

One more thing I noticed is that, when you return restart() on a preStart failure (ActorInitializationException), postStop is not called even though step 2 says postStop is called. Take a look at the very bottom of the code below.
public static class DummySupervisor extends UntypedActor {
    private SupervisorStrategy supervisorStrategy;

    public DummySupervisor(SupervisorStrategy supervisorStrategy) {
        this.supervisorStrategy = supervisorStrategy;
    }

    @Override
    public SupervisorStrategy supervisorStrategy() {
        return supervisorStrategy;
    }

    @Override
    public void onReceive(Object o) throws Exception {
        // Do nothing
    }
}

public TestActorRef<DummySupervisor> generateDummySupervisor(SupervisorStrategy supervisorStrategy) {
    Props props = Props.create(DummySupervisor.class, () -> new DummySupervisor(supervisorStrategy));
    return TestActorRef.create(actorSystem, props, "dummy_supervisor-" + randomGenerator.nextInt(1000));
}

@Test
public void shouldPostStopNotBeCalledOnPreStartException() throws Exception {
    List<WorthlessActor> actors = new ArrayList<>();
    // Prep a supervisor that always tries to restart
    SupervisorStrategy myStrategy = new OneForOneStrategy(3, Duration.create("1 minute"), t -> {
        return restart();
    });
    DummySupervisor dummySupervisor = generateDummySupervisor(myStrategy).underlyingActor();

    // Create child actor
    Props worthlessActorProps = Props.create(WorthlessActor.class, () -> {
        WorthlessActor actor = spy(new WorthlessActor());

        // Throw exception on preStart
        doThrow(new Exception()).when(actor).preStart();

        actors.add(actor);

        return actor;
    });
    dummySupervisor.getContext().actorOf(worthlessActorProps);

    Thread.sleep(50);
    assertThat(actors).hasSize(4);

    // They NEVER call postStop so we have to do some clean up when it fails in the middle of preStart().
    verify(actors.get(0), never()).postStop();
    verify(actors.get(1), never()).postStop();
    verify(actors.get(2), never()).postStop();
    verify(actors.get(3), never()).postStop();
}
Actually postStop() is called when stop() is returned on preStart failure, though.

Resume

Resume is pretty straightforward. It just lets the failing actor resume its task. You might just want to leave a log entry here.

Stop (Terminate)

Along with restart, the most important option to note is stop. This will stop the failing actor. The important thing is that actor termination also occurs in regular operation, such as when an actor finishes its task and is no longer needed. When stop is selected, it follows the same steps as termination. Details are described in the Termination section later.

Escalate

When a supervisor cannot handle its child's failure, the supervising actor may fail itself and let its own parent, the grandparent of the failing actor, take care of it. When the exception is escalated all the way up, the last strategy to be applied is the stopping strategy.

Termination

Actor termination basically occurs on three occasions:
  • As a part of actor restart (old actor is terminated)
  • When supervisor decides to stop failing actor
  • When actor finishes its task and getContext().stop(targetActorRef) is called
In any case, the steps below are followed:
  1. The stopping actor's postStop is called
  2. Watching actors receive a Terminated message
So what does "watching actor" in step 2 mean? Besides the supervising (parent) actor, actors may "watch" other actors. When you call getContext().watch(anotherActorRef), the calling actor starts to subscribe to anotherActorRef's termination. When anotherActor stops, a Terminated message is passed to its parent and watching actors. This is called Death Watch.

You must remember that, when you receive a Terminated instance, you can access the stopping actor via Terminated#actor. BUT this is just an ActorRef instance, so you cannot know what type of actor is hiding under the hood.
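As a minimal sketch (WorkerActor is a made-up child actor class), watching another actor and handling its Terminated message looks like this:
public class WatchingActor extends UntypedActor {
    private final ActorRef worker = getContext().actorOf(Props.create(WorkerActor.class), "worker");

    @Override
    public void preStart() throws Exception {
        // Subscribe to the worker's termination (Death Watch).
        getContext().watch(worker);
    }

    @Override
    public void onReceive(Object message) throws Exception {
        if (message instanceof Terminated) {
            // Terminated#actor only gives us an ActorRef; the concrete actor type is unknown.
            ActorRef stopped = ((Terminated) message).actor();
            if (stopped.equals(worker)) {
                // React to the worker's termination, e.g. recreate it or stop self.
            }
        } else {
            unhandled(message);
        }
    }
}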

Another important thing is that supervision-related messages are sent and stored in a different mailbox than the usual one, so the message reception order is not guaranteed to match the order in which the events occurred.

Stateful Actor

As you already saw, an actor's state gets lost on restart since the actor system replaces the failing actor with a new actor instance. When you need a stateful actor you can use an actor called UntypedPersistentActor instead of UntypedActor. To store the state, you can configure which storage plugin to use.

However, to store data casually and locally, I prefer to create a class that caches data. Remember that the same arguments are passed on restart, so the same FooCache instance is passed to the new MyActor instance with the code below. Before employing UntypedPersistentActor, I would rethink whether it is really required. You will want to keep your actors simple, so creating a simple cache class or adding another layer to transfer and store data should be considered first.
FooCache cache = new FooCache();
Props.create(MyActor.class, () -> new MyActor(cache));
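As a minimal sketch of the receiving side (FooCache#put is a made-up method), the injected cache can be used like this:
public class MyActor extends UntypedActor {
    private final FooCache cache;

    public MyActor(FooCache cache) {
        // On restart the Props factory closure runs again with the same captured
        // FooCache instance, so the cached data survives the restart.
        this.cache = cache;
    }

    @Override
    public void onReceive(Object message) throws Exception {
        // put() is a hypothetical method; store whatever state must outlive restarts.
        cache.put("lastMessage", message);
    }
}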

What I have learned

The single most important thing I have learned is that we should keep actors small and simple. As your actors become complex, supervisor strategies and tests become much more complex. This is clearly stated in the document, and I think it is the most important basic principle. So let us keep this in mind.
The quintessential feature of actor systems is that tasks are split up and delegated until they become small enough to be handled in one piece. In doing so, not only is the task itself clearly structured, but the resulting actors can be reasoned about in terms of which messages they should process, how they should react normally and how failure should be handled. If one actor does not have the means for dealing with a certain situation, it sends a corresponding failure message to its supervisor, asking for help. The recursive structure then allows to handle failure at the right level.

Jul 25, 2015

[Python] Trying out PEP0484 Type Hinting, ABCMeta, and PyCharm

After my wife gave birth to our first-born daughter, I started a new Python project and named it after this baby girl. It was a brand new project and did not require me to think about backward compatibility, so I decided to use a later version of Python and challenge myself to employ some ideas that I liked while learning Java: type safety, private properties, explicit overriding with @Override, and organized inheritance. These are things that make me feel safer and more comfortable.
Python itself does not actually support them, but I found some can be achieved to a certain level with the help of modules and an IDE.

Type Hinting

I just realized that PEP 0484 was accepted in May. This kind of type hinting was already available with docstrings, but PEP 0484 employs PEP 3107 style annotations to achieve it, and it is now officially a part of Python 3.5 or later. Python 2.7 and 3.2+ can still use the backported typing module to implement it. I knew there was, in terms of readability, a constructive discussion introduced in "Type hinting in Python: focus on readability," but I decided to give this a try.
Utilizing this type hinting, however, involved another challenge for me. Since jedi-vim did not support it yet, I had to switch from Vim to PyCharm. PyCharm's support is still limited, but basic features are covered. With the help of the IdeaVim plugin, the switching cost was much lower than I had expected.

Installing typing module

Installing typing was a bit tricky. The regular pip command, `pip install typing`, somehow broke and ended up installing an empty ver. 0.0, so I had to explicitly set the version with `pip install -lv typing==3.5.0b1`. This worked O.K.
By the nature of PEP 0484, type checking is not done at runtime, so the type hinting is mostly for static checks by third-party modules or IDEs. Therefore I found it really important to know an IDE's current support level. The worst-case scenario is that you think your IDE supports PEP 0484 perfectly, you mis-define or mis-call a function, but your IDE does not actually check it correctly, so you end up with no warnings. Here are some limitations I have found so far.

Some limitation with PyCharm

Type aliases

Using type aliases to declare a complex type in one place or to give it a more meaningful name seems like a pretty good idea, but support for this is limited in the current version of PyCharm.
See the capture below. PyCharm correctly gives warnings when the argument declaration has no type alias or has a simple alias with built-in types, but gives no warning for a type alias with types provided by the typing module.


Just like other warnings, you can click on the yellow marked part to see the detailed description.


The tricky thing is that you want to use type aliases for complex declarations to avoid mis-calling or mis-declaration, but this ironically leads to bugs since using type aliases eliminates the chance to see the proper warnings. I stored all complex declarations in types.py and used them in other modules, then found out this did not work with PyCharm. We must keep this in mind.
It will be supported in PyCharm ver. 5.0, though. Let us look forward to it.
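For illustration, the kind of alias I mean looks like this (the names are made up for this sketch):
from typing import Dict, List, Union

# A simple alias built from built-in types: PyCharm warned on misuse.
UserId = int

# An alias built from typing's generic types: PyCharm gave no warning
# when a call site passed the wrong type (the issue described above).
ScoreTable = Dict[str, List[Union[int, float]]]


def add_score(scores: ScoreTable, user: str, score: int) -> None:
    scores.setdefault(user, []).append(score)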

Abstract collection types

Although PEP 0484 recommends using abstract collection types for arguments, as quoted below, such declarations result in warnings.
Note: Dict , List , Set and FrozenSet are mainly useful for annotating return values. For arguments, prefer the abstract collection types defined below, e.g. Mapping , Sequence or AbstractSet .

I guess this is simply not supported yet, just like the previous one, but as long as you get warnings you have a chance to notice that something is happening and the option to ignore it. So I am not as worried about this as about the type alias issue above.
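For reference, a declaration following the PEP's recommendation would look like the sketch below (the function names are made up):
from typing import List, Mapping, Sequence


# Abstract collection types for arguments, concrete types for return values.
def total_score(scores: Mapping[str, Sequence[int]]) -> int:
    return sum(sum(s) for s in scores.values())


def user_names(scores: Mapping[str, Sequence[int]]) -> List[str]:
    return sorted(scores)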

@overload

With type hinting, function signatures became more detailed compared to *args and **kwargs, so I became more interested in what the typing module's @overload decorator would do for us.
It did not take long to find out that this decorator can only be used in stub files with the .pyi extension, while the actual declaration and implementation stay in .py files as usual. In other words, even though the typing module provides this decorator, we cannot declare same-named functions with different signatures in regular .py files.
Let us look at the signature below. It tells us that __getitem__ receives an int or a slice as an argument and returns an int or bytes as the return value. BUT you cannot be sure whether int or bytes is returned when you pass an int argument.
def __getitem__(self, a: Union[int, slice]) -> Union[int, bytes]:
    if isinstance(a, int):
        # Do something
        pass
    else:
        # Do something 
        pass

In such a case, we can declare them separately in the stub file with the @overload decorator like below:
@overload
def __getitem__(self, i: int) -> int: ...

@overload
def __getitem__(self, s: slice) -> bytes: ...

This makes it more obvious that an int argument returns int while a slice argument returns bytes. Since I was not going to need such use cases and preparing a separate .pyi file seemed like a bit of a pain, I did not use this anyway.

My feedback about type hinting

As I employed type hinting with the typing module, I became more cautious about passing arbitrary arguments. Especially for public methods, I stopped using dict, *args, and **kwargs. Instead I started to define JavaBean-like classes to hold those values, use the name mangling strategy to provide private-like properties, and define @property getter methods. As a happy side effect, I enjoy stronger code completion wherever PyCharm supports type hinting. Now I feel more comfortable and safe.
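A minimal sketch of such a bean-like value class, using name mangling and @property (the names are made up):
class SearchRequest:
    def __init__(self, query: str, limit: int = 10) -> None:
        # Name mangling gives private-like attributes.
        self.__query = query
        self.__limit = limit

    @property
    def query(self) -> str:
        return self.__query

    @property
    def limit(self) -> int:
        return self.__limit


def search(request: SearchRequest) -> None:
    # The caller gets code completion and static checks instead of **kwargs guesswork.
    print(request.query, request.limit)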

Structured/Organized inheritance

As I learned Java over several days, I started to like the ideas of interfaces, the default modifier, abstract classes, the @Override annotation, and so on. Those things make me feel more comfortable in terms of safer design. It is a bit tiring to think about design in that much detail, though.
I found that ABCMeta and type hinting can support some of them.

ABCMeta

With the help of ABCMeta, we can create a class with abstract methods and abstract properties that inheriting classes are obligated to override, and we can create default methods that may be overridden. This is a bit like a Java interface that has methods with abstract or default modifiers.

@abstractmethod and @abstractproperty

Methods and properties with these decorators obligate inheriting classes to override them. I am really comfortable with PyCharm's assistance giving me the option to override the abstract methods.


It automatically declares the function with the same signature, which I think is pretty neat. So when I want to define behavior, I like to use ABCMeta with @abstractmethod.
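As a minimal sketch (the class and method names are made up), an interface-like base class with ABCMeta looks like this:
from abc import ABCMeta, abstractmethod


class Storage(metaclass=ABCMeta):
    @abstractmethod
    def save(self, key: str, value: str) -> None:
        """Inheriting classes are obligated to override this."""

    def describe(self) -> str:
        # A "default" method: subclasses may override it but do not have to.
        return self.__class__.__name__


class InMemoryStorage(Storage):
    def __init__(self) -> None:
        self.__data = {}

    def save(self, key: str, value: str) -> None:
        self.__data[key] = value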

@Override

In Java we have the @Override annotation, but Python does not provide an @override decorator. I missed it at first, but I found it somewhat O.K. since PyCharm gives us warnings for wrongful overriding. Why *somewhat*? It is safer now that PyCharm warns about mis-overriding, but we still miss the explicitness of indicating whether the declared method overrides an existing one or extends the class with a new method.

My feedback about abc module

After I started using abc.ABCMeta, I feel more comfortable declaring behavior in a Java-interface-like class. It is still more casual, but it is nice that we can obligate inheriting classes to behave in a certain way by making them override designated abstract methods.

Conclusion

When I work on my personal projects, it is not uncommon for a project to be left untouched for a relatively long time. Then I look into the implementation for maintenance purposes, but I do not remember much about it. So I feel more comfortable when things are organized and explicitly declared.
With that said, the things I tried this time are pretty neat, and I am looking forward to the next version of PyCharm.

Further reading