Aug 19, 2017

[Golang] Introducing go-sarah: simple yet highly customizable bot framework

As mentioned in the latest blog post, I created a new bot framework: go-sarah. This article introduces its notable features and overall architecture along with some sample code. Upcoming articles will focus on the details of each specific aspect.

Notable features

User's Conversational Context

In this project, a user's conversational context is referred to as "user context"; it stores the user's previous state and defines what function should be executed on the following input. While a typical bot implementation is somewhat "stateless," so a user-and-bot interaction does not take previous state into account, Sarah natively supports the idea of a conversational context. The aim is to let users provide information bit by bit as they send messages, eventually building up a complex set of arguments.

For example, instead of obliging the user to input a long, convoluted text such as ".todo Fix Sarah's issue #123 by 2017-04-15 12:00:00" all at once, the user can build up the arguments in a conversational manner, as in the image below.
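
To support this, a command's response can carry a user context so that the sender's next message is routed straight to a designated function. Below is a minimal sketch of such a conversational command function; the CommandResponse fields and the sarah.NewUserContext helper used here are assumptions based on the project's examples, so check the repository for exact signatures.
package todo

import (
        "github.com/oklahomer/go-sarah"
        "github.com/oklahomer/go-sarah/slack"
        "golang.org/x/net/context"
)

// AskDueDate replies with a question and stores a user context so that
// the sender's next message is fed to the enclosed function.
func AskDueDate(_ context.Context, input sarah.Input) (*sarah.CommandResponse, error) {
        return &sarah.CommandResponse{
                Content: "When is the due date?",
                UserContext: sarah.NewUserContext(func(_ context.Context, next sarah.Input) (*sarah.CommandResponse, error) {
                        // next.Message() should contain the due date sent as a follow-up.
                        return slack.NewStringResponse("Noted: " + next.Message()), nil
                }),
        }, nil
}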


Live Configuration Update

When the configuration file for a command or scheduled task is updated, Sarah automatically detects the event and re-builds the command or task in a thread-safe manner, so that the next execution appropriately reflects the new configuration values.

See the usage of CommandPropsBuilder and ScheduledTaskPropsBuilder for details.

Concurrent Execution by Default

Developers may implement their own bot by a) implementing the sarah.Bot interface, or b) implementing sarah.Adapter and passing it to sarah.NewBot() to get an instance of the default Bot implementation.

Either way, a component called sarah.Runner takes care of Command execution against given user input. sarah.Runner dispatches tasks to its internal workers, which means developers do not have to make any extra effort to handle a flood of incoming messages.

Alerting Mechanism

When a bot confronts a critical situation and cannot continue its operation or recover on its own, Sarah's alerting mechanism sends an alert to the administrator. Zero or more sarah.Alerter implementations can be registered to send alerts to desired destinations.

Higher Customizability

For higher customizability, Sarah is composed of fine-grained components, each serving a single domain: sarah.Alerter is responsible for notifying the administrator of a bot's critical state, workers.Worker is responsible for executing a given job in a panic-proof manner, and so on. Each component comes with an interface and a default implementation, so developers may change Sarah's behavior by implementing the corresponding component's interface and replacing the default implementation.

Overall Architecture

The sections below illustrate the major components.


Runner

Runner is the core of Sarah; it manages other components' lifecycles, handles concurrent job execution with internal workers, watches configuration files for changes, re-configures commands/tasks on such changes, executes scheduled tasks, and, most importantly, makes Sarah come alive.

Runner may take multiple Bot implementations to run multiple Bots in a single process, so that resources such as workers and memory space can be shared.
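
The snippet below is a rough sketch of this setup, registering one Bot and starting the Runner. sarah.NewConfig and sarah.NewRunner are assumptions based on this article's references to sarah.Config and Runner.Run, so check the repository for the exact signatures.
package main

import (
        "github.com/oklahomer/go-sarah"
        "github.com/oklahomer/go-sarah/slack"
        "golang.org/x/net/context"
)

func main() {
        // Build a Bot from the predefined slack Adapter.
        slackAdapter, _ := slack.NewAdapter(slack.NewConfig())
        bot := sarah.NewBot(slackAdapter)

        // sarah.NewConfig/NewRunner are assumed here; see the repository for exact signatures.
        runner := sarah.NewRunner(sarah.NewConfig())
        runner.RegisterBot(bot) // Multiple Bots may be registered to share resources.

        ctx, cancel := context.WithCancel(context.Background())
        defer cancel()
        runner.Run(ctx) // Components start running; scheduled tasks become active.
}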

Bot / Adapter

The Bot interface is responsible for the actual interaction with chat services such as Slack, LINE, gitter, etc.

A Bot receives messages from the chat service, checks if the sending user is in the middle of a user context, searches for the corresponding Command, executes it, and sends the response back to the chat service.

One important thing to be aware of is that, once a Bot receives a message from the chat service, it sends the input to Runner via a designated channel. Runner then dispatches a job to an internal worker, which calls Bot.Respond and sends the response via Bot.SendMessage. In other words, after the input is sent over the channel, everything is handled concurrently without any additional work. Change the worker configuration to throttle the number of concurrent executions; this may also affect the number of concurrent HTTP requests against the chat service provider.

DefaultBot

Technically, Bot is just an interface, so developers can create their own Bot implementations to interact with their preferred chat services if desired. However, most Bots share similar functionality, and it is truly cumbersome to implement one for every chat service of choice.

Therefore a defaultBot implementation is predefined; it can be initialized via sarah.NewBot.

Adapter

sarah.NewBot takes multiple arguments: an Adapter implementation and an arbitrary number of sarah.DefaultBotOptions as functional options. The Adapter becomes a bridge between defaultBot and the chat service: defaultBot takes care of finding the corresponding Command for a given input, handling stored user contexts, and other miscellaneous tasks; the Adapter takes care of connecting to and messaging with the chat service.

package main

import (
        "github.com/oklahomer/go-sarah"
        "github.com/oklahomer/go-sarah/slack"
        "gopkg.in/yaml.v2"
        "io/ioutil"
)

func main() {
        // Setup slack bot.
        // Any Bot implementation can be fed to Runner.RegisterBot(), but for convenience slack and gitter adapters are predefined.
        // sarah.NewBot takes adapter and returns defaultBot instance, which satisfies Bot interface.
        configBuf, _ := ioutil.ReadFile("/path/to/adapter/config.yaml")
        slackConfig := slack.NewConfig() // config struct is returned with default settings.
        yaml.Unmarshal(configBuf, slackConfig)
        slackAdapter, _ := slack.NewAdapter(slackConfig)
        sarah.NewBot(slackAdapter)
}

Command

The Command interface represents a plugin that receives user input and returns a response. Command.Match is called against the user input in Bot.Respond. If it returns true, the command is considered to correspond to the user input, and hence its Execute method is called.
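
For illustration, a struct-based Command might look like the sketch below. The method set is inferred from this article (Match and Execute are named in the text, while Identifier and InputExample mirror the builder methods that follow), so treat it as an illustration rather than the exact interface definition.
package echo

import (
        "strings"

        "github.com/oklahomer/go-sarah"
        "github.com/oklahomer/go-sarah/slack"
        "golang.org/x/net/context"
)

// EchoCommand is a hypothetical struct-based Command.
type EchoCommand struct{}

func (c *EchoCommand) Identifier() string { return "echo" }

func (c *EchoCommand) InputExample() string { return ".echo knock knock" }

func (c *EchoCommand) Match(input sarah.Input) bool {
        return strings.HasPrefix(input.Message(), ".echo")
}

func (c *EchoCommand) Execute(_ context.Context, input sarah.Input) (*sarah.CommandResponse, error) {
        // ".echo foo" to "foo"
        return slack.NewStringResponse(strings.TrimSpace(strings.TrimPrefix(input.Message(), ".echo"))), nil
}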

Any struct that satisfies the Command interface can be fed to Bot.AppendCommand as a command. CommandPropsBuilder is provided to easily implement the Command interface on the fly:

Simple Command

There are several ways to set up Commands:
  • Define a struct that implements the Command interface, and pass its instance to Bot.AppendCommand.
  • Use CommandPropsBuilder to construct a non-contradicting set of arguments, and pass this to Runner. Runner internally builds a command, and re-builds it when a configuration struct is present and the corresponding configuration file is updated.
Below are several ways to set up CommandProps with CommandPropsBuilder for different customizations.
// In separate plugin file such as echo/command.go
// Export some pre-build command props
package echo

import (
        "github.com/oklahomer/go-sarah"
        "github.com/oklahomer/go-sarah/slack"
        "golang.org/x/net/context"
        "regexp"
)

// CommandProps is a set of configuration options that can be, and should be, treated as one from a logical perspective.
// It can be fed to Runner to build a Command on the fly.
// CommandProps is re-used when the command is re-built due to a configuration file update.
var matchPattern = regexp.MustCompile(`^\.echo`)
var SlackProps = sarah.NewCommandPropsBuilder().
        BotType(slack.SLACK).
        Identifier("echo").
        MatchPattern(matchPattern).
        Func(func(_ context.Context, input sarah.Input) (*sarah.CommandResponse, error) {
                // ".echo foo" to "foo"
                return slack.NewStringResponse(sarah.StripMessage(matchPattern, input.Message())), nil
        }).
        InputExample(".echo knock knock").
        MustBuild()

// To have complex checking logic, MatchFunc can be used instead of MatchPattern.
var CustomizedProps = sarah.NewCommandPropsBuilder().
        MatchFunc(func(input sarah.Input) bool {
                // Check against input.Message(), input.SenderKey(), and input.SentAt()
                // to see if particular user is sending particular message in particular time range
                return false
        }).
        // Call some other setter methods to do the rest.
        MustBuild()

// Configurable is a helper function that returns CommandProps built with the given CommandConfig.
// The CommandConfig can first be configured manually or from a YAML/JSON file, and then fed to this function.
// The returned CommandProps can be fed to Runner; when the configuration file is updated,
// Runner detects the change and re-builds the Command with the updated configuration struct.
func Configurable(config sarah.CommandConfig) *sarah.CommandProps {
        return sarah.NewCommandPropsBuilder().
                ConfigurableFunc(config, func(_ context.Context, input sarah.Input, conf sarah.CommandConfig) (*sarah.CommandResponse, error) {
                        return nil, nil
                }).
                // Call some other setter methods to do the rest.
                MustBuild()
}

Reconfigurable Command

With CommandPropsBuilder.ConfigurableFunc, a desired configuration struct may be added. This configuration struct is passed as the third argument on command execution. Runner watches the configuration file directory, and when a configuration file is updated, the corresponding command is built again.

To let Runner supervise file change events, set sarah.Config.PluginConfigRoot. The internal directory watcher supervises sarah.Config.PluginConfigRoot + "/" + BotType + "/" as the Bot's configuration directory. When any file under that directory is updated, Runner searches for the corresponding CommandProps on the assumption that the file name equals CommandProps.identifier + ".(yaml|yml|json)". If a corresponding CommandProps exists, Runner re-builds the Command with the latest configuration values and replaces the old one.
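
For instance, the configuration for the echo command above would be watched as follows. This is a minimal sketch that assumes sarah.NewConfig returns a config struct with default values.
config := sarah.NewConfig() // assumed constructor; see the repository
config.PluginConfigRoot = "/path/to/config"
// For a Slack bot, the watcher supervises /path/to/config/slack/, and an update to
// /path/to/config/slack/echo.yaml triggers a re-build of the "echo" command.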

Scheduled Task

While commands are functions that respond to user input, scheduled tasks are functions that run on a schedule: say "Good morning, sir!" at 7:00 a.m. every day, search the database and send "today's chores list" to specific rooms, and so on.

A ScheduledTask implementation can be fed to Runner.RegisterScheduledTask. When Runner.Run is called, the clock starts ticking and the scheduled tasks become active; tasks are executed as scheduled, and the results are sent to the chat service via Bot.SendMessage.

Simple Scheduled Task

Technically, any struct that satisfies the ScheduledTask interface can be treated as a scheduled task, but a builder is provided to construct a ScheduledTask on the fly.
package foo

import (
        "github.com/oklahomer/go-sarah"
        "github.com/oklahomer/go-sarah/slack"
        "github.com/oklahomer/golack/slackobject"
        "golang.org/x/net/context"
)

// TaskProps is a set of configuration options that can be, and should be, treated as one from a logical perspective.
// It can be fed to Runner to build a ScheduledTask on the fly.
// ScheduledTaskProps is re-used when the task is re-built due to a configuration file update.
var TaskProps = sarah.NewScheduledTaskPropsBuilder().
        BotType(slack.SLACK).
        Identifier("greeting").
        Func(func(_ context.Context) ([]*sarah.ScheduledTaskResult, error) {
                return []*sarah.ScheduledTaskResult{
                        {
                                Content:     "Howdy!!",
                                Destination: slackobject.ChannelID("XXX"),
                        },
                }, nil
        }).
        Schedule("@everyday").
        MustBuild()

Reconfigurable Scheduled Task

With ScheduledTaskPropsBuilder.ConfigurableFunc, a desired configuration struct may be added. This configuration struct is passed as the second argument on task execution. Runner watches the configuration file directory, and when a configuration file is updated, the corresponding task is built and scheduled again.

To let Runner supervise file change events, set sarah.Config.PluginConfigRoot. The internal directory watcher supervises sarah.Config.PluginConfigRoot + "/" + BotType + "/" as the Bot's configuration directory. When any file under that directory is updated, Runner searches for the corresponding ScheduledTaskProps on the assumption that the file name equals ScheduledTaskProps.identifier + ".(yaml|yml|json)". If a corresponding ScheduledTaskProps exists, Runner re-builds the ScheduledTask with the latest configuration values and replaces the old one.

UserContextStorage

As described in "Notable Features," Sarah stores the user's current state when a Command's response expects the user to send a series of messages with extra supplemental information. UserContextStorage is where that state is stored. Developers may store the state in their desired storage by implementing the UserContextStorage interface. Two implementations are currently provided by the author:

Store in Process Memory Space

defaultUserContextStorage is a UserContextStorage implementation that stores the ContextualFunc, a function to be executed on the next user input, in the same memory space the process is running in. Under the hood, this storage is simply a map whose key is the user identifier and whose value is the ContextualFunc. The ContextualFunc can be any function that satisfies the type, including instance methods and anonymous functions; however, anonymous functions are recommended, since variables declared around the last method call can be casually referenced in that scope.

Store in External KVS

go-sarah-rediscontext stores a combination of a function identifier and serializable arguments in Redis. This is extremely effective when multiple Bot processes run and the user contexts must be shared among them.
e.g. A chat platform such as LINE sends an HTTP request to the Bot on every user input, where the Bot may consist of multiple servers/processes to balance those requests.

Alerter

When a registered Bot encounters a critical situation that requires the administrator's direct attention, Runner sends an alert message as configured with Alerter. A LINE alerter is provided by default, but anything that satisfies the Alerter interface can be registered. Developers may add multiple Alerter implementations via Runner.RegisterAlerter; registering more than one is recommended, so that a single alerting channel's malfunction does not keep the administrator from noticing the critical state.

A Bot/Adapter may send a BotNonContinuableError via the error channel to notify Runner of a critical state, e.g. when the Adapter cannot connect to the chat service provider after a reasonable number of retries.
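
As a rough sketch, a custom Alerter might look like the following. The Alert method signature shown here is an assumption inferred from this description, not the verbatim interface, so consult the repository before relying on it.
package main

import (
        "fmt"

        "github.com/oklahomer/go-sarah"
        "golang.org/x/net/context"
)

// mailAlerter is a hypothetical Alerter that notifies the administrator by e-mail.
type mailAlerter struct {
        address string
}

// Alert's signature below is an assumption; check the sarah.Alerter interface for the real one.
func (a *mailAlerter) Alert(_ context.Context, botType sarah.BotType, err error) error {
        // A real implementation would send an e-mail to a.address here.
        fmt.Printf("ALERT: bot %s is in critical state: %s\n", botType, err.Error())
        return nil
}

// Register it via runner.RegisterAlerter(&mailAlerter{address: "admin@example.com"}).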

Getting Started

That is pretty much everything developers should know before getting started. To see working example code, visit https://github.com/oklahomer/go-sarah/tree/master/examples. For more details, make sure to follow the upcoming blog posts.

Aug 6, 2017

Parenting software engineer

It was a cold day for spring when my wife gave birth to a beautiful baby girl, Sarah. Despite the snowy weather, Sarah was sleeping peacefully in her mother's arms. Overwhelmed with gratitude as I watched the faces of the newborn and her mother, I realized a passion to give birth to something was growing in me. Giving birth is the most beautiful and creative act, allowed only to women, that weaves a rich tapestry of life, so I, as a male software engineer, wanted the closest experience to it I could get. That was the moment I decided to start a new project.

I named this project Sarah. This project would not only be a good memento of my daughter's birth, but also a good record of our growth. Once a software engineer stops growing as one, he can easily be left behind by this rapidly growing industry. This fact frightened me all the time; I needed to grow as much as my daughter did. Many parents complain that having kids leaves them less time to work on what they want to, and that there is nothing they can do about it. This is a reasonable complaint, but one I was not going to accept. Having a daughter must be something that enriches my life, not something that burdens me. If there is someone to blame, it should be me, not my daughter. Working on a new project that focuses on a new area of interest should help me grow as a software engineer.

For this project, I chose a customizable chatbot framework as the theme. It was 2015, and creating chatbots was becoming a new trend. From a technical perspective, creating a chatbot framework involves skills such as:
  1. having a good design that clearly separates the abstraction from the implementation layer
  2. having a good understanding of multiple communication protocols, depending on what chat services to adapt.
These struck me as promising challenges that would bring me to the next level.

I started implementing Sarah with Python 3.5. At that time, the official announcement of PEP 484's release was around the corner, and PyCharm was working on adopting this type hinting feature. While learning Python, I found a package named abc that could be used to define abstract base classes. I thought a combination of type hinting and abc could provide a well-structured architecture. Decorators were also a good way to keep a plugin function's specification minimal by wrapping its core logic with the actual messaging logic. However, it became obvious that I had taken type hinting too seriously. Instead of passing around arbitrary dictionaries as function arguments, I preferred to define designated classes that represent particular objects and pass their instances. I even implemented a base class called ValueObject to provide immutable objects. Passing those objects across public interfaces could be a good idea in terms of unambiguity, but I did the same for private methods. By then Python's flexibility was lost, and my code had become inferior Java.

A few months later, I redesigned this project and started implementing it in Golang. I found learning Golang a joyful experience. The previous Python codebase not only gave me a better understanding of the whole picture; it also revealed some hidden requirements that I had missed the last time. To fulfill those requirements, I added another layer called Runner at the bottom. Adapter focuses on connecting to the designated chat service; Runner focuses on coordinating and supervising the other components. Thanks to this newly added component, the other components' implementations became simpler and more focused. As described in its repository, Sarah is now composed of fine-grained components and interfaces, which make it easier to replace the pre-defined default behavior with a customized implementation.


As of July 4th, 2017, Sarah is no longer pre-alpha and is now listed on awesome-go. While I am proud of what I have achieved, I must admit that this is not the end of our journey. All along, working on Sarah was not just coding. As a matter of fact, coding in my private time was the last thing I could do as a parent, and that frustrated me from time to time. But I also knew we were going to have less and less time to spend together as my daughter grew up; she would make friends in school, spend time with them, find a boyfriend, go to college, and eventually leave home. This project taught me an important lesson: our time is always limited, and we need a continuing effort to spend it wisely. I will continue to work on Sarah, but I am sure the actual Sarah, my daughter, always has higher priority. I am her father. I always am.

[EDIT] FYI, this project's design philosophy, detailed specs, and the knowledge I gained will be introduced in following blog posts. Until then, its GitHub repository should help.

Dec 30, 2016

[IntelliJ] Switch focus back to editor while keep the embedded terminal open

TL;DR
IntelliJ IDEA 13's default shortcut, ⌥F12, closes the embedded terminal when switching focus to the editor window. Hitting ⌘2 twice switches focus to the editor while keeping the terminal open.

Software engineers often want to open both an editor and a terminal at the same time, so they can code in the editor while tailing logs, displaying git-grep results, or showing whatever else beside the editor. It is pretty handy: you can focus on the coding task while still keeping an eye on server logs, or open a particular log file in a terminal on the right side of your monitor while applying code changes on the left. It is even handier if both can be displayed within IntelliJ IDEA, since pre-defined or user-defined shortcut keys let engineers switch to and from the terminal as needed, with the editor and terminal aligned in a more organized manner than opening both Terminal.app and IntelliJ IDEA and toggling between them. It is an Integrated Development Environment, after all.

The first thing one may notice is the pre-defined shortcut to toggle between terminal and editor: ⌥F12. When focus is on the editor window, this opens the embedded terminal and switches focus to it; when focus is on the embedded terminal, it switches focus back to the editor window. This, however, has one problem: the embedded terminal closes when focus is switched to the editor. There actually are some modes that define the behaviour of the terminal window, and programmers can choose one or a combination of those modes depending on their preference, but the terminal window still closes with this shortcut. Hence the term "toggle."

So how can we switch focus back to the editor and still have the embedded terminal open? If there is no pre-defined shortcut, what is the best workaround? Some workarounds are introduced on the web, including defining macros/shortcuts, and the simplest yet least disruptive one was found on stackoverflow.com. On that question, Andrey introduces the idea of switching focus to a different tool window with pre-defined shortcut keys; dev gives a comprehensive answer. Andrey's approach seems simple yet effective. Since the ⌥F12 shortcut closes the terminal window on "toggle," just use another shortcut to "switch" focus to a different window, such as Favorites with ⌘2. At this point, the state is just the same as if you had hit ⌘2 from the editor and switched focus to the Favorites window. Then hitting the esc key, or ⌘2 again, switches focus back to the editor. The only requirement is to check "Docked mode" and "Pinned mode" for the terminal window.

Note that the embedded terminal is a bit different from other tool windows, since it is a terminal and consumes most key inputs, including the esc key and other ⌘-related inputs. And as long as there is no pre-defined shortcut to just switch -- not toggle -- focus to and from the terminal, the workaround introduced above is required. There is a request on YouTrack asking for a shortcut that does exactly what is discussed in this article, so until it is implemented or declined, this simple workaround without macro assignment should be enough.

Jan 21, 2016

Yet another Akka introduction for dummies

It has been 5+ years since the initial launch of the Akka toolkit. You can find many articles that cover what it is all about and why to use it, so this article does not discuss those areas. Instead, I am going to introduce what I wanted to know, and what I should have known, when getting started with Akka actors.


Akka has a well-maintained, comprehensive document for both the Java and Scala implementations. The only flaw I can think of is that it is so huge that you can easily get lost in it; or you have little time to work through it, so you just skip reading. Trust me, I was one of those people. Then you google and find articles that cover the areas of your interest. It is nice that many developers are working on the same OSS and sharing their knowledge. The difficult part is that, like with any other product, it takes time to capture the whole picture, and those how-to fragments you find can not fully support you unless you have the whole picture.

So what I am going to do here is summarize the basics, explain each of them with some references to the official document, and then share some practices that I have learned the hard way. Unless otherwise stated, I used the Java implementation of version 2.4.0.

Summary

  • Akka actor is only created by other actor
    • Hence the parent-child relationship
    • You can not initialize it outside of actor system
      • Actor instance is hidden under ActorRef so methods can not be called directly
      • Test may require some work-around
    • Parent supervises its children
  • When actor throws exception, the actor may be restarted
    • Supervisor (the parent actor) defines the recovery policy (Supervisor Strategy)
      • One-for-one vs. all-for-one strategy
      • Options: resume, stop, restart, or escalate
      • Maximum retry count and timeout settings are customizable
    • Under the ActorRef, the old actor is replaced with the new one
  • To implement stateful actor, use UntypedPersistentActor
    • Otherwise, state is lost on actor restart
    • Variety of storage plugins are provided
  • Keep each actor simple
    • Do not let one actor do too much
    • Consider adding a supervisor layer to manage different kinds of actors with different strategy

Terms and concepts

Parental Supervision

The first rule you MUST remember about the actor life cycle is that an actor can only be created by another actor; the created actor is called the "child" and is "supervised" by the creating actor, the "parent." Then who creates the top-level actor? The document says the "top-level actor is provided by the library," and some say the top-level actor is also supervised by an imaginary actor. This family tree is described in a file-path-like hierarchy, similar to a file system.



The root guardian is the one I described as the "top-level" actor. It creates and supervises two special actors as its children: the user guardian and the system guardian. Since this tree is described in a file-path-like hierarchy, the root guardian has the path "/" while the user guardian and the system guardian have "/user" and "/system" accordingly. User-defined actors all belong to the user guardian, so your actors have, and are accessible with, paths like "/user/dad/kid/grand_kid."

As described above, all actors belong to their parents. In other words, you can not initialize your actor outside of the actor system, which makes your tests a bit troublesome. If you try to create your actor directly, you will most likely get an error saying "You cannot create an instance of [XXX] explicitly using the constructor (new)." Without a supervising parent, a child actor can not exist. So spy(new MyActor()) with Mockito will not work as you expect. For a detailed example, see the code fragments below.

Here is one more thing to know about testing. Usually your actor is hidden under an ActorRef instance, and you can not call the actor's methods from outside. This makes unit testing difficult. In that case you can use TestActorRef to get the underlying actor with TestActorRef#underlyingActor.
Props props = Props.create(MyActor.class, () -> new MyActor());
TestActorRef<MyActor> testActorRef = TestActorRef.create(actorSystem, props, "my_actor");
// This is the actual actor. You can call its method directly.
MyActor myActor = testActorRef.underlyingActor();

// If you must do spy(new MyActor()) or equivalent, you can do it here
Props spyProps = Props.create(MyActor.class, () -> {
    MyActor myActor = spy(new MyActor());

    // BEWARE: preStart is called on actor creation,
    // so doing spy(testActorRef.underlyingActor()) after TestActorRef#create
    // is too late to mock preStart().
    doThrow(new Exception()).when(myActor).preStart();

    return myActor;
});

Supervisor Strategy

In the previous section we covered how actors are created and who is responsible for supervision. This section introduces how you can specify the supervising behaviour. Akka employs a "let-it-crash" philosophy: an actor throws an exception when it can no longer proceed with its task, and its supervisor takes responsibility for the recovery. When the parent actor can not handle the recovery task, it may escalate it to its own parent. This way your actors can stay small and concentrate on their tasks.

Defining Strategy

Out of the box, Akka provides two different strategies: the one-for-one and the all-for-one strategy. With the one-for-one strategy, the failing actor is the only subject of recovery; the all-for-one strategy takes all children, including the failing one, as subjects of recovery. If no strategy is set, one-for-one is used.

Defining a strategy is straightforward, and the code fragment below shows pretty much everything.
import akka.actor.ActorInitializationException;
import akka.actor.ActorKilledException;
import akka.actor.OneForOneStrategy;
import akka.actor.SupervisorStrategy;
import akka.actor.UntypedActor;
import scala.concurrent.duration.Duration;

import static akka.actor.SupervisorStrategy.escalate;
import static akka.actor.SupervisorStrategy.restart;
import static akka.actor.SupervisorStrategy.stop;

public class MyActor extends UntypedActor {
    private static SupervisorStrategy strategy = new OneForOneStrategy(10, Duration.create("1 minute"), t -> {
        // http://doc.akka.io/docs/akka/snapshot/java/fault-tolerance.html#Default_Supervisor_Strategy
        if (t instanceof ActorInitializationException) {
            return stop();
        } else if (t instanceof ActorKilledException) {
            return stop();
        } else if (t instanceof Exception) {
            return restart();
        }

        return escalate();
    });

    @Override
    public SupervisorStrategy supervisorStrategy() {
        return strategy;
    }

    @Override
    public void onReceive(Object o) throws Exception {
    }
}
The above code defines the strategy as follows:
  • The failing actor is the only subject of recovery. (One-for-one strategy)
  • Retries up to 10 times within the time window of 1 minute. (The Duration instance)
  • The failing actor stops when ActorInitializationException or ActorKilledException is thrown.
  • The failing actor restarts when any other Exception is thrown.
  • The supervisor escalates the failure when any other Throwable is thrown.
This is actually the default strategy that is applied when you do not specify any. There is one thing you really need to know about supervision: as you see, you can only have one strategy setting per supervising actor. It is possible to define how a given supervisor reacts to each exception type, but you can still have only one Duration and retry setting. So, again, you will want to divide your actors into small pieces, for example by adding one additional supervisor layer in the middle.

Supervisor's Directive Options

Then let us take a closer look at directive options that each supervisor can choose: restart, resume, stop, and escalate.

Restart

When a supervisor decides to restart a failing actor, the actor system follows the steps below, as described in "What Restarting Means."
  1. suspend the actor (which means that it will not process normal messages until resumed), and recursively suspend all children
  2. call the old instance’s preRestart hook (defaults to sending termination requests to all children and calling postStop)
  3. wait for all children which were requested to terminate (using context.stop()) during preRestart to actually terminate; this—like all actor operations—is non-blocking, the termination notice from the last killed child will effect the progression to the next step
  4. create new actor instance by invoking the originally provided factory again
  5. invoke postRestart on the new instance (which by default also calls preStart)
  6. send restart request to all children which were not killed in step 3; restarted children will follow the same process recursively, from step 2
  7. resume the actor
Note that you can optionally stop one child actor or more in step 3. In step 6 those children that were not explicitly terminated in step 3 will restart.

One more thing I noticed is that, when you return restart() on a preStart failure (ActorInitializationException), postStop is not called, even though step 2 says postStop is called. Take a look at the very bottom of the code below.
public static class DummySupervisor extends UntypedActor {
    private SupervisorStrategy supervisorStrategy;

    public DummySupervisor(SupervisorStrategy supervisorStrategy) {
        this.supervisorStrategy = supervisorStrategy;
    }

    @Override
    public SupervisorStrategy supervisorStrategy() {
        return supervisorStrategy;
    }

    @Override
    public void onReceive(Object o) throws Exception {
        // Do nothing
    }
}

public TestActorRef<DummySupervisor> generateDummySupervisor(SupervisorStrategy supervisorStrategy) {
    Props props = Props.create(DummySupervisor.class, () -> new DummySupervisor(supervisorStrategy));
    return TestActorRef.create(actorSystem, props, "dummy_supervisor-" + randomGenerator.nextInt(1000));
}

@Test
public void shouldPostStopNotBeCalledOnPreStartException() throws Exception {
    // WorthlessActor (not shown) is assumed to be a trivial UntypedActor with an empty onReceive.
    List<WorthlessActor> actors = new ArrayList<>();
    // Prep a supervisor that always tries to restart
    SupervisorStrategy myStrategy = new OneForOneStrategy(3, Duration.create("1 minute"), t -> {
        return restart();
    });
    DummySupervisor dummySupervisor = generateDummySupervisor(myStrategy).underlyingActor();

    // Create child actor
    Props worthlessActorProps = Props.create(WorthlessActor.class, () -> {
        WorthlessActor actor = spy(new WorthlessActor());

        // Throw exception on preStart
        doThrow(new Exception()).when(actor).preStart();

        actors.add(actor);

        return actor;
    });
    dummySupervisor.getContext().actorOf(worthlessActorProps);

    Thread.sleep(50);
    assertThat(actors).hasSize(4);

    // They NEVER call postStop so we have to do some clean up when it fails in the middle of preStart().
    verify(actors.get(0), never()).postStop();
    verify(actors.get(1), never()).postStop();
    verify(actors.get(2), never()).postStop();
    verify(actors.get(3), never()).postStop();
}
Actually postStop() is called when stop() is returned on preStart failure, though.

Resume

Resume is pretty straightforward: it just lets the failing actor resume its task. You might simply want to leave a log entry here.

Stop (Terminate)

Along with restart, the most important option to note is stop. This stops the failing actor. The important thing is that actor termination also occurs in regular operation, such as when an actor finishes its task and is no longer needed. When stop is selected, it follows the same steps as a regular termination; the details are described in the Termination section below.

Escalate

When a supervisor can not handle its child's failure, the supervising actor may fail itself and let its own parent, the grandparent of the failing actor, take care of it. When the exception is escalated all the way up, the last strategy to be applied is the Stopping Strategy.

Termination

Actor termination basically occurs on three occasions:
  • As a part of an actor restart (the old actor is terminated)
  • When the supervisor decides to stop the failing actor
  • When the actor finishes its task and getContext().stop(targetActorRef) is called
In any case, the steps below are followed:
  1. The stopping actor's postStop is called
  2. Watching actors get a Terminated message
So what does "watching actor" in step 2 mean? Besides the supervising (parent) actor, actors may "watch" other actors. When you call getContext().watch(anotherActorRef), the calling actor starts to subscribe to anotherActorRef's termination. When anotherActor stops, a Terminated message is passed to its parent and watching actors. This is called Death Watch.

You must remember that, when you receive a Terminated instance, you can access the terminating actor via Terminated#actor. BUT this is just an ActorRef instance, so you can not know what type of actor is hiding under the hood.
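
Below is a minimal death-watch sketch; ChildActor is a hypothetical, trivial actor used only for illustration.
import akka.actor.ActorRef;
import akka.actor.Props;
import akka.actor.Terminated;
import akka.actor.UntypedActor;

public class Watcher extends UntypedActor {
    private final ActorRef child = getContext().actorOf(Props.create(ChildActor.class), "child");

    @Override
    public void preStart() {
        getContext().watch(child); // subscribe to the child's termination
    }

    @Override
    public void onReceive(Object message) throws Exception {
        if (message instanceof Terminated) {
            // Terminated#actor is just an ActorRef; the concrete actor type is hidden.
            ActorRef terminated = ((Terminated) message).actor();
            getContext().stop(getSelf());
        } else {
            unhandled(message);
        }
    }
}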

Another important thing is that supervision-related messages are sent and stored in a different mailbox than the usual one, so the message reception order is not guaranteed to match the order of event occurrence.

Stateful Actor

As you already saw, an actor's state gets lost on restart, since the actor system replaces the failing actor with a new actor instance. When you need a stateful actor, you can use UntypedPersistentActor instead of UntypedActor. To store the state, you can configure which storage plugin to use.

However, to store data casually and locally, I prefer to create a class that caches data. Remember that the same arguments are passed on restart, so with the code below the same FooCache instance is passed to the new MyActor instance. Before employing UntypedPersistentActor, I would re-think whether it is really required. You will want to keep your actors simple, so creating a simple cache class, or adding another layer to transfer and store data, should be considered first.
FooCache cache = new FooCache();
// The Props factory captures the cache, so the same instance is handed to every new MyActor on restart.
Props.create(MyActor.class, () -> new MyActor(cache));

What I have learned

The single most important thing I have learned is that we should keep actors small and simple. As your actors become complex, the supervisor strategy and tests become much more complex. This is clearly stated in the document, and I think it is the most important basic, so let us keep it in mind.
The quintessential feature of actor systems is that tasks are split up and delegated until they become small enough to be handled in one piece. In doing so, not only is the task itself clearly structured, but the resulting actors can be reasoned about in terms of which messages they should process, how they should react normally and how failure should be handled. If one actor does not have the means for dealing with a certain situation, it sends a corresponding failure message to its supervisor, asking for help. The recursive structure then allows to handle failure at the right level.

Jul 25, 2015

[Python] Trying out PEP0484 Type Hinting, ABCMeta, and PyCharm

After my wife gave birth to our first-born daughter, I started a new Python project and named it after this baby girl. This was a brand new project and did not require me to think about backward compatibility, so I decided to use a later version of Python and challenge myself to employ some ideas that I liked while learning Java: type safety, private properties, explicit overriding with @Override, and organized inheritance. These are things that make me feel safer and more comfortable.
Python itself does not actually support them, but I found some can be achieved to a certain level with the help of modules and an IDE.

Type Hinting

I just realized that PEP 0484 was accepted in May. This kind of type hinting was already available with docstrings, but PEP 0484 employs PEP 3107 style annotations to achieve it, and it is now officially a part of Python 3.5 and later. Python 2.7 and 3.2+ can still use the backported typing module to implement it. I knew there was a constructive discussion about readability, introduced in "Type hinting in Python: focus on readability," but I decided to give this a try.
Utilizing this type hinting, however, included another challenge for me. Since jedi-vim did not support it yet, I had to switch from Vim to PyCharm. PyCharm's support is still limited, but basic features are covered. With the help of the IdeaVim plugin, the switching cost was much lower than I had expected.

Installing typing module

Installing typing was a bit tricky. The regular pip command, `pip install typing`, somehow broke and ended up installing an empty ver.0.0, so I had to set the version explicitly with `pip install -lv typing==3.5.0b1`. This worked O.K.
By the nature of PEP 0484, type checking is not done at runtime, so the type hinting is mostly for static checks by third-party modules or IDEs. Therefore I found it really important to know an IDE's current support level. The worst case scenario is that you think your IDE supports PEP 0484 perfectly, you mis-define or mis-call a function, but your IDE does not actually check it correctly, so you end up with no warnings. Here are some limitations I have found so far.

Some limitation with PyCharm

Type aliases

Using type aliases to declare a complex type in one place, or to give a more meaningful type name, seems to be a pretty good idea, but support for this is limited in the current version of PyCharm.
See the capture below. PyCharm correctly gives warnings when the argument declaration has no type alias, or has a simple alias of built-in types, but gives no warning for a type alias built from the customized types provided by the typing module.


Just like other warnings, you can click on the yellow marked part to see the detailed description.


The tricky thing is that you want to use type aliases for complex declarations to avoid mis-calling or mis-declaring, but this ironically leads to bugs, since the use of type aliases eliminates the chance to see the proper warnings. I stored all complex declarations in types.py and used them in other modules, then found out it did not work with PyCharm. We must keep this in mind.
It will be supported in PyCharm ver.5.0, though. Let us look forward to it.
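
For reference, the kind of alias in question looks roughly like this (a minimal sketch):
from typing import Dict, List

# A simple alias of a built-in type: misuse was correctly flagged by PyCharm.
SimpleAlias = dict

# An alias built from typing-provided types: misuse drew no warning at the time.
ComplexAlias = Dict[str, List[int]]


def count_scores(scores: ComplexAlias) -> int:
    return sum(len(v) for v in scores.values())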

Abstract collection types

Although PEP 0484 recommends using the abstract collection types for arguments, as quoted below, such declarations result in warnings.
Note: Dict , List , Set and FrozenSet are mainly useful for annotating return values. For arguments, prefer the abstract collection types defined below, e.g. Mapping , Sequence or AbstractSet .

I guess this is simply not supported yet, just like the previous one, but as long as you get warnings you have a chance to notice something is happening and the option to ignore it. So I am not as worried as I am about the type alias ignorance above.

@overload

With type hinting, function signatures became more detailed compared to *args and **kwargs, so I became more interested in what the typing module's @overload decorator would do for us.
It did not take long to find that this decorator can only be used in stub files with the .pyi extension, while the actual declaration and implementation stay in .py as usual. In other words, even though the typing module provides this decorator, we can not declare same-named functions with different signatures in regular code.
Look at the signature below. It tells us that __getitem__ receives an int or a slice as its argument, and returns an int or bytes as its return value. BUT you can not be sure which of int or bytes is returned when you pass an int.
def __getitem__(self, a: Union[int, slice]) -> Union[int, bytes]:
    if isinstance(a, int):
        # Do something
        pass
    else:
        # Do something 
        pass

In such case, in stub file, we can declare separately with @overload decorator like below:
@overload
def __getitem__(self, i: int) -> int: ...

@overload
def __getitem__(self, s: slice) -> bytes: ...

This makes it more obvious that an int argument returns an int while a slice argument returns bytes. Since I was not going to need such use cases, and preparing a separate .pyi file seemed like a bit of a pain, I did not use this anyway.

My feedback about type hinting

As I employed type hinting with the typing module, I became more cautious about passing arbitrary arguments. Especially for public methods, I stopped using dict, *args, and **kwargs. Instead, I started to define JavaBean-like classes to hold those values, use the name mangling strategy to provide private-like properties, and define getter methods with @property. As a happy side effect, I enjoy stronger code completion wherever PyCharm supports type hinting. Now I feel more comfortable and safe.
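
A minimal sketch of such a value-holder class might look like this:
class TodoArgs:
    """Holds command arguments instead of passing dict/*args/**kwargs around."""

    def __init__(self, description: str, due: str) -> None:
        # Name mangling (double leading underscores) gives private-like properties.
        self.__description = description
        self.__due = due

    @property
    def description(self) -> str:
        return self.__description

    @property
    def due(self) -> str:
        return self.__due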

Structured/Organized inheritance

As I learned Java over several days, I started to like the ideas of interfaces, the default modifier, abstract classes, the @Override annotation, etc. Those things make me more comfortable in terms of safer design, even though it is a bit tiring to think about design in such detail.
I found that ABCMeta and type hinting can support some of them.

ABCMeta

With the help of ABCMeta, we can create a class with abstract methods and abstract properties that obligate the inheriting class to override them, and we can create default methods that may be overridden. This is a bit like a Java interface that has methods with abstract or default modifiers.

@abstractmethod and @abstractproperty

Methods and properties with these decorators obligate the inheriting class to override them. I am really comfortable with PyCharm's assistance, which gives me the option to override the abstract methods.


It automatically declares the function with the same signature, which I think is pretty neat. So when I want to define behavior, I like to use ABCMeta with @abstractmethod.
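
As a minimal sketch, defining behavior with ABCMeta and @abstractmethod looks like this:
from abc import ABCMeta, abstractmethod


class Animal(metaclass=ABCMeta):
    @abstractmethod
    def cry(self) -> str:
        """Inheriting classes are obligated to override this."""

    def greet(self) -> str:
        # A default method that inheriting classes may override.
        return "Hi, I say %s." % self.cry()


class Dog(Animal):
    def cry(self) -> str:
        return "bow-wow"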

@Override

In Java we have the @Override annotation, but Python does not provide an @override decorator. I missed it at first, but I found it somewhat O.K., since PyCharm gives us warnings for wrongful overriding. Why *somewhat*? It is safer now that PyCharm warns about mis-overriding, but we still miss the explicitness of indicating whether a declared method overrides an existing one or extends the class with a new one.

My feedback about abc module

After I started using abc.ABCMeta, I feel more comfortable about declaring behavior in a Java-interface-like class. It is still more casual, but it is nice that we can obligate the inheriting class to behave in a certain way by overriding the designated abstract methods.

Conclusion

When I work on personal projects, it is not uncommon for a project to be left alone for a relatively long time. When I later look into the implementation for maintenance, I do not remember much about it. So I feel more comfortable when things are organized and explicitly declared.
That said, the things I tried this time are pretty neat, and I am looking forward to the next version of PyCharm.

Apr 5, 2015

Added v2.3 support to Facebook::OpenGraph

Last weekend I added Graph API v2.3 support to Facebook::OpenGraph and released it to CPAN. Basically this module does not require a lot of code change to support different API versions, because it focuses on minimal implementation with maximum convenience. When I designed it, I did not want to require developers to learn both the Graph API and this module's specification; client modules should be as thin as possible so developers do not have to be conscious of their existence.
This time, however, Graph API v2.3 has a major change in the /oauth/access_token response. It used to return a URL-encoded query string, but returns a JSON object as of this latest version. So I added several new methods and some code changes as below:
  • Now Facebook::OpenGraph::Response has methods for Graph API version comparison
  • Facebook::OpenGraph checks API version, and if it is higher than v2.3, it parses the response body as JSON object
These version comparison methods are handy. $response->api_version returns the API version that the Graph API reports, which I think is safer than using the version given by developers; the designated version and the actually experienced version may sometimes differ, as described in the document:
we now return an HTTP response header called 'facebook-api-version' which indicates which API version your app is actually experiencing - this may be different to the API version you specify in your request due to upgrades.
I have not been creating brand new Facebook apps lately, so there might be some edge cases that I am missing. I will be happy to have your feedback on GitHub.

Jan 18, 2015

Progress Report: Add mbed AudioCODEC (TLV320AIC23B) support to Raspbian

Lately, on the last Cyber Monday, I purchased a new Raspberry Pi B+ and some peripherals from Adafruit. I already had a RasPi B and was fairly satisfied with my previous camcorder project, but cool features such as the HAT concept and the improved electrical stability seemed attractive to me. So I decided to migrate my camcorder project to the RasPi B+ and, hopefully, add audio support.
Here are some features my camcorder previously offered:
  • GPS logging
  • Live preview on PiTFT with current driving speed and recording status
  • Video recording
My first plan for adding an audio recording feature was to connect an electret microphone amplifier via an ADC, but that turned out to be a bad idea; the combination of an ADC, an I2C connection, and Linux (a time-sliced OS) can not provide a sufficient sampling rate of 40 kHz or more.
I decided to use my mbed AudioCODEC instead. The latest Raspbian kernel does not include support for this device, so I had to go through a lot of searching. What I found really helpful were koalo's article and jasaw's project. With their help, what I have done so far is 1) adding the device driver, 2) cross-compiling the Raspbian kernel, 3) applying the modification to the running B+, and 4) wiring. I am going to describe each step to show what I did and how things failed.

Add device driver

There are multiple ways to add a device driver. You may just build the required files and then install them; I think that is the fastest way, because you never have to compile the whole Raspbian kernel. However, for my own understanding and later convenience, I chose to add the required files to their proper locations, modify Kconfig and Makefile, and cross-compile the entire Raspbian kernel. All modifications can be found here, and they are self-explanatory.

Cross compile Raspbian kernel

To avoid confusion regarding the gcc version issue and OSX's case-sensitivity problem, I prepared a new Debian environment on VirtualBox. This way, if anything happens, I can just get rid of the environment and start all over again.
First I cloned my repository to $HOME/dev/oklahomer/linux/, and the tools repository to $HOME/dev/raspberrypi/tools/. Then I exported some environment variables as below:
➜ linux git:(feature/mbed_support) export CCPREFIX=$HOME/dev/raspberrypi/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64/bin/arm-linux-gnueabihf-
➜ linux git:(feature/mbed_support) export MODULE_TEMP_PATH=~/work/modules
Then run `make`. The output of `make menuconfig` is located here.
➜ linux git:(feature/mbed_support) make ARCH=arm CROSS_COMPILE=${CCPREFIX} menuconfig
➜ linux git:(feature/mbed_support) make ARCH=arm CROSS_COMPILE=${CCPREFIX} -j4
It then complains about the absence of GLIBC_2.14 as below:
/home/oklahomer/dev/raspberrypi/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64/bin/arm-linux-gnueabihf-gcc: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.14' not found (required by /home/oklahomer/dev/raspberrypi/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64/bin/arm-linux-gnueabihf-gcc)
Follow the instructions described here, and install glibc >= 2.14.
  1. Add following lines to /etc/apt/sources.list
    deb http://ftp.iinet.net.au/debian/debian wheezy main contrib non-free
    deb http://ftp.iinet.net.au/debian/debian wheezy-backports main
    deb http://ftp.iinet.net.au/debian/debian jessie main contrib non-free
  2. Add following lines to /etc/apt/preferences
    Package: *
    Pin: release a=testing
    Pin-Priority: 10
    
    Package: *
    Pin: release a=stable
    Pin-Priority: 900
  3. Install
    ➜ linux git:(feature/mbed_support) sudo apt-get install -t testing libc6-dev
Try again.
➜ linux git:(feature/mbed_support) make ARCH=arm CROSS_COMPILE=${CCPREFIX} -j4
➜ linux git:(feature/mbed_support) make ARCH=arm CROSS_COMPILE=${CCPREFIX} INSTALL_MOD_PATH=${MODULE_TEMP_PATH} modules
➜ linux git:(feature/mbed_support) make ARCH=arm CROSS_COMPILE=${CCPREFIX} INSTALL_MOD_PATH=${MODULE_TEMP_PATH} modules_install
Prepare files to scp.
➜ linux git:(feature/mbed_support) find ./ -name zImage
./arch/arm/boot/zImage
➜ linux git:(feature/mbed_support) mv arch/arm/boot/zImage ~/work/.
➜ linux git:(feature/mbed_support) cd ~/work/
➜ work tar czf modules.tar.gz modules
➜ work  ls -l
total 15576
drwxr-xr-x 3 oklahomer oklahomer     4096 Jan  4 14:36 modules
-rw-r--r-- 1 oklahomer oklahomer 12688435 Jan  4 15:20 modules.tar.gz
-rwxr-xr-x 1 oklahomer oklahomer  3254856 Jan  4 15:14 zImage
Finally send zImage and modules.tar.gz to RasPi B+.

Apply changes

SSH login to RasPi B+.
  1. Place modules
    cd /tmp
    tar xzf modules.tar.gz
    cd /lib
    mv modules modules_org
    mv /tmp/modules/lib/modules /lib
    chown -R root:root /lib/modules
  2. Place kernel image
    cd /boot
    mv kernel.img kernel.img.org
    cp /tmp/zImage kernel.img
  3. Reboot
  4. Check update/upgrade
    1. sudo apt-get update
    2. sudo apt-get upgrade
  5. Add modules to /etc/modules
    snd-bcm2835
    
    # i2c related modules are required for i2s
     i2c-bcm2708
     i2c-dev
    
    # for i2s
    snd_soc_bcm2708_i2s
    bcm2708_dmaengine
    
    # for mbed AudioCODEC
    snd_soc_tlv320aic23
    snd_soc_rpi_mbed
  6. Reboot
  7. Check i2cdetect.
    It seems O.K. to me, since UU on 0x1b indicates that the address is reserved by the kernel. At least I thought so.
  8. Check aplay.
  9. Check arecord.

Wiring

The RasPi B+ has a different GPIO pin layout than the older model. For the mapping, I referred to this PDF.
I am not 100% sure about the mapping below, and I think it has something to do with the problems I am going to describe later.
mbed AudioCodec   |     Raspberry Pi
----------------- +---------------------
    BCLK   (I2S)  |       P5 - 03
    3V3           |       3V3
    DIN    (I2S)  |       P5 - 06
    DOUT   (I2S)  |       P5 - 05
    SCLK   (I2C)  |       P1 - 05
    SDIN   (I2C)  |       P1 - 03
    GND           |       GND

Problems

Here is the output of `dmesg` that confuses me:
[    5.067470] i2c i2c-1: Failed to register i2c client tas5713 at 0x1b (-16)
[    5.338188] i2c i2c-1: Can't create device at 0x1b
[   11.813418] snd-rpi-mbed snd-rpi-mbed.0:  tlv320aic23-hifi <-> bcm2708-i2s.0 mapping ok
[   11.933579] tlv320aic23-codec 1-001b: ASoC: Capture Source DAPM update failed: -5
The entire output is located at this gist. I am not sure if it is the direct cause, but `aplay` does not give me any sound at all. I am going to update this post or write a new one when I make any progress.