Mar 21, 2014

Ideas on utilizing Teng#new_row_from_hash

This article is basically a follow-up to my previous article, Teng::Row and data2row.
As of Teng Ver. 0.21, data2row is implemented as Teng#new_row_from_hash. I already introduced one way I use this method in my previous article, so here I am going to introduce 2 other usages.

Examples are shown at the bottom.

The first example shows how to cache column values and then create a table row object from the cached values. Using Devel::Size, I compared the size of a row object with the size of its column values and found that the row object is much bigger. So my idea is to cache only the column values to minimize the cache size, and then create the table row object from the cached values.
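
The caching idea can be sketched roughly as follows. This is only an illustration, not the example from this post: it assumes Teng 0.21 or later and DBD::SQLite are installed, and the My::DB classes, the campaign table and the plain hash standing in for a cache are all hypothetical.

```perl
use strict;
use warnings;
use 5.014;    # for the package BLOCK syntax

# Hypothetical Teng subclass and schema, declared inline for the sketch.
package My::DB {
    use parent 'Teng';
}

package My::DB::Schema {
    use Teng::Schema::Declare;
    table {
        name 'campaign';
        pk 'id';
        columns qw(id name);
    };
}

package main;

my $teng = My::DB->new(
    connect_info => [ 'dbi:SQLite:dbname=:memory:', '', '' ],
);
$teng->do(q{CREATE TABLE campaign (id INTEGER PRIMARY KEY, name TEXT)});

my $row = $teng->insert( campaign => { id => 1, name => 'spring' } );

# Cache only the plain column values, not the row object itself.
# A real service would use memcached or similar instead of this hash.
my %cache = ( 'campaign:1' => $row->get_columns );

# Later: rebuild a row object from the cached column values.
my $restored = $teng->new_row_from_hash( 'campaign', $cache{'campaign:1'} );
print $restored->name, "\n";
```

The point of the design is that the cache holds a small hash reference, while everything else a row object carries is recreated on demand.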

The second example shows how I use a table row object's method by creating a temporary object. In this case I upload an image to a certain path, and the path is generated from the filename and campaign_type. I didn't want to generate the path in 2 places. Instead I wanted to generate it in one place, so that I won't mess things up by modifying one and forgetting to modify the other.

I like the way Teng makes this kind of thing easier.


Feb 16, 2014

Benchmarked each of Sort::Maker's sort styles for my specific purpose

I use various internal and external APIs to build my services. One external API returns content like the following:
{
    "status": 1,
    "error_msg": "",
    "result": {
        "foo1": { .... },
        "foo2": { .... },
        "foo3": { .... },
        "foo4": { .... }
    }
}
After retrieving this, I have to sort the result in result.foo1, result.foo2, ..., result.foo100 order. I wonder why they don't return this as an array, but I have to live with it as long as the provider returns values this way. Although Perl Best Practices recommends using Sort::Maker, I implemented the sort my own way because it wasn't that complicated.
my %tmp_cache; # for orcish maneuver
my @sorted_keys = sort {
    ( $tmp_cache{$a} //= $a =~ s/\A foo(\d+) \z/$1/xr ) <=>
    ( $tmp_cache{$b} //= $b =~ s/\A foo(\d+) \z/$1/xr )
} keys %{ $content->{result} };

# then sorted results are stored in stash to be displayed
$c->stash->{results} = [ map { $content->{result}->{$_} } @sorted_keys ];
Today, I benchmarked each sort style provided by Sort::Maker and found that using Sort::Maker was much faster. The code is as below. It was a bit surprising that my original method with the orcish maneuver is 13% slower than Sort::Maker's own orcish maneuver. Needless to say, the other sort styles are much faster. So my conclusion is that even a relatively simple sort should be implemented with Sort::Maker to improve readability, maintainability and performance.
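
For anyone who wants to try a comparison of this kind with core modules only, here is a minimal sketch. It is not the benchmark from this post and does not use Sort::Maker. It hand-writes two of the styles Sort::Maker can generate, the orcish maneuver and the Schwartzian transform. The %result data, the sub names and the key count are assumptions made for illustration, and timings will vary by machine.

```perl
use strict;
use warnings;
use 5.014;    # for the s///r modifier
use Benchmark qw(cmpthese);
use List::Util qw(shuffle);

# Hypothetical stand-in for the API response: keys foo1 .. foo100.
my %result = map { ( "foo$_" => { id => $_ } ) } 1 .. 100;
my @keys   = shuffle keys %result;

# Orcish maneuver: cache each extracted number the first time it is needed.
sub sort_orcish {
    my %cache;
    return sort {
        ( $cache{$a} //= $a =~ s/\A foo(\d+) \z/$1/xr ) <=>
        ( $cache{$b} //= $b =~ s/\A foo(\d+) \z/$1/xr )
    } @_;
}

# Schwartzian transform: extract every number once, sort, then unwrap.
sub sort_st {
    return map  { $_->[0] }
           sort { $a->[1] <=> $b->[1] }
           map  { [ $_, /\A foo(\d+) \z/x ] } @_;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    orcish => sub { my @sorted = sort_orcish(@keys) },
    st     => sub { my @sorted = sort_st(@keys) },
} );
```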

Feb 9, 2014

Released Lingua::PigLatin::Bidirectional Ver.0.01

The other day I talked about Lingua::PigLatin and how it works. That module really helped me understand how Pig Latin works, but it only handles English-to-Pig Latin translation. So I created Lingua::PigLatin::Bidirectional. As the name implies, it translates English sentences to Pig Latin, and vice versa.
use Lingua::PigLatin::Bidirectional;
 
warn to_piglatin('hello');     # ellohay
warn from_piglatin('ellohay'); # hello
My IRC bot uses this module, so when I want to pig-latinize some sentences or check whether my pig-latinization is correct, I talk to the bot and it returns the translated sentences.

Feb 2, 2014

How we should teach people to eat soba noodles

The experience of eating soba is totally different from that of eating spaghetti. It's not just the difference between forks and chopsticks. It also involves a difference in table manners, so understanding and mastering how to eat soba can be an indicator of how well someone understands Japanese culture. Then how should I explain how to eat soba to westerners, who grew up being taught not to make noise while eating pasta?
Lately I found a good article, "The sound makes the experience," which approaches this problem from both cultural and technical aspects. It explains that making noise is a very polite gesture that lets the cook know how much you are enjoying your meal. And then it describes the technique as follows:
For me, the best way to conceive of the proper slurping technique is like when you are eating a very hot slice of pizza. You take a small bite, and because it's hot, you start to suck in air while chewing. This allows you to eat the food while it is still very hot, while you are breathing in. This is the same manner in which you should eat soba in Japan.
This helped me a lot. For Japanese people like me, it is very difficult to teach this kind of basic skill because we all acquire it during childhood and do not remember how we learned it. Even worse, we can't understand why others can't do it.
This leads me to the conclusion that Japanized foreigners are better teachers than we Japanese are. I'll keep up with such foreign media to learn how I should teach Japanese culture.

Why I think Glocks are not for everyone... especially sport shooters

When I say 'I hate Glocks' to my friends, I'm not just talking about their plastic frames and Good Ol’ ‘merica. I'm 28, and I don't think I'm old enough to say that. I'm not even American, after all. Actually, I know Glocks have some cool features, and that's why I carefully say 'Glocks are not for everyone.' I'm going to explain why I think so.

Premise: Its Uniqueness


Glock's company history is briefly introduced on a web page, Timeline | GLOCK USA. The company started out as a manufacturer of plastic and steel parts and then, in the 1970s, moved into military products, including knives. When it started manufacturing guns after 1980, it brought its own nylon-based polymer, Polymer 2, to the gun industry. The result was the Glock 17. So its origins and design policy are totally different from those of gun manufacturers with long histories, such as Colt, Winchester and Smith & Wesson. This critical difference shows in weight balance and safety mechanism.

Weight Balance

When I lived in Oklahoma -- BTW, that's why I call myself Oklahomer -- I visited H&H Shooting Sports every other week for sport shooting. My favorite full-size handgun was the Sig Sauer P226, and my second favorite was the Colt M1911. Compared to those 2 handguns, Glocks are extremely light. The weight of the cartridges in the magazine doesn't differ much, so the lightness of the front half stands out. That makes it difficult to handle the recoil and to aim the second shot.
I know lightness is important for those who must carry a gun on a daily basis, such as police officers, but it doesn't benefit me.

Its Mechanical Simplicity and Handling Complexity


Glock's safety mechanism, which is called the Safety Action system, is pretty simple and unique. It's all about the trigger, and it doesn't include anything like the M1911's manual and grip safeties or the P226's decocking lever, which I think is the biggest difference concerning safety.
I understand that this Safety Action system is a reliable mechanism, but it's just a mechanism. *WE*, humans, make mistakes. To avoid an accidental discharge, I believe we need a decocking lever or at least a cocking indicator. Of course it should be O.K. as long as the shooter, such as a law enforcement officer, handles only Glocks and has enough time for continual training. Others, like sport shooters, who handle various guns and can't afford to train on a daily basis, should I think choose guns with a more common safety mechanism that involves a cocking indicator and a decocking lever.

Conclusion

As I described above, Glocks are unique in terms of their lightness and safety mechanism. I believe this can benefit law enforcement officers, who have adequate training and need portability, but it can be a disadvantage for occasional sport shooters.
By the way I love Gunny from Full Metal Jacket, lol.

Jan 21, 2014

Convert English Sentence to Pig Latin with Perl

Every once in a while I hear people talk Pig Latin in movies. I repeat the line many times in my mind and try to work out the original, but it takes time because English is my second language and I'm not used to the transformation. So I came up with this idea: look up Wikipedia and related linguistics articles and create a Perl module that pig-latinizes given sentence(s). That should help me understand the rules.
Before launching vim, I searched CPAN for a similar module, and it didn't take long to find what I wanted. Lingua::PigLatin converts a given sentence to Pig Latin with the simple regular expression below.
    s/\b(qu|[cgpstw]h # First syllable, including digraphs
    |[^\W0-9_aeiou])  # Unless it begins with a vowel or number
    ?([a-z]+)/        # Store the rest of the word in a variable
    $1?"$2$1ay"       # move the first syllable and add -ay
    :"$2way"          # unless it should get -way instead 
    /iegx; 
Since my goal was to understand the rules of this game through coding, I read through what this regular expression does. I'm not a regular expression expert and it was a bit difficult to understand at first glance, so, with the help of Perl Best Practices, I reformatted it as below to make it easier for me to read.
s{\b                   # See if each given word starts with...
    (   qu             # 1. qu (e.g. question => estionquay)
      | [cgpstw]h      # 2. digraphs
      | [^\W0-9_aeiou] # 3. any "word" character other than 0-9, _ and vowels
    )?
    (                  # and followed by...                               
        [a-z]+         # alphabet character(s)
    )
}
{
    $1       # if the first rule applies
  ? "$2$1ay" # then append former part and add -ay,
  : "$2way"; # otherwise add -way
}iegx;
Now things became clearer. Here is what it does.
First, it looks at the beginning of every word, thanks to the \b at the very beginning of the expression, and checks whether the word matches any of the 3 rules below:
  1. starts with "qu"
  2. starts with digraphs such as ch, gh, ph, sh, th and wh to capture words like channel, shell and what
  3. starts with any word character other than 0-9, _ and vowels (AEIOU)
Second, it checks that the following characters are alphabetic.
If both the first and second steps match, the first part is moved to the end of the word, followed by -ay. If only the second step matches, it just puts -way at the end. If neither matches, it does nothing with the word.
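
The three outcomes can be checked with a small script. This is a sketch that wraps the substitution quoted above in a sub; the name pig_latinize and the sample words are made up here for illustration and are not part of Lingua::PigLatin.

```perl
use strict;
use warnings;

# pig_latinize is a hypothetical wrapper; the substitution inside it is the
# reformatted regular expression from this article.
sub pig_latinize {
    my ($text) = @_;
    $text =~ s{\b
        (   qu              # 1. qu
          | [cgpstw]h       # 2. digraphs
          | [^\W0-9_aeiou]  # 3. a single leading consonant
        )?
        ( [a-z]+ )          # the rest of the word
    }
    {
        $1 ? "$2$1ay" : "$2way";
    }iegx;
    return $text;
}

# One word per case: consonant, qu, digraph, vowel, and a non-word token.
printf "%-8s => %s\n", $_, pig_latinize($_)
    for qw(hello question shell apple 2014);
```

With these inputs, hello becomes ellohay, question becomes estionquay, shell becomes ellshay, apple becomes appleway, and 2014 is left untouched because no alphabetic run follows a word boundary.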

O.K. now I understand how Pig Latin works. But I have a new question. Do Americans really do this in their heads while they are just saying whatever randomly comes to mind? Can't believe it...

Dec 30, 2013

FQL: How to retrieve "furigana" for Japanese user name

Japanese Writing System

The Japanese language is unique, and its origin is still debated. Its writing system is also unique in that it has three scripts:
  • kanji -- ideographic characters borrowed from Chinese
  • hiragana -- phonetic characters, originally a simplified form of kanji
  • katakana -- phonetic characters, originally derived from components of kanji

Problem with Kanji

Kanji consists of more than 2,000 commonly used characters, while hiragana and katakana each consist of only 46 characters. This extremely large number of kanji is troublesome for most people.
One more tough thing about Japanese kanji is that each character usually has two or more pronunciations: the Chinese-derived pronunciation -- on-yomi(音読み) -- and the native Japanese pronunciation(s) -- kun-yomi(訓読み). So the problem is that when kanji characters are combined and used in personal names, we can't really tell which pronunciation to use.

Hiragana and Katakana as Reading Aid

With experience, people can tell how to pronounce commonly used personal names, but determining each pronunciation programmatically requires a large dictionary and seems almost impossible. In most cases, since hiragana and katakana are phonograms, we use them to indicate the pronunciation. When hiragana or katakana is explicitly used for this purpose, it is called furigana(フリガナ). So most user registration systems require users to input furigana along with their original kanji names.

Retrieving Furigana with FQL

In 2012, I implemented Facebook social login in my service and found it difficult to retrieve the furigana of the registering user's name. It was really frustrating. Using social login should simplify both the user experience and the source code, but if we can't retrieve furigana, we have to make users input it manually, which I think ruins the user experience in the first place. I went through the whole Facebook Graph API and FQL documentation and finally found a way to do it.
There are columns called sort_first_name and sort_last_name on the user table, and these columns return furigana for the first name and last name. The minimum query is as below:
SELECT first_name, sort_first_name, last_name, sort_last_name,name FROM user WHERE uid = 44007581
Depending on your Facebook app settings, you must add the locale=ja_JP parameter to the request.
curl -X GET 'https://graph.facebook.com/fql?q=SELECT+first_name%2C+sort_first_name%2C+last_name%2C+sort_last_name%2Cname+FROM+user+WHERE+uid+%3D+44007581&locale=ja_JP'
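
If you would rather build such a URL in code than escape it by hand, something like the following sketch works with core Perl only. The helper names percent_encode and build_fql_url are hypothetical, and it encodes spaces as %20 rather than the + used in the curl example above.

```perl
use strict;
use warnings;

# Percent-encode every character outside the RFC 3986 "unreserved" set.
# Assumes ASCII input, which is enough for the query used here.
sub percent_encode {
    my ($str) = @_;
    $str =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $str;
}

# Hypothetical helper: the FQL query goes in "q", extra parameters follow.
sub build_fql_url {
    my ( $query, %params ) = @_;
    my @pairs = ( 'q=' . percent_encode($query) );
    push @pairs,
        map { percent_encode($_) . '=' . percent_encode( $params{$_} ) }
        sort keys %params;
    return 'https://graph.facebook.com/fql?' . join '&', @pairs;
}

my $url = build_fql_url(
    'SELECT sort_first_name, sort_last_name FROM user WHERE uid = 44007581',
    locale => 'ja_JP',
);
print "$url\n";
```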
Be careful: if the user hasn't registered a Japanese name, the sort_*_name columns return Latin characters. This mostly happens with Japanese users who registered back when Facebook was served only in English, since they may not have registered kanji and furigana.
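
One way to guard against that case is to check that the returned value is written in katakana before trusting it as furigana. The helper below is a hypothetical sketch. It assumes furigana comes back in katakana, as in the sample output at the end of this article, and would reject hiragana.

```perl
use strict;
use warnings;
use utf8;

# looks_like_furigana is a hypothetical helper: true only when the value is
# written entirely in katakana. U+30FC is the prolonged sound mark, listed
# explicitly because older Perls do not count it as \p{Katakana}.
sub looks_like_furigana {
    my ($value) = @_;
    return 0 unless defined $value;
    return $value =~ /\A[\p{Katakana}\x{30FC}]+\z/ ? 1 : 0;
}

# A Latin-script fallback is rejected, a katakana reading is accepted.
for my $name ( 'ハギワラ', 'Hagiwara' ) {
    my $label = looks_like_furigana($name) ? 'furigana' : 'ask the user';
    print "$label\n";
}
```

When the check fails, the registration form can fall back to asking the user for furigana manually.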

Below is the code I used.
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
use Facebook::OpenGraph;
use Data::Dumper;
use Data::Recursive::Encode;

my $fb  = Facebook::OpenGraph->new;
my $ret = $fb->fql('SELECT first_name, sort_first_name, last_name, sort_last_name,name FROM user WHERE uid = 44007581');
$ret = Data::Recursive::Encode->encode_utf8($ret || +{});
warn Dumper $ret;
#$VAR1 = {
#          'data' => [
#                      {
#                        'sort_first_name' => 'ゴウ',
#                        'name' => '萩原 豪',
#                        'first_name' => '豪',
#                        'last_name' => '萩原',
#                        'sort_last_name' => 'ハギワラ'
#                      }
#                    ]
#        };