Flag #12 - Building Personalized Maid Assistant for Dummies

Iskandar Setiadi

Today is January 1, 2100. Artificial Intelligence (AI) has taken over 99% of the jobs in the world. Every corner of the world can be reached in minutes with self-driving vehicles. People have lived on Mars since 2050, and journeys to other planets have shown fruitful results. Human beings are genetically modified for longer life expectancy and fewer diseases. The world's power is 100% renewable, drawn from heat, solar, wind, and water. Thanks to AI, every field of knowledge has advanced tremendously in the past 80 years.

Sitting in front of her butler robot, Alice listens to her butler's explanation of the history of humanity. The computer was introduced 150 years ago. The Internet was first developed 125 years ago and gained popularity around 100 years ago. From that point, AI received a lot of attention from scientists around the world, and AI began replacing some manual labor jobs 90 years ago. Ten years later, some executive and decision-making positions were also replaced by self-learning AI. It took only several decades before 99% of the jobs were fully replaced by AI. Alice wonders to herself: what is the purpose of her life now? Her butler, which is able to understand human emotion, senses Alice's concern and says, "Worry not, dear Alice. The vast sky above still hides a lot of secrets. The ecosystem on Mars is still as young as a tree's seed. And probably the most important one: throughout history, humans have tended to hold different opinions from each other, and conflict could not be avoided. To make things harder, they must now also learn to live in harmony with AI robots. You should go around the world, meet new people, and put yourself in their shoes. Building things is quite easy, but protecting existing things such as relationships is harder. There are still a lot of things to do out there."

Alice prays to her ancestors, "Thank you, Grandpa, Grandma. My life today would not be possible without all of your efforts. Thank you for a world without serious climate change issues, thank you for a world without nuclear and AI wars, and thank you for creating Artificial Intelligence with the most appropriate self-ethics*."

*Try the experiment out at http://moralmachine.mit.edu/

Perhaps "Artificial Intelligence" is one of the most popular IT keywords right now. In the past year alone, we saw Google DeepMind's AlphaGo land a victory against a 9-dan Go player. There are also Elon Musk with his self-driving Tesla cars, the political controversies around Cambridge Analytica, and Mark Zuckerberg's Jarvis assistant.

In December 2016, Mark Zuckerberg posted a note about building Jarvis as his personal challenge for 2016. Mark's Jarvis handles a lot of tasks around home automation. It sounds cool, right? AI is cool, but it is not necessarily complex, especially nowadays. A simple AI might contain nothing more than if..else conditionals, an approach famously known as an Expert System. Even when it does get complex, there are a lot of open-source tools out there, such as Microsoft BotBuilder (by Microsoft, duh) and Hubot by GitHub.

Personalized: The Requirements

To fill my New Year holiday, I finally decided to build a simple personalized assistant: "Maid-chan". Instead of building my own user interface, I could pick from many existing platforms: Slack, Skype, Facebook Messenger, etc. I chose a Facebook Messenger bot since I can share it with non-developer friends and Messenger already has client apps on mobile phones. The only problem with a Facebook Messenger bot is that it must pass Facebook's review before you can publish it. However, you can still invite close friends to use your bot as testers without releasing it to the public. The code, which is written in Python, is available via GitHub and the complete documentation here.

At this point, some people might ask, "Why do you need to build a personalized assistant instead of using existing ones?" One of the simplest reasons is self-learning. In addition, several features are too specific to be available in generic bots. For example, Maid-chan is designed to apply Primitive image filtering to each uploaded image. There are also a daily Kanji and Vocabulary for Japanese lessons and daily greetings with offerings. To sum up, Maid-chan has the following features, which I will explain in the next sections:

Maid-chan Architecture

Server-side Image Processing with Machine Learning

Instead of using client-side computational power, there are a lot of cases where we want to leverage server-side CPU. In the first experiment, I tried Primitive, a machine-learning based program (written in Go) that converts images into geometric primitives. Initially, I didn't create any background worker to handle users' images. However, the Facebook Messenger API endpoint complains if the bot doesn't send any response back within 20 seconds. Therefore, I decided to send a simple response, "しばらくねえ <3" (meaning: "Please wait, okay <3"), before handling the computation asynchronously.
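The "reply first, compute later" idea can be sketched with a queue and a worker thread from the Python standard library. This is only an illustration under my own assumptions: the names `jobs`, `process_image`, and `handle_webhook` are hypothetical, and the actual Maid-chan worker may be built differently.

```python
import queue
import threading

# Hypothetical in-process job queue; the real bot may use a different worker setup.
jobs = queue.Queue()
results = []

def process_image(image_url):
    # Placeholder for the slow step (for example, running Primitive on the image).
    return f"processed:{image_url}"

def worker():
    # Pull jobs one by one so the webhook handler never waits for them.
    while True:
        image_url = jobs.get()
        try:
            results.append(process_image(image_url))
        finally:
            jobs.task_done()

def handle_webhook(image_url):
    """Queue the slow work and return an acknowledgement right away."""
    jobs.put(image_url)
    return "しばらくねえ <3"

# Daemon thread: it stops automatically when the main program exits.
threading.Thread(target=worker, daemon=True).start()
```

The handler returns in microseconds, so the 20-second limit is never at risk; the finished image would then be sent to the user as a separate message once the worker is done.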

It takes 1 to 2 minutes to generate a PNG primitive image on a t2.micro EC2 instance. But a static image is not fun! Using ImageMagick, 3 PNG primitive images are combined to create a GIF result.
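A rough sketch of that pipeline, assuming the `primitive` and ImageMagick `convert` binaries are on the PATH (newer ImageMagick releases name the binary `magick`). The frame file names, the shape count, and the delay are my own illustrative choices, not values taken from the project.

```python
import subprocess

def primitive_command(src, dst, shapes=100):
    # Primitive CLI flags: -i input file, -o output file, -n number of shapes.
    return ["primitive", "-i", src, "-o", dst, "-n", str(shapes)]

def gif_command(frames, dst, delay=50):
    # ImageMagick: -delay is in hundredths of a second, -loop 0 repeats forever.
    return ["convert", "-delay", str(delay), "-loop", "0", *frames, dst]

def make_gif(src, dst, runs=3):
    """Render `runs` primitive PNGs from one source image and join them into a GIF."""
    frames = [f"frame_{i}.png" for i in range(runs)]  # hypothetical file names
    for frame in frames:
        subprocess.run(primitive_command(src, frame), check=True)
    subprocess.run(gif_command(frames, dst), check=True)
```

Since Primitive places its shapes with randomness, each run should give a slightly different picture, which is what makes the combined GIF appear to move.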

Primitive itself uses hill climbing and simulated annealing for image generation. The interesting part here is that you don't need to implement the machine learning part yourself!
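For the curious, hill climbing itself fits in a few lines. The toy below is not Primitive's code; it only shows the general technique on a single number: propose a small random change, and keep it only if the score improves. In an image program the "state" would be a shape and the score would measure how far the drawn image is from the original.

```python
import random

def hill_climb(score, start, neighbor, steps=1000, seed=0):
    """Generic hill climbing: accept a random neighbor only when it scores lower."""
    rng = random.Random(seed)
    best, best_score = start, score(start)
    for _ in range(steps):
        candidate = neighbor(best, rng)
        candidate_score = score(candidate)
        if candidate_score < best_score:  # lower score = closer to the goal
            best, best_score = candidate, candidate_score
    return best, best_score

# Toy problem: find a number close to 42, starting from 0.
target = 42.0
best, err = hill_climb(
    score=lambda x: abs(x - target),
    start=0.0,
    neighbor=lambda x, rng: x + rng.uniform(-1.0, 1.0),
)
```

Simulated annealing is a close relative that sometimes accepts a worse candidate early on, which helps it escape local optima.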

Chatbot with Machine Learning

The Maid-chan chatbot utilizes ChatterBot, a machine-learning based conversational dialog engine that generates responses based on collections of known conversations. It can converse in 3 languages: Bahasa Indonesia, English, and Japanese.

I created a custom corpus to allow cross-language conversation in a Maid-chan-ish dialect.


...
        [
            "Arigatou",
            "どういたしまして~"
        ],
        [
            "おねがいします",
            "Onee-chan makasete!"
        ],
        [
            "What is your name?",
            "My name is Maid-chan nanodesu~"
        ],
        [
            "How old are you?",
            "Hi..mi..tsu >_< Don't ask sensitive stuffs please"
        ],
        [
            "What is love?",
            "Love is simply an electrical bug in the human neural circuit"
        ],
        [
            "What is the meaning?",
            "意味わかない :("
        ],
...

ChatterBot updates its model based on conversation responses. At one point, Maid-chan became similar to Microsoft's Tay because of malicious responses from other users (read: my wicked friends). By using an open-source library such as ChatterBot, I don't need to implement natural language processing or machine learning myself; I can simply provide a custom-made corpus to ChatterBot.
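To give a feel for what "responding from a corpus" means, here is a tiny stand-in that uses only Python's `difflib`. It is not ChatterBot's algorithm (ChatterBot's matching and learning are more involved); it simply picks the known statement closest to the input and returns its paired response. The pairs come from the corpus excerpt above, and the `cutoff` value is an arbitrary choice of mine.

```python
import difflib

# A few (statement, response) pairs taken from the corpus excerpt above.
CORPUS = [
    ("Arigatou", "どういたしまして~"),
    ("What is your name?", "My name is Maid-chan nanodesu~"),
    ("What is love?", "Love is simply an electrical bug in the human neural circuit"),
]

def respond(message, corpus=CORPUS, cutoff=0.6):
    """Return the response paired with the closest known statement, or None."""
    statements = [statement for statement, _ in corpus]
    match = difflib.get_close_matches(message, statements, n=1, cutoff=cutoff)
    if not match:
        return None  # nothing similar enough in the corpus
    return dict(corpus)[match[0]]
```

A lookup like this never learns from users, so it cannot be corrupted the way a self-updating model can; the price is that it only knows what the corpus contains.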

Daily Scheduler (Good Morning & Night, Japanese lesson, RSS Aggregator)

One of the most interesting components of Maid-chan is its scheduler worker. Currently, more than 10 Maid-chan testers still use these features. The daily scheduler is similar to a cron job: it checks all available rules every 60 seconds.
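A cron-like loop of this kind can be sketched as follows. The rule format (a name plus a time of day), the example times, and the `send` callback are assumptions made for illustration; the project's actual rule schema may look different.

```python
import datetime as dt
import time

# Hypothetical rules: (name, time of day at which the rule becomes due).
RULES = [
    ("good_morning", dt.time(7, 0)),
    ("japanese_lesson", dt.time(13, 0)),
]

def due_rules(now, rules=RULES, already_sent=()):
    """Return the names of rules whose time has passed today and were not sent yet."""
    return [
        name for name, at in rules
        if now.time() >= at and name not in already_sent
    ]

def run_forever(send, interval=60):
    """Check every `interval` seconds; forget what was sent when the date changes."""
    today, sent_today = None, set()
    while True:
        now = dt.datetime.now()
        if now.date() != today:
            today, sent_today = now.date(), set()
        for name in due_rules(now, already_sent=sent_today):
            send(name)
            sent_today.add(name)
        time.sleep(interval)
```

Keeping the "is this rule due?" decision in a pure function such as `due_rules` makes it easy to test without waiting for the clock.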

First, Maid-chan behaves like a normal human: she might oversleep and send out late good morning messages. There are more than 10 variations of good morning messages, each accompanied by a unique image. Users can adjust Maid-chan's good morning and good night messages based on their sleeping patterns.

Second, Maid-chan is equipped with a dictionary of Kanji and Vocabulary used for the JLPT (Japanese Language Proficiency Test). Users can set their competency level and receive a "Kanji & Vocabulary of the Day".

Third, Maid-chan allows RSS aggregation based on user-defined RSS rules. There are also several predefined presets, since my friends' RSS rules and my own are much the same.

However, this feature also has an apparent problem: timezone handling. Since my friends reside in Europe, Southeast Asia, and East Asia, Maid-chan runs into timezone problems with several of the default rules. For example, the Japanese lesson is usually sent out around 01:00 PM UTC+9, which is 05:00 AM UTC+1.
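The arithmetic, and one possible way around the problem, can be sketched with the standard `datetime` module. The per-user offset is a hypothetical setting that I introduce here for illustration, and fixed offsets ignore daylight saving time, so treat this as a starting point only.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9))  # UTC+9
CET = timezone(timedelta(hours=1))  # UTC+1

# The same instant seen from both places: 13:00 in UTC+9 is 05:00 in UTC+1.
lesson_jst = datetime(2017, 1, 1, 13, 0, tzinfo=JST)
lesson_cet = lesson_jst.astimezone(CET)

def next_local_send(now_utc, user_offset_hours, hour, minute=0):
    """Return the next UTC instant at which it is hour:minute on the user's clock."""
    tz = timezone(timedelta(hours=user_offset_hours))
    local_now = now_utc.astimezone(tz)
    target = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= local_now:
        target += timedelta(days=1)  # already passed today, so use tomorrow
    return target.astimezone(timezone.utc)
```

With something like `next_local_send`, a rule such as "lesson at 13:00" would fire at 13:00 on each user's own clock instead of at one global instant.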

Translate text with Natural Language

Maid-chan's translation feature produces exactly the same results as Google Translate. It uses translate-shell as an interface to Google Translate. The interesting parts of Maid-chan's translation feature are its custom-made rules and natural language handling.

The custom-made rules define specific source-target language pairs: English is translated to Japanese, Bahasa Indonesia is also translated to Japanese, and Japanese is translated to English. If the given input is not one of the three configured languages, it falls back to Google's "Detect Language" feature. Since I use more than 2 languages, I would otherwise need to switch Google Translate options a lot; Maid-chan instead translates the given sentence directly according to my own translation preferences.
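Those pairs boil down to a small lookup table. The sketch below builds a translate-shell command line from it, using `trans -b source:target text` (brief output; an empty source lets Google detect the language). How the source language is known beforehand, and the `fallback_target` for other languages, are assumptions of mine that the rules above do not spell out.

```python
# Default source -> target pairs described above (Google Translate language codes).
DEFAULT_TARGETS = {"en": "ja", "id": "ja", "ja": "en"}

def trans_command(text, source_lang=None, fallback_target="en"):
    """Build a translate-shell command for `text`.

    `fallback_target` is an assumed default for inputs outside the three
    configured languages.
    """
    if source_lang in DEFAULT_TARGETS:
        pair = f"{source_lang}:{DEFAULT_TARGETS[source_lang]}"
    else:
        pair = f":{fallback_target}"  # empty source = auto-detect
    return ["trans", "-b", pair, text]
```

The resulting list can be passed to `subprocess.run(..., capture_output=True, text=True)` to read the translation from standard output.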

Maid-chan uses simple keyword-based natural language handling if the translation rule is outside the given presets. For example, "Translate X to B" has the same meaning as "Terjemahkan X ke B". It is possible to use real NLP techniques, but that will not land in this "for Dummies" article :)
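Keyword-based handling of this sort can be as small as one regular expression. The pattern and the language-name table below are illustrative guesses, not the project's actual rules.

```python
import re

# Hypothetical table mapping language names (English and Indonesian) to codes.
LANGUAGE_CODES = {
    "english": "en", "inggris": "en",
    "japanese": "ja", "jepang": "ja",
    "indonesian": "id", "indonesia": "id",
}

# "Translate X to B" and "Terjemahkan X ke B" share one shape:
# <verb> <text> <preposition> <language>
PATTERN = re.compile(
    r"^(?:translate|terjemahkan)\s+(?P<text>.+)\s+(?:to|ke)\s+(?P<lang>\w+)$",
    re.IGNORECASE,
)

def parse_request(message):
    """Return (text, target_code), or None if the message is not a translation request."""
    match = PATTERN.match(message.strip())
    if not match:
        return None
    code = LANGUAGE_CODES.get(match.group("lang").lower())
    if code is None:
        return None  # unknown language name
    return match.group("text"), code
```

Because `.+` is greedy, the last "to" or "ke" in the sentence is treated as the separator, so a text that itself contains "to" still parses as expected.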

Closing Remarks

In the near future, I am planning to add some features such as mini-games (if you know Rinna, Microsoft Japan's AI, you can play Shiritori with her), an image "waifu" recognizer, and probably some location-aware features, since users can send location information via Messenger. Currently, Maid-chan has no IoT feature, which is not fun! I should probably buy a RasPi or Arduino and create a simple home automation setup with it. The real challenge with an IoT feature is making it available to all users, since the implementation might be tightly coupled with my home.

Maid-chan's features are niche and might differ from your requirements, as it was built around the writer's preferences. I didn't implement any voice recognition since I never use voice commands in my daily activities. If you are interested in voice commands, there are also a lot of open-source libraries out there. I therefore highly recommend that you try building your own personalized assistant; along the way, you will learn a lot of new things about AI and realize that AI might be closer than you think.

Finally, thank you for reading this post! Feel free to share your thoughts, opinions, or ideas here!

Iskandar Setiadi
Software Engineer at HDE, Inc.
Freedomofkeima's Github