Reverse engineering Apple’s in-terminal emojis

I found a somewhat interesting feature in the Apple terminal, supported by both Terminal.app and iTerm2: it can print emojis. I discovered this when I noticed that the Homebrew package manager prints beer and tap emojis when it is downloading and installing packages. I was curious about how these emojis are represented, so I used a program I had written, called hex, which displays the hex values of characters entered by the user. I’ve used this program to find the escape sequences for function keys and the like, allowing me to implement these inputs in my programs.

Here is the source code for hex. It’s a fairly simple program…


// This program displays the octal, hex, decimal,
// and byte character representations of characters
// typed at the terminal.

#include <stdio.h>
#include <termios.h>
#include <signal.h>

struct termios term;
struct termios save;

void stop( int );
void cont( int );

int main( int argc, char **argv ){
        tcgetattr( 0, &term );
        save = term;
        term.c_lflag &= ~( ICANON | ECHO | ECHONL );
        signal( SIGINT, stop );
        signal( SIGTSTP, stop );
        signal( SIGCONT, cont );
        tcsetattr( 0, TCSANOW, &term );
        // Section where the program executes
        // in raw mode:
        unsigned char c;
        for(;;){
                c = (unsigned char) getchar();
                printf( "%3o\t%2x\t%3d\t%c\n", c, c, c, c );
        }
        // End raw mode:
        tcsetattr( 0, TCSANOW, &save );
        return 0;
}

// Restores terminal settings before allowing
// the signal.
void stop( int signum ){
        tcsetattr( 0, TCSANOW, &save );
        signal( signum, SIG_DFL );
        raise( signum );
}

// This function undoes the cleanup performed
// by stop() when the process is suspended.
void cont( int signum ){
        tcsetattr( 0, TCSANOW, &term );
        signal( SIGTSTP, stop );
}

I copied the beer mug emoji from the terminal after downloading a package. I then started the hex program and hit Command-V, then copied the hex bytes (there were four of them) by hand into a file using a hex editor. To test it, I ran cat on the file I had created, and lo and behold – a beer emoji was printed to my terminal.

I then decided to test the concept of printing emojis from a program. I used assembly language for this, because I’m not entirely sure of the proper syntax for C, and I thought it would be faster to write in assembler. I tried several byte sequences, and whenever one produced an emoji, I wrote a comment in the source file indicating what it was. Be warned – this code is extremely tedious, even for an assembly program.


global start

segment .data
; Each string is four emoji bytes, a newline, and a terminating zero.
str1:   db      0xf0,0x9f,0x8d,0xb0,0x0a,0x00 ; Cake emoji
str2:   db      0xf0,0x9f,0x8d,0xba,0x0a,0x00 ; Beer emoji
str3:   db      0xf0,0x9f,0x8d,0xa9,0x0a,0x00 ; Donut emoji
str4:   db      0xf0,0x9f,0x8d,0x8e,0x0a,0x00 ; Apple emoji
str5:   db      0xf0,0x9f,0x8d,0x81,0x0a,0x00 ; Maple leaf emoji
str6:   db      0xf0,0x9f,0x8d,0x82,0x0a,0x00 ; Leaf emoji
str7:   db      0xf0,0x9f,0x8d,0x97,0x0a,0x00 ; Drumstick emoji

segment .text
start:
        ; write( 1, str1, 5 ): arguments are pushed right to left
        push    dword 5         ; byte count: 4 emoji bytes + newline
        push    dword str1      ; buffer
        push    dword 1         ; file descriptor 1 (stdout)
        mov     eax,4           ; system call 4 = write
        sub     esp,4           ; padding the kernel expects on the stack
        int     0x80

        push    dword 5
        push    dword str2
        push    dword 1
        mov     eax,4
        sub     esp,4
        int     0x80

        push    dword 5
        push    dword str3
        push    dword 1
        mov     eax,4
        sub     esp,4
        int     0x80

        push    dword 5
        push    dword str4
        push    dword 1
        mov     eax,4
        sub     esp,4
        int     0x80

        push    dword 5
        push    dword str5
        push    dword 1
        mov     eax,4
        sub     esp,4
        int     0x80

        push    dword 5
        push    dword str6
        push    dword 1
        mov     eax,4
        sub     esp,4
        int     0x80

        push    dword 5
        push    dword str7
        push    dword 1
        mov     eax,4
        sub     esp,4
        int     0x80

        ; exit( 0 )
        push    dword 0
        mov     eax,1           ; system call 1 = exit
        sub     esp,4
        int     0x80

Here is what I get when I run the program:

[Image: emoji-run]

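It’s not particularly useful, especially considering that the emojis only show up on systems whose terminal and fonts can render them, but it’s an interesting concept to test. At first it confused me that the emojis use four bytes each, since I had assumed that UTF-8 never needs more than three bytes per character. UTF-8 is a variable-length encoding, though: a character takes between one and four bytes, and code points above U+FFFF, which is where these emojis live (the beer mug is U+1F37A), take four. So the byte sequences are ordinary UTF-8 rather than something Apple invented.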

Some hacking – Using Keka to import data from OS X to my VMs

Ever since I started using VirtualBox, I have needed a way to get data from my host operating system (Mac OS X) to the guests. This data includes drivers that I want to install in the guest, and programs and games that I want to run in the guest.

Sure, there’s Guest Additions, but that has its own problems. For one thing, it only works with Mac, Windows, and Linux guests; there’s no drag-and-drop feature for things like DOS. Secondly, it requires the guest to have Internet access so you can install the Guest Additions drivers; if I have Internet access, then all I need to do is email the files to myself in the host and download them in the guest. So basically, Guest Additions is merely a convenience that makes the already possible more practical; it does not enable you to do anything you couldn’t do before.

I realized my best bet for guests that either don’t have networking drivers installed or aren’t in the Mac/Windows/Linux category would be to find a way to turn the directories in my OS X filesystem into disk images, either floppy images or ISO files, so I could then insert them into the virtual drives of my VMs. I found a utility called Keka, a file compression, extraction, and archiving utility for OS X. It can archive or compress files in several formats, and it can extract even more. The formats it creates include zip, gzip, bzip2, 7zip, tar, DMG, and ISO.

[Image: Keka]

Though it is a graphical program, the interface for Keka isn’t particularly intuitive. It allows you to select a format to archive a directory to, but gives no indication of how to select a directory for archiving.  Through wild guesswork I discovered that you can do this by dragging the icon for the directory into the Keka window, after which it will automatically create an archive in the selected format.  Who knew?

The next step is of course to insert the newly created ISO into my virtual machines.  I can only do this with VMs that recognize optical disks, obviously, so MS-DOS 5.0 doesn’t work.  However, I was able to import Deluxe Paint into my FreeDOS VM.  I’m in the process of trying to import Norton Commander.

[Image: Storage]

As an experiment, I tried converting some of the DOS folders to ISO files and using them to install software directly, as if they were floppies.  This didn’t work, because the installers require the disk to be mounted on drive A.  Obviously I wouldn’t be able to just change the extension, because it would still have the ISO-9660 filesystem, rather than the FAT filesystem, and it wouldn’t be recognized as a floppy.

Yeah, I kinda figured that wouldn’t work.

I’ve also tried using some software and games in VMs that I’m running on live ISOs, but there were some problems.  First, VirtualBox only allows for one optical drive.  You can work around this limitation by adding a USB drive and telling the VM to treat it as an optical drive, but when I did this, the VM tried to boot from the wrong ISO.  Maybe I should select the “Live CD/DVD” option.

Homebrew – “the missing package manager for OS X”

I made a pretty neat discovery recently. There’s this package manager for Mac OS X called Homebrew that allows you to install Unix packages from the command line, using the brew command. It’s patched together from Ruby scripts and shell scripts and uses git as a backend. You can install it on Mac OS X using the following command:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

This is much better than using apt-get in Mac OS X (now no longer an option), where it would barf because I didn’t have the Xcode command line tools, but it would provide no indication of how to get said tools. The Homebrew install script installs the Xcode tools automatically if they’re not there. It’s also better than manually compiling code from source, which can be a major pain in the ass, especially when you run into dependency issues (which is more often than not the case).

Homebrew is very similar to apt-get in its command structure. For example, here is how you install a package:

brew install lynx

There are several other Homebrew commands worth remembering. For example:

brew edit lynx

This opens the package’s formula, the Ruby script that tells Homebrew how to build and install it, in the default text editor.

Also:

  • remove – uninstalls a package
  • list – lists installed packages
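  • search – searches the currently available packages
  • update – updates Homebrew itself and its list of packages (use upgrade to update an installed package)

Most of these share the same command structure, brew <command> <package>, although list and update are normally run without a package name.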

I’ve installed several packages so far. I will list them here:

  • Lynx – the text-mode browser
  • Snownews – an RSS feed aggregator for the CLI
  • BitchX – a text-based IRC client
  • NASM – an x86 assembler
  • wget – allows you to download files from the command line
  • CLISP – Common Lisp compiler, interpreter, and REPL

This is an exciting discovery. Now I can use all my favorite Linux software in Mac OS X!