Typing Chinese on a Computer

Just today, I read an article about the influence of the computer on the Chinese language. I can agree with some of the author's points, but I think that the difficulty of using a method like Wubi is generally overstated. CangJie is more difficult, but in contrast to spoken language, both have the very valuable property of not changing with dialect, region or time. The speedups a user of predictive input gains are also available to users of handwriting or structure-based input methods, and input speed is already excellent at the 150 words per minute achievable in Wubi, or the 200 achievable in CangJie. On top of predictive input, and with much less of the guesswork that makes the phonetic input methods slow, the structure-based input methods sport phrase books and shortcut rules for typing several characters in one go. And while every undergrad student I have seen uses only PinYin or ZhuYin, every PhD student I have met so far has switched to Wubi, simply for the massive speed increase.

However, I am unconvinced about the notion that writing Chinese is slower than English:

If you can type 150 Chinese characters per minute, that amounts to roughly 50 words per minute if you subtract particles and compounds, as many Chinese words have only one or two characters. Now, imagine how fast you'd have to type to achieve a similar speed in English: if the average English word has four characters, which is probably too low an estimate, you'd have to type at 600 characters per minute to achieve similar results, and then you have spacing, too, which does not exist in Chinese. I also hold that the structure-based input methods at least help you memorize the graphic elements of the characters, and are thus closer to handwriting than phonetic input methods. With the composition rules and phrase books, you usually end up needing one to three keystrokes to produce a Chinese character. In summary, I think it is not easy to say whether English or Chinese can be typed faster.

Unfortunately, my own experience with Chinese input is limited to PinYin and Wubi. As far as the steep learning curve goes, the principles of Wubi can probably be explained in one to three hours, and after that, it takes two weeks of practice to achieve some fluency. That is not a big investment compared to learning Chinese in the first place, or to the time wasted over the years by using an inferior method. I guess it is mostly the psychological barrier, possibly combined with unsophisticated didactics, that contributes to the perception that these methods are hard.



Small Timezone Code Snippet

Today, I was looking at how to adjust a timestamp from a log file that carries no timezone info so that it contains the local timezone, so I can stuff a timezone-aware value into a database. It turns out that this is a somewhat under-polished part of the Python standard library, at least as of Python 2.6, which I am using (don't ask why). While looking for a solution, I frequently came across code that used pytz, but I wanted something that would stay within the standard library.

So here's my hodgepodge solution to the problem, which should work in most of Europe:

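import time

def getTimeOffset():
    # time.timezone and time.altzone count seconds WEST of UTC,
    # so negate them to get the usual "+0100"-style sign.
    if time.localtime().tm_isdst > 0:
        offset = -time.altzone   # DST currently in effect
    else:
        offset = -time.timezone
    sign = "+" if offset >= 0 else "-"
    # Split the absolute offset into whole hours and remaining minutes.
    hours, minutes = divmod(abs(offset) // 60, 60)
    return "%s%02d%02d" % (sign, hours, minutes)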

This approach is a straightforward extension of the idea presented here.
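
As a usage sketch under the same assumptions (standard library only, offsets taken from the current local rules in time.timezone and time.altzone), the formatting can be split from the lookup so that it can be checked on its own. The function names format_utc_offset, local_seconds_east and tag_timestamp are hypothetical and only serve the illustration:

```python
import time

def format_utc_offset(seconds_east):
    """Format an offset given in seconds east of UTC as +HHMM or -HHMM."""
    sign = "+" if seconds_east >= 0 else "-"
    hours, minutes = divmod(abs(seconds_east) // 60, 60)
    return "%s%02d%02d" % (sign, hours, minutes)

def local_seconds_east(when=None):
    """Local offset in seconds east of UTC at `when` (epoch seconds, default: now).

    Assumption: the zone's current rules (time.timezone / time.altzone)
    also applied at `when`; historical rule changes are not handled.
    """
    local = time.localtime(when)
    west = time.altzone if local.tm_isdst > 0 else time.timezone
    return -west

def tag_timestamp(naive_stamp, when=None):
    """Append the local UTC offset to a timestamp string that has none."""
    return naive_stamp + format_utc_offset(local_seconds_east(when))

# Example: a log line timestamp without zone info.
print(tag_timestamp("2014-06-01 12:00:00"))
```

The printed suffix depends on the machine's configured timezone, so the same log line will be tagged differently on hosts in different zones.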



New Blog Software, Links Changed

As you might have noticed, I have switched from MovableType to Pelican. As a consequence, the links in my blog changed - usually only a little, but in a slightly irregular fashion. Please peruse the archives and search for the title of your article. The content itself should all be there.

Thank you!



DNS: Open Resolvers, Revisited

The list of ISPs and carriers that force broken DNS servers on their customers is long; they thereby manipulate their customers' traffic, or outright censor what their customers can see. To combat such manipulation, and also to make it harder to observe customers' behaviour, it has been a pet project for some, at one time including me, to run an open resolver, which allows random people on the Internet to query your DNS server for an arbitrary name. Unfortunately, the evil guys developed an attack [0] that makes it impractical to run an open resolver. So, while politically desirable, it is infeasible to run an open resolver, and network operators around the globe strive to shut them down.

Now, these attacks all rely on the simple fact that, with UDP, you do not have any kind of assurance that the source address in a packet in fact belongs to the sending host. In my opinion, if you are willing to take the effort, there is one obvious way to provide an open resolver that does not have this flaw: For hosts not on your own network, provide DNS over TCP only.
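
The policy itself is small enough to sketch. The following is only an illustration of the decision, not a resolver: the function name should_answer and the network list are hypothetical, and the address ranges are documentation prefixes standing in for "your own network".

```python
import ipaddress

# Hypothetical: the networks whose source addresses you trust over UDP.
LOCAL_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def should_answer(client_ip, transport):
    """Decide whether a query may be answered.

    TCP queries are answered for everyone, because the handshake shows
    that the client really receives packets sent to its source address.
    UDP queries are answered only for hosts on the local networks.
    """
    if transport == "tcp":
        return True
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in LOCAL_NETWORKS)

print(should_answer("192.0.2.10", "udp"))    # local host, UDP
print(should_answer("198.51.100.7", "udp"))  # foreign host, UDP
print(should_answer("198.51.100.7", "tcp"))  # foreign host, TCP
```

A spoofed UDP packet claiming a foreign source address is then simply not answered, so the resolver cannot be used to reflect traffic at that address.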

I hope that someone will hack this feature into unbound [1], so people can easily deploy open resolvers in a reasonably safe way, without disrupting the Internet. Currently, unbound's do-udp setting is a single combined switch for incoming and outgoing queries, so disabling UDP for clients would also push the resolver's own upstream queries onto TCP, causing excessive load on upstream name servers.

Thank you for reading!

[0] See e.g. http://openresolverproject.org/
[1] https://www.unbound.net



Fixing the Android Update Problem - A Few Thoughts

Time and again, Android has been taking heat for leaving its users in the lurch in the face of security problems, while fixing such problems only in the most recent version. But in my opinion, not only Google, but also the manufacturers, are to blame for this situation: They are the ones who aim to lock down the devices with their Frankenstein'ed versions of Android because they think it's their selling point, or at least their way to more revenue.

The following suggestions rely purely on speculation, because I am not privy to any contracts, product design or marketing discussions on behalf of any party. But from all I know, the following approach could be used to alleviate the problem from the user's perspective:

Google should imho

  1. fix such bugs in as many versions of Android as required to achieve 75% market coverage, and
  2. adjust their contracts so that manufacturers who desire early access and support from Google, as opposed to simply warping AOSP, are required to offer these updates within two weeks for all handsets that were originally shipped with, or are currently running, any of the fixed versions of Android, lest they lose some kind of access to the program and the right to use the Android logo. Compliance should be determined frequently enough to not water down these requirements.

This would have the following nice side effects:

  • Google gets rid of the blame for not supporting their users (see point 1).
  • The manufacturers can still avoid the huge and profit-eating work of supplying the users with new versions of Android, but are being pressed to at least not leave their users alone (see point 2).

By going this route, the manufacturers are not required to give up a part of their business relationship to Google, which would be hard to argue despite them doing it all the time towards the carriers (let's think about that battle later). At the same time, it makes sure that the users are safe, sort of (relegating the general security debate about Android to a different discussion), without making it impossible to market new devices with new versions of Android.

The current situation, which I'd liken to driving a car with broken brakes, would imho warrant compulsory recall actions by the manufacturers, which they would otherwise be legally obliged to perform, at least as far as my understanding of German consumer protection laws goes. It would be somewhat interesting to see such a case being heard before a German court, and I'm far from confident that the Android brand will not be hurt while the problem festers.

I have the nagging feeling that I cannot be the first to have had these ideas, but wanted to state them nonetheless.



Firefox 30: Flash Always Plays After Upgrade

Recently, I upgraded to Firefox 30 in order to profit from the security fixes. I was delighted with its much improved speed as well, but thoroughly aggravated by a number of very nasty bugs. Contrary to my previous experience, Firefox insisted on playing Flash videos instantly, even though I already had click-to-play etc. enabled.

Anyway, to fix it, go to about:config and change the setting of

plugin.state.flash from the default value of '2' to '1'.

Save, and you're mostly set.

Unfortunately, I have found a number of situations where the browser insists on playing a video regardless, which I have not yet been able to configure away, although I have all obvious things configured to not auto-play.



GitLab: Changing the URL

Recently, I had the problem of moving my GitLab installation from one URL to another. I guess the problem will hit many people every once in a while. On StackOverflow, someone posted half of the answer.

Make a backup and stop GitLab first! Then:

Quote:

  1. In your application.rb file:

    config.relative_url_root = "/gitlab"
    
  2. In your gitlab.yml file:

    relative_url_root: /gitlab
    
  3. In your unicorn.rb:

    ENV["RAILS_RELATIVE_URL_ROOT"] = "/gitlab"
    

You also need to re-configure your web server appropriately.

Now... that does part of the job, but after doing it, I could not properly reach nor use my GitLab site. The two remaining issues are (a) adjusting the URLs in the database, and (b) updating the assets cache. Here are the remaining bits, assuming that you are on MySQL (adapt for other database engines, accordingly):

mysql> update events set data = replace(data, "http://1.1.1.1/", "http://1.1.1.1/gitlab/");
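
One caveat with a plain replace like this, sketched here in Python rather than SQL: because the new base URL begins with the old one, running the statement a second time would turn /gitlab/ into /gitlab/gitlab/. The helper below is hypothetical and only illustrates a rewrite that is safe to repeat, assuming the old installation had no paths of its own under /gitlab/:

```python
OLD_BASE = "http://1.1.1.1/"
NEW_BASE = "http://1.1.1.1/gitlab/"

def rewrite_base_url(text, old_base=OLD_BASE, new_base=NEW_BASE):
    """Rewrite old_base to new_base without doubling the new prefix.

    Already-migrated URLs are first folded back to the old form, then
    every URL is rewritten once, so repeated runs give the same result.
    """
    normalized = text.replace(new_base, old_base)
    return normalized.replace(old_base, new_base)

once = rewrite_base_url("pushed to http://1.1.1.1/group/project")
twice = rewrite_base_url(once)
print(once)
print(once == twice)
```

Whether the events table needs this care depends on how often the migration is run; for a single run on a fresh backup, the plain replace is enough.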

Next, as the appropriate user, do this:

$ rake gitlab:satellites:create RAILS_ENV=production
$ rake assets:clean RAILS_ENV=production
$ rake assets:precompile RAILS_ENV=production

Then restart GitLab.

If you skip recompiling the assets, you will get broken icons, or, if you simply deleted them, a lot of 502s until Rails has been able to compile them.

At this point, you have access to your projects using the web, but SSH access does not yet work. You will see an "Access denied" message. To fix that, you have to update your gitlab-shell configuration. Put this in your config.yml:

gitlab_url: "http://1.1.1.1/gitlab"

This URL must reflect the new URL you have chosen. After that, re-install gitlab-shell.

Thanks to <rikurrr> for the idea of directly looking into the database.
