
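SAE Is Dead, Completely Dead in Spirit

About a month ago SAE changed its billing rules again. As a result, every app that uses a database now has to pay a so-called MySQL rental fee, and the balance in my account, which I had figured would last two or three years, vanished in an instant.

Looking back, moving my blog off SAE two years ago was a very wise decision. Now I can say goodbye to SAE for good.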

Screen Shot 2016-04-10 at 11.57.25

“Multiple dex files define” Error in Android development Caused by IntelliJ IDEA bug.

Recently I ran into a “Multiple dex files define” error while writing and building Android libraries. After a lot of digging, I traced the root cause to an IntelliJ IDEA bug that appears when you use the .classpath (Eclipse style) project configuration file format. I reported the bug to JetBrains and they have confirmed it. It is now being tracked in their bug system. Until JetBrains fixes it, you can avoid the bug by using the .iml (IntelliJ style) project configuration file format.

Here is how I encountered and reproduced this bug, which is also available through

When exporting jars, the same configuration produces different output depending on the project file format (.classpath or .iml).

How to reproduce:

Suppose here is my project structure:

AndroidLibraryModuleA depends on AndroidLibraryModuleB, RootAndroidApplication depends on both AndroidLibraryModuleA and AndroidLibraryModuleB.
Now we want to export AndroidLibraryModuleA and AndroidLibraryModuleB as jars without resources. In other words, we only use the Java code part.

In the artifacts settings, we add two jar configurations. For AndroidLibraryModuleA we include only the AndroidLibraryModuleA compile output. For AndroidLibraryModuleB we include only the AndroidLibraryModuleB compile output.

In AndroidLibraryModuleA we change the scope of the dependency on AndroidLibraryModuleB to “provided”. Then we build the artifact of AndroidLibraryModuleA.

If we open the exported jar file, we can see that only compiled classes from AndroidLibraryModuleA are listed.

However, suppose we keep everything else unchanged, switch the project file format of AndroidLibraryModuleA from IntelliJ’s .iml to Eclipse’s .classpath, and rebuild the artifact.

Now the compiled classes from AndroidLibraryModuleB are also listed in AndroidLibraryModuleA’s jar file.

I believe the .iml behavior is the intended one, while the .classpath behavior is the result of a bug.

This difference leads to a “Multiple dex files define” error when AndroidLibraryModuleA.jar and AndroidLibraryModuleB.jar are added to another project as jar dependencies, because the two jars contain duplicate class files.
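One way to confirm that two exported jars overlap is to compare their entries directly, since a jar is just a zip archive. The following is a minimal sketch using only the Python standard library; the function names are made up for illustration and are not part of any Android or IntelliJ tooling.

```python
import zipfile


def class_entries(jar_path):
    """Return the set of .class entry names stored in a jar (a jar is a zip file)."""
    with zipfile.ZipFile(jar_path) as jar:
        return {name for name in jar.namelist() if name.endswith(".class")}


def duplicate_classes(jar_a, jar_b):
    """Return the sorted list of class files that appear in both jars."""
    return sorted(class_entries(jar_a) & class_entries(jar_b))
```

Calling duplicate_classes("AndroidLibraryModuleA.jar", "AndroidLibraryModuleB.jar") should return an empty list for the .iml build; any names it returns are classes that would be defined twice at dex time.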

Use HAProxy to load balance 300k concurrent tcp socket connections: Port Exhaustion, Keep-alive and others

I have been building a push system recently. To increase the scalability of the system, the best practice is to make each connection as stateless as possible. That way, when a bottleneck appears, the capacity of the whole system can easily be expanded by adding more machines. Speaking of load balancing and reverse proxying, Nginx is probably the most famous and widely acknowledged option. However, TCP proxying is a rather recent addition. Nginx introduced TCP load balancing and reverse proxying in v1.9, which was released in late May this year with a lot of features still missing. On the other hand, HAProxy, as the pioneer of TCP load balancing, is rather mature and stable. I chose HAProxy to build the system and eventually reached 300k concurrent TCP socket connections. I could have achieved a higher number if it were not for my rather outdated client PC.

Step 1. Tuning the Linux system

Holding 300k concurrent connections is not an easy job even for a high-end server. To begin with, we need to tune the Linux kernel configuration to make the most of our server.

File Descriptors

Since sockets count as files from the system’s perspective, every connection consumes a file descriptor, and the default file descriptor limit is far too small for our 300k target. Modify /etc/sysctl.conf to add the following lines:

fs.file-max = 10000000 
fs.nr_open = 10000000

These lines raise the total number of file descriptors to 10 million.
Next, modify /etc/security/limits.conf to add the following lines:

* soft nofile 10000000
* hard nofile 10000000
root soft nofile 10000000
root hard nofile 10000000

If you run HAProxy as a non-root user, the first two lines should do the job. However, if you are running HAProxy as root, you need to declare the limits for root explicitly, because the * wildcard does not apply to the root user.
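After logging in again, it is worth checking that the new limits actually reached the process. Below is a small sketch using Python’s standard resource module (Unix only); the helper names and the headroom value are assumptions for illustration. Keep in mind that a proxy holds two sockets per proxied connection, one towards the client and one towards the backend.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_hold(descriptors, headroom=1024):
    """True if the soft limit covers `descriptors` plus some headroom (assumed value)."""
    soft, _hard = nofile_limits()
    return soft == resource.RLIM_INFINITY or soft >= descriptors + headroom


soft, hard = nofile_limits()
print("soft nofile:", soft, "hard nofile:", hard)
# 300k proxied connections need roughly 2 x 300k descriptors on the proxy.
print("enough for 300k proxied connections:", can_hold(2 * 300_000))
```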

TCP Buffer

Holding such a huge number of connections costs a lot of memory. To reduce memory use, modify /etc/sysctl.conf to add the following lines.

net.ipv4.tcp_mem = 786432 1697152 1945728
net.ipv4.tcp_rmem = 4096 4096 16777216
net.ipv4.tcp_wmem = 4096 4096 16777216
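To get a feeling for these numbers: tcp_mem is expressed in memory pages, while tcp_rmem and tcp_wmem are in bytes (minimum, default, maximum). The sketch below converts them into more familiar units. It assumes a 4096-byte page size, which you can check with getconf PAGESIZE, and it only gives a rough lower bound because the kernel keeps extra bookkeeping per socket.

```python
PAGE_SIZE = 4096  # bytes per page; an assumption, verify with `getconf PAGESIZE`


def pages_to_gib(pages, page_size=PAGE_SIZE):
    """Convert a tcp_mem value (in pages) to GiB."""
    return pages * page_size / 2**30


def default_buffer_bytes(connections, rmem_default=4096, wmem_default=4096):
    """Rough buffer memory if every socket uses its default read and write buffer."""
    return connections * (rmem_default + wmem_default)


tcp_mem = (786432, 1697152, 1945728)  # low, pressure, high thresholds from above
for label, pages in zip(("low", "pressure", "high"), tcp_mem):
    print(f"tcp_mem {label}: {pages_to_gib(pages):.2f} GiB")

total = default_buffer_bytes(300_000)
print(f"300k sockets at default buffer sizes: {total / 2**30:.2f} GiB")
```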

Step 2. Tuning HAProxy

Upon finishing tuning Linux kernel, we need to tune HAProxy to better fit our requirements.

Increase Max Connections

In HAProxy, there is a maximum connection cap both globally and per backend. To raise the cap, we first add a line of configuration under the global scope.

maxconn 2000000

Then we add the same line to our backend scope, which makes our backend look like this:

backend pushserver
        mode tcp
        balance roundrobin
        maxconn 2000000

Tuning Timeout

By default, HAProxy detects dead connections and closes inactive ones. However, the default timeouts are too low for a scenario where connections have to be kept open for a long time. On my client side, the long-lived socket connection to the push server was always closed by HAProxy, because the heartbeat interval in my client implementation is 4 minutes. A heartbeat that is too frequent is a heavy burden for both the client (actually an Android device) and the server. To raise this limit, add the following lines to your backend. By default these numbers are all in milliseconds.

 timeout connect 5000
 timeout client 50000
 timeout server 50000
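As a sanity check, the idle timeouts have to be longer than the heartbeat interval, otherwise the connection can still be closed between two heartbeats. A 4-minute heartbeat is 240000 ms, so values such as 50000 ms would not cover it. The sketch below derives a timeout from the heartbeat; the 25% safety margin and the function name are assumptions for illustration, not HAProxy recommendations.

```python
def haproxy_timeout_ms(heartbeat_seconds, margin=0.25):
    """Idle timeout in milliseconds: the heartbeat interval plus a safety margin."""
    return int(heartbeat_seconds * (1 + margin) * 1000)


heartbeat = 4 * 60  # the 4-minute client heartbeat mentioned above
timeout = haproxy_timeout_ms(heartbeat)
print(f"timeout client {timeout}")
print(f"timeout server {timeout}")
```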

Configuring Source IP to solve port exhaustion

When you are facing 30k simultaneous connections, you will run into the problem of “port exhaustion”. It results from the fact that each reverse-proxied connection occupies an available port of a local IP. The default port range available for outgoing connections is around 30k~60k. In other words, we only have about 30k ports available per IP. This is not enough. We can widen this range by modifying /etc/sysctl.conf to add the following line.

net.ipv4.ip_local_port_range = 1000 65535

But this does not solve the root problem: we will still run out of ports when the 60k cap is reached.
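A quick back-of-the-envelope calculation shows why widening the range alone cannot be enough. A TCP connection is identified by source IP, source port, destination IP and destination port, so towards a single backend address each source IP can hold at most one connection per local port. The sketch below uses hypothetical helper names and assumes all connections go to one backend address.

```python
import math


def ports_per_source_ip(low=1000, high=65535):
    """Number of local ports in the configured ip_local_port_range (inclusive)."""
    return high - low + 1


def source_ips_needed(connections, low=1000, high=65535):
    """Source IPs required to reach `connections` towards one backend ip:port."""
    return math.ceil(connections / ports_per_source_ip(low, high))


print(ports_per_source_ip())                               # 64536
print(source_ips_needed(300_000))                          # 5
print(source_ips_needed(300_000, low=32768, high=60999))   # 11 with a typical default range
```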

The ultimate solution to this port exhaustion issue is to increase the number of available IPs. First of all, we bind a new IP to a new virtual network interface.

ifconfig eth0:1

This command binds an intranet address to a virtual network interface eth0:1 whose hardware interface is eth0. It can be executed several times to add an arbitrary number of virtual network interfaces. Just remember that the IP should be in the same subnet as your real application server. In other words, you cannot have any kind of NAT service on the link between HAProxy and the application server. Otherwise, this will not work.

Next, we need to configure HAProxy to use these fresh IPs. There is a source keyword that can be used either in a backend scope or as an argument of the server keyword. In our experiment, the backend-scope form did not seem to work, so we chose the argument form. This is what the HAProxy config file looks like.

backend mqtt
        mode tcp
        balance roundrobin
        maxconn 2000000
        server app1 source
        server app2 source
        server app3 source
        server app4 source
        server app5 source
        server app6 source

Here is the trick: you need to declare them as multiple entries and give them different app names. If you set the same app name for all the entries, HAProxy will simply not work. If you look at the output of the HAProxy status report, you will see that even though these entries have the same backend address, HAProxy still treats them as different apps.

That’s all for the configuration! Now your HAProxy should be able to handle over 300k concurrent TCP connections, just like mine.

IntelliJ / WebStorm slow debugging in Node.js

I recently experienced severely slow debugging in IntelliJ + Node.js plugin / WebStorm, which made me wait nearly one minute for my app to start. I tried to figure out why, and I noticed that most of the time was spent loading various packages.

Later on I found the cause of the slowness: the IDE’s break-on-exception option was enabled. In other words, the IDE effectively wraps almost every line of JavaScript code in a try/catch, no matter whether it was written by you or comes from a third-party package, which leads to a huge performance loss.

Disable it by navigating through the menu ‘Run -> View Breakpoints…’ and toggling off ‘JavaScript Exception Breakpoints’. Your program will debug much faster. To speed things up further, navigate through the menu ‘Help -> Find Action…’, type in ‘Registry’ and press Enter. Uncheck ‘js.debugger.v8.use.any.breakpoint’.

Now your Node.js program should run in debug mode almost as fast as it does without the debugger.


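I had not actually planned to watch this movie, but my company organized a group outing and the 3D ticket that normally costs 35 yuan was only 10, which counts as a perk. The reviews online seemed to be full of praise, so I figured I might as well go. I recently finished GTA 5 and still have a long review of it unfinished, so let me talk about 《大圣归来》 (Monkey King: Hero Is Back) first.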







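A Boot Failure Caused by a Misconfigured Nginx: Stopping System V runlevel compatibility

Recently I have been working on the server side of an Android push system, which needs the TCP proxy feature introduced in Nginx 1.9. Since Nginx’s default connection limit is too low, I followed my habit from tuning kernel parameters and, with a grand sweep of the hand, raised the connection count straight to 10 million. After I reloaded the configuration, my machine died.

At the time it never occurred to me that Nginx was the cause. I instinctively assumed that the MQTT library I was using must have leaked memory, so I decisively rebooted the machine.

Then it would not come back up: the Ubuntu boot screen just kept spinning. I rebooted again, entered Recovery Mode to print the logs, and found it stuck at Stopping System V runlevel compatibility [OK].

Almost everyone online blames this on the NVidia graphics driver. I did not think that was likely the cause, but since everyone said so, I uninstalled it.

After uninstalling it I still could not get into the system, and it was still stuck at the same place! So I started looking things up on my laptop and left the machine alone. A few minutes later I glanced at the machine and saw a new log line: Out of Memory Error, Kill Nginx. Only then did I realize this might be related to the Nginx configuration change I had made earlier. I soon confirmed that Nginx was the problem. Nginx uses a connection pool and pre-allocates a certain number of spare connections even when none are in use. My machine has about 8 GB of memory and could handle roughly 800k connections in earlier tests, so a connection pool of 10 million was bound to cause an OOM. Nginx ate all the memory and forced the operating system to reclaim memory continuously, which made the boot hang.

The rest was simple: I changed the Nginx connection count back, reinstalled the graphics driver, booted into the system successfully, then gave Nginx a sensible connection pool size and carried on with my experiments.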

Closures in different languages

In most scripting languages, there are first-class functions. In short, first-class functions refer to functions that can serve as call arguments, work in expressions and be assigned to variables.

So what is a closure? A closure is a function that carries its context with it. Among all languages, JavaScript is probably the one where closures are most frequently used. In my opinion, closures are so widely used in JavaScript because JavaScript does not have a mature OO system compared to other programming languages.

function a() {
    var t = 1;
    function b() {
        // b() reads and updates t from the enclosing call of a()
        console.log(++t);
    }
    return b;
}

var x = a();
x();
x();

The output of this piece of Javascript code is “2” and “3”. As we can see, function a() returned another function. However, this function not only carries the information about itself, but also the context it lies in, i.e. the value of variable t.

In Apple’s programming language Swift, we have similar closures that act almost identically to JavaScript’s.

import Foundation

func a() -> (() -> Void) {
    var t = 1
    func b() -> Void {
        // b() captures t by reference, so the change survives between calls
        t += 1
        print(t)
    }
    return b
}

let x = a()
x()
x()

The result is also the same as Javascript’s, “2” and “3”. What about Python?

def a():
    t = 1
    def b(t=t):
        # t is a copy bound as a default argument when b() is defined
        t += 1
        print(t)
    return b

x = a()
x()
x()

There are a few notable differences here. First of all, b() cannot rebind the variable t defined in a(): an assignment such as t += 1 makes t local to b(), so the value is passed in through a default argument instead. Secondly, the output is “2” and “2”. This is reasonable, since the value of t is passed through an argument, and changing the value of the argument does not affect the original. However, Python does support “real” closures. The trick is to declare t as a nonlocal variable.

def a():
    t = 1
    def b():
        # nonlocal makes the assignment target the t that belongs to a()
        nonlocal t
        t += 1
        print(t)
    return b

x = a()
x()
x()

Now we finally get “2” and “3”. Next up, Groovy.

def a() {
    int t = 1;
    return { -> println(++t) };
}

x = a();
x();
x();

Since named functions cannot be defined inside another function in Groovy, we can only use an anonymous closure to do this. The output is “2” and “3”.

In Java, functions are not first-class members, so there is no closure in this sense. A workaround is to use an anonymous class. Here is an example.

public class Closure {
    public static Function a() {
        final int t = 1;
        return new Function() {
            public void call() {
                // t is final, so it can only be read here, never modified
                System.out.println(t + 1);
            }
        };
    }

    public static void main(String[] args) {
        Function x = a();
        x.call();
        x.call();
    }
}

interface Function {
    public void call();
}

Rule No. 1 for anonymous classes is that you cannot change the value of variables that live in the stack context. In other words, variables on the stack are all “final” to the inner class. If one wants to change a value, it must be declared as a field of a class, which is stored on the heap.
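The same idea of keeping the mutable state in a heap object can be sketched in Python, which makes it easy to compare with the nonlocal version earlier. Here the captured name is never rebound; only the contents of the list it refers to change. This is an illustrative analogy rather than the Java code itself; in Java a comparable trick is a final one-element array.

```python
def a():
    # `box` itself is never reassigned; the list it points to lives on the heap
    # and its contents can be changed freely from the inner function.
    box = [1]

    def b():
        box[0] += 1
        return box[0]

    return b


x = a()
print(x())  # 2
print(x())  # 3
```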

Improve ListView Performance on Android

The performance of ListView on Android is sometimes a disaster when it comes to very complex lists. Things become more frustrating when you combine it with other things on Android, like network images and dynamic loading. The best example of a complex ListView is the Facebook feed in the Android app. They posted an article showing how a ListView can be complex and smooth at the same time.

In short, they split each post in the feed into several parts: the header, the main body and the action panel. Each part can then be reused when rendering the ListView. This is a very clever way around the problematic ListView performance.
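My reading of that idea can be sketched as follows: every post is flattened into several rows, each with its own view type, and views are recycled per type. This is a language-neutral simulation in Python, not Android code and not Facebook’s implementation; all names (Row, RecyclePool, scroll) are hypothetical.

```python
from collections import namedtuple

Row = namedtuple("Row", ["view_type", "post_id"])
PARTS = ("header", "body", "actions")


def flatten(post_ids):
    """Turn each post into one list row per part."""
    return [Row(part, pid) for pid in post_ids for part in PARTS]


class RecyclePool:
    """Keeps scrapped views per view type so rows of the same type can reuse them."""

    def __init__(self):
        self._scrap = {part: [] for part in PARTS}
        self.created = 0

    def obtain(self, view_type):
        if self._scrap[view_type]:
            return self._scrap[view_type].pop()
        self.created += 1
        return {"type": view_type}  # stand-in for an inflated View

    def recycle(self, view):
        self._scrap[view["type"]].append(view)


def scroll(rows, window=6):
    """Scroll through all rows with `window` rows on screen; return views created."""
    pool, on_screen = RecyclePool(), []
    for row in rows:
        if len(on_screen) == window:
            pool.recycle(on_screen.pop(0))  # the top row leaves the screen
        view = pool.obtain(row.view_type)
        view["post_id"] = row.post_id       # "bind" new data to the reused view
        on_screen.append(view)
    return pool.created


print(scroll(flatten(range(100))))  # 6 views serve 300 rows
```

Because the rows are small and uniform within each type, a handful of views is enough no matter how many posts the feed contains.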

However, in my case things are more complicated, since the content of a post does not have a fixed style. It may be pure text, or text with some decorations, or full of images without a single line of text. Moreover, the content of the post should be interactive, i.e., when you click on a link or an image, the app should respond to that click with the appropriate action.

In a previous open source Android app that I contributed to, we tried at least two options.

  1. Use a ListView with WebView, i.e., each item in the ListView is a WebView. In this case, it is easy to interact with other parts of the app while keeping the content highly dynamic. Everything works fine on Android 4.2 or earlier. Performance becomes a really big issue on Android 4.4, in which Google made the WebKit kernel more functional but also heavier. Creating a WebView becomes a really time-intensive task that we cannot afford. We therefore tried several workarounds for this problem.
    1. Keep as many WebViews as possible in memory as a cache. When a user scrolls down and back up, the WebViews cached in memory can be used directly. To keep memory use at an acceptable level, we can use a SoftReference to cache each object.
      This workaround does not work well: when you scroll down, you are still creating new WebViews, and the cache only helps when you scroll up and down over the same items, which is actually not very useful.
    2. Reuse each WebView. This did improve the scrolling experience, since we no longer create a lot of WebViews. Instead, we alter the content of each WebView. The experience is still a little laggy, since rendering HTML also takes a lot of time.
      This workaround worked better than the first one, but it brings another big issue: when a WebView is reused, its height does not change when the length of its content changes. In other words, we see a lot of blank space in the ListView when the content length varies significantly between WebViews.
  2. Use a ScrollView with WebViews and render a page of posts at once. This is very brute-force but surprisingly works! The disadvantages of this solution are, firstly, that it consumes a lot of memory, since the whole page of posts lives in main memory, and secondly, that the app may freeze for a second or two while rendering the page, depending on how complex the page is. However, once the page is rendered, it is super smooth no matter how you scroll it!

When I was working on the new app, TGFC, I thought about which kind of solution I should implement. I realized that a WebView may not be the only option for my scenario, since I don’t really need all the features that a heavy WebView provides. I want my app to show a few different styles of text and several images, and that’s all. I don’t need things like z-index or absolute positioning. In my case, TextView can meet my demands perfectly.

I started with a ListView of TextViews. At first, everything worked fine when there was only text content in the TextView. However, things became a little complicated when I introduced network images into my app. Inside the ImageGetter that I was using, I first download images asynchronously to the local cache. Later on I load those images into memory and show them on the screen. I saw notable lag when using ListView as the outer container while it was loading images from the local cache, so I switched to ScrollView and rendered the whole page of posts at once.

The only thing that we need to be careful about is the usage of images. Large images can consume a lot of memory and make the ScrollView really laggy. Remember to resize those images when loading them into the memory.
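To illustrate how much resizing helps, here is the power-of-two downsampling calculation that is commonly used together with Android’s BitmapFactory.Options.inSampleSize, written as a Python sketch so that the numbers can be checked quickly. The function names are made up, and 4 bytes per pixel assumes the ARGB_8888 bitmap format.

```python
def in_sample_size(width, height, req_width, req_height):
    """Largest power-of-two factor that keeps both dimensions >= the requested size."""
    sample = 1
    if height > req_height or width > req_width:
        half_w, half_h = width // 2, height // 2
        while half_w // sample >= req_width and half_h // sample >= req_height:
            sample *= 2
    return sample


def bitmap_bytes(width, height, sample=1, bytes_per_pixel=4):
    """Approximate memory of the decoded bitmap (4 bytes per pixel assumed)."""
    return (width // sample) * (height // sample) * bytes_per_pixel


sample = in_sample_size(2048, 1536, 100, 100)
print(sample)                             # 8
print(bitmap_bytes(2048, 1536))           # 12582912 bytes at full size
print(bitmap_bytes(2048, 1536, sample))   # 196608 bytes after downsampling
```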

Right now I still have some issues with the interaction between my TextView and the other parts of the app. For the time being I don’t have time to fix those issues or to find out whether they come from the HTML that Jsoup generates or from the way I use TextView. But I’m pretty confident that these problems are solvable, and I already have some ideas on how to handle them. For now, TextView + ScrollView may be the solution for an extremely complex and dynamic list: it gives a good user experience with better memory performance than WebView + ScrollView, provided you do not want to parse the HTML and analyze the content to distinguish text parts from image parts.

This article is written as a complement to my Zhihu answer. In my opinion, the reason we have so many problems with ListView is Google’s problematic design of ListView. Here are my suggestions on how to improve ListView performance from the perspective of Android’s design.

  1. Prepare more Views before scrolling. Currently the ListView only prepares one extra View that is invisible to the user, but I believe that is not enough. The number of views rendered ahead should be increased.
  2. Android should introduce a kind of @PausableTask that runs on the UI thread but can be paused to let the UI thread draw what needs to be shown on the screen. We could then show only the basic outline of an item when a View is initialized and gradually fill in the detailed content in the intervals between UI refreshes, just like Facebook does on its web page: fill the page with placeholders first and fill those placeholders with content later on.


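Android Support Library v22.1: Impressions from Light Use

Google released the new Support Library v22.1 a couple of days ago. I have been using it lightly for the past two days, so here are some impressions, to be updated over time.

  1. Google changed the Activity base class again. It used to be ActionBarActivity, and now it is AppCompatActivity. A lot of the logic has changed too: previously you used a Toolbar as the ActionBar, but that is no longer needed because an ActionBar now comes built in, also in Material Design style. This change is nice if you are writing an app from scratch, but for an existing project... the migration will probably be painful enough to make you spit blood.
  2. The various Material Design components are being updated extremely slowly. A few more were indeed added this time, but you cannot set their elevation, so from a user’s point of view there is hardly any visible change. Animations and the like are all absent. Does this mean Google has no intention of supporting 4.x users?
  3. There is a new AlertDialog, and you only need to change the import. Thumbs up.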


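Recently I have been writing a new application whose backend uses Nginx + Python + Django + Gunicorn + Celery + libav. Celery in turn depends on RabbitMQ, and to keep Celery and Gunicorn running I also use Supervisor. There are too many pieces and the configuration is too tedious, so writing, debugging and deploying the code are all a challenge, and after a while I am afraid I will forget where all the config files and logs live. My working environment is Mac and Windows, while the servers run Ubuntu or CentOS. Something like libav is basically Linux-only, and although the Mac is *nix-like, it is not Linux after all. After thinking it over for a long time, I finally gave up debugging the Celery worker on the Mac and switched to analyzing and debugging with plain old logs on the Ubuntu server, which is painful beyond words.

Over the past couple of days I came across Docker online, and it looks really good: it has near-native performance on Linux, builds the runtime environment through a VM-like mechanism on other platforms, talks to the IDE through a remote debugging mechanism, and deployment is a bit like copying a virtual machine image. Every one of Docker’s advantages hits a pain point of development and deployment. I will definitely give it a try at the next deployment.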