Nobody really cares if it's reliable because it doesn't have any measurable applications; it's all just a matter of introspection, ways of organizing the data you've gathered about the way others prefer to interact.
I care whether something is reliable, regardless of application.
I've already agreed that the original form is silly, about 200 times. Why do you guys feel the need to continually prove this?
To continually irritate you.
If this is the best of possible worlds, what then are the others?
― Voltaire, Candide
There are several problems with the validity of the MBTI instrument.
The first is the assumption that people have a preference on each dimension. That is, the idea that most people tend to be, for example, clearly more E or more I, and few people sit in the middle between E and I. This is what justifies the "forced choice" nature of the MBTI test. If the assumption held, MBTI scores on each dimension should be bimodally distributed. However, research suggests that even the official MBTI scores, based on forced choices, are normally distributed on the dimensions. This would suggest that most people fall near the middle of each dimension, and few people have a clear preference. These results cast doubt on the idea that people actually have a personality "preference".
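To make the bimodal versus normal point concrete, here is a rough sketch with simulated numbers. Nothing in it is real MBTI data: the two populations, the midpoint at 0, and the `share_near_midpoint` helper are all assumptions I made up for illustration. It only shows how different the share of people near the middle looks under the two shapes.

```python
import random

def share_near_midpoint(scores, midpoint=0.0, band=0.5):
    """Return the fraction of scores lying within `band` of the midpoint."""
    near = sum(1 for s in scores if abs(s - midpoint) <= band)
    return near / len(scores)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Hypothetical population 1: scores on one dimension are normally
# distributed around the midpoint (what the research reportedly finds).
unimodal = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical population 2: two separate clusters, one on each side of
# the midpoint (what a true "preference" would be expected to produce).
bimodal = [rng.gauss(rng.choice([-1.5, 1.5]), 0.5) for _ in range(n)]

unimodal_share = share_near_midpoint(unimodal)
bimodal_share = share_near_midpoint(bimodal)

print(f"near the midpoint, normal scores:  {unimodal_share:.1%}")
print(f"near the midpoint, bimodal scores: {bimodal_share:.1%}")
```

With normally distributed scores, a large share of the simulated people sit close to the midpoint, so a forced E-or-I label separates people who barely differ. With the bimodal population almost nobody is near the middle, which is the situation the forced-choice format presupposes.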
The second is that the type descriptions have not been assembled empirically, but derived from theoretical assumptions made by Jung. As such, they are speculations about what the types are like, not actual observations of the types. Their accuracy is therefore unclear, even if we assume that the instrument itself is psychometrically sound.
The last is that the validity of the dimensions is limited when judged by their correlation with other personality models. The study you noted says only the E-I scale has sufficient construct validity. A recent paper I read mentioned that the E-I scale has the best construct validity and that the P-J and S-N scales have good construct validity, but that the real problem is the T-F scale, which does not seem to correspond to any one dimension of the Big Five.
There are, however, several strengths of the MBTI compared to other models of personality.
The MBTI actually has a theory at its roots, unlike the Five Factor Model. The latter is based on a "lexical" approach, which assumes that the major human personality traits can be inferred from the adjectives used to describe people. The MBTI, on the other hand, is based on Jung's theory of cognitive functions, which can be considered a "theory of the mind" on the border between philosophy and psychology. Although it is limited by the assumptions it makes, having a theory provides at least some logically sound explanations for personality differences.
The MBTI is non-judgmental in tone compared to other models of personality. The assumption of the MBTI is that it is not better to be E than I, for example. This has the advantage of making a statement about the importance of differences between people. The Five Factor Model, on the other hand, although neutral and descriptive in nature, has trait labels and explanations that may be construed as socially undesirable, or as inferior to the traits at the opposite end. Some people have suggested that the Five Factor Model implies that a "good" person is basically extraverted, agreeable, stable, open, and conscientious.
All in all, little research suggests that people have a preference, that type descriptions actually represent the types (as measured by the MBTI), and that the dimensions are meaningful psychometrically. However, the MBTI does have the benefit of having a theory at its roots and offering a more explicitly non-judgmental approach to personality differences. Personally, I enjoy the MBTI as a light way of thinking about personality differences that offers a way to make people think about who they are without feeling that their worth is judged.