Thursday, 1 December, 2016 UTC



Editor's Note: Being in a Java channel, most of us know the language very well and have been in its ecosystem for at least a couple of years. This gives us routine and expertise, but it also induces a certain amount of tunnel vision. In a new series, Outside-In Java, non-Javaists will give us their perspective of our ecosystem.
I don't deal with Java much, so I'm investigating how true all my preconceived notions about it are. Last time, I mostly explored user-facing concerns, like speed and size. Results: inconclusive.
But modern Java is increasingly used in places where it's invisible to users: In datacenters, on phones, on my toaster. So perhaps the really interesting questions are about how Java looks to developers.
Java Is Insecure
Java's infamy as a walking security issue dates back to the ancient days when Java applets were a common thing, you trusted that the JVM could effectively sandbox them, and occasionally it couldn't.
Maybe trying to whittle down an entire general-purpose language and its massive standard library to be safe enough to run from the web was a bad idea, but it's a moot point now. I very rarely see Java applets any more. I don't think I even have the NPAPI plugin installed. Firefox doesn't let Java applets run automatically, and is dropping support for them entirely in March; Chrome dropped support last year.
Granted, this was probably in part because Java applets had become more of an attack surface than a useful platform; a Cisco report from 2014 prominently claims that 91% of web exploits were aimed at Java. I think that same year, my then-employer was warning everyone to manually disable Java in their browsers if they didn't specifically need it. If that's the only exposure you get to Java, well, it's not going to leave a great impression.
Hey, hang on. This is supposed to be about the developer perspective. So what about the runtime itself, independent of applet concerns? "Secure" is difficult to quantify, but as a very rough approximation, I can look for the number of CVEs issued this year.
  • PHP: 107
  • Oracle Java: 37
  • Node: 9
  • CPython: 6
  • Perl: 5
  • Ruby: 1
Er, whoops, that caught me off guard. I honestly expected to be pleasantly surprised and clearly proven wrong here, but that list makes Java sound somewhat worse than I thought. I hope there's a great explanation for this, but I don't have one.
Java Is Enterprisey
Ah, another word that everyone uses (myself included) but that doesn't mean anything. It conjures a very specific image, but has only a very fuzzy definition. I have a few guesses as to what it might mean.

Java Is Abstracted Into The Stratosphere

The abstractosphere, if you will. The realm of the infamous AbstractSingletonProxyFactoryBean.
I'm actually a little confused about this one. Turning to elasticsearch again, I stumbled upon this class, WhitespaceTokenizerFactory. Its entire source code is:
public class WhitespaceTokenizerFactory extends AbstractTokenizerFactory {

    public WhitespaceTokenizerFactory(
            IndexSettings indexSettings,
            Environment environment,
            String name,
            Settings settings) {
        super(indexSettings, name, settings);
    }

    @Override
    public Tokenizer create() {
        return new WhitespaceTokenizer();
    }
}
Okay, sure. You want to be able to create an arbitrary tokenizer from some external state, but you don't want the tokenizers themselves to depend on the external state. Makes sense.
Still, this code looks pretty silly, especially if you haven't seen the other classes that do more elaborate things. The same words are repeated three times; a 38-line file has only two lines of actual code. It's easy to look at this and think Java code goes to ridiculous extremes with its indirection. At worst, I might do this in Python:
@builder_for(WhitespaceTokenizer)
def build(cls, index_settings, env, name, settings):
    return cls()

@builder_for(SomeOtherTokenizer)
def build(cls, index_settings, env, name, settings):
    return cls(index_settings.very_important_setting)

# etc.
I'm handwaving how this would actually work, but there's not much to it. It might even be possible in Java, come to think of it, but probably not pretty or idiomatic. Alternatively, Python code might just put the build method on the tokenizer classes themselves. One nice thing about dynamic typing is that code can use a type without depending on it. The tokenizer class can work with IndexSettings and Environment objects without having to import the types or even know they exist. It's a little iffy, but in a case like this where everything's internal, it could make sense.
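For the curious, here is roughly what the handwaved part could look like: a minimal sketch, assuming the decorator simply records each builder function in a dictionary keyed by tokenizer class. Every name in it (BUILDERS, builder_for, build_tokenizer, the stand-in tokenizer classes, and the very_important_setting attribute) is hypothetical and exists only for illustration; none of it comes from Elasticsearch.

```python
from types import SimpleNamespace

# Hypothetical registry mapping a tokenizer class to its builder function.
BUILDERS = {}


def builder_for(cls):
    """Return a decorator that registers a function as the builder for cls."""
    def register(func):
        BUILDERS[cls] = func
        return func
    return register


# Stand-in tokenizer classes; they know nothing about settings objects.
class WhitespaceTokenizer:
    pass


class SomeOtherTokenizer:
    def __init__(self, very_important_setting):
        self.very_important_setting = very_important_setting


@builder_for(WhitespaceTokenizer)
def build_whitespace(cls, index_settings, env, name, settings):
    return cls()


@builder_for(SomeOtherTokenizer)
def build_some_other(cls, index_settings, env, name, settings):
    return cls(index_settings.very_important_setting)


def build_tokenizer(cls, index_settings, env, name, settings):
    """Look up the registered builder for cls and call it."""
    return BUILDERS[cls](cls, index_settings, env, name, settings)


# Usage: any object with the right attribute works as "index settings".
fake_settings = SimpleNamespace(very_important_setting=42)
ws = build_tokenizer(WhitespaceTokenizer, fake_settings, None, "ws", {})
other = build_tokenizer(SomeOtherTokenizer, fake_settings, None, "other", {})
```

The builders here get distinct names only so they stay individually callable; since the registry holds the references, reusing one name for all of them would work too.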
But given that Java's type system is what it is, I can understand why you'd end up with the above code. What confuses me is this.
Why don't I see the same thing in other languages?
I found this collection of tiny factory classes after about a minute of randomly clicking around in the most starred Java project on GitHub. I'm completely unsurprised by it. Yet I can't recall seeing anything similar in other explicit, statically-typed languages. Where are the tiny factory classes in C++? The most starred C++ project is Electron, and searching for "factory" only finds me code like this, which has a lot more going on. The most starred Objective-C project is AFNetworking, which contains "factory" once — in a changelog. The most starred Swift project is Alamofire, which somehow doesn't contain the word "factory" anywhere!
So while I can accept that layers of indirection and tiny classes are useful for getting along with a C++-style type system, I don't understand why I see them so much more often in Java than even in, well, C++.
Is this a cultural difference? Are C++ developers happy to have a tangled web of interconnected dependencies? Do these tiny classes exist in C++, but live all together in a single file where they're much easier to ignore?
Java definitely seems to live in the abstractosphere, but I can't figure out why it's so different from similar languages.

Java Is Tediously Verbose

"Enterprise" makes me think of repetitive bureaucracy sucking the joy out of everything.

Accessors Everywhere

And Java makes me think of accessors. Same idea, really.
private int foo;

public int getFoo() {
    return this.foo;
}

public void setFoo(int foo) {
    this.foo = foo;
}
Look at all this code eating up precious vertical space to do absolutely nothing. I could've just said public int foo; and been done with it.
There are three kinds of programmers in the world, distinguished by how they reacted to that last paragraph. Some nodded their heads, and they are probably Python programmers. Some balked that this violates encapsulation, and will balk again when I say that I don't care about encapsulation. Finally, some rolled their eyes and pointed out that a public attribute is frozen into the API and can never be changed without breaking existing code.
Ah, those latter folks might have a point. The trouble is that Java doesn't support properties. "Property" is a horrible generic name for a language feature that's become popular only somewhat recently, but if you're not familiar, I mean this magical thing you can do in Python. If you have a foo attribute that external code is free to modify, and later you decide that it should only ever be set to an odd number, you can do that without breaking your API:
class Bar:
    def __init__(self):
        # Leading underscore is convention for "you break it, you bought it"
        self._foo = 3

    @property
    def foo(self):
        return self._foo

    @foo.setter
    def foo(self, foo):
        if foo % 2 == 0:
            raise ValueError("foo must be odd")
        self._foo = foo

bar = Bar()
bar.foo = 8  # ValueError: foo must be odd
@property is an artifact of great power that transparently intercepts attempts to read or write an attribute. Other code can still work with obj.foo as expected and never know the difference. Even @property itself can be expressed in plain Python code, and there are some interesting variants: Lazy-loading attributes, attributes that transparently act as weak references, etc.
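To back up the claim that @property can be written in plain Python, here is a stripped-down sketch built on the descriptor protocol (__get__ and __set__). SimpleProperty is a hypothetical name, and this is not how the built-in is actually implemented; it skips deleters, docstrings, and other details the real thing handles.

```python
class SimpleProperty:
    """A minimal stand-in for the built-in property, as a data descriptor."""

    def __init__(self, fget, fset=None):
        self.fget = fget
        self.fset = fset

    def setter(self, fset):
        # Return a new descriptor that keeps the getter and adds a setter.
        return SimpleProperty(self.fget, fset)

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class itself, not on an instance.
            return self
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)


class Bar:
    def __init__(self):
        self._foo = 3

    @SimpleProperty
    def foo(self):
        return self._foo

    @foo.setter
    def foo(self, value):
        if value % 2 == 0:
            raise ValueError("foo must be odd")
        self._foo = value
```

Reading bar.foo goes through __get__, and assigning to bar.foo goes through __set__, so callers still see what looks like an ordinary attribute.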
I know Python, Swift, and a number of .NET languages (C#, F#, VB, Boo, ...) support properties. JavaScript is specced as supporting them by now, though I'm not sure how much code relies on them in the wild. Ruby has them, with slightly different semantics. Lua and PHP can fake them. Perl has a thing but you probably shouldn't use it. The JVM itself must be able to support them, since Jython and JRuby exist. So why not Java the language?
It seems odd to me that Java hasn't picked up on this feature that would cut out a lot of repetition. It was apparently proposed for Java 7, but I can't find an explanation of why it didn't make the cut, and now it seems to be very much not a priority.
