Can Java learn from Python when it comes to library design?

January 16, 2008

When it comes to the language wars, I eventually came to realize that productivity in a language is driven first by the library that comes with the language (if any) and second by the IDEs that are available for that language.

One thing that makes Python great is that its maintainers integrate a lot of useful stuff right into the standard library. Java, unfortunately, hasn’t done a great job of following this lead. For example, it doesn’t provide a way to Base64-encode a string by default. When I use my IDE to search for classes with the name Base64, I get a rather amusing list of re-re-implementations and copies of the same class, all because the Java library maintainers are unable or unwilling to put this common functionality into the library.
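For comparison, here is a minimal sketch of the same task on the Python side, using only the standard library's base64 module. The helper names encode_base64 and decode_base64 are made up for this illustration; b64encode and b64decode are the real library calls.

```python
import base64


def encode_base64(text: str) -> str:
    """Base64-encode a text string (hypothetical helper for this sketch)."""
    # b64encode works on bytes, so encode the text first, then turn the
    # ASCII result back into a str.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_base64(encoded: str) -> str:
    """Reverse of encode_base64."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_base64("hello world"))      # aGVsbG8gd29ybGQ=
print(decode_base64("aGVsbG8gd29ybGQ="))  # hello world
```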

Anyway, when I saw this, I thought “Java could really learn a few things from Python about having a feature-rich standard library …”

All the completions of Base64 in my Java EE app

procedural programming, ORM implementations

November 18, 2007

It recently occurred to me that, as a consequence of the way ORMs are implemented currently, I’ve been doing some procedural programming and actually finding it somewhat uncomfortable to have to do so. I think I’ve been doing OO programming for so long, I can’t handle procedural code as well any more.

Maybe you’re wondering what the hell I’m talking about with this “ORMs are making code procedural” idea. It certainly looks like OO code: you have classes, methods, objects and so on. Yes, but the way it gets used is mostly procedural. I’ve noticed in both Elixir (a Python framework) and the Java Persistence API that when you manage relationships between objects using an ORM, you typically use a centralized object (often called the DAO in Java-land) that persists new objects and removes old objects, as well as ensuring that both sides of the relation are always kept in sync. Although I always start out trying to put all the logic onto the objects (where, theoretically, it belongs), I always run into trouble doing this, and it’s back to using the DAO for pretty much every operation except updating simple fields.

I seem to be forgetting, for the moment, the exact reasons why this is. I know that the first problem I’ve run into has to do with the lack of delete-orphan support in the Java Persistence API. Since rows are not automatically deleted when you remove them from a collection, you need a pointer to the EntityManager when you remove from a relationship. I think I’ve run into other problems, but the details are eluding me temporarily. I’m sure there’s something I’m missing and there’s really a great way around all this, but I don’t know what it is. Maybe the next version of JPA will suck a bit less.
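To make the shape of the problem concrete, here is a rough sketch in plain Python. Nothing in it is a real ORM API: FakeSession is a made-up stand-in for something like an EntityManager, and Order, Item and OrderDao are hypothetical names. It only shows how the relationship logic ends up in the DAO because that is where the session lives.

```python
class Order:
    """Plain entity: holds data, but has no access to the session."""

    def __init__(self, name):
        self.name = name
        self.items = []


class Item:
    def __init__(self, label):
        self.label = label
        self.order = None


class FakeSession:
    """Made-up stand-in for an ORM session; tracks persisted objects in a set."""

    def __init__(self):
        self.persisted = set()

    def persist(self, obj):
        self.persisted.add(obj)

    def remove(self, obj):
        self.persisted.discard(obj)


class OrderDao:
    """Centralized object that owns the session and the relationship logic."""

    def __init__(self, session):
        self.session = session

    def add_item(self, order, item):
        # Keep both sides of the relation in sync, then persist the new row.
        order.items.append(item)
        item.order = order
        self.session.persist(item)

    def remove_item(self, order, item):
        # Without delete-orphan support, dropping the item from the
        # collection is not enough; the row has to be removed explicitly.
        order.items.remove(item)
        item.order = None
        self.session.remove(item)


session = FakeSession()
dao = OrderDao(session)
order, item = Order("books"), Item("novel")
dao.add_item(order, item)
dao.remove_item(order, item)
print(item in session.persisted)  # False
```

Putting remove_item on Order itself would require Order to hold a reference to the session, which is exactly the part that feels awkward.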

Python isinstance considered useful

October 7, 2007

I seem to occasionally bump into this idea in Python that you shouldn’t use “isinstance” to check the type of something. You should instead infer what type of object you have by checking for the presence or absence of particular methods. Every time I run into it, I just completely fail to agree with the arguments against isinstance().

The argument is that these “implicit interfaces” provide more flexibility to whoever calls into your code: if two objects happen to implement methods with the same “meaning”, they are interchangeable in theory and so should be interchangeable in practice. The classic example of this is Python’s famous “file-like objects”, which typically implement read and/or write in the same way and are accepted by various Python functions. I believe the DB API is another well-used example of this. In both cases, there is no common base class, so it’s impossible to use isinstance() to check whether a particular object is, in fact, file-like or a database object.
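A small sketch of what such an implicit interface looks like in practice. UppercaseSource and count_words are hypothetical names invented for this example; the only thing the two sources share is a read() method.

```python
import io


class UppercaseSource:
    """Hypothetical class with no relation to the io classes; it only has read()."""

    def __init__(self, text):
        self._text = text

    def read(self):
        return self._text.upper()


def count_words(source):
    """Count words from any object whose read() returns text."""
    return len(source.read().split())


print(count_words(io.StringIO("a b c")))                 # 3
print(count_words(UppercaseSource("x y")))               # 2
print(isinstance(UppercaseSource("x y"), io.IOBase))     # False
```

Both calls work, yet the isinstance() check against io.IOBase fails for the second object, which is the situation described above.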

The majority of the good cases for “implicit interfaces” are relying on a particular context where the nature of the object being given is completely clear – the function is documented as taking one type or another, and generally if it accepts more than one type, the possible “types” of objects are unlikely to collide.  In the case of file-like objects, you’re often given the option of passing a string, which is then treated as a URL or a filename and opened for you as a convenience.
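The “string or file-like object” convenience might look roughly like this. read_text is a hypothetical function, and for brevity the sketch treats a string only as a filename, not as a URL.

```python
import io
import os
import tempfile


def read_text(source):
    """Accept either a filename (str) or an object with a read() method."""
    if isinstance(source, str):
        # A string cannot be mistaken for a file-like object, so the two
        # accepted "types" are unlikely to collide.
        with open(source, "r", encoding="utf-8") as handle:
            return handle.read()
    return source.read()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "note.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("from a file")
    print(read_text(path))                      # from a file

print(read_text(io.StringIO("from a stream")))  # from a stream
```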

I think that implicit interfaces would benefit from being replaced with explicit interfaces. Python doesn’t have great support for these, but you can declare a class with all empty methods and use that as the interface. Python allows multiple inheritance, so you can implement all the interfaces your object supports by making them superclasses. Ideally Python would include an interface called FileLike and an interface called DatabaseLike, we’d implement those in our file-like and database-like objects, and Python would provide a way to verify (in the test suite, I suppose) that all interfaces are fully implemented. That way, when a new version of DatabaseLike comes out, we’ll get a notification that the database class needs to be updated.
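Here is one way the idea could be sketched. FileLike, Closeable, MemoryFile, HalfDone and missing_methods are all hypothetical names for this example, not anything Python ships; the helper is the kind of check a test suite could run.

```python
class FileLike:
    """Hypothetical explicit interface: the methods carry no behaviour."""

    def read(self):
        raise NotImplementedError

    def write(self, data):
        raise NotImplementedError


class Closeable:
    """A second hypothetical interface, to show multiple inheritance."""

    def close(self):
        raise NotImplementedError


class MemoryFile(FileLike, Closeable):
    """Implements both interfaces by listing them as superclasses."""

    def __init__(self):
        self._chunks = []
        self.closed = False

    def read(self):
        return "".join(self._chunks)

    def write(self, data):
        self._chunks.append(data)

    def close(self):
        self.closed = True


class HalfDone(FileLike):
    """Forgets to implement write()."""

    def read(self):
        return ""


def missing_methods(cls, interface):
    """Return the public interface methods that cls has not overridden."""
    missing = []
    for name, member in vars(interface).items():
        if name.startswith("_") or not callable(member):
            continue
        # If the class still resolves to the interface's own function,
        # nobody in the hierarchy has provided an implementation.
        if getattr(cls, name) is member:
            missing.append(name)
    return sorted(missing)


print(isinstance(MemoryFile(), FileLike))     # True
print(missing_methods(MemoryFile, FileLike))  # []
print(missing_methods(HalfDone, FileLike))    # ['write']
```

If a later version of the interface gained a new method, missing_methods would start reporting it for every class that had not caught up. The standard library's abc module offers a related check that runs when a class is instantiated.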

There are two cases where isinstance() might fall down, but testing for attributes wouldn’t.  First, when you use reload() to reload a module, the interfaces might be re-created, and your isinstance() tests would incorrectly fail.  However, I suggest that this indicates a problem with the use of reload() – really, if you use reload(), you should make sure you’ve reloaded the entire class hierarchy.  Second, if you have some kind of object proxy, for RPC or weak references, it might not handle isinstance() as expected.
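The reload() case can be imitated without touching real modules. In this sketch, define_interface is a hypothetical stand-in for re-executing a module: each call creates a brand-new class object with the same name, which is the effect that trips up isinstance().

```python
def define_interface():
    """Stand-in for reloading a module: every call builds a new class object."""

    class FileLike:
        def read(self):
            raise NotImplementedError

    return FileLike


original_interface = define_interface()


class MemoryFile(original_interface):
    def read(self):
        return "data"


old_instance = MemoryFile()

# "Reload" the interface: same name and same source, but a different class.
reloaded_interface = define_interface()

print(isinstance(old_instance, original_interface))  # True
print(isinstance(old_instance, reloaded_interface))  # False
print(hasattr(old_instance, "read"))                 # True
```

The attribute check keeps working because it never looks at the class identity, only at what the object can do.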

When designing an API, most function parameters will expect only one type of argument. If you like the idea of flexibility, just don’t assert against the parameters at all, and let the app fail when it discovers a needed operation is missing. That’ll allow callers to use subclassing or not, as they choose. However, when you need to take a different path based on the type of object you’ve been given, you’ll have to decide whether to use implicit interfaces, which you check for by looking for certain attributes, or explicit interfaces, which you check for using isinstance(). I think an implicit interface is a good thing when the interface has relatively few methods and callers are likely to substitute not only their own implementations but also objects they didn’t write that “luckily” have the right methods. If your interface is more complex, or you know there aren’t any other objects which have compatible methods, you can create an explicit interface without method bodies and ask the user to provide an appropriate subclass of that interface. Finally, if you know that you’re only passing subclasses of a certain class and have no use for any other type of implementation, just pass those subclasses. There’s a little thing called “keep it simple”, and this is probably great most of the time.

So, implicit interfaces have their place, and so does isinstance(). Use them both as needed!