One reason why I hate beans.

I’m now poking around some source code for a project that was built with a bunch of little beans. I understand the theory of a Java bean: segregating your code into small, self-contained modules makes development easier (the person developing the bean only has to worry about that bean), and it makes debugging easier (you can host the bean in a test framework and test it in isolation).

However (and there is always a ‘however’ in my rants), I’m reminded of a post on Cocoa development: Thoughts about large Cocoa projects:

My app isn’t huge compared to Photoshop or Word—it’s teeny in comparison—but it’s large compared to some Cocoa apps. (It has 345 .m files and an executable size of 3.2MB when stripped.)

It’s big enough that, were you to ask me how _____ works, I’d have to go look. There’s no way I can remember, with any level of detail, how every part of it works.

I call it the Research Barrier, when an app is big enough that the developer sometimes has to do research to figure things out. (“Research” just means reading the code and following some paths of execution, sometimes running in the debugger.)

To summarize: my problem with beans, the Spring framework, and other systems designed to break your Java program into a bunch of small objects, wired together by a mechanism that uses Java reflection, is that they make the “research barrier” much worse. They make the questions “who calls this?” and “what does it call?” much harder, if not impossible, to answer. Rather than answering “who calls this?” by right-clicking on the method call in Eclipse and doing a search, you can no longer answer the question in any reasonable way: the entire application has been reduced to a complex dance of dozens or hundreds of jar files. Worse: the overall ‘flow’ of the application is no longer encapsulated in a handful of classes but instead lives in some configuration file which, by design, can change without notice.
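To make the complaint concrete, here is a minimal sketch of reflection-based wiring. It is not Spring's actual mechanism, just the general idea; the class `ReportDao`, the method `loadReport`, the table names, and the `CONFIG` map (standing in for an external configuration file) are all made up for illustration.

```java
import java.lang.reflect.Method;
import java.util.Map;

public class ReflectiveWiringDemo {

    // Hypothetical bean. Nothing in this file calls loadReport() directly.
    public static class ReportDao {
        public String loadReport() {
            return "orders, customers"; // made-up table names
        }
    }

    // Stands in for an external config file. In a real system these strings
    // would live in XML or properties; the class name is computed here only
    // so the sketch runs no matter which package it is compiled into.
    public static final Map<String, String> CONFIG = Map.of(
            "report.class", ReportDao.class.getName(),
            "report.method", "loadReport");

    // Looks up the class and method by name at run time and invokes the method.
    public static Object invokeFromConfig(Map<String, String> config) {
        try {
            Class<?> type = Class.forName(config.get("report.class"));
            Object bean = type.getDeclaredConstructor().newInstance();
            Method method = type.getMethod(config.get("report.method"));
            return method.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Bad wiring configuration", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeFromConfig(CONFIG));
    }
}
```

Ask an IDE for the callers of `loadReport()` and it finds none, because the only mention of the method is the string `"loadReport"` in the configuration. Change that string and the program does something different without a single line of Java changing.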

So here I am with a bunch of beans and all I want to answer is “what database tables are queried to generate a report?” The answer? With all these beans scattered around, apparently the answer changes with the phase of the moon.

The Mystique of Google

No, I haven’t quite drunk the Kool-Aid yet here at Yahoo.

But I’m starting to get a better appreciation of the reality versus the mystique of Google and of Yahoo and of other search engines.

Basically it boils down to this: in the group I’m in, we sell advertising. Perhaps 25 years ago, when I first got on the ‘net, advertising there would have seemed like a bad thing. But today, when you get right down to it, we use the ‘net for shopping and searching for things to buy, which means that advertising is a natural extension of what we do. If I type “dvd players” into a search engine, I may be looking for a dvd player to buy: a perfect opportunity for Circuit City to remind me that they’re just down the street.

Now Google makes a ton of money. They make a ton of money because they have successfully driven a ton of traffic to their site. Google is comfortable with allowing their trademark to become a verb: we don’t ‘search’ the ‘net, we ‘google’ the net.

And they’ve made a ton of money by, well, by creating a mystique around how brilliant they are: Google has spent a lot of time talking about how brilliant their search is, how brilliant their developers are, and how brilliant their company is. And yes, at the time Google Maps was a brilliant early example of the power of AJAX.

But, and I hate to say this, search is a solved problem. Google didn’t invent search; people flocked to Google because it was a simple, easy, and unobtrusive search engine. The ‘brilliant’ part came about later as Google spent a ton of time talking about search strategies, but every company uses different scoring techniques to score search results. Google happens to be much better at detecting certain technical terms: “web.xml” searches for the literal seven-character string, while Yahoo breaks the term into “web” and “xml”, which gives completely different results. And Google happens to “hide” a bunch of goodies in their search interface, which is less about being “brilliant” and more about being “useable.”
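The “web.xml” difference is easy to picture as two ways of chopping up a query. The sketch below is only an illustration of that idea, not how either search engine actually tokenizes; the class and method names are invented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class TokenizerDemo {

    // Splits only on whitespace, so "web.xml" survives as a single term.
    public static List<String> literalTerms(String query) {
        return Arrays.asList(query.trim().split("\\s+"));
    }

    // Splits on every run of non-alphanumeric characters,
    // so "web.xml" becomes the two terms "web" and "xml".
    public static List<String> splitTerms(String query) {
        return Arrays.stream(query.split("[^A-Za-z0-9]+"))
                .filter(term -> !term.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(literalTerms("web.xml"));
        System.out.println(splitTerms("web.xml"));
    }
}
```

The first tokenizer matches only documents that mention the file name itself; the second matches anything that mentions both “web” and “xml”, which is a very different result set.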

In other words, Google has conflated “brilliant” and “useable.” And that creates the illusion that any portal is somehow “primitive.” Further, by leveraging this sort of “hidden” functionality, it causes people to learn how to use Google–and it creates a sort of psychological lock-in: if I need to add five numbers together, I can just pop up Google.

But this isn’t really all that “brilliant”, is it?

The people I’m meeting here at Yahoo are brilliant. Once you get past the fact that Yahoo’s home page is a portal–and I hate portals, I’ll readily admit–you find that Yahoo maps are pretty cool, Yahoo mail beats Google mail, aside from the fact that you are welcomed by a damned ad, and Yahoo Groups has a lot more functionality than Google Groups. (Google Groups suffers from being too simple: a good interface should be simple, but not too simple.)

I’m not saying that Yahoo is better than Google; what I’m saying is that the illusion of Google’s fundamental brilliance is just that: an illusion. Web technologies by Yahoo or Google or MSN or Ask.com are far more sophisticated than the typical small-shop web store tech done by someone with a basic shopping cart, and they are far more sophisticated than this web site’s Wiki and Blog.

But it isn’t a hell of a lot more sophisticated than desktop programming. In fact, I’d suggest that, as sophisticated as our advertising network back-end (Panama) is, what makes it hard is keeping the components simple so we can quickly react to new frauds. There are certainly fewer moving parts in Panama than in Photoshop, for example.

This is an interesting business. But look for me to put a multi-threaded B-tree implementation in Java up on the Wiki in the next month or so, because while the business is interesting, it’s not all that technologically interesting.