The Mystique of Google

No, I haven’t quite drunk the Kool-Aid yet here at Yahoo.

But I’m starting to get a better appreciation of the reality versus the mystique of Google and of Yahoo and of other search engines.

Basically it boils down to this: in the group I’m in, we sell advertising. Perhaps 25 years ago when I first got on the ‘net, the notion of advertising would be a bad thing. But today, when you get right down to it, we use the ‘net for shopping and searching for things to buy–which means that advertising is a natural extension of what we do. If I type in “dvd players” into a search engine, I may be looking for a dvd player to buy; perfect opportunity for Circuit City to remind me that they’re just down the street.

Now Google makes a ton of money. They make a ton of money because they have successfully driven a ton of traffic to their site. Google is comfortable with allowing their trademark to become a verb: we don’t ‘search’ the ‘net, we ‘google’ the net.

And they’ve made a ton of money by, well, by creating a mystique around how brilliant they are: Google has spent a lot of time talking about how brilliant their search is, how brilliant their developers are, and how brilliant their company is. And yes, at the time Google Maps was a brilliant early example of the power of AJAX.

But–I hate to say this–search is a solved problem. Google didn’t invent search; people flocked to Google because it was a simple, easy and unobtrusive search engine. The ‘brilliant’ part came about later as Google spent a ton of time talking about search strategies–but every company uses different scoring techniques to score search results. Google happens to be much better at detecting certain technical terms: “web.xml” searches for the literal seven-character string, while Yahoo breaks the term into “web” and “xml”, which gives completely different results. And Google happens to “hide” a bunch of goodies in their search interface–which is less about being “brilliant” and more about being “useable.”
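To make the “web.xml” difference concrete, here is a minimal sketch in Java of the two tokenizing strategies. This is only an illustration: I have no idea what either engine’s real tokenizer looks like, and the class and method names (QueryTokenizers, literalTerms, splitOnPunctuation) are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not how Google or Yahoo actually tokenize queries.
final class QueryTokenizers {
    // Keep each whitespace-separated term intact, so "web.xml" stays one literal token.
    static List<String> literalTerms(String query) {
        return nonEmpty(query.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Split on anything that is not a letter or digit, so "web.xml" becomes "web" and "xml".
    static List<String> splitOnPunctuation(String query) {
        return nonEmpty(query.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"));
    }

    // Drop the empty strings that String.split can leave at the front of the array.
    private static List<String> nonEmpty(String[] parts) {
        List<String> tokens = new ArrayList<>();
        for (String part : parts) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(literalTerms("web.xml"));       // [web.xml]
        System.out.println(splitOnPunctuation("web.xml")); // [web, xml]
    }
}
```

Once the query has been split into “web” and “xml”, any page that mentions both words is a candidate, which is why the two approaches return such different results.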

In other words, Google has conflated “brilliant” and “useable.” And that creates the illusion that any portal is somehow “primitive.” Further, by leveraging this sort of “hidden” functionality, it causes people to learn how to use Google–and it creates a sort of psychological lock-in: if I need to add five numbers together, I can just pop up Google.

But this isn’t really all that “brilliant”, is it?

The people I’m meeting here at Yahoo are brilliant. Once you get past the fact that Yahoo’s home page is a portal–and I hate portals, I’ll readily admit–you find that Yahoo maps are pretty cool, Yahoo mail beats Google mail, aside from the fact that you are welcomed by a damned ad, and Yahoo Groups has a lot more functionality than Google Groups. (Google Groups suffers from being too simple: a good interface should be simple, but not too simple.)

I’m not saying that Yahoo is better than Google–what I’m saying is that the illusion of Google’s fundamental brilliance is just that: an illusion. Web technologies by Yahoo or Google or MSN or Ask.com are far more sophisticated than the typical small-shop web store tech done by someone with a basic shopping cart–and they are far more sophisticated than this web site’s Wiki and Blog.

But it isn’t a hell of a lot more sophisticated than desktop programming. In fact, I’d suggest that, as sophisticated as our advertising network back-end (Panama) is, what makes it hard is keeping the components simple so we can quickly react to new frauds. There are certainly fewer moving parts in Panama than in Photoshop, for example.

This is an interesting business. But look to me putting a multi-threaded B-tree implementation in Java up on the Wiki in the next month or so–because while the business is interesting, it’s not all that technologically interesting.

Found the sysdeo plugin on the French site.

*sigh*

If you search for the sysdeo plugin for developing Tomcat on Eclipse, you’ll find that every link in the blasted world points to the English site, which is now gone.

A little investigation and puzzling out some random French, and I found where it’s now located on the French site. Hopefully if anyone is setting up to do Tomcat development under Eclipse, and they want the Sysdeo plugin (trust me; worth its weight in gold), here is a link.

Books I’d like to see.

I’d love to see someone write a book on the “trial and error” company. Because that’s what most web companies really are: once they get a web application up on the web and start attracting customers, they can quickly use the statistics they gather as to how users reach their site and how they use their site to make adjustments quickly to make the customers happy.

Of course in a sense all companies are trial-and-error companies. Unfortunately most non-web based companies use customer support as the primary source of feedback from their customers, but rather than consider customer support part of the development process or part of the feedback process, companies seem to treat customer support as a necessary evil, as a cost center to shut the customers up.

Dry Run, and Fun Gadgets

Today I tried a dry run, testing out my new bicycle to ride into Yahoo Burbank. I also equipped it with a Garmin eTrex Vista CX GPS on a bike handlebar mount, so I can figure out which side streets to take and know my altitude and speed.

Six miles, and when I got to Yahoo, I was beat to hell–I know I’m out of shape, but I’m not that out of shape… But the Garmin told the story: the first mile was a 350 foot drop in altitude, which I knew about. But the last four miles and change was a steady 400 foot climb–which means that while the ground seems somewhat flat, I was working for those four and a half miles. I wasn’t just coasting along supplying energy to beat wind resistance.

Lesson of the day: I’m going to have to take a change of clothes so I can shower when I get to Yahoo–at least until I get into good enough shape that bike riding six miles is a no-brainer.

(Yes, I know; there was no programming content. So to compensate, I give you the Yahoo Developer Center on JavaScript, along with the best book I’ve encountered so far on AJAX: Ajax Hacks from O’Reilly.)

Using the First-Run Experience to make you happy.

From Daring Fireball:

Apple is the one and only PC maker that sees the first-run experience as an opportunity to make you happy, rather than as an opportunity to make a few bucks by showing you ads and stuffing trialware down your throat.

How true.

Advertisers know about this. They know that when you sell something to someone–even something that won’t be replaced often, like a car–that they need to continue to advertise to you. It’s a halo effect: ads for BMW are targeted to BMW drivers to make them happy they bought a BMW, so they are more likely to recommend BMW, and so they are more likely to replace their older BMWs with a newer BMW. It’s called customer loyalty, and it’s built first and foremost by taking every opportunity you can to make the customer happy.

Computer manufacturers, however, don’t seem to remember this basic law of advertising, a law that was probably discovered by the Romans–if not by cavemen. And so they abuse the hell out of you on the first boot, exchanging short-term gain of collecting a few bucks from advertisers for the happiness of their customers in making a multi-hundred or multi-thousand dollar purchase. The assumption is that happiness isn’t as important as making a few bucks.

But Apple has figured out this basic law of Happiness. So much so that when I first unboxed my Apple TV, I carefully took it out of the box, plugged it into the TV, configured the TV, used a flashlight to triple-check to make sure the TV was properly hooked up and the tuner was switched to the correct channel–all so I could make sure that the moment I plugged the Apple TV into the wall, full video and audio would work correctly from the millisecond power was applied.

Because I didn’t know what the Apple TV box would do when it was first turned on–but I knew that it would make me happy, and I didn’t want to miss a second of it.

The checking of the wires and verifying the component video and audio was correctly synced was well rewarded: when I plugged the Apple TV in, it made me happy.

And it is the same with the iPhone: I don’t know what the unboxing and first-run experience will be like–but I know it will make me happy. My present to my wife when I first get her an iPhone will be to leave it in the box, so she can also experience the same joy–a joy which is so rare when unboxing electronics that she so far has refused to unbox anything. Because she has learned, thanks to years of training, that unboxing and opening up a new gadget is not like Christmas, but more like opening up the sealed container that has been living in the back of your refrigerator for the past year or so. You don’t know what to expect, but you know it will take work before you like it.

Why Windows Sucks, or at least one reason…

My parents can’t get Microsoft Excel to work.

The symptom was that after installing some other program, suddenly when they open a document they are greeted with a Microsoft Excel spreadsheet–but the entire content area is gray. No cells, no editing, no joy–just a large gray expanse where the data should be. But the application is not frozen; oh, no. The whole thing seems to be behaving just fine, like a small puppy with a large candy stuck to its fur: the thing just hops around like all is well, despite the fact that there is this ugly red sticky mass hanging off its fur.

My usual recommendation in a situation like this: you’re hosed. Uninstall the entire application, uninstall the other offending application, and reinstall from scratch–and hopefully this will clear the whole mess up.

So my father tried this–no joy. Of course the results were different: he can create a new spreadsheet just fine. But try to open an existing spreadsheet–gray empty blob. The red sticky candy has somewhat been removed–but there is still a red sticky matted spot where there should be clean brown fur.

*sigh*

I hate Windows. And one of the reasons why is this sort of odd-ball DLL hell that Windows seems to inflict on all applications. Without ever having seen the source code to Microsoft Excel or this other application, I can tell you exactly what failed: there is some DLL somewhere, which was “upgraded” by the other application, which broke Excel. (The other application also provided spreadsheet-like functionality in one of its windows, which makes me think they shipped with some DLLs used by Excel.)

And here’s the problem. In a large corporate environment we build our applications with tons of plug-ins: DLLs, Jar files, MacOS X Frameworks–and we do this because we want to either leverage code written by someone else without increasing the overall shipping footprint of our application, or because we want to minimize inter-team communications, which is the bane of any large software development team. (Because as we all know, team communications is essentially an O(N**2) problem–unless we can create multiple groups, in which case we can reduce this to a max(O(G**2),O(T**2)) problem with G*T = N, where G is the number of groups you’ve broken your overall team down into.)
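To put rough numbers on that, here is a small Java sketch that counts pairwise communication channels. The figures (10 groups of 10 people) are invented for the example, and the model is deliberately crude: everyone talks to everyone inside a group, and only one representative per group talks across groups.

```java
// Hypothetical illustration of the O(N**2) communication argument; the team sizes are made up.
final class TeamChannels {
    // Number of distinct pairs among n participants: n * (n - 1) / 2.
    static long pairwiseChannels(long n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        long groups = 10;                 // G: number of groups
        long perGroup = 10;               // T: people per group
        long people = groups * perGroup;  // N = G * T

        // One flat team: every person may need to talk to every other person.
        System.out.println(pairwiseChannels(people));   // 4950

        // Split into groups: channels inside any one group...
        System.out.println(pairwiseChannels(perGroup)); // 45
        // ...and channels between the group representatives.
        System.out.println(pairwiseChannels(groups));   // 45
    }
}
```

With these assumed numbers, nobody has to keep track of more than 45 channels instead of 4950, which is the max(O(G**2),O(T**2)) reduction in miniature.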

We go along, we test our software, we see everything works, we ship. Job done.

Except… Well, except that, because three quarters of the code we just shipped is someone else’s, and because that code (in the form of frameworks, Jars, DLLs, whatever) can be replaced willy-nilly, with at best only a loose contract as to how those DLLs work, a loose contract that we may not have used correctly because some guy wanted to go home early rather than make his code more bullet proof, or because he didn’t understand the contract but by coincidence his code worked anyways… Well, you see where I’m going with this.

Someone ships an update to some plugin, which ships a new version of some DLL, and your code breaks in some subtle, bit-rottish sort of way. Something flickers before it’s drawn, or under some situations a file opened before some application ran causes a crash–the code you thought was bullet proof wasn’t all that bullet proof after all.

Windows seems especially prone to this problem: I don’t know of a single application that ships as a single, self-contained application. Every one of them breaks down into a million little DLLs, even if they are all Microsoft DLLs–and none of them are under your control. And worse: we’re encouraged to slam all of our DLLs in a common area (Windows\System or wherever), which causes the potential for bitrot to increase.

Your application isn’t rotting from the inside; it just seems like it.

A few years go by, the user has made compromise after compromise in their workflow–after having learned the hard way because of some weird DLL interaction on his system that he is supposed to shake the mouse like a voodoo stick before he is allowed to open his Word document–and something finally gives. The pile of crap dumped into Windows\System finally gives way, and the whole thing just dies a horrible, flaming death.

And the user, thinking that somehow it is his hardware that has failed (and not a pile of subtly incompatible software which caused his system to die), decides that, well, what the hell; perhaps it is time to replace my computer. After all, the computer is, as far as the user is concerned, behaving exactly like his car: after a few thousand miles, a filter needs to be replaced; a few thousand more, a gas shock needs fixing; a few thousand more, a gasket blows, and he is stranded by the side of the road.

We’re used to the idea that things rot, so we assume that our computers are also failing because of rot. But it isn’t rot–it’s a slow accumulation of slightly incompatible software causing things to die.

When I was at Microsoft for a few weeks I discovered that many people there simply reach out to a global server where all of Microsoft’s products live, and they just blow away the entire contents of their computers and reinstall what they need from scratch.

End-users cannot do this.

And I think that’s why I hate Windows: because this bug, this defect, this software error, which could be solved by preventing third party applications from installing DLLs in a shared directory, causes the computer to behave exactly like an old creaky car–which encourages users to replace their computers every few years rather than just reformatting the hard disks of their existing (and perfectly good) computers.

It’s a software design defect which makes Microsoft money.

When Abbreviations Attack!

I have a bachelors of science degree from the California Institute of Technology.

Now one of the abbreviations for degree and major that I got used to was to write something like “BS/?”, where “?” is replaced with the abbreviation for your degree. So for example “BS/E&AS” means Bachelors of Science, Engineering and Applied Sciences. “BS/CE” means “Bachelors of Science, Chemical Engineering.” And “BS/AP”, Bachelors of Science, Applied Physics.

I have a BS in Mathematics. And when I applied for a new job at a company just spitting distance from me, I was in a hurry, so I scribbled down “BS/MA”: Bachelors of Science, Mathematics. My resume has it, but in the long form “B.Sc. Mathematics”; on the on-line application I checked “four-year degree”–I’ve been very clear everywhere what I have–and during the interview everyone interviewed me from my (correct) resume, which they didn’t even bother to look at, since I graduated nearly 20 years ago.

I got a call last night from a field check company which wanted to verify an error on my application. They told me flatly that I did not have a Masters of Arts degree from Caltech; do I have an explanation? And so now what was a shoo-in job becomes a living nightmare as I get a quick course in security and application verification and the people who do it at Yahoo.

Damn, damn, damn, damn, damn…

Update: It appears not to have made a difference, which is as I thought it should. I’ve got my offer, which means I’ll be doing a lot more Java in the future–and will have plenty of Java idiosyncrasies to complain about in the future!

My Key Takeaways for Struts after One Day

After one day of playing with Struts these are my key takeaways:

– Struts is an MVC framework which attempts to separate the view (the HTML page), the control code (“Actions”), and the model or database code. This means it’s rather heavyweight if you’re building a three-page site that presents a calendar–but if you’re writing a complex site, then this separation of responsibility makes it much easier to track what is going on. Of course like all design models, it only helps if you keep things clear–all any design model can do is provide you tools to keep things clean. But like a vacuum cleaner, it only works if you use the tool religiously and use the tool correctly.

– Setting up Struts2 is fairly simple: once you’ve created a Tomcat project within Eclipse (in my setup, I downloaded Tomcat 5.0.28 into the Eclipse project, then used the Sysdeo Eclipse Tomcat Plugin to get Eclipse to play well with Tomcat), it’s a matter of copying the xwork, ognl, freemarker, commons-logging and struts2-core jar files from the Struts2 distribution into the WEB-INF/lib folder, and telling Eclipse about the jar files located there.

Then create the web.xml file in WEB-INF, create the struts.xml file in WEB-INF/src (where Eclipse will then automatically copy it into WEB-INF/classes where it belongs), and you should be set.

Of course if you screw something up–such as accidentally copying the struts-core rather than the struts2-core jar file–then you will get the dreaded “FilterStart Error” without any explanation. Things to look for: did you mistype the name of the filter-class in the web.xml file? Did you copy the wrong jar file like I did? If you see a lot of struts startup stuff but then it fails–did you mistype the class name or package name for your action in struts.xml?

– Struts uses these things called “Actions”, derived from com.opensymphony.xwork2.ActionSupport, which implements the “execute()” method, which essentially is what happens when your web site processes a form. Actions take you from one place to another. How actions are wired up is defined in struts.xml, and what page we move to after executing an action is defined in struts.xml. Results can route to a .jsp file which uses struts tags to simplify getting state information for presentation from an action or from the current state.

– State between actions is maintained in a Map, which can be obtained through ActionContext.getContext().getSession()–the context is thread safe (it uses thread-local storage for the context), though I suspect the session Map is not, since multiple pages may share the same session.
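The thread-safety point in that last item can be sketched in plain Java without pulling in Struts at all. Everything below is a stand-in: RequestContextSketch, context() and simulateRequests() are names I made up, not Struts API. The sketch only assumes what was described above: a per-thread context held in thread-local storage, plus one session Map that several request threads may touch at once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the idea, not the Struts ActionContext implementation.
final class RequestContextSketch {
    // One context map per thread: a request thread only ever sees its own copy.
    private static final ThreadLocal<Map<String, Object>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static Map<String, Object> context() {
        return CONTEXT.get();
    }

    // Runs the given number of concurrent "requests" against one shared session
    // and returns the hit count recorded in that session.
    static int simulateRequests(int requests) {
        // The session is shared between threads, so it needs its own thread safety.
        Map<String, Integer> session = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[requests];
        for (int i = 0; i < requests; i++) {
            threads[i] = new Thread(() -> {
                // Safe without locking: this map belongs to the current thread only.
                context().put("owner", Thread.currentThread().getName());
                // Shared state: merge() on ConcurrentHashMap updates atomically.
                session.merge("hits", 1, Integer::sum);
            });
            threads[i].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for requests", e);
        }
        return session.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(simulateRequests(8));  // 8
        // The main thread never wrote to its own context, so it is still empty.
        System.out.println(context().isEmpty());  // true
    }
}
```

If the shared session were a plain HashMap instead, the concurrent writes could lose updates, which is the kind of trouble I suspect the real session Map is open to.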

Once you know how to build pages (Dreamweaver, Struts tags), how to build actions (and ActionSupport seems like a good place to start), and store context, then you should be able to get off the ground.

Now, of course, like any API, about three to four months from now I will look back at this post and say “ohmygod, how naive was I?” But you gotta start somewhere. And this looks to me like as good a place as any to start.

Time to understand Struts

Turns out Yahoo internally uses Struts for their web development, so it appears it’s time for me to understand Struts. (Did I mention the fact that I’m changing jobs? Could be; haven’t told my current employer, though, and the offer isn’t in writing.)

The problem is my fuzzy little brain doesn’t work well with black boxes. I need to have a mental model of how things work–even if that mental model isn’t precisely what is going on under the surface. I’ve never been a “cookie-cutter” programmer; I’m not the type to want to be handed something then told “okay, follow this formula and you’ll get something that works.”

So the tutorials for Struts which begin “grab the empty war file, and start adding stuff”–uh, it doesn’t work for me. I have a mental model as to how Tomcat works and the lifespan of a servlet; I even have a mental model of how JSP pages work–though the mental model is missing stuff, like how tags in the tag library work. So the idea that I should just follow the instructions and not worry about how it works for Struts–that won’t make my life easier.

Fortunately I came across the simple setup guide, which appears to walk through the skeleton of a Struts application, from where the .jar files go to how to set up the web.xml configuration file to the basic struts.xml file. Between this and the documentation home with the architecture, tags and configuration in a nutshell, that looks like a very promising place for me to start…