Thursday, June 30, 2011

Basic tools for analyzing Large Codebases


  • Graphs: Everything related to code is a graph at its core. Code has attributes, it relates to other bits of code in varying levels of hierarchy and entanglement, and the understanding of this requires graphs of various kinds to be built, visualized and used.
  • History/Telemetry/Timelines/Versioning: What happened to the code MATTERS, especially since the translation from written to running code is a lossy process. From another POV: to understand how the code reached its current state, you need to know its previous states. From yet another POV: to understand the running state of code, you need to maintain a history of its execution states.
  • Visualizations: Plural emphasized. Code is a many-dimensional thing. Nobody can understand it in its entirety, and certainly not all of the time. Slices of the beast are a must.
  • "Glue": Some language to annotate the code with to "fill in the gaps" from all the lossy translations involved. The simplest is human language in the form of comments.
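
To make the graphs point concrete, here is a minimal sketch (Python, stdlib ast only; the module names and sources are made up for illustration) of building one such graph, an import graph, from source text:

```python
import ast

def import_graph(modules):
    """Build a module -> imported-modules graph from source text."""
    graph = {}
    for name, source in modules.items():
        tree = ast.parse(source)
        deps = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                deps.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                deps.add(node.module)
        graph[name] = sorted(deps)
    return graph

# Hypothetical three-module codebase, inlined for the sketch.
modules = {
    "app": "import db\nimport log",
    "db": "import log",
    "log": "",
}
print(import_graph(modules))
# → {'app': ['db', 'log'], 'db': ['log'], 'log': []}
```

The same walk could just as easily collect call edges or class hierarchies; the point is that the graph is cheap to extract, and everything else (visualization, slicing) builds on it.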

Thursday, June 23, 2011

Flexible Software

Ever wonder how easy or difficult it would be to create a DIY version of anything?

America has a great DIY culture, so there's a lot of commercial support for kits that allow you to build or fix anything from reading glasses to whole kitchens. This is not how it works in other places where labor is cheaper and tradespeople build and fix things. In such places, the interfaces (for lack of a better term) that products expose assume expertise on the part of the person using them.

Simple example: the bicycle is largely a recreational vehicle in the US. As such, most of its parts are easily adjusted with knobs and other end-user-friendly means. For the most part, you'll rarely have to go into a bike shop for a repair. They even have kits to clean your chain! This would be unheard of in a place like India, mainly because there are tons of bicycle repair shops and people don't need the user-friendly interface.

But I digress. The point I'm trying to make is that designing a product so that it can be broken down by somebody other than the original creator (or somebody skilled in the product's innards) requires a different kind of thinking than just producing the product. A DIY robot kit is more difficult to make than a regular one. The key features added are:
  • the ability to be broken down into pieces with well defined interfaces
  • the ability to integrate with other pieces to form the whole product.
That's the physical world; and yet we have quite a large DIY capability in general.

Now take that same concept to the world of software. Why do we still not have the long-promised Software ICs? I think we know the answer to that one: because my concept of a component that does logging (to pick a mundane, yet excruciatingly relevant-in-terms-of-my-point example) is not the same as yours :).

But there's more than one way to skin that cat. We don't necessarily need universal agreement on what constitutes a specific component; we just need the ability to easily take components in and out of the software as often, and as appropriately, as required. Therein lies the rub, however. Our current means of doing this, in increasing order of cost, are:
  • Configuration: which is the hardwired ability to switch between known behaviors of a component
  • Code Change: which is the hard ability to change the behavior of a component from the current to the desired. This is the snapshot-based development that I've blogged about before.
  • Integration: which is the tying together of fairly coarse components with known behavior to realize a larger, compound behavior. The integration itself could be via configuration (better) or code change (not so good)
There's a reason for integration being at a large enough scale: integration is hard due to the mine != yours problem mentioned above.
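
As a sketch of the cheapest option, configuration, here's what switching between known behaviors of that mundane logging component might look like (hypothetical names; Python purely for illustration):

```python
# A component registry: known behaviors are registered once, and which
# one runs is chosen by configuration, not by editing call sites.
LOGGERS = {
    "console": lambda msg: print(f"[console] {msg}"),
    "silent": lambda msg: None,
}

def get_logger(config):
    # Configuration switches between the component's known behaviors;
    # anything outside LOGGERS still requires a code change.
    return LOGGERS[config.get("logger", "console")]

log = get_logger({"logger": "console"})
log("starting up")  # prints "[console] starting up"
```

Note the hardwired limit: the set of behaviors is fixed in advance, which is exactly why configuration is the cheapest rung and code change the next.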

Is there a solution? I think languages should develop ESB-like capabilities, and allow code change to be applied descriptively instead of as a stream of editor actions.

More on this later.

Wednesday, June 08, 2011

First time parents: an app rollout story

I've seen that teams involved with the first-time rollout of an app are like first-time parents. So pardon me while I mix more metaphors than are required to get the message across:
  • It's MY baby, so how can you possibly know what's right? I let you hold it; that doesn't mean you can code to it in any way except MINE
  • If the baby cries, it must be a critical. Doesn't matter that you're an experienced parent who's seen this before on your App, knows it affects less than 1% of customers, and therefore will freely admit that while the issue is a critical, it need not be addressed NOW
  • Related: My baby's needs come FIRST. That means before anything else. Especially yours. Because I said so. So there.
  • My baby is special, so it needs to be eased very, very gently into anything new - especially the big, bad world. Maybe we can get him out of the door an inch at a time, so he'll be ok with it. Oh, you just went ahead and threw yours in the pool and he's doing fine? Well, we'll not have any of that around here. Our production launch will have many interim milestones, thank you very much!
I can keep going, but you see what I mean? I say read Ship It and see how things work out. You might find the second time much easier. Unless, of course, you are bent on doing everything right this time around so we have that perfect child :)

The killer combination: a team made of some people who've never built anything and some who've built exactly one thing before and are dying to fix it in the successor!

Note: I've found that there are levels of noobie parenthood. People who are completely sane and rational at one level will make an exception and become noobs for that pet/key/most important project which apparently transcends normal levels.

A runtime that does static/dynamic analysis

Traditionally, code analysis has been done offline (statically) or online (debug or test in prod). In either case, the effort has been to minimize the impact of analysis overhead on the execution of code. Ironically, it's this exact execution state that we need to know and understand better to improve and maintain software.

In the database world, it's common for the runtime (i.e., the db engine) to maintain statistics of query execution and use them to better plan future executions. Why not have a code runtime that does the same?

Again, not a new concept - CPUs have had this for ages. What about App Servers, though? Imagine having a code analyzer attached to your running code that not only tells you where the hotspots are, but also points out the tangles in your code and so forth.
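
A toy sketch of the idea (Python's stdlib sys.setprofile is a real hook; the function names and workload here are made up for illustration): attach a profiler to running code and let the runtime keep call statistics about itself, much as a db engine keeps query statistics:

```python
import sys
from collections import Counter

call_counts = Counter()

def _profiler(frame, event, arg):
    # Runtime hook: the interpreter invokes this on call/return events,
    # so the statistics accumulate as a side effect of normal execution.
    if event == "call":
        call_counts[frame.f_code.co_name] += 1

def hot(n):
    # Stand-in for a hotspot in real application code.
    return n * n

sys.setprofile(_profiler)
for i in range(100):
    hot(i)
sys.setprofile(None)

# The runtime now has statistics about its own execution:
print(call_counts["hot"])  # → 100
```

A real version would add timings and caller/callee edges (giving you the tangles too), which is exactly where the performance penalty discussed below comes from.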

The advantages are obvious: no additional tools required, code analysis in running code, etc.; as are the disadvantages, the performance penalty being the main one. However, I really have to ask: we've come a long way from expecting "running on bare metal" mode. Why not this additional step? On a long enough timeline, wouldn't the benefits and advances in hardware outweigh this?

We've seen the advent of opinionated languages. I think it's time for an opinionated App Stack.

Implementation note: Steal Fantom's concept of conceptual package = deployment package.