Updated Dec 3, 2007
Writing software is a big fat mess and is usually tangled up with bureaucracy and human collaboration issues. As development progresses, software typically becomes more and more fragile and buggy. As we've discussed, it's rare that a delivered system actually does what the customer wants (let alone on time).
To fix the problem, people tried to apply engineering principles to software development. That "analyze, design, implement, test" sequence is what I was taught as an undergraduate in the early '80s. The waterfall method produces software that is inflexible, late, and, most importantly, doesn't do what the user wants. Apparently it's not the solution.
The problem is that users don't know what they want until you start showing them something. Quoting Tom Burns:
"You know the least about a problem at the start of a project--that is why it doesn't make any sense to do your 'final' design at the beginning. You need to familiarize yourself with the problem space by making something useful--by writing code. Letting people use it and give feedback. Then you should upgrade your design (refactor) as you discover what is truly important."
The single biggest lesson I learned, and a point emphasized by Agile Development, the subject of this lecture, was that requirements never stay the same for even a few days. You had better get used to it, or better yet, take advantage of it lest you drown. Interestingly, there is a close relationship between what XP espouses and what I ended up following at jGuru, most of which I learned on the job or by listening to Tom Burns, our CEO.
You should read this excellent blog entry by Erik Swan, CTO and cofounder of Splunk, describing how software development is like managing the furniture in your living room.
Agile development describes a number of lightweight development processes such as SCRUM and Extreme Programming or XP that have the following common goals:
and usually has the following key elements in its mechanism:
Martin Fowler (of refactoring fame) says:
Ian McFarland's XP slides from his talk at USF.
Here are some fundamental principles to guide development.
Quoting from Principles behind the Agile Manifesto:
Balancing Agility and Discipline: A Guide for the Perplexed says:
Agile home ground:
Plan-driven home ground:
I had a conversation with a friend of mine, Mark Sambrooke, who is a product manager for a large software company here in the Bay Area. I made a number of notes as he described how agile development works in his team. Then, in the summer of 2007, I visited the company for a day to watch scrums in action and to interview developers.
A project begins by having lots of conversations with the customer. Even if there is a single company as customer, there are usually multiple parties within, such as users, the purchaser, the IT group, the manager of the project, and so on. So, later when deciding priorities, the product manager has to take all of those people into consideration. The goal of these meetings is to produce a series of user stories in no particular order. These simply reflect the kinds of things that users want to be able to do.
Once you have a list of stories, you go back to your development team. Who is on the development team? The coders, the quality assurance people, and the documenters. Your goal is now to size all of the stories (note: you are not estimating time--you are estimating size). There are two important things to keep in mind when sizing stories. First, all members of the team must be taken into consideration. Just because you can code something quickly does not mean that you can test it quickly; it might also take a very long time to document. You must take all time requirements into consideration. Second, size must be relative, not absolute. Ask a human how tall a building is and you will get huge variance, but ask a human to order buildings by height and he/she is pretty good at it. The same is true of software. You may not know how long something takes, but usually you know whether it takes more or less than another problem.
Sizing starts by picking an average story from the entire list and deciding that it is, say, a 3. The other stories are then sized relative to this one and given numbers that reflect the relationship. Sizing numbers are drawn from the Fibonacci series: 1 2 3 5 8 13 21 34 ... The reason for this is that you avoid the temptation of false precision. Remember that precision such as 14.5 does not imply accuracy. The numbers might get large, but you only ever use about 10 different discrete size values.
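The snap-to-Fibonacci idea can be sketched in a few lines of Java; the class and method names here are mine, not from any agile tool:

```java
public class StorySizer {
    // Allowed relative sizes; coarse buckets avoid false precision.
    static final int[] FIB = {1, 2, 3, 5, 8, 13, 21, 34};

    // Snap a raw relative estimate to the nearest Fibonacci bucket.
    static int snap(double rawEstimate) {
        int best = FIB[0];
        for (int f : FIB) {
            if (Math.abs(f - rawEstimate) < Math.abs(best - rawEstimate)) {
                best = f;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A story judged "about twice the average 3-point story":
        System.out.println(snap(6.0));  // nearest bucket is 5
        // 14.5 is false precision; it snaps to 13
        System.out.println(snap(14.5));
    }
}
```

The point of the coarse buckets is that arguing whether a story is a 13 or a 14 is wasted effort; arguing whether it is an 8 or a 13 is a real conversation.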
After sizing, you must prioritize the stories in terms of order of development. This can be done with the customer, but most likely a product manager decides on the priority given what he or she knows of the market and what all of the various customers have said.
From the sized stories, your team must break down the stories into tasks; e.g., create classes, update schema, make unit tests, write documentation, and so on. These tasks are all written down on little cards. This is the first time that you start talking about time. The process goes like this:
stories -> size stories in points -> create tasks measured in hours
These cards are usually less than a day's worth of work and so you can do a pretty good job of estimating the time requirements for each task. These cards are all placed on the left side of the wall. Then to the right is a middle column of the team members names. All the way to the right is an empty column where completed cards will go.
Every morning, there is a standup meeting for 15 minutes where everyone on the team talks about what they did the day before and which tasks/cards they are going to do today. They move the finished cards from yesterday to the right column next to their name. If they have not finished the task they update the remaining time on the card and leave it where it is. The cards they intend to complete today, they move next to their name in the middle column. There is considerable peer pressure to complete your tasks. Everyone knows what everyone else is doing. QA members report just like the developers...they indicate what they have finished testing and who or what they are waiting on. You can make new tasks in the scrum and some tasks may split. You can also decide to remove a goal or push to the next sprint.
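The card wall can be modeled as three columns that cards migrate across at each standup. This is a hypothetical sketch of the mechanics described above, not code from any real scrum tool:

```java
import java.util.ArrayList;
import java.util.List;

// Model of the card wall: tasks move from the backlog (left),
// to a team member's name (middle), to the done column (right).
public class TaskBoard {
    static class Card {
        String description;
        int hoursRemaining;
        String owner; // null while the card sits in the backlog
        Card(String description, int hours) {
            this.description = description;
            this.hoursRemaining = hours;
        }
    }

    List<Card> backlog = new ArrayList<>();
    List<Card> inProgress = new ArrayList<>();
    List<Card> done = new ArrayList<>();

    // At standup: claim a card you intend to complete today.
    void claim(Card c, String member) {
        backlog.remove(c);
        c.owner = member;
        inProgress.add(c);
    }

    // At standup: report yesterday's progress. Finished cards move
    // right; unfinished cards stay put with an updated estimate.
    void report(Card c, int hoursLeft) {
        c.hoursRemaining = hoursLeft;
        if (hoursLeft == 0) {
            inProgress.remove(c);
            done.add(c);
        }
    }
}
```

Because the whole board is public state, the peer pressure mentioned above falls out naturally: everyone can see which cards have sat in the middle column too long.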
During the standup meetings, there is considerable negotiation and discussion about which tasks to do next (i.e., task dependencies) and to discuss issues related to code integration and communication. The quality assurance and documentation people are in the room also. They have tasks to complete as well. Naturally there are plenty of other opportunities during the day for two people to get together and communicate directly. There is no way to do the agile method with people in different locations.
The development process is broken down into multiple development cycles. A release is generally multiple cycles. You would not want to see a new version of Microsoft Word come out every month. The cost to IT groups would be very high. (I pointed out that there are some things like websites that can easily stand very quick release cycles). Mark uses a two to four week cycle for development. At the end of each cycle everything is done including documentation and testing. You have a very solid foundation upon which to continue. You could actually do a release at the end of every development cycle. Sometimes these cycles are called sprints.
For the first cycle or two, it is hard to estimate how many tasks you will complete. Quickly, however, your team's work output will stabilize and you will get a velocity that you and your customer can use to compute trajectory towards a finished project.
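The trajectory computation is simple arithmetic once velocity stabilizes. A minimal sketch (the method names are mine):

```java
public class Velocity {
    // A team's velocity is its average points completed per sprint.
    static double velocity(int[] pointsPerSprint) {
        int total = 0;
        for (int p : pointsPerSprint) total += p;
        return (double) total / pointsPerSprint.length;
    }

    // Rough projection: how many more sprints until the backlog is done?
    static int sprintsRemaining(int backlogPoints, double velocity) {
        return (int) Math.ceil(backlogPoints / velocity);
    }

    public static void main(String[] args) {
        double v = velocity(new int[]{18, 22, 20}); // 20 points/sprint
        System.out.println(sprintsRemaining(90, v)); // 90/20 -> 5 sprints
    }
}
```

As more sprints complete, the average is taken over more data and the projection tightens, which is exactly why the estimate gets more accurate the closer you are to done.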
The yellow line in that image is the baseline slope chosen from the last sprint. The line starts at the number of tasks completed in the previous sprint and slopes down linearly to zero, crossing the x-axis at the end of the 30-day sprint.
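That baseline is just a straight line from the starting task count to zero, which can be written as a one-line function (a sketch of the formula, not code from any burndown tool):

```java
public class Burndown {
    // The baseline starts at last sprint's completed-task count and
    // falls linearly to zero at the end of the sprint:
    // baseline(day) = start * (1 - day / sprintLength)
    static double baseline(int startTasks, int day, int sprintLength) {
        return startTasks * (1.0 - (double) day / sprintLength);
    }

    public static void main(String[] args) {
        System.out.println(baseline(30, 0, 30));  // day 0: all 30 tasks
        System.out.println(baseline(30, 15, 30)); // midpoint: 15 tasks
        System.out.println(baseline(30, 30, 30)); // sprint end: 0 tasks
    }
}
```

Plotting actual remaining work against this baseline each day shows at a glance whether the sprint is ahead of or behind last sprint's pace.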
This whole thing seems like anarchy, but it turns out there are a number of managers involved, including the product manager and the project manager, to ensure that everything is on track and to manage all of the communication details with other teams.
The optimal team size seems to be about six or seven, with a max of 10. Mark has three developers, two quality assurance people, and one documenter. He indicated that a team of 20 is not working so well at his company. Basically, it's hard to keep track of what 19 other people are doing.
How then do you build a large project? You have multiple teams working on different pieces, components, or plug-ins, coordinated through the managers associated with each team. Mark's impression is that this would not work with inexperienced developers. That said, older developers seem to resist going to an agile method.
Overall time estimation: In the waterfall method, you make a wild ass guess (a "WAG") or simply say "yes" when the CEO asks you if you can have the product done by September 1. ;) With the agile method, your initial answer is "I don't know", but as time progresses you get a good sense of your team's velocity, which makes it possible for you to start projecting when you might finish. The closer you get to finishing, the more accurate your answer. With the waterfall method, you actually have no idea when it will be finished until you are totally done. This is primarily because software development progresses at an unknown pace, and even when you finish you need to do quality assurance, which could kick the project back to development for an unknown amount of time.
I summarize Extreme Programming Explained by Kent Beck in this section and pepper it with experience I gained from building jGuru.com.
The four XP Values
This does not imply that you just start "daredevil" hacking. You must be disciplined.
"Extreme" implies "what is good, do to the extreme."
Paradox: What would you do if you had lots of time to build a project? You'd build lots of tests, you'd restructure a lot, and you'd talk with the customer and other programmers a lot. The normal mentality is that you never have enough time. XP says that if you operate in this counterintuitive manner, you'll get done faster and with better software.
Designed for 2-10 programmers. (TJP: note similarity with the 10-person surgical team suggested by Frederick Brooks in "The Mythical Man Month").
This information is derived from Ian McFarland's slides from above and by talking to a number of my colleagues that use agile development commercially. This stuff really works! I also note that it is very similar to how Tom Burns and I built the jGuru.com server.
The application is broken down into a large number of so-called "user stories" that each describe one particular customer-visible task, operation, or process. For some agile development styles, developers actually write the user stories down on little cards that they can put up on the wall; these are then organized according to a point system.
Estimating time is difficult, but programmers find it fairly easy to estimate difficulty. So, estimate difficulty, not time. Each story is broken down into units of work whose difficulty can be quantized. Some people advocate 3 levels:
Some teams prefer a value from 1 to 10. Others prefer the Fibonacci sequence. The key is having a small number of quantization levels; otherwise you are right back to poor estimation. Very large stories are broken down into smaller stories that fit within the max cost.
In agile development, not only is the customer involved in picking features, the customer is given control of prioritizing the features. It is up to them to balance the cost of development with how soon they need the various features.
After a month or two of many quick releases and completion of many user stories, a team's work output stabilizes to a known value. This tells the customer exactly how much they can "spend" each week. They are free to realign priorities and spend the work any way they want. Ultimately, this breeds a great deal of trust between the customer and the developers, resulting in much better communication and a much better product. This work output is sometimes called the velocity.
Testing is an important part of the XP/agile approach to development. Here are some of my random thoughts on testing. [This should really go somewhere else as it doesn't belong and is not nearly complete]
It's easy to say build lots of unit tests. It's another thing to know what a good test is. One of the things I noticed in student projects is that people test their code with that same code! Ack!
Consider testing forum message inserts.
public void testForumInsert() {
    boolean worked = db.insertForum(...);
    assertTrue(worked);
}
The problem is that I can implement db.insertForum() as return true; and get a "green signal."
Another less egregious problem is the following:
public void testForumInsert() {
    Message in = ...;
    int ID = db.insertForum(in);
    Message out = db.getForum(ID);
    assertTrue(in.equals(out));
}
This is a poor test because I could implement insertForum to store things in RAM (not even storing in a db) and getForum would yield the proper result.
You should test your code with "exterior code" and the rawest code you can find. For example, I would test insertForum by writing SQL via the shell or a Java program that looked at the physical db tables.
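To make the "exterior code" idea concrete without requiring a live database, here is a hypothetical file-backed forum store (the class and its API are invented for illustration; jGuru used real SQL). Because the data lives in a file, a second, freshly constructed store object plays the role of the raw SQL shell: it verifies the data actually reached disk rather than trusting the object that wrote it:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical file-backed forum store. insertForum appends one line
// per message; the message's ID is its line number in the file.
public class FileForumStore {
    private final Path file;

    public FileForumStore(Path file) { this.file = file; }

    public int insertForum(String message) throws IOException {
        int id = Files.exists(file) ? Files.readAllLines(file).size() : 0;
        Files.write(file, List.of(message),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return id;
    }

    public String getForum(int id) throws IOException {
        return Files.readAllLines(file).get(id);
    }
}
```

An exterior test writes through one store instance and reads through a brand-new instance pointed at the same file; a RAM-only fake implementation of insertForum would fail that test, which is exactly the trap the in-memory round-trip test above cannot catch.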
You must have an integration box that is identical to your development box, which is identical to your deployment box, all using the same OS and Java version. Otherwise you will not know if a threading problem or other weird system "feature" will bite you. You must know that a test passing on your integration box implies it will pass on your deployment box. Conversely, if the deployment box has a problem, you need to be able to reproduce it on a non-live box.
Load testing tools and such are crucial to reproducing conditions found on the live site.
Many programmers I have worked with just didn't seem to care as much as I did about the system. My attitude follows Yoda: "there is no try, only do!" Therefore, I tested and tested and never left "an enemy at my back." When I released something at jGuru I had great confidence in it. Our first system was so bad (getting phone calls at 3 AM to reboot a frozen server sucks! I should have made the employees get up to fix it; that would have resulted in better software) that I swore the second system would be well done.
If you think something might go wrong, it will, of course. The only time the second version of jGuru crashed due to software error was right before I went on vacation just after launch, naturally. In fact, when it crashed, I knew immediately where the problem was: the one bit of code I had just thrown together and hadn't tested for boundary conditions! We got an infinite loop. The system has crashed about 10 times: 2 power failures (induced by the moron ISP), 1 software crash, 1 disk overflow (oops), and a handful of crashes due to insufficient resources (memory or file descriptors needed by the Lucene search engine). Naturally, there have been bugs in functionality ;)