The Mythical Man Month

#### Introduction

These notes are based on (paraphrasing, quoting) Fred Brooks, who wrote in the mid 70's about his experience in the 60's building IBM OS/360.

Also glanced at resources: http://www.ics.uci.edu/~redmiles/ics121-FQ99/lecture/fifteen.pdf

Prof. Redmiles summarized as:

o Published 1975, republished 1995
o Experience managing the development of OS/360 in 1964-65
o Central Argument
  o Large programming projects suffer management problems different in kind from small ones, due to division of labor.
  o Critical need is the preservation of the conceptual integrity of the product itself.
o Central Conclusion
  o Integrity achieved through exceptional designer.
  o Implementation achieved through well-managed effort.

Issues in software development; US Government Spending

o 47% Software not delivered
o 29% Delivered but never used
o 19% Abandoned
o 3% Changed and then used
o 2% Used as delivered

Brooks' project:

o was late
o was a memory hog
o cost many times more than estimate

#### Tar Pit (What is programming)

Software is like a tar pit: The more you fight it, the deeper you sink! No single thing seems that difficult; "any particular paw can be pulled away." Simultaneous and interacting factors bring productivity to a halt.

A program is just a set of instructions that seems to do what you want. All programmers say "Oh, I can easily beat the 10 lines / day cited by industrial programmers." They are talking about just coding something, not building a product.

A _product_ (more useful than a program):

o can be run, tested, repaired by anyone
o usable in many environments on many sets of data.
o must be tested
o must have documentation

Brooks estimates a 3x cost increase for this.
To be a component in a _programming system_ (collection of interacting programs like an OS):

o input and output must conform in syntax, semantics to defined interfaces
o must operate within resource budget
o must be tested with other components to check integration (very expensive since the number of interactions grows exponentially in n, the number of components).

Brooks estimates that this too costs 3x.

A combined _programming system product_ is *9x* more costly than a _program_.

### Why is programming fun?

o Joy of creation
o Fun making things people use
o fascination with complex puzzles
o curiosity, learning
o working in an ethereal medium like a poet

### On the other hand...

o you have to be very precise
o you rarely control what you are to build (specs)
o dependency on other programs, programmers
o finding bugs is a drag and very difficult
o takes forever to build anything big
o product obsolete when you finish; people are after next big thing before you finish (can make the current project wander and never finish)

#### Mythical Man Month

Why does software fail?

o Difficult to estimate time
o Optimism
o Effort != progress
o Schedule not monitored
o Slippage induces addition of people

Brooks says programmers are optimists (everything will go right etc...). *Incompleteness and inconsistencies become clear only during implementation*. He concludes that experimenting, "working out" are essential disciplines.

Each task has a nonzero probability of failure or slippage. Probability that all will go well is near zero.

Cost varies with manpower and resources, but progress does not! Hence, using "man month" (person month) as a measure is misleading and dangerous. Men and months are interchangeable only when there is no interaction whatsoever between tasks.

o For a perfectly partitionable task, draw a line downwards from top left to lower right on a graph of months (y) vs people (x).
o For a semi-partitionable task, show a convex downward curve from top left to bottom right.
o For an unpartitionable task, show a flat line on the same months vs people graph.
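The three curves can also be sketched numerically. This is a minimal sketch, not Brooks' own model: the function names are made up for illustration, and the `serial_fraction` parameter is an assumption (an Amdahl-style stand-in for the part of a semi-partitionable task that cannot be split among people).

```python
def months_perfectly_partitionable(effort_man_months, people):
    """Idealized case: men and months trade off freely."""
    return effort_man_months / people


def months_semi_partitionable(effort_man_months, people, serial_fraction=0.25):
    """Assumed model: a fixed fraction of the work cannot be split up."""
    serial = effort_man_months * serial_fraction
    parallel = effort_man_months - serial
    return serial + parallel / people


def months_unpartitionable(effort_man_months, people):
    """Adding people does not help at all; the schedule stays flat."""
    return effort_man_months


if __name__ == "__main__":
    effort = 12.0  # hypothetical task size in man-months
    print("people  perfect  semi  unpartitionable")
    for people in (1, 2, 4, 8):
        print(
            people,
            months_perfectly_partitionable(effort, people),
            months_semi_partitionable(effort, people),
            months_unpartitionable(effort, people),
        )
```

Only the first function treats the "man month" as a real unit; in the other two, extra people buy less and less schedule, or none.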
For partitionable tasks that require communication, the cost of communication must be added to the cost of completion. Communication is:

o training (technology, goals, overall strategy, plan of work)
o intercommunication; if each pair must communicate, cost grows with "n(n-1)/2". Meetings of >2 people make it even worse!

For building a _system_ (requires lots of communication), the communication effort quickly dominates the total effort.

*Adding more people lengthens, not shortens, the schedule*

Testing cost is always underestimated. Brooks suggests:

o 1/3 planning (yes this costs a lot)
o 1/6 coding
o 1/4 component test, early system test
o 1/4 system test, all components in hand

TJP: don't forget that writing test harnesses can be almost as much work as writing the actual code, sometimes more.

Delays during final testing are very demoralizing!

### Gutless estimating

Urgency of boss forces programmers to agree to unrealistic schedules.

It is very hard to defend an estimate (good or bad); people use "hunches"

#### Surgical Team

The difference between a good programmer and bad programmer is at least:

o 10x in productivity
o 5x in program speed, space measurement

The 20k/yr programmer is more than 10x more productive than the 10k/yr programmer (1960's salaries...I hope) ;)

Data showed no correlation between experience and performance (but clearly there must be some).

"Small" team shouldn't exceed 10 programmers.

Managers want a small sharp team, but you can't build a very large system this way. OS/360 took about 5000 man years to complete. A team of 200 programmers would take 25 years (assuming simple linear partitionable tasks). It took only 4 years with 1000 people (quoting these numbers from the book, but they don't seem to add up).

Instead of hiring 200 programmers what about this: hire 10 superstars with say 7x productivity factor and 7x reduction in communication costs. 5000 man years / (10 x 7 x 7) = about 10 years. Hmm...may not work even still.
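The staffing arithmetic above, and the n(n-1)/2 pair count from the communication discussion, can be checked in a few lines. This is only a sketch of the naive division: the function and parameter names are made up for illustration, and treating the 7x factors as simple multipliers is the same simplification the notes make.

```python
def pairwise_channels(n):
    """Number of distinct pairs among n people: n(n-1)/2."""
    return n * (n - 1) // 2


def years_to_complete(effort_man_years, people,
                      productivity=1.0, communication_gain=1.0):
    """Naive estimate that assumes perfectly partitionable work."""
    return effort_man_years / (people * productivity * communication_gain)


OS360_EFFORT = 5000  # man-years, the figure quoted in the notes

if __name__ == "__main__":
    print(years_to_complete(OS360_EFFORT, 200))              # 25.0
    print(years_to_complete(OS360_EFFORT, 1000))             # 5.0
    print(round(years_to_complete(OS360_EFFORT, 10, 7, 7), 1))  # 10.2
    # Pairs that might need to talk to each other at each team size
    for team in (10, 200, 1000):
        print(team, pairwise_channels(team))
```

The naive division gives 5 years for 1000 people rather than the quoted 4, and the pair count jumps from 45 for a team of 10 to 19900 for a team of 200.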
Harlan Mills suggests _surgical teams_: team leader (chief programmer==surgeon) with supporting surgeons, nurses. The chief does the cutting and the others support. Might be tough to find the right mix of people, desires, skills (i.e., who wants to do the testing?)

o *Surgeon*: chief programmer
o *Co-pilot*: able to do what surgeon does but is less experienced
o *Administrator*: handles money, people, space, machines, etc...
o *Editor*: surgeon must do doc, but editor must clean it up
o *Two secretaries*
o *Program clerk*: maintains technical records
o *Toolsmith*: serves surgeon's need for tools, utilities
o *Tester*: devises system component tests, does debugging

What's the difference? Surgical team: project is surgeon's brainchild and they are in charge of conceptual integrity etc... In a collaborative team, everyone is equal and things are "designed by committee". Causes chaos.

How to scale? Large system is broken into subsystems assigned to surgical teams. Some coordination between surgical team leaders.

#### Aristocracy, Democracy, and System Design

_Conceptual Integrity_:

o better to have one good idea than many bad or uncoordinated nonstandard ideas
o very important aspect of a large system.
o user has to know just that main concept; whole project makes sense. For example, in UNIX everything is a stream (files, devices, tty, ...)
o preserved by architect that designs system top-down.

TJP: in my experience, having a single mind behind ANTLR has made all tools, concepts hold together well. Most projects are "touched" by many grad students as they drift through a department and work on the tool for a prof.

Ratio of functionality / conceptual complexity is important.

One or a few minds design, many implement (per surgical team).

Brooks argues for separating architecture from implementation, using a clock as the example: a single design (hands, etc...) with many possible implementations.

Architecture is _what_ happens, and implementation is _how_ it happens.
Defending aristocracy he says: Even though implementers will have some good ideas, if they don't fit within the conceptual integrity, they are best left out.

### Dangers of architects

o specs will be too rich and will not reflect practical cost considerations. A real danger.
o architects will get all the fun and shut out inventiveness of implementers. Brooks argues that implementation is fun too per previous arg.
o implementers will sit idle until specs become ready. Can wait to hire the implementers.

#### Second System Effect

First system tends to be small and clean. The designer knows he/she doesn't know everything and goes slowly.

As the system is built, new features occur to them. They record these ideas for the "next system."

With the confidence of having built the previous system, the programmer builds the second system with *everything*. Tendency is to overdesign. Cites the IBM 709 architecture that is an update to 704. So big that only 50% of features are used.

Another version of the effect is to refine pieces of code or features from the old system that just aren't that useful anymore.

TJP: I tend to consider the next system to be functionally exactly the same but with a much better implementation. A few new features are ok. Actually ANTLR is less functional than old PCCTS!

To avoid the effect, put a price on features with concepts like _feature x is worth m bytes and n ns of time_.

Managers should hire chiefs that have at least two systems under their belt.

#### Communication in large project

How to communicate? In as many ways as possible.

o Informally; telephones and a clear idea of the working relationships between groups.
  o TJP: nowadays programmers use instant messaging a lot.
o Meetings; Regular project meetings, with teams giving briefings. Smoke out the little misunderstandings. Good way to check to see if you all have same ideas.
o Workbook; formal project workbook must be started.

#### Brooks' Rules

1. A program is neither a product nor a system
2. Adding programmers to fix a delay only makes it take longer
3. Plan to throw one away, you will anyway. This book is ancient, but he says _The only constancy is change itself_ and _plan the system for change_, which could come straight from the extreme programming books.
4. Second system effect: overdoing new feature list to overcome weakness in first system.