© 2023
(that makes me want to use your code)
"I have made this letter longer since I did not have time to make it shorter" -- Blaise Pascal
- Documentation is hard
- Doc generation tools are the start, not end, of documentation
- Badges are “bling” (not documentation)
- An example of great documentation (the first "C" book):
- Chapter 1: a modern work of art.
Showcases succinctly,
- the key points
- what the thing is
- where to learn more?
e.g. Starship
Lightweight (Markdown):
Medium weight, highly recommended (AsciiDoc):
- see asciidoclive
Heavyweight (LaTeX):
(Aside: please take a moment to read the sad, sad story about markdown's creator).
Pdoc3: Python doc strings ⇒ markdown ⇒ web pages
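As a minimal sketch of the first step in that pipeline, the hypothetical function below carries a Markdown-flavored docstring of the kind a generator such as pdoc3 starts from. The function name and its contents are illustrative assumptions, not taken from any real project; only the standard library's `inspect.getdoc` is used, to show the raw text a doc tool would read.

```python
import inspect


def moving_average(xs, n=3):
    """Return the running mean of `xs` over a window of size `n`.

    - `xs`: list of numbers
    - `n`: window size (assumed to be at least 1)

    Example: `moving_average([1, 2, 3, 4], n=2)` gives `[1.5, 2.5, 3.5]`.
    """
    return [sum(xs[i:i + n]) / n for i in range(len(xs) - n + 1)]


# The docstring is plain text attached to the function object;
# doc generators start from this same text.
doc = inspect.getdoc(moving_average)
print(doc.splitlines()[0])
```

Note that the docstring only answers "what": it says nothing about why the function exists or which alternatives were rejected.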
Trans-document documentation tools
- Great list of pointers:
- Doc should be kept under version control
- Maybe even store your documentation in the same repository as its corresponding product source code.
- Big win with versioned documentation
- People can go back to the doco for that version of the code. Big win!
- Developers are more likely to contribute if they don't have to clone a separate repository.
Etter, Andrew. Modern Technical Writing: An Introduction to Software Documentation (p. 15). Kindle Edition.
- I have perhaps an irrational bias towards static websites. I love them. I love their speed, simplicity, portability, and security. You can host static websites practically anywhere, including Amazon S3 and GitHub Pages. They have no server-side application dependencies, no databases, and nothing to install, so migrating the entire site is as easy as moving a directory.
- Doc should answer 5 questions:
- What is this product? Why would anyone want it?
- How does this product fit into a broader ecosystem, if at all? Does it have any dependencies?
- Where can I acquire this product? If there are multiple distribution packages, which should I choose and why?
- How do I install the product? What are the basic configuration options, if any?
- What does a simple, start to finish operation look like?
- e.g. anything from a pictorial walkthrough to a functional code sample
- Bias towards including headers, tables, lists, diagrams, and images.
- These additions make your writing more approachable and simpler to scan than paragraph after paragraph of prose.
- Don't use wikis (if you even remember them): "In short, for a wiki to make sense, your documentation should be uncontroversial and never need to be versioned. You also shouldn't mind writing in an inferior editor, only working online, and maintaining a piece of enterprise software."
- Don't use MS Word
- "Microsoft Word is a wonderful choice for creating résumés and a horrible choice for creating documentation.
- "Its lone purpose in this world (one that, again, it really does perform admirably) is to create short, attractive PDFs that people can read and discard.
- "Documentation with any sort of lifespan needs to be kept in version control, which Word's DOCX file format (a compressed collection of XML files) actively opposes.
- "Documentation should live online, and Word's HTML export is totally unsuitable for creating websites."
- WRONG: Technical writers produce comprehensive documentation.
- RIGHT: Writing should be the minimum possible length. Huge blocks of writing look intimidating, and excessive content waters down useful content. Identify what the audience actually needs to know, and include only that.
- If you aren't keeping an eye on documentation metrics, you're making a huge mistake. User research is wonderful, but knowing exactly which pages are most popular, your site's bounce rate, and common behavioral flows are all invaluable.
- Pandoc is a marvelous tool for converting between markup formats. It calls itself the Swiss Army knife of markup converters and can convert to and from a huge number of formats. Unfortunately, these conversions are rarely perfect. If one day you decide that you'd rather write in AsciiDoc instead of Markdown, expect to perform some manual cleanup after running the conversion script.
Not really
None of these tools solve the documentation problem
Kinds of doc (hint: "why not"; rarer than "why"; rarer than "how"; rarer than "what"):
- What : point description
- e.g. UML
- e.g. what pdoc3 generates
- How: common use cases
- See the scikit-learn doco (exploding with examples)
- See the [last third of “PCL”](https://gigamonkeys.com/book/)
- Why: top-level motivation
- See Meyer’s Object-Oriented Software Construction, where Chapter 1 says nothing about objects
- See “Data Science from Scratch” (Joel Grus)
- Why-not: choices within the design
- Path not travelled
For a sensational case study, read the original textbook on "C".
- Chapter 1 is a work of art
Make the documents something reasonable
(e.g. documentation ⇒ verification or documentation ⇒ code generation)
- e.g. Feature model: looks like “just” a pretty picture.
- But look again: see the logic? the constraints? the why-nots?
- This model: 10 variables, 8 rules
- Linux kernel: 6,888 variables, 344,000 rules
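To make "see the logic?" concrete, here is a minimal sketch of a feature model written as boolean variables plus rules. The model is hypothetical (five variables and four rules invented for illustration, not the one pictured in the lecture), and brute-force enumeration is assumed only because the model is tiny.

```python
from itertools import product

# Hypothetical feature model: five boolean variables, four rules.
FEATURES = ["phone", "calls", "camera", "gps", "hi_res_screen"]

RULES = [
    lambda f: f["phone"],                             # the root is always selected
    lambda f: f["calls"] == f["phone"],               # "calls" is mandatory
    lambda f: not f["camera"] or f["hi_res_screen"],  # camera requires hi-res screen
    lambda f: not (f["gps"] and f["hi_res_screen"]),  # gps excludes hi-res screen
]


def valid_products(features, rules):
    """Enumerate every assignment that satisfies all rules (brute force)."""
    for values in product([False, True], repeat=len(features)):
        config = dict(zip(features, values))
        if all(rule(config) for rule in rules):
            yield config


products = list(valid_products(FEATURES, RULES))
print(len(products), "valid products out of", 2 ** len(FEATURES))
```

The rules also encode a "why not": no valid product has both a camera and GPS, because the two constraints on the screen conflict. Enumerating every assignment doubles in cost with each added variable, so models with thousands of variables need a constraint or SAT solver rather than this loop.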
- e.g. State transition diagrams
- David Harel, Statecharts: A visual formalism for complex systems. Science of Computer Programming, 8(3):231-274, June 1987. 10,100+ citations (!!!)
- Black dot = initial state
- Dashed lines: parallel work
- Words on arcs: transition guards
- Solid line: nested sub-state machines (and all transitions on super-states apply to sub-states)
- Good for scribbling on a whiteboard:
- very good for small systems (e.g. internal reasoning within one class)
- Good for reasoning about mission-critical kernels in safety-critical systems
- Can reason over the diagram to look for
- Live locks: loops which never terminate
- Dead locks: states we can never escape
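Reasoning over a state diagram can be sketched as graph search. The machine below is hypothetical, and it is flat (it ignores the nested and parallel states of full statecharts). The sketch also assumes a simplified definition: a reachable state is a trap if a chosen goal state can no longer be reached from it; a trap with no outgoing transitions is reported as a dead lock, and a trap that can still move is reported as a live lock.

```python
# Hypothetical machine: state -> {guard on the arc: next state}.
MACHINE = {
    "idle":  {"coin": "ready"},
    "ready": {"push": "idle", "jam": "stuck", "fault": "retry"},
    "stuck": {},                   # no way out
    "retry": {"tick": "retry"},    # keeps moving, never gets anywhere
}


def reachable(machine, start):
    """Return the set of states reachable from `start` (depth-first search)."""
    seen, todo = set(), [start]
    while todo:
        state = todo.pop()
        if state not in seen:
            seen.add(state)
            todo.extend(machine[state].values())
    return seen


def find_traps(machine, start, goal):
    """Split reachable states that can never reach `goal` into (dead, live)."""
    dead, live = set(), set()
    for state in reachable(machine, start):
        if goal in reachable(machine, state):
            continue
        (live if machine[state] else dead).add(state)
    return dead, live


dead, live = find_traps(MACHINE, start="idle", goal="idle")
print("dead locks:", dead, "live locks:", live)
```

Using "can we always get back to `idle`?" as the goal is a design choice for this example; a real analysis would pick whatever states count as successful completion.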
- e.g. Entity-Relationship Diagrams
- Database design
- Don’t document the code
- Document the data it runs on
- Everything is tables, whose cells can be String, Number, Date, Blob (binary object), Null, etc...
- But NOT a nested table
- If many of these things depend on many of those things
- Add a relationship table in between
- A set of sanity checks for “bad” table design
- Store every datum once, and only once
- Avoid add/delete/update anomalies
- Many tools for auto-application generation (GUI screen Generation, ORMs like Entity Framework [C#], mapping to databases).
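The "relationship table in between" advice can be sketched with Python's built-in `sqlite3` module. The tables, columns, and rows here are hypothetical, chosen only to show a many-to-many relationship between students and courses.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Relationship table: one row per (student, course) pair.
    CREATE TABLE enrolled (
        student_id INTEGER NOT NULL REFERENCES student(id),
        course_id  INTEGER NOT NULL REFERENCES course(id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student  VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO course   VALUES (10, 'Databases'), (20, 'Compilers');
    INSERT INTO enrolled VALUES (1, 10), (1, 20), (2, 10);
""")

# Each title is stored once, so renaming a course updates exactly one cell.
con.execute("UPDATE course SET title = 'Database Design' WHERE id = 10")

rows = con.execute("""
    SELECT s.name, c.title
    FROM enrolled e
    JOIN student s ON s.id = e.student_id
    JOIN course  c ON c.id = e.course_id
    ORDER BY s.name, c.title
""").fetchall()
print(rows)
```

Had the course title been copied into every enrollment row, the rename would have needed one update per enrolled student, which is the kind of update anomaly the sanity checks are meant to catch.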
Interactive GUIs generated from database table design
cross-compile to “C”
- Flows change stuff (and stuff is called Stocks).
- Stocks are real-valued variables: some entity that is accumulated over time by inflows and/or depleted by outflows.
- To code CM,
- sum the inflows and subtract the outflows around each stock;
- multiply that by the time tick dt
- add the result back to the stock
- e.g. v.C += dt*(u.q - u.r) (u,v = before,now)
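The update rule above can be sketched as a small simulation loop. The names `C`, `q`, `r`, `u`, `v`, and `dt` follow the one-line example; the starting values and the `step`/`run` helpers are assumptions made for illustration.

```python
from types import SimpleNamespace


def step(u, dt):
    """Return the state now (`v`) given the state before (`u`)."""
    v = SimpleNamespace(**vars(u))   # copy, so `u` is left untouched
    v.C += dt * (u.q - u.r)          # stock C: inflow q minus outflow r
    return v


def run(state, dt=0.5, ticks=4):
    """Advance the model `ticks` times, keeping every intermediate state."""
    history = [state]
    for _ in range(ticks):
        state = step(state, dt)
        history.append(state)
    return history


# Hypothetical numbers: a stock of 100 with inflow rate 10 and outflow rate 4.
start = SimpleNamespace(C=100.0, q=10.0, r=4.0)
history = run(start)
print([s.C for s in history])
```

Reading all rates from the "before" state and writing only to the "now" state keeps the update order from mattering once a model has several stocks that feed each other.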
“What” documentation: UML = ER + procedures
Hints for writing class diagrams
- Don't add getters/setters to class methods
- If there is a relationship classX to classY,
- Don't add variables to X,Y.
- Instead connect them with a line and label the line.
- E.g. Professor instructs student
- Also, consider writing fewer classes (this one will just blow your mind).
- e.g. Before: 10,000 LOC, 115 modules, 207 classes
- After: Downsized to 135 LOC, 3 classes (to handle a particular application)
- Brainstorm with CRC cards
- Class, responsibility, collaboration cards
- Walk through specific scenarios
- Hold the card to your chest and say “I update the X”
- And when the responsibility feels wrong, pass it to another class
History has not been kind to UML
- UML = under-defined modeling language
- Not enough semantics to support verification
- Marian Petre: "UML in practice" ICSE'13, 2013. http://oro.open.ac.uk/35805/
- UML has been described by some as "the lingua franca of software engineering". Evidence from industry does not necessarily support such endorsements. How exactly is UML being used in industry — if it is? Interviews with 50 professional software engineers in 50 companies
Category of UML use | Number |
---|---|
no UML | 35 |
selective | 11 |
auto-code gen | 3 |
retrofit | 1 |
wholehearted | 0 |
Who uses what | Number |
---|---|
Class diagrams | 7 |
Sequence diagrams | 6 |
Activity diagrams | 6 |
State machine diagrams | 3 |
Use case diagrams | 1 |
- UML useful for
- A 'thought tool'
- For communicating with stakeholders
- For collaborative dialogs
- As the starting point for adaptation (i.e., using a homegrown variant of the "real" notation)
- Two ways to read Petre’13:
- An indictment: after 20 years, UML is still mostly not used and not valued.
- More hopefully, parts of UML are used; the more we learn about which ones, where, why, and how, the better our chances of building something better.