Software Engineering best practices

seantalts edited this page Apr 10, 2019 · 9 revisions

This page collects resources for thinking about code as an artifact, along with general software engineering practices that are rarely taught in university courses or outside the tech industry.

Audience

The primary audience for computer code is human, full stop. Imagine you are writing very detailed instructions to an acquaintance. This perspective guides all that follows.

Naming

Of all the technical debt you can incur, the worst in my experience is bad names -- for database columns, variables, functions, etc. Fix those IMMEDIATELY before they metastasize all over your codebase and become extremely painful to fix later.. and they always do.

— Jeff Atwood (@codinghorror) April 10, 2019

Abstraction

  • Try to balance Don't Repeat Yourself (DRY) against You Ain't Gonna Need It (YAGNI).
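  • Giving names to mildly complicated subexpressions or pieces of functionality, by pulling them out into variables or functions, generally aids communication when done right. But indirection is not abstraction: a new name helps only if it lets the reader stop thinking about the details behind it.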
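
As a minimal sketch of the naming point above, the two Python functions below compute the same result. Everything here (the `Customer` type, the thresholds, the function names) is hypothetical and invented for illustration. The second version only moves subexpressions into named variables and constants, so the condition reads as a statement of intent.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical record type, invented for this example.
    years_active: int
    orders_this_year: int
    has_unpaid_invoice: bool


def discount_unnamed(c: Customer) -> float:
    # Everything inline: the reader must decode the condition to learn its purpose.
    if c.years_active >= 3 and c.orders_this_year >= 10 and not c.has_unpaid_invoice:
        return 0.15
    return 0.0


# Named constants replace the magic numbers used above (values are assumptions).
LOYALTY_YEARS = 3
FREQUENT_ORDER_COUNT = 10
LOYALTY_DISCOUNT = 0.15


def discount_named(c: Customer) -> float:
    # Same logic, with each subexpression given a name that states its intent.
    is_long_term = c.years_active >= LOYALTY_YEARS
    is_frequent_buyer = c.orders_this_year >= FREQUENT_ORDER_COUNT
    in_good_standing = not c.has_unpaid_invoice
    if is_long_term and is_frequent_buyer and in_good_standing:
        return LOYALTY_DISCOUNT
    return 0.0
```

Note that the named version adds no new layers: each name sits next to its single use, so the reader gains vocabulary without having to jump elsewhere in the code.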

Usable History

Quoting from Sumana Harihareswara's blog post on the topic:

We aren't just making code. We are working in a shared workplace, even if it's an online place rather than a physical office or laboratory, making stuff together. The work includes not just writing functions and classes, but experiments and planning and coming up with "we ought to do this" ideas. And we try to make it so that anyone coming into our shared workplace -- or anyone who's working on a different part of the project than they're already used to -- can take a look at what we've already said and done, and reuse the work that's been done already.

We aren't just making code. We're making history. And we're making a usable history, one that you can use, and one that the contributor next year can use.

So if you're contributing now, you have to learn to learn from history. We put a certain kind of work in our code repositories, both code and notes about the code. git grep idea searches a code repository's code and comments for the word "idea", git log --grep="idea" searches the commit history for times we've used the word "idea" in a commit message, and git blame codefile.py shows you who last changed every line of that codefile, and when. And we put a certain kind of work into our conversations, in our mailing lists and our bug/issue trackers. We say "I tried this and it didn't work" or "here's how someone else should implement this" or "I am currently working on this". You will, with practice, get better at finding and looking at these clues, at finding the bits of code and conversation that are relevant to your question.

And you have to learn to contribute to history. This is why we want you to ask your questions in public -- so that when we answer them, someone today or next week or next year can also learn from the answer. This is why we want you to write emails to our mailing lists where you explain what you're doing. This is why we ask you to use proper English when you write code comments, and why we have rules for the formatting and phrasing of commit messages, so it's easier for someone in the future to grep and skim and understand. This is why a good question or a good answer has enough context that other people, a year from now, can see whether it's relevant to them.

How to write good commit messages

There are many resources out there, but the most specific and comprehensive is the PyInstaller project documentation. This article is much shorter and has just 5 tips, the most important of which is to answer the following questions:

  1. Why is this change necessary?
  2. How does it address the issue?
  3. What side effects does this change have?
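
For illustration only, a commit message that answers the three questions above might look like the sketch below. The project, the bug, and the wording are all hypothetical, and the "Why/How/Side effects" labels are just one possible layout, not a format that any of the linked guides requires.

```text
Fix dropped last page in search result pagination

Why: the last page of results was dropped whenever the total
count was not an exact multiple of the page size.

How: compute the page count with ceiling division instead of
truncating integer division.

Side effects: some searches now show one more page than before.
No API changes.
```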

Zulip's page on Commit Discipline is also excellent. It recommends splitting commits into atomic chunks that pass tests and represent a single idea, though this can be difficult and is not always worth the effort.

Guidelines from other projects

Many of the guidelines in the Philosophy section of the isocpp C++ Core Guidelines apply broadly to software engineering projects, specifically:

  • Express ideas directly in code
  • Express intent
  • Ideally, a program should be statically type safe
  • Prefer compile-time checking to run-time checking
  • What cannot be checked at compile time should be checkable at run time
  • Catch run-time errors early
  • Prefer immutable data to mutable data
  • Encapsulate messy constructs, rather than spreading through the code
  • Use supporting tools as appropriate
  • Use support libraries as appropriate
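
Although these guidelines come from C++, several carry over to other languages. The Python sketch below is one possible illustration of "Express intent", "Prefer immutable data to mutable data", and "Catch run-time errors early". The `Temperature` type and `to_celsius` function are hypothetical names invented for this example. Python does not enforce type hints at run time, so the compile-time items in the list only partially apply unless an external static type checker is used.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)  # frozen=True makes instances immutable after creation
class Temperature:
    """Hypothetical value type, invented for this example."""

    kelvin: float

    def __post_init__(self) -> None:
        # Catch the error where the bad value is created,
        # not wherever it happens to be used later.
        if self.kelvin < 0:
            raise ValueError(f"kelvin must be non-negative, got {self.kelvin}")


def to_celsius(t: Temperature) -> float:
    # The parameter type expresses intent: a bare float would not say which unit it holds.
    return t.kelvin - 273.15
```

Constructing `Temperature(-1.0)` raises `ValueError` immediately, and assigning to `kelvin` on an existing instance raises `FrozenInstanceError`, so invalid or modified values cannot travel silently through the rest of the program.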