How I Scaffold Go Projects
YMMV.
Published: 2023-11-12
Updated: 2023-11-14
tl;dr
If you don’t know the size of your problem or the logical components, I recommend starting with a single file and breaking it up as things get uncomfortable.
A common question from newer developers is, “How should I organize my Golang project?”
It’s an understandable question since many frameworks will scaffold a project for you upon initialization.
On the other hand, go mod init only creates the go.mod file (go.sum appears later, once dependencies are added). From there, you can do whatever you’d like. My approach to project structure is based mainly on two factors: the size of the project and whether it’s a library or an application.
Let’s dive in.
Application Development
App dev is the more intriguing category, so let’s start there. This category covers anything with a main package. Use cases include web servers, clients, CLI tools, workflow automation, etc.
Approach by Size
Under 1k LoC -> 1 file
If I think the solution I’m building will end up being ~1000 lines or less, it will all go in the main.go file. At that size, I find it easy to read the code, move around in it, and find what I want.
1k - 4k LoC -> 1 package
Over ~1k LoC, I find a single file unwieldy; it usually contains enough separate concepts that I don’t want to see or think about them all at once. At this point, I typically break out some functionality into a few well-named files, still within the same folder as the main.go file. Maybe I have a struct with a few methods that perform some specific functionality. I don’t always want to see the implementation details, so the implementation gets its own file, and in main.go, I only think about the well-named method I’m referencing, i.e., what it does and not how it does it.
Consider a service that communicates with a database. If I have thousands of lines of code, maybe I don’t care to see the implementation details of how the database connection is established or managed or how “insert” and “lookups” are performed.
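As a rough sketch of that split, the block below shows what could live in a second file next to main.go. Everything here is hypothetical: the store type, its method names, and the store.go file name are made up for illustration, and a map stands in for a real database so the example stays runnable.

```go
package main

import (
	"errors"
	"fmt"
)

// --- store.go (hypothetical file, same package as main.go) ---

// errNotFound is returned by lookup when the key has no value.
var errNotFound = errors.New("not found")

// store hides how records are kept. A map stands in for a real
// database connection (an assumption made to keep this runnable).
type store struct {
	rows map[string]string
}

// newStore sets up the stand-in connection.
func newStore() *store {
	return &store{rows: make(map[string]string)}
}

// insert saves value under key, replacing any earlier value.
func (s *store) insert(key, value string) {
	s.rows[key] = value
}

// lookup returns the value saved under key, or errNotFound.
func (s *store) lookup(key string) (string, error) {
	v, ok := s.rows[key]
	if !ok {
		return "", errNotFound
	}
	return v, nil
}

// --- main.go ---

func main() {
	db := newStore()
	db.insert("user:1", "gopher")

	name, err := db.lookup("user:1")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(name)
}
```

Reading main, I only need to know that insert and lookup exist; how rows are stored stays out of sight in the other file.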
Over ~5k LoC -> Hub & Spoke model
At this size, the project is getting to the point where I can’t hold everything in my head
simultaneously, and my mind starts to naturally break things up into separate concerns. This is
when I start introducing individual packages to handle isolated components of the problem I’m
trying to solve. I’ve found the “Hub & Spoke” model effective for that. In this model, a
central “hub” imports from the “spokes” as needed to fulfill its objective. The exception is the entry point (main), which imports the hub. Usage notes about this model:
A spoke never takes a dependency on another spoke. This simple model helps sidestep any potential issues with cyclic dependencies, which aren’t allowed in Go.
The abstraction usually needs to be corrected if two spokes interact a lot. Perhaps those spokes should be merged or the interactions extracted.
If one spoke needs another, I enable this via interfaces rather than imports. Consider the diagram above, where there is a state manager. Imagine the core business logic component needs to be aware of some state. It can accept some interface for the methods it cares about. Then, after the server initializes the state manager, it can provide it to initialize the core business logic unit, satisfying the defined interface. Utilizing interfaces like this makes it much simpler to test individual system components. It also helps on a human organization layer where dozens of developers can be working on building the system and rely on these interfaces, which are essentially API contracts with compile-time enforcement.
There is usually some functionality required by multiple components, but core to none. For example, helper functions like parsers or formatters for structured logging. This group typically consists of pure, stateless functions. For this, I create a separate sub-package that does not import any other package in the project and can be imported by any package that needs it. My general rule is that I must have copied a function at least twice before I consider moving it to a shared package. Dave Cheney advises against package names such as “shared”/“utils”/“common”, etc. I usually name the helper packages in the style of shared/logger to aid my mental model, but having a logger sub-package in the top-level directory also works well.
The entry point is usually a simple main file in the top-level folder. If the entry point is complex (as can be the case with CLI tools with many options), then there may be a cmd folder containing multiple files in a main package.
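To make the interface note concrete, here is a minimal single-file sketch. In a real project, each section would be its own package; the names (StateManager, Counter, Core, fixedCounter) are invented for illustration and are not from any particular codebase.

```go
package main

import "fmt"

// --- spoke: state (would be its own package) ---

// StateManager owns some in-memory state.
type StateManager struct {
	counts map[string]int
}

func NewStateManager() *StateManager {
	return &StateManager{counts: make(map[string]int)}
}

// Increment bumps the counter for key and returns the new value.
func (m *StateManager) Increment(key string) int {
	m.counts[key]++
	return m.counts[key]
}

// --- spoke: core (would be its own package, importing no other spoke) ---

// Counter is the only behavior core needs from whoever manages state.
type Counter interface {
	Increment(key string) int
}

// Core holds the business logic and depends on the interface only.
type Core struct {
	counter Counter
}

func NewCore(c Counter) *Core { return &Core{counter: c} }

// Visit records a visit and reports how many there have been.
func (c *Core) Visit(user string) string {
	n := c.counter.Increment(user)
	return fmt.Sprintf("%s has visited %d time(s)", user, n)
}

// fixedCounter is a test double: it always reports the same count.
type fixedCounter int

func (f fixedCounter) Increment(string) int { return int(f) }

// --- hub: the one place that knows about both spokes ---

func main() {
	sm := NewStateManager()
	core := NewCore(sm) // *StateManager satisfies Counter implicitly
	fmt.Println(core.Visit("ana"))
	fmt.Println(core.Visit("ana"))

	// In a test, a fake can stand in for the real state manager.
	fmt.Println(NewCore(fixedCounter(7)).Visit("bo"))
}
```

The core section never names StateManager, so if the two lived in separate packages, neither would import the other; only the hub wires them together.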
If you don’t know the size of your problem or the logical components, I recommend starting with a single file and breaking it up as things get uncomfortable. I find that this method causes the right abstractions to reveal themselves. I’ve provided some numbers here which work for me. The ranges for you will be different, but I hope this framework can serve as a guide.
Library Development
Much of the guidance given earlier for app development is also applicable here. Some additional advice:
Start with everything unexported when creating your library. When you have a working solution, you can start thinking about the ideal API surface for consumers. Remember: you can always increase the API surface. It’s much harder to reduce it. So, start at zero.
Restrict the API surface to a single module import unless your library is massive.
If you need to partition a library into multiple external modules, keep them at the same import path depth. It’s okay if the user needs to import github.com/my/foo/bar and github.com/my/foo/baz. But I don’t want to also have to import github.com/my/foo/bar/sub. It can make usage confusing and frustrating. The azure-sdk-for-go is one of the worst offenders, in my opinion. I understand why a plethora of modules are needed, but I think the organization could have been better. It’s not uncommon to import github.com/Azure/azure-sdk-for-go/sdk/foo and then also need github.com/Azure/azure-sdk-for-go/sdk/foo/internal. Never mind that making an “internal” module part of the external API surface is baffling.
Some rules of thumb:
Never panic.
Have sensible defaults. Initialization should be straightforward. Example: if you don’t want a user to initialize an exported struct on their own, consider making all the fields on the struct unexported and providing a New function that performs the initialization.
Don’t log by default. Return an error or a result so the consumer can decide if and how to log. Or have a debug/logger setting that is configurable by the consumer.
Don’t spin up goroutines just because the work is parallelizable and the runtime package tells you the user’s machine has multiple cores. That’s not your call. At best, make it a configurable setting.
If a method is/isn’t concurrent-safe and goes against common intuition, highlight that in the documentation. Likewise, if special handling is required (e.g., should not be copied after first use), highlight that.
Test coverage matters a lot more for libraries. Consumers want to trust that the thing does what is intended and that behavior is locked in. But they aren’t going to read the code to ensure that. So, for better or worse, they’ll use test coverage as a proxy.
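Several of these rules can be sketched together. The block below is one possible shape, not a prescribed one: the Client type and the WithWorkers and WithLogger options are hypothetical names, and the functional-options style is my choice for illustration; a plain config struct would work just as well.

```go
package main

import (
	"errors"
	"fmt"
)

// Client is exported, but its fields are not, so callers go through New.
type Client struct {
	endpoint string
	workers  int                  // concurrency is the caller's decision
	logf     func(string, ...any) // nil means stay silent
}

// Option adjusts a Client during construction.
type Option func(*Client)

// WithWorkers lets the consumer opt in to more parallelism.
func WithWorkers(n int) Option {
	return func(c *Client) { c.workers = n }
}

// WithLogger lets the consumer opt in to debug output.
func WithLogger(logf func(string, ...any)) Option {
	return func(c *Client) { c.logf = logf }
}

// New applies defaults first, then options. Bad input comes back as
// an error rather than a panic.
func New(endpoint string, opts ...Option) (*Client, error) {
	if endpoint == "" {
		return nil, errors.New("endpoint must not be empty")
	}
	c := &Client{endpoint: endpoint, workers: 1} // default: no extra goroutines
	for _, opt := range opts {
		opt(c)
	}
	if c.workers < 1 {
		return nil, fmt.Errorf("workers must be at least 1, got %d", c.workers)
	}
	c.debugf("client ready: endpoint=%s workers=%d", c.endpoint, c.workers)
	return c, nil
}

// debugf logs only if the consumer supplied a logger.
func (c *Client) debugf(format string, args ...any) {
	if c.logf != nil {
		c.logf(format, args...)
	}
}

func main() {
	c, err := New("https://example.invalid", WithWorkers(4))
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("workers:", c.workers)
}
```

With no options, the client is single-worker and silent; anything louder or more parallel is something the consumer asked for.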
Hope this helps. Of course, YMMV.