Monorepos & Scale
Enterprise Git Patterns
Why it matters
As projects grow, you’ll face decisions about repository structure. Understanding monorepos, submodules, and LFS helps you scale effectively.
Key concepts
- Monorepo — Single repository containing multiple projects
- Submodules — Include other Git repos at specific commits
- Subtrees — Merge another repo’s history into your repo
- Git LFS — Store large files outside main Git history
The idea
The Repository Philosophy
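Polyrepo (Many Repositories)
One repo per project or service: frontend, backend, and mobile each live in their own repo. Simple to understand, with clear boundaries and independent versioning.
Monorepo (Single Repository)
Everything in one repo. Google and Meta (Facebook) are well-known examples, though both run on custom version-control systems rather than stock Git; Microsoft hosts Windows in a single very large Git repository. Benefits: atomic changes across projects, shared tooling, and a single source of truth.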
Trade-offs
| Polyrepo | Monorepo |
|---|---|
| Clear boundaries | Atomic cross-project changes |
| Independent releases | Shared code/tooling |
| Simpler permissions | Single source of truth |
| Smaller clone size | Better discoverability |
| Downside: harder code sharing | Downside: needs custom tooling at scale |
Walkthrough
Monorepo Structure
monorepo/
├── apps/
│ ├── web/
│ ├── mobile/
│ └── api/
├── packages/
│ ├── shared-utils/
│ ├── design-system/
│ └── types/
├── tools/
│ └── scripts/
└── package.json
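As a rough sketch, the layout above could be scaffolded like this. The workspaces field is an assumption (npm and Yarn workspaces); the tree itself only shows a root package.json, and the temporary directory is purely for illustration.

```shell
# Scaffold the example layout in a throwaway directory (illustrative only).
root=$(mktemp -d)/monorepo
mkdir -p "$root"/apps/web "$root"/apps/mobile "$root"/apps/api \
         "$root"/packages/shared-utils "$root"/packages/design-system \
         "$root"/packages/types "$root"/tools/scripts

# Assumption: npm/Yarn workspaces tie apps/ and packages/ together.
cat > "$root/package.json" <<'EOF'
{
  "name": "monorepo",
  "private": true,
  "workspaces": ["apps/*", "packages/*"]
}
EOF

# List the project directories that were created.
find "$root" -mindepth 2 -maxdepth 2 -type d | sort
```

With a workspaces setup like this, packages under packages/ can typically be imported by the apps without publishing them first.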
Git Submodules
Include another repo inside your repo at a specific commit:
# Add submodule
git submodule add https://github.com/lib/library.git libs/library
# Clone repo with submodules
git clone --recurse-submodules https://github.com/user/repo.git
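# After pulling, sync submodules to the commits the parent repo records
git submodule update --init --recursive
# Move a submodule to the latest commit on its main branch
cd libs/library
git checkout main   # submodules start in detached HEAD; switch to a branch first
git pull origin main
cd ../..
# Record the new submodule commit in the parent repo
git add libs/library
git commit -m "Update library submodule"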
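To make "at a specific commit" concrete, here is a small sketch that uses two throwaway local repos (hypothetical stand-ins for the library and your project) and shows what the parent repo actually stores for a submodule. The demo identity and the protocol.file.allow setting exist only so the example runs offline.

```shell
# Throwaway identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# A stand-in for the external library (hypothetical local repo).
git init -q "$work/library"
echo "v1" > "$work/library/lib.txt"
git -C "$work/library" add lib.txt
git -C "$work/library" commit -q -m "Library v1"

# The parent repo adds it as a submodule.
git init -q "$work/parent"
# protocol.file.allow is only needed because this demo clones from a local path.
git -C "$work/parent" -c protocol.file.allow=always \
    submodule --quiet add "$work/library" libs/library
git -C "$work/parent" commit -q -m "Add library submodule"

# The parent's tree holds a gitlink entry (mode 160000) naming one commit.
entry=$(git -C "$work/parent" ls-tree HEAD libs/library)
pinned=$(git -C "$work/library" rev-parse HEAD)
echo "$entry"
echo "library HEAD: $pinned"
```

The parent records only that commit hash, not the library's files, which is why updating a submodule always ends with a commit in the parent repo.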
Git Subtrees
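An alternative to submodules: the other repo's files are copied into a subdirectory of your repo and its history is merged into yours (or collapsed into a single commit with --squash):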
# Add subtree
git subtree add --prefix=libs/library https://github.com/lib/library.git main --squash
# Pull updates
git subtree pull --prefix=libs/library https://github.com/lib/library.git main --squash
# Push changes back
git subtree push --prefix=libs/library https://github.com/lib/library.git main
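For contrast with submodules, the following sketch runs git subtree add against a throwaway local repo (a hypothetical stand-in for the library) and lists what ends up tracked in the parent. It assumes the git subtree command is available in your Git installation.

```shell
# Throwaway identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# A stand-in for the external library, with a branch named main.
git init -q "$work/library"
git -C "$work/library" symbolic-ref HEAD refs/heads/main
echo "v1" > "$work/library/lib.txt"
git -C "$work/library" add lib.txt
git -C "$work/library" commit -q -m "Library v1"

# The parent needs at least one commit before a subtree can be added.
git init -q "$work/parent"
echo "parent" > "$work/parent/README"
git -C "$work/parent" add README
git -C "$work/parent" commit -q -m "Initial commit"

# Copy the library into libs/library as ordinary tracked files.
cd "$work/parent"
git subtree add --prefix=libs/library "$work/library" main --squash

files=$(git ls-tree -r --name-only HEAD)
echo "$files"
```

The library's files are now regular entries in the parent's tree, so a plain git clone is enough and there is no .gitmodules file to keep in sync.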
Git LFS (Large File Storage)
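Store large files (images, videos, binaries) outside the regular Git object store; the repo itself keeps only small pointer files:
# Set up LFS hooks (once per user; requires the git-lfs package to be installed)
git lfs install
# Track file types
git lfs track "*.psd"
git lfs track "*.mp4"
git lfs track "assets/**"
# Tracked patterns are stored in .gitattributes; commit it so teammates get them
cat .gitattributes
git add .gitattributes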
# See LFS files
git lfs ls-files
How it works:
- LFS stores a pointer in Git (small)
- Actual file lives on LFS server
- Downloaded at checkout, and only for the file versions you actually check out
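To show what the pointer looks like, this sketch builds one by hand in the format defined by the Git LFS pointer spec. It is an illustration only: in real use git-lfs writes the pointer for you during git add, and the file name asset.bin is a made-up stand-in for a large binary.

```shell
# Build, by hand, the pointer text that Git LFS would store in place of a file.
dir=$(mktemp -d)
printf 'hello world\n' > "$dir/asset.bin"   # stand-in for a large binary

# sha256sum on Linux, shasum on macOS.
if command -v sha256sum >/dev/null 2>&1; then
  oid=$(sha256sum "$dir/asset.bin" | cut -d' ' -f1)
else
  oid=$(shasum -a 256 "$dir/asset.bin" | cut -d' ' -f1)
fi
size=$(wc -c < "$dir/asset.bin" | tr -d ' ')

# Three lines: spec version, content hash, size in bytes.
pointer=$(printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size")
echo "$pointer"
```

However large the real file is, Git itself only ever stores these three short lines; the hash is what the LFS server uses to locate the actual content.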
Key takeaways
- Monorepo for shared code, polyrepo for independence
- Submodules link repos at specific commits
- LFS keeps large files from bloating the repo
- All approaches have trade-offs—choose based on needs
Dos & don’ts
✅ DO
- Use LFS for large binaries: Images, videos, datasets
- Keep submodules updated: Stale submodules cause confusion
- Document your repo structure: README explaining layout
- Use monorepo tooling: Nx, Turborepo, Bazel for large monorepos
❌ DON’T
- Don’t commit large binaries directly: Use LFS
- Don’t ignore submodule changes: They’re easy to miss
- Don’t assume one approach fits all: Evaluate trade-offs
Going deeper
Sparse Checkout:
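For huge monorepos, check out only the directories you need. Sparse checkout limits what appears in your working tree; pairing it with a partial clone (--filter=blob:none) also avoids downloading file contents you never touch:
git clone --filter=blob:none --sparse https://github.com/user/repo.git && cd repo
git sparse-checkout set apps/web packages/shared-utils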
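To see the effect without a large remote, this sketch builds a tiny throwaway monorepo (hypothetical file names), clones it sparsely, and checks which directories reach the working tree. It assumes Git 2.25 or newer, where git sparse-checkout and git clone --sparse are available.

```shell
# Throwaway identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# A miniature monorepo with three projects.
git init -q "$work/monorepo"
mkdir -p "$work/monorepo/apps/web" "$work/monorepo/apps/api" \
         "$work/monorepo/packages/shared-utils"
echo web   > "$work/monorepo/apps/web/index.js"
echo api   > "$work/monorepo/apps/api/server.js"
echo utils > "$work/monorepo/packages/shared-utils/index.js"
git -C "$work/monorepo" add .
git -C "$work/monorepo" commit -q -m "Initial layout"

# Clone with a sparse working tree, then pick the directories to materialize.
git clone -q --sparse "$work/monorepo" "$work/clone"
git -C "$work/clone" sparse-checkout set apps/web packages/shared-utils

# apps/api is still in history but absent from the working tree.
ls "$work/clone/apps"
```

Running git sparse-checkout disable in the clone restores the full working tree.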
Monorepo Tooling:
- Nx: Smart builds, caching, dependency graph
- Turborepo: Incremental builds for JS/TS
- Bazel: Google’s build system, extreme scale
- Lerna: JS monorepo management (older)
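The "only build what changed" idea behind these tools can be roughly approximated with plain Git. This is a sketch, not how any of them is implemented: real tools also follow the dependency graph, and the two-level apps/<name> and packages/<name> layout plus the file names are assumptions.

```shell
# Throwaway identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git -C "$repo" init -q
mkdir -p "$repo/apps/web" "$repo/apps/api" "$repo/packages/types"
echo v1 > "$repo/apps/web/index.js"
echo v1 > "$repo/apps/api/server.js"
echo v1 > "$repo/packages/types/index.d.ts"
git -C "$repo" add .
git -C "$repo" commit -q -m "Initial layout"

# Second commit touches two of the three projects.
echo v2 > "$repo/apps/web/index.js"
echo v2 > "$repo/packages/types/index.d.ts"
git -C "$repo" commit -q -am "Change web and types"

# Reduce changed file paths to their top two path components (the project).
changed=$(git -C "$repo" diff --name-only HEAD~1 HEAD | cut -d/ -f1-2 | sort -u)
echo "$changed"
# prints:
# apps/web
# packages/types
```

A CI job could loop over that list and build or test only those projects, which is the gap dedicated tools fill more reliably with caching and dependency tracking.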
Common mistakes
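Submodule commit not pushed:
If you commit in a submodule and push only the parent repo, the parent points at a commit that does not exist on the submodule's remote, so git submodule update fails for everyone else. Push the submodule first, then the parent; git push --recurse-submodules=check refuses to push the parent while submodule commits are still unpushed.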
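Binary files in history:
Once committed, a binary file stays in history even after it is deleted, so every full clone still downloads it.
Use LFS from the start. To clean up an existing repo you must rewrite history, for example with git lfs migrate import or git filter-repo (the Git documentation now recommends against git filter-branch). Rewriting changes commit hashes, so coordinate with everyone who has a clone.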
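Before rewriting anything, it helps to find out which blobs are actually taking up space. The following sketch uses a throwaway repo with a made-up 1 MiB file named big.bin to show that a deleted binary is still reachable in history, and lists the largest blobs.

```shell
# Throwaway identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git -C "$repo" init -q

# Commit a 1 MiB binary, then delete it in a later commit.
head -c 1048576 /dev/zero > "$repo/big.bin"
git -C "$repo" add big.bin
git -C "$repo" commit -q -m "Add binary"
git -C "$repo" rm -q big.bin
git -C "$repo" commit -q -m "Remove binary"

# List every object reachable from any ref, keep blobs, sort by size.
largest=$(git -C "$repo" rev-list --objects --all |
  git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk '$1 == "blob"' | sort -k2 -n -r | head -n 5)
echo "$largest"
# prints: blob 1048576 big.bin
```

The file is gone from the working tree, yet the blob is still listed, which is why deleting a binary in a new commit does not shrink the repository.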