A Comprehensive Guide to Git Submodules for Developers
Understanding Git Submodules
Git submodules have existed for over ten years, yet many developers remain unfamiliar with them. While contemporary package managers and shared libraries offer various options, git submodules are particularly beneficial when you require greater control over code modifications from both the source and user perspectives. Let's delve deeper into this concept.
At its core, a git submodule serves as a pointer to another repository. For instance, consider a primary project such as your own Software Development Kit (SDK). This SDK might rely on various sub-projects, including a compiler, a runtime for executing compiled binaries, and code samples. In this scenario, the SDK would act as your main project repository, with the compiler and runtime functioning as submodules (realistically, you would need several submodules in an actual application). If you find yourself needing to modify both the SDK's source code and its sub-projects simultaneously, git submodules could be an appropriate solution.
This guide will outline the essential features of git submodules, including their creation, fundamental functionality, and a summary of their key attributes at the conclusion.
Adding Git Submodules to a New Project
Incorporating submodules into an existing git repository is straightforward, but there are specific considerations to keep in mind. Let’s begin by creating a simple SDK project with a single file, README.md, while enabling git tracking:
$ mkdir sdk
$ cd sdk
$ touch README.md
$ git init
$ git add README.md
$ git commit -m "initial commit"
Now, suppose you have another repository, the compiler, with a remote origin on GitHub. You can add it as a submodule of your SDK by running git submodule add followed by the repository's URL.
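The command for this step is git submodule add. Here is a minimal sketch of it, assuming a throwaway local repository stands in for the compiler's GitHub remote so the example runs offline. The scratch directory, the demo identity, the compiler-origin name, and the protocol.file.allow override are all assumptions made for illustration. With a real remote you would pass its URL and drop the override.

```shell
set -e
# Assumption: a scratch directory and a throwaway identity so commits work anywhere.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the remote compiler repository.
git init -q compiler-origin
echo "compiler" > compiler-origin/README.md
git -C compiler-origin add README.md
git -C compiler-origin commit -q -m "initial commit"

# The SDK project, created the same way as in the steps above.
git init -q sdk
cd sdk
touch README.md
git add README.md
git commit -q -m "initial commit"

# Add the compiler as a submodule at ./compiler.
# protocol.file.allow=always is only needed because the source is a local path.
git -c protocol.file.allow=always submodule add "$work/compiler-origin" compiler
git commit -q -m "add compiler submodule"

# Inspect what was recorded.
cat .gitmodules
git submodule status
```

The second argument to git submodule add (compiler here) is the folder the submodule is checked out into. If it is omitted, git derives the folder name from the URL.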
This action will clone the compiler repository into your SDK folder and generate a .gitmodules file. This file contains basic information about your submodules, such as their name, path, and URL. The command stages both the .gitmodules file and the new submodule entry, so commit them to maintain a clean git state. To check the status of your submodules, simply run:
$ git submodule status
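The commit reference listed for the compiler is the exact commit the SDK has pinned for that submodule. Right after adding it, this is the latest commit on the compiler's default branch at that moment, and it stays fixed until you deliberately update it. Essentially, git submodules act as links between repositories, allowing easy access and sharing of source code.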
Working with Git Submodules
By default, any basic git commands you execute are confined to the repository you are currently in. Therefore, if you make modifications in the SDK repo (but outside the compiler repo), git will only reflect those changes in the SDK. Pulling, pushing, and committing will only affect the repository you are in.
To pull the SDK and update all of its submodules in one step, use:
$ git pull --recurse-submodules
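This command pulls changes for the SDK and then updates each submodule to the commit the SDK now records for it, fetching new submodule commits as needed. If your submodules contain their own submodules, this operation will continue until everything is up to date. (To move a submodule to the latest commit on its remote branch instead, use git submodule update --remote.) To make git pull, and other commands that support --recurse-submodules, recurse into submodules automatically, execute: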
$ git config submodule.recurse true
To run an arbitrary shell command inside every submodule, use git submodule foreach. For instance:
$ git submodule foreach npm install
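git submodule foreach also exposes a few variables to the command it runs, such as $name and $sm_path. A minimal sketch, assuming two throwaway local repositories (compiler and runtime are illustrative stand-ins, as are the scratch directory and demo identity) wired up as submodules of an sdk repository:

```shell
set -e
# Assumption: a scratch directory and a throwaway identity so commits work anywhere.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build an SDK with two local stand-in submodules (hypothetical names).
git init -q sdk
git -C sdk commit -q --allow-empty -m "initial commit"
for sub in compiler runtime; do
  git init -q "${sub}-origin"
  echo "$sub" > "${sub}-origin/README.md"
  git -C "${sub}-origin" add README.md
  git -C "${sub}-origin" commit -q -m "initial commit"
  # protocol.file.allow=always is only needed because the source is a local path.
  git -C sdk -c protocol.file.allow=always submodule --quiet add "$work/${sub}-origin" "$sub"
done
git -C sdk commit -q -m "add submodules"

cd sdk
# Single quotes keep $name and $sm_path unexpanded until foreach runs the
# command inside each submodule. --quiet hides the "Entering ..." lines.
visited=$(git submodule --quiet foreach 'echo "$name at $sm_path: $(git log -1 --format=%s)"')
echo "$visited"
```

Quoting the command in single quotes matters here: without them, the calling shell would expand the variables before git sees them.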
Tracking Submodule Commit References
Imagine you modify one of the SDK's submodules, adding and committing your changes inside it. Then return to the SDK's root directory and run:
$ git status
You may see an output like:
On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   compiler (new commits)
This message indicates that the current commit in one of your submodules differs from its previous state. You will need to add the modified submodule to git tracking and commit the updates with a message such as "updated submodule versions." A common error is pushing incorrect submodule references, so it's crucial to monitor which commit or branch each submodule should point to.
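Staging the submodule path is what records its new commit in the parent repository. A minimal sketch of that round trip, assuming a throwaway local repository stands in for the real compiler (the scratch directory, demo identity, and compiler-origin name are assumptions for illustration):

```shell
set -e
# Assumption: a scratch directory and a throwaway identity so commits work anywhere.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Setup: an SDK with one local stand-in submodule.
git init -q compiler-origin
echo "compiler" > compiler-origin/README.md
git -C compiler-origin add README.md
git -C compiler-origin commit -q -m "initial commit"
git init -q sdk
cd sdk
git commit -q --allow-empty -m "initial commit"
git -c protocol.file.allow=always submodule --quiet add "$work/compiler-origin" compiler
git commit -q -m "add compiler submodule"

# Make and commit a change inside the submodule.
echo "fix" >> compiler/README.md
git -C compiler commit -q -am "fix in compiler"

# Back in the SDK root, the submodule now shows up as modified.
git status --short

# Stage the new submodule commit reference and commit it in the SDK.
git add compiler
git commit -q -m "updated submodule versions"

# The commit recorded by the SDK now matches what is checked out in the submodule.
recorded=$(git rev-parse HEAD:compiler)
checked_out=$(git -C compiler rev-parse HEAD)
echo "SDK records:  $recorded"
echo "Checked out:  $checked_out"
```

In a real project, push the submodule's commit to its own remote before pushing the SDK. Otherwise the SDK will reference a commit that nobody else can fetch.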
Cloning a Project with Its Submodules
When you clone a git repository that includes submodules, the submodules' contents are not fetched by default. For example, if the SDK project is publicly hosted on GitHub and you clone it, you will get an empty folder for each submodule rather than the subproject's files:
$ git clone git@github.com:Some-User/sdk.git
$ cd sdk/compiler
$ ls
# nothing here
To clone the project along with all the files for its submodules, you can either use the --recurse-submodules flag while cloning or run:
$ git submodule update --init
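Both methods retrieve the submodules' files. If the submodules contain submodules of their own, add --recursive to the update command so the nested ones are fetched as well.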
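The two approaches can be compared side by side. This is a minimal sketch, assuming local throwaway repositories stand in for the GitHub remotes. The scratch directory, demo identity, repository names, and the protocol.file.allow override are assumptions that exist only to make the example run offline.

```shell
set -e
# Assumption: a scratch directory and a throwaway identity so commits work anywhere.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Setup: an SDK repository with one local stand-in submodule.
git init -q compiler-origin
echo "compiler" > compiler-origin/README.md
git -C compiler-origin add README.md
git -C compiler-origin commit -q -m "initial commit"
git init -q sdk
git -C sdk commit -q --allow-empty -m "initial commit"
git -C sdk -c protocol.file.allow=always submodule --quiet add "$work/compiler-origin" compiler
git -C sdk commit -q -m "add compiler submodule"

# Option 1: clone and populate the submodules in one step.
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/sdk" sdk-full
ls sdk-full/compiler

# Option 2: clone first; the submodule folder exists but is empty.
git clone -q "$work/sdk" sdk-plain
ls -A sdk-plain/compiler

# Then initialize and fetch the submodules, including any nested ones.
git -C sdk-plain -c protocol.file.allow=always submodule --quiet update --init --recursive
ls sdk-plain/compiler
```

Option 2 is the one to reach for when you have already cloned a project and only then discover that its submodule folders are empty.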
Git Submodules Highlights
In summary, git submodules differ significantly from typical package managers. They let you link to other git repositories while still treating them as separate projects. Although submodules have faced some criticism in the past, it's worth assessing what they provide and whether they are suitable for your next project. Their main benefits:
- Facilitates linking git repositories for easy access
- Enables executing commands across multiple repositories simultaneously
- Maintains precise repository versions
- Enhances version tracking clarity
- Allows sharing of source code between repositories
- Distributes multiple repositories as a single unit
Considerations:
- Increased complexity in version control may lead to mistakes
- Running git submodule update does not remove submodules that the project no longer references, so outdated submodule folders have to be cleaned up manually
The first video titled "Git Submodules Tutorial | For Beginners" provides an introduction to the concept of git submodules, explaining their purpose and basic functionality.
The second video, "Git submodules - Why and How to use them," discusses the reasons for utilizing git submodules and offers practical examples of their application.