Merge existing codebases into a monorepo

At Norday, we used to keep our front-end, back-end, and other services in separate repositories. Once a project spans more than three repositories, its code becomes hard to discover and document. Keeping all the code in one repository, a monorepo (mono repository), can improve discoverability.


In this post, we will merge multiple repositories into a single new one without losing the Git history.

We will use a fictional project that consists of two parts: a front-end and a back-end. Each lives in its own repository, named my-company/front-end and my-company/back-end.

The result will be a new repository called my-company/website with two subdirectories: front-end and back-end.

Create the monorepo

Let’s first create the new monorepo:

git init website

Skip this step if you already have a repository created via GitHub or GitLab.

Prepare the front-end repository

Create a fresh clone of the repository by running git clone git@github.com:my-company/front-end.git front-end-for-monorepo. Why not use your existing checkout? Because we are going to rewrite the Git history to prepare it for the merge into the monorepo, and the rewritten commits should never be pushed back to the original repository. By default, git filter-repo also refuses to run on anything that does not look like a fresh clone. This clone is temporary and will be removed after the merge.

Rewrite the Git history

All commits must keep their original author, date, description, and changes. Only the file paths must change. Git has a filter-branch command to rewrite branches, but its documentation says, and I quote: “its use is not recommended. Please use an alternative history filtering tool such as git filter-repo.” The Git documentation is good, so we will follow that advice.

The recommended tool is newren/git-filter-repo. Install it by following the instructions in its repository.

The following command moves all code into a subdirectory and rewrites the paths in every commit of the history.

cd front-end-for-monorepo
git-filter-repo --to-subdirectory-filter front-end

Your checkout now contains the front-end code in a subdirectory front-end. The history is preserved but with updated paths (git log --stat). It looks as if the code has always been in that subdirectory.
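Besides reading through git log --stat, the rewrite can be checked mechanically. The sketch below is one possible way to do that, not part of git-filter-repo: paths_outside_prefix is a hypothetical helper that lists every path in the history of HEAD that does not live under a given subdirectory. To stay self-contained, the demo builds a throwaway repository that mimics the rewritten state instead of calling git-filter-repo.

```shell
#!/bin/sh
set -eu

# paths_outside_prefix <repo> <prefix>   (hypothetical helper)
# Prints every path that appears in the history of HEAD but is not under <prefix>/.
# Note: <prefix> is used as a grep pattern, so keep it free of regex metacharacters.
paths_outside_prefix() {
    git -C "$1" log --format= --name-only HEAD \
        | sort -u \
        | grep -v -e '^$' -e "^$2/" || true
}

# Identity for the throwaway commits, so the demo does not depend on global config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Throwaway repository in which all code already lives under front-end/.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
mkdir front-end
echo "hello" > front-end/index.html
git add .
git commit -q -m "Add index page"
outside_before=$(paths_outside_prefix . front-end)

# A file committed at the repository root should be reported.
echo "stray" > README.md
git add .
git commit -q -m "Add README at the root"
outside_after=$(paths_outside_prefix . front-end)

echo "before: '${outside_before}'"
echo "after:  '${outside_after}'"
```

In the real workflow, running paths_outside_prefix . front-end inside front-end-for-monorepo right after the rewrite should print nothing.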

Merge the front-end code

Let’s add the front-end code to the monorepo. First add your local checkout, with the rewritten history, as a remote:

cd ../website # Change directory to the monorepo
git remote add front-end ../front-end-for-monorepo/
git fetch front-end

Now we can merge the front-end code into the monorepo. Replace main if the default branch of the repository has a different name, such as master:

git merge --allow-unrelated-histories front-end/main

The --allow-unrelated-histories flag is the trick. Normally, Git refuses to merge two branches that don’t share a common ancestor, and ours don’t. Each repository starts at its own first commit, and that commit hasn’t got a parent.
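To see the effect of the flag in isolation, here is a minimal sketch that needs nothing but Git. The repository names demo-a and demo-b are made up for this demo. It creates two repositories that share no history, attempts a plain merge, and then repeats the merge with --allow-unrelated-histories.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Identity for the throwaway commits, so the demo does not depend on global config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Create a repository with a single root commit on a branch called main.
make_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/main
    echo "$1" > "$1/$1.txt"
    git -C "$1" add .
    git -C "$1" commit -q -m "Initial commit of $1"
}

make_repo demo-a
make_repo demo-b

cd demo-a
git remote add demo-b ../demo-b
git fetch -q demo-b

# Without the flag, a recent Git is expected to refuse: no common ancestor.
if git merge --no-edit demo-b/main 2>/dev/null; then
    plain_merge="succeeded"
else
    plain_merge="refused"
fi

# With the flag, Git creates a merge commit that joins both root commits.
git merge -q --no-edit --allow-unrelated-histories demo-b/main

echo "plain merge: $plain_merge"
echo "root commits: $(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')"
```

After the second merge, the history has two root commits, one per original repository, which is exactly the shape the monorepo ends up with.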

Finally, clean up the checkout and remote:

git remote remove front-end
rm -rf ../front-end-for-monorepo/ # -f avoids prompts for read-only files inside .git

Merge the back-end code

The back-end follows exactly the same steps as the front-end, this time with the back-end repository and a back-end subdirectory. You can repeat this for as many repositories as you like.
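Since the steps are identical for every repository, they can be wrapped in a small function. This is only a sketch under a few assumptions: merge_into_monorepo is a hypothetical helper, git-filter-repo is installed and on your PATH, the function is run from inside the monorepo checkout, and the branch to merge defaults to main.

```shell
#!/bin/sh
set -eu

# merge_into_monorepo <clone-url> <subdirectory> [branch]   (hypothetical helper)
# Run from inside the monorepo checkout.
merge_into_monorepo() {
    url=$1
    subdir=$2
    branch=${3:-main}
    tmp=$(mktemp -d)

    # Fresh temporary clone, so the original checkout is never rewritten.
    git clone -q "$url" "$tmp/$subdir"

    # Move all files, in every commit, into <subdirectory>/.
    (cd "$tmp/$subdir" && git-filter-repo --to-subdirectory-filter "$subdir")

    # Fetch the rewritten history and merge it into the monorepo.
    git remote add "$subdir" "$tmp/$subdir"
    git fetch -q "$subdir"
    git merge --no-edit --allow-unrelated-histories "$subdir/$branch"

    # Clean up the temporary remote and clone.
    git remote remove "$subdir"
    rm -rf "$tmp"
}

# Usage, from inside the website checkout (needs network access and git-filter-repo):
#   merge_into_monorepo git@github.com:my-company/back-end.git back-end
```

The function only defines the steps; nothing is cloned or merged until you call it with a repository URL and a subdirectory name.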

And that’s it!

You have merged multiple repositories into one without losing the Git history.