When I started building a simple backend to clone a single Git repository to a user’s local machine, I decided to save the repository files into /data/{slug}/ instead of directly into /data/. My reasoning? It would make it easier to support multiple repositories later by using a unique slug for each one. This got me thinking: does this design align with the Open-Closed Principle (OCP), the idea that software should be open for extension but closed for modification? In this post, I’ll walk through my journey of refactoring this repository cloner in a FastAPI MVC backend: starting with a basic implementation, making it OCP-compliant, and then extending it to support Mercurial without touching the core logic.
The Initial Repository Cloner (Not Quite OCP)
Here’s how my initial setup looked in a FastAPI backend. I had an MVC structure with a RepositoryController for routing, a GitService for cloning, and a RepositoryService to manage the process. The ORM handled database persistence (e.g., storing repo metadata), but I’ll simplify that part for clarity.
```python
# git_service.py
import subprocess

class GitService:
    def clone_repo(self, url, slug):
        # Hardcoded to /data/{slug}/ and Git.
        # An argument list (no shell) keeps url and slug from being interpreted by a shell.
        subprocess.run(["git", "clone", url, f"/data/{slug}/"], check=True)

# repository_service.py
class RepositoryService:
    def add_repo(self, url, slug):
        GitService().clone_repo(url, slug)
        # ... ORM save ...
        print(f"Saved to DB: {slug}, {url}")

# repository_controller.py
from fastapi import FastAPI

app = FastAPI()

@app.post("/repository/")
def add_repository(url: str, slug: str):
    RepositoryService().add_repo(url, slug)
    return {"message": "Repo added"}
```
Running It
A POST request like this would clone a Git repository:
POST /repository/?url=https://github.com/user/repo.git&slug=repo1
The /data/{slug}/ choice was a good start: it allowed me to clone multiple repositories (e.g., repo1, repo2) without overwriting files in /data/. But was it OCP-compliant? Not quite. If I wanted to add support for Mercurial (Hg) or a different storage location (e.g., S3), I’d have to modify GitService.clone_repo. That’s a violation of OCP’s “closed for modification” rule. Let’s fix that.
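To make the violation concrete, here is a minimal sketch of what bolting Mercurial onto the original design might look like. The build_command method and the repo_type parameter are hypothetical and not part of the original code; the method only builds the command so the branching is easy to see.

```python
class GitService:
    """Sketch of the non-OCP version growing a second VCS (hypothetical)."""

    def build_command(self, url: str, slug: str, repo_type: str = "git") -> list[str]:
        target = f"/data/{slug}/"
        if repo_type == "git":
            return ["git", "clone", url, target]
        elif repo_type == "hg":
            # Every new VCS means reopening and editing this method.
            return ["hg", "clone", url, target]
        raise ValueError(f"Unsupported repo type: {repo_type}")


print(GitService().build_command("https://hg.example.com/user/repo", "repo2", "hg"))
```

Each new backend adds another branch inside a class that was already working, which is exactly the kind of edit OCP tries to avoid.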
Refactoring for OCP
To make this OCP-compliant, I introduced abstractions for cloning and path generation, leveraging dependency injection to keep the system flexible. Here’s the refactored version:
```python
# repo_cloner.py
import subprocess
from abc import ABC, abstractmethod  # abstract base classes define the extension points

class RepoCloner(ABC):
    @abstractmethod
    def clone(self, url, path):
        pass

class GitCloner(RepoCloner):
    def clone(self, url, path):
        # Clone into the path we are given; choosing the location is PathGenerator's job
        subprocess.run(["git", "clone", url, path], check=True)

# path_generator.py
class PathGenerator(ABC):
    @abstractmethod
    def get_path(self, slug):
        pass

class LocalPathGenerator(PathGenerator):
    def get_path(self, slug):
        return f"/data/{slug}/"

# git_service.py
class GitService:
    def __init__(self, cloner: RepoCloner, path_gen: PathGenerator):
        self.cloner = cloner
        self.path_gen = path_gen

    def clone_repo(self, url, slug):
        path = self.path_gen.get_path(slug)
        self.cloner.clone(url, path)

# repository_service.py
class RepositoryService:
    def __init__(self, git_service: GitService):
        self.git_service = git_service

    def add_repo(self, url, slug):
        self.git_service.clone_repo(url, slug)
        print(f"Saved to DB: {slug}, {url}")

# repository_controller.py
from fastapi import FastAPI

app = FastAPI()

def setup_services():
    # Composition root: the only place that knows the concrete classes
    cloner = GitCloner()
    path_gen = LocalPathGenerator()
    git_service = GitService(cloner, path_gen)
    return RepositoryService(git_service)

@app.post("/repository/")
def add_repository(url: str, slug: str):
    repo_service = setup_services()
    repo_service.add_repo(url, slug)
    return {"message": "Git repo added"}
```
Why This is OCP-Compliant
- Open for Extension: I can add new cloning behaviors (e.g., Mercurial) or storage locations (e.g., S3) by creating new RepoCloner or PathGenerator implementations.
- Closed for Modification: GitService and RepositoryService don’t need to change when I extend the system. The abstractions handle the variability, and only the wiring in setup_services decides which implementation is used.
This setup still clones Git repos to /data/{slug}/, but now it’s ready to grow without breaking existing code.
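One practical payoff of injecting the cloner is that GitService can be exercised without running git at all. The sketch below restates the abstractions in compact form (with the cloner’s second parameter named path, since it receives the generated path) and swaps in a RecordingCloner, a hypothetical test double that is not part of the original code.

```python
from abc import ABC, abstractmethod


class RepoCloner(ABC):
    @abstractmethod
    def clone(self, url: str, path: str) -> None: ...


class PathGenerator(ABC):
    @abstractmethod
    def get_path(self, slug: str) -> str: ...


class LocalPathGenerator(PathGenerator):
    def get_path(self, slug: str) -> str:
        return f"/data/{slug}/"


class GitService:
    def __init__(self, cloner: RepoCloner, path_gen: PathGenerator):
        self.cloner = cloner
        self.path_gen = path_gen

    def clone_repo(self, url: str, slug: str) -> None:
        self.cloner.clone(url, self.path_gen.get_path(slug))


class RecordingCloner(RepoCloner):
    """Hypothetical test double: records calls instead of running git."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def clone(self, url: str, path: str) -> None:
        self.calls.append((url, path))


fake = RecordingCloner()
GitService(fake, LocalPathGenerator()).clone_repo(
    "https://github.com/user/repo.git", "repo1"
)
print(fake.calls)
```

GitService is untouched here; only the object passed to its constructor differs, which is the same mechanism a real new backend would use.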
Adding Mercurial Support (No Changes to Core Classes)
Now, let’s extend this to support Mercurial repositories. All I need is a new MercurialCloner and a way to select it, without modifying GitService, RepositoryService, or the core controller logic. Here’s how:
```python
# repo_cloner.py (updated)
import subprocess
from abc import ABC, abstractmethod

class RepoCloner(ABC):
    @abstractmethod
    def clone(self, url, path):
        pass

class GitCloner(RepoCloner):
    def clone(self, url, path):
        subprocess.run(["git", "clone", url, path], check=True)

class MercurialCloner(RepoCloner):
    def clone(self, url, path):
        subprocess.run(["hg", "clone", url, path], check=True)

# path_generator.py (unchanged)
class PathGenerator(ABC):
    @abstractmethod
    def get_path(self, slug):
        pass

class LocalPathGenerator(PathGenerator):
    def get_path(self, slug):
        return f"/data/{slug}/"

# git_service.py (unchanged)
class GitService:
    def __init__(self, cloner: RepoCloner, path_gen: PathGenerator):
        self.cloner = cloner
        self.path_gen = path_gen

    def clone_repo(self, url, slug):
        path = self.path_gen.get_path(slug)
        self.cloner.clone(url, path)

# repository_service.py (unchanged)
class RepositoryService:
    def __init__(self, git_service: GitService):
        self.git_service = git_service

    def add_repo(self, url, slug):
        self.git_service.clone_repo(url, slug)
        print(f"Saved to DB: {slug}, {url}")

# repository_controller.py (updated for repo_type)
from fastapi import FastAPI

app = FastAPI()

def setup_services(repo_type="git"):
    path_gen = LocalPathGenerator()
    # Pick the cloner implementation for the requested repository type
    if repo_type.lower() == "git":
        cloner = GitCloner()
    elif repo_type.lower() == "hg":
        cloner = MercurialCloner()
    else:
        raise ValueError("Unsupported repo type")
    git_service = GitService(cloner, path_gen)
    return RepositoryService(git_service)

@app.post("/repository/")
def add_repository(url: str, slug: str, repo_type: str = "git"):
    repo_service = setup_services(repo_type)
    repo_service.add_repo(url, slug)
    return {"message": f"{repo_type.capitalize()} repo added"}
```
Testing the Extension
- Git Repo: POST /repository/?url=https://github.com/user/repo.git&slug=repo1&repo_type=git
  Response: {"message": "Git repo added"}
  Clones to /data/repo1/ using git clone.
- Mercurial Repo: POST /repository/?url=https://hg.example.com/user/repo&slug=repo2&repo_type=hg
  Response: {"message": "Hg repo added"}
  Clones to /data/repo2/ using hg clone.
OCP in Action: No Changes to Core Classes
Notice what didn’t change:
- GitService: Still delegates to self.cloner.clone() and self.path_gen.get_path(). It doesn’t care whether it’s Git or Mercurial.
- RepositoryService: Still calls git_service.clone_repo() without knowing the repo type.
- PathGenerator and LocalPathGenerator: Unchanged, still generating /data/{slug}/.
The only additions were:
- MercurialCloner: A new implementation of RepoCloner.
- Controller tweak: Added repo_type to the endpoint and setup_services to pick the right cloner. This is a minimal change to enable flexibility, not a modification of the core cloning logic.
This is OCP in action: I extended the system to support Mercurial by adding new code (MercurialCloner) without altering the existing classes’ behavior.
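The if/elif chain in setup_services is the one spot that still has to be edited for each new repository type. One possible way to shrink that edit, sketched here as an assumption rather than something the original code does, is a small registry that maps repo_type strings to cloner classes. The names CLONERS and make_cloner are hypothetical.

```python
import subprocess
from abc import ABC, abstractmethod


class RepoCloner(ABC):
    @abstractmethod
    def clone(self, url: str, path: str) -> None: ...


class GitCloner(RepoCloner):
    def clone(self, url: str, path: str) -> None:
        subprocess.run(["git", "clone", url, path], check=True)


class MercurialCloner(RepoCloner):
    def clone(self, url: str, path: str) -> None:
        subprocess.run(["hg", "clone", url, path], check=True)


# Hypothetical registry: supporting a new VCS means adding one entry here.
CLONERS: dict[str, type[RepoCloner]] = {
    "git": GitCloner,
    "hg": MercurialCloner,
}


def make_cloner(repo_type: str) -> RepoCloner:
    """Look up and instantiate the cloner registered for repo_type."""
    try:
        return CLONERS[repo_type.lower()]()
    except KeyError:
        raise ValueError(f"Unsupported repo type: {repo_type}") from None


print(type(make_cloner("hg")).__name__)
```

With this in place, setup_services could call make_cloner(repo_type) and would no longer need a new branch per backend.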
Why /data/{slug}/ Helped
My initial choice of /data/{slug}/ laid the groundwork for this flexibility. It ensured that each repository had its own isolated directory, making it easy to scale to multiple repos. When paired with OCP abstractions, it became part of a system that’s both extensible and stable.
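The isolation only holds if every slug really maps to a directory directly under /data/. Since the slug arrives as a query parameter, a path generator might validate it first. The following is a sketch under assumptions: the repo_path helper and the allowed-character pattern are hypothetical, not rules from the original project.

```python
import re
from pathlib import PurePosixPath

# Assumed slug format: lowercase letters, digits, "-" and "_", no slashes or dots.
SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]*")


def repo_path(slug: str, base: str = "/data") -> str:
    """Return the per-repository directory, rejecting slugs that could escape base."""
    if not SLUG_RE.fullmatch(slug):
        raise ValueError(f"Invalid slug: {slug!r}")
    return f"{PurePosixPath(base) / slug}/"


print(repo_path("repo1"))
```

A slug such as ../etc fails the pattern check and raises ValueError instead of producing a path outside /data/.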
Takeaways
- Start with Flexibility: Even small decisions like /data/{slug}/ can set you up for future growth.
- Use Abstractions: Interfaces like RepoCloner and PathGenerator keep your code open to new features.
- Dependency Injection: Wiring services dynamically (e.g., via setup_services) lets you swap implementations without refactoring.
Next, I could add S3 storage with a new S3PathGenerator or parallel cloning with a ParallelGitCloner, all without touching GitService or RepositoryService. That’s the power of OCP!