The repository pattern is a design pattern that helps you separate business logic from data access code.
It does so by providing a unified interface for interacting with different data sources, bringing the following advantages to your system:
- Flexibility: You can swap out your data source without changing business logic.
- Maintainability: It’s easier to manage and test different storage backends.
- Decoupling: You loosen the coupling between your application and its data access layer.
A practical example
Let’s use Python and sqlmodel (PyPI) to demonstrate this pattern (code here):
from abc import ABC, abstractmethod

from sqlmodel import SQLModel, create_engine, Session, Field, select


# Define the model
class Item(SQLModel, table=True):
    # The id stays None until the database assigns the primary key on insert
    id: int | None = Field(default=None, primary_key=True)
    name: str


# Repository Interface
class IRepository(ABC):
    @abstractmethod
    def add(self, item: Item):
        pass

    @abstractmethod
    def get(self, name: str) -> Item | None:
        pass


# SQLModel implementation
class SQLModelRepository(IRepository):
    def __init__(self, db_string="sqlite:///todo.db"):
        self.engine = create_engine(db_string)
        SQLModel.metadata.create_all(self.engine)
        self.session = Session(self.engine)

    def add(self, item: Item):
        self.session.add(item)
        self.session.commit()

    def get(self, name: str) -> Item | None:
        statement = select(Item).where(Item.name == name)
        return self.session.exec(statement).first()


# CSV implementation
class CsvRepository(IRepository):
    def __init__(self, file_path="todo.csv"):
        self._file_path = file_path

    def add(self, item: Item):
        with open(self._file_path, "a") as f:
            f.write(f"{item.id},{item.name}\n")

    def get(self, name: str) -> Item | None:
        # Return the first row whose name matches, or None if there is no match
        with open(self._file_path, "r") as f:
            return next(
                (
                    Item(id=int(id_str), name=item_name)
                    for line in f
                    if (id_str := line.strip().split(",", 1)[0])
                    and (item_name := line.strip().split(",", 1)[1]) == name
                ),
                None,
            )


if __name__ == "__main__":
    repo = SQLModelRepository()
    repo.add(Item(name="Buy Milk"))
    sql_item = repo.get("Buy Milk")

    # Swap out the repository implementation
    csv_repo = CsvRepository()
    csv_repo.add(Item(id=1, name="Buy Milk"))
    csv_item = csv_repo.get("Buy Milk")

    print(f"{sql_item=}, {csv_item=}, {sql_item == csv_item=}")
    # outputs:
    # sql_item=Item(name='Buy Milk', id=1), csv_item=Item(id=1, name='Buy Milk'), sql_item == csv_item=True
- First we define the Item model using SQLModel. You can also use SQLAlchemy or any other ORM, but I find SQLModel a bit easier to use and I like its integration with Pydantic.
- Next we define the IRepository interface with the add and get methods, making them required for any class that inherits from it. This is done using Python’s Abstract Base Classes (ABCs) and applying the @abstractmethod decorator. This is a way to “enforce a contract” and a crucial part of the repository pattern (see also the tests). If you don’t implement these methods in a subclass, Python will raise a TypeError when you try to instantiate it. This ensures that all repository classes have the same interface, even if they use different storage. To learn more about ABCs, check out our article.
- Then we implement the SQLModelRepository and CsvRepository subclasses, which inherit from IRepository. These classes implement the add and get methods, which are required by the IRepository interface. This is where we define the data access logic for each storage backend. The SQLModelRepository uses SQLModel to interact with a SQLite database, while the CsvRepository interacts with a CSV file. Same interface, different storage backends.
- Finally, we demonstrate how to use the repository pattern by adding an item to both the SQL and CSV repositories and then retrieving it.
Note: In this implementation, we did not add error handling to keep the example relatively small. You should ensure that if something goes wrong (e.g., database connection issues or file access errors), the program can handle the error gracefully and provide useful feedback.
The example might be a bit contrived, but it shows how we can leverage the repository pattern in Python.
Again, the advantage is that we have a flexible design that allows us to swap out the data access implementation without changing the business logic. This makes our code easier to test and maintain.
Have you used this pattern yourself and if so, how? Hit us up on social media and let us know: @pybites …