How to Implement the Repository Pattern

How to Implement the Repository Pattern

Profile-Image
Bright SEO Tools in saas Published: Apr 04, 2026 | Updated: Apr 04, 2026 · 2 months ago
0:00

How to Implement the Repository Pattern

Most applications start with database queries scattered throughout the codebase—controllers directly calling SQL, business logic mixed with data access, and domain concepts buried in WHERE clauses. This works initially, then becomes unmaintainable when you need to switch databases, add caching, change query logic, or test business logic without hitting the database. Refactoring at this point means touching dozens of files and risking subtle breakage.

The Repository pattern solves this by creating a clean abstraction layer between your business logic and data access code. Repositories encapsulate all database operations for an entity, providing a collection-like interface that business logic uses without knowing whether data comes from PostgreSQL, MongoDB, or an in-memory cache. This guide covers how to implement repositories that actually improve code quality rather than adding pointless abstraction layers.

We'll focus on practical implementation: structuring repositories for real use cases, handling complex queries without leaking SQL into business logic, implementing caching effectively, and avoiding the common mistakes that make repositories more complex than direct database access.

Why the Repository Pattern Matters

Direct database access from business logic creates tight coupling. Your user registration code directly calls database.query() with SQL strings, which means that code knows about database schema, query syntax, and connection management. Testing requires a real database. Switching from PostgreSQL to MongoDB means rewriting every query in every file that touches users.

The Repository pattern inverts this dependency. Business logic calls userRepository.create(user) without knowing or caring how that user gets persisted. The repository handles all data access concerns—SQL generation, connection pooling, error handling, caching—while exposing a simple domain-oriented interface. Business logic depends on the repository interface, not the implementation, which enables testing with fake repositories and switching implementations without touching business logic.

Key Insight: Repositories aren't about abstracting databases—they're about expressing data access in domain terms. A good repository lets you write findActiveUsersWithExpiredTrials() instead of complex SQL queries scattered through your code. The abstraction enables testability as a side benefit.

The Core Problem: Scattered Data Access

Without repositories, data access code spreads everywhere. A user management feature has queries in the registration controller, the profile update handler, the authentication middleware, and the admin dashboard. Each implements similar logic slightly differently, creating inconsistencies and bugs.

// Scattered queries throughout the codebase
// In registration controller
const user = await db.query(
    'INSERT INTO users (email, password) VALUES ($1, $2) RETURNING *',
    [email, hashedPassword]
);

// In profile controller
const user = await db.query(
    'SELECT * FROM users WHERE id = $1',
    [userId]
);

// In admin dashboard
const users = await db.query(
    'SELECT * FROM users WHERE created_at > $1 ORDER BY created_at DESC',
    [thirtyDaysAgo]
);

This pattern fails when requirements change. Adding a soft-delete feature means updating every SELECT query to filter deleted_at IS NULL. Moving to a different database means rewriting SQL in dozens of files. Testing any business logic requires database setup because data access is embedded in the logic.

Centralized Data Access with Repositories

Repositories centralize all data access for an entity in one place. All code that needs user data goes through UserRepository, which provides domain-oriented methods instead of raw SQL.

// All user data access centralized in repository
class UserRepository {
    async create(userData) {
        const result = await db.query(
            'INSERT INTO users (email, password) VALUES ($1, $2) RETURNING *',
            [userData.email, userData.password]
        );
        return this.mapToUser(result.rows[0]);
    }

    async findById(id) {
        const result = await db.query(
            'SELECT * FROM users WHERE id = $1 AND deleted_at IS NULL',
            [id]
        );
        return result.rows[0] ? this.mapToUser(result.rows[0]) : null;
    }

    async findRecentUsers(days) {
        const cutoff = new Date(Date.now() - days * 24 * 60 * 60 * 1000);
        const result = await db.query(
            'SELECT * FROM users WHERE created_at > $1 AND deleted_at IS NULL ORDER BY created_at DESC',
            [cutoff]
        );
        return result.rows.map(row => this.mapToUser(row));
    }

    mapToUser(row) {
        return {
            id: row.id,
            email: row.email,
            createdAt: row.created_at
        };
    }
}

Now soft-delete logic lives in one place (the repository), all queries consistently filter deleted users, and business logic uses clean domain methods. Testing becomes trivial—swap in a fake repository.

Repository Interface Design

Repository interfaces should reflect domain concepts, not database operations. Don't expose SQL queries or database-specific details. Think about what business logic needs, then design the interface to provide that.

Domain-Oriented Methods

Good repository methods express business intent. findActiveUsersWithExpiredTrials() is better than findWhere({ status: 'active', trial_end: { lt: new Date() } }) because it expresses what you're looking for in domain terms.

class UserRepository {
    // Good: expresses business intent
    async findActiveUsersWithExpiredTrials() { }
    async findUsersNeedingPasswordReset() { }
    async countUsersBySubscriptionTier(tier) { }

    // Bad: exposes database implementation
    async query(sql, params) { }
    async findWhere(conditions) { }
    async executeStoredProcedure(name, args) { }
}

Domain-oriented methods make code self-documenting. When you read findActiveUsersWithExpiredTrials(), you understand what data you're getting. When you read query('SELECT ...'), you must parse SQL to understand intent.

Return Domain Objects, Not Database Rows

Repositories should return domain objects, not raw database rows. This decouples business logic from database schema. If your users table has 20 columns but business logic only needs 5 fields, the repository should return objects with those 5 fields.

// Good: returns domain object
async findById(id) {
    const row = await db.query('SELECT * FROM users WHERE id = $1', [id]);
    if (!row) return null;

    return {
        id: row.id,
        email: row.email,
        displayName: row.display_name,
        createdAt: new Date(row.created_at),
        isActive: row.status === 'active'
    };
}

// Bad: returns database row directly
async findById(id) {
    const row = await db.query('SELECT * FROM users WHERE id = $1', [id]);
    return row;  // Exposes database schema to business logic
}

Mapping to domain objects means business logic never depends on database column names. Renaming display_name to full_name requires changing only the repository, not every file that uses user data.

Standard Repository Methods

Most repositories need a common set of methods for basic CRUD operations. Standardize these across repositories for consistency.

Method Purpose Returns
findById(id) Retrieve single entity by primary key Entity or null
findAll() Retrieve all entities (use with caution) Array of entities
create(data) Create new entity Created entity
update(id, data) Update existing entity Updated entity
delete(id) Delete entity (or soft delete) Success boolean
exists(id) Check if entity exists Boolean
count() Count total entities Number

Beyond these basics, add domain-specific methods as needed. Each method should have a clear purpose and return type.

Handling Complex Queries

The challenge with repositories is handling complex queries without creating method explosion or exposing SQL to callers. You need flexible querying while maintaining abstraction.

Specific Methods for Common Queries

For queries you run frequently, create specific methods. This makes code readable and keeps query logic centralized.

class OrderRepository {
    async findByCustomerId(customerId) {
        return this.query(
            'SELECT * FROM orders WHERE customer_id = $1 ORDER BY created_at DESC',
            [customerId]
        );
    }

    async findPendingOrdersOlderThan(days) {
        const cutoff = new Date(Date.now() - days * 24 * 60 * 60 * 1000);
        return this.query(
            'SELECT * FROM orders WHERE status = $1 AND created_at < $2',
            ['pending', cutoff]
        );
    }

    async findHighValueOrders(minAmount) {
        return this.query(
            'SELECT * FROM orders WHERE total >= $1 ORDER BY total DESC',
            [minAmount]
        );
    }
}

These methods express business concepts (pending orders older than X days) rather than database concepts (WHERE status = 'pending' AND created_at < cutoff). They're self-documenting and testable.

Query Object Pattern for Complex Filtering

When you need flexible querying, use the Query Object pattern instead of exposing raw SQL. Create a query builder that constructs queries using domain concepts.

class OrderQuery {
    constructor() {
        this.filters = {};
        this.sortField = 'created_at';
        this.sortOrder = 'DESC';
        this.limitValue = 100;
    }

    forCustomer(customerId) {
        this.filters.customer_id = customerId;
        return this;
    }

    withStatus(status) {
        this.filters.status = status;
        return this;
    }

    createdAfter(date) {
        this.filters.created_after = date;
        return this;
    }

    sortBy(field, order = 'DESC') {
        this.sortField = field;
        this.sortOrder = order;
        return this;
    }

    limit(count) {
        this.limitValue = count;
        return this;
    }
}

class OrderRepository {
    async find(query) {
        // Build SQL from query object
        let sql = 'SELECT * FROM orders WHERE 1=1';
        const params = [];

        if (query.filters.customer_id) {
            params.push(query.filters.customer_id);
            sql += ` AND customer_id = $${params.length}`;
        }

        if (query.filters.status) {
            params.push(query.filters.status);
            sql += ` AND status = $${params.length}`;
        }

        if (query.filters.created_after) {
            params.push(query.filters.created_after);
            sql += ` AND created_at > $${params.length}`;
        }

        sql += ` ORDER BY ${query.sortField} ${query.sortOrder}`;
        sql += ` LIMIT ${query.limitValue}`;

        return this.query(sql, params);
    }
}

// Usage
const orders = await orderRepository.find(
    new OrderQuery()
        .forCustomer('customer_123')
        .withStatus('pending')
        .createdAfter(new Date('2024-01-01'))
        .sortBy('total', 'DESC')
        .limit(50)
);

This provides flexibility without exposing SQL. Business logic composes queries using domain terms, and the repository handles SQL generation. You can swap database implementations by changing how the repository interprets the query object.

Warning: Don't create generic find({ where: {}, orderBy: {} }) methods that mirror SQL directly. This just moves SQL into JavaScript objects without providing real abstraction. Use specific methods or query objects that express domain concepts.

Repository Implementation Patterns

Repository implementation involves managing database connections, handling errors, mapping between database and domain representations, and ensuring consistent behavior across repositories.

Database Connection Management

Repositories need database connections but shouldn't manage connection lifecycle. Inject a database connection or connection pool into repositories at construction time.

class UserRepository {
    constructor(db) {
        this.db = db;  // Database connection or pool
    }

    async findById(id) {
        const result = await this.db.query(
            'SELECT * FROM users WHERE id = $1',
            [id]
        );
        return result.rows[0] ? this.mapToUser(result.rows[0]) : null;
    }

    mapToUser(row) {
        return {
            id: row.id,
            email: row.email,
            createdAt: new Date(row.created_at)
        };
    }
}

// Repository initialization
const db = createDatabasePool(config);
const userRepository = new UserRepository(db);
const orderRepository = new OrderRepository(db);

This pattern keeps repositories focused on data access logic while connection management stays separate. It also enables easy testing—inject a test database connection or mock.

Transaction Support

Operations that modify multiple entities need transaction support. Repositories should accept an optional transaction parameter that allows coordinating multiple repository calls in a single transaction.

class UserRepository {
    constructor(db) {
        this.db = db;
    }

    async create(userData, transaction = null) {
        const conn = transaction || this.db;
        const result = await conn.query(
            'INSERT INTO users (email, password) VALUES ($1, $2) RETURNING *',
            [userData.email, userData.password]
        );
        return this.mapToUser(result.rows[0]);
    }
}

class AccountRepository {
    constructor(db) {
        this.db = db;
    }

    async create(accountData, transaction = null) {
        const conn = transaction || this.db;
        const result = await conn.query(
            'INSERT INTO accounts (user_id, plan) VALUES ($1, $2) RETURNING *',
            [accountData.userId, accountData.plan]
        );
        return this.mapToAccount(result.rows[0]);
    }
}

// Using repositories in a transaction
async function registerUser(userData, accountData) {
    return this.db.transaction(async (tx) => {
        const user = await userRepository.create(userData, tx);
        const account = await accountRepository.create({
            ...accountData,
            userId: user.id
        }, tx);

        return { user, account };
    });
}

Repositories work with or without transactions. Business logic that needs atomicity across multiple operations uses transactions. Simple operations use repositories directly.

Error Handling and Mapping

Repositories should catch database-specific errors and throw domain exceptions. Business logic shouldn't handle PostgreSQL error codes or MySQL constraint violations.

class UserRepository {
    async create(userData) {
        try {
            const result = await this.db.query(
                'INSERT INTO users (email, password) VALUES ($1, $2) RETURNING *',
                [userData.email, userData.password]
            );
            return this.mapToUser(result.rows[0]);
        } catch (error) {
            if (error.code === '23505') {  // PostgreSQL unique violation
                throw new DuplicateEmailError(userData.email);
            }
            throw new RepositoryError('Failed to create user', error);
        }
    }
}

// Business logic handles domain exceptions
try {
    const user = await userRepository.create({ email, password });
} catch (error) {
    if (error instanceof DuplicateEmailError) {
        return { error: 'Email already registered' };
    }
    throw error;
}

This keeps database-specific details out of business logic. Business logic handles domain errors (duplicate email) without knowing about PostgreSQL error codes.

Caching Layer Integration

Repositories are ideal places to implement caching because they centralize all data access. Caching in repositories is transparent to business logic—code calls userRepository.findById() and doesn't know whether the result comes from cache or database.

Read-Through Cache Pattern

Read-through caching checks the cache before hitting the database. On cache miss, load from database and populate cache.

class UserRepository {
    constructor(db, cache) {
        this.db = db;
        this.cache = cache;
        this.cacheKeyPrefix = 'user:';
        this.cacheTTL = 300;  // 5 minutes
    }

    async findById(id) {
        const cacheKey = `${this.cacheKeyPrefix}${id}`;

        // Try cache first
        const cached = await this.cache.get(cacheKey);
        if (cached) {
            return JSON.parse(cached);
        }

        // Cache miss: load from database
        const result = await this.db.query(
            'SELECT * FROM users WHERE id = $1',
            [id]
        );

        if (!result.rows[0]) {
            return null;
        }

        const user = this.mapToUser(result.rows[0]);

        // Populate cache
        await this.cache.set(
            cacheKey,
            JSON.stringify(user),
            { ttl: this.cacheTTL }
        );

        return user;
    }

    async update(id, data) {
        const result = await this.db.query(
            'UPDATE users SET email = $2 WHERE id = $1 RETURNING *',
            [id, data.email]
        );

        const user = this.mapToUser(result.rows[0]);

        // Invalidate cache on update
        const cacheKey = `${this.cacheKeyPrefix}${id}`;
        await this.cache.delete(cacheKey);

        return user;
    }
}

Caching is invisible to callers. Business logic calls findById() and gets data fast without knowing about Redis. Cache invalidation happens automatically on updates.

Pro Tip: Cache individual entities, not query results. Caching findById() results is effective because cache keys map directly to entity IDs. Caching findRecentUsers() is harder because invalidation becomes complex—which cache entries should invalidate when a user is created?

Testing Repositories

Repositories make testing easier by providing clear boundaries. Test repositories themselves against a real database to ensure data access works correctly. Test business logic with fake repositories to isolate business rules.

Testing Repository Implementation

Repository tests should use a real test database to verify SQL correctness, connection handling, and data mapping.

describe('UserRepository', () => {
    let db;
    let repository;

    beforeEach(async () => {
        db = await createTestDatabase();
        repository = new UserRepository(db);
    });

    afterEach(async () => {
        await db.close();
    });

    test('findById returns user when exists', async () => {
        const user = await repository.create({
            email: '[email protected]',
            password: 'hashed'
        });

        const found = await repository.findById(user.id);

        expect(found.id).toBe(user.id);
        expect(found.email).toBe('[email protected]');
    });

    test('findById returns null when not exists', async () => {
        const found = await repository.findById('nonexistent-id');
        expect(found).toBeNull();
    });

    test('create throws DuplicateEmailError for duplicate email', async () => {
        await repository.create({
            email: '[email protected]',
            password: 'hashed'
        });

        await expect(
            repository.create({
                email: '[email protected]',
                password: 'hashed2'
            })
        ).rejects.toThrow(DuplicateEmailError);
    });
});

Testing Business Logic with Fake Repositories

Business logic tests should use fake repositories (in-memory implementations) to avoid database dependencies and make tests fast.

class FakeUserRepository {
    constructor() {
        this.users = new Map();
        this.nextId = 1;
    }

    async create(userData) {
        const id = `user_${this.nextId++}`;
        const user = {
            id,
            email: userData.email,
            createdAt: new Date()
        };
        this.users.set(id, user);
        return user;
    }

    async findById(id) {
        return this.users.get(id) || null;
    }

    async findByEmail(email) {
        return Array.from(this.users.values())
            .find(u => u.email === email) || null;
    }
}

// Business logic test using fake repository
test('user registration creates user and sends welcome email', async () => {
    const userRepository = new FakeUserRepository();
    const emailService = new FakeEmailService();
    const registrationService = new RegistrationService(
        userRepository,
        emailService
    );

    const result = await registrationService.register({
        email: '[email protected]',
        password: 'password123'
    });

    expect(result.success).toBe(true);
    expect(userRepository.users.size).toBe(1);
    expect(emailService.sentEmails).toHaveLength(1);
    expect(emailService.sentEmails[0].to).toBe('[email protected]');
});

Fake repositories keep tests fast and deterministic. You control exactly what data exists, making edge cases easy to test.

Common Repository Anti-Patterns

Several patterns seem reasonable but create problems. Avoid these common mistakes.

Generic Repository Base Class

Creating a base Repository class with generic CRUD methods sounds like good code reuse but creates awkward APIs.

// Anti-pattern: generic repository
class Repository {
    async findById(id) { }
    async findAll() { }
    async save(entity) { }
    async delete(id) { }
}

class UserRepository extends Repository { }
class OrderRepository extends Repository { }

// Problem: all repositories have same interface regardless of domain needs
// Orders need findByCustomerId, users need findByEmail,
// but generic base class doesn't know about these

Instead, define repository interfaces independently based on each entity's needs. Share implementation utilities if needed, but don't force a common interface.

Repository per Database Table

Creating one repository per table treats repositories like ORMs. Repositories should be organized by domain aggregates, not database tables.

// Anti-pattern: table-based repositories
class UsersRepository { }       // users table
class UserProfilesRepository { } // user_profiles table
class UserSettingsRepository { } // user_settings table

// Better: aggregate-based repository
class UserRepository {
    async findById(id) {
        // Joins users, user_profiles, user_settings
        // Returns complete User aggregate
    }
}

One repository per aggregate means business logic works with meaningful domain objects, not database tables. Joining multiple tables to build a User aggregate is the repository's job.

Exposing IQueryable or Query Builders

Returning queryable objects that let callers build arbitrary queries defeats the purpose of repositories.

// Anti-pattern: exposing query builder
class UserRepository {
    query() {
        return this.db.queryBuilder('users');  // Returns query builder
    }
}

// Caller builds queries directly
const users = await userRepository
    .query()
    .where('status', 'active')
    .where('created_at', '>', cutoff)
    .orderBy('email')
    .limit(10);

// This is just database access with extra steps

Repositories should provide domain-specific methods, not generic query interfaces. If you need flexible querying, use the Query Object pattern that expresses domain concepts.

Danger: The repository pattern adds value when it abstracts data access behind domain interfaces. If your repositories just wrap database calls with no abstraction, you're adding complexity without benefit. Either commit to domain-oriented repositories or use direct database access.

Repository Pattern with ORMs

ORMs like Prisma, TypeORM, and Sequelize already provide data access abstraction. You might not need explicit repositories, or you might want thin repositories that wrap ORM models to provide domain-specific methods.

When ORMs Replace Repositories

If your ORM provides good domain modeling and you're comfortable with business logic depending on the ORM, you might not need repositories. ORMs offer querying, relationships, and transactions already.

// Using Prisma directly without repository
const user = await prisma.user.findUnique({
    where: { id: userId },
    include: { profile: true, settings: true }
});

const recentOrders = await prisma.order.findMany({
    where: {
        customerId: userId,
        createdAt: { gte: thirtyDaysAgo }
    },
    orderBy: { createdAt: 'desc' }
});

This works fine if you're happy depending on Prisma throughout your codebase. The tradeoff: switching ORMs means updating every file that uses Prisma.

Thin Repositories Over ORMs

You can wrap ORM models with thin repositories that provide domain-specific methods while using the ORM for actual data access.

class UserRepository {
    constructor(prisma) {
        this.prisma = prisma;
    }

    async findById(id) {
        const user = await this.prisma.user.findUnique({
            where: { id },
            include: { profile: true, settings: true }
        });
        return user ? this.mapToUser(user) : null;
    }

    async findActiveUsersWithExpiredTrials() {
        const users = await this.prisma.user.findMany({
            where: {
                status: 'active',
                trialEndsAt: { lt: new Date() }
            }
        });
        return users.map(u => this.mapToUser(u));
    }

    mapToUser(prismaUser) {
        return {
            id: prismaUser.id,
            email: prismaUser.email,
            displayName: prismaUser.profile?.displayName,
            // Domain object structure, not Prisma structure
        };
    }
}

This balances abstraction with practicality. You get domain-oriented interfaces and independence from ORM specifics without reinventing data access.

FAQ

Do I really need repositories if I'm using an ORM?

It depends on your abstraction goals. ORMs already provide data access abstraction, so repositories might be unnecessary overhead. Use repositories over ORMs when you want domain-specific interfaces, need to hide ORM details, or plan to swap ORMs. Otherwise, using the ORM directly is simpler.

Should each repository have create, read, update, delete methods?

Only if those operations make sense for the entity. Some entities are read-only (reference data), some are append-only (audit logs), and some support full CRUD. Design each repository's interface based on business operations, not a generic template.

How do I handle complex queries that join multiple tables?

Create specific methods for common complex queries, or use the Query Object pattern for flexible querying. Don't expose raw SQL or query builders directly—that defeats the abstraction. Repositories should handle joins internally and return domain objects.

Where should transaction management happen—in repositories or business logic?

Business logic coordinates transactions across repository calls. Repositories accept optional transaction parameters so they can participate in transactions, but business logic decides when transactions are needed. This keeps repositories focused on data access while business logic orchestrates operations.

Should repositories return domain models or DTOs?

Repositories should return domain models (entities representing business concepts). DTOs are for API layer communication—repositories operate at the domain layer. Return objects with domain-relevant properties, not database table structures or API response formats.

How do I test repositories without hitting a real database?

Test repository implementations against a real test database to verify SQL correctness. Test business logic using fake repositories (in-memory implementations) to avoid database dependencies. This gives you confidence that data access works while keeping business logic tests fast.

Is it okay to have multiple repositories for related entities?

Yes, organize repositories by domain aggregate roots. If User and UserProfile are part of the same aggregate, one UserRepository handles both. If Order and OrderItem are part of the Order aggregate, one OrderRepository handles both. Separate repositories when entities are independent aggregates.

How should repositories handle pagination?

Add pagination parameters to methods that return collections: findAll({ page, limit }) or use a query object with pagination methods. Return both results and total count to enable UI pagination. Make pagination optional with sensible defaults to keep common cases simple.

Should repositories handle caching or should that be separate?

Repositories are ideal places for caching because they centralize data access. Implement read-through caching in repositories to make caching transparent to business logic. This keeps cache invalidation logic close to data modification code.

What's the difference between repository and DAO patterns?

DAOs (Data Access Objects) provide CRUD operations per database table. Repositories provide domain-oriented operations per aggregate. DAOs map closely to database structure; repositories map to domain concepts. Repositories are higher-level abstractions focused on business needs, not storage implementation.

Conclusion

The Repository pattern provides valuable abstraction when implemented with domain needs in mind. Good repositories express business intent through their interfaces, hide data access details, and enable testing by providing clear boundaries. Poor repositories just wrap database calls without adding value, creating complexity without benefit.

Design repository interfaces around domain operations, not database operations. Think about what business logic needs and provide methods that express those needs clearly. Return domain objects, not database rows. Handle database-specific concerns (connection management, error mapping, caching) within repositories so business logic stays clean.

Use repositories where they add value—complex domains with rich business logic, systems where testability matters, or applications that might switch databases. For simple CRUD applications or when using feature-rich ORMs, direct data access might be simpler. Choose patterns that solve real problems in your specific context.


Share on Social Media: