Best SaaS Feature Flag Implementations

You just deployed a new payment processing feature to production, and within minutes, error rates spike to 15%. Without feature flags, your only options are rolling back the entire deployment or pushing a hotfix—both requiring 20+ minutes during which customers can't complete purchases. With feature flags, you toggle the new feature off in 3 seconds, restoring service while you investigate the issue in a safe environment.

This article covers seven production-ready feature flag implementations for SaaS applications, from simple database-backed toggles to sophisticated targeting systems that enable gradual rollouts, A/B testing, and instant emergency rollbacks. You'll learn the data models, caching strategies, and evaluation logic that power feature flags at scale, plus the operational patterns that prevent feature flag debt from destroying your codebase.

We'll progress from basic boolean flags for entire applications through percentage rollouts, user segment targeting, and finally the architectural patterns that let you safely retire flags without breaking production.

Why Feature Flags Are Infrastructure, Not Optional

Feature flags decouple deployment from release. Every code deploy becomes a low-risk operation because new functionality ships disabled, waiting for a flag toggle to activate it. This separation transforms your release process from a high-stakes event requiring all-hands-on-deck coordination to a routine operation that happens multiple times per day.

The risk mitigation value becomes obvious during incidents. Traditional deployments require rolling back code, which can take 15-30 minutes depending on your CI/CD pipeline, health checks, and autoscaling warmup. Feature flags let you disable problematic features instantly—typically under 5 seconds with proper caching—limiting the blast radius while you investigate root causes.

Beyond incident response, feature flags enable progressive delivery patterns that reduce risk through controlled exposure. Instead of releasing a new checkout flow to all customers simultaneously, you release it to 1% of traffic, monitor error rates and conversion metrics, then gradually increase exposure to 5%, 10%, 50%, and finally 100%. Problems surface at limited scale where they affect dozens of sessions instead of thousands.

The testing and staging benefit is equally valuable. Feature flags let you merge code to production branches before it's ready for customer exposure. Your QA team can test integrated features in production using flag overrides, finding issues that only manifest with production data and scale. This eliminates the "works in staging, breaks in production" problem that plagues systems with significant production-staging environmental differences.

Key Insight: Feature flags are not primarily about gradual rollouts or A/B testing. The core value is deployment safety—the ability to ship code to production with new features disabled, then enable them independently of deployment. Everything else is a bonus.

Database-Backed Boolean Flags: The Foundation

The simplest production-ready feature flag implementation uses a database table with flag names and boolean values. This pattern works for applications with modest traffic (under 10,000 requests per second) and forms the foundation for more sophisticated systems.

The schema requires only three essential fields: flag name, enabled status, and description. Additional metadata like created timestamp, last modified timestamp, and modified by user helps with auditing. A unique index on flag name ensures lookups are fast and prevents duplicate flag definitions.

-- PostgreSQL schema for feature flags
CREATE TABLE feature_flags (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL UNIQUE,
  enabled BOOLEAN NOT NULL DEFAULT false,
  description TEXT,
  created_at TIMESTAMP NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
  updated_by VARCHAR(255)
);

-- The UNIQUE constraint on name already creates an index for lookups by name,
-- so no separate index on name is needed.
CREATE INDEX idx_feature_flags_enabled ON feature_flags(enabled);

The application-side implementation requires a flag evaluation function that queries the database, with caching to prevent database overload. The naive approach queries the database on every flag check, which can generate millions of queries per day for frequently-checked flags. Caching flag state in memory with a TTL (time-to-live) reduces database load by 99%+ while keeping flag changes relatively quick.

// Node.js implementation with in-memory caching
class FeatureFlagService {
  constructor(db, cacheTTL = 60000) { // 60 second cache
    this.db = db;
    this.cache = new Map();
    this.cacheTTL = cacheTTL;
  }

  async isEnabled(flagName, defaultValue = false) {
    const cached = this.cache.get(flagName);

    if (cached && Date.now() - cached.timestamp < this.cacheTTL) {
      return cached.enabled;
    }

    try {
      const result = await this.db.query(
        'SELECT enabled FROM feature_flags WHERE name = $1',
        [flagName]
      );

      const enabled = result.rows[0]?.enabled ?? defaultValue;

      this.cache.set(flagName, {
        enabled,
        timestamp: Date.now()
      });

      return enabled;
    } catch (error) {
      console.error(`Feature flag lookup failed for ${flagName}:`, error);
      return defaultValue;
    }
  }

  invalidateCache(flagName) {
    if (flagName) {
      this.cache.delete(flagName);
    } else {
      this.cache.clear();
    }
  }

  async updateFlag(flagName, enabled, userId) {
    await this.db.query(
      'UPDATE feature_flags SET enabled = $1, updated_at = NOW(), updated_by = $2 WHERE name = $3',
      [enabled, userId, flagName]
    );

    this.invalidateCache(flagName);
  }
}

This implementation balances simplicity and performance. The 60-second cache means flag changes take up to a minute to propagate across all application servers, which is acceptable for most use cases. For applications requiring instant flag updates, reduce the TTL to 5-10 seconds or implement active cache invalidation via pub/sub channels.

One critical detail: always provide a default value when checking flags. Database connectivity issues, migration errors, or missing flag definitions should not crash your application. The default value determines behavior when flag lookup fails—typically false for new features (fail closed) or true for infrastructure changes (fail open).
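As a minimal sketch of the fail-closed versus fail-open choice, the helper below wraps any async lookup and applies the caller's default when the lookup throws or finds nothing. The names checkFlagSafely, brokenLookup, and the two flag names are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: wraps any async flag lookup and applies a default on failure.
async function checkFlagSafely(lookup, flagName, defaultValue) {
  try {
    const value = await lookup(flagName);
    // A missing flag definition (null/undefined) also falls back to the default
    return value ?? defaultValue;
  } catch (error) {
    console.error(`Flag lookup failed for ${flagName}, using default ${defaultValue}`);
    return defaultValue;
  }
}

// Simulated outage: every lookup rejects
const brokenLookup = async () => {
  throw new Error('connection refused');
};

async function demo() {
  // New customer-facing feature: fail closed
  const newCheckout = await checkFlagSafely(brokenLookup, 'new_checkout', false);
  // Infrastructure change that is already the normal path: fail open
  const connectionPooling = await checkFlagSafely(brokenLookup, 'use_connection_pooling', true);
  return { newCheckout, connectionPooling };
}
```

During the simulated outage the new checkout stays off while the infrastructure toggle stays on, which is the behavior each default is meant to encode.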

Percentage-Based Rollouts with Consistent Hashing

Boolean flags are all-or-nothing: a feature is either enabled for everyone or disabled for everyone. Percentage-based rollouts enable gradual exposure—release a feature to 5% of users, monitor metrics, then increase to 10%, 25%, 50%, and finally 100%. This limits the impact of bugs while gathering real-world performance data.

The implementation challenge is consistency. If you randomly enable a feature for 10% of requests, the same user might see the feature on one request and not see it on the next—a confusing experience. Percentage rollouts must be user-consistent: if a user is in the 10% group, they stay in that group across all their requests.

Consistent hashing solves this problem. Hash the user's ID with the flag name, reduce the hash to a bucket number from 0 to 99, and enable the feature when the bucket is below the rollout percentage. The same user ID always produces the same hash, guaranteeing consistency. The flag name is included in the hash input so different flags select different user subsets: if user 123 is in the 10% group for feature A, they might not be in the 10% group for feature B.

// Percentage-based rollout with consistent hashing
const crypto = require('crypto');

class FeatureFlagService {
  constructor(db, cacheTTL = 60000) {
    this.db = db;
    this.cache = new Map();
    this.cacheTTL = cacheTTL;
  }

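  async isEnabled(flagName, userId, defaultValue = false) {
    let flag;

    try {
      flag = await this.getFlag(flagName);
    } catch (error) {
      // Lookup failures fall back to the caller's default instead of throwing
      console.error(`Feature flag lookup failed for ${flagName}:`, error);
      return defaultValue;
    }

    if (!flag) {
      return defaultValue;
    }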

    // Simple boolean flag
    if (flag.rollout_percentage === null || flag.rollout_percentage === 100) {
      return flag.enabled;
    }

    // Flag disabled, regardless of rollout
    if (!flag.enabled) {
      return false;
    }

    // Percentage-based rollout
    return this.isUserInRollout(userId, flagName, flag.rollout_percentage);
  }

  isUserInRollout(userId, flagName, percentage) {
    // Hash user ID + flag name for consistency
    const hash = crypto
      .createHash('md5')
      .update(`${userId}:${flagName}`)
      .digest('hex');

    // Convert first 8 hex chars to number, mod 100
    const bucket = parseInt(hash.substring(0, 8), 16) % 100;

    return bucket < percentage;
  }

  async getFlag(flagName) {
    const cached = this.cache.get(flagName);

    if (cached && Date.now() - cached.timestamp < this.cacheTTL) {
      return cached.flag;
    }

    const result = await this.db.query(
      'SELECT enabled, rollout_percentage FROM feature_flags WHERE name = $1',
      [flagName]
    );

    const flag = result.rows[0] || null;

    this.cache.set(flagName, {
      flag,
      timestamp: Date.now()
    });

    return flag;
  }
}

The updated schema includes a rollout_percentage column. A value of 100 or null means the flag is fully enabled when the enabled column is true. A value of 10 means only 10% of users see the feature, and only while enabled is true: set enabled to true and rollout_percentage to 10 for a 10% rollout. When enabled is false, the feature is off for everyone regardless of the percentage.

-- Updated schema with rollout percentage
ALTER TABLE feature_flags
ADD COLUMN rollout_percentage INTEGER CHECK (rollout_percentage >= 0 AND rollout_percentage <= 100);

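This approach distributes users pseudo-randomly but deterministically. Because MD5 output is effectively uniformly distributed, approximately 10% of users fall into any given 10% bucket. The consistency guarantee means user experience stays predictable even as you adjust rollout percentages: because the check is bucket < percentage, raising the percentage only adds users, so everyone in the 10% group is still included at 25%.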

Warning: Never use Math.random() for percentage rollouts. Random selection on each request means users see inconsistent behavior. Always use deterministic hashing based on user ID or session ID to maintain consistency across requests.
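The two properties that matter here, roughly uniform selection and a rollout group that only grows, can be checked with a short sketch. It reuses the same bucketing rule as isUserInRollout; the helper names bucketFor and usersInRollout and the synthetic user IDs are assumptions made for illustration.

```javascript
const crypto = require('crypto');

// Same bucketing rule as isUserInRollout: first 8 hex chars of MD5, mod 100
function bucketFor(userId, flagName) {
  const hash = crypto.createHash('md5').update(`${userId}:${flagName}`).digest('hex');
  return parseInt(hash.substring(0, 8), 16) % 100;
}

function usersInRollout(userIds, flagName, percentage) {
  return userIds.filter((id) => bucketFor(id, flagName) < percentage);
}

// 10,000 synthetic user IDs (illustrative only)
const userIds = Array.from({ length: 10000 }, (_, i) => `user-${i}`);

const at10 = usersInRollout(userIds, 'new_checkout', 10);
const at25 = new Set(usersInRollout(userIds, 'new_checkout', 25));

// Share of users selected at 10%: close to 0.10, but not exactly 0.10
const shareAt10 = at10.length / userIds.length;

// Everyone in the 10% group is still included at 25%
const rolloutOnlyGrows = at10.every((id) => at25.has(id));

console.log({ shareAt10, rolloutOnlyGrows });
```

The share at 10% lands near one tenth rather than exactly on it, which is expected for hashed bucketing over a finite user population.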

User Segment Targeting for Controlled Rollouts

Percentage rollouts select users randomly, but often you want control over which specific users see new features. Common scenarios include enabling features for internal team members first, for beta tester cohorts, for enterprise customers before free tier, or for specific customer accounts experiencing issues that the feature fixes.

User segment targeting extends feature flags with explicit user/group inclusion and exclusion rules. The data model stores targeting rules alongside flags, and the evaluation logic checks whether the current user matches an inclusion or exclusion rule before applying any percentage rollout.

-- Schema for segment targeting
CREATE TABLE feature_flag_targets (
  id SERIAL PRIMARY KEY,
  feature_flag_id INTEGER NOT NULL REFERENCES feature_flags(id) ON DELETE CASCADE,
  target_type VARCHAR(50) NOT NULL, -- 'user', 'organization', 'segment'
  target_id VARCHAR(255) NOT NULL,
  include BOOLEAN NOT NULL DEFAULT true, -- true for inclusion, false for exclusion
  created_at TIMESTAMP NOT NULL DEFAULT NOW(),
  UNIQUE(feature_flag_id, target_type, target_id)
);

CREATE INDEX idx_feature_flag_targets_lookup ON feature_flag_targets(feature_flag_id, target_type);

The evaluation logic checks targeting rules from most specific to least specific: user, then organization, then segments. The first matching rule decides the outcome: an inclusion enables the feature and an exclusion disables it, so a user-level rule overrides an organization-level one. Users who match no rule fall back to the percentage rollout logic.

// Segment targeting implementation
class FeatureFlagService {
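  // Assumes getFlag also selects the flag's id, which getTarget needs
  async isEnabled(flagName, context, defaultValue = false) {
    const flag = await this.getFlag(flagName);

    // Missing flag definition: fall back to the caller's default
    if (!flag) {
      return defaultValue;
    }

    // An explicitly disabled flag is off for everyone, whatever the default
    if (!flag.enabled) {
      return false;
    }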

    // Check explicit user inclusion
    if (context.userId) {
      const userTarget = await this.getTarget(flag.id, 'user', context.userId);
      if (userTarget && userTarget.include) {
        return true;
      }
      if (userTarget && !userTarget.include) {
        return false;
      }
    }

    // Check organization inclusion
    if (context.organizationId) {
      const orgTarget = await this.getTarget(flag.id, 'organization', context.organizationId);
      if (orgTarget && orgTarget.include) {
        return true;
      }
      if (orgTarget && !orgTarget.include) {
        return false;
      }
    }

    // Check segment membership
    if (context.segments && context.segments.length > 0) {
      for (const segment of context.segments) {
        const segmentTarget = await this.getTarget(flag.id, 'segment', segment);
        if (segmentTarget && segmentTarget.include) {
          return true;
        }
        if (segmentTarget && !segmentTarget.include) {
          return false;
        }
      }
    }

    // Fall back to percentage rollout
    if (flag.rollout_percentage !== null && flag.rollout_percentage < 100) {
      return this.isUserInRollout(context.userId, flagName, flag.rollout_percentage);
    }

    return true; // Fully enabled, no targeting
  }

  async getTarget(flagId, targetType, targetId) {
    const cacheKey = `target:${flagId}:${targetType}:${targetId}`;
    const cached = this.cache.get(cacheKey);

    if (cached && Date.now() - cached.timestamp < this.cacheTTL) {
      return cached.target;
    }

    const result = await this.db.query(
      'SELECT include FROM feature_flag_targets WHERE feature_flag_id = $1 AND target_type = $2 AND target_id = $3',
      [flagId, targetType, targetId]
    );

    const target = result.rows[0] || null;

    this.cache.set(cacheKey, {
      target,
      timestamp: Date.now()
    });

    return target;
  }
}

This pattern enables sophisticated rollout strategies. Enable a feature for your internal organization first, gather feedback, then enable for a segment of beta testers, then roll out to 10% of remaining users, then 100%. Or enable a bug fix for specific customers who reported the issue before rolling out broadly.

The context object passed to isEnabled contains all relevant user/session information. In a typical SaaS application, context includes userId, organizationId, email, subscription plan, and any custom segments like "beta_testers" or "enterprise_customers". The more context you provide, the more targeting flexibility you have.
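One way to assemble that context is a small builder that runs once per request. This is only a sketch: the user record shape (id, orgId, email, plan, isBetaTester) is hypothetical, and the segment names are the two examples mentioned above.

```javascript
// Hypothetical user record shape: id, orgId, email, plan, isBetaTester.
function buildFlagContext(user) {
  const segments = [];
  if (user.isBetaTester) segments.push('beta_testers');
  if (user.plan === 'enterprise') segments.push('enterprise_customers');

  return {
    // target_id is a VARCHAR column, so IDs are normalized to strings
    userId: String(user.id),
    organizationId: String(user.orgId),
    email: user.email,
    plan: user.plan,
    segments
  };
}

const context = buildFlagContext({
  id: 123,
  orgId: 42,
  email: 'dana@example.com',
  plan: 'enterprise',
  isBetaTester: true
});

console.log(context);
```

Building the context in one place keeps segment definitions consistent across every flag check in the request.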

Redis-Based Distributed Flag Evaluation

In-memory caching works for single-server deployments or when 60-second propagation delays are acceptable. For high-traffic SaaS applications running dozens of API servers, you need centralized flag storage that all servers query with sub-millisecond latency. Redis is the standard solution.

The Redis approach stores flag definitions and targeting rules in Redis instead of PostgreSQL. Flag updates are announced to all servers via Redis pub/sub. Each server keeps only a short-lived local cache (5 seconds in the example below) in front of Redis, and pub/sub messages evict changed flags from it, so servers converge on the same state within seconds.

// Redis-based feature flag service
class RedisFeatureFlagService {
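  // Assumes ioredis-style clients. A connection in subscriber mode cannot run
  // regular commands such as HGETALL, so pub/sub gets its own connection.
  constructor(redis, subscriber = redis.duplicate()) {
    this.redis = redis;
    this.localCache = new Map();

    // Subscribe to flag updates
    subscriber.subscribe('flag_updates').catch((error) => {
      console.error('Failed to subscribe to flag updates:', error);
    });
    subscriber.on('message', (channel, message) => {
      const { flagName } = JSON.parse(message);
      this.localCache.delete(flagName);
    });
  }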

  async isEnabled(flagName, context, defaultValue = false) {
    // Check local cache first (short TTL)
    const cached = this.localCache.get(flagName);
    if (cached && Date.now() - cached.timestamp < 5000) {
      return this.evaluate(cached.flag, context);
    }

    // Fetch from Redis
    const flag = await this.redis.hgetall(`flag:${flagName}`);

    if (!flag || Object.keys(flag).length === 0) {
      return defaultValue;
    }

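    // Parse stored values (Redis hashes store every field as a string)
    flag.name = flagName; // evaluate() hashes on flag.name
    flag.enabled = flag.enabled === 'true';
    flag.rollout_percentage = flag.rollout_percentage ? parseInt(flag.rollout_percentage, 10) : null;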

    this.localCache.set(flagName, {
      flag,
      timestamp: Date.now()
    });

    return this.evaluate(flag, context);
  }

  evaluate(flag, context) {
    if (!flag.enabled) {
      return false;
    }

    // Check targeting rules (stored in Redis sets)
    // Implementation similar to database version but using Redis data structures

    // Fall back to percentage
    if (flag.rollout_percentage !== null && flag.rollout_percentage < 100) {
      return this.isUserInRollout(context.userId, flag.name, flag.rollout_percentage);
    }

    return true;
  }

  async updateFlag(flagName, updates, userId) {
    // Update Redis (HSET accepts multiple fields; HMSET is deprecated)
    await this.redis.hset(`flag:${flagName}`, updates);

    // Publish update to all servers
    await this.redis.publish('flag_updates', JSON.stringify({
      flagName,
      updatedBy: userId,
      timestamp: Date.now()
    }));
  }
}

The two-tier caching strategy provides a good balance: a 5-second local cache eliminates most Redis calls for frequently-checked flags, while Redis pub/sub evicts changed flags from every server's local cache almost immediately. Pub/sub delivery is fire-and-forget, so a server that misses a message (during a reconnect, for example) serves the stale value until its local entry expires. This balances latency (sub-millisecond for cached flags), consistency (5-second maximum staleness), and Redis load (minimal).

Redis data structures map naturally to feature flag needs. Use hashes for flag definitions (name, enabled, rollout_percentage). Use sets for targeting rules (SADD flag:new_checkout:users 123 456 789). Use pub/sub for instant change propagation. This architecture scales to millions of flag evaluations per second with a properly provisioned Redis cluster.
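The set-based targeting check could look roughly like the sketch below. It assumes an ioredis-style client whose sismember(key, member) resolves to 1 or 0. The flag:<name>:users key follows the SADD example above, while the flag:<name>:orgs key and the function names are hypothetical. The in-memory stand-in exists only so the function can be tried without a Redis server.

```javascript
// Assumes an ioredis-style client where sismember(key, member) resolves to 1 or 0.
async function isExplicitlyTargeted(redis, flagName, context) {
  if (context.userId != null) {
    const hit = await redis.sismember(`flag:${flagName}:users`, String(context.userId));
    if (hit === 1) return true;
  }
  if (context.organizationId != null) {
    // Hypothetical key for organization-level targeting
    const hit = await redis.sismember(`flag:${flagName}:orgs`, String(context.organizationId));
    if (hit === 1) return true;
  }
  return false;
}

// In-memory stand-in for Redis, only for trying the function without a server
function createFakeRedis(initialSets) {
  const sets = new Map(
    Object.entries(initialSets).map(([key, members]) => [key, new Set(members)])
  );
  return {
    async sismember(key, member) {
      return sets.has(key) && sets.get(key).has(member) ? 1 : 0;
    }
  };
}
```

In the evaluate method, a check like this would run before the percentage fallback, mirroring the ordering used in the database version.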

Implementation	Propagation Time	Latency	Best For
Database Only	Instant	5-20ms	Low traffic, simple needs
DB + Memory Cache	30-60 seconds	< 1ms	Medium traffic, non-critical timing
Redis Primary	1-5 seconds	1-3ms	High traffic, fast propagation needed
Redis + Local Cache	5-10 seconds	< 1ms	Very high traffic, optimal performance

Admin UI and Audit Logging

Feature flags need a management interface beyond SQL queries. Non-technical team members—product managers, customer success, executives—need the ability to toggle flags during incidents or rollouts. An admin UI makes feature flags accessible to the entire team while maintaining safety through permissions and audit logs.

The minimum viable admin UI shows all flags with current status, provides toggle buttons to enable/disable, and displays rollout percentages with controls to adjust them. Beyond that, add targeting management, change history, and real-time usage metrics showing how many users are seeing each flag state.

// Express.js admin API endpoints
app.get('/admin/api/feature-flags', authenticate, authorize('admin'), async (req, res) => {
  const flags = await db.query(`
    SELECT
      f.*,
      COUNT(DISTINCT ft.id) as target_count,
      (SELECT COUNT(*) FROM feature_flag_evaluations WHERE flag_name = f.name AND result = true AND created_at > NOW() - INTERVAL '1 hour') as enabled_count_1h
    FROM feature_flags f
    LEFT JOIN feature_flag_targets ft ON f.id = ft.feature_flag_id
    GROUP BY f.id
    ORDER BY f.name
  `);

  res.json(flags.rows);
});

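app.post('/admin/api/feature-flags/:name/toggle', authenticate, authorize('admin'), async (req, res, next) => {
  const { name } = req.params;
  const { enabled, comment } = req.body;

  if (typeof enabled !== 'boolean') {
    return res.status(400).json({ error: 'enabled must be a boolean' });
  }

  let client;

  try {
    // Assumes db is a pg Pool: a transaction must run on one dedicated connection
    client = await db.connect();
    await client.query('BEGIN');

    // Lock the row and read the actual previous value for the audit log
    const current = await client.query(
      'SELECT enabled FROM feature_flags WHERE name = $1 FOR UPDATE',
      [name]
    );

    if (current.rows.length === 0) {
      await client.query('ROLLBACK');
      return res.status(404).json({ error: 'Flag not found' });
    }

    // Update flag
    await client.query(
      'UPDATE feature_flags SET enabled = $1, updated_at = NOW(), updated_by = $2 WHERE name = $3',
      [enabled, req.user.email, name]
    );

    // Log the change
    await client.query(
      'INSERT INTO feature_flag_audit_log (flag_name, action, old_value, new_value, comment, user_email) VALUES ($1, $2, $3, $4, $5, $6)',
      [name, 'toggle', String(current.rows[0].enabled), String(enabled), comment ?? null, req.user.email]
    );

    await client.query('COMMIT');

    // Invalidate caches
    featureFlagService.invalidateCache(name);

    res.json({ success: true });
  } catch (error) {
    if (client) {
      await client.query('ROLLBACK').catch(() => {});
    }
    next(error); // hand the error to Express instead of throwing inside an async handler
  } finally {
    if (client) {
      client.release();
    }
  }
});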

Audit logging is non-negotiable for feature flags. Every change must record who made it, when, what changed, and ideally why (via a required comment field). During incident post-mortems, audit logs let you correlate flag changes with error spikes or user complaints. During compliance audits, they demonstrate controlled change management.

-- Audit log schema
CREATE TABLE feature_flag_audit_log (
  id SERIAL PRIMARY KEY,
  flag_name VARCHAR(255) NOT NULL,
  action VARCHAR(50) NOT NULL, -- 'toggle', 'update_rollout', 'add_target', 'remove_target'
  old_value TEXT,
  new_value TEXT,
  comment TEXT,
  user_email VARCHAR(255) NOT NULL,
  created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_audit_log_flag ON feature_flag_audit_log(flag_name, created_at);
CREATE INDEX idx_audit_log_user ON feature_flag_audit_log(user_email, created_at);

The comment field encourages documentation. Require comments for all flag changes in production: "Enabling new payment flow for 10% of users to test Stripe integration" or "Disabling AI suggestions due to high error rate, investigating." These comments become invaluable documentation when you revisit flags months later trying to determine if they're safe to remove.

Pro Tip: Build a Slack bot that posts flag changes to a dedicated channel. Visibility ensures the team knows when features are being rolled out, catches accidental changes quickly, and creates a real-time audit trail that's more accessible than querying database logs.
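A full bot is not required to get this visibility. As a simpler stand-in, the sketch below posts each change to a Slack incoming webhook, which accepts a JSON body with a text field. The change record shape mirrors the audit log columns, the function names are hypothetical, and webhookUrl is a placeholder you would supply. The built-in fetch requires Node 18 or later.

```javascript
// Builds the message text from an audit-log-shaped record
function formatFlagChange(change) {
  const reason = change.comment ? ` Reason: ${change.comment}` : '';
  return `Flag "${change.flagName}" ${change.action}: ${change.oldValue} -> ${change.newValue} by ${change.userEmail}.${reason}`;
}

// Posts to a Slack incoming webhook. webhookUrl is a placeholder you supply;
// requires Node 18+ for the built-in fetch.
async function postFlagChange(webhookUrl, change) {
  const response = await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: formatFlagChange(change) })
  });
  if (!response.ok) {
    throw new Error(`Slack webhook returned ${response.status}`);
  }
}
```

Calling postFlagChange after the audit-log transaction commits keeps the channel consistent with the database, and a failed post should be logged rather than allowed to fail the toggle itself.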

Feature Flag Hygiene and Technical Debt Management

Feature flags accumulate over time. A flag added for a cautious rollout in January remains in the codebase in July, long after the feature reached 100% rollout. Stale flags create technical debt: they clutter code with conditional logic, confuse new team members, and eventually no one remembers what they control or if they're safe to remove.

Preventing feature flag debt requires discipline and tooling. First, establish flag lifecycle expectations: temporary flags (for rollouts/testing) should be removed within 30-90 days of reaching 100% rollout. Permanent flags (for paid features, infrastructure toggles) should be explicitly marked as permanent in the database.

-- Add lifecycle tracking to schema
ALTER TABLE feature_flags
ADD COLUMN flag_type VARCHAR(50) NOT NULL DEFAULT 'temporary', -- 'temporary', 'permanent', 'ops'
ADD COLUMN expected_removal_date DATE,
ADD COLUMN fully_rolled_out_date DATE;

-- Alert on overdue temporary flags
SELECT name, created_at, expected_removal_date
FROM feature_flags
WHERE flag_type = 'temporary'
AND expected_removal_date < CURRENT_DATE
AND enabled = true
AND (rollout_percentage IS NULL OR rollout_percentage = 100); -- NULL also means fully rolled out

Automated monitoring identifies stale flags. Run a weekly report showing all temporary flags older than 90 days or flags that have been at 100% rollout for more than 30 days. Route these reports to the engineering team with a clear ownership model—the engineer who created the flag owns its removal, or it falls to the team that owns the relevant code area.
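The weekly report logic can be sketched as a pure function over rows loaded from the feature_flags table. The function name and the output shape are assumptions. Because the schema shown here has no created-by column, the sketch reports updated_by as a stand-in for the owner.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// flags: rows shaped like the feature_flags table (snake_case columns, Date values)
function findStaleFlags(flags, now = new Date()) {
  return flags
    .filter((flag) => flag.flag_type === 'temporary')
    .map((flag) => {
      const ageDays = Math.floor((now - flag.created_at) / DAY_MS);
      const daysAtFull = flag.fully_rolled_out_date
        ? Math.floor((now - flag.fully_rolled_out_date) / DAY_MS)
        : null;

      const reasons = [];
      if (ageDays > 90) reasons.push(`created ${ageDays} days ago`);
      if (daysAtFull !== null && daysAtFull > 30) reasons.push(`at 100% for ${daysAtFull} days`);

      // updated_by stands in for the owner; the schema has no created_by column
      return { name: flag.name, lastUpdatedBy: flag.updated_by, reasons };
    })
    .filter((entry) => entry.reasons.length > 0);
}
```

Permanent and ops flags are skipped entirely, so the report only lists flags that someone is expected to remove.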

Removing flags safely requires caution. Even flags at 100% rollout for months might have dependents—other flags, analytics, or external integrations that check flag state. The safe removal process has three steps: mark the flag as deprecated in the database, remove all conditional logic (making the flag-on behavior permanent), then remove the flag definition after a grace period.

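// Flag removal process
// Step 1: Mark deprecated (keep flag, always return true)
// Assumes deprecated (BOOLEAN) and deprecated_at (TIMESTAMP) columns have been added to feature_flags
await db.query(
  'UPDATE feature_flags SET deprecated = true, deprecated_at = NOW() WHERE name = $1',
  ['old_feature']
);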

// Step 2: Remove conditional code (1-2 week grace period)
// Before:
if (await flags.isEnabled('old_feature', context)) {
  // New behavior
} else {
  // Old behavior
}

// After:
// Just the new behavior, no flag check

// Step 3: Remove flag from database (after confirming no errors)
await db.query('DELETE FROM feature_flags WHERE name = $1', ['old_feature']);

The deprecation period provides safety. If removing the flag causes unexpected issues, you can quickly restore it. Once the flag has been deprecated for two weeks with no incidents, the removal is safe. This process prevents the "delete and hope" approach that occasionally breaks production in subtle ways.

Code analysis tools help identify flag usage. Grep your codebase for flag names or use static analysis to find all isEnabled calls. This inventory shows which flags are still actively used versus defined in the database but never checked, which makes the latter candidates for immediate removal. ESLint has no knowledge of flag age on its own, but a custom rule fed with a list of stale flag names (for example, flags unchanged for 180+ days, exported from the database) can warn wherever those flags are still checked.
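A rough version of that inventory can be built with a regular expression over source text. The sketch below is an assumption-laden starting point rather than a complete analyzer: it only finds flag names passed as string literals to isEnabled, and the function names are hypothetical.

```javascript
// Finds flag names passed as string literals to isEnabled(...) calls.
// A regex sketch: it misses dynamic names such as isEnabled(someVariable).
function extractFlagNames(source) {
  const pattern = /\bisEnabled\(\s*(['"`])([^'"`]+)\1/g;
  const names = new Set();
  for (const match of source.matchAll(pattern)) {
    names.add(match[2]);
  }
  return names;
}

// Flags defined in the database but never referenced in the given sources
function findUnreferencedFlags(definedFlags, sources) {
  const referenced = new Set();
  for (const source of sources) {
    for (const name of extractFlagNames(source)) {
      referenced.add(name);
    }
  }
  return definedFlags.filter((name) => !referenced.has(name));
}
```

Feeding it the contents of every source file plus the list of names from the feature_flags table yields the flags that exist only in the database. Anything it reports still deserves a manual check for dynamically constructed flag names.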

Advanced Patterns: Dependencies and Prerequisites

Some features depend on other features being enabled. You might have a "new_checkout" flag that only makes sense when "new_payment_processor" is enabled. Or a "beta_ai_features" flag that should only take effect when "enhanced_analytics" is enabled. Flag dependencies prevent invalid states and reduce management complexity.

The implementation adds a prerequisite check to flag evaluation. Before returning true for a flag, verify all prerequisite flags are also enabled. Store prerequisites in a junction table linking flags to their dependencies.

-- Flag dependencies schema
CREATE TABLE feature_flag_dependencies (
  id SERIAL PRIMARY KEY,
  flag_id INTEGER NOT NULL REFERENCES feature_flags(id) ON DELETE CASCADE,
  prerequisite_flag_id INTEGER NOT NULL REFERENCES feature_flags(id) ON DELETE CASCADE,
  created_at TIMESTAMP NOT NULL DEFAULT NOW(),
  UNIQUE(flag_id, prerequisite_flag_id)
);

-- Prevent circular dependencies
CREATE OR REPLACE FUNCTION check_flag_dependency_cycle()
RETURNS TRIGGER AS $$
BEGIN
  -- A flag cannot be its own prerequisite
  IF NEW.flag_id = NEW.prerequisite_flag_id THEN
    RAISE EXCEPTION 'Circular dependency detected';
  END IF;

  -- Walk the prerequisites of the new prerequisite; reaching NEW.flag_id means a cycle.
  -- UNION (not UNION ALL) drops repeated rows, so the recursion always terminates.
  IF EXISTS (
    WITH RECURSIVE dependency_chain AS (
      SELECT prerequisite_flag_id AS flag_id FROM feature_flag_dependencies WHERE flag_id = NEW.prerequisite_flag_id
      UNION
      SELECT ffd.prerequisite_flag_id FROM feature_flag_dependencies ffd
      JOIN dependency_chain dc ON ffd.flag_id = dc.flag_id
    )
    SELECT 1 FROM dependency_chain WHERE flag_id = NEW.flag_id
  ) THEN
    RAISE EXCEPTION 'Circular dependency detected';
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER prevent_circular_dependencies
BEFORE INSERT OR UPDATE ON feature_flag_dependencies
FOR EACH ROW EXECUTE FUNCTION check_flag_dependency_cycle();

The evaluation logic recursively checks prerequisites. If flag A depends on B, and B depends on C, then enabling A requires both B and C to be enabled for the same user/context. This cascading logic prevents inconsistent states where dependent features appear without their foundations.

// Dependency-aware flag evaluation
async isEnabledWithDependencies(flagName, context, defaultValue = false, checkedFlags = new Set()) {
  // Prevent infinite recursion in case of circular dependencies (defense in depth)
  if (checkedFlags.has(flagName)) {
    return false;
  }
  checkedFlags.add(flagName);

  const flag = await this.getFlag(flagName);
  if (!flag) {
    return defaultValue;
  }

  // Check prerequisites
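  const prerequisites = await this.getPrerequisites(flag.id);
  for (const prereq of prerequisites) {
    const prereqEnabled = await this.isEnabledWithDependencies(
      prereq.name,
      context,
      false,
      // Copy the set per branch so a prerequisite shared by two branches
      // (a diamond) is not mistaken for a cycle
      new Set(checkedFlags)
    );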

    if (!prereqEnabled) {
      return false; // Prerequisite not met
    }
  }

  // All prerequisites met, check this flag
  return this.evaluate(flag, context);
}

Dependencies simplify rollout management. Instead of carefully coordinating the order of enabling multiple related flags, you express the dependency relationship once in the database. The system automatically enforces correct ordering—enabling a child flag when its parent is disabled simply has no effect until the parent is enabled.

A/B Testing and Experiments Framework

Feature flags and A/B testing are closely related but distinct. Feature flags control feature availability—is this feature on or off? A/B tests measure impact—does variant A or variant B produce better outcomes? Combining these concepts creates an experiments framework built on feature flags.

The data model extends flags with experiment metadata: hypothesis, success metrics, statistical significance requirements, and variant definitions. Instead of just enabled/disabled, flags can specify multiple variants with different rollout percentages.

-- Experiment tracking schema
CREATE TABLE experiments (
  id SERIAL PRIMARY KEY,
  feature_flag_id INTEGER NOT NULL REFERENCES feature_flags(id) ON DELETE CASCADE,
  name VARCHAR(255) NOT NULL,
  hypothesis TEXT NOT NULL,
  success_metric VARCHAR(255) NOT NULL, -- 'conversion_rate', 'revenue_per_user', etc
  minimum_sample_size INTEGER NOT NULL,
  confidence_level DECIMAL(3,2) NOT NULL DEFAULT 0.95,
  start_date TIMESTAMP NOT NULL DEFAULT NOW(),
  end_date TIMESTAMP,
  winning_variant VARCHAR(50),
  status VARCHAR(50) NOT NULL DEFAULT 'running' -- 'running', 'completed', 'stopped'
);

CREATE TABLE experiment_variants (
  id SERIAL PRIMARY KEY,
  experiment_id INTEGER NOT NULL REFERENCES experiments(id) ON DELETE CASCADE,
  name VARCHAR(50) NOT NULL, -- 'control', 'variant_a', 'variant_b'
  rollout_percentage INTEGER NOT NULL,
  description TEXT
);

CREATE TABLE experiment_events (
  id SERIAL PRIMARY KEY,
  experiment_id INTEGER NOT NULL,
  variant_name VARCHAR(50) NOT NULL,
  user_id VARCHAR(255) NOT NULL,
  event_type VARCHAR(100) NOT NULL, -- 'impression', 'conversion', 'revenue'
  event_value DECIMAL(10,2),
  created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_experiment_events_lookup ON experiment_events(experiment_id, variant_name, event_type);

The flag evaluation logic assigns users to variants based on consistent hashing, similar to percentage rollouts. Once a user is assigned to a variant, they stay in that variant for the experiment duration. The system logs impressions (user saw the variant) and conversions (user completed the success metric) to the experiment_events table.

// Experiment-aware flag evaluation
async getVariant(experimentName, userId) {
  const experiment = await this.getExperiment(experimentName);

  if (!experiment || experiment.status !== 'running') {
    return 'control';
  }

  // Consistent variant assignment
  const hash = crypto.createHash('md5')
    .update(`${userId}:${experimentName}`)
    .digest('hex');
  const bucket = parseInt(hash.substring(0, 8), 16) % 100;

  let cumulative = 0;
  for (const variant of experiment.variants) {
    cumulative += variant.rollout_percentage;
    if (bucket < cumulative) {
      // Log impression
      await this.logExperimentEvent(experiment.id, variant.name, userId, 'impression');
      return variant.name;
    }
  }

  return 'control';
}

async logConversion(experimentName, userId, value = null) {
  const experiment = await this.getExperiment(experimentName);
  if (!experiment) return;

  // getUserVariant is assumed to return the variant from the user's logged impression, or null
  const variant = await this.getUserVariant(experiment.id, userId);
  if (!variant) return; // user never saw the experiment, so there is nothing to attribute

  await this.db.query(
    'INSERT INTO experiment_events (experiment_id, variant_name, user_id, event_type, event_value) VALUES ($1, $2, $3, $4, $5)',
    [experiment.id, variant, userId, 'conversion', value]
  );
}

This framework separates flag management from experiment analysis. The feature flag system handles rollout mechanics—which users see which variant. The experiment system tracks metrics and calculates statistical significance. This separation lets you run experiments without coupling feature code to analytics logic.
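On the analysis side, one common way to compare conversion rates between two variants is a two-proportion z-test. The sketch below names that technique explicitly. It is an assumption about how the experiment system might compute significance, not something the schema prescribes. The inputs are impression and conversion counts per variant, as could be aggregated from experiment_events, and the counts in the example are illustrative only.

```javascript
// Two-proportion z-test on conversion counts, using a pooled standard error.
function twoProportionZ(controlConversions, controlUsers, variantConversions, variantUsers) {
  const pControl = controlConversions / controlUsers;
  const pVariant = variantConversions / variantUsers;
  const pooled = (controlConversions + variantConversions) / (controlUsers + variantUsers);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / controlUsers + 1 / variantUsers));
  return (pVariant - pControl) / standardError;
}

// 1.96 is the two-sided critical value for a 0.95 confidence level
function isSignificant(z, criticalValue = 1.96) {
  return Math.abs(z) >= criticalValue;
}

// Illustrative counts, not real data: 10% vs 15% conversion on 1,000 users each
const z = twoProportionZ(100, 1000, 150, 1000);
console.log(z.toFixed(2), isSignificant(z));
```

The test is only meaningful once each variant has reached the experiment's minimum_sample_size, and checking it repeatedly while the experiment runs inflates the false-positive rate.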

Pattern	Use Case	Complexity	When to Use
Boolean Flags	Simple on/off toggles	Low	Internal features, ops toggles
Percentage Rollout	Gradual user exposure	Medium	New features, risk mitigation
Segment Targeting	Specific users/orgs	Medium	Beta testing, tier-based features
Dependencies	Related feature sets	High	Complex feature interactions
Experiments	A/B testing, metrics	High	Product optimization, data-driven decisions

Client-Side vs Server-Side Evaluation

Feature flag evaluation can happen on the server (API checks flags before sending responses) or on the client (JavaScript checks flags in the browser). Each approach has distinct tradeoffs affecting latency, security, and user experience.

Server-side evaluation keeps flag logic secure. The client never sees flag definitions, targeting rules, or rollout percentages—just the result of evaluation for the current user. This prevents users from inspecting code to discover unreleased features or manipulating flag state. Server-side also ensures consistent flag state across the entire request—no race conditions where a flag changes mid-request.

Client-side evaluation reduces latency for UI changes. If a flag controls whether a button appears, client-side evaluation avoids a server round trip—the page renders instantly with correct flag state. Client-side is essential for mobile apps that need offline flag access. The downside is exposing flag state in JavaScript, which users can inspect and potentially manipulate.

The hybrid approach combines benefits: evaluate flags on the server during initial page load, embed results in the HTML as a JavaScript object, then use client-side logic to check flags during user interaction. This gives immediate UI responsiveness while keeping flag definitions server-side.

// Server-side: embed flags in HTML
app.get('/', authenticate, async (req, res) => {
  const flags = {
    new_dashboard: await featureFlags.isEnabled('new_dashboard', req.user),
    ai_assistant: await featureFlags.isEnabled('ai_assistant', req.user),
    dark_mode: await featureFlags.isEnabled('dark_mode', req.user)
  };

  res.render('index', {
    flags: JSON.stringify(flags)
  });
});

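// In HTML template (EJS syntax assumed; <%- %> outputs the JSON string unescaped)
<script>
  window.featureFlags = <%- flags %>;
</script>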

// Client-side JavaScript
function isFeatureEnabled(flagName) {
  return window.featureFlags[flagName] || false;
}

if (isFeatureEnabled('new_dashboard')) {
  // Render new dashboard UI
} else {
  // Render old dashboard UI
}

This pattern works for most SaaS applications. Server-side evaluation during page load ensures flags are evaluated with the correct user context and without exposing rules. Client-side checks read the pre-evaluated results from memory, so they add no network round trip. The main limitation is that flag changes made during a session do not appear until the user refreshes the page.

For real-time flag updates without refresh, use WebSocket or Server-Sent Events to push flag changes to connected clients. This enables instant feature rollout to active users, useful for emergency kill switches or time-sensitive promotions. The implementation complexity is significant—only add it if your use case requires real-time updates.
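
As a rough sketch of the Server-Sent Events variant, the class below keeps a set of open responses and writes a flag-update event to each one. The class name FlagBroadcaster, the event name, and the payload shape are all assumptions made for illustration. It sends the same value to every client, so it fits global flags such as kill switches; targeted flags would need to be re-evaluated per user before sending.

```javascript
// Hypothetical broadcaster that pushes evaluated flag changes to
// connected Server-Sent Events clients.
class FlagBroadcaster {
  constructor() {
    this.clients = new Set();
  }

  // `res` is a Node.js http.ServerResponse (or an object with the same methods)
  addClient(res) {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      Connection: 'keep-alive'
    });
    this.clients.add(res);
    // Stop tracking the client once the connection closes
    res.on('close', () => this.clients.delete(res));
  }

  // Send only the flag name and its evaluated boolean, never targeting rules
  broadcast(flagName, enabled) {
    const payload = JSON.stringify({ flag: flagName, enabled: Boolean(enabled) });
    for (const res of this.clients) {
      res.write(`event: flag-update\ndata: ${payload}\n\n`);
    }
  }
}
```

In an Express app, a route such as app.get('/flags/stream', (req, res) => broadcaster.addClient(res)) would register clients (the path is hypothetical), and the browser would listen with an EventSource and update window.featureFlags when a flag-update event arrives.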

Security Warning: Never expose targeting rules or rollout percentages to clients. Only send evaluated boolean results. Exposing rules lets users discover which segments see features early, potentially revealing business strategy or creating unfair access to beta features.
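
One way to enforce this rule is to pass every flag payload through a small serializer before it reaches the page. The sketch below is a hypothetical helper, not part of the flag service described earlier: it assumes each evaluated flag is either a boolean or an object with an enabled field, keeps only the boolean, and escapes the output for use inside an inline script tag.

```javascript
// Hypothetical helper: reduce evaluated flags to plain booleans and
// serialize them safely for embedding in an inline <script> tag.
function serializeFlagsForClient(evaluatedFlags) {
  const safe = {};
  for (const [name, value] of Object.entries(evaluatedFlags)) {
    // Keep only the evaluation result; drop rules, percentages, and metadata
    const isEnabledObject =
      value !== null && typeof value === 'object' && value.enabled === true;
    safe[name] = value === true || isEnabledObject;
  }
  // Escape "<" so no string in the payload can close the surrounding script tag
  return JSON.stringify(safe).replace(/</g, '\\u003c');
}
```

The escaped string is still valid JSON and valid JavaScript, so the template can emit it unchanged.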

Frequently Asked Questions

How many feature flags is too many?

Most teams should maintain under 50 active flags at any time. Beyond 100 flags, technical debt and cognitive load become significant problems. Enforce flag hygiene: every temporary flag should have an expected removal date. Remove flags within 30-90 days of reaching 100% rollout. If you consistently exceed 50 flags, you're likely not removing old flags aggressively enough.
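
A removal date is only useful if something checks it. The following sketch lists temporary flags that are past their date; the field names type and expected_removal_date are assumptions about how flag records might be stored, and the function could run in CI or a scheduled job.

```javascript
// Hypothetical hygiene check: return the names of temporary flags that are
// past their expected removal date. Field names are assumed, not prescribed.
function findOverdueFlags(flags, now = new Date()) {
  return flags
    .filter(flag => flag.type === 'temporary' && flag.expected_removal_date)
    .filter(flag => new Date(flag.expected_removal_date) < now)
    .map(flag => flag.name);
}
```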

Should feature flags be in config files or databases?

Database storage is superior for SaaS applications. Config file changes require deployments, eliminating the core benefit of feature flags—changing behavior without deploying code. Databases enable instant toggling via admin UI, gradual rollouts with percentage adjustments, and user-specific targeting. Config files work only for permanent operational toggles that rarely change.

How do I test code behind feature flags?

Test both flag states. For temporary rollout flags, write tests that pass with the flag on and tests that verify old behavior with the flag off. For permanent flags, use test utilities that override flag values. Never mock the entire flag service—that hides bugs in flag evaluation logic. Instead, inject flag overrides at the context level so flag evaluation code still executes.
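
A minimal sketch of context-level overrides follows. The FeatureFlags class here is a simplified, synchronous stand-in for the real flag service, and the flagOverrides property is a hypothetical convention; the point is that an override short-circuits a single flag while every other flag still goes through normal evaluation.

```javascript
// Hypothetical flag service that honors per-context overrides while still
// running the normal evaluation path for every other flag.
class FeatureFlags {
  constructor(flagStore) {
    this.flagStore = flagStore; // Map of flag name -> { enabled: boolean }
  }

  isEnabled(flagName, context = {}) {
    const overrides = context.flagOverrides || {};
    if (Object.prototype.hasOwnProperty.call(overrides, flagName)) {
      return overrides[flagName];
    }
    const flag = this.flagStore.get(flagName);
    return Boolean(flag && flag.enabled);
  }
}

// Code under test receives the flag service and the context as arguments
function checkoutFlow(flags, context) {
  return flags.isEnabled('new_checkout', context) ? 'new' : 'legacy';
}
```

A test can then call checkoutFlow once with { flagOverrides: { new_checkout: true } } and once with an empty context to cover both paths without replacing the flag service.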

What's the difference between feature flags and environment variables?

Environment variables configure how an application connects to external resources (database URLs, API keys). Feature flags control which code paths execute. Environment variables change per deployment environment (staging vs production). Feature flags vary per user or user segment within the same environment. Don't use environment variables for feature flags—you lose dynamic control and user targeting.

How do I handle feature flags in multiple environments?

Maintain separate flag state per environment. A flag enabled in staging shouldn't automatically enable in production. Use environment-specific databases or Redis instances. The flag definitions (names, descriptions) can be synchronized across environments via migration scripts, but enabled state and rollout percentages must be independent. This lets you test rollouts in staging before production.

Should I use a commercial feature flag service or build my own?

Build your own for the first implementation—the patterns in this article provide a production-ready foundation. Commercial services like LaunchDarkly or Split become valuable at scale (100+ flags, complex targeting, extensive A/B testing). They provide sophisticated admin UIs, real-time updates, advanced analytics, and SDKs for multiple platforms. Start simple, migrate to commercial when flag complexity justifies the cost.

How do I prevent feature flags from making code unmaintainable?

Follow three rules: (1) Keep flag checks at boundaries, not scattered throughout business logic. Check flags in controllers/routes, then pass explicit parameters to services. (2) Remove temporary flags within 90 days. (3) Use abstraction—create feature service classes instead of littering if (flag) statements everywhere. Well-structured flag usage adds minimal complexity; flag sprawl destroys codebases.
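
Rule (1) might look like the sketch below, where the route handler is the only place that touches the flag service and the service function receives an explicit option. The names calculateInvoice, invoiceHandler, and new_tax_engine, along with the tax rates, are invented for illustration.

```javascript
// Service layer: no flag lookups, only an explicit option
function calculateInvoice(items, { useNewTaxEngine = false } = {}) {
  const subtotal = items.reduce((sum, item) => sum + item.price * item.quantity, 0);
  const taxRate = useNewTaxEngine ? 0.08 : 0.10; // placeholder rates
  return { subtotal, tax: Math.round(subtotal * taxRate * 100) / 100 };
}

// Boundary (route handler): the only place the flag is checked
async function invoiceHandler(req, res, featureFlags) {
  const useNewTaxEngine = await featureFlags.isEnabled('new_tax_engine', req.user);
  res.json(calculateInvoice(req.body.items, { useNewTaxEngine }));
}
```

Because calculateInvoice never sees the flag, removing the flag later means deleting one line in the handler and one parameter default, not hunting through business logic.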

Can I use feature flags for infrastructure changes like database switches?

Yes—ops toggles let you switch infrastructure without code deploys. Add a flag for read_from_new_db, use it to route queries to a new database, verify correctness, then make it permanent. Unlike temporary feature flags, ops toggles often remain indefinitely as circuit breakers. If the new database has issues, toggle back to the old one instantly. These are "permanent" flags that provide operational safety.
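
A read path guarded by such a toggle could be as small as the following sketch. The readUser function and the getUser method on both database clients are hypothetical; only the read_from_new_db flag name comes from the answer above.

```javascript
// Hypothetical read router: an ops toggle decides which database serves reads.
async function readUser(userId, { featureFlags, oldDb, newDb }) {
  const useNewDb = await featureFlags.isEnabled('read_from_new_db');
  const db = useNewDb ? newDb : oldDb;
  return db.getUser(userId);
}
```

Since the flag is checked on every call, toggling it off routes the next read back to the old database without a deploy.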

Conclusion

Feature flags transform deployment from risky events to routine operations by decoupling code deployment from feature release. Start with simple database-backed boolean flags, add percentage rollouts for gradual exposure, then introduce segment targeting for controlled access. For high-traffic applications, Redis-based evaluation provides the sub-millisecond latency needed at scale.

The technical implementation is straightforward—most patterns in this article can be implemented in a few hundred lines of code. The operational discipline is harder: maintaining flag hygiene, removing temporary flags promptly, documenting flag purposes, and auditing changes. Teams that treat flags as permanent additions accumulate technical debt that eventually requires painful cleanup. Teams that aggressively retire flags maintain lean, comprehensible codebases.

Feature flags are infrastructure, not optional tooling. They enable progressive delivery, instant rollback during incidents, and A/B testing for product optimization. Build flag systems early in your SaaS journey—the patterns are simple at small scale and become indispensable as you grow. The confidence to deploy multiple times per day without fear of breaking production is worth the modest implementation effort.
