Skip to main content

Contains Duplicate

🔗 NeetCode Problem:
https://neetcode.io/problems/duplicate-integer/question?list=neetcode150
🔗 LeetCode Problem:
https://leetcode.com/problems/contains-duplicate

Problem Statement

Given an integer array nums, return true if any value appears at least twice in the array, and return false if every element is distinct.

Approach 1: Brute Force (Nested Loop)

Idea

Compare every element with every other element to check if a duplicate exists.

Pseudocode

for i from 0 to n-1:
    for j from i+1 to n-1:
        if nums[i] == nums[j]:
            return true
return false

C++ Implementation

class Solution {
public:
    bool containsDuplicate(vector<int>& nums) {
        for (int i = 0; i < nums.size(); i++) {
            for (int j = i + 1; j < nums.size(); j++) {
                if (nums[i] == nums[j])
                    return true;
            }
        }
        return false;
    }
};

Complexity Analysis

  • Time Complexity: O(n²) → Nested loops over the array
  • Space Complexity: O(1) → No extra space needed

Why it’s not optimal?

For large arrays, comparing every pair becomes extremely slow. If the array has 10,000 elements, we’d do ~50 million comparisons in the worst case.

Optimized Approach: Using Hash Map (Unordered Map)

Idea

Use a hash map to store elements as we iterate.
If an element already exists in the map, it means a duplicate is found.

Key Insight

Hash map lookup is O(1) on average, so we only need a single pass through the array.

C++ Implementation

class Solution {
public:
    bool containsDuplicate(vector<int>& nums) {
        unordered_map<int, int> mp;

        for (int i = 0; i < nums.size(); i++) {
            if (mp.find(nums[i]) != mp.end())
                return true;

            mp[nums[i]] = i;
        }
        return false;
    }
};

Complexity Analysis

  • Time Complexity: O(n) → Single pass through the array with O(1) lookups
  • Space Complexity: O(n) → Hash map stores up to n elements

Why it’s better?

Single pass with constant-time lookups beats nested loops. Early termination when duplicate is found.

Alternative Optimized Approach: Using Hash Set

Idea

Similar to hash map, but we only need to store the elements themselves, not their indices.

C++ Implementation

class Solution {
public:
    bool containsDuplicate(vector<int>& nums) {
        unordered_set<int> seen;

        for (int num : nums) {
            if (seen.find(num) != seen.end())
                return true;

            seen.insert(num);
        }
        return false;
    }
};

Complexity Analysis

  • Time Complexity: O(n) → Single pass with O(1) lookups
  • Space Complexity: O(n) → Hash set stores up to n elements

Advantage over Hash Map

Hash set uses slightly less memory since we don’t store indices, only the elements themselves.

Optimized Approach: Using Sorting

Idea

Sort the array, then check if any adjacent elements are equal.

C++ Implementation

class Solution {
public:
    bool containsDuplicate(vector<int>& nums) {
        sort(nums.begin(), nums.end());

        for (int i = 0; i < nums.size() - 1; i++) {
            if (nums[i] == nums[i + 1])
                return true;
        }
        return false;
    }
};

Complexity Analysis

  • Time Complexity: O(n log n) → Sorting dominates
  • Space Complexity: O(1) → In-place sorting (depending on sort implementation)

When to use?

When space is critical and you can afford sorting overhead. Not as good as hash-based approaches for time.

Comparison of All Approaches

ApproachTime ComplexitySpace ComplexityProsCons
Brute ForceO(n²)O(1)No extra spaceVery slow for large arrays
Hash MapO(n)O(n)Fast, stores indices if neededUses extra memory
Hash SetO(n)O(n)Fast, cleaner codeUses extra memory
SortingO(n log n)O(1)Space-efficientSlower than hash-based

Why Hash Set is Most Optimal

  1. Single pass: O(n) time complexity
  2. Fast lookups: O(1) average case for hash operations
  3. Early termination: Returns true immediately when duplicate is found
  4. Memory efficient: Stores only values, not indices like hash map
  5. Simple code: Clean and easy to understand

Example Walkthroughs

Example 1: nums = [1, 2, 3, 1]

Using Hash Set:
seen = {}

i=0: num=1
  - 1 not in seen
  - seen = {1}

i=1: num=2
  - 2 not in seen
  - seen = {1, 2}

i=2: num=3
  - 3 not in seen
  - seen = {1, 2, 3}

i=3: num=1
  - 1 IS in seen ✓
  - Return true
Duplicate found! Time: O(4) = O(n)

Example 2: nums = [1, 2, 3, 4]

Using Hash Set:
seen = {}

i=0: num=1 → seen = {1}
i=1: num=2 → seen = {1, 2}
i=2: num=3 → seen = {1, 2, 3}
i=3: num=4 → seen = {1, 2, 3, 4}

Loop ends, no duplicate found
Return false
No duplicates. Time: O(4) = O(n)

Example 3: nums = [99, 99]

Using Hash Set:
seen = {}

i=0: num=99
  - 99 not in seen
  - seen = {99}

i=1: num=99
  - 99 IS in seen ✓
  - Return true (immediately!)
Duplicate found on second element! Very efficient.

Key Takeaways

  • Avoid brute force for this problem (O(n²) is too slow)
  • Hash Set is optimal for general use (O(n) time, O(n) space)
  • Sorting is an alternative if space is strictly limited
  • Early termination makes hash-based solutions even faster in practice
  • Hash Map only needed if you must track indices, otherwise use Hash Set