
Solving Leetcode Interviews in Seconds with AI: Delete Duplicate Folders in System


Introduction

In this blog post, we will explore how to solve LeetCode problem 1948, "Delete Duplicate Folders in System", using AI. LeetCode is a popular platform for preparing for coding interviews, and with the help of AI tools like Chatmagic, we can generate solutions quickly and efficiently, helping you pass interviews and land the job offer without having to study for months.

Problem Statement

Due to a bug, there are many duplicate folders in a file system. You are given a 2D array paths, where paths[i] is an array representing an absolute path to the ith folder in the file system. For example, ["one", "two", "three"] represents the path "/one/two/three".

Two folders (not necessarily on the same level) are identical if they contain the same non-empty set of identical subfolders and underlying subfolder structure. The folders do not need to be at the root level to be identical. If two or more folders are identical, then mark the folders as well as all their subfolders.

For example, folders "/a" and "/b" in the file structure below are identical. They (as well as their subfolders) should all be marked:

/a
/a/x
/a/x/y
/a/z
/b
/b/x
/b/x/y
/b/z

However, if the file structure also included the path "/b/w", then the folders "/a" and "/b" would not be identical. Note that "/a/x" and "/b/x" would still be considered identical even with the added folder.

Once all the identical folders and their subfolders have been marked, the file system will delete all of them. The file system only runs the deletion once, so any folders that become identical after the initial deletion are not deleted.

Return the 2D array ans containing the paths of the remaining folders after deleting all the marked folders. The paths may be returned in any order.

Example 1:
Input: paths = [["a"],["c"],["d"],["a","b"],["c","b"],["d","a"]]
Output: [["d"],["d","a"]]
Explanation: Folders "/a" and "/c" (and their subfolders) are marked for deletion because they both contain an empty folder named "b".

Example 2:
Input: paths = [["a"],["c"],["a","b"],["c","b"],["a","b","x"],["a","b","x","y"],["w"],["w","y"]]
Output: [["c"],["c","b"],["a"],["a","b"]]
Explanation: Folders "/a/b/x" and "/w" (and their subfolders) are marked for deletion because they both contain an empty folder named "y". Note that folders "/a" and "/c" are identical after the deletion, but they are not deleted because they were not marked beforehand.

Example 3:
Input: paths = [["a","b"],["c","d"],["c"],["a"]]
Output: [["c"],["c","d"],["a"],["a","b"]]
Explanation: All folders are unique in the file system. Note that the returned array can be in a different order as the order does not matter.

Constraints:

  • 1 <= paths.length <= 2 * 10^4
  • 1 <= paths[i].length <= 500
  • 1 <= paths[i][j].length <= 10
  • 1 <= sum(paths[i][j].length) <= 2 * 10^5
  • path[i][j] consists of lowercase English letters.
  • No two paths lead to the same folder.
  • For any folder not at the root level, its parent folder will also be in the input.
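Before looking at the solution, it helps to see how a flat list of paths maps onto a nested-dictionary trie, which is the structure the solution below operates on. This is a minimal sketch; the helper name build_trie is ours, not part of the problem:

```python
def build_trie(paths):
    """Turn a list of path arrays into a nested dict keyed by folder name."""
    root = {}
    for path in paths:
        curr = root
        for folder in path:
            # Descend into the child node, creating it if it does not exist yet.
            curr = curr.setdefault(folder, {})
    return root

# The "/a", "/a/x", "/a/x/y", "/a/z" structure from the problem statement:
paths = [["a"], ["a", "x"], ["a", "x", "y"], ["a", "z"]]
print(build_trie(paths))
# {'a': {'x': {'y': {}}, 'z': {}}}
```

Each folder becomes a dictionary whose keys are its child folder names, so "identical folders" are simply subtrees of this structure that look the same.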

Explanation

Here's a breakdown of the solution, followed by the Python code:

  • Tree Traversal and Serialization: Build a tree-like structure from the input paths. Serialize each folder by recursively serializing its children in a sorted order. This allows us to identify identical folder structures based on their serialized strings.
  • Identify Duplicates: Use a hash map (dictionary) to store the serialized representations of each folder. If a serialization is encountered more than once, mark all corresponding folders as duplicates.
  • Filter and Return: Walk the tree again, skipping any folder that was marked as a duplicate together with its entire subtree, and collect the paths of the remaining folders.

  • Runtime Complexity: O(N * M * L * log(degree)), where N is the number of paths, M is the maximum path length, L is the average folder-name length, and degree is the average number of children of a node. Storage Complexity: O(N * M * L).
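To make the serialization step concrete, here is a small sketch of a canonical subtree encoding: children are visited in sorted order, so identical folder structures always produce identical strings no matter what order the input paths arrived in. The function name and bracket format are illustrative choices, not the only valid ones:

```python
def canonical(node):
    """Serialize a nested-dict folder subtree; sorting makes it order-independent."""
    return "".join(
        "(" + name + canonical(child) + ")" for name, child in sorted(node.items())
    )

# The identical folders "/a" and "/b" from the problem statement:
tree = {
    "a": {"x": {"y": {}}, "z": {}},
    "b": {"z": {}, "x": {"y": {}}},  # same contents, listed in a different order
}
print(canonical(tree["a"]))  # (x(y))(z)
print(canonical(tree["b"]))  # (x(y))(z)  -- identical, so /a and /b are duplicates
```

Because an empty folder serializes to the empty string, the full solution can cheaply exclude empty folders from duplicate detection, matching the problem's requirement that identity is based on a non-empty set of subfolders.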

Code

from collections import defaultdict

def deleteDuplicateFolder(paths):
    """
    Deletes duplicate folders from a file system represented by paths.

    Args:
        paths: A 2D array of strings, where paths[i] is an array representing an absolute path
               to the ith folder in the file system.

    Returns:
        A 2D array of strings containing the paths of the remaining folders after deleting all
        marked folders.
    """

    # Build a trie: each node is a dict mapping a folder name to its child node.
    tree = {}
    for path in paths:
        curr = tree
        for folder in path:
            curr = curr.setdefault(folder, {})

    # Serialize every subtree bottom-up with children in sorted order, so that
    # identical folder structures produce identical strings. Empty folders
    # serialize to "" and are never treated as duplicates, because identity
    # requires a non-empty set of subfolders.
    serial = {}                # id(node) -> canonical serialization of its subtree
    counts = defaultdict(int)  # serialization -> number of folders with it

    def serialize(node):
        if not node:
            serial[id(node)] = ""
            return ""
        s = "".join(
            "(" + name + serialize(child) + ")"
            for name, child in sorted(node.items())
        )
        serial[id(node)] = s
        counts[s] += 1
        return s

    serialize(tree)
    counts[serial[id(tree)]] -= 1  # the virtual root is not a real folder

    # Walk the tree again, pruning every subtree whose serialization occurred
    # more than once (the duplicate folder and all its subfolders are deleted).
    remaining = []

    def collect(node, path):
        for name, child in node.items():
            s = serial[id(child)]
            if s and counts[s] > 1:
                continue  # marked duplicate: skip this folder and its subtree
            path.append(name)
            remaining.append(list(path))
            collect(child, path)
            path.pop()

    collect(tree, [])
    return remaining
