Dec 9, 2024

Advent of Code 2024 - Day 9: Disk Fragmenter

Attention!

Check Advent of Code for the full challenge details. They created everything we need to know about the challenge - give them your support.
My solutions are not ‘the’ solutions - I target one of the possible solutions. They might not be the most efficient, but they are a way to solve the problem. I do not plan to spend too much time the challenges.
Every challenge = different language - just for fun and learning.
I do not include the input data, but you can find it on the Advent of Code website or in the repository of the code.

New day, new challenge.

I was tasked with helping an amphipod compact the files on a disk to create contiguous free space, and then calculate a checksum for the filesystem. The task involves moving files on a disk using two different methods and then calculating the checksum based on the final configuration.

Task Overview

The disk map is represented by a string of digits. The digits alternate between indicating the length of a file and the length of free space on the disk. The goal is to move the files toward the leftmost free spaces, and for each move, calculate a checksum based on the final arrangement.

I implemented the solution in Python and approached the task in two parts: Part One involves moving files one block at a time, while Part Two involves moving whole files in one go.

Part One: One Block at a Time

In Part One, I needed to compact the disk by moving file blocks one by one, starting from the end of the disk to the leftmost free space. I then had to compute the checksum by multiplying each block’s position by its file ID.

Step-by-Step Explanation of the Code

Input Processing and Conversion: The disk is represented as a string of digits where the even indices represent the file sizes, and the odd indices represent free space. First, I convert this input string into a list of file IDs, where each file ID corresponds to a specific file, and -1 represents free space.
```
# main.py
get_id = lambda i: i // 2 if i % 2 == 0 else -1
disk = [get_id(i) for i, size in enumerate(disk) for _ in range(size)]
```
Here, the get_id lambda function helps generate a list of file IDs based on whether the index corresponds to a file or free space. If the index is even, it represents a file and gets assigned an ID. If it’s odd, it represents free space and is marked as -1.

File Movement Logic: I then use two pointers, left_ptr and right_ptr, to traverse the disk. I move the file blocks from the right to the left, filling free spaces as I go.

# main.py
left_ptr = 0
right_ptr = len(disk) - 1
while left_ptr < right_ptr:
    if disk[right_ptr] == -1:
        right_ptr -= 1
        continue
    if disk[left_ptr] != -1:
        left_ptr += 1
        continue
    disk[left_ptr] = disk[right_ptr]
    disk[right_ptr] = -1
    left_ptr += 1
    right_ptr -= 1

Checksum Calculation: Once all files are compacted, I calculate the checksum by multiplying the position of each file by its ID, and summing the results.
```
# main.py
return sum(position * file_id for position, file_id in enumerate(disk) if file_id != -1)
```

Part Two: Moving Whole Files at Once

Part Two introduces a new twist: instead of moving individual blocks, I now need to move whole files. The task is to attempt to place each file in the leftmost available space that can accommodate it. This is done starting from the file with the highest ID, working downwards.

File and Free Space Parsing: I first identify the files and free spaces in the disk. For each file, I check if there’s enough free space available to move it to the left. If there’s enough space, I move the file; otherwise, it remains in its current position.

# main.py
files = []
free_spaces = []
curr_idx = 0
for i, size in enumerate(disk):
    if i % 2 == 0:
        files.append((i // 2, size, curr_idx))
    elif size > 0:
        free_spaces.append((size, curr_idx))
    curr_idx += size

Moving Files: I then attempt to move each file in order of decreasing ID. For each file, I check the available free spaces and move the file to the leftmost free space that is large enough.

# main.py
for file_id, size, position in reversed(files):
    for i, (gap_size, free_space) in enumerate(free_spaces):
        if gap_size >= size:
            moved_files.append((file_id, size, free_space))
            free_spaces[i] = (gap_size - size, free_space + size)
            break

Checksum Calculation: Similar to Part One, I calculate the checksum by summing the positions of the moved files, but now I need to consider the size of each file.
```
# main.py
calculate_progression =
    lambda file_id, size, position: file_id * (position * size + size * (size - 1) // 2)
return sum(calculate_progression(file_id, size, pos) for file_id, size, pos in moved_files)
```
The calculate_progression might look intimidating, let’s break it down:
- position * size: This part multiplies the position of the file by its size. This could be intended to reflect a direct influence of the position (where the file is moved to) and the file’s size on the checksum. If the position is higher or lower, it affects the checksum more or less based on the file’s size.
- size * (size - 1) // 2: This part is a formula for the sum of the first (size - 1) integers, which is commonly used in combinatorics. For example:
  - If size = 3, this part calculates 3 * (3 - 1) // 2 = 3.
  - If size = 4, it calculates 4 * (4 - 1) // 2 = 6.
This formula factors in the cumulative effect of the file’s size on the checksum, taking into account the “progression” or sum of the positions the file might have gone through. The whole expression is calculating a weighted progression value for each file, where the file ID is the factor that uniquely distinguishes each file, and the combination of its size and position, plus the sum of the integers related to its size, adjusts the contribution of each file to the checksum.

Final Solution Code

The final solution code can be found on my GitHub repository.

If you like the content, don’t forget to give it a star ⭐.

Part One:
- Time Complexity: O(n), where n is the size of the disk. This is because we are moving files by adjusting pointers and performing a single pass through the disk.
- Space Complexity: O(n), where n is the size of the disk, as we store the disk in a list.
Part Two:
- Time Complexity: O(n * m), where n is the number of files and m is the number of free spaces. This is because, for each file, I search through the free spaces.
- Space Complexity: O(n), where n is the number of files, as we store file details and movements.

Conclusion

This problem was an interesting exercise in simulating the movement of files and calculating a checksum.

I learned that while Part One was relatively straightforward with a two-pointer approach, Part Two required more careful handling of file sizes and free space, the problem is quite hard.

The challenge was enjoyable, and I’m happy with the solution.

As of language, it wasn’t new to me of course, so it was just for ‘getting easier’ (I’ve had a tough day).

Tommorow I will go for ‘the father of jvm’ (yesterday it was Kotlin) - Java.