KeiruaProd

I help my clients acquire new users and make more money with their web businesses. I have ten years of experience with the technical aspects of growing SaaS projects using data science for decision-making. If that’s something you need help with, we should get in touch!

Preparing Adventofcode in Python 3/3, algorithms and data structures

This is the last part of the “Getting ready for Adventofcode in Python” series.

  1. Parsing the puzzles efficiently
  2. Cool tools from Python’s standard library
  3. Useful algorithms and data structures for competitive programming

We’ll look at which algorithms and data structures we need to get into competitive programming. AoC is not easy, but it’s way easier than a lot of other challenges like TopCoder so it’s a good starting point.

This article itself has 3 parts:

Things you should already know

You will almost certainly use the basic data structures:

The odds are you already use most of those, but maybe you have to lookup a couple things.

You should know, for each of these data structures:

The last part is critical ; you don’t want to spend too much time looking up basic things about the trivial part of the implementation when 10mn of research ahead of time could solve it for the duration of the contest.

As an example, here is quick recap about python’s dict:

d = {'user_id': 123, 'email': "bidou@yopmail.com"}
d.items()  # [('user_id', 123), ('email', 'bidou@yopmail.com')]
d.keys()  # ['user_id', 'email']
d.values()  # [123, 'bidou@yopmail.com']
d.get("birth_year", 1970) # returns the value for key birth_year if any, or 1970. 1970 here
d.clear()  # d = {}

# defaultdict have a default value when a key is missing 
from collections import defaultdict

As you can see it’s not complicated, you probably already do that often for the frequent data structures. As part of a preparation, better do it beforehand for the less frequent data structures, like set.

As for arrays and string, it can be useful to know how search/insert/erase an item. Here are strings:

# find returns the index of the string, or -1 if not found
>>> "plop".find("n")
-1
>>> "plop".find("o")
2
# appending to a string is done with the + operator
>>> s2 = s+"pouet"
>>> s2
'ploppouet'
# You can replace/delete with replace. It does not modify the string in place
s = "plop"
>>> s.replace("p", "g")
'glog'
>>> s
'plop'

Things that come up every year

Every year there are problems about:

Let’s dig a bit.

Bruteforcing

Some problems lend themselves to bruteforce. It may not be super efficient: 2020 day 15 involved the Van Ecke sequence that my bruteforce solution solved in 20s, but… it gets the job done and a more ad hoc solution would involve more brain time spent on the matter.

In order to bruteforce efficiently:

Recursion

Getting familiar with recursion is non-trivial.

As a starting point, try to write a few popular algorithms with recursion (eg the fibonacci sequence), then try to work out:

Recursion has a lot of usage with number-theoretical problems and graphs. No need to read a lot of theory, just get a feel for it.

Graphs and Trees, and graph and tree traversal

There often are some kind of trees and graphs and knowing how to run some of the classic algorithms on them is a must.

Breadth-first search is a tree traversal algorithm. It was hard for me a few years ago, but having implemented it many times since my student years I finally got the logic. Here is a sample implementation heavily inspired from wikipedia:

from collections import deque

edges = {
    "a": ["b", "c"], # from edge a, it is possible to go to b or c
    "b": ["d", "e"],
    "c": ["f", "g"],
    "d": [],
    "e": ["h"],
    "f": [],
    "g": []
}

# Sample code from
# https://en.wikipedia.org/wiki/Breadth-first_search#Pseudocode
# Is it possible to go from root to goal ?
def bfs(G, root, goal):
    Q = deque()
    SEEN = [root]
    Q.appendleft(root)
    while not len(Q) == 0:
        v = Q.popleft()
        if v == goal:
            return True
        if v in G:
            for w in G[v]: # for all neighbor vertices w of v
                if w not in SEEN:
                    SEEN.append(w)
                    Q.appendleft(w)
    return False

print(bfs(edges, "a", "h"))
print(bfs(edges, "a", "c"))
print(bfs(edges, "b", "c"))

DFS is similar, the difference is that we navigate differently inside the tree.

# iterative DFS:
# https://en.wikipedia.org/wiki/Depth-first_search#Pseudocode
def dfs(G, root, goal):
    S = [root]
    SEEN = []
    while len(S) > 0:
        v = S.pop()
        if v == goal:
            return True
        if v in G and v not in SEEN:
            SEEN.append(v)
            for w in G[v]: # for all neighbor vertices w of v
                S.append(w)
    return False

print(dfs(edges, "a", "h"))
print(dfs(edges, "a", "c"))
print(dfs(edges, "b", "c"))

Backtracking

Backtracking is a popular algorithm for bruteforcing the search in a tree, while discarding huge parts of the tree when some candidates will not yield a valid solution. There are a lot of variations for pruning the search tree. Give it a try at some point to get a feel for it:

https://en.wikipedia.org/wiki/Backtracking#Pseudocode

Shortest Path in a graph - Dijkstra

Sometimes tree traversal is not enough, you need to search for a shortest path, and Dijkstra’s algorithm comes to mind! I love this one, it’s short and elegant.

Parsing an running a small interpreted language

There are at least one small programming language to parse and run, so learn what it takes to do that: what is a virtual machine and its memory ? What is an instruction, how could you represent it, how do you store its parameters?

You may also want to have a look at the Shunting-yard algorithm

Dynamic programming

Dynamic programming is really not trivial to me.

Some problems are way too huge to be bruteforced, and many paths will recompute data that have already been computed.

A popular example is the recursive implementation of the fibonacci sequence, that involves many calls to the same values. With memoization, you can get rid of entire call trees: https://medium.com/geekculture/how-to-solve-fibonacci-sequence-using-dynamic-programming-b7cd784ee10d#f98d (turns out there are even better solutions, but that’s not the topic )

If you want to go deeper with more complex examples, here is Jonathan Paulson who:

Things you may need

OK, here we are in uncharted territory. You may need this, or you may not. Study at your own risk! The adventure starts here, you’re on your own.

See a typo ? You can suggest a modification on Github.