Google foobar, part ii: stashing bunnies
written on February 4, 2023
categories: challenges
This post is the second installment of the foobar challenge series. You can read the first one to learn more about foobar and see what sort of intro problems the challenge presents if you don't want any real spoilers.
WARNING: Obviously, MAJOR spoilers ahead, for the problems themselves and possible solutions or hints. Even if you're attempting one of these and feel completely stuck, give it another go before reading the rest. Keep reading only if you don't mind getting spoiled!
Problem statement
Okay, so this one is about stashing bunnies. Bunnies? Well, anything that's fuzzy and warm and you can assign an ID to.
Assume you have a bunch of bunnies stashed in a corner, represented by their ID numbers, revealing the order you stacked them in:
We want to figure out what the ID of the bunny is given a pair of coordinates (x, y)
,
where x
is the distance from the leftmost "wall" and y
the distance from the
"floor". For example, the bunny with ID 1 is at (1, 1)
, the one with ID 8 is at
(2, 3)
and so on.
@@@@@@@@@@
@@@@@@@@@@
SECOND WARNING: If you want to solve this yourself, STOP HERE! Anything past this point is a spoiler of some part of figuring out the solution, or of the solution itself. You've been warned!
@@@@@@@@@@
@@@@@@@@@@
First ideas
A naive solution to this could be to simply construct this thing as a 2D array, and
index into it to get the answer immediately. Of course that wouldn't be very
interesting, so the challenge prevents us from doing so by setting the max values of x
and y
to a considerably large number (something like 100,000 if I remember correctly).
While you might be able to get away with "constructing" the solution like this here, I
haven't tried it and I think it's more fun to find a solution that doesn't need that
anyway.
So... where do we start? Well, as always it's a good idea to start looking for patterns in the numbers, so let's start by writing down the first few coordinates, ordered by the ID they point to:
(1, 1) -> 1
(1, 2) -> 2
(2, 1) -> 3
(1, 3) -> 4
(2, 2) -> 5
(3, 1) -> 6
(1, 4) -> 7
(2, 3) -> 8
(3, 2) -> 9
(4, 1) -> 10
...
See a pattern? I'll try to describe what came to me first, and it's okay if it's kind of wishy-washy and not clearly formulated yet, we will get to a more "mathematical" version later.
What I first observed when writing those down is that you always seem to come back to
a (1, M)
pair, then some other pairs, then (M, 1)
, and then (1, M + 1)
. For
example, you have the pair (1, 4)
, then some other pairs, then (4, 1)
, and then
(1, 5)
immediately after.
The second thing I observed was that those "intermediate" pairs between (1, M)
and
(M, 1)
were kind of "stepping through" the values between 1
and M
. More
specifically, the value of the x
coordinate increments until it reaches M
, and the
value of the y
coordinate decrements until it reaches 1. So, finally, the sequence
is:
(1 , M )
(2 , M - 1)
(3 , M - 2)
...
(M - 2, 3 )
(M - 1, 2 )
(M , 1 )
(1 , M + 1)
...
Great, but.. what ID is associated to (1, M)
, how does this get us anywhere? Well, at
this point we could pretty much solve this by iterating on this sequence until we reach
the given coordinate pair.
I don't love this solution; sure, maybe there's some clever dynamic programming trick we
could use to optimize this, but basically if we're given (100000, 100000)
we're
screwed: we'd have to go through literally every other ID before we find that one.
There's also another problem: our "sequence algorithm" gets us IDs in the same order as
the one they're put in the stash in, so diagonally! This means that we will only get the
pair (M, M)
if it's in the diagonal of (1, N) ... (N, 1)
, but of course M <= N
.
If the max value of x
and y
was 4, we would never be able to get (M, M)
for any
M
greater than 2!
NOTE: if you don't get why that is, think that the diagonal containing
(2, 2)
ends at(3, 1)
. This means that the next diagonal starts at(1, 4)
, but it doesn't contain an(M, M)
pair (you can convince yourself by writing down the entire diagonal). The diagonal after that one,(1, 5)
, does contain(3, 3)
, but as you can clearly tell(1, 5)
is "out-of-bounds", because for this example we said that the max value ofx
andy
is 4.
Looking for more patterns
We can do much better than this. The next idea I had was to look at what IDs were
associated with coordinates of the form (1, M)
, for different values of M
:
(1, 1) -> 1
(1, 2) -> 2
(1, 3) -> 4
(1, 4) -> 7
(1, 5) -> 11
...
Let's forget about the pairs of any x
and y
and focus on the (1, M)
pairs for
a moment: reducing a problem by adding a constraint like this often makes it easier to
find a partial solution to the original problem, and then by removing the constraint we
can start to build a complete solution.
When I got to this point while solving this, I was trying to find a sort of "law" for
the sequence of IDs (1, 2, 4, 7, 11, ...), or for the relationship of the ID and M
,
the y
coordinate. Maybe there is one, but I didn't find it, and I came up with another
idea instead: what if we thought of every (1, M)
pair as the beginning of a group?
In that case, the ID of the beginning of a group would basically be the number of IDs
below it, plus one (since IDs start at 1). The first group is the (1, 1) ... (1, 1)
group, which contains a single element ((1, 1)
), so the ID of the first element of the
second group is <number of elements in group 1> + 1 = 1 + 1 = 2
.
Let's do another example to convince ourselves: group 2 has 2 elements ((1, 2)
,
(2, 1)
), so we can say that (1, 3)
will have an ID of
<number of elements in group 1> + <number of elements in group 2> + 1 = 1 + 2 + 1 = 4
,
which is indeed the case.
So, for a given "group leader" pair (1, M)
, we have two things to figure out in order
to be able to find its ID:
- Which group it is in
- How many elements are "below" it
The first one is easy: (1, M)
is the first element of the M
-th group! The second one
is more complicated. I mean, we can sort of answer the question by saying "the number of
elements below M
is the sum of elements of the groups 1
through M - 1
", but of
course that's not the complete answer. It does simplify things though: now we only need
to find how many elements are in a given group M
. Well... if you think of it, the
"groups" we've been talking about are simply the diagonals!
So, this question is actually trivial: group M
has M
elements! Now we can start
assembling our answer.
Every pair (1, M)
has an ID of <number of elements of previous groups> + 1
, which is
equal to (1 + 2 + ... + (M - 1)) + 1
.
That's great! We now know the ID of every "group leader" pair. How can we solve the
general problem of finding the ID of any (x, y)
pair? Well, all we have to do at
this point is figure out two things, again:
- Which group
(x, y)
belongs to - What is its offset from the "group leader"
The first one is easy to figure out if we remember how we obtained the sequence of each
group: increment x
until M
, decrement y
until 1
. Since we're incrementing x
and decrementing y
at the same time, that means that their sum is constant, so no
matter which (x, y)
we land on in the sequence, x + y
will give us the same value
(for group 3 for instance, 1 + 3 = 2 + 2 = 3 + 1
). But what's that sum? Well, since
the "group leader" of group M
is (1, M)
, the sum is M + 1
! That's the answer to
the first question: (x, y)
belongs to group x + y - 1
.
The second question is also easy to answer if again we look at the patterns: since we
are looking for an offset from (1, M)
, and the sequence pattern is formed by
incrementing x
, all we have to do is look at x
! For the first element of the group
x
will be 1
, for the second 2
and so on. So, to get the offset from the "group
leader" for any given (x, y)
pair, we simply calculate x - 1
.
Solution
Let's put everything together. For a given pair (x, y)
, belonging to group
x + y - 1
, the ID is:
<ID of "group leader" of (x + y - 1)> + <offset from "group leader" of (x + y - 1)>
Expanding on that, we have:
(<number of elements before (x + y - 1)> + 1) + (x - 1)
And by expanding further we get:
(<sum from 1 to (x + y - 2)> + 1) + (x - 1)
With a bit of simplifying, we finally get:
<sum from 1 to (x + y - 2)> + x
If we wanna be even cooler and pretend we actually know some basic math, we can simplify
it further since <sum from 1 to N>
is equal to N * (N + 1) / 2
. So the final formula
actually is:
ID(x, y) = (x + y - 2) * (x + y - 1) / 2 + x
Okay, that's good and all, but does it work? Let's test our formula on a couple of pairs:
ID(2, 3) = (2 + 3 - 2) * (2 + 3 - 1) / 2 + 2 = 8
Nice! And another one:
ID(4, 2) = (4 + 2 - 2) * (4 + 2 - 1) / 2 + 4 = 14
Woo! The formula seems to be correct!
This is great because, since it's a closed-form solution, the time complexity will be
O(1)
(or, in layman's terms, really frickin fast). Let's implement it in code:
solution.py
1 2 |
|
And that's it!