Cheraq.com

I innovate, then I am!

europython-logo1

Google Code Jam: 2011 Euro Python

| 0 comments

In this Problem Solving session, we will have a look at Google Code Jam – Euro Python 2011 contest. Contest consists of four questions which are from easy to medium difficulty and contestants had more than 3 hours to solve the problems which is more than enough. Lets start solving problems rather than talking.

Problem A: Centauri Prime

This problem has 7 points for small data set and 8 points for large data set, which can result to 15 points total into your account :) . Well the points shows that it is an easy problem. You can read the original problem statement here, But here we are going to keep it as simple as possible. Below you can find the problem in brief.

You are going to receive a list which contains name of kingdoms in a planet. For each kingdom name, you have to check it as per below conditions to find out the ruler of that kingdom:

  • If the name ends with letter “y”, then nobody rules that kingdom.
  • If the name ends with any of following letters (“a”,”e”,”i”,”o”,”u”), then kingdom is ruled by a queen.
  • In any other cases, kingdom is rules by a king.

I think the answer is also written in problem statement and you should just implement it. So what you are going to do? For small data set, nothing, but for large data set, you need to read the constraints properly. Then problem statement for large data set says each kingdom name can be at most 100 characters long, which means it can be an string with length “1″. This is important because the problem statement says the name of the kingdom starts with a capital letter, so you need to consider if the kingdom name is just one character, then you need to change the case to lowercase. Below you can find the code to return the ruler.

def findRuler(country):
    lastChar = country.lower()[-1]
    if lastChar == 'y':
        return 'nobody'
    elif lastChar in 'aieou':
        return 'a queen'
    else:
        return 'a king'

Here you can find the complete code and sample input and output files could be found here on GitHub.

Problem B: Music Collection

This is an interesting problem with 20 points (8 for small data set, and 12 for large data set). Here you can read the original problem statement, but if you don’t want to spend time to read the whole story, here you can find the problem in brief.

Lets say you have lots of music files and they got different names (no duplicates). Your music player has a search option which is case insensitive and given any keyword, if the search result contains only one music, it will play it. Your job is, for each entry in the music list, try to find shortest keyword which returns only that item.

The key to solve this problem is, using some sort of indexing table. What we are going to do is, if there are only one item in the list, we will return empty string (this is an edge case, but we have to consider it). If there are more than one item, for each item first we will extract all the substrings (except empty string, can you guess why?). So if the input string is “abc”, we have to generate {“a”, “ab”, “abc”, “b”, “bc”, “c”}. We also need an empty hash table (dictionary). For each substring, we will check the hash table, if we don’t have this item in the hash table, then we will add this item as key and we will use the music name index as value (For music 1, we will add substring “ab” as {“ab”->1}). But if the substring already exists in the hash table, means this substring can return more than one music entry (we don’t need to know how many it will return), so we simply turn the value for that substring in the hash table to some value (like for substring “ab”, we will change the value to {“ab” -> “1000″}). Below code snippet will do this.

subsets = {}
for n in xrange(noEntries):
    entry = inputFile.readline().strip().upper()
    entryLength = len(entry)
    for startIndex in xrange(entryLength):
        for subLength in xrange(entryLength-startIndex):
            subString = entry[startIndex:startIndex+subLength+1]
            if subString in subsets:
                if subsets[subString] != n:
                    subsets[subString] = 1000
            else:
                subsets[subString] = n

Once we prepared the subsets hash table, then we can simple select items that their value is less than 1000 (as per the limit definition, we will not have more than 100 entries, so 1000 is a safe value. You could also change it to any negative value, it doesn’t matter). Since the question is to find shortest string, then for each entry, we need to prepare list of strings which will uniquely identify them, and then sort them as per the problem statement.

result = {}
for s in subsets:
    entryIndex = subsets[s]
    if entryIndex < 1000:
        if entryIndex not in result:
            result[entryIndex] = []
        result[entryIndex].append(s)

Now that we have the result set, we can simply generate result. If you noticed, when we were indexing items (first step) we turned all of them to upper case, the reason is, as per problem statement, the upper case items should appear before lower case items, so we simply can ignore the lower case items. So what we need to do is, simply sort the items first based on the length, and then alphabetically. below code will do this and generates result.

for i in xrange(noEntries):
    if i in result:
        vals = result[i]
        vals = sorted(vals,key=(lambda x: (len(x),x)))
        outputFile.write('"%s"\n'% (vals[0]))
    else:
        outputFile.write(':(\n')

Here you can find the complete code and sample input and output files could be found here on GitHub.

Problem C: Irregular Expression

This is one of those magic games that usually you will see in Google Code Jam contests. This problem has 25 points (10 for small data set, 15 for large data set) and you can find the complete problem statement here. The problem is about 2011 World Witch and Warlock tournament. If we ignore the story line, then problem in brief is as following: You will get a string from input, if you could find below pattern in it, then you will say this string contains some SPELL (be careful, the spell might turn you into frog!). The pattern is as below:

  • A sentence which consists of 3 words
  • Each word consists of one or more syllables
  • Each syllable consists of any number of letters, including exactly one vowel (‘a’, ‘e’, ‘i’, ‘o’, ‘u’)
  • First and third words are same and consists of exactly two syllables
  • The string might start and end with gibberish characters and also there is no space to separate words

Well, what do you think? If you are thinking about regular expressions, then it is better to look at the problem form different point of view. This is an irregular expression problem, so you cannot use regular expression (the name is somehow magical as well and invites people to use regular expressions or think about it, but NO! Don’t do that!!). Lets say the input string is “kajabbamajabbajab”. This string contains a spell. We can find three words as “jabba”, “ma” and “jabba” in it which satisfies the given pattern. “jabba” contains two syllables: “jab” and “ba” or even you can say “ja” and “bba”. But we can also find another pattern in input string like “abba”, “maj” and “abba” which still it is true. So there are more than one answer (Number of answers doesn’t matter! You should just be able to find one). Google guys are smart enough to not suggest the second answer, otherwise you could simple find the solution, because finding a pattern using the second answer is quiet simple.

In order to solve it, first we will extract the position of each vowel in the string (doesn’t matter which vowel it is). Below code simply does this.

def checkSpell(entry):
    positions = []
    for i,l in enumerate(entry):
        if l in 'aoeiu':
            positions.append(i)

This code for the given example string will generate a list like [1,3,6,8,10,13,15]. Now we start from 0, at each step (i), we get a substring that starts at position[i] and ends at position[i+1]. This will be our search term. Then we will check if we can find this string after position[i+2] or not. If we could find it, then we are done. Otherwise, we will advance i and check the next item, till we reach to n-2. What we are doing is, we are using vowels at position i and i+1 to form a two syllables word, and then we are searching for the same two syllables word from position i+2 to make sure we have enough space for the second word. If we could find the first word again anywhere in the string, then we can make sure the string is matching the given pattern and contains a spell. Below you can find code snippet which does the same.

for i in xrange(len(positions)-2):
    startIndex = positions[i]
    endIndex = positions[i+1]+1
    startWord = entry[startIndex:endIndex]
    searchIndex = positions[i+2]+1
    try:
        spell = entry.index(startWord,searchIndex)
        return 'Spell!'
    except ValueError:
        continue

Here you can find the complete code and sample input and output files could be found here on GitHub.

Problem D: Twibet

This problem has 40 points (15 for small data set and 25 for large data set) and believe it or not, it is super easy question. Here you can find the problem statement, and as usual, below you can find it brief. Let’s say we have some people’s twitter account Id and we know who follows who. Accounts are stored in a list. At day i, ith person (Pi) will tweet something, an all his/her followers will re-tweet it (if they didn’t already). Your task it to find at day i, how many times that message will be tweeted (or re-tweeted). As you can see, this problem could be solved easily using an stack. By considering the limits, you can even solve the problem without any optimization. At day i, you will add the followers of Pi to the top of the stack, then for each item in the stack, you will add the followers of it to the stack (if it is not already there), once you have checked all the items, then you have the result. Below you can find code snippet which does the same.

N = int(inputFile.readline().strip())
fks = inputFile.readline().split()
monks = {}
for n in xrange(N):
    fk = int(fks[n]) - 1
    if fk not in monks:
        monks[fk] = []
    monks[fk].append(n)
outputFile.write('Case #%d:\n' % (t+1))
for d in xrange(N):
    stack = []
    visited = []
    visited.append(d)
    if d in monks:
        stack = monks[d][:]
        while len(stack)>0:
            m = stack.pop(0)
            if m not in visited:
                if m in monks:
                    stack += monks[m][:]
                visited.append(m)
    outputFile.write('%d\n' % (len(visited)))

Here you can find the complete code and sample input and output files could be found here on GitHub.

This concludes the EuroPython2011 contest. As you can see, most of the questions doesn’t need any specific algorithm and they could be solved directly (usually what you expect to see in Google Code Jam contest is lots of Dynamic Programming questions).

Leave a Reply

Required fields are marked *.

*