dimanche 12 janvier 2020

Sort non repeated characters and then repeated ones‽

Hello I am doing the following programming exercise: Return String As Sorted Blocks. The statement is:

Task

You will receive a string consisting of lowercase letters, uppercase letters and digits as input. Your task is to return this string as blocks separated by dashes ("-"). The elements of a block should be sorted with respect to the hierarchy listed below, and each block cannot contain multiple instances of the same character.

The hierarchy is:

lowercase letters (a - z), in alphabetic order
uppercase letters (A - Z), in alphabetic order
digits (0 - 9), in ascending order

Examples

"21AxBz" -> "xzAB12" - since input does not contain repeating characters, you only need 1 block
"abacad" -> "abcd-a-a" - character "a" repeats 3 times, thus 3 blocks are needed
"" -> "" - an empty input should result in an empty output

Good luck!

I have written the following code:

def blocks(s):
    print("s: "+s)
    lowers = []
    uppers = []
    digits = []
    for c in s:
        if c.islower():
            lowers.append(c)
        if c.isupper():
            uppers.append(c)
        if c.isdigit():
            digits.append(c)
    lowers.sort()
    uppers.sort()
    digits.sort()
    print("lowers: ")
    print(lowers)
    print("uppers: ")
    print(uppers)
    print("digits: ")
    print(digits)
    result = ""
    sorted = lowers+uppers+digits
    removedLetters = 0
    needsNextBlock = False
    nextBlock = "-"
    while len(sorted) > 0:
        for i, c in enumerate(sorted):
            print(i, c)
            print("result: ")
            print(result)
            if c not in result:
                result += c
                print("we want to delete: ")
                print(c)
                sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
                removedLetters += 1
                print("new sorted: ")
                print(sorted)
            else:
                if c not in nextBlock:
                    needsNextBlock = True
                    nextBlock += c
                    sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
                    removedLetters += 1

                    print("new sorted: ")
                    print(sorted)

        if needsNextBlock:
            result += nextBlock
        needsNextBlock = False
        nextBlock = "-"
    return result

And there is a bug, because of when we have the following test:

Test.assert_equals(blocks("abacad"), "abcd-a-a")

The trace is:

s: abacad
lowers: 
['a', 'a', 'a', 'b', 'c', 'd']
uppers: 
[]
digits: 
[]
0 a
result: 

we want to delete: 
a
new sorted: 
['a', 'a', 'b', 'c', 'd']
1 a
result: 
a
new sorted: 
['a', 'b', 'c', 'd']
2 a
result: 
a
3 b
result: 
a
we want to delete: 
b
new sorted: 
['a', 'c', 'd']
4 c
result: 
ab
we want to delete: 
c
new sorted: 
['a', 'd']
5 d
result: 
abc
we want to delete: 
d
new sorted: 
['a']
0 a
result: 
abcd-a
new sorted: 
['a']
0 a
result: 
abcd-a-a
new sorted: 
['a']
0 a
result: 
abcd-a-a-a
new sorted: 
['a']
0 a
result: 
abcd-a-a-a-a
new sorted: 
['a']
0 a
(infinite loop)

So as we see the difficulty is created when we execute:

sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
removedLetters += 1

Because we have previously passed over the repeated letter, in this case 'a', but we have not counted it, so the calculus for the new sorted substring keeps being the same.

I tried a naive approach:

def blocks(s):
    print("\n\n\ns: "+s)
    lowers = []
    uppers = []
    digits = []
    for c in s:
        if c.islower():
            lowers.append(c)
        if c.isupper():
            uppers.append(c)
        if c.isdigit():
            digits.append(c)
    lowers.sort()
    uppers.sort()
    digits.sort()
    print("lowers: ")
    print(lowers)
    print("uppers: ")
    print(uppers)
    print("digits: ")
    print(digits)
    result = ""
    sorted = lowers+uppers+digits
    removedLetters = 0
    needsNextBlock = False
    nextBlock = "-"
    while len(sorted) > 0:
        initialIterationLength = len(sorted)
        for i, c in enumerate(sorted):
            print(i, c)
            print("result: ")
            print(result)
            if c not in result:
                result += c
                print("we want to delete: ")
                print(c)
                sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
                removedLetters += 1
                print("new sorted: ")
                print(sorted)
            else:
                if c not in nextBlock:
                    needsNextBlock = True
                    nextBlock += c
                    sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
                    removedLetters += 1
                    if initialIterationLength == len(sorted):
                        sorted = []
                    print("new sorted: ")
                    print(sorted)

        if needsNextBlock:
            result += nextBlock
        needsNextBlock = False
        nextBlock = "-"
    return result

As you see, I added when we start the while loop the sentence: initialIterationLength = len(sorted) and inside the loop, in the if condition:

if initialIterationLength == len(sorted):
    sorted = []

It does work for the test being discussed, however for larger inputs it won't work.

For example when input is:

ZrXx2VpVJMgPs54SwwxSophZEWvwKUxzqNxaxlgY0T

Our result is:

aghlopqrsvwxzEJKMNPSTUVWXYZ0245-gpwxSVZ-wx

Expected is:

aghlopqrsvwxzEJKMNPSTUVWXYZ0245-gpwxSVZ-wx-x-x

I think there should be a better algorithm.

I have read:

How could we sort non repeated characters and then repeated ones‽

Aucun commentaire:

Enregistrer un commentaire