How to solve Word Ladder Problem?

Sometime back, a colleague of mine asked me about the word ladder problem. She was looking for a change. So, I believe she stumbled across this while preparing for data structures and algorithms.

graph-header

Problem Statement

Typically, the puzzle shared is a flavor of below:

Find the smallest number of transformations needed to change an initial word to a target word of same length. In every transformation, change only one character and make sure word exists in the given dictionary.

Explanation

Assuming all these 4 letter words are there in the dictionary provided, it takes minimum 4 transitions to convert word from SAIL to RUIN, i.e.
SAIL -> MAIL -> MAIN -> RAIN -> RUIN

Intent here is to know about Graph algorithm. So, what are graphs in context of algorithms and how do we apply them to solve such problems?

Graph Data Structure

Graphs are flow structure that represents entities connection with each other. Visually, they are represented with help of a Node (Vertex) & an Edge (Connector).

graph-general

A tree is an undirected graph in which any two nodes are connected by only one path. In it, each node (except the root node) comprises exactly one parent node.

Most common way to represent a graph is using an Adjacency matrix. In it, Element A[i][j] is 1 if there is an edge from node i to node j or else it is 0. For example, adjacency matrix of above unidirected graph is:

  | 1 2 3 4
------------
1 | 0 1 0 1
2 | 1 0 1 0
3 | 0 1 0 1
4 | 1 0 1 0

Another common way is via Adjacency list. (List format of the data instead of a matrix.)

Related Algorithms

Graphs are applied in search algorithms. Traversing the nodes and edges in a defined order helps in optimizing search. There are two specific approaches to traverse graph:

Breadth First Search (BFS)

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to find all the nodes in G for which there is a path from s. With this approach, it finds all the nodes that are at a distance k from s before it finds any nodes that are at a distance k+1.

For easy visualization, think of it as, in a tree, finding all the child nodes for a parent node as first step. Post it, find all the grandchildren and hence forth.

Depth First Search (DFS)

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to find all the nodes in G traversed from s through it’s edges. With this approach, we go deep in graph connecting as many nodes in the graph as possible and branch where necessary.

For easy visualization, think of it as, in a tree, finding all the family nodes for a parent node. With this, for a given node, we connect its children, grand children, grand grand children and so on before moving to next node of same level.

Thus, with DFS approach, we can have multiple deduced trees.

Knight’s tour is a classic example that leverages Depth First Search algorithm.

Shortest Path First OR Dijkstra’s Algorithm (SPF)

Given a graph G and a starting node s, search the shortest path to reach node d. It uses a concept of weights. It’s an iterative algorithm similar to results of BFS.

Many real world example fits in here, e.g. what would be shortest path from home to office.

With BFS (a simple queue), we visit one node at a time whereas in SPF (a priority queue), we visit a node at any level with lowest cost. In a sense, BFS follows Dijkstra's algorithm, a step at a time with all edge weights equal to 1. The process for exploring the graph is structurally the same in both cases. at times, BFS is preferred with equal weight graphs. This is because, operations on a priority queue are O(log n) compared to operations on a regular queue which is O(1).

Code

I will be using a breadth first graph algorithm here based on the problem need:

import collections
from collections import deque 

class Solution(object):
    # method that will help find the path
    def ladderLength(self, beginWord, 
                        endWord, wordList):
        """
        :type beginWord: str
        :type endWord: str
        :type wordList: Set[str]
        :returntype: int
        """

        # Queue for BFS
        queue = deque()

        # start by adding begin word
        queue.append((beginWord, [beginWord]))

        while queue:
            # let's keep a watch at active queue
            print('Current queue:',queue)

            # get the current node and 
            # path how it came
            node, path = queue.popleft()

            # let's keep track of path length 
            # traversed so far
            print('Current transformation count:',
                                        len(path))

            # find possible next set of 
            # child nodes, 1 diff
            for next in self.next_nodes(node, 
                            wordList) - set(path):
                # traversing through all child nodes
                # if any of the child matches, 
                # we are good               
                if next == endWord:
                    print('found endword at path:',
                                            path)
                    return len(path)
                else:
                    # keep record of next 
                    # possible paths
                    queue.append((next, 
                                path + [next]))
        return 0

    def next_nodes(self, word, word_list):
        # start with empty collection
        possiblenodes = set()

        # all the words are of fixed length
        wl_word_length = len(word)

        # loop through all the words in 
        # the word list
        for wl_word in word_list:
            mismatch_count = 0

            # find all the words that are 
            # only a letter different from 
            # current word those are the 
            # possible next child nodes
            for i in range(wl_word_length):
                if wl_word[i] != word[i]:
                    mismatch_count += 1
            if mismatch_count == 1:
                # only one alphabet different-yes
                possiblenodes.add(wl_word)
        
        # lets see the set of next possible nodes 
        print('possible next nodes:',possiblenodes)
        return possiblenodes

# Setup
beginWord = "SAIL"
endWord = "RUIN"
wordList = ["SAIL","RAIN","REST","BAIL","MAIL",
                                    "MAIN","RUIN"]

# Call
print('Transformations needed: ',
    Solution().ladderLength(beginWord, 
                            endWord, wordList))

# Transformation expected == 4
# One possible shortes path with 4 transformation:
# SAIL -> MAIL -> MAIN -> RAIN -> RUIN

Used deque (doubly ended queue) of Python

deque helps with quicker append and pop operations from both the ends. It has O(1) time complexity for append and pop operations. In comparison, list provides it in O(n) time complexity.

A quick look at the code workflow to validate if all nodes at a particular distance was traversed first and then moved to next level:

Current queue: deque([('SAIL', ['SAIL'])])

Current transformation count: 1
possible next nodes: {'BAIL', 'MAIL'}
Current queue: deque([('BAIL', ['SAIL', 'BAIL']), 
                      ('MAIL', ['SAIL', 'MAIL'])])

Current transformation count: 2
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIL', ['SAIL', 'MAIL']), 
                      ('MAIL', ['SAIL', 'BAIL', 
                       'MAIL'])])

Current transformation count: 2
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('MAIL', ['SAIL', 'BAIL', 
                                'MAIL']), 
                      ('BAIL', ['SAIL', 'MAIL', 
                                'BAIL']), 
                      ('MAIN', ['SAIL', 'MAIL', 
                                'MAIN'])])

Current transformation count: 3
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('BAIL', ['SAIL', 'MAIL', 
                                'BAIL']), 
                      ('MAIN', ['SAIL', 'MAIL', 
                                'MAIN']), 
                      ('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'MAIL', 
                                'MAIN']), 
                      ('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN']), 
                      ('RAIN', ['SAIL', 'MAIL', 
                                'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('RAIN', ['SAIL', 'MAIL', 
                                'MAIN', 'RAIN']), 
                      ('RAIN', ['SAIL', 'BAIL', 
                        'MAIL', 'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'MAIN', 'RUIN'}
found endword at path: ['SAIL', 'MAIL', 'MAIN', 
                                        'RAIN']

Transformations needed:  4
Overall path: ['SAIL', 'MAIL', 'MAIN', 
                               'RAIN', 'RUIN']

Complexity

For above code that I used to find the shortest path for transformation:

Time

In next_nodes, for each word in the word list, we iterated over its length to find all the intermediate words corresponding to it. Thus we did M×N iterations, where M is the length of each word and N is the total number of words in the input word list. Further, to form an intermediate word, it takes O(M) time. This adds up to O(M2×N).

In ladderLength, BFS can go to each of the N words and for each word, we need to examine M possible intermediate words. This adds up to O(M2×N).

Overall, it adds up to O2(M2×N) which would be called O(M2×N).

Space

In next_nodes, each word in the word list would have M intermediate combinations. For every word we need a space of M2 to save all the transformations corresponding to it. Thus, it would need a total space of O(M2×N).

In ladderLength, BFS queue would need a space of O(M×N)

Overall, it adds up to O(M2×N) + O(M×N) which would be called O(M2×N)

Wrap Up

It could be little tricky and thus would need some practice to visualize the graph as well to write code for it.

Great, so now we know how to solve problems like word ladder problem. It also touch based other related common graph algorithms that we can refer to.

I had a read of the following reference and it has much more details if needed.


Keep problem solving!

samples GitHub Profile Readme
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples
Sandeep Mewara Github Repositories

Just reverse alphabets in a string?

Last week it was a simple problem that we discussed to tangle our brain.

string-reverse

Reverse only alphabets in a provided string in a most efficient way. (Special characters, numbers, etc should continue to stay at their original place)

First thought …

Okay, so first thought was to make use of an extra space and then we can get our desired result with two traversals of the string. Yeah, but let’s optimize.

Updated thought …

We can make the swaps in place in a single traversal. A good stopping criteria seems to be when index while moving from front crosses the index moving backwards from end.

Let’s write code:

static void Main(string[] args)
{
    Console.WriteLine($"Please enter a string:");

    // Ignore casing
    var inputString = Console.ReadLine().ToLower();
    char[] inputArray = inputString.ToCharArray();
    ReverseAlphabetsOnly(inputArray);        
    Console.WriteLine(
            $"Reversed: {new String(inputArray)}");
}

static void ReverseAlphabetsOnly(char[] inputArray)
{
    int frontIndex = 0;
    int endIndex = inputArray.Length-1;
    char temp;

    while(frontIndex < endIndex)
    {
        if(!IsAlphabet(inputArray[frontIndex]))
            frontIndex++;
        else if(!IsAlphabet(inputArray[endIndex]))
            endIndex--;    
        else
        {
            temp = inputArray[frontIndex];
            inputArray[frontIndex] 
                 = inputArray[endIndex];
            inputArray[endIndex] = temp;

            frontIndex++;
            endIndex--;
        }
    }
}

static bool IsAlphabet(char x) 
{ 
    return ( (x >= 'a' && x <= 'z') 
            || (x >= 'A' && x <= 'Z') ); 
}

// Input:  Le@rn By In$ig#t...
// Output: tg@in Iy Bn$re#L...

Closure …

Approach looks good, as it would be at maximum a single traversal with no extra space used. Thus we are able to solve it with an overall Order of Time complexity O(n) & Space complexity O(1).


Happy solving …

samples GitHub Profile Readme
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples
Sandeep Mewara Github Repositories

Linear time partition – a three way split

Linear-time partition is a divide & conquer based selection algorithm. With it, data is split into three groups using a pivot.
.

linear-time-partioning

An integral part of Quick Sort algorithm which uses this partitioning logic recursively. All the elements smaller than the pivot are put on one side and all the larger ones on the other side of the pivot.

Similar to discussion of Dynamic Programming, this algorithm plays on solving sub-problems to solve complex problem.

Algorithm

Post selecting the pivot, Linear-time partition routine separates the data into three groups with values:

  • less than the pivot
  • equal to the pivot
  • greater than the pivot

Generally, this algorithm is done in place. This results in partially sorting the data. There are handful of problems that makes use of this fact, like:

  • Sort an array that contains only 0’s, 1’s & 2’s
  • Dutch national flag problem
  • Print all negative integers followed by positive for an array full of them
  • Print all 0’s first and then 1’s or vice-versa for an array with only 0’s & 1’s
  • Move all the 0’s to the end maintaining relative order of other elements for an array of integers

If done out of place, (i.e. not changing the original data), it would cost O(n) additional space

Example

Let’s take an example of: sort a array that contains only 0’s, 1’s & 2’s

First thought for such problem is to perform a count of 0’s, 1’s and 2’s. Once we have the counts, reset the array with them. Though it has time complexity O(n), it takes two traversal of the array or uses an extra array.

Below is an attempt to solve using Linear-time partition algorithm to avoid that extra traversal/space.

def threeWayPartition(A):
    start = mid = 0
    end = len(A)-1
    
    # define a Pivot
    pivot = 1
    
    while (mid <= end):
        # mid element is less than pivot
        # current element is 0
        
        # so lets move it to start
        # current start is good. 
        # move start to next element
        # move mid to next element to move forward
        if (A[mid] < pivot) :
            swap(A, start, mid)
            start = start + 1
            mid = mid + 1
            
        # mid element is more than pivot
        # current element is 2
        
        # so lets move it to end
        # current end is good. 
        # move end to previous element
        elif (A[mid] > pivot) :
            swap(A, mid, end)
            end = end - 1
        
        # mid element is same as pivot
        # current element is 1
        
        # just move forward: 
        # mid to next element
        else :
            mid = mid + 1
            
# Swap two elements A[i] and A[j] in the list
def swap(A, i, j):
    A[i], A[j] = A[j], A[i]


# Define an array
inputArray = [0, 1, 2, 2, 1, 0, 0, 2]

# Call the Linear-time partition routine
threeWayPartition(inputArray)

# print the final result
print(inputArray)

# Outputs
# [0, 0, 0, 1, 1, 2, 2, 2]

With a defined pivot, we segregated the data on the either side which resulted in desired output. Dutch nation flag problem or printing all negative first and then positive, or printing all 0s first follows the same code.

For moving all 0’s to end maintaining other elements order, we do a tweak in swap index to maintain order:

def threeWayPartition(A):
    current = 0
    nonzero = 0
    end = len(A)-1
    
    # define a Pivot
    pivot = 0
    
    while (current <= end):
        if (A[current] != pivot) :
            swap(A, current, nonzero)
            nonzero = nonzero + 1
        current = current + 1
            
# Swap two elements A[i] and A[j] in the list
def swap(A, i, j):
    A[i], A[j] = A[j], A[i]


# Define an array
inputArray = [7,0,5,1,2,0,2,0,6]

# Call the Linear-time partition routine
threeWayPartition(inputArray)

# print the final result
print(inputArray)

# Output
# [7, 5, 1, 2, 2, 6, 0, 0, 0]

Complexity

With above algorithm approach, we solved our problem with Time complexity O(n) & Space complexity O(1) (with single traversal of the array)


It was fun solving!

samples GitHub Profile Readme
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples
Sandeep Mewara Github Repositories

Dynamic Programming – What, How & When

There are many programming techniques, which when applied appropriately to specific situation, leads to efficient results (either time wise or space wise or both). Dynamic Programming (DP) is one such optimization concept.

dp-header

Not to be confused with Dynamic programming language or Dynamic type in C#

What is?

Dynamic programming is solving a complicated problem by breaking it down into simpler sub-problems and make use of past solved sub-problems.

Dynamic Programming is mainly an optimization over plain recursion.

Let’s use Fibonacci series as an example to understand this in detail.

Fibonacci Series is a sequence, such that each number is the sum of the two preceding ones, starting from 0 and 1.

0, 1, 1, 2, 3, 5, 8, 13, 21…

When to?

It is best applied to problems that exhibits Overlapping Subproblems & Optimal Substructure.

Following is how our Fibonacci problem fib(4) breakdown would look like:

Problem tree for fib(4)

We can see that fib(0), fib(1) & fib(2) are occurring multiple times in the graph. These are overlapping subproblems.

Binary search tree would not fall into same category. There would no common subproblem by binary search definition itself.

We can see for optimal solution of fib(4), we would need optimal solution of fib(3) & fib(2) and so on. This tells it has optimal substructure.

Longest path problem does not exhibit as such. Optimal solution for longest distance between A & C might not be optimal for B & C that involves A.

How to?

It’s not an unknown or a new technique. Our brain do it almost all the time. Let’s use Fibonacci series as an example to understand this and then we will apply DP to solve it programmatically.

Logically …

Let’s say, I ask, what is the 4th number in the Fibonacci series? You would work from 0, 1, 1, etc and come to fib(4) = 2. Now, if I ask, what is the 5th number? Would you restart from 0, 1, 1..? Or you would just make use of fib(4) you just solved and get fib(5)?

You see – it’s obvious, you will make use of previous solved result and right away come up with fib(5) instead of starting from 0. In programming language, this is tabulation using calculated fib(4) & fib(3).

Tabulation is a technique of starting from smallest sub-problem and storing the results of entries until target value is reached.

Let’s say, now I ask to calculate fib(8). Our natural way of working on it would be to first find fib(7) & fib(6). In order to know fib(7), we will figure out fib(6) & fib(5) and hence forth. With fib(7), we already traversed through fib(6). This is another approach where we calculated the next subproblem and keep on storing them in case of need later. In programming language, this is memoization.

Memoization is a technique of storing the results of expensive function calls and returning the cached result when the same inputs occur again.

DP breaks the problem into sub-problems and uses memoization or tabulation to optimize. We will understand about them with examples below.

Programmatically in action …

In order to compare the optimization cost, we will use recursion as another way to solve the problem.

static void Main(string[] args)
{
    Stopwatch stopwatch = new Stopwatch();

    Console.WriteLine($"Fibonacci Recursive:");
    for (int i = 30; i <= 50; i+=5)
    { 
        stopwatch.Reset();
        stopwatch.Start();
        var _ = FibonacciRecursive(i);
        stopwatch.Stop();
        Console.WriteLine($"{i}th took time: ({stopwatch.Elapsed})");
    }

    Console.WriteLine($"Dynamic Programming:");
    for (int i = 30; i <= 50; i+=5)
    { 
        stopwatch.Reset();
        stopwatch.Start();
        var _ = FibonacciDPTabulation(i);
        stopwatch.Stop();
        Console.WriteLine($"{i}th took time: ({stopwatch.Elapsed})");
    }
}

static long FibonacciRecursive(int number)
{
    if (number <= 1)
        return number;
    return FibonacciRecursive(number-2) + FibonacciRecursive(number - 1);
}

static long FibonacciDPTabulation(int number)
{
    long[] arr = new long[100];
    arr[0] = 1; arr[1] = 1; 
    for( int x=2; x <= number; x++)
    {
        arr[x] = arr[x-1] + arr[x-2];
    }
    return arr[number];
}

With above code, we got the following output:

 RecursiveDP (Tabulation)DP (Memoization)
30th took time(00:00:00.0090536)(00:00:00.0002756)(00:00:00.0183122)
35th took time(00:00:00.0908688)(00:00:00.0000037)(00:00:00.0000009)
40th took time(00:00:00.9856354)(00:00:00.0000006)(00:00:00.0000011)
45th took time(00:00:10.7981258)(00:00:00.0000006)(00:00:00.0000008)
50th took time(00:02:24.8694889)(00:00:00.0000006)(00:00:00.0000008)

Difference is astonishing! DP is too fast in comparison. Just look at the 50th time difference.

With simple iterative approach, Fibonacci Series can be calculated in the same ballpark time as of DP. Current example is just to keep things simple and to understand the DP concept. Please look at the end of post for common examples that would clarify where DP would be of real help over recursion. (& iterative approach would be difficult to code)

Above approach of DP is considered Bottom-Up approach as we started with bottom (lowest term) and then moved to the highest one. This is tabulation. We keep on filling on the cache here till we reach the target.

Alternatively, there is a Top-down approach. We start solving from highest term and store solutions of sub problems along the way. This is memoization. A code for this approach would look like:

private static Dictionary<int, long> memo 
                    = new Dictionary<int, long>();

static long FibonacciDPMemoization(int number)
{
    if (number == 0 || number == 1) {
        return number;
    }
    
    // see if we've already calculated this
    if (memo.ContainsKey(number)) {                
        return memo.GetValueOrDefault(number);
    }

    long result = FibonacciDPMemoization(number - 1) 
                + FibonacciDPMemoization(number - 2);
    memo.Add(number, result);

    return result;
}

memoization is sometimes simpler to understand and write code because of it’s natural way of solving.

Generally, tabulation outperformes memoization by a constant factor. This is because of no overhead for recursion and can be stored as a pre-allocated array. We first calculate and store the value of a sub problem repeated most number of times here.

Few Common Examples?

  • Tower of Hanoi
  • Checkerboard
  • Egg dropping puzzle
  • Matrix chain multiplication

Details of these above problem can be read here.

Beware?

Apart from avoiding problems that do not have Overlapping Subproblems & Optimal Substructure, one need to understand we are doing a trade here – space for time!

We can always discuss that though DP is using extra space to store the sub-problems data, but in turn helps save in memory calculation which could be expensive. Thus, in case of constrained space, we need to evaluate usage of DP.


Keep learning!

samples GitHub Profile Readme
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples

Any anagram of a string, a palindrome?

Often in our group we discuss about puzzles or problems related to data structure and algorithms. One such day, we discussed about:

how will we find if any anagram of a string is palindrome or not?

Our first thought went in the direction to start from first character and then traverse till end to see if there could be matching pair. Keep track of it, move to next character till middle and then stitch all to figure if so. It solves, but the query was – could it be solved better though?

Of course! With putting some stress on the brain, it turned out that in a single read, we will have info enough, to tell, if any anagram formed from the input string can be a palindrome or not.

Thought converted to Code

static void CheckIfStringAnagramHasPalindrome()
{
    Console.WriteLine($"Please enter a string:");

    // Ignore casing
    var inputString = Console.ReadLine().ToLower();

    // Just need to keep track of unique characters
    var characterSet = new HashSet<char>();

    // Single traversal of input string 
    for(int i=0; i<inputString.Length; i++)
    {
        char currentCharacter = inputString[i];
        if(characterSet.Contains(currentCharacter))
            characterSet.Remove(currentCharacter);
        else
            characterSet.Add(currentCharacter);
    }

    // Character counts in set will help 
    // identify if palindrome possible 
    var leftChars = characterSet.Count;
    if(leftChars == 0 || leftChars == 1)
        Console.WriteLine($"YES - possible.");
    else
        Console.WriteLine($"NO - Not possible.");
}


Approach looked good, as with a single traversal and usage of HashSet, i.e. with overall Order of Time complexity O(n) & Space complexity O(1), we were able to solve it.

It was fun solving!

GitHub Profile Readme Samples
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight