Microsoft WebView2 – A new friend for native apps!

Now, we can embed web content (HTML, CSS, and JavaScript) in our native applications with Microsoft Edge WebView2. Earlier, it was announced only for Win32 C/C++ apps but later, it was announced available for use in .NET 5, .NET Core, and .NET Framework Windows Forms and WPF applications.

It uses the modern Microsoft Edge (Chromium) platform to host web content within native Windows applications.

edge-webview2

In the future, the Evergreen WebView2 Runtime plans to ship with future releases of Windows. Deploy the Runtime with your production app until the Runtime becomes more universally available.

Microsoft recommendation

Power of Native

With growing presence of online world, desktop applications are pushed to use more of online like capabilities. Further, web solution also helps reuse most of the code across different platforms and easy change delivery. This is leading desktop applications more and more towards hybrid approach where best of both (native and web) can be leveraged.

hybrid
Credit: Microsoft

Microsoft WebView2 comes to rescue. It helps build powerful applications with controlled access to native capabilities.

WebView2 uses the same process model as the Microsoft Edge browser. A browser process is associated with only one user data folder. A request process that specifies more than one user data folder is associated with the same number of browser processes.

browser-process-model
Credit: Microsoft

More details about browser process model can be read here.

WebView2 apps create a user data folder to store data such as cookies, credentials, permissions, and so on. After creating the folder, your app is responsible for managing the lifetime of the user data folder, including clean up when the app is uninstalled

Microsoft – Managing user data folder

Microsoft has laid here some best practices for developing secure WebView2 application.

Distribution

When distributing your WebView2 app, ensure the backing web platform, the WebView2 Runtime, is present before the app starts.

By default, WebView2 is evergreen and receives automatic updates to stay on the latest and most secure platform.

  • Evergreen Bootstrapper – a tiny installer that downloads the Evergreen Runtime matching device architecture and installs it locally.
  • Evergreen Standalone Installer – a full-blown installer that can install the Evergreen Runtime in offline environment.
  • Fixed Version – to select and package a specific version of the WebView2 Runtime with your application.

The WebView2 Runtime is a redistributable runtime and serves as the backing web platform for WebView2 apps.

webview2-exception

Download the runtime from here. Supported platforms are mentioned here.

Sample Application

I built a sample WPF application (runs on .NET Framework and not Core) to try WebView2. This was to evaluate how comparatively older .NET applications would work out.

webview2-sample

I tried to display my blog in the WPF application using a WebView2 control. Sample application had capabilities to post message to host application and back as well as hook events as per need.

public MainWindow()
{
    InitializeComponent();

    // NavigationEvents
    webView.NavigationStarting += WebView_NavigationStarting; ;
    webView.SourceChanged += WebView_SourceChanged;
    webView.ContentLoading += WebView_ContentLoading;
    webView.NavigationCompleted += WebView_NavigationCompleted;

    // Embedded at CoreWebView2 level
    InitializeOnceCoreWebView2Intialized();
}

/// <summary>
/// initialization of CoreWebView2 is asynchronous.
/// </summary>
async private void InitializeOnceCoreWebView2Intialized()
{
    await webView.EnsureCoreWebView2Async(null);

    // Hook other events
    webView.CoreWebView2.FrameNavigationStarting += CoreWebView2_FrameNavigationStarting;
    webView.CoreWebView2.HistoryChanged += CoreWebView2_HistoryChanged;

    // For communication host to webview & vice versa
    webView.CoreWebView2.WebMessageReceived += CoreWebView2_WebMessageReceived;
    await webView.CoreWebView2.AddScriptToExecuteOnDocumentCreatedAsync("window.chrome.webview.postMessage(window.document.URL);");
    await webView.CoreWebView2.AddScriptToExecuteOnDocumentCreatedAsync("window.chrome.webview.addEventListener(\'message\', event => alert(\'Message from App to WebView2 on navigation!\'));");
}

/// <summary>
/// Web content in a WebView2 control may post a message to the host 
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void CoreWebView2_WebMessageReceived(object sender, CoreWebView2WebMessageReceivedEventArgs e)
{
    // Retrieve message from Webview2
    String uri = e.TryGetWebMessageAsString();
    addressBar.Text = uri;

    // Send message to Webview2
    webView.CoreWebView2.PostWebMessageAsString(uri);
    log.Content = $"Address bar updated ({uri}) based on WebView2 message!";
}

/// <summary>
/// Execute URL
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void ButtonGo_Click(object sender, RoutedEventArgs e)
{
    try
    {
        Uri uri = new Uri(addressBar.Text);

        if (webView != null && webView.CoreWebView2 != null)
        {
            webView.CoreWebView2.Navigate(uri.OriginalString);
        }
    }
    catch (UriFormatException)
    {
        MessageBox.Show("Please enter correct format of url!");
    }
}

/// <summary>
/// Allow only HTTPS calls
/// WebView2 starts to navigate and the navigation results in a network request. 
/// The host may disallow the request during the event.
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void WebView_NavigationStarting(object sender, CoreWebView2NavigationStartingEventArgs e)
{
    String uri = e.Uri;
    if (!uri.StartsWith("https://"))
    {
        e.Cancel = true;
        //MessageBox.Show("Only HTTPS allowed!");

        // Inject JavaScript code into WebView2 controls at runtime
        webView.CoreWebView2.ExecuteScriptAsync($"alert('{uri} is not safe, try an https link please.')");
    }
}

Various events run when specific asynchronous actions occur to the content displayed in a WebView2 instance.

navigation-graph
NavigationStartingWebView2 starts to navigate and the navigation results in a network request. The host may disallow the request during the event.
SourceChangedThe source of WebView2 changes to a new URL. The event may result from a navigation action that does not cause a network request such as a fragment navigation.
ContentLoadingWebView starts loading content for the new page.
HistoryChangedThe navigation causes the history of WebView2 to update.
NavigationCompletedWebView2 completes loading content on the new page.
ProcessFailedTo react to crashes and hangs in the browser and renderer processes
CloseTo safely shut down associated browser and renderer processes
Key events

Working with the sample application, I was able to display a webpage, intercept calls both ways and embed message/code to my need. It provides all the capabilities that seems to be needed for a stable web app display control.

Complete sample application can be downloaded from here: https://github.com/sandeep-mewara/WebView2WpfBrowserApp

Reference

https://developer.microsoft.com/en-us/microsoft-edge/webview2/
https://docs.microsoft.com/en-us/microsoft-edge/webview2/gettingstarted/wpf
https://docs.microsoft.com/en-us/microsoft-edge/webview2/concepts/distribution

Sandeep Mewara Github
News Update
Tech Explore
Data Explore
samples GitHub Profile Readme
Learn Machine Learning with Examples
Machine Learning workflow
What is Data Science
Word Ladder solution
What is Dynamic Programming
Learn Microsoft Tech via Videos LiveTV Streams

Find missing number from 1 to N?

Last week, there was a discussion in my team on the problem of finding missing number(s). We had different thoughts and approaches and thus I thought to share it across.

find-missing-number

Problem statement was something like:

– An array of size (n) has numbers from 1 to (n+1). Find the missing one number.

– An array of size (n) has numbers from 1 to (n+2). Find the missing two numbers.

First thought …

Keep track of numbers found while traversing. At the end, use it to find the missing number. So kind of brute force approach.

We can maintain a hash or a boolean array of n size and keep on updating the hash or the array index location based on number found while traversing. Use it now to find the missing number. It would cover both one as well as two missing numbers case.

This would have two traversals of n (one for filling in the structure and another to find the missing one). Thus overall, time complexity of O(n). This would need an extra space to keep track of all numbers found and thus a space complexity of O(n).

Q: Now, can we avoid extra space or two times traversal?

Second thought …

We know how to calculate the sum of n natural numbers, i.e.: n*(n+1)/2. With it, we can traverse the given array and keep a sum of all numbers. Difference of the sum from formula to sum found would give us the missing number. Nice!

# Keep track of sum
def sumOfGivenNumbers(nos, n):
    sum = 0
    # calculate sum
    for i in range(0, n):
        sum += nos[i]
    return sum

# Input
numbers = [4, 2, 1, 6, 5, 7] 

# number range 
n = len(numbers) + 1
expectedSum = n*(n+1)/2
numbersSum = sumOfGivenNumbers(numbers,len(numbers))

print('Missing number:', expectedSum - numbersSum)

# Output
# Missing number: 3.0

This would help is solve one missing number in single traversal, thus time complexity of O(n). No extra space was used and thus space complexity of O(1).

Q: Can we extend this to two missing numbers now?

Yes, we can extend it. Along with sum, we can also use the product of n natural number as an expression. With it, we will have two equations and two numbers to find:

Missing1 = x1
Missing2 = x2
Sum of provided numbers = N1
Sum of n Natural numbers = N
Product of provided numbers = P1
Product of n Natural numbers = P

x1 + x2 + N1 = N
x1 * x2 * P1 = P

We can solve it to find the two missing numbers. It does have the quadratic flavor associated though. It maintains the time complexity as O(n) and space complexity as O(1). Nice!

Q: Does the solution help with large integers? Think of possible overflow?

Third thought …

Let’s look at possible way for 1 missing number first.

We will traverse through all the numbers of the array. While doing so, maintain a number that would be sum of all numbers traversed so far reduced by sum of all the indexes traversed (+1 if index starts from 0). It is still making use of n natural numbers (in form of indexes) to keep a check on sum to a defined limit.

# Keep track of sum
def getMissingNumber(nos, n):
    sum = 0
    # calculate sum
    for i in range(0, n):
        sum += (i+1)
        sum -= nos[i]

    # last number to add from n+1 natural nos.
    return sum+n+1

# Input
numbers = [4, 2, 1, 6, 5, 7] 

missingNumber =getMissingNumber(numbers,len(numbers))

print('Missing number:', missingNumber)

# Output
# Missing number: 3.0

This looks good and we maintain the same complexities along with solving for overflow.

We can probably try a similar thing for two missing numbers where we keep on multiple and divide the traversed number by index but it still could have overflow issues in worst case. Further, there could be round off issues.

Fourth thought …

Looking more, it seems we can make use of XOR operation to find the missing numbers. We can make use of XOR’s property to nullify the duplicate pair. We will take XOR of provided numbers and XOR of natural numbers. Combining both again with XOR will leave with missing numbers XOR output.

For one missing number, this would be easy and covers all the hurdles discussed earlier keeping same performance.

# Keep track of XOR data
def getMissingNumber(nos, n):
    x1 = nos[0]
    xn = 1

    # start from second
    for i in range(1, n):
        x1 = x1 ^ nos[i]
        xn = xn ^ (i+1)
    
    # last number to XOR
    xn = xn ^ (n+1)

    # find the missing number
    return x1 ^ xn

# Input
numbers = [4, 2, 1, 6, 5, 7] 

missingNumber =getMissingNumber(numbers,len(numbers))

print('Missing number:', missingNumber)

# Output
# Missing number: 3.0

For two missing numbers, using a similar logic of XOR above, we will have an output of XOR value of both missing numbers. Now, given the XOR value will not be zero, the XOR corresponding valid bit in missing1 and missing2 must be different to make it “1”.

# Keep track of XOR data
def getTwoMissingNumber(nos, n):
    x1 = nos[0]
    xn = 1

    # start from second
    for i in range(1, n-2):
        x1 = x1 ^ nos[i]
        xn = xn ^ (i+1)
    
    # last numbers to XOR
    xn = xn ^ (n-1) ^ (n)

    # XOR of two missing numbers
    # Any set bit in it must be 
    # set in one missing and 
    # unset in other missing number 
    XOR = x1 ^ xn

    # Get a rightmost set bit of XOR  
    set_bit_no = XOR & ~(XOR-1) 
  
    # Divide elements in two sets 
    # by comparing rightmost set bit of XOR 
    # with bit at same position in each element. 
    x = 0
    y = 0 
    for i in range(0,n-2): 
        if nos[i] & set_bit_no:    
            # XOR of first set in nos[]  
            x = x ^ nos[i]   
        else: 
            # XOR of second set in nos[]  
            y = y ^ nos[i]   

    for i in range(1,n+1): 
        if i & set_bit_no: 
            # XOR of first set in nos[]  
            x = x ^ i        
        else: 
            # XOR of second set in nos[]  
            y = y ^ i
    
    print ("Missing Numbers: %d %d"%(x,y)) 
    return

# Input
numbers = [4, 2, 1, 6, 7, 5] 

# total length will be provided count+2 missing ones
getTwoMissingNumber(numbers, len(numbers) + 2)

# Output
# Missing Numbers: 3 8

This overcomes the overflow issue and was easier to solve (compared to solving a quadratic equation). Though it took more than one traversal, overall it maintains the time complexity as O(n) and space complexity as O(1). Nice!

Closure …

There could be multiple ways to solve for one or more missing numbers. One can look at it based on ease and need.


Keep solving!

Sandeep Mewara Github
News Update
Tech Explore
Data Explore
samples GitHub Profile Readme
Learn Machine Learning with Examples
Machine Learning workflow
What is Data Science
Word Ladder solution
What is Dynamic Programming
Learn Microsoft Tech via Videos LiveTV Streams

How to solve Word Ladder Problem?

Sometime back, a colleague of mine asked me about the word ladder problem. She was looking for a change. So, I believe she stumbled across this while preparing for data structures and algorithms.

graph-header

Problem Statement

Typically, the puzzle shared is a flavor of below:

Find the smallest number of transformations needed to change an initial word to a target word of same length. In every transformation, change only one character and make sure word exists in the given dictionary.

Explanation

Assuming all these 4 letter words are there in the dictionary provided, it takes minimum 4 transitions to convert word from SAIL to RUIN, i.e.
SAIL -> MAIL -> MAIN -> RAIN -> RUIN

Intent here is to know about Graph algorithm. So, what are graphs in context of algorithms and how do we apply them to solve such problems?

Graph Data Structure

Graphs are flow structure that represents entities connection with each other. Visually, they are represented with help of a Node (Vertex) & an Edge (Connector).

graph-general

A tree is an undirected graph in which any two nodes are connected by only one path. In it, each node (except the root node) comprises exactly one parent node.

Most common way to represent a graph is using an Adjacency matrix. In it, Element A[i][j] is 1 if there is an edge from node i to node j or else it is 0. For example, adjacency matrix of above unidirected graph is:

  | 1 2 3 4
------------
1 | 0 1 0 1
2 | 1 0 1 0
3 | 0 1 0 1
4 | 1 0 1 0

Another common way is via Adjacency list. (List format of the data instead of a matrix.)

Related Algorithms

Graphs are applied in search algorithms. Traversing the nodes and edges in a defined order helps in optimizing search. There are two specific approaches to traverse graph:

Breadth First Search (BFS)

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to find all the nodes in G for which there is a path from s. With this approach, it finds all the nodes that are at a distance k from s before it finds any nodes that are at a distance k+1.

For easy visualization, think of it as, in a tree, finding all the child nodes for a parent node as first step. Post it, find all the grandchildren and hence forth.

Depth First Search (DFS)

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to find all the nodes in G traversed from s through it’s edges. With this approach, we go deep in graph connecting as many nodes in the graph as possible and branch where necessary.

For easy visualization, think of it as, in a tree, finding all the family nodes for a parent node. With this, for a given node, we connect its children, grand children, grand grand children and so on before moving to next node of same level.

Thus, with DFS approach, we can have multiple deduced trees.

Knight’s tour is a classic example that leverages Depth First Search algorithm.

Shortest Path First OR Dijkstra’s Algorithm (SPF)

Given a graph G and a starting node s, search the shortest path to reach node d. It uses a concept of weights. It’s an iterative algorithm similar to results of BFS.

Many real world example fits in here, e.g. what would be shortest path from home to office.

With BFS (a simple queue), we visit one node at a time whereas in SPF (a priority queue), we visit a node at any level with lowest cost. In a sense, BFS follows Dijkstra's algorithm, a step at a time with all edge weights equal to 1. The process for exploring the graph is structurally the same in both cases. at times, BFS is preferred with equal weight graphs. This is because, operations on a priority queue are O(log n) compared to operations on a regular queue which is O(1).

Code

I will be using a breadth first graph algorithm here based on the problem need:

import collections
from collections import deque 

class Solution(object):
    # method that will help find the path
    def ladderLength(self, beginWord, 
                        endWord, wordList):
        """
        :type beginWord: str
        :type endWord: str
        :type wordList: Set[str]
        :returntype: int
        """

        # Queue for BFS
        queue = deque()

        # start by adding begin word
        queue.append((beginWord, [beginWord]))

        while queue:
            # let's keep a watch at active queue
            print('Current queue:',queue)

            # get the current node and 
            # path how it came
            node, path = queue.popleft()

            # let's keep track of path length 
            # traversed so far
            print('Current transformation count:',
                                        len(path))

            # find possible next set of 
            # child nodes, 1 diff
            for next in self.next_nodes(node, 
                            wordList) - set(path):
                # traversing through all child nodes
                # if any of the child matches, 
                # we are good               
                if next == endWord:
                    print('found endword at path:',
                                            path)
                    return len(path)
                else:
                    # keep record of next 
                    # possible paths
                    queue.append((next, 
                                path + [next]))
        return 0

    def next_nodes(self, word, word_list):
        # start with empty collection
        possiblenodes = set()

        # all the words are of fixed length
        wl_word_length = len(word)

        # loop through all the words in 
        # the word list
        for wl_word in word_list:
            mismatch_count = 0

            # find all the words that are 
            # only a letter different from 
            # current word those are the 
            # possible next child nodes
            for i in range(wl_word_length):
                if wl_word[i] != word[i]:
                    mismatch_count += 1
            if mismatch_count == 1:
                # only one alphabet different-yes
                possiblenodes.add(wl_word)
        
        # lets see the set of next possible nodes 
        print('possible next nodes:',possiblenodes)
        return possiblenodes

# Setup
beginWord = "SAIL"
endWord = "RUIN"
wordList = ["SAIL","RAIN","REST","BAIL","MAIL",
                                    "MAIN","RUIN"]

# Call
print('Transformations needed: ',
    Solution().ladderLength(beginWord, 
                            endWord, wordList))

# Transformation expected == 4
# One possible shortes path with 4 transformation:
# SAIL -> MAIL -> MAIN -> RAIN -> RUIN

Used deque (doubly ended queue) of Python

deque helps with quicker append and pop operations from both the ends. It has O(1) time complexity for append and pop operations. In comparison, list provides it in O(n) time complexity.

A quick look at the code workflow to validate if all nodes at a particular distance was traversed first and then moved to next level:

Current queue: deque([('SAIL', ['SAIL'])])

Current transformation count: 1
possible next nodes: {'BAIL', 'MAIL'}
Current queue: deque([('BAIL', ['SAIL', 'BAIL']), 
                      ('MAIL', ['SAIL', 'MAIL'])])

Current transformation count: 2
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIL', ['SAIL', 'MAIL']), 
                      ('MAIL', ['SAIL', 'BAIL', 
                       'MAIL'])])

Current transformation count: 2
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('MAIL', ['SAIL', 'BAIL', 
                                'MAIL']), 
                      ('BAIL', ['SAIL', 'MAIL', 
                                'BAIL']), 
                      ('MAIN', ['SAIL', 'MAIL', 
                                'MAIN'])])

Current transformation count: 3
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('BAIL', ['SAIL', 'MAIL', 
                                'BAIL']), 
                      ('MAIN', ['SAIL', 'MAIL', 
                                'MAIN']), 
                      ('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'MAIL', 
                                'MAIN']), 
                      ('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN']), 
                      ('RAIN', ['SAIL', 'MAIL', 
                                'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('RAIN', ['SAIL', 'MAIL', 
                                'MAIN', 'RAIN']), 
                      ('RAIN', ['SAIL', 'BAIL', 
                        'MAIL', 'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'MAIN', 'RUIN'}
found endword at path: ['SAIL', 'MAIL', 'MAIN', 
                                        'RAIN']

Transformations needed:  4
Overall path: ['SAIL', 'MAIL', 'MAIN', 
                               'RAIN', 'RUIN']

Complexity

For above code that I used to find the shortest path for transformation:

Time

In next_nodes, for each word in the word list, we iterated over its length to find all the intermediate words corresponding to it. Thus we did M×N iterations, where M is the length of each word and N is the total number of words in the input word list. Further, to form an intermediate word, it takes O(M) time. This adds up to O(M2×N).

In ladderLength, BFS can go to each of the N words and for each word, we need to examine M possible intermediate words. This adds up to O(M2×N).

Overall, it adds up to O2(M2×N) which would be called O(M2×N).

Space

In next_nodes, each word in the word list would have M intermediate combinations. For every word we need a space of M2 to save all the transformations corresponding to it. Thus, it would need a total space of O(M2×N).

In ladderLength, BFS queue would need a space of O(M×N)

Overall, it adds up to O(M2×N) + O(M×N) which would be called O(M2×N)

Wrap Up

It could be little tricky and thus would need some practice to visualize the graph as well to write code for it.

Great, so now we know how to solve problems like word ladder problem. It also touch based other related common graph algorithms that we can refer to.

I had a read of the following reference and it has much more details if needed.


Keep problem solving!

samples GitHub Profile Readme
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples
Sandeep Mewara Github Repositories

Just reverse alphabets in a string?

Last week it was a simple problem that we discussed to tangle our brain.

string-reverse

Reverse only alphabets in a provided string in a most efficient way. (Special characters, numbers, etc should continue to stay at their original place)

First thought …

Okay, so first thought was to make use of an extra space and then we can get our desired result with two traversals of the string. Yeah, but let’s optimize.

Updated thought …

We can make the swaps in place in a single traversal. A good stopping criteria seems to be when index while moving from front crosses the index moving backwards from end.

Let’s write code:

static void Main(string[] args)
{
    Console.WriteLine($"Please enter a string:");

    // Ignore casing
    var inputString = Console.ReadLine().ToLower();
    char[] inputArray = inputString.ToCharArray();
    ReverseAlphabetsOnly(inputArray);        
    Console.WriteLine(
            $"Reversed: {new String(inputArray)}");
}

static void ReverseAlphabetsOnly(char[] inputArray)
{
    int frontIndex = 0;
    int endIndex = inputArray.Length-1;
    char temp;

    while(frontIndex < endIndex)
    {
        if(!IsAlphabet(inputArray[frontIndex]))
            frontIndex++;
        else if(!IsAlphabet(inputArray[endIndex]))
            endIndex--;    
        else
        {
            temp = inputArray[frontIndex];
            inputArray[frontIndex] 
                 = inputArray[endIndex];
            inputArray[endIndex] = temp;

            frontIndex++;
            endIndex--;
        }
    }
}

static bool IsAlphabet(char x) 
{ 
    return ( (x >= 'a' && x <= 'z') 
            || (x >= 'A' && x <= 'Z') ); 
}

// Input:  Le@rn By In$ig#t...
// Output: tg@in Iy Bn$re#L...

Closure …

Approach looks good, as it would be at maximum a single traversal with no extra space used. Thus we are able to solve it with an overall Order of Time complexity O(n) & Space complexity O(1).


Happy solving …

samples GitHub Profile Readme
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples
Sandeep Mewara Github Repositories

Linear time partition – a three way split

Linear-time partition is a divide & conquer based selection algorithm. With it, data is split into three groups using a pivot.
.

linear-time-partioning

An integral part of Quick Sort algorithm which uses this partitioning logic recursively. All the elements smaller than the pivot are put on one side and all the larger ones on the other side of the pivot.

Similar to discussion of Dynamic Programming, this algorithm plays on solving sub-problems to solve complex problem.

Algorithm

Post selecting the pivot, Linear-time partition routine separates the data into three groups with values:

  • less than the pivot
  • equal to the pivot
  • greater than the pivot

Generally, this algorithm is done in place. This results in partially sorting the data. There are handful of problems that makes use of this fact, like:

  • Sort an array that contains only 0’s, 1’s & 2’s
  • Dutch national flag problem
  • Print all negative integers followed by positive for an array full of them
  • Print all 0’s first and then 1’s or vice-versa for an array with only 0’s & 1’s
  • Move all the 0’s to the end maintaining relative order of other elements for an array of integers

If done out of place, (i.e. not changing the original data), it would cost O(n) additional space

Example

Let’s take an example of: sort a array that contains only 0’s, 1’s & 2’s

First thought for such problem is to perform a count of 0’s, 1’s and 2’s. Once we have the counts, reset the array with them. Though it has time complexity O(n), it takes two traversal of the array or uses an extra array.

Below is an attempt to solve using Linear-time partition algorithm to avoid that extra traversal/space.

def threeWayPartition(A):
    start = mid = 0
    end = len(A)-1
    
    # define a Pivot
    pivot = 1
    
    while (mid <= end):
        # mid element is less than pivot
        # current element is 0
        
        # so lets move it to start
        # current start is good. 
        # move start to next element
        # move mid to next element to move forward
        if (A[mid] < pivot) :
            swap(A, start, mid)
            start = start + 1
            mid = mid + 1
            
        # mid element is more than pivot
        # current element is 2
        
        # so lets move it to end
        # current end is good. 
        # move end to previous element
        elif (A[mid] > pivot) :
            swap(A, mid, end)
            end = end - 1
        
        # mid element is same as pivot
        # current element is 1
        
        # just move forward: 
        # mid to next element
        else :
            mid = mid + 1
            
# Swap two elements A[i] and A[j] in the list
def swap(A, i, j):
    A[i], A[j] = A[j], A[i]


# Define an array
inputArray = [0, 1, 2, 2, 1, 0, 0, 2]

# Call the Linear-time partition routine
threeWayPartition(inputArray)

# print the final result
print(inputArray)

# Outputs
# [0, 0, 0, 1, 1, 2, 2, 2]

With a defined pivot, we segregated the data on the either side which resulted in desired output. Dutch nation flag problem or printing all negative first and then positive, or printing all 0s first follows the same code.

For moving all 0’s to end maintaining other elements order, we do a tweak in swap index to maintain order:

def threeWayPartition(A):
    current = 0
    nonzero = 0
    end = len(A)-1
    
    # define a Pivot
    pivot = 0
    
    while (current <= end):
        if (A[current] != pivot) :
            swap(A, current, nonzero)
            nonzero = nonzero + 1
        current = current + 1
            
# Swap two elements A[i] and A[j] in the list
def swap(A, i, j):
    A[i], A[j] = A[j], A[i]


# Define an array
inputArray = [7,0,5,1,2,0,2,0,6]

# Call the Linear-time partition routine
threeWayPartition(inputArray)

# print the final result
print(inputArray)

# Output
# [7, 5, 1, 2, 2, 6, 0, 0, 0]

Complexity

With above algorithm approach, we solved our problem with Time complexity O(n) & Space complexity O(1) (with single traversal of the array)


It was fun solving!

samples GitHub Profile Readme
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples
Sandeep Mewara Github Repositories

Dynamic Programming – What, How & When

There are many programming techniques, which when applied appropriately to specific situation, leads to efficient results (either time wise or space wise or both). Dynamic Programming (DP) is one such optimization concept.

dp-header

Not to be confused with Dynamic programming language or Dynamic type in C#

What is?

Dynamic programming is solving a complicated problem by breaking it down into simpler sub-problems and make use of past solved sub-problems.

Dynamic Programming is mainly an optimization over plain recursion.

Let’s use Fibonacci series as an example to understand this in detail.

Fibonacci Series is a sequence, such that each number is the sum of the two preceding ones, starting from 0 and 1.

0, 1, 1, 2, 3, 5, 8, 13, 21…

When to?

It is best applied to problems that exhibits Overlapping Subproblems & Optimal Substructure.

Following is how our Fibonacci problem fib(4) breakdown would look like:

Problem tree for fib(4)

We can see that fib(0), fib(1) & fib(2) are occurring multiple times in the graph. These are overlapping subproblems.

Binary search tree would not fall into same category. There would no common subproblem by binary search definition itself.

We can see for optimal solution of fib(4), we would need optimal solution of fib(3) & fib(2) and so on. This tells it has optimal substructure.

Longest path problem does not exhibit as such. Optimal solution for longest distance between A & C might not be optimal for B & C that involves A.

How to?

It’s not an unknown or a new technique. Our brain do it almost all the time. Let’s use Fibonacci series as an example to understand this and then we will apply DP to solve it programmatically.

Logically …

Let’s say, I ask, what is the 4th number in the Fibonacci series? You would work from 0, 1, 1, etc and come to fib(4) = 2. Now, if I ask, what is the 5th number? Would you restart from 0, 1, 1..? Or you would just make use of fib(4) you just solved and get fib(5)?

You see – it’s obvious, you will make use of previous solved result and right away come up with fib(5) instead of starting from 0. In programming language, this is tabulation using calculated fib(4) & fib(3).

Tabulation is a technique of starting from smallest sub-problem and storing the results of entries until target value is reached.

Let’s say, now I ask to calculate fib(8). Our natural way of working on it would be to first find fib(7) & fib(6). In order to know fib(7), we will figure out fib(6) & fib(5) and hence forth. With fib(7), we already traversed through fib(6). This is another approach where we calculated the next subproblem and keep on storing them in case of need later. In programming language, this is memoization.

Memoization is a technique of storing the results of expensive function calls and returning the cached result when the same inputs occur again.

DP breaks the problem into sub-problems and uses memoization or tabulation to optimize. We will understand about them with examples below.

Programmatically in action …

In order to compare the optimization cost, we will use recursion as another way to solve the problem.

static void Main(string[] args)
{
    Stopwatch stopwatch = new Stopwatch();

    Console.WriteLine($"Fibonacci Recursive:");
    for (int i = 30; i <= 50; i+=5)
    { 
        stopwatch.Reset();
        stopwatch.Start();
        var _ = FibonacciRecursive(i);
        stopwatch.Stop();
        Console.WriteLine($"{i}th took time: ({stopwatch.Elapsed})");
    }

    Console.WriteLine($"Dynamic Programming:");
    for (int i = 30; i <= 50; i+=5)
    { 
        stopwatch.Reset();
        stopwatch.Start();
        var _ = FibonacciDPTabulation(i);
        stopwatch.Stop();
        Console.WriteLine($"{i}th took time: ({stopwatch.Elapsed})");
    }
}

static long FibonacciRecursive(int number)
{
    if (number <= 1)
        return number;
    return FibonacciRecursive(number-2) + FibonacciRecursive(number - 1);
}

static long FibonacciDPTabulation(int number)
{
    long[] arr = new long[100];
    arr[0] = 1; arr[1] = 1; 
    for( int x=2; x <= number; x++)
    {
        arr[x] = arr[x-1] + arr[x-2];
    }
    return arr[number];
}

With above code, we got the following output:

 RecursiveDP (Tabulation)DP (Memoization)
30th took time(00:00:00.0090536)(00:00:00.0002756)(00:00:00.0183122)
35th took time(00:00:00.0908688)(00:00:00.0000037)(00:00:00.0000009)
40th took time(00:00:00.9856354)(00:00:00.0000006)(00:00:00.0000011)
45th took time(00:00:10.7981258)(00:00:00.0000006)(00:00:00.0000008)
50th took time(00:02:24.8694889)(00:00:00.0000006)(00:00:00.0000008)

Difference is astonishing! DP is too fast in comparison. Just look at the 50th time difference.

With simple iterative approach, Fibonacci Series can be calculated in the same ballpark time as of DP. Current example is just to keep things simple and to understand the DP concept. Please look at the end of post for common examples that would clarify where DP would be of real help over recursion. (& iterative approach would be difficult to code)

Above approach of DP is considered Bottom-Up approach as we started with bottom (lowest term) and then moved to the highest one. This is tabulation. We keep on filling on the cache here till we reach the target.

Alternatively, there is a Top-down approach. We start solving from highest term and store solutions of sub problems along the way. This is memoization. A code for this approach would look like:

private static Dictionary<int, long> memo 
                    = new Dictionary<int, long>();

static long FibonacciDPMemoization(int number)
{
    if (number == 0 || number == 1) {
        return number;
    }
    
    // see if we've already calculated this
    if (memo.ContainsKey(number)) {                
        return memo.GetValueOrDefault(number);
    }

    long result = FibonacciDPMemoization(number - 1) 
                + FibonacciDPMemoization(number - 2);
    memo.Add(number, result);

    return result;
}

memoization is sometimes simpler to understand and write code because of it’s natural way of solving.

Generally, tabulation outperformes memoization by a constant factor. This is because of no overhead for recursion and can be stored as a pre-allocated array. We first calculate and store the value of a sub problem repeated most number of times here.

Few Common Examples?

  • Tower of Hanoi
  • Checkerboard
  • Egg dropping puzzle
  • Matrix chain multiplication

Details of these above problem can be read here.

Beware?

Apart from avoiding problems that do not have Overlapping Subproblems & Optimal Substructure, one need to understand we are doing a trade here – space for time!

We can always discuss that though DP is using extra space to store the sub-problems data, but in turn helps save in memory calculation which could be expensive. Thus, in case of constrained space, we need to evaluate usage of DP.


Keep learning!

samples GitHub Profile Readme
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples

Any anagram of a string, a palindrome?

Often in our group we discuss about puzzles or problems related to data structure and algorithms. One such day, we discussed about:

how will we find if any anagram of a string is palindrome or not?

Our first thought went in the direction to start from first character and then traverse till end to see if there could be matching pair. Keep track of it, move to next character till middle and then stitch all to figure if so. It solves, but the query was – could it be solved better though?

Of course! With putting some stress on the brain, it turned out that in a single read, we will have info enough, to tell, if any anagram formed from the input string can be a palindrome or not.

Thought converted to Code

static void CheckIfStringAnagramHasPalindrome()
{
    Console.WriteLine($"Please enter a string:");

    // Ignore casing
    var inputString = Console.ReadLine().ToLower();

    // Just need to keep track of unique characters
    var characterSet = new HashSet<char>();

    // Single traversal of input string 
    for(int i=0; i<inputString.Length; i++)
    {
        char currentCharacter = inputString[i];
        if(characterSet.Contains(currentCharacter))
            characterSet.Remove(currentCharacter);
        else
            characterSet.Add(currentCharacter);
    }

    // Character counts in set will help 
    // identify if palindrome possible 
    var leftChars = characterSet.Count;
    if(leftChars == 0 || leftChars == 1)
        Console.WriteLine($"YES - possible.");
    else
        Console.WriteLine($"NO - Not possible.");
}


Approach looked good, as with a single traversal and usage of HashSet, i.e. with overall Order of Time complexity O(n) & Space complexity O(1), we were able to solve it.

It was fun solving!

GitHub Profile Readme Samples
Learn Python – Beginners step by step – Basics and Examples
Sandeep Mewara Github
Sandeep Mewara Learn By Insight

Read from console in VS Code

I moved to Apple Mac late last year because of different set of technologies now I work in. As shared in one of my previous posts, I use Visual Studio Code for programming in Python exploring Machine Learning. Though, for anything in .NET, I switch to a Windows VM and use Visual Studio.

For quick console apps, it feels painful to switch to a VM and work. Thus, I looked and installed C# extension in VS Code to try of. Details are here.

While running a console app, I got stuck to read any value from Console. In debug mode, IDE would stop on the Console.ReadLine() but whatever I type in Console would not go through.

I looked around and found that there are few settings for Console in VS Code. The console setting controls what console (terminal) window the target app is launched into.

"internalConsole" (default) : This does NOT work for applications that want to read from the console (ex: Console.ReadLine).

How to Solve it?

Suggested way to take input is to set the console setting as integratedTerminal. This is a configuration setting in the launch.json file under .vscode folder.

"integratedTerminal" : the target process will run inside VS Code’s integrated terminal (Terminal tab in the tab group beneath the editor). Alternatively add "internalConsoleOptions": "neverOpen" to make it so that the default foreground tab is the terminal tab.

Change the default setting like below:

vscode-read-console

With above change, the input and output will happen through integrated terminal like:

vscode-read-console-2

So far, it looks good and seems I will stick to Visual Studio Code on Mac for quick console applications.

Reference here.


Keep learning!

LearnByInsight
GitHub Profile Readme Samples

New C# features I really liked!

There has been many new features added to C# over last few years. A recent survey in CodeProject community lead me to the thought of sharing what I find really helpful. It spreads from C# 6.0 to C# 8.0. Below few made writing code easy, more fun and have improved productivity.

Null Conditional Operator (?. & ?[])

They make null checks much easier and fluid. Add a ? just before the the member access . or indexer access [] that can be null. It short-circuits and returns null for assignment.

// safegaurd against NullReferenceException

Earlier
if(address != null)
{
   var street = address.StreetName;
}
Now
var street = address?.StreetName;

// safegaurd against IndexOutOfRangeException

Earlier
if(row != null && row[0] != null)
{
   int data = row[0].SomeCount;
}
Now
int? data = row?[0]?.SomeCount;

Null Coalescing Operator (?? & ??=)

Null-coalescing operator ?? helps to assign default value if the properties is null. Often, used along with null conditional operator.

Earlier
if(address == null)
{
   var street = "NA";
}
Now
var street = address?.StreetName ?? "NA"; 

Null-coalescing assignment operator ??= helps to assign the value of its right-hand operand to its left-hand operand only if the left-hand operand evaluates to null.

The left-hand operand of the ??= operator must be a variable, a property, or an indexer element

Earlier
int? i = null;

if(i == null)
   i = 0;

Console.WriteLine(i);  // output: 0
Now
int? i = null;

i ??= 0;
Console.WriteLine(i);  // output: 0

String Interpolation ($)

It enables to embed expressions in a string. With a special character $ to identify a string literal as an interpolated string. Interpolation expressions are replaced by the string representations of the expression results in the result string.

Earlier
string address = string.Format("{0},{1}", HouseNo, StreetName);

log.Write("Address: "+ HouseNo.ToLower() + "," + StreetName);
Now
string address = $"{HouseNo}, {StreetName}";

log.Write($"Address: {HouseNo.ToLower()}, {StreetName}");

Auto-Property Initializer

It helps declare the initial value for a property as part of the property declaration itself.

Earlier
Language _currentLanguage = Language.English;
public Language CurrentLanguage
{
   get { return _currentLanguage; }
   set { _currentLanguage = value; }
}   

// OR

// Improvement in C# 3.0
public Language CurrentLanguage { get; set; }

public MyClass()
{
    CurrentLanguage = Language.English;
}
Now
public Language CurrentLanguage { get; set; } = Language.English;

using static

It helps to import the enum or static methods of a single class.

Earlier
public class Enums
{
    public enum Language
    {
        English,
        Hindi,
        Spanish
    }
}

// Another file
public class MyClass
{
    public Enums.Language CurrentLanguage { get; set; };
}
Now
public class Enums
{
    public enum Language
    {
        English,
        Hindi,
        Spanish
    }
}

// Another file
using static mynamespace.Enums

public class MyClass
{
    public Language CurrentLanguage { get; set; };
}

Tuples

They are lightweight data structures that contain multiple fields to represent the data members.

# Initialize Way 1
(string First, string Second) ranks = ("1", "2");
Console.WriteLine($"{ranks.First}, {ranks.Second}");

# Initialize Way 2
var ranks = (First: "1", Second: "2");
Console.WriteLine($"{ranks.First}, {ranks.Second}");

It support == and !=

Expression bodied get/set accessors

With it, members can be implemented as expressions.

Earlier
public string Title
{
    get { return _title; }
    set 
    { 
       this._title = value ?? "Default - Hello";
    }
}
Now
public string Title
{
    get => _title;
    set => this._title = value ?? "Default - Hello";
}

Access modifier: private protected

A new compound access modifier: private protected to indicate a member accessible by containing class or by derived classes that are declared in the same assembly. One more level of abstraction compared to protected internal.

// Assembly1.cs
public class BaseClass
{
    private protected int myValue = 0;
}

public class DerivedClass1 : BaseClass
{
    void Access()
    {
        var baseObject = new BaseClass();

        // Error CS1540, because myValue can only be
        // accessed by classes derived from BaseClass
        // baseObject.myValue = 5;

        // OK, accessed through the current 
        // derived class instance
        myValue = 5;
    }
}

//
// Assembly2.cs
// Compile with: /reference:Assembly1.dll
class DerivedClass2 : BaseClass
{
    void Access()
    {
        // Error CS0122, because myValue can only
        // be accessed by types in Assembly1
        // myValue = 10;
    }
}

await

It helps suspend evaluation of the enclosing async method until the asynchronous operation represented by its operand completes. On completion, it returns result of the operation if any.

It does not blocks the thread that evaluates async method, instead suspends the enclosing async method and returns to the caller of the method.

using System;
using System.Net.Http;
using System.Threading.Tasks;

public class AwaitOperatorDemo
{
    // async Main method allowed since C# 7.1 
    public static async Task Main()
    {
        Task<int> downloading = DownloadProfileAsync();
        Console.WriteLine($"{nameof(Main)}: Started download.");

        int bytesLoaded = await downloading;
        Console.WriteLine($"{nameof(Main)}: Downloaded {bytesLoaded} bytes.");
    }

    private static async Task<int> DownloadProfileAsync()
    {
        Console.WriteLine($"{nameof(DownloadProfileAsync)}: Starting download.");

        var client = new HttpClient();
        // time taking call - await and move on
        byte[] content = await client.GetByteArrayAsync("https://learnbyinsight.com/about/");

        Console.WriteLine($"{nameof(DownloadProfileAsync)}: Finished download.");
        return content.Length;
    }
}

// Output:
// DownloadProfileAsync: Starting download.
// Main: Started download.
// DownloadProfileAsync: Finished download.
// Main: Downloaded 27700 bytes.

Default Interface methods

Now, we can add members to interfaces and provide a default implementation for those members. It helps in supporting backward compatibility. There would be no breaking change to existing interface consumers. Existing implementations inherit the default implementation.

public interface ICustomer
{
    DateTime DateJoined { get; }
    string Name { get; }

    // Later added to interface:
    public string Contact()
    {
       return "contact not provided";
    }
}

Wrap up

There are many more additions to C#. Believe, above are few that one should know and use in their day to day coding right away (if not already doing it). Most of it helps us with being more concise and avoid convoluted code.

Reference: https://docs.microsoft.com/en-us/dotnet/csharp/whats-new

Keep learning!

LearnByInsight
GitHub Profile Readme Samples

Kubernetes – Evolution of application deployment

Kubernetes (K8s) is turning out as the cutting-edge of application deployment. It is becoming core to the creation and operation of modern software (few call it as modern SaaS). Thus, I planned to look into it and see what Kubernetes is and how/what application design will help adapt it in the application deployment evolution.

Kubernetes is a portable, extensible, open-source platform for automating deployment, scaling, and management of containerized applications.

History

Google originally designed and open-sourced the Kubernetes project in 2014. Kubernetes has inputs from over 15 years of Google’s experience to run production workloads at scale with best ideas and practices from the community. It is maintained by the Cloud Native Computing Foundation now. It’s current development repository is here.

First challenge …

With modern goal parameters like: recoverability, release cycle time & release frequency – applications need to be designed and deployed in a way that makes them improve year over year.

This leads to first step of breaking the monolith into microservices such that the changes and impact are compartmentalized for easy deployment and recovery.

monolith2microservice

A monolithic application puts all it’s functionality in a single process. In need of scaling, it replicates entire monolith on multiple servers. On the other hand, a microservice architecture separates out (keeps) each functionality into a separate service. Thus in case of scaling need, these services are distributed across servers as required.

Second challenge …

With multiple microservices in play, a variance of stack versions or deployment styles kicks in as trouble. Each team would have their own set of tools, versions to build the artifacts, store them and then deploy them. Thus, different applications/services can have different patterns and network topology. This in turn makes managing security and infrastructure more challenging.

This leads to the step of abstracting infrastructure out to ease maintenance and relieve from security and other infrastructure related concerns.

deployment-progression
Deployment scheme evolution
  • Traditional: Applications running on a physical server. No way to define resource boundaries for applications.
  • Virtualization: Allows to run multiple Virtual Machines (VMs) on a single physical server’s CPU. This leads to better utilization of resources and better scalability as an application can be added or updated easily. Also, if needed, applications can be isolated between different VMs to provide a level of security.
  • Containers: Like VM, it has its own filesystem, CPU, memory, process space, etc. Are environment consistent, easy to scale, portable across clouds and OS distributions. This leads to loosely coupled setup where application is totally decoupled from infrastructure and makes it easy to move towards smaller, modular microservices.

Containers are abstraction to next level. It does not matter on which OS you are on (although there could be different containers for different OS and how they work underlying), all we need is to package our code and needed libraries together, which then runs inside a container based on configured resource need. Docker is an example of container runtime, a packaging software.

Final challenge …

So, the packaging has been simplified and running the application on a single node has been simplified. When we move to enterprise, we need to scale up/down our containers on need basis automatically. Further, one would scale the application to be served from multiple servers instead of just one for better load distribution and easy recovery/fail safe. Now, while distributing the load, we would need to ensure the availability of nodes, resources like space on node for running a container, etc.

This is where Kubernetes pitch in. It acts as a container orchestrator that help provides with a framework to run distributed systems resiliently. It takes care of scaling and failover of containers having application, provides deployment patterns, and more.

kubernetes-architecture

Kubernetes has master-slave architecture where there is one master node and multiple worker nodes. A Pod is the smallest deployable unit in it. In order to run a single container, we would need to create a Pod for that container. A Pod can contain more than one container if those containers are relatively tightly coupled (like a container to download all secret configs related before application starts in other container).

API Server is the heart of the architecture. User interacts with Kubernetes via it and master node communicates to worker nodes through it. Number of containers requested is stored in the etcd (key-value store). Controller acts as a manager that keeps a constant check on the store, schedules the request for scheduler to pick and execute, spins of another worker node in case of need.

Wrap Up …

I have just touched the surface of both containerization and Kubernetes. They seem to have much more and can be explored in depth. Along with vast benefits, it can also bring new challenges on the table with moving to cloud like security and networking.

It was good to know how application design and deployment are evolving, getting abstracted and loosely coupled.

Keep learning!

Reference: https://kubernetes.io/docs/home/

GitHub Readme Samples