New C# features I really liked!

September 5, 2020September 25, 2020Sandeep Mewara Leave a comment

There has been many new features added to C# over last few years. A recent survey in CodeProject community lead me to the thought of sharing what I find really helpful. It spreads from C# 6.0 to C# 8.0. Below few made writing code easy, more fun and have improved productivity.

Null Conditional Operator (?. & ?[])

They make null checks much easier and fluid. Add a ? just before the the member access . or indexer access [] that can be null. It short-circuits and returns null for assignment.

// safegaurd against NullReferenceException

Earlier

if(address != null)
{
   var street = address.StreetName;
}

Now

var street = address?.StreetName;

// safegaurd against IndexOutOfRangeException

Earlier

if(row != null && row[0] != null)
{
   int data = row[0].SomeCount;
}

Now

int? data = row?[0]?.SomeCount;

Null Coalescing Operator (?? & ??=)

Null-coalescing operator ?? helps to assign default value if the properties is null. Often, used along with null conditional operator.

Earlier

if(address == null)
{
   var street = "NA";
}

Now

var street = address?.StreetName ?? "NA";

Null-coalescing assignment operator ??= helps to assign the value of its right-hand operand to its left-hand operand only if the left-hand operand evaluates to null.

The left-hand operand of the ??= operator must be a variable, a property, or an indexer element

Earlier

int? i = null;

if(i == null)
   i = 0;

Console.WriteLine(i);  // output: 0

Now

int? i = null;

i ??= 0;
Console.WriteLine(i);  // output: 0

String Interpolation ($)

It enables to embed expressions in a string. With a special character $ to identify a string literal as an interpolated string. Interpolation expressions are replaced by the string representations of the expression results in the result string.

Earlier

string address = string.Format("{0},{1}", HouseNo, StreetName);

log.Write("Address: "+ HouseNo.ToLower() + "," + StreetName);

Now

string address = $"{HouseNo}, {StreetName}";

log.Write($"Address: {HouseNo.ToLower()}, {StreetName}");

Auto-Property Initializer

It helps declare the initial value for a property as part of the property declaration itself.

Earlier

Language _currentLanguage = Language.English;
public Language CurrentLanguage
{
   get { return _currentLanguage; }
   set { _currentLanguage = value; }
}   

// OR

// Improvement in C# 3.0
public Language CurrentLanguage { get; set; }

public MyClass()
{
    CurrentLanguage = Language.English;
}

Now

public Language CurrentLanguage { get; set; } = Language.English;

using static

It helps to import the enum or static methods of a single class.

Earlier

public class Enums
{
    public enum Language
    {
        English,
        Hindi,
        Spanish
    }
}

// Another file
public class MyClass
{
    public Enums.Language CurrentLanguage { get; set; };
}

Now

public class Enums
{
    public enum Language
    {
        English,
        Hindi,
        Spanish
    }
}

// Another file
using static mynamespace.Enums

public class MyClass
{
    public Language CurrentLanguage { get; set; };
}

Tuples

They are lightweight data structures that contain multiple fields to represent the data members.

# Initialize Way 1
(string First, string Second) ranks = ("1", "2");
Console.WriteLine($"{ranks.First}, {ranks.Second}");

# Initialize Way 2
var ranks = (First: "1", Second: "2");
Console.WriteLine($"{ranks.First}, {ranks.Second}");

It support == and !=

Expression bodied get/set accessors

With it, members can be implemented as expressions.

Earlier

public string Title
{
    get { return _title; }
    set 
    { 
       this._title = value ?? "Default - Hello";
    }
}

Now

public string Title
{
    get => _title;
    set => this._title = value ?? "Default - Hello";
}

Access modifier: private protected

A new compound access modifier: private protected to indicate a member accessible by containing class or by derived classes that are declared in the same assembly. One more level of abstraction compared to protected internal.

// Assembly1.cs
public class BaseClass
{
    private protected int myValue = 0;
}

public class DerivedClass1 : BaseClass
{
    void Access()
    {
        var baseObject = new BaseClass();

        // Error CS1540, because myValue can only be
        // accessed by classes derived from BaseClass
        // baseObject.myValue = 5;

        // OK, accessed through the current 
        // derived class instance
        myValue = 5;
    }
}

//
// Assembly2.cs
// Compile with: /reference:Assembly1.dll
class DerivedClass2 : BaseClass
{
    void Access()
    {
        // Error CS0122, because myValue can only
        // be accessed by types in Assembly1
        // myValue = 10;
    }
}

await

It helps suspend evaluation of the enclosing async method until the asynchronous operation represented by its operand completes. On completion, it returns result of the operation if any.

It does not blocks the thread that evaluates async method, instead suspends the enclosing async method and returns to the caller of the method.

using System;
using System.Net.Http;
using System.Threading.Tasks;

public class AwaitOperatorDemo
{
    // async Main method allowed since C# 7.1 
    public static async Task Main()
    {
        Task<int> downloading = DownloadProfileAsync();
        Console.WriteLine($"{nameof(Main)}: Started download.");

        int bytesLoaded = await downloading;
        Console.WriteLine($"{nameof(Main)}: Downloaded {bytesLoaded} bytes.");
    }

    private static async Task<int> DownloadProfileAsync()
    {
        Console.WriteLine($"{nameof(DownloadProfileAsync)}: Starting download.");

        var client = new HttpClient();
        // time taking call - await and move on
        byte[] content = await client.GetByteArrayAsync("https://learnbyinsight.com/about/");

        Console.WriteLine($"{nameof(DownloadProfileAsync)}: Finished download.");
        return content.Length;
    }
}

// Output:
// DownloadProfileAsync: Starting download.
// Main: Started download.
// DownloadProfileAsync: Finished download.
// Main: Downloaded 27700 bytes.

Default Interface methods

Now, we can add members to interfaces and provide a default implementation for those members. It helps in supporting backward compatibility. There would be no breaking change to existing interface consumers. Existing implementations inherit the default implementation.

public interface ICustomer
{
    DateTime DateJoined { get; }
    string Name { get; }

    // Later added to interface:
    public string Contact()
    {
       return "contact not provided";
    }
}

Wrap up

There are many more additions to C#. Believe, above are few that one should know and use in their day to day coding right away (if not already doing it). Most of it helps us with being more concise and avoid convoluted code.

Reference: https://docs.microsoft.com/en-us/dotnet/csharp/whats-new

Keep learning!

LearnByInsight
GitHub Profile Readme Samples

Harness your voice using Transcribe in Word

September 5, 2020September 25, 2020Sandeep Mewara Leave a comment

A new enhancement in Microsoft Office 365’s Word for the web – Transcribe in Word. It leverages the Azure Cognitive Services AI platform.

Transcribe converts speech (recorded directly in Word or from an uploaded audio file) to a text transcript with each speaker individually separated.

We can record our conversations directly in Word for the web and it transcribes them automatically with each speaker identified separately. Transcript will appear alongside the Word document, along with the recording.

For now, English (EN-US) is the only language supported for transcribe audio

Once the recording is finished, we can:

easily follow the flow of the transcript
revisit parts of the recording by playing back the time-stamped audio
edit the transcript for any corrections or if we see something amiss
save the full transcript as a Word document

How to use it?

Transcribe in Word is already available in Word for the web for all Microsoft 365 subscribers. Usage wise, it is completely unlimited to record and transcribe within Word for the Web.

There is a five hour limit per month for uploaded recordings and each uploaded recording is limited to 200mb.

Real life applications …

It has multiple values in different aspects of usage:

would be much easier to concentrate in meetings & discussions if doing multitask affects (taking notes during discussion)
provide important quotes with others in quick time
summarize the meeting based on key topics identified
- Minutes of meetings
- Key notes
opens up potential for NLP world (AI) in future
- access patterns particular speakers on how they speak, use specific words, provide feedback
- access questions and their response, act specifically
- improve auto corrections

Wrap Up

Seems like a nice move by Microsoft, to cover more than one aspect where it can help. Worth a feature to try out and see how it works and helps.

Reference: https://www.microsoft.com/en-us/microsoft-365/blog/2020/08/25/microsoft-365-transcription-voice-commands-word/

Keep exploring!

Samples GitHub profile Readme

pandas – get started with examples

August 30, 2020September 25, 2020Sandeep Mewara Leave a comment

This is to get started with pandas and try few concrete examples. pandas is a Python based library that helps in reading, transforming, cleaning and analyzing data. It is built on the NumPy package.

pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
https://pandas.pydata.org

Key data structure in pandas is called DataFrame – it helps to work with tabular data translated as rows of observations and columns of features.

Download or fork entire Jupiter notebook from my GitHub to play around: https://github.com/sandeep-mewara/python-examples

pandas basics includes:

Series
Dataframes
- Create
  - from list of tuples
  - from a dictionary
  - from a CSV
  - from built-in dataset (eg: from sklearn.datasets)
- Data retrieval
- Modifying data
- Group by operation
- Custom Functions – apply method
- Pre-Processing
  - drop, mean, mode
  - ordinal feature
  - nominal feature
- Reshaping
  - CrossTab
  - Merge
  - Melt
  - Pivot

# .info(), .head(), .sample are handy method to use first off with dataframe to get a high level details
# index may be not unique – can return multiple values
# boolean indexing (masking) can help select certain set of rows
# .isin() is a useful when building a boolean index
# .where() is useful to retain shape of the original table
# Column names & Indexes can be set if needed
# to modify the table right away, use inplace=True
# aggregate operations can be applied on a groupby object
# dropna(), mean() or mode() are handy ways for pre-processing missing data
Key learning’s …

Examples notebook includes:

Uber taxi drivers
Apple stock price
Day or Night
Students marks
Balance Calculator

# .describe() is a handy method to get the statistical summary of numerical columns
# one-hot-encoding is really helpful for nominal features (that cannot be ordered)
# converting the columns into right datatype helps
# converting data into meaningful numbers help for analysis
# groupby is a powerful tool with dataframes for analysis
Key learning’s …

Cheat sheet

Download cheat sheet pdf from here
For more details about pandas, look at the documentation reference.

Keep learning!

“vshost32.exe has stopped working”

August 29, 2020September 25, 2020Sandeep Mewara Leave a comment

This is another one of the common errors developers get and ask about: vshost32.exe has stopped working.

Problem Statement

When I run my project (or a particular usecase), it displays an error: vshost32.exe has stopped working

Assessment

vshost was introduced in Visual Studio 2005 (only for use in VS). These are files that contains vshost in the file name and are placed under the output (default bin) folder of the application. It is the “hosting process” created when we build a project in Visual Studio.

It has following core responsibilities:

to provide support for improved F5 performance
To run a managed application in debug mode using F5 command, Visual Studio would need an AppDomain to provide a place for the runtime environment within which the application can run. It takes quite a bit of time to create an AppDomain and initialize the debugger along with it. The hosting process speeds up this process by doing all of this work in the background before we hit F5, and keeps the state around between multiple runs of the application.
for partial trust debugging
To simulate a partial trust environment within Visual Studio under the debugger would require special initialization of the AppDomain. This is handled by the hosting process
for design time expression evaluation
To test code in the application from the immediate window, without actually having to run the application. The hosting process is used to execute code under design time expression evaluation.

Possible Resolutions

Generally, it would be to figure out if the issue is specifically because of Visual Studio hosting process or there are other issues at play interacting with vshost.

Scenario 1:

It’s 64 bit OS, app is configured to build as AnyCPU, yet we get an error

Try:
32 bit/64 bit issues usually plays a role in relation to OS features and locations that are different. There is a setting in Build configuration that drives the debugger behavior when it is setup for AnyCPU. You need to turn off (un-tick checkbox) the Prefer 32 bit flag to run in 64 bit mode.

Now, even with above change, we can face issues that fall into 32/64 bit region. This is where vshost is still playing a role. Irrespective of above, flag vshost continues to work in 32 bit mode (platform config AnyCPU). Now, calls to certain APIs can be affected when the hosting process is enabled. In these cases, it is necessary to disable the hosting process to return the correct results. Details about how to turn it off in Debug tab: How to: Disable the Hosting Process

With above changes, AnyCPU configuration would be equivalent to the app as platform target x64 configuration.

Scenario 2:

Application is configured to build as x86 (or AnyCPU)

Try:
If the workflow is related to a third party, for 32 bit applications, use 32 bit runtime, irrespective of the OS being 32 bit or 64 bit.

Scenario 3:

Application is throwing an error for a specific code work flow that involves unmanaged assembly

Try:
If the workflow includes an interop call to an external assembly (unmanaged code that is executed outside the control of CLR), there might be incorrect usage of the function all. I have seen examples where a wrong return type can cause a vshost error. Return type of the external DLL cannot be string, it must be IntPtr.

[DllImport("Some.dll", CallingConvention = CallingConvention.Cdecl)]
private static extern IntPtr SomeMethod();

Scenario 4:

Application is throwing an error for a specific code work flow that is in realms of managed code (by CLR)

Try:
It could be that the process is taking time while executing that particular workflow. If the process is busy for a long time, it can throw an error. One of the solve would be to try the entire long operation on a BackgroundWorker thread and free up UI thread.

Conclusion

We can turn off the vshost as long as we are okay without it. It always helps to have same debugging environment (32/64 bit) as the app is expected to run in. We should be cognizant of the operations done with third party assemblies or unmanaged ones and have the right set of code/files interacting with application.

Happy troubleshooting!

Samples GitHub Profile Readme

Decode with Google

August 29, 2020August 10, 2021Sandeep Mewara Leave a comment

For details on Decode with Google 2021, please go here: Decode with Google 2021

Couple of days back, I received an email from Google about an upcoming virtual event on 11th & 12th September 2020. All the details of the event are here: Decode with Google

Looks like, theme of the event is: Innovations contributing to Technology in India

At Google, we’re always excited by the potential of technology to solve large-scale, real world problems. Decode with Google is an opportunity for techmakers, entrepreneurs, and academia to get a sneak peek at some of the toughest and most interesting challenges that Googlers in India are solving for.
Event highlight shared by Google

If topics resonate, feel free to register and join for the virtual edition of Decode with Google. Seems Google Tech leaders would share highlights about what they are working on, the state of AI and the opportunities it presents.

Happy connecting!

GitHub Profile Readme Samples

Sandeep Mewara Github
Sandeep Mewara Learn By Insight

GitHub profile README a rage!

August 23, 2020September 25, 2020Sandeep Mewara Leave a comment

I believe this feature was opened up recently (last month) and ever since it has been a rage. People have been updating their GitHub profile with a special repository using a README.md.

I came to know about it from the tech community I am part of. I used the following video that explains about how to set it up: https://www.youtube.com/watch?v=-otyb0ngsa4. Later, I checked and found many posts sharing the similar thing. This feature has piggy-backed on a convention (of README.md file) that is familiar to majority of GitHub users.

I went ahead and tried out myself. It supports markdown and thus it makes it easy to have content that has nice visual affects (yes, images and gifs are allowed!). README contents are placed above pinned repositories and thus prominently visible.

sandeep-mewara-github-repo — https://github.com/sandeep-mewara

After setup, looking at it, a profile-level README seems like a great idea. This is another good opportunity (right next to your code repository) to let everyone know about yourself and showcase some highlights you feel important.

I read we can keep it auto-updating (like with our recent blog entries) using GitHub Actions. I am going to try that next. Go ahead and try out for yourself.

I have created my own repo for everyone to add/share their profile samples with the world. Please contribute and raise PR: https://github.com/sandeep-mewara/github-profile-README-samples

I found the official detailed documentation about it here: Setting up and managing your GitHub profile

Keep exploring!

Python as statistics workbench

August 22, 2020August 30, 2020Sandeep Mewara Leave a comment

While reading for AI/ML (Artificial Intelligence/Machine Learning), I came across a discussion – if Python can be used as a “statistics workbench” to replace R, SPSS, etc? It was nice shareout by multiple knowledge folks related to languages used for problems of statistics, specifically R (read about R here).

Discussion here: https://stats.stackexchange.com/questions/1595/python-as-a-statistics-workbench

For quick reference, I will quote few of the latest thoughts from there that are in favor of Python and how it has evolved. I too conquer with most of them:

1. Python is easily the most intuitive syntax of any programming language. This makes for extremely fast development time.
2. Python is performant. It opens large datasets reliably.
3. The packages in Python are fast catching up to R’s packages. Python usage has increased tremendously last few years.
4. Readability is one of the most important qualities good code can possess, and Python is one of the most readable language.
5. Python has an extremely well-thought-out IDE now: PyCharm & Visual Studio Code.
https://stats.stackexchange.com/a/457753

Overall, Python is a general purpose language with an easy to understand syntax which would be relatively easier for usual programmers to learn/adopt. R is developed keeping statisticians in mind. Thus it has many features around data visualization and is a tad ahead currently.

A little research …

Recently DataCamp too published an article comparing R and Python for data analysis. There is a nice comparison in it on various parameters, picking just couple of them here:

Final analysis in the paper shares R being ahead in comparison for data analysis but Python having potential to catch up quickly and easily.

My thoughts …

My intent was to understand which of the programming language serves as an essential tool to demonstrate AI/ML capabilities. Looking at them, Python seems good enough for me to serve as AI/ML tool to start and probably conquer it.

Ammunition needed …

There are many python based libraries and packages that are generally used for statistical work. Below are few of them that would help in our data analysis exploration going ahead:

scipy – python-based ecosystem of open-source software for mathematics, science, and engineering.
- cookbook – many statistical facilities, a collection of various user-contributed recipes already available
- numpy – base N-dimensional array package. Handful of example lists here
- pandas – a fast, powerful, flexible and easy to use data analysis and manipulation tool
- matplotlib – a comprehensive library for creating static, animated, and interactive visualizations
scikit-learn – simple and efficient machine learning tools for predictive data analysis
keras – API for deep learning
tensorflow – API to develop and train ML models

Since I am a programmer, I maybe be biased here. But, it seems Python can and does all the needful to start with AI/ML journey.

Happy learning!

NumPy – Basics & Examples

August 15, 2020September 27, 2020Sandeep Mewara Leave a comment

This is to get started with NumPy and try few concrete examples. NumPy (Numerical Python) are packages for numerical computation designed for efficient work on large data sets.

Entire Jupiter notebook can be downloaded or forked from my GitHub to play around: https://github.com/sandeep-mewara/python-examples

Reference: https://numpy.org/learn/

NumPy basics includes:

Initialize Matrix via
- List
- NULL Matrix
- IDENTITY Matrix
- ONES Matrix
Matrix Transpose
Matrix Indexing
Simulation
Basic CSV file operations
Matrix Broadcasting
Basic Image Processing

# matrix in python is list of a list
# arrays are compatible for broadcasting when the trailing dimensions match or either of them is of length 1
# image when read as numbers, the values are between 0 & 1
Key learning’s …

Examples notebook includes:

Random walk simulation
Triangle simulation
Random Number
Correlation co-efficient
Mean/Variance of crude oil

# masking helps get all the values back that satisfy the mask
# cumsum() is a handy function for cumulative sum
# there are handy methods for random number generation
Key learning’s …

For learning more about NumPy, look here: https://numpy.org/doc/stable/

Keep learning!

Kubernetes – Evolution of application deployment

August 9, 2020September 25, 2020Sandeep Mewara Leave a comment

Kubernetes (K8s) is turning out as the cutting-edge of application deployment. It is becoming core to the creation and operation of modern software (few call it as modern SaaS). Thus, I planned to look into it and see what Kubernetes is and how/what application design will help adapt it in the application deployment evolution.

Kubernetes is a portable, extensible, open-source platform for automating deployment, scaling, and management of containerized applications.

History

Google originally designed and open-sourced the Kubernetes project in 2014. Kubernetes has inputs from over 15 years of Google’s experience to run production workloads at scale with best ideas and practices from the community. It is maintained by the Cloud Native Computing Foundation now. It’s current development repository is here.

First challenge …

With modern goal parameters like: recoverability, release cycle time & release frequency – applications need to be designed and deployed in a way that makes them improve year over year.

This leads to first step of breaking the monolith into microservices such that the changes and impact are compartmentalized for easy deployment and recovery.

A monolithic application puts all it’s functionality in a single process. In need of scaling, it replicates entire monolith on multiple servers. On the other hand, a microservice architecture separates out (keeps) each functionality into a separate service. Thus in case of scaling need, these services are distributed across servers as required.

Second challenge …

With multiple microservices in play, a variance of stack versions or deployment styles kicks in as trouble. Each team would have their own set of tools, versions to build the artifacts, store them and then deploy them. Thus, different applications/services can have different patterns and network topology. This in turn makes managing security and infrastructure more challenging.

This leads to the step of abstracting infrastructure out to ease maintenance and relieve from security and other infrastructure related concerns.

deployment-progression — Deployment scheme evolution

Traditional: Applications running on a physical server. No way to define resource boundaries for applications.
Virtualization: Allows to run multiple Virtual Machines (VMs) on a single physical server’s CPU. This leads to better utilization of resources and better scalability as an application can be added or updated easily. Also, if needed, applications can be isolated between different VMs to provide a level of security.
Containers: Like VM, it has its own filesystem, CPU, memory, process space, etc. Are environment consistent, easy to scale, portable across clouds and OS distributions. This leads to loosely coupled setup where application is totally decoupled from infrastructure and makes it easy to move towards smaller, modular microservices.

Containers are abstraction to next level. It does not matter on which OS you are on (although there could be different containers for different OS and how they work underlying), all we need is to package our code and needed libraries together, which then runs inside a container based on configured resource need. Docker is an example of container runtime, a packaging software.

Final challenge …

So, the packaging has been simplified and running the application on a single node has been simplified. When we move to enterprise, we need to scale up/down our containers on need basis automatically. Further, one would scale the application to be served from multiple servers instead of just one for better load distribution and easy recovery/fail safe. Now, while distributing the load, we would need to ensure the availability of nodes, resources like space on node for running a container, etc.

This is where Kubernetes pitch in. It acts as a container orchestrator that help provides with a framework to run distributed systems resiliently. It takes care of scaling and failover of containers having application, provides deployment patterns, and more.

Kubernetes has master-slave architecture where there is one master node and multiple worker nodes. A Pod is the smallest deployable unit in it. In order to run a single container, we would need to create a Pod for that container. A Pod can contain more than one container if those containers are relatively tightly coupled (like a container to download all secret configs related before application starts in other container).

API Server is the heart of the architecture. User interacts with Kubernetes via it and master node communicates to worker nodes through it. Number of containers requested is stored in the etcd (key-value store). Controller acts as a manager that keeps a constant check on the store, schedules the request for scheduler to pick and execute, spins of another worker node in case of need.

Wrap Up …

I have just touched the surface of both containerization and Kubernetes. They seem to have much more and can be explored in depth. Along with vast benefits, it can also bring new challenges on the table with moving to cloud like security and networking.

It was good to know how application design and deployment are evolving, getting abstracted and loosely coupled.

Keep learning!

Reference: https://kubernetes.io/docs/home/

GitHub Readme Samples

Troubleshoot: Kafka setup on Windows

August 1, 2020August 30, 2020Sandeep Mewara Leave a comment

Recently, I did a setup of Kafka on a windows system and shared a Kafka guide to understand and learn. I was using a Win10 VM on my MacBook. It was not a breeze setup and had few hiccups on the way. It took some time for me to resolve them one after another looking around on web. Collating all of them here for quick reference.

ERROR #1

When:
I tried to start Zookeeper.

Command:
zookeeper-server-start.bat config\zookeeper.properties

Error:
java.lang.IllegalArgumentException: config/zookeeper.properties file is missing

Stack trace:

INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2014-08-21 11:53:55,748] FATAL Invalid config, exiting abnormally (org.apache.zookeeper.server.quorum.QuorumPeerMain)
org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing config/zookeeper.properties
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:110)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:99)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:76)
Caused by: java.lang.IllegalArgumentException: config/zookeeper.properties file is missing
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:94)
    ... 2 more

How I solved?
It was clearly the case of relative path. config/zookeeper.properties was at two roots lower than where the start up script was. Either I had to correct the level or use an absolute path to move ahead.

zookeeper-server-start.bat C:\Installs\kafka_2.12-2.5.0\config\zookeeper.properties
rem OR relative path option below

zookeeper-server-start.bat ../../config/zookeeper.properties

ERROR #2

When:
Zookeeper is up and running. Attempted to start Kafka server and it failed.

Command:
kafka-server-start.bat C:\Installs\kafka_2.12-2.5.0\config\server.properties

Error:
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING

Stack trace:

........
........
2020-07-19 01:20:32,081 ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer) [main]
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:268)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:264)
at kafka.zookeeper.ZooKeeperClient.(ZooKeeperClient.scala:97)
at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1694)
at kafka.server.KafkaServer.createZkClient$1(KafkaServer.scala:348)
at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:372)
at kafka.server.KafkaServer.startup(KafkaServer.scala:202)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:75)
at kafka.Kafka.main(Kafka.scala)
2020-07-19 01:20:32,088 INFO shutting down (kafka.server.KafkaServer) [main]
2020-07-19 01:20:32,105 INFO shut down completed (kafka.server.KafkaServer) [main]
2020-07-19 01:20:32,106 ERROR Exiting Kafka. (kafka.server.KafkaServerStartable) [main]
2020-07-19 01:20:32,121 INFO shutting down (kafka.server.KafkaServer) [kafka-shutdown-hook]

How I solved?
Investigation lead to increasing the timeout settings for Kafka-Zookeeper. Because of environment settings (RAM, CPU, etc), it turns out this plays some role.
I updated the ${kafka_home}/config/server.properties file:

# Timeout in ms for connecting to zookeeper (default it was 18000)
zookeeper.connection.timeout.ms=36000

I read many other reasons for this error (did not look applicable to my case) like:
1. zookeper service not running
2. restarting system
3. zookeper is hosted on zookeeper:2181 or other server name instead of localhost:2181

ERROR #3

When:
Zookeeper is up and running. Attempted to start Kafka server and it failed.

Command:
kafka-server-start.bat C:\Installs\kafka_2.12-2.5.0\config\server.properties

Error:
java.lang.OutOfMemoryError: Map failed OR java.io.IOException: Map failed

Stack trace:

.......
.......
java.io.IOException: Map failed
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:944)
        at kafka.log.AbstractIndex$$anonfun$resize$1.apply(AbstractIndex.scala:115)
        at kafka.log.AbstractIndex$$anonfun$resize$1.apply(AbstractIndex.scala:105)
        at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
        at kafka.log.AbstractIndex.resize(AbstractIndex.scala:105)
        at kafka.log.LogSegment.recover(LogSegment.scala:256)
        at kafka.log.Log.kafka$log$Log$$recoverSegment(Log.scala:342)
        at kafka.log.Log.recoverLog(Log.scala:427)
        at kafka.log.Log.loadSegments(Log.scala:402)
        at kafka.log.Log.&lt;init>(Log.scala:186)
        at kafka.log.Log$.apply(Log.scala:1609)
        at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$5$$anonfun$apply$12$$anon
fun$apply$1.apply$mcV$sp(LogManager.scala:172)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1
149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:
624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.OutOfMemoryError: Map failed
        at sun.nio.ch.FileChannelImpl.map0(Native Method)
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:941)
        ... 17 more

How I solved?
It turned out related to Java heap size. I made a change in the Kafka startup script file: ${kafka_home}/bin/windows/kafka-server-start.bat

IF NOT ERRORLEVEL 1 (
        rem 32-bit OS
        set KAFKA_HEAP_OPTS=-Xmx512M -Xms512M
    ) ELSE (
        rem 64-bit OS
        rem set KAFKA_HEAP_OPTS=-Xmx1G -Xms1G => Commented this
        rem added this below line
	set KAFKA_HEAP_OPTS=-Xmx512M -Xms512M
    )

Though, while looking for solution, quite a few also solved it up upgrading their Java from 32bit to 64bit application. I did not try this solution as had other Java setup dependencies on my system that I wanted to keep intact.

ERROR #4

When:
I tried to delete Kafka topic because I was having problems while pushing message from Producer

Command:
kafka-topics.bat --list --bootstrap-server localhost:9092 --delete --topic my_topic_name

Error:
Topic test is already marked for deletion

Stack trace:

Topic test is marked for deletion.
Note: This will have no impact if delete.topic.enable is not set to true.

How I solved?
I enabled topic deletion configuration. It needs to be set as delete.topic.enable = true in file ${kafka_home}/config/server.properties. Restarted the server post updating the config.

# Delete topic enabled
delete.topic.enable=true

ERROR #5

When:
Zookeeper & Kafka is up and running. I get an error when I try to create a Topic.

Command:
kafka-topics.bat --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic testkafka

Error:
org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment

Stack trace:

Error while executing topic command : org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
[2020-07-19 01:41:35,094] ERROR java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
    at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
    at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
    at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
    at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
    at kafka.admin.TopicCommand$AdminClientTopicService.createTopic(TopicCommand.scala:163)
    at kafka.admin.TopicCommand$TopicService.createTopic(TopicCommand.scala:134)
    at kafka.admin.TopicCommand$TopicService.createTopic$(TopicCommand.scala:129)
    at kafka.admin.TopicCommand$AdminClientTopicService.createTopic(TopicCommand.scala:157)
    at kafka.admin.TopicCommand$.main(TopicCommand.scala:60)
    at kafka.admin.TopicCommand.main(TopicCommand.scala)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
 (kafka.admin.TopicCommand$)

How I solved?
For once it worked for me as is but when I tried again later, I kept getting this error. While looking on web, suggestions were to enable listener and set it up like: listeners=PLAINTEXT://localhost:9093 in the server config file.

Before attempting this, I rebooted my system as it was little sluggish too. Turns out, mostly it was memory issue. I was in a Windows VM and probably it was craving for memory space. Without a change, things worked fine as is for me.

ERROR #6

When:
This was during another instance of Kafka setup (from start) in few days. Zookeeper is up and running. Attempted to start Kafka server and it failed.

Command:
kafka-server-start.bat C:\Installs\kafka_2.12-2.5.0\config\server.properties

Error:
It was around logs or lock file.

How I solved?
Looking at details, it hinted me to look into pre-exisiting (something related to my previous setup). I went ahead and deleted the logs and data folder that was auo created when I moved ahead with the entire process setup. Post this, the error was gone. Believe my server shutdown was not smooth and thus something was interferring with the current startup.

Hope these would help. Keep learning!

CodeProject