Tip - HttpUtility.UrlEncode vs WebUtility.UrlEncode | Conrad Akunga

In this world of websites and APIs you will invariably be called upon to build URLs, either manually or through code.

For example, a Google search typically looks like this:

https://www.google.com/search?q=wildlife

The interesting stuff is after the q=

(There’s some other stuff but I have trimmed that for simplicity.)

If you were to build a query programmatically and then search, you would likely do it like so:

using System.Net.Http;

// Fetch the search results page as a string
var client = new HttpClient();
var response = await client.GetStringAsync("https://www.google.com/search?q=wildlife");
Console.WriteLine(response);

This will print the results to the console.

The challenge comes when you want to search for something like ng'ang'a.

The apostrophe ' is a reserved character in URLs, so it is not safe to drop it into a query string as-is.

You will need to convert it to the appropriate encoding for a URL, which happens to be the string %27.

So the URL should be like so:

https://www.google.com/search?q=ng%27ang%27a
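The %27 is not arbitrary: percent-encoding writes each byte of a character as % followed by its two-digit hexadecimal value. Here is a minimal sketch of that idea; PercentEscape is a hypothetical helper written only for illustration (it is not part of .NET) and assumes a single, non-surrogate character.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper: writes each UTF-8 byte of the character as %XX.
static string PercentEscape(char ch) =>
    string.Concat(Encoding.UTF8.GetBytes(new[] { ch }).Select(b => $"%{b:X2}"));

Console.WriteLine(PercentEscape('\'')); // %27 (apostrophe is byte 0x27)
Console.WriteLine(PercentEscape(' '));  // %20 (space is byte 0x20)
```

This is only meant to show where the values come from; real encoders also decide which characters to leave untouched.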

The question becomes how do we know that ' is %27 and a space is %20?

We don’t need to.

There is a class that helps with this - the HttpUtility class.

Within this class is a UrlEncode method.

So your code becomes this:

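using System.Web; // HttpUtility
using Serilog;    // Log (assumes a Serilog logger has already been configured)

var query = "ng'ang'a";
// Percent-encode the characters that are unsafe in a query string
var encodedQuery = HttpUtility.UrlEncode(query);
Log.Information("{query} has been encoded to  {result}", query, encodedQuery);
var client = new HttpClient();
var response = await client.GetStringAsync($"https://www.google.com/search?q={encodedQuery}");
Console.WriteLine(response);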

If you run this code, just before the results you should see a line like this:

[17:15:53 INF] ng'ang'a has been encoded to  ng%27ang%27a

To move the other way from encoded text to the actual text, there is a reverse method - HttpUtility.UrlDecode.
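As a quick illustration of the round trip, the sketch below encodes the same query and decodes it again. It assumes a project where System.Web.HttpUtility is available.

```csharp
using System;
using System.Web;

var original = "ng'ang'a";
var encoded = HttpUtility.UrlEncode(original); // ng%27ang%27a, as in the log line above
var decoded = HttpUtility.UrlDecode(encoded);

Console.WriteLine(encoded);
Console.WriteLine(decoded);
Console.WriteLine(decoded == original); // True: decoding restores the original text
```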

I had been using this class for years. Then the other day, during a design session, this very problem came up and a team member responded, “Use the WebUtility class.”

This class was not familiar to me, so off to the documentation I went.

On the surface it appears identical to HttpUtility. It even has UrlEncode and UrlDecode methods that function the same.

Or do they?

Let us use a more robust query:

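using System.Net; // WebUtility
using System.Web; // HttpUtility

var query = "!@#$%^&*()_+|}{:?><";
var httpEncoded = HttpUtility.UrlEncode(query);
var webEncoded = WebUtility.UrlEncode(query);

// Print both results, one per line, for comparison
Console.WriteLine(httpEncoded);
Console.WriteLine(webEncoded);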

If we run this we get the following result:

!%40%23%24%25%5e%26*()_%2b%7c%7d%7b%3a%3f%3e%3c
!%40%23%24%25%5E%26*()_%2B%7C%7D%7B%3A%3F%3E%3C

These look the same. BUT THEY AREN’T.

We need to do a diff to see the difference.

Comparing both results in WinMerge makes it visible:

The values may be the same, but the cases are not. WebUtility appears to uppercase the encoded values whereas HttpUtility appears to lowercase them.
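If you do not have a diff tool at hand, the same observation can be sketched in code by comparing the two strings with and without case sensitivity. This assumes both classes are available in your project.

```csharp
using System;
using System.Net;
using System.Web;

var query = "!@#$%^&*()_+|}{:?><";
var httpEncoded = HttpUtility.UrlEncode(query);
var webEncoded = WebUtility.UrlEncode(query);

// Exact, case-sensitive comparison
var sameExact = string.Equals(httpEncoded, webEncoded, StringComparison.Ordinal);
// Comparison that ignores letter case
var sameIgnoringCase = string.Equals(httpEncoded, webEncoded, StringComparison.OrdinalIgnoreCase);

Console.WriteLine(sameExact);
Console.WriteLine(sameIgnoringCase);
```

With the two results shown earlier, the first comparison prints False and the second prints True, which confirms that only the case differs.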

Both will decode to the same value, so there is no concern about that.
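A small sketch to check this: each class decodes the output of the other and gets back the original query.

```csharp
using System;
using System.Net;
using System.Web;

var query = "!@#$%^&*()_+|}{:?><";

// Encode with one class, decode with the other
var fromHttp = WebUtility.UrlDecode(HttpUtility.UrlEncode(query));
var fromWeb = HttpUtility.UrlDecode(WebUtility.UrlEncode(query));

Console.WriteLine(fromHttp == query); // True
Console.WriteLine(fromWeb == query);  // True
```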

However, if you have any unit tests and for whatever reason you are using both classes in your code, you might find the tests failing due to the difference in case between the encoded values.
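One possible way around this in tests is to normalize the escapes before comparing. The sketch below uses a hypothetical helper, NormalizeEscapes (my own name, not a .NET API), that uppercases only the %XX sequences and leaves the rest of the string alone. It handles the case difference only, and is checked here against the sample query alone.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;
using System.Web;

// Hypothetical helper: uppercases the hex digits of every %XX escape.
static string NormalizeEscapes(string encoded) =>
    Regex.Replace(encoded, "%[0-9a-fA-F]{2}", m => m.Value.ToUpperInvariant());

var query = "!@#$%^&*()_+|}{:?><";
var httpNormalized = NormalizeEscapes(HttpUtility.UrlEncode(query));
var webNormalized = NormalizeEscapes(WebUtility.UrlEncode(query));

Console.WriteLine(httpNormalized == webNormalized); // True for this query
```

Simpler still is to pick one of the two classes and use it everywhere, or to compare the decoded values rather than the encoded ones.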

Which raises the question: why are there two classes doing essentially the same thing at all?

HttpUtility is in the System.Web namespace, which you will have to include from System.Web.HttpUtility.dll. This comes for free if you are building a web or API application. If you are building a console, class library or desktop application, you will need to explicitly add it to your project references.

WebUtility is in the System.Net namespace and ships in System.Runtime.dll, so you don’t need to add any additional DLLs to your project, regardless of its type.
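For example, a console application needs nothing beyond the System.Net namespace to encode and decode a query. This is a minimal sketch, not a recommendation of one class over the other.

```csharp
using System;
using System.Net; // WebUtility lives here; no System.Web reference is used

var query = "ng'ang'a";
var encoded = WebUtility.UrlEncode(query);
var decoded = WebUtility.UrlDecode(encoded);

Console.WriteLine(encoded);
Console.WriteLine(decoded == query); // True: the round trip restores the query
```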

In fact, the documentation for HttpUtility says as much: it recommends using the WebUtility class to encode or decode values outside of a web application.

OK, so why wasn’t HttpUtility dropped, with all applications moving to WebUtility?

I would imagine the issue here would be backward compatibility.

HttpUtility has been part of the .NET Framework from version 1 - the very beginning.

WebUtility is a younger member of the family.

Happy hacking!