Users see the UI, not the code

UI design is hard. Like, it’s way hard. And it’s also a very important piece of the software puzzle. In fact, some might say it’s the most important piece because to users, it is the software:

A good user interface is one of the most important aspects of an enterprise product. While the underlying architecture is extremely important to deliver the functionality, to the end-user, the interface is the product. They don’t know, (and don’t care, usually,) of what goes on behind the scenes, other than that they expect things to work. Every way they interact with the product is through the interface.

When a user opens an app, they see the interface. They don’t see the code behind it, the layers, the interfaces, the helper libraries; they see the UI. That is the software. If you perform massive technical improvements but leave the UI the same, no one will notice. This is why the interface is so critically important, but also why it’s one of the hardest things to do in software. Designing an interface that both looks good and is intuitive to all users takes effort and skill, and is something that Microsoft, Google and even Apple have yet to fully master.

Look, I am in no way a master at UI design. I kind of suck at it. But I can get by if I have to, and one thing that helps me as I’m working is to ask myself this question:

If I were a user, how would I expect this to work?

You are a user of many more pieces of software than you will ever write yourself. You, like everyone, will have expectations of how something should function. So tap into those experiences. Put yourself in the shoes of a user and design the feature as you think it should work. Think about the different reasons a user would use this feature and the goals they might want to achieve while using it. Try to come up with something that minimizes the pain of accomplishing those goals. Chances are you’ll come up with something better than these.

Migrating from SVN to Git; How we did it

My team migrated from SVN to Git about 3 months ago. After a few tweaks, a few bugs and a little elbow grease we’ve been stable ever since. And you know what? It was one of the best moves we’ve ever made. Developers are more efficient and we have finally documented and streamlined our release workflow.

I was in charge of handling the migration. That included setting up an internal Git host, migrating the SVN repositories over, documenting the new Git development process and training the developers. One important requirement was that we couldn’t stop active development during the migration – developers always had to have a place to commit code.

Technical Side – What tools were used and how was it done?

Stash

Due to legal reasons, we weren’t able to make use of popular Git hosts such as GitHub or Bitbucket, so we needed to find an internal hosting solution. We looked at many open source hosts, along with GitHub Enterprise, and finally determined that Atlassian Stash was the best option for us. It offered most of the features we desired – internally hosted, pull requests, HTTP/HTTPS/SSH access, and the ability to connect with Active Directory – and was corporate backed and reasonably priced.

Other than the setup being archaic, Stash was up and running within 20 minutes. Configuration was relatively trivial, and mostly involved permissions and user setup. We hooked up Stash with Active Directory so all employees can log in using their domain accounts. This reduces the number of username/password pairs everyone has to remember which, imo, is a really good thing.

Initial SVN Migration

The initial SVN migration went smoothly, with few hiccups. We followed the steps I laid out in my previous post, Migrating from SVN to Git, with one small caveat – after we performed the first fetch, we left SVN as the primary repository, and all code was still committed there. We set the permissions on the Git server to be readonly so that developers could clone the repositories and get familiar with Git, and so that we could confirm that there were no connection or permission issues with Stash. Every day we performed a fetch from the SVN repository and pushed the changes up to Git to keep things up to date. We left this process in place for about a week; once we confirmed there were no issues, and all devs had some sort of Git client they liked, we switched over. (As a side note, if we had many more repositories, and/or were going to leave this process in place for longer than a week, I would have set up a job to run daily (perhaps hourly?) to perform the fetch. If you’re in this boat, I recommend you do that using Powershell or similar, unless you like performing the same monotonous task every morning, in which case go for it.)

When we were ready to shut off SVN, we had all developers commit any pending changes to SVN, then we switched the repository to be readonly. We performed one final fetch/push from SVN, and opened up the Git server to the world. (Okay, opened to our office, but whatever.)

Tools

So we set up Stash on the server, but what clients did we use? We are a Microsoft shop, and as such have a mix of SourceTree, Posh-Git and bash. We didn’t really set any limitations on what client to use, as long as it works for the dev. (If you ask me, though, Posh-Git is the way to go. By a mile.)

Human Side – Git workflow, developer training and hiccups

Git Workflow

This is where the fun starts. Developers inevitably had questions, most of which I could answer but some of which we had to work out together. Most of these questions revolved around workflow – when do I branch, why do I branch, do I need to branch? Our SVN workflow was, well, not exactly much of a workflow. We had a develop branch, and most work went into that, and sometimes we would branch for features, but then we would have merge problems because SVN sucks at that, and then we’d release whenever from wherever, and… yeah. Not much of a workflow.

So, I took this opportunity to standardize our process, which is basically git-flow. We have a develop branch; all features get branched from there and merged back when they’re ready. When we decide to release, we branch into a release branch, perform fixes, and merge the production ready code into master. Hotfixes are branched off of master and merged back into master and develop. I laid this workflow out in a formal document that was available to everyone – developer or otherwise.

The fun part about documents, at least that I have found, is that nobody reads them. Ever. I still got a lot of questions about where to branch feature branches from, when to create a release branch, and where to release from. My answer, most of the time, was “read the documentation” (without being rude) to which I got a “what documentation?” response.

Training

So the next logical step was group training. I set aside 30 minutes to get all developers together and explain things – both about Git and the new workflow. We went over the differences and similarities between SVN and Git – what distributed means in practice, pushing, pulling, committing, stashing, adding items to the index, etc. And then we covered the new workflow (with pictures!) and how Stash helps formalize the process with Pull Requests and such.

The training was a huge success even with it only being a 30 minute session. Everyone was able to ask questions and get on the same page. I highly advise giving a formal presentation if you can with as many visual aids as possible. It’s much easier to understand a live, visual presentation over emails and a Word document.

Issues

We luckily didn’t run into any technical issues. The only slight issue we ran into was getting developers to follow the new workflow. Again, training pretty much mitigated this issue and everything was smoothed out in a matter of days. We have yet to have any technical issues.

Wrap Up

If you’re on the fence about making the switch to Git, I highly recommend it. There are many benefits with little to no drawbacks. We’ve only been using it for 3 months and I can already see an increase in productivity and quality of output. Formal Pull Requests have strengthened our peer reviews and having a strict release process has increased our quality. It has been one of the best decisions we’ve made as a team in a long time.

Migrating from SVN to Git

So you’ve done it – you’ve finally made the decision to switch to Git. SVN does some things very well, and has been a great source control system since its creation in 2000. But the features that Git brings – distribution, performance, easy branches, easy merges, stash – are hard to pass up. After you make the switch, you’ll probably wonder how you ever worked without it.

So how do you get all of your data, branches, tags and history into Git? Git includes an incredibly useful tool, git-svn, which is a bidirectional connection between Git and SVN. It allows you to pull commits from SVN and, if you so desire, push commits back to it. I recommend avoiding pushing back to SVN because, well, why would you? We’re here to switch, not combine! We’ll just use it to pull down all commit history, branches and tags from SVN.

The general workflow for this process is:

  • Initialize a Git repository with the SVN repository as a remote
  • Configure the user mapping between SVN and Git
  • Fetch from the SVN repo
  • Convert the SVN tags and branches into Git tags and branches
  • Push the repository to a bare repo on the Git host

To start, initialize a git repository with the svn repository as a remote:

> git svn init http://url.to.svn/ --stdlayout --prefix=svn/

The --stdlayout flag tells git-svn that the repository uses the standard trunk, branches and tags directories. The --prefix=svn/ option (note the trailing slash) places all branches and tags under svn/, which will make it easier to distinguish them later on. If you have a non-standard SVN layout (i.e. not named trunk, branches and tags), you can specify each of those with -T for trunk, -b for branches and -t for tags:

> git svn init http://url.to.svn/ --prefix=svn/ -T Trunk -b Branches -t Tags

Next, create an authors.txt file that maps the SVN usernames to the desired usernames in Git. The format is My Svn Username = My Name <myemail>. For instance:

davidzych = David Zych <dave@example.com>
johncandy = John Candy <john@example.com>
michaelscott = Michael Scott <michael@dundermifflin.com>
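A malformed line in this file is easy to miss, so it can be worth sanity-checking it before you start a long fetch. Here is a minimal sketch in Python that checks each line against the format described above. The function name and the regular expression are my own illustration, not part of git-svn.

```python
import re

# Shape described above: svn-username = Full Name <email>
AUTHOR_LINE = re.compile(
    r"^(?P<svn>[^=]+?)\s*=\s*(?P<name>[^<]+?)\s*<(?P<email>[^<>]+)>$"
)


def parse_authors(text):
    """Return {svn_username: (name, email)}; raise ValueError on a bad line."""
    authors = {}
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        match = AUTHOR_LINE.match(line)
        if match is None:
            raise ValueError(f"line {number} is not 'user = Name <email>': {line!r}")
        authors[match["svn"]] = (match["name"], match["email"])
    return authors


sample = """\
davidzych = David Zych <dave@example.com>
johncandy = John Candy <john@example.com>
"""
print(parse_authors(sample))
```

Any line that doesn't fit the format raises an error that names the offending line number.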

Once you have the authors file, configure Git-Svn to use the file when performing the fetch:

> git config svn.authorsfile ../authors.txt

You could also configure this in the global config if you have many repos to migrate and only want to specify it once.

It is at this point that I recommend switching the permissions on SVN to be readonly. This way, no one can commit to the repository while you’re performing the migration and no commits will be lost. Once you’ve done that, fetch from svn:

> git svn fetch

After what is probably going to be a long time, you’ll have a Git repository with a lot of commits, some funny looking commit messages, and some remote branches that correspond to your SVN branches. You might be feeling pretty good right now, but you have more work to do! If you run git branch -a, you’ll see all of your branches from SVN listed as remote branches:

> git branch -a
 * master
   remotes/svn/trunk
   remotes/svn/feature-123
   remotes/svn/feature-456

We need to take those remote branches and turn them into local Git branches. To do this you can run git branch branch-name remotes/svn/branch-name. If you only have a few, you can run that command manually for each branch and be done with it. If you have a lot, well, you can use Powershell or something similar to loop through the branches and automate it, or be like me and copy the branches into Excel, create a formula that generates the create branch statements and save those as a batch file and run it. I’m not an Excel fan but, hey, it works. However you want to do it, get it done.
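If Excel isn't your thing either, the same statements can be generated with a few lines of script. This is a rough sketch in Python, assuming the branch listing looks like the output above. The function name, the svn/ prefix default, and the choice to skip trunk (on the assumption that master already tracks it) and anything under tags/ are mine, so adjust to taste.

```python
def branch_commands(branch_listing, prefix="remotes/svn/", skip=("trunk",)):
    """Turn `git branch -a` output into `git branch <name> <remote-ref>` commands."""
    commands = []
    for line in branch_listing.splitlines():
        # Drop indentation and the "*" marker on the current branch
        ref = line.strip().lstrip("* ").strip()
        if not ref.startswith(prefix):
            continue  # local branches such as master
        name = ref[len(prefix):]
        if name in skip or name.startswith("tags/"):
            continue  # assumption: trunk is already master, tags are handled separately
        commands.append(f"git branch {name} {ref}")
    return commands


listing = """\
 * master
   remotes/svn/trunk
   remotes/svn/feature-123
   remotes/svn/feature-456
"""
for command in branch_commands(listing):
    print(command)
```

It only prints the commands, so you can review them before pasting them into a shell or saving them as a batch file.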

After branches, you can do the same thing with tags:

> git tag -a -m "Migrating SVN tag" tag-name refs/remotes/svn/tags/tag-name

Now you have all your branches and tags as local branches and tags in Git.

Next, add your bare remote Git repository as a remote (you did create one of those, right?).

> git remote add newrepo https://url.to.git/repo.git

And push everything up! Remember to specify --all to push all local Git branches, and perform a second push with --tags to push all tags.

> git push --all newrepo
> git push --tags newrepo

You now have all of your commits, branches and tags from SVN migrated to Git. Instead of attempting to clean out your Git-SVN hybrid repository, it’s probably easiest to perform a clean checkout of the new repository before you start working again:

> git clone https://url.to.git/repo.git

And, with that, you’re done! Enjoy your new Git repository!

Microsoft releases a preview of the .NET Compiler Platform, codenamed Roslyn

Microsoft released a public preview of the .NET Compiler Platform, codenamed Roslyn, on April 3rd, 2014. The code is available at http://roslyn.codeplex.com/ for you to bask in all of its glory.

You can clone the .NET Compiler Platform Git repository using this command:

git clone https://git01.codeplex.com/roslyn

Or install the Nuget Package:

Install-Package Microsoft.CodeAnalysis -Pre

What is the .NET Compiler Platform?

The .NET Compiler Platform is Microsoft’s effort to open source the C# and Visual Basic compilers. The code is released under the Apache License 2.0. From Codeplex:

Traditionally, compilers are black boxes — source code goes in one end, magic happens in the middle, and object files or assemblies come out the other end. As compilers perform their magic, they build up deep understanding of the code they are processing, but that knowledge is unavailable to anyone but the compiler implementation wizards. The information is promptly forgotten after the translated output is produced.

This is the core mission of the .NET Compiler Platform (“Roslyn”): opening up the black boxes and allowing tools and end users to share in the wealth of information compilers have about our code. Instead of being opaque source-code-in and object-code-out translators, through the .NET Compiler Platform (“Roslyn”), compilers become platforms—APIs that you can use for code related tasks in your tools and applications.

Microsoft took the original C# and Visual Basic compilers, which were written mostly in C++, and completely rewrote them in managed code. This means they were able to create a set of APIs that allow you to consume the code compilation and analysis results. There are currently 2 main APIs: the Compiler APIs and the Workspace APIs. It is worth noting that neither of these APIs has a dependency on Visual Studio, which means you can provide much of the same Visual Studio functionality in any application you want.

Compiler APIs

The Compiler API layer allows you to view information about the compilation process. This includes syntax and semantic information, errors, warnings, as well as access to files and information after compilation is complete. It provides Syntax Trees that display the structure and references between your code, Syntax Tokens which are the keywords, variables, etc in your code, and Syntax Trivia which is essentially the items that the compiler ignores such as whitespace and comments.

Workspace APIs

The Workspace APIs provide you information about the current project and solution, allowing quick and easy access to a vast array of information about the code. This assists in providing code analysis, refactoring, and Intellisense to the user. The Workspace API has a CurrentSolution property that gets updated whenever a change to the host environment occurs. This can be anything from typing a letter in a source file to saving a project.

Why is this cool?

Well, first off, it’s open source! The .NET Compiler Platform is part of Microsoft’s newly created .NET Foundation, which is a foundation created to help spur on development of open source technologies making use of .NET. Open source means that the community at large can review the code, provide bug reports and fixes, and can maintain the code even if Microsoft falls off the face of the earth. The fact that Microsoft open sourced these compilers means they are serious about their recent push to cultivate the open source .NET community.

This is also awesome because it means that creating code analysis tools is much, much easier. Like, an order of magnitude easier. Like, I might even be able to do it. Until now, developers of tools like JetBrains’ ReSharper, Telerik’s JustCode, and even Visual Studio itself have had to write their own code that is essentially a duplicate of the existing compiler code. Roslyn allows them to tie into existing operations and make use of the analysis and syntax trees the compiler already has.

If you don’t want to create a full blown productivity extension, you can still write small extensions that provide new warnings and errors to the compiler. Or create a new refactoring extension that finds duplicate code through an entire solution. Or a tool that finds all comments in your solution and outputs a documentation file. Or create an analysis tool that provides your method’s Kevin Bacon Number!

Now what?

Remember, this is just a preview release. Microsoft hasn’t provided a final release date, but I don’t expect it to be anytime soon. For now, go play with it! Look through the code, download the source, have fun! If nothing else, it’s a great look into the C# and Visual Basic compilers.

Now, if you’ll excuse me, I’m going to go add Intellisense to Notepad.

Coloring your Posh-Git output

As a followup to my previous post, Coloring your Git output, if you use Posh-Git you can also edit the colors of the Git output by modifying the Posh-Git settings.

What is Posh-Git? It’s a fantastic set of Powershell scripts for Git. It provides tab completion plus information right in the prompt stating the currently checked out branch along with the working copy and index statuses.


The Posh-Git color settings can be changed using the $global:GitPromptSettings object. Here are the available properties you can set:

  • IndexForegroundColor
  • BranchForegroundColor
  • BranchAheadBackgroundColor
  • AfterBackgroundColor
  • BranchBehindForegroundColor
  • UntrackedBackgroundColor
  • AfterText
  • BeforeForegroundColor
  • WorkingForegroundColor
  • RepositoriesInWhichToDisableFileStatus
  • EnableWindowTitle
  • ShowStatusWhenZero
  • BeforeIndexForegroundColor
  • BeforeIndexBackgroundColor
  • BranchBackgroundColor
  • DescribeStyle
  • BeforeBackgroundColor
  • WorkingBackgroundColor
  • DelimText
  • UntrackedForegroundColor
  • DefaultForegroundColor
  • AfterForegroundColor
  • DelimBackgroundColor
  • Debug
  • BeforeIndexText
  • BranchAheadForegroundColor
  • DelimForegroundColor
  • UntrackedText
  • EnableFileStatus
  • IndexBackgroundColor
  • AutoRefreshIndex
  • BeforeText
  • BranchBehindAndAheadBackgroundColor
  • BranchBehindBackgroundColor
  • BranchBehindAndAheadForegroundColor
  • EnablePromptStatus

You also have a few more color options than the 9 that Git allows:

  • Black
  • Blue
  • Cyan
  • DarkBlue
  • DarkCyan
  • DarkGray
  • DarkGreen
  • DarkMagenta
  • DarkRed
  • DarkYellow
  • Gray
  • Green
  • Magenta
  • Red
  • White
  • Yellow

You can edit these by editing Posh-Git’s GitPrompt.ps1 file although it’s not recommended. If (and when) you update Posh-Git those settings will be overwritten. The better way is to edit your profile settings to set the colors on startup. Calling $profile at the Powershell prompt will display the location of your profile file; open it to edit your Powershell profile. You’ll see a line in there that initializes Posh-Git:

. 'C:\tools\poshgit\dahlbyk-posh-git-c481e5b\profile.example.ps1'

You should place any customizations after that line:

$global:GitPromptSettings.WorkingForegroundColor    = [ConsoleColor]::Yellow 
$global:GitPromptSettings.UntrackedForegroundColor  = [ConsoleColor]::Yellow


Now git nuts!

Coloring your Git output

Do you sometimes have a hard time viewing the output of a Git command? Updating the colors might help! In Git, you can edit the config to change the color of the output. You can set colors per repository or globally. We’ll focus on the global config here. The global config can be edited either by using the git config --global command, or by editing your global .gitconfig file.

Starting with Git 1.8.4, you can set color.ui auto which will color the output with the default colors. You’re also able to set the colors manually if you’re so inclined. You are able to edit the colors of the status, diff, and branch commands.

There are 9 colors available:

Color
normal
black
red
green
blue
yellow
cyan
magenta
white

If you choose to use the git config --global command, you edit the color.{command}.{property} property. For instance, to change the color of the untracked files listed in the status command to yellow:

git config --global color.status.untracked yellow

If you choose to edit the global file manually, the .gitconfig file can be found at these locations:

OS                  Path
Windows (Vista up)  C:/Users/{username}/.gitconfig
Mac                 $HOME/.gitconfig
Linux               ~/.gitconfig

When editing the file, add a new [color] section for the command you want to edit followed by a list of the properties and colors.

[color "diff"]
    meta = yellow
    frag = magenta
    old = red
    new = green

[color "status"]
    added = yellow
    changed = green
    untracked = red

[color "branch"]
    current = green
    local = white
    remote = red
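The two approaches are interchangeable: every entry in the file corresponds to one color.{command}.{property} command. As a rough illustration, here is a small Python sketch that turns a set of color choices into the equivalent git config --global commands. The function name is made up for this example, and it only accepts the nine color names listed above.

```python
# The nine color names listed earlier in this post.
GIT_COLORS = {"normal", "black", "red", "green", "blue",
              "yellow", "cyan", "magenta", "white"}


def color_commands(settings):
    """Build `git config --global color.<command>.<property> <color>` commands."""
    commands = []
    for command, properties in settings.items():
        for prop, color in properties.items():
            if color not in GIT_COLORS:
                raise ValueError(f"unknown color {color!r} for color.{command}.{prop}")
            commands.append(f"git config --global color.{command}.{prop} {color}")
    return commands


settings = {
    "status": {"added": "yellow", "changed": "green", "untracked": "red"},
    "branch": {"current": "green", "local": "white", "remote": "red"},
}
for line in color_commands(settings):
    print(line)
```

The sketch only prints the commands; run them yourself once you're happy with the colors.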

Writing your own Convert.ToBase64String in C#

Have you ever wondered what Base64 is? How it works? Why you need it? Have you ever wanted to write your own Base64 encoder? Well, my friend, you are in luck because that’s what we’re talking about today. To get started…

What is Base64?

Base64 is a common way to convert binary data into a text form. This is commonly used to store and transfer data over media that was designed to store and transfer only text, such as including an image in an XML document.

It works by converting the data into a base-64 representation and displaying it using a common character set. The most common character set used is A-Z, a-z, 0-9, + and /, although different implementations can use different character sets. The goal is to use a common set of characters that can be represented in most encoding schemes. Here’s the index table of the most common set:

Index Character
0 A
1 B
2 C
3 D
4 E
5 F
6 G
7 H
8 I
9 J
10 K
11 L
12 M
13 N
14 O
15 P
16 Q
17 R
18 S
19 T
20 U
21 V
22 W
23 X
24 Y
25 Z
26 a
27 b
28 c
29 d
30 e
31 f
32 g
33 h
34 i
35 j
36 k
37 l
38 m
39 n
40 o
41 p
42 q
43 r
44 s
45 t
46 u
47 v
48 w
49 x
50 y
51 z
52 0
53 1
54 2
55 3
56 4
57 5
58 6
59 7
60 8
61 9
62 +
63 /

How does it work?

It works by grouping the bits of the data into chunks of 24 bits, treating those as 4 chunks of 6 bits (sextets), converting each sextet into base 10 and looking up the corresponding character for that decimal number. Each 24 bit chunk is represented by 4 encoded characters.

For instance, to start encoding the first 3 characters of my name we first have to convert the letters into bytes, and the bytes into bits. In this instance, we’ll say the characters are encoded in ASCII. The byte representations for Dav are:

D: 68
a: 97
v: 118

Those numbers, written in 8 bit binary, are 01000100, 01100001, and 01110110 respectively. Group those together to form a 24 bit string and you get 010001000110000101110110.

Next, grab 4 sextets of bits, convert those to decimal and look up the corresponding character in the index table. 010001 is 17, 000110 is 6, 000101 is 5, and 110110 is 54. Looking those up in the index table gives the string RGF2. We just converted to Base64! Hooray!
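If you'd like to check that arithmetic without doing it by hand, here is a short sketch in Python (chosen only because it's compact; the C# version comes later in this post). It splits three bytes into four sextets and looks each one up. The sextets helper is just for this illustration, and the last line compares the result against Python's built-in encoder.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def sextets(three_bytes):
    """Split exactly 3 bytes (24 bits) into four 6-bit values."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles a full 3-byte group")
    # Write each byte as 8 binary digits and join them into one 24-bit string
    bits = "".join(format(byte, "08b") for byte in three_bytes)
    return [int(bits[i:i + 6], 2) for i in range(0, 24, 6)]


values = sextets(b"Dav")
encoded = "".join(ALPHABET[v] for v in values)
print(values)   # [17, 6, 5, 54]
print(encoded)  # RGF2
assert encoded == base64.b64encode(b"Dav").decode("ascii")
```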

Padding

But wait… we have a problem. What happens when the data we want to represent isn’t divisible by three and our last grouping doesn’t have 24 bits?

This is where padding comes in. When we lack 1 or 2 octets out of our 24 bit string, we need to pad the end of the base64 string with =. To extend our previous example, let’s encode my entire first name (Dave if you already forgot…). We know that Dav is encoded as RGF2 so we just need to encode the last letter, e.

e as a byte is 101, which is 01100101 in binary. If we attempt to get our sextet groupings out of that, we get 011001 and 01. Huh. That last sextet is missing a few bits.

What we need to do is pad the last sextet with 0 and note that we have 2 octets missing. That leaves us with 011001 and 010000, which are 25 and 16, which are Z and Q. Our final string, padded with = for the two missing octets, is RGF2ZQ==.

Writing your own encoder

First, a disclaimer. What we’re writing here is for educational purposes. It’s slow, unoptimized and pretty useless considering .NET comes with a respectable Base64 converter. This is a learning exercise.

The existing Convert.ToBase64String method in the System namespace takes a byte[] as a parameter and returns a string. Here’s the full method signature:

public static string ToBase64String(
    byte[] inArray
)

We’re going to write our own implementation of this method:

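namespace MyBase64Converter
{
    // Methods must live inside a type; the class name here is arbitrary
    public static class Base64Converter
    {
        public static string ToBase64String(byte[] inArray)
        {
            //Converter code goes here
        }
    }
}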

The good part about the method taking a byte[] parameter is that part of the work is already done for you – getting the byte representation of your data. From there, we need to convert each byte into its 8-bit binary representation. We could use one of the Convert.ToString() overloads in .NET, or we could use the one we wrote ourselves! We’re using the PadLeft method after our call to IntToBinaryString to ensure the binary string is a full 8 bits.

namespace MyBase64Converter
{
    public static class Base64Converter
    {
        public static string ToBase64String(byte[] inArray)
        {
            var bits = string.Empty;
            for(var i = 0; i < inArray.Length; i++)
            {
                // Left-pad each byte's binary string to a full 8 bits
                bits += IntToBinaryString(inArray[i]).PadLeft(8, '0');
            }
        }
    }
}

Now that we have our data represented as binary, we need to grab 24-bit chunks at a time. We’ll make use of the Skip and Take methods in LINQ to accomplish this.

string base64 = string.Empty;

const byte threeOctets = 8 * 3;
var octetsTaken = 0;
while(octetsTaken < bits.Length)
{
    var currentOctects = bits.Skip(octetsTaken).Take(threeOctets).ToList();

    // More code here

    octetsTaken += threeOctets;
}

Note that we loop while octetsTaken is less than the length. This will allow us to loop through to the end of the string, regardless of whether or not we have full 24 bit chunks.

Next we go sextet by sextet, convert the binary to a byte and look it up in the table. We're making use of another LINQ method, Aggregate, which is basically a fancy way of joining the bits into a string again.

const byte sixBits = 6;
int hextetsTaken = 0;
while(hextetsTaken < currentOctects.Count())
{
    var chunk = currentOctects.Skip(hextetsTaken).Take(sixBits);
    hextetsTaken += sixBits;

    var bitString = chunk.Aggregate(string.Empty, (current, currentBit) => current + currentBit);

    if (bitString.Length < 6)
    {
        //This happens when we need to pad
        bitString = bitString.PadRight(6, '0');
    }
    var singleInt = Convert.ToInt32(bitString, 2);

    base64 += Base64Letters[singleInt];
}

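Great! Finally, we'll check if we need to pad the end with =. The remainder of the length of the full bit string divided by 3 tells us how many padding characters are required. This works because the bit string is always a multiple of 8 bits long: one leftover octet gives a remainder of 2 (8 % 3) and needs two = characters, two leftover octets give a remainder of 1 (16 % 3) and need one, and a full group of three gives 0.

// Pad with = for however many octets are missing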
for (var i = 0; i < (bits.Length % 3); i++)
{
    base64 += "=";
}
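The modulo trick in that loop can look like a coincidence, so here is a quick check, sketched in Python purely for illustration. It compares the bit-length remainder used above against the more direct count of octets missing from the last group of three. Both function names are invented for this example.

```python
def padding_from_bits(byte_count):
    """Padding count computed the way the loop above does: bit length mod 3."""
    return (byte_count * 8) % 3


def padding_from_bytes(byte_count):
    """Padding count computed directly: octets missing from the last group of 3."""
    return (3 - byte_count % 3) % 3


for n in range(7):
    print(n, padding_from_bits(n), padding_from_bytes(n))
    assert padding_from_bits(n) == padding_from_bytes(n)
```

The two always agree because 8 leaves a remainder of 2 when divided by 3, so each extra leftover octet shifts the remainder in exactly the way the padding count needs.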

Below is the full code, including the index table for the base64 characters.

using System;
using System.Linq;

// The class name is arbitrary; the method just needs to live inside a type
public static class Base64Converter
{
    // Index table: the position of each character is the 6-bit value it encodes
    private static readonly char[] Base64Letters =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".ToCharArray();

    // Assumes ASCII input: each character must fit in 8 bits
    private static string Base64Encode(string s)
    {
        var bits = string.Empty;
        foreach (var character in s)
        {
            bits += Convert.ToString(character, 2).PadLeft(8, '0');
        }

        string base64 = string.Empty;

        const byte threeOctets = 24;
        var octetsTaken = 0;
        while(octetsTaken < bits.Length)
        {
            // Grab the next group of up to 24 bits
            var currentOctects = bits.Skip(octetsTaken).Take(threeOctets).ToList();

            const byte sixBits = 6;
            int hextetsTaken = 0;
            while(hextetsTaken < currentOctects.Count())
            {
                var chunk = currentOctects.Skip(hextetsTaken).Take(sixBits);
                hextetsTaken += sixBits;

                var bitString = chunk.Aggregate(string.Empty, (current, currentBit) => current + currentBit);

                if (bitString.Length < 6)
                {
                    // Zero-fill a short final sextet
                    bitString = bitString.PadRight(6, '0');
                }
                var singleInt = Convert.ToInt32(bitString, 2);

                base64 += Base64Letters[singleInt];
            }

            octetsTaken += threeOctets;
        }

        // Pad with = for however many octets are missing
        for (var i = 0; i < (bits.Length % 3); i++)
        {
            base64 += "=";
        }

        return base64;
    }
}

Converting a binary string to an int in C#

Back in my previous post, Converting an int to a binary string, we looked at how to write out the bits of an int without using the existing Convert.ToString method in the .NET Framework. Now, let’s look at the reverse: how to convert that binary string back into an int. The .NET Framework already has a built-in method to do this (obviously): Convert.ToInt32(string, int). It takes the binary string and the base to convert from as parameters.
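For reference, here is a minimal sketch of the built-in call before we write our own version. The class name BuiltInDemo is just a placeholder for this example.

```csharp
using System;

public static class BuiltInDemo
{
    public static void Main()
    {
        // The second argument is the base of the input string (2 = binary).
        int five = Convert.ToInt32("101", 2);
        int nine = Convert.ToInt32("1001", 2);

        Console.WriteLine(five); // 5
        Console.WriteLine(nine); // 9
    }
}
```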

The easiest way that I have found to convert a binary number to decimal is to look at the bits of the binary number, raise 2 to the power of the index of each “on” bit (counting from 0 at the rightmost bit) and add the results together. I define an “on” bit as a bit that is 1 as opposed to 0.

For example, the binary number 100 can be looked at as

2^2 + 0 + 0 = 4

Similarly, 101 can be looked at as

2^2 + 0 + 2^0 = 5

Knowing this, we can then loop through the characters of the binary string, check if the bit is “on” and, if so, add

2^index

to the resulting int. In the code example below, I first reverse the string so that the index of our loop (power) matches the power to which we want to raise 2.

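// Requires: using System; using System.Linq;
public static int BitStringToInt(string bits)
{
    // Reverse so that index 0 is the least significant (rightmost) bit.
    var reversedBits = bits.Reverse().ToArray();
    var num = 0;
    for (var power = 0; power < reversedBits.Length; power++)
    {
        var currentBit = reversedBits[power];
        if (currentBit == '1')
        {
            // An "on" bit at this index contributes 2^power.
            var currentNum = (int) Math.Pow(2, power);
            num += currentNum;
        }
    }

    return num;
}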
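As a variation on the same idea (not the method above), the string can also be read left to right, doubling the running total before adding each new bit, which avoids the reversal and the call to Math.Pow. This is only a sketch: the name BitStringToIntShift and the input validation are my own assumptions for the example.

```csharp
using System;

public static class BitStringDemo
{
    // Hypothetical variation: reads the string left to right, so no reversal is needed.
    public static int BitStringToIntShift(string bits)
    {
        var num = 0;
        foreach (var bit in bits)
        {
            if (bit != '0' && bit != '1')
            {
                throw new FormatException("Expected only '0' and '1' characters.");
            }

            // Shift the bits seen so far one place to the left, then add the new bit.
            num = (num << 1) | (bit - '0');
        }

        return num;
    }

    public static void Main()
    {
        Console.WriteLine(BitStringToIntShift("100")); // 4
        Console.WriteLine(BitStringToIntShift("101")); // 5
        Console.WriteLine(BitStringToIntShift("1001") == Convert.ToInt32("1001", 2)); // True
    }
}
```

Both approaches give the same result for short non-negative inputs, so pick whichever you find easier to follow.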

Converting an int to a binary string in C#

The .NET Framework has a built-in overload of Convert.ToString which takes two parameters: the int you want to convert and an int of the base you want to convert to. Utilizing this with base 2, you can print out the string representation of a number in binary, like so:

var binary = Convert.ToString(5, 2); //Gives you "101"    

Now this is all fine and dandy, but you didn’t learn anything. (Or maybe you did. I don’t know. But you can learn some more so keep reading.) For fun, let’s pretend that .NET didn’t have this method built in. How would you convert your number to its binary representation?

We can use a combination of bit shifting and bitwise ANDs to achieve this. If you AND a number with 1, the result is 1 or 0 depending on the value of the rightmost (least significant) bit:

  1101
& 0001 (The number 1 in binary)
------
  0001

As we bit shift a non-negative number to the right, 0s are brought in from the left and the rightmost bit is dropped off and lost. If we bit shift the number to the right by one and then AND it with 1 again, we’ll get the value of the second bit.

  0110
& 0001
------
  0000

If we loop and continue to bit shift until the number is 0 we can build the entire binary string.

Example: Say we have the number 9, which in binary is 1001. Here’s the breakdown:

Number   In Binary   AND Result                          String
9        1001        1                                   1
4        0100        0                                   01
2        0010        0                                   001
1        0001        1                                   1001
0        0000        N/A (number is 0 so we’re done!)    1001
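The breakdown above can be reproduced with a short loop. This is a rough sketch rather than the final method. The Trace helper and its row format are made up for this illustration, and the built-in Convert.ToString is used only to display the "In Binary" column.

```csharp
using System;
using System.Collections.Generic;

public static class ShiftTraceDemo
{
    // Hypothetical helper: returns one "number binary bit string" row per loop iteration.
    public static List<string> Trace(int number)
    {
        var rows = new List<string>();
        var binary = string.Empty;
        while (number > 0)
        {
            var bit = number & 1;   // value of the rightmost bit
            binary = bit + binary;  // prepend it to the result so far
            var padded = Convert.ToString(number, 2).PadLeft(4, '0');
            rows.Add($"{number} {padded} {bit} {binary}");
            number >>= 1;           // drop the rightmost bit
        }

        return rows;
    }

    public static void Main()
    {
        foreach (var row in Trace(9))
        {
            Console.WriteLine(row);
        }
        // Prints:
        // 9 1001 1 1
        // 4 0100 0 01
        // 2 0010 0 001
        // 1 0001 1 1001
    }
}
```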

Now, in C#, to perform a right bit shift we use the >> operator, and to perform a bitwise AND we use the & operator. Here’s the code:

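public string IntToBinaryString(int number)
{
    // 0 needs a special case because the loop below would otherwise
    // produce an empty string. Negative input still returns an empty string.
    if (number == 0)
    {
        return "0";
    }

    const int mask = 1;
    var binary = string.Empty;
    while (number > 0)
    {
        // AND the number with the mask and prepend the resulting bit to the string
        binary = (number & mask) + binary;
        // Shift right so the next bit moves into the rightmost position
        number = number >> 1;
    }

    return binary;
}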

If you would like to print the string with a specific bit length, you can use the PadLeft method in the .NET Framework. It prepends a character of your choosing until the string reaches the specified total length:

binary = "1001";
binary = binary.PadLeft(8, '0');
// binary is now "00001001";
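If you always want a fixed width, another option is to loop over the bit positions directly instead of padding afterwards. The sketch below assumes a width between 1 and 32, and ToFixedWidthBinary is a made-up helper name. Note that any bits above the requested width are simply cut off.

```csharp
using System;
using System.Text;

public static class FixedWidthDemo
{
    // Hypothetical helper: emits exactly `width` bits, most significant first.
    // Assumes 1 <= width <= 32.
    public static string ToFixedWidthBinary(int number, int width)
    {
        var sb = new StringBuilder(width);
        for (var i = width - 1; i >= 0; i--)
        {
            // Move bit i into the rightmost position, then mask it off.
            sb.Append((number >> i) & 1);
        }

        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToFixedWidthBinary(9, 8)); // 00001001
        Console.WriteLine(ToFixedWidthBinary(9, 8) == Convert.ToString(9, 2).PadLeft(8, '0')); // True
    }
}
```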

When doing string concatenation in C#, why don’t you have to call ToString on non string variables?

There is no implicit conversion from int to string, so to get the string representation of a variable (an int, in this case) you call the ToString method:

string s = myInt; //INVALID!!
string s = myInt.ToString(); // Valid!

When concatenating with another string, though, you don’t have to do so:

string s = "First string " + myInt; //No ToString??!?

Why?

When performing string concatenation like above, the compiler actually emits a call to the string.Concat(object, object) method. So our example above compiles to:

string s = string.Concat((object)"First string ", (object)myInt);

Notice the casts to object for both the string and the int. For the int, which is a value type, this conversion is known as boxing:

Boxing is the process of converting a value type to the type object or to any interface type implemented by this value type. When the CLR boxes a value type, it wraps the value inside a System.Object and stores it on the managed heap. Unboxing extracts the value type from the object. Boxing is implicit; unboxing is explicit. The concept of boxing and unboxing underlies the C# unified view of the type system, in which a value of any type can be treated as an object.
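To make that definition concrete, here is a small sketch of boxing and unboxing an int. The class and variable names are placeholders for this example.

```csharp
using System;

public static class BoxingDemo
{
    public static void Main()
    {
        int myInt = 42;

        object boxed = myInt;      // boxing: implicit, the value is copied into an object
        int unboxed = (int)boxed;  // unboxing: requires an explicit cast

        myInt = 7;                 // the boxed copy is independent of the original

        Console.WriteLine(boxed);   // 42
        Console.WriteLine(unboxed); // 42
        Console.WriteLine(myInt);   // 7
    }
}
```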

The boxing and unboxing processes, as with anything, technically incur a performance penalty, but in the grand scheme of things they are nothing to worry about.

When it comes to string concatenation, technically string s = "First string " + myInt.ToString(); is faster because it will use the string.Concat(string, string) overload. But the performance difference is so negligible that you should use whatever you find most readable.
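Either way, the resulting string is the same. Here is a quick sketch to confirm it, with placeholder class and variable names:

```csharp
using System;

public static class ConcatDemo
{
    public static void Main()
    {
        int myInt = 5;

        string implicitForm = "First string " + myInt;            // no ToString call
        string explicitForm = "First string " + myInt.ToString(); // explicit ToString call

        Console.WriteLine(implicitForm);                 // First string 5
        Console.WriteLine(implicitForm == explicitForm); // True
    }
}
```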