Brief Description

Generate a word cloud from your code to see what your code is about and what it does. A word cloud is a set of randomly arranged keywords, variable names, class names, etc. used in your code. The size and the color of each word express its usage frequency. Rarely used words are small and pale. It might give you a hint about how good or bad your code base is and how to improve it.

Currently supported languages: C#, Java, VB.NET.
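
The description above says that size and color express usage frequency, with rare words small and pale. A minimal sketch of one way to do that mapping follows. It is written in Python for brevity and assumes a simple linear scale; the function name `style_for`, the point sizes and the gray range are hypothetical and are not taken from the application itself.

```python
def style_for(count, min_count, max_count, min_pt=8.0, max_pt=48.0):
    """Return (font_size_pt, gray_level) for a word used `count` times.

    Assumption: linear scaling between the rarest and the most frequent word.
    gray_level 0 is black (frequent), 200 is pale gray (rare).
    """
    if max_count == min_count:
        # All words are equally frequent: draw them all at full weight.
        weight = 1.0
    else:
        weight = (count - min_count) / (max_count - min_count)
    size = min_pt + weight * (max_pt - min_pt)
    gray = round(200 * (1.0 - weight))
    return size, gray
```

With these assumed defaults, the rarest word gets the smallest, palest style and the most frequent word the largest, darkest one. A logarithmic scale would be a reasonable alternative when a few keywords dominate the counts.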

Motivation

Recently, during a seminar, Kevlin Henney showed us a post from Phillip Calçado's blog: See How Noisy Your Code Is http://fragmental.tw/2009/04/29/tag-clouds-see-how-noisy-your-code-is.

The idea behind it is very simple. A tag cloud (word cloud) is a visual representation of text data. Words are usually placed in a rectangular area, and the importance of each tag is shown by font size and/or color. This format is useful for quickly perceiving the most prominent terms in the analyzed text. Wordle http://www.wordle.net/ is one of the free tools for building such clouds. You can paste any text or a website URL, and in a few seconds you get an idea of what the website or text is about.

And what is your code about?

Reading the Tag Cloud of your Code Base

So if you take your code, remove comments and literals, block some very common words (like your company name) and generate a word cloud from it, you will get an interesting picture to discuss with your colleagues in the coffee corner.
  • If the words "if", "then", "else", "switch", "case" are the first thing you see - your code is sprinkled with conditionals!
  • Is "string" in your top 10 words? Congratulations if you write text processing software; otherwise it might be a bad smell.
  • Are you writing an API or a library? Then you should see the word "public" in the front rows. If you are not working on a library or an API, the word "public" might be a signal to think about better encapsulation.
  • Do you see your classes at first glance, or are they far away in the background, behind "int", "byte", "array" etc.? Is your code written in your domain language?
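
The preparation step described above (remove comments and literals, block common words, then count what is left) can be sketched as follows. This is only an illustration in Python, not the application's actual implementation. It assumes C-style syntax as in C# or Java, so VB.NET comments and C# verbatim strings are not handled, and the names `word_counts` and `blocked` are hypothetical.

```python
import re
from collections import Counter

# String and character literals are listed before comments, so a "//"
# inside a string literal is removed as part of that literal.
_NOISE = re.compile(
    r'"(?:\\.|[^"\\])*"'      # double-quoted string literal
    r"|'(?:\\.|[^'\\])*'"     # character literal
    r"|//[^\n]*"              # line comment
    r"|/\*.*?\*/",            # block comment
    re.DOTALL,
)
_WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def word_counts(source, blocked=()):
    """Count keywords and identifiers in C-style source code.

    Comments and literals are stripped first; words listed in `blocked`
    (for example a company name) are ignored, case-insensitively.
    """
    cleaned = _NOISE.sub(" ", source)
    blocked_lower = {word.lower() for word in blocked}
    words = (w for w in _WORD.findall(cleaned) if w.lower() not in blocked_lower)
    return Counter(words)
```

Calling `word_counts(text, blocked=["Acme"]).most_common(10)` then gives the top 10 words that the questions in the list above refer to.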

Another very interesting article on this topic, from the book 97 Things Every Programmer Should Know, can be found at http://programmer.97things.oreilly.com/wiki/index.php/Code_in_the_Language_of_the_Domain.

Why this App?

The first problem I faced was that I could not analyze large amounts of code this way.
The second was that I was not allowed to paste commercial code into a web page.
Finally, I wanted to play with it: generating clouds of different code bases quickly and comparing them with each other.

[Image: OurCodeCloud.png]
Code cloud of the Microsoft Data Access Application Block project, generated using our application


We have tried this on the code bases of multiple projects. Every time, besides learning interesting new facts about our own code, we had a lot of fun comparing different code clouds with each other. But do not forget: it is not a replacement for static code analysis, and it is not even a code metric calculator. "Like most visualisation tools it is not a scientific proof of any kind but it gives you a hint about how good or bad your code base is." (Phillip Calçado)

Any code metric tool will probably give you more precise and reliable information, but such a picture can be a killer argument in a 5-minute conversation when you need to convince someone to act.

George Mamaladze

Credits

Thanks to Michael Coyle for the great article A Simple QuadTree Implementation in C# http://www.codeproject.com/KB/recipes/QuadTree.aspx
Thanks to Jonathan Feinberg, creator of Wordle, for that beautiful cloud and for his hints about the algorithms behind it http://stackoverflow.com/questions/342687/algorithm-to-implement-something-like-wordle
Thanks to rajesh-lal for his article A Windows Explorer in a user control http://www.codeproject.com/KB/miscctrl/ExplorerTree.aspx

Last edited Jul 26, 2011 at 1:43 PM by gmamaladze, version 32