A C#/Mono Language Module for BBEdit

Since I've been using Unity, I've loved it for the most part. Sure, it leaves a few socks on the floor, such as my inability to post to their forum without pestering a moderator for help, or the way an infinite loop in your code locks up the whole Unity environment. But on the whole, it's a really great development system.

One area where it is still pretty raw though is code editing. I get the impression that many Unity developers don't actually code all that much, spending much more time in the Unity environment dragging objects around, hooking up pre-built scripts, and setting their properties. But our project is the opposite: it's almost all code, and we spend 95% of our time in the code editor.

Unity offers two code editors out of the box: Unitron, which is awful, and MonoDevelop, which is slightly less awful. The latter is fine for brief work, but when we're going to be writing code all day, we last about half an hour before we start pining for a more powerful (and Mac-like) text editor, such as TextWrangler.

Unfortunately, while TextWrangler groks a lot of languages, C# isn't one of them. We got by for a while setting the language to Java, which colored most of the keywords correctly, but the function popup only worked on classes with no superclass (perhaps because the "extends" keyword in Java is spelled ":" in C#). I looked around for C# language plug-ins and found a few, but none of them implemented the function popup correctly, which is a pretty important feature.

So, we decided to make our own TextWrangler/BBEdit C# language module. This took a bit of poking around (and a bit of help from the friendly folks at Bare Bones Software), but it turns out to be not too difficult. You need two things:

  1. The BBEdit plug-in SDK. Ignore all the folders therein except for "Codeless Examples" (and especially ignore the "Documentation" folder, which leads you in the wrong direction).

     

  2. The BBEdit User Manual. This book-length manual contains, in Appendix D, the documentation you need to understand the examples you found in thing 1.

Here's the big picture: BBEdit language modules used to have to be plug-ins written in C/C++, which would parse the text BBEdit fed them and use your own custom code to do things like identify keywords and function boundaries. But a while back, Bare Bones realized that they could instead define everything most language modules would need to do with a set of properties, stored in a simple plist file. And a little while after that, they realized that the large set of properties could be reduced to a much smaller set, using the power of regular expressions.

So now, to add a new language to BBEdit/TextWrangler, you pretty much just need to make a plist (XML) file, known as a codeless language file. Regular expressions therein tell the editor how to find string literals, comments, functions, etc. The trick, of course, is that the regular expressions for these can end up rather long and dense.

So, we started with the C++ codeless language module, which is one of the samples included with the SDK. To our delight we discovered that the regex for finding functions (the biggest and hairiest one in the whole file) already worked great for C# methods, even in derived classes. The C++ comment pattern worked fine in C# too. We replaced the keywords list with the official C# keyword list from MSDN, updated the metadata (module name, file extension mapping, etc.) and we were nearly done.
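To give a feel for the shape of such a file, here is a rough Python sketch that builds a skeleton codeless language plist with the standard plistlib module. Treat every key name in it as an assumption recalled from the codeless language module format, and check each one against Appendix D and the SDK samples before relying on it. The language code, the abbreviated keyword list, and the placeholder pattern strings are purely illustrative and are not the contents of our module.

```python
import plistlib

# Stand-in for the long regexes that were copied unchanged from the C++ sample.
FROM_CPP_SAMPLE = "(pattern text copied from the C++ sample)"


def build_module():
    """Return a dict shaped like a codeless language module (key names assumed)."""
    return {
        "BBEditDocumentType": "CodelessLanguageModule",
        "BBLMLanguageDisplayName": "C#",
        "BBLMLanguageCode": "C#  ",  # hypothetical four-character code
        "BBLMColorsSyntax": True,
        "BBLMScansFunctions": True,
        "BBLMIsCaseSensitive": True,
        "BBLMSuffixMap": [{"BBLMLanguageSuffix": ".cs"}],
        # Abbreviated; the real list would hold every C# keyword.
        "BBLMKeywordList": ["abstract", "class", "namespace", "using"],
        "Language Features": {
            "Comment Pattern": FROM_CPP_SAMPLE,
            "Function Pattern": FROM_CPP_SAMPLE,
            "String Pattern": "(pattern extended for C# string literals)",
        },
    }


def to_plist(module):
    """Serialize the module dict as XML plist bytes."""
    return plistlib.dumps(module)
```

Writing the bytes returned by to_plist(build_module()) to a .plist file gives you something to compare side by side with the SDK's C++ sample.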

The only regular expression we had to muck with much was the one for string literals. C# has regular character and string literals, with backslash escaping, just like C++; but it also has something called a verbatim string literal, which starts with @" (similar in syntax to Cocoa, though not in semantics). Verbatim string literals don't do any backslash escaping, but they do allow you to embed a quotation mark by doubling it (just like in REALbasic).
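The verbatim rule is easy to state in code. Here is a small Python sketch (decode_verbatim is a name made up for illustration, not part of any library) that turns the source text of a verbatim literal into the string it denotes, assuming the input is one complete, well-formed literal:

```python
def decode_verbatim(literal):
    """Return the text denoted by a C# verbatim string literal such as @"a ""b"" c"."""
    if len(literal) < 3 or not literal.startswith('@"') or not literal.endswith('"'):
        raise ValueError("not a verbatim string literal")
    body = literal[2:-1]
    # Backslashes are left alone; only a doubled quote collapses to a single quote.
    return body.replace('""', '"')
```

So the literal @"C:\temp\new" keeps both backslashes as they are, while @"say ""hi""" decodes to the text say "hi".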

So, we had to extend the string literal regex to handle this case. It took four of us about a half-hour of experimentation and head-scratching to be sure it covered all the cases, including line breaks, embedded quotes, and so on, but we finally got it. The whole string pattern is now:

	(?x:
		(?>	"	(?s: \\. | [^"] )*?		(?: " | $)	)	|
		(?>	'	(?s: \\. | [^'] )*?		(?: ' | $)	)	|
		(?>	@	(?: " (?s: .*?) " )+ )
	)

The first two lines handle regular, C-style string and character literals, respectively; the third line handles verbatim strings. (Keep in mind that you can ignore whitespace in the above, thanks to the ?x modifier around the whole thing.)
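If you want to poke at the pattern outside the editor, here is a minimal Python sketch. It is an adaptation rather than the pattern exactly as the module uses it: the atomic groups (?>...) are written as plain non-capturing groups so it runs on Python versions before 3.11, and the ?x wrapper is replaced by the re.VERBOSE flag. Python's re module is also not the editor's regex engine, so edge cases (unterminated strings in particular) may behave differently. The find_literals helper is a name invented for this example.

```python
import re

# Adapted from the string pattern above: (?>...) became (?:...), and
# re.VERBOSE stands in for the ?x modifier so whitespace is ignored.
STRING_PATTERN = re.compile(r'''
(?:
    (?: " (?s: \\. | [^"] )*? (?: " | $ ) ) |
    (?: ' (?s: \\. | [^'] )*? (?: ' | $ ) ) |
    (?: @ (?: " (?s: .*? ) " )+ )
)
''', re.VERBOSE)


def find_literals(source):
    """Return every string or character literal the pattern finds in source."""
    return [m.group(0) for m in STRING_PATTERN.finditer(source)]
```

For example, find_literals(r'Log(@"C:\dir\", "done");') picks out the verbatim literal and the regular one separately, even though the verbatim literal ends in a backslash followed by a quote.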

So, long story short, we now have a C# language module for TextWrangler/BBEdit, and do most of our development in that, switching to MonoDevelop only when we need to use the debugger. You can download the module here, and install it by putting it into ~/Library/Application Support/TextWrangler/Language Modules (creating that folder if necessary, and replacing "TextWrangler" with "BBEdit" if you prefer).

If you have any improvements to suggest, please contact me or post to the comment form below. When it's all settled down, we'll submit it to the BBEdit Language Module Library. Enjoy!