All blog posts from October 2011

Creating a documentation system - Part 1

.NET, English posts, Open source Comments (3)

A while ago we started revamping the RavenDB documentation. That work resulted in quite a nice documentation system, which this post describes in general terms; more posts will follow as we make progress and introduce new features to it.

For some time now it has been clear that RavenDB needed much more organized docs, and that they had to be complete too. A lot of content was scattered around the net in blogs and FAQs, so we started gathering it all, arranging it, and rewriting the docs almost from scratch. It was also obvious that some content is worth making available but is not really "documentation" content - so we had to figure out what to do with such content too.

Also, one of the reasons the old docs were scattered around in blogs and FAQs is the rapid pace of RavenDB's development. So on top of everything else, we needed a way to keep up with that pace - for example, by having working code samples at all times.

Documentation changes over time as best practices evolve and new features are added, so we needed to take that into account as well - being able to version the docs and see past revisions. Another important factor was community content - we wanted to allow the community to respond, suggest fixes or additions, and offer new content, even if it is not "documentation" per se.

A wiki sounded like a bit too much, and we didn't really want to build something of our own. We played with some ideas for a while, until we had it all figured out.

Don't reinvent the wheel

The most important rule of all: don't waste your time creating a tool you already have on your belt. In our case, those tools are git and Markdown.

With git, we get easy content versioning, and it becomes extremely easy to accept patches from other writers (forks and pull requests). Versioning is simply a matter of branching or tagging - the master HEAD always holds the latest docs for the latest version, and whenever we want to mark a version we just create a branch from the working copy, or tag it. Obviously, GitHub plays a huge role here - our documentation system even has an issue tracker, for heaven's sake...

And of course, Markdown is a natural fit: a super-simple, text-based markup language that is very easy to write. Combined with git it really shines, producing very clean diffs. Tracking documentation revisions has never been easier.

Markdown also allows us to export documentation in various formats. On our website we render it as HTML, but it can also be compiled into a PDF book and other e-book formats. We will cover this in detail later in the series.

And that is how we got our third-party-hosted, full-featured wiki for documentation, available on GitHub: https://github.com/ravendb/docs.

Editorial notes

Now that we had the basics figured out, we needed to decide on a structure. This actually proved very easy to do. Since we use git, all changes, including moving files around, are recorded and can be tracked. So if we represent each documentation item as a file and store the files in hierarchical folders, we are pretty much done.

At the time of this writing we still haven't figured out all the terminology - what constitutes a section, a sub-section, or a chapter. At this stage we don't really care about all that; we just write the docs in whatever way seems to make sense, and once it is all done we can revisit that.

We also created a Knowledge-Base section on the website, where content that is not "documentation" per se can still be published and viewed. All content that is considered out of scope for the actual documentation will be posted there - official articles by Hibernating Rhinos side-by-side with user-generated content. The KB is a simple web application and has nothing to do with the documentation system, but in the larger scope - the product - it is important to have, both as a means of providing extra content and as a way of interacting with the community.

Code samples QA

To make sure all the code samples stay up to date, we created a project containing all the sample code used in the docs, and we compile it to verify the code is valid. With every new official release we update the project and compile again, to make sure nothing needs changing. This way the code samples are guaranteed to stay current and working - which could never be the case with code inlined in the docs themselves.

We added our own Markdown syntax that points to the code file, and we write the code snippets within named #region blocks, to make it easier to track and identify the code relevant to each page. If you ever wondered what #regions in VS are for, now you know :)

In the documentation, it looks like this:

{CODE region_name@folder/file.cs /}

And an actual code file with such regions can be found here: https://github.com/ravendb/docs/blob/master/code-samples/Intro/BasicOperations.cs
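For illustration, a sample file laid out this way might look like the following. This is a minimal sketch - the namespace, class, method, and region contents here are made up, not the actual repository contents; only the region name matches the directive above:

```csharp
using System;

namespace RavenCodeSamples.Intro
{
    public class BasicOperations
    {
        public void Examples()
        {
            #region region_name
            // Everything between #region and #endregion is what the
            // {CODE region_name@folder/file.cs /} directive pulls into the page.
            Console.WriteLine("sample code shown in the docs goes here");
            #endregion
        }
    }
}
```

Because the file compiles as part of the samples project, any API change in a new release breaks the build and flags the snippet for updating.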

The custom Markdown syntax is parsed when the docs are compiled, before the Markdown itself is resolved, using a tool we developed that is now part of the documentation repository. The tool goes to the source files directory, locates the file, parses out the requested region, normalizes the line spacing, and injects the result into the Markdown source. Only then is the source compiled and saved to the specified output.
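The region-extraction step described above can be sketched roughly like this. This is an illustrative reimplementation, not the actual tool from the repository - the class and method names are made up, and it ignores nested regions:

```csharp
using System;
using System.IO;
using System.Linq;

public static class CodeRegionExtractor
{
    public static string ExtractRegion(string path, string regionName)
    {
        var lines = File.ReadAllLines(path);

        // Locate "#region <name>" and its (non-nested) matching "#endregion".
        var start = Array.FindIndex(lines, l => l.Trim() == "#region " + regionName);
        if (start < 0)
            throw new InvalidOperationException("Region not found: " + regionName);
        var end = Array.FindIndex(lines, start + 1, l => l.Trim() == "#endregion");

        var body = lines.Skip(start + 1).Take(end - start - 1).ToArray();

        // Normalize line spacing: strip the indentation common to all lines,
        // so the snippet sits flush-left in the generated docs.
        var nonBlank = body.Where(l => l.Trim().Length > 0).ToArray();
        var indent = nonBlank.Length == 0
            ? 0
            : nonBlank.Min(l => l.Length - l.TrimStart().Length);

        return string.Join(Environment.NewLine,
            body.Select(l => l.Length > indent ? l.Substring(indent) : l.TrimStart()).ToArray());
    }
}
```

The compiler would then replace each {CODE ... /} directive with the extracted snippet before handing the page to the Markdown processor.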

Next...

In the next posts we will look at how the docs are actually compiled, how they are browsed in the website, and how we allow for versioning of docs.


Whose bug is it anyway? Google vs Microsoft

.NET, English posts Comments (0)

Consider the following code (.NET):

public static IEnumerable<SyndicationItem> ReadFeed()
{
    // The using block closes and disposes the reader for us.
    using (var reader = XmlReader.Create(ListAtomUrl))
    {
        // SyndicationFeed.Load reads the entire feed eagerly, so returning
        // Items after the reader is disposed is safe.
        var feed = SyndicationFeed.Load(reader);
        return feed == null ? null : feed.Items;
    }
}

This is a simple piece of code that reads an Atom feed using the relatively new .NET syndication API. When executed against a Google Groups Atom feed (http://groups.google.com/group/ravendb/feed/atom_v1_0_topics.xml, for example), it fails miserably with this error:

Error in line 10 position 22.

An error was encountered when parsing a DateTime value in the XML.

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.Xml.XmlException: Error in line 10 position 22. An error was encountered when parsing a DateTime value in the XML.

The reason for this error is this XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <updated>-0-0T::Z</updated>
  <generator uri="http://groups.google.com" version="1.99">Google Groups</generator>
  <entry>
  <author>

Notice the "updated" tag. This only happens for the "topics" feed, not the "new messages" feed Google provides for each group.

So whose bug is it? My bet is on Google. Skimming briefly over the Atom RFC, I could find no mention of an "n/a" value for the "updated" field, so I can't tell if it's legit, but this value just doesn't seem right.

However, Microsoft is at fault here too, for not providing a way to tolerate this kind of error. After all, the syndication API is meant to be used with external services the developer has no control over, yet it becomes useless at the slightest bug on a feed provider's side. In fact, no other reader I use had any problem reading that feed.
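One possible mitigation is worth sketching. The code below is not from the original post - it adapts the well-known XmlTextReader-subclass workaround used for malformed RSS dates to the Atom case; the class name and fallback value are made up, and whether it intercepts this particular parse path depends on Atom10FeedFormatter internals:

```csharp
using System;
using System.Xml;

// Hypothetical workaround: an XmlTextReader that substitutes a valid
// RFC 3339 date whenever an Atom date element fails to parse.
public class TolerantAtomReader : XmlTextReader
{
    private bool insideDateElement;

    public TolerantAtomReader(string url) : base(url) { }

    public override void ReadStartElement()
    {
        // Atom date constructs live in <updated> and <published> elements.
        insideDateElement = LocalName == "updated" || LocalName == "published";
        base.ReadStartElement();
    }

    public override string ReadString()
    {
        var value = base.ReadString();
        DateTimeOffset parsed;
        if (insideDateElement && !DateTimeOffset.TryParse(value, out parsed))
            // Fall back to "now" so SyndicationFeed.Load does not throw.
            return DateTimeOffset.UtcNow.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'");
        return value;
    }
}
```

Calling SyndicationFeed.Load(new TolerantAtomReader(url)) would then survive Google's "-0-0T::Z" value - assuming the formatter reads date elements through ReadStartElement/ReadString, as its RSS counterpart does.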

