Unicorn now available on NuGet

Unicorn, a library I wrote to handle automatically serializing Sitecore content items into a source control system, is now available on NuGet.

Be sure to review the post-installation steps in the GitHub README file - if used incorrectly, Unicorn can be dangerous to your items.

In other news, I began a major refactoring of Unicorn this weekend that will bring it a ton of new features mainly in the form of extensibility. Don’t want it to delete orphaned items? Custom comparison to determine updating? Implement your own Evaluator. Want to serialize to SQL? Use a custom serialization format that ignores versions? Sweet - make a SerializationProvider. Want to filter serialization based on anything you can imagine (like the rules engine? an Excel spreadsheet? presence of a version in a specific language?)? That’s where a Predicate comes in.

At the moment “Unicorn 2.0” is not yet fully stable. I plan to add a few more things to it such as detecting duplicate item IDs during processing and possibly a “what-if” mode. I’ll be working on it time permitting :)

WebConsole now available on NuGet

My handy WebConsole library is now available from NuGet for easier installation.


This release also fixes an issue where disposal was working in a rather silly fashion on sub-tasks.


If you’re unaware of this library, it’s a very useful thing to use if you find yourself writing “hack pages,” “processor pages,” “rebuild pages” or other things of the like on a regular basis. Any long running or complex task you’d run from a page can easily get a realtime progress bar, console-style logging, exception handling support, and the ability to break the task up into ‘subtasks’ with their own tracked progress.


In short, you can make your processing pages more informative and even vaguely pretty if you like the console aesthetic, in practically no extra time over simply writing them in the first place. I wrote this for myself, and I use it in a bunch of things :)

Scoping searches to the current context in Sitecore 7's LINQ

As a followup to my previous post about gotchas with Sitecore 7’s LINQ provider, here’s another thing to consider.


The indexing providers are very greedy about indexing. This means that unlike with traditional querying with Sitecore.Data.Item, where your results are automatically filtered by the context language and latest item version, no such filtering occurs with LINQ. You will receive all versions and all languages unless you specify otherwise.


As you might imagine, this can result in unexpectedly large quantities of search results in some cases. It can also be extra sneaky since during development you might only have one version in one language - so you wouldn’t even notice the issue.


So how do you fix the issue? First, let’s talk about versions. The default indexing configuration includes a field called _latestversion.
This is a boolean field that is only set to true for items that are the latest version in their language. We can take advantage of this by implementing a property on our mapped objects and mapping it to this index field like so:

[IndexField(“_latestversion”)]
public bool IsLatestVersion { get; set; }



Then, when we write a query we want to limit, we simply add a clause to the query:

.Where(x => x.IsLatestVersion)

// alternatively if you don’t want to be strongly typed and have an indexer, you can use
.Where(x=> x[“_latestversion”] == “1”)


Now you’ll only get the latest version. Now for languages, which are also pretty simple. If you’re inheriting from the SearchResultItem class, you already have a Language property. Otherwise you can add one like so:

[IndexField(“_language”)]
public string Language { get; set; }



Then, we add the following clause to the query:

.Where(x => x.Language == Sitecore.Context.Language.Name)

Now we get results like regular queries. If you’re like me, the next question you’re asking is “how can I just write this once and forget about it?” For example something like:

public static IQueryable<T> GetFilteredQueryable<T>(this IProviderSearchContext context)
    where T : MyItemBaseClass
{
    return context.GetQueryable<T>().Where(x => x.Language == Sitecore.Context.Language.Name && x.IsLatestVersion);
}

Unfortunately, this seems to be nigh impossible with the current revision of Sitecore 7. The issue has to do with how LINQ resolves expressions involving generic types that are not resolved at compile time. Effectively the expression in the example above converts to:

.Where(x=> ((T)x).Language == Sitecore.Context.Language.Name)

Notice the cast to T? That throws the expression parser for a loop. I’ve been told this will be fixed in a later release of Sitecore 7, but will not be part of the RTM release, so for the moment it looks like we’re writing filtering on each query.

Sitecore 7 LINQ gotchas

The upcoming Sitecore 7 release brings with it a new “LINQ-to-provider” architecture that effectively allows you to query search indexes using standard LINQ syntax. Mark Stiles wrote a pretty good synopsis of the feature that you should probably read first if you’re unfamiliar with the basics of how it works. This post won’t cover the basics.


I’ve been diving in to the underpinnings of the LINQ architecture and have discovered a number of things that may well cause confusion when people start using this technology. Be warned that this post is based on a non-final release of Sitecore 7, and may well contain technical inaccuracies compared to the final release.


You have to enable field storage to get output

By default, the values of fields are not stored in the index. If the values are not stored, you can query against the index using custom object mapping (e.g. filters work), but you will not see any field values mapped into results objects. You can define field storage parameters either on a per-field basis, or a per-field-type (e.g. Single-Line Text) in the default index configuration file (in App_Config/Include).

Changes to the storage type require an index rebuild before the storage is reflected.

LINQ is not C Sharp

Yeah, you heard me right. LINQ may look exactly like C#, but it is not parsed the same way. The lambda expressions you may use to construct queries against Sitecore are compiled into an Abstract Syntax Tree (AST) - sort of a programmatic representation of the code forming the lambda - and that is in turn parsed by the Sitecore LINQ provider.

The code you write into lambdas is not executed as normal C# code is. This is important to remember, because effectively the LINQ provider is simply mapping your query in as simple terms as possible to a key-value query inside of Lucene (or SOLR, etc). For example:

// we'll use this as a complex property type to map into a lucene object<br>
public class FieldType {
    public string Value { get; set; }
    public string SomeOtherValue { get; set; }
}
// this will be the class we'll query on in LINQ
public class MappedClass {
    [IndexField("field1")]
    public FieldType Field1 { get; set; }
}

// example queries (abbreviated code)
var query1 = context.GetQueryable<mappedclass>().Where(x=>x.Field1.Value == "foo");
var query2 = context.GetQueryable<mappedclass>().Where(x=>x.Field1.SomeOtherValue == "foo");

You’d expect query1 and query2 to be different wouldn’t you? NOPE. You’re not writing C# here, you’re writing an AST in C# syntax. The Sitecore LINQ engine takes the shortest path to a key-value query pair. What this really means is that it:

  1. Resolves the Lucene field name of the field you’re querying on (in this case, “field1”)

  2. Evaluates the operator you used, and the constant value you’ve assigned to compare to

  3. Constructs a Lucene query expression based on that

In effect, you can only ever have one value queried for each property. In the example above both examples are in effect x.Field1 == "foo". The query would be that way even if you did a crazy query like x.Field1.Foo.Bar.Baz.Boink.StartsWith("foo") - that would boil down to x.Field1.StartsWith("foo").

There is a facility you can tie into that controls how Sitecore converts types in queries (TypeConverters). Unfortunately, that does not solve the problem of disambiguating properties - the conversion only informs the engine how to convert the constant portion of the query (in this case, the string.

Sitecore LINQ does not care if the return entity type is mapped to a valid object for it

If you execute a query against a type, say the MappedClass type in the previous example, the LINQ layer will map all results of the query against the MappedClass type. Sounds great, but be careful - it will also map results that may not have the expected template to map to the MappedClass type onto it.

For example, suppose I made a model for the Sample Item template that comes with Sitecore. Then I queried the whole database as SampleItem. Out of my results, probably only two are really designed to be mapped to my SampleItem - the rest will have nulls everywhere. This is potentially problematic if you forget to be specific in your queries to limit the template ID to the expected one.

You must enumerate the results before the SearchContext is disposed

If you’ve ever dealt with NHibernate or other lazy-loaded ORMs, this might make perfect sense to you. A typical search query method might look something like this:

public IEnumerable<MappedClass> Foo()
{
    var index = SearchManager.GetIndex("sitecore_master_index");
    using (var context = index.CreateSearchContext(SearchSecurityOptions.EnableSecurityCheck)) {
        return context.GetQueryable<mappedclass>().Where(x=>x.Name == "foo");
    }
}

Can you guess the problem? IEnumerable doesn’t actually execute until you enumerate the values (e.g. run it through foreach, .ToArray(), etc). If you return the IEnumerable<mappedclass>, it cannot be enumerated until the SearchContext has already been disposed of. Which means that will throw an exception!

To avoid this problem you need to either:

  • Return a pre-enumerated object. Usually the simplest form of this is simply to return the query with .ToArray() at the end, thus enumerating the query into an array before the context is out of scope. Warning: This also means that you should filter as far as possible within the query, including paging, especially with large result sets.

  • Execute all the code within the context’s scope. This is probably less desirable as it either means spaghetti code in your codebehind, or a request-level context manager like NHibernate sometimes does.

The object mapper uses cached reflection to set each mapped property

Yes, it uses “the R word.” Reflection is great, but it’s not all that fast. This is fine if you’re running optimized code: you’ll want to perform pagination and filtering within Lucene, and avoid returning large quantities of mapped objects. The performance is dependent on both the number of objects and the number of properties you have on each object as each property results in a reflection property set.

I would suggest trying to keep the number of mapped objects returned under 100 in most cases, and/or using output caching to minimize the effect of reflection.

It’s also possible to patch the way the mapping code works (I have a reflection-less proof of concept that in relatively unscientific tests is about 50-100x faster than the reflection method. It enforces some code generation requirements however and is not as general purpose or as simple to use as the reflection method.

It’s still awesome

While this post has largely focused on unexpected gotchas around the Sitecore LINQ provider, it’s actually pretty darn nice once you get used to its quirks. It’s certainly loads easier to use than any previous Lucene integration, and the backend extensibility allows for all sorts of interesting extensions and value storage.

New Open Source Projects

I’ve recently released several new open source projects on GitHub:


Kamsar.WebConsole


A library that makes it easy to build progress-reporting “console” pages in websites. These are useful for pages that execute long-running processes like search index rebuilds, CMS publishing, etc.




An example WebConsole page using the default styling




https://github.com/kamsar/Kamsar.WebConsole


Beaver




Beaver is a flexible build/configure/deploy system for ASP.NET projects based in PowerShell. It is designed to simplify the task of maintaining builds that may span many target environments (such as dev, QA, production) where each has its own slightly different configuration requirements.





Everything in Beaver is based on a consistent, simple idea: the “pipeline.” Pipelines are simply a folder that contain “buildlet” files that perform tasks in alphabetical order. Their structure is quite similar to POSIX System V-style init scripts, only instead of a pipeline per runlevel there’s a pipeline per build stage. Buildlets can be written in several domain-specific languages (PowerShell, XDT Transform, MSBuild, or even sub-pipelines), and you can implement your own buildlet provider if you wish. Using this architecture allows you to implement small, easily digestible single-responsibility units of action - as well as gain an immediate understanding of how the build process works by simply looking in a few folders and seeing visually the order things run in and their descriptions.





To manage multiple target environments, Beaver implements a powerful inheritance chain of pipelines. Archetypes are a way to extend the global pipelines to perform a specific sort of configuration that might apply to multiple environments - for example, configuring live error handling or hardening a CMS site’s content delivery-type servers. Environments then declare themselves as using 0-n archetypes (in addition to being able to extend the global pipelines themselves), allowing them to inherit cross-environment configurations.





Beaver was originally designed to support build and deploy operations for the Sitecore CMS. However, Sitecore is merely a set of archetypes - the system can be used for any sort of ASP.NET project.







Unicorn



Unicorn is a utility for Sitecore that solves the issue of moving templates, renderings, and other database items between Sitecore instances. This becomes problematic when developers have their own local instances - packages are error-prone and tend to be forgotten on the way to production. Unicorn solves this issue by using the Serialization APIs to keep copies of Sitecore items on disk along with the code - this way, a copy of the necessary database items for a given codebase accompanies it in source control.





Unicorn avoids the need to manually select changes to merge unlike some other serialization-based solutions because the disk is always kept up to date by the event handlers. This means that if you pull changes in from someone else’s Sitecore instance you will have to immediately merge and/or conflict resolve the serialized files in your source control system - leaving the disk still the master copy of everything. Then if you execute the sync page, the merged changes are synced into your Sitecore database.





Installing Unicorn is very simple and using it is nearly automatic as long as you follow the conventions that allow it to work (local databases required).




Selecting a Sitecore Rendering Technology



Sitecore has a dizzying array of ways you can write renderings. While there is documentation that explains what they are, there’s a lot less about where each kind should be used. This will attempt to explain when you should use each kind of rendering.


So what kinds of renderings are there?

  • Sublayouts (UserControls/.ascx)

  • Web controls (WebControl/.cs)
  • Razor renderings (.cshtml) - Sitecore MVC-style
  • Razor renderings (.cshtml) - WebForms-style
  • Method renderings (.cs)
  • URL renderings
  • XSLT renderings (.xslt)
  • Custom - yes, you can write your own rendering provider


  • That’s a lot of ways to emit HTML to a page. Let’s take a look at each kind and examine their strengths and weaknesses.


    Sublayouts (User Controls)


    These are probably the kind of rendering you should be using most of the time unless you have Razor at your disposal. Sublayouts are confusingly named because most of the time they are simply a rendering and not an actual subdivision of a layout. These are basically identical to a traditional .NET User Control - they have a page lifecycle, code behinds, events and all the other traditional Web Forms trappings. They have a relatively HTML-like appearance that makes them sensible to edit if you have HTML/CSS folks collaborating with you, unlike the C#-based renderings.


    However they also have the same issues as their User Control cousins. Web Forms’ at times utterly verbose syntax and confusing event/postback model can introduce bugs. Highly branched markup emission is also very hard in User Controls because the markup is all encoded in the .ascx file, and you have to resort to MultiViews or PlaceHolders and setting a ton of Visible properties to make it work.

    Verdict: Use these for relatively static markup emission or places where the Web Forms event model will help you - like custom form renderings.


    Web Controls


    Web controls are simply C# classes that derive from the Sitecore WebControl base class. Web controls are perfect if you have to do a rendering whose markup has a lot of branching, for example a list that might have two or three different kinds of elements in the list because you can modularize the rendering into C# methods.


    On the other hand WebControls can be extremely hard to read if not written in a disciplined manner. There is no obvious HTML emission, so you’ll have trouble with HTML/CSS folks when they need to change markup or add a class - it’s all in C#. You can also write spaghetti renderings that are very hard to follow how the code runs. You also have to remember that unless you override the GetCachingID() method, your WebControl cannot be output cached.

    Verdict: Use these for places where you need tight control over HTML emitted, or have a lot of branches in your markup emission.


    Razor Renderings (MVC-style)


    Razor is a templating language traditionally associated with ASP.NET MVC. It dispenses with a lot of the usually unnecessary page lifecycle of a Web Form for a more simple template that is both readable as HTML and allows a decent amount of flexibility in terms of composing branch-heavy renderings. If you’re using Sitecore’s MVC support it’s a no-brainer to use Razor renderings for nearly all purposes.


    However to use the built in Razor support you must use Sitecore’s MVC mode - which means you have to do everything in Razor. You also have to register controllers and views as items, and lose a lot of module support - for example, Web Forms for Marketers cannot presently run in MVC mode. At present this makes it nearly untenable to implement most real world sites using Sitecore’s MVC mode.

    Verdict: If you’ve got a Sitecore MVC site, use it.


    Razor Renderings (custom)


    There are a couple of third party shared source modules (such as Glass) that have implemented a custom rendering type that invokes the Razor template engine outside of a MVC context. This means you could reap the benefits of a Razor template’s syntax, without needing to resort to Sitecore’s shaky configuration-over-convention MVC implementation. These are dependent on how you feed the view data to the Razor rendering however, and each implementation works slightly differently.

    Verdict: If you’re using one of these modules, you’ll probably implement most of your renderings in Razor without needing Sitecore MVC


    Method Renderings, URL Renderings


    These render output from a C# method that returns a string, and a given URL respectively.


    Verdict: There’s almost no good use case for either of these rendering types.


    XSLT Renderings


    These were once the main type of rendering in use. They use extended XSLT syntax to transform an item “XML” into a rendering. While they can be useful for EXTREMELY simple renderings, they do not scale well to any level of complexity. Most very simple renderings may at some point gain a measure of complexity, which would then mean either very ugly and slow XSLT or a complete rewrite in a different rendering technology. Do yourself a favor and save the rewrite - use something other than XSLT in the first place.

    Verdict: Don’t use.



    Feel free to take these recommendations with a grain of salt. These are my opinions, based on the projects I’ve worked on. Your project may come across good reasons why I’m absolutely wrong about one of these options. Keep your eyes open :)

    C#4: Defining optional parameters in interfaces is very unreliable

    See this example

    In short, when the object is treated as the interface the default value from the interface applies. If it’s treated as the implementation, the implementation’s default applies. But the implementation isn’t required to have a default for it at all - in which case the call will fail, unless it’s treated as the interface.

    It’s all very unreliable, and while I’m sure it works this way for good technical reasons under the hood, realistically it doesn’t present a very coherent interface.

    Making asp:TextBox do HTML5 input types

    I was watching a presentation on HTML5 from one of my coworkers a few weeks ago and we were talking about the new input types in HTML5, like numbers and dates. They’re backward compatible with non-HTML5 browsers (which render them as text boxes), but provide very useful UI features particularly to smartphones where the on screen keyboard can change to be more appropriate.


    Anyway, let’s just say they’re a good idea. A good idea that, much as I like ASP.NET, are not likely to make an appearance any time soon in the core framework. So, I looked around to see if I could figure out a way to patch a standard asp:TextBox to render these new HTML5 types. Yes, you MVC types already can do this easily. For the rest of us that are using third party CMS tools, WebForms is still the way of life :)


    I came across a post by Phil Haack about a sneaky method of overriding HtmlTextWriter to change the attributes output by WebControls, and repurposed it to accomplish the HTML5-ization. Basically you override the AddAttribute() method and make it ignore the attribute you want to manually handle - in this case, the type attribute of the TextBox. Then I made a new TextMode-style enum that encompasses the HTML5 types and wrote a bit of logic that manually creates the right type attribute. The new enum is exposed in the Html5TextMode property, and the value is proxied back into the default TextMode property if it’s compatible with it.


    So how do you use it? It’s pretty simple:

    <prefix:Html5TextBox runat="server" Html5TextMode="Tel" ID="telephone" />
    

    And the code for the control itself:

    public class Html5TextBox : TextBox
    {
     /// <summary>
     /// When using non-HTML5 constructs this mode will be accurate. If using HTML5, it will return SingleLine.
     /// </summary>
     public override TextBoxMode TextMode
     {
      get
      {
       try
       {
        return (TextBoxMode)Enum.Parse(typeof(TextBoxMode), Html5TextMode.ToString());
       }
       catch(ArgumentException) { return TextBoxMode.SingleLine; }
      }
      set
      {
       Html5TextMode = (Html5TextBoxMode)Enum.Parse(typeof(Html5TextBoxMode), value.ToString());
      }
     }
    
     /// <summary>
     /// Sets the text mode of the control including HTML5 text modes such as Num and DateTime
     /// </summary>
     public Html5TextBoxMode Html5TextMode
     {
      get
      {
       object textMode = this.ViewState["5Mode"];
       if (textMode != null)
       {
        return (Html5TextBoxMode)textMode;
       }
       return Html5TextBoxMode.SingleLine;
      }
      set
      {
       this.ViewState["5Mode"] = value;
      }
     }
    
     /// <remarks>
     /// Adds the normal attributes (since the writer is actually a PatchedHtmlTextWriter from Render() the type attribute won't be rendered),
     /// then explicitly adds the appropriate type attribute including the HTML5 extensions
     /// </remarks>
     protected override void AddAttributesToRender(HtmlTextWriter writer)
     {
      base.AddAttributesToRender(writer);
    
      string type = null;
      if (Html5TextMode == Html5TextBoxMode.SingleLine)
       type = "text";
      else if (Html5TextMode == Html5TextBoxMode.DateTimeLocal)
       type = "datetime-local";
      else type = Html5TextMode.ToString().ToLowerInvariant();
    
      if (type != null)
       (writer as PatchedHtmlTextWriter).AddTypeAttribute(type);
     }
    
     /// <remarks>
     /// Patches the type of HtmlTextWriter that the control renders to
     /// </remarks>
     protected override void Render(HtmlTextWriter writer)
     {
      base.Render(new PatchedHtmlTextWriter(writer));
     }
    
     /// <summary>
     /// A version of HtmlTextWriter that intentionally ignores the "type" attribute when it is added.
     /// </summary>
     /// <remarks>
     /// Technique courtesy of Phil Haack
     /// http://haacked.com/archive/2006/01/18/UsingaDecoratortoHookIntoAWebControlsRenderingforBetterXHTMLCompliance.aspx
     /// </remarks>
     private class PatchedHtmlTextWriter : HtmlTextWriter
     {
      internal PatchedHtmlTextWriter(HtmlTextWriter basis) : base(basis) { }
    
      public override void AddAttribute(HtmlTextWriterAttribute key, string value)
      {
       if(key != HtmlTextWriterAttribute.Type)
        base.AddAttribute(key, value);
      }
    
      public override void AddAttribute(string name, string value)
      {
       if(name != "type")
        base.AddAttribute(name, value);
      }
    
      /// <summary>
      /// Manually adds a type attribute
      /// </summary>
      public void AddTypeAttribute(string value)
      {
       base.AddAttribute(HtmlTextWriterAttribute.Type, value);
      }
     }
    }
    
    /// <summary>
    /// Extends the TextMode enum with additional HTML5 types
    /// </summary>
    public enum Html5TextBoxMode { MultiLine, SingleLine, Password, DateTime, DateTimeLocal, Date, Month, Time, Week, Number, Range, Email, Url, Search, Tel, Color }
    

    Sitecore multi-site installations and output caching

    When running Sitecore in a multi-site configuration you may run into an odd issue: output caching may seem to get too greedy and not clear when you’d expect it to.

    There’s a simple culprit: the default Sitecore setup includes an event handler, Sitecore.Publishing.HtmlCacheClearer, that is invoked on the publish:end event. This event handler has a list of sites assigned to it, and the default is “website” - great, until you need to have more than one site and publishing doesn’t clear your site’s output cache. Fortunately it’s easy to configure more sites: just add more site nodes to the XML. You cannot however use config includes to allow each site to individually add itself to the list from its own config file.

    There’s also a nuclear option: you can implement your own event handler that clears all sites’ caches. I’m not sure if this would have a detrimental effect on any of the system sites (i.e. shell), but you could exclude it. An example of doing that:

    string[] siteNames = Factory.GetSiteNames();
    for (int i = 0; i < siteNames.Length; i++)
    {
       SiteInfo siteInfo = Factory.GetSiteInfo(siteNames[i]);
       if (siteInfo != null)
       {
            siteInfo.HtmlCache.Clear();
       }
    }