Quickly add SSL to Solr

There have been several people recently who I’ve seen having trouble setting up SSL for their Solr in order to use it with Sitecore 9. So, I present the following gist to you. It’s designed to automate the complete setup process of adding SSL to Solr with a self-signed certificate, and trusting that self-signed certificate. For production setups with a real certificate, it should be quite easy to modify.

It’s been tested on standalone as well as Bitnami Solr. The script requires Windows 10 to use the Import-PfxCertificate cmdlet; if you don’t have that you can remove the trust scripting and do it manually.

giphy

All about xConnect Security

Sitecore 9 introduces the new xConnect server to the ecosystem. xConnect is an abstracted service layer that Sitecore uses for all its analytics and marketing automation features. If you’re using Sitecore XP (aka xDB), you’ll need an xConnect server if you upgrade to Sitecore 9.

xConnect is noteworthy because it introduces client certificate authentication for the Sitecore XP server to communicate with xConnect. Certificates are a complex subject, and can fail in any number of less than helpful ways. This post aims to help you understand how certificates work in Sitecore 9, and provide you some tools to diagnose what’s wrong when they are not working right.

What is TLS?

In order to understand how xConnect works, it’s important to understand what’s going on: Transport Layer Security (TLS). You may also think of this as “SSL” or “HTTPS.”

TLS is a protocol for establishing secure encrypted connections between a server and a client. The key aspect of TLS is that the client and server can securely exchange encryption keys in such a way that they cannot be observed by malicious parties that may be watching the exchange.

Asymmetric vs Symmetric Encryption

To understand how TLS works, it’s important to understand the distinction between Asymmetric (also called Public Key) Encryption, and Symmetric Encryption.

If you ever made secret codes as a kid, you’ve probably used symmetric encryption. This is where the sender and receiver both need to know a key to decrypt the message, for example a simple shift cipher where D = A, E = B, and so forth. Julius Caesar famously sent secret messages by shifting letters three places forward like this. Symmetric encryption does have one major downfall, however: posession of the secret key lets you read any encrypted message even if not the intended recipient.

Asymmetric encryption on the other hand uses two different keys: a public key and a private key. The public key can be shared with anyone without compromising anything. However a client can use the public key to encrypt a message in such a way that it can only be decrypted with the server’s private key. In this way, you can receive private encrypted messages from clients you don’t share any secrets with - but they can still send the server private messages.

TLS uses asymmetric encryption to transfer an encryption key for symmetric encryption, which is used for ongoing data transfer over the encrypted connection. This is done because asymmetric encryption is much much slower than symmetric.

It’s important to understand the difference between public and private keys when you set up Sitecore 9, because they need to be deployed to different servers in your infrastructure. A certificate generally includes both a public and private key, however it can also include only a public key.

xConnect Setup

xConnect uses mutual authentication to secure the connections between it and the Sitecore XP server. This is accomplished using TLS client certificates.

If you’ve worked with SSL certificates before, this is a stronger form of SSL where not only does the client have to trust the server, but the server also has to trust a second certificate issued to the client. In this case, the client is the Sitecore XP server, and the server is the xConnect server. Let’s take a look at how this works:

SSL Server Certificate Negotiation

All SSL connections go through this process, whether xConnect or otherwise. In a standard Sitecore 9 XP installation, the xConnect server will have the server certificate installed. The Sitecore XP server will only have a server certificate if access to Sitecore itself, e.g. for administration, is done via SSL (in which case it will likely be a separate server certificate from xConnect’s).

  1. Client prepares to make a HTTPS request (e.g. you ask for https://xconnect)
  2. Client sends a ClientHello message to the server. This proposes encryption standards, among other things.
  3. The server replies with a ServerHello message back to the client. This includes the server’s public key, and the encryption standards that the server has selected from what the client proposed in the ClientHello.
  4. The client validates the server certificate (e.g. must have correct domain and trusted issuer)
  5. A symmetric encryption key is generated and exchanged using the server’s public key
  6. Now that an encrypted connection is established, a normal HTTP request is sent over the encrypted channel

What can go wrong with server certificate negotiation

The most common issues are domain mismatches and untrusted certificates. Generally you can diagnose issues with server certificates using a web browser - request the site over HTTPS and review the error shown in the browser. Make sure you request the xConnect server URL, not the Sitecore XP URL if you are diagnosing an xConnect connectivity issue.

Domain Mismatches

A domain mismatch occurs when a certificate’s domain does not match the domain being requested. For example, a certificate issued to sitecore.net will fail this validation if the site you’re requesting is https://foo.local. Certificates may also be issued using wildcards (e.g. *.sitecore.net). Note that wildcards apply to one level of subdomains only - so in the previous example sitecore.net or foo.sitecore.net would be valid, but bar.foo.sitecore.net would not be.

Domain matching is done based on the host header the server receives. For example if the xConnect server is https://xconnect but can also be accessed via https://127.0.0.1, the certificate will be invalid if the IP address is used because the certificate was not issued for 127.0.0.1.

If you have a domain mismatch issue, you will need to either get a new certificate (and update the xConnect IIS site(s) to use the new certificate) or change the domain for xConnect to one that is valid for the certificate.

Untrusted Certificates

To understand trust issues, it’s important to understand how certificates are issued. Certificates are issued by other certificates.

cert-issuer.jpg

In fact, certificates can be issued in chains (Xzibit would definitely approve). Trust issues occur when the certificate that issued the server certificate is not considered to be trusted by the client. On Windows, trust is established by being included in the Trusted Root Certification Authorities in the machine certificates:

trusted-certs.png

Note that to trust a certificate, only the public key for the server certificate must be imported here. If you’re using self-signed certificates that issued themselves - like localhost in the screenshot - you can add the certificate itself to the trusted root certificates by exporting it and reimporting it into the root certificates. If using a commercially issued certificate, that certification authority’s root certificates must be added to the trusted root - in most cases, they are already present.

More esoteric errors

There are some less common issues that can also cause server certificate negotiation errors. Servers will be commonly secured against supporting vulnerable ciphers, hash algorithms or SSL protocol versions. You might have heard of Heartbleed or POODLE vulnerabilities, or had to support TLS 1.2 if working with some web APIs such as SalesForce. This is a good idea, but if the server and client cannot mutually agree on a supported cipher, hash, and protocol version the connection will fail. If the certificate is trusted and has the correct domain, this would be the next thing to check.

If you’ve never heard of this before, you can secure your IIS servers using a tool like IISCrypto. Go do it now, this post will wait.

Note that the .NET HTTP client with framework versions prior to 4.6.2 defaults to only supporting TLS up to 1.1. Many modern security scripts will disable all TLS protocol versions except for 1.2, which will cause HTTP requests from clients with earlier versions of the .NET framework installed to fail.

SSL Client Certificate Negotiation

Hopefully now you have a decent idea of how server certificates work. But xConnect also uses client certificates. A client certificate enables mutual authentication. With only a server certificate, the client must decide to trust the server but the server has no way to know if it should trust the client. Enter client certificates.

A client certificate is essentially the opposite of the server certificate. When using a client certificate, the negotiation works similarly to the server certificate, except that when the server sends the ServerHello (#3 above) it requests a client certificate in addition to sending its public key. The client then sends the public key of its client certificate back to the server - and then the server decides whether it should trust the client certificate.

If the client certificate is not trusted, it is rejected. The rules for validating a client certificate are up to the server and do not necessarily follow the same validation rules as a server certificate on the client. In the case of xConnect:

  • The domain/subject on the client certificate does not seem to matter to xConnect
  • The trusting of the certificate is done using the thumbprint of the certificate (a hash of the certficate which uniquely identifies it). Note that the thumbprint will change when an expired certificate is renewed, so you will need to reconfigure xConnect after renewing a client certificate so that it trusts the newer thumbprint.
  • The xConnect server must trust the issuer of the client certificate

What can go wrong with client certificate negotiation

There are a lot of things that can go wrong with the client certificate, moreso than the server certificate. When troubleshooting, make your first step the Sitecore XP logs - they generally have some basic information about a bad client cert.

If you’re receiving HTTP 4xx responses

Chances are your client certificate validation failed. This could mean:

  • The client certificate is not installed on both the Sitecore XP server and the xConnect server (the xConnect server would only need the public key)
  • The client certificate is not considered trusted on the xConnect server
  • The certificate thumbprint configured in the xConnect server’s App_Config\ConnectionStrings.config is missing or incorrect. Note that the thumbprint must be all uppercase with no spaces or colons. If copied from certificate manager, an unprintable character might prefix the thumbprint - check for a hidden character there.
  • The certificate location configured in the xConnect server’s App_Config\ConnectionStrings.config is incorrect. Normally the certificate should be stored in local machine certificates and have a connection string similar to StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=THUMBPRINTVALUE.

“The certificate was not found”

This indicates one of two things:

  • The thumbprint is incorrect in the Sitecore XP server’s App_Config\ConnectionStrings.config file. Note that the thumbprint must be all uppercase with no spaces or colons. If copied from certificate manager, an unprintable character might prefix the thumbprint - check for a hidden character there.
  • The certificate location configured in the Sitecore XP server’s App_Config\ConnectionStrings.config is incorrect. Normally the certificate should be stored in local machine certificates and have a connection string similar to StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=THUMBPRINTVALUE.

System.Net.WebException: The request was aborted: Could not create SSL/TLS secure channel.

As long as the server certificate is valid, this message is most likely that the Sitecore XP server’s IIS app pool user account does not have read access to the client certificate’s private key. This access is needed so that the Sitecore XP server can encrypt communications using its client certificate.

To remedy this issue, open the local machine certificates (“Manage computer certificates” in a start menu search) on the Sitecore XP server. Find the client certificate (normally under Personal\Certificates). Right click it, choose All Tasks, then Manage Private Keys.... You should get a security assignment window like this:

pkey-security.png

Next, add your IIS app pool user to the ACLs and grant it Read permissions (as above). Remember if you’re using AppPoolIdentity (you should be, unless using a domain account for windows auth to SQL), you must select the account by choosing Local Computer as the search location, and enter IIS APPPOOL\MyAppPoolsName as the account name to find.

Still having issues? Well, you can also use the security audit log to find out which account is failing to get access, then add that account in the key ACLs above:

key-security-audit-failure.png

Local Development Tip

If you work at a Sitecore partner and will have multiple copies of Sitecore 9 running locally, this can cause issues if you issue a dedicated SSL server certificate to each site. This is because a given TCP port (e.g. 443, the default) can only have one SSL certificate bound to it. This precludes having multiple Sitecore 9 instances running together locally unless they share the same SSL certificate.

Wildcard certificates are perfect for this job. As long as you use the same top level suffix for all your dev sites (e.g. *.local.dev), you can use the same wildcard certificate for your server certificate for all dev sites. Note that IIS’ self-signed certificate generator will not create a wildcard certificate for you. You’ll have to use something else, like New-SelfSignedCertificate, to create one.

Important note: You must issue a wildcard for at least two segments of domain for it to be trusted. For example *.dev is bad, but *.local.dev is good.

Note that client certificates should be unique on each site, only the server certificate should be shared.

In the release version of Sitecore 9, you can also disable the requirement to use encryption with xConnect which can bypass a lot of debugging. Do not do this in production or else a herd of elephants will destroy you.

Advanced Debugging with Wireshark

It’s possible to watch the SSL negotiation at a TCP/IP level using a network monitor such as Wireshark. This can provide insights on why your setup is failing when error messages are less than optimal. For example I spent a couple days diagnosing what turned out to be private key security issues. I figured this out by using Wireshark and observing that the client was never sending its client certificate after the server requested it, and figuring out why that was.

To use Wireshark to watch SSL traffic, you’ll have to set it up to decrypt traffic. This guide walks you through setting up decryption on Windows with an exported private key.

If you’re tracing local dev traffic (e.g. from localhost to localhost, including using your machine’s DNS name) Wireshark will not capture that unless you install npcap instead of the default pcap packet capture software. Once npcap is installed, tell Wireshark to bind to the Npcap Loopback Adapter to see local traffic.

Here is a screenshot of the Wireshark capture where I diagnosed the client certificate security issue:

wireshark-trace.png

Good luck!

Where to find Sitecore documentation

keep-calm-and-rtfm.jpg

The land of Sitecore documentation is becoming a bit crowded these days. While at Symposium, I heard some people say they didn’t know how to keep up on new documentation - so here’s what I know. No doubt I missed some resources too, but these are the ones I usually use and follow.

Official Docs

Sitecore Doc Site

This is the main place to find documentation for Sitecore, as well as Sitecore modules. It has a handy RSS feed of updated articles you should subscribe to.

Unfortunately the RSS feed is not entirely complete due to documentation microsites being proxied in under the main doc site (for example Commerce and the v9 Scaling Guide). These statically generated sites generally do not provide their own RSS feeds, and are thus harder to track updates to.

Sitecore Knowledge Base

The Sitecore KB lists known issues, support resolutions, security bulletins, and other support information. Like the main doc site, it has its own RSS feed of updated articles that is absolutely worth subscribing to.

Sitecore Helix Docs

Sitecore’s official architecture guidance has its own website. Unfortunately, no RSS feed of updates.

Sitecore JSS Docs

The JavaScript Services module has its own separate documentation site. Unfortunately, no RSS feed of updates.

Sitecore Dev Site

Where to go to actually download Sitecore releases and official modules such as SXA and PXM. There’s no RSS feed of new releases and updates, unfortunately.

Community-run Docs

Sitecore Blog Feed

A Sitecore-run blog aggregator that serves up a fresh helping of most major Sitecore blogs. Worth subscribing to via its RSS feed.

Sitecore StackExchange

A community-driven Q&A site that’s part of StackExchange. If you have a question about Sitecore, there are many highly active members who are happy to help here.

Sitecore Slack

Slack is a group messaging/discussion tool. The Sitecore Community Slack group has over 2,700 Sitecore developers with very active participation. If you do Sitecore, you should be here.

Unofficial Sitecore Training

Community run unofficial training videos that cover development practices that are commonly used, but not covered in official Sitecore training. More opinionated, influenced heavily by real-world implementation experiences.

Sitecore Community Docs

Unofficial documentation. Not updated that often any longer but still some good information, especially the article on config patching.

Sitecore Powershell Extensions Docs

The SPE documentation is so complete that it’s worth mentioning even though it’s for a single Sitecore module.

Unicorn 4 Released

unicorn

I’m happy to announce the final release of Unicorn 4.0! Unicorn 4 comes with significant performance and developer experience improvements, along with bug fixes. Unicorn 4 is available from NuGet or GitHub.

Project Dilithium and Performance

Unicorn 4 is faster - a lot faster. Check out these benchmarks:

performance benchmarks

The speed increase is due to optimized caching routines, as well as the Dilithium batch processors. Dilithium is an optional feature that is off by default: because of its newness, it’s still experimental. I’m using it in production though. Give it a try - it can always be turned off without hurting anything.

For more detail into how Unicorn 4 is faster, and what Dilithium does, check out this detailed blog post.

Improved Modular Configuration

Unicorn 4 features a refactored configuration system that is designed to support Sitecore Helix projects with an improved configuration experience. The new config system is completely backwards-compatible, but now enables configuration inheritance, configuration variables, and configuration extension so that modular projects can encode their conventions (e.g. paths to include, physical paths) into one base config and all the module configs can extend it.

This drastically reduces the verbosity of the module configurations, and improves their maintainability by allowing conventions to be DRY. Here’s a very simple example of a base conventions configuration:

1
2
3
4
5
6
<configuration name="Habitat.Feature.Base" abstract="true">
<targetDataStore physicalRootPath="$(sourceFolder)\$(layer)\$(module)\serialization" />
<predicate>
<include name="$(layer).$(module).Templates" database="master" path="/sitecore/templates/$(layer)/$(module)" />
</predicate>
</configuration>

And here’s a module configuration that extends it:

1
2
3
4
5
6
<configuration
name="Feature.News"
extends="Habitat.Feature.Base">
<!-- automatically stores items at $(sourceFolder)\Feature\News\serialization -->
<!-- automatically includes /sitecore/templates/Feature/News -->
</configuration>

There’s a lot more that you can do with the configuration enhancements in Unicorn 4 too. For additional details, read this extensive blog post.

Sitecore PowerShell Extensions Support

unicorn + spe

Just about anything you can do with Unicorn can now be automated using Sitecore PowerShell Extensions in Unicorn 4. You can now run Unicorn SPE cmdlets to…

Console Output Scaling

The Unicorn console has received a serious upgrade in Unicorn 4. If you’ve ever run a sync that changed a large number of items from the Unicorn Control Panel, you may have noticed the browser slow to a crawl and the sync seem to almost stop. The console that underpins Unicorn 3 and earlier started to choke at around 500 lines.

No longer! Unicorn 4’s upgraded console has spit out 100,000 lines without a hitch, and it should scale beyond that.

The automated tool console (PowerShell API) has also received an upgrade. Previously the tool console buffered all the output of a sync before sending it back. This caused problems in certain environments, namely Azure, where TCP connections that don’t send any data for more than four minutes are terminated. This would cause long-running syncs in Azure to die unexpectedly.

In Unicorn 4 the automated tool console emits data in a stream just like the control panel console. There’s also a heartbeat timer where if no new console entries are made for 30 seconds, a . will be sent to make sure the connection is kept active.

The streaming tool console also requires updating your Unicorn.psm1 file - not only will you get defense against TCP timeouts, you’ll also be able to see the sync occur in real time using the PSAPI just like you would from the control panel. No more waiting until it’s done to see how things are going :)

Predicates can Exclude by Template ID or Name

Unicorn 4 can now exclude items from a configuration by template ID, thanks to Alan Płócieniak. See also Alan’s original post on the technique.

1
2
3
<include name="Template ID" database="master" path="/sitecore/allowed">
<exclude templateId="{3B4F2B85-778D-44F3-9B2D-BEFF1F3575E6}" />
</include>

You can also exclude items by a regular expression of their name. This enables scenarios such as wanting to include all templates, but exclude all __Standard values items.

1
2
3
<include name="Name pattern" database="master" path="/sitecore/namepattern">
<exclude namePattern="^__Standard values$" />
</include>

The complete grammar for predicates is always in the predicate test config.

Breaking Changes

Unicorn 4’s breaking changes do not break any common use-cases of Unicorn, but review them to see if they affect you.

  • The __Originator field is now serialized by default. This enables proper tracking of the origin of items instantiated from branch templates.
  • Multithreaded sync support has been removed due to Sitecore bugs preventing realistic use. This was disabled by default already. Dilithium is faster than multithread sync ever was.
  • The Rainbow UseLegacyAttributeFormatting (formats items in Unicorn < 3.1 format) setting has been removed. The new format is now always used. This has always been off by default.
  • Rainbow FieldComparers are no longer activated using the Sitecore Factory, so they only support parameterless constructors (this would only affect custom comparers; the stock ones have always been parameterless)
  • Due to the console upgrades, a new Unicorn.psm1 is required if you are using Unicorn’s PowerShell API. This file also now ships in the NuGet package, so you can be sure you’re getting the right version for your Unicorn.

Bug Fixes

  • Transparent sync misc fixes (to renaming, saving display names, instantiating branch templates into TpSync areas)
  • Renaming an item and changing fields on it in one operation (only possible with Sitecore API) now no longer loses the additional field data in the serialized item
  • Improved output and logging, clarified messaging, improved developer experience
  • Content editor warnings now handle items in more than one configuration correctly
  • Control panel now displays which paths are invalid when initial serialization is blocked due to invalid include paths
  • The required password length for user serialization is now configurable, should you really really want to use b
  • Using Fast Query while any Transparent Sync configuration is active will now log a warning to the Sitecore logs (fast query cannot find transparent sync items, so items may not be returned as expected). This can be disabled if your logs start to fill with spam and you understand the issue.
  • PowerShell API now defaults to not write secrets to the console (debug is off by default) for secure-by-default-ness
  • Fixed a background exception that could occur when modifying serialized items on disk in rare cases, which could cause the app pool to terminate #222
  • The Rainbow data cache now correctly invalidates if an item is moved or renamed on disk after being added to the cache
  • Choosing many configurations to sync will no longer push the sync button off the page #232
  • The default console logging level for interactive syncing has been changed from Info to Debug, since there is no longer a scaling issue with the console output. This provides better insights into what has been changed on items without needing to see the Sitecore logs

Upgrading

If you’re coming from classical Unicorn 3.1 or later, upgrading is actually really simple: just upgrade your NuGet package. Unicorn 4 changes nothing about storage or formatting (except that the __Originator field is no longer ignored by default), so all existing serialized items are compatible.

If you’re invoking Unicorn via its remote PowerShell API, make sure to upgrade your Unicorn.psm1 to the Unicorn 4 version to ensure correct error handling with the streaming console.

Thanks

Thank you to the community members who contributed code and bug reports to this release.

Address

early design for Unicorn 5

Simplifying Contact Facets with C# 6

Contact Facets allow you to persist information about visitors into the Sitecore xDB. We’re not going to get into the theory behind them in this post; for that go read Pete Navarra’s great blog post that summarizes current practices and how to add facets.

Today we’re going to discuss how to syntactically improve the declaration of a contact facet class using syntaxes available in C# 6.0 (VS 2015) and C# 7.0 (VS 2017). It’s important to note that the C# version is decoupled from the .NET framework version: the C# 7.0 compiler is perfectly capable of emitting C# 7 syntax to a .NET 4.5-targeted assembly, for instance. So you can use these modern language features as long as you’ve got the right version of MSBuild or Visual Studio :)

Here’s the example Pete uses in his post, which follows other examples out there as well:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
using System;
using Sitecore.Analytics.Model.Framework;
namespace SitecoreHacker.Sandbox.Facets
{
[Serializable]
public class MarketingData: Facet, IMarketingData
{
private const string CUSTOMER_ID = "CustomerId";
private const string SEGEMENT = "Segment"; // sic :p
#region Properties
public string CustomerId
{
get { return GetAttribute<string>(CUSTOMER_ID); }
set { SetAttribute(CUSTOMER_ID, value); }
}
public string Segment
{
get { return GetAttribute<string>(SEGEMENT); }
set { SetAttribute(SEGEMENT, value); }
}
#endregion
public MarketingData()
{
EnsureAttribute<string>(CUSTOMER_ID);
EnsureAttribute<string>(SEGEMENT);
}
}
}

As you can see, the facet API requires string keys for the facet values - in this case stored as const string - to get and set them. Further, as Pete notes:

I found out the hard way that the constants defined, the value must equal the actual name of the class property for the same attribute.

Well in C# 6 (VS 2015), there’s a syntax for that. The nameof statement allows you to get the string name of a variable or property. This essentially hands off the management and maintenance of the const value to the compiler, instead of the developer.

So we can clean up this example by using nameof instead of constants - and get as a bonus refactoring support and compile-time validation of the names:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
using System;
using Sitecore.Analytics.Model.Framework;
namespace Elephant.Sandbox.Facets
{
[Serializable]
public class MarketingData: Facet, IMarketingData
{
public string CustomerId
{
get { return GetAttribute<string>(nameof(CustomerId)); }
set { SetAttribute(nameof(CustomerId), value); }
}
public string Segment
{
get { return GetAttribute<string>(nameof(Segment)); }
set { SetAttribute(nameof(Segment), value); }
}
public MarketingData()
{
EnsureAttribute<string>(nameof(CustomerId));
EnsureAttribute<string>(nameof(Segment));
}
}
}

Finally if you have C# 7.0 (VS 2017), you can also utilize expression bodied members to further clean up the property syntax:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
using System;
using Sitecore.Analytics.Model.Framework;
namespace Rhino.Sandbox.Facets
{
[Serializable]
public class MarketingData: Facet, IMarketingData
{
public string CustomerId
{
// expression bodied members turn the single-expression get
// into a lamdba-style syntax, removing the need for braces
get => GetAttribute<string>(nameof(CustomerId));
set => SetAttribute(nameof(CustomerId), value);
}
public string Segment
{
get => GetAttribute<string>(nameof(Segment));
set => SetAttribute(nameof(Segment), value);
}
public MarketingData()
{
EnsureAttribute<string>(nameof(CustomerId));
EnsureAttribute<string>(nameof(Segment));
}
}
}

So there - now go forth and put your data in the xDB :)

Unicorn 4 Part III: Configuration Enhancements

TL;DR: Unicorn 4 prerelease is on NuGet right now!

Now that that’s out of the way, let’s talk about another new Unicorn 4 feature: modular architecture friendly configurations.

everywhere.jpg

When Habitat first launched, I was mildly incredulous at the amount of duplication in its Unicorn configurations. Tons of tiny modules, all of which shared similar but not identical configurations (such as custom root folders) was not really a consideration when multiple configurations were originally conceived. Fast forward to today, and that’s a major use case that is more difficult than it needs to be.

Here’s an example of a Habitat Unicorn configuration:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
<sitecore>
<unicorn>
<configurations>
<configuration
name="Feature.News"
description="Feature News"
dependencies="Foundation.Serialization,Foundation.Indexing"
patch:after="configuration[@name='Foundation.Serialization']">
<targetDataStore physicalRootPath="$(sourceFolder)\feature\news\serialization"
type="Rainbow.Storage.SerializationFileSystemDataStore, Rainbow" useDataCache="false"
singleInstance="true" />
<predicate type="Unicorn.Predicates.SerializationPresetPredicate, Unicorn" singleInstance="true">
<include name="Feature.News.Templates" database="master" path="/sitecore/templates/Feature/News" />
<include name="Feature.News.Renderings" database="master" path="/sitecore/layout/renderings/Feature/News" />
<include name="Feature.News.Media" database="master" path="/sitecore/media library/Feature/News" />
</predicate>
<roleDataStore type="Unicorn.Roles.Data.FilesystemRoleDataStore, Unicorn.Roles" physicalRootPath="$(sourceFolder)\feature\news\serialization\Feature.News.Roles" singleInstance="true"/>
<rolePredicate type="Unicorn.Roles.RolePredicates.ConfigurationRolePredicate, Unicorn.Roles" singleInstance="true">
<include domain="modules" pattern="^Feature News .*$" />
</rolePredicate>
</configuration>
</configurations>
</unicorn>
</sitecore>
</configuration>

It’s long and it has a ton of boilerplate that is either identical in every module, or else defined by system conventions (e.g. physicalRootPath). We don’t need to be that verbose when using Unicorn 4. When we setup a modular, convention-based system using Unicorn 4 we can start by using abstract configurations to define the conventions of our system:

1
2
3
4
5
6
7
8
9
10
11
12
<configuration name="Habitat.Feature.Base" abstract="true">
<targetDataStore physicalRootPath="$(sourceFolder)\$(layer)\$(module)\serialization" />
<predicate>
<include name="$(layer).$(module).Templates" database="master" path="/sitecore/templates/$(layer)/$(module)" />
<include name="$(layer).$(module).Renderings" database="master" path="/sitecore/layout/renderings/$(layer).$(module)" />
<include name="$(layer).$(module).Media" database="master" path="/sitecore/media library/$(layer).$(module)" />
</predicate>
<roleDataStore type="Unicorn.Roles.Data.FilesystemRoleDataStore, Unicorn.Roles" physicalRootPath="$(sourceFolder)\$(layer)\$(module)\serialization\$(layer).$(module).Roles" singleInstance="true"/>
<rolePredicate type="Unicorn.Roles.RolePredicates.ConfigurationRolePredicate, Unicorn.Roles" singleInstance="true">
<include domain="modules" pattern="^$(layer) $(module) .*$" />
</rolePredicate>
</configuration>

This configuration defines a configuration that other configurations can extend. Because of its abstract-ness it is not a Unicorn configuration itself, only a template. Non-abstract configurations may also be extended.

This abstract configuration is also making use of Unicorn 4’s ability to do variable replacement in configurations. The $(layer) and $(module) variables are expanded in the extending configuration and are based on the convention of naming your configurations Layer.Module. You can also expand more than one config per module and use your own variables. Using our abstract Habitat.Feature.Base configuration above, the same Feature.News configuration we started with can now be expressed much more simply:

1
2
3
4
5
6
<configuration
name="Feature.News"
description="Feature News"
dependencies="Foundation.Serialization,Foundation.Indexing"
extends="Habitat.Feature.Base">
</configuration>

Nice huh? But what if you want to extend or replace a dependency in the inherited configuration? You can do that, too - and using Unicorn 4’s element inheritance system you can also do it very cleanly. Unicorn configurations have always been architecturally a set of independent IoC containers. The <defaults> node in Unicorn.config sets up the defaults, and then each configuration’s nodes override and replace the defaults if they exist. This is how you can deploy only new items with the NewItemsOnlyEvaluator - you’re replacing the default evaluator with a different dependency implementation.

Unicorn 4 takes this a step further: with config inheritance, dependencies can be partially extended at an element level. You might have noticed this already in the Habitat.Feature.Base configuration, when we did this:

1
<targetDataStore physicalRootPath="$(sourceFolder)\$(layer)\$(module)\serialization" />

In Unicorn 3, this would have required a type attribute. In Unicorn 4, unless you specify a type attribute, any attributes you add either replace or add to the default (or inherited) implementation. So instead this kept the same default dependency definition and changed an attribute on it - the physicalRootPath.

If you do specify a type, nothing is inherited and it works like Unicorn 3. Thus existing configurations will also work without modification :)

But what about things that have more than just attributes, like the predicate‘s include nodes? You can append elements in the inherited configuration in that case. If we take our Habitat.Feature.Base configuration above and extend it like this:

1
2
3
<predicate>
<include name="Foo" database="master" path="/sitecore/Foo" />
</predicate>

The end result is effectively:

1
2
3
4
5
6
<predicate type="Unicorn.Predicates.SerializationPresetPredicate, Unicorn" singleInstance="true">
<include name="Feature.News.Templates" database="master" path="/sitecore/templates/Feature/News" />
<include name="Feature.News.Renderings" database="master" path="/sitecore/layout/renderings/Feature/News" />
<include name="Feature.News.Media" database="master" path="/sitecore/media library/Feature/News" />
<include name="Foo" database="master" path="/sitecore/Foo" />
</predicate>

You cannot remove inherited predicate nodes (or other dependencies that use children like fieldFilter), so plan accordingly: adding elements only.

And there you have it: with Unicorn 4 you can reasonably simply create serialization conventions for your modules and avoid configuration duplication - or if you’re not ready to go modular, you can at least enjoy not needing to have a type on most configuration nodes.

But Wait, There’s More 🐘: The Console No Longer Sucks

In the Control Panel…

The Unicorn console has also received a serious upgrade in Unicorn 4. If you’ve ever run a sync that changed a large number of items from the Unicorn Control Panel, you may have noticed the browser slow to a crawl and the sync seem to almost stop. The console that underpins Unicorn 3 and earlier started to choke at around 500 lines.

No longer! Unicorn 4’s console has spit out 100,000 lines without a hitch.

Automated Tools (PowerShell API)

The automated tool console has also received an upgrade. Previously the tool console buffered all the output of a sync before sending it back. This caused problems in certain environments, namely Azure, where TCP connections that don’t send any data for more than 4 minutes are terminated. This would cause any long-running syncs in Azure to die unexpectedly.

In Unicorn 4 the automated tool console emits data in a stream just like the control panel console. There’s also a heartbeat timer where if no new console entries are made for 30 seconds, a . will be sent to make sure the connection is kept active.

The streaming console also requires updating your Unicorn.psm1 file - not only will you get defense against TCP timeouts, you’ll also be able to see the sync occur in real-time using the PSAPI just like you would from the control panel. No more waiting until it’s done to see how things are going :)

Can I have it yet?

Absolutely. You can find Unicorn 4.0.0-pre03 on NuGet right now!

How stable is this?

More stable than you might think. Unicorn 4 is largely additions, fixes, and enhancements to the already stable codebase behind Unicorn 3. The core pieces have not changed very much, unless you enable Dilithium and that’s optional. The new config inheritance stuff has 97% code coverage. That’s not to say it’s bug free either. If you find bugs let me know and I’ll fix them :)

What about installing it?

Installation is just like Unicorn 3: Install the Unicorn NuGet package, and follow the directions in the README that will launch on installation to set up configuration(s).

NOTE: Dilithium ships disabled by default. If you want to enable it, make a copy of Unicorn.Dilithium.config.example and enable it.

What about upgrading to it?

If you’re coming from classical Unicorn 3.1 or later, upgrading is actually really simple: just upgrade your NuGet package. Unicorn 4 changes nothing about storage or formatting (except that the __Originator field is no longer ignored by default), so all existing serialized items are compatible.

Taking advantage of the config enhancements detailed above is also entirely optional: Unicorn 3 configurations are totally readable by Unicorn 4.

If you’re invoking Unicorn via its PowerShell API, make sure to upgrade your Unicorn.psm1 to the Unicorn 4 version to ensure correct error handling with the streaming console.

Have fun!

Unicorn 4 Preview Part 2.5: Generating Unicorn Packages with SPE

Last time we talked about how Sitecore PowerShell Extensions support was coming to Unicorn 4. This time, we’ve got a new cmdlet to share.

Over time, many people have asked if there was a way to generate Sitecore packages from Unicorn. The answer has always been no, for many good reasons: packages install slowly, cannot ignore specific fields, or process advanced exclusions like a Unicorn predicate can. This makes them much less safe (and much slower) for deployment purposes compared to a remotely invoked Sync using deployed serialized items.

But there is a great use case for generating packages from Unicorn: authoring modules. As a module author, a method is needed to track the items that belong to your module and also to reliably create Sitecore packages for distribution of your module which contain those items. Unicorn is a natural fit for simply tracking module items, but it has lacked the ability to automatically push updates to release packages like it can to serialized items. This unnecessarily complicates things and reduces release reliability. That’s bad.

So when Michael West and Adam Najmanowicz, the authors of Sitecore PowerShell Extensions, asked if there was a way we could export Unicorn configurations to packages my answer was absolutely.

SPE has long had packaging support built into it, and in fact SPE’s release packages are built using SPE. Unicorn packaging support is also implemented through SPE, and here’s how it works:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Create a new Sitecore Package (SPE cmdlet)
$pkg = New-Package
# Get the Unicorn Configuration(s) we want to package
$configs = Get-UnicornConfiguration "Foundation.*"
# Pipe the configs into New-UnicornItemSource
# to process them and add them to the package project
# (without -Project, this would emit the source object(s)
# which can be manually added with $pkg.Sources.Add())
$configs | New-UnicornItemSource -Project $pkg
# Export the package to a zip file on disk
Export-Package -Project $pkg -Path "C:\foo.zip"

And when you’re done with that, c:\foo.zip would contain a package that when installed will contain the entire contents of any Unicorn configuration matching Foundation.*.

New-UnicornItemSource also accepts parameters to specify package installation options, exactly like SPE’s New-ExplicitItemSource. This cmdlet is also very similar to how New-UnicornItemSource works: each item that is included in the configuration is added to the package as an explicit item source. Doing this also means that the exported package completely respects the Unicorn Predicate, including exclusions of child paths (note that if you specify -InstallMode Overwrite, excluded children may be deleted by the package).

Questions?

Are the packaged items pulled directly from the serialized items?

No, they are pulled from the Sitecore database because the Sitecore packaging APIs work in Items. So make sure to sync before you generate a package. Unless you’re using Transparent Sync in which case the items will already be up to date.

Should I use this to deploy my site?

No. As mentioned above, packages are a slower and more dangerous method to deploy item updates to your site.

Does this mean modules will start requiring Unicorn? 🐘

No. Unicorn would only be used in the development of the module, and the build process used to generate plain old Sitecore Packages for module releases. The module itself would need depend on neither Unicorn, Rainbow, or SPE.

Unicorn 4 Preview, Part 2: SPE Support

Unicorn 4 will feature full support for Sitecore PowerShell Extensions (SPE) to perform Unicorn actions. If you’ve ever wanted deep programmatic control over Unicorn (for example “I want to sync Foundation.*”), or if you’ve got an existing deployment process that’s already using SPE Remoting to perform deployment tasks - this is for you.

spe.png

To use Unicorn cmdlets in SPE, all that is necessary is to install the SPE package along with Unicorn. Unicorn 4 comes with configuration that remains quiescent until SPE is installed that will automatically enable the Unicorn cmdlets. In case that’s not clear enough: SPE is an optional addition and will not be required to use Unicorn 4.

So what can we do with Unicorn cmdlets for SPE?

Configurations

1
2
3
4
5
6
7
8
# Get one
Get-UnicornConfiguration "Foundation.Foo"
# Get by filter
Get-UnicornConfiguration "Foundation.*"
# Get all
Get-UnicornConfiguration

The result of Get-UnicornConfiguration is an array of IConfiguration objects, which you can spelunk (e.g. with their Name property) or pass to other cmdlets. Configurations are read only.

Syncing

Sync cmdlets make use of Write-Progress to provide a similar progress bar experience to the Control Panel, albeit a bit less responsive.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Sync one
Sync-UnicornConfiguration "Foundation.Foo"
# Sync multiple by name
Sync-UnicornConfiguration @("Foundation.Foo", "Foundation.Bar")
# Sync multiple from pipeline
Get-UnicornConfiguration "Foundation.*" | Sync-UnicornConfiguration
# Sync all, except transparent sync-enabled configurations
Get-UnicornConfiguration | Sync-UnicornConfiguration -SkipTransparent
# Optionally set log output level (Debug, Info, Warn, Error)
Sync-UnicornConfiguration -LogLevel Warn

For example:

sync.png

Partial Syncing

Sometimes you want to only sync a portion of a configuration. You can do that with PowerShell using Sync-UnicornItem.

1
2
3
4
5
6
7
8
# Sync a single item (note: must be under Unicorn control)
Get-Item "/sitecore/content" | Sync-UnicornItem
# Sync multiple single items (note: all must be under Unicorn control)
Get-ChildItem "/sitecore/content" | Sync-UnicornItem
# Sync an entire item tree, show only warnings and errors
Get-Item "/sitecore/content" | Sync-UnicornItem -Recurse -LogLevel Warn

Reserializing

The cmdlet to reserialize is called Export-UnicornConfiguration because Reserialize is not an approved verb for a cmdlet :)

1
2
3
4
5
6
7
8
# Reserialize one
Export-UnicornConfiguration "Foundation.Foo"
# Reserialize multiple by name
Export-UnicornConfiguration @("Foundation.Foo", "Foundation.Bar")
# Reserialize from pipeline
Get-UnicornConfiguration "Foundation.*" | Export-UnicornConfiguration

Partial Reserializing

Sometimes you want to only reserialize a portion of a configuration. You can do that with PowerShell using Export-UnicornItem.

1
2
3
4
5
6
7
8
# Reserialize a single item (note: must be under Unicorn control)
Get-Item "/sitecore/content" | Export-UnicornItem
# Reserialize multiple single items (note: all must be under Unicorn control)
Get-ChildItem "/sitecore/content" | Export-UnicornItem
# Reserialize an entire item tree
Get-Item "/sitecore/content" | Export-UnicornItem -Recurse

Converting to Raw YAML

You can also dump out the raw YAML for an item - or items. The output of ConvertTo-RainbowYaml is either a string or array of strings depending on how many items were passed to it. Note that unless -Raw is specified, the default field formatters and excluded fields Unicorn ships with are used. These are non-customizable and do not follow Unicorn defaults if changed.

enyaml.png

This capability enables casual use of YAML serialization without having to use Unicorn or set up a configuration. It’s not a good solution for general purpose synchronization though simply because the nuances of storing trees of items in files are many. Very many. But I’m curious what uses people will find for this :)

1
2
3
4
5
6
7
8
9
10
# Convert an item to YAML format (always uses default excludes and field formatters)
Get-Item "/sitecore/content" | ConvertTo-RainbowYaml
# Convert many items to YAML strings
Get-ChildItem "/sitecore/content" | ConvertTo-RainbowYaml
# Disable all field formats and field filtering
# (e.g. disable XML pretty printing,
# and don't ignore the Revision and Modified fields, etc)
Get-Item "/sitecore/content" | ConvertTo-RainbowYaml -Raw

Converting from Raw YAML

convertfrom.png

In Rainbow the IItemData interface is the internal unit of an Item. The ConvertFrom-RainbowYaml cmdlet converts raw YAML string(s) into IItemData which you can then spelunk as objects or deserialize as needed.

1
2
3
4
5
6
# Get IItemDatas from YAML variable
$rawYaml | ConvertFrom-RainbowYaml
# Get IItemData and disable all field filters
# (use this if you ran ConvertTo-RainbowYaml with -Raw)
$yaml | ConvertFrom-RainbowYaml -Raw

Deserialization

To deserialize items, use Import-RainbowItem which takes IItemData items in and deserializes them into the Sitecore database. No comparison is done before deserialization, which makes this a bit slower than a full Unicorn sync.

elephant.png

As a shorthand Import-RainbowItem also accepts YAML strings, however as IItemData can represent any sort of item, it is not limited only to deserializing YAML-sourced items.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# Deserialize IItemDatas from ConvertFrom-RainbowYaml
$rawYaml | ConvertFrom-RainbowYaml | Import-RainbowItem
# Deserialize raw YAML from pipeline into Sitecore
# Shortcut bypassing ConvertFrom-RainbowYaml
$yaml | Import-RainbowItem
# Deserialize and disable all field filters
# (use this if you ran ConvertTo-RainbowYaml with -Raw)
$yaml | Import-RainbowItem -Raw
# Deserialize multiple at once
$yamlStringArray | Import-RainbowItem
# Complete example that does nothing but eat CPU
Get-ChildItem "/sitecore/content" | ConvertTo-RainbowYaml | Import-RainbowItem

Questions?

Does this mean the existing PowerShell Remote API is obsolete?

No. The existing PowerShell API uses Windows PowerShell to provide remote syncing capability and does not require installing Sitecore PowerShell. They serve different parallel purposes, and both are here to stay.

You have no gifs or memes in this post. Is something wrong?

you mad?

Can I have a beta yet?

I’ll be releasing a beta once I finish the features I have planned. Yep, there’s at least one more ;)

Unicorn 4 Preview, Part 1: Project Dilithium

Not too long ago I was considering the future of Unicorn. Rather naïvely, I couldn’t think of all that much that needed fixing. I wondered if there would ever be a Unicorn 4.

Now since you’ve read the title of this blog post, something tells me you don’t think that attitude lasted very long. It didn’t. Because there’s a pretty simple truth about Unicorn, even with all the optimizations I put in to Unicorn 3:

tdh.jpg

There were two primary causes of this:

  1. The Sitecore API is slow and not suited to dealing with large quantities of items
  2. Unicorn was originally designed to optimize for a few configurations, and Helix was blasting the number of active configurations to large numbers

I found that the sync time was becoming annoyingly long. Long enough that it was a distraction, long enough that a piddly 20% faster from more profiling just wasn’t good enough.

What to do, what to do.

Introducing Project Dilithium

I started on a crazy quest to speed up syncing performance. Sitecore’s recently added Publishing Service gained drastic speed gains in publishing by leveraging direct database access. Publishing, like syncing, is a batch-oriented operation. The Sitecore API is a single-item-oriented API. It does no optimizations, aside from caching, when you want more than one item. I resolved to try out SQL and see how it went. It went. See for yourself:

perf.png

(benchmarks on i5-3570k@4GHz, SSDs, freshly restarted IIS + SQL, average of 3 runs each)

whoa.jpg

So what is Dilithium?

There are two pieces of the Dilithium project: SQL and Serialized. Both operate in a similar fashion: before a sync operation begins, all items that will be involved in that sync are read in a single big batch either from SQL or disk. The actual sync then occurs against the pre-batched items. In case you’re wondering, the graph above includes the batching times for Dilithium - no cheating :)

The SQL version takes advantage of the Descendants table to grab all predicate-included items for every configuration in a single huge SQL query (this does mean that FastQueryDescendantsDisabled must be turned off). These items are then indexed and prepared for fast access. In this way, the entire data contents of the stock core database can be read in about 1500ms, compared to the Sitecore API’s 8 seconds.

The serialized version skips all the checks that usually are needed to traverse a Serialization File System tree - expensive operations that guarantee that the children you’re getting all belong to the same item ID, for example. It’s essentially reading all files sequentially and caching them for sync just like the SQL version.

When both DiSerialized and DiSQL are used together, they enable parallel loading of both sources at once. This reduces the load time quite significantly, as you can see on the shortest bar on the chart above.

Questions?

Direct SQL, are you nuts?

Yes. Yes I am. Directly querying the Sitecore database is a terrible idea in almost every situation…except this. The schema for content has also been very stable for several major revisions of the product. And Stephen Pope said it was ok :)

Is it any faster adding or modifying items?

No. I’m not crazy enough to also do direct SQL writes, so all modifications go through normal Sitecore item saving APIs and perform pretty much the same as Unicorn 3. This also means that they’re mature, well-tested code - Unicorn 4 under the hood is not much different from v3, except for Dilithium.

Can I turn it off?

Heck yes. You can enable or disable Dilithium on a per-configuration basis. It also works just fine if you sync Dilithium and standard configurations together.

Serialized items and formats are identical between Dilithium and normal, so you can also enable and disable at will with no migration process.

Will this eat up all my RAM?

If you’re syncing gigabytes of media items…yes, definitely. For more normal developer and light content items, nope. DiSQL does not read blob data, so if you become RAM constrained turn off DiSerialized first.

Note that the precaches are disposed of immediately after the sync completes, so memory spikes are temporary.

Let’s address the 🐘 in the room, how is Unicorn 4 60% faster than 3 without Dilithium?

Actually pretty simple: eliminating a cache bombing that was done between syncing configurations that was needed for Transparent Sync. Well, it wasn’t needed if you’re syncing a non-Transparent configuration. Doing that meant that sites like Habitat with a ton of small non-Transparent configurations perform a lot faster.

Any other performance tricks?

In my daily development I use a mix of Transparent Sync for developer items (templates, renderings), and standard sync for content items. Since Transparent Sync is disk-based anyway there is little point to running a sync on Transparent Sync configurations when you’re just doing local development. As such, there’s now an option to skip Transparent Sync-enabled configurations when you sync all. This is an option in the PowerShell API in Unicorn 4, as well as an option in the control panel (same place as log level).

Depending on how many Transparent Sync configurations you have, your time savings can be significant from enabling this. You should not however do this when deploying transparent configurations to a shared server.

Can I have it?

Not yet, it still needs some additional testing and vetting before I call it alpha. If you’re interested in being a brave early tester, get ahold of me on Sitecore Slack :)

Is this the only new thing in Unicorn 4?

Nope. I’ve got some more tricks up my sleeve, so stay tuned for some more preview blogs coming soon™