Heartbleed = overcomplexity + input validation failure

The Heartbleed vulnerability is because the OpenSSL code didn’t validate an input. It’s also because OpenSSL had unnecessary complexity.

OpenSSL has a heartbeat feature that allows clients to send up to 64 kilobytes of arbitrary data along with a field telling the server how much data was sent. The server then sends that same data back to confirm that the TLS/SSL connection is still alive. (Creating a new TLS/SSL connection can take significant effort.)

The problem is if the client specifies that it sent more data than it actually did, the server would send back the original data and some of its RAM. For example, suppose the client sent a 1K message but said it’s 64KB. In response, the server would send a 64KB message back, which was the original 1K message plus 63K of data from the server’s RAM, which could include sensitive, unencrypted data from other programs.

How this could have been prevented:

  1. Avoid pointless complexity: don’t require the client to also send the length of the arbitrary text. The server should have been able to detect the length of the text.
  2. Validate all input. The server failed to ensure that the client’s description of the text length matched its actual length. (The fact that the server could detect the message’s actual length further validates my view on #1.)

Keep it simple! In addition to driving up creation and maintenance costs, needless complexity is more opportunities for things to break.

Google is not linking to HTTPS versions of everyone’s sites

In the University Web Developer’s (UWEBD) listserv today, a conversation took off about how Google was linking to the HTTPS version of Florida Gulf Coast University’s web site. It was a problem because of FGCU’s broken HTTPS channel.

I was surprised at the misconceptions that came over a technically astute email group. Here’s my statement:

Two inaccurate things have been said about Google.

Inaccurate statement 1: Google is securing others’ sites. Dangerous misconception! Google cannot “secure” your site. If Google’s link to you uses HTTPS, that does not “secure” your site. It just means Google is linking to your site’s secure channel. “Securing” a site includes transport security (HTTPS channel) among many other things. Most importantly, YOU, the site owner, do the “securing”, not Google.

Inaccurate statement 2: Google is en masse sending users to HTTPS channels on web sites. Nope. For example, Southern Methodist University has had both HTTP and HTTPS channels for www.smu.edu for over a decade. Google links to the HTTP version.

Starting late last year, Google encrypts traffic between the user and its search site. If you visit http://google.com, Google redirects you to https://google.com. That has no bearing on whether Google’s search results link to HTTPS or HTTP channels. However, it may limit site owners’ view of search keywords (reference); that isn’t related to the inaccurate statement.

You can still get unsecured Google search using http://www.google.com/webhp?nord=1 (note the highlight), but only if you’re not signed in. A search on Florida Gulf Coast University on the unsecured version still links to the HTTPS channel.

There’s are many reasons why Google is linking to FGCU’s secure channel, but it’s almost certainly not because of Google’s own change.

Beware of Drupal for enterprise WCM

A colleague at large university asked my thoughts on using Drupal for enterprise web content management (WCM).

Drupal has its uses, but I only recommend it for point solutions or, if in large-scale use, for cookie-cutter things where all the Drupal instances are configured almost identically and share little content. I generally do not recommend it for enterprise-wide WCM.

I wrote:

Drupal is a great solution for certain purposes, but I hesitate at recommending it for an enterprise CMS. Even though some tools make it easier to manage [edit: like Aegir], Drupal “enterprise”, especially for a college campus, is still essentially flying many individual instances in parallel, so I don’t consider that a real enterprise setup. No other “enterprise” system that I am aware of is so loosely-coupled on the application layer of the service delivery stack.

You’ll find other institutions with a different perspective. I think Stanford is one. Stanford, I believe, is widely using Drupal. [edit: yes, it is: https://techcommons.stanford.edu/drupal]

The problem with their setups is to run a massive Drupal installation, your IT department’s staff commitment to web content management (WCM) will be more heavily expressed in sysadmin skillsets and FTEs than developer skillsets and FTEs. The problem there is cultural: as practitioners of stability, minimizing cost, avoiding change, etc., sysadmins are culturally much further away from web marketing than developers.

And that’s what I like about Sitecore: as a true enterprise WCM, it frees me from much of the sysadmin burden that I would bear by running gobs of parallel instances. It allows me to instead invest in developer skillets and FTEs, which in the long term helps our marketing mission.

Now, there’s also an economy of scale. At some point, you may have sufficient staff resources that it may in fact be cost-effective to go with virtually any free CMS as a base product and develop whatever you want on it.

Three general pointers:

  • Customized code that is not expressed as a formal Drupal module or theme will cause update headaches. Drupal core and the modules have frequent releases, and I recommend staying on top of them because they often contain important bug and security fixes. If you customize Drupal core or customize a module, you’ll have to re-customize it every time there’s a new release. However, if you can implement your custom code as a formal module, it is much more likely to be able to exist unchanged while other modules or Drupal core change. It’s likely, though, that you’ll need to revisit your modules on major new Drupal releases, like v7 to v8.
  • Software costs are a small part of TCO of a system. Just because the software is free does not mean you’re going to get a gigantic savings over a commercial product. That notwithstanding, there are FOSS [edit: free and open source software] tools that make a lot of sense, like, say, Firefox, Chrome, WordPress, and even Drupal for certain situations. But sometimes, FOSS tools may require more FTEs or, like I said above, FTE types that may not be the best cultural fit for marketing.
  • Generally, academic-targeted CMSes are not that good. They seem to have their rabid supporters, but my general experience is that these supporters are suffering from confirmation bias, and that these academic focused CMSes cannot compete with the big players.

Show all Sitecore Active Directory users

I manage a Sitecore installation that’s integrated with an enterprise Active Directory.

We have over 11,000 accounts in our Active Directory. I needed a list of the Sitecore users, who are only a small percentage of the 11,000.

We have nothing in Active Directory that sets them apart, like group membership.

We architected our solution so that users are never assigned directly to items; users are members of Sitecore roles, and we assign Sitecore roles to items. All I have to do is rifle through all my Sitecore roles.

So how do I find my users? It took a little C#. Here’s the core code:

var roles = Sitecore.Security.Domains.Domain.GetDomain("sitecore").GetRoles();
foreach (var role in roles)
{
    foreach(var roleMember in Sitecore.Security.Accounts.RolesInRolesManager.GetRoleMembers(role, false))
    {
        if (roleMember.AccountType == AccountType.User)
        {
            var userObject = Sitecore.Security.Accounts.User.FromName(roleMember.Name, false);
 
            // only adding SMU domain users
            if (userObject.Domain.Name == "myActiveDirectoryDomain")
                AddUserToList(userObject);
        }
    }
}

This gets all Sitecore domain groups and extracts all users who are a member of my corporate domain. Of course, you’ll replace myActiveDirectoryDomain with your own domain name.

I created a separate AddUserToList method to handle adding these items to a Dictionary:

private void AddUserToList(User user)
{
    if (!_users.ContainsKey(user.Name))
    {
        _users.Add(user.Name,user);
    }
}

After the core code runs, you’ll need to code your own stuff to spit out what’s in the dictionary.

Here’s what I used:

foreach(var user in _users)
{
    var row = new TableRow();
    OutputTable.Rows.Add(row);
 
    row.Cells.Add(new TableCell { Text = user.Value.Profile.UserName });
    row.Cells.Add(new TableCell { Text = user.Value.Profile.FullName });
    row.Cells.Add(new TableCell { Text = user.Value.Profile.Email });
 
    if (user.Value.Profile.FullName.Length == 0)
    {
        row.CssClass = "alert";
    }
 
    var rolesCell = new TableCell();
 
    foreach (var role in RolesInRolesManager.GetRolesForUser(user.Value, false))
    {
        if (role.Domain.Name == "sitecore")
        {
            rolesCell.Text += "
 " + role.Name;
        }
    }
 
    rolesCell.Text = rolesCell.Text.Substring(7);
    row.Cells.Add(rolesCell);
}

Note that I already had a Table named OutputTable on my ASPX page.

Tadaa! The result is a list of all my domain members who are Sitecore users.

Upgrading hardened Sitecore content delivery environments

How do you upgrade hardened content delivery (CD) environments? Ours are so hardened that you can’t even get to /sitecore/admin/UpdateInstallationWizard.aspx.

Here’s how. It can be tedious, but these make it easier:

  1. CDs are mostly stripped-down instances of full Sitecore environments.
  2. Any needed database changes are handled when upgrading the content mastering (authoring) environment.

You only need to do 3 things on CDs.

But first, two notes:

  1. If you upgrade the CDs before you’ve upgraded the content mastering (CM) environment, you may have unstable CDs until the CM upgrade is done.
  2. Sitecore’s .update files are really Zip files. Just open it with your favorite Zip program, like 7-Zip, and it works like any other Zip file.

For each update file, do the following three steps. You must perform them in the release order of the update files, starting with the oldest release.

1. Add new or changed files

Extract everything in the addedfiles and changedfiles directories of the update file. You’ll extract them over the web root. Tell your Zip program to overwrite existing files.

2. Delete files no longer needed

This is the hardest part. Inspect the update package’s deletedfiles and deletedfolders directories of the update file. Every file (not folder) under each corresponds to a file or folder under your web root that needs to be trashed.

Note the wording: “every file“. For example, in Sitecore 6.5.0 rev.110602_fromv640rev101012_2.update, there is a file named AuthoringFeedback under deletedfolders\sitecore\shell\Applications\Analytics. That means you would delete the directory at sitecore/shell/Applications/Analytics/AuthoringFeedback under the web root.

You may have to dig deeply and thoroughly to find all files and directories.

3. Edit .config files.

Do all .config file changes that correspond to the update package you just handled. A list of .config file changes are at http://sdn.sitecore.net/Products/Sitecore%20V5/Sitecore%20CMS%206/ReleaseNotes/webConfig.aspx.

Wrapping up

If you’re going through multiple upgrades, it’s tempting to do all them at once–do all the file additions at once, then all the file deletions, and then do all the .config changes. This might work as long as you work through the update files in their release order, starting with the oldest release, and if Sitecore didn’t delete something and add it back or vice versa.

For example, suppose you were doing four updates at once. In update #2, a file named x.png was deleted, but then it was added back in update #4. If you do all your file additions first, then do all deletions 2nd, your final state will have no x.png.

As long as you’ve been careful and did the CM environment upgrade first, the CDs should “just work” when done.