Content Summaries with HtmlAgilityPack

Recently I had a client ask me to modify their Blog page to list only the first paragraph and the first image found in each of their posts instead of the full post on the default Blog Post List.

The site (http://deancambray.com.au) was built with Umbraco and utilises the Blog4Umbraco package along with our Extensions package.  The solution that we came up with was to use the already available HtmlAgilityPack that's included in the Umbraco distribution and write our own Razor Script to list the Blog Posts.

While the entire script also incorporates other features such as a custom numeric pager, I wanted to focus this time on just extracting certain elements out of each post and displaying them in a customised format.

The helper: RenderSummary

First things first: make sure you have a reference to the HtmlAgilityPack library near the top of the script:

@using HtmlAgilityPack;

Our helper looks like this:

@helper RenderSummary(dynamic node) {
    var doc = new HtmlDocument();
    doc.LoadHtml(node.BodyText.ToString());
    var imgNode = doc.DocumentNode.SelectSingleNode("//img[@src]"); 
    if (imgNode != null) {
        var url = imgNode.Attributes["src"].Value;
        string alt = string.Empty;
        string title = string.Empty;
        if (imgNode.Attributes["alt"] != null) { alt = imgNode.Attributes["alt"].Value; }
        if (imgNode.Attributes["title"] != null) { title = imgNode.Attributes["title"].Value; }
<a href="@node.Url" title="Permalink to @node.Name"><img src="@url" alt="@alt" title="@title" /></a>
    }
    var para = doc.DocumentNode.SelectNodes("//p");
    if (para != null) {
        foreach (var p in para) {
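            // InnerText leaves entities such as &nbsp; undecoded, so strip them before testing for emptiness.
            if (string.IsNullOrWhiteSpace(p.InnerText.Replace("&nbsp;", ""))) { continue; }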
            
            <p>@Html.Raw(p.InnerText)</p>
            break;                                         
        }
    }
}

Our script uses a helper to render the summary of each post that was found.  The helper instantiates a new HtmlAgilityPack.HtmlDocument for each article and loads the article content using LoadHtml.  Once that's done, we can use standard XPath queries to select the content that we want.  In this case, we want the first image that may be contained in the article and the first non-empty paragraph.

SelectSingleNode and SelectNodes both return null when nothing matches, so checking the return value makes it very easy to conditionally display the image, or a placeholder if desired.

Once we have our image, it's a trivial matter to extract the source URL and other attributes using the Attributes collection on the returned HtmlNode and build our custom <img> tag.

Because it is very easy to insert paragraphs through TinyMCE that are empty, we want to find the first paragraph that actually has visible content in it. Otherwise our summary will look very empty indeed.  Once we have found the right paragraph, we can use the InnerText property to extract just the textual elements and ignore things like embedded images, lists and line breaks.  This results in a cleaner display and guarantees that the image (which may be found within the first paragraph) is not shown twice.

Note that you could also use the InnerHtml property instead if you wanted to include the extra format elements and other bits and pieces.

Tying it together

Our BlogListPosts script is intended to replace the XSLT counterpart provided with Blog4Umbraco, so I've taken the basic structure of that script and tidied it up somewhat for clarity.  I've removed the part of it that does the filtering and paging of the list items based on category and/or archive folder.  I wanted to focus on just the Summary rendering, so here's a condensed version of the body of the script featuring the use of the RenderSummary helper defined above:

@{
    var list = Current.DescendantsOrSelf("BlogPost").Items.OrderByDescending(n => n.GetPropertyValue("PostDate"));

    foreach (dynamic post in list)
    {
        <div class="post">
            <h2 class="entry-title"><a href="@post.Url" title="Permalink to @post.Name">@post.Name</a></h2>

            <div class="entry-date">
                <small class="published">@post.PostDate.ToString("dddd, MMM dd, yyyy")</small>
            </div>

            <div class="entry-content summary">
                @RenderSummary(post)
            </div>
            <div class="footer">
                <small class="more"><a href="@post.Url" title="Permalink to @post.Name">Read More...</a></small>
            </div>
        </div>
    }
    
}

Find this post helpful?  Why don't you drop us a line in the comments below...

Reply: How not to use Linq

Recently (well, today) the CodeProject Insider featured a blog entry - How not to use Linq.  It was a short but interesting article that prompted me to write a comment, but alas no comments can be written on the site.  So I thought I'd better answer it here instead.  Go have a read if you haven't already done so...

While I appreciate the intent and points of the article, I think it's also important to note the benefits of .Select() and .FirstOrDefault() as they have a place when used well.

With the former, it allows one to transform the results into another kind of object or even an anonymous object if required.  With the latter, there is an overload that is very powerful in that you can supply an object to use as the default if there are no results from the query.  Note that this overload is only available from .NET 6 onwards; on earlier versions, .DefaultIfEmpty(defaultValue).First() gives the same result.

So for example:

var items = products
    .Where(p => p.Id == 42)
    // SelectListItem.Value is a string, so the numeric Id is converted explicitly.
    .Select(p => new SelectListItem { Value = p.Id.ToString(), Text = p.Name });

will produce an IEnumerable<SelectListItem> that can be used for example to populate a Select List in ASP.Net MVC templates.

var product = products
    .Where(p => p.Id == 42)
    .FirstOrDefault(new Product());

could be used to return a default product instance instead of a null.

var id = products
    .Where(p => p.Available)
    .Select(p => p.Id)
    .FirstOrDefault(-1);

would either return -1 if no available products are found, or the first product Id.

Very powerful stuff, this Linq with Extension Methods...

Any comments? Feel free to post your views on this topic...

Retrieving DropDownList Values in Razor or C#

Recently I needed to update the value of an Umbraco DropDownList property in code based on a value instead of the key that's automatically assigned by the Prevalue Editor.  I came across this but it discusses retrieving values for XSLT specifically.  In my scenario I needed to find a specific key.

The simplest way to do this is with LINQ to XML.  In the example below I'm using the property's DataTypeDefinition to retrieve the relevant prevalue collection instead of hard-coding the Id of the DropDownList.  This means I have full flexibility in case something changes in the future:

if (p.getProperty("status") != null)
{
    var status = p.getProperty("status");

    status.Value = XElement.Parse(library.GetPreValues(status.PropertyType.DataTypeDefinition.Id).Current.OuterXml)
                           .Descendants("preValue").FirstOrDefault(pv => pv.Value == "On Offer").Attribute("id").Value;

    p.Save();
}

Note I could also have written it like this using Linq query syntax:

status.Value = (from pv in XElement.Parse(library.GetPreValues(status.PropertyType.DataTypeDefinition.Id).Current.OuterXml).Descendants("preValue")
                       where pv.Value == "On Offer"
                       select pv.Attribute("id").Value).FirstOrDefault();

That's all there is to it.

Windows 8 DP with Boot Camp 4.0

UPDATE 1st March 2012: Installing Boot Camp 4.0 has been verified to work with Windows 8 Consumer Preview that has just been released.

Today I downloaded Apple Boot Camp 4.0 using the Boot Camp Assistant in OS X Lion with the intention of installing it on my Windows 8 Developer Preview.  After burning it to CD and rebooting into Windows 8, I discovered that Boot Camp 4.0 only supports Windows 7, and the installer promptly refused to install under Windows 8.

After searching the 'net for a solution to no avail (the best advice on offer was to go into the Drivers directory and install each one individually), I tried something radical.  I turned on the Compatibility settings for the actual BootCamp64.MSI. Lo and behold, it worked! Boot Camp proceeded to install itself and the relevant drivers without a hitch.

Note: You will need a mouse with a dedicated Right Mouse Button so that you can bring up the properties on the setup file.

Consumer Preview Note: This procedure has been updated for the Windows 8 Consumer Preview.

So here's the process:

  1. Navigate to the folder containing the BootCamp MSI files
  2. Right-click on the Setup executable
  3. Open up the Properties dialog and go to the Compatibility tab.
  4. Turn Compatibility Mode on (select "Run this program in compatibility mode for:") and choose "Windows 7" in the drop down.
  5. Click OK, and go run the installer again.
  6. Restart your computer, and enjoy.

That's about it.

 

Introducing the Umbraco View Counter

Over the last couple of days we've been busy creating an Umbraco package that deals with Content View Counters - it enables the web master to track the number of times content has been viewed on the site.

The documentation and package have just been uploaded to the Umbraco Project Repository and can be downloaded from here.  This post deals with a few of the features of the package, which was built against Umbraco 4.7 and .NET 4.0.

Introduction

The Refactored Content Views package is essentially a content views (number of times viewed) counter.  The current functionality offered by this package includes:

  • Optional Data Type that allows for configuring view counters with various categories and the ability to instruct Macros etc. to "hide" the view Count yet still increment it.
  • Optional incrementing when displaying the view count (useful when you want to display the view count in a content listing, for example)
  • Example Razor Script and Macro.
  • Library methods to manipulate the counters and retrieve details as an XML fragment for use with XSLT.

Basic Usage.

To simply retrieve and/or increment the counter for a specific content item, call the following library method.  The category and increment parameters are optional, with their default values shown:

ViewCount.GetViewCount(nodeId, category: "<empty string>", increment: false);

There is no requirement to configure a DataType; supplying the node id of any valid Content-based node (Member, Document, Media, etc.) will create the Views record in the database if it doesn't exist.  However configuring and using a DataType will allow you to control the advanced features of the counter.

Out of the box

Out of the box you get a default DataType (View Count) and a sample Razor Macro that displays the current View Count of the node being displayed.  If you have set up the Document Type with the View Count DataType, the macro will check whether the View Count should be displayed or not.

Macro Parameters for Page Views:

  • Category (text) - optionally specifies the Category to record the Page Count against.
  • Increment (bool) - set to true to increment the Page Count when the macro is called.

Macro Script Contents:

@inherits umbraco.MacroEngines.DynamicNodeContext
@using umbraco.MacroEngines;
@using umbraco.NodeFactory;
@using Refactored.UmbracoViewCounter;

@if (!ViewCount.HideCounter(Model.Id, category: Parameter.Category)) {
  <span># Views:@ViewCount.GetViewCount(Model.Id, category: Parameter.Category, increment: Parameter.Increment == "1").ToString("N0")</span>
} else {
  ViewCount.Increment(Model.Id, category: Parameter.Category);
}

Setting up a Data Type

The Data Type has the following Parameters:

View Count DataType

  • Category - Specifying a different category for multiple DataTypes allows you to differentiate between multiple View Counts in a single content item.  You can then render the content in different views and have a different View Count for each rendering.
  • Hide View Count - Allows you to control (in conjunction with the API and Razor or XSLT macros, for example) whether to hide or show the view count at a Data Type level.
  • Enable View History - Turns on recording of View Count History data, including the time the view was incremented.  Reset command events are also recorded.  This data is stored in the refViewCountHistory table and persists even if the current view count is reset.  This is off by default.
  • Disable Counter Reset - Turning this on disables the Reset action on Content configured with a View Count DataType.

The Playground - interactive Canvas goodness

We've just created a Playground for testing out JavaScript and the Canvas tag with Raphaël and EaselJS... At the moment, we're still loading example code to use with the playground; however, there is also a default "blank" sandbox that you can use to test your own code.

The feature set will grow over time, including contributing your own examples; however for now feel free to have a play and leave a comment here with any suggestions for features or code samples you may like to share.

Enjoy!

High School Students and the Internet - not just consumers

Yesterday I met with the local High School to discuss the possibilities of getting involved by running programs for interested students in the areas of web and games development.  During the course of the meeting, I was asked to come up with a few paragraphs that the staff could use when promoting these programs, so I thought I'd put it up here (partly to help consolidate my thoughts into a hopefully cohesive introduction rather than just rambling on...).  So here goes.  Any constructive comments or suggestions are entirely welcome, so please don't be shy - jump right in to the conversation!

First a little background.  The school, Lara Secondary College in Victoria, Australia is really what you would call a regional school - Lara is situated outside of the nearest large city (Geelong) and is considered to be "The country between the cities" to quote one local resident.  While Lara is by no means a small country town and is rapidly growing, it still retains that small town community atmosphere.  The school itself has a Connections program for Year 9 Students running for a good part of the year one day a week, in which students are encouraged to take on projects that directly benefit the community in some way.

My original vision for setting up what might look like a club within the school was to teach students some of the more cutting edge web technologies, focussing on two areas:

  1. Web page layout and design utilising HTML5 and CSS3, and
  2. Online Games development utilising open source JavaScript frameworks and tools.

We would do this by coming up with a project that would benefit the community in some way - perhaps by redesigning the website of one of the local not-for-profit community organisations, or by designing a game with an educational focus to be used by the wider school community (and even by other schools).

So this is what I have come up with to introduce students to the idea and to generate some interest:

. o o o 0 0 O 0 0 o o o .

Online Computing these days is pervasive in all areas of our daily lives - whether it be reading emails or chatting on Facebook, or perhaps playing your favourite game of Pac Man, the humble Web Browser is fast becoming less of an application in its own right and taking a seat in the background while it acts as simply a host to highly interactive and expressive web-based applications.  These days you can play games and watch movies; connect with people from across the globe; or just write that business document - all from within one of the popular Web Browsers without having to install a thing.

Today we have tools freely available within our grasp to not only create visually compelling web sites, but also make those websites dynamically interactive with video, sound, and animation.  There are tools and frameworks upon which games are being built, and high quality graphics can be rendered right within the browser window.  In fact, you can author a complete website  from scratch without having to open anything more than your favourite browser.

JavaScript - the programming language of the web browser - can be used to program a website to pull information and present it from sources all over the web - want your Facebook status displayed on your website? no problem.  Perhaps you have uploaded a new YouTube video and want to embed it on your site. Easy.  Or maybe you want to be able to plot geographical data and connect the dots with Google Maps.  JavaScript can be used to do all this and more.

We are looking for a group of students with an interest in computing, and who have a passion to do more than just use the web - they want to build it.  Students will gain an understanding of the building blocks of a website, and we then go beyond that to examine what it takes to create a game from scratch that can be used to deliver value to the wider community.  We will also have professionals who are experts in their field come in and hold workshops in the areas of Graphic Design and Web Development.

. o o o 0 0 O 0 0 o o o .

That's it.  If anyone has anything to add, or any other comments, suggestions or thoughts, please, drop me a line...

Blog 4 Umbraco Extensions Documentation

Finally, some 8 months after the Blog4Umbraco Extensions library became available, I decided to post it to the package repository on our.umbraco.org and create some actual documentation for it - this is the result...

This document may also be downloaded as a PDF from here.

Introduction

The Refactored Blog4Umbraco Extensions came about because the current version of Blog4Umbraco (2.0.26 at the time of writing) had some issues when it came to creating multiple blogs within a single website.  In addition, under some circumstances creating a new blog entry would cause a "Yellow Screen of Death" (YSOD).

In order to address these shortcomings this package was created, and later extended with other functionality.  The current functionality offered by this package includes:

  • Allowing Comments to be Disabled at the Blog Level
  • Enable setting a Blog-wide Category and having Tags bound to that Category

An additional Datatype called Blog Tags and derived from the built in Tags Datatype is also provided which is the basis upon which the Blog-wide Category is built.

For a more detailed and technical description of the package, the reader is directed to the blog entries found at /blog:

The Future

Work is currently underway to release a version 3 of Blog4Umbraco (B4U) which will address the issues discussed here and add other much needed functionality including Trackbacks and Comment Notifications.

Post-Installation Steps

After installing the package, additional steps are required in order to activate the features.  These involve modifying the Blog-related document types as follows:

Globally Disabling Comments

In order to be able to globally disable comments, edit the Blog Document Type by adding a new property based on the True/False data type as follows:

Blog DisableComments Property 

If the Disable Comments checkbox is checked on a Blog page then the Close comments field will also become checked when it is saved.

Blog Categories

Updating the Blog Document Type

In order to facilitate Blog Categories, an additional property needs to be added to the Blog Document Type as follows:

BlogCategory 

Coupled with the change to the Blog Post Document type below, this will cause tags added to blog posts to use the category set in this property of the corresponding Blog.

Updating the Blog Post Document Type

Change the Tags property in the Blog Post Document so that it uses the "Blog Tags" type instead of the built-in "Tags" data type:

 blogPost Blog Tags

Enabling Time fields in the Blog Entry Post Date

In the original Blog4Umbraco package, there is no way to enable the Post Date to use time as well as date, which results in all posts being set as being posted at midnight. 

The updated Umlaut.Umb.Blog.dll file included in this package addresses this issue, but you still need to modify the Blog Post document type in order to take advantage of the change.  In order to do so, change the Post Date property type from "Date Picker" to "Date Picker with time":

blogPost PostDate Property 

Other Issues:

Blog for Umbraco generates the following error when attempting to create a new Blog:

Issue # 5612 - http://blog4umbraco.codeplex.com/workitem/5612

Operand type clash: int is incompatible with ntext
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.Data.SqlClient.SqlException: Operand type clash: int is incompatible with ntext

Workaround:

If you encounter this type of error, double-check the Author Picker and make sure that the datatype is set to int instead of ntext.

Conditionally disabling Ajax in MVC3 Forms

While working with MVC3 and Razor views recently, we came across the need to disable the Ajax behaviour in a form when the user pressed a certain submit button.  To give you some background, we were working on a Shopping Cart whereby the user had the following possible actions:

  • Edit an item,
  • Delete an item
  • Update the quantity of an item,
  • Submit the cart to Checkout
  • Clear the cart
  • Refresh the cart.

Now, for most cases, we wanted the cart to be refreshed without the user having to see a post back, so Ajax was the best way to handle this, of course.  Although the application is based on the Umbraco CMS, we developed the e-Commerce side of things in MVC3 from scratch and integrated it with Umbraco using the excellent MVCBridge add-on.  This allowed us to take advantage of all MVC has to offer.  However, the solution that follows is not dependent on Umbraco or MVCBridge at all.

Because this is a new project and has no legacy MVC code in it, we are able to take full advantage of the new Unobtrusive Ajax style for binding Ajax to the form.  This means that under the hood we are using jQuery's ajax engine only, and not the legacy Microsoft one.  Here's the basic form:

@using (Ajax.BeginForm(new AjaxOptions
{
    OnSuccess = "updateCart"
}))
{

    @Html.RenderFormToken();
<section id="shoppingCart">
    <h1>Items in your Shopping Cart</h1>
    <table cellpadding="0" cellspacing="0">
    @foreach (var item in Model.Items.Values)
    {
        // Render the items, including the Edit, Delete and Update submit buttons...
    }
    <tr>
        <td colspan="3" class="totalValue">Total Value:</td>
        <td class="totalValue">@Html.DisplayFor(model => model.TotalIncTax)
        @Html.HiddenFor(model => model.TotalItemCount)
        @Html.HiddenFor(model => model.TotalIncTax)</td>
        <td class="totalValue"></td>
    </tr>
    </table>
    <span class="submit"><input type="submit" value="Refresh Cart" name="refresh" id="refresh" />
    <input type="submit" value="Clear" name="reset" id="reset" />
    <input type="submit" value="Checkout" name="checkout" id="checkout" /></span>
</section>
}


Now, when a user presses any of the submit buttons, the form will be posted back to the server, and the new page content will be returned to the updateCart function so that we can update the user's view without having to refresh the page.

But we want the item Edit and the Checkout buttons to re-direct to a new page instead of submitting back to the shopping cart.  For that to happen, we need to do two things (let's focus on the Checkout button, which we want to re-direct to the Checkout page):

  1. In the ShoppingCartController we need to check which submit button was pressed by inspecting the form elements, and do a Response.Redirect() to the appropriate page if the user pressed the Checkout button, for example; and
  2. Disable the Ajax behaviour when the user presses the Checkout button.

The code to handle the second step is as follows:

   // These variables are defined here as they may be referenced in other code blocks.
   // The $().ready function is used to populate them.
    var cartSection = null;
    var eShopCartForm = null;

    $(document).ready(function () {
        cartSection = $("#shoppingCart");
        eShopCartForm = cartSection.closest("form");

        // Disable the ajax behaviour if the checkout button is pressed.  We want the form
        // to submit normally so that the page can be redirected. 
        var checkoutSubmit = cartSection.find("#checkout");

        // We supply our own handler for this button to remove the form's ajax submit handler.
        checkoutSubmit.live("click", function (evt) {
            // Setting this attribute to false means the ajax form submit handler won't be triggered...
            eShopCartForm.attr("data-ajax", "false");
        });
    });

If you care to dig deeper, then I recommend taking a look through the jquery.unobtrusive-ajax.js file that is bundled with the MVC3 projects.  Basically though we are changing the data-ajax attribute that is generated on the form element when the user clicks the checkout button so that the ajax submit handler doesn't trigger.
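To make the mechanism a little more concrete, here is a minimal sketch in plain JavaScript.  It is not the code from jquery.unobtrusive-ajax.js; it is a simplified model built on the assumption that the ajax submit handler only intercepts forms whose data-ajax attribute is "true".  The createForm, handleSubmit and onCheckoutClick names are hypothetical and exist only for this illustration:

```javascript
// Simplified stand-in for a form element (hypothetical, for illustration only).
function createForm(attrs) {
    const attributes = Object.assign({}, attrs);
    return {
        getAttribute: (name) => (name in attributes ? attributes[name] : null),
        setAttribute: (name, value) => { attributes[name] = String(value); }
    };
}

// Assumption: a submit is handled via ajax only while data-ajax is "true";
// anything else falls through to a normal browser post.
function handleSubmit(form) {
    return form.getAttribute("data-ajax") === "true" ? "ajax" : "normal";
}

// Hypothetical click handler for the Checkout button: opt the form out of ajax
// just before the submit event fires.
function onCheckoutClick(form) {
    form.setAttribute("data-ajax", "false");
}

const cartForm = createForm({ "data-ajax": "true" });
const beforeClick = handleSubmit(cartForm);   // submit from "Refresh Cart"
onCheckoutClick(cartForm);
const afterClick = handleSubmit(cartForm);    // submit from "Checkout"

console.log(beforeClick, afterClick);         // prints "ajax normal"
```

The click event on a submit button fires before the form's submit event, which is why flipping the attribute in the click handler is early enough to affect how that same submit is treated.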

There you have it.  Any questions, suggestions, remarks, please leave a comment...

Resetting IE9's Javascript Engine

Quick post this time - lately I've come across this obscure error in Javascript with Internet Explorer 9 whenever I tried to run a jQuery lightbox script:

appendChildOriginal(element);

jscript debugger
breaking on jscript runtime error - invalid calling object

After tearing my hair out and "researching" the problem on Google, I came across this obscure solution:

IE:  Tools->Options Advanced Settings "Reset"

(found here: "Microsoft J")

It seems that all that is needed to fix the problem is to reset the Internet Explorer settings.

A translation of the line above is as follows:

In Internet Explorer 9, Open the "Internet Options" either from the Tools menu (hit ALT to make the menu appear) or the "Cog" icon drop down.

Go to the Advanced Tab, and hit the "Reset..." button.  On the dialog that pops up, hit "Reset" again.

That's it.  Problem solved.  No need to reinstall IE 9.