Recently a client asked me to modify
their Blog page so that it lists only the first paragraph and the
first image from each post, instead of the full post shown by the
default Blog Post List.
The site (http://deancambray.com.au)
was built with Umbraco and utilises the Blog4Umbraco package along
with our Extensions package. Our solution was to use the
HtmlAgilityPack library, which is already included in the Umbraco
distribution, and write our own Razor script to list the blog
posts.
While the entire script also incorporates other features such as
a custom numeric pager, I wanted to focus this time on just
extracting certain elements out of each post and displaying them in
a customised format.
The helper: RenderSummary
First things first: make sure you have a reference to the
HtmlAgilityPack library near the top of the script:
@using HtmlAgilityPack;
Our helper looks like this:
@helper RenderSummary(dynamic node) {
    // Parse the post body so it can be queried with XPath
    var doc = new HtmlDocument();
    doc.LoadHtml(node.BodyText.ToString());
    // First image in the post, if there is one
    var imgNode = doc.DocumentNode.SelectSingleNode("//img[@src]");
    if (imgNode != null) {
        var url = imgNode.Attributes["src"].Value;
        string alt = string.Empty;
        string title = string.Empty;
        if (imgNode.Attributes["alt"] != null) { alt = imgNode.Attributes["alt"].Value; }
        if (imgNode.Attributes["title"] != null) { title = imgNode.Attributes["title"].Value; }
        <a href="@node.Url" title="Permalink to @node.Name"><img src="@url" alt="@alt" title="@title" /></a>
    }
    // First paragraph that has visible content
    var para = doc.DocumentNode.SelectNodes("//p");
    if (para != null) {
        foreach (var p in para) {
            // InnerText leaves entities encoded, so strip &nbsp; before testing for emptiness
            if (string.IsNullOrWhiteSpace(p.InnerText.Replace("&nbsp;", ""))) { continue; }
            <p>@Html.Raw(p.InnerText)</p>
            break;
        }
    }
}
Our script uses a helper to render the
summary of each post that was found. The helper instantiates a new
HtmlAgilityPack.HtmlDocument for each article and
loads the article content using LoadHtml.
Once that's done, we can use standard XPath queries to select
the content that we want. In this case, we want to find the
first image contained in the article (if there is one) and the
first non-empty paragraph.
We can check whether an image or paragraph
exists by testing the return value of the SelectSingleNode
or SelectNodes methods, both of which return null when nothing
matches. This makes it very easy to conditionally display the
image, or a placeholder if desired.
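As a minimal sketch of that idea, the helper below renders the first
image when one is found and falls back to a placeholder otherwise.
The helper name RenderFirstImage and the path /images/placeholder.png
are hypothetical and not part of the original script.

```csharp
@using HtmlAgilityPack;
@helper RenderFirstImage(dynamic node) {
    var doc = new HtmlDocument();
    doc.LoadHtml(node.BodyText.ToString());
    // SelectSingleNode returns null when the post body contains no image with a src
    var imgNode = doc.DocumentNode.SelectSingleNode("//img[@src]");
    // Hypothetical placeholder path; point it at an image that exists on your site
    var url = imgNode != null ? imgNode.Attributes["src"].Value : "/images/placeholder.png";
    <a href="@node.Url" title="Permalink to @node.Name"><img src="@url" alt="" /></a>
}
```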
Once we have our image, it's a trivial
matter to extract the source URL and other attributes using the
Attributes collection on the returned HtmlNode and
build our custom <img> tag.
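HtmlAgilityPack also offers GetAttributeValue(name, default) on
HtmlNode, which returns the supplied default when the attribute is
missing. Here is a sketch of the same image markup using it, which
avoids the explicit null checks. RenderImageTag is a hypothetical
helper name, not something from the original script.

```csharp
@using HtmlAgilityPack;
@helper RenderImageTag(HtmlNode imgNode) {
    // GetAttributeValue falls back to the second argument when the attribute is absent
    var url = imgNode.GetAttributeValue("src", string.Empty);
    var alt = imgNode.GetAttributeValue("alt", string.Empty);
    var title = imgNode.GetAttributeValue("title", string.Empty);
    <img src="@url" alt="@alt" title="@title" />
}
```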
Because it is very easy to insert empty
paragraphs through TinyMCE, we want to find the
first paragraph that actually has visible content in it. Otherwise
our summary will look very empty indeed. Once we have found
the right paragraph, we can use the InnerText
property to extract just the text and ignore things
like embedded images, lists and line breaks. This results in
a cleaner display and guarantees that the image (which may be found
within the first paragraph) is not shown twice.
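The "first visible paragraph" rule can also be pulled out into plain
C#, which makes it easier to try outside of a Razor script. This is
only a sketch: the SummaryText class and FirstVisibleParagraph method
are hypothetical names, and it assumes the bundled HtmlAgilityPack
exposes HtmlEntity.DeEntitize. Decoding entities first means a
paragraph that holds nothing but non-breaking spaces is treated as
empty.

```csharp
using HtmlAgilityPack;

public static class SummaryText
{
    // Returns the text of the first paragraph with visible content, or null if there is none.
    public static string FirstVisibleParagraph(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html ?? string.Empty);
        var paragraphs = doc.DocumentNode.SelectNodes("//p");
        // SelectNodes returns null when nothing matches
        if (paragraphs == null) { return null; }
        foreach (var p in paragraphs)
        {
            // Decode entities such as non-breaking spaces into real characters,
            // so that IsNullOrWhiteSpace can recognise them as whitespace
            var text = HtmlEntity.DeEntitize(p.InnerText);
            if (!string.IsNullOrWhiteSpace(text)) { return text.Trim(); }
        }
        return null;
    }
}
```

Note that the returned text is already decoded, so in a Razor script
it should be written out with plain @text (which encodes it again)
rather than through Html.Raw.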
Note that you could use the
InnerHtml property instead if you wanted to
keep the inline formatting and any other markup embedded in the
paragraph.
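If you go the InnerHtml route, the duplicate image problem mentioned
above can come back whenever the first image sits inside the first
paragraph. One possible way around it, sketched here on the
assumption that you are happy to drop every image from the summary
paragraph, is to remove the image nodes before rendering.
RenderRichParagraph is a hypothetical helper name.

```csharp
@using System.Linq;
@using HtmlAgilityPack;
@helper RenderRichParagraph(HtmlNode p) {
    var images = p.SelectNodes(".//img");
    if (images != null) {
        // Copy the matches to a list first so that removing nodes cannot disturb the enumeration
        foreach (var img in images.ToList()) { img.Remove(); }
    }
    <p>@Html.Raw(p.InnerHtml)</p>
}
```

Inside RenderSummary, the line that writes out p.InnerText would
then become a call to @RenderRichParagraph(p).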
Tying it together
Our BlogListPosts script is intended to
replace the XSLT counterpart provided with Blog4Umbraco, so I've
taken the basic structure of that script and tidied it up somewhat
for clarity. I've removed the part that does the filtering
and paging of the list items based on category and/or archive
folder. I wanted to focus on just the summary rendering, so
here's a condensed version of the body of the script featuring the
use of the RenderSummary helper defined above:
@{
    var list = Current.DescendantsOrSelf("BlogPost").Items.OrderByDescending(n => n.GetPropertyValue("PostDate"));
    foreach (dynamic post in list)
    {
        <div class="post">
            <h2 class="entry-title"><a href="@post.Url" title="Permalink to @post.Name">@post.Name</a></h2>
            <div class="entry-date">
                <small class="published">@post.PostDate.ToString("dddd, MMM dd, yyyy")</small>
            </div>
            <div class="entry-content summary">
                @RenderSummary(post)
            </div>
            <div class="footer">
                <small class="more"><a href="@post.Url" title="Permalink to @post.Name">Read More...</a></small>
            </div>
        </div>
    }
}
Find this post helpful? Why don't
you drop us a line in the comments below...