Adding Search Metadata to Publishing Site Pages in SharePoint 2010

Scenario

You have a SharePoint publishing site with a number of pages that display dynamic content based on a query string. You followed a process similar to Crawling Publishing Sites in SharePoint 2010 to configure SharePoint search to index the dynamic page content. Now you’d like to enrich the items in the search index with additional metadata that can be used for property restriction queries or for adding custom refiners.

Solution

Add a dynamically generated META tag to the page. SharePoint will automatically create a crawled property of type Text in the Web category, using the name attribute of the META tag as the crawled property name. You can then map the crawled property to a new managed property, whose value will be populated from the content attribute of the META tag.

Example

I’ll use the web part and pages created in my previous blog post and will simply extend the web part to generate a META tag.

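using System.ComponentModel;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;

[ToolboxItemAttribute(false)]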
public class ProductInformation : WebPart
{
    protected override void CreateChildControls()
    {
        // get the model number from query string
        string modelNumber = Page.Request.QueryString["ModelNumber"];
        if (!string.IsNullOrEmpty(modelNumber))
        {
            // assign a product category based on the model number
            string productCategory = string.Empty;
            switch (modelNumber)
            {
                case "M300":
                case "M400":
                case "M500":
                case "X200":
                case "X250":
                    productCategory = "Digital Camera";
                    break;
                case "X300":
                case "X358":
                case "X400":
                case "X458":
                case "X500":
                    productCategory = "Digital SLR";
                    break;
            }

            // set the page title
            ContentPlaceHolder contentPlaceHolder = (ContentPlaceHolder)Page.Master.FindControl("PlaceHolderPageTitle");
            contentPlaceHolder.Controls.Clear();
            contentPlaceHolder.Controls.Add(new LiteralControl() { Text = string.Format("{0} {1}", modelNumber, productCategory) });

            // add the model number and product category to the page as an H2 heading
            Controls.Add(new LiteralControl() { Text = string.Format("<h2>{0} {1}</h2>", modelNumber, productCategory) });

            // generate a META tag
            Page.Header.Controls.Add(new HtmlMeta() { Name = "modelnumber", Content = modelNumber });
        }
    }
}

If we refresh one of the product information pages after deploying the code change above, we should be able to see the META tag in the page source.

<meta name="modelnumber" content="M300" />
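The same approach extends to more than one META tag. Since the web part already derives a product category, that value could be emitted too and later mapped to its own managed property for use as a refiner. The sketch below is only an illustration: the ProductMeta class, the GetMetaTags method, and the "productcategory" tag name are hypothetical and do not appear in the web part above. It keeps the tag-building logic free of ASP.NET types so it can be exercised outside SharePoint.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original web part.
public static class ProductMeta
{
    // Same model-to-category mapping as the switch statement in the web part.
    public static string GetCategory(string modelNumber)
    {
        switch (modelNumber)
        {
            case "M300": case "M400": case "M500": case "X200": case "X250":
                return "Digital Camera";
            case "X300": case "X358": case "X400": case "X458": case "X500":
                return "Digital SLR";
            default:
                return string.Empty;
        }
    }

    // Returns the name/content pairs to emit as META tags for a model number.
    // "productcategory" is an assumed tag name chosen for this example.
    public static List<KeyValuePair<string, string>> GetMetaTags(string modelNumber)
    {
        var tags = new List<KeyValuePair<string, string>>();
        if (string.IsNullOrEmpty(modelNumber))
        {
            return tags; // no query string value, so nothing to emit
        }

        tags.Add(new KeyValuePair<string, string>("modelnumber", modelNumber));

        string category = GetCategory(modelNumber);
        if (!string.IsNullOrEmpty(category))
        {
            // Skip the tag entirely when the model number has no known category.
            tags.Add(new KeyValuePair<string, string>("productcategory", category));
        }
        return tags;
    }
}
```

Inside CreateChildControls, each returned pair would be added the same way as the existing tag, with Page.Header.Controls.Add(new HtmlMeta() { Name = tag.Key, Content = tag.Value }). Each distinct name should then show up as its own crawled property after a full crawl.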

Now run a full crawl and then verify that the crawled property was created by going to Central Administration > Search Service Application > Metadata Properties > Crawled Properties (for SharePoint Search) or to Central Administration > Query SSA > FAST Search Administration > Crawled property categories > Web (for FAST Search).

Next, create a new managed property of type Text and add a mapping to the crawled property above. If using FAST Search, also check the Query property and Refiner property checkboxes.

Run another full crawl. Once it completes, the managed property is ready to be used for property restriction queries or as a refiner.

Let’s test it by running a property restriction query first:
[Image: PropertyRestrictionQuery]
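For reference, a property restriction in SharePoint keyword query syntax takes the form property:"value". The sketch below builds such query text. The SearchQueryText class is hypothetical, and so is the managed property name ModelNumber used in the usage note, since the name you chose when creating the managed property may differ.

```csharp
using System;

// Hypothetical helper for building property restriction query text.
public static class SearchQueryText
{
    // Builds keyword-syntax text of the form  property:"value".
    public static string PropertyRestriction(string managedProperty, string value)
    {
        if (string.IsNullOrEmpty(managedProperty))
        {
            throw new ArgumentException("A managed property name is required.", "managedProperty");
        }

        // Simplification for this sketch: embedded double quotes are dropped
        // rather than escaped, so the value always stays inside one quoted phrase.
        string cleaned = (value ?? string.Empty).Replace("\"", string.Empty).Trim();

        return string.Format("{0}:\"{1}\"", managedProperty, cleaned);
    }
}
```

Calling SearchQueryText.PropertyRestriction("ModelNumber", "M300") produces ModelNumber:"M300", which can be typed into the search box or assigned to a query object's query text.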

You can now also use the new managed property as a refiner.
[Image: CustomPropertyRefiner]

  • Rob

    I tried this solution and it works great but when I do a full crawl on my website the crawled property is never created. Is there something special I need to set up to get the injected meta data crawled? Thank you.

    • Vassili Altynikov

      Here’s a couple of things to check:
      * Open the page in a web browser, view HTML source and confirm that the meta tag is populated.
      * Check the crawl log in Central Administration to make sure that the page was crawled during the full crawl.

      • Rob

        Vassili, thanks for your response. I do see the meta tag when I view the page source and the page is being crawled, but when I look at the crawled content, I don’t see a new crawled property for the meta tag I injected. If I inject a meta tag with name=Test1 content=Content1, I should see a crawled property with a name something like ows_Test1, correct? Thank you.

        • Vassili Altynikov

          Hi Rob. In your example, you should see a crawled property named test1 (without the ows_ prefix) in the Web category. Are you using SharePoint Search or FAST Search?

          • Rob

            Vassili, I am using Search. The injected meta tags only exist when the page is loaded so I’m not sure I understand how the crawl would index the injected meta tags if they are not static. My page gets crawled but I don’t see any crawled property. Thanks.

  • Vassili Altynikov

    Rob, do you have the crawl rule configured to “Crawl SharePoint content as http pages” as described in my prior post Crawling Publishing Sites in SharePoint 2010? When using SharePoint Search, you should see the crawled properties under Central Administration -> Search Service Application -> Metadata Properties -> Categories -> Web.

    • Rob

      Vassili, I do have the crawl rule configured on my site but I still don’t see a crawled property for the meta tag I am adding dynamically. I guess I will have to research this a little bit more to understand what I am missing.

      • Vassili Altynikov

        Yes, that’s very strange. I’ve used this approach in a number of different environments and it always worked as expected. Please let me know what you find.

        • Rob

          Vassili, what I found is that if I take your web part above or the one I created for my project and comment out the piece of code that looks for the query string (Page.Request.QueryString), everything works fine. I see the Crawled Properties and I can create Managed Properties. As soon as I put the if statement back in and check for the query string, no Crawled Properties are created. This tells me that SharePoint never crawls the page with the query string, so the query string value is always null. Any thoughts on this? I have the custom crawl rule in place, just like the one you created in your other post.

          • Vassili Altynikov

            OK, I agree. Looks like the crawl rule doesn’t apply for some reason. My only guess would be that the crawl rule URL doesn’t match the start URL configured in the content source. Does your web application have more than one zone?