When it comes to searching across the web, we all know that Google is king, but does this still hold true across your own internal network?
Over the past 12 months we have wrestled with this question, particularly in an environment with multiple search mechanisms, manually maintained indexes, and masses of sites that were created when metadata was primarily used to categorise instead of search.
In a series of posts, I’m going to go through our experiences in improving search across our internal network – I’m not suggesting we have found the magic search bullet, or that we’re anywhere near finished tweaking and tinkering, but I do know we’re a hell of a lot closer than where we were at this time last year.
The problem
In our travels across campus, we kept hearing “I can’t find what I’m looking for!” – not surprising, given that we had:
- 500+ individual sites, ranging in age (earliest was 1997), metadata (none to Dublin Core to ‘something’) and ownership
- Inconsistent use of key SEO elements, such as title, headings, tags and meta descriptions across the majority of our sites
- Multiple sources of content and internal search mechanisms, each with their own set of search results
- Manually maintained indexes, all categorised and sub-related, together with an in-house redirect mechanism
- An internal audience of staff and students with heavily cyclical search requests – a search for ‘physics’ is more likely to be for textbooks at the beginning of semester, and for past exam papers at the end
Given Google’s dominance in search, we quickly went down the path of a Google Search Appliance, or ‘Mini’, which is a self-contained, rack-mounted system that gives you God-like powers over the Google algorithm. We were bringing a little bit of Google in to magically transform our disparate set of sites into a cohesive set of search results.
Once plugged in, the Mini worked really well – for pages that were properly formatted for organic search.
Pages that were missing or incorrectly using titles, headings and metadata didn’t fare so well, and we found the search results were not the most relevant, as the Mini couldn’t make much sense of most of the content it crawled. We also found that there was no clear way to incorporate the feeds from other systems, with the “how do I…” answers primarily provided by a community of Search Appliance users and resellers, and not Google themselves.
Given the wide ownership of the sites we were working with, updating each with appropriate SEO-friendly content was unrealistic. What we needed was a way to:
- compensate for the lack of SEO content,
- incorporate multiple sources/formats of content,
- allow for cyclical requests to ensure the most relevant results appear, and
- combine all the different sources of search results into a single set of user-centric search results.
Enter Adobe Search&Promote
If you’re a regular visitor to this blog, it will come as no surprise that Tim is a power user of Omniture products, steadily working his way around the product wheel. That is how we became aware of the Search&Promote product (then called SiteSearch), which promised to solve our key internal search issues.
Search&Promote combines a search algorithm that organically crawls your sites with ranking rules based on a wide range of configurable data. Once you’ve defined your rules, you can adjust the overall balance between your ranking rules and natural search relevance.
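To make the idea of a ‘balance’ a bit more concrete, here is a minimal Python sketch of blending a rule-based score with natural relevance. This is not how Search&Promote is implemented or configured; the `Result` class, the 0 to 1 score ranges, the `rule_weight` dial and the example URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    relevance: float   # natural (text-match) relevance, assumed to lie in 0..1
    rule_score: float  # combined score from ranking rules, assumed to lie in 0..1


def blended_score(result: Result, rule_weight: float) -> float:
    """Weighted average of the rule score and the natural relevance."""
    if not 0.0 <= rule_weight <= 1.0:
        raise ValueError("rule_weight must be between 0 and 1")
    return rule_weight * result.rule_score + (1.0 - rule_weight) * result.relevance


def rank(results: list[Result], rule_weight: float) -> list[Result]:
    """Return results ordered from highest to lowest blended score."""
    return sorted(results, key=lambda r: blended_score(r, rule_weight), reverse=True)


# Hypothetical results for a search on 'physics'.
results = [
    Result("/library/physics-guide", relevance=0.9, rule_score=0.2),
    Result("/bookshop/physics-textbooks", relevance=0.6, rule_score=0.9),
]

# Turning the dial from 0 (relevance only) to 1 (rules only).
for weight in (0.0, 0.5, 1.0):
    print(weight, [r.url for r in rank(results, weight)])
```

With the dial at 0 the library guide wins on text relevance alone; as the weight moves towards the rules, the bookshop page overtakes it.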
Where there is a lack of metadata, Search&Promote can be configured to dynamically inject metadata on crawl, based on a URL pattern. Additional custom metadata can also be injected to create facets (filters) that allow users to drill further down into predefined categories.
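As a rough illustration of URL-pattern-based injection, the Python sketch below fills in missing metadata from the first part of a page’s path and then counts pages per category, which is essentially what a facet shows. The rules, field names and `example.edu` URLs are hypothetical, and this is not Search&Promote’s own configuration syntax.

```python
import re
from collections import Counter

# Hypothetical rules: (URL pattern, metadata to add when the page lacks it).
INJECTION_RULES = [
    (re.compile(r"^https?://[^/]+/bookshop/"), {"category": "Bookshop", "audience": "Students"}),
    (re.compile(r"^https?://[^/]+/exams/"), {"category": "Exams", "audience": "Students"}),
    (re.compile(r"^https?://[^/]+/hr/"), {"category": "Human Resources", "audience": "Staff"}),
]


def inject_metadata(url: str, page_metadata: dict[str, str]) -> dict[str, str]:
    """Return the page metadata with gaps filled in from every matching rule.

    Values already present on the page are kept; rules only fill what is missing.
    """
    merged = dict(page_metadata)
    for pattern, defaults in INJECTION_RULES:
        if pattern.search(url):
            for key, value in defaults.items():
                merged.setdefault(key, value)
    return merged


def facet_counts(documents: list[dict[str, str]], field: str) -> Counter:
    """Count documents per value of a metadata field, as a facet would display."""
    return Counter(doc.get(field, "Uncategorised") for doc in documents)


crawled = [
    inject_metadata("http://www.example.edu/bookshop/textbooks.html", {}),
    inject_metadata("http://www.example.edu/exams/past-papers.html", {"category": "Past Papers"}),
    inject_metadata("http://www.example.edu/about/history.html", {}),
]
print(facet_counts(crawled, "category"))
```

Note that the second page keeps its own ‘Past Papers’ category: in this sketch, injected values never overwrite metadata a site owner has actually supplied.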
If your multiple sources of content can be transformed into XML feeds, then that content can be crawled, categorised, and integrated with the organic results by Search&Promote.
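For a sense of what ‘transformed into XML feeds’ can involve, here is a small Python sketch that turns records from some other system into a simple XML document using the standard library. The element names (`feed`, `item`, `url` and so on) and the sample record are invented for this example; the actual feed format a crawler expects would come from its documentation.

```python
import xml.etree.ElementTree as ET

# Fields copied from each source record into the feed (hypothetical names).
FEED_FIELDS = ("url", "title", "description", "category")


def build_feed(records: list[dict[str, str]]) -> str:
    """Serialise records into a simple XML feed, one <item> per record."""
    root = ET.Element("feed")
    for record in records:
        item = ET.SubElement(root, "item")
        for field in FEED_FIELDS:
            # Missing fields become empty elements so every item has the same shape.
            ET.SubElement(item, field).text = record.get(field, "")
    return ET.tostring(root, encoding="unicode")


# A made-up record, e.g. exported from a past exam papers database.
records = [
    {
        "url": "http://www.example.edu/exams/physics-2010.pdf",
        "title": "Physics 101 & 102 past exam papers",
        "description": "Past exam papers for first-year physics.",
        "category": "Exams",
    },
]

feed_xml = build_feed(records)
print(feed_xml)
```

Building the feed through an XML library rather than string concatenation means characters such as `&` in titles are escaped for you.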
Yes, there are other internal search products on the market that will do the above; however, there is one thing that Search&Promote has over its competitors – the ability to tightly integrate with SiteCatalyst and Test & Target.
We’ve known for some time that internal search terms follow highly cyclical patterns as our student (and staff) needs change over the semester. We’ve helped them find what they’re looking for by using real-time SiteCatalyst data in search-as-you-type and tag cloud mechanisms. With Search&Promote, however, we now have the opportunity to take internal search to the next level.
In the report below (7 day moving average) you can see two popular search terms across three semesters, peaking at different times during the semester:
Notice how ‘bookshop’ peaks at the beginning of semester, then dies down, only to peak again at the beginning of the following semester. No surprises here, but it does coincide with a significant increase in page views across the Bookshop website.
Then look at the results for ‘timetable’ – there’s a peak at both the beginning and end of semester. The difference here is that people are actually looking for two different pieces of content – their semester timetable at the beginning, and their exam timetable at the end – using the same keyword. Again, the rise in searches coincides with increased page views across each piece of content.
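If you want to smooth your own daily search counts in the same way, a 7 day moving average is simple to compute. The Python sketch below uses a trailing window, and the counts are made up purely to show the arithmetic; it makes no claim about how SiteCatalyst calculates its reports.

```python
from collections import deque


def moving_average(daily_counts: list[int], window: int = 7) -> list[float]:
    """Trailing moving average of daily counts.

    For the first window-1 days, the average covers only the days seen so far.
    """
    averages = []
    recent = deque(maxlen=window)  # automatically drops the oldest day
    for count in daily_counts:
        recent.append(count)
        averages.append(sum(recent) / len(recent))
    return averages


# Made-up daily searches for one term over eight days.
counts = [10, 20, 30, 40, 50, 60, 70, 80]
smoothed = moving_average(counts)
print(smoothed[6], smoothed[7])
```

Smoothing over seven days also evens out the weekday and weekend difference, which makes semester-long peaks much easier to spot.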
So, in theory, by looking at the last week’s worth of traffic across our group of sites, we should be able to determine what content students are looking for, then re-rank the search results accordingly. For example, the term ‘timetable’ at the beginning of semester will push results related to the semester timetable to the top, and at the end of the semester push results related to the exam timetable to the top.
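Here is a minimal Python sketch of that theory, assuming we already have a relevance score per result and last week’s page views per URL. The function name, the `boost` parameter, the URLs and the numbers are all hypothetical; it illustrates the re-ranking idea only, not the SiteCatalyst and Search&Promote integration itself.

```python
def rerank(
    results: list[tuple[str, float]],
    views_last_week: dict[str, int],
    boost: float = 1.0,
) -> list[str]:
    """Order URLs by relevance, boosted by each URL's share of recent page views.

    results holds (url, relevance) pairs; views_last_week maps url -> page views.
    """
    total_views = sum(views_last_week.get(url, 0) for url, _ in results)

    def score(item: tuple[str, float]) -> float:
        url, relevance = item
        share = views_last_week.get(url, 0) / total_views if total_views else 0.0
        return relevance * (1.0 + boost * share)

    return [url for url, _ in sorted(results, key=score, reverse=True)]


# Two equally relevant results for the keyword 'timetable'.
results = [("/timetables/semester", 0.8), ("/timetables/exams", 0.8)]

# Made-up page views for the previous seven days.
start_of_semester = {"/timetables/semester": 9000, "/timetables/exams": 500}
end_of_semester = {"/timetables/semester": 400, "/timetables/exams": 8000}

print(rerank(results, start_of_semester))
print(rerank(results, end_of_semester))
```

The same keyword and the same two pages produce a different top result depending on which page the audience has actually been visiting over the last week.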
Exciting stuff!