Discovery Layers: potential and pitfalls

The Public Libraries Victoria Network ICT Special Interest Group, of which I am convener, ran a one-day seminar last week on discovery layers and public libraries. We invited four Victorian public library speakers to talk about their experiences, new as they are to discovery layers, and then speakers from the State Library of Victoria and Trove to give us an idea of the potential and a chance to dream of what might be…

Here are my notes from the day.

If you are unfamiliar with discovery layers, check out my short introductory PowerPoint, which will give you a framework from which to read the rest of this post.

Ken Harris, Port Phillip Library Service – Serials Solutions’ 360 Search and Civica’s Sorcer
Have not chosen full discovery layer software, but have introduced federated search and what they call a bibliographic discovery layer. Most library users expect more from a search than just keyword and browse. There is no “did you mean” on Sorcer.

About a year ago, they talked to various vendors, but most said “you couldn’t afford us” or were academic based. Looked at Encore (too expensive) and AquaBrowser.

Were about to sign up with a product when they realised there were major issues: they couldn’t moderate comments. Instead, found that federated search could meet their searching needs across platforms.

WebFeat was chosen for federated search: it had a better back end, could be administered by the library themselves and included a proxy solution. It was all set up quickly, and WebFeat were very good with remote support. Called it super search and went live.

All stats went up, but was it because all databases were being searched individually? Full article stats also went up. Didn’t get the stats add-on for WebFeat (didn’t realise), so no other statistics. No feedback from borrowers, unless it stopped working.

In recent months, they updated their LMS so they could implement Sorcer – their bibliographic discovery layer.
Sorcer is a module of Spydus, based on a subscription service. It leverages off the bibliographic data. It’s mostly live, but pre-indexes some of the facts, so it can drill down. It gives browsing capabilities that are not available in the base catalogue.

Includes word clouds, which are useful. Sorcer is as good as you want it to be. You set up the containers the way you want. You can save lists and set up pre-packaged Boolean searches. Limit of 52 items in a gallery display – a limit which they had to work out for themselves. Cover images have to come from external sources such as Syndetics (sourced and paid for by the library).

How is it going? Stats don’t compare to catalogue use, but should they? Civica doesn’t support Google Analytics, so they set it up themselves. Can get stats on titles once clicked on, but not on containers.

They are not replacing the catalogue with Sorcer. It is not W3C compliant, it doesn’t render in all browsers, etc., so it is an extra. Borrowers like it: the look, the auto complete feature, and other things.

Issues they have found: Staff interface is technical, they have to be able to edit HTML. Not able to use LibraryThing for Libraries, whereas the catalogue can. Friends functionality is limited and has no privacy restrictions. Some glitches with the container functionality. It does not search other resources, so it is not a federated search tool, at least not yet. To alleviate this, they have added their database and ebook records into the catalogue.

Civica has plans to implement federated search through Sorcer – first stop will be the ability to search other Sorcer libraries.

They have now migrated to 360 Search from WebFeat. Express migration took 6 months (sigh), but they now have an improved search function and it does some relevancy ranking. Most connectors are maintained by support – this cannot be done by library staff any longer, unlike WebFeat.

It’s live now at

You can get widget code to embed in your website and it uses IP authentication.

Unfortunately it lists database vendor names, rather than the database name.

Can stop a search partway through and it will give you the results it has already found.

Port Phillip Library Service is building a new website – they are aiming to provide a seamless gateway to the catalogue and federated search through widgets and HTML dressing on external sites. Sorcer can’t be changed in look and feel, so they can’t bring it into the new website yet.

May still be looking at an alternative discovery layer – there are many new products on the market which would be worth investigating.

WebFeat uses its own proxy, they commission their own end point. Kind of like a limited EZproxy. They can build connectors, more quickly if you pay for it.

Primo and AquaBrowser are not W3C compliant. Accessibility increases costs; there may be new products which are better in this area.

Hugh Rundle – Boroondara – Civica’s Sorcer
Haven’t launched it yet (next week or so).

The Dream: staff on board and excited; everything in the collection findable in new ways; replace booklists, fliers and pathfinders; and provide a new way to explore the collection for the majority of users who never ask for help.

Reality – staff need to be convinced (especially when not live). Some things are not accessible (it is only bibliographic, not federated search) and some things can’t be seen through Sorcer even though they can through the catalogue, eg. individual copies of journals. Old ways die hard – people think they know how to do it the old way. The ability to be nimble doesn’t mean you have more time – it can be quick to create new containers etc., but you then end up creating a lot more.

Civica’s definition – Sorcer – new Consumer Portal.  Hugh’s definition – a web 2.0 enabled catalogue encouraging browsing and discovery, but not replacing the catalogue.

Catalogues don’t speak human, they speak librarian. And sometimes librarians don’t even speak librarian very well.

Sorcer makes connections – gives you the standard catalogue information, but also “people who borrowed this also borrowed” and similar titles – makes it a bit more like Amazon. Data mining… All of these recommendations are only titles within the collection. Very useful.

Sorcer makes information more beautiful – word clouds and gallery displays.  Easier to use for our users.

How does it work – Front End – Sorcer uses tags, which can be created by users or staff – they can be made public or private. If private, it’s a shortcut for the user. If public, it becomes searchable. Tags can be deleted and blacklisted – so they can never be created again.

Friends not investigated yet, because they haven’t launched yet.

Front end – patron login. Can look at Sorcer or login and then look at it. When you login, you get options like books for you – look at your borrowing history and makes suggestions, recently borrowed – by anyone in the library, recent biographies and prize winning books. The latter uses the save list technology – a collection of titles library staff create and insert into a Sorcer container. Can do it with anything – eg. currently no subject headings on fiction books (but they are looking at doing so), but can create a save list on a genre.

The back end is complicated.  If you have the data in your database, you can do just about anything with it. If you can do it as a search or advanced search in your catalogue, you can put it in a container. Includes wildcards.

Sideway ends – standard OPAC. Port Phillip has kept catalogue as main and link to Sorcer. Boroondara is planning on making OPAC and website catalogue Sorcer, with a link back to the standard catalogue.

In a nutshell – Sorcer is much better than standard OPAC, but cannot yet completely replace the OPAC (they do have plans to bring in new things). You’ll need some Boolean to create containers. Sorcer is flexible and allows instant changes via Spydus Supervisor – no FTP required.

Saved lists don’t update – they are static and need to be updated by hand. Boolean lists are created on the fly, so they update as new items that meet that search are added to the catalogue. The static list option caches overnight rather than every time you load it.
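To make the saved list versus Boolean list difference concrete, here is a minimal sketch. It is not how Sorcer is implemented; the `Record` class, the `boolean_container` function and the sample titles are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical, heavily simplified catalogue record."""
    title: str
    genre: str


def boolean_container(catalogue, genre):
    """Dynamic container: the search is re-run each time it is displayed."""
    return [r.title for r in catalogue if r.genre == genre]


catalogue = [Record("Dune", "sci-fi"), Record("Emma", "romance")]

# Saved list: a snapshot taken once by staff. It does not change by itself.
saved_list = boolean_container(catalogue, "sci-fi")

# A new item is catalogued later.
catalogue.append(Record("Neuromancer", "sci-fi"))

static_view = saved_list                               # still the old snapshot
dynamic_view = boolean_container(catalogue, "sci-fi")  # picks up the new title
```

The saved list still shows only the first title until someone rebuilds it, while the Boolean container picks up the new item the next time it is displayed.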

Have problems with LOTE collections in standard catalogue as the older records are not in Unicode. New records are, but can’t be done retrospectively. Sorcer will read Unicode, so newer records are fine.

Lloyd Brady – SWIFT – SirsiDynix’s Enterprise/Portfolio
Portfolio is the expanded version of Enterprise, brings digital asset management into the discovery layer/content management system based Enterprise. Swift Libraries who are live now are on Enterprise 3.1. Portfolio 4.1 will go live with Swift Libraries in coming months.

Enterprise is a discovery layer for interacting with the OPAC and other federated search targets – via Z39.50 or with Serials Solutions’ 360 Search – but is also a content management system of its own.

Enterprise allows you to play with colours, CSS and can customise extensively, with some restrictions. 20 of the 22 Swift Libraries have purchased Portfolio/Enterprise.

Enterprise was initially just an alternative search interface for the catalogue with Google type features. Have added content management, better integration with library accounts etc.

Has a simple search box, can configure search limits – SWIFT has configured limits on home library service etc.
Enterprise has “Did you mean”, but it will also give you the possible alternatives in search results, regardless. It is Unicode compliant, so can search in non-Roman script, depending on your catalogue records of course.

Have incorporated Web 2.0 functions, eg. Facebook share, Digg submit, tweet this, tag in Delicious, LibraryThing links etc. Facets are displayed on the left side in Portfolio, eg. narrow your results by various facets including format, type, date etc.

The database Portfolio uses is indexed from the LMS. It is harvested overnight and Portfolio then uses those items in its own index. Copy availability is live – using web services to pull that info from the LMS at that time. New records added will not appear until the next day. Results are produced in relevancy order – the limit is 75% – no items are shown under this limit (it is adjustable in setup).

There can be issues with this, eg. when “ballet” was typed in a search, westerns appeared in the results because of the similarity to “bullet”. Wimmera bumped the limit up to 80% to fix this. It can be too harsh though, and miss out on things when users spell incorrectly.
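The trade-off with the relevancy limit can be sketched like this. The scores and titles below are invented, and `filter_hits` is a hypothetical stand-in; I don’t know how Enterprise actually calculates relevancy.

```python
def filter_hits(hits, threshold):
    """Keep hits scoring at or above the threshold, best first."""
    kept = [(title, score) for title, score in hits if score >= threshold]
    kept.sort(key=lambda hit: hit[1], reverse=True)
    return [title for title, _ in kept]


# Invented relevance scores (0.0 to 1.0) for a correctly spelled query...
hits_for_ballet = [("Ballet for beginners", 0.96), ("The last bullet", 0.78)]
# ...and for a misspelled query ("balet") that only loosely matches.
hits_for_balet = [("Ballet for beginners", 0.79)]

at_75 = filter_hits(hits_for_ballet, 0.75)    # the western sneaks in
at_80 = filter_hits(hits_for_ballet, 0.80)    # the western is gone
typo_at_80 = filter_hits(hits_for_balet, 0.80)  # but the typo now finds nothing
```

Raising the limit removes the stray western, but the same change leaves the user who misspells their search with no results at all.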

Have rooms that feature particular areas and can limit the search within that space to items relevant to that area, eg. children’s, local history etc.

Can separate out catalogue and federated search results in different tabs. Can group federated search sources however you like.

Portfolio is the next level up, but incorporates all of Enterprise. Adds digital asset management – so you can add your scanned newspapers or documents, historical photos, digital audio or video. Set up asset collections using Dublin Core and they become accessible through the catalogue. Is also OAI-PMH compliant, so images can be harvested by Picture Australia and it can harvest OAI-PMH compliant databases.
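For those who haven’t seen OAI-PMH in action, a harvest is just an HTTP request with a `verb` parameter, answered with XML that wraps Dublin Core records. Below is a rough sketch using only the Python standard library. The endpoint URL and the sample record are made up, and the response is a hard-coded string, so nothing here touches a real Portfolio or Picture Australia server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://library.example.org/oai"  # hypothetical endpoint

# Namespaces defined by the OAI-PMH and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request asking for unqualified Dublin Core."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


# Invented example of what a harvester might get back.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:library.example.org:photo-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Main street, 1910</dc:title>
          <dc:type>Image</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


request_url = list_records_url(BASE_URL)
harvested = parse_records(SAMPLE_RESPONSE)
```

A real harvester would also need to handle resumption tokens for large sets and error responses, which this sketch leaves out.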

Have upgraded search, can use Google-like search interface or limit search by field, like in a traditional catalogue. Can select multiple facets and exclude or include them, select multiple items and place a hold on them etc. Has lists, email etc features. Can also integrate Project Gutenberg titles and download the full text of the ebook.

Uses Google Analytics for reporting statistics. Administration via a Web interface and have user access restrictions.  No need to know HTML, uses WYSIWYG.  Also available for Horizon.

User suggestions etc. not in the works yet, but under discussion. Development still going on. Have their own Facebook app. Have a good booklists function.

BookMyne app is a discovery layer as well for the iPhone, coming soon for Android.

No user interaction – no tagging, but under development is conversations, where users can talk about things they find in the catalogue, or about particular rooms – chat and feedback functionality. If a search term is typed in three times, it is configured as an auto complete and sent to the admin for inclusion as a search term. Can also blacklist search terms.
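The “three searches and it becomes an auto complete” rule could look something like the sketch below. This is my guess at the logic, not SirsiDynix code: the `SuggestionTracker` class and its method names are hypothetical, and I have assumed a term becomes a suggestion as soon as it is queued for the admin.

```python
from collections import Counter


class SuggestionTracker:
    """Hypothetical tracker: promotes terms searched often enough."""

    def __init__(self, promote_after=3, blacklist=()):
        self.promote_after = promote_after
        self.blacklist = {term.lower() for term in blacklist}
        self.counts = Counter()
        self.pending_review = []  # terms passed on to the admin

    def record_search(self, term):
        term = term.strip().lower()
        if not term or term in self.blacklist:
            return  # blacklisted terms are never counted or suggested
        self.counts[term] += 1
        if self.counts[term] == self.promote_after:
            self.pending_review.append(term)

    def suggest(self, prefix):
        """Auto complete candidates that start with what the user typed."""
        prefix = prefix.lower()
        return sorted(t for t in self.pending_review if t.startswith(prefix))


tracker = SuggestionTracker(blacklist=["spamword"])
for term in ["Gardening", "gardening", "garden", "spamword",
             "gardening", "garden", "spamword", "spamword"]:
    tracker.record_search(term)

suggestions = tracker.suggest("gar")
```

Here “gardening” has been searched three times so it is suggested, “garden” has only been searched twice, and the blacklisted term never makes it through however often it is typed.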

Accessibility compliant – meets US standard (ADA mode) via a link.

Suzanne Male – Yarra Plenty – Bibliocommons
Went live at end of November 2010. Have just taken off beta release from the logo – still is a work in progress.
There are two components, the discovery layer and the website. They are combined to provide a seamless user experience. Incorporates web 2.0 features like Facebook, and Google-type search facilities.

First Bibliocommons combined site, using Drupal for content management system.

Public feature – My Shelves, can add titles to this and they are then visible to others (not everything you borrow). Can turn on a private recording of what you borrow.  Can add comments and share your thoughts on a catalogue record, as well as tags.  Can create lists – both users and staff, used for recommendations.  Have asked for more recommendation type features (Amazon like).

Can explore by facet, check out library blogs, staff lists and more.  Site search needs some work, but the catalogue search is good.  Getting good feedback on facets.

Events function and blogs will eventually come up in search results – can also book for events online.

Positives: Seamless user experience, it’s social, enhances the catalogue using what is already there, can browse serendipitously, encourages staff to be part of the community.

Bibliocommons now incorporates the Google Books preview.

Created lists can be chosen as local or of broader interest.

Will look at rewarding prolific contributors to their site – those who create many lists, recommendations, comments, thoughts etc with prizes.

RSS feeds not available at present, but coming. Has a mobile app for iPhone and Android, and a mobile version for other devices.

Bibliocommons – SaaS, as their LMS is already. Have some speed issues as it is hosted in the US.  Nightly synchronisation and indexing. Indexed to their server, constantly updated in a 15 minute cycle.

Have a bug tracking system and they talk to Bibliocommons daily. Privacy concerns, so have strict policies and users must specify their level of privacy – haven’t had any problems with this in the 7 months of operation. Inappropriate content can be removed by staff or voted out by other users. Users can opt to ignore another user.

Some confusion for borrowers with having username and a library card – as each are required in different places. Some users have tried to borrow with their username etc.

Does have local history photos on their catalogue.

Custom template as they were first – but not their template, was created and owned by Bibliocommons.  There is an annual cost as well as setup.

They manage most of their Drupal site inhouse, but some of it managed by Bibliocommons.

Who owns the comments – don’t know, but think they would belong to Bibliocommons.

Patron takeup – don’t know numbers, but it’s growing.

Kirstie McRobert – State Library of Victoria (SLV) – Ex Libris’ Primo
Primo – high end product – mainly academic.

On the whole have been very pleased with it.

Working on version 2 at present, will soon upgrade to 3. Primo has a federated search component called MetaLib, also an Ex Libris product, and a link resolver, SFX. Together it’s a solution.

Dream – provide simultaneous search across the Library’s website incorporating both internal and external data sites.

Created a link to new style catalogue and moved the classic link over. People started using it straight away. Recommends this sneaky launch.

Launched the big red search box on every page when the new website was launched in 2010.

Wanted to provide a Google-type experience, single search across all library resources and incorporating web 2.0 features.

Simpler search for all, one interface to learn – seamless, natural language, fewer zero hits than OPAC, provides clearer pathways to targeted resources.

Provides a means to discover more of the library resources by aggregating data sources, by exposing more metadata through results facets, by starting discovery through scoped default searches and by making serendipity possible.

Wanted more visible journal/newspaper articles and ebooks search – via articles database tab, consistent search experience. Web 2.0 included reviews, comments, tags, store search results, alerts, bookmarking services, check requests.

Benefits – highly configurable – collection scopes, search indexes, relevance ranking, results facets, record displays, links to external sources such as Google Books, Trove, WorldCat etc. Can have Library Thing and Syndetics, but don’t at present.

Benefits for SLV – Data configuration managed by librarians rather than by IT staff, via web admin module; normalisation rules to control search and display; each data source can have its own normalisation rules. Also gives SLV a greater return on investment, as electronic resources have greater visibility – both licensed and in-house. Exposes more SLV data where users are searching, eg. in Trove and WorldCat – next version will enable them to expose more through OAI harvests (to go to Trove) and to open it up to Google, if they so choose.

SLQ One Search is where SLV is headed with the next version of Primo.

Have various pre-search limiting options, eg. books, journals, pictures, audio and video – but not websites or databases because of poor quality coding on them.

Big challenge for staff to leave refining searches until the initial search is completed. Works best with the ‘dumb’ search. As long as there is metadata, you can setup a facet for it.

Allows you to tailor further suggested searches, generated from authors, subjects etc. Automatic word stemming kicks in if the number of results is below a configurable threshold (set at 25 by SLV). Also has FRBRisation of editions – brings different editions together – this can be turned off for certain sets of items.
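The stemming fallback described above can be sketched as follows. This is an illustration only: `naive_stem` is a crude suffix stripper I made up for the example, not Primo’s stemmer, and the titles are invented. Only the threshold of 25 comes from the talk.

```python
def naive_stem(word):
    """Very rough suffix stripping, for illustration only."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search(titles, query, stem_threshold=25):
    """Exact word match first; fall back to stemmed match if too few hits."""
    query = query.lower()
    exact = [t for t in titles if query in t.lower().split()]
    if len(exact) >= stem_threshold:
        return exact
    stem = naive_stem(query)
    return [
        t for t in titles
        if any(naive_stem(word) == stem for word in t.lower().split())
    ]


titles = ["Garden design", "Gardens of Victoria", "Gardening basics", "Cooking"]

widened = search(titles, "gardens")                    # 1 exact hit, so stemming applies
exact_only = search(titles, "gardens", stem_threshold=1)  # enough hits, no stemming
```

With the threshold at 25, a search with a single exact hit is widened to every title sharing the stem; with a threshold of 1, the exact hit is returned on its own.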

Can make tags, comments and reviews, but none are private. Must be logged in to make them, but can do so anonymously, even whilst logged in. Most tags they have seen added, are pretty esoteric. There is no real moderation, but they have set up an Oracle report to monitor them. Haven’t removed any at this time, but can if need be.

Pitfalls – try to avoid excessive look and feel design, as you have to rebuild when there are upgrades; avoid customisation that is not vendor supported where possible – just because you can, doesn’t mean you should.

Phase 3 due December 2011 – new look and feel, enhanced brief results summary, more default OPAC-like indexes, more default sorting options.

Users are directed to Primo, but have the option to go to the OPAC. However, SLV wants to retire the OPAC and will continue investigating the gaps, to ensure no functionality is lost when that happens.

Next phase – incorporate library web pages as sources, eg. Mirror of the World site, or La Trobe Journal – set up as a separate search, eventually replace the existing website search and offer a mobile version as well.

Comments – historical information is just accepted as a comment. If a comment points out something clearly wrong, which can be verified, the original record is changed. Once it is fixed, people will often then remove their comment.

Alison Dellit – National Library of Australia – Trove
Trove was built by NLA – more flexibility.

Content is king. People will put up with a lot if you have good content – they will complain, but they will access it. 80% of usage is the newspapers/magazines. 50% of their keenest researchers are genealogists, the other 50% have special interests, eg. crime in Victoria, transport etc.

Most popular – digital content that can be accessed immediately, rare content which can’t be found elsewhere, and the undiscovered, for example, theses. When designing systems, ask: what are the drawcards that are bringing people into your site and content?

Convenience. Most users want to get through your site to what they want to get, as quickly and painlessly as possible. Single search box was the only thing they got right, right from the start, everything else has been changed since then. Users particularly love the copyright check and cite this buttons. The FRBRisation of titles has created some problems – if you are seeking a particular edition of a title, it is almost impossible.  Are working to resolve it.  The system has to be simple for the majority but also have a level of complexity for those who want it – likely to involve multiple screens. User testing is king. If Trove staff can’t agree on something, they will adjudicate it through user testing.

Collaboration. Deliberately built a community who have ownership over what they do. Over 40 million lines of newspaper have been corrected by the community. Users are correcting newspapers, adding images, tagging, commenting, merging or splitting works and adding lists. A small team of staff do censor comments, and have had to do so, including removal of some spam.

On Trove, you can access some online content with your local library membership (eg. Gale/Cengage, RMIT/Informit). If your library subscribes to those resources, you can search them and authenticate on Trove, using your local library card.

Implemented a user forum.  Due to privacy restrictions, they couldn’t connect users directly, but users can find each other there and do, particularly the newspaper text correctors.

Changes to Libraries Australia are immediately updated on Trove.

How do they do it?

Iterative – did a soft launch, but actively solicited feedback; every change was made from this. Any enquiry that relates to using Trove starts a process of investigating how they could improve how Trove works. Launched with an imperfect product, so they could test with live users and adjust as they went.

Collaborative. Only possible because of the great team, many of whom work on other things at the NLA. Built using Solr, which underlies a lot of other discovery layer software. It’s a base level of software. They have a full development team. Heavily use JavaScript.

Flickr harvested photos go to both Picture Australia and Trove. Picture Australia will eventually be rolled into Trove, but a lot has to be resolved before that happens. Trove directly harvests images from 10 cultural institutions (80 go through Picture Australia).

Can turn fuzzy logic off for advanced searches.