Archive for February 11th, 2010

Top Trends Panel – VALA2010 Day 2 Afternoon

conference No Comments »

Top Trends Panel – Tom Tague, Roy Tennant, ?, Marshall Breeding and Karen Calhoun – moderated by Anne Beaumont.

Anne began with an example to start the conversation: a photo whose library-provided information conflicted with information offered by a user, who dated it between 1870 and 1876 based on the features included or lacking in the image. How do you check the authority of such contributions when things are coming in so fast?

What do you do? One suggestion was to add it like a kind of letter to the editor, so that it’s a chance to share the information without needing to worry about authority. If it’s wrong then the community of users will correct it amongst themselves. Need to acknowledge the difference and separation between library generated content and user generated content.

Powerhouse Museum has this problem all the time. It is not often that they get something that needs the curators to go away and fact check, but when they do, it is well worth the effort. Powerhouse allows users to add and delete tags, both their own and other people’s. We would be lucky to get tags, so we shouldn’t be putting barriers in place to discourage this. Flickr is a social environment with people who are used to tagging and where you don’t have ultimate responsibility for what people add.

NLA adds their user generated content as a layer on top of the content. It can look as if it is integrated, but it isn’t, and the difference is made clear. UGC is not moderated. Need to be able to hear from users. Can blend these things in useful ways. Sometimes our users know more about a subject than we do, so we should make the most of their knowledge when they contribute it.

Starting point for user contributed content – using the alphabetic descriptors that match particular Dewey numbers. It gives people a stepping off point.

VALA2010 Concurrent Session 7 – Innovation

Warwick Cathro and Susan Collier – Developing Trove: the policy and technical challenges

Trove is a free discovery service for the public. It allows them to discover and annotate content. It is for both the casual user and the researcher. It is part of Australian infrastructure, not a purchased product. It’s all the NLA’s services rolled into one, with more added.

Two imperatives for the NLA – streamlining and integrating the proliferation of national collection discovery tools and, as per their Direction Statement, developing online spaces for user interaction.

Trove comes from treasure trove – the latter coming from the French for “to find”, so it combines the content and finding it.

It benefits from their experiences with Libraries Australia, Pandora, ARO and more.

Small team of five developed Trove.  Started September 2008, prototype in May 2009, nine versions of prototype and released version 1.0 in November 2009.  Three updates since then.

Challenges: collection views, works and versions, what is online?
Collection views: search results are grouped into collection views. Need to decide what they would be. Newspapers and people were easy, the rest was not so easy. Realised that they were working from a library view – recruited a group of students, teachers, family historians and general public to card sort the different types into groups and got them to name the groups.  Then used the group names to get people to put types into them.   The results were: books-journals etc, pictures and photos, Australian newspapers, diaries-letters, and much more.

Creating metadata for these groups was very difficult. Rules are not perfect, so they know that there are items which are in the wrong groups. Hopefully in future, users will be able to suggest alternatives.

Trove is FRBRish. Has a similar structure, with some variations. Trove takes old MARC records and makes them do new things.

Issues with determining online access. Easy to discover a resource is online, but hard to discover what the item is and whether access is free. Three types identified: available online, available online (access condition), possibly online.

Want users to add value – they can tag, split and merge records, fix the OCR on the newspapers. Enhancements are included in a separate layer. It improves the quality, as evidenced by the Australian newspapers project.
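
The idea of keeping enhancements in a separate layer can be sketched roughly as below. This is only an illustration of the concept, not how Trove is built: the `Record`, `UserLayer` and `display_view` names and all field names are hypothetical, invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Library-supplied data (hypothetical fields); user edits never touch it."""
    record_id: str
    title: str
    ocr_text: str


@dataclass
class UserLayer:
    """User contributions, keyed by record id and stored apart from the records."""
    tags: dict = field(default_factory=dict)         # record_id -> set of tags
    corrections: dict = field(default_factory=dict)  # record_id -> corrected text

    def add_tag(self, record_id, tag):
        self.tags.setdefault(record_id, set()).add(tag)

    def correct_text(self, record_id, text):
        self.corrections[record_id] = text


def display_view(record, layer):
    """Blend both layers for display, labelling where the text came from."""
    corrected = record.record_id in layer.corrections
    return {
        "title": record.title,
        "text": layer.corrections.get(record.record_id, record.ocr_text),
        "text_source": "user" if corrected else "library",
        "user_tags": sorted(layer.tags.get(record.record_id, set())),
    }


record = Record("n1", "Shipping news", "Tbe sbip arrived")
layer = UserLayer()
layer.add_tag("n1", "ships")
layer.correct_text("n1", "The ship arrived")
print(display_view(record, layer))
```

Because the original record is never overwritten, a correction can be dropped or replaced without losing the library's own data.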

They can monitor what users are doing online, in terms of interaction with the content. Comments have been added to Trove by users. Eg, photo had comment from person’s grandmother, giving more biographical detail: newspapers have been corrected and more information provided.

Future developments: currently working on RSS feeds, enhanced sorting, more external targets, more full text, an API. Then – search and delivery of NLA digitised journals, inclusion of journal article indexing data from partner vendors, more goals for obtaining data from archives and museums.

Trove release comes after three years of discussion and development. Takes resource discovery to a new level. There are other products out there that will do the same. Trove is different, includes more unique content and is national.

Paul Hagon – Everything I know about cataloguing I learned from watching James Bond.

Senior web person at the secret society of librarians at Canberra – also known as NLA.

Newspapers used to be papers in metal filing drawers, all carefully labelled with metadata – then fed into a microfilm reader. Services like Trove allow the discovery down to deep content – the metadata has been relegated to the rear. Content now rocks and metadata is relegated.

All full text searching of the newspapers is made possible through OCR. Deep content searching is possible with text, but what about images? Computers are good at identifying mathematical markers within images. Begin with facial recognition. Can we use this on our collections on a global scale? Chose a series of photos of a range of Australian Prime Ministers, using iPhoto. It was a laborious process, and it didn’t do too well at identifying people accurately – 32%. OpenCV, from Intel, was tried out next. It didn’t try to identify people, just tried to identify a face. When it did, it boxed it. It was very successful in identifying two photos of the same person, regardless of context. Didn’t do so well with people in profile or poor quality images. Was successful 85% of the time.

What could it be used for? If you do a search on Parks, get people, town and feature. If you click on portraits, you would get images as well.

Also did work on colours. Broke down images into colours, recognising both the colours and the % of the image that had that colour. Some colours can be lost however, as there is not enough of the value to display this. Can go up to 64 colours (from 8) to pick those up, but then data storage requirements grow dramatically.
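
A rough sketch of that kind of colour breakdown follows. It assumes the palette is built by cutting each RGB channel into equal bands (2 bands per channel gives 8 colours, 4 gives 64); the talk did not say how the palette was chosen, so that scheme, the function names and the sample pixels are all assumptions made for illustration.

```python
from collections import Counter


def quantise(pixel, levels):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse colour bin."""
    step = 256 // levels
    return tuple(min(channel // step, levels - 1) for channel in pixel)


def colour_shares(pixels, levels=2):
    """Return {colour bin: percentage of the image} for a list of pixels."""
    counts = Counter(quantise(p, levels) for p in pixels)
    total = len(pixels)
    return {bin_: 100.0 * n / total for bin_, n in counts.items()}


# Hypothetical 100-pixel image: mostly blue, with two different reddish shades.
pixels = [(10, 20, 200)] * 90 + [(200, 30, 30)] * 6 + [(250, 100, 30)] * 4

print(colour_shares(pixels, levels=2))  # 8-colour palette
print(colour_shares(pixels, levels=4))  # 64-colour palette
```

With 8 colours the two reddish shades fall into the same bin, while with 64 they are kept apart, at the cost of up to eight times as many values to store per image.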

Did more testing with ImageMagick – which can analyse an image – shows the RGB values which can be stored in the database.  You can then search the database just by colour. Can end up with different types of images depending on which colours you search.
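
Searching by colour once the RGB values are stored could look something like the sketch below. It uses Python's built-in `sqlite3` module rather than whatever database was actually used, assumes one dominant colour has already been extracted per image, and the table name, image ids and `tolerance` parameter are made up for the example.

```python
import sqlite3


def build_index(images):
    """Create an in-memory table of (image id, r, g, b) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image_colour (image_id TEXT, r INTEGER, g INTEGER, b INTEGER)"
    )
    conn.executemany("INSERT INTO image_colour VALUES (?, ?, ?, ?)", images)
    return conn


def search_by_colour(conn, rgb, tolerance=40):
    """Return ids whose colour is within `tolerance` on every channel, nearest first."""
    r, g, b = rgb
    rows = conn.execute(
        """
        SELECT image_id
        FROM image_colour
        WHERE ABS(r - ?) <= ? AND ABS(g - ?) <= ? AND ABS(b - ?) <= ?
        ORDER BY (r - ?) * (r - ?) + (g - ?) * (g - ?) + (b - ?) * (b - ?)
        """,
        (r, tolerance, g, tolerance, b, tolerance, r, r, g, g, b, b),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical stored values, one dominant colour per image.
index = build_index([
    ("img-sky", 60, 120, 220),
    ("img-sea", 40, 100, 190),
    ("img-rose", 210, 40, 60),
])
print(search_by_colour(index, (50, 110, 210)))
```

A wider tolerance brings back more loosely matching images; a narrower one keeps the results closer to the colour asked for.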

http://1104.nla.gov.au – go and play, and give feedback to Paul.

Why research? Computer applications are already using this technology. iPhone – Shazam app – identifies music that is being played and gives you more info about it. Etsy craft store lets you search by colour. Google Goggles – take a photo and it analyses a feature and brings back info on it. Pattern recognition in an item, no metadata required.

Marshall Breeding – VALA2010 Day 2 Morning Plenary

conference, ILMS No Comments »

Marshall Breeding – Vanderbilt University Libraries – Blending evolution with revolution: a new cycle of library automation spins on

Library Technology Guides (website) is where Marshall puts all the information he gathers as he does his research. It shows what’s going on in the field of library automation. Check out the chart on the Australian LMS scene at www.librarytechnology.org. Interesting to look at the current standings of LMSs, but more interesting to look at the dynamics of change – who is taking the library field into the future.

Perceptions 2009 – third annual survey, gathered November to January, over 2000 responses with 109 from Australia and New Zealand. Asks library staff about what system they use and what they really think about it. It’s not just gossip, it’s an informal survey showing what people really think of the products they use. Available online.

Observations from this study: smaller libraries and niche products generally receive better perception scores; companies supporting proprietary products generally receive higher satisfaction than those involved with open source, except for libraries already using open source – these products were perceived as poor performing.

Library Journal Automation Marketplace – published annually in April 1 issue, based on vendor provided data, focused primarily on US market. Gives a broad view of the industry.

Context: Libraries in transition – shift from print to electronic, increasing emphasis on subscribed content (especially articles and databases), strong emphasis on digitising local collections, demands for enterprise integration and interoperability. Electronic resources and projects are taking increasing amounts of library budgets.

Marshall reflected that Abbey in the VALA video from yesterday, had summed up what he wanted to get across at VALA.

New generation of library users, millennials who are tech savvy.

Technologies are in transition: XML is the focus, along with Web services and service-oriented architecture. We are beyond Web 2.0; it’s now part of what we do. Moving from local to cloud computing – SaaS, private and public cloud. Full spectrum of devices: full scale, netbook, tablet, mobile, with the focus on mobile at present. Need to be more device independent.

Dynamics of the Library Automation Scene:
Evolutionary path: gradual enhancement of long-standing LMSs, wrap legacy code in APIs and Web  services.  Library market prefers evolved systems, hard to build systems from scratch.
Revolutionary path: Ex Libris URM, Kuali OLE and WorldCat Management System, which are clean slate automation frameworks or cloud based.

Rethinking library automation: LMSs don’t work too well for hybrid libraries.

OLE Project is a collaborative project, with NLA involved – one to watch. OCLC Management system will take what they already have (e.g. WorldCat Local) and just add back end operations to make it a full LMS.

Open Source LMS are growing fastest – not just in US, big companies in Australia and New Zealand.

Opening up Library Systems through Web Services and SOA: Hype or reality? Library Technology Report. Showed that proprietary systems had more APIs for customers to use. Even the best APIs are still quirky and not comprehensive – still a way to go. Need to have the widest range of APIs available, so that we can use the data the way we want to. Open APIs allow you to tweak, without using the deep source code of your LMS.

Marshall spoke about discovery layers – check my notes from the L-Plate series so that I don’t have to take these notes for the second time. :)   Discovery products list and information available from www.librarytechnology.org/discovery.pl.

Outlook for next five years: most libraries still using evolved systems, increasing ranks of next generation LMS, library resource discovery matures, mobile, transition from local to cloud computing.