Tuesday, November 5, 2013
Continental Breakfast
8:00 a.m. - 9:00 a.m.
Welcome and Keynote: Taxonomy Is Power: Bringing It All Together
9:00 a.m. - 10:00 a.m.
Bob Boiko, Founder & CEO, Metatorial Services, Inc. Senior Lecturer, University of Washington Information School, & Author, The Content Management Bible, Laughing at the CIO, & the upcoming The Structure of Information
Taxonomy is controlling information by naming and organizing it. Boiko dives into the depths of taxonomy to talk about what, at its essence, it is and does. He describes the usual and potential position of the taxonomist in projects and how the taxonomist’s skills and methods can provide leadership in the teams, departments, and organizations you work in. Hear how taxonomy fits into the structure of information and the crucial role it plays in bringing people, information, and technology together.
Coffee Break
10:00 a.m. - 10:15 a.m.
Moderated by:
Hannah Rubin, Information Research Specialist, Congressional Research Service
Taxonomy Fundamentals Workshop
10:15 a.m. - 12:00 p.m.
Marjorie M.K. Hlava, President & Chairman, Access Innovations, Inc., Data Harmony; blog: TaxoDiary.com
This interactive session starts by building a solid conceptual foundation for taxonomy creation and reinforces those concepts through audience participation. Starting with the basics, Hlava quickly advances to where and how to leverage taxonomies. This gives beginning and intermediate practitioners a good overview of the foundational knowledge for the more advanced sessions throughout the conference. Leveraging the taxonomy standards for the key components of a thesaurus, Hlava explores how those elements support the information needs of users from multiple perspectives and examines illustrative sites and behind-the-scenes solutions to see how a well-constructed taxonomy with a rich interplay of terms and synonyms leads to better information access. The workshop discusses developing a taxonomy that serves users, respecting their needs for specialized vocabularies. With hands-on activities, attendees gain insight into how a subject area can be viewed, described, and structured. This learn-by-doing session provides basic knowledge to create a taxonomy that suits your needs.
Attendee Lunch
12:00 p.m. - 1:00 p.m.
Taxonomies: From Idea to Reality
1:00 p.m. - 2:00 p.m.
Gary Carlson, Founder, Factor; Seth Maislin, Principal Consultant, Digital Transformation, Earley Information Science; Ralph Tamlyn, Principal, Taxonomy and Classification Metadata Consulting IEEE, SLA, (ACBL as a hobby); Carol Hert, Senior Consultant, Factor
There are many challenges in managing different types of taxonomies. Taxonomies can range in size from fewer than 10 terms to more than a million, be in one language or 20, and have a simple hierarchy or complex ontological structure. It’s also important to remember that taxonomy is not a panacea to solve content management issues. It’s a critical step, but successful solutions go far beyond just taxonomy; if you’re not planning for LAT (“life after taxonomy”), then you might find that you’ve invested a great deal of time and money for a system that’s not ready to perform. This panel brings together experts who have tackled different taxonomies from a range of organizations. They discuss the unique challenges, different approaches, and expectations that can be helpful when working on different taxonomies, including how to transform your taxonomy from an academic exercise to a full-fledged vehicle for content management; how to design a taxonomy that meets real user needs; the relationship of taxonomy and metadata; the rigors of taxonomy governance; editorial guidelines and tagging strategy; and post-implementation tasks, analysis, and adjustments. Creating your taxonomy is merely the beginning; by the end of this session, you will have learned what it takes to cross the finish line.
TAXONOMIES IN SHAREPOINT
2:15 p.m. - 3:00 p.m.
How Evolving SharePoint Functionality Requires an Enterprise View of Taxonomy
Seth Earley, CEO, Earley Information Science Author, The AI Powered Enterprise
SharePoint has evolved through the years in the level of sophistication around functionality driven by metadata and taxonomies. Metadata has always been important, but the product has evolved in very significant ways. The implication: Taxonomy derivation and thoughtful application are no longer nice to have but are now critical to the effective use of the platform. Earley reviews recent advances and outlines the ways that taxonomy is leveraged in foundational reference architectures. He provides a clear set of use cases and business justification for taxonomy development programs.
Double Tag! Managed Metadata & Taxonomies in SharePoint
Chris McNulty, Senior Product Manager, Microsoft
Information architecture finally gets a helping hand with the second edition of Managed Metadata Service in SharePoint 2013. Our expert reviews all the traditional uses for the term store and social tags. He begins with a hands-on review of SharePoint 2013’s managed metadata services for taxonomies, folksonomies, hashtags, site policies, and content types. He concludes by looking at how metadata navigation comes together to create a dynamic information catalog to collect far-flung content united only by common metadata tags.
Coffee Break
3:00 p.m. - 3:15 p.m.
COMMUNICATING WITH STAKEHOLDERS
3:15 p.m. - 4:00 p.m.
Explaining Metadata: Tools You Can Use
Ruven Gotz, Director, Collaboration, Avanade Microsoft SharePoint MVP
This meta-presentation improves your understanding of metadata, but more importantly, it gives you the tools and techniques to help you explain metadata and taxonomy to your stakeholders in terms they can understand. Through the use of metaphors and interactive tools (that are provided), you will be able to excite your stakeholders and get them engaged in the process of defining metadata for their business area.
Machines vs Humans: Selling both!
Daniel Mayer, CMO, Expert System Enterprise
Both in taxonomy development and in content tagging, there is a long-standing information management debate between the human element of quality and the automation-driven efficiency. A well-built thesaurus supports automated content annotation, while effective semantic enrichment supports taxonomy maintenance, successfully leveraging human input. This session showcases the benefits of a connected architecture where taxonomy (or ontology) management and semantic enrichment - along with human operators - work as a team to support a cohesive information lifecycle.
Taxonomy: Science or Whimsy?
John Matthew Upton, Principal Consultant, ByteManagers
Categorization is an intensely personal exercise: each group of objects, whether physical or digital, can be grouped according to myriad organizational methods. When justification for wholesale change to taxonomy structures is based on personal opinion instead of data, the categorization scheme breaks down and taxonomy becomes ineffective. This session takes taxonomy out of the academic vacuum. Using “jars of whimsy” (small mason jars filled with seemingly random odds and ends), it shares a brief sorting exercise that illustrates the type of categorization analysis that must be performed on a regular basis in order to maintain a high-functioning taxonomy.
PANEL: The Curious Lives of Full-Time Taxonomists
4:15 p.m. - 5:00 p.m.
Zach Wahl, CEO, Enterprise Knowledge; Ahren Lehnert, Principal Taxonomist, Nike Inc., USA; Jenny Benevento, Freelance Taxonomist; Dan Segal, Information Architect, IBM
This popular session facilitates a conversation with a panel of full-time taxonomists from the public and private sectors and the consulting world. The taxonomists discuss their career paths, daily activities, and noted trends in the industry. The audience has the opportunity to ask questions, with answers and different perspectives provided by each panelist.
Welcome Reception
6:00 p.m. - 7:00 p.m.
Moderated by:
Michael Crandall, Senior Lecturer and Director, iAffiliates Program, The Information School, University of Washington
BUILDING LARGE TAXONOMIES & THESAURI
10:15 a.m. - 11:00 a.m.
Taxonomy Development
Ahren Lehnert, Principal Taxonomist, Nike Inc., USA; Kim Glover, Director, Internal Communications, TechnipFMC
Developing a taxonomy from existing sources of information is a good way to ensure your vocabulary is accurately reflecting the content you are classifying. Additional sources of content, however, can change the scope of the vocabulary and potentially create conflicts with the existing structure. How do you plan for what you don’t know? Are there tactics for developing a taxonomy that can grow and adapt to new information without causing a partial or complete overhaul of what you have already built? This session includes a real-world taxonomy development case study as well as helpful suggestions and best practices for designing a taxonomy able to adapt to new information sources.
Building a Multidisciplinary Thesaurus
Nancy Murray, Associate Director of Metadata, Content Management, ITHAKA/JSTOR American Society for Indexing
JSTOR is a digital library of more than 1,500 academic journals, books, and primary sources. These holdings comprise a wide range of topics from the humanities to the sciences. Today, no one single thesaurus holds the terms to cover all these subjects. Known for quality metadata and quality images, JSTOR’s goal is to have a thesaurus provide for high-quality enhancement of its content. Hear about the steps and the issues involved in creating this thesaurus.
Tilling the Fields: Growing Old Taxonomies to Fit New Content
John Magee, Director, Indexing & Vocabulary Services, Cengage Learning; Maureen McClarnon, Metadata Architect, Cengage Learning
What does one do when one has 100-plus decade-old taxonomies that need to be repurposed? What if the taxonomies in question are highly idiosyncratic in both content and organization, but now need to play nicely in a wider, standards-driven environment? Education and reference publisher Cengage Learning confronted this issue in 2011, when the metadata team needed to harvest previously untended fields of course-syllabus-based taxonomies to feed new content and products. The crop yield from the legacy taxonomies wasn’t enough to implement this new indexing workflow in production systems and products. Would Cengage Learning practice slash-and-burn agriculture, or could it use the seeds of previous efforts to cultivate bountiful new fields? Members of Cengage Learning’s Metadata Standards and Services team take you through how they identified problems with their existing taxonomies, analyzed the problems and opportunities, and ultimately merged the old taxonomies into improved, discipline-based taxonomies. They share tips on taxonomy cultivation, and keep you from buying the farm.
TAXONOMY MANAGEMENT
11:15 a.m. - 12:00 p.m.
Taxonomy & Classification Metadata Management: Best Practices
Ralph Tamlyn, Principal, Taxonomy and Classification Metadata Consulting IEEE, SLA, (ACBL as a hobby)
Organizing and delivering relevant information to people inside and outside organizations is an ever more complex challenge as the volume of information grows. Among the critical components underlying the solution is the classification of information through high-quality metadata. The metadata in turn depends on high-quality taxonomies and ontologies. The metadata is used to classify, manage, organize, and integrate information, including web content. IBM has undertaken a multi-year effort to improve the integration and delivery of information and web content by improving the management of classification metadata and taxonomies. This effort is building on best practices for managing classification metadata and managing taxonomies and the processes implementing these practices through enterprise tools and enhancements to the myriad systems managing information and content. Tamlyn led the development of those best practices and the design of those tools, using metrics and governance to complete the solution. In this session, he focuses on the graceful evolution of existing systems to implement such practices. He discusses user-facing taxonomies vis-à-vis normalized classification taxonomies and ontologies, as well as evolutionary steps in systems to improve taxonomies and classification metadata without interrupting the operation of those systems.
Successfully Managing Multilingual Taxonomies: 3 Approaches
Jim Sweeney, Senior Product Manager, Taxonomy & Ontology Solutions, Synaptica LLC, USA
This talk covers three different approaches to managing multilingual taxonomies, their terms, and translations. All three methods are discussed in detail as well as the pros and cons of each strategy.
Assessing Management Needs: Using a Vocabulary Governance Maturity Model
Richard Iams, Information Architect, The Eliassen Group
Information exchange, between systems and users, is vital in today’s knowledge-based business environment. Effective governance across information systems, taxonomies, and data yields stable and predictable results as changes are applied in response to business needs. However, the gold-standard plan may not always be achievable on tight budgets. Iams discusses a vocabulary governance maturity model. The model provides a framework for comparing current vocabulary governance to best practices. It defines specific success measurements that can be used to prioritize vocabulary management activities which are likely to provide the most value when implemented. An example scorecard for a vocabulary management application is shared.
Attendee Lunch
12:00 p.m. - 1:00 p.m.
TAXONOMY EVALUATION AND TESTING
1:00 p.m. - 2:00 p.m.
Evaluating Taxonomies
Joseph Busch, Principal, Taxonomy Strategies; Vivian Bliss, Independent Consultant, USA
Taxonomies are developed in communities and evolve across time. From the outset, there is a need to evaluate existing schemes for organizing content and questions about whether to build or buy them. Once built out and implemented, taxonomies require ongoing revisions and periodic evaluation to keep them current and structurally consistent. Taxonomy evaluation includes the following dimensions that are discussed: 1) editorial evaluation, including depth and breadth, comprehensiveness, currency, relationships, polyhierarchy (is it applied appropriately?), and naming conventions; 2) collection analysis, including category usage analytics (is distribution of categories appropriate?), completeness and consistency, and query log/content usage analysis; 3) market analysis, including industry standards/leaders, user surveys, card sorting, and task-based usability. Examples are provided from clients in B2B and B2C ecommerce, intranets, and public websites in the public, nonprofit, and commercial sectors.
Testing Taxonomies
Heather Hedden, Taxonomy Consultant, Hedden Information Management Author, The Accidental Taxonomist
Just because you have a taxonomy, it’s not safe to assume that it will function as well as it could. An important part of any taxonomy development or redesign project is testing the taxonomy. The session includes an overview and examples of different types of tests that can be used on taxonomies, including card sorting, user/use case testing, and A/B testing, tells what tools or methods can be used, and explains when each is most appropriate. This presentation also discusses the difference between testing and evaluating a taxonomy and when each should be done. Finally, taxonomy testing and evaluation are compared with general website design testing and evaluation.
Discussion, Questions, & Answers
STANDARDS UPDATE
2:15 p.m. - 3:00 p.m.
Taxonomy Interoperability Standard
Marjorie M.K. Hlava, President & Chairman, Access Innovations, Inc., Data Harmony; blog: TaxoDiary.com
Taxonomies at last have a standard to support interoperability between taxonomies and other controlled vocabularies. Linking, multilingual support, and interoperability across standards have been a holy grail for many years. With the passage of ISO 25964 Part Two in 2013, the groundwork has been laid for further development in these areas. A brief overview of the standard and its relation to other current standards such as ISO 25964 Part One, revisions to Z39.19, and the updated British standard BS 8723 Parts 1–5 is provided. In addition, the ISO terminology standards used by the computer science community are discussed.
Taxonomy Modeling’s New Guard: SKOS-XL Concepts
Jim Sweeney, Senior Product Manager, Taxonomy & Ontology Solutions, Synaptica LLC, USA
This talk examines concept/label taxonomy design as represented by SKOS-XL (Extended Language) modeling, compared to traditional taxonomy design. It explores the differences between the two models and how each is able to handle specific design requirements, such as managing multilingual instances, synonymy, and distinct attributes. Relation to OWL and its variations and the use of “term ID” and its pros and cons within these standards are also discussed.
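As a rough illustration of the concept/label design described above (a sketch only, not the speaker's material), the Python below models labels as first-class objects with their own identifiers and attributes, loosely mirroring the `skosxl:Label` class and its `skosxl:literalForm` property. The class names, field names, identifiers, and sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Label:
    """A label treated as its own resource, loosely mirroring skosxl:Label."""
    label_id: str        # hypothetical identifier for the label itself
    literal_form: str    # the text of the label (cf. skosxl:literalForm)
    lang: str            # language code such as "en" or "fr"
    attributes: dict = field(default_factory=dict)  # label-level metadata


@dataclass
class Concept:
    """A concept that points to labels instead of storing plain strings."""
    concept_id: str
    pref_labels: list = field(default_factory=list)  # one per language
    alt_labels: list = field(default_factory=list)   # synonyms

    def pref_label(self, lang: str) -> Optional[str]:
        """Return the preferred label text for a language, if one exists."""
        for label in self.pref_labels:
            if label.lang == lang:
                return label.literal_form
        return None


# Hypothetical sample data: one concept, two languages, one synonym.
car = Concept(
    concept_id="c1",
    pref_labels=[
        Label("l1", "car", "en"),
        Label("l2", "voiture", "fr"),
    ],
    alt_labels=[
        Label("l3", "auto", "en", {"source": "search logs"}),
    ],
)

print(car.pref_label("fr"))
print(car.alt_labels[0].attributes)
```

Because each label carries its own identifier, attributes such as provenance can be attached to a single synonym or translation without touching the concept, which is one of the design requirements the abstract mentions.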
Discussion, Questions, & Answers
Coffee Break
3:00 p.m. - 3:15 p.m.
ENTERPRISE TAXONOMIES
3:15 p.m. - 5:00 p.m.
Building Enterprise Taxonomies: Lessons Learned
Seth Earley, CEO, Earley Information Science Author, The AI Powered Enterprise
After almost 2 decades of building taxonomies for a variety of industries, applications, tools and organizations, there are many lessons to be learned and applied to today’s highly distributed and loosely connected information environments. Internal versus external applications, departmental versus business unit versus enterprise, application to structured versus unstructured, text versus rich media, back end versus front end, machine-applied versus human indexing, and a range of other variables for the derivation, application, and maintenance of taxonomies provide a rich backdrop for lessons learned.
7 Steps to EIM Taxonomy Success
Myles Miller, CEO & Founder, SuccessHQ
The need to classify and categorize corporate information has never been greater. The proliferation of information channels, sources, and delivery platforms makes managing information a complex business challenge. Compounding this issue is the fact that, due to the increasing speed of business, information is growing at a rate that far surpasses standard institutional frameworks and controls. Information continues to be recognized as a key source of competitive advantage, and there is an increasing need for the business worker to access relevant information in a timely manner. Information by its very nature is dynamic, so attempting to set boundaries results in a cumbersome way to control the flow of information. The key to managing this information is to develop a way to identify, classify, and categorize enterprise information. This categorization allows for effective management of content throughout the information life cycle: capture, storage, retrieval, archival, and disposal. A systematic approach to taxonomy development goes a long way to ensure that the finished product, the corporate taxonomy, is relevant, usable, and provides value to the business. Hear the steps to develop your enterprise information management (EIM) taxonomy and the best practices to create the growth and outcomes for an ongoing taxonomy in the future.
Taxonomies for Program Management
Joseph Busch, Principal, Taxonomy Strategies; Vivian Bliss, Independent Consultant, USA
Today’s organizational landscape, characterized by virtual offices, shorter tenure, global markets, and rapidly changing technology, makes effective information management a key performance objective. Common information management practices are needed for creating and storing resources so the information can be easily found and shared later. These practices range from simple file and folder naming conventions to more robust metadata schemas and tagging vocabularies. These taxonomies need to be readily understandable to employees without much, if any, training; they must be “natural” and “universal.” Some organizations are framing their information management practices as an integral part of overall goals and objectives planning. In these organizations, taxonomies reflect the overall program goals of the organization. For example, every resource is related to one or more key business activities or tasks; and key differentiators, such as methodologies, are identified. In some organizations, creating, tagging, finding, and presenting information assets is a natural part of everyone’s daily routine, as natural as searching for a website or shopping for products in an online store. Finally, a taxonomy-based information ecosystem provides common and easy ways to measure and report on organizational performance as analytics and visualizations. While taxonomies are typically built to solve an information management problem such as browsing for content on a website, this presentation discusses how taxonomies are being used to 1) reflect the overall program goals of an organization; 2) be the framework for organizing, finding, and presenting assets from disparate systems; and 3) provide a common way to measure and report on organizational performance. Examples are provided from organizations that are using taxonomies to meet today’s program management challenges.
Enhancing Information Infrastructure Enterprise Taxonomy
Chrystie Stachura, Product Specialist, Deloitte Touche Tohmatsu Limited; Ann Jacklin, Sr. Product Analyst, Deloitte Touche Tohmatsu Limited
Speakers walk through five aspects of developing complex enterprise taxonomies. 1) Scalability—a successful taxonomy is recognized for its dynamic nature and is integrated into a standard business workflow that accommodates business organizations, both internal and external areas of focus, geographies—regional and global differences, cultures—philosophical as well as geo-political differences among business organizational areas, and IT systems—providing integration points that are broadly consumable. 2) Inputs—use cases (requirements), lessons learned, and best practices. 3) Critical risks and success factors. 4) Data and information architecture—designing front-end data structure to facilitate downstream consumption with custom connectors, web services, indexing service for web-based tools, and reports with specified criteria. 5) Implement enterprise taxonomy as a management tool—aggregate like terms from diverse groups into consolidated lists that can be leveraged by the majority of consumers; using a confluence of filters and enhanced relationship management, deliver specialized taxonomy views to groups requesting taxonomy integration.
Challenges of Multipurpose Enterprise Taxonomy
Branka Kosovac, Founder and Primary Consultant, dotWit Consulting
An enterprise taxonomy is intended to be used by multiple groups within a company and to bring all the benefits and efficiencies associated with standardization across the enterprise. Meeting needs of diverse groups and synchronizing different conceptualizations and terminologies are known and often discussed challenges with a more-or-less established arsenal of solutions. But is there a point at which the essential purposes of taxonomies used by different groups are so different that challenges of synchronization and reuse acquire completely new dimensions? This talk presents challenges faced by a large company with huge amounts of information and generally decentralized information management that has seen work on shared vocabularies in different organizational units since the late ’90s. Three major efforts which have survived through 15 or so years of coping with organizational restructuring, staff fluctuation, changing strategies, technologies, and budget priorities have recently converged, and options for integration are currently being explored. These three sets of controlled vocabularies have substantial overlaps and an increasing number of shared stakeholders, but they have been developed for essentially different purposes, come from different communities, and follow different global standards and governance approaches, in addition to all being shaped to some extent by their long and winding histories and constraints of specific tools. The session includes analysis of the problem, selected approach for addressing it, and lessons learned that can be translated into best practices to be followed when developing enterprise taxonomies.
Integrating Enterprise Taxonomies With Local Variations
Tom Reamy, Chief Knowledge Architect & Founder, KAPS Group Author, Deep Text
Balancing the need for a standard taxonomy for the entire enterprise and the desire to support local variations is one of the basic problems of enterprise taxonomy development. In addition to taxonomy structure issues, there is a large change management component. Trying to impose the same standard vocabulary on every group, while often attractive to enterprise taxonomists, fails to adequately reflect real local needs. This talk is based on a recent project at a large international financial institution which dealt with a somewhat fragmented environment using an enterprise taxonomy implemented with a text analytics tool, a secondary enterprise structure, a special topic taxonomy, and multiple knowledge management taxonomies managed by several KM networks.
Discussion, Questions, & Answers
Welcome Reception
6:00 p.m. - 7:00 p.m.
Wednesday, November 6, 2013
Continental Breakfast
8:00 a.m. - 8:45 a.m.
Building Collaborative Organizations
8:45 a.m. - 9:45 a.m.
Nicco Mele, Co-Founder, EchoDitto and Faculty, Harvard Kennedy School, & Author, The End of Big
Our ability to connect instantly, constantly, and globally is altering the exercise of power with dramatic speed. Governments, corporations, centers of knowledge, and expertise are eroding before the power of the individual. Based on ideas from his recent book, internet pioneer Mele provides insights and ideas for building collaborative organizations using revolutionary technology and more!
KEYNOTE: A New Search Architecture for the Big Data Era
9:45 a.m. - 10:00 a.m.
Kamran Khan, Managing Director, Accenture
Search engines, distributed processing, and content processing pipelines are not new. However, the enabling technologies of mature search engines, powerful content processing pipelines, and cheap distributed processing are coming together to empower a next generation of information access, analysis, and presentation much closer to the holy grails of knowledge management. Hear from the founder of Search Technologies how modern search engines are currently being combined with powerful independent content processing pipelines and the distributed processing technologies from big data to form new and exciting enterprise search architecture, delivering results that in the past were available only to the biggest companies with the deepest pockets.
TRUE TALES OF TAXONOMY USE
10:15 a.m. - 11:00 a.m.
Corporate Folksonomy for Collaborative Teams
Joanne du Hommet, Knowledge Manager, Ubisoft Entertainment PhD student, Paris 8 University, INDEX-Paragraphe laboratory; Beatrice Cacace, Senior Program Manager, Knowledge Management, Ubisoft Entertainment
At Ubisoft, collaboration and sharing are key factors to successfully creating great games. On this journey to success, expertise recognition and knowledge sharing can become major challenges. To facilitate knowledge accessibility and discovery, Ubisoft teams implemented a common referential of keywords, usable on collaborative platforms and internal applications. What is the vision the folksonomy will help to achieve? How does the tag system connect and interact with Ubisoft’s other internal applications? These are described along with how Ubisoft handled this mix of folksonomy and taxonomy, the interactions with other KM tools, and the governance behind the Ubisoft tag system.
Creating a Unified Front: Taxonomy Automatic Indexing
Ashleigh N. Faith, Librarian Data Scientist, EBSCO Information Services
SAE International uses automatic indexing software. Before its project was under way, “use” of the software was loosely applied at best and the taxonomy was in a poor state for automatic indexing. SAE created its own taxonomy, based in engineering mobility and science terminology, from scratch. Developing a cohesive taxonomy that would also facilitate automatic indexing on content reaching more than 136,000 pages (and growing) across eight different content types was a challenge. The nature of scientific content makes automatic indexing difficult. Faith discusses the process that SAE used to establish a taxonomy to capture content and create the bedrock in which the indexing software could be trained, as well as the trials and iterations of training the software and validating the assignments. SAE improved its taxonomy assignment of content to 89% accuracy, well above the typically accepted 75% accuracy rate of automatic indexing, and established a repeatable process that can be used as the taxonomy grows. Learn from concrete examples, lessons learned, and how to duplicate the process with any automatic indexing software.
User Experience Testing for Content Types & Retention Rules for Records & Econtent
Kyle Stannert, Assistant Director - City Clerk's Office, Public Records, City of Bellevue
As the City of Bellevue embarked on implementing new technologies and compliance requirements, it faced a challenge. With a retention schedule made up of more than 6,000 records series, the ability to support emerging business and technology requirements seemed next to impossible. The city’s records management program took on this challenge and refined the agency retention schedule into a format that would work for users and could be implemented in systems including email archiving, instant messaging, and SharePoint 2010/Gimmal Compliance Suite. This talk shares lessons learned in developing retention rules and a content type framework that is as easy to navigate as a visit to Disneyland. Learn how to consider the value of a functional retention schedule in your organization; connect the value of a simplified schedule in implementing email archiving, unified communication and/or ECM technologies; and apply multiple ideas to simplify your retention schedule at your place of work when you get back to the office.
AUTO-CATEGORIZATION: MACHINES VS. HUMANS IN BIG DATA
11:15 a.m. - 11:45 a.m.
Pattern Analysis & Categorization: Big Metadata Toolkits
Joseph Busch, Principal, Taxonomy Strategies; Vivian Bliss, Independent Consultant, USA
The exploding volume, complexity, and velocity of structured and unstructured data and their interactions present challenges and opportunities to derive valuable insights. Among the challenges of managing massive datasets are gathering, validating, preserving, analyzing, and maintaining linkages from those analyses to the source dataset. Identifying patterns in datasets using information retrieval methods and writing out the results as metadata are well-established information management processes that should be adopted by organizations working with today’s Big Data sets. This presentation provides an overview of pattern analysis and categorization methods, including keyword and regular expression matching, business rules, pattern categorizers, entity extraction, and trained categorizers that are the key building blocks of analytics toolkits for big metadata applications.
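The simplest of the methods listed above, keyword and regular expression matching with the results written out as metadata, might be sketched as follows. This is a minimal illustration under stated assumptions, not the presenters' toolkit: the category names, patterns, and the `to_metadata` record shape are all hypothetical.

```python
import re

# Hypothetical rules: each category is assigned when any of its patterns matches.
RULES = {
    "Finance": [
        re.compile(r"\binvoice\b", re.IGNORECASE),   # keyword match
        re.compile(r"\bUSD\s?\d+", re.IGNORECASE),   # regular expression match
    ],
    "Legal": [
        re.compile(r"\bcontract\b", re.IGNORECASE),
        re.compile(r"\bclause\s+\d+", re.IGNORECASE),
    ],
}


def categorize(text, rules=RULES):
    """Return the sorted names of all categories whose patterns match the text."""
    return sorted(
        category
        for category, patterns in rules.items()
        if any(pattern.search(text) for pattern in patterns)
    )


def to_metadata(doc_id, text):
    """Write the categorization result out as a small metadata record."""
    return {"id": doc_id, "categories": categorize(text)}


print(to_metadata("d1", "Invoice for USD 200 under clause 4"))
```

Keeping the record keyed by a document identifier preserves the link from the analysis back to the source data, which the abstract names as one of the challenges.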
Using Example-Based Auto-Categorization to Tame Big Data
John Felahi, Founder, JGF Strategies LLC
There are at least two schools of thought regarding Big Data. Part of the organization wants to take advantage of it. The other part views it as “Dark Data”: undiscovered, unanalyzed, and unreachable without the proper analysis tools and skills. It’s both. Concept-based auto-categorization, which automatically categorizes documents based on their actual content, not keywords or terms, is the fastest, easiest, and most repeatable way to pinpoint only the most important documents and emails among libraries spanning millions of files and messages. It is an established standard in legal e-discovery and U.S. intelligence, having proved defensible and highly scalable. Learn how companies are beginning to step up to big data analytics using example-based auto-categorization in order to take advantage of all their data, no matter where (or how) it resides. Analytics for Big Data can bring great value to many business applications, such as social media, market analysis, and internal information analysis. It is also being applied toward governance issues and regulatory and legal compliance matters. We’ll also look at empowering the entire information life cycle; how the “infomediaries” gain business advantages; and why traditional search engines have not been able to keep up.
Constructing a Focused Taxonomy
11:45 a.m. - 12:00 p.m.
William Pieser, VP, Chief Marketing Officer, Pingar
This session describes a new method for constructing custom taxonomies from document collections. It includes identifying relevant concepts and entities in text; linking them to knowledge sources like Wikipedia, DBpedia, Freebase, and any supplied taxonomies from related domains; disambiguating conflicting concept mappings; and selecting semantic relations that best group them hierarchically.
ATTENDEE LUNCH & KEYNOTE: File Sync/Share Is Not Endpoint Backup
12:15 p.m. - 12:30 p.m.
Ann Fellman, Director, Product Marketing, Code42 Software
Due to its “social” essence, file sync/share requires a more open, less rigorous data security approach than enterprise endpoint backup, which demands the highest measures of data security. Attempting to marry the two business challenges through a single application results in an unhappy union that jeopardizes the safety and privacy of your organizational data.
INDEXING IMAGES
1:15 p.m. - 2:00 p.m.
Practical Aspects of Natural Language Processing
Daniel Joseph Vasicek, Programmer, Access Innovations, Inc. SIAM
Kathryn Brown, Editor, Access Innovations, Inc.
Recognizing data in medical records includes forming a regular expression and using it to extract a list of tags from the data. You probably need to do several iterations of this process to tune the regular expression(s). The next two steps involve classifying the tags into useful groups (recognizing context) and extracting the data to build your database. Vasicek presents a case study and walks through these steps for a set of medical records containing images, demonstrating the overall process used for extracting tags. The next talk shows how this process could be used to index images from those records.
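The steps above (extract tags with a regular expression, then classify them into groups) can be sketched roughly as follows. The "Label: value" record layout, the `GROUPS` mapping, and the function names are assumptions chosen for illustration; they are not taken from the case study.

```python
import re
from collections import defaultdict

# Assumed record format: one "Label: value" pair per line (hypothetical).
TAG_PATTERN = re.compile(
    r"^(?P<label>[A-Za-z ]+):\s*(?P<value>.+)$", re.MULTILINE
)

# Hypothetical mapping of tag labels to context groups.
GROUPS = {
    "bp": "vitals",
    "pulse": "vitals",
    "image": "imaging",
    "caption": "imaging",
}


def extract_tags(record):
    """Step 1: pull (label, value) tags out of the raw record text."""
    return [
        (m["label"].strip().lower(), m["value"].strip())
        for m in TAG_PATTERN.finditer(record)
    ]


def classify(tags):
    """Step 2: sort tags into groups; unknown labels land in 'other'."""
    grouped = defaultdict(list)
    for label, value in tags:
        grouped[GROUPS.get(label, "other")].append((label, value))
    return dict(grouped)
```

Tags that end up in the "other" group are a useful signal when tuning: they point to labels that the pattern or the grouping does not yet cover, which is where the next iteration would start.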
Image Indexing (UF=Indexing Images)
Bob Kasenchak, Information Architect, Factor
Many corporations, publishers, content providers, and other organizations have large stores of images: photographs, graphs, tables, pictures, diagrams. It is useful to be able to find and retrieve these images on demand without browsing through pages and volumes of documents and files. How can we index an image? Until optical recognition software is far more advanced, indexing an image itself is not practical. We can, however, examine and extract concepts from the text associated with an image using a thesaurus and indexing software to tag the image with metadata. In this way we can implement an easy, accurate image search. What text would be useful for this purpose? And how much text should be captured and indexed without generating too much noise, rendering the search meaningless? Get some answers here!
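One way to picture the approach is the minimal sketch below, which tags an image from its caption plus a limited window of nearby text. The thesaurus entries, the `index_image` function, and the `max_nearby_words` cutoff are all hypothetical; the cutoff stands in for the question of how much text to capture before noise takes over.

```python
import re

# Hypothetical thesaurus: entry term (preferred or synonym) -> preferred term.
THESAURUS = {
    "bar chart": "Bar charts",
    "bar graph": "Bar charts",
    "photograph": "Photographs",
    "photo": "Photographs",
    "revenue": "Revenue",
}


def index_image(caption, nearby_text="", max_nearby_words=30):
    """Return preferred terms found in the caption and a window of nearby text."""
    # Limit how much surrounding text is indexed to keep noise down.
    nearby = " ".join(nearby_text.split()[:max_nearby_words])
    text = f"{caption} {nearby}".lower()
    tags = set()
    for entry, preferred in THESAURUS.items():
        # Match whole words only, allowing a simple plural "s".
        if re.search(rf"\b{re.escape(entry)}s?\b", text):
            tags.add(preferred)
    return sorted(tags)
```

Setting `max_nearby_words=0` indexes the caption alone, which makes it easy to compare how many extra (and possibly noisy) tags the surrounding text contributes.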
Taxonomy Meet Art, Art Meet Taxonomy
Dave Clarke, EVP, Semantic Graph Technology, Synaptica, part of Squirro AG, UK
The vast majority of taxonomy development discussion and effort is applied toward accessing textual data. This talk focuses on how taxonomy may be used to enhance access to visual data. It discusses current research into relevant technologies, illustrates examples using stunning high-definition imagery of masterpiece works of art, and concludes with a summary of taxonomy’s untapped potential to come to the aid of information access for visual data.
USER EXPERIENCE (UX) IN TAXONOMY DESIGN
2:15 p.m. - 3:00 p.m.
Benefits of Integrated UX-IM Design
Michael Rudy, VP Business Imagineering, Factor
Customers today face market-driven design challenges that span multiple content formats and sources, multiple application platforms, and multiple user platforms. These challenges often require the services of two types of designers: user experience (UX) designers and the information modelers (IM) of taxonomies and metadata. An integrated UX-IM design team is one approach to accelerating solution implementation. When approaching the overall design with a collaborative team, the project can have a common methodology and shorter design cycles. This presentation examines how collaboration between these teams ultimately benefits the entire design cycle while resonating with executive sponsors.
Ensuring Consistent & Accurate Tagging: Interface Design & Metadata Application
Ben Licciardi, Manager, PwC
As taxonomists, we spend a lot of energy designing vocabularies that are browsable and user-friendly, but we often overlook the system interfaces that taggers use to access and apply taxonomy terms to objects. The irony is, even the best vocabulary is of little value if a poorly designed tagging interface impedes a user from tagging consistently and accurately. This presentation looks at how tagging interfaces impact the metadata application process, exploring examples from content management systems, product information management systems, and crowdsourcing sites from around the web. It discusses how interface design can influence data quality, findability, and the long-term viability of a taxonomy project and concludes with some high-level principles to consider if you’re evaluating, building, or revamping a tagging interface.
Applying User Research to Designing Info Models
Bram Wessel, Principal, Factor Board Member, Information Architecture Institute
User experience design isn’t the only practice that can benefit from user research. Exploring the way users natively organize information in their minds gives taxonomists a means to precisely identify and define information structures, as well as a powerful guiding mechanism for difficult information model design decisions. This talk demonstrates how to construct an effective research plan that blends quantitative and qualitative methods to explore and analyze user mental models and how to transform research insights into viable, flexible, and sustainable information models.
Coffee Break
3:00 p.m. - 3:15 p.m.
SEMANTIC SEARCH
3:15 p.m. - 4:00 p.m.
Utilizing Ontologies for Taxonomy & Content Organization
Tony Rhem, CEO/Principal Consultant, A. J. Rhem & Associates Author, Knowledge Management in Practice; Essential Topics in Artificial Intelligence
This presentation focuses on the design and implementation of ontologies and how they are leveraged to implement taxonomies and provide better content organization. It demonstrates through case studies how this implementation improves unstructured content search and retrieval. Case studies illustrate how this approach to content organization has improved “findability” and reuse of content and knowledge in several organizations. Along with examples of ontology and taxonomy adoption, the talk shares an underlying view of the card sort results and of the keyword and controlled vocabulary building used to meet the business expectations of the KM solution.
Enhancing Searches With Taxonomies & Semantic Tech
Bob DuCharme, Technical Writer, Commonwealth Computer Research, Inc.
We’ve all seen how the major search engines sometimes second-guess, often correctly, what we meant to search for; taxonomies using standards-based semantic technology can help your own applications do this and more. While semantic technologies rarely try to store the complete meanings of words, data about the relationships of words and phrases to other words and phrases (for example, “broader than” or “alternative label”) can often store enough semantics to automate search enhancement. When this data is stored using the W3C SKOS standard, it can more easily be aggregated, queried, and used by a variety of tools. Because SKOS is based on RDF, the growing amount of publicly available RDF data about terms and term relationships can be an especially big help to drive improved searches with your own systems. This talk looks at how taxonomies based on semantic technology can help with focusing and augmenting searches, correcting terms, disambiguation, and using other vocabulary metadata sources to improve the data driving your search enhancements.
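A minimal sketch of how such relationship data might drive search enhancement appears below. It uses plain Python dictionaries whose keys mirror the SKOS properties `prefLabel`, `altLabel`, and `broader`; the concepts themselves and the `expand_query` function are invented for illustration, and a real application would more likely load and query actual RDF data.

```python
# Toy concept scheme whose keys mirror SKOS properties (hypothetical data).
CONCEPTS = {
    "c1": {"prefLabel": "mammals", "altLabel": [], "broader": None},
    "c2": {"prefLabel": "dogs", "altLabel": ["canines", "hounds"], "broader": "c1"},
    "c3": {"prefLabel": "beagles", "altLabel": [], "broader": "c2"},
}


def find_concept(term):
    """Return the id of the concept whose preferred or alternative label matches."""
    term = term.lower()
    for concept_id, concept in CONCEPTS.items():
        if term == concept["prefLabel"] or term in concept["altLabel"]:
            return concept_id
    return None


def expand_query(term):
    """Expand a search term with its labels and directly narrower concepts."""
    concept_id = find_concept(term)
    if concept_id is None:
        return [term]  # Unknown term: search for it unchanged.
    concept = CONCEPTS[concept_id]
    terms = [concept["prefLabel"], *concept["altLabel"]]
    # Narrower concepts are those that name this concept as their broader one.
    terms += [
        c["prefLabel"] for c in CONCEPTS.values() if c["broader"] == concept_id
    ]
    return terms
```

Here a search for "canines" would be augmented with the preferred label "dogs", the other alternative label, and the narrower concept "beagles", while a term missing from the vocabulary passes through untouched.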
Developing a Semantic Search Application: A Pharma Case Study
Tom Reamy, Chief Knowledge Architect & Founder, KAPS Group Author, Deep Text
Adding semantics to search can be a daunting task since it involves dealing with language (messy and not really an IT core competency) and relatively new technologies such as text analytics, and, to work well, requires an interdisciplinary team. Given the complexity and uncertainty of developing semantic search, it usually makes sense to start with a small, focused pilot or POC. This is what one pharmaceutical company decided to do. This talk describes an initial pilot carried out by a varied and diverse team using three different search and text analytics products. It discusses the pluses and problems this diversity caused, what worked best, and what didn’t work so well and covers the key issues and approaches that were needed for success.
VISUALIZING ONTOLOGIES AND METADATA
4:15 p.m. - 5:00 p.m.
Visualization Illuminates Data & Convinces Stakeholders
Suzanne Carroll, Product Director, Data Intelligence, XO Group (The Knot)
Most people’s eyes glaze over at the mention of organizing information, metadata, or taxonomies. Show off a graphic illustrating how the organization fits together, and watch how people are automatically drawn in, finding how they fit in the big picture and asking questions. Put it up by your desk, and they’ll stop by and ask you what it is. Bingo! Instant connection and conversation on your terms. Visualizations are eye-catching conversation starters. This session shares several types of visualizations and discusses which ones work for showing off taxonomies and metadata. It showcases available online tools to build your own visualizations.
Ontology Diagrams for Successful Knowledge Capture & Transfer
Brandon Olson, Associate Professor, School of Business and Technology, College of St. Scholastica
Knowledge rarely resides in a single location. It may be embedded within hundreds or thousands of documents throughout the organization and within the employees’ implicit understandings and experiences. While many knowledge management approaches seek to quickly locate individual knowledge artifacts or facilitate collaboration across knowledge holders, these methods are limited by the number of documents or collaborators involved. Greater value may be realized through a more holistic view of the domain area. This presentation describes the use of ontology diagrams as a means to capture knowledge from across many sources and to depict the knowledge in a manner that is easily managed and communicated.
Ontology-Driven Search and Information Access: How Abstractions Become Actionable
Seth Earley, CEO, Earley Information Science Author, The AI Powered Enterprise
Taxonomies have come a very long way through the years, from rudimentary application in navigation and search thesauri to systems driven by complex domain models with intricate linked knowledge and data. Today, ontologies form the underpinning of unified information access and search-based application development. In this final session of the conference, we see how all of these pieces can come together with examples from the areas of finance, healthcare, and media/entertainment. The applications may look very different, but the end result is the same: getting users to the content and information they need in the context of their processes, helping to reveal knowledge structures and relationships, and allowing on-the-fly synthesis of structured data and unstructured content. We end the conference by providing the ammunition needed to make this case at the highest levels in your company.
Enterprise Solutions Showcase GRAND OPENING RECEPTION
5:00 p.m. - 7:00 p.m.