Slide 1 - BCS IRSG

Search-Based Applications: the Maturation of Search
Gregory Grefenstette, Exalead
Exalead S.A. 2009

www.exalead.com/search
8 billion URLs, 2 billion images, 200 million videos
Wikipedia, cloud tags
Also: Labs.exalead.com

Two ways to find information: DATABASES VS SEARCH ENGINES

Recent Past

DATABASES: structured data, transactions, precise, all tuples, SQL, slow
SEARCH ENGINES: text, similarity ranking, intuitive, fast, partial

More Recent

DATABASES: structured data, transactions, precise, all tuples, SQL, slow
SEARCH ENGINES: text, similarity ranking, intuitive, fast, partial
Newer capabilities shown between the two: Top-K, column store, MapReduce, data cube; connectors, facets, MapReduce, tables

NOW

DATABASES, SEARCH BASED APPLICATIONS, SEARCH ENGINES

Search based Application

An application which uses a search engine component, but whose final purpose is not searching for a document, but rather a domain-oriented process result.

Examples:
  Custom response management
  Logistic tracking and tracing
  Contextual advertising
  Database reporting after offloading

Current situation

Databases are the backbone of search in information systems

[Diagram components: Data Warehouse, BI reports, Database, Business processes, DataMart, Front-office users]

Search-enabled application

Optimized solution for information access
[Diagram components: Data Warehouse, BI reports, Database, Business processes, Search Engine, Front-office users]

Drawbacks of Using Database Search As a Component

[Diagram: Standard Architecture vs. Search Based Architecture]

How does a Search Based Application work?

Database converted to Business Items, stored as structured documents

Business items are concrete objects directly understandable by end-users: Product, Customer, Purchase order, Technical support call

Each business item becomes a document
The straightforward, simple format of the document index allows performance and ease of use
The search engine can offer a rich and powerful query language that allows queries as complex and advanced as SQL, despite the flat data model
The search engine must support typed fields, intra-field scope search, and categories/facets

Database into structured documents

Product_ID  Product_Name       Manufacturer_Names
123         control switch     ACME Inc ; The Control Switch Company ; Karl GmbH
124         red warning light

Scope Search

Product_ID  Product_Name
123         control switch
124         red warning light

Product_ID  Manufacturer_ID
123         345
123         8574
123         4483

Manufacturer_ID  Manufacturer_NAME
345              ACME Inc.
8574             The Control Switch Company
4483             Karl GmbH

Product_ID  Product_Name       Manufacturer_Names
123         control switch     ACME Inc ; The Control Switch Company ; Karl GmbH
124         red warning light

The manufacturer names are stored in a single field, but they can still be searched as individual records with scope search ("ACME GmbH" does not match the document here).
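The conversion from relational tables to one flat document per product, and the effect of scope search on the multi-valued manufacturer field, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`build_documents`, `flat_match`, `scope_match`) and the simple whitespace tokenizer are hypothetical and do not describe Exalead's actual engine or query language.

```python
from collections import defaultdict

# Rows copied from the three relational tables on the slide.
products = [(123, "control switch"), (124, "red warning light")]
product_manufacturer = [(123, 345), (123, 8574), (123, 4483)]
manufacturers = {
    345: "ACME Inc.",
    8574: "The Control Switch Company",
    4483: "Karl GmbH",
}


def build_documents():
    """Denormalize the tables into one flat document per product."""
    names_by_product = defaultdict(list)
    for product_id, manufacturer_id in product_manufacturer:
        names_by_product[product_id].append(manufacturers[manufacturer_id])
    return [
        {
            "Product_ID": product_id,
            "Product_Name": name,
            # Multi-valued field: each manufacturer name stays a separate value.
            "Manufacturer_Names": names_by_product[product_id],
        }
        for product_id, name in products
    ]


def tokens(text):
    """Lowercase word tokens with surrounding punctuation stripped."""
    return [word.strip(".,;").lower() for word in text.split()]


def flat_match(doc, field, query):
    """Naive match: every query term occurs somewhere in the whole field."""
    field_terms = {t for value in doc[field] for t in tokens(value)}
    return all(term in field_terms for term in tokens(query))


def scope_match(doc, field, query):
    """Scope match: all query terms must occur inside the same single value."""
    query_terms = tokens(query)
    return any(
        all(term in tokens(value) for term in query_terms)
        for value in doc[field]
    )


docs = build_documents()
switch = docs[0]
print(flat_match(switch, "Manufacturer_Names", "ACME GmbH"))   # True
print(scope_match(switch, "Manufacturer_Names", "ACME GmbH"))  # False
print(scope_match(switch, "Manufacturer_Names", "Karl GmbH"))  # True
```

Without scoping, "ACME GmbH" would match product 123 because both words occur somewhere in the concatenated field; restricting the match to a single value recovers the record boundaries of the original manufacturer table.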

Hierarchical categories

Product_ID  Color  Brand  Fragile
123         Red    ACME   Y

Nb of wheels  Wheel type
3             2

Multiple kinds of attributes can be mixed in the same category field. The hierarchical tree structure of the categories preserves the differences between attribute types.

Product_ID  Country
123         France
123         UK
123         Germany

Multi-valued attributes can also be represented by categories. A single category field can be used to store hundreds or thousands of attribute columns.

Product_ID  Attributes
123         Color/Red ; Brand/ACME ; Fragile/Y ; Nb_wheels/3 ; Wheel_type/2 ; Country/France ; Country/UK ; Country/Germany
124
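To make the single category field concrete, here is a minimal Python sketch that stores attributes as `Type/Value` paths and groups them back by attribute type, counting documents per value. The `documents` list, the `facet_counts` helper, and the attribute values given to product 124 are assumptions for illustration only; a real engine would compute such counts inside its index rather than in application code.

```python
from collections import Counter, defaultdict

# Each document carries one multi-valued "Attributes" field whose values
# are category paths of the form "AttributeType/Value".
documents = [
    {
        "Product_ID": 123,
        "Attributes": [
            "Color/Red", "Brand/ACME", "Fragile/Y", "Nb_wheels/3",
            "Wheel_type/2", "Country/France", "Country/UK", "Country/Germany",
        ],
    },
    {
        "Product_ID": 124,
        # Hypothetical values: the slide leaves this row empty.
        "Attributes": ["Color/Red", "Brand/ACME", "Country/France"],
    },
]


def facet_counts(results):
    """Count documents per category value, grouped by attribute type."""
    facets = defaultdict(Counter)
    for doc in results:
        for path in doc["Attributes"]:
            # The first path segment keeps the attribute types apart.
            attribute_type, value = path.split("/", 1)
            facets[attribute_type][value] += 1
    return facets


facets = facet_counts(documents)
print(dict(facets["Country"]))  # {'France': 2, 'UK': 1, 'Germany': 1}
print(dict(facets["Color"]))    # {'Red': 2}
```

One pass over the result list fills every counter at once, which is roughly what a separate `GROUP BY ... COUNT(*)` per attribute column would produce in SQL.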

Multi-dimensional facets

Search results facets provide aggregate values computed on the fly with the search results list
One single search query can return the equivalent of dozens of GROUP BY SQL clauses
Numerical values associated with facets (count, score, ...) can be used to perform complex computations on the results list
Search performance is not affected by the size of the category tree
Thousands of attribute types can be represented by categories
Facets are dynamically selected by the search results: the displayed attributes are always consistent with the search query (e.g. color and engine type when searching for a car, screen size and CPU speed when searching for a laptop)

CASE STUDY: LOGISTICS TRACK & TRACE

Gefco overview

A subsidiary of French car maker PSA (Peugeot, Citroën)
Now does most of its business outside of PSA

Logistics operator
  Carries cars from factories to dealers (road, rail)
  Carries freight (parcels; originally spare parts)
  Supply chain and logistic platform design
3.5B, 10 000 employees, 100 countries

The original pain

Classical multi-criteria search over Oracle, 2 million rows
Poor performance despite 2 years of optimization
Response times measured in minutes
Users were asked to keep queries simple and to run them preferably at given hours

From forms to a search box

New application

With operational reporting

Partner: French Post Office

Tracing of incidents
Real-time system
Used as an internal audit tool for the mail
Suggestion of addresses for customers
Search in file numbers, addresses, names, etc.

Case Study: Rightmove

Rightmove: Reduce Costs and Improve Performance through Database

Advantages of Search Based Applications

Conclusions

Search engines mature
  Structured data, high volume, high speed
Search based Applications offer
  Usage: search interface familiar to user
  Performance: search engine geared to search, eases load on database platform
  Agility: original database design untouched, reconfiguring output lightweight
