
Chimera: Data Sharing Flexibility, Shared Nothing Simplicity
Umar Farooq Minhas, University of Waterloo
David Lomet, Chandu Thekkath, Microsoft Research
IDEAS 2011

Distributed database architectures
- Shared nothing: a single node can only access local data
  - less complex, easier to implement
  - provides good performance if data is partitionable
  - e.g., Microsoft SQL Server, IBM DB2/UDB
- Data sharing: allows multiple nodes to share access to common data
  - complex, difficult to implement
  - provides increased responsiveness to load imbalances
  - e.g., Oracle RAC, IBM Mainframe DB2
- Goal: design and implement a hybrid database system

Shared nothing vs data sharing
[Figure: three-node clusters in both configurations; each node has CPUs, memory, and a disk. In shared nothing each node accesses only its own disk; in data sharing a data sharing software layer gives every node access to all disks.]
- Hardware configuration can be identical for both systems
- The software managing the system is different

Our approach
- Start with a shared nothing cluster of low-cost desktop machines
  - each node hosts a standalone shared nothing DBMS with locally attached storage
- Extend the shared nothing system with data sharing capability
  - a remote node can access a database hosted at a local node

- Additional code is required for:
  - distributed locking
  - cache consistency
- Techniques presented here are applicable to any shared nothing DBMS

Outline
- Introduction
- Chimera: Overview
- Chimera: Implementation Details
- Experimental Evaluation
- Conclusion

Chimera: Best of both worlds
- Chimera is an extension to a shared nothing DBMS

  - built using off-the-shelf components
- Provides the simplicity of shared nothing with the flexibility of data sharing
- Provides effective scalability and load balancing with less than 2% overhead

Chimera: Main components
1. Shared file system to store data accessible to all nodes of a cluster
   - e.g., Common Internet File System (CIFS) or Network File System (NFS)
2. Generic distributed lock manager that provides ownership control
   - e.g., ZooKeeper, Chubby, Boxwood
3. Extra code in the shared nothing DBMS for data access and sharing among nodes (a sketch of how these pieces fit together follows below)
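To make the composition concrete, here is a minimal Python sketch of how the three components might be wired together on each node. The class name, file paths, and lock-service endpoint are hypothetical placeholders and not part of the actual prototype.

```python
# Hedged sketch only: names and paths below are illustrative placeholders.
from dataclasses import dataclass

@dataclass
class ChimeraNodeConfig:
    node_name: str
    local_db_path: str      # locally attached storage for this node's own database
    shared_db_root: str     # component 1: shared file system (CIFS/NFS) visible to all nodes
    lock_service_addr: str  # component 2: generic distributed lock manager (e.g., a ZooKeeper ensemble)
    # Component 3 is the extra code inside the DBMS itself (stored procedure,
    # lock client, enhanced buffer manager), shown on the following slides.

config = ChimeraNodeConfig(
    node_name="node01",
    local_db_path=r"C:\data\tpch.mdf",         # hypothetical local database file
    shared_db_root=r"\\fileserver\chimera",    # hypothetical CIFS share
    lock_service_addr="lockhost:2181",         # hypothetical lock-service endpoint
)
```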

Advantages of Chimera
- Load balancing at table granularity
  - offloads the execution cost of database functionality
- Scale-out for read-mostly workloads
  - read-mostly workloads are very common and important, e.g., a service hosted at Microsoft, Yahoo, or Google
  - non-partitionable data is stored in a centralized database
  - Chimera provides effective scale-out for such workloads
- Close to shared nothing simplicity
  - key point: allow only a single node to update a database at a time
  - greatly simplifies data sharing, the transaction log, and recovery

Outline
- Introduction
- Chimera: Overview
- Chimera: Implementation Details
- Experimental Evaluation
- Conclusion

Chimera: Overall system architecture
[Figure: DBMS 1 (local) through DBMS N (remote), each extended with a Stored Procedure (SP), Lock Client (LC), and Enhanced Buffer Manager (EBM); queries arrive at every node, the database (DB) is accessed over CIFS, and the lock clients coordinate through the Global Lock Manager (GLM).]
- SP: Stored Procedure
- LC: Lock Client
- EBM: Enhanced Buffer Manager
- GLM: Global Lock Manager
- CIFS: Common Internet File System

Stored Procedure
- Most of the required changes are implemented in a user-defined stored procedure
  - invoked like a standard stored procedure
- An instance of this stored procedure is installed at each node; it
  - accepts user queries
  - does the appropriate locking and buffer management
  - executes the query against a local or remote table
  - returns the results to the caller (a sketch of this flow follows below)
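The flow above can be summarized in a short, hedged Python sketch. It is not the actual SQL Server stored procedure: sqlite3 stands in for the DBMS, DummyLockClient for the lock client, and the server names and paths are made up.

```python
# Hedged sketch of the per-node entry point; all names are illustrative.
import os
import sqlite3

class DummyLockClient:
    """Placeholder lock client; the real one talks to the global lock manager."""
    def acquire(self, resource, mode):
        pass  # block until the shared ("S") or exclusive ("X") lock is granted
    def release(self, resource, mode):
        pass

LOCAL_SERVER = "node01"
LOCAL_ROOT = r"C:\data"                 # locally attached storage
SHARED_ROOT = r"\\fileserver\chimera"   # hypothetical CIFS share

def run_query(lock_client, query, server, db, table, is_update=False):
    """Accept a user query, lock the target table, execute it against the
    local or remote database file, and return the results to the caller."""
    resource = f"{server}.{db}.{table}"          # abstract lock resource name
    mode = "X" if is_update else "S"
    lock_client.acquire(resource, mode)
    try:
        root = LOCAL_ROOT if server == LOCAL_SERVER else SHARED_ROOT
        conn = sqlite3.connect(os.path.join(root, f"{db}.sqlite"))
        try:
            return conn.execute(query).fetchall()
        finally:
            conn.close()
    finally:
        lock_client.release(resource, mode)
```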

Enhanced Buffer Manager
- Implement a cross-node cache invalidation scheme to maintain cache consistency across nodes
- Dirty pages need to be evicted from all readers after an update
  - we do not know in advance which pages will get updated
- Selective cache invalidation (sketched below):
  - the updating node captures a list of dirty pages
  - and sends a message to all readers to evict those pages
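A minimal sketch of selective cache invalidation, assuming an in-memory BufferPool per reader as a stand-in for the enhanced buffer manager; in the prototype the dirty-page list is sent as a message to each reader node rather than applied in-process.

```python
# Hedged sketch: BufferPool and the reader list are simplified stand-ins.
class BufferPool:
    def __init__(self):
        self.pages = {}                    # page_id -> cached page contents

    def evict(self, page_ids):
        """Reader side: drop the listed pages so the next access rereads
        the current version from the shared file system."""
        for pid in page_ids:
            self.pages.pop(pid, None)

def selective_invalidation(dirty_page_ids, reader_buffer_pools):
    """Updater side: after the update, send the captured list of dirty
    pages to every reader node so each evicts just those pages."""
    for pool in reader_buffer_pools:       # one message per reader node in the prototype
        pool.evict(dirty_page_ids)

# Example: node 1 updates pages 17 and 42 while two other nodes are reading.
readers = [BufferPool(), BufferPool()]
for pool in readers:
    pool.pages.update({17: "old", 42: "old", 99: "clean"})
selective_invalidation([17, 42], readers)
assert all(set(pool.pages) == {99} for pool in readers)
```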

Global Lock Manager
- We need a richer lock manager that can handle locks on shared resources across machines
  - implemented using an external global lock manager with corresponding local lock clients
  - a lock client is integrated with each DBMS instance
- Lock types: shared or exclusive
- Lock resources: an abstract name (string)

Read sequence
1. Acquire a shared lock on the abstract resource (table): ServerName.DBName.TableName
2. On lock acquire, proceed with the Select
3. Release the shared lock

Write sequence
1. Acquire exclusive locks on ServerName.DBName and ServerName.DBName.TableName
2. On lock acquire, proceed with the Update
3. Do selective cache invalidation on all reader nodes
4. Release the exclusive locks
(Both sequences are sketched below.)
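The two sequences can be sketched as follows. The ToyLockClient is an in-process stand-in for the external global lock manager (the prototype uses a real lock service such as ZooKeeper-style primitives), and run_select, run_update, and invalidate_readers are hypothetical callables supplied by the caller.

```python
# Hedged sketch: a toy in-process lock client, not the real lock service.
import threading
from collections import defaultdict

class ToyLockClient:
    """Shared ("S") or exclusive ("X") locks on abstract string resources."""
    def __init__(self):
        self._guard = threading.Lock()
        self._shared = defaultdict(int)    # resource -> number of shared holders
        self._exclusive = set()            # resources currently held exclusively

    def try_acquire(self, resource, mode):
        with self._guard:
            if mode == "S" and resource not in self._exclusive:
                self._shared[resource] += 1
                return True
            if mode == "X" and resource not in self._exclusive and self._shared[resource] == 0:
                self._exclusive.add(resource)
                return True
            return False

    def release(self, resource, mode):
        with self._guard:
            if mode == "S":
                self._shared[resource] -= 1
            else:
                self._exclusive.discard(resource)

def read_table(lc, server, db, table, run_select):
    resource = f"{server}.{db}.{table}"            # step 1: shared lock on the table resource
    while not lc.try_acquire(resource, "S"):       # a real lock client would block, not spin
        pass
    try:
        return run_select()                        # step 2: proceed with the Select
    finally:
        lc.release(resource, "S")                  # step 3: release the shared lock

def update_table(lc, server, db, table, run_update, invalidate_readers):
    db_res, tbl_res = f"{server}.{db}", f"{server}.{db}.{table}"
    for res in (db_res, tbl_res):                  # step 1: exclusive locks on DB and table
        while not lc.try_acquire(res, "X"):
            pass
    try:
        dirty_pages = run_update()                 # step 2: proceed with the Update
        invalidate_readers(dirty_pages)            # step 3: selective cache invalidation on readers
    finally:
        for res in (tbl_res, db_res):              # step 4: release the exclusive locks
            lc.release(res, "X")
```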

Outline
- Introduction
- Chimera: Overview
- Chimera: Implementation Details
- Experimental Evaluation
- Conclusion

Experimental setup
- We use a 16 node cluster
  - 2x AMD Opteron CPU @ 2.0GHz, 8GB RAM
  - Windows Server 2008 Enterprise with SP2
  - patched Microsoft SQL Server 2008, buffer pool size = 1.5GB
- Benchmark: TPC-H, a decision support benchmark
  - scale factor 1, total size on disk ~3GB

Overhead of our prototype
- Run the 22 TPC-H queries on a single node, with and without the prototype code

  Query    Runtime without prototype (ms)    Runtime with prototype (ms)    Slowdown factor
  Q1       4809                              5120                           1.06
  Q6        163                               171                           1.05
  Q9       2258                              2303                           1.02
  Q11       462                               431                           0.93
  Q12      1131                              1247                           1.10
  Q13      1349                              1345                           1.00
  Q18      4197                              3895                           0.93
  Q19       183                               185                           1.01
  Q21      2655                              2673                           1.01
  Q22       457                               485                           1.06

  Avg slowdown: 1.006x

Remote execution overhead (cold cache)
- Run the 22 TPC-H queries on the local node and on a remote node
  - measure the query run time and calculate the slowdown factor (computed as sketched below)
  - flush the DB cache between subsequent runs

Remote execution overhead (warm cache)
- Repeat the previous experiment with a warm cache
- Avg slowdown (before, cold cache): 1.46x
- Avg slowdown (now, warm cache): 1.03x
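For reference, the slowdown factor used throughout these experiments is simply the ratio of the measured runtime to the baseline runtime, averaged over the queries. The sketch below computes it for two rows taken from the prototype-overhead table; it is only an illustration of the arithmetic, not a measurement script.

```python
# Illustrative calculation; the two rows come from the overhead table above.
def slowdown(baseline_ms, measured_ms):
    return measured_ms / baseline_ms

baseline = {"Q1": 4809, "Q6": 163}   # runtimes without the prototype (ms)
measured = {"Q1": 5120, "Q6": 171}   # runtimes with the prototype (ms)

factors = {q: slowdown(baseline[q], measured[q]) for q in baseline}
avg_slowdown = sum(factors.values()) / len(factors)
print(factors, avg_slowdown)
```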

Cost of updates
- Baseline: a simple update on a node with no readers
- Test scenarios: perform the update while 1, 2, 4, or 8 other nodes read the database in an infinite loop
[Chart: average runtime in seconds of the update, local vs remote, for the baseline and for 1, 2, 4, and 8 readers]

Cost of reads with updates
- Perform simple updates at the local node with varying frequency: 60s, 30s, 15s, and 5s
- Run one of the TPC-H read queries at a remote node for a fixed duration of 300s and calculate:
  - Response time: average runtime
  - Throughput: queries completed per second (both metrics sketched below)
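A small sketch of how the two metrics might be computed, assuming a hypothetical list of per-query runtimes collected during the 300-second window; the runtimes shown are placeholders, not measured values.

```python
# Illustrative metric calculation; input runtimes are placeholders.
RUN_DURATION_S = 300

def metrics(completed_runtimes_s):
    response_time = sum(completed_runtimes_s) / len(completed_runtimes_s)  # average runtime (secs)
    throughput = len(completed_runtimes_s) / RUN_DURATION_S                # queries completed per second
    return response_time, throughput

print(metrics([0.21, 0.22, 0.20]))
```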

Cost of reads with updates (1)
[Table: for read queries Q6, Q13, Q20, and Q21 at update frequencies of 60s, 30s, 15s, and 5s: average runtime (secs), steady-state average runtime (secs), and queries/sec, compared against a non-conflicting read]

Scalability
- Run concurrent TPC-H streams
  - start with a single local node
  - incrementally add remote nodes, up to a total of 16 nodes

Conclusion
- Data-sharing systems are desirable for load balancing
- We enable data sharing as an extension to a shared nothing DBMS
- We presented the design and implementation of Chimera
  - enables data sharing at table granularity
  - uses global locks for synchronization
  - implements cross-node cache invalidation
  - does not require extensive changes to a shared nothing DBMS

- Chimera provides effective scalability and load balancing with less than 2% overhead
