Linearly Compressed Pages - Carnegie Mellon University

Linearly Compressed Pages: A Main Memory Compression Framework with Low Complexity and Low Latency

Gennady Pekhimenko
Advisers: Todd C. Mowry & Onur Mutlu

Executive Summary

Main memory is a limited shared resource
Observation: Significant data redundancy
Idea: Compress data in main memory
Problem: How to avoid latency increase?
Solution: Linearly Compressed Pages (LCP): fixed-size cache line granularity compression
1. Increases capacity (69% on average)
2. Decreases bandwidth consumption (46%)
3. Improves overall performance (9.5%)

Challenges in Main Memory Compression

1. Address Computation
2. Mapping and Fragmentation
3. Physically Tagged Caches

Address Computation

[Figure: in an uncompressed page, 64B cache lines L0, L1, L2, ..., LN-1 sit at address offsets 0, 64, 128, ..., (N-1)*64. In a compressed page, L0 is at offset 0, but the offsets of L1, L2, ..., LN-1 are unknown (?).]

Mapping and Fragmentation

[Figure: a 4kB virtual page is mapped (virtual address to physical address) to a physical page of unknown size (? kB), which leads to fragmentation.]

Physically Tagged Caches

[Figure: the core issues a virtual address; address translation in the TLB produces the physical address that is matched against the tags of L2 cache lines, so translation sits on the critical path.]

Shortcomings of Prior Work

Compression mechanisms are compared on access latency, decompression latency, complexity, and compression ratio:

IBM MXT [IBM J.R.D. 01]
Robust Main Memory Compression [ISCA05]
LCP: Our Proposal

[Table: the per-mechanism ratings are not preserved in the text.]

Linearly Compressed Pages (LCP): Key Idea

[Figure: an uncompressed page (4kB: 64 x 64B cache lines) is compressed 4:1 into 1kB of compressed data, followed by metadata (M, 64B) that records whether each line is compressible, and exception storage (E).]

LCP Overview

Page Table entry extension
  compression type and size
  extended physical base address
Operating System management support
  4 memory pools (512B, 1kB, 2kB, 4kB)
Changes to cache tagging logic
  physical page base address + cache line index (within a page)
Handling page overflows
Compression algorithms: BDI [PACT12], FPC [ISCA04]
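The address computation behind the key idea can be sketched as follows. Because every compressed cache line in a page has the same size, locating line i is a single multiply-add on the page base, just as in an uncompressed page. This is a minimal illustration and not the authors' implementation: the function names `lcp_line_address` and `pick_pool` are hypothetical, and the sizing rule (compressed data plus 64B of metadata plus one full 64B line per exception, with a fallback to a plain 4kB page) is an assumption made for the example.

```python
LINE_SIZE = 64                    # uncompressed cache line size in bytes (from the slides)
LINES_PER_PAGE = 64               # a 4kB page holds 64 lines of 64B
METADATA_SIZE = 64                # metadata region size in bytes (from the Key Idea slide)
POOLS = (512, 1024, 2048, 4096)   # the four memory pool sizes in bytes (from the slides)


def lcp_line_address(page_base: int, index: int, compressed_size: int) -> int:
    """Address of compressed line `index`: one multiply-add, no per-line lookup."""
    if not 0 <= index < LINES_PER_PAGE:
        raise IndexError("cache line index out of range")
    return page_base + index * compressed_size


def pick_pool(compressed_size: int, exception_lines: int = 0) -> int:
    """Smallest pool that holds compressed data + metadata + exceptions.

    Assumption: each exception is stored as a full 64B line, and a page that
    does not fit in any pool with this overhead is kept as a plain 4kB page.
    """
    needed = (LINES_PER_PAGE * compressed_size
              + METADATA_SIZE
              + exception_lines * LINE_SIZE)
    for pool in POOLS:
        if needed <= pool:
            return pool
    return POOLS[-1]  # assumed fallback: store the page uncompressed


if __name__ == "__main__":
    base = 0x10000
    # 4:1 compression: 64B lines become 16B lines.
    print(hex(lcp_line_address(base, 3, 16)))  # 0x10030
    print(pick_pool(16))                       # 2048
```

Under these assumptions, a page compressed 4:1 needs 1024B of data plus 64B of metadata, so it would no longer fit the 1kB pool and would be placed in the 2kB pool.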

LCP Optimizations

Metadata cache
  Avoids additional requests to metadata
Memory bandwidth reduction
  [Figure: four 64B cache lines, once compressed, are sent in 1 transfer instead of 4]
Zero pages and zero cache lines
  Handled separately in TLB (1-bit) and in metadata (1-bit per cache line)
Integration with cache compression
  BDI and FPC
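Two of the optimizations above lend themselves to a short sketch: counting memory transfers when several compressed lines fit in one 64B transfer, and answering reads of zero cache lines from a per-line metadata bit without touching memory. The helper names `transfers_needed` and `read_line`, and the `fetch` callback, are hypothetical; the sketch also assumes the requested lines are consecutive and aligned to a transfer boundary.

```python
TRANSFER_SIZE = 64  # bytes moved per memory transfer (one uncompressed line)
LINE_SIZE = 64      # uncompressed cache line size in bytes


def transfers_needed(num_lines: int, compressed_size: int) -> int:
    """Transfers required to move `num_lines` consecutive compressed lines."""
    per_transfer = max(1, TRANSFER_SIZE // compressed_size)
    return -(-num_lines // per_transfer)  # ceiling division


def read_line(index, zero_bits, fetch):
    """Return a line's contents, skipping memory when its zero bit is set.

    `zero_bits` models the 1-bit-per-line metadata; `fetch` stands in for
    the actual memory access and is only called for non-zero lines.
    """
    if zero_bits[index]:
        return bytes(LINE_SIZE)  # an all-zero line, no memory access
    return fetch(index)


if __name__ == "__main__":
    print(transfers_needed(4, 16))  # 1
    print(transfers_needed(4, 64))  # 4
```

With 16B compressed lines, four lines share one transfer, which matches the "1 transfer instead of 4" case on the slide.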

Methodology

Simulator
  x86 event-driven simulators
    Simics-based [Magnusson+, Computer02] for CPU
    Multi2Sim [Ubal+, PACT12] for GPU
Workloads
  SPEC2006 benchmarks, TPC, Apache web server, GPGPU applications
System Parameters
  L1/L2/L3 cache latencies from CACTI [Thoziyoor+, ISCA08]
  512kB - 16MB L2, simple memory model

Compression Ratio Comparison

SPEC2006, databases, web workloads, 2MB L2 cache

[Figure: bar chart of compression ratio (GeoMean), y-axis from 1 to 3.5. Series: Zero Page, LZ, FPC, LCP (BDI), LCP (BDI+FPC-fixed), MXT. Bar labels in extraction order: 2.60, 2.31, 1.59, 1.62, 1.69, 1.30.]

LCP-based frameworks achieve average compression ratios competitive with prior work

Bandwidth Consumption Decrease

SPEC2006, databases, web workloads, 2MB L2 cache

[Figure: bar chart of normalized BPKI (GeoMean), y-axis from 0 to 1.2, lower is better. Series: FPC-cache, BDI-cache, FPC-memory, (None, LCP-BDI), (FPC, FPC), (BDI, LCP-BDI), (BDI, LCP-BDI+FPC-fixed). Bar labels in extraction order: 0.92, 0.89, 0.57, 0.63, 0.54, 0.55, 0.54.]

LCP frameworks significantly reduce bandwidth (46%)

Performance Improvement

Cores   LCP-BDI   (BDI, LCP-BDI)   (BDI, LCP-BDI+FPC-fixed)
1       6.1%      9.5%             9.3%
2       13.9%     23.7%            23.6%
4       10.7%     22.6%            22.5%

LCP frameworks significantly improve performance

Conclusion

A new main memory compression framework called LCP (Linearly Compressed Pages)
Key idea: fixed size for compressed cache lines within a page and fixed compression algorithm per page
LCP evaluation:
  Increases capacity (69% on average)
  Decreases bandwidth consumption (46%)
  Improves overall performance (9.5%)
  Decreases energy of the off-chip bus (37%)
