Hybrid Memory

Hybrid Memory Cube (HMC) is a high-performance computer random-access memory (RAM) interface for through-silicon via (TSV)-based stacked DRAM memory. HMC competes with the incompatible rival interface High Bandwidth Memory (HBM).

Hybrid Memory Cube was co-developed by Samsung Electronics and Micron Technology in 2011,[1] and announced by Micron in September 2011.[2] It promised a 15-fold speed improvement over DDR3.[3] The Hybrid Memory Cube Consortium (HMCC) is backed by several major technology companies including Samsung, Micron Technology, Open-Silicon, ARM, HP (since withdrawn), Microsoft (since withdrawn), Altera (acquired by Intel in late 2015), and Xilinx.[4][5] Micron, while continuing to support the HMCC, discontinued the HMC product in 2018 when it failed to achieve market adoption.[6]

HMC combines through-silicon vias (TSV) and microbumps to connect multiple (currently 4 to 8) dies of memory cell arrays on top of each other.[7] The memory controller is integrated as a separate die.[2]

HMC uses standard DRAM cells, but it has more data banks than classic DRAM memory of the same size. The HMC interface is incompatible with current DDRn (DDR2 or DDR3) memory and with competing High Bandwidth Memory implementations.[8]

HMC technology won the Best New Technology award from The Linley Group (publisher of Microprocessor Report magazine) in 2011.[9][10]

The first public specification, HMC 1.0, was published in April 2013.[11] According to it, the HMC uses 16-lane or 8-lane (half size) full-duplex differential serial links, with each lane having 10, 12.5 or 15 Gbit/s SerDes.[12] Each HMC package is named a cube, and they can be chained in a network of up to 8 cubes with cube-to-cube links and some cubes using their links as pass-through links.[13] A typical cube package with 4 links has 896 BGA pins and a size of 31×31×3.8 millimeters.[14]

A single 16-lane link with 10 Gbit/s signalling provides a raw bandwidth of 40 GB/s across all 16 lanes (20 GB/s transmit and 20 GB/s receive). Cubes with 4 and 8 links are planned, though the HMC 1.0 specification limits link speed to 10 Gbit/s in the 8-link case. A 4-link cube can therefore reach 240 GB/s of memory bandwidth (120 GB/s each direction using 15 Gbit/s SerDes), while an 8-link cube can reach 320 GB/s (160 GB/s each direction using 10 Gbit/s SerDes).[15] Effective memory bandwidth utilization varies from 33% to 50% for the smallest 32-byte packets, and from 45% to 85% for 128-byte packets.[7]
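The raw-bandwidth figures above follow from simple arithmetic (lanes × line rate × two directions ÷ 8 bits per byte); a short sketch reproducing them:

```python
# Raw-bandwidth arithmetic for HMC 1.0 links, as described above.
# Each lane is full-duplex, so bytes/s = lanes * rate * 2 directions / 8 bits.

def link_bandwidth_gbs(lanes: int, gbit_per_lane: float) -> float:
    """Raw full-duplex bandwidth of one HMC link in GB/s."""
    return lanes * gbit_per_lane * 2 / 8

# Single 16-lane link at 10 Gbit/s: 20 GB/s each way, 40 GB/s total.
print(link_bandwidth_gbs(16, 10))       # 40.0

# 4-link cube at 15 Gbit/s SerDes: 240 GB/s aggregate.
print(4 * link_bandwidth_gbs(16, 15))   # 240.0

# 8-link cube capped at 10 Gbit/s: 320 GB/s aggregate.
print(8 * link_bandwidth_gbs(16, 10))   # 320.0
```

Note that these are raw link rates; effective utilization is lower because packet headers and tails share the same lanes as data.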

As reported at the HotChips 23 conference in 2011, the first generation of HMC demonstration cubes with four 50 nm DRAM memory dies and one 90 nm logic die with total capacity of 512 MB and size 27×27 mm had power consumption of 11 W and was powered with 1.2 V.[7]

Engineering samples of second generation HMC memory chips were shipped in September 2013 by Micron.[3] Samples of 2 GB HMC (stack of 4 memory dies, each of 4 Gbit) are packed in a 31×31 mm package and have 4 HMC links. Other samples from 2013 have only two HMC links and a smaller package: 16×19.5 mm.[16]

The second version of the HMC specification was published on 18 November 2014 by the HMCC.[17] HMC2 offers a variety of SerDes rates ranging from 12.5 Gbit/s to 30 Gbit/s, yielding an aggregate link bandwidth of 480 GB/s (240 GB/s each direction), though it promises only a total DRAM bandwidth of 320 GB/s.[18] A package may have either 2 or 4 links (down from the 4 or 8 in HMC1), and a quarter-width option using 4 lanes is added.
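The HMC2 aggregate figure can be checked the same way; a minimal sketch assuming 4 full-width (16-lane) links running at the top 30 Gbit/s SerDes rate:

```python
# HMC2 aggregate-bandwidth check (assumes 4 full-width links at 30 Gbit/s).
links, lanes, gbit_per_lane = 4, 16, 30

per_direction_gbs = links * lanes * gbit_per_lane / 8   # 240.0 GB/s each way
aggregate_gbs = per_direction_gbs * 2                   # 480.0 GB/s total

print(per_direction_gbs, aggregate_gbs)
```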

The first processor to use HMCs was the Fujitsu SPARC64 XIfx,[19] which is used in the Fujitsu PRIMEHPC FX100 supercomputer introduced in 2015.

JEDEC's Wide I/O and Wide I/O 2 are seen as the mobile computing counterparts to the desktop/server-oriented HMC in that both involve 3D die stacks.[20]

In August 2018, Micron announced a move away from HMC to pursue competing high-performance memory technologies such as GDDR6 and HBM.[21]

A hybrid memory model, or architecture, is one in which the index is held purely in memory and never persisted, while the data itself is stored only on persistent storage (SSD) and read directly from the device. Because no disk I/O is required to access the index, lookup performance is predictable.

This hybrid memory model is practical because the read latency of SSD I/O is roughly the same whether the access pattern is random or sequential. For such a model, optimizations are needed to avoid the cost of a full device scan to rebuild the indexes at startup.
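A minimal sketch of such a hybrid store, with an in-memory dictionary index over records appended to a file on persistent storage; all names here (`HybridStore`, `append`, the record layout) are illustrative assumptions, not taken from any particular system:

```python
# Sketch of a hybrid memory model: index in RAM only, data on disk.
# Record layout (assumed): 4-byte key length, 4-byte value length, key, value.
import os
import tempfile

class HybridStore:
    def __init__(self, path):
        self.path = path
        self.index = {}            # in-memory only, never persisted
        # Rebuild the index with one scan of the device at startup --
        # the cost the optimizations mentioned above try to avoid.
        with open(path, "rb") as f:
            offset = 0
            while header := f.read(8):
                klen = int.from_bytes(header[:4], "big")
                vlen = int.from_bytes(header[4:], "big")
                key = f.read(klen).decode()
                self.index[key] = (offset + 8 + klen, vlen)
                f.seek(vlen, os.SEEK_CUR)
                offset += 8 + klen + vlen

    def get(self, key):
        off, length = self.index[key]     # index lookup: no disk I/O
        with open(self.path, "rb") as f:  # exactly one read for the value
            f.seek(off)
            return f.read(length)

def append(path, key, value):
    """Append one record to the data file."""
    with open(path, "ab") as f:
        f.write(len(key).to_bytes(4, "big") + len(value).to_bytes(4, "big"))
        f.write(key.encode() + value)

if __name__ == "__main__":
    path = tempfile.mkstemp()[1]
    append(path, "user:1", b"alice")
    store = HybridStore(path)
    print(store.get("user:1"))  # b'alice'
```

The point of the sketch is the asymmetry: `get` touches the disk exactly once per value, while every index operation stays in RAM, which is what makes the read latency predictable.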


