How does Computer Memory Work? 💻🛠

Subtitles
00:00:00
Have you ever wondered what’s happening inside  your computer when you load a program or video  
00:00:05
game?  Well, millions of operations are happening,  but perhaps the most common is simply just copying  
00:00:12
data from a solid-state drive or SSD into dynamic  random-access memory or DRAM.   An SSD stores all  
00:00:22
the programs and data for long-term storage,  but when your computer wants to use that data,  
00:00:27
it has to first move the appropriate  files into DRAM, which takes time,  
00:00:33
hence the loading bar.  Because your CPU works  only with data after it’s been moved to DRAM,  
00:00:39
DRAM is also called working memory or main memory. The reason why your desktop uses both SSDs and  
00:00:47
DRAM is because Solid-State Drives permanently  store data in massive 3D arrays composed of a  
00:00:54
trillion or so memory cells, yielding terabytes of  storage, whereas DRAM temporarily stores data in  
00:01:02
2D arrays composed of billions of tiny capacitor  memory cells yielding gigabytes of working memory.  
00:01:09
Accessing any section of cells in the massive  SSD array and reading or writing data takes  
00:01:16
about 50 microseconds whereas reading or  writing from any DRAM capacitor memory  
00:01:22
cell takes about 17 nanoseconds, which is 3000  times faster.  For comparison, a supersonic jet  
00:01:30
going at Mach 3 is around 3000 times faster  than a moving tortoise.  So, the speed of  
00:01:36
17 nanosecond DRAM versus 50 microsecond SSD is  like comparing a supersonic jet to a tortoise.  
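The jet-versus-tortoise ratio above comes straight from the two latency figures. As a quick sanity check in Python:

```python
# Rough latency comparison using the approximate figures quoted above.
ssd_latency_s = 50e-6    # ~50 microseconds per SSD access
dram_latency_s = 17e-9   # ~17 nanoseconds per DRAM access

ratio = ssd_latency_s / dram_latency_s
print(f"DRAM is roughly {ratio:.0f}x faster than the SSD")  # ~2941x, i.e. about 3000x
```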
00:01:45
  However, speed is just one factor.  DRAM is  limited to a 2D array and temporarily stores  
00:01:52
one bit per memory cell. For example, this stick  of DRAM with 8 chips holds 16 gigabytes of data,  
00:02:00
whereas a solid-state drive of a smaller  size can hold 2 terabytes of data, more  
00:02:06
than 100 times that of DRAM.  Additionally,  DRAM requires power to continuously store  
00:02:13
and refresh the data held in its capacitors.   Therefore, computers use both SSDs and DRAM and,  
00:02:21
by spending a few seconds of loading time  to copy data from the SSD to the DRAM,  
00:02:27
and then prefetching, which is the process of  moving data before it’s needed, your computer can  
00:02:34
store terabytes of data on the SSD and then access  the data from programs that were preemptively  
00:02:40
copied into the DRAM in a few nanoseconds.  For example, many video games have a loading  
00:02:46
time to start up the game itself, and then a  separate loading time to load a save file.   
00:02:52
During the process of loading a save file, all  the 3D models, textures, and the environment of  
00:02:59
your game state are moved from the SSD into DRAM  so any of it can be accessed in a few nanoseconds,  
00:03:06
which is why video games have DRAM capacity  requirements.  Just imagine, without DRAM,  
00:03:12
playing a game would be 3,000 times slower.   We covered solid-state drives in other videos,  
00:03:19
so in this video, we’re going to take a deep  dive into this 16-gigabyte stick of DRAM.  First,  
00:03:26
we’ll see exactly how the CPU communicates  and moves data from an SSD to DRAM.  Then  
00:03:32
we’ll open up a DRAM microchip and see how  billions of memory cells are organized into  
00:03:38
banks and how data is written to and read from  groups of memory cells.  In the process, we’ll  
00:03:45
dive into the nanoscopic structures inside  individual memory cells and see how each  
00:03:50
capacitor physically stores 1 bit of data.   Finally, we’ll explore some breakthroughs and  
00:03:56
optimizations such as the burst buffer and folded  DRAM layouts that enable DRAM to move data around  
00:04:04
at incredible speeds. A few quick notes.   First, you can find similar DRAM chips inside  
00:04:11
GPUs, Smartphones, and many other devices, but  with different optimizations.  As examples,  
00:04:18
GPU DRAM or VRAM, located all around the  GPU chip, has a larger bandwidth and can  
00:04:26
read and write simultaneously, but operates at  a lower frequency, and DRAM in your smartphone  
00:04:32
is stacked on top of the CPU and is optimized for  smaller packaging and lower power consumption.   
00:04:39
Second, this video is sponsored by  Crucial.  Although they gave me this  
00:04:44
stick of DRAM to model and use in the  video, the content was independently  
00:04:49
researched and not influenced by them.   Third, there are faster memory structures  
00:04:54
in your CPU called cache memory and even faster  registers.  All these types of memory create a  
00:05:01
memory hierarchy, with the main trade-off  being speed versus capacity while keeping  
00:05:06
prices affordable to consumers and optimizing  the size of each microchip for manufacturing. 
00:05:12
Fourth, you can see how much of  your DRAM is being utilized by  
00:05:17
each program by opening your computer’s  resource monitor and clicking on memory. 
00:05:22
Fifth, there are different generations of DRAM,  and we’ll explore DDR5.  Many of the key concepts  
00:05:29
that we explain apply to prior generations,  although the numbers may be different.   
00:05:34
Sixth, 17 nanoseconds is incredibly fast!   Electricity travels at around 1 foot per  
00:05:41
nanosecond, and 17 nanoseconds is about the  time it takes for light to travel across a room. 
00:05:47
Finally, this video is rather long as it covers  a lot of what there is to know around DRAM.  We  
00:05:54
recommend watching it first at 1.25x speed, and then a second time at  
1.5x speed to fully comprehend this  complex technology.  Stick around because this  
00:06:06
is going to be an incredibly detailed video.   To start, a stick of DRAM is also called a Dual  
00:06:14
Inline Memory Module or DIMM and there are 8  DRAM chips on this particular DIMM.  On the  
00:06:21
motherboard, there are 4 DRAM slots, and when  plugged in, the DRAM is directly connected to  
00:06:28
the CPU via 2 memory channels that run through  the motherboard.  Note that the left two DRAM  
00:06:34
slots share these memory channels, and the right  two share a separate channel.  Let’s move to  
00:06:40
look inside the CPU at the processor.  Along  with numerous cores and many other elements,  
00:06:46
we find the memory controller which manages  and communicates with the DRAM.  There’s also  
00:06:52
a separate section for communicating with SSDs  plugged into the M.2 slots and with SSDs and  
00:06:58
hard drives plugged into SATA connectors.  Using  these sections, along with data mapping tables,  
00:07:04
the CPU manages the flow of data from  the SSD to DRAM, as well as from DRAM  
00:07:10
to cache memory for processing by the cores. Let’s move back to see the memory channels.   
00:07:16
For DDR5 each memory channel is divided into two  parts, Channel A and Channel B. These two memory  
00:07:25
channels A and B independently transfer 32 bits at  a time using 32 data wires.   Using 21 additional  
00:07:35
wires each memory channel carries an address  specifying where to read or write data and, using  
00:07:42
7 control signal wires, commands are relayed. The addresses and commands are sent to and shared  
00:07:49
by all 4 chips on the memory channel which  work in parallel.  However, the 32-bit data  
00:07:55
lines are divided among the chips and thus each  chip only reads or writes 8 bits at a time.   
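The wire counts above can be summarized with a little arithmetic, using only the numbers from the transcript (a sketch, not a full DDR5 pinout):

```python
# DDR5 sub-channel wiring sketch, using the figures given above.
CHANNEL_DATA_WIRES = 32   # each sub-channel (A or B) moves 32 bits at a time
CHIPS_PER_CHANNEL = 4     # 4 DRAM chips share one sub-channel
ADDRESS_WIRES = 21        # shared address bus
CONTROL_WIRES = 7         # shared command/control bus

# The 32 data wires are divided among the chips...
bits_per_chip = CHANNEL_DATA_WIRES // CHIPS_PER_CHANNEL
print(bits_per_chip)              # 8 bits read or written per chip per transfer

# ...and sub-channels A and B together move 64 bits per transfer.
print(2 * CHANNEL_DATA_WIRES)     # 64
```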
00:08:01
Additionally, power for DRAM is  supplied by the motherboard and  
00:08:07
managed by these chips on the stick itself. Next, let’s open and look inside one of these  
00:08:13
DRAM microchips.  Inside the exterior packaging,  we find an interconnection matrix that connects  
00:08:20
the ball grid array at the bottom with the die  which is the main part of this microchip.  This 2  
00:08:26
gigabyte DRAM die is organized into 8 bank groups  composed of 4 banks each, totaling 32 banks.   
00:08:34
Within each bank is a massive array, 65,536 memory  cells tall by 8192 cells across, essentially rows  
00:08:47
and columns in a grid, with tens of thousands of  wires, and supporting circuitry running outside  
00:08:53
each bank.  Instead of looking at this die, we’re  going to transition to a functional diagram,  
00:08:59
and then reorganize the banks and bank groups. In order to access 17 billion memory cells,  
00:09:06
we need a 31-bit address.  3 bits are used to  select the appropriate bank group, then 2 bits  
00:09:13
to select the bank.  The next 16 bits of the address  are used to determine the exact row out of 65  
00:09:21
thousand.  Because this chip reads or writes 8  bits at a time, the 8192 columns are grouped by  
00:09:30
8 memory cells, all read or written at a time,  or ‘by 8’, and thus only 10 bits are needed for  
00:09:37
the column address.  One optimization is that  this 31-bit address is separated into two parts  
00:09:44
and sent using only 21 wires.  First, the bank  group, bank, and row address are sent, and then  
00:09:52
after that the column address.  Next, we’ll look  inside these physical memory cells, but first,  
00:09:58
let’s briefly talk about how these structures are  manufactured as well as this video’s sponsor.    
00:10:03
This incredibly complicated die,  also called an integrated circuit,  
00:10:09
is manufactured on 300-millimeter silicon wafers,  2500ish dies at a time.  On each die are billions  
00:10:18
of nanoscopic memory cells that are fabricated  using dozens of tools and hundreds of steps in  
00:10:24
a semiconductor fabrication plant or fab.  This  one was made by Micron which manufactures around  
00:10:31
a quarter of the world’s DRAM, including both  Nvidia’s and AMD’s VRAM in their GPUs.  Micron also  
00:10:39
has its own product line of DRAM and SSDs under  the brand Crucial which, as mentioned earlier,  
00:10:46
is the sponsor of this video.  In addition  to DRAM, Micron is one of the world’s leading  
00:10:51
suppliers of solid-state drives such as this  Crucial P5+ M.2 NVMe SSD.   By installing your  
00:11:00
operating system and video games on a Crucial  NVMe solid-state drive, you’ll be sure to have  
00:11:07
incredibly fast loading times and smooth gameplay,  and if you do video editing, make sure all those  
00:11:14
files are on a fast SSD like this one as well.   This is because loading speed is  
predominantly limited by the speed of  the SSD or hard drive where the files are stored. 
00:11:26
For example, this hard drive can only transfer  data at around 150 megabytes a second whereas  
00:11:34
this Crucial NVMe SSD can transfer data at a  rate of up to 6,600 megabytes a second, which,  
00:11:42
for comparison is the speed of a moving tortoise  versus a galloping horse.  By using a Crucial NVMe  
00:11:50
SSD, loading a video game that requires gigabytes  of DRAM is reduced from a minute or more down to  
00:11:58
a couple seconds.  Check out the Crucial NVMe  SSDs using the link in the description below. 
00:12:09
Let’s get back to the details of how DRAM works  and zoom in to explore a single memory cell  
00:12:15
situated in a massive array. This memory cell is  called a 1T1C cell and is a few dozen nanometers  
00:12:24
in size.  It has two parts, a capacitor to store  one bit of data in the form of electrical charges  
00:12:31
or electrons and a transistor to access and read  or write data.  The capacitor is shaped like a  
00:12:37
deep trench dug into silicon and is composed of  two conductive surfaces separated by a dielectric  
00:12:44
insulator or barrier just a few atoms thick, which  stops the flow of electrons but allows electric  
00:12:50
fields to pass through.  If this capacitor  is charged up with electrons to 1 volt,  
00:12:56
it’s a binary 1, and if no charges are present  and it’s at 0 volts, it’s a binary 0, and thus  
00:13:04
this cell only holds one bit of data.  Designs  of capacitors are constantly evolving but in  
00:13:11
this trench capacitor, the depth of the silicon is  utilized to allow for larger capacitive storage,  
00:13:17
while taking up as little area as possible. Next let’s look at the access transistor and  
00:13:24
add in two wires.  The wordline wire connects to  the gate of the transistor while the bitline wire  
00:13:31
connects to the other side of the transistor’s  channel.  Applying a voltage to the wordline  
00:13:36
turns on the transistor, and, while it’s on,  electrons can flow through the channel thus  
00:13:42
connecting the capacitor to the bitline.  This  allows us to access and charge up the capacitor  
00:13:47
to write a 1 or discharge the capacitor to write  a 0.  Additionally, we can read the stored value  
00:13:54
in the capacitor by measuring the amount of  charge.  However, when the wordline is off,  
00:14:00
the transistor is turned off, and the capacitor  is isolated from the bitline thus saving the  
00:14:05
data or charge that was previously written.  Note  that because this transistor is incredibly small,  
00:14:12
only a few dozen nanometers wide, electrons slowly  leak across the channel, and thus over time the  
00:14:19
capacitor needs to be refreshed to recharge  the leaked electrons. We’ll cover exactly how  
00:14:25
refreshing memory cells works a little later. As mentioned earlier, this 1T1C memory cell is  
00:14:33
one of 17 billion inside this single die and is  organized into massive arrays called banks.  So,  
00:14:41
let’s build a small array for illustrative  purposes.  In our array, each of the wordlines  
00:14:47
is connected in rows, and then the bitlines are  connected in columns.  Wordlines and bitlines  
00:14:53
are on different vertical layers so one can  cross over the other, and they never touch.  
00:15:00
Let’s simplify the visual and use symbols for the  capacitors and the transistors.  Just as before,  
00:15:06
the wordlines connect to each transistor’s control  gate in rows, and then all the bitlines in columns  
00:15:13
connect to the channel opposite each capacitor.  As a result, when a wordline is active,  
00:15:19
all the capacitors in only that row are  connected to their corresponding bitlines,  
00:15:24
thereby activating all the memory cells in that  row.  At any given time only one wordline is  
00:15:31
active because, if more than one wordline were  active, then multiple capacitors in a column  
00:15:37
would be connected to the bitline and the data  storage functionalities of these capacitors would  
00:15:42
interfere with one another, making them useless.   As mentioned earlier, within a single bank there  
00:15:48
are 65,536 rows and 8,192 columns and the 31-bit  address is used to activate a group of just 8  
00:15:59
memory cells.  The first 5 bits select the bank,  and the next 16-bits are sent to a row decoder  
00:16:06
to activate a single row.  For example, this  binary number turns on the wordline row 27,524,  
00:16:15
thus turning on all transistors in that row and  connecting the 8,192 capacitors to their bitlines,  
00:16:23
while at the same time the other 65  thousandish wordlines are all off.   
00:16:29
Here’s the logic diagram for a simple decoder. The remaining 10 bits of the address are sent  
00:16:35
to the column multiplexer.  This multiplexer  takes in the 8192 bitlines on the top, and,  
00:16:42
depending on the 10-bit address, connects a  specific group of 8 bitlines to the 8 input  
00:16:48
and output IO wires at the bottom.  For example,  if the 10-bit address were this, then only the  
00:16:55
bitlines 4,784 through 4,791 would be connected  to the IO wires, and the rest of the 8000ish  
00:17:05
bitlines would be connected to nothing.  Here’s  the logic diagram for a simple multiplexer.   
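The whole address path can be sketched in a few lines of Python. The field widths (3 bank-group bits, 2 bank bits, 16 row bits, 10 column bits) come from the transcript; which bits of the packed address sit where is an illustrative assumption, since real chips receive the row and column portions at different times over shared wires:

```python
def split_dram_address(addr31: int):
    """Split a 31-bit address into the fields described above:
    3 bank-group bits, 2 bank bits, 16 row bits, 10 column bits (MSB first)."""
    column     =  addr31        & 0x3FF    # low 10 bits -> column group
    row        = (addr31 >> 10) & 0xFFFF   # next 16 bits -> row decoder
    bank       = (addr31 >> 26) & 0x3      # next 2 bits -> bank within group
    bank_group = (addr31 >> 28) & 0x7      # top 3 bits -> bank group
    return bank_group, bank, row, column

# Rebuild the example above: row 27,524 is opened, and column address 598
# connects bitlines 598*8 = 4,784 through 4,791 to the IO wires.
addr = (5 << 28) | (2 << 26) | (27524 << 10) | 598
bg, bank, row, col = split_dram_address(addr)
print(bg, bank, row, col)        # 5 2 27524 598
print(col * 8, col * 8 + 7)      # 4784 4791
```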
00:17:11
We now have the means of accessing any  memory cell in this massive array; however,  
00:17:16
to understand the three basic operations,  reading, writing, and refreshing, let’s add  
00:17:22
two elements to our layout:  A sense amplifier  at the bottom of each bitline, and a read and  
00:17:28
write driver outside of the column multiplexer. Let’s look at reading from a group of memory  
00:17:34
cells.  First the read command and 31-bit address  are sent from the CPU to the DRAM.  The first 5  
00:17:42
bits select a specific bank. The next step is  to turn off all the wordlines in that bank,  
00:17:48
thereby isolating all the capacitors, and then  precharge all 8000ish bitlines to .5 volts.  Next  
00:17:57
the 16-bit row address turns on a row, and all  the capacitors in that row are connected to their  
00:18:03
bitlines.  If an individual capacitor holds a 1  and is charged to 1 volt, then some charge flows  
00:18:10
from the capacitor onto the .5-volt bitline, and  the voltage on the bitline increases.  The sense  
00:18:17
amplifier then detects this slight change  or perturbation of voltage on the bitline,  
00:18:22
amplifies the change, and pushes the voltage on  the bitline all the way up to 1 volt. However,  
00:18:28
if a 0 is stored in the capacitor, charge  flows from the bitline into the capacitor,  
00:18:35
and the .5-volt bitline decreases in voltage.   The sense amplifier then sees this change,  
00:18:41
amplifies it and drives the bitline voltage down  to 0 volts or ground.  The sense amplifier is  
00:18:49
necessary because the capacitor is so small,  and the bitline is rather long, and thus the  
00:18:54
capacitor needs to have an additional component  to sense and amplify whatever value is stored.   
00:19:00
Now, all 8000ish bitlines are driven to 1  volt or 0 volts corresponding to the stored  
00:19:08
charge in the capacitors of the activated  row, and this row is now considered open.   
00:19:12
Next, the column select multiplexer uses  the 10-bit column address to connect the  
00:19:19
corresponding 8 bitlines to the read  driver which then sends these 8 values  
00:19:24
and voltages over the 8 data wires to the CPU.  Writing data to these memory cells is similar  
00:19:31
to reading, however with a few key differences. First the write command, address, and 8 bits to  
00:19:39
be written are sent to the DRAM chip.  Next, just  like before the bank is selected, the capacitors  
00:19:46
are isolated, and the bitlines are precharged  to .5 volts.  Then, using a 16-bit address,  
00:19:54
a single row is activated, the capacitors perturb  the bitline, and the sense amplifiers sense this  
00:20:01
and drive the bitlines to a 1 or 0 thus opening  the row.  Next the column address goes to the  
00:20:09
multiplexer, but, this time, because a write  command was sent, the multiplexer connects the  
00:20:15
specific 8 bitlines to the write driver which  contains the 8 bits that the CPU had sent along  
00:20:20
the data wires and requested to write.  These  write drivers are much stronger than the sense  
00:20:26
amplifier and thus they override whatever voltage  was previously on the bitline, and drive each of  
00:20:32
the 8 bitlines to 1 volt for a 1 to be written,  or 0 volts for a 0.  This new bitline voltage  
00:20:39
overrides the previously stored charges or values  in each of the 8 capacitors in the open row,  
00:20:45
thereby writing 8 bits of data to the memory  cells corresponding to the 31-bit address. 
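The read and write sequences just described can be captured in a toy model. This is a behavioral sketch only (real sense amplifiers are analog circuits, and the bank, class, and method names here are invented for illustration): precharge to 0.5 V, activate a row so the sense amplifiers drive every bitline to a full 1 or 0, then read or overwrite a group of 8 columns.

```python
# Toy model of one DRAM bank: precharge, open a row, then read or write
# groups of 8 cells. Voltages follow the simplified 1 V / 0 V / 0.5 V values.
class ToyBank:
    PRECHARGE = 0.5

    def __init__(self, rows, cols):
        self.cells = [[0.0] * cols for _ in range(rows)]  # capacitor voltages
        self.open_row = None

    def activate(self, row):
        """Open `row`: each capacitor perturbs its precharged bitline, and the
        sense amplifier drives bitline AND capacitor to a full 1 V or 0 V,
        which is also what restores the charge during a refresh."""
        self.open_row = row
        for c, v in enumerate(self.cells[row]):
            self.cells[row][c] = 1.0 if v > self.PRECHARGE else 0.0

    def read8(self, col_group):
        start = col_group * 8   # the column multiplexer picks 8 bitlines
        return [int(v) for v in self.cells[self.open_row][start:start + 8]]

    def write8(self, col_group, bits):
        start = col_group * 8
        for i, b in enumerate(bits):  # write drivers override the sense amps
            self.cells[self.open_row][start + i] = float(b)

bank = ToyBank(rows=16, cols=64)
bank.activate(3)                            # open row 3
bank.write8(2, [1, 0, 1, 1, 0, 0, 1, 0])    # write 8 bits at column group 2
print(bank.read8(2))                        # [1, 0, 1, 1, 0, 0, 1, 0]
```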
00:20:52
Three quick notes.  First, as a reminder, writing  and reading happen concurrently across all 4  
00:20:58
chips in the shared memory channel, using  the same 31-bit address and command wires,  
00:21:03
but with different data wires for each chip.   Second, with DDR5 for a binary 1 the voltage  
00:21:11
is actually 1.1 volts, for DDR4 it’s 1.2 volts,  and prior generations had even higher voltages,  
00:21:19
with the bitline precharge voltages being  half of these voltages.  However, for DDR5,  
00:21:26
when writing or refreshing, a higher voltage  of around 1.4 volts is applied and stored in each  
00:21:33
capacitor for a binary 1 because charge leaks  out over time. However, for simplicity, we’re  
00:21:39
going to stick with 1 and 0.  Third, the number  of bank groups, banks, bitlines and wordlines  
00:21:46
varies widely between different generations  and capacities but is always in powers of 2. 
00:21:53
Let’s move on and discuss the third operation  which is refreshing the memory cells in a bank.   
00:21:59
As mentioned earlier, the transistors used to  isolate the capacitors are incredibly small,  
00:22:05
and thus charges leak across the channel.  The  refresh operation is rather simple and is a  
00:22:11
sequence of closing all the rows, precharging  the bitlines to .5 volts, and opening a row.   
00:22:17
To refresh, just as before, the capacitors perturb  the bitlines and then the sense amplifiers drive  
00:22:24
the bitlines and capacitors of the open row fully  up to 1 volt or down to 0 volts depending on the  
00:22:31
stored value of the capacitor, thereby refilling  the leaked charge.  This process of row closing,  
00:22:38
precharging, opening, and sense amplifying happens  row after row, taking 50 nanoseconds for each row,  
00:22:46
until all 65 thousandish rows are refreshed  taking a total of 3 milliseconds or so to  
00:22:53
complete.  The refresh operation occurs  once every 64 milliseconds for each bank,  
00:22:58
because that’s statistically below the  worst-case time it takes for a memory  
00:23:03
cell to leak too much charge to make a stored 1  turn into a 0, thus resulting in a loss of data. 
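The refresh numbers above fit together arithmetically, and it's worth seeing the overhead they imply (the ~5% figure is derived from the transcript's own numbers, not stated in it):

```python
# Refresh timing from the figures above.
rows_per_bank = 65_536
time_per_row_s = 50e-9        # ~50 ns to close, precharge, and open each row
refresh_interval_s = 64e-3    # every cell must be refreshed within 64 ms

full_refresh_s = rows_per_bank * time_per_row_s
print(f"{full_refresh_s * 1e3:.2f} ms to walk all rows")  # ~3.28 ms, the '3 ms or so'
print(f"{full_refresh_s / refresh_interval_s:.1%} of each 64 ms window spent refreshing")
```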
00:23:12
Let’s take a step back and consider the  incredible amount of data that is moved  
00:23:17
through DRAM memory cells. These banks of memory  cells handle up to 4.8 billion  
00:23:24
requests to read and write data every second  while refreshing every memory cell in each  
00:23:30
bank row by row around 16 times a second.  That’s a staggering amount of data movement  
00:23:37
and illustrates the true strength of computers.  Yes, they do simple things like comparisons,  
00:23:44
arithmetic, and moving data around, but  at a rate of billions of times a second.  
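Both rates quoted above drop out of two numbers: the DDR5-4800 transfer rate and the 64-millisecond refresh interval. A quick check:

```python
# Request rate and refresh rate from the figures above.
transfers_per_s = 4_800_000_000   # DDR5-4800: 4800 mega-transfers per second
refresh_interval_s = 64e-3        # each bank is fully refreshed every 64 ms

print(transfers_per_s / 1e9)      # 4.8 billion read/write transfers per second
print(1 / refresh_interval_s)     # ~15.6 full refresh passes per second ('around 16')
```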
00:23:50
Now, you might wonder why computers  need to do so much data movement. Well,  
00:23:55
take this video game for example. You have obvious  calculations like the movement of your character  
00:24:01
and the horse. But then there are individual  grasses, trees, rocks, and animals whose  
00:24:07
positions and geometries are stored in DRAM.  And then the environment such as the lighting  
00:24:14
and shadows change the colors and textures of the  environment in order to create a realistic world. 
00:24:21
Next, we’re going to explore breakthroughs and  optimizations that allow DRAM to be incredibly  
00:24:28
fast. But, before we get into all those  details, we would greatly appreciate it  
00:24:33
if you could take a second to hit that like  button, subscribe if you haven’t already,  
00:24:37
and type up a quick comment below, as it helps get  this video out to others.  Also, we have a Patreon  
00:24:45
and would appreciate any support.  This is our  longest and most detailed video by far, and we’re  
00:24:51
planning more videos that get into the inner  details of how computers work.  We can’t do it  
00:24:57
without your help, so thank you for watching and  doing these three quick things. It helps a ton. 
00:25:07
The first complex topic which we’ll explore  is why there are 32 banks, as well as what the  
00:25:14
parameters on the packaging of DRAM are.   After that, we’ll explore burst buffers,  
00:25:19
sub-arrays, and folded DRAM architecture  and what’s inside the sense amplifier. 
00:25:25
Let’s take a look at the banks.  As  mentioned earlier opening a single  
00:25:30
row within a bank requires all these  steps and this process takes time.
00:25:34
However, if a row were already open, we  could read or write to any section of  
00:25:39
8 memory cells using only the 10-bit  column address and the column select  
00:25:44
multiplexer.   When the CPU sends a read or  write command to a row that’s already open,  
00:25:51
it’s called a row hit or page hit, and this  can happen over and over.  With a row hit,  
00:25:56
we skip all the steps required to open a row, and  just use the 10-bit column address to multiplex a  
00:26:03
different set of 8 columns or bitlines, connecting  them to the read or write driver, thereby saving  
00:26:09
a considerable amount of time.  A row miss is  when the next address is for a different row,  
00:26:14
which requires the DRAM to close and isolate the  currently open row, and then open the new row.  
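Row hits versus misses can be modeled with a tiny open-row tracker. The per-access costs here are arbitrary illustrative units (a miss simply costs more than a hit), not real DRAM timings:

```python
# Minimal open-row tracker: a hit reuses the open row, a miss pays to
# close, precharge, and activate. Accesses are (bank, row, column) tuples.
def access_pattern_cost(accesses, t_hit=1, t_miss=3):
    open_rows = {}                # bank -> currently open row
    total = 0
    for bank, row, _col in accesses:
        if open_rows.get(bank) == row:
            total += t_hit        # row hit: column access only
        else:
            total += t_miss       # row miss: close old row, open new one
            open_rows[bank] = row
    return total

sequential = [(0, 7, c) for c in range(8)]       # stay in one row
thrashing  = [(0, c % 2, c) for c in range(8)]   # ping-pong between two rows
print(access_pattern_cost(sequential))   # 10: one miss, then seven hits
print(access_pattern_cost(thrashing))    # 24: every access is a miss
```

This is why keeping accesses within an open row, or spreading hot rows across independent banks, pays off so much.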
00:26:20
On a package of DRAM there are typically 4 numbers  specifying timing parameters regarding row hits,  
00:26:27
precharging, and row misses.  The first number  refers to the time it takes between sending an  
00:26:33
address with a row open, thus a row hit, to  receiving the data stored in those columns.   
00:26:38
The next number is the time it takes to open  a row if all the wordlines are off and the  
00:26:44
bitlines are precharged.  Then the next number  is the time it takes to precharge the bitlines  
00:26:49
before opening a row, and the last number is  the time it takes between a row activation and  
00:26:55
the following precharge.  Note that these  numbers are measured in clock cycles.   
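Since those four numbers are in clock cycles, converting them to nanoseconds takes one extra step. The 40-39-39-77 timings below are hypothetical example values for a DDR5-4800 module, not a specific product's datasheet:

```python
# Converting DRAM timing parameters (clock cycles) to nanoseconds.
transfers_per_s = 4_800e6          # DDR5-4800
clock_hz = transfers_per_s / 2     # DDR transfers data on both clock edges
cycle_ns = 1e9 / clock_hz          # ~0.417 ns per clock cycle

# Hypothetical CL-tRCD-tRP-tRAS timings, matching the four numbers described.
cl, trcd, trp, tras = 40, 39, 39, 77
print(f"row hit latency  ~{cl * cycle_ns:.1f} ns")                 # ~16.7 ns
print(f"row miss latency ~{(trp + trcd + cl) * cycle_ns:.1f} ns")  # ~49.2 ns
```

Note how a row hit with these example timings lands right around the ~17 nanosecond figure from the start of the video.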
00:27:00
Row hits are also the reason why the address is  sent in two sections, first the bank selection and  
00:27:07
row address called RAS and then the column address  called CAS. If the first part, the bank selection  
00:27:14
and row address, matches a currently open row,  then it’s a row hit, and all the DRAM needs is the  
00:27:20
column address and the new command, and then the  multiplexer simply moves around the open row.   
00:27:26
Because of the time saving in accessing an  open row, the CPU memory controller, programs,  
00:27:32
and compilers are optimized for increasing the  number of subsequent row hits. The opposite,  
00:27:38
called thrashing, is when a program jumps around  from one row to a different row over and over,  
00:27:44
and is obviously incredibly inefficient  both in terms of energy and time.   
00:27:49
Additionally, DDR5 DRAM has 32 banks for  this reason.  Each bank’s rows, columns,  
00:27:57
sense amplifiers and row decoders operate  independently of one another, and thus multiple  
00:28:03
rows from different banks can be open all at the  same time, increasing the likelihood of a row hit,  
00:28:09
and reducing the average time it takes for the CPU  to access data.  Furthermore, by having multiple  
00:28:16
bank groups, the CPU can refresh one bank in each  bank group at a time while using the other three,  
00:28:22
thus reducing the impact of refreshing.  A question you may have had earlier is why  
00:28:28
are banks significantly taller than they are  wide? Well, by combining all the banks together  
00:28:34
one next to the other you can think of this chip  as actually being 65 thousand rows tall by 262  
00:28:44
thousand columns wide. And, by adding 31 equally  spaced divisions between the columns, thus  
00:28:50
creating banks, we allow for much more flexibility  and efficiency in reading, writing and refreshing. 
00:28:58
Also, note that on the DRAM packaging are  its capacity in Gigabytes, the number of  
00:29:04
millions of data transfers per second, which  is two times the clock frequency, and the peak  
00:29:10
data transfer rate in Megabytes per second. The next design optimization we’ll explore  
00:29:16
is the burst buffer and burst length.  Let’s add a  128-bit read and write temporary storage location,  
00:29:23
called a burst buffer to our functional diagram.   Instead of 8 wires coming out of the multiplexer,  
00:29:30
we’re going to have 128 wires that connect  to these 128-bit buffer locations.  Next  
00:29:38
the 10-bit column address is broken into two  parts, 6 bits are used for the multiplexer,  
00:29:44
and 4 bits are for the burst buffer.  Let’s explore a reading command.  With  
00:29:49
our burst buffer in place, 128 memory cells and  bitlines are connected to the burst buffer using  
00:29:56
the 6 column bits, thereby temporarily loading,  or caching 128 values into the burst buffer.   
00:30:04
Using the 4 bits for the buffer, 8 quickly  accessed data locations in the burst buffer  
00:30:10
are connected to the read drivers and the data is  sent to the CPU.  By cycling through these 4 bits,  
00:30:16
all 16 sets of 8 bits are read out, and thus the  burst length is 16.  After that a new set of 128  
00:30:25
bitlines and values are connected and loaded  into the burst buffer.  There’s also a write  
00:30:31
burst buffer which operates in a similar way. The benefit of this design is that 16 sets of  
00:30:37
8 bits per microchip, totaling 1024 bits across all  8 chips, can be  accessed and read or written extremely quickly,  
00:30:45
as long as the data is all next to one  another, but at the same time we still  
00:30:49
have the granularity and ability to access any  set of 8 bits if our data requests jump around. 
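The 6-bit/4-bit split of the column address can be sketched directly. Which end of the 10-bit address feeds the multiplexer versus the burst buffer is an assumption for illustration; the widths themselves come from the transcript:

```python
# Burst-buffer address split: 10 column bits = 6 multiplexer bits
# (which group of 128 bitlines loads the buffer) + 4 burst bits
# (which 8-bit slot of the buffer goes out on the data wires).
def split_column_address(col10: int):
    mux_bits = col10 >> 4      # top 6 bits: one of 8192/128 = 64 bitline groups
    burst_bits = col10 & 0xF   # low 4 bits: one of 16 slots of 8 bits
    return mux_bits, burst_bits

print(split_column_address(0b1001010111))   # (37, 7)

burst_length = 16
bits_per_chip_per_burst = burst_length * 8  # 128 bits from one chip
print(bits_per_chip_per_burst * 8)          # 1024 bits across all 8 chips
```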
00:30:56
The next design optimization is that this bank  of 65536 rows by 8192 columns is rather massive,  
00:31:07
and results in extremely long wordlines and  bitlines, especially when compared to the size of  
00:31:14
each trench capacitor memory cell.  Therefore,  the massive array is broken up into smaller  
00:31:20
blocks 1,024 by 1,024, with intermediate  sense amplifiers below each subarray,  
00:31:27
and subdividing wordlines and using a hierarchical  row decoding scheme.  By subdividing the bitlines,  
00:31:34
the distance and amount of wire that each tiny  capacitor is connected to as it perturbs the  
00:31:40
bitline to the sense amplifier is reduced, and  thus the capacitor doesn’t have to be as big.  By  
00:31:47
subdividing the wordlines the capacitive load from  eight thousandish transistor gates and channels is  
00:31:53
decreased, and thus the time it takes to turn on  all the access transistors in a row is decreased. 
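The subarray arithmetic implied above is straightforward; dividing the stated bank dimensions by the stated block size gives the number of subarrays per bank (a derived count, not a figure stated in the video):

```python
# Subdividing one bank into 1,024 x 1,024 subarrays.
bank_rows, bank_cols = 65_536, 8_192
sub_rows, sub_cols = 1_024, 1_024

print(bank_rows // sub_rows)   # 64 subarray rows per bank
print(bank_cols // sub_cols)   # 8 subarray columns per bank
print((bank_rows // sub_rows) * (bank_cols // sub_cols))   # 512 subarrays per bank
```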
00:31:59
The final topic we’re going to talk about is  the most complicated.  Remember how we had  
00:32:05
a sense amplifier connected to the bottom of  each bitline?  Well, this optimization has two  
00:32:10
bitlines per column going to each sense amplifier  and alternating rows of memory cells connected to  
00:32:17
the left and right bitlines, thus doubling the  number of bitlines.  When one row is active,  
00:32:22
half of the bitlines are active while the other  half are passive and vice versa when the next row  
00:32:28
is active.   Moving down to see inside the sense  amplifier we find a cross-coupled inverter.  How  
00:32:34
does this work?  Well, when the active bitline is  a 1, the passive bitline will be driven by this  
00:32:41
cross-coupled inverter to the opposite value  of 0, and when the active is a 0, the passive  
00:32:46
becomes a 1.  Note that the passive  bitline isn’t connected to any memory cells in the active row,  
00:32:52
and thus it doesn’t mess up any stored data.  The  cross-coupled inverter makes it such that these  
00:32:58
two bitlines are always going to be opposite  one another, and they’re called a differential  
00:33:03
pair.  There are three benefits to this design.   First, during the precharge step, we want to bring  
00:33:09
all the bitlines to 0.5 volts, and with a differential pair of active and passive bitlines,
00:33:15
the easiest solution is to disconnect the cross-coupled inverters and open a channel between the
00:33:22
two using a transistor.  The charge easily flows from the 1 bitline to the 0 bitline, and they
00:33:28
both average out and settle at 0.5 volts.  The other two benefits are noise immunity,
00:33:34
and a reduction in the parasitic capacitance of the bitline.  These benefits follow from the fact
00:33:39
that by creating two oppositely charged wires, with electric fields running from one to
00:33:45
the other, we reduce the stray electric fields emitted in other directions and likewise increase
00:33:51
the ability of the sense amplifier to amplify one bitline to 1 volt and the other to 0 volts.
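Both jobs of the sense amplifier, precharge equalization and regenerative sensing, can be sketched with a toy numerical model.  The gain, step count, and voltages below are illustrative assumptions, not values from a real DRAM datasheet:

```python
# Toy model of a DRAM sense amplifier's two jobs.

def equalize(v_a, v_b):
    """Precharge step: a transistor shorts the differential pair together,
    so charge flows from the 1 bitline to the 0 bitline and both settle
    at the average, 0.5 volts, without drawing current from the supply."""
    return (v_a + v_b) / 2

def sense(v_active, v_passive=0.5, gain=0.2, steps=40):
    """Regenerative sensing: each cross-coupled inverter pulls its node
    toward the opposite of the other node, so a tiny perturbation on the
    active bitline snowballs into full 0 V / 1 V rail levels."""
    a, b = v_active, v_passive
    for _ in range(steps):
        a = min(1.0, max(0.0, a + gain * (0.5 - b)))  # inverter driven by b
        b = min(1.0, max(0.0, b + gain * (0.5 - a)))  # inverter driven by a
    return round(a), round(b)

print(equalize(1.0, 0.0))  # 0.5
print(sense(0.55))         # stored 1 perturbs bitline upward -> (1, 0)
print(sense(0.45))         # stored 0 perturbs bitline downward -> (0, 1)
```

The key property the model captures is positive feedback: any imbalance between the two bitlines, however small, grows each iteration until the pair latches at opposite rails.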
00:33:58
One final note is that when discussing DRAM,  one major topic is the timing of addresses,  
00:34:04
command signals and data, and the related  acronyms DDR or double data rate, and SDRAM,  
00:34:12
or Synchronous DRAM.  These topics were omitted from this video because they would have taken an
00:34:17
additional 15 minutes to explore properly.  That’s
00:34:24
pretty much it for the DRAM, and we are grateful  you made it this far into the video.  We believe  
00:34:30
the future will require a strong emphasis on  engineering education and we’re thankful to all  
00:34:36
our Patreon and YouTube Membership Sponsors  for supporting this dream.  If you want to  
00:34:41
support us on YouTube Memberships, or Patreon,  you can find the links in the description.  
00:34:47
A huge thanks goes to Nathan, Peter, and Jacob, who are doctoral students at the Florida
00:34:53
Institute for Cybersecurity Research for helping  to research and review this video’s content!  They  
00:35:00
do foundational research on finding the weak  points in device security and whether hardware  
00:35:05
is compromised.  If you want to learn more about  the FICS graduate program or their work, check out  
00:35:11
the website using the link in the description.   This is Branch Education, and we create 3D  
00:35:18
animations that dive deep into the technology that  drives our modern world.  Watch another Branch  
00:35:24
video by clicking one of these cards or click here  to subscribe.  Thanks for watching to the end!

Description:

Check out Crucial NVMe SSDs Here: https://www.crucial.com/

Have you ever wondered why it takes time for computers to load programs or video games? Also, ever wonder why your computer uses both DRAM and SSDs when both are used to store data? Well, most of that time is spent moving data from a hard drive or SSD into DRAM, or Dynamic Random Access Memory, which is the working memory inside your computer. In this video, we take a very deep dive into DRAM. We'll see how it connects to other parts of your computer, and then we'll explore how DRAM can store gigabytes of data in nanoscopic capacitors. After that, we'll cover the three main operations of DRAM: reading, writing, and refreshing. And finally, we'll dive into some of the more complex aspects of DRAM that make it so amazingly fast, such as the folded DRAM architecture. We'll also learn what burst buffers are, and why there are so many banks of DRAM memory cells.

Do you want to support in-depth engineering and technology education? Support us at: https://www.patreon.com/brancheducation
Website: https://www.branch.education/
On Facebook: https://www.facebook.com/unsupportedbrowser
On Twitter: https://twitter.com/TeddyTablante
On Insta: https://www.facebook.com/unsupportedbrowser

Thanks to Nathan, Peter, and Jacob for helping research and review this video! They're doctoral students at the Florida Institute for Cybersecurity Research, and you can learn more about their program here: https://fics.institute.ufl.edu/

Table of Contents:
00:00 - Intro to Computer Memory
00:47 - DRAM vs SSD
02:23 - Loading a Video Game
03:25 - Parts of this Video
04:07 - Notes
06:10 - Intro to DRAM, DIMMs & Memory Channels
10:43 - Crucial Sponsorship
12:09 - Inside a DRAM Memory Cell
15:28 - A Small Array of Memory Cells
17:41 - Reading from DRAM
19:38 - Writing to DRAM
21:55 - Refreshing DRAM
23:16 - Why DRAM Speed is Critical
25:06 - Complicated DRAM Topics: Row Hits
26:21 - DRAM Timing Parameters
27:51 - Why 32 DRAM Banks?
29:17 - DRAM Burst Buffers
30:58 - Subarrays
32:02 - Inside DRAM Sense Amplifiers
34:24 - Outro to DRAM

Key Branches from this video: How do Solid State Drives Work?

Erratum:
At 10m 08s: "Cicruit" should be "Circuit".
At 21m 54s: "32 Bank Groups" should be "32 Banks".

Script, Modeling, Animation: Teddy Tablante (Twitter: @teddytablante)
Animation: Mike Radjabov
Modeling: Prakash Kakadiya
Voice Over: Phil Lee
Sound Design: www.drilu.mx
Music Editing: Luis Zuleta
Sound Effects: Paulo de los Cobos
Supervising Sound Editor and Mixer: Luis Huesca
Animation built using Blender 3.1.2: https://www.blender.org/
Post with Adobe Premiere Pro
