A tiny AI supercomputer for your desk
Transcript:
00:00:00
This little box isn't a mini PC. Well, at least not in the same sense as Apple's Mac Mini, or even the Minisforum MS-R1 I tested a few videos ago. And it isn't an AI box, at least not in the way some people think. If you just want to run large language models, honestly, the AMD Strix Halo, like the Framework Desktop mainboard I reviewed earlier this year, gets similar performance for about half the price. And if you want to run huge models, a maxed-out Mac Studio goes faster with more efficiency. This thing is a $4,000 box built specifically for developers in Nvidia's ecosystem, deploying code to servers that cost half a million dollars each. A major part of the selling point is the pair of built-in 200-gigabit QSFP ports, which, if I'm being honest, behave a little strangely. But on paper at least, those ports alone are worth 1,500 bucks.
00:00:50
Dell sent me two of their Dell Pro Max with GB10 boxes to test. That name just rolls right off the tongue. But they aren't paying me for this video and have no control over what I say. In fact, one of the main things they said was "this isn't a gaming machine, so maybe don't focus on that." But that got me thinking: what if I did? Valve just announced the Steam Frame, and it runs on ARM. Supposedly the GPU inside this thing is equivalent to maybe a 4070, just with gobs of extra VRAM for AI. And the CrossOver preview for ARM just shipped using FEX, the same tech that's going to power the Steam Frame. So, of course, I tried gaming on here. Sorry about that, Dell.

00:01:25
After loading up Steam, which runs perfectly on this little ARM Linux box, I ran Cyberpunk 2077. I played through a bit of the game and had zero problems. Running the built-in benchmark, I got 40 FPS at 1080p with full ray tracing. If I turned that off and used Steam Deck settings, I got 50. Then, if I turned down the settings a little more, I was hitting almost 100 frames per second, which was surprising. I ran these same benchmarks on the fastest ARM desktop in the world, the Thelio Astra, and that only got 30 to 50 FPS. Next, I fired up Doom Eternal, and I was getting 100 to 200 frames per second all day, running it with ray tracing on and ultra settings at 1080p. There was zero stuttering, and the whole experience gaming on this thing was just as good as the little Windows PC I have in the rack mount at my desk. And of course, I know you'll ask: can it play Crysis? The answer is yes, very well. In fact, I couldn't get MangoHud working on here, so I don't have a frame rate, but it was well over 60 FPS and more playable than on any other ARM system I've tested so far, including a Mac Studio. Even Ultimate Epic Battle Simulator 2, which kind of slaughters ARM CPUs, was playable at 40 to 50 FPS with thousands of chickens slugging it out against a Roman legion.

00:02:38
So yeah, gaming on ARM Linux: maybe Valve is on to something. But despite all that, there are tons of games with kernel-level anti-cheat that don't run on Linux at all, much less the ARM Linux this box runs. And while I don't agree with Dell that this isn't a gaming machine, I do agree that it shouldn't be the focus. This thing costs almost 4 grand.
And for that much, you can build a much more capable gaming system if that's what you're after, even with RAM prices the way they are today. This machine is built for AI development.

00:03:06
But hold on. I can't talk about that without telling you the real reason I wanted to test this particular model. The thing that got me to look into it isn't the AI chops, Nvidia's developer ecosystem, or even those fancy networking jacks. It's the GB10 chip inside. GB stands for Grace Blackwell. Blackwell is the GPU architecture that costs tens of thousands of dollars per unit. The Grace part is the Grace CPU, an ARM CPU with cores that should be competitive with Apple's or Qualcomm's. That's not the most important part of this platform, especially according to Nvidia, but it was interesting to me. Why would Nvidia ditch Intel and AMD, and all the compatibility of x86, in their premier development platform?

00:03:50
We'll get to that, right after we also answer why Nvidia didn't put a power LED on the front of their version of this thing, the DGX Spark. The answer: I don't know. But Dell did, and they fixed some thermal problems by designing the box for better front-to-back airflow. In my testing, I didn't see any thermal throttling, and the things are pretty quiet, too, hitting just 42 to 43 dB maxed out. And that was while running a cluster of two of these, burning through 300 W of power from a couple of external PSUs that are also a little more generous than the 240 W versions Nvidia shipped. But there's not a whole lot to look at here.

00:04:24
Like I said, the main thing I wanted to check out today is the Grace part of the GB10 chip. The Grace CPU has a big.LITTLE layout with 10 performance cores and 10 efficiency cores. It was apparently co-designed by MediaTek and put on the same chip next to the Blackwell GPU. I think partly because of that architecture, the system's idle power draw is a bit higher than I'm used to for ARM, coming in around 30 watts. And just having over 100 GB of RAM isn't an excuse for that; otherwise, AMD and Apple would both be pumping out a lot of idle power, too. But on the power situation, I do like how Dell provides a power supply with a little more headroom, up to 280 W. Not all that power goes into the GB10 chip, though. There are also those crazy networking ports, and all the USB-C ports, which can put out some power, too. In my testing, the chip itself seems to max out around 140 W, which is still a lot of power to feed into this little guy. Anyway, we'll get to performance soon. For now, though, I want to switch tracks and talk about software.

00:05:19
Nvidia ships a customized version of Ubuntu with this thing called DGX OS. Regular Ubuntu LTS versions are supported for 5 years, with optional Pro support extending that out to 10 or even 15 years. But DGX OS only guarantees updates for 2 years, it looks like, which for a box that costs nearly 4 grand seems pretty weak. It might not be as big an issue if other Linux distros would just run on this thing, and they may, but this is one downside to it being built on ARM instead of x86. ARM, despite some progress in the past few years, is still not as compatible as x86 platforms. So, in a few years, for any features that aren't ported into mainline Linux, you might have to sacrifice functionality if you want a newer version of, say, Ubuntu or Fedora.
00:06:02
The reason I mention this is that Nvidia hasn't had the best track record, especially for their more end-user-facing systems. My old Jetson Nano, which I bought a few years after my first Raspberry Pi (which is still supported), still only has Ubuntu 18.04 support, which is way past end of life. And sometimes trying to figure out just what is supported on Nvidia's embedded devices can be a nightmare. Some people have already had luck getting other distros running, but they're still running on Nvidia's Linux kernel. So, if you buy one of these, know that there are no guarantees for ongoing support beyond a few years from now. But anyway, once you have DGX OS running, you can install practically anything that works in Linux. Server software runs perfectly, but some desktop tools are a little more of a hassle. Blender, for instance, doesn't have a stable release with GPU acceleration on ARM, but if you compile it from source, like GitHub user CoconutMacaroon did, you can get full acceleration. And I already covered games earlier, but in general, just using this box as a little ARM workstation, it felt plenty fast for all the things I do, from coding to browsing the web and light editing.
00:07:00
Anyway, to get some numbers behind my intuition, I ran my full gauntlet of benchmarks, both on a single node and in a cluster of two of them connected with a 200-gigabit Amphenol QSFP cable. I'm going to leave cluster performance for a later video, so get subscribed if you want to see that. But as a standalone ARM Linux box, this thing is pretty fast. Geekbench 6 was a little unstable, but I did get it to run, and it was about on par with the AMD Ryzen AI Max+ 395 system I tested earlier this year, the Framework Desktop. Apple's two-generation-old M3 Ultra Mac Studio beats both, but it costs quite a bit more, so that's to be expected. And testing with High-Performance Linpack, this thing gets about 675 gigaflops. But wait, that's not even a teraflop, and Nvidia said this thing offers a petaflop of AI computing performance. That's a thousand teraflops. Well, look more closely: Nvidia says it's a petaflop of AI at FP4 precision. HPL tests at FP64, aka double precision, which is used more in scientific computing. So, don't always believe the things you hear in marketing; a flop is not always a flop. And even that petaflop claim is disputed, at least if I'm reading John Carmack's tweets correctly. I only tell you the things that I can measure, and so far, I haven't measured a petaflop. But we'll get to AI benchmarks soon. All that said, I was able to put two of these together and build a tiny ARM cluster that would have made the global Top500 supercomputer list as late as 2005.
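To make the units concrete, here's a small sanity check using the numbers above: the one-petaflop FP4 marketing figure against the roughly 675 FP64 gigaflops I measured with HPL, plus the two-node aggregate. This is just unit arithmetic, not a claim that the two precisions should ever produce matching throughput:

```python
# Sanity check: "a petaflop of AI performance" vs. a measured HPL result.
# The two numbers use different precisions, so they aren't comparable.

marketed_fp4_flops = 1_000e12   # 1 "AI petaflop", quoted at FP4 precision
measured_fp64_flops = 675e9     # ~675 gigaflops from High-Performance Linpack (FP64)

# How far apart the marketing number and the measured FP64 number are:
ratio = marketed_fp4_flops / measured_fp64_flops
print(f"FP4 marketing vs FP64 measured: {ratio:.0f}x apart")  # ~1481x

# Two nodes clustered: aggregate FP64 throughput, ignoring interconnect losses.
cluster_fp64_tflops = 2 * measured_fp64_flops / 1e12
print(f"Two-node cluster: ~{cluster_fp64_tflops:.2f} FP64 teraflops")  # ~1.35
```

The takeaway is just that a "flop" quoted at 4-bit precision and a flop measured at 64-bit precision live in different worlds, so the three-orders-of-magnitude gap between the spec sheet and the HPL run is expected, not a scandal.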
And this little guy is pretty efficient, too; it definitely beats Intel and AMD with its Grace CPU. Idle power is one area where it falls short, though. Even without the power-hungry networking active, this thing sucks down 30 watts at idle. That's three times what Apple and even modern AMD can do.

00:08:42
But a huge part of the value of this box is the built-in ConnectX networking.
I tested that, and yeah, it's fast: way faster than I could get from either the Mac or AMD machines on their fastest Thunderbolt ports. But 106 Gbps isn't 200. So, is Nvidia lying again? Well, no. This is a little complicated, so I'm going to refer you to a ServeTheHome article. The way these two ports are built, you're only ever going to get about 200 gigabits of bandwidth across both ports combined, even though each one is rated at 200 for a total of 400. And the only way to achieve 200 Gbps isn't with normal Ethernet; it's with RDMA. That's the same tech I showed in Apple's cluster a couple of weeks ago, and it lets things like LLMs work together better when you're clustering multiple GB10s. But it doesn't mean you just get a blanket 200 Gbps, and definitely not 400. Still, the fact that you get basically a $1,500 network card built into this tiny computer is part of the overall value of this box. Being able to work with the same clustering tech that runs in Nvidia's so-called AI factories, on a desktop sitting here very quietly, is what it's all about. And from that perspective, if you want to replicate this kind of developer setup on AMD, you'd have to spend around the same amount of money for the Max+ 395 and a ConnectX card on top of that.
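A quick sketch of the bandwidth arithmetic behind that 106-versus-200 observation. The 106 Gbps figure is my rough measured number from above; the rest is just illustrating why the "2 x 200" on the spec sheet never turns into 400 usable gigabits:

```python
# Line rate vs. what plain TCP actually delivered on these QSFP ports.

line_rate_gbps = 200        # rated speed of one QSFP port
measured_tcp_gbps = 106     # roughly what plain TCP testing achieved

efficiency = measured_tcp_gbps / line_rate_gbps
print(f"TCP achieves ~{efficiency:.0%} of line rate")  # ~53%

# Both ports together are also capped at about 200 Gbps total, so the
# spec-sheet multiplication of 2 ports x 200 Gbps never means 400 usable:
usable_total_gbps = min(2 * line_rate_gbps, 200)
print(f"Usable aggregate: ~{usable_total_gbps} Gbps, not {2 * line_rate_gbps}")
```

Getting the remaining headroom between ~106 and ~200 Gbps is what RDMA is for; ordinary TCP sockets leave a lot of it on the table.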
00:09:55
A lot of people don't care about RDMA or InfiniBand, but that doesn't mean it's not extremely useful for the people who do. Just like with Apple's new RDMA-over-Thunderbolt support, this stuff's expensive, but for some people, it's not a bad value.

00:10:09
For now, on this one machine, I'm just running two models, both of them with llama.cpp optimized for each architecture. For a small model that requires a decent amount of CPU to keep up with the GPU, the GB10 does pretty well, almost hitting 100 tokens per second for inference, which is second to the M3 Ultra. But for prompt processing, which is important for how fast you get a response out of AI models, the GB10 chip is the winner.
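To see why prompt-processing speed matters so much for perceived responsiveness, here's a toy calculation. The token counts and speeds below are hypothetical round numbers, not my benchmark results; the formula is simply prompt tokens divided by prompt-processing speed, plus output tokens divided by generation speed:

```python
# Time to a complete response = prompt ingestion time + generation time.
# All figures are hypothetical, chosen to show that prompt processing
# dominates when you feed a model a large context.

prompt_tokens = 8000        # e.g. a big chunk of code pasted into the prompt
output_tokens = 500

pp_speed = 2000             # prompt processing, tokens/sec (hypothetical)
tg_speed = 100              # token generation, tokens/sec (hypothetical)

ingest_s = prompt_tokens / pp_speed      # 4.0 s
generate_s = output_tokens / tg_speed    # 5.0 s
print(f"ingest {ingest_s:.1f}s + generate {generate_s:.1f}s = {ingest_s + generate_s:.1f}s total")

# Halve the prompt-processing speed and the long-context case gets much worse:
print(f"at half pp speed: {prompt_tokens / (pp_speed / 2) + generate_s:.1f}s total")
```

With a long prompt, doubling or halving prompt-processing throughput shifts total wait time far more than the same change in generation speed, which is why a box that wins at prompt processing can feel faster even with middling tokens-per-second.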
That's despite costing less than half of the M3 Ultra. And it's a similar story for a huge dense model, Llama 3.1 70B, except here it gets beat just a little bit by AMD's Strix Halo in the Framework Desktop. But again, prompt processing is a strong selling point for these boxes. That's the reason Exo is running a DGX Spark as the compute node for a Mac Studio cluster. With that setup, you could run the DGX Spark, or one of these Dell boxes, and have it handle the thing it's best at, prompt processing, while the Mac Studios handle the thing they're best at: memory bandwidth for token generation. Anyway, these are just two quick AI benchmarks, and I have a lot more in the GitHub issue I'll link to below. I'm doing a lot more testing, including model training and how I clustered two of these things in this tiny little mini rack, but you'll have to wait until next year for those. Until then, I'm Jeff Geerling.

Description:

Let's see if Nvidia's GB10 "AI Superchip" is all it's hyped up to be... Thanks to Dell for providing the two Dell Pro Max with GB10 units for testing and evaluation, along with accessories to get them clustered.

Resources I mentioned in this video:
- Dell Pro Max with GB10 Benchmark Results: https://github.com/geerlingguy/sbc-reviews/issues/92
- Dell Pro Max with GB10 AI Benchmarks: https://github.com/geerlingguy/ai-benchmarks/issues/34
- Dell Pro Max with GB10: https://www.dell.com/en-us/shop/desktop-computers/dell-pro-max-with-gb10/spd/dell-pro-max-fcm1253-micro/xcto_fcm1253_usx
- DGX Spark has no power LED: https://community.frame.work/t/dgx-spark-vs-strix-halo-initial-impressions/77055
- MediaTek on Grace CPU: https://www.mediatek.com/press-room/newly-launched-nvidia-dgx-spark-features-gb10-superchip-co-designed-by-mediatek
- DGX OS Release Cadence: https://docs.nvidia.com/dgx/dgx-spark/dgx-os.html#release-cadence
- Jetson Nano Ubuntu 18.04 only: https://forums.developer.nvidia.com/t/trying-to-install-ubuntu-20-or-22-on-jetson-nano-2gb/327491/2
- CoconutMacaroon's Blender compilation instructions for Arm Linux: https://github.com/CoconutMacaroon/blender-arm64/
- FCLC's post on FLOPs: https://bsky.app/profile/fclc.bsky.social/post/3lc4qpte3ys2o
- John Carmack's tweet on the 'petaflop': https://x.com/ID_AA_Carmack/status/1982831774850748825
- Top500 list for June 2005: https://www.top500.org/lists/top500/list/2005/06/?page=4
- Exo blog post on DGX Spark + Mac Studio: https://blog.exolabs.net/nvidia-dgx-spark/

Support me on Patreon: https://www.patreon.com/geerlingguy
Sponsor me on GitHub: https://github.com/sponsors/geerlingguy
Merch: https://www.redshirtjeff.com/
2nd Channel: https://www.youtube.com/@GeerlingEngineering
3rd Channel: https://www.youtube.com/@Level2Jeff

Contents:
00:00 - It's not a mini PC
01:07 - It's not a gaming PC
03:06 - It's an Arm Linux PC
03:49 - Improvements over the DGX Spark
04:28 - Grace CPU
05:19 - DGX OS and a concern
07:00 - Benchmarks (and FLOPs)
08:40 - Dual 200 Gbps networking
10:07 - AI on the GB10
