How to Build a PC Wisely—A Complete Guide to PC Building (Part 2)

#4 Power Supply Selection

This is, in my opinion, the most essential non-semiconductor component of the entire system, as it delivers power to every other part. If the power supply unit is of poor quality, it might end up dying and frying all your other precious components; even worse, some of the truly terrible ones can catch fire when overloaded and burn your house down. So make sure to purchase a well-built power supply from a reputable brand, and don’t cheap out on it.

One of the most critical measures of a good power supply is its efficiency. The table below assumes a high-end mainstream system that draws about 400W under a combined load (gaming or workstation workload) and shows the power you would pull from the wall socket (AC input) versus the watts lost as heat. To calculate the watts lost yourself, divide the component power draw (the DC output) by the power supply efficiency to get the AC input, then subtract the DC output from that AC input.

                                     DC Output   AC Input   Efficiency   Watts Lost
Minimum ATX Efficiency Requirement   400W        667W       60%          267W
Typical low cost PSU                 400W        571W       70%          171W
80 PLUS PSU                          400W        500W       80%          100W
80 PLUS Bronze PSU                   400W        471W       85%          71W
80 PLUS Silver PSU                   400W        455W       88%          55W
80 PLUS Gold PSU                     400W        444W       90%          44W
80 PLUS Platinum                     400W        435W       92%          35W
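
If you want to reproduce the watts-lost numbers yourself, here is a minimal Python sketch of that calculation (the 400W load and the efficiency tiers are taken from the table above):

```python
# Rough sketch: AC input and heat loss for a given DC load and PSU efficiency.
# The 400W load and efficiency tiers match the table above.
def psu_loss(dc_watts, efficiency):
    ac_input = dc_watts / efficiency      # power pulled from the wall
    watts_lost = ac_input - dc_watts      # the difference is dissipated as heat
    return ac_input, watts_lost

for label, eff in [("Minimum ATX", 0.60), ("80 PLUS", 0.80),
                   ("80 PLUS Gold", 0.90), ("80 PLUS Platinum", 0.92)]:
    ac, lost = psu_loss(400, eff)
    print(f"{label:18s} AC input {ac:5.0f}W, lost as heat {lost:4.0f}W")
```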

Choosing a highly efficient power supply is absolutely crucial for workstation computers, as they are usually on for a long time doing lots of calculations. Here we assume the PC spends 12 hours a day computing on both CPU and GPU, pulling a combined 500W; for the rest of the day the owner games on it for 4 hours at 400W and leaves it idle for the other 8 hours at about 80W, which works out to an average draw of 343.33W across the day. The table below uses that average to estimate the electricity cost per year at different efficiency levels, assuming an electricity price of 12¢/kWh; the actual cost will vary depending on where you live.

Efficiency Level                           kWh/year       Cost per Year
Minimum ATX Efficiency Requirement (60%)   5012.67 kWh    $601.52
Typical low cost PSU (70%)                 4296.57 kWh    $515.59
80 PLUS PSU (80%)                          3759.50 kWh    $451.14
80 PLUS Bronze PSU (85%)                   3538.35 kWh    $424.60
80 PLUS Silver PSU (88%)                   3417.73 kWh    $410.13
80 PLUS Gold PSU (90%)                     3341.78 kWh    $401.01
80 PLUS Platinum (92%)                     3269.13 kWh    $392.30
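
Here is a small Python sketch of how those yearly figures are derived, assuming the 12h/4h/8h usage profile and the 12¢/kWh price described above:

```python
# Sketch of the yearly cost estimate above: 12h compute at 500W, 4h gaming at
# 400W, 8h idle at 80W, billed at $0.12/kWh. Efficiency is the only variable.
HOURS_WATTS = [(12, 500), (4, 400), (8, 80)]              # daily usage profile
PRICE_PER_KWH = 0.12

avg_dc_watts = sum(h * w for h, w in HOURS_WATTS) / 24    # ~343.33W average
dc_kwh_year = avg_dc_watts * 24 * 365 / 1000              # ~3007.6 kWh at the components

for label, eff in [("Minimum ATX", 0.60), ("80 PLUS Bronze", 0.85),
                   ("80 PLUS Platinum", 0.92)]:
    wall_kwh = dc_kwh_year / eff                          # what the meter actually sees
    print(f"{label:16s} {wall_kwh:7.1f} kWh/year, ${wall_kwh * PRICE_PER_KWH:.2f}/year")
```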

Another important measure for a good power supply purchase is the maximum power it can actually deliver. Firstly, decide whether you will overclock the system or not. If you won’t, add up the TDP of all your components and multiply the total by 2; that way your typical power consumption lands near the dead center of the power supply’s efficiency curve, which is usually where efficiency is highest.
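
As a quick sketch of that sizing rule (the TDP numbers below are placeholders for illustration, not recommendations):

```python
# Sizing rule of thumb from above: sum the component TDPs, then multiply by 2
# so that typical load sits near 50% of the PSU's capacity (the efficiency peak).
component_tdp = {"CPU": 95, "GPU": 225, "motherboard/RAM/drives/fans": 60}  # hypothetical build
combined_tdp = sum(component_tdp.values())
recommended_psu = combined_tdp * 2
print(f"Combined TDP ~{combined_tdp}W -> look for a PSU around {recommended_psu}W")
```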

[Image: PSU efficiency curve] This graph shows a Platinum-rated power supply’s efficiency curve with 115V AC input and with 230V AC input; running on 230V AC is slightly more efficient than on 115V AC. Image source: Corsair.com

Reputable power supply brands include EVGA, Corsair, Seasonic, Cooler Master, Thermaltake, and Silverstone (for SFF builds). Which one you buy is up to you, and there often isn’t much difference between them, since many of their units are built by the same OEM (Super Flower).

#5 Graphics Cards Selection

Overview

The GPU is the component that delivers the most single-precision floating point throughput, also known as FP32. The metric used to measure it is TFLOPS (tera floating-point operations per second), or GFLOPS at lower speeds, where 1000 GFLOPS = 1 TFLOPS. There are also applications that take advantage of half-precision (FP16) throughput, such as artificial intelligence training and inference: because of the vast amount of data flowing through a neural network, higher numerical precision is generally not required to achieve the same or very similar results. And since the precision is reduced, modern GPUs can handle 16-bit floats faster than 32-bit floats; for example, AMD’s Vega architecture and Nvidia’s Pascal architecture (server parts only) run FP16 at a 2:1 ratio over FP32, while Nvidia’s Turing and Volta architectures can reach roughly 4-5x the FP32 rate for FP16 work, making them very advantageous for AI and deep learning. For scientific workloads that require high numerical precision, FP64 can be a necessity, and among consumer GPUs, AMD’s Vega and 2nd-generation Vega (Radeon VII) have significantly higher FP64 throughput than Nvidia’s gaming counterparts.
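
To illustrate roughly where those TFLOPS figures come from, here is a simplified sketch: each shader retires one fused multiply-add (two floating-point operations) per clock, and the FP16 rate is whatever multiple the architecture supports. The shader counts and clocks below are illustrative, not exact specs for any card:

```python
# Sketch: theoretical FP32 throughput = shaders * 2 ops (FMA) * clock.
def tflops_fp32(shaders, clock_ghz):
    return shaders * 2 * clock_ghz / 1000

def tflops_fp16(shaders, clock_ghz, fp16_ratio=2):
    # fp16_ratio = 2 for architectures that run FP16 at double rate (e.g. Vega)
    return tflops_fp32(shaders, clock_ghz) * fp16_ratio

print(tflops_fp32(4096, 1.5))      # a 64 CU / 4096-shader GPU at 1.5GHz -> ~12.3 TFLOPS FP32
print(tflops_fp16(4096, 1.5, 2))   # the same GPU with double-rate FP16 -> ~24.6 TFLOPS FP16
```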

There are also CPUs with integrated GPUs (iGPUs), such as AMD’s A-series CPUs and Ryzen APUs, as well as most Intel CPUs (except the HEDT platform and the models ending in an F). Intel’s iGPUs can manage around 30FPS at 720P low settings in AAA titles and roughly 1080P low or 720P high in eSports games, and they are good enough for casual indie games. The Vega iGPU in Ryzen G-series CPUs is significantly more powerful than Intel’s iGPU, but it lacks Quick Sync: if you are editing video in Premiere, encoding can be accelerated on Intel iGPUs but not on AMD’s APUs, and even just watching videos online can take advantage of Intel Quick Sync.

Advanced Details (skip if you don’t want more details)

Gaming:

Firstly, we will talk about gaming. For a gaming graphics card, you really don’t want to judge by the TFLOPS values, as they are basically irrelevant for determining actual performance across different titles. They do, however, give you a ballpark of how many compute units (essentially the GPU equivalent of CPU cores) a graphics card has, and generally more cores mean higher framerates for the SAME architecture. Nvidia and AMD graphics cards use different architectures, each with different gaming performance and clock speeds even at identical core counts and memory configurations, so they can’t be compared directly on specs alone. On the Nvidia side, the current generations of GPUs are based on the Pascal and Turing architectures, with Pascal being largely derived from the older Maxwell architecture but built on a more advanced process node (16/14nm versus Maxwell’s 28nm) and clocked higher thanks to optimized silicon timings. In the AMD camp, all of their current GPUs are based on the GCN architecture, which has remained largely unchanged since 2011, with the most significant revision being GCN 5th gen (the Vega architecture) released in 2017.


Here is a list comparing the differences between the architectures from the two companies (AMD GCN vs Nvidia Maxwell/Pascal/Turing):

  • AMD GCN is limited to 64 CUs, which means it is also limited to 4096 stream processors (basically equivalent to CUDA cores, as both are vector SIMD units), 256 TMUs (texture mapping units), and 64 ROPs (render output units, which are the primary factor in gaming performance).
  • GCN did not use tile-based rasterization until the Vega architecture, whereas Nvidia first implemented it back in 2014 with the Maxwell architecture.
  • Generally, Nvidia GPUs have better VRAM compression algorithms than AMD’s GCN parts, meaning they need less memory bandwidth and can get away with GDDR memory instead of the HBM that AMD has to use to alleviate its memory bandwidth limitation. Better compression also means less VRAM usage in games, which makes Nvidia’s framebuffer use more efficient than AMD’s.
  • Nvidia GPUs run at significantly higher clock speeds than AMD GPUs on the same process node, at both the low end and the high end. This was a long-standing shortcoming of the GCN architecture, which only started breaking the 1.1GHz barrier with GCN 4th gen (Polaris) and saw a significant clock speed uplift with Vega. AMD only managed to match or beat Nvidia’s clock speeds with the 7nm Vega-based Radeon VII, which roughly ties the similarly priced (and similar ROP count) RTX 2080, but with significantly faster memory and a higher core count.
  • Although not very useful from a gaming perspective, the RT hardware in Nvidia’s RTX series can be used for other applications, including 3D graphics and modeling, as it significantly accelerates ray tracing workloads (not necessarily path tracing, which is a different algorithm). Turing also supports double-rate FP16, which accelerates AI workloads; however, AMD has had double-rate FP16 since GCN 5th gen as well.
  • Finally, if you stream video games, don’t have a CPU powerful enough to encode video while gaming, and are willing to sacrifice some video quality, an Nvidia GPU is definitely the choice. Nvidia’s NVENC encoder is much stronger than AMD’s VCE, which is especially apparent on the GeForce RTX lineup, as Turing improved the encoder even over Pascal’s already good NVENC. The quality is still perceivably lower than CPU encoding, but as long as quality is not your top concern, you can save a lot of money this way instead of buying an expensive HEDT CPU to pair with an already expensive GPU.

To summarize, AMD GPUs are limited to 64 ROPs and somewhat limited by their geometry throughput of 4 triangles/clock. However, for most people who aren’t going for the absolute high end, from a pure gaming standpoint, AMD cards such as the Vega 56, RX 580, and RX 570 are all great purchases compared to their Nvidia counterparts. For the rest of the stack, the Nvidia RTX 2060, 2080, and 2080Ti are clearly the strongest cards in their respective tiers, though in my opinion the 2070 and 2080 are not worth their price, especially the RTX 2070: if you have the budget for a 2080, you might as well step up to a 2080Ti and cheap out a bit more on the CPU, or purchase a used 1080Ti. Against the 2070, a Vega 56 is the better contender if you don’t care about power draw and plan to overclock, and the 2060 comes fairly close while being a good bit cheaper, and can pretty much overclock to the performance of a stock 2070, with both cards’ RTX features practically useless for gaming anyway.

Update: With the launch of Nvidia’s RTX Super series and AMD’s Navi graphics cards, the recommendation list has changed significantly. The RTX 2060 Super makes the old RTX 2060 and RTX 2070 obsolete, and the RTX 2070 Super does the same to the old RTX 2080. On the AMD side, the RX 5700 supersedes the RX Vega series entirely and the 5700XT supersedes the Radeon VII. The RX 5700XT comes within 7% of an RTX 2070 Super while being $100 cheaper, so it’s strongly recommended over the Nvidia counterpart. Similarly, the RX 5700 is slightly faster than the RTX 2060 Super while being $50 cheaper, making it the superior buy for traditional gaming. Another big improvement in the RX 5700 series is efficiency, as these cards draw almost the same amount of power as their Nvidia counterparts. If you do want hardware-accelerated ray tracing in games, then the RTX 2070 Super is the best choice in the lineup.

Here is a list of all my recommendation for a certain resolution and frame rate combination:

  1. 1080P/60Hz: AMD RX 570 / RX 580 or Nvidia GTX 1660
  2. 1080P/144Hz or 1440P/60Hz: AMD Radeon RX 5700 or Nvidia RTX 2060 Super
  3. 4K/60Hz: AMD Radeon RX 5700XT or Nvidia RTX 2070 Super
  4. 4K/>60Hz or 1440P/144Hz: Nvidia RTX 2080Ti

Compute:

Using GPUs for compute purposes is entirely different from gaming: the relevant metrics are no longer primarily TMU, ROP, and geometry counts, but mostly core counts. That’s not the whole story either, as each workload differs dramatically from the rest; some depend on single precision, some on double precision, some on half precision, some on memory bandwidth, and some on a combination of these. The one thing you really need to know is what type of workload you are dealing with, and buy a card based on that workload rather than a general TFLOPS figure published by a particular company. Another important aspect of buying a GPU for compute is whether your software supports CUDA only or both CUDA and OpenCL: AMD GPUs only work with OpenCL and can’t run CUDA workloads, while Nvidia GPUs are poorly optimized for OpenCL but can run it, and will obviously run any CUDA workload too.
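
If you’re not sure what your current card exposes, one quick way to check is to enumerate the OpenCL platforms and devices on your machine. This is a sketch assuming the third-party pyopencl package is installed and your GPU driver ships an OpenCL runtime:

```python
# Sketch: enumerate OpenCL platforms/devices to see what a compute framework
# would find on this machine. Requires the pyopencl package and a GPU driver
# that provides an OpenCL runtime.
import pyopencl as cl

for platform in cl.get_platforms():
    for device in platform.get_devices():
        vram_gb = device.global_mem_size / 1024**3
        print(f"{platform.name}: {device.name} ({vram_gb:.1f} GB memory)")
```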

For comparison and simplicity, I will split the workloads into 3 basic categories: Category I, II, and III. Category I is low precision (FP32/FP16) work that can also lean heavily on memory bandwidth and capacity, such as deep learning, rendering, and video editing. Category II is low precision, low memory work; this covers purely arithmetic jobs built on basic numerical operations, such as calculating hash functions, which includes most mining algorithms. Category III is high precision, high memory work, which consists of large-number multiplication, large FFTs (>1024K), financial analysis (depending on the algorithm), and almost all scientific computing workloads that require large data sets and high precision.

Category I:


  • This category is primarily focused on single precision (FP32) or half precision (FP16) workloads. Usually, these workloads can be very dependent on memory bandwidth and capacity, but it depends entirely on the use case. Typical workloads of this type include
    • Deep learning and AI (Training and Inference)
    • 3D rendering and CGI (Path tracing and Ray Tracing)
    • Video editing (especially color correction and effects)
    • Image editing (filters and enhancements)
    • Mining algorithms such as Ethereum and Monero
  • For these reasons, it is generally recommended to get a gaming graphics card, as these programs’ requirements are practically satisfied by any gaming GPU, primarily on the FP32 performance and memory bandwidth front. Gaming cards are also significantly cheaper than their workstation/server counterparts, making them substantially better value for these workloads, since you generally won’t get any perceivable speedup from the pro cards. (If you want to see how much your own card gains from FP16, there is a quick benchmark sketch after this list.)
    • In this case, the best GPU on the Nvidia side is the RTX 2080Ti, which is terrific for CUDA-only software. The best GPU on the AMD side is the Radeon VII, with its exceptionally high memory bandwidth and capacity as well as decent FP32 performance (nearing 15TFLOPS after an overclock), making it a great contender for programs that support OpenCL compute.
  • For mining, you really can’t go wrong with the Radeon VII, as it is the absolute best GPU for the algorithms discussed in this category (Ethereum and Monero); it even beats other HBM2-equipped Nvidia GPUs such as the Titan V by up to 25%.
  • Some 3D modeling applications require pro GPUs such as Radeon Pro and Quadro to run well, as those cards have optimized driver code for OpenGL viewports, making modeling significantly smoother than on the gaming counterparts. But most users of Maya, Blender, or Cinema4D will be able to get away with a gaming GPU.
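
As promised above, here is a rough benchmark sketch for measuring how much your own card gains from half precision. It assumes the PyTorch package and a CUDA-capable GPU; OpenCL-only cards would need a different framework:

```python
# Rough sketch: compare FP32 vs FP16 matrix-multiply throughput on a CUDA GPU.
# Assumes PyTorch is installed and a CUDA device is available.
import time
import torch

def bench(dtype, n=4096, iters=20):
    a = torch.randn(n, n, device="cuda", dtype=dtype)
    b = torch.randn(n, n, device="cuda", dtype=dtype)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        _ = a @ b
    torch.cuda.synchronize()
    elapsed = time.time() - start
    return iters * 2 * n**3 / elapsed / 1e12   # TFLOPS (2*n^3 ops per matmul)

print(f"FP32: {bench(torch.float32):.1f} TFLOPS")
print(f"FP16: {bench(torch.float16):.1f} TFLOPS")
```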

Category II:

[Image: hash functions]

  • This category is primarily focused on FP32 or FP16 compute performance, and it really isn’t dependent on memory bandwidth or capacity. Typical workloads of this category include
    • General coin mining algorithms (SHA-256 (Bitcoin), Lyra2REv2, X11, etc.)
    • Password cracking
    • Large scale encryption
    • Practically all hashing algorithms
  • For these workloads, there’s no point going for workstation/server GPUs with ECC memory: with this much data being processed at once, numerical precision doesn’t matter much, and neither does the occasional memory error. Hence the best GPUs here are the ones with pure FP32 or FP16 horsepower, such as the Radeon RX Vega 64, Radeon VII, RTX 2080Ti, and Titan V. The AMD cards are the best value, since the Vega 64 is usually around $400 and the Radeon VII about $700, while the 2080Ti is around $1200 and the Titan V, at $3000, is complete overkill. These algorithms are easy to parallelize and scale nearly 100% across multiple GPUs (see the sketch after this list), so you can buy several Vega 64s or Radeon VIIs and significantly exceed the performance of a single Titan V.
  • There are exceptions where AMD hardware falls behind, however. For example, the Trial Factoring workload for GIMPS can execute FP32 and INT32 instructions simultaneously, so Nvidia’s Volta and Turing cards get up to a 3x speedup over the Radeon and Pascal counterparts. For that specific workload there’s really no contest: buy an Nvidia Volta/Turing card, with the Titan V performing the best of any of them.
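
To illustrate why these hashing workloads scale almost perfectly across GPUs, here is a toy CPU-side sketch: the search space is simply split into independent chunks with no communication between workers. hashlib on CPU cores stands in for the GPU kernel, and all names and values are made up for illustration:

```python
# Toy sketch of why hashing scales near-linearly: each worker gets its own
# slice of the nonce range and never needs to talk to the others. A real miner
# does this across GPU shaders; CPU processes just illustrate the pattern.
import hashlib
from multiprocessing import Pool

def search_range(args):
    start, end, target_prefix = args
    for nonce in range(start, end):
        digest = hashlib.sha256(f"block-data-{nonce}".encode()).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest          # first hit found in this slice
    return None

if __name__ == "__main__":
    total, workers, prefix = 2_000_000, 4, "0000"
    chunk = total // workers
    jobs = [(i * chunk, (i + 1) * chunk, prefix) for i in range(workers)]
    with Pool(workers) as pool:
        hits = [r for r in pool.map(search_range, jobs) if r]
    print(hits[:3])
```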

Category III:

[Image: big numbers]

  • This category is primarily FP64 computation, with some algorithms also being extremely demanding on memory bandwidth. Most of these algorithms need a decent amount of memory because the data sets they work on are usually quite large. Typical workloads of this category include
    • Financial analysis
    • Scientific computing
    • FFT workloads (size larger than 1024K)
  • These workloads benefit the most from a pro-grade GPU, as those cards offer ECC and far superior FP64 performance compared to their gaming counterparts. The fastest cards in this category are the Quadro GV100, Tesla V100, and Radeon Instinct MI50/MI60: they all offer a large HBM buffer with lots of memory bandwidth and capacity, as well as a 1:2 FP64:FP32 ratio, putting them all near the 7.5TFLOPS mark for FP64. With ECC enabled, they are also virtually error-free compared to the gaming counterparts.
  • The best value card for this workload is definitely the Radeon VII: it is only $700, yet it has nearly 3.5TFLOPS of FP64 performance along with 1TB/s of memory bandwidth and 16GB of memory. That means it can handle large data sets, heavy FP64 math, and large data transfers all at once. In the most memory-demanding situations, using FFT sizes larger than 4096K, it is even faster than the Titan V by over 25% once both are overclocked, making it a far better value than the $3000 Titan. If ECC is not a necessity, the Titan V remains the fastest FP64 compute card, especially overclocked, where it can reach nearly 10TFLOPS of FP64 (a quick ratio calculation is sketched after this list).
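
Here is that ratio calculation as a tiny sketch; the FP32 figures are approximate and only meant to show how the FP64:FP32 ratio translates into the numbers quoted in this section:

```python
# Sketch: estimate FP64 throughput from FP32 throughput and the FP64:FP32 ratio.
# Pro cards discussed above run at 1:2, the Radeon VII at 1:4.
def fp64_tflops(fp32_tflops, ratio):
    return fp32_tflops * ratio

print(fp64_tflops(15.0, 1 / 2))   # a ~15 TFLOPS FP32 pro card -> ~7.5 TFLOPS FP64
print(fp64_tflops(13.8, 1 / 4))   # a Radeon VII-class card -> ~3.5 TFLOPS FP64
```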

#6 Storage Selection

There are two main choices of storage to look at: spinning mechanical hard drives (HDDs) and solid state drives (SSDs). Each has its own advantages and disadvantages, and each serves a different purpose depending on how you are going to use it.

The hard drive selection is extremely simple, as the four large hard drive brands, Seagate, Western Digital (WD), Hitachi, and Toshiba, all offer similarly performing drives with similar build quality, price, and failure rates. Because a hard drive is a mechanical device with physically moving parts, it is more prone to failure than an SSD, which reads and writes data on chips instead. It is also significantly slower than an SSD in both sequential and random reads and writes, and its response time is far higher as well. This means you should never use an HDD as a boot drive, as it greatly slows down system responsiveness and makes launching applications and booting the computer take a long time. However, hard drives are great for archival storage, since those files are usually written once and read rarely, and the random-read weakness of a hard drive barely shows up there because most archives are large, sequential data.

There are two types of SSDs: one uses the SATA protocol and the other uses the NVMe protocol, and both have their own advantages. SATA SSDs read and write a lot slower than their NVMe counterparts, since they are limited by SATA III bandwidth rather than the PCIe x4 link NVMe drives use; an NVMe SSD reads at around 3000MB/s while a SATA one manages only around 530MB/s. That is still a great deal faster than any hard drive on the market, but not fast enough if you do a lot of really heavy sequential reads and writes. On the other hand, SATA drives usually offer more capacity for the same price, produce less heat, and draw less power. For daily use, SATA SSDs generally make no noticeable difference compared to NVMe ones; NVMe drives launch applications and boot Windows only marginally faster than a regular SATA SSD, so they’re not worth the price premium unless they are priced very similarly, which is generally not the case.
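
To put those sequential speeds in perspective, a quick back-of-the-envelope calculation (the 50GB file size and the ~150MB/s HDD figure are just illustrative assumptions, not from any specific drive):

```python
# Sketch: how long a large sequential read takes at typical HDD / SATA / NVMe
# speeds. Purely illustrative arithmetic using the throughput figures above.
file_gb = 50
for name, mb_per_s in [("HDD", 150), ("SATA SSD", 530), ("NVMe SSD", 3000)]:
    seconds = file_gb * 1024 / mb_per_s
    print(f"{name:9s} ~{seconds:5.0f}s for a {file_gb}GB sequential read")
```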

When purchasing SSDs, try not to aim for the cheapest options, as they generally have unstable write speeds and may also lack a DRAM cache inside the drive, which can significantly speed up burst writes since DRAM is much faster than the TLC NAND cells those cheaper drives use. However, it is also not really worth purchasing an expensive MLC or even SLC drive, as they generally offer little to no performance increase for the typical user unless the workload is extremely write-intensive, such as NAS and other server usage.

SATA SSDs that perform well at a decent price include the Crucial MX500 series, the Samsung 860 series (whether in mSATA, M.2, or regular 2.5-inch form factor, they perform identically), and the WD Blue series.

For storage, you generally want an SSD paired with an HDD if you need extra room for video files, images, or game libraries; if you don’t need that extra capacity and want to save some money, there’s no point purchasing an extra hard drive.

There are also caching options that let you store small files on the faster but smaller SSD while keeping large, sequentially read files on the hard drive. For this you can use either Intel’s Optane solution or AMD’s StoreMI; both work, but the Intel Optane version is limited to Optane sticks, which only come in 16GB or 32GB capacities, tiny even compared to SSDs. AMD’s StoreMI solution can use a regular SSD as the caching drive to speed up transfers of smaller files, but it requires an AMD X-series chipset, and the chipset has to be X470 or newer, as X370 users need to purchase the StoreMI software separately. Since SSD prices have dropped significantly, there’s usually no point buying a small hard drive (below 2TB) anyway; you can often get a 1TB SSD for around the same price as a 512GB SSD plus a 1TB hard drive, and the pure SSD option will be significantly faster and more power efficient.