Deep Fundamental Research

Deep Dive: Liquid Cooling

The key infrastructure trend supporting the next generation of GPU clusters; how companies like Vertiv serve hyperscalers and compete in the market

Patrick Zhou
Feb 09, 2025


Previously, I’ve written three pieces on data center infrastructure: 1) optical module deep dive, 2) data center networking, and 3) switch deep dive. Today, I’ll cover another crucial topic in data center deployment — the growing trend of liquid cooling for next generation AI clusters.

We will begin by exploring the transition from traditional air-cooled systems to next-generation liquid cooling. Next, we’ll examine the various types of liquid cooling, along with their pros and cons. From there, we’ll delve into the key components of liquid cooling systems. Finally, we will analyze the competitive landscape across different segments, including core vendors, moats, customer decision criteria, market share, and competition dynamics.

Subscribe to receive 10-20 in-depth analyses per year on key artificial intelligence sectors and top-tier AI companies.


Part 1. The Liquid Cooling Transition

1.1. Air Cooling vs. Liquid Cooling

Traditionally, data centers have relied on air cooling to control facility temperatures. In an air-cooled system, computer-room air conditioners, chillers, and fans cool down the IT equipment.

One big challenge of air cooling is its significant power consumption from fans (server fans and chiller fans), facility cooling, and other non-IT loads. According to Semianalysis, these non-IT loads add ~45% additional power per watt delivered to the chips in Microsoft’s largest H100-based training cluster, resulting in a high PUE (Power Usage Effectiveness, defined as the ratio of total facility power to IT equipment power) of 1.223 for the cluster. Note that the PUE calculation counts server fan power, which accounts for 15%+ of server power, as part of IT equipment power, which already flatters air-cooled systems’ PUE figures.
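To see how the ~45% overhead and the 1.223 PUE figure fit together, here is a minimal sketch in Python. The server-fan draw (~18-19% of chip power, i.e. roughly 15-16% of server power) is my own assumption for illustration, not a measured value from the report.

```python
# Minimal sketch: reconciling a ~45% non-IT overhead with a ~1.22 PUE,
# given that PUE counts server fans as IT load. Fan share is assumed.

chip_power = 1.00                        # normalize to 1 W delivered to the chips
total_overhead = 0.45                    # fans + facility cooling + other non-IT loads
fan_share_of_chip = 0.186                # assumed: roughly 15-16% of server power

total_facility_power = chip_power * (1 + total_overhead)   # 1.45 W per chip watt
it_power = chip_power * (1 + fan_share_of_chip)            # fans counted as "IT"

pue = total_facility_power / it_power
print(f"PUE ≈ {pue:.3f}")                # ≈ 1.223, matching the cited figure
```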

However, the even greater issue for air cooling is its inherent limitation in handling the increasing power density of AI racks. Air cooling remains effective for racks with power density below roughly 25 kW; beyond this threshold, it struggles to maintain server and GPU temperatures within optimal ranges (18–27°C for servers and below 60°C for GPUs). Today’s AI data centers easily feature GPU racks exceeding 25 kW. For instance, a typical H100 rack includes four 8-H100 servers, totaling over 40 kW per rack – a level that is already very challenging for air cooling (see the rough estimate below).
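A back-of-the-envelope version of that per-rack estimate, assuming roughly 4.6 kW of non-GPU load per server (CPUs, memory, NICs, fans, PSU losses) – an illustrative assumption, not a vendor specification:

```python
# Rough power estimate for an air-cooled rack of four 8-GPU H100 servers.
# The non-GPU overhead per server is an assumption for illustration.

gpu_tdp_kw = 0.700                  # H100 SXM TDP
gpus_per_server = 8
non_gpu_overhead_kw = 4.6           # assumed CPUs, DRAM, NICs, fans, PSU losses

server_kw = gpu_tdp_kw * gpus_per_server + non_gpu_overhead_kw   # ≈ 10.2 kW
rack_kw = 4 * server_kw                                          # ≈ 41 kW
print(f"Estimated rack power: ~{rack_kw:.0f} kW vs. a ~25 kW air-cooling comfort zone")
```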

As the industry moves toward next-generation Blackwell systems and even more power-dense rack configurations, air cooling will no longer be able to maintain adequate thermal conditions for servers and chips. Consequently, liquid cooling has become an essential trend to support the scaling of AI data centers.

Compared to air cooling, liquid cooling offers several advantages:

  • Higher efficiency: Water (or other coolants) has a much higher thermal conductivity and specific heat capacity than air, allowing it to absorb and carry away more heat in less time

  • Reduced energy consumption: Liquid cooling reduces the need for excessive airflow (and large fans) and other power-intensive facility cooling equipment, enhancing the overall energy efficiency of the system. Using the example above, while Microsoft’s air-cooled systems require 45% additional non-IT power per watt delivered to the chips, Google’s liquid-cooled data centers operate with much lower non-IT loads of c.15%.

  • Space saving: Liquid cooling allows for more compact server and rack configurations; according to a study by Schneider, liquid cooling helps save 10-14% of space vs. air cooling

  • Extending equipment lifespan: Liquid cooling helps maintain more consistent temperature conditions for IT equipment, which improves hardware performance and longevity. A study from LiquidStack shows that liquid cooling extends the operational lifespan of equipment to 20-30 years from typical ~8 years with air cooling


1.2. Liquid-to-Air vs. Liquid-to-Liquid

In the early stages of liquid cooling adoption, most data centers use liquid-to-air (L2A) cooling, where liquid absorbs heat from IT equipment (GPUs, CPUs, etc.) and dissipates it into the air through a radiator or heat exchanger. The advantage of L2A is that it’s easier and quicker to implement.

The limitation, however, is the limited power density it can handle, due to the lower heat capacity and thermal conductivity of air. Currently, one L2A unit can manage about 70-100 kW, enough for a GB200 NVL36 rack (~70 kW per rack) but challenging for the more popular NVL72 configuration (100+ kW per rack). As a result, L2A is mainly used in smaller data centers with limited-scale deployments.

In contrast, in liquid-to-liquid (L2L) cooling, liquid absorbs heat from IT equipment and passes through a liquid-to-liquid heat exchanger housed within a Coolant Distribution Unit (CDU), where a secondary liquid carries the heat away. As AI data centers evolve toward higher-density workloads, L2L is clearly the path forward: it can handle much higher-power systems and offers higher cooling efficiency. With scale, L2L is even more economical to deploy than L2A: a typical L2L CDU has 8-10 times the capacity of an L2A CDU, yet costs less than twice as much.
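A quick sketch of the cost-per-kW logic implied by that last sentence. The specific capacity and price multipliers below are placeholders within the ranges stated above, not quoted figures:

```python
# Normalized cost-per-kW comparison: L2L CDU with ~8-10x capacity at <2x price.

l2a_capacity_kw = 80                    # assumed L2A capacity (70-100 kW range)
l2a_price = 1.0                         # normalize the L2A CDU price to 1.0

l2l_capacity_kw = 9 * l2a_capacity_kw   # ~8-10x the capacity
l2l_price = 1.9 * l2a_price             # "less than twice" the price

print(f"L2A: {l2a_price / l2a_capacity_kw:.4f} price units per kW")
print(f"L2L: {l2l_price / l2l_capacity_kw:.4f} price units per kW")
# => at scale, L2L delivers roughly 4-5x more cooling capacity per dollar
```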


1.3. Direct-to-Chip (DTC) vs. Immersion Cooling

Liquid cooling can also be categorized as Direct-to-Chip (DTC) or Immersion cooling.

DTC cooling primarily utilizes cold plates that make direct metal-to-metal contact with the compute chips, allowing coolant to flow through the tubes to absorb the heat from the chips. The heated coolant then travels through manifolds to a CDU, where it undergoes a second heat exchange loop with a secondary coolant. The cooled primary coolant is then pumped back to the server side through manifolds by the CDU.

Non-compute components such as VRMs, DIMMs, storage controllers, and other onboard electronics rely on air cooling to stay within safe temperature limits. Overall, compute chips generate c.80%+ of the total server heat, so the majority of the heat is dissipated through the liquid loop while the remaining <20% is still managed by air cooling.

Coolant: Common coolants used in DTC include deionized water, pure water, and P25 (25% ethylene glycol + 75% water). These coolants are non-conductive, which reduces the risk that a leak causes a short circuit on the server board. Overall, the key components within a DTC system are: 1) CDUs, 2) manifolds, and 3) cold plates.

Compared to Immersion cooling (discussed later), DTC offers several major advantages:

  • Compatibility: DTC requires fewer modifications from and is more adaptable to existing data center infrastructure, making it easier to build

  • Cost: Based on insights from Tegus experts and my own channel checks, DTC systems cost about $300–$500 per kW to build, vs. $1,000+ per kW for immersion cooling. DTC also benefits from lower coolant cost: cold plate coolants range from $1.50 to $3.00 per liter, whereas the least expensive immersion option, synthetic oil, costs $10-$13 per liter. Immersion also requires a higher volume of coolant – for example, a 130 kW immersion cooling unit requires c.1,000 liters of coolant vs. c.600 liters for a DTC system (see the rough comparison after this list) – and the volume disparity becomes even greater when transitioning from Single-Phase to Two-Phase cooling, a topic we will discuss later

  • Maintenance: Operating and maintaining a DTC system is simpler and cheaper than doing so for an immersion system.
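For a rough sense of the coolant-cost gap mentioned in the cost bullet above, here is a sketch using the quoted volumes and per-liter prices (the midpoints are my own simplification):

```python
# Illustrative coolant fill cost for a ~130 kW deployment, DTC vs. immersion.

dtc_liters, dtc_price_per_l = 600, 2.25        # ~600 L at $1.50-3.00/L
imm_liters, imm_price_per_l = 1000, 11.5       # ~1,000 L of synthetic oil at $10-13/L

dtc_cost = dtc_liters * dtc_price_per_l        # ≈ $1,350
imm_cost = imm_liters * imm_price_per_l        # ≈ $11,500
print(f"DTC coolant fill: ~${dtc_cost:,.0f}")
print(f"Immersion coolant fill: ~${imm_cost:,.0f} (~{imm_cost / dtc_cost:.1f}x)")
```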

However, DTC also comes with a few disadvantages:

  • Leakage risk: Coolant leakage in DTC systems can lead to short circuits on chip boards (although most DTC coolants are non-conductive in nature, sediment and residue on servers can change their properties), potentially causing costly damage; in contrast, immersion cooling systems are designed from the outset for boards to be fully submerged in coolant.

  • Limited redundancy: DTC systems face challenges in building redundancy due to space constraints on the servers; typically, only one small cold plate is paired with each compute chip (CPU/GPU), and there is little room to host additional plates. If a cold plate malfunctions, the affected chip can overheat, potentially significantly impacting system performance

Immersion cooling involves submerging electronic components directly into a dielectric cooling fluid within a tank. As the components generate heat, the coolant absorbs it and rises to the top of the tank. The heated coolant is then pumped out, passing through manifolds into the CDU, where it undergoes a liquid-to-liquid heat exchange, similar to the process in DTC cooling. The cooled coolant is then returned to the tank to continue circulating and absorbing heat. Overall, the key components for immersion cooling are: 1) CDUs, 2) manifolds, and 3) the tank (containing the coolant).

Since the entire board is submerged, the coolant absorbs essentially ALL the heat generated by the IT equipment – as a result, there’s no need for additional air-cooling equipment or server fans, which consume large amounts of electricity.

Coolant: Fluorinert is a preferred coolant for immersion cooling due to its high thermal conductivity. However, it comes at a premium price of $100+ per liter, compared to synthetic oil (also used for immersion) at $10-13 per liter and just $1.5-3.0 per liter for coolants used in DTC systems. 3M ($MMM) holds a dominant position in the Fluorinert market with 90%+ share, although some Chinese companies like Juhua Co. (600160.SH) also offer products of relatively lower quality.

Synthetic oil does come at a more affordable price than Fluorinert, but it tends to have high viscosity, which hinders circulation, and a lower specific heat capacity (the amount of heat required to raise the temperature of a substance by one degree Celsius), making it less efficient at heat transfer. As a result, synthetic oils are more suitable for lower power-density use cases.

Compared to DTC, Immersion cooling’s advantages include:

  • Energy saving: Since all heat transfer occurs within the tank and its liquid loop, immersion cooling achieves a lower (better) PUE than DTC, where electronic components are still exposed to air and fans are used to remove part of the generated heat; this makes immersion cooling ideal for data centers aiming to maximize energy efficiency

  • Reduced risk of localized overheating: With all electronic components submerged in the same cooling tank, immersion cooling can maintain a more uniform temperature across components and reduce the chances of localized overheating or the development of accidental hot spots

  • Lower noise levels: since the system has no fans

However, immersion cooling also comes with drawbacks and limitations, presenting challenges to its widespread adoption:

  • Lower cooling capability (for Single-Phase immersion cooling): The coolants used in immersion cooling, especially synthetic oils, generally have lower thermal conductivity (0.1-0.2 W/m•K) than those used in DTC (~0.6 W/m•K), which means heat is transferred more slowly. DTC also features a more targeted cooling mechanism, with cold plates placed directly on compute chips, concentrating cooling capability on the key parts, whereas immersion cooling takes the more passive approach of submerging the whole board in a tank of coolant

  • Rack-level architecture change: Immersion cooling requires horizontal rack designs, while most existing data centers are built around vertical racks; implementing it would therefore require re-architecting the DCs. In contrast, DTC doesn’t require such modifications. (See the image below for the horizontal design of immersion cooling)

  • Warranty issues: Most existing server warranties, issued for current DC conditions, are voided once the hardware is submerged in an immersion tank

  • Environmental regulations: Fluorinated substances used in immersion cooling may face increasing regulatory scrutiny. For example, PFAS, a class of fluoroalkyl substances produced in large volumes by 3M in Belgium, is slated to be phased out by the Belgian government due to potential environmental risks

[Image: horizontal immersion tank layout. Source: Green Revolution Cooling]

1.4. Single-Phase vs. Two-Phase

Simply put, Single-Phase cooling means the coolant remains in liquid form throughout the cooling process. In contrast, Two-Phase cooling lets the coolant evaporate into vapor as it absorbs heat; the vapor then moves to a condenser in the CDU, where it turns back into liquid and is returned to the tank for the next cycle of circulation and cooling. Because vaporization absorbs a large amount of latent heat while holding the temperature constant at the boiling point, Two-Phase cooling offers very high heat transfer capability and efficiency, more stable thermal operating conditions, and a smaller system size (requiring less coolant).
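To make the latent-heat point concrete, here is a small sketch comparing how much heat a kilogram of water absorbs when merely warmed versus when vaporized. These are standard textbook values for water; real two-phase coolants boil at much lower temperatures, so treat this purely as an illustration of scale:

```python
# Sensible heating vs. vaporization for water (illustrative only).

c_p = 4.18            # kJ/(kg·K), specific heat of liquid water
delta_t = 15          # K, an assumed inlet-to-outlet rise in a single-phase loop
latent_heat = 2257    # kJ/kg, heat of vaporization of water at 100°C

sensible_kj = c_p * delta_t      # ≈ 63 kJ absorbed per kg without a phase change
latent_kj = latent_heat          # ≈ 2,257 kJ absorbed per kg during boiling
print(f"Sensible heating: ~{sensible_kj:.0f} kJ/kg")
print(f"Vaporization:     ~{latent_kj:.0f} kJ/kg ({latent_kj / sensible_kj:.0f}x more)")
```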

However, Two-Phase cooling also has some major challenges. First, it involves a more complex design and requires specialized coolants that boil at specific temperatures, leading to a higher cost for system setup. Second, technicians need specialized training and equipment to maintain Two-Phase systems, raising operating expenditures as well. Lastly, the vaporized coolant can pose risks to both the environment and human health, leading to stricter environmental regulations for Two-Phase cooling systems.

Because of these factors, the widespread adoption of Two-Phase cooling will take time. For instance, NVDA’s next-generation GB200/GB300 NVL72 racks will be supported by Single-Phase cooling (possibly a mixture of DTC and immersion cooling). Some industry experts predict that the Rubin generation (which follows Blackwell) will also be manageable with upgraded Single-Phase systems. The general consensus in the industry is that we are still 4 to 5 years away from mass adoption of Two-Phase cooling.


1.5. Mix-and-Match: The 2 X 2 Matrix

So far, we’ve discussed two primary methods of categorizing liquid cooling: DTC vs. Immersion and Single-Phase vs. Two-Phase. These two classifications exist on independent axes, so we can “mix” them into four distinct subcategories:

  • Single-Phase DTC: This is currently the most common of the four liquid cooling methods and also the easiest to implement. It offers higher heat transfer efficiency than Single-Phase Immersion due to the highly efficient direct metal-to-metal contact in the secondary loop (the coolant loop near the servers). Single-Phase DTC systems can handle heat densities of up to 180 W/cm² today, which is sufficient for most AI systems. For instance, the H100 chip has a TDP of 700W and a power density of ~150-160 W/cm²; while B100 has a higher TDP, it also has a larger die size, which keeps its power density comparable to H100’s (see the sketch after this list)

  • Single-Phase Immersion: This method has one main drawback – relatively slow heat dissipation. Without the vaporization that occurs in Two-Phase immersion cooling, the process relies on more passive heat transfer. It also comes with the other disadvantages of immersion cooling discussed earlier (rack design, warranty, and environment). As a result, this method has seen limited adoption, particularly in the U.S.

  • Two-Phase DTC: This method combines the main advantages of DTC (ease of implementation) and Two-Phase cooling (superior cooling capability). Many industry experts predict this is where the industry is heading over the medium term. Companies like Vertiv ($VRT) have prioritized it as the next step for next-generation cooling solutions

  • Two-Phase Immersion: This method offers the highest cooling potential but is also the most challenging and expensive to implement. It may become the only viable solution when rack-level power density reaches the megawatt (MW) range (current NVL72 racks are at the ~100 kW level). However, these systems demand exceptional leak-proofing, involve high technical complexity, and are more costly to build. Additionally, breakthroughs in materials science are still needed to deliver coolants with both optimal boiling points and low viscosity at a reasonable price
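The sketch referenced in the Single-Phase DTC bullet above: the relationship between TDP, heat-transfer area, and power density, using the quoted H100 figures. The implied ~4.5 cm² area and the 1,000 W B100 TDP are illustrative assumptions, not published specs:

```python
# Power density = TDP / heat-transfer area, using the figures cited above.

h100_tdp_w = 700
h100_density_w_cm2 = 155                        # midpoint of ~150-160 W/cm²
implied_area_cm2 = h100_tdp_w / h100_density_w_cm2
print(f"Implied heat-transfer area: ~{implied_area_cm2:.1f} cm²")

# A higher-TDP chip keeps density flat if its area grows proportionally:
b100_tdp_w = 1000                               # assumed round number, illustrative
b100_area_cm2 = implied_area_cm2 * (b100_tdp_w / h100_tdp_w)
print(f"Keeping ~{h100_density_w_cm2} W/cm² at {b100_tdp_w} W needs ~{b100_area_cm2:.1f} cm²")
```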

Apart from the four methods above, the industry is also exploring hybrid approaches. For example, in the next-generation GB200 NVL72 racks, NVDA is working with 7 different vendors to develop new cooling solutions that integrate Single-Phase DTC and Single-Phase Immersion cooling. In this setup, GPU/CPU cooling relies on DTC for more targeted and efficient cooling of high-energy-density components, while other components use Single-Phase Immersion, which is more energy-efficient but has lower cooling capabilities. Within this system, DTC is projected to account for roughly ¼ to ⅓ of the total system value, while the Immersion component makes up the remainder.


Part 2: Key Components in the Liquid Cooling Systems

2.1. Cold Plates

Cold plates are essential components of DTC cooling. They are metal structures that channel coolant to absorb heat from the compute chips. Typically, cold plates are sold as part of the server, because otherwise clients would need to disassemble the server upon arrival to install them, which would be both economically inefficient and impractical, and would likely void the server provider’s warranty. Cold plates used in Two-Phase DTC are more complex than those in Single-Phase DTC: they require tight flow control to ensure uniform heat dissipation as liquid coolant flows in and vaporized gas exits during the phase change. Different cold plate designs can exhibit temperature-control differences of 10-20°C.

Price: On average, a single cold plate for Single-Phase DTC costs around $100, and each is paired with one compute chip (GPU/CPU). For example, a GB200/GB300 superchip, which contains one CPU and two GPUs, requires a total of three cold plates.
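A hedged per-rack tally for a GB200 NVL72 rack follows, assuming 36 superchips per rack (72 GPUs + 36 CPUs, i.e. two superchips per compute tray) and the ~$100-per-plate figure above; the tray layout is my own assumption based on the quick-disconnect discussion later in this piece:

```python
# Estimated cold plate count and cost per GB200 NVL72 rack (assumptions noted above).

superchips_per_rack = 36         # assumed: 36 Grace CPUs + 72 Blackwell GPUs
plates_per_superchip = 3         # one per CPU and one per each of the two GPUs
plate_price_usd = 100            # average Single-Phase DTC cold plate price

plates_per_rack = superchips_per_rack * plates_per_superchip    # 108 plates
cost_per_rack = plates_per_rack * plate_price_usd               # ≈ $10,800
print(f"~{plates_per_rack} cold plates per rack, ~${cost_per_rack:,} of content per rack")
```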

Moat: I wouldn’t say cold plates are highly difficult to make. The key is ensuring smooth coolant flow without blockages and preventing leaks, which is primarily an engineering issue. Since cold plate channels can be just a few millimeters wide, even a small amount of dust can create blockages. When this happens, pressure in the pipes increases, potentially leading to coolant leakage if the sealing isn’t done properly.


2.2. Manifolds

Manifolds are plumbing or piping assemblies that distribute and collect coolant across multiple cooling loops, cold plates, or racks. They split the incoming (supply) stream from the CDU into several smaller coolant lines, or merge multiple return lines back into a single (return) line that goes back to the CDU.

Price: The price of manifolds is typically $6-10K per unit (leaning toward the lower end of the range). The primary BOM item is quick disconnects (discussed in the next section).

Moat: Manifolds are considered to have low technical barriers. They are primarily made of metal sheet components and machined parts, utilizing mature manufacturing processes.


2.3. Quick Disconnects

Quick disconnects are fluid connectors integrated into manifolds that allow fast, tool-free attachment and detachment of coolant lines. They make up a significant portion of the BOM for manifolds. Currently, there are two main sizes available: a 1-inch version and a 2-inch version. The 1-inch variant supports a flow rate of nearly 200 LPM (liters per minute), making it the preferred choice; moreover, supply bottlenecks for the 2-inch version over the past half year mean that most shipments so far have been the 1-inch model.

Volume and Price: The upcoming GB200 NVL72 rack requires 10 quick disconnects per compute tray (18 compute trays in total) and 2 per switch tray (9 switch trays in total), bringing the total per rack to 198 units (10×18 + 2×9). For the GB300 NVL72, each compute tray will require 24 quick disconnects, increasing the total per rack to 450 units (24×18 + 2×9), more than double GB200’s. However, my channel checks suggest that NVDA may adopt specialized in-house quick disconnects for GB300 systems. These new disconnects will be smaller and specifically designed for NVDA’s use cases (different from the UQD universal quick disconnects used in GB200). The price per unit is also expected to drop from c.$70 for the universal UQD version in GB200 to c.$45-50 for the specialized version used in GB300.
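Reproducing that arithmetic, plus a rough per-rack cost using the quoted unit prices (the GB300 price uses the midpoint of the $45-50 range):

```python
# Quick-disconnect count and approximate cost per NVL72 rack.

def qd_per_rack(per_compute_tray, compute_trays=18, per_switch_tray=2, switch_trays=9):
    return per_compute_tray * compute_trays + per_switch_tray * switch_trays

gb200_qty = qd_per_rack(10)          # 10*18 + 2*9 = 198
gb300_qty = qd_per_rack(24)          # 24*18 + 2*9 = 450

gb200_cost = gb200_qty * 70          # ~$70 per universal QD
gb300_cost = gb300_qty * 47.5        # ~$45-50 per specialized QD (midpoint)
print(f"GB200: {gb200_qty} QDs, ~${gb200_cost:,.0f} per rack")
print(f"GB300: {gb300_qty} QDs, ~${gb300_cost:,.0f} per rack")
```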


2.4. Coolant Distribution Unit (CDU)

The CDU is the central piece of equipment that handles the circulation and management of liquid coolant to and from the server racks. It is responsible for controlling the pressure, temperature, flow rate, and filtration of the coolant. Most CDUs contain pumps, heat exchangers, valves, MCUs, and coolant quality control systems.

Types: There are two types of CDU configurations. In-rack CDUs are self-contained units integrated directly into a rack, often in a rack-mounted form, to provide localized cooling management. In contrast, in-row CDUs are positioned between or adjacent to racks and usually serve multiple racks. In-row CDUs are generally more cost-effective when connected to the full number of racks allowed by their specs. They are also easier to maintain than in-rack CDUs, which are mounted inside racks and difficult to disassemble. However, most server OEMs/ODMs push for in-rack solutions, as that enhances the overall value of the racks they supply and gives them control over more of the value chain.

Price: CoolIT’s L2L CDU supporting up to 8 NVL72 racks sells at c.$140K, translating to c.$18K per rack. In contrast, its in-rack solution is priced at c.$20K per rack. Vertiv’s solution is slightly more expensive; its flagship Liebert 1350 series sells at $150-200K. To put things in perspective, the value of a manifold (which contains quick disconnects) is around $6K, as discussed earlier. A rough per-rack comparison follows below.
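The per-rack economics depend on how fully the in-row unit is loaded, which is the point of the comparison; the rack counts below are illustrative:

```python
# In-row vs. in-rack CDU cost per rack, using the prices quoted above.

in_row_price = 140_000            # L2L in-row CDU supporting up to 8 NVL72 racks
in_rack_price_per_rack = 20_000   # in-rack alternative

for racks_served in (8, 6, 4):    # illustrative utilization levels
    per_rack = in_row_price / racks_served
    print(f"In-row CDU across {racks_served} racks: ~${per_rack:,.0f}/rack "
          f"(vs. ~${in_rack_price_per_rack:,}/rack in-rack)")
```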


Part 3: Major Clients and Partners

3.1. Cloud Service Providers (CSPs)

Decision-making process: Liquid cooling vendors sell to the end customers – CSPs, especially hyperscalers – in the majority of cases (except when they supply OEMs, as discussed later). Although system designers (like Vertiv) and server ODMs may recommend certain component vendors, clients ultimately have the decision power and can opt for different vendors if they choose. Thus, relationships with the end customers who own the data centers are key for vendors.

Deployment: The liquid cooling systems, including the CDUs, are usually installed in data centers before the servers and racks arrive. This ensures the peripheral infrastructure is ready ahead of time, minimizing idle time for the high-value GPUs.

Among major hyperscalers (Microsoft, Google, Amazon, Meta, and Oracle), Google has the most extensive experience with liquid cooling, having implemented it since 2018 with its TPU v3. Other CSPs have had relatively limited liquid cooling deployment experience previously, but they are quickly catching up to prepare for next generation GPU deployment.


3.2. Server OEMs

Server OEMs like Supermicro ($SMCI), Hon Hai (2317.TW), Dell ($DELL), HP ($HPQ), and Lenovo (992.HK) primarily produce standard branded server racks for enterprises and smaller-scale customers, rather than hyperscalers. To enhance their value proposition, these OEMs are increasingly integrating liquid cooling components into their solutions. This also provides added convenience for their clients, who typically lack the expertise to build complex data center infrastructure themselves.

However, most OEMs follow a "white label" approach: they rely on third-party vendors for the equipment and components, which they then integrate into their systems under their own brands. For instance, SMCI, Dell, and HPQ heavily utilize Nidec’s in-rack CDUs. One exception is Hon Hai, which has begun producing cold plates and manifolds in-house (more on this in Part 4).

Supermicro ($SMCI) has been one of the most active OEMs in the liquid cooling space. It recently completed the delivery of ~12,000 liquid-cooled servers for X.AI’s newly built 100,000-GPU (H100) data center. These servers, each housing 8 GPUs, were exclusively supplied by SMCI, despite initial expectations that a second vendor would also participate. Each server was sold at c.$280,000–$290,000, with a gross profit margin exceeding 10%, surpassing that of SMCI’s traditional non-liquid-cooled servers. This higher profitability has provided a strong incentive for the company to expand its focus on liquid-cooled rack solutions. SMCI primarily sourced in-rack CDUs from Nidec for these servers.
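A quick sanity check on those figures – implied GPU count and revenue, treating the ">10%" gross margin as a simple floor and using the midpoint of the quoted ASP range:

```python
# Implied scale of the X.AI liquid-cooled server deal described above.

servers = 12_000
gpus_per_server = 8
asp_usd = 285_000                              # midpoint of ~$280-290K per server

gpus = servers * gpus_per_server               # 96,000 ≈ the ~100,000-GPU cluster
revenue = servers * asp_usd                    # ≈ $3.4B
gross_profit_floor = revenue * 0.10            # ">10%" gross margin as a floor
print(f"GPUs: {gpus:,}; revenue: ~${revenue / 1e9:.1f}B; gross profit: >${gross_profit_floor / 1e9:.2f}B")
```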

Looking ahead, SMCI is competing for X.AI’s Phase II data center, which plans to scale up to 200,000 GPUs, with bidding expected to start in 2025. This time, however, the company anticipates fiercer competition from rivals such as DELL.


3.3. Nvidia (NVDA)

Although NVDA is not a direct customer of liquid cooling vendors, it plays a crucial role in the supply chain, as most next-generation GPUs will come from the company and will closely interact with liquid cooling systems.

Since the introduction of the H-series GPUs, liquid cooling has become the primary method for cooling NVDA’s chips. For the next-generation GB200 NVL72 racks, the company is collaborating with seven different vendors to develop a new liquid cooling solution that integrates both Single-Phase DTC and Single-Phase Immersion cooling for a target power density of 200 kW per rack, as discussed in Part 1.5.


In this reference design, Vertiv ($VRT) will be responsible for the overall cooling system design, Boyd will handle the manufacturing of cold plates, and Honeywell ($HON) will develop the coolant used for the immersion portion – the objective is to identify a chemical with a relatively high boiling point (since it’s single-phase) and strong heat transfer properties for quick heat dissipation; the final coolant choice has yet to be determined.

Participating in the reference design process gives vendors like Vertiv and Boyd the opportunity to collaborate closely with NVDA, gain early access to server and chip specifications, and ensure their solutions are fully optimized for future rack designs. Additionally, being recognized as an official NVDA partner provides strong credibility when selling to clients. However, it’s important to note that inclusion in the reference list does not guarantee sales – NVDA does not "mandate" that clients purchase from partner vendors, and clients ultimately have the final say on which cooling vendors they use for their data centers. This point was also noted in a previous Semianalysis article.


Part 4: Competitive Landscape

4.1. Cold Plate

Currently, Taiwan’s Cooler Master (CM) leads the market, holding 50%+ share for the first batch of GB200 racks shipped so far. Asia Vital Components (AVC), also from Taiwan, follows in second place with a c.30-40% share. Looking ahead, Aurus (Taiwan), Delta (Taiwan), CoolIT (Canada), and Boyd (U.S.) have strong potential to also enter the market. CM holds a slight technical advantage in Two-Phase DTC for ensuring uniform heat dissipation during vaporization. In China’s domestic market, Avic Jonhon Optronic Technology (002179.SH) is a leading player in the field.

One potential disruption to the industry comes from server OEMs/ODMs. Given the relatively low technical complexity of and barriers to cold plates, some OEMs are attempting to produce them in-house. For instance, Hon Hai (2317.TW) is working on manufacturing cold plates and manifolds (discussed in the next section) internally. These efforts pose a potential threat to established players like CM and AVC. However, it's important to note that the disruption only applies to racks shipped under Hon Hai’s brands, not those under NVDA’s.


4.2. Manifolds

Cooler Master (CM) holds 50%+ of the market share for the initial batch of GB200 racks shipped to date, with Aurus following with a 30-40% share, while the remainder comes primarily from server OEMs. This segment faces similar potential disruptions as discussed earlier in the cold plates section, with server OEMs looking to expand their service offerings and market reach.


4.3. Quick Disconnect

Danfoss (Denmark) is the leading player in the market today, supplying the majority of units shipped so far for AI data centers. It is a vendor recommended by NVDA and is recognized for the solid quality of its products. Other competitive peers include Stäubli (Switzerland) and Parker-Hannifin ($PH, U.S.). However, customer willingness to switch vendors is low, as quick disconnects represent a small portion of the total liquid cooling system’s cost but pose significant risk if an inferior product leaks. At present, there are no major supply bottlenecks at Danfoss. In China’s domestic market, Avic Jonhon (002179.SH) is again a leading player, offering solid product quality at two-thirds of Danfoss’s price.


4.4. CDU

When selecting CDU providers, clients typically consider the following key factors:

  • Product Quality: The performance and reliability of the products. Overall, I wouldn’t say the CDU has an exceptionally high technical barrier – many vendors, including VRT, CoolIT, Motivair, Boyd, nVent, Delta, MGCooling, GigaByte, Nidec, Envicool, and more, have the capability to manufacture CDUs. Still, it is the most technically complex unit in the liquid cooling system. As the central control unit, the CDU has an extensive BOM and must manage millisecond-level fluctuations in load to ensure precise, reliable coolant distribution in real time. The "downside," however, is that clients may not always have a clear perception of the performance differences between CDUs

  • Global Service Capabilities: Many clients operate data centers worldwide. If a local data center cannot receive timely maintenance and repair services from the CDU vendor, it creates significant operational headaches. This factor has become as important for many hyperscalers as product quality, if not more so

  • NVDA recommendations: Clients may give preference to solutions on NVDA’s reference list, particularly in the early stages of liquid cooling deployments, as being on the list provides additional credibility and gives clients peace of mind when they first try out liquid cooling

Vertiv ($VRT) is the leading player in the field today, holding 70%+ of the volume share for the first batch of GB200 racks shipped, driven by its flagship 1350 series. The remaining share so far is split largely between Motivair (private) and CoolIT (private). Looking ahead to 2025, VRT’s market share is expected to decline as more vendors complete testing and qualification and clients explore more cost-effective alternatives. Despite this, VRT is projected to remain the industry leader.

Next, we will explore VRT’s key advantages. We will also examine other competitors in detail, including CoolIT (private), Motivair (private), nVent ($NVT), Boyd (private), Nidec (6594.JP), as well as Taiwanese vendors such as Delta (2308.TW), Aurus (private), Gigabyte (2376.TW), and MGCooling (private), and how I would rank these firms into three tiers.

We will also explore the competitive landscape of the Chinese liquid cooling market, which operates independently from the Western market and has its own unique characteristics and dynamics.

This post is for paid subscribers
