GPU server hosting from A to I ~ COOLHOUSING s.r.o.

GPU Hosting from A to I (case studies)

14 August 2024

Looking for a GPU server for AI and machine learning? We offer a comprehensive GPU server hosting solution from A to Z in our data center.

Hosting of GPU and AI servers

At Coolhousing, we have been providing server hosting in the form of dedicated server rentals since 2006 and virtual servers since 2015. Over these years, we have handled many interesting server solutions and ‘exotic’ hardware requests from our clients. In the past three years, we have seen growing interest in hosting servers for artificial intelligence, machine learning, and deep data analysis.

This trend goes hand in hand with globally popular tools based on language models and statistically generated content, such as ChatGPT, Claude, Copilot, Bard, Stable Diffusion, Runway, and hundreds of other applications that emerge every day. However, unlike regular server hosting, running analyses, machine learning, and AI applications requires much higher performance, combined with graphics cards and high-quality connectivity. As a result, new terminology has emerged in the professional community, which we will increasingly encounter in the areas of hosting, administration, and colocation.

New terminology around Artificial Intelligence

1) AI hosting (Artificial Intelligence hosting)


AI hosting focuses on hosting applications and systems that utilize artificial intelligence techniques and algorithms. Such servers require efficient real-time data processing, analysis, and interpretation of complex data sets.

Examples of use: Chatbots, natural language processing, image recognition, voice and virtual assistants, decision-making process automation and management, or personalized medicine.

2) ML hosting (Machine Learning hosting)

Machine learning hosting is a subcategory of AI hosting, specifically focused on hosting programs and systems that use machine learning. Machine learning involves algorithms that learn from data and improve their performance over time without explicit programming. This type of hosting requires a large amount of disk capacity combined with GPU cards for parallel data processing.

Practical examples: Training machine learning models (e.g., weather forecasting), predictive analytics for financial, engineering, marketing, and e-commerce purposes, or data classification and clustering for fraud detection.

3) HPC hosting (High-Performance Computing hosting)

This type of hosting is focused on providing computational resources that enable the solution of very demanding and complex computational tasks. HPC systems utilize parallel processing and distributed computing systems to achieve high performance.

Practical use: Scientific simulations, research and development (e.g., DNA analysis, medication development), modeling and simulation (e.g., fluid dynamics simulation, astronomy), big data analysis (e.g., network traffic analysis), operation of supercomputers or extremely powerful clusters with many computing nodes.

Case studies – GPU servers from A to I (Z)


Machine learning is not a new concept in the world of technology; it can be traced back to the work of Alan Turing and Arthur Samuel in the 1950s and 1960s. Over time, the method gained broader recognition, but its full implementation was limited by the technological capabilities of the time. This changed with the development of powerful processors and supercomputers in the last 10 years.

It is no surprise, then, that we have been hosting and housing AI, ML, and HPC servers in our data center for four years. Our clients are experts in artificial intelligence and deep data analysis. Here we would like to present two case studies that we have worked on with them; the same configurations are available to you as well.

1) Supermicro server with Intel CPUs, 1 TB RAM and Ada GPUs

The configuration below was implemented at the request of a long-term client who asked for Intel processors. To achieve optimal performance for machine learning, a pair of Intel Xeon Gold 6438M processors was selected, which, combined with DDR5 memory, delivers excellent computational results. For fast response times during data processing, NVMe SSDs with a combined capacity of roughly 15 TB were chosen, achieving write speeds of 4 GB per second.

In all types of AI hosting, response time and the amount of data for processing and learning are crucial. For this reason, the server includes a network card with QSFP28 ports, ready for speeds up to 100 Gbps. To ensure smooth and efficient data processing, the server was equipped with eight NVIDIA Ada L40S series graphics cards. The Ada L40S GPU card offers balanced performance with relatively low power consumption and is highly suitable if you plan to work with video rendering, 3D modeling, and media outputs.

CPU: 2x Intel Xeon Gold 6438M (2x 32C, 64T, 60M, 2.2 – 3.9 GHz)
RAM: 8x 128GB DDR5 4800MHz ECC REG
NVMe: 4x 3.8TB Samsung PM9A3 NVMe PCIe G4 V4 TLC 2.5″ 7mm
NIC: 1x AIOM 2-port 100GbE QSFP28, Mellanox CX-6 DX
GPU: 8x 48GB GDDR6 NVIDIA Ada L40S PCIe Gen 4
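
The headline figures for this server follow directly from the component list. As a rough sketch, the totals can be checked with a few lines of Python; the helper name `total` and the variable names are our own illustrative choices, and the counts and sizes are simply copied from the list above.

```python
# Illustrative only: sums up the component list of the first case study.
# The helper and variable names are hypothetical; the numbers come from the spec list.

def total(count, size_per_unit):
    """Return the combined capacity of `count` identical components."""
    return count * size_per_unit

ram_gb = total(8, 128)         # 8x 128 GB DDR5 modules
nvme_tb = total(4, 3.8)        # 4x 3.8 TB NVMe SSDs
gpu_memory_gb = total(8, 48)   # 8x 48 GB L40S cards
cpu_cores = total(2, 32)       # 2x 32-core processors

print(f"RAM: {ram_gb} GB ({ram_gb / 1024:.0f} TB)")
print(f"NVMe: {nvme_tb:.1f} TB")
print(f"GPU memory: {gpu_memory_gb} GB")
print(f"CPU cores: {cpu_cores}")
```

The same helper can be reused for any other configuration by swapping in the counts and sizes from its component list.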

2) Supermicro server with AMD CPUs and over 0.6 TB of GPU memory

The second case study involves a new client who preferred processors from AMD, whose popularity is rising significantly at the expense of Intel. This setup was equipped with dual processors from the new EPYC Zen 4 series, providing excellent and efficient per-core performance. Unlike the first variant, this server was fitted with eight DDR5 memory modules of 64 GB each, 512 GB in total. For storage, we again chose Samsung NVMe SSDs, with which we have had the best experience in our data center. Speeds of up to 100 Gbps are once again handled by a two-port Mellanox network card with QSFP28 ports.

The server may have half the memory capacity, but it makes up for this with eight H100 series graphics cards, which provide the extreme power required for machine learning. The Nvidia H100 is truly top-of-the-line in the field of GPUs for working with artificial intelligence. It is no surprise that Microsoft-backed Inflection AI built its AI bot Pi on this series of graphics cards, or that Meta is buying them in large numbers with the stated goal of releasing open-source artificial general intelligence (AGI).

CPU: 2x AMD EPYC4 Genoa (SP5 LGA) 9334 (2x 32C, 64T, 128M, 2.7 – 3.9 GHz)
RAM: 8x 64GB DDR5 4800 MHz ECC REG
NVMe: 4x 3.8TB Samsung PM9A3 NVMe PCIe G4 V4 TLC 2.5″ 7mm
NIC: 1x AIOM 2-port 100GbE QSFP28, Mellanox CX-6 DX
GPU: 8x 80GB NVIDIA H100 PCIe 5.0
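
To put the two case studies side by side, the short sketch below compares system RAM and total GPU memory. The dictionary layout, the keys `intel_l40s` and `amd_h100`, and the function name `totals` are assumptions made for illustration; the figures are taken from the two component lists in this article.

```python
# Illustrative comparison of the two case-study servers.
# The data layout and names are hypothetical; the figures come from the spec lists.

servers = {
    "intel_l40s": {"ram_modules": 8, "ram_gb_each": 128, "gpus": 8, "gpu_gb_each": 48},
    "amd_h100": {"ram_modules": 8, "ram_gb_each": 64, "gpus": 8, "gpu_gb_each": 80},
}

def totals(spec):
    """Return (total RAM in GB, total GPU memory in GB) for one server."""
    ram = spec["ram_modules"] * spec["ram_gb_each"]
    gpu = spec["gpus"] * spec["gpu_gb_each"]
    return ram, gpu

for name, spec in servers.items():
    ram, gpu = totals(spec)
    print(f"{name}: {ram} GB RAM, {gpu} GB GPU memory")
```

Under these numbers, the AMD server has half the system RAM of the Intel one (512 GB versus 1,024 GB) but considerably more GPU memory (640 GB versus 384 GB), which is where the "over 0.6 TB" in the heading comes from.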

Purpose and selection of GPU is crucial

If you decide to start using AI hosting in our data center, you get not only a high-performance GPU server but also optimal cooling infrastructure with Freecooling technology, excellent connectivity with speeds up to 100 Gbps, and complete hardware service. It is essential to consider the purpose for which you will be using the server and to choose the appropriate type of graphics cards accordingly. Every GPU card has its advantages but also its disadvantages, which need to be considered when configuring the server and creating a budget. We are preparing an article on this topic for the end of August, which will delve into the benefits and drawbacks of the most popular Nvidia GPU card series in more detail.

If you are interested in GPU hosting or would like to discuss this solution, please write to us at info@coolhousing.net. We will be happy to discuss with you all the possibilities that this new and dynamic type of hosting offers.

Coolhousing team
