Futong Technology Development Holdings Limited (富通科技发展控股有限公司)

Futong Technology recently signed an AI computing project with an industry client and delivered a full-scale (671B-parameter) DeepSeek AI computing appliance, built on a dual-server, 16-GPU-per-node NVIDIA H20 full-stack architecture.

The solution integrates Futong Technology's self-developed Futong iCore one-stop AI application platform, which combines intelligent agent engineering, data intelligence, compute scheduling, and full-capability foundation models. Through an end-to-end turnkey delivery model, Futong provides a complete solution covering data center deployment, environment adaptation, and model go-live, delivering plug-and-play deployment at full computing performance.

Futong Technology DeepSeek AI Computing Appliance Product Line

Executive Summary

H20 671B Full-Scale DeepSeek AI Computing Appliance
High-performance computing, turnkey delivery, intelligent agents empowering business scenarios

Top-Tier Computing Architecture
Dual-server, 16-GPU-per-node NVIDIA H20 full-stack solution, optimized for the DeepSeek 671B foundation model. Supports millisecond-level inference for hundred-billion-parameter models, with deep optimization of memory and interconnect topology to deliver industry-leading performance.
Turnkey Delivery
End-to-end coverage from data center deployment and environment adaptation to model go-live, including system tuning, security policies, and O&M monitoring. Deployment cycles reduced by 60%, enabling true plug-and-play, full-power operation.
Futong iCore One-Stop AI Platform
Integrates no-code intelligent agent construction, unstructured data intelligence, and cloud–edge collaborative compute scheduling, providing a unified pipeline for AI application development, data governance, and cost-efficient compute deployment.
Proprietary Patented Technologies
Enables small models to approach the full capabilities of large models: latency under 200 ms with thousands of concurrent requests, a performance gap of less than 3% compared to the 671B model, at only one-quarter of the total cost.

Full-Stack Server Supply Unlocking Elite Computing Power

DeepSeek 671B Full-Scale × NVIDIA H20

The delivered H20 671B full-scale DeepSeek AI computing appliance adopts a dual-server, 16-GPU-per-node NVIDIA H20 architecture, purpose-built for ultra-large-scale AI inference and fully aligned with the stringent infrastructure requirements of the DeepSeek 671B foundation model. The solution supports millisecond-level inference for hundred-billion–parameter models.

Built on NVIDIA H20’s 141GB high-bandwidth memory architecture and a deeply optimized interconnect topology, the system ensures stable and efficient operation of large-scale inference workloads. As NVIDIA’s next-generation AI inference GPU, H20 offers significant improvements over previous generations in throughput, energy efficiency, and multi-GPU interconnect performance—making it an ideal choice for large-scale AI inference scenarios.
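As a rough sanity check on the memory sizing, the arithmetic can be sketched as follows. The 141 GB per GPU and the 671B parameter count come from the text; the total of 16 GPUs across both servers and the 1-byte-per-parameter (FP8) weight format are assumptions made for illustration, and the calculation ignores KV cache, activations, and runtime overhead.

```python
# Back-of-the-envelope memory budget for the appliance.
# From the text: 141 GB of HBM per H20 GPU, 671B model parameters.
# ASSUMPTIONS (not stated in the text): 16 GPUs in total across the two
# servers, and weights stored at 1 byte per parameter (FP8).

GPUS = 16               # assumed total GPU count
HBM_PER_GPU_GB = 141    # from the text
PARAMS_BILLION = 671    # from the text
BYTES_PER_PARAM = 1     # assumed FP8 weights


def memory_budget(gpus, hbm_gb, params_billion, bytes_per_param):
    """Return (aggregate HBM, weight footprint, headroom), all in GB."""
    total_hbm_gb = gpus * hbm_gb
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    weights_gb = params_billion * bytes_per_param
    headroom_gb = total_hbm_gb - weights_gb
    return total_hbm_gb, weights_gb, headroom_gb


total, weights, headroom = memory_budget(
    GPUS, HBM_PER_GPU_GB, PARAMS_BILLION, BYTES_PER_PARAM
)
print(f"aggregate HBM: {total} GB")
print(f"weights:       {weights} GB")
print(f"headroom:      {headroom} GB")
```

Under these assumptions the weights occupy roughly 30% of aggregate memory, and the remainder is available for KV cache and concurrent requests. Changing GPUS or BYTES_PER_PARAM shows how sensitive that headroom is to the configuration.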

Futong Technology provides a complete server supply solution, covering high-performance compute servers, storage systems, data center–grade networking switches, and full-rack integration. This full-stack optimization ensures industry-leading performance, reliability, and scalability for high-density GPU clusters. Through systematic tuning, Futong fully unleashes the inference performance of the DeepSeek 671B model, delivering high-performance, cost-effective AI computing infrastructure.

End-to-End Turnkey Delivery Enabling Rapid AI Adoption

This project adopts Futong Technology's end-to-end turnkey delivery model, providing comprehensive services that span data center deployment, environment adaptation, testing and optimization, and model go-live. The solution includes integrated system tuning, security policies, and O&M monitoring, reducing deployment timelines by 60% and ensuring immediate, full-performance availability.

As AI industrialization accelerates, compute delivery has evolved beyond standalone hardware procurement into a system-level engineering discipline encompassing architecture optimization, ecosystem integration, and stable operations. Futong’s turnkey delivery services tailor system configurations to real business requirements, achieving significant improvements in inference speed, resource utilization, and model accuracy. By integrating hardware and software ecosystems end to end, Futong substantially lowers deployment barriers and provides customers with sustainable, scalable AI computing capabilities.

Futong iCore One-Stop AI Platform

End-to-End Support Across “Compute–Data–Model–Application”

The delivered appliance comes pre-integrated with Futong Technology's self-developed Futong iCore ("Zhixin") one-stop AI solution platform. Designed for enterprise digital transformation and intelligent upgrading, the platform integrates intelligent agent management, orchestration and scheduling, model knowledge bases, intelligent data processing, and efficient compute resource management, enabling customized enterprise AI transformation.

Three Core Modules of Futong iCore

1. Intelligent Agent Factory

No-Code Construction of Enterprise AI Collaboration Ecosystems

Based on a Retrieval-Augmented Generation (RAG) architecture, the Intelligent Agent Factory enables multi-agent collaboration and no-code development:

Scenario-Based AI Assistants
Supports over 20 scenarios, including report generation, contract review, and resume screening. One financial institution reduced approval time from 8 hours to 15 minutes using intelligent agents.
Dynamic Knowledge Base Integration
With dynamic distillation technology, agents can invoke the full capabilities of large models in real time, ensuring response quality aligned with the 671B model.
Collaboration Efficiency Optimization
Distributed task scheduling algorithms reduce inter-agent communication latency by 40% and resource contention by 65% with thousands of concurrent requests.
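The retrieval step of a RAG pipeline can be illustrated with a minimal sketch. This is not Futong's implementation: the bag-of-words cosine scoring, the function names, and the three-entry knowledge base are all hypothetical stand-ins for the embedding model and vector store a production system would typically use. The sketch stops at prompt assembly and does not call any model.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for a real embedding model)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents most similar to the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Prepend retrieved passages to the question before it goes to a model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge-base entries, loosely based on the scenarios above.
KNOWLEDGE_BASE = [
    "contract review checks termination clauses and liability caps",
    "resume screening ranks candidates by required skills",
    "report generation summarizes quarterly sales figures",
]

question = "which clauses does contract review check"
prompt = build_prompt(question, KNOWLEDGE_BASE, k=1)
print(prompt)
```

The agent would pass the assembled prompt to its language model, so the answer is grounded in the retrieved passages rather than in the model's parameters alone.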

2. Data Workshop

Efficient Transformation from Unstructured Data to Industry Knowledge

The Data Workshop connects the data intelligence pipeline through a three-layer technology stack, providing high-quality inputs for full-scale foundation models:

Intelligent Semantic Parsing
Multimodal NLP models achieve 98.5% annotation accuracy for unstructured data such as medical imaging and engineering drawings.
Industry Knowledge Base Construction
The knowledge base has accumulated over 200,000 automotive fault cases and 500,000 medical diagnostic records, enabling rapid adaptation to vertical scenarios.
Closed-Loop Data Governance
Built-in data quality monitoring detects annotation deviations in real time and triggers automated feedback correction, improving data cleaning efficiency by 70%.
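One plausible way to detect annotation deviations is to compare labels from several annotators and flag items with low agreement for re-annotation. The source does not describe the actual mechanism, so the sketch below is only an illustration: the majority-vote rule, the 0.75 threshold, and the sample item IDs are all assumptions.

```python
from collections import Counter


def majority_label(labels):
    """Most frequent label among annotators (ties go to the first one seen)."""
    return Counter(labels).most_common(1)[0][0]


def find_deviations(annotations, min_agreement=0.75):
    """Flag items whose annotator agreement falls below min_agreement.

    annotations maps item_id -> list of labels from different annotators.
    Returns item_id -> (consensus_label, agreement_ratio) for flagged items.
    """
    flagged = {}
    for item_id, labels in annotations.items():
        consensus = majority_label(labels)
        agreement = labels.count(consensus) / len(labels)
        if agreement < min_agreement:
            flagged[item_id] = (consensus, agreement)
    return flagged


# Hypothetical annotation batch.
batch = {
    "scan-001": ["tumor", "tumor", "tumor", "tumor"],
    "scan-002": ["tumor", "normal", "normal", "cyst"],
    "drawing-007": ["valve", "valve", "valve", "pump"],
}

to_review = find_deviations(batch)
print(to_review)
```

In a closed-loop setup, flagged items would be routed back to annotators or to an automated correction step, and the corrected labels would replace the originals in the training set.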

3. CloudMap Intelligent Compute Operations Platform

Flexible, Efficient Integration of Heterogeneous Compute Resources

By integrating GPUs, NPUs, and other heterogeneous compute resources, and leveraging a cloud–edge collaborative architecture, the platform optimizes large-model deployment costs while supporting local training and edge inference:

Unified Management of Heterogeneous Compute Resources
Centralized management and scheduling across multiple compute platforms.
Unified Operations and Billing
Integrated metering, pricing strategies, and billing management.
Standardized Productization
Compute resources and model services are standardized and listed as products for tenant consumption.
Full Observability
Comprehensive visibility into models, networks, compute, storage, and data.
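The combination of unified scheduling and metering described above can be sketched with a toy placement policy. Everything here is hypothetical: the Device and Job classes, the cheapest-fit rule, and the hourly tariffs are illustrative assumptions, not the CloudMap platform's actual data model or API.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    kind: str              # "gpu" or "npu"
    free_mem_gb: int
    price_per_hour: float  # hypothetical tariff


@dataclass
class Job:
    name: str
    mem_gb: int
    hours: float
    allowed_kinds: tuple = ("gpu", "npu")


def place(job, devices):
    """Pick the cheapest device that fits the job and reserve its memory.

    Returns the chosen Device, or None when no device can host the job.
    """
    candidates = [
        d for d in devices
        if d.kind in job.allowed_kinds and d.free_mem_gb >= job.mem_gb
    ]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda d: d.price_per_hour)
    chosen.free_mem_gb -= job.mem_gb
    return chosen


def bill(job, device):
    """Metered cost of running the job on the device."""
    return round(job.hours * device.price_per_hour, 2)


# Hypothetical resource pool mixing GPU and NPU capacity.
pool = [
    Device("gpu-0", "gpu", free_mem_gb=141, price_per_hour=4.0),
    Device("npu-0", "npu", free_mem_gb=64, price_per_hour=2.5),
]

small = Job("edge-inference", mem_gb=48, hours=2.0)
large = Job("full-model-inference", mem_gb=100, hours=1.0)

small_dev = place(small, pool)
large_dev = place(large, pool)
print(small_dev.name, bill(small, small_dev))
print(large_dev.name, bill(large, large_dev))
```

Keeping placement and billing as separate functions mirrors the split between scheduling and metering in the feature list: the pricing strategy can change without touching the placement logic.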

Proprietary Patented Technologies

Small Models, Full-Scale Capabilities

Through proprietary intellectual property combining dynamic distillation knowledge bases and intelligent scheduling agents, Futong Technology enables a 1.5B model to approach the performance of a 671B foundation model:

Capability Distillation
Decomposes large-model reasoning into over 200 atomic capability units, dynamically invoked by smaller models, achieving 85% task coverage.
Incremental Learning
Online feedback mechanisms continuously update the knowledge base, reducing model iteration cycles from two weeks to eight hours.
Performance Assurance
Latency under 200 ms with thousands of concurrent requests, a performance gap of less than 3% versus the 671B model, at only 25% of the overall cost.
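The idea of a smaller model dynamically invoking atomic capability units, and the notion of task coverage, can be illustrated with a simple registry-and-dispatch pattern. The patented mechanism is not disclosed in the source, so this is only a conceptual sketch: the unit names, the fallback function, and the sample workload are all hypothetical.

```python
# Hypothetical registry of atomic capability units. In a real system each
# unit would wrap a distilled skill; plain functions stand in for them here.
CAPABILITY_UNITS = {
    "sum": lambda args: sum(args),
    "max": lambda args: max(args),
}


def large_model_fallback(task, args):
    """Placeholder for escalating an uncovered task to the full-size model."""
    return f"escalated:{task}"


def dispatch(task, args):
    """Route a task to a capability unit, or escalate when none matches.

    Returns (result, handled_locally).
    """
    unit = CAPABILITY_UNITS.get(task)
    if unit is None:
        return large_model_fallback(task, args), False
    return unit(args), True


def coverage(tasks):
    """Fraction of tasks that the registered units handle without escalation."""
    handled = sum(1 for task, args in tasks if dispatch(task, args)[1])
    return handled / len(tasks)


# Hypothetical workload: three covered tasks and one that must be escalated.
workload = [
    ("sum", [1, 2]),
    ("max", [3, 9]),
    ("translate", ["hello"]),
    ("sum", [4]),
]

print(dispatch("max", [3, 9]))
print(dispatch("translate", ["hello"]))
print(coverage(workload))
```

In this framing, the 85% task coverage quoted above would correspond to the share of incoming tasks served by registered units, with the remainder escalated to the larger model.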

Strengthening the Compute Foundation for Industrial AI Adoption

The delivery of the H20 671B full-scale DeepSeek AI computing appliance represents a benchmark project for Futong Technology in integrated computing infrastructure and intelligent compute operations. Looking ahead, Futong will continue to focus on compute integration and operations, model customization, data services, and intelligent agent construction—providing end-to-end solutions from infrastructure to intelligent agent orchestration, and helping enterprises fully unlock AI potential and accelerate intelligent transformation.