diff --git a/docs/how-to/hip_porting_driver_api.md b/docs/how-to/hip_porting_driver_api.md index 382b20a581..57879264a2 100644 --- a/docs/how-to/hip_porting_driver_api.md +++ b/docs/how-to/hip_porting_driver_api.md @@ -1,4 +1,4 @@ -# Porting CUDA Driver API +# Porting CUDA driver API ## Introduction to the CUDA Driver and Runtime APIs diff --git a/docs/how-to/hip_porting_guide.md b/docs/how-to/hip_porting_guide.md index 1eed813e2a..1bcbd3ea8e 100644 --- a/docs/how-to/hip_porting_guide.md +++ b/docs/how-to/hip_porting_guide.md @@ -1,4 +1,4 @@ -# HIP Porting Guide +# HIP porting guide In addition to providing a portable C++ programming environment for GPUs, HIP is designed to ease the porting of existing CUDA code into the HIP environment. This section describes the available tools diff --git a/docs/how-to/hip_rtc.md b/docs/how-to/hip_rtc.md index 344bd7b35e..6a37f8d87e 100644 --- a/docs/how-to/hip_rtc.md +++ b/docs/how-to/hip_rtc.md @@ -1,4 +1,4 @@ -# Programming for HIP Runtime Compiler (RTC) +# Programming for HIP runtime compiler (RTC) HIP lets you compile kernels at runtime with the `hiprtc*` APIs. Kernels can be stored as a text string and can be passed to HIPRTC APIs alongside options to guide the compilation. 
diff --git a/docs/how-to/performance_guidelines.rst b/docs/how-to/performance_guidelines.rst index aa8bcb1fce..ced9707356 100644 --- a/docs/how-to/performance_guidelines.rst +++ b/docs/how-to/performance_guidelines.rst @@ -3,7 +3,7 @@ :keywords: AMD, ROCm, HIP, CUDA, performance, guidelines ******************************************************************************* -Performance Guidelines +Performance guidelines ******************************************************************************* The AMD HIP Performance Guidelines are a set of best practices designed to help diff --git a/docs/how-to/programming_manual.md b/docs/how-to/programming_manual.md index 4c13246a37..33ab58de93 100644 --- a/docs/how-to/programming_manual.md +++ b/docs/how-to/programming_manual.md @@ -1,4 +1,4 @@ -# HIP Programming Manual +# HIP programming manual ## Host Memory diff --git a/docs/how-to/unified_memory.rst b/docs/how-to/unified_memory.rst index b24cd4c82f..f64189454c 100644 --- a/docs/how-to/unified_memory.rst +++ b/docs/how-to/unified_memory.rst @@ -4,7 +4,7 @@ :keywords: AMD, ROCm, HIP, CUDA, unified memory, unified, memory, UM, APU ******************************************************************************* -Unified Memory +Unified memory ******************************************************************************* In conventional architectures, CPUs and GPUs have dedicated memory like Random diff --git a/docs/index.md b/docs/index.md index 526996ab0d..b19f100c88 100644 --- a/docs/index.md +++ b/docs/index.md @@ -11,7 +11,7 @@ For HIP supported AMD GPUs on multiple operating systems, see: The CUDA enabled NVIDIA GPUs are supported by HIP. For more information, see [GPU Compute Capability](https://developer.nvidia.com/cuda-gpus). 
-On the AMD ROCm platform, HIP provides header files and runtime library built on top of HIP-Clang compiler in the repository [Common Language Runtime (CLR)](./understand/amd_clr), which contains source codes for AMD's compute languages runtimes as follows, +On the AMD ROCm platform, HIP provides header files and runtime library built on top of HIP-Clang compiler in the repository [Common Language Runtimes (CLR)](./understand/amd_clr), which contains source codes for AMD's compute languages runtimes as follows, On non-AMD platforms, like NVIDIA, HIP provides header files required to support non-AMD specific back-end implementation in the repository ['hipother'](https://github.com/ROCm/hipother), which translates from the HIP runtime APIs to CUDA runtime APIs. @@ -38,14 +38,14 @@ On non-AMD platforms, like NVIDIA, HIP provides header files required to support :::{grid-item-card} How to -* [Programming Manual](./how-to/programming_manual) -* [HIP Porting Guide](./how-to/hip_porting_guide) -* [HIP Porting: Driver API Guide](./how-to/hip_porting_driver_api) +* [Programming manual](./how-to/programming_manual) +* [HIP porting guide](./how-to/hip_porting_guide) +* [HIP porting: driver API guide](./how-to/hip_porting_driver_api) * {doc}`./how-to/hip_rtc` * {doc}`./how-to/performance_guidelines` * [Debugging with HIP](./how-to/debugging) * {doc}`./how-to/logging` -* [Unified Memory](./how-to/unified_memory) +* [Unified memory](./how-to/unified_memory) * [Cooperative Groups](./how-to/cooperative_groups) * {doc}`./how-to/faq` @@ -54,12 +54,12 @@ On non-AMD platforms, like NVIDIA, HIP provides header files required to support :::{grid-item-card} Reference * {doc}`/doxygen/html/index` -* [C++ Language Extensions](./reference/cpp_language_extensions) -* [C++ Language Support](./reference/cpp_support) +* [C++ language extensions](./reference/cpp_language_extensions) +* [C++ language support](./reference/cpp_language_support) * [HIP math API](./reference/math_api) -* [Comparing 
Syntax for Different APIs](./reference/terms) -* [HSA Runtime API for ROCm](./reference/virtual_rocr) -* [HIP Managed Memory Allocation API](./reference/unified_memory_reference) +* [Comparing syntax for different APIs](./reference/terms) +* [HSA runtime API for ROCm](./reference/virtual_rocr) +* [HIP managed memory allocation API](./reference/unified_memory_reference) * [HIP Cooperative Groups API](./reference/cooperative_groups) * [List of deprecated APIs](./reference/deprecated_api_list) diff --git a/docs/reference/cpp_language_extensions.rst b/docs/reference/cpp_language_extensions.rst index e4bc3782ac..7c0eb0ccf8 100644 --- a/docs/reference/cpp_language_extensions.rst +++ b/docs/reference/cpp_language_extensions.rst @@ -5,7 +5,7 @@ :keywords: AMD, ROCm, HIP, CUDA, c++ language extensions, HIP functions ******************************************************************************** -C++ Language Extensions +C++ language extensions ******************************************************************************** HIP provides a C++ syntax that is suitable for compiling most code that commonly appears in diff --git a/docs/reference/cpp_support.rst b/docs/reference/cpp_language_support.rst similarity index 100% rename from docs/reference/cpp_support.rst rename to docs/reference/cpp_language_support.rst diff --git a/docs/reference/terms.md b/docs/reference/terms.md index 4d4be12296..ea2b9d96ab 100644 --- a/docs/reference/terms.md +++ b/docs/reference/terms.md @@ -1,4 +1,4 @@ -# Table Comparing Syntax for Different Compute APIs +# Table comparing syntax for different compute APIs |Term|CUDA|HIP|OpenCL| |---|---|---|---| diff --git a/docs/reference/unified_memory_reference.rst b/docs/reference/unified_memory_reference.rst index 312a67ef20..12922d7664 100644 --- a/docs/reference/unified_memory_reference.rst +++ b/docs/reference/unified_memory_reference.rst @@ -6,7 +6,7 @@ .. 
_unified_memory_reference: ******************************************************************************* -HIP Managed Memory Allocation API +HIP managed memory allocation API ******************************************************************************* .. doxygengroup:: MemoryM diff --git a/docs/reference/virtual_rocr.rst b/docs/reference/virtual_rocr.rst index 8241fa07ef..444882fc7e 100644 --- a/docs/reference/virtual_rocr.rst +++ b/docs/reference/virtual_rocr.rst @@ -5,7 +5,7 @@ :keywords: AMD, ROCm, HIP, HSA, ROCR runtime, virtual memory management ******************************************************************************* -HSA Runtime API for ROCm +HSA runtime API for ROCm ******************************************************************************* The following functions are located in the https://github.com/ROCm/ROCR-Runtime repository. diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in index e79aa9f850..be820ed494 100644 --- a/docs/sphinx/_toc.yml.in +++ b/docs/sphinx/_toc.yml.in @@ -31,24 +31,24 @@ subtrees: - file: how-to/logging - file: how-to/cooperative_groups - file: how-to/unified_memory - title: Unified Memory + title: Unified memory - file: how-to/faq - caption: Reference entries: - file: doxygen/html/index - file: reference/cpp_language_extensions - title: C++ Language Extensions - - file: reference/cpp_support.rst - title: C++ Language Support + title: C++ language extensions + - file: reference/cpp_language_support + title: C++ language support - file: reference/math_api - file: reference/terms - title: Comparing Syntax for different APIs + title: Comparing syntax for different APIs - file: reference/cooperative_groups_reference - title: HIP Cooperative Groups API + title: HIP Cooperative groups API - file: reference/virtual_rocr - file: reference/unified_memory_reference - title: HIP Managed Memory Allocation API + title: HIP managed memory allocation API - file: reference/deprecated_api_list title: List of deprecated APIs 
diff --git a/docs/tutorial/saxpy.rst b/docs/tutorial/saxpy.rst index 4c0006930f..b1f693cafb 100644 --- a/docs/tutorial/saxpy.rst +++ b/docs/tutorial/saxpy.rst @@ -3,7 +3,7 @@ :keywords: AMD, ROCm, HIP, SAXPY, tutorial ******************************************************************************* -Tutorial: SAXPY - Hello, HIP +SAXPY - Hello, HIP ******************************************************************************* This tutorial explains the basic concepts of the single-source diff --git a/docs/understand/amd_clr.rst b/docs/understand/amd_clr.rst index 31907a869e..3a643cb051 100644 --- a/docs/understand/amd_clr.rst +++ b/docs/understand/amd_clr.rst @@ -5,7 +5,7 @@ .. _AMD_Compute_Language_Runtimes: ******************************************************************************* -AMD Common Language Runtimes (CLR) +AMD common language runtimes (CLR) ******************************************************************************* CLR contains source codes for AMD's compute languages runtimes: ``HIP`` and ``OpenCL™``. @@ -14,7 +14,7 @@ For developers and users, CLR implements HIP runtime APIs including streams, eve The source codes for all headers and the library implementation are available on GitHub in the `CLR repository `_. -Project Organization +Project organization ==================== CLR includes the following source code, diff --git a/docs/understand/hardware_implementation.rst b/docs/understand/hardware_implementation.rst index 8ee3e0e08c..9cf97b444a 100644 --- a/docs/understand/hardware_implementation.rst +++ b/docs/understand/hardware_implementation.rst @@ -5,13 +5,13 @@ .. 
_hardware_implementation: ******************************************************************************* -Hardware Implementation +Hardware implementation ******************************************************************************* This chapter describes the typical hardware implementation of GPUs supported by HIP, and how the :ref:`inherent_thread_model` maps to the hardware. -Compute Units +Compute units ============= The basic building block of a GPU is a compute unit (CU), also known @@ -79,7 +79,7 @@ instructions of the other branch have to be executed in the same way. The best performance can therefore be achieved when thread divergence is kept to a warp level, i.e. when all threads in a warp take the same execution path. -Vector Cache +Vector cache ------------ The usage of cache on a GPU differs from that on a CPU, as there is less cache @@ -88,7 +88,7 @@ warps in order to reduce the amount of accesses to device memory, and make that memory available for other warps that currently reside on the compute unit, that also need to load those values. -Local Data Share +Local data share ---------------- The local data share is memory that is accessible to all threads within a block. @@ -103,7 +103,7 @@ The scalar unit performs instructions that are uniform within a warp. It thereby improves efficiency and reduces the pressure on the vector ALUs and the vector register file. -CDNA Architecture +CDNA architecture ================= The general structure of CUs stays mostly as it is in GCN @@ -122,7 +122,7 @@ multiply-accumulate operations for Block Diagram of a CDNA3 Compute Unit. -RDNA Architecture +RDNA architecture ================= RDNA makes a fundamental change to CU design, by changing the @@ -145,7 +145,7 @@ an L0 cache. Block Diagram of an RDNA3 work group processor. 
-Shader Engines +Shader engines ============== For hardware implementation's sake, multiple CUs are grouped diff --git a/docs/understand/programming_model.rst b/docs/understand/programming_model.rst index 88ba476a89..53299bd6e4 100644 --- a/docs/understand/programming_model.rst +++ b/docs/understand/programming_model.rst @@ -5,7 +5,7 @@ :keywords: AMD, ROCm, HIP, CUDA, API design ******************************************************************************* -Understanding the HIP programming model +HIP programming model ******************************************************************************* The HIP programming model makes it easy to map data-parallel C/C++ algorithms to @@ -14,7 +14,7 @@ such as GPUs. A basic understanding of the underlying device architecture helps make efficient use of HIP and general purpose graphics processing unit (GPGPU) programming in general. -RDNA & CDNA Architecture Summary +RDNA & CDNA architecture summary ================================ Most GPU architectures, like RDNA and CDNA, have a hierarchical structure. @@ -68,7 +68,7 @@ memory subsystem resources. .. _programming_model_simt: -Single Instruction Multiple Threads +Single instruction multiple threads =================================== The single instruction, multiple threads (SIMT) programming model behind the @@ -117,7 +117,7 @@ usually isn't exploited from the width of the built-in vector types, but via the thread id constants ``threadIdx.x``, ``blockIdx.x``, etc. For more details, refer to :ref:`inherent_thread_model`. -Heterogeneous Programming +Heterogeneous programming ========================= The HIP programming model assumes two execution contexts. 
One is referred to as diff --git a/docs/understand/programming_model_reference.rst b/docs/understand/programming_model_reference.rst index 1fe9a44647..5c8d9c8a28 100644 --- a/docs/understand/programming_model_reference.rst +++ b/docs/understand/programming_model_reference.rst @@ -13,7 +13,7 @@ onto various architectures, primarily GPUs. While the model may be expressed in most imperative languages, (for example Python via PyHIP) this document will focus on the original C/C++ API of HIP. -Threading Model +Threading model =============== The SIMT nature of HIP is captured by the ability to execute user-provided @@ -26,7 +26,7 @@ The set of integers identifying a thread relate to the hierarchy in which thread .. _inherent_thread_model: -Inherent Thread Model +Inherent thread Model --------------------- The thread hierarchy inherent to how AMD GPUs operate is depicted in