Doc rework #185

Open
wants to merge 6 commits into
base: master
14 changes: 7 additions & 7 deletions doc/index.rst
@@ -21,8 +21,8 @@ blocks*, used to develop explicit memory and data management policies. The goals
of AML are:

* **composability**: application developers and performance experts should be
able to pick and choose the building blocks to use depending on their specific
needs.
able to pick and choose which building blocks to use depending on their
specific needs.

* **flexibility**: users should be able to customize, replace, or change the
configuration of each building block as much as possible.
@@ -36,7 +36,7 @@ AML currently implements the following abstractions:
* :doc:`Area <pages/areas>`, a set of addressable physical memories,
* :doc:`Layout <pages/layout>`, a description of data structure organization,
* :doc:`Tiling <pages/tilings>`, a description of data blocking (decomposition)
* :doc:`DMA <pages/dmas>`, an engine to asynchronously move data structures between areas,
* :doc:`DMA <pages/dmas>`, an engine to asynchronously move data structures between areas.

Each of these abstractions has several implementations. For instance, areas
may refer to the usual DRAM or its subset, to GPU memory, or to non-volatile memory.
@@ -76,7 +76,7 @@ Installation
Workflow
~~~~~~~~

Include the AML header:
Include AML header:

.. code-block:: c

@@ -93,7 +93,7 @@ Check the AML version:
return 1;
}

Initialize and clean up the library:
Initialize and cleanup AML:

.. code-block:: c

@@ -106,8 +106,8 @@ Initialize and clean up the library:

Link your program with *-laml*.

Check the above building-blocks-specific pages for further examples and
information on the library features.
See the above pages on specific building blocks for further examples and
information on library features.

Support
-------
14 changes: 14 additions & 0 deletions doc/pages/area_cuda_api.rst
@@ -1,4 +1,18 @@
Area Cuda Implementation API
=================================
CUDA implementation of AML areas.

.. code-block:: c

    #include <aml/area/cuda.h>

This building block relies on the CUDA implementation of
malloc/free to provide mmap/munmap on device memory.
Additional documentation of the CUDA runtime API can be found here:
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html

AML CUDA areas can be created to allocate memory on the current device or on a
specific CUDA device.
Allocations can be private to a single device or shared across devices.
Allocations can also be backed by host memory.

.. doxygengroup:: aml_area_cuda
89 changes: 87 additions & 2 deletions doc/pages/area_linux_api.rst
@@ -1,4 +1,89 @@
Area Linux Implementation API
=================================
Area Linux Implementation
=========================

This is the Linux implementation of AML areas.

This building block relies on libnuma and on the Linux mmap() / munmap()
system calls to provide mmap() / munmap() on the memory of NUMA host
processors.
New areas can be created to allocate from a specific subset of memories.
This building block also includes a static declaration of a default-initialized
area that can be used out of the box with the abstract area API.

.. code-block:: c

    #include <aml/area/linux.h>
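
As a rough illustration of the kernel primitives mentioned above, the sketch
below maps and unmaps private anonymous memory with plain POSIX mmap() /
munmap(). It is not AML code: the ``sketch_*`` names are hypothetical, and the
NUMA binding step that the Linux area adds through libnuma is deliberately left
out so that the example depends only on the C library.

.. code-block:: c

    #define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std modes */
    #include <stddef.h>
    #include <string.h>
    #include <sys/mman.h>

    /* Hypothetical helper, not part of AML: map `size` bytes of private,
     * anonymous, read/write memory. No NUMA binding is applied here. */
    static void *sketch_private_mmap(size_t size)
    {
        void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return ptr == MAP_FAILED ? NULL : ptr;
    }

    /* Hypothetical helper, not part of AML: release a previous mapping. */
    static int sketch_munmap(void *ptr, size_t size)
    {
        return munmap(ptr, size);
    }

    /* Maps 1 MiB, touches every byte, then unmaps. Returns 0 on success. */
    int sketch_demo(void)
    {
        const size_t size = (size_t)1 << 20;
        char *buf = sketch_private_mmap(size);

        if (buf == NULL)
            return -1;
        memset(buf, 0x2a, size);
        int ok = (buf[0] == 0x2a) && (buf[size - 1] == 0x2a);
        int unmapped = sketch_munmap(buf, size);
        return (ok && unmapped == 0) ? 0 : -1;
    }

A shared mapping would use MAP_SHARED instead of MAP_PRIVATE; how the AML flags
translate to these kernel flags is not shown on this page and is not assumed
here.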

Example
-------
Using a built-in feature of Linux areas:
we set up an area that allocates data accessible by several processes at the
same address and spread across all CPU memories (using the Linux interleave
policy).

.. code-block:: c

    // include ..

    struct aml_area* area;
    aml_area_linux_create(&area, AML_AREA_LINUX_MMAP_FLAG_SHARED, NULL,
                          AML_AREA_LINUX_BINDING_FLAG_INTERLEAVE);

    // When work is done with this area, free resources associated with it
    aml_area_linux_destroy(&area);

Integrating a new feature into a new area implementation that reuses some
Linux features:
you need an area feature that is not integrated in AML, but you still want to
work with the AML features built around areas.
You can extend the Linux area by embedding its data alongside additional
fields and providing custom implementations of the mmap and munmap functions.

.. code-block:: c

    // include ..

    // Declaration of the data field used in generic areas.
    struct aml_area_data {
        // Reuses the features of Linux areas.
        struct aml_area_linux_data linux_data;
        // Holds the additional, custom features.
        void* my_data;
    };

    // Instantiate the data with custom Linux settings.
    struct aml_area_data my_area_data = {
        .linux_data = {
            .nodeset = NULL,
            .binding_flags = AML_AREA_LINUX_BINDING_FLAG_INTERLEAVE,
            .mmap_flags = AML_AREA_LINUX_MMAP_FLAG_SHARED,
        },
        .my_data = whatever_floats_your_boat,
    };

    // Implement mmap using Linux area features plus custom features.
    void* my_mmap(const struct aml_area_data* data, void* ptr, size_t size){
        void* program_data = aml_area_linux_mmap(data->linux_data, ptr, size);
        aml_area_linux_mbind(data->linux_data, program_data, size);
        // Additional work we want to do on top of the Linux area work.
        whatever_shark(data->my_data, program_data, size);
        return program_data;
    }
    // Same for munmap.
    int my_munmap(const struct aml_area_data* data, void* ptr, size_t size);

    // Build your custom area.
    struct aml_area_ops my_area_ops = {
        .mmap = my_mmap,
        .munmap = my_munmap,
    };

    struct aml_area my_area = {
        .ops = my_area_ops,
        .data = my_area_data,
    };

    void* program_data = aml_area_mmap(&my_area, NULL, size);


And now you can call the generic API on your area.
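
The extension pattern described above (embed the base data, wrap the base
operations, plug the result into an operations table) can be tried without AML
installed. The following self-contained sketch uses hypothetical ``toy_*``
types that only mimic the shape of the example; they are not AML declarations,
and the C heap stands in for the Linux area so the code stays portable.

.. code-block:: c

    #include <stddef.h>
    #include <stdlib.h>

    /* Every name below is a hypothetical stand-in, not an AML declaration. */

    /* Settings owned by the "base" implementation being reused. */
    struct toy_base_data {
        int zero_fill; /* when nonzero, mapped memory is zeroed */
    };

    /* Custom data: embeds the base settings and adds one extra field. */
    struct toy_area_data {
        struct toy_base_data base;
        size_t bytes_mapped; /* running total maintained by the wrapper */
    };

    struct toy_area_ops {
        void *(*mmap)(struct toy_area_data *data, size_t size);
        int (*munmap)(struct toy_area_data *data, void *ptr, size_t size);
    };

    struct toy_area {
        const struct toy_area_ops *ops;
        struct toy_area_data *data;
    };

    /* Base behaviour, backed by the C heap to keep the sketch portable. */
    static void *toy_base_mmap(const struct toy_base_data *base, size_t size)
    {
        return base->zero_fill ? calloc(1, size) : malloc(size);
    }

    /* Wrapper: delegate to the base, then do the extra bookkeeping. */
    static void *toy_custom_mmap(struct toy_area_data *data, size_t size)
    {
        void *ptr = toy_base_mmap(&data->base, size);
        if (ptr != NULL)
            data->bytes_mapped += size;
        return ptr;
    }

    static int toy_custom_munmap(struct toy_area_data *data, void *ptr,
                                 size_t size)
    {
        free(ptr);
        data->bytes_mapped -= size;
        return 0;
    }

    /* Generic entry points: callers never name the implementation. */
    static void *toy_area_mmap(const struct toy_area *area, size_t size)
    {
        return area->ops->mmap(area->data, size);
    }

    static int toy_area_munmap(const struct toy_area *area, void *ptr,
                               size_t size)
    {
        return area->ops->munmap(area->data, ptr, size);
    }

    /* Returns 0 when the accounting matches what was mapped and released. */
    int toy_demo(void)
    {
        static const struct toy_area_ops ops = {
            .mmap = toy_custom_mmap,
            .munmap = toy_custom_munmap,
        };
        struct toy_area_data data = {
            .base = { .zero_fill = 1 },
            .bytes_mapped = 0,
        };
        struct toy_area area = { .ops = &ops, .data = &data };

        unsigned char *buf = toy_area_mmap(&area, 256);
        if (buf == NULL)
            return -1;
        int ok = (data.bytes_mapped == 256) && (buf[0] == 0) && (buf[255] == 0);
        toy_area_munmap(&area, buf, 256);
        return (ok && data.bytes_mapped == 0) ? 0 : -1;
    }

Embedding the base data as the first member keeps the base settings and the
custom fields in a single object, so one data pointer is enough for both the
reused and the added behaviour.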

Area Linux API
==============

.. doxygengroup:: aml_area_linux
11 changes: 11 additions & 0 deletions doc/pages/area_opencl_api.rst
@@ -1,4 +1,15 @@
Area OpenCL Implementation API
=================================

OpenCL implementation of AML areas.

.. code-block:: c

    #include <aml/area/opencl.h>

This building block relies on the OpenCL implementation of
device memory allocation to provide mmap/munmap on device memory.
Additional documentation of the OpenCL memory model can be found here:
https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_API.html#_memory_model

.. doxygengroup:: aml_area_opencl
12 changes: 12 additions & 0 deletions doc/pages/area_ze_api.rst
@@ -1,4 +1,16 @@
Area Level Zero Implementation API
==================================

Implementation of AML areas with the Level Zero API.

.. code-block:: c

    #include <aml/area/ze.h>

This building block relies on the Level Zero (ze) implementation of
host and device memory mapping to provide mmap/munmap on device memory.
Additional documentation of the Level Zero memory model can be found here:

https://spec.oneapi.com/level-zero/latest/core/api.html#memory

.. doxygengroup:: aml_area_ze
80 changes: 80 additions & 0 deletions doc/pages/areas.rst
@@ -1,10 +1,90 @@
Areas: Addressable Physical Memories
====================================

AML areas represent places where data can be stored.
In shared memory systems, locality is a major concern for performance, and
being able to request memory from specific places is key to achieving it.
AML areas provide low-level mmap() / munmap() functions to request memory from
specific places, which are materialized as areas.
The available area implementations dictate how such places can be arranged and
what their properties are.

.. image:: ../img/area.png
   :width: 700px
   :alt: Illustration of areas on a complex system.

An AML area is an implementation of memory operations for several types of
devices behind a consistent abstraction.
This abstraction is meant to be implemented for several kinds of devices, i.e.
the same function calls allocate memory on different kinds of devices depending
on the area implementation provided.
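
To make the idea of "same calls, different implementation" concrete, here is a
minimal sketch that does not use AML at all. The ``mini_area`` type and its
functions are hypothetical; two interchangeable implementations (one backed by
the C heap, one by an anonymous kernel mapping) sit behind the same pair of
generic calls.

.. code-block:: c

    #define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std modes */
    #include <stddef.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    /* Hypothetical miniature of the area abstraction; not AML's real types. */
    struct mini_area {
        void *(*map)(size_t size);
        int (*unmap)(void *ptr, size_t size);
    };

    /* Implementation 1: memory taken from the C heap. */
    static void *heap_map(size_t size)
    {
        return malloc(size);
    }

    static int heap_unmap(void *ptr, size_t size)
    {
        (void)size; /* free() does not need the size */
        free(ptr);
        return 0;
    }

    /* Implementation 2: anonymous private mapping obtained from the kernel. */
    static void *anon_map(size_t size)
    {
        void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return ptr == MAP_FAILED ? NULL : ptr;
    }

    static int anon_unmap(void *ptr, size_t size)
    {
        return munmap(ptr, size);
    }

    static const struct mini_area mini_area_heap = { heap_map, heap_unmap };
    static const struct mini_area mini_area_anon = { anon_map, anon_unmap };

    /* Generic calls: the caller's code is identical for every area. */
    static void *mini_area_mmap(const struct mini_area *area, size_t size)
    {
        return area->map(size);
    }

    static int mini_area_munmap(const struct mini_area *area, void *ptr,
                                size_t size)
    {
        return area->unmap(ptr, size);
    }

    /* Runs the same allocate/use/release sequence on both areas.
     * Returns 0 if every step succeeded. */
    int mini_demo(void)
    {
        const struct mini_area *areas[] = { &mini_area_heap, &mini_area_anon };

        for (size_t i = 0; i < sizeof(areas) / sizeof(areas[0]); i++) {
            int *data = mini_area_mmap(areas[i], 4096);
            if (data == NULL)
                return -1;
            data[0] = 42; /* the memory is usable regardless of its origin */
            int ok = (data[0] == 42);
            if (mini_area_munmap(areas[i], data, 4096) != 0 || !ok)
                return -1;
        }
        return 0;
    }

Only the table of function pointers changes between the two areas, which is
the property that lets application code stay the same when the target memory
changes.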

With the high level API, you can:

* Use an area to allocate space for your data
* Release the data in this area

Example
-------

Let's look at how these operations can be done in a C program.

.. code-block:: c

    #include <aml.h>
    #include <aml/area/linux.h>

    int main(){
        size_t s = 1 << 20; // allocation size in bytes (example value)
        void* data = aml_area_mmap(&aml_area_linux, s, NULL);
        do_work(data);
        aml_area_munmap(&aml_area_linux, data, s);
    }

We start by including the AML interface, as well as the area implementation we
want to use.

We then allocate space for data of size s using the default area of the AML
Linux implementation.
The data is visible only to this process and is placed according to the
default Linux allocation policy.

Finally, when the work on the data is done, we free it.


Area API
--------

Note that the functions provided through the Area API are low-level functions
and, unlike allocators, are not optimized for performance.

.. doxygengroup:: aml_area


Implementations
---------------
Advanced users may create or modify an implementation by assembling the
appropriate operations in an aml_area_ops structure.

The Linux implementation is the go-to choice for simple areas on NUMA CPUs
running the Linux operating system.

Work on hwloc, CUDA and OpenCL areas is ongoing.

Let's look at an example that dynamically creates a Linux area identical to the
static default aml_area_linux:

.. code-block:: c

    #include <aml.h>
    #include <aml/area/linux.h>

    int main(){
        struct aml_area* area;
        aml_area_linux_create(&area, AML_AREA_LINUX_MMAP_FLAG_PRIVATE, NULL,
                              AML_AREA_LINUX_BINDING_FLAG_DEFAULT);
        do_work(area);
        aml_area_linux_destroy(&area);
    }

.. toctree::
