WIP: redesign the declarative layout APIs #102

Open
wants to merge 47 commits into
base: master
bf5f319
[feature] Add layout building block
perarnau Nov 7, 2018
6ee737b
[feature] Add copy/transform utilities
perarnau Nov 7, 2018
a821c11
[wip/feature] add new layout design
perarnau Dec 7, 2018
5b9b77e
Bugfix for layout.
Kerilk Dec 8, 2018
01ed44a
Adapted copy operators. For now transforms are expressed in row major…
Kerilk Dec 8, 2018
163ddf9
[refactor] use a bitfield to add type information
perarnau Dec 10, 2018
ab9d661
[refactor] better names for copy_layout functions
perarnau Dec 10, 2018
55de883
Change arguments of transform (transpose) to match those of Python an…
Kerilk Dec 10, 2018
bc47647
Swapped column and row to match the classical notations.
Kerilk Dec 10, 2018
9c05ce6
[feature] implement layout dense internals
perarnau Dec 13, 2018
9238feb
Added tests for column and row major layouts.
Kerilk Dec 13, 2018
b3596fb
Store the pitch given by the user.
Kerilk Dec 13, 2018
ae1bdaf
Replaced copy operators by generated ones and added the generator.
Kerilk Dec 20, 2018
29bbe34
Removed useless curly braces also.
Kerilk Dec 20, 2018
18906d9
Fix naming inconsistencies.
Kerilk Jan 4, 2019
f9c8c2d
Automatic enumeration of design space.
Kerilk Jan 4, 2019
fea8a1e
Starting adding support for generic copy operators of layout.
Kerilk Jan 4, 2019
35420d1
Added ndims and element_size methods to layouts.
Kerilk Jan 7, 2019
5033fe9
Added generic copy operators.
Kerilk Jan 7, 2019
0d35405
Refactoring.
Kerilk Jan 11, 2019
afd21ac
Corrected (I hope) the bit set macro...
Kerilk Jan 11, 2019
6ed9d36
Added a padding layout.
Kerilk Jan 11, 2019
8da06a1
Bugfix.
Kerilk Jan 12, 2019
997b420
More rigorous asserts.
Kerilk Jan 12, 2019
a249a8e
Added reshape operation for dense layouts.
Kerilk Jan 15, 2019
3d4fdf0
Bugfix
Kerilk Jan 15, 2019
9d72fde
Added a reshaping layout to be used on padding layouts or as a fallba…
Kerilk Jan 16, 2019
5e6148d
Bugfix
Kerilk Jan 16, 2019
a670594
Added dims accessor in column order for layout.
Kerilk Jan 26, 2019
695d641
WIP
Kerilk Jan 26, 2019
58a8b88
[build] add libexcit as build dependency
perarnau Jan 28, 2019
03db551
[fix] try to fix CI
perarnau Jan 28, 2019
e8e71d0
[ci] fix knl ci for excit
perarnau Jan 28, 2019
93ba5f0
Added slice operation to dense layouts.
Kerilk Jan 28, 2019
efbc93b
[fix] various typos on new tilings
perarnau Jan 28, 2019
0134a28
Added native (column) version of some functions.
Kerilk Jan 28, 2019
a4facf5
Working version of resizing tiling with tests.
Kerilk Jan 28, 2019
4863232
[feature] add layout-aware dma
perarnau Jan 29, 2019
8aefa5e
[feature] add dma operator to the layout dma
perarnau Jan 29, 2019
323fe9e
[fix] add tests for the new dma, make it work
perarnau Jan 29, 2019
87e0c55
Added padding tiling and corrected tests.
Kerilk Jan 29, 2019
ea73d4f
Exposed layout column api.
Kerilk Jan 29, 2019
dccfebe
Use column api for copy operators and better checks of compatibility.
Kerilk Jan 29, 2019
ff26f8c
Test mixing row and column layout/tiling/copy.
Kerilk Jan 29, 2019
0a4a5bf
[feature] double dma/layout scratch
perarnau Jan 30, 2019
a7b9a10
Added tiling collapsing unused dimensions.
Kerilk Feb 4, 2019
af6da1a
Bugfix and associated test.
Kerilk Feb 4, 2019
18 changes: 17 additions & 1 deletion .gitlab-ci.yml
@@ -4,11 +4,19 @@ stages:
make:generic:
stage: build
script:
- git clone https://xgitlab.cels.anl.gov/argo/excit.git
- cd excit
- ./autogen.sh
- mkdir build
- ./configure --prefix=`pwd`/build
- make
- make install
- cd ..
- ./autogen.sh
- mkdir build
- PKG_CONFIG_PATH=excit/build/lib/pkgconfig ./configure --prefix=`pwd`/build
- make
- make install
- make check
artifacts:
when: on_failure
@@ -22,9 +30,17 @@ make:knl:
stage: build
script:
- source /opt/intel/compilers_and_libraries/linux/bin/compilervars.sh intel64
- git clone https://xgitlab.cels.anl.gov/argo/excit.git
- cd excit
- ./autogen.sh
- mkdir build
- ./configure --prefix=`pwd`/build
- make
- make install
- cd ..
- ./autogen.sh
- mkdir build
- CC=icc CFLAGS="-mkl -xhost" ./configure --prefix=`pwd`/build --enable-benchmarks
- CC=icc CFLAGS="-mkl -xhost" PKG_CONFIG_PATH=excit/build/lib/pkgconfig ./configure --prefix=`pwd`/build --enable-benchmarks
- make -j64
- make install
- make check
3 changes: 3 additions & 0 deletions configure.ac
@@ -52,6 +52,9 @@ AM_CONDITIONAL([ADD_BENCHMARKS],[test "x$benchmarks" = xtrue])
AC_CHECK_HEADERS(numa.h)
AC_CHECK_LIB(numa, move_pages)

# excit iterators
PKG_CHECK_MODULES([EXCIT],[libexcit])

# internal jemalloc
ac_configure_args="$ac_configure_args \
'--with-jemalloc-prefix=jemk_aml_' \
36 changes: 30 additions & 6 deletions src/Makefile.am
@@ -1,4 +1,4 @@
AM_CPPFLAGS = -I$(top_srcdir)/jemalloc/include
AM_CPPFLAGS = -I$(top_srcdir)/jemalloc/include @EXCIT_CFLAGS@
lib_LTLIBRARIES = libaml.la

ARENA_JEMALLOC_CSOURCES = arena_jemalloc.c
@@ -10,21 +10,33 @@ AREA_LINUX_CSOURCES = area_linux.c \

AREA_POSIX_CSOURCES = area_posix.c

LAYOUT_CSOURCES = layout.c \
layout_dense.c \
layout_pad.c \
layout_reshape.c

TILING_CSOURCES = tiling.c \
tiling_1d.c \
tiling_2d.c

TILING_ND_CSOURCES = tiling_nd.c \
tiling_nd_resize.c \
tiling_nd_pad.c \
tiling_nd_collapse.c

BINDING_CSOURCES = binding.c \
binding_single.c \
binding_interleave.c

DMA_CSOURCES = dma.c \
dma_linux_par.c \
dma_linux_seq.c
dma_linux_seq.c \
dma_layout.c

SCRATCH_CSOURCES = scratch.c \
scratch_seq.c \
scratch_par.c
scratch_par.c \
scratch_double.c

UTILS_CSOURCES = vector.c

@@ -34,12 +46,24 @@ LIBCSOURCES = aml.c area.c arena.c \
$(AREA_LINUX_CSOURCES) \
$(AREA_POSIX_CSOURCES) \
$(TILING_CSOURCES) \
$(TILING_ND_CSOURCES) \
$(BINDING_CSOURCES) \
$(DMA_CSOURCES) \
$(SCRATCH_CSOURCES)
$(SCRATCH_CSOURCES) \
$(LAYOUT_CSOURCES) \
copy.c

LIBHSOURCES = aml.h
LIBHSOURCES = aml.h \
aml-layout.h \
aml-layout-dense.h \
aml-layout-pad.h \
aml-layout-reshape.h \
aml-tiling.h \
aml-tiling-resize.h \
aml-tiling-pad.h \
aml-tiling-collapse.h \
aml-copy.h

libaml_la_SOURCES = $(LIBCSOURCES) $(LIBHSOURCES)
libaml_la_LIBADD = -L$(top_srcdir)/jemalloc/lib/ -ljemalloc-aml
libaml_la_LIBADD = -L$(top_srcdir)/jemalloc/lib/ -ljemalloc-aml @EXCIT_LIBS@
include_HEADERS = $(LIBHSOURCES)
192 changes: 192 additions & 0 deletions src/aml-copy.h
@@ -0,0 +1,192 @@
#ifndef AML_COPY_H
#define AML_COPY_H 1

/*******************************************************************************
* Hypervolume copy and transpose functions.
******************************************************************************/

/*
* Copies a (sub-)hypervolume to another (sub-)hypervolume.
* "d": number of dimensions.
* "dst": pointer to the destination hypervolume.
* "dst_pitch": pointer to d-1 pitch values representing the pitch
* in each dimension of the destination hypervolume.
* "src": pointer to the source hypervolume.
* "src_pitch": pointer to d-1 pitch values representing the pitch
* in each dimension of the source hypervolume.
* "elem_number": pointer to d values representing the number of elements
* in each dimension of the (sub-)hypervolume to copy.
* "elem_size": size of memory elements.
* Returns 0 if successful; an error code otherwise.
*/
int aml_copy_nd(size_t d, void *dst, const size_t *dst_pitch,
const void *src, const size_t *src_pitch,
const size_t *elem_number, const size_t elem_size);
/*
* Copies a (sub-)hypervolume to another (sub-)hypervolume while transposing.
* Reverse of aml_copy_rtnd.
* Example a[3][4][5] -> b[5][3][4] (C notation).
* "d": number of dimensions.
* "dst": pointer to the destination hypervolume.
* "dst_pitch": pointer to d-1 pitch values representing the pitch
* in each dimension of the destination hypervolume.
* "src": pointer to the source hypervolume.
* "src_pitch": pointer to d-1 pitch values representing the pitch
* in each dimension of the source hypervolume.
* "elem_number": pointer to d values representing the number of elements
* in each dimension of the (sub-)hypervolume to copy.
* "elem_size": size of memory elements in the src hypervolume order.
* Returns 0 if successful; an error code otherwise.
*/
int aml_copy_tnd(size_t d, void *dst, const size_t *dst_pitch,
const void *src, const size_t *src_pitch,
const size_t *elem_number, const size_t elem_size);
/*
* Copies a (sub-)hypervolume to another (sub-)hypervolume while transposing.
* Reverse of aml_copy_tnd.
* Example a[3][4][5] -> b[4][5][3] (C notation).
* "d": number of dimensions.
* "dst": pointer to the destination hypervolume.
* "dst_pitch": pointer to d-1 pitch values representing the pitch
* in each dimension of the destination hypervolume.
* "src": pointer to the source hypervolume.
* "src_pitch": pointer to d-1 pitch values representing the pitch
* in each dimension of the source hypervolume.
* "elem_number": pointer to d values representing the number of elements
* in each dimension of the (sub-)hypervolume to copy.
* "elem_size": size of memory elements in the src hypervolume order.
* Returns 0 if successful; an error code otherwise.
*/
int aml_copy_rtnd(size_t d, void *dst, const size_t *dst_pitch,
const void *src, const size_t *src_pitch,
const size_t *elem_number, const size_t elem_size);

/*
* Copies a (sub-)hypervolume to another (sub-)hypervolume while shuffling
* dimensions. Example a[4][2][3][5] -> b[5][4][3][2] (C notation).
* "d": number of dimensions.
* "target_dims": array of d dimension indices representing the mapping
* between the source dimensions and the target dimensions.
* Example [3, 1, 0, 2]
* "dst": pointer to the destination hypervolume.
* "dst_pitch": pointer to d-1 pitch values representing the pitch
* in each dimension of the destination hypervolume.
* "src": pointer to the source hypervolume.
* "src_pitch": pointer to d-1 pitch values representing the pitch
* in each dimension of the source hypervolume.
* "elem_number": pointer to d values representing the number of elements
* in each dimension of the (sub-)hypervolume to copy.
* "elem_size": size of memory elements in the src hypervolume order.
* Returns 0 if successful; an error code otherwise.
*/
int aml_copy_shnd(size_t d, const size_t *target_dims, void *dst,
const size_t *dst_pitch, const void *src,
const size_t *src_pitch, const size_t *elem_number,
const size_t elem_size);
/*
* Strided version of aml_copy_nd.
*/
int aml_copy_ndstr(size_t d, void *dst, const size_t *dst_pitch,
const size_t *dst_stride, const void *src,
const size_t *src_pitch, const size_t *src_stride,
const size_t *elem_number, const size_t elem_size);
/*
* Strided version of aml_copy_tnd.
*/
int aml_copy_tndstr(size_t d, void *dst, const size_t *dst_pitch,
const size_t *dst_stride, const void *src,
const size_t *src_pitch, const size_t *src_stride,
const size_t *elem_number, const size_t elem_size);
/*
* Strided version of aml_copy_rtnd.
*/
int aml_copy_rtndstr(size_t d, void *dst, const size_t *dst_pitch,
const size_t *dst_stride, const void *src,
const size_t *src_pitch, const size_t *src_stride,
const size_t *elem_number, const size_t elem_size);
/*
* Strided version of aml_copy_shnd.
*/
int aml_copy_shndstr(size_t d, const size_t *target_dims, void *dst,
const size_t *dst_pitch, const size_t *dst_stride,
const void *src, const size_t *src_pitch,
const size_t *src_stride, const size_t *elem_number,
const size_t elem_size);
/*
* Version of aml_copy_nd using cumulative pitch.
*/
int aml_copy_nd_c(size_t d, void *dst, const size_t *cumul_dst_pitch,
const void *src, const size_t *cumul_src_pitch,
const size_t *elem_number, const size_t elem_size);
/*
* Version of aml_copy_ndstr using cumulative pitch.
*/
int aml_copy_ndstr_c(size_t d, void *dst, const size_t *dst_pitch,
const size_t *cumul_dst_stride, const void *src,
const size_t *src_pitch, const size_t *cumul_src_stride,
const size_t *elem_number, const size_t elem_size);
/*
* Version of aml_copy_tnd using cumulative pitch.
*/
int aml_copy_tnd_c(size_t d, void *dst, const size_t *cumul_dst_pitch,
const void *src, const size_t *cumul_src_pitch,
const size_t *elem_number, const size_t elem_size);
/*
* Version of aml_copy_rtnd using cumulative pitch.
*/
int aml_copy_rtnd_c(size_t d, void *dst, const size_t *cumul_dst_pitch,
const void *src, const size_t *cumul_src_pitch,
const size_t *elem_number, const size_t elem_size);
/*
* Version of aml_copy_shnd using cumulative pitch.
*/
int aml_copy_shnd_c(size_t d, const size_t *target_dims, void *dst,
const size_t *cumul_dst_pitch, const void *src,
const size_t *cumul_src_pitch, const size_t *elem_number,
const size_t elem_size);
/*
* Version of aml_copy_tndstr using cumulative pitch.
*/
int aml_copy_tndstr_c(size_t d, void *dst, const size_t *cumul_dst_pitch,
const size_t *dst_stride, const void *src,
const size_t *cumul_src_pitch, const size_t *src_stride,
const size_t *elem_number, const size_t elem_size);
/*
* Version of aml_copy_rtndstr using cumulative pitch.
*/
int aml_copy_rtndstr_c(size_t d, void *dst, const size_t *cumul_dst_pitch,
const size_t *dst_stride, const void *src,
const size_t *cumul_src_pitch, const size_t *src_stride,
const size_t *elem_number, const size_t elem_size);
/*
* Version of aml_copy_shndstr using cumulative pitch.
*/
int aml_copy_shndstr_c(size_t d, const size_t *target_dims, void *dst,
const size_t *cumul_dst_pitch, const size_t *dst_stride,
const void *src, const size_t *cumul_src_pitch,
const size_t *src_stride, const size_t *elem_number,
const size_t elem_size);

/*******************************************************************************
* Generic building block API: Native version
* Native means using AML-internal layouts.
******************************************************************************/

int aml_copy_layout_native(struct aml_layout *dst,
const struct aml_layout *src);
int aml_copy_layout_transform_native(struct aml_layout *dst,
const struct aml_layout *src,
const size_t *target_dims);
int aml_copy_layout_generic(struct aml_layout *dst,
const struct aml_layout *src);
int aml_copy_layout_transform_generic(struct aml_layout *dst,
const struct aml_layout *src,
const size_t *target_dims);
int aml_copy_layout_transpose_native(struct aml_layout *dst, const struct aml_layout *src);
int aml_copy_layout_reverse_transpose_native(struct aml_layout *dst,
const struct aml_layout *src);
int aml_copy_layout_transpose_generic(struct aml_layout *dst, const struct aml_layout *src);
int aml_copy_layout_reverse_transpose_generic(struct aml_layout *dst,
const struct aml_layout *src);

#endif
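As a reading aid for the header above, here is a minimal sketch of what a dimension-shuffling copy in the style of `aml_copy_shnd` could look like. It is not the implementation from this PR. It assumes that dimensions are indexed fastest-varying first, that pitches are counted in elements rather than bytes, that `elem_number` is given in source order, and that `target_dims[i]` names the destination dimension that source dimension `i` maps to. That reading is consistent with the documented example `a[4][2][3][5] -> b[5][4][3][2]` with `[3, 1, 0, 2]`, but it is an inference from the comments. The names `sketch_copy_shnd` and `SKETCH_MAX_DIMS` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_DIMS 16 /* hypothetical limit, keeps the sketch allocation-free */

/*
 * Hypothetical sketch, NOT the AML implementation.
 * Assumptions: dimension 0 varies fastest, pitches are in elements,
 * elem_number is given in source order, and target_dims[i] is the
 * destination dimension that source dimension i is mapped to.
 */
int sketch_copy_shnd(size_t d, const size_t *target_dims, void *dst,
		     const size_t *dst_pitch, const void *src,
		     const size_t *src_pitch, const size_t *elem_number,
		     size_t elem_size)
{
	size_t src_stride[SKETCH_MAX_DIMS], dst_stride[SKETCH_MAX_DIMS];
	size_t idx[SKETCH_MAX_DIMS] = { 0 };
	size_t total = 1;

	if (d == 0 || d > SKETCH_MAX_DIMS)
		return -1;

	/* Running products of the d-1 pitches give per-dimension strides. */
	src_stride[0] = dst_stride[0] = 1;
	for (size_t i = 1; i < d; i++) {
		src_stride[i] = src_stride[i - 1] * src_pitch[i - 1];
		dst_stride[i] = dst_stride[i - 1] * dst_pitch[i - 1];
	}
	for (size_t i = 0; i < d; i++) {
		if (target_dims[i] >= d)
			return -1;
		total *= elem_number[i];
	}

	for (size_t n = 0; n < total; n++) {
		size_t src_off = 0, dst_off = 0;

		/* Same index tuple, but destination strides are permuted. */
		for (size_t i = 0; i < d; i++) {
			src_off += idx[i] * src_stride[i];
			dst_off += idx[i] * dst_stride[target_dims[i]];
		}
		memcpy((char *)dst + dst_off * elem_size,
		       (const char *)src + src_off * elem_size, elem_size);

		/* Advance the index tuple like an odometer, dimension 0 first. */
		for (size_t i = 0; i < d; i++) {
			if (++idx[i] < elem_number[i])
				break;
			idx[i] = 0;
		}
	}
	return 0;
}
```

Under these assumptions, the documented `aml_copy_tnd` example `a[3][4][5] -> b[5][3][4]` corresponds to `target_dims = {2, 0, 1}`, `elem_number = {5, 4, 3}`, `src_pitch = {5, 4}` and `dst_pitch = {4, 3}`. The stride arrays computed at the top are running products of the pitches, which is plausibly what the `_c` variants mean by cumulative pitch, although the header does not spell that out.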
41 changes: 41 additions & 0 deletions src/aml-dma-layout.h
@@ -0,0 +1,41 @@
#ifndef AML_DMA_LAYOUT_H
#define AML_DMA_LAYOUT_H 1

/*******************************************************************************
* Layout-aware DMA
* DMA using layouts as source and destination.
******************************************************************************/

extern struct aml_dma_ops aml_dma_ops_layout;

struct aml_dma_request_layout {
int type;
struct aml_layout *dest;
struct aml_layout *src;
};

typedef int (*aml_dma_operator)(struct aml_layout *, struct aml_layout *, void*);
struct aml_dma_layout {
struct aml_vector requests;
pthread_mutex_t lock;
aml_dma_operator do_work;
void *work_arg;
};

#define AML_DMA_LAYOUT_DECL(name) \
struct aml_dma_layout __ ##name## _inner_data; \
struct aml_dma name = { \
&aml_dma_ops_layout, \
(struct aml_dma_data *)&__ ## name ## _inner_data, \
};

#define AML_DMA_LAYOUT_ALLOCSIZE \
(sizeof(struct aml_dma_layout) + \
sizeof(struct aml_dma))

int aml_dma_layout_create(struct aml_dma **dma, ...);
int aml_dma_layout_init(struct aml_dma *dma, ...);
int aml_dma_layout_vinit(struct aml_dma *dma, va_list args);
int aml_dma_layout_destroy(struct aml_dma *dma);

#endif
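The `do_work`/`work_arg` pair in `struct aml_dma_layout` follows the usual callback-plus-opaque-argument pattern. The sketch below illustrates that pattern only. It uses a hypothetical stand-in type `struct sketch_layout` (a flat buffer) instead of the real `struct aml_layout`, whose definition is not part of this excerpt, and it assumes the first operator argument is the destination, which the typedef itself does not state. Every `sketch_` name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct aml_layout: a flat buffer and its size. */
struct sketch_layout {
	void *ptr;
	size_t nbytes;
};

/* Same shape as aml_dma_operator, but over the stand-in type. */
typedef int (*sketch_dma_operator)(struct sketch_layout *,
				   struct sketch_layout *, void *);

/* Mirrors only the do_work/work_arg members of struct aml_dma_layout. */
struct sketch_dma {
	sketch_dma_operator do_work;
	void *work_arg;
};

/*
 * Example operator: a flat byte copy. The opaque argument is used here as
 * an optional call counter. Assumes (dest, src) argument order.
 */
int sketch_op_memcpy(struct sketch_layout *dest, struct sketch_layout *src,
		     void *arg)
{
	size_t *calls = arg;

	if (dest->nbytes < src->nbytes)
		return -1;
	memcpy(dest->ptr, src->ptr, src->nbytes);
	if (calls != NULL)
		(*calls)++;
	return 0;
}

/* Serving a request: forward both layouts and the stored argument. */
int sketch_dma_do_request(struct sketch_dma *dma, struct sketch_layout *dest,
			  struct sketch_layout *src)
{
	return dma->do_work(dest, src, dma->work_arg);
}
```

Keeping the operator behind a function pointer lets the same request machinery drive a plain copy, a transposing copy, or any other layout-to-layout transform, with `work_arg` carrying whatever extra state that operator needs.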