[#198] shad STL algorithm examples #199

Open, wants to merge 5 commits into base: ics_2021_tutorial


NanmiaoWu
Collaborator

This PR adds examples that apply the shad STL algorithms to shad_array, shad_unordered_set, and shad_unordered_map.

The command lines and the results on the puma cluster are as follows:

  • For shad_array
[wuna926@pnode02 build_gmt_Debug]$ mpirun -np 4 ./examples/stl_algorithm/array 

----------------------------------------------------------------------------------------------------
                                        GMT config
----------------------------------------------------------------------------------------------------
                   Option   Dynamic           Value

                num_nodes       yes               4
              num_workers       yes              19
              num_helpers       yes               1
  num_uthreads_per_worker       yes            1024
              max_nesting       yes               2
         comm_buffer_size       yes          262144
    num_buffs_per_channel       yes              64
           num_cmd_blocks       yes             128
           cmd_block_size       yes            4096
         mtasks_per_queue       yes         1048576
        num_mtasks_queues       yes               9
     max_handles_per_node       yes          262144
      handle_check_interv       yes            1024
       mtask_check_interv       yes          100000
        cmdb_check_interv       yes          100000
    node_agg_check_interv       yes         2000000
           dta_chunk_size       yes            1024
dta_prealloc_worker_chunks       yes              32
dta_prealloc_helper_chunks       yes             256
               state_name       yes                
                 state_rw       yes               1
           state_populate       yes               0
                 ssd_path       yes                
                disk_path       yes                
        limit_parallelism       yes               0
           thread_pinning       yes               0
                num_cores       yes              20
           stride_pinning       yes               1
    release_uthread_stack       yes               0
        print_stack_break       yes               0
      print_gmt_mem_usage       yes               0
       print_sched_interv       yes               0
        enable_usr_signal       yes               0

  ENABLE_SINGLE_NODE_ONLY                         0
           ENABLE_ASSERTS                         1
         ENABLE_PROFILING                         0
            ENABLE_TIMING                         0
     ENABLE_GMT_UCONTEXTS                         1
 ENABLE_EXPANDABLE_STACKS                         1
       ENABLE_AGGREGATION                         1
  ENABLE_HELPER_BUFF_COPY                         1
            BUILD_VERSION                     2.0.0
                   CFLAGS                          
            COMPILER_NAME                          
         COMPILER_VERSION                          
----------------------------------------------------------------------------------------------------
Array, using 4 localities, shad::fill took 0.151982 seconds
Array, using 4 localities, shad::generate took 0.167836 seconds
Array, using 4 localities, shad::count took 0.231977 seconds (numbers of 0 = 101 )
Array, using 4 localities, shad::find_if took 0.223933 seconds, array contains an even number
Array, using 4 localities, shad::for_each took 0.192939 seconds
Array, using 4 localities, shad::minmax took 0.191974 seconds
WARNING: Called wait on a NULL handle
WARNING: Called wait on a NULL handle
WARNING: Called wait on a NULL handle
WARNING: Called wait on a NULL handle
Array, using 4 localities, shad::transform took 0.125959 seconds
  • For shad_unordered_set
[wuna926@pnode02 build_gmt_Debug]$ mpirun -np 4 ./examples/stl_algorithm/unordered_set 
----------------------------------------------------------------------------------------------------
                                        GMT config
----------------------------------------------------------------------------------------------------
                   Option   Dynamic           Value

                num_nodes       yes               4
              num_workers       yes              19
              num_helpers       yes               1
  num_uthreads_per_worker       yes            1024
              max_nesting       yes               2
         comm_buffer_size       yes          262144
    num_buffs_per_channel       yes              64
           num_cmd_blocks       yes             128
           cmd_block_size       yes            4096
         mtasks_per_queue       yes         1048576
        num_mtasks_queues       yes               9
     max_handles_per_node       yes          262144
      handle_check_interv       yes            1024
       mtask_check_interv       yes          100000
        cmdb_check_interv       yes          100000
    node_agg_check_interv       yes         2000000
           dta_chunk_size       yes            1024
dta_prealloc_worker_chunks       yes              32
dta_prealloc_helper_chunks       yes             256
               state_name       yes                
                 state_rw       yes               1
           state_populate       yes               0
                 ssd_path       yes                
                disk_path       yes                
        limit_parallelism       yes               0
           thread_pinning       yes               0
                num_cores       yes              20
           stride_pinning       yes               1
    release_uthread_stack       yes               0
        print_stack_break       yes               0
      print_gmt_mem_usage       yes               0
       print_sched_interv       yes               0
        enable_usr_signal       yes               0

  ENABLE_SINGLE_NODE_ONLY                         0
           ENABLE_ASSERTS                         1
         ENABLE_PROFILING                         0
            ENABLE_TIMING                         0
     ENABLE_GMT_UCONTEXTS                         1
 ENABLE_EXPANDABLE_STACKS                         1
       ENABLE_AGGREGATION                         1
  ENABLE_HELPER_BUFF_COPY                         1
            BUILD_VERSION                     2.0.0
                   CFLAGS                          
            COMPILER_NAME                          
         COMPILER_VERSION                          
----------------------------------------------------------------------------------------------------
WARNING: Called wait on a NULL handle
Unordered set, using 4 localities, shad::count took 0.174822 seconds (min = 2, max = 2048 )
Unordered set, using 4 localities, shad::find_if took 0.172014 seconds, and this unordered set contains an even number
Unordered set, using 4 localities, shad::any_of took 0.19099 seconds, and this unordered set contains at least one number that is divisible by 7
Unordered set, using 4 localities, shad::count_if took 0.152018 seconds, and number divisible by 3: 341
Unordered set, using 4 localities, shad::transform took 10.51 seconds
  • For shad_unordered_map
[wuna926@pnode02 build_gmt_Debug]$ mpirun -np 4 ./examples/stl_algorithm/unordered_map 
--------------------------------------------------------------------------
By default, for Open MPI 4.0 and later, infiniband ports on a device
are not used by default.  The intent is to use UCX for these devices.
You can override this policy by setting the btl_openib_allow_ib MCA parameter
to true.

  Local host:              pnode14
  Local adapter:           mlx4_0
  Local port:              1

--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   pnode14
  Local device: mlx4_0
--------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
                                        GMT config
----------------------------------------------------------------------------------------------------
                   Option   Dynamic           Value

                num_nodes       yes               4
              num_workers       yes              19
              num_helpers       yes               1
  num_uthreads_per_worker       yes            1024
              max_nesting       yes               2
         comm_buffer_size       yes          262144
    num_buffs_per_channel       yes              64
           num_cmd_blocks       yes             128
           cmd_block_size       yes            4096
         mtasks_per_queue       yes         1048576
        num_mtasks_queues       yes               9
     max_handles_per_node       yes          262144
      handle_check_interv       yes            1024
       mtask_check_interv       yes          100000
        cmdb_check_interv       yes          100000
    node_agg_check_interv       yes         2000000
           dta_chunk_size       yes            1024
dta_prealloc_worker_chunks       yes              32
dta_prealloc_helper_chunks       yes             256
               state_name       yes                
                 state_rw       yes               1
           state_populate       yes               0
                 ssd_path       yes                
                disk_path       yes                
        limit_parallelism       yes               0
           thread_pinning       yes               0
                num_cores       yes              20
           stride_pinning       yes               1
    release_uthread_stack       yes               0
        print_stack_break       yes               0
      print_gmt_mem_usage       yes               0
       print_sched_interv       yes               0
        enable_usr_signal       yes               0

  ENABLE_SINGLE_NODE_ONLY                         0
           ENABLE_ASSERTS                         1
         ENABLE_PROFILING                         0
            ENABLE_TIMING                         0
     ENABLE_GMT_UCONTEXTS                         1
 ENABLE_EXPANDABLE_STACKS                         1
       ENABLE_AGGREGATION                         1
  ENABLE_HELPER_BUFF_COPY                         1
            BUILD_VERSION                     2.0.0
                   CFLAGS                          
            COMPILER_NAME                          
         COMPILER_VERSION                          
----------------------------------------------------------------------------------------------------
WARNING: Called wait on a NULL handle
Unordered map, using 4 localities, shad::count took 0.20866 seconds (min = 3, max = 3072 )
Unordered map, using 4 localities, shad::find_if took 0.151011 seconds, and this unordered map contains an even number
Unordered map, using 4 localities, shad::any_of took 0.182961 seconds, and this unordered map contains at least one number that is divisible by 7
Unordered map, using 4 localities, shad::count_if took 0.161005 seconds, and number divisible by 4: 256
Unordered map, using 4 localities, shad::transform took 7.62998 seconds
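
For readers without cluster access, here is a minimal single-node sketch of the same checks using the standard std:: algorithms on std:: containers as a stand-in. It is not the code in this PR: the examples here call the shad:: counterparts, whose exact signatures are not shown in this conversation. The container contents (2*i in the set, 3*i as map values, for i = 1..1024) are an assumption, chosen because they are consistent with the min/max values and counts reported above. All helper names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Assumed size, matching kArraySize in the array example.
constexpr int kSize = 1024;

// Assumption: the set holds the even numbers 2, 4, ..., 2048.
inline std::unordered_set<int> make_set() {
  std::unordered_set<int> s;
  for (int i = 1; i <= kSize; ++i) s.insert(2 * i);
  return s;
}

// Assumption: the map holds {i, 3 * i}, so values run from 3 to 3072.
inline std::unordered_map<int, int> make_map() {
  std::unordered_map<int, int> m;
  for (int i = 1; i <= kSize; ++i) m.emplace(i, 3 * i);
  return m;
}

// Smallest and largest element of the set (std::minmax_element).
inline std::pair<int, int> set_min_max(const std::unordered_set<int>& s) {
  auto mm = std::minmax_element(s.begin(), s.end());
  return std::make_pair(*mm.first, *mm.second);
}

// Number of set elements divisible by d (std::count_if).
inline std::ptrdiff_t count_divisible(const std::unordered_set<int>& s, int d) {
  return std::count_if(s.begin(), s.end(),
                       [d](int v) { return v % d == 0; });
}

// True if at least one set element is divisible by d (std::any_of).
inline bool any_divisible(const std::unordered_set<int>& s, int d) {
  return std::any_of(s.begin(), s.end(),
                     [d](int v) { return v % d == 0; });
}

// True if the set contains an even number (std::find_if).
inline bool contains_even(const std::unordered_set<int>& s) {
  return std::find_if(s.begin(), s.end(),
                      [](int v) { return v % 2 == 0; }) != s.end();
}

// Number of map entries whose mapped value is divisible by d.
inline std::ptrdiff_t count_values_divisible(
    const std::unordered_map<int, int>& m, int d) {
  return std::count_if(
      m.begin(), m.end(),
      [d](const std::pair<const int, int>& kv) { return kv.second % d == 0; });
}
```

Under these assumed contents, count_divisible(make_set(), 3) gives 341 and count_values_divisible(make_map(), 4) gives 256, the same counts as in the logs above.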

NanmiaoWu changed the base branch from master to ics_2021_tutorial on June 17, 2021 01:33
#include "shad/util/measure.h"

constexpr static size_t kArraySize = 1024;
using array_t = shad::impl::array<int, kArraySize>;
Collaborator

Why use impl::array here?

NanmiaoWu force-pushed the ics_2021_tutorial branch 2 times, most recently from 5b50f39 to eaa6450, on June 18, 2021 03:53

2 participants