- Freeing Memory Allocated by the HDF5 Library
+ @ref freeing_memory
|
Describes how inconsistent memory management can cause heap corruption or resource leaks, and presents possible solutions.
diff --git a/doxygen/dox/UsingIdentifiers.dox b/doxygen/dox/UsingIdentifiers.dox
new file mode 100644
index 00000000000..7fe1923990c
--- /dev/null
+++ b/doxygen/dox/UsingIdentifiers.dox
@@ -0,0 +1,97 @@
+/** \page UsingIdentifiers Using Identifiers
+ *
+ * \section sec_using_identifiers Using Identifiers
+ *
+ * This topic describes how identifiers behave and how application programs should treat them.
+ *
+ * When an application program uses the HDF5 library to create or open an item, a unique identifier is
+ * returned. The items that return a unique identifier when they are created or opened include the following:
+ * \li dataset
+ * \li group
+ * \li datatype
+ * \li dataspace
+ * \li file
+ * \li attribute
+ * \li property list
+ * \li referenced object
+ * \li error stack
+ * \li error message
+ *
+ * An application may open one of the items listed above more than once at the same time. For example, an
+ * application might open a group twice, receiving two identifiers. Information from one dataset in the
+ * group could be handled through one identifier, and the information from another dataset in the group
+ * could be handled by a different identifier.
+ *
+ * An application program should track every identifier it receives as a result of creating or opening one of
+ * the items listed above. In order for an application to close properly, it must release every identifier it
+ * has opened. If an application opened a group twice, for example, it would need to issue two #H5Gclose
+ * calls, one for each identifier. Not releasing identifiers causes resource leaks. Until an identifier
+ * is released, the item associated with the identifier is still open.
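+ *
+ * A minimal sketch (the file and group names are hypothetical) of a group being opened twice,
+ * with each resulting identifier released separately:
+ * \code
+ * hid_t file_id   = H5Fopen("example.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
+ * hid_t group_id1 = H5Gopen(file_id, "/group1", H5P_DEFAULT); // first identifier
+ * hid_t group_id2 = H5Gopen(file_id, "/group1", H5P_DEFAULT); // second identifier
+ *
+ * // ... access datasets in the group through either identifier ...
+ *
+ * H5Gclose(group_id1); // one close call per identifier
+ * H5Gclose(group_id2);
+ * H5Fclose(file_id);
+ * \endcode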
+ *
+ * The library considers a file open until all of the identifiers associated with the file and with the file’s
+ * various items have been released. The identifiers associated with these open items must be released separately.
+ * This means that an application can close a file and still work with one or more portions of the file. Suppose
+ * an application opened a file, a group within the file, and two datasets within the group. If the application
+ * closed the file with #H5Fclose, then the file would be considered closed to the application, but the group
+ * and two datasets would still be open.
+ *
+ * There are several exceptions to the above file closing rule. One is when the #H5close function is used
+ * instead of #H5Fclose. #H5close causes a general shutdown of the library: all data is written to disk,
+ * all identifiers are closed, and all memory used by the library is cleaned up. Another exception occurs on
+ * parallel processing systems. Suppose on a parallel system an application has opened a file, a group in the
+ * file, and two datasets in the group. If the application uses the #H5Fclose function to close the file, the
+ * call will fail with an error. The open group and datasets must be closed before the file can be closed.
+ * A third exception is when the file access property list includes the property #H5F_CLOSE_STRONG. This
+ * property causes the closing of all of the file’s open items when the file is closed with #H5Fclose. For
+ * more information about #H5close, #H5Fclose, and #H5Pset_fclose_degree, see the \ref RM.
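+ *
+ * For example, a sketch (with a hypothetical file name) of requesting the #H5F_CLOSE_STRONG
+ * behavior through a file access property list:
+ * \code
+ * hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
+ * H5Pset_fclose_degree(fapl_id, H5F_CLOSE_STRONG);
+ * hid_t file_id = H5Fopen("example.h5", H5F_ACC_RDWR, fapl_id);
+ *
+ * // ... open and work with groups, datasets, etc. ...
+ *
+ * H5Fclose(file_id); // with H5F_CLOSE_STRONG, also closes the file's open items
+ * H5Pclose(fapl_id);
+ * \endcode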
+ *
+ * The reference manual entries for functions that return identifiers describe what might be returned as
+ * follows:
+ * \b Returns:
+ * Returns an identifier if successful; otherwise returns a negative value.
+ *
+ * In other words, a successful operation will return an identifier that is always a positive value,
+ * never 0 (zero) and never negative.
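+ *
+ * A sketch of checking a returned identifier:
+ * \code
+ * hid_t space_id = H5Screate(H5S_SCALAR);
+ * if (space_id < 0) {
+ *     // creation failed; space_id is not a valid identifier
+ * }
+ * \endcode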
+ *
+ * \subsection subsec_using_identifiers_func Functions that Return Identifiers
+ *
+ * Some of the functions that return identifiers are listed below.
+ *
+ * \li #H5Acreate
+ * \li #H5Acreate_by_name
+ * \li #H5Aget_type
+ * \li #H5Aopen
+ * \li #H5Aopen_by_idx
+ * \li #H5Aopen_by_name
+ * \li #H5Dcreate
+ * \li #H5Dcreate_anon
+ * \li #H5Dget_access_plist
+ * \li #H5Dget_create_plist
+ * \li #H5Dget_space
+ * \li #H5Dget_type
+ * \li #H5Dopen
+ * \li #H5Ecreate_msg
+ * \li #H5Ecreate_stack
+ * \li #H5Fcreate
+ * \li #H5Fopen
+ * \li #H5Freopen
+ * \li #H5Gcreate
+ * \li #H5Gcreate_anon
+ * \li #H5Gopen
+ * \li #H5Oopen
+ * \li #H5Oopen_by_addr
+ * \li #H5Oopen_by_idx
+ * \li #H5Pcreate
+ * \li #H5Pget_virtual_srcspace
+ * \li #H5Pget_virtual_vspace
+ * \li #H5Rdereference
+ * \li #H5Rget_region
+ * \li #H5Screate
+ * \li #H5Screate_simple
+ * \li #H5Tcopy
+ * \li #H5Tcreate
+ * \li #H5Tdecode
+ * \li #H5Tget_member_type
+ * \li #H5Tget_super
+ * \li #H5Topen
+*/
diff --git a/doxygen/dox/branches-explained.dox b/doxygen/dox/branches-explained.dox
new file mode 100644
index 00000000000..46f9c00b70b
--- /dev/null
+++ b/doxygen/dox/branches-explained.dox
@@ -0,0 +1,63 @@
+/** \page BRANCHEXPL HDF5 Git Branching Model Explained
+
+This document describes current HDF5 branches.
+
+Branches are tested nightly and testing results are available at
+https://my.cdash.org/index.php?project=HDF5.
+Commits that break daily testing should be fixed by 3:00 pm Central time or reverted.
+We encourage code contributors to check the status of their commits. If you have any questions,
+please contact help@hdfgroup.org.
+
+\section sec_branchexpl_develop develop branch
+Develop is the main branch whose source code always reflects a state with the latest delivered
+development changes for the next major release of HDF5.
+This is also considered the integration branch, as \b all new features are integrated into this
+branch from respective feature branches. Although
+develop is considered an integration branch, it is not an unstable branch. All code merged to
+develop is expected to pass all GitHub actions and daily tests.
+
+\section sec_branchexpl_maintenace Maintenance branches
+Each currently supported release line of HDF5 (e.g. 1.8.x, 1.10.x, 1.12.x, 1.14.x) has an associated
+branch with a corresponding name: hdf5_1_8, hdf5_1_10, and so on.
+Maintenance branches are similar to the develop branch, except the source code in a maintenance
+branch always reflects a state
+with the latest delivered development changes for the next \b maintenance release of that particular
+supported release-line of HDF5.
+\b Some new features will be integrated into a release maintenance branch, depending on whether or
+not those features can be
+introduced in minor releases. Maintenance branches are removed when a release-line is retired from
+support.
+
+\section sec_branchexpl_release Release branches
+Release branches are used to prepare a new production release. They are primarily used to allow for
+last minute dotting of i's and crossing of t's
+(things like setting the release version, finalizing release notes, and generating Autotools files)
+and do not include new development.
+They are created from the maintenance branch at the time of the maintenance release and have
+names like hdf5_1_10_N, where N is the minor release number. Once the release is done, it is
+tagged with a slightly different format: hdf5-1_10_N.
+Release branches are deleted after the tag has been created. If we have to create a patch version
+of a release (which is rare), we create a branch off of the tag.
+
+\section sec_branchexpl_feature feature/\*
+Feature branches are temporary branches used to develop new features in HDF5.
+Feature branches branch off of develop and exist as long as the feature is under development.
+When the feature is complete, the branch is merged back into develop, as well as into any support
+branches in which the change will be included, and then the feature branch is removed.
+
+Ideally, all feature branches should contain a BRANCH.md file in the root directory that explains
+the purpose of the branch, contact information for the person responsible, and, if possible, some
+clues about the branch's life cycle (so we have an idea about when it can be deleted, merged, or
+declared inactive).
+
+Minor bug fixes and refactoring work usually take place on personal forks, not feature branches.
+
+\section sec_branchexpl_inactive inactive/\*
+These branches are for experimental features that were developed in the past, have not been merged
+to develop, and are not under active development. The exception to this is that some feature branches
+are labeled inactive and preserved for a short time after merging to develop. Inactive branches
+are usually not kept in sync with the develop branch.
+
+As with feature branches, inactive branches should have a BRANCH.md file as described above.
+
+*/
diff --git a/doc/cmake-vols-fetchcontent.md b/doxygen/dox/cmake-vols-fetchcontent.dox
similarity index 59%
rename from doc/cmake-vols-fetchcontent.md
rename to doxygen/dox/cmake-vols-fetchcontent.dox
index f7b395dec7b..2d1d9420469 100644
--- a/doc/cmake-vols-fetchcontent.md
+++ b/doxygen/dox/cmake-vols-fetchcontent.dox
@@ -1,70 +1,64 @@
-# Building and testing HDF5 VOL connectors with CMake FetchContent
+/** \page CMakeVols Building and testing HDF5 VOL connectors with CMake FetchContent
+\section sec_cmakevols_intro Introduction
This document details the process of using CMake options to build and test
an HDF5 VOL connector alongside the HDF5 library when building HDF5 from
source. There are several benefits that this may provide, but among them
are the following:
-
- * A VOL connector built this way can be tested at the same time that
+\li A VOL connector built this way can be tested at the same time that
HDF5 is, which eliminates the need to have a multi-step build process
where one builds HDF5, uses it to build the VOL connector and then
uses the external [HDF5 VOL tests](https://github.com/hdfGroup/vol-tests)
repository to test their connector.
- * Building VOL connectors in this manner will usually install the built
+\li Building VOL connectors in this manner will usually install the built
connector library alongside the HDF5 library, allowing future opportunities
- for HDF5 to set a default plugin path such that the HDF5_PLUGIN_PATH
+ for HDF5 to set a default plugin path such that the #HDF5_PLUGIN_PATH
environment variable doesn't need to be set.
-## Building
-
+\section sec_cmakevols_build Building
To enable building of an HDF5 VOL connector using HDF5's CMake functionality,
a CMake variable must first be set:
-
- HDF5_VOL_ALLOW_EXTERNAL (Default: "NO")
+\li HDF5_VOL_ALLOW_EXTERNAL (Default: "NO")
This variable is a string that specifies the manner in which the source code for
an external VOL connector will be retrieved. This variable must be set
- to "GIT" for building external VOL connectors from a Github repository, or
- set to "LOCAL_DIR" to build from a local source directory.
-
+ to GIT for building external VOL connectors from a GitHub repository, or
+ set to LOCAL_DIR to build from a local source directory.
-### Building
-
-If the `HDF5_VOL_ALLOW_EXTERNAL` option is set to "GIT", the CMake cache will be populated with a predefined
-(currently 10) amount of new variables, named:
-
- HDF5_VOL_URL01
- HDF5_VOL_URL02
- HDF5_VOL_URL03
- ...
+\subsection subsec_cmakevols_build_git Building From GIT
+If the HDF5_VOL_ALLOW_EXTERNAL option is set to GIT, the CMake cache
+will be populated with a predefined (currently 10) number of new variables, named:
+\li HDF5_VOL_URL01
+\li HDF5_VOL_URL02
+\li HDF5_VOL_URL03
+\li ...
For each of these variables, a URL that points to an HDF5 VOL connector Git
repository can be specified. These URLs should currently be HTTPS URLs. For
example, to specify the HDF5 Asynchronous I/O VOL Connector developed by the
-ECP team, one can provide the following option to `cmake`:
-
- -DHDF5_VOL_URL01=https://github.com/hpc-io/vol-async.git
+ECP team, one can provide the following option to CMake:
+\li -DHDF5_VOL_URL01=https://github.com/hpc-io/vol-async.git
For each URL specified, HDF5's CMake code will attempt to use CMake's
[FetchContent](https://cmake.org/cmake/help/latest/module/FetchContent.html)
functionality to retrieve the source code for a VOL connector pointed to by
that URL and will try to build that VOL connector as part of the HDF5 library
-build process.
+build process.
-If `HDF5_VOL_ALLOW_EXTERNAL` is instead set to "LOCAL_DIR", then the CMake cache
-will instead be populated with the variables:
+\subsection subsec_cmakevols_build_local Building From Local Folder
+If HDF5_VOL_ALLOW_EXTERNAL is instead set to LOCAL_DIR,
+then the CMake cache will instead be populated with the variables:
- HDF5_VOL_PATH01
- HDF5_VOL_PATH02
- HDF5_VOL_PATH03
- ...
+\li HDF5_VOL_PATH01
+\li HDF5_VOL_PATH02
+\li HDF5_VOL_PATH03
+\li ...
-For each of these variables, an absolute path that points to a local
+For each of these variables, an absolute path that points to a local
directory containing source code for an HDF5 VOL connector
-can be specified. For example, to specify a local clone of the
-REST VOL connector stored under one's home directory, one can provide
-the following option to `cmake`:
-
- -DHDF5_VOL_PATH01=/home/vol-rest
+can be specified. For example, to specify a local clone of the
+REST VOL connector stored under one's home directory, one can provide
+the following option to CMake:
+\li -DHDF5_VOL_PATH01=/home/vol-rest
Regardless of the method used to obtain the VOL source code,
the VOL connector must be able to be built by CMake and currently
@@ -81,23 +75,21 @@ If the source was retrieved from a URL, then the name is generated
by stripping off the last part of the Git repository URL given for the connector,
removing the ".git" suffix and any whitespace and then upper-casing the result.
For example, the name of the VOL connector located at the URL
-https://github.com/hpc-io/vol-async.git would become "VOL-ASYNC". If the source was
-retrieved from a local directory, then the source directory's name is trimmed of whitespace,
-upper-cased, and has any trailing slashes removed.
+https://github.com/hpc-io/vol-async.git would become VOL-ASYNC.
+If the source was retrieved from a local directory, then the source directory's name is
+trimmed of whitespace, upper-cased, and has any trailing slashes removed.
After the VOL's internal name is generated, the following new variables get created:
-
- HDF5_VOL__NAME (Default: "")
+\li HDF5_VOL_\<name\>_NAME (Default: "")
This variable specifies the string that should be used when setting the
- HDF5_VOL_CONNECTOR environment variable for testing the VOL connector
- with the CMake-internal name ''. The value for this variable
+ #HDF5_VOL_CONNECTOR environment variable for testing the VOL connector
+ with the CMake-internal name \<name\>. The value for this variable
can be determined according to the canonical name given to the connector
by the connector's author(s), as well as any extra info that needs to be
passed to the connector for its configuration (see example below). This
variable must be set in order for the VOL connector to be testable with
HDF5's tests.
-
- HDF5_VOL__CMAKE_PACKAGE_NAME (Default: ">")
+\li HDF5_VOL_\<name\>_CMAKE_PACKAGE_NAME (Default: "\<lowercased name\>")
This variable specifies the exact name that would be passed to CMake
find_package(...) calls for the VOL connector in question. It is used as
the dependency name when making CMake FetchContent calls to try to ensure
@@ -105,43 +97,40 @@ After the VOL's internal name is generated, the following new variables get crea
can make find_package(...) calls for this VOL connector at configure time.
By default, this variable is set to a lowercased version of the internal
name generated for the VOL connector (described above).
-
- HDF5_VOL__TEST_PARALLEL (Default: OFF)
+\li HDF5_VOL_\<name\>_TEST_PARALLEL (Default: OFF)
This variable determines whether the VOL connector with the CMake-internal
- name '' should be tested against HDF5's parallel tests.
+ name \<name\> should be tested against HDF5's parallel tests.
If the source was retrieved from a Git URL, then the following variable will additionally be created:
-
- HDF5_VOL__BRANCH (Default: "main")
+\li HDF5_VOL_\<name\>_BRANCH (Default: "main")
This variable specifies the git branch name or tag to use when fetching
the source code for the VOL connector with the CMake-internal name
- ''.
+ \<name\>.
As an example, this would create the following variables for the
previously-mentioned VOL connector if it is retrieved from a URL:
+\li HDF5_VOL_VOL-ASYNC_NAME ""
+\li HDF5_VOL_VOL-ASYNC_CMAKE_PACKAGE_NAME "vol-async"
+\li HDF5_VOL_VOL-ASYNC_BRANCH "main"
+\li HDF5_VOL_VOL-ASYNC_TEST_PARALLEL OFF
- HDF5_VOL_VOL-ASYNC_NAME ""
- HDF5_VOL_VOL-ASYNC_CMAKE_PACKAGE_NAME "vol-async"
- HDF5_VOL_VOL-ASYNC_BRANCH "main"
- HDF5_VOL_VOL-ASYNC_TEST_PARALLEL OFF
-
-**NOTE**
+\b NOTE:
If a VOL connector requires extra information to be passed in its
-HDF5_VOL__NAME variable and that information contains any semicolons,
+HDF5_VOL_\<name\>_NAME variable and that information contains any semicolons,
those semicolons should be escaped with a single backslash so that CMake
-doesn't parse the string as a list. If `cmake` is run from a shell, extra care
+doesn't parse the string as a list. If CMake is run from a shell, extra care
may need to be taken when escaping the semicolons depending on how the
shell interprets backslashes.
-### Example - Build and test HDF5 Asynchronous I/O VOL connector from GIT
-
+\subsection subsec_cmakevols_build_ex Example - Build and test HDF5 Asynchronous I/O VOL connector from GIT
Assuming that the HDF5 source code has been checked out and a build directory
-has been created, running the following cmake command from that build directory
+has been created, running the following CMake command from that build directory
will retrieve, build and test the HDF5 Asynchronous I/O VOL connector while
-building HDF5. Note that `[hdf5 options]` represents other build options that
-would typically be passed when building HDF5, such as `CMAKE_INSTALL_PREFIX`,
-`HDF5_BUILD_CPP_LIB`, etc.
+building HDF5. Note that [hdf5 options] represents other build options that
+would typically be passed when building HDF5, such as CMAKE_INSTALL_PREFIX,
+HDF5_BUILD_CPP_LIB, etc.
+\code
cmake [hdf5 options]
-DHDF5_ENABLE_THREADSAFE=ON
-DHDF5_ENABLE_PARALLEL=ON
@@ -153,78 +142,77 @@ would typically be passed when building HDF5, such as `CMAKE_INSTALL_PREFIX`,
-DHDF5_VOL_VOL-ASYNC_NAME="async under_vol=0\;under_info={}"
-DHDF5_VOL_VOL-ASYNC_TEST_PARALLEL=ON
..
+\endcode
Here, we are specifying that:
-
- * HDF5 should be built with thread-safety enabled (required by Async VOL connector)
- * HDF5 should be built with parallel enabled (required by Async VOL connector)
- * Allow unsupported HDF5 combinations (thread-safety and HL, which is on by default)
- * Enable the API tests so that they can be tested with the Async VOL connector
- * Build and use the HDF5 Asynchronous I/O VOL connector, located at
+\li HDF5 should be built with thread-safety enabled (required by Async VOL connector)
+\li HDF5 should be built with parallel enabled (required by Async VOL connector)
+\li Allow unsupported HDF5 combinations (thread-safety and HL, which is on by default)
+\li Enable the API tests so that they can be tested with the Async VOL connector
+\li Build and use the HDF5 Asynchronous I/O VOL connector, located at
https://github.com/hpc-io/vol-async.git
- * Clone the Asynchronous I/O VOL connector from the repository's 'develop' branch
- * When testing the Asynchronous I/O VOL connector, the `HDF5_VOL_CONNECTOR` environment
- variable should be set to "async under_vol=0\;under_info={}", which
- specifies that the VOL connector with the canonical name "async" should
- be loaded and it should be passed the string "under_vol=0;under_info={}"
+\li Clone the Asynchronous I/O VOL connector from the repository's develop branch
+\li When testing the Asynchronous I/O VOL connector, the #HDF5_VOL_CONNECTOR environment
+ variable should be set to "async under_vol=0\;under_info={}", which
+ specifies that the VOL connector with the canonical name async should
+ be loaded and it should be passed the string "under_vol=0;under_info={}"
for its configuration (note the backslash-escaping of semicolons in the string
provided)
- * The Asynchronous I/O VOL connector should be tested against HDF5's parallel API tests
+\li The Asynchronous I/O VOL connector should be tested against HDF5's parallel API tests
-Note that this also assumes that the Asynchronous I/O VOL connector's
+Note that this also assumes that the Asynchronous I/O VOL connector's
[other dependencies](https://hdf5-vol-async.readthedocs.io/en/latest/gettingstarted.html#preparation)
are installed on the system in a way that CMake can find them. If that is not
the case, the locations for these dependencies may need to be provided to CMake
by passing extra options, such as:
-
+\code
-DABT_INCLUDE_DIR=/path/to/argobots/build/include
 -DABT_LIBRARY=/path/to/argobots/build/lib/libabt.so
-
+\endcode
which would help CMake find an argobots installation in a non-standard location.
-## Testing
-
+\section sec_cmakevols_test Testing
To facilitate testing of HDF5 VOL connectors when building HDF5, tests from
the [HDF5 VOL tests](https://github.com/hdfGroup/vol-tests) repository were
integrated back into the library and the following new CMake options were
added to HDF5 builds for the 1.14.1 release:
-
- HDF5_TEST_API (Default: OFF)
+\li HDF5_TEST_API (Default: OFF)
This variable determines whether the HDF5 API tests will be built and tested.
-
- HDF5_TEST_API_INSTALL (Default: OFF)
+\li HDF5_TEST_API_INSTALL (Default: OFF)
This variable determines whether the HDF5 API test executables will be installed
on the system alongside the HDF5 library.
-
- HDF5_TEST_API_ENABLE_ASYNC (Default: OFF)
+\li HDF5_TEST_API_ENABLE_ASYNC (Default: OFF)
This variable determines whether the HDF5 Asynchronous I/O API tests will be
built and tested. These tests will only run if a VOL connector reports that
- it supports asynchronous I/O operations when queried via the H5Pget_vol_cap_flags
+ it supports asynchronous I/O operations when queried via the #H5Pget_vol_cap_flags
API routine.
-
- HDF5_TEST_API_ENABLE_DRIVER (Default: OFF)
+\li HDF5_TEST_API_ENABLE_DRIVER (Default: OFF)
This variable determines whether the HDF5 API test driver program will be
built and used for testing. This driver program is useful when a VOL connector
- uses a client/server model where the server program needs to be up and running
+ uses a client/server model where the server program needs to be up and running
before the VOL connector can function. This option is currently not functional.
-When the `HDF5_TEST_API` option is set to ON, HDF5's CMake code builds and tests
+When the HDF5_TEST_API option is set to ON, HDF5's CMake code builds and tests
the new API tests using the native VOL connector. When one or more external VOL
connectors are built successfully with the process described in this document,
the CMake code will duplicate some of these API tests by adding separate
versions of the tests (for each VOL connector that was built) that set the
-`HDF5_VOL_CONNECTOR` environment variable to the value specified for the
-HDF5_VOL__NAME variable for each external VOL connector at build time.
-Running the `ctest` command will then run these new tests which load and run with
-each VOL connector that was built in turn. When run via the `ctest` command, the
+#HDF5_VOL_CONNECTOR environment variable to the value specified for the
+HDF5_VOL_\<name\>_NAME variable for each external VOL connector at build time.
+Running the ctest command will then run these new tests which load and run with
+each VOL connector that was built in turn. When run via the ctest command, the
new tests typically follow the naming scheme:
-
+\code
HDF5_VOL_<name>-h5_api_test_<test name>
HDF5_VOL_<name>-h5_api_test_parallel_<test name>
+\endcode
-**NOTE**
+\section sec_cmakevols_note NOTE
If dependencies of a built VOL connector are installed on the system in
-a non-standard location that would typically require one to set `LD_LIBRARY_PATH`
+a non-standard location that would typically require one to set LD_LIBRARY_PATH
or similar, one should ensure that those environment variables are set before
running tests. Otherwise, the tests that run with that connector will likely
fail due to being unable to load the necessary libraries for its dependencies.
+
+*/
+
diff --git a/doxygen/dox/code-conventions.dox b/doxygen/dox/code-conventions.dox
new file mode 100644
index 00000000000..54aa5155160
--- /dev/null
+++ b/doxygen/dox/code-conventions.dox
@@ -0,0 +1,58 @@
+/** \page CODECONV HDF5 Library Code Conventions
+
+This document describes some practices that are new, or newly
+documented, starting in 2020.
+
+\section sec_codeconv_func Function / Variable Attributes
+
+In H5private.h, the library provides platform-independent macros
+for qualifying function and variable definitions.
+
+\subsection subsec_codeconv_func_1 Functions that accept printf(3) and scanf(3) format strings
+
+Label functions that accept a printf(3)-compliant format string with
+H5_ATTR_FORMAT(printf,format_argno,variadic_argno), where
+the format string is the format_argno-th argument (counting from 1)
+and the variadic arguments start with the variadic_argno-th.
+
+Functions that accept a scanf(3)-compliant format string should
+be labeled H5_ATTR_FORMAT(scanf,format_argno,variadic_argno).
+
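+For example, a hypothetical logging helper whose format string is the second
+argument and whose variadic arguments start with the third might be declared
+as follows (a sketch, not an actual library function):
+\code
+H5_ATTR_FORMAT(printf, 2, 3)
+static herr_t
+H5_example_log(int level, const char *fmt, ...);
+\endcode
+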
+\subsection subsec_codeconv_func_2 Functions that never return
+
+The definition of a function that always causes the program to abort or hang
+should be labeled H5_ATTR_NORETURN to help the compiler see which flows of
+control are infeasible.
+
+\subsection subsec_codeconv_func_other Other attributes
+
+**TBD**
+
+\subsection subsec_codeconv_func_unused Unused variables and parameters
+
+Compilers will warn about unused parameters and variables—developers should pay
+attention to those warnings and make an effort to prevent them.
+
+Some function parameters and variables are unused in \b all configurations of
+the project. Ordinarily, such parameters and variables should be deleted.
+However, sometimes it is possible to foresee a parameter being used, or
+removing it would change an API, or a parameter has to be defined to conform a
+function to some function pointer type. In those cases, it's permissible to
+mark a symbol H5_ATTR_UNUSED.
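+
+For instance, a callback whose user-data parameter is not used by a
+particular implementation could be written like this (hypothetical function):
+\code
+static herr_t
+H5_example_cb(hid_t obj_id, void H5_ATTR_UNUSED *udata)
+{
+    return H5Iis_valid(obj_id) ? 0 : -1; // udata intentionally unused
+}
+\endcode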
+
+Other parameters and variables are unused in \b some configurations of the
+project, but not all. A symbol may fall into disuse in some configuration in
+the future, at which point the compiler should warn and the symbol should no
+longer be defined. Developers should therefore try to label a sometimes-unused
+symbol with an attribute that's specific to the configurations where the
+symbol is (or is not) expected to be used. The library provides the following
+attributes for that purpose:
+\li H5_ATTR_DEPRECATED_USED : used only if deprecated symbols \b are enabled
+\li H5_ATTR_NDEBUG_UNUSED : used only if NDEBUG is \b not \#defined
+\li H5_ATTR_DEBUG_API_USED : used if the debug API \b is enabled
+\li H5_ATTR_PARALLEL_UNUSED : used only if Parallel HDF5 is \b not configured
+\li H5_ATTR_PARALLEL_USED : used only if Parallel HDF5 \b is configured
+
+Some attributes may be phased in or phased out in the future.
+
+*/
diff --git a/doc/file-locking.md b/doxygen/dox/file-locking.dox
similarity index 57%
rename from doc/file-locking.md
rename to doxygen/dox/file-locking.dox
index 067f7ab3993..aacf2135b04 100644
--- a/doc/file-locking.md
+++ b/doxygen/dox/file-locking.dox
@@ -1,5 +1,6 @@
-# File Locking in HDF5
+/** \page FileLock File Locking in HDF5
+\section sec_filelock_intro Introduction
This document describes the file locking scheme that was added to HDF5 in
version 1.10.0 and how you can work around it, if you choose to do so. I'll
try to keep it understandable for everyone, though diving into technical
@@ -7,8 +8,7 @@ details is unavoidable, given the complexity of the material. We're in the
process of converting the HDF5 user guide (UG) to Doxygen and this document
will eventually be rolled up into those files as we update things.
-**Parallel HDF5 Note**
-
+<b>Parallel HDF5 Note</b>
Everything written here is from the perspective of serial HDF5. When we say
that you can't access a file for write access from more than one process, we
mean "from more than one independent, serial process". Parallel HDF5 can
@@ -16,18 +16,15 @@ obviously write to a file from more than one process, but that involves
IPC and multiple processes working together, not independent processes with
no knowledge of each other, which is what the file locks are for.
-
-## Why file locks?
-
+\section sec_filelock_why Why file locks?
The short answer is: "To prevent you from corrupting your HDF5 files and/or
crashing your reader processes."
The long answer is more complicated.
An HDF5 file's state exists in two places when it is open for writing:
-
-1. The HDF5 file itself
-2. The HDF5 library's various caches
+\li The HDF5 file itself
+\li The HDF5 library's various caches
One of those caches is the metadata cache, which stores things like B-tree
nodes that we use to locate data in the file. Problems arise when parent
@@ -55,7 +52,7 @@ it wrong could result in corrupt files or crashed readers, we decided to add
a file locking scheme to help users get it right. Since this would also help
prevent harmful accesses when SWMR is not in use, we decided to switch the
file locking scheme on by default. This scheme has been carried forward into
-HDF5 1.12 and 1.13 (soon to be 1.14).
+HDF5 1.12 and 1.14 (soon to be 2.0).
Note that the current implementation of SWMR is only useful for appending to chunked
datasets. Creating file objects like groups and datasets is not supported
@@ -67,8 +64,7 @@ on parallel file systems, especially when file locks have been disabled, which
often causes lock calls to fail. As a result of this, we've added work-arounds
to disable the file locking scheme over the years.
-## The existing scheme
-
+\section sec_filelock_scheme The existing scheme
There are two parts to the file locking scheme. One is the file lock itself.
The second is a mark we make in the HDF5 file's superblock. The superblock
mark isn't really that important for understanding the file locking, but since
@@ -82,11 +78,10 @@ SWMR and prevent dangerous file access.
Here's how it all works:
1. The first thing we do is check if we're using file locks
-
- - We first check the file locking property in the file access property list
+ \li We first check the file locking property in the file access property list
(fapl). The default value of this property is set at configure time when
the library is built.
- - Next we check the value of the `HDF5_USE_FILE_LOCKING` environment variable,
+ \li Next we check the value of the `HDF5_USE_FILE_LOCKING` environment variable,
which was previously parsed at library startup. If this is set,
we use the value to override the property list setting.
@@ -97,16 +92,14 @@ Here's how it all works:
take place.
2. We also check for ignoring file locks when they are disabled on the file system.
-
- - The environment variable setting for this is checked at VFD initialization
+ \li The environment variable setting for this is checked at VFD initialization
time for all library VFDs.
- - We check the value in the fapl in the `open` callback. The default value for
+ \li We check the value in the fapl in the `open` callback. The default value for
this property was set at configure time when the library was built.
3. When we open a file, we lock it based on the file access flags:
-
- - If the `H5F_ACC_RDWR` flag is set, use an exclusive lock
- - Otherwise use a shared lock
+ \li If the `H5F_ACC_RDWR` flag is set, use an exclusive lock
+ \li Otherwise use a shared lock
If we are ignoring disabled file locks (see below), we will silently swallow
lock API call failure when locks are not implemented on the file system.
@@ -115,26 +108,23 @@ Here's how it all works:
file consistency flags in the file's superblock to indicate this.
**NOTE!**
-
- - The VFD has to have a lock callback for this to happen. It doesn't matter if
+ \li The VFD has to have a lock callback for this to happen. It doesn't matter if
the locking was disabled - the check is simply for the callback.
- - We mark the superblock in **ANY** write case - both SWMR and non-SWMR.
- - Only the latest version of the superblock is marked in this way. If you
+ \li We mark the superblock in **ANY** write case - both SWMR and non-SWMR.
+ \li Only the latest version of the superblock is marked in this way. If you
open up a file that wasn't created with the 1.10.0 or later file format,
it won't get the superblock mark, even if it's been opened for writing.
According to the file format document and H5Fpkg.h:
-
- - Bit 0 is set if the file is open for writing (`H5F_SUPER_WRITE_ACCESS`)
- - Bit 2 is set if the file is open for SWMR writing (`H5F_SUPER_SWMR_WRITE_ACCESS`)
+ \li Bit 0 is set if the file is open for writing (`H5F_SUPER_WRITE_ACCESS`)
+ \li Bit 2 is set if the file is open for SWMR writing (`H5F_SUPER_SWMR_WRITE_ACCESS`)
We check these superblock flags on file open and error out if they are
unsuitable.
-
- - If the file is already opened for non-SWMR writing, no other process can open
+ \li If the file is already opened for non-SWMR writing, no other process can open
it.
- - If the file is open for SWMR writing, only SWMR readers can open the file.
- - If you try to open a file for reading with `H5F_ACC_SWMR_READ` set and the
+ \li If the file is open for SWMR writing, only SWMR readers can open the file.
+ \li If you try to open a file for reading with `H5F_ACC_SWMR_READ` set and the
file does not have the SWMR writer bits set in the superblock, the open
call will fail.
@@ -148,196 +138,178 @@ Here's how it all works:
handle it when the file descriptors are closed since file locks don't
normally survive closing the underlying file descriptor.
-**TL;DR**
-
When locks are available, HDF5 files will be exclusively locked while they are
in use. The exception to this are files that are opened for SWMR writing, which
are unlocked. Files that are open for any kind of writing get a "writing"
superblock mark that HDF5 1.10.0+ will respect and refuse to open outside of SWMR.
-## `H5Fstart_swmr_write()`
-
-This API call can be used to switch an open file to "SWMR writing" mode as
-if it had been opened with the `H5F_ACC_SWMR_WRITE` flag set. This is used
+\section sec_filelock_smrfunc H5Fstart_swmr_write
+The #H5Fstart_swmr_write API call can be used to switch an open file to "SWMR writing" mode as
+if it had been opened with the #H5F_ACC_SWMR_WRITE flag set. This is used
when code needs to perform SWMR-forbidden operations like creating groups
and datasets before appending data to datasets using SWMR.
Most of the work of this API call involves flushing out the library caches
in preparation for SWMR access, but there are a few locking operations that
take place under the hood:
-
-- The file's superblock is marked as in the SWMR writer case, above.
-- For a brief period of time in the call, we convert the exclusive lock to
- a shared lock. It's unclear why this was done and we'll look into removing
- this.
-- At the end of the call, the lock is removed, as in the SWMR write open
- case described above.
-
-## Disabling the locks
-
+\li The file's superblock is marked as in the SWMR writer case, above.
+\li For a brief period of time in the call, we convert the exclusive lock to
+ a shared lock. It's unclear why this was done and we'll look into removing
+ this.
+\li At the end of the call, the lock is removed, as in the SWMR write open
+ case described above.
+
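+A minimal sketch of the usual calling pattern (file name hypothetical):
+\code
+hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
+H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST); // SWMR needs the 1.10+ file format
+hid_t file_id = H5Fopen("swmr.h5", H5F_ACC_RDWR, fapl_id);
+
+// ... perform SWMR-forbidden operations (create groups, datasets, ...) ...
+
+H5Fstart_swmr_write(file_id); // switch to SWMR-writing mode
+
+// ... append to chunked datasets under SWMR semantics ...
+
+H5Fclose(file_id);
+H5Pclose(fapl_id);
+\endcode
+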
+\section sec_filelock_disable Disabling the locks
There are several ways to disable the locks, depending on which version of the
HDF5 library you are working with. This section will describe the file lock
disable schemes as they existed in late 2022. The library versions current at
-this time are 1.10.9, 1.12.3, and 1.13.2. File locks are not present in HDF5
+that time were 1.10.9, 1.12.3, and 1.13.2. File locks are not present in HDF5
1.8. The lock feature matrix later in this document will describe the
limitations of earlier versions.
-### Configure option
-
+\subsection subsec_filelock_disable_config Configure option
You can set the file locking defaults at configure time. This sets the defaults
for the associated properties in the fapl. Users can override the configure
-defaults using `H5Pset_file_locking()` or the `HDF5_USE_FILE_LOCKING`
+defaults using #H5Pset_file_locking or the HDF5_USE_FILE_LOCKING
environment variable.
-- Autotools
-
- `--enable-file-locking=(yes|no|best-effort)` sets the file locking behavior.
- `on` and `off` should be self-explanatory. `best-effort` turns file locking
- on but ignores file locks when they are disabled (default: `best-effort`).
-
-- CMake
+<b>Autotools</b>
+\li --enable-file-locking=(yes | no | best-effort) sets the file locking behavior.
+ yes and no should be self-explanatory. best-effort turns file locking
+ on but ignores file locks when they are disabled (default: best-effort).
- - set `IGNORE_DISABLED_FILE_LOCK` to `ON` to ignore file locks when they
- are disabled on the file system (default: `ON`).
- - set `HDF5_USE_FILE_LOCKING` to `OFF` to disable file locks (default: `ON`)
+<b>CMake</b>
+\li set IGNORE_DISABLED_FILE_LOCK to ON to ignore file locks when they
+    are disabled on the file system (default: ON).
+\li set HDF5_USE_FILE_LOCKING to OFF to disable file locks (default: ON)
-### `H5Pset_file_locking()`
-
-This API call can be used to override the configure defaults. It takes
-`hbool_t` parameters for both the file locking and "ignore file locks when
+\section sec_filelock_funcset H5Pset_file_locking
+The #H5Pset_file_locking API call can be used to override the configure defaults. It takes
+#hbool_t parameters for both the file locking and "ignore file locks when
disabled on the file system" parameters. The values set here can be
overridden by the file locking environment variable.
-There is a corresponding `H5Pget_file_locking()` call that can be used to check
-the currently set values of both properties in the fapl. **NOTE** that this
-call just checks the property list values. It does **NOT** check the
+There is a corresponding #H5Pget_file_locking call that can be used to check
+the currently set values of both properties in the fapl. NOTE that this
+call just checks the property list values. It does NOT check the
environment variables!
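+
+A sketch of both calls on a file access property list:
+\code
+hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
+
+// use_file_locking = FALSE, ignore_when_disabled = TRUE
+H5Pset_file_locking(fapl_id, 0, 1);
+
+hbool_t use_locks, ignore_disabled;
+H5Pget_file_locking(fapl_id, &use_locks, &ignore_disabled); // fapl values only
+\endcode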
-### Environment variables
-
-The `HDF5_USE_FILE_LOCKING` environment variable overrides all other file
+\section sec_filelock_env Environment variables
+The HDF5_USE_FILE_LOCKING environment variable overrides all other file
locking settings.
HDF5 1.10.0
-- No file locking environment variable
+\li No file locking environment variable
HDF5 1.10.1 - 1.10.6, 1.12.0:
-- `FALSE` turns file locking off
-- Anything else turns file locking on
-- Neither of these values ignores disabled file locks
-- Environment variable parsed at file create/open time
-
-HDF5 1.10.7+, 1.12.1+, 1.13.x:
-- `FALSE` or `0` disables file locking
-- `TRUE` or `1` enables file locking
-- `BEST_EFFORT` enables file locking and ignores disabled file locks
-- Anything else gives you the defaults
-- Environment variable parsed at library startup
-
-### Lock disable scheme interactions
-
+\li FALSE turns file locking off
+\li Anything else turns file locking on
+\li Neither of these values ignores disabled file locks
+\li Environment variable parsed at file create/open time
+
+HDF5 1.10.7+, 1.12.1+, 1.14.x:
+\li FALSE or 0 disables file locking
+\li TRUE or 1 enables file locking
+\li BEST_EFFORT enables file locking and ignores disabled file locks
+\li Anything else gives you the defaults
+\li Environment variable parsed at library startup
+
+\section sec_filelock_lockdisable Lock disable scheme interactions
As mentioned above and reiterated here:
-- Configure-time settings set fapl defaults
-- `H5Pset_file_locking()` overrides configure-time defaults
-- The environment variable setting overrides all
+\li Configure-time settings set fapl defaults
+\li #H5Pset_file_locking overrides configure-time defaults
+\li The environment variable setting overrides all
If you want to check that file locking is on, you'll need to check the fapl
setting AND check the environment variable, which can override the fapl.
-**!!! WARNING !!!**
-
+\subsection subsec_filelock_lockdisable_warn !!! WARNING !!!
Disabling the file locks is at your own risk. If more than one writer process
modifies an HDF5 file at the same time, the file could be corrupted. If a
reader process reads a file that is being modified by a writer, the reader
process might attempt to read garbage and encounter errors or even crash.
In the case of:
-
-- A single process accessing a file with write access
-- Any number of processes accessing a file read-only
+\li A single process accessing a file with write access
+\li Any number of processes accessing a file read-only
You can safely disable the file locking scheme.
If you are trying to set up SWMR without the benefit of the file locks, you'll
just need to be extra careful that you hold to the rules for SWMR access.
-## Feature Matrix
-
+\section sec_filelock_feat Feature Matrix
The following table indicates which versions of the library support which file
lock features. 1.13.0 and 1.13.1 are experimental releases (basically glorified
release candidates) so they are not included here.
-**Locks**
-
-- P = POSIX locks only, Windows was a no-op that always succeeded
-- WP = POSIX and Windows locks
-- (-) = POSIX no-op lock fails
-- (+) = POSIX no-op lock passes
-
-**Configure Option and Environment Variable**
-
-- on/off = sets file locks on/off
-- try = can also set "best effort", where locks are on but ignored if disabled
-
-|Version|Has locks|Configure option|`H5Pset_file_locking()`|`HDF5_USE_FILE_LOCKING`|
-|-------|---------|----------------|-----------------------|-----------------------|
-|1.8.x|No|-|-|-|
-|1.10.0|P(-)|-|-|-|
-|1.10.1|P(-)|-|-|on/off|
-|1.10.2|P(-)|-|-|on/off|
-|1.10.3|P(-)|-|-|on/off|
-|1.10.4|P(-)|-|-|on/off|
-|1.10.5|P(-)|-|-|on/off|
-|1.10.6|P(-)|-|-|on/off|
-|1.10.7|P(+)|try|Y|try|
-|1.10.8|WP(+)|try|Y|try|
-|1.10.9|WP(+)|try|Y|try|
-|1.12.0|P(-)|-|-|on/off|
-|1.12.1|WP(+)|try|Y|try|
-|1.12.2|WP(+)|try|Y|try|
-|1.13.2|WP(+)|try|Y|try|
-
-
-## Appendix: File lock implementation
-
-The file lock system is implemented with `flock(2)` as the archetype since it
+\subsection subsec_filelock_feat_locks Locks
+\li P = POSIX locks only, Windows was a no-op that always succeeded
+\li WP = POSIX and Windows locks
+\li (-) = POSIX no-op lock fails
+\li (+) = POSIX no-op lock passes
+
+\subsection subsec_filelock_feat_var Configure Option and Environment Variable
+\li on/off = sets file locks on/off
+\li try = can also set "best effort", where locks are on but ignored if disabled
+
+
+| Version | Has locks | Configure option | #H5Pset_file_locking | HDF5_USE_FILE_LOCKING |
+|---------|-----------|------------------|----------------------|-----------------------|
+| 1.8.x   | No        | -                | -                    | -                     |
+| 1.10.0  | P(-)      | -                | -                    | -                     |
+| 1.10.1  | P(-)      | -                | -                    | on/off                |
+| 1.10.2  | P(-)      | -                | -                    | on/off                |
+| 1.10.3  | P(-)      | -                | -                    | on/off                |
+| 1.10.4  | P(-)      | -                | -                    | on/off                |
+| 1.10.5  | P(-)      | -                | -                    | on/off                |
+| 1.10.6  | P(-)      | -                | -                    | on/off                |
+| 1.10.7  | P(+)      | try              | Y                    | try                   |
+| 1.10.8  | WP(+)     | try              | Y                    | try                   |
+| 1.10.9  | WP(+)     | try              | Y                    | try                   |
+| 1.12.0  | P(-)      | -                | -                    | on/off                |
+| 1.12.1  | WP(+)     | try              | Y                    | try                   |
+| 1.12.2  | WP(+)     | try              | Y                    | try                   |
+| 1.13.2  | WP(+)     | try              | Y                    | try                   |
+
+
+\section sec_filelock_appd Appendix: File lock implementation
+The file lock system is implemented with flock(2) as the archetype since it
has simple semantics and we don't need range locking. Locks are advisory on many
systems, but this shouldn't be a problem for most users since the HDF5 library
always respects them. If you have a program that parses or modifies HDF5 files
independently of the HDF5 library, you'll want to be mindful of any potential
for concurrent access across processes.
-On Unix systems, we call `flock()` directly when it's available and pass
-`LOCK_SH` (shared lock), `LOCK_EX` (exclusive lock), and `LOCK_UN` (unlock) as
+On Unix systems, we call flock() directly when it's available and pass
+LOCK_SH (shared lock), LOCK_EX (exclusive lock), and LOCK_UN (unlock) as
described in the algorithm section. All locks are non-blocking, so we set the
-`LOCK_NB` flag. Sadly, `flock(2)` is not POSIX and it doesn't lock files over
+LOCK_NB flag. Sadly, flock(2) is not POSIX and it doesn't lock files over
NFS. We didn't consider a lack of NFS support a problem since SWMR isn't
supported on networked file systems like NFS (write order preservation isn't
-guaranteed) and `flock(2)` usually doesn't fail when you attempt to lock NFS
+guaranteed) and flock(2) usually doesn't fail when you attempt to lock NFS
files.
-On Unix systems without `flock(2)`, we implement a scheme based on `fcntl(2)`
-(`Pflock()` in `H5system.c`). On these systems we use `F_SETLK` (non-blocking)
-as the operation and set `l_type` in `struct flock` to be:
-
-- `F_UNLOCK` for `LOCK_UN`
-- `F_WRLOCK` for `LOCK_EX`
-- `F_RDLOCK` for `LOCK_SH`
+On Unix systems without flock(2), we implement a scheme based on fcntl(2)
+(Pflock() in H5system.c). On these systems we use F_SETLK (non-blocking)
+as the operation and set l_type in struct flock to be:
+\li F_UNLCK for LOCK_UN
+\li F_WRLCK for LOCK_EX
+\li F_RDLCK for LOCK_SH
-We set the range to be the entire file. Most Unix-like systems have `flock()`
+We set the range to be the entire file. Most Unix-like systems have flock()
these days, so this system probably isn't very well tested.
-We don't use `fcntl`-based open file locks or mandatory locking anywhere. The
+We don't use fcntl-based open file locks or mandatory locking anywhere. The
former scheme is non-POSIX and the latter is deprecated.
-On Windows, we use `LockFileEx()` and `UnlockFileEx()` to lock the entire file
-(`Wflock()` in `H5system.c`). We set `LOCKFILE_FAIL_IMMEDIATELY` to get
-non-blocking locks and set `LOCKFILE_EXCLUSIVE_LOCK` when we want an exclusive
+On Windows, we use LockFileEx() and UnlockFileEx() to lock the entire file
+(Wflock() in H5system.c). We set LOCKFILE_FAIL_IMMEDIATELY to get
+non-blocking locks and set LOCKFILE_EXCLUSIVE_LOCK when we want an exclusive
lock. SWMR isn't well-tested on Windows, so this scheme hasn't been as
-thoroughly vetted as the `flock`-based scheme.
+thoroughly vetted as the flock-based scheme.
-On non-Windows systems where neither `flock(2)` nor `fcntl(2)` is available,
-we substitute a no-op stub that always succeeds (`Nflock()` in `H5system.c`).
+On non-Windows systems where neither flock(2) nor fcntl(2) is available,
+we substitute a no-op stub that always succeeds (Nflock() in H5system.c).
In the past, the stub always failed (see the matrix for when we made the switch).
We currently know of no non-Windows systems where neither call is available
so this scheme is not well-tested.
@@ -347,15 +319,15 @@ locking, is that all of these schemes have subtly different semantics. We're
using file locking in a fairly crude manner, though, and lock use has always
been optional, so we consider this a lower-order concern.
-Locks are implemented at the VFD level via `lock` and `unlock` callbacks. The
+Locks are implemented at the VFD level via lock and unlock callbacks. The
VFDs that implement file locks are: core (w/ backing store), direct, log, sec2,
-and stdio (`flock(2)` locks only). The family, multi, and splitter VFDs invoke
+and stdio (flock(2) locks only). The family, multi, and splitter VFDs invoke
the lock callback of their underlying sub-files. The onion and MPI-IO VFDs do NOT
use locks, even though they create normal, on-disk native HDF5 files. The
read-only S3 VFD and HDFS VFDs do not use file locking since they use
alternative storage schemes.
-Lock failures are detected by checking to see if `errno` is set to `ENOSYS`.
+Lock failures are detected by checking to see if errno is set to ENOSYS.
This is not particularly sophisticated and was implemented as a way of working
around disabled locks on popular parallel file systems.
@@ -363,4 +335,7 @@ One other thing to note here is that, in all of the locking schemes we use, the
file locks do not survive process termination, so you don't have to worry
about files being locked forever if a process exits abnormally. If a writer
crashed and the library didn't clear the superblock mark, you can remove it with
-the h5clear command-line tool, which is built with the library.
+the \ref sec_cltools_h5clear command-line tool, which is built with the library.
+
+*/
+
diff --git a/doxygen/dox/high_level/extension.dox b/doxygen/dox/high_level/extension.dox
index fc0da48ee83..456692ed868 100644
--- a/doxygen/dox/high_level/extension.dox
+++ b/doxygen/dox/high_level/extension.dox
@@ -7,16 +7,14 @@
* for working with region references, hyperslab selections, and bit-fields.
* These functions were created as part of a project supporting
* NPP/NPOESS Data Production and Exploitation (
- *
- * project,
- * software ).
+ * project,
+ * software).
* While they were written to facilitate access to NPP, NPOESS, and JPSS
* data in the HDF5 format, these functions may be useful to anyone working
* with region references, hyperslab selections, or bit-fields.
*
* Note that these functions are not part of the standard HDF5 distribution;
- * the
- * software
+ * the software
* must be separately downloaded and installed.
*
* A comprehensive guide to this library,
diff --git a/doxygen/dox/library-init-shutdown.dox b/doxygen/dox/library-init-shutdown.dox
new file mode 100644
index 00000000000..0c48ee49d53
--- /dev/null
+++ b/doxygen/dox/library-init-shutdown.dox
@@ -0,0 +1,55 @@
+/** \page InitShut HDF5 Library initialization and shutdown
+
+\section sec_initshut_app Application perspective
+
+\subsection subsec_initshut_app_implicit Implicit initialization and shutdown
+When a developer exports a new symbol as part of the HDF5 library,
+they should make sure that an application cannot enter the library in an
+uninitialized state through a new API function, or read an uninitialized
+value from a non-function HDF5 symbol.
+
+The HDF5 library initializes itself when an application either enters
+the library through an API function call such as #H5Fopen, or when
+an application evaluates an HDF5 symbol that represents either a file
+access flag such as #H5F_ACC_RDONLY or #H5F_ACC_RDWR,
+a property-list class identifier such as #H5P_FILE_ACCESS, a VFD
+identifier such as #H5FD_FAMILY or #H5FD_SEC2, or a type identifier
+such as #H5T_NATIVE_INT64.
+
+The library sets a flag when initialization occurs and, as long as the
+flag is set, skips re-initialization.
+
+The library provides a couple of macros that initialize the library
+as necessary. The library is initialized as a side-effect of the
+FUNC_ENTER_API* macros used at the top of most API functions. HDF5
+library symbols other than functions are provided through \#define macros
+that use #H5OPEN to introduce a library-initialization call (#H5open)
+at each site where a non-function symbol is used.
+
+Ordinarily the library registers an atexit(3) handler to shut itself
+down when the application exits.
+
+\subsection subsec_initshut_app_explicit Explicit initialization and shutdown
+An application may use an API call, #H5open, to explicitly initialize
+the library. #H5close explicitly shuts down the library.
+
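+A minimal sketch:
+\code
+#include "hdf5.h"
+
+int
+main(void)
+{
+    H5open();  // explicit initialization (normally implicit)
+
+    // ... use the library ...
+
+    H5close(); // flush files, close identifiers, release library resources
+    return 0;
+}
+\endcode
+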
+\section sec_initshut_int Library internals perspective
+No matter how library initialization begins, eventually the internal
+function H5_init_library will be called. H5_init_library is
+responsible for calling the initializers for every internal HDF5
+library module (aka "package") in the correct order so that no module is
+initialized before its prerequisite modules. A table in H5_init_library
+establishes the order of initialization. If a developer adds a
+module that should be initialized along with the rest of the library,
+then they should insert its initializer at the right place in the table.
+
+H5_term_library drives library shutdown. Library shutdown is
+table-driven, too. If a developer adds a module that needs to release
+resources during library shutdown, then they should add a call at the
+right place to the shutdown table. Note that some entries in the shutdown
+table are marked as "barriers," and if a new module should only be
+shut down strictly after the preceding modules, then it should be marked
+as a barrier. See the comments in H5_term_library for more information.
+
+*/
diff --git a/doc/parallel-compression.md b/doxygen/dox/parallel-compression.dox
similarity index 76%
rename from doc/parallel-compression.md
rename to doxygen/dox/parallel-compression.dox
index 523aa758fec..962b9907857 100644
--- a/doc/parallel-compression.md
+++ b/doxygen/dox/parallel-compression.dox
@@ -1,7 +1,6 @@
-# HDF5 Parallel Compression
-
-## Introduction
+/** \page ParCompr HDF5 Parallel Compression
+\section sec_parcompr_intro Introduction
When an HDF5 dataset is created, the application can specify
optional data filters to be applied to the dataset (as long as
the dataset uses a chunked data layout). These filters may
@@ -45,28 +44,28 @@ their modifications to the owning MPI rank.
The parallel compression feature is always enabled when HDF5
is built with parallel enabled, but the feature may be disabled
if the necessary MPI-3 routines are not available. Therefore,
-HDF5 conditionally defines the macro `H5_HAVE_PARALLEL_FILTERED_WRITES`
+HDF5 conditionally defines the macro H5_HAVE_PARALLEL_FILTERED_WRITES
which an application can check for to see if the feature is
available.
-## Examples
+\section sec_parcompr_ex Examples
Using the parallel compression feature is very similar to using
compression in serial HDF5, except that dataset writes **must**
be collective:
-```
+\code
hid_t dxpl_id = H5Pcreate(H5P_DATASET_XFER);
H5Pset_dxpl_mpio(dxpl_id, H5FD_MPIO_COLLECTIVE);
H5Dwrite(..., dxpl_id, ...);
-```
+\endcode
The following are two simple examples of using the parallel
compression feature:
-[ph5_filtered_writes.c][u1]
+ph5_filtered_writes.c
-[ph5_filtered_writes_no_sel.c][u2]
+ph5_filtered_writes_no_sel.c
The former contains simple examples of using the parallel
compression feature to write to compressed datasets, while the
@@ -76,10 +75,10 @@ Remember that the feature requires these writes to use collective
I/O, so the MPI ranks which have nothing to contribute must still
participate in the collective write call.
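+
+One way for a rank with nothing to contribute to participate is to make the
+collective call with an empty selection. A sketch in the spirit of
+ph5_filtered_writes_no_sel.c, assuming dset_id, mem_space_id, dxpl_id,
+start, count, buf, and have_data have already been set up:
+\code
+hid_t file_space_id = H5Dget_space(dset_id);
+
+if (have_data) {
+    H5Sselect_hyperslab(file_space_id, H5S_SELECT_SET, start, NULL, count, NULL);
+}
+else {
+    // participate collectively, but contribute nothing
+    H5Sselect_none(file_space_id);
+    H5Sselect_none(mem_space_id);
+}
+
+H5Dwrite(dset_id, H5T_NATIVE_INT, mem_space_id, file_space_id, dxpl_id, buf);
+\endcode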
-## Multi-dataset I/O support
+\section sec_parcompr_multi Multi-dataset I/O support
The parallel compression feature is supported when using the
-multi-dataset I/O API routines ([H5Dwrite_multi][u3]/[H5Dread_multi][u4]), but the
+multi-dataset I/O API routines (#H5Dwrite_multi/#H5Dread_multi), but the
following should be kept in mind:
- Parallel writes to filtered datasets **must** still be collective,
@@ -97,17 +96,17 @@ following should be kept in mind:
datasets if desired, while still performing collective writes to
the filtered datasets.
-## Incremental file space allocation support
+\section sec_parcompr_incr Incremental file space allocation support
-HDF5's [file space allocation time][u5]
+HDF5's file space allocation time, set with #H5Pset_alloc_time,
is a dataset creation property that can have significant effects
on application performance, especially if the application uses
parallel HDF5. In a serial HDF5 application, the default file space
-allocation time for chunked datasets is "incremental". This means
+allocation time for chunked datasets is incremental. This means
that allocation of space in the HDF5 file for data chunks is
deferred until data is first written to those chunks. In parallel
HDF5, the file space allocation time was previously always forced
-to "early", which allocates space in the file for all of a dataset's
+to early, which allocates space in the file for all of a dataset's
data chunks at creation time (or during the first open of a dataset
if it was created serially). This would ensure that all the necessary
file space was allocated so MPI ranks could perform independent I/O
@@ -118,7 +117,7 @@ While this strategy has worked in the past, it has some noticeable
drawbacks. For one, the larger the chunked dataset being created,
the more noticeable overhead there will be during dataset creation
as all of the data chunks are being allocated in the HDF5 file.
-Further, these data chunks will, by default, be [filled][u6]
+Further, these data chunks will, by default, be filled (see #H5Pset_fill_value)
with HDF5's default fill data value, leading to extraordinary
dataset creation overhead and resulting in pre-filling large
portions of a dataset that the application might have been planning
@@ -126,16 +125,16 @@ to overwrite anyway. Even worse, there will be more initial overhead
from compressing that fill data before writing it out, only to have
it read back in, unfiltered and modified the first time a chunk is
written to. In the past, it was typically suggested that parallel
-HDF5 applications should use [H5Pset_fill_time][u7]
-with a value of `H5D_FILL_TIME_NEVER` in order to disable writing of
+HDF5 applications should use #H5Pset_fill_time
+with a value of #H5D_FILL_TIME_NEVER in order to disable writing of
the fill value to dataset chunks, but this isn't ideal if the
application actually wishes to make use of fill values.
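+
+That historical workaround looks roughly like the following (a sketch;
+chunk_dims and the choice of filter are application-specific):
+\code
+hid_t dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
+H5Pset_chunk(dcpl_id, 2, chunk_dims);
+H5Pset_deflate(dcpl_id, 6);
+/* Never write fill values to dataset chunks */
+H5Pset_fill_time(dcpl_id, H5D_FILL_TIME_NEVER);
+\endcode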
-With [improvements made][u8]
-to the parallel compression feature for the HDF5 1.13.1 release,
-"incremental" file space allocation is now the default for datasets
-created in parallel *only if they have filters applied to them*.
-"Early" file space allocation is still supported for these datasets
+With improvements made
+to the parallel compression feature for the HDF5 1.14.0 release,
+incremental file space allocation is now the default for datasets
+created in parallel *only if they have filters applied to them*.
+Early file space allocation is still supported for these datasets
if desired and is still forced for datasets created in parallel that
do *not* have filters applied to them. This change should significantly
reduce the overhead of creating filtered datasets in parallel HDF5
@@ -144,7 +143,7 @@ use a fill value for these datasets. It should also help significantly
reduce the size of the HDF5 file, as file space for the data chunks
is allocated as needed rather than all at once.
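+
+Applications that prefer the old behavior for filtered datasets can
+still request it explicitly on the dataset creation property list
+(a sketch; chunk_dims is a placeholder):
+\code
+hid_t dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
+H5Pset_chunk(dcpl_id, 2, chunk_dims);
+/* Force allocation of all data chunks at dataset creation time */
+H5Pset_alloc_time(dcpl_id, H5D_ALLOC_TIME_EARLY);
+\endcode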
-## Performance Considerations
+\section sec_parcompr_perf Performance Considerations
Since getting good performance out of HDF5's parallel compression
feature involves several factors, the following is a list of
@@ -152,9 +151,9 @@ performance considerations (generally from most to least important)
and best practices to take into account when trying to get the
optimal performance out of the parallel compression feature.
-### Begin with a good chunking strategy
+\subsection subsec_parcompr_perf_begin Begin with a good chunking strategy
-[Starting with a good chunking strategy][u9]
+Starting with a good \ref hdf5_chunking strategy
will generally have the largest impact on overall application
performance. The different chunking parameters can be difficult
to fine-tune, but it is essential to start with a well-performing
@@ -166,11 +165,11 @@ chosen chunk size becomes a very important factor when compression
is involved, as data chunks have to be completely read and
re-written to perform partial writes to the chunk.
-[Improving I/O performance with HDF5 compressed datasets][u10]
+\ref improve_compressed_perf
is a useful reference for more information on getting good
performance when using a chunked dataset layout.
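+
+As an illustrative sketch (the chunk dimensions below are placeholders;
+real values should match the application's write pattern as closely as
+possible):
+\code
+hsize_t chunk_dims[2] = {128, 1024};
+hid_t   dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
+H5Pset_chunk(dcpl_id, 2, chunk_dims);
+\endcode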
-### Avoid chunk sharing
+\subsection subsec_parcompr_perf_avoid Avoid chunk sharing
Since the parallel compression feature has to assign ownership
of data chunks to a single MPI rank in order to avoid the
@@ -185,7 +184,7 @@ application will get the best performance out of parallel compression
if it can avoid writing in a way that causes more than 1 MPI rank
to write to any given data chunk in a dataset.
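+
+For example, for a 1-dimensional dataset whose chunk size matches each
+rank's write block, every rank can write exclusively to its own chunks
+(a sketch; mpi_rank, CHUNK_NELMTS, and file_space_id are placeholders):
+\code
+hsize_t start[1] = {(hsize_t)mpi_rank * CHUNK_NELMTS};
+hsize_t count[1] = {CHUNK_NELMTS};
+H5Sselect_hyperslab(file_space_id, H5S_SELECT_SET, start, NULL, count, NULL);
+\endcode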
-### Collective metadata operations
+\subsection subsec_parcompr_perf_coll Collective metadata operations
The parallel compression feature typically works with a significant
amount of metadata related to the management of the data chunks
@@ -203,7 +202,7 @@ performance and scalability and is generally always recommended
unless application performance shows negative benefits by doing
so.
-```
+\code
...
hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_fapl_mpio(fapl_id, MPI_COMM_WORLD, MPI_INFO_NULL);
@@ -211,23 +210,23 @@ H5Pset_all_coll_metadata_ops(fapl_id, 1);
H5Pset_coll_metadata_write(fapl_id, 1);
hid_t file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);
...
-```
+\endcode
-### Align chunks in the file
+\subsection subsec_parcompr_perf_align Align chunks in the file
The natural layout of an HDF5 file may cause dataset data
chunks to end up at addresses in the file that do not align
well with the underlying file system, possibly leading to
poor performance. As an example, Lustre performance is generally
good when writes are aligned with the chosen stripe size.
-The HDF5 application can use [H5Pset_alignment][u11]
+The HDF5 application can use #H5Pset_alignment
to have a bit more control over where objects in the HDF5
file end up. However, do note that setting the alignment
of objects generally wastes space in the file and has the
potential to dramatically increase its resulting size, so
caution should be used when choosing the alignment parameters.
-[H5Pset_alignment][u11]
+#H5Pset_alignment
has two parameters that control the alignment of objects in
the HDF5 file, the "threshold" value and the alignment
value. The threshold value specifies that any object greater
@@ -246,16 +245,16 @@ the Lustre stripe size), this should cause dataset data
chunks to be well-aligned and generally give good write
performance.
-```
+\code
hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_fapl_mpio(fapl_id, MPI_COMM_WORLD, MPI_INFO_NULL);
/* Assuming Lustre stripe size is 1MiB, align data chunks
in the file to address multiples of 1MiB. */
H5Pset_alignment(fapl_id, dataset_chunk_size, 1048576);
hid_t file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);
-```
+\endcode
-### File free space managers
+\subsection subsec_parcompr_perf_space File free space managers
As data chunks in a dataset get written to and compressed,
they can change in size and be relocated in the HDF5 file.
@@ -264,29 +263,29 @@ in a file, this can create significant amounts of free space
in the file over its lifetime and eventually cause performance
issues.
-An HDF5 application can use [H5Pset_file_space_strategy][u12]
-with a value of `H5F_FSPACE_STRATEGY_PAGE` to enable the paged
+An HDF5 application can use #H5Pset_file_space_strategy
+with a value of #H5F_FSPACE_STRATEGY_PAGE to enable the paged
aggregation feature, which can accumulate metadata and raw
data for dataset data chunks into well-aligned, configurably
-sized "pages" for better performance. However, note that using
+sized pages for better performance. However, note that using
the paged aggregation feature will cause any setting from
-[H5Pset_alignment][u11]
+#H5Pset_alignment
to be ignored. While an application should be able to get
-comparable performance effects by [setting the size of these pages][u13]
-to be equal to the value that would have been set for [H5Pset_alignment][u11],
+comparable performance effects by setting the size of these pages (via #H5Pset_file_space_page_size)
+to be equal to the value that would have been set for #H5Pset_alignment,
this may not necessarily be the case and should be studied.
-Note that [H5Pset_file_space_strategy][u12]
-has a `persist` parameter. This determines whether or not the
+Note that #H5Pset_file_space_strategy
+has a persist parameter. This determines whether or not the
file free space manager should include extra metadata in the
HDF5 file about free space sections in the file. If this
-parameter is `false`, any free space in the HDF5 file will
+parameter is false, any free space in the HDF5 file will
become unusable once the HDF5 file is closed. For parallel
-compression, it's generally recommended that `persist` be set
-to `true`, as this will keep better track of file free space
+compression, it's generally recommended that persist be set
+to true, as this will keep better track of file free space
for data chunks between accesses to the HDF5 file.
-```
+\code
hid_t fcpl_id = H5Pcreate(H5P_FILE_CREATE);
/* Use persistent free space manager with paged aggregation */
H5Pset_file_space_strategy(fcpl_id, H5F_FSPACE_STRATEGY_PAGE, 1, 1);
@@ -294,58 +293,43 @@ H5Pset_file_space_strategy(fcpl_id, H5F_FSPACE_STRATEGY_PAGE, 1, 1);
H5Pset_file_space_page_size(fcpl_id, 1048576);
...
hid_t file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, fcpl_id, fapl_id);
-```
+\endcode
-### Low-level collective vs. independent I/O
+\subsection subsec_parcompr_perf_low Low-level collective vs. independent I/O
While the parallel compression feature requires that the HDF5
application set and maintain collective I/O at the application
-interface level (via [H5Pset_dxpl_mpio][u14]),
+interface level (via #H5Pset_dxpl_mpio),
it does not require that the actual MPI I/O that occurs at
the lowest layers of HDF5 be collective; independent I/O may
perform better depending on the application I/O patterns and
parallel file system performance, among other factors. The
-application may use [H5Pset_dxpl_mpio_collective_opt][u15]
+application may use #H5Pset_dxpl_mpio_collective_opt
to control this setting and see which I/O method provides the
best performance.
-```
+\code
hid_t dxpl_id = H5Pcreate(H5P_DATASET_XFER);
H5Pset_dxpl_mpio(dxpl_id, H5FD_MPIO_COLLECTIVE);
H5Pset_dxpl_mpio_collective_opt(dxpl_id, H5FD_MPIO_INDIVIDUAL_IO); /* Try independent I/O */
H5Dwrite(..., dxpl_id, ...);
-```
+\endcode
-### Runtime HDF5 Library version
+\subsection subsec_parcompr_perf_libver Runtime HDF5 Library version
-An HDF5 application can use the [H5Pset_libver_bounds][u16]
+An HDF5 application can use the #H5Pset_libver_bounds
routine to set the upper and lower bounds on library versions
to use when creating HDF5 objects. For parallel compression
specifically, setting the library version to the latest available
version can allow access to better/more efficient chunk indexing
types and data encoding methods. For example:
-```
+\code
...
hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);
hid_t file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);
...
-```
-
-[u1]: https://github.com/HDFGroup/hdf5/blob/develop/HDF5Examples/C/H5PAR/ph5_filtered_writes.c
-[u2]: https://github.com/HDFGroup/hdf5/blob/develop/HDF5Examples/C/H5PAR/ph5_filtered_writes_no_sel.c
-[u3]: https://hdfgroup.github.io/hdf5/develop/group___h5_d.html#gaf6213bf3a876c1741810037ff2bb85d8
-[u4]: https://hdfgroup.github.io/hdf5/develop/group___h5_d.html#ga8eb1c838aff79a17de385d0707709915
-[u5]: https://hdfgroup.github.io/hdf5/develop/group___d_c_p_l.html#ga85faefca58387bba409b65c470d7d851
-[u6]: https://hdfgroup.github.io/hdf5/develop/group___d_c_p_l.html#ga4335bb45b35386daa837b4ff1b9cd4a4
-[u7]: https://hdfgroup.github.io/hdf5/develop/group___d_c_p_l.html#ga6bd822266b31f86551a9a1d79601b6a2
-[u8]: https://www.hdfgroup.org/2022/03/04/parallel-compression-improvements-in-hdf5-1-13-1/
-[u9]: https://hdfgroup.github.io/hdf5/develop/chunking__in__hdf5_8dox.html
-[u10]: https://support.hdfgroup.org/releases/hdf5/documentation/hdf5_topics/HDF5ImprovingIOPerformanceCompressedDatasets.pdf
-[u11]: https://hdfgroup.github.io/hdf5/develop/group___f_a_p_l.html#gab99d5af749aeb3896fd9e3ceb273677a
-[u12]: https://hdfgroup.github.io/hdf5/develop/group___f_c_p_l.html#ga167ff65f392ca3b7f1933b1cee1b9f70
-[u13]: https://hdfgroup.github.io/hdf5/develop/group___f_c_p_l.html#gad012d7f3c2f1e1999eb1770aae3a4963
-[u14]: https://hdfgroup.github.io/hdf5/develop/group___d_x_p_l.html#ga001a22b64f60b815abf5de8b4776f09e
-[u15]: https://hdfgroup.github.io/hdf5/develop/group___d_x_p_l.html#gacb30d14d1791ec7ff9ee73aa148a51a3
-[u16]: https://hdfgroup.github.io/hdf5/develop/group___f_a_p_l.html#gacbe1724e7f70cd17ed687417a1d2a910
+\endcode
+
+*/
diff --git a/doxygen/dox/threadsafety-warning.dox b/doxygen/dox/threadsafety-warning.dox
new file mode 100644
index 00000000000..16194f22a96
--- /dev/null
+++ b/doxygen/dox/threadsafety-warning.dox
@@ -0,0 +1,42 @@
+/** \page ThrdSafe HDF5 Threadsafety Warning
+Any application that creates threads that use the HDF5 library must join those threads before
+either process exit or library close through #H5close. If any HDF5-using threads are not joined,
+those threads may exhibit undefined behavior.
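+
+For example (a sketch using POSIX threads, where worker is assumed to be a thread the
+application created that makes HDF5 calls):
+\code
+pthread_join(worker, NULL); /* Join every HDF5-using thread first */
+H5close();                  /* Only then close the library */
+\endcode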
+
+\section sec_thrdsafe Discussion for Developers on Potential Improvements
+
+It would in principle be possible to make it safe to have threads continue using HDF5 resources
+after a call to #H5close by keeping a count of threads within the library. (There is probably
+no solution to an early process exit producing undefined behavior within threads.) This method
+would only be able to count (and presumably, only _need_ to count) threads that directly interact
+with the library. Because each thread would need to be counted exactly once, this would most
+likely be done by use of a thread-local key with e.g. a boolean value used to track whether
+a global atomic thread counter has already counted this thread. Then, if #H5close is invoked
+while this thread counter is above one (because one thread must be doing the closing), the library
+would not close, and would instead keep its resources valid in the hope of avoiding bad behavior in those threads.
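+
+A hypothetical C11 sketch of this counting scheme (none of these names exist in the
+HDF5 library):
+\code
+#include <stdatomic.h>
+#include <stdbool.h>
+
+static atomic_uint        n_hdf5_threads = 0;
+static _Thread_local bool thread_counted = false;
+
+/* Would run near the top of (almost) every API call */
+static void count_this_thread(void)
+{
+    if (!thread_counted) {
+        thread_counted = true;
+        atomic_fetch_add(&n_hdf5_threads, 1);
+    }
+}
+\endcode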
+
+The issues with this approach are as follows:
+
+- Checking the existence/value of the thread-local key is slow, or at least slow enough that
+ adding the check to almost every API call is probably not worth it just to handle this
+ particular edge case.
+- Even with this approach, bad behavior would still be possible if the application does something
+ like expose HDF5 resources to threads indirectly via a global variable.
+- How to allow #H5close to fail is non-obvious. #H5close could be allowed to return an error
+ indicating a failure to close, but the number of applications which could usefully respond to such
+ an error by joining threads is small. If an application were able/willing to join its created
+ threads, presumably it would have done so before calling #H5close. Alternatively, #H5close could
+ succeed but silently leave the library open. This creates the potential for confusing, unexpected
+ behavior when the user thinks they are closing and re-opening the library, e.g. if environment
+ variables are modified between close and re-open, or if resources such as default property lists
+ are modified.
+- Applications should join threads before closing libraries that those threads are using, so all
+ of this work would constitute an above-and-beyond effort to maintain safe and defined behavior in
+ the face of an unsafe application.
+
+
+Despite these issues, if a more performant method of thread counting were found,
+it might still constitute a worthwhile change.
+
+*/
+
\ No newline at end of file
diff --git a/doxygen/examples/tables/fileDriverLists.dox b/doxygen/examples/tables/fileDriverLists.dox
index 437d32a7b93..c321284944b 100644
--- a/doxygen/examples/tables/fileDriverLists.dox
+++ b/doxygen/examples/tables/fileDriverLists.dox
@@ -71,7 +71,7 @@
*
//! [supported_file_driver_table]
-Supported file drivers
+Supported file drivers
Driver Name |
Driver Identifier |
diff --git a/doxygen/examples/tables/propertyLists.dox b/doxygen/examples/tables/propertyLists.dox
index 76727b58a59..4480cabc64f 100644
--- a/doxygen/examples/tables/propertyLists.dox
+++ b/doxygen/examples/tables/propertyLists.dox
@@ -2,7 +2,7 @@
*
//! [plcr_table]
-Property list class root functions (H5P)
+Property list class root functions (H5P)
Function |
Purpose |
@@ -32,7 +32,7 @@
*
//! [plcra_table]
-Property list class root (Advanced) functions (H5P)
+Property list class root (Advanced) functions (H5P)
Function |
Purpose |
@@ -102,7 +102,7 @@
*
//! [fcpl_table]
-File creation property list functions (H5P)
+File creation property list functions (H5P)
Function |
Purpose |
@@ -157,7 +157,7 @@ creation property list.
*
//! [fapl_table]
-File access property list functions (H5P)
+File access property list functions (H5P)
Function |
Purpose |
@@ -298,7 +298,7 @@ versions used when creating objects.
*
//! [fd_pl_table]
-File driver property list functions (H5P)
+File driver property list functions (H5P)
Function |
Purpose |
@@ -418,7 +418,7 @@ and one raw data file.
*
//! [dcpl_table]
-Dataset creation property list functions (H5P)
+Dataset creation property list functions (H5P)
Function |
Purpose |
@@ -579,7 +579,7 @@ encoding for object names.
*
//! [dapl_table]
-Dataset access property list functions (H5P)
+Dataset access property list functions (H5P)
Function |
Purpose |
@@ -621,7 +621,7 @@ encoding for object names.
*
//! [dxpl_table]
-Data transfer property list functions (H5P)
+Data transfer property list functions (H5P)
C Function |
Purpose |
@@ -727,7 +727,7 @@ of the library for reading or writing the actual data.
*
//! [gcpl_table]
-Group creation property list functions (H5P)
+Group creation property list functions (H5P)
Function |
Purpose |
@@ -820,7 +820,7 @@ encoding for object names.
*
//! [gapl_table]
-Group access property list functions (H5P)
+Group access property list functions (H5P)
Function |
Purpose |
@@ -834,7 +834,7 @@ encoding for object names.
*
//! [lapl_table]
-Link access property list functions (H5P)
+Link access property list functions (H5P)
Function |
Purpose |
@@ -864,7 +864,7 @@ encoding for object names.
*
//! [ocpl_table]
-Object creation property list functions (H5P)
+Object creation property list functions (H5P)
Function |
Purpose |
@@ -910,7 +910,7 @@ encoding for object names.
*
//! [ocpypl_table]
-Object copy property list functions (H5P)
+Object copy property list functions (H5P)
Function |
Purpose |
@@ -936,7 +936,7 @@ encoding for object names.
*
//! [strcpl_table]
-String creation property list functions (H5P)
+String creation property list functions (H5P)
Function |
Purpose |
@@ -950,7 +950,7 @@ encoding for object names.
*
//! [lcpl_table]
-Link creation property list functions (H5P)
+Link creation property list functions (H5P)
Function |
Purpose |
@@ -964,7 +964,7 @@ encoding for object names.
*
//! [acpl_table]
|