shareiscsi implementation #1

Open
wants to merge 1,239 commits into
base: master

FransUrbo
Owner

This is iSCSI sharing for ZoL (shareiscsi).

  • Supports the following iSCSI implementations (in this order of discovery):
  • Will refuse to unshare an active target (one with sessions).
  • Supports the following options to the 'shareiscsi' property:
    • name/iqn Full iSCSI Qualified Name (IQN), including identifier.
      This is generated by iscsi_generate_target():
      + Uses the content of an optional /etc/iscsi_target_id file:
      Example: iqn.YYYY-MM.tld.domain
      + If this file doesn't exist, it uses the current year, month
      and domain name to generate the iqn.
      + The dataset name is appended at the end of the iqn, with
      slashes replaced with dots.
      => :
    • lun LUN (0-16384)
      Default: 0 (1 for STGT)
    • type Share mode (fileio, blockio, nullio, disk, tape)
      STGT: ssc, pt
      Default: blockio (disk for STGT)
    • iomode IO mode (wb, wt, ro)
      STGT: rdwr, aio, mmap, sg, ssc
      Default: wt (rdwr for STGT)
    • blocksize Logical block size (512, 1024, 2048, 4096)
      Default: Volume blocksize, 4096 if not usable.
      NOTE: Currently not supported for STGT (doesn't seem to be
      an option for it in tgtadm).
    • initiator Allow only this initiator to bind to target.
      Currently only available for LIO, STGT and SCST.
    • authname Global user to use in binds on targets.
    • authpass Password for global user.
  • If not called with a 'name/iqn' value, then force setting it.
    This is so that the IQN doesn't change every month (when it would
    otherwise be regenerated). It will be generated by
    iscsi_generate_target() (see above).
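
As a rough illustration of the naming rules described above, here is a minimal Python sketch. It is not the code of iscsi_generate_target(); the function name generate_iqn(), its parameters, the reversed-domain handling and the ':' separator before the dataset name are all assumptions made for the example.

```python
from datetime import date
from typing import Optional


def generate_iqn(dataset: str, domain: str, today: date,
                 target_id: Optional[str] = None) -> str:
    """Build an IQN for a dataset (hypothetical helper, illustration only)."""
    if target_id is not None:
        # Stand-in for the content of the optional /etc/iscsi_target_id file.
        base = target_id.strip()
    else:
        # No file: use the current year, month and the (reversed) domain name.
        reversed_domain = ".".join(reversed(domain.split(".")))
        base = f"iqn.{today.year:04d}-{today.month:02d}.{reversed_domain}"
    # Append the dataset name with slashes replaced by dots.
    # The ':' separator is an assumption of this sketch.
    return f"{base}:{dataset.replace('/', '.')}"


print(generate_iqn("tank/vols/disk0", "example.com", date(2016, 8, 2)))
```

Calling the helper with the same dataset in two different months gives two different names, which is why the generated value is stored in the property rather than recomputed.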

NOTE:

  • Moved nfs.c:foreach_nfs_shareopt() to libshare.c:foreach_shareopt()
    so that it can be (re)used in smb.c and iscsi.c.

  • Split iSCSI implementation into their own separate source files.

  • Use list_{create,insert}() etc. to keep track of the linked lists instead of using a home-made version.

  • A half-second delay had to be added in lib/libzfs/libzfs_mount.c:zfs_unmount() after the successful unshare. This is to avoid 'dataset busy' when destroying recursively.

  • The 'initiator', 'authname' and 'authpass' options might have some issues:
    They make compatibility between the different iSCSI implementations
    questionable - one can't switch between them easily (they will ONLY be
    available in ZoL). I COULD make the options be silently ignored (instead
    of forcibly rejected if not available).

    But that still doesn't solve the (possible) problem between ZoL and
    OpenZFS/Illumos (setting the option will possibly introduce problems when
    importing the pool on something other than ZoL).

    So some more discussion might be needed.

FransUrbo pushed a commit that referenced this pull request Aug 2, 2016
DMU_MAX_ACCESS should be cast to a uint64_t otherwise the
multiplication of DMU_MAX_ACCESS with spa_asize_inflation will
be 32 bit and may lead to an overflow. Currently DMU_MAX_ACCESS
is 64 * 1024 * 1024, so spa_asize_inflation being 64 or more will
lead to an overflow.
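
The effect can be reproduced with a small Python simulation. This is only a sketch: Python integers do not overflow, so the helper as_int32() (a name invented here) mimics two's-complement wraparound, whereas in C the overflow of a signed int is undefined behavior.

```python
DMU_MAX_ACCESS = 64 * 1024 * 1024  # value quoted in the commit message


def as_int32(value: int) -> int:
    """Wrap a Python int the way a two's-complement 32-bit int would."""
    return (value + 2**31) % 2**32 - 2**31


def product_32bit(spa_asize_inflation: int) -> int:
    # Multiplication carried out in 32 bits and only widened afterwards,
    # which is the OVERFLOW_BEFORE_WIDEN pattern.
    return as_int32(DMU_MAX_ACCESS * spa_asize_inflation)


def product_64bit(spa_asize_inflation: int) -> int:
    # Operand widened first, as a (uint64_t) cast on DMU_MAX_ACCESS does.
    return (DMU_MAX_ACCESS * spa_asize_inflation) % 2**64


print(product_32bit(64), product_64bit(64))
```

With an inflation factor of 64 the 32-bit product wraps to 0, while the widened product is 4294967296.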

Found by static analysis with CoverityScan 0.8.5

CID 150942 (#1 of 1): Unintentional integer overflow
  (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression
  67108864 * spa_asize_inflation with type int (32 bits, signed)
  is evaluated using 32-bit arithmetic, and then used in a context
  that expects an expression of type uint64_t (64 bits, unsigned).

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#4889
FransUrbo pushed a commit that referenced this pull request Aug 2, 2016
Leaks reported by using AddressSanitizer, GCC 6.1.0

Direct leak of 4097 byte(s) in 1 object(s) allocated from:
    #1 0x414f73 in process_options cmd/ztest/ztest.c:721

Direct leak of 5440 byte(s) in 17 object(s) allocated from:
    #1 0x41bfd5 in umem_alloc ../../lib/libspl/include/umem.h:88
    #2 0x41bfd5 in ztest_zap_parallel cmd/ztest/ztest.c:4659
    #3 0x4163a8 in ztest_execute cmd/ztest/ztest.c:5907

Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#4896
Chunwei Chen and others added 27 commits June 1, 2017 06:39
If, for example, your aux device was /dev/sdc, but the aux device has since
been removed and /dev/sdc now points to another device, zpool import will
still use that device and corrupt it.

The problem is that spa_validate_aux in spa_import, rather than validating
the on-disk label, would actually write a label to disk. We remove these
calls since spa_load_{spares,l2cache} seems to do everything we need and
they actually validate the on-disk label.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#6158
When inheriting the "snapdev" property we don't always call
zfs_prop_set_special(): this prevents device nodes from being created in
certain situations. Because "snapdev" is the only *special* property
that is also inheritable we need to call zfs_prop_set_special() even
when we're not reverting it to the received value ('zfs inherit -S').

Additionally, fix a NULL pointer dereference accidentally introduced in
5559ba0 that can be triggered when setting the "snapdev" property to
the value "hidden" twice.

Finally, add a new test case "zvol_misc_snapdev" to the ZFS Test Suite.

Reviewed by: Boris Protopopov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes openzfs#6131 
Closes openzfs#6175 
Closes openzfs#6176
Users can now provide their own scripts to be run
with 'zpool iostat/status -c'. User scripts should be
placed in ~/.zpool.d to be included in zpool's
default search path.

Provide a script which can be used with
'zpool iostat|status -c' that will return the type of
device (hdd, ssd, file).

Provide a script to get various values from smartctl
when using 'zpool iostat/status -c'.

Allow users to define the ZPOOL_SCRIPTS_PATH
environment variable which can be used to override
the default 'zpool iostat/status -c' search path.

Allow the ZPOOL_SCRIPTS_ENABLED environment
variable to enable or disable 'zpool status/iostat -c'
functionality.

Use the new smart script to provide the serial command.

Install /etc/sudoers.d/zfs file which contains the sudoer
rule for smartctl as a sample.

Allow 'zpool iostat/status -c' tests to run in tree.

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#6121 
Closes openzfs#6153
Since torvalds/linux@d0a5b99 IOP_XATTR is used to indicate the inode
has xattr support: clear it for the ctldir inodes to avoid EIO errors.

Reviewed-by: Chunwei Chen <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes openzfs#6189
Allow new members to be added to a pool mixing raidz and mirror vdevs
without giving -f, as long as they have matching redundancy.  This case
was missed in openzfs#5915, which only handled zpool create.

Add zfstest zpool_add_010_pos.ksh, with test of zpool create
followed by zpool add of mixed raidz and mirror vdevs.

Add some more mixed raidz and mirror cases to zpool_create_006_pos.ksh.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Haakan Johansson <[email protected]>
Issue openzfs#5915 
Closes openzfs#6181
The number of blocks which can be freed per TXG is controlled
by the zfs_free_max_blocks module option (defaults to 100,000).
Both speed up this test case and reduce the memory requirements
by only creating 4 TXGs worth of blocks to be freed.

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#5479 
Closes openzfs#6192
zpool_create_024_pos, zvol_misc_002_pos, write_dirs_002_pos are slow
on the buildbot 32-bit builder. Skip the test cases for now on 32-bit
builders.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#6195
Commit torvalds/linux@9e8925b6 allowed for kernels to be built
without support for mandatory locking (MS_MANDLOCK).  This will
result in 'zfs mount' failing when the nbmand=on property is set
if the kernel is built without CONFIG_MANDATORY_FILE_LOCKING.

Unfortunately we can not reliably detect prior to the mount(2) system
call if the kernel was built with this support.  The best we can do
is check if the mount failed with EPERM and if we passed 'mand'
as a mount option and then print a more useful error message. e.g.

  filesystem 'tank/fs' has the 'nbmand=on' property set, this mount
  option may be disabled in your kernel.  Use 'zfs set nbmand=off'
  to disable this option and try to mount the filesystem again.

Additionally, switch the default error message case to use
strerror() to produce a more human readable message.
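
The check described above can be sketched in Python as follows. The function mount_error_message() and the wording of the fallback message are hypothetical; only the EPERM-plus-'mand' condition and the nbmand text come from the commit message.

```python
import errno
import os


def mount_error_message(dataset: str, err: int, mount_options: str) -> str:
    """Pick an error message after a failed mount(2) call (illustration)."""
    options = mount_options.split(",")
    if err == errno.EPERM and "mand" in options:
        # The kernel may have been built without mandatory locking support.
        return (f"filesystem '{dataset}' has the 'nbmand=on' property set, "
                "this mount option may be disabled in your kernel.  "
                "Use 'zfs set nbmand=off' to disable this option and try to "
                "mount the filesystem again.")
    # Default case: fall back to the human readable strerror() text.
    return f"filesystem '{dataset}' can not be mounted: {os.strerror(err)}"


print(mount_error_message("tank/fs", errno.EPERM, "rw,mand"))
```

An EPERM without the 'mand' option, or any other errno, falls through to the generic strerror() message.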

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#4729
Closes openzfs#6199
…uffers

Authored by: Matthew Ahrens <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>

When writing pre-compressed buffers, arc_write() requires that
the compression algorithm used to compress the buffer matches
the compression algorithm requested by the zio_prop_t, which is
set by dmu_write_policy(). This makes dmu_write_policy() and its
callers a bit more complicated.

We simplify this by making arc_write() trust the caller to supply
the type of pre-compressed buffer that it wants to write,
and override the compression setting in the zio_prop_t.

OpenZFS-issue: https://www.illumos.org/issues/8155
OpenZFS-commit: openzfs/openzfs@b55ff58
Closes openzfs#6200
- After some ZIL changes 6 years ago zil_slog_limit got partially broken
due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
Actually because of other changes about that time zl_itx_list_sz is not
really required to implement the functionality, so this patch removes
some unneeded broken code and variables.

 - Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
single heavy logger, that increased latency for other (more latency critical)
loggers, by pushing heavy log out into the main pool instead of SLOG.  Beside
huge latency increase for heavy writers, this implementation caused double
write of all data, since the log records were explicitly prepared for SLOG.
Since we now have I/O scheduler, I've found it can be much more efficient
to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE
to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG.

 - Existing ZIL implementation had problem with space efficiency when it
has to write large chunks of data into log blocks of limited size.  In some
cases efficiency dropped to almost as low as 50%.  In case of ZIL stored on
spinning rust, that also reduced log write speed in half, since head had to
uselessly fly over allocated but not written areas.  This change improves
the situation by offloading problematic operations from z*_log_write() to
zil_lwb_commit(), which knows real situation of log blocks allocation and
can split large requests into pieces much more efficiently.  Also as side
effect it removes one of two data copy operations done by ZIL code WR_COPIED
case.

 - While there, untangle and unify code of z*_log_write() functions.
Also zfs_log_write(), like zvol_log_write(), can now handle writes crossing
block boundary, that may also improve efficiency if ZPL is made to do that.

Sponsored by:   iXsystems, Inc.

Authored by: Alexander Motin <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Reviewed by: Andriy Gapon <[email protected]>
Reviewed by: Steven Hartland <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/7578
OpenZFS-commit: openzfs/openzfs@aeb13ac
Closes openzfs#6191
dmu_object_alloc() is single-threaded, so when multiple threads are
creating files in a single filesystem, they spend a lot of time waiting
for the os_obj_lock.  To improve performance of multi-threaded file
creation, we must make dmu_object_alloc() typically not grab any
filesystem-wide locks.

The solution is to have a "next object to allocate" for each CPU. Each
of these "next object"s is in a different block of the dnode object, so
that concurrent allocation holds dnodes in different dbufs.  When a
thread's "next object" reaches the end of a chunk of objects (by default
4 blocks worth -- 128 dnodes), it will be reset to the per-objset
os_obj_next, which will be increased by a chunk of objects (128).  Only
when manipulating the os_obj_next will we need to grab the os_obj_lock.
This decreases lock contention dramatically, because each thread only
needs to grab the os_obj_lock briefly, once per 128 allocations.

This results in a 70% performance improvement to multi-threaded object
creation (where each thread is creating objects in its own directory),
from 67,000/sec to 115,000/sec, with 8 CPUs.
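
The chunked scheme can be modelled in a few lines of Python. This is a simplified sketch, not the dmu_object_alloc() code: the class ObjectAllocator and its fields are invented for the example, the CPU index is passed in explicitly, and it only counts how often the shared lock is taken.

```python
import threading

CHUNK = 128  # dnodes per chunk, the default quoted above


class ObjectAllocator:
    """Toy model of per-CPU "next object" allocation (illustration only)."""

    def __init__(self, ncpus: int) -> None:
        self.os_obj_lock = threading.Lock()
        self.os_obj_next = 0            # per-objset next chunk start
        self.lock_acquisitions = 0
        self.cpu_next = [0] * ncpus     # per-CPU "next object"
        self.cpu_end = [0] * ncpus      # empty range forces a chunk claim

    def alloc(self, cpu: int) -> int:
        if self.cpu_next[cpu] >= self.cpu_end[cpu]:
            # End of this CPU's chunk: claim a new one under the lock.
            with self.os_obj_lock:
                self.lock_acquisitions += 1
                self.cpu_next[cpu] = self.os_obj_next
                self.os_obj_next += CHUNK
            self.cpu_end[cpu] = self.cpu_next[cpu] + CHUNK
        obj = self.cpu_next[cpu]
        self.cpu_next[cpu] += 1
        return obj


allocator = ObjectAllocator(ncpus=2)
objects = [allocator.alloc(cpu) for _ in range(256) for cpu in (0, 1)]
print(len(set(objects)), allocator.lock_acquisitions)
```

In this model 512 allocations spread over two CPUs take the lock four times, once per 128 allocations on each CPU, and every object number is handed out exactly once.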

Work sponsored by Intel Corp.

Authored by: Matthew Ahrens <[email protected]>
Reviewed-by: Ned Bass <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Matthew Ahrens <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/8199
OpenZFS-commit: openzfs/openzfs#374
Closes openzfs#4703
Closes openzfs#6117
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>

dbuf_evict_notify() holds the dbuf_evict_lock while checking if it should
do the eviction itself (because the evict thread is not able to keep up).
This can result in massive lock contention.  It isn't necessary to hold
the lock, because if we make the wrong choice occasionally, nothing bad
will happen. This commit results in a ~60% performance improvement for
ARC-cached sequential reads.

OpenZFS-issue: https://www.illumos.org/issues/8156
OpenZFS-commit: openzfs/openzfs@f73e5d9
Closes openzfs#6204
Authored by: Paul Dagnelie <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Kash Pande <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>

The send size estimate for a zvol can be too low, if the size of the
record headers (dmu_replay_record_t's) is a significant portion of the
size. This is typically the case when the data is highly compressible,
especially with embedded blocks.

The problem is that dmu_adjust_send_estimate_for_indirects() assumes
that blocks are the size of the "recordsize" property (128KB). However,
for zvols, the blocks are the size of the "volblocksize" property (8KB).
Therefore, we estimate that there will be 16x fewer record headers than
there really will be.

The fix is to check the type of the object set (whether it is a zvol or
not) and pick the appropriate property. In addition, while we are at it,
we also add the size of the BEGIN and END records to the estimate.
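
The 16x figure follows directly from the two block sizes. A short Python sketch of the arithmetic is given below; the helper names are made up, and HEADER_BYTES is a placeholder for the size of a dmu_replay_record_t that is assumed here rather than taken from the source.

```python
RECORDSIZE = 128 * 1024    # "recordsize" value quoted above
VOLBLOCKSIZE = 8 * 1024    # "volblocksize" value quoted above
HEADER_BYTES = 312         # assumed header size, for illustration only


def record_header_count(data_bytes: int, block_bytes: int) -> int:
    """Number of per-block record headers, rounding up."""
    return -(-data_bytes // block_bytes)


def header_overhead(data_bytes: int, block_bytes: int) -> int:
    """Bytes of record headers needed to send data_bytes of blocks."""
    return record_header_count(data_bytes, block_bytes) * HEADER_BYTES


one_gib = 1024 ** 3
wrong = record_header_count(one_gib, RECORDSIZE)     # old assumption
right = record_header_count(one_gib, VOLBLOCKSIZE)   # zvol block size
print(wrong, right, right // wrong)
```

For 1 GiB of zvol data the old assumption counts 8192 headers where 131072 are actually sent, so the header part of the estimate is low by a factor of 16.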

OpenZFS-issue: https://www.illumos.org/issues/8056
OpenZFS-commit: openzfs/openzfs@faf09cd
Closes openzfs#6205
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: DHE <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Jack Draak <[email protected]>
Signed-off-by: Kash Pande <[email protected]>
Closes openzfs#6203
The log function log_must_busy was added in commit e623aea for
this purpose.  Update destroy_pool to use it.

Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#6217
Buildbots and zfs-tests regularly see 7 kilobytes of stack
usage with this function. Convert self-calls to iterations.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: DHE <[email protected]>
Closes openzfs#6219
Cleanup zpool_import_all_001_pos to no longer use devices.
The test is meant to test zpool import -a and by no longer
requiring devices, a number of dependencies are no longer
necessary.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#6198
This continues what was started in
0eef1bd by fully converting zvols
to avoid unnecessary dnode_hold() calls. This saves a small amount
of CPU time and slightly improves latencies of operations on zvols.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#6058
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
In arc_evict_state() we start pruning when arc_dnode_size >
arc_dnode_limit, i.e. arc_dnode_limit is a ceiling rather than a
floor.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris Dunlop <[email protected]>
Closes openzfs#6228
5559ba0 added zv_state_lock to protect zvol_state_t internal data:
this, however, doesn't guard zv->zv_open_count and
zv->zv_disk->private_data in zvol_remove_minors_impl().

Fix this by taking zv->zv_state_lock before we check its zv_open_count.

P1 (z_zvol)                       P2 (systemd-udevd)
---                               ---
zvol_remove_minors_impl()
: zv->zv_open_count==0
                                  zvol_open()
                                  ->mutex_enter(zv_state_lock)
                                  : zv->zv_open_count++
                                  ->mutex_exit(zv_state_lock)
->mutex_enter(zv->zv_state_lock)
->zvol_remove(zv)
->mutex_exit(zv->zv_state_lock)
: zv->zv_disk->private_data = NULL
->zvol_free()
-->ASSERT(zv->zv_open_count==0) *
                                  zvol_release()
                                  : zv = disk->private_data
                                  ->ASSERT(zv && zv->zv_open_count>0) *
---                               ---
* ASSERT() fails

Reviewed by: Boris Protopopov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes openzfs#6213
Use queue_flag_set_unlocked() in zvol_alloc().

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Boris Protopopov <[email protected]>
Issue openzfs#6226
Add links for information about the ZFS buildbot options
to the contributing guidelines and PR template.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#6235
In the original form of device error injection, it was an all or nothing
situation.  To help simulate intermittent error conditions, you can now
specify a real number percentage value. This is also very useful for our
ZFS fault diagnosis testing and for injecting intermittent errors during
load testing.
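
The idea of a fractional injection rate can be sketched as a per-I/O random draw. This Python snippet is only an illustration of the concept; should_inject() is a hypothetical name and says nothing about how zinject or the kernel implement the check.

```python
import random


def should_inject(percent: float, rng: random.Random) -> bool:
    """Return True for roughly `percent` percent of calls."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    # rng.random() is in [0, 1), so 100 always injects and 0 never does.
    return rng.random() * 100.0 < percent


rng = random.Random(1234)  # fixed seed so a run can be repeated
hits = sum(should_inject(2.5, rng) for _ in range(100_000))
print(f"{hits} of 100000 simulated I/Os would see an injected error")
```

A value of 100 reproduces the original all-or-nothing behavior, while small real-number values give the intermittent failures described above.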

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes openzfs#6227
In zfs/dmu_object and icp/core/kcf_sched, the CPU_SEQID macro
should be surrounded by `kpreempt_disable` and `kpreempt_enable`
calls to avoid a Linux kernel BUG warning.  These code paths use
the cpuid to minimize lock contention and it is safe to reschedule
the process to a different processor at any time.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Morgan Jones <[email protected]>
Closes openzfs#6239
This prints dashes instead of zeros for zero latency values in
'zpool iostat -p'.  You'll get zero latencies reported when the
disk is idle, but technically a zero latency is invalid, since you
can't measure the latency of doing nothing.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#6210
…ailure

Authored by: Andrew Stormont <[email protected]>
Reviewed by: Marcel Telka <[email protected]>
Reviewed by: Toomas Soome <[email protected]>
Approved by: Dan McDonald <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/8331
OpenZFS-commit: openzfs/openzfs@4f4378c
Closes openzfs#6255
behlendorf and others added 28 commits October 13, 2017 12:39
The default 128M vdev size used by zloop.sh isn't always large
enough and can result in ENOSPC failures which suspend the pool.
Increase the default size to 512M and provide a -s option which
can be used to specify an alternate size.

This does increase the free space requirements to run zloop.sh.
However, since the vdevs are sparse 4x the space is not required.

Reviewed-by: Don Brady <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#6758
Currently the function documentation states that two strings are 
allocated, this is outdated. Only one char ** parameter is passed 
into the function now, clearly only a pointer to a single string 
is returned and needs to be free'd.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tobin C. Harding <[email protected]>
Closes openzfs#6754
* config/deb.am: Enable building DKMS packages for Debian
* rpm/generic/zfs-dkms.spec.in: Adjust spec to be Debian-compatible
  * Condition kernel-devel Req to RPM distros
  * Adjust the DKMS Req to have a minimum of a version only
  * Ensure that --rpm_safe_upgrade isn't used on non-RPM distros
* config/deb.am: Drop CONFIG_KERNEL and CONFIG_USER guards
* Makefile.am: Add pkg-dkms target

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Neal Gompa <[email protected]>
Closes openzfs#6044 
Closes openzfs#6731
CID 147480: Logically dead code (DEADCODE)

Remove non-null check and subsequent function call. Add ASSERT to future
proof the code.

usage label is only jumped to before `zhp` is initialized.

CID 147584: Out-of-bounds access (OVERRUN)

Subtract length of current string from buffer length for `size` argument
to `snprintf`.

Starting address for the write is the start of the buffer + the current
string length. We need to subtract this string length else risk a buffer
overflow.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tobin C. Harding <[email protected]>
Closes openzfs#6745
CID 161388: Resource Leak (RESOURCE_LEAK)

Jump to errout so that file descriptor gets closed before returning
from function.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tobin C. Harding <[email protected]>
Closes openzfs#6755
Update the codecov.yml included in the repository to behave as
originally intended.  This can be refined as needed.

* Always post coverage results to the GitHub PR after two builds
  have been uploaded.  This is the normal case since there will
  be a build uploaded for both kernel and user coverage results.

* Adjust red -> yellow -> green coloring in the web interface.
  Due to the number of unlikely error conditions which are hard
  to force consider 90% coverage an excellent level of coverage.

* Allow a 1% variance in coverage between test runs.  This is
  approximately 10x larger than the typical variance observed
  which leaves us a reasonable margin to prevent false positives.

* Always post a new smaller comment to PRs which does not include
  a file list.  Old coverage reports are removed.

Reviewed by: Prakash Surya <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#6765
This small patch fixes an issue where dmu_free_long_object_raw()
calls dnode_hold() after freeing the dnode a line above.

Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes openzfs#6766
The only place vn_rename and vn_remove are used is when writing
out an updated pool configuration file.  By truncating the file
instead of renaming and removing it we can avoid having to implement
these interfaces entirely.  Functionally an empty cache file is
treated the same as a missing cache file.  This is particularly
advantageous because the Linux kernel has never provided a way
to reliably implement vn_rename and vn_remove.

The cachefile_004_pos.ksh test case was updated to understand
that an empty cache file is the same as a missing one.
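
The "empty equals missing" rule can be shown with a small Python sketch. The helper names cachefile_has_config() and clear_cachefile() are invented for this example and do not correspond to functions in the ZFS source.

```python
import os
import tempfile


def cachefile_has_config(path: str) -> bool:
    """An empty cache file counts the same as a missing one."""
    try:
        return os.path.getsize(path) > 0
    except OSError:
        return False


def clear_cachefile(path: str) -> None:
    """Truncate the file in place instead of renaming and removing it."""
    with open(path, "wb"):
        pass


path = os.path.join(tempfile.mkdtemp(), "zpool.cache")
with open(path, "wb") as f:
    f.write(b"pool configuration")
clear_cachefile(path)
print(os.path.exists(path), cachefile_has_config(path))
```

After clear_cachefile() the file still exists but is reported as holding no configuration, which is the distinction the updated test case has to understand.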

The zfs-import-* systemd service files were not updated to use
ConditionFileNotEmpty in place of ConditionPathExists.  This
means that after exporting all pools and rebooting new pools
will not be scanned for on the next boot.  This small change
should not impact normal usage since pools are not exported
as part of a normal shutdown.

Documentation was updated accordingly.

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Arkadiusz Bubała <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs/spl#648 
Closes openzfs#6753
Add get functions to match existing ones.

Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: John Ramsden <[email protected]>
Closes openzfs#6308
Support integration with new QAT products: Intel(R) C62x Chipset,
or Atom(R) C3000 Processor Product Family SoC:
1. Detect new file name in auto-conf.
2. Change MAX_INSTANCES to 48.
3. Change "num_inst" to U16 to clean a build warning.

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Weigang Li <[email protected]>
Closes openzfs#6767
History commands and events were being suppressed for the
'zpool create' command since the history object did not
yet exist.  Create the object earlier so this history
doesn't get lost.

Split the pool_destroy event into pool_destroy and
pool_export so they may be distinguished.

Updated events_001_pos and events_002_pos test cases.  They
now check for the expected history events and were reworked
to be more reliable.

Reviewed-by: Nathaniel Clark <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#6712 
Closes openzfs#6486
Provide details about the commit message format for Coverity defect
fixes submitted.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#6771
Currently the 480GB models of this disk do not use ashift=12 by
default.  SSDSC2BW48 is also optimized for 4k blocks.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: adisbladis <[email protected]>
Closes openzfs#6774
The ZED is expected to automatically kick in a hot spare device
when there's one available in the pool and a sufficient number of
read errors have been encountered.  Use zinject to simulate the
failure condition and verify the hot spare is used.

auto_spare_001_pos.ksh: read IO errors, the vdev is FAULTED
auto_spare_002_pos.ksh: read CHECKSUM errors, the vdev is DEGRADED

Reviewed by: Richard Elling <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Closes openzfs#6280
Fix new flake8 errors related to bare excepts and ambiguous
variable names due to a STYLE builder update.

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#6776
Use mfu_size and mru_size pulled from the arcstats
kstat file to calculate the mfu and mru percentages
for arc size breakdown.
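
A minimal Python sketch of such a calculation follows. It assumes the percentages are taken relative to mfu_size + mru_size and that the kstat file has "name type data" columns; the sample values are made up and the function names are hypothetical.

```python
def parse_arcstats(text: str) -> dict:
    """Parse "name type data" lines into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[2].isdigit():
            stats[fields[0]] = int(fields[2])
    return stats


def mru_mfu_percentages(stats: dict) -> tuple:
    """Return (mru %, mfu %) of the combined mru_size + mfu_size."""
    total = stats["mru_size"] + stats["mfu_size"]
    if total == 0:
        return (0.0, 0.0)
    return (100.0 * stats["mru_size"] / total,
            100.0 * stats["mfu_size"] / total)


SAMPLE = """name type data
mfu_size 4 3000
mru_size 4 1000
"""  # made-up values, not real kstat output

print(mru_mfu_percentages(parse_arcstats(SAMPLE)))
```

With the sample values the breakdown is 25% recently used and 75% frequently used.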

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Richard Elling <[email protected]>
Reviewed-by: AndCycle <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#5526 
Closes openzfs#6770
Enable commitcheck.sh to test if a commit message is
in the expected format for a coverity defect fix.

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#6777
Allow commitcheck.sh to handle multiple OpenZFS ports in
a single commit. This is useful in the cases when a change
upstream has bug fixes and it makes sense to port them with
the original patch.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Chris Dunlop <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#6780
Otherwise, if arcstat gets interrupted before the desired number of
iterations is reached, the output file will be empty (both if set via
'-o' or via shell redirection).

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Fabian Grünbichler <[email protected]>
Closes openzfs#6775
Added -n flag to zpool reopen that allows a running scrub
operation to continue if there is a device with a Dirty Time Log.

By default if a component device has a DTL and zpool reopen
is executed all running scan operations will be restarted.

Added functional tests for `zpool reopen`

Tests cover the following scenarios:
* `zpool reopen` without arguments,
* `zpool reopen` with pool name as argument,
* `zpool reopen` while scrubbing,
* `zpool reopen -n` while scrubbing,
* `zpool reopen -n` while resilvering,
* `zpool reopen` with bad arguments.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tom Caputi <[email protected]>
Signed-off-by: Arkadiusz Bubała <[email protected]>
Closes openzfs#6076 
Closes openzfs#6746
8558 lwp_create() returns EAGAIN on system with more than 80K ZFS filesystems

On a system with more than 80K ZFS filesystems, we've seen cases
where lwp_create() will start to fail by returning EAGAIN. The
problem being, for each of those 80K ZFS filesystems, a taskq will
be created for each dataset as part of the ZIL for each dataset.

Porting Notes:
- The new nomem taskq kstat was dropped.
- Added module options and documentation for new tunings
  zfs_zil_clean_taskq_nthr_pct, zfs_zil_clean_taskq_minalloc,
  zfs_zil_clean_taskq_maxalloc, and zfs_sync_taskq_batch_pct.

Reviewed by: George Wilson <[email protected]>
Reviewed by: Sebastien Roy <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Authored by: Prakash Surya <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Chris Dunlop <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/8558
OpenZFS-commit: openzfs/openzfs@216d772

8602 remove unused "dp_early_sync_tasks" field from "dsl_pool" structure

Reviewed by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Authored by: Prakash Surya <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Chris Dunlop <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/8602
OpenZFS-commit: openzfs/openzfs@2bcb545
Closes openzfs#6779
Additionally add four new tests:

 * zpool_events_clear: verify 'zpool events -c' functionality
 * zpool_events_cliargs: verify command line options and arguments
 * zpool_events_follow: verify 'zpool events -f'
 * zpool_events_poolname: verify events filtering by pool name

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes openzfs#3285 
Closes openzfs#6762
When dumping objects larger than 128PiB it's possible for do_dump() to
miscalculate the FREE_RECORD offset due to an integer overflow
condition: this prevents the receiving end from correctly restoring
the dumped object.
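
As a generic illustration of this class of bug (not the actual do_dump() arithmetic, which is not shown here), the sketch below emulates unsigned 64-bit multiplication in Python. The names `wrapped_offset` and `checked_offset` and the example values are hypothetical:

```python
UINT64_MAX = (1 << 64) - 1


def wrapped_offset(blkid, span):
    """Emulate an unsigned 64-bit multiply: high bits are silently lost."""
    return (blkid * span) & UINT64_MAX


def checked_offset(blkid, span):
    """Same multiply, but refuse to return a truncated result."""
    product = blkid * span
    if product > UINT64_MAX:
        raise OverflowError("offset does not fit in 64 bits")
    return product
```

With `blkid = 3` and `span = 1 << 63`, the true product needs 65 bits, so `wrapped_offset` returns `1 << 63` instead, while `checked_offset` raises `OverflowError`.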

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Fabian Grünbichler <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes openzfs#6760
The current make recipe for mancheck silently ignores errors. Correct
the recipe so that errors cause the mancheck recipe to fail.

The zpool reopen command in the zpool.8 manpage had a bullet list
without an .El.

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes openzfs#6790
Fix compiler warnings in zdb.  With these changes, FreeBSD can compile
zdb with all compiler warnings enabled save -Wunused-parameter.

usr/src/cmd/zdb/zdb.c
usr/src/cmd/zdb/zdb_il.c
usr/src/uts/common/fs/zfs/sys/sa.h
usr/src/uts/common/fs/zfs/sys/spa.h
	Fix numerous warnings, including:
	* const-correctness
	* shadowing global definitions
	* signed vs unsigned comparisons
	* missing prototypes, or missing static declarations
	* unused variables and functions
	* unreadable array initializations
	* missing struct initializers

usr/src/cmd/zdb/zdb.h
	Add a header file to declare common symbols

usr/src/lib/libzpool/common/sys/zfs_context.h
usr/src/uts/common/fs/zfs/arc.c
usr/src/uts/common/fs/zfs/dbuf.c
usr/src/uts/common/fs/zfs/spa.c
usr/src/uts/common/fs/zfs/txg.c
	Add a function prototype for zk_thread_create, and ensure that every
	callback supplied to this function actually matches the prototype.

usr/src/cmd/ztest/ztest.c
usr/src/uts/common/fs/zfs/sys/zil.h
usr/src/uts/common/fs/zfs/zfs_replay.c
usr/src/uts/common/fs/zfs/zvol.c
	Add a function prototype for zil_replay_func_t, and ensure that
	every function of this type actually matches the prototype.

usr/src/uts/common/fs/zfs/sys/refcount.h
	Change FTAG so it discards any constness of __func__, necessary
	since existing APIs expect it passed as void *.

Porting Notes:
- Many of these fixes have already been applied to Linux.  For
  consistency the OpenZFS version of a change was applied if the
  warning was addressed in an equivalent but different fashion.

Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Authored by: Alan Somers <[email protected]>
Approved by: Richard Lowe <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/8081
OpenZFS-commit: openzfs/openzfs@843abe1b8a
Closes openzfs#6787
The 'zpool status' command supports the -P option for printing full
path names.  It does not support the -p parsable option for printing
exact values.
    
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: loli10K <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#6792 
Closes openzfs#6794
* Supports the following iSCSI implementations (in this order of discovery):
  + IET         http://iscsitarget.sourceforge.net
  + STGT        http://stgt.berlios.de
  + SCST        http://scst.sourceforge.net
    + Requires that SCST was compiled with /sys support (the default).
  + LIO         http://linux-iscsi.org
* Will refuse to unshare an active target (one with sessions).
* Supports the following options to the 'shareiscsi' property:
  + name/iqn    Full iSCSI Qualified Name (IQN), including identifier.
                This is generated by iscsi_generate_target():
                + Uses the content of an optional /etc/iscsi_target_id file:
                  Example: iqn.YYYY-MM.tld.domain
                + If this file doesn't exist, it uses the current year, month
                  and domain name to generate the iqn.
                + The dataset name is appended at the end of the iqn, with
                  slashes replaced with dots.
                  => <iqn>:<dataset>
  + lun         LUN (0-16384)
                Default: 0 (1 for STGT)
  + type        Share mode (fileio, blockio, nullio, disk, tape)
                STGT: ssc, pt
                Default: blockio (disk for STGT)
  + iomode      IO mode (wb, wt, ro)
                STGT: rdwr, aio, mmap, sg, ssc
                Default: wt (rdwr for STGT)
  + blocksize   Logical block size (512, 1024, 2048, 4096)
                Default: Volume blocksize, 4096 if not usable.
                NOTE: Currently not supported for STGT (doesn't seem to be
                      an option for it in tgtadm).
  + initiator   Allow only this initiator to bind to target.
                Currently only available for LIO, STGT and SCST.
  + authname    Global user to use in binds on targets.
  + authpass    Password for global user.
* If not called with a 'name/iqn' value, then force setting it.
  This is so that the IQN doesn't change every month (when it would
  otherwise be regenerated). It will be generated by
  iscsi_generate_target() (see above).
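
The IQN generation rules listed above can be sketched roughly as follows. This is a Python illustration of the described behaviour, not the C code of iscsi_generate_target(). The function name `generate_iqn`, its parameters, and the reversal of the domain name (inferred from the `iqn.YYYY-MM.tld.domain` example) are assumptions:

```python
import datetime
import os


def generate_iqn(dataset, domain, id_file="/etc/iscsi_target_id", today=None):
    """Build '<iqn>:<dataset>' following the rules in the list above."""
    if os.path.isfile(id_file):
        # Optional override file containing e.g. 'iqn.YYYY-MM.tld.domain'.
        with open(id_file) as fh:
            base = fh.read().strip()
    else:
        # Fall back to the current year, month and domain name.
        today = today or datetime.date.today()
        # Assumption: the domain is written in reverse order (tld first).
        reversed_domain = ".".join(reversed(domain.split(".")))
        base = "iqn.%04d-%02d.%s" % (today.year, today.month, reversed_domain)
    # Append the dataset name with slashes replaced by dots.
    return base + ":" + dataset.replace("/", ".")
```

Because the fallback depends on the current month, a generated name that is not stored would differ from one month to the next, which is why the name is forcibly set once it has been generated.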

NOTE:
+ Moved nfs.c:foreach_nfs_shareopt() to libshare.c:foreach_shareopt()
  so that it can be (re)used in smb.c and iscsi.c.
+ Split the iSCSI implementations into their own separate source files.
+ Use list_{create,insert}() etc. to keep track of the linked lists
  instead of using a home-made version.
+ A half second delay had to be added in lib/libzfs/libzfs_mount.c:zfs_unmount()
  after the successful unshare. This is to avoid 'dataset busy' when destroying
  recursively.
= The 'initiator', 'authname' and 'authpass' options might have some issues:
  They will make the compatibility between the different iSCSI implementations
  questionable - can't switch between them easily (they will ONLY be available
  in ZoL). I COULD make the options be silently ignored (instead of forcibly
  rejected if not available).

  But that still doesn't solve the (possible) problem between ZoL and
  OpenZFS/Illumos (setting the option will possibly introduce problems when
  importing the pool on something other than ZoL).

  So some more discussion might be needed.
!! Doesn't - a 'zfs rename' works, but more often than not, the rename
!! IOCTL hangs. Not every time and not always on the first dataset being
!! renamed... Seems a little too random to me.
FransUrbo pushed a commit that referenced this pull request Apr 28, 2019
The bug time sequence:
1. Thread #1: `zfs_write` assigns a txg "n".
2. In the same process, thread #2 takes an mmap page fault (which means
   `mm_sem` is held); `zfs_dirty_inode` fails to open a txg and waits
   for the previous txg "n" to complete.
3. Thread #1 calls `uiomove` to write, but a page fault occurs in
   `uiomove`, which means it needs `mm_sem`. Since `mm_sem` is held by
   thread #2, thread #1 is stuck and cannot complete, so txg "n" will
   not complete.

So thread #1 and thread #2 are deadlocked.
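
The sequence above can be modelled as a small wait-for graph. This is only an illustrative sketch in Python, not kernel code: the `holds` and `wants` tables are a simplified reading of the steps, and `find_cycle` is a hypothetical helper:

```python
def find_cycle(waits_for):
    """Return the list of threads forming a wait cycle, or None.

    waits_for maps each thread to the thread it is blocked on (or None).
    """
    for start in waits_for:
        seen = []
        node = start
        while node is not None and node not in seen:
            seen.append(node)
            node = waits_for.get(node)
        if node is not None:
            return seen[seen.index(node):]
    return None


# Who currently owns each resource, and what each thread is blocked on.
holds = {"mm_sem": "thread2", "txg n": "thread1"}
wants = {"thread1": "mm_sem", "thread2": "txg n"}

# A thread waits for whichever thread holds the resource it wants.
waits_for = {thread: holds.get(res) for thread, res in wants.items()}
cycle = find_cycle(waits_for)
```

Here `cycle` contains both threads: each one waits for a resource that only the other can release.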

Reviewed-by: Chunwei Chen <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Grady Wong <[email protected]>
Closes openzfs#7939