Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Published datasets should contain files #10994

Merged
merged 17 commits into from
Jan 3, 2025
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions doc/release-notes/10981-published-datasets-should-contain-files.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
## Feature: Prevent publishing Datasets without files
qqmyers marked this conversation as resolved.
Show resolved Hide resolved
A new attribute was added to Collections in order to control the publishing of Datasets without files.
Once set to "True", the publishing of a Dataset within a Collection, without files, will be blocked for all non superusers.
In order to configure a Collection to block publishing a superuser must set the attribute "requireFilesToPublishDataset" to "True".
The collection's hierarchy will be checked if the collection's "requireFilesToPublishDataset" attribute is not set explicitly to "True" or "False".

```shell
curl -X PUT -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/dataverses/$ID/attribute/requireFilesToPublishDataset?value=true"
```

See also #10981.
1 change: 1 addition & 0 deletions doc/sphinx-guides/source/api/native-api.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1059,6 +1059,7 @@ The following attributes are supported:
* ``description`` Description
* ``affiliation`` Affiliation
* ``filePIDsEnabled`` ("true" or "false") Restricted to use by superusers and only when the :ref:`:AllowEnablingFilePIDsPerCollection <:AllowEnablingFilePIDsPerCollection>` setting is true. Enables or disables registration of file-level PIDs in datasets within the collection (overriding the instance-wide setting).
* ``requireFilesToPublishDataset`` ("true" or "false") Restricted to use by superusers. Defines if Dataset needs files in order to be published. If not set the determination will be made through inheritance by checking the owners of this collection. Publishing by a superusers will not be blocked.

See also :ref:`update-dataverse-api`.

Expand Down
31 changes: 30 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/Dataverse.java
Original file line number Diff line number Diff line change
Expand Up @@ -602,7 +602,25 @@ public List<DatasetFieldType> getCitationDatasetFieldTypes() {
public void setCitationDatasetFieldTypes(List<DatasetFieldType> citationDatasetFieldTypes) {
this.citationDatasetFieldTypes = citationDatasetFieldTypes;
}


@Column(nullable = true)
private Boolean requireFilesToPublishDataset;
/**
* Specifies whether the existance of files in a dataset is required when publishing
* @return {@code Boolean.TRUE} if explicitly enabled, {@code Boolean.FALSE} if explicitly disabled.
* {@code null} indicates that the behavior is not explicitly defined, in which
* case the behavior should follow the explicit configuration of the first
* direct ancestor collection.
* @Note: If present, this configuration therefore by default applies to all
* the sub-collections, unless explicitly overwritten there.
*/
public Boolean getRequireFilesToPublishDataset() {
return requireFilesToPublishDataset;
}
public void setRequireFilesToPublishDataset(boolean requireFilesToPublishDataset) {
this.requireFilesToPublishDataset = requireFilesToPublishDataset;
}

/**
* @Note: this setting is Nullable, with {@code null} indicating that the
* desired behavior is not explicitly configured for this specific collection.
Expand Down Expand Up @@ -773,6 +791,17 @@ public List<Dataverse> getOwners() {
return owners;
}

public boolean getEffectiveRequiresFilesToPublishDataset() {
Dataverse dv = this;
while (dv != null) {
if (dv.getRequireFilesToPublishDataset() != null) {
return dv.getRequireFilesToPublishDataset();
}
dv = dv.getOwner();
}
return false;
}

@Override
public boolean equals(Object object) {
// TODO: Warning - this method won't work in the case the id fields are not set
Expand Down
2 changes: 1 addition & 1 deletion src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java
Original file line number Diff line number Diff line change
Expand Up @@ -711,7 +711,7 @@ public Response updateAttribute(@Context ContainerRequestContext crc, @PathParam
}

private Object formatAttributeValue(String attribute, String value) throws WrappedResponse {
if (attribute.equals("filePIDsEnabled")) {
if (List.of("filePIDsEnabled","requireFilesToPublishDataset").contains(attribute)) {
return parseBooleanOrDie(value);
}
return value;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,15 +2,18 @@

import edu.harvard.iq.dataverse.Dataset;
import edu.harvard.iq.dataverse.DatasetLock;
import edu.harvard.iq.dataverse.Dataverse;
import edu.harvard.iq.dataverse.authorization.Permission;
import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser;
import edu.harvard.iq.dataverse.engine.command.CommandContext;
import edu.harvard.iq.dataverse.engine.command.DataverseRequest;
import edu.harvard.iq.dataverse.engine.command.RequiredPermissions;
import edu.harvard.iq.dataverse.engine.command.exception.CommandException;
import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException;
import edu.harvard.iq.dataverse.util.BundleUtil;
import edu.harvard.iq.dataverse.workflow.Workflow;
import edu.harvard.iq.dataverse.workflow.WorkflowContext.TriggerType;

import java.util.Optional;
import java.util.logging.Logger;
import static java.util.stream.Collectors.joining;
Expand Down Expand Up @@ -218,9 +221,20 @@ private void verifyCommandArguments(CommandContext ctxt) throws IllegalCommandEx
if (minorRelease && !getDataset().getLatestVersion().isMinorUpdate()) {
throw new IllegalCommandException("Cannot release as minor version. Re-try as major release.", this);
}

if (getDataset().getFiles().isEmpty() && getEffectiveRequiresFilesToPublishDataset()) {
throw new IllegalCommandException(BundleUtil.getStringFromBundle("dataset.mayNotPublish.FilesRequired"), this);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gives a warning in the current UI, right? It would probably be better to just disable the publish button, or have a persistent message that files must be added before this dataset can be published, but if just failing with a warning is OK with @sbarbosadataverse, et. al., the current code avoids changes to the current UI code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevenwinship - can you post a UI image for what this looks like?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevenwinship images like this are so helpful. Can you please also add it under "Does this PR introduce a user interface change?" above?

}
}
private boolean getEffectiveRequiresFilesToPublishDataset() {
if (getUser().isSuperuser()) {
return false;
} else {
Dataverse dv = getDataset().getOwner();
return dv != null && dv.getEffectiveRequiresFilesToPublishDataset();
}
}


@Override
public boolean onSuccess(CommandContext ctxt, Object r) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ public class UpdateDataverseAttributeCommand extends AbstractCommand<Dataverse>
private static final String ATTRIBUTE_DESCRIPTION = "description";
private static final String ATTRIBUTE_AFFILIATION = "affiliation";
private static final String ATTRIBUTE_FILE_PIDS_ENABLED = "filePIDsEnabled";

private static final String ATTRIBUTE_REQUIRE_FILES_TO_PUBLISH_DATASET = "requireFilesToPublishDataset";
private final Dataverse dataverse;
private final String attributeName;
private final Object attributeValue;
Expand All @@ -45,8 +45,9 @@ public Dataverse execute(CommandContext ctxt) throws CommandException {
case ATTRIBUTE_AFFILIATION:
setStringAttribute(attributeName, attributeValue);
break;
case ATTRIBUTE_REQUIRE_FILES_TO_PUBLISH_DATASET:
case ATTRIBUTE_FILE_PIDS_ENABLED:
setBooleanAttributeForFilePIDs(ctxt);
setBooleanAttribute(ctxt, true);
break;
default:
throw new IllegalCommandException("'" + attributeName + "' is not a supported attribute", this);
Expand Down Expand Up @@ -86,25 +87,33 @@ private void setStringAttribute(String attributeName, Object attributeValue) thr
}

/**
* Helper method to handle the "filePIDsEnabled" boolean attribute.
* Helper method to handle boolean attributes.
*
* @param ctxt The command context.
* @param adminOnly True if this attribute can only be modified by an Administrator
* @throws PermissionException if the user doesn't have permission to modify this attribute.
*/
private void setBooleanAttributeForFilePIDs(CommandContext ctxt) throws CommandException {
if (!getRequest().getUser().isSuperuser()) {
private void setBooleanAttribute(CommandContext ctxt, boolean adminOnly) throws CommandException {
if (adminOnly && !getRequest().getUser().isSuperuser()) {
throw new PermissionException("You must be a superuser to change this setting",
this, Collections.singleton(Permission.EditDataset), dataverse);
}
if (!ctxt.settings().isTrueForKey(SettingsServiceBean.Key.AllowEnablingFilePIDsPerCollection, false)) {
throw new PermissionException("Changing File PID policy per collection is not enabled on this server",
this, Collections.singleton(Permission.EditDataset), dataverse);
}

if (!(attributeValue instanceof Boolean)) {
throw new IllegalCommandException("'" + ATTRIBUTE_FILE_PIDS_ENABLED + "' requires a boolean value", this);
throw new IllegalCommandException("'" + attributeName + "' requires a boolean value", this);
}
switch (attributeName) {
case ATTRIBUTE_FILE_PIDS_ENABLED:
if (!ctxt.settings().isTrueForKey(SettingsServiceBean.Key.AllowEnablingFilePIDsPerCollection, false)) {
throw new PermissionException("Changing File PID policy per collection is not enabled on this server",
this, Collections.singleton(Permission.EditDataset), dataverse);
}
dataverse.setFilePIDsEnabled((Boolean) attributeValue);
case ATTRIBUTE_REQUIRE_FILES_TO_PUBLISH_DATASET:
dataverse.setRequireFilesToPublishDataset((Boolean) attributeValue);
break;
default:
throw new IllegalCommandException("Unsupported boolean attribute: " + attributeName, this);
}

dataverse.setFilePIDsEnabled((Boolean) attributeValue);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,9 @@ public Dataverse parseDataverse(JsonObject jobj) throws JsonParseException {
if (jobj.containsKey("filePIDsEnabled")) {
dv.setFilePIDsEnabled(jobj.getBoolean("filePIDsEnabled"));
}
if (jobj.containsKey("requireFilesToPublishDataset")) {
dv.setRequireFilesToPublishDataset(jobj.getBoolean("requireFilesToPublishDataset"));
}

/* We decided that subject is not user set, but gotten from the subject of the dataverse's
datasets - leavig this code in for now, in case we need to go back to it at some point
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,6 @@
import edu.harvard.iq.dataverse.authorization.users.User;
import edu.harvard.iq.dataverse.branding.BrandingUtil;
import edu.harvard.iq.dataverse.dataaccess.DataAccess;
import edu.harvard.iq.dataverse.dataset.DatasetType;
import edu.harvard.iq.dataverse.dataset.DatasetUtil;
import edu.harvard.iq.dataverse.datavariable.CategoryMetadata;
import edu.harvard.iq.dataverse.datavariable.DataVariable;
Expand Down Expand Up @@ -294,6 +293,7 @@ public static JsonObjectBuilder json(Dataverse dv, Boolean hideEmail, Boolean re
if (dv.getFilePIDsEnabled() != null) {
bld.add("filePIDsEnabled", dv.getFilePIDsEnabled());
}
bld.add("effectiveRequiresFilesToPublishDataset", dv.getEffectiveRequiresFilesToPublishDataset());
bld.add("isReleased", dv.isReleased());

List<DataverseFieldTypeInputLevel> inputLevels = dv.getDataverseFieldTypeInputLevels();
Expand Down
1 change: 1 addition & 0 deletions src/main/java/propertyFiles/Bundle.properties
Original file line number Diff line number Diff line change
Expand Up @@ -1542,6 +1542,7 @@ dataset.mayNotPublish.administrator= This dataset cannot be published until {0}
dataset.mayNotPublish.both= This dataset cannot be published until {0} is published. Would you like to publish both right now?
dataset.mayNotPublish.twoGenerations= This dataset cannot be published until {0} and {1} are published.
dataset.mayNotBePublished.both.button=Yes, Publish Both
dataset.mayNotPublish.FilesRequired=This dataset cannot be published without uploaded files.
dataset.viewVersion.unpublished=View Unpublished Version
dataset.viewVersion.published=View Published Version
dataset.link.title=Link Dataset
Expand Down
2 changes: 2 additions & 0 deletions src/main/resources/db/migration/V6.4.0.3.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
-- files are required to publish datasets
ALTER TABLE dataverse ADD COLUMN IF NOT EXISTS requirefilestopublishdataset bool;
60 changes: 60 additions & 0 deletions src/test/java/edu/harvard/iq/dataverse/api/DatasetsIT.java
Original file line number Diff line number Diff line change
Expand Up @@ -5168,4 +5168,64 @@ public void testGetCanDownloadAtLeastOneFile() {
Response getUserPermissionsOnDatasetInvalidIdResponse = UtilIT.getCanDownloadAtLeastOneFile("testInvalidId", DS_VERSION_LATEST, secondUserApiToken);
getUserPermissionsOnDatasetInvalidIdResponse.then().assertThat().statusCode(BAD_REQUEST.getStatusCode());
}

@Test
public void testRequireFilesToPublishDatasets() {
// Create superuser and regular user
Response createUserResponse = UtilIT.createRandomUser();
createUserResponse.then().assertThat().statusCode(OK.getStatusCode());
String usernameAdmin = UtilIT.getUsernameFromResponse(createUserResponse);
String apiTokenAdmin = UtilIT.getApiTokenFromResponse(createUserResponse);
Response makeSuperUser = UtilIT.makeSuperUser(usernameAdmin);
assertEquals(200, makeSuperUser.getStatusCode());

createUserResponse = UtilIT.createRandomUser();
createUserResponse.then().assertThat().statusCode(OK.getStatusCode());
String apiToken = UtilIT.getApiTokenFromResponse(createUserResponse);

// Create and publish a top level Dataverse (under root) with a requireFilesToPublishDataset set to true
stevenwinship marked this conversation as resolved.
Show resolved Hide resolved
Response createDataverseResponse = UtilIT.createRandomDataverse(apiToken);
String ownerAlias = UtilIT.getAliasFromResponse(createDataverseResponse);
// Only admin can set this attribute
Response setDataverseAttributeResponse = UtilIT.setCollectionAttribute(ownerAlias, "requireFilesToPublishDataset", "true", apiToken);
setDataverseAttributeResponse.prettyPrint();
setDataverseAttributeResponse.then().assertThat().statusCode(UNAUTHORIZED.getStatusCode());
setDataverseAttributeResponse = UtilIT.setCollectionAttribute(ownerAlias, "requireFilesToPublishDataset", "true", apiTokenAdmin);
setDataverseAttributeResponse.prettyPrint();
setDataverseAttributeResponse.then().assertThat().statusCode(OK.getStatusCode());
setDataverseAttributeResponse.then().assertThat().body("data.effectiveRequiresFilesToPublishDataset",equalTo(true));
Response publishDataverseResponse = UtilIT.publishDataverseViaNativeApi(ownerAlias, apiTokenAdmin);
publishDataverseResponse.prettyPrint();
publishDataverseResponse.then().assertThat().statusCode(OK.getStatusCode());

// Create and publish a new Dataverse under the above Dataverse with requireFilesToPublishDataset not set (default null)
String alias = "dv2-" + UtilIT.getRandomIdentifier();
createDataverseResponse = UtilIT.createSubDataverse(alias, null, apiToken, ownerAlias);
createDataverseResponse.prettyPrint();
createDataverseResponse.then().assertThat().statusCode(CREATED.getStatusCode());
publishDataverseResponse = UtilIT.publishDataverseViaNativeApi(alias, apiToken);
publishDataverseResponse.then().assertThat().statusCode(OK.getStatusCode());

// Create a Dataset under the 2nd level Dataverse
Response createDatasetResponse = UtilIT.createRandomDatasetViaNativeApi(alias, apiToken);
createDatasetResponse.then().assertThat().statusCode(CREATED.getStatusCode());
Integer id = UtilIT.getDatasetIdFromResponse(createDatasetResponse);

// Try to publish with no files (minimum is 1 file from the top level Dataverse)
Response publishDatasetResponse = UtilIT.publishDatasetViaNativeApi(String.valueOf(id), "major", apiToken);
publishDatasetResponse.prettyPrint();
publishDatasetResponse.then().assertThat().statusCode(FORBIDDEN.getStatusCode());
publishDatasetResponse.then().assertThat().body("message", containsString(
BundleUtil.getStringFromBundle("dataset.mayNotPublish.FilesRequired")
));

// Upload 1 file and try to publish again
String pathToFile = "src/main/webapp/resources/images/dataverseproject.png";
Response uploadResponse = UtilIT.uploadFileViaNative(String.valueOf(id), pathToFile, apiToken);
uploadResponse.then().assertThat().statusCode(OK.getStatusCode());

publishDatasetResponse = UtilIT.publishDatasetViaNativeApi(String.valueOf(id), "major", apiToken);
publishDatasetResponse.prettyPrint();
publishDatasetResponse.then().assertThat().statusCode(OK.getStatusCode());
}
}
Loading