Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add public share results feature #20

Conversation

smathews-techempower
Copy link
Contributor

@smathews-techempower smathews-techempower commented Jan 22, 2020

This adds a new file store directory that is meant for storing results file uploads from anonymous users. It is limited to a configurable size and prevents uploads larger than a different configurable size. All uploads are given a new random filename, called a share ID.

There is an API endpoint that is meant to accept uploads from the benchmarks toolset upon run completion. Additionally, this adds a paste bin style page that can be used by anonymous users to upload or paste results using a UI. Results files are compressed to zip files for storage, making for significant storage saving.

After successful upload, the results file is accessible at /share-results/{shareId}.json.

Addressing #3 from issue #19

- add the new test_metadata field to results, optional because it was only
recently included in results by the toolset
- add a new directory for public results uploads
- add handler to support upload and viewing of results json files
- give each upload a new random filename, preventing overwrites
- add new handler allowing paste or upload of results json files
- move the URLs to config
- limit share upload file size and size of upload directory
- centralize upload logic to a new service
- add jdwp allowing remote debugging of java process locally using intellij
# Conflicts:
#	example-config.yml
#	src/main/java/tfb/status/bootstrap/ServicesBinder.java
#	src/main/java/tfb/status/config/ApplicationConfig.java
#	src/main/java/tfb/status/config/FileStoreConfig.java
#	src/main/java/tfb/status/service/ApplicationConfigFactory.java
#	src/test/java/tfb/status/handler/UploadResultsHandlerTest.java
#	src/test/java/tfb/status/service/ApplicationConfigFactoryTest.java
Copy link
Contributor

@michaelhixson michaelhixson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The biggest thing missing from this PR is tests. Tackle the other comments first, then do add tests for every new handler, service, and utility class. There is a fairly good test infrastructure in place already and plenty of examples, but let me know if it's unclear how to test a particular feature.

I suspect we'll want to replace the existing src/test/resources/test_managed_files results files with ones that contain testMetadata to help with this. There is a ResultsTester class that can be used for dynamically generating new fake results (for a test to upload, for example) based on these files, and I'm guessing these tests will benefit from using ResultsTester.

@@ -30,6 +30,8 @@

#fileStore:
# root: managed_files
# max_share_directory_size_bytes: 1000000000
# max_share_file_size_bytes: 5000000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use lowerCamelCase for consistency with the other settings.

@@ -6,8 +6,10 @@ services:
args:
SKIP_TESTS: "true"
ports:
- "127.0.0.1:80:80"
- "127.0.0.1:8080:80"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it necessary to change this? Personally I prefer to have HTTP servers listen on port 80.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll put it back to port 80, for me locally it fails to bind to port 80 due to some acces permissions issue. I changed it to 8080 to get it working quickly rather than look into it, and accidentally committed it.

}

private static final String DEFAULT_TFB_STATUS = "http://localhost:8080";
private static final String DEFAULT_TE_WEB = "http://localhost:3000";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these are good default values. It would be bad for either one to make it into production, and neither one would be correct for me in local development. I'd say change them to the production URLs. Or, if the functionality that uses this can be disabled when these settings are absent (like EmailSender when EmailConfig is absent), then make these nullable. I haven't gotten far enough in the review to know if the nullable/disabled route is workable.

@JsonProperty(value = "tfbStatus", required = true)
@Nullable String tfbStatus,

@JsonProperty(value = "teWeb", required = true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be required = false.

* @param in The byte channel containing the raw results JSON.
* @see #upload(Path)
*/
public ShareResultsJsonView upload(ReadableByteChannel in)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given other comments I made, I'm expecting this method to change to accept either an InputStream or a ByteSource.


/**
* Upload the given byte channel containing the raw results JSON. Creates
* a temporary file from the bytes and passes it to {@link #upload(Path)}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This project uses an 80 character margin. Here, there is room for "a" to fit on the previous line.

I'm not particularly concerned with this "a", but I notice a lot of cases like this. Are you eyeballing the margin? If so, I suggest using IntelliJ's "visual guides" feature. Editor - Code Style - General - Visual guides. I have mine set to "80, 100, 120" since those are common margins used in projects. If you find the guides unpleasant and distracting, I still suggest enabling them while working on this project at least to help keep the wrapping correct.

}

@Override
public void handleRequest(HttpServerExchange exchange) throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please narrow the exception type of this method if possible. Can it be declared to only throw IOException?

Path permanentFile = fileStore.shareDirectory().resolve(zipFileName);
MoreFiles.createParentDirectories(permanentFile);

try (FileOutputStream fos = new FileOutputStream(permanentFile.toFile());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid calling Path.toFile(), which uses java.io.File. I suggest avoiding the ZipOutputStream class completely and use the java.nio.file APIs instead.

try (FileSystem zipFs = Files.newFileSystem(permanentFile)) {
  Path entry = zipFs.getPath(fileName);
  try (InputStream in = Files.newInputStream(tempFile);
       OutputStream out = Files.newOutputStream(entry, CREATE_NEW)) {
    in.transferTo(out);
  }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a slight correction to that example code for creating the zip file.

try (FileSystem zipFs =
         FileSystems.newFileSystem(
             permanentFile,
             Map.of("create", "true")) {
  ...
}

jdk.zipfs module documentation

long shareDirectorySize = FileStore.directorySizeBytes(
fileStore.shareDirectory().toFile());
if (shareDirectorySize >= fileStoreConfig.maxShareDirectorySizeBytes) {
throw new ShareResultsUploadException("Share uploads has reached max capacity.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you thought about what we (the maintainers of tfb-status) should do in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to audit the share directory and ensure it didn't get to capacity due to someone trying to break or exploit this feature. Depending, we can delete data from such a user, or expand the capacity if we feel it's being used to good end. If the oldest files are quite old, we can go the route of backing up then deleting old files from the share directory. Does that make sense? Also, should this send some sort of alert email if the capacity is hit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alert email would be nice. Could you add that functionality along with a test? We'd need spam prevention measures, perhaps something like the one-email-per-day check implemented by RunCompleteMailer.

I suppose a manual solution is best with respect to not losing data. In practice, I'm not sure how I'd tell whether the uploaded results were legitimate or not. I might address the problem by simply deleting the share directory.

Please make the email message include at least one suggestion for what we should do to fix the problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add the email alert and related testing.

We could look at how closely together in time the results were uploaded, and if all of the results uploaded are identical. However these would be obvious signs that a user could get around. Simply deleting the share directory would fine in either case.

@@ -117,6 +117,13 @@
*/
public final @Nullable GitInfo git;

/**
* The test metadata, generated during this run (same as test_metadata.json).
* Test metadata was not always included in the results, so this will be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This currently has the wrong name in TFB and you're planning to rename it, right? So we don't have an exact date for when it was added yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR fixed the naming, it was merged Jan 24. TechEmpower/FrameworkBenchmarks#5424

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, can you add a message like this to the documentation for this field?

* "checked". This allows mustache template to simply render the radio
* button as checked without having to do any if checks.
*/
public final ImmutableMap<String, String> shareTypeValues;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used? I'm not seeing it.


Results results = resultsTester.newResults();

Path jsonFile = fileSystem.getPath("results_to_upload.json");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an eventual goal to allow all the tests to run concurrently. I think this code here may be modeled after an existing test that did something similar using the same filename, but as a result, the two tests would conflict with each other if ran concurrently, wouldn't they? Could you prepend the test class name and/or test method name to this filename, or figure out some other scheme that's likely to keep this non-conflicting with other tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made them use random UUIDs, deos that work? If that's messy or not right I can update to use class name + test method name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine.

@michaelhixson
Copy link
Contributor

The PR is looking good now aside from the feature request for a new email and the other couple of minor comments I just left. There are some minor style inconsistencies related to existing code, but it's easier for me to edit those things later myself.


messages = getShareDirectoryFullEmails(mailServer);

assertEquals(initialMessagesSize + 1, messages.size());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works now, but may cause an issue if you make the tests run in parallel. One of the other tests tries to share a results file after filling the share directory to capacity, which also sends this email. I'm curious what the right solution would be. One idea is that since the email contains the capacity and size in the body, I could add code to find a message with the specific numbers passed to onShareDirectoryFull above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not solve this now, but please annotate the test so that it can't run in parallel and add a TODO near the part of the test that's unfriendly to parallelism.

@michaelhixson michaelhixson merged commit 29d22a2 into TechEmpower:master Jan 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants