[BrowserMobHttpClient.java]Capture Content in beta 8 #85

Open
d-jubeau opened this issue Mar 5, 2013 · 3 comments

Comments

@d-jubeau

d-jubeau commented Mar 5, 2013

I am running into trouble because some of the HAR files I produce are larger than 10 MB.
After browsing the code, I would like to draw your attention to this piece of code:

BrowserMobHttpClient.java, around line 736 (in beta 8, not released):

if (contentType != null && contentType.startsWith("text/")) {
        entry.getResponse().getContent().setText(new String(copy.toByteArray()));
} else { 
        entry.getResponse().getContent().setText(Base64.byteArrayToBase64(copy.toByteArray()));
}

I think there are two issues here:

  • JavaScript content served with an application/javascript Content-Type header is written to the HAR file Base64-encoded. I don't think it should be, since text/javascript is not. I think there are similar cases with other content types.
  • All content is copied into the HAR file when the captureContent property is true. I ended up with a 30 MB HAR file. I think an additional configuration method would be useful (setCapturedContents(List mimeTypes)?), exposed on ProxyServer alongside the existing captureContent(boolean).
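To make the two points above concrete, here is a minimal sketch of how the content-type decision could look. The class ContentTypePolicy, its method names, and the list of "text-like" types are all hypothetical; none of them exist in BrowserMob Proxy, and the type list is an assumption rather than an exhaustive set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of BrowserMob Proxy. */
final class ContentTypePolicy {
    // Assumed list of textual types that do not start with "text/".
    private static final Set<String> TEXT_LIKE = new HashSet<>(Arrays.asList(
            "application/javascript",
            "application/x-javascript",
            "application/json",
            "application/xml"));

    private ContentTypePolicy() {}

    /** Drops parameters such as "; charset=UTF-8" and lower-cases the media type. */
    static String mediaType(String contentType) {
        if (contentType == null) {
            return null;
        }
        int semi = contentType.indexOf(';');
        String base = semi >= 0 ? contentType.substring(0, semi) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    /** True if the body should be stored as plain text rather than Base64. */
    static boolean isTextLike(String contentType) {
        String type = mediaType(contentType);
        if (type == null) {
            return false;
        }
        return type.startsWith("text/")
                || TEXT_LIKE.contains(type)
                || type.endsWith("+xml")
                || type.endsWith("+json");
    }

    /** An empty or null whitelist means "capture everything", as captureContent(true) does today. */
    static boolean shouldCapture(String contentType, Set<String> allowedTypes) {
        if (allowedTypes == null || allowedTypes.isEmpty()) {
            return true;
        }
        String type = mediaType(contentType);
        return type != null && allowedTypes.contains(type);
    }
}
```

With something like this, the quoted condition could call isTextLike(contentType) instead of contentType.startsWith("text/"), and the capture step could be skipped when shouldCapture(...) returns false.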

I need this for a student project, so I will probably make the changes myself, but I would be glad to have your opinion and to know whether publishing the changes could help someone.

@lightbody
Member

I agree that we should not Base64 encode application/javascript. If you can submit a pull request that supports additional "plain text" content types, I would be glad to accept it.

You're right that HAR files can get VERY large when you start capturing the content of every request. I'm open to ideas on how to limit that, such as configuration for limiting the size of each body, or for limiting content capture to certain URLs or file types.
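As a rough illustration of the "limit the size of each body" idea, the captured bytes could be cut off at a configurable maximum before they are written to the HAR entry. BodySizeLimiter and its maxBytes parameter are hypothetical names used only for this sketch; nothing like it exists in the proxy as far as this thread shows.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of BrowserMob Proxy. */
final class BodySizeLimiter {
    private BodySizeLimiter() {}

    /**
     * Returns the body unchanged if it fits within maxBytes, otherwise a copy
     * truncated to maxBytes. A negative maxBytes means "no limit".
     */
    static byte[] limit(byte[] body, int maxBytes) {
        if (maxBytes < 0 || body.length <= maxBytes) {
            return body;
        }
        return Arrays.copyOf(body, maxBytes);
    }
}
```

One caveat with this approach: truncating text at a byte boundary can split a multi-byte character, so the last character of a truncated text body may be garbled.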

@d-jubeau
Author

I made some changes:
d-jubeau@93f430a
We are currently testing them on several hundred websites, and they seem to work well so far.

@roydekleijn
Contributor

Hi Patrick,

Capturing body content for particular URLs would be really nice.
Maybe we could add a regex parameter to the relevant method.
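A regex-based filter along these lines might look like the sketch below. UrlCaptureFilter is a hypothetical class for illustration; how it would be wired into the proxy's capture method is left open.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of BrowserMob Proxy. */
final class UrlCaptureFilter {
    private final Pattern pattern;

    UrlCaptureFilter(String regex) {
        // Compile once so the pattern is not rebuilt for every request.
        this.pattern = Pattern.compile(regex);
    }

    /** True if the request URL contains a match for the configured regex. */
    boolean shouldCapture(String url) {
        return url != null && pattern.matcher(url).find();
    }
}
```

For example, new UrlCaptureFilter("\\.js(\\?.*)?$") would capture bodies only for URLs ending in .js, with or without a query string.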
