Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

before_record_response returns inconsistent data depending on recording or not #544

Open
Diaoul opened this issue Jul 11, 2020 · 12 comments · May be fixed by #594
Open

before_record_response returns inconsistent data depending on recording or not #544

Diaoul opened this issue Jul 11, 2020 · 12 comments · May be fixed by #594

Comments

@Diaoul
Copy link
Contributor

Diaoul commented Jul 11, 2020

It's all in the title and seems to be related to #356
My tests depend on some sensitive data to be at the value I replaced it with.
So when I run the tests I have sometimes the real data (which I don't want) and sometimes the fake data actually in the cassette.

This is an inconsistent behaviour and, depending on what the data is used for afterwards, could lead to sensitive information being leaked when recording cassettes. For example, in the test logs, integration servers, etc.

@Diaoul
Copy link
Contributor Author

Diaoul commented Jul 11, 2020

I think the use case of having subsequent requests depending on the real data like a login behaviour as described in #355 is valid although I think this should not be solved this way. Maybe by making it an optional behaviour?

@akaihola
Copy link

akaihola commented Jul 5, 2021

Maybe an additional before_handle_response option could provide a callback for modifying the response? The modification would apply both when returning data for a real response and when saving the response into a cassette.

I drafted a patch for this in #594 – it invokes the callback(s) here in vcr.stubs.VCRConnection.getresponse():

            # get the response
            response = self.real_connection.getresponse()

            # put the response into the cassette
            response = {
                "status": {"code": response.status, "message": response.reason},
                "headers": serialize_headers(response),
                "body": {"string": response.read()},
            }
            response = copy.deepcopy(response)                          ## callback 
            response = self.cassette._before_handle_response(response)  ## invoked here

            self.cassette.append(self._vcr_request, response)
        return VCRHTTPResponse(response)

My use case for this feature is a test suite for an API client which gets very large JSON responses from an actual API server. To reduce the amount of data in the test suite, I'd like to trim down the data but still make it simple to recreate the cassettes without having to manually tweak them.

I did try to use requests-mock and VCRpy together, but it seems requests-mock's patches disable VCRpy and no cassettes get written.

akaihola pushed a commit to akaihola/vcrpy that referenced this issue Jul 5, 2021
@akaihola akaihola linked a pull request Jul 5, 2021 that will close this issue
4 tasks
akaihola pushed a commit to akaihola/vcrpy that referenced this issue Jul 5, 2021
@akaihola
Copy link

akaihola commented Jul 5, 2021

Maybe by making it an optional behaviour?

@Diaoul, I added an alternative solution in #595 which does exactly that: you still define callbacks for modifying responses using the before_record_response option, but you can also specify alter_live_response = True to see those modifications in live responses when cassettes are being recorded.

@akaihola
Copy link

akaihola commented Jul 5, 2021

@kevin1024 do you think this enhancement is valid? The idea is to make sure the responses seen by the user's code are identical between recording and playback modes.

Your thoughts on the two different approaches in #594 and #595?

I think the pros and cons of #594 are:

  • PRO: user just needs to change from before_record_response to before_handle_response with no need for additional options
  • CON: code duplication
  • NOTE: I'm not sure whether it was appropriate to handle decode_compressed_response here, too

As for #595:

  • PRO: no code duplication
  • CON: two options to set (before_record_response and alter_live_response)
  • CON: adds confusion – I think this should be the default behavior actually
  • NOTE: private attribute access from stubs to Cassette._alter_live_response

If you accept the feature and one of these approaches, I'll proceed to writing unit tests, documentation and a change log entry.

@kevin1024
Copy link
Owner

Hmm, how does Ruby VCR handle this case?

And is there a way to make this change backwards-compatible?

I have to admit I don’t have much time to maintain VCR these days so I appreciate your time!

@akaihola
Copy link

akaihola commented Jul 5, 2021

Ruby VCR seems to have an after-http-request hook. Based on the documentation it's unclear whether modifying the response is allowed.

In the Stack Overflow question VCR ignored parameter: Get real http request, user 23tux says:

I tried it out with c.after_http_request(:stubbed?) where I get the response body AND the real value of the callback parameter, but this hook is too late, because my app already received the response body.

So maybe after-http-request hook of Ruby VCR doesn't actually do what I'm looking for either.

Both #594 and #595 should be backwards-compatible.

Are there other active contributors to VCRpy who I could ping and ask opinions from?

Also, I'm wondering what actually is the use case for modifying the recorded response but still passing through the original one when recording?

akaihola pushed a commit to akaihola/vcrpy that referenced this issue Jul 5, 2021
akaihola pushed a commit to akaihola/vcrpy that referenced this issue Jul 5, 2021
@kevin1024
Copy link
Owner

Also, I'm wondering what actually is the use case for modifying the recorded response but still passing through the original one when recording?

Ah! That one I can answer. It's usually to keep credentials/secrets from ending up in the repo.

@kevin1024
Copy link
Owner

kevin1024 commented Jul 6, 2021

RE #594 vs #595: You understand this problem more deeply than I do, so I trust your judgement and I'll happily merge your PR as long as it has tests and doesn't change anything (test suites or recorded cassettes) in a backwards-incompatibile way :)

Edit: Oh - and documentation as well!

@akaihola
Copy link

akaihola commented Jul 7, 2021

To summarize the difference between #594 and #595 from the user's perspective if she wishes her code to receive modified responses also when they are being recorded:

#594

def vcr_config():
    return {
        "before_handle_response": [my_callback],
    }

#595

def vcr_config():
    return {
        "before_record_response": [my_callback],
        "alter_live_response": True,
    }

@samoylovfp
Copy link

Hello, another happy user of vcrpy here.

From API perspective I like before_handle_response better,
because alter_live_response changes meaning of already passed hook, which indeed might be confusing.

I am sure the duplication can be factored out or even left as-is.

@akaihola
Copy link

akaihola commented Jul 7, 2021

I like before_handle_response better, because alter_live_response changes meaning of already passed hook

@samoylovfp, that makes sense. I closed #595 and will proceed with #594.

I wonder about the naming of the new hook before_handle_response. It allows modifying the response before it is recorded or returned to user code. Does the word handle properly communicate this? Or should we follow Ruby VCR's example and name it e.g. after_receive_response?

I also wonder whether there are any use cases where it's essential that the original unmodified response is seen by user code when recording. At least from a unit testing perspective, I'd say it's more important to actually get the identical response in both recording and playback situations.

If there are no use cases for before_record_response, maybe it could even be deprecated (with a long deprecation period) or its use discouraged in documentation?

@akaihola
Copy link

Reminder to self: I need to continue working on this pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants