Hello, I'm trying to intercept the response body of an XHR request. After looking through the spec files, I saw that Ferrum keeps track of all of the network traffic, which is convenient for this case. (I tried using on('Network.responseReceived') hooks, but couldn't work it out.)
While waiting for the correct Exchange to report finished?, I kept running into Ferrum::BrowserError: No data found for resource with given identifier. Example code:
response = Timeout.timeout(15) do
  # Find ajax request for search results
  until (xhr = browser.network.traffic.find { |exchange| exchange.request.url =~ /search\/bySize/ })
    print '.'
    sleep 0.2
  end
  # Wait for response to complete
  until xhr.finished?
    print 'x'
    sleep 0.2
  end
  xhr.response.body
end
After some research, I've learned that instead you should wait for Network.loadingFinished before trying to get the response body.
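As a rough sketch of what waiting on that event could look like: record each requestId when Network.loadingFinished fires, and only ask for the body once the id has been recorded. LoadingFinishedTracker and FakePage are hypothetical names made up for illustration. The fake stands in for a Ferrum page so the snippet runs without a browser, and the only thing assumed about the real page is the on(event) { |params| ... } hook shape that Ferrum's network.rb uses.

```ruby
require "set"

# Hypothetical helper (not part of Ferrum): remembers which request ids
# have seen Network.loadingFinished.
class LoadingFinishedTracker
  def initialize(page)
    @finished = Set.new
    @lock = Mutex.new
    # Handlers may run on another thread, so guard the set with a mutex.
    page.on("Network.loadingFinished") do |params|
      @lock.synchronize { @finished << params["requestId"] }
    end
  end

  def finished?(request_id)
    @lock.synchronize { @finished.include?(request_id) }
  end
end

# Stand-in for a Ferrum page (assumption: only #on is needed here).
class FakePage
  def initialize
    @handlers = Hash.new { |hash, key| hash[key] = [] }
  end

  def on(event, &block)
    @handlers[event] << block
  end

  # Simulates CDP delivering an event to every registered handler.
  def emit(event, params)
    @handlers[event].each { |handler| handler.call(params) }
  end
end

page = FakePage.new
tracker = LoadingFinishedTracker.new(page)

# Headers arrive first; the body is not yet safe to fetch.
page.emit("Network.responseReceived", { "requestId" => "42.1" })
before = tracker.finished?("42.1")

# The body is only complete once loadingFinished has fired.
page.emit("Network.loadingFinished", { "requestId" => "42.1", "encodedDataLength" => 512 })
after = tracker.finished?("42.1")

puts "finished before loadingFinished: #{before}, after: #{after}"
```

With a real browser, the tracker would be created before triggering the XHR so the event cannot be missed.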
I wanted to highlight a few details in Ferrum's API, as I wonder whether Exchange#finished? should wait for that event, or whether the Response object could/should be more clever (or both).
# network.rb #subscribe
# ...
@page.on("Network.responseReceived") do |params|
  if exchange = select(params["requestId"]).last
    response = Network::Response.new(@page, params)
    exchange.response = response
  end
end

@page.on("Network.loadingFinished") do |params|
  exchange = select(params["requestId"]).last
  if exchange && exchange.response
    exchange.response.body_size = params["encodedDataLength"]
  end
end

# exchange.rb
# ...
def finished?
  blocked? || response || error
end
Note how the exchange is finished? when it has a response object. But also notice that this response attribute is assigned in the Network.responseReceived hook, not the Network.loadingFinished hook. There is, however, a property that gets set at Network.loadingFinished, body_size, so I tried keying off that:
response = Timeout.timeout(15) do
  # Find ajax request for search results
  until (xhr = browser.network.traffic.find { |exchange| exchange.request.url =~ /search\/bySize/ })
    print '.'
    sleep 0.2
  end
  # Wait for response to complete
  until xhr&.response&.body_size
    print 'x'
    sleep 0.2
  end
  xhr.response.body
end
With this change, this code no longer raises the noted exception, though the API is being used in a strange way.
- I feel that Exchange#finished? should account for the response being fully loaded and ready to query. Maybe this isn't sensible due to existing usages of #finished? and things like streamed content. So, worst case, maybe something new?
- I wonder about an attribute Response#finished? or loaded? or ready?
- I also wonder if the API for querying response.body should wait for an attribute to signal that the response has finished loading, leveraging the standard Ferrum timeouts, similar to how other calls on the browser block until CDP has reported it's ready.
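To make the last idea concrete, here is a minimal sketch of a blocking body lookup that could be prototyped outside the library. wait_for_body, FakeResponse and FakeExchange are hypothetical names, not Ferrum API. The helper only assumes an object with response, body_size and body, and that body_size stays nil until Network.loadingFinished, which is what the workaround above relies on. The timeout and interval defaults are arbitrary placeholders rather than Ferrum's own settings.

```ruby
require "timeout"

# Hypothetical helper: block until the exchange's response reports a
# body_size (set on Network.loadingFinished), then return the body.
def wait_for_body(exchange, timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until exchange.response&.body_size
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise Timeout::Error, "response body not loaded within #{timeout}s"
    end
    sleep interval
  end
  exchange.response.body
end

# Fakes standing in for Ferrum's exchange and response objects.
FakeResponse = Struct.new(:body_size, :body)
FakeExchange = Struct.new(:response)

exchange = FakeExchange.new(nil)

# Simulate the two CDP events arriving some time apart.
loader = Thread.new do
  sleep 0.05
  exchange.response = FakeResponse.new(nil, "{}") # responseReceived
  sleep 0.05
  exchange.response.body_size = 2                 # loadingFinished
end

body = wait_for_body(exchange, timeout: 2, interval: 0.01)
loader.join
puts body
```

Polling against a monotonic deadline keeps the wait bounded without wrapping the whole block in Timeout.timeout, and the same loop could back a loaded? or ready? predicate on the response.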