Simple Example - Single Request
Because I prefer a number of examples with increasing complexity when I am tackling a new library, that's what you'll get here. This first example implements only a single HTTP request but establishes that we have figured out enough to determine that it is working for a single get.
It produces output that looks like this:
$ ruby bin/em-http/em-http-000.rb
D, [2016-07-24T17:12:39.386698 #42654] DEBUG -- : [] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [SUBMITTED] [runtime=0.197928]
D, [2016-07-24T17:12:42.238997 #42654] DEBUG -- : [] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [CALLBACK 200] [runtime=3.050257]
{"SERVER"=>"nginx", "DATE"=>"Sun, 24 Jul 2016 21:12:39 GMT", "CONTENT_TYPE"=>"application/zip", "CONTENT_LENGTH"=>"5242880", "LAST_MODIFIED"=>"Mon, 02 Jun 2008 15:30:42 GMT", "CONNECTION"=>"close", "ETAG"=>"\"48441222-500000\"", "ACCESS_CONTROL_ALLOW_ORIGIN"=>"*", "ACCEPT_RANGES"=>"bytes"}
So far, so good. We download from a URL and it is successful with a status code of 200.
Next Example: Crude Async Concurrency
The next example ensures that we have a solid enough understanding to deal with two concurrent requests. One thing we have to handle is ensuring that EM.stop is only called after all of the jobs have completed. So in this example, we add a request queue collection and a method call to EM.stop when all jobs are done.
It's getting a bit more complex but still quite grokkable and we can see that both requests get submitted at the same time, but the larger file takes longer to download.
$ ruby bin/em-http/em-http-010-async.rb
D, [2016-07-24T16:47:21.628277 #42542] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [SUBMITTED] [runtime=0.18803]
D, [2016-07-24T16:47:21.631735 #42542] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/10MB.zip] [SUBMITTED] [runtime=0.191508]
D, [2016-07-24T16:47:26.295605 #42542] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [CALLBACK/ERRBACK 200] [runtime=4.855387]
D, [2016-07-24T16:47:26.295678 #42542] DEBUG -- : [stop_when_all_finished] [states=[:finished, :body]]
D, [2016-07-24T16:47:30.669914 #42542] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/10MB.zip] [CALLBACK/ERRBACK 200] [runtime=9.229693]
D, [2016-07-24T16:47:30.669999 #42542] DEBUG -- : [stop_when_all_finished] [states=[:finished, :finished]]
Final Example: Async Concurrency with Concurrency Limits
If we have about 20 requests, we probably don't want them all going to a server and blowing it up. EventMachine provides an elegant mechanism to apply a concurrency constraint so that we can limit the number of active requests. In this final example, I issue six requests and I apply a concurrency limit of 2.
The code itself doesn't look very different. But you can watch the log and see that two jobs are submitted initially and then subsuquent jobs are added only as a job completes.
$ ruby bin/em-http/em-http-020-async-with-em-iterator.rb
D, [2016-07-24T16:55:37.620212 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [SUBMITTED] [runtime=0.174998]
D, [2016-07-24T16:55:37.624177 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/10MB.zip] [SUBMITTED] [runtime=0.178991]
D, [2016-07-24T16:55:42.003892 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [CALLBACK/ERRBACK 200] [runtime=4.5587]
D, [2016-07-24T16:55:42.003984 #42569] DEBUG -- : [stop_when_all_finished] [states=[:finished, :body]]
D, [2016-07-24T16:55:42.006724 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [SUBMITTED] [runtime=4.561536]
D, [2016-07-24T16:55:45.318885 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/10MB.zip] [CALLBACK/ERRBACK 200] [runtime=7.873685]
D, [2016-07-24T16:55:45.319011 #42569] DEBUG -- : [stop_when_all_finished] [states=[:finished, :finished, :body]]
D, [2016-07-24T16:55:45.321175 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/10MB.zip] [SUBMITTED] [runtime=7.875988]
D, [2016-07-24T16:55:46.646499 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [CALLBACK/ERRBACK 200] [runtime=9.201303]
D, [2016-07-24T16:55:46.646617 #42569] DEBUG -- : [stop_when_all_finished] [states=[:finished, :finished, :finished, :body]]
D, [2016-07-24T16:55:46.651357 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [SUBMITTED] [runtime=9.206166]
D, [2016-07-24T16:55:50.993760 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/5MB.zip] [CALLBACK/ERRBACK 200] [runtime=13.548572]
D, [2016-07-24T16:55:50.993836 #42569] DEBUG -- : [stop_when_all_finished] [states=[:finished, :finished, :finished, :body, :finished]]
D, [2016-07-24T16:55:50.995672 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/10MB.zip] [SUBMITTED] [runtime=13.550481]
D, [2016-07-24T16:55:52.958938 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/10MB.zip] [CALLBACK/ERRBACK 200] [runtime=15.51375]
D, [2016-07-24T16:55:52.959023 #42569] DEBUG -- : [stop_when_all_finished] [states=[:finished, :finished, :finished, :finished, :finished, :body]]
D, [2016-07-24T16:55:59.708914 #42569] DEBUG -- : [new_http_request] [url=http://ipv4.download.thinkbroadband.com/10MB.zip] [CALLBACK/ERRBACK 200] [runtime=22.263717]
D, [2016-07-24T16:55:59.709023 #42569] DEBUG -- : [stop_when_all_finished] [states=[:finished, :finished, :finished, :finished, :finished, :finished]]
Conclusion
Thus ends my tutorial on using em-http-request and em-iterator to handle a number of long-running downloads with concurrency. A lot of the lines I have above are devoted to logging and comments. The code is actually quite concise, I think.
If this was helpful to you, share it along.