Fetcher

class FetcherInterface

Defines an interface for abstract network download.

By providing a concrete implementation of the abstract interface, users of the framework can plug in their preferred or customized network stack.

Implementations of FetcherInterface only need to implement _fetch(). The public API of the class is already implemented.

abstract _fetch(url)

Fetch the contents of HTTP/HTTPS url from a remote server.

Implementations must raise DownloadHTTPError if they receive an HTTP error code.

Implementations may raise other errors as well; any error that is not a DownloadError will be wrapped in a DownloadError by fetch().

Parameters:

url (str) – URL string that represents a file location.

Raises:

exceptions.DownloadHTTPError – HTTP error code was received.

Returns:

Bytes iterator

Return type:

Iterator[bytes]
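A custom implementation might look like the following minimal sketch. To keep it runnable without the framework installed, DownloadError and DownloadHTTPError are local stand-ins for the classes in exceptions, and InMemoryFetcher with its files mapping is hypothetical. Real code would derive from FetcherInterface and import the real exception classes.

```python
from typing import Dict, Iterator, List


class DownloadError(Exception):
    """Stand-in for exceptions.DownloadError (assumption, illustration only)."""


class DownloadHTTPError(DownloadError):
    """Stand-in for exceptions.DownloadHTTPError (assumption, illustration only)."""

    def __init__(self, message: str, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


class InMemoryFetcher:
    """Hypothetical fetcher that serves content from a dict instead of a network.

    In real code this class would derive from FetcherInterface.
    """

    def __init__(self, files: Dict[str, List[bytes]]) -> None:
        self.files = files

    def _fetch(self, url: str) -> Iterator[bytes]:
        # Check eagerly so the error is raised by the call itself,
        # not on the first iteration of the returned iterator.
        if url not in self.files:
            raise DownloadHTTPError(f"HTTP 404 for {url}", 404)
        return iter(self.files[url])


fetcher = InMemoryFetcher({"https://example.com/a.txt": [b"hello ", b"world"]})
content = b"".join(fetcher._fetch("https://example.com/a.txt"))

try:
    fetcher._fetch("https://example.com/missing")
    status = None
except DownloadHTTPError as e:
    status = e.status_code
```

Returning an iterator from a regular method, rather than writing _fetch() as a generator function, is a design choice in this sketch: it makes the HTTP error surface when _fetch() is called.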

download_bytes(url, max_length)

Download bytes from the given url.

Behaves like download_file(), except that the downloaded content is returned as bytes.

Parameters:
  • url (str) – URL string that represents the location of the file.

  • max_length (int) – Upper bound of data size in bytes.

Raises:
  • exceptions.DownloadError – An error occurred during download.

  • exceptions.DownloadLengthMismatchError – Downloaded bytes exceed max_length.

  • exceptions.DownloadHTTPError – An HTTP error code was received.

Returns:

Content of the file in bytes.

Return type:

bytes

download_file(url, max_length)

Download a file from the given url.

It is recommended to use download_file() within a with block to guarantee that allocated file resources are always released, even if the download fails.

Parameters:
  • url (str) – URL string that represents the location of the file.

  • max_length (int) – Upper bound of file size in bytes.

Raises:
  • exceptions.DownloadError – An error occurred during download.

  • exceptions.DownloadLengthMismatchError – Downloaded bytes exceed max_length.

  • exceptions.DownloadHTTPError – An HTTP error code was received.

Yields:

TemporaryFile object that points to the contents of url.

Return type:

Iterator[IO]
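The documented behaviour (a temporary file yielded inside a with block, with an upper bound on size) can be illustrated with a standard-library sketch. It is not the framework's implementation: download_file_sketch is a hypothetical function that takes an iterable of chunks in place of a url, and DownloadLengthMismatchError is a local stand-in for the class in exceptions.

```python
import tempfile
from contextlib import contextmanager
from typing import IO, Iterable, Iterator


class DownloadLengthMismatchError(Exception):
    """Stand-in for exceptions.DownloadLengthMismatchError (assumption)."""


@contextmanager
def download_file_sketch(chunks: Iterable[bytes], max_length: int) -> Iterator[IO]:
    """Write chunks to a temporary file, refusing more than max_length bytes."""
    with tempfile.TemporaryFile() as temp_file:
        received = 0
        for chunk in chunks:
            received += len(chunk)
            if received > max_length:
                raise DownloadLengthMismatchError(
                    f"Received {received} bytes, more than the maximum {max_length}"
                )
            temp_file.write(chunk)
        temp_file.seek(0)  # let the caller read from the start
        yield temp_file


# Normal case: the file is usable inside the block and closed afterwards.
with download_file_sketch([b"abc", b"def"], max_length=10) as f:
    data = f.read()

# Oversized case: the error is raised on entering the with block.
try:
    with download_file_sketch([b"abc", b"def"], max_length=4):
        too_long = False
except DownloadLengthMismatchError:
    too_long = True
```

Because the temporary file is opened inside the context manager, it is closed on every exit path, which is the guarantee the with block recommendation refers to.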

fetch(url)

Fetch the contents of HTTP/HTTPS url from a remote server.

Parameters:

url (str) – URL string that represents a file location.

Raises:
  • exceptions.DownloadError – An error occurred during download.

  • exceptions.DownloadHTTPError – An HTTP error code was received.

Returns:

Bytes iterator

Return type:

Iterator[bytes]
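The relationship between fetch() and _fetch() described for this class (errors that are not DownloadErrors get wrapped) can be sketched as below. This is an illustration under assumptions, not the framework's code: fetch_sketch and broken_fetch are hypothetical, and DownloadError is a local stand-in for the class in exceptions.

```python
from typing import Callable, Iterator


class DownloadError(Exception):
    """Stand-in for exceptions.DownloadError (assumption, illustration only)."""


def fetch_sketch(_fetch: Callable[[str], Iterator[bytes]], url: str) -> Iterator[bytes]:
    """Call _fetch and wrap any unexpected error in a DownloadError."""
    try:
        return _fetch(url)
    except DownloadError:
        # Already a download error (HTTP errors included): pass it through.
        raise
    except Exception as e:
        # Anything else is wrapped; the original error is kept as the cause.
        raise DownloadError(f"Failed to download {url}") from e


def broken_fetch(url: str) -> Iterator[bytes]:
    # Hypothetical failure inside a custom network stack.
    raise ConnectionResetError("connection dropped")


try:
    fetch_sketch(broken_fetch, "https://example.com/a.txt")
    wrapped = None
except DownloadError as e:
    wrapped = e
```

With this pattern, callers only need to handle DownloadError and its subclasses, whatever network stack sits behind _fetch().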

class RequestsFetcher(socket_timeout=30, chunk_size=400000)

An implementation of FetcherInterface based on the requests library.

Parameters:
  • socket_timeout (int) – Timeout in seconds.

  • chunk_size (int) – Chunk size in bytes used when downloading.

socket_timeout

Timeout in seconds, used both for the initial connection and as the maximum delay between received bytes.

chunk_size

Chunk size in bytes used when downloading.
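As a rough illustration of what a chunk size means for the resulting bytes iterator, the sketch below reads a stream in pieces of at most chunk_size bytes. The iter_chunks function is hypothetical and io.BytesIO stands in for a network response; it does not show how RequestsFetcher is implemented.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive pieces of at most chunk_size bytes until the stream ends."""
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield data


# Ten bytes read four at a time give two full chunks and a shorter last one.
chunks = list(iter_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4))
```

A larger chunk size means fewer iterations per download at the cost of more memory held per chunk.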