Recently, I have been working with the Requests library in Python. I wrote a simple function to pull down a file that took more than a minute to download. While waiting for the download to complete, I realized it would be nice to have some insight into the download’s progress. A quick search on Stack Overflow led to an excellent example. Below is a simple way to display a progress bar while downloading a file.
import sys

import requests


def download_file(url, name):
    '''
    Function takes a url and a filename, creates a request, opens a file
    and streams the content in chunks to the file system. It then writes
    out an '=' symbol for every two percent of the total content length
    to the console.
    '''
    filename = 'myfile_' + str(name) + '.ext'
    r = requests.get(url, stream=True)
    with open(filename, 'wb') as f:
        total_length = r.headers.get('Content-Length')
        if total_length is None:  # no content length header
            f.write(r.content)
        else:
            downloaded = 0
            total_length = int(total_length)
            for data in r.iter_content(chunk_size=4096):
                downloaded += len(data)
                f.write(data)
                # 50-character bar, so each '=' represents two percent
                done = int(50 * downloaded / total_length)
                sys.stdout.write("\r[%s%s]" % ('=' * done, ' ' * (50 - done)))
                sys.stdout.flush()
    return 1
What’s going on?
requests.get() takes a URL and sends an HTTP GET request. stream=True is an optional keyword argument to requests.get(). It tells Requests to defer downloading the response body: only the headers are fetched up front, and the content is pulled down as you read it, in chunks, instead of all at once.
The response headers are then checked for the ‘Content-Length’ header. We use its value to work out what fraction of the file has been downloaded so far. If the header is missing, the function falls back to writing the whole body in one go, with no progress bar. Otherwise, a running byte count is kept in a variable and updated as each chunk is processed.
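To see the progress arithmetic on its own, here is a minimal sketch that pulls the bar calculation into a separate function. The name render_bar and its width parameter are my own, not part of Requests, and the sketch assumes total_length is greater than zero.

```python
def render_bar(downloaded, total_length, width=50):
    """Return a text progress bar such as '[=====     ]'."""
    # Clamp so that a byte count above total_length never overflows the bar.
    done = min(width, int(width * downloaded / total_length))
    return "[%s%s]" % ("=" * done, " " * (width - done))


# Simulate a 1000-byte download arriving in 250-byte chunks.
for received in (250, 500, 750, 1000):
    print(render_bar(received, 1000, width=20))
```

With the default width of 50, each '=' stands for two percent of the file, which matches the function above.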
The final piece to point out in this little function is the iter_content() method. According to the Requests documentation, iter_content():
Iterates over the response data. When stream=True is set on the request, this avoids reading the content at once into memory for large responses. The chunk size is the number of bytes it should read into memory.
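To get a feel for what chunk_size means without touching the network, here is a rough sketch that reads an in-memory byte stream in fixed-size blocks. iter_chunks is a hypothetical stand-in written for this illustration, not the Requests implementation, and the real iter_content() may yield chunks whose sizes differ from chunk_size (for example, when the response body is compressed).

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield successive blocks of at most chunk_size bytes from a file-like object."""
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield data


# 10,000 bytes read 4096 at a time: two full chunks and one partial chunk.
payload = io.BytesIO(b"x" * 10000)
sizes = [len(chunk) for chunk in iter_chunks(payload, chunk_size=4096)]
print(sizes)  # [4096, 4096, 1808]
```

Only one chunk is held in memory at a time, which is why this pattern suits large downloads.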