I am looking for the best way to send large Numpy arrays (composed mainly of images) via Flask.

For now, I am doing something like this:

Server side:

np.save(matrix_path, my_array)
return send_file(matrix_path+'.npy')

Client side:

with open('test_temp', 'wb') as f:
    f.write(r.content)

my_array = np.load('test_temp')

But the .npy file is very large so it takes too long.

I thought about using h5py, but as the images have different sizes (array.shape = (200,)), I cannot use it (creating a dataset for each image would take too long).

Does anyone have an idea of how to optimize this?


1 Answer

As the comments section is really just starting to become an answer in and of itself, I'll write it all out here.

EDIT: NumPy has a built-in way to compress multiple arrays into a single file, which packages them up neatly for sending. Combining this with a buffer rather than a file on disk is probably the quickest and easiest way to gain some speed. Here is a quick example of numpy.savez_compressed saving some data to a buffer; the buffer can then be sent using flask.send_file.

import io

import numpy as np

myarray_1 = np.arange(10)  # dummy data
myarray_2 = np.eye(5)

buf = io.BytesIO()  # create our buffer

# pass the buffer as you would an open file object
np.savez_compressed(buf, myarray_1, myarray_2)  # etc...

buf.seek(0)  # This simulates closing the file and re-opening it.
             #  Otherwise the cursor will already be at the end of the
             #  file when flask tries to read the contents, and it will
             #  think the file is empty.

# flask.send_file(buf)

# client receives buf
npzfile = np.load(buf)

print(npzfile['arr_0'])  # default names are given unless you use keywords to name your arrays
print(npzfile['arr_1'])  #  such as: np.savez(buf, x=myarray_1, y=myarray_2, ...) (see the docs)
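To make the Flask side concrete, here is a minimal sketch of the same idea wired into an endpoint, using images of different shapes as in the question. The route name /arrays, the helper fetch_arrays, and the dummy images list are all made up for illustration, and the test client stands in for a real HTTP request, so treat it as a starting point rather than a finished implementation.

```python
import io

import numpy as np
from flask import Flask, send_file

app = Flask(__name__)

# Hypothetical stand-in for the real data: images of different shapes.
images = [np.zeros((4, 6, 3), dtype=np.uint8), np.ones((5, 5), dtype=np.uint8)]


@app.route("/arrays")  # hypothetical route name
def arrays():
    buf = io.BytesIO()
    # One archive entry per image, named arr_0, arr_1, ...
    np.savez_compressed(buf, *images)
    buf.seek(0)  # rewind so send_file reads from the start
    # A mimetype is given explicitly because a buffer has no file name.
    return send_file(buf, mimetype="application/octet-stream")


def fetch_arrays(raw_bytes):
    """Client side: rebuild the list of arrays from the response body."""
    with np.load(io.BytesIO(raw_bytes)) as npz:
        # Sort numerically so arr_10 does not come before arr_2.
        keys = sorted(npz.files, key=lambda k: int(k.split("_")[1]))
        return [npz[k] for k in keys]


# Exercise the endpoint without starting a server.
with app.test_client() as client:
    received = fetch_arrays(client.get("/arrays").data)
```

With a real client such as requests, r.content would be passed to fetch_arrays in place of the test client's response data. Because each image is stored as its own entry in the archive, the arrays do not need to share a shape.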

There are 3 quick ways to gain some speed when sending files:

1. Don't write to disk: this one is pretty simple. Just use a buffer to store the data before passing it to flask.send_file().

2. Compress the data: once you have a buffer of binary data, there are many options for compression, and zlib is part of the standard Python distribution. If your arrays are images (or even if they aren't), PNG compression is lossless and can sometimes provide better compression than zlib on its own. SciPy has deprecated its built-in imread and imsave, so you should use imageio.imwrite now.

3. Get a higher-performance server to actually do the file sending. The built-in development server that runs when you call app.run() or invoke your app via flask directly ($ flask run or $ python -m flask run) does not support the X-Sendfile feature. This is one reason to run Flask behind something like Apache or Nginx. Unfortunately, this isn't implemented in the same way for each server, and it may require a file in the filesystem (though you could possibly use an in-memory file if the OS supports it). You will need to read the documentation for whatever deployment you choose.
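For the compression point, a small sketch using only zlib and an in-memory buffer might look like the following. The gradient image is made-up test data chosen because it compresses well; real photos will usually shrink much less, so it is worth measuring on your own arrays.

```python
import io
import zlib

import numpy as np

# Hypothetical image-like data: a repeating gradient, which compresses well.
image = np.tile(np.arange(256, dtype=np.uint8), (256, 1))

# Serialize to an in-memory .npy payload instead of a file on disk.
buf = io.BytesIO()
np.save(buf, image)
raw = buf.getvalue()

# Compress before sending; level 6 is a common speed/size trade-off.
compressed = zlib.compress(raw, 6)

# Receiving side: decompress, then load from a buffer.
restored = np.load(io.BytesIO(zlib.decompress(compressed)))

print(len(raw), len(compressed))
```

The compressed bytes can be returned from a Flask view like any other binary payload, and the client reverses the two steps in the opposite order.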
...