Upload Large Files Directly to GCS with Dropzone and Signed URLs
I had a Django app deployed on Google App Engine (GAE) and needed our front end to upload files, including large videos, for storage on Google Cloud Storage (GCS). The problem is that GAE enforces a hard 32MB limit on request size, so large files (videos, in our case) could not be uploaded through App Engine itself (the App Engine quotas documentation has the details).
The first approach I tried was:
Break up files into chunks on the front end
I used Dropzone's chunking feature to send files in 30MB chunks (which GAE allows). Once all of the chunks were uploaded, I reassembled the file on the server and uploaded it to GCS.
Dropzone.options.dropzoneForm = {
    paramName: "file", // Name that will be used to transfer the file
    acceptedFiles: "image/png, image/jpeg, video/mp4",
    chunking: true, // Enable chunking
    chunkSize: 30000000, // Chunk size in bytes
    forceChunking: true // Chunk even files smaller than chunkSize
};
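On the server, each chunk has to be written into place before the complete file can be sent to GCS. The following is a minimal sketch of that step, not the code from my app. It assumes the view passes in the values Dropzone sends with every chunk (dzuuid, dzchunkbyteoffset, dzchunkindex and dztotalchunkcount) and that chunks arrive in order, which is Dropzone's behaviour unless parallel chunk uploads are turned on. The function name save_chunk and the UPLOAD_DIR location are made up for illustration.

```python
import os
import tempfile

# Hypothetical scratch location; on GAE only /tmp is writable.
UPLOAD_DIR = os.path.join(tempfile.gettempdir(), "chunked-uploads")


def save_chunk(upload_dir, upload_id, chunk, byte_offset, chunk_index, total_chunks):
    """Write one chunk at its byte offset.

    Returns the path of the reassembled file after the last chunk,
    otherwise None. Assumes chunks arrive in order.
    """
    os.makedirs(upload_dir, exist_ok=True)
    # basename() keeps path components in the upload id from escaping upload_dir.
    path = os.path.join(upload_dir, os.path.basename(upload_id) + ".part")
    # "r+b" keeps the chunks already written; "wb" creates the file for the first chunk.
    mode = "r+b" if os.path.exists(path) else "wb"
    with open(path, mode) as target:
        target.seek(byte_offset)
        target.write(chunk)
    if chunk_index == total_chunks - 1:
        return path
    return None
```

When save_chunk returns a path, the file is complete and can be uploaded to GCS and then deleted from the scratch directory.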
This approach worked, but it had the following limitations:
- GAE does not give you general file system access. You can still write files to the /tmp directory, but it has a 2GB limit.
- Uploading and writing each chunk was very slow.
Direct Upload to GCS Using Signed URLs
So I finally decided to upload files directly from the browser to the GCS bucket. GCS offers two different APIs for file upload:
- XML: only this one supports signed URLs, so it is the one to use
- JSON
The workflow for uploading a file then becomes:
File Uploaded → Request Signed URL from BackEnd → Write file to GCS → On success inform the backend
To create signed URLs you can use the google-cloud-storage client library. You need a service account to sign them.
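import datetime

from django.conf import settings
from google.cloud import storage
# _signing is a private module of google-cloud-storage, so it may change between releases.
from google.cloud.storage._signing import generate_signed_url

API_ACCESS_ENDPOINT = 'https://storage.googleapis.com'


def _get_storage_client():
    # The service account key file provides the credentials used for signing.
    return storage.Client.from_service_account_json("credentials.json")


def get_signed_url(name, content_type):
    client = _get_storage_client()
    # The signed URL stays valid for one day.
    expiration = datetime.datetime.now() + datetime.timedelta(days=1)
    # safe_filename is a helper from the app that is not shown here.
    canonical_resource = "/" + settings.CLOUD_STORAGE_BUCKET + "/" + safe_filename(name)
    url = generate_signed_url(
        client._credentials,
        resource=canonical_resource,
        api_access_endpoint=API_ACCESS_ENDPOINT,
        expiration=expiration,
        method="PUT",
        content_type=content_type
    )
    return url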
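The safe_filename helper used above is not part of this post. Since the object name becomes part of the resource that gets signed, it helps to keep it free of path separators and unusual characters. Here is a minimal sketch of what such a helper could look like. It is an assumption for illustration, not the exact helper from my app: it strips any directory part, replaces characters outside a conservative set, and adds a timestamp so two uploads with the same name do not collide. The optional now parameter exists only to make the function easy to test.

```python
import datetime
import re


def safe_filename(filename, now=None):
    """Return an object name that is safe to embed in the signed resource path."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    # Drop any directory part sent by the client (both slash styles).
    base = filename.replace("\\", "/").rsplit("/", 1)[-1]
    # Replace runs of characters outside a conservative set with an underscore.
    base = re.sub(r"[^A-Za-z0-9._-]+", "_", base).strip("._") or "upload"
    stamp = now.strftime("%Y-%m-%d-%H%M%S")
    stem, dot, ext = base.rpartition(".")
    if dot and stem:
        # Keep the extension last so the name still ends in .mp4, .png, and so on.
        return "{}-{}.{}".format(stem, stamp, ext)
    return "{}-{}".format(base, stamp)
```

Whatever name this returns is the name the browser has to upload to, so the backend should hand it back along with the signed URL or store it for the later success callback.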
Because I upload files with different content types, each upload request has to send the same Content-Type header that its URL was signed with (GCS requires this). The upload URL in Dropzone also has to change for every upload, since the canonical resource depends on the file. I used Dropzone's accept function to handle both:
Dropzone.options.dropzoneForm = {
    acceptedFiles: "image/png, image/jpeg, video/mp4",
    method: "PUT",
    timeout: null, // Dropzone has a default timeout of 30 sec
    // Get the upload URL dynamically
    url: function (files) {
        console.log(files[0].dynamicUploadUrl);
        return files[0].dynamicUploadUrl;
    },
    headers: { // Remove unwanted headers
        'Cache-Control': null,
        'X-Requested-With': null,
        'Accept': null
    },
    // IMP: this override is needed to send the raw file to GCS
    // instead of Dropzone's default multipart form body
    sending: function (file, xhr) {
        let _send = xhr.send;
        xhr.send = function () {
            _send.call(xhr, file);
        };
    },
    init: function () {
        this.on("success", function (file, response) {
            // Inform the server that the upload succeeded
        });
    },
    accept: function (file, done) {
        // Dynamically set the Content-Type header based on the file
        // being uploaded.
        this.options.headers['Content-Type'] = file.type;
        // Request a signed URL from the backend.
        // On success, set the URL on the file object, which will be used
        // in the url function above. Something like:
        // file.dynamicUploadUrl = signed_url;
        // Then tell Dropzone the file is accepted:
        // done();
    }
};
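When an upload fails in the browser, it can be hard to tell whether the signed URL or the Dropzone setup is at fault. One way to separate the two is to send the same PUT from a script. The sketch below uses only the Python standard library and is an illustration rather than part of my app: the function names are made up, and signed_url is assumed to come from the backend as described above. The important detail is that the Content-Type header matches the content type the URL was signed with.

```python
import urllib.request


def build_signed_put(signed_url, payload, content_type):
    """Build a PUT request that sends the raw bytes with a matching Content-Type."""
    return urllib.request.Request(
        signed_url,
        data=payload,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def upload(signed_url, payload, content_type):
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for error responses, for
    example when the signature does not match the request.
    """
    request = build_signed_put(signed_url, payload, content_type)
    with urllib.request.urlopen(request) as response:
        return response.status
```

If this script succeeds with a given URL while the browser upload fails, the problem is more likely in the request headers Dropzone sends or in the bucket's CORS settings than in the signing code.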
Be sure to set up CORS on your GCS bucket (see the GCS CORS documentation). There are two gsutil commands for this.
To get the existing CORS settings:
gsutil cors get gs://vid-upload-test
To set CORS on the bucket:
gsutil cors set cors.json gs://<bucket>
Here cors.json is a local JSON file with the settings, something like this:
[
    {
        "maxAgeSeconds": 3600,
        "method": [
            "GET",
            "HEAD",
            "PUT"
        ],
        "origin": [
            <Your origin whitelist>
        ],
        "responseHeader": [
            "X-Requested-With",
            "Access-Control-Allow-Origin",
            "Content-Type"
        ]
    }
]
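A stray trailing comma or an unquoted origin is enough to make cors.json invalid JSON. One way to avoid that is to generate the file from a script instead of editing it by hand. This is a small sketch under that assumption: the function names are made up, and the origins you pass in are your own front-end origins.

```python
import json


def build_cors_config(origins, max_age_seconds=3600):
    """Return the same structure as the cors.json shown above."""
    return [
        {
            "maxAgeSeconds": max_age_seconds,
            "method": ["GET", "HEAD", "PUT"],
            "origin": list(origins),
            "responseHeader": [
                "X-Requested-With",
                "Access-Control-Allow-Origin",
                "Content-Type",
            ],
        }
    ]


def write_cors_file(path, origins):
    """Write the config as valid JSON, ready for `gsutil cors set`."""
    with open(path, "w") as handle:
        json.dump(build_cors_config(origins), handle, indent=2)
```

Calling write_cors_file("cors.json", ["https://example.com"]) produces a file that can be passed to the gsutil cors set command above (example.com is a placeholder origin).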
I struggled a lot while setting all of this up, so I hope this helps! Cheers.