Upload Large Files Directly to GCS with Dropzone and Signed URLs

Prashant Pal
3 min read · Aug 2, 2021

I had a Django app deployed on Google App Engine (GAE) and needed to let users upload files, including large videos, from our front end and store them on Google Cloud Storage (GCS). The problem is that GAE enforces a hard 32 MB limit on request size, so we could not send large files (videos, in our case) through App Engine. You can read more about this in the App Engine quota documentation.

The first approach I tried was chunking.

Break Up Files into Chunks on the Front End

I used Dropzone's chunking feature to send files in chunks of 30 MB (which GAE allows). Once all the chunks were uploaded, I reassembled the file on the server and uploaded it to GCS.

Dropzone.options.dropzoneForm = {
    paramName: "file", // Name of the form field that carries the file
    acceptedFiles: "image/png, image/jpeg, video/mp4",
    chunking: true, // Enable chunking
    chunkSize: 30000000, // Chunk size in bytes (stays under GAE's 32 MB limit)
    forceChunking: true // Chunk even files smaller than chunkSize
}
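
The server-side reassembly is not shown in this post. A minimal sketch of one way to do it is below, assuming each chunk request carries Dropzone's `dzuuid`, `dzchunkindex` and `dztotalchunkcount` form fields. The function names and the directory layout are hypothetical, not the code I actually ran.

```python
import os
import shutil


def save_chunk(upload_dir, upload_id, chunk_index, data):
    """Store one chunk as its own file, named by its zero-padded index."""
    # basename() strips path separators from the client-supplied upload id.
    chunk_dir = os.path.join(upload_dir, os.path.basename(upload_id))
    os.makedirs(chunk_dir, exist_ok=True)
    with open(os.path.join(chunk_dir, f"{chunk_index:06d}"), "wb") as fh:
        fh.write(data)
    return chunk_dir


def assemble_if_complete(chunk_dir, total_chunks, dest_path):
    """Concatenate the chunks in index order once all of them have arrived."""
    names = sorted(os.listdir(chunk_dir))  # zero padding keeps the sort numeric
    if len(names) < total_chunks:
        return False
    with open(dest_path, "wb") as out:
        for name in names:
            with open(os.path.join(chunk_dir, name), "rb") as part:
                shutil.copyfileobj(part, out)
    return True
```

Storing each chunk under its index means chunks can arrive in any order, and the file is only assembled after the last one lands. On GAE the only writable location is /tmp, which is where the size limit mentioned in this post comes in.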

The approach worked, but it had the following limitations:

  1. GAE does not give you general file system access. You can still write files to the /tmp directory, but it has a limit of 2 GB.
  2. Each chunk upload and write was very slow.

Direct Upload to GCS Using Signed URLs

So I finally decided to upload files directly from the browser to the GCS bucket. GCS offers two different APIs for file upload:

  • XML: only this one supports signed URLs, so this is the one to use
  • JSON

So the workflow for uploading a file is:

File added in the browser → request a signed URL from the back end → write the file to GCS → on success, inform the back end

To create signed URLs you can use the google-cloud-storage Python library. You need a service account to sign them.

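import datetime

from django.conf import settings
from google.cloud import storage
# _signing is a private module, so this helper may be renamed or moved between library versions.
from google.cloud.storage._signing import generate_signed_url

API_ACCESS_ENDPOINT = 'https://storage.googleapis.com'


def _get_storage_client():
    return storage.Client.from_service_account_json("credentials.json")


def get_signed_url(name, content_type):
    client = _get_storage_client()
    # Timezone-aware UTC, because the library treats naive datetimes as UTC.
    expiration = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=1)
    # safe_filename is the app's own helper (not shown here).
    canonical_resource = "/" + settings.CLOUD_STORAGE_BUCKET + "/" + safe_filename(name)
    url = generate_signed_url(
        client._credentials,
        resource=canonical_resource,
        api_access_endpoint=API_ACCESS_ENDPOINT,
        expiration=expiration,
        method="PUT",
        content_type=content_type
    )
    return url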
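
The snippet above imports from a private module of the library. As an alternative, the public `Blob.generate_signed_url` method covers the same use case. The sketch below is an assumption about how that could look, not the code I ran. The helper name `signed_put_url` and the one-hour lifetime are made up for illustration.

```python
import datetime


def signed_put_url(blob, content_type, lifetime=datetime.timedelta(hours=1)):
    """Return a V4 signed URL that accepts PUT uploads of `content_type`.

    `blob` is expected to be a google.cloud.storage.Blob created from a
    client that holds service account credentials.
    """
    return blob.generate_signed_url(
        version="v4",
        expiration=lifetime,        # V4 signed URLs can live at most 7 days
        method="PUT",
        content_type=content_type,  # the upload must send this exact header
    )


# Usage (bucket and object names are hypothetical):
# from google.cloud import storage
# client = storage.Client.from_service_account_json("credentials.json")
# blob = client.bucket("my-bucket").blob("videos/clip.mp4")
# url = signed_put_url(blob, "video/mp4")
```

Taking the blob as an argument keeps the signing logic separate from how the client and credentials are set up.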

Because I upload files with various content types, each upload request must send the same Content-Type header that its URL was signed with (GCS requires this). The upload URL in Dropzone also has to change for every upload, because the canonical resource depends on the file. I used Dropzone's accept function to handle both.

Dropzone.options.dropzoneForm = {
    acceptedFiles: "image/png, image/jpeg, video/mp4",
    method: "PUT",
    timeout: null, // Dropzone's default timeout is 30 s; null disables it
    // Get the upload URL dynamically
    url: function (files) {
        console.log(files[0].dynamicUploadUrl);
        return files[0].dynamicUploadUrl;
    },
    headers: { // Remove unwanted headers
        'Cache-Control': null,
        'X-Requested-With': null,
        'Accept': null
    },
    // IMPORTANT: needed so that the raw file is sent to GCS
    // instead of multipart form data
    sending: function (file, xhr) {
        let _send = xhr.send;
        xhr.send = function () {
            _send.call(xhr, file);
        };
    },
    init: function () {
        this.on("success", function (file, response) {
            // Inform the server that the upload succeeded
        });
    },
    accept: function (file, done) {
        // Dynamically set the Content-Type header based on the file being uploaded
        this.options.headers['Content-Type'] = file.type;
        // Request a signed URL from the back end.
        // On success, store the URL on the file object so that the
        // url function above can use it, something like:
        //     file.dynamicUploadUrl = signed_url;
        // Then tell Dropzone to go ahead with the upload:
        //     done();
    }
};
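
When debugging, it helps to check a signed URL outside the browser first. The sketch below builds the same kind of request in Python with the standard library: a PUT whose body is the raw file bytes and whose Content-Type matches the one used for signing. The helper name `build_put_request` is hypothetical, and reading the whole file into memory is only reasonable for small test files.

```python
import urllib.request


def build_put_request(signed_url, data, content_type):
    """Build (but do not send) the PUT request that a signed URL expects."""
    return urllib.request.Request(
        signed_url,
        data=data,  # raw file bytes, not multipart form data
        method="PUT",
        # Must equal the content_type the URL was signed with.
        headers={"Content-Type": content_type},
    )


# Usage (path and URL are placeholders):
# with open("clip.mp4", "rb") as fh:
#     req = build_put_request(signed_url, fh.read(), "video/mp4")
# urllib.request.urlopen(req)
```

If this request succeeds but the browser upload fails, the problem is more likely in the Dropzone headers or the bucket's CORS settings than in the signing code.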

Be sure to set up CORS on your GCS bucket (see the GCS documentation). Two gsutil commands are useful here.

To get the existing CORS settings:

gsutil cors get gs://vid-upload-test

To set CORS on the bucket:

gsutil cors set cors.json gs://<bucket>

where cors.json is a local JSON file with the settings, something like this:

[
    {
        "maxAgeSeconds": 3600,
        "method": [
            "GET",
            "HEAD",
            "PUT"
        ],
        "origin": [
            <Your origin whitelist>
        ],
        "responseHeader": [
            "X-Requested-With",
            "Access-Control-Allow-Origin",
            "Content-Type"
        ]
    }
]
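
A stray comma is enough to make gsutil reject the file, so one option is to generate cors.json instead of writing it by hand. This is a small sketch using only the standard library. The function name and the example origin are hypothetical.

```python
import json


def render_cors_json(origins, max_age_seconds=3600):
    """Return the text of a cors.json that allows browser PUT uploads."""
    config = [
        {
            "maxAgeSeconds": max_age_seconds,
            "method": ["GET", "HEAD", "PUT"],
            "origin": list(origins),
            "responseHeader": [
                "X-Requested-With",
                "Access-Control-Allow-Origin",
                "Content-Type",
            ],
        }
    ]
    return json.dumps(config, indent=2)


# Usage (the origin is a placeholder):
# with open("cors.json", "w") as fh:
#     fh.write(render_cors_json(["https://app.example.com"]))
```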

I struggled a lot while setting all of this up, so I hope this helps! Cheers.
