Chrome, FileReader API, event.target.result === ""

I have a web app that processes big text files (> 500 MB) via the FileReader API's readAsText() method.
It has been working great for years, but suddenly I get empty responses: event.target.result is an empty string.

A 369 MB file works, but a 589 MB file does not.

I have tested on multiple computers, with the same result. It does work in Firefox, however, so Chrome must have introduced this in a recent update.

Has this bug been submitted?

Is there any workaround?

Answer

This is a V8 limitation on string length.

Has this bug been submitted?

Here is the responsible commit: https://github.com/v8/v8/commit/ea56bf5513d0cbd2a35a9035c5c2996272b8b728

Running a bisect, I landed on this change log and found that the change shipped in Chrome v79.

Before this change, the limit on 64-bit platforms was set to 1024MB; the new limit is 512MB, half of that.

This means that FileReader is not the only thing affected: any method that tries to produce such a big string will fail too.

Here is a simple example:

const header = 24;
const bytes = new Uint8Array( (512 * 1024 * 1024) - header );
let txt = new TextDecoder().decode( bytes );
console.log( txt.length ); // 536870888
txt += "f"; // RangeError
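
Since the failure is silent with FileReader (an empty result rather than an error), it can help to check the size before reading. Below is a minimal sketch, assuming the 512MB-minus-24 limit measured above still applies to the engine you run on. Both `ASSUMED_MAX_STRING_LENGTH` and `fitsInSingleString` are hypothetical names, not part of any API.

```javascript
// Assumption: the V8 limit on 64-bit platforms measured in the example above
// (512 * 1024 * 1024 minus a 24-byte header). Other engines and versions differ.
const ASSUMED_MAX_STRING_LENGTH = 512 * 1024 * 1024 - 24;

// Hypothetical helper: returns true when `byteLength` bytes of ASCII or UTF-8
// text can be decoded into a single string under the assumed limit.
// A UTF-8 sequence never decodes to more UTF-16 code units than it has bytes,
// so comparing the byte length is a safe (conservative) check.
function fitsInSingleString(byteLength, maxLength = ASSUMED_MAX_STRING_LENGTH) {
  return byteLength <= maxLength;
}

// The two file sizes from the question:
console.log(fitsInSingleString(369 * 1024 * 1024)); // true
console.log(fitsInSingleString(589 * 1024 * 1024)); // false
```

In a page you would pass `file.size` and fall back to chunked processing when the check fails.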

Is there any workaround?

The only way around this issue is to process your text in chunks.

Luckily, you are dealing with ASCII data, so you can easily split your resource with the Blob.slice() method and work on one chunk at a time:

// working in a Web-Worker to not freeze the tab while generating the data
const worker_script = `
(async () => {

  postMessage( 'Generating file, may take some time...' );

  const bytes = Uint8Array.from(
    { length: 800 * 1024 * 1024 },
    (_, i) => (i % 25) + 65
  );
  const blob = new Blob( [ bytes ] );

  const length = blob.size;
  const chunk_size = 128 * 1024 * 1024;

  postMessage( 'Original file size: ' + length );
  
  let As = 0;
  let i = 0;
  while ( i < length ) {
    const str = await blob.slice( i, i + chunk_size ).text();
    i += chunk_size;
    As += str.split( 'A' ).length - 1;
  }
  postMessage( 'found ' + As + ' "A"s in the whole file' );

} )();
`;
const worker_blob = new Blob( [ worker_script ] );
const worker = new Worker( URL.createObjectURL( worker_blob ) );
worker.onmessage = (evt) => console.log( evt.data );

Those working with multi-byte encodings like UTF-8 would have to deal with characters that get split across two chunks, and this may not be that easy…
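
One possible way to handle this, sketched here under assumptions: read each slice as an ArrayBuffer and feed it to a single TextDecoder with the `stream` option, so that an incomplete byte sequence at the end of one chunk is completed by the next. `countInChunks` is a hypothetical helper, and it only counts a single-character needle, since a longer needle could itself straddle two chunks.

```javascript
// Hypothetical helper: counts occurrences of a single character `needle`
// in a UTF-8 encoded Blob, reading it slice by slice.
async function countInChunks(blob, needle, chunkSize = 128 * 1024 * 1024) {
  const decoder = new TextDecoder(); // UTF-8 by default
  let count = 0;
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    const buffer = await blob.slice(offset, offset + chunkSize).arrayBuffer();
    // `stream: true` makes the decoder hold back an incomplete trailing
    // byte sequence and prepend it to the next chunk
    const str = decoder.decode(buffer, { stream: true });
    count += str.split(needle).length - 1;
  }
  // a final call without `stream` flushes whatever the decoder still holds
  count += decoder.decode().split(needle).length - 1;
  return count;
}
```

For instance, `countInChunks(file, 'é')` would resolve to the number of "é" in `file`, whichever bytes the slice boundaries happen to fall on.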

Also note that even in browsers that let you generate such big strings, you may very well face other problems. For instance, Safari lets you generate bigger strings, but if you keep them alive in memory for too long, the browser will reload your page automatically.


2021 update

Almost all modern browsers now support the Blob.stream() method, which returns a ReadableStream, allowing us to, well… read that Blob's content as a stream. We can thus process huge text files in a more performant way, and thanks to the stream option of the TextDecoder API we can even handle non-ASCII characters:

const bytes = Uint8Array.from(
  { length: 800 * 1024 * 1024 },
  (_, i) => (i % 25) + 65
);
const blob = new Blob( [ bytes ] );

console.log( 'Original file size: ' + blob.size );
const reader = blob.stream().getReader();
const decoder = new TextDecoder();
let As = 0;
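reader.read().then( function process({ done, value }) {
  // on the last call `value` is undefined; turning `stream` off then flushes the decoder
  const str = decoder.decode( value, { stream: !done } );
  As += str.split( 'A' ).length - 1;
  if( !done ) {
    // return the next read so that errors propagate down the promise chain
    return reader.read().then( process );
  }
  else {
    console.log( 'found ' + As + ' "A"s in the whole file' );
  }
} );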
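
Counting a single character works on any chunk, but most text processing works on larger units such as lines, and a line can be cut in two by a chunk boundary. Here is a rough sketch of one way to deal with that, by carrying the unfinished tail of each chunk over to the next one. `readLines` is a hypothetical helper, and it assumes UTF-8 text with "\n" line endings (with "\r\n" each line would keep a trailing "\r").

```javascript
// Hypothetical helper: yields the lines of a UTF-8 Blob one at a time,
// without ever holding the whole text in a single string.
async function* readLines(blob) {
  const reader = blob.stream().getReader();
  const decoder = new TextDecoder();
  let remainder = ""; // unfinished line carried over from the previous chunk
  while (true) {
    const { done, value } = await reader.read();
    // on the last call `value` is undefined and `stream: false` flushes the decoder
    remainder += decoder.decode(value, { stream: !done });
    const lines = remainder.split("\n");
    remainder = lines.pop(); // the last piece may be an incomplete line
    yield* lines;
    if (done) break;
  }
  if (remainder !== "") yield remainder; // last line without a trailing "\n"
}

// Usage: count the lines of a small Blob
(async () => {
  let count = 0;
  for await (const line of readLines(new Blob(["a\nbb\nccc"]))) count++;
  console.log(count + " lines");
})();
```

Keep in mind that a single line longer than the string limit would still fail, since the remainder keeps growing until a line break is found.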
User contributions licensed under: CC BY-SA