Non-ASCII characters are not correctly displayed in PDF when served via HttpResponse and AJAX

Question

I have generated a PDF file which contains Cyrillic characters (non-ASCII) with ReportLab. For this purpose I have used the "Montserrat" font, which supports such characters. When I look in the generated PDF file inside the media folder of Django, the characters are correctly displayed: I have embe…

Accepted Answer

You are doing some encoding/recoding, because if you look at the diff between the files, it's littered with Unicode replacement characters:

    % diff -ua Cyrillic_good.pdf Cyrillic_wrong.pdf > out.diff
    % hexdump out.diff|grep 'ef bf bd'|wc -l
        2659

You said you tried without setting the encoding and charset, but I don't think that was tested properly; most likely you saw an aggressively browser-cached version.

The proper way to do this is to use FileResponse, pass in the filename and let Django figure out the right content type.

The following is a reproducible test of a working situation:

First of all, put Cyrillic_good.pdf (not wrong.pdf) in your media root.

Add the following to urls.py:

    # urls.py
    from django.urls import path

    from .views import pdf_serve

    urlpatterns = [
        path("pdf/<str:filename>", pdf_serve),
    ]

And views.py in the same directory:

    # views.py
    from pathlib import Path

    from django.conf import settings
    from django.http import (
        HttpResponseNotFound, HttpResponseServerError, FileResponse
    )


    def pdf_serve(request, filename: str):
        pdf = Path(settings.MEDIA_ROOT) / filename
        if pdf.exists():
            # Open in binary mode so the bytes are passed through untouched.
            response = FileResponse(open(pdf, "rb"), filename=filename)
            # Sanity check: the advertised length must match the file on disk.
            filesize = pdf.stat().st_size
            cl = int(response["Content-Length"])
            if cl != filesize:
                return HttpResponseServerError(
                    f"Expected {filesize} bytes but response is {cl} bytes"
                )
            return response
        return HttpResponseNotFound(f"No such file: {filename}")

Now start runserver and request http://localhost:8000/pdf/Cyrillic_good.pdf.

If this doesn't reproduce a valid PDF, it is a local problem and you should look at middleware or your OS or little green men, but not the code. I have this working locally with your file and no mangling is happening.

In fact, the only way to get a mangled PDF now is the browser cache or the response being modified after Django sends it, since the content length check would prevent sending a file that has a different size than the one on disk.

JS Part

I would expect the conversion to happen in the Blob constructor, as it's possible to hand a blob a type. I'm not sure the default is binary-safe.

It's also weird that your data has an error property and you pass the entire thing to the blob, but we can't see what promise you're reacting on.

    success: function (data) {
        if (data.error === undefined) {
            console.log(data) // This will be informative
            var blob = new Blob([data]);
            var link = document.createElement('a');
            link.href = window.URL.createObjectURL(blob);
            link.download = filename + '.pdf';
            link.click();
        }
    }
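
To make the "encoding/recoding" diagnosis concrete, here is a minimal Python sketch of how such damage can arise and how to count it. It assumes the mangling comes from binary data being decoded as UTF-8 text and then re-encoded somewhere along the way; that is a guess at the mechanism, not something confirmed for your setup. The names `mangle_as_text`, `count_replacements` and `SAMPLE` are made up for illustration, and the sample bytes only stand in for the binary parts of a real PDF.

```python
# U+FFFD (the Unicode replacement character) encoded as UTF-8.
# This is the 'ef bf bd' pattern the hexdump/grep pipeline looks for.
REPLACEMENT = "\ufffd".encode("utf-8")

# Hypothetical sample: an ASCII header followed by a few high bytes,
# standing in for the binary content of a real PDF file.
SAMPLE = b"%PDF-1.4\n" + bytes([0xE2, 0xE3, 0xCF, 0xD3]) + b"\nstream"


def mangle_as_text(data: bytes) -> bytes:
    """Simulate a lossy round trip: treat binary as UTF-8 text, re-encode."""
    # Every byte sequence that is not valid UTF-8 becomes U+FFFD,
    # so the original bytes cannot be recovered afterwards.
    return data.decode("utf-8", errors="replace").encode("utf-8")


def count_replacements(data: bytes) -> int:
    """Count UTF-8 encoded replacement characters in a byte string."""
    return data.count(REPLACEMENT)


if __name__ == "__main__":
    mangled = mangle_as_text(SAMPLE)
    print("replacement characters before:", count_replacements(SAMPLE))
    print("replacement characters after: ", count_replacements(mangled))
    print("size before/after:", len(SAMPLE), len(mangled))
```

To check your own files, read them with `open(path, "rb").read()` and pass the bytes to `count_replacements`. A file that was never pushed through a text decoder should normally score zero or close to it, and note that the mangled copy also changes size, which is what the content length check in the view relies on.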

