I'm working on a PDF viewer that loads a PDF file from the server (Node.js) and renders it on the client side, so users can read it directly on my site.
I'm using pdf.js to render the PDF on the client side. The problem is that the client must download the entire PDF file before it can parse and render it, so if the file is very large (~200MB in my case), the user has to wait for the whole 200MB to download.
I did some research and I think I can solve this problem in two ways:
1. Split the large PDF into many smaller PDF files on the server side and serve only the specific small file on demand. But this way I will lose some important metadata, such as the PDF outlines.
2. Use pdf.js directly on the server side to get the PDF pages, then serve each page as binary to the client, which will also use pdf.js (addPage function) to add each page to its viewer. But I don't know whether this is possible.
So what should I do to solve this problem? Thank you so much.
Asked Sep 12, 2019 at 16:02 by Hung Nguyen (edited Sep 12, 2019 at 16:11)
- Were you able to make changes on the server side so that the PDF renders with range requests? I am trying to do the same. I have made changes on the server side and pdf.js calls the server as expected, but it does not request all of the ranges. Please let me know what changes are needed on the server side; I am using Spring Boot. Help is highly appreciated. – Ejaz Ahmed, Sep 24, 2019 at 7:41
- Yes, I solved this problem by updating the server to support range requests, and then pdf.js worked excellently. I didn't change anything on the client side. I'm using Node.js on the server side. – Hung Nguyen, Sep 24, 2019 at 7:51
- Additionally, I failed when I implemented range requests manually on the server side, so I used a library instead (github.com/pillarjs/send). You can refer to that library's code to see what needs to change. – Hung Nguyen, Sep 24, 2019 at 8:00
2 Answers
The best solution is to optimize all your PDF files for the web.
With its default settings, pdf.js will load only the portion it needs to render.
See here for more info.
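"Optimized for the web" usually refers to a linearized PDF (some tools call it "Fast Web View"). As a rough way to check which of your files already are, here is a minimal sketch that assumes a linearized file carries a `/Linearized` key within its first 1024 bytes. The function name `looks_linearized` is made up for this example, and the check is only a heuristic, not a full PDF parse.

```rust
/// Heuristic: returns true if the first 1024 bytes contain the
/// `/Linearized` key that marks a linearization dictionary.
/// This does not validate the PDF in any other way.
fn looks_linearized(head: &[u8]) -> bool {
    let window = &head[..head.len().min(1024)];
    let needle: &[u8] = b"/Linearized";
    window.windows(needle.len()).any(|w| w == needle)
}

fn main() {
    // Tiny hand-written headers, just enough to exercise the check.
    let linearized = b"%PDF-1.7\n1 0 obj\n<< /Linearized 1 /L 12345 >>\nendobj\n";
    let plain = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n";
    println!("linearized sample: {}", looks_linearized(linearized));
    println!("plain sample: {}", looks_linearized(plain));
}
```

In practice you would read only the first kilobyte of each file and pass it to the function, so the check stays cheap even for 200MB documents.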
I am facing the same issue. As of September 2024, you can use range requests to load the PDF partially without downloading the whole file. You can either write an API endpoint that implements range requests yourself or configure your static file server to support partial requests. More information: https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#range
This is the Rust version of the API code I am using (you can implement the same logic in any language):
use std::fs::File;
use std::io::{Read, Seek, SeekFrom};

use actix_web::http::header::{CacheControl, CacheDirective, HeaderValue};
use actix_web::HttpResponse;
use log::warn;

// LatestCompile, get_proj_base_dir, get_filename_without_ext and join_paths
// come from my own project and are not shown here.
pub fn get_partial_pdf(lastest_pdf: &LatestCompile, range: Option<&HeaderValue>) -> HttpResponse {
    let proj_base_dir = get_proj_base_dir(&lastest_pdf.project_id);
    let pdf_name = format!("{}.pdf", get_filename_without_ext(&lastest_pdf.file_name));
    let pdf_file_path = join_paths(&[proj_base_dir, pdf_name]);
    let mut file = File::open(pdf_file_path).expect("Failed to open file");
    let file_size = file.metadata().expect("Failed to get metadata").len();

    // No Range header: send the whole file with 200 OK (not 206) and
    // advertise range support so the client can switch to range requests.
    let Some(range) = range else {
        let mut buf = Vec::new();
        file.read_to_end(&mut buf).expect("Failed to read file");
        return HttpResponse::Ok()
            .insert_header(CacheControl(vec![CacheDirective::NoCache]))
            .append_header(("Accept-Ranges", "bytes"))
            .append_header((
                "Access-Control-Expose-Headers",
                "Accept-Ranges,Content-Range",
            ))
            .content_type("application/pdf")
            .body(buf);
    };

    // Parse "bytes=start-end". Suffix ranges ("bytes=-500") and
    // multi-range requests are not handled here.
    let range_value = range.to_str().unwrap_or("");
    warn!("range_value {}", range_value);
    let spec = range_value.strip_prefix("bytes=").unwrap_or("");
    let mut parts = spec.splitn(2, '-');
    let start = parts.next().and_then(|s| s.parse::<u64>().ok()).unwrap_or(0);
    // A missing or oversized end means "up to the last byte of the file".
    let end = parts
        .next()
        .and_then(|s| s.parse::<u64>().ok())
        .unwrap_or(u64::MAX)
        .min(file_size.saturating_sub(1));
    warn!("get the start {} and the end {}", start, end);
    if start > end || start >= file_size {
        return HttpResponse::RangeNotSatisfiable()
            .append_header(("Content-Range", format!("bytes */{}", file_size)))
            .finish();
    }

    file.seek(SeekFrom::Start(start)).expect("Failed to seek file");
    let mut buf = vec![0; (end - start + 1) as usize];
    file.read_exact(&mut buf).expect("Failed to read file");
    let content_range = format!("bytes {}-{}/{}", start, end, file_size);
    // Content-Length is left to the framework, which derives it from the
    // body; it must be the chunk length, not the full file size.
    HttpResponse::PartialContent()
        .insert_header(CacheControl(vec![CacheDirective::NoCache]))
        .append_header(("Content-Range", content_range))
        .append_header(("Accept-Ranges", "bytes"))
        .append_header((
            "Access-Control-Expose-Headers",
            "Accept-Ranges,Content-Range",
        ))
        .content_type("application/pdf")
        .body(buf)
}
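The handler above parses the Range header inline. If you want that step isolated and testable, a standalone parser might look like the sketch below. It is not part of my production code: `ByteRange` and `parse_range` are names invented for this example, it uses only the standard library, and it deliberately rejects multi-range requests. Unlike the inline parsing, it also covers open-ended (`bytes=100-`) and suffix (`bytes=-500`) forms.

```rust
/// Inclusive byte range, already clamped to the file size.
#[derive(Debug, PartialEq)]
struct ByteRange {
    start: u64,
    end: u64,
}

/// Parses a single-range `Range` header value such as "bytes=0-65535".
/// Returns None when the value is malformed, uses multiple ranges,
/// or cannot be satisfied for a file of `file_size` bytes.
fn parse_range(header: &str, file_size: u64) -> Option<ByteRange> {
    let spec = header.trim().strip_prefix("bytes=")?;
    if file_size == 0 || spec.contains(',') {
        return None;
    }
    let (first, last) = spec.split_once('-')?;
    let (first, last) = (first.trim(), last.trim());
    let max = file_size - 1;

    let (start, end) = if first.is_empty() {
        // Suffix form "bytes=-N": the last N bytes of the file.
        let n: u64 = last.parse().ok()?;
        if n == 0 {
            return None;
        }
        (file_size.saturating_sub(n), max)
    } else {
        let start: u64 = first.parse().ok()?;
        // Open-ended form "bytes=N-": read to the end of the file.
        let end = if last.is_empty() {
            max
        } else {
            last.parse::<u64>().ok()?.min(max)
        };
        (start, end)
    };

    if start > end {
        return None;
    }
    Some(ByteRange { start, end })
}

fn main() {
    for header in ["bytes=0-65535", "bytes=100-", "bytes=-50", "bytes=300-400"] {
        println!("{:<14} -> {:?}", header, parse_range(header, 200));
    }
}
```

A `None` result maps naturally to a 416 Range Not Satisfiable response, while `Some` gives you the `start` and `end` values to put into Content-Range.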
On the first request, the client sends a request without a Range header. The server responds with the Accept-Ranges header (and the other headers shown above), which tells the client that the server supports range requests. The client then switches to range requests to load the rest of the PDF content. Hope this helps. You can also configure it to fetch the whole PDF in the background or to pre-download just part of the PDF.
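To get a feel for what the range phase looks like for a 200MB file, here is a small sketch that splits a file into the byte ranges a client could ask for. It assumes a chunk size of 65536 bytes, which as far as I know is the pdf.js default for `rangeChunkSize`; `chunk_ranges` is a name made up for this example. pdf.js does not necessarily request every chunk or request them in order, so treat this as an illustration of the arithmetic only.

```rust
/// Splits a file of `file_size` bytes into inclusive (start, end) ranges
/// of at most `chunk_size` bytes each. The last range may be shorter.
fn chunk_ranges(file_size: u64, chunk_size: u64) -> Vec<(u64, u64)> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let mut ranges = Vec::new();
    let mut start = 0u64;
    while start < file_size {
        let end = start.saturating_add(chunk_size).min(file_size) - 1;
        ranges.push((start, end));
        start = end + 1;
    }
    ranges
}

fn main() {
    let file_size: u64 = 200 * 1024 * 1024; // ~200MB, as in the question
    let ranges = chunk_ranges(file_size, 65_536);
    println!("{} chunks in total", ranges.len());
    // Show the Range header values for the first few chunks.
    for (start, end) in ranges.iter().take(3) {
        println!("Range: bytes={}-{}", start, end);
    }
}
```

Each printed pair corresponds to one `Range: bytes=start-end` request header, and the server answers it with a matching `Content-Range: bytes start-end/total` header.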