I'm running Apache Tika Server in a Docker container and trying to extract the text from PDFs inside a password-protected ZIP file.
I've tried passing the password in the HTTP headers 'Password' and 'X-Tika-Password', but the response only lists the files in the ZIP archive without extracting their text.
If I remove the password from the ZIP file, the text is extracted from the PDFs perfectly.
This is the request I've tried:
curl --location --request PUT '127.0.0.1:9998/tika' \
--header 'Accept: text/plain' \
--header 'Password: 123456' \
--header 'Content-Type: application/zip' \
--data-binary '@file/path/to.zip'
The response is just plain text listing the file names:
Name Of First File.pdf
Name of Second FIle.pdf
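One possible workaround, sketched here under assumptions rather than confirmed as Tika's intended usage, is to decrypt the archive on the client and send each PDF to the same /tika endpoint individually. The helper names read_pdfs and extract_text are hypothetical, and the sketch assumes the ZIP uses legacy ZipCrypto encryption, since Python's standard zipfile module does not support AES-encrypted archives.

```python
import urllib.request
import zipfile

# Same endpoint as the curl request above.
TIKA_URL = "http://127.0.0.1:9998/tika"


def read_pdfs(zip_source, password):
    """Return {member name: raw bytes} for every PDF in the archive.

    zip_source may be a path or a file-like object. The password is
    only used for encrypted members (legacy ZipCrypto only).
    """
    with zipfile.ZipFile(zip_source) as archive:
        return {
            name: archive.read(name, pwd=password.encode("utf-8"))
            for name in archive.namelist()
            if name.lower().endswith(".pdf")
        }


def extract_text(pdf_bytes, url=TIKA_URL):
    """PUT one PDF to Tika Server and return the plain-text response."""
    request = urllib.request.Request(
        url,
        data=pdf_bytes,
        method="PUT",
        headers={"Accept": "text/plain", "Content-Type": "application/pdf"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, looping over read_pdfs('file/path/to.zip', '123456').items() and calling extract_text on each value would return the text of each PDF separately, so the ZIP password never has to reach Tika.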