qt - How to handle ContentNotFoundError when using wkhtmltopdf?

wkhtmltopdf does not work with HTTPS/SSL, even though I have set LD_LIBRARY_PATH for libssl.so and libcrypto.so.


[deploy@localhost ~]$ wkhtmltopdf https://www.google.co.in google.pdf
loaded the Generic plugin
Loading page (1/2)
Error: Failed loading page https://www.google.co.in (sometimes it will work just to ignore this error with --load-error-handling ignore)
Exit with code 1 due to network error: UnknownNetworkError




[deploy@localhost ~]$ wkhtmltoimage https://www.google.co.in sample.jpg
loaded the Generic plugin
Loading page (1/2)
Error: Failed loading page https://www.google.co.in (sometimes it will work just to ignore this error with --load-error-handling ignore)
Exit with code 1 due to network error: UnknownNetworkError



wkhtmltopdf partially works over HTTP, but the output PDF is missing some content, backgrounds, and positioning.


[deploy@localhost ~]$ wkhtmltopdf http://localhost:8880/ sample.pdf
loaded the Generic plugin
Loading page (1/2)
Printing pages (2/2)
Done
Exit with code 1 due to network error: ContentNotFoundError



[deploy@localhost ~]$ wkhtmltoimage http://localhost:8880/ sample.jpg
loaded the Generic plugin
Loading page (1/2)
Rendering (2/2)
Done
Exit with code 1 due to network error: ContentNotFoundError



Note: I am using wkhtmltopdf-0.12.1-1.fc20.x86_64 and qt-4.8.6-10.fc20.x86_64.


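Unfortunately, wkhtmltopdf cannot handle downloading complex websites, because it relies on the Qt/QtWebKit libraries, which have some problems of their own.

One problem is that wkhtmltopdf does not support relative URLs (GitHub issues #1634, #1886, #2359; QTBUG-46240), for example: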


<img src="/images/filetypes/txt.png">
<script src="//cdn.optimizely.com/js/653710485.js">



One workaround I found is to correct the HTML file with an in-place editor:


ex -V1 page.html <<-EOF
  %s,'//,'http://,ge
  %s,"//,"http://,ge
  %s,'/,'http://www.example.com/,ge
  %s,"/,"http://www.example.com/,ge
  wq " Update changes and quit.
EOF
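The same rewrite can also be sketched in Python, in case `ex` is not convenient. This is only an illustration, not something wkhtmltopdf provides: the function name `absolutize_urls` and the regular expression are assumptions made for the example, and a regex is a rough stand-in for a real HTML parser. It resolves each quoted `src`/`href` value against a base URL with the standard library's `urljoin`, so `/path` and `//host/path` both become absolute.

```python
import re
from urllib.parse import urljoin

# Matches quoted src="..." or href='...' attributes.
# Group 1: attribute name, group 2: quote character, group 3: the URL.
_ATTR_RE = re.compile(r'''\b(src|href)\s*=\s*(["'])(.*?)\2''', re.IGNORECASE)


def absolutize_urls(html, base_url):
    """Rewrite relative src/href values in html as absolute URLs (hypothetical helper)."""
    def fix(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        # Leave empty values and in-page anchors (#fragment) untouched.
        if not url or url.startswith("#"):
            return match.group(0)
        # urljoin keeps already-absolute URLs as they are and resolves
        # "/path" and "//host/path" against base_url.
        return f"{attr}={quote}{urljoin(base_url, url)}{quote}"

    return _ATTR_RE.sub(fix, html)


if __name__ == "__main__":
    page = '<img src="/images/filetypes/txt.png">'
    print(absolutize_urls(page, "http://www.example.com/"))
```

Unlike the `ex` substitutions, this touches only `src` and `href` attributes, so quoted strings elsewhere in the page (for example inside inline scripts) are left alone.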



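Another problem is that it does not handle missing resources. You can try specifying --load-error-handling ignore, but in most cases it does not work (see #2051); the workaround is simply to remove these invalid resources before converting.

Alternatively, instead of wkhtmltopdf, you can use PhantomJS with an additional script, such as rasterize.js: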
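Removing the invalid resources by hand gets tedious, so here is a rough Python sketch of how that step might be automated. Everything in it is an assumption made for illustration: the helper names `url_exists` and `strip_missing_resources`, the regular expression (which only covers `<img>`, `<link>` and `<script src>` tags with quoted URLs), and the choice of a HEAD request as the availability check. It also assumes the resource URLs in the page are already absolute; relative ones fail the default check and get dropped.

```python
import re
import urllib.error
import urllib.request

# Matches an <img>, <link> or <script src=...></script> tag with a quoted URL.
# Group 1: tag name, group 2: quote character, group 3: the URL.
_RESOURCE_RE = re.compile(
    r'''<(img|link|script)\b[^>]*?\b(?:src|href)\s*=\s*(["'])(.*?)\2[^>]*>'''
    r'''(?:\s*</script>)?''',
    re.IGNORECASE,
)


def url_exists(url, timeout=5.0):
    """Return True if a HEAD request for url succeeds (hypothetical check)."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors, unreachable hosts, timeouts and malformed or
        # relative URLs all count as "missing".
        return False


def strip_missing_resources(html, exists=url_exists):
    """Drop resource tags whose URL fails the exists() check."""
    def keep_or_drop(match):
        return match.group(0) if exists(match.group(3)) else ""

    return _RESOURCE_RE.sub(keep_or_drop, html)
```

The `exists` parameter is injectable so the filter can be tried without network access, for example `strip_missing_resources(html, exists=lambda url: False)` removes every matched resource tag. The cleaned HTML can then be saved to a file and passed to wkhtmltopdf in place of the original page.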


phantomjs rasterize.js http://example.com/ sample.pdf



or dompdf (an HTML to PDF converter for PHP, which you can install with Composer), using the following sample code:


<?php
// Somewhere early in your project's loading, require the Composer autoloader.
// See: http://getcomposer.org/doc/00-intro.md
$HOMEDIR = "/Users/foo";
require $HOMEDIR . '/.composer/vendor/autoload.php';

// Disable DOMPDF's internal autoloader if you are using Composer.
define('DOMPDF_ENABLE_AUTOLOAD', FALSE);
// Allow DOMPDF to fetch remote resources such as images and stylesheets.
define('DOMPDF_ENABLE_REMOTE', TRUE);

// Include DOMPDF's default configuration.
require_once $HOMEDIR . '/.composer/vendor/dompdf/dompdf/dompdf_config.inc.php';

// Fetch the HTML page to convert.
$htmlString = file_get_contents("https://example.com/foo.html");

$dompdf = new DOMPDF();
$dompdf->load_html($htmlString);
$dompdf->render();
// Send the generated PDF to the browser.
$dompdf->stream("sample.pdf");



...