Free Support Forum - aspose.app

Reconstructing PDF -> HTML -> PDF


I have a PDF file that I need to convert to HTML so that I can translate its content. The conversion produces an HTML file and a number of JPEG files. I modify the HTML, replace the original HTML file with the modified one, and then try to reconstruct the PDF. However, the new PDF renders the HTML content and the JPEG images on separate pages.


Are you using this app for the PDF to HTML conversion, or our stand-alone API? Could you please share more details on this scenario, along with the problematic files?

I am using the Java APIs. Attached are the PDF and DOCX files (SDAsposePDFWord.zip) and the HTML files (SDAsposeHTML.zip).

SDAsposeHTML.zip (90.6 KB)
SDAsposePDFWord.zip (298.6 KB)

  1. First I convert PDF to Word
  2. Then convert Word to HTML

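public static void convertPDFToWord() {
    try {
        // Load the source PDF file
        com.aspose.pdf.Document doc = new com.aspose.pdf.Document("SD_Aspose.pdf");
        // Save it as DOCX so that the text can be translated
        doc.save("SD_Aspose.docx", com.aspose.pdf.SaveFormat.DocX);
    } catch (Exception ex) {
        // The handler body was cut off in the original post; printing the error is a placeholder
        ex.printStackTrace();
    }
}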

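public static void convertWordHTML() {
    try {
        // Load the DOCX file produced by the PDF to Word step
        com.aspose.words.Document doc = new com.aspose.words.Document("SD_Aspose.docx");
        String dataDir = "SDAspose/";
        String outHtmlFile = "SD_Aspose.html";
        // Save the output file as HTML
        doc.save(dataDir + outHtmlFile, com.aspose.words.SaveFormat.HTML);
    } catch (Exception ex) {
        // The handler body was cut off in the original post; printing the error is a placeholder
        ex.printStackTrace();
    }
}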

I have to convert to DOCX first because I need to translate the text, and I convert to HTML because I need to display it in a browser.
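
One thing that may be worth ruling out before the HTML goes back into a PDF is whether the translated HTML still points at the extracted JPEG files. This is only an assumption about what could go wrong when the HTML is edited by hand or by a translation tool, not a confirmed cause of the separate-pages output. The sketch below uses only the Java standard library; the class `ImageRefCheck` and its methods are hypothetical helpers and are not part of any Aspose API. It lists the `src` values of the `<img>` tags in an HTML string and reports those that do not exist relative to the folder holding the HTML file.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of any Aspose library.
class ImageRefCheck {

    // Matches <img ... src="..."> or <img ... src='...'>, case-insensitively.
    // A regex is a rough approximation of HTML parsing, adequate for a quick check.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the src attribute of every <img> tag, in document order.
    static List<String> imageSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            sources.add(m.group(2));
        }
        return sources;
    }

    // Returns the local image references that do not exist under baseDir.
    // Remote (http/https) and embedded (data:) images are skipped.
    static List<String> missingImages(String html, Path baseDir) {
        List<String> missing = new ArrayList<>();
        for (String src : imageSources(html)) {
            String lower = src.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http:") || lower.startsWith("https:")
                    || lower.startsWith("data:")) {
                continue;
            }
            if (!Files.exists(baseDir.resolve(src))) {
                missing.add(src);
            }
        }
        return missing;
    }
}
```

For the files in this thread, the HTML text would be read from SDAspose/SD_Aspose.html and passed to `ImageRefCheck.missingImages` together with the `SDAspose` folder as `baseDir`. An empty result would suggest that the image references survived the translation step, in which case the page layout problem lies elsewhere.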

This topic has been moved to the related forum: https://forum.aspose.com/t/reconstructing-pdf-html-pdf/225521