Introduction
A DocX user asked me during the week when I was going to support converting Word 2007 documents (.docx) into other useful formats such as .doc, .pdf, and .html. I would love to add this functionality to DocX; however, there is a problem.
The Problem
The only easy way to do this conversion is to use the Microsoft Office interop libraries. If you don't know what the Microsoft Office interop libraries are, I envy you.
The Microsoft Office interop libraries are available in Visual Studio's Add Reference dialog.
The Code
Once you have added a reference to Microsoft.Office.Interop.Word, you can use the project below to convert a Word 2007 .docx into .doc, .pdf, and .html.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using Word = Microsoft.Office.Interop.Word;
    using Microsoft.Office.Interop.Word;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Convert Input.docx into Output.doc
                Convert(@"C:\users\cathal\Desktop\Input.docx", @"c:\users\cathal\Desktop\Output.doc", WdSaveFormat.wdFormatDocument);

                /*
                 * Convert Input.docx into Output.pdf
                 * Please note: You must have the Microsoft Office 2007 Add-in: Microsoft Save as PDF or XPS installed
                 * http://www.microsoft.com/downloads/details.aspx?FamilyId=4D951911-3E7E-4AE6-B059-A2E79ED87041&displaylang=en
                 */
                Convert(@"c:\users\cathal\Desktop\Input.docx", @"c:\users\cathal\Desktop\Output.pdf", WdSaveFormat.wdFormatPDF);

                // Convert Input.docx into Output.html
                Convert(@"c:\users\cathal\Desktop\Input.docx", @"c:\users\cathal\Desktop\Output.html", WdSaveFormat.wdFormatHTML);
            }

            // Convert a Word 2007 .docx into the format given by the format argument.
            public static void Convert(string input, string output, WdSaveFormat format)
            {
                // Create an instance of Word.exe
                Word._Application oWord = new Word.Application();

                // Make this instance of Word invisible (you can still see it in taskmgr).
                oWord.Visible = false;

                // The interop methods take their arguments as ref object, so box everything.
                object oMissing = System.Reflection.Missing.Value;
                object isVisible = true;
                object readOnly = false;
                object oInput = input;
                object oOutput = output;
                object oFormat = format;

                // Load a document into our instance of Word.exe
                Word._Document oDoc = oWord.Documents.Open(ref oInput, ref oMissing, ref readOnly, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref isVisible, ref oMissing, ref oMissing, ref oMissing, ref oMissing);

                // Make this document the active document.
                oDoc.Activate();

                // Save this document in the requested format.
                oDoc.SaveAs(ref oOutput, ref oFormat, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing);

                // Always close Word.exe.
                oWord.Quit(ref oMissing, ref oMissing, ref oMissing);
            }
        }
    }
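One weakness of the Convert method above is that Word.exe keeps running if Open or SaveAs throws, because Quit is never reached. The following is a minimal sketch of a more defensive variant, not a tested implementation. The class name SafeConverter is hypothetical, and the sketch assumes the same 16-argument Open and SaveAs overloads used above. It also assumes that Documents.Open may return null in some environments, and throws a descriptive exception in that case rather than failing later with a NullReferenceException.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper; the name SafeConverter is not part of DocX or the interop libraries.
static class SafeConverter
{
    public static void Convert(string input, string output, Word.WdSaveFormat format)
    {
        object missing = Missing.Value;
        object readOnly = true;    // the source document is only read, never modified
        object visible = false;
        object inputPath = input;
        object outputPath = output;
        object saveFormat = format;
        object doNotSave = Word.WdSaveOptions.wdDoNotSaveChanges;

        Word._Application word = new Word.Application();
        Word._Document doc = null;
        try
        {
            word.Visible = false;

            doc = word.Documents.Open(ref inputPath, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref visible, ref missing, ref missing, ref missing, ref missing);

            // Fail with a clear message instead of a NullReferenceException later on.
            if (doc == null)
                throw new InvalidOperationException("Word could not open " + input);

            doc.SaveAs(ref outputPath, ref saveFormat, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
        }
        finally
        {
            try
            {
                // Close the document without saving changes back to the input file.
                if (doc != null)
                {
                    doc.Close(ref doNotSave, ref missing, ref missing);
                    Marshal.ReleaseComObject(doc);
                }
            }
            finally
            {
                // Runs even if the conversion or the Close call failed,
                // so no hidden Word.exe is left behind.
                word.Quit(ref doNotSave, ref missing, ref missing);
                Marshal.ReleaseComObject(word);
            }
        }
    }
}
```

It is called the same way as the original, for example SafeConverter.Convert(inputPath, outputPath, Word.WdSaveFormat.wdFormatPDF). The nested finally is there so that a failure while closing the document still lets Quit run.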
The result
Please note
This code will only execute on a machine that has Microsoft Office installed. The Microsoft Office interop libraries actually launch a “hidden” instance of Word. If you run the above code and then take a look at taskmgr, you will see an instance of Word running in the process list.
If you want to convert to .pdf, you must also have the Microsoft Office 2007 Add-in: Microsoft Save as PDF or XPS installed.
It is for this reason that I have not included conversion functionality in my DocX library. I do not want DocX to have a dependency on Word.exe.
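Because the conversion depends on Word being present, it can help to check for it up front and fail with a clear message. The sketch below is one possible way to do that, assuming a Windows machine. It asks whether the COM ProgID "Word.Application" is registered, which is normally true only when Word is installed. The class and method names (WordCheck, IsWordInstalled) are made up for illustration.

```csharp
using System;

// Hypothetical helper; not part of DocX or the interop libraries.
static class WordCheck
{
    // Type.GetTypeFromProgID returns null when the ProgID is not registered.
    public static bool IsWordInstalled()
    {
        return Type.GetTypeFromProgID("Word.Application") != null;
    }

    static void Main()
    {
        Console.WriteLine(IsWordInstalled()
            ? "Word found; conversion can be attempted."
            : "Word is not installed; conversion is unavailable.");
    }
}
```

This only tells you that Word is registered. It does not check for the Save as PDF or XPS add-in, so a .pdf conversion can still fail on a machine where this check passes.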
The future
Is there no way to do conversions without having Word.exe installed on my machine? I didn’t say that; I said there is no easy way. This looks very promising; now if I could only find the time.
Donation?
As always, I offer this code to you for free. I am, however, a student, and if you would like to say thank you, you can buy me lunch by sending a €5 donation via PayPal.

Cool Trick to Export to PDF, i was looking it for quite some time.
Thanks for sharing
I just started looking at your Open Source Project "DOCX" and then saw this blog. I would really suggest you look at the OpenXmlPowerTools.HtmlConvertor and iTextSharp. You can use both of these in combination to generate HTML, and then PDF from the HTML. It is not perfect, but it does the job pretty nicely without the overhead of the PIAs. The HTML is also pretty clean.
Thanks,
Thanks very helpful!
It works good. Excellent article. Thanx.
Its pretty good and very easy to understand.
Thanks
oDoc.Activate(); error "Object reference not set to an instance of an object." for iss. Help me plssssss :(
Thankyou
same error....oDoc.Activate(); error "Object reference not set to an instance of an object." for issue. Help me plssssss :( that's y i tried ur dll but it don't have this functionality...