About Me

Hello, my name is Cathal Coffey. I am best described as a hybrid between a developer and an adventurer. When I am not behind a keyboard coding, I am hiking and climbing the beautiful mountains of my home country, Ireland. I am a full-time student studying Computer Science & Software Engineering at the National University of Ireland Maynooth. I am finishing the final year of a four-year degree in September 2009. I am the creator of an open source project on codeplex.com called DocX. At the moment I spend a lot of my free time advancing DocX, and I enjoy this very much. My aim is to build a community around DocX and add features based on requests from this community. I really enjoy hearing about how people are using DocX in their work and personal projects. So if you are one of these people, please send me an email. Cathal (coffey.cathal@gmail.com)

Saturday, October 31, 2009

Converting .docx into (.doc, .pdf, .html)


A DocX user asked me during the week when I was going to support converting Word 2007 documents (.docx) into other useful formats such as .doc, .pdf, and .html. I would love to add this functionality to DocX; however, there is a problem.

The Problem

The only easy way to do this conversion is to use Microsoft’s Office interop libraries. For anyone who doesn't know what Microsoft’s Office interop libraries are, I envy you.

The Microsoft Office interop libraries are available in the Add Reference dialog.


The Code

Once you have added a reference to Microsoft.Office.Interop.Word, you can use the project below to convert a Word 2007 .docx into .doc, .pdf, and .html.

Code Snippet
  using System;
  using System.Collections.Generic;
  using System.Linq;
  using System.Text;
  using Word = Microsoft.Office.Interop.Word;
  using Microsoft.Office.Interop.Word;

  namespace ConsoleApplication1
  {
      class Program
      {
          static void Main(string[] args)
          {
              // Convert Input.docx into Output.doc
              Convert(@"c:\users\cathal\Desktop\Input.docx", @"c:\users\cathal\Desktop\Output.doc", WdSaveFormat.wdFormatDocument);

              /*
               * Convert Input.docx into Output.pdf
               * Please note: You must have the Microsoft Office 2007 Add-in: Microsoft Save as PDF or XPS installed
               * http://www.microsoft.com/downloads/details.aspx?FamilyId=4D951911-3E7E-4AE6-B059-A2E79ED87041&displaylang=en
               */
              Convert(@"c:\users\cathal\Desktop\Input.docx", @"c:\users\cathal\Desktop\Output.pdf", WdSaveFormat.wdFormatPDF);

              // Convert Input.docx into Output.html
              Convert(@"c:\users\cathal\Desktop\Input.docx", @"c:\users\cathal\Desktop\Output.html", WdSaveFormat.wdFormatHTML);
          }

          // Convert a Word 2007 .docx into the requested format (.doc, .pdf or .html)
          public static void Convert(string input, string output, WdSaveFormat format)
          {
              // Create an instance of Word.exe
              Word._Application oWord = new Word.Application();
              Word._Document oDoc = null;

              // Interop requires objects.
              object oMissing = System.Reflection.Missing.Value;
              object isVisible = true;
              object readOnly = false;
              object oInput = input;
              object oOutput = output;
              object oFormat = format;
              object doNotSave = WdSaveOptions.wdDoNotSaveChanges;

              try
              {
                  // Make this instance of word invisible (Can still see it in the taskmgr).
                  oWord.Visible = false;

                  // Load a document into our instance of word.exe
                  oDoc = oWord.Documents.Open(ref oInput, ref oMissing, ref readOnly, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref isVisible, ref oMissing, ref oMissing, ref oMissing, ref oMissing);

                  // Make this document the active document.
                  oDoc.Activate();

                  // Save this document in the requested format.
                  oDoc.SaveAs(ref oOutput, ref oFormat, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing);
              }
              finally
              {
                  // Close the document without saving, so Word does not prompt about changes.
                  if (oDoc != null)
                      oDoc.Close(ref doNotSave, ref oMissing, ref oMissing);

                  // Always close Word.exe, even if the conversion failed.
                  oWord.Quit(ref oMissing, ref oMissing, ref oMissing);
              }
          }
      }
  }
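
Each call in Main passes a format that matches the output file's extension. As a small sketch (not part of the original project), the format could be derived from the output path instead. FormatPicker and FromPath are hypothetical names, and the block assumes the same reference to Microsoft.Office.Interop.Word.

```csharp
using System;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper: picks a WdSaveFormat from the output file's extension.
static class FormatPicker
{
    public static Word.WdSaveFormat FromPath(string output)
    {
        // GetExtension returns the extension including the leading dot, e.g. ".pdf".
        string extension = Path.GetExtension(output).ToLowerInvariant();

        switch (extension)
        {
            case ".doc":
                return Word.WdSaveFormat.wdFormatDocument;
            case ".pdf":
                // Still requires the Save as PDF or XPS add-in on Office 2007.
                return Word.WdSaveFormat.wdFormatPDF;
            case ".htm":
            case ".html":
                return Word.WdSaveFormat.wdFormatHTML;
            default:
                throw new NotSupportedException("No save format known for extension: " + extension);
        }
    }
}
```

With this in place, a call such as Convert(input, output, FormatPicker.FromPath(output)) would choose the format from the output file name.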

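The result

Running the program writes Output.doc, Output.pdf, and Output.html to the same Desktop folder as Input.docx.
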
Please note

This code will only execute on a machine that has Microsoft Office installed on it. The Office interop libraries actually run a “hidden” instance of Word. If you run the above code and take a look at taskmgr while it is running, you will see Word.exe (listed as WINWORD.EXE) among the running processes.
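
If you would rather check from code than from taskmgr, the following is a minimal sketch that lists running Word processes. It assumes the process name is WINWORD (Word's executable name without the .exe suffix); the class name ListWordProcesses is just for illustration.

```csharp
using System;
using System.Diagnostics;

class ListWordProcesses
{
    static void Main()
    {
        // GetProcessesByName expects the process name without the ".exe" suffix.
        Process[] wordProcesses = Process.GetProcessesByName("WINWORD");

        Console.WriteLine("Running Word instances: " + wordProcesses.Length);
        foreach (Process p in wordProcesses)
        {
            Console.WriteLine("  PID " + p.Id + ": " + p.ProcessName);
        }
    }
}
```

Running this while a conversion is in progress should show at least one instance, even though no Word window is visible.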


If you want to convert to .pdf, you must also have the Microsoft Office 2007 Add-in: Microsoft Save as PDF or XPS installed.

It is for this reason that I have not included conversion functionality in my DocX library. I do not want DocX to have a dependency on Word.exe.

The future

Is there no way to do conversions without having Word.exe installed on my machine? I didn’t say that; I said there is no easy way. This looks very promising, now if I could only find the time.


As always, I offer this code to you for free. I am, however, a student, and if you would like to say thank you, you can buy me lunch by sending a €5 donation via PayPal.


  1. Cool trick to export to PDF; I was looking for it for quite some time.

    Thanks for sharing.

    2. Hi, I am a software engineering student working on my final-year project, and I would like some help. I want to convert PDF files into different formats on the client side, that is, in JavaScript. Could you please help me?

    3. It doesn't work; it throws a "command failed" exception on SaveAs.

  2. I just started looking at your open source project DocX and then saw this blog. I would really suggest you look at OpenXmlPowerTools.HtmlConverter and iTextSharp. You can use the two in combination to generate HTML, or a PDF from that HTML. It is not perfect, but it does the job pretty nicely without the overhead of the PIAs. The HTML is also pretty clean.


  5. It works good. Excellent article. Thanx.

  6. It's pretty good and very easy to understand.


  9. Same error: oDoc.Activate(); fails with "Object reference not set to an instance of an object." Please help me :( That's why I tried your DLL, but it doesn't have this functionality...

    1. Check the settings below:

      Local Computer
      Config DCOM
      Search for Microsoft Word 97-2003 Documents -> Properties
      Identity tab: change from Launching User to Interactive User

  10. Hi Cathal,

    This library appears to be optimised for writing/editing documents. If there are good interfaces to enumerate the document, then I would suggest coding up iTextSharp to output to many different potential formats. (From memory, iTextSharp has a generic interface to output to many formats.)

    I would be willing to donate toward such a project. At the moment, commercial products have this area locked up. Docx to PDF in particular would be great to have in the open source realm, as even commercial products can have bugs which you can't fix yourself.

  11. Hi...
    But it is not working in IIS.
    Can you suggest how to do this?

  12. Can we convert doc to PDF without installing Microsoft Office or OpenOffice, and also without using third-party DLLs?

  13. Not working. You need to have Microsoft Office installed to use Microsoft’s Office interop libraries.

  14. I had some errors with this when creating .doc and .html files. I found that I had to use oDoc.Close() to get everything to work properly.

  15. If DocX uses the Microsoft.Office.Interop DLL, then why don't we just convert the document to PDF using that?

  16. Thanks for sharing wonderful tips.
    I am very curious to know how a doc file can be converted into PDF without installing MS Office. The problem is that we are not allowed to install MS Office on the production server. Please suggest the best way to convert Word to PDF, or any third-party tools.

  17. To be able to build the solution, you’ll need to have Office installed on your machine. This can be painful if you’re dealing with a CI server.

    I found a simple way around this problem. First, change the Embed Interop Types property to false and rebuild your solution. You’ll end up with a Microsoft.Office.Interop.*.dll in your bin/debug directory. Now take this assembly and store it in your project as a .dll. Next, update your project references to point to that new location instead of directly referencing the installed version in your installation path.

    After that, you can change the Embed Interop Types to true, and rebuild again. Now you should be able to build without the need of an Office installation on your machine.

  18. I get this error on Windows Server 2012 R2:
    Retrieving the COM class factory for component with CLSID {000209FF-0000-0000-C000-000000000046} failed due to the following error: 80070005 Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)).

  19. Do you know how to convert Word to PDF without Word installed?

  20. Thank you for sharing this. I know Microsoft Office has its own plug-in for saving Word and Excel files as PDF. But if anyone needs to convert Word to PDF in other applications, a third-party converter is necessary. I use RasterEdge, which supports conversions between Word, Excel, TIFF, and PDF, including Word to PDF. I think it's also a good tool for you.

  21. Anyone else have an issue where it thinks you updated the original word document and prompts if you would like to save?

  22. object doNotSaveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
    this.Close(ref doNotSaveChanges, ref missing, ref missing);

  29. Hi Cathal,
    I know we can convert doc to HTML with this library and also with Office interop. But my requirement is to convert a doc with its entire content (including headers and footers) to HTML. I tried a lot with interop but couldn't find a solution. Can I do this with DocX? If yes, please give references.


  31. This pdf to word conversion tool works well and is free. Keep adding more such information for the benefit of the people.

  32. Hi, I need your help. When I try to load a file with a .doc extension, it throws an exception saying "file contains interrupted data". How can I overcome this?

  33. Is there any way to print HTML tags to Word using DocX?
