About Me

Ireland
Hello, my name is Cathal Coffey. I am best described as a hybrid between a developer and an adventurer. When I am not behind a keyboard coding, I am hiking and climbing the beautiful mountains of my home country, Ireland. I am a full-time student studying Computer Science & Software Engineering at the National University of Ireland Maynooth. I am finishing the final year of a four-year degree in September 2009. I am the creator of an open source project on codeplex.com called DocX. At the moment I spend a lot of my free time advancing DocX, and I enjoy this very much. My aim is to build a community around DocX and add features based on requests from this community. I really enjoy hearing about how people are using DocX in their work or personal projects, so if you are one of these people, please send me an email. Cathal coffey.cathal@gmail.com

Thursday, December 23, 2010

Programmatically manipulate an Image embedded inside a DocX document

This example uses DocX to write the text “Hello World” into an Image embedded inside a .docx document.

Note: Make sure the document Input.docx exists and that it contains an Image.

Below is an example Input.docx and Output.docx.

[Image: example Input.docx]

[Image: example Output.docx]

Code Snippet
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;
using Novacode;

namespace testDocX
{
    class Program
    {
        static void Main(string[] args)
        {
            // Open the document Input.docx.
            using (DocX document = DocX.Load("Input.docx"))
            {
                // Make sure this document has at least one Image.
                if (document.Images.Any())
                {
                    Novacode.Image img = document.Images[0];
                    CoolExample(img, "Hello World");
                }
                else
                {
                    Console.WriteLine("The provided document contains no Images.");
                }

                // Save this document as Output.docx.
                document.SaveAs("Output.docx");
            }
        }

        // Write the given string into this Image.
        private static void CoolExample(Novacode.Image img, string str)
        {
            // Load the embedded Image into a Bitmap. The source stream must
            // stay open for as long as the Bitmap is in use.
            using (Stream input = img.GetStream(FileMode.Open, FileAccess.ReadWrite))
            using (Bitmap b = new Bitmap(input))
            {
                /*
                 * Get the Graphics object for this Bitmap.
                 * The Graphics object provides functions for drawing.
                 */
                using (Graphics g = Graphics.FromImage(b))
                using (Font font = new Font("Tahoma", 20))
                {
                    // Draw the given string in the top-left corner.
                    g.DrawString(str, font, Brushes.Blue, new PointF(0, 0));
                }

                // Save this Bitmap back into the document using a Create/Write stream.
                using (Stream output = img.GetStream(FileMode.Create, FileAccess.Write))
                {
                    b.Save(output, ImageFormat.Png);
                }
            }
        }
    }
}

Replace text across many documents in parallel

.NET 4.0 makes parallel programming easy. Below is an example of how to replace text across many .docx documents in parallel.

This example contains 4 functions.

1) Replace: This function opens a document and performs the text replacement.

2) NonParallel_ReplaceText: This is how you would replace text across multiple documents without using parallel execution. It is included for comparison's sake.

3) Parallel_ReplaceText: This is how you would replace text across multiple documents in parallel.

4) Main: This function does the work sequentially and then in parallel and prints the time taken for both.

Before running this code, replace the line
DirectoryInfo di = new DirectoryInfo(@"C:\Users\Cathal\Desktop\multiple");
with a directory on your machine that contains many .docx documents.
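
As an alternative to editing the hard-coded path, the directory could be passed on the command line. The sketch below is only an illustration and is not part of DocX: the names DocxFinder, IsWordLockFile and FindDocuments are hypothetical. It also skips the "~$" owner files that Word creates next to a document while it is open, since those are not real documents.

```csharp
using System;
using System.IO;
using System.Linq;

static class DocxFinder
{
    // Word creates owner files such as "~$report.docx" while a document is open.
    public static bool IsWordLockFile(string fileName)
    {
        return fileName.StartsWith("~$", StringComparison.Ordinal);
    }

    // Return the full paths of the .docx files directly inside the given directory.
    public static string[] FindDocuments(string directory)
    {
        return new DirectoryInfo(directory)
            .GetFiles("*.docx")
            .Where(fi => !IsWordLockFile(fi.Name))
            .Select(fi => fi.FullName)
            .ToArray();
    }
}

static class FinderDemo
{
    static void Main(string[] args)
    {
        // Assumption: the directory is the first argument; otherwise use the current directory.
        string directory = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();

        foreach (string path in DocxFinder.FindDocuments(directory))
        {
            Console.WriteLine(path);
        }
    }
}
```

Each path printed here could then be handed to the Replace function described above.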

Notes:

1) There is overhead when executing code in parallel. Make sure you're doing enough work to justify parallel execution. For example, if you run this code on 4 small documents, the function NonParallel_ReplaceText may run faster than its parallel equivalent.

2) Run this example without the debugger. The debugger adds overhead, which makes this code run significantly slower.

3) You can download and build the latest version of DocX.dll from here http://docx.codeplex.com/SourceControl/list/changesets#.

Code Snippet
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;
using Novacode;

namespace testDocX
{
    class Program
    {
        static void Main(string[] args)
        {
            // Directory containing many .docx documents.
            DirectoryInfo di = new DirectoryInfo(@"C:\Users\Cathal\Desktop\multiple");

            // Print out the time taken in milliseconds.
            Console.WriteLine("Non-Parallel took " + NonParallel_ReplaceText(di, "pear", "raep") + " milliseconds.");

            // Swap the text back in parallel and print out the time taken in milliseconds.
            Console.WriteLine("Parallel took " + Parallel_ReplaceText(di, "raep", "pear") + " milliseconds.");

            // Wait until the user presses a key before exiting.
            Console.ReadKey();
        }

        // Replace text across multiple documents sequentially.
        private static long NonParallel_ReplaceText(DirectoryInfo di, string a, string b)
        {
            // Create and start a new Stopwatch, we will use this to time execution.
            Stopwatch sw = Stopwatch.StartNew();

            // Loop through each .docx document in the specified directory.
            foreach (FileInfo fi in di.GetFiles("*.docx"))
            {
                // Replace text in this document.
                Replace(fi.FullName, a, b);
            }

            sw.Stop(); // Stop the stopwatch.

            // Return the time taken in milliseconds.
            return sw.ElapsedMilliseconds;
        }

        // Replace text across multiple documents in parallel.
        private static long Parallel_ReplaceText(DirectoryInfo di, string a, string b)
        {
            // Create and start a new Stopwatch, we will use this to time execution.
            Stopwatch sw = Stopwatch.StartNew();

            // Process each .docx document in the specified directory in parallel.
            Parallel.ForEach
            (
                di.GetFiles("*.docx"),
                currentFile =>
                {
                    Replace(currentFile.FullName, a, b);
                }
            );

            sw.Stop(); // Stop the stopwatch.

            // Return the time taken in milliseconds.
            return sw.ElapsedMilliseconds;
        }

        // Replace the string a with the string b in filename and save the changes.
        static void Replace(string filename, string a, string b)
        {
            // Load the document.
            using (DocX document = DocX.Load(filename))
            {
                // Replace text in this document.
                document.ReplaceText(a, b);

                // Save changes made to this document.
                document.Save();
            } // Release this document from memory.
        }
    }
}
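
The overhead mentioned in note 1 can be observed without any documents at all. The sketch below is a rough illustration, not a benchmark: OverheadDemo and its Work method are hypothetical stand-ins for Replace, operating on in-memory strings instead of .docx files. It times the same loop sequentially and with Parallel.ForEach, once with very little work per item and once with much more.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class OverheadDemo
{
    // Stand-in for Replace(filename, a, b): repeat a string replacement
    // 'repetitions' times and return the total length produced.
    public static int Work(string text, int repetitions)
    {
        int total = 0;
        for (int i = 0; i < repetitions; i++)
        {
            total += text.Replace("pear", "raep").Length;
        }
        return total;
    }

    // Time the work done one item after another.
    public static long TimeSequential(string[] items, int repetitions)
    {
        Stopwatch sw = Stopwatch.StartNew();
        foreach (string item in items)
        {
            Work(item, repetitions);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Time the same work spread across threads by Parallel.ForEach.
    public static long TimeParallel(string[] items, int repetitions)
    {
        Stopwatch sw = Stopwatch.StartNew();
        Parallel.ForEach(items, item => { Work(item, repetitions); });
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        // 64 identical items; the count is an arbitrary choice for illustration.
        string[] items = Enumerable.Repeat("apple pear plum pear", 64).ToArray();

        foreach (int repetitions in new[] { 1, 100000 })
        {
            Console.WriteLine("Repetitions per item: " + repetitions);
            Console.WriteLine("  Sequential: " + TimeSequential(items, repetitions) + " milliseconds.");
            Console.WriteLine("  Parallel:   " + TimeParallel(items, repetitions) + " milliseconds.");
        }
    }
}
```

The timings depend on the machine and the number of cores, so no particular figures should be expected. As with the DocX example, run it without the debugger attached.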