Thursday, May 1, 2014

Rod Shelton: How to import your marked-up MS Word file into your ...



Sigil is set up to import an MS Word file in its entirety, including all the styles and formatting. This might sound like a good idea until you actually do it and look at the file in code view. You will see literally hundreds of lines of code: the entire MS Word stylesheet, in fact, re-coded in html and CSS. For an epub this might well render properly, but there is so much that could go wrong that something probably will. And if even one tiny bit of it does go wrong, mending it would involve unpicking the entire MS Word stylesheet: a task well beyond my limited skills, and one which would also depend on how logically the original MS Word document styles were constructed in the first place. As for Kindle, it supports so few html tags and even fewer CSS styles that the chances of this working on a Kindle are, in my opinion, negligible.

So my preferred strategy is to keep the e-book as simple as possible, and that way eliminate any possible problems from the beginning and ensure I know exactly what is happening. This means importing your document as plain text (marked up to indicate where your formats go) and then re-applying the formats using CSS and html which you know will work on a Kindle and an e-pub. (See a forthcoming post about which CSS works with Kindle and which does not, to be linked here.) Persuading Sigil NOT to import the MS Word stylesheet is not as straightforward as it might at first sound. Even then a little bit of tidying up of the file is necessary, but this can be done in a matter of minutes using find and replace. This post outlines how I go about it.


Once you have marked up your book text in MS Word, remove any comments you might have in it. And then, when it is ready, save it as unformatted text in a .txt file. To do this, click the ‘Office’ button (top left) and select ‘Save As…/Other Formats’:





In the ‘Save As…’ dialog, select ‘Plain Text (*.txt)’ from the ‘Save as Type:’ pop-up menu:

You will get a dialog asking how the file is to be converted to plain text. It should default to these settings: ‘Windows (Default)’ for the text encoding and ‘CR/LF’ at the end of lines. Use these defaults and click ‘OK’:
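For anyone curious what those two settings mean in practice, here is a minimal Python sketch. Python is used purely for illustration and is not part of the Word or Sigil workflow. It assumes that ‘Windows (Default)’ corresponds to the cp1252 code page (true on most Western European and US Windows systems, but possibly different elsewhere) and that every paragraph ends in CR/LF. The function name ‘decode_word_export’ is hypothetical.

```python
def decode_word_export(raw):
    """Split the bytes of a Word 'Plain Text' export into paragraphs.

    Assumption: the file is cp1252-encoded with CR/LF line endings.
    """
    text = raw.decode("cp1252")
    # Each Word paragraph is written as one line ending in CR/LF.
    lines = text.split("\r\n")
    # Drop blank lines and any stray leading or trailing whitespace.
    return [line.strip() for line in lines if line.strip()]


# Bytes 0x91 and 0x92 are the cp1252 codes for curly single quotes.
sample = b"\x91Chapter One\x92\r\n\r\nIt was a dark night.\r\n"
print(decode_word_export(sample))
```

If curly quotes or accented letters come out garbled when you try something like this, the file was probably saved with a different encoding from the one assumed here.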

You could alternatively create a blank .txt file in Word and then use ‘Paste Special …’ to paste your book text into it. Select the whole document and copy the contents to the clipboard. Then click in your new, blank .txt document. Select ‘Paste Special …’ from the ‘Paste’ menu in the ‘Clipboard’ group on the ‘Home’ tab of the Ribbon:



Then select ‘unformatted text’ from the dialog and click ‘OK’:



You will want to save the file in the folder you are using for your ebook project.

It is important to distance yourself from MS Word at this point, so CLOSE the .txt file you have created. When you do, Word may throw up some dialogs asking you if you really want to save it as a .txt file and warning you that doing so will mean the loss of some formatting information. Select the options to lose AS MUCH formatting information as possible! Now RE-OPEN it using something like Notepad (Windows). Perhaps the easiest way to find a suitable program is to right-click (Windows) or Control-click (Mac) on the filename and then choose a program to open the file with from the pop-up menu:





In Notepad, the file now looks like this:



Once the file is open, select the entire text and copy it to the clipboard. Now, using Sigil, create a blank e-pub file or open one you made earlier. (See my post on how to get started by making a blank e-pub with Sigil.) Your blank epub should contain a single empty chapter. Open this in the main Sigil window in code view. Find and select the blank paragraph which Sigil has placed in the <body> of the page. NB ‘&#160;’ is the numeric html code for a non-breaking space (the same character as ‘&nbsp;’):



DELETE this blank paragraph and ensure the blinking insertion point is positioned on a blank line between the opening <body> tag and the closing </body> tag:



SWITCH back to ‘Book View’ and paste the text into the chapter. (If you don’t switch to book view before pasting you will get the whole thing as one v-e-r-y l-o-n-g paragraph.) When you are finished, switch back to code view again and you should see each paragraph of the original document enclosed by opening <div> and closing </div> tags:



Now to the most important bit. You will need to use find and replace to change all the closing </div> tags to </p> tags and then change all the opening <div> tags to <p> tags. What you want to achieve is this:
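The same two replacements can be sketched in Python to show the before and after. This is only an illustration of the find-and-replace logic, not something Sigil itself runs. The function name ‘divs_to_paragraphs’ is made up for the example, and it assumes the pasted <div> tags carry no attributes.

```python
def divs_to_paragraphs(html):
    """Swap plain <div>...</div> pairs for <p>...</p> pairs.

    Assumption: the <div> tags have no attributes, so a literal
    text replacement is enough.
    """
    # Closing tags first, then opening tags, as in the post.
    html = html.replace("</div>", "</p>")
    html = html.replace("<div>", "<p>")
    return html


before = "<div>First paragraph.</div>\n<div>Second paragraph.</div>"
print(divs_to_paragraphs(before))
```

A <div> with attributes (for example a class) would be left half-converted by this sketch, so it is worth searching for any remaining ‘<div’ afterwards.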



Next, SAVE the file. Sigil should ‘clean up’ the code, placing the <p> tags on the same line as the text they enclose:



If the file doesn’t automatically clean itself up, turn clean up ON by selecting ‘Preferences …’ from the Edit menu to bring up the Preferences dialog:



And in the preferences dialog, click ‘Clean Source’ and make sure the ‘Open’ and ‘Save’ check boxes are ticked:



You will need to do a bit of cleaning up of the file using Find and Replace to make sure there are no spaces before the opening <p> tags or after the closing </p> tags. There might also be some blank <p> tags and maybe some non-breaking spaces (&#160; or &nbsp;) left in the file which need to be deleted.
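As a rough sketch of what this tidying amounts to, the Python below applies the same clean-ups with regular expressions. It is an illustration under stated assumptions, not a replacement for checking the file by eye: it assumes one paragraph per line, and it deletes EVERY non-breaking space entity, including any you might have wanted to keep inside the text. The function name ‘tidy_paragraphs’ is hypothetical.

```python
import re


def tidy_paragraphs(html):
    """Apply the find-and-replace clean-ups described above."""
    # Delete non-breaking space entities (both spellings).
    html = html.replace("&#160;", "").replace("&nbsp;", "")
    # Remove paragraphs that are empty or contain only whitespace.
    html = re.sub(r"<p>\s*</p>\n?", "", html)
    # Strip spaces and tabs before an opening <p> at the start of a line.
    html = re.sub(r"(?m)^[ \t]+<p>", "<p>", html)
    # Strip spaces and tabs after a closing </p> at the end of a line.
    html = re.sub(r"(?m)</p>[ \t]+$", "</p>", html)
    return html


messy = "  <p>One</p>  \n<p>&#160;</p>\n<p>Two&nbsp;</p>\n"
print(tidy_paragraphs(messy))
```

The entities are deleted first so that a paragraph holding nothing but a non-breaking space becomes empty and is then caught by the second step.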

Find and Replace in Sigil is helpfully in a panel below the main window and is self-explanatory:





In the screenshot above, ‘&#160;’ is being replaced with nothing (i.e. it is being deleted).

As I said at the beginning of this post, unless you save the original MS Word file as unformatted text, when it is imported into the ebook, Sigil will copy the entire MS Word stylesheet along with the text and convert the styles into embedded CSS which it will put into the <head> section of the document. This will usually amount to several HUNDRED lines of code. You want to be sure that your ebook displays exactly the way you want it to, and so importing the MS Word stylesheet is NOT a good idea. It is WAY too complicated and finding and fixing any problems will be well nigh impossible. Keep your ebook coding simple and basic and there will be less to go wrong!
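One way to confirm that no Word stylesheet has crept into a chapter is to count the lines sitting inside any <style> blocks. The Python sketch below does this with a regular expression rather than a proper html parser, which is an assumption that holds for simple, well-formed chapter files. The function name ‘embedded_style_lines’ is hypothetical, and the page text is assumed to have been copied out of Sigil's code view into a string.

```python
import re


def embedded_style_lines(xhtml):
    """Count the lines of CSS found inside <style> blocks."""
    blocks = re.findall(
        r"<style\b[^>]*>(.*?)</style>",
        xhtml,
        flags=re.DOTALL | re.IGNORECASE,
    )
    # Ignore blank padding at the start and end of each block.
    return sum(len(block.strip().splitlines()) for block in blocks)


page = "<html><head><title>Chapter</title></head><body><p>Hi</p></body></html>"
print(embedded_style_lines(page))
```

A result of zero means the chapter carries no embedded styles. A result in the hundreds suggests the Word stylesheet came along with the text.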


Next Steps: You need to create and link a CSS stylesheet in your epub. Then you are ready to replace the markup with the CSS styling you want and to split the one long chapter you have just created into individual chapters. (Posts to be linked here when published.)


Index to ‘how to …’ posts:


How to ‘unpack’ an epub file to edit the contents and see what’s inside.

How to understand what is inside an epub

How to link the html table of Contents in a Kindle e-book

How to restructure the html table of contents for a Kindle

How to delete the html cover for a Kindle ebook

How to link the cover IMAGE in a Kindle e-book

How to clean up your MS Word file before you get started

How to markup an MS Word file to identify the formats before importing it into an epub

How to create a new blank e-pub using Sigil

How to import your marked-up MS Word file into your ebook using Sigil


TinyURL for this post: http://ift.tt/1kwaUr2



