Batch Convert DOCX Files to TXT Format with textutil in Mac OS X here.
The following was originally found on https://developer.apple.com
textutil — text utility
textutil [command_option] [other_options] file …
textutil can be used to manipulate text files of various formats, using the mechanisms provided by the Cocoa text system.
The first argument indicates the operation to perform, one of:
-help Show the usage information for the command and exit. This is the default command option if none is specified.
-info Display information about the specified files.
-convert fmt Convert the specified files to the indicated format and write each one back to the file system.
-cat fmt Read the specified files, concatenate them, and write the result out as a single file in
the indicated format.
fmt is one of: txt, html, rtf, rtfd, doc, docx, wordml, odt, or webarchive
There are some additional options for general use:
-extension ext Specify an extension to be used for output files (by default, the extension will be
determined from the format).
-output path Specify the file name to be used for the first output file.
-stdin Specify that input should be read from stdin rather than from files.
-stdout Specify that the first output file should go to stdout.
-encoding IANA_name | NSStringEncoding
Specify the encoding to be used for plain text or HTML output files (by default, the output encoding will be UTF-8). NSStringEncoding refers to one of the numeric values recognized by NSString. IANA_name refers to an IANA character set name as understood by CFString. The operation will fail if the file cannot be converted to the specified encoding.
-inputencoding IANA_name | NSStringEncoding
Force all plain text input files to be interpreted using the specified encoding (by default, a file’s encoding will be determined from its BOM). The operation will fail if the file cannot be interpreted using the specified encoding.
-format fmt Force all input files to be interpreted using the indicated format (by default, a
file’s format will be determined from its contents).
-font font Specify the name of the font to be used for converting plain to rich text.
-fontsize size Specify the size in points of the font to be used for converting plain to rich text.
— Specify that all further arguments are file names.
There are some additional options for HTML and WebArchive files:
-noload Do not load subsidiary resources.
-nostore Do not write out subsidiary resources.
-baseurl url Specify a base URL to be used for relative URLs.
-timeout t Specify the time in seconds to wait for resources to load.
Specify a numeric factor by which to multiply font sizes.
-excludedelements (tag1, tag2, …)
Specify which HTML elements should not be used in generated HTML (the list should be a
single argument, and so will usually need to be quoted in a shell context).
-prefixspaces n Specify the number of spaces by which to indent nested elements in generated HTML
(default is 2).
There are some additional options for treating metadata:
-strip Do not copy metadata from input files to output files.
-title val Specify the title metadata attribute for output files.
-author val Specify the author metadata attribute for output files.
-subject val Specify the subject metadata attribute for output files.
-keywords (val1, val2, …)
Specify the keywords metadata attribute for output files (the list should be a single
argument, and so will usually need to be quoted in a shell context).
-comment val Specify the comment metadata attribute for output files.
-editor val Specify the editor metadata attribute for output files.
-company val Specify the company metadata attribute for output files.
Specify the creation time metadata attribute for output files.
Specify the modification time metadata attribute for output files.
textutil -info foo.rtf
displays information about foo.rtf.
textutil -convert html foo.rtf
converts foo.rtf into foo.html.
textutil -convert rtf -font Times -fontsize 10 foo.txt
converts foo.txt into foo.rtf, using Times 10 for the font.
textutil -cat html -title “Several Files” -output index.html *.rtf
loads all RTF files in the current directory, concatenates their contents, and writes the result out as
index.html with the HTML title set to “Several Files”.
The textutil command exits 0 on success, and 1 on failure.
There are a number of ways you can convert a text document to another format, by simply opening it in a text editor like TextEdit and then choosing Save As from the File menu to export it. With TextEdit, you can choose Word, Rich Text, Plain Text, and OpenDocument Text, among others, as the formats in which to save your current file; however, if you are a Terminal user then you might enjoy knowing you can do this right from the command line.
One command-line tool Apple includes in OS X is “textutil” which can be used for a number of manipulations of supported text documents, with one of them being to convert a targeted document to a specified format:
Open the Terminal
Type the following command, replacing FORMAT with one of txt, html, rtf, rtfd, doc, docx, wordml, odt, or webarchive to specify the desired format:
textutil -convert FORMAT
Ensure there is a space after the format specification, and then drag your target document to the Terminal window, so the command looks something like this (in this case, converting a webarchive called “mypage” to docx):
textutil -convert docx ~/Desktop/mypage.webarchive
When this command is executed, the resulting file will appear in the same folder as the original.
While the use of the Terminal for this might seem unnecessary given the ability to use various word processing programs for converting files, you can use it when managing text documents in automator routines, shell scripts, or applescripts where conversion of these documents might be desired.
One prime use for this routine is to batch-convert files, so if you have a folder of txt documents you would like to convert to docx, then you can do so by using Terminal wildcards with this command to target all or at least a group of desired files in the folder:
textutil -convert docx ~/Desktop/TextDocuments/*.txt
In the above command, any .txt documents in the folder called “TextDocuments” on the current user’s desktop will be converted to docx format.
The textutil command can be used for these format conversion routines, but also supports a number of other features such as specifying encoding, changing font sizes and type faces, and modifying file metadata.