OmegaT/indesign compatibility
Thread poster: Roy Williams
Hello all,
I've been using Wordfast up until now (for MS Office formats) and have started experimenting with OmegaT so that I can work with other file formats. There was no mention of this in any of the documentation I've looked through, but would anyone know if OmegaT can be used with InDesign and/or PDF formats as well?
Didier Briel, France, English to French | InDesign through Rainbow | Oct 20, 2008
WilRoy wrote:
I've been using Wordfast up until now (for MS Office formats) and have started experimenting with OmegaT so that I can work with other file formats. There was no mention of this in any of the documentation I've looked through, but would anyone know if OmegaT can be used with InDesign
First, InDesign should be exported to the INX format.
Then Rainbow (Okapi) can be used to create an OmegaT project using an intermediate format.
and/or PDF formats as well?
What do you call "PDF formats"?
OmegaT cannot read PDF files directly; the content must be extracted or converted (by OCR) first.
Didier
Roy Williams, Austria, German to English, topic starter
By PDF format I meant PDF files. What is INX?
Didier Briel, France, English to French | INX is an export/import format | Oct 21, 2008
WilRoy wrote:
By PDF format I meant PDF files.
PDF files can contain actual text.
In this case, it can be extracted by copying and pasting into Word, for instance. Some reformatting will usually be needed to get rid of the excess linefeeds.
Or they can contain only images of the pages, and no CAT tool can translate images. The images must be converted to text first, using OCR software.
What is INX?
An XML-based interchange format that InDesign can export documents to and import them from.
Didier
Samuel Murray, Netherlands, Member (2006), English to Afrikaans
WilRoy wrote:
By PDF format I meant PDF files.
I know of no CAT tool that can translate PDF files. Not even the mighty Trados can do it. You may be able to translate text extracted from PDF files, and if you're clever you can put the text back yourself using a PDF editor (search the forums), but I know of no CAT tool that offers both extraction and putting it back.
What is INX?
Tell me, how do you translate InDesign files at the moment?
esperantisto, Member (2006), English to Russian, site localizer | Neither do I, but… | Oct 21, 2008
Samuel Murray wrote:
I know of no CAT tool that can translate PDF files.
I vaguely remember some bright new wannabe program asserting that it supports PDF as input, or something like that. Maybe they implemented on-the-fly PDF-to-something conversion? In any case, that program did not look interesting, and I can't even remember its name.
Roy Williams, Austria, German to English, topic starter
The Wordfast documentation claims to be able to translate PDFs, but also states that the result is "uncertain" because PDF was designed to be write-protected. I reasoned that if Wordfast could make such a claim, maybe there could be a better tool. I have not had to work with PDFs, so I don't know if WF can actually do it.
As for InDesign, the company where I work has only recently started using it. At present most of the documentation is still .doc files, from which PDFs are created after translation. So to answer your question, Sam: at the moment I don't translate in InDesign. But with its increasing use, I thought it would be prudent to find a tool to process said files.
Samuel Murray, Netherlands, Member (2006), English to Afrikaans | Not Wordfast | Oct 21, 2008
WilRoy wrote:
The Wordfast documentation claims to be able to translate PDFs, but also states that the result is "uncertain" because PDF was designed to be write-protected. I reasoned that if Wordfast could make such a claim, maybe there could be a better tool. I have not had to work with PDFs, so I don't know if WF can actually do it.
The Wordfast manual makes no such claims. Can you quote from it? The PlusTools manual does have a section on its PDF conversion functionality. I quote it here in full:
PDF
This pane offers two features: 1. extract textual contents from a PDF document currently opened with Acrobat Reader in the background, and 2. convert text from a currently opened document (typewriter-style, where all lines end with a paragraph mark) into regular text with whole paragraphs.
Both tasks are uncertain. The PDF format was created at first to be a read-only format, this is why it is CAT tool-unfriendly. Extracting text from Acrobat Reader is therefore uncertain.
Re-creating whole paragraphs in a document where each line ends with a paragraph mark (carriage return) is also an uncertain task for a machine, since it supposes an understanding of the text. A 90% success rate is usually achieved, however.
As for InDesign, the company where I work has only recently started using it. ... So to answer your question, Sam: at the moment I don't translate in InDesign.
Get your hands on a copy of it and find out how to export and import INX files. Then show the graphics people how to do it.
Roy Williams, Austria, German to English, topic starter | OK PlusTools then | Oct 22, 2008
Samuel Murray wrote:
PDF
This pane offers two features: 1. extract textual contents from a PDF document currently opened with Acrobat Reader in the background, and 2. convert text from a currently opened document (typewriter-style, where all lines end with a paragraph mark) into regular text with whole paragraphs.
Both tasks are uncertain. The PDF format was created at first to be a read-only format, this is why it is CAT tool-unfriendly. Extracting text from Acrobat Reader is therefore uncertain.
The text you quoted is actually what I was referring to when talking about PDF files. I keep the PlusTools and Wordfast documentation and the training manual in a folder I refer to as "Wordfast docs". Because it seems like a somewhat time-intensive process with "uncertain" results, I haven't tried it. So I thought that if one tool had a method of working with PDF, however uncertain, perhaps there was another one that could do it with solid results.
Didier, thanks for answering; the information you provided has proven most useful.
[Edited at 2008-10-22 05:51]
Samuel Murray, Netherlands, Member (2006), English to Afrikaans | Okay, let me put it this way | Oct 22, 2008
WilRoy wrote:
So I thought that if one tool had a method of working with PDF, however uncertain, perhaps there was another one that could do it with solid results.
It is my understanding that any non-OCR method of extracting text from a PDF will be flawed, because the paragraph reorderiser has to guess, based on certain rules made up by the programmer.
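To illustrate the kind of guessing involved, here is a minimal Python sketch of a paragraph rejoiner. The rules in it (a line that ends a sentence and is noticeably shorter than the widest line closes a paragraph; a trailing hyphen before a lowercase letter is treated as a split word) and the names `rejoin_paragraphs` and `short_ratio` are assumptions made up for this example, not what PlusTools or any other tool actually does.

```python
def rejoin_paragraphs(text: str, short_ratio: float = 0.8) -> list[str]:
    """Rebuild paragraphs from text in which every line ends with a hard return.

    Hypothetical heuristics, for illustration only:
    - a blank line always ends a paragraph;
    - a line ending in . ! or ? that is shorter than short_ratio times the
      widest line is assumed to be the last line of a paragraph;
    - a trailing hyphen followed by a lowercase letter is assumed to be a
      word split across two lines.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    width = max((len(ln) for ln in lines), default=0)

    paragraphs: list[str] = []
    current = ""
    for line in lines:
        if not line:
            # Blank line: close whatever paragraph is open.
            if current:
                paragraphs.append(current)
                current = ""
            continue

        if current.endswith("-") and line[0].islower():
            # Guess: hyphenated word split across lines, so rejoin it.
            current = current[:-1] + line
        elif current:
            current += " " + line
        else:
            current = line

        # Guess: a short line that ends a sentence also ends the paragraph.
        if line.endswith((".", "!", "?")) and len(line) < short_ratio * width:
            paragraphs.append(current)
            current = ""

    if current:
        paragraphs.append(current)
    return paragraphs
```

Each rule can misfire: a genuine hyphen at the end of a line gets swallowed, and a paragraph whose last line happens to be long is merged into the next one. That is why the output still needs checking by hand.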
I have used the PlusTools method a number of times and I'm quite happy with the results, especially when the PDF is fairly simple. For shorter documents, I prefer to select and copy text by hand, for more control.
Roy Williams, Austria, German to English, topic starter
OK, so I tried extracting text from a PDF with PlusTools, and the extraction itself is not as time-intensive as I thought after reading the manual. The problem, though, is that none of the formatting was preserved; all text (table of contents, text from tables, etc.) was simply left-justified. Is that the limit of PlusTools' ability for this particular task?
Samuel Murray, Netherlands, Member (2006), English to Afrikaans
WilRoy wrote:
The problem, though, is that none of the formatting was preserved; all text (table of contents, text from tables, etc.) was simply left-justified. Is that the limit of PlusTools' ability for this particular task?
Yes. PlusTools may in certain circumstances retain the character formatting, but not layout formatting. That is too difficult to guess correctly. Tables etc... forget about it. If you want a file with tables intact, pay your $9 per month here:
http://www.freepdfconvert.com/membership.asp
But even there you still need to do some post-formatting (e.g. removing superfluous tabs, superfluous hard returns, etc.).