TCC (Taleo Connect Client) is the official data export/import tool for Taleo Enterprise Edition, but on its own it does not allow manipulation of files uploaded or attached in the Taleo system. To work with attached files, you need the TCC Custom Steps library. Even then, it appears that Taleo stores uploaded files differently depending on the module.
For documents in the Recruiting module, Taleo first compresses the file and then encodes it in Base64. This is presumably because TCC uses web services internally, and Base64 is the usual way to embed binary attachments directly in SOAP XML messages. So when using TCC Custom Steps to export attached files, you first have to decode the Base64 content and then unzip it. Fortunately, the library's com.taleo.integration.client.customstep.xml.ExtractAttachedFilePostStep class does this automatically.
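To make the two steps concrete, here is a minimal Python sketch of "decode Base64, then unzip once". It is not how the TCC class is implemented; it only illustrates the order of operations. It assumes the compression is gzip, and the function name decode_attachment and the sample bytes are made up for the illustration.

```python
import base64
import gzip


def decode_attachment(encoded: str) -> bytes:
    """Decode Base64 text, then gunzip the result once (hypothetical helper)."""
    compressed = base64.b64decode(encoded)
    return gzip.decompress(compressed)


# Build a stand-in payload the same way: compress first, then Base64 encode.
original = b"%PDF-1.4 stand-in resume bytes"
encoded = base64.b64encode(gzip.compress(original)).decode("ascii")

# Reversing the steps in the opposite order recovers the original bytes.
assert decode_attachment(encoded) == original
```

The important detail is the order: the file was compressed first and encoded second, so it has to be decoded first and decompressed second.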
But when I used the same step to export documents from the Transitions module, the final files appeared corrupted: they would not open in viewers for PDF or Doc formats. It turns out that the exported files are still in zip format. They need to be unzipped once more to get the final file.
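One quick way to diagnose a file like this is to look at its first few bytes. The Python sketch below assumes the extra layer is gzip (whose streams start with the bytes 1f 8b) and that the real document is a PDF (which starts with %PDF). The helper name describe_payload and the sample data are hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
PDF_MAGIC = b"%PDF"       # first four bytes of a PDF file


def describe_payload(data: bytes) -> str:
    """Guess the payload type from its leading bytes (hypothetical helper)."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(PDF_MAGIC):
        return "pdf"
    return "unknown"


# Simulate a document that was compressed twice, then unzipped only once.
pdf = b"%PDF-1.4 stand-in form bytes"
twice = gzip.compress(gzip.compress(pdf))
after_one_gunzip = gzip.decompress(twice)

print(describe_payload(after_one_gunzip))                   # gzip
print(describe_payload(gzip.decompress(after_one_gunzip)))  # pdf
```

A file that reports gzip after export is not corrupted; it still has a layer of compression left to remove.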
So for the Transitions module, Taleo zips the file twice and then Base64 encodes it once. To export these files using TCC, they need to be unzipped twice. The above-mentioned class does not do this automatically.
Answer: Different Class
The Custom Steps library documentation mentions another class in the same library, com.taleo.integration.client.customstep.xml.ExtractFilesPostStep, which has more options.
According to the documentation, for documents, this step needs to have an operation value of "DecodeBase64,Gunzip". It does not mention whether operations can be chained.
They can. For Transitions documents, the operation value has to be "DecodeBase64,Gunzip,Gunzip".
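The effect of a chained operation value can be sketched in Python as below. This only mimics the semantics of applying the operations left to right; it is not the TCC implementation. The OPERATIONS table and the apply_operations function are hypothetical names, and the sample bytes stand in for a real form.

```python
import base64
import gzip
from typing import Callable, Dict

# Map each operation name to a bytes -> bytes function (illustrative only).
OPERATIONS: Dict[str, Callable[[bytes], bytes]] = {
    "DecodeBase64": base64.b64decode,
    "Gunzip": gzip.decompress,
}


def apply_operations(data: bytes, spec: str) -> bytes:
    """Apply a comma-separated list of operations from left to right."""
    for name in spec.split(","):
        data = OPERATIONS[name.strip()](data)
    return data


# Stand-in for a Transitions document: gzipped twice, Base64 encoded once.
pdf = b"%PDF-1.4 stand-in form bytes"
payload = base64.b64encode(gzip.compress(gzip.compress(pdf)))

assert apply_operations(payload, "DecodeBase64,Gunzip,Gunzip") == pdf
# With a single Gunzip, the result is still a gzip stream, not the PDF.
assert apply_operations(payload, "DecodeBase64,Gunzip").startswith(b"\x1f\x8b")
```

Repeating an operation in the list simply runs it again on the output of the previous step, which is why the second Gunzip is enough to reach the real file.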
Here is the final configuration:
The files generated by this sequence opened without problems in a PDF reader, so it appears that Transitions forms are stored in PDF format in Taleo.


