The method could appear orthodox to some people, but this is what I had available.
Before I start let me tell you that the HUGE help that I had it, was that I had access to the COBOL software.
Saying that let me explain briefly what I had it done.
- Identify the DATA Files
- Extract the data to text format using runcobol recover2 "Index File Recovery Utility", this is a utility from COBOL that will export Index files as TEXT files.
- Now We have several Text files but when I opened them they have "non viewable ASCII chars", those represent the 4 bytes for the begin and end of a block. When you extract the data every block is treated as it is contained in a single record. To be able to Edit this Text File I had tried SEVERAL text editors, Notepad++, Textpad, Litepad, since the files created from the utility are bigger than 1 GB it was very hard to edit them. so I found HxD - Hexeditor from http://mh-nexus.de/en/ this tool save me a lot of TIME. You can see the 4 bytes that repeat, they are the END and BEGIN of new Block or Record in this case.
- So after I identified the Begin and End of the block 4 bytes chars, (for this example 42 11 00 00 are the HEX of each byte), I did a Ctrl-R and replace all 8 bytes by a "|" Pipe char followed by a New Line or Carriage Return char, the purpose of this is to have a single line record. Now I was able to open the text file and SEE the Data at least it was more organized that the RAW data you obtain from the COBOLutility. For this task I used HxD to do the replace.
- The next 2 images show how the raw data was aligned before and after of replacing the 4 bytes at the begin and at the end of each block.
This is the RAW Data
This is the Data after being fixed
- The next steps are pretty simple you need to Import the text files to a SQL DB using SSIS and from there feed your application database, as I mention at the beginning of this article my advantage was that I was able to use the Cobol Application and using that combined with the readable text files made it really easy to identify the columns in each table.
NOTE:
- One thing that I need to add to this article was that I am not sure if the COBOL files had any COMP or COMP3 fields but when I export them as Text files I found several numbers that have a letter at the end like this little sample:
AMOUNT
0000000600}
0000000570}
0000000735L
0000000009{
0000000891L
0000000400{
0000000606C
0000000400{
0000000400}
0000000400{
0000000549C
0000000870}
0000000050{
0000000400{
0000000840C
0000000400{
- There is a simple logic that you need to applied to obtain the right value, you need to replace the non numeric char for a number as is being describe in the script bellow





