magnify
Home Topic ETL Questions

ETL Questions

Home Forums Miscellaneous ETL Questions

This topic contains 3 replies, has 2 voices, and was last updated by  Cyrus Bharda 2 years, 5 months ago.

Viewing 4 posts - 1 through 4 (of 4 total)
  • Author
    Posts
  • #889

    Cyrus Bharda
    Participant

    Howdy Everyone,

    Does anyone have some experience with ETL’s? Several CQU Staff, including me, got the basic ETL Training from TechOne, but now that I am trying things out, I have some niggly things I just cant seem to get working.

    One of these things is the File Operation step.

    I can get it to work when I specify the exact file name, but if I would like to put in a wildcard, say something like name_of_file*.csv it errors as it cant find the file. I have tried *, ? and % still get the similar issue.

    I have taken a look at the documentation supplied and I have a feeling because the step specifies the exact file name, I may not be able to supply it with a mask, like the Zip File Operation for example.

    Cheers,
    Cy

    #890

    Manny
    Participant

    Hi Cyrus,

    What exactly are you trying to do – move or copy multiple files in a directory to another directory?

    Anyway the way that you need to do this is to use a load file system step with a file mask to load the files from the directory that you want to do something with. The wildcard is the windows wildcard so all csv files would have a file mask of *.csv.

    After you have this file list in a memory table (lets call the memory table FILELIST for my example below), you can run a repeat using this memory table as a source. This effectively produces a loop which will run once for each line in the input memory table.

    Within a repeat, each column of the input/source memory table becomes available as a repeat variable. The syntax for the variable is {&memory_table_name.column_name}.

    Now I don’t have ETL open in front of me but from memory, the column name in the Load File System step that has the filename in it is called “name”.

    So within the repeat you would have the File operation step and under the filename you pass in the variable ({&FILELIST.name}. Hence it will 1 by 1 process each file that was retrieved in the Load File System step.

    I generally will create a variable at the top of the process called FILENAME and within my repeat, before the File operation step I would have a set variable step that sets the FILENAME variable equal to ‘{&FILELIST.name}’.

    This will mean that as the Set Variable step gets executed, the value that the filename variable resolves to will be printed in the log file (which helps you work out where it fell over (if it does).

    The above instructions work better with an example. See attached an example of an ETL process we run to move log files (well actually it is set to copy and then delete the files but conceptually it is the same).

    Note: The set variable concept I talked about above is not employed in this process as I didn’t write it.

    Cheers,

    Manny

    #891

    Manny
    Participant

    Hi Cyrus,

    Sorry the forum didn’t let me upload a *.t1ETL file so I’ll try it as a zip.

    Cheers,

    Manny

    Attachments:
    You must be logged in to view attached files.
    #893

    Cyrus Bharda
    Participant

    Thanks Manny!

    That is a really handy ETL, one that I had been planning to tackle once I was more experienced using ETLs.

    I ended up using the following steps:

    1. Purge Memory Table. I put this in just to make sure the ETL was starting off with a clean set of memory tables.
    2. Load File System. As you indicated, this step loads all the file names into the memory table.
    3. Repeat Step. This repeats the File Operation Step and passes through the individual file name from the memory table to the File Operation Step.

    Thanks for your help, it was greatly appreciated!

    Cheers,
    Cy

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic.