GridFTP
This tutorial looks at the use of globus-url-copy, the GridFTP client provided by Globus. The command follows the FTP standards strictly when constructing URLs, and only absolute file paths should be used. It also supports multiple protocols for receiving and delivering data, so it is possible, for example, to obtain a file via HTTP and deliver it to the grid via GridFTP (written as gsiftp in URLs for historical reasons).
All of the following commands can be run on a local Linux machine that has Globus installed.
1. Start by creating a 100 MB file full of random data. In the top directory of your account run:
dd if=/dev/urandom of=testFile bs=1024k count=100
This file will be used for demonstrating the usage of globus-url-copy.
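Before transferring the file it can be worth confirming that dd produced the expected size. The following is a minimal sketch, assuming a POSIX shell with wc and tr available; the names expected_bytes and check_size are illustrative and not part of Globus.

```shell
# Expected size: 100 blocks of 1024 KiB each (matches bs=1024k count=100).
expected_bytes=$((100 * 1024 * 1024))

# check_size FILE BYTES: succeed only if FILE is exactly BYTES long.
check_size() {
    actual_bytes=$(wc -c < "$1" | tr -d ' ')
    [ "$actual_bytes" -eq "$2" ]
}

# Report on testFile if it exists in the current directory.
if [ -f testFile ]; then
    if check_size testFile "$expected_bytes"; then
        echo "testFile is $expected_bytes bytes"
    else
        echo "testFile has an unexpected size" >&2
    fi
fi
```

The same check can be repeated on the head node after a transfer to confirm that the whole file arrived.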
2. The simplest way to copy this file onto an NGS head node (in this case ngs.oerc.ox.ac.uk) is to type (on one line):
globus-url-copy -vb file:///<home directory path>/testFile gsiftp://ngs.oerc.ox.ac.uk/<head-node home directory path>/testFile
where <home directory path> is the path to your home directory on your local machine and <head-node home directory path> is the path to your home directory on ngs.oerc.ox.ac.uk. The -vb option enables verbose performance output so that you can see how fast the data is being transferred.
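Because only absolute paths are accepted, it can help to build the two URLs from variables and print the resulting command before running it. This is a sketch under stated assumptions: REMOTE_HOME is a hypothetical variable holding your home directory on the head node, and /path/to/remote/home is only a placeholder value.

```shell
# Local source: $HOME is already absolute, so "file://" followed by
# "/home/..." yields the three slashes that globus-url-copy expects.
src_url="file://${HOME}/testFile"

# Remote destination. REMOTE_HOME is a placeholder for this sketch; set it
# to your real home directory on the head node before running anything.
REMOTE_HOME="${REMOTE_HOME:-/path/to/remote/home}"
dst_url="gsiftp://ngs.oerc.ox.ac.uk${REMOTE_HOME}/testFile"

# Print the command so that it can be checked before it is run.
echo globus-url-copy -vb "$src_url" "$dst_url"
```

Once the printed line looks right, it can be copied into the terminal and run as it stands.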
3. Next perform the same transfer using parallelism. Parallelism is enabled using the "-p" option (do note that using "-p 1" is not the same as using no parallelism since the presence of the "-p" option changes how globus-url-copy works). Run the command:
globus-url-copy -vb -p 4 file:///<home directory path>/testFile gsiftp://ngs.oerc.ox.ac.uk/<head-node home directory path>/testFile
In general using between 4 and 8 parallel streams gives the best performance. Higher numbers of parallel streams don't generally gain any advantage.
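To see how the stream count affects throughput on your own connection, the transfer can be repeated with several values of "-p" and the rates reported by -vb compared. The loop below is a hedged sketch: SRC_URL, DST_URL and RUN are hypothetical variables introduced here, the remote path is a placeholder, and by default the commands are only printed rather than executed.

```shell
# Placeholder URLs for this sketch; replace them with your own absolute paths.
SRC_URL="${SRC_URL:-file://${HOME}/testFile}"
DST_URL="${DST_URL:-gsiftp://ngs.oerc.ox.ac.uk/path/to/remote/home/testFile}"

# transfer_cmd STREAMS: print the transfer command for that many streams.
transfer_cmd() {
    echo "globus-url-copy -vb -p $1 $SRC_URL $DST_URL"
}

# Set RUN=1 to perform the transfers; otherwise they are only printed.
for streams in 1 2 4 8 16; do
    transfer_cmd "$streams"
    if [ "${RUN:-0}" = "1" ]; then
        globus-url-copy -vb -p "$streams" "$SRC_URL" "$DST_URL"
    fi
done
```

Each run overwrites the same destination file, so no extra clean-up is needed between runs.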
4. The transfers in the previous two examples did not encrypt the data. You can encrypt the data by adding the option "-dcpriv". Add this option to the command from the previous step and run it again. Notice how the transfer speed drops substantially. This is because of the overhead of encrypting the data, and it is why large data transfers don't use encryption unless absolutely necessary.
5. The final part of the tutorial is to perform a "third-party" data transfer. This allows data to be transferred between two sites without having to log on to either site directly. Transfer the file you have already copied to ngs.oerc.ox.ac.uk to ngs.rl.ac.uk by typing:
globus-url-copy -vb -p 4 gsiftp://ngs.oerc.ox.ac.uk/<full path>/testFile gsiftp://ngs.rl.ac.uk/<full path>/FromOeSC
You can then log on to ngs.rl.ac.uk (using gsissh) and check that "FromOeSC" is actually there.
6. To conclude this tutorial, log on to both ngs.rl.ac.uk and ngs.oerc.ox.ac.uk and delete the files FromOeSC and testFile respectively.
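Since gsissh accepts a remote command in the same way as ssh, the clean-up can also be done without an interactive login. The sketch below only prints the commands. REMOTE_DIR and cleanup_cmd are hypothetical names, the path is a placeholder, and the sketch assumes the same directory on both sites, which may not hold for your accounts.

```shell
# Placeholder for this sketch: the absolute directory used in the transfers.
REMOTE_DIR="${REMOTE_DIR:-/path/to/remote/home}"

# cleanup_cmd HOST FILE: print the gsissh command that removes FILE on HOST.
cleanup_cmd() {
    echo "gsissh $1 rm $REMOTE_DIR/$2"
}

# Show the two clean-up commands; run them by hand once the paths are right.
cleanup_cmd ngs.rl.ac.uk FromOeSC
cleanup_cmd ngs.oerc.ox.ac.uk testFile
```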

