Manage data in the Grid using GridFTP

1.  Introduction


GridFTP is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks. It is based on FTP, the highly popular Internet file transfer protocol, and combines a set of protocol features and extensions already defined in IETF RFCs with a few additional features that meet the requirements of current data grid projects.

GridFTP provides the following protocol features.

  • GSI security on control and data channels
  • Multiple data channels for parallel transfers
  • Partial file transfers
  • Third-party (direct server-to-server) transfers
  • Authenticated data channels
  • Reusable data channels
  • Command pipelining

The low-level tools allow users to perform actions on the GridFTP server of an SE.
There are eight such commands, all of which are covered below.

edg-gridftp-exists   checks the existence of a file or directory on an SE.
edg-gridftp-ls       lists the contents of a directory on an SE.
edg-gridftp-mkdir    creates a directory on an SE.
edg-gridftp-rename   renames a file on an SE.
edg-gridftp-rm       deletes a file from an SE.
edg-gridftp-rmdir    deletes a directory from an SE.
globus-url-copy      copies a file onto (or to another location on) an SE.
uberftp              interactive GridFTP client.

2.  edg-gridftp-* commands


All the edg-gridftp-* commands take as parameters one or more URLs in the following format:
protocol://hostname(:port)/filepath

where

  • protocol refers to the protocol used to access the SE (generally gsiftp)
  • hostname refers to the host name of the SE you want to access
  • port, optional, refers to the listening socket port of the GridFTP server; the default is 2811
  • filepath is the path to the directory or file you want to work with

Here is an example of a real URL:
gsiftp://aliserv6.ct.infn.it/tmp/tony/bigfile.dat
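
The URL format above can be assembled mechanically. The following is a minimal sketch; gsiftp_url is a hypothetical helper name, not part of the EDG tools, and it assumes the gsiftp protocol is always the one wanted.

```shell
# Hypothetical helper (not part of the EDG tools): build a GridFTP URL.
# Usage: gsiftp_url HOST PATH [PORT]
gsiftp_url() {
    host=$1
    path=$2
    port=${3:-}            # optional; when omitted the server default (2811) applies

    # Make sure the file path is absolute
    case $path in
        /*) ;;
        *)  path="/$path" ;;
    esac

    if [ -n "$port" ]; then
        printf 'gsiftp://%s:%s%s\n' "$host" "$port" "$path"
    else
        printf 'gsiftp://%s%s\n' "$host" "$path"
    fi
}
```

For instance, gsiftp_url aliserv6.ct.infn.it /tmp/tony/bigfile.dat prints the example URL shown above.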

2.1  edg-gridftp-ls

edg-gridftp-ls will list (to the standard output) a file or the contents of a directory for the given URL.

The command is as follows:
edg-gridftp-ls [--proxy=proxy] [--verbose] [--noauth] URL

It will return a zero status on success. In all other cases, it will return a non-zero value and print an error message to the standard error.

 [user@localhost ~]$ edg-gridftp-ls  gsiftp://grid2.fe.infn.it/tmp
 .
 ..
 crlpem-B10253
 crlpem-u27893
 crlpem-T16248
 crlpem-v22698
 [...]
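
Because the command signals failure only through its exit status and standard error, scripts should test the status explicitly. A minimal sketch, where list_or_report is a hypothetical wrapper name:

```shell
# Hypothetical wrapper: list a URL and report a failure with its exit status.
list_or_report() {
    url=$1
    if edg-gridftp-ls "$url"; then
        return 0
    else
        status=$?          # exit status of the failed edg-gridftp-ls
        echo "listing of $url failed with status $status" >&2
        return "$status"
    fi
}
```

The wrapper passes the listing through on success and preserves the original non-zero status on failure, so it can be used in place of the plain command.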

2.2  edg-gridftp-exists

edg-gridftp-exists determines whether the file or directory associated with the given URL exists.

The command is as follows:
edg-gridftp-exists URL [URL ...]

This command will return a status of 0 if the file or directory exists. In all other cases, it will return a non-zero value and print an error message to the standard error.

If more than one URL is given, then a successful return will only occur if all of the URLs exist. The processing will stop with the first failure.

 [user@localhost ~]$ edg-gridftp-exists    gsiftp://dpm.grid.box/tmp/$USER/delme
 [user@localhost ~]$ echo $?
 0 
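
Since processing stops at the first failure, a single call with several URLs does not tell you which of the later ones are missing. One possible workaround, sketched here with a hypothetical helper named report_missing, is to test each URL separately:

```shell
# Hypothetical helper: check every URL on its own and print the missing ones.
# Returns 0 only if all URLs exist.
report_missing() {
    missing=0
    for url in "$@"; do
        if ! edg-gridftp-exists "$url" 2>/dev/null; then
            echo "missing: $url"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

Note that a non-zero status from edg-gridftp-exists can also come from other errors (an unreachable server, for example), so "missing" here really means "could not be confirmed".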

2.3  edg-gridftp-rm

edg-gridftp-rm removes a file from a GridFTP server.

The command is as follows:
edg-gridftp-rm [--proxy=proxy] URL [URL ...]

This command will return a status of 0 if the file was successfully removed. In all other cases, it will return a non-zero value and print an error message to the standard error.

If more than one URL is given, then a successful return will only occur if all of the URLs were successfully deleted. The processing will stop with the first failure.

 [user@localhost ~]$ edg-gridftp-rm gsiftp://grid2.fe.infn.it/tmp/delme 
 [user@localhost ~]$ edg-gridftp-rm gsiftp://gridse.ilc.cnr.it/tmp/delme 

2.4  edg-gridftp-rename

edg-gridftp-rename renames a file or directory on a GridFTP server.

The command is as follows:
edg-gridftp-rename [--proxy=proxy] SourceURL DestinationURL

Everything but the filename is ignored on the destination URL, meaning that files will not be moved between servers. This command will return a status of 0 if the file or directory was renamed. In all other cases, it will return a non-zero value and print an error message to the standard error.

[user@localhost ~]$  edg-gridftp-rename gsiftp://dpm.grid.box/tmp/$USER gsiftp://dpm.grid.box/tmp/test2
[user@localhost ~]$  edg-gridftp-exists gsiftp://dpm.grid.box/tmp/test2 
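
Because the rename always stays on the same server, the destination URL can be derived from the source. A minimal sketch, assuming the source URL has no trailing slash; rename_in_place is a hypothetical helper name:

```shell
# Hypothetical helper: rename the last path component of a URL.
# Usage: rename_in_place SOURCE_URL NEW_NAME
rename_in_place() {
    src=$1
    new_name=$2
    # Strip the last path component and append the new name
    dst="${src%/*}/$new_name"
    # Rename, then confirm that the new URL exists
    edg-gridftp-rename "$src" "$dst" && edg-gridftp-exists "$dst"
}
```

With this helper, rename_in_place gsiftp://dpm.grid.box/tmp/$USER test2 performs the same two steps as the example above.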

2.5  edg-gridftp-mkdir

edg-gridftp-mkdir creates a directory on a GridFTP server.

The command is as follows:
edg-gridftp-mkdir [--proxy=proxy] [--parents] URL [URL ...]

This command will return a status of 0 if the directory was created successfully. In all other cases, it will return a non-zero value and print an error message to the standard error.

If the --parents option is given, any missing parent directories of the given URL will also be created.

If more than one URL is given, then the URLs are processed sequentially. The processing will stop with the first failure.

NOTE: The creation is only attempted if the URL does not already exist. If the given URL already exists as a regular file, this command will still return success.

[user@ui-1 ~]$ edg-gridftp-mkdir gsiftp://dpm.grid.box/tmp/$USER 
[user@ui-1 ~]$ edg-gridftp-ls gsiftp://dpm.grid.box/tmp/$USER 
.
..
[user@ui-1 ~]$ 
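
Given the note above, a success status alone does not prove that the URL is a directory. The sketch below adds a heuristic check; ensure_remote_dir is a hypothetical helper name, and the check assumes that a directory listing contains a "." entry, as in the sample output above, which may not hold for every server.

```shell
# Hypothetical helper: create a directory (with parents) and verify the result.
ensure_remote_dir() {
    url=$1
    edg-gridftp-mkdir --parents "$url" || return

    # Heuristic (assumption): a directory listing contains a line that is
    # exactly ".", while the listing of a regular file does not.
    if edg-gridftp-ls "$url" | grep -qx '\.'; then
        return 0
    fi
    echo "$url exists but does not look like a directory" >&2
    return 1
}
```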

2.6  edg-gridftp-rmdir

edg-gridftp-rmdir removes a directory from a GridFTP server.

The command is as follows:
edg-gridftp-rmdir [--proxy=proxy] URL [URL ...]

This command will return a status of 0 if the directory was successfully removed. In all other cases, it will return a non-zero value and print an error message to the standard error.

If more than one URL is given, then a successful return will only occur if all of the URLs were successfully deleted. The processing will stop with the first failure.
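
As an illustration, the removal can be combined with edg-gridftp-exists to confirm that the directory is really gone. This is a minimal sketch; remove_remote_dir is a hypothetical helper name.

```shell
# Hypothetical helper: remove a directory and confirm that it no longer exists.
remove_remote_dir() {
    url=$1
    edg-gridftp-rmdir "$url" || return

    if edg-gridftp-exists "$url" 2>/dev/null; then
        echo "$url still exists after rmdir" >&2
        return 1
    fi
}
```

A call such as remove_remote_dir gsiftp://dpm.grid.box/tmp/test2 returns 0 only when the removal succeeded and the URL can no longer be found.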

3.  The uberftp client

uberftp is a GridFTP-enabled client for transferring files between two computers. It can be used interactively or with FTP commands given on the uberftp command line.
It provides many commands similar to those of standard FTP clients (cat, cd, get, put, mget, mput, dir, ls, mkdir).

 [user@localhost ~]$ uberftp dpm.grid.box
 220 grid2.fe.infn.it GridFTP Server 2.3 (gcc32dbg, 1144436882-63) ready.
 230 User gridbox002 logged in.
 uberftp> ls
 -rw-r--r--    1  lanck002  planck   5619 May 20 12:01  .canna
 drwxr-xr-x  824      root    root  20480 May 20 12:02  ..
 -rw-r--r--    1  lanck002  planck    124 May 20 12:01  .bashrc
 -rw-r--r--    1  lanck002  planck    658 May 20 12:01  .zshrc
 drwx------    2  lanck002  planck   4096 May 20 12:01  .
 -rw-r--r--    1  lanck002  planck    120 May 20 12:01  .gtkrc
 -rw-r--r--    1  lanck002  planck    191 May 20 12:01  .bash_profile
 -rw-r--r--    1  lanck002  planck     24 May 20 12:01  .bash_logout
 -rw-r--r--    1  lanck002  planck    383 May 20 12:01  .emacs
 uberftp> 
 uberftp> put delme.txt  
 delme.txt:  6 bytes in 0.26 seconds. 0.02 KB/sec
 uberftp> ls
 -rw-r--r--    1  lanck002  planck   5619 May 20 12:01  .canna
 drwxr-xr-x  824      root    root  20480 May 20 12:02  ..
 -rw-r--r--    1  lanck002  planck    124 May 20 12:01  .bashrc
 -rw-r--r--    1  lanck002  planck    658 May 20 12:01  .zshrc
 drwx------    2  lanck002  planck   4096 Aug 18 19:54  .
 -rw-r--r--    1  lanck002  planck    120 May 20 12:01  .gtkrc
 -rw-r--r--    1  lanck002  planck      6 Aug 18 19:54  delme.txt
 -rw-r--r--    1  lanck002  planck    191 May 20 12:01  .bash_profile
 -rw-r--r--    1  lanck002  planck     24 May 20 12:01  .bash_logout
 -rw-r--r--    1  lanck002  planck    383 May 20 12:01  .emacs
 uberftp> rm delme.txt
 uberftp> 
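
The same session can also be run without the interactive prompt. The sketch below assumes that uberftp accepts the host followed by a single string of commands separated by semicolons; check the uberftp manual page of your installation before relying on this form. uberftp_batch is a hypothetical helper name.

```shell
# Hypothetical helper: run a fixed list of uberftp commands non-interactively.
# Assumption: uberftp accepts "HOST 'cmd1; cmd2; ...'" on its command line.
# Usage: uberftp_batch HOST CMD [CMD ...]
uberftp_batch() {
    host=$1
    shift
    cmds=
    for c in "$@"; do
        # Join the commands with "; ", without a leading separator
        cmds="${cmds:+$cmds; }$c"
    done
    uberftp "$host" "$cmds"
}
```

For example, uberftp_batch dpm.grid.box "put delme.txt" "ls" "rm delme.txt" would replay the commands of the session above.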

4.  globus-url-copy

This command transfers files to, from, and between GridFTP servers.
It handles local-to-remote, remote-to-local, and remote-to-remote (third party) transfers.

The simplest way to use this command is as follows:

globus-url-copy [-vb] sourceURL destURL

When the source or the destination file is local, you have to use the file:// protocol and give the full path.
The -vb flag displays, during the transfer, the number of bytes transferred and the transfer rate.

 [user@localhost ~]$ globus-url-copy -vb  file:/home/user/delme.txt gsiftp://dpm.grid.box/tmp/$USER/delme
 Source: file:/home/user/
 Dest:   gsiftp://dpm.grid.box/tmp/user
   delme.txt  ->  delme
             6 bytes         0.00 MB/sec avg         0.00 MB/sec inst
 [user@localhost ~]$ edg-gridftp-ls gsiftp://dpm.grid.box/tmp/user/delme --verbose
 -rw-r--r--   1 lanck002   planck            6 Aug 18 19:16 delme
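
Since the local side must be given as a file URL with a full path, it can help to build that URL automatically. A minimal sketch, where local_file_url and upload_file are hypothetical helper names:

```shell
# Hypothetical helper: turn a local path into a file:// URL with a full path.
local_file_url() {
    path=$1
    case $path in
        /*) ;;                          # already absolute
        *)  path="$(pwd)/$path" ;;      # make relative paths absolute
    esac
    printf 'file://%s\n' "$path"
}

# Hypothetical helper: upload a local file to a GridFTP URL.
# Usage: upload_file LOCAL_PATH DEST_URL
upload_file() {
    globus-url-copy -vb "$(local_file_url "$1")" "$2"
}
```

With these helpers, upload_file delme.txt gsiftp://dpm.grid.box/tmp/$USER/delme, run from /home/user, is equivalent to the transfer shown above.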

globus-url-copy supports multichannel file transfers, which optimize transfers over a WAN. The -p option sets the number of parallel transfer channels.

 [user@localhost ~]$ globus-url-copy -vb -p 16 file:/home/user/delme.txt gsiftp://dpm.grid.box/tmp/$USER/delme2
 Source: file:/home/user/
 Dest:   gsiftp://dpm.grid.box/tmp/user
   delme.txt  ->  delme2
             6 bytes         0.00 MB/sec avg         0.00 MB/sec inst 
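
When the number of channels comes from a script variable, it is worth validating it before starting the transfer. A minimal sketch, where parallel_copy is a hypothetical helper name and the choice of exit status 2 for bad input is arbitrary:

```shell
# Hypothetical helper: copy with a validated number of parallel channels.
# Usage: parallel_copy CHANNELS SOURCE_URL DEST_URL
parallel_copy() {
    channels=$1
    src=$2
    dst=$3

    # Reject empty values, non-digits, and zero
    case $channels in
        ''|*[!0-9]*) channels=0 ;;
    esac
    if [ "$channels" -lt 1 ]; then
        echo "number of channels must be a positive integer" >&2
        return 2
    fi

    globus-url-copy -vb -p "$channels" "$src" "$dst"
}
```

For a very small file such as the 6-byte example above, extra channels bring no visible benefit; parallel transfers pay off mainly with large files on high-latency links.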

globus-url-copy also supports third-party transfers, in which both the source and the destination are GridFTP servers:

 [user@localhost ~]$ globus-url-copy -vb -p 16 gsiftp://grid2.fe.infn.it/tmp/delme gsiftp://gridse.ilc.cnr.it/tmp/delme
 Source: gsiftp://grid2.fe.infn.it/tmp/
 Dest:   gsiftp://gridse.ilc.cnr.it/tmp/
   delme
             6 bytes         0.00 MB/sec avg         0.00 MB/sec inst
 [user@localhost ~]$ edg-gridftp-ls --verbose  gsiftp://gridse.ilc.cnr.it/tmp/delme
 -rw-r--r--   1  inaf008     inaf            6 Aug 18 19:19 /tmp/delme