Manage data in the Grid using GridFTP
1. Introduction
GridFTP is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks. The GridFTP protocol is based on FTP, the highly popular Internet file transfer protocol. It combines a set of protocol features and extensions already defined in IETF RFCs with a few additional features that meet requirements from current data grid projects.
GridFTP provides the following protocol features:
- GSI security on control and data channels
- Multiple data channels for parallel transfers
- Partial file transfers
- Third-party (direct server-to-server) transfers
- Authenticated data channels
- Reusable data channels
- Command pipelining
The low-level tools allow users to perform actions on the GridFTP server of an SE.
There are 8 low-level commands, all of which are covered in this section.
edg-gridftp-exists | checks the existence of a file or directory on an SE. |
edg-gridftp-ls | lists the contents of a directory on an SE. |
edg-gridftp-mkdir | creates a directory on an SE. |
edg-gridftp-rename | renames a file on an SE. |
edg-gridftp-rm | deletes a file from an SE. |
edg-gridftp-rmdir | deletes a directory from an SE. |
globus-url-copy | copies a file onto (or to another location on) an SE. |
uberftp | interactive GridFTP client. |
2. edg-gridftp-* commands
All the edg-gridftp-* commands take as a parameter one or more URLs with the following format:
protocol://hostname(:port)/filepath
where
- protocol refers to the protocol used to access the SE (generally gsiftp)
- hostname refers to the host name of the SE you want to access
- port (optional) refers to the listening socket port of the GridFTP server; the default is 2811
- filepath is the path to the directory or file you wish to access
Here is an example of a real URL:
gsiftp://aliserv6.ct.infn.it/tmp/tony/bigfile.dat
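To make the URL format concrete, here is a minimal sketch of a shell function that splits such a URL into its four parts and falls back to port 2811 when none is given. The function name parse_se_url is hypothetical and is not part of any grid toolkit; it only illustrates the format described above.

```shell
# parse_se_url is a hypothetical helper, not a grid command.
# It splits protocol://hostname(:port)/filepath and prints
# "protocol hostname port filepath" on one line.
# Variables are global because plain POSIX sh has no "local".
parse_se_url() {
    url=$1
    case $url in
        *://*/*) ;;                      # looks like protocol://host/path
        *) echo "not a valid SE URL: $url" >&2; return 1 ;;
    esac
    protocol=${url%%://*}                # text before the first "://"
    rest=${url#*://}                     # hostname(:port)/filepath
    hostport=${rest%%/*}                 # text before the first "/"
    path=/${rest#*/}                     # everything after it, "/" restored
    case $hostport in
        *:*) host=${hostport%%:*}; port=${hostport##*:} ;;
        *)   host=$hostport;       port=2811 ;;   # default GridFTP port
    esac
    echo "$protocol $host $port $path"
}
```

Calling parse_se_url with the example URL above prints the protocol, the host name, the default port 2811 and the file path, separated by spaces.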
2.1 edg-gridftp-ls
edg-gridftp-ls will list (to the standard output) a file or the contents of a directory for the given URL.
The command is as follows:
edg-gridftp-ls [--proxy=proxy] [--verbose] [--noauth] URL
It will return a zero status on success. In all other cases, it will return a non-zero value and print an error message to the standard error.
[user@localhost ~]$ edg-gridftp-ls gsiftp://grid2.fe.infn.it/tmp
.
..
crlpem-B10253
crlpem-u27893
crlpem-T16248
crlpem-v22698
[...]
2.2 edg-gridftp-exists
edg-gridftp-exists determines whether the file or directory associated with the given URL exists.
The command is as follows:
edg-gridftp-exists URL [URL ...]
This command will return a status of 0 if the file or directory exists. In all other cases, it will return a non-zero value and print an error message to the standard error.
If more than one URL is given, then a successful return will only occur if all of the URLs exist. The processing will stop with the first failure.
[user@localhost ~]$ edg-gridftp-exists gsiftp://dpm.grid.box/tmp/$USER/delme
[user@localhost ~]$ echo $?
0
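Because processing stops at the first failure, a single call with several URLs does not tell you which ones are missing. The following is a minimal sketch, assuming edg-gridftp-exists is on the PATH and a valid proxy is available, that checks the URLs one at a time instead. The function name list_missing is hypothetical.

```shell
# list_missing is a hypothetical helper. It calls edg-gridftp-exists once
# per URL so that a failure on one URL does not hide the state of the others.
# It prints every URL that does not exist and returns non-zero if any is missing.
list_missing() {
    missing=0
    for url in "$@"; do
        # The command's own error message is discarded; only its status is used.
        if ! edg-gridftp-exists "$url" 2>/dev/null; then
            echo "$url"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

A script could then call list_missing with all the URLs it depends on and abort, or create the missing entries, based on the return status.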
2.3 edg-gridftp-rm
edg-gridftp-rm removes a file from a GridFTP server.
The command is as follows:
edg-gridftp-rm [--proxy=proxy] URL [URL ...]
This command will return a status of 0 if the file was successfully removed. In all other cases, it will return a non-zero value and print an error message to the standard error.
If more than one URL is given, then a successful return will only occur if all of the URLs were successfully deleted. The processing will stop with the first failure.
[user@localhost ~]$ edg-gridftp-rm gsiftp://grid2.fe.infn.it/tmp/delme
[user@localhost ~]$ edg-gridftp-rm gsiftp://gridse.ilc.cnr.it/tmp/delme
2.4 edg-gridftp-rename
edg-gridftp-rename renames a file or directory on a GridFTP server.
The command is as follows:
edg-gridftp-rename [--proxy=proxy] SourceURL DestinationURL
Everything but the filename is ignored on the destination URL, which means that files cannot be moved between servers.
This command will return a status of 0 if the file or directory was renamed. In all other cases, it will return a non-zero value and print an error message to the standard error.
[user@localhost ~]$ edg-gridftp-rename gsiftp://dpm.grid.box/tmp/$USER gsiftp://dpm.grid.box/tmp/test2
[user@localhost ~]$ edg-gridftp-exists gsiftp://dpm.grid.box/tmp/test2
2.5 edg-gridftp-mkdir
edg-gridftp-mkdir creates a directory on a GridFTP server.
The command is as follows:
edg-gridftp-mkdir [--proxy=proxy] [--parents] URL [URL ...]
This command will return a status of 0 if the directory was created successfully. In all other cases, it will return a non-zero value and print an error message to the standard error.
If the --parents option is given, any missing parent directories of the given URL will also be created.
If more than one URL is given, then the URLs are processed sequentially. The processing will stop with the first failure.
NOTE: The creation is only attempted if the URL does not already exist. If the given URL already exists as a regular file, this command will still return success.
[user@ui-1 ~]$ edg-gridftp-mkdir gsiftp://dpm.grid.box/tmp/$USER
[user@ui-1 ~]$ edg-gridftp-ls gsiftp://dpm.grid.box/tmp/$USER
.
..
[user@ui-1 ~]$
2.6 edg-gridftp-rmdir
edg-gridftp-rmdir removes a directory from a GridFTP server.
The command is as follows:
edg-gridftp-rmdir [--proxy=proxy] URL [URL ...]
This command will return a status of 0 if the directory was successfully removed. In all other cases, it will return a non-zero value and print an error message to the standard error.
If more than one URL is given, then a successful return will only occur if all of the URLs were successfully deleted. The processing will stop with the first failure.
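As an illustration of how the exit statuses of these commands can be combined, the sketch below removes a directory and then confirms with edg-gridftp-exists that it is really gone. It assumes both commands are on the PATH and that a valid proxy exists; the function name remove_dir_checked is hypothetical.

```shell
# remove_dir_checked is a hypothetical helper built only from the
# edg-gridftp-rmdir and edg-gridftp-exists commands described above.
remove_dir_checked() {
    # Stop immediately if the removal itself reports a failure.
    edg-gridftp-rmdir "$1" || return 1
    # A zero status from edg-gridftp-exists means the URL is still there.
    if edg-gridftp-exists "$1" 2>/dev/null; then
        echo "still present after rmdir: $1" >&2
        return 1
    fi
    return 0
}
```

The function takes a single directory URL in the format described in section 2 and returns 0 only when the removal succeeded and the URL no longer exists.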
3. The uberftp client
uberftp is a GridFTP-enabled client that supports both interactive use and FTP commands given on the uberftp command line to transfer files between two computers.
It provides many commands similar to the standard FTP clients (cat, cd, get, put, mget, mput, dir, ls, mkdir).
[user@localhost ~]$ uberftp dpm.grid.box
220 grid2.fe.infn.it GridFTP Server 2.3 (gcc32dbg, 1144436882-63) ready.
230 User gridbox002 logged in.
uberftp> ls
-rw-r--r-- 1 lanck002 planck 5619 May 20 12:01 .canna
drwxr-xr-x 824 root root 20480 May 20 12:02 ..
-rw-r--r-- 1 lanck002 planck 124 May 20 12:01 .bashrc
-rw-r--r-- 1 lanck002 planck 658 May 20 12:01 .zshrc
drwx------ 2 lanck002 planck 4096 May 20 12:01 .
-rw-r--r-- 1 lanck002 planck 120 May 20 12:01 .gtkrc
-rw-r--r-- 1 lanck002 planck 191 May 20 12:01 .bash_profile
-rw-r--r-- 1 lanck002 planck 24 May 20 12:01 .bash_logout
-rw-r--r-- 1 lanck002 planck 383 May 20 12:01 .emacs
uberftp>
uberftp> put delme.txt
delme.txt: 6 bytes in 0.26 seconds. 0.02 KB/sec
uberftp> ls
-rw-r--r-- 1 lanck002 planck 5619 May 20 12:01 .canna
drwxr-xr-x 824 root root 20480 May 20 12:02 ..
-rw-r--r-- 1 lanck002 planck 124 May 20 12:01 .bashrc
-rw-r--r-- 1 lanck002 planck 658 May 20 12:01 .zshrc
drwx------ 2 lanck002 planck 4096 Aug 18 19:54 .
-rw-r--r-- 1 lanck002 planck 120 May 20 12:01 .gtkrc
-rw-r--r-- 1 lanck002 planck 6 Aug 18 19:54 delme.txt
-rw-r--r-- 1 lanck002 planck 191 May 20 12:01 .bash_profile
-rw-r--r-- 1 lanck002 planck 24 May 20 12:01 .bash_logout
-rw-r--r-- 1 lanck002 planck 383 May 20 12:01 .emacs
uberftp> rm delme.txt
uberftp>
4. globus-url-copy
This command transfers files to, from and between GridFTP servers.
It handles local-to-remote, remote-to-local, and remote-to-remote (third party) transfers.
The simplest way to use this command is as follows:
globus-url-copy [-vb] sourceURL destURL
When the source or the destination file is local, you have to use the file:// protocol. The full path must be given.
The -vb flag is used to display, during the transfer, the number of bytes transferred and the transfer rate per second.
[user@localhost ~]$ globus-url-copy -vb file:/home/user/delme.txt gsiftp://dpm.grid.box/tmp/$USER/delme
Source: file:/home/user/
Dest: gsiftp://dpm.grid.box/tmp/user
delme.txt -> delme
6 bytes 0.00 MB/sec avg 0.00 MB/sec inst
[user@localhost ~]$ edg-gridftp-ls gsiftp://dpm.grid.box/tmp/user/delme --verbose
-rw-r--r-- 1 lanck002 planck 6 Aug 18 19:16 delme
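Since local files must be given as file:// URLs with a full path, a small wrapper can build that URL from an ordinary path. The following is a minimal sketch; local_file_url and upload_file are hypothetical names, and the sketch assumes globus-url-copy is on the PATH. It produces the file:///absolute/path form rather than the file:/absolute/path form used in the session above.

```shell
# local_file_url and upload_file are hypothetical helpers.

# Print a file:// URL for a local path, making relative paths absolute
# by prefixing the current working directory.
local_file_url() {
    case $1 in
        /*) echo "file://$1" ;;          # already a full path
        *)  echo "file://$PWD/$1" ;;     # relative to the current directory
    esac
}

# Copy a local file to a destination URL, showing transfer progress (-vb).
upload_file() {
    globus-url-copy -vb "$(local_file_url "$1")" "$2"
}
```

For example, upload_file delme.txt followed by a gsiftp:// destination URL would copy delme.txt from the current directory without the caller having to spell out its full path.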
globus-url-copy supports multichannel file transfers, which optimize file transfer over a WAN. The -p option sets the number of parallel transfer channels.
[user@localhost ~]$ globus-url-copy -vb -p 16 file:/home/user/delme.txt gsiftp://dpm.grid.box/tmp/$USER/delme2
Source: file:/home/user/
Dest: gsiftp://dpm.grid.box/tmp/user
delme.txt -> delme2
6 bytes 0.00 MB/sec avg 0.00 MB/sec inst
globus-url-copy also supports third-party transfers:
[user@localhost ~]$ globus-url-copy -vb -p 16 gsiftp://grid2.fe.infn.it/tmp/delme gsiftp://gridse.ilc.cnr.it/tmp/delme
Source: gsiftp://grid2.fe.infn.it/tmp/
Dest: gsiftp://gridse.ilc.cnr.it/tmp/
delme
6 bytes 0.00 MB/sec avg 0.00 MB/sec inst
[user@localhost ~]$ edg-gridftp-ls --verbose gsiftp://gridse.ilc.cnr.it/tmp/delme
-rw-r--r-- 1 inaf008 inaf 6 Aug 18 19:19 /tmp/delme
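The copy-then-check pattern shown in the sessions above can be wrapped in a function. The sketch below is one possible way to do it, assuming globus-url-copy and edg-gridftp-exists are on the PATH and a valid proxy exists; the function name copy_checked and its argument order are hypothetical.

```shell
# copy_checked is a hypothetical helper.
# Usage: copy_checked STREAMS SOURCE_URL DEST_URL
# It copies with the requested number of parallel channels (-p) and then
# confirms that the destination exists on the target server.
copy_checked() {
    streams=$1
    src=$2
    dst=$3
    # Give up if the transfer itself fails.
    globus-url-copy -p "$streams" "$src" "$dst" || return 1
    # The function's status is that of the existence check.
    edg-gridftp-exists "$dst"
}
```

The same function works for local-to-remote and third-party transfers, provided the destination is a URL that edg-gridftp-exists can check.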
