Tips on how to access the data files

Note: Many of the examples below are for the 35-ton data and refer to the experiment lbne. These days, use the experiment dune for DUNE MC challenges, 3x1x1, and ProtoDUNE data.

If you know the run number you are interested in (35-ton only), just find the corresponding file (one file per run) in the list linked above. Look up its location in samweb -- here's a minimal set of commands. Setting up a dunetpc offline environment will accomplish all of the setups needed.

$ source /grid/fermiapp/products/dune/setup_dune.sh
$ setup sam_web_client
$ setup ifdhc
$ samweb -e dune get-file-access-url lbne_r004024_sr01_20151027T154934.root

which prints this out:

gsiftp://fndca1.fnal.gov:2811/pnfs/fnal.gov/usr/lbne/test-data/lbne/raw/00/04/41/51/lbne_r004024_sr01_20151027T154934.root

$ cd to a directory with a lot of free space
$ ifdh cp -D gsiftp://fndca1.fnal.gov:2811/pnfs/fnal.gov/usr/lbne/test-data/lbne/raw/00/04/41/51/lbne_r004024_sr01_20151027T154934.root .

You can also access the file directly in dCache through the /pnfs file system. In the directory name from the URL above, replace /pnfs/fnal.gov/usr/lbne/ with /pnfs/lbne/. Example:

$ config_dumper /pnfs/lbne/test-data/lbne/raw/00/04/41/51/lbne_r004024_sr01_20151027T154934.root
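The prefix substitution described above can be scripted. The function below is a minimal sketch in plain POSIX shell. The name url_to_pnfs is hypothetical (it is not part of samweb or ifdh), and the function assumes the URL contains the /pnfs/fnal.gov/usr/lbne/ prefix shown in the example.

```shell
# Hypothetical helper: turn a SAM access URL into a path under /pnfs/lbne/.
url_to_pnfs() {
    case "$1" in
        */pnfs/fnal.gov/usr/lbne/*)
            # Drop the protocol, host, port and long dCache prefix,
            # then prepend the short mount point.
            printf '/pnfs/lbne/%s\n' "${1#*/pnfs/fnal.gov/usr/lbne/}"
            ;;
        *)
            # Anything else is left alone and reported as an error.
            echo "unexpected URL: $1" >&2
            return 1
            ;;
    esac
}
```

The output of samweb -e dune get-file-access-url can be passed to url_to_pnfs as its single argument, and the result handed to config_dumper or sam_metadata_dumper.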

To get SAM metadata that's stored inside of an art-formatted file:

$ sam_metadata_dumper /pnfs/lbne/test-data/lbne/raw/00/04/41/51/lbne_r004024_sr01_20151027T154934.root

A more complete metadata printout can be had with this command, which queries the SAM database:

$ samweb -e dune get-metadata lbne_r004024_sr01_20151027T154934.root

To list raw data files for a given run:

$ samweb list-files "run_number=4024 and data_tier=raw"

The "=" signs are optional. A second value for data_tier is "sliced", for files that have been run through the splitter. The files were split using the current 'default' parameters, which take all of every payload; ghost triggers are therefore present in this sample, and the size of each event is 15,000 TPC ticks. Note -- sliced data with application art, family daqag and version v00_00_01 are known to be buggy -- they have incorrect timestamp information for the Penn Trigger Board and SSPs. TPC data should be fine, however. We will version the sliced data in the future with the dunetpc version.

To find out what metadata fields are available:

$ samweb list-files --help-dimensions wa105_r774_s0_1499894298.root

One thing to note -- the "Application" field contains three subfields. To query on a subfield such as "version", use just version and not application.version.

To create a SAM dataset definition:

$ kx509 #you first have to create an X.509 certificate from your Kerberos ticket
$ samweb -e dune create-definition rawdata_run_8257 'run_number=8257 and data_tier=raw'

To check that the definition selects the files you expect:

$ samweb list-definition-files rawdata_run_8257
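If you need definitions for several runs, the definition name and query can be generated from the run number. The following is a minimal sketch: the function name definition_command is hypothetical, and it assumes you want to keep the rawdata_run_<run number> naming used in the example above. It only prints the samweb command so that you can inspect it before running it yourself.

```shell
# Hypothetical helper: print (do not execute) the create-definition
# command for the raw data of one run.
definition_command() (
    run="$1"
    # Accept only a non-empty string of digits as a run number.
    case "$run" in
        ''|*[!0-9]*)
            echo "run number must be an integer: $run" >&2
            exit 1
            ;;
    esac
    printf "samweb -e dune create-definition rawdata_run_%s 'run_number=%s and data_tier=raw'\n" "$run" "$run"
)
```

The body is wrapped in parentheses so that it runs in a subshell and the variable run does not leak into your session.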

Another example: a query on contributing components. There are sixteen RCEs (00 through 15), seven SSPs (ssp01 through ssp07, though ssp07 is not connected to a detector), and penn01. This command lists files that have RCE 0, SSP 2, and the Penn Trigger Board contributing:

$ samweb -e dune list-files "(lbne_data.detector_type like %rce00%) and (lbne_data.detector_type like %penn01%) and (lbne_data.detector_type like %ssp02%)"

Here's an example requiring all components to be present (cut and paste to use):

$ samweb -e dune list-files "\
(lbne_data.detector_type like %rce00%) and \
(lbne_data.detector_type like %rce01%) and \
(lbne_data.detector_type like %rce02%) and \
(lbne_data.detector_type like %rce03%) and \
(lbne_data.detector_type like %rce04%) and \
(lbne_data.detector_type like %rce05%) and \
(lbne_data.detector_type like %rce06%) and \
(lbne_data.detector_type like %rce07%) and \
(lbne_data.detector_type like %rce08%) and \
(lbne_data.detector_type like %rce09%) and \
(lbne_data.detector_type like %rce10%) and \
(lbne_data.detector_type like %rce11%) and \
(lbne_data.detector_type like %rce12%) and \
(lbne_data.detector_type like %rce13%) and \
(lbne_data.detector_type like %rce14%) and \
(lbne_data.detector_type like %rce15%) and \
(lbne_data.detector_type like %ssp01%) and \
(lbne_data.detector_type like %ssp02%) and \
(lbne_data.detector_type like %ssp03%) and \
(lbne_data.detector_type like %ssp04%) and \
(lbne_data.detector_type like %ssp05%) and \
(lbne_data.detector_type like %ssp06%) and \
(lbne_data.detector_type like %penn01%) and \
(data_tier=raw)"
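The long query above follows a regular pattern, so it can also be assembled in a loop. This is a minimal sketch in POSIX shell; the function name all_components_query is hypothetical. It covers the same components as the example (rce00 through rce15, ssp01 through ssp06, and penn01) and puts the data_tier condition first, which does not change the meaning because every term is joined with "and".

```shell
# Hypothetical helper: print a SAM query requiring every connected
# 35-ton component (16 RCEs, 6 SSPs, the Penn Trigger Board).
all_components_query() (
    query="data_tier=raw"
    # RCEs are numbered 00 through 15.
    i=0
    while [ "$i" -le 15 ]; do
        query="$query and (lbne_data.detector_type like %$(printf 'rce%02d' "$i")%)"
        i=$((i + 1))
    done
    # SSPs 01 through 06; ssp07 is not connected to a detector.
    i=1
    while [ "$i" -le 6 ]; do
        query="$query and (lbne_data.detector_type like %$(printf 'ssp%02d' "$i")%)"
        i=$((i + 1))
    done
    query="$query and (lbne_data.detector_type like %penn01%)"
    printf '%s\n' "$query"
)
```

The printed string can be passed, inside double quotes, as the single query argument to samweb -e dune list-files.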

Here's an example filtering on the DAQ configuration:

$ samweb -e dune list-files "(lbne_data.run_mode ssps_and_ptb_pd_filter_study)"

Another useful command:

$ samweb -e dune locate-file lbne_r004024_sr01_20151027T154934.root

which gives a dCache directory and, if the file is on tape, a tape label enclosed in parentheses. Accessing a file that has migrated to tape and off of dCache disk should proceed as if it were still on disk, but with a delay while the data are staged.

If you are not authorized to upload metadata, submit a Service Desk ticket at http://servicedesk.fnal.gov

You can skip the -e dune option by setting the environment variable GROUP to dune.

The lbne GROUP value is no longer supported. Specifying -e lbne on a samweb command will now give "404 not found" errors.

Alex Himmel has a handy script,

~ahimmel/bin/sam_query_paths

which goes straight from a query to a list of dCache paths.

35-ton Useful Run Numbers

(Approximate) HV on first time, pumps on. First run taken on February 12, 2016: 10847
(Approximate) Tubing Break: 2:00 AM March 19, 2016. The last run taken March 19, 2016: 17594

Useful Links

Data Quality Monitoring for the 35-ton run: http://lbne-dqm.fnal.gov

SAM query dimension syntax

SAM user guide

samweb command line reference

SAM metadata format