SFU Arbutus Tests

Introduction

The purpose was to run and benchmark jobs on SFU's arbutus CE. The input datasets were taken from triumf19Aug2010.txt, which consisted of the following:

mc09_7TeV.105830.JF17_herwig_jet_filter.merge.AOD.e507_s765_s767_r1215_r1210/
total files: 1971
total size: 2.01 TB
total events: 9851996
> dq2-list-datasets-container mc09_7TeV.105830.JF17_herwig_jet_filter.merge.AOD.e507_s765_s767_r1215_r1210/
mc09_7TeV.105830.JF17_herwig_jet_filter.merge.AOD.e507_s765_s767_r1215_r1210_tid125847_00
mc09_7TeV.108087.PythiaPhotonJet_Unbinned17.merge.AOD.e505_s765_s767_r1215_r1210/
total files: 1000
total size: 755.12 GB
total events: 4994464
> dq2-list-datasets-container mc09_7TeV.108087.PythiaPhotonJet_Unbinned17.merge.AOD.e505_s765_s767_r1215_r1210/
mc09_7TeV.108087.PythiaPhotonJet_Unbinned17.merge.AOD.e505_s765_s767_r1215_r1210_tid125845_00
mc09_7TeV.108087.PythiaPhotonJet_Unbinned17.merge.AOD.e505_s765_s767_r1207_r1210/
total files: 500
total size: 757.96 GB
total events: 4994464
> dq2-list-datasets-container mc09_7TeV.108087.PythiaPhotonJet_Unbinned17.merge.AOD.e505_s765_s767_r1207_r1210/
mc09_7TeV.108087.PythiaPhotonJet_Unbinned17.merge.AOD.e505_s765_s767_r1207_r1210_tid124453_00
mc09_7TeV.105001.pythia_minbias.merge.AOD.e517_s764_s767_r1204_r1210/
total files: 1000
total size: 777.51 GB
total events: 9996249
> dq2-list-datasets-container mc09_7TeV.105001.pythia_minbias.merge.AOD.e517_s764_s767_r1204_r1210/
mc09_7TeV.105001.pythia_minbias.merge.AOD.e517_s764_s767_r1204_r1210_tid123082_00
mc09_7TeV.105001.pythia_minbias.merge.AOD.e517_s764_s767_r1204_r1210_tid123186_00

Note that there are 5 datasets in those 4 containers. (The above datasets reside in LOCALGROUPDISK, so new HC sites reflecting this need to be created.) Additional HC configuration: uncheck "Resubmit enabled" and set "Num datasets per bulk = 5". For Panda, the file size was set by Dan/Johannes with -o[defaults_DQ2JobSplitter]filesize=13336 (in MB). For Ganga, configure HC to submit with filesize=13336 so that the number of jobs matches that of the Panda test.
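As a rough cross-check of the input volume and of what filesize=13336 implies, the totals from the listing can be summed and a lower bound on the number of subjobs per container estimated. This is only a sketch: it assumes 1 TB = 1000 GB and 1 GB = 1000 MB (the listing does not say whether units are decimal or binary), the short container labels and the helper min_subjobs are hypothetical, and the real splitter works on whole files per dataset, so actual job counts may be higher.

```python
import math

# Container totals copied from the listing: (files, size in GB, events).
# The 2.01 TB entry is converted assuming 1 TB = 1000 GB (an assumption).
CONTAINERS = {
    "105830.JF17_herwig_jet_filter r1215_r1210": (1971, 2010.0, 9851996),
    "108087.PythiaPhotonJet_Unbinned17 r1215_r1210": (1000, 755.12, 4994464),
    "108087.PythiaPhotonJet_Unbinned17 r1207_r1210": (500, 757.96, 4994464),
    "105001.pythia_minbias r1204_r1210": (1000, 777.51, 9996249),
}

FILESIZE_CAP_MB = 13336  # filesize value used in the HC configuration


def min_subjobs(size_gb, cap_mb=FILESIZE_CAP_MB):
    """Lower bound on subjobs if each reads at most cap_mb of input.

    Hypothetical helper: it ignores file boundaries, so the real
    splitter can only produce this many jobs or more.
    """
    return math.ceil(size_gb * 1000 / cap_mb)


total_files = sum(files for files, _, _ in CONTAINERS.values())
total_events = sum(events for _, _, events in CONTAINERS.values())

if __name__ == "__main__":
    for name, (files, size_gb, events) in CONTAINERS.items():
        print(f"{name}: {files} files, at least {min_subjobs(size_gb)} subjobs")
    print(f"all containers: {total_files} files, {total_events} events")
```

Because the estimate depends only on the input size and the cap, it should come out the same for Panda and Ganga, which is the point of using the same filesize in both.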

These tests can be compared to those at other sites, e.g. Victoria HermesII.

Hardware Specs

See CACloudSiteHardware

Note that unless stated otherwise, the SW area is cvmfs mounted by nfs on WNs.

Cluster configuration

Dec 3 - Dec 7 13:30 - worker nodes: s1, b413, b414

Dec 7 13:30 - worker nodes: +b1 (offline until we run stress tests)

Dec 12 - worker nodes: b1, s1, b413, b414

Jan 11 12:20 PM - worker nodes: b1, s1, b413, b414, b404, b405 (s1 is temporarily offline for investigations)

Benchmark Tests

Test 14609

worker nodes: b413, b414. In this test, ATLAS_LOCAL_AREA is undefined and VO_ATLAS_SW_DIR points to the nfs software area (VO_ATLAS_SW_DIR=/opt/exp_soft/atlas) instead of the cvmfs software area (i.e. it will be like bugaboo).
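A worker node's environment could be checked against this configuration with a few lines of Python. This is a minimal sketch, not part of the test setup itself: the helper describe_sw_setup and its return labels are hypothetical, and only the two variable names and the nfs path are taken from the description above.

```python
import os

# nfs software area as quoted for this test.
NFS_SW_DIR = "/opt/exp_soft/atlas"


def describe_sw_setup(env):
    """Classify an environment mapping (hypothetical helper).

    Returns "nfs" when ATLAS_LOCAL_AREA is undefined and
    VO_ATLAS_SW_DIR points at the nfs software area, else "other".
    """
    local_area = env.get("ATLAS_LOCAL_AREA")
    sw_dir = (env.get("VO_ATLAS_SW_DIR") or "").rstrip("/")
    if local_area is None and sw_dir == NFS_SW_DIR:
        return "nfs"
    return "other"


if __name__ == "__main__":
    # Report the setup seen by the current process.
    print(describe_sw_setup(os.environ))
```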

Note: a wrapper lcg-cp script is being used for an EMI/d-cache problem.

Test 14609

Test 14659

worker nodes: b413, b414. Using cvmfs over nfs. Otherwise identical to test 14609.

Note: a wrapper lcg-cp script is being used for an EMI/d-cache problem.

Test 14659

Stress Tests

Test Templates

Template  Description
486       UA 17.2.2.1 Panda
481       UA 17.6.0 Panda TTreeCache On skipEvent
469       17.6.0 AODToEgammaD3PD PANDA default data-access
467       MC AtlasG4 _trf 17.2.2.2 default data access
453       17.0.6.4 AODToPhysicsD3PD PANDA default data-access
451       Muon 16.0.3.3 PANDA default data access
447       Muon 16.0.3.3 PANDA Output Merging copy-to-scratch
446       Muon 17.0.6 PANDA default data-access

Round1

Test 14725

Template 486, UA 17.2.2.1 Panda

Test 14725

Test 14728

Template 481 UA 17.6.0 Panda TTreeCache On skipEvent

Test 14728

Test 14732

Template 469 17.6.0 AODToEgammaD3PD PANDA default data-access

Test 14732

Aborted as this test fails.

Test 14733

Template 467 MC AtlasG4 _trf 17.2.2.2 default data access

Test 14733

Test failed to submit and was abandoned.

Test 14736

Template 453 17.0.6.4 AODToPhysicsD3PD PANDA default data-access

Test 14736

Test 14738

Template 451 Muon 16.0.3.3 PANDA default data access

Test 14738

Test 14740

Template 447 Muon 16.0.3.3 PANDA Output Merging copy-to-scratch

Test 14740

Test 14742

Template 446 Muon 17.0.6 PANDA default data-access

Test 14742

Round2

Test Template 486

Template 486, UA 17.2.2.1 Panda

Test Template 481

Template 481 UA 17.6.0 Panda TTreeCache On skipEvent

Test Template 453

Template 453 17.0.6.4 AODToPhysicsD3PD PANDA default data-access

Test Template 451

Template 451 Muon 16.0.3.3 PANDA default data access

Test Template 447

Template 447 Muon 16.0.3.3 PANDA Output Merging copy-to-scratch

Test Template 446

Template 446 Muon 17.0.6 PANDA default data-access
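The Round2 list appears to be the Round1 templates minus the two whose tests were aborted or failed to submit. The sketch below records the Round1 test-to-template mapping from this page and checks that reading; the "failed" flag is an interpretation of the notes under tests 14732 and 14733, and the page does not state results for the other tests.

```python
# (test id, template id, failed?) for Round1, as listed on this page.
# failed=True only where the page says the test was aborted or abandoned;
# False means no failure is noted, not that the test is known to pass.
ROUND1 = [
    (14725, 486, False),
    (14728, 481, False),
    (14732, 469, True),   # "Aborted as this test fails."
    (14733, 467, True),   # failed to submit, abandoned
    (14736, 453, False),
    (14738, 451, False),
    (14740, 447, False),
    (14742, 446, False),
]

# Templates listed under Round2, in page order.
ROUND2_TEMPLATES = [486, 481, 453, 451, 447, 446]

# Templates carried over if failed Round1 templates are dropped.
carried_over = [template for _, template, failed in ROUND1 if not failed]

if __name__ == "__main__":
    print("carried over:", carried_over)
    print("matches Round2:", carried_over == ROUND2_TEMPLATES)
```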

Results

Benchmark Tests:

Please see side-by-side comparison: SFU-Arbutus.pdf

These were run on 24 cores and show few differences, apart from differences in SE access, which could be due to the different loads. cvmfs over nfs on 24 cores (2 WNs) performs just as well as the regular nfs software area, which is expected.

-- AsokaDeSilva - 2012-12-05

Topic attachments
Attachment: SFU-Arbutus.pdf (r1, 351.7 K, 2012-12-12 18:07, AsokaDeSilva)
Topic revision: r28 - 2013-02-17 - AsokaDeSilva
 