Common ATLAS-Canada computing resources and how to use them


Additional CPU and disk storage resources at the Canadian Tier-1 and Tier-2s, beyond those used for ATLAS production and user grid resources, are made available to ATLAS-Canada members.

History

In the past, ATLAS-Canada members accessed extra resources in two ways:

  1. Grid jobs with the CA role that were brokered to Canadian grid sites were effectively given a priority boost in the PanDA queue. We will call this skewing in favour of Canadians. If the jobs were brokered to non-Canadian sites, there was no skewing.
  2. Canadians could request the power-user role to directly use beyond-pledged resources at Canadian grid sites only.

For technical reasons, method 1 has not actually worked in the last 3-4 years. Method 2 is becoming less relevant as the ATLAS computing model evolves. To first order, the Canadian beyond-pledged resources have been benefiting ATLAS as a whole, with very little advantage for ATLAS-Canada members. Thus, we are revisiting the usage of our beyond-pledged resources to give a possibly bigger advantage to ATLAS-Canada members. The need for the ATLAS-Canada PCom to vet requests for these resources remains unchanged.

Extra Canadian-only resources

The computing use cases roughly divide into two functions: MC production and user grid jobs.

Canadian MC production. In the last 9 years Canadians have requested resources to perform MC production 4 times, mostly in the earlier years. Nowadays, it is very difficult for a user to execute the MC production chain privately, and the produced samples would not be considered official. In keeping with ATLAS policy, we propose that production jobs for which ATLAS-Canada members request extra resources be run in the ATLAS central production system in the normal way. Once approved by PCom, the request would be vetted by ATLAS PC/PMG as a beyond-pledged request (ATLAS has actually asked that the ATLAS-Canada computing coordinator consult with PC/PMG before the Canadian internal review). The procedure would be identical to that implemented in 2017 for the USA. The understanding is that these requests would be infrequent, about one a year, and should not require more resources than the Canadian beyond-pledged resources. Based on discussions with ADC, this would be considered a gentleman’s agreement and the accounting would be done privately. In terms of implementation, the request would be submitted by a Canadian MC production manager. The latter is simple, as Doug Gingrich currently has that role and Alberta has an institutional responsibility in MC production. In the longer term, the ATLAS-Canada computing manager can keep the books. But as mentioned above, these are likely to be one-off cases that happen only about once a year.

User grid jobs. These can be divided into three cases: a need for essentially CPU only, disk space only, or both. The use case for needing extra CPU is not clear to many computing people. It seems equivalent to needing a priority boost during times of heavy user grid activity or tight schedules. The 6 use cases in the last 9 years seem less relevant lately, and in some cases could have been accommodated if the user had had access to a decent Tier-3 or to a Canadian-only batch queue at a Canadian Tier-2 site. We propose to handle these requests on a case-by-case basis. In addition, a batch queue has been implemented at the Tier-2(s) for those cases in which it is useful. Please see below.

  1. If the user needs a short priority boost on standard ATLAS grid resources, we make an argument to ADC similar to the MC production argument, in which our beyond-pledged resources are leveraged into the priority boost. The accounting is managed as a gentleman’s agreement by the Canadian MC production manager (in the future, the ATLAS-Canada computing manager) so as not to exceed the total Canadian beyond-pledged resources.
  2. For a Canadian user who requires extra resources in Canada, the request can be handled through the power-user role, as has been done for the last 9 years. This is also how disk space can be handled. When this is no longer technically possible (which is almost the case now), the user can use the batch queue at the Tier-2.

To initiate your request, contact the ATLAS-Canada computing coordinator (currently Doug Gingrich <gingrich@ualberta.ca>). If none of the above fits your situation, discuss it with the ATLAS-Canada computing coordinator. One of the roles of the ATLAS-Canada computing coordinator is to enable a computing advantage for ATLAS-Canada members.

Canadian Tier-3 resources at Compute Canada

Information about the Canadian Tier-3 resources at Compute Canada and how to use them can be found here.

Requesting resources

If you just want to use Canadian Tier-3 CPU and the Compute Canada /project and /scratch space, no request is needed. Follow the above Tier-3 link.

If you require access to the ATLASGROUPDISKs beyond your quota, you need to request the space as described below.

To explore possibilities and ask questions, first contact the ATLAS-Canada computing coordinator (currently Doug Gingrich <gingrich@ualberta.ca>).

A request should be submitted to the ATLAS-Canada Prioritization Committee (PCom), which consists of the Physics coordinator and the Computing coordinator; please send an email to atlas-canada-pcom@cern.ch.

This request must specify:

  • The physics motivation and justification for the request.
  • A list of the members on the project. We are only interested in the members at Canadian institutions, but please indicate if there are members at non-Canadian institutions.
  • If requesting CPU time, give the CPU required, for how long you need it, and details of how you estimated it.
  • If requesting disk space, give the TB required, for how long you need it, and details of how you estimated it. It would also be useful to know the data format (EVNT, HITS, AOD, xAOD, or other) and whether the files are particularly big or small.
  • The timeline of the project with estimates when and how long the resources will be needed.

If the request deals with official datasets and users at non-Canadian institutions are running jobs, we may recommend that you go through official ADC channels but run the jobs on the CA cloud, where we will ensure the jobs have the resources needed. On the other hand, if you are an individual user or a group of CA users who need these resources, we may grant you the role instead so that you can submit the tasks yourself. Note that in no situation will ATLAS-Canada user support run jobs for you; we will diagnose and help resolve site issues and ensure you get the required resources.
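
The CPU and disk estimates asked for in the request can be back-of-the-envelope calculations. The following is a minimal sketch of one way to produce them, not an official ATLAS-Canada tool. The function names, the safety factor, and all input numbers are hypothetical and chosen only for illustration; substitute values measured from a short test run of your own job.

```python
# Rough resource estimate for a PCom request.
# All names and numbers here are illustrative assumptions, not official values.

def estimate_cpu_core_hours(n_events, seconds_per_event, safety_factor=1.2):
    """Core-hours to process n_events, padded by a safety factor for retries."""
    return n_events * seconds_per_event * safety_factor / 3600.0


def estimate_disk_tb(n_events, kb_per_event, n_replicas=1):
    """Disk footprint in TB, using decimal units (1 TB = 1e9 kB)."""
    return n_events * kb_per_event * n_replicas / 1e9


if __name__ == "__main__":
    # Hypothetical inputs: measure these from a small test job.
    n_events = 50_000_000
    cpu = estimate_cpu_core_hours(n_events, seconds_per_event=0.5)
    disk = estimate_disk_tb(n_events, kb_per_event=40)
    print(f"CPU: {cpu:.0f} core-hours, disk: {disk:.1f} TB")
```

Quoting the inputs (number of events, measured time and size per event, any safety factor) alongside the totals gives PCom the "details of how you estimated it" that the request must include.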

-- Doug Gingrich - 2019-02-23

Topic revision: r2 - 2019-03-01 - DougGingrich
 