A popularity prediction and dynamic data replication study for the ATLAS distributed data management / by Thomas Beermann. Wuppertal, July 5, 2017
Content
Abstract
Acknowledgements
Contents
I Introduction
1 The LHC, ATLAS and the WLCG
1.1 General
1.1.1 CERN and LHC
1.1.2 The ATLAS Experiment
1.1.3 Trigger
1.1.4 WLCG
2 ATLAS Computing Model and Systems
2.1 ATLAS Computing Model
2.1.1 Data Types
2.1.2 Cloud & Tier Structure
2.2 ATLAS Computing Systems
2.2.1 DDM
2.2.2 Automated Data Replication
2.2.3 Space Tokens
2.2.4 Data Retention
2.2.5 Data Replication Policies
2.2.6 WMS
2.2.7 Tracer
2.2.8 Popularity System
2.2.9 Victor
2.2.10 PD2P
3 Motivation
II Prediction
4 Popularity Prediction
4.1 Introduction
4.1.1 Time Series
4.1.2 Time Series Prediction
4.2 System Design
4.2.1 Input Data
4.3 Prediction Methods
4.3.1 Static Prediction
4.3.2 Linear Prediction
4.4 Neural Network Prediction
4.4.1 Basics
4.4.2 Training
4.4.3 Dataset Access Prediction
4.4.4 Neural network separation
4.5 Hybrid Solution
4.5.1 Prefiltering
4.5.2 Hybrid Prediction
4.6 Implementation
4.6.1 Python neural networks initialisation
4.6.2 Scale Down/Up
4.6.3 Training
4.6.4 Prediction
5 Prediction Evaluation
5.1 Prediction Input Data
5.2 Data Access Pattern Examples
5.2.1 Highly popular project / datatypes
5.2.2 Popular project / datatypes
5.2.3 Unpopular project / datatypes
5.3 Evaluation
5.3.1 Evaluation Metric
5.3.2 Static Prediction
5.3.3 Linear Prediction
5.3.4 Neural Networks
5.3.5 Summary
5.3.6 Hybrid Prediction
5.4 Conclusion
III Redistribution
6 Data Redistribution
6.1 Introduction
6.1.1 Example
6.2 Requirements
6.2.1 Input Data
6.2.2 Profits
6.2.3 Constraints
6.2.4 Costs
6.3 Redistribution Algorithm
6.3.1 Replica Creation
6.3.2 Replica Deletion
6.3.3 Combination
6.3.4 Example
6.4 Implementation
6.4.1 Site Module
6.4.2 Data Catalogue
6.4.3 Redistribution
7 Grid Simulator
7.1 Motivation
7.2 Requirements and Design
7.3 Architecture / Components
7.3.1 Sites
7.3.2 DDM
7.3.3 WMS
7.3.4 Inputs
7.3.5 Outputs
7.3.6 Workflow
7.4 Implementation
7.4.1 Simulation Framework
7.4.2 Handling of Jobs and Sites
7.4.3 Data Catalogue
7.4.4 Transfers and Deletions during a simulation
7.4.5 Adding Workload to the System
7.4.6 Site Utilisation
7.4.7 Break and Restart Simulation
7.4.8 Example
7.4.9 Summary
8 Redistribution Evaluation
8.1 Introduction
8.2 Simulation Setup
8.2.1 Sites Setup
8.2.2 Workload Setup
8.2.3 Replica Catalogue
8.2.4 Simulation Mode
8.2.5 Redistribution
8.3 Simulation Parameters
8.3.1 Workload
8.3.2 Sites
8.3.3 Replicas
8.4 Results
8.4.1 First Week
8.4.2 Second Week
8.4.3 Third Week
8.5 Summary
IV Summary
9 Conclusion and Outlook
9.1 Summary
9.2 Outlook
9.2.1 Open Questions
9.2.2 Further developments
V Appendix
List of Figures
List of Tables
10 Code Samples
10.1 Prediction
10.2 Redistribution
10.3 Simulator
Reference