|[Congrats] In 2017, our paper "Predicting HPC parallel program performance based on LLVM compiler" has been accepted by Cluster Computing (SCI Impact factor 1.514).|
Performance prediction of parallel programs plays a key role in many areas, such as parallel system design, parallel program optimization, and parallel system procurement. Accurate and efficient performance prediction on large-scale parallel systems is a challenging problem. To solve this problem, we present an effective framework for performance prediction based on the LLVM compiler technique. Using this framework, we can predict the performance of a parallel program from a small number of nodes of the target parallel system, without executing the program on the corresponding full-scale parallel system. This framework predicts the performance of the computation and communication components separately and combines the two predictions to obtain the full-program prediction. For sequential computation, we first combine static branch probability and loop trip count identification and propose a new instrumentation method to acquire the number of instructions of each type. We then construct a test program to measure the average execution time of each instruction type. Finally, we use a pruning technique to convert a parallel program into a corresponding sequential program, so that its performance can be predicted on only one node of the target parallel system. For communication, we use the LogGP model to model point-to-point communication and an artificial neural network to model collective communication. We validate our approach with a set of experiments that predict the performance of the NAS Parallel Benchmarks and the CGPOP parallel application. Experimental results show that the proposed framework can accurately predict the execution time of parallel programs, with an average error rate of 10.86%.
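As a rough illustration of how the computation and communication predictions combine, the sketch below multiplies per-type instruction counts by measured average times and adds the textbook LogGP cost of each point-to-point message. It is not the paper's implementation: the names `LogGP`, `compute_time`, `p2p_time`, and `predict` are hypothetical, the inter-message gap g is ignored, and collective communication (modeled with a neural network in the paper) is left out.

```python
from dataclasses import dataclass


@dataclass
class LogGP:
    """Simplified LogGP parameters (hypothetical container, for illustration)."""
    L: float  # network latency, in seconds
    o: float  # CPU overhead per message on the sender or the receiver, in seconds
    G: float  # gap per byte for long messages, in seconds per byte


def compute_time(instr_counts, avg_instr_time):
    """Sequential time: sum over instruction types of count * average time."""
    return sum(count * avg_instr_time[kind] for kind, count in instr_counts.items())


def p2p_time(model, nbytes):
    """Standard LogGP cost of one k-byte message: o + (k - 1) * G + L + o."""
    return model.o + (nbytes - 1) * model.G + model.L + model.o


def predict(instr_counts, avg_instr_time, model, message_sizes):
    """Full prediction = computation time + summed point-to-point message time."""
    comm = sum(p2p_time(model, k) for k in message_sizes)
    return compute_time(instr_counts, avg_instr_time) + comm
```

For example, `predict({"add": 4, "mul": 2}, {"add": 0.5, "mul": 1.0}, LogGP(L=2.0, o=0.5, G=0.25), [5, 1])` adds 4.0 time units of computation to 7.0 units of communication.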
|[Congrats] In 2016, our paper "Trustworthy Enhancement for Cloud Proxy based on Autonomic Computing" has been accepted by IEEE Transactions on Cloud Computing|
Cloud proxy platforms are used to improve Internet content access capacity and visiting performance in network export environments. Because cloud proxy systems are complex, guaranteeing the trustworthiness of the cloud system is a difficult problem. The self-management of autonomic computing can enhance the trustworthiness of the cloud system and avoid the security and reliability problems of system management that complex construction brings. Based on the idea of self-supervision, this paper proposes a mechanism to enhance the security of the cloud system. First, a trustworthy autonomous enhancement framework for virtual machines is proposed. Second, a method based on the ARX model is put forward to extract linear relationships among monitoring items in the virtual machine. Using the mapping between monitoring items and system modules, an abnormal-module positioning technique based on a Naive Bayes classifier is developed to realize self-sensing of abnormal system conditions. Finally, security threats to virtual machines, including malicious dialogue and buffer memory of hot attacks, are tested through experiments. Results show that the proposed mechanism, based on autonomic computing, effectively enhances the trustworthiness of virtual machines and provides effective safety protection for the cloud system.
|[Congrats] In 2016, our paper "Android platform-based individual privacy information protection system" has been accepted by Personal and Ubiquitous Computing (SCI Impact factor 1.498)|
With the popularity of Android mobile phones, the protection of individual privacy information on the Android platform has received increasing attention. To address the privacy problem that arises after a mobile phone is lost, this paper uses SMS for remote control of the phone, provides a comprehensive method of individual information protection for users, and implements a mobile terminal system with self-protection characteristics. The system needs no server support and protects individual information through the most basic SMS function, which is one of its innovations. The redundancy process protection mechanism, trusted number mechanism, and SIM card detection mechanism are further innovations of this system. Functional and performance tests show that the system satisfies users' functional and non-functional requirements, with stable operation and high task execution efficiency.
|[Congrats] In 2016, our paper "Network-aware Virtual Machine Migration in an Overcommitted Cloud" has been accepted by Future Generation Computer Systems (SCI Impact factor 2.78)|
Virtualization, which acts as the underlying technology for cloud computing, enables large amounts of third-party applications to be packed into virtual machines (VMs). VM migration enables servers to be reconsolidated or reshuffled to reduce the operational costs of data centers. The network traffic costs for VM migration currently attract limited attention.
However, traffic and bandwidth demands among VMs in a data center account for considerable total traffic. VM migration also causes additional data transfer overhead, which would also increase the network cost of the data center.
This study considers a network-aware VM migration (NetVMM) problem in an overcommitted cloud and formulates it as an NP-complete problem. It aims to minimize network traffic costs by considering the inherent dependencies among the VMs that make up a multi-tier application and the underlying topology of physical machines, and to ensure a good trade-off between network communication and VM migration costs.
Swarm intelligence algorithms search for an approximately optimal solution through repeated iterations, which makes them a good fit for the VM migration problem. In this study, the genetic algorithm (GA) and artificial bee colony (ABC) are adapted to the VM migration problem to minimize the network cost. Experimental results show that GA achieves low network costs when the number of VM instances is small. However, as the problem size increases, ABC outperforms GA. The running time of ABC is also nearly half that of GA. To the best of our knowledge, we are the first to use ABC to solve the NetVMM problem.
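The trade-off between communication and migration costs can be pictured with a small objective function of the kind a GA or ABC search would evaluate for each candidate placement. This is only a sketch under stated assumptions: the abstract does not give the paper's cost model, so the traffic-times-distance form, the size-times-distance migration term, the `weight` parameter, and all function names here are hypothetical.

```python
def network_cost(traffic, distance, placement):
    """Communication cost: traffic rate of each VM pair times host-to-host distance."""
    return sum(
        rate * distance[placement[a]][placement[b]]
        for (a, b), rate in traffic.items()
    )


def migration_cost(vm_size, distance, old, new):
    """Migration cost: size of each moved VM times the distance it travels."""
    return sum(
        vm_size[vm] * distance[old[vm]][new[vm]]
        for vm in new
        if old[vm] != new[vm]
    )


def total_cost(traffic, vm_size, distance, old, new, weight=1.0):
    """Assumed objective: communication cost after migration plus weighted migration cost."""
    return network_cost(traffic, distance, new) + weight * migration_cost(
        vm_size, distance, old, new
    )
```

With two hosts at distance 2, a `web` VM of size 4 and a `db` VM exchanging traffic at rate 10, leaving the VMs on separate hosts costs 20, while moving `web` next to `db` costs 8 under these assumptions, so the search would prefer the migration.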
Weizhe (James) Zhang (张伟哲)
Professor, Ph.D. Supervisor (教授 博导)
|Professor (2013 - )||School of Computer Science and Technology, Harbin Institute of Technology, China|
|Visiting Professor (2013-2014)||With Prof. Marc Snir, Department of Computer Science, UIUC, USA|
|Ph.D. Supervisor (2012 -)||School of Computer Science and Technology, Harbin Institute of Technology, China|
|Associate Professor (2007 - 2012)||School of Computer Science and Technology, Harbin Institute of Technology, China|
|Post-Doctoral (2007-2010)||School of Electronics and Information Engineering, Harbin Institute of Technology, China|
|Visiting Scholar (2005-2006)||Department of Computer Science, University of Houston, USA|
|Lecturer (2003-2007)||School of Computer Science and Technology, Harbin Institute of Technology, China|
|Ph.D. (2001-2006)||Computer Science, Harbin Institute of Technology (HIT), China|
|M.S. (1999-2001)||Computer Science, Harbin Institute of Technology (HIT), China|
|B.Sc. (1995-1999)||Computer Science, Harbin Institute of Technology (HIT), China|
|E-mail:||wzzhang AT hit DOT edu DOT cn|
|Office:||Room 708, Zonghe Building, Harbin Institute of Technology, Harbin, Heilongjiang, China.|
|Address:||P.O.Box 320, No.92 West Dazhi Street, Nangang District, Harbin Institute of Technology, Harbin, Heilongjiang, China. 150001|
|[Congratulations!] In 2015, our paper "Automatic Memory Control of Multiple Virtual Machines on a Consolidated Server" has been accepted by IEEE Transactions on Cloud Computing.|
Through virtualization, multiple virtual machines can coexist and operate on one physical machine. When virtual machines (VMs) compete for memory, the performance of applications deteriorates, especially that of memory-intensive applications. In this study, we aim to optimize memory control techniques using a balloon driver for server consolidation. Our contribution is three-fold: (1) We design and implement an automatic control system for memory based on a Xen balloon driver. To avoid interference with VM monitor operation, our system works in user mode; therefore, the system is easily applied in practice. (2) We design an adaptive global-scheduling algorithm to regulate memory. This algorithm is based on a dynamic baseline, which can adjust memory allocation according to the memory used by the VMs. (3) We evaluate our optimized solution in a real environment with 10 VMs and well-known benchmarks (DaCapo and Phoronix Test Suites). Experiments confirm that our system can improve the performance of memory-intensive and disk-intensive applications by up to 500% and 300%, respectively. This toolkit has been released for free download under the GNU General Public License v3.
|[Congratulations!] In 2015, our paper "DwarfCode: A Performance Prediction Tool for Parallel Applications" has been accepted by IEEE Transactions on Computers.|
We present DwarfCode, a performance prediction tool for MPI applications on diverse computing platforms. The goal is to accurately predict the running time of applications for task scheduling and job migration. First, DwarfCode collects the execution traces to record the computing and communication events. Then, it merges the traces from different processes into a single trace. After that, DwarfCode identifies and compresses the repeating patterns in the final trace to reduce the number of events. Finally, a dwarf code is generated to mimic the original program behavior. This smaller benchmark is replayed on the target platform to predict the performance of the original application. To generate such a benchmark, the two major challenges are reducing the time complexity of the trace merging and repeat compression algorithms. We propose an O(mpn) trace merging algorithm to combine the traces generated by separate MPI processes, where m denotes the upper bound of tracing distance, p denotes the number of processes, and n denotes the maximum number of events over all the traces. More importantly, we put forward a novel repeat compression algorithm whose time complexity is O(n log n). Experimental results show that DwarfCode can accurately predict the running time of MPI applications. The error rate is below 10 percent for compute- and communication-intensive applications. This toolkit has been released for free download under the GNU General Public License v3.
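To make the idea of repeat compression concrete, the sketch below collapses consecutive repeated blocks of events in a merged trace into (block, count) pairs. It is a naive greedy stand-in, not DwarfCode's O(n log n) algorithm: the function names, the `max_period` bound, and the event labels are assumptions made for illustration.

```python
def compress_repeats(trace, max_period=8):
    """Greedily replace consecutive repeats of a block with (block, count) pairs."""
    out = []
    i, n = 0, len(trace)
    while i < n:
        best_period, best_count = 1, 1
        # Try each block length and keep the one covering the most events.
        for period in range(1, min(max_period, (n - i) // 2) + 1):
            block = trace[i:i + period]
            count = 1
            while trace[i + count * period:i + (count + 1) * period] == block:
                count += 1
            if count > 1 and count * period > best_count * best_period:
                best_period, best_count = period, count
        out.append((tuple(trace[i:i + best_period]), best_count))
        i += best_period * best_count
    return out


def expand(compressed):
    """Inverse of compress_repeats: unroll every (block, count) pair."""
    return [event for block, count in compressed for _ in range(count) for event in block]
```

For the trace `["send", "recv", "send", "recv", "send", "recv", "barrier"]`, this yields `[(("send", "recv"), 3), (("barrier",), 1)]`, and `expand` recovers the original sequence.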
|[Congratulations!] In 2015, our paper "Solving Energy-Aware Real-Time Tasks Scheduling Problem with Shuffled Frog Leaping Algorithm on Heterogeneous Platforms" has been accepted by Sensors, whose impact factor is 2.245|
Reducing energy consumption is becoming very important for preserving battery life and lowering overall operational costs in heterogeneous real-time multiprocessor systems. In this paper, we first formulate energy-aware real-time task scheduling as a combinatorial optimization problem. Then, a successful meta-heuristic, the Shuffled Frog Leaping Algorithm (SFLA), is proposed to reduce the energy consumption. Precocity remission and local optimal avoidance techniques are proposed to avoid precocity and improve the solution quality. Convergence acceleration significantly reduces the search time. Experimental results show that the SFLA-based energy-aware meta-heuristic uses 30% less energy than the Ant Colony Optimization (ACO) algorithm and 60% less energy than the Genetic Algorithm (GA). Remarkably, the running time of the SFLA-based meta-heuristic is 20 and 200 times shorter than that of ACO and GA, respectively, for finding the optimal solution.