Second PKU Seminar on Programming Languages

Time: 16:00-18:00, September 10th, 2019

Place: Rm. 1501, Science Building 1, Peking University

Talk 1: Binary semantics extraction from natural language specifications

Speaker: Mizuhito Ogawa (Japan Advanced Institute of Science and Technology)

Abstract: Symbolic execution is an old, powerful, and popular technique to analyze and/or verify software. It has been developed mostly for high-level programming languages, such as Java and C. Recently, binary symbolic execution tools gradually increase, e.g., McVeto, Miasm, Mayhem, Klee-MC, CoDisasm, BE-PUM , and ANGR, most of which appear around 2015 and target on x86. Dynamic binary symbolic execution is useful to understand the control behaviour of PC malware under the presence of obfuscation techniques. Adding to PC malware, the recent growth of IoT malware demands symbolic execution tools on various instruction sets, e.g., ARM, MIPS, Sparc, Power PC, PIC, AVR with their large number of variations. Contrary to a general impression on the intricacy of binaries, the good news is:

- IoT malware is mostly an user-mode sequential program without floating point arithmetic. Avoiding multi-threads, weak memory models, and floating point arithmetic allows us to consider a simple semantics framework as transitions on the environment made by memory, stack, registers, and flags.

- Each instruction set mostly has a rigid natural language specification.

- Each instruction set often has the debugger and emulation environments, and the ambiguity in natural language processing can be resolved by testing.

Supported by these observations, we overview a general automated flow of the binary semantics extraction by applying natural language processing techniques, We also show its application to generating dynamic symbolic execution tools, e.g., BE-PUM on x86, Corana on ARM, and SyMIPS on MIPS. These tools are available at GitHub and a part of the talk will appear soon at FM 2019.

Bio: Mizuhito Ogawa (JAIST), Dr. of Science Mizuhito Ogawa graduated the master course of University of Tokyo, majoring Mathematics. He worked NTT laboratories on functional programs, dataflow machine, and dataflow analysis till 2001. Then, he was JST PRESTO fellow until 2003, and stayed at University of Tokyo as a visiting researcher. After moving to JAIST on 2003, he has been a professor in Schools of Information Science. His research interest covers from theory to practical tool implementation. Examples in theoretical issues are Well-Quasi-Ordering and the regularity, and the decidability of the reachability of infinite state transition systems (e.g., timed and/or concurrent systems). Tool implementations with students are (1) SMT solver, raSAT, on polynomial constraints over reals, which was ranked second in QFNRA category of SMT-COMP from 2016 to 2018, and (2) the dynamic binary symbolic execution tools, BE-PUM for x86, Corana for ARM, and SyMIPS for MIPS. They are mostly generated based on semi-automatic binary semantics extraction from natural language specifications.

Talk 2: Interpolation over Nonlinear Arithmetic -- Towards Program Reasoning and Verification

Speaker: Mingshuai Chen (Institute of Software, Chinese Academy of Sciences)

Abstract: Interpolation-based technique provides a powerful mechanism for local and modular reasoning of programs, thereby improving scalability of various verification techniques, e.g., theorem proving, model checking and abstract interpretation, to name just a few. The underlying synthesis problem over nonlinear arithmetic, however, remains challenging and existing methods have limitations on the form of formulae to be interpolated. In this talk, we present our efforts over the recent years in synthesizing nonlinear Craig interpolants, with highlights on a counterexample-guided learning method, named NIL, as a unified framework tackling the interpolation problem for the general quantifier-free theory of nonlinear arithmetic, possibly involving transcendental functions. We show the soundness of NIL and propose sufficient conditions under which NIL is guaranteed to converge and to be complete. The applicability and effectiveness of our technique are demonstrated experimentally on a collection of representative benchmarks from the literature, where in particular, NIL suffices to address more interpolation tasks, including those with perturbations in parameters, and in many cases synthesizes simpler interpolants compared with existing approaches.

Bio.: Mingshuai Chen ( will join, as a postdoctoral researcher, the Chair for Software Modeling and Verification, Department of Computer Science, RWTH Aachen University, Aachen, Germany. He received the Ph.D. degree in computer science from the Institute of Software, Chinese Academy of Sciences, Beijing, China in 2019, and B.Sc. in computer science from the School of Computer Science and Technology, Jilin University, Changchun, China in 2013. His main research interests include modelling, verification and synthesis of hybrid systems, program reasoning, and time-delayed systems. He was the awardee of the Distinguished Paper Award at ATVA 2018 and the CAS-President Special Award in 2019.

Talk 3: Composing Optimization Techniques for Vertex-Centric Graph Processing via Communication Channels

Speaker: Yongzhe Zhang (SOKENDAI, National Institute of Informatics)

Abstract: Pregels vertex-centric model allows us to implement many interesting graph algorithms, where optimization plays an important role in making it practically useful. Although many optimizations have been developed for dealing with different performance issues, it is hard to compose them together to optimize complex algorithms, where we have to deal with multiple performance issues at the same time.

To address this problem, we propose a new approach to composing optimizations, by making use of the channel interface, as a replacement of Pregels message passing and aggregator mechanism, which can better structure the communication in Pregel algorithms. We demonstrate that this system is convenient to optimize a Pregel program by simply using a proper channel from the channel library or composing them to deal with multiple performance issues. By adopting the channel interface, this system achieves an all-around performance gain for various graph algorithms. In particular, the composition of different optimizations makes the S-V algorithm 3.39x faster than the current best implementation.

Bio: Yongzhe Zhang is a fifth-year Ph.D. student at the Department of Informatics at SOKENDAI (The Graduate University for Advanced Studies). His research interest is in parallel and distributed computing, large-scale graph processing domain-specific languages and graph algorithms.