I am a static analysis enthusiast at SonarSource, one of the developers of the analysis engine for the various editions of C++.
I am most interested in programming languages, code analysis and transformation, computer systems, and formal methods.
Experience
I participate in static-analysis development for C, C++, and Objective-C languages using the Clang AST and LLVM IR, as well as Clang Static Analyzer.
I lead the Vigor project. It enables regular developers of software network functions to apply formal verification to their applications. A core team of 6 people (doctoral students, professors and a postdoc) contributed to the project on a permanent basis, and more than 10 MS and BS students joined for semester or summer internships.
As part of the job, I coached student teams (4-7 people per team) as they learned to apply scrum methodology in their semester project - android application development. My advisors:
- Prof. Katerina Argyraki
- Prof. George Candea
Added event tracking for network routing calculation. For the new Google datacenter network, in order to render performance more transparent in a distributed control plane, I implemented a logging and aggregate analytics system.
Development of an AOT of the ECMAScript 5.0 (JavaScript) language, for the Samsung Smart TV OS.
- Development of a performance-optimizing binary translator from x86 to a new architecture with fine-grained concurrency.
- Development of a functional simulator for a new architecture with fine-grained concurrency.
Participated in the development of a base model of prosthesis limbs.
Publications
For more details on the publications I invite you to visit my Google Scholar page.
We present the design and implementation of Vigor, a software stack and toolchain for building and running software network middleboxes that are guaranteed to be correct, while preserving competitive performance and developer productivity.
Developers write the core of the middlebox - the network function (NF) - in C, on top of a standard packet-processing framework, putting persistent state in data structures from Vigor’s library; the Vigor toolchain then automatically verifies that the resulting software stack correctly implements a specification, which is written in Python.
Vigor has three key features: network function developers need no verification expertise, and the verification process does not require their assistance (push-button verification); the entire software stack is verified, down to the hardware (full-stack verification); and verification can be done in a pay-as-you-go manner, i.e., instead of investing upfront a lot of time in writing and verifying a complete specification, one can specify one-off properties in a few lines of Python and verify them without concern for the rest.
We developed five representative NFs - a NAT, a Maglev load balancer, a MAC-learning bridge, a firewall, and a traffic policer - and verified with Vigor that they satisfy standards-derived specifications, are memory-safe, and do not crash or hang. We show that they provide competitive performance.
Software network functions (NFs), or milddleboxes, promise flexibility and easy deployment of network services, but face the serious challenge of unexpected performance behaviour. We propose the notion of a performance contract, a construct formulated in terms of performance critical variables, that provides a precise description of NF performance. Performance contracts enable fine-grained prediction and scrutiny of NF performance for arbitrary workloads, without having to run the NF itself.
We describe Bolt, a technique and tool for computing such performance contracts for the entire software stack of NFs written in C, including the core NF logic, DPDK packet processing framework, and NIC driver. Bolt takes as input the NF implementation code and outputs the corresponding contract. Under the covers, it combines pre-analysis of a library of stateful NF data structures with automated symbolic execution of the NF’s code. We evaluate Bolt on four NFs - a Maglev-like load balancer, a NAT, an LPM router, and a MAC bridge - and show that its performance contracts predict the dynamic instruction count and memory access count with a maximum gap of 7% between the real execution and the conservatively predicted upper bound. With further engineering, this gap can drop to 0.
Software network functions promise to simplify the deployment of network services and reduce network operation cost. However, they face the challenge of unpredictable performance. Given this performance variability, it is imperative that during deployment, network operators consider the performance of the NF not only for typical but also adversarial workloads.
We contribute a tool that helps solve this challenge: it takes as input the LLVM code of a network function and outputs packet sequences that trigger slow execution paths. Under the covers, it combines directed symbolic execution with a sophisticated cache model to look for execution paths that incur many CPU cycles and involve adversarial memory-access patterns. We used our tool on 11 network functions that implement a variety of data structures and discovered workloads that can in some cases triple latency and cut throughput by 19% relative to typical testing workloads.
Prior work proved a stateful NAT network function to be semantically correct, crash-free, and memory safe [29]. Their toolchain verifies the network function code while assuming the underlying kernel-bypass framework, drivers, operating system, and hardware to be correct. We extend the toolchain to verify the kernel-bypass framework and a NIC driver in the context of the NAT. We uncover bugs in both the framework and the driver.
We present a Network Address Translator (NAT) written in C and proven to be semantically correct according to RFC 3022, as well as crash-free and memory-safe. There exists a lot of recent work on network verification, but it mostly assumes models of network functions and proves properties specific to network configuration, such as reachability and absence of loops.
Our proof applies directly to the C code of a network function, and it demonstrates the absence of implementation bugs. Prior work argued that this is not feasible (i.e., that verifying a real, stateful network function written in C does not scale) but we demonstrate otherwise: NAT is one of the most popular network functions and maintains per-flow state that needs to be properly updated and expired, which is a typical source of verification challenges. We tackle the scalability challenge with a new combination of symbolic execution and proof checking using separation logic; this combination matches well the typical structure of a network function. We then demonstrate that formally proven correctness in this case does not come at the cost of performance
Education
I made formal software verification practical in the domain of network functions (used in middleboxes).
I specialized in software engineering and PL systems at the Department of Radioelectronics and Cybernetics. During the studies, I lectured and lead practical groups in compiler construction class for junior students.
Graduation with honors.
Graduation with honors, top 1% of the class. GPA 5.0 out of 5
Selected Projects
Here are some of my side and study projects. Visit my GitHub page for more.
An automatic formal verification framework for software network functions
Started as a second-year class project, a 2d isometric arcade game based on SDL. It is based on a home-grown 2d physics engine with developed collision detection algorithm, fairly solid body model for the cars and non-isotropic damage model. The game supports scripting in ECL, including bots, and multiplayer on the same-keyboard and across the network.
As a side project during my school years, I enjoyed playing with 3d graphic rendering using open GL, and simulating solid body dynamics based on the first principles. The result was a tank-duel game with customizable terrain realistic physics with a home-grown game engine.
Skills & Proficiency
C
OCaml
Scrum
C++
Java/Android
Emacs
Vi
Biography
Dr. Arseniy Zaostrovnykh is a static analysis enthusiast working at SonarSource on improving the quality and security of program code. Arseniy Zaostrovnykh graduated from (2020) Network Architecture Laboratory and Dependable System Laboratory at the École polytechnique fédérale de Lausanne (EPFL), working under the supervision of Prof. Katerina Argyraki and Prof. George Candea. In his doctoral thesis, Arseniy proposed bringing formal software verification within the reach of regular software engineers in the domain of software network functions. Arseniy earned his MSc (2014) and BSc (2012) at Moscow Institute of Physics and Technology. He has interned at Intel and Google, and worked as a software engineer at the Samsung Research Institute (SRI) and Intel and as a junior researcher at the Institute of Electronic Control Computers in Moscow.