Prof. Hwang Sung Jae's Research Lab Publishes a Paper in ESEC/FSE 2023 (2023-05-30)
Prof. Sungjae Hwang's Lab (Software Security Lab, SoftSec@SKKU): Paper Accepted to ESEC/FSE 2023

A paper written by the Software Security Lab (advisor: Sungjae Hwang) has been accepted to FSE 2023 (30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering), a top-tier international conference in software engineering. The paper, “EtherDiffer: Differential Testing on RPC Services of Ethereum Nodes,” will be presented in San Francisco, USA, in December 2023.

[Paper Information]
- EtherDiffer: Differential Testing on RPC Services of Ethereum Nodes
- Shinhae Kim and Sungjae Hwang
- 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023)

[Abstract]
Blockchain is a distributed ledger that records transactions among users on top of a peer-to-peer network. While various blockchain platforms exist, Ethereum is the most popular general-purpose platform and its support of smart contracts led to a new form of applications called decentralized applications (DApps). A typical DApp has an off-chain frontend and on-chain backend architecture, and the frontend often needs interactions with the backend network, e.g., to acquire chain data or make transactions. Therefore, Ethereum nodes implement the official RPC specification and expose a uniform set of RPC methods to the frontend. However, the specification is not sufficient in two points: (1) lack of clarification for non-deterministic event handling, and (2) lack of specification for invalid-as-themselves arguments. To effectively disclose any deviations caused by the insufficiency, this paper introduces EtherDiffer, which automatically performs differential testing on four major node implementations in terms of their RPC services. EtherDiffer detected 48 different classes of deviations including 11 implementation bugs such as crash and denial-of-service bugs. We reported 44 of the detected classes to the specification and node developers and received acknowledgements as well as bug patches.
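As a rough illustration of the differential-testing idea mentioned in the abstract, the sketch below sends the same RPC request to several node implementations and flags requests whose outcomes disagree. It is not EtherDiffer's implementation: the `Node` callable type, `call_safely`, `find_deviations`, and the two toy nodes `node_a` and `node_b` are hypothetical names introduced here, and real nodes would be reached over JSON-RPC rather than called as in-process functions.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-in for a node's RPC endpoint: (method, params) -> result.
Node = Callable[[str, list], Any]


def call_safely(node: Node, method: str, params: list) -> Tuple[str, Any]:
    """Invoke one node and normalise failures so outcomes can be compared."""
    try:
        return ("ok", node(method, params))
    except Exception as exc:  # an error is itself an observable behaviour
        return ("error", type(exc).__name__)


def find_deviations(nodes: Dict[str, Node], requests: List[tuple]) -> List[dict]:
    """Send every request to every node; report requests whose outcomes differ."""
    deviations = []
    for method, params in requests:
        groups = defaultdict(list)  # outcome -> names of nodes that produced it
        for name, node in nodes.items():
            groups[repr(call_safely(node, method, params))].append(name)
        if len(groups) > 1:  # at least two nodes disagree
            deviations.append(
                {"method": method, "params": params, "groups": dict(groups)}
            )
    return deviations


# Two toy nodes (assumptions for illustration only).
def node_a(method: str, params: list) -> str:
    if method == "eth_getBalance" and params[0] == "0xzz":
        raise ValueError("invalid address")  # rejects the malformed argument
    return "0x0"


def node_b(method: str, params: list) -> str:
    return "0x0"  # silently accepts the malformed argument


if __name__ == "__main__":
    requests = [
        ("eth_getBalance", ["0x" + "00" * 20, "latest"]),  # well-formed
        ("eth_getBalance", ["0xzz", "latest"]),            # malformed address
    ]
    for deviation in find_deviations({"node_a": node_a, "node_b": node_b}, requests):
        print(deviation)
```

In this toy setup only the malformed request is reported, because one node raises an error while the other returns a value. Underspecified handling of invalid arguments is one of the two gaps the abstract describes.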
A Paper from the Computer Graphics Lab (CGLab) Is Accepted to ACM SIGGRAPH 2023 (2023-05-25)
A paper from the Computer Graphics Lab (CGLab; advisor: Sungkil Lee; first author: Janghun Kim), entitled "Potentially Visible Hidden-Volume Rendering for Multi-View Warping," has been accepted to ACM SIGGRAPH 2023. The paper will be presented in Los Angeles, USA, in August 2023. ACM SIGGRAPH is the most prestigious conference in the field of computer graphics. The paper was selected as a journal-track paper and will be published in the special issue of ACM Transactions on Graphics, Volume 42, No. 4.

Abstract
--------
This paper presents the model and rendering algorithm of Potentially Visible Hidden Volumes (PVHVs) for multi-view image warping. PVHVs are 3D volumes that are occluded at a known source view, but potentially visible at novel views. Given a bound of novel views, we define PVHVs using the edges of foreground fragments from the known view and the bound of novel views. PVHVs can be used to batch-test the visibilities of source fragments without iterating individual novel views in multi-fragment rendering, and thereby, cull redundant fragments prior to warping. We realize the model of PVHVs in Depth Peeling (DP). Our Effective Depth Peeling (EDP) can reduce the number of completely hidden fragments, capture important fragments early, and reduce warping cost. We demonstrate the benefit of our PVHVs and EDP in terms of memory, quality, and performance in multi-view warping.
Prof. Park HoGun's Research Lab Publishes a Paper in SIGKDD 2023 (2023-05-23)
Exploiting Relation-aware Attribute Representation Learning in Knowledge Graph Embedding for Numerical Reasoning
Sookyung Kim+, Gayoung Kim+, Ko Keun Kim, Suchan Park, Heesoo Jung, Hogun Park*
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2023. Full Paper (Research Track).
[+ indicates equal contribution.]

[Abstract]
Numerical reasoning is one of the essential tasks to support machine learning applications such as recommendation and information retrieval. The reasoning task aims to compare two items and infer the new facts (e.g., is taller than) by leveraging existing relational information and numerical attributes (e.g., the height of an entity) in knowledge graphs. However, most existing methods are limited to introducing new attribute encoders or additional losses to predict the numeric values and are not robust when numerical attributes are sparsely available. In this paper, we propose a novel graph embedding method named RAKGE, which enhances numerical reasoning on knowledge graphs. The proposed method includes relation-aware attribute representation learning, which can leverage the association between relations and their corresponding numerical attributes. Additionally, we introduce a robust self-supervised learning method to generate unseen positive and negative examples, thereby making our approach more reliable when numerical attributes are sparse. Evaluated on three real-world datasets, our proposed model outperforms state-of-the-art methods, achieving an improvement of up to 65.1% in Hits@1 and up to 52.6% in MRR compared to the best competitor.
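For readers unfamiliar with the two metrics quoted in the abstract, the following minimal sketch shows how Hits@k and MRR (mean reciprocal rank) are conventionally computed from the rank of the correct answer for each test query. The function names and the sample ranks are illustrative assumptions and are not taken from the paper or its evaluation code.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int = 1) -> float:
    """Fraction of queries whose correct answer is ranked in the top k (1-based ranks)."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank of the correct answer over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    # Made-up ranks of the correct entity for four test queries.
    ranks = [1, 2, 4, 1]
    print(hits_at_k(ranks, k=1))        # 0.5    (two of four queries ranked first)
    print(mean_reciprocal_rank(ranks))  # 0.6875 ((1 + 0.5 + 0.25 + 1) / 4)
```

Hits@1 rewards only exact top-ranked answers, while MRR also gives partial credit to answers ranked near the top, which is why results are commonly reported with both.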
Prof. Lee JeeHyong's Research Lab Publishes Three Papers in ACL 2023 (2023-05-23)
Paper #1: “DIP: Dead code Insertion based Black-box Attack for Programming Language Model”, ACL 2023 (CheolWon Na (나철원), integrated M.S./Ph.D. program, Department of Artificial Intelligence; YunSeok Choi (최윤석), Ph.D. program, Department of Software)
Paper #2: “BLOCSUM: Block Scope-based Source Code Summarization via Shared Block Representation”, Findings of ACL 2023 (YunSeok Choi (최윤석), Ph.D. program, Department of Software; Hyojun Kim (김효준), M.S. program, Department of Artificial Intelligence)
Paper #3: “CodePrompt: Task-Agnostic Prefix Tuning for Program and Language Generation”, Findings of ACL 2023 (YunSeok Choi (최윤석), Ph.D. program, Department of Software)

(Paper #1) [Abstract]
Automatic processing of source code, such as code clone detection and software vulnerability detection, is very helpful to software engineers. Large pre-trained Programming Language (PL) models (such as CodeBERT, GraphCodeBERT, and CodeT5) show very powerful performance on these tasks. However, these PL models are vulnerable to adversarial examples that are generated with slight perturbation. Unlike natural language, an adversarial example of code must be semantic-preserving and compilable. Due to these requirements, it is hard to directly apply the existing attack methods for natural language models. In this paper, we propose DIP (Dead code Insertion based Black-box Attack for Programming Language Model), a high-performance and efficient black-box attack method to generate adversarial examples using dead code insertion. We evaluate our proposed method on 9 victim downstream-task large code models. Our method outperforms the state-of-the-art black-box attack in both attack efficiency and attack quality, while the generated adversarial examples compile and preserve semantic functionality.

(Paper #2) [Abstract]
Code summarization, which aims to automatically generate natural language descriptions from source code, has become an essential task in software development for better program understanding. Abstract Syntax Tree (AST), which represents the syntax structure of the source code, is helpful when utilized together with the sequence of code tokens to improve the quality of code summaries. Recent works on code summarization attempted to capture the sequential and structural information of the source code, but they paid less attention to the property that source code consists of multiple code blocks. In this paper, we propose BLOCSUM, BLOck scope-based source Code SUMmarization via shared block representation, which utilizes block-scope information by representing various structures of the code block. We propose a shared block position embedding to effectively represent the structure of code blocks and merge both code and AST. Furthermore, we develop variant ASTs to learn rich information such as block and global dependencies of the source code. To validate our approach, we perform experiments on two real-world datasets, the Java dataset and the Python dataset. We demonstrate the effectiveness of BLOCSUM through various experiments, including ablation studies and a human evaluation.

(Paper #3) [Abstract]
In order to solve the inefficient parameter update and storage issues of fine-tuning in Natural Language Generation (NLG) tasks, prompt-tuning methods have emerged as lightweight alternatives. Furthermore, efforts to reduce the gap between pre-training and fine-tuning have shown successful results in low-resource settings. As large Pre-trained Language Models (PLMs) for Program and Language Generation (PLG) tasks are constantly being developed, prompt tuning methods are necessary for these tasks. However, because the gap between pre-training and fine-tuning differs from that of PLMs for natural language, a prompt tuning method that reflects the traits of PLMs for programming languages is needed. In this paper, we propose a Task-Agnostic prompt tuning method for the PLG tasks, CodePrompt, that combines Input-Dependent Prompt Template (to bridge the gap between pre-training and fine-tuning of PLMs for program and language) and Corpus-Specific Prefix Tuning (to efficiently update the parameters of PLMs for program and language). Also, we propose a method to provide richer prefix word information for limited prefix lengths. We show that our method is effective in three PLG tasks, not only in the full-data setting, but also in the low-resource and cross-domain settings.