Feng Yao 姚烽
PhD student @ Northeastern University

I am a Ph.D. student in Computer Science at Northeastern University (China), supervised by Prof. Yanfeng Zhang, and a member of the iDC-NEU research group.

I’m interested in building distributed and parallel graph processing systems and in heterogeneous graph data management. I am also interested in vector databases.


Education
  • Northeastern University
    Ph.D. Student
    Sep. 2021 - present
  • Northeastern University
    M.S. in Computer Science
    Sep. 2018 - Jul. 2021
  • Changchun University of Science and Technology
    B.S. in Computer Science
    Sep. 2014 - Jul. 2018
Experience
  • Tongyi Lab
    Research Intern
    Mar. 2022 - Mar. 2024
Honors & Awards
  • Southern Manganese Industries Scholarship
    2024
  • Huawei Scholarship
    2023
Selected Publications
GastCoCo: Graph Storage and Coroutine-Based Prefetch Co-Design for Dynamic Graph Processing

Hongfu Li, Qian Tao, Song Yu, Shufeng Gong, Yanfeng Zhang, Feng Yao, Wenyuan Yu, Ge Yu, Jingren Zhou

Proceedings of the International Conference on Very Large Data Bases (VLDB) 2025

Existing disaggregated databases simply couple concurrency control (CC) with either the execution layer or the storage layer, which limits the performance and elasticity of these systems. This paper proposes Concurrency Control as a Service (CCaaS), which decouples CC from databases, building an execution-CC-storage three-layer decoupled database that allows independent scaling and upgrades for improved elasticity, resource utilization, and development agility.

RAGraph: A Region-Aware Framework for Geo-Distributed Graph Processing

Feng Yao, Qian Tao, Wenyuan Yu, Yanfeng Zhang, Shufeng Gong, Qiange Wang, Ge Yu, Jingren Zhou

Proceedings of the VLDB Endowment (PVLDB) 2024

In this paper, we propose RAGraph, a Region-Aware framework for geo-distributed graph processing. At the core of RAGraph, we design a region-aware graph processing framework that allows advancing inefficient global updates locally and enables sensible coordination-free message interactions. RAGraph also contains an adaptive hierarchical message interaction engine to switch interaction modes adaptively based on network heterogeneity and fluctuation, and a discrepancy-aware message filtering strategy to filter important messages.

Towards Efficient Graph Processing in Geo-Distributed Data Centers

Feng Yao, Qian Tao, Shengyuan Lin, Yanfeng Zhang, Wenyuan Yu, Shufeng Gong, Qiange Wang, Ge Yu, Jingren Zhou

IEEE Transactions on Parallel and Distributed Systems (TPDS) 2024

This work investigates the problem of data placement for graph processing in geo-distributed data centers. The key idea is to migrate boundary vertices with relatively low contributions to algorithm convergence, thereby enabling the relocated boundary vertices to generate and propagate more influential messages and improving the utilization of scarce network resources. Specifically: (1) we introduce a vertex contribution metric to quantify a vertex’s ability to generate and propagate influential messages, which reflects its contribution to algorithm convergence; (2) we propose a contribution-driven boundary migration algorithm that incorporates both contribution metrics and network heterogeneity, enabling the efficient identification and migration of high-contribution vertices near boundaries; (3) experimental results demonstrate that our algorithm achieves a 1.23× to 2.7× performance improvement and reduces WAN costs by 14.7% to 49.4% in geo-distributed graph processing systems.

Fast Iterative Graph Computing with Updated Neighbor States

Yijie Zhou, Shufeng Gong, Feng Yao, Hanzhang Chen, Song Yu, Pengxi Liu, Yanfeng Zhang, Ge Yu, Jeffrey Xu Yu

IEEE International Conference on Data Engineering (ICDE) 2024

In this paper, we propose a graph reordering method, GoGraph, which constructs a well-formed vertex processing order that effectively reduces the number of iteration rounds and, consequently, accelerates iterative computation. Before delving into GoGraph, we introduce a metric function to quantify how efficiently a vertex processing order accelerates iterative computation. This metric reflects the quality of the processing order by counting the number of edges whose source precedes the destination. GoGraph employs a divide-and-conquer approach to establish the vertex processing order by maximizing the value of the metric function.
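
As a rough illustration of the metric described above, the following sketch counts the edges whose source precedes the destination in a given processing order. It covers only the counting step, not GoGraph's divide-and-conquer reordering, and the function name count_forward_edges and the toy graph are hypothetical and not taken from the paper.

```python
def count_forward_edges(edges, order):
    """Count edges (src, dst) whose source appears before the destination in `order`.

    `edges` is an iterable of (src, dst) pairs; `order` lists every vertex once.
    Both names are illustrative assumptions, not the paper's notation.
    """
    # Map each vertex to its position in the processing order.
    position = {vertex: index for index, vertex in enumerate(order)}
    return sum(1 for src, dst in edges if position[src] < position[dst])


# Toy directed graph with four edges (hypothetical data).
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]

print(count_forward_edges(edges, ["a", "b", "c"]))  # 3 forward edges
print(count_forward_edges(edges, ["c", "b", "a"]))  # 1 forward edge
```

Under this metric the order a, b, c scores higher than c, b, a on the toy graph, so it would be the preferred processing order of the two.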
