{"id":22704,"date":"2022-02-01T18:00:48","date_gmt":"2022-02-01T10:00:48","guid":{"rendered":"https:\/\/cde.nus.edu.sg\/isem\/?post_type=nus-news&#038;p=22704"},"modified":"2024-02-21T16:15:51","modified_gmt":"2024-02-21T08:15:51","slug":"two-sided-deep-reinforcement-learning-for-dynamic-mobility-on-demand-management-with-mixed-autonomy","status":"publish","type":"nus-news","link":"https:\/\/cde.nus.edu.sg\/isem\/news\/two-sided-deep-reinforcement-learning-for-dynamic-mobility-on-demand-management-with-mixed-autonomy\/","title":{"rendered":"Two-Sided Deep Reinforcement Learning for Dynamic Mobility-on-Demand Management with Mixed-Autonomy"},"content":{"rendered":"<p class=\"CDt4Ke zfr3Q\" dir=\"ltr\"><span style=\"color: #0000ff\">Congratulations to Ms. Xie Jiaohong (supervised by A\/Prof. Chen Nan and Dr. Liu Yang), who won <strong>Second Place<\/strong> in the Student Research Competition in the virtual event on <em><strong>Artificial Intelligence Enabled Next Generation Transportation Systems<\/strong><\/em>. This was organized by the Artificial Intelligence in Transportation Committee of ASCE Transportation &amp; Development Institute. 
Her research and presentation are titled \u201c<strong>Two-Sided Deep Reinforcement Learning for Dynamic Mobility-on-Demand Management with Mixed-Autonomy<\/strong>\u201d.<\/span><\/p>\n<p dir=\"ltr\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-22705 \" src=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/ASCEDI.jpg\" alt=\"\" width=\"527\" height=\"392\" srcset=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/ASCEDI.jpg 1023w, https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/ASCEDI-300x223.jpg 300w, https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/ASCEDI-768x571.jpg 768w\" sizes=\"auto, (max-width: 527px) 100vw, 527px\" \/><\/p>\n<p class=\"CDt4Ke zfr3Q\" dir=\"ltr\"><span style=\"font-size: 10pt\"><em>Below is a short description of her work presented in this competition.<\/em><\/span><\/p>\n<p dir=\"ltr\">\n<section id=\"h.11ed490a9207cea2_92\" class=\"yaqOZd qeLZfd\">\n<div class=\"mYVXT\">\n<div class=\"LS81yb VICjCf j5pSsc db35Fc\">\n<div class=\"hJDwNd-AhqUyc-uQSCkd Ft7HRd-AhqUyc-uQSCkd purZT-AhqUyc-II5mzb ZcASvf-AhqUyc-II5mzb pSzOP-AhqUyc-qWD73c Ktthjf-AhqUyc-qWD73c JNdkSc SQVYQc\">\n<div class=\"JNdkSc-SmKAyb LkDMRd\">\n<div class=\"\">\n<div class=\"oKdM2c ZZyype Kzv0Me\">\n<div id=\"h.11ed490a9207cea2_95\" class=\"hJDwNd-AhqUyc-uQSCkd Ft7HRd-AhqUyc-uQSCkd jXK9ad D2fZ2 zu5uec OjCsFc dmUFtb wHaque g5GTcb JYTMs\">\n<div class=\"jXK9ad-SmKAyb\">\n<div class=\"tyJCtd mGzaTb Depvyb baZpAe\">\n<p class=\"CDt4Ke zfr3Q\" dir=\"ltr\">Autonomous vehicles (AVs) are expected to operate on Mobility-on-Demand (MoD) platforms because AV technology enables flexible self-relocation and system-optimal coordination. Unlike existing studies, which focus on MoD systems with a pure fleet of either AVs or conventional vehicles (CVs), we aim to optimize the dynamic fleet management of an MoD system with mixed autonomy, in which CVs and AVs operate side by side. 
We consider the realistic case in which human drivers may relocate freely and learn strategies to maximize their own compensation. In contrast, AVs are fully compliant with the platform&#8217;s decisions. To achieve a high level of service from the mixed fleet, we propose that the platform prioritize human drivers in matching decisions when on-demand requests arrive and dynamically determine the optimal commission fee to influence drivers&#8217; behavior.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/section>\n<section id=\"h.11ed490a9207cea2_15\" class=\"yaqOZd qeLZfd\">\n<div class=\"IFuOkc\"><\/div>\n<div class=\"mYVXT\">\n<div class=\"LS81yb VICjCf j5pSsc db35Fc\">\n<div class=\"hJDwNd-AhqUyc-OiUrBf Ft7HRd-AhqUyc-OiUrBf purZT-AhqUyc-II5mzb ZcASvf-AhqUyc-II5mzb pSzOP-AhqUyc-qWD73c Ktthjf-AhqUyc-qWD73c JNdkSc SQVYQc yYI8W HQwdzb\">\n<div class=\"JNdkSc-SmKAyb LkDMRd\">\n<div class=\"\">\n<div class=\"oKdM2c ZZyype Kzv0Me\">\n<div id=\"h.11ed490a9207cea2_14\" class=\"hJDwNd-AhqUyc-OiUrBf Ft7HRd-AhqUyc-OiUrBf jXK9ad D2fZ2 zu5uec OjCsFc dmUFtb\">\n<div class=\"jXK9ad-SmKAyb\">\n<div class=\"tyJCtd baZpAe\">\n<div class=\"t3iYD\">\n<figure id=\"attachment_22706\" aria-describedby=\"caption-attachment-22706\" style=\"width: 483px\" class=\"wp-caption alignleft\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-22706\" src=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig1.png\" alt=\"\" width=\"483\" height=\"252\" srcset=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig1.png 933w, https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig1-300x156.png 300w, https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig1-768x400.png 768w\" sizes=\"auto, (max-width: 483px) 100vw, 483px\" \/><figcaption id=\"caption-attachment-22706\" class=\"wp-caption-text\">Figure 1 Illustration of the MoD system 
with a mixed fleet<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"oKdM2c ZZyype\">\n<div id=\"h.11ed490a9207cea2_20\" class=\"hJDwNd-AhqUyc-OiUrBf Ft7HRd-AhqUyc-OiUrBf jXK9ad D2fZ2 zu5uec wHaque g5GTcb JYTMs\">\n<div class=\"jXK9ad-SmKAyb\">\n<div class=\"tyJCtd mGzaTb Depvyb baZpAe\">\n<p class=\"CDt4Ke zfr3Q\" dir=\"ltr\"><span style=\"font-size: 16px\">However, it is challenging to make efficient real-time fleet management decisions when spatio-temporal uncertainty in demand and complex interactions between human drivers and the operator are explicitly considered. To tackle these challenges, we develop a two-sided multi-agent deep reinforcement learning (DRL) approach, in which the operator acts as a supervisor agent on one side and makes centralized decisions on the mixed fleet, and each CV driver acts as an individual agent on the other side and learns to make desirable decisions non-cooperatively.<\/span><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"ltr\">\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/section>\n<section id=\"h.11ed490a9207cea2_38\" class=\"yaqOZd qeLZfd\">\n<div class=\"mYVXT\">\n<div class=\"LS81yb VICjCf j5pSsc db35Fc\">\n<div class=\"hJDwNd-AhqUyc-R6PoUb Ft7HRd-AhqUyc-R6PoUb JNdkSc SQVYQc L6cTce-purZT L6cTce-pSzOP\">\n<div class=\"JNdkSc-SmKAyb LkDMRd\">\n<div class=\"\"><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/section>\n<section id=\"h.11ed490a9207cea2_74\" class=\"yaqOZd qeLZfd\">\n<div class=\"mYVXT\">\n<div class=\"LS81yb VICjCf j5pSsc db35Fc\">\n<div class=\"hJDwNd-AhqUyc-uQSCkd Ft7HRd-AhqUyc-uQSCkd purZT-AhqUyc-II5mzb ZcASvf-AhqUyc-II5mzb pSzOP-AhqUyc-qWD73c Ktthjf-AhqUyc-qWD73c JNdkSc SQVYQc\">\n<div class=\"JNdkSc-SmKAyb LkDMRd\">\n<div class=\"\">\n<div class=\"oKdM2c ZZyype Kzv0Me\">\n<div id=\"h.11ed490a9207cea2_71\" class=\"hJDwNd-AhqUyc-uQSCkd Ft7HRd-AhqUyc-uQSCkd jXK9ad D2fZ2 zu5uec OjCsFc dmUFtb wHaque g5GTcb JYTMs\">\n<div class=\"jXK9ad-SmKAyb\">\n<div class=\"tyJCtd mGzaTb Depvyb 
baZpAe\">\n<p dir=\"ltr\">\n<figure id=\"attachment_22707\" aria-describedby=\"caption-attachment-22707\" style=\"width: 883px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-22707 size-full\" src=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig2.png\" alt=\"\" width=\"883\" height=\"357\" srcset=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig2.png 883w, https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig2-300x121.png 300w, https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig2-768x311.png 768w\" sizes=\"auto, (max-width: 883px) 100vw, 883px\" \/><figcaption id=\"caption-attachment-22707\" class=\"wp-caption-text\">Figure 2 The two-sided multi-agent reinforcement learning approach<\/figcaption><\/figure>\n<p class=\"CDt4Ke zfr3Q\" dir=\"ltr\">For the first time, a scalable algorithm that uses the advantage actor-critic (A2C) method and mean-field approximation to train the agents is developed for mixed fleet management. Furthermore, deep neural networks (DNNs) are adopted to enhance the approximation for our high-dimensional, large-scale problems. We propose a two-head policy network that enables the supervisor agent to make two sets of decisions with a single policy network, which greatly reduces the computational time. The proposed approach is validated through a case study of New York City using real taxi trip data. Results show that our algorithm makes high-quality decisions quickly and outperforms benchmark policies. 
Our fleet management strategy makes both the platform and the drivers better off, especially in scenarios with higher demand volume.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/section>\n<section id=\"h.11ed490a9207cea2_75\" class=\"yaqOZd qeLZfd\">\n<div class=\"IFuOkc\">\n<figure id=\"attachment_22708\" aria-describedby=\"caption-attachment-22708\" style=\"width: 572px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-22708 size-full\" src=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig3.png\" alt=\"\" width=\"572\" height=\"366\" srcset=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig3.png 572w, https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig3-300x192.png 300w\" sizes=\"auto, (max-width: 572px) 100vw, 572px\" \/><figcaption id=\"caption-attachment-22708\" class=\"wp-caption-text\">Figure 3 Training curves of gross merchandise value (GMV)<\/figcaption><\/figure>\n<\/div>\n<div class=\"mYVXT\">\n<div class=\"LS81yb VICjCf j5pSsc db35Fc\">\n<div class=\"hJDwNd-AhqUyc-qWD73c Ft7HRd-AhqUyc-qWD73c purZT-AhqUyc-II5mzb ZcASvf-AhqUyc-II5mzb pSzOP-AhqUyc-qWD73c Ktthjf-AhqUyc-qWD73c JNdkSc SQVYQc yYI8W HQwdzb\">\n<div class=\"JNdkSc-SmKAyb LkDMRd\">\n<div class=\"\">\n<div class=\"oKdM2c ZZyype\">\n<div id=\"h.11ed490a9207cea2_89\" class=\"hJDwNd-AhqUyc-qWD73c Ft7HRd-AhqUyc-qWD73c jXK9ad D2fZ2 zu5uec wHaque g5GTcb JYTMs\">\n<div class=\"jXK9ad-SmKAyb\">\n<div class=\"tyJCtd mGzaTb Depvyb baZpAe\">\n<figure id=\"attachment_22709\" aria-describedby=\"caption-attachment-22709\" style=\"width: 536px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-22709 size-full\" src=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig4.png\" alt=\"\" width=\"536\" height=\"349\" 
srcset=\"https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig4.png 536w, https:\/\/cde.nus.edu.sg\/isem\/wp-content\/uploads\/sites\/12\/2024\/02\/XJH-fig4-300x195.png 300w\" sizes=\"auto, (max-width: 536px) 100vw, 536px\" \/><figcaption id=\"caption-attachment-22709\" class=\"wp-caption-text\">Figure 4 Training curves of order fulfilment rate (OFR)<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/section>\n","protected":false},"excerpt":{"rendered":"<p>Congratulations to Ms. Xie Jiaohong (supervised by A\/Prof. Chen Nan and Dr. Liu Yang), who won Second Place in the Student Research Competition in the virtual event on Artificial Intelligence Enabled Next Generation Transportation Systems. This was organized by the Artificial Intelligence in Transportation Committee of ASCE Transportation &amp; Development Institute. Her research and presentation<\/p>\n","protected":false},"author":26,"featured_media":0,"parent":0,"menu_order":0,"template":"","meta":{"_acf_changed":false,"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"default","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"footnotes":""},"news_category":[],"class_list":["post-22704","nus-news","type-nus-news","status-publish","hentry"],"acf":[],"_links":{"self":[{"href":"https:\/\/cde.nus.edu.sg\/isem\/wp-json\/wp\/v2\/news\/22704","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cde.nus.edu.sg\/isem\/wp-json\/wp\/v2\/news"}],"about":[{"href":"https:\/\/cde.nus.edu.sg\/isem\/wp-json\/wp\/v2\/types\/nus-news"}],"author":[{"embeddable":true,"href":"https:\/\/cde.nus.edu.sg\/isem\/wp-json\/wp\/v2\/users\/26"}],"version-history":[{"count":1,"href":"https:\/\/cde.nus.edu.sg\/isem\/wp-json\/wp\/v2\/news\/22704\/revisions"}],"predecessor-version":[{"id":22710,"href":"https:\/\/cde.nus.edu.sg\/isem\/wp-json\/wp\/v2\/news\/22704\/revisions\/22710"}],"wp:attachment":[{"href":"https:\/\/cde.nus.edu.sg\/isem\/wp-json\/wp\/v2\/media?parent=22704"}],"wp:term":[{"taxonomy":"news_category","embeddable":true,"href":"https:\/\/cde.nus.edu.sg\/isem\/wp-json\/wp\/v2\/news_category?post=22704"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}