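Notes on the SUMO Library for Intelligent Traffic Light Optimization
The SUMO project
I have recently been looking into traffic light optimization and came across the SUMO project. This post is only a brief record and summary of what I found.
SUMO project introduction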
https://www.dlr.de/ts/en/desktopdefault.aspx/tabid-9883/16931_read-41000/
Related papers
Tutorials
Learning platforms
- DLR
- Other platforms https://wenku.baidu.com/view/dd44decec8d376eeaeaa31a1.html
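Common commands
- netconvert: converts an .osm file exported from OpenStreetMap into a .net.xml file, which provides the road network for the traffic scenario;
- polyconvert: generates the .poly.xml file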
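As a rough sketch of how these two tools are usually called, the snippet below only assembles the command lines and prints them. The file names `map.osm`, `map.net.xml`, and `map.poly.xml` are placeholders of mine, not files from these notes, and a real polyconvert run usually also needs a `--type-file` typemap, which I leave out here.

```python
import shlex


def netconvert_cmd(osm_file, net_file):
    """Command line that converts an OpenStreetMap file into a SUMO network."""
    return ["netconvert", "--osm-files", osm_file, "--output-file", net_file]


def polyconvert_cmd(net_file, osm_file, poly_file):
    """Command line that extracts polygons (buildings, water, ...) for that network."""
    return ["polyconvert", "--net-file", net_file, "--osm-files", osm_file,
            "--output-file", poly_file]


# Print the commands; hand a list to subprocess.run(cmd, check=True) to execute it.
for cmd in (netconvert_cmd("map.osm", "map.net.xml"),
            polyconvert_cmd("map.net.xml", "map.osm", "map.poly.xml")):
    print(shlex.join(cmd))
```

Keeping the commands as argument lists avoids shell quoting problems when paths contain spaces.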
Configuration files
- .con.xml
- .net.xml
  - tlLogic
- .rou.xml
- .nod.xml
- .edg.xml
- .det.xml
- .sumocfg: the main configuration file; it points to the network file, the route file, and the other input files.
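To make the role of the .sumocfg file concrete, here is a minimal sketch that generates one from Python. The file names `demo.net.xml` and `demo.rou.xml` and the 0 to 3600 s time window are assumptions for illustration only; a real configuration can carry many more sections.

```python
import xml.etree.ElementTree as ET


def build_sumocfg(net_file, route_file, begin=0, end=3600):
    """Return the text of a minimal .sumocfg pointing at a network and a route file."""
    root = ET.Element("configuration")
    inputs = ET.SubElement(root, "input")
    ET.SubElement(inputs, "net-file", value=net_file)
    ET.SubElement(inputs, "route-files", value=route_file)
    time = ET.SubElement(root, "time")
    ET.SubElement(time, "begin", value=str(begin))
    ET.SubElement(time, "end", value=str(end))
    return ET.tostring(root, encoding="unicode")


# Placeholder file names; write the returned string to e.g. demo.sumocfg.
print(build_sumocfg("demo.net.xml", "demo.rou.xml"))
```

The resulting file would then be passed to SUMO with `sumo -c demo.sumocfg`.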
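Commonly used Python modules
- traci
- TLS (traffic light signal for a SUMO network) https://sumo.dlr.de/daily/pydoc/sumolib.net.html#TLS
- TLSProgram
- tensorflow: after installing it, numpy raised an error; upgrading numpy to 1.16 fixed it: https://github.com/alpacahq/pylivetrader/issues/73 Example command for upgrading a module: pip3 install -U packageName
- rl: reinforcement learning library. Introduction: "强化学习 Reinforcement Learning 教程系列 | 莫烦Python" (a reinforcement learning tutorial series). The installation ran into a problem; the fix was: pip3 install keras-rl (keras-rl · PyPI)
- Ray: a distributed execution platform released by the RISE Lab at UC Berkeley. Ray RLlib is a scalable reinforcement learning library that runs across multiple machines, is fully compatible with OpenAI gym, and supports TensorFlow and PyTorch. Ray Tune is an efficient distributed hyperparameter search library that provides an API for deep learning, reinforcement learning, and other compute-intensive tasks.
  Installation problem: https://github.com/ray-project/ray/issues/2683 According to this issue, the ray module did not support Windows at the time of writing; it can be installed on Linux or on the Windows Subsystem for Linux: https://pypi.org/simple/ray/
  When using DQNAgent from Ray, this error was raised: 'PolicyEvaluator' object has no attribute 'sampler'
- keras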
TraCI API
Key connection functions
- simulationStep(self, step=0.)
Make a simulation step and simulate up to the given second in sim time.
If the given value is 0 or absent, exactly one step is performed.
Values smaller than or equal to the current sim time result in no action.
Key simulation functions
- getCurrentTime(self)
getCurrentTime() -> integer
Returns the current simulation time in ms.
- getLoadedNumber(self)
getLoadedNumber() -> integer
Returns the number of vehicles which were loaded in this time step.
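A small sketch of how the stepping and simulation calls above fit together. The function receives the traci module (or any object with the same methods) as a parameter so that it can be tried out without a running SUMO; that parameter and the function name `run_simulation` are my own choices, not part of TraCI.

```python
def run_simulation(traci_module, num_steps):
    """Advance the simulation one step at a time and log (time_ms, loaded vehicles)."""
    log = []
    for _ in range(num_steps):
        traci_module.simulationStep()  # no argument: exactly one step is performed
        log.append((traci_module.simulation.getCurrentTime(),
                    traci_module.simulation.getLoadedNumber()))
    return log
```

With a real installation this would be used roughly as `import traci; traci.start(["sumo", "-c", "demo.sumocfg"]); run_simulation(traci, 100); traci.close()`, where `demo.sumocfg` is a placeholder. Newer TraCI versions also offer `simulation.getTime()`, which reports seconds instead of milliseconds.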
#### Key lane functions
- getLastStepVehicleNumber(self, laneID)
getLastStepVehicleNumber(string) -> integer
Returns the total number of vehicles for the last time step on the given lane.
- getLastStepHaltingNumber(self, laneID)
getLastStepHaltingNumber(string) -> integer
Returns the total number of halting vehicles for the last time step on the given lane. A speed of less than 0.1 m/s is considered a halt.
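The two lane queries are typically summed over all lanes entering a junction. A minimal sketch, assuming `lane_domain` is `traci.lane` (or a stand-in with the same two methods); the helper name `queue_summary` is hypothetical.

```python
def queue_summary(lane_domain, lane_ids):
    """Return (vehicles, halting vehicles) summed over the given lanes."""
    # A lane can be listed several times (e.g. once per controlled link),
    # so drop duplicates while keeping the order.
    unique_lanes = list(dict.fromkeys(lane_ids))
    vehicles = sum(lane_domain.getLastStepVehicleNumber(lane) for lane in unique_lanes)
    halting = sum(lane_domain.getLastStepHaltingNumber(lane) for lane in unique_lanes)
    return vehicles, halting
```

For one traffic light, a call could look like `queue_summary(traci.lane, traci.trafficlight.getControlledLanes(tls_id))`.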
#### Key vehicle functions
- getWaitingTime(self, vehID)
getWaitingTime() -> double
Computes the waiting time of a vehicle (time during which the vehicle stops on purpose, for example at a service or parking area, does not count as waiting time).
The waiting time of a vehicle is defined as the time (in seconds) spent with a
speed below 0.1m/s since the last time it was faster than 0.1m/s.
(basically, the waiting time of a vehicle is reset to 0 every time it moves).
A vehicle that is stopping intentionally with a <stop> does not accumulate waiting time.
- getAccumulatedWaitingTime(self, vehID)
getAccumulatedWaitingTime() -> double
Computes the accumulated waiting time of a vehicle over a time interval; the interval length is set with waiting-time-memory.
The accumulated waiting time of a vehicle collects the vehicle's waiting time over a certain time interval (interval length is set per option '--waiting-time-memory')
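The difference between the two waiting times is easier to see on a toy speed trace. The function below is a simplified pure-Python model of the definitions quoted above, not SUMO's implementation; `memory` is a hypothetical stand-in for the `--waiting-time-memory` option and the one-second step is an assumption.

```python
HALT_SPEED = 0.1  # m/s, the threshold quoted in the docstrings


def waiting_times(speeds, step_length=1.0, memory=100.0):
    """Return per-step (waiting_time, accumulated_waiting_time) for one vehicle.

    speeds: speed in m/s at each simulation step.
    memory: length of the look-back window in seconds.
    """
    window = max(1, int(round(memory / step_length)))
    result, current, halted = [], 0.0, []
    for v in speeds:
        is_halted = v < HALT_SPEED
        # The plain waiting time is reset as soon as the vehicle moves again.
        current = current + step_length if is_halted else 0.0
        halted.append(is_halted)
        # The accumulated value counts every halted step inside the window.
        accumulated = sum(halted[-window:]) * step_length
        result.append((current, accumulated))
    return result


print(waiting_times([5, 0, 0, 3, 0, 0, 0]))
```

In this trace the vehicle moves again at the fourth step, so its waiting time drops back to 0 there, while the accumulated waiting time keeps the 2 seconds already collected and continues to grow afterwards.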
#### Key trafficlight functions
- getIDList(self)
getIDList() -> list(string)
Returns a list of all objects in the network.
- getPhase(self,tlsID)
getPhase(string) -> integer
Note: this function returns the phase index, not the phase itself!
Returns the index of the current phase within the list of all phases of the current program.
- getPhaseDuration(self, tlsID)
getPhaseDuration(string) -> double
Returns the total duration of the current phase (in seconds). This value is not affected by the elapsed or remaining duration of the current phase.
- setPhase(self,tlsID,index)
setPhase(string, integer) -> None
Switches to the phase with the given index in the list of all phases for the current program.
- setPhaseDuration(self,tlsID,phaseDuration)
setPhaseDuration(string, double) -> None
Sets the remaining duration of the current phase in seconds. This value has no effect on subsequent repetitions of this phase.
- getControlledLanes(self, tlsID)
getControlledLanes(string) -> list(string)
Returns the list of lanes which are controlled by the named traffic light.
- setCompleteRedYellowGreenDefinition(self, tlsID, tls)
setCompleteRedYellowGreenDefinition(string, Logic) -> None
Sets a new program for the given tlsID from a Logic object. See getCompleteRedYellowGreenDefinition.
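A minimal sketch of how getPhase, setPhase, and setPhaseDuration can be combined. `tl` is meant to be `traci.trafficlight` (or a stand-in with the same methods); the helper names and the caller-supplied `num_phases` are assumptions for illustration.

```python
def advance_phase(tl, tls_id, num_phases):
    """Switch the traffic light to the next phase index, wrapping around."""
    current = tl.getPhase(tls_id)  # an index into the phase list, not a state string
    nxt = (current + 1) % num_phases
    tl.setPhase(tls_id, nxt)
    return nxt


def hold_current_phase(tl, tls_id, seconds):
    """Keep the current phase for `seconds` more; later cycles are unaffected."""
    tl.setPhaseDuration(tls_id, seconds)
```

A call such as `advance_phase(traci.trafficlight, tls_id, num_phases)` would take `tls_id` from `getIDList()`.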
Key TraCI class constructors
- Phase
    class Phase:
        def __init__(self, duration, state, minDur=-1, maxDur=-1, next=-1):
            self.duration = duration
            self.state = state
            self.minDur = minDur  # minimum duration (only for actuated tls)
            self.maxDur = maxDur  # maximum duration (only for actuated tls)
            self.next = next

        def __repr__(self):
            return ("Phase(duration=%s, state='%s', minDur=%s, maxDur=%s, next=%s)" %
                    (self.duration, self.state, self.minDur, self.maxDur, self.next))
- Logic
    class Logic:
        def __init__(self, programID, type, currentPhaseIndex, phases=None, subParameter=None):
            self.programID = programID
            self.type = type
            self.currentPhaseIndex = currentPhaseIndex
            self.phases = phases if phases is not None else []
            self.subParameter = subParameter if subParameter is not None else {}

        def getPhases(self):
            return self.phases

        def getSubID(self):
            return self.programID

        def getType(self):
            return self.type

        def getParameters(self):
            return self.subParameter

        def getParameter(self, key, default=None):
            return self.subParameter.get(key, default)

        def __repr__(self):
            return ("Logic(programID='%s', type=%s, currentPhaseIndex=%s, phases=%s, subParameter=%s)" %
                    (self.programID, self.type, self.currentPhaseIndex, self.phases, self.subParameter))
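Using the two constructors shown above, a signal program can be assembled as in the sketch below. The four-link junction, the state strings, and the 30 s / 4 s durations are made up for illustration; `tl` stands for `traci.trafficlight` (anything exposing `Phase` and `Logic` with the signatures quoted above).

```python
def two_phase_program(tl, green=30, yellow=4):
    """Build a Logic for a hypothetical junction with four controlled links."""
    phases = [
        tl.Phase(green, "GGrr"),   # links 0-1 green, links 2-3 red
        tl.Phase(yellow, "yyrr"),  # yellow before handing over
        tl.Phase(green, "rrGG"),   # links 2-3 green, links 0-1 red
        tl.Phase(yellow, "rryy"),
    ]
    # Arguments: programID, type (0, as used elsewhere in these notes),
    # currentPhaseIndex, phases.
    return tl.Logic("sketch", 0, 0, phases)
```

The program would then be installed with `traci.trafficlight.setCompleteRedYellowGreenDefinition(tls_id, two_phase_program(traci.trafficlight))`. Each state string needs one character per link controlled by the real traffic light, so "GGrr" only fits a junction with exactly four links.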
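Custom class structure
    import traci

    class TrafficSignal:
        def __init__(self, env, ts_id, delta_time, min_green, max_green, phases):
            self.id = ts_id
            self.env = env
            self.time_on_phase = 0
            self.delta_time = delta_time
            self.min_green = min_green
            self.max_green = max_green
            self.green_phase = 0
            # Half of the phases count as green phases
            # (assumes every green phase is followed by one yellow phase).
            self.num_green_phases = len(phases) // 2
            # _compute_edges and _compute_edges_capacity are helper methods
            # of this class that are not shown in this excerpt.
            self.edges = self._compute_edges()
            self.edges_capacity = self._compute_edges_capacity()
            # Logic(programID, type, currentPhaseIndex, phases)
            logic = traci.trafficlight.Logic("ProgramID", 0, 0, phases)
            traci.trafficlight.setCompleteRedYellowGreenDefinition(self.id, logic)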
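Custom computation modules
- edges_capacity
Edge capacity
- lane_density
Lane density
- stopped_vehicles_num
Average number of stopped vehicles: when a new green phase starts, the halting vehicles of the previous simulation step are summed over the lanes of each edge
[sum([traci.lane.getLastStepHaltingNumber(lane) for lane in self.edges[p]]) for p in range(self.num_green_phases)]
- road_waiting_time
Average waiting/delay time on the road
- total_stopped
Total number of stops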
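These notes do not spell out the formulas behind the capacity and density measures, so the following is only one plausible reading. It assumes a vehicle occupies 7.5 m of lane (5 m vehicle length plus 2.5 m minimum gap, SUMO's defaults for passenger cars); all function names are hypothetical.

```python
def lane_capacity(lane_length_m, vehicle_space_m=7.5):
    """Rough number of vehicles that fit on a lane when queued bumper to bumper."""
    return lane_length_m / vehicle_space_m


def lane_density(vehicle_count, lane_length_m, vehicle_space_m=7.5):
    """Occupancy in [0, 1]: vehicles on the lane divided by its capacity."""
    return min(1.0, vehicle_count / lane_capacity(lane_length_m, vehicle_space_m))


def stopped_per_phase(halting_by_lane, lanes_by_phase):
    """Sum the halting vehicles over the lanes served by each green phase."""
    return [sum(halting_by_lane[lane] for lane in lanes) for lanes in lanes_by_phase]
```

For example, a 150 m lane holds about 20 vehicles under this assumption, so 10 vehicles give a density of 0.5. In a live simulation the counts would come from `traci.lane.getLastStepVehicleNumber` and `traci.lane.getLastStepHaltingNumber`, and the length from `traci.lane.getLength`.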
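Module wrappers
- update_phase
Changes the phase
- waiting_time_reward
Reward function based on the average waiting/delay time per vehicle
- queue_average_reward
Reward function based on the average queue length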
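The exact reward formulas are not given here. One common shape, which I assume in this sketch, is to reward the improvement of a measure since the previous decision, so that a shrinking waiting time or queue yields a positive reward. The class name `RewardTracker` is hypothetical.

```python
class RewardTracker:
    """Rewards defined as the improvement of a measure since the last decision."""

    def __init__(self):
        self.last_waiting = 0.0
        self.last_queue = 0.0

    def waiting_time_reward(self, total_waiting_time):
        # Positive when the summed waiting time went down.
        reward = self.last_waiting - total_waiting_time
        self.last_waiting = total_waiting_time
        return reward

    def queue_average_reward(self, stopped_per_phase):
        # Positive when the mean queue length over the green phases went down.
        average = sum(stopped_per_phase) / len(stopped_per_phase)
        reward = self.last_queue - average
        self.last_queue = average
        return reward
```

Because both functions remember the last measurement, one tracker instance should be kept per traffic signal.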
Data
- https://sourceforge.net/projects/sumo/files/traffic_data/scenarios/
- http://opendata.dc.gov/datasets/31ccad579ec449938c13a0ac82c9f46e_16
Paper reading
Towards a unified Evaluation of Traffic Light Algorithms
- keywords: traffic lights evaluation, traffic simulation, performance metrics
- metrics to monitor the traffic light algorithms: waiting time, queue size, delay, travel time
Practice
Problems encountered
- SUMO_HOME is not set. On Linux, for example, running export SUMO_HOME="/usr/share/sumo" is enough
- Could not access configuration 'XXXXX configuration file XXXXX'
- Error: Answered with error to command 0xa2: Traffic light '0' is not known
- ImportError: cannot import name 'Timestamp' from 'pandas.lib' Fix: remove the .lib from the import, so that Timestamp is imported from pandas
- Could not connect to TraCI server at localhost:XXXXX
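For the first problem in the list, scripts that use TraCI usually check SUMO_HOME up front and add its tools directory to the Python path. A minimal sketch of that check; the function name `sumo_tools_path` is my own.

```python
import os
import sys


def sumo_tools_path(environ=os.environ):
    """Return $SUMO_HOME/tools, or raise if SUMO_HOME is not set."""
    home = environ.get("SUMO_HOME")
    if not home:
        raise EnvironmentError("please declare the environment variable 'SUMO_HOME'")
    return os.path.join(home, "tools")


try:
    # Makes `import traci` resolvable when SUMO is installed.
    sys.path.append(sumo_tools_path())
except EnvironmentError as err:
    print(err)
```

Failing early with a clear message is easier to debug than a later ImportError for traci.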
Commands
- Single-lane single intersection: python3 experiments/ql_single-intersection.py -route /home/zhuge/Documents/TrafficSim/sumo-rl/nets/single-intersection/single-intersection.rou.xml -gui
- Two-lane single intersection: python3 experiments/ql_single-intersection.py -route /home/zhuge/Documents/TrafficSim/sumo-rl/nets/2way-single-intersection/single-intersection.rou.xml -gui
- Two-lane 4*4 road network: parallel DQN
- Two-lane 4*4 road network: parallel A
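Reinforcement learning
General workflow of reinforcement learning
Key points
- State space definition: the state space is a quantified description of the environment state; for the traffic light problem it can be defined as the vehicle density
- Action space definition
- Transition function design
- Reward function design: for this problem, how well the traffic light is optimized shows up in measures such as the average vehicle waiting time at the intersection, the average delay per vehicle, and the queue length, so the reward function can be designed around these.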
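To show what a density-based state could look like for a tabular method, the sketch below buckets each density into ten levels and combines them with the current green phase. The bucket count and the state layout are assumptions of mine, not a definition taken from these notes.

```python
def discretize(density, bins=10):
    """Map a density in [0, 1] to an integer bucket 0..bins-1."""
    return min(int(density * bins), bins - 1)


def encode_state(green_phase, lane_densities, bins=10):
    """Hashable state: current green phase plus one density bucket per lane group."""
    return (green_phase,) + tuple(discretize(d, bins) for d in lane_densities)


print(encode_state(1, [0.05, 0.42, 1.0]))  # (1, 0, 4, 9)
```

Returning a tuple keeps the state usable as a dictionary key, which is what a Q-table needs.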
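Simulation parameters
- seconds: number of seconds the simulation runs, a fixed value; a longer run generally shows the effect more clearly
- alpha: learning rate
- gamma: discount rate
- epsilon: exploration rate
- reward: reward function
- runs
Example command: python3 experiments/ql_2way-single-intersection.py -route /home/zhuge/Documents/TrafficSim/sumo-rl/nets/2way-single-intersection/single-intersection.rou.xml -g 0.9 -r waitingtime -runs 10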
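The roles of alpha, gamma, and epsilon are easiest to see in a tabular Q-learning agent. This is a generic textbook sketch, not the code behind the experiment scripts above; the default values are arbitrary.

```python
import random
from collections import defaultdict


class QLearningAgent:
    def __init__(self, num_actions, alpha=0.1, gamma=0.9, epsilon=0.05, seed=None):
        self.num_actions = num_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # Unknown states start with all-zero action values.
        self.q = defaultdict(lambda: [0.0] * num_actions)
        self.rng = random.Random(seed)

    def act(self, state):
        """Epsilon-greedy: explore with probability epsilon, else take the best action."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_actions)
        values = self.q[state]
        return values.index(max(values))

    def learn(self, state, action, reward, next_state):
        """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

Here alpha scales how far each update moves the stored value, gamma weights future rewards against the immediate one, and epsilon sets how often a random action is tried instead of the best known one.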
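OpenAI's gym library
- gym is a library that provides reinforcement learning environments; its built-in classic environments include CartPole-v0 and MountainCar-v0
Parameter search
Statistical analysis
Evaluating how good a traffic light algorithm is. Example command: python3 plot.py -f 2way-single-intersection\a3cteste9.csv
v0.1 closed-loop steps
Output the starting phase and the ending phase
Summary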
Existing traffic signal control systems/algorithms
- UTOPIA
- ...
Related reference projects
- https://github.com/flow-project/flow
- https://github.com/sumoprojects/SumoEasyMiner
- https://github.com/JDGlick/sumo_reinforcement_learning
- https://github.com/lcodeca/MoSTScenario
- https://github.com/LucasAlegre/sumo-rl/tree/master/experiments
- https://github.com/Starofall/CrowdNav
- https://github.com/openai/universe
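Further reading
- Ray: https://www.jiqizhixin.com/articles/2018-01-10-2
- Berkeley RISE: https://rise.cs.berkeley.edu/
Tasks
- Using the existing reward functions, run several episodes of reinforcement learning and, for each episode, output the signal phase timings in the first and last states together with the scores (number of stops, waiting time, queue length); visualize the results and check whether performance tends to improve as the number of training episodes grows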
- Update the reward function: drawing on papers read earlier, find a change that looks reasonable and see whether it improves performance