The source of the IJCAI2017 paper "Modeling Trajectory with Recurrent Neural Networks"
The source code of my paper.
The directory structure may be as follows
workspace (e.g., /data)
└ dataset_name1 (e.g., porto_6k)
├ data
| └ your_traj_file.txt
├ map
| ├ nodeOSM.txt
| └ edgeOSM.txt
└ ckpt
└ CSSRNN
├ dest_emb
| └ emb_50_hid_50_deep_1
├ dest_coord
| └ emb_200_hid_50_deep_1
└ without_dest
└ emb_250_hid_350_deep_3
codespace
├ config
├ main.py
├ geo.py
├ trajmodel.py
└ ngram_model.py
The map is recommended to be constructed from OpenStreetMap.
You can get your own road network file from OpenStreetMap by selecting the rectangle area and export the file by, say, Overpass API
(the first choice in the web page).
Then you will be prompted to download a data named map
which is actually an XML-formatted file.
To parse the raw map data, you can use the tool here.
After successfully parsed the raw map data, the nodeOSM.txt
and edgeOSM.txt
will be automatically extracted.
The format of these two files are shown as follows.
Format: [NodeID]
\t[latitude]
\t[longitude]
One node/vertex per line with increasing (continuous) ids. E.g.,
0 41.1689665 -8.6444747
1 41.1658735 -8.6444774
2 41.1670798 -8.6424338
3 41.1673856 -8.642543
4 41.1669776 -8.6417132
5 41.1676312 -8.6424866
...
1000 41.1575375 -8.6443184
, which records 1,000 vertices in the road network.
Format: [EdgeId]
\t[StartNodeId]
\t[EndNodeId]
\t[k]
\t[lat1]
\t[lon1]
\t[lat2]
\t[lon2]
...\t[latk]
\t[lonk]
One edge (StartNode
-> EndNode
) per line with increasing (continuous) ids.
And k
refers to the number of points of a polyline representing the shape of the road (including the start and the end node).
I.e., (lat1, lon1)
is just the coordinate of StartNode
, and (latk, lonk)
is the coordinate of EndNode
.
E.g.,
0 0 1326 5 41.1689665 -8.6444747 41.1688112 -8.6443785 41.1685579 -8.6440804 41.1683059 -8.6438068 41.1680768 -8.6437482
1 4 5 2 41.1669776 -8.6417132 41.1676312 -8.6424866
...
1500 1499 1494 2 41.1849529 -8.6317477 41.185196 -8.6318118
, which records 1,500 edges in the road network.
Edge 0
represents an edge from node 0
to node 1326
with 5
points representing the shape of the road as a
polyline. And edge 1
represents an edge from node 4
to 5
with 2
points representing the shape, which means the
shape of this road is a straight line segment (i.e., the first point(41.1669776, -8.6417132)
is just the coordinate of
node 4
and the second point (41.1676312, -8.6424866)
is just the coordinate of node 5
).
The format of trajectory data is very simple. The data may like as follows,
1,2,4,6,8,12,7,23,
9,2,4,18,76,42,3,78,98,54,
432,214,678,532,3,5,74,13,123,67,4,
...
Each line records several edge ids in the road network, which represents a trajectory w.r.t. the definition introduced in the paper, i.e.,
Definition 2 (Trajectory). A trajectory T in the form of r_1 → r_2 → … → r_k captures the movement of an object from r_1 to r_2 and so on to r_k along the road network G, where every two consecutive road segments are connected, i.e., \forall r_i, r_{i+1} ∈ T, r_i, r_{i+1}∈ E ∧ r_i.e = r_{i+1}.s.
The trajectory (route) is generated by the HMM map matching algorithm Hidden Markov Map Matching Through Noise and Sparseness using the raw GPS trajectory data (sequence of coordinates, geographical coordinate should be transferred into rectangular coordiante). The kernel code is implemented in C++ (see HMM_mapmatching.cpp
). There remain some function to be implemented by your own. For more detail, please refer to the comments in HMM_mapmatching.cpp
.
The code can be successfully run under following environments
[TODO] I'll modify some code to make it compatible with newest API of Tensorflow.
The project will also need following package dependencies
config
filemain.py
All model settings are included in config
file or can be set through the instance of Config
class.
To run the model, the following settings are important and should be set according to your own dataset.
dataset_name
: give a name to your own dataset and put all stuffs w.r.t. this dataset into the directory named by
this name (as the directory tree structure in the above section).workspace
: the place you want to put all data in, e.g. \home\data
. Note that you may have several datasets,
e.g., with the names being, dataset1
, dataset2
, ... .
The directory may like as follows,/home/data/dataset1/data
/home/data/dataset1/map
/home/data/dataset1/ckpt
/home/data/dataset2/data
/home/data/dataset2/map
/home/data/dataset2/ckpt
...
file_name
: set this as the name of your own trajectory file. If you follow the directory tree sturcture as above,
you may set the file_name
as data/trajectory.txt
since the file is located in the data/
directory.If you have correctly set these fields in the config, the model will be able to run with configuration printed in the screen.
To get the details of the remaining attributes of config, please refer to main.py
. Each field is described in detail
in comments.