Spring 2017 Deep Reinforcement Learning Final Project
Applying learning algorithms to classic radio tasks
Preliminary Analysis (data-driven approaches to classic radio tasks)
Decentralized, multi-agent learning of modulation
Two agents, shared preamble, transmitter and receiver architecture, echoing, no other shared information
Two evaluation methods:
4/11: fixed Tx/Rx
4/17: one-page description due at beginning of class
5/8: 5-8 page report
Look at APSK, BPSK, QPSK, 16-QAM. Do we want to parameterize the output of the transmitter as Cartesian or polar?
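To make the Cartesian vs. polar question concrete, here is a rough sketch of the reference constellations in both forms. The function names (`psk`, `qam16`, `to_polar`, `from_polar`) are made up for illustration, the 16-QAM is assumed to be the square variant normalized to unit average power, and `psk(4)` gives QPSK only up to a rotation.

```python
import cmath
import math


def psk(m):
    """M-PSK points on the unit circle (m=2 is BPSK, m=4 is a rotated QPSK)."""
    return [cmath.exp(2j * math.pi * k / m) for k in range(m)]


def qam16():
    """Square 16-QAM, scaled so the average symbol power is 1 (assumption)."""
    levels = [-3, -1, 1, 3]
    pts = [complex(i, q) for i in levels for q in levels]
    avg_power = sum(abs(p) ** 2 for p in pts) / len(pts)
    return [p / math.sqrt(avg_power) for p in pts]


def to_polar(points):
    """Cartesian (complex) -> list of (radius, phase) pairs."""
    return [cmath.polar(p) for p in points]


def from_polar(pairs):
    """List of (radius, phase) pairs -> Cartesian (complex)."""
    return [cmath.rect(r, phi) for r, phi in pairs]
```

In polar form the power constraint touches only the radius, while in Cartesian form it couples x and y; that may matter when choosing what the transmitter network outputs.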
Fixed Tx, learn Rx: input is the (x, y) components of the complex sample; softmax output + epsilon-greedy / Boltzmann exploration
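A minimal sketch of the receiver's action selection, assuming a linear scoring layer in place of whatever network we end up using. The weight layout `(wx, wy, b)` per symbol and all function names are placeholders, not a settled design.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; higher temperature flattens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(weights, x, y):
    """One linear score per symbol; weights is a list of (wx, wy, b) rows (assumed)."""
    return [wx * x + wy * y + b for wx, wy, b in weights]


def eps_greedy(logits, eps, rng):
    """With probability eps pick a uniformly random symbol, otherwise the argmax."""
    if rng.random() < eps:
        return rng.randrange(len(logits))
    return max(range(len(logits)), key=logits.__getitem__)


def boltzmann(logits, temperature, rng):
    """Sample a symbol from the softmax distribution at the given temperature."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Epsilon-greedy explores uniformly regardless of how confident the receiver is, whereas Boltzmann exploration concentrates its exploration on symbols with similar scores.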
Reward shaping for transmitter: need to restrict power and maximize distance between points. The former must be weighted more strongly than the latter to prevent outer points from flying away.
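One possible form of the shaped reward, sketched under assumptions: "distance between points" is taken to be the minimum pairwise distance, the power term penalizes average power above a budget, and the weights and budget are placeholder values to be tuned.

```python
def shaped_reward(points, power_weight=2.0, distance_weight=1.0, power_budget=1.0):
    """Reward = weighted minimum pairwise distance minus weighted excess power.

    points: list of complex constellation points (at least two).
    All default weights are illustrative assumptions, not tuned values.
    """
    avg_power = sum(abs(p) ** 2 for p in points) / len(points)
    min_dist = min(
        abs(a - b) for i, a in enumerate(points) for b in points[i + 1:]
    )
    # Only power above the budget is penalized.
    excess_power = max(0.0, avg_power - power_budget)
    return distance_weight * min_dist - power_weight * excess_power
```

With these defaults, scaling a unit-power QPSK up by 3 lowers the reward, because the quadratic power penalty outgrows the linear gain in distance. If `power_weight` is set too small, the same scaling can raise the reward, which is the fly-away behavior noted above.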
Tx -> Rx, and Rx gives reward back to Tx: Rx provides a k-NN guess for each data sample back to Tx.
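A rough sketch of the k-NN feedback loop, assuming the Rx holds labeled samples (for example, collected from the shared preamble) and the Tx scores itself by comparing the returned guesses with what it sent. The +1/-1 reward values and the function names are placeholders.

```python
from collections import Counter


def knn_guess(sample, labeled, k=3):
    """Majority label among the k labeled samples closest to `sample`.

    labeled: list of (complex_sample, label) pairs, e.g. from the preamble.
    """
    nearest = sorted(labeled, key=lambda item: abs(item[0] - sample))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def feedback_for_tx(sent_labels, received, labeled, k=3):
    """Rx guesses each received sample; Tx gets +1 per match and -1 per miss."""
    guesses = [knn_guess(r, labeled, k) for r in received]
    rewards = [1.0 if g == s else -1.0 for g, s in zip(guesses, sent_labels)]
    return guesses, rewards
```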
Tx on both sides, Rx on both sides.