Spring 2017 Deep Reinforcement Learning Final Project
Introduction
Applying learning algorithms to classic radio tasks
Related Work
Background Information
Preliminary Analysis (Data-driven approaches to classic radio tasks)
Problem formulation
Decentralized, multi-agent learning of modulation
Problem setup
Two agents, shared preamble, transmitter and receiver architecture, echoing, no other shared information
Results
Two evaluation methods:
4/11: fixed Tx/Rx
4/17: one-page description due at beginning of class
5/8: 5-8 page report
look at APSK, BPSK, QPSK, 16-QAM
do we want to parameterize the output of the transmitter as cartesian or polar?
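To make the cartesian-versus-polar question concrete, here is a minimal sketch in Python with NumPy that builds reference constellations and converts between the two parameterizations. The function names (`psk`, `qam16`, `to_cartesian`, `to_polar`, `from_polar`) and the unit-average-power normalization are assumptions for illustration, not part of the project. APSK is left out because the notes do not say how many rings or points per ring to use.

```python
import numpy as np

def psk(order):
    """M-PSK points on the unit circle (order=2 is BPSK, order=4 is QPSK)."""
    k = np.arange(order)
    # Assumption: rotate QPSK by 45 degrees, the common textbook layout.
    offset = np.pi / 4 if order == 4 else 0.0
    return np.exp(1j * (2 * np.pi * k / order + offset))

def qam16():
    """Square 16-QAM grid, scaled to unit average power (assumed normalization)."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    pts = (levels[:, None] + 1j * levels[None, :]).ravel()
    return pts / np.sqrt(np.mean(np.abs(pts) ** 2))

def to_cartesian(points):
    """Complex points -> array of shape (M, 2) holding (x, y)."""
    return np.stack([points.real, points.imag], axis=-1)

def to_polar(points):
    """Complex points -> array of shape (M, 2) holding (radius, angle in radians)."""
    return np.stack([np.abs(points), np.angle(points)], axis=-1)

def from_polar(polar):
    """Inverse of to_polar: (radius, angle) rows -> complex points."""
    return polar[:, 0] * np.exp(1j * polar[:, 1])
```

One consideration when choosing: in polar form the power constraint acts on the radius alone, while in cartesian form it couples x and y. The polar angle also wraps around at plus or minus pi, which a learned transmitter output would have to handle.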
fixed Tx, learn Rx:
input x,y of complex sample, softmax output + eps-greedy / Boltzmann exploration
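The receiver described above can be sketched as follows. This is only an illustration under assumptions: the linear scoring layer (`weights`, `bias`), the function names, and the default temperature are hypothetical, and the notes do not specify the actual receiver architecture.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature parameter."""
    z = (np.asarray(logits, dtype=float) - np.max(logits)) / temperature
    e = np.exp(z)
    return e / e.sum()

def rx_logits(weights, bias, sample):
    """Score each candidate symbol for a received sample (x, y).

    Assumption: a single linear layer; weights has shape (M, 2), bias shape (M,).
    """
    return weights @ sample + bias

def eps_greedy(logits, eps, rng):
    """With probability eps pick a uniformly random symbol, else the argmax."""
    if rng.random() < eps:
        return int(rng.integers(len(logits)))
    return int(np.argmax(logits))

def boltzmann(logits, temperature, rng):
    """Sample a symbol from the softmax distribution at the given temperature."""
    probs = softmax(logits, temperature)
    return int(rng.choice(len(probs), p=probs))
```

For example, with `rng = np.random.default_rng(0)`, a `(4, 2)` weight matrix for QPSK, and `sample = np.array([0.7, 0.7])`, calling `eps_greedy(rx_logits(weights, bias, sample), 0.1, rng)` returns a symbol index in 0..3. Lowering the temperature makes Boltzmann exploration approach the greedy choice, and raising it approaches uniform sampling.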
tasks:
reward shaping for transmitter (need to restrict power and maximize distance between points; the power penalty must be stronger than the distance reward to prevent outer points from flying away)
Tx -> Rx, and Rx gives reward back to Tx. Rx provides a k-NN guess for each data sample back to Tx
Tx on both sides, Rx on both sides.
OpenAI style
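Two of the tasks above, the shaped transmitter reward and the k-NN feedback from Rx, can be sketched as below. Everything specific here is an assumption made for illustration: the function names, the weights `power_weight` and `distance_weight`, the power budget `max_power`, the choice of minimum pairwise distance as the distance term, the value of `k`, and the +1/-1 per-sample reward. The notes only say that the power term should dominate the distance term.

```python
import numpy as np

def shaped_tx_reward(points, power_weight=2.0, distance_weight=1.0, max_power=1.0):
    """Reward spread-out constellation points, penalize average power above a budget.

    points: array of shape (M, 2) with cartesian (x, y) coordinates.
    power_weight > distance_weight so the penalty wins when points drift outward.
    """
    pts = np.asarray(points, dtype=float)
    diffs = pts[:, None, :] - pts[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)          # ignore each point's distance to itself
    min_dist = dists.min()
    avg_power = np.mean(np.sum(pts ** 2, axis=1))
    excess = max(0.0, avg_power - max_power)
    return distance_weight * min_dist - power_weight * excess

def knn_guess(train_samples, train_labels, query, k=3):
    """Majority vote over the k nearest labeled samples (labels are non-negative ints)."""
    d = np.linalg.norm(np.asarray(train_samples) - np.asarray(query), axis=1)
    nearest = np.argsort(d)[:k]
    votes = np.bincount(np.asarray(train_labels)[nearest])
    return int(np.argmax(votes))

def per_sample_reward(sent_labels, guessed_labels):
    """Assumed feedback signal: +1 where the Rx guess matches what Tx sent, else -1."""
    sent = np.asarray(sent_labels)
    guessed = np.asarray(guessed_labels)
    return np.where(sent == guessed, 1.0, -1.0)
```

Scaling a constellation by a factor s grows the minimum distance linearly in s but the average power quadratically, so with these terms the penalty eventually dominates for any positive weights. The weights mainly control how far past the power budget the points settle.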
https://www.sharelatex.com/project/58fe82f296da09b1289caec3
https://www.sharelatex.com/project/58eedf885eecccdc7a8817a8