Satisficing Q-Learning: Efficient Learning in Problems With Dichotomous Attributes

Technical Report | Accession Number: ADA451568

Abstract:

In some environments, a learning agent must learn to balance competing objectives. For example, a Q-learning agent may need to learn which choices expose it to risk and which choices lead to a goal. This paper presents a variant of Q-learning that learns a pair of utilities for worlds with dichotomous attributes, and shows that this algorithm properly balances the competing objectives and, as a result, efficiently identifies satisficing solutions. This occurs because exploration of the environment is restricted to those options which, according to current knowledge, are likely to avoid exposure to risk. We empirically validate the algorithm by (a) showing that it quickly converges to good policies in several simulated worlds of varying complexity, and (b) applying it to learning a force-feedback profile for a gas pedal that helps drivers avoid risky situations.
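The abstract's central idea — maintaining a pair of utilities and restricting exploration to options believed safe — can be sketched as follows. This is a minimal illustration, not the report's actual algorithm: all class, method, and parameter names (including the `safety_threshold` aspiration level) are hypothetical, and the report's update rules may differ.

```python
import random
from collections import defaultdict

class SatisficingQLearner:
    """Sketch of a dual-utility Q-learner (hypothetical interface).

    Q_goal estimates progress toward the goal; Q_safe estimates risk
    avoidance. Exploration is restricted to actions that current
    knowledge deems safe enough.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1,
                 safety_threshold=0.0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.safety_threshold = safety_threshold  # aspiration level (assumed)
        self.q_goal = defaultdict(float)  # utility for reaching the goal
        self.q_safe = defaultdict(float)  # utility for avoiding risk

    def _admissible(self, state):
        # Only actions currently believed safe enough are candidates.
        ok = [a for a in self.actions
              if self.q_safe[(state, a)] >= self.safety_threshold]
        return ok or self.actions  # fall back if nothing qualifies

    def choose(self, state):
        # Epsilon-greedy selection over the admissible (safe) set only.
        admissible = self._admissible(state)
        if random.random() < self.epsilon:
            return random.choice(admissible)
        return max(admissible, key=lambda a: self.q_goal[(state, a)])

    def update(self, state, action, goal_reward, safety_reward, next_state):
        # Standard Q-learning backup applied to each utility separately.
        for q, r in ((self.q_goal, goal_reward), (self.q_safe, safety_reward)):
            best_next = max(q[(next_state, a)] for a in self.actions)
            q[(state, action)] += self.alpha * (
                r + self.gamma * best_next - q[(state, action)])
```

In this sketch, an action that accrues negative safety utility drops below the aspiration level and is excluded from exploration, which is one way to realize the abstract's claim that exploration stays confined to options likely to avoid risk.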

Security Markings

Distribution Statement:
Approved For Public Release; Distribution Is Unlimited.

RECORD

Collection: TR