Satisficing Q-Learning: Efficient Learning in Problems With Dichotomous Attributes
Abstract:
In some environments, a learning agent must balance competing objectives. For example, a Q-learning agent may need to learn which choices expose it to risk and which choices lead to a goal. This paper presents a variant of Q-learning that learns a pair of utilities for worlds with dichotomous attributes, and shows that this algorithm properly balances the competing objectives and, as a result, efficiently identifies satisficing solutions. This occurs because exploration of the environment is restricted to those options that, according to current knowledge, are likely to avoid exposure to risk. We empirically validate the algorithm by (a) showing that it quickly converges to good policies in several simulated worlds of varying complexity and (b) applying it to learning a force-feedback profile for a gas pedal that helps drivers avoid risky situations.
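The abstract describes an agent that maintains a pair of utilities and restricts exploration to options currently believed to avoid risk. A minimal sketch of that idea in Python follows; the class and parameter names (`q_safe`, `q_goal`, `aspiration`) are illustrative assumptions, not identifiers from the paper, and the update rule is plain tabular Q-learning applied to each of the two utilities.

```python
import random
from collections import defaultdict


class SatisficingQLearner:
    """Sketch of a dual-utility Q-learner for dichotomous attributes:
    q_safe estimates risk avoidance, q_goal estimates goal progress.
    Action choice is restricted to actions judged safe enough so far."""

    def __init__(self, actions, alpha=0.1, gamma=0.95,
                 aspiration=0.5, epsilon=0.1):
        self.actions = actions
        self.alpha, self.gamma = alpha, gamma
        self.aspiration = aspiration    # safety aspiration level (assumed)
        self.epsilon = epsilon          # exploration rate within the safe set
        self.q_safe = defaultdict(float)  # (state, action) -> risk-avoidance utility
        self.q_goal = defaultdict(float)  # (state, action) -> goal utility

    def act(self, state):
        # Restrict exploration to actions believed to meet the aspiration;
        # fall back to all actions if nothing yet looks safe.
        safe = [a for a in self.actions
                if self.q_safe[(state, a)] >= self.aspiration]
        pool = safe or self.actions
        if random.random() < self.epsilon:
            return random.choice(pool)
        return max(pool, key=lambda a: self.q_goal[(state, a)])

    def update(self, state, action, safe_reward, goal_reward, next_state):
        # Standard one-step Q-update, applied to each utility separately.
        for q, r in ((self.q_safe, safe_reward), (self.q_goal, goal_reward)):
            best_next = max(q[(next_state, a)] for a in self.actions)
            q[(state, action)] += self.alpha * (
                r + self.gamma * best_next - q[(state, action)])
```

Here the satisficing behavior comes from the aspiration threshold: once enough options pass it, the agent stops comparing against unsafe alternatives and optimizes goal utility only within the safe set.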