renom.algorithm
renom.algorithm.image
class renom.algorithm.image.model.vgg.VGG16(classes=10)
class renom.algorithm.image.model.vgg.VGG19(classes=10)
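For orientation, a minimal usage sketch (assuming the models follow ReNom's callable Model interface and the canonical 224x224 VGG input size; both are assumptions, not taken from this page):

    import numpy as np
    from renom.algorithm.image.model.vgg import VGG16

    # Toy batch: 2 RGB images of 224x224 (assumed VGG input size).
    x = np.random.rand(2, 3, 224, 224).astype(np.float32)

    model = VGG16(classes=10)   # 10 output classes, as in the signature above
    z = model(x)                # forward pass, assuming the model is callable
    print(z.shape)              # expected: (2, 10)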
renom.algorithm.image.detection.yolo.build_truth(y, total_w, total_h, cells, classes)
Transforms a list of objects per image into an image * cells * cells * (5 + classes) matrix. Each cell in an image can be labeled with at most one object.
The "5" represents: objectness (0 or 1) and X, Y, W, H.
Example: input of 2 objects in the first image, 5 classes:
    y[0] = X Y W H 0 1 0 0 0 X Y W H 0 0 0 1 0
           |---1st object----||---2nd object---|
Output: 7 * 7 cells * 10 values per image:
    truth[0, 0, 0] = 1 X Y W H 0 1 0 0 0    (cell 0,0 contains the first object)
    truth[0, 0, 1] = 0 0 0 0 0 0 0 0 0 0    (cell 0,1 contains no object)
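The cell-assignment and layout described above can be pictured with a small NumPy sketch. This is a conceptual illustration, not renom's build_truth; the (x, y, w, h, class_id) tuple input is an assumption made for readability.

    import numpy as np

    def build_truth_sketch(objects, total_w, total_h, cells, classes):
        """Illustrative truth layout: (cells, cells, 5 + classes) per image."""
        truth = np.zeros((cells, cells, 5 + classes))
        for x, y, w, h, class_id in objects:
            # The box center decides which cell is responsible for the object.
            col = int(x / total_w * cells)
            row = int(y / total_h * cells)
            truth[row, col, 0] = 1                  # objectness
            truth[row, col, 1:5] = [x, y, w, h]     # box location and size
            truth[row, col, 5 + class_id] = 1       # one-hot class vector
        return truth

    # Two objects, 5 classes, 7x7 grid -> (7, 7, 10) per image.
    t = build_truth_sketch([(50, 60, 30, 40, 1), (200, 180, 60, 80, 3)],
                           total_w=448, total_h=448, cells=7, classes=5)
    print(t.shape)  # (7, 7, 10)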
renom.algorithm.image.detection.yolo.apply_nms(x, cells, bbox, classes, image_size, thresh=0.2, iou_thresh=0.3)
Apply to the prediction x output by the yolo_detector layer to get a list of detected objects. Detections with probability below thresh (default 0.2) are discarded, and overlapping boxes with IOU above iou_thresh are suppressed.
Returns: A list of dict objects. Each dict includes the keys class, box and score.
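The thresholding and suppression steps can be illustrated with a generic NumPy sketch of greedy non-maximum suppression. This is not renom's apply_nms; the corner-format (x1, y1, x2, y2) boxes and the separate score array are assumptions for the illustration.

    import numpy as np

    def iou(a, b):
        """IOU of two boxes given as (x1, y1, x2, y2)."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    def nms_sketch(boxes, scores, thresh=0.2, iou_thresh=0.3):
        """Keep high-scoring boxes; drop boxes overlapping a better box too much."""
        order = np.argsort(scores)[::-1]          # best score first
        keep = []
        for i in order:
            if scores[i] < thresh:                # detection threshold
                continue
            if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
                keep.append(int(i))               # survives suppression
        return keep

    boxes = np.array([[10, 10, 60, 60], [12, 12, 58, 62], [100, 100, 150, 160]])
    scores = np.array([0.9, 0.75, 0.6])
    print(nms_sketch(boxes, scores))              # [0, 2]; box 1 is suppressed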
class renom.algorithm.image.detection.yolo.yolo(*args, **kwargs)
class renom.algorithm.image.detection.yolo.Yolo(cells=7, bbox=2, classes=10)
Loss function for Yolo detection. The last layer of the network needs to have size cells * cells * (bbox * 5 + classes). The "5" is because every bounding box gets 1 score and 4 locations (x, y, w, h).
Example: prediction with 2 bounding boxes per cell, 7 * 7 cells per image, 5 classes:
    X[0, 0, 0] = S X Y W H S X Y W H 0 0 0 1 0
                 |--1st bbox--||--2nd bbox--||-classes-|
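A sketch of matching the last layer to this loss, assuming the Yolo loss object is applied to a (prediction, truth) pair like other ReNom loss objects; the hidden-layer sizes are purely illustrative.

    import renom as rm
    from renom.algorithm.image.detection.yolo import Yolo, build_truth

    cells, bbox, classes = 7, 2, 5
    last_unit = cells * cells * (bbox * 5 + classes)   # 7*7*(2*5+5) = 735

    # Illustrative detector head; only the final Dense size matters to the loss.
    model = rm.Sequential([
        rm.Conv2d(16, filter=3, stride=1),
        rm.Relu(),
        rm.Flatten(),
        rm.Dense(256),
        rm.Relu(),
        rm.Dense(last_unit),
    ])

    loss_func = Yolo(cells=cells, bbox=bbox, classes=classes)
    # z = model(x)                                       # x: a batch of images
    # truth = build_truth(y, total_w, total_h, cells, classes)
    # loss = loss_func(z, truth)                         # assumed call pattern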
renom.algorithm.reinforcement
class renom.algorithm.reinforcement.dqn.DQN(q_network, target_q, state_size, action_pattern, gamma=0.99, buffer_size=100000.0)
This class provides a DQN reinforcement learning agent, including a training method.
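Conceptually, the agent regresses q_network toward the usual DQN Bellman target computed with the frozen target network; a small NumPy sketch of that target (not renom's internal code, all values made up):

    import numpy as np

    gamma = 0.99

    # Hypothetical mini-batch drawn from the replay buffer.
    reward   = np.array([1.0, 0.0, 0.0])      # immediate rewards
    terminal = np.array([0.0, 0.0, 1.0])      # 1 if the episode ended here
    q_next   = np.array([[0.2, 0.5],          # target_q(next_state), one row per
                         [0.1, 0.3],          # sample, one column per action
                         [0.4, 0.0]])

    # Bellman target: r + gamma * max_a Q_target(s', a), cut off at terminal states.
    target = reward + gamma * (1.0 - terminal) * q_next.max(axis=1)
    print(target)   # [1.495 0.297 0.   ]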
action(state)
This method returns an action according to the given state.
Parameters: - state (ndarray) – A state of an environment.
Returns: Action.
Return type: (int, ndarray)
update()
This method updates the target network.
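What such an update amounts to can be shown with a toy stand-in rather than renom models: the online network's parameters are copied into the frozen target network (a conceptual sketch, not renom's internals).

    import copy

    class TinyNet:
        """Stand-in for a network; only its parameters matter for this sketch."""
        def __init__(self):
            self.params = {"w": [0.1, -0.2], "b": [0.0]}

    q_network = TinyNet()
    target_q = copy.deepcopy(q_network)

    q_network.params["w"] = [0.3, 0.05]                   # training moves the online net
    target_q.params = copy.deepcopy(q_network.params)     # update(): sync the target
    print(target_q.params["w"])                           # [0.3, 0.05]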
train(env, loss_func=<renom.layers.loss.clipped_mean_squared_error.ClippedMeanSquaredError object>, optimizer=<renom.optimizer.Rmsprop object>, epoch=100, batch_size=32, random_step=1000, one_epoch_step=20000, test_step=1000, test_env=None, update_period=10000, greedy_step=1000000, min_greedy=0.0, max_greedy=0.9, test_greedy=0.95, train_frequency=4)
This method executes training of the q-network. Training is performed with the epsilon-greedy method.
Parameters: - env (function) – A function which accepts an action as an argument and returns prestate, state, reward and terminal.
- loss_func (Model) – Loss function for training the q-network.
- optimizer (Optimizer) – Optimizer object for training the q-network.
- epoch (int) – Number of epochs for training.
- batch_size (int) – Batch size.
- random_step (int) – Number of random steps executed before training.
- one_epoch_step (int) – Number of steps in one epoch.
- test_step (int) – Number of test steps.
- test_env (function) – An environment function for testing.
- update_period (int) – Period of updating the target network.
- greedy_step (int) – Number of steps over which the greedy value is annealed from min_greedy to max_greedy.
- min_greedy (float) – Minimum greedy value.
- max_greedy (float) – Maximum greedy value.
- test_greedy (float) – Greedy value used during testing.
- train_frequency (int) – Training is performed once every train_frequency steps.
Returns: A dictionary which includes the training reward list and the loss list.
Return type: (dict)
Example
>>> import renom as rm
>>> from renom.algorithm.reinforcement.dqn import DQN
>>>
>>> state_size = (4, 84, 84)
>>> action_pattern = 4
>>> buffer_size = 100000
>>>
>>> q_network = rm.Sequential([
...     rm.Conv2d(32, filter=8, stride=4),
...     rm.Relu(),
...     rm.Conv2d(64, filter=4, stride=2),
...     rm.Relu(),
...     rm.Conv2d(64, filter=3, stride=1),
...     rm.Relu(),
...     rm.Flatten(),
...     rm.Dense(512),
...     rm.Relu(),
...     rm.Dense(action_pattern)
... ])
>>>
>>> def environment(action):
...     prestate = ...
...     state = ...
...     reward = ...
...     terminal = ...
...     return prestate, state, reward, terminal
>>>
>>> # Instantiation of DQN object
>>> dqn = DQN(q_network,
...           state_size=state_size,
...           action_pattern=action_pattern,
...           gamma=0.99,
...           buffer_size=buffer_size)
>>>
>>> # Training
>>> train_history = dqn.train(environment,
...                           loss_func=rm.ClippedMeanSquaredError(clip=(-1, 1)),
...                           epoch=50,
...                           random_step=5000,
...                           one_epoch_step=25000,
...                           test_step=2500,
...                           test_env=environment,
...                           optimizer=rm.Rmsprop(lr=0.00025, g=0.95))
Executing random action for 5000 step...
epoch 000 avg loss:0.0060 avg reward:0.023: 100%|██████████| 25000/25000 [19:12<00:00, 21.70it/s]
/// Result
Average train error: 0.006
Avg train reward in one epoch: 1.488
Avg test reward in one epoch: 1.216
Test reward: 63.000
Greedy: 0.0225
Buffer: 29537
...
>>>
>>> print(train_history["train_reward"])
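The returned history can then be plotted, for example with matplotlib (an assumption; only the train_reward key shown above is taken from this page):

    import matplotlib.pyplot as plt

    rewards = train_history["train_reward"]      # per-epoch training rewards
    plt.plot(range(1, len(rewards) + 1), rewards)
    plt.xlabel("epoch")
    plt.ylabel("train reward")
    plt.show()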