parameters - How does parameter sharing work in Efficient Neural Architecture Search (ENAS)

Question

I'm trying to understand how parameter sharing works in ENAS. The first two questions are there partly to help answer the third, which is the main one.

1. Is each node used only ONCE during macro search?
2. In macro search, does every node always link to its previous node?
3. How are the parameters shared? Does each operation have its own weights, which are always loaded when it is called? If so, which weights are updated and stored during training when multiple instances of the same operation are used? Or is there a weight set for each unique connection, e.g. Node1 to Node3 has one weight set (W13) and Node2 to Node3 has another (W23)? If so, how are skip connections handled (e.g. Node1 and Node2 are concatenated and then passed to Node3; will there be a W12-3)?
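
To make the two hypotheses in the third question concrete, here is a minimal Python sketch. It is not taken from the ENAS code: the class names `PerOperationBank` and `PerConnectionBank`, the operation names, and the list-of-floats stand-in for weights are all assumptions for illustration. It only shows how the lookup key would differ under each hypothesis, not which one ENAS actually uses.

```python
import random

# Hypothetical operation names, chosen only for illustration.
OPS = ["conv3x3", "conv5x5", "maxpool"]


def new_weights(rng):
    """Stand-in for a real weight tensor: a short list of floats."""
    return [rng.random() for _ in range(4)]


class PerOperationBank:
    """Hypothesis A: each operation owns one weight set, reused on every call."""

    def __init__(self, rng):
        self.weights = {op: new_weights(rng) for op in OPS}

    def lookup(self, op, node, inputs):
        # The key ignores the node and its inputs, so every instance of the
        # same operation reads and updates the same weights.
        return self.weights[op]


class PerConnectionBank:
    """Hypothesis B: each (source, destination) edge owns one weight set."""

    def __init__(self, num_nodes, rng):
        self.weights = {
            (src, dst): new_weights(rng)
            for dst in range(2, num_nodes + 1)
            for src in range(1, dst)
        }

    def lookup(self, op, node, inputs):
        # In this sketch a node fed by several inputs gets one weight set per
        # incoming edge (W13 and W23 for inputs 1 and 2 into node 3).
        return [self.weights[(src, node)] for src in inputs]


rng = random.Random(0)
per_op = PerOperationBank(rng)
per_edge = PerConnectionBank(3, rng)

# Same operation at two different nodes: hypothesis A returns the same object.
first = per_op.lookup("conv3x3", node=2, inputs=(1,))
second = per_op.lookup("conv3x3", node=3, inputs=(1, 2))
shared = first is second

# Node 3 with a skip connection from node 1: hypothesis B returns two sets.
edge_sets = per_edge.lookup("conv3x3", node=3, inputs=(1, 2))
```

In this sketch, hypothesis A has a single weight set per operation, so all instances would update the same one; under hypothesis B, whether a concatenated input needs its own W12-3 instead of W13 plus W23 is exactly the open point.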

score 0 · Accepted Answer

I have read through the code many times, so I figured I would answer these myself in case anyone comes across this in the future.
