This is not an exhaustive list, and the activities below are not
mutually exclusive: for example, in applying SMAs, new problem
variations and algorithmic ideas could present themselves.
Algorithmic understanding and algorithmic variations:
- How fast does the variance improve as the number of queue cells in
the Qs SMA is increased (in the stationary setting)?
- If one could have different queue capacities for different items,
how should the capacities be allocated? Should queues for more
frequent items have higher capacities? Does differential queue
capacity make a significant difference?
- Are there reduction schedules better than harmonic decay of the
learning rate for EMA?
- Are there sufficiently simple pure learning-rate-based approaches
that rival DYAL? Pure counting-based approaches? Other approaches?
(Is a hybrid a necessity?)
- How can DYAL be simplified (for instance, by improving or
simplifying the change-detection test, i.e. the test for switching to
the queue)?
- Could there be gains if items (entries) were not updated largely
independently of one another?
- Improve the analyses: e.g., the convergence of EMA (tighten the gap
in the theorem).
- Feed-forward neural networks, trained with backpropagation, can
output probabilities, but often multiple passes over the data are
required, and an extra calibration step is needed for good
probabilities. Typical NNs are also not open-ended: the set of
classes must be fixed. For a fixed set of classes, e.g. 10, how do
such techniques fare when the distribution changes along the stream?
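The variance question for the Qs SMA above can be probed empirically. The sketch below is a Monte-Carlo simulation under an assumed simplification of the queue-based estimator (not necessarily the paper's exact algorithm): each item keeps the timestamps of its last `capacity` occurrences and estimates its rate as `capacity` divided by the time elapsed since the oldest queued occurrence. The function name and parameters are illustrative.

```python
import random

def qs_estimate_variance(p, capacity, trials=500, stream_len=5000):
    # Monte-Carlo variance of a queue-based rate estimator on a
    # stationary Bernoulli(p) stream. Assumed estimator (a
    # simplification of the Qs SMA): keep the timestamps of the
    # last `capacity` occurrences of the item, and estimate p as
    #   capacity / (now - oldest queued timestamp).
    estimates = []
    for _ in range(trials):
        # Occurrence times of the item along one simulated stream.
        times = [t for t in range(stream_len) if random.random() < p]
        if len(times) < capacity:
            continue  # queue never filled; skip this run
        oldest = times[-capacity]  # oldest timestamp still in the queue
        estimates.append(capacity / (stream_len - oldest))
    mean = sum(estimates) / len(estimates)
    return sum((e - mean) ** 2 for e in estimates) / len(estimates)
```

Plotting the returned variance against `capacity` (e.g. 2, 4, 8, 16) for a fixed p gives an empirical view of how quickly variance shrinks with queue size in the stationary setting.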
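On the learning-rate-schedule question: harmonic decay, a_t = 1/t, is the schedule under which the EMA update reduces exactly to the plain running mean, which converges in the stationary setting but adapts slowly after a change; a constant rate instead discounts old observations geometrically. A minimal sketch (function names are illustrative):

```python
def ema(stream, rate_schedule):
    # Generic EMA: est <- est + a_t * (x_t - est),
    # equivalently est <- (1 - a_t) * est + a_t * x_t.
    est = 0.0
    for t, x in enumerate(stream, start=1):
        est += rate_schedule(t) * (x - est)
    return est

stream = [1.0, 0.0, 0.0, 1.0]
# Harmonic decay a_t = 1/t makes the EMA equal the running mean
# (0.5 here); a constant rate weights recent observations more.
harmonic = ema(stream, lambda t: 1.0 / t)
constant = ema(stream, lambda t: 0.3)
```

Candidate alternative schedules (e.g. polynomial decay 1/t^0.6, or schedules that restart after a detected change) can be dropped into `rate_schedule` unchanged, which makes this a convenient harness for comparing them.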
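For the pure-counting end of that question, one natural baseline (a hypothetical sketch, not an algorithm from the paper) estimates each item's probability by its frequency in a sliding window over the stream:

```python
from collections import Counter, deque

class WindowCounter:
    # Pure counting baseline: estimate an item's probability as its
    # frequency among the last `w` stream elements.
    def __init__(self, w):
        self.w = w
        self.window = deque()
        self.counts = Counter()

    def update(self, item):
        self.window.append(item)
        self.counts[item] += 1
        if len(self.window) > self.w:  # evict the oldest element
            self.counts[self.window.popleft()] -= 1

    def estimate(self, item):
        return self.counts[item] / max(1, len(self.window))
```

Note the memory cost is O(w) for the whole stream rather than a small constant per item, which is one axis along which such counting baselines can be compared against SMA-style estimators.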
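For the last question, the single-pass end of the neural-network spectrum can be illustrated in miniature with a bias-only softmax trained online by SGD on the cross-entropy loss; this is a sketch for intuition about fixed-class, learning-rate-based probability tracking (all names are illustrative), not a claim about full networks:

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def online_prior_estimates(stream, n_classes, lr=0.05):
    # Single-pass, bias-only softmax trained with SGD on
    # cross-entropy: after observing class label y, update
    #   b_c += lr * ([c == y] - p_c),  p = softmax(b).
    # The softmax of b then tracks the class priors, and the
    # constant learning rate lets it adapt under drift.
    b = [0.0] * n_classes
    for y in stream:
        p = softmax(b)
        for c in range(n_classes):
            b[c] += lr * ((1.0 if c == y else 0.0) - p[c])
    return softmax(b)
```

Changing the generating proportions mid-stream and re-reading the estimates shows the constant-rate tracking behavior, which is the same learning-rate versus counting trade-off the earlier questions raise, now for a fixed class set.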