Tractography is the process of virtually reconstructing the white matter structure of the brain in a non-invasive manner. To tackle the various known problems of tractography, deep learning has been proposed, but the lack of well curated diverse datasets makes neural networks incapable of generalizing well on unseen data. Recently, deep reinforcement learning (RL) has been shown to effectively learn the tractography procedure without reference streamlines. While the performances reported were competitive, the proposed framework is complex and little is known on the role and impact of its multiple parts. In this work, we thoroughly explore the different components of the proposed framework through seven experiments on two datasets and shed light on their impact. Our goal is to guide researchers eager to explore the possibilities of deep RL for tractography by exposing what works and what does not work with this category of approach. We find that directionality is crucial for the agents to learn the tracking procedure and that the input signal and the seeding strategy offer a trade-offs in connectivity vs. Volume.