I'm sure it differs from person to person, but my experience was as follows. First off, I entered MIT freshman year not knowing much about computer science, and also not knowing what a mathematical proof was. I knew some C/C++ from teaching myself in my senior year of high school, and some HTML from teaching myself in 8th grade, but that was it. I also had no idea what research was.
My first exposure to proofs was Freshman year, taking the theory track for single/multi-variable calculus and differential equations (18.014, 18.024, and 18.034 — see Jelani Nelson – CLASSES
). My first exposure to algorithms was in my sophomore Spring algorithms course (6.046). I liked the class, but at that point is was just one class of many and I didn't think much more about algorithms for at least another year. My next course in theoretical computer science was then a graduate cryptography course (6.875) in my junior Spring, which I really enjoyed. At that point, I thought I wanted to do cryptography.
At the same time, I signed up for a TopCoder account near the end of junior year (in May). I also signed up for an account on the USACO training gateway, and on the websites of several other online judges. A couple friends of mine at MIT, namely Hubert Hwang (TopCoder: antimatter) and David Pritchard (TopCoder: daveagp), were really into TopCoder at that time and I got into it due to their influence. It is quite easy for me to get addicted to computer/video games, and I viewed solving problems on these websites as a game. Summer 2004 I probably spent over 8 hours a day solving algorithmic exercises on various websites. I would say this phase of my life was quite pivotal. After about 2 years on TopCoder, algorithmic thinking became second nature.
I also took a few graduate courses in algorithms my senior year: Advanced Algorithms (6.854), Randomized Algorithms (6.856), and Advanced Data Structures (6.897 at the time, now 6.851). They all had final projects, but I was quite busy taking 6 classes per semester so I didn't have much time to launch any serious research efforts. Luckily Erik Demaine as part of 6.897 had some "open problem solving sessions" which I attended. These were quite fun, and as part of one, a team of 7 of us solved some data structure problem on dynamic ham sandwich cuts which we published that summer in CCCG. It was not a major problem, but I believe one of the goals of these sessions was to introduce youngsters to research in theoretical computer science. It worked on me; I think it was both a good confidence booster, and simply a good way to get my feet wet. (It was also my first co-authored paper with Daniel Kane, who I ended up working with later on quite a few projects during graduate school.)
My senior Fall I also applied to (two) graduate schools (saying I wanted to do cryptography). I had a 5.0 GPA at that point but zero theory research experience, and I got rejected from both. (I also can't really even remember why I applied, considering that at that point I didn't really understand what a PhD was …). After senior year I did a one-year Masters supervised by Bradley Kuszmaul and Charles Leiserson, focused on external memory and cache oblivious data structures. Since I was only taking two classes per semester, I had much more time to focus on research. At some point Bradley and I came up with an update/query tradeoff for external memory predecessor, but then Jeremy Fineman pointed out it had been done in a SODA several years prior. We worked on other stuff though, which led to a SPAA paper a year later on two cache-oblivious solutions that aimed to get much better update performance while still having good query. My main contribution was in deamortizing the "cache-oblivious lookahead array" (COLA) in that work. So, I think by this point, I had the ability to do some theory work.
I didn't really find my current area though (streaming/sketching/dimensionality reduction/etc.) until my second year of PhD. My first year I tried to do some more cache-oblivious stuff but didn't really get anywhere. My second year, in the Fall, I took a course by Piotr Indyk on streaming and sketching. I did a final project with Krzysztof Onak and Nick Harvey on entropy estimation in data streams. We had some non-trivial progress by the end of the course, and we kept working on it intensely in the Spring. We managed to get a pretty good algorithm by the FOCS deadline, so we submitted and it got accepted. That was my first paper in the area that became the focus of my thesis.