Wish I could balance out my test taking for systems and theory courses a little. All assessments have been pretty fair this semester, but my performance is very.. hmm… high coefficient of variation. Spend a night on 418, top 10, spend a week on 857, bottom percentile… oh well.

Had a brief chat with a prof today, and I’m wrapping up my thoughts a little more. A common question, I think, is: if you took the undergraduate probability and computing, is graduate performance modeling worthwhile? To this, I think my answer is a pretty definite yes. If you’re strong in math, you can probably skip straight to the graduate version, but if you aren’t (i.e., me), the undergrad version is good preparation.

The undergraduate course goes through basic probability, randomized algorithms, and a little on Markov chains. I remember finding randomized algorithms and reservoir sampling very cool, enough to flip through Mitz and Upfal. The undergrad course was paced slowly enough that I didn’t feel like I was dying too much, and yet covered enough for me to see that probability-based math was actually pretty damn useful. We also covered how a bloom filter works, which should honestly be moved to the core curriculum somewhere. Definitely a course worth taking.

The graduate version cuts out randomized algorithms but otherwise covers all of the undergraduate content in about half a semester. It goes on to do basically everything in her book. To me, the great value-add from taking this course was covering:

  • classed Jackson networks
  • M/G/1 and approximating stuff with exponentials
  • scheduling policies, particularly the Linux Foreground-Background scheduler
  • preemptive and non-preemptive policies

From a systems perspective, this unlocks a lot that the undergrad material just didn’t have time to cover. In the undergrad course, you have DTMCs and CTMCs and need to make all these assumptions about arrival rates and service times. Who’s going to go measure that for you, and how do you know they’re right? I remember trying to model a scheduler over the summer (just poking my nose into other people’s projects), and it was pretty awkward.

In the grad version, though, classed Jackson networks and approximating distributions with mixtures of exponentials seems like it would have gotten me a little further. But more excitingly, SCHEDULING POLICIES. In class, she said something like “scheduling is one of the rare cases where you get free performance by being smarter about things”. I never thought about it that way until she said that, but she’s absolutely right. And the system I was looking at was plain old FCFS with unknown job size distribution; lots of room for free performance there (well, minus a little switching penalty). We spent comparatively little time on the last few chapters on scheduling, and it took a lot of material to get there, but I feel it is well worth it. There was also stuff about busy period analysis and renewal reward. Cute and powerful, but not something I’m hyped about, I guess.

In a perfect world, I would be taking her stuff and applying it to my task submission system.

In this slightly less perfect world I live in, it appears to be crunch time for 418 and 859. It also appears that, much like the scheduler I was looking at, I have a lot of room for growth. Particularly when it comes to the grade that I’ll get back for the performance modeling final.

Anyways, crunch time!

Written on December 6, 2019